Provided by Kaggle

Team Member Julian Liu, Thomas Rapstine, Hoang Tran, Rong Lu, Jayde Thompson, Arun Kaashyap Arunachalam


With this project, we will not write python code. Instead, we will discuss and learn by running completed code scripts from Kagglers. This project is about some fundamental classification methods in machine learning. It is a great way to discover machine learning from a bird’s-eye view. Only basic Python knowledge required. Feel free to join us through Slack.


In this competition, Kagglers will develop models capable of classifying mixed patterns of proteins in microscope images. The Human Protein Atlas will use these models to build a tool integrated with their smart-microscopy system to identify a protein’s location(s) from a high-throughput image.

Proteins are “the doers” in the human cell, executing many functions that together enable life. Historically, classification of proteins has been limited to single patterns in one or a few cell types, but in order to fully understand the complexity of the human cell, models must classify mixed patterns across a range of different human cells.

Images visualizing proteins in cells are commonly used for biomedical research, and these cells could hold the key for the next breakthrough in medicine. However, thanks to advances in high-throughput microscopy, these images are generated at a far greater pace than what can be manually evaluated. Therefore, the need is greater than ever for automating biomedical image analysis to accelerate the understanding of human cells and disease.

Nature Methods has indicated interest in considering a paper discussing the outcome and approaches of the challenge. The Human Protein Atlas team would like to invite top performing teams to join as co-authors in the writing of this paper.

Top performing teams will also be eligible to compete for the special prize. Additional information for both the special prize and co-authoring for Nature Methods will become available through the Discussion posts once the main competition is complete.


The Human Protein Atlas is a Sweden-based initiative aimed at mapping all human proteins in cells, tissues and organs. All the data in the knowledge resource is open access to allow anyone to pursue exploration of the human proteome. In a recent publication, the Human Protein Atlas team has demonstrated the promise of both citizen science and artificial intelligence approaches in describing the location of human proteins in images, however current results are yet to approach expert-level annotations (Sullivan et al, Nature Biotechnology, Oct 2018).