CS298 Proposal

Clustering Organ Cell Types

Swathi M.V.S (venkatasatyaswathi.mattaparthi@sjsu.edu)

Advisor: Dr. Chris Pollett

Committee Members: Dr. Robert Chun, Dr. Wendy Lee

Abstract:

Every organ in the human body is composed of specific tissues, and each tissue is in turn made of specific cells. Cells are the basic building blocks of all living organisms, and the human body contains trillions of them. Nearly every cell has a nucleus containing DNA, the blueprint of the organism. DNA contains genes, which carry the instructions for producing proteins. The Human Cell Atlas is an international project that aims to create a comprehensive reference map of all human cells. The data for the Human Cell Atlas is obtained from Tabula Sapiens, a project conducted by the Chan Zuckerberg Biohub in California. This dataset is a first-draft human cell atlas of nearly 500,000 cells from 24 organs of 15 human donors. It provides insight into the molecular composition of different cell types, including gene expression patterns and signaling pathways. The Tabula Sapiens dataset is stored in the h5ad format (an HDF5-based format for annotated data matrices) and contains single-cell transcriptomic data. Transcriptomic data describes the RNA molecules present in cells, with a focus on gene expression. My project will use the Human Cell Atlas as a reference while studying the Tabula Sapiens dataset, clustering the cells present in various organs to gain insight into the different types of cells in the human body.

CS297 Results

  • Studied and explored the Heart data of the Tabula Sapiens dataset to understand the different cells and features present in the data.
  • Applied classification and clustering techniques to the Heart data to distinguish the different types of cells.
  • Implemented binary classification algorithms to identify a particular cell type in the Heart data.
  • Built a neural network classifier to identify the various cell types present in the Heart data.

Proposed Schedule

Week 1: Jan 30 - Feb 5              Submit CS298 Proposal
Week 2: Feb 6 - Feb 12              Explore and preprocess Tabula Sapiens All Cells data
Week 3 - Week 4: Feb 13 - Feb 26    Perform k-means clustering and explore various methods of choosing k
Week 5 - Week 6: Feb 27 - Mar 11    Implement Hierarchical clustering to identify the number of clusters
Week 7 - Week 9: Mar 12 - Apr 1     Compare the clustering of different organ types
Week 10 - Week 13: Apr 9 - May 6    Work on CS298 Report/Presentation

Key Deliverables:

  • Software
    • Explore the All Cells dataset and use it to perform k-means clustering, and experiment with different methods of choosing 'k'.
    • Implement Hierarchical clustering on the All Cells dataset to determine the number of clusters.
    • Compare the clustering of different tissue (organ) types and check if there is a similarity among the clusters to identify any new cell types.
  • Report
    • CS298 Report
    • CS298 Presentation
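
As an illustration of the first software deliverable above (k-means with different ways of choosing k), here is a minimal pure-Python sketch. It runs on made-up two-dimensional points, not on the Tabula Sapiens data, and it selects k by the mean silhouette score, which is only one of several possible selection methods. The function names, the toy points, and the farthest-first initialization (a deterministic substitute for random seeding) are all assumptions made for illustration, not the project's implementation.

```python
import math


def farthest_first_centroids(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all centroids chosen so far (deterministic, no random seed)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    centroids = farthest_first_centroids(points, k)
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the old centroid if a cluster happens to become empty.
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centroids[c]
            )
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids)


def silhouette_score(points, labels):
    """Mean silhouette coefficient (needs at least two clusters); values near 1
    indicate compact, well-separated clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, candidates):
    """Return the candidate k with the highest silhouette score, plus all scores."""
    scores = {k: silhouette_score(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Toy stand-in for expression profiles: three tight groups in two dimensions.
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
best_k, scores = choose_k(points, range(2, 6))
print(best_k, scores)
```

Other criteria, such as the elbow in the within-cluster sum of squares, could be swapped in for `silhouette_score` without changing the rest of the sketch. On the full dataset a library implementation would be the practical choice; this version is only meant to make the steps visible.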

Innovations and Challenges

  • One challenge is understanding and preprocessing the single-cell data for all cells, since it contains biological features from many different organs.
  • Selecting the optimal number of clusters in k-means and Hierarchical clustering is an important task. Different methods may generate varying results, and identifying the most suitable approach is challenging.
  • The Human Cell Atlas is an ongoing research project, so few references are available. We aim to see whether any new cell types can be identified through clustering; any new findings would be helpful in this field.
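
The difficulty of choosing the number of clusters in Hierarchical clustering, noted in the challenges above, can be made concrete with a small sketch. The code below is a pure-Python illustration on made-up two-dimensional points, not on the Tabula Sapiens data. It assumes average linkage and estimates the number of clusters by cutting the dendrogram at the largest jump between consecutive merge heights; both choices, as well as all function names, are assumptions for illustration, and other linkages or cut rules may give different answers.

```python
import math
from itertools import combinations


def average_linkage(points, a, b):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def merge_heights(points):
    """Agglomerative clustering: start with singletons and repeatedly merge the
    closest pair of clusters, recording the distance at which each merge happens."""
    clusters = [[i] for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(points, clusters[pair[0]], clusters[pair[1]]),
        )
        heights.append(average_linkage(points, clusters[i], clusters[j]))
        clusters[i] += clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return heights


def clusters_at_largest_gap(heights):
    """Cut the dendrogram inside the largest jump between consecutive merge
    heights and return how many clusters exist at that cut (needs >= 3 points)."""
    gaps = [later - earlier for earlier, later in zip(heights, heights[1:])]
    merges_before_cut = gaps.index(max(gaps)) + 1
    return len(heights) + 1 - merges_before_cut


# Toy stand-in for expression profiles: three tight groups in two dimensions.
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
estimated_clusters = clusters_at_largest_gap(merge_heights(points))
print(estimated_clusters)
```

This naive version recomputes every pairwise linkage at each step, so it is suitable only for toy inputs; a dataset of hundreds of thousands of cells would need an optimized library routine and most likely dimensionality reduction first.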

References:

[1] "A Cartography of Human Histology in the Making", https://www.economist.com/science-and-technology/2023/03/08/a-cartography-of-human-histology-is-in-the-making, March 2023

[2] O. Rozenblatt-Rosen, M. Stubbington, A. Regev, and S. Teichmann, "The Human Cell Atlas: from vision to reality", Nature 550, 451-453, 2017, doi: 10.1038/550451a

[3] Tabula Sapiens Dataset https://tabula-sapiens-portal.ds.czbiohub.org/

[4] The Tabula Sapiens Consortium, "The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans", Science 376, eabl4896, 2022, doi: 10.1126/science.abl4896