Swathi

Deliverable 1: Tabula Sapiens Dataset Overview

Description:

Performed exploratory data analysis (EDA) on the Heart data from the Tabula Sapiens dataset to understand its features and data types, then applied dimensionality reduction.

Implementation Steps:

Download and install Python, Anaconda, and Jupyter Notebook.
Download the Tabula Sapiens Heart dataset from the CZ Biohub website.
Install the anndata, scanpy, numpy, pandas, matplotlib, and seaborn Python packages.
Execute the code snippets step by step, in the order given in the Google Colab (ipynb) file.
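Before running the notebook, it can help to confirm that the packages from step 3 are importable. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of the original notebook.

```python
import importlib.util

# Packages named in the implementation steps above.
REQUIRED = ["anndata", "scanpy", "numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means everything is installed in the current environment.
print(missing_packages(REQUIRED))
```

Any name that is printed still needs to be installed, for example with pip or conda.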

Results:

The data is in h5ad format, the HDF5-based file format of the anndata Python package, which is widely used for single-cell RNA sequencing data.

The Heart dataset contains 11,505 cells and 56,604 genes.

(Figure: TB Dimensions of Heart Data)
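As a rough sketch of how these dimensions can be read, the snippet below loads an h5ad file with anndata.read_h5ad and reports the cell and gene counts from n_obs and n_vars. The helper names are hypothetical and the path passed to load_h5ad is whatever the download was saved as. A small stand-in object carrying the counts reported above is used so the sketch runs without the real file.

```python
from types import SimpleNamespace


def load_h5ad(path):
    """Read an .h5ad file into an AnnData object (needs the anndata package)."""
    import anndata as ad  # imported lazily so the rest runs without anndata

    return ad.read_h5ad(path)


def describe_dimensions(adata):
    """Format the number of cells (rows) and genes (columns)."""
    return f"{adata.n_obs:,} cells x {adata.n_vars:,} genes"


# Stand-in carrying the counts reported above; with the real file one would
# call describe_dimensions(load_h5ad(path_to_downloaded_file)) instead.
stand_in = SimpleNamespace(n_obs=11505, n_vars=56604)
print(describe_dimensions(stand_in))
```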

The attributes of the Heart data are shown below.

(Figure: TB Heart Data Attributes)

The names of first 10 genes are shown below.

(Figure: TB Heart first 10 gene names)
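In an AnnData object the gene names are stored in var_names, so the first ten can be taken with a slice. The sketch below assumes an already loaded object; the helper name first_gene_names is hypothetical, and the stand-in labels are placeholders rather than real gene symbols.

```python
from types import SimpleNamespace


def first_gene_names(adata, n=10):
    """Return the first n gene names as plain strings."""
    return [str(name) for name in adata.var_names[:n]]


# Placeholder labels only; a real AnnData object would supply actual gene names.
stand_in = SimpleNamespace(var_names=[f"gene_{i}" for i in range(25)])
print(first_gene_names(stand_in))
```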

Performed dimensionality reduction using UMAP (Uniform Manifold Approximation and Projection).

(Figure: TB Heart Data Dim Reduction UMAP)
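One common way to obtain a UMAP embedding with scanpy is sketched below; it may differ from what the notebook actually does. It assumes that adata.X already holds normalized, log-transformed expression values. The function name compute_umap and the parameter values are illustrative choices, not settings taken from the original analysis.

```python
def compute_umap(adata, n_comps=50, n_neighbors=15, n_pcs=40):
    """Reduce with PCA, build a neighbor graph, then embed with UMAP.

    Assumes adata.X is already normalized and log-transformed.
    """
    import scanpy as sc  # lazy import: only needed when the function runs

    sc.pp.pca(adata, n_comps=n_comps)  # principal components in adata.obsm["X_pca"]
    sc.pp.neighbors(adata, n_neighbors=n_neighbors, n_pcs=n_pcs)  # kNN graph on the PCs
    sc.tl.umap(adata)  # 2-D coordinates in adata.obsm["X_umap"]
    return adata
```

After calling compute_umap(adata), the embedding can be drawn with sc.pl.umap(adata), optionally passing color= with the name of a column in adata.obs.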

Performed dimensionality reduction using PCA (Principal Component Analysis).

(Figure: TB Heart Data Dim Reduction PCA)
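To illustrate what PCA computes, here is a minimal NumPy sketch that centers a cells-by-genes matrix and projects it onto its leading principal components through a singular value decomposition. It is not the notebook's code, which presumably relies on a library routine such as scanpy's, and the input matrix is synthetic random data, not values from the Heart dataset.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components.

    Returns the projected coordinates and the fraction of total variance
    explained by each kept component.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)  # PCA requires column-centered data
    _, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ Vt[:n_components].T  # coordinates in PC space
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


# Synthetic stand-in: 100 "cells" by 5 "genes" of random values.
rng = np.random.default_rng(0)
toy = rng.normal(size=(100, 5))
scores, ratio = pca_project(toy)
print(scores.shape)
```

Plotting the first column of scores against the second gives the usual two-dimensional PCA scatter plot.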