Deliverable 1: Tabula Sapiens Dataset Overview
Description:
Performed exploratory data analysis (EDA) on the Heart data from the Tabula Sapiens dataset to understand its features and data types, then applied dimensionality reduction (PCA and UMAP).
Implementation Steps:
- Download and install Python, Anaconda and Jupyter Notebook.
- Download the Tabula Sapiens Heart dataset from the CZ Biohub website.
- Install the anndata, scanpy, numpy, pandas, matplotlib and seaborn Python packages.
- Execute the code snippets step by step, as described in the Google Colab (.ipynb) notebook.
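The loading and inspection step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the file name `TS_Heart.h5ad` and the helper names `summarize` and `load_heart` are assumptions made for this example, so adjust them to match the downloaded file.

```python
from pathlib import Path

# Hypothetical local file name; replace with the name of the downloaded file.
H5AD_PATH = Path("TS_Heart.h5ad")


def summarize(adata):
    """Collect basic facts about an AnnData-like object in a dict."""
    n_cells, n_genes = adata.shape  # rows are cells, columns are genes
    return {
        "n_cells": n_cells,
        "n_genes": n_genes,
        "obs_columns": list(adata.obs.columns),  # per-cell annotations
        "var_columns": list(adata.var.columns),  # per-gene annotations
        "first_genes": list(adata.var_names[:10]),  # first 10 gene names
    }


def load_heart(path=H5AD_PATH):
    """Read the .h5ad file into an AnnData object (requires anndata)."""
    import anndata as ad  # imported lazily so summarize() works without it

    return ad.read_h5ad(path)


# Only runs when the (assumed) file is actually present.
if __name__ == "__main__" and H5AD_PATH.exists():
    heart = load_heart()
    for key, value in summarize(heart).items():
        print(key, value)
```

Printing the `AnnData` object itself (`print(heart)`) also gives a compact overview of its shape and annotation fields.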
Results:
- The data is in h5ad format, the HDF5-based file format of the Python anndata package, which is widely used for single-cell RNA sequencing data.
- The Heart dataset contains 11,505 cells and 56,604 genes.
- The attributes of the Heart data are shown below.
- The names of the first 10 genes are shown below.
- Performed dimensionality reduction using UMAP (Uniform Manifold Approximation and Projection).
- Performed dimensionality reduction using PCA (Principal Component Analysis).
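A typical scanpy workflow for the two dimensionality reduction steps might look like the sketch below. It is an illustration under assumptions, not the notebook's exact code: it assumes `adata.X` holds raw counts (pass `normalize=False` if the file already stores normalized, log-transformed values), and the function names, the 2,000 highly variable genes, and the other parameter values are choices made for this example.

```python
def cap_components(n_cells, n_genes, requested=50):
    """Limit the PCA component count to at most min(cells, genes) - 1."""
    return max(1, min(requested, n_cells - 1, n_genes - 1))


def reduce_dimensions(adata, n_pcs=50, n_neighbors=15, normalize=True):
    """Run PCA, then UMAP, on a copy of an AnnData object (requires scanpy)."""
    import scanpy as sc  # imported lazily so cap_components() works without it

    adata = adata.copy()  # leave the caller's object untouched
    sc.pp.filter_genes(adata, min_cells=3)  # drop genes seen in almost no cells
    if normalize:
        # Assumption: X contains raw counts that still need normalizing.
        sc.pp.normalize_total(adata, target_sum=1e4)
        sc.pp.log1p(adata)
    # Keep only the most variable genes to reduce noise and runtime.
    sc.pp.highly_variable_genes(adata, n_top_genes=2000, subset=True)
    sc.pp.scale(adata, max_value=10)

    n_pcs = cap_components(adata.n_obs, adata.n_vars, n_pcs)
    sc.tl.pca(adata, n_comps=n_pcs)  # result stored in adata.obsm["X_pca"]
    # UMAP is computed from a neighbor graph built on the PCA coordinates.
    sc.pp.neighbors(adata, n_neighbors=n_neighbors, n_pcs=n_pcs)
    sc.tl.umap(adata)  # result stored in adata.obsm["X_umap"]
    return adata
```

After `reduced = reduce_dimensions(heart)`, the embeddings can be plotted with `sc.pl.pca(reduced)` and `sc.pl.umap(reduced)`. PCA runs first because the neighbor graph that UMAP relies on is usually built in the reduced PCA space rather than on all genes.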