{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### <center>San Jose State University<br>Department of Applied Data Science<br><br>**DATA 200<br>Computational Programming for Data Analytics**<br><br>Spring 2024<br>Instructor: Ron Mak</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# `seaborn` Facet Grid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "salaries = pd.read_csv('salary.csv')\n",
    "salaries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Use a `FacetGrid` to visualize subsets of a dataset in separate plots, called facets. It takes a dataframe and up to three variables that define the subsets: `row`, `col`, and `hue`. These variables should be ***categorical*** or ***discrete***.\n",
    "#### Call:\n",
    "``` Python\n",
    "seaborn.FacetGrid(data, row=None, col=None, hue=None, height=3)\n",
    "```\n",
    "#### where:\n",
    "- #### ***data***: A dataframe where each column is a variable and each row is an observation.\n",
    "- #### ***row***, ***col***, ***hue***: The variables whose values define subsets of the data. Each subset is drawn in its own facet of the grid (***row*** and ***col***) or in its own color within a facet (***hue***).\n",
    "- #### ***height*** (optional): The height in inches of each facet. The default is 3.\n",
    "#### To visualize the data on the grid, call:\n",
    "``` Python\n",
    "FacetGrid.map(func, *args, **kwargs)\n",
    "```\n",
    "#### where:\n",
    "- #### ***func***: A plotting function that takes data and keyword arguments. It must draw on the currently active matplotlib axes and accept a `color` keyword argument.\n",
    "- #### ****args***: Names of the columns in the data to plot. The data for each column is passed to ***func*** as a positional argument, in the specified order.\n",
    "- #### *****kwargs***: Keyword arguments passed to ***func***."
   ]
  },
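  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A minimal sketch of passing a custom ***func*** to `FacetGrid.map`. The dataframe `demo`, its column names, and the function `scatter_with_mean` are made up for illustration and are not part of the course data. The sketch assumes that `map` passes each named column to ***func*** as a positional argument and adds a `color` keyword argument, so the function forwards `**kwargs` to the matplotlib call."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "# Hypothetical data, invented for this sketch only.\n",
    "demo = pd.DataFrame({\n",
    "    'group': ['a', 'a', 'a', 'b', 'b', 'b'],\n",
    "    'x':     [1.0, 2.0, 3.0, 4.0, 6.0, 8.0],\n",
    "    'y':     [2.0, 4.0, 5.0, 1.0, 3.0, 2.0],\n",
    "})\n",
    "\n",
    "def scatter_with_mean(x, y, **kwargs):\n",
    "    # Draw on the currently active axes, which map() selects for each facet.\n",
    "    plt.scatter(x, y, **kwargs)\n",
    "    # Mark the mean of x in this facet with a dashed vertical line.\n",
    "    plt.axvline(x.mean(), linestyle='--', color='gray')\n",
    "\n",
    "demo_grid = sns.FacetGrid(demo, col='group', height=3)\n",
    "# 'x' and 'y' are passed to scatter_with_mean in this order; s=40 is a keyword argument.\n",
    "demo_grid.map(scatter_with_mean, 'x', 'y', s=40)\n",
    "\n",
    "plt.show()"
   ]
  },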
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Draw one facet per district; each facet plots Age (y) against Salary (x).\n",
    "grid = sns.FacetGrid(salaries, col='District')\n",
    "grid.map(sns.scatterplot, 'Salary', 'Age')\n",
    "\n",
    "plt.show()"
   ]
  },
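  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The example above uses only `col`. A minimal sketch that combines `row`, `col`, `hue`, and `height` follows. The dataframe `staff` and all of its column names and values are made up for illustration; they are not taken from `salary.csv`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "# Hypothetical data, invented for this sketch only.\n",
    "staff = pd.DataFrame({\n",
    "    'dept':   ['ops', 'ops', 'ops', 'ops', 'dev', 'dev', 'dev', 'dev'],\n",
    "    'site':   ['east', 'east', 'west', 'west', 'east', 'east', 'west', 'west'],\n",
    "    'level':  ['junior', 'senior', 'junior', 'senior',\n",
    "               'junior', 'senior', 'junior', 'senior'],\n",
    "    'age':    [25, 48, 31, 52, 27, 45, 29, 55],\n",
    "    'salary': [50, 90, 55, 95, 60, 110, 65, 120],\n",
    "})\n",
    "\n",
    "# Two departments by two sites give a 2 x 2 grid of facets.\n",
    "# Within each facet, hue draws each level in its own color.\n",
    "staff_grid = sns.FacetGrid(staff, row='dept', col='site', hue='level',\n",
    "                           height=2.5)\n",
    "staff_grid.map(sns.scatterplot, 'age', 'salary')\n",
    "staff_grid.add_legend()  # show which color belongs to which level\n",
    "\n",
    "plt.show()"
   ]
  },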
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Top 30 YouTube Channels\n",
    "#### A `FacetGrid` shows the number of subscribers and the number of views for the top 30 YouTube channels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "youtube = pd.read_csv(\"youtube.csv\")\n",
    "youtube"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "channels = youtube[youtube.columns[0]].tolist()\n",
    "channels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "subs = youtube[youtube.columns[1]].tolist()\n",
    "subs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "views = youtube[youtube.columns[2]].tolist()\n",
    "views"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
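    "# Build a long-form dataframe: each channel appears twice,\n",
    "# once with its subscriber count and once with its view count.\n",
    "df = pd.DataFrame(\n",
    "    {'YouTube Channels': channels + channels,\n",
    "     'Count (millions)': subs + views,\n",
    "     'Type'            : ['Subscribers']*len(subs)\n",
    "                             + ['Views']*len(views)\n",
    "    }\n",
    ")\n",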
    "\n",
    "df"
   ]
  },
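  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### The dataframe above is built by concatenating lists. A minimal sketch of the same wide-to-long reshaping with `DataFrame.melt` follows. The dataframe `wide` and its column names (`Channel`, `Subscribers`, `Views`) are made up for illustration; they are an assumption and may not match the actual columns of `youtube.csv`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical wide-form table: one row per channel, one column per measure.\n",
    "wide = pd.DataFrame({\n",
    "    'Channel':     ['A', 'B', 'C'],\n",
    "    'Subscribers': [10, 20, 30],\n",
    "    'Views':       [100, 200, 300],\n",
    "})\n",
    "\n",
    "# melt() keeps 'Channel' as the identifier and stacks the other columns:\n",
    "# their names go into 'Type' and their values go into 'Count'.\n",
    "long_form = wide.melt(id_vars='Channel', var_name='Type', value_name='Count')\n",
    "long_form"
   ]
  },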
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Visualize the dataframe using a `FacetGrid` with two columns. The first column shows the number of subscribers for each YouTube channel, and the second column shows the number of views."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sns.set()\n",
    "\n",
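    "# sharex=False gives each facet its own x-axis scale, which helps when\n",
    "# subscriber counts and view counts differ greatly in magnitude.\n",
    "grid = sns.FacetGrid(df, col='Type', hue='Type', sharex=False, height=8)\n",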
    "grid.map(sns.barplot, 'Count (millions)', 'YouTube Channels',\n",
    "         order=channels)\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### See https://bravo-sierra.medium.com/facetgrid-vs-axessubplot-type-with-seaborn-5aa730dd8add"
   ]
  },
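  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### A minimal sketch of the difference between an axes-level and a figure-level seaborn function. Axes-level functions such as `sns.barplot` return a matplotlib `Axes`, while figure-level functions such as `sns.catplot` build and return a `FacetGrid`. The dataframe `toy` and its column names are made up for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.axes\n",
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "\n",
    "# Hypothetical data, invented for this sketch only.\n",
    "toy = pd.DataFrame({\n",
    "    'kind':  ['a', 'b', 'a', 'b'],\n",
    "    'item':  ['p', 'p', 'q', 'q'],\n",
    "    'value': [1, 2, 3, 4],\n",
    "})\n",
    "\n",
    "# Axes-level: draws on the current axes and returns a matplotlib Axes.\n",
    "bar_axes = sns.barplot(data=toy, x='value', y='item')\n",
    "\n",
    "# Figure-level: creates its own figure and returns a FacetGrid,\n",
    "# here with one facet per value of 'kind'.\n",
    "cat_grid = sns.catplot(data=toy, x='value', y='item', col='kind',\n",
    "                       kind='bar', height=3)\n",
    "\n",
    "print(isinstance(bar_axes, matplotlib.axes.Axes))\n",
    "print(isinstance(cat_grid, sns.FacetGrid))\n",
    "\n",
    "plt.show()"
   ]
  },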
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Adapted from ***Data Visualization with Python***, by Mario Döbler and Tim Großmann, Packt 2019, ISBN 978-1-78995-646-7"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Additional material (c) 2024 by Ronald Mak"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
