{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9e334ab8-eb54-45bb-ba39-704a46402baa",
   "metadata": {},
   "source": [
    "### <center>San Jose State University<br>Department of Applied Data Science<br><br>**DATA 200<br>Computational Programming for Data Analytics**<br><br>Spring 2024<br>Instructor: Ron Mak</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cafef7ba-5958-4981-bc1b-0e1aaa5e04db",
   "metadata": {},
   "source": [
    "# More `matplotlib`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50739fd6-8e52-453b-811e-a58dbd28d8ca",
   "metadata": {},
   "source": [
    "## `%matplotlib inline` \"magic\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c87d7d46-9e79-456e-83b9-032ea2c6a716",
   "metadata": {},
   "source": [
    "#### When using `matplotlib` in a Jupyter notebook, the \"magic command\"\n",
    "``` Python\n",
    "%matplotlib inline\n",
    "```\n",
    "#### enables the graphs to be drawn inside the notebook.\n",
    "#### (Online forums claim that is no longer necessary with the latest version of Jupyter notebook.)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5af68027-b69d-4ecc-ae85-78b5994fe57b",
   "metadata": {},
   "source": [
    "## The `Figure` container object\n",
    "#### Whenever we create a graph, the highest container for all the objects that make up the graph is a `Figure` object. If we simply call `plt.plot()`, Python implicitly creates the `Figure` object for us.\n",
    "#### For example, we can plot Facebook stock prices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a6a6aed-868f-4574-9085-f0543a6fc8df",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49ca3bb8-d9fd-451d-977e-bbc26d2ab203",
   "metadata": {},
   "outputs": [],
   "source": [
    "fb = pd.read_csv('fb_stock_prices_2018.csv', \n",
    "                 index_col='date', parse_dates=True)\n",
    "fb"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "472ca40c-f59c-4fe0-bec1-d865017fb3a5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Parameters: x values, y values\n",
    "plt.plot(fb.index, fb.open)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c82a0421-f3b5-4cdc-8fef-9d2b94431493",
   "metadata": {},
   "source": [
    "#### We can access an implicitly recreated `Figure` object by calling `plt.figure()` if, for example, we want to change the figure size or its resoluton."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "464b0ff7-dc2a-46e6-b121-773501706f1e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# By default, size units are inches.\n",
    "plt.figure(figsize=(7, 3), dpi=300)\n",
    "\n",
    "plt.plot(fb.index, fb.open)\n",
    "plt.show()"
   ]
  },
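  {
   "cell_type": "markdown",
   "id": "f1a7c3e2-5b0d-4c6e-9a41-2d8e6b7c0a11",
   "metadata": {},
   "source": [
    "#### `plt.figure()` also returns the `Figure` object, so we can keep a reference to it. The following is a minimal sketch, assuming we only want to inspect the object: `get_size_inches()` reports the size, and `plt.gcf()` returns the current figure, which is the one that subsequent `plt.plot()` calls draw on. The variable names are chosen only for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0b6d2f94-7c1e-4a58-8f33-5e9a1c4d7b22",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Create the Figure explicitly and keep a reference to it.\n",
    "fig = plt.figure(figsize=(7, 3), dpi=100)\n",
    "\n",
    "# The size is stored in inches as (width, height).\n",
    "width, height = fig.get_size_inches()\n",
    "print(width, height)\n",
    "\n",
    "# plt.gcf() returns the current figure, which is the one just created.\n",
    "is_current = plt.gcf() is fig\n",
    "print(is_current)\n",
    "\n",
    "plt.close(fig)  # discard the empty figure so that the notebook does not draw it"
   ]
  },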
  {
   "cell_type": "markdown",
   "id": "8b25d9ef-d00b-4725-bee0-e3249d5482e2",
   "metadata": {},
   "source": [
    "## Displaying a graph"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a8bb273b-86f8-4215-94ce-5f568ba420c2",
   "metadata": {},
   "source": [
    "#### Python won't display a graph until we tell it to. That allows us to add features or make changes to the graph by configuring its objects while they are held in memory. We call `plt.show()` to finally display the graph.\n",
    "#### We must call `plt.show()` in a standalone Python program. However, in a Jupyter notebook, executing the cell containing the graph creation code will automatically display it (and remove it from memory). Therefore, in a notebook, it's not necessary to make the call."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "79bad914-857c-4bdc-afab-d8a47e11bf6a",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(fb.index, fb.open)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "292c50b1-d950-43f9-a450-45f2a75b262e",
   "metadata": {},
   "source": [
    "#### Including a call to `plt.show()` in a notebook helps if we later decide to convert the notebook to a standalong Python program. Also, since `plt.show()` has no return value, so it cuts out extraneous output in a notebook cell.\n",
    "#### After it is displayed, the graph's objects are removed from memory. We must recreate the graph to display it again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e75546d-fa6a-4409-86bf-cb2e8a5e2c5a",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.show()  # nothing will be displayed the second time"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7687b653-b543-47db-bcc1-d8900f82d2bb",
   "metadata": {},
   "source": [
    "## Histograms"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cc8593da-340f-4518-9f20-407ffadd2d7e",
   "metadata": {},
   "source": [
    "#### When creating and displaying histograms, do not ignore bin size. Bin size is the width of each subrange of x values for which there is a bar. The smaller the bin size, the more bars."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef646e1d-97ca-4d22-9d4e-1489bb0ad900",
   "metadata": {},
   "outputs": [],
   "source": [
    "quakes = pd.read_csv('earthquakes.csv')\n",
    "quakes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e6af131-3128-4699-a917-40b7d3a65984",
   "metadata": {},
   "source": [
    "#### Note the call to method `query()` below on the dataframe object. We only want to plot the magnitudes with type \"ml\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6c9993ac-9313-49ef-b75b-895f0e062f88",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.hist(quakes.query('magType == \"ml\"').mag)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ec258c1-e128-4dae-9841-907d8abafae1",
   "metadata": {},
   "source": [
    "#### With the default bin size, the data appears to be roughly normally distributed.\n",
    "#### But appearances can be deceiving. Note how the shape of the distribution changes with different bin sizes, especially how the distribution appears to change from unimodal to bimodal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc84af46-7e45-4564-b94a-696d90b303dd",
   "metadata": {},
   "outputs": [],
   "source": [
    "x = quakes.query('magType == \"ml\"').mag\n",
    "\n",
    "# Parameters: number of rows, number of columns\n",
    "fig, axes = plt.subplots(1, 2, figsize=(10, 3))\n",
    "\n",
    "for ax, bins in zip(axes, [7, 35]):\n",
    "    ax.hist(x, bins=bins)\n",
    "    ax.set_title(f'{bins} bins')\n",
    "    \n",
    "plt.show()"
   ]
  },
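  {
   "cell_type": "markdown",
   "id": "3c8e5a17-9d42-4f0b-b6a5-7e1f2d3c4b33",
   "metadata": {},
   "source": [
    "#### The effect of the `bins` parameter can also be checked numerically. The following sketch uses synthetic, randomly generated values (an assumption for illustration, not the earthquake data) and NumPy's `np.histogram()`, which computes the counts and bin edges that a histogram plot draws. More bins over the same range of values means a smaller bin size."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d4b1e60-2a7f-4c93-a5d8-6f0e9b2a1c44",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Synthetic sample for illustration only (not the earthquake data).\n",
    "rng = np.random.default_rng(seed=0)\n",
    "sample = rng.normal(loc=1.5, scale=0.5, size=500)\n",
    "\n",
    "for bins in [7, 35]:\n",
    "    # counts has one entry per bin; edges has one more entry than counts.\n",
    "    counts, edges = np.histogram(sample, bins=bins)\n",
    "    bin_size = edges[1] - edges[0]\n",
    "    print(f'{bins} bins: bin size = {bin_size:.3f}, tallest bar = {counts.max()}')"
   ]
  },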
  {
   "cell_type": "markdown",
   "id": "d44d4e08-31dc-4ea0-b6fe-6b6b1b6c29de",
   "metadata": {},
   "source": [
    "## Subplots"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a91f10e1-2dd2-4b99-b48f-cdf25f0a4a68",
   "metadata": {},
   "source": [
    "#### `plt.subplots()` returns the `Figure` object and the list of `Axes` objects that it contains. Each `Axes` object is a separate plot within the `Figure` container."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b593bcbc-985a-4f5d-8eb0-814f4fda2465",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axes = plt.subplots(1, 2)\n",
    "axes"
   ]
  },
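  {
   "cell_type": "markdown",
   "id": "b27f9c05-41e8-4d6a-8c19-0a5d3e7f6b55",
   "metadata": {},
   "source": [
    "#### With more than one row and more than one column, the returned collection of `Axes` objects is a two-dimensional array. Here is a short sketch with an arbitrary 2-by-3 layout chosen for illustration: we can index a subplot by row and column, or loop over `axes.flat` to visit every subplot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6a03d71-5f2c-4b88-9e4a-1c7b8d9f0a66",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# 2 rows and 3 columns give a 2-D array of Axes objects.\n",
    "fig, axes = plt.subplots(2, 3, figsize=(9, 4))\n",
    "print(axes.shape)\n",
    "\n",
    "# Index a single subplot by [row, column].\n",
    "axes[0, 2].set_title('row 0, column 2')\n",
    "\n",
    "# axes.flat iterates over all six subplots in row-major order.\n",
    "for number, ax in enumerate(axes.flat):\n",
    "    ax.text(0.5, 0.5, str(number), ha='center', va='center')\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },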
  {
   "cell_type": "markdown",
   "id": "246509fb-bccc-4523-bbaa-1d7e9dc2e774",
   "metadata": {},
   "source": [
    "#### `Figure` and `Axes` objects have methods with similar or identical names to their `pyplot` function counterparts. For example,\n",
    "``` Python\n",
    "plt.hist()\n",
    "ax.hist()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e74ba26-c087-4f2d-95ea-02cd3561a527",
   "metadata": {},
   "source": [
    "#### Instead of calling `plt.subplots()`, we can call the `Figure` method `add_axes()`. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3504aa1d-a30a-4932-829f-74ced9bda0b9",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure(figsize=(3, 3))\n",
    "\n",
    "# Parameters left, bottom, width, height:\n",
    "#   left is the distance of the left axis from the left border\n",
    "#   height is the distance of the bottom axis from the bottom border\n",
    "outside = fig.add_axes([0.1, 0.1, 0.9, 0.9])\n",
    "inside  = fig.add_axes([0.7, 0.7, 0.25, 0.25])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acc121ca-aedb-4188-bb26-02e20fda638c",
   "metadata": {},
   "source": [
    "#### And, there's `GridSpec`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19d45567-d051-4a53-98d1-13c2bb89fd5b",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure(figsize=(8, 8))\n",
    "gs  = fig.add_gridspec(3, 3)\n",
    "\n",
    "# Parameter: gs[which rows, which columns]\n",
    "#            can use range notation\n",
    "top_left  = fig.add_subplot(gs[0, 0])\n",
    "mid_left  = fig.add_subplot(gs[1, 0])\n",
    "top_right = fig.add_subplot(gs[:2, 1:])  # rows 0 and 1, cols 1 and 2\n",
    "bottom    = fig.add_subplot(gs[2,:])     # row 2, all columns\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "404feb1d-a52e-4a57-94cd-4cb3b07dc294",
   "metadata": {},
   "source": [
    "## Saving graphs"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5635eb24-79ae-437a-8387-5f1236f0c3af",
   "metadata": {},
   "source": [
    "#### Call `plt.savefig()` to save a graph in an image file. But be sure to call it before displaying the graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "47937b00-2166-4117-af16-7834851d9e96",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(fb.index, fb.open)\n",
    "plt.savefig('FacebookStock.png')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2044ab71-65e3-4730-b067-f17a7a64809d",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.close()  # required for standalone Python programs"
   ]
  },
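  {
   "cell_type": "markdown",
   "id": "7a9c4e28-6b1d-4f35-a0e7-3d5f8b2c1e77",
   "metadata": {},
   "source": [
    "#### `savefig()` is also a method of the `Figure` object, and it accepts options such as `dpi` (resolution) and `bbox_inches='tight'` (trim the surrounding whitespace). The sketch below plots a few made-up values and saves the graph in a temporary directory; the data and the file name are assumptions for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c5d81f36-0e4a-4b72-9f6c-2a8e7d1b3f88",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import tempfile\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(4, 3))\n",
    "ax.plot([1, 2, 3, 4], [10, 12, 9, 14])  # made-up values\n",
    "ax.set_title('Sample plot')\n",
    "\n",
    "# Save before displaying or closing the figure.\n",
    "image_path = os.path.join(tempfile.mkdtemp(), 'sample_plot.png')\n",
    "fig.savefig(image_path, dpi=150, bbox_inches='tight')\n",
    "plt.close(fig)\n",
    "\n",
    "print(os.path.exists(image_path))"
   ]
  },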
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4cfb0376-544c-4cc7-a016-5a7945d7a8ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Additional material (c) 2024 by Ronald Mak"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}