{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### <center>San Jose State University<br>Department of Applied Data Science<br><br>**DATA 200<br>Computational Programming for Data Analytics**<br><br>Spring 2024<br>Instructor: Ron Mak</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Bar Charts: Movie Comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bar chart parameters\n",
    "#### To create a bar chart, call `plt.bar(x, height, width)`, where:\n",
    "- #### *x* is the sequence of x coordinates of the bars\n",
    "- #### *y* is the sequence of the heights of the bars\n",
    "- #### *width* is the width of all the bars (optional, default is 0.8)\n",
    "#### Example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.bar(['Adam', 'Betty', 'Chuck', 'Didi'],\n",
    "        [75, 97, 85, 92])\n",
    "plt.title('Test scores')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bar chart with subcategories\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Recall that the top visualization container is a `Figure` object. It can contain multiple `Axes` objects. An `Axes` object is an actual plot or subplot, depending on whether we draw a single plot or multiple plots. An `Axes` object itself contains multiple subobjects, including ones that control axes, tick marks, legends, title, textboxes, grid, and other objects.\n",
    "#### **NOTE:** Do not confuse `Axes` object (where the plot lives) with the x ***axis*** and the y ***axis***, or the x and y ***axes*** which are parts of the plot.\n",
    "#### All the objects are customizable. In the example below, we explictly get the current `Axes` object with a call to function `gca()` in order to set some of its attributes.\n",
    "```\n",
    "ax = plt.gca()\n",
    "ax.set_xticklabels(labels)\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "labels = ['Adam', 'Betty', 'Chuck', 'Didi']\n",
    "x = np.arange(len(labels))\n",
    "\n",
    "bar_width = 0.4\n",
    "\n",
    "# Display the bars side-by-side.\n",
    "plt.bar(x - bar_width/2, [75, 97, 85, 92],\n",
    "        width=bar_width, label='Midterm')\n",
    "plt.bar(x + bar_width/2, [80, 97, 88, 99], \n",
    "        width=bar_width, label='Final')\n",
    "\n",
    "# Get the current Axes object\n",
    "ax = plt.gca()\n",
    "\n",
    "# Must set ticks and labels manually.\n",
    "plt.xticks(x)\n",
    "ax.set_xticklabels(labels)\n",
    "\n",
    "# Graph title and legend.\n",
    "plt.title('Midterm and Final Test scores')\n",
    "plt.legend()\n",
    "\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### We will use a bar plot to compare movie scores. You are given five movies with scores from Rotten Tomatoes. The Tomatometer is the percentage of approved Tomatometer critics who have given a positive review for the movie. The Audience Score is the percentage of users who have given a score of 3.5 or higher out of 5. Compare these two scores among the five movies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "movie_scores = pd.read_csv('movie_scores.csv')\n",
    "movie_scores"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Use `matplotlib` to create a visually-appealing bar plot comparing the two scores for all five movies.\n",
    "#### Use the movie titles as labels for the x-axis. Use percentages in an interval of 20 for the y-axis and minor ticks in interval of 5. Add a legend and a suitable title to the plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the figure.\n",
    "plt.figure(figsize=(10, 5), dpi=300)\n",
    "\n",
    "# Create the bar plot.\n",
    "x = np.arange(len(movie_scores['MovieTitle']))\n",
    "width = 0.3\n",
    "plt.bar(x - width/2, movie_scores['Tomatometer'], \n",
    "        width, label='Tomatometer')\n",
    "plt.bar(x + width/2, movie_scores['AudienceScore'], \n",
    "        width, label='Audience Score')\n",
    "\n",
    "# Specify ticks.\n",
    "plt.xticks(x, rotation=10)\n",
    "plt.yticks(np.arange(0, 101, 20))\n",
    "\n",
    "# Get the current Axes object for setting tick labels \n",
    "# and the horizontal grid\n",
    "ax = plt.gca()\n",
    "\n",
    "# Set axis tick labels.\n",
    "ax.set_xticklabels(movie_scores['MovieTitle'])\n",
    "ax.set_yticklabels(['0%', '20%', '40%', '60%', '80%', '100%'])\n",
    "\n",
    "# Add minor ticks for y-axis in the interval of 5.\n",
    "ax.set_yticks(np.arange(0, 100, 5), minor=True)\n",
    "\n",
    "# Add major horizontal grid with solid lines.\n",
    "ax.yaxis.grid(which='major')\n",
    "\n",
    "# Add minor horizontal grid with dashed lines.\n",
    "ax.yaxis.grid(which='minor', linestyle='--')\n",
    "\n",
    "# Add title.\n",
    "plt.title('Movie comparison')\n",
    "\n",
    "# Add legend.\n",
    "plt.legend()\n",
    "\n",
    "# Show plot.\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Adapted from ***Data Visualization with Python***, by Mario Döbler and Tim Großmann, Packt 2019, ISBN 978-1-78995-646-7"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Additional material (c) 2024 by Ronald Mak"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}