### <center>San Jose State University<br>Department of Applied Data Science<br><br>**DATA 200<br>Computational Programming for Data Analytics**<br><br>Spring 2024<br>Instructor: Ron Mak</center>

# Bar Charts: Movie Comparison

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

%matplotlib inline

## Bar chart parameters
#### To create a bar chart, call `plt.bar(x, height, width)`, where:
- #### *x* is the sequence of x coordinates of the bars
- #### *height* is the sequence of the heights of the bars
- #### *width* is the width of all the bars (optional, default is 0.8)
#### Example:

In [None]:
plt.bar(['Adam', 'Betty', 'Chuck', 'Didi'],
        [75, 97, 85, 92])
plt.title('Test scores')
plt.show()
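
#### The optional *width* parameter can be illustrated with the same data. This is a minimal sketch; the value 0.5 is an arbitrary choice, and the variable names are only for illustration.

```python
import matplotlib.pyplot as plt

names = ['Adam', 'Betty', 'Chuck', 'Didi']
scores = [75, 97, 85, 92]

# Same data as above, but with narrower bars (the default width is 0.8).
bars = plt.bar(names, scores, width=0.5)
plt.title('Test scores (width=0.5)')

# plt.bar() returns a container holding one rectangle per bar,
# so the widths and heights that were drawn can be inspected.
widths = [bar.get_width() for bar in bars]
heights = [bar.get_height() for bar in bars]
```

#### Each entry of `widths` should equal 0.5, and `heights` should match `scores`.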

## Bar chart with subcategories


#### Recall that the top visualization container is a `Figure` object. It can contain multiple `Axes` objects. An `Axes` object is an actual plot or subplot, depending on whether we draw a single plot or multiple plots. An `Axes` object itself contains multiple subobjects, including ones that control axes, tick marks, legends, title, textboxes, grid, and other objects.
#### **NOTE:** Do not confuse an `Axes` object (where the plot lives) with the x ***axis*** and the y ***axis***, or the x and y ***axes***, which are parts of the plot.
#### All the objects are customizable. In the example below, we explicitly get the current `Axes` object with a call to the function `gca()` in order to set some of its attributes.
```
ax = plt.gca()
ax.set_xticklabels(labels)
```
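
#### The relationship between a `Figure`, its `Axes` objects, and each plot's x and y axis can be sketched as follows. The two-subplot layout and the sample scores are assumptions chosen only to make the containment visible.

```python
import matplotlib.pyplot as plt

# Create one Figure that contains two Axes (subplots) side by side.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))

axes[0].bar(['Adam', 'Betty'], [75, 97])
axes[0].set_title('Midterm')
axes[1].bar(['Adam', 'Betty'], [80, 97])
axes[1].set_title('Final')

print(type(fig).__name__)   # Figure
print(len(fig.axes))        # 2

# Each Axes object owns its own x axis and y axis subobjects.
left_x_axis = axes[0].xaxis
left_y_axis = axes[0].yaxis

# Make the left subplot the current Axes; gca() then returns it.
plt.sca(axes[0])
current = plt.gca()
print(current is axes[0])   # True
```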

In [None]:
labels = ['Adam', 'Betty', 'Chuck', 'Didi']
x = np.arange(len(labels))

bar_width = 0.4

# Display the bars side-by-side.
plt.bar(x - bar_width/2, [75, 97, 85, 92],
        width=bar_width, label='Midterm')
plt.bar(x + bar_width/2, [80, 97, 88, 99], 
        width=bar_width, label='Final')

# Get the current Axes object
ax = plt.gca()

# Must set ticks and labels manually.
plt.xticks(x)
ax.set_xticklabels(labels)

# Graph title and legend.
plt.title('Midterm and Final Test scores')
plt.legend()

plt.show()
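
#### The `x - bar_width/2` and `x + bar_width/2` offsets above are the two-series case of a general pattern. The sketch below computes the offsets for any number of series; `bar_offsets` is a hypothetical helper written for this illustration, not a `matplotlib` function.

```python
import matplotlib.pyplot as plt
import numpy as np

def bar_offsets(n_series, bar_width):
    """Return one x offset per series so each group is centered on its tick."""
    return (np.arange(n_series) - (n_series - 1) / 2) * bar_width

labels = ['Adam', 'Betty', 'Chuck', 'Didi']
scores = {'Midterm': [75, 97, 85, 92],
          'Final':   [80, 97, 88, 99]}

x = np.arange(len(labels))
bar_width = 0.4
offsets = bar_offsets(len(scores), bar_width)  # -0.2 and +0.2 for two series

fig, ax = plt.subplots()
for offset, (name, values) in zip(offsets, scores.items()):
    ax.bar(x + offset, values, width=bar_width, label=name)

# Set the tick positions first, then the tick labels.
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.set_title('Midterm and Final Test scores')
ax.legend()
```

#### With three series and a bar width of 0.25, the same helper yields offsets of -0.25, 0, and +0.25.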

#### We will use a bar plot to compare movie scores. You are given five movies with scores from Rotten Tomatoes. The Tomatometer is the percentage of approved Tomatometer critics who have given a positive review for the movie. The Audience Score is the percentage of users who have given a score of 3.5 or higher out of 5. Compare these two scores among the five movies.
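
#### To make the Audience Score definition concrete, here is a small worked calculation. The ratings are made up for illustration and do not come from Rotten Tomatoes.

```python
import numpy as np

# Hypothetical user ratings (out of 5) for one movie; not real data.
ratings = np.array([4.0, 3.5, 2.0, 5.0, 3.0, 4.5, 1.5, 3.5])

# A rating counts as positive when it is 3.5 or higher.
positive = ratings >= 3.5

# The score is the percentage of positive ratings: 5 of 8 here.
audience_score = 100 * positive.mean()
print(f'{audience_score:.1f}%')   # 62.5%
```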

In [None]:
movie_scores = pd.read_csv('movie_scores.csv')
movie_scores

#### Use `matplotlib` to create a visually appealing bar plot comparing the two scores for all five movies.
#### Use the movie titles as labels for the x-axis. Label the y-axis in percentages at intervals of 20, with minor ticks at intervals of 5. Add a legend and a suitable title to the plot.

In [None]:
# Create the figure.
plt.figure(figsize=(10, 5), dpi=300)

# Create the bar plot.
x = np.arange(len(movie_scores['MovieTitle']))
width = 0.3
plt.bar(x - width/2, movie_scores['Tomatometer'], 
        width, label='Tomatometer')
plt.bar(x + width/2, movie_scores['AudienceScore'], 
        width, label='Audience Score')

# Specify ticks.
plt.xticks(x, rotation=10)
plt.yticks(np.arange(0, 101, 20))

# Get the current Axes object for setting tick labels 
# and the horizontal grid
ax = plt.gca()

# Set axis tick labels.
ax.set_xticklabels(movie_scores['MovieTitle'])
ax.set_yticklabels(['0%', '20%', '40%', '60%', '80%', '100%'])

# Add minor ticks for y-axis in the interval of 5.
ax.set_yticks(np.arange(0, 100, 5), minor=True)

# Add major horizontal grid with solid lines.
ax.yaxis.grid(which='major')

# Add minor horizontal grid with dashed lines.
ax.yaxis.grid(which='minor', linestyle='--')

# Add title.
plt.title('Movie comparison')

# Add legend.
plt.legend()

# Show plot.
plt.show()
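
#### As an alternative to hard-coding the y tick labels, the tick spacing and percent formatting can be delegated to `matplotlib.ticker`. This is a sketch under stated assumptions: the DataFrame below is a stand-in with made-up titles and values so that the cell runs without `movie_scores.csv`, and only the column names are taken from the notebook.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import MultipleLocator, PercentFormatter

# Stand-in data with made-up values; the notebook reads movie_scores.csv.
movie_scores = pd.DataFrame({
    'MovieTitle': ['Movie A', 'Movie B', 'Movie C'],
    'Tomatometer': [90, 45, 72],
    'AudienceScore': [85, 60, 70],
})

fig, ax = plt.subplots(figsize=(10, 5))
x = np.arange(len(movie_scores))
width = 0.3
ax.bar(x - width/2, movie_scores['Tomatometer'],
       width, label='Tomatometer')
ax.bar(x + width/2, movie_scores['AudienceScore'],
       width, label='Audience Score')

ax.set_xticks(x)
ax.set_xticklabels(movie_scores['MovieTitle'], rotation=10)
ax.set_ylim(0, 100)

# Major ticks every 20, minor ticks every 5, labels shown as percentages.
ax.yaxis.set_major_locator(MultipleLocator(20))
ax.yaxis.set_minor_locator(MultipleLocator(5))
ax.yaxis.set_major_formatter(PercentFormatter(xmax=100, decimals=0))

# Solid major grid, dashed minor grid, both drawn behind the bars.
ax.yaxis.grid(which='major')
ax.yaxis.grid(which='minor', linestyle='--')
ax.set_axisbelow(True)

ax.set_title('Movie comparison')
ax.legend()
```

#### Because the locators and the formatter compute ticks and labels from the axis limits, the labels stay aligned with the ticks if the y range changes.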

In [None]:
plt.close()

#### Adapted from ***Data Visualization with Python***, by Mario Döbler and Tim Großmann, Packt 2019, ISBN 978-1-78995-646-7

In [None]:
# Additional material (c) 2024 by Ronald Mak