{ "cells": [ { "cell_type": "markdown", "id": "cceb0c53-86e0-416b-b626-c8ba5e2f3695", "metadata": {}, "source": [ "###
San Jose State University
Department of Applied Data Science

**DATA 200
Computational Programming for Data Analytics**

Spring 2024
Instructor: Ron Mak

**Assignment #5
Dictionaries and Sets**

Assigned: February 29, 2024
Due: March 7 at 5:30 PM

200 points maximum
Individual work only!
" ] }, { "cell_type": "markdown", "id": "0df5d653-3774-4e97-969f-1e7c92713271", "metadata": {}, "source": [ "# Dictionaries" ] }, { "cell_type": "markdown", "id": "938e22b7-8c9c-4a00-a33d-0512915085ec", "metadata": {}, "source": [ "## Using a dictionary often can simplify code and improve performance." ] }, { "cell_type": "markdown", "id": "54960783-b919-4b54-8ddc-aa03414fff57", "metadata": {}, "source": [ "#### **PROBLEM 1:** [60 points] Rewrite the number translator program in the Feb. 15 notebook `NumberTranslator.ipynb` using a dictionary containing the number words and their values. For example,\n", "``` python\n", "words = { 1: 'one', 2: 'two' }\n", "```\n", "#### You should be able to replace the `if elif else` statements." ] }, { "cell_type": "markdown", "id": "c76a8f14-ad9f-4a36-82fe-71bc61586ffd", "metadata": {}, "source": [ "#### **PROBLEM 2:** [40 points] Test your rewritten number translator with 10 randomly generated integer values from 1 through 999,999,999 inclusive." ] }, { "cell_type": "markdown", "id": "3c68a2de-4519-42c4-b799-145f069ab64a", "metadata": {}, "source": [ "# Sets" ] }, { "cell_type": "markdown", "id": "e3bab89b-3dd2-44b2-ac01-7a5ca2a0a7a7", "metadata": {}, "source": [ "## Great software design involves not only writing good code, but also choosing the right way to store your data in memory.\n", "#### Suppose you have a large collection of unsorted random values and you need to search the collection for particular target values. Should you store the random values in a list or in a set? Which data structure is more efficient for searches?" ] }, { "cell_type": "markdown", "id": "47e9c29f-3ab1-49b3-98a7-6537afd13ad5", "metadata": {}, "source": [ "#### **PROBLEM 3:** [20 points] Generate 1,000,000 random integer values in the range 0 through 9,999,999 inclusive. Each time you generate a random value, append it to a list named `randoms_list` and also enter it into a set named `randoms_set`. 
Both the list and the set should initially be empty, and when you're done with this step, both will contain the same random values. It is OK if the list contains some duplicate values. Do not sort the list. Of course, the set will have only unique values." ] }, { "cell_type": "markdown", "id": "69c6529c-5efb-4245-b0de-23b13363cd81", "metadata": {}, "source": [ "#### **PROBLEM 4:** [20 points] Create an initially empty list named `target_values`. In each iteration of a loop, generate a random value in the range 0 through 9,999,999 inclusive and append it to list `target_values`. Use the `in` operator to search for the random value in `randoms_set`. Stop the loop after you've generated the 10th random value that was found. At the end of this step, `target_values` will be a list of random values, 10 of which were found and the rest of which were not.\n", "#### Since both `randoms_list` and `randoms_set` contain the same values, each target value will be in both `randoms_list` and `randoms_set`, or in neither. " ] }, { "cell_type": "markdown", "id": "534b4e78-02d1-40b6-8315-7cd005b2185a", "metadata": {}, "source": [ "#### **PROBLEM 5:** [30 points] Time how long it takes to search `randoms_list` using the `in` operator for all the values in `target_values`, one at a time. Calculate the elapsed time in milliseconds that the code took to execute, using a pattern like this:\n", "``` python\n", "import time\n", "\n", "start_time = time.process_time_ns()\n", "\n", "### Code to be timed here. ###\n", " \n", "end_time = time.process_time_ns()\n", "\n", "elapsed_time = round((end_time - start_time)/10**6, 3)\n", "print(f\"Elapsed time: {elapsed_time:,.3f} ms\")\n", "```\n", "#### The code to be timed should be the searches of `randoms_list` for each value in `target_values`. Store into list `found_in_list` the values that were found. After all the searches are done and you've recorded the elapsed time, print the elapsed time. Then print the contents of list `found_in_list`. 
(Printing this list should **not** be part of the timed code.)" ] }, { "cell_type": "markdown", "id": "90a424f2-f71c-441d-827f-4606f1062614", "metadata": {}, "source": [ "#### **PROBLEM 6:** [30 points] Time how long it takes to search `randoms_set` using the `in` operator for all the values in `target_values`, one at a time. Calculate and print the elapsed time in milliseconds that the code took to execute. Store all the found values in list `found_in_set`. After printing the elapsed time, print the contents of `found_in_set`, which should contain the same values as list `found_in_list`." ] }, { "cell_type": "markdown", "id": "8781c245-4f4f-4c59-afa6-4514f3d2d74b", "metadata": {}, "source": [ "#### Which was more efficient for searches, the list or the set? How much more efficient was one than the other? What is the ratio of your list search time to your set search time?" ] }, { "cell_type": "code", "execution_count": null, "id": "adbb1f25-866d-4f12-8a98-6221ca8fb8cf", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }