{ "cells": [ { "cell_type": "markdown", "id": "0335ac2a-9011-40af-afec-af28ff4a5199", "metadata": {}, "source": [ "# An example analytic query against a ***dimensional*** model's star schema vs. a ***nondimensional*** model" ] }, { "cell_type": "code", "execution_count": null, "id": "0a5ed588-6c52-40b6-b122-f8d730915e09", "metadata": {}, "outputs": [], "source": [ "from pandas import DataFrame\n", "from DATA225utils import make_connection, dataframe_query" ] }, { "cell_type": "code", "execution_count": null, "id": "2eafa249-996f-4634-a456-05c27385373f", "metadata": {}, "outputs": [], "source": [ "conn_warehouse = make_connection(config_file = 'zagi-warehouse.ini')\n", "conn_sales = make_connection(config_file = 'zagi-sales.ini')" ] }, { "cell_type": "markdown", "id": "b194a3cc-6808-490b-aa5d-64d7d74a87e7", "metadata": {}, "source": [ "## Query against the dimensional model\n", "#### How do the ***quantities of sold products*** in all categories on ***Mondays*** by vendor ***Pacifica Gear*** within the ***Chicagoland*** region during the ***first quarter*** of ***2023*** compare to the ***second quarter*** of ***2023***?\n", "#### We analyze the `units_sold` measure of the `sales` fact table relative to the `calendar`, `store`, and `product` dimension tables." ] }, { "cell_type": "code", "execution_count": null, "id": "4cd0bcda-ef1c-4192-87f1-5da3ce56014d", "metadata": {}, "outputs": [], "source": [ "_, df = dataframe_query(conn_warehouse,\n", " \"\"\"\n", " SELECT c.qtr AS quarter, \n", " SUM(sa.units_sold) AS total_units_sold, \n", " p.product_category_name,\n", " p.product_vendor_name, c.day_of_week\n", " \n", " FROM calendar c, store s, product p, sales sa\n", " \n", " WHERE c.calendar_key = sa.calendar_key\n", " AND s.store_key = sa.store_key\n", " AND p.product_key = sa.product_key\n", " AND p.product_vendor_name = 'Pacifica Gear'\n", " AND s.store_region_name = 'Chicagoland'\n", " AND c.day_of_week = 'Monday'\n", " AND c.year = 2023\n", " AND c.qtr IN (1, 2)\n", "\n", " GROUP BY p.product_category_name, \n", " p.product_vendor_name,\n", " c.day_of_week, \n", " c.qtr\n", " ORDER BY c.qtr\n", " \"\"\"\n", " )\n", "\n", "df" ] }, { "cell_type": "markdown", "id": "91de7eec-d03b-4b71-b8e2-e9fe7cdc2fe1", "metadata": {}, "source": [ "## Query against the nondimensional model\n", "#### How do the ***quantities of sold products*** in all categories on ***Mondays*** by vendor ***Pacifica Gear*** within the ***Chicagoland*** region during the ***first quarter*** of ***2023*** compare to the ***second quarter*** of ***2023***?\n", "#### We must join seven tables: `region`, `store`, `sales_transaction`, `sold_via`, `product`, `vendor`, and `category`. We must call date-extraction functions: `QUARTER`, `DAYNAME`, `WEEKDAY`, and `YEAR` for each row of the result." ] }, { "cell_type": "code", "execution_count": null, "id": "44eb3d54-24fd-4bd8-a253-f903ad21bfa3", "metadata": {}, "outputs": [], "source": [ "_, df = dataframe_query(conn_sales,\n", " \"\"\"\n", " SELECT QUARTER(st.tdate) AS quarter, \n", " SUM(sv.no_of_items) AS total_units_sold, \n", " c.category_name, v.vendor_name,\n", " DAYNAME(st.tdate) AS day_of_week\n", "\n", " FROM region r, store s, sales_transaction st, sold_via sv,\n", " product p, vendor v, category c\n", "\n", " WHERE r.region_id = s.region_id\n", " AND s.store_id = st.store_id\n", " AND st.tid = sv.tid\n", " AND sv.product_id = p.product_id\n", " AND p.vendor_id = v.vendor_id\n", " AND p.category_id = c.category_id\n", " AND v.vendor_name = 'Pacifica Gear'\n", " AND r.region_name = 'Chicagoland'\n", " AND WEEKDAY(st.tdate) = 0\n", " AND YEAR(st.tdate) = 2023\n", " AND QUARTER(st.tdate) IN (1, 2)\n", "\n", " GROUP BY c.category_name, \n", " v.vendor_name,\n", " DAYNAME(st.tdate), \n", " QUARTER(st.tdate)\n", " ORDER BY QUARTER(st.tdate)\n", " \"\"\"\n", " )\n", "\n", "df" ] }, { "cell_type": "code", "execution_count": null, "id": "a4708450-a027-460e-9d92-13c171347b00", "metadata": { "tags": [] }, "outputs": [], "source": [ "conn_warehouse.close()\n", "conn_sales.close()" ] }, { "cell_type": "markdown", "id": "21a3784e-0bd0-44b6-9786-e780d84a577d", "metadata": {}, "source": [ "#### (c) Copyright 2023 by Ronald Mak" ] }, { "cell_type": "code", "execution_count": null, "id": "06a3db89-e2bf-4965-830e-8a691e67e093", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.18" } }, "nbformat": 4, "nbformat_minor": 5 }