Introduction to SQL Queries (website)
Week | Date | Content |
---|---|---|
1 | Jan 30 | Slides: Course objectives; project teams; team semester project; postmortem report; grades; what is data?; structured vs. unstructured data; what is information?; what is a database?; types of database management systems; operational database; analytical database; database application architecture; table structure; primary key; foreign key; introduction to SQL; SQL syntax; table joins; how queries work |
2 | Aug 28 |
Slides:
Connect to a database; fetch records; query results in a
dataframe; dynamically generated queries; create table;
insert records; ETL of CSV data; clear a table; drop
databases and tables; export/import a database; create a
user; mysql command; database requirements; conceptual,
logical, and physical data models; entity-relationship
(ER) diagrams; one-to-one, one-to-many, and many-to-many
relationships
How to access the school server with VPN Jupyter notebooks: 01-School.ipynb 02-FetchAll.ipynb 03-FetchOne.ipynb 04-FetchMany.ipynb 05-MabelFlynnStudents-df.ipynb 06-StudentsOf.ipynb 07-CreateTable.ipynb 08-Insert.ipynb 09-Insert-tuple.ipynb 10-Insert-dictionary.ipynb 11-CSV-read.ipynb 12-CSV-ETL.ipynb Utility files: DATA225utils.py employee.py Configuration files: school.ini school-remote.ini personnel.ini titanic.ini CSV file: TitanicSurvival.csv |
3 | Sep 11 | Slides: ER diagram (conceptual model); entities, attributes, and relationships; ERDPlus; mapping an ER diagram to a relational schema (logical model); entity integrity constraint; referential integrity constraint; mapping a relational schema to a database (physical model); EER diagram |
4 | Sep 18 | Slides: Table creation with constraints; named constraints; table loading with constraints; ALTER TABLE command; DROP TABLE with constraints; functional dependencies; update anomalies; normalization; normal forms 1NF, 2NF, and 3NF |
5 | Sep 25 |
Checkpoint Exam #1
Solutions: Checkpoint-1-solutions.pptx Slides: SELECT; aggregate functions; GROUP BY; HAVING; nested queries; IS NULL; EXISTS; NOT; INSERT INTO SELECT Jupyter notebooks: Nested-SELECT-Sales.ipynb INSERT-INTO-SELECT-Titanic.ipynb SELECT-IN-School.ipynb SELECT-EXISTS-Buiilding.ipynb Databases: ZAGIsales.sql building.sql |
6 | Oct 2 |
Slides:
ALTER TABLE; cascading deletes and updates; windowing
functions; MYSQL text, numeric, date and time functions;
relational algebra; joins; join aliases; self, inner,
left outer, right outer, and full outer joins; views;
indexing
Jupyter notebooks: Windowing.ipynb StringConcat.ipynb Joins.ipynb Views.ipynb Indexes.ipynb Example semester project: Overivew.pptx TheDeciders_Report_DATA225.pdf Demo.mov |
7 | Oct 9 |
Slides:
Nested SELECTS and views; database design example;
stored procedures; Python GUI-based database application
Notebooks: NestedSelect-View.ipynb CarDealership.ipynb StudentProcs.sql StudentProcsInstall.ipynb StudentProcsCall.ipynb TitanicProcs.sql TitanicProcsInstall.ipynb TitanicProcsCall.ipynb MonthProcs.sql MonthProcsInstall.ipynb MonthProcsCall.ipynb GUI-based database application: SchoolApp.py SchoolApp.ipynb MainSchoolWindow.py StudentClassesDialog.py TeacherStudentsDialog.py TeacherProcsInstall.ipynb main_school_window.ui student_classes_dialog.ui teacher_students_dialog.ui DemoGUI.zip Remote database server: SchoolServer.pptx |
8 | Oct 16 |
Slides:
Triggers and events;
object-relational mapping (ORM); SQLAlchemy;
transaction management; ACID properties; concurrency
problems; transaction conflicts; locking; deadlocks;
two-phase locking; recovery; checkpoints; data loss and
recovery; reliability and RAID
Jupyter notebooks: Triggers.ipynb Events.ipynb ORM-school.ipynb ORM-customer.ipynb Transactions-commands.ipynb Transactions-methods.ipynb |
9 | Oct 23 |
Slides:
Generating bar charts;
data wrangling; replace missing values with averages
and by regression analysis; outliers; team projects
Jupyter notebooks: TitanicBarchart-Python.ipynb TitanicBarChart-SQL.ipynb TitanicReplaceAges-Python.ipynb TitanicReplaceAges-SQL.ipynb StudentBooks-Python.ipynb StudentBooks-SQL.ipynb HomePrices-SQL.ipynb ThreeIndependents-SQL.ipynb |
10 | Oct 30 |
Checkpoint Exam #2
Solution: Checkpoint2-solutions.ipynb TitanicWindow.py titanic_window.ui Slides: Operational vs. analytical databases; application-oriented vs. subject-oriented; data warehouses; data warehouse components; business intelligence (BI); dimensional modeling; dimension tables; fact tables; star schema |
11 | Nov 6 |
Slides:
Dimensional modeling and star schemas; dimension and fact
tables; ETL to create a dimensional model; queries
against a star schema, transaction ID and times; multiple
fact tables; detailed vs. aggregated fact tables;
surrogate keys and slowly changing dimensions
Jupyter notebooks: zagi_warehouse.ipynb zagi_display_tables.ipynb zagi_queries.ipynb zagi_aggregated.ipynb Data warehouse: zagi_warehouse.sql Data warehouse sources: zagi_sales.sql zagi_facilities.sql zagi_customers.sql |
12 | Nov 13 |
Slides:
ETL diagrams; factless fact table;
GROUP BY ... WITH ROLLUP; museum field trip; time-series
analysis; SQL support for moving averages; SQL support
for exponential smoothing; executive dashboard
Jupyter notebooks: Students-star.ipynb Students-rollup.ipynb MovingAverage.ipynb ExponentialSmooting.ipynb Databases: students.sql stock_prices.sql |
13 | Nov 20 |
Slides:
Project survey questions; NoSQL databases; structured vs.
semi-structured data; document model; CAP theorem;
MongoDB database; pymongo; collections and documents;
queries with joins and matches
Jupyter notebooks (must be executed in this order): 13-1-SchoolDatabase.ipynb 13-2-ContactCollection.ipynb 13-3-StudentCollection.ipynb 13-4-TeacherCollection.ipynb 13-5-ClassCollection.ipynb 13-6-TakesCollection.ipynb 13-7-SchoolQueries.ipynb |
14 | Nov 27 |
Checkpoint Exam #3
Solutions: Checkpoint3-solutions.ipynb Slides: Query optimization; public key cryptography; data virtualization; distributed databases; cloud computing Bonus notebooks: Stress.ipynb ChatGPT.ipynb |
Classified standing or instructor consent.
Database Systems: Introduction to Databases and Data Warehouses, Edition 2.0 Nenad Jukic, Susan Vrbsky, Svetlozar Nestorov, Abhishek Sharma Prospect Press, 2021 eTextbook: 978-1-943153-67-1 available from Redshelf.com and VitalSource.com Paperback: 978-1-943153-68-8 available from Redshelf.com Many examples in class will come from this excellent book. We will also use the book’s database modeling tools at https://erdplus.com |
Principles of Database Management: The Practical Guide to Storing, Managing and Analyzing Big and Small Data Wilfried Lemahieu, Seppe vanden Broucke, Bart Baesens Cambridge University Press, 20185 978-1107186125 A more advanced text. Well-written with many examples and colorful diagrams. |
The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition Ralph Kimball and Margy Wiley, 2013 978-1118530801 Ralph Kimball is a pioneer of data warehousing and dimensional data modeling. I have worked closely with Dr. Kimball in the past. |