San Jose State University : Site Name

Navigation

Main Content

Cats like to be read to.

Ronald Mak

Department of Computer Science
Department of Applied Data Science
Fall Semester 2023

Office hours: W 6:00 – 7:00 PM Clark Hall CL 325
Office location: Clark Hall CL 325 (mostly working from home)
E-mail: ron.mak@sjsu.edu

DATA 225 Database Systems for Analytics (section 21)


M 6:00 - 8:45 PM room BBC 102

Introduction to SQL Queries (website)

Assignments

# Assigned Due Assignment
1 Aug 21 Aug 28 Software installations

Database: school.sql
Python code: Assignment1.py
Jupyter notebook: Assignment1.ipynb
2 Aug 28 Sep 11 Create and load database tables with CSV data

How to access the school server with VPN
3 Sep 11 Sep 18 a. Data modeling (team)

b. SQL practice #1 (individual)
Database import file: ZAGIsales.sql
Solutions: Asssignment3b-solutions.ipynb
4 Sep 18 Sep 25 a. Constraints and normalization

b. Normalization practice (individual)

Lectures

Week Date Content
1 Jan 30 Slides: Course objectives; project teams; team semester project; postmortem report; grades; what is data?; structured vs. unstructured data; what is information?; what is a database?; types of database management systems; operational database; analytical database; database application architecture; table structure; primary key; foreign key; introduction to SQL; SQL syntax; table joins; how queries work
2 Aug 28 Slides: Connect to a database; fetch records; query results in a dataframe; dynamically generated queries; create table; insert records; ETL of CSV data; clear a table; drop databases and tables; export/import a database; create a user; mysql command; database requirements; conceptual, logical, and physical data models; entity-relationship (ER) diagrams; one-to-one, one-to-many, and many-to-many relationships

How to access the school server with VPN

Jupyter notebooks: 01-School.ipynb    02-FetchAll.ipynb    03-FetchOne.ipynb    04-FetchMany.ipynb    05-MabelFlynnStudents-df.ipynb    06-StudentsOf.ipynb    07-CreateTable.ipynb    08-Insert.ipynb    09-Insert-tuple.ipynb    10-Insert-dictionary.ipynb    11-CSV-read.ipynb    12-CSV-ETL.ipynb
Utility files: DATA225utils.py    employee.py
Configuration files: school.ini    school-remote.ini    personnel.ini    titanic.ini
CSV file: TitanicSurvival.csv
3 Sep 11 Slides: ER diagram (conceptual model); entities, attributes, and relationships; ERDPlus; mapping an ER diagram to a relational schema (logical model); entity integrity constraint; referential integrity constraint; mapping a relational schema to a database (physical model); EER diagram
4 Sep 18 Slides: Table creation with constraints; named constraints; table loading with constraints; ALTER TABLE command; DROP TABLE with constraints; functional dependencies; update anomalies; normalization; normal forms 1NF, 2NF, and 3NF

Goals

This class emphasizes developing operational and analytical data management applications in Python that use back-end databases and data warehouses. Project teams will develop interactive applications where users enter data on the command line and GUI-based (graphical user interface) applications where users enter data via dialogs and forms.

Course Learning Outcomes (CLO)

Prerequisites

Classified standing or instructor consent.

Recommended Books

Database Systems:
Introduction to Databases and Data Warehouses, Edition 2.0

Nenad Jukic, Susan Vrbsky, Svetlozar Nestorov, Abhishek Sharma
Prospect Press, 2021

eTextbook: 978-1-943153-67-1
available from Redshelf.com and VitalSource.com

Paperback: 978-1-943153-68-8
available from Redshelf.com

Many examples in class will come from this excellent book.
We will also use the book’s database modeling tools at https://erdplus.com
Principles of Database Management:
The Practical Guide to Storing, Managing and Analyzing Big and Small Data

Wilfried Lemahieu, Seppe vanden Broucke, Bart Baesens
Cambridge University Press, 20185
978-1107186125

A more advanced text.
Well-written with many examples and colorful diagrams.
The Data Warehouse Toolkit:
The Definitive Guide to Dimensional Modeling, 3rd Edition

Ralph Kimball and Margy
Wiley, 2013
978-1118530801

Ralph Kimball is a pioneer of data warehousing and dimensional data modeling.
I have worked closely with Dr. Kimball in the past.

Some Useful Websites