Chris Pollett> CS157b
( Print View )

Student Corner:
[Lecture Notes]
[Discussion Board]

Course Info:
[Texts & Links]
[Description]
[Course Outcomes]
[Outcomes Matrix]
[Course Schedule]
[Grading]
[Requirements/HW/Quizzes]
[Class Protocols]
[Exam Info]
[Regrades]
[University Policies]
[Announcements]

HW Assignments:
[Hw1] [Hw2] [Hw3]
[Hw4] [Hw5] [Quizzes]

Practice Exams:
[Midterm] [Final]

HW#5 --- last modified May 03 2023 21:49:47.

Solution set.

Due date: May 15

Files to be submitted:
  Hw5.zip

Purpose: To gain experience with information integration, data mining, and OLAP. Related Course Outcomes:

Related Course Outcomes:

The main course outcomes covered by this assignment are:

CLO7 -- Be able to create or deploy a query mediator for a system with at least two data sources.

CLO8 -- Be able to determine how the A Priori algorithm would operate on a toy dataset

CLO9 -- Design and deploy analytical databases for OLAP.

Specification:

This homework will consist of two parts, a written part and a coding/experiment part. Both parts will be submitted in the Hw5.zip file. The written part should be in a file, named Hw5.pdf, within this zip-file. For the written part of the homework you should write solutions for the book questions listed below. In what you turn in, make sure to write the names and student ids for each group member. For each problem, first copy and paste the question, then write your solution to it beneath it.

  1. Exercise 20.1.1 modified so that (a) p = 4, (b) p= 150, (c) p = 600
  2. Exercise 20.2.2(c) modified as: Express in the map-reduce framework, the following operation: $R \bowtie_{R.A > S.B} S$.
  3. Exercise 20.5.2a (do with three sites 1,2,3 and where 1 wants to abort, 2 and 3 want to commit).
  4. Exercise 20.6.2 (change i to 30%, ii to 15%)
  5. Exercise 22.1.1 (modify so B_4 is {milk, pepsi, coke}

For the coding part of the homework, I want you to write a simulator for the following real world set-up: You have a thermostat app which can be used by households that want to adjust their house temperatures (on a phone but changing info on their thermostat). For each household, this app stores data locally in a sqlite database in the thermostat. The app manufacturer has their own site where they keep an OLAP database using Mysql of what thermostat changes people are doing (so can develop smart features), for which households, when during the day, and to what geographic regions. One could imagine the app developer has a web app that households can use to access info about their thermostat usage. The thermostat app (in the thermostat itself) periodical connects to the developer site and communicate the changes since the last time it spoke. This is essentially a mediated transaction the mobile database and the OLAP database.

Although you could actually code a mobile app and an app for the thermostat for this homework in Java using Android, that would be way too much work. Instead, I want you to write two command line programs to simulate our set-up: ThermostatApp.java and Olap.java. For these programs, the top of your program's class file should have any constants the grader needs to tweak to make a JDBC connection to a Mysql DB or sqlite Db. You should have a README.txt file saying what changes the grader needs to make to get your code to work on the grader's machine. You should also have a file olap.sql, the grader can use to create your Mysql database. The grader will compile your programs from the command line with the lines:

javac ThermostatApp.java
javac Olap.java

ThermostatApp should support three different functionalities, each of which can be tested from the command line, using lines like:

java ThermostatApp set id location current_date_time current_temperature set_temperature
java ThermostatApp show id date
java ThermostatApp send id

Here id should be some 9 digit user_id such as 123456789. The command should store in a sqlite file id.sqlite (where id has been replaced with a number) in a table ADJUSTMENTS, the information location, current_date_time, current_temperature, set_temperature. Here location consists of three int's separated by hyphens, the first being a latitude, the second a longitude, and the last being a device number at that location (if user had multiple thermostats in the same house). For example, 37-121-1. The current_date_time be in the format: YYYY-mm-dd:HH:ii:ss. For example, 2023-05-04:21:07:35. Notice hours are represented using a 24 hour clock, and leading zeros are used if the value is below 10. The current_temperature and set_temperature should be in integers representing temperature in Celcius. An example using this command might be:

java ThermostatApp set 987654321 45-100-1 2023-05-04:21:07:35 56 65

For the second command, data should be in the format: YYYY-mm-dd. The second command can be used to list out changes to temperature settings that were done on the app for a particular id on a particular day. For example,

java ThermostatApp show 987654321 2023-05-04

Might list out:

45-100-1 2023-05-04:03:07:35 56 65
45-100-2 2023-05-04:10:04:42 84 78
45-100-1 2023-05-04:11:02:01 83 76

The id.sqlite files should have a table that is used to keep track of the last time that data was sent to the app developer's servers. The command java ThermostatApp send id should use this last time info and read the data in the ADJUSTMENTS table since that time, and write it into a Mysql database named olap. It should then update the last time data sent info in the id.sqlite database.

Your Olap program should be run from the command line with a command like:

java olap region device_num time_period

It should then output the average set_temperature for that device_num, region, and time period. Further it should also output the rollup (your query should have a WITH ROLLUP as part of its GROUP UP) of these values. Here device_num is the number of a device at a given long-lat (maybe 1 can be thought as downstairs devices and 2 as upstairs devices). Here region is of the form lat1-long1:lat2-long2 representing a rectangle and time period can be YYYY or YYYY-mm, or YYYY-mm-dd. For example,

java olap 35-115:45-125 1 2023-05

You can use a where clause for the region, but do a group by with roll up on the remaining parameters to output:

device year month avg
1      2023   5   68.9
1      2023  NULL 74.2
1      NULL  NULL 75.3
NULL   NULL  NULL 74.1

This completes the description of what your programs need to do for this homework, below is the grading scheme.

Point Breakdown

Written exercises (1pt each) 5pts
README.txt and olap.sql as described1pt
ThermostatApp set, show, and send command work as described using databases as described (1pt each)3pts
Olap program works as described using databases as described1pt
Total10pts