CS298 Proposal
Adding Differential Privacy to an Open Source Discussion Board System
Pragya Rana (pragya.rana9@gmail.com)
Advisor: Dr. Chris Pollett
Committee Members: Dr. Chris Pollett, Dr. Melody Moh, Mr. Mahesh Subedi
Abstract:
Yioop's discussion board system currently computes various statistics, such as the number of users belonging to a group and the number of views of a thread. When such statistics are made publicly available, there is no guarantee that the privacy of an individual is preserved. This project implements a privacy system for the statistics generated by the Yioop search engine and discussion board system. Differential privacy preserves the privacy of individuals, up to some controllable parameters, when statistics from a database are made public. With this measure, accurate information about the database can be provided while the privacy of each individual is maintained.
CS297 Results
- Gave a presentation on Differential Privacy and gained an understanding of the Yioop system and its handling of privacy.
- Added a statistical chart that appears when one of the most visited threads in Yioop is clicked.
- Developed a test suite of statistical attacks against query and discussion board statistics.
- Implemented differential privacy for the number of views of each group's threads.
Proposed Schedule
Week 1: | Jan 30 - Feb 5 | Complete CS298 Proposal and submit paperwork to the CS department. |
Week 2: | Feb 6 - Feb 12 | Start working on Deliverable #1: Enhance UI Security Feature. |
Week 3: | Feb 13 - Feb 19 | Continue working on Deliverable #1. |
Week 4: | Feb 20 - Feb 26 | Continue working on Deliverable #1. |
Week 5: | Feb 27 - Mar 5 | Complete Deliverable #1. |
Week 6: | Mar 6 - Mar 12 | Start working on Deliverable #2: Randomize user information in the database. |
Week 7: | Mar 13 - Mar 19 | Continue working on Deliverable #2. |
Week 8: | Mar 20 - Mar 26 | Complete Deliverable #2. |
Week 9: | Apr 3 - Apr 9 | Start working on Deliverable #3: Add Differential Privacy to query statistics page. |
Week 10: | Apr 10 - Apr 16 | Continue working on Deliverable #3. |
Week 11: | Apr 17 - Apr 23 | Complete Deliverable #3. |
Week 12: | Apr 24 - Apr 30 | Start working on Deliverable #4: Add Differential Privacy to group statistics page. |
Week 13: | May 1 - May 7 | Continue working on Deliverable #4. |
Week 14: | May 8 - May 14 | Complete Deliverable #4. |
Week 15: | May 15 - May 21 | Start working on CS298 Report and Presentation. |
Week 16: | May 22 - May 28 | Presentation and report. |
Key Deliverables:
- Software
  - Enhance the Security feature in the UI by adding an on/off toggle for Differential Privacy under the Security section of the Yioop search engine.
  - Randomize user information in the database by obfuscating the user id, in order to prevent leaks of user data.
  - Add Differential Privacy to the query statistics page, which displays statistics about each query entered by a user in the search.
  - Add Differential Privacy to the group statistics page, which displays statistics about the number of views for each group.
- Report
  - CS 298 Report
  - CS 298 Presentation
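The second software deliverable, obfuscating the user id, could look roughly like the sketch below. It is an illustration only, not Yioop code: the function name `obfuscate_user_id` and the idea of a server-side `secret_key` are assumptions made for this example. A keyed hash (HMAC-SHA256) is used instead of a plain hash because user ids are typically small integers, and an unkeyed hash of a small integer can be reversed by simply hashing every candidate id.

```python
import hashlib
import hmac


def obfuscate_user_id(user_id, secret_key):
    """Return a stable pseudonym for user_id (hypothetical helper).

    The same (user_id, secret_key) pair always maps to the same
    64-character hex string, so records can still be joined, but the
    original id cannot be recomputed without the secret key.
    """
    message = str(user_id).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"replace-with-a-randomly-generated-secret"  # assumption: kept out of the public data
    print(obfuscate_user_id(42, key))
```

The pseudonym only hides the id itself. Other columns that identify a user (user name, email address) would need the same treatment or removal before any data is published.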
Innovations and Challenges
- Differential privacy is achieved by fuzzifying the actual data, that is, by adding noise to it. The challenge is to determine how much noise to add so that privacy is protected while the published statistics remain as accurate as possible.
- With reference to Deliverable #2, if Yioop's database were made publicly available, what information should be considered sensitive when protecting the privacy of an individual? For example, if a user's id is considered sensitive, the user's information could be randomized by obfuscating the id, for instance with a hash function.
- Relatively few papers have been published on the topic of Differential Privacy, so researching the topic and implementing differential privacy is itself a challenge.
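The noise-calibration challenge in the first item can be illustrated with the Laplace mechanism from Dwork and Roth: for a numeric query, adding Laplace noise with scale sensitivity/epsilon gives epsilon-differential privacy. The sketch below applies it to a view count. The function names, the default sensitivity of 1 (each user changes the count by at most one), and the rounding step are assumptions made for this example, not the project's implementation.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d.
    exponential variables, each with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return a noisy version of true_count (hypothetical helper).

    Smaller epsilon means more noise and stronger privacy; larger
    epsilon means less noise and better accuracy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    noisy = true_count + laplace_noise(sensitivity / epsilon, rng)
    # Rounding and clamping at zero are post-processing steps, so they
    # do not weaken the privacy guarantee of the noisy value.
    return max(0, round(noisy))


if __name__ == "__main__":
    for eps in (0.1, 0.5, 2.0):
        print(eps, private_count(100, eps))
```

With epsilon = 0.5 the noise scale is 2, so a thread with 100 views would typically be reported within a few views of its true count. Python's `random` module is used here only for illustration; it is not a cryptographically secure source of randomness.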
References:
Dwork, C. Differential Privacy. 33rd International Colloquium on Automata, Languages and Programming (ICALP), Part II, 2006.
Dwork, C. and Roth, A. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, Vol. 9, Nos. 3-4, pp. 211-407, 2014.