Chris Pollett >
Students > [Bio] [Blog] |
Project BlogWeek 15 - May 10th 2022
Discussed the the tentative slides I had made for my oral defense. Chris showed me the room and
described the procedures that I would have to go through during and after my oral defense. He
emphasized that I also increase the size of the font of my tables and add diagrams of all my
ML models. I still have not filled out some slides and they will have to be done by the 18th.
Things to do for this week:
Week 14 - May 3rd 2022
We went over the rough draft of my report and talked about upcoming tasks that need to be done
in order to pass CS298. I also asked him about graduation details since I missed the deadline to
send in forms to apply for graduation this quarter.
Things to do for this week:
Week 13 - April 26th 2022
Discussed the relevance of results of the experiments that I completed and how to interpret them.
Random noise was applying in larger and larger amounts, however the more noise that was added,
the more galaxy classifications where guessed. This means the galaxy sensor is more noisy. Also
confirms the noise multiplier trend extends all the way down and 0.3 multiplier.
Things to do for this week:
Week 12 - April 19th 2022
I first showed Chris the results of using the classifying model to classify the images using
different denoising methods. Using the denoise_tv_chambolle method from skimage game me results
closest to a 50-50 split without being too biased towards one camera model. This contrasts with
the results that Lucas got because he claimed that wavelet based denoising methods works the best.
I do make the assumption that the best denoising methods would results in a classifier resulting in
a 50-50 split. I also showed him the results of weighting the noise before addition onto cropped
images for spoofing. I found that there is a trend with weights that are smaller (< 1.0) producing
better results for being able to spoof with iphone noise. The opposite is true with galaxy noise
imprinting classifying more correctly with values > 1.0. I still need to test more to find where
where is the optimal values for each class. I should have also mention in the meeting that I handle
the loss of precision when converting floats to integers when saving images by rounding and not
truncation. I also save the resulting images as pngs to not further compress the image with the jpg format more than what is necessary.
Things to do for this week:
Week 11 - April 12th 2022
After generating enough crops to conduct all the experiments. I discovered that
even the base test was having low accuracy. Usually extremely leaning towards
predicting one model over the other. I tried precropping the training set,
saving the crops as pngs, and using other classifying models. However,
I finally discovered that i accidentally let an old variable that cropped the
test images to a larger size than they should be. The result was the model
predicting on test images with large black borders shown below.
Things to do for this week:
Week 10 - April 5th 2022
I was having issues with the test set being heavily favored towards classifying images as iphone
regardless of noise. I explained that the model trains on 5 croppings of a high resolution image
(3024x4032). While the test set just takes 5 croppings (which end up being the whole image) of a
128x128 image. This is due to the GAN only producing 128x128 noise prints. The GAN already takes
20 minutes to train on 240 128x128 image croppings. So, training a single GAN on the whole image
would take 732 times longer (10.2 days) and I need to train one GAN model per camera model. This
is infeasible and Chris told me that I can just make the training set the same as the test set
by doing 5 128x128 croppings of the training set before feeding into the model and just using
the center == true method to avoid the preprocessing taking lh, rb... crops. I also proposed
using the same technique on the images to train the GAN. I will take 5 crops for each image (
maybe in rb, lh... positions or randomly) and use that set to train the GAN. The GAN will end up
having 5x240 = 1200 images to train on. The result will have 128x128 images for testing and
training and manually doing the cropping before data preprocessing.
Things to do for this week:
Week 9 - Mar 29th 2022Spring break, no meeting Week 8 - Mar 22nd 2022Let professor know that I have a current working implementation of a GAN and Classifier. Also mentioned the changes I made to the GAN since last week. I changed the noise prints to be saved as npz files instead of images in order to accurately save the negative noise values since images cannot have negative pixel values. I also removed noise scaling on noise prints. Scaling is useful for discrimination but the generator will not be able to produce noise accurate reconstruction for spoofing. Also confirmed that I need to have three separate sets of data. One set for training the GAN, another for training the Classifier, and the last set (will include spoofed images from the Generator) will be for testing the classifier with 90/10 training test split. Could also use images from the same phone model and test to see if I can spoof specific phones even of the same model type. I will probably use two galaxy s8's and iphone X. Some pictures from the dataset will be from the same tripod position but they will probably be included in the testing set. Original dataset that the classifier used had 275 images per class. So I should also have 275 images per camera model to train the classifier. Maybe GAN can have around 240 images per class. I can make a graph of Accuracy of spoofed images vs training set size of the classifier to see if there is a sharp decrease in accuracy when the sample size drops beneath a threshold. I may be able to infer something at that stage.
Things to do for this week:
Week 7 - Mar 15th 2022Discussed concerns about model run time since the input dimensions are very high if I were to have noise that is RGB and is the full dimensions of the picture. Since noise is thought to be relatively local repeating patterns, Chris suggested that I just use random crops of images and treat them as full size images since most detection models only classify images based off of crops anyways. Chris also mentioned that I need to still update the blog and I need to revise my proposal's timeline so that report is finished by May 1st Things to do for this week:
Week 6 - Mar 8th 2022Canceled meeting due to lack of significant progression on project.
Week 5 - Mar 1st 2022Found an article to base my GAN network around:
Things to do for this week:
Week 4 - Feb 22nd 2022My results with PCA and LBP were not great. I would like to do more extensive testing with the sample size and other hyper parameters, but will have to put that on the backburner, so I can start on trying out GANs.
Things to do for this week:
Week 3 - Feb 15th 2022Experimented with Local binary pattern image augmentation. Inspired by
Things to do for this week:
Week 2 - Feb 8th 2022Was able to find two committee members. Had some conflicts with getting permission to take CS298. Had to contact the CS dept to sort things out. Had to make several Docusign links and had to fill out a "Petition for Advancement to Graduate Candidacy" form. Waiting on Response from CS dept
Things to do for this week:
Week 1 - Feb 1st 2022Learned about the requirements towards getting permission to sign up for CS298. I have to find two full time professors who are willing to review my report and be on my committee.
Things to do for this week:
Week 16 - December 7th 2021Discussed the next steps to look forward to. I have to start thinking about who I want on my committee. Things to do for this week:
Week 15 - November 30th 2021Pre-Meeting NotesWas able to complete report up to Dev 3. The other text below that is filler that will be deleted.
Things to do for this week:
Week 14 - November 23rd 2021Pre-Meeting NotesFollowed these notebooks
Discussed how a GAN would work to suit my needs to solving this source camera identification model and how to spoof images. Discussed the project report specs. Things to do for this week:
Week 13 - November 16th 2021Discussed the research paper that was REF 3. Even though the topic wasn't ideal timing, it was important to see how robust the tests and current algorithms were for noise print detection. The main takeaway from this article was that noiseprints can be completely removed and fool detection models. I need to take this a step further and imprint a fake noise to fool a model to think it's a different class. Things to do for this week:
Week 12 - November 10th 2021Discussed the results of the benchmark and the types of tests they did. Discussed the difficulties of getting the benchmark to run. Decided next steps were to start working on getting a GAN working and to get deliverable 3 on the website. Things to do for this week:
Week 11 - November 2nd 2021Pre-Meeting NotesPotential benchmark
Things to do for this week:
Week 10 - October 26th 2021Pre-Meeting NotesPotential benchmark
Discussed the concerns about how my method will be accurate. Professor explained an experiment that I could do to test my findings. We also discussed methods at which I can find a benchmark. Also need to go back to dev2 sometime and test out if adding plt.gray() makes the noise prints better since they did that in their article. Lukas' article fails with cropping images. I can try resizing and see how that works ref: A_Survey_on_Digital_Camera_Image_Forensic_Methods Things to do for this week:
Week 9 - October 19th 2021Discussed the finding I had with the code I used to generate different denoise images. I found the bior3.5 wavelets had the most consistent results with producing denoised images. I didn't get the chance to test out noise generation methods as I had doubts with the scoring method. After discussing this with Chris, I determined that I would use AVG(PSNR)/VAR for scoring the denoised images and AVG(PSNR) for scoring the noiseprints. Things to do for this week:
Week 8 - October 12th 2021Pre-Meeting Notes
Gave a brief overview on how wavelets can denoise images. Discussed the scikit-image library. The professor gave me advice on how to use PSNR as a metric to decide how good the noise print without having the true raw image. Since the nature of my project assumes that all images have inherent noise. Things to do for this week:
Week 7 - October 5th 2021Pre-Meeting Notes
Spent most of meeting presenting [2]. Asked Prof. Pollet to clarify once more how wavelet transforms is better for signal reconstruction than Fourier transforms. Things to do for this week:
Week 6 - September 28th 2021Pre-Meeting Notes
Sources Taylor Series
Fourier Transforms
Wavelets
Discussed the theory behind taylor series, fourier transforms, and wavelets. Prof. Pollet told me that I could use Wavelet and any type of signal analysis/decomposition methods for any type of data. I was concerned that I would not be able to use wavelets since common applications apply to signals over time, amplitude, and frequency. I can use any type of data as dimensions to decompose over. We also discussed the Heisenberg's uncertainty principle in more detail. It seems that a lot of these electrical engineering concepts can be applied to computer science as well. I was also reminded to try to separate the color channels into three different matrices to reduce dimensions Things to do for this week:
Week 5 - September 21st 2021Discussed Paper 1 again to further analyze the specific method of which they extracted sensor noise patterns from their dataset. The best method the authors found was a wavelet based denoising filter. The paper used a package from another paper but we would like to implement a wavelet based filter ourselves. I also plan to experiment with Fourier transformations as they used that to analyze the properties of noise in each pixel row. Things to do for this week:
Week 4 - September 14th 2021Pre-Meeting Notes
Discussed the results of my experiment 1. Assumed that there were more white pixels than expected is because when converted signed ints to unsigned ints, the two-compliment representation causes the conversion to turn small negatives into large numbers. This turns some spots white. On average some spots will be small numbers and other spots will be large numbers (represented by white and black pixels). Also noticed larger blocks of pixels near the edges of the picture and finer grains of noise near the center. This indicates that the camera is more sensitive towards the middle of the lens. It is also possible that whiter images have less noise (larger blocks in the noise map). Also noticed that I save the noise maps as pngs, there is a little discrepancy since the input images are jpgs but adding another layer of compression shouldn't improve the results too much. Also looked at the dataset and discovered that the images are not of the same objects across different camera models. Things to do for this week:
Week 3 - September 7th 2021Discussed the various ways to take sensor noise patterns and any future complications that the project can hold. Discussed the above dataset and my updated proposal. I presented reference 1. Things to do for this week:
Week 2 - August 31th 2021Discussed further about the topic. Pollet made suggestions on what types of deliverables he would like to see. He also explained what his role would be as my advisor and what he expects of me for CS297 Things to do for this week:
Week 1 - August 24th 2021Met with the rest of his students for the CS297/298 classes and decided when to meet. Made introductions with the other students and got to know each other. Professor suggested this project idea: finding a way to change the sensor data of an image to seem like it was taken from a different camera. Things to do for this week:
|