
A first look into transformation functions and wavelet decomposition

Justin Chang (justin.h.chang@sjsu.edu)

Purpose: My goal for this deliverable was to thoroughly research and practice using signal-analysis functions. Signal decomposition is used for denoising images. For my project, I assume the images already contain inherent noise; I denoise each image and obtain the noise by subtracting the denoised image from the original noisy one. Once I can obtain these "noiseprints", I will be able to run experiments to find the best parameters for denoising an image.
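The residual ("noiseprint") extraction described above can be sketched as follows. `extract_noiseprint` is a hypothetical helper name, and the min/max rescaling mirrors the "scaled" variant used in the experiments; it assumes the inputs are image arrays of the same shape:

```python
import numpy as np

def extract_noiseprint(noisy, denoised):
    # Residual between the noisy image and its denoised version,
    # rescaled to [0, 255] for saving/visualization (one of several
    # possible normalizations).
    diff = noisy.astype(float) - denoised.astype(float)
    lo, hi = diff.min(), diff.max()
    scaled = (diff - lo) * (255.0 / (hi - lo))
    return scaled.astype(np.uint8)
```

The rescaling preserves the relative structure of the residual while mapping it into a displayable 8-bit range.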

Conclusion: To reach an understanding of how wavelet decomposition works, I first had to refamiliarize myself with Taylor series and understand how a function can be approximated by a polynomial. The point of transforming a function in this manner is that polynomials are much easier to compute, differentiate, and integrate, making them much easier to analyze. The Fourier transform was the next step toward understanding the theory behind wavelets. It is heavily used in signal analysis because it maps a signal from the time domain to the frequency domain. It is similar to a Taylor series, but instead of polynomials the signal is decomposed into a series of sinusoids. Although the plain Fourier transform loses information about time, the short-time Fourier transform recovers approximate time localization by sliding a window over the signal. However, because of the Heisenberg uncertainty principle, the window size stays fixed, forcing a trade-off between time resolution and frequency resolution. Wavelets are typically much more accurate at decomposing a signal because of several properties: they have compact support, so they represent local regions of a signal much better, and they support a variety of window sizes, with wide time windows for low frequencies and narrow time windows for high frequencies.

The idea behind wavelet denoising of an image is to filter out the small coefficients that contribute little to the shape of the signal. In image terms, this smooths out the small pixel-level variations that ultimately represent noise. I use the scikit-image library to denoise images and compare PSNR values between pairs of images to score each denoising method. Since there is no original (clean) image to compare against, I currently just measure how consistent each method is: I compare each noise-extraction method on how similar its results are across different images, since the assumption is that a phone's noiseprint should remain consistent no matter what the picture is of. My experiments showed that bior3.5 is the most consistent. I also tested three different methods to generate the noise image: scaled, absolute value, and uint. The scaled method had the highest average pairwise PSNR, so I determined that it produces the most similar results.
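The coefficient-thresholding idea can be illustrated with a minimal one-level Haar transform. This is a hand-rolled sketch, not the bior3.5 wavelet the experiments actually use: zeroing the small detail coefficients smooths out small variations while preserving the large step in the signal.

```python
import numpy as np

def haar_step(signal):
    # One level of the (unnormalized) Haar wavelet transform:
    # pairwise averages (approximation) and differences (detail).
    s = np.asarray(signal, dtype=float)
    approx = (s[0::2] + s[1::2]) / 2
    detail = (s[0::2] - s[1::2]) / 2
    return approx, detail

def haar_inverse(approx, detail):
    # Exact inverse of haar_step.
    out = np.empty(approx.size * 2)
    out[0::2] = approx + detail
    out[1::2] = approx - detail
    return out

signal = np.array([4.0, 4.1, 8.0, 7.9])
approx, detail = haar_step(signal)
# "Denoise" by zeroing detail coefficients below a threshold,
# analogous to the soft-thresholding done by wavelet denoisers.
detail[np.abs(detail) < 0.2] = 0.0
denoised = haar_inverse(approx, detail)
```

Without thresholding, `haar_inverse(haar_step(x))` reconstructs the signal exactly; with it, the tiny 0.1 fluctuations vanish while the jump from ~4 to ~8 survives.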

Code:



import matplotlib.pyplot as plt

from skimage.restoration import (denoise_wavelet, estimate_sigma)
from skimage import data, img_as_float
from skimage.util import random_noise
from skimage.metrics import peak_signal_noise_ratio
import skimage.io
from PIL import Image
import os, sys
import numpy as np
import statistics

def read_img(jpg_path):
    img = skimage.io.imread(jpg_path)
    noisy = img_as_float(img)
    return noisy

# Short method names mapped to the wavelet names PyWavelets expects;
# every method shares the same BayesShrink soft-thresholding settings.
WAVELETS = {'bior35': 'bior3.5', 'bior44': 'bior4.4',
            'coif4': 'coif4', 'coif8': 'coif8',
            'db4': 'db4', 'db8': 'db8',
            'sym4': 'sym4', 'sym8': 'sym8'}

def denoise(noisy, dm):
    im_denoised = denoise_wavelet(noisy, convert2ycbcr=True, multichannel=True,
                                  method='BayesShrink', mode='soft',
                                  wavelet=WAVELETS[dm],
                                  rescale_sigma=True, wavelet_levels=5)
    im_denoised = np.clip(im_denoised, 0, 1)
    return im_denoised

rootdir = 'C:/Users/Justin/Documents/SJSU/masters_project/deliverable2'
os.chdir(rootdir)
dm_list = ['bior35', 'bior44', 'coif4', 'coif8', 'db4', 'db8', 'sym4', 'sym8']
# dm_list = ['bior35']
score = {'blk':{}, 'cup':{}, 'plushie':{}, 'towel':{}, 'wall':{}}

for file in os.listdir(rootdir):
    item_dir = os.path.join(rootdir, file)
    if os.path.isdir(item_dir):
        print(item_dir)
        for dm in dm_list:
            for filename in os.listdir(item_dir):
                if filename.endswith(".JPG"):
                    jpg_path = os.path.join(item_dir, filename)
                    noisy = read_img(jpg_path)
                    im_denoised = denoise(noisy, dm)
                    plt.imsave(os.path.join(item_dir, dm, filename), im_denoised)
                # pass
            denoised_im_name_list = os.listdir(os.path.join(item_dir,dm))
            denoised_im_list = [read_img(os.path.join(item_dir,dm,im_name)) for im_name in denoised_im_name_list]
            # print(denoised_im_list)
            psnr_list = []
            for i in range(len(denoised_im_list)):
                for j in range(i+1, len(denoised_im_list)):
                    # print(i,j)
                    psnr = peak_signal_noise_ratio(denoised_im_list[i], denoised_im_list[j])
                    psnr_list.append(psnr)
            variance = statistics.variance(psnr_list)
            avg = np.average(psnr_list)
            print(f'Avg {dm}: {avg}\nVar {dm}: {variance}\n')
            score[file][dm] = avg/variance

print(score)
with open('dictionary_score.txt', 'w') as data:
    data.write(str(score))


with open('best scores.txt','w') as data: 
    for item in score:
        tuple_list = score[item].items()
        print(f'Max score for {item}: {max(tuple_list,key=lambda x:x[1])}')
        data.write(f'Max score for {item}: {max(tuple_list,key=lambda x:x[1])}\n')

#ratio of avg psnr / variance


npbior_path = 'C:/Users/Justin/Documents/SJSU/masters_project/noiseprints'
chosen_wavelet = ['bior35']
np_score = {'abs':{}, 'scaled':{}, 'uint':{}}


for file in os.listdir(rootdir):
    item_dir = os.path.join(rootdir, file)
    if os.path.isdir(item_dir):
        print(item_dir)
        for dm in chosen_wavelet:
            for filename in os.listdir(item_dir):
                if filename.endswith(".JPG"):
                    jpg_path = os.path.join(item_dir, filename)
                    # print(jpg_path)
                    noisy = read_img(jpg_path)
                    # print(os.path.join(item_dir, 'bior35', filename))
                    denoised_img = read_img(os.path.join(item_dir, 'bior35', filename))
                    noise_print = Image.fromarray((noisy*255 - denoised_img*255).astype(np.uint8))
                    tmp_path = os.path.join(npbior_path, 'uint', filename)
                    # print('= ' + tmp_path)
                    plt.imsave(tmp_path, noise_print)

                    tmp_path = os.path.join(npbior_path, 'abs', filename)
                    noise_print = Image.fromarray(abs((noisy*255 - denoised_img*255)).astype(np.uint8))
                    plt.imsave(tmp_path, noise_print)

                    tmp_path = os.path.join(npbior_path, 'scaled', filename)
                    noise_diff = noisy*255 - denoised_img*255
                    min_a = np.amin(noise_diff)
                    max_a = np.amax(noise_diff)
                    noise_diff_scaled = (noise_diff-min_a) * (255/(max_a - min_a))
                    noise_print = Image.fromarray(noise_diff_scaled.astype(np.uint8), 'RGB')
                    plt.imsave(tmp_path, noise_print)
for file in os.listdir(npbior_path):
    item_dir = os.path.join(npbior_path, file)
    if os.path.isdir(item_dir):
        print(item_dir)
        noise_im_name_list = os.listdir(os.path.join(item_dir))
        noise_im_list = [read_img(os.path.join(item_dir,im_name)) for im_name in noise_im_name_list]
        psnr_list = []
        for i in range(len(noise_im_list)):
            for j in range(i+1, len(noise_im_list)):
                # print(i,j)
                psnr = peak_signal_noise_ratio(noise_im_list[i], noise_im_list[j])
                # print(psnr)
                psnr_list.append(psnr)
        avg = np.average(psnr_list)
        np_score[file] = avg


print(np_score)
with open('dictionary_np_score.txt', 'w') as data:
    data.write(str(np_score))

with open('best np_scores.txt', 'w') as data:
    # Note: np_score maps method name -> average PSNR (a single float),
    # so we take the max over its items directly.
    best = max(np_score.items(), key=lambda x: x[1])
    print(f'Max np score: {best}')
    data.write(f'Max np score: {best}\n')

# Compute the average PSNR between pairs of images:
# loop through all images and subtract the denoised image from the noisy image
# to get the noiseprint, then save all the noiseprints to the folder in npbior_path
# that corresponds to the noise generation method (i.e. uint, abs, scaled), then do
# a pairwise PSNR comparison and take the average of all PSNR scores. The highest of
# the three scores means that method's noiseprints are the most similar across all
# images, making it the better method.