Chris Pollett >
Students > [Bio] [Del1] [Del2] [Del3] |
Deliverable_TitleDescription: Zipf Distribution: This is an implementation of Zipf's law. According to Zipf's law, the nth most frequently occuring element is inversely proportional to it's rank r: fj = beta* 1/j ^a (n =1,...,N) Data generated by this algorithm, when plotted on logarithmic axes becomes a straight line. Example:This is what my code outputs on these inputs.
Paragraph_to_explain_example. The above output was obtained with beta value of 1000, a value of 1.0 and number of values per line as 10. #include <stdio.h> #include <math.h> #include <stdlib.h> #define MAX_VALUES 1000 long zipf_freq_dist[MAX_VALUES]; void generate_zipf_freq(float alpha, float beta); void output_zipf_dist(int num_vals_per_line); int vals_exist_for_output(void); /* Need to pass in values of alpha, beta and num_vals_per_line. Based on these values, generate_zipf_freq and output_zipf_dist are called. */ int main(int argc, char **argv) { float alpha, beta; int i, j, num_vals_per_line; if (argc < 4) {{ printf("Usage: %s < alpha > < beta > < num_vals_per_line > of output>\n", argv[0]); exit(0); } alpha = atof(argv[1]); beta = atof(argv[2]); num_vals_per_line = atoi(argv[3]); generate_zipf_freq(alpha, beta); output_zipf_dist(num_vals_per_line); } /* * Generate zipf freq.: Computes the frequency of MAX_VALUES ranks. */ void generate_zipf_freq(float alpha, float beta) { int i; for (i = 1; i < MAX_VALUES; i++) { zipf_freq_dist[i] = lroundf(beta / pow(i, alpha)); } } /* Checks if all values of a given frequency have been output already. If not, it is output and frequency is decremented. If that frequency is 0, return. */ void output_zipf_dist(int num_vals_per_line) { int i, j; while (1) { i = 0; while (i < num_vals_per_line) { if (!vals_exist_for_output()) { printf("\n"); return; } j = (lrand48()) % MAX_VALUES; if (zipf_freq_dist[j] > 0) { printf("%d ", j); i++; zipf_freq_dist[j]--; } } printf("\n");; } } /* Checks if frequency os a partic exist. If exist, return 1. */ int vals_exist_for_output() { int k, vals_exist = 0; for (k = 0; k < MAX_VALUES; k++) { if (zipf_freq_dist[k] > 0) { vals_exist = 1; break; } } return vals_exist; } |