Chris Pollett > Students >
Aarathi

    ( Print View )

    [Bio]

    [Project Blog]

    [CS297 Proposal]

    [Del1]

    [Reading-PPT]

    [Del2]

    [Del3]

    [CS297Report-PDF]

    [CS298Proposal]

    [CS298Del1]

    [Project Code-ZIP]

    [CS298Presentation-PDF]

    [CS298Report-PDF]

                          

























Deliverable_Title

Description: Zipf Distribution: This is an implementation of Zipf's law. According to Zipf's law, the nth most frequently occuring element is inversely proportional to it's rank r:

    fj = beta* 1/j ^a (n =1,...,N)

Data generated by this algorithm, when plotted on logarithmic axes becomes a straight line.

Example:This is what my code outputs on these inputs.
Output:
618 587 491 623 517 565 914 197 519 566
592 554 408 254 450 756 768 619 561 539
926 340 858 788 583 112 571 3 895 492
867 253 160 931 415 51 580 911 845 306
144 574 414 416 583 337 701 977 767 270
199 143 597 69 70 944 160 39 614 785
116 298 114 721 406 185 324 368 239 544
720 678 718 763 704 527 171 957 52 778
689 264 366 585 238 380 244 890 855 937
12 644 313 528 773 770 385 695 426 377
460 343 355 120 464 891 790 27 329 358
403 228 175 81 597 79 621 209 404 631
461 993 223 282 318 1 830 894 375 483
78 663 934 805 198 735 185 792 774 550
...........................

Paragraph_to_explain_example. The above output was obtained with beta value of 1000, a value of 1.0 and number of values per line as 10.


#include <stdio.h>
#include <math.h>
#include <stdlib.h>


#define MAX_VALUES 1000
long zipf_freq_dist[MAX_VALUES];

void generate_zipf_freq(float alpha, float beta);
void output_zipf_dist(int num_vals_per_line);
int vals_exist_for_output(void);

/*
Need to pass in values of alpha, beta and num_vals_per_line.
Based on these values, generate_zipf_freq and output_zipf_dist are called.

*/

int main(int argc, char **argv)
{
   float alpha, beta;
   int i, j, num_vals_per_line;

   if (argc < 4)
   {{
      printf("Usage: %s < alpha > < beta > <
      num_vals_per_line > of output>\n", argv[0]);
      exit(0);
   }

   alpha = atof(argv[1]);
   beta = atof(argv[2]);
   num_vals_per_line = atoi(argv[3]);

   generate_zipf_freq(alpha, beta);
   output_zipf_dist(num_vals_per_line);
}

/*
 * Generate zipf freq.: Computes the frequency of MAX_VALUES ranks.
*/
void generate_zipf_freq(float alpha, float beta)
{
   int i;

   for (i = 1; i < MAX_VALUES; i++)
        {
      zipf_freq_dist[i] = lroundf(beta / pow(i, alpha));
   }
}

/*
Checks if all values of a given frequency have been output already.
If  not, it is output and frequency is decremented. If that frequency is 0, return.
*/
void output_zipf_dist(int num_vals_per_line)
{
  int i, j;

   while (1) {
     i = 0;
     while (i < num_vals_per_line) {
       if (!vals_exist_for_output()) {
         printf("\n");
         return;
       }
       j = (lrand48()) % MAX_VALUES;
       if (zipf_freq_dist[j] > 0) {
         printf("%d ", j);
         i++;
         zipf_freq_dist[j]--;
       }
     }
     printf("\n");;
   }
}

/*
Checks if frequency os a partic exist. If exist, return 1.

*/

int vals_exist_for_output()
{
  int k, vals_exist = 0;

  for (k = 0; k < MAX_VALUES; k++)
  {
    if (zipf_freq_dist[k] > 0)
    {
      vals_exist = 1;
      break;
    }
  }
  return vals_exist;
}