Chris Pollett > Students > Snigdha

Deliverable 4 - Git Clone Using cURL Requests

The aim of this deliverable is to reproduce the effect of git clone using cURL requests, without using any external libraries. Working on it helped me understand the structure and organization of Git objects.

As part of this deliverable, I configured a local Git repository so that I could observe the requests made during a clone operation. I set the repository up with XAMPP, using WebDAV. By default, WebDAV is commented out in XAMPP's httpd.conf file, so as part of the initial setup I uncommented it and added the location of the remote repository to httpd.conf. I created a directory for the repository inside XAMPP's document root, initialized this repository.git as a bare repository, and updated its server information. Next, I created a nested folder structure containing Java and Python source files to push into the repository. I ran git init on the base folder of this structure, then git add and git commit, and finally git push to add the committed files to the bare repository. I then performed a git clone and checked the access log to find the requests made during the clone.
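The Git side of this setup can be sketched as a sequence of commands. The snippet below is only an illustration, not the commands actually used: the paths, the commit message, and the branch name master are assumptions, the push goes to the bare repository by file path rather than through WebDAV, and the Apache configuration is not shown. The final update-server-info is added on that assumption, so that info/refs lists the pushed branch.

```python
import subprocess


def setup_commands(bare_repo, work_dir):
    """Return the git commands for the setup steps, in order.

    bare_repo and work_dir are hypothetical paths; work_dir must already
    hold the nested Java and Python source files.
    """
    return [
        # Create the bare repository inside the web server's document root.
        ["git", "init", "--bare", bare_repo],
        # Write info/refs and related files so plain HTTP GETs can find refs.
        ["git", "--git-dir", bare_repo, "update-server-info"],
        # Turn the nested source folder into a repository and commit it.
        ["git", "-C", work_dir, "init"],
        ["git", "-C", work_dir, "add", "."],
        ["git", "-C", work_dir, "commit", "-m", "Add nested sources"],
        # Push the commit to the bare repository (by file path here).
        ["git", "-C", work_dir, "push", bare_repo, "HEAD:refs/heads/master"],
        # Refresh the server info so the new branch is listed (assumption).
        ["git", "--git-dir", bare_repo, "update-server-info"],
    ]


def run_all(commands):
    """Run each command, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical locations; adjust before running. The commit step also
    # needs a configured user.name and user.email.
    run_all(setup_commands("/opt/lampp/htdocs/repository.git", "/tmp/sources"))
```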

In a Git repository, the actual content of each file is stored as a blob object. Each object is identified by its SHA-1 hash, written in hexadecimal. Git tree objects hold the overall organization of the blobs. A tree object begins with a header giving its size. The header is followed by a repeating pattern, one entry per file or folder: the Unix access mode, then the file or folder name, then the twenty-byte SHA-1 hash in binary. In a bare repository, the first two hexadecimal characters of the hash name a folder inside the objects folder, and the remaining characters name the file within it. Each nested folder is itself represented by a tree object describing the files inside it. The first-level folders are described by the top-level (root) tree object, which may reference other tree objects depending on the nesting. Blob and tree objects are stored compressed with zlib.
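The layout described above can be illustrated with a short Python sketch. It is not the PHP program from this deliverable. The function names are hypothetical, and the sample tree is built in memory rather than read from a real repository. Within each entry, the mode and name are separated by a space, and the name ends with a NUL byte before the binary hash.

```python
import hashlib
import zlib


def make_object(kind, body):
    """Build a loose object; return (hex SHA-1, zlib-compressed bytes)."""
    raw = kind + b" " + str(len(body)).encode() + b"\x00" + body
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def object_path(sha_hex):
    """First two hex characters name the folder, the other 38 the file."""
    return f"objects/{sha_hex[:2]}/{sha_hex[2:]}"


def parse_tree(compressed):
    """Inflate a tree object and return its (mode, name, hex hash) entries."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\x00")  # header is b"tree <size>"
    kind, size = header.split(b" ")
    if kind != b"tree" or int(size) != len(body):
        raise ValueError("not a well-formed tree object")
    entries = []
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)        # end of the access mode
        nul = body.index(b"\x00", space)     # end of the file or folder name
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        sha_hex = body[nul + 1:nul + 21].hex()  # 20 binary bytes to 40 hex
        entries.append((mode, name, sha_hex))
        pos = nul + 21
    return entries


# Sample data made up for illustration: one blob listed in one tree.
blob_sha, _ = make_object(b"blob", b"print('hi')\n")
tree_sha, tree_data = make_object(
    b"tree", b"100644 hello.py\x00" + bytes.fromhex(blob_sha)
)
print(parse_tree(tree_data))
print(object_path(tree_sha))
```

Folder entries use the mode 40000 and point at another tree object, which is how nesting is represented.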

As part of this deliverable, I wrote a PHP program that issues a cURL request for each GET request made by the git clone operation. The data returned by each request is uncompressed to reveal the actual content. After uncompressing a tree object, the program extracts the SHA-1 hash associated with each file and converts it from binary to hexadecimal. It builds a URL from each hash and makes further cURL requests to fetch each file. Each file's contents are uncompressed and written to a newly created file. Before creating a file, the program determines its level of nesting inside the base folder, and it creates the nested folder structure as needed so that the contents are written to the correct location.
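The overall flow can be sketched in Python as follows. This is an illustration under stated assumptions, not the PHP program linked below. It handles only loose (unpacked) objects, regular files, and folders. The repository URL, the ref name, and all function names are hypothetical. The fetch argument stands in for a single cURL GET, and the demo swaps in an in-memory dictionary (with a minimal stand-in commit) so that it runs without a server.

```python
import hashlib
import os
import tempfile
import urllib.request
import zlib


def http_fetch(url):
    """Plain HTTP GET, standing in for one cURL request."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def read_object(fetch, repo_url, sha_hex):
    """Fetch one loose object, inflate it, and return (type, body)."""
    url = f"{repo_url}/objects/{sha_hex[:2]}/{sha_hex[2:]}"
    header, _, body = zlib.decompress(fetch(url)).partition(b"\x00")
    return header.split(b" ")[0].decode(), body


def tree_entries(body):
    """Yield (mode, name, hex hash) for each entry of a tree body."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        yield (body[pos:space].decode(), body[space + 1:nul].decode(),
               body[nul + 1:nul + 21].hex())
        pos = nul + 21


def checkout_tree(fetch, repo_url, tree_sha, dest):
    """Recreate the folder described by a tree object under dest."""
    os.makedirs(dest, exist_ok=True)
    kind, body = read_object(fetch, repo_url, tree_sha)
    if kind != "tree":
        raise ValueError(f"expected a tree, got {kind}")
    for mode, name, sha_hex in tree_entries(body):
        path = os.path.join(dest, name)
        if mode == "40000":  # a nested folder: recurse into its tree
            checkout_tree(fetch, repo_url, sha_hex, path)
        else:                # treated as a regular file (simplification)
            _, content = read_object(fetch, repo_url, sha_hex)
            with open(path, "wb") as handle:
                handle.write(content)


def clone(fetch, repo_url, ref, dest):
    """Resolve ref through info/refs, then check out the commit's tree."""
    for line in fetch(f"{repo_url}/info/refs").decode().splitlines():
        sha_hex, _, name = line.partition("\t")
        if name == ref:
            break
    else:
        raise KeyError(ref)
    _, commit = read_object(fetch, repo_url, sha_hex)
    # The first line of a commit object is "tree <hex SHA-1>".
    root_sha = commit.split(b"\n", 1)[0].split(b" ")[1].decode()
    checkout_tree(fetch, repo_url, root_sha, dest)


def store(server, repo_url, kind, body):
    """Demo helper: put a loose object into the in-memory fake server."""
    raw = kind + b" " + str(len(body)).encode() + b"\x00" + body
    sha_hex = hashlib.sha1(raw).hexdigest()
    url = f"{repo_url}/objects/{sha_hex[:2]}/{sha_hex[2:]}"
    server[url] = zlib.compress(raw)
    return sha_hex


def demo():
    """Clone a made-up one-file repository from the fake server."""
    repo, server = "http://localhost/repository.git", {}
    blob = store(server, repo, b"blob", b"print('hello')\n")
    sub = store(server, repo, b"tree",
                b"100644 hello.py\x00" + bytes.fromhex(blob))
    root = store(server, repo, b"tree",
                 b"40000 python\x00" + bytes.fromhex(sub))
    commit = store(server, repo, b"commit",
                   b"tree " + root.encode() + b"\n\ninitial\n")
    server[repo + "/info/refs"] = (commit + "\trefs/heads/master\n").encode()
    dest = tempfile.mkdtemp()
    clone(server.__getitem__, repo, "refs/heads/master", dest)
    with open(os.path.join(dest, "python", "hello.py"), "rb") as handle:
        return handle.read()


print(demo())
```

Against a real server, passing http_fetch in place of the dictionary lookup would issue one GET per object, which corresponds to the per-object requests described above.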

[PHP Program to Perform Git Clone - Zip]