GIT repository cleanup

The project was moved from TFS with TFVC to VSO with GIT, and I was surprised by the size of one of my GIT repositories – it was 13GB!

This is quite large for a GIT repo and was caused by two things – the long project history and the large files contained in that history.

Keeping a GIT repository as small as possible is a good practice, so let’s do some cleanup.

I realized that a 64-bit GIT is required when working with such an enormous repository. A 32-bit GIT runs out of address space while mapping the pack files into memory and fails with errors like this:

$ git gc
Counting objects: 137142, done.
fatal: Out of memory? mmap failed: No such file or directory
error: failed to run repack

I used the Cygwin64 client (including the GIT package) to handle my repo.
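
Whether the installed GIT is a 64-bit build can be checked before starting. The sketch below assumes a reasonably recent GIT whose `git version --build-options` output includes a `sizeof-size_t` line (older builds do not print it); `size_t_bits` is a hypothetical helper name, not part of GIT.

```shell
#!/bin/sh
# Hypothetical helper: reads `git version --build-options` output on stdin
# and prints the pointer width in bits (sizeof-size_t * 8).
# Prints nothing if the build does not report sizeof-size_t.
size_t_bits() {
    awk -F': ' '$1 == "sizeof-size_t" { print $2 * 8 }'
}

# Only query GIT when it is installed.
if command -v git >/dev/null 2>&1; then
    bits=$(git version --build-options 2>/dev/null | size_t_bits)
    echo "git pointer width: ${bits:-unknown} bits"
fi
```

A result of 32 suggests the build is likely to hit the mmap error shown above on a repository of this size.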

1. Repack repository

$ git repack

Now it’s time to sit back and relax 🙂

2. Manually delete unused folders in your repository.

All large files from the repo should be manually removed to create a clean final commit. This is needed for the next step, because BFG by default leaves the contents of the latest commit untouched:

# manually delete the folders in the git workspace, then stage and commit
$ git add .

$ git commit -m "PIMIntegration and Build Tasks projects were removed from repo"

3. Now it’s time to remove all files larger than 5MB from the repository history.
The BFG Repo-Cleaner does this much faster (10-720x, according to its authors) than the ‘git filter-branch’ command:

$ java -jar bfg.jar --private --strip-blobs-bigger-than 5M .git

After this cleanup our repository is reduced to about 200MB in size!
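
To see which files a 5MB threshold would catch, the blobs in the history can be listed by size. This is a minimal sketch, assuming that BFG's `5M` means 5 MiB and that it is run from inside the repository; `filter_large_blobs` and `LIMIT_BYTES` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Assumed threshold in bytes: 5 MiB, to mirror the 5M limit given to BFG.
LIMIT_BYTES=$((5 * 1024 * 1024))

# Hypothetical helper: reads "<type> <sha> <size> <path>" lines on stdin and
# prints "<size> <path>" for every blob above the limit, largest first.
filter_large_blobs() {
    awk -v limit="$1" '
        $1 == "blob" && $3 > limit {
            size = $3
            sub(/^[^ ]+ [^ ]+ [^ ]+ /, "")   # keep only the path (may contain spaces)
            print size, $0
        }' | sort -rn
}

# Only touch GIT when run inside a repository.
if git rev-parse --git-dir >/dev/null 2>&1; then
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        filter_large_blobs "$LIMIT_BYTES"
fi
```

The listing is read-only, so it can be run before and after BFG to compare what was stripped.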

4. Cut old history.

More than four years of history were converted from TFS, so it’s time to cut the old history and start from Jan 01 2014.

This is done by finding a commit from Jan 01 2014:

commit-id : 536e9a4d1c
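
One way to look up such a commit by date is sketched below. It assumes the cut point is the first commit on the branch made on or after the given date (by committer date); `first_commit_since` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the oldest commit reachable from <ref>
# whose committer date is on or after <date>.
# usage: first_commit_since <date> <ref>
first_commit_since() {
    # --reverse lists oldest first; head keeps only the first match.
    git rev-list --reverse --since="$1" "$2" | head -n 1
}

# Only run the lookup when a master branch exists here.
if git rev-parse --verify --quiet master >/dev/null 2>&1; then
    first_commit_since "2014-01-01 00:00:00" master
fi
```

`head` is used instead of `-1` because GIT applies the count limit before `--reverse`, which would return the newest commit instead of the oldest.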

Then remove everything older than that commit:

$ git checkout --orphan temp 536e9a4d1c

$ git commit -m "Truncated history"

$ git rebase --onto temp 536e9a4d1c master

$ git branch -D temp

5. The old commits and large files have now been removed from the history but are still present on the hard drive as unreachable objects. It’s now time to clean up and repack.

$ git reflog expire --expire=now --all

$ git gc --prune=now --aggressive

6. And it’s done!

Now this local repository is about 150MB, much better than the initial 13GB.
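
The packed size can be read from GIT itself instead of the file system. A small sketch, assuming it is run inside the repository; `pack_size_mib` is a hypothetical helper that converts the `size-pack` value, which `git count-objects -v` reports in KiB, to MiB.

```shell
#!/bin/sh
# Hypothetical helper: reads `git count-objects -v` output on stdin and
# prints the size of all packs in whole MiB (size-pack is given in KiB).
pack_size_mib() {
    awk -F': ' '$1 == "size-pack" { printf "%d\n", $2 / 1024 }'
}

# Only touch GIT when run inside a repository.
if git rev-parse --git-dir >/dev/null 2>&1; then
    echo "packed size: $(git count-objects -v | pack_size_mib) MiB"
fi
```

Running it before step 1 and again after step 5 gives a before/after comparison.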
