Mercurial vs. Git, performance for a server

In a typical human workflow, the raw speed of a command doesn’t really matter as long as it doesn’t take too long, because the time the command takes to complete is far below the time needed to type the command, even if you just type “C-r com ENTER”. The highest typing speed ever measured is 216 words per minute¹², which equals 1080 characters per minute, or about 56 ms per keystroke.
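The keystroke figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the usual convention of five characters per word (which is what turns 216 words into 1080 characters):

```python
# Convert a typing speed in words per minute to milliseconds per keystroke.
# Assumption: 5 characters per word, the common convention for WPM figures.
CHARS_PER_WORD = 5


def ms_per_keystroke(words_per_minute):
    chars_per_minute = words_per_minute * CHARS_PER_WORD
    return 60000 / chars_per_minute


print(round(ms_per_keystroke(216)))  # 216 wpm -> 1080 chars/min -> prints 56
```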

¹: http://www.owled.com/typing.html
²: More: http://en.wikipedia.org/wiki/Words_per_minute

Where the performance of the fast commands really matters is when you implement a server which uses version tracking for some task. A server also has more options for getting higher performance than a normal user – for example the log can be cached. For this test I use these assumptions:

Method

As actions I’ll first test committing and repository size, because these are the areas where the biggest changes from the general use case are to be expected.

The changes from the traditional test are:

All of these 5 tests will be done with models corresponding to the 4 repository types. The model gets generated by simply getting the files changed in each commit. Then at each iteration a commit is chosen at random and lines are appended to each of its files. The amount appended per file is the full size of the repo divided by the total number of file changes (the sum of the number of files over all commits). Just using the size of the repo incurs an error, because the individual changes will likely be bigger. This can still be checked later on.
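The amount of data per file change described above could be computed roughly like this. It is a minimal sketch, not the code of the test itself: the function name and the representation of a profile as a list of file lists are assumptions made for illustration.

```python
def bytes_per_change(repo_size_bytes, commits):
    """Return how many bytes to append per changed file.

    commits is a list with one entry per commit; each entry is the list
    of files changed in that commit (hypothetical representation).
    """
    # Total number of file changes: the sum of the files in each commit.
    total_changes = sum(len(files) for files in commits)
    if total_changes == 0:
        raise ValueError("profile contains no file changes")
    return repo_size_bytes // total_changes


profile = [["a.py", "b.py"], ["a.py"], ["c.py", "d.py", "e.py"]]
print(bytes_per_change(6000, profile))  # 6 file changes in total
```

Since the text appends lines rather than bytes, the result would still have to be divided by an average line length.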

Info

The files at each commit can be found by simply calling $ hg log --template "{files}\n"
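Turning that output into a profile could look like the following sketch. The function names are made up for illustration. It assumes that file names contain no spaces, because the {files} template keyword separates names with spaces, and it skips commits that list no files.

```python
import subprocess


def parse_files_log(output):
    """Turn the template output into one list of file names per commit."""
    # One line per commit; empty lines (commits without files) are dropped.
    return [line.split() for line in output.splitlines() if line.strip()]


def changed_files_per_commit(repo_path):
    """Run hg log in repo_path; requires Mercurial on the PATH."""
    result = subprocess.run(
        ["hg", "log", "--template", "{files}\\n"],
        cwd=repo_path, check=True, capture_output=True, text=True)
    return parse_files_log(result.stdout)


print(parse_files_log("a.py b.py\n\nsetup.py\n"))
```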

The size via $ du -hsc repo/*
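If the size is needed inside the test script, a rough Python equivalent could look like this. It is only a sketch with a made-up function name. It sums apparent file sizes, while du reports disk usage, so the numbers will differ somewhat. Like the repo/* glob, it skips hidden top-level entries such as .hg or .git by default.

```python
import os


def tree_size(root, skip_hidden_top_level=True):
    """Sum the sizes in bytes of all regular files below root."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        if skip_hidden_top_level and dirpath == root:
            # Mimic the shell glob repo/*, which does not match dotfiles.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
            filenames = [f for f in filenames if not f.startswith(".")]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Calling tree_size("repo", skip_hidden_top_level=False) would include the history store as well.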

From knittl2010 we know that the size of the files doesn’t have a noticeable effect on the performance of the system, so we can treat additions and changes as simple additions.

The changed files for the profiles are in hg-files.txt, git-files.txt, freenet-files.txt and 1d6-files.txt, respectively.

The code will now load the list of files, shuffle it, do the commits (only appending the data: use a random selection of lines of some of the code files³) and measure the times per commit, outputting them as a simple list of newline-separated numbers.
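The loop described here might be sketched as follows. This is not the published test code: the function name, its parameters and the commit callable are hypothetical, and the commit callable stands in for whatever the respective testcase runs.

```python
import os
import random
import time


def run_profile(commits, lines, lines_per_file, target_dir, commit, seed=None):
    """Replay a profile in shuffled order and return the seconds per commit.

    commits: list of file lists, one per commit.
    lines: pool of text lines (each ending in a newline) to draw from.
    commit: hypothetical callable commit(target_dir, files) doing the commit.
    """
    rng = random.Random(seed)
    order = list(commits)
    rng.shuffle(order)
    timings = []
    for files in order:
        for name in files:
            path = os.path.join(target_dir, name)
            os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
            # Only append data, never rewrite existing content.
            with open(path, "a", encoding="utf-8") as handle:
                handle.writelines(rng.choice(lines) for _ in range(lines_per_file))
        # Time only the commit itself, not the file writes.
        start = time.perf_counter()
        commit(target_dir, files)
        timings.append(time.perf_counter() - start)
    return timings
```

The result can then be written out with print("\n".join(str(t) for t in timings)).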

Afterwards check the size of the repositories.

Testcases

The testcases are:

Needed from the program side:

* A setup function which creates a new testdir (name: test-time()), loads the profiles and sets up the basic repositories.
* A profile-enacting function which takes the files-list, the size of the data to append per file and a target directory.
* A time-tracking commit function for each testcase which just puts the result time on stdout.
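Two of these pieces could be sketched like this. The names make_test_dir and timed_commit are invented for illustration, and the commit command is passed in as a parameter rather than fixed.

```python
import os
import subprocess
import time


def make_test_dir(base="."):
    """Create and return a fresh directory named test-<unix time>."""
    path = os.path.join(base, "test-{}".format(int(time.time())))
    os.makedirs(path)  # raises FileExistsError if called twice within a second
    return path


def timed_commit(command, cwd):
    """Run one commit command in cwd and print its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, cwd=cwd, check=True, stdout=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    print(elapsed)
    return elapsed
```

For a Mercurial repository the command could for example be ["hg", "commit", "-A", "-m", "test"]. Note that this measures the whole process, including the startup time of the tool.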

The code is public on Bitbucket.

³: cat ~/Quell/Programme/Mercurial/hg-stable//.py git2//.c git2//.c ~/Quell/Programme/freenet/fred-staging-hg/src/freenet//.java ~/ews//.txt ~/ews/*.txt | shuf > random_lines.txt
3.1 MiB from freenet, 892 KiB from 1d6, 1.7 MiB from Mercurial and 2.6 MiB from git.

Usage

Run the script

$ python run_test.py

hg vs. git for server applications - performance test

-- 2011-12-16 23:54:07 --

