As my friends and colleagues know, I think the history of a project is very important for development. Being able to run blame or read the log has proven very useful for finding bugs and understanding why a piece of code was written the way it was. Therefore, it makes sense to do whatever is possible to make sure history is preserved when moving files across repositories.
Luckily for us, git has made it extremely easy.
Merging repositories
Merging a repository (bar) into another repository (foo) is easy.
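$ cd /path/to/foo
$ # To use a local copy, replace the URL with: file:///path/to/bar/.git
$ git remote add bar https://git.domain.com/bar.git
$ git fetch bar
$ # Git 2.9 and later need --allow-unrelated-histories, because bar shares no history with foo
$ git merge --allow-unrelated-histories bar/master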
That is it. It is very simple and retains all of the history from bar while maintaining the same commit hashes! This means that, for example, daed567e will point to the same commit in both foo and bar.
Unfortunately, it is not always that simple. Sometimes you will face conflicts: if, for example, both repositories contain a README file, the merge operation will fail. Luckily, this is also easy to solve.
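To see the hash-preservation claim in action without touching a real project, the following sketch builds two throwaway repositories in a temporary directory and merges one into the other. The repository names, file names, and the `make_repo` helper are illustrative assumptions, and the `--allow-unrelated-histories` flag assumes Git 2.9 or later.

```shell
#!/bin/bash
# Sandbox demo: merge a throwaway repository "bar" into "foo" and confirm
# that bar's commit keeps its hash. All names here are illustrative.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

make_repo() {
    # $1 = repository name, $2 = file to commit
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD refs/heads/master
    git -C "$1" config user.name "Example"
    git -C "$1" config user.email "example@example.invalid"
    echo "content of $2" > "$1/$2"
    git -C "$1" add "$2"
    git -C "$1" commit -q -m "Add $2"
}

make_repo foo foo.txt
make_repo bar bar.txt
bar_hash=$(git -C bar rev-parse HEAD)

cd foo
# Use the local-copy URL form mentioned in the text.
git remote add bar "file://$sandbox/bar/.git"
git fetch -q bar
git merge -q --allow-unrelated-histories -m "Merge bar into foo" bar/master

# The commit from bar is now reachable from foo's master under the same hash.
git merge-base --is-ancestor "$bar_hash" HEAD && echo "bar history preserved"
```

Because the merge only adds a new merge commit on top of both histories, none of the existing commits are rewritten, which is why the hash recorded in `bar_hash` is still valid inside foo.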
First, abort the failed merge (if you already tried to merge):
$ git merge --abort
Now switch to a temporary branch that holds bar:
$ git checkout -b barmaster bar/master
Now you can deal with the conflicting files by removing them, renaming them individually, or moving all of bar into a directory such as bar_directory. Commit the result on this branch before continuing.
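As one way to carry out the "move all of bar into a directory" option, here is a minimal sketch of a helper that moves every top-level tracked entry into a subdirectory and commits the result. The function name `move_into_subdir` and the commit message are hypothetical, and the script assumes bash.

```shell
#!/bin/bash
# Hypothetical helper: move every top-level tracked entry of the current
# branch into a subdirectory and commit the result.
move_into_subdir() {
    local target=$1
    mkdir -p "$target"
    # List the top-level entries of HEAD, NUL-separated to survive odd file names.
    git ls-tree --name-only -z HEAD |
    while IFS= read -r -d '' entry; do
        # Skip the target itself in case it already exists in the tree.
        [ "$entry" = "$target" ] || git mv -- "$entry" "$target/"
    done
    git commit -q -m "Move all files into $target/"
}

# Usage, on the temporary branch from the text:
#   git checkout -b barmaster bar/master
#   move_into_subdir bar_directory
```

Since `git mv` records the move as a rename, `git log --follow` should still be able to trace a file's history across the new directory boundary.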
We can finally switch back to master and merge our branch again:
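$ git checkout master
$ # Git 2.9 and later need --allow-unrelated-histories, because barmaster shares no history with master
$ git merge --allow-unrelated-histories barmaster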
We’re done! Do not forget to push your changes.
Splitting repositories
Splitting repositories is slightly more involved than merging them, because we would like to remove all of the unrelated files and commits from history so that our new repository is clean.
There are two approaches for this stage: the whitelist (we keep only a list of files) and the blacklist (we keep everything except a list of files). I prefer the whitelist approach, so I will only cover it in this article.
For this example we will split bar out of foobar.
Let us first start by switching to a temporary branch we can work on.
$ git checkout -b tmp
Now we need to decide which files we would like to preserve.
Optional: Retain files that have been renamed throughout history.
If we have a file called a that has been renamed to b at some point in history, we would like to preserve both a and b. A useful command to find all of the past names of a file is:
$ git log --name-only --format=format: --follow -- path/to/file | sort -u
Just add both names of the file into the script we will create below.
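When several files need checking, the command above can be wrapped in a small helper. This is only a sketch: the function name `past_names` is hypothetical, it loops because `--follow` accepts a single path at a time, and it drops the blank separator lines that `--format=format:` leaves in the output.

```shell
#!/bin/bash
# Hypothetical helper: print every name that each of the given paths has had.
past_names() {
    local path
    for path in "$@"; do
        # --follow only works with one path, so run the log once per file.
        git log --name-only --format=format: --follow -- "$path"
    done | sed '/^$/d' | sort -u
}

# Usage, from inside the repository (paths are examples):
#   past_names README.md src/main.c
```

Rename detection is heuristic, so a file that was renamed and heavily rewritten in the same commit may not be followed; it is worth reviewing the list by hand.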
Now we will create a script that moves the correct files into a new temporary directory and run it on all of our repository’s history.
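#!/bin/bash
mkdir -p newroot/
# Redirect stderr to silence "file not found" warnings in commits where a path does not exist.
mv README.md newroot/ 2>/dev/null
mv src newroot/ 2>/dev/null
# Always exit successfully so that filter-branch does not abort on a missing file.
true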
Run this script on the project history:
$ git filter-branch -f --prune-empty --tree-filter /path/to/script HEAD
After that, we should have a rewritten history with a directory called newroot that contains all of the files we wish to preserve. If we spot an issue, we can just reset our branch to the initial state (git reset --hard master) and try again. Otherwise, we can move on to the next step: filtering the repository down to only this directory.
$ git filter-branch --prune-empty -f --subdirectory-filter newroot
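The two filtering passes can be tried end to end on a throwaway repository before running them on a real one. The sketch below is an illustration under assumptions: the file layout and commit messages are made up, the tree filter is given inline rather than as a script file, and `FILTER_BRANCH_SQUELCH_WARNING` (available in recent Git versions) is set only to skip the deprecation notice that `git filter-branch` prints.

```shell
#!/bin/bash
# Sandbox demo: split README.md and src/ out of a throwaway "foobar" repository.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
sandbox=$(mktemp -d)
git init -q "$sandbox/foobar"
cd "$sandbox/foobar"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.invalid"

# Three commits: one touching everything, one only foo, one only bar.
mkdir src other
echo "readme" > README.md
echo "bar" > src/bar.c
echo "foo" > other/foo.c
git add .
git commit -q -m "Initial import"
echo "more foo" >> other/foo.c
git commit -q -am "Change foo only"
echo "more bar" >> src/bar.c
git commit -q -am "Change bar only"

# Work on a temporary branch so master stays untouched.
git checkout -q -b tmp

# Pass 1: move the files to keep into newroot/ in every commit.
git filter-branch -f --prune-empty --tree-filter \
    'mkdir -p newroot; mv README.md src newroot/ 2>/dev/null; true' \
    HEAD >/dev/null 2>&1

# Pass 2: make newroot/ the new repository root and drop everything else.
git filter-branch -f --prune-empty --subdirectory-filter newroot \
    HEAD >/dev/null 2>&1

# Inspect the result: remaining files and remaining commits.
git ls-tree -r --name-only HEAD
git log --format=%s
```

Listing the tree and the log like this before pushing is a cheap way to confirm that only the wanted files survived and that commits touching nothing else were pruned.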
Assuming everything is correct we can go on and push it to our new repository as master.
$ git remote add bar git+ssh://firstname.lastname@example.org/bar.git
$ git push bar tmp:master
That’s it! You have now split bar out of foobar. The last thing to do is to delete the bar-related files from our foobar repository and commit the changes.
Moving arbitrary files between repositories
Moving arbitrary files is very easy once you realize it is just a split from one repository followed by a merge into another. For this reason I will not elaborate further; just follow the two sections above.
This is a very simple guide. In more complex cases you will probably have to write more complex scripts or use some optimization techniques. I suggest you also take a look at the slides for a talk I gave about migrating the Enlightenment project from SVN to git. They contain some useful tips and tricks, especially if you have a big project with a very rich history.
Please let me know if you encounter any issues or have any suggestions.
This article was originally posted on Tom’s personal blog, and has been approved to be posted here.