Migrating a Subversion (SVN) repository to a Git/GitLab/GitHub repository

sagary2j

Sagar R Ravkhande

Posted on March 18, 2022


Prerequisite:

Configure a migration environment on a local workstation and install the following software:

Git
Subversion
git-svn utility (part of Git)

What is a version control system?

Version control systems are integral to building software. They combine your repository of project files with a history of all your code changes, making it easy to edit and understand your code over time.

The main benefit of using a version control system is that it keeps your team’s workflows organized as they work through various types of releases. With one in place, team members can easily research, track, and undo code. They can work on the same code simultaneously without code conflicts. Plus, the whole team can track who made what changes, when, and why.

But before you implement a version control system into your team’s workflow, you need to figure out which one is right for you. While most options out there have similar benefits, their differences are important.

Git Vs. SVN

With all version control systems, project files sit on a server that you push your files to when you have completed your work on your local machine. However, deciding whether to use a centralized version control system (like SVN) or a distributed version control system (like Git) will affect how you commit changes.

Remember, not all version control systems fit all teams and all needs. A method that works perfectly for one company may be entirely wrong for your team. To determine which system to use, you need to look at how each system works.


What is SVN?

Apache Subversion, also known as SVN, is one of the most popular centralized version control systems on the market. With a centralized system, all files and historical data are stored on a central server. Developers commit their changes directly to that central server repository.

An SVN repository is conventionally organized into three parts:

Trunk: The trunk is the hub of your current, stable code and product. It only includes tested, unbroken code. This acts as a base where all changes are made from.

Branches: Here is where you house new code and features. Using a copy of the trunk code, team members conduct research and development in the branch. Doing so allows each team member to work on the enhanced features without disrupting each other’s progress.

Tags: Consider tags a duplicate of a branch at a given point in time. Tags aren’t used during development, but rather during deployment after the branch’s code is finished. Marking your code with tags makes it easy to review and, if necessary, revert your code.

To create a new feature you first branch the code from the trunk, i.e. take an exact copy of the trunk and place it into a new folder within the branches area. Then you work on your feature. When you’re done, you merge your changes back into the trunk.

The benefit of branching is the ability to make commits into the branch without breaking the trunk. You only merge into the trunk when your code is error-free. This keeps your trunk stable. And users generally appreciate how easy it is to use and understand SVN.

However, working on one central server means there is a single point of failure. If that server goes down, nobody can commit until it is restored. Limited offline access is also a frequent complaint.

What is Git?

Unlike SVN, Git utilizes multiple repositories: a central repository and a series of local repositories. Local repositories are exact copies of the central repository complete with the entire history of changes.

The Git workflow is similar to SVN, but with an extra step: to create a new feature, you take an exact copy of the central repository to create your local repository on your local machine (you can think of this as your “local trunk”). Then you work on your local repository exactly as you would in SVN by creating new branches, tags, etc. When you’re done, you merge your branches into your local repository (i.e. local trunk). When you’re ready to merge into the central repository, you push your changes from your local repository to the central repository.
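As a rough illustration of that workflow, the sketch below runs entirely on the local machine: a throwaway bare repository stands in for the central repository. The directory names, the branch names main and feature, and the committer identity are all made up for the example.

```shell
#!/bin/sh
# Sketch of the clone -> branch -> merge -> push cycle, using a local bare
# repository as a stand-in for the central one. All names are hypothetical.
set -e
work=$(mktemp -d)

# "Central" repository (bare: history only, no working files).
git init --quiet --bare "$work/central.git"

# Take a full local copy of the (still empty) central repository.
git clone --quiet "$work/central.git" "$work/local" 2>/dev/null
cd "$work/local"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/main   # name the first branch explicitly

# First commit on the local "trunk".
echo "hello" > README
git add README
git commit --quiet -m "Initial commit"

# Work on a feature branch, then merge it back locally.
git checkout --quiet -b feature
echo "new feature" >> README
git commit --quiet -am "Add feature"
git checkout --quiet main
git merge --quiet feature

# Only now does anything reach the central repository.
git push --quiet origin main

pushed_commits=$(git --git-dir="$work/central.git" rev-list --count main)
echo "Commits in central repository: $pushed_commits"
```

Everything up to the final push happens without any contact with the central repository, which is the main difference from the SVN workflow described earlier.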

Many people prefer Git for version control for a few reasons:

  1. It’s faster to commit.
    Because you commit to the central repository more often in SVN, network traffic slows everyone down. With Git, you work mostly in your local repository and push to the central repository only every so often.

  2. No more single point of failure.
    With SVN, if the central repository goes down or some code breaks the build, no other developers can commit their code until the repository is fixed. With Git, each developer has their own repository, so it doesn’t matter if the central repository is broken. Developers can continue to commit code locally until the central repository has been fixed, and then they can push their changes.

  3. It’s available offline.
    Unlike SVN, Git can work offline, allowing your team to continue working without losing features if they lose connection.

Teams also opt for Git because it’s open source and cross-platform: it runs on virtually every operating system and is supported by a wide range of tools, languages, and frameworks.

There is one con teams find frustrating: the ever-growing complexity of history logs. Because developers take extra steps when merging, history logs of each issue can become dense and difficult to decipher. This can potentially make analyzing your system harder.

Steps to Convert the source SVN repository to a local Git repository

The goal of this step is to convert the source Subversion repository to a local bare Git repository. A bare Git repository does not have a local working checkout of files that can be changed; instead, it contains only the repository's history and the metadata about the repository itself. This is the recommended format for sharing a Git repository via a remote repository hosted on a service like Azure Repos.

1. Set local Environment variables

SVN_REPO="<SVN repo URL without the /trunk>"
# e.g. SVN_REPO="svn+ssh://org.svn.services.com/svnroot/sourceprj"

GIT_REPO="<Git repo URL>"
# e.g. GIT_REPO="https://gitlab.com/org/destinationprj.git"

Note: We will be using these variables throughout the migration process.
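Because git svn clone --stdlayout expects the repository root rather than the trunk, it can help to normalise the URL before using it. A minimal sketch, reusing the example URL from above with a stray /trunk/ added for illustration:

```shell
# Hypothetical input: the example URL with "/trunk/" accidentally left on.
SVN_REPO="svn+ssh://org.svn.services.com/svnroot/sourceprj/trunk/"

SVN_REPO=${SVN_REPO%/}        # drop one trailing slash, if present
SVN_REPO=${SVN_REPO%/trunk}   # drop a trailing /trunk, if present

echo "$SVN_REPO"
```

Both expansions leave the value untouched when the suffix is not there, so the snippet is safe to run on a URL that is already correct.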

2. Retrieve a list of all Subversion authors/committers

Run this command from any directory. Because it passes the repository URL to svn log, it does not need a local Subversion checkout:

svn log -q $SVN_REPO | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2"@*.com>"}' | sort -u > ~/authors.txt

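This command retrieves all the log messages, extracts the usernames, removes duplicates, sorts them, and writes them to an "authors.txt" file in your home directory. Each line gets a placeholder address ending in "@*.com", so edit the file until every SVN username maps to the right Git name and email. The finished authors.txt should look like this: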

testuser = testuser <testuser@gmail.com>
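To see what the pipeline does without touching a server, the sketch below feeds it a few lines shaped like svn log -q output and then swaps the "@*.com" placeholder for a real domain. The usernames and the example.com domain are invented for the illustration; in practice each address usually needs to be checked by hand.

```shell
# Sample input in the shape of `svn log -q` output (made-up users and dates).
sample_log='------------------------------------------------------------------------
r3 | alice | 2022-03-01 10:00:00 +0000 (Tue, 01 Mar 2022)
------------------------------------------------------------------------
r2 | bob | 2022-02-20 09:30:00 +0000 (Sun, 20 Feb 2022)
------------------------------------------------------------------------
r1 | alice | 2022-02-15 14:45:00 +0000 (Tue, 15 Feb 2022)
------------------------------------------------------------------------'

# Same awk and sort steps as in the command above, followed by a sed that
# replaces the placeholder domain with a hypothetical one (example.com).
authors=$(printf '%s\n' "$sample_log" |
  awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2"@*.com>"}' |
  sort -u |
  sed 's/@\*\.com>/@example.com>/')

printf '%s\n' "$authors"
```

The same sed expression can be applied to the real file once the correct domain is known; authors whose address does not follow the pattern still have to be fixed individually.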

3. Clone the Subversion repository using git-svn

The following command performs the standard git-svn conversion using the authors.txt file created in the earlier step. It places the Git repository in the ~/temp folder on your local machine.

git svn clone $SVN_REPO --no-metadata -A ~/authors.txt --stdlayout ~/temp

If you are using the standard trunk, branches, tags layout, --stdlayout is all you need. If your layout is different, pass the --trunk, --branches, and --tags options to tell git-svn where each one lives.
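For a non-standard layout, the call might look like the sketch below. The directory names (mainline, feature-branches, releases) are purely hypothetical, and the script only prints the command it would run, so the paths can be checked before starting a long clone.

```shell
# Hypothetical repository layout; replace with the real directory names.
SVN_REPO="svn+ssh://org.svn.services.com/svnroot/sourceprj"
TRUNK_DIR="mainline"
BRANCHES_DIR="feature-branches"
TAGS_DIR="releases"

# Build the argument list once so it can be inspected before running.
set -- git svn clone "$SVN_REPO" --no-metadata -A "$HOME/authors.txt" \
  --trunk="$TRUNK_DIR" --branches="$BRANCHES_DIR" --tags="$TAGS_DIR" \
  "$HOME/temp"

clone_cmd="$*"
echo "$clone_cmd"
# Once the printed command looks right, run it with: "$@"
```

The three options take paths relative to the repository URL, which is why SVN_REPO still points at the repository root rather than at the trunk.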

4. Convert svn:ignore properties to .gitignore - [Optional]

If your svn repo was using svn:ignore properties, you can easily convert this to a .gitignore file using:

cd ~/temp
git svn show-ignore > .gitignore
git add .gitignore
git commit -m 'Convert svn:ignore properties to .gitignore.'

5. Push repository to a bare git repository

First, create a bare repository and make its default branch match svn’s “trunk” branch name.

git init --bare ~/new-bare.git 
cd ~/new-bare.git 
git symbolic-ref HEAD refs/heads/trunk

Then push the temp repository to the new bare repository.

cd ~/temp
git remote add bare ~/new-bare.git
git config remote.bare.push 'refs/remotes/*:refs/heads/*'
git push bare
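The refspec decides the branch names in the bare repository: whatever follows refs/remotes/ is kept and placed under refs/heads/. The sketch below mimics that renaming with sed on a hand-written list of refs, assuming git-svn's default origin/ prefix (Git 2.0 and later); the branch and tag names are invented.

```shell
# Refs as git-svn might have created them in ~/temp (names are examples).
source_refs='refs/remotes/origin/trunk
refs/remotes/origin/feature-x
refs/remotes/origin/tags/v1.0'

# Emulate the 'refs/remotes/*:refs/heads/*' mapping: the part matched by
# the * on the left is substituted for the * on the right.
pushed_refs=$(printf '%s\n' "$source_refs" |
  sed 's#^refs/remotes/#refs/heads/#')

printf '%s\n' "$pushed_refs"
```

This is why the later steps refer to branches such as origin/trunk and origin/tags/name inside the bare repository.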

6. Rename “trunk” branch to the branch you want to migrate

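With current git-svn versions, which prefix refs with "origin/", your main development branch arrives in the bare repository as “origin/trunk”, after its Subversion name. Rename it to Git’s conventional “master”, "main", or "develop", then point HEAD at the new name so it becomes the default branch:

cd ~/new-bare.git
git branch
git branch -m origin/trunk develop
git symbolic-ref HEAD refs/heads/develop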

7. Clean up branches and tags

This part depends on what svn produces. List git refs first and use "refs/heads/origin/tags" or "refs/heads/tags" as appropriate.

git-svn turns all of Subversion's tags into very short Git branches of the form “tags/name”. You’ll want to convert all those branches into actual Git tags. Start by listing the tag names that will be converted:

cd ~/new-bare.git
git for-each-ref --format='%(refname)' refs/heads/origin/tags | awk -F'/' '{print $(NF)}' | while read ref; do echo $ref; done;

Then convert the git-svn "tag branches" into actual tags:

git for-each-ref --format='%(refname)' refs/heads/origin/tags |
awk -F'/' '{print $(NF)}' |
while read ref
do
  git tag "$ref" "refs/heads/origin/tags/$ref";
  git branch -D "origin/tags/$ref";
done
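To try the loop safely before running it on the migrated repository, the sketch below recreates the situation in a throwaway repository. The two tag branches (v1.0 and v1.1), the develop branch, and the committer identity are made up for the example.

```shell
#!/bin/sh
# Throwaway repository with two fake "tag branches" (names are examples).
set -e
demo=$(mktemp -d)
cd "$demo"
git init --quiet .
git config user.name "Example Dev"
git config user.email "dev@example.com"
git symbolic-ref HEAD refs/heads/develop
git commit --quiet --allow-empty -m "Initial commit"
git branch origin/tags/v1.0
git branch origin/tags/v1.1

# Same loop as above: create a tag for each tag branch, then delete the branch.
git for-each-ref --format='%(refname)' refs/heads/origin/tags |
awk -F'/' '{print $(NF)}' |
while read -r ref
do
  git tag "$ref" "refs/heads/origin/tags/$ref"
  git branch --quiet -D "origin/tags/$ref"
done

tag_count=$(git tag -l | wc -l | tr -d ' ')
leftover=$(git for-each-ref --format='%(refname)' refs/heads/origin/tags)
echo "tags created: $tag_count"
echo "leftover tag branches: ${leftover:-none}"
```

Note that awk keeps only the last path component, so this approach assumes tag names that contain no slash of their own.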

8. Set remote origin to the GitLab project and push tags

Note: Have a GitLab access token handy for this step.

git remote add origin $GIT_REPO
git push --tags
git push --set-upstream origin develop --force