In this post, I explore and set up a way to publicly host a selection of my privately hosted repositories. See the results at my Git site.
There are three parts to this setup:
There’s something oddly self-empowering about managing your own services. Even though I don’t host any trendy services from my box, I use it extensively to host my Git repositories. Using Git, I can turn digital text, the ephemeral medium, into something more permanent. But the text becomes permanent in a curious way. It’s not etched in stone, but rather, it’s endemic. Any one instance of the text is fragile, but as an idea, the text is persistent.
Currently, I use bare Git repositories with SSH access. It works completely fine for version control. However, there are a few problems with my setup compared to, say, using a GitHub profile.
I want to fix all of these problems in one swoop. Enter GitWeb.
There are plenty of other solutions to host your Git repos as a webpage. Gitea is particularly popular among the self-hosting community. GitLab can also be self-hosted. I’m sure there are others in this class. The problem with these is that they are bloated for my purposes. I don’t need CI/CD (aside from a few shell scripts and some git hooks). I don’t need issue tracking, fancy collaboration tools, or granular access control. These extra features become a liability for me because they add complexity to the project and make it less likely that I’ll complete it. I want something simpler.
GitWeb does exactly what I want. It’s a CGI script that, when coupled with a webserver like Apache, can read a directory full of Git repositories and serve them like a webpage. Using that interface, you can navigate code, and depending on the configuration, use Git blame, search or whatever. That’s it. No extra frills. It’s delightfully minimal.
I should mention that there are alternatives in the GitWeb class of
tools. CGit is the most notable competitor at this level. CGit is a fine
choice. I admit that I chose GitWeb for two trivial reasons. First,
GitWeb has the aesthetic that I’m used to for these minimal webhosted
repositories. Second, the git instaweb command made it seem
really easy to get set up.
I have a private machine that I regularly interact with. I want my public host to mirror a selection of those repositories. There are two ways that I can think of to accomplish that:
I don’t want my public repository to point to my private repository
because, well, I want that to be private. Not to mention that it’s not
necessarily on a static IP like the public server is. So pushing from
private makes more sense than pulling from public. That also simplifies
the setup because, for normal Git uses, I only really need my private
git user to interact with my public server on my
behalf.
There are a few generic web hosting tasks that I won’t go into detail about, but I’ll list them here. They all vary slightly based on which tools you’re using and they are easily found with a web search. The steps are:
Use certbot to set up Let’s Encrypt certificates. This
installs and configures SSL/TLS certificates so that your site uses
https to encrypt traffic. The certbot utility, with the
appropriate plugin, also edits the configuration that you set up in the
previous step.
Your web server configuration varies greatly. It also has some security-critical features, so I’m not showing my actual configuration here. But these are some of the important directives that you need to consider and what they do.
This port 80 block routes requests intended for
http://git.yourserver.com to
https://git.yourserver.com. By rerouting requests, we
ensure that everyone uses HTTPS.
<VirtualHost *:80>
ServerName git.yourserver.com
RewriteEngine on
RewriteCond %{SERVER_NAME} =git.yourserver.com
RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [NE,R=permanent]
</VirtualHost>
<VirtualHost *:443>
...
</VirtualHost>
I use Rewrite instead of Redirect because we are accounting for yourserver.com in addition to git.yourserver.com.
VirtualHosts for your other domains.
There are important details to stuff into your HTTPS block as well. When you install the GitWeb package, it includes, among other things, a CSS file, a JavaScript file, and a CGI script that’s written in Perl. It’s important that your webserver has access to all those files. Here are the main steps to consider:
Enable CGI. (Make sure that mod_cgi is enabled.)
Point to the gitweb.cgi script.
Set the options FollowSymLinks and ExecCGI.
You can find advice about setting up your webserver in the GitWeb docs.
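Once both blocks are in place, it’s worth checking the port 80 redirect from the outside. Here’s a minimal sketch, assuming curl and awk are available. The helper name redirects_to_https is made up for this example, and git.yourserver.com is the same placeholder as above. It reads response headers on standard input and succeeds only for a 301 whose Location header is an https URL.

```shell
#!/bin/sh
# Hypothetical helper: reads HTTP response headers on stdin and succeeds only
# for a 301 response whose Location header is an https:// URL.
redirects_to_https() {
    awk '
        NR == 1                    { status_ok = ($2 == "301") }
        tolower($1) == "location:" { location_ok = ($2 ~ /^https:\/\//) }
        END                        { exit !(status_ok && location_ok) }
    '
}

# Usage (assumes the placeholder host name from this post):
#   curl -sI http://git.yourserver.com/ | redirects_to_https && echo "redirect OK"
```

On Debian-style systems, running apachectl -M should also list the CGI module if it is loaded, and apachectl configtest checks the syntax before you reload.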
The GitWeb configuration is pretty straightforward. The default
configuration comes commented. Uncomment configurations that you want to
enable and change values that you want to change. Here’s a snippet
(comments are indicated by #):
# path to git projects (<project>.git)
$projectroot = "/var/www/git";
# directory to use for temp files
$git_temp = "/tmp";
$site_name = "git.yourserver.com";
# target of the home link on top of all pages
$home_link = $my_uri || "/";
$projects_list_description_width = 80;
There are some additional features that you can enable, such as
enabling git blame or syntax highlighting. These features
consume more server resources, but they’re also cool.
Warning: Syntax highlighting caused problems for
.bats files. With syntax highlighting enabled, you cannot
view those files in the typical tree interface. Instead, you must read
.bats files as raw. I suspect there are many other such
cases.
If you have performed the steps above, here’s what you have achieved:
Users can navigate to git.yourserver.com and make a request to your server via HTTPS.
Your web server asks the gitweb.cgi script for information to put in its response.
The gitweb.cgi script looks in /var/www/git for content and finds… nothing.
This section describes how you can populate that directory and keep it up to date with your repositories.
On the public machine, we need a git user. Technically,
git@private is the only user that will be accessing
git@public. But the only operations it needs are Git
operations, so let’s lock it down by defining git@public’s
shell to be git-shell. You can do this one of two ways:
Add --shell=/usr/bin/git-shell to your adduser command when you create the git user.
Run usermod --shell=/usr/bin/git-shell git if you’ve already created the git user.
After the git user is created, you can add
git@private’s public key to
git@public:/home/git/.ssh/authorized_keys. Now,
git@private can run Git commands as git@public
via the git-shell.
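Here’s a minimal sketch of the key installation step. The helper name install_key and the example paths are assumptions, not part of my actual setup. It appends a public key to authorized_keys with the conventional 700/600 permissions and skips keys that are already there, so it’s safe to run twice.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to <home>/.ssh/authorized_keys
# with conventional permissions, skipping keys that are already present.
install_key() {
    home_dir=$1
    key_file=$2
    ssh_dir=$home_dir/.ssh
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -F: fixed strings, -x: whole-line match, -f: read the key from a file
    if ! grep -qxFf "$key_file" "$auth_file"; then
        cat "$key_file" >> "$auth_file"
    fi
}

# Usage, run as root on the public host (both paths are assumptions):
#   install_key /home/git /tmp/git-private.pub
#   chown -R git:git /home/git/.ssh
```

Because the git user has git-shell as its login shell, it’s easier to run this as root and fix ownership afterwards than to log in as git.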
Public and private repositories need to be set up to communicate with one another.
The following snippet shows how you can initialize an empty repository on the public host and add a description that is used by GitWeb.
# Context: git@public:/var/www/git or /srv/git or whatever
git init --bare <project.git>
echo 'description of project' > <project.git>/description
The following snippet shows how you can configure your private
repository so that running git push submits changes to the
public facing repository.
# Context: git@private:/srv/git/project.git or whatever
git remote add --mirror=push <remote-name> git@<public>:</var/www/git>/<project.git>
There are a few other git remote add options that I’m
not totally sure about. Like tagging. Is --tags or
--no-tags the default? I currently don’t use tags, but it
seems like it would be useful for versioning. That is something that I
might do in the future.
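One way to poke at the mirror behavior without touching either server is a local dry run. In this sketch, two bare repositories in a scratch directory stand in for git@private and git@public. Every path and name here (demo_mirror, project.git, the remote name public, the tag v0.1) is made up for the example.

```shell
#!/bin/sh
# Local dry run: two bare repositories in a scratch directory stand in for
# git@private and git@public. Every path and name here is made up.
demo_mirror() {
    work=$1    # use an absolute path, for example from mktemp -d
    public=$work/public/project.git
    private=$work/private/project.git
    mkdir -p "$work/public" "$work/private" || return 1

    # Public side: empty bare repository plus the description GitWeb shows.
    git init --quiet --bare "$public" || return 1
    echo 'description of project' > "$public/description"

    # Private side: bare repository with a push mirror pointing at "public".
    git init --quiet --bare "$private" || return 1
    git -C "$private" remote add --mirror=push public "$public" || return 1

    # Scratch workspace: put one commit and one tag into the private repo.
    git clone --quiet "$private" "$work/workspace" 2>/dev/null || return 1
    git -C "$work/workspace" -c user.name=demo -c user.email=demo@example.invalid \
        commit --quiet --allow-empty -m 'first commit' || return 1
    git -C "$work/workspace" tag v0.1 || return 1
    git -C "$work/workspace" push --quiet origin HEAD v0.1 || return 1

    # The step from this post: a push from private updates the mirror.
    git -C "$private" push --quiet public
}
```

After running demo_mirror "$(mktemp -d)", comparing git show-ref in both bare repositories should show the same refs, tag included. That matches the git remote documentation: with --mirror=push, git push behaves as if --mirror was passed, and a mirror push covers everything under refs/, so tags should come along without any --tags option.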
One enhancement to consider is that this could be scripted. Since you
can run remote commands by using ssh, you should be able to
do the whole thing from the private side. (See the appendix below that
investigates this approach.)
Once it’s set up, you can run git push from the private
repo to push changes to the public mirror. However, it can get annoying
to push a set of changes from your workspace to your private repo, then
switch to your private repo to push the same changes to your public
repo. There are some enhancements you can make to automate that
part.
Note: You could also include this setup in the repo setup script.
As the git user, invoke crontab -e and add the following
line to mirror the repo at midnight. Adjust the schedule however you
want.
0 0 * * * /usr/bin/git -C <path/to/repo> push
The -C option changes to
<path/to/repo> before invoking the subsequent
git subcommand.
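With more than a couple of repositories, one crontab line per repo gets tedious. Here’s a minimal sketch of a wrapper that pushes every bare repository under one directory. The function name push_all_repos, the script path, and /srv/git are assumptions. Setting GIT=echo turns it into a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper for the cron job: run "git push" in every <name>.git
# directory under a root. Set GIT=echo for a dry run that only prints commands.
push_all_repos() {
    root=$1
    for repo in "$root"/*.git; do
        [ -d "$repo" ] || continue    # glob matched nothing, or not a directory
        "${GIT:-git}" -C "$repo" push || echo "push failed: $repo" >&2
    done
}

# When saved as a script, call the function on the last line:
#   push_all_repos "$1"
# Example crontab line (the script path and /srv/git are assumptions):
#   0 0 * * * /home/git/bin/push-all.sh /srv/git
```

A failed push is reported on stderr and the loop carries on, so one broken mirror doesn’t block the rest.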
There are several hooks that run at various points of the Git workflow. It looks like a post-receive hook on the private repository is the way to go. My understanding is that the following happens:
The relevant part of the doc says this:
The [post-receive] hook executes on the remote repository once after all the proposed ref updates are processed and if at least one ref is updated as the result. (source)
Here’s another post recommending the post-receive hook for mirroring.
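To make the mechanism a bit more concrete, here’s a sketch of what a post-receive hook body could look like. Git feeds the hook one line per updated ref on standard input, in the form old value, new value, ref name. The function name log_and_mirror, the MIRROR_REMOTE variable, and the remote name public are all assumptions for this example.

```shell
#!/bin/sh
# Sketch of a post-receive hook body. Git feeds the hook one line per updated
# ref on stdin: "<old-value> <new-value> <ref-name>".
# MIRROR_REMOTE is a hypothetical variable naming the mirror remote.
log_and_mirror() {
    while read -r old new ref; do
        echo "updated $ref: $old -> $new"
    done
    # Push once, after all updates are listed. GIT=echo gives a dry run.
    "${GIT:-git}" push "${MIRROR_REMOTE:-public}"
}

# When saved as hooks/post-receive, call the function on the last line:
#   log_and_mirror
```

The hook’s output is relayed to whoever is pushing, so the mirror push result shows up in the workspace terminal. Naming the remote explicitly is a design choice here; it avoids depending on which remote a bare git push picks.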
You should be able to set up the hook with the following applied to the private repo:
echo "git push" > <project>.git/hooks/post-receive \
&& chmod +x <project>.git/hooks/post-receive
Notes:
This uses the <project>.git convention.
git push doesn’t work from an empty bare repository. It
will complain about “no refs in common” or something. You must have at
least one branch or tag or something. That’s fine. Why mirror a totally
empty repository anyways?
I’m pleased with everything so far. It’s very likely that I’ll iterate on the process in the future. I expect that
However, after being burnt too many times by premature optimization in the past, I’m leaning into Adam Savage’s philosophy that it’s important not to start with a specialized tool kit, but to come to it as a consequence of meaningful experience. For now, I’m happy to get something working, use it, and iterate on it later.
This is the script that I wanted to define. I wanted to be able to
switch to my private git server, and run
git-init.sh <repo-name> <description> and have
it set up the whole thing. But there’s a subtle, fundamental problem
with the approach in the mirror_repo_setup step. It’s a
pretty interesting problem. I challenge interested readers to find it
for themselves before reading it below. Hint: Think about who is doing
what.
#!/bin/bash
# Use repo.git convention for bare repositories.
# All references herein are to bare repositories.
REPO_NAME="$1".git
REPO_DESCRIPTION=${2:-"No Description provided."}
MIRROR_GITUSER=...
MIRROR_HOST=...
MIRROR_SRV_DIR=...
LOCAL_SRV_DIR=...
LOCAL_REMOTE_NAME=...
mirror_repo_setup () {
local cmd="git -C $MIRROR_SRV_DIR init --bare $REPO_NAME"
# Leave $cmd unquoted. $cmd should be unpacked for use with ssh.
ssh "$MIRROR_GITUSER"@"$MIRROR_HOST" $cmd
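# Redirect on the remote side; an unquoted > would write to a local file.
printf '%s\n' "$REPO_DESCRIPTION" |
ssh "$MIRROR_GITUSER"@"$MIRROR_HOST" "cat > '$MIRROR_SRV_DIR/$REPO_NAME/description'"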
}
# The indentation clash from the heredoc looked really ugly, so I factored it
# out of the private_repo setup.
_write_hook () {
cat << EOF > "$1"
#!/bin/bash
git push
EOF
}
private_repo_setup () {
local repo_path="$LOCAL_SRV_DIR/$REPO_NAME"
local hook_file="$repo_path"/hooks/post-receive
git -C "$LOCAL_SRV_DIR" init --bare "$REPO_NAME"
git -C "$repo_path" remote add \
--mirror=push "$LOCAL_REMOTE_NAME" \
"$MIRROR_GITUSER"@"$MIRROR_HOST":"$MIRROR_SRV_DIR"/"$REPO_NAME"
_write_hook "$hook_file"
chmod +x "$hook_file"
}
# MAIN
mirror_repo_setup
private_repo_setup
The problem is with $MIRROR_GITUSER. If we are using SSH
authentication, then any user with push access can assume the role of
$MIRROR_GITUSER. For security purposes, that user had better
have a restricted shell, namely git-shell, so that they can
still use git push and whatnot. But git-shell refuses arbitrary
commands such as git init and echo, so mirror_repo_setup cannot run
over SSH as that user.
I see a few options:
Run mirror_repo_setup by hand.
Number three is calling my name for now. I only have a few repositories. Although I’d much rather have a script than a document telling me which commands to run, it’s not worth the setup cost.
On the bright side, the private_repo_setup part is still
valid.
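If I ever revisit the scripted approach, one avenue might be git-shell’s custom commands: according to the git-shell documentation, programs placed in ~/git-shell-commands can be run by name over SSH. The following is an untested sketch under that assumption. The command name init-repo and the function names are made up, and /var/www/git is the project root from earlier. Since this would let the private side create directories on the public host, the name check is deliberately strict.

```shell
#!/bin/sh
# Sketch of a custom command for git-shell, saved on the public host as
# ~/git-shell-commands/init-repo (hypothetical name) and made executable.

# Accept only simple names like "project.git": no slashes, no leading dot.
valid_repo_name() {
    case $1 in
        ''|.*|*[!A-Za-z0-9._-]*) return 1 ;;
        *.git) return 0 ;;
        *) return 1 ;;
    esac
}

# init_repo <root> <name.git> [description]
init_repo() {
    root=$1
    name=$2
    description=${3:-No description provided.}
    if ! valid_repo_name "$name"; then
        echo "invalid repository name: $name" >&2
        return 1
    fi
    git init --quiet --bare "$root/$name" || return 1
    printf '%s\n' "$description" > "$root/$name/description"
}

# As the last line of the saved script (the project root is an assumption):
#   init_repo /var/www/git "$@"
# From the private side:
#   ssh git@public "init-repo project.git 'description of project'"
```

With something like this in place, the mirror_repo_setup step could call init-repo instead of git init and echo, while the git user keeps its restricted shell.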
From a 3-2-1 backup perspective, you should have three copies on two different media, with one offsite. Hosting my Git repositories outside my network satisfies the “one offsite” criterion.