I run a GitLab server.
Sounds easy, right? All you need to do is run the GitLab omnibus installer on your target system of choice and be on your merry way. Not so fast. What if you’re like me and you’re:
- running your own server at home in a VM
- where you don’t have control over your ISP’s port filtering
- and you need to be able to move your GitLab installation at a moment's notice to a new network or server.
The easy solution isn’t enough. You need to do more.
What these limitations really mean is that I cannot directly expose my GitLab server to the Internet. But, if I cannot just open a port on my router and direct my subdomain to the GitLab server, how can I make the service publicly available? In my specific case, I’m working with external developers, so I want the server to be available not only through a VPN connection to my home network, but also through a standard HTTPS connection.
The high-level answer is: a relay server.
Relays are fairly common in peer-to-peer applications. They're well suited to cases where a computer sits on a network that doesn't allow port forwarding, so nothing can connect to it directly. BitTorrent Sync, for example, falls back to relays whenever two peers cannot negotiate a direct connection to each other; this is usually because a firewall between the two machines blocks the incoming ports they're trying to use.
Going back to the GitLab server, I need something that supports relaying an HTTP connection from a machine that is not public facing to another machine that can readily accept connections from the Internet. The solution: SSH tunnels.
SSH tunneling works through remote port forwarding: the client opens an outbound SSH connection to the server, and the server then listens on a port, funneling any traffic it receives back through the tunnel to the client. At that point, you can simply access the resource at localhost:someport on the server. If you only needed to expose something simple, say a file-syncing program or a remote administration tool, your work would be done here.
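As a sketch (the hostname, user, and port numbers here are placeholders), a remote forward from the hidden machine might look like:

```shell
# Run on the hidden machine (e.g. the GitLab server).
# Asks relay.yourdomain.com to listen on its local port 12345 and
# forward any traffic on it back through the tunnel to our port 80.
# -N means "no remote command" -- the connection exists only to tunnel.
ssh -N -R 12345:localhost:80 tunnel@relay.yourdomain.com
```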
But what if you want to access that resource on anything that can connect to the SSH server? One option is to make the forwarded port globally available (that is, make the SSH server listen to incoming connections on all interfaces, not only the local loopback). This is a simple solution – all it involves is changing the SSH server’s configuration file to allow these types of connections – but it comes with a few major disadvantages:
- you’re limited to ports above 1024, unless your SSH client connects to the server as root, because regular (non-root) users cannot bind to ports below 1024. Since allowing remote root logins is always a bad idea – root is the most common target of automated brute-force logins – you probably don’t want to allow these kinds of connections from your client.
- you can only run one HTTP service per port. So, if you have three different web services you want to relay, you’d need to ask your users to connect over non-standard ports, since only one service can occupy the standard ports 80 and 443. This is rarely good practice.
- you probably don’t want to expose all of your SSH tunnels to the world. Yes, your firewall can block ports that you don’t want just anybody to connect to, but why take the risk? Relying on a firewall creates potential for a single point of failure and allowing SSH tunnels to be accessible on all interfaces violates the Principle of Least Privilege.
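For completeness, the configuration change in question is a single option in the relay server's sshd configuration (shown here as a sketch – and, per the points above, one you probably don't want to enable globally):

```
# /etc/ssh/sshd_config on the relay server
# Makes remote-forwarded ports listen on all interfaces
# instead of only the local loopback.
GatewayPorts yes
```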
Again, this is a solvable problem. But before I share the answer, I need to take a quick detour to talk about the general unreliability of the SSH protocol.
Actually, the SSH protocol is well-implemented and reliable. The problem stems from the fact that any network interruption tends to terminate an SSH connection. This is especially true if you ever plan on moving your SSH client (i.e. your GitLab server) to a different network without fully restarting the server or manually restarting the SSH client. You’re also bound to face issues if your ISP’s connection drops frequently.
AutoSSH to the rescue. AutoSSH is basically a watchdog for an SSH connection: if the connection breaks, AutoSSH automatically attempts to reconnect. It does a great job of recovering from network disruptions, even minutes (or hours) after a failure.
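A minimal sketch of the earlier tunnel under AutoSSH's supervision (again, the host, user, and ports are placeholders):

```shell
# -M 0 disables AutoSSH's extra monitoring port and instead relies on
# SSH's own keepalives to detect a dead connection; if three keepalives
# in a row (30 s apart) go unanswered, the session dies and AutoSSH
# restarts it.
autossh -M 0 -N \
    -o "ServerAliveInterval 30" \
    -o "ServerAliveCountMax 3" \
    -R 12345:localhost:80 tunnel@relay.yourdomain.com
```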
Now, returning to the original discussion: how can we make our HTTP service accessible to the Internet in a way that isn’t prone to the aforementioned issues? One solution is a reverse HTTP proxy, like Apache’s mod_proxy. (Nginx has an equivalent that works great, but it’s not the focus of this post.)
mod_proxy gives you all of the power of a bona fide HTTP server, including VirtualHosts, which let a single relay server host multiple tenants. You end up with the ability to control access to a resource based on the domain used to reach the server, making it simple to point service1.yourdomain.com and service2.yourdomain.com at two different back-end servers while both remain accessible via the same IP on your relay.
To make this work, you essentially set up proxy rules that forward the connection to the forwarded port, so you might have rules that look like:
```apache
ProxyPass        / "http://localhost:12345/service"
ProxyPassReverse / "http://localhost:12345/service"
```
These rules transparently forward connections to the server that’s inaccessible from the Internet. Your users won’t even be able to tell where the server is actually running; they’ll only see the IP of the relay server. (Just don’t rely on this for anonymity: anybody who can observe your relay server’s traffic will be able to tell where the SSH connections originate.)
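Putting the pieces together, a complete VirtualHost on the relay server might look something like the sketch below (the domain and port are placeholders, and mod_proxy plus mod_proxy_http must be enabled – e.g. `a2enmod proxy proxy_http` on Debian-based systems):

```apache
<VirtualHost *:80>
    ServerName service1.yourdomain.com

    # Pass the original Host header through to the back end so the
    # application generates URLs for the public domain.
    ProxyPreserveHost On

    # Forward everything to the SSH-tunneled port.
    ProxyPass        / "http://localhost:12345/"
    ProxyPassReverse / "http://localhost:12345/"
</VirtualHost>
```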
Before I wrap up: although I’ve been talking about GitLab, this approach works with any HTTP-based web service. You may run into minor issues with specific configuration settings, especially the URLs used inside the application. The good news is that I’ve gotten this to work with GitLab, Apache, Jira, and Confluence, and I’m sure it works with other applications as well – just make sure you change the base URL your application uses to the one that is publicly accessible.
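For GitLab's omnibus packages, for instance, that base URL lives in /etc/gitlab/gitlab.rb, followed by a reconfigure (the domain below is a placeholder):

```ruby
# /etc/gitlab/gitlab.rb
external_url 'https://gitlab.yourdomain.com'

# Then apply the change with:
#   sudo gitlab-ctl reconfigure
```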
If you plan on using SSL/TLS to encrypt the connection between GitLab and your users, be sure to install the SSL certificate on both your relay server and your GitLab server. I found the same to be true with Confluence and Jira.
One last thing: GitLab also lets users access repositories over SSH. While it’s easy to proxy HTTP content, you’re out of luck with SSH – you’ll have to find another way to reach that service. I don’t consider this a critical downside, since you can always access GitLab repositories via HTTPS, but it can be a source of frustration for some users.