GitHub Enterprise Geo-replication uses multiple active replicas to fulfill requests from geographically distributed data centers.

Multiple active replicas can shorten the distance between users and the nearest replica. For example, an organization with offices in San Francisco, New York, and London could run the primary appliance in a data center near New York and two replicas in data centers near San Francisco and London. Using geolocation-aware DNS, users can be directed to the closest available server and access repository data faster. Designating the appliance near New York as the primary helps reduce latency between the hosts: if the appliance near San Francisco were the primary, it would have a higher latency to London.

Each active replica proxies requests that it can't process itself to the primary instance. The replicas function as points of presence, terminating all SSL connections. Traffic between hosts is sent through an encrypted VPN connection, similar to a two-node High Availability configuration without Geo-replication.

Git requests and specific fileserver requests, such as LFS and file uploads, can be served directly from the replica without loading any data from the primary. Web requests are always routed to the primary, but if the replica is closer to the user, the requests are faster because the SSL connection is terminated closer to the user.

Geo DNS, such as Amazon's Route 53 service, is required for Geo-replication to work seamlessly. The hostname for the instance should resolve to the replica that is closest to the user's location.
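
As a rough illustration of what "closest to the user's location" means, the following minimal sketch picks the nearest appliance by great-circle distance. The hostnames and coordinates are hypothetical, and a real Geo DNS service such as Route 53 applies its own routing policies rather than this calculation.

```python
import math

# Hypothetical appliance hostnames with (latitude, longitude) in degrees.
APPLIANCES = {
    "primary-nyc.example.com": (40.71, -74.01),
    "replica-sfo.example.com": (37.77, -122.42),
    "replica-lon.example.com": (51.51, -0.13),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_appliance(user_location, appliances=APPLIANCES):
    """Return the hostname of the appliance closest to user_location."""
    return min(appliances, key=lambda host: haversine_km(user_location, appliances[host]))
```

For example, `nearest_appliance((48.86, 2.35))` (a user near Paris) selects the London replica in this hypothetical setup.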

Limitations

Write requests sent to a replica require sending the data to the primary and all replicas. This means that the performance of all writes is limited by the slowest replica. Geo-replication will not add capacity to a GitHub Enterprise instance or solve performance issues related to insufficient CPU or memory resources.
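
The write limitation can be sketched with a deliberately simplified model. It assumes that data is sent to all nodes in parallel and that a write completes only when the slowest node has acknowledged it. The latency figures are hypothetical and are not measurements of any real deployment.

```python
def write_latency_ms(node_latencies_ms):
    """Estimated time for a write to reach every node, assuming parallel delivery.

    node_latencies_ms maps a node name to its (hypothetical) latency in ms.
    The write is only as fast as the slowest node.
    """
    return max(node_latencies_ms.values())


# Hypothetical latencies as seen from a primary near New York.
primary_in_new_york = {"san-francisco": 70, "london": 75}

# Hypothetical latencies as seen from a primary near San Francisco.
primary_in_san_francisco = {"new-york": 70, "london": 145}

print(write_latency_ms(primary_in_new_york))       # 75
print(write_latency_ms(primary_in_san_francisco))  # 145
```

Under these assumed numbers, placing the primary near New York keeps the slowest link shorter, which is why the choice of primary location affects write performance.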

Monitoring a Geo-replication configuration

You can monitor the availability of GitHub Enterprise by checking the status code that is returned for the https://[HOSTNAME]/status URL. An appliance that can service user traffic will return status code 200 (OK). An appliance may return a 503 (Service Unavailable) for a few reasons:

  • The appliance is a passive replica, such as the replica in a two-node High Availability configuration.
  • The appliance is in maintenance mode.
  • The appliance is part of a Geo-replication configuration, but is an inactive replica.
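
A status check along these lines could be scripted as follows. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and the interpretation of each code is taken from the description above, not from an official client.

```python
import urllib.error
import urllib.request


def interpret_status(code):
    """Map a /status response code to the meanings described above."""
    if code == 200:
        return "serving user traffic"
    if code == 503:
        return "not serving traffic (passive or inactive replica, or maintenance mode)"
    return f"unexpected status {code}"


def check_status(hostname, timeout=10):
    """Request https://<hostname>/status and return (code, interpretation).

    Network failures (DNS errors, timeouts) raise urllib.error.URLError
    and are left for the caller to handle.
    """
    url = f"https://{hostname}/status"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            code = response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; the status code is still available.
        code = err.code
    return code, interpret_status(code)
```

Calling `check_status` with each appliance's own hostname, not the Geo DNS name, would show which individual nodes are currently able to serve users.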

You can also use the Replication overview dashboard available at:

https://[HOSTNAME]/setup/replication

Further reading