Configuring backups on your appliance
As part of a disaster recovery plan, you can protect production data on your GitHub Enterprise Server instance by configuring automated backups.
In this article
- About GitHub Enterprise Server Backup Utilities
- Installing GitHub Enterprise Server Backup Utilities
- Scheduling a backup
- Restoring a backup
About GitHub Enterprise Server Backup Utilities
GitHub Enterprise Server Backup Utilities is a backup system you install on a separate host, which takes backup snapshots of your GitHub Enterprise Server instance at regular intervals over a secure SSH network connection. You can use a snapshot to restore an existing GitHub Enterprise Server instance to a previous state from the backup host.
Only data added since the last snapshot will transfer over the network and occupy additional physical storage space. To minimize performance impact, backups are performed online under the lowest CPU/IO priority. You do not need to schedule a maintenance window to perform a backup.
For more detailed information on features, requirements, and advanced usage, see the GitHub Enterprise Server Backup Utilities README.
To use GitHub Enterprise Server Backup Utilities, you must have a Linux or Unix host system separate from your GitHub Enterprise Server instance.
You can also integrate GitHub Enterprise Server Backup Utilities into an existing environment for long-term permanent storage of critical data.
We recommend that the backup host and your GitHub Enterprise Server instance be geographically distant from each other. This ensures that backups are available for recovery in the event of a major disaster or network outage at the primary site.
Physical storage requirements will vary based on Git repository disk usage and expected growth patterns:
| Storage | Five times the primary instance's allocated storage |
More resources may be required depending on your usage, such as user activity and selected integrations.
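As a rough illustration of the sizing rule above, the following sketch multiplies a primary instance's allocated storage by five. The function name and the 200 GB figure are hypothetical, chosen only for the example, and the result is a starting point rather than a guarantee.

```shell
# Estimate backup host storage from the primary instance's allocated storage.
# Rule from the table above: five times the primary's allocated storage.
required_backup_storage_gb() {
  primary_allocated_gb="$1"
  echo $(( primary_allocated_gb * 5 ))
}

# Hypothetical primary instance with 200 GB allocated.
required_backup_storage_gb 200   # prints 1000
```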
Installing GitHub Enterprise Server Backup Utilities
Note: To ensure a recovered appliance is immediately available, perform backups targeting the primary instance even in a Geo-replication configuration.
- Download the latest GitHub Enterprise Server Backup Utilities release and extract the file with the tar command.
$ tar -xzvf /path/to/github-backup-utils-vMAJOR.MINOR.PATCH.tar.gz
- Copy the included example configuration file to backup.config and open it in an editor.
- Set the GHE_HOSTNAME value to your primary GitHub Enterprise Server instance's hostname or IP address.
- Set the GHE_DATA_DIR value to the filesystem location where you want to store backup snapshots.
- Open your primary instance's settings page at https://HOSTNAME/setup/settings and add the backup host's SSH key to the list of authorized SSH keys. For more information, see Accessing the administrative shell (SSH).
- Verify SSH connectivity from the backup host to your GitHub Enterprise Server instance.
- To create an initial full backup, run the ghe-backup command.
For more information on advanced usage, see the GitHub Enterprise Server Backup Utilities README.
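The configuration steps above might look like the following sketch. The hostname, data directory, and scratch working directory are hypothetical placeholders, and the sketch assumes backup.config is read as plain shell variable assignments. The last commented lines name the connectivity check and backup commands as bin/ghe-host-check and bin/ghe-backup; those paths are an assumption drawn from the Backup Utilities README rather than from the text above, and they are left commented out because they need a real instance.

```shell
# Work in a scratch directory so the sketch does not touch a real install.
workdir="$(mktemp -d)"

# Write a minimal backup.config with the two settings described above.
# Both values are placeholders; substitute your own.
cat > "$workdir/backup.config" <<'EOF'
GHE_HOSTNAME="github.example.com"
GHE_DATA_DIR="/data/ghe-backups"
EOF

# Load the file and fail early if either setting is missing or empty.
. "$workdir/backup.config"
: "${GHE_HOSTNAME:?GHE_HOSTNAME must be set}"
: "${GHE_DATA_DIR:?GHE_DATA_DIR must be set}"
echo "Backing up $GHE_HOSTNAME into $GHE_DATA_DIR"

# With a real instance and the backup host's SSH key authorized, the next
# steps would be (command paths assumed from the README, not run here):
#   bin/ghe-host-check
#   bin/ghe-backup
```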
Scheduling a backup
You can schedule regular backups on the backup host using the cron(8) command or a similar command scheduling service. The configured backup frequency dictates the worst-case recovery point objective (RPO) in your recovery plan. For example, if you schedule the backup to run every day at midnight, you could lose up to 24 hours of data in a disaster scenario. We recommend starting with an hourly backup schedule, which limits the worst-case data loss to one hour if the primary site data is destroyed.
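An hourly schedule could be expressed as a crontab entry along the lines of the sketch below. The install location /opt/backup-utils, the log file name, and the helper function are hypothetical; only the ghe-backup command name comes from this article.

```shell
# Hypothetical install location of the backup utilities on the backup host.
BACKUP_UTILS_DIR="/opt/backup-utils"

# Print a crontab entry that runs ghe-backup at minute 0 of every hour and
# appends its output (stdout and stderr) to a log file next to the utilities.
hourly_backup_cron_entry() {
  printf '0 * * * * %s/bin/ghe-backup >> %s/backup.log 2>&1\n' "$1" "$1"
}

hourly_backup_cron_entry "$BACKUP_UTILS_DIR"
```

The printed line can then be added to the backup user's crontab, for example with crontab -e. Changing the first field from 0 to */30 would run the job every 30 minutes instead, at the cost of a higher chance of overlapping runs.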
If backup attempts overlap, the ghe-backup command will abort with an error message indicating that a simultaneous backup is in progress. If this occurs, we recommend decreasing the frequency of your scheduled backups. For more information, see the "Scheduling backups" section of the GitHub Enterprise Server Backup Utilities README.
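To make the overlap behavior concrete, here is a minimal sketch of a run guard built on an atomic mkdir lock. It illustrates the general idea only; it is not how ghe-backup detects a simultaneous backup, and the function names and lock location are hypothetical.

```shell
# Scratch lock location for the demo; a real guard would use a fixed path
# shared by every scheduled run.
LOCK_DIR="$(mktemp -d)/backup.lock"

# mkdir is atomic: it succeeds for exactly one caller and fails if the
# directory already exists, which makes it usable as a simple lock.
start_backup() {
  if mkdir "$1" 2>/dev/null; then
    echo "backup started"
  else
    echo "backup already in progress, aborting" >&2
    return 1
  fi
}

# Release the lock once the run has finished.
finish_backup() {
  rmdir "$1"
}

start_backup "$LOCK_DIR"            # first run acquires the lock
start_backup "$LOCK_DIR" || true    # overlapping run is rejected
finish_backup "$LOCK_DIR"           # lock released when the run ends
```

If the second case shows up regularly in your logs, the backup is taking longer than the interval between runs, which is the signal to schedule backups less often.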
Restoring a backup
In the event of a prolonged outage or catastrophic event at the primary site, you can restore your GitHub Enterprise Server instance by provisioning another GitHub Enterprise appliance and performing a restore from the backup host. You must add the backup host's SSH key to the target GitHub Enterprise appliance as an authorized SSH key before restoring an appliance.
To restore your GitHub Enterprise Server instance from the last successful snapshot, use the ghe-restore command. You should see output similar to this:
$ ghe-restore -c 188.8.131.52
> Checking for leaked keys in the backup snapshot that is being restored ...
> * No leaked keys found
> Connect 184.108.40.206:122 OK (v2.9.0)
> WARNING: All data on GitHub Enterprise appliance 220.127.116.11 (v2.9.0)
>          will be overwritten with data from snapshot 20170329T150710.
> Please verify that this is the correct restore host before continuing.
> Type 'yes' to continue: yes
> Starting restore of 18.104.22.168:122 from snapshot 20170329T150710
# ...output truncated
> Completed restore of 22.214.171.124:122 from snapshot 20170329T150710
> Visit https://126.96.36.199/setup/settings to review appliance configuration.
Note: The network settings are excluded from the backup snapshot. You must manually configure the network on the target GitHub Enterprise Server appliance as required for your environment.
You can use these additional options with the ghe-restore command:
- The -c flag overwrites the settings, certificate, and license data on the target host even if it is already configured. Omit this flag if you are setting up a staging instance for testing purposes and you wish to retain the existing configuration on the target. For more information, see the "Using backup and restore commands" section of the GitHub Enterprise Server Backup Utilities README.
- The -s flag allows you to select a different backup snapshot.
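The sketch below shows one way to find a snapshot ID to pass to the -s flag. It assumes that snapshots are stored in the backup data directory as subdirectories named after their timestamp, in the same format as the 20170329T150710 ID in the output above. That layout, the helper function, and the two made-up snapshot IDs are assumptions for illustration.

```shell
# List snapshot IDs found in a backup data directory, oldest first.
# Assumes snapshot directories are named like 20170329T150710.
list_snapshot_ids() {
  ls -1 "$1" | grep -E '^[0-9]{8}T[0-9]{6}$' | sort
}

# Demo against a scratch directory with two made-up snapshot IDs.
data_dir="$(mktemp -d)"
mkdir "$data_dir/20170328T150710" "$data_dir/20170329T150710"

# Pick the oldest snapshot rather than the most recent one.
snapshot="$(list_snapshot_ids "$data_dir" | head -n 1)"
echo "$snapshot"

# Restoring that snapshot to a target host (HOSTNAME is a placeholder) while
# overwriting its settings would then combine both flags, for example:
#   ghe-restore -c -s "$snapshot" HOSTNAME
```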