Backup on Linux with duply

From Thomas-Krenn-Wiki

Duply is a simple shell script for creating incremental, GPG-encrypted file-level backups. Duply operates as a front-end to duplicity.[1] With Duply, you can create backups either locally on the protected computer or on a separate (remote) system. Duply supports ftp, ssh, s3, rsync, cifs, webdav and http.

Installation

On Ubuntu or Debian, Duply can be installed with the following command:

aptitude install duply

Configuration

A new Duply profile is created with the command duply <backupname> create. Because a full backup needs read access to every file in the backed-up directories, duply should be run with root privileges.

A duply profile is created in the user's home directory under ~/.duply/ and consists of the following files:

  • conf
  • exclude
  • post
  • pre
  • gpg-key.asc (optional; only present if the GPG key was exported)
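
To check which of these files a profile currently contains, a small helper like the following can be used. This is only a sketch: the function name list_profile_files is hypothetical, and the file names are simply the ones listed above.

```shell
# Print the names of the known duply profile files that exist in the
# given profile directory (file names taken from the list above).
# Usage: list_profile_files /root/.duply/backup
list_profile_files() {
    dir=$1
    for name in conf exclude post pre gpg-key.asc; do
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

A freshly created profile will normally not contain all of the files yet; pre, post and the exported key are added as needed.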

GPG Key Creation

A new GPG key is created with gpg --gen-key. During key creation, it is advisable to perform other work on the host in order to increase the entropy available to the system.

gpg --gen-key
gpg (GnuPG) 1.4.11; Copyright (C) 2010 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What key size do you want? (2048) 4096
Requested key size is 4096 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"

Real name: Example User
Email address: email@example.com
Comment: 
You selected this USER-ID:
    "Example User <email@example.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Pass phrase to protect your secret key.

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

Not enough random bytes available.  Please do some other work to give
the OS a chance to collect more entropy! (Need 253 more bytes)
..........+++++

gpg: key 9627014B marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u
pub   4096R/9627014B 2013-06-07
      Key fingerprint = 705D B57E 8526 FB24 360E  E54D 13A1 AC6B 9627 014B
uid                  Example User <email@example.com>
sub   4096R/DB7D5661 2013-06-07

Backup Configuration

To encrypt backups with a GPG key, Duply requires the key ID and the key's passphrase. The example here uses the key ID 9627014B. Both values must be stored in the automatically generated configuration file in the Duply profile folder (here: /root/.duply/backup/conf).

pub   4096R/9627014B 2013-06-07
   Key fingerprint = 705D B57E 8526 FB24 360E  E54D 13A1 AC6B 9627 014B
uid                  Example User <email@example.com>
sub   4096R/DB7D5661 2013-06-07
conf File
GPG_KEY='_KEY_ID_'
GPG_PW='_GPG_PASSWORD_'
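
In the gpg output above, the short key ID (9627014B) is the last eight hex digits of the key fingerprint. The following sketch derives it from a fingerprint string; the function name short_key_id is hypothetical and not part of gpg or duply.

```shell
# Derive the 8-digit short key ID from a GPG fingerprint as printed by gpg.
# Spaces in the fingerprint are removed before the last 8 characters are kept.
short_key_id() {
    fp=$(printf '%s' "$1" | tr -d ' ')
    printf '%s\n' "$fp" | sed 's/.*\(........\)$/\1/'
}

# Example with the fingerprint shown above:
# short_key_id '705D B57E 8526 FB24 360E  E54D 13A1 AC6B 9627 014B'
```

The resulting value is what replaces _KEY_ID_ in the conf file, for example GPG_KEY='9627014B'.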

Additional GPG options, such as the compression and encryption algorithms, can be set with GPG_OPTS.

GPG_OPTS='--compress-algo=bzip2 --personal-cipher-preferences AES256,AES192'

Before each action, Duply checks that the GPG key is valid and that the passphrase is correct. This check can be disabled with the option GPG_TEST='disabled'.

#GPG_TEST='disabled'

The next step is the selection of the backup target. Duply understands all common protocols for transferring data. The host syntax is as follows:

scheme://[user:password@]host[:port]/[/]path

The conf file contains a list of the supported protocols and their syntax.

#   file://[/absolute_]path
#   ftp[s]://user[:password]@other.host[:port]/some_dir
#   hsi://user[:password]@other.host/some_dir
#   cf+http://container_name
#   imap[s]://user[:password]@host.com[/from_address_prefix]
#   rsync://user[:password]@other.host[:port]::/module/some_dir
#   # rsync over ssh (only keyauth)
#   rsync://user@other.host[:port]/relative_path
#   rsync://user@other.host[:port]//absolute_path
#   # for the s3 user/password are AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY
#   s3://[user:password]@host/bucket_name[/prefix]
#   s3+http://[user:password]@bucket_name[/prefix]
#   # scp and sftp are aliases for the ssh backend
#   ssh://user[:password]@other.host[:port]/some_dir
#   tahoe://alias/directory
#   webdav[s]://user[:password]@other.host/some_dir

Note that special characters in the user name or password must be URL-encoded. Alternatively, the credentials can be entered in the TARGET_USER and TARGET_PASS parameters.

TARGET='scheme://user[:password]@host[:port]/[/]path'
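
URL-encoding a password by hand is error-prone, so a small helper can do it. The following is a minimal sketch for ASCII input only; the function name urlencode and the user name, password and host in the usage comment are assumptions for illustration.

```shell
# Percent-encode every character except unreserved ones (A-Z a-z 0-9 . ~ _ -).
# Intended for ASCII input such as passwords; runs in a subshell so that the
# locale setting and helper variables do not leak into the caller.
urlencode() (
    LC_ALL=C
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        s=$rest
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
    done
    printf '%s\n' "$out"
)

# Hypothetical usage in the conf file:
# TARGET="ftp://backupuser:$(urlencode 'p@ss:w/rd')@other.host/some_dir"
```

With the password from the usage comment, the characters @, : and / are replaced by %40, %3A and %2F, so they can no longer be mistaken for URL delimiters.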

Next, the backup root directory is set with the SOURCE parameter. If several subdirectories of / are to be backed up (e.g. /etc, /var and /home), SOURCE must likewise be set to /; the individual directories are then selected in the exclude file.

SOURCE='/'

The following parameters control the maximum age and number of full backups that Duply retains. Note that Duply does not delete any backups unless explicitly requested. MAX_AGE sets the maximum backup age.

MAX_AGE=1Y

MAX_FULL_BACKUPS sets the maximum number of full backups that duply retains.

MAX_FULL_BACKUPS=5

Alternatively, MAX_FULLBKP_AGE specifies how old the latest full backup may be before a new full backup is created.

MAX_FULLBKP_AGE=2W
DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE " 
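
Values such as 1Y and 2W are duplicity time intervals. The sketch below converts a single-unit interval into days, assuming duplicity's documented convention that a month counts as 30 days and a year as 365 days. The function name interval_to_days is hypothetical, and combined intervals or the units s, m and h are deliberately not handled.

```shell
# Convert a single-unit interval (e.g. 14D, 2W, 6M, 1Y) into days.
# Assumption: M = 30 days, Y = 365 days.
interval_to_days() {
    n=${1%?}            # numeric part, e.g. "2" from "2W"
    unit=${1#"$n"}      # unit letter, e.g. "W"
    case $n in
        ''|*[!0-9]*) echo "unsupported interval: $1" >&2; return 1 ;;
    esac
    case $unit in
        D) echo "$n" ;;
        W) echo $(( $n * 7 )) ;;
        M) echo $(( $n * 30 )) ;;
        Y) echo $(( $n * 365 )) ;;
        *) echo "unsupported unit: $unit" >&2; return 1 ;;
    esac
}
```

Under these assumptions, MAX_AGE=1Y corresponds to 365 days and MAX_FULLBKP_AGE=2W to 14 days.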

To limit the time lost to transmission errors, duply splits its backups into files (volumes) of 25 MB. This size can be changed with the VOLSIZE parameter, which is given in MB.

VOLSIZE=10
DUPL_PARAMS="$DUPL_PARAMS --volsize $VOLSIZE "
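
The effect of VOLSIZE on the number of files at the backup target can be estimated with a rounded-up division. This is a rough sketch only: the function name volume_count is hypothetical, and the real count depends on how well the data compresses.

```shell
# Estimate how many volumes a backup of total_mb megabytes is split into
# when each volume holds at most volsize_mb megabytes (rounded up).
volume_count() {
    total_mb=$1
    volsize_mb=$2
    echo $(( ($total_mb + $volsize_mb - 1) / $volsize_mb ))
}
```

For 1024 MB of backup data, this gives 41 volumes at 25 MB and 103 volumes at VOLSIZE=10. Smaller volumes mean less data to resend after a failed transfer, but more files at the target.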

The optional VERBOSITY and TEMP_DIR parameters can also be set.

Pre and Post Scripts

Duply allows the use of pre and post scripts. The pre script is executed just before the backup, the post script directly after it. With these scripts, for example, snapshots of LVM volumes or SQL database dumps can be created and included in the backup. The pre and post files must be executable and located in the respective duply profile directory (e.g. /home/user/.duply/backup/).

Example

The following example pre and post scripts create an SQL dump of all databases before the backup and delete it after the backup.

pre File
/usr/bin/mysqldump --all-databases -u root -ppw > /tmp/sqldump-$(date '+%F')
post File
/bin/rm /tmp/sqldump-$(date '+%F')
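
Because the pre and post scripts above each compute the file name from the current date, a backup that starts before midnight and ends after it would leave the dump behind. One possible way around this is a fixed dump path. The following is a minimal sketch under that assumption; the path /tmp/sqldump.sql and the function names pre_dump and post_cleanup are hypothetical.

```shell
# Fixed dump location used by both scripts (hypothetical path).
DUMP_FILE=${DUMP_FILE:-/tmp/sqldump.sql}

# pre: run the given dump command and write its output to DUMP_FILE.
# The function returns the exit status of the dump command.
pre_dump() {
    "$@" > "$DUMP_FILE"
}

# post: remove the dump; -f keeps rm quiet if the file is already gone.
post_cleanup() {
    rm -f -- "$DUMP_FILE"
}

# In the pre file:   pre_dump /usr/bin/mysqldump --all-databases -u root -ppw
# In the post file:  post_cleanup
```

Since pre and post are separate files, each one needs the DUMP_FILE line and its own function.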

Exclude

Duply's exclude file is typically used as a whitelist. To restrict a backup to certain directories or files, the file exclude must be created in the Duply profile directory. Directories and files are included with + /path/to/file and excluded with - /path/to/directory. In addition, Duply allows the use of wildcards. The exclude file shown here backs up the directories /etc/, /root/ and /var/www/ and excludes everything else.

+ /etc/
+ /root/
+ /var/www/
- **
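
A whitelist of this form can also be generated from a list of directories. The following is a small sketch; the function name make_exclude is hypothetical, and the output path in the usage comment assumes the profile directory used earlier in this article.

```shell
# Print an exclude file that includes the given paths and excludes the rest.
make_exclude() {
    for path in "$@"; do
        printf '+ %s\n' "$path"
    done
    # Final rule: exclude everything that was not included above.
    printf '%s\n' '- **'
}

# Example:
# make_exclude /etc/ /root/ /var/www/ > /root/.duply/backup/exclude
```

The include lines must come before the final - ** line, as in the file shown above.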

Duply Parameter

Duply offers a variety of command line parameters for the backup and recovery of data. The complete list can be found on Duply's main page.

When several parameters are used, they are separated by an underscore (_).

The command /usr/bin/duply /root/.duply/test full_verify_purge --force creates a full backup, verifies it and deletes old backups. On its own, purge only lists the backups that exceed MAX_AGE; they are deleted only when the additional option --force is given.

The command /usr/bin/duply /root/.duply/test incr performs an incremental backup.

Cronjob

Duply does not provide a daemon service, but it can be run regularly via Cron, for example.

A good example of a Cronjob configuration would be:

0 0 * * 7 /usr/bin/duply /root/.duply/test full_verify_purge --force
0 0 * * 1-6 /usr/bin/duply /root/.duply/test incr

In this configuration, a full backup is created on Sundays at 0:00 and old backups are deleted. Incremental backups are performed at 0:00 from Monday to Saturday. Cronjobs are created and edited with the crontab -e command. Note that absolute paths must be specified for all commands and configuration files.
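
The same schedule can also be expressed in a single wrapper script that is called once a day. The following sketch maps the weekday number from date +%u (1 = Monday, 7 = Sunday) to the duply commands used in the crontab above; the function name duply_commands_for_weekday is hypothetical.

```shell
# Map an ISO weekday number (1 = Monday ... 7 = Sunday) to the duply
# commands from the crontab example above.
duply_commands_for_weekday() {
    case $1 in
        7) echo "full_verify_purge --force" ;;
        1|2|3|4|5|6) echo "incr" ;;
        *) echo "invalid weekday: $1" >&2; return 1 ;;
    esac
}

# Usage in a wrapper script (word splitting of the result is intended):
# /usr/bin/duply /root/.duply/test $(duply_commands_for_weekday "$(date +%u)")
```

Putting the date call into a script rather than directly into the crontab line also avoids having to escape the % character, which has a special meaning in crontab entries.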

Additional Information

References

  1. Duplicity (duplicity.nongnu.org)
