Git-annex manages files in a git repository without placing their contents directly into the git repo. This seems somewhat paradoxical at first, but it keeps git from having to manage overly large files. Only the file name and the associated metadata are stored directly in the git repo. The file contents themselves are stored in a separate folder and are managed by git-annex.
Git-annex supports different usage scenarios and safety features. It can ensure that multiple copies of a file exist across the repositories. As a result, a file cannot be deleted accidentally, because git-annex checks the number of copies first. Furthermore, the complete contents of every file no longer need to be available on each system. They can be retrieved from other systems via git annex get when needed.
git-annex can be installed on Ubuntu either from the repos or manually:
- From the repos, by installing the git-annex package. The packaged version lags behind the current development versions.
- Manually: git-annex is available as pre-compiled software. The binaries are unpacked and used on the target system:
:~$ wget 'http://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-amd64.tar.gz'
:~$ tar xzf git-annex-standalone-amd64.tar.gz
:~$ PATH="$PATH:$HOME/downloads/git-annex.linux"
:~$ git-annex version
git-annex version: 4.20131003-gbe0b734
build flags: Assistant Webapp Pairing Testsuite S3 WebDAV Inotify DBus XMPP Feeds Quvi TDFA
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SHA256 SHA1 SHA512 SHA224 SHA384 WORM URL
remote types: git gcrypt S3 bup directory rsync web webdav glacier hook
The PATH change can also be made permanent on Ubuntu (see Permanently setting environment variables on Ubuntu) by adding the git-annex.linux directory to PATH in the .pam_environment file in the home directory.
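As an alternative to .pam_environment, a shell startup file can extend PATH. The following is only a minimal sketch: it assumes the unpack location used above ($HOME/downloads/git-annex.linux), the helper name path_append is hypothetical, and placing the lines in a file such as ~/.profile is an assumption rather than something this article prescribes.

```shell
# Hypothetical helper: append a directory to PATH only if it is not listed yet,
# so that sourcing the startup file repeatedly does not grow PATH.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;        # not present yet, append it
    esac
}

# Assumed unpack location, taken from the PATH assignment shown above.
path_append "$HOME/downloads/git-annex.linux"
export PATH
```

Calling path_append a second time with the same directory leaves PATH unchanged.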
In the so-called "Indirect Mode", git-annex manages each file name as a symbolic link that points to the actual content:
:~/annex$ ls -la
[...]
debhelper-slides.pdf -> .git/annex/objects/32/64/SHA256E-s1988981--8aaa02dda217bbabd79a11a5f93fdd4ca8ae4e723c86b4bb91c69d4095a84006.pdf/SHA256E-s1988981--8aaa02dda217bbabd79a11a5f93fdd4ca8ae4e723c86b4bb91c69d4095a84006.pdf
If the content belonging to a file name is available locally, the symbolic link is valid. Otherwise, the link is broken, and the file content must first be retrieved from another repository using git annex get.
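The link target in the listing above ends in the git-annex key, which there has the form BACKEND-sSIZE--HASH.ext. As a minimal sketch, assuming keys of exactly that shape (the helper names annex_key_backend and annex_key_size are hypothetical and not part of git-annex), the backend and the file size can be read from a key with plain shell parameter expansion:

```shell
# Hypothetical helpers, not part of git-annex: split a key of the assumed
# form BACKEND-sSIZE--HASH.ext, as seen at the end of the symlink target.

annex_key_backend() {
    # Everything before the first "-" is the backend name.
    printf '%s\n' "${1%%-*}"
}

annex_key_size() {
    # Drop the backend prefix up to "-s", then cut at the "--" separator.
    rest=${1#*-s}
    printf '%s\n' "${rest%%--*}"
}

# Key copied from the listing above.
key='SHA256E-s1988981--8aaa02dda217bbabd79a11a5f93fdd4ca8ae4e723c86b4bb91c69d4095a84006.pdf'
annex_key_backend "$key"
annex_key_size "$key"
```

For this key, the two calls print SHA256E and 1988981, which matches the file size shown by ls after unlocking. Inside a repository, the key could be obtained with basename "$(readlink debhelper-slides.pdf)".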
Modifying Files in Indirect Mode
To edit a file that is located in the git-annex repository, the file must first be unlocked. This step comes first so that the file is preserved in case of accidental loss (git-annex checks whether numcopies copies of the file exist in other repos):
:~/annex$ git annex unlock debhelper-slides.pdf
unlock debhelper-slides.pdf (copying...) ok
:~/annex$ ls -la
[...]
-rw-r--r-- 1 tktest tktest 1988981 Oct 9 12:30 debhelper-slides.pdf
After unlocking, the file can be edited. A subsequent commit turns it back into a symbolic link via the pre-commit hook installed by git-annex.
In addition to Indirect Mode, there is Direct Mode, which offers the convenience of editing files directly. In exchange, the safety features git-annex normally provides are given up. Therefore, git-annex repositories generally start in Indirect Mode. Repositories created via the web interface of the git-annex assistant are the exception. A repository can also be switched between indirect and direct mode using commands:
:~/annex$ git annex direct
commit
# On branch master
nothing to commit (working directory clean)
ok
direct debhelper-slides.pdf ok
direct git-pkg-2011.pdf ok
direct ok
Create and Manage Your First Directory
In the following example, Alice and Bob exchange data directly. Each has a git-annex repository in which the data is managed. Since they can reach each other directly via SSH, they can exchange files with git-annex as follows:
- Alice adds a file to git-annex on her system and updates the metadata with git annex sync.
- Bob runs git annex sync to synchronize as well. At first, he only receives a broken symbolic link for the file.
- Bob runs git annex get and retrieves the contents of the file.
- Alice actively copies the file to Bob using git annex copy.
- Bob picks up the files that have come from Alice in his local git-annex using git annex sync.
- git-annex ensures that each file is present at least once. As long as the other still holds the file, either of them can drop it (the file name, i.e. the link, is kept).
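The drop rule in the last point can be pictured as a small check. This is only a sketch of the idea, not git-annex's own code: the function name safe_to_drop is hypothetical, and it assumes that a drop is allowed when at least numcopies other repositories still hold the content, with 1 assumed as the value of numcopies when none is given.

```shell
# Hypothetical model of the rule, not git-annex code:
# dropping the local copy is fine while enough other copies remain.
safe_to_drop() {
    other_copies=$1          # copies held by other repositories
    numcopies=${2:-1}        # required number of copies (assumed 1 if omitted)
    [ "$other_copies" -ge "$numcopies" ]
}

# Bob still holds the file, so Alice may drop hers.
if safe_to_drop 1; then echo "drop allowed"; else echo "drop refused"; fi

# Nobody else holds the file, so the drop has to be refused.
if safe_to_drop 0; then echo "drop allowed"; else echo "drop refused"; fi
```

The first call prints "drop allowed" and the second prints "drop refused".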
git-annex Repo Alice
Alice creates annex, a git-annex repository in her directory:
:~/annex$ git init
Initialized empty Git repository in /home/alice/annex/.git/
:~/annex$ git annex init "Alice"
init Alice ok
(Recording state in git...)
She adds the data that she wants to manage with git-annex:
:~/annex$ cp ~/Downloads/debhelper-slides.pdf .
:~/annex$ git annex add .
add debhelper-slides.pdf (checksum...) ok
(Recording state in git...)
:~/annex$ git commit -a -m "Added slides"
[master (root-commit) 761a810] Added slides
 1 file changed, 1 insertion(+)
 create mode 120000 debhelper-slides.pdf
To synchronize with Bob, she adds his git-annex repository as a remote repository:
:~/annex$ git remote add bob ssh://firstname.lastname@example.org/home/bob/annex
git-annex Repo Bob
Bob clones the existing repo of Alice and initializes his git-annex repository with his name:
:~/annex$ git clone ssh://email@example.com/home/alice/annex .
Cloning into '.'...
remote: Counting objects: 13, done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 13 (delta 2), reused 0 (delta 0)
Receiving objects: 100% (13/13), done.
Resolving deltas: 100% (2/2), done.
:~/annex$ git annex init "Bob"
init Bob ok
(Recording state in git...)
He also adds Alice's repository as a remote:
:~/annex$ git remote add alice ssh://firstname.lastname@example.org/home/alice/annex
Alice and Bob are now Synchronized
Alice and Bob can each run git annex sync on their side to update each other's repos and share changes:
:~/annex$ git annex sync bob
commit
ok
pull bob
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 5 (delta 0), reused 1 (delta 0)
Unpacking objects: 100% (5/5), done.
From ssh://192.168.56.104/home/bob/annex
 * [new branch] git-annex -> bob/git-annex
 * [new branch] master -> bob/master
ok
(merging bob/git-annex into git-annex...)
(Recording state in git...)
push bob
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 435 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://email@example.com/home/bob/annex
 * [new branch] git-annex -> synced/git-annex
 * [new branch] master -> synced/master
ok
When Alice has added new files, Bob first receives a broken symbolic link without the actual file data:
:~/annex$ git annex sync alice
(merging synced/git-annex origin/git-annex into git-annex...)
commit
ok
pull alice
From ssh://192.168.56.1/home/alice/annex
 * [new branch] git-annex -> alice/git-annex
 * [new branch] master -> alice/master
 * [new branch] synced/master -> alice/synced/master
ok
The following find command lists broken symbolic links:
:~/annex$ find -L . -type l
./git-pkg-2011.pdf
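The same check can be written without relying on -L. As a sketch (the function name list_broken_links is hypothetical), find hands each symbolic link to test -e, which follows the link and fails when the target is missing:

```shell
# Hypothetical helper: print symbolic links below a directory whose
# target does not exist. The .git directory is skipped.
list_broken_links() {
    find "${1:-.}" -name .git -prune -o -type l ! -exec test -e {} \; -print
}

# Small demonstration in a scratch directory.
demo=$(mktemp -d)
echo "content" > "$demo/present.txt"
ln -s present.txt "$demo/ok-link"        # target exists
ln -s missing.txt "$demo/broken-link"    # target does not exist
list_broken_links "$demo"                # prints only the broken-link path
rm -rf "$demo"
```

In a git-annex repository, list_broken_links . would list the files whose content still has to be fetched.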
With git annex get, he retrieves the file contents from Alice:
:~/annex$ git annex get .
get debhelper-slides.pdf (from alice...)
SHA256E-s1988981--8aaa02dda217bbabd79a11a5f93fdd4ca8ae4e723c86b4bb91c69d4095a84006.pdf
1988981 100% 24.63MB/s 0:00:00 (xfer#1, to-check=0/1)
sent 30 bytes received 1989376 bytes 265254.13 bytes/sec
total size is 1988981 speedup is 1.00
ok
(Recording state in git...)
Alice Copies the Data to Bob
:~/annex$ cp ~/Downloads/git-pkg-2011.pdf .
:~/annex$ git annex add .
add git-pkg-2011.pdf (checksum...) ok
(Recording state in git...)
:~/annex$ git commit -a -m "Added tutorial"
[master 50c8091] Added tutorial
 1 file changed, 1 insertion(+)
 create mode 120000 git-pkg-2011.pdf
:~/annex$ git annex copy . --to bob
copy debhelper-slides.pdf (checking bob...) ok
copy git-pkg-2011.pdf (checking bob...) (to bob...)
SHA256E-s359984--e87901d377b5c31377a87eb07a28cd133b07feed380f869867abb04bc85d3e47.pdf
359984 100% 52.01MB/s 0:00:00 (xfer#1, to-check=0/1)
sent 360173 bytes received 31 bytes 720408.00 bytes/sec
total size is 359984 speedup is 1.00
ok
(Recording state in git...)
When Bob synchronizes, the file contents are already present. Without git annex copy, Bob would only have found a broken symbolic link:
:~/annex$ git annex sync
commit
ok
pull origin
:~/annex$ ls
debhelper-slides.pdf git-pkg-2011.pdf
Since Alice copied the file, it is now present in both Alice's and Bob's repositories:
:~/annex$ git annex whereis .
whereis debhelper-slides.pdf (2 copies)
  de5e57a3-4517-4a05-84ee-60708bbd9d3b -- here (Bob)
  e3d44122-8756-4f1c-aa5b-5ecdfe01bc4b -- origin (Alice)
ok
whereis git-pkg-2011.pdf (2 copies)
  de5e57a3-4517-4a05-84ee-60708bbd9d3b -- here (Bob)
  e3d44122-8756-4f1c-aa5b-5ecdfe01bc4b -- origin (Alice)
ok
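To check the copy counts in a script, the human-readable whereis output can be filtered. A minimal sketch, assuming the output format shown above stays as it is (the function name whereis_counts is hypothetical):

```shell
# Hypothetical helper: read `git annex whereis` output on stdin and print
# "<copies> <file>" for every "whereis FILE (N copies)" line.
whereis_counts() {
    sed -n 's/^whereis \(.*\) (\([0-9][0-9]*\) cop[a-z]*) *$/\2 \1/p'
}

# Sample input copied from the listing above.
whereis_counts <<'EOF'
whereis debhelper-slides.pdf (2 copies)
  de5e57a3-4517-4a05-84ee-60708bbd9d3b -- here (Bob)
  e3d44122-8756-4f1c-aa5b-5ecdfe01bc4b -- origin (Alice)
ok
whereis git-pkg-2011.pdf (2 copies)
  de5e57a3-4517-4a05-84ee-60708bbd9d3b -- here (Bob)
  e3d44122-8756-4f1c-aa5b-5ecdfe01bc4b -- origin (Alice)
ok
EOF
```

For the sample input this prints "2 debhelper-slides.pdf" and "2 git-pkg-2011.pdf". In a repository, the function would be fed with git annex whereis . | whereis_counts.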
Author: Georg Schönberger