.. The content of this file was auto-generated by your mad uncle from Germany
using the cast2rst script and a recorded asciicast as input.
Do not edit this file!
DataLad provides seamless management of nested Git repositories...
Let's create a dataset
.. code-block:: ansi-color
~ % datalad create demo
[INFO] Creating a new annex repo at /demo/demo
create(ok): /demo/demo (dataset)
~ % cd demo
A DataLad dataset is just a Git repo with some initial configuration
.. code-block:: ansi-color
~/demo % git log --oneline
472e34b (HEAD -> master) [DATALAD] new dataset
f968257 [DATALAD] Set default backend for all files to be MD5E
We can generate nested datasets by telling DataLad to register a
new dataset in a parent dataset
.. code-block:: ansi-color
~/demo % datalad create -d . sub1
[INFO] Creating a new annex repo at /demo/demo/sub1
add(ok): sub1 (dataset) [added new subdataset]
add(notneeded): sub1 (dataset) [nothing to add from /demo/demo/sub1]
add(notneeded): .gitmodules (file) [already included in the dataset]
save(ok): /demo/demo (dataset)
create(ok): sub1 (dataset)
action summary:
add (notneeded: 2, ok: 1)
create (ok: 1)
save (ok: 1)
A subdataset is nothing more than a regular Git submodule
.. code-block:: ansi-color
~/demo % git submodule
5f0cddf2026e3fb4864139f27e7415fd72c7d4d0 sub1 (heads/master)
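Because a subdataset is recorded as a plain submodule, the status line printed by ``git submodule`` can be read by any tool. The following is a minimal sketch, not part of DataLad, that splits such a line into its fields. The names ``SubmoduleStatus`` and ``parse_submodule_line`` are hypothetical, and the sketch assumes paths without whitespace.

```python
from typing import NamedTuple, Optional


class SubmoduleStatus(NamedTuple):
    state: str               # "current", "uninitialized", "modified", or "conflict"
    commit: str              # commit hash recorded for the submodule
    path: str                # submodule path relative to the parent repository
    describe: Optional[str]  # e.g. "heads/master"; absent for uninitialized submodules


# Git prefixes the hash with one of these characters (or a space if all is in sync).
_PREFIXES = {"-": "uninitialized", "+": "modified", "U": "conflict"}


def parse_submodule_line(line: str) -> SubmoduleStatus:
    """Parse one line of `git submodule` output (assumes no spaces in the path)."""
    state = "current"
    if line[:1] in _PREFIXES:
        state = _PREFIXES[line[0]]
        line = line[1:]
    parts = line.split()
    commit, path = parts[0], parts[1]
    describe = parts[2].strip("()") if len(parts) > 2 else None
    return SubmoduleStatus(state, commit, path, describe)
```

Applied to the line shown above, this reports ``sub1`` as a submodule in the ``current`` state, described as ``heads/master``.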
Of course subdatasets can be nested
.. code-block:: ansi-color
~/demo % datalad create -d . sub1/justadir/sub2
[INFO] Creating a new annex repo at /demo/demo/sub1/justadir/sub2
add(ok): sub1/justadir/sub2 (dataset) [added new subdataset]
add(notneeded): sub1/justadir/sub2 (dataset) [nothing to add from /demo/demo/sub1/justadir/sub2]
add(notneeded): sub1/.gitmodules (file) [already included in the dataset]
add(notneeded): sub1 (dataset) [already known subdataset]
save(ok): /demo/demo/sub1 (dataset)
save(ok): /demo/demo (dataset)
create(ok): sub1/justadir/sub2 (dataset)
action summary:
add (notneeded: 3, ok: 1)
create (ok: 1)
save (ok: 2)
Unlike Git, DataLad automatically takes care of committing all
changes associated with the added subdataset up to the given
parent dataset
.. code-block:: ansi-color
~/demo % git status
On branch master
nothing to commit, working tree clean
Let's create some content in the deepest subdataset
.. code-block:: ansi-color
~/demo % mkdir sub1/justadir/sub2/anotherdir
~/demo % touch sub1/justadir/sub2/anotherdir/afile
Git can only tell us that something underneath the top-most
subdataset was modified
.. code-block:: ansi-color
~/demo % git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
(commit or discard the untracked or modified content in submodules)
modified: sub1 (untracked content)
no changes added to commit (use "git add" and/or "git commit -a")
DataLad saves us from further investigation
.. code-block:: ansi-color
~/demo % datalad diff -r
modified(dataset): sub1
modified(dataset): sub1/justadir/sub2
untracked(directory): sub1/justadir/sub2/anotherdir
Like Git, DataLad can report individual untracked files, and it
does so across repository boundaries
.. code-block:: ansi-color
~/demo % datalad diff -r --report-untracked all
modified(dataset): sub1
modified(dataset): sub1/justadir/sub2
untracked(file): sub1/justadir/sub2/anotherdir/afile
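The report lines follow a regular ``state(type): path`` pattern, which makes them easy to post-process. The sketch below is an illustration only: it assumes the line format exactly as printed in this demo (it may differ in other DataLad versions), and the helper names ``parse_diff_report`` and ``untracked_by_dataset`` are hypothetical. It groups untracked paths under the deepest reported dataset that contains them.

```python
import re
from collections import defaultdict

# Matches lines such as "modified(dataset): sub1" (format assumed from this demo).
_LINE = re.compile(r"^(?P<state>\w+)\((?P<type>\w+)\):\s+(?P<path>.+)$")


def parse_diff_report(text):
    """Turn report text into a list of (state, type, path) tuples."""
    records = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            records.append((match["state"], match["type"], match["path"]))
    return records


def untracked_by_dataset(records):
    """Group untracked paths by the deepest reported dataset containing them."""
    # Longest dataset paths first, so the deepest match wins.
    datasets = sorted(
        (path for _, kind, path in records if kind == "dataset"),
        key=len,
        reverse=True,
    )
    grouped = defaultdict(list)
    for state, _kind, path in records:
        if state != "untracked":
            continue
        # "." stands for the top-level dataset when no subdataset matches.
        owner = next((d for d in datasets if path.startswith(d + "/")), ".")
        grouped[owner].append(path)
    return dict(grouped)
```

For the report above, the single untracked file is attributed to ``sub1/justadir/sub2`` rather than to ``sub1`` or the top-level dataset.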
Adding this new content with Git or git-annex would be an exercise,
because the file has to be added from within the repository that contains it
.. code-block:: ansi-color
~/demo % git add sub1/justadir/sub2/anotherdir/afile
fatal: Pathspec 'sub1/justadir/sub2/anotherdir/afile' is in submodule 'sub1'
DataLad does not require users to determine the correct repository
in the tree
.. code-block:: ansi-color
~/demo % datalad add -d . sub1/justadir/sub2/anotherdir/afile
add(ok): sub1/justadir/sub2/anotherdir/afile (file)
save(ok): /demo/demo/sub1/justadir/sub2 (dataset)
save(ok): /demo/demo/sub1 (dataset)
save(ok): /demo/demo (dataset)
action summary:
add (ok: 1)
save (ok: 3)
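To make the idea of "determining the correct repository" concrete, here is a rough sketch of how the chain of repositories containing a path could be found by walking up the directory tree. It is not DataLad's implementation; ``repo_chain`` is a hypothetical helper, and it assumes a repository is any directory holding a ``.git`` entry.

```python
from pathlib import Path
from typing import List


def repo_chain(path: Path, top: Path) -> List[Path]:
    """Return the repositories containing `path`, deepest first, ending at `top`.

    Assumption: a directory is a repository if it contains a `.git` entry,
    which may be a directory or (as is common for submodules) a file.
    """
    top = top.resolve()
    current = path.resolve()
    current.relative_to(top)  # raises ValueError if `path` is not under `top`
    if not current.is_dir():
        current = current.parent
    chain = []
    while True:
        if (current / ".git").exists():
            chain.append(current)
        if current == top:
            break
        current = current.parent
    return chain
```

For ``sub1/justadir/sub2/anotherdir/afile`` in the layout of this demo, the chain would be ``sub2``, ``sub1``, ``demo``, which matches the order of the ``save`` results above.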
Again, all associated changes in the entire dataset tree, up to
the given parent dataset, were committed
.. code-block:: ansi-color
~/demo % git status
On branch master
nothing to commit, working tree clean
DataLad's 'diff' is able to report the changes from these related
commits throughout the repository tree
.. code-block:: ansi-color
~/demo % datalad diff --revision @~1 -r
modified(dataset): sub1
modified(dataset): sub1/justadir/sub2
added(file): sub1/justadir/sub2/anotherdir/afile
Demo was using datalad 0.9.2.dev1. Discover more at http://datalad.org
.. code-block:: ansi-color
~/demo % exit