.. -*- mode: rst -*-
.. vi: set ft=rst sts=4 ts=4 sw=4 et tw=79:

.. _chap_design_drop:

***********************
Drop dataset components
***********************

.. topic:: Specification scope and status

   This specification is a proposal, subject to review and further discussion.
   It is now partially implemented in the `drop` command.

§1 The :command:`drop` command is the antagonist of :command:`get`. Whatever
:command:`drop` does should be undoable by a subsequent :command:`get` (given
unchanged remote availability).

§2 Like :command:`get`, :command:`drop` primarily operates on a mandatory path
specification (to discover relevant files and subdatasets to operate on).

§3 :command:`drop` has a ``--what`` parameter that serves as an extensible
"mode-switch" to cover all relevant scenarios, like 'drop all file content in
the work-tree' (e.g. ``--what files``, default, `#5858
<https://github.com/datalad/datalad/issues/5858>`__), 'drop all keys from any
branch' (i.e. ``--what allkeys``, `#2328
<https://github.com/datalad/datalad/issues/2328>`__), but also '"drop" AKA
uninstall entire subdataset hierarchies' (e.g. ``--what all``), or drop
preferred content (``--what preferred-content``, `#3122
<https://github.com/datalad/datalad/issues/3122>`__).

§4 :command:`drop` prevents data loss by default (`#4750
<https://github.com/datalad/datalad/issues/4750>`__). Like :command:`get`, it
features a ``--reckless`` "mode-switch" to disable some or all potentially slow
safety mechanisms, i.e. 'key available in a sufficient number of other remotes',
'main or all branches pushed to remote(s)' (`#1142
<https://github.com/datalad/datalad/issues/1142>`__), 'only check availability
of keys associated with the worktree, but not other branches'. "Reckless
operation" can be automatic, when following a reckless :command:`get` (`#4744
<https://github.com/datalad/datalad/issues/4744>`__).
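
As a rough illustration of the first safety check named in §4 ('key available
in sufficient number of other remotes'), the following sketch models the
decision in plain Python. The function name, the ``required_copies``
parameter, and the ``locations`` data layout are assumptions made for this
example; they are not DataLad or git-annex API.

.. code-block:: python

   def is_safe_to_drop(key, locations, here, required_copies=1, reckless=False):
       """Return True if dropping `key` from `here` would not lose data.

       `locations` maps a key to the set of repositories known to hold it.
       """
       if reckless:
           # Reckless mode skips the (potentially slow) availability check.
           return True
       # Only copies held somewhere other than the local repository count.
       others = locations.get(key, set()) - {here}
       return len(others) >= required_copies


   # Hypothetical availability information for two keys.
   locations = {
       "keyA": {"here", "origin"},
       "keyB": {"here"},
   }

With this data, ``keyA`` may be dropped locally because ``origin`` still holds
a copy, whereas ``keyB`` is only dropped when the check is disabled.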

§5 :command:`drop` properly manages annex lifetime information, e.g. by announcing
an annex as ``dead`` on removal of a repository (`#3887
<https://github.com/datalad/datalad/issues/3887>`__).

§6 Like :command:`get`, :command:`drop` supports parallelization (`#1953
<https://github.com/datalad/datalad/issues/1953>`__).

§7 `datalad drop` is not intended to be a comprehensive frontend to `git annex
drop` (e.g. limited support for `#1482
<https://github.com/datalad/datalad/issues/1482>`__ outside standard use cases
like `#2328 <https://github.com/datalad/datalad/issues/2328>`__).

.. note::
  It is understood that the current `uninstall` command is largely or
  completely made obsolete by this :command:`drop` concept.

§8 Given the development in `#5842
<https://github.com/datalad/datalad/issues/5842>`__ towards the complete
obsolescence of `remove`, it becomes necessary to import one of its proposed
features:

§9 :command:`drop` should be able to recognize a botched attempt to delete a
dataset with a plain ``rm -rf``, and act on it in a meaningful way, even if it
is just hinting at ``chmod`` + ``rm -rf``.
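
The hint mentioned in §9 is needed because git-annex normally write-protects
the object files and directories it stores, so a plain ``rm -rf`` can stop
partway through. A minimal Python sketch of the "chmod + rm -rf" recovery
follows. It assumes the directory is already known to be a leftover with
nothing worth preserving; ``force_remove`` is a hypothetical helper, not part
of DataLad.

.. code-block:: python

   import os
   import shutil
   import stat
   from pathlib import Path


   def force_remove(path):
       """Make everything under `path` writable, then delete the tree."""
       root = Path(path)
       # "chmod": a directory must be writable before its entries can be
       # unlinked, so permissions are fixed top-down ahead of the removal.
       os.chmod(root, root.stat().st_mode | stat.S_IRWXU)
       for dirpath, dirnames, filenames in os.walk(root):
           for name in dirnames + filenames:
               entry = Path(dirpath) / name
               if entry.is_symlink():
                   continue  # never chmod whatever a symlink points to
               extra = stat.S_IRWXU if entry.is_dir() else stat.S_IWUSR
               entry.chmod(entry.stat().st_mode | extra)
       # "rm -rf": with write permission restored, removal can complete.
       shutil.rmtree(root)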


Use cases
=========

The following use cases operate in the dataset hierarchy depicted below::

  super
  ├── dir
  │   ├── fileD1
  │   └── fileD2
  ├── fileS1
  ├── fileS2
  ├── subA
  │   ├── fileA
  │   ├── subsubC
  │   │   └── fileC
  │   └── subsubD
  └── subB
      └── fileB

Unless explicitly stated, all commands are assumed to be executed in the root of `super`.

- U1: ``datalad drop fileS1``

   Drops the file content of `fileS1` (as currently done by :command:`drop`).

- U2: ``datalad drop dir``

   Drop all file content in the directory (``fileD{1,2}``; as currently done by
   :command:`drop`).

- U3: ``datalad drop subB``

   Drop all file content from the entire `subB` (`fileB`)

- U4: ``datalad drop subB --what all``

   Same as above (default ``--what files``), because it is not operating in the
   context of a superdataset (no automatic upward lookups). Possibly hints at
   the next usage pattern.

- U5: ``datalad drop -d . subB --what all``

  Drop everything the superdataset has under this path, i.e. drop all content
  from the subdataset and drop the subdataset itself (AKA uninstall).

- U6: ``datalad drop subA --what all``

  Error: "``subA`` contains subdatasets, forgot ``--recursive``?"

- U7: ``datalad drop -d . subA -r --what all``

  Drop all content from the subdataset (``fileA``) and its subdatasets
  (``fileC``), uninstall the subdataset (``subA``) and its subdatasets
  (``subsubC``, ``subsubD``)

- U8: ``datalad drop subA -r --what all``

  Same as above, but keep ``subA`` installed

- U9: ``datalad drop subA -r``

   Drop all content from the subdataset and its subdatasets (``fileA``,
   ``fileC``)

- U10: ``datalad drop . -r --what all``

  Drops all file content and subdatasets, but leaves the superdataset
  repository behind

- U11: ``datalad drop -d . subB``

  Does nothing and hints at alternative usage, see
  https://github.com/datalad/datalad/issues/5832#issuecomment-889656335

- U12: ``cd .. && datalad drop super/dir``

  Like :command:`get`, this errors because the execution is not associated
  with a dataset. This avoids complexities when the given paths point to
  multiple (disjoint) datasets. It is understood that it could be done, but it
  is intentionally not done. `datalad -C super drop dir` or `datalad drop -d
  super super/dir` would work.
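
The subdataset-related use cases (U3 to U11) can be condensed into a small
decision model. The sketch below is only an illustration of the rules as
stated above, not the implementation: ``DATASETS``, ``descendants``, and
``plan_drop`` are hypothetical names, ``dataset_given`` stands in for passing
``-d .``, and only the ``files`` and ``all`` modes are covered.

.. code-block:: python

   # Hypothetical model of the hierarchy: dataset -> (files, subdatasets)
   DATASETS = {
       "super": (["dir/fileD1", "dir/fileD2", "fileS1", "fileS2"],
                 ["subA", "subB"]),
       "subA": (["fileA"], ["subsubC", "subsubD"]),
       "subB": (["fileB"], []),
       "subsubC": (["fileC"], []),
       "subsubD": ([], []),
   }


   def descendants(name):
       """Return all datasets below `name`, depth-first."""
       found = []
       for sub in DATASETS[name][1]:
           found.append(sub)
           found.extend(descendants(sub))
       return found


   def plan_drop(target, what="files", recursive=False, dataset_given=False):
       """Return (files_to_drop, datasets_to_uninstall) for dataset `target`."""
       if what not in ("files", "all"):
           raise ValueError(f"mode not covered by this sketch: {what}")
       subs = descendants(target)
       if what == "files" and dataset_given:
           # U11: nothing is done. The use cases only cover the
           # non-recursive call; treating -r alike is an assumption.
           return [], []
       if what == "all" and subs and not recursive:
           # U6: refuse to act on a hierarchy without --recursive
           raise ValueError(f"{target} contains subdatasets, forgot --recursive?")
       scope = [target] + (subs if recursive else [])
       files = [f for ds in scope for f in DATASETS[ds][0]]
       if what == "files":
           return files, []  # U3, U9
       uninstall = list(subs) if recursive else []
       if dataset_given:
           # U5, U7: only with a superdataset context is the target
           # itself uninstalled; otherwise it stays in place (U4, U8, U10).
           uninstall.insert(0, target)
       return files, uninstall

For example, ``plan_drop("subA", what="all", recursive=True)`` corresponds to
U8: it lists ``fileA`` and ``fileC`` for dropping and ``subsubC`` and
``subsubD`` for uninstallation, while ``subA`` itself stays installed.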