# ddpt examples
# =============
# Lines that start with "#", like this one, are comments.
# Lines that start with "$" are commands entered by the user. Some
# long commands are split over several lines with a trailing "\"
# on all but the last line of the command.
# Other non-blank lines are command output. Command output is shown
# in only some cases.
# dd "standard" regular file to regular file copy. 'stat -c %s <file>'
# is a way of getting a file's length, in bytes.
$ stat -c %s src
6915
$ dd if=src of=dst
13+1 records in
13+1 records out
6915 bytes (6.9 kB) copied, 0.000128857 s, 53.7 MB/s
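# The "13+1" above reads as 13 full records plus 1 partial record. A small
# sketch of that arithmetic follows; the variable names are only
# illustrative and it is written as a script fragment rather than as "$"
# prompts:

```shell
size=6915                      # length of src in bytes, from 'stat -c %s src'
bs=512                         # default block size of dd and ddpt
full=$((size / bs))            # number of full records
partial_bytes=$((size % bs))   # bytes held by the trailing partial record
partial=$(( partial_bytes > 0 ? 1 : 0 ))
echo "${full}+${partial} records, last record holds ${partial_bytes} bytes"
```

# This prints "13+1 records, last record holds 259 bytes".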
# Now let's look at the ddpt equivalent. We try the same options, but this
# time the output file (dst2) already exists and is relatively large.
$ stat -c %s dst2
524800
$ ddpt if=src of=dst2
Assume block size of 512 bytes for both input and output
13+1 records in
13+1 records out
time to transfer data: 0.000121 secs at 59.24 MB/sec
$ stat -c %s dst2
524800
# ddpt does not truncate dst2, it overwrites it. So if dst2's file length
# is longer than src's file length then the output file needs to be
# truncated:
$ ddpt if=src of=dst2 oflag=trunc
Assume block size of 512 bytes for both input and output
13+1 records in
13+1 records out
time to transfer data: 0.000243 secs at 29.50 MB/sec
$ stat -c %s dst2
6915
# So now src, dst and dst2 all have the same length and contain the same
# data. ddpt does not truncate the output file by default because that
# benefits write sparing and the resume capability.
# Sometimes it may be useful to preserve permissions and timestamps
# of the src file. After the copy (by ddpt or dd) call these:
$ chmod --reference=src dst2
$ touch --reference=src dst2
# and to preserve the src's ACL (Access Control List):
$ getfacl src | setfacl --set-file=- dst2
# Both dd and ddpt default to a block size of 512 bytes (bs=512) as they
# are designed to move disk data which up until recent times typically
# had a block size of 512 bytes. However for copying regular (normal)
# files bs=1 would be a clearer choice. dd is relatively inefficient when
# bs=1 but ddpt should be faster. So we can do this:
$ ddpt if=src bs=1 of=dst2 oflag=trunc
6915+0 records in
6915+0 records out
time to transfer data: 0.000134 secs at 51.60 MB/sec
# Notice that "records in" and "records out" are byte counts since "bs=1".
# If the src file is large then
$ dd if=src of=dst
# can be quite inefficient because dd reads BS (argument to "bs=") or
# IBS bytes at a time, then writes them to dst and continues until
# all of src has been read (or the COUNT is exhausted). So something
# like this is often suggested:
$ dd if=src of=dst bs=64k
# ddpt reads BPT*IBS bytes at a time from src before writing them to
# dst2. The default value of BPT varies depending on IBS; for IBS=512
# BPT defaults to 128. So for 512 byte blocks ddpt reads in chunks of
# 64 KB. Hence this invocation of ddpt remains quite efficient:
$ ddpt if=src ibs=512 of=dst2 oflag=trunc obs=1
# The advantage of keeping the IBS value low (and specifically equal to
# the logical block size for block devices) is that the SKIP and COUNT
# arguments are in units of IBS bytes:
$ ddpt if=/dev/sda skip=0x4215cc bs=512 of=dst2.img count=1234560
# There is no need to worry about short reads on a block device
# so giving BS sets both IBS and OBS to the same value. So now
# SKIP and COUNT are in 512 byte units (so SKIP is a Logical Block
# Address (LBA)). Notice that SKIP is given in hexadecimal (dd only
# accepts decimal arguments). Since the "bpt=" option is not given
# BPT defaults to 128 and each read into the copy buffer is
# 128*512 = 64 KB. The dd equivalent:
$ dd if=/dev/sda skip=4330956 bs=512 of=dst.img count=1234560
# will be slow since dd will read 512 bytes then write 512 bytes at a time.
# Changing to "bs=64k" looks like it will help but now SKIP and COUNT
# need to be divided by 128. However the SKIP value is not divisible by
# 128. An efficient solution is not pretty.
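# The arithmetic behind that remark can be checked in the shell. This is
# only a sketch; the variable names are illustrative and it is written as
# a script fragment rather than as "$" prompts:

```shell
skip=$((0x4215cc))              # SKIP from the ddpt example, now in decimal
count=1234560                   # COUNT from the same example
factor=$((64 * 1024 / 512))     # 512 byte blocks per 64 KB dd block
skip_rem=$((skip % factor))     # non-zero: SKIP is not a multiple of factor
count_rem=$((count % factor))   # zero: COUNT divides evenly
echo "skip=$skip factor=$factor skip_rem=$skip_rem count_rem=$count_rem"
```

# This prints "skip=4330956 factor=128 skip_rem=76 count_rem=0". The
# remainder of 76 blocks is why a dd copy with "bs=64k" cannot start
# exactly at that LBA.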
# When block sizes differ between the input and output devices, ddpt may
# zero pad the last copy segment so an integral number of OBS sized blocks
# are written. With this example the output (sent to stderr) is shown:
$ ddpt if=/dev/sda ibs=512 of=/dev/sdc obs=4096 count=9
9+0 records in
2+0 records out
time to transfer data: 0.000045 secs at 102.40 MB/sec
# The COUNT implies a copy of 9*512 = 4608 bytes. That spills into the
# second record (block) of /dev/sdc because its logical block size is 4096
# bytes. ddpt pads with zeros which is what the last 3584 bytes of the
# second block of /dev/sdc will contain after the copy.
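# The padding can be worked out with shell arithmetic. A minimal sketch,
# with illustrative variable names, written as a script fragment rather
# than as "$" prompts:

```shell
ibs=512
obs=4096
count=9
bytes_in=$((ibs * count))                      # bytes read from the input
blocks_out=$(( (bytes_in + obs - 1) / obs ))   # output blocks, rounded up
pad=$((blocks_out * obs - bytes_in))           # zero bytes added at the end
echo "bytes_in=$bytes_in blocks_out=$blocks_out pad=$pad"
```

# This prints "bytes_in=4608 blocks_out=2 pad=3584".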
# Sparse writes can be used to count the number of blocks that contain
# all zeros. sda1 is a half full 73 GB partition (on a SSD) and we check
# for zero blocks 40 GB from its start and for a length of 5 GB:
$ ddpt if=/dev/sda1 skip=80m bs=512 oflag=sparse count=10m
Output file not specified so no copy, just reading input
10485760+0 records in
0+0 records out
5672704 bypassed records out
time to read data: 20.583461 secs at 260.83 MB/sec
# Actually a copy buffer (64 KB) at a time is being checked for all
# zeros so that count understates the true value. By setting OBPC to
# 1, each block will be checked at a (slight) cost in execution time:
$ ddpt if=/dev/sda1 skip=80m bs=512 oflag=sparse count=10m bpt=128,1
Output file not specified so no copy, just reading input
10485760+0 records in
0+0 records out
6317217 bypassed records out
time to read data: 20.575803 secs at 260.92 MB/sec
# Zero filled blocks are listed as "bypassed records out" (even though
# nothing is actually written). When the granularity of the check for
# zeros is 64 KB then 5672704 zero blocks are found. When the granularity
# of the check is reduced to 512 bytes then 6317217 zero blocks are found.
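# Why a coarser check finds fewer zero blocks can be shown with standard
# tools on a tiny file. This is only a sketch of the idea and not how ddpt
# is implemented; the helper count_zero_blocks and the file sample.bin are
# made up for this illustration. It is written as a script fragment rather
# than as "$" prompts:

```shell
# Count 512 byte blocks that lie inside all-zero chunks of a given size.
# $1 = file, $2 = chunk size in bytes (a multiple of 512)
count_zero_blocks() {
    zb_file=$1
    zb_chunk=$2
    zb_size=$(wc -c < "$zb_file")
    zb_chunks=$(( (zb_size + zb_chunk - 1) / zb_chunk ))
    zb_i=0
    zb_found=0
    while [ "$zb_i" -lt "$zb_chunks" ]; do
        # number of bytes left in this chunk once NUL bytes are removed
        zb_nonzero=$(dd if="$zb_file" bs="$zb_chunk" skip="$zb_i" count=1 \
                     2>/dev/null | tr -d '\0' | wc -c)
        if [ "$zb_nonzero" -eq 0 ]; then
            zb_found=$(( zb_found + zb_chunk / 512 ))
        fi
        zb_i=$(( zb_i + 1 ))
    done
    echo "$zb_found"
}

tmpdir=$(mktemp -d)
f="$tmpdir/sample.bin"
# Four 512 byte blocks: zeros, non-zeros, zeros, zeros
{
    dd if=/dev/zero bs=512 count=1
    dd if=/dev/zero bs=512 count=1 | tr '\0' 'x'
    dd if=/dev/zero bs=512 count=2
} 2>/dev/null > "$f"
coarse=$(count_zero_blocks "$f" 1024)   # check two blocks at a time
fine=$(count_zero_blocks "$f" 512)      # check each block on its own
echo "coarse=$coarse fine=$fine"
rm -rf "$tmpdir"
```

# This prints "coarse=2 fine=3": the zero block that shares a chunk with
# a non-zero block is only found by the finer check.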
# As an example of write sparing, assume the regular file t exists
# and tt doesn't. Let's say the length of t is 524897 bytes:
$ ddpt if=t bs=512 of=tt oflag=sparing
1025+1 records in
1025+1 records out
0 bypassed records out
time to transfer data: 0.001061 secs at 495.11 MB/sec
# Now repeating the operation with oflag=sparing still set:
$ ddpt if=t bs=512 of=tt oflag=sparing
1025+1 records in
0+0 records out
1025+1 bypassed records out
time to transfer data: 0.001680 secs at 312.69 MB/sec
# Since t and tt should now contain the same data, write sparing
# has been able to bypass all writes to tt.
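# The idea behind write sparing can be sketched with standard tools:
# compare each block of the source with the same block of the destination
# and only write the blocks that differ. This is an illustration under
# that assumption and not ddpt's implementation; cksum stands in for a
# byte by byte compare and the temporary files are made up. It is written
# as a script fragment rather than as "$" prompts:

```shell
tmpdir=$(mktemp -d)
t="$tmpdir/t"
tt="$tmpdir/tt"
bs=512
# t: 1200 bytes of 'a' (2 full blocks plus a 176 byte partial block)
dd if=/dev/zero bs=1200 count=1 2>/dev/null | tr '\0' 'a' > "$t"
cp "$t" "$tt"
# Change one byte of tt inside block 1 (bytes 512 to 1023)
printf 'X' | dd of="$tt" bs=1 seek=600 conv=notrunc 2>/dev/null

size=$(wc -c < "$t")
blocks=$(( (size + bs - 1) / bs ))
i=0
bypassed=0
written=0
while [ "$i" -lt "$blocks" ]; do
    src_sum=$(dd if="$t" bs="$bs" skip="$i" count=1 2>/dev/null | cksum)
    dst_sum=$(dd if="$tt" bs="$bs" skip="$i" count=1 2>/dev/null | cksum)
    if [ "$src_sum" = "$dst_sum" ]; then
        bypassed=$((bypassed + 1))      # identical: the write is skipped
    else
        # differs: overwrite just this block of tt
        dd if="$t" of="$tt" bs="$bs" skip="$i" seek="$i" count=1 \
           conv=notrunc 2>/dev/null
        written=$((written + 1))
    fi
    i=$((i + 1))
done
cmp -s "$t" "$tt" && same=yes || same=no
echo "bypassed=$bypassed written=$written same=$same"
rm -rf "$tmpdir"
```

# This prints "bypassed=2 written=1 same=yes".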
#
# The "time to transfer" line in the output can be removed by
# the addition of the status=noxfer option:
$ ddpt if=t bs=512 of=tt oflag=sparing status=noxfer
1025+1 records in
0+0 records out
1025+1 bypassed records out
# Imaging disks and partitions to a regular file can take a long
# time. Sometimes the copy must be interrupted or there is
# some failure (say power) which stops the copy. In such cases
# oflag=resume may be helpful. In the case shown below a small
# partition is being copied to a regular file and it is
# interrupted with ^C from the keyboard:
$ ddpt if=/dev/sda2 of=sda2.bin bs=512
^CInterrupted by signal SIGINT, remaining block count=1409601
662784+0 records in
662784+0 records out
time to transfer data: 5.226487 secs at 64.93 MB/sec
To resume, invoke with same arguments plus oflag=resume
# Taking the advice from the last line:
$ ddpt if=/dev/sda2 of=sda2.bin bs=512 oflag=resume
resume adjusting skip=662784, seek=662784, and count=1409601
1409601+0 records in
1409601+0 records out
time to transfer data: 10.543506 secs at 68.45 MB/sec
# By checking the size of sda2.bin the resume logic has adjusted
# the skip, seek and count options to complete the rest of the
# copy. If the copy was finished then making the same invocation
# is harmless:
$ ddpt if=/dev/sda2 of=sda2.bin bs=512 oflag=resume
resume finds copy complete, exiting
# And if sda2.bin was empty or did not exist then a full copy
# would occur.
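# The numbers in the "resume adjusting" line can be reproduced by hand. The
# sketch below assumes that the blocks already copied equal the output
# file's length divided by the block size; ddpt's own resume logic may
# make further checks. It uses a sparse stand-in file created with
# truncate (GNU coreutils) in place of a real interrupted image, and is
# written as a script fragment rather than as "$" prompts:

```shell
bs=512
total_blocks=$((662784 + 1409601))    # partition size in blocks, from above
tmpdir=$(mktemp -d)
out="$tmpdir/sda2.bin"
truncate -s $((662784 * bs)) "$out"   # stand-in for the interrupted image
done_blocks=$(( $(stat -c %s "$out") / bs ))
remaining=$((total_blocks - done_blocks))
echo "skip=$done_blocks seek=$done_blocks count=$remaining"
rm -rf "$tmpdir"
```

# This prints "skip=662784 seek=662784 count=1409601".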
# ddpt supports a trim operation on the output file when it is
# accessed via the pt interface. Some SSDs support the trim
# operation (also known as unmap) with "deterministic read
# zero after trim". ddpt treats a trim like sparse, however
# instead of bypassing a segment of zeros a trim command is sent.
# In SCSI parlance trim is a WRITE SAME with the UNMAP bit set.
$ ddpt if=/dev/sdb1 bs=512 of=/dev/sg1 seek=73899000 oflag=trim
18314037+0 records in
16970165+0 records out
1343872 trimmed records out
time to transfer data: 174.057264 secs at 53.87 MB/sec
# To trim (zero) a large portion of a SSD use /dev/zero as the
# input file. This will zero from logical block 73899000 until
# the end of /dev/sg1 which is a SSD:
$ ddpt if=/dev/zero bs=512 of=/dev/sg1 seek=73899000 oflag=trim
Progress report:
remaining block count=38895160
43507456+0 records in
0+0 records out
43507328 trimmed records out
time to transfer data so far: 405.905942 secs at 54.88 MB/sec
continuing ...
82402488+0 records in
0+0 records out
82402488 trimmed records out
time to transfer data: 768.067647 secs at 54.93 MB/sec
# Notice the "Progress report:" line and the indented lines
# following it. What happened here was a SIGUSR1 signal was sent
# to the process running ddpt with a 'kill -s SIGUSR1 <pid>'
# command. The <pid> of a running ddpt can be found with the 'ps ax'
# command. The progress report finished with the "continuing ..."
# line. The un-indented lines at the end of the output were
# placed there at the completion of the ddpt copy.
# Self trim describes the technique of reading a block device
# (accessed via a pt interface) and checking for segments of
# zeros (64 KB of zeros in the first case). Segments full of
# zeros are "trimmed".
# In Linux /dev/sg* and /dev/bsg/* are pt devices.
$ ddpt if=/dev/sg0 bs=512 skip=130045952 iflag=self,trim
# The bpt option can be used to both increase the size of the
# copy segment and reduce granularity on trim check to 1 output
# block (i.e. 512 bytes at a time). This may result in a lot more
# small "trim" commands being issued.
$ ddpt if=/dev/sg0 bs=512 skip=130045952 iflag=self,trim bpt=1024,1
# The self flag does some option juggling and transforms the previous
# invocation into:
$ ddpt if=/dev/sg0 bs=512 skip=130045952 of=/dev/sg0 seek=130045952 \
oflag=trim,nowrite bpt=1024,1
# which is now a "copy" back to the same file. Nasty things happen if
# SKIP and SEEK are not the same. Best to stick with the simpler
# "iflag=self,trim" form and avoid the pitfalls of replicated arguments.
# If some command line arithmetic is required (e.g. with the skip, seek
# and/or count arguments) then the bash shell offers the "$(( ))" syntax.
# It is basically integer arithmetic, probably up to 64 bits precision,
# with hex numbers accepted (leading 0x) but without multiplier suffixes
# (e.g. $((1M + 1)) is not accepted). See the "ARITHMETIC EVALUATION"
# section in the bash man page.
$ ddpt if=/dev/sg1 skip=$((0xfff + 1)) count=1
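# A short sketch of what the shell hands to ddpt in such cases. The
# variable names are illustrative and this is written as a script fragment
# rather than as "$" prompts. Since a suffix like "1M" is not understood
# inside "$(( ))", the multiplier is spelled out:

```shell
lba=$((0xfff + 1))                  # hex input is accepted
one_m_plus_1=$((1024 * 1024 + 1))   # "1M + 1" written out by hand
echo "lba=$lba one_m_plus_1=$one_m_plus_1"
```

# This prints "lba=4096 one_m_plus_1=1048577".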
# xcopy is an abbreviation of the SCSI EXTENDED COPY command and facility.
# There are two variants: "LID1" (List Identifier length of 1 byte)
# and "LID4". xcopy is a performance win when disks (LUs) are remote or
# disks have better bandwidth to a storage switch (e.g. a SAS expander)
# than they do to server machines. Some xcopy implementations use the term
# "remote copy". This example copies the contents of /dev/sdc to /dev/sdd:
$ ddpt if=/dev/sdc iflag=xcopy bs=512 of=/dev/sdd
204800+0 records in
204800+0 records out
4 xcopy commands done
time to transfer data: 0.162026 secs at 647.17 MB/sec
# In this case the EXTENDED COPY command is sent to /dev/sdc . To send that
# command to the destination (i.e. /dev/sdd) instead then replace
# "iflag=xcopy" with "oflag=xcopy". If the above fails add "verbose=1" and
# ramp that up to 5 to see what is happening "in the weeds".
# Logically the resulting state of the destination should be the same
# whether or not the xcopy facility is used. The difference should be in the
# performance, with the xcopy version being faster, possibly much faster,
# with essentially no load on the machine issuing the xcopy.
# A subset of xcopy(LID4) that does disk to disk, token based copies and has
# the market name ODX is supported. A full disk to disk copy looks like this:
$ ddpt --odx if=/dev/sg3 of=/dev/sg4 bs=512
20971520+0 records in
20971520+0 records out
time to transfer data: 38.994866 secs at 275.35 MB/sec
# ODX also has a facility to zero out blocks:
$ ddpt rtype=zero if=/dev/null of=/dev/sg4 bs=512
20971520+0 records out
time to transfer data: 25.348843 secs at 423.59 MB/sec
# The ROD type of "zero" has a special, static ROD Token associated with it.
# Network copies can be done by exposing a ROD Token (512 bytes long) or a
# sequence of them. In the following example both mach_a and mach_b can "see"
# the same storage array. One of that array's targets contains two LUs:
# /dev/sg3 and /dev/sg4. In the first step, one or more RODs are populated
# with 1m blocks starting at LBA 0x1234 from /dev/sg3. Those ROD Tokens are
# passed back to mach_a which places them in a file called "my.tk". A
# network copy ("scp") is used to copy "my.tk" to mach_b. mach_b then uses
# those 1m blocks of data represented by the ROD Tokens in "my.tk" to write
# to /dev/sg4, starting at LBA 0 (since no seek= option is given):
mach_a $ ddpt if=/dev/sg3 bs=512 skip=0x1234,1m rtf=my.tk
1048576+0 records in
time to transfer data: 0.002217 secs at 242160.99 MB/sec
mach_a $ scp -p my.tk user@mach_b:/tmp
mach_b $ ddpt rtf=/tmp/my.tk bs=512 of=/dev/sg4 count=1m
1048576+0 records out
time to transfer data: 0.747828 secs at 717.91 MB/sec
# Job files are designed to lessen the tedium of repeatedly entering numerous
# command options to ddpt. Instead, some or all options can be placed in a
# job file which can then be named on the command line. For example, assume
# a file called "read_1m_zero.jf" contains the following 9 lines:
# Example job file for ddpt that reads from /dev/zero 1m blocks, each
# of 512 bytes. The count value of "1m" is 1024*1024=1048576
if=/dev/zero
bs=512
count=1m
of=/dev/null
# -vv
# This can be used like this (and can be executed by a non-root user since
# /dev/zero can be read by all and the copy is relatively harmless):
$ ddpt read_1m_zero.jf
1048576+0 records in
0+0 records out
time to read data: 0.055672 secs at 9643.46 MB/sec
# It may not be a good idea to put the if= and particularly of= options
# inside a job file as they are most likely to change. Some options are
# allowed to be changed while others are not, typically with a view to
# safety. For example:
$ ddpt read_1m_zero.jf if=/dev/null
Second IFILE argument??
# Some other options can be overridden, in which case the last one seen
# is used:
$ ddpt count=1000x1000 read_1m_zero.jf
1048576+0 records in
0+0 records out
time to read data: 0.052276 secs at 10269.93 MB/sec
$ ddpt read_1m_zero.jf count=1000x1000
1000000+0 records in
0+0 records out
time to read data: 0.051909 secs at 9863.41 MB/sec
# A job file can contain comments: anything from a "#" to the end of a
# line is considered a comment. Blank lines are permitted and ignored.
# Job files can invoke other job files, to a level of 4 deep. Some checks
# are made that a file assumed to contain text and ddpt options is not
# actually binary, but that is error prone.
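# A job file is plain text, so it can be written with a here-document. The
# sketch below also applies the comment and blank line rules described
# above with sed and grep, to show which options would be left. This is
# only an illustration of those rules and not ddpt's own parser; the
# trailing comment on the bs= line is added for the illustration. It is
# written as a script fragment rather than as "$" prompts:

```shell
tmpdir=$(mktemp -d)
jf="$tmpdir/read_1m_zero.jf"
cat > "$jf" <<'EOF'
# Example job file for ddpt that reads from /dev/zero 1m blocks, each
# of 512 bytes.

if=/dev/zero
bs=512        # trailing comment
count=1m
of=/dev/null
EOF
# Drop comments, trailing whitespace and blank lines; join what is left
options=$(sed -e 's/#.*//' -e 's/[[:space:]]*$//' "$jf" |
          grep -v '^$' | paste -s -d ' ' -)
echo "ddpt $options"
rm -rf "$tmpdir"
```

# This prints "ddpt if=/dev/zero bs=512 count=1m of=/dev/null".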
# Douglas Gilbert 20141226