NAME
Data::Dumper::Compact - Vertically compact width-limited data formatter
SYNOPSIS
Basic usage as a function:
use Data::Dumper::Compact 'ddc';
warn ddc($some_data_structure);
warn ddc($some_data_structure, \%options);
Slightly more clever usage as a function:
use Data::Dumper::Compact ddc => \%default_options;
warn ddc($some_data_structure);
warn ddc($some_data_structure, \%extra_options);
OO usage:
use Data::Dumper::Compact;
warn Data::Dumper::Compact->dump($data, \%options);
my $ddc = Data::Dumper::Compact->new(\%options);
warn $ddc->dump($data);
warn $ddc->dump($data, \%extra_options);
DESCRIPTION
Data::Dumper::Compact, henceforth referred to as DDC, was born because I
was annoyed at the screen space wasted on whitespace when paging through
both Data::Dumper and Data::Dump based logs. Data::Dump attempts to
format horizontally first, but if that fails, it immediately switches to
formatting fully vertically, rather than trying to e.g. format a six
element arrayref three per line.
So here are a few of the specifics (noting that all examples, unless
otherwise specified, are dumped with default options):
Arrays and Strings
Given arrays consisting of reasonably long strings, DDC does its best to
produce a sane representation within its "max_width":
[
  1, 2, [
    'longstringislonglongstringislonglongstringislong',
    'longstringislonglongstringislong', 'longstringislong',
    'longstringislonglongstringislonglongstringislong', 'longstringislong',
    'longstringislonglongstringislong', 'longstringislong',
    'longstringislonglongstringislong',
    'longstringislonglongstringislonglongstringislong',
    'longstringislonglongstringislong', 'longstringislonglongstringislong',
    'longstringislonglongstringislonglongstringislong', 'longstringislong',
    'longstringislong', 'longstringislonglongstringislonglongstringislong',
    'longstringislong', 'longstringislong', 'longstringislong',
    'longstringislonglongstringislong',
    'longstringislonglongstringislonglongstringislong', 'a', 'b', 'c',
    'longstringislonglongstringislonglongstringislonglongstringislong',
    'longstringislonglongstringislonglongstringislonglongstringislong',
    'longstringislonglongstringislonglongstringislonglongstringislong',
  ], 3,
]
Keys and Hashrefs
When faced with a "-foo" style value, it gets a "=>" even in an array,
and hash values are formatted on a single line where possible:
[
  'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', [
    'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb',
    'cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc',
  ],
  -blah => { baz => 'quux', foo => 'bar' },
]
The String Thing
Strings are single quoted when DDC is absolutely sure that's safe, and
double quoted otherwise:
[ { -foo => {
bar =>
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa',
baz => "bbbbbbbbbbbbbbbbbbbb\nbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
} } ]
Lonely hash key
When a single hash key can't be formatted in a oneline form within the
length, DDC will try spilling it to its own line:
{
-xxxxxxxxxxxxx => 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
}
If even that isn't enough, it formats it below and indented:
{ -xxxxxxxxxxxxx =>
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
}
Strings and the dot operator
If a string simply won't fit, DDC splits it and indents it using ".":
[ 'xyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyx'
.'yxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxy'
.'xyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyx'
.'yxyxyxyxyxyxyxyxyxyxyxyxyxyxyxy'
]
Unknown unknowns
Anything DDC doesn't understand is passed through its "dumper" option,
though since Data::Dumper (at the time of writing) forgets to pass
through its indentation level to B::Deparse, we slightly tweak that
behaviour on the way in for the default "dumper". But the end result
looks like:
{ foo => { bar => sub {
use warnings;
use strict 'refs';
my($x, $y) = @_;
return $x * $y;
} } }
Bless you
When encountering an object, if it's a blessed array or hashref, DDC
will attempt to format that too:
[ bless( {
  x => 3,
  y => [ 'foo', 'bar', 'baz', 'quux', 'fleem', 'blather', 'obrien' ],
  z => 'lololololololololololololololol',
}, "OhGods::Lol" ) ]
All together now
The full set of behaviours allows compact (and, we hope, readable)
versions of complex data structures. To provide one of the examples that
inspired this module, here is the formatting under standard options for
a moderately complex SQL::Abstract update statement:
{
_ => [
'tree_table', -join => {
as => 'tree',
on => { 'tree.id' => 'tree_with_path.id' },
to => { -select => {
from => 'tree_with_path',
select => '*',
with_recursive => [
[ 'tree_with_path', 'id', 'parent_id', 'path' ], { -select => {
_ => [
'id', 'parent_id', { -as =>
[
{ -cast => { -as => [ 'id', 'char', 255 ] } },
'path',
]
},
],
from => 'tree_table',
union_all => { -select => {
_ => [
't.id', 't.parent_id', { -as => [
{ -concat => [ 'r.path', \"'/'", 't.id' ] },
'path',
] },
],
from => [
'tree_table', -as => 't', -join => {
as => 'r',
on => { 't.parent_id' => 'r.id' },
to => 'tree_with_path',
},
],
} },
where => { parent_id => undef },
} },
],
} },
},
],
set => { path => { -ident => [ 'tree', 'path' ] } },
}
And the version (generated by setting "max_width" to 40) that runs out
of space and thereby forces the "spill vertically" logic to kick in
while still attempting to be at least somewhat compact:
{
  _ => [
    'tree_table',
    '-join',
    {
      as => 'tree',
      on => {
        'tree.id' => 'tree_with_path.id',
      },
      to => {
        -select => {
          from => 'tree_with_path',
          select => '*',
          with_recursive => [
            [
              'tree_with_path',
              'id',
              'parent_id',
              'path',
            ],
            {
              -select => {
                _ => [
                  'id',
                  'parent_id',
                  {
                    -as => [
                      {
                        -cast => {
                          -as => [
                            'id',
                            'char',
                            255,
                          ],
                        },
                      },
                      'path',
                    ],
                  },
                ],
                from => 'tree_table',
                union_all => {
                  -select => {
                    _ => [
                      't.id',
                      't.parent_id',
                      {
                        -as => [
                          {
                            -concat => [
                              'r.path',
                              \"'/'",
                              't.id',
                            ],
                          },
                          'path',
                        ],
                      },
                    ],
                    from => [
                      'tree_table',
                      '-as',
                      't',
                      '-join',
                      {
                        as => 'r',
                        on => {
                          't.parent_id' => 'r.id',
                        },
                        to => 'tree_with_path',
                      },
                    ],
                  },
                },
                where => {
                  parent_id => undef,
                },
              },
            },
          ],
        },
      },
    },
  ],
  set => {
    path => {
      -ident => [
        'tree',
        'path',
      ],
    },
  },
}
Summary
Hopefully it's clear what the goal is, and what we've done to achieve
it.
While the system is already somewhat configurable, further options are
almost certainly implementable, although if you really want such an
option then we expect you to turn up with documentation and test cases
for it so we just have to write the code.
OPTIONS
max_width
Represents the width that DDC will attempt to keep as the maximum (if
something overflows it in spite of our best efforts, DDC will fall back
to a more vertically sprawling format to at least overflow as little as
feasible).
Default: 78
indent_by
The string to indent by. To set e.g. 4 space indent, pass "' 'x4".
Default: '  ' (two spaces).
indent_width
How many characters one indent should be considered to be. Generally you
only need to manually set this if your "indent_by" is "\t".
Default: "length($self->indent_by)"
transforms
Set of transforms to apply on every "dump" operation. See "transform"
for more information.
Default: "[]"
dumper
The dumper function to be used for dumping things DDC doesn't
understand, such as coderefs, regexprefs, etc.
Defaults to the same options as Data::Dumper::Concise (which is itself
only a Data::Dumper configuration, although it comes with Devel::Dwarn,
which is rather more interesting). On top of that we add a little bit of
extra cleverness to make B::Deparse use the correct indentation, since
for some reason Data::Dumper doesn't (at the time of writing) do that.
If you supply it yourself, it needs to be a single argument coderef -
you could for example use "\&Data::Dumper::Dumper" though that would
almost certainly be pointless.
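As a rough illustration of how these options combine, here is a minimal
sketch that indents with tabs, tells DDC to count each tab as four
columns, and supplies a custom "dumper". The fallback coderef and its
Data::Dumper settings are our own choices for the example, not DDC's
defaults:

```perl
use strict;
use warnings;
use Data::Dumper ();
use Data::Dumper::Compact;

# A single-argument fallback dumper, as the "dumper" option requires.
# The Terse/Indent settings are assumptions made for this example.
my $fallback = sub {
  local $Data::Dumper::Terse  = 1;
  local $Data::Dumper::Indent = 1;
  my $out = Data::Dumper::Dumper($_[0]);
  chomp $out;
  return $out;
};

my $ddc = Data::Dumper::Compact->new({
  max_width    => 40,
  indent_by    => "\t",
  indent_width => 4,          # count each tab as four columns
  dumper       => $fallback,
});

# The regexp is something DDC doesn't understand, so it goes to $fallback.
my $out = $ddc->dump({
  name    => 'example',
  pattern => qr/^ex/,
  list    => [ 1 .. 12 ],
});
print $out;
```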
EXPORTS
ddc
use Data::Dumper::Compact 'ddc';
use Data::Dumper::Compact 'ddc' => \%options;
If the first argument to "use"/"import()" is 'ddc', a subroutine "ddc()"
is installed in the calling package which behaves like calling "dump".
If the second argument is a hashref, it becomes the options passed to
"new".
This feature is effectively sugar over "dump_cb", in that:
Data::Dumper::Compact->import(ddc => \%options)
is equivalent to:
*ddc = Data::Dumper::Compact->new(\%options)->dump_cb;
METHODS
new
my $ddc = Data::Dumper::Compact->new;
my $ddc = Data::Dumper::Compact->new(%options);
my $ddc = Data::Dumper::Compact->new(\%options);
Constructor. Takes a hash or hashref of "OPTIONS".
dump
my $formatted = Data::Dumper::Compact->dump($data, \%options?);
my $formatted = $ddc->dump($data, \%merge_options?);
This is the method you're going to want to call most of the time, and
ties together the rest of the functionality into a single
data-structure-to-string bundle. With just a data argument, it's
equivalent to:
$ddc->format($ddc->transform($ddc->transforms, $ddc->expand($data)));
In class method form, options provided are passed to "new"; in instance
method form, options if provided are merged into $ddc just for this
invocation.
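To make the difference between the two calling forms concrete, here is a
small sketch. The data and the width of 20 are arbitrary values picked
for the example:

```perl
use strict;
use warnings;
use Data::Dumper::Compact;

my $data = [ 'aaaaaaaaaa', 'bbbbbbbbbb', 'cccccccccc' ];

# Instance with default options (max_width of 78).
my $ddc  = Data::Dumper::Compact->new;
my $wide = $ddc->dump($data);

# Instance method form: the options are merged in for this call only.
my $narrow = $ddc->dump($data, { max_width => 20 });

# Class method form: the options are handed to new() for us.
my $also_narrow = Data::Dumper::Compact->dump($data, { max_width => 20 });

print "--- default width ---\n", $wide, "\n";
print "--- max_width 20 ---\n", $narrow, "\n";
```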
dump_cb
my $cb = $ddc->dump_cb;
Returns a subroutine reference that's a curried call to "dump":
$cb->($data, \%extra_options); # equivalent to $ddc->dump(...)
Mostly useful if you want to create a custom "ddc()"-like thing:
use Data::Dumper::Compact;
BEGIN { *Dumper = Data::Dumper::Compact->new->dump_cb }
expand
my $exp = $ddc->expand($data);
Expands a data structure to DDC tagged data. The result is, recursively,
[ $type, $payload ]
where if $type is one of "string", "key", or "thing", the payload is a
simple string ("thing" meaning something unknown and therefore delegated
to "dumper"). If the type is an array:
[ array => \@values ]
and if the type is a hash:
[ hash => [ \@keys, \%value_map ] ]
where the keys provide an order for formatting, and the value map is a
hashref of keys to expanded values.
A plain string becomes a "string", unless it fits the "-foo" style
pattern that autoquotes, in which case it becomes a "key".
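Since the expanded form is just nested "[ $type, $payload ]" pairs, it
is easy to walk by hand. The following sketch counts how often each type
tag occurs; "count_tags" is a hypothetical helper, and the input is
hand-written in the shape described above rather than produced by a real
"expand" call:

```perl
use strict;
use warnings;

# Recursively count the type tags in DDC-style expanded data.
# count_tags is a hypothetical helper, not part of DDC.
sub count_tags {
  my ($exp, $counts) = @_;
  $counts ||= {};
  my ($type, $payload) = @$exp;
  $counts->{$type}++;
  if ($type eq 'array') {
    # [ array => \@values ]
    count_tags($_, $counts) for @$payload;
  } elsif ($type eq 'hash') {
    # [ hash => [ \@keys, \%value_map ] ]
    my ($keys, $map) = @$payload;
    count_tags($map->{$_}, $counts) for @$keys;
  }
  return $counts;
}

# Hand-written expanded data following the documented shapes.
my $exp = [ hash => [
  [ 'study_results' ],
  { study_results => [ array => [
    [ string => 'first' ],
    [ string => 'second' ],
  ] ] },
] ];

my $counts = count_tags($exp);
print join(', ', map "$_=$counts->{$_}", sort keys %$counts), "\n";
# prints: array=1, hash=1, string=2
```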
add_transform
$ddc->add_transform(sub { ... });
$ddc->add_transform({ hash => sub { ... }, _ => sub { ... } });
Appends a transform to "$ddc->transforms", see "transform" for
behaviour.
Returns $ddc to enable chaining.
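Because "add_transform" returns the object, it can be chained straight
off the constructor. A minimal sketch, where the truncation rule (cut
strings longer than ten characters) is purely an invented example:

```perl
use strict;
use warnings;
use Data::Dumper::Compact;

# Chain add_transform off new(); the transform only sees "string" nodes.
my $ddc = Data::Dumper::Compact->new->add_transform({ string => sub {
  my ($self, $type, $payload, $path) = @_;
  return unless length($payload) > 10;     # returning nothing: leave as-is
  return [ string => substr($payload, 0, 10).'...' ];
} });

my $out = $ddc->dump([ 'short', 'a considerably longer string' ]);
print $out;
```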
transform
my $tf_exp = $ddc->transform($tfspec, $exp);
Takes a transform specification and expanded tagged data and returns the
transformed expanded expression. A transform spec is an arrayref
containing transforms, where each transform is applied in order, so the
last transform added via "add_transform" will be the last one to
transform the data (each transform will consist of a datastructure
representing which parts of the $exp tree it should be called for, plus
subroutines representing the relevant transforms).
Transform subroutines are called as a method on the $ddc with the
arguments of "$type, $payload, $path" where $path is an arrayref of the
keys/values of the containing hashes and arrays, aggregated as DDC
descends through the $exp tree.
Each transform is expected to return either nothing, to indicate it
doesn't wish to modify the result, or a replacement expanded data
structure. The simplest form of transform is a subref, which gets called
for everything.
So, say we want to add ' IN MICE' to every string that's part of an
array under a hash key called study_results. Given the data:
my $data = { study_results => [
  'Sense Of Touch Is Formed In the Brain Before Birth',
  "We can't currently cure MS but a single cell could change that",
] };
the call:
my $tf_exp = $ddc->transform([ sub {
  my ($self, $type, $payload, $path) = @_;
  return unless $type eq 'string' and ($path->[-2]||'') eq 'study_results';
  return [ $type, $payload.' IN MICE' ];
} ], $ddc->expand($data));
will return:
[ hash => [
  [ 'study_results' ],
  { study_results => [ array => [
    [ string => 'Sense Of Touch Is Formed In the Brain Before Birth IN MICE' ],
    [ string => "We can't currently cure MS but a single cell could change that IN MICE", ],
  ] ] }
] ]
If a hashref is found, then the values are expected to be transforms,
and DDC will use "$hashref->{$type}||$hashref->{_}" as the transform, or
skip if neither is present. So the previous example could be written as:
$ddc->transform([ { string => sub {
  my ($self, $type, $payload, $path) = @_;
  return unless ($path->[-2]||'') eq 'study_results';
  return [ $type, $payload.' IN MICE' ];
} } ], $ddc->expand($data));
If the value of the spec entry itself *or* the relevant hash value is an
arrayref, it is assumed to contain a spec for trailing path entries,
with the last element being the transform subroutine. A path entry match
can be an exact scalar (tested via "eq" since it works fine for both
strings and integer array indices), regexp, "undef" to indicate "any
value is fine here", or a subroutine which will be called with the path
entry as both $_[0] and $_. So the example we've been using could also
be written as:
$ddc->transform([ { string => [
  'study_results', undef,
  sub { [ string => $_[2].' IN MICE' ] }
] } ], $ddc->expand($data));
or
$ddc->transform([ { string => [
  qr/^study_results$/, sub { 1 },
  sub { [ string => $_[2].' IN MICE' ] }
] } ], $ddc->expand($data));
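The path entry matching rules can be sketched in isolation. The code
below is not DDC's implementation; "path_entry_matches" and
"trailing_path_matches" are hypothetical helpers that merely follow the
rules as described above (exact scalar via "eq", regexp, "undef" as a
wildcard, or a subroutine called with the entry as both $_[0] and $_):

```perl
use strict;
use warnings;

# Test one path entry against one matcher from the spec.
sub path_entry_matches {
  my ($matcher, $entry) = @_;
  return 1 unless defined $matcher;              # undef: any value is fine
  if (ref($matcher) eq 'Regexp') {
    return $entry =~ $matcher ? 1 : 0;
  }
  if (ref($matcher) eq 'CODE') {
    local $_ = $entry;                           # entry visible as $_ too
    return $matcher->($entry) ? 1 : 0;
  }
  return $matcher eq $entry ? 1 : 0;             # eq suits keys and indices
}

# Test a list of matchers against the trailing entries of a path.
sub trailing_path_matches {
  my ($matchers, $path) = @_;
  return 0 if @$matchers > @$path;
  my @tail = @$path[ @$path - @$matchers .. $#$path ];
  for my $i (0 .. $#$matchers) {
    return 0 unless path_entry_matches($matchers->[$i], $tail[$i]);
  }
  return 1;
}

# Path of the first string in the study_results example: key, then index.
my $path = [ 'study_results', 0 ];
print trailing_path_matches([ 'study_results', undef ], $path), "\n";
print trailing_path_matches([ qr/^study_results$/, sub { 1 } ], $path), "\n";
print trailing_path_matches([ 'other_key', undef ], $path), "\n";
```

The three calls print 1, 1 and 0 respectively.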
Note that while the $tfspec is not passed to transform subroutines, for
the duration of the "transform" call the "transforms" option is
localised to the provided $tfspec, so
sub {
  my ($self, $type, $payload, $path) = @_;
  my $tfspec = $self->transforms;
  ...
}
will return the top level $tfspec passed to the transform call.
Thanks to <http://twitter.com/justsaysinmice> for the inspiration.
format
my $formatted = $ddc->format($exp);
Takes expanded tagged data and renders it to a formatted string,
suitable for printing, warning, etc.
Accepts the following type tags: "array", "list", "hash", "key",
"string", "thing". Arrays and hashes are formatted as compactly as
possible within the constraint of "max_width", but if overflow occurs
then DDC falls back to spilling everything vertically, so newlines are
used for most spacing and therefore it doesn't exceed the max width any
more than strictly necessary.
Strings are formatted as single quote if obvious, and double quote if
not.
Keys are treated as strings when present as hash values, but when they
are elements of an array, they are formatted as "the_key =>" where
possible.
Lists are formatted as single line "qw()" expressions if possible, or "(
... )" if not.
Arrays and hashes are formatted in the manner to which one would hope
readers are accustomed, except more compact.
ALGORITHM
The following is a description of the current algorithm of DDC. We
reserve the right to change it for the better.
If you haven't already read the overview examples in "DESCRIPTION", do
that first.
Vertical mode means DDC has given up on fitting within the desired width
and is now just trying to not use *too* much vertical space.
Oneline mode is DDC testing to see if a single line rendering of
something will fit within the available space. Things will often be
rendered more than once since DDC is optimising for compact readable
output rather than raw straight line performance.
Top level formatting
If something is formatted and the remaining width is zero or negative,
DDC accepts defeat on "max_width" and bails out to a fully vertical
approach so it overflows the desired width no more than necessary.
Array formatting
If already in vertical mode, formats one array element per line,
appended with ",":
[
  1,
  2,
  3,
]
If in possible oneline mode, formats all but the last element according
to the "Array element" rules, the last element according to normal
formatting, and joins them with ' ' in the hopes this is narrow enough.
Return this if oneline is forced or it fits:
[ 1, 2, 3 ]
If there's only a single internal member, tries to use the "Single entry
formatting" strategy to cuddle it.
[ [
  <something inside>
] ]
Otherwise, attempts to bundle things as best possible: Each element is
formatted according to the "Array element" rules, and multiple results
are concatenated together onto a single line where that still remains
within the available width.
[
  'foo', 'bar', 'baz',
  'red', 'white', 'blue',
]
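The bundling step can be approximated by a simple greedy line packer.
This is a simplified sketch rather than DDC's actual code:
"bundle_elements" is a hypothetical helper, it only handles elements
that are already formatted strings, and it ignores "key" elements and
nested structures:

```perl
use strict;
use warnings;

# Greedily pack "$element," strings onto lines no wider than $width.
# An element that is too wide on its own still gets a line to itself.
sub bundle_elements {
  my ($width, @elements) = @_;
  my @lines;
  my $current = '';
  for my $el (map "$_,", @elements) {
    my $candidate = length($current) ? "$current $el" : $el;
    if (length($current) && length($candidate) > $width) {
      push @lines, $current;     # line is full, start a new one
      $current = $el;
    } else {
      $current = $candidate;
    }
  }
  push @lines, $current if length $current;
  return @lines;
}

my @elements = map "'$_'", qw(foo bar baz red white blue);
print "[\n", (map "  $_\n", bundle_elements(24, @elements)), "]\n";
# prints:
# [
#   'foo', 'bar', 'baz',
#   'red', 'white', 'blue',
# ]
```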
Array element
Elements are normally formatted as "$formatted.','" except if an element
is of type "key", in which case it becomes "$key =>".
"whatever the smeg",
smeg_off =>
List formatting
The type "list" is synthetic and only introduced by transforms.
It is formatted identically to an arrayref except with "( )" instead of
"[ ]", with the exception that if it consists of only plain strings and
will fit onto a single line, it formats as a "qw(x y z)" style list.
qw(foo bar baz)

(
  'foo',
  'bar',
  'baz',
)
Single entry formatting
Where possible, a single entry will be cuddled such that the opening
delimiters are both on the first line, and the closing delimiters both
on the final line, to reduce the vertical space consumption of nested
single entry array and/or hashrefs.
to => { -select => {
  ...
} }

[ 'SRV:8FB66F32' ], [ [
  '/opt/voice-srvc-native/bin/async-srvc-att-gateway-poller', 33,
  'NERV::Voice::SRV::Native::AsyncSRVATTGatewayPoller::main',
] ],
Hash formatting
If already in vertical mode, key/value pairs are formatted separated by
newlines, with no attention paid to key length.
{
  foo => ...,
  bar => ...,
}
If potentially in oneline mode, key/value pairs are formatted separated
by ', ' and the result is returned if oneline is forced or if the
remaining width allows the oneline rendering.
{ foo => ..., bar => ... }
Otherwise, all key/value pairs are formatted as "key => value" where
possible, but if the first line of the value is too long, the value is
moved to the next line and indented.
key => 'shortvalue'
key =>
  'overlylongvalue'
If there's only a single such key/value pair, tries to use the "Single
entry formatting" strategy to cuddle it.
{ zathrus => {
  listened_to => 0,
} }
Otherwise returns key/value pairs indented and separated by newlines
{
  foo => ...,
  bar => ...,
}
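The "key => value where possible" rule above can be sketched for the
simple case of a value that is a single line. "format_pair" is a
hypothetical helper, not part of DDC, and the width of 20 is just an
example value:

```perl
use strict;
use warnings;

# Keep "key => value" on one line if it fits, otherwise move the value
# to the next line and indent it. Assumes $value is a single line.
sub format_pair {
  my ($key, $value, $width, $indent) = @_;
  my $oneline = "$key => $value";
  return $oneline if length($oneline) <= $width;
  return "$key =>\n$indent$value";
}

print format_pair('key', "'shortvalue'",      20, '  '), "\n";
print format_pair('key', "'overlylongvalue'", 20, '  '), "\n";
# prints:
# key => 'shortvalue'
# key =>
#   'overlylongvalue'
```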
String formatting
Uses single quotes if sure that's safe, double quotes otherwise.
'foo bar baz quux'
"could have been '' but nicer to not screw up\n the indents with a newline"
Attempts to format a string within the available width, using multiple
lines and the "." concatenation operator if necessary.
'this would be an'
.'annoyingly long'
.'string'
The target width is set to 20 in vertical mode to try and not be too
ugly.
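A rough sketch of the splitting idea, under the simplifying assumption
that the string contains nothing that needs escaping inside single
quotes. "split_for_width" is a hypothetical helper and makes no attempt
to reproduce DDC's exact choice of break points:

```perl
use strict;
use warnings;

# Split $str into single quoted chunks joined by "." so that no line
# is wider than $width. Assumes $str has no quotes or backslashes.
sub split_for_width {
  my ($str, $width) = @_;
  return "''" unless length $str;
  my $chunk = $width - 3;        # room for two quotes and the leading "."
  die "width too small\n" if $chunk < 1;
  my @parts = $str =~ /(.{1,$chunk})/gs;
  my ($first, @rest) = map "'$_'", @parts;
  return join "\n", $first, map ".$_", @rest;
}

print split_for_width('abcdefghij', 7), "\n";
# prints:
# 'abcd'
# .'efgh'
# .'ij'
```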
Object formatting
Objects are tested to see if their underlying reference is an array or
hash. If so, it's formatted with 'bless( ' prepended and ', $class)'
appended. This so far appears to interact nicely with everything else.
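The underlying reference test can be done with Scalar::Util. This is a
sketch of the check only, not DDC's own code; "blessed_container" is a
hypothetical helper:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Return ($class, $type) for objects built on an array or hash,
# and an empty list for anything else.
sub blessed_container {
  my ($thing) = @_;
  my $class = blessed($thing);
  return unless defined $class;
  my $type = reftype($thing);
  return unless $type eq 'ARRAY' or $type eq 'HASH';
  return ($class, $type);
}

my @info = blessed_container(bless({ x => 3 }, 'OhGods::Lol'));
print "@info\n";
# prints: OhGods::Lol HASH
```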
AUTHOR
mst - Matt S Trout (cpan:MSTROUT) <mst@shadowcat.co.uk>
CONTRIBUTORS
None so far.
COPYRIGHT
Copyright (c) 2019 the Data::Dumper::Compact "AUTHOR" and "CONTRIBUTORS"
as listed above.
LICENSE
This library is free software and may be distributed under the same
terms as perl itself. See <https://dev.perl.org/licenses/>.