File: yaml_tools_api.md

package info (click to toggle)
abinit 9.10.4-3
  • links: PTS, VCS
  • area: main
  • in suites: sid
  • size: 518,712 kB
  • sloc: xml: 877,568; f90: 577,240; python: 80,760; perl: 7,019; ansic: 4,585; sh: 1,925; javascript: 601; fortran: 557; cpp: 454; objc: 323; makefile: 77; csh: 42; pascal: 31
file content (715 lines) | stat: -rw-r--r-- 29,923 bytes parent folder | download | duplicates (2)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
---
description: Tools available for YAML based tests
authors: TC
---

# YAML-based test suite: Fortran and Python API

This page gives an overview of the internal implementation in order 
to facilitate future developments and extensions.
In the first part we discuss how to write new Yaml documents using the low-level Fortran API.
In the second part, we describe the python implementation and the steps required to 
register a new Yaml document in the framework.
Finally, we provide recommendations for the name of the variables used in Yaml documents in 
order to improve readability and facilitate the integration between Yaml documents and the python code.

## Fortran API

The low-level Fortran API consists of the following modules:

**m_pair_list.F90**:

: a Fortran module for constructing dictionaries mapping strings to values.
  Internally, the dictionary is implemented in C in terms of a list of key-value pairs.
  The pair list can dynamically hold either real, integer or string values.
  It implements the basic getter and setter methods and an iterator system to allow looping
  over the key-value pairs easily in Fortran.
  <!--
  Lists have been chosen over hash tables because, in this particular context,
  performance is not critical and the O(n) cost required to locate a key is negligible
  given that there will never be a lot of elements in the table.
  -->

**m_stream_string.F90**:

:  a Fortran module providing tools to manipulate variable length
  strings with a file like interface (**stream** object)

**m_yaml_out.F90:**

:  a Fortran module to easily output YAML documents from Fortran.
  This module represents the main entry point for client code and may be used
  to implement methods to output Fortran objects.
  It provides basic routines to produce YAML documents based on a mapping at the root
  and containing well formatted (as human readable as possible and valid YAML) 1D and 2D arrays of numbers
  (integer or real), key-value mapping (using pair lists), list of key-value
  mapping and scalar fields.
  Routines support tags for specifying special data structures.

**m_neat.F90** (NEw Abinit Test system):

: a higher level module providing Fortran procedures to create specific
  documents associated to important physical properties. Currently routines
  for total energy components and ground state results are implemented.

<!--
A more detailed example is provided [below](#creating-a-new-document-in-fortran).
The new infrastructure has been designed with extensibility and ease-of-use in mind.
-->
From the perspective of the developer, adding a new YAML document requires three steps:

1. Implement the output of the YAML document in Fortran using the pre-existent API.
   Associate a **unique tag** to the new document. The tag identify the document and
   its structure.

2. register the tag and the associated class in the python code. 
   Further details about this procedure can be found below.

3. Create a YAML configuration file defining the quantities that should be compared
   with the corresponding tolerances.

An example will help clarify. Let's assume we want to implement the output of
the `!Etot` document with the different components of the total free energy.
<!--
The workflow is split in two parts. The idea is to separate the computation
from the composition of the document, and separate the composition of the
document from the actual rendering.
-->
The Fortran code that builds the list of (key, values) entries will look like:
**MG: The Fortran examples should be refactored and merged with the text in Creating a new document in Fortran**

```fortran
! Import modules
use m_neat, only: neat_etot
use m_pair_list, only: pair_list

real(dp) :: etot
type(pair_list) :: e_components  ! List of (key, value) entries

call e_components%set("comment", s="Total energie and its components")
call e_components%set("Total Energy", r=etot)

! footer of the routine, once all data have been stored
call neat_etot(e_components, unit)
```

In *47_neat/m\_neat.F90* we will implement *neat\_etot*:

```fortran
subroutine neat_etot(components, unit)

type(pair_list), intent(in) :: components
integer, intent(in) :: unit

call yaml_single_dict("Etot", "", components, 35, 500, width=20, &
                      file=unit, real_fmt='(ES20.13)')
 !
 ! 35 -> max size of the labels, needed to extract from the pairlist,
 ! 500 -> max size of the strings, needed to extract the comment
 !
 ! width -> width of the field name side, permit a nice alignment of the values

end subroutine neat_etot
```

### Creating a new document in Fortran

Developers are invited to browse the sources of `m_neat` and `m_yaml_out` to
have a comprehensive overview of the available tools. Indeed the routines are
documented and commented and creating a hand-written reference is likely to go
out of sync quicker than on-site documentation. Here we show a little example to
give the feeling of the process.

The simplest way to create a YAML document have been introduced above with the
use of `yaml_single_dict`. However this method is limited to scalars only. It is
possible to put 1D and 2D arrays, dictionaries and even tabular data.
The best way to create a new document is to have a routine `neat_my_new_document` 
that will take all required data in argument and will do all the formatting work at once.
This routine will declare a `stream_string` object to build the YAML document
inside, open the document with `yaml_open_doc`, fill it, close it with
`yaml_close_doc` and finally print it with `wrtout_stream`.
Here come a basic skeleton of `neat` routine:

```fortran
subroutine neat_my_new_document(data_1, data_2,... , iout)
  ! declare your pieces of data... (arrays, numbers, pair_list...)
  ...
  integer,intent(in) :: iout  ! this is the output file descriptor
!Local variables-------------------------------
  type(stream_string) :: stream

  ! open the document
  call yaml_open_doc('my label', 'some comments on the document', stream=stream)

  ! fill the document
  ...

  ! close and output the document
  call yaml_close_doc(stream=stream)
  call wrtout_stream(stream, iout)

end subroutine neat_my_new_document
```

Suppose we want a 2D matrix of real number with the tag `!NiceMatrix` in our document
for the field name 'that matrix' we will add the following to the middle section:

```fortran
call yaml_add_real2d('that matrix', dimension_1, dimension_2, mat_data, &
                     tag='NiceMatrix', stream=stream)
```

Other `m_yaml_out` routines provide a similar interface: first the label, then the
data and its structural metadata, then a bunch of optional arguments for
formatting, adding tags, tweaking spacing etc.

## Python API

<!--
Roughly speaking, one can group the python modules into three different parts:

- the *fldiff* algorithm
- the interface with the PyYAML library and the parsing of the output data
- the parsing of the YAML configuration file and the testing logic
-->

- The fldiff algorithm has been slightly modified to extract YAML
  documents from the output file and store them for later treatment.

- Newt tools have been written to facilitate the creation of new python classes
  corresponding to YAML tags for implementing new logic operating on the extracted data.
  These tools are very easy to use via class decorators.

- These tools have been used to create basic classes for futures tags, among
  other classes that directly convert YAML list of numbers into NumPy arrays.
  These classes may be used as examples for the creation of further tags.

- A parser for test configuration have been added and all facilities to do
  tests are in place.

- A command line tool `testtools.py` to allow doing different manual actions
  (see Test CLI)


### fldiff algorithm

The machinery used to extract data from the output file is defined in *~abinit/tests/pymods/data_extractor.py*. 
This module takes cares of the identification of the YAML documents inside the output file including 
the treatment of the meta-characters found in the first column following the original fldiff.pl implementation.

*~abinit/tests/pymods/fldiff.py* is the main driver called by *testsuite.py*.
This part implements the legacy fldiff algorithm and uses the objects defined in *yaml_tools*
to perform the YAML-based tests and produce the final report.

### Interface with the PyYAML library

The interface with the PyYAML library is implemented in *~abinit/tests/pymods/yaml_tools/*.
The `__init__.py` module exposes the public API that consists in:

- the `yaml_parse` function that parses documents from the output files
- the `Document` class that gives an interface to an extracted document with its metadata.

<!--
- flags `is_available` (`True` if both PyYAML and Numpy are available) and
  `has_pandas` (`True` if Pandas is available)
-->

- *register_tag.py* defines tools to register tags.
   See below for further details.

### YAML-based testing logic

The logic used to parse configuration files is defined in *meta_conf_parser.py*. 
It contains the creation of config trees, the logic to register constraints and
parameters and the logic to apply constraints.
The `Constraint` class hosts the code to build a constraint, to identify candidates
to its application and to apply it. The identification is based on the type of the
candidate which is compared to a reference type or as set of types.

## How to extend the test suite

### Main entry points

There are three main entry points to the system that are discussed in order of increasing complexity.
The first one is the YAML configuration file.
It does not require any Python knowledge, only a basic comprehension of the conventions
used for writing tests. Being fully *declarative* (no logic) it should be quite easy
to learn its usage from the available examples.
The second one is the *~abinit/tests/pymods/yaml_tests/conf_parser.py* file. It
contains the declarations of the available constraints and parameters. A basic
python understanding is required in order to modify this file. Comments and doc strings
should help users to grasp the meaning of this file. More details available
[in this section](#constraints-and-parameters-registration).

The third one is the file *~abinit/tests/pymods/yaml_tests/structures/*. It
defines the structures used by the YAML parser when encountering a tag (starting
with !), or in some cases when reaching a given pattern (__undef__ for example).
The *structures* directory is a package organised by features (ex: there is a
file *structures/ground_state.py*). Each file define structures for a given
feature. All files are imported by the main script `structures/__init__.py`.
Even if the abstraction layer on top of the _yaml_ module should help, it is
better to have a good understanding of more "advanced" python concepts like
*inheritance*, *decorators*, *classmethod* etc.

## Tag registration and classes implicit methods and attributes

### Basic tag registration tools

The *register_tag.py* module defines python decorators that are used to build python classes 
associated to Yaml documents.

`yaml_map`
: Registers a structure based on a YAML mapping. The class must provide a
  method `from_map` that receives a `dict` object as argument and returns an instance of
  the class. `from_map` should be a class method, but might be a normal method if
  the constructor accepts zero arguments.

`yaml_seq`
: Register a structure based on a YAML sequence/list. The class must provide
  a `from_seq` method that takes a `list` as input and returns an instance of the class.

`yaml_scalar`
: Register a structure based on a YAML scalar/anything that does not fit in the
  two other types but can be a string. The class must provide a `from_scalar`
  method that takes a `string` in argument and returns an instance of the class. It
  is useful to have complex parsing. A practical case is the CSV table parsed by pandas.

`yaml_implicit_scalar`
: Provide the possibility to have special parsing without explicit tags. The
  class it is applied to should meet the same requirements than for `yaml_scalar`
  but should also provide a class attribute named `yaml_pattern`. It can be either
  a string or a compile regex and it should match the expected structure of the
  scalar.

`auto_map`
: Provide a basic general purpose interface for map objects:
  - dict-like interface inherited from `BaseDictWrapper` (`get`, `__contains__`,
    `__getitem__`, `__setitem__`, `__delitem__`, `__iter__`, `keys` and `items`)
  - a `from_map` method that accept any key from the dictionary and add them as
  attributes.
  - a `__repr__` method that show all attributes

`yaml_auto_map`
: equivalent to `auto_map` followed by `yaml_map`

`yaml_not_available_tag`
: This is a normal function (not a decorator) that take a tag name and a message as
  arguments and will register the tag but will raise a warning each time the tag
  is used. An optional third argument `fatal` replace the warning by an error if
  set to `True`.


### Implicit methods and attributes

Some class attributes and methods are used here and there in the tester logic
when available. They are never required but can change some behaviour when provided.

`_is_base_array` Boolean class attribute
: Assert that the object derived from `BaseArray` when set to `True` (`isinstance`
  is not reliable because of the manipulation of the `sys.path`)

`_not_available` Boolean class attribute
: set to `True` in object returned when a tag is registered with
  `yaml_not_available_tag` to make all constraints fail with a more
  useful message.

`is_dict_like` Boolean class attribute
: Assert that the object provide a full `dict` like interface when set to
  `True`. Used in `BaseDictWrapper`.

`__iter__` method
: Standard python implicit method for iterables. If an object has it the test
  driver will crawl its elements.

`get_children` method
: Easy way to allow the test driver to crawl the children of the object.
  It does not take any argument and return a dictionary `{child_name: child_object ...}`.

`has_no_child`  Boolean class attribute
: Prevent the test driver to crawl the children even if it has an `__iter__`
  method (not needed for strings).

`short_str` method
: Used (if available) when the string representation of the object is too long
  for error messages. It should return a string representing the object and not
  too long. There is no constraints on the output length but one should keep
  things readable.

## Constraints and parameters registration

### Add a new parameter

To have the parser recognise a new token as a parameter, one should edit the
*~abinit/tests/pymods/conf_parser.py*.

The `conf_parser` variable in this file have a method `parameter` to register a
new parameter. Arguments are the following:

- `token`: mandatory, the name used in configuration for this parameter
- `default`: optional (`None`), the default value used when the parameter is
  not available in the configuration.
- `value_type`: optional (`float`), the expected type of the value found in
  the configuration.
- `inherited`: optional (`True`), whether or not an explicit value in the
  configuration should be propagated to deeper levels.

Example:

```python
conf_parser.parameter('tol_eq', default=1e-8, inherited=True)
```

### Adding a constraint

To have the parser recognise a new token as a constraint, one should also edit
*~abinit/tests/pymods/conf_parser.py*.

`conf_parser` has a method `constraint` to register a new constraint. It is
supposed to be used as a decorator (on a function) that takes keywords arguments. 
The arguments are all optional and are the following:

- `name`: (`str`) the name to be used in config files. If not
  specified, the name of the function is used.
- `value_type`: (`float`) the expected type of the value found in the
  configuration.
- `inherited`: (`True`), whether or not the constraint should be propagated to
  deeper levels.
- `apply_to`: (`'number'`), the type of data this constraint can be applied to.
  This can be a type or one of the special strings (`'number'`, `'real'`,
  `'integer'`, `'complex'`, `'Array'` (refer to numpy arrays), `'this'`) `'this'`
  is used when the constraint should be applied to the structure where it is
  defined and not be inherited.
- `use_params`: (`[]`) a list of names of parameters to be passed as argument to
  the test function.
- `exclude`: (`set()`) a set of names of constraints that should not be applied
  when this one is.
- `handle_undef`: (`True`) whether or not the special value `undef`
  should be handled before calling the test function.
  If `True` and a `undef` value is present in the data the test will fail or
succeed depending on the value of the special parameter `allow_undef`.
  If `False`, `undef` values won't be checked. They are equivalent to *NaN*.

The decorated function contains the actual test code. It should return `True` if
the test succeed and either `False` or an instance of `FailDetail` (from
*common.py*) if the test failed.

If the test is simple enough one should use `False`.  However if the
test is compound of several non-trivial checks `FailDetail` come in handy to
tell the user which part failed. When you want to signal a failed test and
explaining what happened return `FailDetail('some explanations')`. The message
passed to `FailDetail` will be transmitted to the final report for the user.

Example with `FailDetail`:

```python
@conf_parser.constraint(exclude={'ceil', 'tol_abs', 'tol_rel', 'ignore'})
def tol(tolv, ref, tested):
    '''
        Valid if both relative and absolute differences between the values
        are below the given tolerance.
    '''
    if abs(ref) + abs(tested) == 0.0:
        return True
    elif abs(ref - tested) / (abs(ref) + abs(tested)) >= tolv:
        return FailDetail('Relative error above tolerance.')
    elif abs(ref - tested) >= tolv:
        return FailDetail('Absolute error above tolerance.')
    else:
        return True
```

### How to add a new tag

Pyyaml offer the possibility to directly convert YAML documents to a
Python class using tags. To register a new tag, edit the file `~abinit/tests/pymods/yaml_tools/structures.py`. 
In this file there are several
classes that are *decorated* with one of `@yaml_map`, `@yaml_scalar`, `@yaml_seq` or `@yaml_auto_map`.
These [decorators](https://realpython.com/primer-on-python-decorators/) are the functions that actually
register the class as a known tag.
Whether you should use one or another depend on the organization of the data in the YAML document.
Is it a dictionary, a scalar or a list/sequence ?

In the majority of the cases, one wants the `tester` to browse the children of the structure and apply
the relevant constraints on it and to access attributes through their original name
from the data. In this case, the simpler way to register a new tag is to use
`yaml_auto_map`. For example to register a tag Etot that simply register all
fields from the data tree and let the tester check them with `tol_abs` or
`tol_rel` we would put the following in *~abinit/tests/pymods/yaml_tools/structures/ground_state.py*:

```python
@yaml_auto_map
class Etot(object):
    pass
```

Now the name of the class "Etot" is associated with a tag recognized by the YAML
parser.

`yaml_auto_map` does several things for us:

- it gives the class a `dict` like interface by defining relevant methods, which
  allows the `tester` to browse the children
- it registers the tag in the YAML parser
- it automatically registers all attributes found in the data tree as attributes
  of the class instance. These attributes are accessible through the attribute
  syntax (ex: `my_object.my_attribute_unit`) with a normalized name (basically remove
  characters that cannot be in a python identifier like spaces and punctuation characters)
  or through the dictionary syntax with their original name (ex: `my_object['my attribute (unit)']`)

Sometimes one wants more control over the building of the class instance.
This is what `yaml_map` is for.
Let suppose we still want to register tag but we want to select only a subset of
the components for example. We will use `yaml_map` to gives us control over the
building of the instance. This is done by implementing the `from_map` class
method (a class method is a method that is called from the class instead of
being called from an instance). This method take in argument a dictionary built
from the data tree by the YAML parser and should return an instance of our class.

```python
@yaml_map
class EnergyTerms(object):
    def __init__(self, kin, hart, xc, ew):
        self.kinetic = kin
        self.hartree = hart
        self.xc = xc
        self.ewald = ew

    @classmethod
    def from_map(cls, d):
        # cls is the class (Etot here but it can be
        # something else if we subclass Etot)
        # d is a dictionary built from the data tree

        kin = d['kinetic']
        hart = d['hartree']
        xc = d['xc']
        ew = d['Ewald energy']
        return cls(kin, hart, xc, ew)
```

Now we fully control the building of the structure, however we lost the ability
for the tester to browse the components to check them. If we want to only make
our custom check it is fine. For example we can define a method
`check_components` and use the `callback` constraint like this:

In `~abinit/tests/pymods/yaml_tools/structure/ground_state.py`

```python
@yaml_map
class EnergyTerms(object):
    # same code as the previous example

    def check_components(self, tested):
        # self will always be the reference object
        return (
            abs(self.kinetic - other.kinetic) < 1.0e-10
            and abs(self.hartree - other.hartree) < 1.0e-10
            and abs(self.xc - other.xc) < 1.0e-10
            and abs(self.ewald - other.ewald) < 1.0e-10
        )
```

In the configuration file

```yaml
EnergyTerms:
    callback:
        method: check_components
```

However it can be better to give back the control to `tester`. For that purpose
we can implement the method `get_children` that should return a dictionary of the
data to be checked automatically. `tester` will detect it and use it to check
what you gave him.

```python
@yaml_map
class EnergyTerms(object):
    # same code as the previous example

    def get_children(self):
        return {
            'kin': self.kinetic,
            'hart': self.hartree,
            'xc': self.xc,
            'ew': self.ewald
        }
```

Now `tester` will be able to apply `tol_abs` and friends to the components we gave him.

If the class has a __complete__ dict-like read interface (`__iter__` yielding
keys, `__contains__`, `__getitem__`, `keys` and `items`) then it can have a
class attribute `is_dict_like` set to `True` and it will be treated as any other
node (it not longer need `get_children`). `yaml_auto_map` registered classes
automatically address these requirements.

`yaml_seq` is analogous to `yaml_map` however `to_map` became `to_seq` and the
YAML source data have to match the YAML sequence structure that is either `[a, b, c]` or

```yaml
- a
- b
- c
```

The argument passed to `to_seq` is a list. If one wants `tester` to browse the
elements of the resulting object one can either implement a `get_children`
method or implement the `__iter__` python special method.

If for some reason a class have the `__iter__` method implemented but one does
__not__ want tester to browse its children (`BaseArray` and its subclasses are
such a case) one can defined the `has_no_child` attribute and set it to `True`.
Then tester won't try to browse it. Strings are a particular case. They have the
`__iter__` method but will never been browsed.

To associate a tag to anything else than a YAML mapping or sequence one can use
`yaml_scalar`. This decorator expect the class to have a method `from_scalar`
that takes the raw source as a string in argument and return an instance of the
class. It can be used to create custom parsers of new number representation.
For example to create a 3D vector with unit tag:

```python
@yaml_scalar
class Vec3Unit(object):

    def __init__(self, x, y, z, unit):
        self.x, self.y, self.z = x, y, z
        self.unit = unit

    @classmethod
    def from_scalar(cls, raw):
        sx, sy, sz, unit, *_ = raw.split()  # split on blanks
        return cls(float(sx), float(sy), float(sz), unit)
```

With that new tag registered the YAML parser will happily parse something like

```yaml
kpt: !Vec3Unit 0.5 0.5 0.5 Bohr^-1
```

Finally when the scalar have a easily detectable form one can create an implicit
scalar. An implicit scalar have the advantage to be detected by the YAML parser
without the need of writing the tag before as soon as it match a given regular
expression. For example to parse directly complex numbers we could (naively) do
the following:

```python
@yaml_implicit_scalar
class YAMLComplex(complex):

    # A (quite rigid) regular expression matching complex numbers
    yaml_pattern = r'[+-]?\d+\.\d+) [+-] (\d+\.\d+)i'

    # this have nothing to do with yam_implicit_scalar, it is just the way to
    # create a subclass of a python *native* object
    @staticmethod
    def __new__(*args, **kwargs):
        return complex.__new__(*args, **kwargs)

    @classmethod
    def from_scalar(cls, scal):
        # few adjustment to match the python restrictions on complex number
        # writing
        return cls(scal.replace('i', 'j').replace(' ', ''))
```

## Recommend conventions for Yaml documents

This section discusses the basic conventions that should be followed when writing YAML documents in Fortran.
Note that these conventions are motivated by technical aspects that will facilitate the integration with the
python language as well as the implementation of post-processing tools.

* The tag name should use CamelCase so that one can directly map tag names to python classes
  that are usually given following this convention (see also [PEP8](https://www.python.org/dev/peps/pep-0008/)).

* The tag name should be self-explanatory.

* Whenever possible, the keywords should be a **valid python identifier**. This means one should avoid white spaces
  as much possible and avoid names starting with non-alphabetic characters. White spaces should be replaced by
  underscores. Prefer lower-case names whenever possible.

* By default, quantities are supposed to be given in **atomic units**. If the document contains quantities given
  in other units, we recommended to encode this information in the name using the syntax: `foo_eV`.

*  Avoid names starting with an underscore because these names are reserved for future additions 
   in the python infrastructure.

* `tol_abs`, `tol_rel`, `tol_vec`, `tol_eq`, `ceil`, `ignore`, `equation`, `equations`, `callback` and `callbacks` are reserved 
   keywords that shall not be used in Yaml documents.

* The *comment* field is optional but it is recommended especially when the purpose of the document is not obvious.

An example of well-formed document

```yaml
--- !EnergyTerms
comment             : Components of total free energy (in Hartree)
kinetic_energy      :  5.279019930263079807E+00
hartree_energy      :  8.846303409910728499E-01
xc_energy           : -4.035286123400158687E+00
total_energy_eV     : -2.756386620520307815E+02
...
```

An example of Yaml document that does not follow our guidelines:

```yaml
--- !ETOT
Kinetic energy      :  5.279019930263079807E+00
Hartree energy      :  8.846303409910728499E-01
_XC energy          : -4.035286123400158687E+00
Total energy(eV)    : -2.756386620520307815E+02
equation            : 1 + 1 = 3
...
```

<!--
In a nutshell:

* no savage usage of write statements!
* In the majority of the cases, YAML documents should be opened and closed in the same routine!
* To be discussed: `crystal%yaml_write(stream, indent=4)`
* One should also promote the use of python-friendly keys: white space replaced by underscore, no number as first character,
  tags should be CamelCase
* Is Yaml mode supported only for main output files? What happens to the other files in the files\_to\_test section?
Tags are now mandatory for documents
* A document with a tag is considered a standardized document i.e. a document for which there's
  an official commitment from the ABINIT community to maintain backward compatibility.
  Official YAML documents may be used by third-party software to implement post-processing tools.


: The *label* field appears in all data document. It should be a unique identifier
  at the scale of an _iteration state_. The tag is not necessarily unique, it
  describes the structure of the document and there is no need to use it unless
  special logic have to be implemented for the document. 
-->

<!--
## Further developments

This new YAML-based infrastructure can be used as building block to implement the
high-level logic required by more advanced integration tests such as:

Parametrized tests

: Tests in which multiple parameters are changed either at the level of the input variables
  or at the MPI/OpenMP level.
  Typical example: running calculations with [[useylm]] in [0, 1] or [[paral_kgb]] = 1 runs
  with multiple configurations of [[npfft]], [[npband]], [[npkpt]].

Benchmarks

: Tests to monitor the scalability of the code and make sure that serious bottlenecks are not
  introduced in trunk/develop when important dimension are increased (e.g.
  [[chksymbreak]] > 0 with [[ngkpt]] > 30\*\*3).
  Other possible applications: monitor the memory allocated to detect possible regressions.

Interface with AbiPy and Abiflows

: AbiPy has its own set of integration tests but here we mainly focus on the python layer
  without testing for numerical values.
  Still it would be nice to check for numerical reproducibility, especially when it comes to
  workflows that are already used in production for high-throughput applications (e.g. DFPT).
-->