File: RELEASE_NOTES_v0.8.rst

What's new in PyTables 0.8
----------------------------

In this release, many enhancements have been added and some bugs
have been fixed. Here is a (non-exhaustive) list:

- The new VLArray class enables you to store large lists of rows
  containing a variable number of elements. The elements can
  be scalars or fully multidimensional objects, in the PyTables
  tradition. This class supports two special kinds of objects as
  rows: Unicode strings (UTF-8 encoding is used internally) and
  generic Python objects (through the use of cPickle).
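
  A minimal sketch of appending variable-length rows follows;
  the calls reflect the old-style API of this era and should be
  taken as assumptions rather than a definitive reference::

    import tables

    fileh = tables.openFile("vlarray_demo.h5", mode="w")
    # Each row may hold a different number of 32-bit integers.
    vla = fileh.createVLArray(fileh.root, "vla", tables.Int32Atom(),
                              "ragged rows of ints")
    vla.append([1, 2, 3])    # first row: three elements
    vla.append([4, 5])       # second row: two elements
    fileh.close()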

- The new EArray class allows you to enlarge already existing
  multidimensional homogeneous data objects. Consider it
  an extension of the already existing Array class, but
  with more functionality. Online compression or other filters
  can be applied to EArray instances, for example.

  Another nice feature of EArrays is their support for fully
  multidimensional data selection with extended slices.  You
  can write "earray[1,2:3,...,4:200]", for example, to get the
  desired dataset slice from disk. This is implemented using
  the powerful selection capabilities of the HDF5 library,
  which results in highly efficient I/O operations. The same
  functionality has been added to Array objects as well.
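
  Below is a hedged sketch of creating and enlarging an EArray;
  the zero-length dimension in the atom shape marks the growable
  axis, and the exact atom spelling is an assumption for the
  0.8-era API::

    import tables

    fileh = tables.openFile("earray_demo.h5", mode="w")
    # Rows of three floats; the first (0-sized) axis can grow.
    ea = fileh.createEArray(fileh.root, "ea",
                            tables.Float32Atom(shape=(0, 3)),
                            "enlargeable array")
    ea.append([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # add two rows
    part = ea[1, 0:2]   # extended slicing reads straight from disk
    fileh.close()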

- New UnImplemented class. If a dataset contains unsupported
  datatypes, it will be associated with an UnImplemented
  instance and inserted into the object tree as usual.
  This allows you to continue working with supported objects
  while retaining access to the attributes of unsupported
  datasets.  This is a change from previous versions, where a
  RuntimeError occurred when an unsupported object was
  encountered.

  The combination of the new UnImplemented class with the
  support for new datatypes will enable PyTables to greatly
  increase the number of types of native HDF5 files that can
  be read and modified.
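
  For instance, here is a hypothetical way to spot such nodes
  while walking a file (the traversal method names are
  assumptions based on the old-style API)::

    import tables

    fileh = tables.openFile("generic.h5", mode="r")
    for group in fileh.walkGroups():
        for node in fileh.listNodes(group):
            if isinstance(node, tables.UnImplemented):
                # The data itself is unreadable, but the node and
                # its attributes remain accessible.
                print "unsupported node:", node
    fileh.close()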

- Boolean support has been added for all the Leaf objects.

- The Table class now has an append() method that allows you
  to save large buffers of data in one go (i.e. bypassing the
  Row accessor). This can greatly improve data-gathering
  speed.
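
  A minimal sketch; the table description and the list-of-tuples
  buffer format are illustrative assumptions::

    import tables

    class Particle(tables.IsDescription):   # hypothetical schema
        name  = tables.StringCol(16)
        value = tables.FloatCol()

    fileh = tables.openFile("append_demo.h5", mode="w")
    table = fileh.createTable(fileh.root, "table", Particle, "demo")
    # Append many rows at once, bypassing the Row accessor.
    table.append([("a", 1.0), ("b", 2.0), ("c", 3.0)])
    fileh.close()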

- The standard HDF5 shuffle filter (to further enhance the
  compression level) is supported.

- The standard HDF5 fletcher32 checksum filter is supported.

- As the number of supported filters is growing (and may
  increase further in the future), a Filters() class has been
  introduced to handle filters more easily.  In order to add
  support for this class, it was necessary to make a change in
  the createTable() method that is not backwards compatible:
  the "compress" and "complib" parameters are now deprecated
  and the "filters" parameter should be used in their
  place. You will be able to continue using the old parameters
  (only a DeprecationWarning will be issued) for the next few
  releases, but you should migrate to the new version as soon
  as possible. In general, old code is easy to migrate; for
  example::

    table = fileh.createTable(group, 'table', Test, '', complevel, complib)

  should be replaced by::

    table = fileh.createTable(group, 'table', Test, '',
                              Filters(complevel, complib))
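
  The same class covers the new shuffle and fletcher32 filters
  described above; a hedged example (the keyword names are
  assumptions)::

    import tables

    # zlib compression plus shuffling and fletcher32 checksums.
    filters = tables.Filters(complevel=5, complib="zlib",
                             shuffle=1, fletcher32=1)
    fileh = tables.openFile("filters_demo.h5", mode="w")
    ea = fileh.createEArray(fileh.root, "ea",
                            tables.Int32Atom(shape=(0,)),
                            "filtered data", filters=filters)
    fileh.close()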

- A copy() method that supports slicing and changing filters
  has been added for all the Leaf objects. See the User's
  Manual for more information.
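
  For example, a hypothetical call copying the first rows of a
  leaf into a new node while recompressing (the parameter names
  are assumptions)::

    import tables

    fileh = tables.openFile("copy_demo.h5", mode="a")
    leaf = fileh.root.table     # some already existing leaf
    leaf.copy(fileh.root, "table_copy", start=0, stop=1000,
              filters=tables.Filters(complevel=9))
    fileh.close()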

- A couple of new methods, namely copyFile() and copyChilds(),
  have been added to the File class to permit easy replication
  of complete hierarchies or sub-hierarchies, even to other
  files. You can change filters during the copy process as
  well.
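
  For example, a sketch of replicating a whole file while
  changing its filters (the keyword argument is an assumption)::

    import tables

    fileh = tables.openFile("source.h5", mode="r")
    # Clone everything into a new file, recompressing lightly.
    fileh.copyFile("backup.h5", filters=tables.Filters(complevel=1))
    fileh.close()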

- Two new utilities have been added: ptdump and
  ptrepack. The ptdump utility allows the user to examine
  the contents of PyTables files (both metadata and actual
  data). The powerful ptrepack utility lets you
  selectively copy (portions of) hierarchies to specific
  locations in other files. It can also be used as an
  importer for generic HDF5 files.
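
  Typical invocations might look like this (the flags shown are
  illustrative, not a complete reference)::

    $ ptdump -v datafile.h5                # dump metadata and data
    $ ptrepack datafile.h5:/group1 packed.h5:/group1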

- The meaning of the stop parameter in read() methods has
  changed. Now a value of 'None' means the last row, and a
  value of 0 (zero) means the first row. This is more
  consistent with the range() function in Python and the
  __getitem__() special method in numarray.
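
  A sketch of the new semantics, given an already opened table
  handle (a hypothetical object)::

    rows  = table.read(start=0, stop=None)  # up to the last row
    empty = table.read(start=0, stop=0)     # nothing, like range(0, 0)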

- The method Table.removeRows() is no longer limited by table
  size.  You can now delete rows regardless of the size of the
  table.
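
  For instance (the signature is an assumption based on the
  era's API)::

    table.removeRows(0, 10000)   # delete the first 10000 rows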

- The "numarray" value has been added to the flavor parameter
  in the Table.read() method for completeness.
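
  For example, assuming a hypothetical table handle::

    data = table.read(flavor="numarray")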

- The attributes (the .attrs instance variable) are now Python
  properties. Access to their values is no longer lazy,
  i.e. you will be able to see both system and user
  attributes from the command line using the tab-completion
  capability of your Python console (if enabled).

- Documentation has been greatly improved to explain all the
  new functionality. In particular, the internal format of
  PyTables is now fully described. You can now build
  "native" PyTables files using any generic HDF5 software
  simply by reproducing this format.

- Many new tests have been added, not only to check new
  functionality but also to more stringently check
  existing functionality. There are more than 800 different
  tests now (and the number is increasing :).

- PyTables has set a new record for the amount of data that
  fits in a single file: more than 5 TB (yes, more than
  5000 GB), occupying just 11 GB compressed on disk, written
  on an AMD Opteron machine running Linux-64 (the 64-bit
  version of the Linux kernel). See the gory details at:
  http://pytables.sf.net/html/HowFast.html.

- New platforms supported: PyTables has been compiled and tested
  under Linux32 (Intel), Linux64 (AMD Opteron and Alpha), Win32
  (Intel), MacOSX (PowerPC), FreeBSD (Intel), Solaris (6, 7, 8
  and 9 with UltraSparc), IRIX64 (IRIX 6.5 with R12000), and it
  probably works on many more architectures. In particular,
  release 0.8 is the first one that provides a relatively clean
  port to 64-bit platforms.

- As always, some bugs have been solved (especially bugs that
  occur when deleting and/or overwriting attributes).

- And last, but definitely not least, a new donations section
  has been added to the PyTables web site
  (http://sourceforge.net/projects/pytables, then follow the
  "Donations" tag). If you like PyTables and want this effort
  to continue, please, donate!

Enjoy!

-- Francesc Alted
falted@pytables.org