1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482
|
# javaobj-py3
[](https://pypi.python.org/pypi/javaobj-py3/)
[](https://pypi.python.org/pypi/javaobj-py3/)
[](https://github.com/tcalmant/python-javaobj/actions/workflows/build.yml)
[](https://coveralls.io/r/tcalmant/python-javaobj?branch=master)
*python-javaobj* is a python library that provides functions for reading and
writing (writing is WIP currently) Java objects serialized or will be
deserialized by `ObjectOutputStream`. This form of object representation is a
standard data interchange format in Java world.
The `javaobj` module exposes an API familiar to users of the standard library
`marshal`, `pickle` and `json` modules.
## About this repository
This project is a fork of *python-javaobj* by Volodymyr Buell, originally from
[Google Code](http://code.google.com/p/python-javaobj/) and now hosted on
[GitHub](https://github.com/vbuell/python-javaobj).
This fork intends to work both on Python 2.7 and Python 3.4+.
## Compatibility Warnings
### New implementation of the parser
| Implementations | Version |
|-----------------|----------|
| `v1`, `v2` | `0.4.0+` |
Since version 0.4.0, two implementations of the parser are available:
* `v1`: the *classic* implementation of `javaobj`, with a work in progress
implementation of a writer.
* `v2`: the *new* implementation, which is a port of the Java project
[`jdeserialize`](https://github.com/frohoff/jdeserialize/),
with support of the object transformer (with a new API) and of the `numpy`
arrays loading.
You can use the `v1` parser to ensure that the behaviour of your scripts
doesn't change and to keep the ability to write down files.
You can use the `v2` parser for new developments
*which won't require marshalling* and as a *fallback* if the `v1`
fails to parse a file.
### Object transformers V1
| Implementations | Version |
|-----------------|----------|
| `v1` | `0.2.0+` |
As of version 0.2.0, the notion of *object transformer* from the original
project as been replaced by an *object creator*.
The *object creator* is called before the deserialization.
This allows to store the reference of the converted object before deserializing
it, and avoids a mismatch between the referenced object and the transformed one.
### Object transformers V2
| Implementations | Version |
|-----------------|----------|
| `v2` | `0.4.0+` |
The `v2` implementation provides a new API for the object transformers.
Please look at the *Usage (V2)* section in this file.
### Bytes arrays
| Implementations | Version |
|-----------------|----------|
| `v1` | `0.2.3+` |
As of version 0.2.3, bytes arrays are loaded as a `bytes` object instead of
an array of integers.
### Custom Transformer
| Implementations | Version |
|-----------------|----------|
| `v2` | `0.4.2+` |
A new transformer API has been proposed to handle objects written with a custom
Java writer.
You can find a sample usage in the *Custom Transformer* section in this file.
## Features
* Java object instance un-marshalling
* Java classes un-marshalling
* Primitive values un-marshalling
* Automatic conversion of Java Collections to python ones
(`HashMap` => `dict`, `ArrayList` => `list`, etc.)
* Basic marshalling of simple Java objects (`v1` implementation only)
* Automatically uncompresses GZipped files
## Requirements
* Python >= 2.7 or Python >= 3.4
* `enum34` and `typing` when using Python <= 3.4 (installable with `pip`)
* Maven 2+ (for building test data of serialized objects.
You can skip it if you do not plan to run `tests.py`)
## Usage (V1 implementation)
Un-marshalling of Java serialised object:
```python
import javaobj
with open("obj5.ser", "rb") as fd:
jobj = fd.read()
pobj = javaobj.loads(jobj)
print(pobj)
```
Or, you can use `JavaObjectUnmarshaller` object directly:
```python
import javaobj
with open("objCollections.ser", "rb") as fd:
marshaller = javaobj.JavaObjectUnmarshaller(fd)
pobj = marshaller.readObject()
print(pobj.value, "should be", 17)
print(pobj.next, "should be", True)
pobj = marshaller.readObject()
```
**Note:** The objects and methods provided by `javaobj` module are shortcuts
to the `javaobj.v1` package, for Compatibility purpose.
It is **recommended** to explicitly import methods and classes from the `v1`
(or `v2`) package when writing new code, in order to be sure that your code
won't need import updates in the future.
## Usage (V2 implementation)
The following methods are provided by the `javaobj.v2` package:
* `load(fd, *transformers, use_numpy_arrays=False)`:
Parses the content of the given file descriptor, opened in binary mode (`rb`).
The method accepts a list of custom object transformers. The default object
transformer is always added to the list.
The `use_numpy_arrays` flag indicates that the arrays of primitive type
elements must be loaded using `numpy` (if available) instead of using the
standard parsing technic.
* `loads(bytes, *transformers, use_numpy_arrays=False)`:
This the a shortcut to the `load()` method, providing it the binary data
using a `BytesIO` object.
**Note:** The V2 parser doesn't have the marshalling capability.
Sample usage:
```python
import javaobj.v2 as javaobj
with open("obj5.ser", "rb") as fd:
pobj = javaobj.load(fd)
print(pobj.dump())
```
### Object Transformer
An object transformer can be called during the parsing of a Java object
instance or while loading an array.
The Java object instance parsing works in two main steps:
1. The transformer is called to create an instance of a bean that inherits
`JavaInstance`.
1. The latter bean is then called:
* When the object is written with a custom block data
* After the fields and annotations have been parsed, to update the content
of the Python bean.
Here is an example for a Java `HashMap` object. You can look at the code of
the `javaobj.v2.transformer` module to see the whole implementation.
```python
class JavaMap(dict, javaobj.v2.beans.JavaInstance):
"""
Inherits from dict for Python usage, JavaInstance for parsing purpose
"""
def __init__(self):
# Don't forget to call both constructors
dict.__init__(self)
JavaInstance.__init__(self)
def load_from_blockdata(self, parser, reader, indent=0):
"""
Reads content stored in a block data.
This method is called only if the class description has both the
`SC_EXTERNALIZABLE` and `SC_BLOCK_DATA` flags set.
The stream parsing will stop and fail if this method returns False.
:param parser: The JavaStreamParser in use
:param reader: The underlying data stream reader
:param indent: Indentation to use in logs
:return: True on success, False on error
"""
# This kind of class is not supposed to have the SC_BLOCK_DATA flag set
return False
def load_from_instance(self, indent=0):
# type: (int) -> bool
"""
Load content from the parsed instance object.
This method is called after the block data (if any), the fields and
the annotations have been loaded.
:param indent: Indentation to use while logging
:return: True on success (currently ignored)
"""
# Maps have their content in their annotations
for cd, annotations in self.annotations.items():
# Annotations are associated to their definition class
if cd.name == "java.util.HashMap":
# We are in the annotation created by the handled class
# Group annotation elements 2 by 2
# (storage is: key, value, key, value, ...)
args = [iter(annotations[1:])] * 2
for key, value in zip(*args):
self[key] = value
# Job done
return True
# Couldn't load the data
return False
class MapObjectTransformer(javaobj.v2.api.ObjectTransformer):
"""
Creates a JavaInstance object with custom loading methods for the
classes it can handle
"""
def create_instance(self, classdesc):
# type: (JavaClassDesc) -> Optional[JavaInstance]
"""
Transforms a parsed Java object into a Python object
:param classdesc: The description of a Java class
:return: The Python form of the object, or the original JavaObject
"""
if classdesc.name == "java.util.HashMap":
# We can handle this class description
return JavaMap()
else:
# Return None if the class is not handled
return None
```
### Custom Object Transformer
The custom transformer is called when the class is not handled by the default
object transformer.
A custom object transformer still inherits from the `ObjectTransformer` class,
but it also implements the `load_custom_writeObject` method.
The sample given here is used in the unit tests.
#### Java sample
On the Java side, we create various classes and write them as we wish:
```java
class CustomClass implements Serializable {
private static final long serialVersionUID = 1;
public void start(ObjectOutputStream out) throws Exception {
this.writeObject(out);
}
private void writeObject(ObjectOutputStream out) throws IOException {
CustomWriter custom = new CustomWriter(42);
out.writeObject(custom);
out.flush();
}
}
class RandomChild extends Random {
private static final long serialVersionUID = 1;
private int num = 1;
private double doub = 4.5;
RandomChild(int seed) {
super(seed);
}
}
class CustomWriter implements Serializable {
protected RandomChild custom_obj;
CustomWriter(int seed) {
custom_obj = new RandomChild(seed);
}
private static final long serialVersionUID = 1;
private static final int CURRENT_SERIAL_VERSION = 0;
private void writeObject(ObjectOutputStream out) throws IOException {
out.writeInt(CURRENT_SERIAL_VERSION);
out.writeObject(custom_obj);
}
}
```
An here is a sample writing of that kind of object:
```java
ObjectOutputStream oos = new ObjectOutputStream(
new FileOutputStream("custom_objects.ser"));
CustomClass writer = new CustomClass();
writer.start(oos);
oos.flush();
oos.close();
```
#### Python sample
On the Python side, the first step is to define the custom transformers.
They are children of the `javaobj.v2.transformers.ObjectTransformer` class.
```python
class BaseTransformer(javaobj.v2.transformers.ObjectTransformer):
"""
Creates a JavaInstance object with custom loading methods for the
classes it can handle
"""
def __init__(self, handled_classes=None):
self.instance = None
self.handled_classes = handled_classes or {}
def create_instance(self, classdesc):
"""
Transforms a parsed Java object into a Python object
:param classdesc: The description of a Java class
:return: The Python form of the object, or the original JavaObject
"""
if classdesc.name in self.handled_classes:
self.instance = self.handled_classes[classdesc.name]()
return self.instance
return None
class RandomChildTransformer(BaseTransformer):
def __init__(self):
super(RandomChildTransformer, self).__init__(
{"RandomChild": RandomChildInstance}
)
class CustomWriterTransformer(BaseTransformer):
def __init__(self):
super(CustomWriterTransformer, self).__init__(
{"CustomWriter": CustomWriterInstance}
)
class JavaRandomTransformer(BaseTransformer):
def __init__(self):
super(JavaRandomTransformer, self).__init__()
self.name = "java.util.Random"
self.field_names = ["haveNextNextGaussian", "nextNextGaussian", "seed"]
self.field_types = [
javaobj.v2.beans.FieldType.BOOLEAN,
javaobj.v2.beans.FieldType.DOUBLE,
javaobj.v2.beans.FieldType.LONG,
]
def load_custom_writeObject(self, parser, reader, name):
if name != self.name:
return None
fields = []
values = []
for f_name, f_type in zip(self.field_names, self.field_types):
values.append(parser._read_field_value(f_type))
fields.append(javaobj.beans.JavaField(f_type, f_name))
class_desc = javaobj.beans.JavaClassDesc(
javaobj.beans.ClassDescType.NORMALCLASS
)
class_desc.name = self.name
class_desc.desc_flags = javaobj.beans.ClassDataType.EXTERNAL_CONTENTS
class_desc.fields = fields
class_desc.field_data = values
return class_desc
```
Second step is defining the representation of the instances, where the real
object loading occurs. Those classes inherit from
`javaobj.v2.beans.JavaInstance`.
```python
class CustomWriterInstance(javaobj.v2.beans.JavaInstance):
def __init__(self):
javaobj.v2.beans.JavaInstance.__init__(self)
def load_from_instance(self):
"""
Updates the content of this instance
from its parsed fields and annotations
:return: True on success, False on error
"""
if self.classdesc and self.classdesc in self.annotations:
# Here, we known there is something written before the fields,
# even if it's not declared in the class description
fields = ["int_not_in_fields"] + self.classdesc.fields_names
raw_data = self.annotations[self.classdesc]
int_not_in_fields = struct.unpack(
">i", BytesIO(raw_data[0].data).read(4)
)[0]
custom_obj = raw_data[1]
values = [int_not_in_fields, custom_obj]
self.field_data = dict(zip(fields, values))
return True
return False
class RandomChildInstance(javaobj.v2.beans.JavaInstance):
def load_from_instance(self):
"""
Updates the content of this instance
from its parsed fields and annotations
:return: True on success, False on error
"""
if self.classdesc and self.classdesc in self.field_data:
fields = self.classdesc.fields_names
values = [
self.field_data[self.classdesc][self.classdesc.fields[i]]
for i in range(len(fields))
]
self.field_data = dict(zip(fields, values))
if (
self.classdesc.super_class
and self.classdesc.super_class in self.annotations
):
super_class = self.annotations[self.classdesc.super_class][0]
self.annotations = dict(
zip(super_class.fields_names, super_class.field_data)
)
return True
return False
```
Finally we can use the transformers in the loading process.
Note that even if it is not explicitly given, the `DefaultObjectTransformer`
will be also be used, as it is added automatically by `javaobj` if it is
missing from the given list.
```python
# Load the object using those transformers
transformers = [
CustomWriterTransformer(),
RandomChildTransformer(),
JavaRandomTransformer()
]
pobj = javaobj.loads("custom_objects.ser", *transformers)
# Here we show a field that isn't visible from the class description
# The field belongs to the class but it's not serialized by default because
# it's static. See: https://stackoverflow.com/a/16477421/12621168
print(pobj.field_data["int_not_in_fields"])
```
|