1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76
|
.. _advanced_usage:
**************
Advanced usage
**************
Stream 7z content
-----------------
The py7zr provide a way to stream archive content with ``extract`` and ``extractall`` methods.
The methods accept an object which implement ``py7zr.io.WriterFactory`` interface as ``factory`` arugment.
Your custom class which implements ``WriterFactory`` interface should have ``create`` method.
The ``create`` method should return your custom class object which implements ``Py7zIO`` interface.
The ``Py7zIO`` interface is as similar as Python Standard ``BinaryIO`` but it also have ``size`` method.
When the py7zr extract the archive contents, it calls the factory object with filename and ask creating the io object.
The py7zr write the io object with ``write`` method, which is as similar as an object returned by the standard ``open`` method.
You can process ``write`` in your custom class. The method is called on-the-fly when the py7zr move to extract the target
archive file.
Note: Because the py7zr may run in multi-threaded, your custom class should be thread-safe.
The py7zr provide the way to stream content without buffering all the archive content in a memory.
Example to extract into network storage
---------------------------------------
Here is a pseudo code to demonstrate a way to extract contents into cloud storage.
To simplify things in an example code, all the authentication, bucket operation,
all the mandatory headers are omitted.
.. code-block::
def extract_stream():
factory = StreamIOFactory()
with py7zr.SevenZipFile("target.7z") as archive:
archive.extractall(factory=factory)
class StreamIO(py7zr.io.Py7zIO):
"""Example network storage writer."""
def __init__(self, fname: str):
self.fname = fname
self.length = 0
def write(self, data: [bytes, bytearray]):
self.length += len(data)
# the py7zr will call write multiple time, so you need to append data
requests.put("https://your.custom.network.storage.example.com/append/to/file/command/", data=data)
def read(self, size: Optional[int] = None) -> bytes:
return b''
def seek(self, offset: int, whence: int = 0) -> int:
return offset
def flush(self) -> None:
pass
def size(self) -> int:
return self.length
class StreamIOFactory(py7zr.io.WriterFactory):
"""Factory class to return StreamWriter object."""
def __init__(self):
self.products = {}
def create(self, filename: str) -> Py7zIO:
product = StreamIO(filename)
self.products[filename] = product
return product
|