.. _ioformats:

Advanced I/O Use Cases
^^^^^^^^^^^^^^^^^^^^^^

This section covers advanced use cases for input and output which go beyond the I/O
functionality currently provided by *librosa*.

Read specific formats
---------------------

*librosa* uses `soundfile <https://github.com/bastibe/PySoundFile>`_ and `audioread <https://github.com/beetbox/audioread>`_ for reading audio.
As of v0.7, librosa uses `soundfile` by default, and falls back on `audioread` only when dealing with codecs unsupported by `soundfile`.
For a list of codecs supported by `soundfile`, see the *libsndfile* `documentation <http://www.mega-nerd.com/libsndfile/>`_.

.. warning:: audioread support is deprecated as of librosa 0.10.0, and will be removed completely in version 1.0.

.. note:: See the installation instructions for PySoundFile `here <https://python-soundfile.readthedocs.io/en/latest/>`_.

Librosa's load function is meant for the common case where you want to load an entire (fragment of a) recording into memory, but some applications require more flexibility.
In these cases, we recommend using `soundfile` directly.
Reading audio files using `soundfile` is similar to the method in *librosa*. One important difference is that the data is returned with shape `(nb_samples, nb_channels)`, compared to `(nb_channels, nb_samples)` in :func:`librosa.load`. Also, the signal is not resampled to 22050 Hz by default, so it needs to be transposed and resampled for further processing in *librosa*. The following example is equivalent to `librosa.load(librosa.ex('trumpet'))`:

.. code-block:: python
    :linenos:

    import librosa
    import soundfile as sf

    # Get example audio file
    filename = librosa.ex('trumpet')

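    # soundfile returns data with shape (nb_samples, nb_channels)
    data, samplerate = sf.read(filename, dtype='float32')
    # Transpose to librosa's (nb_channels, nb_samples) layout
    data = data.T
    # Resample to librosa's default rate; the rates are keyword arguments
    data_22k = librosa.resample(data, orig_sr=samplerate, target_sr=22050)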
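
The layout difference can be illustrated without any audio file. The following is a minimal sketch using only NumPy: the array ``data`` is a synthetic stand-in for what ``sf.read`` would return for a stereo file, and its values are made up for illustration. Note that :func:`librosa.load` also downmixes to mono by default, so a multichannel file would need that extra step to match.

```python
import numpy as np

# Synthetic stand-in for the array returned by sf.read on a stereo file.
# soundfile's layout is (nb_samples, nb_channels).
nb_samples, nb_channels = 1000, 2
data = np.zeros((nb_samples, nb_channels), dtype="float32")
data[:, 0] = 1.0    # first channel
data[:, 1] = -1.0   # second channel

# librosa works with (nb_channels, nb_samples), so transpose.
data_t = data.T

# librosa.load uses mono=True by default; the downmix is an average
# over the channel axis (the same operation as librosa.to_mono).
mono = data_t.mean(axis=0)

print(data.shape, data_t.shape, mono.shape)
```

For a mono file, ``sf.read`` returns a one-dimensional array (unless ``always_2d=True`` is passed), so the transpose leaves it unchanged and no downmix is needed.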


Blockwise Reading
-----------------

For large audio files, it can be beneficial to avoid loading the whole file
into memory. Librosa 0.7 introduced a streaming interface that works on short
fragments of audio sequentially.  :func:`librosa.stream` cuts an input
file into *blocks* of audio, each corresponding to a given number of *frames*,
which can be iterated over as in the following example:


.. code-block:: python
   :linenos:

   import librosa

   sr = librosa.get_samplerate('path/to/file.wav')

   # Set the frame parameters to be equivalent to the librosa defaults
   # in the file's native sampling rate
   frame_length = (2048 * sr) // 22050
   hop_length = (512 * sr) // 22050

   # Stream the data, working on 128 frames at a time
   stream = librosa.stream('path/to/file.wav',
                           block_length=128,
                           frame_length=frame_length,
                           hop_length=hop_length)

   chromas = []
   for y in stream:
      chroma_block = librosa.feature.chroma_stft(y=y, sr=sr,
                                                 n_fft=frame_length,
                                                 hop_length=hop_length,
                                                 center=False)
      chromas.append(chroma_block)

In this example, each audio fragment ``y`` consists of 128 frames' worth of samples,
or more specifically, ``len(y) == frame_length + (block_length - 1) * hop_length``.
Each fragment ``y`` overlaps the subsequent fragment by ``frame_length - hop_length``
samples, which ensures that stream processing produces the same results as processing
the entire sequence in one step (assuming padding / centering is disabled).
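
The block arithmetic can be checked by hand. The sketch below uses only NumPy and does not call librosa: the helpers ``block_bounds`` and ``frame`` are hypothetical names written for this illustration, and the 44100 Hz sampling rate is an assumed value. It first evaluates the block length and overlap for the parameters used in the streaming example, then uses a small toy signal to confirm that framing each block separately yields the same frames as framing the whole signal at once.

```python
import numpy as np

def block_bounds(n_samples, frame_length, hop_length, block_length):
    """Hypothetical helper: yield (start, stop) indices of each full block."""
    block_samples = frame_length + (block_length - 1) * hop_length
    step = block_length * hop_length  # each block advances by block_length hops
    start = 0
    while start + block_samples <= n_samples:
        yield start, start + block_samples
        start += step

def frame(y, frame_length, hop_length):
    """Hypothetical helper: slice y into frames without padding or centering."""
    n_frames = 1 + (len(y) - frame_length) // hop_length
    return np.stack([y[i * hop_length:i * hop_length + frame_length]
                     for i in range(n_frames)])

# Parameters from the streaming example, assuming a 44100 Hz file.
sr = 44100
frame_length = (2048 * sr) // 22050
hop_length = (512 * sr) // 22050
block_length = 128
block_samples = frame_length + (block_length - 1) * hop_length
overlap = frame_length - hop_length
print(frame_length, hop_length, block_samples, overlap)

# Toy signal: 12 frames of length 8 with hop 2, read as 3 blocks of 4 frames.
toy = np.arange(30)
full = frame(toy, frame_length=8, hop_length=2)
blockwise = np.concatenate([frame(toy[a:b], 8, 2)
                            for a, b in block_bounds(len(toy), 8, 2, 4)])
print(np.array_equal(full, blockwise))
```

The toy signal length is chosen so that the blocks tile it exactly; a real file will generally end with a shorter final block.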

For more details about the streaming interface, refer to :func:`librosa.stream`.


Read file-like objects
----------------------

If you want to read audio from file-like objects (also called *virtual files*)
you can use `soundfile` as well.  (This will also work with :func:`librosa.load` and :func:`librosa.stream`, provided
that the underlying codec is supported by `soundfile`.)

For example, to read a file from a zip-compressed archive:

.. code-block:: python
    :linenos:

    import zipfile as zf
    import soundfile as sf
    import io

    with zf.ZipFile('test.zip') as myzip:
        with myzip.open('stereo_file.wav') as myfile:
            tmp = io.BytesIO(myfile.read())
            data, samplerate = sf.read(tmp)

To download and read a file from a URL:

.. code-block:: python
    :linenos:

    import soundfile as sf
    import io

    from urllib.request import urlopen

    url = "https://raw.githubusercontent.com/librosa/librosa/master/tests/data/test1_44100.wav"

    data, samplerate = sf.read(io.BytesIO(urlopen(url).read()))


Write out audio files
---------------------
`PySoundFile <https://python-soundfile.readthedocs.io/en/latest/>`_ provides output functionality that can be used directly with NumPy arrays holding audio data:

.. code-block:: python
    :linenos:

    import numpy as np
    import soundfile as sf

    rate = 44100
    data = np.random.uniform(-1, 1, size=(rate * 10, 2))

    # Write out audio as 24-bit PCM WAV
    sf.write('stereo_file.wav', data, rate, subtype='PCM_24')

    # Write out audio as 24-bit FLAC
    sf.write('stereo_file.flac', data, rate, format='flac', subtype='PCM_24')

    # Write out audio as Ogg Vorbis (a lossy codec with no fixed bit depth)
    sf.write('stereo_file.ogg', data, rate, format='ogg', subtype='vorbis')