File: index.md

# Python-Multipart

Python-Multipart is a streaming multipart parser for Python.

## Quickstart

### Simple Example

The following example parses an incoming request body in a simple WSGI application:

```python
import python_multipart

def simple_app(environ, start_response):
    ret = []

    # The following two callbacks just append the name to the return value.
    def on_field(field):
        ret.append(b"Parsed value parameter named: %s" % (field.field_name,))

    def on_file(file):
        ret.append(b"Parsed file parameter named: %s" % (file.field_name,))

    # Create headers object.  We need to convert from WSGI to the actual
    # name of the header, since this library does not assume that you are
    # using WSGI.
    headers = {'Content-Type': environ['CONTENT_TYPE']}
    if 'HTTP_X_FILE_NAME' in environ:
        headers['X-File-Name'] = environ['HTTP_X_FILE_NAME']
    if 'CONTENT_LENGTH' in environ:
        headers['Content-Length'] = environ['CONTENT_LENGTH']

    # Parse the form.
    python_multipart.parse_form(headers, environ['wsgi.input'], on_field, on_file)

    # Return something.
    start_response('200 OK', [('Content-type', 'text/plain')])
    ret.append(b'\n')
    return ret

from wsgiref.simple_server import make_server

httpd = make_server('', 8123, simple_app)
print("Serving on port 8123...")
httpd.serve_forever()
```

If you test this with curl, you can see that the parser works:

```console
$ curl -ik -F "foo=bar" http://localhost:8123/
HTTP/1.0 200 OK
Date: Sun, 07 Apr 2013 01:40:52 GMT
Server: WSGIServer/0.1 Python/2.7.3
Content-type: text/plain

Parsed value parameter named: foo
```
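The header conversion in the application above can be factored into a small helper. The following is a minimal sketch, not part of the library: the name `wsgi_to_headers` is hypothetical, and the Title-Case spelling is an assumption chosen to match the keys the example passes in (`Content-Type`, `Content-Length`, `X-File-Name`).

```python
def wsgi_to_headers(environ):
    """Map WSGI environ keys to regular HTTP header names (hypothetical helper)."""
    headers = {}
    for key, value in environ.items():
        if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            # WSGI stores these two headers without the HTTP_ prefix.
            name = key
        elif key.startswith("HTTP_"):
            name = key[len("HTTP_"):]
        else:
            # Not a request header (e.g. REQUEST_METHOD or wsgi.input).
            continue
        if value == "":
            # WSGI allows these variables to be present but empty; skip them.
            continue
        # CONTENT_TYPE -> Content-Type, X_FILE_NAME -> X-File-Name
        headers["-".join(part.capitalize() for part in name.split("_"))] = value
    return headers


example_environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_TYPE": "multipart/form-data; boundary=abc",
    "CONTENT_LENGTH": "42",
    "HTTP_X_FILE_NAME": "test.txt",
}
print(wsgi_to_headers(example_environ))
```

This converts every request header rather than only the three the example picks out, so an application may prefer to filter the result before handing it on.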

For a more in-depth example showing how the various parts fit together, check out the next section.

### In-Depth Example

In this section, we’ll build an application that computes the SHA-256 hash of all uploaded files in a streaming manner.

To start, we need a simple WSGI application. We could do this with a framework like Flask, Django, or Tornado, but for now let’s stick to plain WSGI:

```python
import python_multipart

def simple_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    # WSGI response bodies must be bytes on Python 3.
    return [b'Hashes:\n']

from wsgiref.simple_server import make_server
httpd = make_server('', 8123, simple_app)
print("Serving on port 8123...")
httpd.serve_forever()
```

You can run this and check with curl that it works properly:

```console
$ curl -ik http://localhost:8123/
HTTP/1.0 200 OK
Date: Sun, 07 Apr 2013 01:49:03 GMT
Server: WSGIServer/0.1 Python/2.7.3
Content-type: text/plain
Content-Length: 8

Hashes:
```

Good! It works. Now let’s add the code that we need. Essentially, we set up the appropriate parser and callbacks so that we can access each portion of the request as it arrives, without having to hold whole parts in memory.

We can start by checking whether we need to create the parser at all: if the Content-Type isn’t `multipart/form-data`, there is nothing for us to do.
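A minimal sketch of that check, using plain string handling rather than the library. The helper name `is_multipart_form` is hypothetical; the same test could presumably be made on the content type returned by `parse_options_header`.

```python
def is_multipart_form(environ):
    """Return True if the request's Content-Type is multipart/form-data."""
    # CONTENT_TYPE may be missing from the WSGI environ, so default to "".
    content_type = environ.get("CONTENT_TYPE", "")
    # Drop parameters such as "; boundary=..." and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "multipart/form-data"


print(is_multipart_form({"CONTENT_TYPE": "multipart/form-data; boundary=abc"}))
print(is_multipart_form({"CONTENT_TYPE": "text/plain"}))
```

An application could call this at the top of the handler and return early for any other content type.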

The final code should look like this:

```python
import hashlib
import python_multipart
from python_multipart.multipart import parse_options_header

def simple_app(environ, start_response):
    ret = [b'Hashes:\n']

    # The nested callbacks need to rebind the current hash object.  Keeping
    # it as an attribute on a small holder class avoids "nonlocal" statements.
    class g(object):
        hash = None

    # This is called when a new part arrives.  We create a new hash object
    # in this callback.
    def on_part_begin():
        g.hash = hashlib.sha256()

    # We got some data!  Update our hash.
    def on_part_data(data, start, end):
        g.hash.update(data[start:end])

    # Our current part is done, so we can finish the hash.
    def on_part_end():
        # WSGI response bodies must be bytes on Python 3.
        ret.append(("Part hash: %s\n" % (g.hash.hexdigest(),)).encode('ascii'))

    # Parse the Content-Type header to get the multipart boundary.
    content_type, params = parse_options_header(environ['CONTENT_TYPE'])
    boundary = params.get(b'boundary')

    # Callbacks dictionary.
    callbacks = {
        'on_part_begin': on_part_begin,
        'on_part_data': on_part_data,
        'on_part_end': on_part_end,
    }

    # Create the parser.
    parser = python_multipart.MultipartParser(boundary, callbacks)

    # The input stream is from the WSGI environ.
    inp = environ['wsgi.input']

    # Feed the parser with data from the request.
    size = int(environ['CONTENT_LENGTH'])
    while size > 0:
        to_read = min(size, 1024 * 1024)
        data = inp.read(to_read)
        parser.write(data)

        size -= len(data)
        if len(data) != to_read:
            break

    start_response('200 OK', [('Content-type', 'text/plain')])
    return ret

from wsgiref.simple_server import make_server
httpd = make_server('', 8123, simple_app)
print("Serving on port 8123...")
httpd.serve_forever()
```

And you can see that this works:

```console
$ echo "Foo bar" > /tmp/test.txt
$ shasum -a 256 /tmp/test.txt
0b64696c0f7ddb9e3435341720988d5455b3b0f0724688f98ec8e6019af3d931  /tmp/test.txt
$ curl -ik -F file=@/tmp/test.txt http://localhost:8123/
HTTP/1.0 200 OK
Date: Sun, 07 Apr 2013 02:09:10 GMT
Server: WSGIServer/0.1 Python/2.7.3
Content-type: text/plain

Hashes:
Part hash: 0b64696c0f7ddb9e3435341720988d5455b3b0f0724688f98ec8e6019af3d931
```
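The streamed digest matches `shasum` because a hash object updated piece by piece gives the same result as hashing the whole input at once. The sketch below isolates that idea together with the bounded read loop from the application. It uses only the standard library, and the helper name `iter_chunks` and the sample payload are illustrative assumptions.

```python
import hashlib
import io


def iter_chunks(stream, total, chunk_size=1024 * 1024):
    """Yield at most `total` bytes from `stream`, up to `chunk_size` at a time."""
    remaining = total
    while remaining > 0:
        data = stream.read(min(remaining, chunk_size))
        if not data:
            break  # The stream ended before `total` bytes arrived.
        remaining -= len(data)
        yield data


payload = b"Foo bar\n" * 1000

# Hash the payload incrementally, 64 bytes at a time.
streamed = hashlib.sha256()
for chunk in iter_chunks(io.BytesIO(payload), len(payload), chunk_size=64):
    streamed.update(chunk)

# The incremental digest equals the one-shot digest.
print(streamed.hexdigest() == hashlib.sha256(payload).hexdigest())
```

Only one chunk is held in memory at a time, which is what keeps large uploads from being buffered in full.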


## Historical note

This package used to be accessed via `import multipart`. This still works for
now (with a warning) as long as the Python package `multipart` is not also
installed. If both are installed, you need to use the full PyPI name
`python_multipart` for this package.