File: s3-uploading-files.rst

.. Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.

   This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0
   International License (the "License"). You may not use this file except in compliance with the
   License. A copy of the License is located at http://creativecommons.org/licenses/by-nc-sa/4.0/.

   This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
   either express or implied. See the License for the specific language governing permissions and
   limitations under the License.


###############
Uploading files
###############

The AWS SDK for Python provides a pair of methods to upload a file to an S3
bucket.

The ``upload_file`` method accepts a file name, a bucket name, and an object 
name. The method handles large files by splitting them into smaller chunks 
and uploading each chunk in parallel.

.. code-block:: python

    import logging
    import boto3
    from botocore.exceptions import ClientError
    import os


    def upload_file(file_name, bucket, object_name=None):
        """Upload a file to an S3 bucket

        :param file_name: File to upload
        :param bucket: Bucket to upload to
        :param object_name: S3 object name. If not specified then file_name is used
        :return: True if file was uploaded, else False
        """

        # If S3 object_name was not specified, use file_name
        if object_name is None:
            object_name = os.path.basename(file_name)

        # Upload the file
        s3_client = boto3.client('s3')
        try:
            s3_client.upload_file(file_name, bucket, object_name)
        except ClientError as e:
            logging.error(e)
            return False
        return True
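
For example, the helper above can be invoked as shown in the following
sketch. The file and bucket names are placeholders; substitute values that
exist in your own account.

.. code-block:: python

    # Hypothetical file and bucket names -- replace with your own.
    if upload_file('tmp/hello.txt', 'my-bucket'):
        print('Upload succeeded')
    else:
        print('Upload failed')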


The ``upload_fileobj`` method accepts a readable file-like object. The file 
object must be opened in binary mode, not text mode.

.. code-block:: python

    s3 = boto3.client('s3')
    with open("FILE_NAME", "rb") as f:
        s3.upload_fileobj(f, "BUCKET_NAME", "OBJECT_NAME")
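
Because ``upload_fileobj`` only requires a readable, binary file-like
object, it can also upload data that already lives in memory. A minimal
sketch, with the bucket and object names as placeholders:

.. code-block:: python

    import io

    s3 = boto3.client('s3')
    buffer = io.BytesIO(b"binary data already in memory")
    s3.upload_fileobj(buffer, "BUCKET_NAME", "OBJECT_NAME")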


The ``upload_file`` and ``upload_fileobj`` methods are provided by the S3 
``Client``, ``Bucket``, and ``Object`` classes. The method functionality 
provided by each class is identical. No benefits are gained by calling one 
class's method over another's. Use whichever class is most convenient.
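
For example, the same upload can be written with the resource-level
``Bucket`` and ``Object`` classes. The names below are placeholders.

.. code-block:: python

    s3 = boto3.resource('s3')

    # Equivalent call through the Bucket class
    s3.Bucket('BUCKET_NAME').upload_file('FILE_NAME', 'OBJECT_NAME')

    # Equivalent call through the Object class
    s3.Object('BUCKET_NAME', 'OBJECT_NAME').upload_file('FILE_NAME')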


The ExtraArgs parameter
===========================

Both ``upload_file`` and ``upload_fileobj`` accept an optional ``ExtraArgs`` 
parameter that can be used for various purposes. The list of valid 
``ExtraArgs`` settings is specified in the ``ALLOWED_UPLOAD_ARGS`` attribute 
of the ``S3Transfer`` object 
at :py:attr:`boto3.s3.transfer.S3Transfer.ALLOWED_UPLOAD_ARGS`.
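
A quick way to inspect that list at runtime is to print the attribute
directly:

.. code-block:: python

    from boto3.s3.transfer import S3Transfer

    # Prints the argument names accepted in ExtraArgs,
    # e.g. 'ACL', 'Metadata', 'ContentType', ...
    print(S3Transfer.ALLOWED_UPLOAD_ARGS)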

The following ``ExtraArgs`` setting specifies metadata to attach to the S3 
object.

.. code-block:: python

    s3.upload_file(
        'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
        ExtraArgs={'Metadata': {'mykey': 'myvalue'}}
    )


The following ``ExtraArgs`` setting assigns the canned ACL (access control 
list) value 'public-read' to the S3 object.

.. code-block:: python

    s3.upload_file(
        'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
        ExtraArgs={'ACL': 'public-read'}
    )


The ``ExtraArgs`` parameter can also be used to set custom or multiple ACLs.

.. code-block:: python

    s3.upload_file(
        'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
        ExtraArgs={
            'GrantRead': 'uri="http://acs.amazonaws.com/groups/global/AllUsers"',
            'GrantFullControl': 'id="01234567890abcdefg"',
        }
    )
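
Multiple ``ExtraArgs`` settings can also be combined in a single call. The
following sketch sets a content type and requests SSE-S3 server-side
encryption; both keys are listed in ``ALLOWED_UPLOAD_ARGS``.

.. code-block:: python

    s3.upload_file(
        'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
        ExtraArgs={
            'ContentType': 'text/plain',
            'ServerSideEncryption': 'AES256',
        }
    )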


The Callback parameter
==========================

Both ``upload_file`` and ``upload_fileobj`` accept an optional ``Callback``
parameter. The parameter references a callable, such as an instance of a
class that defines ``__call__``, that the Python SDK invokes intermittently
during the transfer operation.

Invoking a Python class instance executes the instance's ``__call__`` method.
For each invocation, it is passed the number of bytes transferred since the
previous invocation. A progress monitor can accumulate these amounts to
track the total transferred so far.

The following ``Callback`` setting instructs the Python SDK to create an 
instance of the ``ProgressPercentage`` class. During the upload, the 
instance's ``__call__`` method will be invoked intermittently.

.. code-block:: python

    s3.upload_file(
        'FILE_NAME', 'BUCKET_NAME', 'OBJECT_NAME',
        Callback=ProgressPercentage('FILE_NAME')
    )


An example implementation of the ``ProgressPercentage`` class is shown below.

.. code-block:: python

    import os
    import sys
    import threading

    class ProgressPercentage(object):

        def __init__(self, filename):
            self._filename = filename
            self._size = float(os.path.getsize(filename))
            self._seen_so_far = 0
            self._lock = threading.Lock()

        def __call__(self, bytes_amount):
            # bytes_amount is the number of bytes transferred since the
            # previous invocation, so accumulate it into a running total.
            # To simplify, assume this is hooked up to a single filename.
            with self._lock:
                self._seen_so_far += bytes_amount
                percentage = (self._seen_so_far / self._size) * 100
                sys.stdout.write(
                    "\r%s  %s / %s  (%.2f%%)" % (
                        self._filename, self._seen_so_far, self._size,
                        percentage))
                sys.stdout.flush()
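
The ``Callback`` parameter works the same way with ``upload_fileobj`` and
with the ``Bucket`` and ``Object`` upload methods. A minimal sketch with
placeholder names:

.. code-block:: python

    s3 = boto3.client('s3')
    with open('FILE_NAME', 'rb') as f:
        s3.upload_fileobj(
            f, 'BUCKET_NAME', 'OBJECT_NAME',
            Callback=ProgressPercentage('FILE_NAME')
        )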