# Copyright 2021, New York University and the TUF contributors
# SPDX-License-Identifier: MIT OR Apache-2.0
"""Provides an implementation of ``FetcherInterface`` using the urllib3 HTTP
library.
"""

from __future__ import annotations

import logging
from typing import TYPE_CHECKING

import urllib3

import tuf
from tuf.api import exceptions
from tuf.ngclient._internal.proxy import ProxyEnvironment
from tuf.ngclient.fetcher import FetcherInterface

if TYPE_CHECKING:
    from collections.abc import Iterator

logger = logging.getLogger(__name__)


class Urllib3Fetcher(FetcherInterface):
"""An implementation of ``FetcherInterface`` based on the urllib3 library.
Attributes:
socket_timeout: Timeout in seconds, used for both initial connection
delay and the maximum delay between bytes received.
chunk_size: Chunk size in bytes used when downloading.
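
    Example (an illustrative sketch; the URL and ``max_length`` are
    placeholder values, and ``download_bytes`` is inherited from
    ``FetcherInterface``):

        fetcher = Urllib3Fetcher(app_user_agent="my-app/1.0")
        data = fetcher.download_bytes(
            "https://example.com/metadata/root.json", max_length=64000
        )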
"""

    def __init__(
        self,
        socket_timeout: int = 30,
        chunk_size: int = 400000,
        app_user_agent: str | None = None,
    ) -> None:
        # Default settings
        self.socket_timeout: int = socket_timeout  # seconds
        self.chunk_size: int = chunk_size  # bytes

        # Create User-Agent.
        ua = f"python-tuf/{tuf.__version__}"
        if app_user_agent is not None:
            ua = f"{app_user_agent} {ua}"

        self._proxy_env = ProxyEnvironment(headers={"User-Agent": ua})

    def _fetch(self, url: str) -> Iterator[bytes]:
        """Fetch the contents of HTTP/HTTPS url from a remote server.

        Args:
            url: URL string that represents a file location.

        Raises:
            exceptions.SlowRetrievalError: Timeout occurs while receiving
                data.
            exceptions.DownloadHTTPError: HTTP error code is received.

        Returns:
            Bytes iterator
        """
        # Defer downloading the response body with preload_content=False.
        # Always set the timeout. This timeout value is interpreted by
        # urllib3 as:
        #  - connect timeout (max delay before first byte is received)
        #  - read (gap) timeout (max delay between bytes received)
        try:
            response = self._proxy_env.request(
                "GET",
                url,
                preload_content=False,
                timeout=urllib3.Timeout(self.socket_timeout),
            )
        except urllib3.exceptions.MaxRetryError as e:
            if isinstance(e.reason, urllib3.exceptions.TimeoutError):
                raise exceptions.SlowRetrievalError from e
            # Re-raise non-timeout errors: otherwise the exception would be
            # swallowed and ``response`` would be unbound below.
            raise

        if response.status >= 400:
            response.close()
            raise exceptions.DownloadHTTPError(
                f"HTTP error occurred with status {response.status}",
                response.status,
            )

        return self._chunks(response)

    def _chunks(
        self, response: urllib3.response.BaseHTTPResponse
    ) -> Iterator[bytes]:
        """A generator function to be returned by fetch.

        This way the caller of fetch can differentiate between connection
        and actual data download.
        """
        try:
            yield from response.stream(self.chunk_size)
        except urllib3.exceptions.MaxRetryError as e:
            if isinstance(e.reason, urllib3.exceptions.TimeoutError):
                raise exceptions.SlowRetrievalError from e
            # Propagate non-timeout retry errors instead of swallowing them.
            raise
        finally:
            response.release_conn()
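

# A minimal usage sketch (the URL below is a placeholder, not a real
# endpoint). ``fetch()`` is the public wrapper that ``FetcherInterface``
# provides around ``_fetch``: because ``_chunks`` is a generator, the
# connection is made when ``fetch()`` returns, while the body is downloaded
# only as the returned chunks are iterated.
if __name__ == "__main__":
    fetcher = Urllib3Fetcher()
    for chunk in fetcher.fetch("https://example.com/metadata/root.json"):
        print(f"received {len(chunk)} bytes")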