usp.web_client.requests_client

Implementation of usp.web_client.abstract_client with Requests.

class usp.web_client.requests_client.RequestsWebClient

Bases: AbstractWebClient

requests-based web client to be used by the sitemap fetcher.

__init__(verify=True, wait: float | None = None, random_wait: bool = False, session: Session | None = None)
Parameters:
  • verify – whether certificates should be verified for HTTPS requests.

  • wait – time to wait between requests, in seconds.

  • random_wait – if true, wait time is multiplied by a random number between 0.5 and 1.5.

  • session – a custom session object to use, or None to create a new one.
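
The interaction between wait and random_wait can be sketched in a few lines. This is a minimal illustration of the documented behaviour, not the library's code; the helper name effective_wait and its rng parameter are assumptions made for the example.

```python
import random
from typing import Optional


def effective_wait(wait: Optional[float], random_wait: bool,
                   rng: Optional[random.Random] = None) -> float:
    """Return the delay, in seconds, to apply between two requests.

    Hypothetical helper that mirrors the documented parameters.
    """
    if wait is None:
        return 0.0  # no delay configured
    if not random_wait:
        return wait  # fixed delay
    rng = rng or random.Random()
    # Scale the base delay by a factor drawn uniformly from [0.5, 1.5].
    return wait * rng.uniform(0.5, 1.5)


fixed = effective_wait(2.0, random_wait=False)
jittered = effective_wait(2.0, random_wait=True, rng=random.Random(0))
```

With wait=2.0 and random_wait enabled, the delay under this sketch falls somewhere between 1.0 and 3.0 seconds.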

get(url: str) → AbstractWebClientResponse

Fetch a URL and return a response.

This method should not raise exceptions on connection errors (including timeouts); such errors should instead be reported via the returned response object.

Parameters:

url – URL to fetch.

Returns:

Response object.
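
The "report errors through the response object" contract described above can be illustrated with stand-in types. The sketch below is an assumption-laden illustration: SketchSuccess, SketchError, safe_get and the injected fetch callable are hypothetical names, and treating every OSError as retryable is a choice made for the example only.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class SketchSuccess:
    """Stand-in for a successful response."""
    body: bytes


@dataclass
class SketchError:
    """Stand-in for an error response."""
    message: str
    retryable: bool


def safe_get(url: str,
             fetch: Callable[[str], bytes]) -> Union[SketchSuccess, SketchError]:
    """Call fetch(url) and turn connection failures into a value."""
    try:
        return SketchSuccess(fetch(url))
    except OSError as ex:
        # TimeoutError and ConnectionError are both subclasses of OSError.
        return SketchError(f"Unable to fetch {url}: {ex}", retryable=True)


def always_times_out(url: str) -> bytes:
    """Fake fetcher used to simulate a timeout without any network access."""
    raise TimeoutError("read timed out")


result = safe_get("https://example.com/sitemap.xml", always_times_out)
```

The caller then inspects the type of the returned object instead of wrapping the call in try/except.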

set_max_response_data_length(max_response_data_length: int) → None

Set the maximum number of bytes that the web client will fetch.

Parameters:

max_response_data_length – Maximum number of bytes that the web client will fetch, or None to fetch all.
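
As a rough illustration of what such a limit means for the returned data, the sketch below caps a byte string at a given length. The helper limit_response_data is hypothetical, and the way the library actually enforces the limit while downloading may differ.

```python
from typing import Optional


def limit_response_data(data: bytes,
                        max_response_data_length: Optional[int]) -> bytes:
    """Return at most max_response_data_length bytes, or everything if None."""
    if max_response_data_length is None:
        return data  # no limit: keep the full body
    return data[:max_response_data_length]


body = b"<urlset><url>...</url></urlset>"
capped = limit_response_data(body, 8)
```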

set_proxies(proxies: Dict[str, str]) → None

Set a proxy for the request.

Parameters:

proxies – Proxy definition where the keys are schemes (“http” or “https”) and the values are proxy addresses, e.g. {'http': 'http://user:pass@10.10.1.10:3128/'}. Pass an empty dict to disable proxying.
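
A scheme-keyed proxy mapping of the kind described above might look like the following. The proxy_for helper is a hypothetical illustration of how a scheme selects a proxy; it is not part of the library, and the addresses are the placeholder ones from the parameter description.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# One entry per URL scheme; both point at the same example proxy here.
proxies: Dict[str, str] = {
    "http": "http://user:pass@10.10.1.10:3128/",
    "https": "http://user:pass@10.10.1.10:3128/",
}


def proxy_for(url: str, proxies: Dict[str, str]) -> Optional[str]:
    """Return the proxy address configured for the URL's scheme, if any."""
    scheme = urlsplit(url).scheme  # urlsplit normalizes the scheme to lowercase
    return proxies.get(scheme)


selected = proxy_for("https://example.com/sitemap.xml", proxies)
direct = proxy_for("https://example.com/sitemap.xml", {})  # empty dict: no proxy
```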

set_timeout(timeout: int | Tuple[int, int] | None) → None

Set HTTP request timeout.

See also: Requests timeout docs

Parameters:

timeout – An integer to use as both the connect and read timeouts, a tuple to specify them individually, or None for no timeout.
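
The three accepted forms of timeout can be made concrete by normalizing them to a (connect, read) pair. The split_timeout helper below is a hypothetical sketch of that interpretation, written for illustration only.

```python
from typing import Optional, Tuple, Union

Timeout = Union[int, Tuple[int, int], None]


def split_timeout(timeout: Timeout) -> Tuple[Optional[int], Optional[int]]:
    """Normalize a timeout value to a (connect, read) pair."""
    if timeout is None:
        return (None, None)  # no timeout on either phase
    if isinstance(timeout, tuple):
        connect, read = timeout  # separate connect and read timeouts
        return (connect, read)
    return (timeout, timeout)  # one value applies to both phases


same_for_both = split_timeout(10)
separate = split_timeout((3, 30))
unlimited = split_timeout(None)
```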

class usp.web_client.requests_client.RequestsWebClientErrorResponse

Bases: WebClientErrorResponse

Error response from the Requests-based web client.

__init__(message: str, retryable: bool)

Constructor.

Parameters:
  • message – Message describing what went wrong.

  • retryable – True if the request should be retried.

message() → str

Return message describing what went wrong.

Returns:

Message describing what went wrong.

retryable() → bool

Return True if request should be retried.

Returns:

True if request should be retried.
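
One way a caller might act on retryable() is a bounded retry loop. The sketch below uses a stand-in class with the same two accessor methods; SketchErrorResponse, fetch_with_retries and the max_attempts parameter are assumptions for illustration and do not describe how the library itself retries.

```python
from typing import Callable


class SketchErrorResponse:
    """Stand-in exposing message() and retryable() like the error response."""

    def __init__(self, message: str, retryable: bool) -> None:
        self._message = message
        self._retryable = retryable

    def message(self) -> str:
        return self._message

    def retryable(self) -> bool:
        return self._retryable


def fetch_with_retries(fetch: Callable[[], object], max_attempts: int = 3) -> object:
    """Call fetch() until it stops returning a retryable error or attempts run out."""
    response = fetch()
    for _ in range(max_attempts - 1):
        is_retryable_error = (isinstance(response, SketchErrorResponse)
                              and response.retryable())
        if not is_retryable_error:
            break  # success, or an error that retrying will not fix
        response = fetch()
    return response


# Scripted outcomes: two retryable failures followed by a success.
scripted = iter([
    SketchErrorResponse("timed out", True),
    SketchErrorResponse("timed out", True),
    "sitemap body",
])
result = fetch_with_retries(lambda: next(scripted))
```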

class usp.web_client.requests_client.RequestsWebClientSuccessResponse

Bases: AbstractWebClientSuccessResponse

requests-based successful response.

__init__(requests_response: Response, max_response_data_length: int | None = None)
Parameters:
  • requests_response – Response object returned by Requests.

  • max_response_data_length – Maximum data length, or None to not restrict.

header(case_insensitive_name: str) → str | None

Return HTTP header value for a given case-insensitive name, or None if such header wasn’t set.

Parameters:

case_insensitive_name – HTTP header’s name, e.g. “Content-Type”.

Returns:

HTTP header’s value, or None if it was unset.
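
Case-insensitive header lookup can be sketched over a plain mapping as follows. The free function below only illustrates the documented semantics (match regardless of case, None when absent); its name and the sample headers are assumptions for the example.

```python
from typing import Mapping, Optional


def lookup_header(headers: Mapping[str, str],
                  case_insensitive_name: str) -> Optional[str]:
    """Return the header value whose name matches ignoring case, else None."""
    wanted = case_insensitive_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value
    return None  # header was not set


headers = {"Content-Type": "application/xml", "Content-Length": "1024"}
content_type = lookup_header(headers, "content-type")
missing = lookup_header(headers, "ETag")
```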

raw_data() → bytes

Return encoded raw data of the response.

Returns:

Encoded raw data of the response.

status_code() → int

Return HTTP status code of the response.

Returns:

HTTP status code of the response, e.g. 200.

status_message() → str

Return HTTP status message of the response.

Returns:

HTTP status message of the response, e.g. “OK”.
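
For reference, the standard reason phrase that usually accompanies a status code can be looked up with Python's http module. This sketch is only an aid for relating codes to messages; standard_status_message is a hypothetical helper, and the message a server actually sends may differ from the standard phrase.

```python
from http import HTTPStatus


def standard_status_message(status_code: int) -> str:
    """Return the standard reason phrase for a status code, or '' if unknown."""
    try:
        return HTTPStatus(status_code).phrase
    except ValueError:
        return ""  # not a status code known to the http module


ok_message = standard_status_message(200)
not_found_message = standard_status_message(404)
```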