Downloader
class parfive.Downloader(max_conn=5, progress=True, file_progress=True, loop=None, notebook=None, overwrite=False, headers=None)
Bases: object
Download files in parallel.
Parameters
max_conn : int, optional
    The number of parallel download slots.
progress : bool, optional
    If True, show a main progress bar showing how many of the total files have been downloaded. If False, no progress bars will be shown at all.
file_progress : bool, optional
    If True and progress is True, show max_conn progress bars detailing the progress of each individual file being downloaded.
loop : asyncio.AbstractEventLoop, optional
    The event loop to use to download the files. If not specified, a new loop will be created and executed in a new thread so it does not interfere with any currently running event loop.
notebook : bool, optional
    If True, tqdm will be used in notebook mode. If None, an attempt will be made to detect the notebook and guess which progress bar to use.
overwrite : bool or str, optional
    Determines how to handle downloading if a file already exists with the same name. If False, the download is skipped and the path to the existing file is returned; if True, the file is downloaded and the existing file is overwritten; if 'unique', the filename is modified to be unique.
headers : dict
    Request headers to be passed to the server. Adds User-Agent information about parfive, aiohttp and python if not passed explicitly.
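As a minimal sketch of how the constructor parameters above fit together, the helper below collects a few of them into keyword arguments and checks that overwrite is one of the documented values. The names downloader_options and make_downloader are hypothetical, chosen only for illustration; parfive is imported lazily inside make_downloader, so the options helper can be tried without it installed.

```python
def downloader_options(max_conn=5, overwrite=False, user_agent=None):
    """Collect keyword arguments for parfive.Downloader (hypothetical helper)."""
    # The documented values for overwrite are False, True and 'unique'.
    if overwrite not in (False, True, "unique"):
        raise ValueError("overwrite must be False, True or 'unique'")
    options = {
        "max_conn": max_conn,      # number of parallel download slots
        "progress": True,          # show the overall progress bar
        "file_progress": False,    # hide the per-file progress bars
        "overwrite": overwrite,
    }
    if user_agent is not None:
        # An explicit User-Agent header; per the parameter description,
        # parfive adds its own only when none is passed.
        options["headers"] = {"User-Agent": user_agent}
    return options


def make_downloader(**kwargs):
    """Create a Downloader from the options above (requires parfive)."""
    # Imported here so that downloader_options works without parfive installed.
    from parfive import Downloader
    return Downloader(**downloader_options(**kwargs))
```

A call such as make_downloader(max_conn=2, overwrite="unique") would then give a downloader with two slots that renames files instead of skipping or overwriting them.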
Attributes Summary
queued_downloads
    The total number of files already queued for download.
Methods Summary
download([timeouts])
    Download all files in the queue.
enqueue_file(url[, path, filename, overwrite])
    Add a file to the download queue.
retry(results)
    Retry any failed downloads in a results object.
Attributes Documentation
queued_downloads
    The total number of files already queued for download.
Methods Documentation
download(timeouts=None)
Download all files in the queue.
Parameters
timeouts : dict, optional
    Overrides for the default timeouts for HTTP downloads. Supported keys are any accepted by the aiohttp.ClientTimeout class. Defaults to 5 minutes for the total session timeout and 90 seconds for the socket read timeout.
Returns
filenames : parfive.Results
    A list of files downloaded.
Notes
The defaults for the 'total' and 'sock_read' timeouts can be overridden by two environment variables, PARFIVE_TOTAL_TIMEOUT and PARFIVE_SOCK_READ_TIMEOUT.
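The timeouts argument can be illustrated with a short sketch. The helper names build_timeouts and download_all are hypothetical; the defaults of 300 and 90 seconds simply mirror the values stated above. Both keys are always passed, because the text does not say whether a key left out of the dict keeps its default.

```python
def build_timeouts(total=5 * 60, sock_read=90):
    """Return a timeouts dict using keys accepted by aiohttp.ClientTimeout."""
    # 'total' bounds the whole session, 'sock_read' bounds each socket read.
    return {"total": float(total), "sock_read": float(sock_read)}


def download_all(downloader, total=5 * 60, sock_read=90):
    """Run the queued downloads with explicit timeouts (sketch).

    `downloader` is assumed to be a parfive.Downloader that already has
    files queued; the return value is whatever download() returns.
    """
    return downloader.download(timeouts=build_timeouts(total, sock_read))
```

For example, download_all(dl, total=600) would allow ten minutes for the whole session while keeping the 90 second socket read limit.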
enqueue_file(url, path=None, filename=None, overwrite=None, **kwargs)
Add a file to the download queue.
Parameters
url : str
    The URL to retrieve.
path : str, optional
    The directory to retrieve the file into. If None, defaults to the current directory.
filename : str or callable, optional
    The filename to save the file as. Can also be a callable which takes two arguments, the URL and the response object from opening that URL, and returns the filename. (Note: for FTP downloads the response will be None.) If None, the HTTP headers will be read for the filename, or the last segment of the URL will be used.
overwrite : bool or str, optional
    Determines how to handle downloading if a file already exists with the same name. If False, the download is skipped and the path to the existing file is returned; if True, the file is downloaded and the existing file is overwritten; if 'unique', the filename is modified to be unique. If None, the value set when constructing the Downloader object is used.
kwargs : dict
    Extra keyword arguments are passed to aiohttp.ClientSession.get or aioftp.ClientSession, depending on the protocol.
Notes
The proxy URL is read from the environment variables HTTP_PROXY or HTTPS_PROXY, depending on the protocol of the url passed. Proxy authentication (proxy_auth) should be passed as an aiohttp.BasicAuth object. Proxy headers (proxy_headers) should be passed as a dict object.
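A minimal sketch of queueing several files with explicit names follows. The helpers filename_for and queue_files, the fallback name download.bin and the default directory are all assumptions made for illustration. A plain string is passed as filename, not a callable, so the sketch does not depend on the argument order of the callable form.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def filename_for(url, fallback="download.bin"):
    """Derive a file name from the last path segment of a URL."""
    # urlsplit drops the query string and fragment from the path component.
    name = PurePosixPath(urlsplit(url).path).name
    return name or fallback


def queue_files(downloader, urls, directory="downloads"):
    """Queue every URL on `downloader` (assumed to be a parfive.Downloader)."""
    for url in urls:
        downloader.enqueue_file(
            url,
            path=directory,
            filename=filename_for(url),
            overwrite="unique",  # rename if a file with that name exists
        )
    # queued_downloads is the documented count of files queued so far.
    return downloader.queued_downloads
```

Passing overwrite per file, as here, takes precedence over the value given to the constructor; leaving it as None falls back to the constructor setting.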
retry(results)
Retry any failed downloads in a results object.
Note
This will start a new event loop.
Parameters
results : parfive.Results
    A previous results object; its .errors property will be read and the downloads retried.
Returns
results : parfive.Results
    A modified version of the input results with all the errors from this download attempt and any new files appended to the list of file paths.
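The download and retry methods can be combined into a bounded retry loop, sketched below. The function name download_with_retries and the max_retries parameter are hypothetical; the sketch assumes that an empty .errors on the results object means nothing is left to retry.

```python
def download_with_retries(downloader, max_retries=3):
    """Download the queue, retrying failed files up to max_retries times.

    `downloader` is assumed to be a parfive.Downloader with files queued.
    """
    results = downloader.download()
    attempts = 0
    # Stop as soon as no errors remain or the retry budget is used up.
    while results.errors and attempts < max_retries:
        results = downloader.retry(results)
        attempts += 1
    return results
```

Capping the number of attempts keeps the loop from running indefinitely when a URL fails permanently, for example because the file no longer exists.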