I. Syntax
```
onlylove@ubuntu:~$ wget -help
GNU Wget 1.20.3, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version                   display the version of Wget and exit
  -h,  --help                      print this help
  -b,  --background                go to background after startup
  -e,  --execute=COMMAND           execute a `.wgetrc'-style command

Logging and input file:
  -o,  --output-file=FILE          log messages to FILE
  -a,  --append-output=FILE        append messages to FILE
  -d,  --debug                     print lots of debugging information
  -q,  --quiet                     quiet (no output)
  -v,  --verbose                   be verbose (this is the default)
  -nv, --no-verbose                turn off verboseness, without being quiet
       --report-speed=TYPE         output bandwidth as TYPE.  TYPE can be bits
  -i,  --input-file=FILE           download URLs found in local or external FILE
  -F,  --force-html                treat input file as HTML
  -B,  --base=URL                  resolves HTML input-file links (-i -F)
                                     relative to URL
       --config=FILE               specify config file to use
       --no-config                 do not read any config file
       --rejected-log=FILE         log reasons for URL rejection to FILE

Download:
  -t,  --tries=NUMBER              set number of retries to NUMBER (0 unlimits)
       --retry-connrefused         retry even if connection is refused
       --retry-on-http-error=ERRORS  comma-separated list of HTTP errors to retry
  -O,  --output-document=FILE      write documents to FILE
  -nc, --no-clobber                skip downloads that would download to
                                     existing files (overwriting them)
       --no-netrc                  don't try to obtain credentials from .netrc
  -c,  --continue                  resume getting a partially-downloaded file
       --start-pos=OFFSET          start downloading from zero-based position OFFSET
       --progress=TYPE             select progress gauge type
       --show-progress             display the progress bar in any verbosity mode
  -N,  --timestamping              don't re-retrieve files unless newer than
                                     local
       --no-if-modified-since      don't use conditional if-modified-since get
                                     requests in timestamping mode
       --no-use-server-timestamps  don't set the local file's timestamp by
                                     the one on the server
  -S,  --server-response           print server response
       --spider                    don't download anything
  -T,  --timeout=SECONDS           set all timeout values to SECONDS
       --dns-timeout=SECS          set the DNS lookup timeout to SECS
       --connect-timeout=SECS      set the connect timeout to SECS
       --read-timeout=SECS         set the read timeout to SECS
  -w,  --wait=SECONDS              wait SECONDS between retrievals
       --waitretry=SECONDS         wait 1..SECONDS between retries of a retrieval
       --random-wait               wait from 0.5*WAIT...1.5*WAIT secs between retrievals
       --no-proxy                  explicitly turn off proxy
  -Q,  --quota=NUMBER              set retrieval quota to NUMBER
       --bind-address=ADDRESS      bind to ADDRESS (hostname or IP) on local host
       --limit-rate=RATE           limit download rate to RATE
       --no-dns-cache              disable caching DNS lookups
       --restrict-file-names=OS    restrict chars in file names to ones OS allows
       --ignore-case               ignore case when matching files/directories
  -4,  --inet4-only                connect only to IPv4 addresses
  -6,  --inet6-only                connect only to IPv6 addresses
       --prefer-family=FAMILY      connect first to addresses of specified family,
                                     one of IPv6, IPv4, or none
       --user=USER                 set both ftp and http user to USER
       --password=PASS             set both ftp and http password to PASS
       --ask-password              prompt for passwords
       --use-askpass=COMMAND       specify credential handler for requesting
                                     username and password.  If no COMMAND is
                                     specified the WGET_ASKPASS or the SSH_ASKPASS
                                     environment variable is used.
       --no-iri                    turn off IRI support
       --local-encoding=ENC        use ENC as the local encoding for IRIs
       --remote-encoding=ENC       use ENC as the default remote encoding
       --unlink                    remove file before clobber
       --xattr                     turn on storage of metadata in extended file attributes

Directories:
  -nd, --no-directories            don't create directories
  -x,  --force-directories         force creation of directories
  -nH, --no-host-directories       don't create host directories
       --protocol-directories      use protocol name in directories
  -P,  --directory-prefix=PREFIX   save files to PREFIX/..
       --cut-dirs=NUMBER           ignore NUMBER remote directory components

HTTP options:
       --http-user=USER            set http user to USER
       --http-password=PASS        set http password to PASS
       --no-cache                  disallow server-cached data
       --default-page=NAME         change the default page name (normally
                                     this is 'index.html'.)
  -E,  --adjust-extension          save HTML/CSS documents with proper extensions
       --ignore-length             ignore 'Content-Length' header field
       --header=STRING             insert STRING among the headers
       --compression=TYPE          choose compression, one of auto, gzip and none. (default: none)
       --max-redirect              maximum redirections allowed per page
       --proxy-user=USER           set USER as proxy username
       --proxy-password=PASS       set PASS as proxy password
       --referer=URL               include 'Referer: URL' header in HTTP request
       --save-headers              save the HTTP headers to file
  -U,  --user-agent=AGENT          identify as AGENT instead of Wget/VERSION
       --no-http-keep-alive        disable HTTP keep-alive (persistent connections)
       --no-cookies                don't use cookies
       --load-cookies=FILE         load cookies from FILE before session
       --save-cookies=FILE         save cookies to FILE after session
       --keep-session-cookies      load and save session (non-permanent) cookies
       --post-data=STRING          use the POST method; send STRING as the data
       --post-file=FILE            use the POST method; send contents of FILE
       --method=HTTPMethod         use method "HTTPMethod" in the request
       --body-data=STRING          send STRING as data. --method MUST be set
       --body-file=FILE            send contents of FILE. --method MUST be set
       --content-disposition       honor the Content-Disposition header when
                                     choosing local file names (EXPERIMENTAL)
       --content-on-error          output the received content on server errors
       --auth-no-challenge         send Basic HTTP authentication information
                                     without first waiting for the server's
                                     challenge

HTTPS (SSL/TLS) options:
       --secure-protocol=PR        choose secure protocol, one of auto, SSLv2,
                                     SSLv3, TLSv1, TLSv1_1, TLSv1_2 and PFS
       --https-only                only follow secure HTTPS links
       --no-check-certificate      don't validate the server's certificate
       --certificate=FILE          client certificate file
       --certificate-type=TYPE     client certificate type, PEM or DER
       --private-key=FILE          private key file
       --private-key-type=TYPE     private key type, PEM or DER
       --ca-certificate=FILE       file with the bundle of CAs
       --ca-directory=DIR          directory where hash list of CAs is stored
       --crl-file=FILE             file with bundle of CRLs
       --pinnedpubkey=FILE/HASHES  Public key (PEM/DER) file, or any number
                                     of base64 encoded sha256 hashes preceded by
                                     'sha256//' and separated by ';', to verify
                                     peer against
       --random-file=FILE          file with random data for seeding the SSL PRNG
       --ciphers=STR               Set the priority string (GnuTLS) or cipher list string (OpenSSL) directly.
                                     Use with care. This option overrides --secure-protocol.
                                     The format and syntax of this string depend on the specific SSL/TLS engine.

HSTS options:
       --no-hsts                   disable HSTS
       --hsts-file                 path of HSTS database (will override default)

FTP options:
       --ftp-user=USER             set ftp user to USER
       --ftp-password=PASS         set ftp password to PASS
       --no-remove-listing         don't remove '.listing' files
       --no-glob                   turn off FTP file name globbing
       --no-passive-ftp            disable the "passive" transfer mode
       --preserve-permissions      preserve remote file permissions
       --retr-symlinks             when recursing, get linked-to files (not dir)

FTPS options:
       --ftps-implicit             use implicit FTPS (default port is 990)
       --ftps-resume-ssl           resume the SSL/TLS session started in the control connection when
                                     opening a data connection
       --ftps-clear-data-connection  cipher the control channel only; all the data will be in plaintext
       --ftps-fallback-to-ftp      fall back to FTP if FTPS is not supported in the target server

WARC options:
       --warc-file=FILENAME        save request/response data to a .warc.gz file
       --warc-header=STRING        insert STRING into the warcinfo record
       --warc-max-size=NUMBER      set maximum size of WARC files to NUMBER
       --warc-cdx                  write CDX index files
       --warc-dedup=FILENAME       do not store records listed in this CDX file
       --no-warc-compression       do not compress WARC files with GZIP
       --no-warc-digests           do not calculate SHA1 digests
       --no-warc-keep-log          do not store the log file in a WARC record
       --warc-tempdir=DIRECTORY    location for temporary files created by the
                                     WARC writer

Recursive download:
  -r,  --recursive                 specify recursive download
  -l,  --level=NUMBER              maximum recursion depth (inf or 0 for infinite)
       --delete-after              delete files locally after downloading them
  -k,  --convert-links             make links in downloaded HTML or CSS point to
                                     local files
       --convert-file-only         convert the file part of the URLs only (usually known as the basename)
       --backups=N                 before writing file X, rotate up to N backup files
  -K,  --backup-converted          before converting file X, back up as X.orig
  -m,  --mirror                    shortcut for -N -r -l inf --no-remove-listing
  -p,  --page-requisites           get all images, etc. needed to display HTML page
       --strict-comments           turn on strict (SGML) handling of HTML comments

Recursive accept/reject:
  -A,  --accept=LIST               comma-separated list of accepted extensions
  -R,  --reject=LIST               comma-separated list of rejected extensions
       --accept-regex=REGEX        regex matching accepted URLs
       --reject-regex=REGEX        regex matching rejected URLs
       --regex-type=TYPE           regex type (posix|pcre)
  -D,  --domains=LIST              comma-separated list of accepted domains
       --exclude-domains=LIST      comma-separated list of rejected domains
       --follow-ftp                follow FTP links from HTML documents
       --follow-tags=LIST          comma-separated list of followed HTML tags
       --ignore-tags=LIST          comma-separated list of ignored HTML tags
  -H,  --span-hosts                go to foreign hosts when recursive
  -L,  --relative                  follow relative links only
  -I,  --include-directories=LIST  list of allowed directories
       --trust-server-names        use the name specified by the redirection
                                     URL's last component
  -X,  --exclude-directories=LIST  list of excluded directories
  -np, --no-parent                 don't ascend to the parent directory

Email bug reports, questions, discussions to <bug-wget@gnu.org>
and/or open issues at https://savannah.gnu.org/bugs/?func=additem&group=wget.
onlylove@ubuntu:~$
```
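II. Option reference
1. Startup
Option | Description |
---|---|
-V, --version | Display the version of Wget and exit |
-h, --help | Print this help |
-b, --background | Go to background after startup |
-e, --execute=COMMAND | Execute a `.wgetrc`-style command |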
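2. Logging and input file
Option | Description |
---|---|
-o, --output-file=FILE | Log messages to FILE |
-a, --append-output=FILE | Append messages to FILE |
-d, --debug | Print lots of debugging information |
-q, --quiet | Quiet (no output) |
-v, --verbose | Be verbose (this is the default) |
-nv, --no-verbose | Turn off verboseness, without being quiet |
--report-speed=TYPE | Output bandwidth as TYPE. TYPE can be bits |
-i, --input-file=FILE | Download URLs found in local or external FILE |
-F, --force-html | Treat the input file as HTML |
-B, --base=URL | Resolve HTML input-file links (-i -F) relative to URL |
--config=FILE | Specify the config file to use |
--no-config | Do not read any config file |
--rejected-log=FILE | Log reasons for URL rejection to FILE |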
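
As a rough illustration of how the logging and input-file options combine, the sketch below wraps them in a small shell function. The function name `fetch_list` and the file names in the usage line are hypothetical; only the `wget` flags come from the table above.

```shell
# fetch_list URL_FILE LOG_FILE   (hypothetical helper name)
# Downloads every URL listed in URL_FILE.
#   -nv  turn off verboseness without going fully quiet
#   -a   append messages to LOG_FILE instead of overwriting it
#   -i   read the URLs to download from URL_FILE
fetch_list() {
    wget -nv -a "$2" -i "$1"
}
```

A call such as `fetch_list urls.txt wget.log` would then run the batch and keep the messages in `wget.log`.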
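3. Download
Option | Description |
---|---|
-t, --tries=NUMBER | Set the number of retries to NUMBER (0 means unlimited) |
--retry-connrefused | Retry even if the connection is refused |
--retry-on-http-error=ERRORS | Comma-separated list of HTTP errors to retry |
-O, --output-document=FILE | Write documents to FILE |
-nc, --no-clobber | Skip downloads that would overwrite existing files |
--no-netrc | Don't try to obtain credentials from .netrc |
-c, --continue | Resume getting a partially downloaded file |
--start-pos=OFFSET | Start downloading from zero-based position OFFSET |
--progress=TYPE | Select the progress gauge type |
--show-progress | Display the progress bar in any verbosity mode |
-N, --timestamping | Don't re-retrieve files unless they are newer than the local copy |
--no-if-modified-since | Don't use conditional if-modified-since GET requests in timestamping mode |
--no-use-server-timestamps | Don't set the local file's timestamp from the one on the server |
-S, --server-response | Print the server response |
--spider | Don't download anything |
-T, --timeout=SECONDS | Set all timeout values to SECONDS |
--dns-timeout=SECS | Set the DNS lookup timeout to SECS |
--connect-timeout=SECS | Set the connect timeout to SECS |
--read-timeout=SECS | Set the read timeout to SECS |
-w, --wait=SECONDS | Wait SECONDS between retrievals |
--waitretry=SECONDS | Wait 1..SECONDS between retries of a retrieval |
--random-wait | Wait from `0.5*WAIT` to `1.5*WAIT` seconds between retrievals |
--no-proxy | Explicitly turn off the proxy |
-Q, --quota=NUMBER | Set the retrieval quota to NUMBER |
--bind-address=ADDRESS | Bind to ADDRESS (hostname or IP) on the local host |
--limit-rate=RATE | Limit the download rate to RATE |
--no-dns-cache | Disable caching of DNS lookups |
--restrict-file-names=OS | Restrict characters in file names to the ones OS allows |
--ignore-case | Ignore case when matching files/directories |
-4, --inet4-only | Connect only to IPv4 addresses |
-6, --inet6-only | Connect only to IPv6 addresses |
--prefer-family=FAMILY | Connect first to addresses of the specified family, one of IPv6, IPv4, or none |
--user=USER | Set both the ftp and http user to USER |
--password=PASS | Set both the ftp and http password to PASS |
--ask-password | Prompt for passwords |
--use-askpass=COMMAND | Specify the credential handler for requesting username and password. If no COMMAND is specified, the `WGET_ASKPASS` or the `SSH_ASKPASS` environment variable is used |
--no-iri | Turn off IRI support |
--local-encoding=ENC | Use ENC as the local encoding for IRIs |
--remote-encoding=ENC | Use ENC as the default remote encoding |
--unlink | Remove the file before clobbering it |
--xattr | Turn on storage of metadata in extended file attributes |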
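
The retry, timeout, and rate options are often used together for large files on unreliable links. The following is a minimal sketch of that combination; the helper name `fetch_resumable` and the specific numbers (5 tries, 10 s, 30 s, 500k) are illustrative assumptions, not recommended values.

```shell
# fetch_resumable URL   (hypothetical helper name; values are examples only)
#   -c                  resume a partially downloaded file
#   -t 5                retry up to 5 times
#   --waitretry=10      wait 1..10 seconds between retries
#   -T 30               set all timeouts to 30 seconds
#   --limit-rate=500k   cap the download rate
fetch_resumable() {
    wget -c -t 5 --waitretry=10 -T 30 --limit-rate=500k "$1"
}
```

Running the same call again after an interruption lets `-c` pick up from the partial file rather than starting over.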
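4. Directories
Option | Description |
---|---|
-nd, --no-directories | Don't create directories |
-x, --force-directories | Force creation of directories |
-nH, --no-host-directories | Don't create host directories |
--protocol-directories | Use the protocol name in directories |
-P, --directory-prefix=PREFIX | Save files to PREFIX/.. |
--cut-dirs=NUMBER | Ignore NUMBER remote directory components |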
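
The directory options decide where a recursively fetched tree lands on disk. A small sketch, assuming a hypothetical helper `fetch_subtree` and an example URL whose path has two leading components to drop:

```shell
# fetch_subtree URL DEST_DIR   (hypothetical helper name)
#   -r             recursive download
#   -np            don't ascend to the parent directory
#   -nH            don't create a directory named after the host
#   --cut-dirs=2   ignore the first two remote directory components
#   -P DEST_DIR    save everything under DEST_DIR
fetch_subtree() {
    wget -r -np -nH --cut-dirs=2 -P "$2" "$1"
}
```

With `fetch_subtree https://example.com/pub/tools/docs/ downloads`, the files should end up under `downloads/docs/` instead of `downloads/example.com/pub/tools/docs/`.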
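5. HTTP options
Option | Description |
---|---|
--http-user=USER | Set the http user to USER |
--http-password=PASS | Set the http password to PASS |
--no-cache | Disallow server-cached data |
--default-page=NAME | Change the default page name (normally 'index.html') |
-E, --adjust-extension | Save HTML/CSS documents with proper extensions |
--ignore-length | Ignore the 'Content-Length' header field |
--header=STRING | Insert STRING among the headers |
--compression=TYPE | Choose compression, one of auto, gzip and none (default: none) |
--max-redirect | Maximum redirections allowed per page |
--proxy-user=USER | Set USER as the proxy username |
--proxy-password=PASS | Set PASS as the proxy password |
--referer=URL | Include a 'Referer: URL' header in the HTTP request |
--save-headers | Save the HTTP headers to file |
-U, --user-agent=AGENT | Identify as AGENT instead of Wget/VERSION |
--no-http-keep-alive | Disable HTTP keep-alive (persistent connections) |
--no-cookies | Don't use cookies |
--load-cookies=FILE | Load cookies from FILE before the session |
--save-cookies=FILE | Save cookies to FILE after the session |
--keep-session-cookies | Load and save session (non-permanent) cookies |
--post-data=STRING | Use the POST method; send STRING as the data |
--post-file=FILE | Use the POST method; send the contents of FILE |
--method=HTTPMethod | Use method "HTTPMethod" in the request |
--body-data=STRING | Send STRING as data. --method MUST be set |
--body-file=FILE | Send the contents of FILE. --method MUST be set |
--content-disposition | Honor the Content-Disposition header when choosing local file names (EXPERIMENTAL) |
--content-on-error | Output the received content on server errors |
--auth-no-challenge | Send Basic HTTP authentication information without first waiting for the server's challenge |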
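
To show how `--header` and `--post-data` fit together, here is a hedged sketch of a JSON POST. The helper name `post_json` and the endpoint in the usage line are made up for illustration; `-q` and `-O -` come from the logging and download groups and send the response body to standard output.

```shell
# post_json URL JSON_BODY   (hypothetical helper name)
#   -q            no log output
#   -O -          write the response document to stdout
#   --header      add a Content-Type header to the request
#   --post-data   use POST and send JSON_BODY as the request data
post_json() {
    wget -q -O - \
         --header='Content-Type: application/json' \
         --post-data="$2" \
         "$1"
}
```

For example, `post_json https://example.com/api '{"k":1}'` would print whatever the (hypothetical) endpoint returns.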
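6. HTTPS (SSL/TLS) options
Option | Description |
---|---|
--secure-protocol=PR | Choose the secure protocol, one of auto, SSLv2, SSLv3, TLSv1, `TLSv1_1`, `TLSv1_2` and PFS |
--https-only | Only follow secure HTTPS links |
--no-check-certificate | Don't validate the server's certificate |
--certificate=FILE | Client certificate file |
--certificate-type=TYPE | Client certificate type, PEM or DER |
--private-key=FILE | Private key file |
--private-key-type=TYPE | Private key type, PEM or DER |
--ca-certificate=FILE | File with the bundle of CAs |
--ca-directory=DIR | Directory where the hash list of CAs is stored |
--crl-file=FILE | File with the bundle of CRLs |
--pinnedpubkey=FILE/HASHES | Public key (PEM/DER) file, or any number of base64-encoded sha256 hashes preceded by 'sha256//' and separated by ';', to verify the peer against |
--random-file=FILE | File with random data for seeding the SSL PRNG |
--ciphers=STR | Set the priority string (GnuTLS) or cipher list string (OpenSSL) directly. Use with care. This option overrides --secure-protocol. The format and syntax of this string depend on the specific SSL/TLS engine |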
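
When a server uses a private certificate authority, pointing Wget at the CA bundle is usually preferable to `--no-check-certificate`, which skips validation entirely. A minimal sketch, with a hypothetical helper name and placeholder paths:

```shell
# fetch_with_ca URL CA_FILE   (hypothetical helper name)
#   --secure-protocol=TLSv1_2   select the TLSv1_2 protocol setting
#   --ca-certificate=CA_FILE    verify the server against this CA bundle
fetch_with_ca() {
    wget --secure-protocol=TLSv1_2 --ca-certificate="$2" "$1"
}
```

A call might look like `fetch_with_ca https://intranet.example/file.tar.gz /etc/ssl/my-ca.pem`, where both the host and the bundle path are assumptions.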
7. HSTS options
Option | Description |
---|---|
--no-hsts | Disable HSTS |
--hsts-file | Path of the HSTS database (will override the default) |
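8. FTP options
Option | Description |
---|---|
--ftp-user=USER | Set the ftp user to USER |
--ftp-password=PASS | Set the ftp password to PASS |
--no-remove-listing | Don't remove '.listing' files |
--no-glob | Turn off FTP file name globbing |
--no-passive-ftp | Disable the "passive" transfer mode |
--preserve-permissions | Preserve remote file permissions |
--retr-symlinks | When recursing, get linked-to files (not directories) |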
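9. FTPS options
Option | Description |
---|---|
--ftps-implicit | Use implicit FTPS (default port is 990) |
--ftps-resume-ssl | Resume the SSL/TLS session started in the control connection when opening a data connection |
--ftps-clear-data-connection | Encrypt the control channel only; all data is sent in plaintext |
--ftps-fallback-to-ftp | Fall back to FTP if FTPS is not supported by the target server |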
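10. WARC options
Option | Description |
---|---|
--warc-file=FILENAME | Save request/response data to a .warc.gz file |
--warc-header=STRING | Insert STRING into the warcinfo record |
--warc-max-size=NUMBER | Set the maximum size of WARC files to NUMBER |
--warc-cdx | Write CDX index files |
--warc-dedup=FILENAME | Do not store records listed in this CDX file |
--no-warc-compression | Do not compress WARC files with GZIP |
--no-warc-digests | Do not calculate SHA1 digests |
--no-warc-keep-log | Do not store the log file in a WARC record |
--warc-tempdir=DIRECTORY | Location for temporary files created by the WARC writer |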
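
The WARC options can be layered on top of an ordinary download to keep an archival record of the exchange. The sketch below is one possible combination; `archive_page` and the base name in the usage line are hypothetical.

```shell
# archive_page URL BASENAME   (hypothetical helper name)
#   -p                     also fetch images etc. needed to display the page
#   --warc-file=BASENAME   save request/response data to a .warc.gz file
#   --warc-cdx             write a CDX index file alongside it
archive_page() {
    wget -p --warc-file="$2" --warc-cdx "$1"
}
```

For instance, `archive_page https://example.com/ snapshot` would be expected to leave a `snapshot.warc.gz` next to the normally downloaded files.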
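11. Recursive download
Option | Description |
---|---|
-r, --recursive | Specify recursive download |
-l, --level=NUMBER | Maximum recursion depth (inf or 0 for infinite) |
--delete-after | Delete files locally after downloading them |
-k, --convert-links | Make links in downloaded HTML or CSS point to local files |
--convert-file-only | Convert only the file part of the URLs (usually known as the basename) |
--backups=N | Before writing file X, rotate up to N backup files |
-K, --backup-converted | Before converting file X, back it up as X.orig |
-m, --mirror | Shortcut for -N -r -l inf --no-remove-listing |
-p, --page-requisites | Get all images, etc. needed to display the HTML page |
--strict-comments | Turn on strict (SGML) handling of HTML comments |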
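
A common use of the recursive options is keeping an offline copy of a site. The following is a sketch only; the helper name `mirror_site` and the one-second delay are assumptions chosen for the example.

```shell
# mirror_site URL   (hypothetical helper name)
#   -m     mirror: shortcut for -N -r -l inf --no-remove-listing
#   -k     rewrite links in downloaded HTML/CSS to point to local files
#   -p     fetch images etc. needed to display each page
#   -w 1   wait 1 second between retrievals to go easy on the server
mirror_site() {
    wget -m -k -p -w 1 "$1"
}
```

Because `-m` implies `-N`, re-running `mirror_site` on the same URL should skip files that are not newer than the local copies.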
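12. Recursive accept/reject
Option | Description |
---|---|
-A, --accept=LIST | Comma-separated list of accepted extensions |
-R, --reject=LIST | Comma-separated list of rejected extensions |
--accept-regex=REGEX | Regex matching accepted URLs |
--reject-regex=REGEX | Regex matching rejected URLs |
--regex-type=TYPE | Regex type (posix or pcre) |
-D, --domains=LIST | Comma-separated list of accepted domains |
--exclude-domains=LIST | Comma-separated list of rejected domains |
--follow-ftp | Follow FTP links from HTML documents |
--follow-tags=LIST | Comma-separated list of followed HTML tags |
--ignore-tags=LIST | Comma-separated list of ignored HTML tags |
-H, --span-hosts | Go to foreign hosts when recursive |
-L, --relative | Follow relative links only |
-I, --include-directories=LIST | List of allowed directories |
--trust-server-names | Use the name specified by the redirection URL's last component |
-X, --exclude-directories=LIST | List of excluded directories |
-np, --no-parent | Don't ascend to the parent directory |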
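
The accept/reject filters only take effect during a recursive crawl, so they are normally paired with `-r`. A minimal sketch, assuming a hypothetical helper `fetch_by_ext` and an arbitrary depth of 2:

```shell
# fetch_by_ext URL EXT_LIST   (hypothetical helper name; depth is an example)
#   -r            recursive download
#   -l 2          follow links at most 2 levels deep
#   -np           don't ascend to the parent directory
#   -A EXT_LIST   keep only files with these comma-separated extensions
fetch_by_ext() {
    wget -r -l 2 -np -A "$2" "$1"
}
```

Something like `fetch_by_ext https://example.com/papers/ pdf,zip` would crawl the directory and keep only the PDF and ZIP files.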
III. man wget
To be completed.