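I need to get two values from a web page's response headers using curl. I have been able to get each value individually with the following: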
    response1=$(curl -I -s http://www.example.com | grep HTTP/1.1 | awk {'print $2'})
    response2=$(curl -I -s http://www.example.com | grep Server: | awk {'print $2'})
But I can't figure out how to grep the values separately from a single curl request, along these lines:
    response=$(curl -I -s http://www.example.com)
    http_status=$response | grep HTTP/1.1 | awk {'print $2'}
    server=$response | grep Server: | awk {'print $2'}
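Every attempt results in either an error message or an empty value. I'm sure it's just a syntax issue.

Best answer

A pure bash solution. It also demonstrates how to easily parse other headers without needing awk: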
    shopt -s extglob # required to trim whitespace; see below

    while IFS=':' read -r key value; do
        # trim whitespace in "value" (this also removes the trailing CR)
        value=${value##+([[:space:]])}; value=${value%%+([[:space:]])}

        case "$key" in
            Server) SERVER="$value"
                    ;;
            Content-Type) CT="$value"
                    ;;
            # the status line has no colon, so it arrives whole in "key";
            # re-append ":value" only if a colon did split it
            HTTP*) read -r PROTO STATUS MSG <<< "$key${value:+:$value}"
                    ;;
        esac
    done < <(curl -sI http://www.google.com)
    echo "$STATUS"
    echo "$SERVER"
    echo "$CT"
Producing:
    302
    GFE/2.0
    text/html; charset=UTF-8
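For completeness, the grep/awk style from the question can also work with a single request. In the question's attempt, `http_status=$response | grep ...` is a pipeline whose first command is a bare assignment that prints nothing, so grep receives empty input; the saved text has to be written into the pipe and the result captured with `$(...)`. The following is a minimal sketch of that idea. It uses a canned response string (an assumption, so that it runs without network access) in place of `response=$(curl -sI http://www.example.com)`.

```bash
#!/usr/bin/env bash
# Canned stand-in (assumption) for: response=$(curl -sI http://www.example.com)
response=$'HTTP/1.1 200 OK\r\nServer: nginx\r\nContent-Type: text/html\r\n\r\n'

# HTTP header lines end in CRLF; drop the carriage returns once up front.
response=${response//$'\r'/}

# Write the saved text into each pipeline and capture the result.
# Quoting "$response" preserves the line breaks that grep relies on.
http_status=$(printf '%s\n' "$response" | grep '^HTTP/' | awk '{print $2}')
server=$(printf '%s\n' "$response" | grep -i '^Server:' | awk '{print $2}')

echo "$http_status"   # 200
echo "$server"        # nginx
```

As in the question, `awk '{print $2}'` keeps only the first whitespace-separated word of the header value, so a multi-word `Server` value would be truncated.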
According to RFC-2616, HTTP headers are modeled as described in "Standard for the Format of ARPA Internet Text Messages" (RFC822), which states clearly in section 3.1.2:
The field-name must be composed of printable ASCII characters
(i.e., characters that have values between 33. and 126.,
decimal, except colon). The field-body may be composed of any
ASCII characters, except CR or LF. (While CR and/or LF may be
present in the actual text, they are removed by the action of
unfolding the field.)
So the `while read` loop above should catch any RFC-[2]822-compliant header, with the notable exception of folded headers.
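If folded headers do need to be handled, one option is to unfold them before the parsing loop sees them. The sketch below is not part of the original answer: `unfold_headers` is a hypothetical helper name, and it assumes that any line beginning with a space or tab continues the previous header and that joining the pieces with a single space is acceptable.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): reads header lines on
# stdin and joins continuation lines onto the header line they belong to.
unfold_headers() {
    local line current='' started=0
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line%$'\r'}                        # strip the trailing CR, if any
        if (( started )) && [[ $line == [[:blank:]]* ]]; then
            # Continuation line: drop its leading blanks, append after one space.
            line=${line#"${line%%[![:blank:]]*}"}
            current+=" $line"
        else
            # A new header starts: flush the one collected so far.
            if (( started )); then printf '%s\n' "$current"; fi
            current=$line
            started=1
        fi
    done
    if (( started )); then printf '%s\n' "$current"; fi
}

# Canned input (assumption) with one folded header, instead of a live request.
sample=$'HTTP/1.1 200 OK\r\nX-Folded: first part,\r\n\tsecond part\r\nServer: nginx\r\n\r\n'
unfolded=$(unfold_headers <<< "$sample")
printf '%s\n' "$unfolded"
```

With real traffic, the same function could presumably sit between curl and the loop, for example `done < <(curl -sI http://www.google.com | unfold_headers)`.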