A cURL Usage Error, Summarized

Problem Description

Originally I wanted to write a script to download images from Unsplash. I installed axel via apt install axel to do the downloading (it is relatively fast) and used jq to parse the returned JSON and extract the download URL. To test that the command works, I saved the returned JSON in a file called unsplash.json:

[
  {
    "links": {
      "self": "https://api.unsplash.com/photos/3AzS4zAYaXk",
      "html": "https://unsplash.com/photos/3AzS4zAYaXk",
      "download": "https://unsplash.com/photos/3AzS4zAYaXk/download",
      "download_location": "https://api.unsplash.com/photos/3AzS4zAYaXk/download"
    }
  }
]

Call jq to parse it and pass the value of download to axel as a command-line argument:

jq -C .[].links.download unsplash.json | xargs -t -n 1 axel -o 2.jpg

## console:
axel -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download
Initializing download: https://unsplash.com/photos/3AzS4zAYaXk/download
Could not parse URL.

The output (stdout or stderr) says the URL cannot be parsed, yet the URL itself looks fine, and running axel -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download directly works without any problem. Since the error does not say what exactly is wrong with the URL, I switched to cURL:

jq -C .[].links.download unsplash.json | xargs -t -n 1 curl -o 2.jpg

## console:
curl -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download
curl: (3) [globbing] bad range in column 3

The request fails the same way, even though running curl -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download directly also succeeds. Clearly the URL that jq emits and that travels through the pipe and xargs is not what it appears to be, yet no special characters are visible when inspecting it by eye. Along the way I also tried saving the jq result in a variable and using that variable as the URL, for example:

curl -o 2.jpg $varUrl
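
Here varUrl had been assigned from the same jq output, presumably still with -C, along these lines:

varUrl=$(jq -C '.[].links.download' unsplash.json)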

That failed just the same, and I was completely baffled :laughing:.

Finding the Cause

When brains fall short, search engines make up for it: Baidu turned up nothing, so I opened StackOverflow, searched for bad range in column, and a similar question supplied the clue, or rather the answer.

The question, strange-characters-appearing-in-bash-variable-expansion, describes filtering a value out of JSON with grep, saving it in a variable, and then referencing that variable in the URL passed to curl:

pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep -i '"name": "myapp-' | cut -d '"' -f 4)

curl -g -u uname:password -k -X DELETE "very.cluster.com/api/v1/namespaces/default/pods/${pod_in_question}"

As a result, special escape characters appeared in the request URL. The cause was that in that bash environment grep effectively ran with --colour=always, so the filtered result contained ANSI color escape sequences. On terminals that support them these characters are invisible, but hexdump -C reveals them. The fix for that question was therefore grep --colour=never.
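
Applied to the commands above, the fix from that answer simply forces grep never to colorize:

pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep --colour=never -i '"name": "myapp-' | cut -d '"' -f 4)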

Back to my own problem; printing the jq output through hexdump:

jq -C .[].links.download unsplash.json | hexdump -C

00000000 1b 5b 30 3b 33 32 6d 22 68 74 74 70 73 3a 2f 2f |.[0;32m"https://|
00000010 75 6e 73 70 6c 61 73 68 2e 63 6f 6d 2f 70 68 6f |unsplash.com/pho|
00000020 74 6f 73 2f 33 41 7a 53 34 7a 41 59 61 58 6b 2f |tos/3AzS4zAYaXk/|
00000030 64 6f 77 6e 6c 6f 61 64 22 1b 5b 30 6d 0a |download".[0m.|
0000003e

The color escape sequences wrapping the URL are plainly visible (32 is the ANSI color code for green). Obviously this was caused by jq's -C option; dropping it, or passing -M (monochrome, don't colorize JSON), solves the problem.
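
With colorizing turned off, the pipeline works as expected. A corrected sketch (the -r flag is optional here; it just makes jq print the raw string without the surrounding quotes):

# -M: monochrome output (no ANSI color codes), -r: raw output without quotes
jq -M -r '.[].links.download' unsplash.json | xargs -t -n 1 axel -o 2.jpg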

Another answer to that StackOverflow question shows how to pin down this kind of cURL problem in the first place:

jq -C .[].links.download unsplash.json | xargs -t -n 1 curl -g --libcurl /tmp/libcurl -o 2.jpg	

cat /tmp/libcurl

curl_easy_setopt(hnd, CURLOPT_URL, "\033[0;32mhttps://unsplash.com/photos/3AzS4zAYaXk/download\033[0m");

Conclusion

  1. Quoting the best-practice advice for cURL from StackOverflow:

    The best practice for URL syntax in cURL:

    • If Variable Expansion is required:
      • Apply the -g switch to disable potential globbing done by cURL
    • Otherwise:
      • Use $variable as part of a “quoted” url string, instead of ${variable}
  2. When combining grep, jq and pipes, watch out for color escape sequences; to keep a script portable, disable colored output in every command that might produce it (see the sketch below).
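
A minimal sketch of both points, reusing the unsplash.json example from above (the variable name is arbitrary):

# expand the variable inside a quoted URL and pass -g to disable cURL's globbing
download_url=$(jq -M -r '.[].links.download' unsplash.json)
curl -g -o 2.jpg "$download_url"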

Hexo Blog Sync Script

To keep my local notes directory independent of the Hexo blog directory, I wrote a script that copies the directories, adds categories (front-matter), uploads images, deduplicates uploads with Redis, and then pushes everything to GitHub:

#!/bin/bash
#
# author: guo
# date: 2020-05-31
# description: an automatic way to update hexo posts, steps:
# 1. copy local directories containing markdown files to hexo source path
# 2. add front-matter before every hexo post
# 3. run hexo commands to deploy
#
# hexo front-matter is used to classify the posts in hexo, format:
# ---
# title: ls Invalid option
# date: 2020/05/30 22:21:59
# categories:
# - Linux
# tags:
# - Linux
# ---
# use the direct directory name containing md files as categories and tags value
# use the file modification time as date value
#

function log() {
    echo "[$(date +"%F %T")]: $@"
}

function convertUrl() {

    mdPath=$1
    picPath=$2

    # convert 'F:\shell_tool\a.png' to '/mnt/f/shell_tool/a.png'
    tempPath=$(echo -n "$picPath" | tr '\\' '/' | tr -d ':')
    tempPath=$(echo -ne "${tempPath}" | sed -e 's/^[[:space:]]*//' | sed -e 's/[[:space:]]*$//')

    linuxPicPath="/mnt/"$(echo -n ${tempPath,})

    if [ ! -f "$linuxPicPath" ]; then
        return
    fi

    echo -e "[Linux Pic Path]:\t"$linuxPicPath

    # the picture appears in the markdown as a Windows path like F:\shell_tool\a.png;
    # escape every backslash so sed can match it literally
    matchPath=$(echo -n "$picPath" | sed 's/\\/\\\\/g')
    # echo $matchPath

    key=$(echo -n $linuxPicPath | base64)
    setRes=$(redis-cli setnx $key 1)

    echo -e "[Redis Set]\t\t" $setRes
    if (($setRes == 0)); then

        picName=${linuxPicPath##*/}
        buildBedPicPath="https://raw.githubusercontent.com/weirdWimp/blog-store/main/img/"$picName
        echo -e "[ReBedPicPath]:\t\t"$buildBedPicPath

        eval sed -i 's#${matchPath}#${buildBedPicPath}#' '$mdPath'
        return
    fi

    upRes=$(picgo u $linuxPicPath)

    if [[ "$upRes" = *SUCCESS* ]]; then
        picBedUrl=${upRes##*SUCCESS]:}
        picBedUrl=$(echo $picBedUrl | tr -d '[:space:]')
        echo -e "[PicBed Path]\t\t"$picBedUrl
        eval sed -i 's#${matchPath}#${picBedUrl}#' '$mdPath'
    fi
}

## function to add front-matter
function addHeader() {
    dir=$1
    oldIFS=$IFS
    IFS=$(echo -ne "\x1c")
    for path in $(find $dir -type f -name "*.md" -exec printf {}"\x1c" \;); do
        file=${path##*/}
        title=${file%.*}
        crtdat=$(ls -l --time-style=+"%Y/%m/%d %T" $path | cut -d " " -f 6-7)
        categories=${dir##*/}
        tags=$categories
        head="---\ntitle: $title\ndate: $crtdat\ncategories:\n- $categories\ntags:\n- $tags\n---\n\n\n"
        sed -i "1i$head" $path

        pics=$(cat $path | grep -P '!\[.*\]\(.*\)' | sed -E 's/!\[.*\]\((.*)\)/\1/' | tr '\n' '\034')
        if [ -z "$pics" ]; then
            continue
        fi

        # echo "pictures:###"$pics"==="
        for pic in $pics; do
            convertUrl $path $pic
        done

        # sleep 1s
    done
    IFS=$oldIFS
}

filepath="/mnt/f/shell_tool/hexo_sync/sync.config"
basedir="/mnt/f/md-blog/weirdWimp.github.io"
postdir="/mnt/f/md-blog/weirdWimp.github.io/source/_posts"
while read line; do
    if [ -d "$line" ]; then
        # echo "$line exists"
        dirnam=${line##*/}
        targetDir="$postdir/$dirnam"
        if [ -d "$targetDir" ]; then
            # echo "deleting $targetDir"
            sudo rm -rf "$targetDir"
        fi
        mkdir -p "$targetDir"
        cp -r -p "$line" "$postdir"
        addHeader "$targetDir"
    fi
done <$filepath

# normalize ```sh / ```shell code-fence marks to ```bash
find "$postdir" -type f -name "*.md" -print0 | xargs -0 -n 1 sed -i -E 's/^[^`]*`{3,}sh(ell)?/```bash/'
find "$postdir" -type f -name "*.md" -print0 | xargs -0 -n 1 sed -i -E 's/^[^`]*`{3,}/```/g'

find "$postdir" -type f -name "*.png" -print0 | xargs -0 -n 1 sed -i -E 's/^[^`]*`{3,}/```/g'

cd $basedir || exit 1

log "start to clean..." >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
/usr/local/bin/hexo clean

log "start to generate..." >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
/usr/local/bin/hexo generate

log "start to deploy remote..." >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
/usr/local/bin/hexo deploy

echo -e "\n" >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
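
The sync.config file read by the while loop simply lists one source directory per line; the paths below are only placeholders:

/mnt/f/notes/Linux
/mnt/f/notes/Java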

Shell Variables

example

#!/usr/bin/env bash

# set default variable value
first_var=${1:-first}
echo ${first_var}

# set the default value for variable user if it does not have one
echo ${user:=second}

# warn
# var=${1:=defaultValue} ### FAIL with an error cannot assign in this way
# var=${1:-defaultValue} ### Perfect


# display error message
third_var=${3:?"Third argument is not defined or empty"}
fourth_var=${4:?"Fourth argument is not defined"}


# display error message and run command
fifth_var=${5:? "Fifth argument is not defined or empty and print current dir" $(pwd)}


# variable length
echo ${#var}

# strip string variable
msg="who.is.my.love"
# front strip
echo ${msg#*.} # shortest match from the front, result: is.my.love
echo ${msg##*.} # longest match from the front, result: love

# back strip
echo ${msg%.*} # shortest match from the end, result: who.is.my
echo ${msg%%.*} # longest match from the end, result: who

# substring
echo ${msg:4} # ${var:position}, result: is.my.love
echo ${msg:4:2} # ${var:position:length}, result: is

## convert case
echo ${msg^} # Who.is.my.love
echo ${msg^^} # WHO.IS.MY.LOVE

upper_msg="WHO.IS.MY.LOVE"
echo ${upper_msg,} # wHO.IS.MY.LOVE
echo ${upper_msg,,} # who.is.my.love

# Only convert the first character of $upper_msg if it is a capital 'H':
echo ${upper_msg,H} # no match (first char is 'W'), prints WHO.IS.MY.LOVE unchanged

# Want to get the names of variables whose names begin with prefix
VECH="Bus"
VECH1="Car"
VECH2="Train"
echo "${!VECH*}"


# print all L* variables' name and value
for var in ${!L*}; do
    echo "name: ${var}, value: ${!var}"
done

# name: LANG, value: C.UTF-8
# name: LESSCLOSE, value: /usr/bin/lesspipe %s %s
# name: LESSOPEN, value: | /usr/bin/lesspipe %s
# name: LINENO, value: 58
# name: LINES, value: 56
# name: LOGNAME, value: guo
# name: LS_COLORS, value: rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:

References

[1] How To Use Bash Parameter Substitution Like A Pro - nixCraft (cyberciti.biz)

decode

#!/bin/bash

orig=(01101101 01101001 01100100 01101110 01101001 01100111 01101000 01110100)
key=(01001101 01001101 01010100 01111110 01101111 01100001 01000000 00010100)

# XOR each pair of binary-encoded bytes and print the result in binary
for i in "${!orig[@]}"; do
    o=$(echo -n $((2#${orig[$i]})))   # binary string -> decimal
    k=$(echo -n $((2#${key[$i]})))
    # XOR, then convert the decimal result back to binary with bc, prefixing a leading 0
    echo $(($o ^ $k)) | xargs -n 1 | while read dec; do echo "ibase=10;obase=2;$dec" | bc | tr "\n" " " | sed 's/^/0/g'; done

done

Viewing Network Traffic and Ports

On Linux, the commonly used commands lsof and netstat can both list ports and the services running on them.

Test Environment

To exercise both commands, I deployed a simple Kafka broker on a local Ubuntu machine, listening on port 9092 at LAN IP 192.168.31.188. In my development environment (IP 192.168.31.51) I started a Kafka producer that sleeps for a while after each message it produces:

import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaUtil {

    public static void main(String[] args) {
        System.out.println("pid:" + getPid());
        KafkaProducer<String, String> producer = createProducer();
        for (int i = 0; i < 10; i++) {
            String time = LocalDateTime.now().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
            String message = "message at " + time;

            System.out.println("message: " + message);
            ProducerRecord<String, String> record = new ProducerRecord<>("test", message);
            producer.send(record, (r, e) -> {
                if (r != null) {
                    System.out.printf("topic:%s, partition:%s, offset:%s\n", r.topic(), r.partition(), r.offset());
                }

                if (e != null) {
                    e.printStackTrace();
                }
            });

            threadSleep(Duration.ofMinutes(10));
        }
        producer.flush();
    }

    private static KafkaProducer<String, String> createProducer() {
        Properties properties = new Properties();
        properties.put("bootstrap.servers", "192.168.31.188:9092");
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("acks", "all");
        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        return producer;
    }

    public static void threadSleep(Duration duration) {
        try {
            long millis = duration.toMillis();
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // ignore
        }
    }

    public static int getPid() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        String name = runtime.getName(); // format: "pid@hostname"
        try {
            return Integer.parseInt(name.substring(0, name.indexOf('@')));
        } catch (Exception e) {
            return -1;
        }
    }
}

On startup the program prints its PID: pid:23264. The same can be seen in the Windows Task Manager; when the program is launched from IDEA, it shows up as a child process under the IntelliJ IDEA entry on the Processes tab.


Windows also provides a netstat command to inspect port usage and the owning process. Filtering for process ID 23264, the foreign-address column shows the address the Kafka broker listens on (192.168.31.188:9092):

# -a show all connections and listening ports; -n show addresses and ports numerically; -o show the owning process ID of each connection
C:\Users\guo>netstat -ano | findstr "23264"
  Proto  Local Address          Foreign Address        State           PID
  TCP    127.0.0.1:50281        127.0.0.1:50280        ESTABLISHED     23264
  TCP    127.0.0.1:50282        127.0.0.1:50283        ESTABLISHED     23264
  TCP    127.0.0.1:50283        127.0.0.1:50282        ESTABLISHED     23264
  TCP    192.168.31.51:50286    192.168.31.188:9092    ESTABLISHED     23264
  TCP    192.168.31.51:50288    192.168.31.188:9092    ESTABLISHED     23264

lsof

On Linux, lsof (list open files) lists open files; the -i option selects open network files ("-i select IPv[46] files"). Checking port 9092 on the machine running the Kafka broker shows the corresponding connections:

# -n show addresses in numeric form (do not resolve names)
ph@guo-lenovo:~$ sudo lsof -i -n | grep 9092
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 174961 root 95u IPv6 2748921 0t0 TCP *:9092 (LISTEN)
java 174961 root 100u IPv6 2854156 0t0 TCP 192.168.31.188:9092->192.168.31.51:50286 (ESTABLISHED)
java 174961 root 101u IPv6 2854157 0t0 TCP 192.168.31.188:9092->192.168.31.51:50288 (ESTABLISHED)

In 192.168.31.188:9092->192.168.31.51:50286, the address before the arrow is the local end (Source/Local) and the address after it is the foreign end (Target/Foreign).

Printing all open ports

Given this output format, a simple pipeline prints all currently open ports:

sudo lsof -i | grep -Eo ":[0-9a-zA-Z]+->" | grep -Eo "[0-9a-zA-Z]+" | sort | uniq
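
The pipeline above only picks up ports that appear in established connections (the -> part); to list listening TCP ports as well, lsof can filter by state directly, for example:

# -n/-P: numeric addresses and ports, -iTCP: TCP sockets only, -sTCP:LISTEN: only listening sockets
sudo lsof -nP -iTCP -sTCP:LISTEN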

netstat

netstat on Linux behaves the same as the Windows version:

# -t tcp
# -n, --numeric don't resolve names
# -p, --programs display PID/Program name for sockets
ph@guo-lenovo:~$ netstat -tnp | grep 9092
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 192.168.31.188:9092 192.168.31.51:50288 ESTABLISHED -
tcp6 0 0 192.168.31.188:9092 192.168.31.51:50286 ESTABLISHED -