A cURL Usage Error, Summarized

Problem Description

Originally I wanted to write a script to download images from Unsplash. I installed axel via apt install axel to do the downloading (it is relatively fast) and used jq to parse the returned JSON and extract the download URL. To test that the command works, I saved the returned JSON in a file called unsplash.json:

[
  {
    "links": {
      "self": "https://api.unsplash.com/photos/3AzS4zAYaXk",
      "html": "https://unsplash.com/photos/3AzS4zAYaXk",
      "download": "https://unsplash.com/photos/3AzS4zAYaXk/download",
      "download_location": "https://api.unsplash.com/photos/3AzS4zAYaXk/download"
    }
  }
]

Call jq to parse it and pass the value of download to axel as a command-line argument:

jq -C .[].links.download unsplash.json | xargs -t -n 1 axel -o 2.jpg

## console:
axel -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download
Initializing download: https://unsplash.com/photos/3AzS4zAYaXk/download
Could not parse URL.

The output (stdout or stderr) says the URL cannot be parsed, yet the URL itself looks fine, and running axel -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download directly works without any problem. Since the error does not say what exactly is wrong with the URL, I switched to cURL:

jq -C .[].links.download unsplash.json | xargs -t -n 1 curl -o 2.jpg

## console:
curl -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download
curl: (3) [globbing] bad range in column 3

The request fails the same way, even though running curl -o 2.jpg https://unsplash.com/photos/3AzS4zAYaXk/download directly also succeeds. Clearly the URL that jq emits and that travels through the pipe and xargs is not what it appears to be, yet no special characters are visible when inspecting it by eye. Along the way I also tried saving the jq result in a variable and using that variable as the URL, for example:

curl -o 2.jpg $varUrl
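
Here varUrl had been assigned from the same jq output, presumably still with -C, along these lines:

varUrl=$(jq -C '.[].links.download' unsplash.json)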

That failed just the same, and I was completely baffled :laughing:.

Finding the Cause

When brains fall short, search engines make up for it: Baidu turned up nothing, so I opened StackOverflow, searched for bad range in column, and a similar question supplied the clue, or rather the answer.

The question, strange-characters-appearing-in-bash-variable-expansion, describes filtering a value out of JSON with grep, saving it in a variable, and then referencing that variable in the URL passed to curl:

pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep -i '"name": "myapp-' | cut -d '"' -f 4)

curl -g -u uname:password -k -X DELETE "very.cluster.com/api/v1/namespaces/default/pods/${pod_in_question}"

As a result, special escape characters appeared in the request URL. The cause was that in that bash environment grep effectively ran with --colour=always, so the filtered result contained ANSI color escape sequences. On terminals that support them these characters are invisible, but hexdump -C reveals them. The fix for that question was therefore grep --colour=never.
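
Applied to the commands above, the fix from that answer simply forces grep never to colorize:

pod_in_question=$(curl -u uname:password -k very.cluster.com/api/v1/namespaces/default/pods/ | grep --colour=never -i '"name": "myapp-' | cut -d '"' -f 4)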

Back to my own problem; printing the jq output through hexdump:

jq -C .[].links.download unsplash.json | hexdump -C

00000000 1b 5b 30 3b 33 32 6d 22 68 74 74 70 73 3a 2f 2f |.[0;32m"https://|
00000010 75 6e 73 70 6c 61 73 68 2e 63 6f 6d 2f 70 68 6f |unsplash.com/pho|
00000020 74 6f 73 2f 33 41 7a 53 34 7a 41 59 61 58 6b 2f |tos/3AzS4zAYaXk/|
00000030 64 6f 77 6e 6c 6f 61 64 22 1b 5b 30 6d 0a |download".[0m.|
0000003e

The color escape sequences wrapping the URL are plainly visible (32 is the ANSI color code for green). Obviously this was caused by jq's -C option; dropping it, or passing -M (monochrome, don't colorize JSON), solves the problem.
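
With colorizing turned off, the pipeline works as expected. A corrected sketch (the -r flag is optional here; it just makes jq print the raw string without the surrounding quotes):

# -M: monochrome output (no ANSI color codes), -r: raw output without quotes
jq -M -r '.[].links.download' unsplash.json | xargs -t -n 1 axel -o 2.jpg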

Another answer to that StackOverflow question shows how to pin down this kind of cURL problem in the first place:

jq -C .[].links.download unsplash.json | xargs -t -n 1 curl -g --libcurl /tmp/libcurl -o 2.jpg	

cat /tmp/libcurl

curl_easy_setopt(hnd, CURLOPT_URL, "\033[0;32mhttps://unsplash.com/photos/3AzS4zAYaXk/download\033[0m");

Conclusion

  1. Quoting the best-practice advice for cURL from StackOverflow:

    The best practice for URL syntax in cURL:

    • If Variable Expansion is required:
      • Apply the -g switch to disable potential globbing done by cURL
    • Otherwise:
      • Use $variable as part of a “quoted” url string, instead of ${variable}
  2. When combining grep, jq and pipes, watch out for color escape sequences; to keep a script portable, disable colored output in every command that might produce it (see the sketch below).
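
A minimal sketch of both points, reusing the unsplash.json example from above (the variable name is arbitrary):

# expand the variable inside a quoted URL and pass -g to disable cURL's globbing
download_url=$(jq -M -r '.[].links.download' unsplash.json)
curl -g -o 2.jpg "$download_url"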

Hexo Blog Sync Script

To keep my local notes directory independent of the Hexo blog directory, I wrote a script that copies the directories, adds categories (front-matter), uploads images, deduplicates uploads with Redis, and then pushes everything to GitHub:

#!/bin/bash
#
# author: guo
# date: 2020-05-31
# description: an automatic way to update hexo posts, steps:
# 1. copy local directories containing markdown files to hexo source path
# 2. add front-matter before every hexo post
# 3. run hexo commands to deploy
#
# hexo front-matter is used to classify the posts in hexo, format:
# ---
# title: ls Invalid option
# date: 2020/05/30 22:21:59
# categories:
# - Linux
# tags:
# - Linux
# ---
# use the direct directory name containing md files as categories and tags value
# use the file modification time as date value
#

function log() {
    echo "[$(date +"%F %T")]: $@"
}

function convertUrl() {

    mdPath=$1
    picPath=$2

    # convert 'F:\shell_tool\a.png' to '/mnt/f/shell_tool/a.png'
    tempPath=$(echo -n "$picPath" | tr '\\' '/' | tr -d ':')
    tempPath=$(echo -ne "${tempPath}" | sed -e 's/^[[:space:]]*//' | sed -e 's/[[:space:]]*$//')

    linuxPicPath="/mnt/"$(echo -n ${tempPath,})

    if [ ! -f "$linuxPicPath" ]; then
        return
    fi

    echo -e "[Linux Pic Path]:\t"$linuxPicPath

    # the picture appears in the markdown as a Windows path like F:\shell_tool\a.png;
    # escape every backslash so sed can match it literally
    matchPath=$(echo -n "$picPath" | sed 's/\\/\\\\/g')
    # echo $matchPath

    key=$(echo -n $linuxPicPath | base64)
    setRes=$(redis-cli setnx $key 1)

    echo -e "[Redis Set]\t\t" $setRes
    if (($setRes == 0)); then

        picName=${linuxPicPath##*/}
        buildBedPicPath="https://raw.githubusercontent.com/weirdWimp/blog-store/main/img/"$picName
        echo -e "[ReBedPicPath]:\t\t"$buildBedPicPath

        eval sed -i 's#${matchPath}#${buildBedPicPath}#' '$mdPath'
        return
    fi

    upRes=$(picgo u $linuxPicPath)

    if [[ "$upRes" = *SUCCESS* ]]; then
        picBedUrl=${upRes##*SUCCESS]:}
        picBedUrl=$(echo $picBedUrl | tr -d '[:space:]')
        echo -e "[PicBed Path]\t\t"$picBedUrl
        eval sed -i 's#${matchPath}#${picBedUrl}#' '$mdPath'
    fi
}

## function to add front-matter
function addHeader() {
    dir=$1
    oldIFS=$IFS
    IFS=$(echo -ne "\x1c")
    for path in $(find $dir -type f -name "*.md" -exec printf {}"\x1c" \;); do
        file=${path##*/}
        title=${file%.*}
        crtdat=$(ls -l --time-style=+"%Y/%m/%d %T" $path | cut -d " " -f 6-7)
        categories=${dir##*/}
        tags=$categories
        head="---\ntitle: $title\ndate: $crtdat\ncategories:\n- $categories\ntags:\n- $tags\n---\n\n\n"
        sed -i "1i$head" $path

        pics=$(cat $path | grep -P '!\[.*\]\(.*\)' | sed -E 's/!\[.*\]\((.*)\)/\1/' | tr '\n' '\034')
        if [ -z "$pics" ]; then
            continue
        fi

        # echo "pictures:###"$pics"==="
        for pic in $pics; do
            convertUrl $path $pic
        done

        # sleep 1s
    done
    IFS=$oldIFS
}

filepath="/mnt/f/shell_tool/hexo_sync/sync.config"
basedir="/mnt/f/md-blog/weirdWimp.github.io"
postdir="/mnt/f/md-blog/weirdWimp.github.io/source/_posts"
while read line; do
    if [ -d "$line" ]; then
        # echo "$line exists"
        dirnam=${line##*/}
        targetDir="$postdir/$dirnam"
        if [ -d "$targetDir" ]; then
            # echo "deleting $targetDir"
            sudo rm -rf "$targetDir"
        fi
        mkdir -p "$targetDir"
        cp -r -p "$line" "$postdir"
        addHeader "$targetDir"
    fi
done <$filepath

# normalize ```sh / ```shell code-fence marks to ```bash
find "$postdir" -type f -name "*.md" -print0 | xargs -0 -n 1 sed -i -E 's/^[^`]*`{3,}sh(ell)?/```bash/'
find "$postdir" -type f -name "*.md" -print0 | xargs -0 -n 1 sed -i -E 's/^[^`]*`{3,}/```/g'

find "$postdir" -type f -name "*.png" -print0 | xargs -0 -n 1 sed -i -E 's/^[^`]*`{3,}/```/g'

cd $basedir || exit 1

log "start to clean..." >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
/usr/local/bin/hexo clean

log "start to generate..." >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
/usr/local/bin/hexo generate

log "start to deploy remote..." >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
/usr/local/bin/hexo deploy

echo -e "\n" >>"/mnt/f/shell_tool/hexo_sync/run_date.log"
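
The sync.config file read by the while loop simply lists one source directory per line; the paths below are only placeholders:

/mnt/f/notes/Linux
/mnt/f/notes/Java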

Shell Variables

example

#!/usr/bin/env bash

# set default variable value
first_var=${1:-first}
echo ${first_var}

# set the default value for variable user if it does not have one
echo ${user:=second}

# warn
# var=${1:=defaultValue} ### FAIL with an error cannot assign in this way
# var=${1:-defaultValue} ### Perfect


# display error message
third_var=${3:?"Third argument is not defined or empty"}
fourth_var=${4:?"Fourth argument is not defined"}


# display error message and run command
fifth_var=${5:? "Fifth argument is not defined or empty and print current dir" $(pwd)}


# variable length
echo ${#var}

# strip string variable
msg="who.is.my.love"
# front strip
echo ${msg#*.} # shortest match from the front, result: is.my.love
echo ${msg##*.} # longest match from the front, result: love

# back strip
echo ${msg%.*} # shortest match from the end, result: who.is.my
echo ${msg%%.*} # longest match from the end, result: who

# substring
echo ${msg:4} # ${var:position}, result: is.my.love
echo ${msg:4:2} # ${var:position:length}, result: is

## convert case
echo ${msg^} # Who.is.my.love
echo ${msg^^} # WHO.IS.MY.LOVE

upper_msg="WHO.IS.MY.LOVE"
echo ${upper_msg,} # wHO.IS.MY.LOVE
echo ${upper_msg,,} # who.is.my.love

# Only convert the first character of $upper_msg if it is a capital 'H':
echo ${upper_msg,H} # no match (first char is 'W'), prints WHO.IS.MY.LOVE unchanged

# Want to get the names of variables whose names begin with prefix
VECH="Bus"
VECH1="Car"
VECH2="Train"
echo "${!VECH*}"


# print all L* variables' name and value
for var in ${!L*}; do
    echo "name: ${var}, value: ${!var}"
done

# name: LANG, value: C.UTF-8
# name: LESSCLOSE, value: /usr/bin/lesspipe %s %s
# name: LESSOPEN, value: | /usr/bin/lesspipe %s
# name: LINENO, value: 58
# name: LINES, value: 56
# name: LOGNAME, value: guo
# name: LS_COLORS, value: rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:

References

[1] How To Use Bash Parameter Substitution Like A Pro - nixCraft (cyberciti.biz)

decode

#!/bin/bash

orig=(01101101 01101001 01100100 01101110 01101001 01100111 01101000 01110100)
key=(01001101 01001101 01010100 01111110 01101111 01100001 01000000 00010100)

# XOR each pair of binary-encoded bytes and print the result in binary
for i in "${!orig[@]}"; do
    o=$(echo -n $((2#${orig[$i]})))   # binary string -> decimal
    k=$(echo -n $((2#${key[$i]})))
    # XOR, then convert the decimal result back to binary with bc, prefixing a leading 0
    echo $(($o ^ $k)) | xargs -n 1 | while read dec; do echo "ibase=10;obase=2;$dec" | bc | tr "\n" " " | sed 's/^/0/g'; done

done

Viewing Network Traffic and Ports

On Linux, the commonly used commands lsof and netstat can both list ports and the services running on them.

Test Environment

To exercise both commands, I deployed a simple Kafka broker on a local Ubuntu machine, listening on port 9092 at LAN IP 192.168.31.188. In my development environment (IP 192.168.31.51) I started a Kafka producer that sleeps for a while after each message it produces:

import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaUtil {

    public static void main(String[] args) {
        System.out.println("pid:" + getPid());
        KafkaProducer<String, String> producer = createProducer();
        for (int i = 0; i < 10; i++) {
            String time = LocalDateTime.now().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
            String message = "message at " + time;

            System.out.println("message: " + message);
            ProducerRecord<String, String> record = new ProducerRecord<>("test", message);
            producer.send(record, (r, e) -> {
                if (r != null) {
                    System.out.printf("topic:%s, partition:%s, offset:%s\n", r.topic(), r.partition(), r.offset());
                }

                if (e != null) {
                    e.printStackTrace();
                }
            });

            threadSleep(Duration.ofMinutes(10));
        }
        producer.flush();
    }

    private static KafkaProducer<String, String> createProducer() {
        Properties properties = new Properties();
        properties.put("bootstrap.servers", "192.168.31.188:9092");
        properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.put("acks", "all");
        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        return producer;
    }

    public static void threadSleep(Duration duration) {
        try {
            long millis = duration.toMillis();
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // ignore
        }
    }

    public static int getPid() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        String name = runtime.getName(); // format: "pid@hostname"
        try {
            return Integer.parseInt(name.substring(0, name.indexOf('@')));
        } catch (Exception e) {
            return -1;
        }
    }
}

On startup the program prints its PID: pid:23264. The same can be seen in the Windows Task Manager; when the program is launched from IDEA, it shows up as a child process under the IntelliJ IDEA entry on the Processes tab.


Windows also provides a netstat command to inspect port usage and the owning process. Filtering for process ID 23264, the foreign-address column shows the address the Kafka broker listens on (192.168.31.188:9092):

# -a show all connections and listening ports; -n show addresses and ports numerically; -o show the owning process ID of each connection
C:\Users\guo>netstat -ano | findstr "23264"
  Proto  Local Address          Foreign Address        State           PID
  TCP    127.0.0.1:50281        127.0.0.1:50280        ESTABLISHED     23264
  TCP    127.0.0.1:50282        127.0.0.1:50283        ESTABLISHED     23264
  TCP    127.0.0.1:50283        127.0.0.1:50282        ESTABLISHED     23264
  TCP    192.168.31.51:50286    192.168.31.188:9092    ESTABLISHED     23264
  TCP    192.168.31.51:50288    192.168.31.188:9092    ESTABLISHED     23264

lsof

On Linux, lsof (list open files) lists open files; the -i option selects open network files ("-i select IPv[46] files"). Checking port 9092 on the machine running the Kafka broker shows the corresponding connections:

# -n show addresses in numeric form (do not resolve names)
ph@guo-lenovo:~$ sudo lsof -i -n | grep 9092
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 174961 root 95u IPv6 2748921 0t0 TCP *:9092 (LISTEN)
java 174961 root 100u IPv6 2854156 0t0 TCP 192.168.31.188:9092->192.168.31.51:50286 (ESTABLISHED)
java 174961 root 101u IPv6 2854157 0t0 TCP 192.168.31.188:9092->192.168.31.51:50288 (ESTABLISHED)

In 192.168.31.188:9092->192.168.31.51:50286, the address before the arrow is the local end (Source/Local) and the address after it is the foreign end (Target/Foreign).

Printing all open ports

Given this output format, a simple pipeline prints all currently open ports:

sudo lsof -i | grep -Eo ":[0-9a-zA-Z]+->" | grep -Eo "[0-9a-zA-Z]+" | sort | uniq
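
The pipeline above only picks up ports that appear in established connections (the -> part); to list listening TCP ports as well, lsof can filter by state directly, for example:

# -n/-P: numeric addresses and ports, -iTCP: TCP sockets only, -sTCP:LISTEN: only listening sockets
sudo lsof -nP -iTCP -sTCP:LISTEN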

netstat

netstat on Linux behaves the same as the Windows version:

# -t tcp
# -n, --numeric don't resolve names
# -p, --programs display PID/Program name for sockets
ph@guo-lenovo:~$ netstat -tnp | grep 9092
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp6 0 0 192.168.31.188:9092 192.168.31.51:50288 ESTABLISHED -
tcp6 0 0 192.168.31.188:9092 192.168.31.51:50286 ESTABLISHED -