awk built-in functions - CFANZ programming community

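int(expr) Truncates expr to an integer (toward zero)

sqrt(expr) Square root of expr

rand() Returns a random number N with 0 <= N < 1

srand([expr]) Sets the seed of the random number generator to expr; if expr is omitted, the current time is used as the seed. Returns the previous seed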

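asort(a,b) Sorts the values of array a and stores the sorted values in a new array b, whose indices start at 1 (gawk extension)

asorti(a,b) Sorts the indices of array a and stores them as the values of b, indexed from 1 as above (gawk extension)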

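sub(r,s[,t]) Replaces the first match of regex r with s; the optional t is the target string or field (default $0)

gsub(r,s[,t]) Replaces every match of regex r with s; the optional t is the target string or field (default $0)

gensub(r,s,h[,t]) Replaces matches of regex r in t (default $0) with s and returns the result without modifying t; h is "g" to replace all matches, or a number selecting which match to replace (gawk extension)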

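index(s,t) Returns the position of string t within s, or 0 if t does not occur

length([s]) Returns the length of s (of $0 if s is omitted)

match(s,r[,a]) Returns the position in s where a match of regex r begins, or 0 if there is no match; also sets RSTART and RLENGTH. In gawk, the optional array a receives the matched text and any captured groups

split(s,a[,r[,seps]]) Splits s into array a using separator r (default FS) and returns the number of elements; in gawk, the optional array seps receives the separator strings found between elements

substr(s,i[,n]) Returns the substring of s starting at position i with length n; if n is omitted, returns the rest of the string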

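tolower(str) Converts all uppercase letters in str to lowercase

toupper(str) Converts all lowercase letters in str to uppercase

systime() Returns the current timestamp (seconds since the epoch)

strftime([format[,timestamp[,utc-flag]]]) Formats a timestamp as a date and time string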
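
The list above includes sqrt(), tolower(), and toupper(), which have no worked example later on. A minimal sketch, with input values chosen arbitrarily for illustration:

```shell
# Case conversion and square root in a single BEGIN block (no input needed).
out=$(awk 'BEGIN { print toupper("Awk"), tolower("Awk"), sqrt(16) }')
echo "$out"   # AWK awk 4
```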

int()

[root@study ~]# echo -e "123abc\nabc123\n123abc123"
123abc
abc123
123abc123
[root@study ~]# echo -e "123abc\nabc123\n123abc123"|awk '{print int($0)}'
123
0
123
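
The transcript above shows int() keeping only a leading numeric prefix. As a small sketch with made-up values, the same call also illustrates that truncation goes toward zero and that a string with no numeric prefix yields 0:

```shell
# int() drops the fractional part (toward zero) and ignores trailing non-digits.
out=$(awk 'BEGIN { print int(3.9), int(-3.9), int("12.7abc"), int("abc") }')
echo "$out"   # 3 -3 12 0
```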

rand() and srand()

# rand() alone does not give a new random number on each run; without a seed the value stays the same:
[root@study ~]# awk 'BEGIN{print rand()}'
0.237788
[root@study ~]# awk 'BEGIN{print rand()}'
0.237788
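# rand() only changes once srand() has been called, so the two functions are normally used together in awk. Because the default seed is the current time in seconds, two runs within the same second still produce the same value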
[root@study ~]# awk 'BEGIN{srand();print rand()}'
0.235636
[root@study ~]# awk 'BEGIN{srand();print rand()}'
0.608455
# To generate a random integer from 0 to 9 (add 1 to the result for the range 1 to 10):
[root@study ~]# awk 'BEGIN{srand();print int(rand()*10)}'
5
[root@study ~]# awk 'BEGIN{srand();print int(rand()*10)}'
1

Getting better random numbers requires some extra handling.
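
One possible kind of handling, sketched here under assumptions: pass an explicit seed to srand() when reproducible output is wanted, and shift the result by 1 to cover 1 to 10. The seed value 42 and the loop count are arbitrary choices for illustration.

```shell
# The same explicit seed gives the same sequence, which helps reproducible tests.
same=$(awk 'BEGIN { srand(42); a = rand(); srand(42); b = rand(); print (a == b) }')

# Draw 1000 integers with int(rand()*10)+1 and check that all fall within 1..10.
range=$(awk 'BEGIN {
    srand()
    min = 10; max = 1
    for (i = 0; i < 1000; i++) {
        n = int(rand() * 10) + 1
        if (n < min) min = n
        if (n > max) max = n
    }
    print (min >= 1 && max <= 10)
}')
echo "$same $range"   # 1 1
```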

asort() and asorti()

Sorting arrays

[root@study ~]# seq -f "str%.g" 5|awk '{a[x++]=$0}END{s=asort(a,b);for(i=1;i<=s;i++)print b[i],i}'
str1 1
str2 2
str3 3
str4 4
str5 5
[root@study ~]# seq -f "str%.g" 5|awk '{a[x++]=$0}END{s=asorti(a,b);for(i=1;i<=s;i++)print b[i],i}'
0 1
1 2
2 3
3 4
4 5

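asort() copies the values of array a into array b in sorted order and discards a's original indices. It returns the number of elements, which is assigned to s. The new array b is indexed from 1, and the loop then walks through it. asorti() works the same way, except that b receives the sorted indices of a instead of its values.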
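
Since the input above is already in order, the difference is easier to see with unsorted data. A minimal sketch, assuming gawk is installed (asort() and asorti() are gawk extensions); the array contents are made up:

```shell
# Skip quietly when gawk is unavailable, since other awks lack asort/asorti.
if command -v gawk >/dev/null 2>&1; then
    out=$(gawk 'BEGIN {
        a["z"] = "pear"; a["y"] = "apple"; a["x"] = "fig"
        n = asort(a, b)      # b holds the sorted values of a, indexed 1..n
        m = asorti(a, c)     # c holds the sorted indices of a, indexed 1..m
        print b[1], b[2], b[3]
        print c[1], c[2], c[3]
    }')
    echo "$out"
fi
```

With gawk this prints "apple fig pear" followed by "x y z".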

sub() and gsub()

# Replace the text matched by a regex
[root@study ~]# tail /etc/services |awk '/blp5/{sub(/tcp/,"icmp");print $0}'
blp5            48129/icmp               # Bloomberg locator
blp5            48129/udp               # Bloomberg locator
[root@study ~]# tail /etc/services |awk '/blp5/{gsub(/c/,"9");print $0}'
blp5            48129/t9p               # Bloomberg lo9ator
blp5            48129/udp               # Bloomberg lo9ator
[root@study ~]# echo 1 2 2 3 4 5 |awk 'gsub(2,7,$2){print $0}'
1 7 2 3 4 5
[root@study ~]# echo 1 2 2 3 4 5 |awk 'gsub(2,7){print $0}'
1 7 7 3 4 5
[root@study ~]# echo 1 2 2 a b c |awk 'gsub(/[0-9]/,'0'){print $0}'
0 0 0 a b c
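
The gsub() calls above are used as patterns because both functions return the number of replacements made. A short sketch with made-up strings showing the return values, an explicit target variable, and & standing for the matched text:

```shell
out=$(awk 'BEGIN {
    s = "a-b-c-d"
    t = s
    n1 = sub(/-/, "+", s)     # replaces only the first match, returns 1
    n2 = gsub(/-/, "+", t)    # replaces every match, returns the count
    print n1, s
    print n2, t
    u = "abc"
    gsub(/b/, "[&]", u)       # & in the replacement is the matched text
    print u
}')
echo "$out"
```

This prints "1 a+b-c-d", "3 a+b+c+d", and "a[b]c".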

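Add a line before or after a given line (the example appends one after line 2). The regex has to reach awk as a /.../ literal: in the first command the inner single quotes only close and reopen the shell quoting, so awk sees sub(/.*/, ...). In the second command "/.*/" is a string, which awk treats as a regex containing literal slashes, so nothing matches and the input is printed unchanged.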

[root@study ~]# seq 5|awk 'NR==2{sub('/.*/',"&\ntxt")}{print}'
1
2
txt
3
4
5
[root@study ~]# seq 5|awk 'NR==2{sub("/.*/","&\ntxt")}{print}'
1
2
3
4
5
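
gensub() from the function list has no example here. A minimal sketch, assuming gawk is installed (gensub() is a gawk extension); the sample string is made up:

```shell
# Skip quietly when gawk is unavailable.
if command -v gawk >/dev/null 2>&1; then
    out=$(gawk 'BEGIN {
        s = "a-b-c-d"
        print gensub(/-/, "+", 2, s)               # replace only the 2nd match
        print gensub(/-/, "+", "g", s)             # replace all matches
        print gensub(/(a)-(b)/, "\\2-\\1", 1, s)   # swap the captured groups
        print s                                    # s itself is not modified
    }')
    echo "$out"
fi
```

Unlike sub() and gsub(), gensub() returns the new string and leaves its target untouched, so the last line still prints "a-b-c-d".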

# Get the starting position of a string within a field
[root@study ~]# tail -5 /etc/services |awk '{print index($2,"tcp")}'
7
0
7
0
7
# Get the length of a field
[root@study ~]# tail -5 /etc/services |awk '{print length($2)}'
9
9
9
9
9
# Get the number of elements in an array
[root@study ~]# tail -5 /etc/services |awk '{a[$1]=$2}END{print length(a)}'
3
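
The same two functions on a fixed string, as a small sketch with made-up input: length() with no argument measures $0, and index() returns 0 when the string is absent.

```shell
out=$(echo "hello world" | awk '{ print length(), index($0, "world"), index($0, "xyz") }')
echo "$out"   # 11 7 0
```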

match

[root@study ~]# echo "123abc#456cde 789aaa#234bbb 999aaa#aaabbb"|xargs -n1|awk '{print match($0,234)}'
0
8
0
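# If the record matches 234, match() returns the position where the match starts; otherwise it returns 0.
# So, to print only the records that contain this string: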
[root@study ~]# echo "123abc#456cde 789aaa#234bbb 999aaa#aaabbb"|xargs -n1|awk '{if(match($0,234)!=0)print $0}'
789aaa#234bbb
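
match() also sets RSTART and RLENGTH, which can be passed to substr() to pull out the matched text. A minimal sketch; the regex is an arbitrary choice for illustration:

```shell
out=$(echo "789aaa#234bbb" | awk '{
    # RSTART is the start position of the match, RLENGTH its length.
    if (match($0, /[0-9]+b/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
}')
echo "$out"   # 8 4 234b
```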

split

# Split the record into array a (with the default separator FS, each line here stays in one piece)
[root@study ~]# echo -e "123#456#789\nabc#cde#fgh"|awk '{split($0,a);for(v in a)print a[v],v}'
123#456#789 1
abc#cde#fgh 1
# Split the record into array a on the # character
[root@study ~]# echo -e "123#456#789\nabc#cde#fgh"|awk '{split($0,a,"#");for(v in a)print a[v],v}'
123 1
456 2
789 3
abc 1
cde 2
fgh 3
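
The traversal order of for (v in a) is not guaranteed, so when order matters it may be safer to use the element count that split() returns. A minimal sketch along those lines:

```shell
out=$(echo "123#456#789" | awk '{
    n = split($0, a, "#")          # n is the number of pieces
    for (i = 1; i <= n; i++)       # walk the pieces in index order
        printf "%d:%s%s", i, a[i], (i < n ? " " : "\n")
}')
echo "$out"   # 1:123 2:456 3:789
```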

substr

# Extract from index 4 to the end of the string
[root@study ~]# echo -e "123#456#789\nabc#cde#fgh"|awk '{print substr($0,4)}'
#456#789
#cde#fgh
# Extract 5 characters starting at index 4
[root@study ~]# echo -e "123#456#789\nabc#cde#fgh"|awk '{print substr($0,4,5)}'
#456#
#cde#
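
substr() combines naturally with index() when the cut position is not known in advance. A small sketch that splits a made-up key=value string at the first "=":

```shell
out=$(echo "name=awk" | awk '{
    p = index($0, "=")                              # position of the separator
    print substr($0, 1, p - 1), substr($0, p + 1)   # text before and after it
}')
echo "$out"   # name awk
```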

Time handling

# Return the current timestamp
[root@study ~]# awk 'BEGIN{print systime()}'
1644305931
# Convert a timestamp to a date and time (in the local time zone)
[root@study ~]# echo 1644306666|awk '{print strftime("%Y-%m-%d %H:%M:%S",$0)}'
2022-02-08 15:51:06
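
The result above depends on the machine's time zone. A minimal sketch of the optional utc-flag argument, assuming gawk is installed (strftime() is a gawk extension); a non-zero third argument requests UTC:

```shell
# Skip quietly when gawk is unavailable.
if command -v gawk >/dev/null 2>&1; then
    out=$(gawk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 1644306666, 1) }')
    echo "$out"   # 2022-02-08 07:51:06
fi
```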