100 Linux Study Notes, No. 86: awk (part 3)

松鼠树屋 2021-09-19 Views: 46

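awk
Built-in variables:

FS: input field separator (Field Separator); setting it has the same effect as the -F option

RS: input record separator (Record Separator)

OFS: output field separator (Output Field Separator)

ORS: output record separator (Output Record Separator)

NF: number of fields in the current record; roughly, the number of columns

NR: number of input records read so far; roughly, the current line number

Custom variables, or values from the shell, can also be passed in with the -v option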
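
As a minimal sketch of how these variables work together, the snippet below runs awk on a tiny tab-separated sample. The sample data and the variable name min are made up for illustration and do not come from example.gtf.

```shell
# Hypothetical sample: two records, three tab-separated fields each.
sample=$(printf 'chr1\tgene\t100\nchr1\texon\t200\n')

# FS splits each input record on tabs; OFS joins the printed fields with a comma.
# NR is the number of the current record, NF the number of fields in it.
fields=$(printf '%s\n' "$sample" | awk 'BEGIN {FS="\t"; OFS=","} {print NR, NF, $2}')
printf '%s\n' "$fields"

# -v hands a shell value to awk as the awk variable "min" (a name chosen for this sketch).
# -F'\t' is the command-line equivalent of setting FS="\t" in BEGIN.
min=150
filtered=$(printf '%s\n' "$sample" | awk -F'\t' -v min="$min" '$3 > min {print $2}')
printf '%s\n' "$filtered"
```

The first command prints one line per record (record number, field count, second field), and the second prints only the second field of records whose third field is greater than min.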

$ less -S example.gtf | awk '{print $9}'
gene_id
gene_id
gene_id
gene_id
gene_id
gene_id
gene_id
gene_id

root 13:24:36 ~/data/Data
$ less -S example.gtf | awk 'BEGIN {FS="\t"} {print $9}'
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_coding"; gene_status "KNOWN"; gene_name "RP11-34P13.1"; transcript_type "protein_coding"; transcript_status "KNOWN"; transcript_name "RP11-34P13.1-201"; level 3; havana_gene "OTTHUMG00000000961";
$ less -S example.gtf | awk 'BEGIN {FS="\t"} {print NR,$9}' |less -S
1 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_co
2 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_co
3 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_co
4 gene_id "ENSG00000223972"; transcript_id "ENSG00000223972"; gene_type "protein_co
5 gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_co
6 gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_co
7 gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_co
8 gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_co
9 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_co
10 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_c
11 gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_c
12 gene_id "ENSG00000223972"; transcript_id "ENST00000450305"; gene_type "protein_c
13 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_c
14 gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_type "protein_c
:
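
The difference between the first two commands above comes from the default field separator: without FS, awk splits on any run of spaces or tabs, so the attribute column of a GTF file is broken into many fields. A small sketch of that contrast follows, using a shortened GTF-like record invented for illustration (not a real line from example.gtf).

```shell
# One made-up GTF-like record: 8 tab-separated columns plus an attribute column containing spaces.
line=$(printf 'chr1\tHAVANA\tgene\t11869\t14412\t.\t+\t.\tgene_id "ENSG00000223972"; level 3;')

# Default FS: spaces and tabs both separate fields, so $9 is only the first word.
by_space=$(printf '%s\n' "$line" | awk '{print $9}')

# FS="\t": only tabs separate fields, so the whole attribute column is $9.
by_tab=$(printf '%s\n' "$line" | awk 'BEGIN {FS="\t"} {print $9}')

printf '%s\n%s\n' "$by_space" "$by_tab"
```

Setting FS in a BEGIN block matters here because BEGIN runs before the first record is read, so the tab separator is already in place when the first line is split.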
