Word frequency statistics for the labels in a speech dataset

栖桐 2022-02-12 Views: 54


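import jieba
from collections import Counter

# Assumption: each line of the label file is "<utterance id>\t<transcript>".
with open("/home/dfy/Downloads/111time/zong_new.txt", "r", encoding="utf-8") as f:
    data = f.readlines()

c_list = []
for one in data:
    # Segment the transcript (second tab-separated field) in jieba's precise mode.
    # Strip the trailing newline first so that "\n" is not counted as a token.
    c_list += jieba.cut(one.rstrip("\n").split("\t")[1], cut_all=False)

# Word frequencies, sorted by count in ascending order.
print(sorted(Counter(c_list).items(), key=lambda d: d[1]))
# Character frequencies, sorted by count in ascending order.
print(sorted(Counter("".join(c_list)).items(), key=lambda d: d[1]))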
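The same counting can also be wrapped in a function so it can be tried on a few in-memory lines before pointing it at the real label file. The following is only a sketch under some assumptions: the name `count_labels` and its `tokenize` parameter are hypothetical and not part of the original script, each line is assumed to be tab-separated with the transcript in the second field, and, unlike the script above, lines without a tab are skipped and whitespace is not counted.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def count_labels(
    lines: Iterable[str],
    tokenize: Callable[[str], Iterable[str]],
) -> Tuple[Counter, Counter]:
    """Return (word_counts, char_counts) for tab-separated label lines.

    `tokenize` is any callable that splits a transcript into words
    (hypothetical parameter, introduced here for illustration).
    """
    word_counts: Counter = Counter()
    char_counts: Counter = Counter()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # no transcript field on this line, skip it
        text = fields[1]
        for word in tokenize(text):
            if word.strip():  # ignore whitespace-only tokens
                word_counts[word] += 1
        # Count every non-whitespace character of the transcript.
        char_counts.update(ch for ch in text if not ch.isspace())
    return word_counts, char_counts


if __name__ == "__main__":
    # Tiny made-up sample; str.split stands in for a real segmenter.
    sample = ["utt1\t今天 天气 好\n", "utt2\t天气 好\n", "malformed line\n"]
    words, chars = count_labels(sample, str.split)
    print(words.most_common())
    print(chars.most_common())
```

To get the segmentation used in the script above, `jieba.lcut` can be passed as `tokenize`, with the opened label file as `lines`. `Counter.most_common()` lists the most frequent items first, whereas the `sorted(...)` calls in the script list them last.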


