Python:IRIS数据的初步处理
此前一文分享了如何向IRIS申请数据,详细可参考(Python:向IRIS发送邮件申请事件或者连续数据)一文。但下载好数据后是一个个以.tar.mseed为结尾的数据包,该怎么对这个数据处理,得到sac文件?本文将简单介绍一下数据的转换过程,并提供一个简单的例子。
一. 将所有下载好的.tar.mseed为结尾的数据包放入mseed_data文件夹,执行01batch_decompress_data.py脚本,针对每一个.tar.mseed的数据包会生成.mseed,.meta,.xml三个文件。
import os
import sys
from glob import glob
if __name__ == "__main__":
tar_mseed_data_saved_rpath = "./mseed_data/"
tar_mseed_file_path_list = glob(tar_mseed_data_saved_rpath+"*.tar.mseed")
tar_mseed_file_path_list.sort()
file_saved_path = "./waveform_data/"
os.makedirs(file_saved_path, exist_ok=True)
# traverse every tar mseed file.
for i in range(0, len(tar_mseed_file_path_list)):
#for i in range(0, 1):
sgl_tar_mseed_file_path = tar_mseed_file_path_list[i]
cmd = "tar -xvf %s -C%s"%(sgl_tar_mseed_file_path, file_saved_path)
os.system(cmd)
二. 执行02extract_sac_response_files.py脚本。这个脚本中有两个子函数,extract_dataless_files()和extract_sac_and_response_files()。extract_dataless_files()函数是用来提取dataless文件,这里需要用到stationxml-seed-converter-2.1.0.jar,因此需要在电脑上安装java。extract_sac_and_response_files()函数是提取sac数据以及仪器响应文件,这里默认提取PZ文件,这里需要用到rdseed。提前还需要提供一个台站列表。
# 台站列表如下
"""
AC BCI 42.366600 20.067499 500.00
AC KBN 40.623600 20.787399 800.00
"""
import os
import sys
from glob import glob
def extract_dataless_files(root_path, station_info_path):
root_path = root_path + "waveform_data/"
# obtain the information of stations.
fileID = open(station_info_path, "r")
data_lines = fileID.readlines()
fileID.close()
net_sta_list = [".".join(dl.split()[0:2]) for dl in data_lines]
net_sta_list.sort()
for i in range(0, len(net_sta_list)):
sgl_net_sta = net_sta_list[i]
folder_path_list = glob(root_path + sgl_net_sta + "*")
if not len(folder_path_list):
print("There are some errors when we try to find folder with %s!"%(sgl_net_sta))
continue
folder_path = folder_path_list[0]
match_dataless = folder_path + "/" + "*.dataless"
if len(glob(match_dataless)) != 0:
print("There is already dataless file for %s!"%(sgl_net_sta))
continue
match_xml = folder_path + "/" + "*.xml"
match_xml_path = glob(match_xml)
if len(match_xml_path) == 0:
print("There is not xml file for %s!"%(sgl_net_sta))
continue
cmd = "java -jar stationxml-seed-converter-2.1.0.jar %s"%(match_xml_path[0])
os.system(cmd)
def extract_sac_and_response_files(root_path, station_info_path):
response_saved_path = root_path + "reponse_files"
os.makedirs(response_saved_path, exist_ok=True)
root_path = root_path + "waveform_data/"
# obtain the information of stations.
fileID = open(station_info_path, "r")
data_lines = fileID.readlines()
fileID.close()
net_sta_list = [".".join(dl.split()[0:2]) for dl in data_lines]
net_sta_list.sort()
for i in range(0, len(net_sta_list)):
#for i in range(0, 1):
sgl_net_sta = net_sta_list[i]
folder_path_list = glob(root_path + sgl_net_sta + "*")
if not len(folder_path_list):
print("There are some errors when we try to find folder with %s!"%(sgl_net_sta))
continue
folder_path = folder_path_list[0]
data_folder_path = folder_path + "/" + "data/"
if os.path.exists(data_folder_path):
print("There is already data folder for %s!"%(sgl_net_sta))
continue
os.makedirs(data_folder_path)
match_dataless = folder_path + "/" + "*.dataless"
dataless_path = glob(match_dataless)[0]
match_mseed = folder_path + "/" + "*.mseed"
mseed_path = glob(match_mseed)[0]
cmd = "rdseed -df %s -g %s -q %s"%(mseed_path, dataless_path, data_folder_path)
os.system(cmd)
cmd2 = "rdseed -pf %s -q %s"%(dataless_path, response_saved_path)
os.system(cmd2)
if __name__ == "__main__":
root_path = "./"
station_info_path = "./station_info"
#extract_dataless_files(root_path, station_info_path)
extract_sac_and_response_files(root_path, station_info_path)
以上是代码分享,提供的例子可以在此处下载https://download.csdn.net/download/u011563036/83602004。希望能帮到需要的人,有错误的地方望告知。
注:代码只在ubuntu18.04中做过测试。使用了3.6版本的python。
参考:
1、https://github.com/iris-edu/stationxml-seed-converter/releases(stationxml-seed-converter-2.1.0.jar的来源)
2. https://github.com/iris-edu-legacy/rdseed (rdseed的来源)