1. Requirements analysis
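For example, given SQL = "select * from t1 where name='DriverWon';", the goal is to extract the WHERE condition, name='DriverWon'.
Other forms are possible as well. The SQL statements that may be encountered are listed below: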
sql_list = [
    "select * from t1 where name='DriverWon';",
    "select * from t1 where name='DriverWon' order by 1 desc;",
    "select * from t1 where name='DriverWon' limit 10;",
    "select * from t1 where name='DriverWon' or id=10;",
    "select * from t1 where name='DriverWon' AND id>10 and is_del is not null order by 1 desc;",
    "select * from t1 where name='DriverWon' ;",
    "select * from t1;",
    "select * from t1 where id =(select id from t2 where name='DriverWon');",
]
2. Implementation
Extraction with the sqlparse parser
Parse the SQL with the sqlparse library and extract the WHERE clause.
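import sqlparse


def where_extract_sqlparse(sql):
    """
    Extract the WHERE clause from a SQL statement with the sqlparse parser.
    :param sql: SQL statement
    :return: list with the text of each top-level WHERE clause
    """
    # Tokens of the first parsed statement
    stmt = sqlparse.parse(sql)[0].tokens
    # Keep only the tokens that sqlparse grouped as a WHERE clause
    where_list = [s.value for s in stmt if isinstance(s, sqlparse.sql.Where)]
    return where_list


if __name__ == "__main__":
    for i, v in enumerate(sql_list):
        print(f"SQL{i + 1} result: {where_extract_sqlparse(v)}")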
Output: [figure: extraction results for each statement in sql_list]
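The text of a Where token still carries the leading where keyword and may end with whitespace or a semicolon, so it usually needs one more cleaning pass. Below is a minimal sketch of such a pass using only the standard library. The name clean_where_clause is hypothetical, and the sample strings are assumed shapes of the values returned by where_extract_sqlparse, not captured output.

```python
import re


def clean_where_clause(raw):
    """Strip the leading WHERE keyword and trailing whitespace/semicolons.

    Hypothetical helper; `raw` is assumed to be the text of one WHERE clause.
    """
    # Remove a leading "where" (any letter case) and the whitespace after it.
    text = re.sub(r'^\s*where\b\s*', '', raw, flags=re.IGNORECASE)
    # Remove trailing whitespace and semicolons.
    return text.rstrip('; \t\r\n')


if __name__ == "__main__":
    # Assumed sample inputs, for illustration only.
    samples = ["where name='DriverWon';", "WHERE name='DriverWon' ", "where id=10 ;"]
    for raw in samples:
        print(repr(clean_where_clause(raw)))
```

Because the pattern is anchored at the start of the string, a where keyword inside a subquery is left untouched.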
Extraction with re regular expressions
Use the re library to match and extract the WHERE conditions.
import re


def where_extract_re(sql):
    """
    Extract the WHERE conditions from a SQL statement with a regular expression.
    :param sql: SQL statement
    :return: '' if there is no WHERE clause, a string for a single condition,
             otherwise a list of conditions
    """
    # Capture the text between WHERE and the next ORDER BY, LIMIT, semicolon
    # or end of string (case-insensitive)
    match = re.search(r'where\s+([^;]*?(?=order by|limit|;|$))', sql, re.IGNORECASE)
    if not match:
        return ''
    result = match.group(1).strip()
    # Split on AND/OR only when they appear as separate words, in any letter case,
    # so that values such as 'Sandy' or 'Victor' are not cut apart
    parts = re.split(r'\s+(?:and|or)\s+', result, flags=re.IGNORECASE)
    return parts if len(parts) > 1 else result


if __name__ == "__main__":
    for i, v in enumerate(sql_list):
        print(f"SQL{i + 1} result: {where_extract_re(v)}")
Output: [figure: extraction results for each statement in sql_list]
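3. Analysis of the results
Judging from the output, the re approach fits the requirement described above better. On more complex SQL (deeply nested subqueries, for example), however, its results may be unsatisfactory. The sqlparse approach, in turn, requires a further pass over the extracted text to remove the where keyword, whitespace, and semicolons.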
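To make the subquery limitation concrete, here is a minimal sketch that applies the same regular expression to a nested query. The helper name naive_where and the sample query are hypothetical and were chosen only to trigger the problem: the ORDER BY belongs to the subquery, yet the pattern treats it as the end of the outer WHERE clause.

```python
import re

# Same pattern as in where_extract_re: capture the text after WHERE up to
# ORDER BY, LIMIT, a semicolon, or the end of the string.
WHERE_PATTERN = re.compile(r'where\s+([^;]*?(?=order by|limit|;|$))', re.IGNORECASE)


def naive_where(sql):
    """Return the raw text captured by the pattern, or '' when nothing matches."""
    match = WHERE_PATTERN.search(sql)
    return match.group(1).strip() if match else ''


if __name__ == "__main__":
    # Hypothetical query whose subquery contains its own ORDER BY and LIMIT.
    nested = ("select * from t1 where id in "
              "(select id from t2 order by id limit 1) and is_del = 0;")
    print(naive_where(nested))  # prints: id in (select id from t2
```

The captured text stops in the middle of the subquery and the outer condition on is_del is lost. This is why a real parser is the safer choice once nesting is involved.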