How to traverse a complex object structure in Python
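The code in question is psqlparse, a SQL parser built on libpg_query that uses Cython.
The repository is at https://github.com/alculquicondor/psqlparse. I only need a general approach. The program is not complicated and I can mostly follow it, but I have already spent a day on it. My question: given a complex object structure that mixes JSON-like data and lists, nested inside each other, how do I traverse it elegantly?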
So far I have been using __dict__. Is there a better way?
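To traverse a complex Python object structure, such as nested dicts, lists, or custom objects, recursion is the most direct approach. Below is a general recursive traversal function that handles the common data types: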
def traverse(obj, depth=0):
    indent = '  ' * depth
    if isinstance(obj, dict):
        print(f'{indent}Dict with {len(obj)} items:')
        for key, value in obj.items():
            print(f'{indent}  Key: {key}')
            traverse(value, depth + 2)
    elif isinstance(obj, (list, tuple, set)):
        type_name = obj.__class__.__name__
        print(f'{indent}{type_name} with {len(obj)} items:')
        for i, item in enumerate(obj):
            print(f'{indent}  Index {i}:')
            traverse(item, depth + 2)
    elif hasattr(obj, '__dict__'):
        # Custom object: descend into its instance attributes.
        print(f'{indent}Object of {obj.__class__.__name__}:')
        traverse(obj.__dict__, depth + 1)
    else:
        print(f'{indent}{repr(obj)}')

# Example usage
data = {
    'users': [
        {'name': 'Alice', 'age': 30, 'tags': ['admin', 'dev']},
        {'name': 'Bob', 'age': 25}
    ],
    'settings': {'debug': True, 'version': '1.0'}
}
traverse(data)
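This function walks the whole structure recursively and prints the type and contents at each level. For a dict it visits every key/value pair; for a list, tuple, or set it visits every element; for a custom object it reads the attributes through __dict__.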
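If printing is not what you want, the same recursion can be written as a generator that yields each leaf together with the path that leads to it. This is only a sketch: the name walk and the sample data are made up for illustration, and it assumes the objects involved expose their attributes through __dict__.

```python
def walk(obj, path=()):
    """Yield (path, leaf) pairs for every leaf value in a nested structure."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from walk(value, path + (key,))
    elif isinstance(obj, (list, tuple)):
        for index, item in enumerate(obj):
            yield from walk(item, path + (index,))
    elif hasattr(obj, '__dict__'):
        # Plain Python object: descend into its instance attributes.
        yield from walk(vars(obj), path)
    else:
        yield path, obj


sample = {'users': [{'name': 'Alice', 'tags': ['admin', 'dev']}], 'debug': True}
for path, value in walk(sample):
    print(path, '->', value)
```

Because the caller gets plain (path, value) tuples, filtering or searching becomes an ordinary loop or comprehension instead of another hand-written recursive function.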
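If you need to collect specific information instead of printing it, change the function so that it returns data. For example, to collect every string value: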
def collect_strings(obj):
    strings = []
    if isinstance(obj, dict):
        for value in obj.values():
            strings.extend(collect_strings(value))
    elif isinstance(obj, (list, tuple, set)):
        for item in obj:
            strings.extend(collect_strings(item))
    elif isinstance(obj, str):
        strings.append(obj)
    return strings

# Example usage
all_strings = collect_strings(data)
print(all_strings)  # prints every string value
Recursion is the least-effort way to handle nested structures.
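One thing to watch for with objects (as opposed to plain JSON-like data) is that nodes may point back at each other, and naive recursion then never terminates. The sketch below collects values of a given type, also descends into object attributes, and remembers visited containers by id. The Node class is a hypothetical stand-in, not a class from any library.

```python
def collect(obj, wanted_type, _seen=None):
    """Return every value of wanted_type found anywhere inside obj."""
    seen = set() if _seen is None else _seen
    found = []
    if isinstance(obj, wanted_type):
        found.append(obj)
    elif isinstance(obj, (dict, list, tuple, set)) or hasattr(obj, '__dict__'):
        if id(obj) in seen:  # already visited: a cycle or a shared node
            return found
        seen.add(id(obj))
        if isinstance(obj, dict):
            children = obj.values()
        elif isinstance(obj, (list, tuple, set)):
            children = obj
        else:
            children = vars(obj).values()
        for child in children:
            found.extend(collect(child, wanted_type, seen))
    return found


class Node:  # hypothetical stand-in for a parser node
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


tree = Node('root', [Node('a'), Node('b', [Node('c')])])
tree.children.append(tree)  # deliberate cycle back to the root
print(collect(tree, str))
```

A side effect of the id check is that a node reachable by two routes is only visited once, which is usually what you want when listing names.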
For example, take the following:
import psqlparse
query=r"select (b.script_name),'中' from (select * from temp.halfisolateworkjob20171221) a left join (select * from temp.script2table) b on a.schema_name||'.'||a.table_name=b.table_name where a.create_time <'20171201' and a.owner='app_vgop' and schema_name='SESSION' and process_flag=false and b.script_name is not null order by 1"
query2=r"insert into dis.td_bd_area_info_d SELECT A.DEAL_DATE,A.INT_ID,A.ZH_LABEL,A.COUNTY_ID, B.ZH_LABEL OUNTY_NAME,B.CITY_ID,case when B.city_id = '40' then '邢台市' when B.city_id = '33' then '秦皇岛市' when B.city_id = '41' then '邯郸市' when B.city_id = '34' then '沧州市' when B.city_id = '36' then '廊坊市' when B.city_id = '32' then '石家庄市' when B.city_id = '37' then '张家口市' when B.city_id = '38' then '保定市' when B.city_id = '42' then '唐山市' when B.city_id = '43' then '衡水市' when B.city_id = '39' then '承德市' ELSE '其他' END ,CASE WHEN A.CELL_SOURCE in ('铁通割接','无线宽带','新国标','自建','自建无线宽带') and A.COVER_TYPE IN ('0','1','2','3') THEN '自建有线' WHEN A.CELL_SOURCE in ('铁通割接','无线宽带','新国标','自建','自建无线宽带') and A.COVER_TYPE IN ('4') THEN '自建无线(WLAN)' WHEN A.CELL_SOURCE in ('铁通割接','无线宽带','新国标','自建','自建无线宽带') and A.COVER_TYPE IN ('6') THEN '自建无线( 4G )' WHEN A.CELL_SOURCE in ('铁通割接','无线宽带','新国标','自建','自建无线宽带') and A.COVER_TYPE IN ('6') THEN '自建无线( 4G )' WHEN A.CELL_SOURCE in ('第三方割接','第三方无线宽带') and A.COVER_TYPE IN ('0','1','2','3') THEN '三方有线' WHEN A.CELL_SOURCE in ('第三方割接','第三方无线宽带') and A.COVER_TYPE IN ('4') THEN '三方无线( WLAN )' else '其他' end,case when A.AREA_TYPE = '市区(含县城)' then '市区' when A.AREA_TYPE = '乡镇(含城乡结合部)' then '乡镇' when A.AREA_TYPE = '农村' then '农村' else '其他' end,A.CELL_SOURCE,A.COVER_TYPE,A.HOUSE_NUM,ROW_NUMBER() OVER (PARTITION BY A.INT_ID ORDER BY A.MODIFY_TIME DESC , B.MODIFY_TIME DESC ) RN FROM DW.TD_RMS_ADD_CELL_D A LEFT JOIN DW.TD_RMS_COUNTY_D B ON A.COUNTY_ID = B.INT_ID AND B.DEAL_DATE = 20170101 where A.DEAL_DATE = 20170101;"
statements = psqlparse.parse(query)
used_tables = statements[0]
dir(used_tables.from_clause.items[0])
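Every SQL statement is different, so what comes back is different too. This snippet works for a simple query, but in general I need to see everything inside statements, or traverse it; otherwise I do not know the structure and cannot process it.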
If the elements of statements are all the same type but carry different attributes, then sorry, __dict__ really is your only option.
If each element has a different __class__, you could write something like this:
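# node_type1 and some_nested_struct are placeholders for the real node classes.
def handle_each(obj):
    if isinstance(obj, node_type1):
        ...
    elif ...:
        ...
    else:
        ...

def handle(statements):
    for stmt in statements:
        if isinstance(stmt, some_nested_struct):  # nested: recurse into it
            handle(stmt)
        else:
            handle_each(stmt)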
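If the isinstance chain grows long, functools.singledispatch from the standard library expresses the same per-class dispatch with one small function per node type. The sketch below is illustrative only: Select, Insert, and tables_of are made-up names, not psqlparse classes or functions.

```python
from functools import singledispatch


class Select:  # hypothetical node classes, for illustration only
    def __init__(self, tables):
        self.tables = tables


class Insert:
    def __init__(self, target, source):
        self.target = target
        self.source = source


@singledispatch
def tables_of(node):
    """Fallback for node types that have no registered handler."""
    return []


@tables_of.register(Select)
def _(node):
    return list(node.tables)


@tables_of.register(Insert)
def _(node):
    # Nested statement: recurse into the source query.
    return [node.target] + tables_of(node.source)


@tables_of.register(list)
def _(node):
    return [table for item in node for table in tables_of(item)]


stmts = [Select(['a', 'b']), Insert('t', Select(['c']))]
print(tables_of(stmts))
```

Supporting a new node type then means registering one more function, and unknown types fall through to the default instead of raising.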
Check whether the library ships its own serializer. If it does, it most likely goes straight to XML or JSON.
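If it turns out there is no built-in serializer, the default hook of json.dumps can stand in for one, provided the node objects keep their attributes in __dict__ (an assumption to verify for the library at hand). The Node class and to_jsonable below are hypothetical names used only for this sketch.

```python
import json


class Node:  # hypothetical stand-in for a parsed node
    def __init__(self, **attrs):
        self.__dict__.update(attrs)


def to_jsonable(obj):
    """Called by json.dumps for any object it cannot serialize itself."""
    if hasattr(obj, '__dict__'):
        return {'_type': type(obj).__name__, **vars(obj)}
    return repr(obj)  # last resort for anything else


stmt = Node(
    target_list=[Node(name='script_name')],
    from_clause=Node(items=[Node(relname='script2table')]),
)
dumped = json.dumps(stmt, default=to_jsonable, indent=2, ensure_ascii=False)
print(dumped)
```

Dumping one parsed statement this way shows the whole structure at once, which makes it much easier to decide which attributes to read. Note that json.dumps raises ValueError on circular references, so this only works for tree-shaped data.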
OK, I'll give it a try.
Solved. Thank you very much.


