How to parse the Shanghai Stock Exchange quote file SJSHQ.dbf in Python (data from 巨灵数据)

I tried dbf, dbfread, and simpledbf, and all of them raise errors. For example:
b = dbfread.read('SJSXX.dbf', 'gb2312')
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/deprecated_dbf.py", line 49, in read
    return DeprecatedDBF(filename, load=True, **kwargs)
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/dbf.py", line 136, in __init__
    self.load()
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/deprecated_dbf.py", line 18, in load
    self[:] = self._iter_records(b' ')
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/dbf.py", line 316, in _iter_records
    for field in self.fields]
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/dbf.py", line 316, in <listcomp>
    for field in self.fields]
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/field_parser.py", line 79, in parse
    return func(field, data)
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/field_parser.py", line 87, in parseC
    return self.decode_text(data.rstrip(b'\0 '))
  File "/home/zhangyunfang/pymongo_py3/lib/python3.6/site-packages/dbfread/field_parser.py", line 45, in decode_text
    return decode_text(text, self.encoding, errors=self.char_decode_errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position 0: ordinal not in range(128)

Any help would be appreciated.

1 reply

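SJSHQ.dbf can be parsed with dbfread, a pure-Python library for reading DBF files. pandas has no built-in DBF reader, but the records that dbfread returns can be loaded into a DataFrame. The 'ascii' codec in the traceback shows that no Chinese encoding was actually applied when the fields were decoded, so pass the encoding explicitly as a keyword argument.

Install the library first:

pip install dbfread

Then write the parsing code: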

from dbfread import DBF

def parse_sjshq_dbf(file_path):
    """
    Parse an SJSHQ.dbf file.
    :param file_path: path to the DBF file
    :return: list of all records, or None if parsing fails
    """
    try:
        # Open the DBF file with an explicit encoding
        # (巨灵数据 files are usually GBK or GB2312)
        table = DBF(file_path, encoding='gbk')

        # Load every record into a list
        records = list(table)

        print(f"Parsed {len(records)} records")
        print("Field names:", table.field_names)

        # Show the first few records
        for i, record in enumerate(records[:3]):
            print(f"Record {i+1}: {record}")

        return records

    except Exception as e:
        print(f"Parsing failed: {e}")
        return None

# Usage example
if __name__ == "__main__":
    records = parse_sjshq_dbf("SJSHQ.dbf")

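If you want a pandas DataFrame:

import pandas as pd
from dbfread import DBF

def parse_with_pandas(file_path, encoding='gbk'):
    # pandas cannot read DBF directly (read_csv does not understand the
    # binary format), so build the DataFrame from dbfread records
    try:
        table = DBF(file_path, encoding=encoding)
        return pd.DataFrame(iter(table))
    except (OSError, UnicodeDecodeError) as e:
        print(f"Failed to parse {file_path}: {e}")
        return None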

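Watch the encoding. DBF files from 巨灵数据 are usually GBK-encoded. GB2312 is a subset of GBK, so if 'gbk' still fails or produces garbled text, 'gb18030' (a superset of GBK) or 'utf-8' are the more useful alternatives to try.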
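
To make the encoding point concrete, here is a small standard-library sketch of why the 'ascii' codec fails on byte 0xc9 and how a list of candidate encodings could be tried in turn. The helper name `decode_with_fallback` and the candidate order are assumptions made for illustration; they are not part of dbfread.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gbk", "gb18030")):
    """Return (text, encoding) for the first candidate that decodes raw strictly.

    Hypothetical helper: UTF-8 is tried first because GBK bytes rarely
    form valid UTF-8, whereas almost any byte pair is valid GBK.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode {raw!r}")


# GBK-encoded Chinese text; its first byte is 0xc9, the byte named in the traceback.
sample = "深圳".encode("gbk")

try:
    sample.decode("ascii")
except UnicodeDecodeError as exc:
    # ASCII only covers bytes 0-127, so any Chinese text fails here.
    print("ascii fails:", exc)

print(decode_with_fallback(sample))
```

With dbfread itself, the same idea amounts to retrying `DBF(path, encoding=...)` with each candidate until the records load without a UnicodeDecodeError.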

In short, reading the file directly with dbfread is enough.
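
If a file still refuses to load, it can help to look at what it actually contains. The following is a minimal sketch of a reader for the plain dBase III layout, using only the standard library. It assumes a 32-byte header, 32-byte field descriptors ending in 0x0D, and fixed-width records, and it returns every field as a stripped string. The function name `read_dbf_records` is hypothetical, and this is a diagnostic aid rather than a replacement for dbfread.

```python
import struct


def read_dbf_records(path, encoding="gbk"):
    """Yield each non-deleted record of a dBase III file as a dict of strings."""
    with open(path, "rb") as f:
        # Header bytes 4-7: record count; 8-9: header size; 10-11: record size.
        num_records, header_len, record_len = struct.unpack("<IHH", f.read(32)[4:12])
        f.seek(0)
        head = f.read(header_len)  # leaves the file positioned at the first record

        # Field descriptors are 32 bytes each and end with a 0x0D terminator byte.
        fields = []
        pos = 32
        while pos + 32 <= len(head) and head[pos] != 0x0D:
            name = head[pos:pos + 11].split(b"\x00", 1)[0].decode("ascii")
            length = head[pos + 16]  # field width in bytes
            fields.append((name, length))
            pos += 32

        for _ in range(num_records):
            rec = f.read(record_len)
            if len(rec) < record_len:
                break  # truncated file
            if rec[:1] == b"*":
                continue  # record is flagged as deleted
            row, offset = {}, 1  # byte 0 is the deletion flag
            for name, length in fields:
                raw = rec[offset:offset + length].rstrip(b"\x00 ")
                # errors="replace" keeps going if a double-byte character was cut off
                row[name] = raw.decode(encoding, errors="replace").strip()
                offset += length
            yield row
```

Calling `next(read_dbf_records("SJSHQ.dbf"))` would then show the field names and the first record, which is usually enough to tell whether the encoding or the file itself is the problem.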
