Using sxd-xpath, a Rust XML processing library: an XPath implementation for parsing and querying XML documents efficiently
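Overview
sxd-xpath is an XPath library for XML written in Rust. The SXD project it belongs to is split into two crates:
- sxd-document: basic DOM manipulation and reading and writing XML from strings
- sxd-xpath: an implementation of XPath 1.0 expressions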
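Installation
Add the dependencies to Cargo.toml (the examples below also import sxd_document, so both crates are listed):
sxd-document = "0.3"
sxd-xpath = "0.4.2"
Or run:
cargo add sxd-document sxd-xpath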
Complete example code
The following is a complete example that parses and queries an XML document with sxd-xpath:
use sxd_document::parser;
use sxd_xpath::{Context, Factory};
fn main() {
    // Sample XML document
    let xml = r#"
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J.K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
"#;
    // Parse the XML document
    let package = parser::parse(xml).expect("Failed to parse XML");
    let document = package.as_document();
    // Create the XPath factory
    let factory = Factory::new();
    // Compile the XPath expression
    let xpath = factory.build("//book[price>29]/title/text()")
        .expect("Failed to compile XPath");
    let xpath = xpath.expect("No XPath was compiled");
    // Create the evaluation context
    let context = Context::new();
    // Run the XPath query
    let result = xpath.evaluate(&context, document.root())
        .expect("Failed to evaluate XPath");
    // string() returns a String, not an Option. For a node-set it is the
    // string-value of the first node in document order, or "" if nothing matched.
    let value = result.string();
    if value.is_empty() {
        println!("No matching result");
    } else {
        println!("Query result: {}", value);
    }
}
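Code notes
- First, define a sample XML document as a string
- Parse it with sxd_document::parser::parse()
- Create an XPath factory, which builds XPath expressions
- Define the XPath expression //book[price>29]/title/text(), which selects the titles of books priced above 29
- Create an XPath context and run the query
- Finally, convert the result to a string and print it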
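Project goal
The goal of the SXD project is to replace libxml and libxslt with a pure-Rust XML processing solution.
License
The project is dual-licensed:
- Apache License 2.0
- MIT license
Contributing
- Fork the project
- Create a feature branch
- Add a failing test case
- Add code to make the test pass
- Commit your changes
- Make sure the tests pass
- Push the branch
- Create a Pull Request
The example above shows basic XML parsing and XPath querying with sxd-xpath. You can change the XPath expression to query different XML nodes.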
Complete example demo
Building on the above, here is a fuller example of processing an XML document with sxd-xpath:
use sxd_document::parser;
use sxd_xpath::{Context, Factory, Value};
fn main() {
    // A more complex XML document
    let xml = r#"
<library>
  <section name="Technology">
    <book id="101">
      <title>Rust Programming</title>
      <author>Zhang San</author>
      <published>2020</published>
      <price>45.99</price>
      <format>Print</format>
    </book>
    <book id="102">
      <title>Advanced Rust</title>
      <author>Li Si</author>
      <published>2022</published>
      <price>55.50</price>
      <format>E-book</format>
    </book>
  </section>
  <section name="Literature">
    <book id="201">
      <title>One Hundred Years of Solitude</title>
      <author>Gabriel García Márquez</author>
      <published>1967</published>
      <price>39.90</price>
      <format>Print</format>
    </book>
  </section>
</library>
"#;
    // Parse the XML document
    let package = parser::parse(xml).expect("Failed to parse XML");
    let document = package.as_document();
    // Create the XPath factory
    let factory = Factory::new();
    // Example 1: titles of all e-books
    let xpath1 = factory.build("//book[format='E-book']/title/text()")
        .expect("Failed to compile XPath")
        .expect("No XPath was compiled");
    // Example 2: technology books published in 2020 or later
    let xpath2 = factory.build("//section[@name='Technology']/book[number(published)>=2020]/title/text()")
        .expect("Failed to compile XPath")
        .expect("No XPath was compiled");
    // Create the evaluation context
    let context = Context::new();
    // Run the first query
    println!("All e-book titles:");
    match xpath1.evaluate(&context, document.root()) {
        Ok(result) => print_result(&result),
        Err(e) => println!("Query failed: {}", e),
    }
    // Run the second query
    println!("\nTechnology books published in 2020 or later:");
    match xpath2.evaluate(&context, document.root()) {
        Ok(result) => print_result(&result),
        Err(e) => println!("Query failed: {}", e),
    }
}
// Helper: print a query result
fn print_result(result: &Value) {
    match result {
        Value::Nodeset(nodes) => {
            // Nodeset exposes size(); it is used here to detect an empty result
            if nodes.size() == 0 {
                println!("No matching result");
            } else {
                // document_order() returns the nodes sorted in document order
                for node in nodes.document_order() {
                    println!("- {}", node.string_value());
                }
            }
        }
        Value::String(s) => println!("{}", s),
        Value::Boolean(b) => println!("{}", b),
        Value::Number(n) => println!("{}", n),
    }
}
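This complete example demonstrates:
- Handling a more complex XML document structure
- Several XPath queries
- Sturdier error handling
- A helper function for processing results
- Different kinds of XPath conditions (attributes, text values, numeric comparisons, and so on)
You can change the XML content and the XPath expressions to run different queries.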
1 reply
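sxd-xpath is an XPath library for Rust, built for querying XML documents. It implements XPath 1.0 and can locate and extract data from a parsed XML document.
Installation
Add the dependencies to Cargo.toml: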
[dependencies]
sxd-xpath = "0.4"
sxd-document = "0.3"
Basic usage
1. Parsing an XML document
use sxd_document::parser;
let xml = r#"<books>
<book id="1">
<title>Rust in Action</title>
<author>Tim McNamara</author>
<price>49.99</price>
</book>
<book id="2">
<title>The Rust Programming Language</title>
<author>Steve Klabnik</author>
<price>39.99</price>
</book>
</books>"#;
let package = parser::parse(xml).expect("Failed to parse XML");
let document = package.as_document();
2. Running an XPath query
use sxd_xpath::{Context, Factory, Value};
let factory = Factory::new();
let xpath = factory.build("//book[price > 40]/title").expect("Invalid XPath");
let xpath = xpath.expect("No XPath was compiled");
// evaluate() takes an evaluation Context and the node to start from
let context = Context::new();
let value = xpath.evaluate(&context, document.root()).expect("XPath evaluation failed");
match value {
    Value::Nodeset(nodes) => {
        for node in nodes.document_order() {
            println!("Title: {}", node.string_value());
        }
    }
    _ => println!("Unexpected result type"),
}
Advanced features
1. Using variables
use sxd_xpath::{Context, Factory, Value};
let factory = Factory::new();
let xpath = factory.build("//book[price > $min_price]/title").expect("Invalid XPath");
let xpath = xpath.expect("No XPath was compiled");
// Variables are bound on the Context that is passed to evaluate()
let mut context = Context::new();
context.set_variable("min_price", Value::Number(40.0));
let value = xpath.evaluate(&context, document.root()).expect("XPath evaluation failed");
2. Handling namespaces
use sxd_document::parser;
use sxd_xpath::{Context, Factory};
let xml = r#"<ns:books xmlns:ns="http://example.com/books">
  <ns:book id="1">
    <ns:title>Rust in Action</ns:title>
  </ns:book>
</ns:books>"#;
let package = parser::parse(xml).expect("Failed to parse XML");
let document = package.as_document();
let factory = Factory::new();
let xpath = factory.build("//ns:book/ns:title").expect("Invalid XPath");
let xpath = xpath.expect("No XPath was compiled");
// Prefixes used in the expression are mapped to namespace URIs on the Context
let mut context = Context::new();
context.set_namespace("ns", "http://example.com/books");
let value = xpath.evaluate(&context, document.root()).expect("XPath evaluation failed");
3. Custom functions
use sxd_xpath::{context, function, Context, Factory, Value};
struct CustomFunction;
impl function::Function for CustomFunction {
    fn evaluate<'c, 'd>(
        &self,
        _context: &context::Evaluation<'c, 'd>,
        _args: Vec<Value<'d>>,
    ) -> Result<Value<'d>, function::Error> {
        // Implement the custom logic here
        Ok(Value::String("custom result".to_string()))
    }
}
let factory = Factory::new();
let xpath = factory.build("custom-fn()").expect("Invalid XPath");
let xpath = xpath.expect("No XPath was compiled");
// Functions are registered on the Context, not on the Factory
let mut context = Context::new();
context.set_function("custom-fn", CustomFunction);
let value = xpath.evaluate(&context, document.root()).expect("XPath evaluation failed");
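Performance tips
- Keep and reuse the compiled XPath value returned by Factory::build(). As far as the public API shows, the Factory itself does not cache compiled expressions, so reusing only the Factory saves little
- For queries that run repeatedly, compile the XPath expression once, ahead of time
- sxd-document builds the whole document in memory, so for very large XML inputs consider a streaming parser from another crate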
Error handling
match parser::parse(xml) {
    Ok(package) => {
        // Parsing succeeded
    },
    Err(e) => {
        eprintln!("XML parse error: {}", e);
    }
}
match factory.build(xpath_str) {
    Ok(Some(compiled)) => {
        // The XPath compiled successfully
    },
    Ok(None) => {
        eprintln!("The XPath expression is empty");
    },
    Err(e) => {
        eprintln!("XPath syntax error: {}", e);
    }
}
sxd-xpath implements XPath 1.0 and suits Rust applications that need precise queries over XML documents. Used sensibly, it extracts and processes XML data efficiently.
Complete example code
Here is a complete example that parses an XML document with sxd-xpath and runs XPath queries against it:
use sxd_document::parser;
use sxd_xpath::{Context, Factory, Value};
fn main() {
    // Sample XML data
    let xml = r#"<books>
  <book id="1">
    <title>Rust in Action</title>
    <author>Tim McNamara</author>
    <price>49.99</price>
  </book>
  <book id="2">
    <title>The Rust Programming Language</title>
    <author>Steve Klabnik</author>
    <price>39.99</price>
  </book>
  <book id="3">
    <title>Rust for Rustaceans</title>
    <author>Jon Gjengset</author>
    <price>44.99</price>
  </book>
</books>"#;
    // Parse the XML document
    let package = parser::parse(xml).expect("Failed to parse XML");
    let document = package.as_document();
    // Create the XPath factory and the evaluation context
    let factory = Factory::new();
    let mut context = Context::new();
    // Example 1: basic XPath query
    println!("=== All book titles ===");
    let xpath = factory.build("//book/title").expect("Invalid XPath");
    if let Some(compiled) = xpath {
        let value = compiled.evaluate(&context, document.root()).expect("XPath evaluation failed");
        if let Value::Nodeset(nodes) = value {
            for node in nodes.document_order() {
                println!("Title: {}", node.string_value());
            }
        }
    }
    // Example 2: XPath query with a condition
    println!("\n=== Books priced above 45 ===");
    let xpath = factory.build("//book[price > 45]").expect("Invalid XPath");
    if let Some(compiled) = xpath {
        let value = compiled.evaluate(&context, document.root()).expect("XPath evaluation failed");
        if let Value::Nodeset(nodes) = value {
            for node in nodes.document_order() {
                // Find a child element of the book by its local name
                let child = |name: &str| {
                    node.children()
                        .into_iter()
                        .find(|n| n.element().map_or(false, |e| e.name().local_part() == name))
                        .expect("missing child element")
                };
                let title = child("title");
                let price = child("price");
                println!("{} - {}", title.string_value(), price.string_value());
            }
        }
    }
    // Example 3: using a variable
    println!("\n=== Query using a variable ===");
    let xpath = factory.build("//book[price > $min_price]").expect("Invalid XPath");
    if let Some(compiled) = xpath {
        // Variables are bound on the Context that is passed to evaluate()
        context.set_variable("min_price", Value::Number(40.0));
        let value = compiled.evaluate(&context, document.root())
            .expect("XPath evaluation failed");
        if let Value::Nodeset(nodes) = value {
            println!("Found {} books priced above 40", nodes.size());
        }
    }
}
This complete example demonstrates:
- How to parse an XML document
- Running a basic XPath query
- Filtering results with a conditional expression
- Using a variable in an XPath query
- Processing the results and extracting the data you need
The output will be:
=== All book titles ===
Title: Rust in Action
Title: The Rust Programming Language
Title: Rust for Rustaceans

=== Books priced above 45 ===
Rust in Action - 49.99

=== Query using a variable ===
Found 2 books priced above 40