Rust PostgreSQL查询解析库pg_query的使用,支持SQL语法解析与查询分析
pg_query.rs
这个Rust库使用实际的PostgreSQL服务器源代码来解析SQL查询并返回PostgreSQL内部解析树。
它还允许您规范化查询(用$1等替换常量值),并将这些规范化查询再次解析为解析树。
当您构建此库时,它会构建PostgreSQL服务器源代码的一部分,然后将其静态链接到此库中。
开始使用
将以下内容添加到您的Cargo.toml
[dependencies]
pg_query = "5.1"
示例
解析查询
use pg_query::NodeRef;
let result = pg_query::parse(“SELECT * FROM contacts”);
assert!(result.is_ok());
let result = result.unwrap();
assert_eq!(result.tables(), vec![“contacts”]);
assert!(matches!(result.protobuf.nodes()[0].0, NodeRef::SelectStmt(_)));
规范化查询
let result = pg_query::normalize("SELECT 1 FROM x WHERE y = (SELECT 123 FROM a WHERE z = 'bla')").unwrap();
assert_eq!(result, "SELECT $1 FROM x WHERE y = (SELECT $2 FROM a WHERE z = $3)");
查询指纹
let result = pg_query::fingerprint("SELECT * FROM contacts.person WHERE id IN (1, 2, 3, 4);").unwrap();
assert_eq!(result.hex, "643d2a3c294ab8a7");
截断查询
let query = "INSERT INTO \"x\" (a, b, c, d, e, f) VALUES (?)";
let result = pg_query::parse(query).unwrap();
assert_eq!(result.truncate(32).unwrap(), "INSERT INTO x (...) VALUES (...)");
完整示例代码
use pg_query::NodeRef;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// 示例1: 解析查询
let parse_result = pg_query::parse(“SELECT * FROM contacts”)?;
println!(“解析的表: {:?}”, parse_result.tables());
// 示例2: 规范化查询
let normalized = pg_query::normalize("SELECT 1 FROM x WHERE y = (SELECT 123 FROM a WHERE z = 'bla')")?;
println!("规范化查询: {}", normalized);
// 示例3: 查询指纹
let fingerprint_result = pg_query::fingerprint("SELECT * FROM contacts.person WHERE id IN (1, 2, 3, 4);")?;
println!("查询指纹: {}", fingerprint_result.hex);
// 示例4: 截断查询
let truncate_query = "INSERT INTO \"x\" (a, b, c, d, e, f) VALUES (?)";
let truncate_result = pg_query::parse(truncate_query)?;
let truncated = truncate_result.truncate(32)?;
println!("截断后的查询: {}", truncated);
Ok(())
}
致谢
感谢Paul Mason在pg_parse上的工作,这个crate基于此。
在0.6.0版本之后,Paul将pg_query crate捐赠给了pganalyze团队。pg_parse是一个更轻量级的替代方案,专注于查询解析,而pg_query旨在与Ruby gem具有功能对等性。
许可证
PostgreSQL服务器源代码,根据PostgreSQL许可证使用。
部分版权 (c) 1996-2023, The PostgreSQL Global Development Group
部分版权 (c) 1994, The Regents of the University of California
所有其他部分均根据MIT许可证授权,详见LICENSE文件。
版权 (c) 2021 Paul Mason
版权 (c) 2021-2023, Duboce Labs, Inc. (pganalyze)
Rust PostgreSQL查询解析库pg_query的使用
介绍
pg_query是一个用于解析和分析PostgreSQL查询语句的Rust库,它基于PostgreSQL的查询解析器,能够将SQL语句解析为抽象语法树(AST),并提供丰富的查询分析功能。该库特别适合需要处理SQL语句的应用程序,如查询优化器、SQL格式化工具、语法检查器等。
主要特性
- 支持标准PostgreSQL SQL语法
- 将SQL语句解析为JSON格式的AST
- 提供查询分析和重构功能
- 支持查询标准化和去语义化
- 跨平台兼容
安装方法
在Cargo.toml中添加依赖:
[dependencies]
pg_query = "0.10"
基本使用方法
1. 解析SQL语句
use pg_query::{Node, NodeEnum, NodeRef};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let sql = "SELECT id, name FROM users WHERE age > 18";
// 解析SQL语句
let result = pg_query::parse(sql)?;
// 获取解析树
let parse_tree = result.parse_tree;
println!("{:#?}", parse_tree);
Ok(())
}
2. 提取查询信息
use pg_query::{Node, NodeEnum};
fn extract_columns(ast: &Node) -> Vec<String> {
let mut columns = Vec::new();
if let NodeEnum::SelectStmt(select_stmt) = &ast.node {
if let Some(target_list) = &select_stmt.target_list {
for target in target_list {
if let NodeEnum::ResTarget(res_target) = &target.node {
if let Some(name) = &res_target.name {
columns.push(name.clone());
}
}
}
}
}
columns
}
3. 查询标准化
fn normalize_query(sql: &str) -> Result<String, Box<dyn std::error::Error>> {
let normalized = pg_query::normalize(sql)?;
Ok(normalized)
}
// 示例:将查询参数化
let sql = "SELECT * FROM users WHERE id = 123";
let normalized = normalize_query(sql)?;
println!("{}", normalized); // 输出: SELECT * FROM users WHERE id = $1
4. 指纹生成
fn generate_fingerprint(sql: &str) -> Result<String, Box<dyn std::error::Error>> {
let fingerprint = pg_query::fingerprint(sql)?;
Ok(fingerprint)
}
// 生成查询指纹用于缓存或比较
let sql1 = "SELECT name FROM users WHERE id = 1";
let sql2 = "SELECT name FROM users WHERE id = 2";
let fp1 = generate_fingerprint(sql1)?;
let fp2 = generate_fingerprint(sql2)?;
println!("指纹是否相同: {}", fp1 == fp2); // 输出: true
高级用法
遍历AST节点
use pg_query::{Node, NodeEnum, NodeRef};
fn walk_ast(node: &Node, depth: usize) {
let indent = " ".repeat(depth);
match &node.node {
NodeEnum::SelectStmt(select) => {
println!("{}Select语句", indent);
// 递归处理子节点
}
NodeEnum::ColumnRef(column) => {
println!("{}列引用: {:?}", indent, column);
}
// 处理其他节点类型...
_ => {}
}
// 递归处理所有子节点
if let Some(children) = node.get_children() {
for child in children {
walk_ast(&child, depth + 1);
}
}
}
查询重写示例
fn rewrite_query(sql: &str) -> Result<String, Box<dyn std::error::Error>> {
let parsed = pg_query::parse(sql)?;
// 这里可以添加自定义的重写逻辑
// 例如:修改表名、添加条件等
// 将修改后的AST重新转换为SQL
let rewritten_sql = pg_query::deparse(&parsed.parse_tree)?;
Ok(rewritten_sql)
}
错误处理
fn safe_parse(sql: &str) -> Result<(), Box<dyn std::error::Error>> {
match pg_query::parse(sql) {
Ok(result) => {
println!("解析成功: {:?}", result);
Ok(())
}
Err(e) => {
eprintln!("解析错误: {}", e);
Err(e.into())
}
}
}
完整示例demo
use pg_query::{Node, NodeEnum, NodeRef};
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
// 示例SQL查询
let sql = "SELECT id, name, email FROM users WHERE age > 18 AND status = 'active' ORDER BY name DESC";
println!("原始SQL: {}", sql);
println!("{}", "=".repeat(50));
// 1. 解析SQL语句
println!("1. 解析SQL语句:");
let result = pg_query::parse(sql)?;
println!("解析成功,获得AST");
println!();
// 2. 提取查询信息
println!("2. 提取查询信息:");
if let Some(parse_tree) = result.parse_tree {
let columns = extract_columns(&parse_tree);
println!("提取的列: {:?}", columns);
let table_name = extract_table_name(&parse_tree);
println!("查询的表: {:?}", table_name);
let conditions = extract_conditions(&parse_tree);
println!("查询条件: {:?}", conditions);
}
println!();
// 3. 查询标准化
println!("3. 查询标准化:");
let normalized = pg_query::normalize(sql)?;
println!("标准化后的SQL: {}", normalized);
println!();
// 4. 指纹生成
println!("4. 指纹生成:");
let fingerprint = pg_query::fingerprint(sql)?;
println!("查询指纹: {}", fingerprint);
println!();
// 5. 遍历AST节点
println!("5. 遍历AST节点:");
if let Some(parse_tree) = result.parse_tree {
walk_ast(&parse_tree, 0);
}
println!();
// 6. 查询重写示例
println!("6. 查询重写示例:");
let rewritten = rewrite_query_add_limit(sql, 10)?;
println!("重写后的SQL: {}", rewritten);
Ok(())
}
// 提取列名
fn extract_columns(ast: &Node) -> Vec<String> {
let mut columns = Vec::new();
if let NodeEnum::SelectStmt(select_stmt) = &ast.node {
if let Some(target_list) = &select_stmt.target_list {
for target in target_list {
if let NodeEnum::ResTarget(res_target) = &target.node {
if let Some(name) = &res_target.name {
columns.push(name.clone());
} else if let Some(NodeEnum::ColumnRef(column_ref)) = res_target.val.as_ref().map(|n| &n.node) {
if let Some(fields) = &column_ref.fields {
for field in fields {
if let NodeEnum::String(str_node) = &field.node {
columns.push(str_node.str.clone());
}
}
}
}
}
}
}
}
columns
}
// 提取表名
fn extract_table_name(ast: &Node) -> Option<String> {
if let NodeEnum::SelectStmt(select_stmt) = &ast.node {
if let Some(from_clause) = &select_stmt.from_clause {
for from_item in from_clause {
if let NodeEnum::RangeVar(range_var) = &from_item.node {
return range_var.relname.clone();
}
}
}
}
None
}
// 提取查询条件
fn extract_conditions(ast: &Node) -> Vec<String> {
let mut conditions = Vec::new();
if let NodeEnum::SelectStmt(select_stmt) = &ast.node {
if let Some(where_clause) = &select_stmt.where_clause {
extract_conditions_recursive(where_clause, &mut conditions);
}
}
conditions
}
fn extract_conditions_recursive(node: &Node, conditions: &mut Vec<String>) {
match &node.node {
NodeEnum::A_Expr(a_expr) => {
if let (Some(left), Some(right)) = (&a_expr.lexpr, &a_expr.rexpr) {
let left_str = node_to_string(left);
let right_str = node_to_string(right);
let op = a_expr.name.as_ref().and_then(|names| {
names.first().and_then(|name| {
if let NodeEnum::String(str_node) = &name.node {
Some(str_node.str.clone())
} else {
None
}
})
}).unwrap_or_default();
conditions.push(format!("{} {} {}", left_str, op, right_str));
}
}
NodeEnum::BoolExpr(bool_expr) => {
if let Some(args) = &bool_expr.args {
for arg in args {
extract_conditions_recursive(arg, conditions);
}
}
}
_ => {}
}
}
fn node_to_string(node: &Node) -> String {
match &node.node {
NodeEnum::ColumnRef(column_ref) => {
if let Some(fields) = &column_ref.fields {
fields.iter()
.filter_map(|f| {
if let NodeEnum::String(str_node) = &f.node {
Some(str_node.str.clone())
} else {
None
}
})
.collect::<Vec<String>>()
.join(".")
} else {
"?".to_string()
}
}
NodeEnum::A_Const(a_const) => {
if let Some(val) = &a_const.val {
match &val.node {
NodeEnum::Integer(int_node) => int_node.ival.to_string(),
NodeEnum::String(str_node) => format!("'{}'", str_node.str),
NodeEnum::Float(float_node) => float_node.str.clone(),
_ => "?".to_string(),
}
} else {
"?".to_string()
}
}
_ => "?".to_string(),
}
}
// 遍历AST节点
fn walk_ast(node: &Node, depth: usize) {
let indent = " ".repeat(depth);
match &node.node {
NodeEnum::SelectStmt(_) => {
println!("{}Select语句", indent);
}
NodeEnum::ColumnRef(column) => {
if let Some(fields) = &column.fields {
let column_name = fields.iter()
.filter_map(|f| {
if let NodeEnum::String(str_node) = &f.node {
Some(str_node.str.clone())
} else {
None
}
})
.collect::<Vec<String>>()
.join(".");
println!("{}列引用: {}", indent, column_name);
}
}
NodeEnum::RangeVar(range_var) => {
if let Some(name) = &range_var.relname {
println!("{}表引用: {}", indent, name);
}
}
NodeEnum::A_Expr(a_expr) => {
let op = a_expr.name.as_ref().and_then(|names| {
names.first().and_then(|name| {
if let NodeEnum::String(str_node) = &name.node {
Some(str_node.str.clone())
} else {
None
}
})
}).unwrap_or_default();
println!("{}表达式操作: {}", indent, op);
}
NodeEnum::SortBy(sort_by) => {
if let Some(sort_dir) = &sort_by.sortby_dir {
let direction = match sort_dir {
0 => "ASC",
1 => "DESC",
_ => "UNKNOWN",
};
println!("{}排序方向: {}", indent, direction);
}
}
_ => {}
}
// 递归处理所有子节点
if let Some(children) = node.get_children() {
for child in children {
walk_ast(&child, depth + 1);
}
}
}
// 查询重写:添加LIMIT子句
fn rewrite_query_add_limit(sql: &str, limit: i32) -> Result<String, Box<dyn Error>> {
let parsed = pg_query::parse(sql)?;
if let Some(mut parse_tree) = parsed.parse_tree {
// 修改AST添加LIMIT子句
if let NodeEnum::SelectStmt(select_stmt) = &mut parse_tree.node {
select_stmt.limit_count = Some(Node {
node: NodeEnum::A_Const {
val: Some(Box::new(Node {
node: NodeEnum::Integer {
ival: limit,
},
})),
},
});
}
// 将修改后的AST重新转换为SQL
let rewritten_sql = pg_query::deparse(&parse_tree)?;
Ok(rewritten_sql)
} else {
Err("解析失败".into())
}
}
实际应用场景
- SQL语法检查器:验证SQL语句的正确性
- 查询优化器:分析查询结构进行优化
- 权限检查:解析查询以进行访问控制
- 查询日志分析:标准化和分类查询
- SQL格式化工具:美化SQL输出
注意事项
- 该库基于PostgreSQL的解析器,支持PostgreSQL特有的语法
- 解析大型查询时注意性能影响
- AST结构可能随版本更新而变化
通过pg_query库,开发者可以在Rust应用中轻松实现强大的SQL处理功能,充分利用PostgreSQL成熟的查询解析能力。