Using the Rust parsing library tree-sitter-elixir: efficient parsing and syntax-tree construction for Elixir
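Overview
tree-sitter-elixir is a tree-sitter grammar for the Elixir language. It parses Elixir code efficiently and builds a syntax tree. The project is used in production: GitHub currently uses it for source-code highlighting and code navigation.
Installation
Run the following Cargo command in your project directory:
cargo add tree-sitter-elixir
Or add this line to Cargo.toml:
tree-sitter-elixir = "0.3.4"
The examples below also import the tree-sitter crate itself (tree_sitter::Parser), so add it as a dependency as well.
Usage example
The following complete demo shows how to parse Elixir code with tree-sitter-elixir. It calls the crate's language() function; grammar crates built against newer tree-sitter releases generally expose a LANGUAGE constant instead, so check the documentation of the version you install.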
use tree_sitter::Parser;
use tree_sitter_elixir::language;

fn main() {
    // Create the parser
    let mut parser = Parser::new();
    // Load the Elixir grammar
    parser.set_language(language()).expect("Error loading Elixir grammar");
    // Elixir code to parse
    let code = r#"
defmodule Hello do
  def world do
    IO.puts("Hello, world!")
  end
end
"#;
    // Parse the code
    let tree = parser.parse(code, None).unwrap();
    // Get the root node
    let root_node = tree.root_node();
    // Print the syntax tree as an S-expression
    println!("Syntax tree:\n{}", root_node.to_sexp());
    // Iterate over the root's direct children
    let mut cursor = root_node.walk();
    println!("\nChild nodes:");
    for child in root_node.children(&mut cursor) {
        println!("- {:?}", child);
    }
}
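The S-expression returned by to_sexp() is a single line, which gets hard to read once the tree has more than a few nodes. As a rough sketch (not part of tree-sitter or tree-sitter-elixir), the hypothetical helper indent_sexp below re-indents such a string using only the standard library. It assumes tokens are separated by spaces and that a field label such as target: is immediately followed by its node. The sample string in the usage note is illustrative, not captured parser output.

```rust
/// Re-indents a one-line S-expression so every node starts on its own line.
/// Hypothetical helper for readability only; it does not call tree-sitter.
fn indent_sexp(sexp: &str) -> String {
    let mut out = String::new();
    let mut depth: usize = 0;
    // True when the previous token was a field label such as `target:`.
    let mut after_field = false;

    for word in sexp.split_whitespace() {
        if after_field {
            // Keep a node on the same line as its field label.
            out.push(' ');
        } else if !out.is_empty() {
            out.push('\n');
            out.push_str(&"  ".repeat(depth));
        }
        out.push_str(word);
        after_field = word.ends_with(':');

        // Track nesting, ignoring parentheses inside double quotes.
        let mut in_quotes = false;
        for ch in word.chars() {
            match ch {
                '"' => in_quotes = !in_quotes,
                '(' if !in_quotes => depth += 1,
                ')' if !in_quotes => depth = depth.saturating_sub(1),
                _ => {}
            }
        }
    }
    out
}
```

Calling indent_sexp("(source (call target: (identifier) (arguments (alias))))") puts source, call, the target field, arguments, and alias on separate lines with two more spaces of indentation per nesting level. In the demo above it could wrap the string passed to println!.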
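Complete example code
The following fuller example shows more of what tree-sitter-elixir can do:
use tree_sitter::{Node, Parser};
use tree_sitter_elixir::language;

fn main() {
    // Initialize the parser
    let mut parser = Parser::new();
    parser.set_language(language()).expect("Failed to load the Elixir grammar");
    // Sample Elixir code
    let code = r#"
defmodule Math do
  @moduledoc "Math operations module"
  @doc "Addition function"
  def add(a, b) do
    a + b
  end
  defp private_func do
    :ok
  end
end
"#;
    // Parse the code
    let tree = parser.parse(code, None).unwrap();
    let root = tree.root_node();
    // Print the whole syntax tree
    println!("Full syntax tree:\n{}", root.to_sexp());
    // Walk the syntax tree recursively
    println!("\nRecursive traversal:");
    traverse_tree(root, 0);
    // Look for specific nodes
    println!("\nFunction definitions:");
    find_functions(root, code);
}

// Helper that walks the syntax tree recursively, printing each node's kind
// and byte range
fn traverse_tree(node: Node, depth: usize) {
    let indent = "  ".repeat(depth);
    println!("{}{} [{}..{}]", indent, node.kind(), node.start_byte(), node.end_byte());
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        traverse_tree(child, depth + 1);
    }
}

// Helper that finds all function definitions. The grammar has no dedicated
// function node: def and defp are ordinary call nodes whose target field is
// an identifier with that text.
fn find_functions(node: Node, source: &str) {
    if node.kind() == "call" {
        if let Some(target) = node.child_by_field_name("target") {
            let keyword = target.utf8_text(source.as_bytes()).unwrap_or("");
            if target.kind() == "identifier" && (keyword == "def" || keyword == "defp") {
                // Print the head of the definition (its first source line)
                let text = node.utf8_text(source.as_bytes()).unwrap_or("");
                println!("Found function: {}", text.lines().next().unwrap_or(""));
            }
        }
    }
    // Recurse into the children
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        find_functions(child, source);
    }
}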
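Development
See the project documentation for more development details.
Features
- Production-ready and used by GitHub
- Full syntax coverage for the Elixir language
- Efficient syntax-tree construction
License
Apache-2.0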
1 reply
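Using the Rust parsing library tree-sitter-elixir
tree-sitter-elixir is a tree-sitter-based parser for the Elixir language that parses Elixir code efficiently and builds a syntax tree. It is well suited to building Elixir code editors, IDE plugins, code formatters, and static-analysis tools.
Key features
- High-performance incremental parsing
- Accurate syntax trees
- Error recovery
- Cross-platform support
- Seamless integration with the tree-sitter ecosystem
Installation
First, add the dependencies to your Cargo.toml:
[dependencies]
tree-sitter = "0.20"
tree-sitter-elixir = "0.0.1" # check for the latest version and the tree-sitter release it supports
Basic usage
1. Parsing Elixir code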
use tree_sitter::Parser;

fn main() {
    // Create the parser
    let mut parser = Parser::new();
    // Load the Elixir grammar
    let language = tree_sitter_elixir::language();
    parser.set_language(language).unwrap();
    // Parse some Elixir code
    let code = r#"
defmodule Hello do
  def world do
    IO.puts("Hello, World!")
  end
end
"#;
    let tree = parser.parse(code, None).unwrap();
    // Print the syntax tree
    println!("{}", tree.root_node().to_sexp());
}
2. Walking the syntax tree
fn walk_tree(node: tree_sitter::Node, source: &str, depth: usize) {
    let indent = "  ".repeat(depth);
    println!("{}{}", indent, node.kind());
    // Leaf nodes carry the actual source text
    if node.child_count() == 0 {
        println!("{}  Text: {:?}", indent, node.utf8_text(source.as_bytes()).unwrap());
    }
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        walk_tree(child, source, depth + 1);
    }
}
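3. Querying the syntax tree
use tree_sitter::{Query, QueryCursor};

fn find_function_definitions(tree: &tree_sitter::Tree, source: &str) {
    // The grammar has no function_definition node: def and defp are call
    // nodes, so match on the call target and capture the name from its arguments.
    let query = Query::new(
        tree_sitter_elixir::language(),
        r#"(call
             target: (identifier) @keyword
             (arguments [
               (identifier) @name
               (call target: (identifier) @name)
             ])
             (#match? @keyword "^defp?$"))"#,
    ).unwrap();
    // Only the @name capture is printed; @keyword exists for the predicate
    let name_index = query.capture_index_for_name("name").unwrap();
    let mut cursor = QueryCursor::new();
    for m in cursor.matches(&query, tree.root_node(), source.as_bytes()) {
        for capture in m.captures.iter().filter(|c| c.index == name_index) {
            println!("Found function: {}", capture.node.utf8_text(source.as_bytes()).unwrap());
        }
    }
}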
Advanced usage
Incremental parsing
let mut parser = tree_sitter::Parser::new();
parser.set_language(tree_sitter_elixir::language()).unwrap();
// Initial parse
let mut tree = parser.parse("defmodule A do end", None).unwrap();
// Describe the change: the single byte "A" at offset 10 is replaced by "B",
// so the old and new end offsets are both 11
let edit = tree_sitter::InputEdit {
    start_byte: 10,
    old_end_byte: 11,
    new_end_byte: 11,
    start_position: tree_sitter::Point::new(0, 10),
    old_end_position: tree_sitter::Point::new(0, 11),
    new_end_position: tree_sitter::Point::new(0, 11),
};
// Apply the edit to the old tree, then reparse with it as a starting point
tree.edit(&edit);
let new_tree = parser.parse("defmodule B do end", Some(&tree)).unwrap();
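Filling in the edit description by hand is easy to get wrong, because every field has to agree with the actual text change: byte offsets plus zero-based (row, column) points whose column counts bytes. One way to derive them, sketched below with the standard library only, is to trim the common prefix and suffix of the old and new text. EditSpan, point_at, and compute_edit are hypothetical names introduced for illustration; EditSpan merely mirrors the field names of tree_sitter::InputEdit and stores points as tuples.

```rust
/// Hypothetical stand-in mirroring the fields of `tree_sitter::InputEdit`,
/// with points stored as (row, column-in-bytes) tuples.
#[derive(Debug, PartialEq)]
struct EditSpan {
    start_byte: usize,
    old_end_byte: usize,
    new_end_byte: usize,
    start_position: (usize, usize),
    old_end_position: (usize, usize),
    new_end_position: (usize, usize),
}

/// Zero-based (row, column) of `byte` in `text`; the column counts bytes.
fn point_at(text: &str, byte: usize) -> (usize, usize) {
    let before = &text[..byte];
    let row = before.matches('\n').count();
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    (row, byte - line_start)
}

/// Describes the change from `old` to `new` as one replaced byte range.
fn compute_edit(old: &str, new: &str) -> EditSpan {
    // Shared prefix length, backed off to a UTF-8 character boundary.
    let mut prefix = old
        .bytes()
        .zip(new.bytes())
        .take_while(|(a, b)| a == b)
        .count();
    while !old.is_char_boundary(prefix) {
        prefix -= 1;
    }

    // Shared suffix length, limited so it cannot overlap the prefix.
    let max_suffix = old.len().min(new.len()) - prefix;
    let mut suffix = old
        .bytes()
        .rev()
        .zip(new.bytes().rev())
        .take(max_suffix)
        .take_while(|(a, b)| a == b)
        .count();
    while !old.is_char_boundary(old.len() - suffix) {
        suffix -= 1;
    }

    let old_end = old.len() - suffix;
    let new_end = new.len() - suffix;
    EditSpan {
        start_byte: prefix,
        old_end_byte: old_end,
        new_end_byte: new_end,
        start_position: point_at(old, prefix),
        old_end_position: point_at(old, old_end),
        new_end_position: point_at(new, new_end),
    }
}
```

For the change from "defmodule A do end" to "defmodule B do end", compute_edit reports the byte range 10..11 in both versions, all on row 0. The numbers can then be copied into InputEdit, passing each tuple to Point::new(row, column).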
Error handling
// Reuses the parser configured in the previous snippet
let code_with_errors = r#"
defmodule Invalid do
  def func do
    missing_end
"#;
let tree = parser.parse(code_with_errors, None).unwrap();
if tree.root_node().has_error() {
    println!("Code contains syntax errors");
    // Inspect the root's direct children for error nodes; errors nested
    // deeper in the tree need a recursive walk
    let mut cursor = tree.walk();
    for node in tree.root_node().children(&mut cursor) {
        if node.is_error() || node.is_missing() {
            println!("Error at: {}:{}", node.start_position().row, node.start_position().column);
        }
    }
}
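tree-sitter reports positions as zero-based rows and byte columns, so the raw numbers printed above do not line up with the 1-based line numbers most editors show. The sketch below, which uses only the standard library, turns such a position into a 1-based label followed by the offending source line and a caret. render_diagnostic is a hypothetical helper, not part of either crate.

```rust
/// Renders a zero-based (row, byte column) position as a 1-based `line:column`
/// label, followed by the source line and a caret under the position.
/// Returns `None` when the row lies past the end of the source.
fn render_diagnostic(source: &str, row: usize, column: usize) -> Option<String> {
    let line = source.lines().nth(row)?;
    // Clamp the column to the line length.
    let col = column.min(line.len());
    // Count characters so the caret lines up for non-ASCII text; fall back to
    // the byte column if it happens to split a character.
    let caret_offset = line.get(..col).map_or(col, |s| s.chars().count());
    Some(format!(
        "{}:{}\n{}\n{}^",
        row + 1,
        col + 1,
        line,
        " ".repeat(caret_offset)
    ))
}
```

Inside the loop above, the row and column of node.start_position() could be passed in together with code_with_errors, and the returned string printed in place of the bare numbers.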
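Practical example
Building a simple Elixir code formatter
use tree_sitter::{Node, Parser};

fn format_elixir_code(code: &str) -> String {
    let mut parser = Parser::new();
    parser.set_language(tree_sitter_elixir::language()).unwrap();
    let tree = parser.parse(code, None).unwrap();
    let mut formatted = String::new();
    format_node(tree.root_node(), code, &mut formatted, 0);
    formatted
}

// Toy re-indenter: only do ... end blocks are laid out again; every other
// node is copied from the source unchanged. It takes a Node rather than a
// shared TreeCursor, because a cursor cannot be borrowed by the child
// iterator and passed to the recursive call at the same time.
fn format_node(node: Node, source: &str, output: &mut String, indent: usize) {
    let mut cursor = node.walk();
    match node.kind() {
        // Root node: one top-level expression per line
        "source" => {
            for child in node.named_children(&mut cursor) {
                format_node(child, source, output, indent);
                output.push('\n');
            }
        }
        "do_block" => {
            output.push_str(" do");
            // Named children are the body expressions; the do and end
            // tokens themselves are skipped and re-emitted here
            for child in node.named_children(&mut cursor) {
                output.push('\n');
                output.push_str(&"  ".repeat(indent + 1));
                format_node(child, source, output, indent + 1);
            }
            output.push('\n');
            output.push_str(&"  ".repeat(indent));
            output.push_str("end");
        }
        _ => {
            let do_block = node.children(&mut cursor).find(|c| c.kind() == "do_block");
            match do_block {
                // Copy the text before the block, then lay out the block itself
                Some(block) => {
                    output.push_str(source[node.start_byte()..block.start_byte()].trim_end());
                    format_node(block, source, output, indent);
                }
                None => output.push_str(&source[node.start_byte()..node.end_byte()]),
            }
        }
    }
}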
Complete example code
use tree_sitter::{Node, Parser, Query, QueryCursor};

fn main() {
    // Example 1: parse Elixir code
    let mut parser = Parser::new();
    parser.set_language(tree_sitter_elixir::language()).unwrap();
    let code = r#"
defmodule Example do
  def hello(name) do
    IO.puts("Hello, #{name}!")
  end
end
"#;
    let tree = parser.parse(code, None).unwrap();
    println!("Syntax tree:\n{}", tree.root_node().to_sexp());
    // Example 2: walk the syntax tree
    println!("\nWalking the syntax tree:");
    walk_tree(tree.root_node(), code, 0);
    // Example 3: query function definitions
    println!("\nFunction definitions:");
    find_function_definitions(&tree, code);
    // Example 4: format the code
    println!("\nFormatted code:");
    let formatted = format_elixir_code(code);
    println!("{}", formatted);
}

fn walk_tree(node: Node, source: &str, depth: usize) {
    let indent = "  ".repeat(depth);
    println!("{}{}", indent, node.kind());
    // Leaf nodes carry the actual source text
    if node.child_count() == 0 {
        println!("{}  Text: {:?}", indent, node.utf8_text(source.as_bytes()).unwrap());
    }
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        walk_tree(child, source, depth + 1);
    }
}

fn find_function_definitions(tree: &tree_sitter::Tree, source: &str) {
    // def and defp are call nodes in this grammar, so match on the call
    // target and capture the function name from its arguments
    let query = Query::new(
        tree_sitter_elixir::language(),
        r#"(call
             target: (identifier) @keyword
             (arguments [
               (identifier) @name
               (call target: (identifier) @name)
             ])
             (#match? @keyword "^defp?$"))"#,
    ).unwrap();
    let name_index = query.capture_index_for_name("name").unwrap();
    let mut cursor = QueryCursor::new();
    for m in cursor.matches(&query, tree.root_node(), source.as_bytes()) {
        for capture in m.captures.iter().filter(|c| c.index == name_index) {
            println!("Found function: {}", capture.node.utf8_text(source.as_bytes()).unwrap());
        }
    }
}

fn format_elixir_code(code: &str) -> String {
    let mut parser = Parser::new();
    parser.set_language(tree_sitter_elixir::language()).unwrap();
    let tree = parser.parse(code, None).unwrap();
    let mut formatted = String::new();
    format_node(tree.root_node(), code, &mut formatted, 0);
    formatted
}

// Toy re-indenter: only do ... end blocks are laid out again; every other
// node is copied from the source unchanged
fn format_node(node: Node, source: &str, output: &mut String, indent: usize) {
    let mut cursor = node.walk();
    match node.kind() {
        // Root node: one top-level expression per line
        "source" => {
            for child in node.named_children(&mut cursor) {
                format_node(child, source, output, indent);
                output.push('\n');
            }
        }
        "do_block" => {
            output.push_str(" do");
            for child in node.named_children(&mut cursor) {
                output.push('\n');
                output.push_str(&"  ".repeat(indent + 1));
                format_node(child, source, output, indent + 1);
            }
            output.push('\n');
            output.push_str(&"  ".repeat(indent));
            output.push_str("end");
        }
        _ => {
            let do_block = node.children(&mut cursor).find(|c| c.kind() == "do_block");
            match do_block {
                Some(block) => {
                    output.push_str(source[node.start_byte()..block.start_byte()].trim_end());
                    format_node(block, source, output, indent);
                }
                None => output.push_str(&source[node.start_byte()..node.end_byte()]),
            }
        }
    }
}
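Notes
- tree-sitter-elixir is still under development, and some edge-case syntax may not be fully supported
- For large files, consider incremental parsing to improve performance
- Error recovery helps with incomplete code but can yield an inaccurate syntax tree
- Check for updates regularly to pick up language-support improvements
With tree-sitter-elixir you can build capable Elixir language tooling in Rust, drawing on its efficient parsing and the detailed information in its syntax trees.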