Using tree-sitter-elixir, the Rust syntax analysis library: efficient parsing and syntax tree construction for Elixir

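Overview

tree-sitter-elixir is the tree-sitter grammar for the Elixir language. It parses Elixir source code efficiently and builds a concrete syntax tree. The project is production-ready: GitHub currently uses it for source code highlighting and code navigation.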

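Installation

Run the following Cargo command in your project directory:

cargo add tree-sitter-elixir

Or add the following to the [dependencies] section of Cargo.toml:

tree-sitter-elixir = "0.3.4"

The examples below also use the tree-sitter crate itself (use tree_sitter::Parser), so it must be listed as a dependency as well.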

Usage example

The following complete demo shows how to parse Elixir code with tree-sitter-elixir:

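use tree_sitter::Parser;
use tree_sitter_elixir::language;

fn main() {
    // Create a parser
    let mut parser = Parser::new();

    // Load the Elixir grammar.
    // Note: this call matches the older binding API. With tree-sitter 0.22+,
    // `set_language` takes `&Language`, and grammar crates built for
    // tree-sitter 0.23+ usually export a `LANGUAGE` constant instead of
    // `language()`, e.g. `parser.set_language(&tree_sitter_elixir::LANGUAGE.into())`.
    // Check the docs of the crate version you depend on.
    parser.set_language(language()).expect("Error loading Elixir grammar");

    // Elixir code to parse
    let code = r#"
    defmodule Hello do
      def world do
        IO.puts("Hello, world!")
      end
    end
    "#;

    // Parse the code (returns None only if parsing is cancelled or no language is set)
    let tree = parser.parse(code, None).unwrap();

    // Get the root node
    let root_node = tree.root_node();

    // Print the syntax tree as an S-expression
    println!("Syntax tree:\n{}", root_node.to_sexp());

    // Iterate over the direct children of the root
    let mut cursor = root_node.walk();
    println!("\nChild nodes:");
    for child in root_node.children(&mut cursor) {
        println!("- {:?}", child);
    }
}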
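
The S-expression returned by to_sexp() is a single line, which gets hard to read for anything larger than a few nodes. The following is a minimal, std-only sketch of a re-indenter for such strings. The function name pretty_sexp is an assumption made for illustration and is not part of tree-sitter. It also assumes the input contains no parentheses inside quoted tokens, which can occur in ERROR or MISSING output.

```rust
/// Re-indent a single-line S-expression (such as the output of `Node::to_sexp()`)
/// so that every nested node starts on its own line, indented by depth.
/// Hypothetical helper; assumes no parentheses appear inside quoted tokens.
fn pretty_sexp(sexp: &str) -> String {
    let mut out = String::new();
    let mut depth: usize = 0;
    let mut pending = String::new(); // text seen since the last parenthesis

    for ch in sexp.chars() {
        match ch {
            '(' => {
                // A trailing `name:` token is the field label of the node being opened.
                let mut words: Vec<&str> = pending.split_whitespace().collect();
                let has_label = words.last().map_or(false, |w| w.ends_with(':'));
                let label = if has_label { words.pop() } else { None };

                // Whatever remains (the parent's node kind) stays on the current line.
                out.push_str(&words.join(" "));
                if !out.is_empty() {
                    out.push('\n');
                    out.push_str(&"  ".repeat(depth));
                }
                if let Some(l) = label {
                    out.push_str(l);
                    out.push(' ');
                }
                out.push('(');
                depth += 1;
                pending.clear();
            }
            ')' => {
                out.push_str(pending.trim());
                out.push(')');
                depth = depth.saturating_sub(1);
                pending.clear();
            }
            _ => pending.push(ch),
        }
    }
    out.push_str(pending.trim());
    out
}

fn main() {
    // Hand-written sample in the style of `to_sexp()` output, not real parser output.
    let sample = "(source (call target: (identifier) (arguments (alias))))";
    println!("{}", pretty_sexp(sample));
}
```

In the demo above, the same helper could be applied as pretty_sexp(&root_node.to_sexp()).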

Complete example code

The following is a fuller example of using tree-sitter-elixir that shows more features:

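use tree_sitter::{Node, Parser};
use tree_sitter_elixir::language;

fn main() {
    // Initialize the parser
    let mut parser = Parser::new();
    parser.set_language(language()).expect("Failed to load Elixir grammar");

    // Sample Elixir code
    let code = r#"
    defmodule Math do
      @moduledoc "Math operations module"

      @doc "Addition function"
      def add(a, b) do
        a + b
      end

      defp private_func do
        :ok
      end
    end
    "#;

    // Parse the code
    let tree = parser.parse(code, None).unwrap();
    let root = tree.root_node();

    // Print the whole syntax tree
    println!("Full syntax tree:\n{}", root.to_sexp());

    // Walk the tree recursively
    println!("\nRecursive traversal:");
    traverse_tree(root, 0);

    // Look for specific nodes
    println!("\nFunction definitions:");
    find_functions(root, code);
}

// Helper: recursively print each node's kind and zero-based start position
fn traverse_tree(node: Node, depth: usize) {
    let indent = "  ".repeat(depth);
    let start = node.start_position();
    println!("{}{} [{}:{}]", indent, node.kind(), start.row, start.column);

    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        traverse_tree(child, depth + 1);
    }
}

// Helper: find all function definitions.
// The Elixir grammar has no dedicated "function" node: `def` and `defp` are
// ordinary `call` nodes whose `target` field is the identifier `def` or `defp`.
fn find_functions(node: Node, source: &str) {
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        if child.kind() == "call" {
            let keyword = child
                .child_by_field_name("target")
                .and_then(|t| t.utf8_text(source.as_bytes()).ok());

            if matches!(keyword, Some("def") | Some("defp")) {
                // The first named child of `arguments` is the function head,
                // e.g. `add(a, b)` or `private_func`
                let mut call_cursor = child.walk();
                let head = child
                    .children(&mut call_cursor)
                    .find(|n| n.kind() == "arguments")
                    .and_then(|args| args.named_child(0));
                if let Some(head) = head {
                    let text = head.utf8_text(source.as_bytes()).unwrap_or("?");
                    println!("Found function: {} {}", keyword.unwrap(), text);
                }
            }
        }
        find_functions(child, source); // recurse into nested nodes
    }
}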

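Development

See the project documentation for development details.

Features

  • Production-ready and used by GitHub
  • Parses the full syntax of the Elixir language
  • Builds syntax trees efficiently

License

Apache-2.0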


1 reply

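Using tree-sitter-elixir, the Rust syntax analysis library

tree-sitter-elixir is a tree-sitter based parser for the Elixir language. It parses Elixir code efficiently and builds a syntax tree, which makes it a good fit for Elixir code editors, IDE plugins, code formatters, and static analysis tools.

Main features

  • High-performance incremental parsing
  • Accurate syntax tree construction
  • Error recovery
  • Cross-platform support
  • Seamless integration with the tree-sitter ecosystem

Installation

First, add the dependencies to your Cargo.toml: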

[dependencies]
tree-sitter = "0.20"
tree-sitter-elixir = "0.0.1"  # placeholder version: check crates.io for a release compatible with your tree-sitter version

Basic usage

1. Parsing Elixir code

use tree_sitter::Parser;

fn main() {
    // Create a parser
    let mut parser = Parser::new();

    // Load the Elixir grammar
    let language = tree_sitter_elixir::language();
    parser.set_language(language).unwrap();

    // Parse some Elixir code
    let code = r#"
    defmodule Hello do
      def world do
        IO.puts("Hello, World!")
      end
    end
    "#;

    let tree = parser.parse(code, None).unwrap();

    // Print the syntax tree as an S-expression
    println!("{}", tree.root_node().to_sexp());
}

2. Traversing the syntax tree

fn walk_tree(node: tree_sitter::Node, source: &str, depth: usize) {
    let indent = "  ".repeat(depth);
    println!("{}{:?}", indent, node.kind());
    
    if node.child_count() == 0 {
        println!("{}  Text: {:?}", indent, node.utf8_text(source.as_bytes()).unwrap());
    }
    
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        walk_tree(child, source, depth + 1);
    }
}

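3. Querying the syntax tree

use tree_sitter::{Query, QueryCursor};

fn find_function_definitions(tree: &tree_sitter::Tree, source: &str) {
    // The Elixir grammar has no `function_definition` node: `def` and `defp`
    // parse as ordinary `call` nodes, so the query matches on the call target.
    let query = Query::new(
        tree_sitter_elixir::language(),
        r#"(call
             target: (identifier) @keyword
             (arguments
               [(identifier) @name
                (call target: (identifier) @name)])
             (#match? @keyword "^(def|defp)$"))"#,
    ).unwrap();
    let name_index = query.capture_index_for_name("name").unwrap();

    let mut cursor = QueryCursor::new();
    let matches = cursor.matches(&query, tree.root_node(), source.as_bytes());

    for m in matches {
        // Each match also carries the @keyword capture; print only @name
        for capture in m.captures.iter().filter(|c| c.index == name_index) {
            let node = capture.node;
            println!("Found function: {}", node.utf8_text(source.as_bytes()).unwrap());
        }
    }
}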

Advanced usage

Incremental parsing

let mut parser = Parser::new();
parser.set_language(tree_sitter_elixir::language()).unwrap();

// Initial parse
let mut tree = parser.parse("defmodule A do end", None).unwrap();

// Describe the edit: the single byte `A` at offset 10 is replaced by `B`,
// so the old and the new range both end at byte 11
let edit = tree_sitter::InputEdit {
    start_byte: 10,
    old_end_byte: 11,
    new_end_byte: 11,
    start_position: tree_sitter::Point::new(0, 10),
    old_end_position: tree_sitter::Point::new(0, 11),
    new_end_position: tree_sitter::Point::new(0, 11),
};

// Apply the edit to the old tree, then reparse with the old tree as a hint
tree.edit(&edit);
let new_tree = parser.parse("defmodule B do end", Some(&tree)).unwrap();
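
Writing the byte offsets of an edit by hand is error-prone. The following std-only sketch derives them from the old and new text by trimming the common prefix and suffix. EditRange, point_at, and diff_edit are hypothetical names introduced for illustration: EditRange only mirrors the fields of tree_sitter::InputEdit, with positions as zero-based (row, byte column) pairs.

```rust
/// Hypothetical stand-in for the fields of `tree_sitter::InputEdit`;
/// positions are zero-based (row, byte column) pairs.
#[derive(Debug, PartialEq)]
struct EditRange {
    start_byte: usize,
    old_end_byte: usize,
    new_end_byte: usize,
    start_position: (usize, usize),
    old_end_position: (usize, usize),
    new_end_position: (usize, usize),
}

/// Zero-based (row, byte column) of byte offset `byte` in `text`.
fn point_at(text: &str, byte: usize) -> (usize, usize) {
    let before = &text.as_bytes()[..byte];
    let row = before.iter().filter(|&&b| b == b'\n').count();
    let line_start = before.iter().rposition(|&b| b == b'\n').map_or(0, |i| i + 1);
    (row, byte - line_start)
}

/// Smallest single replaced range that turns `old` into `new`.
fn diff_edit(old: &str, new: &str) -> EditRange {
    let (o, n) = (old.as_bytes(), new.as_bytes());

    // Length of the common prefix, backed off to a character boundary.
    let mut prefix = o.iter().zip(n).take_while(|(a, b)| a == b).count();
    while !(old.is_char_boundary(prefix) && new.is_char_boundary(prefix)) {
        prefix -= 1;
    }

    // Length of the common suffix, not overlapping the prefix.
    let max_suffix = o.len().min(n.len()) - prefix;
    let mut suffix = o
        .iter()
        .rev()
        .zip(n.iter().rev())
        .take(max_suffix)
        .take_while(|(a, b)| a == b)
        .count();
    while !(old.is_char_boundary(o.len() - suffix) && new.is_char_boundary(n.len() - suffix)) {
        suffix -= 1;
    }

    let (old_end, new_end) = (o.len() - suffix, n.len() - suffix);
    EditRange {
        start_byte: prefix,
        old_end_byte: old_end,
        new_end_byte: new_end,
        start_position: point_at(old, prefix),
        old_end_position: point_at(old, old_end),
        new_end_position: point_at(new, new_end),
    }
}

fn main() {
    // Same edit as in the snippet above: `A` becomes `B` at byte 10.
    let edit = diff_edit("defmodule A do end", "defmodule B do end");
    println!("{:?}", edit);
}
```

The resulting values can be copied into a tree_sitter::InputEdit, building each position with tree_sitter::Point::new(row, column). A real editor usually knows the edited range already and does not need to diff the whole buffer.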

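Error handling

// `parser` is set up as in the earlier examples
let code_with_errors = r#"
defmodule Invalid do
  def func do
    missing_end
"#;

let tree = parser.parse(code_with_errors, None).unwrap();

if tree.root_node().has_error() {
    println!("Code contains syntax errors");

    // Error nodes can sit anywhere in the tree, not only directly under the
    // root, so visit every descendant using an explicit stack
    let mut stack = vec![tree.root_node()];
    while let Some(node) = stack.pop() {
        if node.is_error() || node.is_missing() {
            // Rows and columns are zero-based
            println!("Error at: {}:{}", node.start_position().row, node.start_position().column);
        }
        let mut cursor = node.walk();
        stack.extend(node.children(&mut cursor));
    }
}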
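
Tree-sitter positions are zero-based, and the column is a byte offset within the line, so printing them raw can confuse readers used to 1-based editor positions. The following std-only sketch turns such a position into a 1-based message with the offending source line and a caret. The function name render_error and the output layout are assumptions made for illustration.

```rust
/// Render a zero-based (row, byte column) position as a 1-based `line:col`
/// header, followed by the source line and a caret under the position.
/// Hypothetical helper, not part of tree-sitter.
fn render_error(source: &str, row: usize, column: usize) -> String {
    // Fall back to an empty line if the row is past the end of the source.
    let line = source.lines().nth(row).unwrap_or("");

    // Clamp the byte column to the line and back off to a character boundary,
    // then count characters so the caret lines up for non-ASCII text.
    let mut col = column.min(line.len());
    while !line.is_char_boundary(col) {
        col -= 1;
    }
    let caret_offset = line[..col].chars().count();

    format!(
        "{}:{}\n{}\n{}^",
        row + 1,
        column + 1,
        line,
        " ".repeat(caret_offset)
    )
}

fn main() {
    let source = "defmodule A do\n  def f do\n";
    // Position (1, 2) is the `d` of `def` on the second line.
    println!("{}", render_error(source, 1, 2));
}
```

In the loop above it could be called as render_error(code_with_errors, node.start_position().row, node.start_position().column).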

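Practical example

A simple Elixir code formatter

use tree_sitter::{Node, Parser};

fn format_elixir_code(code: &str) -> String {
    let mut parser = Parser::new();
    parser.set_language(tree_sitter_elixir::language()).unwrap();
    let tree = parser.parse(code, None).unwrap();

    let mut formatted = String::new();
    let mut indent = 0;

    format_node(tree.root_node(), code, &mut formatted, &mut indent);
    formatted
}

// Toy formatter: re-indents `do ... end` blocks and copies every other token
// verbatim, without inserting spaces between tokens.
fn format_node(node: Node, source: &str, output: &mut String, indent: &mut usize) {
    // Each level uses its own cursor. Sharing one `&mut TreeCursor` between
    // the child iterator and the recursive call is rejected by the borrow checker.
    let mut cursor = node.walk();

    match node.kind() {
        "do_block" => {
            output.push_str(" do");
            *indent += 1;
            // Named children only: this skips the literal `do` and `end`
            // tokens, which are written out by hand here
            for child in node.named_children(&mut cursor) {
                output.push_str(&format!("\n{}", "  ".repeat(*indent)));
                format_node(child, source, output, indent);
            }
            *indent -= 1;
            output.push_str(&format!("\n{}end", "  ".repeat(*indent)));
        }
        _ => {
            if node.child_count() == 0 {
                // Leaf node: copy its source text
                output.push_str(node.utf8_text(source.as_bytes()).unwrap());
            } else {
                for child in node.children(&mut cursor) {
                    format_node(child, source, output, indent);
                }
            }
        }
    }
}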

Complete example code

use tree_sitter::{Node, Parser, Query, QueryCursor};

fn main() {
    // Example 1: parse Elixir code
    let mut parser = Parser::new();
    parser.set_language(tree_sitter_elixir::language()).unwrap();

    let code = r#"
    defmodule Example do
      def hello(name) do
        IO.puts("Hello, #{name}!")
      end
    end
    "#;

    let tree = parser.parse(code, None).unwrap();
    println!("Syntax tree:\n{}", tree.root_node().to_sexp());

    // Example 2: traverse the syntax tree
    println!("\nTraversal:");
    walk_tree(tree.root_node(), code, 0);

    // Example 3: query function definitions
    println!("\nFunction definitions:");
    find_function_definitions(&tree, code);

    // Example 4: format the code
    println!("\nFormatted code:");
    let formatted = format_elixir_code(code);
    println!("{}", formatted);
}

fn walk_tree(node: Node, source: &str, depth: usize) {
    let indent = "  ".repeat(depth);
    println!("{}{:?}", indent, node.kind());

    // Print the source text of leaf nodes
    if node.child_count() == 0 {
        println!("{}  Text: {:?}", indent, node.utf8_text(source.as_bytes()).unwrap());
    }

    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        walk_tree(child, source, depth + 1);
    }
}

fn find_function_definitions(tree: &tree_sitter::Tree, source: &str) {
    // The Elixir grammar has no `function_definition` node: `def` and `defp`
    // parse as ordinary `call` nodes, so the query matches on the call target.
    let query = Query::new(
        tree_sitter_elixir::language(),
        r#"(call
             target: (identifier) @keyword
             (arguments
               [(identifier) @name
                (call target: (identifier) @name)])
             (#match? @keyword "^(def|defp)$"))"#,
    ).unwrap();
    let name_index = query.capture_index_for_name("name").unwrap();

    let mut cursor = QueryCursor::new();
    let matches = cursor.matches(&query, tree.root_node(), source.as_bytes());

    for m in matches {
        // Each match also carries the @keyword capture; print only @name
        for capture in m.captures.iter().filter(|c| c.index == name_index) {
            let node = capture.node;
            println!("Found function: {}", node.utf8_text(source.as_bytes()).unwrap());
        }
    }
}

fn format_elixir_code(code: &str) -> String {
    let mut parser = Parser::new();
    parser.set_language(tree_sitter_elixir::language()).unwrap();
    let tree = parser.parse(code, None).unwrap();

    let mut formatted = String::new();
    let mut indent = 0;

    format_node(tree.root_node(), code, &mut formatted, &mut indent);
    formatted
}

// Toy formatter: re-indents `do ... end` blocks and copies every other token
// verbatim, without inserting spaces between tokens.
fn format_node(node: Node, source: &str, output: &mut String, indent: &mut usize) {
    // Each level uses its own cursor. Sharing one `&mut TreeCursor` between
    // the child iterator and the recursive call is rejected by the borrow checker.
    let mut cursor = node.walk();

    match node.kind() {
        "do_block" => {
            output.push_str(" do");
            *indent += 1;
            // Named children only: this skips the literal `do` and `end`
            // tokens, which are written out by hand here
            for child in node.named_children(&mut cursor) {
                output.push_str(&format!("\n{}", "  ".repeat(*indent)));
                format_node(child, source, output, indent);
            }
            *indent -= 1;
            output.push_str(&format!("\n{}end", "  ".repeat(*indent)));
        }
        _ => {
            if node.child_count() == 0 {
                // Leaf node: copy its source text
                output.push_str(node.utf8_text(source.as_bytes()).unwrap());
            } else {
                for child in node.children(&mut cursor) {
                    format_node(child, source, output, indent);
                }
            }
        }
    }
}

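Notes

  1. tree-sitter-elixir is still under development, and some edge-case syntax may not be fully supported
  2. When parsing large files, consider incremental parsing to improve performance
  3. Error recovery helps with incomplete code, but the resulting syntax tree may be inaccurate
  4. Check for updates regularly to pick up the latest language support improvements

With tree-sitter-elixir you can build Elixir language tooling in Rust on top of its efficient parser and the detailed information in its syntax trees.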
