Rust语法分析工具tree-sitter-php的使用,PHP语言解析库tree-sitter-php实现高效代码解析与语法高亮

tree-sitter-php

PHP 语法分析器,基于 tree-sitter 实现。

安装

在项目目录中运行以下 Cargo 命令:

cargo add tree-sitter-php

或者在 Cargo.toml 中添加以下行:

tree-sitter-php = "0.23.11"

使用示例

下面是一个完整的 Rust 示例,展示如何使用 tree-sitter-php 解析 PHP 代码并实现语法高亮:

use tree_sitter::{Parser, Tree};
use tree_sitter_php::language;

fn main() {
    // 创建解析器
    let mut parser = Parser::new();
    
    // 设置PHP语言
    parser.set_language(language()).expect("Error loading PHP grammar");
    
    // PHP示例代码
    let php_code = r#"
<?php
function hello($name) {
    echo "Hello, " . $name . "!";
}
hello("World");
?>
"#;

    // 解析代码
    let tree = parser.parse(php_code, None).expect("Error parsing PHP code");
    
    // 打印语法树
    print_syntax_tree(&tree, php_code);
    
    // 实现简单的语法高亮
    highlight_php_code(&tree, php_code);
}

fn print_syntax_tree(tree: &Tree, source: &str) {
    let root_node = tree.root_node();
    println!("Syntax Tree:");
    print_node(&root_node, source, 0);
}

fn print_node(node: &tree_sitter::Node, source: &str, indent: usize) {
    let indent_str = " ".repeat(indent * 2);
    let node_type = node.kind();
    let node_text = node.utf8_text(source.as_bytes()).unwrap();
    
    println!("{}{}: {}", indent_str, node_type, node_text);
    
    // 递归打印子节点
    for child in node.children(&mut node.walk()) {
        print_node(&child, source, indent + 1);
    }
}

fn highlight_php_code(tree: &Tree, source: &str) {
    use tree_sitter_highlight::{HighlightConfiguration, HighlightEvent, Highlighter};
    
    // 创建高亮器
    let mut highlighter = Highlighter::new();
    
    // 配置PHP的高亮规则
    let mut config = HighlightConfiguration::new(
        language(),
        tree_sitter_php::HIGHLIGHTS_QUERY,
        tree_sitter_php::INJECTIONS_QUERY,
        tree_sitter_php::LOCALS_QUERY,
    ).unwrap();
    
    // 设置高亮范围名称
    config.configure(&[
        "keyword", "function", "variable", "string", "comment", 
        "operator", "number", "type", "delimiter"
    ]);
    
    // 执行高亮
    let highlights = highlighter.highlight(
        &config,
        source.as_bytes(),
        None,
        |_| None
    ).unwrap();
    
    println!("\nSyntax Highlighting:");
    for event in highlights {
        match event.unwrap() {
            HighlightEvent::Source {start, end} => {
                let text = &source[start..end];
                print!("{}", text);
            }
            HighlightEvent::HighlightStart(idx) => {
                let color = match idx {
                    0 => "\x1b[31m", // keyword - red
                    1 => "\x1b[36m", // function - cyan
                    2 => "\x1b[34m", // variable - blue
                    3 => "\x1b[32m", // string - green
                    4 => "\x1b[90m", // comment - gray
                    _ => "\x1b[0m",  // default
                };
                print!("{}", color);
            }
            HighlightEvent::HighlightEnd => {
                print!("\x1b[0m"); // reset color
            }
        }
    }
    println!();
}

功能说明

  1. 语法分析:使用 tree-sitter-php 解析 PHP 代码并生成语法树
  2. 语法树可视化:递归打印语法树结构
  3. 语法高亮:基于语法树实现简单的 PHP 代码高亮显示

输出示例

运行上述代码将输出类似以下内容:

Syntax Tree:
program: 
  php_tag: <?php
  function_definition: 
    name: hello
    parameters: 
      name: name
    compound_statement: 
      echo_statement: 
        string: "Hello, "
        concatenation: .
        variable: name
        concatenation: .
        string: "!"
  expression_statement: 
    function_call_expression: 
      name: hello
      arguments: 
        string: "World"
  php_tag: ?>

Syntax Highlighting:
<?php
function hello($name) {
    echo "Hello, " . $name . "!";
}
hello("World");
?>

完整示例代码

// 引入必要的库
use std::error::Error;
use tree_sitter::{Parser, Tree};
use tree_sitter_php::language;

fn main() -> Result<(), Box<dyn Error>> {
    // 创建解析器实例
    let mut parser = Parser::new();
    
    // 加载PHP语法
    parser.set_language(language())?;
    
    // 定义要分析的PHP代码
    let php_code = r#"
<?php
// 这是一个简单的PHP函数
function greet($name) {
    if (empty($name)) {
        echo "Hello, stranger!";
    } else {
        echo "Hello, $name!";
    }
}

greet("Alice");
?>
"#;

    // 解析PHP代码
    let tree = parser.parse(php_code, None)?;
    
    // 打印语法树
    println!("解析结果:");
    print_tree(&tree, php_code);
    
    // 高亮显示代码
    println!("\n高亮显示:");
    highlight_code(&tree, php_code)?;
    
    Ok(())
}

// 递归打印语法树
fn print_tree(tree: &Tree, source: &str) {
    let root_node = tree.root_node();
    print_node(&root_node, source, 0);
}

fn print_node(node: &tree_sitter::Node, source: &str, depth: usize) {
    // 缩进表示层级
    let indent = "  ".repeat(depth);
    
    // 获取节点类型和文本内容
    let node_type = node.kind();
    let node_text = node.utf8_text(source.as_bytes()).unwrap_or("");
    
    // 打印节点信息
    println!("{}{}: '{}'", indent, node_type, node_text);
    
    // 递归处理子节点
    for child in node.children(&mut node.walk()) {
        print_node(&child, source, depth + 1);
    }
}

// 代码高亮实现
fn highlight_code(tree: &Tree, source: &str) -> Result<(), Box<dyn Error>> {
    use tree_sitter_highlight::{HighlightConfiguration, HighlightEvent, Highlighter};
    
    // 初始化高亮器
    let mut highlighter = Highlighter::new();
    
    // 配置PHP高亮规则
    let mut config = HighlightConfiguration::new(
        language(),
        tree_sitter_php::HIGHLIGHTS_QUERY,
        tree_sitter_php::INJECTIONS_QUERY,
        tree_sitter_php::LOCALS_QUERY,
    )?;
    
    // 定义高亮类型
    config.configure(&[
        "keyword",    // PHP关键字
        "function",   // 函数定义
        "variable",   // 变量
        "string",     // 字符串
        "comment",    // 注释
        "operator",   // 操作符
        "number",     // 数字
        "type",       // 类型
        "delimiter"   // 分隔符
    ]);
    
    // 执行高亮
    let highlights = highlighter.highlight(
        &config,
        source.as_bytes(),
        None,
        |_| None
    )?;
    
    // 处理高亮事件
    for event in highlights {
        match event? {
            HighlightEvent::Source { start, end } => {
                // 打印源代码
                print!("{}", &source[start..end]);
            }
            HighlightEvent::HighlightStart(style) => {
                // 根据类型应用不同颜色
                let color = match style {
                    0 => "\x1b[31m", // 关键字 - 红色
                    1 => "\x1b[36m", // 函数 - 青色
                    2 => "\x1b[34m", // 变量 - 蓝色
                    3 => "\x1b[32m", // 字符串 - 绿色
                    4 => "\x1b[90m", // 注释 - 灰色
                    _ => "\x1b[0m",  // 默认
                };
                print!("{}", color);
            }
            HighlightEvent::HighlightEnd => {
                // 重置颜色
                print!("\x1b[0m");
            }
        }
    }
    
    println!();
    Ok(())
}

1 回复

Rust语法分析工具tree-sitter-php的使用

介绍

tree-sitter-php是一个用Rust实现的PHP语言解析库,基于tree-sitter解析器生成系统。它能够高效地解析PHP代码,生成语法树,适用于代码编辑器、语法高亮、静态分析工具等场景。

主要特点:

  • 高性能:增量解析能力,只重新解析修改的部分
  • 鲁棒性:能处理不完整或错误的代码
  • 精确的语法树:提供详细的语法节点信息
  • 跨平台:可在多种操作系统上运行

安装方法

首先需要在Cargo.toml中添加依赖:

[dependencies]
tree-sitter = "0.20"
tree-sitter-php = "0.19"

基本使用方法

1. 解析PHP代码

use tree_sitter::Parser;
use tree_sitter_php::language;

fn main() {
    let code = r#"
        <?php
        function hello($name) {
            echo "Hello, " . $name;
        }
    "#;

    let mut parser = Parser::new();
    parser.set_language(language()).expect("Error loading PHP grammar");
    
    let tree = parser.parse(code, None).unwrap();
    let root_node = tree.root_node();
    
    println!("{}", root_node.to_sexp());
}

2. 遍历语法树

fn print_nodes(node: tree_sitter::Node, indent: usize) {
    let indent_str = " ".repeat(indent);
    println!("{}{:?}", indent_str, node);
    
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        print_nodes(child, indent + 2);
    }
}

// 在main函数中使用
print_nodes(root_node, 0);

3. 查询语法树

use tree_sitter::Query;

fn find_functions(tree: &tree_sitter::Tree) {
    let query_str = "(function_definition name: (name) @function-name)";
    let query = Query::new(language(), query_str).unwrap();
    let mut cursor = QueryCursor::new();
    
    for match_ in cursor.matches(&query, tree.root_node(), code.as_bytes()) {
        for capture in match_.captures {
            let node = capture.node;
            println!("Found function: {}", node.utf8_text(code.as_bytes()).unwrap());
        }
    }
}

高级用法

1. 增量解析

let mut parser = Parser::new();
parser.set_language(language()).unwrap();

// 第一次解析
let tree = parser.parse(code, None).unwrap();

// 修改代码后增量解析
let edited code = code.replace("hello", "greet");
let tree = parser.parse(&edited_code, Some(&tree)).unwrap();

2. 语法高亮实现

use tree_sitter_highlight::{HighlightConfiguration, Highlighter, HighlightEvent};

fn highlight_php(code: &str) {
    let mut highlighter = Highlighter::new();
    let config = HighlightConfiguration::new(
        language(),
        tree_sitter_php::HIGHLIGHTS_QUERY,
        tree_sitter_php::INJECTIONS_QUERY,
        tree_sitter_php::LOCALS_QUERY,
    ).unwrap();
    
    let highlights = highlighter.highlight(
        &config,
        code.as_bytes(),
        None,
        |_| None
    ).unwrap();
    
    for event in highlights {
        match event.unwrap() {
            HighlightEvent::Source {start, end} => {
                println!("Text: {}", &code[start..end]);
            }
            HighlightEvent::HighlightStart(s) => {
                println!("Start highlight: {:?}", s);
            }
            HighlightEvent::HighlightEnd => {
                println!("End highlight");
            }
        }
    }
}

完整示例demo

以下是一个完整的PHP代码分析工具示例,结合了上述各种功能:

use tree_sitter::{Parser, Node, Query, QueryCursor};
use tree_sitter_php::language;

// 定义函数信息结构体
#[derive(Debug)]
struct FunctionInfo {
    name: String,
    parameters: Vec<String>,
    return_type: Option<String>,
}

fn main() {
    // 示例PHP代码
    let php_code = r#"
        <?php
        function add(int $a, int $b): int {
            return $a + $b;
        }
        
        function greet(string $name, string $title = 'Mr.'): void {
            echo "Hello, $title $name";
        }
        
        class Calculator {
            public function multiply(float $x, float $y): float {
                return $x * $y;
            }
        }
    "#;

    // 1. 解析PHP代码
    let mut parser = Parser::new();
    parser.set_language(language()).expect("Error loading PHP grammar");
    let tree = parser.parse(php_code, None).unwrap();
    
    // 2. 打印语法树结构
    println!("Syntax Tree:");
    println!("{}", tree.root_node().to_sexp());
    println!();
    
    // 3. 提取函数信息
    let functions = extract_functions(php_code, &tree);
    println!("Found {} functions:", functions.len());
    for func in functions {
        println!("{:#?}", func);
    }
    println!();
    
    // 4. 使用查询语法查找所有方法调用
    find_method_calls(php_code, &tree);
}

// 提取函数信息的函数
fn extract_functions(code: &str, tree: &tree_sitter::Tree) -> Vec<FunctionInfo> {
    let mut functions = Vec::new();
    let root_node = tree.root_node();
    
    // 遍历语法树
    traverse(&root_node, &mut |node| {
        if node.kind() == "function_definition" {
            let name_node = node.child_by_field_name("name").unwrap();
            let name = name_node.utf8_text(code.as_bytes()).unwrap().to_string();
            
            // 提取参数
            let mut parameters = Vec::new();
            if let Some(params_node) = node.child_by_field_name("parameters") {
                for param in params_node.children(&mut params_node.walk()) {
                    if param.kind() == "simple_parameter" {
                        if let Some(name_node) = param.child_by_field_name("name") {
                            let param_name = name_node.utf8_text(code.as_bytes()).unwrap().to_string();
                            parameters.push(param_name);
                        }
                    }
                }
            }
            
            // 提取返回类型
            let return_type = node.child_by_field_name("return_type")
                .map(|rt| rt.utf8_text(code.as_bytes()).unwrap().to_string());
            
            functions.push(FunctionInfo {
                name,
                parameters,
                return_type,
            });
        }
    });
    
    functions
}

// 查找方法调用的函数
fn find_method_calls(code: &str, tree: &tree_sitter::Tree) {
    let query_str = "
        (method_call_expression
            object: (variable_name) @object
            name: (name) @method
        )";
    
    let query = Query::new(language(), query_str).unwrap();
    let mut cursor = QueryCursor::new();
    
    println!("Method calls found:");
    for match_ in cursor.matches(&query, tree.root_node(), code.as_bytes()) {
        let mut object = None;
        let mut method = None;
        
        for capture in match_.captures {
            let node = capture.node;
            match capture.index {
                0 => object = Some(node.utf8_text(code.as_bytes()).unwrap()),
                1 => method = Some(node.utf8_text(code.as_bytes()).unwrap()),
                _ => {}
            }
        }
        
        if let (Some(obj), Some(meth)) = (object, method) {
            println!("{}.{}()", obj, meth);
        }
    }
}

// 递归遍历语法树的辅助函数
fn traverse(node: &Node, callback: &mut dyn FnMut(&Node)) {
    callback(node);
    
    let mut cursor = node.walk();
    for child in node.children(&mut cursor) {
        traverse(&child, callback);
    }
}

注意事项

  1. tree-sitter-php需要与tree-sitter库配合使用
  2. 解析错误时,语法树会包含error节点而不是直接失败
  3. 对于大型PHP文件,建议使用增量解析以提高性能
  4. 查询语法时,确保查询字符串与当前版本的语法定义匹配

通过tree-sitter-php,开发者可以轻松构建PHP代码分析、转换和高亮工具,充分利用Rust的性能优势和tree-sitter的强大解析能力。

回到顶部