Rust语法分析工具tree-sitter-php的使用,PHP语言解析库tree-sitter-php实现高效代码解析与语法高亮
tree-sitter-php
PHP 语法分析器,基于 tree-sitter 实现。
安装
在项目目录中运行以下 Cargo 命令:
cargo add tree-sitter-php
或者在 Cargo.toml 中添加以下行:
tree-sitter-php = "0.23.11"
使用示例
下面是一个完整的 Rust 示例,展示如何使用 tree-sitter-php 解析 PHP 代码并实现语法高亮:
use tree_sitter::{Parser, Tree};
use tree_sitter_php::language;
fn main() {
// 创建解析器
let mut parser = Parser::new();
// 设置PHP语言
parser.set_language(language()).expect("Error loading PHP grammar");
// PHP示例代码
let php_code = r#"
<?php
function hello($name) {
echo "Hello, " . $name . "!";
}
hello("World");
?>
"#;
// 解析代码
let tree = parser.parse(php_code, None).expect("Error parsing PHP code");
// 打印语法树
print_syntax_tree(&tree, php_code);
// 实现简单的语法高亮
highlight_php_code(&tree, php_code);
}
fn print_syntax_tree(tree: &Tree, source: &str) {
let root_node = tree.root_node();
println!("Syntax Tree:");
print_node(&root_node, source, 0);
}
fn print_node(node: &tree_sitter::Node, source: &str, indent: usize) {
let indent_str = " ".repeat(indent * 2);
let node_type = node.kind();
let node_text = node.utf8_text(source.as_bytes()).unwrap();
println!("{}{}: {}", indent_str, node_type, node_text);
// 递归打印子节点
for child in node.children(&mut node.walk()) {
print_node(&child, source, indent + 1);
}
}
fn highlight_php_code(tree: &Tree, source: &str) {
use tree_sitter_highlight::{HighlightConfiguration, HighlightEvent, Highlighter};
// 创建高亮器
let mut highlighter = Highlighter::new();
// 配置PHP的高亮规则
let mut config = HighlightConfiguration::new(
language(),
tree_sitter_php::HIGHLIGHTS_QUERY,
tree_sitter_php::INJECTIONS_QUERY,
tree_sitter_php::LOCALS_QUERY,
).unwrap();
// 设置高亮范围名称
config.configure(&[
"keyword", "function", "variable", "string", "comment",
"operator", "number", "type", "delimiter"
]);
// 执行高亮
let highlights = highlighter.highlight(
&config,
source.as_bytes(),
None,
|_| None
).unwrap();
println!("\nSyntax Highlighting:");
for event in highlights {
match event.unwrap() {
HighlightEvent::Source {start, end} => {
let text = &source[start..end];
print!("{}", text);
}
HighlightEvent::HighlightStart(idx) => {
let color = match idx {
0 => "\x1b[31m", // keyword - red
1 => "\x1b[36m", // function - cyan
2 => "\x1b[34m", // variable - blue
3 => "\x1b[32m", // string - green
4 => "\x1b[90m", // comment - gray
_ => "\x1b[0m", // default
};
print!("{}", color);
}
HighlightEvent::HighlightEnd => {
print!("\x1b[0m"); // reset color
}
}
}
println!();
}
功能说明
- 语法分析:使用 tree-sitter-php 解析 PHP 代码并生成语法树
- 语法树可视化:递归打印语法树结构
- 语法高亮:基于语法树实现简单的 PHP 代码高亮显示
输出示例
运行上述代码将输出类似以下内容:
Syntax Tree:
program:
php_tag: <?php
function_definition:
name: hello
parameters:
name: name
compound_statement:
echo_statement:
string: "Hello, "
concatenation: .
variable: name
concatenation: .
string: "!"
expression_statement:
function_call_expression:
name: hello
arguments:
string: "World"
php_tag: ?>
Syntax Highlighting:
<?php
function hello($name) {
echo "Hello, " . $name . "!";
}
hello("World");
?>
完整示例代码
// 引入必要的库
use std::error::Error;
use tree_sitter::{Parser, Tree};
use tree_sitter_php::language;
fn main() -> Result<(), Box<dyn Error>> {
// 创建解析器实例
let mut parser = Parser::new();
// 加载PHP语法
parser.set_language(language())?;
// 定义要分析的PHP代码
let php_code = r#"
<?php
// 这是一个简单的PHP函数
function greet($name) {
if (empty($name)) {
echo "Hello, stranger!";
} else {
echo "Hello, $name!";
}
}
greet("Alice");
?>
"#;
// 解析PHP代码
let tree = parser.parse(php_code, None)?;
// 打印语法树
println!("解析结果:");
print_tree(&tree, php_code);
// 高亮显示代码
println!("\n高亮显示:");
highlight_code(&tree, php_code)?;
Ok(())
}
// 递归打印语法树
fn print_tree(tree: &Tree, source: &str) {
let root_node = tree.root_node();
print_node(&root_node, source, 0);
}
fn print_node(node: &tree_sitter::Node, source: &str, depth: usize) {
// 缩进表示层级
let indent = " ".repeat(depth);
// 获取节点类型和文本内容
let node_type = node.kind();
let node_text = node.utf8_text(source.as_bytes()).unwrap_or("");
// 打印节点信息
println!("{}{}: '{}'", indent, node_type, node_text);
// 递归处理子节点
for child in node.children(&mut node.walk()) {
print_node(&child, source, depth + 1);
}
}
// 代码高亮实现
fn highlight_code(tree: &Tree, source: &str) -> Result<(), Box<dyn Error>> {
use tree_sitter_highlight::{HighlightConfiguration, HighlightEvent, Highlighter};
// 初始化高亮器
let mut highlighter = Highlighter::new();
// 配置PHP高亮规则
let mut config = HighlightConfiguration::new(
language(),
tree_sitter_php::HIGHLIGHTS_QUERY,
tree_sitter_php::INJECTIONS_QUERY,
tree_sitter_php::LOCALS_QUERY,
)?;
// 定义高亮类型
config.configure(&[
"keyword", // PHP关键字
"function", // 函数定义
"variable", // 变量
"string", // 字符串
"comment", // 注释
"operator", // 操作符
"number", // 数字
"type", // 类型
"delimiter" // 分隔符
]);
// 执行高亮
let highlights = highlighter.highlight(
&config,
source.as_bytes(),
None,
|_| None
)?;
// 处理高亮事件
for event in highlights {
match event? {
HighlightEvent::Source { start, end } => {
// 打印源代码
print!("{}", &source[start..end]);
}
HighlightEvent::HighlightStart(style) => {
// 根据类型应用不同颜色
let color = match style {
0 => "\x1b[31m", // 关键字 - 红色
1 => "\x1b[36m", // 函数 - 青色
2 => "\x1b[34m", // 变量 - 蓝色
3 => "\x1b[32m", // 字符串 - 绿色
4 => "\x1b[90m", // 注释 - 灰色
_ => "\x1b[0m", // 默认
};
print!("{}", color);
}
HighlightEvent::HighlightEnd => {
// 重置颜色
print!("\x1b[0m");
}
}
}
println!();
Ok(())
}
1 回复
Rust语法分析工具tree-sitter-php的使用
介绍
tree-sitter-php是一个用Rust实现的PHP语言解析库,基于tree-sitter解析器生成系统。它能够高效地解析PHP代码,生成语法树,适用于代码编辑器、语法高亮、静态分析工具等场景。
主要特点:
- 高性能:增量解析能力,只重新解析修改的部分
- 鲁棒性:能处理不完整或错误的代码
- 精确的语法树:提供详细的语法节点信息
- 跨平台:可在多种操作系统上运行
安装方法
首先需要在Cargo.toml中添加依赖:
[dependencies]
tree-sitter = "0.20"
tree-sitter-php = "0.19"
基本使用方法
1. 解析PHP代码
use tree_sitter::Parser;
use tree_sitter_php::language;
fn main() {
let code = r#"
<?php
function hello($name) {
echo "Hello, " . $name;
}
"#;
let mut parser = Parser::new();
parser.set_language(language()).expect("Error loading PHP grammar");
let tree = parser.parse(code, None).unwrap();
let root_node = tree.root_node();
println!("{}", root_node.to_sexp());
}
2. 遍历语法树
fn print_nodes(node: tree_sitter::Node, indent: usize) {
let indent_str = " ".repeat(indent);
println!("{}{:?}", indent_str, node);
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
print_nodes(child, indent + 2);
}
}
// 在main函数中使用
print_nodes(root_node, 0);
3. 查询语法树
use tree_sitter::Query;
fn find_functions(tree: &tree_sitter::Tree) {
let query_str = "(function_definition name: (name) @function-name)";
let query = Query::new(language(), query_str).unwrap();
let mut cursor = QueryCursor::new();
for match_ in cursor.matches(&query, tree.root_node(), code.as_bytes()) {
for capture in match_.captures {
let node = capture.node;
println!("Found function: {}", node.utf8_text(code.as_bytes()).unwrap());
}
}
}
高级用法
1. 增量解析
let mut parser = Parser::new();
parser.set_language(language()).unwrap();
// 第一次解析
let tree = parser.parse(code, None).unwrap();
// 修改代码后增量解析
let edited code = code.replace("hello", "greet");
let tree = parser.parse(&edited_code, Some(&tree)).unwrap();
2. 语法高亮实现
use tree_sitter_highlight::{HighlightConfiguration, Highlighter, HighlightEvent};
fn highlight_php(code: &str) {
let mut highlighter = Highlighter::new();
let config = HighlightConfiguration::new(
language(),
tree_sitter_php::HIGHLIGHTS_QUERY,
tree_sitter_php::INJECTIONS_QUERY,
tree_sitter_php::LOCALS_QUERY,
).unwrap();
let highlights = highlighter.highlight(
&config,
code.as_bytes(),
None,
|_| None
).unwrap();
for event in highlights {
match event.unwrap() {
HighlightEvent::Source {start, end} => {
println!("Text: {}", &code[start..end]);
}
HighlightEvent::HighlightStart(s) => {
println!("Start highlight: {:?}", s);
}
HighlightEvent::HighlightEnd => {
println!("End highlight");
}
}
}
}
完整示例demo
以下是一个完整的PHP代码分析工具示例,结合了上述各种功能:
use tree_sitter::{Parser, Node, Query, QueryCursor};
use tree_sitter_php::language;
// 定义函数信息结构体
#[derive(Debug)]
struct FunctionInfo {
name: String,
parameters: Vec<String>,
return_type: Option<String>,
}
fn main() {
// 示例PHP代码
let php_code = r#"
<?php
function add(int $a, int $b): int {
return $a + $b;
}
function greet(string $name, string $title = 'Mr.'): void {
echo "Hello, $title $name";
}
class Calculator {
public function multiply(float $x, float $y): float {
return $x * $y;
}
}
"#;
// 1. 解析PHP代码
let mut parser = Parser::new();
parser.set_language(language()).expect("Error loading PHP grammar");
let tree = parser.parse(php_code, None).unwrap();
// 2. 打印语法树结构
println!("Syntax Tree:");
println!("{}", tree.root_node().to_sexp());
println!();
// 3. 提取函数信息
let functions = extract_functions(php_code, &tree);
println!("Found {} functions:", functions.len());
for func in functions {
println!("{:#?}", func);
}
println!();
// 4. 使用查询语法查找所有方法调用
find_method_calls(php_code, &tree);
}
// 提取函数信息的函数
fn extract_functions(code: &str, tree: &tree_sitter::Tree) -> Vec<FunctionInfo> {
let mut functions = Vec::new();
let root_node = tree.root_node();
// 遍历语法树
traverse(&root_node, &mut |node| {
if node.kind() == "function_definition" {
let name_node = node.child_by_field_name("name").unwrap();
let name = name_node.utf8_text(code.as_bytes()).unwrap().to_string();
// 提取参数
let mut parameters = Vec::new();
if let Some(params_node) = node.child_by_field_name("parameters") {
for param in params_node.children(&mut params_node.walk()) {
if param.kind() == "simple_parameter" {
if let Some(name_node) = param.child_by_field_name("name") {
let param_name = name_node.utf8_text(code.as_bytes()).unwrap().to_string();
parameters.push(param_name);
}
}
}
}
// 提取返回类型
let return_type = node.child_by_field_name("return_type")
.map(|rt| rt.utf8_text(code.as_bytes()).unwrap().to_string());
functions.push(FunctionInfo {
name,
parameters,
return_type,
});
}
});
functions
}
// 查找方法调用的函数
fn find_method_calls(code: &str, tree: &tree_sitter::Tree) {
let query_str = "
(method_call_expression
object: (variable_name) @object
name: (name) @method
)";
let query = Query::new(language(), query_str).unwrap();
let mut cursor = QueryCursor::new();
println!("Method calls found:");
for match_ in cursor.matches(&query, tree.root_node(), code.as_bytes()) {
let mut object = None;
let mut method = None;
for capture in match_.captures {
let node = capture.node;
match capture.index {
0 => object = Some(node.utf8_text(code.as_bytes()).unwrap()),
1 => method = Some(node.utf8_text(code.as_bytes()).unwrap()),
_ => {}
}
}
if let (Some(obj), Some(meth)) = (object, method) {
println!("{}.{}()", obj, meth);
}
}
}
// 递归遍历语法树的辅助函数
fn traverse(node: &Node, callback: &mut dyn FnMut(&Node)) {
callback(node);
let mut cursor = node.walk();
for child in node.children(&mut cursor) {
traverse(&child, callback);
}
}
注意事项
- tree-sitter-php需要与tree-sitter库配合使用
- 解析错误时,语法树会包含error节点而不是直接失败
- 对于大型PHP文件,建议使用增量解析以提高性能
- 查询语法时,确保查询字符串与当前版本的语法定义匹配
通过tree-sitter-php,开发者可以轻松构建PHP代码分析、转换和高亮工具,充分利用Rust的性能优势和tree-sitter的强大解析能力。