Request for review: a semantic analysis implementation in Go

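Hello. As the title says, I just need a second pair of eyes to check whether my semantic analysis is correct. Essentially, all I am checking is whether a variable has already been declared.

Anyone is welcome to suggest improvements or give advice of any kind: naming conventions, patterns, speedups, and so on.

You can take a look here: main/semantic ← the code I need a little help with.

Thanks in advance, everyone. Your help means a lot to me; I am new to Go.

Regards,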

vueper #1

Is nobody willing to review this?

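htzhanglong #2

Thank you very much for taking the time to review my code. It means a lot to me, so thank you again.

skillian: It seems to me that this should be where the check that an identifier is declared actually happens, but it assumes that if the identifier is not found in the current scope, it must be a global variable. Does this language support closures or nested scopes?

Yes, that confused me too, and it is why I could not settle on the best way to implement it. What I am trying to fix is the following code: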

var a = "global";
function createB() {
  var b = a;
  return b;
}
var a = "other global";
function createC() {
   var c = a;
   return c;
}

puts(createB()); // should print "global", but it prints "other global"
puts(createC()); // prints "other global"

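So my problem was that the environment was not being snapshotted correctly; that is the issue that led me to implement semantic analysis in the first place.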
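One common way to get the behavior described above is to bind each use of a name to a specific declaration during analysis, in source order, instead of looking the name up at run time. The following is only a minimal sketch under that assumption, with a single flat scope; `resolver`, `declare`, and `resolve` are hypothetical names and are not taken from the repository.

```go
package main

import "fmt"

// resolver gives every declaration its own slot, so a later redeclaration
// of the same name cannot change what an earlier reference points to.
// Hypothetical type, for illustration only.
type resolver struct {
	current map[string]int // name -> slot of the most recent declaration
	next    int            // next free slot number
}

func newResolver() *resolver {
	return &resolver{current: map[string]int{}}
}

// declare binds name to a fresh slot and returns that slot.
func (r *resolver) declare(name string) int {
	slot := r.next
	r.next++
	r.current[name] = slot
	return slot
}

// resolve returns the slot that name refers to at this point in the source.
func (r *resolver) resolve(name string) (int, bool) {
	slot, ok := r.current[name]
	return slot, ok
}

func main() {
	r := newResolver()

	// var a = "global";
	r.declare("a")
	// use of a inside createB
	inB, _ := r.resolve("a")

	// var a = "other global";
	r.declare("a")
	// use of a inside createC
	inC, _ := r.resolve("a")

	// The two uses are bound to different slots.
	fmt.Println(inB, inC)
}
```

An interpreter built on this idea would store values by slot rather than by name, so createB keeps reading the slot that held "global" even after the second declaration.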

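Regarding this point:

skillian: Edit: it looks to me as though global variables can be redefined, because you use a slice for the globals rather than a *Scope. Maybe that is intentional, but if not, perhaps you could change the definition of Semantic to:

Yes, my original idea was to allow global variables to be redefined, but you are right, this could be a good point to improve.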

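bupafengyu #3

I only have a basic understanding of compiler theory, but I am a bit confused about what the function (*semantic.Semantic).expectIdentifierDeclare does. It seems to me that this should be where the check that an identifier is declared actually happens, but it assumes that if the identifier is not found in the current scope, it must be a global variable. Does this language support closures or nested scopes? For example: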

function () {
    var a = true;
    {
        var b = a;
    }
}

If it does, you may need to walk through the frames in scopeStack to see whether an outer frame declares the variable.
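A minimal sketch of that walk, assuming the stack is a slice of map-based scopes ordered from outermost to innermost. `Scope`, `Stack`, and `isDeclared` here are illustrative stand-ins and not the types from the repository.

```go
package main

import "fmt"

// Scope maps a variable name to whether it has been initialized.
// Hypothetical stand-in for the repository's Scope type.
type Scope map[string]bool

// Stack holds scopes from outermost (index 0) to innermost (last element).
type Stack []Scope

// isDeclared walks the frames from innermost to outermost and reports
// whether any of them declares name.
func (s Stack) isDeclared(name string) bool {
	for i := len(s) - 1; i >= 0; i-- {
		if _, ok := s[i][name]; ok {
			return true
		}
	}
	return false
}

func main() {
	// Global scope, function body declaring a, nested block declaring b.
	stack := Stack{
		Scope{},
		Scope{"a": true},
		Scope{"b": false},
	}

	// a is found in the enclosing function frame.
	fmt.Println(stack.isDeclared("a"))
	// z is not declared in any frame.
	fmt.Println(stack.isDeclared("z"))
}
```

Searching from the innermost frame outward also gives shadowing for free if the lookup is later changed to return the matching frame instead of a boolean.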

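Edit: also, it looks to me as though global variables can be redefined, because you store the globals in a slice rather than a *Scope. Maybe that is intentional, but if not, perhaps you could change the definition of Semantic to: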

type Semantic struct {
	scopeStack      Stack
	globalVariables Scope
	localVariables  map[ast.Node]int
	errors          []string
}

func New() *Semantic {
	s := &Semantic{
		scopeStack:      make(Stack, 1, 8), // reserve some capacity up front to avoid
		globalVariables: Scope{},           // the early 1 -> 2 -> 4 -> 8 reallocations
	}
	s.scopeStack[0] = &s.globalVariables
	return s
}

/* ... */

func (s *Semantic) declare(name string) {
// This is no longer needed, because there is now always a scope on the stack.
// The first one is the global scope.
//
//	if s.scopeStack.IsEmpty() {
//		s.globalVariables = append(s.globalVariables, name)
//		return
//	}

	peek, _ := s.scopeStack.Peek()
	(*peek).Put(name, false)
}
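If redefinition should be rejected rather than allowed, keeping the globals in a map-like scope turns the check into a single lookup. This is only a sketch: it assumes `Scope` is a map from name to an "initialized" flag, which may not match the repository's type, and the free function `declare` is hypothetical.

```go
package main

import "fmt"

// Scope is a hypothetical map-based scope: name -> "has been initialized".
type Scope map[string]bool

// declare records name in scope and rejects a second declaration.
func declare(scope Scope, name string) error {
	if _, exists := scope[name]; exists {
		return fmt.Errorf("variable %q is already declared in this scope", name)
	}
	// Declared, but not yet initialized.
	scope[name] = false
	return nil
}

func main() {
	globals := Scope{}

	// The first declaration is accepted.
	fmt.Println(declare(globals, "a"))
	// The second declaration of the same name is rejected.
	fmt.Println(declare(globals, "a"))
}
```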

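yibo5220 #4

Here is a code review of your semantic analysis tool:

Code structure analysis

1. Main issue: scope management

Scope management in the current implementation is flawed. The way pushScope and popScope are implemented can break the scope chain: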

// The problem in the current implementation
func (s *Semantic) pushScope() {
    s.scope = &Scope{
        parent: s.scope,
        vars:   make(map[string]bool),
    }
}

func (s *Semantic) popScope() {
    if s.scope.parent != nil {
        s.scope = s.scope.parent
    }
}

Suggested improvement:

type Scope struct {
    parent *Scope
    vars   map[string]struct{} // empty struct values save memory
    depth  int
}

func (s *Semantic) pushScope() {
    depth := 0
    if s.scope != nil {
        depth = s.scope.depth + 1
    }
    
    s.scope = &Scope{
        parent: s.scope,
        vars:   make(map[string]struct{}),
        depth:  depth,
    }
    s.scopeStack = append(s.scopeStack, s.scope) // keep a stack for debugging
}

func (s *Semantic) popScope() error {
    if s.scope == nil {
        return fmt.Errorf("no scope to pop")
    }
    s.scope = s.scope.parent
    s.scopeStack = s.scopeStack[:len(s.scopeStack)-1]
    return nil
}

2. Improving the variable declaration check

The current visitVarDecl method could handle errors better and be more type-safe:

func (s *Semantic) visitVarDecl(node *ast.VarDecl) error {
    if node.Name == "" {
        return fmt.Errorf("variable declaration missing name")
    }
    
    // Check the current scope
    if _, exists := s.scope.vars[node.Name]; exists {
        return fmt.Errorf("variable '%s' already declared in current scope", node.Name)
    }
    
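    // Check all enclosing scopes (when shadowing is not allowed).
    // The current scope was already checked above, so start at its parent.
    if !s.allowShadowing {
        current := s.scope.parent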
        for current != nil {
            if _, exists := current.vars[node.Name]; exists {
                return fmt.Errorf("variable '%s' shadows declaration in outer scope", node.Name)
            }
            current = current.parent
        }
    }
    
    s.scope.vars[node.Name] = struct{}{}
    s.symbols = append(s.symbols, Symbol{
        Name:  node.Name,
        Type:  node.Type,
        Scope: s.scope.depth,
        Pos:   node.Pos,
    })
    
    return nil
}

3. Strengthening the identifier use check

The visitIdentifier method needs a more thorough scope lookup:

func (s *Semantic) visitIdentifier(node *ast.Identifier) error {
    if node.Name == "" {
        return fmt.Errorf("empty identifier")
    }
    
    // Look the variable up along the scope chain
    current := s.scope
    for current != nil {
        if _, exists := current.vars[node.Name]; exists {
            // Record the reference
            s.references = append(s.references, Reference{
                Name:      node.Name,
                Scope:     current.depth,
                Pos:       node.Pos,
                IsDefined: true,
            })
            return nil
        }
        current = current.parent
    }
    
    // Variable not found
    s.errors = append(s.errors, fmt.Errorf("undefined variable '%s' at position %d",
        node.Name, node.Pos))

    s.references = append(s.references, Reference{
        Name:      node.Name,
        Scope:     -1,
        Pos:       node.Pos,
        IsDefined: false,
    })
    
    return nil
}

4. Performance suggestions

Add caching and batch processing:

type Semantic struct {
    scope      *Scope
    scopeStack []*Scope
    symbols    []Symbol
    references []Reference
    errors     []error
    
    // Caches
    symbolCache map[string]Symbol
    scopeCache  map[int]*Scope
    
    // Configuration
    allowShadowing bool
    strictMode     bool
}

func (s *Semantic) Analyze(nodes []ast.Node) (*Result, error) {
    // Preallocate memory
    s.symbols = make([]Symbol, 0, len(nodes)/2)
    s.references = make([]Reference, 0, len(nodes))
    s.errors = make([]error, 0, 10)
    
    // Process the nodes in one batch
    for _, node := range nodes {
        if err := s.visit(node); err != nil {
            if s.strictMode {
                return nil, err
            }
            s.errors = append(s.errors, err)
        }
    }
    
    return &Result{
        Symbols:    s.symbols,
        References: s.references,
        Errors:     s.errors,
    }, nil
}

5. Example test cases

Add thorough test coverage:

func TestSemanticAnalyzer(t *testing.T) {
    tests := []struct {
        name     string
        code     string
        wantErrs int
    }{
        {
            name: "valid variable declaration",
            code: `var x int; var y string;`,
            wantErrs: 0,
        },
        {
            name: "duplicate variable in same scope",
            code: `var x int; var x string;`,
            wantErrs: 1,
        },
        {
            name: "variable shadowing",
            code: `var x int; { var x string; }`,
            wantErrs: 0, // or 1, depending on configuration
        },
        {
            name: "undefined variable",
            code: `x = 10;`,
            wantErrs: 1,
        },
    }
    
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            parser := NewParser(tt.code)
            nodes, _ := parser.Parse()
            
            analyzer := NewSemanticAnalyzer()
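            result, err := analyzer.Analyze(nodes)
            if err != nil {
                // Without this check err is unused and the test does not compile.
                t.Fatalf("Analyze returned an error: %v", err)
            }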
            
            if tt.wantErrs != len(result.Errors) {
                t.Errorf("expected %d errors, got %d", tt.wantErrs, len(result.Errors))
            }
        })
    }
}

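Key improvements

Scope management: fix how the scope chain is maintained and add depth tracking
Error handling: provide more detailed error messages and positions
Memory: use map[string]struct{} instead of map[string]bool
Performance: add caching and preallocation
Configurability: support options such as strict mode and variable shadowing
Test coverage: make sure the various edge cases are handled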
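The memory point above can be illustrated with a small set type. This is a sketch only; `nameSet` and its methods are hypothetical names. Note the trade-off: the empty struct carries no information, so if the analyzer needs to distinguish "declared" from "initialized" (as the `Put(name, false)` call earlier in the thread suggests), a bool value is still the better fit, and the per-entry saving is small in any case.

```go
package main

import (
	"fmt"
	"unsafe"
)

// nameSet is a set of identifiers. Hypothetical type, for illustration only.
type nameSet map[string]struct{}

// add inserts name into the set.
func (s nameSet) add(name string) {
	s[name] = struct{}{}
}

// has reports whether name is in the set.
func (s nameSet) has(name string) bool {
	_, ok := s[name]
	return ok
}

func main() {
	vars := nameSet{}
	vars.add("x")

	// x is in the set, y is not.
	fmt.Println(vars.has("x"), vars.has("y"))

	// Size in bytes of an empty struct value and of a bool value.
	fmt.Println(unsafe.Sizeof(struct{}{}), unsafe.Sizeof(true))
}
```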

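The basic architecture of the current implementation is sound; what mainly needs work is the robustness of scope management and error handling.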