Does a non-atomic load conflict with CompareAndSwap in Go?

Can a non-atomic read race with atomic.CompareAndSwap? I have the following code:

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	x := int32(0)
	for i := 0; i < 100000; i++ {
		n := int32(0)
		wg := sync.WaitGroup{}
		wg.Add(2)
		go func() {
			atomic.CompareAndSwapInt32(&n, 1, 2)
			wg.Done()
		}()
		go func() {
			x += n
			wg.Done()
		}()
		wg.Wait()
	}
	fmt.Println(x)
}

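This is a very simple program: there are two goroutines, one of which calls atomic.CompareAndSwap() on n, while the other reads n non-atomically. Running it with go run -race main.go produces the following output: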

==================
WARNING: DATA RACE
Read at 0x00c00001413c by goroutine 8:
  main.main.func2()
      /tmp/atomictest/main.go:20 +0x59

Previous write at 0x00c00001413c by goroutine 7:
  sync/atomic.CompareAndSwapInt32()
      /home/jauhararifin/.gvm/gos/go1.21/src/runtime/race_amd64.s:310 +0xb
  sync/atomic.CompareAndSwapInt32()
      <autogenerated>:1 +0x18

Goroutine 8 (running) created at:
  main.main()
      /tmp/atomictest/main.go:19 +0x4a

Goroutine 7 (running) created at:
  main.main()
      /tmp/atomictest/main.go:15 +0x17c
==================

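A data race is detected because I read n non-atomically while atomic.CompareAndSwap(&n, ..., ...) runs concurrently. Note that I set up the CAS so that it always fails.

This is understandable, since I do not use an atomic operation to read n. But I then read the implementation of Go's sync.Mutex: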

func (m *Mutex) Lock() {
	// Fast path: grab unlocked mutex.
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		if race.Enabled {
			race.Acquire(unsafe.Pointer(m))
		}
		return
	}
	// Slow path (outlined so that the fast path can be inlined)
	m.lockSlow()
}

func (m *Mutex) lockSlow() {
	var waitStartTime int64
	starving := false
	awoke := false
	iter := 0
	old := m.state
	...
}

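As you can see, there may be concurrent access to m.state. One goroutine can perform a CAS on &m.state in the fast path while another goroutine executes old := m.state in the slow path. Yet calling Lock() concurrently with the -race flag does not produce any data race report. How is this achieved?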

yuanlaile #1

peakedshout:

It looks like the standard library has some kind of ignore mechanism?

That was my first hypothesis. But I don't know how to prove it.

caililin #2

if race.Enabled {
	race.Acquire(unsafe.Pointer(m))
}

This tells the detector that there is no race here.

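sinazl #3 (original poster)

That seems to be right. Thank you for clearing this up for me. I had been thinking about detection at run time and did not expect the key to lie in how the code is built in cmd.

Thanks again.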

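h691938207 #4

Hmm, you are right. This is very interesting. I modified the standard library and added a function:

func (m *Mutex) Test() {
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		if race.Enabled {
			race.Acquire(unsafe.Pointer(m))
		}
		return
	}
	_ = m.state
}

and found that calling it produced no warning? It looks like the standard library has some kind of ignore mechanism?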

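yuanlaile #5

Oh, I misunderstood your question. I re-read the code you provided and the mutex source code.

The two examples are fundamentally different. You need to understand what atomic.CompareAndSwapInt32 does:

Begin atomic execution
Read the state variable, compare it with the old value, and write the new value if they match
End atomic execution

The only goroutine that can perform the above is the first goroutine to grab the lock (A).

So what are the subsequent goroutines doing?

Begin atomic execution
Read the state variable, compare it with the old value, find a mismatch, and do not write the new value
End atomic execution
Read the state variable (old := m.state)
…

Do you see it? The subsequent goroutines only perform reads; no write takes place. Concurrent reads of a variable do not produce a race.

A simplified example is: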

type T struct {
	x int32
}

func (t *T) X() {
	if atomic.CompareAndSwapInt32(&t.x, 0, 1) {
		return
	}
	_ = t.x
	//fmt.Println(t.x)
}

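yuanlaile #6

However, even when race.Acquire(unsafe.Pointer(m)) is not triggered, a data race should still occur. For example, suppose there are 3 goroutines:

(1) goroutine 1: locks the mutex and succeeds in the fast path
(2) goroutine 2: tries to lock the mutex in the fast path, fails, and enters the slow path
(3) goroutine 2 is now about to execute old := m.state
(4) goroutine 3: locks the mutex; it attempts atomic.CompareAndSwap(...) in the fast path
(5) goroutine 1: releases the lock
(6) There is a data race here: goroutine 2 now executes old := m.state while goroutine 3 concurrently executes atomic.CompareAndSwap(...). Neither goroutine 2 nor goroutine 3 has called race.Acquire(unsafe.Pointer(m)).

In the scenario above, goroutines 2 and 3 access m concurrently, one performing a load and the other a CAS, with no call to race.Acquire in between. Yet no data race is caught.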

gougou168 #7

I looked up some material:

llvm/llvm-project/blob/main/compiler-rt/lib/tsan/go/tsan_go.cpp

//===-- tsan_go.cpp -------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// ThreadSanitizer runtime for Go language.
//
//===----------------------------------------------------------------------===//

#include "tsan_rtl.h"
#include "tsan_symbolize.h"
#include "sanitizer_common/sanitizer_common.h"
#include <stdlib.h>

namespace __tsan {

void InitializeInterceptors() {

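(This file has been truncated.)

And some source code: src/runtime/race/* src/runtime/race.go src/sync/*

It looks like race detection is carried out by an external tool, but I still don't know how specific packages get registered to be ignored. It seems I would have to analyze the goroutine workflow... but I don't have the energy to keep exploring right now. I'll wait for other people's answers.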

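bupafengyu #8

Do you see it? The subsequent goroutines only perform reads; no write takes place. Concurrent reads of a variable do not produce a race.

If you check my original code: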

n := int32(0)
go func() {
	atomic.CompareAndSwapInt32(&n, 1, 2)
}()
go func() {
	x += n
}()

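The atomic.CompareAndSwapInt32 above never succeeds in swapping n, because n is initialized to 0 and the CAS tries to swap it from 1 to 2. So the CAS always fails: it only reads n and does nothing else. At the end of this execution, n is always guaranteed to be zero. So no write takes place in my original code either, yet Go counts it as a data race.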
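
As a rough single-goroutine sketch of that point (the helper name failedCAS is invented here for illustration), the failing CAS reports false and leaves n untouched:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// failedCAS mirrors the call in the question: n starts at 0 and the CAS
// expects 1, so the swap can never happen.
func failedCAS() (swapped bool, final int32) {
	n := int32(0)
	swapped = atomic.CompareAndSwapInt32(&n, 1, 2)
	return swapped, atomic.LoadInt32(&n)
}

func main() {
	swapped, final := failedCAS()
	fmt.Println(swapped, final) // prints "false 0"
}
```

The race report in the question nevertheless labels the CompareAndSwapInt32 access as a "Previous write", so the detector appears to treat the CAS as a write whether or not the swap actually happens.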

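If you look at my example above:

(1) goroutine 1: locks the mutex and succeeds in the fast path
(2) goroutine 2: tries to lock the mutex in the fast path, fails, and enters the slow path
(3) goroutine 2 is now about to execute old := m.state
(4) goroutine 3: locks the mutex; it attempts atomic.CompareAndSwap(...) in the fast path
(5) goroutine 1: releases the lock
(6) The data race occurs: goroutine 2 now executes old := m.state while goroutine 3 concurrently executes atomic.CompareAndSwap(...). Neither goroutine 2 nor goroutine 3 has called race.Acquire(unsafe.Pointer(m)).

In step (6) above, goroutines 2 and 3 can both access the same data, m.state: goroutine 3 will write it via CAS, while goroutine 2 will read it without an atomic operation. But this does not trigger data race detection.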

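ionicwang #9

I spent some time studying how Go generates SSA when the -race flag is enabled. It turns out that when you read a variable, Go adds an extra call runtime.raceread instruction to the SSA. This instruction is not added normally; it is only added when you build with the -race flag.

Interestingly, in mutex.go (inside the sync package), no runtime.raceread call is added.

So I searched the Go source for occurrences of raceread in the code generation part, and found this: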

// src/cmd/compile/internal/ssagen/ssa.go
ir.Syms.Raceread = typecheck.LookupRuntimeFunc("raceread")

Then I looked at where ir.Syms.Raceread is used and found this:

// src/cmd/compile/internal/ssagen/ssa.go
func (s *state) instrument2(t *types.Type, addr, addr2 *ssa.Value, kind instrumentKind) {
...
	} else if base.Flag.Race {
		// for non-composite objects we can write just the start
		// address, as any write must write the first byte.
		switch kind {
		case instrumentRead:
			fn = ir.Syms.Raceread
		case instrumentWrite:
			fn = ir.Syms.Racewrite
		default:
			panic("unreachable")
		}
	} else if base.Flag.ASan {

My intuition told me that instrument2 is called when the -race flag is added, so I took a closer look at instrument2:

func (s *state) instrument2(t *types.Type, addr, addr2 *ssa.Value, kind instrumentKind) {
	if !s.curfn.InstrumentBody() {
		return
	}

It looks like the instrumentation code is skipped when InstrumentBody() returns false.

func (f *Func) InstrumentBody() bool     { return f.flags&funcInstrumentBody != 0 }
func (f *Func) SetInstrumentBody(b bool) { f.flags.set(funcInstrumentBody, b) }

So I looked at who calls SetInstrumentBody.

	if !base.Flag.Race || !base.Compiling(base.NoRacePkgs) {
		fn.SetInstrumentBody(true)
	}

base.NoRacePkgs looked promising, so I looked inside it:

var NoRacePkgs = []string{"sync", "sync/atomic"}

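Yes, it looks like the sync package is an exception: the instrumentation code is not added there. So even if there are data races inside the sync package, Go will not report them.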
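
The decision described above can be modeled in a few lines. This is only a simplified sketch, not the compiler's code: compiling and instrumentBody are hypothetical stand-ins for base.Compiling and the quoted condition, and it assumes that base.Compiling simply checks whether the package currently being compiled appears in the given list.

```go
package main

import "fmt"

// noRacePkgs copies the list quoted above from the compiler source.
var noRacePkgs = []string{"sync", "sync/atomic"}

// compiling is a hypothetical stand-in for base.Compiling: it reports
// whether the package being compiled (pkgPath) appears in pkgs.
func compiling(pkgPath string, pkgs []string) bool {
	for _, p := range pkgs {
		if p == pkgPath {
			return true
		}
	}
	return false
}

// instrumentBody models the quoted condition
// !base.Flag.Race || !base.Compiling(base.NoRacePkgs)
// for a given -race setting and package path.
func instrumentBody(raceFlag bool, pkgPath string) bool {
	return !raceFlag || !compiling(pkgPath, noRacePkgs)
}

func main() {
	for _, pkg := range []string{"main", "sync", "sync/atomic"} {
		fmt.Printf("%-12s instrumented under -race: %v\n", pkg, instrumentBody(true, pkg))
	}
}
```

Under this model, user packages such as main get the raceread/racewrite calls while sync and sync/atomic do not, which would match the observation that the plain read of m.state in lockSlow goes unreported.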

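yibo5220 #10

In Go, a non-atomic read does race with atomic.CompareAndSwap, but the implementation of sync.Mutex avoids the problem through a specific memory access pattern. The key lies in Go's memory model and in how the compiler and runtime handle race detection.

Your example code does contain a data race, because the non-atomic read of n runs concurrently with the CAS. The implementation of sync.Mutex is not reported as racy because of:

Memory access pattern: when m.state is read in lockSlow(), the read is non-atomic, but the mutex is already locked or contended at that point, and what is read is the current goroutine's local view.
Race detector heuristics: Go's race detector gives special treatment to certain patterns, in particular the synchronization primitives in the standard library.
Memory barriers and compiler optimizations: sync/atomic operations carry implicit memory barriers, which affect compiler optimizations and memory visibility.

Here is a clearer example: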

package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type CustomMutex struct {
	state int32
}

func (m *CustomMutex) Lock() {
	// Fast path: try to acquire the lock directly
	if atomic.CompareAndSwapInt32(&m.state, 0, 1) {
		return
	}
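	// Slow path: read the current state
	old := m.state // non-atomic read, but the lock is already held at this point
	fmt.Printf("Current state: %d\n", old)
	// A real implementation would have more complex logic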
}

func main() {
	var mu CustomMutex
	var wg sync.WaitGroup
	
	// Simulate concurrent access
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			mu.Lock()
			// Critical section
			atomic.StoreInt32(&mu.state, 0) // atomic release
			wg.Done()
		}()
	}
	wg.Wait()
}

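The key differences are:

When m.state is read in lockSlow() of sync.Mutex, the value has already been modified through an atomic operation (CAS), and later operations re-validate the state
The standard library implementation is carefully designed so that correctness is preserved even with a non-atomic read
The race detector treats standard library patterns specially

For user code, the safe approach is to always use atomic operations for concurrent access: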

// The safe way: use atomic.Load
func safeRead(addr *int32) int32 {
	return atomic.LoadInt32(addr)
}

// Or use atomic.Value for more complex types
var sharedValue atomic.Value

func writer() {
	sharedValue.Store("new value")
}

func reader() {
	val := sharedValue.Load()
	fmt.Println(val)
}

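Summary: although the implementation of sync.Mutex mixes atomic and non-atomic operations, this is a specially designed and verified pattern. In user code, to avoid data races, concurrent access to shared variables should always be protected by atomic operations or a mutex.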
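
Applied to the program from the question, that advice might look like the sketch below. The function name sumWithAtomicLoad is made up for illustration; the only substantive change from the original is that the reader goroutine uses atomic.LoadInt32 instead of a plain read of n, so both accesses to n are atomic and the race report from the question should no longer be expected.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// sumWithAtomicLoad repeats the experiment from the question, but the
// reader goroutine uses atomic.LoadInt32 instead of a plain read of n.
func sumWithAtomicLoad(iterations int) int32 {
	x := int32(0)
	for i := 0; i < iterations; i++ {
		n := int32(0)
		var wg sync.WaitGroup
		wg.Add(2)
		go func() {
			defer wg.Done()
			atomic.CompareAndSwapInt32(&n, 1, 2) // always fails: n is 0
		}()
		go func() {
			defer wg.Done()
			x += atomic.LoadInt32(&n) // atomic read instead of a plain load
		}()
		wg.Wait() // orders this iteration's write to x before the next one
	}
	return x
}

func main() {
	fmt.Println(sumWithAtomicLoad(100000)) // prints 0
}
```

Only one goroutine touches x per iteration and wg.Wait() separates the iterations, so x itself does not need an atomic operation here.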