Golang: why are single-threaded atomic operations in Go so slow?
```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

func main() {
	var t1 uint64 = 0
	var t2 uint64 = 0
	endChan := make(chan int)
	for i := 0; i < 1000; i++ {
		go func() {
			for i := 0; i < 10000; i++ {
				atomic.AddUint64(&t1, 1)
				t2 += 1 // deliberately unsynchronized (data race)
			}
			endChan <- 1
		}()
	}
	for i := 0; i < 1000; i++ {
		<-endChan
	}
	// Show that the non-atomic increment produces an incorrect value
	// t1= 10000000
	// t2= 8513393
	fmt.Println("t1=", t1)
	fmt.Println("t2=", t2)

	// Performance test
	func() {
		var t1 uint64 = 0
		startTime := time.Now()
		for i := 0; i < 1000000000; i++ {
			t1 += 1
		}
		endTime := time.Now()
		fmt.Println("non-atomic time:", endTime.Sub(startTime))
		// non-atomic time: 535.0303ms
	}()
	func() {
		var t1 uint64 = 0
		startTime := time.Now()
		for i := 0; i < 1000000000; i++ {
			atomic.AddUint64(&t1, 1)
		}
		endTime := time.Now()
		fmt.Println("atomic time:", endTime.Sub(startTime))
		// atomic time: 14.7758413s
	}()
}
```
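Aren't atomic operations implemented by locking the bus? With only a single thread, locking the bus shouldn't affect performance, should it?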
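Nice, this is an interesting test. I suspect there are a couple of issues:
1. When timing the non-atomic version, I'm not sure whether the Go compiler simply optimizes the loop away. If you have the energy, you could try two things: (a) replace the for loop with if / else, and (b) wrap t += 1 in a function.
2. Even if the gap really is this large, it is easy to explain in terms of how the instruction pipeline works.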
Correction: I meant "whether the Go compiler simply optimizes away the for loop". I garbled the sentence while editing it.
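By the way, for a real-world use case, you could consider having a single goroutine own the variable t: to increment it, write the delta to a buffered chan, which normally will not block. As for reads, if you don't need an exact value, just read t directly; if you do need an exact value, it gets trickier.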
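The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Counter type and its NewCounter, Add, and Value methods are hypothetical names, not anything from the thread. An exact read is handled here by sending a request to the owning goroutine, which first drains the deltas already queued. Each channel send probably costs more than a single atomic.AddUint64, so treat this as a design sketch rather than a speed-up.

```go
package main

import "fmt"

// Counter is a hypothetical type: a single goroutine owns the total and is
// the only code that ever modifies it.
type Counter struct {
	deltas chan uint64      // buffered, so Add rarely blocks
	reads  chan chan uint64 // requests for an exact value
}

// NewCounter starts the owning goroutine. buffer is the capacity of the
// delta channel.
func NewCounter(buffer int) *Counter {
	c := &Counter{
		deltas: make(chan uint64, buffer),
		reads:  make(chan chan uint64),
	}
	go c.run()
	return c
}

func (c *Counter) run() {
	var t uint64
	for {
		select {
		case d := <-c.deltas:
			t += d
		case reply := <-c.reads:
			// Drain the deltas that are already queued so the reply
			// includes every Add that completed before this request.
			for pending := true; pending; {
				select {
				case d := <-c.deltas:
					t += d
				default:
					pending = false
				}
			}
			reply <- t
		}
	}
}

// Add queues a delta; it blocks only when the buffer is full.
func (c *Counter) Add(delta uint64) { c.deltas <- delta }

// Value asks the owning goroutine for the current total.
func (c *Counter) Value() uint64 {
	reply := make(chan uint64)
	c.reads <- reply
	return <-reply
}

func main() {
	c := NewCounter(1024)
	for i := 0; i < 10000; i++ {
		c.Add(1)
	}
	fmt.Println("t =", c.Value())
}
```

Because t is private to the owning goroutine, there is no unsynchronized read of it in this sketch; an approximate read would need some other mechanism, for example an atomic load of a published copy.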
I don't know Go's memory model very well, but the [implementation](https://github.com/golang/go/blob/master/src/sync/atomic/64bit_arm.go#L27) of the atomic operation used here, atomic.AddUint64, is really just a single [CMPXCHGQ](https://github.com/golang/go/blob/master/src/sync/atomic/asm_amd64.s#L55) instruction, i.e. CAS; the Q stands for quadword.
Lock.
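After wrapping t += 1 in a function: non-atomic time: 3.168774395s, atomic time: 11.310976061s.
I also tried a version that uses a lock; it is about two times slower than the atomic operation.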
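For anyone who wants to repeat the lock comparison, a minimal single-goroutine sketch might look like the one below. The function names countWithMutex and countWithAtomic and the iteration count are assumptions made for this example; the poster's actual locked version was not shown. One plausible reason for the gap is that an uncontended Lock/Unlock pair itself performs atomic operations, so each iteration pays for more than one.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// countWithMutex increments a counter n times, taking a mutex each time.
func countWithMutex(n int) uint64 {
	var mu sync.Mutex
	var t uint64
	for i := 0; i < n; i++ {
		mu.Lock()
		t++
		mu.Unlock()
	}
	return t
}

// countWithAtomic increments a counter n times with atomic.AddUint64.
func countWithAtomic(n int) uint64 {
	var t uint64
	for i := 0; i < n; i++ {
		atomic.AddUint64(&t, 1)
	}
	return t
}

func main() {
	const n = 10000000 // smaller than the original test to keep the run short

	start := time.Now()
	countWithMutex(n)
	fmt.Println("mutex time: ", time.Since(start))

	start = time.Now()
	countWithAtomic(n)
	fmt.Println("atomic time:", time.Since(start))
}
```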
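Atomic operations in C are slow too. Using OS X's OSAtomicAdd64 with the -Os compiler flag, the same test also takes more than 8s; the Go version takes 10s on my machine. The non-atomic C version, however, is extremely fast, probably because the compiler optimized it.
After wrapping it in a function, you also need to pass -gcflags '-l' to disable inlining.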
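As an alternative to the command-line flag, the //go:noinline compiler directive can be placed on the individual functions. The sketch below is only an illustration: addPlain, addAtomic, and timeLoop are hypothetical names, and calling through a function value adds an indirect call to both measurements, so the absolute numbers are not comparable with the ones quoted in this thread.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// addPlain is the non-atomic increment wrapped in a function.
//
//go:noinline
func addPlain(t *uint64) { *t += 1 }

// addAtomic is the atomic increment wrapped in a function.
//
//go:noinline
func addAtomic(t *uint64) { atomic.AddUint64(t, 1) }

// timeLoop calls add n times on a fresh counter and reports the final
// value together with the elapsed time.
func timeLoop(n int, add func(*uint64)) (uint64, time.Duration) {
	var t uint64
	start := time.Now()
	for i := 0; i < n; i++ {
		add(&t)
	}
	return t, time.Since(start)
}

func main() {
	const n = 100000000
	_, plain := timeLoop(n, addPlain)
	_, atom := timeLoop(n, addAtomic)
	fmt.Println("non-atomic time:", plain)
	fmt.Println("atomic time:    ", atom)
}
```

Returning the final counter value from timeLoop keeps the result observable, which makes it harder for the compiler to discard the loop as dead code.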
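Regarding the single-threaded atomic operation performance issue in Golang that you raised, here is my reply:
Golang's atomic operations are generally regarded as an efficient tool for concurrent programming: they guarantee data consistency without using locks and avoid lock overhead such as acquiring the lock, releasing the lock, and possible thread blocking. However, the performance you observe can be affected by several factors:
- Hardware and platform differences: the efficiency of atomic operations depends on the specific hardware platform and operating system. On some hardware architectures, atomic operations may cost more than expected.
- Context switching: although an atomic operation itself completes in user space with little overhead, starting a large number of goroutines may cause excessive context switching and hurt overall performance.
- Memory consistency: atomic operations must ensure memory consistency, which may introduce additional memory barriers and thereby affect performance.
In short, Golang's atomic operations are efficient in most cases, but the actual performance depends on several factors. If you run into performance problems in a single-threaded setting, check whether the code has other potential bottlenecks, or use a profiling tool to diagnose and optimize it. You can also run the test on different hardware and operating system platforms to get a more complete picture of the performance.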