Why Golang virtual memory usage keeps increasing while allocated memory stays constant
I've run into a very strange problem. I wrote some code built around making a large number of concurrent HTTP requests with net/http. It runs fine on my local machine, but on an online server it appears to leak memory in a way that pprof and the runtime never seem to detect.
Whenever I deploy the code to the cloud (I've tried AWS, Google Cloud and Microsoft Azure), system memory is steadily consumed until the process runs out of memory, the code crashes, and the cloud server hangs.
I've been tracking system memory consumption with free -m and top (watching res rather than virt), and the same memory consumption does not show up on my local machine:
Microsoft Azure (ubuntu):
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 505 15055 14 480 15228
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 506 15054 14 481 15227
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 507 15052 14 481 15226
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 508 15051 14 481 15225
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 510 15049 14 481 15223
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 510 15049 14 481 15223
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 510 15049 14 481 15223
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 513 15047 14 481 15220
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 514 15045 14 481 15219
Swap: 0 0 0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16041 516 15043 14 481 15217
Swap: 0 0 0
Desktop (ubuntu 16.04):
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16001 4386 7938 95 3677 11065
Swap: 16343 0 16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16001 4386 7938 95 3677 11065
Swap: 16343 0 16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16001 4386 7938 95 3677 11065
Swap: 16343 0 16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16001 4386 7937 95 3677 11065
Swap: 16343 0 16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16001 4387 7936 95 3677 11064
Swap: 16343 0 16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
total used free shared buff/cache available
Mem: 16001 4387 7936 95 3677 11064
Swap: 16343 0 16343
pprof and the runtime seem unable to detect the leak. go tool pprof -top http://localhost:6060/debug/pprof/heap shows a stable ~2MB of usage even while the system is falling over.
$ go tool pprof -top http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/lewington/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.063.pb.gz
File: main
Build ID: 5cb3aeea203dbef664dde0fb5ed13f564d3476e3
Type: inuse_space
Time: Jan 11, 2019 at 7:01am (UTC)
Showing nodes accounting for 2611.12kB, 100% of 2611.12kB total
flat flat% sum% cum cum%
1537.31kB 58.88% 58.88% 2049.62kB 78.50% bitbucket.org/lewington/autoroller/brain/braindata.newChain
561.50kB 21.50% 80.38% 561.50kB 21.50% html.init
512.31kB 19.62% 100% 512.31kB 19.62% bitbucket.org/lewington/autoroller/brain/scale.newPoints
0 0% 100% 2049.62kB 78.50% bitbucket.org/lewington/autoroller/brain/braindata.(*contestant).addEmptyFluc
0 0% 100% 2049.62kB 78.50% bitbucket.org/lewington/autoroller/brain/braindata.(*contestantBag).AddEmptyFluc
0 0% 100% 2049.62kB 78.50% bitbucket.org/lewington/autoroller/brain/braindb.(*Lib).DetailedRaceOdds
0 0% 100% 2049.62kB 78.50%
Logging runtime.MemStats while the code is running also gives constant output:
time="2019-01-11T08:56:26Z" level=info msg="****** MEMORY USAGE *******"
time="2019-01-11T08:56:26Z" level=info msg="Alloc = 9 MiB"
time="2019-01-11T08:56:26Z" level=info msg="HeapIdle = 5 MiB"
time="2019-01-11T08:56:26Z" level=info msg="HeapInuse = 11 MiB"
time="2019-01-11T08:56:26Z" level=info msg="HeapReleased = 0 MiB"
time="2019-01-11T08:56:26Z" level=info msg="Malloc = 1\n"
time="2019-01-11T08:56:26Z" level=info msg="Frees = 1\n"
time="2019-01-11T08:56:26Z" level=info msg="TotalAlloc = 214 MiB"
time="2019-01-11T08:56:26Z" level=info msg="Sys = 23 MiB"
time="2019-01-11T08:56:26Z" level=info msg="StackInUse = 3 MiB"
time="2019-01-11T08:56:36Z" level=info msg="****** MEMORY USAGE *******"
time="2019-01-11T08:56:36Z" level=info msg="Alloc = 11 MiB"
time="2019-01-11T08:56:36Z" level=info msg="HeapIdle = 2 MiB"
time="2019-01-11T08:56:36Z" level=info msg="HeapInuse = 13 MiB"
time="2019-01-11T08:56:36Z" level=info msg="HeapReleased = 0 MiB"
time="2019-01-11T08:56:36Z" level=info msg="Malloc = 2\n"
time="2019-01-11T08:56:36Z" level=info msg="Frees = 2\n"
time="2019-01-11T08:56:36Z" level=info msg="TotalAlloc = 312 MiB"
time="2019-01-11T08:56:36Z" level=info msg="Sys = 23 MiB"
time="2019-01-11T08:56:36Z" level=info msg="StackInUse = 3 MiB"
time="2019-01-11T08:56:46Z" level=info msg="****** MEMORY USAGE *******"
time="2019-01-11T08:56:46Z" level=info msg="Alloc = 8 MiB"
time="2019-01-11T08:56:46Z" level=info msg="HeapIdle = 6 MiB"
time="2019-01-11T08:56:46Z" level=info msg="HeapInuse = 11 MiB"
time="2019-01-11T08:56:46Z" level=info msg="HeapReleased = 0 MiB"
time="2019-01-11T08:56:46Z" level=info msg="Malloc = 3\n"
time="2019-01-11T08:56:46Z" level=info msg="Frees = 3\n"
time="2019-01-11T08:56:46Z" level=info msg="TotalAlloc = 379 MiB"
time="2019-01-11T08:56:46Z" level=info msg="Sys = 24 MiB"
time="2019-01-11T08:56:46Z" level=info msg="StackInUse = 3 MiB"
Another useful clue comes from go tool pprof -top -alloc_space http://localhost:6060/debug/pprof/heap (which tracks cumulative/total memory usage, i.e. it never goes down). According to this profile, on the virtual server the biggest memory user overall is compress/flate.NewReader, but when I run the same command locally that function doesn't show up at all; instead most of the bytes are attributed to compress/flate.NewWriter.
From the Azure VM:
go tool pprof -top -alloc_space http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/lewington/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.065.pb.gz
File: main
Build ID: 5cb3aeea203dbef664dde0fb5ed13f564d3476e3
Type: alloc_space
Time: Jan 11, 2019 at 7:02am (UTC)
Showing nodes accounting for 2717.45MB, 92.24% of 2946.06MB total
Dropped 186 nodes (cum <= 14.73MB)
flat flat% sum% cum cum%
1118.27MB 37.96% 37.96% 1118.27MB 37.96% compress/flate.NewReader
368.09MB 12.49% 50.45% 368.09MB 12.49% bytes.makeSlice
115.95MB 3.94% 54.39% 1234.22MB 41.89% compress/gzip.(*Reader).Reset
111.88MB 3.80% 58.19% 130.88MB 4.44% net/http.(*Transport).dialConn
104.52MB 3.55% 61.73% 104.52MB 3.55% bitbucket.org/lewington/autoroller/realtime/odds.(*Odds).Odd
104.02MB 3.53% 65.26% 276.05MB 9.37% fmt.Sprintf
86MB 2.92% 68.18% 86MB 2.92% reflect.unsafe_New
75.03MB 2.55% 70.73% 75.03MB 2.55% strconv.appendEscapedRune
60.51MB 2.05% 72.78% 60.51MB 2.05% compress/flate.(*huffmanDecoder).init
58.51MB 1.99% 74.77% 58.51MB 1.99% bitbucket.org/lewington/autoroller/brain/braindata.(*contestant).autorollDegrees
53.51MB 1.82% 76.59% 54.01MB 1.83% fmt.Sprint
49MB 1.66% 78.25% 49MB 1.66% bitbucket.org/lewington/autoroller/realtime.(*interpreter).saveFlucForProvider
48.59MB 1.65% 79.90% 50.09MB 1.70% net.(*dnsMsg).Pack
42.01MB 1.43% 81.33% 42.01MB 1.43% strings.genSplit
42MB 1.43% 82.75% 205.04MB 6.96% github.com/sirupsen/logrus.(*TextFormatter).Format
35.02MB 1.19% 83.94% 35.02MB 1.19% net/textproto.(*Reader).ReadMIMEHeader
23.50MB 0.8% 84.74% 228.54MB 7.76% github.com/sirupsen/logrus.Entry.log
22MB 0.75% 85.48% 22MB 0.75% bitbucket.org/lewington/autoroller/realtime.(*cleaner).intCleaned
Local:
$ go tool pprof -top -alloc_space http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/lewington/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.058.pb.gz
File: main
Build ID: 6996cbd6c46216e87717e5f3b483a90c9021d6b2
Type: alloc_space
Time: Jan 11, 2019 at 8:34pm (AEDT)
Showing nodes accounting for 5868.78kB, 100% of 5868.78kB total
flat flat% sum% cum cum%
3610.34kB 61.52% 61.52% 4843.97kB 82.54% compress/flate.NewWriter
1233.63kB 21.02% 82.54% 1233.63kB 21.02% compress/flate.(*compressor).init
512.75kB 8.74% 91.27% 512.75kB 8.74% bytes.makeSlice
512.05kB 8.73% 100% 512.05kB 8.73% net/http.(*conn).readRequest
0 0% 100% 512.75kB 8.74% bytes.(*Buffer).ReadFrom
0 0% 100% 512.75kB 8.74% bytes.(*Buffer).grow
0 0% 100% 4843.97kB 82.54% compress/gzip.(*Writer).Write
0 0% 100% 512.75kB 8.74% io/ioutil.ReadFile
0 0% 100% 512.75kB 8.74% io/ioutil.readAll
0 0% 100% 5356.72kB 91.27% net/http.(*ServeMux).ServeHTTP
0 0% 100% 5868.78kB 100% net/http.(*conn).serve
Hi all, a quick update: I've found that one of my dependencies, gokogiri, has a known memory leak, and that seems to be the root of the problem.
Update: based on this video, I suspect we might be hitting a cgo memory leak? Does anyone know how to debug this kind of problem?
Hi,
Took a quick look, and I think you have a recursive call that blows up the stack. Go doesn't do tail-call optimization (if I remember correctly), so that could be the problem. The call chain is:
work → timedResponse → request → work
Hi Johandalabaka,
Thanks for taking the time!
If that were the case, wouldn't I see the stack grow over time? e.g. MemStats.StackSys (https://golang.org/pkg/runtime/#MemStats)? I'm logging it at the moment and it stays stable.
Also I think you may have looked a little too quickly, haha: throttle.timedResponse calls bot.request, not throttle.request. Thanks for the suggestion anyway.
In a concurrent HTTP request workload, virtual memory that keeps growing while the Go runtime's memory statistics stay flat usually comes down to how memory is managed at the operating system level. Go's garbage collector does not immediately return memory to the OS; it keeps it around for reuse. Possible causes and fixes are listed below:
1. HTTP connections not closed properly. A response body that is never closed keeps the connection and its associated memory alive:
// Wrong: the response body is never closed
resp, err := http.Get("http://example.com")
if err != nil {
    log.Fatal(err)
}
// forgot to close resp.Body

// Correct
resp, err := http.Get("http://example.com")
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close() // must be closed
body, err := io.ReadAll(resp.Body)
if err != nil {
    log.Fatal(err)
}
2. Connection pool leaks. The default connection pool of the HTTP Transport can let memory pile up:
// Custom Transport to keep the connection pool bounded
transport := &http.Transport{
    MaxIdleConns:        100,
    MaxIdleConnsPerHost: 10,
    IdleConnTimeout:     30 * time.Second,
}
client := &http.Client{
    Transport: transport,
    Timeout:   10 * time.Second,
}

// The response body still has to be closed after each request
req, err := http.NewRequest("GET", "http://example.com", nil)
if err != nil {
    log.Fatal(err)
}
resp, err := client.Do(req)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()
3. Compression readers not properly reset or closed
The pprof output shows compress/flate.NewReader as the biggest memory user:
// Make sure gzip readers are closed when handling compressed responses
resp, err := http.Get("http://example.com/gzipped")
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

var reader io.ReadCloser
switch resp.Header.Get("Content-Encoding") {
case "gzip":
    reader, err = gzip.NewReader(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    defer reader.Close() // the gzip reader must be closed
default:
    reader = resp.Body
}
data, err := io.ReadAll(reader)
if err != nil {
    log.Fatal(err)
}
4. Forcing the Go runtime to release memory. You can periodically force a garbage collection and return memory to the OS:
// Periodically force memory to be returned to the operating system
func freeMemory() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    if m.HeapIdle > 100*1024*1024 { // if the idle heap exceeds 100MB
        debug.FreeOSMemory() // force a GC and return memory to the OS
    }
}

// Call it on a timer
go func() {
    ticker := time.NewTicker(5 * time.Minute)
    defer ticker.Stop()
    for range ticker.C {
        freeMemory()
    }
}()
5. Watching for goroutine leaks. Goroutines that never finish also hold on to memory:
// Expose pprof so goroutine counts can be inspected
import _ "net/http/pprof"

go func() {
    log.Println(http.ListenAndServe("localhost:6060", nil))
}()

// Check the goroutine count
numGoroutines := runtime.NumGoroutine()
fmt.Printf("current number of goroutines: %d\n", numGoroutines)
6. Using a custom memory limit. Set a soft memory limit for the Go runtime:
// Go 1.19+: set a soft memory limit via runtime/debug
debug.SetMemoryLimit(512 * 1024 * 1024) // 512MB
The main point is that cloud environments usually run under more memory pressure, and Go's garbage collection strategy behaves differently in different environments. Making sure resources are released correctly, keeping the connection pool bounded and periodically forcing memory back to the OS can mitigate the continuously growing virtual memory.
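One more thing worth keeping in mind given the gokogiri/cgo suspicion above: memory allocated outside the Go runtime (for example by a cgo-based dependency) never shows up in pprof heap profiles or MemStats, which is exactly how the process RSS can keep growing while the Go-side numbers stay flat. A rough, Linux-only sketch for spotting that gap is below; it simply compares the kernel's VmRSS figure with runtime MemStats.Sys (the helper name is an illustrative assumption, not part of the original code):
// Rough sketch (assumption, Linux-only): compare the OS view of resident
// memory (VmRSS from /proc/self/status) with what the Go runtime has
// reserved for itself (MemStats.Sys). A steadily growing gap suggests
// memory allocated outside the Go heap, e.g. by cgo.
package main

import (
    "bufio"
    "fmt"
    "os"
    "runtime"
    "strconv"
    "strings"
)

// vmRSSKiB returns the process resident set size in KiB as reported by Linux.
func vmRSSKiB() (uint64, error) {
    f, err := os.Open("/proc/self/status")
    if err != nil {
        return 0, err
    }
    defer f.Close()
    s := bufio.NewScanner(f)
    for s.Scan() {
        fields := strings.Fields(s.Text()) // e.g. ["VmRSS:", "123456", "kB"]
        if len(fields) >= 2 && fields[0] == "VmRSS:" {
            return strconv.ParseUint(fields[1], 10, 64)
        }
    }
    return 0, fmt.Errorf("VmRSS not found")
}

func main() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    rss, err := vmRSSKiB()
    if err != nil {
        fmt.Println("could not read VmRSS:", err)
        return
    }
    fmt.Printf("OS RSS:     %d MiB\n", rss/1024)
    fmt.Printf("Go Sys:     %d MiB\n", m.Sys/1024/1024)
    // The difference is memory the Go runtime does not account for
    // (shared libraries, thread stacks, cgo allocations, ...).
    fmt.Printf("difference: %d MiB\n", int64(rss/1024)-int64(m.Sys/1024/1024))
}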

