Why a Go program's virtual memory keeps growing while its allocated memory stays constant

I've run into a very strange problem. I wrote some code built around making a large number of concurrent HTTP requests with net/http. It runs fine on my local machine, but on an online server it appears to leak memory in a way that neither pprof nor the runtime's own memory stats ever detect.

Whenever I deploy the code to the cloud (I've tried AWS, Google Cloud and Microsoft Azure), system memory is steadily consumed until the machine runs out of memory, the code crashes, and the cloud server hangs.

I've been tracking system memory consumption with free -m and top (looking at res rather than virt), and the same memory consumption does not happen on my local machine:

Microsoft Azure (Ubuntu):

              total        used        free      shared  buff/cache   available
Mem:          16041         505       15055          14         480       15228
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         506       15054          14         481       15227
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         507       15052          14         481       15226
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         508       15051          14         481       15225
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         510       15049          14         481       15223
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         510       15049          14         481       15223
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         510       15049          14         481       15223
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         513       15047          14         481       15220
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         514       15045          14         481       15219
Swap:             0           0           0
lewington@gambling:~/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16041         516       15043          14         481       15217
Swap:             0           0           0

Desktop (Ubuntu 16.04):

lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16001        4386        7938          95        3677       11065
Swap:         16343           0       16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16001        4386        7938          95        3677       11065
Swap:         16343           0       16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16001        4386        7938          95        3677       11065
Swap:         16343           0       16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16001        4386        7937          95        3677       11065
Swap:         16343           0       16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16001        4387        7936          95        3677       11064
Swap:         16343           0       16343
lewington@lewington-desktop:~/code/go/src/bitbucket.org/lewington/autoroller$ free -m
              total        used        free      shared  buff/cache   available
Mem:          16001        4387        7936          95        3677       11064
Swap:         16343           0       16343

Neither pprof nor the runtime seems able to detect a leak. go tool pprof -top http://localhost:6060/debug/pprof/heap shows a stable ~2MB in use, even while the system is falling over.

$ go tool pprof -top http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/lewington/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.063.pb.gz
File: main
Build ID: 5cb3aeea203dbef664dde0fb5ed13f564d3476e3
Type: inuse_space
Time: Jan 11, 2019 at 7:01am (UTC)
Showing nodes accounting for 2611.12kB, 100% of 2611.12kB total
      flat  flat%   sum%        cum   cum%
 1537.31kB 58.88% 58.88%  2049.62kB 78.50%  bitbucket.org/lewington/autoroller/brain/braindata.newChain
  561.50kB 21.50% 80.38%   561.50kB 21.50%  html.init
  512.31kB 19.62%   100%   512.31kB 19.62%  bitbucket.org/lewington/autoroller/brain/scale.newPoints
         0     0%   100%  2049.62kB 78.50%  bitbucket.org/lewington/autoroller/brain/braindata.(*contestant).addEmptyFluc
         0     0%   100%  2049.62kB 78.50%  bitbucket.org/lewington/autoroller/brain/braindata.(*contestantBag).AddEmptyFluc
         0     0%   100%  2049.62kB 78.50%  bitbucket.org/lewington/autoroller/brain/braindb.(*Lib).DetailedRaceOdds
         0     0%   100%  2049.62kB 78.50% 

Logging runtime.MemStats while the code runs also gives constant output:

time="2019-01-11T08:56:26Z" level=info msg="****** MEMORY USAGE *******"
time="2019-01-11T08:56:26Z" level=info msg="Alloc = 9 MiB"
time="2019-01-11T08:56:26Z" level=info msg="HeapIdle = 5 MiB"
time="2019-01-11T08:56:26Z" level=info msg="HeapInuse = 11 MiB"
time="2019-01-11T08:56:26Z" level=info msg="HeapReleased = 0 MiB"
time="2019-01-11T08:56:26Z" level=info msg="Malloc = 1\n"
time="2019-01-11T08:56:26Z" level=info msg="Frees = 1\n"
time="2019-01-11T08:56:26Z" level=info msg="TotalAlloc = 214 MiB"
time="2019-01-11T08:56:26Z" level=info msg="Sys = 23 MiB"
time="2019-01-11T08:56:26Z" level=info msg="StackInUse = 3 MiB"

time="2019-01-11T08:56:36Z" level=info msg="****** MEMORY USAGE *******"
time="2019-01-11T08:56:36Z" level=info msg="Alloc = 11 MiB"
time="2019-01-11T08:56:36Z" level=info msg="HeapIdle = 2 MiB"
time="2019-01-11T08:56:36Z" level=info msg="HeapInuse = 13 MiB"
time="2019-01-11T08:56:36Z" level=info msg="HeapReleased = 0 MiB"
time="2019-01-11T08:56:36Z" level=info msg="Malloc = 2\n"
time="2019-01-11T08:56:36Z" level=info msg="Frees = 2\n"
time="2019-01-11T08:56:36Z" level=info msg="TotalAlloc = 312 MiB"
time="2019-01-11T08:56:36Z" level=info msg="Sys = 23 MiB"
time="2019-01-11T08:56:36Z" level=info msg="StackInUse = 3 MiB"

time="2019-01-11T08:56:46Z" level=info msg="****** MEMORY USAGE *******"
time="2019-01-11T08:56:46Z" level=info msg="Alloc = 8 MiB"
time="2019-01-11T08:56:46Z" level=info msg="HeapIdle = 6 MiB"
time="2019-01-11T08:56:46Z" level=info msg="HeapInuse = 11 MiB"
time="2019-01-11T08:56:46Z" level=info msg="HeapReleased = 0 MiB"
time="2019-01-11T08:56:46Z" level=info msg="Malloc = 3\n"
time="2019-01-11T08:56:46Z" level=info msg="Frees = 3\n"
time="2019-01-11T08:56:46Z" level=info msg="TotalAlloc = 379 MiB"
time="2019-01-11T08:56:46Z" level=info msg="Sys = 24 MiB"
time="2019-01-11T08:56:46Z" level=info msg="StackInUse = 3 MiB"

Another useful clue comes from go tool pprof -top -alloc_space http://localhost:6060/debug/pprof/heap (which tracks cumulative/total memory usage, i.e. a number that never goes down). According to this profile, on the virtual server the single biggest memory user is compress/flate.NewReader, but when I run the same command locally that function does not appear at all; instead most of the bytes are attributed to compress/flate.NewWriter.

From the Azure VM:

go tool pprof -top -alloc_space http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/lewington/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.065.pb.gz
File: main
Build ID: 5cb3aeea203dbef664dde0fb5ed13f564d3476e3
Type: alloc_space
Time: Jan 11, 2019 at 7:02am (UTC)
Showing nodes accounting for 2717.45MB, 92.24% of 2946.06MB total
Dropped 186 nodes (cum <= 14.73MB)
      flat  flat%   sum%        cum   cum%
 1118.27MB 37.96% 37.96%  1118.27MB 37.96%  compress/flate.NewReader
  368.09MB 12.49% 50.45%   368.09MB 12.49%  bytes.makeSlice
  115.95MB  3.94% 54.39%  1234.22MB 41.89%  compress/gzip.(*Reader).Reset
  111.88MB  3.80% 58.19%   130.88MB  4.44%  net/http.(*Transport).dialConn
  104.52MB  3.55% 61.73%   104.52MB  3.55%  bitbucket.org/lewington/autoroller/realtime/odds.(*Odds).Odd
  104.02MB  3.53% 65.26%   276.05MB  9.37%  fmt.Sprintf
      86MB  2.92% 68.18%       86MB  2.92%  reflect.unsafe_New
   75.03MB  2.55% 70.73%    75.03MB  2.55%  strconv.appendEscapedRune
   60.51MB  2.05% 72.78%    60.51MB  2.05%  compress/flate.(*huffmanDecoder).init
   58.51MB  1.99% 74.77%    58.51MB  1.99%  bitbucket.org/lewington/autoroller/brain/braindata.(*contestant).autorollDegrees
   53.51MB  1.82% 76.59%    54.01MB  1.83%  fmt.Sprint
      49MB  1.66% 78.25%       49MB  1.66%  bitbucket.org/lewington/autoroller/realtime.(*interpreter).saveFlucForProvider
   48.59MB  1.65% 79.90%    50.09MB  1.70%  net.(*dnsMsg).Pack
   42.01MB  1.43% 81.33%    42.01MB  1.43%  strings.genSplit
      42MB  1.43% 82.75%   205.04MB  6.96%  github.com/sirupsen/logrus.(*TextFormatter).Format
   35.02MB  1.19% 83.94%    35.02MB  1.19%  net/textproto.(*Reader).ReadMIMEHeader
   23.50MB   0.8% 84.74%   228.54MB  7.76%  github.com/sirupsen/logrus.Entry.log
      22MB  0.75% 85.48%       22MB  0.75%  bitbucket.org/lewington/autoroller/realtime.(*cleaner).intCleaned

Local:

$ go tool pprof -top -alloc_space http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /home/lewington/pprof/pprof.main.alloc_objects.alloc_space.inuse_objects.inuse_space.058.pb.gz
File: main
Build ID: 6996cbd6c46216e87717e5f3b483a90c9021d6b2
Type: alloc_space
Time: Jan 11, 2019 at 8:34pm (AEDT)
Showing nodes accounting for 5868.78kB, 100% of 5868.78kB total
      flat  flat%   sum%        cum   cum%
 3610.34kB 61.52% 61.52%  4843.97kB 82.54%  compress/flate.NewWriter
 1233.63kB 21.02% 82.54%  1233.63kB 21.02%  compress/flate.(*compressor).init
  512.75kB  8.74% 91.27%   512.75kB  8.74%  bytes.makeSlice
  512.05kB  8.73%   100%   512.05kB  8.73%  net/http.(*conn).readRequest
         0     0%   100%   512.75kB  8.74%  bytes.(*Buffer).ReadFrom
         0     0%   100%   512.75kB  8.74%  bytes.(*Buffer).grow
         0     0%   100%  4843.97kB 82.54%  compress/gzip.(*Writer).Write
         0     0%   100%   512.75kB  8.74%  io/ioutil.ReadFile
         0     0%   100%   512.75kB  8.74%  io/ioutil.readAll
         0     0%   100%  5356.72kB 91.27%  net/http.(*ServeMux).ServeHTTP
         0     0%   100%  5868.78kB   100%  net/http.(*conn).serve
         0

6 replies

Glad you found it


Hi everyone, a quick update: I found that one of my dependencies, gokogiri, is known to leak memory, and that appears to be the root of the problem.

Update: based on this video I suspect we may be dealing with a cgo memory leak? Does anyone know how to debug this kind of thing?

Hi,

Having had a quick look, I think you have a recursive call that is blowing the stack. Go doesn't do tail-call optimization (if I remember correctly), so that could be the problem. The call chain is:

work → timedResponse → request → work
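
(For illustration only: the suspected pattern would be mutual recursion with no terminating case, roughly as sketched below. The function names are hypothetical stand-ins for the call chain above; each pass through the cycle adds stack frames that are never released.)

// Hypothetical sketch of the suspected unbounded mutual recursion.
func work() {
    timedResponse()
}

func timedResponse() {
    request()
}

func request() {
    work() // calling back into work() keeps the stack growing until it overflows
}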

Hi Johandalabaka,

Thanks for taking the time!

If that were the case, shouldn't I see the stack grow over time, e.g. in MemStats.StackSys (https://golang.org/pkg/runtime/#MemStats)? I'm logging it at the moment and it stays stable.

Also, I think you may have looked a little too quickly haha: throttle.timedResponse calls bot.request, not throttle.request. Thanks for the suggestion all the same.

With highly concurrent HTTP requests, virtual memory that keeps growing while the Go runtime's memory statistics stay flat usually comes down to how memory is handled at the operating-system level. Go's garbage collector does not return memory to the OS immediately; it keeps it around for reuse. Possible causes and fixes:

1. HTTP response bodies not closed. An unclosed response body keeps the connection, and the memory tied to it, from being released:

// Incorrect example
resp, err := http.Get("http://example.com")
if err != nil {
    log.Fatal(err)
}
// resp.Body is never closed

// Correct approach
resp, err := http.Get("http://example.com")
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close() // must always be closed

body, err := io.ReadAll(resp.Body)
if err != nil {
    log.Fatal(err)
}

2. Connection pool leaks. The default connection pool in the HTTP Transport can let memory build up:

// Customize the Transport to control the connection pool
transport := &http.Transport{
    MaxIdleConns:        100,
    MaxIdleConnsPerHost: 10,
    IdleConnTimeout:     30 * time.Second,
}

client := &http.Client{
    Transport: transport,
    Timeout:   10 * time.Second,
}

// The response body must be closed after use
req, err := http.NewRequest("GET", "http://example.com", nil)
if err != nil {
    log.Fatal(err)
}

resp, err := client.Do(req)
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

3. Compression readers not reset or closed correctly. The pprof output shows compress/flate.NewReader as the biggest allocator:

// When handling gzipped responses, make sure readers are closed properly
resp, err := http.Get("http://example.com/gzipped")
if err != nil {
    log.Fatal(err)
}
defer resp.Body.Close()

var reader io.ReadCloser
switch resp.Header.Get("Content-Encoding") {
case "gzip":
    reader, err = gzip.NewReader(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    defer reader.Close() // the gzip reader must be closed
default:
    reader = resp.Body
}

data, err := io.ReadAll(reader)
if err != nil {
    log.Fatal(err)
}
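
As an additional option (a sketch, not part of the original suggestion): since the Azure profile attributes most allocations to compress/flate.NewReader and gzip.(*Reader).Reset, gzip readers can be reused across responses instead of allocating a new flate reader each time. The helper below is illustrative and assumes the compress/gzip, io and sync imports:

// Sketch: reuse gzip.Reader instances via a sync.Pool so that a new
// flate reader is not allocated for every response.
var gzipReaders sync.Pool // holds *gzip.Reader values

func readGzipped(body io.Reader) ([]byte, error) {
    var zr *gzip.Reader
    if v := gzipReaders.Get(); v != nil {
        zr = v.(*gzip.Reader)
        if err := zr.Reset(body); err != nil { // reuse the pooled reader
            return nil, err
        }
    } else {
        var err error
        zr, err = gzip.NewReader(body) // first use: allocate one
        if err != nil {
            return nil, err
        }
    }
    defer gzipReaders.Put(zr) // return the reader to the pool when done
    defer zr.Close()

    return io.ReadAll(zr)
}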

4. Forcing the Go runtime to release memory. You can periodically force a garbage collection and hand memory back to the OS:

// Periodically force memory to be returned to the operating system
func freeMemory() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    
    if m.HeapIdle > 100*1024*1024 { // if the idle heap exceeds 100MB
        debug.FreeOSMemory() // force a GC and return memory to the OS
    }
}

// Call it on a timer
go func() {
    ticker := time.NewTicker(5 * time.Minute)
    defer ticker.Stop()
    
    for range ticker.C {
        freeMemory()
    }
}()

5. Watching for goroutine leaks. Goroutines that never finish also hold on to memory:

// Use pprof to monitor the goroutine count
import _ "net/http/pprof"

go func() {
    log.Println(http.ListenAndServe("localhost:6060", nil))
}()

// Check the number of goroutines
numGoroutines := runtime.NumGoroutine()
fmt.Printf("当前goroutine数量: %d\n", numGoroutines)

6. Setting a custom memory limit. Give the Go runtime a soft memory limit:

// Go 1.19+ supports a soft memory limit (runtime/debug);
// call it once at startup rather than from a goroutine
debug.SetMemoryLimit(512 * 1024 * 1024) // 512 MiB

The underlying issue is that cloud environments usually run under more memory pressure, and Go's garbage-collection behaviour differs between environments. Making sure resources are released properly, bounding the connection pool, and periodically forcing memory back to the OS should all help rein in the steady growth in virtual memory.
