Golang并行处理FIFO管道保持消息顺序插件库parapipe的使用

介绍

Parapipe是一个零依赖的非阻塞缓冲FIFO管道库，用于构建代码和垂直扩展应用程序。与互联网上常见的常规管道示例不同，parapipe在每个步骤上并发执行所有操作，同时保持输出顺序。这个库不使用任何锁或互斥锁，只使用纯通道。

适用场景

处理的数据可以分成块（消息），流程可以包含一个或多个阶段
数据需要并发处理（垂直扩展）
必须保持消息的处理顺序

安装

go get -u github.com/nazar256/parapipe@latest

使用示例

基本用法

package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"

	"github.com/nazar256/parapipe"
)

func main() {
	// 1. 创建管道
	concurrency := runtime.NumCPU() // 每个管道并发处理的消息数
	p1 := parapipe.NewPipeline(concurrency, func(msg int) (int, bool) {
		time.Sleep(30 * time.Millisecond) // 模拟处理耗时
		return msg + 1000, true          // 返回处理结果和是否继续的标志
	})

	// 2. 添加更多处理阶段
	p2 := parapipe.Attach(p1, parapipe.NewPipeline(concurrency, func(msg int) (string, bool) {
		time.Sleep(30 * time.Millisecond)
		return strconv.Itoa(msg), true
	}))

	// 最终管道
	pipeline := parapipe.Attach(p2, parapipe.NewPipeline(concurrency, func(msg string) (string, bool) {
		time.Sleep(30 * time.Millisecond)
		return "#" + msg, true
	}))

	// 3. 读取输出
	go func() {
		for result := range pipeline.Out() {
			fmt.Println("Received:", result)
		}
	}()

	// 4. 推送消息进行处理
	for i := 0; i < 10; i++ {
		pipeline.Push(i)
	}

	// 5. 关闭管道
	pipeline.Close()
}

错误处理（断路器模式）

当某些情况下（错误）无法处理消息时，可以从步骤处理回调中返回false作为第二个返回值。第一个值将被忽略。

pipeline := parapipe.NewPipeline(4, func(inputValue InputType) (OutputType, bool) {
    someValue, err := someOperation(inputValue)
    if err != nil {
        // 处理错误
        return someValue, false // 返回false表示不继续处理
    }
    return someValue, true // 返回true表示继续处理
})

// 读取输出
for result := range pipeline.Out() {
    // 对结果进行处理
}

性能特点

Parapipe利用了泛型和通道。对于大多数情况来说，它应该足够高效。它在热代码中有零堆分配，因此对垃圾收集器的负载很小。然而，它在底层使用通道，并且主要受限于通道操作，每个消息有几个写入和读取操作。

其他应用场景

使用parapipe可以：

在API中作为流响应长JSON数据，并发检索、丰富和编组每个对象，保持顺序并将它们返回给客户端
从不同源获取并合并条目作为一个流
构建API控制器或处理程序
高效处理大文件

更多关于golang并行处理FIFO管道保持消息顺序插件库parapipe的使用的实战教程也可以访问 https://www.itying.com/category-94-b0.html

ionicwang 1楼

更多关于golang并行处理FIFO管道保持消息顺序插件库parapipe的使用的实战系列教程也可以访问 https://www.itying.com/category-94-b0.html

使用parapipe实现Golang并行FIFO管道处理

parapipe是一个用于Golang的并行处理库，它可以在保持消息顺序的同时实现并行处理。下面我将介绍如何使用parapipe库来实现并行FIFO管道处理。

parapipe基本概念

parapipe的核心思想是：

并行处理消息以提高吞吐量
保持消息的原始顺序
提供简单的API接口

安装parapipe

首先安装parapipe库：

go get github.com/nazar256/parapipe

基本使用示例

package main

import (
	"fmt"
	"time"

	"github.com/nazar256/parapipe"
)

func main() {
	// 创建一个并行管道，设置工作协程数为4
	pipe := parapipe.New(4)

	// 启动管道
	pipe.Start()

	// 向管道发送数据
	for i := 1; i <= 10; i++ {
		num := i // 创建副本
		pipe.In() <- parapipe.NewJob(num, func() (interface{}, error) {
			// 模拟耗时处理
			time.Sleep(time.Second)
			return num * num, nil
		})
	}

	// 关闭输入通道
	close(pipe.In())

	// 从输出通道读取结果
	for result := range pipe.Out() {
		if result.Err != nil {
			fmt.Printf("处理错误: %v\n", result.Err)
			continue
		}
		fmt.Printf("输入: %d, 输出: %v\n", result.Value.(int), result.ProcessedValue)
	}
}

高级特性

1. 自定义错误处理

pipe := parapipe.NewWithConfig(parapipe.Config{
	WorkerCount: 4,
	OnError: func(err error) {
		fmt.Printf("捕获到错误: %v\n", err)
	},
})

2. 控制并发度

// 动态调整工作协程数
pipe.SetWorkerCount(8)

3. 带缓冲的管道

pipe := parapipe.NewWithConfig(parapipe.Config{
	WorkerCount: 4,
	BufferSize:  100, // 设置缓冲区大小
})

实际应用示例

假设我们需要并行处理一批URL请求，但需要保持响应顺序：

package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"

	"github.com/nazar256/parapipe"
)

func main() {
	urls := []string{
		"https://example.com",
		"https://google.com",
		"https://github.com",
		"https://golang.org",
	}

	// 创建管道，2个并行工作者
	pipe := parapipe.New(2)
	pipe.Start()

	// 发送任务
	for i, url := range urls {
		idx := i
		u := url
		pipe.In() <- parapipe.NewJob(idx, func() (interface{}, error) {
			resp, err := http.Get(u)
			if err != nil {
				return nil, err
			}
			defer resp.Body.Close()
			
			body, err := ioutil.ReadAll(resp.Body)
			if err != nil {
				return nil, err
			}
			
			return len(body), nil
		})
	}

	close(pipe.In())

	// 接收结果
	for result := range pipe.Out() {
		if result.Err != nil {
			fmt.Printf("URL %d 处理失败: %v\n", result.Value.(int), result.Err)
			continue
		}
		fmt.Printf("URL %d 响应大小: %d bytes\n", result.Value.(int), result.ProcessedValue.(int))
	}
}

性能考虑

工作协程数应根据任务类型和机器核心数调整
I/O密集型任务可以设置较多协程
CPU密集型任务协程数不宜过多

错误处理建议

始终检查result.Err
使用OnError回调进行统一错误处理
考虑实现重试逻辑

parapipe提供了一种简单而强大的方式来并行处理数据流同时保持顺序，非常适合需要并行处理但要求结果顺序与输入顺序一致的应用场景。