Using tracing-distributed, a Rust distributed tracing library, for efficient cross-service call-chain monitoring and performance analysis

tracing-distributed

Current version: 0.3.1

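This crate provides:

TelemetryLayer, a generic tracing layer that handles publishing spans and events to arbitrary backends
Utilities for implementing distributed tracing for arbitrary backends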

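Because it is a tracing layer, TelemetryLayer can be composed with other layers to add stdout logging, filtering, and so on.

This crate is aimed primarily at people who need to implement their own backend. A concrete implementation that uses honeycomb.io as the backend is available in the tracing-honeycomb crate.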

License

MIT

Installation

Run the following Cargo command in your project directory:

cargo add tracing-distributed

Or add the following line to your Cargo.toml:

tracing-distributed = "0.4.0"

Complete example

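use tracing::{info, info_span, span, Event, Subscriber};
use tracing_subscriber::{
    layer::{Context, SubscriberExt},
    Layer, Registry,
};

// A minimal stand-in for a telemetry layer. In tracing-distributed itself,
// TelemetryLayer is a concrete layer built around a backend type, so it cannot
// be implemented like a trait. This sketch implements tracing_subscriber's
// Layer trait directly to show the same extension point.
struct SimpleTelemetryLayer;

impl<S: Subscriber> Layer<S> for SimpleTelemetryLayer {
    // Called once when a span is created
    fn on_new_span(&self, attrs: &span::Attributes<'_>, id: &span::Id, _ctx: Context<'_, S>) {
        println!("Span published: {} ({:?})", attrs.metadata().name(), id);
    }

    // Called for every event
    fn on_event(&self, event: &Event<'_>, _ctx: Context<'_, S>) {
        println!("Event published: {:?}", event);
    }
}

fn main() {
    // Create a subscriber and add the layer to it
    let subscriber = Registry::default()
        .with(SimpleTelemetryLayer);

    // Install it as the global default subscriber
    tracing::subscriber::set_global_default(subscriber)
        .expect("Failed to set subscriber");

    // Create a span and record an event inside it
    let span = info_span!("my_span");
    let _enter = span.enter();

    info!("This is an info event");

    // Simulated call to another service
    call_another_service();
}

fn call_another_service() {
    let span = info_span!("service_call", service_name = "database");
    let _enter = span.enter();

    info!("Making call to database service");
    // Simulate the database call
    std::thread::sleep(std::time::Duration::from_millis(100));
    info!("Database call completed");
}

Code walkthrough

SimpleTelemetryLayer is a minimal layer that prints every new span and event to stdout. It implements tracing_subscriber's Layer trait directly; tracing-distributed's own TelemetryLayer is a ready-made layer that plugs into the same extension point and forwards to a backend.
In main, a Registry subscriber is created and the layer is added to it.
The subscriber is installed as the global default.
A sample span is created and an event is recorded inside it.
call_another_service simulates a call to a database service with a nested span. The call stays in-process, so no trace context crosses a service boundary here.

This example shows the basic layer mechanics behind call-chain tracing. In a real application, you would pair the crate's TelemetryLayer with a backend that sends spans and events to a dedicated monitoring system instead of printing them.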
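The closing remark about sending data to a dedicated monitoring system can be sketched without any tracing dependency. The following is a minimal illustration, assuming a hypothetical SpanRecord type and a hypothetical BufferedExporter that batches records before they would be shipped; neither name comes from tracing-distributed, and the network call itself is left out.

```rust
use std::sync::Mutex;
use std::time::Duration;

// Hypothetical record of a finished span (not a tracing-distributed type).
#[derive(Debug, Clone, PartialEq)]
struct SpanRecord {
    name: String,
    service_name: String,
    duration: Duration,
}

// Hypothetical exporter that collects records and hands them out in batches,
// so a backend can send one request per batch instead of one per span.
struct BufferedExporter {
    batch_size: usize,
    pending: Mutex<Vec<SpanRecord>>,
}

impl BufferedExporter {
    fn new(batch_size: usize) -> Self {
        BufferedExporter {
            batch_size,
            pending: Mutex::new(Vec::new()),
        }
    }

    // Queues a record. Returns a full batch once `batch_size` records are pending.
    fn record(&self, span: SpanRecord) -> Option<Vec<SpanRecord>> {
        let mut pending = self.pending.lock().unwrap();
        pending.push(span);
        if pending.len() >= self.batch_size {
            Some(std::mem::take(&mut *pending))
        } else {
            None
        }
    }

    // Drains whatever is still pending, for example at shutdown.
    fn flush(&self) -> Vec<SpanRecord> {
        std::mem::take(&mut *self.pending.lock().unwrap())
    }
}

fn main() {
    let exporter = BufferedExporter::new(2);
    for name in ["my_span", "service_call", "cleanup"] {
        let record = SpanRecord {
            name: name.to_string(),
            service_name: "demo".to_string(),
            duration: Duration::from_millis(100),
        };
        if let Some(batch) = exporter.record(record) {
            println!("would send batch of {} spans", batch.len());
        }
    }
    println!("flushed {} remaining spans", exporter.flush().len());
}
```

Batching is only one possible design; a real backend would also need retry and backpressure handling.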

zlyuanteng, reply #1

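Guide to the Rust distributed tracing library tracing-distributed

Overview

tracing-distributed is a distributed tracing library built on the Rust tracing ecosystem, used mainly for cross-service call-chain monitoring and performance analysis. It extends the standard tracing library to give distributed systems end-to-end request tracing.

Key features

Cross-service call-chain tracing
Context propagation support
Low performance overhead
Compatibility with mainstream tracing backends
Support for flexible sampling strategies

Complete example

The following example shows how distributed tracing can be set up with tracing-distributed: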

// 1. Add dependencies
/*
[dependencies]
tracing = "0.1"
tracing-distributed = "0.2"
tracing-subscriber = { version = "0.3", features = ["json"] }
reqwest = "0.11"
tokio = { version = "1.0", features = ["full"] }
*/

// NOTE: DistributedTracer, Propagation, propagate, Sampler, SamplingDecision
// and set_parent are used here as given in this post. Verify each of them
// against the docs.rs page for the crate version you install.
use tracing::{info_span, instrument};
use tracing_distributed::{DistributedTracer, Propagation, propagate};
use tracing_subscriber::prelude::*;
use std::time::Duration;
use reqwest::header::HeaderMap; // Request and Response are defined locally below
use tokio::time::sleep;

// Mock Request and Response types
#[derive(Debug)]
struct Request {
    method: String,
    path: String,
    headers: HeaderMap,
}

#[derive(Debug)]
struct Response {
    status: u16,
}

impl Response {
    fn new() -> Self {
        Response { status: 200 }
    }
}

#[derive(Debug)]
enum Error {
    Reqwest(String),
}

// Display and std::error::Error let `?` convert Error into Box<dyn Error> in main
impl std::fmt::Display for Error {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Error::Reqwest(msg) => write!(f, "request failed: {}", msg),
        }
    }
}

impl std::error::Error for Error {}

// Initialize the tracing system
fn init_tracing() {
    let tracer = DistributedTracer::new()
        .with_propagation(Propagation::B3) // use the B3 propagation format
        .with_service_name("my-service");
    
    tracing_subscriber::registry()
        .with(tracer)
        .init();
}

// Request handler
#[instrument]
async fn handle_request(request: Request) -> Result<Response, Error> {
    // Extract the parent context from the request headers
    let parent_ctx = propagate::extract_from_headers(&request.headers);
    
    // Create a child span
    let span = info_span!("request_processing", 
        http.method = %request.method,
        http.path = %request.path
    );
    
    // Run inside that context.
    // Caution: a guard from enter() held across .await can attribute unrelated
    // work to this span; for async bodies, tracing's Instrument::instrument is
    // the safer choice.
    let _guard = span.set_parent(&parent_ctx).enter();
    
    // Simulate processing
    sleep(Duration::from_millis(50)).await;
    
    // Call the downstream service
    call_downstream_service("http://example.com/api").await?;
    
    Ok(Response::new())
}

// Calls a downstream service
#[instrument]
async fn call_downstream_service(url: &str) -> Result<(), Error> {
    let client = reqwest::Client::new();
    
    // Build the request and inject the trace context
    let mut request = client.get(url);
    propagate::inject_into_headers(&mut request);
    
    // Send the request
    let _response = request.send().await
        .map_err(|e| Error::Reqwest(e.to_string()))?;
    
    // Simulate handling the response
    sleep(Duration::from_millis(30)).await;
    
    Ok(())
}

// Custom sampler
struct CustomSampler;

impl tracing_distributed::Sampler for CustomSampler {
    fn sample(&self, context: &tracing::span::Context) -> tracing_distributed::SamplingDecision {
        // Sample only root spans and spans that carry a specific field
        if context.is_root() {
            tracing_distributed::SamplingDecision::RecordAndSample
        } else if context.has_field("http.method") {
            tracing_distributed::SamplingDecision::RecordAndSample
        } else {
            tracing_distributed::SamplingDecision::RecordOnly
        }
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize tracing
    init_tracing();
    
    // Build a mock request
    let headers = HeaderMap::new();
    let request = Request {
        method: "GET".to_string(),
        path: "/api/test".to_string(),
        headers,
    };
    
    // Handle the request
    let _ = handle_request(request).await?;
    
    Ok(())
}
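The extract and inject steps in the listing above rely on B3 propagation. As a dependency-free illustration of what such helpers do, here is a minimal sketch of the B3 multi-header format (X-B3-TraceId, X-B3-SpanId, X-B3-Sampled). The TraceContext type and the inject and extract functions are hypothetical names for this sketch, header keys are assumed to be already lowercased, and a missing sampled header is treated as "not sampled" for simplicity.

```rust
use std::collections::HashMap;

// Hypothetical carrier for the identifiers that travel between services.
#[derive(Debug, Clone, PartialEq)]
struct TraceContext {
    trace_id: u128,
    span_id: u64,
    sampled: bool,
}

// Writes the context into outgoing headers as lowercase hex strings.
fn inject(ctx: &TraceContext, headers: &mut HashMap<String, String>) {
    let sampled = if ctx.sampled { "1" } else { "0" };
    headers.insert("x-b3-traceid".to_string(), format!("{:032x}", ctx.trace_id));
    headers.insert("x-b3-spanid".to_string(), format!("{:016x}", ctx.span_id));
    headers.insert("x-b3-sampled".to_string(), sampled.to_string());
}

// Reads the context back from incoming headers.
// Returns None when an id header is missing or is not valid hex.
fn extract(headers: &HashMap<String, String>) -> Option<TraceContext> {
    let trace_id = u128::from_str_radix(headers.get("x-b3-traceid")?, 16).ok()?;
    let span_id = u64::from_str_radix(headers.get("x-b3-spanid")?, 16).ok()?;
    let sampled = headers
        .get("x-b3-sampled")
        .map(|value| value == "1")
        .unwrap_or(false);
    Some(TraceContext { trace_id, span_id, sampled })
}

fn main() {
    // The caller injects its context into the outgoing request headers...
    let outgoing = TraceContext { trace_id: 0x1234, span_id: 0x5678, sampled: true };
    let mut headers = HashMap::new();
    inject(&outgoing, &mut headers);

    // ...and the callee recovers it to use as the parent of its own spans.
    let incoming = extract(&headers);
    println!("{:?}", incoming);
}
```

In a real service the callee would create a new span id of its own and record the extracted span id as that span's parent.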

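Best practices

Service entry points: create the root span at the service entry point, such as the HTTP request handler
Key operations: create child spans for key operations such as database queries and external API calls
Context propagation: make sure the trace context is propagated correctly on every cross-service call
Sampling strategy: tailor the sampling strategy to your workload to balance trace volume against system overhead
Field naming: use a consistent naming convention, such as http.method and db.query

Extending to performance analysis

Metrics can be added to strengthen performance monitoring: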
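One common way to realize the sampling advice above is to derive the decision from the trace id, so that every service handling the same trace reaches the same answer without coordination. The sketch below assumes trace ids are random 128-bit values; should_sample is a hypothetical helper for illustration and not part of any crate mentioned here.

```rust
// Decides whether to keep a trace, given a target ratio between 0.0 and 1.0.
// The decision depends only on the trace id, so it is repeatable across services.
fn should_sample(trace_id: u128, ratio: f64) -> bool {
    if ratio >= 1.0 {
        return true;
    }
    if ratio <= 0.0 {
        return false;
    }
    // Treat the low 64 bits of the id as a uniformly distributed number
    // and keep the trace if it falls below the threshold for this ratio.
    let low_bits = trace_id as u64;
    let threshold = (ratio * u64::MAX as f64) as u64;
    low_bits < threshold
}

fn main() {
    let trace_ids: [u128; 3] = [42, u128::MAX, 0x463ac35c9f6413ad48485a3953bb6124];
    for id in trace_ids {
        println!("trace {:032x} sampled at 50%: {}", id, should_sample(id, 0.5));
    }
}
```

Ratio-based sampling is only one option; rules based on span fields such as http.method can be layered on top of it.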

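// NOTE: the metrics module and its macros are used as given in this post.
// Verify that your crate version provides them; the standalone `metrics`
// crate offers macros with the same names.
use std::time::Duration;
use tokio::time::sleep;
use tracing::instrument;
use tracing_distributed::metrics;

#[instrument]
async fn expensive_operation() {
    let start = std::time::Instant::now();
    
    // Simulate a slow operation
    sleep(Duration::from_millis(100)).await;
    
    // Record how long it took and count the call
    metrics::histogram!("operation.duration", start.elapsed());
    metrics::counter!("operation.count", 1);
    
    // Mark the operation as no longer in progress
    metrics::gauge!("operation.in_progress", -1f64);
}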
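The timing pattern above (capture a start instant, record the elapsed time at the end) can also be wrapped in a guard so the measurement is taken even on early return. This is a minimal standard-library sketch; Timer and DurationLog are hypothetical names, and the in-memory log stands in for a real metrics sink.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

// Hypothetical in-memory sink for (operation name, elapsed time) samples.
#[derive(Default)]
struct DurationLog {
    samples: Mutex<Vec<(String, Duration)>>,
}

// Guard that records the elapsed time when it goes out of scope.
struct Timer {
    name: String,
    start: Instant,
    log: Arc<DurationLog>,
}

impl Timer {
    fn start(name: &str, log: Arc<DurationLog>) -> Self {
        Timer {
            name: name.to_string(),
            start: Instant::now(),
            log,
        }
    }
}

impl Drop for Timer {
    fn drop(&mut self) {
        let elapsed = self.start.elapsed();
        self.log
            .samples
            .lock()
            .unwrap()
            .push((self.name.clone(), elapsed));
    }
}

fn expensive_operation(log: Arc<DurationLog>) {
    // The sample is pushed when `_timer` is dropped at the end of this function.
    let _timer = Timer::start("operation.duration", log);
    std::thread::sleep(Duration::from_millis(10));
}

fn main() {
    let log = Arc::new(DurationLog::default());
    expensive_operation(Arc::clone(&log));
    for (name, elapsed) in log.samples.lock().unwrap().iter() {
        println!("{} took {:?}", name, elapsed);
    }
}
```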