Golang: Apache Tika error - how to debug `could not wait for server to finish: exit status 1`

I am using the package "github.com/google/go-tika/tika" to start an Apache Tika server and extract text from PDF files. When I shut the Tika server down, I get this error:

could not wait for server to finish: exit status 1

At the end of my main function I run:

//Stop Tika server & delete server
defer func() {
	err = s.Stop()
	if err != nil {
		fmt.Println("Stop Tika server error:", err)
		return
	}
	// Delete the Tika server file once it is no longer needed
	err = os.Remove(tikaServerPath)
	if err != nil {
		log.Println("Delete Tika Server error: ", err)
	}
}()

I am running this tool on Windows. The code of the Stop() function is:

// Stop shuts the server down, killing the underlying Java process. Stop
// must be called when finished with the server to avoid leaking the
// Java process. If it has not been started, stop will panic.
// If not running in a Windows environment, it is recommended to use Shutdown
// for a more graceful shutdown of the Java process.
func (s *Server) Stop() error {
	if err := s.cmd.Process.Kill(); err != nil {
		return fmt.Errorf("could not kill server: %v", err)
	}
	if err := s.cmd.Wait(); err != nil {
		return fmt.Errorf("could not wait for server to finish: %v", err)
	}
	return nil
}

cmd.Wait()

// Wait releases any resources associated with the Cmd.
func (c *Cmd) Wait() error {
	if c.Process == nil {
		return errors.New("exec: not started")
	}
	if c.ProcessState != nil {
		return errors.New("exec: Wait was already called")
	}

	state, err := c.Process.Wait()
	if err == nil && !state.Success() {
		err = &ExitError{ProcessState: state}
	}
	c.ProcessState = state

	var timer *time.Timer
	if c.ctxResult != nil {
		watch := <-c.ctxResult
		timer = watch.timer
		// If c.Process.Wait returned an error, prefer that.
		// Otherwise, report any error from the watchCtx goroutine,
		// such as a Context cancellation or a WaitDelay overrun.
		if err == nil && watch.err != nil {
			err = watch.err
		}
	}

	if goroutineErr := c.awaitGoroutines(timer); err == nil {
		// Report an error from the copying goroutines only if the program otherwise
		// exited normally on its own. Otherwise, the copying error may be due to the
		// abnormal termination.
		err = goroutineErr
	}
	closeDescriptors(c.parentIOPipes)
	c.parentIOPipes = nil

	return err
}

Do you have any suggestions on how to investigate this error further?



1 reply



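This error does not necessarily mean that something failed. Stop() calls Process.Kill(), and on Windows Go implements Kill by terminating the process with exit code 1, so the Wait() that follows reports `exit status 1` even when the server was healthy. It is still worth ruling out that the Tika server exited abnormally on its own before Stop() was called. Troubleshooting steps and sample code: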

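1. Check the server's startup state and logs. Before calling Stop(), confirm that the server started correctly, and capture the standard error output of the server process:

// Start the server process yourself so that stderr can be captured
cmd := exec.Command("java", "-jar", tikaServerPath, "-p", "9998")
stderrPipe, err := cmd.StderrPipe()
if err != nil {
    log.Fatal("Failed to create stderr pipe:", err)
}
if err := cmd.Start(); err != nil {
    log.Fatal("Failed to start Tika server:", err)
}

// Read stderr as it is produced
go func() {
    scanner := bufio.NewScanner(stderrPipe)
    for scanner.Scan() {
        log.Println("Tika stderr:", scanner.Text())
    }
}()

// Wait for the server to become ready
time.Sleep(5 * time.Second)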
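A fixed sleep either wastes time or is too short on a slow machine. As a rough sketch, the wait can instead poll the server until it answers. The helper name `waitForReady`, the port 9998, and the timeout values are assumptions for illustration; `/version` is the Tika server endpoint that reports its version.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

// waitForReady polls url until it answers with HTTP 200 or timeout elapses.
func waitForReady(url string, timeout, interval time.Duration) error {
	client := &http.Client{Timeout: 2 * time.Second}
	deadline := time.Now().Add(timeout)
	for {
		resp, err := client.Get(url)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
			err = fmt.Errorf("unexpected status %s", resp.Status)
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("server not ready after %v: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Assumption: the server was started separately and listens on port 9998.
	if err := waitForReady("http://localhost:9998/version", 30*time.Second, 500*time.Millisecond); err != nil {
		log.Fatal(err)
	}
	log.Println("Tika server is ready")
}
```

If the server never becomes ready, the returned error carries the last connection or status error, which is often the first useful hint about why startup failed.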

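2. Check the server state before Stop(). Confirm that the server is still responding before calling Stop(). Do not use s.cmd.Process.Wait() for this: cmd is an unexported field, and Wait blocks until the process exits:

defer func() {
    // Probe the server over HTTP; s is the *tika.Server from the question
    client := tika.NewClient(nil, s.URL())
    if _, err := client.Version(context.Background()); err != nil {
        log.Printf("Tika server not responding before Stop: %v", err)
    }

    err = s.Stop()
    if err != nil {
        // Stop formats the underlying error with %v, so *exec.ExitError
        // cannot be recovered from it; the message is all that is available
        fmt.Println("Stop Tika server error:", err)
    }
}()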
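For a process started directly with exec.Command, a common way to find out whether it has already exited, without blocking, is to call cmd.Wait once in a goroutine and poll the result channel. The following is a minimal sketch; the helper names `waitAsync` and `pollExit` and the JAR path are made up for illustration.

```go
package main

import (
	"log"
	"os/exec"
)

// waitAsync calls cmd.Wait exactly once in the background and delivers
// its result on the returned channel.
func waitAsync(cmd *exec.Cmd) <-chan error {
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()
	return done
}

// pollExit reports, without blocking, whether a Wait result is available.
func pollExit(done <-chan error) (exited bool, err error) {
	select {
	case err = <-done:
		return true, err
	default:
		return false, nil
	}
}

func main() {
	// Assumption: the JAR path and port are placeholders.
	cmd := exec.Command("java", "-jar", "./tika-server.jar", "-p", "9998")
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	done := waitAsync(cmd)

	// ... use the server ...

	if exited, err := pollExit(done); exited {
		log.Printf("server exited on its own: %v", err)
		return
	}
	_ = cmd.Process.Kill()
	log.Printf("server stopped after Kill: %v", <-done)
}
```

Because Wait is called only once, the "exec: Wait was already called" error cannot occur, and an early exit is clearly separated from the status produced by the deliberate Kill.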

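3. Try a graceful shutdown before killing. The Stop() documentation recommends Shutdown only outside Windows, so on Windows Kill() remains the fallback. For a process you start yourself, a graceful attempt might look like this (the /shutdown endpoint is an assumption; check whether your Tika server version provides it):

// gracefulStop works on a process started with exec.Command; it cannot be
// a method on tika.Server, whose cmd field is unexported
func gracefulStop(cmd *exec.Cmd, baseURL string) error {
    // Try to shut down via an HTTP request (assumed endpoint, may not exist)
    resp, err := http.Get(baseURL + "/shutdown")
    if err == nil {
        resp.Body.Close()
        time.Sleep(2 * time.Second)
    }

    // Fall back to a forced kill
    if cmd.Process != nil {
        if err := cmd.Process.Kill(); err != nil && !errors.Is(err, os.ErrProcessDone) {
            return err
        }
        _ = cmd.Wait() // reap the process; a non-zero status is expected after Kill
    }
    return nil
}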
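The platform difference behind the Stop/Shutdown recommendation can be sketched as follows: Go does not implement sending os.Interrupt to a process on Windows, so a graceful attempt is only possible elsewhere. The helper names `canInterrupt` and `terminate`, the JAR path, and the five-second grace period are assumptions, not part of go-tika.

```go
package main

import (
	"errors"
	"log"
	"os"
	"os/exec"
	"runtime"
	"time"
)

// canInterrupt reports whether Process.Signal(os.Interrupt) is usable on goos.
func canInterrupt(goos string) bool { return goos != "windows" }

// terminate asks a started cmd to exit and kills it if it is still alive
// after grace. The exit status is discarded because termination was requested.
func terminate(cmd *exec.Cmd, grace time.Duration) error {
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	if canInterrupt(runtime.GOOS) {
		if err := cmd.Process.Signal(os.Interrupt); err == nil {
			select {
			case <-done:
				return nil // exited within the grace period
			case <-time.After(grace):
				// still running: fall through to Kill
			}
		}
	}
	if err := cmd.Process.Kill(); err != nil && !errors.Is(err, os.ErrProcessDone) {
		return err
	}
	<-done // reap the process; a non-zero status is expected after Kill
	return nil
}

func main() {
	// Assumption: the JAR path and port are placeholders.
	cmd := exec.Command("java", "-jar", "./tika-server.jar", "-p", "9998")
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	// ... use the server ...
	if err := terminate(cmd, 5*time.Second); err != nil {
		log.Println("terminate:", err)
	}
}
```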

4. Complete debugging example

package main

import (
    "bytes"
    "context"
    "errors"
    "fmt"
    "log"
    "os"
    "os/exec"
    "time"

    "github.com/google/go-tika/tika"
)

func main() {
    tikaServerPath := "./tika-server.jar"

    // Start the server
    cmd := exec.Command("java", "-jar", tikaServerPath, "-p", "9998")

    // Capture stderr for debugging
    var stderr bytes.Buffer
    cmd.Stderr = &stderr

    if err := cmd.Start(); err != nil {
        log.Fatal("Start failed:", err)
    }

    // Wait for the server to initialize
    time.Sleep(10 * time.Second)

    defer func() {
        // Do not call cmd.Process.Wait() here to check the state: it blocks
        // until the process exits. Kill may fail if the process is already
        // gone; log that and still call Wait to collect the exit status.
        if err := cmd.Process.Kill(); err != nil {
            log.Println("Kill failed:", err)
        }

        // Wait and report the exit status together with the captured stderr
        if err := cmd.Wait(); err != nil {
            var exitErr *exec.ExitError
            if errors.As(err, &exitErr) {
                log.Printf("ExitError: Code=%v, Stderr=%s",
                    exitErr.ExitCode(),
                    stderr.String())
            }
            log.Println("Wait error:", err)
        }
    }()

    // Use the Tika server ("input.pdf" is a placeholder path)
    file, err := os.Open("input.pdf")
    if err != nil {
        log.Println(err)
        return // return instead of log.Fatal so the deferred cleanup runs
    }
    defer file.Close()

    client := tika.NewClient(nil, "http://localhost:9998")
    text, err := client.Parse(context.Background(), file)
    if err != nil {
        log.Println(err)
        return
    }
    fmt.Println(text)
}

5. Check the Java environment. Make sure Java is installed correctly and that its version is compatible:

func checkJava() error {
    cmd := exec.Command("java", "-version")
    output, err := cmd.CombinedOutput()
    if err != nil {
        return fmt.Errorf("Java not found: %v", err)
    }
    log.Println("Java version:", string(output))
    return nil
}
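Before running `java -version`, it can help to separate "java is not on PATH" from "java started but failed". A small sketch using exec.LookPath; the helper name `findOnPath` is made up for illustration.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

// findOnPath returns the resolved location of name, or an error if the
// executable cannot be found on PATH.
func findOnPath(name string) (string, error) {
	path, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%s not found on PATH: %w", name, err)
	}
	return path, nil
}

func main() {
	path, err := findOnPath("java")
	if err != nil {
		log.Fatal(err)
	}
	log.Println("using java at", path)
}
```

The logged path also shows which Java installation is picked up when several are installed.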

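Key point: the error is raised in the Wait() stage, which means the process has exited with a non-zero status. After a deliberate Kill() that is the expected outcome on Windows (exit code 1). Capturing stderr and checking whether the server was still responding before Stop() shows whether it instead exited on its own, and why.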
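Since Stop() formats the underlying error with %v, the caller only receives a message string. One possible workaround, sketched here under that constraint, is to recognise the message a deliberate Kill() produces and not treat it as a failure. The helper name `benignStopMessage` is hypothetical, and matching on error text is fragile: a server that crashed with exit code 1 on its own would produce the same message.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// benignStopMessage reports whether msg looks like the status a deliberate
// Kill produces: "exit status 1" on Windows, "signal: killed" elsewhere.
func benignStopMessage(msg, goos string) bool {
	const prefix = "could not wait for server to finish: "
	if !strings.HasPrefix(msg, prefix) {
		return false
	}
	status := strings.TrimPrefix(msg, prefix)
	if goos == "windows" {
		return status == "exit status 1"
	}
	return status == "signal: killed"
}

func main() {
	msg := "could not wait for server to finish: exit status 1"
	fmt.Println(benignStopMessage(msg, runtime.GOOS))
}
```

In the deferred cleanup from the question this could be used as `if err := s.Stop(); err != nil && !benignStopMessage(err.Error(), runtime.GOOS) { ... }`, so that only unexpected errors are reported and the server file is still deleted afterwards.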
