When writing a crawler in Node.js with the request module, the timeout option has no effect once a proxy is used. Has anyone else run into this?

The crawler sometimes gets stuck on a request, and the callback never fires.

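The request module's timeout only applies once the connection has been established. You can use setTimeout(reject…, 3000) to manually abort requests that run past the deadline.
The documentation also calls this out explicitly: Note that if the underlying TCP connection cannot be established, the OS-wide TCP connection timeout will overrule the timeout option (the default in Linux can be anywhere from 20-120 seconds).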
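A minimal sketch of that manual timeout, assuming the request is started by a callback-style function whose return value exposes an abort() method (as the object returned by request(...) does). The helper name requestWithDeadline, the startRequest parameter, and the EDEADLINE error code are made up for illustration and are not part of the request module.

```javascript
// Hypothetical helper: enforce a hard deadline around a callback-style request.
// startRequest(cb) must start the request, call cb(error, response, body) when
// done, and return an object that may expose abort().
function requestWithDeadline(startRequest, deadlineMs) {
  return new Promise((resolve, reject) => {
    let settled = false;
    let req = null;

    // The timer runs regardless of connection state, so it also covers
    // a connect or proxy handshake that hangs.
    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      if (req && typeof req.abort === 'function') req.abort();
      const err = new Error(`No response within ${deadlineMs} ms`);
      err.code = 'EDEADLINE'; // made-up code, used only in this sketch
      reject(err);
    }, deadlineMs);

    req = startRequest((error, response, body) => {
      if (settled) return; // the deadline already fired; ignore late callbacks
      settled = true;
      clearTimeout(timer);
      if (error) reject(error);
      else resolve({ response, body });
    });
  });
}

// Possible usage with the request module (assumes request is installed):
//   requestWithDeadline((cb) => request(options, cb), 3000)
//     .then(({ response }) => console.log(response.statusCode))
//     .catch((err) => console.error(err.code, err.message));
```

Because the timer is independent of the socket, the promise settles even when the callback would never have run, which is the stuck-crawler case described above.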

gougou168 · reply #2 · 3 days ago

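If the timeout option seems to have no effect when crawling with the request module in Node.js, the cause may be the way request handles proxied connections. The request module is deprecated, so a more modern library such as axios or node-fetch is recommended. If you want to keep using request for now, here is one possible approach.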

The following example sets a proxy and a timeout with the request module:

const request = require('request');

const options = {
    url: 'http://example.com',
    proxy: 'http://your-proxy-url:port',
    timeout: 5000, // timeout in milliseconds
    method: 'GET'
};

request(options, (error, response, body) => {
    if (error) {
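        // request reports timeouts as ETIMEDOUT (connect or read) or ESOCKETTIMEDOUT (idle socket)
        if (error.code === 'ETIMEDOUT' || error.code === 'ESOCKETTIMEDOUT') {
            // error.connect === true means the timeout hit while the connection was being established
            console.log('Request timed out', error.connect === true ? '(while connecting)' : '(while waiting for data)');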
        } else {
            console.error('Request error:', error);
        }
    } else {
        console.log('Response:', response.statusCode, body);
    }
});

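If the timeout in the code above still has no effect, check the following:

Make sure the proxy server is working and responds within a reasonable time.
Try the request without the proxy and see whether the timeout takes effect.
Upgrade to the latest request release. The library is no longer maintained, but later releases may have fixed some known issues.
Consider migrating to axios or node-fetch, which are more actively maintained.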

Hope this helps!