A problem using Pycurl in Python

I'm trying to use pycurl's CurlMulti to download a batch of pages concurrently. Only the last URL downloads successfully (its file has content); the files for the first two come out empty.

I'm fairly sure the problem is in the `while True` and `while num_handles` parts, but the pycurl documentation is very terse about this. If anyone has experience here, please help. Many thanks.

The code is below:

import pycurl
import uuid
import hashlib
import os

def get_filename(url):
    if not url:
        return None
    return hashlib.md5(url.encode()).hexdigest()

class Fetcher(object):
    def __init__(self, urls, path):
        self.urls = urls
        self.path = path
        self.m = pycurl.CurlMulti()

    def fetch(self):
        if not self.urls:
            print('empty urls...')
            return

        for url in self.urls:
            fdir = './%s/%s' % (self.path, get_filename(url))
            if os.path.exists(fdir):
                print('%s exists, skip it...' % url)
                continue
            f = open(fdir, 'wb')
            c = pycurl.Curl()
            c.setopt(pycurl.URL, url)
            c.setopt(pycurl.WRITEDATA, f)
            self.m.add_handle(c)

        while True:
            ret, num_handles = self.m.perform()
            if ret != pycurl.E_CALL_MULTI_PERFORM:
                break

        while num_handles:
            ret = self.m.select(3.0)
            if ret == -1:
                continue
            while 1:
                ret, num_handles = self.m.perform()
                if ret != pycurl.E_CALL_MULTI_PERFORM:
                    break

        print('downloading complete...')

urls = ['xa.nuomi.com/1000338', 'xa.nuomi.com/1000002', 'xa.nuomi.com/884']
fetcher = Fetcher(urls, 'download')
fetcher.fetch()



3 replies

I can't understand your question.


Try adding `c.close()`?

Thanks for the reply; I've found the problem. The pycurl docs say: "IMPORTANT NOTE: add_handle does not implicitly add a Python reference to the Curl object (and thus does not increase the reference count on the Curl object)." So the earlier Curl objects were being garbage-collected when the loop variable was rebound. Giving each handle its own variable name fixes it:
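The reference problem described above can be reproduced without pycurl at all. A minimal sketch using only the standard library, where `Handle` is a hypothetical stand-in for `pycurl.Curl`:

```python
import gc
import weakref

class Handle:
    """Hypothetical stand-in for pycurl.Curl; any plain object works."""

refs = []
for i in range(3):
    c = Handle()                 # rebinding c drops the last reference
                                 # to the previous Handle
    refs.append(weakref.ref(c))  # weak refs do not keep objects alive

gc.collect()  # make collection deterministic across implementations
alive = [r for r in refs if r() is not None]
print(len(alive))  # only the object still bound to c survives: 1
```

This is exactly what happens in the original `fetch()` loop: each `c = pycurl.Curl()` frees the previous handle, and since `CurlMulti.add_handle` holds no reference of its own, the earlier transfers are torn down before they run.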


for idx, url in enumerate(urls):
    f = open('./%s/%s' % (self.path, hashlib.md5(url.encode()).hexdigest()), 'wb')
    locals()['c' + str(idx)] = pycurl.Curl()
    locals()['c' + str(idx)].setopt(pycurl.URL, url)
    locals()['c' + str(idx)].setopt(pycurl.WRITEDATA, f)
    self.m.add_handle(locals()['c' + str(idx)])
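A simpler alternative to the `locals()` trick is to keep the handles (and their file objects) in a list, so Python itself holds a reference for the lifetime of the transfers. A sketch of the rewritten loop, with `add_handles` and `open_file` as hypothetical names:

```python
import pycurl

def add_handles(m, urls, open_file):
    """Attach one Curl easy handle per URL to the CurlMulti object m.

    Returns the (curl, file) pairs so the caller can keep them alive:
    CurlMulti.add_handle() does NOT add a reference to the Curl object.
    """
    pairs = []
    for url in urls:
        f = open_file(url)        # caller decides where the file lives
        c = pycurl.Curl()
        c.setopt(pycurl.URL, url)
        c.setopt(pycurl.WRITEDATA, f)
        m.add_handle(c)
        pairs.append((c, f))      # the list keeps every handle alive
    return pairs
```

After the perform/select loop finishes, iterate over the pairs and call `m.remove_handle(c)`, `c.close()`, and `f.close()` for each, so the files are flushed and the handles released.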
