Scraping the Maoyan Top 100 movie chart with Python: why is the printed result empty?

The source code is as follows:

import requests
from requests.exceptions import RequestException

def get_one_page(url):
    try:
        response = requests.get(url)
        if requests.status_codes == 200:
            return response.text
        return None
    except RequestException:
        return None

def main():
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3573.0 Safari/537.36'}
    url = requests.get('https://www.hao123.com/', headers=headers)
    html = get_one_page(url)
    print(html)

if __name__ == '__main__':
    main()

caililin #1

The URL itself is wrong, buddy.

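nodeper #2

I've run into this problem before. Most likely the request headers aren't set up properly: Maoyan's anti-scraping mechanism blocks requests that don't carry a User-Agent.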
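To see why a missing User-Agent could matter, here is a small sketch that sends nothing over the network: it only shows what requests identifies itself as by default, and that a custom headers dict overrides it. The browser string is the one from the original post and the URL is the one used in the code later in this reply. Whether Maoyan really filters on this header is the claim above taken as an assumption; this snippet does not verify it.

```python
import requests

# Browser-like User-Agent copied from the original post
# (any current browser string would do).
BROWSER_UA = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/71.0.3573.0 Safari/537.36')
URL = 'https://maoyan.com/board/4'

session = requests.Session()

# What requests calls itself when no headers are passed.
default_ua = requests.utils.default_user_agent()

# prepare_request builds the outgoing request without sending it,
# so the final headers can be inspected offline.
plain = session.prepare_request(requests.Request('GET', URL))
custom = session.prepare_request(
    requests.Request('GET', URL, headers={'User-Agent': BROWSER_UA}))

print('default:', plain.headers['User-Agent'])
print('custom: ', custom.headers['User-Agent'])
```

The default value starts with python-requests/, which is easy for a site to recognise as a script rather than a browser.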

Here's the full code; pay particular attention to the headers part:

import requests
from bs4 import BeautifulSoup
import time

def get_maoyan_top100():
    url = 'https://maoyan.com/board/4'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Connection': 'keep-alive'
    }
    
    try:
        response = requests.get(url, headers=headers, timeout=10)
        response.encoding = 'utf-8'
        
        if response.status_code == 200:
            soup = BeautifulSoup(response.text, 'html.parser')
            movies = soup.find_all('div', class_='movie-item-info')
            
            if not movies:
                print("Parsing failed: the page structure may have changed, or the content may be loaded dynamically")
                return
            
            for movie in movies[:10]:  # test with the first 10 entries first
                name = movie.find('p', class_='name').text.strip()
                star = movie.find('p', class_='star').text.strip()
                releasetime = movie.find('p', class_='releasetime').text.strip()
                print(f"Movie: {name} | Starring: {star} | Release date: {releasetime}")
                
        else:
            print(f"Request failed, status code: {response.status_code}")
            
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == '__main__':
    get_maoyan_top100()

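A few key points:

The headers must be complete: especially User-Agent, so the request looks like a real browser visit
Check the status code: print response.status_code first; anything other than 200 means the request did not succeed (most likely it was rejected)
Test with a small amount of data first: use movies[:10] to verify that the parsing logic is correct
The page structure may change: if the class names have changed, inspect the elements again

If the result is still empty, Maoyan may have changed its page structure, and the CSS selectors need updating.

I'd suggest first using the browser's developer tools to look at the HTML that actually comes back.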
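The checks above can be rolled into one small helper. This is only a sketch: diagnose is a hypothetical name, and the default marker assumes the movie-item-info class used in the code above is still what the page uses.

```python
def diagnose(status_code, html, marker='movie-item-info'):
    """Return a short hint about why parsing might come back empty.

    `marker` is the class name the parser expects to find; the default
    is the one used in the code above and may be outdated.
    """
    if status_code != 200:
        return f'request not successful (status {status_code})'
    if not html:
        return 'status 200 but the response body is empty'
    if marker not in html:
        # A 200 without the expected markup often means a verification
        # page or a changed layout rather than the real chart.
        return f'status 200 but "{marker}" is missing from the HTML'
    return 'expected marker found'


# Offline examples with made-up inputs:
print(diagnose(403, ''))
print(diagnose(200, '<html><body>please verify</body></html>'))
print(diagnose(200, '<div class="movie-item-info"></div>'))
```

In the real script you would call it as diagnose(response.status_code, response.text) right after the request, before handing the HTML to BeautifulSoup.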

nodeper #3

#1 Hahahahaha

zlyuanteng #4

Hahahahahaha, hao123, not bad

caililin #5

Go find some tutorials and read through them.

zlyuanteng #6

Uh, the URL was correct before. I later switched to other URLs and got the same result, and I don't know where it goes wrong.

phonegap100 #7

You call requests.get twice: once in main() and again inside get_one_page. The second call is bound to end in return None.
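Putting replies #1 and #7 together, a corrected version of the original script might look like the sketch below. It assumes the intended target is the https://maoyan.com/board/4 URL from reply #2 (the original post used hao123), and it adds a timeout that the original did not have. Two things change: main() passes the URL string instead of the result of requests.get, and the status check reads response.status_code rather than requests.status_codes, which is a module and never equals 200.

```python
import requests
from requests.exceptions import RequestException

# User-Agent string taken from the original post.
HEADERS = {
    'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                   'AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/71.0.3573.0 Safari/537.36')
}


def get_one_page(url, headers=None, timeout=10):
    """Return the page body as text, or None if the request fails."""
    try:
        # The only requests.get call in the script.
        response = requests.get(url, headers=headers, timeout=timeout)
        # status_code belongs to the response object, not the module.
        if response.status_code == 200:
            return response.text
        return None
    except RequestException:
        return None


def main():
    # Assumed target URL (from reply #2); pass the string itself.
    url = 'https://maoyan.com/board/4'
    html = get_one_page(url, headers=HEADERS)
    print(html)
```

In the original code, get_one_page received a Response object instead of a URL. requests cannot turn that into a valid URL, so it raises a RequestException and the except branch returns None. Keep the usual if __name__ == '__main__': main() guard at the bottom when running this as a script. Even with these fixes the site may still serve a verification page, so a non-empty result is not guaranteed.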