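I wrote a Node.js crawler for the list of universities in mainland China; give it a try if you need one

I happened to need this over the past few days, so I wrote one. The repo already includes a complete JSON file.

https://github.com/codeudan/crawler-china-mainland-universities

The data is grouped by province, and schools are classified as undergraduate, junior college, private, or independent college.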

h691938207 #1 • 2 days ago

Could you post the scraped data so we can take a look?

h691938207 #2 • 2 days ago

Breaking the 0-star streak.

itying888 #3 • 2 days ago

Is there a crawler for gaokao admission scores by major?

sinazl #4 • 2 days ago

There is duplicate data.
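
For anyone hitting the duplicates in the meantime, here is a minimal sketch of de-duplicating records by name. The `name` and `province` fields and the sample records are assumptions for illustration, not the repo's actual schema, and `dedupeByName` is a hypothetical helper.

```javascript
// Keep the first record seen for each name; `name` is an assumed field.
function dedupeByName(records) {
  const seen = new Set();
  return records.filter((record) => {
    if (seen.has(record.name)) return false;
    seen.add(record.name);
    return true;
  });
}

// Placeholder records, not real data.
const sample = [
  { name: 'University A', province: 'X' },
  { name: 'University B', province: 'Y' },
  { name: 'University A', province: 'X' },
];
console.log(dedupeByName(sample).length); // 2
```

If two different schools can share a name, a composite key such as name plus province would be safer.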

yibo5220 #5 • 2 days ago

Is this data accurate? Why not use CHSI (学信网) as the source?
https://gaokao.chsi.com.cn/sch/search--ss-on,option-qg,searchType-1,start-0.dhtml

ionicwang #6 • 2 days ago

The china_mainland_universities.json in the repo is the data from the latest crawl.

caililin #7 • 2 days ago

CHSI is the official, authoritative source.
You can also scrape the 211 and 985 tags from it.

yibo5220 #8 • 2 days ago

This bug has been fixed.

wuwangju #9 • 2 days ago

Thanks for the heads-up.

songsunli #10 • 2 days ago

It's china_mainland_universities.json.

yibo5220 #11 • 2 days ago

Newbie question: is my Node version wrong? The line async function main(){ throws: SyntaxError: Unexpected token function

ionicwang #12 • 2 days ago

Is your Node version lower than 8?

nodeper #13 • 2 days ago

Why does the page background turn black when I open this thread???

yuanlaile #14 • 2 days ago

It depends on the V2EX section; some sections use a different theme.

itying888 #15 • 2 days ago

Yeah, it's a bit old: 6.2. The version is probably the cause.
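
The SyntaxError above is raised at parse time, because async functions need Node.js 7.6 or later. A minimal sketch of a version guard follows. `supportsAsyncFunctions` is a hypothetical helper name, and the guard has to live in an entry file that itself contains no async syntax, since an old runtime rejects the whole file before running any of it.

```javascript
// Async functions are available without flags from Node.js 7.6 onward.
function supportsAsyncFunctions(version) {
  const [major, minor] = version.replace(/^v/, '').split('.').map(Number);
  return major > 7 || (major === 7 && minor >= 6);
}

// Warn and set a failing exit code instead of crashing with a SyntaxError later.
if (!supportsAsyncFunctions(process.versions.node)) {
  console.error(`Node ${process.versions.node} is too old; please upgrade to 8 or later.`);
  process.exitCode = 1;
}
```

Running `node -v` shows the installed version; upgrading Node is simpler than rewriting the crawler without async/await.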

vueper #16 • 2 days ago

Based on the URL in #5, I wrote a PHP version:
https://github.com/teg1c/crawler-china-mainland-universities-by-php

wuwangju #17 • 2 days ago

Is there any way to get a school's organizational structure? Please!

htzhanglong #18 • 2 days ago

The data source has been switched to CHSI.

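yibo5220 #19 • 2 days ago

Hi! Here is a simple Node.js crawler example for scraping a list of universities in mainland China. It uses the axios library to send the HTTP request and the cheerio library to parse the HTML.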

First, install the required libraries:

npm install axios cheerio

Then use the following code to scrape the university list:

const axios = require('axios');
const cheerio = require('cheerio');

async function fetchUniversities() {
  const url = 'https://example.com/universities-list'; // Replace with the real URL of the university list page
  const { data } = await axios.get(url);
  const $ = cheerio.load(data);

  const universities = [];
  $('div.university').each((index, element) => {
    const name = $(element).find('h2').text().trim();
    const location = $(element).find('p.location').text().trim();
    universities.push({ name, location });
  });

  console.log(universities);
}

fetchUniversities().catch(console.error);

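Notes:

https://example.com/universities-list must be replaced with the URL of the actual university list page.
div.university, h2, and p.location assume a hypothetical HTML structure; adjust the selectors to match the real page.

This example shows how to scrape web page data with Node.js and these libraries. You can extend it as needed, for example by adding error handling and pagination support.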
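
As a rough sketch of the pagination and error handling mentioned above, the loop below walks a listing page by page until a page comes back empty or a request fails. `crawlAllPages` and `fetchPage` are hypothetical names, and the offset-style paging is an assumption (the start-0 segment of the CHSI URL in #5 hints at it, but that has not been verified). In a real crawler, `fetchPage` would wrap the axios request and cheerio parsing shown earlier.

```javascript
// Walk a paginated listing until an empty page or an error.
// fetchPage(start) is a hypothetical callback resolving to an array of records.
async function crawlAllPages(fetchPage, pageSize, maxPages = 500) {
  const all = [];
  for (let page = 0; page < maxPages; page++) {
    let rows;
    try {
      rows = await fetchPage(page * pageSize);
    } catch (err) {
      // Stop on the first failure and keep what was collected so far.
      console.error(`page ${page} failed: ${err.message}`);
      break;
    }
    if (rows.length === 0) break; // no more data
    all.push(...rows);
  }
  return all;
}

// Usage with a stubbed fetchPage standing in for a real HTTP request.
const fakePages = [['A', 'B'], ['C'], []];
crawlAllPages(async (start) => fakePages[start / 2] || [], 2)
  .then((rows) => console.log(rows)); // [ 'A', 'B', 'C' ]
```

Passing `fetchPage` in as a parameter keeps the loop testable without network access; the `maxPages` cap guards against a site that never returns an empty page.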