Version: 22.6.1

Puppeteer

Guides | API | FAQ | Contributing | Troubleshooting

Puppeteer is a Node.js library that provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full ("headful") Chrome/Chromium.

What can I do?

Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:

  • Generate screenshots and PDFs of pages.

  • Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR", Server-Side Rendering).

  • Automate form submission, UI testing, keyboard input, etc.

  • Create an automated testing environment using the latest JavaScript and browser features.

  • Capture a timeline trace of your site to help diagnose performance issues.

  • Test Chrome Extensions.
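
As a minimal sketch of the first item, the function below loads a page and saves it as both a PNG screenshot and a PDF. The names capturePage and outputPaths and the output file names are assumptions made for this illustration, not part of Puppeteer; page.screenshot and page.pdf are Puppeteer's own methods. The sketch assumes the puppeteer package is already installed.

```javascript
// Hypothetical helper: derives the two output file names from one base path.
function outputPaths(basePath) {
  return {screenshot: `${basePath}.png`, pdf: `${basePath}.pdf`};
}

// Hypothetical wrapper around Puppeteer's page.screenshot() and page.pdf().
async function capturePage(url, basePath) {
  // Loaded lazily so that outputPaths() can be used without a browser installed.
  const {default: puppeteer} = await import('puppeteer');
  const paths = outputPaths(basePath);
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity has mostly settled before capturing.
    await page.goto(url, {waitUntil: 'networkidle2'});
    await page.screenshot({path: paths.screenshot, fullPage: true});
    await page.pdf({path: paths.pdf, format: 'A4'});
  } finally {
    // Always close the browser, even if navigation or capture fails.
    await browser.close();
  }
  return paths;
}
```

Calling capturePage('https://developer.chrome.com/', 'chrome-dev') would write chrome-dev.png and chrome-dev.pdf to the current working directory.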

Getting Started

Installation

To use Puppeteer in your project, run:

npm i puppeteer
# or using yarn
yarn add puppeteer
# or using pnpm
pnpm i puppeteer

When you install Puppeteer, it automatically downloads a recent version of Chrome for Testing (~170MB macOS, ~282MB Linux, ~280MB Windows) and a chrome-headless-shell binary (starting with Puppeteer v21.6.0) that is guaranteed to work with Puppeteer. The browser is downloaded to the $HOME/.cache/puppeteer folder by default (starting with Puppeteer v19.0.0). See the configuration documentation for the options and environment variables that control the download behavior.

If you deploy a project using Puppeteer to a hosting provider, such as Render or Heroku, you might need to reconfigure the location of the cache to be within your project folder (see an example below), because not all hosting providers include $HOME/.cache in the project's deployment.

For a version of Puppeteer without the browser installation, see puppeteer-core.

If used with TypeScript, the minimum supported TypeScript version is 4.7.4.

Configuration

Puppeteer uses several defaults that can be customized through configuration files.

For example, to change the default cache directory Puppeteer uses to install browsers, you can add a .puppeteerrc.cjs (or puppeteer.config.cjs) file at the root of your application with the following contents:

const {join} = require('path');

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  // Changes the cache location for Puppeteer.
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};

After adding the configuration file, you will need to remove and reinstall puppeteer for it to take effect.

See the configuration guide for more information.

puppeteer-core

For every release since v1.7.0 we publish two packages:

puppeteer is a product for browser automation. When installed, it downloads a version of Chrome, which it then drives using puppeteer-core. Being an end-user product, puppeteer automates several workflows using reasonable defaults that can be customized.

puppeteer-core is a library to help drive anything that supports the DevTools Protocol. Being a library, puppeteer-core is fully driven through its programmatic interface, which means that no defaults are assumed and that puppeteer-core will not download Chrome when installed.

You should use puppeteer-core if you are connecting to a remote browser or managing browsers yourself. If you are managing browsers yourself, you will need to call puppeteer.launch with an explicit executablePath (or channel if the browser is installed in a standard location).

When using puppeteer-core, remember to change the import:

import puppeteer from 'puppeteer-core';
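
The three situations described above (connecting to a remote browser, launching a binary you manage, or relying on a standard installation) can be sketched as a small selection helper. The function names resolveBrowserSource and openBrowser, and the environment variable names BROWSER_WS_ENDPOINT and CHROME_PATH, are assumptions made for this sketch; Puppeteer does not read these variables itself. The options browserWSEndpoint, executablePath and channel are the ones accepted by puppeteer.connect and puppeteer.launch.

```javascript
// Hypothetical helper: decides how to obtain a browser from environment
// variables. The variable names are chosen for this sketch only.
function resolveBrowserSource(env) {
  if (env.BROWSER_WS_ENDPOINT) {
    // Attach to a browser that is already running elsewhere.
    return {
      method: 'connect',
      options: {browserWSEndpoint: env.BROWSER_WS_ENDPOINT},
    };
  }
  if (env.CHROME_PATH) {
    // Launch a browser binary that you manage yourself.
    return {method: 'launch', options: {executablePath: env.CHROME_PATH}};
  }
  // Fall back to a Chrome installed in a standard location.
  return {method: 'launch', options: {channel: 'chrome'}};
}

// Hypothetical wrapper: opens a browser through puppeteer-core.
async function openBrowser(env = process.env) {
  // puppeteer-core is loaded lazily; it must be installed for this call to work.
  const {default: puppeteer} = await import('puppeteer-core');
  const {method, options} = resolveBrowserSource(env);
  return method === 'connect'
    ? puppeteer.connect(options)
    : puppeteer.launch(options);
}
```

Keeping the decision in a plain function separates the choice of browser from the call that starts it, so the choice can be checked without any browser present.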

Usage

Puppeteer follows the latest maintenance LTS version of Node.

Puppeteer will be familiar to people using other browser testing frameworks. You launch/connect a browser, create some pages, and then manipulate them with Puppeteer's API.

For more in-depth usage, check our guides and examples.

Example

The following example searches developer.chrome.com for blog posts containing the text "automate beyond recorder", clicks on the first result, and prints the full title of the blog post.

import puppeteer from 'puppeteer';

(async () => {
  // Launch the browser and open a new blank page
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Navigate the page to a URL
  await page.goto('https://developer.chrome.com/');

  // Set screen size
  await page.setViewport({width: 1080, height: 1024});

  // Type into search box
  await page.type('.devsite-search-field', 'automate beyond recorder');

  // Wait and click on first result
  const searchResultSelector = '.devsite-result-item-link';
  await page.waitForSelector(searchResultSelector);
  await page.click(searchResultSelector);

  // Locate the full title with a unique string
  const textSelector = await page.waitForSelector(
    'text/Customize and automate'
  );
  const fullTitle = await textSelector?.evaluate(el => el.textContent);

  // Print the full title
  console.log('The title of this blog post is "%s".', fullTitle);

  await browser.close();
})();

Default runtime settings

1.

By default, Puppeteer launches Chrome in headless mode.

const browser = await puppeteer.launch();
// Equivalent to
const browser = await puppeteer.launch({headless: true});

Before v22, Puppeteer launched the old headless mode by default. The old headless mode is now known as chrome-headless-shell and ships as a separate binary. chrome-headless-shell does not completely match the behavior of regular Chrome, but it is currently more performant for automation tasks where the complete Chrome feature set is not needed. If performance is more important for your use case, switch to chrome-headless-shell as follows:

const browser = await puppeteer.launch({headless: 'shell'});

To launch a "headful" version of Chrome, set the headless option to false when launching a browser:

const browser = await puppeteer.launch({headless: false});

2.

By default, Puppeteer downloads and uses a specific version of Chrome so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:

const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

You can also use Puppeteer with Firefox. See status of cross-browser support for more information.

See this article for a description of the differences between Chromium and Chrome. A separate article describes some differences for Linux users.

3.

Puppeteer creates its own browser user profile, which it cleans up on every run.
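
The launch settings from items 1 and 2 above can be gathered into one place. The following is a minimal sketch under stated assumptions: the mode names 'headless', 'shell' and 'headful' and the functions launchOptionsFor and launchWith are invented for this illustration, while the headless and executablePath options are the ones shown in the snippets above.

```javascript
// Mode names are this sketch's own; the values are the `headless` settings
// described in item 1.
const HEADLESS_BY_MODE = new Map([
  ['headless', true], // default since v22
  ['shell', 'shell'], // chrome-headless-shell, the old headless mode
  ['headful', false], // full Chrome with a visible window
]);

// Hypothetical helper: builds the options object for puppeteer.launch().
function launchOptionsFor(mode, executablePath) {
  if (!HEADLESS_BY_MODE.has(mode)) {
    throw new Error(`Unknown mode "${mode}"`);
  }
  const options = {headless: HEADLESS_BY_MODE.get(mode)};
  if (executablePath) {
    // Item 2: use a different Chrome or Chromium binary.
    options.executablePath = executablePath;
  }
  return options;
}

// Hypothetical wrapper: launches a browser with the selected settings.
async function launchWith(mode, executablePath) {
  const {default: puppeteer} = await import('puppeteer');
  return puppeteer.launch(launchOptionsFor(mode, executablePath));
}
```

For instance, launchWith('shell') would start chrome-headless-shell, and launchWith('headful', '/path/to/Chrome') would open a visible window using the given binary.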

Using Docker

See our Docker guide.

Using Chrome Extensions

See our Chrome extensions guide.

Resources

Contributing

Check out our contributing guide to get an overview of Puppeteer development.

FAQ

Our FAQ has migrated to our site.