Version: 24.38.0

Page interactions

Puppeteer allows interacting with elements on the page through mouse, touch events and keyboard input. Usually you first query a DOM element using a CSS selector and then invoke an action on the selected element. All Puppeteer APIs that accept a selector accept a CSS selector by default. Additionally, Puppeteer offers a custom selector syntax that allows finding elements using XPath, text and accessibility attributes, and accessing Shadow DOM, without the need to execute JavaScript.

If you want to emit mouse or keyboard events without selecting an element first, use the page.mouse, page.keyboard and page.touchscreen APIs. The rest of this guide gives an overview of how to select DOM elements and invoke actions on them.
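
As a minimal sketch of the element-free APIs, the helper below clicks at viewport coordinates and then types into whatever has focus. The function name clickAndType and its parameters are illustrative assumptions; page is expected to be a Puppeteer Page.

```javascript
// Hypothetical helper: emits raw input events without querying an element.
// `page` is assumed to be a Puppeteer Page; x and y are CSS pixels relative
// to the top-left corner of the viewport.
async function clickAndType(page, x, y, text) {
  // Move the mouse to (x, y), then press and release the left button.
  await page.mouse.click(x, y);
  // Send key events for each character to the currently focused element.
  await page.keyboard.type(text);
  // Press and release a single named key.
  await page.keyboard.press('Enter');
}
```

Because no element is selected, nothing waits for the target to exist or to be visible, so this style fits best when the page state is already known.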

Locators

Locators are the recommended way to select an element and interact with it. Locators encapsulate the information on how to select an element, and they allow Puppeteer to automatically wait for the element to be present in the DOM and to be in the right state for the action. You always instantiate a locator using the page.locator() or frame.locator() function. If the locator API doesn't offer the functionality you need, you can still use lower-level APIs such as page.waitForSelector() or ElementHandle.

Clicking an element using locators

// 'button' is a CSS selector.
await page.locator('button').click();

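The locator automatically checks the following before clicking:

  • Ensures the element is in the viewport.
  • Waits for the element to become visible or hidden.
  • Waits for the element to become enabled.
  • Waits for the element to have a stable bounding box over two consecutive animation frames.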

Filling out an input

// 'input' is a CSS selector.
await page.locator('input').fill('value');

The locator automatically detects the input type and chooses an appropriate way to fill it out with the provided value. For example, it will fill out <select> elements as well as <input> elements.
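
To illustrate that the same call covers both element types, here is a small sketch that fills a <select> and an <input> in one form. The helper name fillSignupForm and the selectors select#country and input#email are placeholders, not part of any real page.

```javascript
// Hypothetical form helper; `page` is assumed to be a Puppeteer Page and
// the selectors stand in for elements on your own page.
async function fillSignupForm(page, {country, email}) {
  // For a <select>, the provided value picks the matching option.
  await page.locator('select#country').fill(country);
  // For a text-like <input>, the provided value is entered into the field.
  await page.locator('input#email').fill(email);
}
```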

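The locator automatically checks the following before typing into the input:

  • Ensures the element is in the viewport.
  • Waits for the element to become visible or hidden.
  • Waits for the element to become enabled.
  • Waits for the element to have a stable bounding box over two consecutive animation frames.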

Hovering over an element

await page.locator('div').hover();

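The locator automatically checks the following before hovering:

  • Ensures the element is in the viewport.
  • Waits for the element to become visible or hidden.
  • Waits for the element to have a stable bounding box over two consecutive animation frames.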

Scrolling an element

The .scroll() function uses mouse wheel events to scroll an element.

// Scroll the div element by 10px horizontally
// and by 20px vertically.
await page.locator('div').scroll({
  scrollLeft: 10,
  scrollTop: 20,
});

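The locator automatically checks the following before scrolling:

  • Ensures the element is in the viewport.
  • Waits for the element to become visible or hidden.
  • Waits for the element to have a stable bounding box over two consecutive animation frames.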

Waiting for an element to be visible

Sometimes you only need to wait for the element to be visible.

// '.loading' is a CSS selector.
await page.locator('.loading').wait();

The locator automatically checks the following before returning:

  • Waits for the element to become visible or hidden.
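
The "visible or hidden" wording reflects the locator's visibility setting. The sketch below waits for an element to become hidden instead of visible, using setVisibility('hidden'). The helper name waitUntilHidden is illustrative, and the sketch assumes the element stays in the DOM and is only hidden (for example via CSS) rather than removed.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
// Assumption: the matched element remains in the DOM while hidden.
async function waitUntilHidden(page, selector) {
  await page
    .locator(selector)
    // Switch the visibility precondition from the default to 'hidden'.
    .setVisibility('hidden')
    .wait();
}
```

A typical call would be waitUntilHidden(page, '.loading') after triggering an action that shows a spinner.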

Waiting for a function

Sometimes it is useful to wait for an arbitrary condition expressed as a JavaScript function. In this case, a locator can be defined using a function instead of a selector. The following example waits until at least 3 paragraphs are present on the page, then extracts their text. You can also call locator functions such as .click() or .fill() instead of mapping elements to text.

const paragraphs = await page
  .locator(() => {
    const paragraphs = document.querySelectorAll('p');

    if (paragraphs.length >= 3) {
      return [...paragraphs].map(p => p.textContent);
    }
  })
  .wait();

Applying filters on locators

The following example shows how to add extra conditions, expressed as a JavaScript function, to a locator. The button element will only be clicked if its textContent is 'My button'.

await page
  .locator('button')
  .filter(button => button.textContent === 'My button')
  .click();

Since the .filter() callback is executed in the browser context, it doesn't have access to variables from the Node scope. You can build the function as a string to inject a variable:

const buttonName = 'My button';
await page
  .locator('button')
  .filter(`button => button.textContent === ${JSON.stringify(buttonName)}`)
  .click();
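
The string-building step can be factored into a small reusable function. The sketch below is one way to do it; the name textEquals is an assumption for illustration. It relies on JSON.stringify to turn the Node-side value into a JavaScript string literal, so quotes and backslashes in the value are escaped.

```javascript
// Hypothetical helper: returns the source text of a predicate that can be
// passed to .filter(). The value is embedded as a string literal, so the
// predicate does not depend on any variable from the Node scope.
function textEquals(expected) {
  return `element => element.textContent === ${JSON.stringify(expected)}`;
}

// Possible usage with a Puppeteer page (not executed here):
//   await page.locator('button').filter(textEquals('Say "hi"')).click();
```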

Returning values from a locator

The map function allows mapping an element to a JavaScript value. In this case, calling wait() will return the deserialized JavaScript value.

const enabled = await page
  .locator('button')
  .map(button => !button.disabled)
  .wait();

Returning ElementHandles from a locator

The waitHandle function allows returning the ElementHandle. It might be useful if there is no corresponding locator API for the action you need.

const buttonHandle = await page.locator('button').waitHandle();
await buttonHandle.click();

Configuring locators

Locators can be configured to tune the preconditions and other options:

// Clicks on a button without waiting for any preconditions.
await page
  .locator('button')
  .setEnsureElementIsInTheViewport(false)
  .setVisibility(null)
  .setWaitForEnabled(false)
  .setWaitForStableBoundingBox(false)
  .click();

Locator timeouts

By default, locators inherit the timeout setting from the page. It is also possible to set the timeout on a per-locator basis. A TimeoutError will be thrown if the element is not found or the preconditions are not met within the specified time period.

// Time out after 3 sec.
await page.locator('button').setTimeout(3000).click();
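
A caller will often want to treat a timeout differently from other failures. The following sketch returns false on a timeout and rethrows everything else. The helper name tryClick is illustrative. To stay free of imports it checks error.name, which assumes Puppeteer's TimeoutError instances carry the name 'TimeoutError'; with Puppeteer imported, an instanceof TimeoutError check is the more direct option.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
// Returns true if the click happened, false if the locator timed out.
async function tryClick(page, selector, timeoutMs) {
  try {
    await page.locator(selector).setTimeout(timeoutMs).click();
    return true;
  } catch (error) {
    // Assumption: timeouts surface as an Error named 'TimeoutError'.
    if (error instanceof Error && error.name === 'TimeoutError') {
      return false;
    }
    // Any other error is unexpected, so let it propagate.
    throw error;
  }
}
```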

Getting locator events

Currently, locators support a single event that notifies you when the locator is about to perform the action, indicating that the preconditions have been met:

// LocatorEvent is exported by the puppeteer package.
import {LocatorEvent} from 'puppeteer';

let willClick = false;
await page
  .locator('button')
  .on(LocatorEvent.Action, () => {
    willClick = true;
  })
  .click();

This event can be used for logging, debugging or other purposes. The event might fire multiple times if the locator retries the action.

waitForSelector

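waitForSelector is a lower-level API than locators that allows waiting for an element to be available in the DOM. It does not automatically retry the action if it fails, and it requires you to manually dispose of the resulting ElementHandle to prevent memory leaks. The method exists on Page, Frame and ElementHandle instances.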

// Import puppeteer
import puppeteer from 'puppeteer';

// Launch the browser.
const browser = await puppeteer.launch();

// Create a page.
const page = await browser.newPage();

// Go to your site.
await page.goto('YOUR_SITE');

// Query for an element handle.
const element = await page.waitForSelector('div > .class-name');

// Do something with element...
await element.click(); // Just an example.

// Dispose of handle.
await element.dispose();

// Close browser.
await browser.close();
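
If the action between waitForSelector and dispose() throws, the handle is never disposed. One way to guard against that is a try/finally block, sketched below. The helper name clickAndDispose is illustrative and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: waits for an element, clicks it, and always
// disposes of the handle, even when the click fails.
async function clickAndDispose(page, selector) {
  const element = await page.waitForSelector(selector);
  try {
    await element.click();
  } finally {
    // Runs on success and on failure, so the handle is not leaked.
    await element.dispose();
  }
}
```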

Some page-level APIs such as page.click(selector), page.type(selector) and page.hover(selector) are implemented using waitForSelector for backwards-compatibility reasons.

Querying without waiting

Sometimes you know that the elements are already on the page. In that case, Puppeteer offers multiple ways to find an element or multiple elements matching a selector. These methods exist on Page, Frame and ElementHandle instances.

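  • page.$() returns a single element matching a selector.
  • page.$$() returns all elements matching a selector.
  • page.$eval() returns the result of running a JavaScript function on the first element matching a selector.
  • page.$$eval() returns the result of running a JavaScript function on the array of all elements matching a selector.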
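
The four methods can be compared side by side in a short sketch. The helper name collectLinkInfo and the 'a' selector are illustrative assumptions; page is expected to be a Puppeteer Page. The guard before page.$eval() is there because it rejects when no element matches.

```javascript
// Hypothetical helper that exercises all four query methods on links.
async function collectLinkInfo(page) {
  const first = await page.$('a'); // ElementHandle, or null if no match
  const all = await page.$$('a'); // array of ElementHandles, possibly empty
  try {
    // The callbacks below run in the page, on the matched DOM elements.
    const firstHref = first ? await page.$eval('a', a => a.href) : null;
    const hrefs = await page.$$eval('a', links => links.map(a => a.href));
    return {count: all.length, firstHref, hrefs};
  } finally {
    // Handles returned by $ and $$ should be disposed of when done.
    await first?.dispose();
    await Promise.all(all.map(handle => handle.dispose()));
  }
}
```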

Selectors

Puppeteer accepts CSS selectors in every API that accepts a selector. Additionally, you can opt in to additional selector syntax to do more than CSS selectors offer.

Non-CSS selectors

Puppeteer extends the CSS syntax with custom pseudo-elements that define how to select an element using a non-CSS selector. The pseudo-elements supported by Puppeteer are prefixed with a -p vendor prefix.

XPath selectors (-p-xpath)

XPath selectors use the browser's native Document.evaluate to query for elements.

// Runs the `//h2` as the XPath expression.
const element = await page.waitForSelector('::-p-xpath(//h2)');

Text selectors (-p-text)

Text selectors select "minimal" elements containing the given text, even within (open) shadow roots. Here, "minimal" means the deepest elements that contain the given text, but not their parents (which technically also contain the given text).

// Click a button inside a div element that has Checkout as the inner text.
await page.locator('div ::-p-text(Checkout)').click();
// You need to escape CSS selector syntax such as '(' and ')' if it is part of your search text ('Checkout (2 items)').
await page.locator(':scope >>> ::-p-text(Checkout \\(2 items\\))').click();
// Or use quotes, escaping any quotes that are part of the search text ('He said: "Hello"').
await page.locator(':scope >>> ::-p-text("He said: \\"Hello\\"")').click();
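
The quoting rule above can be wrapped in a small function so that arbitrary search text does not have to be escaped by hand. The sketch below is one possible approach; the name textSelector is an assumption, and it assumes that inside a quoted argument only double quotes and backslashes need a backslash escape.

```javascript
// Hypothetical helper: builds a quoted ::-p-text() selector from raw text.
function textSelector(text) {
  const escaped = text
    .replace(/\\/g, '\\\\') // escape backslashes first
    .replace(/"/g, '\\"'); // then escape double quotes
  return `::-p-text("${escaped}")`;
}

// Possible usage with a Puppeteer page (not executed here):
//   await page.locator(`:scope >>> ${textSelector('He said: "Hello"')}`).click();
```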

ARIA selectors (-p-aria)

ARIA selectors can be used to find elements using the computed accessible name and role. These labels are computed using the browser's internal representation of the accessibility tree. That means that ARIA relationships such as aria-labelledby are resolved before the query is run. ARIA selectors are useful if you do not want to depend on any particular DOM structure or DOM attributes.

await page.locator('::-p-aria(Submit)').click();
await page.locator('::-p-aria([name="Click me"][role="button"])').click();

Pierce selector (pierce/)

The pierce selector returns all elements matching the provided CSS selector in all shadow roots in the document. We recommend using deep combinators instead because they offer more flexibility in combining different selectors. pierce/ is only available in the prefixed notation.

await page.locator('pierce/div').click();
// Same query as the pierce/ one using deep combinators.
await page.locator('& >>> div').click();

Querying elements in Shadow DOM

CSS selectors do not allow descending into Shadow DOM. Therefore, Puppeteer adds two combinators to the CSS selector syntax that allow searching inside shadow DOM.

The >>> combinator

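The >>> is called the deep descendant combinator. It is analogous to the CSS descendant combinator (denoted with a single space character, for example, div button) and it selects matching elements under the parent element at any depth. For example, my-custom-element >>> button would select all button elements that are available inside the shadow DOM of my-custom-element (the shadow host).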

note

Deep combinators only work on the first "depth" of CSS selectors and open shadow roots; for example, :is(div >>> a) will not work.

The >>>> combinator

The >>>> is called the deep child combinator. It is analogous to the CSS child combinator (denoted with >, for example, div > button) and it selects matching elements under the parent element's immediate shadow root, if the element has one. For example, my-custom-element >>>> button would select all button elements that are available inside the immediate shadow root of my-custom-element (the shadow host).
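
The two combinators can be put next to each other in a short sketch that builds one locator per combinator for the same shadow host. The helper name shadowButtonLocators is illustrative, and host is any CSS selector for the shadow host, such as my-custom-element.

```javascript
// Hypothetical helper; `page` is assumed to be a Puppeteer Page.
function shadowButtonLocators(page, host) {
  return {
    // Deep descendant: buttons inside the host's shadow DOM at any depth.
    anyDepth: page.locator(`${host} >>> button`),
    // Deep child: buttons under the host's immediate shadow root only.
    immediate: page.locator(`${host} >>>> button`),
  };
}
```

The returned locators can then be used like any other, for example shadowButtonLocators(page, 'my-custom-element').immediate.click().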

Custom selectors

You can also add your own pseudo-element using Puppeteer.registerCustomQueryHandler. This is useful for creating custom selectors based on framework objects or your application.

For example, you can write all your selectors using the react-component pseudo-element and implement custom logic for resolving the provided ID.

Puppeteer.registerCustomQueryHandler('react-component', {
  queryOne: (elementOrDocument, selector) => {
    // Dummy example just delegates to querySelector but you can find your
    // React component because this callback runs in the page context.
    return elementOrDocument.querySelector(`[id="${CSS.escape(selector)}"]`);
  },
  queryAll: (elementOrDocument, selector) => {
    // Dummy example just delegates to querySelectorAll but you can find your
    // React components because this callback runs in the page context.
    return elementOrDocument.querySelectorAll(`[id="${CSS.escape(selector)}"]`);
  },
});

In your application you can now write selectors as follows.

await page.locator('::-p-react-component(MyComponent)').click();
// OR used in conjunction with other selectors.
await page.locator('.side-bar ::-p-react-component(MyComponent)').click();

Another example shows how you can define a custom query handler for locating Vue components:

caution

Be careful when relying on internal APIs of libraries or frameworks. They can change at any time.

Puppeteer.registerCustomQueryHandler('vue', {
  queryOne: (element, name) => {
    const walker = document.createTreeWalker(element, NodeFilter.SHOW_ELEMENT);
    do {
      const currentNode = walker.currentNode;
      // __vnode is internal to Vue, so every step of the chain is optional.
      if (
        currentNode.__vnode?.ctx?.type?.name?.toLowerCase() ===
        name.toLowerCase()
      ) {
        return currentNode;
      }
    } while (walker.nextNode());

    return null;
  },
});

Search for a given Vue component as follows:

const element = await page.$('::-p-vue(MyComponent)');

Prefixed selector syntax

caution

While we maintain prefixed selectors, the recommended way is to use the selector syntax documented above.

The following legacy syntax (${nonCssSelectorName}/${nonCssSelector}), which runs a single non-CSS selector at a time, is also supported. Note that this syntax does not allow combining multiple selectors.

// Same as ::-p-text("My text").
await page.locator('text/My text').click();
// Same as ::-p-xpath(//h2).
await page.locator('xpath///h2').click();
// Same as ::-p-aria(My label).
await page.locator('aria/My label').click();

// Same as & >>> div.
await page.locator('pierce/div').click();