Page interactions
Puppeteer allows interacting with elements on the page through mouse, touch events and keyboard input. Usually, you first query a DOM element using a CSS selector and then invoke an action on the selected element. All Puppeteer APIs that accept a selector accept a CSS selector by default. Additionally, Puppeteer offers custom selector syntax that allows finding elements using XPath, text and accessibility attributes, and accessing Shadow DOM, without the need to execute JavaScript.
If you want to emit mouse or keyboard events without selecting an element first, use the page.mouse, page.keyboard and page.touchscreen APIs. The rest of this guide gives an overview of how to select DOM elements and invoke actions on them.
Locators
Locators are the recommended way to select an element and interact with it. Locators encapsulate the information on how to select an element, and they allow Puppeteer to automatically wait for the element to be present in the DOM and to be in the right state for the action. You always instantiate a locator using the page.locator() or frame.locator() function. If the locator API doesn't offer a functionality you need, you can still use lower-level APIs such as page.waitForSelector() or ElementHandle.
Clicking an element using locators
// 'button' is a CSS selector.
await page.locator('button').click();
The locator automatically checks the following before clicking:
- Ensures the element is in the viewport.
- Waits for the element to become visible or hidden.
- Waits for the element to become enabled.
- Waits for the element to have a stable bounding box over two consecutive animation frames.
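The stable bounding box check can be pictured as sampling the element's rectangle once per animation frame and proceeding as soon as two consecutive samples match. The sketch below only illustrates that idea on plain data; it is not Puppeteer's implementation, and Box, sameBox and framesUntilStable are hypothetical names.

```typescript
// Hypothetical types and helpers, for illustration only.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// True when both samples describe exactly the same rectangle.
function sameBox(a: Box, b: Box): boolean {
  return (
    a.x === b.x && a.y === b.y && a.width === b.width && a.height === b.height
  );
}

// Given one sample per animation frame, returns the index of the first frame
// whose box equals the previous frame's box, or -1 if that never happens.
function framesUntilStable(samples: Box[]): number {
  for (let i = 1; i < samples.length; i++) {
    if (sameBox(samples[i - 1], samples[i])) {
      return i;
    }
  }
  return -1;
}

// An element sliding in from the left: it moves for two frames, then settles.
const sliding: Box[] = [
  {x: 0, y: 0, width: 100, height: 40},
  {x: 20, y: 0, width: 100, height: 40},
  {x: 40, y: 0, width: 100, height: 40},
  {x: 40, y: 0, width: 100, height: 40},
];
console.log(framesUntilStable(sliding)); // 3
```

In this model an animated element only becomes actionable once it has stopped moving, which is why a click issued during a transition is delayed rather than sent to a stale position.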
Filling out an input
// 'input' is a CSS selector.
await page.locator('input').fill('value');
The locator automatically detects the input type and chooses an appropriate way to fill it out with the provided value. For example, it will fill out <select> elements as well as <input> elements.
The locator automatically checks the following before typing into the input:
- Ensures the element is in the viewport.
- Waits for the element to become visible or hidden.
- Waits for the element to become enabled.
- Waits for the element to have a stable bounding box over two consecutive animation frames.
Hover over an element
await page.locator('div').hover();
The locator automatically checks the following before hovering:
- Ensures the element is in the viewport.
- Waits for the element to become visible or hidden.
- Waits for the element to have a stable bounding box over two consecutive animation frames.
Scroll an element
The .scroll() function uses mouse wheel events to scroll an element.
// Scroll the div element by 10px horizontally
// and by 20px vertically.
await page.locator('div').scroll({
  scrollLeft: 10,
  scrollTop: 20,
});
The locator automatically checks the following before scrolling:
- Ensures the element is in the viewport.
- Waits for the element to become visible or hidden.
- Waits for the element to have a stable bounding box over two consecutive animation frames.
Waiting for element to be visible
Sometimes you only need to wait for the element to be visible.
// '.loading' is a CSS selector.
await page.locator('.loading').wait();
The locator automatically checks the following before returning:
- Waits for the element to become visible or hidden.
Waiting for a function
Sometimes it is useful to wait for an arbitrary condition expressed as a JavaScript function. In this case, the locator can be defined using a function instead of a selector. The following example waits until a MutationObserver detects an HTMLCanvasElement element appearing on the page. You can also call other locator functions such as .click() or .fill() on the function locator.
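await page
  .locator(() => {
    let resolve!: (node: HTMLCanvasElement) => void;
    const promise = new Promise<HTMLCanvasElement>(res => {
      resolve = res;
    });
    const observer = new MutationObserver(records => {
      for (const record of records) {
        // For childList mutations, newly inserted nodes are in addedNodes.
        for (const node of record.addedNodes) {
          if (node instanceof HTMLCanvasElement) {
            resolve(node);
          }
        }
      }
    });
    // observe() requires an options object; watch the whole document tree.
    observer.observe(document, {childList: true, subtree: true});
    return promise;
  })
  .wait();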
Applying filters on locators
The following example shows how to add an extra condition, expressed as a JavaScript function, to a locator. The button element will only be clicked if its innerText is 'My button'.
await page
  .locator('button')
  .filter(button => button.innerText === 'My button')
  .click();
Returning values from a locator
The map function allows mapping an element to a JavaScript value. In this case, calling wait() will return the deserialized JavaScript value.
const enabled = await page
  .locator('button')
  .map(button => !button.disabled)
  .wait();
Returning ElementHandles from a locator
The waitHandle function allows returning the ElementHandle. It might be useful if there is no corresponding locator API for the action you need.
const buttonHandle = await page.locator('button').waitHandle();
await buttonHandle.click();
Configuring locators
Locators can be configured to tune the preconditions and other options:
// Clicks on a button without waiting for any preconditions.
await page
  .locator('button')
  .setEnsureElementIsInTheViewport(false)
  .setVisibility(null)
  .setWaitForEnabled(false)
  .setWaitForStableBoundingBox(false)
  .click();
Locator timeouts
By default, locators inherit the timeout setting from the page, but it is possible to set the timeout on a per-locator basis. A TimeoutError is thrown if the element is not found or the preconditions are not met within the specified time period.
// Time out after 3 sec.
await page.locator('button').setTimeout(3000).click();
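Since a missed deadline surfaces as a thrown error, callers often want to tell a timeout apart from other failures. The following is a minimal sketch of that pattern. ClickableLocator is a structural stand-in so the code runs without a browser, tryClick is a hypothetical helper, and recognizing the error by its name property is an assumption; with Puppeteer you could check error instanceof TimeoutError instead.

```typescript
// Minimal structural stand-in for the part of a locator used here.
// With Puppeteer you would pass page.locator('button') instead.
interface ClickableLocator {
  setTimeout(timeout: number): ClickableLocator;
  click(): Promise<void>;
}

// Clicks with a per-locator timeout and reports whether the click happened.
// Assumption: timeout errors carry the name 'TimeoutError'.
async function tryClick(
  locator: ClickableLocator,
  timeoutMs: number,
): Promise<boolean> {
  try {
    await locator.setTimeout(timeoutMs).click();
    return true;
  } catch (error) {
    if (error instanceof Error && error.name === 'TimeoutError') {
      return false;
    }
    throw error; // Anything else is unexpected; let it propagate.
  }
}
```

Returning false only for timeouts keeps genuine bugs, such as a detached frame or a closed page, visible instead of silently swallowing them.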
Getting locator events
Currently, locators support a single event that notifies you when the locator is about to perform the action, indicating that the preconditions have been met:
// LocatorEvent is exported by the puppeteer package.
import {LocatorEvent} from 'puppeteer';
let willClick = false;
await page
  .locator('button')
  .on(LocatorEvent.Action, () => {
    willClick = true;
  })
  .click();
This event can be used for logging, debugging or other purposes. The event might fire multiple times if the locator retries the action.
waitForSelector
waitForSelector is a lower-level API compared to locators that allows waiting for an element to be available in the DOM. It does not automatically retry the action if it fails, and it requires manually disposing of the resulting ElementHandle to prevent memory leaks. The method exists on Page, Frame and ElementHandle instances.
// Import puppeteer
import puppeteer from 'puppeteer';
// Launch the browser.
const browser = await puppeteer.launch();
// Create a page.
const page = await browser.newPage();
// Go to your site.
await page.goto('YOUR_SITE');
// Query for an element handle.
const element = await page.waitForSelector('div > .class-name');
// Do something with element...
await element.click(); // Just an example.
// Dispose of handle.
await element.dispose();
// Close browser.
await browser.close();
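Because the handle has to be disposed of even when the action throws, a try/finally wrapper is one way to avoid leaking it. The following is a minimal sketch: withHandle is a hypothetical helper, not a Puppeteer API, and DisposableHandle is a structural stand-in for ElementHandle so the code runs without a browser.

```typescript
// Structural stand-in for an ElementHandle: only dispose() matters here.
interface DisposableHandle {
  dispose(): Promise<void>;
}

// Runs `use` with the handle and always disposes of it afterwards, even if
// `use` throws. Hypothetical helper for illustration.
async function withHandle<H extends DisposableHandle, R>(
  handle: H,
  use: (handle: H) => Promise<R>,
): Promise<R> {
  try {
    return await use(handle);
  } finally {
    await handle.dispose();
  }
}

// With Puppeteer, usage might look like this (not executed here):
//   const element = await page.waitForSelector('div > .class-name');
//   if (element) {
//     await withHandle(element, el => el.click());
//   }
```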
Some page-level APIs, such as page.click(selector), page.type(selector) and page.hover(selector), are implemented using waitForSelector for backwards-compatibility reasons.
Querying without waiting
Sometimes you know that the elements are already on the page. In that case, Puppeteer offers multiple ways to find an element or multiple elements matching a selector. These methods exist on Page, Frame and ElementHandle instances.
- page.$() returns a single element matching a selector.
- page.$$() returns all elements matching a selector.
- page.$eval() returns the result of running a JavaScript function on the first element matching a selector.
- page.$$eval() returns the result of running a JavaScript function on the array of all elements matching a selector.
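The difference between the two eval variants is what the callback receives: one element versus an array of elements. The sketch below shows that shape only. ElementLike and QueryPage are simplified structural stand-ins (assumptions, not Puppeteer's exact types) so the code runs without a browser, and summarize is a hypothetical helper.

```typescript
// Structural stand-ins so the sketch runs without a browser. The simplified
// signatures are assumptions for illustration, not Puppeteer's exact types.
interface ElementLike {
  textContent: string | null;
}
interface QueryPage {
  $eval<T>(selector: string, fn: (element: ElementLike) => T): Promise<T>;
  $$eval<T>(selector: string, fn: (elements: ElementLike[]) => T): Promise<T>;
}

// $eval hands the callback one element; $$eval hands it the array of all
// matching elements, so array methods such as map() are available.
async function summarize(
  page: QueryPage,
): Promise<{title: string | null; linkTexts: string[]}> {
  const title = await page.$eval('h1', h1 => h1.textContent);
  const linkTexts = await page.$$eval('a', anchors =>
    anchors.map(a => a.textContent ?? ''),
  );
  return {title, linkTexts};
}
```

With a real page, the callbacks run in the page context, so they should not rely on variables from the surrounding Node.js scope.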
Selectors
Puppeteer accepts CSS selectors in every API that accepts a selector. Additionally, you can opt in to additional selector syntax to do more than CSS selectors offer.
Non-CSS selectors
Puppeteer extends the CSS syntax with custom pseudo-elements that define how to select an element using a non-CSS selector. The pseudo-elements supported by Puppeteer are prefixed with a -p vendor prefix.
XPath selectors (-p-xpath)
XPath selectors use the browser's native Document.evaluate to query for elements.
// Runs the `//h2` as the XPath expression.
const element = await page.waitForSelector('::-p-xpath(//h2)');
Text selectors (-p-text)
Text selectors select "minimal" elements containing the given text, even within (open) shadow roots. Here, "minimal" means the deepest elements that contain the given text, but not their parents (which technically also contain the given text).
// Click a button inside a div element that has Checkout as the inner text.
await page.locator('div ::-p-text(Checkout)').click();
// You need to escape CSS selector syntax such as '(' and ')' if it is part of your search text ('Checkout (2 items)').
await page.locator(':scope >>> ::-p-text(Checkout \\(2 items\\))').click();
// Or use quotes, escaping any quotes that are part of the search text ('He said: "Hello"').
await page.locator(':scope >>> ::-p-text("He said: \\"Hello\\"")').click();
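When the search text comes from a variable, escaping by hand is error-prone. The following is a minimal sketch of a helper that builds the quoted form shown above. textSelector is a hypothetical name, and the sketch assumes, as the quoted example suggests, that only backslashes and double quotes need escaping inside the quotes.

```typescript
// Hypothetical helper: wraps arbitrary text in a quoted ::-p-text() selector,
// escaping backslashes and double quotes so they are not read as CSS syntax.
function textSelector(text: string): string {
  const escaped = text.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return `::-p-text("${escaped}")`;
}

console.log(textSelector('Checkout (2 items)'));
// ::-p-text("Checkout (2 items)")
console.log(textSelector('He said: "Hello"'));
// ::-p-text("He said: \"Hello\"")
```

The result can be combined with other selector parts, for example by prepending ':scope >>> ' before passing it to page.locator().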
ARIA selectors (-p-aria)
ARIA selectors can be used to find elements using the computed accessible name and role. These labels are computed using the browser's internal representation of the accessibility tree. That means that ARIA relationships such as aria-labelledby are resolved before the query is run. ARIA selectors are useful if you do not want to depend on any particular DOM structure or DOM attributes.
await page.locator('::-p-aria(Submit)').click();
await page.locator('::-p-aria([name="Click me"][role="button"])').click();
Pierce selector (pierce/)
The pierce selector returns all elements matching the provided CSS selector in all shadow roots in the document. We recommend using deep combinators instead because they offer more flexibility in combining different selectors. pierce/ is only available in the prefixed notation.
await page.locator('pierce/div').click();
// Same query as the pierce/ one using deep combinators.
await page.locator('& >>> div').click();
Querying elements in Shadow DOM
CSS selectors do not allow descending into Shadow DOM. Therefore, Puppeteer adds two combinators to the CSS selector syntax that allow searching inside shadow DOM.
The >>> combinator
The >>> combinator is called the deep descendant combinator. It is analogous to CSS's descendant combinator (denoted with a single space character, for example, div button), and it selects matching elements under the parent element at any depth. For example, my-custom-element >>> button would select all button elements that are available inside the shadow DOM of my-custom-element (the shadow host).
Deep combinators only work on the first "depth" of CSS selectors and on open shadow roots; for example, :is(div >>> a) will not work.
The >>>> combinator
The >>>> combinator is called the deep child combinator. It is analogous to CSS's child combinator (denoted with >, for example, div > button), and it selects matching elements under the parent element's immediate shadow root, if the element has one. For example, my-custom-element >>>> button would select all button elements that are available inside the immediate shadow root of my-custom-element (the shadow host).
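The difference between the two combinators shows up once shadow roots are nested. The toy model below illustrates the descriptions above on a plain object tree. It is not how Puppeteer evaluates selectors; ToyNode, deepDescendants and deepChildren are hypothetical names, and the exact traversal rules (for example, that light-DOM descendants also count) are assumptions of this sketch.

```typescript
// Toy tree model, for illustration only: each node has light-DOM children
// and, optionally, an open shadow root with its own children.
interface ToyNode {
  tag: string;
  children?: ToyNode[];
  shadow?: ToyNode[];
}

// Collects matching nodes below `roots` without entering any shadow root.
function lightMatches(roots: ToyNode[], tag: string): ToyNode[] {
  const found: ToyNode[] = [];
  for (const node of roots) {
    if (node.tag === tag) found.push(node);
    found.push(...lightMatches(node.children ?? [], tag));
  }
  return found;
}

// Models `host >>> tag`: matches at any depth, crossing every shadow root.
function deepDescendants(host: ToyNode, tag: string): ToyNode[] {
  const found: ToyNode[] = [];
  const visit = (nodes: ToyNode[]): void => {
    for (const node of nodes) {
      if (node.tag === tag) found.push(node);
      visit(node.children ?? []);
      visit(node.shadow ?? []);
    }
  };
  visit(host.children ?? []);
  visit(host.shadow ?? []);
  return found;
}

// Models `host >>>> tag`: matches inside the host's own shadow root only.
function deepChildren(host: ToyNode, tag: string): ToyNode[] {
  return lightMatches(host.shadow ?? [], tag);
}

const host: ToyNode = {
  tag: 'my-custom-element',
  shadow: [
    {tag: 'button'},
    {tag: 'inner-element', shadow: [{tag: 'button'}]},
  ],
};
console.log(deepDescendants(host, 'button').length); // 2
console.log(deepChildren(host, 'button').length); // 1
```

In this model the button inside inner-element's shadow root is reachable with >>> but not with >>>>, because the latter stops at the host's immediate shadow root.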
Custom selectors
You can also add your own pseudo-element using Puppeteer.registerCustomQueryHandler. This is useful for creating custom selectors based on framework objects or your application.
For example, you can write all your selectors using the react-component pseudo-element and implement custom logic to resolve the provided ID.
Puppeteer.registerCustomQueryHandler('react-component', {
  queryOne: (elementOrDocument, selector) => {
    // Dummy example that just delegates to querySelector. Because this
    // callback runs in the page context, you could find your React component here.
    return elementOrDocument.querySelector(`[id="${CSS.escape(selector)}"]`);
  },
  queryAll: (elementOrDocument, selector) => {
    // Dummy example that just delegates to querySelectorAll. Because this
    // callback runs in the page context, you could find your React components here.
    return elementOrDocument.querySelectorAll(`[id="${CSS.escape(selector)}"]`);
  },
});
In your application, you can now write selectors as follows.
await page.locator('::-p-react-component(MyComponent)').click();
// OR used in conjunction with other selectors.
await page.locator('.side-bar ::-p-react-component(MyComponent)').click();
Another example shows how you can define a custom query handler for locating Vue components.
Be careful when relying on internal APIs of libraries or frameworks. They can change at any time.
Puppeteer.registerCustomQueryHandler('vue', {
  queryOne: (element, name) => {
    const walker = document.createTreeWalker(element, NodeFilter.SHOW_ELEMENT);
    do {
      const currentNode = walker.currentNode;
      // __vnode is a Vue internal; guard every step, including a missing name.
      if (
        currentNode.__vnode?.ctx?.type?.name?.toLowerCase() ===
        name.toLowerCase()
      ) {
        return currentNode;
      }
    } while (walker.nextNode());
    return null;
  },
});
Search for a given Vue component as follows:
const element = await page.$('::-p-vue(MyComponent)');
Prefixed selector syntax
While we maintain prefixed selectors, the recommended way is to use the selector syntax documented above.
The following legacy syntax (${nonCssSelectorName}/${nonCssSelector}), which runs a single non-CSS selector at a time, is also supported. Note that this syntax does not allow combining multiple selectors.
// Same as ::-p-text("My text").
await page.locator('text/My text').click();
// Same as ::-p-xpath(//h2).
await page.locator('xpath///h2').click();
// Same as ::-p-aria(My label).
await page.locator('aria/My label').click();
await page.locator('pierce/div').click();
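If you are moving existing selectors away from the legacy syntax, the mapping in the comments above can be applied mechanically. The following is a minimal sketch of such a rewrite. toPseudoElement is a hypothetical helper; it covers only the text, xpath and aria prefixes, and it does not escape CSS syntax inside xpath or aria values.

```typescript
// Hypothetical helper: rewrites a legacy prefixed selector into the
// pseudo-element form, following the "Same as" comments above.
// Selectors without such an equivalent (for example pierce/) are returned as-is.
function toPseudoElement(selector: string): string {
  const match = /^(text|xpath|aria)\/([\s\S]*)$/.exec(selector);
  if (!match) {
    return selector;
  }
  const [, name, value] = match;
  if (name === 'text') {
    // Quote the text so CSS syntax characters in it need no extra escaping.
    const escaped = value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
    return `::-p-text("${escaped}")`;
  }
  return `::-p-${name}(${value})`;
}

console.log(toPseudoElement('text/My text')); // ::-p-text("My text")
console.log(toPseudoElement('xpath///h2')); // ::-p-xpath(//h2)
console.log(toPseudoElement('aria/My label')); // ::-p-aria(My label)
console.log(toPseudoElement('pierce/div')); // pierce/div
```

Once rewritten, the selectors can be combined with other selector parts, which the prefixed notation does not allow.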