
Best Practices

Rebrowser works with any library that controls a browser using CDP (Chrome DevTools Protocol). Most automation libraries use this protocol to send commands to the browser.

By default, our system will automatically stop your run if there are no commands from your end for 30 seconds. This helps prevent browsers from running indefinitely if your script disconnects for any reason.
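If your script needs to pause for longer than the idle window (for example, while waiting on an external job), you can keep the session alive by sending a lightweight command on an interval. The helper below is our own illustration, not a Rebrowser API; a minimal sketch:

```javascript
// Sketch of a keep-alive loop: `send` should be any cheap CDP round trip,
// e.g. () => page.evaluate(() => 1) in Puppeteer. Fire it well within the
// 30-second idle window. Returns a function that stops the loop.
function startKeepAlive(send, intervalMs = 15000) {
  const timer = setInterval(() => {
    // If the session is already gone, stop pinging instead of crashing.
    Promise.resolve(send()).catch(() => clearInterval(timer))
  }, intervalMs)
  return () => clearInterval(timer)
}
```

Call the returned stop function once your long-running work is done, before closing the browser, so the timer doesn't keep your process alive.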

When working with a remote browser, it's important to pay attention to how you end your script.

In Puppeteer, calling browser.close() is usually enough to gracefully finish the remote browser session.

If you're working with raw CDP, you can send either Browser.close or Rebrowser.finishRun.
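Over a raw WebSocket connection, each CDP command is a small JSON message with a client-chosen id, a method name, and optional parameters. As a sketch (the helper below is our own illustration, not part of any library):

```javascript
// Build the JSON envelope for a single CDP command.
function cdpMessage(id, method, params = {}) {
  return JSON.stringify({ id, method, params })
}

// Over an open WebSocket `ws` to your run you would then send, e.g.:
//   ws.send(cdpMessage(1, 'Rebrowser.finishRun'))
// or
//   ws.send(cdpMessage(1, 'Browser.close'))
```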

If the run finishes automatically due to the timeout, you'll be charged for this idle time without getting any use out of the remote browser.

Our research suggests that Playwright and Puppeteer might not send the proper CDP command when you call browser.close(). In this case, we recommend sending the Rebrowser.finishRun command explicitly before browser.close().

Here's an example:

// puppeteer
await page._client().send('Rebrowser.finishRun')
await browser.close()

// playwright
await (await browser.newBrowserCDPSession()).send('Rebrowser.finishRun')
await browser.close()

By default, Puppeteer has a defaultViewport parameter set to 800x600. This means the library will send an Emulation CDP command to the browser right after it connects. This isn't ideal for a remote browser since it's running on real hardware and has actual values for the viewport and other parameters used by browser fingerprinting systems.

To disable this behavior, specify defaultViewport: null in your connect method. Here's an example:

const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://ws.rebrowser.net/?${new URLSearchParams(rebrowserParams)}`,
  defaultViewport: null,
})

A typical web automation pipeline works with a local browser, which means you get almost instant responses to any CDP command you send.

However, all runs on Rebrowser are executed on a remote browser. This introduces some extra time for sending CDP commands and getting responses.

A straightforward way to deal with this is to be deliberate about running multiple async operations. In many cases you can optimize them, for example by combining multiple commands into one.

Before optimization: four await calls, each triggering a separate CDP command that needs to go to the remote browser, be processed, and return a response. (Note that getProperty() returns a handle, so reading the actual value takes yet another round trip via jsonValue().)

const form = await page.$('form')
const formName = await (await form.getProperty('name')).jsonValue()
await page.focus('#input')

After optimization: just 1 await call, one CDP command.

const { formName } = await page.evaluate(() => {
  const form = document.querySelector('form')
  const formName = form.name
  document.querySelector('#input').focus()
  return { formName }
})

Creating a new page is a resource-intensive operation that typically takes 700-900ms. To optimize this, our platform provides each browser instance with a pre-loaded about:blank page. By utilizing this existing page instead of creating a new one, you can reduce the startup time to 200-250ms.

The optimized approach first attempts to use the existing page, and only creates a new one if necessary, resulting in significantly faster execution.

Here's how to implement this optimization in Puppeteer:

// standard approach (700-900ms)
const page = await browser.newPage()
// optimized approach (200-250ms)
const page = (await browser.pages())[0] || (await browser.newPage())

Another example in Playwright:

// standard approach (700-900ms)
const context = await browser.newContext()
const page = await context.newPage()
// optimized approach (200-250ms)
const context = browser.contexts()[0] || (await browser.newContext())
const page = context.pages()[0] || (await context.newPage())

We recommend starting your work and debugging your logic with all content loaded as in a normal browser. However, once your pipeline is working and running stably, you might consider disabling the loading of some resources such as images, CSS, and others.

This will help improve page load performance and also reduce bandwidth on your proxy. If you're using metered residential proxies, all those images and ads could easily consume most of your balance.

In the browser settings of your profiles and groups, you'll see an option called "Load Images" that allows you to disable loading any image resources during your runs.

You can also prevent many more types of content from loading, though the actual implementation depends on your automation library. Below is an example of how to do this in Puppeteer.

await page.setRequestInterception(true)
page.on('request', req => {
  const type = req.resourceType()
  if (type === 'stylesheet' || type === 'font' || type === 'image') {
    req.abort()
  } else {
    req.continue()
  }
})

Beware: this technique could increase the chances of your automation being detected. Some websites might analyze whether your browser actually loaded certain resources that might seem unnecessary at first glance.

You can enable verbose debug mode to see which CDP commands your driver is sending and receiving. It's really useful when you're diving deep into debugging complex issues.

This feature is particularly useful for debugging issues that occur when your code behaves differently in remote browsers compared to local ones. By comparing the logs between both environments, you can identify discrepancies and pinpoint the root cause of the problem.

  • Puppeteer: DEBUG="puppeteer:*" node pptr.js
  • Playwright: DEBUG="pw:*" node pw.js

A common question from our clients is whether they should use puppeteer-extra with our remote browsers.

While puppeteer-extra-plugin-stealth is a powerful extension for Puppeteer that helps bypass antibot systems, its utility has changed over time. The package was highly effective for masking headless Chromium browsers in Linux environments during 2022. However, antibot systems have evolved significantly since then, and many of the package's techniques have become less effective or even counterproductive.

We strongly advise against using the stealth plugin with our cloud browsers. Our browsers already run on real hardware with authentic fingerprints, making additional spoofing unnecessary and potentially harmful. The genuine browser fingerprints work reliably out of the box - this is a key advantage of our service.

However, if you wish to use puppeteer-extra for other features specific to your use case, that's perfectly fine - our system is fully compatible with the library.

Bottom line: Avoid using puppeteer-extra-plugin-stealth with Rebrowser remote browsers, as it will likely decrease your success rates and compromise performance.