
How to Access Main Context Objects from Isolated Context in Puppeteer & Playwright

published 4 months ago
by Nick Webson

Our rebrowser-patches library, released some time ago, got really good feedback from the web automation community. It significantly increased success rates and eliminated captcha challenges from many major anti-automation vendors on the market.

However, our approach has one flaw: in always-isolated mode (REBROWSER_PATCHES_RUNTIME_FIX_MODE=alwaysIsolated), all your code is executed in a separate isolated JS context, which can't see JavaScript objects defined in the main context of the page.

For the detailed reasons behind that, you can read our previous post, "How to fix Runtime.Enable CDP detection of Puppeteer, Playwright and other automation libraries?".

For many people, it's a dealbreaker as they have to deal with some JS objects that are defined in the main context.

Here is a real use case from one of our customers. They've been running this code to catch the moment when the recaptcha script was loaded:

await page.waitForFunction(`typeof window.grecaptcha.execute === 'function'`)

After the patch is applied, this code will be executed in a new isolated context that will never have the object window.grecaptcha.

To get around this limitation, we can borrow an idea from Chrome docs for extension developers, more specifically "Communication with the embedding page".

This approach leverages window messages to communicate between contexts. page.evaluateOnNewDocument is not patched and still uses the main context to execute the script, so we can inject some good stuff in there.

Let's try to create a proof of concept.

// add event listener for window messages (executed in the main context)
await page.evaluateOnNewDocument(() => {
  window.addEventListener('message', (event) => {
    console.log('[main] msg', event)
    // optional chaining: other scripts may post messages whose data is null
    if (!event.data?.scriptId || event.data.fromMain) {
      // ignore messages without scriptId and from ourselves (from main context)
      return
    }

    const response = {
      scriptId: event.data.scriptId,
      fromMain: true,
    }
    try {
      response.result = eval(event.data.scriptText)
    } catch (err) {
      response.error = err.message
    }

    window.postMessage(JSON.parse(JSON.stringify(response)))
  })
})

await page.goto('https://bot-detector.rebrowser.net', { waitUntil: 'load' })

// add a helper that we can reuse (executed in an isolated context)
await page.evaluate(() => {
  // listen for messages from main and emit custom event with a response for a specific scriptId
  window.addEventListener('message', (event) => {
    // optional chaining: other scripts may post messages whose data is null
    if (!(event.data?.scriptId && event.data.fromMain)) {
      // ignore irrelevant messages
      return
    }
    console.log('[isolated] msg', event)
    window.dispatchEvent(new CustomEvent(`scriptId-${event.data.scriptId}`, { detail: event.data }))
  })

  // a helper that can be reused in other page.evaluate calls
  window.evaluateMain = (scriptFn) => {
    // generate unique scriptId for each call
    window.evaluateMainScriptId = (window.evaluateMainScriptId || 0) + 1
    const scriptId = window.evaluateMainScriptId
    return new Promise(resolve => {
      // listen for the response
      window.addEventListener(`scriptId-${scriptId}`, (event) => {
        resolve(event.detail)
      }, {
        once: true,
      })

      // prepare and send a message for the main context
      let scriptText = scriptFn
      if (typeof scriptText !== 'string') {
        scriptText = `(${scriptText.toString()})()`
      }
      window.postMessage({
        scriptId,
        scriptText,
      })
    })
  }
})

// use our helper to evaluate code in the main context
await page.evaluate(() => window.evaluateMain(() => document.getElementsByClassName('div')))

Boom. This code successfully passes the main world execution test in rebrowser-bot-detector.
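The customer's grecaptcha check from the beginning of the post can be rebuilt on top of evaluateMain. Here is a minimal sketch, not part of rebrowser-patches: waitForMain and its timeout and interval options are hypothetical names, and it assumes the window.evaluateMain helper above is already installed on the page. The original expression throws while window.grecaptcha is still undefined, so the sketch treats an error response as "not ready yet" and keeps polling.

```javascript
// Hypothetical Node-side helper (not part of rebrowser-patches).
// `page` is a Puppeteer or Playwright page on which the window.evaluateMain
// helper from this post has already been installed.
async function waitForMain(page, scriptText, { timeout = 30000, interval = 100 } = {}) {
  const deadline = Date.now() + timeout
  let lastError = 'no truthy result'
  while (Date.now() < deadline) {
    // runs in the isolated context and forwards the text to the main context
    const response = await page.evaluate((text) => window.evaluateMain(text), scriptText)
    if (response.result) {
      return response.result
    }
    if (response.error) {
      // e.g. a TypeError while window.grecaptcha is not defined yet
      lastError = response.error
    }
    await new Promise((resolve) => setTimeout(resolve, interval))
  }
  throw new Error(`waitForMain timed out after ${timeout}ms: ${lastError}`)
}
```

A call could look like await waitForMain(page, "typeof window.grecaptcha.execute === 'function'"). Since evaluateMain lives in the page's JS context, it would have to be installed again after a navigation.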

Issues with unsafe-eval

Using this approach, you might get this error:

Refused to evaluate a string as JavaScript because 'unsafe-eval' is not an allowed source of script in the following Content Security Policy directive: ...

It means that the page has a Content Security Policy (CSP) that prohibits eval(), which we used in the code above.

You could use page.setBypassCSP(true) to fix this issue, but it's not recommended as it could be detected by a remote website quite easily. You can read more in rebrowser-bot-detector.

Another way to fix it is to not use eval() at all. So, instead of:

response.result = eval(event.data.scriptText)

You can use more explicit code:

// isolated context: send a command description instead of code
await page.evaluate(() => window.evaluateMain(JSON.stringify({
  function: 'document.getElementById',
  args: ['detections-json'],
})))

// main context: inside the message listener, in place of eval()
const scriptData = JSON.parse(event.data.scriptText)
if (scriptData.function === 'document.getElementById') {
  response.result = document.getElementById(...scriptData.args)
}

This code won't break any CSP and will return the same result. Yes, it's more explicit and less flexible as you need to edit it every time you need to introduce a new function, but it gets the job done.
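If the list of supported functions grows, the if chain could be swapped for a lookup table. This is only a sketch of that idea: mainCommands, runMainCommand and the grecaptchaReady command are hypothetical names, not something the patches provide. Both would need to be defined inside the page.evaluateOnNewDocument callback so that they exist in the main context.

```javascript
// Hypothetical allow-list for the main-context listener: each key is a command
// the isolated side may request, each value does the work without eval().
const mainCommands = {
  'document.getElementById': (id) => {
    const el = document.getElementById(id)
    // DOM nodes cannot be cloned by postMessage, so return plain data instead
    return el ? { id: el.id, text: el.textContent } : null
  },
  grecaptchaReady: () => typeof window.grecaptcha?.execute === 'function',
}

// Parses the JSON command and runs it only if it is on the allow-list.
function runMainCommand(scriptText, commands = mainCommands) {
  const { function: name, args = [] } = JSON.parse(scriptText)
  if (!Object.prototype.hasOwnProperty.call(commands, name)) {
    throw new Error(`command not allowed: ${name}`)
  }
  return commands[name](...args)
}
```

Inside the listener's try block, response.result = runMainCommand(event.data.scriptText) would then take the place of the eval() call, and an unknown command ends up in response.error like any other exception.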

Can it be detected by anti-automation solutions?

Yes, but no.

Yes, because they can just add window.addEventListener('message', ...) to their script and they will receive all the messages from your isolated context. So, they can check the message for the scriptId property and flag you as a suspicious guy who reads Rebrowser blogs.

But no, because the messaging mechanism is used on many major websites for legitimate reasons: to communicate with embedded iframes, for example. A huge number of extensions use it for communication, too. So the mere presence of window messages on the page is not enough to conclude that you're using any kind of automation.

So, you can adjust the code: use userId or anything else instead of scriptId, and rename scriptText to just text. It's practically impossible for an anti-automation script to know about all the cases on all the websites. The chances that it will ever be detected are quite low, even if you just copy-paste the code from this post. Unless it becomes so popular that this approach turns into the default in every automation script 🤔
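One way to take the renaming idea a step further is to generate the field names per session instead of hard-coding them. The sketch below is an illustration under assumptions, not a tested evasion technique: makeChannelKeys, buildRequest and isRequest are hypothetical helpers, and random names only remove the fixed strings from the messages; they don't hide the fact that messages are being sent.

```javascript
// Hypothetical helper: builds per-session field names so the messages do not
// carry the fixed scriptId / scriptText / fromMain keys used in this post.
function makeChannelKeys(random = Math.random) {
  const names = new Set()
  while (names.size < 3) {
    // base-36 fragment of a random number, prefixed so it is a valid key
    names.add(`k${random().toString(36).slice(2, 10)}`)
  }
  const [id, text, reply] = names
  return { id, text, reply }
}

// Builds the request the isolated context would post to the main context.
function buildRequest(keys, id, text) {
  return { [keys.id]: id, [keys.text]: text }
}

// Main-context check that replaces the scriptId / fromMain test.
function isRequest(keys, data) {
  return Boolean(data && typeof data === 'object' && data[keys.id] && !data[keys.reply])
}
```

The keys would be generated once in Node and passed to both contexts as an argument, for example page.evaluateOnNewDocument(fn, keys) in Puppeteer, so that the listener and the helper agree on the names.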

What's next?

Now you've got your code running in an isolated context while still having access to the main world objects. Congrats!

To test your code for automation detections and to try this approach, you can use rebrowser-bot-detector. Safe automation!
