How to Access Main Context Objects from Isolated Context in Puppeteer & Playwright

published 4 months ago

by Nick Webson

Our rebrowser-patches library that we released some time ago got really good feedback from the web automation community. It significantly increased success rates and removed captcha challenges from many major anti-automation players on the market.

However, our approach has one flaw: when you use always isolated mode (REBROWSER_PATCHES_RUNTIME_FIX_MODE=alwaysIsolated), it means that all your code is going to be executed in a separate isolated JS context which doesn't have access to the main context of the page.

For the detailed reasons behind this, you can read our previous post How to fix Runtime.Enable CDP detection of Puppeteer, Playwright and other automation libraries?.

For many people, it's a dealbreaker as they have to deal with some JS objects that are defined in the main context.

Here is a real use case from one of our customers. They've been running this code to catch the moment when the recaptcha script was loaded:

await page.waitForFunction(`typeof window.grecaptcha.execute === 'function'`)

After the patch is applied, this code will be executed in a new isolated context that will never have the object window.grecaptcha.

To get around this limitation, we can borrow an idea from the Chrome docs for extension developers, more specifically "Communication with the embedding page".

This approach leverages window messages to communicate between contexts. page.evaluateOnNewDocument is not patched and still uses the main context to execute the script, so we can inject some good stuff in there.

Let's try to create a proof of concept.

// add event listener for window messages (executed in the main context)
await page.evaluateOnNewDocument(() => {
  window.addEventListener('message', (event) => {
    console.log('[main] msg', event)
    if (!event.data?.scriptId || event.data.fromMain) {
      // ignore messages without scriptId and from ourselves (from main context)
      return
    }

    const response = {
      scriptId: event.data.scriptId,
      fromMain: true,
    }
    try {
      response.result = eval(event.data.scriptText)
    } catch (err) {
      response.error = err.message
    }

    window.postMessage(JSON.parse(JSON.stringify(response)))
  })
})

await page.goto('https://bot-detector.rebrowser.net', { waitUntil: 'load' })

// add a helper that we can reuse (executed in an isolated context)
await page.evaluate(() => {
  // listen for messages from main and emit custom event with a response for a specific scriptId
  window.addEventListener('message', (event) => {
    if (!(event.data?.scriptId && event.data.fromMain)) {
      // ignore irrelevant messages
      return
    }
    console.log('[isolated] msg', event)
    window.dispatchEvent(new CustomEvent(`scriptId-${event.data.scriptId}`, { detail: event.data }))
  })

  // a helper that can be reused in other page.evaluate calls
  window.evaluateMain = (scriptFn) => {
    // generate unique scriptId for each call
    window.evaluateMainScriptId = (window.evaluateMainScriptId || 0) + 1
    const scriptId = window.evaluateMainScriptId
    return new Promise(resolve => {
      // listen for the response
      window.addEventListener(`scriptId-${scriptId}`, (event) => {
        resolve(event.detail)
      }, {
        once: true,
      })

      // prepare and send a message for the main context
      let scriptText = scriptFn
      if (typeof scriptText !== 'string') {
        scriptText = `(${scriptText.toString()})()`
      }
      window.postMessage({
        scriptId,
        scriptText,
      })
    })
  }
})

// use our helper to evaluate code in the main context
await page.evaluate(() => window.evaluateMain(() => document.getElementsByClassName('div')))

Boom. This code successfully passes the main world execution test in rebrowser-bot-detector.
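Coming back to the recaptcha use case from the beginning of the post, here is a minimal sketch of how the helper could be used for it. Both function names are hypothetical and not part of rebrowser-patches. It assumes that window.evaluateMain has already been installed in the isolated context as shown above, that `page` is a Puppeteer page (Playwright takes the waitForFunction arguments in a different order), and that the page's CSP allows eval().

```javascript
// Hypothetical helper: turn the { result, error } envelope sent back by the
// main-context listener into a plain value, or throw if the script failed.
function unwrapMainResponse(response) {
  if (!response || typeof response !== 'object') {
    throw new Error('no response from the main context')
  }
  if (typeof response.error === 'string') {
    throw new Error(`main context error: ${response.error}`)
  }
  return response.result
}

// Hypothetical helper: poll from the isolated context until the main context
// reports that grecaptcha.execute exists.
function waitForGrecaptcha(page, options = {}) {
  return page.waitForFunction(async () => {
    // the string is evaluated in the main context by the message listener;
    // the !! guard avoids a TypeError while window.grecaptcha is still undefined
    const response = await window.evaluateMain(
      "!!window.grecaptcha && typeof window.grecaptcha.execute === 'function'"
    )
    return response.result === true
  }, options)
}

// Usage (assumes an existing `page`):
// await waitForGrecaptcha(page, { timeout: 30000 })
// const title = unwrapMainResponse(
//   await page.evaluate(() => window.evaluateMain(() => document.title))
// )
```

Keep in mind that the response goes through JSON.parse(JSON.stringify(...)), so only JSON-serializable values survive the trip. Returning strings, numbers, or plain objects is usually more useful than returning DOM nodes.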

Issues with unsafe-eval

Using this approach, you might get this error:

Refused to evaluate a string as JavaScript because 'unsafe-eval' is not an allowed source of script in the following Content Security Policy directive: ...

It means that your page has a CSP that prohibits the eval() call we used in the code above.

You could use page.setBypassCSP(true) to fix this issue, but it's not recommended as it could be detected by a remote website quite easily. You can read more in rebrowser-bot-detector.

Another way to fix it is to not use eval() at all. So, instead of:

response.result = eval(event.data.scriptText)

You can use more explicit code:

await page.evaluate(() => window.evaluateMain(JSON.stringify({
  function: 'document.getElementById',
  args: ['detections-json'],
})))
// ...
const scriptData = JSON.parse(event.data.scriptText)
if (scriptData.function === 'document.getElementById') {
  response.result = document.getElementById(...scriptData.args)
}

This code won't break any CSP and will return the same result. Yes, it's more explicit and less flexible as you need to edit it every time you need to introduce a new function, but it gets the job done.
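If you end up with more than a couple of such functions, a lookup table may be easier to maintain than a chain of if statements. The following is only a sketch: allowedCalls, runAllowedCall and the textContentById entry are hypothetical names made up for illustration, and in practice this code would have to live inside the page.evaluateOnNewDocument callback so that it exists in the main context.

```javascript
// Hypothetical allowlist of calls the main context is willing to run.
// Every entry is an assumption for illustration; add only what you need.
const allowedCalls = {
  'document.getElementById': (doc, args) => doc.getElementById(...args),
  'document.title': (doc) => doc.title,
  // returns a string, which survives the JSON round trip of the response
  textContentById: (doc, args) => {
    const element = doc.getElementById(args[0])
    return element ? element.textContent : null
  },
}

// Parse the JSON request and dispatch it without using eval()
function runAllowedCall(scriptText, doc) {
  const scriptData = JSON.parse(scriptText)
  // own-property check so inherited names like "constructor" are rejected
  if (!Object.prototype.hasOwnProperty.call(allowedCalls, scriptData.function)) {
    throw new Error(`function is not allowed: ${scriptData.function}`)
  }
  return allowedCalls[scriptData.function](doc, scriptData.args || [])
}

// Inside the main-context message listener this would replace eval():
// response.result = runAllowedCall(event.data.scriptText, document)
```

Passing the document as a parameter is a design choice made here so the dispatcher can be exercised outside a browser with a stub object.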

Can it be detected by anti-automation solutions?

Yes, but no.

Yes, because they can just add window.addEventListener('message', ...) to their script and they will receive all the messages from your isolated context. So, they can check the message for the scriptId property and flag you as a suspicious guy who reads Rebrowser blogs.

But no, because the messages mechanism is used on many major websites for legitimate reasons - to communicate with iframes and popups, for example. Also, a huge number of extensions use it for communication, too. So, the mere presence of window messages on the page is just not enough to conclude that you're using any kind of automation.

So, you can adjust the code: use userId or anything else instead of scriptId, and rename scriptText to just text. It's practically impossible for an anti-automation script to know about all the cases on all the websites. Chances are quite low that it will ever be detected, even if you just copy-paste the code from this post. Unless it becomes so popular that this approach turns into the default in any automation script 🤔
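To make the renaming idea concrete, here is a small sketch that keeps the property names in one place. createMessageProtocol, naiveDetectorFlags and the key names userId, text and ack are all hypothetical. The detector function is only a guess at what a simple check could look like, not something taken from a real anti-automation product.

```javascript
// Hypothetical factory: builds and recognizes messages using configurable keys
function createMessageProtocol({ idKey, textKey, replyKey }) {
  return {
    buildRequest: (id, text) => ({ [idKey]: id, [textKey]: text }),
    buildReply: (id, payload) => ({ ...payload, [idKey]: id, [replyKey]: true }),
    isRequest: (data) => Boolean(data && data[idKey] && !data[replyKey]),
    isReply: (data) => Boolean(data && data[idKey] && data[replyKey]),
  }
}

// A guess at a naive detector: flag messages that use the names from this post
function naiveDetectorFlags(data) {
  return Boolean(
    data && typeof data === 'object' && 'scriptId' in data && 'scriptText' in data
  )
}

const original = createMessageProtocol({
  idKey: 'scriptId',
  textKey: 'scriptText',
  replyKey: 'fromMain',
})
const renamed = createMessageProtocol({ idKey: 'userId', textKey: 'text', replyKey: 'ack' })

console.log(naiveDetectorFlags(original.buildRequest(1, 'document.title'))) // true
console.log(naiveDetectorFlags(renamed.buildRequest(1, 'document.title'))) // false
```

Functions passed to page.evaluate and page.evaluateOnNewDocument are serialized before they run in the page, so they can't see variables from your Node script. The chosen key names would have to be inlined in both listeners or passed in as arguments.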

What's next?

Now you've got your code running in an isolated context while still having access to the main world objects. Congrats!

To test your code for automation detections and to try this approach, you can use rebrowser-bot-detector. Safe automation!

Author

Nick Webson

Lead Software Engineer

Nick is a senior software engineer focusing on browser fingerprinting and modern web technologies. With deep expertise in JavaScript and robust API design, he explores cutting-edge solutions for web automation challenges. His articles combine practical insights with technical depth, drawing from hands-on experience in building scalable, undetectable browser solutions.
