Node.js: scraping a website after JavaScript has loaded the values


Probably a newbie question on nodejs/jsdom

I am trying to scrape a website using Node.js. I am using jsdom and jQuery to fetch the HTML and parse out what I need. However, the values I get are not the ones shown on the website: they are changed dynamically by JavaScript, and those are the values I want. The whole reason I chose Node.js/jsdom for scraping was that the page's JavaScript would be executed and I would get the values after it has run.

Is there some way to tell jsdom to wait until the JavaScript has executed? Or have I got this all wrong? I have googled a lot on this matter.

Problem courtesy of: zubinmehta


You would be better off using something like CasperJS. It is a testing utility based on PhantomJS. It is basically exactly like opening the page in a WebKit browser, just without the GUI. I don't think it works with Node, but it should be easy enough to run a Casper script and pipe the output back to Node. You could write something like:

var casper = require('casper').create({
    loadImages: true,
    loadPlugins: true,
    verbose: true,
    //logLevel: 'info',
    clientScripts: [
        // local scripts to inject into every page (list elided in the source)
    ],
    viewportSize: {
        width: 1366,
        height: 768
    },
    pageSettings: {
        javascriptEnabled: true,
        userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5'
    }
});

// Placeholder URL; replace it with the page you want to scrape.
casper.start('http://example.com/');

casper.thenEvaluate(function () {
    //javascript code to run in the scope of the page
});

casper.run();

Solution courtesy of: tapan
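
The answer above suggests running the Casper script separately and piping its output back to Node. A minimal sketch of that idea, using Node's built-in child_process module, might look like the following. The helper names runScraper and parseScraperOutput, the script name scrape.js, and the assumption that the script prints a single JSON document to stdout are all illustrative and not part of the original answer.

```javascript
// Sketch: run an external scraper script and collect what it prints.
const { execFile } = require('child_process');

// Runs `command` with `args` and resolves with its trimmed stdout.
// Rejects if the process fails or exceeds the 60 second timeout.
function runScraper(command, args) {
  return new Promise((resolve, reject) => {
    execFile(command, args, { timeout: 60000 }, (error, stdout) => {
      if (error) {
        reject(error);
        return;
      }
      resolve(stdout.trim());
    });
  });
}

// Parses the scraper's stdout, assuming it printed a single JSON document.
function parseScraperOutput(stdout) {
  return JSON.parse(stdout);
}

// Hypothetical usage: 'scrape.js' is a CasperJS script that prints JSON.
// runScraper('casperjs', ['scrape.js'])
//   .then(parseScraperOutput)
//   .then(function (data) { console.log(data); });
```

Printing JSON from the Casper script keeps the boundary between the two processes simple: the Node side only has to parse one string, and anything that fails inside the headless browser surfaces as a rejected promise.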


I don't know if you're up for alternatives, but when I need such sensitive scraping, I just use Firefox with iMacros. It runs all browser JS just fine, because it is a browser.

Discussion courtesy of: x10

First off, how are you using jsdom? Apparently, jsdom.env does not execute scripts in the DOM, only the scripts that you add in the call to jsdom.env. If you want to execute scripts, I think you should use jsdom.jsdom.

Second, you need to specify an onload handler. This should execute after the document is ready, and hopefully any scripts will have changed the DOM to your liking.

Something like this:

// `html` is the page markup as a string, fetched beforehand.
var jsdom = require('jsdom').jsdom
  , document = jsdom(html)
  , window = document.createWindow();

document.onload = function() {
  // Do your stuff
};

Discussion courtesy of: Linus Gustav Larsson Thiel
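
Even with an onload handler, page scripts may update the DOM some time after the load event, for example after an Ajax call. One common workaround is to poll until the expected value shows up. The sketch below is plain Node and not part of jsdom; waitFor is a hypothetical helper, and the '#price' selector in the usage comment is only a placeholder.

```javascript
// Polls `predicate` every `intervalMs` until it returns a truthy value,
// then resolves with that value. Rejects once `timeoutMs` has elapsed.
function waitFor(predicate, timeoutMs = 5000, intervalMs = 50) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    (function check() {
      let value;
      try {
        value = predicate();
      } catch (err) {
        reject(err);
        return;
      }
      if (value) {
        resolve(value);
      } else if (Date.now() >= deadline) {
        reject(new Error('waitFor: timed out after ' + timeoutMs + ' ms'));
      } else {
        setTimeout(check, intervalMs);
      }
    })();
  });
}

// Hypothetical usage with a jsdom window whose scripts fill in '#price':
// waitFor(function () {
//   var el = window.document.querySelector('#price');
//   return el && el.textContent;
// }).then(function (text) { console.log(text); });
```

Polling is crude, but it does not depend on knowing which event the page's own scripts fire when they finish, which is usually the unknown when scraping someone else's site.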

This recipe can be found in its original form on Stack Overflow.