Unexpected results in Node.js implementing flow control with Series function (pattern?)

Problem

I'm attempting to implement a series function for flow control purposes but getting unexpected results. I need to check for the existence of a file, as the result a shell cmd initiated by Node, over the course of say 5000 ms. Using Mixu's Node book I am utilizing a series function for sequential execution (section 7.2.1). Per the Node book, this pattern would be appropriate if you needed one result to feed into another, but that does not seem to be the case as the final() is being executed before any of the callbacks from the argument functions return.

Below should be all the code needed to reproduce what I am seeing.

Expected output: (assume the file being checked for was created in < 1000ms)

1347312844082
true
1347312845082
true
1347312846083
true
1347312847082
true
1347312848082
true
true <-- last true result of console.log() in last param function of series function.

Actual output:

1347312844082
1347312845082
1347312846083
1347312847082
1347312848082
false <-- last param function of series function being evaluated before series args callbacks.
true
true
true
true
true

I'm guessing the callbacks from my sleep functions (which contain the fs.check calls)

  • are being pushed back as a result of the args param of the series function all being invoked by slice.call(arguments)
  • then shifting to the last function,
  • and finally handling the callbacks.

But this seems contradictory as to what is the described behavior in the Node book.

Note: I'm trying to avoid regressing back into sync calls (fs.existsSync) in an attempt to break my brain and maybe finally "get it" when it comes to Node/async programming.

    var outputFile = 'C:\\Some\\Valid\\Path\\foo.txt';
    var successCheck = false;
    series([
            sleep(1000, function () {
                printTime();
                fs.exists(outputFile, function (exists) {
                    console.log(exists);
                    successCheck = exists;
                });
            }),
            /* middle three deleted for brevity */
            sleep(1000, function () {
                printTime();
                fs.exists(outputFile, function (exists) {
                    console.log(exists);
                    successCheck = exists;
                });
            }),

            sleep(1000, function () {
                printTime();
                fs.exists(outputFile, function (exists) {
                    console.log(exists);
                    successCheck = exists;
                });
            })
            ], function() {
                console.log(successCheck);
        });


    function sleep (msInterval, callback) {
        var now = new Date().getTime();
        while(new Date().getTime() < now + msInterval) {
      // do nothing
        }
        callback();
    }

    function series (callbacks, last)  {
        var results = [];
        function next() {
            var callback = callbacks.shift();
            if(callback) {
                callback(function() {
                    results.push(Array.prototype.slice.call(arguments));
                    next();
                });
            } else {
                last(results);
            }
        }

        next();
    }

    function printTime () { var now = new Date().getTime(); console.log(now); }
Problem courtesy of: jbigelow

Solution

I see several problems with your code. First and foremost, your approach to "sleep" is completely broken.

JavaScript (and thus Node) is a single-threaded, event-driven language. If you do anything that blocks, you block your entire program.

Your sleep function blocks the entire program by spinning until msInterval milliseconds pass.

Here's exactly what is happening:

  • Start by setting some variables.
  • There is a call to series. In order to pass it arguments, we must evaluate those arguments.
  • sleep is invoked on the next line.
  • sleep spins for 1000 ms.
  • sleep invokes its callback.
  • The callback prints stuff to the console, issues an asynchronous I/O request, and returns.
  • sleep returns undefined.
  • The next call to sleep is invoked; the last 5 steps are repeated.
  • ...after 5 seconds, all calls to sleep have completed. You now have an Array that looks like this:

    [undefined, undefined, undefined, undefined, undefined]
    

    since sleep doesn't return anything.

  • Now series can be invoked with the arguments an array of undefineds and a Function.
  • results is instantiated.
  • next is invoked.
  • The first undefined from the array is shifted off.
  • Since undefined is falsy, jump to the else branch.
  • Call last with an empty array. (We never put anything in results)
  • The last callback prints successCheck, which is false, and returns.
  • At this point, synchronous code has completed.
  • JavaScript starts the next iteration of the event loop. The I/O requests you issued earlier are probably complete by this point, so the fs.exists callbacks are invoked in the order they completed (which is probably, but not necessarily the order they were issued).
  • The callbacks print to the console and set successCheck (which is never actually read again).
  • The program ends.

 

How to fix it

Never, ever, ever spin in a while loop. You could rewrite sleep like this:

function sleep(ms, cb) {
    return function(done) {
        setTimeout(function() {
            cb(done);
        }, ms);
    }
}

Here, we return a function that will be invoked later. This way, the array that gets passed to series is filled with the inner sleep function. When you call this sleep, nothing has actually happened yet.

When series calls the first array element, it is actually invoking the returned anonymous function, where we've closed over the ms and cb that were originally passed to the sleep call. The function sets a timeout for ms milliseconds. What that expires, the innermost anonymous function is invoked, which calls cb. It passes the done function as its argument, which is actually the anonymous function from series' next that pushes onto results and calls next.

Because the next step of your series isn't started until the previous step indicates it's done, you have to call the done callback from the step's function:

sleep(1000, function (done) {
    printTime();
    fs.exists(outputFile, function (exists) {
        console.log(exists);
        successCheck = exists;
        done();
    });
});

Change your approach

Of course, the way you really want to solve the problem ("How do I do something after this file has been created?") is not by polling fs.exists.

Instead, use filesystem change notifications! fs.watch allows you to listen for changes to files or folders.

var watcher = fs.watch('C:\\Some\\Valid\\Path\\', function(e, filename) {
    if (filename == 'foo.txt') {
        // success!
        watcher.close(); // stop watching for changes
    }
});
Solution courtesy of: josh3736

Discussion

There is currently no discussion for this recipe.

This recipe can be found in it's original form on Stack Over Flow.