What does it mean to say Apache spawns a thread per request, but node.js does not?

Problem

I have read about node.js and other servers such as Apache, where the threading is different. I simply do not understand what the threading means.

If I have a webpage that runs SQL to hit a database, say three different databases in the one server side page, what does that mean for threading in node.js? Apache? What does "thread" mean here?

Or as an article I saw, "start a new thread to handle each request."

What does it mean to say Apache spawns a thread per request, but node.js does not?

EDIT: I am hoping for an example that I can grasp. I'm used to having a server side page that hits a database(s). Several connections inside that file.

Problem courtesy of: johnny

Solution

A thread is a context of program execution. Programs that are single-threaded can only do one thing at once, where multi-threaded programs can do many things at once.

Think of it like a kitchen at a restaurant. A single chef can really only do one task at a time, be that chopping opinions or putting something in an oven. If an order comes in that requires lots of work from the chef (such as making salads vs. putting stuff in the oven and waiting) some meals may get delayed because that chef is busy. On the other hand, if that chef just has to bake a bunch of stuff, there isn't much work for him to do and he can make other meals while waiting for the food in the oven to be done.

With multiple chefs, many of these tasks can be done simultaneously. Many meals can be prepared simultaneously.

Apache's threading model is like hiring a fixed number of chefs (regardless of how many customers your restauarant has that night) and each chef can only work on one meal at a time. That means that if a meal order comes in, a dedicated chef is assigned to that meal. There will be times when that chef is busy chopping up ingredients and mixing cake batter, but there will also be times when he's just standing around waiting for the potatoes to boil. At any given time, you could have most of your chefs sitting idle, waiting on potatoes to boil and cake to bake and no more orders will be worked on, since each chef is dedicated to one order at a time.

To make matters worse, your kitchen is only as big as you can afford to make it. Each chef takes up space and resources, and you may have a situation where a bunch of chefs standing around holding the only spoons available are preventing other chefs from getting their food made.

Nginx is another web server (often used as a proxy) that you didn't ask about, but I'm including it to explain another threading model. It also hires a fixed number of chefs, but it hires fewer of them. Each chef can work on multiple meals at a time. So, if they're waiting on potatoes to boil while an order comes in for a chopped salad, they can go work on that salad instead of standing around idle. You can have a smaller kitchen (relative to the size of restaurant/number of customers) and get the same number of meals out, or more. It's a tight crew that is effective at not wasting time and resources.

Node.js is a bit different. It is single-threaded from a JavaScript perspective, but other tasks like disk and network IO are handled on separate threads automatically. It's like having a kitchen with only one chef, but that makes sense in some cases. If your kitchen has a lot of busy work for that chef, perhaps it makes sense to hire more chefs to do work. (To do this in Node.js, you can only spawn more processes, which is effectively like building a bunch of small kitchens right next to each other. You can have one guy standing out front coordinating the orders for all those kitchens.) However, if you're just a bakery (mainly just IO, with little busy-work for the chef), maybe you only need one chef.

To sum all this up, different threading models are used to divide work and process it effectively. Which threading model makes sense depends on your needs, and the other characteristics of the server you are choosing.

Solution courtesy of: Brad

Discussion

Node.js is single threaded in that it can only do one thing at once. You can run multiple instances of the node process on pretty much all cloud service providers, though. The apache process can multi-task on threads.

If the node process hangs for some reason, nothing else can happen. That's why its important to write node in an asyncronous way so that if a database query hangs, node can still take requests.

Without getting too technical, a thread can be thought of as a lane in the highway of the program. Its a specific channel of execution. In the lifetime of a request, a lot of things have to happen. All of those things are in one box.

Node doesn't have threads! You can think of it like a one lane road. But the way node is deployed you get many instances of that one lane road. They don't share anything though. If you a value gets added to an array in one, its not in the other. Anything that needs to be shared has to be shared in a cache or database layer.

Discussion courtesy of: Zeke Alexandre Nierenberg

This recipe can be found in it's original form on Stack Over Flow.