Confusion with node.js vs. Apache HTTPD

Problem

I am totally confused about the difference between node and threads. In the httpd documentation they say that a main process is created, which creates child processes, each of which in turn maintains a fixed number of threads.

A thread pool works like Apache’s pre-forking. The threads are created en bloc at program start, and then the workload is distributed evenly. When there are more connections than threads, the new connections have to wait. But on the plus side, you save the cost of thread creation.

Now my doubt here is: since threads can run with shared memory, why can't any number of threads be created?

When it comes to node, it is completely event-driven. Basically the server consists of one thread processing one event after another. A new request coming in is one kind of event. The server starts processing it and, when there is a blocking IO operation, it does not wait until it completes but instead registers a callback function. The server then immediately starts to process another event (maybe another request). When the IO operation is finished, that is another kind of event, and the server will process it (i.e. continue working on the request) by executing the callback as soon as it has time.
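As I understand it, that means something like this minimal sketch (the file name and port are just placeholders):

var http = require('http');
var fs = require('fs');

http.createServer(function (req, res) {
  // Register a callback for the disk read instead of waiting for it.
  fs.readFile('/tmp/example.txt', function (err, data) {
    // This runs later, as a separate event, once the IO has finished.
    res.end(err ? 'read failed' : data);
  });
  // No waiting here: the single thread goes straight back to the
  // event loop and can start processing the next incoming request.
}).listen(8080);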

Now, if node does not create a new thread, how can it accept a new request?

Problem courtesy of: Maclean Pinto

Solution

From the blog Understanding the node.js event loop

Node.js keeps a single thread for your code…

It really is a single thread running: you can’t do any parallel code execution; doing a “sleep” for example will block the server for one second:

Source code

var now = new Date().getTime();
while (new Date().getTime() < now + 1000) {
  // do nothing; the single thread is busy-waiting here
}

So while that code is running, node.js will not respond to any other requests from clients, since it only has one thread for executing your code. Likewise, if you had some CPU-intensive code, say for resizing images, it would still block all other requests.
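For contrast, a minimal sketch of the same one-second delay done the non-blocking way (the logged message is arbitrary): setTimeout only registers a callback and returns immediately, so the single thread stays free for other events:

// Non-blocking: register a callback instead of spinning in a loop.
// Other requests keep being served while the timer is pending.
setTimeout(function () {
  console.log('one second passed without blocking the server');
}, 1000);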

Check the blog post for more details; it explains the event loop in depth.

EDIT

Also check this nice reference: Multi-Process Node.js: Motivations, Challenges and Solutions

Solution courtesy of: Damodaran

Discussion

The problem with JavaScript is that it is notoriously single-threaded, so threading simply isn't available (there are some solutions, but they tend to be very complex in the end). Node.js uses a non-blocking IO approach to work around this restriction; in the end it is a smart use of a single thread, avoiding anything that blocks and having long-running tasks pause frequently so that other work gets a turn as well (a rough sketch of this follows below). It is similar to the old single-core multitasking emulation in Windows 95-ME.
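One way to sketch that "pause and let others do a bit of work" idea, assuming a purely CPU-bound job and an arbitrary slice size of 1000 items: the work is done in slices, and setImmediate hands control back to the event loop between slices so pending events can be handled:

// Process a large array in slices so the event loop can run in between.
function processInChunks(items, onDone) {
  var i = 0;
  function slice() {
    var end = Math.min(i + 1000, items.length);  // 1000 items per slice (arbitrary)
    for (; i < end; i++) {
      // ... CPU-heavy work on items[i] ...
    }
    if (i < items.length) {
      setImmediate(slice);  // yield to the event loop, then continue
    } else {
      onDone();
    }
  }
  slice();
}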

Now the big problem is: should that single processing thread ever be put on hold, the entire server is put on hold. This is not an issue as long as you work at a small scale, but it is an absolute no-go for a big server environment. In addition, modern servers have 8 or more cores, and keeping 7 of them idle is a big waste of hardware.
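Node's usual answer to the idle-cores problem (and the topic of the Multi-Process Node.js reference above) is to run one process per core. A minimal sketch using the built-in cluster module, with port 8000 picked arbitrarily:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork one worker per core; each worker runs its own event loop.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // The workers share the listening socket; incoming connections
  // are distributed among them.
  http.createServer(function (req, res) {
    res.end('handled by worker ' + process.pid + '\n');
  }).listen(8000);
}

Each worker still has only one thread, so blocking code in one worker only stalls the connections handled by that worker.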

HTTPD in the worker configuration spawns one fork of itself (the original process acts as a backup, just in case) and then spawns threads based on the configured value. In theory it can run thousands of threads simultaneously, but if your CPU only has 8 cores, no more than 8 of those threads can execute at any single moment. With 1000 threads, the other 992 need to wait, and depending on the task this can even cause clients to run into a time-out because they had to wait more than 30 seconds.

How many threads are a good value depends primarily on what your threads do. If they never block but spend 100% of their time actually solving the problem, then about cores * 2 is a good value for httpd, because of the core limitation described above. If your threads block a lot (disk access, TCP communication), then a higher or much higher value might result in better throughput. In the end the only way to find out is to run test data against the server and measure the throughput in various configurations.

Modern web servers, however (httpd is ancient by comparison), combine the two approaches: a single non-blocking receiver thread reads the data and then hands it off to a pool of worker threads. This has the benefit that IO operations are lightning fast (even faster than node.js if the code is native), while all cores can still be used simultaneously. Java, for example, can do this via NIO (new IO) and then distribute Runnables to Executors.

Discussion courtesy of: TwoThe

This recipe can be found in its original form on Stack Overflow.