Should I check in node_modules to git when creating a node.js app on Heroku?

Problem

I followed the basic getting started instructions for node.js on Heroku here:

https://devcenter.heroku.com/categories/nodejs

These instructions don't tell you to add node_modules to .gitignore, and therefore imply that node_modules should be checked in to git. When I included node_modules in git, my getting-started application ran correctly.

When I followed the more advanced example at:

https://devcenter.heroku.com/articles/realtime-polyglot-app-node-ruby-mongodb-socketio
https://github.com/mongolab/tractorpush-server (source)

It instructed me to add node_modules to .gitignore. So I removed node_modules from git, added it to .gitignore, then re-deployed. This time the deploy failed like so:

-----> Heroku receiving push
-----> Node.js app detected
-----> Resolving engine versions
       Using Node.js version: 0.8.2
       Using npm version: 1.0.106
-----> Fetching Node.js binaries
-----> Vendoring node into slug
-----> Installing dependencies with npm
       Error: npm doesn't work with node v0.8.2
       Required: node@0.4 || 0.5 || 0.6
           at /tmp/node-npm-5iGk/bin/npm-cli.js:57:23
           at Object.<anonymous> (/tmp/node-npm-5iGk/bin/npm-cli.js:77:3)
           at Module._compile (module.js:449:26)
           at Object.Module._extensions..js (module.js:467:10)
           at Module.load (module.js:356:32)
           at Function.Module._load (module.js:312:12)
           at Module.require (module.js:362:17)
           at require (module.js:378:17)
           at Object.<anonymous> (/tmp/node-npm-5iGk/cli.js:2:1)
           at Module._compile (module.js:449:26)
       Error: npm doesn't work with node v0.8.2
       Required: node@0.4 || 0.5 || 0.6
           at /tmp/node-npm-5iGk/bin/npm-cli.js:57:23
           at Object.<anonymous> (/tmp/node-npm-5iGk/bin/npm-cli.js:77:3)
           at Module._compile (module.js:449:26)
           at Object.Module._extensions..js (module.js:467:10)
           at Module.load (module.js:356:32)
           at Function.Module._load (module.js:312:12)
           at Module.require (module.js:362:17)
           at require (module.js:378:17)
           at Object.<anonymous> (/tmp/node-npm-5iGk/cli.js:2:1)
           at Module._compile (module.js:449:26)
       Dependencies installed
-----> Discovering process types
       Procfile declares types -> mongod, redis, web
-----> Compiled slug size is 5.0MB
-----> Launching... done, v9

Running "heroku ps" confirms the crash. OK, no problem, so I rolled back the change, added node_modules back to the git repository and removed it from .gitignore. However, even after reverting, I still get the same error message on deploy, but now the application is running correctly again. Running "heroku ps" tells me the application is running.

So my question is: what's the right way to do this? Include node_modules or not? And why would I still be getting the error message after I roll back? My guess is that the git repository is in a bad state on the Heroku side?

Problem courtesy of: Jason Griffin

Solution

Second Update

The FAQ is not available anymore.

From the documentation of shrinkwrap:

If you wish to lock down the specific bytes included in a package, for example to have 100% confidence in being able to reproduce a deployment or build, then you ought to check your dependencies into source control, or pursue some other mechanism that can verify contents rather than versions.

Shannon and Steven mentioned this before, but I think it should be part of the accepted answer.
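As a rough illustration of what "verify contents rather than versions" could mean in practice, here is a minimal sketch that records a digest of a dependency's bytes and later checks that the bytes are unchanged. The helper names (digestOf, contentsMatch) and the choice of SHA-256 are assumptions made for this example; they are not taken from the npm documentation.

```javascript
const crypto = require('crypto');

// digestOf and contentsMatch are hypothetical helper names for this sketch.
// Returns a hex SHA-256 digest of the given bytes.
function digestOf(bytes) {
  return crypto.createHash('sha256').update(bytes).digest('hex');
}

// True only when the bytes hash to the digest recorded earlier.
function contentsMatch(bytes, recordedDigest) {
  return digestOf(bytes) === recordedDigest;
}

// Stand-ins for a dependency file at build time and the "same version" later.
const original = Buffer.from('module.exports = 1;\n');
const recorded = digestOf(original);
const republished = Buffer.from('module.exports = 2;\n');

console.log(contentsMatch(original, recorded));    // true
console.log(contentsMatch(republished, recorded)); // false
```

A version number would be identical in both cases above; only a check on the bytes themselves notices the difference, which is the point the shrinkwrap documentation is making.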


Update

The source listed for the below recommendation has been updated. They are no longer recommending the node_modules folder be committed.

Usually, no. Allow npm to resolve dependencies for your packages.

For packages you deploy, such as websites and apps, you should use npm shrinkwrap to lock down your full dependency tree:

https://docs.npmjs.com/cli/shrinkwrap


Original Post

For reference, npm FAQ answers your question clearly:

Check node_modules into git for things you deploy, such as websites and apps. Do not check node_modules into git for libraries and modules intended to be reused. Use npm to manage dependencies in your dev environment, but not in your deployment scripts.

and for some good rationale for this, read Mikeal Rogers' post on this.


Source: https://docs.npmjs.com/misc/faq#should-i-check-my-node-modules-folder-into-git

Solution courtesy of: Kostia

Discussion

Instead of checking in node_modules, make a package.json file for your app.

The package.json file specifies the dependencies of your application. Heroku can then tell npm to install all of those dependencies. The tutorial you linked to contains a section on package.json files.

Discussion courtesy of: matzahboy

My biggest concern with not checking node_modules into git is that 10 years down the road, when your production application is still in use, npm may not be around. Or npm might become corrupted; or the maintainers might decide to remove the library that you rely on from their repository; or the version you use might be trimmed out.

This can be mitigated with repo managers like maven, because you can always use your own local Nexus or Artifactory to maintain a mirror with the packages that you use. As far as I understand, such a system doesn't exist for npm. The same goes for client-side library managers like Bower and Jamjs.

If you've committed the files to your own git repo, then you can update them when you like, and you have the comfort of repeatable builds and the knowledge that your app won't break because of some third-party action.

Discussion courtesy of: Jonathan

http://nodejs.org/api/modules.html

[...] node starts at the parent directory of the current module, and adds /node_modules, and attempts to load the module from that location.

If it is not found there, then it moves to the parent directory, and so on, until the root of the tree is reached.

If you're rolling your own modules specific to your app, you can keep those (and only those) in your app's /node_modules. And move out all the other dependencies to the parent directory.

This use case is pretty awesome: it lets you keep modules you created specifically for your app alongside your app, and doesn't clutter your app with dependencies which can be installed later.
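The directory walk quoted from the Node.js documentation can be sketched as follows. This is only an illustration of the lookup order, not Node's actual loader; candidateDirs is a hypothetical helper name and the example path is made up.

```javascript
const path = require('path');

// candidateDirs is a hypothetical helper, not part of Node's API.
// It lists the node_modules directories that would be tried, in order,
// for a module located in fromDir. POSIX paths are used for clarity.
function candidateDirs(fromDir) {
  const dirs = [];
  let current = path.posix.resolve(fromDir);
  while (true) {
    // Node does not append node_modules to a path already ending in node_modules.
    if (path.posix.basename(current) !== 'node_modules') {
      dirs.push(path.posix.join(current, 'node_modules'));
    }
    const parent = path.posix.dirname(current);
    if (parent === current) break; // reached the root of the tree
    current = parent;
  }
  return dirs;
}

console.log(candidateDirs('/home/ry/projects/app'));
```

For an app in /home/ry/projects/app, the app's own node_modules is tried first and /home/ry/projects/node_modules second, which is why dependencies moved to the parent directory are still found.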

Discussion courtesy of: laggingreflex

From http://www.futurealoof.com/posts/nodemodules-in-git.html

To recap.

  • Only checkin node_modules for applications you deploy, not reusable packages you maintain.
  • Any compiled dependencies should have their source checked in, not the compile targets, and should $ npm rebuild on deploy.

My favorite part:

All you people who added node_modules to your gitignore, remove that shit, today, it’s an artifact of an era we’re all too happy to leave behind. The era of global modules is dead.

Discussion courtesy of: Benjamin Crouzier

What worked for me was explicitly adding an npm version to package.json ("npm": "1.1.x") and NOT checking in node_modules to git. It may be slower to deploy (since it downloads the packages each time), but I couldn't get the packages to compile when they were checked in. Heroku was looking for files that only existed on my local box.

Discussion courtesy of: Jason Griffin

I believe that npm install should not run in a production environment. Several things can go wrong; an npm outage and the download of newer dependencies (shrinkwrap seems to have solved this) are two of them.

On the other hand, node_modules should not be committed to git. Apart from their large size, commits that include them can become distracting.

The best solution would be this: npm install should run in a CI environment that is similar to the production environment. All tests will run there, and a zipped release file will be created that includes all dependencies.

Discussion courtesy of: user2468170

I am using this solution:

  1. Create a separate repository that holds node_modules. If you have native modules that should be built for a specific platform, then create a separate repository for each platform.
  2. Attach these repositories to your project repository with git submodule:

git submodule add .../your_project_node_modules_windows.git node_modules_windows

git submodule add .../your_project_node_modules_linux_x86_64 node_modules_linux_x86_64

  3. Create a link from the platform-specific node_modules to the node_modules directory and add node_modules to .gitignore.
  4. Run npm install.
  5. Commit submodule repository changes.
  6. Commit your project repository changes.

So you can easily switch between node_modules on different platforms (for example if you are developing on OS X and deploying to Linux).

Discussion courtesy of: mixel

I was going to leave this as a comment on the question, but Stack Overflow was formatting it weirdly. If you don't have identical machines and are checking in node_modules, add the native extensions to .gitignore. Our .gitignore looks like:

# Ignore native extensions in the node_modules folder (things changed by npm rebuild)
node_modules/**/*.node
node_modules/**/*.o
node_modules/**/*.a
node_modules/**/*.mk
node_modules/**/*.gypi
node_modules/**/*.target
node_modules/**/.deps/
node_modules/**/build/Makefile
node_modules/**/**/build/Makefile

Test this by first checking everything in, and then have another dev do the following:

rm -rf node_modules
git checkout -- node_modules
npm rebuild
git status

Ensure that no files changed.

Discussion courtesy of: ibash

I have tried both committing the node_modules folder and shrink-wrapping. Neither solution made me happy.

In short: a committed node_modules adds too much noise to the repository, and npm-shrinkwrap.json is not easy to manage and gives no guarantee that a shrink-wrapped project will still build in a few years.

I found that Mozilla was using a separate repository for one of their projects https://github.com/mozilla-b2g/gaia-node-modules

So it did not take me long to implement this idea in a node CLI tool https://github.com/bestander/npm-git-lock

Just before every build, add:
npm-git-lock --repo [git@bitbucket.org:your/dedicated/node_modules/git/repository.git]

It will calculate a hash of your package.json and will either check out the node_modules content from a remote repo or, if it is the first build for this package.json, do a clean npm install and push the results to the remote repo.
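The idea of keying node_modules on a hash of package.json can be sketched like this. It is not the actual npm-git-lock implementation: the function names (lockKey, planBuild), the use of SHA-1, and the returned action strings are all assumptions made for illustration.

```javascript
const crypto = require('crypto');

// lockKey and planBuild are hypothetical names; the real tool may differ.
// Derives a stable key from the exact text of package.json.
function lockKey(packageJsonText) {
  return crypto.createHash('sha1').update(packageJsonText, 'utf8').digest('hex');
}

// Decides what to do given the keys already stored in the dedicated repo.
function planBuild(packageJsonText, knownKeys) {
  const key = lockKey(packageJsonText);
  return knownKeys.includes(key)
    ? { key, action: 'checkout' }          // reuse the stored node_modules
    : { key, action: 'install-and-push' }; // clean npm install, then store it
}

const pkg = '{"dependencies":{"express":"3.0.0"}}';
const first = planBuild(pkg, []);
console.log(first.action);                         // install-and-push
console.log(planBuild(pkg, [first.key]).action);   // checkout
```

Any change to package.json, even a single version bump, produces a different key and therefore a fresh install.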

Discussion courtesy of: bestander

You should not include node_modules in your .gitignore (or rather you should include node_modules in your source deployed to Heroku).

If node_modules:

  • exists then npm install will use those vendored libs and will rebuild any binary dependencies with npm rebuild.
  • doesn't exist then npm install will have to fetch all dependencies itself which adds time to the slug compile step.

See the Node.js buildpack source for these exact steps

However, the original error looks to be an incompatibility between the versions of npm and node. It is a good idea to always explicitly set the engines section of your package.json according to this guide to avoid these types of situations:

{
  "name": "myapp",
  "version": "0.0.1",
  "engines": {
    "node": "0.8.x",
    "npm":  "1.1.x"
  }
}

This will ensure dev/prod parity and reduce the likelihood of such situations in the future.
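To make the mismatch concrete, here is a small sketch that checks version strings against x-ranges like the ones in the engines section above. satisfiesXRange is a hypothetical helper that only understands simple patterns such as "0.8.x"; it is not how npm or Heroku evaluate engines, and real semver ranges need a proper semver library.

```javascript
// satisfiesXRange is a hypothetical helper for this sketch. It handles only
// dotted patterns where a part is a literal number, "x", or "*".
function satisfiesXRange(version, range) {
  const have = version.replace(/^v/, '').split('.');
  const want = range.split('.');
  return want.every((part, i) => part === 'x' || part === '*' || part === have[i]);
}

// Ranges taken from the engines example above.
const engines = { node: '0.8.x', npm: '1.1.x' };

// Versions reported in the failing deploy log from the question.
console.log(satisfiesXRange('v0.8.2', engines.node)); // true
console.log(satisfiesXRange('1.0.106', engines.npm)); // false
```

The npm version picked in the failing deploy (1.0.106) falls outside "1.1.x", so pinning engines as shown would have requested a different npm than the one that refused to run on node 0.8.2.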

Discussion courtesy of: Ryan Daigle

This recipe can be found in its original form on Stack Overflow.