MongoDB: Copy a collection of referenced documents as subdocuments

Problem

I made the mistake of designing a schema with two collections, where the documents in one contain a manual reference to documents in the other. I realize now that I should have designed it so that the parent collection contains the other collection's documents as sub-documents instead.

The problem is, I've already put this schema into a production environment where hundreds of entries have been created. What I'd like to do is somehow scan over all of the existing data and copy each item into the parent it references as a sub-document.

Here is an example of my schema:

Collection 1 - User

_id
Name

Collection 2 - Photos

_id
url
user_id

Is there a quick way to change the existing documents to be one collection like this:

Collection - User

_id
Name
Photos: [...]

Once I have the database set up correctly, I can easily modify my code to use the new one, but the problem I'm having is figuring out how to quickly and procedurally copy the documents to their parent.

Additional detail - I'm using MongoHQ.com to host my MongoDB.

Thank You.

Problem courtesy of: Matthew Lucas

Solution

I don't know the specifics of your environment, but this sort of change usually involves the following kinds of steps:

  1. Ensure that your old code doesn't complain if there is a Photos array in the User object.
  2. "Freeze" the application so that new User and Photo documents are not created.
  3. Run a migration script that copies the Photo documents into the User documents. This should be pretty easy to create either in JavaScript or through app code using the driver (see example below).
  4. Deploy the new version of the application that expects Photos to be embedded in the array.
  5. "Unfreeze" the application to start creating new documents.

If you cannot "freeze/unfreeze" the application, you will need to run a delta script after step 4 that migrates any Photo documents created in the meantime, up to the point where the new application was deployed.

The migration script for step 3 will look something like this (untested):

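// Collection names follow the original answer; adjust them to match your database.
// For each user, collect that user's photos and embed them as an array.
db.User.find().forEach(function (u) {
    u.Photos = [];
    db.Photo.find({user_id : u._id}).forEach(function (p) {
        u.Photos.push(p);
    });
    // Write the user document back with the embedded Photos array.
    db.User.save(u);
});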
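
For the case where the application cannot be frozen, the delta script mentioned above could be sketched roughly as follows. This is only an illustration under some assumptions: the function name embedMissingPhotos is hypothetical, the collection and field names (User, Photo, user_id, Photos) are taken from the script above, and the shell is assumed to provide db.collection.update (newer shells may prefer updateOne). The "Photos._id" filter with $ne is what should make it safe to re-run, since a user whose array already holds that photo no longer matches.

```javascript
// Hypothetical helper, not part of the original answer.
// Embeds every Photo into its user's Photos array unless a photo with the
// same _id is already there, so running it more than once adds no duplicates.
function embedMissingPhotos(db) {
    db.Photo.find().forEach(function (p) {
        db.User.update(
            // Match the owning user only if this photo is not embedded yet.
            { _id: p.user_id, "Photos._id": { $ne: p._id } },
            // Append the whole photo document to the array.
            { $push: { Photos: p } }
        );
    });
}
```

In the shell this would be invoked as embedMissingPhotos(db). It scans every Photo document, which should be acceptable for a few hundred entries; a larger data set would probably call for restricting the find() to recently created photos.
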
Solution courtesy of: Zaid Masud

This recipe can be found in its original form on Stack Overflow.