
How To Save 1 Million Records To MongoDB Asynchronously?

I want to save 1 million records to MongoDB using JavaScript like this:

for (var i = 0; i < 1000000; i++) {
  model = buildModel(i);
  db.save(model, function(err, done) {
    console.log(done);
  });
}

But it blew up.

Solution 1:

It blew up because you are not waiting for an asynchronous call to complete before moving on to the next iteration. What this means is that you are building a "stack" of unresolved operations until memory is exhausted: quite literally, a stack overflow. Get the picture?

So this is not the best way to proceed with "Bulk" insertions. Fortunately, the underlying MongoDB driver has already thought about this, quite aside from the callback issue mentioned earlier: there is in fact a "Bulk API" available to make this a whole lot better. You could use the native driver directly as the db object, but I prefer just using the .collection accessor from the model, along with the "async" module, to keep everything clear:

var async = require('async');

var bulk = Model.collection.initializeOrderedBulkOp();
var counter = 0;

async.whilst(
  // Iterator condition: keep looping until 1,000,000 documents are queued
  function() { return counter < 1000000 },

  // Do this in the iterator
  function(callback) {
    counter++;
    var model = buildModel(counter);
    bulk.insert(model);

    // Every 1000 documents, execute the batch and start a fresh one
    if ( counter % 1000 == 0 ) {
      bulk.execute(function(err,result) {
        bulk = Model.collection.initializeOrderedBulkOp();
        callback(err);
      });
    } else {
      callback();
    }
  },

  // When all is done
  function(err) {
    // Flush any remaining documents that did not fill a complete batch
    if ( counter % 1000 != 0 ) {
      bulk.execute(function(err,result) {
        console.log( "inserted some more" );
      });
    }
    console.log( "I'm finished now" );
  }
);

The difference here is using the "asynchronous" callback on completion of each batch rather than just building up a stack, while also employing the "Bulk Operations API" to mitigate the asynchronous write calls by submitting everything in batched insert statements of 1000 entries.

This not only avoids "building up a stack" of function executions like your own example code does, but also performs efficient "wire" transactions by not sending everything in individual statements, instead breaking them up into manageable "batches" for the server to commit.
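As a side note, here is a minimal sketch of the same 1000-per-batch idea written with promises and async/await, assuming a modern MongoDB Node.js driver where insertMany is available, and that a collection handle and the buildModel helper from the question are already in scope:

async function insertInBatches(collection) {
  var batch = [];
  for (var i = 1; i <= 1000000; i++) {
    batch.push(buildModel(i));
    // Send each group of 1000 documents to the server as one insertMany call
    if (batch.length === 1000) {
      await collection.insertMany(batch);
      batch = [];
    }
  }
  // Flush any remainder that did not fill a complete batch
  if (batch.length > 0) {
    await collection.insertMany(batch);
  }
  console.log("I'm finished now");
}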

Solution 2:

You should probably use something like Async's eachLimit:

var async = require('async');

// Create an array of numbers 0-999999
var models = new Array(1000000);
for (var i = models.length - 1; i >= 0; i--)
  models[i] = i;

// Iterate over the array performing a MongoDB save operation for each item
// while never performing more than 20 parallel saves at the same time
async.eachLimit(models, 20, function iterator(model, next){
  // Build a model and save it to the DB, call next when finished
  db.save(buildModel(model), next);
}, function done(err){
  if (err) {
    // An error occurred while trying to save a model to the DB
    console.error(err);
  } else {
    // All 1,000,000 models have been saved to the DB
    console.log('Successfully saved ' + models.length + ' models to MongoDB.');
  }
});
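Both solutions call a buildModel helper that the question code never shows. For illustration only, a hypothetical stand-in could be as simple as:

// Hypothetical stand-in for the question's undefined buildModel helper:
// it just returns a plain document keyed by the loop index.
function buildModel(i) {
  return { _id: i, name: 'record-' + i, createdAt: new Date() };
}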
