MongoDB BulkWrite: the Highly Misunderstood Method

A Story of Pathos, Drama and Neglect

If a story of neglect, pretty much lacking drama but loaded with pathos, could be written about an OOP method, its main character would be the MongoDB collection-level bulkWrite() method.  This highly misunderstood method, often lightly skipped over by most DevOps engineers (myself included!), is often mistaken for a glorified version of the collection-level insertMany() method.  When perusing the documentation for the command, one quickly sees that the syntax, at least at first glance, appears convoluted and not worth the bother.  However, upon closer examination, it turns out that this potent little method actually packs quite a punch, allowing you to perform not only bulk inserts, but also updates and deletes, all in a single call.

Now that I’ve (hopefully) gotten your attention, before diving into the nitty-gritty, the question of why bother needs to be addressed.

Why Bother With a Poor Cousin?

Given that bulkWrite() allows you to perform multiple inserts, deletes and updates in a single command, the next logical question that might come to mind is: why bother?  You could simply issue a series of insert, delete and update commands and be done with it.  Here is an example which adds four documents to the test.users collection, updates Betty to “active” status, and deletes Barney:

include __DIR__ . '/../vendor/autoload.php';
use MongoDB\Client;
$data = [
    ['key' => 'FRF','first' => 'Fred',  'last' => 'Flintstone','active' => 1],
    ['key' => 'WIF','first' => 'Wilma', 'last' => 'Flintstone','active' => 1],
    ['key' => 'BAR','first' => 'Barney','last' => 'Rubble',    'active' => 0],
    ['key' => 'BER','first' => 'Betty', 'last' => 'Rubble',    'active' => 0],
];
$client = new Client('mongodb://localhost:27017');
// drop users collection from test database
$client->test->users->drop();
// add the four documents
$client->test->users->insertMany($data);
// set Betty to active
$client->test->users->updateOne(['key'=>'BER'],['$set' => ['active' => 1]]);
// delete Barney
$client->test->users->deleteOne(['key'=>'BAR']);
// view the results
$query = $client->test->users->find();
foreach ($query as $document) var_dump($document);

Absolutely nothing wrong with this block of code … except that the insertMany(), updateOne() and deleteOne() calls each cost a round trip to and from the database.  If this example were converted to use bulkWrite() instead, all of the operations would be handed to the driver in a single call, which lets it batch them and cut down on round trips (exactly how many depends on the driver, the server version and how the operations are mixed): much more efficient as the number of operations grows!  The same argument also applies when deciding whether to use bulkWrite() or deleteMany() or updateMany().

At this point, at least if you’ve gotten this far, you might be sold on the concept, but want to know more.  Let’s now look at bulkWrite() operations.

What Are BulkWrite Operations?

bulkWrite() operations are a set of predefined keys, one per entry in the list of operations you pass to bulkWrite(), each naming the write method to run at that point in the sequence.  Here is a table that summarizes the operations and their associated options:

Operation     Option         Argument
insertOne     document       <insert doc>
updateOne     filter         <query doc>
              update         <update doc>
updateMany    filter         <query doc>
              update         <update doc>
replaceOne    filter         <query doc>
              replacement    <replacement doc>
deleteOne     filter         <query doc>

Where the argument mentions “doc”, when running a bulkWrite() operation using the mongo shell, this would be a JSON document.  On the other hand, when running the same operation using the PHP MongoDB library, “doc” would take the form of a PHP associative array.  The operation would be an array key, and its arguments would be a sub-array of “doc” arrays, each itself consisting of key/value pairs.
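To make the shell-side structure concrete, here is a minimal sketch in plain JavaScript that wraps ordinary objects in the operation and option keys from the table.  The helper names (insertOp, updateOneOp, deleteOneOp) and the variables rows and ops are made up for illustration; they are not part of any MongoDB API.

```javascript
// Hypothetical helpers (not part of any MongoDB API) that wrap plain
// objects in the operation/option structure from the table above.
function insertOp(doc) {
  return { insertOne: { document: doc } };
}

function updateOneOp(filter, update) {
  return { updateOne: { filter: filter, update: update } };
}

function deleteOneOp(filter) {
  return { deleteOne: { filter: filter } };
}

// Sample rows, matching the documents used throughout this article.
const rows = [
  { key: "FRF", first: "Fred",   last: "Flintstone", active: 1 },
  { key: "WIF", first: "Wilma",  last: "Flintstone", active: 1 },
  { key: "BAR", first: "Barney", last: "Rubble",     active: 0 },
  { key: "BER", first: "Betty",  last: "Rubble",     active: 0 },
];

// Four inserts, then one update and one delete, in execution order.
const ops = rows
  .map((row) => insertOp(row))
  .concat([
    updateOneOp({ key: "BER" }, { $set: { active: 1 } }),
    deleteOneOp({ key: "BAR" }),
  ]);

// Print the resulting list of operations for inspection.
console.log(JSON.stringify(ops, null, 2));
```

In the mongo shell, an array shaped like ops is what you would hand to bulkWrite() on a collection.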

OK, yes, I hear you: get to the good stuff will you?  Show me how to do it!

Using BulkWrite in the Mongo Shell

The beauty of using the PHP MongoDB Library, which leverages the MongoDB extension, is that you can first model commands using the mongo shell, and then later map the same command almost directly into your PHP app.  The first step is to define the bulk write document, including operations and options.

bulkDoc = [
  {"insertOne" : { "document" : {"key" : "FRF", "first" : "Fred","last" : "Flintstone","active" : 1}}},
  {"insertOne" : { "document" : {"key" : "WIF", "first" : "Wilma","last" : "Flintstone","active" : 1}}},
  {"insertOne" : { "document" : {"key" : "BAR", "first" : "Barney","last" : "Rubble","active" : 0}}},
  {"insertOne" : { "document" : {"key" : "BER", "first" : "Betty","last" : "Rubble","active" : 0}}},
  {"updateOne" : { "filter" : {"key" : "BER"}, "update" : {"$set" : {"active" : 1}}}},
  {"deleteOne" : { "filter" : {"key" : "BAR"}}}
];

After that, it’s just a matter of running bulkWrite(), in this example on the test.users collection.  Note that the users collection is first dropped so that we get consistent test results.  We also throw in a find() to view results:

use test;
db.users.drop();
db.users.bulkWrite(bulkDoc);
db.users.find();

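Here is how the output might appear: the bulkWrite() result reports four inserted documents, one matched by the update and one deleted, and find() then returns the three remaining documents (Fred, Wilma and Betty, with Betty now active).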
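As a rough cross-check of what to expect, the sketch below replays the same six operations against a plain in-memory array.  It is not MongoDB and makes no claim about how the server works internally: matches() and applyOps() are hypothetical helpers that only understand insertOne, updateOne with $set, and deleteOne with simple equality filters.

```javascript
// In-memory stand-in for a collection: NOT MongoDB, just enough logic to
// follow insertOne, updateOne (with $set) and deleteOne on equality filters.
function matches(doc, filter) {
  return Object.keys(filter).every((field) => doc[field] === filter[field]);
}

// Apply the operations in order and keep simple counters along the way.
function applyOps(ops) {
  const docs = [];
  const counts = { inserted: 0, matched: 0, deleted: 0 };
  for (const op of ops) {
    if (op.insertOne) {
      docs.push({ ...op.insertOne.document });
      counts.inserted += 1;
    } else if (op.updateOne) {
      // Only the first matching document is updated.
      const target = docs.find((d) => matches(d, op.updateOne.filter));
      if (target) {
        Object.assign(target, op.updateOne.update.$set);
        counts.matched += 1;
      }
    } else if (op.deleteOne) {
      // Only the first matching document is removed.
      const index = docs.findIndex((d) => matches(d, op.deleteOne.filter));
      if (index !== -1) {
        docs.splice(index, 1);
        counts.deleted += 1;
      }
    } else {
      throw new Error("operation not covered by this sketch");
    }
  }
  return { docs, counts };
}

const bulkDoc = [
  { insertOne: { document: { key: "FRF", first: "Fred",   last: "Flintstone", active: 1 } } },
  { insertOne: { document: { key: "WIF", first: "Wilma",  last: "Flintstone", active: 1 } } },
  { insertOne: { document: { key: "BAR", first: "Barney", last: "Rubble",     active: 0 } } },
  { insertOne: { document: { key: "BER", first: "Betty",  last: "Rubble",     active: 0 } } },
  { updateOne: { filter: { key: "BER" }, update: { $set: { active: 1 } } } },
  { deleteOne: { filter: { key: "BAR" } } },
];

const result = applyOps(bulkDoc);
console.log(result.counts);
console.log(result.docs.map((d) => d.key + ":" + d.active));
```

Because the operations run in order, Betty is updated after she is inserted and Barney is deleted last, leaving Fred, Wilma and Betty.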

So far, so good, eh?   Now to translate the query using the PHP MongoDB library.

Happily Bulk Writing in PHP

The first step is to translate the query from a JSON document into a PHP array for consumption by the MongoDB\Collection::bulkWrite() method.

$bulkDoc = [
    ['insertOne' => [['key' => 'FRF', 'first' => 'Fred', 'last' => 'Flintstone', 'active' => 1]]],
    ['insertOne' => [['key' => 'WIF', 'first' => 'Wilma', 'last' => 'Flintstone', 'active' => 1]]],
    ['insertOne' => [['key' => 'BAR', 'first' => 'Barney','last' => 'Rubble', 'active' => 0]]],
    ['insertOne' => [['key' => 'BER', 'first' => 'Betty', 'last' => 'Rubble', 'active' => 0]]],
    ['updateOne' => [['key' => 'BER'], ['$set' => ['active' => 1]]]],
    ['deleteOne' => [['key' => 'BAR']]]
];

Right away (er … if you’re still awake at this point!), you might notice the double square brackets after each operation key.  The reason is that the MongoDB PHP library does not accept the keys document, filter and update used by the equivalent shell method.  Instead, each operation key maps to a list of positional arguments: the document for insertOne, or the filter followed by the update for updateOne.  The outer brackets are that argument list and the inner brackets are the document or filter itself, so the nesting is required even when there is only one argument.
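The mapping from the shell form to the positional form can be sketched in a few lines.  The snippet stays in JavaScript so the two shapes can be compared side by side; ARG_ORDER and toPositional() are hypothetical names, and the argument order is this sketch's reading of the PHP examples above (any trailing options argument is left out).

```javascript
// Argument order per operation, as this sketch assumes the PHP library
// expects it (trailing options array omitted).
const ARG_ORDER = {
  insertOne: ["document"],
  updateOne: ["filter", "update"],
  updateMany: ["filter", "update"],
  replaceOne: ["filter", "replacement"],
  deleteOne: ["filter"],
};

// Hypothetical helper: turn one shell-style operation into positional form.
function toPositional(op) {
  const name = Object.keys(op)[0];
  const order = ARG_ORDER[name];
  if (!order) throw new Error("unknown operation: " + name);
  // Pick the named options out in order, producing a plain argument list.
  return { [name]: order.map((option) => op[name][option]) };
}

const shellOps = [
  { insertOne: { document: { key: "BER", first: "Betty", last: "Rubble", active: 0 } } },
  { updateOne: { filter: { key: "BER" }, update: { $set: { active: 1 } } } },
  { deleteOne: { filter: { key: "BAR" } } },
];

const positional = shellOps.map((op) => toPositional(op));
console.log(JSON.stringify(positional));
```

Each operation name now maps to a list, which is exactly where the second pair of square brackets in the PHP array comes from.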

Next, as with the shell example, we drop the users collection, run bulkWrite(), and run find() to view results:

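$client = new Client('mongodb://localhost:27017');
// drop users collection from test database
$client->test->users->drop();
// run all six operations in one call
$client->test->users->bulkWrite($bulkDoc);
// view the results, leaving out the generated _id field
$query = $client->test->users->find([],['projection' => ['_id' => 0]]);
foreach ($query as $document)
    vprintf("%4s : %12s : %12s : %2d\n", $document->getArrayCopy());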

The result is pretty much the same as above, allowing for the minor formatting difference: one line each for Fred, Wilma and Betty, with Betty now active.

And that about wraps things up.  To summarize: bulkWrite() is an efficient way to perform bulk operations involving a combination of insert, update and/or delete.  Use it when you need to perform mass operations that mix inserts, updates and deletes, as it can save round trips between the application and the database.