Shift8 Creative Graphic Design and Website Development

mongodb

MongoDB Queries with Lithium: Part Three

Posted by Tom on Mon, Aug 16 2010 18:13:00

This will probably be the last part of the series, unless I find some other type of query that may not be so obvious or that may be new in a future release of MongoDB. This will also probably contain the most obscure and complex examples.  

Regex With $or
So, on with the good part first. You probably want to create some sort of search for content on your site, be it on the front end for visitors or the back for only admins...It's a common thing for a site and while it's always a good idea to use something like Lucene/Solr/Sphinx if you expect complex searches and lots of them...You can make a quick and dirty search with most databases such as MySQL via a FULL TEXT search. Well, MongoDB does one better and takes advantage of regex, which I posted about in the last article for this series...But this time we're going to use a new operator, the $or operator in conjunction with regex. This is new in MongoDB 1.5.3, so ensure that you at least that version.

What's the $or do? Well, the $or is important because if you make a query with regex or not and want to search multiple fields, you will have all results from both matches be returned. The $or keeps you from getting duplicate results. If you remember in part one we used $elemMatch when dealing with matching multiple conditions within objects. In this example, we aren't working with complex objects, we're just saying go find a regex pattern in the title or the body of some document. Then again, maybe you will need to also apply what you learned about $elemMatch in the previous article with something like this. Right, so get on with it.

// I'm making one call to MongoRegex() but you may need more.
$search_regex = new \MongoRegex('/'.$this->request->data['query'].'/i');

$conditions => array(
     'public' => true,
     '$or' => array(
               array('title' => $search_regex),
               array('directions' => $search_regex)
          )
     )
);

// Now you can put the query together (or define $conditions inline with the find() call)
$recipes = Recipe::find('all', array(
     'fields' => array('title', 'url'),
     'conditions' => $conditions
));

My head exploded at first, but it's quite easy to follow along with. A little tip? Reference the MongoDB documentation, make your queries in a console window and when you get back the results you want...Take the text you entered and print it out on your page where you're making the query. I just print out the guts between the parenthesis. Then with your $conditions array that you build, run json_encode() and also print that out on the screen. It's a good way to see how the array format is getting translated. It makes it a lot easier if you're not a human PHP->JSON translator. For complex objects it helps quite a bit.

So what does that do above? A little context. I have some recipes and I only want those marked as being true for the "public" field to be included in the results first and foremost. You may have similar needs with some sort of publish status. Then we're running a regex on the query string passed via POST, but we're running it on two fields. In my case I have the title for the recipe and then some directions. I may also want to search ingredients too. Of course ingredients is in an object for me, so hey where's that good ol' $elemMatch? Yay, more complexity.

Ok. So sticking with this...What is returned is exactly what you'd expect, any recipe that's publicly visible that has the query in the title or directions (case insensitive match).

Lithium's "like" array key will not work in this case for two fields. It will for one. So in other words:

// Will NOT work
$conditions = array(
     'title' => array('like' => '/chicken/i'),
     'directions' => array('like' => '/cook/i')
);

// WILL work, but won't give us what we're after
$conditions = array(
     'title' => array('like' => '/chicken/i')
);

Again, keep in mind we also need to use the $or operator for a very important reason. We don't want duplicate records in our search results. How about the other operators and other conditions? Sure! You can apply the $nin and $ne and all the other operators too. Those are a little different though. So where we put in $or as if it was a field, we have to use $nin and $ne like the other article explained. Let me write it out again all glued together with the first example here.

$conditions => array(
     'public' => true,
     '$or' => array(
               array('title' => $search_regex),
               array('directions' => $search_regex)
          )
     ),
     'title' => array(
          '$nin' => array('BBQ Chicken', 'More Chicken')
     )
);

You can probably start to imagine how you would build an "advanced" search where you allowed the user to fill out a few different form fields that narrowed the search down. Ok, so let's glue together a query that combines a little of everything that we've gone over in all three articles. This will be as if a visitor came to the site, searched for recipes that were shared with a specific family (let's say their family), had "chicken" in the title/directions, and was not shared with the public. So only a private chicken recipe currently approved and shared with a specific family (group).

$conditions = array(
     'public' => false,
     '$or' => array(
          array('title' => $search_regex),
          array('directions' => $search_regex),
          array('ingredients.ingredient' => $search_regex)
     ),
     'title' => array(
          '$nin' => array('More Chicken')
     ),
     'families' => array(
          '$elemMatch' => array(
               'family_id' => '4c59d5ec7f8b9a6279000000',
               'approved' => true
          )
     )
);

Confused? You will be, in this episode of MongoDB & Lithium...Right, so your queries can get rather complex. Hopefully you won't be confused after reading my article, but to be fair, that's complex even if you were to type it straight up in the console written as JSON. It's no different really, but don't think that you need to make a custom query with Lithium, your query can be represented as a PHP array which really helps you out with dynamic values.

The above query also looks at ingredients, which in my case are within an array. In a production situation I probably would build a more complex regular expression or possibly several. The query could end up even more complex based on what's posted along to my search method. This should really give you some ideas though. How does this impact the database? I'm not really sure. I haven't benchmarked this at all or anything.

The $type Operator
One last query for the road? Sure. This is a simple one and probably very obscure. MongoDB lets you query by data type. Wha?? Crazy right? I personally haven't found a real great use for this yet, but you never know. Here's an example:

$conditions = array(
     'title' => array(
          '$type' => 2
     )
);

That will return every recipe where the title is a string. The $type operator accepts a list of numbers that are assigned to a BSON type. You can see the list of types on the MongoDB documentation page.

That's all for now, hope you enjoyed!

MongoDB Queries with Lithium: Part Two

Posted by Tom on Fri, Aug 13 2010 09:46:00

Let's take a step back for part two. Let's ignore complex (or any) objects and here we're going to go over some more basic (and common) queries that you may need to make. No, there's not really much order to the queries I'm going to go over in this series, sorry. It started as more of an "I'll post about what I'm doing as I do it." type of thing, but I'll try to go over some of the basics now. In fact, if you keep the MongoDB documentation site open for reference as you're coding, I guarantee that most queries you'll need won't be that hard.

Those of you coming from a strong love of MySQL will appreciate part two here. I'm going to go over what will replace your IN() and NOT IN() and I'll even brush on LIKE at the end of this post.

Equal To and In ($in)
So, probably your most basic query will be finding something by a specific value. We typically look up by id and we can do this in Lithium and it's been documented elsewhere. So if you've used Lithium and you've done this type of query, this will bore you...But it's good to point out that the following two conditions are basically the same.

$conditions = array('title' => 'BBQ Chicken');
$conditions = array('title' => array('BBQ Chicken'));

You can simply use a string value for the "title" key (the key being the field name in your collection in MongoDB). These aren't literally the same thing of course, the array() means to go get multiple. So you can add to that list and you've got yourself an $in basically. Here, these two are also work the same:

$conditions = array('title' => array('BBQ Chicken', 'More Chicken'));
$conditions = array('title' => array('$in' => array('BBQ Chicken', 'More Chicken')));

I suppose a difference here is that the first set of conditions would work if you were using MySQL or MongoDB with Lithium. The minute you start using specific operators for MongoDB (all those with dollar signs) then you're no longer coding in a database agnostic fashion. Which probably isn't a real big deal, unless you were trying to make some sort of application that could run with whatever database desired.

Not Equal To ($ne) and Not In ($nin)
Of course we have the opposite available to us as well. These will not be database agnostic, but you can run:

$conditions = array('title' => array('$ne' => 'BBQ Chicken'));
$conditions = array('title' => array('$nin' => array('BBQ Chicken', 'More Chicken')));

$ne being "not equal to" and $nin being "not in". The equivalent of <> and NOT IN() for MySQL users. You can probably start to see how all the operators are translated into this array format in Lithium's Model::find() method. Once you see a few you can pretty much assume the rest will be similar.

LIKE, as in Like Regular Expressions
So real quick I'm going to touch base on the equivalent of a MySQL LIKE in MongoDB. MongoDB is actually way cool in that it lets you run regular expressions! You can't simply put title => '/regex/'... That won't work because you're not technically defining a regular expression, you're defining a string. PHP has no RegExp() object class like JavaScript does. However, that PECL extension you installed for MongoDB does have a class to help you out here. So in order to use Lithium's find with a regex, you'll write something like this:

$conditions = array('title' => new \MongoRegex('/chicken/i'));

This is of course a very basic regular expression, but you do have the ability to create quite complex expressions and also take advantage of some flags; you can look here for the currently supported flags. Above you can see that I'm using the case insensitive flag. Those conditions would return me all of my chicken recipes.

Lithium also accounts for MongoRegex() as well so you don't actually need to create a new MongoRegex() object yourself. You could also write:

$conditions = array('title' => array('like' => '/chicken/i'));

For those of you not familiar with Lithium, let me glue it together for you. The Model::find() method is extremely powerful and simple to use in Lithium and it will help you make queries very quickly. You will write less code, things will look organized, and you won't be pulling your hair out. Not that I think using the straight up MongoDB extension is all too complex, but Lithium makes things a little nicer for you.

$recipes = Recipe::find('all', array('conditions' => array('title' => array('like' => '/chicken/i'))));

In a simple example, but keep in mind that aside from "conditions" you also have keys like "limit" and "fields" available to you in order to return a more specific set of data. $recipes here will be an object with all of my recipes with chicken in their titles. $recipes->data() would then return an array from that object for me to use nicely in my view template. That's one powerful short line of code. Typically to run the query you would be talking about establishing the connection first. You'd write more lines of code. Lithium makes things pretty easy, right?! Due to the speed of MongoDB this also might serve as your site's search engine. Not too shabby. Stay tuned for more, I'll get back to some of the more complex (and sometimes strange and probably not often used...but fun) queries you can make with Lithium and MongoDB.

MongoDB Queries with Lithium: Part One

Posted by Tom on Wed, Aug 11 2010 19:44:00

Ok, so this will probably be a whole big long series because there are a real good handful of different types of queries you can make with MongoDB. Combine that with the complex objects that you can store in the database and you really can get quite complex. Of course I also urge you to think about database design even if MongoDB is so easy that you dive in and just simply forget to think and draw out a schema map. I mean you have regex and all sorts of crazy stuff available. Check out this MongoDB documentation page for more.

So how does one access all this using the friendly Model::find() in Lithium? If you're like me, you're probably coming from CakePHP and the find() will make a lot of sense. You'll be comfortable specifying the conditions key and order, etc. But you won't know what to make about some more of the complex things. So in this series I'm going to provide some simple examples of how to go and form those advanced queries using the Lithium framework.

The Task at Hand
So today I came across a situation where I have an array of data stored in MongoDB. I'll give you the exact context, this is for my Family Spoon project. This is the recipe record holding an array of data for all the families it's being shared with...Or requesting to be shared with upon admin approval. So this "families" array has multiple items each with keys of "family_id" and "approved" and "request_date" and "approval_code" and what not. All different data types right? Strings and booleans and timestamps, oh my! So on some overview admin page we want the user to see all these recipes that they need to either approve or reject. We also need to only show to other people only the approved recipes that are being shared on some other page.

So you're talking about your typical AND operator in MySQL. You're saying go get me records where the family_id is this AND the approved field is that. Great and using the JSON we use dots to get into those arrays and it translates to:

$conditions = array('families.family_id' => 123, 'families.approved' => true);

Or does it? Mwuahahaha....No it actually won't give you the results you expect. What that will do is give you all recipes where the family id is 123 and all recipes that are true. I know what that sounds like...BUT...not where both are strictly the case. So you're getting all these recipes that aren't approved, what gives? That's kinda more like an OR situation with MySQL right?

A little check into the MongoDB docs under the "Reaching into Objects" section should help clarify. It says blah blah blah, "subobjects have to match exactly", blah blah blah. Crap. So then what? 

$elemMatch To the Rescue!
Oh, we use this $elemMatch thingy. It allows us to specify these conditions for more complex objects how we would expect. The question is does Lithium have some array key called 'elemMatch' to use like 'conditions'?? No it doesn't and you don't need to make some custom query...You can use find(). So here's where the thinking cap went on...Er, the trial and error cap. Lithium DOES look for $ in conditions. So how do we write them? Like this:

$conditions = array('families' => array('$elemMatch' => array('family_id' => 123, 'approved' => true)));

Make sense? I don't think it's real obvious, but it's pretty simple looking at it. It's just that it's not well documented (yet). So this will build out the query the way it needs to be and return the results you'd expect, we wouldn't see any "approved" = false recipes in the results. Note that in this case we also are not using any dot notation. Keep in mind that running conditions like the following are also perfectly valid and work.

$conditions = array('families.family_id' => 123);

It just will give you simply all recipes with the family id of 123. So in some cases you will use the dot notation and in other special cases you'll need to see some of those operators with the dollar sign. So, fortunately, Lithium does allow us to use those operators and you do not need to make any sort of custom query and forfeit the use of the find() method. Hooray! Just keep one very important thing in mind...Your find() call may not be database agnostic anymore so if you're using multiple database types or want to make an app database agnostic, watch out. More to come soon.

1 | 2