César's geek-side (http://crodas.org): bits coming in and out.

ActiveMongo
http://crodas.org/activemongo.php
Sat, 27 Feb 2010 05:47:19 +0000, by crodas

It has been a while since I last blogged; I was quite busy with work and some personal projects. I created my first PECL package (and therefore I now have my @php.net mail), which I built to accomplish a personal goal that is already working in my sandbox (I’ll cover this later in this blog).

This time I will talk about MongoDB and the simple yet efficient ActiveRecord class that I wrote in less than a week to make it even easier to use MongoDB from PHP.

If you already have any experience with MongoDB, you might be wondering: why did you create it? MongoDB is already very simple. That’s right, but I want to make it even easier and friendlier, so I focused on these things:

  • Keep it simple, stupid.
  • Easy iteration.
  • Optimized updates.
  • MongoDB is already good at queries, so don’t wrap it.

Of course, you can do all this with just MongoDB, but it’s a bit tricky, especially the optimized updates (updating only what has changed).

ActiveMongo code usually looks like this:

<?php

ActiveMongo::connect("testing_db", "localhost");

class User extends ActiveMongo
{
    /* Define our User's document properties, */
    /* just to make our code readable, not really needed */
    public $username;
    public $password;
    public $country;
    public $address;
}

$user = new User;
$user->username = "crodas";
$user->password = "password";
$user->country  = "Paraguay";
$user->address[0] = array('address1' => 'foobar', 'city' => 'Asuncion', 'zip' => 'xxxx');
/* Insert */
$user->save();
$user->password = "another_password";
/* Update, only the password would be updated. */
$user->save();
?>

So far there is nothing new, except that for updates, instead of putting the whole object back, ActiveMongo performs a diff between the document as it was loaded and the object’s current properties, and generates a special document using $set and $unset that is sent to Mongo. Again, this operation is very simple in principle, but the changes can be hard to track by hand. Look at the next example:

<?php
/* .... */
$user->address[0]['address1'] = 'Bar';
unset($user->address[0]['zip']);
$user->address[1] = array('address1' => 'another address', 'city' => 'Asuncion');
$user->save();
?>

In this case, instead of sending the whole User object, ActiveMongo sends just the following document to Mongo:

{
'$set' : {
     'address.0.address1' : 'Bar',
     'address.1': {'address1' : 'another address', 'city' : 'Asuncion'}
    },
'$unset' : {'address.0.zip' : 1}
}

That is quite hard to generate by hand for every collection, and putting the entire document back is a waste of resources (network, I/O and so forth).
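
The diff described above can be sketched in a few lines of plain PHP. This is not ActiveMongo's actual code, only an illustration under simple assumptions: build_update() is a hypothetical helper, documents are plain nested arrays, and values are compared strictly.

```php
<?php
/**
 * Hypothetical helper (not part of ActiveMongo): compare the document as it
 * was loaded ($old) with its current state ($new) and collect the changes
 * as dotted paths under '$set' and '$unset'.
 */
function build_update(array $old, array $new, $prefix = '')
{
    $update = array('$set' => array(), '$unset' => array());

    foreach ($new as $key => $value) {
        $path = $prefix . $key;
        if (!array_key_exists($key, $old)) {
            /* Key did not exist before: send the whole value */
            $update['$set'][$path] = $value;
        } elseif (is_array($value) && is_array($old[$key])) {
            /* Both sides are arrays: descend and merge the nested changes */
            $child = build_update($old[$key], $value, $path . '.');
            $update['$set']   += $child['$set'];
            $update['$unset'] += $child['$unset'];
        } elseif ($value !== $old[$key]) {
            /* Scalar (or type) changed: send only this path */
            $update['$set'][$path] = $value;
        }
    }

    foreach ($old as $key => $value) {
        if (!array_key_exists($key, $new)) {
            /* Key was removed from the object */
            $update['$unset'][$prefix . $key] = 1;
        }
    }

    return $update;
}

$old = array('address' => array(
    array('address1' => 'foobar', 'city' => 'Asuncion', 'zip' => 'xxxx'),
));
$new = array('address' => array(
    array('address1' => 'Bar', 'city' => 'Asuncion'),
    array('address1' => 'another address', 'city' => 'Asuncion'),
));

/* array_filter() drops an operator that ended up empty */
$update = array_filter(build_update($old, $new));
```

Here $update only carries the changed address line, the new second address and the removed zip; the unchanged city is never sent.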

Another important feature (at least for me :-) ) is data validation. I implemented it in a simple way (at least I think so). Suppose that in the User collection we want to store the password hashed with SHA1, and the username can’t be changed. This can be done as follows:

<?php
class User Extends ActiveMongo
{
    public $username;
    public $password;
    public $country;
    public $address;

    function username_filter($value, $past_value)
    {
        if ($past_value != null && $value != $past_value) {
           throw new FilterException("Can't change username");
        }
        /* Lowercase letter first, then lowercase letters, digits or dashes */
        if (!preg_match('/^[a-z][a-z0-9\-]+$/', $value)) {
           throw new FilterException("Invalid username");
        }
    }

    function password_filter(&$value)
    {
        if (strlen($value) < 5) {
            return false; /* same as throw something */
        }
        $value = sha1($value);
    }
}
?>

Nice, isn’t it? Of course, a filter can’t check whether a field exists; it can only validate a field that is present. If you need to ensure that every document has some properties, you can use the pre_save hook, which receives the operation (‘create’ or ‘update’) as its first parameter and the document that will be saved as its second.

class User Extends ActiveMongo
{
    /* ... previous code ... */
    function pre_save($op, $document)
    {
        /* Fields that every document must have */
        $to_check = array('username', 'password');

        switch ($op) {
        case 'create':
            foreach ($to_check as $field) {
               if (!isset($document[$field])) {
                   throw new FilterException("Missing field {$field}");
               }
            }
            break;
        case 'update':
            /* On update, reject any attempt to unset a required field */
            foreach ($to_check as $field) {
                if (isset($document['$unset'][$field])) {
                    throw new FilterException("The field {$field} can't be unset");
                }
            }
            break;
        }
    }
}

If people request it (and if it is useful), this checking could be automated somehow (e.g. by implementing a method checkFields() that returns an array of required fields). Meanwhile, I find this way pretty useful and friendly. This hook could also be used to check whether the current user has permission to perform an insert or update (useful for a CRM). Currently ActiveMongo supports only three hooks: pre_save, on_save (after the save()) and on_iterate (when it moves to the next record).
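
As a rough sketch of how that automation might look: the base class below is hypothetical (ActiveMongo does not provide checkFields() or RequiredFieldsModel), and FilterException is redefined as a stand-in only so the example runs on its own.

```php
<?php
/* Stand-in for ActiveMongo's FilterException, so the sketch is self-contained */
class FilterException extends Exception {}

/* Hypothetical base class: pre_save() is driven by the list from checkFields() */
abstract class RequiredFieldsModel
{
    /* Subclasses return the names of the fields every document must have */
    abstract protected function checkFields();

    function pre_save($op, $document)
    {
        foreach ($this->checkFields() as $field) {
            /* On create, the field must be present in the new document */
            if ($op == 'create' && !isset($document[$field])) {
                throw new FilterException("Missing field {$field}");
            }
            /* On update, the field must not be listed under $unset */
            if ($op == 'update' && isset($document['$unset'][$field])) {
                throw new FilterException("The field {$field} can't be unset");
            }
        }
    }
}

class User extends RequiredFieldsModel
{
    protected function checkFields()
    {
        return array('username', 'password');
    }
}
```

With this shape each model only declares its required fields, and the create/update logic lives in one place.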

The most important part is how you query your database. Luckily, MongoDB has a very flexible query method (think of it as compiled SQL with no joins :-) ). Because MongoDB is already kickass at queries, there is no need to abstract it.

class User Extends ActiveMongo
{
    /* ... previous code ... */
    /* SELECT * FROM user WHERE karma > $karma ORDER BY karma DESC */
    function my_query($karma)
    {
         /* get MongoDB collection object */
         $col = $this->_getCollection();
         /* Let's build our request */
         $query = $col->find( array('karma' => array('$gt' => $karma)) );
         $query->sort(array('karma' => -1));
         /* now give the query to ActiveMongo */
         $this->setCursor($query);

         return $this; /* to use directly with foreach */
    }
}

/* How to use it. */
$users = new User;
foreach ($users->my_query(15) as $user) {
    $user->user_type = 'super_user';
    $user->save();
}

/* ActiveMongo on its own provides a very simple query API */
/* no limit, no sorting */
$user = new User;
$user->username = 'crodas';
$user->find(); /* read the parameters from object property */

ActiveMongo is still under development, but as far as I have tested it seems pretty stable; most of the ongoing development is about adding new functionality. For instance, I’m looking for an easy way to add references to another document, or to a set of documents, keeping efficiency in mind and talking to the database as little as possible.

Every release will be hosted at PHPClasses, and I’ll mirror my under-development version in the git repository.

Comments, patches and fork/merge requests are more than welcome :-)

Thinking in Documents (and dropping ACID)
http://crodas.org/thinking-in-documents-and-dropping-acid.php
Sat, 28 Nov 2009 21:51:49 +0000, by crodas

Distributing PHP processing with Gearman
http://crodas.org/distributing-php-processing-with-gearman.php
Tue, 24 Nov 2009 03:32:05 +0000, by crodas

Over the weekend I wrote an article about Gearman, published on the PHPClasses site.

My post at PHPClasses

Special thanks to my friend Manuel Lemos

Latinoware 2009
http://crodas.org/latinoware-2009.php
Fri, 23 Oct 2009 16:37:02 +0000, by crodas

Non-Framework MVC sites with PHP

Weird but cool Pagerank’s usage
http://crodas.org/weird-but-cool-pageranks-usage.php
Tue, 06 Oct 2009 17:42:33 +0000, by crodas

Yesterday, while I was talking with a good friend, he told me he needed a good algorithm to detect keywords (relevant words) in a document. The first algorithm that came to mind was a simple word frequency counter that discards common words using a list of stop-words built in a previous learning step. This algorithm is pretty obvious and I’m sure it is widely used out there.

Then, Googling for some papers (I have a bunch on my laptop but I do not recall where I stored them), I found a paper that opened my mind (TextRank: Bringing Order into Texts). It suggests building a graph of words and then applying the PageRank algorithm to that graph in order to find the relevant words. I haven’t read it deeply yet, but I got that idea from a brief reading, and it makes sense. I’m wondering why I never thought about it.

I’m planning to code it, just as a proof of concept, during this week. Basically I will use some old code that I wrote (but never finished) a while ago. I remember I built it to be very modular using classes, so adapting that code to these needs will be pretty straightforward. In the graph of words (and probably sets of 1, 2 and 3 words), the previous word will reference the next word (if you have no idea what I said here, just take a look here).
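
To make the idea concrete, here is a minimal sketch of a "previous word references the next word" graph ranked with a plain PageRank iteration. It is only an illustration, not the paper's exact method nor the code mentioned above: rank_words() is a hypothetical helper, it handles single words only, and the damping factor and iteration count are assumed defaults.

```php
<?php
/* Hypothetical helper: rank words by PageRank over a previous -> next word graph */
function rank_words($text, array $stop_words = array(), $damping = 0.85, $iterations = 50)
{
    /* Tokenize into lowercase words and drop the stop-words */
    preg_match_all('/[a-z]+/', strtolower($text), $matches);
    $words = array_values(array_diff($matches[0], $stop_words));

    /* Build the graph: each word links to the word that follows it */
    $links = array();
    foreach ($words as $i => $word) {
        if (!isset($links[$word])) {
            $links[$word] = array();
        }
        if (isset($words[$i + 1]) && $words[$i + 1] !== $word) {
            $links[$word][$words[$i + 1]] = true;
        }
    }

    $n = count($links);
    if ($n == 0) {
        return array();
    }

    /* Start with a uniform rank and iterate the PageRank update */
    $rank = array_fill_keys(array_keys($links), 1 / $n);
    for ($i = 0; $i < $iterations; $i++) {
        $next = array_fill_keys(array_keys($links), (1 - $damping) / $n);
        foreach ($links as $word => $targets) {
            if (count($targets) == 0) {
                /* Word with no outgoing link: spread its rank over all words */
                foreach ($next as $w => $unused) {
                    $next[$w] += $damping * $rank[$word] / $n;
                }
                continue;
            }
            $share = $damping * $rank[$word] / count($targets);
            foreach ($targets as $target => $unused) {
                $next[$target] += $share;
            }
        }
        $rank = $next;
    }

    arsort($rank); /* highest ranked words first */
    return $rank;
}

$text = "mongo stores documents and mongo queries documents and mongo updates documents";
$rank = rank_words($text, array('and'));
```

The returned array maps each word to its score, sorted from most to least relevant, so the first few keys are the keyword candidates.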

I will post the results here.
