PHP - Magic functions and XML

Practical PHP

(Original version written by Paul Hudson for Linux Format magazine issue 50.)


Continuing from the object-orientation discussion in the last tutorial, we look at magic functions and a simple way to do XML...


Despite devoting a whole four pages to the new OOP features in PHP 5 last issue, there's still quite a bit more to cover in the form of magic functions, different kinds of class definitions, and static class variables. The magic functions are designed to provide useful little snippets of functionality to your objects as standard, and give you extra flexibility that would otherwise have been impossible. In addition, PHP 5 allows classes to be defined as abstract or final, and gives us static class variables - these are quite in-depth pieces of functionality as OOP goes, so we'll go over them fairly slowly this issue to make sure it's all understood.

XML in PHP 5 has also been vastly improved so that, finally, it's easy to use XML for everyday tasks without writing a mountain of code to navigate the DOM tree. As the name implies, the SimpleXML extension makes XML simple, or at least somewhat simpler, which provides yet one more compelling reason to upgrade to PHP 5.

Anyway, enough talking and onto some code - these new magic functions aren't at all about pulling rabbits out of hats, so what are they and why do you want them?


A touch of magic

We looked at some magic functions last month, namely __toString(), __autoload(), etc. This month we're looking at three magic functions that are somewhat confusing because they are designed to give you maximum flexibility in your scripts, and therefore might seem somewhat vague and even pointless when you first try them out.

The three functions we're interested in are __get(), __set(), and __call(). All three are similar, but __get() and __set() are directly related, so we'll look at those two first.

Consider the following script:

<?php
  class car {
    public function __get($var) {
      echo "Getting $var\n";
    }

    public function __set($var, $val) {
      echo "Setting $var to $val\n";
    }
  }

  $volvo = new car();
  $volvo->Name = "Tim";
  echo $volvo->Name;
?>

Save that as getset.php and run it - naturally you'll need to have PHP 5 installed for it to work. The output you should get is this:

Setting Name to Tim
Getting Name

As you can see, we set the name of the car to Tim, then get the name and print it out. Do you see "Tim" being outputted? No. Well, let's take a look at how the script works!

The __get() magic function takes one parameter, which is the name of the variable to get. This is simple enough, and __get() will be called each time we reference a variable* from our object. Similarly, __set() takes the name of the variable to be set as well as the value to set it to, and you can see that working in the output. So, why doesn't anything actually happen other than the echo statements printing out?

Well, the problem here is that __get() and __set() only work on variables that are not set. This is why there's an asterisk a couple of sentences ago, next to "each time we reference a variable" - this is really "each time we reference a variable that has not been defined already". You see, __get() and __set() are designed to handle calls to variables that don't exist, and allow you to override the standard get/set with your own code.

In our example, we set the Name variable of our car object, and, as it doesn't have a Name variable, __set() kicks in. This call prints out "Setting Name to Tim", but then doesn't do anything about it; the variable doesn't actually get set. The reasoning here is that you can customise the set action so that it does something entirely different - perhaps setting a value in a file rather than creating the object value; the choice is yours. As a result of this, we don't see the object's name being printed out, but we can redo the script to change that, like this:

<?php
  class car {
    public function __get($var) {
      echo "Getting $var\n";
    }

    public function __set($var, $val) {
      echo "Setting $var to $val\n";
      $this->$var = $val;
    }
  }

  $volvo = new car();
  $volvo->Name = "Tim";
  echo $volvo->Name;
?>

This time, __set() gets the call to set an undefined value, and then carries through by setting it - this makes the script operate quite differently! The output is now this:

Setting Name to Tim
Tim

This time you can see we have the car's name being outputted properly, but we don't get the message "Getting Name" any more! It's usually around about now that people start thinking they can't win with __get() and __set(), but hold on while I explain the logic here. Keep in mind that __get() and __set() only work on variables that aren't defined in the object - by calling __set(), and having it actually set the variable, the variable becomes defined in the object. As a result of this, __get() no longer works - "we always knew these functionses were tricksy", as Gollum might say.
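If you want both messages every time, one option is to keep the values somewhere PHP doesn't count as a variable of that name. Here is a minimal sketch of that idea - the logged_car class and its private $data array are names invented for this illustration, not part of the original script:

```php
<?php
  // Hypothetical variation on the car class: values live in a private array,
  // so no variable called Name is ever created on the object itself.
  class logged_car {
    private $data = array();

    public function __get($var) {
      echo "Getting $var\n";
      // Hand back the stored value, or null if nothing was ever set
      return isset($this->data[$var]) ? $this->data[$var] : null;
    }

    public function __set($var, $val) {
      echo "Setting $var to $val\n";
      $this->data[$var] = $val;
    }
  }

  $volvo = new logged_car();
  $volvo->Name = "Tim";
  echo $volvo->Name, "\n";
?>
```

Because Name never becomes a real variable of the object, __get() and __set() are called on every access, so this should print "Setting Name to Tim", "Getting Name", and then "Tim".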


Is it helpful?

Now, given that both __get() and __set() are somewhat evasive, how can they be used for anything interesting and, dare we ask it, useful? This takes some thinking. Given that we're able to do absolutely anything we want when we get and set variables, it's a logical extension to consider making objects more than just temporary script values - making them permanent. This can be achieved by tying our objects into a database system so that as we read and write values, we are in fact working behind the scenes with a database.

If this sounds difficult, you're probably just overestimating the difficulty of these two functions. Log in to your MySQL server, and enter the following commands:

USE test;
CREATE TABLE list_of_clowns (Name CHAR(100) PRIMARY KEY, Age TINYINT, BestTrick CHAR(30));
INSERT INTO list_of_clowns VALUES ('Clinko', 29, 'Juggling');
INSERT INTO list_of_clowns VALUES ('Bobo', 33, 'Pie-throwing');
INSERT INTO list_of_clowns VALUES ('The Amazing Bobert', 46, 'Car crashing');

Here we have a very simple table that contains information on three circus clowns, Clinko, Bobo, and The Amazing Bobert. What we want to do is to be able to create an object in a PHP script that represents one of these clowns, and manipulate the table data using OOP. Note that we are storing the table in the "test" MySQL database, which should have been created by default when you installed MySQL.

Save this next script as clowns.php:

<?php
  mysql_connect("localhost", "someuser", "somepass");
  mysql_select_db("test");

  class clown_db {
    public $Name;

    public function __construct($Name) {
      $this->Name = $Name;
    }

    public function __get($var) {
      $result = mysql_query("SELECT $var FROM list_of_clowns WHERE Name = '{$this->Name}';");
      if (($result) && mysql_num_rows($result)) {
        extract(mysql_fetch_array($result));
        return $$var;
      }
    }

    public function __set($var, $val) {
      echo "Setting $var to $val\n";

      mysql_query("UPDATE list_of_clowns SET $var = '$val' WHERE Name = '{$this->Name}';");
    }
  }

  $clinko = new clown_db("Clinko");
  echo "My name is {$clinko->Name} and my age is {$clinko->Age}\n";
?>

There's quite a lot of code in there, and you will really need to go and read last issue if you're not sure what __construct() is. Here's how the code breaks down:

  • A clown_db class is defined, with a constructor to accept a name. Remember in our table schema we used the clown name as the primary key, so we create the clown giving it the name to use for table lookups.
  • The clown_db class has __get() and __set() functions in there to look up and set data from the list_of_clowns table - these look up and set data by using the primary key to differentiate between clowns.
  • The __get() function returns $$var, which is a variable variable that should be set to the value we just took out of the database, which extract() put into the function's local scope.
  • We create $clinko, a clown, and then get the script to output his (or her?) name and his age.

If you've followed the steps so far, you should get the output "My name is Clinko and my age is 29" - the object is communicating smoothly with the table. The script should work equally well if you change values of $clinko.
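The mysql_* functions used above belong to the PHP 5 era and were removed in PHP 7, so a newer installation needs a different database layer. The sketch below swaps in PDO with an in-memory SQLite table, purely so that it can run without a server - the clown_pdo class, its $columns list, and the choice of SQLite are all assumptions made for illustration rather than anything from the original script. It also uses prepared statements, and checks the column name against a list because column names cannot be bound as parameters:

```php
<?php
  // Sketch only: PDO and an in-memory SQLite table stand in for the MySQL
  // connection used above, so the example is self-contained.
  $db = new PDO('sqlite::memory:');
  $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
  $db->exec("CREATE TABLE list_of_clowns (Name CHAR(100) PRIMARY KEY, Age TINYINT, BestTrick CHAR(30))");
  $db->exec("INSERT INTO list_of_clowns VALUES ('Clinko', 29, 'Juggling')");

  class clown_pdo {
    public $Name;
    private $db;
    // Only these columns may be read or written through the magic functions
    private $columns = array('Age', 'BestTrick');

    public function __construct($db, $Name) {
      $this->db = $db;
      $this->Name = $Name;
    }

    public function __get($var) {
      if (!in_array($var, $this->columns, true)) {
        return null;
      }
      $query = $this->db->prepare("SELECT $var FROM list_of_clowns WHERE Name = ?");
      $query->execute(array($this->Name));
      $value = $query->fetchColumn();
      return ($value === false) ? null : $value;
    }

    public function __set($var, $val) {
      if (!in_array($var, $this->columns, true)) {
        return;
      }
      $query = $this->db->prepare("UPDATE list_of_clowns SET $var = ? WHERE Name = ?");
      $query->execute(array($val, $this->Name));
    }
  }

  $clinko = new clown_pdo($db, "Clinko");
  $clinko->Age = 30;   // goes through __set() and updates the table
  echo "My name is {$clinko->Name} and my age is {$clinko->Age}\n";
?>
```

The shape is the same as clowns.php: Name is a real variable, while everything else is fetched from or written to the table on demand.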


Calling the uncallable

In the same way that we have __get() and __set() to handle variables that don't exist, there is also __call() to handle calls to functions that don't exist. In many ways this is even more esoteric than __get() and __set(), and so it is correspondingly harder to find valid uses for unless you entirely grasp the concept.

Using __call() is slightly more tricky than __get() and __set() because functions are able to take a variable number of parameters as input. As a result, PHP hands you the function that was attempted as the first parameter and the parameters that would have been passed as the second parameter. Give it a try with this implementation of __call():

public function __call($function, $args) {
  $args = implode(',', $args);
  echo "Calling $function with $args\n";
}
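To try that out it needs to live inside a class. A minimal sketch follows, where the echo_car class and the Drive() and Park() calls are made up purely for the demonstration:

```php
<?php
  // Hypothetical class, used only to host the __call() shown above
  class echo_car {
    public function __call($function, $args) {
      // $args arrives as an array; join it up so it can be printed
      $args = implode(',', $args);
      echo "Calling $function with $args\n";
    }
  }

  $volvo = new echo_car();
  $volvo->Drive('fast', 70);   // prints "Calling Drive with fast,70"
  $volvo->Park();              // no arguments, so the list is empty
?>
```

Neither Drive() nor Park() is defined anywhere, yet both calls succeed because __call() catches them.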

You can now call any function you like on this object, and, if you haven't already defined it, __call() will be used. Of course, that isn't very helpful - we want to extend this so that using __call() will automatically take maximum advantage of our database communication system. One possible implementation is this:

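    public function __call($function, $args) {
      // This example insists on exactly two arguments: a description and a repeat count
      if (count($args) != 2) {
        echo "Wrong number of arguments passed to $function()!\n";
        return;
      }

      // The function name doubles as the column to read from the table
      $result = mysql_query("SELECT $function FROM list_of_clowns WHERE Name = '{$this->Name}';");
      if (!$result || !mysql_num_rows($result)) {
        return; // no such column or no such clown, so there is nothing to print
      }
      extract(mysql_fetch_array($result));

      // Print the value once for each repetition asked for in the second argument
      for ($i = 1; $i <= $args[1]; ++$i) {
        echo "{$this->Name} does his $args[0]: ${$function}!\n";
      }
    }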

This is somewhat complicated to look at, but it's simply designed to take a function name and a set of arguments, and try to make some sense of it. Here's how it might be used:

  $clinko->BestTrick('favourite trick', 1);
  $clinko->BestTrick('most favourite trick', 2);
  $clinko->FaveSong('favourite song', 2);

As you can see, we're using the undefined functions BestTrick() and FaveSong(), passing in two parameters each time. So, the first thing that __call() does is bail out unless we pass in exactly two parameters - this is because we're using it very strictly right now; this might not be a possibility in your situation. Naturally you'll need to add a FaveSong field to list_of_clowns so that the data can be selected back out.

What the __call() function above does is take the function name and use that to extract data from the database, use the first parameter as part of the information being outputted, and use the second parameter to decide how many times the database information should be echoed out.

So, using the three lines of BestTrick(), BestTrick(), and FaveSong(), we should get the following output:

My name is Clinko and my age is 29
Clinko does his favourite trick: Juggling!
Clinko does his most favourite trick: Juggling!
Clinko does his most favourite trick: Juggling!
Clinko does his favourite song: The rain in Spain!
Clinko does his favourite song: The rain in Spain!

A fanciful use of __call(), perhaps, but you should get the point.


Abstract art

I promised I'd cover abstract and final classes as well as static class variables, so we're going to cover this relatively simple area as quickly as possible before leaving OOP for good and moving onto the new XML topic. These three are much easier than __get(), __set(), and __call(), thankfully!

Abstract classes are another OOP concept that is there to help you fulfil code contracts. In this case, defining a class as abstract means that it cannot ever be instantiated directly - you can't create objects of this class. This might sound useless at first - after all, what's the point in defining a class you can't create? Well, consider that because OOP models classes after reality, there will be some classes that simply cannot exist if more specific options are available, but are used to define child objects. A class "person", for example, might define variables such as Name, Age, etc, and is the basis for classes "man" and "woman". This might lead to code like this:

$bob = new man;
$freda = new woman;
$jim = new person;

This would make $bob male, $freda female, and $jim... well, it would make $jim an "it" - a person without sex. This is of course not possible. The person class is important, because we extend it down to man and woman, but we don't want to be able to create a person directly. To do this, we declare person as being abstract, or noncreatable, like this:

abstract class person {
...
}

Now, attempting to create a person directly (that is, not a man or a woman) will result in a fatal error.
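Put together as a complete script, the idea might look something like the sketch below. The describe() function and the ages are invented for the example; only the person, man, and woman classes come from the discussion above:

```php
<?php
  // Abstract: holds what all people share, but cannot be created directly
  abstract class person {
    public $Name;
    public $Age;

    public function __construct($Name, $Age) {
      $this->Name = $Name;
      $this->Age = $Age;
    }

    // Hypothetical helper that every child class inherits
    public function describe() {
      return "{$this->Name} is {$this->Age}";
    }
  }

  class man extends person { }
  class woman extends person { }

  $bob = new man("Bob", 40);
  $freda = new woman("Freda", 35);
  echo $bob->describe(), "\n";
  echo $freda->describe(), "\n";

  // Uncommenting the next line stops the script, because person is abstract
  // $jim = new person("Jim", 50);
?>
```

Both man and woman pick up the variables and functions of person without repeating them, which is the whole reason for keeping a class around that can never be created itself.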

Moving on, you can also declare classes as "final", like this:

final class clown_db {
...
}

Using final on clown_db, we've declared that the clown_db class is our very final version of this class and that it cannot be extended further. Adding code like this would cause a fatal error:

class new_clown extends clown_db {
...
}

Again, this is designed to help you enforce your code contracts. Finally, we have static class variables, which are essentially a way of letting objects of the same class share values. Declaring a static class variable in your class is a matter of using the "static" keyword, but you can also declare a starting value for it, like this:

static public $ThingsDone = 0;

We can make this work by changing the __call() function loop to this:

echo "{$this->Name} does his $args[0]: ${$function}!\n";
++clown_db::$ThingsDone;

The second line there specifies that we increase the static variable $ThingsDone of class clown_db by one. Change the end of the script to this:

$clinko->BestTrick('favourite trick', 3);
$clinko->BestTrick('very favourite trick', 3);
$clinko->FaveSong('favourite song', 2);
echo "Actions done in total: ", clown_db::$ThingsDone, "\n";
$bobo = new clown_db("Bobo");
$bobo->BestTrick('favourite trick', 3);
echo "Actions done in total: ", clown_db::$ThingsDone, "\n";

That makes the script print out the static class variable after it has been changed by two clown_db objects - if you run the script, you should get 8 the first time, as Clinko did eight things, and 11 the second time, as Bobo did three more things. As you can see, both objects share the same variable, which is somewhat like having a simple global variable to handle it all but a great deal more precise.
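The same counting can be tried without a database at all. This is a cut-down sketch, with a made-up performer class and perform() function standing in for clown_db and __call():

```php
<?php
  // Hypothetical stand-in for clown_db that needs no database
  class performer {
    // Shared by every performer object, starting at zero
    static public $ThingsDone = 0;
    public $Name;

    public function __construct($Name) {
      $this->Name = $Name;
    }

    public function perform($times) {
      for ($i = 1; $i <= $times; ++$i) {
        ++performer::$ThingsDone;
      }
    }
  }

  $clinko = new performer("Clinko");
  $clinko->perform(3);
  $clinko->perform(3);
  $clinko->perform(2);
  echo "Actions done in total: ", performer::$ThingsDone, "\n";   // 8

  $bobo = new performer("Bobo");
  $bobo->perform(3);
  echo "Actions done in total: ", performer::$ThingsDone, "\n";   // 11
?>
```

Note that the count is read through the class name rather than through $clinko or $bobo, because the variable belongs to the class and not to any one object.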


Simple XML?

XML support before PHP 5 was somewhat tricky - you either used an event-based system or a DOM-based system, and neither was easy to learn or easy to use. PHP 5 comes to the rescue with the new SimpleXML extension, which is designed to convert XML files to usable PHP variables that you can treat like any other PHP variables. This sounds a lot more difficult than it actually is, and the best way to get into it is just to start programming!

Working with XML requires, unsurprisingly, an XML data file to load from. Here's the XML we'll be using, lxf50.xml:

<EMPLOYEES>
  <EMPLOYEE>
    <NAME>Sam</NAME>
    <AGE>24</AGE>
  </EMPLOYEE>
  <EMPLOYEE>
    <NAME>Jon</NAME>
    <AGE>33</AGE>
  </EMPLOYEE>
</EMPLOYEES>

Now, to demonstrate how SimpleXML will convert that to variables we can use, try this script out:

<?php
  $employees = simplexml_load_file('lxf50.xml');
  print_r($employees);
?>

Give that code a try, and you'll see that SimpleXML has converted all of our XML input file into PHP objects and arrays. Note that it has also kept the letter case information of each XML element, so it's EMPLOYEE and AGE not Employee, etc. This allows us to write code such as this:

<?php
  $employees = simplexml_load_file('lxf50.xml');
  $foo = $employees->EMPLOYEE[0];
  echo $foo->NAME;
?>

As you can see, EMPLOYEE and NAME are arrays and variables as you're used to them, so SimpleXML takes the tricky area of XML and brings it down to a level everyone can use with no retraining required. This simplicity can be extended to strings simply by changing simplexml_load_file() to simplexml_load_string(), which takes the XML string to convert as its parameter, and so is called like this:

  $employees = simplexml_load_string('<EMPLOYEES><EMPLOYEE><NAME>Sam</NAME><AGE>24</AGE></EMPLOYEE><EMPLOYEE><NAME>Jon</NAME><AGE>33</AGE></EMPLOYEE></EMPLOYEES>');

This is obviously a bit more difficult to read, particularly in your nicely ordered scripts, but it does mean you can piece together several bits of XML and put them into one SimpleXML object with a single call.
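Because repeated elements behave like an array, they can also be looped over with foreach. A short sketch, using simplexml_load_string() with the same employee data so that it doesn't depend on lxf50.xml being present:

```php
<?php
  // The same data as lxf50.xml, held in a string
  $xml = '<EMPLOYEES>'
       . '<EMPLOYEE><NAME>Sam</NAME><AGE>24</AGE></EMPLOYEE>'
       . '<EMPLOYEE><NAME>Jon</NAME><AGE>33</AGE></EMPLOYEE>'
       . '</EMPLOYEES>';

  $employees = simplexml_load_string($xml);

  // Each pass through the loop hands us one EMPLOYEE element
  foreach ($employees->EMPLOYEE as $employee) {
    echo $employee->NAME, " is ", $employee->AGE, "\n";
  }
?>
```

That should print "Sam is 24" followed by "Jon is 33".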

As you can see, SimpleXML is a great new way to get simple XML capabilities into your scripts. However, note that it is just simple stuff - element attributes, such as the TYPE in <ID TYPE="foo">bar</ID>, are available only through array-style access on the element (for example $id['TYPE']), and SimpleXML does nothing with more complicated things such as schemas or DTDs.


Conclusion

It took quite a long time to fully explain the new OOP features in PHP 5, and even now there are some that I've had to skip to make enough room to cover SimpleXML. However, you should have a solid enough grounding that the other parts will be relatively easy! The key advantage to OOP, apart from the obvious one of code re-use, is that it allows you to form code contracts with yourself and other programmers to ensure your classes and objects can only be used as you want them to be - this is a great way to help make your scripts more reliable and more predictable, which means you spend less time hacking, and more time hammering your friends in Unreal Tournament!


Class hints

Is PHP a typeless language or what?

One of the key benefits many programmers cite for writing PHP is that it is typeless - you can multiply a string by a string, or make a string up by adding two integers together. This approach makes learning the language easier, but on the flip side it does make administering and maintaining the code harder because, in other languages where variables are assigned types, the compiler will warn you about type misuse - a common sign of a programming error.

New in PHP 5 are "class hints", which allow you to specify what type of object must be passed into a function. For example, if we have a fish class and a submarine class, as well as a global function called fire_torpedo(), we certainly wouldn't want other programmers to try to make a fish fire a torpedo! In this instance, we'd specify that the function must accept a certain type of object, namely only submarines, like this:

function fire_torpedo(Submarine $sub) {
...
}

When this code is executed, PHP will check what goes into fire_torpedo(), and will error out if it's anything but a submarine object. Note that this check is done at run-time, when the function is actually called - your script will execute up to a point, then bail out if it passes in a bad object type. Also note that this check takes inheritance into account - if you specify that a person must be passed in, then pass in a man (a class extended from person), PHP won't complain.

You can do this same check yourself by using a new keyword in PHP 5, called "instanceof", which returns true if the variable on the left of the keyword is an object of the class on the right of the keyword, or of a class descended from it.
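Here is a small sketch of the check in action. It assumes PHP 7 or later, where a failed class hint throws a TypeError that can be caught (on PHP 5 the script would simply stop with an error instead), and the mini_sub class is invented to show a child class being accepted:

```php
<?php
  class fish { }
  class submarine { }
  class mini_sub extends submarine { }   // hypothetical child class

  // Only submarine objects, or objects of classes extended from it, get in
  function fire_torpedo(submarine $sub) {
    return "Torpedo away!";
  }

  echo fire_torpedo(new submarine), "\n";
  echo fire_torpedo(new mini_sub), "\n";

  try {
    fire_torpedo(new fish);
  } catch (TypeError $e) {
    // Assumes PHP 7+; the hint rejected the fish before the function ran
    echo "A fish cannot fire torpedoes\n";
  }
?>
```

The first two calls go through and the third is rejected at the moment it is made, which is the run-time behaviour described above.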

<?php
  class person { }
  class man extends person { }
  class cat { }
  $foo = new person;
  $bar = new man;
  $baz = new cat;
  echo $foo instanceof person;
  echo $bar instanceof person;
  echo $baz instanceof person;
?>

The output of that script should be "11" - two 1s put together, which means two trues, because $foo is a person (it's of class person), $bar is a person (it's of class man, which extends from person), and $baz isn't a person, because it's a cat, so the third echo prints nothing at all (false is printed as an empty string). Elementary!