PHP - MySQL 4.1

From LXF Wiki

Table of contents

Practical PHP Programming

(Original version written by Paul Hudson for LXF issue 63.)


And you thought Interfict was dead? We add the very last instalment...

Just kidding. In this tutorial we're going to be looking at MySQL 4.1, which has now been declared ready for production use. Despite being just a 0.1 upgrade from MySQL 4.0, there are lots of important differences, and at least one that will directly affect your PHP code no matter what. The headline new features in the release are:

  • Subqueries. The ability to run SELECT statements buried within other SELECTs.
  • Prepared statements. These allow you to replace parts of a query with question marks then fill them in later - great for queries you'll execute several times in a row.
  • A new binary client/server protocol.
  • A new HELP command to provide server-side SQL assistance.

Yes, there are more big features new in MySQL 4.1 than were added in MySQL 4.0, but the 4.0 release was largely about groundwork in preparation for this release and the others coming soon. MySQL 5.0 is already in a test stage, and 5.1 is looming over the horizon - both of which bring in even more big features. This is an exciting time for MySQL users, and in turn PHP users, so it's best to get to grips with it sooner rather than later.


Sinking the subqueries

It's a common problem: you want to run an SQL query, grab the results, then put them into another query. Usually that gives you code like this:

$lakes = array(-1);
$result = mysql_query("SELECT ID FROM lakes WHERE HasDucks = true;");
while ($r = mysql_fetch_assoc($result)) {
  $lakes[] = $r['ID'];
}

$result = mysql_query("SELECT * FROM FROM fishermen WHERE lakes IN ($lakes);");

The first query pulls up the IDs of all lakes that have ducks, and puts those numbers into an array. This array is then used to find out the details of fishermen who fish in those lakes. Simple enough, right? Problem is, it's a performance vortex: MySQL can optimise the queries individually, but we loop through the first result, extract its data, put it into an array, then put that array back through another SQL query.

Ideally what we want is one SQL query that eliminates all that code, and it just so happens that subqueries - often referred to as subselects - are perfect for this task. A traditional query takes its SQL and outputs its results, but subqueries allow one SQL query to return its results into another SQL query.

Here's our mini-script rewritten using a subquery:

$result = mysql_query("SELECT * FROM fisherman WHERE lakes IN (SELECT ID FROM lakes WHERE HasDucks = true);");

That one line does the same as the six previous lines, which should give you an idea of how powerful subqueries are! What's more, you can actually use subqueries within subqueries if you want to - although it might start to get confusing!

There are downsides to subqueries, of which the most pertinent is that the current MySQL query optimiser doesn't handle them too well. If you recall, the query optimiser works by storing the result of a query along with the text of the query itself. When new queries come in, they are compared byte-by-byte with the set of cached queries, and if there is a match then the cached result gets returned.

Take a look at these three:

SELECT * FROM foo ORDER BY ID ASC;
SELECT * FROM foo ORDER BY ID DESC;
SELECT * FROM foo ORDER BY ID asc;

That first query will execute and its results will be cached. That second query will come and, while we'd like to think the optimiser will say "hey, I can just flip the sort order of the previous query and return that", it won't - the two strings don't match, and the query will be performed from scratch and recached. The last query will come in, and also result in a cache miss, this time because the "asc" is in lower case whereas the cached version is in uppercase - d'oh!

Now consider how that applies to subqueries. What we really want is to have the inner query ("SELECT ID FROM lakes..." in the example above) cached, and have its cached results put into the outer query ("SELECT * FROM fisherman...") and be cached again. This isn't currently possible in MySQL because it just does a blanket byte-by-byte comparison of the entire query. So, the moral to the story is this: use subqueries all you like, just be careful!


May day!

Before we move onto looking at the new client/server protocol and prepared statements, I want to run you over the new help system just quickly. Have you ever encountered the situation where you type a query that looks valid, only to have it rejected by the server saying, "check the manual that corresponds to your MySQL server version for the right syntax to use"? This is a nice idea in theory, but in practice the documentation you end up checking is the online version that is constantly updated for the latest release and doesn't help much. Instead, you're left to wonder which part of your query is invalid for your server version, often using trial and error to track it down.

New with MySQL 4.1 is server-side help documentation that contains a concise version of the online manual specific to the version you're working with. To get started, type "help contents" - you'll see categories like Administration, Column Types, Data Definition, etc. You can read any of these by just typing them in, eg "help administration".

Although this is hardly a leap forward in imagination for the developers, it does make life a great deal easier - particularly if you're new to 4.1.


MySQLi

Okay, this is the biggest change of all: the client/server protocol has changed. Yes, that sounds pretty insignificant and not really worth worrying about, but it's absolutely crucial you understand what's happened.

  • MySQL 4.0 and earlier clients cannot connect to 4.1 servers
  • The mysql_* functions in PHP are designed for the MySQL 4.0 and earlier client libraries. These will not work with a 4.1 server.
  • There's a new extension in PHP 5, called mysqli, that has functions for working with the new 4.1 servers
  • The new protocol is binary, compressed and secure, which makes it much faster and safer than previously. This is a good thing.

The transition from the MySQL extension to the MySQLi is not going to be an easy one, unless you happen to use PEAR::DB. If migration was as simple as renaming mysql_query() to mysqli_query() throughout then there would be no need for me to write this! Instead, it's just a smidge harder: mysqli usually prefers you to provide a database connection variable with its functions. Other than /that/, it largely /is/ just about function renaming, however prepared statements are new and distinct from what you'll already know.

The best way to demonstrate MySQLi is just to show you some code written using the MySQL extension, then show the same thing again with the MySQLi extension. Here goes:

mysql_connect("localhost", "db", "alm65z");
mysql_select_db("gardening");
$result = mysql_query("SELECT ID, Model, Price FROM lawnmowers WHERE Manufacturer = 'John Deere';");
if (mysql_error()) {
  echo "There was an error: ", mysql_error();
} else {
  if (mysql_num_rows($result)) {
    while ($r = mysql_fetch_assoc($result)) {
      // process the mowers here
    }
  }
}

Now, we can do a straight port of that to MySQLi like this:

$db = mysqli_connect("localhost", "db", "alm65z");
mysqli_select_db($db, "gardening");
$result = mysqli_query($db, "SELECT ID, Model, Price FROM lawnmowers WHERE Manufacturer = 'John Deere';");
if (mysql_error($db)) {
  echo "There was an error: ", mysql_error($db);
} else {
  if (mysql_num_rows($result)) {
    while ($r = mysql_fetch_assoc($result)) {
      // process the mowers here
    }
  }
}

So, aside from just renaming functions, note that I've had to grab the return value from mysqli_connect() and use it as the first parameter to mysqli_select_db() and mysqli_query(). However, that is a "straight" port - a direct translation of MySQL to MySQLi. We can make one further change quite easily:

$db = mysqli_connect("localhost", "db", "alm65z", "gardening");

That new line shows how the mysqli_connect() function lets you specify the database name to work with as the fourth parameter - this required a separate function call in MySQL 4.0. While you can still use mysqli_select_db(), it has already been deprecated a little - it is now only recommended for /changing/ the database rather than selecting it initially.


Be prepared!

As any good Scout will tell you, it's always good to be prepared - a principle that applies to much more than needing to light a fire out in the wilderness. Parsing an SQL statement is no easy task: they can be complicated things, and if you're running many of the same type MySQL needs to keep on parsing the same text before executing it. What prepared statements allow you to do is define an SQL query with some parts missing, have MySQL process it and pre-optimise it, then supply the unique parts later. This is a wholly new concept for some people, and it uses a few new functions, however prepared statements are a much smarter and faster way of working so it's recommended you get used to them!

If you're stuck on the concept, let me give you an example. Imagine there are two hot dog salesmen at a sports game. There's a break in the game, and people rush to buy hot dogs from their stalls. The first salesman waits for each customer to arrive, takes their order, and puts a fresh sausage on to cook for them. When the sausage is finished, he takes it off the grill, adds condiments, and serves. Over on the other side of the stadium, the other hot dog salesman pre-cooked his hot dogs, which means that as each customer arrives he just takes one off the grill and adds condiments. Which one do you think sells more hot dogs?

Duh, the second one of course, so why don't we do the same with our SQL queries? No, I don't want you to pour ketchup onto them - I'm talking about preparing statements that will get used several times, and only adding the condiments (the bits that need to be unique) as needed.

The new functions you need to learn are:

  • mysqli_prepare(). This takes a database connection and an unfilled query as its only parameters. You need to specify blanks in the query using a question mark - these blanks need to be filled in later. You also need to store the return value of this function, as it is the handle to your prepared statement
  • mysqli_stmt_bind_param(), This takes your prepared statement as its first parameter, then the types of variables you want to include, and finally the variables themselves. This works a little bit like printf() - more soon.
  • mysqli_stmt_execute(). As you might imagine, this executes a prepared statement.
  • mysqli_bind_result(). Not to be confused with mysqli_bind_param(), this tells PHP where to place the results from your query.
  • mysqli_stmt_fetch(). This actually takes the results from your query and places them into your variables.

Of those, only really mysqli_stmt_bind_param() needs explaining beyond just showing you some code, because it takes a variable number of parameters. Parameter two should be a string of letters that represent what variables you are binding to your prepared statement - give it an "i" for an integer, a "d" for a double (floating-point number), or an "s" for a string. You can bind multiple parameters just by adding more letters - "idddsd" would mean "I want to bind an integer, three doubles, a string, and another double".

You provide these variables as extra parameters to the function - specifying idddsd would mean that mysqli_stmt_bind_param() gets called with eight parameters in total (the statement, the idddsd string, and the six variable names to bind). What you are binding here is the names of the variables to be used in the query, and not their values - those are set later. The same goes for mysqli_stmt_bind_result() - it takes as many parameters as you have fields in your query result, and then later fills these variables with your result data.


Show me the code!

Here's some code showing off prepared statements:

<?php
  $db = mysqli_connect("localhost", "db", "alm65z", "sciuridae");
  $statement = mysqli_prepare($db, "SELECT Name, Colour FROM squirrels WHERE TailLength = ?;");
  mysqli_stmt_bind_param($statement, "i", $i);
  mysqli_stmt_bind_result($statement, $squirrel_name, $squirrel_colour);

  for ($i = 5; $i <= 30;  ++$i) {
    mysqli_stmt_execute($statement);
    while (mysqli_stmt_fetch($statement)) {
      $length = $i . "cm";
      echo "$squirrel_name ($length) is $squirrel_colour\n";	
    }
  }
?>

The first line calls mysqli_connect(), which you know. The next line sets up our prepared statement by passing in an SQL query with a question mark - this is the variable we plan to replace later. Be sure not to surround your question mark with quotes or anything else, because if MySQL finds anything other than a vanilla question mark it won't replace it. The return value from mysqli_connect() gets stored in $statement, as it is used several times later.

With a call to mysqli_stmt_bind_param(), the code tells MySQL to look in the $i variable for an integer to complete the prepared query. Note that at this time $i is not set - we've told PHP /where/ to look, but it won't actually look in there until we execute the prepared statement. The next line sets up $squirrel_name and $squirrel_colour to store the fields from the MySQL result.

Next is the main loop. At this point we have our prepared statement all ready to go, we just need to start calling it - that's what the loop does. As you can see, it counts from 5 to 30, executing the query each time. As we've bound $i to the question mark in our statement, MySQL will replace it and execute the completed query, like this:

SELECT Name, Colour FROM squirrels WHERE TailLength = 5;
SELECT Name, Colour FROM squirrels WHERE TailLength = 6;
SELECT Name, Colour FROM squirrels WHERE TailLength = 7;
...
SELECT Name, Colour FROM squirrels WHERE TailLength = 29;
SELECT Name, Colour FROM squirrels WHERE TailLength = 30;

Yes, that's not the ideal way of just printing out a list of squirrels, but it should give you an idea of how much work is saved by using prepared statements!

Once the statement has been executed, mysqli_stmt_fetch() is used to get the results. Note that you just pass in the statement handle and it automatically puts the fields from the current row into the result variables bound earlier. This is inside a while loop because it will carry on returning rows as long as there are any left, and return when it has reached the end. If you don't call fetch() before executing the statement again, the previous rows get overwritten with the new rows.

Prepared statements are very flexible - much more so than we have space to print here. You can have several question marks in your queries, you can use INSERT, UPDATE, or DELETE queries as well as SELECTs, you can mix in subqueries and other complex types of SELECTs, and a lot more - and they all get the same speed boost.


Famous last words

Before I leave you to run riot with MySQL 4.1, there are three more things I want briefly cover: multiple queries, transactions, and object orientation. In both MySQL and MySQLi, the standard query function executes only one SQL statement. Consider this code:

$sql = "DELETE FROM clowns WHERE Name = 'Bobo'; DELETE FROM clowns WHERE Name = 'Dooley';
DELETE FROM clowns WHERE Name = '$delete_name';
mysqli_query($db, $sql);

Although there are three SQL statements in there (bonus points if you know where the clown name Dooley comes from!), MySQL will only execute the first one. This is a security precaution, as it stops people running SQL injection attacks such as doing this:

http://www.example.com/delete_clowns.php?delete_name='; UPDATE admin SET Password = 'abc123';

SQL injection attacks like that one try to force unwanted commands into your code - there someone is trying to force the $delete_name variable to actually blank out the administrator password for your database. By only allowing the first query to be executed, mysqli_query() (and also mysql_query()) are protecting you.

However, in the new MySQLi extension there's a new function called mysqli_multi_query() that, you guessed it, allows you to run multiple queries from the same call. This can provide quite a speed boost, but I hope that now I've explained the dangers you'll realise it's best to stay clear!

Next, if you've ever used the InnoDB table handler with MySQL you'll know it's a powerful way to extend MySQL into the enterprise. However, the previous way to commit and rollback transactions was to just pass in the command through mysql_query() - this has all changed now. Transactions are now handled through mysqli_commit() and mysqli_rollback() functions, like this:

mysqli_autocommit($db, false); // disable automatic transaction commits
mysqli_commit($db); // commit changes
mysqli_rollback($db); // rollback changes

Finally, if you're really into your OOP you can try using the object-oriented side of MySQLi. To get started, use this line:

$mysqli = new mysqli("localhost", "db", "alm65z", "some_database");

You can then use it to call the same functions we've been looking at here, just without the $db parameter everywhere - it stores that internally.

Happy PHPing!