A PHP V5 migration guide

Make PHP V4 applications take advantage of V5's object-oriented features

Level: Intermediate

Jack D. Herrington (jherr@pobox.com), Senior Software Engineer, Leverage Software

26 Sep 2006

With the new language features of PHP V5, you can significantly improve your code's maintainability and stability. Learn how to migrate code developed in PHP V4 to V5 while taking advantage of these new features.

PHP V5 is a major step up from V4. The new language features make building and maintaining reliable class libraries much easier. In addition, the reworking of the standard libraries helped bring PHP more in line with its cousin Web languages, such as the Java™ programming language. Take a tour through some of PHP's new object-oriented features and learn how to migrate existing PHP V4 code into V5.

First, a bit about the new language features and how PHP's creators have changed the approach to objects from PHP V4. The idea with V5 was to create an industrial-strength language for Web application development. That meant understanding the limitations of PHP V4, then pulling known good language constructs from other languages (such as the Java, C#, C++, Ruby, and Perl languages) and incorporating them into PHP.

The first and most important addition was access protection for methods and instance variables on classes -- the public, protected, and private keywords. This addition allows class designers to retain control over the internals of their classes while telling clients of the class what they should or should not touch.

In PHP V4, everything was public. In PHP V5, class designers can say what is externally visible (public) and what is visible only internally to the class (private) or to descendants of the class (protected). Without these access controls, working on code in large groups or distributing code as libraries was hindered because consumers of those classes could easily use the wrong methods or access what should have been private member variables.

Another big addition was the pair of keywords interface and abstract, which allow for contract programming. Contract programming means that one class provides a contract to another -- in other words: "Here is what I will do, and you don't need to know how it's done." Any class that implements that interface agrees to the contract. Any consumer of the interface agrees to use only the methods specified in the interface. The abstract keyword makes using interfaces a lot easier, as I show later.

These two primary features -- access control and contract programming -- allow much larger teams of coders to work on much larger code bases more smoothly. They also allow IDEs to provide a much richer set of language intelligence features. Although this article covers several migration issues, I also spend a lot of time showing how to use these new key language features.

Access control

To demonstrate the new language features, I use a class called Configuration. This simple class holds configurable items for a Web application -- for example, the path to the images directory. Ideally, this information resides in a file or a database. Listing 1 shows a simplified version.


Listing 1. access.php4

<?php
class Configuration
{
  var $_items = array();

  function Configuration() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function get( $key ) {
    return $this->_items[ $key ];
  }
}

$c = new Configuration();
echo( $c->get( 'imgpath' )."\n" );
?>

This is a perfectly legitimate PHP V4 class. A member variable holds the list of configuration items, a constructor loads the items, and an accessor method called get() returns the value of an item.

When I run the script, this output appears on the command line:

% php access.php4
images
%

Great. It means that the code is working properly and that the value of the imgpath configuration item is being set and read properly.

The first step in converting this class to PHP V5 is to rename the constructor. In V5, the method that initializes the object (the constructor) is called __construct. This small change is shown below.


Listing 2. access1.php5

<?php
class Configuration
{
  var $_items = array();

  function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function get( $key ) {
    return $this->_items[ $key ];
  }
}

$c = new Configuration();
echo( $c->get( 'imgpath' )."\n" );
?>

This change isn't a big deal. It's just moving to the PHP V5 convention. The next step is to add the access controls to the class to ensure that clients of the class can't read or write the $_items member variable directly. This change is shown below.


Listing 3. access2.php5

<?php
class Configuration
{
  private $_items = array();

  public function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

$c = new Configuration();
echo( $c->get( 'imgpath' )."\n" );
?>

If a client of this object were to access the items array directly, PHP would deny that access and raise a fatal error because the array is marked private. With luck, the client would realize that the get() method provides the sought-after read access.
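One way to confirm a member's visibility without triggering that error is PHP's reflection API. The following is a minimal sketch rather than one of the original listings: it repeats the Configuration class from Listing 3 and asks the built-in ReflectionProperty class how $_items was declared. The offending direct access is left commented out.

```php
<?php
class Configuration
{
  private $_items = array();

  public function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

$c = new Configuration();

// Direct access such as the line below is rejected by PHP because the
// member is private, so it stays commented out in this sketch.
// echo( $c->_items[ 'imgpath' ] );

// Reflection reports the declared visibility without reading the value.
$prop = new ReflectionProperty( 'Configuration', '_items' );
$visibility = $prop->isPrivate() ? 'private' : 'not private';
echo( '_items is '.$visibility."\n" );

// The public accessor remains the supported way in.
echo( $c->get( 'imgpath' )."\n" );
?>
```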

To show how to use protected access, I need a second class that inherits from the Configuration class. I call that class DBConfiguration and pretend that it reads the configuration values from a database. This setup is shown below.


Listing 4. access3.php5

<?php
class Configuration
{
  protected $_items = array();

  public function __construct() {
    $this->load();
  }
  protected function load() { }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

class DBConfiguration extends Configuration
{
  protected function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}

$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>

This listing shows the proper use of the protected keyword. The base class defines a method called load(). Descendants of this class override the load() method to add data to the items array. The load() method is internal to the class and its descendants, so it's invisible to any external clients. If load() were declared private instead, descendants couldn't override it.

I'm not completely happy with this design, however, because I've had to make the items array accessible to the DBConfiguration class. I'd rather have the Configuration class continue to maintain the items array completely so that if I add other descendants, those classes won't need to know how to maintain the items array. I make that change below.


Listing 5. access4.php5

<?php
class Configuration
{
  private $_items = array();

  public function __construct() {
    $this->load();
  }
  protected function load() { }
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

class DBConfiguration extends Configuration
{
  protected function load() {
    $this->add( 'imgpath', 'images' );
  }
}

$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>

Now, the items array can be private because the descendant classes use the protected add() method to add configuration items to the list. The Configuration class can change how it stores and reads configuration items without worrying about its descendant classes. As long as the load() and add() methods work in the same way, the descendants should be fine.
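To illustrate that point, here is a sketch of a second descendant. The ArrayConfiguration class and its constructor argument are hypothetical, introduced only for this example. It relies on nothing but the protected load() and add() methods, so it never needs to know how the base class stores its items.

```php
<?php
class Configuration
{
  private $_items = array();

  public function __construct() {
    $this->load();
  }
  protected function load() { }
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

// Hypothetical descendant: takes its values from a plain PHP array.
class ArrayConfiguration extends Configuration
{
  private $_source;

  public function __construct( $source ) {
    // Store the source first, because the parent constructor calls load().
    $this->_source = $source;
    parent::__construct();
  }
  protected function load() {
    foreach( $this->_source as $key => $value ) {
      $this->add( $key, $value );
    }
  }
}

$c = new ArrayConfiguration( array( 'imgpath' => 'images', 'csspath' => 'css' ) );
echo( $c->get( 'csspath' )."\n" );
?>
```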

To me, the addition of access controls is the primary reason to consider moving to PHP V5. Is it because Grady Booch says it's one of the four pillars of object orientation? Nah. It's because I was once tasked with maintaining 100KLOC of C++ code, in which all methods and members were defined as public. I spent three days cleaning it up, and in the process, significantly reduced the bug count and increased the maintainability. Why? Because without access controls, it's impossible to know how the objects are using each other. And it's impossible to make changes without knowing what else will break. With C++, at least I had the compiler on my side. PHP doesn't have a compiler, so this type of access control becomes even more important.




Contract programming

The next important feature to take advantage of when migrating from PHP V4 to V5 is support for contract programming through interfaces and abstract classes and methods. Listing 6 shows a version of the Configuration class in which the PHP V4 coder has tried to build a rudimentary interface even without the interface keyword.


Listing 6. interface.php4

<?php
class IConfiguration
{
  function get( $key ) { }
}

class Configuration extends IConfiguration
{
  var $_items = array();

  function Configuration() {
    $this->load();
  }
  function load() { }
  function get( $key ) {
    return $this->_items[ $key ];
  }
}

class DBConfiguration extends Configuration
{
  function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}

$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>

The listing starts with the small IConfiguration class, which defines the interface to be presented by any Configuration class or derived class. This interface defines the contract between the class and any of its clients. The contract states that all classes that implement IConfiguration must have a get() method and that any clients of IConfiguration must stick to using just that get() method.

This code works in PHP V5, but it's better to use the interface system provided, as shown below.


Listing 7. interface1.php5

<?php
interface IConfiguration
{
  function get( $key );
}

class Configuration implements IConfiguration
{
  ...
}

class DBConfiguration extends Configuration
{
  ...
}

$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>

For one thing, it's clearer to the reader what's going on. For another, a single class can implement multiple interfaces. Listing 8 shows how to extend the Configuration class to implement the Iterator interface, which is internal to PHP.


Listing 8. interface2.php5

<?php
interface IConfiguration {
  ...
}

class Configuration implements IConfiguration, Iterator
{
  private $_items = array();

  public function __construct() {
    $this->load();
  }
  protected function load() { }
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }

  public function rewind() { reset($this->_items); }
  public function current() { return current($this->_items); }
  public function key() { return key($this->_items); }
  public function next() { next($this->_items); }
  // Test the key, not the value, so a stored false doesn't end the loop early.
  public function valid() { return ( key($this->_items) !== null ); }
}

class DBConfiguration extends Configuration {
  ...
}

$c = new DBConfiguration();
foreach( $c as $k => $v ) { echo( $k." = ".$v."\n" ); }
?>

The Iterator interface lets clients traverse an object as if it were an array. As you can see at the end of the script, the foreach construct iterates through all the configuration items in the Configuration object. This functionality wasn't available in PHP V4, and you can use it in a wide variety of ways in your applications.
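One detail worth checking in any Iterator implementation is valid(). The sketch below uses a hypothetical FlagList class, not part of the original listings, to show why testing key() against null is safer than testing current() against false: a stored value of false would otherwise end the loop early. The #[\ReturnTypeWillChange] lines are an assumption about the runtime; they only matter on newer PHP engines and are read as ordinary comments by PHP V5.

```php
<?php
// Hypothetical class used only for this illustration.
class FlagList implements Iterator
{
  private $_items = array();

  public function __construct( $items ) {
    $this->_items = $items;
  }

  // The "#[...]" lines are attributes on PHP V8 and plain comments on
  // earlier versions; they silence return-type notices on newer engines.
  #[\ReturnTypeWillChange]
  public function rewind() { reset( $this->_items ); }
  #[\ReturnTypeWillChange]
  public function current() { return current( $this->_items ); }
  #[\ReturnTypeWillChange]
  public function key() { return key( $this->_items ); }
  #[\ReturnTypeWillChange]
  public function next() { next( $this->_items ); }
  // key() returns null once the internal pointer runs past the end.
  #[\ReturnTypeWillChange]
  public function valid() { return ( key( $this->_items ) !== null ); }
}

$flags = new FlagList( array( 'debug' => false, 'cache' => true ) );
$seen = array();
foreach( $flags as $k => $v ) {
  $seen[] = $k;
  echo( $k." = ".( $v ? 'true' : 'false' )."\n" );
}
?>
```

Both entries are visited even though the first value is false.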

The advantage of the interface mechanism is that you can put together a contract quickly without implementing any of the methods. The downside is that to implement an interface, you must implement all the specified methods. Another helpful addition to PHP V5 is the abstract class, which makes it easy to have one base class to implement the core of an interface from which you then create concrete classes.
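As a rough sketch of that idea, the following abstract base class implements the whole contract once and leaves only the data loading to its descendants. The BaseConfiguration name and the has() method are hypothetical, added here so the interface has more than one method to implement.

```php
<?php
interface IConfiguration
{
  function get( $key );
  function has( $key );   // hypothetical second method for this sketch
}

// Implements the whole contract once; descendants only supply the data.
abstract class BaseConfiguration implements IConfiguration
{
  private $_items = array();

  public function __construct() {
    $this->load();
  }
  abstract protected function load();
  protected function add( $key, $value ) {
    $this->_items[ $key ] = $value;
  }
  public function get( $key ) {
    return $this->_items[ $key ];
  }
  public function has( $key ) {
    return array_key_exists( $key, $this->_items );
  }
}

class DBConfiguration extends BaseConfiguration
{
  protected function load() {
    $this->add( 'imgpath', 'images' );
  }
}

$c = new DBConfiguration();
echo( ( $c instanceof IConfiguration ) ? "implements IConfiguration\n" : "no contract\n" );
echo( $c->has( 'imgpath' ) ? $c->get( 'imgpath' )."\n" : "missing\n" );
?>
```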

Another use of abstract classes is to create a single base class for multiple derived classes in which that base class should never be instantiated. For example, when DBConfiguration and Configuration are both present, only DBConfiguration should be used. The Configuration class is simply a base class -- an abstract class. So, you can use the abstract keyword to enforce that behavior, as shown below.


Listing 9. abstract.php5

<?php
abstract class Configuration
{
  protected $_items = array();

  public function __construct() {
    $this->load();
  }
  abstract protected function load();
  public function get( $key ) {
    return $this->_items[ $key ];
  }
}

class DBConfiguration extends Configuration
{
  protected function load() {
    $this->_items[ 'imgpath' ] = 'images';
  }
}

$c = new DBConfiguration();
echo( $c->get( 'imgpath' )."\n" );
?>

Now, any attempt to instantiate an object of type Configuration results in an error because the class is considered abstract and incomplete.
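To check this without stopping the script, you can ask the reflection API whether a class can be instantiated. This is a small sketch that repeats the classes from Listing 9 in compact form and uses the built-in ReflectionClass; the $report array is just a name chosen for this example.

```php
<?php
abstract class Configuration
{
  protected $_items = array();

  public function __construct() { $this->load(); }
  abstract protected function load();
  public function get( $key ) { return $this->_items[ $key ]; }
}

class DBConfiguration extends Configuration
{
  protected function load() { $this->_items[ 'imgpath' ] = 'images'; }
}

// $c = new Configuration();  // rejected by PHP: the class is abstract

// Ask reflection which of the two classes can be created with new.
$report = array();
foreach( array( 'Configuration', 'DBConfiguration' ) as $name ) {
  $rc = new ReflectionClass( $name );
  $report[ $name ] = $rc->isInstantiable();
  echo( $name.': '.( $report[ $name ] ? 'instantiable' : 'not instantiable' )."\n" );
}
?>
```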





Static methods and members

Another important addition to PHP V5 is support for static members and methods on classes. Through this functionality, you can use the popular singleton pattern. This pattern is ideal for the Configuration class because there should only be one configuration object for the application.

Listing 10 shows the PHP V5 version of the Configuration class as a singleton.


Listing 10. static.php5

<?php
class Configuration
{
  private $_items = array();

  static private $_instance = null;
  static public function get() {
    if ( self::$_instance === null ) {
      self::$_instance = new Configuration();
    }
    return self::$_instance;
  }

  private function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  public function __get( $key ) {
    return $this->_items[ $key ];
  }
}

echo( Configuration::get()->{ 'imgpath' }."\n" );
?>

The static keyword has a lot of uses. Think of it anytime you require access to some global data for all objects of a single type.
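As one more illustration of data shared by all objects of a type, here is a sketch of an instance counter. The Connection class and its method names are hypothetical and exist only for this example.

```php
<?php
// Hypothetical class for illustration only.
class Connection
{
  // One counter shared by every Connection object.
  static private $_count = 0;
  private $_id;

  public function __construct() {
    self::$_count++;
    $this->_id = self::$_count;
  }
  public function getId() {
    return $this->_id;
  }
  static public function getCount() {
    return self::$_count;
  }
}

$a = new Connection();
$b = new Connection();
echo( Connection::getCount()."\n" );
?>
```

Each object keeps its own $_id, while $_count belongs to the class and is reached through self:: inside it and through the class name outside it.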





Magic methods

Another huge addition to PHP V5 is the support for magic methods, which allow objects to change their interface on the fly -- for example, to add member variables for each configuration item in the Configuration object. Instead of calling the get() method, you simply ask for a particular item as if it were a member variable, as shown below.


Listing 11. magic.php5

<?php
class Configuration
{
  private $_items = array();

  function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function __get( $key ) {
    return $this->_items[ $key ];
  }
}

$c = new Configuration();
echo( $c->{ 'imgpath' }."\n" );
?>

In this case, I create a new __get() method, which is called whenever the client reads a member variable that isn't defined or isn't accessible on the object. The code in the method then uses the items array to look up the value and returns it as if there had been a member variable with that name. At the bottom of the script, you can see that using the Configuration object is as simple as asking for the value of imgpath as if it were a member variable of the object.
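A companion to __get() is __isset(), available from PHP V5.1, which lets isset() work on these virtual member variables. The sketch below is an extension of Listing 11 under my own assumptions: returning null for an unknown key is a design choice made for this example, not something the original class does.

```php
<?php
class Configuration
{
  private $_items = array();

  function __construct() {
    $this->_items[ 'imgpath' ] = 'images';
  }
  function __get( $key ) {
    // Return null for unknown keys instead of raising an undefined-index notice.
    return isset( $this->_items[ $key ] ) ? $this->_items[ $key ] : null;
  }
  function __isset( $key ) {
    // Called when a client runs isset() on an undefined member variable.
    return isset( $this->_items[ $key ] );
  }
}

$c = new Configuration();
echo( $c->imgpath."\n" );
echo( isset( $c->csspath ) ? "csspath set\n" : "csspath not set\n" );
?>
```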

When migrating from PHP V4 to V5, it's important to look at language features like these, which were wholly unavailable in V4, and to re-examine your classes to see how you can use them.





Exceptions

I wrap up this article by looking at the new exception mechanism in PHP V5. Exceptions provide a whole new way of thinking about error handling. All programs inevitably generate errors -- files not found, running out of memory, etc. Without exceptions, you had to return an error code. Look at the PHP V4 code below.


Listing 12. file.php4

<?php
function parseLine( $l )
{
   // ...
   return array( 'error' => 0,
     'data' => array() // data here
   );
}

function readConfig( $path )
{
  if ( $path == null ) return -1;
  $fh = fopen( $path, 'r' );
  if ( $fh === false ) return -2; // fopen() returns false on failure

  while( !feof( $fh ) ) {
    $l = fgets( $fh );
    $ec = parseLine( $l );
    if ( $ec['error'] != 0 ) return $ec['error'];
  }

  fclose( $fh );
  return 0;
}

$e = readConfig( 'myconfig.txt' );
if ( $e != 0 )
  echo( "There was an error (".$e.")\n" );
?>

This standard file I/O code reads a file, retrieves some data, and returns an error code if it encounters any problems. I have two issues with this script. The first is the error codes. What do they mean? To find out, you must create a second system to map these error codes to meaningful strings. The second problem is that the return from parseLine is complex. I would like it simply to return the data, but it really has to return an error code and the data. All too often, engineers (including myself) get lazy, return just the data, and ignore the error because it's difficult to manage.
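To make the first problem concrete, here is a sketch of the kind of second system the error codes force on you. The $errorMessages table and the describeError() function are hypothetical; the codes are the ones returned by readConfig() in Listing 12.

```php
<?php
// Hypothetical lookup table for the codes returned by readConfig().
$errorMessages = array(
  -1 => 'bad argument',
  -2 => 'could not open file'
);

// Translate a numeric code into text, with a fallback for unmapped codes.
function describeError( $code, $messages )
{
  return isset( $messages[ $code ] ) ? $messages[ $code ] : 'unknown error ('.$code.')';
}

echo( describeError( -2, $errorMessages )."\n" );
echo( describeError( -7, $errorMessages )."\n" );
?>
```

The table has to be kept in step with every function that returns a code, which is exactly the kind of bookkeeping that tends to be skipped.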

Listing 13 shows how much cleaner the code is when you use exceptions.


Listing 13. file.php5

<?php
function parseLine( $l )
{
   // Parses and throws an exception when invalid
   return array(); // data
}

function readConfig( $path )
{
  if ( $path == null )
    throw new Exception( 'bad argument' );

  $fh = fopen( $path, 'r' );
  if ( $fh === false )
    throw new Exception( 'could not open file' );

  while( !feof( $fh ) ) {
    $l = fgets( $fh );
    $data = parseLine( $l ); // returns only the data; errors arrive as exceptions
  }

  fclose( $fh );
}

try { 
  readConfig( 'myconfig.txt' );
} catch( Exception $e ) {
  echo( $e );
}
?>

I don't have to worry about error codes because the exception carries the explanatory text of what went wrong. I also don't have to worry about tacking on an error code in the return from parseLine because if a problem occurs, that function simply throws an exception. The stack unwinds to the nearest try/catch block, which is at the bottom of the script.
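Since parseLine() in Listing 13 is only a stub, here is a fuller sketch of a function that throws from inside a loop. Everything specific in it is an assumption made for illustration: the ConfigException class, the key=value line format, and the parseLines() helper, which works on an array of strings so the example runs without a file.

```php
<?php
// Hypothetical exception type and "key=value" line format for this sketch.
class ConfigException extends Exception { }

function parseLine( $l )
{
  $parts = explode( '=', trim( $l ), 2 );
  if ( count( $parts ) != 2 )
    throw new ConfigException( 'invalid line: '.trim( $l ) );
  return array( $parts[0] => $parts[1] );
}

function parseLines( $lines )
{
  $items = array();
  foreach( $lines as $l ) {
    // No error code to check: a bad line leaves this loop by exception.
    $items = array_merge( $items, parseLine( $l ) );
  }
  return $items;
}

try {
  $items = parseLines( array( "imgpath=images\n", "oops\n" ) );
} catch( ConfigException $e ) {
  $message = $e->getMessage();
  echo( $message."\n" );
}
?>
```

Catching a specific subclass lets the caller handle configuration problems separately from other failures, while getMessage() supplies the text that an error-code table would otherwise have to provide.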

Exceptions will revolutionize the way you write code. Instead of having to manage painful error codes and mappings, you can concentrate on the problem at hand. The code is easier to read, easier to maintain, and (I would say) even encourages you to add error handling, which is always a benefit.





Conclusion

The new object-oriented features and the addition of exception handling provide compelling reasons for migrating your code from PHP V4 to V5. As you've seen, the upgrades are not that difficult. The syntax of the extensions to PHP V5 feels like PHP. Yes, they come from languages such as Ruby, but I think they fit in nicely. And they extend the range of PHP from a scripting language for small sites into something that can compete at the enterprise level.



About the author

Jack D. Herrington is a senior software engineer with more than 20 years of experience. He's the author of three books: Code Generation in Action, Podcasting Hacks, and PHP Hacks. He has also written more than 30 articles.



