najafali.com


Questioning My Faith In Object Oriented PHP

I first came across OO at Uni and I ‘got’ it straight away. After learning the basics of inheritance, abstract classes etc, I went straight to the Gang of Four book and have been an object/patterns junkie ever since. This was all in Java, and there was no other way to solve problems. It had to be OO.

Fast forward a few years and I’m building fairly basic websites for a living using PHP. You don’t need OO for these to work. A contact form would be the most complex thing we have going, so there’s no use busting out the command pattern just for that.

A few years later, and I’m moving on to full-on CMSs, accessing MySQL databases and interfacing with web services. At that stage, I naturally reacted by using patterns like command, data mapper, registry and loads of others. They were the most natural tools available to me, PHP 5 was coming of age and life was good.

At the same time however, I felt like I was the only one. PHP developers who had been at it for six or seven years longer than me just weren’t that into OO. I saw examples where objects were used here and there, or that perhaps used OO libraries like Zend, but nothing hand-rolled that harnessed the sheer power of OO Design Patterns as they were intended.

This made me question my beliefs a little about OO. Is it really necessary? And if it is then why weren’t these seasoned PHP developers using it?  Surely for every benefit that you can list about OO, couldn’t you simply respond with ‘you can do that with functions’? Let’s take a look at some of the commonly mentioned advantages of OO and see if we can answer that way.

Encapsulation/Modularity
The story goes that OO allows you to ‘encapsulate’ certain functionality in your program. All this means is that certain parts of code are responsible for a specific, well defined functionality. The details of how this code implements that functionality are hidden from the rest of the program, the benefit being that you can change that code without having to make knock-on changes in other parts of the program. Also, you could in theory swap out that functionality for a completely different bit of code as long as the expected input and output are the same.  The thing is, with a bit of discipline, you can encapsulate parts of your code using functions, so OO isn’t really bringing anything new to the table there.

Code Reuse/Extensibility
In OO you can ‘re-use’ classes using inheritance. Say you built a class called ‘car’ with all sorts of car-related functionality like ‘accelerate’, ‘brake’, ‘ignition’ etc. Let’s say you wanted to create another class called ‘convertible’. Writing ‘accelerate’, ‘brake’, ‘ignition’ and all those other methods again for ‘convertible’ is a waste of time and introduces a lot of potential for error. Instead you make ‘convertible’ a subclass of ‘car’ and ‘convertible’ inherits all those behaviors. Again however, you can make perfectly re-usable code with a system of functions that go from general utility functions to specific client actions. Again ‘you can do it with functions’.
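
For what it’s worth, the car/convertible example might look something like this in PHP 5. This is only a rough sketch: the class names, the lowerRoof() method and the speed bookkeeping are all made up for illustration.

```php
<?php
// Hypothetical classes illustrating the car/convertible example.
class Car {
    protected $speed = 0;
    protected $running = false;

    public function ignition() {
        $this->running = true;
    }

    public function accelerate($amount) {
        // A car can only pick up speed once the engine is running
        if ($this->running) {
            $this->speed += $amount;
        }
    }

    public function brake() {
        $this->speed = 0;
    }

    public function getSpeed() {
        return $this->speed;
    }
}

// Convertible inherits ignition(), accelerate() and brake() unchanged
// and only adds the behaviour that is specific to it.
class Convertible extends Car {
    private $roofDown = false;

    public function lowerRoof() {
        $this->roofDown = true;
    }

    public function isRoofDown() {
        return $this->roofDown;
    }
}

$car = new Convertible();
$car->ignition();
$car->accelerate(30);
$car->lowerRoof();
echo $car->getSpeed(), "\n";
```

Nothing here is beyond what a handful of functions and an array could do, which is exactly the point being made above.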

After doing a bit of googling, all other touted benefits of the OO state of mind strike me as quite subjective observations. Things that only make sense if you’re already using an OO system. The bottom line is that there’s absolutely nothing you can make a computer do with OO that you can’t make it do with simple procedural programming. There are no hard, logical reasons for going with OO over doing the same thing with a system of functions. The benefits only start to show themselves if we start questioning the foundations of why we use PHP or any other high level programming language in the first place.

Managing Complexity

All programming languages apart from machine code are there to manage complexity. That’s it. They act as an intermediary between the human programmers and the machine (and potentially, other machines). Some languages like C and assembler are closer to the machine, and others like Ruby try to make things as easy (and even as fun) as possible for the programmer. The question then is not whether we can do things with OO that we can’t do with simple functions. The question is, which paradigm manages complexity better? Which acts as a better interface between the programmer and the machine?

After we get past a certain amount of complexity in a system, I’d argue that (for us mere mortals at least) using OO is the only sane way of managing that complexity. Using just functions and arrays to build a well designed, modular system would require a level of discipline that the average developer (including me) just doesn’t have.

The productivity benefits of using an OO system are well-advertised on the net. I still see them as subjective, but if you buy into the whole OO paradigm, this can translate to measurable productivity gains. Another benefit is that if another developer who knows OO picks up your code, he can quickly form a mental picture of how your application works based on the patterns you’ve used. Also, because of the ‘half-baked’ nature of patterns, it provides you with a vocabulary for prescribing general solutions to common, complicated problems that you come across when building applications.

OO Adoption Among PHP Developers

As to why more senior PHP developers haven’t adopted OO, the jury is still out. I think it’s a combination of factors:

  • PHP has only had decent support for OO since PHP 5. More senior developers, by definition, would have worked with earlier versions where incorporating OO principles wasn’t a major focus.
  • OO/Design Patterns are relatively hard for PHP coders*. The basic concepts are easy, but until you grok the general themes (composition over inheritance, coding to an interface etc) and the reasons behind them, it can all seem like a big waste of time. Senior developers with a hard-won toolkit of techniques that work without OO may not have the patience to buckle down and learn all this.
  • Lack of a strong ‘software engineering’ culture. PHP developers typically go from designing websites in HTML/CSS to throwing together a few scripts in PHP and then finally building complex CMS-like web applications. At no point is there any reason for them to identify with ‘software engineers’ in the traditional sense. However, OO is more or less a foregone conclusion amongst Java, C++ and Ruby developers.

Because of the low adoption of OO in PHP circles, there are one or two parts of sites I built that I deliberately ‘dumb down’ so that the average developer can work with them. Specifically, I find that a front controller introduces way too much complexity for Joe the PHP coder. Even though I know there would be all sorts of flexibility advantages to using a front controller, I stick with a simple page controller that fires off command objects to the rest of the system and gets back any and all required data. This sticks with the mental model of ‘one .php file per page’ and so is easy to work with even if you only have a basic knowledge of object syntax. On the one hand, the client gets the website, the managers are happy and the site is easily maintainable by junior developers. On the other, I’m essentially insulting the maintenance developers’ intelligence and delivering an inferior product.
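
To give a feel for the ‘page controller plus command objects’ arrangement, here is a stripped-down sketch. The Command interface, the SubmitContactFormCommand class and the handleContactPage() function are all hypothetical names invented for this example, not code from a real project.

```php
<?php
// Hypothetical command interface: every command exposes a single execute().
interface Command {
    public function execute(array $input);
}

// Hypothetical command that validates a contact form submission.
class SubmitContactFormCommand implements Command {
    public function execute(array $input) {
        $errors = array();
        if (empty($input['email']) || !filter_var($input['email'], FILTER_VALIDATE_EMAIL)) {
            $errors[] = 'Please enter a valid email address.';
        }
        if (empty($input['message'])) {
            $errors[] = 'Please enter a message.';
        }
        // Return plain data so the page script only has to render it
        return array('ok' => count($errors) === 0, 'errors' => $errors);
    }
}

// This stands in for contact.php, the page controller: one .php file per page,
// which builds a command, runs it and renders whatever comes back.
function handleContactPage(array $post) {
    $command = new SubmitContactFormCommand();
    return $command->execute($post);
}

$result = handleContactPage(array('email' => 'me@myhost.com', 'message' => 'Hello'));
echo $result['ok'] ? "Thanks!\n" : implode("\n", $result['errors']) . "\n";
```

A junior developer only ever has to open the page script and follow one method call, which is what keeps the ‘one .php file per page’ mental model intact.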

OO is not intuitive when you first learn it. It’s especially difficult to learn when you can’t see the immediate benefits. But as far as I’m concerned, it’s the right way to do things for 99% of all non-trivial PHP-based applications. While functions may encapsulate certain parts of your code, functions alone are not enough to manage the complexity of a large application. OO was thought up as a better way to manage the growing complexity of software, and it’s the de facto standard in non-PHP software engineering circles. It may not have a high adoption rate in the PHP wilderness, but I’m still going to push for it in every project I’m involved with.

* If I’m being honest, OO/Design Patterns are not that hard. Juggling pointers in C is hard. Concurrency is hard. Recursion/Functional Programming is hard. This is not a dig at PHP developers, it’s just that if PHP is all you do then you’re not likely to come across particularly difficult problems. From that point of view, Design Patterns etc. are a relatively difficult concept to grasp.

Published by Ali, on June 27th, 2009 at 8:01 pm. Filed under: Design Patterns, Development, Object Oriented PHP, PHP. No Comments

Character Encoding Problems With PHP

At work recently we’ve spent weeks trying to figure out how to get Bulgarian text to work on a PHP driven website. The first thing that comes to mind is basically setting everything to utf-8 and hoping for the best.

The database, table and field character encodings were all UTF-8. Boxes and squiggles. Every response declared a UTF-8 content type in the HTTP headers AND in the HTML meta tags. Still, nothing but boxes and squiggles. We tried the ‘SET NAMES utf8’ query and the mysql_set_charset function, and still nothing.

The dirty little secret here is that no one really knew what utf-8 or character encoding meant and that’s the reason why it wasn’t quite clicking. We were just hoping that if we could find and flip all the switches to utf-8 that the problem would go away.

I’d been meaning to read an article about character encoding on Joel Spolsky’s website. I’d skimmed it before but this time I sat down with a pen and paper and actually took notes. Surprisingly it wasn’t that hard. The basic issue is that PHP’s standard string manipulation functions assume that every character uses exactly one byte in memory. UTF-8 uses between 1 and 4 bytes per character (the original design allowed up to 6).

That means PHP’s string functions (strpos etc) work with UTF-8 as long as we only use single-byte characters (i.e. English), but spew garbage for input strings containing characters that need more than one byte.

The solution? For our website we just had to get the data from the database to the web browser. The text didn’t need any transformation along the way, so all we did was remove every string function call that touched it and we were good to go.

The other solution would be to use PHP’s multi-byte string (mbstring) functions. These aren’t included with PHP out of the box but they are luckily installed on Fasthosts. They should get you out of a pickle if you need to manipulate the text before sending it elsewhere.
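
Here’s a small sketch of the difference, assuming the mbstring extension is loaded and the script itself is saved as UTF-8. The sample word is my own choice, not something from the site in question.

```php
<?php
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.
$text = "Здравей"; // "Hello" in Bulgarian: 7 characters, 2 bytes each in UTF-8

$byteCount = strlen($text);             // counts bytes
$charCount = mb_strlen($text, 'UTF-8'); // counts characters

// substr() cuts on byte boundaries, so an odd length splits a character in half
$brokenCut = substr($text, 0, 3);
$cleanCut  = mb_substr($text, 0, 3, 'UTF-8');

echo $byteCount, " bytes, ", $charCount, " characters\n";
echo $cleanCut, "\n";
```

The half-character left at the end of $brokenCut is exactly the sort of thing that shows up in the browser as a box or a squiggle.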

Published by Ali, on June 19th, 2009 at 9:13 pm. Filed under: Uncategorized. No Comments

Welcome To The Cult Of Git

I am a card carrying member. It’s like taking your whole development process and adding a flux capacitor to it. In fact it’s better than that. You can create multiple timelines and merge them at will.

It was written by Linus (frikkin) Torvalds, and in his talk about Git at Google, he told a room full of Google engineers that if they used SVN or CVS over Git, then they were stupid and ugly.

I’ve spent ages trying to crack version control but it didn’t really make sense until Linus explained Git. After a few sessions of using it at the command line I’m officially an addict.

Before, I used to worry that going down a particular route when implementing a feature might not be ideal. I’d then go on to think about all the possible ramifications of each possible way of implementing the same thing and generally not get any work done.

Now? I just create a new branch, experiment and see if it works. If I was lucky and my initial idea did work, then I simply merge it into the original branch and we go on. If the new implementation turned out to be a mistake I just check out the old branch and it’s as if it never happened. Control over the very circuits of time!

The distributed nature of Git makes it really easy to riff off of other people’s open source work and for you to make your work available. The manifestation of Linus’s big idea may well be a site called GitHub.

Seriously, go get it.

Published by Ali, on June 9th, 2009 at 11:00 pm. Filed under: Development. No Comments

Coding For Your Most Important User

Any company that deals in web sites worth their salt values the audience, the users above all else. They’re the reason we build the sites, the reason we painstakingly make sure the pages are pixel-perfect in the abomination that is IE6.

But really… they’re not the most important users of your site. Not if you care about making that site profitable for your business or being productive in any sense of the word.

The real most important user isn’t some grandma on a Windows box clicking buttons and figuring out how to send email. Nor is it the client checking if the header image is the right shade of mauve. The person we’re talking about is a little more important.

It’s you.

More specifically, it’s your future self. You or someone like you is going to be responsible for maintaining this site, making amendments and more fundamental changes when called for.

You can hack together a garbled mess of ‘code’ that works just like the client asked. Your spiky-haired manager will be happy, your clients will be happy but deep down you’ll know how much it’s going to suck when you come back to this in a few months to make some changes.

It’ll suck because any future work on the site will probably take orders of magnitude longer than if you had designed the internals properly. It will doubly suck because you will have to unscramble a horrible mess of code that you know you could have done better but were too lazy to. Finally, the worst part of this whole mess is that any further amends on the site will continually decrease the amount of profit you made on it.

Published by Ali, on June 9th, 2009 at 10:37 pm. Filed under: Development. No Comments

How to Build a ‘Dynamic’ Config Class in PHP

One of the big security booboos I see on systems I inherit from other developers is keeping the database credentials within the document root. Normally this is in a file like ‘db.php’ that has mysql_connect functions that connect to a database, and is included into any page that needs a connection, a pretty ugly state of affairs to begin with. With better developers they declare them as constants somewhere in an include file, but even then it’s still not ideal.

The Config.ini File

I start off by creating a config.ini file outside the document root, in the same directory as htdocs or whatever the root directory is. The config.ini file looks something like this:

; Main Site Config File, this is a commenty comment

[site]
url = 'www.mywebsite.com'

[database]
host = "host.host.host.host"
user = "user"
password = "password"
name = "name"

[contact]
email = "me@myhost.com"

Nothing too crazy going on there.  Section names in square brackies,  config values are quoted and the semicolon starts a comment. Easy peasy. Now the question is, how do we get that info into our website…

The Config Class

This is all meant to fit in with an OO system. You’re never going to want to have more than one instance of a class holding config details so guess which design pattern you should probably go with. Just one SINGLE instance. You guessed right… the singleton. For a system with more than a couple of singleton-like classes, you probably want to take all that instance creation/tracking code out and build a registry instead.
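
In case the singleton is new to you, a bare-bones sketch looks something like the following. The SiteConfig name, the get() method and the hard-coded values are placeholders of mine; the real thing would load its values from the .ini file.

```php
<?php
// Hypothetical singleton wrapper; names are illustrative only.
class SiteConfig {
    private static $instance = null;
    private $values;

    // Private constructor: the only way in is through getInstance()
    private function __construct(array $values) {
        $this->values = $values;
    }

    public static function getInstance() {
        if (self::$instance === null) {
            // In the real thing these values would come from parse_ini_file()
            self::$instance = new SiteConfig(array('site' => array('url' => 'www.mywebsite.com')));
        }
        return self::$instance;
    }

    public function get($section, $key) {
        return $this->values[$section][$key];
    }
}

$a = SiteConfig::getInstance();
$b = SiteConfig::getInstance();
var_dump($a === $b); // bool(true): both variables point at the same object
```

To keep the listings short, the config classes in the rest of this post leave the singleton plumbing out.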

Back to our config class, we want to pass it a .ini file in the constructor, let it suck the values out of the file and then give us nice clean access via getter methods. Simple enough to do, my first try looked something like this:

class Config {
   private $values;

   function __construct($filename) {
      //more error checking in the real thing please!
      $this->values = parse_ini_file($filename, true);
   }

   function getDatabaseUser() {
      return $this->values['database']['user'];
   }

   function getSiteURL() {
      return $this->values['site']['url'];
   }

   //etc etc...
}

So far it’s pretty boring, nothing much to see, move along move along….

The Dynamic Config Class

As I was writing this class and going through the manual labour of typing out the getter methods, some cogs were turning in my head… magic methods… call… explode…. ch-ching! Writing all these getter methods is a big waste of time! We can have the config class respond to methods requesting whatever values happen to be defined in config.ini using a little magic. The goal would be to provide methods such as getDatabaseHost() or getSiteUrl() using PHP’s __call magic method (it gets called when an object has a method called on it that doesn’t exist).

The pseudo-code for __call would look like this:

  1. Check if the method starts with ‘get’, die if it doesn’t.
  2. Break the rest of the method up based on camel case (so getDatabaseUser would be broken down into ‘database’ and ‘user’)
  3. Use the values you got in 2 to index the values array.
  4. Return a value if found, die otherwise.

The bare-bones version of the class looks like this:

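class Config {
	private $values = array();

	function __construct($filename) {
		//parse_ini_file() returns false on failure, so check that in the real thing
		$this->values = parse_ini_file($filename, true);
	}

	//PHP calls this whenever an undefined method is invoked on the object
	function __call($method, $args) {
		//strpos() returns false when 'get' is missing, and false == 0, so compare strictly
		if (strpos($method, 'get') !== 0) {
			throw new BadMethodCallException("Unknown method: $method");
		}

		$keys = $this->explodeCase($method);
		//so now $keys = array('get', 'section', 'value')
		if (count($keys) == 3) {
			list(, $sectionKey, $valueKey) = $keys;
			if (isset($this->values[$sectionKey][$valueKey])) {
				return $this->values[$sectionKey][$valueKey];
			}
		}

		//die a horrible death
		throw new OutOfBoundsException("No config value for: $method");
	}

	function explodeCase($string) {
		// Split before each uppercase character and lowercase the pieces,
		// so 'getDatabaseUser' becomes array('get', 'database', 'user')
		$parts = preg_split('/(?=[A-Z])/', $string, -1, PREG_SPLIT_NO_EMPTY);
		return array_map('strtolower', $parts);
	}
}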

Now we can essentially add sections and values to the config.ini file and the config class will automatically have methods to get that data to where it’s needed. Added bonus, the config file is outside the document root.

Big Fat Caveats

You can’t conform to an interface based on retroactive methods like __call(). In other words, you can’t force the config class to have particular methods without declaring them explicitly. This will only start becoming a problem in one of two instances, a) where you have to integrate all this with some other big system or b) when someone has a fiddle with the config file without telling anyone. a) because you need interfaces to maintain sanity in big system integrations and b) because fiddling with the config.ini file essentially fingers the whole system. You have been warned.
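
To make the first caveat concrete, here’s a tiny sketch using a stand-in class (MagicConfig is a made-up name, not the class above). As far as I can tell, a class that tries to ‘implement’ an interface purely through __call won’t even load, and even without an interface there’s nothing to check up front:

```php
<?php
// Hypothetical stand-in for the dynamic config class: every getX() call "works".
class MagicConfig {
    public function __call($method, $args) {
        return "value for $method";
    }
}

$config = new MagicConfig();

// The call succeeds at runtime...
$value = $config->getDatabaseUser();

// ...but the method does not really exist, so nothing can be verified in advance.
$declared = method_exists($config, 'getDatabaseUser');      // false
$callable = is_callable(array($config, 'getDatabaseUser')); // true, thanks to __call
$typo     = is_callable(array($config, 'getDatbaseUsr'));   // also true, typo and all

var_dump($declared, $callable, $typo);
```

In other words a misspelt getter, or one whose value has been removed from config.ini, only blows up at the moment it is called.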

Published by Ali, on May 26th, 2009 at 10:50 pm. Filed under: Object Oriented PHP, PHP, Security. No Comments

A Sneaky Way To Save Money On PHP/MySQL Projects

Don’t use MySQL.

In the UK, most web-hosts (<cough>FASTHOSTS</cough>) still mark-up each new MySQL database on your account. It’s a nominal amount like £25 but it builds up fast if you’re a small business and have a lot of clients. To get around this and still have an SQL database, SQLite is a perfectly good alternative, and doesn’t cost any money to use. PHP has (very well written) functions to handle SQLite out of the box.

Pros

  • No config settings, usernames or passwords to worry about. An SQLite database is contained within a single file. If you have access to the file, you have access to the database. This also makes backups a lot easier for those who aren’t too skilled with the shell.
  • You have an excuse to learn about and use PDO (PHP Data Objects) as the PDO driver for SQLite is included in PHP out-of-the-box. This is a nice data-access abstraction layer that allows you to encapsulate the database connection/transactions. It also supports prepared statements and binding (and emulates them if the DBMS in question doesn’t support them). In theory if your site traffic grew, you could move it over to a MySQL DB with few problems.
  • No learning curve. If you can work with MySQL, you’ll have no problems whatsoever with SQLite.
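
As a taste of what PDO with SQLite looks like, here’s a minimal sketch. It assumes the pdo_sqlite driver is enabled; the table and column names are made up, and it uses an in-memory database purely so the example runs on its own.

```php
<?php
// Assumes the pdo_sqlite driver is enabled. Table and column names are made up.
// 'sqlite::memory:' keeps the example self-contained; a real site would use a
// 'sqlite:' DSN pointing at a database file kept outside the document root.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)');

// Prepared statement with a bound parameter, reused for two inserts
$insert = $db->prepare('INSERT INTO pages (title) VALUES (:title)');
$insert->execute(array(':title' => 'Home'));
$insert->execute(array(':title' => 'Contact'));

$select = $db->prepare('SELECT title FROM pages WHERE id = :id');
$select->execute(array(':id' => 2));
$title = $select->fetchColumn();

echo $title, "\n";
```

Moving to MySQL later would mostly mean changing the DSN passed to the PDO constructor, though any SQLite-specific SQL would need checking too.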

Cons

  • No config settings, usernames or passwords to worry about. An SQLite database is contained within a single file. If you have access to the file, you have access to the database. That means you better keep that file out of the public (htdocs) directory and double up on security (but to be fair, you should be doing this with a PHP/MySQL site anyway).
  • Limited concurrency. With a MySQL database you have a daemon running that waits for database requests. No such thing with SQLite. Instead every request goes straight to the database file through the filesystem (just like you might request an image or HTML page). Any number of requests can read at once, but a write locks the whole file, so SQLite can’t handle concurrency as well as a full-fledged MySQL daemon can. In practice, the SQLite documentation suggests it is fine for sites that receive fewer than about 100,000 hits per day.

Note that you wouldn’t want to use SQLite in a website that might experience occasional traffic spikes (often when your clients want their website to be up and running the most).

Published by Ali, on May 10th, 2009 at 11:18 pm. Filed under: PHP. 1 Comment

Kernighan Ritchie Detab

In my first steps on the quest to master the dark art of C programming (using the so-called ‘white bible’), I’ve produced the following answer to the Kernighan and Ritchie programming exercise to create a ‘detab’ program (p.34, 1-20):

#include <stdio.h>
#define TABWIDTH 8

/* Detab:
 * Write a program detab that replaces tabs in the input
 * with the proper number of blanks to space to the next
 * tab stop. Assume a fixed set of tab stops, say every
 * n columns.
 */

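int main(void) {
	int i, c, s; /* current column, character read, column of next tab stop */
	s = i = 0;

	while ((c = getchar()) != EOF) {
		if (c != '\t') {
			putchar(c);
			++i;
		} else {
			/* pad with blanks up to the next multiple of TABWIDTH */
			s = i + TABWIDTH - (i % TABWIDTH);
			while (i < s) {
				putchar(' ');
				++i;
			}
		}
		if (c == '\n')
			i = 0; /* a new line starts again at column 0 */
	}
	return 0;
}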

I’ve seen other answers on the internet that are a lot more complex, but I believe the above satisfies the basic requirements. Also, as I’m a genuine beginner at C, I haven’t used any language constructs that haven’t been introduced before that point in the book.

The only part that’s slightly interesting is the formula for figuring out the number of spaces to insert on encountering a tab character based on the current position in an input string (spaces to insert = tab width - (current position % tab width)). I figured it out by trial and error with two or three suitable random test cases, but I could smell the modulo in there pretty much as soon as I read the question.
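
To sanity-check the formula without compiling anything, here it is as a throwaway PHP function (PHP only because that’s what most of this blog is written in; spacesToNextTabStop is a name made up for the occasion). Columns are counted from zero, as in the C program.

```php
<?php
// Number of spaces needed to reach the next tab stop from a zero-based column.
function spacesToNextTabStop($column, $tabWidth = 8) {
    return $tabWidth - ($column % $tabWidth);
}

foreach (array(0, 3, 7, 8, 13) as $column) {
    echo "column $column: ", spacesToNextTabStop($column), " spaces\n";
}
```

Note that a tab typed exactly on a tab stop (column 0 or 8) still advances a full eight columns, which is how tabs normally behave.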

Published by Ali, on May 6th, 2009 at 10:06 pm. Filed under: Uncategorized. No Comments

A Typography Moment

Now that I’m working at building my visual skills a little, I’m having little moments of greatness. Just yesterday I was setting up a subdomain to link Web Developments to their work-in-progress sites. The links are all subdirectories of the testserver.phpwarrior.net subdomain, so I wanted a page that requests the client go to their individual directory on the index page. Going to testserver.phpwarrior.net gets you a page that looks like this:

[Image: picture-1]

This would only display like that provided you had the font ‘Myriad Pro’ installed on your machine. Otherwise in Helvetica it would look like this:

[Image: picture-2]

And finally in the default Mac OS X sans-serif font:

[Image: picture-3]

When you start paying attention, it starts mattering. The difference between the top one in Myriad Pro and the default sans-serif is profound. So much so that I started to consider using an image instead of plain text so that I could be sure of this (from a technical standpoint, not particularly good practice).

At the same time, now that I’ve spotted something like this, I’m looking forward to the damage I’m going to do once I have a better handle on typography. Time will tell…

Published by Ali, on May 6th, 2009 at 9:50 pm. Filed under: Uncategorized. No Comments

LAMP Bibles

In rounding out my skills and learning more about the LAMP stack (Linux, Apache, MySQL and PHP/Perl/Python) I’ve been doing a little reading. In technical subjects like these, there usually emerges one or two ‘bibles’, seminal works that you absolutely must read before you can be taken seriously. Here’s what I’ve found:

Linux: The Linux Administration Handbook

Apache: Apache: The Definitive Guide

MySQL: The Definitive Guide To MySQL 5

As far as PHP goes, I’m not sure if there is one defining book (the same could be said of MySQL). I’ll let you know when I find it. I will warn you however, these books aren’t ‘teach yourself LAMP in 24 hours’. They are long, dry and thorough so unless you’re serious about learning these technologies, I’d steer clear.

Published by Ali, on May 4th, 2009 at 8:19 pm. Filed under: PHP. No Comments

Graphic Design For Programmers: The Thirty Day Challenge

I’m going to admit something I’ve been keeping secret for a while (though it’s no big secret…)

I suck at graphic design.

And I’m probably not alone. No doubt there are a lot of other technical types out there who’ve faced this same problem before. I’m also guessing that the majority of us don’t bother facing this particular demon at all.

Is this to say that we’re doomed to sucky visuals? I think not. If we can figure out how the LAMP stack fits together or how to make command line scripts that do just about anything, I’m pretty sure that us geeks can figure out graphic design too.

So here’s my plan for learning. Spend one hour a day for thirty days learning graphic design. Occasionally I will put layer comps on this blog detailing something I’ve learned recently.

I’m going to read all the tutorials and books about graphic design (specifically for the web) that I can get my hands on. To start with, I’ve ordered The Non-Designer’s Design Book by Robin Williams, and am getting started with some of the material that Seth Godin posted in his article on this same subject.

I’ve started today, so here’s my first try. It’s a website for a local Italian restaurant called ‘Al Fresco’. Hopefully I’m going to be able to look back and laugh one day. Think of it as the ‘before’ picture on a diet advertisement.

[Image: alfresco]

Published by Ali, on May 3rd, 2009 at 10:01 pm. Filed under: Uncategorized. No Comments