Category Archives: Code

On Code – best practice, tips and other useful tricks

strpos in PHP – like being stung by a needle in a haystack

In PHP, when you have a string and want to find out whether it contains another string, there are a few ways to do it. You can use regular expressions, the strstr family of functions and a few other methods.
The easiest way, though, is probably strpos, which returns the zero-based position of the first occurrence of the string you’re looking for – and false if it isn’t found.

Simple – yet with a slight danger.

$haystack = 'This is an example';
if (strpos($haystack, 'This')) {
  echo "Found";
} else {
  echo "Not found";
}

In the example above, where we’re looking for the string “This”, the PHP code will echo “Not found”. The reason is that the first (and only) occurrence of “This” is at the beginning of the string – position zero.
As strpos returns zero, the condition of the if statement evaluates to false, and thus “Not found” is echoed to the screen.

Fixing the error is simple once you remember the rule that, with strpos, the first index in a string is zero:

$haystack = 'This is an example';
if (strpos($haystack, 'This') !== false) {
  echo "Found";
} else {
  echo "Not found";
}

Adding the !== false forces a comparison of both type and value, and as the integer zero is not identical to the boolean false, the value echoed is “Found”.
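To avoid repeating the strict comparison everywhere, the check can be wrapped in a small helper. This is only a sketch; the function name containsString is made up for this example and is not part of PHP.

```php
<?php
// Hypothetical helper: wraps the strict comparison so callers cannot forget it.
function containsString($haystack, $needle)
{
    // strpos() returns int(0) for a match at the very start and false for
    // no match, so only a strict comparison can tell the two apart.
    return strpos($haystack, $needle) !== false;
}

var_dump(containsString('This is an example', 'This'));    // bool(true)
var_dump(containsString('This is an example', 'example')); // bool(true)
var_dump(containsString('This is an example', 'missing')); // bool(false)
```

On PHP 8.0 and later, the built-in str_contains() does the same job.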

Adding to php arrays

In PHP many things can be done in several different ways. Picking which way to do something may be a matter of personal taste or habit. Sometimes, however, things may be much clearer for the next developer if you choose one way over another.

A very simple example of this is adding a new item to an array. Often I come across this construct:

$valuepairs[] = 'Some value';

It’s valid and compact syntax, but in terms of clarity, I’d prefer this construct anytime:

array_push($valuepairs, 'some value');
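As a quick illustration that the two constructs end up with the same array, here is a small sketch. The variable names are made up for the example.

```php
<?php
// Appending with the bracket syntax.
$withBrackets = array('first');
$withBrackets[] = 'second';

// Appending with array_push(), which also returns the new element count.
$withPush = array('first');
$newCount = array_push($withPush, 'second');

var_dump($withBrackets === $withPush); // bool(true)
echo $newCount, "\n";                  // 2
```

array_push() can also take several values in one call, which the bracket syntax cannot.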

Password failure in WordPress Plugin

One of the great features of WordPress is the wide variety of plugins available. They often enable a lot of interesting functionality and integrations to other services not native to WordPress itself. Most of these plugins are developed by individuals or small teams independent of the core community – and often not with a keen interest in security, but an exclusive focus on “making stuff work”.

I’ve been using the WordPress “Google AdSense Dashboard” plugin for a while, and after the recent host of password leaks, I’ve been changing and upgrading passwords all around. This change revealed what I would call a critical password exposure in the plugin, and it has so far caused me to remove the plugin everywhere I’ve installed it.

The issue was the following:

If the password to Google AdSense fails in the plugin, your username and password are displayed in clear text on screen – in the dashboard when logged into WordPress. The catastrophic takeaway is twofold: first, the username and password seem to be stored in clear text (or at least stored by the plugin in a format which can be converted back to clear text), and secondly, apart from storing them somewhat carelessly, the plugin even displays the information on the login screen – apparently for each and every user.

Removing the hash part of a URL

A URL may contain a hash (an anchor reference). If you need to remove it from the URL, it’s quite easy. Here’s a short recipe on how to do it in PHP (including a little test input):

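$urls = array('http://example.com/page#section', 'http://example.com/page'); // placeholder test input
foreach ($urls as $url) {
	$hashPosition = strpos($url, '#');
	if ($hashPosition !== false) {
		$url = substr($url, 0, $hashPosition);
	}
	echo $url, "\n";
}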

Apart from removing the hash ending from URLs, the same approach can naturally also be used in any number of similar cases where you need to cut a string at a given character.
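A sketch of how the recipe might be generalised into a reusable function follows. The name cutAtFirst and the example URLs are assumptions made for illustration only.

```php
<?php
// Hypothetical helper: returns $text up to (not including) the first $delimiter,
// or $text unchanged when the delimiter does not occur.
function cutAtFirst($text, $delimiter)
{
    $position = strpos($text, $delimiter);
    // Strict check: a delimiter at position 0 must still count as found.
    return $position === false ? $text : substr($text, 0, $position);
}

echo cutAtFirst('http://example.com/page#top', '#'), "\n"; // http://example.com/page
echo cutAtFirst('http://example.com/page?x=1', '?'), "\n"; // http://example.com/page
echo cutAtFirst('http://example.com/page', '#'), "\n";     // http://example.com/page
```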

PHP 5.4 built-in webserver & Linux (mint/ubuntu)

PHP 5.4 comes with a built-in webserver, which can be useful for development and quick tests. It is easily launched from the command line, but if you’re running Linux Mint or Ubuntu, the packaged PHP version isn’t 5.4 but 5.3.x. If you don’t have the time/courage/energy to compile PHP 5.4 yourself, some nice fellow on the internet has done the work and made it available through a package repository, which makes it a breeze to install.

To install PHP 5.4 on your Ubuntu or Linux Mint simply do this:

sudo add-apt-repository ppa:ondrej/php5
sudo apt-get update
sudo apt-get install php5

(Answer yes to any questions asked.)

Then you should be good to go. Verify the update with:

php --version

... and the answer should be something like:

PHP 5.4.4-1~precise+1 (cli) (built: Jun 17 2012 13:01:09)
Copyright (c) 1997-2012 The PHP Group
Zend Engine v2.4.0, Copyright (c) 1998-2012 Zend Technologies

(version numbers and dates are probably subject to change).

To use the webserver, go to the directory you want to be the document root, and launch the webserver with:

php -S localhost:8000

You can also supply a custom php.ini file with the configuration you want:

php -c ./php.ini -S localhost:8000

Please remember that the built-in webserver is only suited for development, but for a quick hack, it sure beats installing Apache or any other webserver.

Moving to PHP on 64 bit… the issues & challenges

So your current website is running PHP, and it seems to work just fine. I am, however, working on a project where the new servers are running a 64 bit version of the OS. This change seems to cause a number of potential issues, and as there didn’t seem to be a resource collecting the issues, I’ll try to post a few notes on the experience. Please feel free to add applicable notes and links in the comments.

Our first experience was that all our scripts seemed to use a lot more memory than they did on the old server, but there are also a number of other 64 bit challenges you should be aware of. This post tries to provide an overview of these changes.

The Integer issue

On a 32 bit OS, PHP uses 4 bytes (32 bits) to store an integer. On a 64 bit system, PHP uses 8 bytes (64 bits) to store an integer and thus allows it to hold a far larger range of numbers.

You can test this with a simple script such as this:

 echo 'The integer size on this system is: ';
 echo PHP_INT_SIZE. '<br>';
 echo 'The maximum value you can save in an integer is: ';
 echo PHP_INT_MAX;

On 32 bit, the script will output:

The integer size on this system is: 4
The maximum value you can save in an integer is: 2147483647

On 64 bit, the script will output:

The integer size on this system is: 8
The maximum value you can save in an integer is: 9223372036854775807
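A practical consequence of the two limits above is that the same number can end up as an integer on one system and as a float on the other. The following is a small sketch of that behaviour; the variable names are made up for the example.

```php
<?php
// Going past the largest integer does not wrap around in PHP:
// the result silently becomes a float.
$largest = PHP_INT_MAX;
$overflowed = $largest + 1;

var_dump(is_int($largest));      // bool(true)
var_dump(is_float($overflowed)); // bool(true) on both 32 and 64 bit

// Three billion fits in an integer on 64 bit, but not on 32 bit.
$threeBillion = 3000000000;
echo gettype($threeBillion), "\n"; // "integer" on 64 bit, "double" on 32 bit
```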

Generally speaking, the only drawbacks of this approach are increased memory usage and maybe lower performance – given that your script doesn’t need the extra 32 bits provided on a 64 bit system.

This little script illustrates the increased memory use:

$test = array();
for ($counter = 0; $counter < 10000; $counter++) {
  $test[$counter] = $counter;
}
echo "Memory peak usage: ", memory_get_peak_usage();

On a 32 bit system the number output from the script is (roughly):
On a 64 bit system the number output from the script is (roughly):
That’s an increase of memory usage of more than 80% on a simple integer array!

Time and dates

Beware that many time-related functions in PHP work with integers – mktime, strtotime and others use integers as return values. As long as you use and work with these within the 32 bit boundaries, you should be fine.

On 64 bit systems, they are able to handle much larger ranges, which could cause issues if such a value (for example a timestamp after 19 January 2038, the limit of a signed 32 bit timestamp) is passed on to a 32 bit system or stored in a 32 bit field.
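If timestamps need to stay usable on 32 bit systems, one option is to check them against the 32 bit limit before passing them on. This is a minimal sketch; the constant and function names are made up for the example.

```php
<?php
// Largest timestamp a signed 32 bit integer can hold: 2038-01-19 03:14:07 UTC.
define('MAX_32BIT_TIMESTAMP', 2147483647);

// Hypothetical helper: true if $timestamp is safe to hand to a 32 bit system.
function fitsIn32BitTimestamp($timestamp)
{
    return is_int($timestamp)
        && $timestamp >= -MAX_32BIT_TIMESTAMP - 1
        && $timestamp <= MAX_32BIT_TIMESTAMP;
}

$inRange = gmmktime(0, 0, 0, 1, 1, 2000);    // midnight UTC, 1 January 2000
$outOfRange = gmmktime(0, 0, 0, 1, 1, 2040); // midnight UTC, 1 January 2040

var_dump(fitsIn32BitTimestamp($inRange));    // bool(true)
var_dump(fitsIn32BitTimestamp($outOfRange)); // bool(false)
```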

Memory and performance

As the volume of data being moved around increases, you could expect a performance penalty. On sites with low traffic volumes, it’s probably not an issue, but if you’re hosting a high volume site, it might be to some extent.

The extra memory seems to be a much larger issue to be aware of. While you may assume you only use a small number of integers, PHP itself uses them in many places. When you’re creating arrays, they are probably indexed by integers, and many functions return integers as control codes. While the required memory doesn’t double, do expect an overhead of 25-50% depending on what the script does – from our initial experiences, that does seem to be the case.

Bit shifting

Generally speaking, you should be careful everywhere you use bit shifting operations, as they, by their very nature, are quite dependent on the number of bits available in the variables.
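A small sketch of where the difference shows up: shifting a bit into position 31 reaches the sign bit on 32 bit, but not on 64 bit. The variable names are made up for the example, and the masking step is just one way to mimic 32 bit behaviour.

```php
<?php
// Bit 31 is the sign bit of a 32 bit integer, but an ordinary bit on 64 bit.
$shifted = 1 << 31;
echo $shifted, "\n"; // -2147483648 on 32 bit, 2147483648 on 64 bit

// Guarding width-dependent code with PHP_INT_SIZE makes the assumption explicit.
if (PHP_INT_SIZE === 8) {
    // Keep only the lowest 32 bits, e.g. when porting a 32 bit algorithm.
    $low32 = (0x1FFFFFFFF << 1) & 0xFFFFFFFF;
    echo $low32, "\n"; // 4294967294
}
```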

Handling Hashes

If you’re using hashes for checksums, beware. Some 64 bit issues may occur.

We’ve seen this issue with the crc32 function. If the result of crc32 is a positive number (on a 32 bit system), it will be the same on a 64 bit system. If crc32 results in a negative number, however, the result returned on a 64 bit system will be different.

This script:

  echo "<p>Letters 'ab'<br>";
  echo crc32('ab');
  echo "</p>";

Produces the following output on a 32bit system:

Letters 'ab'

But on a 64 bit system, it produces this output:

Letters 'ab'

Note that the returned hash is always the same on a given system, so if you’re using the crc32 hash on a completely 32 bit setup OR a completely 64 bit setup, you might not see any issues, whereas a mixed environment will probably cause issues.

Hash functions such as MD5, SHA1 and others will always produce the same result no matter what system they’re running on.
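One way to get the same crc32 value in a mixed environment is to format the result as an unsigned number with sprintf and "%u". The sketch below assumes the checksum can be handled as a string; the function name crc32Unsigned is made up for the example.

```php
<?php
// Hypothetical helper: returns the CRC32 as an unsigned decimal string,
// so a negative 32 bit result and a positive 64 bit result look the same.
function crc32Unsigned($data)
{
    return sprintf('%u', crc32($data));
}

echo crc32Unsigned('ab'), "\n"; // digits only, never a leading minus sign
```

Storing the checksum as a string (or in an unsigned database column) avoids converting it back to a PHP integer, which would not fit on 32 bit.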

PHP, MySQL and 64 bit

MySQL handles integers differently than PHP. An integer in MySQL always has the same size, no matter if it’s running on a 32 or 64 bit system: an INT is always 32 bit. If you need to store a 64 bit integer, MySQL has an explicit data type for this purpose, BIGINT, which is a 64 bit integer (see the MySQL manual on numeric types).

How to handle MySQL seems to depend on what kind of PHP solution you’re building. If your application is deployed across several servers (which may be a mix of 32 and 64 bit systems), the two core strategies are to handle it either in PHP or in MySQL.

Handling the issue in PHP would probably mean that you somehow “range check” the PHP integer values and make sure each value is within the range allowed by a 32 bit integer.

Handling the issue in MySQL would mean just changing the integers in the database to BIGINTs. This would always work, but for all 32 bit systems it would be a less efficient solution.
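The PHP-side range check mentioned above could look something like this. It is a sketch that assumes a signed INT column; the constant and function names are made up for the example.

```php
<?php
// Range of a signed 4 byte MySQL INT column.
define('MYSQL_INT_MIN', -2147483647 - 1);
define('MYSQL_INT_MAX', 2147483647);

// Hypothetical helper: true if $value can be stored in a signed INT column.
function fitsInMysqlInt($value)
{
    return is_int($value) && $value >= MYSQL_INT_MIN && $value <= MYSQL_INT_MAX;
}

var_dump(fitsInMysqlInt(42));         // bool(true)
var_dump(fitsInMysqlInt(3000000000)); // bool(false): a float on 32 bit, too large on 64 bit
```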

Defaults may be wrong…

Just a word of warning when using PHP and MySQL – if you’re trying to write efficient code without relying on all sorts of frameworks and abstractions, you might be in for a small surprise in a default setting.

Usually I’m slightly lazy and often use the mysql_fetch_assoc function. It provides each row as an associative array and is quite convenient to work with. Recently, however, while optimizing some code, I figured I’d switch to using mysql_fetch_array, assuming it should be more efficient. The logic was that mapping hash keys to array values wouldn’t be needed and it should use less memory.

It wasn’t the case out of the box. Switching from mysql_fetch_assoc to mysql_fetch_array without doing anything else actually increases your memory use, and is probably slightly slower. By default mysql_fetch_array does not just provide the field values under numeric indexes, but still maintains the hash keys too.

If you only want the indexes in the returned rows, you need to add an extra parameter to the function stating this explicitly.

  $row = mysql_fetch_array($result, MYSQL_NUM);

I wonder why it was made so. It seems like an odd choice when mysql_fetch_assoc can provide the row indexed with hash keys. The correct behavior for mysql_fetch_array (by default) ought to be to just return the array without the hash keys, and have that option available if needed.
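To show the difference in shape without needing a database, here is a sketch with hand-built arrays standing in for one result row. The column names "id" and "name" and the values are made up; only the key layout mirrors what the functions return.

```php
<?php
// Hand-built stand-ins for a single row; no database connection is involved.
$rowAssoc = array('id' => '1', 'name' => 'Alice');     // like mysql_fetch_assoc
$rowNum   = array(0 => '1', 1 => 'Alice');             // like MYSQL_NUM
$rowBoth  = array(0 => '1', 'id' => '1',
                  1 => 'Alice', 'name' => 'Alice');    // like the default, MYSQL_BOTH

// The default layout carries every value twice.
echo count($rowAssoc), ' ', count($rowNum), ' ', count($rowBoth), "\n"; // 2 2 4
```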

Function names as signaling

In most web applications there’s a host of functions (or methods, if speaking in the object-oriented world). It’s widely recognized that it’s a good idea to name them something that suggests the purpose or functionality of the function, but often developers seem to fail at keeping a stringent naming convention. Before starting on your next big development adventure, here are three suggested rules for naming functions.

1. It’s more important to have a suggestive name, than a short one.

Never call a function something short but meaningless. Instead use CamelCase and make a sentence suggesting what the function does.

  • Bad examples found in live code: “process”, “fixit”, “cleanup”.
  • Good examples: “saveToDatabase”, “convertIpnumberToDomainName”, “calculateTotalPricing”.

2. Use prefixes on functions

Reserve common names (more if you like) for specific type of functions. Here are a few suggested rules:

  • “get”-functions should always retrieve and return data – never print data.
  • “print”-functions should always print data to “standard out”.
  • “set”-functions should always set data on an object (and choose if “set” also saves data or not).
  • “save”-functions (if set-functions don’t save properties) save all current properties to the persistent storage (usually a database).

3. Reuse data model objects in function names

If your data model (or object model) already contains names of entities, reuse these in function names. If a table is named “Travel”, call the function “saveTravelRecord”, not just “saveDataRecord”.
Make consistent use of the same names, field names, properties and other entities found in the application. Using the same name for the same object all across the application may seem obvious, but somehow developers seem to find slightly different names for the same object again and again.
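The prefix rules and the reuse of model names can be sketched in a small class. Everything here is made up for illustration, including the class, its properties and the faked storage.

```php
<?php
// Hypothetical class for a "Travel" table; persistence is faked with a flag.
class TravelRecord
{
    private $destination = '';
    private $saved = false;

    // "get": returns data, never prints it.
    public function getDestination()
    {
        return $this->destination;
    }

    // "set": changes the object only; it does not save.
    public function setDestination($destination)
    {
        $this->destination = $destination;
        $this->saved = false;
    }

    // "print": writes to standard out.
    public function printDestination()
    {
        echo $this->destination, "\n";
    }

    // "save": persists the current properties (here just a flag).
    public function saveTravelRecord()
    {
        $this->saved = true;
    }

    public function isSaved()
    {
        return $this->saved;
    }
}

$travel = new TravelRecord();
$travel->setDestination('Rome');
$travel->printDestination(); // Rome
$travel->saveTravelRecord();
```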

While the three tips above may seem simple, do check your code and see in how many places they are broken. I’ve seen it countless times, and while getting it right from the beginning would have cost a small effort, refactoring the code years later is a much bigger effort.

Bread crumbs in version control

I’m sorry, but sometimes I really don’t get why even seasoned developers don’t learn the art of the commit message in a version control system. All too often I’ve come across check-ins where the entire commit message just reads “bugfix”, “change”, “oops” or something just as mindless.

The effort of writing a useful message compared to the potential benefit seems to be one of the best ratios – but of course the pay-back is usually some time away – too bad. Once you work on the same code for years – or even better, inherit code from others – you’ll quickly learn to appreciate anyone who spent more than 10 seconds on composing a thoughtful message for the future.

Here are 3 rules you should always, always obey when committing to a version control system.

Always leave a reference to the issue/bug tracking system.

All professional development uses some sort of issue tracking system to keep track of bugs, new features and other changes to the system. The issue tracking system should always be able to tell who asked for a change, why it was asked for and what considerations were made before the code change. By leaving a reference to the issue tracker, it’s often much easier to get “the big picture” if the code needs to be changed again years later. To make sure you get it in, just write “Bug #number#: “ as the initial part of the commit message.

Don’t write what, write why

Don’t write that it’s a bug fix – most people will know that from looking either at the code or in the issue tracking system (see point 1 above). Rather, write why it fixes the issue (“New check for missing parameters”, “Now handles an empty search result from the db correctly” – not just “bugfix”).

Keep it brief.

Log messages are not a place to store documentation, user guides or any other important information. You can assume it’s the future you (or another future fellow developer) who will look at the code and try to make sense of it. Think of this, when writing the message – it’s not for the project manager, it’s not for the end-users – it’s for a developer doing maintenance work on the code in the future.