How do you know good code?

One of the great challenges of PHP is that it’s so easy to learn that just about anyone can pick it up without much effort. While this is great for the number of PHP developers, it also seems to mean that there is a huge amount of bad example code out there. How, then, do you recognize good code?
In my book there are a few signs to judge by – and they may even apply more broadly than just PHP code.

First sign: It does the job

The code should be designed for the challenge at hand. Too often developers seem to apply the same style and energy to every challenge – no matter how much challenges differ in complexity and importance. If it’s a simple little function being developed, it shouldn’t require a huge framework.

If it deals with financial or personal data, it should probably use transactions and apply logging. Did the developer think about the challenge they set out to solve? If so, it’s a good sign.

Second sign: Well-structured

How does the source code look? Are there mile-long lines of code, or is it sanely formatted? Is the code broken into functions, classes or some other structure – or is it just page after page of source? I don’t need to see everything as classes or neat function libraries, but I do like it when the developer has made an effort to break an application into manageable pieces.

Third sign: Reasonable naming scheme and comments

How do the function names, class names and variable names look? Is it random garbage, or does it make sense? I really hate variables named $a through $z, and I hate functions named “doSomething” without anything more specific. I would expect great code to use the same naming conventions (CamelCasing, underscores and so on) across all functions and variables.

If strange – as in unnatural/unexpected – things happen in the source code, I would expect a (short) comment explaining what’s going on.

Fourth sign: Security and Contingency

Did the developer think about security? Is the code wide open to XSS attacks? Is input validated? Is “the unexpected” handled gracefully, or does the code explode when exposed to simple URL manipulation? Does the developer know what SQL injection is? If the code needs data from another source, what happens if that source isn’t available – does the code blow up or does it fail gracefully?
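As a minimal illustration of the kind of care I’m talking about – a hypothetical page that looks up an article by an id passed in the URL, with placeholder table and connection details – validating the input, using a prepared statement and escaping the output covers the most common holes:

<?php
// Hypothetical example: look up an article by the id given in the URL.

// Validate input: reject anything that isn't a positive integer.
$id = filter_input(INPUT_GET, 'id', FILTER_VALIDATE_INT);
if ($id === false || $id === null || $id < 1) {
    header('HTTP/1.1 400 Bad Request');
    exit('Invalid article id');
}

// Use a prepared statement - the id is never pasted into the SQL string,
// so SQL injection isn't possible here.
$pdo  = new PDO('mysql:host=localhost;dbname=example', 'user', 'secret');
$stmt = $pdo->prepare('SELECT title, body FROM articles WHERE id = ?');
$stmt->execute(array($id));
$article = $stmt->fetch(PDO::FETCH_ASSOC);

if ($article === false) {
    header('HTTP/1.1 404 Not Found');
    exit('Article not found');   // handle "the unexpected" gracefully
}

// Escape output to avoid XSS when echoing data that users can influence.
echo '<h1>' . htmlspecialchars($article['title']) . '</h1>';
echo '<div>' . nl2br(htmlspecialchars($article['body'])) . '</div>';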

How do you recognize good code?

Caching & Web Applications

One of the funny observations you make as a web developer: it’s amazing how many people consider caching bad by definition. Caching is an amazingly powerful tool that can provide cheap and efficient scaling to those who know how to use it.

Know when it’s okay to cache

If thousands of people see the same non-personalized frontpage of your website, do you run the 20+ database queries to build a fresh copy for each visitor, or do you just refresh a cached version from time to time?
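A minimal sketch of that idea in PHP – the cache file name, the ten-minute lifetime and the build function are just assumptions for illustration:

<?php
// Serve a cached copy of the frontpage if it's less than 10 minutes old,
// otherwise rebuild it and refresh the cache.
$cacheFile = '/tmp/frontpage.cache.html';
$maxAge    = 600; // seconds

if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
    readfile($cacheFile);   // cheap: no database queries at all
    exit;
}

// Build the page the expensive way (the 20+ queries live in here).
$html = build_frontpage();

// Save it for the next visitors, then send it to this one.
file_put_contents($cacheFile, $html);
echo $html;

function build_frontpage()
{
    // ... run the queries and assemble the HTML ...
    return '<html><body>Frontpage built at ' . date('r') . '</body></html>';
}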

Caching isn’t a binary thing

While a page on a website may not be cacheable as a whole, are your navigation, your header and footer – and other static parts of the site – cacheable? If new menu items are rare compared to the number of page views the pages receive, the “semi-static” parts of a site should absolutely be cached (server side).

Consider your caching options

Do make a caching strategy and see what works best for you. Is it a caching server in front of your website (such as Varnish), pre-built HTML files, shared memory, or blobs in the database serving pre-built pieces? The efficiency and the fit to the task may change from case to case, but knowing there is a range of options – and what the different options are good for – should be a required skill for any professional web developer.

Caching isn’t evil, it’s your friend – if you know how to use it efficiently.

Tip a friend – not so simple

Many sites such as news sites and other content providers offer a “tip a friend” option. With this you can mail a friend and tell them about an interesting piece of content you’ve found. The idea seems quite simple, and every site should have a tip option, right? No, wrong. While it may offer a convenience for some, it has several downsides.

First, if you – or your email provider – have implemented anti-spam techniques such as SPF records, the “tipping mail” will not be sent through your authorized list of mail servers and thus has a higher likelihood of being labeled as spam. Your mail may be sent, but you don’t know if it will arrive in your friend’s mailbox.

Second, having two (presumably) valid email addresses submitted to a site could be a goldmine for spammers. Besides mailing the tip, the email addresses may be collected and abused some time in the future.

Third, a site sending thousands and thousands of “tips” to friends mimics a spammer’s behavior. The function may cause your servers to be labeled as spam servers, and you may end up unable to send mail at all – neither tips nor the important messages your servers may need to send to you or other users.

If you want to have a “Tip a friend” function – make sure it’s a mailto-link.

The mailto link may not be as sexy as the options available when you create the mails server side, but the mail goes out from the user’s own mail client, and the likelihood of it being labeled as spam is far lower. It also gives the user complete control over what is being sent – no unexpected ads or other unwanted material.
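The tip link itself can then be as simple as a mailto URL built around the current page – something like this sketch, where the subject and body text are of course just placeholders:

<?php
// Build a "tip a friend" mailto link for the current page.
// The mail is composed and sent from the visitor's own mail client.
$url     = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
$subject = 'Interesting article';
$body    = "I thought you might like this:\n" . $url;

$mailto = 'mailto:?subject=' . rawurlencode($subject)
        . '&body=' . rawurlencode($body);

echo '<a href="' . htmlspecialchars($mailto) . '">Tip a friend</a>';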

MySQL metadata

If you’re a developer and use MySQL, I’m sure you’re aware that it’s a database and quite good at storing data, but one of the neat things about MySQL (and most other databases) is its ability to provide metadata on the contents of the database.

Most people know how to use the metadata queries on the command line, but you can also use them from your (PHP/Perl/some other) language. Here is a quick guide to some of them.

show databases

The show databases command provides a list of all databases available on the database server you’re accessing. It doesn’t tell you which of the databases you’re allowed to access.

Once a database is selected, you can see a list of tables with the command:

show tables

And with either “desc tablename” or the command

show columns from tablename

(replace “tablename” with an actual table name from the database), you can explore which columns and column definitions are available.
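Running these queries from PHP is no different from running any other query – here’s a rough sketch using mysqli, where the connection details and database name are obviously placeholders:

<?php
// List all tables in a database and the columns of each,
// using the metadata queries described above.
$db = new mysqli('localhost', 'user', 'secret', 'example');

$tables = $db->query('SHOW TABLES');
while ($tableRow = $tables->fetch_row()) {
    $table = $tableRow[0];
    echo "Table: $table\n";

    $columns = $db->query("SHOW COLUMNS FROM `$table`");
    while ($col = $columns->fetch_assoc()) {
        echo '  ' . $col['Field'] . ' (' . $col['Type'] . ")\n";
    }
}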

You probably rarely need these queries unless you’re writing a phpMyAdmin replacement – usually a script simply assumes which tables and columns exist.

If you’re developing an upgrade to an existing application/website/script and the update requires database changes, you can use these queries to check whether the database layout matches the one your application version needs. By doing this, you can provide much better feedback to the user on what’s wrong, instead of breaking horribly with database errors.
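As a sketch of that idea – the table and column names here are purely hypothetical – an upgrade script could check for a column it depends on before doing anything else:

<?php
// Before running the upgrade, verify that the expected database layout
// is in place and complain in plain words if it isn't.
$db = new mysqli('localhost', 'user', 'secret', 'example');

function column_exists(mysqli $db, $table, $column)
{
    $result = $db->query(
        "SHOW COLUMNS FROM `$table` LIKE '" . $db->real_escape_string($column) . "'"
    );
    return $result !== false && $result->num_rows > 0;
}

if (!column_exists($db, 'users', 'last_login')) {
    exit("The database layout is out of date: the 'users' table is missing " .
         "the 'last_login' column. Please run the schema upgrade first.\n");
}

// ... safe to continue with the rest of the upgrade ...
echo "Database layout looks fine, continuing.\n";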

Backups, WordPress & Gmail

Backups seem to be a constant pain for just about everyone. It’s something we know we should do, but somehow never get around to actually doing. Since switching to WordPress on this site, things have been different though.

One of my many installed WordPress plugins is the WordPress Backup plugin. It runs once a day, makes a complete backup of my WordPress database (with all these precious posts) and sends it in a mail to my Gmail account.

On my Gmail account I have a filter which catches these mails – it attaches a dedicated backup label and archives them (thus removing them from the inbox), leaving me with a backup of all the important data off site.

I have been checking the mailed files (that they actually are unzippable and restorable), and every once in a while I delete all backups more than a week old (though I don’t really need to, with all the space available on the Gmail account).

It’s so easy that there really isn’t any reason not to have a current backup of the site, right?

Website Traffic Tracking

Do you have a website? If so, please go to the place where you store the access logs and check how much disk space they use. With a website a few years old, you’re probably looking at gigabytes – and what exactly is the value of that?

Sure, keeping track of traffic levels is sort of interesting, but sometimes you need to balance the value provided against the space/resources required, and I’ve been slowly changing the way I use the access logs on this site.

Step 1: Don’t track the images

Do you really need to track which images are downloaded from the site, or would it be enough to know which pages are loaded? For my part, page impressions are enough intelligence on the site traffic, and with Apache it’s easy to disable image logging. The easy way to do it is by adding a parameter to your log configuration saying:

env=!object_is_image
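In practice that means marking image requests with an environment variable and then excluding them when logging. A rough sketch of how it could look in the Apache configuration – the file extensions and log path here are just examples:

# Mark requests for images so they can be filtered out of the log
SetEnvIf Request_URI "\.(gif|jpe?g|png|ico)$" object_is_image

# Log everything except the requests marked above
CustomLog /var/log/apache2/access.log combined env=!object_is_image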

Restart the webserver and the log file should be somewhat smaller from now on.

Step 2: Use Awstats

My next step was to use Awstats. It parses the raw access log data into a database file, which is significantly smaller than the raw files themselves. Awstats is a lot like other access log analyzing packages, but it seemed to be just a notch above the rest.

Step 3: Drop the access logs for long term intelligence

While access logs on the webserver may be the source for traffic intelligence, there are several options to track traffic through remote services.

Most of them are pretty good and if you’re interested in generic analytics, you should probably look at one of the many options available to do traffic tracking as a remote service.

Some of the options available include Google Analytics (which I use), StatCounter and several others. Isn’t it nice that someone else offers to keep all that historic data online – and in many cases absolutely free?

I still have access logs, but they’re used to (1) validate data from Google Analytics and (2) keep an eye on what’s happening on the site “now”. Any data more than a week or so old only exists at Google Analytics…

Letting others feed the web for you

I follow a ton of sites on the web, but rather than going for a morning surf through each and every one of them, I use an aggregator which checks the feeds from the websites and tells me where to go for news. I guess most people do this – using feeds to find updates and then visiting the site to check out the content. This way of tracking sites has changed one important thing on this website – the most popular file on the site is no longer the frontpage, nor is it a particularly popular page with a high Google ranking – it’s the feeds. Until recently almost 25% of all inbound traffic was hits to the main feed URL.

While I do appreciate the traffic, serving a feed is more a necessity/convenience than something adding value to the site itself – and wouldn’t it be quite nice if I could use the webserver resources for something better than letting aggregators know whether I’ve changed anything or not?

Well, guess what. I’m (almost) not wasting any server resources on feeds – FeedBurner handles that.

There really isn’t any magic in doing this – FeedBurner is pushing more than a million feeds – but there are three reasons why you should let FeedBurner (or another feed service) push your feeds:

  • By using FeedBurner, I’ve moved a lot of traffic away from this server, which means less bandwidth used and a lighter load on the server.
  • FeedBurner are presumably feed experts, and they probably ensure that the readers used by people tracking the site get the best possible feed.
  • Since FeedBurner Pro is free, I can even brand the feeds with my own domain name, so visitors don’t even know FeedBurner is serving the feeds. My main feed lives at http://feeds.netfactory.dk/netfactory

There are a few other cool benefits – FeedBurner offers statistics on feed usage and widgets I can use on the website – but the three points above should be enough to get most blogs and small websites to at least consider using FeedBurner.

Kubuntu 7.10

Just a few days before leaving for South Africa, the latest version of Ubuntu was released. I really didn’t have the nerve to try to upgrade before my vacation, but today was the day.

Ubuntu is an operating system – like Windows – but based on (Debian) Linux. It can probably do everything you need – and it’s free. With the packaging the Ubuntu team(s) have done on top of Linux, it’s a completely user-friendly and easy-to-use alternative for most computer users, and it has worked pretty well for me for quite some time.

The upgrade

While it probably is possible to do a distribution upgrade, I’ve been reinstalling from scratch when upgrading. It usually just requires all the contents of my home directory (and a few select configuration files from the /etc/ directory) to be zipped together in an archive. The archive was temporarily stored on a USB disk (about 600 MB in total) while the hard disk was completely wiped and formatted.

The entire install process was the smoothest experience I’ve witnessed so far, and took less than 30 minutes. The packed home directory was unzipped into a directory on the desktop, and the files and directories I knew I needed were moved back to the locations they were in before the reinstall.

The software updater was run and within an hour the machine was running the new version. So far it’s been an impressive upgrade. Screen drivers, printers and just about everything work. Amazing.

Scary Docs

Placing your documents online does require trust in the online service you choose to use. I usually have pretty solid trust in Google. They do, however, from time to time have glitches. After getting the message in the screenshot for an hour, I did start to get the chills, as the document was long and didn’t exist anywhere else. After an hour or so, it did however reappear. Phew.