MySQL metadata

If you’re a developer and use MySQL, I’m sure you’re aware that it’s a database and that it’s quite good at storing data, but one of the neat things about MySQL (and most other databases) is its ability to provide metadata about the contents of the database.

Most people know how to use the metadata queries on the command line, but if you want you can also use them from your (PHP/Perl/some-other) language. Here is a quick guide to some of them.

show databases

The show databases command provides a list of all databases available on the database server you’re accessing. It doesn’t tell you which of the databases you’re allowed to access.

Once a database is selected, you can see a list of tables with the command:

show tables

And with either "desc tablename" or the command

show columns from tablename

(replace "tablename" with an actual table name from the database) you can explore which columns and column definitions are available.
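
If you want to run the same queries from code, here is a minimal sketch using PHP’s mysqli extension (PHP being one of the languages mentioned above). The host, credentials and database name are placeholders, not anything from a real setup:

  <?php
  // Placeholder connection details - replace with your own.
  $db = new mysqli('localhost', 'username', 'password', 'mydatabase');
  if ($db->connect_error) {
      die('Connection failed: ' . $db->connect_error);
  }

  // List the tables in the selected database.
  $tables = $db->query('SHOW TABLES');
  while ($row = $tables->fetch_row()) {
      echo "Table: {$row[0]}\n";

      // List the columns (name and type) of each table.
      $columns = $db->query('SHOW COLUMNS FROM `' . $row[0] . '`');
      while ($col = $columns->fetch_assoc()) {
          echo "  {$col['Field']} ({$col['Type']})\n";
      }
  }

  $db->close();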

You probably rarely need to use these functions unless you’re writing a phpMyAdmin replacement – often a script simply assumes which tables and columns exist.

If you’re developing an upgrade to an existing application/website/script and the update requires database changes, you can use these functions to check whether the database layout matches the version your application needs. By doing this, you can provide much better feedback to the user on what’s wrong, instead of just breaking horribly with database errors.
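
As a sketch of that idea – assuming a hypothetical users table that should have gained an email_verified column in the new version – the check could look something like this in PHP:

  <?php
  // Hypothetical schema check before running code that needs a new column.
  $db = new mysqli('localhost', 'username', 'password', 'mydatabase');

  $result = $db->query("SHOW COLUMNS FROM `users` LIKE 'email_verified'");
  if ($result === false || $result->num_rows === 0) {
      // The table or the column is missing - tell the user what is wrong
      // instead of failing later with an obscure database error.
      die('Database layout is out of date: users.email_verified is missing. ' .
          'Please run the upgrade script before continuing.');
  }
  // The schema matches what this version expects - safe to continue.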

Backups, WordPress & Gmail

Backups seem to be a constant pain for just about everyone. It’s something we know we should do, but somehow never get around to actually doing. Since switching to WordPress on this site, things have been different though.

One of my many installed WordPress plugins is the WordPress Backup plugin. It runs once a day, makes a complete backup of my WordPress database (with all these precious posts) and sends it in a mail to my Gmail account.

On my Gmail account I have a filter which catches these mails – it attaches a dedicated backup label and archives them (thus removing them from the inbox), leaving me with a backup of all the important data off site.

I have been checking the mailed files (that they actually are unzippable and restorable), and every once in a while I delete all backups more than a week old (though I don’t really need to, with all the space available on the Gmail account).

It’s so easy that there really isn’t any reason not to have a current backup of the site, right?

Website Traffic Tracking

Do you have a website? If so, please go to the place you store the access logs and check how much disk space they use. With a website a few years old, you’re probably looking at gigabytes – and what exactly is the value of that?

Sure, keeping track of traffic levels is sort of interesting, but sometimes you need to balance the value provided against the space/resources required, and I’ve been slowly changing the way I use the access logs on this site.

Step 1: Don’t track the images

Do you really need to track which images are downloaded from the site, or would it be enough to know which pages are loaded? For my part, page impressions are enough intelligence on the site traffic, and with Apache it’s easy to disable image logging. The easy way to do it is by adding a parameter to your log configuration saying:

env=!object_is_image

Restart the webserver and the log file should be somewhat smaller from now on.
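
The env=!object_is_image part only works if something actually sets that environment variable, so the full configuration needs a SetEnvIf line as well. A sketch of the two directives together – the variable name, log path and list of file extensions are just examples, adjust them to your own setup:

  # Mark requests for image files so they can be excluded from the log.
  SetEnvIf Request_URI "\.(gif|jpe?g|png|ico)$" object_is_image
  # Only log requests that are not images.
  CustomLog /var/log/apache2/access.log combined env=!object_is_image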

Step 2: Use AWStats

My next step was to use AWStats. It parses the raw access log data into a database file, which is significantly smaller than the raw logs themselves. AWStats is a lot like other access log analyzing packages, but it seemed to be just a notch above the rest.

Step 3: Drop the access logs for long-term intelligence

While access logs on the webserver may be the traditional source of traffic intelligence, there are several options for tracking traffic through remote services.

Most of them are pretty good, and if you’re interested in generic analytics, you should probably look at one of the many services available for tracking traffic remotely.

Some of the options available include Google Analytics (which I use), StatCounter and several others. Isn’t it nice that someone else offers to keep all that historic data online – and in many cases absolutely free?

I still have access logs, but they’re used to (1) validate data from Google Analytics and (2) keep an eye on what’s happening on the site “now”. Any data more than a week old (or so) only exists at Google Analytics…

Letting others feed the web for you

I follow a ton of sites on the web. Rather than going for a morning surf through each and every one of them, I use an aggregator which checks the feeds from the websites and tells me where to go for news. I guess most people do this – using feeds to find updates and then visiting the sites to check out the content.

This way of tracking sites has changed one important thing on this website – the most popular file on the site is no longer the frontpage, nor is it a particularly popular page with a high Google ranking – it’s the feeds. Until recently almost 25% of all inbound traffic was hits to the main feed URL.

While I do appreciate the traffic, serving a feed is more a necessity/convenience than something that adds value to the site itself – and wouldn’t it be quite nice if I could use the webserver resources for something better than letting aggregators know whether I’ve changed anything or not?

Well, guess what. I’m (almost) not wasting any server resources on feeds – FeedBurner handles that.

There really isn’t anything magical about doing this – FeedBurner is pushing more than a million feeds – but there are three reasons why you should let FeedBurner (or another feed service) push your feeds:

  • By using FeedBurner, I’ve moved a lot of traffic away from this server, which means less bandwidth used and a lower load on the server.
  • FeedBurner are presumably feed experts, and they probably ensure that the readers used by people tracking the site get the best possible feed.
  • Since FeedBurner Pro is free, I can even brand the feeds with my own domain name, so that visitors don’t even know FeedBurner is serving the feeds. My main feed lives at http://feeds.netfactory.dk/netfactory

There are a few other cool benefits – FeedBurner offers statistics on feed usage and widgets I can use on the website – but the three points above should be enough to get most blogs and small websites to at least consider using FeedBurner.
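
If you’re wondering how the feed traffic actually ends up at FeedBurner: a WordPress plugin can redirect feed requests for you, or you can do it in the webserver configuration. A sketch using Apache’s mod_rewrite – the ^feed/?$ pattern is an assumption about the permalink setup, and the FeedBurner user agent is excluded so the service can still fetch the original feed:

  RewriteEngine On
  # Let FeedBurner itself fetch the original feed...
  RewriteCond %{HTTP_USER_AGENT} !FeedBurner
  # ...and send all other feed readers to the FeedBurner-served copy.
  RewriteRule ^feed/?$ http://feeds.netfactory.dk/netfactory [R=302,L]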

Kubuntu 7.10

Just a few days before leaving for South Africa, the latest version of Ubuntu was released. I really didn’t have the nerve to try an upgrade before my vacation, but today was the day.

Ubuntu is an operating system – like Windows – but based upon (Debian) Linux. It can probably do everything you need – and it’s free. With the packaging of Linux done by the Ubuntu team(s), it’s a complete, user-friendly and easy-to-use alternative for most computer users, and it has worked pretty well for me for quite some time.

The upgrade

While it probably is possible to do a distribution upgrade, I’ve been reinstalling from scratch when upgrading. It usually just requires all the contents of my home directory (and a few select configuration files from the /etc/ directory) to be zipped together in an archive. The archive was temporarily stored on a USB disk (about 600 MB in total), while the hard disk was completely wiped and formatted.

The entire install process was the smoothest experience I’ve witnessed so far, and took less than 30 minutes. The packed home directory was unzipped into a directory on the desktop, and the files and directories I knew I needed were moved to the locations they were placed in before the reinstall.

The software updater was run and within an hour the machine was running the new version. So far it’s been an impressive upgrade. Screen drivers, printers and just about everything work. Amazing.