Archive for April, 2009

HTML: The Difference Between id and class

Thursday, April 30th, 2009

In HTML markup, an id is an attribute you can graft onto an element’s markup tag, one that identifies that element with a word:

<p id="sampletest">This is the sample test</p>

The W3C defines id as the “document-wide unique id.” You can use the name of an id only once, and in a single type of tag, on a Web page. Ryan Fiat writes, “Simply put, an ID is like the sticker that says ‘you are here’ on a mall directory, and there can only ever be one of those.”
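To make the “only once per page” rule concrete, here is a minimal sketch in Python (my choice of language, not anything the W3C supplies) that lists any id values used more than once in a chunk of markup. The names IdCollector and duplicate_ids are made up for this illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects every id attribute value seen while parsing a page."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html_text):
    """Return a sorted list of id values that appear more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)
```

An empty list back from duplicate_ids() means every id on the page is unique, which is what the rule asks for.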

Very much like the id, a class can also be grafted onto an element, in a very similar fashion:

<p class="sampletest">This is the sample test</p>

The class attribute specifies a classname for an element, says the W3C. You can add your own class to most elements (except base, head, html, meta, param, script, style, and title).

A class can be used to give a name to a particular part of a web page, e.g. <p class="preamble">. This can be useful for manipulating that part of the HTML document, through CSS (to stylize that section), or JavaScript (to process the content of that section).

Unlike an id, a class can be used across multiple tags. For example, you could have both of these elements in a document:

<h3 class="legalese">title</h3>
<div class="legalese">content</div>

In your CSS, you could target only one of these elements (e.g. “div.legalese”), or define “.legalese” on its own, in which case the rule applies in both instances. (Click here for working example. View source to see how it works.)

ALSO, another cool thing about classes is that more than one can be used in a single element. They are separated by spaces within the quotes:

<p class="legalese preamble">content</p>

In this case, if you have CSS stylesheet definitions for both legalese and preamble, then they would both apply to the content above.

For instance, you could put these rules in your CSS style sheet:

.preamble { font-weight: bold; }
.legalese { font-size: 325%; }

And any content marked with these classes (e.g. <div class="preamble legalese">) would be stylized accordingly (Click here for working example. View source to see how it works).
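As a rough mental model of how several classes on one element add up, here is a small Python sketch. It is a simplification under stated assumptions: the rules dictionary and the combined_style() function are hypothetical, and a real browser resolves conflicts by specificity and source order rather than by the order of names in the class attribute.

```python
def combined_style(class_attribute, rules):
    """Merge the declarations of every class named in a class attribute."""
    style = {}
    # Class names are separated by whitespace inside the quotes
    for name in class_attribute.split():
        style.update(rules.get(name, {}))
    return style


# Hypothetical stand-in for the two style sheet rules
rules = {
    "preamble": {"font-weight": "bold"},
    "legalese": {"font-size": "325%"},
}
```

Calling combined_style("preamble legalese", rules) picks up both the bold weight and the oversized font, while combined_style("preamble", rules) picks up only the first.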


One nice thing you can do with an id that you can’t do with a class: Internal navigation. Anchor linking, it’s called. A browser can use an id as a sort of internal navigation by attributing it to a markup tag of some sort:

<a href="#content">This will take you over yonder</a>

<p>hilldale</p>

<p>hilldale</p>

<p>hilldale</p>

<p>hilldale</p>

<p>hilldale</p>

<p id="content">Yonder!!</p>

(Click here for working example. View source to see how it works).

–Joab Jackson





XHTML: Upgrading from HTML to XHTML

Saturday, April 25th, 2009

You can think of XHTML as a proper version of HTML. Or strict version, as the W3C likes to put it. XHTML is HTML based on XML, with the “<html>” tag as the root.

You can’t get away with some of the lazy tricks that browsers overlook with HTML. But the upside, besides being in standards compliance which is always helpful down the road, is that you are preparing your pages for machine readability. Since it is XML, XHTML can be parsed. More on that in later posts.

Surprisingly, it does not take a lot of work to make your HTML documents all XHTML-y, especially if you already practice good markup. Six easy steps are all you’ll need. Here is what to do:

1. The page must have a DOCTYPE declaration: This gives an XML parser something to validate against. The declaration goes at the very top of the file, before the opening <html> tag (not inside the head). When I open a new XHTML file in the NetBeans editor, this is the declaration I get, and it will work for any XHTML 1.0 Strict file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

2. Code must be well-formed. This simply means that closing tags must appear in the exact reverse order of their opening tags. This is simply proper nesting:

<b><i><u>This is not well formed mark-up.</b></u></i>

<b><i><u>This is properly-formed mark-up.</u></i></b>

3. All elements and attributes must be lower-cased: So <B> is wrong while <b> is correct. HTML itself is not case-sensitive, though XML is.

All attribute values must be in quotes: So now, when you add a link, or any other attribute, you must put its value in quotes. There will be no more of this:

<a href=http://www.joabj.com target=_blank>Joab</a>

Instead you must do thusly:

<a href="http://www.joabj.com" target="_blank">Joab</a>

Oh, if you put a quote at the beginning of an attribute value, be sure to put the closing quote at the end of it. Seems obvious, but nothing befuddles a browser faster than half a quote.

4. Empty elements must be closed: No more just putting down a <br> and leaving it at that. Now, you write <br />. This goes for <hr>, <img>, <param>, <meta>, <input> and <col> too!

5. All elements must have end-tags: This seems obvious, except in cases where it is not. Take the list-item tag, for instance. If you put a <li> at the beginning of a list item, you must now close it with a </li>. This also applies to <body>, <html>, <option>, <p>, <head>, <tr> and, well, all the other tags too.

6. All attributes must have defined values: I didn’t know you could use attributes without values, but I guess in certain cases you could.

One example is the “compact” attribute for the unordered list “<ul>” element. Under plain old HTML you could just write “<ul compact>”. Now, though, you should spell out what it really means, namely “<ul compact="compact">”.

And believe it or not, that is about all there is to it. Please keep in mind that this is XHTML version 1.0. Version 1.1 of XHTML is stricter still, but for now, most Web developers shouldn’t have to worry about it.

Also, if you want to ensure that a Web page is coded in the correct XHTML form, you can check it at the W3C’s Validation Service.
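For a quick local check before going to the validator, here is a minimal Python sketch that only tests whether a fragment is well-formed XML (proper nesting, closed empty elements, quoted attribute values). It does not check the DOCTYPE, the lower-case rule, or anything else the DTD demands, and the function name is_well_formed is my own invention.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as a single well-formed XML element."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # Raised for bad nesting, unclosed tags, unquoted attributes, etc.
        return False
    return True
```

Fed the two nesting samples from step 2, it rejects the first and accepts the second. It likewise rejects a bare <br> inside a paragraph and accepts <br />.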

For learning more on XHTML, check out the W3C’s fine tutorial. Material was also taken from “HTML Programmer’s Reference, 2nd Edition.”

–Joab Jackson






Unix: The Basic Mechanics of File Permissions

Wednesday, April 22nd, 2009

Unix is a multi-user system. As such, every process that runs and every file that is stored must have an owner, or user-account. Conversely, each time a user tries to interact with a program or file, Unix checks to see if the user has permission before letting him/her proceed with the action.

The owner of currently running programs can be checked through the ps command. At the command prompt type “ps -aux” and you’ll get a list of programs currently running. The last two entries may look something like:

henry 32186 0.7 0.7 5604 3020 pts/0 Rs 06:58 0:00 -bash
henry 32202 0.0 0.2 2644 1012 pts/0 R+ 06:58 0:00 ps -aux

The last two actions carried out were done by user “henry”–namely opening the shell (-bash) when logging in (an automatic procedure; the shell provides the command line), and the running of “ps -aux” itself.

For files and directories, user permissions can be found by typing in the list command, with the option to show details (“ls -l”) at command prompt. You should get something like this:

-rw-r--r-- 1 henry henry 6 2009-03-29 22:10 test.txt
-rwxr--r-- 1 henry henry 32 2009-03-29 22:15 text.txt

In this listing, we see the information for two files (“test.txt” and “text.txt”), one on each line. The user permissions are on the left (the series of dashes and letters, or flags). After the link count (the “1”) comes the file owner (“henry”) and the name of the group the file belongs to (more on that later, maybe). The size of the file and the time it was last modified are also included in the listing.

Deciphering the Permission Set

Each one of the 10 flags (“drwxrwxrwx”) designates whether or not a given party has a specific permission to do something with the file. The rest of this section will break down what each permission means.

To understand the full set of permissions, break them into four subsets, reading left to right:

Position 1: This indicates whether or not the file is a directory (if it is, then there is a “d”–if it is not a directory, then “-”).

Positions 2-4: This is the set of permissions allotted to the owner of the file.

Positions 5-7: This is the set of permissions allotted to the group that owns the file.

Positions 8-10: These are the permissions for everyone else who is neither the owner of the file nor a member of the group that owns the file (“Others”).

In recap, reading left to right (after the directory key), you are reading the read-write-execute permissions for owner-group-other. Summarily, the permission set runs from lesser to greater degrees of control of a file, and from specific to more general possible users of the file.

Each of these three sets of letters comes in the same format. Reading each block of three left to right, you could see, in this order:

r: The right to read the file.

w: The right to write to the file, meaning to make changes to the file.

x: The right to execute the file. If the file consists of code that can be executed by the machine, and if the “x” is present, then the individual can task the computer with executing the code within the file (or, rather, the file is the program).

If the letter is present in the designated spot, then that permission is granted. If a blank (“-”) is in its place, then there is no permission.

As an example, if a file has the permissions:

-rwxrw-r--

This means the owner of the file can read, write, or execute the file. The group can read and write to the file, but not execute it. And everyone else can read the file, but not write to it or execute it.
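The same decoding can be sketched in a few lines of Python. This is only an illustration of the reading order described above: describe_permissions is a made-up name, and it treats anything that is not a “d” in the first position as a plain file, ignoring links and the special flags real systems also use.

```python
def describe_permissions(flags):
    """Break a 10-character flag string such as '-rwxrw-r--' into its parts."""
    if len(flags) != 10:
        raise ValueError("expected 10 flags, got " + repr(flags))
    names = {"r": "read", "w": "write", "x": "execute"}
    result = {"type": "directory" if flags[0] == "d" else "file"}
    # Positions 2-4, 5-7 and 8-10 belong to owner, group and others in turn
    for index, who in enumerate(("owner", "group", "others")):
        block = flags[1 + 3 * index : 4 + 3 * index]
        result[who] = [names[letter] for letter in block if letter in names]
    return result
```

For “-rwxrw-r--” this reports a file whose owner can read, write and execute, whose group can read and write, and which others can only read.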


To change the permissions of a file, use the chmod command on the command line. chmod is an abbreviation for “change mode.”

The basic format for chmod is this:

chmod [Changes to be made] [file]

For simplicity, I’m leaving out the ability to designate options and to concatenate the commands. See the manual page for more details.

In the “Changes to be made” space above, format the changes this way:

[who the changes will apply to] [The action to be carried out] [The new permissions]

Who the changes will apply to will be one of four groups:

u: The owner of the file.
g: Other users in the file’s group.
o: All other users.
a: Everyone (u and g and o)

Note that “other” users is not quite the same as all “users.” It does not incorporate u or g. Also, remember “o” does NOT stand for “owner.”

The second part of the statement, [The action to be carried out], will be either a “+” or “-” . “+” means you are adding these permissions, while “-” means you are removing them.

The third part of the statement is the set of permissions being changed. As above, they can be read (“r”), write (“w”) or execute (“x”).

Putting this all together in an example, say I want to add permission for others to write to a file. I would type this in at the command line:

chmod o+w [file to be changed]

Or to remove the permission for the group to execute a file:

chmod g-x [file to be changed]

I can add multiple permissions in one change order. For instance, say I want to add read and execute permissions for the owner of the file:

chmod u+rx [file to be changed]
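To see what a change order does to the flags, here is a minimal Python sketch of the who/action/permissions format. It is an illustration rather than how chmod itself is written: apply_change is a hypothetical helper that handles only one simple clause (no commas, no “=”, no empty “who”), and it works on the numeric form in which the system stores the flags.

```python
# Bit positions of the three blocks, and the value of each letter in a block
SHIFT = {"u": 6, "g": 3, "o": 0}
VALUE = {"r": 4, "w": 2, "x": 1}


def apply_change(mode, change):
    """Apply one simple change order such as 'u+rx' or 'g-x' to a mode."""
    if "+" in change:
        action = "+"
    elif "-" in change:
        action = "-"
    else:
        raise ValueError("expected + or - in " + repr(change))
    who, perms = change.split(action, 1)
    if who == "a":
        who = "ugo"  # "a" is shorthand for all three parties
    mask = 0
    for party in who:
        for perm in perms:
            mask |= VALUE[perm] << SHIFT[party]
    if action == "+":
        return mode | mask
    return mode & ~mask
```

Starting from a file with -rw-r--r-- (0o644 in numbers), apply_change(0o644, "u+rx") gives 0o744, which reads back as -rwxr--r--.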

For lovers of numeric abstraction and/or being closer to the metal, there is also a way to change permissions using numbers. I’ll get to that approach (the octal approach) later, in a separate entry. Maybe. If I need to. In the meantime, read about it in the manual page.

While I won’t delve into the details, I did want to point out one option, for recursion. This is the -R flag:

chmod -R u+x *

The above command grants the owner execute permission for all the files, not only in the working directory, but in any subdirectories under it. (Also, wildcards (*) do work with chmod, but be very sure about what you are changing before you hit that return key.)

chmod never changes the permissions of symbolic links. Those permissions are effectively the same as those of the file the link points to. Symbolic links are another topic.

This post just covers the mechanics, and the basic ones at that. Of course, there are a lot of implications that need to be articulated. Getting user permissions right is a matter of balancing security and ease of use: Granting permissions to everyone on an Internet-connected machine will ensure your system will be hacked. But keeping them too tight will cause the user aggravation and may hinder programs from working. I’ll explore these topics in future posts.

Taken from various tutorials, Dartmouth Tutorial, and Unix in a Nutshell

–Joab Jackson






Project: Roll Your Own URL Shortener

Sunday, April 19th, 2009

Like many other folks, I started using Twitter, and soon thereafter, found the value in URL shorteners. You need to keep each message to 140 characters or less, and if you are passing along the Web link, sometimes that link can take up most of the allotted space.

A URL shortener basically assigns a Web address with as few characters as possible to a longer address, so that when you enter the shorter address into the browser, a server automatically redirects the page requester to the original, longer address.

While there are plenty of free Web sites that offer this service, there are reasons for deploying your own, if you have the gumption and a Web server at your disposal.

For one, how long will these services last? They don’t seem to have business plans. So keeping your own file of URLs assures that the shortened links won’t be recycled (unless you want them to be), and that they’ll be up as long as you want them to be.

Another reason for rolling your own is pure vanity (media companies should take note). Like personalized license plates, a shortened URL can say what you want. It can also advertise the domain name of the owner.


This is how I built my own URL shortener.

Please note that the code I’ll present has some serious limitations. I would strongly advise not offering it as a public service, at least not without more security measures in place (filtering what is placed in the Web input, for instance, to prevent cross-site scripting).

I set it up to use as a private service. In other words, place it in some hard-to-find cranny of your Web server, inaccessible from the spidering probes of the search engine. Even as a private service, you should consider putting some authentication code in place.

This is bare-bones code, showing how a URL redirect works, on a Unix box, using a Web front-end. It’s so crude, you even have to provide your own shortened URL.

Here is what you need to do:

To set up a URL shortening service, at its most basic, you need to set up three files. I’ll explain each in detail.

One is a Web page that users can use to submit a long link and its short link (to keep things simple for me as a non-programmer, I’ll ask the user to create the short link names, rather than have them automatically generated).

The second file is a text file on the Unix server listing all the short link names, alongside the original long URL addresses that they will redirect to. It is called the .htaccess file and it is for the Web server, so when the request for the short link comes in, it will redirect the browser to the original (long) link.

The third file, a PHP Web page, connects the two above-named files, so that when a user hits “submit” on the HTML page, the information is sent to the PHP page, which has code to append the new link information to the .htaccess file.

And that’s about it.

Now the details:
STEP 1: Set up a directory on your Web server for the shortening service: For the sake of keeping things tidy, I set up a specific directory on my Web server, just to keep the URL shortening service. It is immediately under the root directory, and is called “A” (“xttp://www.joabj.com/A”). This directory itself has the same permissions as other publicly-accessible directories on the server (drwxr-xr-x). It will contain only the three files needed for this service, plus an obscuring index page so others can’t snoop on the directory’s contents.

STEP 2: Set up the .htaccess file: Most of the decent URL shorteners use the HTTP 301 (“moved permanently”) redirect. To use a 301 redirect with Apache, you set up a file called .htaccess in one of the Web server’s directories. It should be in a directory that the Web server can read and that can be written to by the PHP software.

Setting up an .htaccess file takes a number of steps. I’ve covered them here. Follow these steps and come back when you’re finished.

The permissions of the .htaccess file should be set so that anyone can read it and write to it (-rw-rw-rw-). Security-wise, this sucks. Hiding it in a publicly-accessible though unlinked part of the Web server will keep it from snoopers, though the security-through-obscurity approach is not a good long-term solution. But for the purposes of this instruction, it is the easiest path. You’ve been warned though.

For more information on changing permissions on a Unix box, go here you fool!

STEP 3: Set up the landing HTML page: So the idea is to set up a basic html page that you can bookmark and go to when you want to shorten a URL. It should have a short and easy-to-remember title so you can get to it when you are on the road. But you should NOT link to it from any other page on your site that is crawled by human or search engine spider. Warning: Security through obscurity again.

Anyway, at its most basic, this page should include three things. It should have two fields for the user to enter the original URL and a short URL that they make up. The page should also have a button that can be pushed to kick off the whole operation, once the values are filled in.

This would be the active code for such a page:

<FORM ACTION="Shorten.php" METHOD="get">
The link to be shortened: <INPUT TYPE="Text" NAME="Link" />
Shorty nickname: <INPUT TYPE="Text" NAME="Shorty" />
<INPUT TYPE="SUBMIT" VALUE="GO" />
</FORM>

To see this code in an actual working html page, go here. To make it operational, change the suffix of the file name from .txt to .html .

O.k., some explanation of what is going on here. Using the W3C standards for creating Web forms, we’ve given the user two fields to fill in. The content filled into the “Link” field will be assigned to the variable “Link” and the content filled into the short field will be assigned to the variable “Shorty.”

Note that we are asking the user to fill the original link address into the “Link” field and the shortened link name in the “Shorty” field. Also on the page is the code:

FORM ACTION="Shorten.php" METHOD="get"

and

INPUT TYPE="SUBMIT" VALUE="GO"

This basically instructs the browser to fetch the Shorten.php page and feed it the contents of the “Link” and “Shorty” variables, when the SUBMIT button is pushed.

Next we create the Shorten.php page.

STEP 4: Create the PHP page: Next you have to create the page that the HTML page is sending its information to. This page will format and insert the data into the .htaccess file in such a way that it can be read by the Web server, capiche?

Here is some background material on getting started on PHP. Here is a bit on how to pass information from Web forms to a PHP page. And here are some pages on how PHP can open a disk file and append data to that text file on a Web server. Familiarize yourself with all of these pages, please. Mash them together and you’d get code like this:

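// Values handed over by the HTML form
$Link = $_GET["Link"];
$Shorty = $_GET["Shorty"];

// Pieces of one .htaccess line: redirect 301 /A/[short name] [long URL]
$Preamble = "redirect 301 ";
$Space = " ";
$Directory = "/A/";
$NewLine = "\n";
$All = $Preamble.$Directory.$Shorty.$Space.$Link.$NewLine;

// Append the new line to the .htaccess file
$Name = ".htaccess";
$Handle = fopen($Name, 'a');
fwrite($Handle, $All);
fclose($Handle);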

To see this code in an actual working html page, go here. To make it operational, change the suffix of the file name from .txt to .php .

So what is going on here? What we need to do is take the information given to this page from the HTML page (“$Link = $_GET["Link"];” and “$Shorty = $_GET["Shorty"];”) and format it in the appropriate way for an .htaccess file (“redirect 301 [new short address] [original address]”).

To do this, PHP has to make a one-line string. You can concatenate multiple PHP variables with the “.” operator. So we create variables for the additional formatting we have to do. “$Preamble” is the first statement needed on the .htaccess line (“redirect 301”). “$Directory” is the directory (in this case “A”) the new address will appear to be in (so the user doesn’t have to type it in, as a prefix). “$Space” adds the space needed between the two addresses, and “$NewLine” (“\n”) tells Unix to start a new line after this string is entered.

Finally, $All assembles all these variables together in the order of a proper 301 redirect request.
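The same assembly can be sketched in Python, which makes it easy to try out the string-building on its own. This is an illustration, not the code running on my server: redirect_line is a made-up function, and the whitespace check is an extra safety measure I am assuming here, since a stray space or line break in either value would corrupt the .htaccess file.

```python
def redirect_line(shorty, link, directory="/A/"):
    """Build one .htaccess line: redirect 301 /A/[short name] [long URL]."""
    for value in (shorty, link):
        # Hypothetical safety check, not part of the PHP page
        if not value or any(ch.isspace() for ch in value):
            raise ValueError("values must be non-empty and contain no whitespace")
    return "redirect 301 " + directory + shorty + " " + link + "\n"
```

So redirect_line("0", "http://www.example.com/a/long/page.html") produces the single line “redirect 301 /A/0 http://www.example.com/a/long/page.html” followed by a line break (the example.com address is just a placeholder).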

The page then opens the .htaccess file, appends the new request, and closes the file. After the file is appended, when the short link is typed into the browser as part of the full address (i.e. “http://www.joabj.com/A/0”), your Web server should automatically send the viewer to the page you indicated.

That’s it. Ezy pezy, yes?


Again, this is just the bare bones code, to show you how it works.

There are some easy things you can do to pretty up the service, by adding to the HTML portion of these pages: You need to set up error messages, to tell the user when they fill in the boxes incorrectly. You may want to place a Twitter submission box on the results page, so you can submit your newly-christened short link directly to the microblogging service. Or you could post a link to try the new short URL. Or show the URL to the last link, so you know where you left off. You could even insert a generator of short addresses, taking the manual naming of the address out of the process.

Heck, this code doesn’t even offer the ability to tell the user that the short link submitted has already been used!

A word on naming the short links: As you can tell, the user has to supply their own short links, which become live as soon as they are entered. While this offers a way to customize Web addresses (“http://www.yourname.com/ThisStorySucks.html”), if you want to make them as short as possible, you should use as few characters as possible. And long-term use requires a few heuristics, as they say.

Myself, I am starting by running through all the 1-character options (0-9 a-z, for a total of 36 links) in order (“0” then “1” then “2” and so on). When they are exhausted, I’ll go through all the 2-character options (“01” then “02” and so on). This will provide a total of 1,296 links (36*36), and then, all the three-character options (36*36*36 = 46,656 links), and so on.
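If you would rather have the names handed to you than keep track by hand, the scheme can be sketched as a Python generator. Treat it as one possible way to do it: ALPHABET, short_names and count_names are names I made up, and this version starts each run at the all-zeros name (“00”, “000”).

```python
import string
from itertools import islice, product

# The 36 characters used for short names: 0-9 followed by a-z
ALPHABET = string.digits + string.ascii_lowercase


def count_names(length):
    """How many distinct names exist with exactly this many characters."""
    return len(ALPHABET) ** length


def short_names():
    """Yield every 1-character name in order, then every 2-character name, etc."""
    length = 1
    while True:
        for combo in product(ALPHABET, repeat=length):
            yield "".join(combo)
        length += 1
```

Here list(islice(short_names(), 3)) gives the first three names, “0”, “1” and “2”, while count_names(2) and count_names(3) reproduce the 1,296 and 46,656 figures.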

In my lifetime, I probably won’t use up all the three-character URL combinations. So my URLs will, at the most, run only 20 characters in length (i.e. “http://joabj.com/z99”), thanks to my relatively short domain name. This is the same length as the shortened links that Bit.Ly currently offers. Yay! Brevity! –Joab Jackson






Automatically Reposting Delicious Tags to Identica and Twitter

Saturday, April 18th, 2009

I wanted the bookmarks that I added on to my Delicious account to also be automatically reposted on to my Identica and Twitter accounts. This is how I learned to do that.

I didn’t work up the solution myself: This chap figured it out. But I’ll recap the basics here, because I’ll forget how to do it, and forget where to find the link.

1. Grab the RSS Feed from your Delicious account: Look at the bottom of your Delicious homepage (http://www.delicio.us/[YOURACCOUNTNAME]) and find the “RSS Feeds” link. Click on that. Copy the resulting address from the browser.

2. Sign up for an account on Twitterfeed. Once signed in, you will be given an opportunity to create a new feed for the Twitter, Identica, Ping or Hellotxt microblogging services. After authenticating through your microblogging service of choice, paste in your Delicious feed link.

And that is basically it.

When a new Delicious bookmark is added onto the RSS feed, Twitterfeed picks it up the next time it checks the feed, and reposts it to the microblogging service of your choice. Specifically, it uses the information in the Delicious “Title” field and the link. It automatically deploys a link shortening service for long URLs.

It doesn’t post links added before you turned on the feed. It checks for new entries based on the date or on the GUID (the unique identifier attached to each feed entry).

In the settings, you can prefix the reposts with a brief tag, up to 20 characters. For instance, I used “Bookmark:” Note, however, we only have 140 characters to work with. So given that the small URL will be about 30 characters, and “Bookmark:” is itself 9 characters, I’ll need to keep the Delicious “Title” field to about 100 characters or less.

You can adjust how often you want Twitterfeed to check for updates. It doesn’t seem to pull Delicious bookmarks entered prior to signing up.

Note: For Identica, you have to choose Laconica on the Twitterfeed range of microblogging site options, as Identica is one implementation of Laconica.

–Joab Jackson






Unix: Simple String Replacement

Tuesday, April 14th, 2009

Say you have a bunch of text files in a directory, and you need to change some text in all of them. This is where the replace command comes in handy. (It is a small utility that ships with MySQL, so it may not be present on every system.)

The basic form of “replace” on the command line is this:

#replace [TextToBeReplaced] [ReplacementString] -- [name of the file]

(Note for all these blog pages, you fill your own values in the brackets “[ ]“).

The “--” tells the command that the list of replacement strings is finished and the next dab of letters is the file that the replacement operation is to be performed upon. Keep a space on either side of “--”.

You can replace multiple strings in one command, i.e.

#replace [OldString1] [NewString1] [OldString2] [NewString2] -- [name of the file]

You can change multiple files in one action, by simply naming all the files, each separated by a space (i.e. “file1.txt file2.txt”).

You also can use a wild card (*) as part (“*.html”), or all (“*”), of the designation for the files you want to be modified, but be careful not to inadvertently replace strings you didn’t intend to replace, especially in cases where the replaced strings are short.
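If the replace utility is not on your machine, the same idea can be sketched in a few lines of Python. This is a stand-in under stated assumptions, not a copy of the utility: replace_in_files is a made-up name, and it does a plain substring swap on whole files using the system’s default text encoding.

```python
from pathlib import Path


def replace_in_files(paths, old, new):
    """Swap old for new in each file; return the names of the files changed."""
    changed = []
    for path in map(Path, paths):
        text = path.read_text()
        if old in text:
            path.write_text(text.replace(old, new))
            changed.append(path.name)
    return changed
```

The list it returns is handy as a sanity check: if a short string shows up in more files than you expected, that is the moment to stop and look before running it on anything important.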


You can also do simple search and replace with Perl, if you have Perl installed on your Linux system. At the command line, the basic format is:

perl -p -i -e 's/[Text2bReplaced]/[ReplacementText]/g' [Name of File]

There are some escape codes to deal with, for complicated strings. If you are replacing more than one line, add “\n” at the end of every line. Put a “\” in front of every single (‘) and double (“) quote.


Lastly, I’m not sure where else to put this, so I’ll tuck it in here and let the search engines find it. If you want to convert DOS files to Unix ones on a Unix machine, the Perl command is this:

perl -i -pe 's/\r//g' [FilesYouWantConverted]
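For the curious, here is what that substitution amounts to, sketched in Python on raw bytes. The function name dos_to_unix is mine, and like the Perl one-liner it simply drops every carriage return it finds, wherever it sits in the file.

```python
def dos_to_unix(data):
    """Remove every carriage return (\\r) from a bytes object."""
    return data.replace(b"\r", b"")
```

A DOS-style file ends each line with a carriage return plus a line feed, so b"one\r\ntwo\r\n" comes back as b"one\ntwo\n".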

–Joab Jackson






PHP: Writing to a disk file

Friday, April 10th, 2009

Now that we have learned how to create and open a disk file with PHP, it’s time to write something to it.

Again referring to this tutorial, we insert two lines into the previous set of code:

$Name = "FileToAddStuffTo.txt";
$Handle = fopen($Name, 'w');

$TextToAdd = "HelloWorld\n";
fwrite($Handle, $TextToAdd);

fclose($Handle);

The new part is here:

$TextToAdd = "HelloWorld\n";
fwrite($Handle, $TextToAdd);

The PHP “fwrite” command is key here. The “fopen” line has already opened the file “FileToAddStuffTo.txt” (whose name is assigned to the variable $Name) and stored the open file handle in the variable $Handle. With “fwrite,” you are instructing PHP to write the contents of the variable $TextToAdd into the file behind that handle.

(For the full working code, click here. To get the code to run in a PHP environment, change the “txt” suffix to “php”.)


Running the above code, you will just overwrite what was previously there. Appending to a file requires a different flag. Instead of this:

$Handle = fopen($Name, 'w');

you’d write this:

$Handle = fopen($Name, 'a');

Also, be sure to add a new line break at the end of the string (“\n”):

$TextToAdd = "HelloWorld\n";
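The difference between the two flags is easy to see by experiment. Here is a small Python sketch of that experiment (Python’s open() uses the same “w” and “a” letters as PHP’s fopen); write_twice is a made-up helper that works on a throwaway temporary file, and “HelloAgain” is just a second sample string.

```python
import os
import tempfile


def write_twice(mode):
    """Open a scratch file twice with the given mode, writing one line each time."""
    handle, path = tempfile.mkstemp(suffix=".txt")
    os.close(handle)
    try:
        for text in ("HelloWorld\n", "HelloAgain\n"):
            with open(path, mode) as f:
                f.write(text)
        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)  # clean up the scratch file
```

With “w”, only the second line survives, because each open wipes the file. With “a”, both lines are there, one after the other.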

For this sample, I used PHP 5.2.4

–Joab Jackson






PHP: Creating a file on disk

Tuesday, April 7th, 2009

The PHP command for both creating and opening a file is “fopen” … Like typical Unix file programs, if it doesn’t see a file called “x” it will create a file called x.

How to create a disk file using PHP? This tutorial advises us to add these three lines to a PHP skeleton file:

$Name = "ThisFileWasCreatedByPHP.txt";
$Handle = fopen($Name, 'w');
fclose($Handle);

(For the full working code, click here. To get the code to run in a PHP environment, change the “txt” suffix to “php”.)

All you do to execute this action of creating a file is to call up this page with a browser. The page should have a suffix of .php (i.e. “01-CreateFile.txt.php”) and you should have PHP working on your server.

NOTE: For this to work, the administrator must give “all” users permission to read, write and execute in the directory this file is in. Sucks, I know. In other words, don’t use this in a directory with any valuable info. (At the command line, type “chmod a+rwx [Name of Directory]”.)

In the above code, the tutorial tutors us, the first line creates the name of the file (“ThisFileWasCreatedByPHP.txt”) and assigns it to a variable ($Name).

The second line instructs PHP to open a file for writing (“w”), or if one doesn’t exist, create that file, with the “fopen” command, giving it the name held in the variable $Name (which in this case happens to be “ThisFileWasCreatedByPHP.txt”). The third line closes the file.

For this sample, I used PHP 5.2.4

–Joab Jackson






Apache: Redirecting Web page requests

Sunday, April 5th, 2009

If you move a Web page on your site to another location, or give it another address, there are a number of ways you can have the Apache Web server software automatically redirect browser requests that come in for the page to the new location.

The easiest way is to put a page at the old address that automatically directs the browser to the new location, i.e.:

<html>
<head>
<meta http-equiv="refresh" content="0;url=http://www.TheNewAddress.com">
</head>
</html>

In the above page, the meta tag redirects the browser to the new location (here, it is “http://www.TheNewAddress.com”), with a delay of 0 seconds (“0”). The user just sees the page at its new location.


This process of setting up a new page for each updated address is a bit cumbersome though. Far better would be to put all the old addresses and their new replacements in a single file, which Apache could check every time a new request for a page comes in.

Fortunately, HTTP has something called the 301 status code, which is basically a permanent change-of-address notification.

For Apache, doing a 301 redirect involves setting up a blank .htaccess file (or appending to an existing one). The period in front of the name means it will be a hidden file; to see hidden files, use the “ls -a” command.

To create such a file, just name a blank text file .htaccess. Place it in the root directory of your Web server (Or if all the pages you are redirecting are in one directory, place the file in that directory).

Then, add a new line for each redirect in the following form:

Redirect 301 [old address] [new address]

For example, with this entry at the bozo.com site…

Redirect 301 /OldFiles/OldBozo.html /NewDirectory/NewFile.html

…when a user clicks on the link “http://bozo.com/OldFiles/OldBozo.html,” the browser will automatically pull up “http://bozo.com/NewDirectory/NewFile.html.”

Note that when the new page is outside the control of the Web server, the full address (including “http://”) of the destination address must be used, not just the internal directory tree.

Setting up an Apache .htaccess file, if one didn’t previously exist, requires letting your copy of Apache know that this file exists and should be consulted. In Ubuntu, and probably other distributions as well, Apache ignores the .htaccess file in the default install.

(Note, for this instruction, I am using Apache 2.2.8 on Ubuntu Server 8.04.)

Doing this requires two steps. First of all, find the “apache2.conf” file. In Ubuntu, it is located in the “/etc/apache2” directory. It can be edited at the command line with a text editor, such as vi, emacs or Pico.

Open the file and search for “.htaccess.” Check to see that “.htaccess” follows the “AccessFileName” option. If it is enabled, there will be no ‘#’ at the beginning of the line (meaning it is not commented out). This tells Apache to look in this file for directives, such as a page address substitution like the one above. It should read:

AccessFileName .htaccess

That is probably already set correctly, but the second step probably involves some changes in configuration. Namely, you have to set something called “AllowOverride,” which is the configuration setting that tells Apache whether or not to follow the .htaccess directives.

This option can be found in another file, one showing the directories that Apache should use for the Web site. In Ubuntu, it is the “default” file in the “sites-available” folder (“/etc/apache2/sites-available”). (NOTE: In Ubuntu this file is also under another name as a symbolic link, in the “sites-enabled” folder.)

In this “default” file, you will find a list of directories on your server that have been enabled as Web server pages.

<Directory /var/www/>
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
</Directory>

NOTE: This is not the “document root” entry, but the one right after it. The “document root” entry also has an AllowOverride. It is set to “None” and can stay that way.

Each entry (framed by <Directory> and the </Directory> tags) in this list specifies the options that Apache should use for that directory. In the above entry, change “AllowOverride None” to “AllowOverride All”.

A bit of explanation: “AllowOverride None” means Apache does not look for the .htaccess file, and does not follow its instructions. “AllowOverride All” means that it does.

After you make this change, or any changes to these configuration files, you need to restart the Apache server software. In Ubuntu it is done thusly from the command line:

/etc/init.d/apache2 restart

(Note, you do not need to restart Apache when new entries to .htaccess are made. That seemed obvious but I should mention this anyway)


Also, if you know that all your rerouting is being done from one folder, you can place the .htaccess file in that folder and, instead of changing the “AllowOverride” setting for the whole site, just make a new entry in the “default” configuration file for that one directory.

For instance, I wish to redirect addresses of expired Web pages in the “/var/www/L/” folder. I would place an .htaccess file in that folder and add this entry into the “default” configuration file:

<Directory /var/www/L/>
AllowOverride All
</Directory>

This seems to be all you need to add–the other options are inherited from the listing of the parent directory.

End-note: I’ve found that Apache is extremely fussy about what is put into an .htaccess file. Don’t put junk in just as a way of testing something else out. Only properly formed URLs or internal links should be added. Anything else will halt all Apache redirects, giving users only error messages.
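Given that fussiness, it can be worth checking a file before Apache ever sees it. Below is a rough Python sketch of such a check. It is deliberately narrow and rests on my own assumptions: it only accepts lines of the form “Redirect 301 [old address] [new address]” (the form used in my URL shortener post), it treats blank lines and # comments as harmless, and bad_redirect_lines is a hypothetical name. A real .htaccess file can hold many other valid directives that this would flag.

```python
def bad_redirect_lines(text):
    """Return the 1-based numbers of lines that are not simple 301 redirects."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are left alone
        parts = stripped.split()
        ok = (
            len(parts) == 4
            and parts[0].lower() == "redirect"
            and parts[1] == "301"
            and parts[2].startswith("/")
            and parts[3].startswith(("/", "http://", "https://"))
        )
        if not ok:
            bad.append(number)
    return bad
```

An empty list means every line fits the pattern. Anything else gives you the line numbers to look at before you save the file.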


Note: Other forms of redirection are discussed here.

–Joab Jackson
