Unix: Securing a Web site with the proper file permissions

August 29th, 2010

O.K., so you have Apache all set up and serving all the files in your /www folder (or wherever). The question is, what would be the appropriate file permissions? (This entry assumes you already know how Unix file permissions work and how to change them.)

Setting the permissions is a three-part process. The first part covers the directories, the second covers the files themselves, and the third deals with the exceptions.

With directories, the common wisdom seems to be to use “rwx--x--x”, or “711” numerically speaking. This means all visitors can access documents within the directories, but can’t view or write to the directories themselves (if, for some reason, you wish to expose the contents of the directories, use “rwxr-xr-x”, or “755”).

In order to recursively change all the directories in your Web folder, you would use this command, from the root directory for your Web site:

find . -type d -exec chmod 711 {} \;

Thanks to the Movable Tripe for providing the above command.

For extra-added protection, you may want to place an index.html file of some sort in each directory. Then, if for some reason the directory permissions get changed to where outside folks can read them, when someone just enters the directory name in the browser (i.e. “http://www.site.com/DirectoryName”), then what will be returned will be the contents of the index.html file, rather than a listing of the directory.

Individual files, or Web pages, should have a permission setting of “rwxr--r--” (“744”), which gives the owner full read, write and execute privileges, but gives others only read permission. The command to change all file permissions, but not directories, again run from the root directory for your Web site, would be:

find . -type f -exec chmod 744 {} \;

Thanks again, Movable Tripe.
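For anyone who would rather script the two find commands, here is a minimal sketch in Python. The function name apply_web_permissions and the two constants are made up for illustration; the modes are simply the 711 and 744 suggested above, and symbolic links are skipped on the assumption that you don't want to touch anything outside the Web folder.

```python
import os

DIR_MODE = 0o711   # rwx--x--x, the directory setting suggested above
FILE_MODE = 0o744  # rwxr--r--, the file setting suggested above


def apply_web_permissions(root):
    """Walk root, setting DIR_MODE on directories and FILE_MODE on files."""
    os.chmod(root, DIR_MODE)
    for current_dir, subdirs, files in os.walk(root):
        for name in subdirs:
            path = os.path.join(current_dir, name)
            # Skip symlinks so nothing outside the tree is changed.
            if not os.path.islink(path):
                os.chmod(path, DIR_MODE)
        for name in files:
            path = os.path.join(current_dir, name)
            if not os.path.islink(path):
                os.chmod(path, FILE_MODE)
```

Calling apply_web_permissions("/www") would then do roughly what the two find commands do when run from that folder.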

Finally, there are exceptions you should consider, depending on what advanced functionality you have on your site. First, there is PHP to think about: While PHP files work with the above permission sets, if your scripts write to a text file on the site, write permission needs to be added to that file.

Secondly, there is blog software, which in most cases requires residency in your Web directory structure, and needs write and execute permissions in certain places. Wikis also require some write permissions in selected folders.

Finally, please keep in mind that this post does not take into consideration issues of who owns the files (the file “owner”) or groups. That is a topic for another post.–Joab Jackson

iTunes “Play Date” tag numeric mystery, solved!

August 22nd, 2010

Recently I embarked on a project that involved parsing the contents of my iTunes library file, which stores information about all the MP3 files in my collection, in an XML format.

What I wanted to do was assemble a list of songs I played, in chronological order. Fortunately, the file, “XML Music Library.xml” offers two tags that capture this info, namely:

<key>Play Date</key><integer>3365245919</integer>

&

<key>Play Date UTC</key><date>2010-08-21T18:31:59Z</date>

Setting aside Apple’s odd notions of how XML works for a second, we can see that the date played is captured in a human readable format, in the latter “Play Date UTC” tag which of course must be parsed with a regular expression to compute against. (UTC stands for Coordinated Universal Time).

I was curious whether I could calculate the time played directly from the “Play Date” tag rather than parse it out from the “Play Date UTC” tag. I suspect that iTunes is storing this number for exactly such purposes, in fact. The challenge, however, would be to decipher the number.

So, in the example above, the “Play Date” value of “3365245919” should equal the “2010-08-21T18:31:59” stated in “Play Date UTC.” I would just need to find the conversion algorithm.

My first guess was that it was a Unix Timestamp, given that iTunes, in its original Macintosh incarnation, runs on Unix. The Unix Timestamp counts time in terms of seconds elapsed since January 1, 1970. However, when I put the number “3365245919” into a Unix Timestamp calculator, I didn’t get August 21, 2010 at all, but rather Aug 21, 2076!! Crazy!

My next step was to reverse engineer the “Play Date” number, so to speak. By comparing a few “Play Date” and “Play Date UTC” values, I sussed that the incrementation was, in fact, in seconds.

For instance, one song played at 3:10 had a “Play Date” of “3365248245” while the next song, played at 3:13, had a “Play Date” of “3365248408,” which was greater by 163, or just under three minutes if you count a second for each increment.

So, August 21, 2010, 6:31:59 p.m. was 3,365,245,919 seconds ahead of what date? Well, using this online Epoch Converter, 3,365,245,919 seconds equals 38,949 days (and 14 hours and 31 minutes).

So, now using this online date calculator (Isn’t the Internet great?!), we find that subtracting 38,949 days from August 21, 2010 brings us to January 1, 1904.

So, in short, the iTunes “Play Date” figure is the number of seconds that have elapsed since January 1, 1904.

In other words, Apple is using an Epoch time, with January 1, 1904 as the starting point (Epoch time has no unified start point per se, just whatever the keeper of the timestamp decides it to be).
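The conversion worked out above can be sketched in a few lines of Python. The names MAC_EPOCH and play_date_to_datetime are my own, not anything iTunes defines; the code just adds the “Play Date” seconds to January 1, 1904.

```python
from datetime import datetime, timedelta

MAC_EPOCH = datetime(1904, 1, 1)  # the starting point worked out above


def play_date_to_datetime(play_date):
    """Convert an iTunes "Play Date" integer to a naive datetime."""
    return MAC_EPOCH + timedelta(seconds=play_date)


print(play_date_to_datetime(3365245919))  # 2010-08-21 14:31:59
```

Note that the result lands on the right day but reads 14:31:59, which matches the “14 hours and 31 minutes” from the converter and sits four hours behind the 18:31:59 in “Play Date UTC.” That would be consistent with “Play Date” being kept in local time rather than UTC, though that is an inference on my part, not something the library file states.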

And this is keeping with an Apple tradition. According to this online Filemaker help file, this date is the official starting point for all Macintosh computer timestamps, as it was last century’s “earliest New Year’s Day that falls in a leap year”–an easy starting point for developers.–Joab Jackson

Unix: A (somewhat) easy trick to understanding file permission octals

May 5th, 2010

In Unix, file permissions are represented in a set of nine letters (with an additional single letter beforehand as a prefix to signify if the file is a directory or not).

-rwxrw-r--

This cluster can be broken up into three groups (after the initial directory indicator), scanning left to right: a set of three permissions for the owner of the file, a set of three for the group (a group of other users that the user is part of) and a set of three for everybody (or “others”).

The letters used to represent permissions are “r,” “w,” and “x.” For each of the three groups, they are always in the same order: rwx. The “r” permits the reading of the file, the “w” permits writing to the file, that is to say the ability to make changes to the file, and the “x” permits the execution, or running, of the file (assuming it is an executable file or script).

If no permission is granted for that action then a dash (“-”) is inserted into its place, instead of the appropriate letter.

So for instance:

-rwxrw-r--

…may be broken down into three sections, one for the owner (“rwx”), one for the group (“rw-”) and one for everyone else (“r--”).

In this case, the owner has full read/write/execute permissions (“rwx”), while the group has read and write permissions but no execute permission (“rw-”) and others get only read permission (“r--”).

These file permissions are all bundled in one line, and revealed at the command line, with the “ls -l” command for listing the file attributes of all the files in a directory.

But that is not what I am here to talk to you about. I’m here to discuss….

Numerical Representation

File permissions can be represented not only with rwx’s but also in octals, or a set of three numbers in Base-8 (that is to say a number system that uses only 0 through 7). You can specify changes using octals in the command to change permissions, chmod.

These octal permissions will be three digits. For instance, the “-rwxrw-r--” above, octally speaking, will be “764.”

From left to right, the first digit represents the permissions for user, the second one is for the group and the third one is for others.

I will show you an easy way to derive this number.

The trick is to convert each set of rwx file permissions into a three-digit binary number. Then, all 3 sets can be converted into a single three-digit octal number.

This, in turn, requires understanding how binary numbering works. Binary is a Base-2 system. A digit is either a 0 or a 1, and all numbers can be represented this way, given enough binary positions. You just keep carrying over, just like in Base-10.

For octal numbering systems, we’ll need only the first 8 numbers (in our base-10 system) in binary. They are:

0 is in binary 000
1 is in binary 001
2 is in binary 010
3 is in binary 011
4 is in binary 100
5 is in binary 101
6 is in binary 110
7 is in binary 111

If you memorize the formula for binary numbers, you can derive these quite easily. Reading right to left, each position of the three-digit binary set represents a different number in the Base-10 numeric system. The first space represents a “1,” the second a “2,” and the third a “4” (in essence, the values double for every move to the left).

So, to read the binary string, you can basically add up all the values that each 1 represents, as determined by the column that 1 is in. In other words, just look to see if there is a 1 in each column. If there is a 1 in a column, you add the number that that column represents into the total. So for instance, “111” in binary would represent “7” because it would be 4 + 2 + 1.
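The column-adding method just described can be sketched in Python like so. The function name binary_to_decimal is only for illustration (Python’s built-in int(bits, 2) does the same job).

```python
def binary_to_decimal(bits):
    """Add up the place value of every column that holds a 1."""
    total = 0
    place_value = 1  # the rightmost column is worth 1
    for bit in reversed(bits):
        if bit == "1":
            total += place_value
        place_value *= 2  # each move to the left doubles the value
    return total


print(binary_to_decimal("111"))  # 7, i.e. 4 + 2 + 1
print(binary_to_decimal("101"))  # 5, i.e. 4 + 1
```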

Now, what is interesting about the file permissions is that, thanks to the use of octals, the rwx permission clusters (for the user, group and other), line up exactly with the three digit binary representations.

To build a binary representation of a set of permissions, just look to see where permissions are granted, then, keeping the same 3 column format, place a 1 where each permission is granted. If there is a dash, put a zero as a placeholder:

So, from our example:

“rwx” = 111

“rw-” = 110

“r--” = 100

And, taking these numbers, in the same order as the original file permissions, you build the octal (using the binary-to-octal conversion above):

111 = 7
110 = 6
100 = 4

or, 764!
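Here is the whole trick as a short Python sketch. The function name permissions_to_octal is hypothetical, and it assumes you hand it just the nine permission letters, without the leading directory indicator.

```python
def permissions_to_octal(perms):
    """Turn nine permission letters, e.g. "rwxrw-r--", into "764"."""
    if len(perms) != 9:
        raise ValueError("expected nine characters, without the type prefix")
    digits = []
    for start in (0, 3, 6):  # owner, group, others
        cluster = perms[start:start + 3]
        # A dash becomes 0, any granted permission becomes 1.
        bits = "".join("0" if ch == "-" else "1" for ch in cluster)
        digits.append(str(int(bits, 2)))
    return "".join(digits)


print(permissions_to_octal("rwxrw-r--"))  # 764
```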

Source material taken from this book:


…as well as a class I’m taking on Unix. All mistakes are my own, however.–Joab Jackson

Unix: Getting started with vi

April 4th, 2010

Created by Bill Joy in 1976, vi is a text editor for the Unix/Linux command line. At first glance, it may seem crude by today’s standards for text editors, but it is useful for working in remote command-line sessions.

To open vi at the command line, simply type vi. If you want to open a specific file with vi, type vi and then the filename.

One thing to keep in mind about vi is that it operates in three modes. You must be aware of what mode you are in at any given time, because each reacts differently to what you type in. The three modes:

Command mode is the default mode. When you first start vi, you are in command mode. You can not enter text. Here you are entering commands. Most keystrokes have a command associated with them.

Input mode is where you actually enter text. The easiest way to get into input mode from command mode is to type the letter “i.” Then you can start typing. (“a” will also work). To get out of input mode and back into command mode, hit the escape key.

Ex mode is used for file handling duties, as well as performing substitution tasks. It is kicked off by typing a “:” from the command mode.

For instance, if you want to save a file, you’d hit escape, then type in “:w [filename]”. If you want to quit, type in “:q”. If you haven’t saved your file since making changes, however, it won’t let you quit unless you put an “!” at the end of the command: “:q!”

You can also combine the commands for writing and quitting, i.e. “:wq”

vi can be frustrating to use for beginners; it really is designed to be lightning fast for those who have memorized many of its myriad commands.

While you will have to figure out which commands are worth memorizing for yourself, here are a few that I myself have found handy:

(All of these are executed from command mode, unless otherwise noted):

) and ( : Jump ahead one sentence or jump back one sentence, respectively.

:[Number]: This will jump to the line number you designate. For instance, “:4” will move the cursor to line 4 of the file. To jump relative to the current line, add a sign: “:+4” jumps ahead four lines and “:-4” jumps back four.

ctrl-f and ctrl-b: Jump a full screen (24 lines on a standard terminal) forward or back, respectively. ctrl-d and ctrl-u: Scroll half a screen down or up, respectively.

o and O: These move from command mode to input mode, but open a new blank line first: “o” below the current line, “O” above it. This is also handy for adding a new line at the end of the document.

dd: delete a line. (Note: This is also the first step of a cut and paste operation. See below).

dw: delete a word.

p and P: These paste, as in cut and paste. When you delete something with dd or dw, it goes into the buffer. These commands retrieve what is in the buffer: “p” puts it after the cursor (or below the current line), “P” before the cursor (or above the current line).

yy: This command allows for copying and pasting, without the cutting. Typing yy copies the line that the cursor is on into the buffer.

u: Undoes the last command (though there seems to be no undo within input mode).

/ and ? are search operators. Type them in and then the text you are searching for. / looks for the next instance after the cursor, ? looks for the first instance before the cursor.


Taken from this book:


…as well as a class I’m taking on Unix. All mistakes are my own, however.–Joab Jackson

Unix : Redirection basics, part 1

April 3rd, 2010

Note: This entry does not discuss Unix pipes. That will be part 2.

One of the powers of the Unix command line is the ability to redirect input and output of either end of the command (for most commands, anyway).

By default, Unix assumes that input will come from the keyboard, and that output will go to the display. So, to view all the files in a directory, you type “ls” at the command line and the program prints a list of all the files in the directory to the display.

But you can also direct the output of a program to another source, such as to a text file. You can also specify a new source of input.

This is done using the “>” and the “<” characters.

For instance, say you want to get a list of files in a directory, but instead of having them appear on the screen, you want to put them in a new file, called ListOfInfo.txt. Then you’d type:

$ ls > ListOfInfo.txt

And if you wanted to add more information to this file, you could append the info with “>>”, i.e.:

$ ps -aux >> ListOfInfo.txt

(Otherwise, with just a single “>” Unix will just overwrite the contents of an existing file).

Just as “>” directs the output, using “<” will direct input. For instance…

$ wc < ListOfInfo.txt

…will give you the line, word and byte counts of the ListOfInfo.txt file.

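You can mix and match these operators. The following, for instance, reads its input from ListOfInfo.txt and writes the resulting counts to WCResults.txt:

$ wc < ListOfInfo.txt > WCResults.txt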

Keep in mind that redirection is only useful with programs that actually read standard input or write to standard output. A command such as “mkdir” reads no input and normally prints nothing, so there is nothing for it to redirect.
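If you are driving commands from a script rather than a shell, the same three redirections can be approximated with Python’s subprocess module. This is only a sketch: the function name redirect_demo is made up, and I have swapped “ps -aux” for “date” simply to keep the appended output short.

```python
import os
import subprocess


def redirect_demo(workdir):
    """Mimic >, >> and < for a few commands, using files inside workdir."""
    list_path = os.path.join(workdir, "ListOfInfo.txt")
    results_path = os.path.join(workdir, "WCResults.txt")

    # $ ls > ListOfInfo.txt    (mode "w" overwrites, like ">")
    with open(list_path, "w") as out:
        subprocess.run(["ls"], cwd=workdir, stdout=out, check=True)

    # $ date >> ListOfInfo.txt    (mode "a" appends, like ">>")
    with open(list_path, "a") as out:
        subprocess.run(["date"], stdout=out, check=True)

    # $ wc < ListOfInfo.txt > WCResults.txt
    with open(list_path) as src, open(results_path, "w") as out:
        subprocess.run(["wc"], stdin=src, stdout=out, check=True)

    with open(results_path) as results:
        return results.read()
```

The point of the sketch is that the program being run never knows the difference; it just reads and writes its usual streams, and whoever launches it decides where those streams lead.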

* * *

When running a shell, Unix keeps three different streams, or files, for input/output purposes. Each gets a file descriptor number (More on that later). They are:

Standard Input: The file that captures the input, usually from the keyboard (file descriptor # 0).

Standard Output: The file that captures the output, which is usually sent to the terminal display (file descriptor # 1).

Standard Error: The file that captures the error messages from the shell or the running program (file descriptor # 2).

With this in mind, “>” really means “1>” and “<” is shorthand for “0<”.

All this means you can redirect standard input, standard output and error messages individually. For instance, say you want to capture an error message in a text file. You can’t do that with the standard redirect: a wc on a nonexistent file, redirected to an output file with “>”, will not send the error message to the file. Instead, you can type:

$ wc phonyfile.txt 2> ErrorFile.txt

Note, you can also point one stream at another. “1>&2” sends standard output to the standard error file, and “2>&1” sends the error output to the standard output.
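The same ideas carry over when launching commands from Python. In this sketch, the two function names are hypothetical; the first behaves like “2> file” and the second like “2>&1”.

```python
import subprocess


def run_with_stderr_to_file(command, error_path):
    """Like "command 2> error_path": only error output goes to the file."""
    with open(error_path, "w") as err:
        return subprocess.run(command, stderr=err).returncode


def run_with_stderr_merged(command):
    """Like "command 2>&1": error output joins standard output."""
    done = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return done.stdout
```

Running run_with_stderr_to_file(["wc", "phonyfile.txt"], "ErrorFile.txt") should leave wc’s complaint about the missing file in ErrorFile.txt, assuming no file by that name exists.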


Taken from this book:


…as well as a class I’m taking on Unix. All mistakes are my own, however.–Joab Jackson

The human body’s master clock

April 3rd, 2010

In almost every cell of the human body is a tiny molecular clock, made up of a set of protein gears. These gears play a role in almost every biological function, such as cuing hunger and sleepiness, and even affecting cellular division and the aging process overall.

Traditionally, it was thought the body had a master clock to synchronize time with all the cell clocks. Called the suprachiasmatic nucleus (SCN), it has about 20,000 neurons and resides in the brain.

Cells in the retina relay a message to the SCN when they sense light. The light cues, in turn, affect the SCN cells’ firing rate, or the rate at which neurons send off electrical messages to other brain cells. The SCN messages affect a wide range of other brain-controlled body functions, such as the regulation of thirst.

Some researchers also think that the SCN gives out a substance, a set of peptides, that may help synchronize other cells. Others think that it is not the SCN, but rather all the individual cells working together, that produces a daily rhythm.

From the March 27 audio issue of Science News–Joab Jackson.

Unix: Converting files between DOS and Unix

February 28th, 2010

Recently I found that, after I uploaded a file from a Windows computer to a Linux one and opened the file from the command line, Ubuntu would notify me that it was converting it from the DOS format.

Even if it was a standard text file (.txt) filled with ASCII characters, it still needed converting.

Why? Aren’t text files the same across different operating systems? Evidently not.

Unix handles end-of-line signifiers differently than Windows/DOS does, according to Sumitabha Das’s book “Your Unix”.

Specifically, DOS uses a pair of characters, “\r” (for Carriage Return [CR], or simply “enter”) followed by “\n” (for Line Feed [LF]), to signify the end of a line.

Unix only uses one, namely LF.

These markers can both be seen by examining text files with Octal Dump.

Ubuntu, anyway, seems to handle DOS text files easily in day-to-day operation. Nonetheless, most variants of Unix/Linux have a set of utilities to convert files from Windows/DOS into Unix, and back again. They are called dos2unix and unix2dos, respectively.
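To make the difference concrete, here is a minimal Python sketch of the two conversions, working on raw bytes. It borrows the utilities’ names for its two functions but is not how dos2unix or unix2dos are actually implemented; it only swaps the end-of-line markers described above.

```python
def dos_to_unix(data: bytes) -> bytes:
    """Replace each CR LF pair with a single LF."""
    return data.replace(b"\r\n", b"\n")


def unix_to_dos(data: bytes) -> bytes:
    """Replace each LF with CR LF, without doubling any existing CR."""
    return dos_to_unix(data).replace(b"\n", b"\r\n")


print(dos_to_unix(b"one\r\ntwo\r\n"))  # b'one\ntwo\n'
```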


Taken from this book:


…as well as a class I’m taking on Unix. All mistakes are my own, however.–Joab Jackson

Unix: Decoding binary files with Octal Dump

February 22nd, 2010

In many cases with Unix/Linux, if you want to view a file, using the cat command works just fine. The phrase “cat samplescript.txt”, will reveal, at the command line, the content of that file.

Cat won’t work for binary files, because binary files contain non-printing characters (or non-ASCII characters). Running cat on a binary program, such as sed, will only get you a screen full of gibberish, and may even wreck the terminal session itself.

(Storing programs as binary files is more efficient than storing them in ASCII, largely because binary programs use all eight bits in a byte [up to 256 possible combinations], whereas ASCII only uses seven [128 combinations], leaving the eighth bit unused for character data).

What Octal Dump (od from the command line) does is display the contents of a binary file, including an executable file, as sets of octals.

As the name suggests, the octal numbering system is a numbering system in base eight. When used with the “-bc” option, the od program renders each byte of the program in octal.

For instance, running this command from the command line in the /bin directory of binary files:

od -bc sed

will return rows of three-digit octals, one per byte, with each row preceded by a seven-digit number that is the offset, or position, of the first byte in the line. Below each octal is its conversion into an ASCII character, if the byte is a printable character (decimal 32 through 126).
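To get a feel for that layout without running od, here is a rough Python imitation of the octal rows only (it leaves out the character row that the “c” option adds). The function name octal_dump and the default width of 16 bytes per row are my own choices for the sketch.

```python
def octal_dump(data: bytes, width: int = 16):
    """Yield rows: a seven-digit octal offset, then three-digit octal bytes."""
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        octals = " ".join(format(byte, "03o") for byte in chunk)
        yield format(offset, "07o") + " " + octals


for row in octal_dump(b"Linux\n"):
    print(row)  # 0000000 114 151 156 165 170 012
```

The first octal, 114, is the letter “L”.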

As an aside, to convert from octal to decimal yourself, simply multiply each digit of the octal number by a successive power of eight, going from right to left. So, if the octal is 114, then you would calculate (1 * [8^2] + 1 * [8^1] + 4 * [8^0]), which would equal (64 + 8 + 4), which would equal 76.
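That arithmetic can be sketched in Python as follows. The function name octal_to_decimal is illustrative only; the built-in int("114", 8) gives the same answer.

```python
def octal_to_decimal(octal_digits):
    """Multiply each digit by a successive power of eight, right to left."""
    total = 0
    power = 0
    for digit in reversed(octal_digits):
        total += int(digit) * 8 ** power
        power += 1
    return total


print(octal_to_decimal("114"))  # 76
```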


Taken from this book:


…as well as a class I’m taking on Unix. All mistakes are my own, however.–Joab Jackson

Unix: Indexing files with inode

February 16th, 2010

In Unix, an inode is a data structure that holds information about a file, or set of data blocks. You can think of it as an index, or a collection of metadata about a file. It contains info such as the owner, the permissions, the date created and last modified, as well as the location of the data blocks that contain the information. It is kept on a disk in a separate location from the data blocks themselves.

“When users search for or access a file, the UNIX system searches through the inode table for the correct inode number. When the inode number is found, the command in question can access the inode and make the appropriate changes if applicable,” according to the online paper about inodes posted by IBM.

Each time a user creates a file, a corresponding inode is created. It is possible to run out of inode numbers. Typically, however, a disk will run out of space before it runs out of inode numbers, according to one instructional site. Although the number of inodes is typically set by the operating system, it can also be set during the setup process of the file system.

By using numerical inode numbers as identifiers, the OS can have multiple file names, in different directories, point to the same file (called hard linking). Inodes are also handy during file system maintenance or recovery operations, such as fsck. fsck checks for lost inodes, or inodes with no pointers, and attempts to repair them.

One can use the “df” command to check the remaining percentage of inodes left on a system. For Ubuntu Linux, the command is “df -i.” To find the inode numbers of all the files in a directory, type “ls -i”
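The same information is available from Python’s os module, which makes for an easy way to watch hard linking at work. In this sketch, the two function names are made up; os.stat and os.statvfs are the standard calls underneath.

```python
import os


def inode_summary(path):
    """Return (inode number, number of hard links) for path."""
    info = os.stat(path)
    return info.st_ino, info.st_nlink


def free_inode_percentage(path):
    """Roughly the figure behind "df -i": share of inodes still free."""
    fs = os.statvfs(path)
    if fs.f_files == 0:  # some file systems do not report inode counts
        return None
    return 100.0 * fs.f_ffree / fs.f_files
```

If you create a second name for a file with os.link (the equivalent of the ln command), inode_summary should report the same inode number for both names, with a link count of 2.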

–Joab Jackson

Windows: Troubleshooting a non-working Hosts file

January 30th, 2010

What do you do when your Windows XP computer isn’t recognizing the Hosts file? Here are a few possible solutions.

Recently, I ran into this devil of a problem. I wanted to do some internal testing of a Web site, from a browser on a Windows XP machine. So I added an entry in the Hosts file on the XP machine that would redirect joabj.com to the internal IP address of the server (“192.168.0.33 joabj.com” in this case). (Typically, in WinXP, the Hosts file is located in the C:\WINDOWS\system32\drivers\etc folder.)

Yet the browser kept consulting the external DNS service first, and returning the wrong page (my cable modem page in this case).

Most infuriatingly, Windows command line tools (namely, ping and SSH) recognized the Hosts file entries, but the browser did not!! If I ping’ed the domain name I had entered into the Hosts file (“ping -a joabj.com” in this case), it pinged the correct IP number (“reply from 192.168.0.33:” etc…).

Surfing the Web, I came across a number of different solutions to this problem:

*Reboot: This means not only rebooting the machine (duh!), but also emptying the browser caches and flushing the DNS cache (from the command line, type “ipconfig /flushdns”).

*Extra empty characters in the Hosts file: Evidently, Windows doesn’t like an empty space behind the entry, i.e. “192.168.0.33 joabj.com ” rather than “192.168.0.33 joabj.com” –make sure you don’t add in an empty space.

*Corrupt Hosts file: This could be the case even if it opens in Notepad o.k. Try replacing the existing Hosts file with a new one.

*Specify exact subdomain in Hosts file: This is the solution that ultimately worked for me, after trying all these other more complicated solutions, described below.

In a nutshell, if you plan on using the address “www.YOURDOMAIN.com” you should type “www.YOURDOMAIN.com” into the Hosts file, rather than just “YOURDOMAIN.com”.

So, for me, once I replaced “192.168.0.33 joabj.com” with “192.168.0.33 www.joabj.com” then using http://www.joabj.com worked fine, whereas before it wouldn’t.
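Why the “www” matters is easier to see if you think of the Hosts file as a plain lookup table keyed on the exact name. The Python sketch below is only a model of that idea, with a made-up function name, parse_hosts; it is not how Windows itself reads the file.

```python
def parse_hosts(text):
    """Map each hostname in Hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip_address, *names = line.split()
        for name in names:
            table[name.lower()] = ip_address
    return table


hosts = parse_hosts("192.168.0.33 joabj.com\n")
print(hosts.get("joabj.com"))      # 192.168.0.33
print(hosts.get("www.joabj.com"))  # None
```

In this model, an entry for joabj.com says nothing about www.joabj.com; each name you plan to type into the browser needs its own entry (several names can share one line).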

*Editor adding extension to Hosts file name: Sometimes a text editor could add on the .txt to the file name during save, making it Hosts.txt rather than just Hosts. Of course, then Windows won’t recognize the Hosts file, and Explorer won’t show, by default, the suffixes of file names.

If perusing from Explorer, set the folder view options to show suffixes. From Explorer, go Tools–>Folder Options–>View and uncheck “Hide extensions for known file types.” If Hosts is a .txt, remove the .txt from the file name.

*XP’s DNS Cache service taking priority: One troubleshooting site suggested this as a probable cause. It didn’t make any difference in my case.

To disable this service, go Start–>Control Panel–>Administrative Tools–>Component Services–>Services(Local). Then search for DNS Cache (it appears in the Services list as “DNS Client”; the underlying service name is Dnscache) and disable it. You could just stop the service to check if it has any effect, though it will start up again on reboot. The Manual setting just means that the service will start up once a browser is fired up. The Disable option turns it off altogether, until you turn it back on again.

According to other people who’ve tried this, disabling DNS Cache should have no ill-effect on your DNSing.

*Reorder the DNS lookup sequence: Typically, Windows XP will consult the local Hosts file before checking with a DNS server to resolve a domain name. But, sometimes not.

You can change the order of the lookup in the registry. (STANDARD DISCLAIMER: Do not mess with the registry unless you know what you are doing).

To fire up the registry editor, do Start–>Run and put “regedit” in the box.

Once in regedit, go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\ServiceProvider . Once there, you will see a number of entries, including “DnsPriority” “HostsPriority” “LocalPriority” and “NetbtPriority” –which are the entries for DNS-based lookup, Hosts file-based lookup, Computer-based lookup and NetBios-based look-up. (More info on that here).

In the data column for each you see a number in parentheses. This number is the priority for that lookup. The lower the number, the earlier in the domain name resolution sequence it is consulted (evidently the range is between -32768 and 32767). If the DNS number is lower than the Hosts number, then you want to give the Hosts entry a lower number than the DNS number.

So if DnsPriority is 5000 and HostsPriority is 7000, you may want to change HostsPriority to, say, 4500.

Keep in mind that when you set the number, by right clicking on the entry and choosing “modify,” you can’t just type the number in as is. The edit box expects hexadecimal by default, so you will have to either enter the new number in hexadecimal or switch the base to decimal first.

One easy way to convert a number into hexadecimal is to call up Windows calculator, switch the view from standard to scientific, then enter the number into the field for entering a number. After the number is entered, look for where “dec” is selected on top of the calculator, and switch that to “hex.”
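If you have Python handy, the conversion is a one-liner. The helper name to_registry_hex is made up for this sketch, and the three values are just the example priorities used above.

```python
def to_registry_hex(value):
    """Return value as an uppercase hexadecimal string, e.g. 4500 -> "1194"."""
    return format(value, "X")


for number in (5000, 7000, 4500):
    print(number, "=", to_registry_hex(number))
# 5000 = 1388
# 7000 = 1B58
# 4500 = 1194
```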

–Joab Jackson