Configuring 404 File Not Found Error Handler in Apache

I recently needed to change how Apache handles 404 (File Not Found) errors. This is easily done by editing httpd.conf (search for “404” to quickly find the relevant directives) and changing one line. Here’s what that section looks like by default:

[cc lang="php"]# Customizable error responses come in three flavors:
# 1) plain text 2) local redirects 3) external redirects
#
# Some examples:
#ErrorDocument 500 "The server made a boo boo."
#ErrorDocument 404 /missing.html
#ErrorDocument 404 "/cgi-bin/missing_handler.pl"
#ErrorDocument 402 http://www.example.com/subscription_info.html
[/cc]
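
As a rough illustration of the "local redirect" flavor, the sketch below assumes the 404 line is uncommented and pointed at a PHP script, i.e. `ErrorDocument 404 /missing.php`. The script name and the function `missing_page_message` are hypothetical, not something from the default httpd.conf. It relies on the `REDIRECT_URL` variable that Apache sets when it redirects internally to an error document.

[cc lang="php"]
<?php
// Hypothetical handler, assumed to be wired up in httpd.conf with:
//   ErrorDocument 404 /missing.php

// Build the message shown to the visitor. Apache sets REDIRECT_URL to the
// originally requested path when it redirects internally to an error document.
function missing_page_message(array $server): string
{
    $requested = $server['REDIRECT_URL'] ?? ($server['REQUEST_URI'] ?? 'unknown');
    // Escape the path so a crafted URL cannot inject markup into the page.
    return 'Sorry, ' . htmlspecialchars($requested, ENT_QUOTES, 'UTF-8') . ' was not found.';
}

// Only emit output when handling a web request (not from the command line).
if (PHP_SAPI !== 'cli') {
    http_response_code(404); // set explicitly so the status stays 404
    echo missing_page_message($_SERVER);
}
[/cc]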


Unicode Regular Expressions in PHP

Recently I wanted to use a regular expression in PHP to match all printable characters. I used the POSIX character class [:print:] to do this. My PHP test code was:

[cc lang="php"]preg_match("/^[[:print:]]*$/", "abcde")[/cc]

Although this worked, it didn’t work for non-ASCII characters, e.g. French words with accent marks like réseau. What I needed was a way for preg_match to match all Unicode printable characters. It turns out there is a pattern modifier (/u) that makes PHP treat the pattern and subject as UTF-8. I also had to use a Unicode property class instead of [:print:], so my test code became:

[cc lang="php"]preg_match('/^\P{C}+$/u', "réseau")[/cc]

\P{C} matches any character that is NOT in the Unicode “Other” category (control characters, unassigned code points and the like), in any language.
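
To make the behaviour easier to try out, here is a small sketch that wraps the Unicode-aware pattern in a function. The name `is_printable_unicode` is made up for this example, and the file is assumed to be saved as UTF-8 so that the literal 'réseau' reaches preg_match as UTF-8 bytes.

[cc lang="php"]
<?php
// Return true when $text consists only of non-control Unicode characters.
// The /u modifier makes PCRE read the subject as UTF-8 instead of as
// single bytes; invalid UTF-8 makes preg_match fail, which yields false here.
function is_printable_unicode(string $text): bool
{
    return preg_match('/^\P{C}+$/u', $text) === 1;
}

var_dump(is_printable_unicode('abcde'));      // bool(true)
var_dump(is_printable_unicode('réseau'));     // bool(true)
var_dump(is_printable_unicode("tab\there"));  // bool(false): tab is a control character
var_dump(is_printable_unicode(''));           // bool(false): + needs at least one character
[/cc]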

You can find more info in the Regular Expressions Cookbook from O’Reilly, in the section Unicode Code Points, Properties, Blocks, and Scripts.

How to Create a Windows Keyboard Shortcut

Recently I needed to convert web pages saved in a non-UTF-8 encoding to UTF-8. I was opening each file in Windows Notepad and saving it under the same file name (using the Save As command) with “UTF-8” selected in the encoding field (it defaults to ANSI). I noticed, however, that I had to close and reopen Notepad for every file; otherwise weird characters would appear when I viewed the converted file. So, instead of looking for and clicking the Notepad icon for each file, I created a keyboard shortcut to launch it. This works for any program. Here’s how:

1. Find the Windows Notepad icon, right-click it, and select “Properties”.


Find Out a File’s Encoding On Windows

Recently, I needed to know whether a file was encoded in UTF-8 or not. Since everything is international nowadays (especially websites), you might as well encode everything in UTF-8 so no funky characters appear. Although there are many encoding schemes, Notepad makes it pretty easy to check whether a text file is encoded in UTF-8. Here’s what I did:

  1. Open a text file (e.g. index.php) in Notepad.
  2. Click File -> Save As.
  3. Look at what is selected in the Encoding field. If it isn’t UTF-8, the file is not UTF-8 encoded; select UTF-8 and save to convert it.

If anyone finds a simple way to automate this in a loop to convert a batch of files, let me know in the comments. I read many posts on automating this in the PHP.net comments, but they didn’t seem reliable.

Actually, I just found out you can use the free Notepad++ to convert from ANSI to UTF-8 and to other character encodings.
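
For the batch conversion asked about above, one possible approach in PHP is sketched below. It is only a sketch under two assumptions: the mbstring extension is loaded, and every file that is not already valid UTF-8 was saved as Windows-1252 (what Notepad calls “ANSI” on Western European systems). The function names `to_utf8` and `convert_files` are made up for this example.

[cc lang="php"]
<?php
// Return $bytes as UTF-8, converting from $assumedSource only when needed.
// Assumption: anything that is not valid UTF-8 is in $assumedSource.
function to_utf8(string $bytes, string $assumedSource = 'Windows-1252'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8 (plain ASCII also lands here)
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

// Convert every file matching $pattern in place; return the paths changed.
function convert_files(string $pattern): array
{
    $changed = [];
    foreach (glob($pattern) ?: [] as $path) {
        $original = file_get_contents($path);
        if ($original === false) {
            continue; // unreadable file: skip it
        }
        $converted = to_utf8($original);
        if ($converted !== $original) {
            file_put_contents($path, $converted);
            $changed[] = $path;
        }
    }
    return $changed;
}
[/cc]

A call such as convert_files('C:/site/*.php') (a hypothetical path) would rewrite the matching files in place, so it is worth running it on a backup copy first. The validity check cannot tell which legacy encoding a file really uses, so files in some other encoding would come out wrong.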

URL Regular Expression (Regex)

Here’s a regular expression to match URLs based on RFC 3986. This example uses PHP.

[cc lang="php"]
// One sub-pattern per URL component.
$scheme = '(https?)://';
$userinfo = '([a-z0-9\-._~]+(:[a-z0-9\-._~]+)?@)?';
// Host: a bracketed IPv6 literal (optionally ending in an IPv4 address) or a dotted name.
$host = '(\[([0-9a-f]{1,4}|:)(:[0-9a-f]{0,4}){1,7}((\d{1,3}\.){3}\d{1,3})?\]|[a-z0-9\-]+(\.[a-z0-9\-]+)*)';
$port = '(:\d{1,5})?';
// The double-quoted piece carries the literal single quote allowed in these components.
$path = '(/(([a-z0-9\-._~!$&' . "'()*+,;=:@" . ']|%[0-9a-f]{2})+)?)*';
$query = '(\?(([a-z0-9\-._~!$&' . "'()*+,;=:@" . '/?]|%[0-9a-f]{2})+)?)?';
$fragment = '(#(([a-z0-9\-._~!$&' . "'()*+,;=:@" . '/?]|%[0-9a-f]{2})+)?)?';
[/cc]

Different Types of CAPTCHA

Recently I needed to improve CAPTCHA on a website. Here are the different types of CAPTCHAs I found.

Personally, I find the CSS CAPTCHA the most elegant, since users won’t even see a CAPTCHA field.
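
As a rough idea of how a CSS CAPTCHA check can look on the server side, here is a minimal sketch. It assumes the common "honeypot" variant: the form contains an extra text input that a stylesheet hides, so people leave it empty while naive bots fill in every field. The field name "website" and the function `looks_like_bot` are hypothetical.

[cc lang="php"]
<?php
// Hypothetical honeypot check. Assumes the form has an extra input named
// "website" that is hidden with CSS (e.g. display: none).
function looks_like_bot(array $post, string $trapField = 'website'): bool
{
    $value = $post[$trapField] ?? '';
    if (!is_string($value)) {
        return true; // an array or other unexpected shape is suspicious
    }
    // A human never sees the field, so any non-blank value suggests a bot.
    return trim($value) !== '';
}
[/cc]

A form handler could call looks_like_bot($_POST) before processing a submission and quietly discard anything that returns true.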