Here’s a regular expression to match URLs based on the RFC 3986. This example uses PHP.
[cc lang=”php”]
$scheme = “(https?)://”;
$userinfo = ‘([“+a-z0-9-._~+”]+(:[“+a-z0-9-._~+”]+)?@)?’;
$host = ‘([([0-9a-f]{1,4}|:)(:[0-9a-f]{0,4}){1,7}((d{1,3}.){3}d{1,3})?]|[“+a-z0-9-+”]+(.[“+a-z0-9-+”]+)*)’;
$port = “(:d{1,5})?”;
$path = ‘(/(([“+a-z0-9-._~!$&’ . “‘()*+,;=:@+” . ‘]|”+%[0-9a-f]{2}+”)+)?)*’;
$query = ‘(?(([“+a-z0-9-._~!$&’ . “‘()*+,;=:@+” . ‘”/”+”?”+”]|”+%[0-9a-f]{2}+”)+)?)?’;
$fragment = ‘(#(([“+a-z0-9-._~!$&’ . “‘()*+,;=:@+” . ‘”/”+”?”+”]|”+%[0-9a-f]{2}+”)+)?)?’;
[/cc] Continue reading URL Regular Expression (Regex)