Locale-specific behaviour in URI regex
http://èxämplæ.org is recognized as a legal URI, but it is not one. Evidently [a-zA-Z] also matches non-ASCII characters. This may be related to <http://stackoverflow.com/questions/9043712/locale-specific-behavior-in-the-regex-library>.
This is a security bug, as it makes phishing via an IDN homograph attack possible (http://en.wikipedia.org/wiki/IDN_homograph_attack). http://раураꙆ.com consists entirely of Cyrillic homographs and is still highlighted as a URI.
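To make the expected behaviour concrete, here is a minimal sketch of an ASCII-only check that rejects both examples above. It is not the regex used by this project: the pattern, the name `ASCII_URI`, and the helper `is_ascii_uri` are hypothetical, and the grammar is deliberately much simpler than the real URI syntax.

```python
import re

# Hypothetical, simplified pattern: http(s) scheme, an ASCII-only host,
# and an optional path made of printable ASCII characters.
ASCII_URI = re.compile(
    r"https?://"
    r"[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?"  # host: ASCII letters, digits, '.', '-'
    r"(?:/[\x21-\x7e]*)?",                        # path: printable ASCII only
    re.ASCII,
)


def is_ascii_uri(candidate: str) -> bool:
    """Return True only if the whole string is ASCII and matches the pattern."""
    # The isascii() guard does not depend on how the regex engine
    # interprets character ranges, so it also holds if the pattern changes.
    return candidate.isascii() and ASCII_URI.fullmatch(candidate) is not None


for uri in ("http://example.org", "http://èxämplæ.org", "http://раураꙆ.com"):
    print(uri, is_ascii_uri(uri))
```

The explicit `isascii()` guard is the part that matters here: it keeps the decision independent of locale-specific range handling in the regex library.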
If you copy from your browser's URL bar, the actual string copied is percent-escaped: http://de.wikipedia.org/wiki/Käsesorte becomes http://de.wikipedia.org/wiki/K%C3%A4sesorte
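The escaping described above can be reproduced with Python's standard library. This is only a sketch of what the browser does to the path: the helper name `percent_escape_path` is made up for illustration, and it assumes the input path is not already percent-escaped (an existing `%` would be escaped again as `%25`).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_escape_path(uri: str) -> str:
    """Percent-escape non-ASCII characters in the path component of a URI."""
    parts = urlsplit(uri)
    # quote() encodes as UTF-8 by default and leaves "/" untouched.
    escaped_path = quote(parts.path)
    return urlunsplit(parts._replace(path=escaped_path))


print(percent_escape_path("http://de.wikipedia.org/wiki/Käsesorte"))
# prints http://de.wikipedia.org/wiki/K%C3%A4sesorte
```

Only the path is touched here; a non-ASCII host name would need separate IDNA handling, which this sketch does not attempt.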