punycode applied the wrong way
Bug description
When a URI / IRI with non-ASCII characters is received, punycode encoding is applied incorrectly, resulting in URIs that cannot be resolved. This affects two areas:
- Domain name conversion: the xn-- prefix is not added and the calculated suffix is wrong. http://öpnv-karte.de should become http://xn--pnv-karte-z7a.de according to the Epiphany browser and the converter at http://www.charset.org/punycode.php, but Gajim displays it as http://pnv-karte.de-w6b
- Punycode escaping is applied even inside the path component, where punycode is not defined; common practice there is to percent-escape the URI created from the Unicode IRI entered. http://example.com/schön should become http://example.com/sch%C3%B6n, but Gajim displays it as http://example.com/schn-tlc
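The expected conversion described in the two points above can be sketched in Python roughly as follows. This is only an illustration, not Gajim's code: the helper name iri_to_uri and the choice of safe characters are assumptions, and userinfo (user:password@) is dropped for brevity.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when escaping; "%" is kept so that
# escapes already present in the input are not escaped a second time.
_SAFE = "/:@!$&'()*+,;=%"


def iri_to_uri(iri):
    """Hypothetical helper: turn a Unicode IRI into an ASCII-only URI."""
    parts = urlsplit(iri)
    # Host: IDNA encoding is applied per label and adds the "xn--" prefix.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    # Note: any userinfo in the authority part is ignored in this sketch.
    netloc = host if parts.port is None else "{}:{}".format(host, parts.port)
    # Path, query and fragment: UTF-8 bytes, percent-escaped (no punycode).
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE + "?")
    fragment = quote(parts.fragment, safe=_SAFE + "?")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://öpnv-karte.de"))      # http://xn--pnv-karte-z7a.de
print(iri_to_uri("http://example.com/schön"))  # http://example.com/sch%C3%B6n
```

The point of the sketch is that the host and the rest of the IRI need different treatment: IDNA for the host, percent-escaping for everything after it.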
Steps to reproduce
- Send one of the above-mentioned IRIs to yourself via Gajim.
- Watch the XML console: on the wire, they appear as Unicode strings without the added punycode form.
- Look at what gets received: the address displayed as a hyperlink and the link attached to it are the original IRI, but the URI displayed in parentheses after it is nonsensical.
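As a possible hint for debugging: the nonsensical strings reported above are exactly what Python's bare "punycode" codec yields when it is applied to the whole IRI instead of to each host label. The snippet below only demonstrates that coincidence; it is an assumption about the cause, not something confirmed from Gajim's source, and the function name is made up for illustration.

```python
def punycode_whole_string(iri):
    """Hypothetical reproduction: raw punycode over the entire IRI string."""
    # The raw codec moves all ASCII characters to the front, appends "-"
    # and the encoded non-ASCII part, and never adds an "xn--" prefix.
    return iri.encode("punycode").decode("ascii")


print(punycode_whole_string("http://öpnv-karte.de"))      # http://pnv-karte.de-w6b
print(punycode_whole_string("http://example.com/schön"))  # http://example.com/schn-tlc
```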
Software versions
OS version: Debian GNU/Linux sid
GTK version: 2.24.25-1
PyGTK version: 2.24.0-4