Curl Global Community

Punycode in Curl 8.0 - RobertShiplett - 03-29-2012

Punycode (RFC 3492), the encoding that represents Unicode internationalized domain names using only ASCII characters, is explained at http://ja.wikipedia.org/wiki/Punycode or http://rfc-ref.org/RFC-TEXTS/3492/index.html .

I was very pleased to find encode and decode procs in the String class of CURL.IO.FILE, declared as

public {idn-hostname-to-unicode hostname:String}:String

and

public {idn-hostname-to-ascii hostname:String}:String
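
For anyone without a Curl IDE at hand, the round trip these two procs perform can be approximated in Python. This is only a sketch under stated assumptions: the helper names hostname_to_ascii and hostname_to_unicode are hypothetical stand-ins, and Python's built-in "idna" codec (IDNA 2003) is assumed to agree with the Curl procs for a simple hostname; the Curl implementation may differ in details. The hostname is the example used further down in this post.

```python
# Hypothetical stand-ins for the two Curl procs, built on Python's
# standard "idna" codec. This is NOT the Curl implementation.

def hostname_to_ascii(hostname: str) -> str:
    """Encode every non-ASCII label as 'xn--' followed by its Punycode form."""
    return hostname.encode("idna").decode("ascii")


def hostname_to_unicode(hostname: str) -> str:
    """Decode every 'xn--' label back to Unicode."""
    return hostname.encode("ascii").decode("idna")


host = "our-機-resources.jp"
ascii_host = hostname_to_ascii(host)
print(ascii_host)                       # xn--our--resources-y454a.jp
print(hostname_to_unicode(ascii_host))  # our-機-resources.jp

# Raw RFC 3492 Punycode works on a single label and adds no prefix;
# the "xn--" marker belongs to the hostname (IDNA) layer on top of it.
raw_label = "our-機-resources".encode("punycode").decode("ascii")
print(raw_label)                        # our--resources-y454a
```

Note that the ASCII characters of the label are copied through first, and the single hyphen before "y454a" is the delimiter that separates them from the encoded part.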

Since we seem to be stuck with the pejorative "puny" or "piu" (for Latin puteo) or whatever, perhaps the naming could have been more like puny-to-unicode and unicode-to-puny, or at least the former could have been named

{puny-hostname-to-unicode}

but the important thing is that the methods are there and public.

But note: Punycode is only for internationalized domain names; for everything to the right of the domain, use

{let aviation:String = {url-encode-string "my-機-directory"}}

which should return

"my-%e6%a9%9f-directory"
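
Where the %e6%a9%9f comes from can be checked by hand: the character is turned into its UTF-8 bytes and each byte is written as % plus two hex digits. The Python sketch below assumes that url-encode-string follows the usual RFC 3986 rule of leaving letters, digits and "-._~" untouched; the helper percent_encode is hypothetical and is not the Curl proc.

```python
import string
from urllib.parse import quote

# Characters RFC 3986 calls "unreserved"; assumed here to be what
# the Curl proc leaves alone (an assumption, not taken from the Curl docs).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def percent_encode(text: str) -> str:
    """Percent-encode text byte by byte over its UTF-8 form (lowercase hex)."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)              # safe ASCII is copied through
        else:
            out.append(f"%{byte:02x}")  # every other byte becomes %xx
    return "".join(out)


print("機".encode("utf-8").hex())         # e6a99f
print(percent_encode("my-機-directory"))  # my-%e6%a9%9f-directory

# The standard library produces the same escapes, in uppercase hex.
print(quote("my-機-directory"))           # my-%E6%A9%9F-directory
```

Upper and lower case hex digits are equivalent in percent-escapes, so both spellings name the same path.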

whereas, e.g.,
http://our-機-resources.jp
must become
http://xn--our--resources-y454a.jp
which gives us
http://xn--our--resources-y454a.jp/my-%e6%a9%9f-directory

until we decide to make that "directory" a subdomain ... but let's not go there. Do we have a browser smart enough to ask you how you want to copy a URL out of your own address bar? My biggest annoyance is ru.wikipedia.org, which I often visit while trying to improve my Russian geography, and whose URLs display unencoded in my browser but copy encoded. Oy vey!

Please note that, according to the docs, {idn-hostname-to-ascii some-hostname} is called by {parse-url} when Curl converts the hostname part of a URL to Punycode ASCII. The {url} macro, for example, calls the proc {parse-url}.
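
The division of labor described here (hostname through Punycode, the rest through percent-encoding) can be sketched as one function. This is a rough Python illustration, not how {parse-url} is implemented: to_ascii_url is a hypothetical name, the sketch assumes an absolute URL with a host, ignores any user:password part, and passes the query and fragment through unchanged.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return an ASCII-only form of an absolute URL (illustrative sketch)."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("expected an absolute URL with a hostname")

    # Hostname: IDNA, i.e. 'xn--' plus Punycode for each non-ASCII label.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    # Path: UTF-8 percent-encoding; '/' is kept as the segment separator.
    path = quote(parts.path)

    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


print(to_ascii_url("http://our-機-resources.jp/my-機-directory"))
# http://xn--our--resources-y454a.jp/my-%E6%A9%9F-directory
```

The same character ends up in two different encodings depending on which side of the first slash it sits, which is exactly the point made above.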

Comparing the declaration of {parse-url} to that of {abs-url} in the Curl docs is instructive.