HACKER Q&A
📣 ugur2nd

How to generate URLs in Asian languages?


I developed a micro SaaS tool where people can add items and give each one a title. The title is then turned into a slug that becomes part of the item's permalink URL.

For example: "Your Welcome" => "your-welcome"

But when Asian languages are involved, the system explodes.

For example: "这是一个例子。" => "这是一个例子。"

The website does not understand this and sends the visitor to the error page. If I could convert the title to something like "Zhè shì yīgè lìzi." maybe it would work, but I searched and couldn't find a way to do that.

I tried URLEncode/decode, but it still didn't work.

Maybe you know. If there are Asian entrepreneurs among us, or people who have worked in that field, can you help?

Thank you.


  👤 dave4420 Accepted Answer ✓
Web browsers take the IRI (Internationalised Resource Identifier; think of it as a URI that may contain non-ASCII characters), turn it into a URI (with the non-ASCII characters replaced by percent-encoded UTF-8 bytes), and send that to the web server.
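
To make that concrete, here is a minimal Python sketch of the conversion applied to the path. Python is only my choice for illustration (the question doesn't say what the site runs on), and `iri_path_to_uri_path` and the `/items/` route are hypothetical names.

```python
from urllib.parse import quote


def iri_path_to_uri_path(path: str) -> str:
    """Percent-encode the non-ASCII characters of a path as UTF-8 bytes.

    Roughly what a browser does to the path before sending the request.
    "/" is left alone so the path structure survives.
    """
    return quote(path, safe="/", encoding="utf-8")


# Hypothetical permalink path containing a Chinese slug.
iri_path = "/items/这是一个例子"
uri_path = iri_path_to_uri_path(iri_path)
print(uri_path)  # ASCII only: each Chinese character becomes three %XX bytes
```

A plain ASCII slug such as `your-welcome` passes through unchanged, which is probably why the problem only shows up with non-Latin titles.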

So your website will need to decode the percent-encoded UTF-8 bytes back into Unicode characters before it looks up the page.
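
A rough sketch of the server side in Python, assuming you store the slug as a Unicode string. `slugify_unicode` and `decode_slug` are hypothetical helper names, not anything from your code, and many web frameworks already do the decoding step for you.

```python
import re
import unicodedata
from urllib.parse import unquote


def slugify_unicode(title: str) -> str:
    """Build a slug that keeps Unicode letters and digits.

    Punctuation and whitespace (including the Chinese full stop) collapse
    into single hyphens, so no romanization is needed.
    """
    text = unicodedata.normalize("NFKC", title).lower()
    # In Python 3, \w on str patterns matches Unicode letters and digits.
    text = re.sub(r"[^\w]+", "-", text)
    return text.strip("-_")


def decode_slug(raw_segment: str) -> str:
    """Turn a percent-encoded path segment back into Unicode.

    errors="strict" raises UnicodeDecodeError on invalid UTF-8 instead of
    silently substituting replacement characters.
    """
    return unquote(raw_segment, encoding="utf-8", errors="strict")


# Saving: the title becomes the stored slug.
stored_slug = slugify_unicode("这是一个例子。")

# Serving: the request arrives percent-encoded and is decoded before lookup.
incoming = "%E8%BF%99%E6%98%AF%E4%B8%80%E4%B8%AA%E4%BE%8B%E5%AD%90"
print(decode_slug(incoming) == stored_slug)
```

If your framework already hands you decoded path parameters, decoding a second time is unnecessary. What matters is that the stored slug and the incoming one end up compared as the same Unicode string.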

(On phone, simplifying a bit. I encourage you to read the relevant RFC.)