HACKER Q&A
📣 m33k44

Why are username and password fields limited to certain characters?


This could be a noob question as I am not a web developer, but why limit the username and password fields to certain characters?


  👤 imhoguy Accepted Answer ✓
Because sanitization is not easy. You can visually spoof usernames with Unicode: this Unicode string, "аdmіn", is not the same as this ASCII string, "admin" [0]. Without input restrictions, the two could be registered as separate accounts. Other alphabets offer even more lookalike variety [1].

[0] https://onlineunicodetools.com/spoof-unicode-text?input=Noth...

[1] https://qaz.wtf/u/convert.cgi?text=admin
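A minimal Python sketch of the spoofing problem, assuming a crude script check based on Unicode character names. The helper `scripts_used` is hypothetical and only illustrates the idea; a real system would more likely lean on a vetted confusables library.

```python
import unicodedata

def scripts_used(name: str) -> set[str]:
    """Guess the script of each letter from the first word of its Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in name
        if ch.isalpha()
    }

spoofed = "\u0430dm\u0456n"  # Cyrillic "а" and "і" mixed with Latin "d", "m", "n"
plain = "admin"              # pure ASCII

print(spoofed == plain)               # False: different code points
print(sorted(scripts_used(spoofed)))  # ['CYRILLIC', 'LATIN']
print(sorted(scripts_used(plain)))    # ['LATIN']
```

Rejecting names that mix scripts catches this particular trick, but not a name written entirely in lookalike letters, which is part of why a plain allowlist is tempting.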


👤 dusted
Optimistic, wishful-thinking version: It's cargo-cult cultural heritage from a time before proper input sanitization and password hashing (bad characters like " would either make the receiving backend code bork, cause an SQL escape, or be inserted as-is into a database field that doesn't support the character).

More likely: It's not cargo-cult cultural heritage and stuff is still that bad in some places.

As for hashing brute-force attacks: no problem, do an initial hashing round on the client before even sending the password to the backend.
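One way to read that suggestion, sketched in Python with both sides simulated in one file. The function names `client_prehash` and `server_hash`, the choice of SHA-256 and PBKDF2, and the iteration count are all assumptions for illustration, not something the comment specifies.

```python
import hashlib
import os

def client_prehash(password: str) -> bytes:
    # Runs on the client in this sketch: any password length becomes 32 bytes.
    return hashlib.sha256(password.encode("utf-8")).digest()

def server_hash(prehashed: bytes, salt: bytes) -> bytes:
    # The server still applies a slow, salted KDF to whatever it receives.
    return hashlib.pbkdf2_hmac("sha256", prehashed, salt, 100_000)

salt = os.urandom(16)
short = client_prehash("hunter2")
long_ = client_prehash("x" * 1_000_000)
print(len(short), len(long_))  # 32 32
```

The server can't trust that the client really did the hashing, so it would still need to reject any input that isn't exactly 32 bytes.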


👤 marklit
Long passwords can enable a denial-of-service attack because they need to be hashed: the longer the password, the longer the hashing takes. Django had to patch this back in 2013: https://github.com/django/django/commit/5ecc0f828ebe270cfc92...
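A rough sketch of the usual mitigation: check the length before doing any expensive work. The limit `MAX_PASSWORD_BYTES`, the KDF, and the iteration count are assumed values for illustration, not taken from the linked commit.

```python
import hashlib
import os

MAX_PASSWORD_BYTES = 4096  # assumed limit for this sketch

def hash_password(password: str, salt: bytes) -> bytes:
    raw = password.encode("utf-8")
    if len(raw) > MAX_PASSWORD_BYTES:
        # Reject instead of truncating, and before any hashing happens.
        raise ValueError("password too long")
    return hashlib.pbkdf2_hmac("sha256", raw, salt, 100_000)

salt = os.urandom(16)
digest = hash_password("correct horse battery staple", salt)
print(len(digest))  # 32
```

A generous cap like this blocks megabyte-sized inputs without getting in the way of long passphrases.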

👤 pie_flavor
Certain systems were built by certain very stupid people, and not only do not hash the passwords before inserting or checking them against the database, but additionally neither use the built-in parameter quoting system nor manually quote the password when it goes into the query. Fortunately they're not quite dumb enough to allow the old " OR "" = " attack, and so make sure you don't have any SQL characters like " or ; in your password.
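To make the difference concrete, here is a small sketch using Python's built-in `sqlite3`. The table layout and the function names are made up for the example, the plaintext password column deliberately mirrors the broken design described above, and the attack string uses single quotes because that is the standard SQL string delimiter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Plaintext password column, mirroring the broken design (don't do this).
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "s3cret"))

def login_unsafe(name: str, password: str) -> bool:
    # Vulnerable: user input is pasted straight into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0

def login_safe(name: str, password: str) -> bool:
    # Placeholders keep the input as data, so quotes are harmless.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

attack = "' OR '' = '"
print(login_unsafe("alice", attack))  # True: the injected OR matches every row
print(login_safe("alice", attack))    # False: treated as a literal password
```

With placeholders there is no reason to ban quotes or semicolons in passwords at all.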

👤 schappim
Sometimes the backing database is a legacy system that can only accept certain encodings or symbols.

👤 SideburnsOfDoom
A blunt instrument against SQL injection attacks: https://owasp.org/www-community/attacks/SQL_Injection

👤 muzani
So I can't set my username to "თ̶̪̫͑ა̷̘̒͝ვ̶̪̀̃ა̸̍͜დ̷̄ͅ ̴̢̟͠ტ̶̻̀̎კ̷̡̂ი̴͇͓́̍ვ̶̻̐ი̶̹͍͒ლ̷̙͘ი̵̪̠͆". For extra fun, remember that there are many forms of whitespace characters.
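For the curious, a quick Python sketch of how a site might spot both problems. The helpers `strip_combining_marks` and `unusual_whitespace` are hypothetical names, and the approach is only one of several possible.

```python
import unicodedata

def strip_combining_marks(text: str) -> str:
    # Decompose first so precomposed letters also shed their marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )

def unusual_whitespace(text: str) -> list[str]:
    # Any whitespace character other than the plain ASCII space.
    return [f"U+{ord(ch):04X}" for ch in text if ch.isspace() and ch != " "]

print(strip_combining_marks("a\u0336\u0301b"))       # ab
print(unusual_whitespace("bob\u00a0smith\u2003jr"))  # ['U+00A0', 'U+2003']
```

Note that stripping every mark also flattens legitimate accents ("é" becomes "e"), so this is a blunt tool too.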

👤 zzo38computer
Usernames should probably be limited to a subset of printable ASCII.

However, for passwords, you should allow any sequence of bytes (as far as the system can support it, and it should support any sequence, since passwords should be hashed anyway), and the maximum length should be sufficiently long (and don't truncate passwords either).
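A sketch of what that split could look like in Python. The username pattern, the length bounds, and the KDF settings are assumptions picked for the example; the fixed salt is there only to keep the output deterministic.

```python
import hashlib
import re

# Hypothetical policy: 3-32 characters from a small printable-ASCII subset.
USERNAME_RE = re.compile(r"[A-Za-z0-9_-]{3,32}")

def valid_username(name: str) -> bool:
    return USERNAME_RE.fullmatch(name) is not None

def hash_password(password: bytes, salt: bytes) -> bytes:
    # Takes raw bytes, so no character is special and nothing is truncated.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

salt = b"\x00" * 16  # fixed salt only for this sketch; use a random one per user
a = hash_password(b"\xff\x00'\";" + b"x" * 200 + b"A", salt)
b = hash_password(b"\xff\x00'\";" + b"x" * 200 + b"B", salt)

print(valid_username("zzo38computer"))     # True
print(valid_username("\u0430dm\u0456n"))   # False: Cyrillic lookalikes
print(a != b)                              # True: the last byte still matters
```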


👤 dijit
Depends what you’re doing with the username.

Sometimes it’s just about staying standards-compliant in the future:

Historically, `chown` accepted a dot as the separator between user and group (POSIX later settled on a colon), so a name like jan.harasym can be misread as user jan, group harasym by tools that still accept the old form.

If your username is for a comprehensive suite like Google’s G Suite, then having an “@” or “%” in it would make your email address invalid.

There could be encoding issues too, but I think the majority of new systems use UTF-8.

I haven’t seen a _real_ encoding issue in about 4 years.


👤 tester34
If you're displaying the username (like on NewsHacker), then maybe you don't want to allow people to use fancy, weird characters that are untypable.

👤 SahAssar
For usernames it's reasonable. People identify other people based on usernames, so allowing Unicode control characters or zero-width spaces would not be good. If your username is only for login, then it should be fine to allow anything.

For passwords it's just bad practice. Allow everything, and allow it to be very long.
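On the username side, a check for invisible characters might look like this Python sketch. The function `invisible_chars` is a hypothetical helper that relies on Unicode general categories; it is not a complete display-safety policy.

```python
import unicodedata

def invisible_chars(name: str) -> list[str]:
    # Cc = control characters; Cf = format characters such as the
    # zero-width space, joiners, and bidi overrides.
    return [
        f"U+{ord(ch):04X}"
        for ch in name
        if unicodedata.category(ch) in ("Cc", "Cf")
    ]

print(invisible_chars("SahAssar"))         # []
print(invisible_chars("Sah\u200bAssar"))   # ['U+200B']
print(invisible_chars("admin\u202e\x07"))  # ['U+202E', 'U+0007']
```

A displayed username would be rejected if the list is non-empty, while a password would never go through this check.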