TOPlap continues its struggle with password security.
The IT department has encouraged the use of accented characters in its password policy for extra security. But then users complained that they couldn't log in on some devices. After a lengthy investigation, two related problems were discovered.
The first problem is that different devices may have different input methods. Users may switch from one device to another and expect that they can still log in using the same password. But depending on the device, each accented character can be entered in either composed or decomposed form. What apparently looks like the same password can actually be encoded as different strings. It's even possible that an input string contains a mix of composed and decomposed characters.
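As a small illustration of the composed versus decomposed problem, here is a sketch using Python's standard `unicodedata` module. The variable names are only for this example; the two spellings of Ñ are written as escapes so the difference stays visible.

```python
import unicodedata

composed = "\u00d1"        # Ñ as one precomposed code point
decomposed = "N\u0303"     # N followed by a combining tilde

# The two spellings render identically but are different strings.
print(composed == decomposed)            # False
print(len(composed), len(decomposed))    # 1 2

# Normalizing both to the same form makes them comparable.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
print(nfc == composed, nfd == decomposed)  # True True
```

NFC is the composed normalization form and NFD the decomposed one; either works for comparison as long as both sides use the same form.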
The second problem is that the passwords are already hashed. We never store passwords in plaintext, as this would be very poor security. The passwords are hashed using bcrypt (libraries for bcrypt are available for most programming languages). Because bcrypt is a one-way function, it's impossible to recover the original password from the hash. The only time we can know the password is when the user enters it and we've checked that it hashes to the same result.
The result of a bcrypt hash may look like this:
bcrypt.hash("secret", 10)
$2b$10$v3I80pwHtgxp2ampg4Opy.hehc03wCR.JBZE6WHsrSQtxred57/PG
Note that the hash string includes a salt, a random string that increases the security of the password. Because the salt is random, each invocation of bcrypt can give a different result.
To check if a password matches, you must feed the salt back into the algorithm, and check that you still get the same hash.
bcrypt.hash("secret", "$2b$10$v3I80pwHtgxp2ampg4Opy.")
$2b$10$v3I80pwHtgxp2ampg4Opy.hehc03wCR.JBZE6WHsrSQtxred57/PG
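To make the structure of such a string concrete, here is a minimal sketch that splits it into its fields. It assumes the usual bcrypt layout of `$version$cost$` followed by 22 salt characters and 31 digest characters; `BcryptParts`, `split_bcrypt_hash` and `salt_prefix` are hypothetical helper names, not part of any bcrypt library.

```python
from typing import NamedTuple


class BcryptParts(NamedTuple):
    version: str   # e.g. "2b"
    cost: int      # log2 of the number of rounds
    salt: str      # 22 characters
    digest: str    # 31 characters


def split_bcrypt_hash(hashed: str) -> BcryptParts:
    """Split a bcrypt hash string into its fields (hypothetical helper)."""
    _, version, cost, rest = hashed.split("$")
    if len(rest) != 53:
        raise ValueError("expected 22 salt + 31 digest characters")
    return BcryptParts(version, int(cost), rest[:22], rest[22:])


def salt_prefix(hashed: str) -> str:
    """Return the part that is fed back into bcrypt when checking."""
    parts = split_bcrypt_hash(hashed)
    return f"${parts.version}${parts.cost:02d}${parts.salt}"


example = "$2b$10$v3I80pwHtgxp2ampg4Opy.hehc03wCR.JBZE6WHsrSQtxred57/PG"
print(salt_prefix(example))  # $2b$10$v3I80pwHtgxp2ampg4Opy.
```

In practice most bcrypt libraries accept the full stored hash in place of the salt, so this splitting is rarely needed by hand.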
You receive a list of log-in attempts. For each log-in, check if it matches using normalized and unnormalized forms.
Given some (UTF-8 encoded) test input:
etasche $2b$07$0EBrxS4iHy/aHAhqbX/ao.n7305WlMoEpHd42aGKsG21wlktUQtNu
mpataki $2b$07$bVWtf3J7xLm5KfxMLOFLiu8Mq64jVhBfsAwPf8/xx4oc5aGBIIHxO
ssatterfield $2b$07$MhVCvV3kZFr/Fbr/WCzuFOy./qPTyTVXrba/2XErj4EP3gdihyrum
mvanvliet $2b$07$gf8oQwMqunzdg3aRhktAAeU721ZWgGJ9ZkQToeVw.GbUlJ4rWNBnS
vbakos $2b$07$UYLaM1I0Hy/aHAhqbX/ao.c.VkkUaUYiKdBJW5PMuYyn5DJvn5C.W
ltowne $2b$07$4F7o9sxNeaPe..........l1ZfgXdJdYtpfyyUYXN/HQA1lhpuldO
etasche .pM?XÑ0i7ÈÌ
mpataki 2ö$p3ÄÌgÁüy
ltowne 3+sÍkÜLg._
ltowne 3+sÍkÜLg?_
mvanvliet *íÀŸä3hñ6À
ssatterfield 8É2U53N~Ë
mpataki 2ö$p3ÄÌgÁüy
mvanvliet *íÀŸä3hñ6À
etasche .pM?XÑ0i7ÈÌ
ssatterfield 8É2U53L~Ë
mpataki 2ö$p3ÄÌgÁüy
vbakos 1F2£èÓL
The first section of the input contains entries from the authentication database: usernames, each followed by the bcrypt hash of the password. The last section of the input contains a series of login attempts. Some of these attempts may be invalid (perhaps there was a typo, or perhaps somebody else tried to log in with a random password). The passwords and login attempts may contain composed or decomposed accented characters; they may even contain a mix of both.
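One possible way to read such input is sketched below. It assumes that database lines can be recognized because their second field looks like a bcrypt hash, and that everything from the first non-matching line onward is a login attempt; `parse_input` and `BCRYPT_RE` are hypothetical names. A password that itself looked like a bcrypt hash would confuse this heuristic.

```python
import re
from typing import Dict, List, Tuple

# Assumed shape of a bcrypt hash: $version$cost$ + 53 base64-like characters.
BCRYPT_RE = re.compile(r"\$2[abxy]?\$\d{2}\$[./A-Za-z0-9]{53}")


def parse_input(text: str) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Return (username -> hash, list of (username, attempted password))."""
    database: Dict[str, str] = {}
    attempts: List[Tuple[str, str]] = []
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank separator lines, if any
        # Split on the first space only, so passwords may contain spaces.
        user, _, rest = line.partition(" ")
        if not attempts and BCRYPT_RE.fullmatch(rest):
            database[user] = rest
        else:
            attempts.append((user, rest))
    return database, attempts
```

Attempts are kept as a list rather than a dictionary because the same user can appear several times.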
Looking at the first login attempt above, user etasche logged in with .pM?XÑ0i7ÈÌ.
This expands to:
. | p | M | ? | X | Ñ | 0 | i | 7 | E | ◌̀ | Ì |
But the original password was entered as .pM?XÑ0i7ÈÌ, which expands to:
. | p | M | ? | X | N | ◌̃ | 0 | i | 7 | È | Ì |
As you can see, in the original password, the Ñ was decomposed, but in the login attempt, the È was decomposed.
To conclude, this login is indeed valid, because both passwords can be normalized to the same string.
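Because only the hash of the originally typed spelling is stored, the stored password itself cannot be normalized. One possible approach, sketched here under that assumption, is to try every composed/decomposed spelling of the attempt against the hash. `spelling_variants`, `is_valid_login` and the `matches_hash` callback are hypothetical names. The sketch treats each character as either fully composed or fully decomposed and ignores partial decompositions of characters that carry several accents.

```python
import itertools
import unicodedata
from typing import Callable, Iterator


def spelling_variants(password: str) -> Iterator[str]:
    """Yield every composed/decomposed spelling of `password`."""
    composed = unicodedata.normalize("NFC", password)
    # For each character, collect its composed and decomposed spellings;
    # characters without a decomposition have only one option.
    options = []
    for ch in composed:
        decomposed = unicodedata.normalize("NFD", ch)
        options.append((ch,) if decomposed == ch else (ch, decomposed))
    for combo in itertools.product(*options):
        yield "".join(combo)


def is_valid_login(attempt: str, matches_hash: Callable[[str], bool]) -> bool:
    """True if any spelling of `attempt` satisfies `matches_hash`."""
    return any(matches_hash(v) for v in spelling_variants(attempt))


# Stand-in for a real hash check: the "stored" spelling from the example,
# with a decomposed Ñ and composed È and Ì.
stored = ".pM?XN\u03030i7\u00c8\u00cc"
attempt = ".pM?X\u00d10i7E\u0300\u00cc"
print(is_valid_login(attempt, lambda s: s == stored))  # True
```

With the Python `bcrypt` package, `matches_hash` could wrap `bcrypt.checkpw`, which takes the UTF-8 encoded password and the stored hash as bytes. The number of variants doubles with each accented character, so this only stays cheap for short passwords.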
In the same vein, we can check all twelve login attempts and get the following result:
etasche .pM?XÑ0i7ÈÌ is a valid login.
mpataki 2ö$p3ÄÌgÁüy is not a valid login.
ltowne 3+sÍkÜLg._ is not a valid login.
ltowne 3+sÍkÜLg?_ is a valid login.
mvanvliet *íÀŸä3hñ6À is not a valid login.
ssatterfield 8É2U53N~Ë is not a valid login.
mpataki 2ö$p3ÄÌgÁüy is not a valid login.
mvanvliet *íÀŸä3hñ6À is not a valid login.
etasche .pM?XÑ0i7ÈÌ is a valid login.
ssatterfield 8É2U53L~Ë is a valid login.
mpataki 2ö$p3ÄÌgÁüy is not a valid login.
vbakos 1F2£èÓL is not a valid login.
In this example, 4 (out of 12) logins were valid.
How many valid logins are there for your puzzle input?
Reading & reference materials
- Unicode Normalization
- Unicode normalization forms
- Comparing Unicode codepoints can be tricky, but it's essential when searching in texts
Thanks to Roel Spilker for providing inspiration for this puzzle.