
Can you expand on what the security concerns are with confusables in symbol names in C source code? Clearly there's a security concern with cut-n-paste, but that's true regardless of what the rules for C identifiers might be.

It's not like it's obvious that UTR#39 applies literally everywhere that there are "identifiers".
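To make the confusables case concrete, here is a minimal sketch in Python (chosen only because its standard library ships the Unicode character database; the same facts hold for identifiers in C source). The names `latin`, `mixed`, and `scripts_used` are hypothetical, and taking the first word of a character's Unicode name is a crude stand-in for a real UTR#39 script or skeleton check, not an implementation of it.

```python
import unicodedata

# Two identifiers that render alike in many fonts.
latin = "scope"            # every letter is Latin
mixed = "s\u0441ope"       # second letter is U+0441 CYRILLIC SMALL LETTER ES

def scripts_used(identifier):
    """Crude script guess: first word of each character's Unicode name.

    Assumption for illustration only; a real check would use the
    Script property and the UTR#39 confusables data.
    """
    return {unicodedata.name(ch).split()[0] for ch in identifier}

# The two strings are different code point sequences...
distinct = latin != mixed

# ...and normalization does not unify them, because U+0441 has no
# decomposition mapping to Latin "c".
unified_by_nfkc = (
    unicodedata.normalize("NFKC", latin) == unicodedata.normalize("NFKC", mixed)
)

print(distinct, unified_by_nfkc)
print(sorted(scripts_used(latin)), sorted(scripts_used(mixed)))
```

The point of the sketch is only that confusables are a separate problem from normalization: no normalization form folds these two spellings together, so any defense has to come from something like a mixed-script or skeleton check.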

Also, can you speak to what the security concern is with form-insensitivity (rather than confusables) for symbols in input source files? I just don't see a concern there at all, but maybe I'm missing something.
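For reference, this is roughly what form-insensitive comparison of symbols means, again as a Python sketch using only the standard `unicodedata` module. The helper `same_identifier` is hypothetical, and picking NFC as the comparison form is an assumption; nothing here describes what any particular C compiler does.

```python
import unicodedata

composed = "caf\u00e9"      # "é" as the single code point U+00E9
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT

def same_identifier(a, b, form="NFC"):
    """Form-insensitive comparison: normalize both sides, then compare."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# A byte-wise (form-sensitive) comparison treats these as two symbols.
bytewise_equal = composed.encode("utf-8") == decomposed.encode("utf-8")

# A form-insensitive comparison treats them as one symbol.
normalized_equal = same_identifier(composed, decomposed)

print(bytewise_equal, normalized_equal)
```

Under this sketch the two spellings differ as bytes (5 versus 6 bytes of UTF-8) but compare equal once normalized, which is the behavior being asked about for symbols in source input.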

Lastly, I think `#include` is the most important place to get this right, since it interfaces with the world outside the compiler (specifically, the filesystem). But as you note, filesystems mostly are just-use-8bit -- very few normalize on create (HFS+) or are form-insensitive (ZFS). The other place to get this right is the object file output side, where symbols definitely must be normalized.

Oh, one more thing: the platform might impose some rules regarding symbols in ELF and any other object file formats. Are they known to? I suppose C can't necessarily cater to all platform-imposed limitations on symbol naming, but it'd be useful to know about them.


