
So, people can't hardly write safe web apps in PHP without spraying XSS and auth bypasses and arbitrary shell executions and arbitrary SQL injections everywhere, and you also want to hand the attackers the ability to segfault your server or possibly even straight-up run arbitrary code?

Anyone smart enough to truly safely code a website in C is smart enough to learn a language to create that website which doesn't get on all fours and beg to be owned.

Yes, I mean that 100% seriously. If you're smart enough to safely write that C code, you won't. If you aren't smart enough to write that C code safely... and the first clue that you aren't is that you think that you are... you don't stand a chance. It's suicidal.



I use C every day, and I wrote a whole framework for writing websites in C (no longer maintained because I'm no longer that interested in writing websites):

http://web.archive.org/web/*/http://www.annexia.org/freeware...

What was amazing was that you could write pretty complex dynamic websites that fit into a few K of RAM and could trivially handle huge load.

I originally wrote that framework for a chat service. It ran for years on a single 32-bit CPU box with 128 MB of RAM, handling the chat requirements of dozens of English schools.


Unexpectedly popular comment :-) A bit more background:

- It used a scheme for cooperative threading that is similar to coroutines. This meant you could write straightforward code and all the event driven stuff was done automatically behind the scenes.

- Parallelism wasn't so important back in 2001 but you could also do that by forking N reactors (one for each core).

- There was a C string library modelled on Perl. It had lots of string management, vectors and hashes. Buffer overflows were effectively made impossible by the string handling library.

- UTF-8 was used throughout and it was fully i18n-able (using gettext).

- Template library.

- There was a pool-based memory allocator library (similar to Apache's APR / Samba's talloc). Effectively you didn't have to worry about memory allocation at all except in some rare corner cases, mainly when you wanted to store something in a global cache.

- All persistence done through PostgreSQL using a Perl DBI-like library. SQL injection was impossible because it used prepared statements.

- There was a rather experimental system built on top of this that allowed you to embed widgets in webpages so you could write very interactive stuff without using JavaScript (something of a concern back in 2001, not so much now). It maintained the widget state transparently across page reloads. AFAIK no one has ever done anything like this before or since.
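To make the pool-based allocator bullet a bit more concrete, here is a minimal sketch of the general idea: every allocation is tied to a pool, and freeing the pool releases everything at once (for example at the end of a request). This is not the framework's actual code, nor APR's or talloc's API; `pool_t`, `pool_new`, `pool_alloc` and `pool_free` are hypothetical names, and the sketch assumes a C11 compiler for `max_align_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout: a sketch of the pool idea, not the
 * original framework's API (nor APR's or talloc's). */
typedef union pool_block {
    union pool_block *next;  /* link to the previously allocated block */
    max_align_t align;       /* keeps the payload suitably aligned (C11) */
} pool_block;

typedef struct {
    pool_block *head;  /* most recent allocation */
    size_t count;      /* number of live allocations */
} pool_t;

/* Create an empty pool; returns NULL if out of memory. */
pool_t *pool_new(void)
{
    pool_t *p = malloc(sizeof *p);
    if (p != NULL) {
        p->head = NULL;
        p->count = 0;
    }
    return p;
}

/* Allocate n bytes owned by the pool; returns NULL on failure. */
void *pool_alloc(pool_t *p, size_t n)
{
    assert(p != NULL);
    if (n > SIZE_MAX - sizeof(pool_block))
        return NULL;  /* header + payload would overflow size_t */
    pool_block *b = malloc(sizeof *b + n);
    if (b == NULL)
        return NULL;
    b->next = p->head;
    p->head = b;
    p->count++;
    return b + 1;  /* payload starts right after the header */
}

/* Release every allocation made from the pool, then the pool itself. */
void pool_free(pool_t *p)
{
    if (p == NULL)
        return;
    pool_block *b = p->head;
    while (b != NULL) {
        pool_block *next = b->next;
        free(b);
        b = next;
    }
    free(p);
}
```

A request handler would typically call `pool_new()` on entry, use `pool_alloc()` for all scratch memory, and call `pool_free()` once on exit, so no individual `free()` calls are needed. Data that must outlive the request (the global cache case mentioned above) would need a separate, longer-lived pool.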

It was, to some extent, a bit crazy that I wrote all of the above in about 6 months, but I was being paid a lot of money, working at a very dysfunctional company that was on the verge of going out of business, and didn't have much else to do.


Seems like the rules for C web development are the same as for any other language: don't trust user input, and delegate the sanitization to vetted library functions. It's not like it's 1991 and you have to use plain arrays and strcmp; there are really good, safe libraries for these things.

That said, doing web development in a language with neither a REPL nor built-in unicode support sounds like a Bad Time.


Still an order of magnitude easier not to shoot yourself in the foot in most higher level languages.

Pretty sure you still have to use plain arrays and strcmp, what are these "safe" libraries you were going to use? Unless we are talking about C++ here?

Also C supports unicode fine (to the extent it supports strings) and REPL can't hardly be considered a requirement for web development considering Java, .NET and PHP* don't have REPL's.

*Looks like PHP has some now.


> Pretty sure you still have to use plain arrays and strcmp, what are these "safe" libraries you were going to use? Unless we are talking about C++ here?

glib: https://en.wikipedia.org/wiki/GLib

There are others. Basically, if you wrap your dangerous C app in a thin, impenetrable layer of solid string processing and input validation, it's very manageable.
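As a rough illustration of what one piece of that "thin layer" of input validation might look like, here is an allow-list check for a request parameter. The function name, the 32-byte limit and the permitted character set are all assumptions made for the example; they are not taken from glib or any other library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit chosen for this example. */
#define USERNAME_MAX 32

/* Hypothetical helper: accept only 1..USERNAME_MAX bytes drawn from
 * [A-Za-z0-9_]. The length is passed explicitly so that an embedded
 * NUL byte is rejected instead of silently ending the string. */
bool is_valid_username(const char *s, size_t len)
{
    assert(s != NULL);
    if (len == 0 || len > USERNAME_MAX)
        return false;
    for (size_t i = 0; i < len; i++) {
        char c = s[i];
        bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                  (c >= '0' && c <= '9') || c == '_';
        if (!ok)
            return false;
    }
    return true;
}
```

The point of the allow-list design is that anything not explicitly permitted is refused at the boundary, so the rest of the program only ever sees input in a known shape.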

Valgrind and input fuzzing help considerably to work out any bugs.


> Still an order of magnitude easier not to shoot yourself in the foot in most higher level languages.

I would argue that C's lack of robust string concatenation encourages most people to avoid concatenating strings at all costs. Most DB libraries support bound parameters, which would be much easier to use than constructing an arbitrary sql string in C. So, I would argue the tendency for a competent C programmer is to do the safe thing rather than the lazy thing other languages make easy that exposes you to SQL injections.

Along with that, most scripting languages are written in C. I know a lot of people who have written PHP extensions in C. This article seems to suggest that no one does any web development in C, when almost every large company I know of does so, even if it is just to speed up slow parts of their app by adding new functions to PHP.


> Pretty sure you still have to use plain arrays and strcmp, what are these "safe" libraries you were going to use?

So, I thought it would be worth giving some examples. By far the lowest-level solutions are things like strlcpy & strlcat, which basically still live in a NULL terminated world but try not to be stupid about it:

http://www.gratisoft.us/todd/papers/strlcpy.html
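Since strlcpy is not available in every C library, here is a minimal portable sketch that follows the same documented contract: copy at most `dstsize - 1` bytes, always NUL-terminate when `dstsize > 0`, and return the full length of the source so the caller can detect truncation. `bounded_copy` is a hypothetical name for this example, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy-style semantics.
 * Returns strlen(src); a return value >= dstsize means the copy
 * was truncated. */
size_t bounded_copy(char *dst, const char *src, size_t dstsize)
{
    assert(src != NULL);
    size_t srclen = strlen(src);
    if (dstsize > 0) {
        assert(dst != NULL);
        /* Leave room for the terminator. */
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

A caller would check `bounded_copy(buf, input, sizeof buf) >= sizeof buf` and treat that as an error rather than carrying on with a silently shortened string.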

There are some specifically targeting strings and making them both more efficient and safer:

http://bstring.sourceforge.net/

There are more sophisticated runtimes like glib or APR, which almost seem like they are trying to completely replace the C runtime, but they provide very clean memory management interfaces and string & blob/block abstractions that allow you to avoid having to worry about a buffer overflow.

Then there are solutions built on top of the likes of that. Things like GSK: http://gsk.sourceforge.net/

There's lots more, but it's late and I'm tired. ;-)


For strcmp, the safer strncmp version?
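One caveat worth sketching: strncmp bounds how far the comparison reads, but on its own it only checks a prefix, so `strncmp(s, "admin", 5) == 0` is also true for "administrator". A small helper that compares lengths first avoids that. `token_equals` is a hypothetical name used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only if the len-byte token tok is exactly
 * equal to the NUL-terminated literal lit. tok need not be
 * NUL-terminated, which suits tokens sliced out of a request buffer. */
bool token_equals(const char *tok, size_t len, const char *lit)
{
    assert(tok != NULL && lit != NULL);
    return len == strlen(lit) && memcmp(tok, lit, len) == 0;
}
```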

Also you should confine your app with an AppArmor profile and run it on a grsecurity kernel.

REPL like behaviour you can get with gdb. :)


> Still an order of magnitude easier not to shoot yourself in the foot in most higher level languages.

There are some best practices that tend to help you to limit the risk a lot. Still... a lot of people do web development in JavaScript, and I'd cite it as a very strong exception to your assertion.

> Pretty sure you still have to use plain arrays and strcmp, what are these "safe" libraries you were going to use?

Pretty much all of the "NULL terminated" functions have a length terminated equivalent that you can (and should) use instead. There are also blob & string abstractions available that wrap arrays and strings in structs that have fields to track the size of the allocated space, with the side benefit of making C's evil type coercion a bit harder to bump into.
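A minimal sketch of such a string abstraction, assuming nothing beyond the standard library: the struct carries its own length and capacity, and the only way to add data is through an append function that grows the buffer first. `strbuf` and its functions are hypothetical names, not the API of bstring, glib or APR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-tracked string buffer. */
typedef struct {
    char *data;  /* NUL-terminated once non-NULL */
    size_t len;  /* bytes in use, excluding the terminator */
    size_t cap;  /* bytes allocated */
} strbuf;

void strbuf_init(strbuf *sb)
{
    sb->data = NULL;
    sb->len = 0;
    sb->cap = 0;
}

/* Append n bytes from src, growing the buffer as needed.
 * Returns false (leaving sb unchanged) on overflow or allocation failure. */
bool strbuf_append(strbuf *sb, const char *src, size_t n)
{
    assert(sb != NULL && (src != NULL || n == 0));
    if (n > SIZE_MAX - sb->len - 1)
        return false;  /* len + n + 1 would overflow size_t */
    size_t need = sb->len + n + 1;
    if (need > sb->cap) {
        size_t cap = sb->cap ? sb->cap : 16;
        while (cap < need) {
            if (cap > SIZE_MAX / 2) {
                cap = need;  /* doubling would overflow; use exact size */
                break;
            }
            cap *= 2;
        }
        char *p = realloc(sb->data, cap);
        if (p == NULL)
            return false;
        sb->data = p;
        sb->cap = cap;
    }
    if (n > 0)
        memcpy(sb->data + sb->len, src, n);
    sb->len += n;
    sb->data[sb->len] = '\0';
    return true;
}

void strbuf_free(strbuf *sb)
{
    free(sb->data);
    strbuf_init(sb);
}
```

Because callers never index into a fixed-size array themselves, the bounds check lives in exactly one place instead of at every call site.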

> Also C supports unicode fine

Particularly if you use ICU4C, C actually has the best unicode support out there (obviously there is ICU4J which gets merged into Java regularly, but often you get stuck with an old VM with an ancient version). It's actually kind of shocking how painful it is to have full unicode support with higher level languages that really ought to know better.

> and REPL can't hardly be considered a requirement for web development considering Java, .NET and PHP* don't have REPL's.

> *Looks like PHP has some now

Not only does PHP have one, but Java has since forever (http://www.beanshell.org/), and .NET really kind of does have a few semi-reasonable options (http://www.linqpad.net/, http://www.sliver.com/dotnet/SnippetCompiler/, http://www.mono-project.com/CsharpRepl, not to mention: http://technet.microsoft.com/en-us/library/bb978526.aspx).

Of course, so does C (http://root.cern.ch/drupal/content/cint, http://root.cern.ch/drupal/content/cling, http://www.softintegration.com/products/chstandard/, and arguably even things like https://code.google.com/p/picoc/ or http://ups.sourceforge.net/main.html can serve if you are desperate).


You mean like this?

https://github.com/tyler/Bogart/blob/master/bogart.c#L53

I'm sure nothing could possibly go wrong there...


To be fair, it looks like a "get it working" proof of concept, but I certainly wouldn't take that into production :)


The author explicitly says it's not intended for production in the README:

> This code has every security flaw you imagine, tons of memory leaks, and when I wrote it I may have been awake for much longer than one should be when writing C.


On the other hand, if your development turnaround time is measured in seconds (due to fast compiles from a good toolchain) it really doesn't matter what you're programming in. C with a good set of libraries backing it up is probably just fine.


I once wrote a website (a search engine for a specific set of websites) in C. It actually worked, once I spent 20 hours in valgrind. I've grown since then, in two important ways. First, I'd probably do a better job now, and not have to spend any time in valgrind at all (I still use C quite frequently). Second, I'd never, ever, try to pull that stunt again.


> It actually worked, once I spent 20 hours in valgrind

This is exactly the main problem with C.

You need to rely on tooling outside the language to be able to write safer code.

Meanwhile, languages like Ada and Modula-2, and their descendants, offer the same hardware capabilities as C with stronger type checking.


We use C# which allows you to write safe code. However, people still manage to fuck it up at an implementation level often enough for it to cause high risk problems.


I do mostly JVM and .NET based development nowadays, in consulting projects.

Sometimes I wish to be part of a C or C++ based project, then I try to imagine how the quality of our offshore guys would map to those languages and realize how lucky I am not to be part of such projects.


Aren't you doing something wrong when you advise the company to use (more) offshore guys? I thought that a company is best led when developers share their knowledge cooperatively and ask their managers to outsource unimportant, time-consuming things like API/file conversions, legacy code support, CSVs …

(Disclaimer: Don't get my tone wrong please, I'm asking not suggesting, thus I respect your experience.)


Most consulting projects in Fortune 500 companies end up with outsourcing the whole project department to the consulting company, in the cases where IT is not the main business.


Oh, I didn't know that it's so extreme.. thanks for reporting back, I appreciate it :)


That's what is supposed to happen but the MBA asshats use it purely for cost cutting...


Yeah, very good point. Our offshore guys can't even get C# right :)


If you have not already and can do so, I highly recommend adding DTrace to your C development toolkit. DTrace, Valgrind, and GDB make rooting out C runtime issues a lot more pleasant and complement one another well.


Indeed. It's a pity DTrace is not available in Linux (there are two ports, and neither is usable for real work). It's also a pity DTrace in OS X is starting to bit rot.


A clean room reimplementation of DTrace is on my list of ideal computing wants that will probably never happen.

Also on the list is everyone targeting the same hypervisor for device drivers (such as Xen) so that hardware support is excellent for all operating systems and all devices.

I would really, really like a clean room reimplementation of Hex-Rays' IDA Pro so that I can use it on any OS and architecture, an LLVM front end for Plan 9/Inferno OS, a clean room reimplementation of ZFS, GNUstep to have at least a 1:1 implementation of Cocoa so that no matter the OS and architecture one can target that GUI kit and we have inter-application reuse of functionality through scripts like we have for CLI apps, and a clean room reimplementation of AutoCAD and Cadence's OrCAD.

And as someone who uses a CAS or equivalent environment a lot, I would really appreciate it if an ecosystem such as Julia, IPython, or Octave would reach and exceed Mathematica and Matlab in ease of use, degree of combination and semantics possibilities, and power as well as efficacy.

I really would like to be able to use Plan 9/Inferno OS all the time but developer tools are not comparable to any BSD or Linux ecosystem and there is not a good GUI toolkit available (I would like GNUstep here, which is why I want that implementation to succeed).

People who can reverse engineer software and hardware are a very small population compared to the rest of the dev population, though, and they are very likely to end up in a lawsuit if they try.


I don't know where the problem is, I use http://gpo.zugaina.org/dev-util/dtrace on Gentoo without a problem.


We actually used C in ~2000 at our web company (one of the biggest in our country at that time). Even the Finnish EU commission site ran on a C platform that I wrote. Oh the days..


> If you aren't smart enough [to safely code a website in C...] the first clue that you aren't is that you think that you are

Gee, isn't that epistemology at its finest. Personally I've never met anyone who programmed in C because they were too dumb to learn anything else, but who knows. You may however be aware that, before an HTTP packet even makes it to your shiny, scripty web page, it's often processed by a succession of "segfault-y" and "arbitrary-code-running" software such as Apache and Linux.


> I've never met anyone who programmed in C because they were too dumb to learn anything else

I have.

They were too dumb to understand that they were writing unsafe C code because they weren't skilled enough to write safe C code. So it never occurred to them that they had a problem that would be solved by writing part of their system in Python or Ruby or Java.

> it's often processed by a succession of "segfault-y" and "arbitrary-code-running" software such as Apache and Linux

The vast majority of C programmers are much less skilled than Apache and Linux developers.


>They were too dumb to understand that they were writing unsafe C code

"Too dumb to catch vulnerabilities in their code" is a much wider category than "too dumb to learn anything else [besides C]". You're not really providing an example of the second.

>The vast majority of C programmers are much less skilled than Apache and Linux developers

So we agree that at least some C developers can write consistently safe code. So a person is not necessarily "not smart" just because of this language choice.


> it's often processed by a succession of "segfault-y" and "arbitrary-code-running" software such as Apache and Linux.

If your web code has ~20 years of security audits and fixes it's probably perfectly safe to use it. If you're adding large new features under tight deadlines I don't recommend it.


Just addressing the OP's point that "if you're smart enough to safely write that C code, you won't".


This makes me wonder, are there any decent (military grade?) web frameworks for Ada?


One of the big items on my todo-list is to play a bit with aws (the Ada web server):

http://libre.adacore.com/tools/aws/

I've no idea what the architecture is like, but being Ada, I'm pretty hopeful it delivers some nice guarantees (or at least hefty promises) regarding reliability.

A hello-world example web server:

  https://en.wikibooks.org/wiki/Ada_Programming/Libraries/Web/AWS

An outdated, probably flawed (aren't they all) benchmark:

  http://wiki.ada-dk.org/aws_vs_node.js

If nothing else it seems to indicate that aws isn't hopelessly slow.

There's also awa - the Ada web framework. I've yet to play with it, but it would appear to be relevant:

  http://code.google.com/p/ada-awa/

According to a stack overflow answer[s], Ada also comes with a spitbol-package, allowing you to use spitbol/snobol rather than regexps for pattern matching. See eg the bottom of:

  http://www.adacore.com/adaanswers/gems/gem-26-2/

[s] http://stackoverflow.com/questions/5904053/web-programming-i...


Are you aware of Ada Web Server (AWS)?

http://www.adacore.com/adaanswers/gems/gem-29/


Using that same logic we shouldn't use C for anything, because we might make a mistake.

I wouldn't use this not because of possible mistakes leaking in, but because the higher-level languages have already solved some of the problems you would have to solve yourself, such as handling unicode. There are C frameworks you could use but my point is that you would come across problems that have already been solved, and you would have to solve them again, but this time for libCello.

Now that's for production and work. To mess around on my own time? Sounds like fun to me. C is my favorite language but I use it every day programming mobile devices so I'm probably a little unusual. Maybe a little website experiment or something. If it goes down or gets owned, rebuild time.


And we shouldn't use C for anything, if it can at all be avoided, because the likelihood and aftermath of mistakes are enormous.


Then by that logic, you shouldn't use a Computer, because the likelihood and aftermath of mistakes are enormous. :) c'mon, don't be that close-minded.


I disagree completely. But again, I use it daily. The performance and footprint benefits of C outweigh the negatives, which are basically summed up as - it's too much power.


It's entirely possible to have a language that is precisely as performant as C, but significantly safer. Several exist as well.

It's too little safety.


I'm not sure it's that cut and dry. For one thing, PHP itself has vulnerabilities, as do its flagship apps. These vulnerabilities are easy to scan for, because the default configuration advertises that it is there and what version it is. If you're worried about some kid running scripts or idly scanning, you're probably in better shape with a custom C program than with a widely-used PHP program, even though the custom C program is likely to be more fragile and crumple more easily under actual expert attention.


are you actually arguing for security by obscurity?


No, I'm saying something more subtle than that, which is why I used four sentences instead of three words.


> If you're smart enough to safely write that C code, you won't.

That's one of the best things I've read in the last few days.


> So, people can't hardly write safe web apps in PHP without...

I've said it before, but it bears repeating:

PHP is C for people who shouldn't write in C... or PHP.


Did you even seriously try it before you came up with this highly opinionated post?


Good summary. The answer to the question in the title would be: yes.


Yeah, nobody would let anything important rely on an application written in C. Like apache or nginx. Or postgresql. Or your whole operating system. That would be crazy.



