Also, e-voting can be hacked (I guess they vote from their computer/smartphone, which can be hacked from the other side of the world). Voting is the last place where you want to have to worry about phishing, IMO.
Good luck hacking in-person voting or even "physical" mail voting from the other side of the world.
Regular ballot voting can also be hacked, and at scale: making ballots invalid while counting them, modifying them in some form or other, intentionally writing wrong values in the counting protocols...
And of course controlled or paid votes...
E-voting can also expose voting fraud, and it has -- see Venezuela.
Yeah but it cannot be hacked from the other side of the world. I think it's a different kind of threat.
If an attacker from somewhere else in the world wants to tamper with their votes, they have to get Swiss people to modify the ballots, or get their agents to learn Swiss German. Good luck with that :D.
But I think that the main reason is that Brazil's elections were a lot dirtier and a lot more unreliable than Switzerland's.
What I mean is that the push towards e-voting is much stronger in countries with unreliable elections, because e-voting is harder to tamper with than the crude ways you can defraud paper ballots.
Switzerland and other organized countries have elections that are "good enough", so the push towards e-voting is probably not that strong.
Is the "leapfrog" concept. Sometimes it is easier to adopt newer technologies in places where the existing ones are horrible. Other examples: electronic payment systems, solar panels and EVs in India and Africa.
Actually I don't understand the push towards e-voting in countries like Switzerland. E-voting can be hacked from the other side of the world, because it happens on computers. In-person voting or physical mail is much harder to hack from the other side of the world.
Most of the push for e-voting in Switzerland is from the Swiss abroad (10% of the electorate), who have a right to vote, but whose exercise of that right is subject to the vagaries of the international postal system. I personally have had problems with postal ballots sent from Australia to Switzerland arriving without enough time left to return them; presumably Swiss voters in Australia have similar problems, let alone those in less-developed countries.
It's not necessarily easy. The timing of Australian elections from issuance of writs is limited by the constitution, and since they can occur at the discretion of the prime minister, you can't prepare for them in advance.
Swiss votes are scheduled in advance, but the explanatory material and campaign flyers still have to be made and in order to be topical you don't want to make them too early. In particular the consequences of previous votes can affect the upcoming votes, and the closest interval is only 2 months (September/November).
Can't talk about Switzerland, don't know the particularities.
But in continental countries like Brazil it makes a lot of sense. It is cheaper, faster and safer.
> E-voting can be hacked from the other side of the world, because it happens on computers
How do you "hack from the other side of the world" a computer that isn't even online? True, the transmission of computed results is made online, but keeping that safe is trivial, banks do it.
> How do you "hack from the other side of the world" a computer that isn't even online?
Supply chain attacks. You just need to get in there before everything is cryptographically signed and sealed.
The best part is even if they published the source code for everything it would prove nothing. I seriously doubt the builds are reproducible. Lack of source-to-binary correspondence means source code would serve only to embarrass anyone protesting the electronic voting system.
As it stands, nothing short of a full audit of the complete signed software image that the machine booted and executed on election day would suffice. The judge-kings are on the record saying this system is UNQUESTIONABLE so they should publish the image on the internet and let the whole world take a look. I'm sure no faults will be found.
Not really. It started with hundreds of thousands of votes.
That said, Brazilian elections are completely unverifiable. They don't seem to be falsified, but it's hard to point to anything that would look different if they were.
When it started, in the '90s, the machines were at least simpler and it was possible to verify them down to the firmware. Nowadays, they are not.
The title is misleading. It's an e-voting PILOT. That's important. "Switzerland is running small-scale e-voting pilots in four of its 26 cantons", three of which were not affected.
From Wikipedia [1]:
> A pilot experiment, pilot study, pilot test or pilot project is a small-scale preliminary study conducted to evaluate feasibility, duration, cost, adverse events, and improve upon the study design prior to performance of a full-scale research project.
Switzerland has been very careful/conservative about rolling out e-voting. The same cannot be said of other jurisdictions (like Ontario's municipal elections) where adoption is very rapid and without coordination/support/standards from the provincial or federal governments.
And it makes it sound like a production system failed, whereas what actually happened is that this was a pilot that worked in 3 of the 4 involved cantons, and the people who participated in it knew it was a pilot.
> an argument for protecting that test suite and API specification under copyleft terms.
If we protect APIs under copyright, it makes it easier to prevent interoperability. We obviously do NOT want that. It would give big companies even more power.
Now in the US, the Supreme Court has ruled that the output of an LLM is not copyrightable. So even a permissive licence doesn't work for that reimplementation: it should be public domain.
Disclaimer: I am all for copyleft for the code I write, but even without LLMs, one could already rewrite a similar project and use whatever licence they please. LLMs make them faster at that; it's just a fact.
Now I wonder: say I vibe-code a library (so it's public domain in the US), I don't publish that code but I sell it to a customer. Can I prevent them from reselling it? I guess not, since it's public domain?
And what about an employee writing code for a company? If I produce public domain code because it is written by an LLM, can I publish it, or can the company prevent me from doing so?
- Natural languages are ambiguous. That's the reason why we created programming languages. So the documentation around the code is generally ambiguous as well. Worse: it's not being executed, so it can get out of date (sometimes in subtle ways).
- LLMs are trained on tons of source code, which is arguably a smaller space than natural languages. My experience is that LLMs are really good at e.g. translating code between two programming languages. But translating my prompts to code is not working as well, because my prompts are in natural languages, and hence ambiguous.
- I wonder if it is a question of "natural languages vs programming languages" or "bad code vs good code". I could totally imagine that documenting bad code helps the LLMs (and the humans) understand the intent, while documenting good code actually adds ambiguity.
What I learned is that we write code for humans to read. Good code is code that clearly expresses the intent. If there is a need to comment the code all over the place, to me it means that the code is maybe not as good as it should be :-).
Of course there is an argument to make that the quality of code is generally getting worse every year, and therefore there is more and more a need for documentation around it because it's getting hard to understand what the hell the author wanted to do.
> If there is a need to comment the code all over the place, to me it means that the code is maybe not as good as it should be :-)
If good code was enough on its own we would read the source instead of documentation. I believe part of good software is good documentation. The prose of literate source is aimed at documentation, not line-level comments about implementation.
Confusing code is one thing, but projects with more complex requirements or edge cases benefit from additional comments and documentation. Not everything is easily inferred from code or can be easily found in a large codebase. You can also describe e.g. chosen tradeoffs.
I have written code that was correct and necessarily written the way it was, only to have it repeatedly altered by well-meaning colleagues who thought it looked wrong, inefficient, or unidiomatic. Eventually I had to fill it with warning comments and write a substantial essay explaining why it had to be the way it was.
Code tells you what is happening but it doesn't always do it so that it is easy to understand and it almost never tells you why something is the way it is.
Difficult to say without an example, but "code isn't enough" is just one possible conclusion in this case. Another one could be that the code is not actually as good as expected, and another one is that the colleagues may need to... do something about it.
An obvious example I have is CMake. I have seen so many people complaining about CMake being incomprehensible, refactoring it to make it terrible, even wrapping it in Makefiles (and then wrapping that in Dockerfiles). But the problem wasn't the original CMakeLists or a lack of comments in it. The problem was that those developers had absolutely no clue about how CMake works, and felt like they should spend a few hours modifying it instead of spending a few hours understanding it.
However, I do agree that sometimes there is a need for a comment because something is genuinely tricky. But that is rare enough that I call it "a comment" and not "literate programming".
What do you mean by "poorly documented"? I have been using it for 20 years, I have yet to find something that is not documented.
As for convoluted, I don't find it harder than the other build systems I use.
Really the problem I have with CMake is the amount of terribly-written CMakeLists. The norm seems to be to not know the basics of CMake but to still write a mess and then complain about CMake. If people wrote C the way they write CMake, we wouldn't blame the language.
But the documentation can really help in telling why we are doing things. That also seeps into naming things like classes. If that were not so, we'd just name everything Class1, Class2, Method1, Method2 and so on.
The code is what it does. The comments should contain what it's supposed to do.
Even if you give them equal roles, self-documenting code versus commented code is like having data on one disk versus having data in a RAID array.
Remember: Redundancy is a feature. Mismatches are information. Consider this:
// Calculate the sum of one and one
sum = 1 + 2;
You don't have to know anything else to see that something is wrong here. It could be that the comment is outdated, which has no direct effects and is easily solved. It could be that this is a bug in the code. In any case it is information and a great starting point for looking into a possible problem (with a simple git blame). Again, without needing any context, knowledge of the project or external documentation.
My take on developers arguing for self-documenting code is that they are undisciplined or do not use their tools well. The arguments against copious inline comments are "but people don't update them" and "I can see less of the code".
> Redundancy is a feature. Mismatches are information. Consider this:
Respectfully, if someone wrote code like this, I wouldn't want to work with them. I mean next step is "I copy paste code instead of writing functions, and in the comment above I mention all the other copies, so that it's easy to check that they are all doing the same thing redundantly".
> The arguments against copious inline comments are "but people don't update them" and "I can see less of the code".
Well no, that's not my argument. I have been navigating code for 20 years and in good codebases, comments are rare and describe something "surprising". Good code is hardly surprising.
My problem with "literate programming" (which means "add a lot of comments in the implementation details") is that I find it hard to trust developers who genuinely cannot understand unsurprising code without comments. I am fine with a junior needing more time to learn, but after a few years if a developer cannot do it, it concerns me.
You did not engage with my main arguments. You should still do so.
1. Redundancy: "The code is what it does. The comments should contain what it's supposed to do. [...] You don't have to know anything else to see that something is wrong here." and specifically the concrete trivial (but effective) example.
2. "My take on developers arguing for self-documenting code is that they are undisciplined or do not use their tools well. The arguments against copious inline comments are "but people don't update them" and "I can see less of the code"."
> Respectfully, if someone wrote code like this, I wouldn't want to work with them. I mean next step is "I copy paste code [...]
This is a nonsensical slippery-slope fallacy. In no way does that behavior follow from placing many comments in code. It also says nothing about the clearly demonstrated value of redundancy.
> I have been navigating code for 20 years and in good codebases, comments are rare and describe something "surprising".
Your definition of good here is circular. No argument on why they are good codebases. Did you measure how easy they were to maintain? How easy it was to onboard new developers? How many bugs it contained? Note also that correlation != causation: it might very well be that the good codebases you encountered were solo-projects by highly capable motivated developers and the comment-rich ones were complicated multi-developer projects with lots of developer churn.
> My problem with "literate programming" [...] is that I find it hard to trust developers who genuinely cannot understand unsurprising code without comments.
This is gatekeeping code by making it less understandable and essentially an admission that code with comments is easier to understand. I see the logic of this, but it is solving a problem in the wrong place. Developer competence should not be ascertained by intentionally making the code worse.
You talk as if you had scientific proof that literate programming is objectively better, and I was the weirdo contradicting it without bringing any scientific proof.
Fact is, you don't have any proof at all, you just have your intuition and experience. And I have mine.
> It also says nothing about the clearly demonstrated value of redundancy.
Clearly demonstrated, as in your example of "Calculate the sum of one and one"? I wouldn't call that a clear demonstration.
> This is gatekeeping code by making it less understandable
I don't feel like I am making it less understandable. My opinion is that a professional worker should have the required level of competence (otherwise they are not a professional in that field). In software engineering, we feed code to a compiler, and we trust that the compiler makes sure that the machine executes the code we write. The role of the software engineer is to understand that code.
Literate programming essentially says "I am incapable of writing code that is understandable, ever, so I always need to explain it in a natural language". Or "I am incapable of reading code, so I need it explained in a natural language". My experience is that good code is readable by competent software engineers without explaining everything. But not only that: code is more readable when it is more concise and not littered with comments.
> and essentially an admission that code with comments is easier to understand.
I disagree again. Code with comments is easier to understand for the people who cannot understand it without the comments. Now the question is, again: are those people competent to handle code professionally? Because if they don't understand the code without comments, many times they will just have to trust the comments. If they used the comments to actually understand the code, pretty quickly they would be competent enough to not require the comments. Which means that at the point where they need them, they are not yet professionals, but rather apprentices.
def reallyDumbIdeaByManagerWorkaroundMethodToGetCoverageToNinetyPercent(self):
    """Don't worry, this is a clear description of the method."""
    return False
Exactly, that's why a good project will use comments sparingly and have them only where they matter to actually meaningfully augment the code. The rest is noise.
I'm very close to the idea that "LLMs are randomized compilers" and that the human prompts should be treated with 1000% more care. Don't (necessarily) git commit the whole megabytes of token-blathering from the LLM, but keep the human prompts:
"Hey, we're going to work on Feature X... now some test cases... I've done more testing and Z is not covered... ok, now we'll extend to cover Case Y..."
Let me hover over the 50-100 character commit message and then see the raw discussion (source) that led to the AI-generated (compiled) code. Allow AI.next to review the discussion/response/diff/tests and see if it can expose any flaws with the benefit of hindsight!
An important addendum: code can sometimes, with a bit of extra thinking on the part of the reader, answer the 'why' question. But it's even harder for code to answer the 'why not' question. I.e., what were the other approaches that we tried and that didn't work? Or what business requirements preclude those other approaches?
> But it's even harder for code to answer the 'why not' question.
Great point. Well-placed documentation as to why an approach was not taken can be quite valuable.
For example, documenting that domain events are persisted in the same DB transaction as changes to corresponding entities and then picked up by a different workflow instead of being sent immediately after a commit.
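A minimal sketch of what that can look like, with the "why not" right next to the code (the table and event names are invented here, and I'm using Python's sqlite3 just for illustration; the real thing would use whatever DB layer the project has):

import json
import sqlite3

def withdraw(conn: sqlite3.Connection, account_id: int, amount: int) -> None:
    # Why not publish the event to the broker right here? If the publish
    # succeeded but the commit below failed (or the other way around),
    # consumers would see events that don't match the database. Writing the
    # event to an "outbox" table in the SAME transaction keeps them
    # consistent; a separate relay reads the outbox after commit and does
    # the actual publishing.
    with conn:  # sqlite3 connection as context manager: both writes commit or roll back together
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("FundsWithdrawn", json.dumps({"account_id": account_id, "amount": amount})),
        )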
I don't think this is enough to completely obsolete comments, but a good chunk of that information can be encoded in a VCS. It encodes all past approaches and also contains the reasoning and why not in annotation. You can also query this per line of your project.
Git history is incredibly important, yes, but also limited.
Practically, it only encodes information that made it into `main`, not what an author just mulled over in their head or just had a brief prototype for, or ran an unrelated toy simulation over.
Yes, git ain't the only one, but apart from interface differences, they are pretty much compatible in what they allow you to record in the history, I think?
Part of the problem here is that we use git for two only weakly correlated purposes:
- A history of the code
- Make nice and reviewable proposals for code changes ('Pull Request')
For the former, you want to be honest. For the latter, you want to present a polished 'lie'.
Not really. Launchpad.net does not have any public branches I could share atm as an example, but Bazaar (now Breezy) allowed having a nested "merge commit": your trunk would have "flattened" merge commits ("Merge branch foo"), and under each one you could easily get to the individual commits by a developer ("Prototype", "Add test"...). It would really be shown as a tree, but the smartness was even richer than that.
This was made possible by using a DAG for commit storage and referencing, instead of relying on file contents and series of commits per reference. Merge behaviour was much smarter in cases of diverging tips or criss-cross merges. But this was ultimately harder and slower to implement, and developers did not value it enough; they accepted the Git trade-offs instead.
So you seamlessly did both with a different VCS without splitting those up: in a sense, computers and software worried about that for us.
You can select whether you want the diff to the first or the second parent, which is the difference between collapsing and expanding merges. You can also completely collapse merges by showing first-parent-history.
Or I do not understand what you mean with "the expected thing".
If you throw away commit messages, that is on you, it is not a limitation of Git. If I am cleaning up before merging, I'm maybe rephrasing things, but I am not throwing that information away. I regularly push branches under 'draft/...' or 'fail/...' to the central project repository.
The WIP commits I initially recorded also didn't necessarily exist as such in my file system and often don't really work completely, so I don't know why the commit after a rebase is any more of a lie than the commit before the rebase.
It's a 'lie' in the sense that you are optimising for telling a convenient and easy to understand story for the reviewer where each commit works atomically.
The "honest" historical record of when I decided to use "git commit" while working on something is 100% useless for anyone but me (for me it's 90% useless).
git tracks revisions, not history of file changes.
You put past failed implementations in comments? That sounds like a nightmare. I'd rather only include a short description in the comment that can then link to the older implementation if necessary.
But why would you ever put that into your VCS as opposed to code comments?
The VCS history has to be actively pulled up and reading through it is a slog, and history becomes exceptionally difficult to retrace in certain kinds of refactoring.
In contrast, code comments are exactly what you need and no more, you can't accidentally miss them, and you don't have to do extra work to find them.
I have never understood the idea of relying on code history instead of code comments. It seems like it's all downsides, zero upsides.
Because comments are a bad fit to encode the evolution of code. We implemented systems to do that for a reason.
> The VCS history has to be actively pulled up and reading through it is a slog
Yes, but it also allows you to query history, e.g. by function, which to me gets to an understanding much faster than wading through the current state and trying to piece information together from the status quo and comments.
> history becomes exceptionally difficult to retrace in certain kinds of refactoring.
True, but these refactorings also make it more difficult to understand other properties of code that still refers to the architecture pre-refactoring.
> I have never understood the idea of relying on code history instead of code comments. It seems like it's all downsides, zero upsides.
Comments are inherently linear to the code. That is sometimes what you need, but for complex behaviour you'd rather annotate things along another dimension, and that is what a VCS provides.
What I write is this:
/* This used to do X, but this causes Y and Z
and also conflicts with the FOO introduced
in 5d066d46a5541673d7059705ccaec8f086415102.
Therefore it now does BAR,
see c7124e6c1b247b5ec713c7fb8c53d1251f31a6af */
Both have their place. While I mostly agree with you, there's a clear example where git history is better: delete old or dead or unused code, rather than comment it out.
Agreed. Tests are documentation too. Tests are the "contract": "my code solves those issues. If you have to modify my tests, you have a different understanding than I had and should make sure it is what you want".
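For illustration, a tiny sketch of that contract idea (all names invented, pytest-style):

def apply_discount(total: float, code: str) -> float:
    # 10% off for the (made-up) SAVE10 code, otherwise unchanged
    return total * 0.9 if code == "SAVE10" else total

def test_repricing_without_a_code_keeps_the_existing_discount():
    # The test name states the intent; if you have to change this test,
    # you have a different understanding than the original author had.
    total = apply_discount(100.0, "SAVE10")
    total = apply_discount(total, "")  # re-pricing later, no code given
    assert total == 90.0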
Having "grown up" on free software, I've always been quick to jump into code when documentation was dubious or lacking: there is only one canonical source of truth, and you need to be good at reading it.
Though I'd note two kinds of documentation: docs how software is built (seldom needed if you have good source code), and how it is operated. When it comes to the former, I jump into code even sooner as documentation rarely answers my questions.
Still, I do believe that literate programming is the best of both worlds, and I frequently lament the dead practice of doing "doctests" with Python (though I guess Jupyter notebooks are in a similar vein).
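For anyone who hasn't seen them, a minimal doctest sketch (hypothetical function; run it with `python -m doctest yourmodule.py -v`):

def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards.

    >>> is_palindrome("level")
    True
    >>> is_palindrome("hello")
    False
    """
    return text == text[::-1]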
Usually, the automated tests are the best documentation you can have!
You seem to misunderstand the purpose of documentation.
It's not to be more accurate than the code itself. That would be absurd, and is by definition impossible, of course.
It's to save you time and clarify why's. Hopefully, reading the documentation is about 100x faster than reading the code. And explains what things are for, as opposed to just what they are.
Number of times reading the source saved time and clarified why: many.
Number of times reading the documentation saved time and clarified why: never.
Perhaps I've just been unlucky?
EDIT:
The hilarious part to me is that everyone can talk past each other all day (reading the documentation) or we can show each other examples of good/bad documentation or good/bad code (reading the code) and understand immediately.
> Number of times reading the documentation saved time and clarified why: never.
OK, so let's use an example... if you need to e.g. make a quick plot with Matplotlib. You just... what? Block off a couple weeks and read the source code start to finish? Or maybe reduce it to just a couple days, if you're trying to locate and understand the code just for the one type of plot you're trying to create? And the several function calls you need to set it up and display it in the end?
Instead of looking at the docs and figuring out how to do it in 5 or 10 min?
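For what it's worth, the docs-first path really is only a handful of lines (a sketch in the style of the pyplot tutorial, not tied to any real dataset):

import matplotlib.pyplot as plt

xs = [0, 1, 2, 3, 4]
ys = [x * x for x in xs]
plt.plot(xs, ys, marker="o")  # quick line plot with point markers
plt.xlabel("x")
plt.ylabel("x squared")
plt.show()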
Literate programming is not about documenting the public API, it's about documenting the implementation details, right? Otherwise no need for a new name, it's just "API documentation".
> if you need to e.g. make a quick plot with Matplotlib. You just... what?
Read the API documentation.
Now if you need to fix a bug in Matplotlib, or contribute a feature to it, then you read the code.
> Lots of comments in code is a code smell. Yes, really.
No, not really. It's actually a sign of devs who are helping future devs who will maintain and extend the code, so they can understand it faster. It's professionalism and respect.
> If I see lots of comments in code, I'm gonna go looking for the intern who just put up their first PR.
And I'm going to find them to say good job, keep it up! You're saving us time and money in the future.
If someone gives me code full of superfluous comments, I don't consider it professional. Sounds like an intern who felt the need to comment everything because every single line seemed very complex to them.
> I'm assuming "lots of comments" means lots of meaningful comments.
That's not what literate programming is. Literate programming says that you explain everything in a natural language.
IMO, good code is largely unsurprising. I don't need comments for unsurprising code. I need comments for surprising code, but that is the exception, not the rule. Literate programming says that it is the rule, and I disagree.
> Literate programming says that you explain everything in a natural language.
At a high level. Not line-by-line comments.
> IMO, good code is largely unsurprising. I don't need comments for unsurprising code.
I've never heard anything like that, and could not disagree more. Twenty different considerations might go into a single line of code. Often, one of them is something non-obvious. So you comment that thing. The idea that "good" code avoids anything non-obvious, that those are "exceptions", is frankly bizarre to me. Unless the code you write is 99% boilerplate or something.
> So you comment that thing. The idea that "good" code avoids anything non-obvious, that those are "exceptions", is frankly bizarre to me.
What I find interesting from the comments here is that there are obviously different perspectives on that. Granted, I cannot say that my way is better. Just as you cannot say that your way is better.
But I am annoyed when I have to deal with code following your standards, and I assume you are annoyed when you have to deal with code following mine :-).
Or maybe, I imagine that people who defend literate programming mean more comments than I think is reasonable, and people who disagree with me (like you) imagine that I mean fewer comments than you think is reasonable. And maybe in reality, given actual code samples, we would totally agree :-).
Do you have an example of such knowledge that you need to get from the comments? I have been programming for 20 years, and I genuinely don't see that much code that is so complex that it needs comments.
Not that it doesn't exist; sometimes it's needed. But so rarely that I call it "comments", and not a whole discipline in itself apparently called "literate programming". Literate programming sounds like "you need to comment pretty much everything because code is generally hard to understand". I disagree with that. Most code is trivial, though you may need to learn about the domain.
I've never properly tried literate programming, overkill for hobby projects and not practical for a team unless everyone agrees.
Examples of code that needs comments in my career tend to come from projects that model the behaviour of electrical machines. The longest running such project was a large object oriented model (one of the few places where OOP really makes sense). The calculations were extremely time consuming and there were places where we were operating with small differences between large numbers.
As team members came and went and as the project matured the team changed from one composed of electrical engineers, physicists, and mathematicians who knew the domain inside out to one where the bulk of the programmers were young computer science graduates who generally had no physical science background at all.
This meant that they often had no idea what the various parts of the program were doing and had no intuition that would make them stop and think or ask a question before fixing a bug in what seemed the most efficient way.
The problem in this case is that sometimes you have to sacrifice runtime speed for correctness and numerical stability. You can't always re-order operations to reduce the number of assignments say and expect to get the same answers.
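A tiny illustration of that last point (not from the project, just the general floating-point effect of re-ordering operations):

a = 1e16
print((a + 1.0) - a)  # 0.0: the 1.0 is swallowed when added to the huge number first
print(1.0 + (a - a))  # 1.0: algebraically the same expression, different answer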
Of course you can write unit and functional tests to catch some such errors but my experience says that tests need even better comments than the code that is being tested.
Because the why can be completely unrelated to the code (odd business requirements etc). The code can be known to be non-optimal but it is still the correct way because the embedded system used in product XYZ has some dumb chip in it that needs it this weird way etc. Or the CEO loves this way of doing things and fires everyone who touches it. So many possibilities, most technical projects have a huge amount of politics and weird legacy behavior that someone depends on (including on internal stuff, private methods are not guaranteed to not be used by a client for example). And comments can guard against it, both for the dev and the reviewer. Hell we currently have clients depend on the exact internal layout of some PDF reports, and not even the rendered layout but that actual definitions.
Again, if it's a comment saying "we need this hack because the hardware doesn't support anything", I don't call it "literate programming".
Literate programming seems to be the idea that you should write prose next to the code, because code "is difficult to understand". I disagree with that. Most good code is simple to understand (doesn't mean it's easy to write good code).
And the comments here prove my point, I believe: whenever I ask for examples where a comment is needed, the answer is something very rare and specific (e.g. a hardware limitation). The answer to that is comments where those rare and specific situations arise. Not a whole concept of "literate programming".
> Literate programming sounds like "you need to comment pretty much everything because code is generally hard to understand".
You and I read code. Came so naturally for me that I didn't realize others don't. But over the years and with some weird chats I've realized that for a lot of developers it's more like "deciphering code", like they're slowly translating a human language they only vaguely know - and it never even crossed their mind that it was possible to learn a programming language to the point you could just read it.
Not for everything. For code you own, yes, this is often the case. For the majority of the layers you still rely on documentation. Take the project you mention going straight to the source for: did you follow this thread all the way down through each compiler involved in building the project? Of course not.
My understanding is that "literate programming" doesn't say "you should document the public API". It says "you should document the implementation details, because code is hard to understand".
My opinion is that if whoever is interested in reading the implementation details cannot understand it, either the code is bad or they need to improve themselves. Most of the time at least. But I hear a lot of "I am very smart, so if I don't understand it without any effort, it means it's too complicated".
> because my prompts are in natural languages, and hence ambiguous.
Legalese developed specifically because natural language was too ambiguous. A similar level of specificity for prompting works wonders.
One of the issues with specifying directions to the computer with code is that you are very narrowly describing how something should be done. But I don't always know the best 'how'; I just know what I know. With natural language prompting, the AI can tap into its training knowledge and come up with better ways of doing things. It still needs lots of steering (usually), but a lot of the time you can end up with a superior result.
Yes. LLMs are search engines into the (latent) space of source code. Stuff you put into the context window is the "query". I've had some good results by minimizing the conversational aspect and thinking in terms of shaping the context: asking the LLM to analyze relevant files, not because I want the analysis, but because I want a good reading in the context. LLMs will work hard to stay in that "landscape", even with vague prompts. Often better than with weirdly specific or conflicting instructions.
But search engines are not a good interface when you already know what you want and need to specify it exactly.
See for example the new Windows start menu compared to the old-school run dialog – if I directly run "notepad", then I get always Notepad; but if I search for "notepad" then, after quite a bit of chugging and loading and layout shifting, I might get Notepad or I might get something from Bing or something entirely different at different times.
> Natural languages are ambiguous. That's the reason why we created programming languages. So the documentation around the code is generally ambiguous as well. Worse: it's not being executed, so it can get out of date (sometimes in subtle ways).
I loathe this take.
I have rocked up to codebases where there were specific rules banning comments because of this attitude.
Yes, comments can lie, yes, there are no guards ensuring they stay in lock step with the code they document, but not having them is a thousand times worse - I can always see WHAT code is doing, that's never the problem; the problem is WHY it was done in this manner.
I put comments like "This code runs in O(n) because there are only a handful of items ever going to be searched - update it when there are enough items to justify an O(log2 n) search"
That tells future developers that the author (me) KNOWS it's not the most efficient code possible, but it IS when you take into account things unknown by the person reading it
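In context, that kind of note might look something like this (a hypothetical sketch, not the actual code being described):

def find_user(users, wanted_id):
    # O(n) scan on purpose: "users" holds at most a few dozen entries here,
    # so sorting + binary search would cost more than it saves.
    # Revisit if this ever has to handle thousands of entries.
    for user in users:
        if user["id"] == wanted_id:
            return user
    return None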
Edit: Tribal knowledge is the worst type of knowledge: it's assumed that everyone knows it and passes it along when new people onboard, but the reality (for me) has always been that the people doing the onboarding have had fragments, or incorrect assumptions about what was being conveyed to them, and just like the children's game of "telephone", the passing of the knowledge always ends in disaster.
The compiler ensures that the code is valid, and what ensures that ‘// used a suboptimal sort because reasons’ is updated during a global refactor that changes the method? … some dude living in that module all day every day exercising monk-like discipline? That is unwanted for a few reasons, notably the routine failures of such efforts over time.
Module names and namespaces and function names can lie. But they are also corrected wholesale and en-masse when first fixed, those lies are made apparent when using them. If right_pad() is updated so it’s actually left_pad() it gets caught as an error source during implementation or as an independent naming issue in working code. If that misrepresentation is the source of an emergent error it will be visible and unavoidable in debugging if it’s in code, and the subsequent correction will be validated by the compiler (and therefore amenable to automated testing).
Lies in comments don’t reduce the potential for lies in code, but keeping inline comments minimal and focused on exceptional circumstances can meaningfully reduce the number of aggregate lies in a codebase.
> what ensures that ‘// used a suboptimal sort because reasons’ is updated during a global refactor that changes the method?
And for that matter, what ensures it is even correct the first time it is written?
(I think this is probably the far more common problem when I'm looking at a bug, newly discovered: the logic was broken on day 1, hasn't changed since; the comment, when there is one, is as wrong as the day it was written.)
I don't disagree here. I personally like to put the why into commit messages, though. It's my longtime fight to make people write better commit messages. Most devs I see describe what they did, and in most cases that is visible from the change-set. One has to be careful here: as with line documentation, everything changes with size. But I prefer the why not to be sprinkled throughout the source. I'm not dogmatic about it, though; it really depends.
I <3 great commit comments, but I am leaning more heavily towards good comments at the same level as the dev is reading - right there in the code - rather than telling them to look at git blame and find the appropriate commit message (keeping in mind that there might have been changes to the line(s) of code and commits might intertwine, making it a mission to find the commit holding the right message(s)).
edit: I forgot to add - commit messages are great, assuming the people merging the PR into main aren't squashing the commits (a lot of people do this because of a lack of understanding of our friend rebase)
IMHO, you shouldn't have to justify yourself ("yeah yeah, this is not optimal, I know it because I am not an idiot"). Just write your code in O(n) if that's good enough now. Later, a developer may see that it needs to be optimised, and they should assume that the previous developer was not an idiot and that it was fine with O(n), but now it's not anymore.
Or do you think that your example comment brings knowledge other than "I want you to know that I know that it is not optimal, but it is fine, so don't judge me"?
A little bit of "Don't judge me" and a little bit of "I nearly fell into a trap here, and started writing O(log n) search, but realised that it was a waste of time and effort (and would actually slow things down) - so to save you from that trap here's a note"
The risk with that is that because it was not obvious to you does not necessarily mean it's not obvious to others.
Over the years, I have seen many, many juniors wrapping simple CLI invocations in a script because they just learned about them and thought they weren't obvious.
- clone_git_repo.sh
- run_docker_container.sh
I do agree that something actually tricky should be commented. But that's exceedingly rare.
I mean, the whole point of explicit being superior to implicit is because what's obvious to some isn't necessarily obvious to everyone.
Someone following me could look at it and go "well, duh", and that's not going to hurt anyone; but if I didn't put that comment and someone refactored it, then we'd have someone redoing and then undoing work, for no good reason.
There's that meme where people are told to update the number of hours wasted, because people keep trying to refactor some code and have to undo it because it doesn't work.
Do you write a comment before every for loop to explain how a for loop works? Do you write a comment above that to remind the reader that the next few lines are written in, say, Go, just like in the rest of the file? Do you write a comment explaining that the text appearing on the screen is actually digital and will disappear when you turn off the computer?
Obviously you don't, because you assume that the person reading that code has some level of knowledge. You don't say "well, it may not be obvious to everybody, so I need to explain everything".
I guess where we differ is that to me, a professional software developer should be able to understand good code. If they aren't, they are a junior who needs practice. But I am for designing tools for the professionals, not for the apprentices. The goal of an apprentice is to become a professional, not to remain an apprentice forever.
> Do you write a comment before every for loop to explain how a for loop works?
Thank you for missing the point.
It's not about the WHAT, it's about the WHY.
For loops are obvious. O(n) being intentional instead of 'lazy' isn't obvious without context. That's what comments preserve - the decision rationale, not the syntax explanation.
A professional developer can read code. But they can't read the mind of the author who made a non obvious tradeoff. That's what comments preserve.
> I guess where we differ is that to me, a professional software developer should be able to understand good code. If they aren't, they are a junior who needs practice. But I am for designing tools for the professionals, not for the apprentices. The goal of an apprentice is to become a professional, not to remain an apprentice forever.
If you are going to make personal attacks, you should know that I work with actual professionals, and they understand that future maintainers, myself included, cannot read their mind on why they chose the path they did.
And my point is that I don't care what it is about, I care about whether or not it is useful. I disagree with the literate programming idea that it's always useful to explain why you wrote the code the way you did, and your one example (justifying the O(n)) actually proves to me that I really don't care about your explanation in this particular case. So obviously your one example that I don't find useful won't convince me that all WHY comments are useful.
> O(n) being intentional instead of 'lazy' isn't obvious without context.
What does such a comment tell me?
- That you chose the O(n): it's the "please don't judge me, I know what I am doing" part. It's superfluous, because by default I assume that you know what you are doing.
- That you tried to do better and failed. If I believe that we don't need better than O(n), I don't care. If I believe that we need better than O(n), I will reason about doing it myself (no matter what you wrote).
- ... I can't see anything else.
Now sometimes, of course, there is real knowledge that needs to go into a comment. Like "This is a workaround due to a bug in version 1.4.2 of this proprietary dependency". But that's an exception. I can also totally imagine that some files implement something really tricky and deserve a lot of comments. But in my experience reading and contributing to a lot of open source code from many different projects, most code is not like that. The concept of "literate programming" doesn't say "be pragmatic about comments, use them when it matters", it says "comment the code because it always helps".
> If you are going to make personal attacks
I am not making personal attacks, I genuinely believe that you are perfectly able to read and understand code that does not follow the "literate programming" paradigm. And if you are not, I still don't see that as a personal attack: with experience you will definitely get there.
> cannot read their mind on why they chose the path they did.
I just want to repeat it here: it does not matter at the implementation detail level. You may want to document the architecture (including technology choices) of course, but that's not what literate programming is about. You probably want to document the public API (because using an API generally does not require reading the code, and the implementation may be proprietary), but again that's not what literate programming is about. But the implementation details? Unless it's surprising (e.g. a necessary workaround), I don't care about why it was written the way it was, I just care about understanding what it does such that I can reason about it.
Again you prove my point: natural languages are ambiguous and communication is hard.
And maybe also that you don't seem to make the difference between natural languages and programming languages: I have not been commenting code here. If you can't make the difference, maybe it explains why you want to mix them.
Docs and code work together as mutually error correcting codes. You can’t have the benefits of error detection and correction without redundant information.
> With agents, does it become practical to have large codebases that can be read like a narrative, whose prose is kept in sync with changes to the code by tireless machines?
I think this is true. Your point supports it. If either the explanation/intention or the code changes, the other can be brought into sync. Beautiful post. I always hated the fact that research papers don't read like novels, e.g. "OK, we tried this, which was unsuccessful, but then we found another adjacent approach and it helped."
Computer Scientist Explains One Concept in 5 Levels of Difficulty | WIRED
Computer scientist Amit Sahai, PhD, is asked to explain the concept of zero-knowledge proofs to 5 different people; a child, a teen, a college student, a grad student, and an expert. Using a variety of techniques, Amit breaks down what zero-knowledge proofs are and why it's so exciting in the world of cryptography.
Programming languages are natural and ambiguous too: what does READ mean? You have to look it up to see the types. The power comes from the fact that it's auditable, but you don't need to audit it every time you want to write some code. You think you write good code? Try to prove it after the compiler gets through with it.
Natural languages are richer in ideas. It may be harder to get working code going from a purely natural description than going from code to code, but you don't gain much from just translating code: one is only limited by your imagination, the other already exists and you could just call it as a routine.
You only have a SENSE for good code because it's a natural language with conventions and shared meaning. If the goal of programming is to learn to communicate better as humans then we should be fighting ambiguity not running from it. 100 years from now nobody is going to understand that your conventions were actually "good code".
> Programming languages are natural and ambiguous too
Programming languages work because they are artificial (small, constrained, often based on algebraic and arithmetic expressions, boolean logic, etc.) and have generally well-defined semantics. This is what enables reliable compilers and interpreters to be constructed.
Exactly. Programming is the art of removing ambiguity and making it formal. And it's why the timelines between getting an EXACT plan of what I need to implement vs hazy requirements are so out of whack.
> Programming languages are natural and ambiguous too, what does READ mean?
"READ" is part of the "documentation in natural language". The compiler ignores it entirely, it's not part of the programming language per se. It is pure documentation for the developers, and it is ambiguous.
But the part that the compiler actually reads is non-ambiguous. It cannot deal with ambiguity, fundamentally. It cannot infer from the context that you wrote a line of code that is actually ironic, and it should therefore execute the opposite.
> Programming languages are natural and ambiguous too, what does READ mean?
Not nearly in the same sense actual language is ambiguous.
And ambiguity in programming is usually a bad thing, whereas in language it can usually be intended.
Good code, whatever that means, can read like a book. Event-driven architectures are a good example, because the context of how something came to be is right in the event name itself.
What is good code now is only good code because of the bad programming languages we’ve had to accept for the last hundred years because we’re tied to incremental improvements. We’re tied to static brittle types. But look at natural systems - they all use dynamic “languages.” When you get a cut, your flesh doesn’t throw an exception because it’s connected to the wrong “thing.”
Maybe AI will redefine what good code means, because it’s better able to handle ambiguity.
>Natural languages are ambiguous. That's the reason why we created programming languages.
Programming languages can be ambiguous too. The thing with formal languages is more that they impose a stricter and narrower freedom of interpretation, as a convention, where they are used. If anything, they are a subset of the human expression space. Sometimes they are the best tool for the job. Sometimes a metaphor is more apt. Sometimes you need some humour. Sometimes you'd better stay in ambiguity to play the game at its finest.
Programming languages are non-ambiguous, in the sense that there is no doubt what will be executed. It's deterministic. If the program crashes, you can't say "no but this line was a joke, you should have ignored it". Your code was wrong, period.
I don’t have my LLMs generate literate programming. I do ask it to talk about tradeoffs.
I have full examples of something that is heavily commented and explained, including links to any schemas or docs. I have gotten good results when I ask an LLM to use that as a template, that not everything in there needs to be used, and it cuts down on hallucinations by quite a bit.
"But translating my prompts to code is not working as well, because my prompts are in natural languages, and hence ambiguous."
Not only that, but there's something very annoying and deeply dissatisfying about typing a bunch of text into a thing for which you have no control over how its producing an output, nor can an output be reproduced even if the input is identical.
Agreed, natural language is very ambiguous and becoming more ambiguous by the day ("what exactly does 'vibe' mean?").
People spoke in a particular way, say 60 years ago, that left very little room for interpretation of what they meant. The same cannot be said today.
> People spoke in a particular way, say 60 years ago, that left very little room for interpretation of what they meant. The same cannot be said today.
Surely you don’t mean everyone in the 1960s spoke directly, free of metaphor or euphemism or nuance or doublespeak or dog whistle or any other kind or ambiguity? Then why are there people who dedicate their entire life to interpreting religious texts and the Constitution?
Maybe if we had a really terse and unambiguous form of English? Whenever there is ambiguity, we insert parentheses and operators to really make it clear what we mean. We can enclose different sentences in brackets to make clear the scope of a logical condition, and so on. Oh wait
I would say this expresses the intent, no need for a comment saying "check if the number is even".
Most of the code I read (at work) is not documented, still I understand the intent. In open source projects, I used to go read the source code because the documentation is inexistent or out-of-date. To the point where now I actually go directly to the source code, because if the code is well written, I can actually understand it.
With this small change, all we have are questions:
Is the name wrong, or the behavior? Is this a copy / paste error? Where is the specification that tells me which is right, the name or the body? Where are the tests located that should verify the expected behavior?
Did the implementation initially match the intent, but some business rule changed that necessitated a change to the implementation, and the maintainer didn't bother to update the name?
Both of our examples are rather trite- I agree that I wouldn't bother documenting the local behavior of an "isEven" function. I probably would want a bit of documentation at the callsite stating why the evenness of a given number is useful to know. Generally speaking, this is why I tend to dislike docblock style comments and prefer bigger picture documentation instead- because it better captures intent.
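Spelled out, the example is something like this (illustrative - the isEven name and the flipped check are just stand-ins):

def isEven(number):
    return number % 2 == 0  # name and body agree: the intent is readable

# ...and after the "small change":
def isEven(number):
    return number % 2 == 1  # is the name wrong, or the behavior?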
Not at all. I'm just pointing out that code does not intrinsically convey intent, only implementation.
To use a less trite example, I'd probably find some case where a word or name can have different meanings in different contexts, and how that can be confusing rather than clarifying without further documentation or knowledge of the problem space.
Really though, any bug in the code you write is a deviation between intent and implementation. That's why documentation can be a useful supplement to code. If you haven't, take a look at the underhanded C contests- there's some fantastically good old gems in there that demonstrate how a plain reading of the code may not convey intent correctly.
I feel like we're going from "literate programming" to "sometimes it makes sense to add comments". I agree with the latter. Good code is mostly unsurprising, and when it is surprising it deserves a comment. But that is more the exception than the rule.
> I had some Scala 3 feelings when reading the vision, I hope Rust doesn't gets too pushy with type systems ideas.
I don't know if it is true or not, but my feeling is that Scala brought a lot of new ideas. But as I read somewhere, "Scala was written by compiler people, to write compilers", and I can understand that feeling.
Kotlin came after Scala (I think?) and seems to have gotten a lot of inspiration from Scala. But somehow Kotlin managed to stay "not too complex", unlike Scala.
All that to say, Rust has been innovating in the zero-cost abstraction memory safe field. If it went the way of Scala, I wonder if another language could be "the Kotlin of Rust"? Or is that Zig already? (I have no idea about Zig)
> But somehow Kotlin managed to stay "not too complex", unlike Scala.
It's not really true anymore, Kotlin has slowly absorbed most of the same features and ideas even though they're sometimes pretty half-baked, and it's even less principled than the current Scala ecosystem. JetBrains also wants to make Kotlin target every platform under the sun.
At this point, the only notable difference are HKTs and Scala's metaprogramming abilities. Kotlin stuck to a compiler plugin exposing a standard interface (kotlinx.serialization) for compile-time codegen. Scala can do things like deriving an HTTP client from an OpenAPI specification on the fly, by the LSP backend.
Not to the same extent. Scala.JS and Kotlin.JS are somewhat comparable, other targets not so much. There was no serious attempt at making Scala target mobile devices, even during the window of opportunity with Scala on Android.
> even during the window of opportunity with Scala on Android.
I don't understand this. You can run any pure Java jar on Android, pretty sure you can do that with Scala too? It's not exactly a "different platform" in terms of programming language. Sure it needs tooling and specific libraries, but that's higher level than the programming language.
Jetbrains is doing interop with Swift (Kotlin -> ObjC -> Swift and more recently Kotlin -> C -> Swift), which Scala never did. But I don't really see how this is relevant in this conversation.
You can run Scala on Android and it's been done but it never worked well nor was given priority. Which is understandable as the commercial entities behind Scala already struggle to build the ecosystem and tooling in spaces where the language shines.
For instance the Android runtime has chronically lagged behind mainline JVM bytecode versions, iirc once Scala started to emit Java 8 bytecode, Android was stuck on Java 6.
Kotlin had other obvious advantages on Android like its thin standard library or the inlining of higher-order functions.
> I wonder if another language could be "the Kotlin of Rust"?
Some people would say that Swift is that language since it's potentially memory safe like Rust and is described as friendlier to novices. There's some room for disagreement wrt. the latter point.
I have been reading HN for a few years, and my feeling is that I find fewer and fewer interesting articles. Maybe it's just me and the average article is of the same quality as before.
Now I tend to skim through it to see if a title looks like it may bring interesting discussions, and then I skim through the discussions. Because there are very knowledgeable people who sometimes share valuable insights.
Interestingly, last time I asked a question, hoping to get interesting people to share insights, I was answered that I "should learn how to use an LLM instead of asking questions" :-).
Same here. Unless a site is known for good quality, I only open articles after checking top comments. Most days I just read the daily digest of discussions - which sometimes piques my curiosity enough to check the original links.