This comment is just bad and misinformed all over. (1) *Automatic Reference Coun...

dang · on May 12, 2019

Please don't call names in arguments here. Your first sentence is clearly against the site guidelines, and was quite unnecessary.

https://news.ycombinator.com/newsguidelines.html

tomp · on May 12, 2019

Sorry. I get emotional about programming languages...

dang · on May 12, 2019

Me too, as do many others.

cyphar · on May 9, 2019

> (1) Automatic Reference Counting doesn't work; its equivalent in interpreted languages is, well, reference counting, which can be optimized quite a lot (though has some issues with multithreading), but cannot collect cycles.

This is what weakrefs (or better data structures) are for. The Linux kernel uses reference counting incredibly effectively for almost every structure. I think that pretty much discounts any argument that reference counting cannot work.

> (5) Rust features a lot of "cheating" - pointers that allow mutation from multiple threads, reference-counted pointers, etc. - so obviously just ownership&borrowing isn't good enough

"unsafe" isn't cheating and isn't even related to whether garbage collection is a good idea or not. Yes, Rust's ownership and borrowing model doesn't cover all programs and hence "unsafe" is required -- but "unsafe" doesn't use a garbage collector so the existence of "unsafe" doesn't change that Rust works without one.

tomp · on May 9, 2019

Weakrefs require “careful programming” i.e. knowing which ref must be a weakref.

By “cheating” I don’t mean “unsafe” but all kinds of pointers that aren’t just unique or borrowed such as different reference-counted pointers, the existence and use of which demonstrates that ownership+borrowing isn’t a complete memory-management technique. You need reference counting, therefore you need GC or careful programming.

Hopefully we discover some technology that makes these needs go away, but not so far AFAIK.

m0th87 · on May 9, 2019

The point with weakrefs, though, is that they still require developer intervention.

Also, you can use most memory management facilities in Rust (including reference counting) without `unsafe`.

arcticbull · on May 9, 2019

Correct, leaks are considered "safe" -- it's premature deallocations that are considered unsafe.

apta · on May 9, 2019

I think the poster to whom you're responding means things like Cell and/or RefCell, which allow mutation behind the scenes while still being safe.

cyphar · on May 9, 2019

Right, but I think their main complaint is that RefCell is implemented using "unsafe" -- hence my point about "unsafe".

apta · on May 9, 2019

It's API is not `unsafe` though. Otherwise, every safe API depends on some `unsafe` call down the call graph.

cyphar · on May 9, 2019

I think we're both in agreement, I was phrasing things in a manner which I thought would better reference how GP thinks of unsafe. Obviously the API itself is safe, but the "cheating" GP was referring to was the fact that Rust requires "unsafe" under the hood in order to implement some safe operations (because the memory model doesn't cover all cases). You and I agree that this isn't a problem.

wbl · on May 9, 2019

Reference counting is a garbage collection algorithm. It just isn't a very good one compared to semi space because its runtime is the number of dead objects not live objects.

cyphar · on May 10, 2019

Refefence counting is barely classifiable as garbage collection. The runtime is O(1) because you free the object immediately when it dies, there's no stage where you start looping over dead objects.

pedrocr · on May 9, 2019

>Rust features a lot of "cheating" - pointers that allow mutation from multiple threads

You're confounding GC and synchronization of multi-threaded access here. What rust does with things like Arc other languages will need some other synchronization primitive to enable. Their GC or manual memory management does not guarantee the multi-threaded semantics. That's one of the advantages of the rust memory model, some patterns become multi-threaded without extra primitives (read-only borrows) and all of the primitives are compile-time checked for correctness.

>reference-counted pointers, etc. - so obviously just ownership&borrowing isn't good enough

In practice ownership and borrowing gets you a long way. I've only had to use a reference counted primitive (Arc) when that was the exactly correct semantics. I had a cache and both the caller and the original owner could drop the value first based on external input. No other static decision can be made and the ergonomics of writing that are very good. Rust seems already quite a bit better than having to be "very careful/restricted when writing code".

tomp · on May 9, 2019

No, Rust is conflating memory safery and multithreading safety.

JVM state-of-the-art GC guarantees multithreaded memory safety.

AFAIK many wait-free data structures are pretty much impossible without a GC.

pedrocr · on May 9, 2019

You start your response with "No" but don't actually argue against anything I've said.

> Rust is conflating memory safery and multithreading safety.

Rust doesn't conflate anything. It just gets some multi-threading safety for free as a result of its memory model. And then gets quite a few other multi-threading safety characteristics that no other mainstream language has from the type system. That several features in the language work together well is just a sign of good design.

> JVM state-of-the-art GC guarantees multithreaded memory safety.

This is where you're mixing the two unrelated concepts. Garbage collection does nothing for multi-threading safety. While the object is live and two threads hold a reference to it you need to synchronize access somehow. GC can't do anything about that.

> AFAIK many wait-free data structures are pretty much impossible without a GC.

This is unrelated to the previous points I made but isn't true either:

https://aturon.github.io/blog/2015/08/27/epoch/

tomp · on May 9, 2019

By "conflating" I mean that Rust's memory management technology provides just one kind of multi-threading safety (i.e. "multiple readers/single-writer" model), but to implement other kinds of multi-threading-safe algorithms/data structures, you need to transcend beyond ownership+borrowing.

I think "epoch-based memory reclamation" (which Linux kernel also uses IIRC) is still a (primitive / limited / contained) form of GC. For those that aren't familiar with it, a quote from the article @pedrocr posted:

> The basic idea is to stash away nodes that have been unlinked from the data structure (the first source of reachability) until they can be safely deleted. Before we can delete a stashed node, we need to know that all threads that were accessing the data structure at the time have finished the operation they were performing.

So basically, the runtime needs to keep track of allocated objects until it's certain there are no more references to them... sounds like GC (the difference between this and a real GC is that (1) you know there's no cross-references between the nodes, so you don't need to mark&sweep but just need to keep track of the epoch, and (2) the nature of lock-free data structures is such that threads will move to the next epoch fairly soon (unless they infinite-loop)).

pedrocr · on May 9, 2019

> By "conflating" I mean that Rust's memory management technology provides just one kind of multi-threading safety (i.e. "multiple readers/single-writer" model), but to implement other kinds of multi-threading-safe algorithms/data structures, you need to transcend beyond ownership+borrowing.

That's not "conflating" anything. You're just saying that "to do other things you need other stuff". Sure, to implement multi-threading you also need locks and semaphores and atomic variables sometimes. Rust provides all of those. There's nothing specific to GC vs non-GC there.

> So basically, the runtime needs to keep track of allocated objects until it's certain there are no more references to them... sounds like GC

Reference counting is not GC by most definitions. And if you want to define it like that Rust already has it.

> (the difference between this and a real GC is that (1) you know there's no cross-references between the nodes, so you don't need to mark&sweep but just need to keep track of the epoch, and (2) the nature of lock-free data structures is such that threads will move to the next epoch fairly soon (unless they infinite-loop)).

So the difference between this and GC is that it's nothing like GC... If reference counting, RCU, etc are all GC then there's nothing to discuss as it turns out languages like Rust have all the GC you need.

tomp · on May 9, 2019

I agree, the model of lock-free data structures is pretty much perfect for reference counting (no cycles, shallow reference tree... just needs to be thread-safe). I wonder then why epoch-based approaches are used, which (sound like they) are more difficult / tricky to implement... I honestly have no idea. The only thing that comes to mind is that they seem to require another atomic primitive (atomic increment, otherwise lock-free data structures only require compare-and-swap), which might not be available on all platforms...

kbenson · on May 9, 2019

> AFAIK many wait-free data structures are pretty much impossible without a GC.

That makes no sense. You can implement a garbage collector in a manually managed program (or even just the specific parts that gain you the attributes required to accomplish something), so even if a GC made certain data structures easier, there's nothing to prevent their use in a manually managed language. The argument you're making is sort of like saying that certain things can only be accomplished in ain C or Java, and not in assembly.

More specifically to your actual assertion, anything a GC does for lock free data structures could be handled by a special routine or manually in a language where you handle memory manually.

geocar · on May 12, 2019

(1/2): It most certainly does work: q/kdb is an interpreted language and doesn't have cycles. Being an array language, nesting is uncommon, so chasing pointers isn't an issue either.

(3): I didn't say otherwise Python doesn't also do reference counting. Python however does have a bytecode and a GC, and perl5 does not and executes the AST directly. And yes perl5 is faster for many problems[†]. I think it's quite common that people believe that GC and bytecode are a win, and this is a simple and accessible example where it is not true.

(4): More significant than what? q/kdb is implemented in this style, and it's a huge advantage (reducing bugs) having your library handle allocation/deallocation over having the caller do these things so I tend to do it as well. What exactly are you comparing it to? The mental load of figuring out what's causing pauses is quite high since it requires considering the entire program, but linear programming is very easy and it obviates a lot of need for other kinds of memory management techniques.

(5): Good enough for what? Why is mixing static analysis with runtime analysis "cheating"? What would be "fair" in your mind?

[†]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

StreamBright · on May 9, 2019

It is weird, I do not feel restricted when I write non-GC code. Btw. you still need to pay attention with GCd languages no to leak references that cannot be GCd. Java has this sort of problems. Rust made it easy and simple to deal with memory in a non-malloc level while not having GC. Erlang on the other hand has a VM that makes GC much faster than in other languages. I can live with that because if you do it right it does not impact end user latency. For me both Rust and Erlang gives the cake + eat it feeling.

jerf · on May 9, 2019

"I can live with that because if you do it right it does not impact end user latency."

The vast majority of GC use does not impact end user latency. Thinking otherwise is either the availability heuristic at work, or you work in AAA videogames or another ultra-high-real-time-performance industry and don't realize you're not in the center of programming work, you're on an extreme. (A perfectly valid and sizeable extreme.)

This may also be why you don't feel restricted; if you happen to work in any of the many places where a tree of references is adequate, you'll find Rust-style management to be a treat. If you work pervasively with graph structures in the core of your program, though, the experience can change. Pervasively-graph-type-structures are generally just a question of how you want your automated garbage collection to work, because I'm pretty sure getting the manual management correct otherwise is effectively impossible without an algorithm that's going to look a lot like a GC.

That said, despite my disagreement, I also disagree with the people who, as of this writing, have downvoted your comment into grey. I feel like I've been seeing a burst of "downvote as disagreement" in the past couple of weeks.

MrEldritch · on May 9, 2019

"Downvote as disagreement" is such a pervasive pattern that I Find myself doing it automatically even when I know that's not what downvote is "supposed" to be for. I suspect it's because I'm more likely to view posts that I disagree with as being poorly thought-out.

This even extends to platforms where downvoting is not a thing - witness 4chan's struggles with "sage is not a downvote" ("sage" being a function which allowed you to reply to a thread without bumping it to the top and extending its lifespan - again, intended to mark a thread as low qualit) until they finally gave up and made saging invisible.

Given how thoroughly inescapable downvoting-to-disagree seems, I wonder if it might not be better to just declare defeat, allow downvotes and upvotes to mean nothing beyond a signal of protest or agreement, and have a separate "this comment is low quality / this comment is high quality" buttons.

wglb · on May 9, 2019

PG long ago said that it is ok to use downvotes (in part) to signal disagreement.

apta · on May 9, 2019

> still need to pay attention with GCd languages no to leak references that cannot be GCd. Java has this sort of problems.

Unless you're doing weird things with class loaders, this is a non-issue.

> Erlang on the other hand has a VM that makes GC much faster than in other languages

What makes Erlang's VM make GC faster than other languages?

geocar · on May 10, 2019

That's a very loaded question.

I don't think it's easy to point to one thing Erlang is doing and say that is why it's faster, partly because I think the way Erlang programmers are encouraged to write their programs also has an effect; The language itself may have a greater effect on its GC performance than the GC implementation itself.

As an example: In Erlang, we have function calls like F(X) but we also have message sends like F!X. Message sends get one type of memory treatment, while function calls get another one. The lifetime of objects you send via messaging is going to be different, and yet Java only has function(method) calls, so there's no way to hint this to Java. Maybe this information can be recovered with static/dynamic analysis, but in Erlang we have syntax to tell the VM what's going to happen.

If you want to learn more about Erlang's internals, I can recommend Joe's original paper on the subject:

http://citeseerx.ist.psu.edu/viewdoc/summary;jsessionid=0397...

since it describes a lot of the design that went into Erlang early, and:

http://www.it.uu.se/research/publications/lic/2005-001/

which while long, quite in-depth describes many of the sub-optimisations that can be realised on top of it.

apta · on May 10, 2019

Thanks for the links. That being said, new developments in GCs on the JVM (like Shenandoah and ZGC) are yielding very impressive results. Things like < 1ms pause times for heaps in the TB range. No other free GC for any other language approaches this as far as I'm aware, including golang's which people claim is fast, but completely gets destroyed for even heaps in the GB range (I did some benchmarks).

coldtea · on May 9, 2019

>It is weird, I do not feel restricted when I write non-GC code.

Some people don't feel C has any safety issues either -- perhaps they think all others are idiots and their code is mistake proof because they do it right.

lmm · on May 9, 2019

> (4) linear / uniqueness types (not exactly the same, but both can be used to ensure safe memory management) impose significant mental overhead on programmers as they prohibit many common patterns

Such as?

I've found that linear style is generally just good coding style. When I get caught up on something I can't express linearly, it generally means I was trying to plough through with a dodgy approach; stopping and thinking through would have resulted in better code even in a GCed language.

blub · on May 9, 2019

You came in guns blazing and then covered yourself in embarrassment already in the first two points you made.

The thing you said doesn't work is in use on millions of devices, working just fine. Or maybe you'd like to be more precise than "work".

Avoiding cycles and thinking about ownership is not programming carefully, it's merely programming. Really, manual memory management is not rocket science folks, just error prone. And RAII is reliable to the point that anyone that is capable of writing software can wield it.

coldtea · on May 9, 2019

>Really, manual memory management is not rocket science folks, just error prone

Doesn't have to be rocket science to be a cognitive burden. And we want to eliminate "error prone" factors...

blub · on May 9, 2019

And we did eliminate them.

But no one can seriously claim that GC is a magic bullet, because it's only clearly winning against plain C.

Carefully designed, popular languages are choosing RAII and reference counting implementations and those are very competitive.

mcguire · on May 9, 2019

?

C++ and Objective C inherited C's memory model and thus don't have a good memory management option. Rust is explicitly a systems language, meaning stronger memory models are out of scope, and then inherited most of C++'s idioms.

Other than those, what popular languages are specifically choosing reference counting as a primary GC method?

steveklabnik · on May 9, 2019

Swift.

pjmlp · on May 10, 2019

Swift requires reference counting for compatibility with Objective-C code.

Apple could have gone the .NET way (RCW) to interface with COM reference counting, but that naturally would complicate the bindings to Objective-C runtime and existing libraries.

mcguire · on May 9, 2019

Which inherits its memory management from Objective C.