Hacker News

This kind of thing happens all the time in academia. The authors are either constrained by space due to page limits, or they are too lazy to explain all the little details that go into the algorithm.

I did research in computer vision a few years ago, and back then people wouldn't publish their code and would purposely leave details of the algorithm out of the paper. Many of those algorithms were patent pending, and I assume the authors were hoping to make some money from the patents. Compared to that, things are a lot better nowadays, when most of the popular papers come with published code.



Is this really that common? That's disheartening; I want to spend time in academia, but experiences like this are sucking the fun out of it for me...


Academia isn't flawless; it's no paradise. It's just humans and their factions, with all the good and bad that brings.

> The authors are ... constrained by space due to paper limitations

This is very real. Different journals have different criteria: word count; formatting; the number of tables you're permitted to include; etc. It's archaic and daft but that's the truth of it. And that's before we even get started on the really bonkers stuff like author order, impact factor, reviewer workload vs lack of pay, publish or perish, and so on.


Yes, this was a huge disappointment when I read Chemistry & Physics years ago. The naive view of papers was that they existed to move human knowledge forward, but it became clear they were an elaborate knowledge-withholding device!

Fortunately the trend seems to be towards better levels of openness, but it varies by subject. Stories like the BASF team being unable to reproduce vast numbers of established published techniques are way too common.


I tried to make use of some public audio research and it was a pretty bad experience. There was a speech intelligibility competition a few years ago. Some of the submitted papers are still around, as well as the summary paper describing the results. But many papers are hard to find, and for those that claimed to have source code available, the code itself is hard to find --- I was able to get MATLAB sources for a few algorithms, and they work on the bundled example files but mostly crash on my own.

It's a shame, because I understand the idea of the paper and have an excellent place to apply it, but I lack the DSP background to rebuild the code from scratch -- so the work goes unused.


This sounds interesting; would you care to reference the paper in question?


I'm not sure if I can find the exact paper anymore. This was in response to the Hurricane Challenge, a summary of results is available [1]. I tried to use code for uwSSDRCt available from the legacy page of the conference [2], under the link "Live and recorded speech modifier", direct download here [3].

The basic context is verification code delivery -- I'm playing pre-recorded samples of numbers to users, and can't control or sample the noise (either transmission or environmental), but would like to enhance intelligibility to reduce user effort, improve experience, and reduce costs.

[1] https://www.research.ed.ac.uk/portal/files/17887878/Cooke_et...

[2] https://web.archive.org/web/20131012005150/http://listening-...

[3] http://www.laslab.org/resources/LISTA/code/D4.3.zip
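For anyone curious what these speech modifiers roughly do: SSDRC-style methods combine spectral shaping with dynamic range compression, so that quiet parts of the speech are boosted relative to loud parts before playback in noise. Here's a minimal numpy sketch of just the dynamic-range-compression half (the spectral-shaping half is omitted, and the function name, window size, and compression exponent are my own illustrative choices, not taken from the released code):

```python
import numpy as np

def compress_dynamic_range(x, sr, win_ms=20.0, exponent=0.5, eps=1e-8):
    """Flatten the short-time energy envelope of a signal.

    Loud spans are attenuated and quiet spans boosted -- the
    dynamic-range-compression idea behind SSDRC-style
    intelligibility enhancement (spectral shaping not included).
    """
    win = max(1, int(sr * win_ms / 1000))
    # Short-time RMS envelope via a moving average of the energy.
    energy = np.convolve(x**2, np.ones(win) / win, mode="same")
    env = np.sqrt(energy) + eps
    # Applying gain = env**(exponent - 1) maps the envelope to
    # env**exponent, i.e. a flatter dynamic range for exponent < 1.
    y = x * env ** (exponent - 1.0)
    # Rescale so the overall peak level is unchanged.
    return y * (np.max(np.abs(x)) / (np.max(np.abs(y)) + eps))

# Toy usage: a tone that is loud for the first half, quiet for the second.
sr = 8000
t = np.arange(sr) / sr
amp = np.where(t < 0.5, 1.0, 0.1)
x = amp * np.sin(2 * np.pi * 440 * t)
y = compress_dynamic_range(x, sr)
```

After compression, the quiet second half sits much closer in level to the loud first half, which is what helps intelligibility when the playback channel adds noise you can't control.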


Yes it's common that a lot of "little details" are left out. Yes it does make things harder. No it's not a big enough problem that you need to be disheartened. You could even see it as an opportunity to raise the bar and stand out positively.


welcome to earth. we apologize for the mess.


Completely agree. The situation today is far better than a decade ago, with code releases for machine learning and computer vision papers being much more common than before.

I try to make my students release polished and easy to use code, but it can sometimes fall through the cracks due to deadlines, etc. Many projects are the output of a single PhD student.


Yep, it's so much better now in computer vision, with early publishing on arXiv and published code. I feel that's one of the reasons research in CV is progressing so fast.

Also, I think it's a fact of life that you can't put all of the details of your algorithm in the paper, and not just because of length limits. I have actually tried to do this in two papers by putting all of the details in the supplement, and the work of explaining every detail and justifying my choice of parameters and decisions for edge cases is almost as hard as writing the main paper. It becomes hard to justify the time spent pretty fast. Also, putting this in the main paper leaves your beautiful explanations tinged with edge cases and digressions, haha.


Even worse, when actually doing CV research, it seems that leaving out the details can even be deliberate -- hiding the fact that the authors tuned their method on cherry-picked input datasets so that it shows the improvements they claim. Actually implementing the code and running it on more standard datasets tends to quickly put the results of many papers into question.



