
I think this only reinforces my point: this is a ridiculous default. But that is news to me :)


In light of this issue, dplyr's tbl_df structure (a light but helpful wrapper around data.frame) actually has different drop defaults. For example:

  > x <- data.frame(foo=1:5, bar=1:5, baz=1:5)
  > dim(x[,'foo'])
  NULL
  > dim(x[,c('foo','bar')])
  [1] 5 2
  > dim(x[,'foo',drop=FALSE])
  [1] 5 1
compared to

  > x <- dplyr::data_frame(foo=1:5, bar=1:5, baz=1:5)
  > dim(x[,'foo'])
  [1] 5 1
Although I think these are more reasonable (I've got multiple commits at work with messages bemoaning drop=FALSE), this can ironically also mess you up if you got used to the old defaults :)
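
For what it's worth, here is a minimal sketch of base R idioms that give the same shape no matter how many columns are selected. It assumes a plain data.frame like the one above; the variable names are only for illustration.

```r
x <- data.frame(foo = 1:5, bar = 1:5, baz = 1:5)

# List-style indexing (single bracket, no comma) always returns a data.frame
one_col <- x["foo"]
dim(one_col)

# Matrix-style indexing keeps the data.frame only when drop = FALSE is given
kept <- x[, "foo", drop = FALSE]
dim(kept)

# Double brackets are the explicit way to ask for the bare vector
vec <- x[["foo"]]
is.null(dim(vec))
```

Using x["foo"] when a data.frame is wanted and x[["foo"]] when a vector is wanted sidesteps the drop default entirely.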


There's no question that many defaults seem wonky to many users. However, you have to take into account that the use cases when the language was created (particularly going back to S) aren't the same as they might be now. tl;dr: classic statistics isn't the same as contemporary data science.


Use the right function, subset(), for the right effect then.
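
A quick sketch of what that looks like, assuming the same plain data.frame as earlier in the thread (the name sub is just for illustration):

```r
x <- data.frame(foo = 1:5, bar = 1:5, baz = 1:5)

# subset() on a data.frame defaults to drop = FALSE,
# so selecting a single column still yields a data.frame
sub <- subset(x, select = foo)
dim(sub)
class(sub)
```

One caveat: the help page for subset() describes it as a convenience function for interactive use and recommends the standard subsetting functions such as [ for programming, so it may not be the best fit inside package code.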



