I think this only reinforces my point: this is a ridiculous default. But that is...

andy_wrote · on March 26, 2015

In light of this issue, dplyr's tbl_df structure (a light but helpful wrapper around data.frame) actually has a different drop default. For example, with a plain data.frame:

  > x <- data.frame(foo=1:5, bar=1:5, baz=1:5)
  > dim(x[,'foo'])
  NULL
  > dim(x[,c('foo','bar')])
  [1] 5 2
  > dim(x[,'foo',drop=FALSE])
  [1] 5 1

compared to

  > x <- dplyr::data_frame(foo=1:5, bar=1:5, baz=1:5)
  > dim(x[,'foo'])
  [1] 5 1

Although I think the tbl_df defaults are more reasonable (I've got multiple commits at work with messages bemoaning drop=FALSE), they can ironically also mess you up if you've gotten used to the old defaults :)

jghn · on March 26, 2015

There's no question that many defaults seem wonky to many users. However, you have to take into account that the use cases when the language was created (particularly going back to S) aren't the same as they might be now. tl;dr: classic statistics isn't the same as contemporary data science.

stewbrew · on March 29, 2015

Use the right function, subset(), for the right effect then.
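
A minimal sketch of what that looks like, assuming the same plain data.frame x from the earlier comment (the name one_col is only for illustration):

```r
# Same plain data.frame as in the earlier comment
x <- data.frame(foo = 1:5, bar = 1:5, baz = 1:5)

# subset() on a data.frame has drop = FALSE as its default,
# so selecting a single column still returns a data.frame
one_col <- subset(x, select = foo)

dim(one_col)            # 5 rows, 1 column
is.data.frame(one_col)  # TRUE
```

One caveat: subset() uses non-standard evaluation, and its own help page suggests it is mainly meant for interactive use, so x[, 'foo', drop=FALSE] may still be the safer choice inside functions.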