The data.table package supports a powerful syntax for selecting rows and columns.
Selecting a single column
The syntax below returns a one-column data.table.
This one returns a vector.
When the column subset is given as a vector of column names, it will always return a data.table.
This is because the input for the column subset is always a vector, even when it has length 1.
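The selection forms described above might look like the sketch below. It assumes the built-in iris data converted to a data.table named DT and a variable myCol holding the column name; the exact calls used in the original post are not shown, so these are illustrative choices.

```r
library(data.table)

# Assumption: iris (shipped with R) stands in for the original data
DT <- as.data.table(iris)
myCol <- "Species"  # hypothetical variable holding a column name

one_col <- DT[, "Species"]  # column name as a string: one-column data.table
vec     <- DT[, Species]    # bare column name: plain vector (a factor here)
dyn_col <- DT[, ..myCol]    # name taken from a variable: one-column data.table

class(one_col)
class(vec)
class(dyn_col)
```

The `..` prefix tells data.table to look up myCol in the calling scope rather than among the columns of DT.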
To get a single column as a vector, we can use list subsetting syntax, since a data.table is also a list.
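A minimal sketch of list subsetting with a dynamic column name, again assuming iris as the data and myCol as a hypothetical variable name:

```r
library(data.table)

DT <- as.data.table(iris)  # assumption: iris stands in for the original data
myCol <- "Species"         # hypothetical variable holding a column name

is.list(DT)         # a data.table is a list of columns
vec <- DT[[myCol]]  # [[ extracts one list element, i.e. the column as a vector
class(vec)
```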
What if we also want the row-filtering power of data.table? It gets tricky because the syntax for row filtering only works with a column referenced directly, not with a column name stored in a variable.
This one works
But this one doesn’t
To use a dynamic column name in row subsetting, we need to rely on an explicit column lookup such as DT[DT[[myCol]] == "setosa"].
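The working and non-working filters might be sketched as follows. The data (iris), the variable myCol, and the get() variant are assumptions added for illustration; only the DT[[myCol]] form appears in the post itself.

```r
library(data.table)

DT <- as.data.table(iris)  # assumption: iris stands in for the original data
myCol <- "Species"         # hypothetical variable holding a column name

# Works: Species is resolved as a column of DT
by_name <- DT[Species == "setosa"]

# Does not do what we want: myCol is not a column, so the string
# "Species" itself is compared with "setosa"; no setosa rows are selected
by_var <- DT[myCol == "setosa"]

# Works: look the column up explicitly with list subsetting
by_list <- DT[DT[[myCol]] == "setosa"]

# Alternative (not from the original post): get() resolves the name to the column
by_get <- DT[get(myCol) == "setosa"]

nrow(by_name)
nrow(by_list)
```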
Does this compromise performance?
|DT[Species == "setosa"]
|DT[DT[[myCol]] == "setosa"]
|iris[iris[[myCol]] == "setosa", ]
The bulky syntax turned out to outperform the neater ones, and to my surprise, the operation on a data.frame was more efficient than on a data.table.
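For readers who want to repeat the comparison, here is a rough sketch using only base R's system.time. The helper time_it, the repetition count, and the use of iris are assumptions; the original benchmarking tool and its numbers are not reproduced, and results will vary by machine and data.table version.

```r
library(data.table)

DT <- as.data.table(iris)  # assumption: iris stands in for the original data
myCol <- "Species"         # hypothetical variable holding a column name

# Hypothetical helper: run a zero-argument function n times,
# return the elapsed wall-clock time in seconds
time_it <- function(f, n = 200L) {
  system.time(for (k in seq_len(n)) f())[["elapsed"]]
}

timings <- c(
  dt_bare = time_it(function() DT[Species == "setosa"]),
  dt_list = time_it(function() DT[DT[[myCol]] == "setosa"]),
  df_list = time_it(function() iris[iris[[myCol]] == "setosa", ])
)
print(timings)
```

On a dataset as small as iris, fixed per-call overhead tends to dominate, so a larger table may rank the three expressions differently.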
Assignment (or sub-assignment) is done in place. So we should expect this to change the original data.table.
By chaining the assignment to a previous selection, however, we're only modifying a copy.
The original data.table remains unchanged.
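The difference between assigning in place and assigning on a chained selection can be sketched like this. The column Sepal.Length and the value 0 are arbitrary illustrative choices, and iris again stands in for the original data.

```r
library(data.table)

# In-place assignment: := inside a single [ ] call modifies DT by reference
DT <- as.data.table(iris)
DT[Species == "setosa", Sepal.Length := 0]
in_place_values <- DT[Species == "setosa", Sepal.Length]

# Chained assignment: the first [ ] returns a new data.table holding the
# selected rows, and := then modifies that copy only
DT2 <- as.data.table(iris)
res <- DT2[Species == "setosa"][, Sepal.Length := 0]

unique(in_place_values)                        # values written into DT itself
identical(DT2$Sepal.Length, iris$Sepal.Length) # DT2 is untouched
unique(res$Sepal.Length)                       # only the copy was changed
```

To filter and assign on the original table, both the row condition and the `:=` need to sit in the same bracket call, as in the first form.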