# data.table subsetting


The data.table package supports a powerful syntax to select rows and columns.

## Selecting a single column

The syntax below returns a one-column data.table
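The original snippet is not shown here, so the following is a minimal sketch. It assumes `DT` is the built-in `iris` data set converted to a data.table, and that the column is selected with `.()`.

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris

# Wrapping the column in .() (an alias for list()) keeps the result a data.table
one_col <- DT[, .(Species)]

is.data.table(one_col)  # TRUE
ncol(one_col)           # 1
```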

This one returns a vector
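A sketch of the vector-returning form, again assuming `DT` is `iris` converted to a data.table:

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris

# A bare column name in j returns the column itself (here a factor), not a data.table
vec <- DT[, Species]

is.data.table(vec)  # FALSE
length(vec)         # 150
```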

However, with `with = FALSE` the result is always a data.table.
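For illustration, a small sketch of the `with = FALSE` case. The variable `myCol` and the `iris`-based `DT` are assumptions consistent with the benchmark expressions later in this post.

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris
myCol <- "Species"          # column name held in a character variable

# with = FALSE makes j a character vector of column names
res <- DT[, myCol, with = FALSE]

is.data.table(res)  # TRUE, even though only one column was requested
```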

This is because, with `with = FALSE`, the column selection is always treated as a vector of column names, even when it has length 1. To get a single column as a vector, we can use list-style subsetting with `[[`, since a data.table is also a data.frame.
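A minimal sketch of the `[[` approach, assuming the same `iris`-based `DT` and a `myCol` variable that holds the column name:

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris
myCol <- "Species"

# [[ works on a data.table exactly as it does on a data.frame or list
vec <- DT[[myCol]]

is.data.table(vec)            # FALSE
identical(vec, iris$Species)  # TRUE
```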

What if we also want the row-filtering power of data.table? It gets tricky, because the row-filtering expression in `i` expects an unquoted column, not a string holding the column name.

This one works
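This is the first expression from the benchmark table below, shown as a sketch that assumes `DT` is `iris` converted to a data.table:

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris

# The unquoted column name is evaluated inside DT
setosa <- DT[Species == "setosa"]

nrow(setosa)  # 50
```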

But this one doesn’t
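A sketch of the failing case, assuming the column name is stored in a variable called `myCol`. Because `myCol` is not a column of `DT`, it is looked up as an ordinary variable, and the filter compares two strings rather than the column's values.

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris
myCol <- "Species"

# myCol resolves to the string "Species", so this is "Species" == "setosa",
# a single FALSE rather than a per-row comparison
res <- DT[myCol == "setosa"]

nrow(res)  # not the 50 setosa rows we wanted
```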

To use a dynamic column name in row subsetting, we need to rely on `[[`.
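This is the second expression from the benchmark table below, shown as a sketch under the same assumptions (`DT` built from `iris`, column name in `myCol`):

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris
myCol <- "Species"

# Extract the column by name first, then use the logical vector as the row filter
setosa <- DT[DT[[myCol]] == "setosa"]

nrow(setosa)  # 50
```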

Does this compromise performance?

| expr | Mean (µs) | Median (µs) |
|---|---|---|
| `DT[Species == "setosa"]` | 1418.0504 | 1126.9530 |
| `DT[DT[[myCol]] == "setosa"]` | 518.5885 | 408.5935 |
| `iris[iris[[myCol]] == "setosa", ]` | 240.9830 | 191.2095 |

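In this benchmark, the bulkier `[[` syntax outperformed the neater one, and to my surprise, the operation on the data.frame was faster than either data.table version. Note that iris has only 150 rows, so these timings probably reflect fixed per-call overhead more than filtering speed.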
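The benchmarking tool is not named in this post. One way to reproduce a comparison like this is the `microbenchmark` package, which is an assumption here, as is the number of repetitions. Timings will differ between machines and package versions.

```r
library(data.table)
library(microbenchmark)   # assumption: this package is installed

DT <- as.data.table(iris)   # assumption: DT is built from iris
myCol <- "Species"

# Time the three expressions; times = 100L is an arbitrary choice
bm <- microbenchmark(
  DT[Species == "setosa"],
  DT[DT[[myCol]] == "setosa"],
  iris[iris[[myCol]] == "setosa", ],
  times = 100L
)

print(bm, unit = "us")   # report timings in microseconds
```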

## Assignment Operator :=

Assignment (or sub-assignment) with `:=` is done in place, so we should expect this to change the original data.table.
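A minimal sketch of in-place sub-assignment. The filter and the value `0` are illustrative choices, and `DT` is assumed to be `iris` converted to a data.table.

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris

# := inside the same [ ] as the row filter updates DT by reference
DT[Species == "setosa", Sepal.Length := 0]

DT[Species == "setosa", unique(Sepal.Length)]  # 0
```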

By chaining the assignment to a previous selection, we're only modifying a copy: the first `[` returns a new data.table, and `:=` is applied to that.

The original data.table remains unchanged.
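The chained case can be sketched as follows, using the same illustrative filter and value and an `iris`-based `DT`:

```r
library(data.table)

DT <- as.data.table(iris)   # assumption: DT is built from iris

# The first [ ] returns a new data.table; := then modifies that new object
res <- DT[Species == "setosa"][, Sepal.Length := 0]

all(res$Sepal.Length == 0)                    # TRUE: the copy was modified
identical(DT$Sepal.Length, iris$Sepal.Length) # TRUE: DT itself is untouched
```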