The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe


Question

R provides two different methods for accessing the elements of a list or data.frame- the [] and [[]] operators.

What is the difference between the two? In what situations should I use one over the other?

1
476
4/26/2019 10:04:28 AM

Accepted Answer

The R Language Definition is handy for answering these types of questions:

R has three basic indexing operators, with syntax displayed by the following examples

    x[i]
    x[i, j]
    x[[i]]
    x[[i, j]]
    x$a
    x$"a"

For vectors and matrices the [[ forms are rarely used, although they have some slight semantic differences from the [ form (e.g. it drops any names or dimnames attribute, and that partial matching is used for character indices). When indexing multi-dimensional structures with a single index, x[[i]] or x[i] will return the ith sequential element of x.

For lists, one generally uses [[ to select any single element, whereas [ returns a list of the selected elements.

The [[ form allows only a single element to be selected using integer or character indices, whereas [ allows indexing by vectors. Note though that for a list, the index can be a vector and each element of the vector is applied in turn to the list, the selected component, the selected component of that component, and so on. The result is still a single element.

299
9/21/2018 1:41:06 PM

The significant differences between the two methods are the class of the objects they return when used for extraction and whether they may accept a range of values, or just a single value during assignment.

Consider the case of data extraction on the following list:

foo <- list( str='R', vec=c(1,2,3), bool=TRUE )

Say we would like to extract the value stored by bool from foo and use it inside an if() statement. This will illustrate the differences between the return values of [] and [[]] when they are used for data extraction. The [] method returns objects of class list (or data.frame if foo was a data.frame) while the [[]] method returns objects whose class is determined by the type of their values.

So, using the [] method results in the following:

if( foo[ 'bool' ] ){ print("Hi!") }
Error in if (foo["bool"]) { : argument is not interpretable as logical

class( foo[ 'bool' ] )
[1] "list"

This is because the [] method returned a list and a list is not valid object to pass directly into an if() statement. In this case we need to use [[]] because it will return the "bare" object stored in 'bool' which will have the appropriate class:

if( foo[[ 'bool' ]] ){ print("Hi!") }
[1] "Hi!"

class( foo[[ 'bool' ]] )
[1] "logical"

The second difference is that the [] operator may be used to access a range of slots in a list or columns in a data frame while the [[]] operator is limited to accessing a single slot or column. Consider the case of value assignment using a second list, bar():

bar <- list( mat=matrix(0,nrow=2,ncol=2), rand=rnorm(1) )

Say we want to overwrite the last two slots of foo with the data contained in bar. If we try to use the [[]] operator, this is what happens:

foo[[ 2:3 ]] <- bar
Error in foo[[2:3]] <- bar : 
more elements supplied than there are to replace

This is because [[]] is limited to accessing a single element. We need to use []:

foo[ 2:3 ] <- bar
print( foo )

$str
[1] "R"

$vec
     [,1] [,2]
[1,]    0    0
[2,]    0    0

$bool
[1] -0.6291121

Note that while the assignment was successful, the slots in foo kept their original names.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow
Icon