I want to sort a data.frame by multiple columns. For example, with the data.frame below I would like to sort by column `z` (descending) then by column `b` (ascending):

```
dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
levels = c("Low", "Med", "Hi"), ordered = TRUE),
x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
z = c(1, 1, 1, 2))
dd
b x y z
1 Hi A 8 1
2 Med D 3 1
3 Hi A 9 1
4 Low C 9 2
```

You can use the `order()` function directly without resorting to add-on tools. This simpler answer uses a trick right from the top of the `example(order)` code:

```
R> dd[with(dd, order(-z, b)), ]
b x y z
4 Low C 9 2
2 Med D 3 1
1 Hi A 8 1
3 Hi A 9 1
```
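The unary minus only works on columns that can be negated, so it is fine for the numeric `z` but not for character or factor columns. As a minimal sketch of two base R alternatives (assuming R 3.3.0 or later for the per-key `decreasing` vector, which requires `method = "radix"`):

```r
dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))

# Option 1: one decreasing flag per sort key (only supported by the radix method)
by_flag <- dd[order(dd$z, dd$b, decreasing = c(TRUE, FALSE), method = "radix"), ]

# Option 2: xtfrm() maps a column to numbers that sort the same way,
# so the result can be negated even for character or factor columns
by_xtfrm <- dd[order(-xtfrm(dd$x), dd$y), ]  # x descending, then y ascending
```

`by_flag` reproduces the ordering shown above; `by_xtfrm` sorts on different columns purely to illustrate a descending character key.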

*Edit some 2+ years later:* It was just asked how to do this by column index. The answer is to simply pass the desired sorting column(s) to the `order()` function:

```
R> dd[order(-dd[,4], dd[,1]), ]
b x y z
4 Low C 9 2
2 Med D 3 1
1 Hi A 8 1
3 Hi A 9 1
R>
```

rather than using the name of the column (and `with()` for easier/more direct access).
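When the sort columns are held in a variable rather than typed out, the same idea can be written with `do.call()`. This is only a sketch; the names `sort_cols` and `desc` are illustrative and not part of the original answer, and the per-key `decreasing` vector assumes R 3.3.0 or later:

```r
dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))

sort_cols <- c(4, 1)         # column positions: z first, then b (illustrative)
desc      <- c(TRUE, FALSE)  # one descending flag per sort column (illustrative)

# Drop the column names so they cannot clash with order()'s own argument names
keys <- unname(as.list(dd[sort_cols]))

# Equivalent to order(dd[, 4], dd[, 1], decreasing = desc, method = "radix")
ord <- do.call(order, c(keys, list(decreasing = desc, method = "radix")))
sorted <- dd[ord, ]
```

A character vector of column names works in place of the positions in `sort_cols`.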

- `order` from `base`
- `arrange` from `dplyr`
- `setorder` and `setorderv` from `data.table`
- `arrange` from `plyr`
- `sort` from `taRifx`
- `orderBy` from `doBy`
- `sortData` from `Deducer`

Most of the time you should use the `dplyr` or `data.table` solutions, unless having no dependencies is important, in which case use `base::order`.
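For reference, a short sketch of what the `dplyr` and `data.table` calls might look like on the example data (assuming both packages are installed; the object names `by_dplyr`, `dt` and `dt2` are illustrative):

```r
library(dplyr)
library(data.table)

dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))

# dplyr: arrange() returns a sorted copy; desc() marks a descending key
by_dplyr <- arrange(dd, desc(z), b)

# data.table: setorder() reorders by reference, so convert to a data.table first
dt <- as.data.table(dd)
setorder(dt, -z, b)

# setorderv() takes column names as strings, with -1 / 1 for descending / ascending
dt2 <- as.data.table(dd)
setorderv(dt2, cols = c("z", "b"), order = c(-1L, 1L))
```

Note that `setorder()` and `setorderv()` modify their input in place, whereas `arrange()` leaves `dd` untouched.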

I recently added sort.data.frame to a CRAN package, making it class compatible as discussed here: Best way to create generic/method consistency for sort.data.frame?

Therefore, given the data.frame dd, you can sort as follows:

```
dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
levels = c("Low", "Med", "Hi"), ordered = TRUE),
x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
z = c(1, 1, 1, 2))
library(taRifx)
sort(dd, f= ~ -z + b )
```

If you are one of the original authors of this function, please contact me. Discussion as to public domaininess is here: http://chat.stackoverflow.com/transcript/message/1094290#1094290

You can also use the `arrange()` function from `plyr` as Hadley pointed out in the above thread:

```
library(plyr)
arrange(dd,desc(z),b)
```

Benchmarks: Note that I loaded each package in a new R session since there were a lot of conflicts. In particular, loading the doBy package causes `sort` to return "The following object(s) are masked from 'x (position 17)': b, x, y, z", and loading the Deducer package overwrites `sort.data.frame` from Kevin Wright or the taRifx package.

```
#Load each time
dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
levels = c("Low", "Med", "Hi"), ordered = TRUE),
x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
z = c(1, 1, 1, 2))
library(microbenchmark)
# Reload R between benchmarks
microbenchmark(dd[with(dd, order(-z, b)), ] ,
dd[order(-dd$z, dd$b),],
times=1000
)
```

Median times:

`dd[with(dd, order(-z, b)), ]`: **778**

`dd[order(-dd$z, dd$b),]`: **788**

```
library(taRifx)
microbenchmark(sort(dd, f= ~-z+b ),times=1000)
```

Median time: **1,567**

```
library(plyr)
microbenchmark(arrange(dd,desc(z),b),times=1000)
```

Median time: **862**

```
library(doBy)
microbenchmark(orderBy(~-z+b, data=dd),times=1000)
```

Median time: **1,694**

Note that doBy takes a good bit of time to load the package.

```
library(Deducer)
microbenchmark(sortData(dd,c("z","b"),increasing= c(FALSE,TRUE)),times=1000)
```

Couldn't make Deducer load. Needs JGR console.

```
esort <- function(x, sortvar, ...) {
attach(x)
x <- x[with(x,order(sortvar,...)),]
return(x)
detach(x)
}
microbenchmark(esort(dd, -z, b),times=1000)
```

Doesn't appear to be compatible with microbenchmark due to the attach/detach.
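The `attach()` call is only needed because `-z` and `b` would otherwise be looked up outside the data frame. As a rough sketch (not the original author's function; `esort2` is a hypothetical name), the sort expressions can instead be captured unevaluated and evaluated against the data frame's columns:

```r
esort2 <- function(x, ...) {
  caller <- parent.frame()
  # Capture the sort expressions (e.g. -z, b) without evaluating them
  keys <- eval(substitute(alist(...)))
  # Evaluate each expression with the columns of x in scope,
  # falling back to the caller's environment for anything else
  cols <- lapply(keys, eval, envir = x, enclos = caller)
  x[do.call(order, cols), , drop = FALSE]
}

dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))

res <- esort2(dd, -z, b)
```

Because nothing is attached to the search path, a function written this way should be safe to call repeatedly, for example inside `microbenchmark()`.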

```
m <- microbenchmark(
arrange(dd,desc(z),b),
sort(dd, f= ~-z+b ),
dd[with(dd, order(-z, b)), ] ,
dd[order(-dd$z, dd$b),],
times=1000
)
uq <- function(x) { fivenum(x)[4]}
lq <- function(x) { fivenum(x)[2]}
y_min <- 0 # min(by(m$time,m$expr,lq))
y_max <- max(by(m$time,m$expr,uq)) * 1.05
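library(ggplot2)  # needed for ggplot(); not loaded in the original snippet
p <- ggplot(m,aes(x=expr,y=time)) + coord_cartesian(ylim = c( y_min , y_max ))
# ggplot2 >= 3.3.0 renamed fun.y/fun.ymin/fun.ymax to fun/fun.min/fun.max
p + stat_summary(fun=median,fun.min = lq, fun.max = uq, aes(fill=expr))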
```

(lines extend from lower quartile to upper quartile, dot is the median)

Given these results and weighing simplicity vs. speed, I'd have to give the nod to **`arrange` in the plyr package**. It has a simple syntax and yet is almost as speedy as the base R commands with their convoluted machinations. Typically brilliant Hadley Wickham work. My only gripe with it is that it breaks the standard R nomenclature where sorting objects get called by `sort(object)`, but I understand why Hadley did it that way due to issues discussed in the question linked above.

Licensed under: CC-BY-SA with attribution

Not affiliated with: Stack Overflow