If I have the following data frame:
value factorA factorB
    1       a       e
    2       a       f
    3       a       g
    1       b       k
    2       b       l
    3       b       m
    1       c       e
    2       c       g
How can I get, for each level of factorA, the highest value and the factorB entry associated with it? That is:
value factorA factorB
    3       a       g
    3       b       m
    2       c       g
Is this possible without first using
blocks<-split(factorA, list(), drop=TRUE)
and then sorting each block$a? This will be performed many times, and the number of blocks will always change.
Here is one option, using base R functions:
maxRows <- by(df, df$factorA, function(X) X[which.max(X$value),])
do.call("rbind", maxRows)
#   value factorA factorB
# a     3       a       g
# b     3       b       m
# c     2       c       g
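Since the question asks about avoiding an explicit split, here is a rough sketch of another base R route that never builds per-group blocks: sort the rows so the largest value comes first within each factorA, then keep the first row of each group. The data frame is rebuilt from the question's table, and the names `sorted` and `top` are only illustrative.

```r
# Sample data matching the table in the question.
df <- data.frame(
  value   = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  factorA = factor(c("a", "a", "a", "b", "b", "b", "c", "c")),
  factorB = factor(c("e", "f", "g", "k", "l", "m", "e", "g"))
)

# Sort by group, then by value in descending order within each group.
sorted <- df[order(df$factorA, -df$value), ]

# duplicated() flags every row after the first of each factorA,
# so negating it keeps only the highest-value row per group.
top <- sorted[!duplicated(sorted$factorA), ]
top
```

This keeps the original row names of the selected rows. If two rows in a group tie for the maximum, only the one that sorts first is kept.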
With your data:
df<- structure(list(value = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L), factorA = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"),
factorB = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 3L), .Label = c("e",
"f", "g", "k", "l", "m"), class = "factor")), .Names = c("value",
"factorA", "factorB"), class = "data.frame", row.names = c(NA,
-8L))
Using the ddply function from the plyr package:
> library(plyr)
> df2 <- ddply(df, c('factorA'), function(x) x[which(x$value == max(x$value)), ])
> df2
  value factorA factorB
1     3       a       g
2     3       b       m
3     2       c       g
Or, to use factorA as the row names:
> rownames(df2) <- df2$factorA
> df2
  value factorA factorB
a     3       a       g
b     3       b       m
c     2       c       g
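One difference between the two answers is worth noting: which.max() returns only the first maximum, while which(x == max(x)) returns every row that ties for the maximum. A small sketch of the difference follows. The `tied` data frame is hypothetical and not taken from the question.

```r
# Hypothetical data with a tie for the maximum in group "a".
tied <- data.frame(
  value   = c(3L, 3L, 1L),
  factorA = c("a", "a", "b"),
  factorB = c("e", "f", "k")
)

# Look at group "a" only.
grp <- tied[tied$factorA == "a", ]

# which.max() gives the index of the first maximum only.
first_only <- grp[which.max(grp$value), ]

# which(x == max(x)) gives the indices of all maxima.
all_ties <- grp[which(grp$value == max(grp$value)), ]

nrow(first_only)  # 1
nrow(all_ties)    # 2
```

Which behaviour is right depends on whether tied rows should all be reported or just one per group.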