Comparing gather (tidyr) to melt (reshape2)
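I like the reshape2 package because it makes life so easy. Typically, Hadley improves on his earlier packages so that the code is more streamlined and runs faster. I figured I'd give tidyr a whirl, and from what I read I thought gather was very similar to melt from reshape2. But after reading the documentation, I can't get gather to do the same task that melt does.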
Data view
Here is a view of the data (the actual data, in dput form, is at the end of the post):
  teacher yr1.baseline     pd yr1.lesson1 yr1.lesson2 yr2.lesson1 yr2.lesson2 yr2.lesson3
1       3      1/13/09 2/5/09      3/6/09     4/27/09     10/7/09    11/18/09      3/4/10
2       7      1/15/09 2/5/09      3/3/09      5/5/09    10/16/09    11/18/09      3/4/10
3       8      1/27/09 2/5/09      3/3/09     4/27/09     10/7/09    11/18/09      3/5/10
Code
Here is the code in melt fashion, followed by my attempt at gather. How can I make gather do the same thing as melt?
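library(reshape2); library(dplyr); library(tidyr)

# melt() call that produces the desired output
dat %>% melt(id=c("teacher", "pd"), value.name="date")

# my gather() attempt, which does not give the same result
dat %>% gather(key=c(teacher, pd), value=date, -c(teacher, pd))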
Desired output
   teacher     pd     variable     date
1        3 2/5/09 yr1.baseline  1/13/09
2        7 2/5/09 yr1.baseline  1/15/09
3        8 2/5/09 yr1.baseline  1/27/09
4        3 2/5/09  yr1.lesson1   3/6/09
5        7 2/5/09  yr1.lesson1   3/3/09
6        8 2/5/09  yr1.lesson1   3/3/09
7        3 2/5/09  yr1.lesson2  4/27/09
8        7 2/5/09  yr1.lesson2   5/5/09
9        8 2/5/09  yr1.lesson2  4/27/09
10       3 2/5/09  yr2.lesson1  10/7/09
11       7 2/5/09  yr2.lesson1 10/16/09
12       8 2/5/09  yr2.lesson1  10/7/09
13       3 2/5/09  yr2.lesson2 11/18/09
14       7 2/5/09  yr2.lesson2 11/18/09
15       8 2/5/09  yr2.lesson2 11/18/09
16       3 2/5/09  yr2.lesson3   3/4/10
17       7 2/5/09  yr2.lesson3   3/4/10
18       8 2/5/09  yr2.lesson3   3/5/10
Data
dat <- structure(list(
    teacher = structure(1:3, .Label = c("3", "7", "8"), class = "factor"),
    yr1.baseline = structure(1:3, .Label = c("1/13/09", "1/15/09", "1/27/09"), class = "factor"),
    pd = structure(c(1L, 1L, 1L), .Label = "2/5/09", class = "factor"),
    yr1.lesson1 = structure(c(2L, 1L, 1L), .Label = c("3/3/09", "3/6/09"), class = "factor"),
    yr1.lesson2 = structure(c(1L, 2L, 1L), .Label = c("4/27/09", "5/5/09"), class = "factor"),
    yr2.lesson1 = structure(c(2L, 1L, 2L), .Label = c("10/16/09", "10/7/09"), class = "factor"),
    yr2.lesson2 = structure(c(1L, 1L, 1L), .Label = "11/18/09", class = "factor"),
    yr2.lesson3 = structure(c(1L, 1L, 2L), .Label = c("3/4/10", "3/5/10"), class = "factor")),
    .Names = c("teacher", "yr1.baseline", "pd", "yr1.lesson1", "yr1.lesson2",
        "yr2.lesson1", "yr2.lesson2", "yr2.lesson3"),
    row.names = c(NA, -3L), class = "data.frame")
Your gather line should look like this:
dat %>% gather(variable, date, -teacher, -pd)
This says: "Gather all variables except teacher and pd, calling the new key column 'variable' and the new value column 'date'."
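To check that the two calls line up, here is a minimal sketch on a cut-down copy of the data. The name `dat_small` is made up for this example, and its columns are plain character vectors rather than factors, an assumption made only to keep factor-level warnings out of the comparison.

```r
library(reshape2)
library(tidyr)

# Hypothetical stand-in for `dat`: fewer columns, character instead of factor
dat_small <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  stringsAsFactors = FALSE
)

# reshape2: name the id columns, everything else is melted
melted <- melt(dat_small, id.vars = c("teacher", "pd"), value.name = "date")

# tidyr: name the key and value columns, then exclude the id columns
gathered <- gather(dat_small, variable, date, -teacher, -pd)

# melt() stores `variable` as a factor, gather() as character by default,
# so convert before comparing the two results
melted$variable <- as.character(melted$variable)
all.equal(melted, gathered)
```

The main difference in how the calls are written is that melt takes the columns to keep (`id.vars`), while gather takes the columns to collect, here expressed by excluding the two id columns.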
As an explanation, note the following from the help(gather) page:
...: Specification of columns to gather. Use bare variable names. Select all variables between x and z with 'x:z', exclude y with '-y'. For more options, see the select documentation.
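Since this is an ellipsis, the specification of the columns to gather is given as separate (bare name) arguments. We want to gather all columns except teacher and pd, so we use -.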
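The column specification can be written in more than one way. The sketch below, again on a made-up `dat_small` with character columns (an assumption for illustration), shows three forms that should select the same columns: excluding each id column separately, excluding them together with `c()`, and naming the columns to gather with a range.

```r
library(tidyr)

# Hypothetical stand-in for `dat`: fewer columns, character instead of factor
dat_small <- data.frame(
  teacher      = c("3", "7", "8"),
  yr1.baseline = c("1/13/09", "1/15/09", "1/27/09"),
  pd           = c("2/5/09", "2/5/09", "2/5/09"),
  yr1.lesson1  = c("3/6/09", "3/3/09", "3/3/09"),
  yr1.lesson2  = c("4/27/09", "5/5/09", "4/27/09"),
  stringsAsFactors = FALSE
)

# Exclude the id columns one at a time
a <- gather(dat_small, variable, date, -teacher, -pd)

# Exclude them together
b <- gather(dat_small, variable, date, -c(teacher, pd))

# Name the columns to gather; x:z selects a run of adjacent columns
by_range <- gather(dat_small, variable, date, yr1.baseline, yr1.lesson1:yr1.lesson2)

all.equal(a, b)
all.equal(a, by_range)
```

Note that `pd` sits between `yr1.baseline` and `yr1.lesson1` in the data, so a single range from `yr1.baseline` to the last lesson column would pull `pd` in as well. That is why the exclusion form is the more convenient one here.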