Reshaping data
- 1) Wide to long data format
- 2) Split a column
- 3) Long to wide data format
- Reference link: https://www.r-bloggers.com/how-to-reshape-data-in-r-tidyr-vs-reshape2/
1) Wide to long data format
- Use tidyr::gather or reshape2::melt
- gather only works on data frames; it cannot handle a matrix or an array, while melt can
    ## messy dataset
    set.seed(1801)
    messy <- data.frame(id = 1:4,
                        trt = rep(c('control', 'treatment'), times = 2),
                        work_T1 = runif(4),
                        home_T1 = runif(4),
                        work_T2 = runif(4),
                        home_T2 = runif(4))
    messy

    ## tidyr::gather
    ## collect every column except id and trt into key/value pairs
    library(tidyr)
    gathered.messy <- gather(messy, key, value, -id, -trt)
    gathered.messy

    ## reshape2::melt
    ## note: the argument is value.name (singular), not value.names
    library(reshape2)
    molten.messy <- melt(messy,
                         variable.name = "key",
                         value.name = "value",
                         id.vars = c("id", "trt"))
    molten.messy
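The note above that gather cannot handle a matrix can be illustrated with a small sketch. The matrix `m` and the names `molten.m`, `df.m`, and `gathered.m` are made up for this example and are not part of the original post. The sketch assumes reshape2 and tidyr are installed.

```r
library(reshape2)
library(tidyr)

# Hypothetical 2 x 3 matrix with named dimensions (illustration only)
m <- matrix(1:6, nrow = 2,
            dimnames = list(id = c("a", "b"), key = c("x", "y", "z")))

# reshape2::melt accepts a matrix directly: one row per cell,
# with the dimension names used as column names
molten.m <- melt(m)
molten.m

# tidyr::gather expects a data frame, so convert the matrix first
# and carry the row names over as an explicit id column
df.m <- data.frame(id = rownames(m), m)
gathered.m <- gather(df.m, key, value, -id)
gathered.m
```

Both results hold one row per matrix cell. The only extra work on the tidyr side is the conversion to a data frame.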
2) Split a column
- Use tidyr::separate or reshape2::colsplit
    ## tidyr::separate
    tidy <- separate(gathered.messy,
                     key,
                     into = c("location", "time"),
                     sep = "_")
    tidy

    ## reshape2::colsplit
    res.tidy <- cbind(molten.messy[1:2],
                      colsplit(molten.messy[, 3], "_", c("location", "time")),
                      molten.messy[4])
    res.tidy
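The difference between the two splitting functions is easier to see on a tiny table. This is a minimal sketch with hand-written data. The names `long`, `sep.long`, `split.cols`, and `col.long` are hypothetical and do not come from the original post.

```r
library(tidyr)
library(reshape2)

# Hypothetical long-format data whose key column packs two variables
long <- data.frame(id = c(1, 1, 2, 2),
                   key = c("work_T1", "home_T2", "work_T1", "home_T2"),
                   value = c(0.1, 0.2, 0.3, 0.4),
                   stringsAsFactors = FALSE)

# tidyr::separate replaces `key` with the two new columns in place
sep.long <- separate(long, key, into = c("location", "time"), sep = "_")
sep.long

# reshape2::colsplit returns only the new columns, so they have to be
# bound back to the remaining columns by hand
split.cols <- colsplit(long$key, "_", c("location", "time"))
col.long <- cbind(long["id"], split.cols, long["value"])
col.long
```

separate keeps the rest of the data frame intact. colsplit works on a single vector, which is why the cbind step is needed.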
3) Long to wide data format
- Use tidyr::spread or reshape2::dcast
    ## messy dataset
    set.seed(1801)
    stocks <- data.frame(time = as.Date('2018-01-01') + 0:9,
                         X = rnorm(10, 0, 1),
                         Y = rnorm(10, 0, 2),
                         Z = rnorm(10, 0, 4))
    stocks

    ## long format: one row per (time, stock) pair
    gathered.stock <- gather(stocks, stock, price, -time)
    gathered.stock

    ## tidyr::spread
    spread.stock <- spread(gathered.stock, stock, price)
    spread.stock

    ## reshape2::dcast
    cast.stock <- dcast(gathered.stock,
                        formula = time ~ stock,
                        value.var = "price")
    cast.stock
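A quick way to check that spread and dcast undo gather is a round trip on a small table. This is a minimal sketch with hand-picked values. The names `wide`, `long`, `back.spread`, and `back.dcast` are hypothetical and only serve the illustration.

```r
library(tidyr)
library(reshape2)

# Hypothetical wide table (values chosen by hand, illustration only)
wide <- data.frame(time = as.Date("2018-01-01") + 0:2,
                   X = c(1, 2, 3),
                   Y = c(10, 20, 30))

# wide -> long
long <- gather(wide, stock, price, -time)

# long -> wide, once with each package
back.spread <- spread(long, stock, price)
back.dcast <- dcast(long, time ~ stock, value.var = "price")

# The value columns should match the original table
all.equal(back.spread$X, wide$X)
all.equal(back.dcast$Y, wide$Y)
```

In the dcast formula, the left-hand side (`time`) names the columns that identify a row, and the right-hand side (`stock`) names the column whose values become the new column headers.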