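Problem using rbind to load multiple .csv files into a single data frame in R

I wrote the function below to combine 300 .csv files. My directory is named "specdata". I ran the following steps:

Step 1: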

    > x <- function(directory) {
    +   dir <- directory
    +   data_dir <- paste(getwd(), dir, sep = "/")
    +   files <- list.files(data_dir, pattern = '\\.csv')
    +   tables <- lapply(paste(data_dir, files, sep = "/"), read.csv, header = TRUE)
    +   pollutantmean <- do.call(rbind, tables)
    + }

Step 2:

    > x("specdata")

Step 3:

    > head(pollutantmean)

    Error in head(pollutantmean) : object 'pollutantmean' not found

What did I do wrong? Could anyone please explain?

Thanks in advance.

There is a lot of unnecessary code in your function. You can simplify it to:

    load_data <- function(path) {
      files <- dir(path, pattern = '\\.csv', full.names = TRUE)
      tables <- lapply(files, read.csv)
      do.call(rbind, tables)
    }

    pollutantmean <- load_data("specdata")

Note that do.call + rbind is relatively slow. You may find dplyr::bind_rows or data.table::rbindlist to be substantially faster.
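As a rough sketch of the alternatives named above, the snippet below writes three tiny CSV files to a temporary directory and binds them three ways. The file names, columns, and values are made up for illustration, and it assumes the dplyr and data.table packages are installed.

```r
# Hypothetical sample data: three small CSV files in a fresh temporary directory.
tmp <- tempfile("bind_demo")
dir.create(tmp)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = c(1.5, 2.5) * i),
            file.path(tmp, sprintf("%03d.csv", i)),
            row.names = FALSE)
}

files  <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv)

base_result  <- do.call(rbind, tables)         # base R
dplyr_result <- dplyr::bind_rows(tables)       # returns a data frame
dt_result    <- data.table::rbindlist(tables)  # returns a data.table

nrow(base_result)
```

All three calls take the same list of data frames, so swapping one for another is a one-line change. Whether the speed difference matters depends on how many files you have and how large they are.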

```{r echo = FALSE, warning = FALSE, message = FALSE}
library(magrittr)  # provides %>%

setwd("~/Data/R/BacklogReporting/data/PastDue/global/")  ## where files are located
path <- "~/Data/R/BacklogReporting/data/PastDue/global/"

out.file <- NULL  # start empty; rbind(NULL, df) returns df without a blank first row
file.names <- dir(path, pattern = "\\.csv$")
for (i in seq_along(file.names)) {
  file <- read.csv(file.names[i], header = TRUE, stringsAsFactors = FALSE)
  out.file <- rbind(out.file, file)
}

## directory to write stacked file to
write.csv(out.file,
          file = "~/Data/R/BacklogReporting/data/PastDue/global/global_stacked/past_due_global_stacked.csv",
          row.names = FALSE)

past_due_global_stacked <- read.csv(
  "C:/Users/E550143/Documents/Data/R/BacklogReporting/data/PastDue/global/global_stacked/past_due_global_stacked.csv",
  stringsAsFactors = FALSE
)

# Comma-separated list of the .csv files in the working directory
files <- list.files(pattern = "\\.csv$") %>% t() %>% paste(collapse = ", ")
```

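To update Prof. Wickham's answer with code from the more recent purrr package, which he co-authored with Lionel Henry:

    library(purrr)  # map_df(), %>%
    library(readr)  # read_csv()

    Tbl <- list.files(pattern = "\\.csv$") %>%
      map_df(~ read_csv(.))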

If the type conversion is being troublesome, you can force all the columns to be read as character:

    Tbl <- list.files(pattern = "\\.csv$") %>%
      map_df(~ read_csv(., col_types = cols(.default = "c")))

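If you want to dip into subdirectories to build the list of files you will eventually bind, be sure to include the path name and to register the files in your list by their full names. This lets the binding work on files outside the current directory. (Think of the full path names as passports that allow movement back across directory "borders".)

    Tbl <- list.files(path = "./subdirectory/", pattern = "\\.csv$", full.names = TRUE) %>%
      map_df(~ read_csv(., col_types = cols(.default = "c")))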
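To make the point about full names concrete, here is a small base R sketch. The directory layout and file name are hypothetical and are created in a temporary location so the example is self-contained.

```r
# Hypothetical layout: a temporary root with one CSV file in a subdirectory.
root <- tempfile("fullnames_demo")
sub  <- file.path(root, "subdirectory")
dir.create(sub, recursive = TRUE)
write.csv(data.frame(x = 1:2), file.path(sub, "a.csv"), row.names = FALSE)

# Without full.names, only the bare file name comes back.
bare_names <- list.files(sub, pattern = "\\.csv$")

# With full.names = TRUE, each entry carries the directory it lives in.
full_paths <- list.files(sub, pattern = "\\.csv$", full.names = TRUE)

# recursive = TRUE descends into subdirectories starting from the root.
all_paths <- list.files(root, pattern = "\\.csv$",
                        full.names = TRUE, recursive = TRUE)

# The full paths can be read regardless of the current working directory.
tbl <- do.call(rbind, lapply(full_paths, read.csv))
```

Reading `bare_names` directly would only work if the working directory happened to be the subdirectory itself, which is why the full paths are passed to the reader.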

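As Prof. Wickham describes here (about halfway down):

map_df(x, f) is effectively the same as do.call("rbind", lapply(x, f)), but more efficient under the hood.

And thanks to Jake Kaupp for introducing me to map_df() here.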
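The equivalence quoted above can be checked on a toy input. In this sketch `x` and `f` are made-up stand-ins for a file list and a reader function, and it assumes purrr and dplyr are installed (map_df() relies on dplyr for the row binding).

```r
library(purrr)  # map_df()

# Hypothetical inputs: f builds a one-row data frame from each element of x.
x <- 1:3
f <- function(i) data.frame(id = i, square = i^2)

via_base  <- do.call("rbind", lapply(x, f))
via_purrr <- map_df(x, f)

# Compare the two results column by column.
all.equal(as.data.frame(via_purrr), via_base)
```

In recent purrr releases map_df() is marked as superseded in favour of map() followed by list_rbind(), so newer code may prefer that spelling; the idea is the same.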

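In your current function, pollutantmean is only available within the scope of the function x. Modify your function to:

    x <- function(directory) {
      dir <- directory
      data_dir <- paste(getwd(), dir, sep = "/")
      files <- list.files(data_dir, pattern = '\\.csv')
      tables <- lapply(paste(data_dir, files, sep = "/"), read.csv, header = TRUE)
      assign('pollutantmean', do.call(rbind, tables), envir = .GlobalEnv)
    }

This puts the result of do.call(rbind, tables) into a variable called pollutantmean in the global environment. The envir = .GlobalEnv argument is needed because assign() otherwise creates the variable in the function's own environment, where it disappears when the function returns.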
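The scoping issue can be reproduced without any CSV files. The sketch below uses two hypothetical functions, `load_local` and `load_returning`, to contrast a local assignment with simply returning the value, which is an alternative to assigning into the global environment.

```r
# A variable assigned inside a function is local to that call.
load_local <- function() {
  combined <- data.frame(a = 1:3)  # exists only while load_local() runs
  invisible(NULL)
}
load_local()
local_visible <- exists("combined", envir = globalenv(), inherits = FALSE)

# Returning the value lets the caller choose the name it is stored under.
load_returning <- function() {
  data.frame(a = 1:3)  # the last evaluated expression is the return value
}
pollutantmean <- load_returning()
head(pollutantmean)
```

Returning the result keeps the function free of side effects, which is usually easier to reason about than writing into the global environment from inside a function.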
