How do I automatically import data from uploaded CSV or XLS files into Google Sheets?
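I have a legacy database system (not web accessible) on a server that generates CSV or XLS reports into a Google Drive folder. Currently, I open those files manually in the Drive web interface and convert them to Google Sheets.
I would rather this be automatic, so that I can create jobs that append/transform the data and graph it in other sheets.
Is it possible to output a native .gsheet file? Or is there a way to programmatically convert CSV or XLS files to .gsheet after they are saved to Google Drive, either within Google Apps or via a Windows-based script/utility?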
You can use Google Apps Script to programmatically import data from a CSV file in your Drive into an existing Google Sheet, replacing or appending data as needed.
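Below is some sample code. It assumes that: a) you have a designated folder in your Drive where the CSV file is saved/uploaded; b) the CSV file is named "report.csv" and its data is comma-delimited; and c) the CSV data is imported into a designated spreadsheet. See the comments in the code for further details.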
    function importData() {
      var fSource = DriveApp.getFolderById(reports_folder_id); // reports_folder_id = id of folder where csv reports are saved
      var fi = fSource.getFilesByName('report.csv'); // latest report file
      var ss = SpreadsheetApp.openById(data_sheet_id); // data_sheet_id = id of spreadsheet that holds the data to be updated with new report data

      if ( fi.hasNext() ) { // proceed if "report.csv" file exists in the reports folder
        var file = fi.next();
        var csv = file.getBlob().getDataAsString();
        var csvData = CSVToArray(csv); // see below for CSVToArray function
        var newsheet = ss.insertSheet('NEWDATA'); // create a 'NEWDATA' sheet to store imported data
        // loop through csv data array and insert (append) as rows into 'NEWDATA' sheet
        for ( var i=0, lenCsv=csvData.length; i<lenCsv; i++ ) {
          newsheet.getRange(i+1, 1, 1, csvData[i].length).setValues([csvData[i]]);
        }
        /*
        ** report data is now in 'NEWDATA' sheet in the spreadsheet - process it as needed,
        ** then delete 'NEWDATA' sheet using ss.deleteSheet(newsheet)
        */
        // rename the report.csv file so it is not processed on next scheduled run
        file.setName("report-"+(new Date().toString())+".csv");
      }
    };

    // http://www.bennadel.com/blog/1504-Ask-Ben-Parsing-CSV-Strings-With-Javascript-Exec-Regular-Expression-Command.htm
    // This will parse a delimited string into an array of
    // arrays. The default delimiter is the comma, but this
    // can be overridden in the second argument.
    function CSVToArray( strData, strDelimiter ) {
      // Check to see if the delimiter is defined. If not,
      // then default to COMMA.
      strDelimiter = (strDelimiter || ",");

      // Create a regular expression to parse the CSV values.
      var objPattern = new RegExp(
        (
          // Delimiters.
          "(\\" + strDelimiter + "|\\r?\\n|\\r|^)" +
          // Quoted fields.
          "(?:\"([^\"]*(?:\"\"[^\"]*)*)\"|" +
          // Standard fields.
          "([^\"\\" + strDelimiter + "\\r\\n]*))"
        ),
        "gi"
      );

      // Create an array to hold our data. Give the array
      // a default empty first row.
      var arrData = [[]];

      // Create an array to hold our individual pattern
      // matching groups.
      var arrMatches = null;

      // Keep looping over the regular expression matches
      // until we can no longer find a match.
      while (arrMatches = objPattern.exec( strData )){

        // Get the delimiter that was found.
        var strMatchedDelimiter = arrMatches[ 1 ];

        // Check to see if the given delimiter has a length
        // (is not the start of string) and if it matches
        // field delimiter. If it does not, then we know
        // that this delimiter is a row delimiter.
        if (
          strMatchedDelimiter.length &&
          (strMatchedDelimiter != strDelimiter)
        ){
          // Since we have reached a new row of data,
          // add an empty row to our data array.
          arrData.push( [] );
        }

        // Now that we have our delimiter out of the way,
        // let's check to see which kind of value we
        // captured (quoted or unquoted).
        if (arrMatches[ 2 ]){
          // We found a quoted value. When we capture
          // this value, unescape any double quotes.
          var strMatchedValue = arrMatches[ 2 ].replace(
            new RegExp( "\"\"", "g" ),
            "\""
          );
        } else {
          // We found a non-quoted value.
          var strMatchedValue = arrMatches[ 3 ];
        }

        // Now that we have our value string, let's add
        // it to the data array.
        arrData[ arrData.length - 1 ].push( strMatchedValue );
      }

      // Return the parsed data.
      return( arrData );
    };
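You can then create a time-driven trigger in your script project to run the importData() function on a regular basis (e.g. every night at 1 AM), so all you have to do is put a new report.csv file into the designated Drive folder, and it will be processed automatically on the next scheduled run.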
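If you absolutely must work with Excel files instead of CSV, you can use the code below. For it to work, you must enable the Drive API under Advanced Google Services in your script and in the Developers Console (see How to Enable Advanced Services for details).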
    /**
     * Convert Excel file to Sheets
     * @param {Blob} excelFile The Excel file blob data; Required
     * @param {String} filename File name on uploading drive; Required
     * @param {Array} arrParents Array of folder ids to put converted file in; Optional, will default to Drive root folder
     * @return {Spreadsheet} Converted Google Spreadsheet instance
     **/
    function convertExcel2Sheets(excelFile, filename, arrParents) {

      var parents = arrParents || []; // check if optional arrParents argument was provided, default to empty array if not
      if ( !Array.isArray(parents) ) parents = []; // make sure parents is an array, reset to empty array if not

      // Parameters for Drive API Simple Upload request (see https://developers.google.com/drive/web/manage-uploads#simple)
      var uploadParams = {
        method:'post',
        contentType: 'application/vnd.ms-excel', // works for both .xls and .xlsx files
        contentLength: excelFile.getBytes().length,
        headers: {'Authorization': 'Bearer ' + ScriptApp.getOAuthToken()},
        payload: excelFile.getBytes()
      };

      // Upload file to Drive root folder and convert to Sheets
      var uploadResponse = UrlFetchApp.fetch('https://www.googleapis.com/upload/drive/v2/files/?uploadType=media&convert=true', uploadParams);

      // Parse upload&convert response data (need this to be able to get id of converted sheet)
      var fileDataResponse = JSON.parse(uploadResponse.getContentText());

      // Create payload (body) data for updating converted file's name and parent folder(s)
      var payloadData = {
        title: filename,
        parents: []
      };
      if ( parents.length ) { // Add provided parent folder(s) id(s) to payloadData, if any
        for ( var i=0; i<parents.length; i++ ) {
          try {
            var folder = DriveApp.getFolderById(parents[i]); // check that this folder id exists in drive and user can write to it
            payloadData.parents.push({id: parents[i]});
          }
          catch(e){} // fail silently if no such folder id exists in Drive
        }
      }
      // Parameters for Drive API File Update request (see https://developers.google.com/drive/v2/reference/files/update)
      var updateParams = {
        method:'put',
        headers: {'Authorization': 'Bearer ' + ScriptApp.getOAuthToken()},
        contentType: 'application/json',
        payload: JSON.stringify(payloadData)
      };

      // Update metadata (filename and parent folder(s)) of converted sheet
      UrlFetchApp.fetch('https://www.googleapis.com/drive/v2/files/'+fileDataResponse.id, updateParams);

      return SpreadsheetApp.openById(fileDataResponse.id);
    }

    /**
     * Sample use of convertExcel2Sheets() for testing
     **/
    function testConvertExcel2Sheets() {
      var xlsId = "0B9**************OFE"; // ID of Excel file to convert
      var xlsFile = DriveApp.getFileById(xlsId); // File instance of Excel file
      var xlsBlob = xlsFile.getBlob(); // Blob source of Excel file for conversion
      var xlsFilename = xlsFile.getName(); // File name to give to converted file; defaults to same as source file
      var destFolders = []; // array of IDs of Drive folders to put converted file in; empty array = root folder
      var ss = convertExcel2Sheets(xlsBlob, xlsFilename, destFolders);
      Logger.log(ss.getId());
    }
The code above is also available as a gist here.
You can get Google Drive to convert CSV files to Google Sheets automatically by appending ?convert=true to the end of the API URL you are calling.
EDIT: Here is the documentation on the available parameters: https://developers.google.com/drive/v2/reference/files/insert
Also, while searching for the link above, I found that this question has already been answered here:
Upload CSV to Google Drive Spreadsheet using Drive v2 API
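As a rough illustration of the point above, here is a minimal Python sketch that only builds the v2 upload URL with the conversion flag in its query string. The endpoint is the one used in the Apps Script example earlier in this thread; the helper name build_v2_upload_url and its parameters are made up for this sketch, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Drive v2 upload endpoint, as used in the Apps Script example earlier in this thread.
V2_UPLOAD_ENDPOINT = 'https://www.googleapis.com/upload/drive/v2/files'


def build_v2_upload_url(convert=True, upload_type='media'):
    """Return the v2 upload URL with uploadType and convert in the query string.

    Hypothetical helper for illustration only; it builds a string and
    performs no network request.
    """
    params = {
        'uploadType': upload_type,
        'convert': 'true' if convert else 'false',
    }
    return V2_UPLOAD_ENDPOINT + '?' + urlencode(params)


url = build_v2_upload_url()
print(url)
# -> https://www.googleapis.com/upload/drive/v2/files?uploadType=media&convert=true

# The flag can be read back out of the query string:
print(parse_qs(urlsplit(url).query)['convert'])
# -> ['true']
```

The authenticated POST of the CSV bytes to that URL is left out, since it depends on how your system obtains its OAuth token.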
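(Mar 2017) The accepted answer is not the best solution. It relies on manual translation using Apps Script, and the code may not be resilient, so it requires maintenance. If your legacy system autogenerates CSV files, it is best that they go into another folder for temporary processing (importing, i.e. uploading to Google Drive and converting to Google Sheets files).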
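My thought is to let the Drive API do all the heavy lifting. The Google Drive API team released v3 at the end of 2015, and in that release insert() was renamed to create() to better reflect the file operation. There is also no more convert flag; you just specify MIME types... imagine that!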
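The documentation has also been improved: there is now a dedicated guide to uploads (simple, multipart, and resumable) that comes with sample code in Java, Python, PHP, C#/.NET, Ruby, JavaScript/Node.js, and iOS/Obj-C that imports CSV files into Google Sheets format as desired.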
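Below is one alternate Python solution for short files ("simple upload") where you do not need the apiclient.http.MediaFileUpload class. This snippet assumes your auth code works, that your service endpoint is DRIVE, and that it has a minimum auth scope of https://www.googleapis.com/auth/drive.file.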
    # filenames & MIMEtypes
    DST_FILENAME = 'inventory'
    SRC_FILENAME = DST_FILENAME + '.csv'
    SHT_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
    CSV_MIMETYPE = 'text/csv'

    # Import CSV file to Google Drive as a Google Sheets file
    METADATA = {'name': DST_FILENAME, 'mimeType': SHT_MIMETYPE}
    rsp = DRIVE.files().create(body=METADATA, media_body=SRC_FILENAME).execute()
    if rsp:
        print('Imported %r to %r (as %s)' % (SRC_FILENAME, DST_FILENAME, rsp['mimeType']))
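Better yet, rather than uploading to My Drive, you would upload to one (or more) specific folder(s), meaning you would add the parent folder ID(s) to METADATA. (Also see the code sample on this page.) Finally, there is no native .gsheet "file"; that file just contains a link to the online Sheet, so what is shown above is what you want to do.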
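To make the parent-folder suggestion concrete, here is a small sketch of how METADATA might be built with folder IDs included. In Drive API v3 the parents field is a plain list of folder ID strings. The helper name build_metadata and the placeholder FOLDER_ID are assumptions for this sketch, not part of the original snippet.

```python
SHT_MIMETYPE = 'application/vnd.google-apps.spreadsheet'


def build_metadata(name, folder_ids=None):
    """Build Drive v3 file metadata requesting conversion to Google Sheets.

    Hypothetical helper: folder_ids is an optional list of Drive folder IDs.
    When it is omitted, no 'parents' key is set and the file lands in My Drive.
    """
    metadata = {'name': name, 'mimeType': SHT_MIMETYPE}
    if folder_ids:
        # v3 expects 'parents' as a list of folder ID strings.
        metadata['parents'] = list(folder_ids)
    return metadata


# FOLDER_ID is a placeholder, not a real Drive folder ID.
FOLDER_ID = 'YOUR_FOLDER_ID'
METADATA = build_metadata('inventory', [FOLDER_ID])
print(METADATA)
```

The resulting dictionary would be passed as body=METADATA to DRIVE.files().create(...), exactly as in the snippet above.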
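If you are not using Python, you can use the snippet above as pseudocode to port to your system's language. Regardless, there is much less code to maintain because there is no CSV parsing. The only thing remaining is to blow away the temp folder of CSV files that your legacy system wrote to.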
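One possible shape for that last cleanup step is sketched below, assuming the legacy system writes its CSV files to a local staging folder. The function name import_and_clean and the upload callback are hypothetical. In practice the callback would wrap the DRIVE.files().create(...) call shown earlier and return its response. Each file is deleted only after its upload reports success, so a failed import is retried on the next run.

```python
import glob
import os


def import_and_clean(staging_dir, upload):
    """Upload every CSV in staging_dir and delete each one once it is imported.

    upload(path, name) is a caller-supplied (hypothetical) callback that
    returns a truthy value on success. Files whose upload fails are left
    in place so they can be retried later.
    """
    imported = []
    for path in sorted(glob.glob(os.path.join(staging_dir, '*.csv'))):
        # Use the file name without its .csv extension as the Sheet name.
        name = os.path.splitext(os.path.basename(path))[0]
        if upload(path, name):
            os.remove(path)
            imported.append(name)
    return imported


# Example wiring (not executed here, since DRIVE comes from your auth code):
#
# def upload_to_sheets(path, name):
#     metadata = {'name': name,
#                 'mimeType': 'application/vnd.google-apps.spreadsheet'}
#     return DRIVE.files().create(body=metadata, media_body=path).execute()
#
# import_and_clean('/path/to/staging', upload_to_sheets)
```

Deleting per file rather than wiping the whole folder at the end is a design choice of this sketch; it avoids losing a report that arrived while the job was running.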