How to write to an existing Excel file without overwriting data (using pandas)?
I use pandas to write to Excel files in the following way:
import pandas

writer = pandas.ExcelWriter('Masterfile.xlsx')
data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
writer.save()
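Masterfile.xlsx already consists of a number of different tabs.
pandas writes to the "Main" sheet correctly; unfortunately, it also deletes all the other tabs.
The pandas docs say it uses openpyxl for xlsx files. A quick look through the code in ExcelWriter gives a clue that something like this might work: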
import pandas
from openpyxl import load_workbook

book = load_workbook('Masterfile.xlsx')
writer = pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])

writer.save()
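On newer pandas releases (1.3 or later, as far as I can tell), the same idea can probably be expressed without touching `writer.book`, by opening the writer in append mode. The sketch below is only an illustration under that assumption: `replace_sheet`, `demo_master.xlsx` and the "Other" sheet are made-up names, and the throwaway workbook stands in for Masterfile.xlsx.

```python
import os
import tempfile

import pandas as pd


def replace_sheet(path, df, sheet_name):
    """Write df to sheet_name in an existing workbook, keeping the other sheets.

    Hypothetical helper; assumes pandas >= 1.3 with openpyxl installed.
    """
    # mode="a" opens the existing file instead of starting from an empty workbook;
    # if_sheet_exists="replace" swaps out only the named sheet.
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="replace") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)


# Build a throwaway two-sheet workbook (a stand-in for Masterfile.xlsx).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_master.xlsx")
with pd.ExcelWriter(demo_path, engine="openpyxl") as writer:
    pd.DataFrame({"Diff1": [1], "Diff2": [2]}).to_excel(
        writer, sheet_name="Main", index=False)
    pd.DataFrame({"x": [10, 20]}).to_excel(
        writer, sheet_name="Other", index=False)

# Overwrite only "Main"; "Other" should survive untouched.
replace_sheet(demo_path, pd.DataFrame({"Diff1": [3, 4], "Diff2": [5, 6]}), "Main")

# Read every sheet back as a dict of sheet name -> DataFrame.
sheets = pd.read_excel(demo_path, sheet_name=None)
```

The context manager saves the file on exit, so no explicit `writer.save()` call is needed here.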
For me, skyjur's answer almost worked. I had to explicitly set the engine for the writer:
writer = pd.ExcelWriter(excel_file, engine='openpyxl')
Otherwise it throws:
AttributeError: 'Workbook' object has no attribute 'add_worksheet'
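A plausible reason for this error (my reading, not something stated in the thread): when no engine is given and xlsxwriter is installed, pandas may pick xlsxwriter for .xlsx files, and that engine calls `add_worksheet` on the workbook. The book loaded with openpyxl has no such method; its equivalent is `create_sheet`. A quick check, assuming openpyxl is installed:

```python
from openpyxl import Workbook

book = Workbook()

# xlsxwriter-style method, which an openpyxl Workbook does not provide
has_add_worksheet = hasattr(book, "add_worksheet")

# openpyxl's own method for adding a sheet
has_create_sheet = hasattr(book, "create_sheet")
```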
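Old question, but I am guessing some people are still searching for this, so...
I find this method nice because all worksheets are loaded into a dictionary of sheet name and dataframe pairs, created by pandas with the sheetname=None option. It is simple to add, delete or modify worksheets between reading the spreadsheet into the dict format and writing it back from the dict. For me, xlsxwriter works better than openpyxl for this particular task in terms of speed and format.
Note: future versions of pandas (0.21.0+) will change the "sheetname" parameter to "sheet_name".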
# read a single or multi-sheet excel file
# (returns dict of sheetname(s), dataframe(s))
ws_dict = pd.read_excel(excel_file_path, sheetname=None)

# all worksheets are accessible as dataframes.

# easy to change a worksheet as a dataframe:
mod_df = ws_dict['existing_worksheet']

# do work on mod_df...then reassign
ws_dict['existing_worksheet'] = mod_df

# add a dataframe to the workbook as a new worksheet with
# ws name, df as dict key, value:
ws_dict['new_worksheet'] = some_other_dataframe

# when done, write dictionary back to excel...
# xlsxwriter honors datetime and date formats
# (only included as example)...
with pd.ExcelWriter(excel_file_path,
                    engine='xlsxwriter',
                    datetime_format='yyyy-mm-dd',
                    date_format='yyyy-mm-dd') as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)
For the example in the 2013 question:
ws_dict = pd.read_excel('Masterfile.xlsx', sheetname=None)

ws_dict['Main'] = data_filtered[['Diff1', 'Diff2']]

with pd.ExcelWriter('Masterfile.xlsx', engine='xlsxwriter') as writer:
    for ws_name, df_sheet in ws_dict.items():
        df_sheet.to_excel(writer, sheet_name=ws_name)
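Because the keyword is "sheetname" before pandas 0.21.0 and "sheet_name" from then on, code that has to run on both could choose the keyword from the installed version. `sheet_kwarg` below is a hypothetical helper of my own, not part of pandas, and it only looks at the major and minor version numbers:

```python
def sheet_kwarg(pandas_version):
    """Return the read_excel keyword that requests all sheets.

    Hypothetical helper: picks 'sheetname' for pandas < 0.21 and
    'sheet_name' otherwise, based on a version string like '0.19.2'.
    """
    major, minor = (int(part) for part in pandas_version.split(".")[:2])
    name = "sheet_name" if (major, minor) >= (0, 21) else "sheetname"
    return {name: None}
```

It would be used roughly as `pd.read_excel('Masterfile.xlsx', **sheet_kwarg(pd.__version__))`.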
With openpyxl version 2.4.0 and pandas version 0.19.2, the process @ski came up with gets a bit simpler:
import pandas
from openpyxl import load_workbook

with pandas.ExcelWriter('Masterfile.xlsx', engine='openpyxl') as writer:
    writer.book = load_workbook('Masterfile.xlsx')
    data_filtered.to_excel(writer, "Main", cols=['Diff1', 'Diff2'])
# That's it!
import pandas
from openpyxl import load_workbook

def append_sheet_to_master(self, master_file_path, current_file_path, sheet_name):
    try:
        # load the existing master workbook so its sheets are kept
        master_book = load_workbook(master_file_path)
        master_writer = pandas.ExcelWriter(master_file_path, engine='openpyxl')
        master_writer.book = master_book
        master_writer.sheets = dict((ws.title, ws) for ws in master_book.worksheets)
        # read the first sheet of the current file without header or index
        current_frames = pandas.ExcelFile(current_file_path).parse(
            pandas.ExcelFile(current_file_path).sheet_names[0],
            header=None,
            index_col=None)
        # write it into the master workbook under the given sheet name
        current_frames.to_excel(master_writer, sheet_name, index=None, header=False)
        master_writer.save()
    except Exception as e:
        raise e
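This works perfectly fine; the only problem is that the formatting of the master file (the file to which we add the new sheet) is lost.
I know this is an older thread, but it is the first item you find when searching, and the above solutions don't work if you need to retain charts in a workbook that has already been created. In that case, xlwings is a better option: it allows you to write to the Excel book and keeps the charts and chart data.
Simple example: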
import xlwings as xw
import pandas as pd

# create DF
months = ['2017-01', '2017-02', '2017-03', '2017-04', '2017-05', '2017-06',
          '2017-07', '2017-08', '2017-09', '2017-10', '2017-11', '2017-12']
value1 = [x * 5 + 5 for x in range(len(months))]
df = pd.DataFrame(value1, index=months, columns=['value1'])
df['value2'] = df['value1'] + 5
df['value3'] = df['value2'] + 5

# load workbook that has a chart in it
wb = xw.Book('C:\\data\\bookwithChart.xlsx')

ws = wb.sheets['chartData']

# write the dataframe (without its index) starting at cell A1
ws.range('A1').options(index=False).value = df

# save under a new name so the original workbook is left untouched
wb.save('C:\\data\\bookwithChart_updated.xlsx')

xw.apps[0].quit()