Read merged cells in Excel with Python - Stack Overflow

I am trying to read merged cells of Excel with Python using xlrd.
My Excel: (note that the first column is merged across the three rows)
    A   B   C
  +---+---+----+
1 | 2 | 0 | 30 |
  +   +---+----+
2 |   | 1 | 20 |
  +   +---+----+
3 |   | 5 | 52 |
  +---+---+----+
I would like to read the third line of the first column as equal to 2 in this example, but it returns ''. Do you have any idea how to get to the value of the merged cell?
My code:
import xlrd

all_data = []
excel = xlrd.open_workbook(excel_dir+ excel_file)
sheet_0 = excel.sheet_by_index(0) # Open the first tab
for row_index in range(sheet_0.nrows):
    row= ""
    for col_index in range(sheet_0.ncols):
        value = sheet_0.cell(rowx=row_index,colx=col_index).value             
        row += "{0} ".format(value)
        split_row = row.split()   
    all_data.append(split_row)
What I get:
'2', '0', '30'
'1', '20'
'5', '52'
What I would like to get:
'2', '0', '30'
'2', '1', '20'
'2', '5', '52'
                Can you make the question reproducible? We would like to see raw data and code you use to import it.
                    – Roman Luštrik
                Jun 9 '15 at 9:00
                If you do a print all_data after the for loop, what do you get? And what do you expect?
                    – shruti1810
                Jun 9 '15 at 18:04
import xlrd

all_data = []
excel = xlrd.open_workbook(excel_dir + excel_file)
sheet_0 = excel.sheet_by_index(0)  # Open the first tab
prev_row = [None] * sheet_0.ncols  # values carried over from the row above
for row_index in range(sheet_0.nrows):
    row = []
    for col_index in range(sheet_0.ncols):
        value = sheet_0.cell(rowx=row_index, colx=col_index).value
        # xlrd returns '' for empty cells; numeric cells are floats, so len() would fail on them
        if value == '':
            value = prev_row[col_index]
        row.append(value)
    prev_row = row
    all_data.append(row)
returning
[['2', '0', '30'], ['2', '1', '20'], ['2', '5', '52']]
It keeps track of the values from the previous row and uses them if the corresponding value from the current row is empty.
Note that the above code does not check if a given cell is actually part of a merged set of cells, so it could possibly duplicate previous values in cases where the cell should really be empty. Still, it might be of some help.
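To make that caveat concrete, here is a minimal sketch of the same previous-row logic on plain lists, so it runs without a workbook. The helper name fill_from_previous_row and the sample rows are assumptions for illustration; the last cell of the third row stands for a cell that is genuinely empty rather than merged.

```python
def fill_from_previous_row(rows):
    """Replace every '' with the value directly above it, merged or not."""
    filled = []
    prev_row = None
    for row in rows:
        if prev_row is None:
            row = list(row)
        else:
            row = [prev if value == "" else value
                   for value, prev in zip(row, prev_row)]
        prev_row = row
        filled.append(row)
    return filled


# Column 0 is merged over three rows; the last cell of row 2 is truly empty.
rows = [["2", "0", "30"],
        ["", "1", "20"],
        ["", "5", ""]]
result = fill_from_previous_row(rows)
```

Here result[2][2] ends up as '20' although that cell was never merged, which is exactly the duplication described above.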
Additional information:
I subsequently found a documentation page that describes a merged_cells attribute, which can be used to determine which cells belong to each range of merged cells. The documentation says that it is "New in version 0.6.1", but when I tried to use it with xlrd-0.9.3 as installed by pip I got the error
  NotImplementedError: formatting_info=True not yet implemented
I'm not particularly inclined to start chasing down different versions of xlrd to test the merged_cells feature, but perhaps you might be interested in doing so if the above code is insufficient for your needs and you encounter the same error that I did with formatting_info=True.
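For reference, each entry in merged_cells is a tuple (rlo, rhi, clo, chi) of zero-based indices whose upper bounds are exclusive. A minimal sketch of how such ranges could be applied, assuming they have already been read from the sheet; the helper name fill_merged is hypothetical and it works on plain lists so that it can be tried without a workbook:

```python
def fill_merged(rows, merged_cells):
    """Return a copy of rows with each merged range filled from its top-left cell."""
    filled = [list(row) for row in rows]
    for rlo, rhi, clo, chi in merged_cells:
        top_left = rows[rlo][clo]
        # rhi and chi are exclusive, so range() covers exactly the merged block
        for r in range(rlo, rhi):
            for c in range(clo, chi):
                filled[r][c] = top_left
    return filled


rows = [["2", "0", "30"],
        ["", "1", "20"],
        ["", "5", ""]]
merged = [(0, 3, 0, 1)]  # column A merged across rows 1 to 3
result = fill_merged(rows, merged)
```

Only cells inside a merged range are touched, so the empty cell at the end of the third row stays empty.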
                Even more info in this mailing list thread. formatting_info is unsupported for .xlsx files, unfortunately.
                    – John Y
                Jan 17 '18 at 5:40
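You can also try the fillna method available in pandas:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html
import pandas as pd

excel = pd.read_excel(dir + filename, header=1)
# Newer pandas versions deprecate fillna(method='ffill'); excel[ColName].ffill() is equivalent
excel[ColName] = excel[ColName].fillna(method='ffill')
This should replace each empty cell in that column with the previous non-empty value above it.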
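A minimal sketch of the forward fill on in-memory data, assuming a frame shaped like the question's sheet in which the merged cells were read as NaN. The column names A, B and C follow the question's diagram, and the NaN in column B is an assumed, genuinely empty cell added for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [2, np.nan, np.nan],   # merged cell: only the first row holds the value
    "B": [0, np.nan, 5],        # a genuinely empty cell, not part of a merge
    "C": [30, 20, 52],
})

# Forward-fill only the column known to contain merged cells
df["A"] = df["A"].ffill()
```

Restricting the fill to the affected column leaves the empty cell in B alone; filling the whole frame would overwrite it with the value above.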
For those who want to handle merged cells the way the OP asked, without overwriting empty cells that are not merged:
Based on the OP's code and the additional information in @gordthompson's answer and @stavinsky's comment, the following code works for Excel files (xls, xlsx). It reads the first sheet of the Excel file as a DataFrame and, for each merged range, replicates the merged cell's content over all the cells that the range covers, as the original poster asked. Note that the merged_cells feature of xlrd only works for 'xls' files if the 'formatting_info' parameter is passed when opening the workbook.
import pandas as pd
import xlrd

filepath = excel_dir + excel_file
if excel_file.endswith('.xlsx'):
    book = xlrd.open_workbook(filepath)
elif excel_file.endswith('.xls'):
    # merged_cells is only populated for .xls when formatting_info=True
    book = xlrd.open_workbook(filepath, formatting_info=True)
else:
    raise ValueError("don't yet know how to handle other excel file formats")
excel = pd.ExcelFile(book, engine='xlrd')
sheet_0 = book.sheet_by_index(0)  # Open the first tab
df = excel.parse(0, header=None)  # read the first tab as a DataFrame
for rl, rh, cl, ch in sheet_0.merged_cells:
    # (rl, rh, cl, ch): row and column bounds of one merged range, upper bounds exclusive
    base_value = sheet_0.cell_value(rl, cl)  # value stored in the top-left cell
    print((rl, rh, cl, ch), base_value)
    df.iloc[rl:rh, cl:ch] = base_value
I tried the previous solutions without success; nevertheless, the following worked for me:
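import xlrd

# Assumed here: an .xls file, which needs formatting_info=True for merged_cells;
# for .xlsx, open without formatting_info, as in the previous answer
book = xlrd.open_workbook(excel_dir + excel_file, formatting_info=True)
sheet = book.sheet_by_index(0)
all_data = []
for row_index in range(sheet.nrows):
    row = []
    for col_index in range(sheet.ncols):
        valor = sheet.cell(row_index, col_index).value
        if valor == '':
            # Empty cell: check whether it lies inside a merged range
            for crange in sheet.merged_cells:
                rlo, rhi, clo, chi = crange
                if rlo <= row_index < rhi and clo <= col_index < chi:
                    valor = sheet.cell(rlo, clo).value  # take the top-left value
                    break
        row.append(valor)
    all_data.append(row)
print(all_data)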
I hope it serves someone in the future
import pandas as pd
import xlrd

ExcelFile = pd.read_excel("Excel_File.xlsx")  # the first sheet row becomes the header
xl = xlrd.open_workbook("Excel_File.xlsx")
FirstSheet = xl.sheet_by_index(0)
for crange in FirstSheet.merged_cells:
    rlo, rhi, clo, chi = crange
    for rowx in range(rlo, rhi):
        for colx in range(clo, chi):
            value = FirstSheet.cell(rowx, colx).value
            # Sheet row 0 is the DataFrame header, so sheet row n is DataFrame row n-1
            if value == '' and rowx > 0:
                ExcelFile.iloc[rowx - 1, colx] = FirstSheet.cell(rlo, clo).value
With this function you can get an array like ['A1:M1', 'B22:B27'], which tells you which cells are merged.
openpyxl.worksheet.merged_cells
This function shows you whether a cell has been merged or not
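A minimal sketch of how the merged ranges reported by openpyxl might be used to fill a grid of values, which is also relevant because xlrd 2.0 and later read only .xls files. It assumes a recent openpyxl in which ws.merged_cells.ranges yields range objects with min_row, max_row, min_col and max_col; the helper name read_with_merged_filled and the in-memory workbook are assumptions for illustration:

```python
from openpyxl import Workbook


def read_with_merged_filled(ws):
    """Return the sheet as a list of rows, copying each merged range's
    top-left value into every cell that the range covers."""
    rows = [[cell.value for cell in row]
            for row in ws.iter_rows(min_row=1, min_col=1,
                                    max_row=ws.max_row, max_col=ws.max_column)]
    for rng in ws.merged_cells.ranges:
        top_left = ws.cell(row=rng.min_row, column=rng.min_col).value
        # openpyxl bounds are 1-based and inclusive
        for r in range(rng.min_row, rng.max_row + 1):
            for c in range(rng.min_col, rng.max_col + 1):
                rows[r - 1][c - 1] = top_left
    return rows


# Build the question's sheet in memory instead of loading a file
wb = Workbook()
ws = wb.active
for row in [[2, 0, 30], [None, 1, 20], [None, 5, 52]]:
    ws.append(row)
ws.merge_cells("A1:A3")

filled = read_with_merged_filled(ws)
```

For a real file, the worksheet would come from openpyxl.load_workbook(...) instead of the in-memory workbook built here.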