Don’t do that in Excel

Excel isn’t maybe great, but it’s useful for many things. That being said, I don’t use it at all in private life. But am using it a lot, not to say too much, at work:( It wouldn’t be so bad if Excel files produced by humans were also used/edited only by humans and Excel files produced by machines were also used/edited only by machines. Usually, it’s a mix. And it’s the basis of most problems I’ve encountered with Excel files. Not saying machines are perfect and people always do Excel wrong, but in most cases when something goes wrong it’s because of human. And I can’t influence machines, so in this post I’ll try to list things you should avoid when creating/working with Excel files which will be consumed by a machine.

Important information for people editing it
A note for the boss
Description of what this worksheet is about
ColumnA ColumnB ColumnC

For some reason, people love to put a lot of bloat in “header” section of worksheet. Don’t do that. I know that it may look nice to your boss, but most tools won’t be able to properly process a worksheet if first row doesn’t contain column names.

You should also avoid hiding rows/columns. What’s hidden when looked at in Excel is perfectly readable by a tool and can lead to unexpected behavior. Same goes for filtering, some tools may recognize filter applied and read only visible rows, but others won’t. Keep it in mind. If you want to make sure third party tool will read only specific rows and columns, create Excel file containing only that.

ColumnA ColumnB ColumnC ColumnB

Column names should be explicit. If you’re duplicating the names you’re doing it wrong. It won’t only break it for external tools, but also for Excel itself. Try to create a PivotTable from a worksheet with columns as above. It’ll let you, but second ColumnB will be visible as ColumnB2 and often it’s not so obvious which is which. Even worse, if you’ll have empty column headers. This time PivotTable won’t apply a workaround.

ColumnA ColumnB ColumnC ColumnD
ColumnA ColumnC ColumnB ColumnD
ColumnA ColumnB ColumnC ColumnD
ColumnA ColumnB ColumnC ColumnWithNewName

Another common sin is to change the order of columns or names of columns when updating a file. A tool configured to read the first file you’ve produced won’t properly read updated file if you’ve basically changed its structure. I know that we’re thrilled with “AI” and believe computers can think now. But most of tools for processing Excel files are still “dumb” and won’t recognize your brilliant and completely justified change to structure. And if you’re messing with your columns like that in a single Excel file for the same type of data, there’s a special room in hell for you.

Merged cells

Merged cells. If you really have to, use them, but do it consistently. When you’ve decided to merge two (or more) cells, do it for each row/column. Having split and merged cells for same rows/columns is a recipe for disaster.

Broken data formats is a topic which could easily fill another post. Just to briefly mention few most common examples:

  • Dates in different formats, like “3/6/2025” in one cell and “2025-03-06” in another, in the same column.
  • Dates saved as text, very few tools can handle them properly.
  • Dates with added notes, seen that, cried a lot.
  • Numbers saved as text.
  • Numbers saved as text and decimal point is mix of “.” and “,”.

Next one isn’t necessary a problematic thing, but I really dislike it. Adding formatting (like bold, italics, whatnot) to part of text in given cell. Excel is not a Word, if you need your text to look pretty maybe you’re doing something wrong.

ColumnA ColumnB ColumnC ColumnD
Text TextB Something Else
Text2 TextC Don’t Please
ColumnA Name ColumnC ColumnD ColumnE
Text Date 3534 23.4343 Another

Empty row(s). There are tools which would only read till first empty row. Then you can make it even worse by putting another table in the same worksheet with different column structure and/or different types of data. Even if your tool will be able to read it, it’ll most probably use column mapping from the first table, so your data would be rubbish. And even if columns are the same, you’ll still get column names in your data, as second header row will be read as a regular row, and you probably don’t want that. I now have a confession to make, I’ve added such functionality to my ExcelORM as I’ve needed it at work. But it’s really bad thing to do and should be avoided.

Hope this text will help somebody, and maybe even make my life a bit easier. When working with Excel, please remember about this simple rule “the prettier you’ll make it for humans, the harder it would be for machines to read”.