How to combine multiple Excel files into one whilst merging row data

If you have a number of Excel spreadsheets with differing columns in each, but with a common unique column which can be used to merge the row data, it’s likely that your best option to merge them (for free) is to use a VBA macro.

This article will walk you through building a VBA macro in Excel that will combine multiple Excel files into one whilst merging row data where appropriate. The code used here is based on the code in a previous article, How to merge Excel files with different columns.

To help explain how the merge should work, we’ll use a set of example spreadsheets:

  • The first spreadsheet has columns first_name, last_name, company_name, city, phone1, email:Merge Excel files differing columns File1
  • The second spreadsheet has columns email, company_name, phone2:
    Merge Excel files differing columns File2
  • The third spreadsheet has columns email, first_name, last_name, date_of_birth:
    Merge Excel files differing columns File3

Here, you’d like to merge the data from each spreadsheet into just one sheet in one spreadsheet with the same set of columns; first_name, last_name, company_name, city, phone1, phone2, email, date_of_birth:Merge Excel files differing columns Output

We can see that the mapping of columns from the source spreadsheets to the output would need to work as follows:

Spreadsheet first_name last_name company_name city phone1 phone2 email date_of_birth
ExcelFile1.xlsx first_name last_name company_name city phone1 email
ExcelFile2.xlsx company_name phone2 email
ExcelFile3.xlsx first_name last_name phone2 email date_of_birth

We have now defined the structure of the output spreadsheet that is required and we have worked out which column from each source spreadsheet maps to which column in the output. The mapping also shows us that we have a common column, email. Since email addresses are unique, we can use this as our key column for identifying when row data can be merged.

It is now possible for us to define the VBA code we need to merge these spreadsheets.

Excel Document StackNot exactly what you need?
Have a look at The Ultimate Guide on how to Merge Excel files for some other ways to merge Excel files.

Setting up the VBA code

To keep things simple for this example, we’ll assume that all the Excel files to merge are located in a single folder. To ensure we can map the columns from the source spreadsheets to the output, we also need to know the names of each of the spreadsheets that will be merged.

The example spreadsheet can be downloaded from the Downloads section at the end of this article.

  1. Create a folder for the Excel files to be merged
  2. Copy & paste all the source Excel files into this new folder
  3. Create a new Excel workbook in a location that is NOT the folder created in step 1
  4. In this new workbook, click on the Developer ribbon tab
    Excel Developer Ribbon Tab
    If the Developer ribbon tab is not present, follow these steps to get it
  5. Click the Macros button in the Developer tab (The Macros button on a Mac)
    Excel Macros Button
  6. In the dialog box which opens, type the name MergeExcelFilesAndMergeRows and click the ‘Create’ button
    Excel Create VBA Macro
  7. The Visual Basic Editor will open:
    Excel VBA Macro Editor
  8. Copy & paste (or re-type by hand if you prefer!) the code below between Sub MergeExcelFilesAndMergeRows()  and End Sub  (if you’d like to know more about what this code does, see how the code works):
    Below the End Sub of Sub MergeExcelFilesAndMergeRows() , paste this code:
  9. Look for the line shown below and replace <<Full path to your Excel files folder>> with the full path to the folder you created in step 1 (e.g. C:\Users\Martin\Documents\MyExcelFilesToMerge)
  10. We can now run this macro to merge the Excel files. There are 2 ways to do this:
    1. Click the Run toolbar button in the Visual Basic Editor
      Excel VBA Macro Editor Run
    2. In the Developer tab in the Excel ribbon, click Macros, select MergeExcelFilesAndMergeRows and click the Run button
      Excel Macro Run
    3. If there are any errors reported, check that the VBA code is correct and try again
  11. If you’d like to save this spreadsheet with the macro you just created in it, you will need to save the file as a Macro-Enabled Workbook (an xldm file). Follow these steps to do this:
    1. Click File -> Save as…
    2. Select ‘Excel Macro-Enabled Workbook (.xlsm)’ from the Format list
      Excel Save As
    3. Click the Save button

Although this code required the Excel source files to be combined to be placed in the same folder, if you would like to use a form to choose which Excel files to merge, you can also adapt the VBA code shown in How to merge Excel files with the same columns to include the mapping described in this article.

Explanation of the column mapping

There are several parts of the code that are involved in the mapping of columns from each spreadsheet into the output. The first section of code to discuss is the that which allows us to map the columns for each spreadsheet. This code is defined in the function Function MapColumns(fileName As String) As Object

In this function, you can see a section for each Excel file we wish to merge, starting with a Case expression, e.g. Case "ExcelFile1.xlsx" . Following the Case expression, the columns are mapped by adding an entry to a collection for each column in the source spreadsheet. The Key  represents the column in the source spreadsheet and the Item  represents the column in the output spreadsheet. For example, taking the last mapped column in source spreadsheet ExcelFile1.xslx:

Here we are mapping column F, email, of the source spreadsheet, ExcelFile1.xlsx, to column G, email, of the output spreadsheet.

You will have different source spreadsheets and need to map different columns. This function, Function MapColumns(fileName As String) As Object , is where you would apply your own mappings for your specific data. To change this code to suit your needs, you should first define the mappings from your source spreadsheets to the output columns you need. A good way to do this is to draw up your own mapping table similar to that shown above.

With the mapping defined, there are a few places in the code where the mapping is used.

Whilst looping through the files to process, the specific column map for the file currently being processed is retrieved using this code:

The code then loops through the columns in the source spreadsheet and uses the mapping to copy all the values from the column in the source spreadsheet into the appropriate column in the output:

These lines select the column and rows in that column to copy:

And these lines select column and rows in that column to paste the values in the output:

Merging rows of common data

In the example data we’ve used, there is a common column, email, which allows us to uniquely identify a record in the data sets that we have. This means that rather than have separate rows for each row in the source spreadsheets, we could match rows of related data based on whether the email address matches and then merge the data for that record.

To do this requires some changes to our mapping code to add some additional code to try and find the matching row based on the value of the email column.

First off, we need to specify which column in the output will contain the unique values:

Then, when processing a worksheet, we need to determine which column in that worksheet contains the unique value (if any):

We also need to define the range in the output worksheet that will be searched to try and match the row:

The final thing to do is to try and find an existing row that matches the unique value from the input row. If we find one, we need to update the existing row, otherwise, we copy the row into a new row in the output.

This  code is quite different to that in the previous example mainly because we need to process the source spreadsheet one row at a time so that we can match up the rows. A side effect of this is that it takes a lot longer to run than the version which simply adds rows from the source to the output.


If you’d like a copy of the spreadsheet which already has this functionality built in, you can download it here.

This has a macro defined, MergeExcelFilesAndMergeRows which will combine multiple Excel files, mapping columns to the output and merging data for common rows.