In this guest post Alan Murray (computergaga) compares two solutions to a challenging data sorting issue. I know my preference, what about you?
I was sent an interesting challenge recently to sort a list of names and addresses. This doesn’t sound too difficult a task, right?
Unfortunately, the list of names and addresses is in the format below. And has hundreds of them.
My first thought was to create a macro. Ideas were running through my head immediately about how to tackle it. I was confident it would be easy with VBA but I didn’t want to limit myself to VBA.
In this post I will explain two methods to achieve our goal of sorting the list by name. One using Excel VBA, and another using Power Query.
Power Query Solution
It is nice to have multiple options at our disposal. I used to rely so much on VBA. As soon as a complex task came along, I would dive straight into it. But with Power Query, we can set up a process that others may feel more comfortable with and may be easier to edit in the future.
Once the Power Query process is set up. It can be refreshed at the click of a button in the future if more addresses are added, or information changed.
Load Data Into Power Query
Select the range of cells and click Data > From Table/Range (the wording of this step may differ depending on your version of Excel). Check that the range is correct, and that our range does contain headers. Click Ok.
The range opens in the Power Query Editor. We will now step through some processes to transform this data into something we can sort.
Remove Blank Rows
We will start by removing the blank rows. We can do this by filtering out the null values. Click the Filter arrow in the Addresses column and uncheck the Null box.
3 Rows Into 1
The addresses are the main problem here because they spill over 3 rows. We need to get these into one row along with the name to sort the list. That is our goal now.
Let’s start by filling out the Names column. Select the Name column. On the ribbon click Transform > Fill and Down.
What’s The Pattern?
Now to get the addresses into column we need to understand the pattern in our data. In this example it repeats every 3 rows. If we can group these repeats, then we can pivot them.
Let’s start by inserting an Index column. Click Add Column tab and then Index. A new column is added, and the index values begin from 0.
We will now add a modulo column to group the repeating values. Select the Index column. Click the Add Column tab > Standard > Modulo. Enter 3 as the value.
The modulo calculation categorizes each repeating value for us.
Rows Into Columns
Let’s now get them into columns. Select the Modulo column. Click the Pivot Column button on the Transform tab. Select the Addresses column for the Values Column. We do not want any aggregate calculations, so click Advanced Options and select Don’t Aggregate.
Fill In Blank Cells
This is really beginning to take shape. Now let’s fill in some blank cells. Select the 0 column (Address column) and click Transform > Fill > Down. Then select column 2 (postcode column) and click Transform > Fill > Up.
Reduce To One Row
Let’s reduce the list to one row for each record of data. Click the filter arrow for column 1 and uncheck the Null box.
The Index column is no longer needed. Right click the Index column and select Remove.
Rename Column Headers
The 3 column headers for the address details need to be renamed with something more meaningful than 0, 1 and 2.
Double click each header and enter an appropriate name. Address, City and Postcode were used in my example below.
Sort The Data
You can see that the list has already been sorted automatically by Power Query. However, I would like to sort this list. I feel better with this step added.
Select the Name column and click the Sort Ascending button on the Home tab.
All of steps that we have undertaken are listed in the Applied Steps pane. We can use this to remove or edit steps of the process later.
Name The Query
We can also name the query here. And that is important. Click in the Name box and enter SortList as the query name.
Send Query Back to Sheet
We’re done! Let’s put the query results back into a sheet. Click ‘File’ (top left), ‘Close & Load to’, ‘New Worksheet’.
By using VBA, we have many approaches to re-shaping this data and sorting by name. This is the method that I used and an explanation of the key parts of the macro code.
Dim ColumnOffset As Long
Dim RowNum As Long
Dim LastRow As Long
Dim i As Long
Dim TotalRows As Long
LastRow = Application.WorksheetFunction.CountA(Range(“A:A”))
TotalRows = Application.WorksheetFunction.CountA(Range(“B:B”)) + LastRow – 2
RowNum = 2
Range(“C1”).Value = “City”
Range(“D1”).Value = “Postcode”
Range(“C1:D1”).Font.Bold = True
For i = 2 To TotalRows
If Cells(RowNum, 2).Value = “” Then
ColumnOffset = 0
If ColumnOffset >= 1 Then
Cells(RowNum – 1, 2 + ColumnOffset).Value = Cells(RowNum, 2).Value
RowNum = RowNum + 1
ColumnOffset = ColumnOffset + 1
ActiveWorkbook.Worksheets(“Sheet1”).Sort.SortFields.Add Key:=Range(“A2:A” & LastRow), _
SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
.SetRange Range(“A1:D” & LastRow)
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
I used two sets of variables. One set for the shaping of the data and another for the loop.
The i and TotalRows variables were used solely for the loop. I calculated the number of loop iterations by counting the non-blank cells in column B, adding the non-blank cells from column A and subtracting 2 (there are 2 less blank rows in B than there are entries in A).
The variables RowNum, ColumnOffset and LastRow were used to actually re-shape the data.
RowNum tracked the current row through the process, only being increased when a row was not deleted.
ColumnOffset was used to reference a column to the right for the second and third address parts.
LastRow is a non-blank count on column A. This was used when sorting data at the end to have a dynamic last row identifier (view other smart ways to find the last row in Excel VBA)
A For loop is used to run it through a predetermined number of iterations (the number in the TotalRows variable).
view other smart ways to find the last row in Excel VBA
If statements are used to test the status of the cell in column B and apply the correct behaviour. If the cell is empty the row is deleted and the ColumnOffset variable reset because it means an address has ended.
If the ColumnOffset variable is 1 or greater, then we are in the middle of an address so move the data into that column offset and delete the row. If the cell is not empty, or in the middle of an address then increase the RowNum variable.
That is a brief explanation of key stages of the macro. Step through the code in an example to understand it better. Maybe you can improve upon it.
Deciding On The Solution
There are advantages to each approach and the path you take depends to an extent of where you feel more comfortable.
- Power Query is only available from Excel 2010. It’s an add-in called Power Query in 2010 and 2013. It’s found on the Data tab as Get & Transform in Excel 2016 and later.
- It will be easier though for most Excel folk to modify Power Query steps than to edit VBA code which is a more unique skillset.
- They are both flexible, but VBA offers more flexibility due to how vast the language is.
- Macro security is a concern when using VBA. Macros must be enabled before they can be executed.
- Both are one button refreshable options. So, there is a tie for speed of application.
Guest Post Author: Alan Murray
Alan is the founder of Computergaga (https://www.computergaga.com). Alan also has a popular YouTube channel. When he is not talking about Excel he likes to spend his time running, hiking and spending time with his two children.
Read Alan’s post about importing multiple Excel files from a folder into a single Excel file!
My name is Kevin Lehrbass. I’m a Data Analyst living in Markham Ontario Canada.
Power Query? VBA? I love VBA but in this particular case I would definitely go with Alan’s Power Query solution. It’s so much easier to use Power Query for these types of data gymnastics.
Improve Your Excel Skills! Check out my recommended Excel Training!