Awkward Data

Another awkward dataset. Should we re-arrange the data or just find a way to calculate what we need?

(get my Excel file)

 

Awkward Data

A nice normalized dataset would have 4 columns: Category, Month, Letter, Value. But we have this:

 

 

 

 

 

 

Each month has a group of 3 columns. If we had more months this pattern would repeat.

 

 

The Question

Add values in columns C, F, I for each month/letter combination. The answers look like this:

 

 

 

The key to solving this is finding the pattern in the awkward data layout.

 

 

3 ‘As Is’ Solutions

Here are 3 formula solutions that work on the data as is (no rearranging).

 

Solution 1

=SUMIFS(OFFSET($A$3:$A$11,0,MATCH($I14,$B$2:$G$2,0)+2),OFFSET($A$3:$A$11,0,MATCH($I14,$B$2:$G$2,0)+1),J$13)

Long formula but I love OFFSET even if it’s a bit volatile.

 

Solution 2

=SUM(IF(($B$2:$E$2=$I19)*($C$3:$F$11=J$18),$D$3:$G$11,””))

An array (requires control shift enter).

 

Solution 3

=SUMPRODUCT(($B$2:$E$2=$I24)*($C$3:$F$11=J$18),$D$3:$G$11)

This is probably my favorite solution. Short and it’s a normal formula (non array).

 

Which formula is your favorite?

Do you have a better formula solution? (maybe using dynamic arrays?!)

 

 

Rearranging The Data

Rearrange the data so we can create a pivot. It should look like this:

 

 

 

 

 

 

 

Power Query

Can you rearrange this data using Power Query?

Your queries would need to adjust if the dataset grows (more months). I’d love to see if it’s possible!

 

Formulas

Well, that’s what we had to do in the old days so you should suffer too!

My file has my solution. It’s backwards compatible to any Excel version (no dynamic arrays).

 

 

 

 

 

 

 

 

It’s good practice to normalize using formulas. Note: formulas are in the same sheet as the data. Not a good idea but it’s a demo and it’s easier to read like this. Now we can create a simple pivot table with this data 🙂

 

 

VBA

Of course VBA could rearrange this data. But I’m too lazy to create all those loops right now!

 

 

About Me

My name is Kevin Lehrbass. I’m a data analyst from Canada.

I saw this awkward dataset just before covid crisis. Things have really changed…but there’s always been awkward data!

I play around with data to keep my mind busy and share ideas with others. I’m hoping that someone can solve this using Power Query. I gave up and went back to formulas.

3 Comments Awkward Data

  1. Alan Sidman

    For the situation shown and not taking into consideration for future months. I loaded the table and then created a reference table (duplicate) of the original. In the original table, I deleted any columns that were associated with February. In the second table I did the same for January. I changed column names in each so that they matched. I added a custom column with the data January in one and February in the other. I then appended the February to January which resulted in the following:
    Month Category Column1 Column2
    January Category1 c 0.56247919
    January Category2 c 0.665416221
    January Category3 v 0.317881541
    January Category1 f 0.369318453
    January Category2 g 0.838794826
    January Category3 t 0.265604131
    January Category1 c 0.034951919
    January Category2 sa 0.59025869
    January Category3 a 0.811550546
    February Category1 c 0.188865382
    February Category2 c 0.083025283
    February Category3 v 0.648843625
    February Category1 f 0.685798337
    February Category2 g 0.633699237
    February Category3 t 0.990748883
    February Category1 c 0.433338457
    February Category2 sa 0.344227568
    February Category3 a 0.064277462

    I trust this is what you were looking for. If you add additional months, then I am not sure of a way of adding it automatically to the query.

    Reply
    1. Kevin Lehrbass

      Hi Alan,
      You’ve accomplished more than I did that’s for sure! I can often find a power query solution but this time I gave up.
      …I received an excel file from Daniel and it includes = Table.Split(#”Filled Down”,3) which I had never heard of before. I think I’ll write a separate post about that. Thanks for reading my blog Alan. I’m curious… how did you find out about it?
      Cheers,
      Kevin

      Reply
  2. Pingback: awkward-data/power-query-solution | My Spreadsheet Lab

Leave a Reply

Your email address will not be published. Required fields are marked *