Data Disaster: non normalized data

Imagine a weekly task of sending out a simple report. To do this you rely on data from someone else. When the layout of the data changes chaos reigns! What can we do?

 

What We Should Receive

The agreement we have is to receive the data in this layout. This data is normalized. Fields ‘Employee’ and ‘Manager’ have their own columns. A simple vlookup gets each employee’s manager.

 

 

The Mess We Receive

The top name in each mini range is the manager.

 

 

 

 

 

This layout might be nice for a presentation but we can’t use a simple vlookup to get the manager’s name. Instead of 2 neat columns of data we have 90 mini ranges. SHOULD WE PANIC NOW?

 

 

Our Options

Despite the panic that’s flooded our mind let’s focus on possible solutions:

  1. Ask for previous data structure
  2. Manually find manager names
  3. Manually restructure the data
  4. Solve using Excel formulas!

Assuming options 1, 2 and 3 are not possible let’s explore using Excel formulas to solve this!

 

 

Excel Files

  • Download my Excel file to see my original formula solution!
  • In this Excel file there’s a modified solution (mod function) inspired by Vu Dien’s comment!

 

 

Formula Solution Overview

We can get row and column numbers for each employee using formulas.

How do we get the manager’s row number?

In sheet ‘data’ manager names are in rows 3, 12, 21, 30 etc. Consistently 9 rows apart.

 

 

 

 

 

 

 

Look at cells D3 to D9. D3 = manager ‘Lane Cline’. Employees below have the same column number but different row numbers. However, their rows numbers have something in common. They’re all greater than 3 and less than 12.

We’ll use a binning table to find the row number for each employee’s manager!

 

 

YouTube Video

Watch my YouTube video to see all the solution steps.

 

 

True Story!

This was a real problem that I helped someone solve on a Friday afternoon. The individual who sent the dataset was already offline enjoying a long weekend. I had to look around the pressure of a pending deadline to see if there was a time efficient way to find a solution. There was 🙂

 

 

About Me

My name is Kevin Lehrbass. I’m a Data Analyst. This is me on the Brooklyn Bridge earlier this year.

My favorite software is Microsoft Excel. I know some SQL and have done some programming but Excel is still my favorite tool.

In my next video/post I will show you how to solve this same challenge using Excel’s new amazing Get & Transform feature.

2 Comments Data Disaster: non normalized data

  1. Pingback: Normalize Data using Excel Get & Transform | My Spreadsheet Lab

  2. Pingback: Learning Power Query (Get & Transform) | My Spreadsheet Lab

Leave a Reply

Your email address will not be published. Required fields are marked *