(Download my **Excel demo file**. This post is dedicated to **Marcelo Ribeiro Simões**.)

I normally advise people to include all relevant criteria inside formulas. This way, formulas serve their purpose independently. Filtering and hiding rows are a separate action for viewing the data that won’t interfere with your formulas.

There are exceptions when it’s helpful to adjust formula results by quickly hiding/filtering rows instead of constantly modifying formula syntax. Imagine a busy meeting with lots of questions.

How do we incorporate hidden and/or filtered rows into our formulas?

The common method is to use complex formulas like this:

=SUMPRODUCT((A7:A309>50)*(SUBTOTAL(103,OFFSET(A7,ROW(A7:A309)-MIN(ROW(A7:A309)),0))))

Power users can decipher this but most people can’t. **What if we could reduce the formula to:**

=COUNTIFS($A$7:$A$309,$AK4,$O$7:$O$309,1)

**How is this possible?**

The trick is to add this helper formula, =--SUBTOTAL(103,A8), alongside the data-set. Column O uses the SUBTOTAL function (function 103 is a COUNTA that ignores hidden and filtered rows) to determine whether each row is visible: 1 = visible, 0 = not visible. We then include the 1s and 0s inside formula conditions.

Helper columns increase file size but they simplify formula writing and auditing.
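To see why the helper column simplifies things, here is a minimal Python sketch of the same idea. It is not the workbook itself: the ages and visibility flags are made-up sample values, and the function names are hypothetical. The `visible` flag plays the role of the column O helper.

```python
# Each row is (age, visible). The visible flag mimics the column O helper:
# 1 when the row is shown, 0 when it is hidden or filtered out.
# These sample values are hypothetical, not taken from the Heart Disease data.
rows = [
    (63, 1),
    (37, 1),
    (41, 0),  # hidden row
    (63, 0),  # hidden row
    (63, 1),
]


def count_visible(rows, target_age):
    """Like COUNTIFS(age_range, target, helper_range, 1): visible matches only."""
    return sum(1 for age, visible in rows if age == target_age and visible == 1)


def count_all(rows, target_age):
    """Traditional count that ignores whether a row is visible."""
    return sum(1 for age, _ in rows if age == target_age)


visible_count = count_visible(rows, 63)
total_count = count_all(rows, 63)
print(visible_count, total_count)
```

The visibility test is just one more condition, which is why the COUNTIFS version stays short.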

The data-set is called ‘Heart Disease UCI’ from Kaggle.com. See sheet ‘Counting Example’. A to N are original columns. We have 303 rows.

Subtotal formula is in column O. Initially subtotal shows 1 for all rows but once you manually hide a row or apply a filter non visible rows will change to 0. Watch formulas in cells A6 and O6 change!

Formulas with a green background change if they can no longer see relevant rows (some rows are already disqualified via formula criteria).

Columns AL to AO are traditional formulas that are not affected by non-visible rows. This lets you compare the results.

Sheet ‘Ranking Example’ demonstrates how hiding/filtering rows can affect ranking when using a subtotal helper column.

A downside of using filters is that you can’t see filter values. You have to remember the values that you’ve filtered columns to. *GOOD NEWS*: you can see filter values if you use

Obviously we can’t have 50 subtotal helper columns as this would increase the file size and create clutter. However 1 or maybe 2 subtotal helper columns can be extremely helpful if you want to calculate visible rows only! Take some time to experiment with this concept.

If you’re still with me there’s more good news! Excel’s AGGREGATE function goes above and beyond what SUBTOTAL function can do. It takes some time to learn it but it’s worth it!

Read this **intro from Microsoft** and then watch this **playlist of videos from Mike Girvin** (ExcelIsFun).

My name is Kevin Lehrbass.

I’m a Data Analyst and I live in Markham Ontario Canada.

The main reason I like the subtotal helper column is to avoid crazy long complex formulas. Why so much pain? Just add a subtotal helper column and use filters and/or slicers!

(download my **Demo Excel File**)

**List 1** has 35 names (16 unique names)

**List 2** has 54 names (18 unique names)

(1) How many unique names are in List 1? (2) How many unique names are in List 2?

This formula counts unique names:

=SUMPRODUCT(1/COUNTIFS(G7:G41,G7:G41))
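The 1/COUNTIFS trick can look like magic, so here is a small Python sketch of the arithmetic behind it. The names are made-up sample data and `count_unique` is a hypothetical helper. Each name contributes 1 divided by the number of times it appears, so every distinct name adds up to exactly 1.

```python
from collections import Counter
from fractions import Fraction


def count_unique(names):
    """Sum 1/count for every entry, like SUMPRODUCT(1/COUNTIFS(range, range))."""
    counts = Counter(names)
    # Fractions avoid floating point rounding when adding thirds, sevenths, etc.
    return int(sum(Fraction(1, counts[name]) for name in names))


# Hypothetical sample list: Ann appears 3 times, Bob twice, Cy once.
names = ["Ann", "Bob", "Ann", "Cy", "Bob", "Ann"]
unique_total = count_unique(names)
print(unique_total)
```

Ann contributes 1/3 + 1/3 + 1/3, Bob contributes 1/2 + 1/2 and Cy contributes 1, so the total equals the number of distinct names.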

Question 3 is tricky: **(3) h**

How should we solve Question 3?

- **quick and simple?**
- **power query?**
- **single array formula?**
- **helper formulas?**

Each solution has pros and cons. Let’s explore each one.

**quick and simple**

Why complicate things?

Copy paste List 1 names into column A. Use Remove Duplicates feature. Match function tells us if name is found in List 2. Count function provides final answer.

You would have to repeat Remove Duplicates step if the data changes.

**power query**

All work is done inside Power Query. Great for large datasets that change frequently. Power Query does have a learning curve but it’s an amazing tool.

Summary: load each list, remove duplicates from List1, merge List 1 & List 2. Export back to your sheet.

**single array formula**

If the answer must be dynamic (power query requires a refresh) and fit in a single cell consider this:

To enter this formula hold keys ‘Ctrl’ & ‘Shift’ & ‘Enter’.

Overview explanation:

The inner MATCH is: MATCH(G7:G41,G7:G41,0) It looks for all List 1 names inside List 1!

Compare the inner results with counter values to replace duplicates with blanks: IF(MATCH(G7:G41,G7:G41,0)=ROW(G7:G41)-ROW(G7)+1,G7:G41,"")

Now the outer MATCH’s **lookup_value** contains unique List 1 names. We look for these in List 2 and COUNT gives us the total found.
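The logic described above can be sketched in Python. This is only an illustration under a few assumptions: the two lists are made-up, the function names are hypothetical, and the comparison here is case-sensitive while Excel's MATCH is not.

```python
def unique_in_order(names):
    """Keep a name only where its first position equals the current position.

    This mirrors IF(MATCH(list, list, 0) = counter, name, "").
    """
    kept = []
    for position, name in enumerate(names):
        if name and names.index(name) == position:
            kept.append(name)
    return kept


def count_unique_found(list1, list2):
    """Count the unique List 1 names that also appear in List 2."""
    lookup = set(list2)
    return sum(1 for name in unique_in_order(list1) if name in lookup)


# Hypothetical sample lists.
list1 = ["Ann", "Bob", "Ann", "Cy", "Bob"]
list2 = ["Bob", "Dee", "Ann", "Bob"]
found = count_unique_found(list1, list2)
print(unique_in_order(list1), found)
```

Splitting the work into two small functions is the same idea as the helper-formula and named-range variations: each step can be checked on its own.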

**helper formulas**

If you want a live dynamic answer but not the complexity of solution 3 then consider this.

Column F has a simple counter, column I has this simple formula:

=IF(G7="","^^^^^",IF(MATCH(G7,$G$7:$G$41,0)=F7,G7,"^^^"))

Use either formula below (both look at the column I helper) to get the final count:

=SUMPRODUCT(--ISNUMBER(MATCH($I$7:$I$60,$J$7:$J$60,0)))

or this array formula (requires Control Shift Enter….not just Enter)

=COUNT(MATCH($I$7:$I$60,$J$7:$J$60,0))

The **quick and simple** solution is great for the masses especially if it’s a one time question.

I’m becoming comfortable with **Power Query** so that solution was easy. It’s an amazing tool but let’s not forget that a Power Query solution is not fully dynamic. It requires a refresh. If building a multiple step model on top of the answer you would probably prefer one of the formula solutions.

**Array?** I love it but it is challenging for most Excel users to understand. This solution was the most fun to build as I had to be creative.

**Helper formulas**. Splitting the array logic out into steps makes it easier to audit for non array fanatics.

**What about you? What solution do you prefer? Do you have a different solution?**

**Robert H Gascon’s Idea!**

See Robert’s comment below. My ‘**single array formula**‘ solution is difficult to understand/audit. Robert suggests splitting the logic into two parts and creating a named range out of each part! It’s easier to audit this way and the final formula =UniqueCount doesn’t require Control Shift Enter.

I’ve updated my Excel file (see top of post). Go to Formulas/Name Manager. You’ll see named range ‘Unique1’. This is the lookup_value for the outer MATCH function. Named range ‘UniqueCount’ references ‘Unique1’. In cell M5 we just need to type this: =UniqueCount

Another benefit is that with careful planning you could reference the same base named range logic several times! It reminds me of writing database queries where various subsequent queries refer to the same base query. **Thanks Robert for the idea!!!**

My name is Kevin Lehrbass. I live in Markham Ontario Canada. I’m a Data Analyst.

How much of an Excel fanatic am I? I used VBA to create random Mondrian style art! (read this **post**)


We have 10000 rows of data like this:

**Filter values:**

Search words and State can be changed at any time.

**Requirements:**

- Count rows: State = “California”, “lamp” or “lamps” found within ProductName
- BONUS: Display all rows that qualify

Filters or formulas could solve this. **Why Power Query?**

Reapplying filters is tedious. On large data-sets formulas can be slow to calculate.

**Power Query does the work in the background!** Just refresh after changing filter values.

I once re-shingled a roof. I used a hammer but the **pros use a hammer drill! That’s Power Query!** Practice using the hammer drill.

Download my **Excel file**. Here’s my **YouTube video**.

**Solution Steps:**

- **0:00** **Requirements**
- **0:24** **Load Data into Power Query**: load 3 tables
- **1:24** **Requirement Adjustment**: ‘lamp’ or ‘lamps’ NOT ‘clamps’
- **1:56** **Prep text**: wrap with spaces, text to lowercase
- **3:34** **Merge tables**: filters 10000 rows to selected state
- **4:36** **Add SearchWord**: repeat search words beside rows
- **5:50** **Search words found?**: function Text.PositionOf
- **6:39** **Filter out rows**: if search words not found remove rows
- **7:26** **Export rows**: to Excel sheet
- **7:44** **Create Row Count query**: group by query counts rows
- **9:11** **Export Row Count query**: to Excel sheet

It’s important to point out a few things in my solution.

**Join vs Cartesian Product**

I used a merge (inner join) to connect tables ‘ProductData’ and ‘SelectState’. This filters ‘ProductData’ to those rows having the selected state. We select a single State at a time.

I used a CARTESIAN PRODUCT (cross join) to connect the search words! A **cartesian product means there is no join!** We get all combinations between tables ProductData and SearchWord. If one of the two tables is short then it’s fine. Our search word table only has 2 rows (‘lamp’, ‘lamps’). If both tables have 1000s of rows creating all combinations can crash your computer!

We start with 10000 rows but reduce it due to state selection. Then we double the remaining rows as we have two search words. Finally we filter out rows that don’t contain either search word.
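The flow of rows (filter by state, cross join with the search words, then drop the non-matches) can be sketched in Python. This is not the Power Query solution, just an illustration. The product rows are made-up, and I am assuming that both the product text and the search words are lowercased and wrapped with spaces, which is what keeps ‘clamps’ from matching.

```python
from itertools import product

# Hypothetical sample rows standing in for the 10000-row ProductData table.
products = [
    {"name": "Desk Lamp", "state": "California"},
    {"name": "C-Clamps Set", "state": "California"},
    {"name": "Floor Lamps", "state": "California"},
    {"name": "Table Lamp", "state": "Texas"},
]
selected_state = "California"
search_words = ["lamp", "lamps"]


def prep(text):
    """Lowercase the text and wrap it with spaces so only whole words match."""
    return " " + text.lower() + " "


# Inner join: keep only the rows for the selected state.
in_state = [row for row in products if row["state"] == selected_state]

# Cartesian product: every remaining row is paired with every search word.
pairs = list(product(in_state, search_words))

# str.find returns -1 when nothing is found, similar to Text.PositionOf.
matches = [
    row["name"]
    for row, word in pairs
    if prep(row["name"]).find(prep(word)) != -1
]
print(len(in_state), len(pairs), matches)
```

With two search words the pair list is only twice as long as the filtered rows, which is why the cross join is safe here.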

**Text.PositionOf**

Power Query has menu options to clean & rearrange data but we still need functions. Power Query’s formula language is called M. Some functions are similar to Excel while others are different. Instead of Search or Find we use function Text.PositionOf. Read **Microsoft’s M reference**!

**Advanced Editor**

I like to see the individual steps on the right hand side of a query. Selecting them to walk through the query is helpful. However, I recommend looking at the M code that is generated.

Click ‘Advanced Editor’ and you’ll see M code like this below. It gets easier to read the more time you spend studying it!

let
    // Inner join: keep only the ProductData rows whose State matches the selected state
    Source = Table.NestedJoin(SelectState, {"Select State"}, ProductData, {"State"}, "ProductData", JoinKind.Inner),
    #"Expanded ProductData" = Table.ExpandTableColumn(Source, "ProductData", {"Product Name", "State", "Product2"}, {"ProductData.Product Name", "ProductData.State", "ProductData.Product2"}),
    // Cross join: attach the whole SearchWord table to every row, then expand it
    #"Added Custom" = Table.AddColumn(#"Expanded ProductData", "SearchWord", each SearchWord),
    #"Expanded SearchWord" = Table.ExpandTableColumn(#"Added Custom", "SearchWord", {"Search2"}, {"Search2"}),
    // Text.PositionOf returns -1 when the search word is not found
    #"Added Custom1" = Table.AddColumn(#"Expanded SearchWord", "FOUND?", each Text.PositionOf([ProductData.Product2], [Search2])),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom1", each [#"FOUND?"] <> -1),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows", {"ProductData.State", "ProductData.Product2", "FOUND?"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns", {{"ProductData.Product Name", "Product Name"}})
in
    #"Renamed Columns"

Please share your thoughts or solution in the comments section below. I’ve learned Power Query from many different people including the readers of my blog.

A **special thanks to Kunle** for challenging me to create a Power Query solution!

My name is Kevin Lehrbass. I live in Markham Ontario Canada.

This is my personal blog about Microsoft Excel.


When we think of Excel the first thing that comes to mind is **data and calculations**.

But there is more to Excel than data: there is also analysis and its presentation.

**The presentation of the data is very important to explain our point to the audience**.

You might spend hours analyzing a large dataset but it’s not useful or helpful until you present it.

You probably use **basic and advanced charts** but there are a few new ones in OFFICE 2019/365.

Like…

**…Funnel, waterfall and 2D Map!** Use these to present your data in a more appealing and simpler way.

In this post we’ll explore these charts…so **let’s get started**.

A funnel chart is a visual representation of data that decreases at each step or stage.

A funnel chart presents the figures in descending order, which gives it its funnel shape: the highest value at the top and the lowest at the bottom.

This chart works only if you have one group of data. There is no axis in this type of chart.

Suppose you’re working on an idea collection workshop and you receive 500 ideas from the participants. In the end, you could only implement 15.

Now you want to graphically present the lost ideas at each stage between 500 and 15.

How will you present this in graphical form? Yes, Funnel chart is the best option.

Now you cannot deny the fact that Funnel chart is more conclusive and visually appealing.

You will notice that the size of the bar against 250 is half of our starting point, i.e. 500.

The size of each bar is determined by its value: the higher the value, the larger the bar.

**Steps to make Funnel Chart**

- Organize the data in descending order. If the stages or data set cannot be altered or sorted in descending order, it is not advised to use a funnel chart.
- Then click on the Insert tab ➜ Charts.
- In case you are not able to find the funnel charts, click on Recommended charts ➜ All charts. The funnel chart is listed in the left column of the window.

**Quick Tip:** The funnel chart is used to present descending values. If you use the same chart for ascending values, it will look like a Pyramid chart.

Making Waterfall charts in previous versions was also possible but it could take around 20 minutes to organize your data.

But now in Office 2019, Microsoft has given this as an inbuilt chart type reducing users’ efforts.

This chart type shows the cumulative effect of the data series both positive and negative.

This chart is used to represent both positive and negative values especially in data related to finance and accounting where we have both outflow and inflow.

In this chart, we have shown the price trend from Jan-19 to Jul-19. We started from USD 1000 in the month of Jan’19.

The orange bars denote negative figures (a decrease in price) and the blue bars denote positive figures (an increase).

In Feb-19 we increased the price by USD 40 i.e. 1000+40 = USD 1040.

Hence the bar in the month of Feb-19 starts from 1000 and ends up in 1040.

The bar for Mar-19 starts from 1040 and ends at 1040+30 = 1070.

Similarly, in Apr-19, we decreased the price by USD 500. The orange bar starts at USD 1070 (1000+40+30) and goes down to USD 570 (1070-500).
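A waterfall chart is really a running total, where each bar starts at the level the previous bar ended on. Here is a small Python sketch using the Jan-19 starting price and the three changes mentioned above (the variable names are my own, and it assumes Python 3.8 or later for the `initial` argument).

```python
from itertools import accumulate

start_price = 1000  # Jan-19 starting price in USD
changes = [("Feb-19", 40), ("Mar-19", 30), ("Apr-19", -500)]

# Running totals: the level after each change, beginning with the start price.
levels = list(accumulate((delta for _, delta in changes), initial=start_price))

# Each bar runs from the previous level to the new level.
bars = [
    (month, levels[i], levels[i + 1])
    for i, (month, _) in enumerate(changes)
]

for month, bar_start, bar_end in bars:
    direction = "increase" if bar_end >= bar_start else "decrease"
    print(month, bar_start, bar_end, direction)
```

This is the data preparation that used to be done by hand before the built-in chart type existed.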

**Steps to make Waterfall Chart**

- Start by arranging your data in its actual order or trend. Don’t sort your data in descending or ascending order.
- Select any cell of the data table ➜ Insert ➜ Charts ➜ Waterfall charts.

You might come across situations where your data has more geographical details like population, area, market size, sales channel distribution across the globe.

Such data can be best presented in a map. **How do you insert a map in your data?**

You have probably been downloading map images from Google and inserting them into your data.

This includes a lot of struggle in terms of manually entering labels on each point, formatting and resizing the map.

Here’s the solution called 2D map chart.

You need an internet connection to make or append data in this chart. But you can view a map chart without an internet connection.

This chart can be used in the data where we have cities, states, countries, postal codes.

If your data includes cities which can be common across the globe, you should add a country name or postal codes to your data.

**Steps to make 2D Map chart**

We have data with revenue from different cities in Asian Countries. Select any cell from the data **➜** Insert **➜** Charts **➜** Maps **➜** Filled Map.

**Your map chart is ready**.

But hold on, this is not the same as we wanted. The cities are too small and not very visible.

Now you need to change a few settings. Right click on the chart and click on Format Data Series.

**Series Options:** You will find 3 tabs: Map Projection, Map Area and Map Labels.

- Map Projection gives you a drop-down of the different projection types for the map chart: Automatic, Mercator, Miller and Robinson.
- Map Area allows you to zoom in on a particular portion of the map based on the data: Automatic, Only regions with data, and World.
- Map Labels is like data labels. It gives you a drop-down list of None, Best fit only, and Show all.

Select Map Projection and Map Labels as per your choice and requirement, but select Map Area “Only regions with data” for the above-mentioned example.

**Series Color:** The series color option is only available for a chart with values. If your chart is based on categories, you will not find the Series Color options.

Using series color, you can change the color of the map chart from 2 to 3 colors.

**Sequential (2-color):** The map chart above is a 2-color map chart where all the values are highlighted with different shades of the same color, blue. The higher the value, the deeper the color. You can reverse the color code by changing the figures or colors against “Maximum” and “Minimum”.

**Diverging (3-color):** In case the 2-color chart looks boring, you can change the chart to 3 colors using Diverging (3-color). Compared to Sequential (2-color), this gives you one additional option: Midpoint.

Now, for the map chart we have made, let’s change the Map Area to “Only regions with data”.

Your map chart is ready.

Aprajita is an MBA in Sales & Marketing and has been using Excel for the last 8 years.

Her journey started from learning a basic pivot table from Google, which made her fall in love with Excel. She is a lifeguard to the people around her who fight with data on a day-to-day basis.

My name is Kevin Lehrbass. I’m a Data Analyst. I live in Markham Ontario Canada.

Microsoft Excel is my favorite data software. In each new version there are so many new charts, tools, functions, etc!

A major benefit of blogging is that I interact with so many people and learn from them.

(Download my **Excel file**. Read my **Power Query solution**.)

It’s an array formula, entered by pressing ‘Ctrl’ & ‘Shift’ & ‘Enter’.

=SUM(N(ISNUMBER(SEARCH($C5,IF('Product Data'!$B$3:$B$10000=$B$5,'Product Data'!$A$3:$A$10000,"")))))

I didn’t have any data so I put sample data into sheet ‘Product Data’

**Start with SEARCH**

A basic SEARCH function works like this: =SEARCH(“x”,”Texas”)

Search for “x” anywhere within text “Texas”. The answer is 3 (position where “x” is found).

But we are looking for text “Binder” within a column of text (based on a condition).

=SEARCH($C5,IF('Product Data'!$B$3:$B$10000=$B$5,'Product Data'!$A$3:$A$10000,""))

‘**find_text**’: $C5. We search for the value in cell C5, “Binder”.

‘**within_text**’: IF('Product Data'!$B$3:$B$10000=$B$5,'Product Data'!$A$3:$A$10000,""). In sheet ‘Product Data’, if column B = “Texas”, get the values from column A.

‘**[start_num]**‘ is optional. We’re not using it.

**Summary in plain English**:

If column B = “Texas” then look for “Binder” in each qualifying column A cell.

**What Happens Inside the SEARCH array**

Highlight the SEARCH array and press F9. Here’s what you’ll see (VALUE errors removed)

The first number is 42. What does it mean?

“Binder” is found in the 16th cell in the 42nd position inside that cell (A17).

Cell A17 text: “Storex DuraTech Recycled Plastic Frosted **Binder**s”

(A16 doesn’t count as ‘State’ = “California” not “Texas”)

**Almost Finished Now!**

The Search array is wrapped with this: SUM(N(ISNUMBER

ISNUMBER converts the numbers to TRUE and the errors to FALSE.

N converts TRUE and FALSE to 1s and 0s.

SUM adds the 1s.

When ‘State’ (column B) = “Texas”, 82 cells in ‘Product Name’ (column A) contain search word “Binder”.
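Here is a Python sketch of what the SUM(N(ISNUMBER(SEARCH(...)))) chain is doing. Only the Storex product name comes from the sheet. The other rows are made-up, and `count_matches` is a hypothetical helper. Like SEARCH, the comparison ignores upper and lower case.

```python
# (product name, state). Only the Storex row is from the post; the rest are
# hypothetical sample rows.
rows = [
    ("Blue Binder", "Texas"),
    ("Storex DuraTech Recycled Plastic Frosted Binders", "Texas"),
    ("Red binder covers", "California"),
    ("Yellow pencils", "Texas"),
]


def count_matches(rows, state, word):
    """Count the rows where the state matches and the word is inside the name."""
    word = word.lower()
    return sum(
        1
        for name, row_state in rows
        if row_state == state and word in name.lower()
    )


texas_binders = count_matches(rows, "Texas", "Binder")

# SEARCH reports a 1-based position, so add 1 to Python's 0-based index.
position = rows[1][0].lower().find("binder") + 1
print(texas_binders, position)
```

The California row contains the word but fails the state test, just like the cells that the IF turns into blanks.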

Cell E5 of sheet ‘Solutions’ has this formula:

=COUNTIFS('Product Data'!$B$2:$B$9995,$B5,'Product Data'!$A$2:$A$9995,"*"&C5&"*")

It’s not an array formula (faster to calculate) and it’s easier to understand.

Wrapping the cell C5 search word with wildcards, "*"&C5&"*", does the substring search magic!

There’s not always an easy way to solve some questions. Sometimes an array is the only way to solve it (unless you create 1000s of formula helper columns). I love arrays but I try to only use them when there isn’t an easier way to solve it.

My name is Kevin Lehrbass. I’m a Data Analyst. I live in Markham Ontario Canada.

Two years ago I was in Las Vegas for a Data software conference. There was an interesting data art object in the lobby (not related to the conference). *Data and art?*

That just gave me another idea for a vba post. Stay tuned!

(download my **Excel file**)

The VBA code instantly refreshes when new entries are added. Perfect!

*What about a formula solution?*

Scroll to the comments section of his **post.** There’s a suggestion to use a dynamic array! These are being tested (365 insider edition) so I created a formula solution that works in all versions.

My solution contains:

- city name input in field ‘ciudades’
- four helper formula fields (steps 1 to 4)

How does it work?

- Rank: =COUNTIFS([ciudades],"<="&[@ciudades]) gives ‘A Coruna’ = 1 and ‘Zaragoza’ = 6 (last)
- Counter: =ROW([@ciudades])-ROW(Table13[[#Headers],[ciudades]]) is a basic counter
- Match: =MATCH([@Counter],[Rank],0) finds the Counter value in Rank
- Sorted: =INDEX([ciudades],[@Match]) uses Match to index the sorted city
- Data Validation connects to a named range that references step 4 Sorted

Add a city below ‘Valencia’. The Data Validation list connects to a named range and updates instantly.

The helper columns are light and easy to audit. When you add a city the table, including all formulas, automatically expands!
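The four helper steps can be sketched in Python to show how a rank plus a counter produces a sorted list. Only ‘A Coruna’, ‘Zaragoza’ and ‘Valencia’ come from the post. The other city entries are my own sample values. The sketch assumes every city is unique, since duplicate names would share a rank and leave gaps.

```python
# Unsorted input, standing in for the 'ciudades' column.
cities = ["Valencia", "Madrid", "A Coruna", "Zaragoza", "Sevilla", "Bilbao"]

# Step 1 Rank: how many cities are alphabetically <= this one.
rank = [
    sum(1 for other in cities if other.lower() <= city.lower())
    for city in cities
]

# Step 2 Counter: 1, 2, 3, ... one per row.
counter = list(range(1, len(cities) + 1))

# Step 3 Match: where each counter value sits in the rank list
# (0-based here, Excel's MATCH is 1-based).
match = [rank.index(n) for n in counter]

# Step 4 Sorted: use the match position to pull the city.
sorted_cities = [cities[position] for position in match]
print(rank)
print(sorted_cities)
```

Each list corresponds to one helper column, so a wrong result can be traced back to the step that produced it.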

Both the vba and formula solutions are worth considering.

**UPDATE: Alternative Formula Solution**

In the comments section below *Robert Gascon* suggested a solution that does not depend on an Excel table (my solution does).

I have now updated my Excel file (found at top of post) to include Robert’s solution! I’ve learned so much from Robert over the past few months!

*What about a Power Query solution?*

Power Query is an amazing tool that could load the ‘ciudad’ text, sort it and return it back to the sheet.

There’s one issue…if you add/modify city entries you need to refresh the query.

Using ‘Worksheet_Change’ the vba code updates automatically based on any addition or modification. The formulas are also automatic.

*Is Power Query a bad solution?* Not necessarily. If you had a large amount of data and didn’t want (a) heavy formulas or (b) vba code then Power Query could be the perfect solution! Power Query would load the data, do all the necessary steps, and then quietly drop the answers back into the sheet.

**UPDATE**: ExcelCampus has a nice **blog post** showing 4 different solutions. Jon’s formula solution is the way to solve it for those with the new 365 version of Excel. The solutions in this post, mine and Robert’s, are compatible with any modern version of Excel (back to Excel 2007).

**excelforo.blogspot.com** is one of the best Spanish Excel blogs that you’ll find! I can read Spanish fairly well so when I read Ismael’s posts my Excel and Spanish hobbies collide! Coool!

Check out Ismael’s **Facebook page**!

My name is Kevin Lehrbass. I’m a Data Analyst. I live in Markham Ontario Canada.

I saw a friend working on an Excel model back in 1996. I was hooked for life!

It took a while to get into the workforce…but it’s been an amazing ride!

(Download my **Excel file**)

Column B has the count. Column C has the weight. Weighted average formula is:

**=SUMPRODUCT(B2:B12,C2:C12)/SUM(B2:B12)**

**12.05** is the answer.

Why can’t we just use a normal average?

If we average the numbers in column C we get an average of 14.0

BUT….it wouldn’t fairly represent our group of dogs.

Look at Pic 1, does an average of 14.0 look right???

Remember: 1 dog weighs 8 pounds, 7 weigh 9 pounds, 10 weigh 11 pounds, 11 weigh 12 pounds etc.

=AVERAGE(C2:C12) ignores the fact that most dogs weigh 9, 11, or 12 pounds (that’s 28 of 40 dogs). Only 7 dogs weigh 14 or more pounds.

Column D below helps to visualize this. I changed the font size to reflect their weight.

This is our weighted average formula:

**=SUMPRODUCT(B2:B12,C2:C12)/SUM(B2:B12)**

Let’s examine each part separately:

**SUMPRODUCT(B2:B12,C2:C12)**

SUMPRODUCT does this (1 dog X 8 pounds) + (7 dogs X 9 pounds) + (10 dogs X 11 pounds) etc **=482**

**SUM(B2:B12)**

SUM is simply counting the dogs. **=40**

**SUMPRODUCT (482)** divided by **SUM (40)** gives us the correct weighted average of **12.05**
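Here is the same SUMPRODUCT divided by SUM idea as a small Python sketch. To avoid guessing at rows I haven’t listed, it only uses the first four groups mentioned above (1 dog at 8 pounds, 7 at 9, 10 at 11, 11 at 12), so its totals are for that subset of 29 dogs and not the full 482 and 40.

```python
# (dog count, weight in pounds) for the first four groups only.
groups = [(1, 8), (7, 9), (10, 11), (11, 12)]

# SUMPRODUCT: multiply each count by its weight, then add the results.
total_weight = sum(count * weight for count, weight in groups)

# SUM: the number of dogs.
total_dogs = sum(count for count, _ in groups)

weighted_average = total_weight / total_dogs

# A plain average treats each row as if it held a single dog.
plain_average = sum(weight for _, weight in groups) / len(groups)

print(total_weight, total_dogs, round(weighted_average, 2), plain_average)
```

Even on this subset the two averages differ, because the plain average gives the single 8 pound dog the same say as the eleven 12 pound dogs.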

We were given the summary of the original data (Pic 1). Let’s **recreate the data!**

**Step 1 Cumulative Sum:** column A has a cumulative sum (+1) of column B dog count values. This creates binning groups.

**Step 2 Counter:** a sequential counter from 1 to 40 (40 dogs). Represents dog #1, dog #2, dog #3, etc.

**Step 3 Binning:** **=MATCH(M2,$A$1:$A$13,TRUE)** finds each counter value in column A. TRUE (approximate match), **not** FALSE (exact match), bins each counter number.

Pic 4 shows that dog #1 weighs 8 pounds. Dogs 2,3,4,5,6,7,8 all weigh 9 pounds (group or bin 2).

Look at counter value of 9. Dog #9 falls into bin 3 and has a weight of 11 pounds.

**The Proof:** *Use the normal average on column O. It’s the same as the weighted average!*
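The three steps and the proof can be sketched in Python. Again this only uses the first four groups mentioned in the post (29 dogs), and `bisect_right` stands in for the approximate MATCH: it finds the last bin whose starting number is less than or equal to the dog number.

```python
from bisect import bisect_right
from itertools import accumulate

# (dog count, weight in pounds) for the first four groups only.
groups = [(1, 8), (7, 9), (10, 11), (11, 12)]
counts = [count for count, _ in groups]
weights = [weight for _, weight in groups]

# Step 1 Cumulative Sum (+1): the first dog number of each bin -> 1, 2, 9, 19.
bin_starts = [1] + [1 + total for total in accumulate(counts)][:-1]

# Step 2 Counter: dog #1, dog #2, ... one per dog.
dog_numbers = range(1, sum(counts) + 1)

# Step 3 Binning: approximate match of each dog number against the bin starts.
bins = [bisect_right(bin_starts, number) for number in dog_numbers]

# Recreated data: one weight per dog.
recreated = [weights[bin_number - 1] for bin_number in bins]

# The Proof: a normal average of the recreated rows equals the weighted average.
plain_average = sum(recreated) / len(recreated)
weighted_average = sum(c * w for c, w in groups) / sum(counts)
print(bin_starts, plain_average, weighted_average)
```

Dog #1 lands in bin 1, dogs 2 to 8 land in bin 2 and dog #9 lands in bin 3, matching Pic 4.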

How did I create it?

I used formula =REPT("õ",B2)

Cell font = Webdings

I manually changed the font size (vba would be better!)

My inner nerd took over my body:

I changed the formula to: =REPT(VLOOKUP($V$1,'Animal List'!$A$2:$B$5,2,0),B2)

In cell V1 select from a list (maybe you have pet squirrels !!)

NOTE: if you increase the dog counts in column B then extend formulas in columns M,N, and O.

Watch this **video** from Microsoft.

Cali (on the right) weighs 10 pounds and Fenton weighs approximately 14 pounds (he wiggles a lot on the scale at the vet).

That’s me in the middle. I weigh a lot more! My name is Kevin Lehrbass. We live in Markham Ontario Canada.

I’ve been a Data Analyst since 2001. Microsoft Excel is my favorite software but Cali & Fenton get mad at me if I spend too much time in Excel.

Their hobbies: getting treats, barking at squirrels, naps with me on the couch.


(download my **Excel file**)

Even if your vlookup syntax is correct it might only be telling part of the story. **HUH?!**

*Vlookup will only retrieve the first answer*. What if your lookup_value is found multiple times in your data-set? Which answer is correct?

Assuming that your lookup_value is only found once is a dangerous assumption.

I can’t examine every row in a dataset so I use various methods to double check my results.

I often use countifs to confirm how many times the lookup_value is found in the data-set. *If it’s only found 1 time then vlookup works*.

There are other ways to verify if you have unique values in a column:

- **Pivot Table**: put the lookup_value column into the row label area and also into the values area as a count
- **Remove Duplicates**: make a copy of the column and remove the duplicates. If nothing is removed, there weren’t any duplicates
- **Conditional Formatting**: highlight the duplicates
- **Formula**: compare =COUNTA(Data!A2:A27) with =SUM(1/COUNTIFS(Data!A2:A27,Data!A2:A27)) (array formula)
- **Find**: Excel’s find feature can search for a specific lookup_value (shortcut = ‘Ctrl F’)

Did I miss any?

If countif shows multiple matches what should we do?

- The first match wins (regular vlookup or index/match)
- The last match wins =LOOKUP(1,1/(Data!$A$2:$A$27=$C6),Data!$B$2:$B$27)
- Create a **concatenated key** to properly identify the value
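To make the "first match" versus "last match" difference concrete, here is a small Python sketch. The keys and values are made-up sample data and the function names are hypothetical. Counting the key first is the same double check that countifs gives you.

```python
# (lookup key, value). The key "A100" appears twice on purpose.
data = [("A100", 10), ("B200", 20), ("A100", 30)]


def count_key(data, key):
    """How many times the key appears, like COUNTIFS on the lookup column."""
    return sum(1 for k, _ in data if k == key)


def first_match(data, key):
    """What a regular vlookup or index/match would return."""
    return next((value for k, value in data if k == key), None)


def last_match(data, key):
    """What the LOOKUP(1,1/(...)) pattern would return."""
    result = None
    for k, value in data:
        if k == key:
            result = value
    return result


occurrences = count_key(data, "A100")
first_value = first_match(data, "A100")
last_value = last_match(data, "A100")
print(occurrences, first_value, last_value)
```

When the count is greater than 1, the first and last answers can differ, and neither one is automatically the right one.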

My name is Kevin Lehrbass. I’m a Data Analyst.

I live in Markham Ontario Canada.

When working with data assumptions are dangerous.

Over the years I’ve found many ways to double check results even when I’m in a hurry.


We want the CAGR for 2006 to 2013 (sheet ‘CAGR answer proof’ in my **Excel file**).

- logic **=((EndYearAmount/StartYearAmount)^(1/NumberOfYearIntervals))-1**
- formula **=((C11/C4)^(1/7))-1**
- answer **0.032926657892244** in cell H7

We’ll use column D to increase original 302.50 amount in 7 steps (years or periods)

- Cell D4 has formula **=C4** (cell C4 is 2006 amount or year 0 amount)
- Cell D5 has formula **=D4*(1+$H$7)** Drag it down to cell D11
- Cell D11 calculated amount = original ending amount in cell C11

In step 2 (cell D5) we add 1 to our CAGR answer to get 1.032926657892244 so that column D amount increases as we drag formula down.

CAGR formula (Compound Annual Growth Rate) is used to analyze and compare investments.

The CAGR formula below does all steps in a single formula. The pic above shows you what happens inside the formula year by year. Pic below goes into more detail.

**=((EndYearAmount/StartYearAmount)^(1/NumberOfYearIntervals))-1**

CAGR *doesn’t* take the difference between EndYear and StartYear amounts divided by number of years. It would be easier to add 11 to starting amount 302.50 seven times to end up with 379.50 (columns G & H) but that’s not a CAGR. Let’s walk through the compounding nature of CAGR:

- Column E shows yearly increases: (1+CAGR) X previous amount. Cell E11 = D11 year end value
- Column F yearly increase difference (312.46 – 302.50 = 9.96 in cell F5) is a compounding amount (starts smaller, ends larger). Cell F17 average of F5 to F11 = 11.
- Column F has compounding amounts, column G is always the same amount

Does this help explain CAGR’s compounding nature? In my Excel file see sheet ‘what CAGR does’. All 3 examples show CAGR starting smaller and ending larger. The mid point (year 4 in this example) is almost identical to the average. The charts show this well.

The most common pitfall is incorrectly entering the number of years.

If we want the CAGR for 2006 to 2013 that’s 7 intervals NOT 8! I’d suggest altering the formula to this:

**=((EndYearAmount/StartYearAmount)^(1/(EndYear-StartYear)))-1**

When replacing the text above with cells references it looks like this:

**=((C11/C4)^(1/(B11-B4))-1)** dollar amounts in column C, years in column B.
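Here is a Python sketch of the safer formula, using the 302.50 and 379.50 amounts and the 2006 and 2013 years from this post. The function name `cagr` is my own. Deriving the number of intervals from the years removes the 7 versus 8 pitfall, and the loop repeats the column D proof by compounding the starting amount year by year.

```python
def cagr(start_amount, end_amount, start_year, end_year):
    """Compound annual growth rate with intervals derived from the years."""
    intervals = end_year - start_year  # 2013 - 2006 = 7 intervals, not 8
    return (end_amount / start_amount) ** (1 / intervals) - 1


rate = cagr(302.50, 379.50, 2006, 2013)

# Proof: grow the starting amount by (1 + rate) once per interval.
amount = 302.50
for _ in range(2013 - 2006):
    amount *= 1 + rate

print(rate, amount)
```

The compounded amount lands back on the ending amount (apart from tiny floating point differences), which is the same check as comparing cell D11 with cell C11.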

CAGR is one of many financial calculations (e.g. IRR, MIRR, NPV). Each has its uses and limitations.

Would you like to dive deeper into the math? I recommend these articles:

My name is Kevin Lehrbass. I’m a Data Analyst. I live in Markham Ontario Canada.

It wasn’t my dog Cali who asked me to prove that my CAGR was correct. But it was a real question from just last week. We should question how a solution works. Imagine all the errors that would be eliminated if all solutions were reviewed carefully.

Fenton and Cali went to the vet this past weekend for shots and a TRIM. They look so different with short hair.


Ankur Shukla’s **comment** reminded me that the fastest way to solve this challenge is to use FlashFill! If the pattern is consistent let FlashFill solve it! Type in the pattern (usually 1 or 2 entries is enough), click ‘Data’ and ‘Flash Fill’. Done!

The **array solution** is slow to calculate on a large data-set but it’s beautiful! And sometimes we need a dynamic formula (Power Query requires a refresh, Flash Fill needs rerunning). Arrays should be used sparingly like a fine wine but they can calculate the impossible and I love them! (see end of post for alternative array)

**Kunle SOPEJU** sent me his Excel Power Query solution. It’s amazing! **WHY?** (see end of post for alternative PQ solution)

If we had thousands of rows of data the array would be heavy. Why use so much calculation effort on extracting the largest number in each cell when Power Query can do the heavy lifting and drop the answers back into the sheet! Also, if we learn Power Query we can automate other tedious tasks!

Download **the Excel file** and follow my detailed explanation of Kunle’s Power Query solution.

I can confidently use basic features in Power Query but I’m not an advanced user (yet). To learn more I’ll audit Kunle’s solution!

**Input & Output**

We start in worksheet “**sheet1**” (seen above).

The blue table is the original data. The green table is what Power Query exports back to the sheet.

We see the largest number results in column D! Let’s audit the steps inside Power Query.

**Opening Power Query**

Let’s enter the magical world where all the action takes place!

- Select any cell in the green table
- On the ribbon select ‘Query Tools’ and ‘Edit’

**First Glance**

The original untouched data is below on the left (column ‘AlphaNum’).

PROPERTIES shows the query name: “**Stepwise – Largest Number in String**”

APPLIED STEPS lists each step. Oyekunle has clearly labeled the steps for us.

Next, let’s explore the details of each applied step.

**STEP 1 “Source”**

= Excel.CurrentWorkbook(){[Name="Table6"]}[Content]

Pic above shows us that the data is loaded from “Table6” (before any steps take place).

**STEP 2 “AddCol – Txt2List”**

Click Applied Step **AddCol – Txt2List** to see this:

We can see each step’s full code in the formula bar but it’s a bit confusing.

**There are two ways to dig into the details to help understand what’s going on:**

(a) click the circular gear icon to the right of this step’s name

Now you should see this “Custom Column” box. This step uses this text function:

=Text.ToList( [AlphaNum] )

**Tip**: click **here** to see Microsoft’s definition & example of Text.ToList

(b) in column “Txt2List” click the white space next to “List” in any row.

You’ll see each individual character split out into a vertical list like this:

**STEP 3 “Add Col – Replace Txt with Spaces”**

Click applied step “Add Col – Replace Txt with Spaces”. Click the gear to see the function:

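List.Transform( [Txt2List] , each if Value.FromText( _ ) is text then " " else _ )

It looks at each character in the list. If Value.FromText can’t turn the character into a number, the result is still text and the character is replaced with a space. Digits are kept as they are.

In column “TransformList” click any white space beside “List”. The letters have been replaced by spaces!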

**STEP 4 “Add Col – CombineText”**

Click applied step “Add Col – CombineText”

Clicking the gear reveals: Text.Combine( [TransformList] )

Spaces and numbers in the vertical list are flipped back into a horizontal cell.

**STEP 5 “Add Col – SplitText by Spaces”**

Click applied step “Add Col – SplitText by Spaces”

Clicking the gear reveals: Text.Split( [CombnTxt] , " " )

Numbers separated by spaces in each cell are now split apart in this new list. COOL !

In column “SplitTxt” click any white space beside “List” to reveal:

The list contains all the individual numbers! (but they are stored as text)
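If M is new to you, steps 2 to 5 can be sketched roughly in Python. This is only an illustration and not part of Kunle’s workbook: the function name `digit_tokens` and the sample value are made up, and the sketch assumes that only the characters 0 to 9 count as numbers.

```python
from typing import List


def digit_tokens(alpha_num: str) -> List[str]:
    """Hypothetical helper mirroring steps 2 to 5 of the query."""
    # Step 2 (like Text.ToList): split the string into single characters.
    chars = list(alpha_num)
    # Step 3 (like List.Transform): keep digits, turn everything else into a space.
    spaced = [ch if ch in "0123456789" else " " for ch in chars]
    # Step 4 (like Text.Combine): join the characters back into one string.
    combined = "".join(spaced)
    # Step 5 (like Text.Split on " "): split on every single space.
    return combined.split(" ")


# Made-up sample value, not taken from the demo file.
print(digit_tokens("ab12cd345e6"))  # prints ['', '', '12', '', '345', '6']
```

Notice that in Python, splitting on a single space leaves empty strings wherever two spaces sit side by side, so the numbers still need to be picked out of the list afterwards.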

***We are almost finished now!***

**STEP 6 “Add Col – Transform Txt to Numbers”**

Click the gear to see function: List.Transform( [SplitTxt] , each Value.FromText( _ ) * 1 )

Each text number is multiplied by 1 to convert into a real number.

**STEP 7 “Add Col – Obtain Largest Number in List”**

Function List.Max ( [#"ListTransform – Txt2Number"] ) grabs the largest number in the list!
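The idea behind steps 6 and 7 (convert the text numbers, then take the maximum) can also be sketched in Python. Again this is just an illustration under my own assumptions: `largest_number` is a made-up name, the input list is a made-up example, and the sketch chooses to skip empty strings and return `None` when no number is found.

```python
from typing import List, Optional


def largest_number(tokens: List[str]) -> Optional[int]:
    """Hypothetical helper mirroring steps 6 and 7 of the query."""
    # Step 6: convert each text number into a real number.
    # Empty strings are skipped (an assumption of this sketch).
    numbers = [int(token) for token in tokens if token != ""]
    # Step 7 (like List.Max): grab the largest number, if there is one.
    return max(numbers) if numbers else None


# Made-up list of text numbers, similar to what a split step produces.
print(largest_number(["", "", "12", "", "345", "6"]))  # prints 345
```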

**STEP 8 “Removed Other Columns”**

Now that we have our final answer we can remove the intermediate columns and keep only “AlphaNum” and “LargestNumber in String”.

**Send Query Answers to Sheet1**

In the top left of the ‘Home’ tab Kunle clicked the “**Close & Load**” drop down (NOT the icon above) and then “**Close & Load To…**” to choose exactly which sheet and cell receive the answers!

Currently, Oyekunle offers his services as a resource person to training and consulting outfits in Sales and Data Analysis. He is a Faculty Member/Consultant with **AutusBridge Consulting Ltd** (Training & Consulting) in Lagos, Nigeria. He is a Certified Tutor of both the Institute of Sales Management, United Kingdom (ISM-UK) and Cambridge Professional Academy, UK.

He is particularly interested in bridging the Data Science Gap. His favorite data analysis tools and apps are MS Excel, Power Query with M-Language, Power Pivot with DAX and of course Power BI.

Oyekunle has worked in Pharmaceuticals, FMCG and Telecommunications both in Nigeria and Ghana.

Ankur is an accountant and Excel guru from Lucknow, Uttar Pradesh, India. You can find him on **LinkedIn** and **ExcelForum.com**.

Thanks Ankur for all the comments & suggestions you’ve made on my YouTube channel!

Thank you **Kunle** for your solution! Auditing it has helped me understand Power Query Lists!

**What’s the BONUS?**

Inside Power Query click ‘Queries’ (left side). We see Oyekunle’s “Stepwise – Largest Number in String” query that we just audited.

Audit other queries & custom functions like “fnLargestNumberInAString” to learn more!

Instead of writing each step individually you could open the ‘Advanced Editor’ and write the M code.

This requires a lot of practice but some can do it! Here is Oyekunle’s M code:

let

Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],

#"AddCol – Txt2List" = Table.AddColumn(Source, "Txt2List", each Text.ToList( [AlphaNum] )),

#"Add Col – Replace Txt with Spaces" = Table.AddColumn(#"AddCol – Txt2List", "TransformList", each List.Transform( [Txt2List] , each if Value.FromText( _ ) is text then " " else _ )),

#"Add Col – CombineText" = Table.AddColumn(#"Add Col – Replace Txt with Spaces", "CombnTxt", each Text.Combine( [TransformList] )),

#"Add Col – SplitText by Spaces" = Table.AddColumn(#"Add Col – CombineText", "SplitTxt", each Text.Split( [CombnTxt] , " " )),

#"Add Col – Transform Txt to Numbers" = Table.AddColumn(#"Add Col – SplitText by Spaces", "ListTransform – Txt2Number", each List.Transform( [SplitTxt] , each Value.FromText( _ ) * 1 )),

#"Add Col – Obtain Largest Number in List" = Table.AddColumn(#"Add Col – Transform Txt to Numbers", "LargestNumber in String", each List.Max ( [#"ListTransform – Txt2Number"] )),

#"Removed Other Columns" = Table.SelectColumns(#"Add Col – Obtain Largest Number in List",{"AlphaNum", "LargestNumber in String"})

in

#"Removed Other Columns"

Select ‘Home’ at the top and then ‘Advanced Editor’ to see this code.

It takes practice to understand M code. Auditing the steps carefully makes it easier. Remember that:

- a step name that contains spaces or special characters is written as #"…" (a plain name like Source doesn’t need the “#”)
- each step refers to the previous step
- values were juggled back & forth between cells and lists
- various functions were used
- Oyekunle clearly renamed each step (easier to audit !)

I’ve been taking this amazing **course** taught by Ken Puls and Miguel Escobar. I’ve learned so much!

Disclaimer: I’m a student and an affiliate.

YouTuber ‘**GaribaldiInTheMaking**‘ suggested this alternative array formula that uses the versatile AGGREGATE function. It’s a longer formula that might be faster!

**Daniel Choi** (his **Excel blog**) shared three Power Query solutions with me! Thanks Daniel! Unfortunately they don’t work on my version of Excel 2016 but feel free to **download** his file.

In the comments below Bill Szysz suggested a one step Power Query solution:

let

Source = Table.AddColumn(Table.TransformColumnTypes(Excel.CurrentWorkbook(){[Name="Table6"]}[Content],{{"AlphaNum", type text}}), "Largest", each List.Max(List.Transform(Text.Split(Text.Combine(List.Transform(Text.ToList([AlphaNum]), each try Text.From(Number.From(_)) otherwise " ")), " "), each try Number.From(_) otherwise null) ) )

in

Source
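To see the “try … otherwise” idea from Bill’s one step query in another language, here is a compact Python sketch. It is my own rough translation and not Bill’s code: the function names and sample values are made up, `try`/`except` stands in for `try … otherwise`, and because Python’s `max` does not skip `None` values the sketch filters them out itself.

```python
from typing import Optional


def to_number_or_none(token: str) -> Optional[int]:
    """Try to convert text to a number; give back None if that fails."""
    try:
        return int(token)
    except ValueError:
        return None


def largest_number_in_string(text: str) -> Optional[int]:
    """Hypothetical one-function version of the whole pipeline."""
    # Characters that can't be converted to a number become spaces.
    spaced = "".join(
        ch if to_number_or_none(ch) is not None else " " for ch in text
    )
    # Split on spaces and try to convert every piece to a number.
    numbers = [to_number_or_none(piece) for piece in spaced.split(" ")]
    # Drop the failed conversions, then take the largest number (if any).
    return max((n for n in numbers if n is not None), default=None)


# Made-up sample values, not taken from the demo file.
print(largest_number_in_string("ab12cd345e6"))  # prints 345
print(largest_number_in_string("abc"))          # prints None
```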

I think both the step by step method by Kunle and Bill’s one step method are amazing! Sometimes we need to break things down into steps to see exactly how they work. And then it’s also great to know how to create a short compact solution.

My name is Kevin Lehrbass. I live in Markham, Ontario, Canada.

These are my dogs Cali and Fenton. They sit with me when I write my blog posts. They know when I need a break from my laptop.

I’ve been a Data Analyst since 2001. My favorite software is Microsoft Excel. I’m currently learning Power BI.
