13 Jan 2019

=SUMPRODUCT(MID(G2,{1;2;3},1)*1)  I found this formula in a notebook. Can you understand it’s purpose? I’ll explain it in this post. I’ll also expand it for a more complex case.

 

(follow along with my Excel file.)

Audit

I had to concentrate…then I remembered what it does.

=SUMPRODUCT(MID(G2,{1;2;3},1)*1)

If you understand this formula quickly then great but if not don’t worry…let’s audit it!

 

Mid Function

=MID(G2,{1;2;3},1)

Normally MID function extracts character(s) from within a cell like this =MID(G2,3,1)

MID looks at cell G2, goes to the 3rd position and extracts 1 character. If cell G2 has 789 MID would extract “9”.

But we’ve got this MID(G2,{1;2;3},1) what does the {1;2;3} do? 

{1;2;3} is a magical array constant! It extracts the 1st, 2nd and 3rd characters as individual values.

Inside our formula in cell H2 highlight MID(G2,{1;2;3},1) and press F9 key (on laptop hold Fn key and press F9).

You’ll see this: {“7″;”8″;”9”} (press escape key to revert back to original formula). 

 

Convert Text to Numbers

Each number has been extracted individually. It’s not “789” but rather “7” and “8” and “9”.

The double quotes indicate that the numbers are in text format (MID is a text function).

We use *1 to multiply or convert each text number to a real number.

 

Sumproduct Adds Numbers

Finally, the SUMPRODUCT function adds the 7, 8 and 9. That’s why we get 24.

The purpose of the formula was to add all the digits in the cell together!

 

But Wait There’s More

 

 

 

We want to add all the digits in a cell. Below we see values of different lengths.

 

 

 

 

 

 

If this happens we have two options:

  1. take a simpler approach
  2. make our formula smarter (two variations)

 

1. take a simpler approach

Use helper columns to individually extract and convert each number and then add them up.

 

2a. make our formula smarter

As it won’t always be 3 characters we’ll change this part {1;2;3} into this:

ROW(INDIRECT(“1:”&LEN(G12)))

This creates a dynamic array constant for whatever length we need!

Cell G12 has 93782 (5 digits) so our fancy array maker creates {1;2;3;4;5}

Isn’t that cool? So our smarter formula is:

=SUMPRODUCT(MID(G12,ROW(INDIRECT(“1:”&LEN(G12))),1)*1)

 

2b. make our formula smarter

This solution was suggested by Robert H. Gascon (see his comment below!)

As it won’t always be 3 characters we’ll change this part {1;2;3} into this:

ROW(A$1:INDEX(A:A,LEN(G22)))

This creates the exact same dynamic array constant as 2a except it’s NOT volatile (no INDIRECT).

Brillant! Robert’s formula solution:

=SUMPRODUCT(–MID(G22,ROW(A$1:INDEX(A:A,LEN(G22))),1))

 

 

Caution

Even-though the formula solves the more complex case (variable length) we have to be careful.

Why? Because using complex formulas with large datasets can cause Excel to calculate slowly and possibly crash your Excel file. Our 2a complex solution includes the volatile INDIRECT function (recalculates frequently & slowly). I’ll be honest that I’m not sure how much faster 2b solution is. It’s not using any volatile functions so it’s faster but it’s still using 5 functions.

It takes a lot of practice to get a feel for when you can use a complex formula and when it’s better to simplify. Maybe it’s time to test these two solutions using FastExcel addin by Charles Williams!

 

About Me

My name is Kevin Lehrbass. I’m a Data Analyst from Markham Ontario Canada.

Sometimes I write down formulas for future reference. I don’t always have time to create a sample file. In this case, without the data, I had to audit the formula to remember what it does.

The final dynamic formulas:

=SUMPRODUCT(MID(G12,ROW(INDIRECT(“1:”&LEN(G12))),1)*1)

and suggested formula from Robert:

=SUMPRODUCT(–MID(G22,ROW(A$1:INDEX(A:A,LEN(G22))),1))

These are brilliant, complex and dangerous. Brilliant: you can be so creative in Excel. Complex: most people won’t understand what it does. Dangerous they could be slow to calculate.

Experience teaches us when to use this approach and when to use a different approach.

4 Comments

  1. Robert H. Gascon says:

    Hello Kevin,
    To avoid the volatile INDIRECT function, we must instead use the INDEX function, as follows:
    =SUMPRODUCT(–MID(G12,
    ROW(A$1:INDEX(A:A,LEN(G12))),
    1))

    1. Kevin Lehrbass says:

      Hi Robert,
      Thank you for the INDEX approach! I still haven’t incorporated the variable length INDEX range technique into my Excel skill set.
      Is there a specific term for using INDEX like this? I’ve seen it before…I just never remember to use it. Do you have any resources/links for it?
      Thanks Robert! I’ll update my post and Excel file with your solution!
      Cheers,
      Kevin

  2. Robert H. Gascon says:

    Hi Kevin,
    In Cell F12, I entered the formula:
    =SUMPRODUCT(–MID(G12,
    ROW(A$1:INDEX(A:A,LEN(G12))),
    1))
    I then copied the formula to Cells F13:F18. In Cell F20, my formula is:
    =SUM(INDEX(F12:F18,E17):
    INDEX(F12:F18,E18))
    Cell E17 supplies the Start of the Variable Range while Cell E18 supplies the End. Whenever it is positioned before (or after) a colon (“:”), the INDEX function is COMPELLED to return the Reference of the Return Value (rather than the Value itself).
    In this case, I entered 2 in Cell E17 and 6 in Cell E18. The first INDEX function BEFORE the colon normally returns 17, which is the 2nd Value in the Range F12:F18, but was COMPELLED to return F13, which is the Reference of the Value 17.
    Likewise, the second INDEX function AFTER the colon normally returns 19, which is the 6th Value in the Range F12:F18, but was COMPELLED to return F17. So, the two INDEX functions evaluate to F13:F17, thereby resulting in a SUM of 88 in Cell F20.
    Note that when the Value of any (or both) of E17 and E18 is 0, the combination of two INDEX functions is COMPELLED to return the Whole Range, which is F12:F18, thereby resulting in a SUM of 147 in Cell F20.
    To help us remember the foregoing use of the INDEX function, let us look at the References before and after the colon in a range, as if they were returned by two INDEX functions. By the way, this use of the INDEX function is known as the Reference form! It is illustrated as the last example in Help page of the INDEX function. There was only one INDEX function therein, which returns the End of the Range. I just added another INDEX function herein, which returns the Start of the Range.
    We must use the INDEX function instead of the volatile OFFSET and INDIRECT functions. After the LOOKUP function, the INDEX function is indubitably my next favorite!
    Cheers,
    Robert

    1. Kevin Lehrbass says:

      Thank you so much for the explanation Robert. It’s an excellent and unique way to use INDEX. I’m going to write a post about that use of INDEX!
      Yes, INDIRECT is incredibly volatile and should be avoided when possible. Although it’s flexible and interesting it’s a bad idea to depend on it. Better to re-design the data layout. I don’t find OFFSET to be as volatile as INDIRECT but if it can be avoided that’s best (i.e. use INDEX).
      ….I think the volatile functions versus non-volatile functions debate is more important than the overdone vlookup versus index/match debate. What do you think?

Post a comment