It’s easy to create some random data using RANDBETWEEN and INDEX functions. But what if you want to create random data that favors or repeats some values over others? People have told me that they spend time copying and pasting or manually dragging the values to repeat them. Keep reading to see how we can automate this with just four easy steps!
Let’s use population by state for the United States.
STEP 1 EXPANDING SUM FORMULA (Cumulative Sum)
We have our original data: ‘State‘ and ‘Population‘. We add this simple formula in cell C6 =B6+C5 alongside all of our data.
STEP 2 CREATE RANDOM NUMBERS
We add this formula =RANDBETWEEN(1,’Original Data’!$D$5) that creates random numbers between 1 and 319451069 (total sum of the population).
STEP 3 APPROXIMATE MATCH INTO STEP 1
We add this formula =MATCH(A6,’Original Data’!$C$5:$C$55,TRUE) that bins our random number back into our expanding sum. We bin by using TRUE for approximate match.
STEP 4 INDEX FUNCTION GETS OUR ANSWER
We add this formula =INDEX(‘Original Data’!$A$6:$A$55,B6) The index function references our text ‘State’ values and uses the result of the match function (Step 3) to display a state.
Excel File & YouTube Video
Step 1 runs alongside your original data. Drag it down further if you have more data.
Steps 2, 3, and 4 are dragged down as far as you want in sheet ‘Randomly Created Data’. If you don’t drag the formulas down very far you’ll see more variation.
Sheet ‘Pivot & Formulas to Verify’ shows how close our randomly generated data is to our original state population data.
That’s it, we are finished!
Can We Vary The Randomness?
Say what? What if you want more variation in your random data. After all, isn’t it supposed to be random?? Good point.
Fasten your seat belts ladies and gentlemen. It’s about to get nerdy!
Random Variation Options
I have extended our solution to give us these three options:
- Use original expanding range (smallest variation)
- I want some variation. Currently set to 30.0% Based on value in cell F3
- I want to add variation manually (using columns H & I)
Option 1: Smallest variation if we’ve dragged down the formulas far enough in sheet ‘Randomly Created Data’
Option 2: Type in a % value in cell F3 that will increase or decrease (it’s random) the individual state population
Option 3: The option to manually type in a positive or negative number to be added or subtracted to our state population
Above we see that option 2 has been selected. Randomly add or subtract a percent of each state’s population.
The randomly generated ‘State’ data in column C of sheet ‘Randomly Created Data’ is now updated. Refresh the pivot table in sheet ‘Pivot & Formulas to Verify’ to see the variation.
Isn’t that cool?
Equally Random Data-set?
Simply change the numerical values to be the same value (i.e. 1).
Random Stuff About Me
Random Facts: I’m tall but not freakishly tall unless I’m in Costa Rica for work where the average height is lower and I’m 3+ standard deviations above the average height (hence the increased stares). I met NBA legend Bill Walton at the top of a pyramid north of Mexico City. I like book stores. I have a better than average memory (according to my siblings). I am a long-suffering Maple Leafs fan (if the Red Sox and Cubs can win we can too!).
Strange: Before the age of 40 I had never traveled to Europe. I simply had to go! After the initial trip I then had two business trips to Europe (3 trips to Europe in about 4 years!). Blindfolded I can tell the difference between Starbucks, Tim Hortons and Second Cup coffee. I’m a decent chess player but I could never solve a Rubick’s Cube (I got four sides once). I have an abnormal obsession with Microsoft Excel (or so I’m told). In a previous job I was allowed into the computer server room to test if databases were running. I wasn’t allowed to type or touch anything so I had to verbally spell out the SQL queries to another IT person who had permission to type.