One of my favorite co-workers, Ben Klopfer, wrote a post back in October 2012 that has been one of our most popular and the comments keep coming in. In it, he espouses the new INDEX-MATCH functions that can be nested to vastly outperform VLOOKUP and HLOOKUP. VLOOKUP is my go to function. I tend to feel very comfortable with relational databases so I just think in two dimensional tables. I’ve found uses for HLOOKUP (which is basically VLOOKUP rotated ninety degrees), but I would truly lose a lot of utility out of Excel if VLOOKUP or a similar function didn’t exist, as it easily searches tables.
In Ben’s post, he addresses four limiting issues of VLOOKUP:
- Your data range is limited to a table. I’m okay with that for now; most of my lookups are in a tabular format.
- VLOOKUP always searches the left-most column. Yes, this is a pain.
- You can only specify return values by index number. Well, I am very familiar with VLOOKUP and can use it in my sleep. I’m okay with that.
- VLOOKUP provides a very limited approximate match feature (where data must be sorted low to high). Yes, I’ve run across this issue and just resorted my table. But yeah, I’m not fond of that workaround.
So Ben eloquently explains how INDEX-MATCH can be combined to address these four issues. One of my co-workers ran into an issue with #2 and I directed him to Ben’s post. Problem solved.
But when you get right down to it, if your data was in a tabular format, anyway, wouldn’t it be great if VLOOKUP could just look to the left? I mean your data’s right there! And that sorting issue Ben mentions in #4. Why didn’t Microsoft just decide to add one more parameter to allow you to specify the sort order?
I decided to do something for the little man (I’m not that little, so I guess that makes me magnanimous) and extend the function myself.
Imagine the following worksheet.
I have this spreadsheet divided in four quadrants, which is convenient because it’s my understanding they normally come in groups of four. For simplicity’s sake, let’s name the top left one Q1, the top right one Q2, the bottom left one Q3, and the bottom right one Q4.
Q1 contains a table of grades and the threshold to meet each one. It would appear that to make an A, you would need a 90 or above, a B requires an 80, and so on.
Q2 contains a Grade-Threshold pair which looks up a numeric value for a grade (so A would be 90, and B is 80 as the example shows) and a Percent-Grade pair which looks up a grade based on a percentage. The first pair is very easy to do with VLOOKUP where the formula in cell F2 would be: =VLOOKUP(E2,A2:B6,2,FALSE). This formula looks at the value in cell E2 (a B), considers the range of cells A2:B6, finds the matching row containing the value B, and returns the second column’s value in cell B3: 80. The range lookup is turned off so exact matches are required. This makes sense, since the grade letters wouldn’t come in a range.
However, the threshold of the grades does define a range. If you were to look at the table in Q1, it would appear that from 0 to less than 60 is an F, from 60 to less than 70 is a D, and so on. VLOOKUP has two problems here, both highlighted by Ben back in 2012. First, you can’t look left so you can’t look at the numbers in column B and pull a grade from the range. Second, the range is from high to low, something VLOOKUP’s range ability cannot deal with.
Q3 contains a table which solves both of these issues for the standard VLOOKUP function: by moving the numbers to the first column and the letters to the second; and by resorting the numbers from low to high. Now, the Percent-Grade pair in Q4 will work. The formula in cell F10 is: =VLOOKUP(E10,A10:B14,2,TRUE). You’ll notice that the range parameter has been set to TRUE. This allows the range ability of VLOOKUP to function, where 91 as shown in cell E10 returns an A in cell F10.
But the problem here is that you now have two tables which have the exact same information but must be duplicated to present the data differently. This can cause problems, especially if you change one table and forget to update the other one.
It just so happens that the formula in Cell F5 in Q2 gets around this problem. It queries the table in Q1 just like cell F2 does. However, it looks backwards. It uses the range of numbers from B2:B6 and determines the appropriate grade, even though the sort is from high to low; and this is what the formula looks like: =VLOOKUP_2(E5,B2:B6,0,TRUE,TRUE).
I’m probably not the most creative person in the world. I could have named this MarksVLOOKUP but I wanted to save you the typing, so I just put an underscore in the name, VLOOKUP_2. Once you see the VBA, you’ll be able to change the function’s name to whatever you want; but I digress.
You’ll notice an extra parameter. It’s called LargeToSmall and it defaults to false. In this case, we set it to TRUE because our range sort is from high to low. I probably did better naming this argument than I did naming the function. If you don’t include it, VLOOKUP_2 will behave exactly as VLOOKUP does, except for the whole search to the left thing.
I built this function because I figure there are a lot of VLOOKUP affectionados out there who just want their function to step up a notch or two. I am in no way a VBA expert; I’ve not touched the stuff in many years, actually.
If you want to build this spreadsheet (it’s pretty easy) and test out VLOOKUP_2 for yourself, I’ll iterate through the formulae for you:
I set E10 so that I could update the cell E5, which accepts a number and calculates a grade using VLOOKUP_2 and have E10 automatically update so that F10 would give me the result VLOOKUP gives with a properly sorted table.
For all this to happen, you’ll need to create a VBA function. In order to do this in Excel, press ALT-F11 and the VBA development environment will open. In the upper left corner you’ll see the following:
Right-click Microsoft Excel Objects > Insert > Module (not Class Module).
Insert the following code:
Function VLookup_2(lookup_value As String, key_values As Range, col_index_num As Integer, Optional range_lookup As Boolean, Optional LargeToSmall As Boolean)
Dim testValue As Variant, valueOut As Variant, lastMatch As Variant
Const indexAdder = 1 ' make 0 to make col_index_num zero based, make 1 to act like VLOOKUP where 1 is the key column
Const maxNumberOfBlanks = 10000 ' number of blank keys found in a row which will make routine time out
valueOut = "": lastMatch = "" ' if not set, or set to empty, defaults out as zero which is not expected behavior
Dim bValueFound As Boolean, valueFound As Variant
Dim countOfBlanks As Long
Dim bIsEmpty As Boolean
For n = 1 To key_values.Rows.Count
testValue = key_values.Cells(n, 1).Value
valueFound = key_values.Cells(n, 1 - indexAdder + col_index_num).Value
bIsEmpty = isEmpty(valueFound)
If bIsEmpty Then
countOfBlanks = countOfBlanks + 1
countOfBlanks = 0
If testValue = lookup_value Then
If range_lookup And Not LargeToSmall Then
lastMatch = valueFound
valueOut = valueFound
bValueFound = True
ElseIf range_lookup Then
Dim compareTest, compareLookup
' we want to compare numbers like numbers; otherwise
' if we have 300, 400, 500 as key values and our search term is 3000, we'd
' return 300 instead of 500.
If IsNumeric(lookup_value) And IsNumeric(testValue) Then
compareTest = Val(testValue)
compareLookup = Val(lookup_value)
compareTest = CStr(testValue)
compareLookup = CStr(lookup_value)
If (compareTest < compareLookup) Then
bValueFound = True
If Not isEmpty(key_values.Cells(n, 1 - indexAdder + col_index_num)) Then
If LargeToSmall Then
valueOut = valueFound
lastMatch = key_values.Cells(n, 1 - indexAdder + col_index_num).Value
If bValueFound And ((compareTest > compareLookup) Xor LargeToSmall) Then Exit For
If countOfBlanks > maxNumberOfBlanks Then Exit For
If range_lookup And bValueFound And Not LargeToSmall Then
valueOut = lastMatch
If Not bValueFound Then valueOut = CVErr(xlErrNA) ' set value out as #N/A if value is not found in list
VLookup_2 = valueOut
Hopefully my code’s not the funniest thing about this blog.
I did want to highlight a couple of things in the code. You’ll notice that VLOOKUP’s third parameter defines the return column, where column 1 is the left-most column. To augment compatibility with this new function, column 1 is still the key column, and the return columns to the left would start at 0 and go negative from there. However, while developing this solution, I realized that having the key column be 0 made more sense to me. Then positive numbers denoted offsets to the right and negative numbers denoted offsets to the left. The comments in the code should let you make this change very easily, should you so desire.
It may take a bit more digging, but you might also realize that the range parameter passed to this function only needs to span one column: the key or lookup column. Whether you offset your return column to the left or right, the range doesn’t need to cover the columns in the entire table. It just needs to define the key values to be searched upon. In fact, if you define multiple columns in your range, the key range used for searching will be the first column and the other columns would be ignored by VLOOKUP_2 in terms of determining key values to search for.