Percentile Scoring: What It Is and When to Use It

There are limited benefits to percentile scores, which reflect rank order and nothing more.

Sept. 21, 2015

Most people have a vague understanding, at best, of what percentile scoring actually is. They know it has something to do with "curving the scores" but that’s about it. Let’s fix that.

A percentile score reports what share of the test takers in your group you outscored, expressed as a percentage. That’s all it does.

So, if there were 100 test takers and you had the top score, you outscored 99 others and would earn a 99th percentile score. If you had the seventh score down from the top, you outscored 93 others, for a 93rd percentile score.

This works for any number of test takers. If there were 15, for example, and you got the top score, you outscored 14 out of 15, or about 93 percent of the group, so that would put you in the 93rd percentile. If you had the third score down, you outscored 12 out of 15, or 80 percent, putting you in the 80th percentile.

The calculation is (how many you outscored) divided by (how many in the group) times 100.

So if you outscored 37 in a group of 65, it would be (37/65) * 100 = 56.9, which rounds to 57 percent, or the 57th percentile.
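The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_score` is an assumption made for this example, and rounding to the nearest whole number is chosen to match the figures quoted in this article, not taken from any testing standard.

```python
def percentile_score(outscored: int, group_size: int) -> int:
    """Percent of the group a candidate outscored, rounded to a whole number.

    Assumption: the group size includes the candidate, as in the article's
    examples, so the top scorer outscores group_size - 1 people.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if not 0 <= outscored < group_size:
        raise ValueError("outscored must be between 0 and group_size - 1")
    # (how many you outscored) / (how many in the group) * 100
    return round(outscored * 100 / group_size)


print(percentile_score(99, 100))  # top score among 100 test takers
print(percentile_score(14, 15))   # top score among 15
print(percentile_score(12, 15))   # third score down among 15
print(percentile_score(37, 65))   # the worked example above
```

Run as written, this prints 99, 93, 80 and 57, the same values worked out by hand above.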

Why percentile scoring?

So what does this actually tell us? In two words, not much. If you got the top score but the whole group failed miserably, you would still turn in an impressive percentile score. And if everyone scored in the 90s, the lowest scorer would still turn in a zero percentile. Percentile scores tell us nothing about the raw scores or actual capabilities of any one candidate. All they respond to is rank order.

So percentile scores reflect rank order and nothing more. But they do reflect that. Anyone who was higher than someone else in rank order is higher in their percentile score too. There is a one-to-one correspondence between rank order and percentile score.
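To make the point concrete, here is a minimal sketch that turns raw scores into percentile scores. The function name, the candidate labels and the raw scores are all invented for illustration. It also assumes that a tie counts as not outscoring the other person, which is a choice this article does not address.

```python
def percentiles_from_raw(raw_scores):
    """Map each candidate to the percent of the group they outscored.

    Assumption: only strictly lower scores count as "outscored".
    """
    group_size = len(raw_scores)
    result = {}
    for candidate, score in raw_scores.items():
        outscored = sum(1 for other in raw_scores.values() if other < score)
        result[candidate] = round(outscored * 100 / group_size)
    return result


# Two hypothetical groups: one where everyone failed, one where everyone excelled.
weak_group = {"A": 48, "B": 41, "C": 35, "D": 29, "E": 22}
strong_group = {"A": 99, "B": 97, "C": 95, "D": 93, "E": 91}

print(percentiles_from_raw(weak_group))
print(percentiles_from_raw(strong_group))
```

Both calls print `{'A': 80, 'B': 60, 'C': 40, 'D': 20, 'E': 0}`. The raw scores differ widely between the two groups, but the rank order is the same, so the percentile scores are identical.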

So one might ask, who would ever use such an uninformative scoring scheme? The answer may surprise you. The SAT reports percentiles, for one, as do the GRE, MCAT, LSAT and others. IQ tests are scored relative to other test takers as well. So the use of percentiles is well established. But again, no actual measure of a test taker’s capabilities is captured in a percentile score, only how the test takers did relative to each other.

When, then, would a fire department ever want to use these? There is one application.

That is when you must hire or must promote from the existing candidate pool. It may be that you only had five lieutenant candidates, and all turned in failing scores, but you still need to promote one of them. So, intuitively, you promote the one who did best, but that can be a hard sell with a failing score. But this candidate, in percentile terms, got a score of (4/5) * 100 = 80th percentile. So that can be an easier sell, and it is an established way of scoring.

That said, we don’t know if our 80th percentile candidate got a 95 on the test or a 35, only that everyone else did worse. Small comfort if we need the guy to actually be proficient. Still, in real life, sometimes we have to put a person into a position before we can give them the intense training that they need to do the job well. It can be a start, and sometimes the only feasible start.

The bad news is that this works on the other end too. The candidate who came in third, say, out of those five lieutenant candidates, may have gotten a 75 on the actual test, but will show a 40th percentile score, an apparent failing grade.

So it’s a specialized tool, with perhaps only one useful application, that being the situation where you have to promote someone from a small candidate pool. In a pool as small as four, the top person will score above the 70th percentile (3 out of 4 is the 75th), and higher if the pool is larger.
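The effect of pool size on the top score follows directly from the formula: the top scorer in a pool of n candidates outscores n - 1 of them. The short sketch below tabulates that ceiling for a few pool sizes. The function name `top_percentile` and the particular pool sizes are chosen here only for illustration.

```python
def top_percentile(pool_size: int) -> float:
    """Percentile score of the top candidate in a pool of the given size."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # The top scorer outscores everyone except themselves.
    return (pool_size - 1) * 100 / pool_size


for pool_size in (2, 3, 4, 5, 10, 20):
    print(pool_size, round(top_percentile(pool_size), 1))
```

This prints 50.0, 66.7, 75.0, 80.0, 90.0 and 95.0 for the six pool sizes. A pool of three leaves the top candidate just under 67, so four is the smallest pool in which the top score clears 70.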

Remedies for low scores

There are a couple of other remedies for low scores that may be available to you. The key to all of them, however, is that they must be applied before you know who is attached to which score, or else you can be accused of manipulating the scores in favor of one particular candidate, or conversely, to the detriment of someone in particular.

Sometimes you can add a flat 5 or 10 points to all scores, as long as that does not put anyone over 100 percent, and as long as your jurisdiction does not have an established minimum cutoff score, such as 70 percent. Another option is to weight various elements in a test so that if one part of it is especially hard, it counts less. It is best to have the advice of a testing company if you want to explore those things, or better yet, have them do it. The big worry is always to make sure any adjustments are valid, and do not create any impression of favoritism on your part. And, whatever we do, we cannot alter the rank order. That must remain intact.
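As a rough sketch of the flat-points adjustment, the code below adds the same bonus to every score, refuses to proceed if anyone would go over 100, and confirms that the rank order is unchanged. The function names, the anonymous candidate IDs and the scores are all hypothetical. The sketch does not check for a jurisdictional cutoff score, and it is not a substitute for a testing company's advice.

```python
def add_flat_bonus(raw_scores, bonus):
    """Add the same number of points to every score, capped by a 100-point check."""
    adjusted = {cid: score + bonus for cid, score in raw_scores.items()}
    if any(score > 100 for score in adjusted.values()):
        raise ValueError("bonus would push at least one score over 100")
    return adjusted


def rank_order(scores):
    """Candidate IDs from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)


# Scores keyed by anonymous ID, so the adjustment is made before anyone
# knows which candidate is attached to which score.
raw = {"C-101": 62, "C-102": 58, "C-103": 71, "C-104": 49, "C-105": 66}
adjusted = add_flat_bonus(raw, 10)

print(adjusted)
print(rank_order(raw) == rank_order(adjusted))
```

Because every candidate receives the same bonus, the final line prints `True`: the order from top to bottom is exactly what it was before the adjustment.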

The best option for raising scores in future testing is to reduce the size of the book list. Some jurisdictions have eight books on the reading list, and that can be overwhelming. You can take some out, or select only certain chapters from some of them. You can also switch books out, perhaps replacing a more advanced book on a special topic with a more general or basic one. If you use a commercial testing company, they can advise you on these changes.

And that is the story on percentile scoring. A specialized tool with perhaps only one useful application, but possibly one that can save the day on those occasions it is needed.

HENRY MORSE, BA, MA, BA, NFPA Instructor Level IV, is the president of Fire Service Testing Company, Inc., which tests emergency services jurisdictions across North America for entry and promotion of personnel. Author of a number of books, including Emergency Services Personnel Testing Practices (2013), Preparing for Emergency Services Testing (2005), and others, he is a member of the NFPA 1001 Technical Committee and speaks on these topics and others related to testing and communication.

Contributors:

Henry Morse