StATS: What is a percentile?

The pth percentile is a value so that roughly p% of the data are smaller and (100-p)% of the data are larger. Percentiles can be computed for ordinal, interval, or ratio data.

There are three steps for computing a percentile.

  1. Sort the data from low to high;
  2. Count the number of values (n);
  3. Select the p*(n+1)th observation, where p is written as a proportion (for example, 0.50 for the 50th percentile).

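You won't always be lucky enough to have p*(n+1) come out as a nice whole number. Here are some contingencies.

  1. If p*(n+1) is not a whole number, go halfway between the two adjacent observations;
  2. If p*(n+1) is less than 1, select the smallest observation;
  3. If p*(n+1) is greater than n, select the largest observation.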
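The three steps, together with the usual ways of handling a position that is not a whole number, can be sketched in Python. This is only an illustration under stated assumptions: the function name simple_percentile and the sample values are made up for this example, p is a proportion between 0 and 1, a fractional position is resolved by going halfway between the two neighboring values, and a position below 1 or above n falls back to the smallest or largest value.

```python
import math


def simple_percentile(values, p):
    """Percentile by the sort / count / select-the-p*(n+1)th rule.

    p is a proportion between 0 and 1 (0.50 for the 50th percentile).
    The name and the fallback rules are assumptions of this sketch.
    """
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)       # step 1: sort from low to high
    n = len(ordered)               # step 2: count the values
    position = p * (n + 1)         # step 3: 1-based position to select

    if position < 1:               # position falls before the first value
        return ordered[0]
    if position > n:               # position falls after the last value
        return ordered[-1]

    nearest = round(position)
    if math.isclose(position, nearest):
        return ordered[nearest - 1]            # whole number: take that value

    lower = math.floor(position)               # otherwise go halfway between
    return (ordered[lower - 1] + ordered[lower]) / 2   # the two neighbors


sample = [30, 10, 50, 20, 40]            # made-up values, n = 5
print(simple_percentile(sample, 0.50))   # 30   (position 3)
print(simple_percentile(sample, 0.25))   # 15.0 (position 1.5)
print(simple_percentile(sample, 0.10))   # 10   (position 0.6, below 1)
print(simple_percentile(sample, 0.90))   # 50   (position 5.4, above n)
```

The comparison with math.isclose is there only to guard against floating point noise when p*(n+1) should be a whole number.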

Examples

The following data represent cotinine levels in saliva (nmol/l) after smoking. We want to compute the 50th percentile.

73, 58, 67, 93, 33, 18, 147

  1. Sorted data: 18, 33, 58, 67, 73, 93, 147
  2. There are n=7 observations.
  3. Select the 0.50*(7+1) = 4th observation.

Therefore, the 50th percentile equals 67. Notice that there are three observations larger than 67 and three observations smaller than 67.
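If you want to check this arithmetic by machine, a short Python sketch follows. The variable names are made up for this illustration; the data are the seven cotinine values above, and statistics.median from the standard library serves as a cross-check.

```python
import statistics

cotinine = [73, 58, 67, 93, 33, 18, 147]   # nmol/l, from the example above

ordered = sorted(cotinine)                  # step 1: sort
n = len(ordered)                            # step 2: count (7)
position = 0.50 * (n + 1)                   # step 3: 4.0, a whole number
median = ordered[int(position) - 1]         # 1-based position to 0-based index

smaller = sum(1 for x in cotinine if x < median)
larger = sum(1 for x in cotinine if x > median)

print(median, smaller, larger)              # 67 3 3

# The 50th percentile should agree with the standard library median.
assert median == statistics.median(cotinine)
```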

Suppose we want to compute the 20th percentile. Notice that p*(n+1) = 0.20*(7+1) = 1.6. This is not a whole number, so we select the value halfway between the 1st and 2nd observations, (18+33)/2 = 25.5. (Some people see the 1.6 and think they have to go six tenths of the way from the first value to the second. You can do this if you like, but I think life is too short to worry about such details.)
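For readers who do want to see the difference, here is a small Python sketch that computes the 20th percentile both ways: halfway between the two neighbors, and six tenths of the way from the first to the second. The variable names are made up for this illustration. As a cross-check, statistics.quantiles with method="exclusive" interpolates at the same p*(n+1) position, so its first of five cut points should match the interpolated value.

```python
import math
import statistics

cotinine = [73, 58, 67, 93, 33, 18, 147]   # nmol/l, from the example above
ordered = sorted(cotinine)
n = len(ordered)

position = 0.20 * (n + 1)        # 1.6, not a whole number
lower = math.floor(position)     # 1: the observation just below the position
fraction = position - lower      # about 0.6

low_value = ordered[lower - 1]   # 1st observation, 18
high_value = ordered[lower]      # 2nd observation, 33

# Halfway rule used in the text.
halfway = (low_value + high_value) / 2

# Alternative: go "six tenths of the way" from the 1st to the 2nd value.
interpolated = low_value + fraction * (high_value - low_value)

print(halfway)                   # 25.5
print(round(interpolated, 1))    # 27.0

# Cross-check: cut points at 20%, 40%, 60%, 80%; the first is the 20th percentile.
library_value = statistics.quantiles(cotinine, n=5, method="exclusive")[0]
assert math.isclose(interpolated, library_value)
```

The two answers differ by 1.5 nmol/l here, which gives a sense of how little the choice matters for most purposes.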

Suppose we want to compute the 10th percentile. Since 0.10*(7+1) = 0.8 is less than 1, we select the smallest observation, which is 18.

The five number summary

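A five number summary uses percentiles to describe a set of data. The five number summary consists of the minimum, the 25th percentile, the 50th percentile (the median), the 75th percentile, and the maximum.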

The five number summary splits the data into four regions, each of which contains roughly 25% of the data.
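A five number summary is easy to sketch in Python, taking it to be the minimum, the 25th, 50th, and 75th percentiles, and the maximum. The function name five_number_summary is made up for this illustration, and the data are the seven cotinine values from the earlier example. Note one assumption: statistics.quantiles with method="exclusive" uses the p*(n+1) position but interpolates between neighbors instead of going halfway, so its answers can differ slightly from the hand method when the position is not a whole number. For these seven values the positions are 2, 4, and 6, so both approaches agree.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, 25th, 50th, 75th percentile, maximum)."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return (min(values), q1, q2, q3, max(values))


cotinine = [73, 58, 67, 93, 33, 18, 147]   # nmol/l, from the earlier example
print(five_number_summary(cotinine))       # (18, 33.0, 67.0, 93.0, 147)
```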

Example of a five number summary

Percentage of body fat was estimated for a random sample of 252 individuals. The five number summary is

MAX 45.1
75% 24.6
50% 19.0
25% 12.8
MIN 0.0

The value of 0.0 is clearly in error. Either the formula for estimating percentage of body fat was applied incorrectly or the estimated percentage of body fat was intended to be coded as missing. With the 0.0 removed, the minimum value becomes 1.9.

This summary implies, for example, that a quarter of the sample had body fat percentages between roughly 25 and 45.
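The same reading can be applied to every region of the summary. The Python sketch below pairs each cut point with the next one to list the four regions; the variable names are made up for this illustration, and the minimum is taken as the corrected value of 1.9 rather than the erroneous 0.0.

```python
# Five number summary for percentage of body fat, using the corrected
# minimum of 1.9 in place of the erroneous 0.0.
labels = ["MIN", "25%", "50%", "75%", "MAX"]
summary = [1.9, 12.8, 19.0, 24.6, 45.1]

# Pair each cut point with the next one to get the four regions.
regions = list(zip(summary[:-1], summary[1:]))
region_labels = list(zip(labels[:-1], labels[1:]))

for (low_label, high_label), (low, high) in zip(region_labels, regions):
    # Each region holds roughly a quarter of the sample.
    print(f"{low_label} to {high_label}: {low} to {high}")
```

The last line printed covers 24.6 to 45.1, which is the quarter of the sample described above.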

Another example of a five number summary

You might be curious about the types of people who were in the random sample described above. A five number summary shows a wide range of ages in this sample.

MAX 81
75% 54
50% 43
25% 35
MIN 22

Notice that these subjects are adults of all ages. The youngest quarter of the subjects range from 22 to 35 years in age. The oldest quarter range from 54 to 81 years.

Computing percentiles in SPSS

If you have a data set in SPSS, select ANALYZE | DESCRIPTIVE STATISTICS | EXPLORE to compute all the information you need for a five number summary. In the dialog box, be sure to click on the STATISTICS button and select the PERCENTILES option. An example of the output appears below.

Percentiles        5.0000  10.0000  25.0000  50.0000  75.0000  90.0000  95.0000
Haverage          25.0000  27.0000  35.2500  43.0000  54.0000  63.7000  67.3500
Tukey's Hinges                      35.5000  43.0000  54.0000

This page was written by Steve Simon while working at Children's Mercy Hospital. Although I do not hold the copyright for this material, I am reproducing it here as a service, as it is no longer available on the Children's Mercy Hospital website.