KPI & CRB Newsletter

Special Edition, February 2000

© 2000 Key Performance Initiatives, Inc.

 

Peanuts  &  PARETO

by Winston Scott Jones

 

“Therefore the [Pareto] chart visually shows which situations are more significant…This chart is based on the Pareto principle: 80% of the trouble comes from 20% of the causes. While the percentages may not be always exactly 80/20, there usually are ‘the vital few and the trivial many.’ ”[1]

- Nancy R. Tague

 

 

Today’s Excite Harris Poll Online[2] provides an interesting example of the Pareto principle in action. The poll seeks to answer the question “Who is your favorite Peanuts character?” Using the poll data as of 9:30 am, February 15, 2000, we were able to construct several simple charts and reach some data-based conclusions about the most significant characters in the Peanuts lineup.

While Pareto charts are most commonly used to identify the most important causes of problems, they have many other applications. In this case, suppose we are a marketing department wishing to sell Peanuts products to users of the Excite Harris Poll. If we had limited resources, we would want to focus our efforts on the products with the broadest appeal. We might wish to answer the question: “What characters will appeal to the most individuals?” We can reach a reasonable, data-based answer using a simple tool, the Pareto chart.

The Peanuts Pareto

The poll collected results for twelve Peanuts characters and a category called “Other.” The raw data are included in an appendix to this article. Using the poll data we constructed the Pareto chart shown as Figure 1. On this chart we can see from the cumulative percent line that 74% of the respondents preferred Snoopy, Charlie Brown, and Linus. If we include Woodstock, the cumulative percent rises to 82%. As a general rule, we would like to see three or four of the categories (characters) account for 70% to 80% of the total of the data. In this example, we would want to focus our production on products depicting Snoopy, Charlie Brown, and Linus. If we had the resources to produce a fourth product, then Woodstock would be the next choice. Our rule is general, so we can apply some judgment in the interpretation. The choice of including or excluding Woodstock is not one with a right or wrong answer. The decision can be made using other factors, such as resource availability. We would probably be wasting resources if we chose to produce products featuring Pig Pen, Schroeder, or the other six characters.
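
The cumulative percentages quoted above can be reproduced directly from the appendix counts. The following is a minimal Python sketch, not the method used to build Figure 1; the function name `cumulative_percent` is our own, and the sketch simply ranks all thirteen poll categories without consolidating any of them.

```python
# Vote counts copied from the appendix (poll data as of February 15, 2000).
votes = {
    "Charlie Brown": 2076, "Franklin": 46, "Linus": 1108, "Lucy": 392,
    "Marcie": 82, "Peppermint Patty": 323, "Pig Pen": 629, "Rerun": 53,
    "Sally": 59, "Schroeder": 463, "Snoopy": 5529, "Woodstock": 900,
    "Other": 100,
}


def cumulative_percent(counts):
    """Return (category, count, cumulative %) rows, largest count first."""
    total = sum(counts.values())
    rows, running = [], 0
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ranked:
        running += count
        rows.append((name, count, 100.0 * running / total))
    return rows


rows = cumulative_percent(votes)
# Print the four leading categories; Linus lands at 74%, Woodstock at 82%.
for name, count, cum in rows[:4]:
    print(f"{name:<14}{count:>6}{cum:>7.0f}%")
```

The total is 11,760 votes, so Snoopy alone holds about 47% of the responses.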

 

 


Using the Pareto chart

Frequently people ask, “What is a Pareto? Where did that name come from?” Vilfredo Pareto was a nineteenth-century Italian economist. He showed that approximately 20% of the population controlled 80% of Italy’s wealth. Similarly, he is said to have shown that a small number of criminals committed most of Italy’s crimes. The Pareto principle is that approximately 20% of the population produces 80% of the results. Some people call this the 80/20 rule. This principle is useful in separating the vital few (the 20%) from the trivial many (the other 80%).

The Pareto chart is a simple analytical tool that is effective for separating the vital few from the trivial many. It consists of three axes and two sets of information (the bars and the cumulative percent line), both constructed from the same data. While there are other ways to construct the Pareto, the one used here is, in our opinion, the easiest to read and can be readily constructed in Microsoft Excel.[3]

o       The horizontal, or “X”, axis is used to segregate the discrete categories being studied. In this example the categories are the characters in the poll.

o       The left vertical, or “First Y”, axis depicts the data measured. This is often count data, or frequency, but can be other measurements such as dollars or time. In this example the left vertical axis is used for the count of votes for each character. A scale that begins at zero and ends with the total of all data collected for all categories will result in the most aesthetically pleasing Pareto charts.

o       The right vertical, or “Second Y”, axis depicts cumulative percent. The scale is from 0% to 100%.

o       The vertical bars represent the sum of the data collected for each category. The reader can see that the 5529 votes for Snoopy (left-most bar) fall between 5000 and 6000 on the left vertical axis. The categories are arranged in descending order from the one with the highest total to the lowest. The reader may note that, for ease of display, the analyst consolidated the seven categories with sparse data into one category called “Other”. As a general rule, such a category should never contain over 20% of the total data. (Yes, the analyst could have included Pig Pen and Schroeder under Other.) A consolidated Other category is always placed at the extreme right-hand position on the chart.

o       The line above the bars is the cumulative percent line. The triangles above each bar depict the percent of the total data included in the sum of the bar below the triangle, and all bars to the left of it. For example, the triangle above Linus is labeled 74%. This means that Linus, Charlie Brown, and Snoopy account for 74% of the total responses.
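
The construction rules in the list above (descending bars, a consolidated Other bar at the far right, and a cumulative percent for each bar) can be sketched in a few lines of Python. This is an illustrative sketch only, not the Excel procedure the author used. The function name `pareto_rows` and the `keep` parameter are our own assumptions; `keep=6` is inferred from the article, which consolidates seven of the thirteen poll categories.

```python
# Vote counts copied from the appendix.
votes = {
    "Charlie Brown": 2076, "Franklin": 46, "Linus": 1108, "Lucy": 392,
    "Marcie": 82, "Peppermint Patty": 323, "Pig Pen": 629, "Rerun": 53,
    "Sally": 59, "Schroeder": 463, "Snoopy": 5529, "Woodstock": 900,
    "Other": 100,
}


def pareto_rows(counts, keep=6, other_label="Other"):
    """Keep the `keep` largest named categories and fold the rest into Other.

    Returns (label, count, cumulative %) rows in chart order, with the
    consolidated Other bar always last (the extreme right of the chart).
    """
    named = {k: v for k, v in counts.items() if k != other_label}
    ranked = sorted(named.items(), key=lambda kv: kv[1], reverse=True)
    bars = ranked[:keep]
    # Sparse named categories plus the poll's own "Other" responses.
    other = sum(v for _, v in ranked[keep:]) + counts.get(other_label, 0)
    if other:
        bars.append((other_label, other))
    total = sum(counts.values())
    rows, running = [], 0
    for label, count in bars:
        running += count
        rows.append((label, count, 100.0 * running / total))
    return rows


rows = pareto_rows(votes)
for label, count, cum in rows:
    print(f"{label:<14}{count:>6}{cum:>7.0f}%")

# Rule of thumb from the list above: Other should stay under 20% of the data.
other_share = rows[-1][1] / sum(votes.values())
print(f"Other share: {other_share:.1%}")
```

With these counts the consolidated Other bar holds 1,055 votes, roughly 9% of the total, which is comfortably inside the 20% guideline.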

 


 


Interpreting a Pareto chart is easy once a person understands how the information is displayed. To begin with, the goal is to identify three or four categories that account for 70% to 80% of the data. We call these the vital few. The other categories are, of course, the trivial many. Begin on the right vertical axis at the 80% level and move horizontally to the left until you reach the cumulative percent line. This is shown in Figure 1 as a red arrow. Continuing to the left along the cumulative percent line, the first point you will encounter is labeled 74% and is directly above the bar for Linus. You can immediately see that Linus and the two characters to the left, Charlie Brown and Snoopy, account for 74% of the data. You now have three candidates for your product. You may have noticed that 3 of the original 13 categories are 23% of the categories. The 80/20 rule is in action.

Suppose you have the resources to produce a fourth product. Notice that at a cumulative percent of 82%, Woodstock is not far from our 80% limit. You can apply your own judgment here and include Woodstock in your product line, if you wish. The analyst colored Woodstock’s bar yellow to show that inclusion or exclusion is a judgment call. Either way, you have your vital few.
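
The reading procedure above can be approximated in code. The sketch below is a simplification under our own assumptions: the function name `vital_few` is hypothetical, and it simply adds categories in descending order until the cumulative share reaches a chosen target. Running it at both ends of the 70% to 80% band shows where the judgment call about Woodstock arises.

```python
# Vote counts copied from the appendix.
votes = {
    "Charlie Brown": 2076, "Franklin": 46, "Linus": 1108, "Lucy": 392,
    "Marcie": 82, "Peppermint Patty": 323, "Pig Pen": 629, "Rerun": 53,
    "Sally": 59, "Schroeder": 463, "Snoopy": 5529, "Woodstock": 900,
    "Other": 100,
}


def vital_few(counts, target=0.70):
    """Add the largest categories until their cumulative share reaches target.

    Returns the chosen category names and the share they cover (0.0 to 1.0).
    """
    total = sum(counts.values())
    chosen, running = [], 0
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ranked:
        if running / total >= target:
            break
        chosen.append(name)
        running += count
    return chosen, running / total


for target in (0.70, 0.80):
    names, share = vital_few(votes, target)
    print(f"target {target:.0%}: {', '.join(names)} ({share:.0%})")
```

At the 70% target the sketch stops at Snoopy, Charlie Brown, and Linus (74%). At the 80% target it also takes Woodstock (82%). The code only marks the two ends of the band; choosing between them remains a matter of judgment and available resources.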

 

Stratification

One requirement for a useful Pareto chart is the correct selection of categories. Use of too few or too many categories often results in a Pareto chart that is difficult, or even impossible, to interpret. It is our experience that novice users often fall into the category trap. We call this inappropriate stratification of the data. If the user uses too few categories, the result is usually one or two categories containing 80% or more of the data. In this case the user should try to break the categories down into sub-categories. This often requires gathering additional data.

The more common scenario is over-stratification, or use of too many categories. We have seen novice users produce Pareto charts in which ten or twenty categories account for 80% of the data. This is hardly a vital few. In this situation, the user should consolidate the data into broader categories, identify the vital few, and then break them down. This often produces a series of cascading Pareto charts.
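
A rough check for the category trap can be sketched as follows. The idea is to count how many of the largest categories are needed to reach 80% of the data. The function names and the exact cutoffs (two or fewer for under-stratification, ten or more for over-stratification) are our own assumptions, taken loosely from the "one or two" and "ten or twenty" figures above. The flat 25-category data set is made up purely to illustrate the over-stratified case.

```python
def categories_to_reach(counts, target=0.80):
    """Count how many of the largest categories are needed to reach target."""
    total = sum(counts.values())
    running = 0
    ranked = sorted(counts.values(), reverse=True)
    for n, count in enumerate(ranked, start=1):
        running += count
        if running / total >= target:
            return n
    return len(counts)


def stratification_hint(counts, target=0.80):
    """Classify a data set with assumed cutoffs (not a formal rule)."""
    n = categories_to_reach(counts, target)
    if n <= 2:
        return "under-stratified"
    if n >= 10:
        return "over-stratified"
    return "ok"


# Vote counts copied from the appendix.
votes = {
    "Charlie Brown": 2076, "Franklin": 46, "Linus": 1108, "Lucy": 392,
    "Marcie": 82, "Peppermint Patty": 323, "Pig Pen": 629, "Rerun": 53,
    "Sally": 59, "Schroeder": 463, "Snoopy": 5529, "Woodstock": 900,
    "Other": 100,
}
print(stratification_hint(votes))  # four categories reach 80%

# Made-up, purely illustrative data: 25 equally sized categories.
flat = {f"category {i}": 10 for i in range(1, 26)}
print(stratification_hint(flat))   # twenty categories are needed
```

A hint like this is only a prompt to look again at the categories. It does not replace the judgment of an experienced user.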

While the data in our example were adequately stratified, we can repeat the analysis using cascaded Pareto charts. For this analysis we have grouped the Peanuts characters into four categories: male, female, animal, and unknown. Figure 2 depicts the Pareto chart with our data grouped into these categories. You immediately see that only two categories, Animal and Male, account for 92% of the data. This is a clue that the data are under-stratified.
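
The grouping can be sketched in Python as shown below. The article does not list which character belongs to which group, so the assignment here is our assumption; it is consistent with the percentages the article reports. The poll's own "Other" responses are treated as the unknown category.

```python
# Vote counts copied from the appendix.
votes = {
    "Charlie Brown": 2076, "Franklin": 46, "Linus": 1108, "Lucy": 392,
    "Marcie": 82, "Peppermint Patty": 323, "Pig Pen": 629, "Rerun": 53,
    "Sally": 59, "Schroeder": 463, "Snoopy": 5529, "Woodstock": 900,
    "Other": 100,
}

# Assumed group membership (not spelled out in the article).
groups = {
    "Animal": ["Snoopy", "Woodstock"],
    "Male": ["Charlie Brown", "Linus", "Pig Pen", "Schroeder", "Rerun", "Franklin"],
    "Female": ["Lucy", "Peppermint Patty", "Marcie", "Sally"],
    "Unknown": ["Other"],
}

# Sum the votes of each group's members.
group_totals = {
    group: sum(votes[name] for name in members)
    for group, members in groups.items()
}

total = sum(group_totals.values())
running = 0
ranked = sorted(group_totals.items(), key=lambda kv: kv[1], reverse=True)
for group, count in ranked:
    running += count
    print(f"{group:<8}{count:>6}{100.0 * running / total:>7.0f}%")
```

Animal (6,429 votes) and Male (4,375 votes) together cover 10,804 of the 11,760 votes, which is the 92% read from the grouped chart.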

 

 


Nevertheless, we proceed with the analysis and produce two more cascaded Pareto charts for the categories of Animal and Male characters. These are shown as Figures 3 and 4. In Figure 3, we see that Snoopy accounts for 86% of the data within the Animal category (5529 of the 6429 votes for Snoopy and Woodstock in the appendix), making him an obvious candidate for our product line. In Figure 4, Charlie Brown and Linus account for 73% of the data within the Male category, so we also include them.
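
The within-category shares for the cascaded charts can be checked against the appendix counts. The sketch below assumes the Animal category consists of Snoopy and Woodstock and the Male category of the six male characters; the helper name `share_within` is our own. With the appendix counts, Snoopy's share of the Animal category works out to about 86%, and Charlie Brown plus Linus cover about 73% of the Male category.

```python
def share_within(group_counts, top_n):
    """Share of a group's total held by its top_n largest members."""
    ranked = sorted(group_counts.values(), reverse=True)
    return sum(ranked[:top_n]) / sum(ranked)


# Appendix counts, split by the assumed group membership.
animal = {"Snoopy": 5529, "Woodstock": 900}
male = {
    "Charlie Brown": 2076, "Linus": 1108, "Pig Pen": 629,
    "Schroeder": 463, "Rerun": 53, "Franklin": 46,
}

print(f"Snoopy within Animal: {share_within(animal, 1):.0%}")
print(f"Charlie Brown and Linus within Male: {share_within(male, 2):.0%}")
```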

The purpose of this second analysis is to show the application of stratification and cascaded charts. Even though the analysis was flawed, the data being under-stratified, we still got the same results as in the original analysis. This demonstrates the power of the Pareto chart. By consistently using the 80/20 rule, you can segregate the categories that are the most significant and focus your valuable resources on them.

 


 

 



The Pareto chart is a simple analytical tool that is useful for separating the vital few from the trivial many. As is the case with all of the simple analytical tools, workers at all levels can, with proper training, quickly learn to use the Pareto chart to solve important business problems. It can quickly focus attention and resources on the 20% of the causes that produce 80% of the effects. Although proper stratification of data is important, the technique is very robust to minor deviations. Proper stratification is a skill that improves with practice. With the guidance of an experienced user or statistician, novice users can quickly learn to stratify data properly and to use the Pareto chart efficiently and effectively to solve problems. The Pareto chart is a powerful and robust tool.

 

 

 


 

 


For Additional Information on training in the simple tools

Contact:

 

Key Performance Initiatives, Inc.

PMB: 1702

5968 W. Northwest Highway

Dallas, TX 75225

Voice:              214-728-7447

Fax:                  214-369-1787

Email:               scott@keyperfin.com

URL:                http://www.keyperfin.com/

 


Appendix: Raw Data

 

Character           Vote Count
Charlie Brown             2076
Franklin                    46
Linus                     1108
Lucy                       392
Marcie                      82
Peppermint Patty           323
Pig Pen                    629
Rerun                       53
Sally                       59
Schroeder                  463
Snoopy                    5529
Woodstock                  900
Other                      100

 

 

 

End Notes:



[1] Tague, Nancy R. The Quality Toolbox. (Milwaukee: American Society for Quality, 1995) 209–210.

[2] Excite Harris Poll Online. (February 15, 2000) http://news.excite.com/news/poll/results?voted=1

[3] Contact the author at scott@keyperfin.com for an example of a Pareto chart in Excel.