Measures of central tendency for grouped data: formulas, exercises

Charles McCarthy

Measures of central tendency indicate the value around which the data of a distribution cluster. The best known is the average or arithmetic mean, which is obtained by adding all the values and dividing the result by the total number of data.

However, if the distribution consists of a large number of values that are not presented in an orderly way, it is not easy to perform the calculations needed to extract the information they contain.

Figure 1. Measures of central tendency for grouped data are a good indication of the general behavior of the data

That is why the data are grouped into classes or categories to build a frequency distribution. Once the data have been ordered in this way, it is easier to calculate the measures of central tendency, among which are:

-Arithmetic mean

-Median

-Mode

-Geometric mean

-Harmonic mean

Formulas

Here are the formulas for the measures of central tendency for grouped data:

Arithmetic mean

The mean is the measure most used to characterize quantitative data (numerical values), although it is quite sensitive to the extreme values of the distribution. It is calculated by:

$$X = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_g m_g}{n} = \frac{1}{n}\sum_{i=1}^{g} f_i m_i$$

With:

-X: average or arithmetic mean

-fi: frequency of class i

-mi: class mark of class i

-g: number of classes

-n: total number of data
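As a minimal sketch of the formula above, the following Python function multiplies each class mark by its frequency and divides the sum by the total number of data. The function name `grouped_mean` is only illustrative; the class marks and frequencies are the ones that appear in the solved exercise of this article.

```python
def grouped_mean(class_marks, frequencies):
    """Arithmetic mean of grouped data: sum(f_i * m_i) / n."""
    if len(class_marks) != len(frequencies):
        raise ValueError("class_marks and frequencies must have the same length")
    n = sum(frequencies)  # total number of data
    if n == 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * m for f, m in zip(frequencies, class_marks)) / n


# Class marks (km/h) and frequencies taken from the solved exercise.
marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]

mean_speed = grouped_mean(marks, freqs)
print(round(mean_speed, 2))  # 29.03
```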

Median

To calculate it, find the interval that contains observation n/2 and interpolate to determine the numerical value of that observation, using the following formula:

$$\text{Median} = B_M + \left(\frac{n/2 - f_{BM}}{f_m}\right) c$$

Where:

-c: width of the interval to which the median belongs

-BM: lower bound of that interval

-fm: number of observations contained in that interval

-n/2: total number of data divided by 2

-fBM: number of observations before the interval containing the median

Therefore, the median is a measure of position: it divides the data set into two equal parts. Quartiles, deciles and percentiles can be defined in the same way; they divide the distribution into four, ten and one hundred parts respectively.
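The interpolation for the median can be sketched in a few lines of Python. The function and parameter names are assumptions made for this illustration; the numbers passed to it are those of the solved exercise in this article.

```python
def grouped_median(lower_bound, width, n, freq_before, freq_in_class):
    """Median of grouped data by linear interpolation inside the median class.

    lower_bound   -- BM, lower bound of the median class
    width         -- c, width of the median class
    n             -- total number of data
    freq_before   -- fBM, observations before the median class
    freq_in_class -- fm, observations inside the median class
    """
    if freq_in_class <= 0:
        raise ValueError("the median class must contain observations")
    return lower_bound + (n / 2 - freq_before) / freq_in_class * width


# Values from the solved exercise: BM = 22.0, c = 6, n = 50, fBM = 5, fm = 25.
median_speed = grouped_median(22.0, 6, 50, 5, 25)
print(round(median_speed, 2))  # 26.8
```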

Mode

In grouped data, the class or category that contains the most observations is identified. This is the modal class. A distribution may have two modes or more than two, in which case it is called bimodal or multimodal, respectively.

The mode of grouped data can also be calculated with the equation:

$$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$$

With:

-L1: lower limit of the class where the mode is found

-Δ1: difference between the frequency of the modal class and the frequency of the class that precedes it.

-Δ2: difference between the frequency of the modal class and the frequency of the next class.

-c: width of the interval containing the mode
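A short Python sketch of this calculation follows. The function name and its parameters are illustrative assumptions; the frequencies 5, 25 and 10 and the lower limit 22.0 come from the solved exercise, where the result is about 25.4 km/h.

```python
def grouped_mode(lower_limit, width, f_prev, f_modal, f_next):
    """Mode of grouped data from the modal class and its two neighbours."""
    delta1 = f_modal - f_prev  # modal frequency minus preceding frequency
    delta2 = f_modal - f_next  # modal frequency minus following frequency
    if delta1 + delta2 == 0:
        raise ValueError("the modal class must exceed at least one neighbour")
    return lower_limit + delta1 / (delta1 + delta2) * width


# Values from the solved exercise: L1 = 22.0, c = 6, frequencies 5, 25, 10.
mode_speed = grouped_mode(22.0, 6, 5, 25, 10)
print(round(mode_speed, 2))
```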

Harmonic mean

The harmonic mean is denoted by H. Given a set of n values x1, x2, x3, …, the harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the values.

It is easier to see it through the formula:

$$\frac{1}{H} = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{x_i}$$

With grouped data, the expression becomes:

$$\frac{1}{H} = \frac{1}{N}\sum_{i=1}^{g}\frac{f_i}{m_i}$$

Where:

-H: harmonic mean

-fi: class frequency

-mi: class mark

-g: number of classes

-N = f1 + f2 + f3 + …
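The grouped expression for 1/H translates directly into code. The following is a sketch under the assumption that every class mark is positive; the function name is illustrative and the data are those of the solved exercise.

```python
def grouped_harmonic_mean(class_marks, frequencies):
    """Harmonic mean of grouped data: N / sum(f_i / m_i)."""
    if any(m <= 0 for m in class_marks):
        raise ValueError("class marks must be positive")
    n = sum(frequencies)  # N = f_1 + f_2 + ...
    reciprocal_sum = sum(f / m for f, m in zip(frequencies, class_marks))
    return n / reciprocal_sum


# Class marks (km/h) and frequencies taken from the solved exercise.
marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]

harmonic_speed = grouped_harmonic_mean(marks, freqs)
print(harmonic_speed)  # close to 27.3 km/h
```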

Geometric mean

Given n positive numbers x1, x2, x3, …, their geometric mean G is the nth root of the product of all the numbers:

$$G = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n}$$

In the case of grouped data, it can be shown that the decimal logarithm of the geometric mean, log G, is given by:

$$\log G = \frac{1}{N}\sum_{i=1}^{g} f_i \log m_i$$

Where:

-G: geometric mean

-fi: class frequency

-mi: class mark

-g: number of classes

-N = f1 + f2 + f3 + …
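The logarithmic form avoids multiplying many numbers together. A minimal Python sketch, assuming positive class marks and using the data of the solved exercise, could look like this (the function name is illustrative):

```python
import math


def grouped_geometric_mean(class_marks, frequencies):
    """Geometric mean of grouped data via log G = sum(f_i * log m_i) / N."""
    if any(m <= 0 for m in class_marks):
        raise ValueError("class marks must be positive")
    n = sum(frequencies)  # N = f_1 + f_2 + ...
    log_g = sum(f * math.log10(m) for f, m in zip(frequencies, class_marks)) / n
    return 10 ** log_g  # undo the decimal logarithm


# Class marks (km/h) and frequencies taken from the solved exercise.
marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]

geometric_speed = grouped_geometric_mean(marks, freqs)
print(round(geometric_speed, 2))  # 28.13
```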

Relationship between H, G and X

For positive data it is always true that:

H ≤ G ≤ X
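This ordering can be checked numerically with Python's standard `statistics` module. The sketch below expands the grouped table of the solved exercise into one class mark per observation, which is an approach chosen here for illustration rather than the article's own procedure, and then compares the three means.

```python
import statistics

# Class marks (km/h) and frequencies taken from the solved exercise.
marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]

# Repeat each class mark as many times as its frequency indicates.
expanded = [m for m, f in zip(marks, freqs) for _ in range(f)]

h = statistics.harmonic_mean(expanded)
g = statistics.geometric_mean(expanded)
x = statistics.fmean(expanded)

print(h <= g <= x)  # True
```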

Most used definitions

The following definitions are required to find the values described in the formulas above:

Frequency

Frequency is defined as the number of times a piece of data is repeated.

Range

It is the difference between the highest and lowest values present in the distribution.

Number of classes

To know in how many classes we group the data, we use some criteria, for example the following:

Limits

The extreme values of each class or interval are called limits. A class can have well-defined limits, in which case it has a lower and an upper limit, or it can have open limits, when a range is given, for example values greater than or less than a certain number.

Class mark

It simply consists of the midpoint of the interval and is calculated by averaging the upper bound and the lower bound.

Interval width

The data can be grouped into classes of equal or different width. The first option is the most used, as it makes calculations much easier, although in some cases it is imperative that the classes have different widths.

The width c of the interval can be determined by the following formula:

c = Range / Nc

Where Nc is the number of classes.
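The range, the interval width and the class mark defined above are simple enough to sketch in Python. The helper names are assumptions made for this illustration; the numbers are the extreme values (16 and 52 km/h), the number of classes (6) and the limits of the second class (22.0 and 28.0) from the solved exercise.

```python
def class_width(lowest, highest, n_classes):
    """Interval width c = Range / Nc."""
    if n_classes <= 0:
        raise ValueError("the number of classes must be positive")
    data_range = highest - lowest  # range of the distribution
    return data_range / n_classes


def class_mark(lower_limit, upper_limit):
    """Midpoint of a class: the average of its two limits."""
    return (lower_limit + upper_limit) / 2


width = class_width(16, 52, 6)
mark = class_mark(22.0, 28.0)
print(width)  # 6.0
print(mark)   # 25.0
```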

Solved exercise

Below we have a series of speed measurements in km/h, taken with radar, which correspond to 50 cars that passed through a street in a certain city:

Figure 2. Table for the solved exercise. Source: F. Zapata.

Solution

The data presented in this way are not organized, so the first step is to group them into classes.

Steps to group the data and build the table

Step 1

Find the range R:

R = (52 - 16) km/h = 36 km/h

Step 2

Select the number of classes Nc, according to the given criteria. Since there are 50 data, we can choose Nc = 6.

Step 3

Calculate width c of the interval:

c = Range / Nc = 36/6 = 6

Step 4

Form the classes and group the data as follows: for the first class, a value slightly less than the lowest value present in the table is chosen as the lower limit; the value c = 6, calculated previously, is then added to it to obtain the upper limit of the first class.

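We proceed in the same way to build the rest of the classes, as shown in the following table, which lists the class marks and frequencies used in the calculations below:

Class | Class mark (km/h) | Frequency
1 | 18.5 | 5
2 | 25.0 | 25
3 | 31.5 | 10
4 | 38.0 | 6
5 | 44.5 | 2
6 | 51.0 | 2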

Each frequency corresponds to a color in Figure 2; this ensures that no value is left uncounted.
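Counting by hand can also be cross-checked with a short tally. The sketch below is only an illustration: the sample values and the class limits (five classes of width 10 starting at 10) are hypothetical and are not the data or the classes of this exercise, and each class is assumed to include its lower limit and exclude its upper limit.

```python
def tally(values, lower_limit, width, n_classes):
    """Count how many values fall in each class [lower, lower + width)."""
    counts = [0] * n_classes
    for value in values:
        index = int((value - lower_limit) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"value {value} falls outside every class")
        counts[index] += 1
    return counts


# Hypothetical speeds, not the measurements shown in Figure 2.
sample = [16, 21, 22, 27, 27, 33, 40, 52]
counts = tally(sample, lower_limit=10, width=10, n_classes=5)

print(counts)                       # [1, 4, 1, 1, 1]
print(sum(counts) == len(sample))   # True: no value escaped the count
```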

Calculation of the mean

X = (5 × 18.5 + 25 × 25.0 + 10 × 31.5 + 6 × 38.0 + 2 × 44.5 + 2 × 51.0) ÷ 50 = 29.03 km/h

Calculation of the median

The median is in class 2 of the table, since the first 30 data of the distribution fall in classes 1 and 2, and observation n/2 = 25 is among them.

-Width of the interval to which the median belongs: c = 6

-Lower boundary of the interval where the median is: BM = 22.0 km/h

-Number of observations contained in the interval: fm = 25

-Total data divided by 2: 50/2 = 25

-Number of observations before the interval containing the median: fBM = 5

And the operation is:

Median = 22.0 + [(25 - 5) ÷ 25] × 6 = 26.80 km/h
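The choice of class 2 as the median class can be checked with cumulative frequencies: the median class is the first one whose cumulative frequency reaches n/2. The helper name below is an assumption made for this sketch; the frequencies are those used in the calculation of the mean.

```python
from itertools import accumulate


def median_class_index(frequencies):
    """Zero-based index of the first class whose cumulative frequency reaches n/2."""
    half = sum(frequencies) / 2
    for index, cumulative in enumerate(accumulate(frequencies)):
        if cumulative >= half:
            return index
    raise ValueError("frequencies must not be empty")


freqs = [5, 25, 10, 6, 2, 2]
idx = median_class_index(freqs)
freq_before = sum(freqs[:idx])  # observations before the median class

print(idx + 1)      # 2, the class number counted from 1
print(freq_before)  # 5
```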

Calculation of the mode

The mode is also in class 2:

-Interval width: c = 6

-Lower limit of the class where the mode is found: L1 = 22.0

-Difference between the frequency of the modal class and the frequency of the class that precedes it: Δ1 = 25 - 5 = 20

-Difference between the frequency of the modal class and the frequency of the class that follows: Δ2 = 25 - 10 = 15

With these data the operation is:

Mode = 22.0 + [20 ÷ (20 + 15)] × 6 = 25.4 km/h

Calculation of the geometric mean

N = f1 + f2 + f3 + … = 50

log G = (5 × log 18.5 + 25 × log 25 + 10 × log 31.5 + 6 × log 38 + 2 × log 44.5 + 2 × log 51) / 50

log G = 1.44916053

G = 28.13 km/h

Harmonic mean calculation

1/H = (1/50) × [(5/18.5) + (25/25) + (10/31.5) + (6/38) + (2/44.5) + (2/51)] = 0.036596

H = 27.33 km/h

Summary of measures of central tendency

The units of the variables are km/h:

-Arithmetic mean: 29.03

-Median: 26.80

-Mode: 25.43

-Geometric mean: 28.13

-Harmonic mean: 27.33

