Measures of central tendency indicate the value around which the data of a distribution are clustered. The best known is the average or arithmetic mean, which consists of adding all the values and dividing the result by the total number of data.
However, if the distribution consists of a large number of values and they are not presented in an orderly way, it is not easy to perform the calculations needed to extract the valuable information they contain.
That is why the data are grouped into classes or categories to build a frequency distribution. Once the data have been ordered in this way, it is easier to calculate the measures of central tendency, among which are:
-Mean
-Median
-Mode
-Geometric mean
-Harmonic mean
Here are the formulas for the measures of central tendency for the grouped data:
The mean is the measure most used to characterize quantitative data (numerical values), although it is quite sensitive to the extreme values of the distribution. It is calculated by:
X = (f1·m1 + f2·m2 + … + fg·mg) / n
With:
-X: average or arithmetic mean
-fi: frequency of class i
-mi: mark of class i
-g: number of classes
-n: total number of data
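As a minimal sketch of the weighted sum described above, the following Python function computes the mean of grouped data. The function name `grouped_mean` is hypothetical; the class marks and frequencies are the ones used in the speed exercise worked later in this article.

```python
def grouped_mean(marks, freqs):
    """Arithmetic mean of grouped data: sum(f_i * m_i) / n."""
    n = sum(freqs)  # total number of data
    return sum(f * m for f, m in zip(freqs, marks)) / n


# Class marks (km/h) and frequencies from the speed exercise.
marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]
print(round(grouped_mean(marks, freqs), 2))  # 29.03
```

Each class mark stands in for every observation in its class, so the result is an approximation of the mean of the raw data.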
To calculate the median of grouped data, it is necessary to find the interval that contains observation n/2 and interpolate to determine the numerical value of that observation, using the following formula:
Median = BM + [(n/2 - fBM) ÷ fm] × c
Where:
-c: width of the interval to which the median belongs
-BM: lower bound of that interval
-fm: number of observations contained in the interval
-n/2: total number of data divided by 2
-fBM: number of observations before the interval containing the median
Therefore, the median is a measure of position, that is, it divides the data set into two equal parts. Quartiles, deciles and percentiles can also be defined, which divide the distribution into four, ten and one hundred parts, respectively.
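The interpolation for the median can be sketched in Python as follows. The function and parameter names are hypothetical; the sample call uses the values of the speed exercise worked later in this article.

```python
def grouped_median(lower_bound, n, freq_before, freq_class, width):
    """Interpolated median: B_M + ((n/2 - f_BM) / f_m) * c."""
    return lower_bound + (n / 2 - freq_before) / freq_class * width


# B_M = 22.0 km/h, n = 50, f_BM = 5, f_m = 25, c = 6
print(round(grouped_median(22.0, 50, 5, 25, 6), 2))  # 26.8
```

The interpolation assumes that the observations are spread evenly inside the median class.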
In grouped data, the mode is found by looking for the class or category that contains the most observations. This is the modal class. A distribution may have two or more modes, in which case it is called bimodal or multimodal, respectively.
You can also calculate the mode in grouped data with the equation:
Mode = L1 + [Δ1 ÷ (Δ1 + Δ2)] × c
With:
-L1: lower limit of the class where the mode is found
-Δ1: difference between the frequency of the modal class and the frequency of the class that precedes it
-Δ2: difference between the frequency of the modal class and the frequency of the class that follows it
-c: width of the interval containing the mode
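A minimal Python sketch of the mode calculation, with hypothetical function and parameter names, using the values of the speed exercise worked later in this article:

```python
def grouped_mode(lower_limit, delta1, delta2, width):
    """Mode of grouped data: L1 + (delta1 / (delta1 + delta2)) * c."""
    return lower_limit + delta1 / (delta1 + delta2) * width


# L1 = 22.0 km/h, delta1 = 25 - 5 = 20, delta2 = 25 - 10 = 15, c = 6
print(round(grouped_mode(22.0, 20, 15, 6), 2))  # 25.43
```

The larger Δ1 is relative to Δ2, the closer the mode lies to the upper end of the modal class.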
The harmonic mean is denoted by H. Given a set of n values x1, x2, x3, …, the harmonic mean is the inverse or reciprocal of the arithmetic mean of the inverses of the values.
It is easier to see it through the formula:
H = n / (1/x1 + 1/x2 + … + 1/xn)
And for grouped data, the expression becomes:
H = N / (f1/m1 + f2/m2 + … + fg/mg)
Where:
-H: harmonic mean
-fi: frequency of class i
-mi: mark of class i
-g: number of classes
-N = f1 + f2 + f3 + …
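The grouped expression can be sketched in Python as follows. The function name `grouped_harmonic_mean` is hypothetical, and the data are the class marks and frequencies of the speed exercise worked later in this article.

```python
def grouped_harmonic_mean(marks, freqs):
    """Harmonic mean of grouped data: N / sum(f_i / m_i)."""
    n_total = sum(freqs)
    return n_total / sum(f / m for f, m in zip(freqs, marks))


marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]
print(round(grouped_harmonic_mean(marks, freqs), 1))  # 27.3
```

Every class mark must be non-zero, since each one appears in a denominator.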
Given n positive numbers x1, x2, x3, …, their geometric mean G is the nth root of the product of all the numbers:
G = (x1 · x2 · … · xn)^(1/n)
In the case of grouped data, it can be shown that the base-10 logarithm of the geometric mean, log G, is given by:
log G = (f1·log m1 + f2·log m2 + … + fg·log mg) / N
Where:
-G: geometric mean
-fi: frequency of class i
-mi: mark of class i
-g: number of classes
-N = f1 + f2 + f3 + …
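A minimal Python sketch of the logarithmic route to the geometric mean, with a hypothetical function name and the class marks and frequencies of the speed exercise worked later in this article:

```python
import math


def grouped_geometric_mean(marks, freqs):
    """Geometric mean of grouped data via log G = sum(f_i * log10(m_i)) / N."""
    n_total = sum(freqs)
    log_g = sum(f * math.log10(m) for f, m in zip(freqs, marks)) / n_total
    return 10 ** log_g


marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]
print(round(grouped_geometric_mean(marks, freqs), 2))  # 28.13
```

Working with logarithms avoids multiplying many numbers together, which could overflow for large data sets. All class marks must be positive.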
For positive data, it is always true that:
H ≤ G ≤ X
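Written out for n ungrouped positive values, the chain of inequalities reads as below. This is the classical inequality between the harmonic, geometric and arithmetic means, stated here with the same symbols H, G and X (shown as \bar{X}) used in this article.

```latex
\[
H=\frac{n}{\sum_{i=1}^{n}\frac{1}{x_i}}
\;\le\;
G=\Bigl(\prod_{i=1}^{n}x_i\Bigr)^{1/n}
\;\le\;
\bar{X}=\frac{1}{n}\sum_{i=1}^{n}x_i,
\qquad x_i>0 .
\]
```

Equality holds only when all the values are equal.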
The following definitions are required to find the values described in the formulas above:
Frequency is defined as the number of times a value is repeated or, for grouped data, the number of observations that fall within a class.
The range is the difference between the highest and lowest values present in the distribution.
To decide how many classes the data should be grouped into, some criterion is used, for example the following:
The extreme values of each class or interval are called limits. A class can have well-defined limits, in which case it has a lower limit and an upper limit, or it can have open limits, when only one end of the range is given, for example values greater than or less than a certain number.
The class mark is simply the midpoint of the interval and is calculated by averaging the upper limit and the lower limit.
The data can be grouped into classes of equal or different size; this size is called the width of the class. The first option is the most used, as it makes calculations much easier, although in some cases it is imperative that the classes have different widths.
The width c of the interval can be determined by the following formula:
c = Range / Nc
Where Nc is the number of classes.
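The definitions of class width and class mark can be sketched in Python as follows. The function names are hypothetical; the numbers (extreme values 52 and 16, six classes, and a class starting at 22.0) come from the speed exercise in this article, and the upper limit is obtained by adding the width to the lower limit.

```python
def class_width(data_range, n_classes):
    """Width of each class: c = Range / Nc."""
    return data_range / n_classes


def class_mark(lower, upper):
    """Midpoint of the class interval."""
    return (lower + upper) / 2


c = class_width(52 - 16, 6)   # range of 36 km/h split into 6 classes
upper = 22.0 + c              # upper limit = lower limit + width
print(c, class_mark(22.0, upper))  # 6.0 25.0
```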
Below is a series of speed measurements in km/h, taken with radar, corresponding to 50 cars that passed along a street in a certain city:
The data presented in this way is not organized, so the first step is to group it into classes.
Find the range R:
R = (52 - 16) km/h = 36 km/h
Select the number of classes Nc according to the chosen criterion. Since there are 50 data, we can choose Nc = 6.
Calculate the width c of the interval:
c = Range / Nc = 36/6 = 6
Form the classes and group the data as follows: for the first class, a value slightly less than the lowest value present in the table is chosen as the lower limit; the value c = 6, calculated previously, is then added to it to obtain the upper limit of the first class.
We proceed in the same way to build the rest of the classes, as shown in the following table:
Each frequency corresponds to a color in figure 2; this ensures that no value is left uncounted.
The mean, using the frequency and class mark of each class, is:
X = (5 × 18.5 + 25 × 25.0 + 10 × 31.5 + 6 × 38.0 + 2 × 44.5 + 2 × 51.0) ÷ 50 = 29.03 km/h
The median is in class 2 of the table, since the first two classes together contain the first 30 data of the distribution, among them observation n/2 = 25.
-Width of the interval to which the median belongs: c = 6
-Lower bound of the interval where the median is: BM = 22.0 km/h
-Number of observations that the interval contains: fm = 25
-Total data divided by 2: 50/2 = 25
-Number of observations before the interval containing the median: fBM = 5
And the operation is:
Median = 22.0 + [(25 - 5) ÷ 25] × 6 = 26.80 km/h
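Locating the median class can be sketched with cumulative frequencies, as in the Python fragment below. It assumes the convention that the median class is the first one whose cumulative frequency reaches n/2; the variable names are hypothetical and the frequencies are those of this exercise.

```python
from itertools import accumulate

freqs = [5, 25, 10, 6, 2, 2]
half = sum(freqs) / 2                  # n/2 = 25
cumulative = list(accumulate(freqs))   # [5, 30, 40, 46, 48, 50]

# Index of the first class whose cumulative frequency reaches n/2.
median_idx = next(i for i, cf in enumerate(cumulative) if cf >= half)
# Observations accumulated before the median class (0 if it is the first).
freq_before = cumulative[median_idx - 1] if median_idx > 0 else 0

print(median_idx + 1, freq_before, freqs[median_idx])  # 2 5 25
```

The three printed values are the class number, the number of observations before that class and the number of observations inside it.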
The mode is also in class 2:
-Interval width: c = 6
-Lower limit of the class where the mode is found: L1 = 22.0
-Difference between the frequency of the modal class and the frequency of the class that precedes it: Δ1 = 25 - 5 = 20
-Difference between the frequency of the modal class and the frequency of the class that follows it: Δ2 = 25 - 10 = 15
With these data the operation is:
Mode = 22.0 + [20 ÷ (20 + 15)] × 6 = 25.4 km/h
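The modal class and the two differences can also be obtained programmatically. The Python fragment below is a sketch with hypothetical variable names; treating a missing neighbouring class as having frequency 0 is an assumption made here, not something stated in the exercise.

```python
freqs = [5, 25, 10, 6, 2, 2]

# Index of the class with the highest frequency (first one in case of ties).
modal_idx = max(range(len(freqs)), key=freqs.__getitem__)

# Neighbouring frequencies; assumed to be 0 when the modal class is at an end.
prev_f = freqs[modal_idx - 1] if modal_idx > 0 else 0
next_f = freqs[modal_idx + 1] if modal_idx < len(freqs) - 1 else 0

delta1 = freqs[modal_idx] - prev_f
delta2 = freqs[modal_idx] - next_f
print(modal_idx + 1, delta1, delta2)  # 2 20 15
```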
For the geometric mean:
N = f1 + f2 + f3 + … = 50
log G = (5 × log 18.5 + 25 × log 25 + 10 × log 31.5 + 6 × log 38 + 2 × log 44.5 + 2 × log 51) / 50
log G = 1.44916053
G = 28.13 km/h
For the harmonic mean:
1/H = (1/50) × [(5/18.5) + (25/25) + (10/31.5) + (6/38) + (2/44.5) + (2/51)] = 0.036596
H = 27.33 km/h
In summary, the measures of central tendency, all in km/h, are:
-Mean: 29.03
-Median: 26.80
-Mode: 25.43
-Geometric mean: 28.13
-Harmonic mean: 27.33
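As a cross-check, the three means can be recomputed with Python's standard `statistics` module by repeating each class mark as many times as its frequency. This is only a sketch: it assumes Python 3.8 or later (for `statistics.geometric_mean`), and the variable names are hypothetical.

```python
import statistics

marks = [18.5, 25.0, 31.5, 38.0, 44.5, 51.0]
freqs = [5, 25, 10, 6, 2, 2]

# Repeat each class mark as many times as its frequency.
expanded = [m for m, f in zip(marks, freqs) for _ in range(f)]

mean = statistics.mean(expanded)
geo = statistics.geometric_mean(expanded)
har = statistics.harmonic_mean(expanded)

print(round(mean, 1), round(geo, 1), round(har, 1))  # 29.0 28.1 27.3
print(har <= geo <= mean)  # True
```

The second line of output confirms the ordering H ≤ G ≤ X for this data set.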