MEV 019: Unit 07 - Descriptive Statistics-I
UNIT 7: DESCRIPTIVE STATISTICS – I
7.1 Introduction
Descriptive statistics involves methods of organizing, summarizing, and
presenting data in an informative way. This unit introduces foundational
statistical measures used to describe and understand data patterns,
particularly measures of central tendency and dispersion. These techniques are
widely applied in environmental science for summarizing large data sets such as
temperature records, pollutant concentrations, and biodiversity indices.
7.2 Objectives
After studying this unit, you will be able to:
- Understand the purpose and importance of descriptive statistics.
- Explain the concepts of central tendency and dispersion.
- Calculate and interpret the arithmetic mean, median, and mode.
- Compute and compare different measures of dispersion such as range, mean deviation, variance, and standard deviation.
- Understand and apply the coefficient of variation for comparing variability.
7.3 Measures of Central Tendency
Measures of central tendency are statistical values that represent the
center or typical value of a dataset. The most common measures include the
mean, median, and mode.
7.3.1 Significance of the Measure of Central Tendency
These measures help:
- Summarize data sets with a single representative value.
- Facilitate comparisons between different data groups.
- Serve as a basis for further statistical analysis.
7.3.2 Properties of a Good Average
A good average should:
- Be clearly defined and easy to understand.
- Be based on all data values.
- Be stable and not greatly affected by extreme values.
- Allow for further mathematical treatment.
7.3.3 Different Measures of Central Tendency
7.4 Arithmetic Mean
The arithmetic mean is the most common measure of central tendency,
calculated by dividing the sum of all values by the number of values.
7.4.1 Calculation of Simple Arithmetic Mean
Formula:
$$\bar{X} = \frac{\sum X}{N}$$
Where:
- $\sum X$ = Sum of all values
- $N$ = Number of values
Example:
If pollutant levels (in µg/m³) on five days are: 50, 55, 60, 65, and 70,
$$\bar{X} = \frac{50 + 55 + 60 + 65 + 70}{5} = \frac{300}{5} = 60$$
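The same calculation can be checked in a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the unit.

```python
from statistics import fmean

# Pollutant levels (µg/m³) from the worked example
levels = [50, 55, 60, 65, 70]

# Arithmetic mean: sum of all values divided by the number of values
manual_mean = sum(levels) / len(levels)

# Standard-library equivalent (fmean always returns a float)
library_mean = fmean(levels)

print(manual_mean, library_mean)  # 60.0 60.0
```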
7.4.2 Combined Arithmetic Mean
When data is divided into groups with different sample sizes:
$$\bar{X}_{combined} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \dots + n_k \bar{X}_k}{n_1 + n_2 + \dots + n_k}$$
Where:
- $n_1, n_2, \dots$ = sample sizes
- $\bar{X}_1, \bar{X}_2, \dots$ = means of each group
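A short Python sketch may make the weighting explicit. The function name `combined_mean` and the two-site figures are hypothetical, chosen only to illustrate the formula.

```python
def combined_mean(sizes, means):
    """Combine group means, weighting each by its sample size."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    total_n = sum(sizes)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    # Numerator: n_1*mean_1 + n_2*mean_2 + ... ; denominator: n_1 + n_2 + ...
    return sum(n * m for n, m in zip(sizes, means)) / total_n


# Hypothetical data: site A has 10 readings averaging 50,
# site B has 30 readings averaging 70.
sizes = [10, 30]
means = [50.0, 70.0]

result = combined_mean(sizes, means)
print(result)  # 65.0
```

Note that the result (65) lies closer to the mean of the larger group than the simple average of the two means (60) would.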
7.5 Median
The median is the middle value of a dataset when arranged in ascending
or descending order. It divides the data into two equal parts.
7.5.1 Finding the Median for a Set of Data
- For an odd number of observations, the median is the middle value of the ordered data:
$$\text{Median} = \left( \frac{n + 1}{2} \right)^{\text{th}} \text{ value}$$
- For an even number of observations, the median is the average of the two middle values:
$$\text{Median} = \frac{(n/2)^{\text{th}} \text{ value} + (n/2 + 1)^{\text{th}} \text{ value}}{2}$$
Example:
Data: 10, 15, 20, 25, 30 → Median = 20
Data: 10, 15, 20, 25 → Median = (15 + 20)/2 = 17.5
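The odd and even cases can be sketched in Python as follows. The helper name `median_of` is illustrative; Python's standard library offers `statistics.median`, which behaves the same way.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)          # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median_of([10, 15, 20, 25, 30])
even_median = median_of([10, 15, 20, 25])
print(odd_median, even_median)  # 20 17.5
```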
7.6 Mode
The mode is the value that appears most frequently in a dataset.
7.6.1 Calculation of Mode
For ungrouped data: Identify the value with the highest frequency.
For grouped data, Mode is calculated using:
$$\text{Mode} = L + \left( \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \right) \times h$$
Where:
- $L$ = lower boundary of modal class
- $f_1$ = frequency of modal class
- $f_0$ = frequency of preceding class
- $f_2$ = frequency of succeeding class
- $h$ = class width
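The grouped-data formula can be illustrated with a small Python sketch. The frequency table below is hypothetical, and the function assumes equal class widths. Treating a missing neighbouring class as having frequency 0 is an assumption made here for the edge cases, not something stated in the unit.

```python
def grouped_mode(lower_bounds, frequencies, width):
    """Estimate the mode of grouped data with equal class widths."""
    # Index of the modal class (the class with the highest frequency)
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f1 = frequencies[i]
    # Assumption: a missing neighbouring class counts as frequency 0
    f0 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 = 0")
    return lower_bounds[i] + (f1 - f0) / denominator * width


# Hypothetical classes: 0-10, 10-20, 20-30, 30-40, 40-50
lower_bounds = [0, 10, 20, 30, 40]
frequencies = [5, 8, 12, 7, 3]

mode_estimate = grouped_mode(lower_bounds, frequencies, width=10)
print(round(mode_estimate, 2))  # 24.44
```

Here the modal class is 20-30, so L = 20, f1 = 12, f0 = 8, f2 = 7, and the estimate is 20 + (4/9) × 10.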
7.7 Measures of Dispersion
Dispersion indicates the extent to which data values vary around the
central tendency.
7.8 Range
Formula:
$$\text{Range} = \text{Maximum value} - \text{Minimum value}$$
It gives only a rough idea of variability, because it depends on the two extreme values alone.
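As a quick illustration, the range of the pollutant readings from Section 7.4.1 can be computed in Python (variable names are illustrative):

```python
# Pollutant levels (µg/m³) reused from the arithmetic mean example
levels = [50, 55, 60, 65, 70]

# Range: largest value minus smallest value
data_range = max(levels) - min(levels)
print(data_range)  # 20
```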
7.9 Mean Deviation
Mean deviation is the average of the absolute deviations from the
central value (mean, median, or mode).
Formula:
$$\text{Mean Deviation (MD)} = \frac{\sum |X - \bar{X}|}{N}$$
Where:
- $|X - \bar{X}|$ = absolute deviation from the mean
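A minimal Python sketch of the mean deviation about the mean, reusing the pollutant readings from Section 7.4.1 (variable names are illustrative):

```python
# Pollutant levels (µg/m³) reused from the arithmetic mean example
levels = [50, 55, 60, 65, 70]

n = len(levels)
mean = sum(levels) / n                      # 60.0

# Absolute deviations from the mean: 10, 5, 0, 5, 10
abs_deviations = [abs(x - mean) for x in levels]

# Mean deviation: average of the absolute deviations
mean_deviation = sum(abs_deviations) / n
print(mean_deviation)  # 6.0
```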
7.10 Standard Deviation and Variance
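These are more precise measures of dispersion than the range, because they are based on every observation rather than on the two extreme values alone.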
7.10.1 Root Mean Square Deviation (RMSD)
It is the square root of the average of squared deviations from the
mean.
$$\text{RMSD} = \sqrt{ \frac{\sum (X - \bar{X})^2}{N} }$$
7.10.2 Standard Deviation (S.D.)
$$\sigma = \sqrt{ \frac{\sum (X - \bar{X})^2}{N} }$$
Standard deviation measures the spread of data from the mean.
7.10.3 Variance
$$\text{Variance} = \sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$
Variance is the square of the standard deviation.
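The following Python sketch works through the variance and standard deviation for the pollutant readings from Section 7.4.1. Because the formulas above divide by N, they correspond to the population functions `pvariance` and `pstdev` in Python's `statistics` module. The functions `variance` and `stdev` in that module divide by N − 1 instead and would give different results.

```python
import math
from statistics import pstdev, pvariance

# Pollutant levels (µg/m³) reused from the arithmetic mean example
levels = [50, 55, 60, 65, 70]

n = len(levels)
mean = sum(levels) / n                              # 60.0

# Squared deviations from the mean: 100, 25, 0, 25, 100
squared_deviations = [(x - mean) ** 2 for x in levels]

variance = sum(squared_deviations) / n              # 250 / 5 = 50.0
std_dev = math.sqrt(variance)                       # roughly 7.07

print(variance, round(std_dev, 2))

# The standard library's population versions follow the same divide-by-N formulas
assert math.isclose(pvariance(levels), variance)
assert math.isclose(pstdev(levels), std_dev)
```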
7.11 Coefficient of Variation
It is a relative measure of dispersion, useful for comparing variability
across datasets.
Formula:
$$\text{C.V.} = \left( \frac{\sigma}{\bar{X}} \right) \times 100\%$$
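A lower C.V. indicates less variability relative to the mean, and a higher C.V. indicates more. Because it is unit-free, it allows datasets measured in different units or on different scales to be compared.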
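A brief Python sketch of such a comparison follows. The pollutant readings come from Section 7.4.1; the temperature readings and the function name `coefficient_of_variation` are hypothetical, introduced only for illustration.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    mean = fmean(values)
    if mean == 0:
        raise ValueError("C.V. is undefined when the mean is zero")
    return pstdev(values) / mean * 100


pollutant = [50, 55, 60, 65, 70]      # µg/m³, from the earlier example
temperature = [18, 22, 24, 26, 30]    # °C, hypothetical readings

cv_pollutant = coefficient_of_variation(pollutant)      # about 11.8 %
cv_temperature = coefficient_of_variation(temperature)  # about 16.7 %

print(round(cv_pollutant, 1), round(cv_temperature, 1))
```

Although the pollutant readings have the larger standard deviation in absolute terms (about 7.07 against 4), the temperature readings vary more relative to their mean.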
7.12 Let Us Sum Up
Descriptive statistics help summarize and understand data efficiently.
Measures of central tendency—mean, median, and mode—describe the average
behavior of data, while measures of dispersion such as range, standard
deviation, and coefficient of variation reveal how much the data varies around
the average. These tools are crucial in environmental science for analyzing
field data, monitoring environmental changes, and making informed decisions.
7.13 Key Words
- Mean: The arithmetic average.
- Median: The middle value in ordered data.
- Mode: The most frequent value.
- Range: Difference between the largest and smallest value.
- Standard Deviation: A measure of data spread from the mean.
- Variance: Square of standard deviation.
- Coefficient of Variation: Ratio of standard deviation to mean, expressed as a percentage.