MEV 019: Unit 07 - Descriptive Statistics-I

 UNIT 7: DESCRIPTIVE STATISTICS – I



7.1 Introduction

Descriptive statistics involves methods of organizing, summarizing, and presenting data in an informative way. This unit introduces foundational statistical measures used to describe and understand data patterns, particularly measures of central tendency and dispersion. These techniques are widely applied in environmental science for summarizing large data sets such as temperature records, pollutant concentrations, and biodiversity indices.


7.2 Objectives

After studying this unit, you will be able to:

  • Understand the purpose and importance of descriptive statistics.
  • Explain the concepts of central tendency and dispersion.
  • Calculate and interpret the arithmetic mean, median, and mode.
  • Compute and compare different measures of dispersion such as range, mean deviation, variance, and standard deviation.
  • Understand and apply the coefficient of variation for comparing variability.

7.3 Measures of Central Tendency

Measures of central tendency are statistical values that represent the center or typical value of a dataset. The most common measures include the mean, median, and mode.

7.3.1 Significance of the Measure of Central Tendency

These measures help:

  • Summarize data sets with a single representative value.
  • Facilitate comparisons between different data groups.
  • Serve as a basis for further statistical analysis.

7.3.2 Properties of a Good Average

A good average should:

  • Be clearly defined and easy to understand.
  • Be based on all data values.
  • Be stable and not greatly affected by extreme values.
  • Allow for further mathematical treatment.

7.3.3 Different Measures of Central Tendency


7.4 Arithmetic Mean

The arithmetic mean is the most common measure of central tendency, calculated by dividing the sum of all values by the number of values.

7.4.1 Calculation of Simple Arithmetic Mean

Formula:

$$\bar{X} = \frac{\sum X}{N}$$

Where:

  • $\sum X$ = Sum of all values
  • $N$ = Number of values

Example:

If pollutant levels (in µg/m³) on five days are: 50, 55, 60, 65, and 70,

$$\bar{X} = \frac{50 + 55 + 60 + 65 + 70}{5} = \frac{300}{5} = 60$$
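The calculation above can be reproduced in a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` and the variable `pollutant_levels` are illustrative choices, not part of the unit.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Pollutant levels in µg/m³ on five days, taken from the example above.
pollutant_levels = [50, 55, 60, 65, 70]
print(arithmetic_mean(pollutant_levels))  # 60.0
```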

7.4.2 Combined Arithmetic Mean

When the data are divided into k groups whose sizes and means are known, the mean of the pooled data is the size-weighted average of the group means:

$$\bar{X}_{combined} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \dots + n_k \bar{X}_k}{n_1 + n_2 + \dots + n_k}$$

Where:

  • $n_1, n_2, \dots, n_k$ = sample sizes of the groups
  • $\bar{X}_1, \bar{X}_2, \dots, \bar{X}_k$ = means of each group
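As an illustration of the combined mean, the sketch below pools two groups. The group sizes and means are hypothetical values chosen for the example (they do not come from the unit), and `combined_mean` is an illustrative name.

```python
def combined_mean(sizes, means):
    """Weight each group mean by its group size and divide by the total size."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    total_n = sum(sizes)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(n * m for n, m in zip(sizes, means)) / total_n


# Hypothetical data: station A has 5 readings with mean 60.0,
# station B has 10 readings with mean 45.0.
sizes = [5, 10]
means = [60.0, 45.0]
print(combined_mean(sizes, means))  # 50.0
```

Note that the result (50.0) is closer to the mean of the larger group than the simple average of the two means (52.5) would be.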

7.5 Median

The median is the middle value of a dataset when arranged in ascending or descending order. It divides the data into two equal parts.

7.5.1 Finding the Median for a Set of Data

  • For an odd number of observations n (arranged in order):

$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ value, i.e. the middle value}$$

  • For an even number of observations n (arranged in order):

$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ value} + \left(\frac{n}{2} + 1\right)^{\text{th}} \text{ value}}{2}$$

Example:

Data: 10, 15, 20, 25, 30 → Median = 20
Data: 10, 15, 20, 25 → Median = (15 + 20)/2 = 17.5
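The two cases can be sketched in Python as follows. The function name `median` is an illustrative choice; the data are the two examples above. Python lists are indexed from 0, so the positions in the formulas shift by one.

```python
def median(values):
    """Return the middle value of the sorted data (or the mean of the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th value, which is index n // 2 when counting from 0.
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([10, 15, 20, 25, 30]))  # 20
print(median([10, 15, 20, 25]))      # 17.5
```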


7.6 Mode

The mode is the value that appears most frequently in a dataset.

7.6.1 Calculation of Mode

For ungrouped data: Identify the value with the highest frequency.
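For ungrouped data, counting frequencies can be sketched in Python as below. The readings are hypothetical, and `modes` is an illustrative name; it returns a list because a dataset may have more than one value tied for the highest frequency.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical readings: 12 occurs three times, more often than any other value.
readings = [12, 15, 12, 18, 15, 12, 20]
print(modes(readings))  # [12]
```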

For grouped data, Mode is calculated using:

$$\text{Mode} = L + \left( \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \right) \times h$$

Where:

  • $L$ = lower boundary of modal class
  • $f_1$ = frequency of modal class
  • $f_0$ = frequency of preceding class
  • $f_2$ = frequency of succeeding class
  • $h$ = class width
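A small sketch of the grouped-data formula follows. The frequency table is hypothetical (classes 0-10, 10-20, 20-30 with frequencies 5, 12, 8, so the modal class is 10-20), and `grouped_mode` is an illustrative name.

```python
def grouped_mode(lower_boundary, f1, f0, f2, class_width):
    """Apply Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("the formula is undefined when 2*f1 - f0 - f2 equals zero")
    return lower_boundary + (f1 - f0) / denominator * class_width


# Hypothetical table: modal class 10-20 (f1 = 12), preceding class f0 = 5,
# succeeding class f2 = 8, class width h = 10.
mode_estimate = grouped_mode(lower_boundary=10, f1=12, f0=5, f2=8, class_width=10)
print(round(mode_estimate, 2))  # 16.36
```

The estimate falls inside the modal class and is pulled towards the neighbouring class with the higher frequency (here the succeeding class).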

7.7 Measures of Dispersion

Dispersion indicates the extent to which data values vary around the central tendency.


7.8 Range

Formula:

$$\text{Range} = \text{Maximum value} - \text{Minimum value}$$

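The range gives a quick but rough idea of variability, because it depends only on the two extreme values and ignores how the remaining values are distributed.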
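As a brief sketch, the range of the five pollutant readings used earlier in this unit (50, 55, 60, 65, 70 µg/m³) can be computed as follows; `value_range` is an illustrative name.

```python
def value_range(values):
    """Return the difference between the largest and smallest value."""
    if not values:
        raise ValueError("the range of an empty dataset is undefined")
    return max(values) - min(values)


print(value_range([50, 55, 60, 65, 70]))  # 20
```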


7.9 Mean Deviation

Mean deviation is the average of the absolute deviations from the central value (mean, median, or mode).

Formula:

$$\text{Mean Deviation (MD)} = \frac{\sum |X - \bar{X}|}{N}$$

Where:

  • $|X - \bar{X}|$ = absolute deviation from the mean
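The mean deviation about the mean can be sketched as below, using the five pollutant readings from the arithmetic mean example (50, 55, 60, 65, 70 µg/m³). The function name `mean_deviation` is an illustrative choice.

```python
def mean_deviation(values):
    """Return the average absolute deviation of the values from their mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean deviation of an empty dataset is undefined")
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


# Absolute deviations from the mean (60) are 10, 5, 0, 5, 10; their sum is 30.
print(mean_deviation([50, 55, 60, 65, 70]))  # 6.0
```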

7.10 Standard Deviation and Variance

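Standard deviation and variance are based on every observation and on squared deviations from the mean, which makes them more informative measures of dispersion than the range and better suited to further mathematical treatment.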

7.10.1 Root Mean Square Deviation (RMSD)

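It is the square root of the average of the squared deviations, taken here from the mean:

$$\text{RMSD} = \sqrt{ \frac{\sum (X - \bar{X})^2}{N} }$$

Deviations can in general be measured from any reference value; when they are measured from the mean, as here, the RMSD coincides with the standard deviation.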

7.10.2 Standard Deviation (S.D.)

$$\sigma = \sqrt{ \frac{\sum (X - \bar{X})^2}{N} }$$

Standard deviation measures the spread of data from the mean.

7.10.3 Variance

$$\text{Variance} = \sigma^2 = \frac{\sum (X - \bar{X})^2}{N}$$

Variance is the square of the standard deviation.
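The sketch below computes the variance and standard deviation with N in the denominator, as in the formulas of this section, for the five pollutant readings used earlier (50, 55, 60, 65, 70 µg/m³). The function names are illustrative. The results are cross-checked against `statistics.pvariance` and `statistics.pstdev` from the Python standard library, which also divide by N.

```python
import math
import statistics


def population_variance(values):
    """Return the average of the squared deviations from the mean (divisor N)."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance of an empty dataset is undefined")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def population_std_dev(values):
    """Return the square root of the population variance."""
    return math.sqrt(population_variance(values))


data = [50, 55, 60, 65, 70]
# Squared deviations from the mean (60) are 100, 25, 0, 25, 100; their sum is 250.
print(population_variance(data))           # 50.0
print(round(population_std_dev(data), 3))  # 7.071

# Cross-check against the standard library's population formulas.
print(math.isclose(population_variance(data), statistics.pvariance(data)))  # True
print(math.isclose(population_std_dev(data), statistics.pstdev(data)))      # True
```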


7.11 Coefficient of Variation

It is a relative measure of dispersion, useful for comparing variability across datasets.

Formula:

$$\text{C.V.} = \left( \frac{\sigma}{\bar{X}} \right) \times 100\%$$

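A lower C.V. indicates less variability relative to the mean (a more consistent dataset), while a higher C.V. indicates more. Because it is a unit-free percentage, it can be used to compare datasets measured in different units or with very different means.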
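To illustrate such a comparison, the sketch below computes the C.V. for two datasets with the same mean. `site_a` reuses the pollutant readings from the arithmetic mean example; `site_b` is hypothetical data invented for the comparison, and the function name is an illustrative choice.

```python
import math


def coefficient_of_variation(values):
    """Return (standard deviation / mean) * 100, using the divisor-N standard deviation."""
    n = len(values)
    if n == 0:
        raise ValueError("the C.V. of an empty dataset is undefined")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("the C.V. is undefined when the mean is zero")
    std_dev = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return std_dev / mean * 100


site_a = [50, 55, 60, 65, 70]    # readings from the earlier example
site_b = [20, 40, 60, 80, 100]   # hypothetical readings with the same mean (60)

print(round(coefficient_of_variation(site_a), 2))  # 11.79
print(round(coefficient_of_variation(site_b), 2))  # 47.14
```

Both sites have a mean of 60, but the much higher C.V. for `site_b` shows that its readings are far less consistent.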


7.12 Let Us Sum Up

Descriptive statistics help summarize and understand data efficiently. Measures of central tendency (mean, median, and mode) describe the typical value of a dataset, while measures of dispersion such as range, mean deviation, variance, standard deviation, and coefficient of variation reveal how much the data vary around that value. These tools are crucial in environmental science for analyzing field data, monitoring environmental changes, and making informed decisions.


7.13 Key Words

  • Mean: The arithmetic average.
  • Median: The middle value in ordered data.
  • Mode: The most frequent value.
  • Range: Difference between the largest and smallest value.
  • Standard Deviation: A measure of data spread from the mean.
  • Variance: Square of standard deviation.
  • Coefficient of Variation: Ratio of standard deviation to mean, expressed as a percentage.

 
