UNIT-3 FY B.Sc. DS Descriptive Statistics


Scene 1 (0s)

UNIT-3: Moments, Skewness and Kurtosis

Moments

In statistics, we talk of the moments of a random variable about some point. These moments are used to describe the various characteristics of a frequency distribution: central tendency, dispersion, skewness, kurtosis, etc.

Definition: The arithmetic mean of the r-th powers of the deviations of the items from a chosen point gives the r-th moment of the distribution. If the deviations are taken from the arithmetic mean of the distribution, the moment is known as a central moment.

Central Moments or Moments about the Actual Mean

Let $\bar{x}$ be the mean of an individual series of N items, and let $(x - \bar{x})$ be the deviation of an item x from that mean.

First moment about the mean: $\mu_1 = \frac{\sum (x - \bar{x})}{N}$
Second moment about the mean: $\mu_2 = \frac{\sum (x - \bar{x})^2}{N}$
Third moment about the mean: $\mu_3 = \frac{\sum (x - \bar{x})^3}{N}$
Fourth moment about the mean: $\mu_4 = \frac{\sum (x - \bar{x})^4}{N}$
r-th moment about the mean: $\mu_r = \frac{\sum (x - \bar{x})^r}{N}$

Central Moments for a Frequency Distribution

Let a frequency distribution have observations $x_1, x_2, x_3, \ldots, x_n$ with respective frequencies $f_1, f_2, \ldots, f_n$, and let $N = \sum f_i$. The first four moments are defined as given below.

First moment about the mean: $\mu_1 = \frac{\sum f_i (x_i - \bar{x})}{N}$
Second moment about the mean: $\mu_2 = \frac{\sum f_i (x_i - \bar{x})^2}{N}$
Third moment about the mean: $\mu_3 = \frac{\sum f_i (x_i - \bar{x})^3}{N}$
Fourth moment about the mean: $\mu_4 = \frac{\sum f_i (x_i - \bar{x})^4}{N}$
r-th moment about the mean: $\mu_r = \frac{\sum f_i (x_i - \bar{x})^r}{N}$

Properties of Central Moments

1. The first moment about the mean is always zero: $\mu_1 = 0$.
2. The second moment about the mean measures the variance: $\mu_2 = \sigma^2$.
3. The third moment about the mean measures skewness. It gives us an idea of the degree of skewness present in a series: If $\mu_3 > 0$, the given distribution is positively skewed.

Scene 2 (27s)

If $\mu_3 < 0$, the given distribution is negatively skewed. If $\mu_3 = 0$, the given distribution is symmetrical.
4. In a symmetrical distribution, all odd central moments are zero: $\mu_1 = \mu_3 = \mu_5 = \mu_7 = \mu_9 = \ldots = \mu_{2r+1} = 0$.
5. The fourth moment about the mean helps us to measure kurtosis: $\beta_2 = \frac{\mu_4}{\mu_2^2}$.
6. Two important constants of a distribution are calculated from $\mu_2$, $\mu_3$ and $\mu_4$. They are $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$ and $\beta_2 = \frac{\mu_4}{\mu_2^2}$. $\beta_1$ measures skewness and $\beta_2$ measures kurtosis.

Q1) Calculation of raw moments

Calculate the first four moments about an arbitrary origin from the following data:

Marks:             60-62   63-65   66-68   69-71   72-74
No. of students:     5      18      42      27       8

Also find the central moments.

Solution: Let us take the assumed mean A = 67 and let d = x - A = x - 67.

Class   Mid value x    f     d     fd    d^2   fd^2    d^3    fd^3    d^4    fd^4
60-62       61         5    -6    -30    36    180    -216   -1080   1296    6480
63-65       64        18    -3    -54     9    162     -27    -486     81    1458
66-68       67        42     0      0     0      0       0       0      0       0
69-71       70        27     3     81     9    243      27     729     81    2187
72-74       73         8     6     48    36    288     216    1728   1296   10368
Total                100           45          873            891           20493

Raw moments:

$\mu'_1 = \frac{\sum f_i d_i}{N} = \frac{45}{100} = 0.45$
$\mu'_2 = \frac{\sum f_i d_i^2}{N} = \frac{873}{100} = 8.73$
$\mu'_3 = \frac{\sum f_i d_i^3}{N} = \frac{891}{100} = 8.91$
$\mu'_4 = \frac{\sum f_i d_i^4}{N} = \frac{20493}{100} = 204.93$

Central moments:

$\mu_1 = \mu'_1 - \mu'_1 = 0$
$\mu_2 = \mu'_2 - (\mu'_1)^2 = 8.73 - (0.45)^2 = 8.73 - 0.2025 = 8.5275$
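The arithmetic in this solution can be checked numerically. The following is a minimal Python sketch, not part of the original unit: the function names `raw_moments` and `central_from_raw` are illustrative, and the conversion uses the standard relations between raw moments about an assumed origin and central moments.

```python
def raw_moments(mid_values, freqs, origin, max_order=4):
    """Return [mu'_1, ..., mu'_max_order] about `origin` for grouped data."""
    n = sum(freqs)
    return [
        sum(f * (x - origin) ** r for x, f in zip(mid_values, freqs)) / n
        for r in range(1, max_order + 1)
    ]


def central_from_raw(m1, m2, m3, m4):
    """Convert the first four raw moments into central moments."""
    mu1 = 0.0  # the first central moment is always zero
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu1, mu2, mu3, mu4


# Data of Q1: class mid values and frequencies, assumed mean A = 67
mids = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]

raw = raw_moments(mids, freqs, origin=67)
central = central_from_raw(*raw)
print("raw moments:", raw)
print("central moments:", central)
```

The printed values should agree with the hand calculation up to rounding in the last decimal place.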

Scene 3 (53s)

$\mu_3 = \mu'_3 - 3\mu'_1\mu'_2 + 2(\mu'_1)^3 = 8.91 - 3(0.45)(8.73) + 2(0.45)^3 = 8.91 - 11.7855 + 0.1822 = -2.693$

$\mu_4 = \mu'_4 - 4\mu'_1\mu'_3 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 204.93 - 4(0.45)(8.91) + 6(8.73)(0.45)^2 - 3(0.45)^4 = 204.93 - 16.038 + 10.607 - 0.123 = 199.376$

Skewness

Definitions of skewness

1. "When a series is not symmetrical it is said to be asymmetrical or skewed." - Croxton & Cowden
2. "Skewness refers to the asymmetry or lack of symmetry in the shape of a frequency distribution." - Morris Hamburg
3. "Measures of skewness tell us the direction and the extent of skewness. In symmetrical distribution the mean, median and mode are identical. The more the mean moves away from the mode, the larger the asymmetry or skewness." - Simpson & Kafka
4. "A distribution is said to be 'skewed' when the mean and the median fall at different points in the distribution, and the balance (or center of gravity) is shifted to one side or the other, to left or right." - Garrett

Symmetrical Distribution. In a symmetrical distribution the values of the mean, median and mode coincide. The spread of the frequencies is the same on both sides of the center point of the curve.

Asymmetrical Distribution. A distribution which is not symmetrical is called a skewed distribution. Such a distribution can be either positively skewed or negatively skewed, as would be clear from the diagrams.

1. Positively Skewed Distribution. In a positively skewed distribution, the value of the mean is the greatest and that of the mode the least; the median lies between the two, as is clear from the diagram.

Scene 4 (1m 20s)

2. Negatively Skewed Distribution. In a negatively skewed distribution, the value of the mode is the greatest and that of the mean the least; the median lies between the two. In the positively skewed distribution, the frequencies are spread out over a greater range of values on the high-value (right-hand) end of the curve than on the low-value end; in the negatively skewed distribution the position is reversed.

Measures of Skewness

Measures of skewness quantify the asymmetry of a distribution, indicating whether the tail of the data is longer on the right or on the left side compared with a symmetrical distribution such as the normal. Common measures include:

1) Karl Pearson's Coefficient of Skewness

Formula: $Sk_p = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$

When the mode is ill-defined, the median-based form $Sk_p = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$ is used instead.

Interpretation
1. Its value usually lies between -1 and +1.
2. When its value is zero, there is no skewness, i.e. the distribution is symmetrical.
3. When its value is negative, the distribution is negatively skewed.
4. When its value is positive, the distribution is positively skewed.

Karl Pearson's coefficient of skewness based on mean and mode

Q1) Calculate Karl Pearson's coefficient of skewness for the following data: 25, 15, 23, 40, 27, 25, 23, 25, 20.

Scene 5 (1m 39s)

Solution: Take the assumed mean A = 25 and let d = x - A.

 x    d = x - A    d^2
25        0          0
15      -10        100
23       -2          4
40       15        225
27        2          4
25        0          0
23       -2          4
25        0          0
20       -5         25
      Σd = -2   Σd^2 = 362

Mean $= A + \frac{\sum d}{N} = 25 + \frac{-2}{9} = 25 - 0.22 = 24.78$

S.D. $= \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{362}{9} - \left(\frac{-2}{9}\right)^2} = \sqrt{40.22 - (-0.22)^2} = 6.3$

Mode = 25, because this item has the highest frequency.

Coefficient of skewness: $Sk_p = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}} = \frac{24.78 - 25}{6.3} \approx -0.035$

Q2) Calculate Karl Pearson's coefficient of skewness for the following data of marks obtained by five students:

S. No.   1    2    3    4    5
Marks   12   18   35   22   18

Solution:

S. No.   Marks (X)   x = X - X̄    x^2
1           12          -9         81
2           18          -3          9
3           35          14        196
4           22           1          1
5           18          -3          9
N = 5    ΣX = 105              Σx^2 = 296

Mean: $\bar{X} = \frac{\sum X}{N} = \frac{105}{5} = 21$

S.D.: $\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{296}{5}} = \sqrt{59.2} = 7.7$

Mode = 18, because it occurs the maximum number of times in the series.

Coefficient of skewness: $Sk_p = \frac{\text{Mean} - \text{Mode}}{\text{S.D.}} = \frac{21 - 18}{7.7} = \frac{3}{7.7} = 0.3896$
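The mode-based coefficient worked above can also be checked with Python's standard `statistics` module. This is only a sketch under stated assumptions: the function name `pearson_mode_skewness` is illustrative, the data are assumed to have a single clear mode, and the population standard deviation (divisor N) is used, as in the hand calculation.

```python
from statistics import mean, mode, pstdev


def pearson_mode_skewness(data):
    """Karl Pearson's coefficient: (mean - mode) / population S.D."""
    return (mean(data) - mode(data)) / pstdev(data)


marks = [12, 18, 35, 22, 18]                      # data of Q2
values = [25, 15, 23, 40, 27, 25, 23, 25, 20]     # data of Q1

print(round(pearson_mode_skewness(marks), 2))     # positive: right-skewed
print(round(pearson_mode_skewness(values), 3))    # slightly negative
```

The result for the marks differs from 0.3896 only in the later decimal places, because the hand calculation rounds the standard deviation to 7.7 before dividing.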

Scene 6 (1m 56s)

Hence Karl Pearson's coefficient of skewness = 0.3896.

Karl Pearson's coefficient of skewness based on median

Karl Pearson's coefficient of skewness: $Sk_p = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}}$

Calculate Karl Pearson's coefficient of skewness from the data given below:

Weekly wages   No. of workers   Weekly wages   No. of workers
40-50                5          90-100              30
50-60                6          100-110             36
60-70                8          110-120             50
70-80               10          120-130             60
80-90               25          130-140             70

Solution: Since the modal class is the last class, the mode is ill-defined, and the coefficient of skewness is calculated by the formula $Sk_p = \frac{3(\text{Mean} - \text{Median})}{\text{S.D.}}$.

Let the assumed mean be A = 105, the width of the class interval i = 10, and d = (m - 105)/10.

Wages     Mid-point (m)    f    d = (m - 105)/10    fd     fd^2    c.f.
40-50          45          5          -6            -30     180       5
50-60          55          6          -5            -30     150      11
60-70          65          8          -4            -32     128      19
70-80          75         10          -3            -30      90      29
80-90          85         25          -2            -50     100      54
90-100         95         30          -1            -30      30      84
100-110       105         36           0              0       0     120
110-120       115         50          +1            +50      50     170
120-130       125         60          +2           +120     240     230
130-140       135         70          +3           +210     630     300
                       N = 300                  Σfd = 178   Σfd^2 = 1598

Mean: with A = 105, Σfd = 178, N = 300 and i = 10,

$\bar{x} = A + \frac{\sum fd}{N} \times i = 105 + \frac{178}{300} \times 10 = 105 + 5.93 = 110.93$

Median = size of the (N/2)th item = 300/2 = size of the 150th item. The median lies in the class 110-120.

Scene 7 (2m 21s)

l = 110, N/2 = 150, c.f. = 120 (cumulative frequency of the class preceding the median class), f = 50, i = 10

Median $= l + \frac{\frac{N}{2} - c.f.}{f} \times i = 110 + \frac{150 - 120}{50} \times 10 = 110 + 6 = 116$

Standard deviation: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{1598}{300} - \left(\frac{178}{300}\right)^2} \times 10 = \sqrt{5.33 - 0.352} \times 10 = 22.31$

Coefficient of skewness: $Sk_p = \frac{3(110.93 - 116)}{22.31} = \frac{-15.21}{22.31} = -0.682$

2) Bowley's Coefficient of Skewness

$Sk_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$

Bowley's absolute measure of skewness = $Q_3 + Q_1 - 2\,\text{Median}$.

Properties of Bowley's coefficient of skewness:
1) Bowley's measure is useful when the distribution has open-end classes or unequal class intervals.
2) The limits for Bowley's coefficient of skewness are $-1 \le Sk_B \le 1$.
3) In each case $Sk_B = 0$ implies the absence of skewness.

Q1) A distribution has $Q_3 = 36.4$, $Q_1 = 31.3$ and Median = 35. Find Bowley's coefficient of skewness.

$Sk_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1} = \frac{36.4 + 31.3 - 2 \times 35}{36.4 - 31.3} = \frac{-2.3}{5.1} = -0.45$

3) Kelly's Coefficient of Skewness

i) Kelly's coefficient of skewness $= \frac{P_{10} + P_{90} - 2P_{50}}{P_{90} - P_{10}}$
ii) Kelly's coefficient of skewness $= \frac{D_1 + D_9 - 2\,\text{Median}}{D_9 - D_1}$

Q1) Calculate Kelly's coefficient of skewness for the following data: $D_1 = 50$, $D_9 = 260$ and Median = 155.

Scene 8 (2m 44s)

Sol: Kelly's coefficient of skewness $= \frac{D_1 + D_9 - 2\,\text{Median}}{D_9 - D_1} = \frac{50 + 260 - 2 \times 155}{260 - 50} = \frac{310 - 310}{210} = 0$

The median lies exactly midway between $D_1$ and $D_9$, so by this measure the distribution shows no skewness.

4) Measure of Skewness Based on Moments

Karl Pearson defined the following four coefficients, known as $\beta_1$, $\beta_2$, $\gamma_1$ and $\gamma_2$:

$\beta_1 = \frac{\mu_3^2}{\mu_2^3}$, $\quad \beta_2 = \frac{\mu_4}{\mu_2^2}$, $\quad \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{\mu_3}{\sigma^3}$, $\quad \gamma_2 = \beta_2 - 3$

These four coefficients are pure numbers and are used in measuring skewness and kurtosis. The two coefficients $\beta_1$ and $\gamma_1$ are used for measuring skewness: $\beta_1$ gives its degree, while $\gamma_1$ takes the sign of $\mu_3$ and so also gives its direction.

Kurtosis

Kurtosis enables us to have an idea about the shape and nature of the middle peak of a frequency distribution. It is concerned with the flatness or peakedness of the frequency curve.

Definitions
1. "Kurtosis refers to the degree of peakedness of the hump of the distribution." - C. M. Mayess
2. "A measure of kurtosis indicates the degree to which the curve of the frequency distribution is peaked or flat-topped." - Croxton & Cowden
3. "The degree of kurtosis of a distribution is measured relative to the peakedness of a normal curve." - Simpson & Kafka

Types of Kurtosis Curves

Scene 9 (3m 4s)

Karl Pearson classified curves into three types on the basis of the shape of their peaks. These are mesokurtic, leptokurtic and platykurtic. These three types of curves are shown in the figure below:

[Figure: mesokurtic, leptokurtic and platykurtic curves]

Measure of Kurtosis

As a measure of kurtosis, Karl Pearson gave the coefficient $\beta_2$ and its derivative $\gamma_2$. These measures are based upon the second and fourth moments about the mean and are defined as:

$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{\mu_4}{\sigma^4}$

where $\mu_2$ = second moment about the mean and $\mu_4$ = fourth moment about the mean.

$\gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\sigma^4} - 3 = \frac{\mu_4 - 3\sigma^4}{\sigma^4}$

If $\beta_2 > 3$ (i.e. $\gamma_2 > 0$), the distribution is more peaked than the normal curve, and the curve is leptokurtic.
If $\beta_2 = 3$ (i.e. $\gamma_2 = 0$), the distribution has the same peakedness as the normal curve, and the curve is mesokurtic.
If $\beta_2 < 3$ (i.e. $\gamma_2 < 0$), the distribution is flatter than the normal curve, and the curve is platykurtic.
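The classification rule above is easy to express in code. The following Python sketch is illustrative only: the function name `classify_kurtosis` and the tolerance `tol` used to decide when $\beta_2$ counts as equal to 3 are assumptions, not part of the unit. The sample call uses the central moments $\mu_2 = 8.5275$ and $\mu_4 = 199.376$ obtained in the marks example earlier in the unit.

```python
import math


def classify_kurtosis(mu2, mu4, tol=1e-9):
    """Return (beta2, gamma2, label) from the 2nd and 4th central moments."""
    beta2 = mu4 / mu2 ** 2      # Pearson's beta_2 = mu_4 / mu_2^2
    gamma2 = beta2 - 3          # excess over the normal curve
    if math.isclose(beta2, 3.0, abs_tol=tol):
        label = "mesokurtic"
    elif beta2 > 3:
        label = "leptokurtic"
    else:
        label = "platykurtic"
    return beta2, gamma2, label


beta2, gamma2, label = classify_kurtosis(8.5275, 199.376)
print(round(beta2, 3), round(gamma2, 3), label)
```

For the marks data, $\beta_2$ comes out a little below 3, so that distribution is slightly platykurtic.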

Scene 10 (3m 21s)

Q1) The standard deviation of a symmetrical distribution is 3. What must be the value of the 4th moment about the mean in order that the distribution be mesokurtic?

Solution: A normal curve is mesokurtic, and for a mesokurtic curve $\beta_2 = 3$. Here $\mu_2 = \sigma^2 = 9$.

$\beta_2 = \frac{\mu_4}{\mu_2^2} \;\Rightarrow\; 3 = \frac{\mu_4}{81} \;\Rightarrow\; \mu_4 = 243$

Hence the 4th moment should be equal to 243.
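The same rearrangement, $\mu_4 = \beta_2 \sigma^4$, works for any standard deviation and any target value of $\beta_2$. A short Python sketch follows; the function name `fourth_moment_for` is hypothetical and chosen only for illustration.

```python
def fourth_moment_for(sd, beta2=3.0):
    """Fourth central moment that yields kurtosis coefficient `beta2`
    for a distribution with standard deviation `sd` (mu_2 = sd**2)."""
    return beta2 * sd ** 4


print(fourth_moment_for(3))  # mesokurtic case from the question above
```

With `sd=3` and the default `beta2=3.0` this reproduces the answer 243; any larger fourth moment would make the curve leptokurtic and any smaller one platykurtic.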