Regression Analysis

Scene 1 (0s)

Regression Analysis.

Scene 2 (5s)

[Audio] Often, professionals want to know whether there is a relationship between two or more variables. For instance, is there a relationship between the grade on the third exam a student takes and the grade on the final exam? If yes, how are they related, and how strongly? Regression can be used here to arrive at a conclusion. This is an example of bivariate data, that is, two variables; however, statisticians are mostly interested in multivariate data. Regression analysis is used to predict the value of one variable, the dependent variable, on the basis of other variables, the independent variables. In the simplest form of regression, simple linear regression, you work with one independent variable. The formula for simple linear regression is shown on the screen. In the next screen, we'll look at a few examples of regression analysis.

Scene 3 (1m 23s)

[Audio] Regression analysis is used in several situations, such as those described on the screen. In example 1, using the data given on the screen, you have to analyze the relationship between the size of a house and its selling price for a realtor. In example 2, you need to predict the exam scores of students who study for 7.2 hours, with the help of the data shown on the slide.

Scene 4 (1m 59s)

[Audio] A couple more examples are given on the screen. In example 3, based on the expected number of customers and the previous day's data, you need to predict the number of burgers that will be sold by a KFC outlet. In example 4, you have to calculate the life expectancy for a group of people from their average length of schooling, based on the data given.

Scene 5 (2m 33s)

[Audio] Now let's look at the two main types of regression analysis: simple linear regression and multiple linear regression. Both of these statistical methods use a linear equation to model the relationship between two or more variables. Simple linear regression uses one quantitative independent variable, X, to predict a quantitative dependent variable, Y. Multiple linear regression uses more than one independent variable, quantitative or qualitative, to predict a quantitative dependent variable, Y. We'll look at the two types of analysis in more detail in the slides that follow.

Scene 6 (3m 30s)

[Audio] In simple linear regression, the predictions of the explained variable Y, when plotted as a function of the explanatory variable X, form a straight line. The best-fitting line is called the regression line. The output of this model is a function that predicts the dependent variable on the basis of the values of the independent variable. The dependent variable is continuous, and the independent variable can be continuous or discrete.
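
As a rough illustration of how a best-fitting line can be found, the sketch below fits a straight line with the method of least squares using only the Python standard library. The data points and the helper names `fit_line` and `predict` are made up for illustration and do not come from the slides.

```python
# Minimal least-squares fit of a straight line y = intercept + slope * x.
# The data below are invented for illustration only.

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The regression line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict the dependent variable for a given value of x."""
    return intercept + slope * x


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # these points lie exactly on y = 1 + 2x

intercept, slope = fit_line(xs, ys)
print(intercept, slope)               # 1.0 2.0
print(predict(intercept, slope, 6.0)) # 13.0
```

With real data the points will not lie exactly on a line, and the fitted line is the one that minimizes the sum of squared vertical distances to the points.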

Scene 7 (4m 11s)

[Audio] Now let's look at the different kinds of linear and nonlinear analyses. The linear techniques are simple regression, the method of least squares, the coefficient of multiple determination, the standard error of the estimate, dummy variables, and interaction. Similarly, there are many nonlinear techniques available, such as polynomial, logarithmic, square root, reciprocal, and exponential.

Scene 8 (4m 46s)

[Audio] To understand this model, we'll first look at a few assumptions. The multiple linear regression model depicts the relationship between one dependent variable and two or more independent variables. The assumptions that justify the use of this model are as follows: a linear and additive relationship between the dependent and independent variables; multivariate normality; little or no collinearity in the data; little or no autocorrelation in the data; and homoscedasticity, that is, the variance of the errors is the same across all values of X. The equation for this model is shown on the screen.
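
The on-screen equation is not captured in this transcript. As an assumption about what it shows, a linear and additive model with one dependent variable and k independent variables is commonly written in textbooks as follows (the symbols are the conventional ones and may differ from the slide):

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon
```

Here y is the dependent variable, x_1 through x_k are the independent variables, \beta_0 is the intercept, \beta_1 through \beta_k are the coefficients, and \varepsilon is the error term.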

Scene 9 (5m 41s)

[Audio] A more descriptive graphical representation of simple linear regression is given on the screen. Beta one represents the slope. A slope of two implies that a one-unit change in X results in a two-unit change in Y. In other words, beta one represents the estimated change in the average value of Y as a result of a one-unit change in X. Beta naught represents the estimated average value of Y when the value of X is zero. Epsilon represents the random error term.
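
The graphic itself is not part of this transcript. In the conventional notation, which is assumed here and may differ from the slide, the simple linear regression model and its fitted line are usually written as:

```latex
y = \beta_0 + \beta_1 x + \varepsilon, \qquad \hat{y} = \beta_0 + \beta_1 x
```

In this form \beta_0 is the intercept (the average value of y at x = 0), \beta_1 is the slope (the change in the average value of y per one-unit change in x), and \varepsilon is the random error. With \beta_1 = 2, for example, moving from x = 3 to x = 4 raises the predicted value by 2.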

Scene 10 (6m 26s)

[Audio] Simple linear regression. The goal of a simple linear regression is to predict the value of a dependent variable based on an independent variable. The stronger the linear relationship between the independent variable and the dependent variable, the more accurate the prediction. Visually, the relationship between the variables can be represented in a scatter plot. The stronger the linear relationship between the dependent and the independent variable, the more closely the data points lie on a straight line. The task of a simple linear regression is to determine exactly the straight line that best describes the relationship between the dependent and the independent variable. In linear regression analysis, this straight line is plotted on the scatter plot, and to determine it, linear regression uses the method of least squares.

Let's say a hospital asks you to estimate, based on the age of a person, how long this person will stay in the hospital after surgery. The goal of the hospital operator is to optimize bed planning. In this example, your dependent variable, the one you want to infer, is the length of stay after surgery. Your independent variable is the age of the person. The equation that describes the model now looks like this: b is the slope and a is the intercept. If a person were zero years old, which doesn't really make sense in this example, the model would tell us that this person stays a days in the hospital. In order to calculate the coefficients, the hospital must of course provide you with a sample of people for whom you know the age and the length of stay after surgery. Using your data, you could find out that b is 0.14 and a is 1.2. This is now our model, which helps us estimate the length of stay after surgery based on the age of a person. Now let's say a person who is 33 years old is registered for surgery. We would then put 33 in for x, and our model tells us that this person stays in the hospital for 5.82 days after surgery.

Now, of course, the question is: how do you calculate the slope b, and how do you calculate the intercept a? Usually, you use a statistics program like DATAtab. In the case of simple linear regression, however, it is also quite simple to do this by hand. b results from the correlation of the two variables times the standard deviation of the length of stay after surgery, divided by the standard deviation of age. a is obtained by calculating the mean value of the length of stay minus the slope times the mean value of age.
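
The arithmetic in the hospital example can be checked with a short sketch. The coefficients 0.14 and 1.2 and the age of 33 come from the narration; the sample of ages and stays, and all function names, are hypothetical and were chosen so that the by-hand formulas reproduce those coefficients.

```python
import math

# Coefficients quoted in the narration: stay = b * age + a.
B_SLOPE = 0.14
A_INTERCEPT = 1.2


def length_of_stay(age):
    """Predicted length of stay in days for a given age."""
    return B_SLOPE * age + A_INTERCEPT


print(round(length_of_stay(33), 2))  # 5.82


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(ssx * ssy)


def coefficients(xs, ys):
    """By-hand formulas: b = r * s_y / s_x and a = mean(y) - b * mean(x)."""
    b = pearson_r(xs, ys) * sample_std(ys) / sample_std(xs)
    a = mean(ys) - b * mean(xs)
    return a, b


# Hypothetical sample, invented so the points lie on stay = 0.14 * age + 1.2.
ages = [20, 30, 40, 50, 60]
stays = [4.0, 5.4, 6.8, 8.2, 9.6]

a, b = coefficients(ages, stays)
print(round(a, 2), round(b, 2))  # 1.2 0.14
```

Because the invented points lie exactly on a line, the correlation here is 1; with a real sample the correlation would be smaller and the fitted coefficients would only approximate the data.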