
Analysis of the equational relationship between X and Y. Regression SS/Total SS 


has a t distribution, which standardizes its value to see if it is significantly different from 0. When the p-value of the slope is greater than the level of significance, one should assume the correlation coefficient will be close to 0. 




r, indicates the nature and strength of the linear relationship between variables 

Coefficient of determination 

R^2, the ratio of explained variation in Y to the total variation. 
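
A minimal sketch of the two quantities above, using made-up data and only the Python standard library. It computes r from the sums of squares and R^2 as Regression SS / Total SS; in simple linear regression the two are related by R^2 = r^2. The variable names are illustrative assumptions.

```python
import math

# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)  # this is the Total SS
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Correlation coefficient r.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares line, then Regression SS from the fitted values.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]
regression_ss = sum((fi - y_bar) ** 2 for fi in fitted)
total_ss = s_yy

# Coefficient of determination: explained variation over total variation.
r_squared = regression_ss / total_ss

print(round(r, 4), round(r_squared, 4))
```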


minimizes the sum of the squared vertical distances between the points and the regression line, resulting in the line of best fit. 
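
As a rough illustration of the least-squares idea, the sketch below fits a line to made-up data and checks that nudging the slope makes the sum of squared vertical distances larger. The helper names `fit_line` and `sse` are hypothetical, chosen for this example.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def sse(x, y, slope, intercept):
    """Sum of squared vertical distances from the points to a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
best = sse(x, y, slope, intercept)

# Any other line, for example one with a slightly steeper slope, does worse.
other = sse(x, y, slope + 0.1, intercept)

print(slope, intercept, best, other)
```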


will be smaller for a better predictive equation. If the observed values of y vary widely about the regression line, the standard error of the slope will be large. It is the square root of the MS residual. 
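
A short sketch of "square root of the MS residual" for simple linear regression, assuming made-up data and n - 2 residual degrees of freedom (one slope plus one intercept estimated). The variable names are illustrative.

```python
import math

# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual SS: squared gaps between observed and predicted y.
residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# MS residual = Residual SS / residual degrees of freedom (n - 2 here).
ms_residual = residual_ss / (n - 2)

# Standard error of the estimate: square root of the MS residual.
std_error = math.sqrt(ms_residual)

print(round(std_error, 4))
```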


is the study of the nature and degree of the relationship between variables. A correlation coefficient of +1 or -1 means x and y are perfectly, linearly related. An r value of 0 indicates no linear relationship. 


is using Xs beyond the range of the given Xs to predict Y. This can cause large errors in prediction. Relationship of the slope to the correlation coefficient: the signs are the same. 


when Xs are highly correlated; this gives redundant information. 


nonconstant variance in the residuals 


constant variance in the residuals 


atypical values in a data set (anomalies) 


CURVILINEAR patterns or LOGARITHMIC relationships 

Multiple regression analysis includes 

one dependent variable and more than one independent 


tries all combinations of variables and produces the best predictors in order of their predictive power. 

Artificially inflated R-squared occurs when… 

there are too many predictors and not enough samples. 


you should have at least 10 times the number of observations as predictor variables. 


should produce a nearly straight line without outliers. 

T distribution vs. F Distribution 

T is used to test the individual coefficients, whereas F tests the overall or “global” model. 


are the differences between the observed value of Y at a given X and the predicted value. Standardized values with an absolute value between 2 and 3 are usually just suspicious, while those with an absolute value over 3 are severe. 


should fall within +/-3 in order to be considered normal values. 
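
The 2-and-3 rule of thumb can be sketched as below. This assumes the simple textbook version of a standardized residual (residual divided by the standard error of the estimate), not a leverage-adjusted one, and the `flag` helper and its labels are hypothetical names for this example.

```python
import math


def flag(z):
    """Label a standardized residual using the 2 / 3 rule of thumb."""
    size = abs(z)
    if size > 3:
        return "severe"
    if size >= 2:
        return "suspicious"
    return "ok"


# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y at the same x.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Simplified standardization (assumption): divide by sqrt(MS residual).
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
standardized = [e / s for e in residuals]

flags = [flag(z) for z in standardized]
print(list(zip([round(z, 2) for z in standardized], flags)))
```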

Transform Y and/or X when… 

any of the assumptions are violated 

In simple linear regression the use of regression lines is to … 

predict the average value of y that can be expected to occur at a given value of x. 

A high correlation between x and y… 

does NOT prove that x causes y 

dependent variable plotted… independent variable plotted… 

vertical axis; horizontal axis 

If the confidence interval on the slope contains 0… 

there is no significant relationship between x and y 
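
A hedged sketch of the check above: build a 95% confidence interval on the slope and see whether it contains 0. The data are made up, and the critical value 3.182 (two-sided 95%, 3 degrees of freedom) is hard-coded from a t table as an assumption, since the standard library has no t quantile function.

```python
import math

# Illustrative data (made up for this sketch).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Standard error of the estimate, then standard error of the slope.
residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(residual_ss / (n - 2))
se_slope = s / math.sqrt(s_xx)

# Assumption: t critical value for 95% confidence with n - 2 = 3 df.
T_CRIT_95_DF3 = 3.182

lower = slope - T_CRIT_95_DF3 * se_slope
upper = slope + T_CRIT_95_DF3 * se_slope

# If 0 lies inside the interval, the slope is not significantly different from 0.
contains_zero = lower <= 0 <= upper
print(round(lower, 3), round(upper, 3), contains_zero)
```

With only five points the interval is wide, so this made-up example ends up containing 0 even though the fitted slope is positive.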


you CANNOT assume that the slope is also positive. 

The slope of the regression line represents… 

the amount of change that is expected to take place in y when x increases by one unit. 


using values beyond the range of the given Xs to predict Y 

if null hypothesis is rejected… 

there is a relationship between x and y. 

if no correlation between two variables… 

the regression line will be horizontal 

A large value for the slope does not necessarily imply a large value for the… 


Test the individual coefficients to see which Xs are good predictors. 

only test these if the overall model had at least one good predictor. 

As we add more predictors… 


When you rerun a model after taking out the poor predictor variables… 

you have reduced the model 

When choosing between two models, both with good predictors for y… 

choose the one with the smallest standard error. 

Check the correlation matrix to make sure the X variables… 

are not correlated with each other 
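
One way to sketch the correlation-matrix check with made-up predictors. The 0.8 cutoff for "highly correlated" is an assumption for illustration, as are the names `pearson`, `predictors`, and `flagged`.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


# Illustrative predictors (made up); x2 is nearly a multiple of x1.
predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 11],
    "x3": [5, 3, 6, 2, 4],
}

THRESHOLD = 0.8  # assumed cutoff for "highly correlated"

# Pairwise correlations between the X variables.
pairs = {
    (a, b): pearson(predictors[a], predictors[b])
    for a, b in combinations(predictors, 2)
}

# Pairs of Xs that carry largely redundant information.
flagged = [pair for pair, r in pairs.items() if abs(r) > THRESHOLD]

for pair, r in pairs.items():
    print(pair, round(r, 3))
print("highly correlated:", flagged)
```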

check the signs of the coefficients… 

to make sure they are logical. 

Never say x causes y unless it was… 


Qualitative variables in multiple regression are called.. 

dummy variables. Do not interpret their coefficients. 

If there is a curve in the scatter diagram for any x,y chart or the residuals use… 

a quadratic equation… use x and x^2 
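
A minimal sketch of using x and x^2 together as predictors, fitted by solving the normal equations in plain Python. The data are made up to follow y = 1 + 2x + 3x^2 exactly, and the `solve` helper is a hypothetical small Gaussian-elimination routine written for this example, not a library call.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


# Illustrative curved data (made up): y = 1 + 2x + 3x^2 exactly.
x = [0, 1, 2, 3, 4]
y = [1, 6, 17, 34, 57]

# Design matrix with an intercept column, x, and x^2.
design = [[1.0, xi, xi ** 2] for xi in x]

# Normal equations: (X'X) b = X'y.
k = 3
xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]

b0, b1, b2 = solve(xtx, xty)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```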

If you think two x variables may work together at different levels to affect y… 

then try an interaction term. 

Only interpret the coefficients of… 

good predictors and first order terms. First order terms are linear terms. 

Squared Xs and interacted Xs are called… 


The use of regression lines is to… 

predict the average value of y that can be expected to occur at a given value of x. 

The study of the equational relationship between variables is called… 

