Identify the slope for one of the independent variables in your final model.

Survey of Consumer Finances Data: A credit card company is interested in targeting a new product it has developed to a key market segment. That market segment consists of people who typically hold credit card balances from month to month (thus paying interest on their credit card debt). They have asked for an analysis of the Survey of Consumer Finances data to see if it can help them narrow down who to focus their marketing efforts on.

This data set consists of over 6,000 unique households and is actual data available online. There are 55 variables in the full data set. I have narrowed that down to about a dozen from which to perform your analysis. In addition to the data, I have prepared a data description document that details each of the variables in the data set.

Food for thought… Can you think of other variables that, if we had data on them, might make our model a stronger one?

CCBAL – This variable identifies the average monthly credit card balance for each observation and is what we would like the model to predict.
CASID – This is the identifier for each observation in the data set. It should not be included in the model.
EDUC – I included this variable so that you would see how it is coded in the original data set. You should not include it in your model. I have transformed it into the following four dummy variables.
NOHS – If an observation in the data set was coded as having no HS diploma, this variable is coded as 1; otherwise it is coded as 0.
HSDiploma – If an observation in the data set was coded as having completed a HS diploma, this variable is coded as 1; otherwise it is coded as 0.
SomeCollege – If an observation in the data set was coded as having attended college but not having earned a degree, this variable is coded as 1; otherwise it is coded as 0.
CDegree – If an observation in the data set was coded as having a college degree, this variable is coded as 1; otherwise it is coded as 0.
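The transformation of EDUC into the four dummy variables described above can be illustrated with a minimal sketch. The numeric EDUC codes used here (1 through 4) are an assumption made only for illustration; the actual coding is given in the data description document.

```python
# Hypothetical EDUC codes (assumed for illustration only); the real
# coding is defined in the data description document.
EDUC_TO_DUMMY = {
    1: "NOHS",
    2: "HSDiploma",
    3: "SomeCollege",
    4: "CDegree",
}
DUMMY_NAMES = list(EDUC_TO_DUMMY.values())


def encode_educ(educ_code):
    """Return the four 0/1 dummy variables for one observation."""
    level = EDUC_TO_DUMMY[educ_code]
    # Exactly one dummy is 1: the one matching the education level.
    return {name: int(name == level) for name in DUMMY_NAMES}


row = encode_educ(3)
print(row)
```

Because exactly one of the four dummies equals 1 for every household, the four always sum to 1. Including all four together with an intercept would make them perfectly collinear, so one category is normally left out of the model as the baseline.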
Expectations for the write-up of your results: Please label each section of your paper with a section header (e.g., Section 1, Section 2, etc.) so that I know where to look for your analysis of each aspect of your final model.

Section 1: Provide a brief overview of the data set and what you are attempting to accomplish by running the regression model. Include the size of the data set, dependent variable information, the number of independent variables in the data set, and the methodology you are using to run the model. Briefly explain what Stepwise Regression does to get you to a final model.

Section 2: What is the predictive equation for the model? Interpret this equation with a concrete example from your output.

Section 3: Identify the slope for one of the independent variables in your final model. Interpret the meaning of that slope for the best-fitting line in the context of the data set. Calculate the 95% confidence interval for the slope of this variable. Interpret this interval in the context of the problem.

Section 4: Perform an F test for the model. State the null and alternative hypotheses for the F test. Interpret the result of the test in the context of the data set. State whether you chose to do the F test using an F table or using the p-value. Does the model pass this test, and what does that mean?

Section 5: Explain the problem of collinearity. How did you test for this possible concern? If collinearity appears in your analysis, explain what you did to reduce or eliminate the problem.

Section 6: Discuss the influence that outliers may have on a regression model. How did you test for outliers in your model? If you found an outlier, how did you decide to modify the model as a result, and why?

Section 7: Interpret the R-square value for the model in the context of the problem. Does it give you any faith in the ability of the model to provide you with usable predictions? Explain the Adjusted R-square value and note which R-square value you think would be best to use with this model.

Section 8: Discuss the results of the Residual Analysis you performed. What assumptions are being made about this model? Does it appear any of those assumptions have been seriously violated? If the model uses data collected in a time series, what tests can be used for autocorrelation? If such a test is needed, does it appear autocorrelation is a problem?

Section 9: If the final model includes one or more Dummy Variables, select one of these and explain how such a variable works. Interpret the slope coefficient for this variable in the final model. Discuss what Interaction is and why it may be a concern in a model that includes a dummy variable. If you tested for interaction, discuss your results and whether they changed your final model. You are not required to test for interaction in this project but may do so if you wish.

Section 10: Select a 95% confidence interval estimate of the population average response (Mean) for a given set of independent variables. Interpret this interval in the context of the problem.

Section 11: Select a 95% confidence interval for a specific instance of Y for a given set of independent variables (Prediction Interval). Interpret this result in the context of the problem. Explain the difference between this interval and the one presented in Section 10 (why is one interval larger than the other?).

Section 12: Conclusion of your paper: Provide your opinion regarding the usability of this model and whether or not you feel the predictive equation may be of value. Discuss the pitfall of extrapolation and how it relates to the use of your final model. Discuss the pitfall of cause/effect conclusions and how it relates to the use of your final model. Finally, do you have any suggestions for additional independent variables that might be valuable in improving the predictability/value of this model?
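As an illustration of the slope interval requested in Section 3, the sketch below computes a 95% confidence interval for the slope in a one-predictor regression. Everything in it is an assumption made for illustration: the data are synthetic, the predictor (income in thousands of dollars) is hypothetical and not taken from the data set, and the critical value uses the normal approximation, which is reasonable only for large samples such as the 6,000+ households here. In a multiple regression, the slope and its standard error would come from the software output instead, and with a small sample the t table (with the model's error degrees of freedom) should be used.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def slope_confidence_interval(x, y, level=0.95):
    """Slope estimate and confidence interval for a one-predictor model."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = y_bar - b1 * x_bar             # least-squares intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = sqrt(sse / (n - 2))           # standard error of the estimate
    se_b1 = s_e / sqrt(sxx)             # standard error of the slope
    # Large-sample assumption: normal critical value in place of t.
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return b1, (b1 - crit * se_b1, b1 + crit * se_b1)


# Synthetic data (not the survey data): balance rises about $20 per
# additional $1,000 of hypothetical income, plus random noise.
random.seed(0)
income = [random.uniform(20, 150) for _ in range(500)]
balance = [500 + 20 * xi + random.gauss(0, 300) for xi in income]

b1, (lo, hi) = slope_confidence_interval(income, balance)
print(f"slope = {b1:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

The interval is read in context: with 95% confidence, each additional $1,000 of the hypothetical income variable is associated with an increase in average monthly balance somewhere between the lower and upper limits, holding nothing else in the model (here there is nothing else). An interval that excludes 0 indicates the slope is statistically distinguishable from zero at that confidence level.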