Wednesday, January 27, 2021

Testing a Basic Linear Regression Model on the Outlook on Life Survey

To review, I am working with the 2012 Outlook on Life Survey, specifically the relationship between Religiosity (RELIND, an aggregate index of several questions) and Acceptance of Others (ACCIND, also an aggregate index).  Both variables are quantitative.  For this assignment, I "centered" Religiosity by finding the mean of all respondents' scores (0.75) and subtracting that mean from each respondent's score to create a new variable (RELIN0).  After centering, the mean of the explanatory variable is -5.49e-15 (essentially zero, apart from floating-point rounding).  I then ran a linear regression of Acceptance of Others on centered Religiosity.  The results indicate that Religiosity (Beta = -0.026, p = .006) is significantly and negatively associated with Acceptance of Others.  However, Religiosity accounts for only 0.3% of the variability in the response variable (R-squared = 0.003).  Below is the output of the linear regression, followed by the relevant program snippet.

OLS regression model for the association between Religiosity and Acceptance of Others
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                 ACCIND   R-squared:                       0.003
Model:                            OLS   Adj. R-squared:                  0.003
Method:                 Least Squares   F-statistic:                     7.711
Date:                Wed, 27 Jan 2021   Prob (F-statistic):            0.00553
Time:                        15:43:24   Log-Likelihood:                 776.02
No. Observations:                2269   AIC:                            -1548.
Df Residuals:                    2267   BIC:                            -1537.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.6877      0.004    190.500      0.000       0.681       0.695
RELIN0        -0.0260      0.009     -2.777      0.006      -0.044      -0.008
==============================================================================
Omnibus:                       81.863   Durbin-Watson:                   1.961
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               90.391
Skew:                          -0.489   Prob(JB):                     2.35e-20
Kurtosis:                       2.966   Cond. No.                         2.59
==============================================================================
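
The coefficients in the table above can be read as a fitted line. As a worked sketch, using only the reported intercept and slope (the 0.25 offset below is a hypothetical value chosen for illustration, not a figure from the survey):

```latex
\widehat{\mathrm{ACCIND}} = 0.6877 - 0.0260 \times \mathrm{RELIN0}

% Hypothetical respondent scoring 0.25 above the mean Religiosity:
\widehat{\mathrm{ACCIND}} = 0.6877 - 0.0260 \times 0.25 = 0.6877 - 0.0065 = 0.6812
```

Because RELIN0 = 0 corresponds to the mean Religiosity score, the intercept (0.6877) is the predicted Acceptance score for a respondent of average Religiosity.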

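import seaborn
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

#assumes `data` is a pandas DataFrame already loaded with the RELIND and ACCIND columns
relmean = data['RELIND'].mean(axis=0)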
print ('mean of Religious Index')
print (relmean)

#subtract mean from all values to center Religious Index values on 0
data['RELIN0'] = data['RELIND'] - relmean
relmean0 = data['RELIN0'].mean(axis=0)
print ('mean of zeroed Religious index <should be zero.>')
print (relmean0)

#basic scatterplot:  Q->Q
scat1 = seaborn.regplot(x="RELIN0", y="ACCIND", fit_reg=False, data=data)
plt.xlabel('Religiosity')
plt.ylabel('Acceptance')
plt.title('Scatterplot for the Association Between Religiosity and Acceptance')

#linear regression using centered explanatory variable (Religiosity)
print ("OLS regression model for the association between Religiosity and Acceptance of Others")
reg1 = smf.ols('ACCIND ~ RELIN0', data=data).fit()  #response variable, then explanatory variable
print (reg1.summary())
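
As a side check on what centering does, here is a minimal sketch using made-up numbers (not survey data) and a hand-written least-squares helper, ols_fit, which is a hypothetical name and not part of statsmodels. It suggests that centering the explanatory variable leaves the slope unchanged and moves the intercept to the mean of the response variable.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up illustrative values, NOT taken from the Outlook on Life Survey.
religiosity = [0.2, 0.5, 0.75, 0.9, 1.0, 0.6]
acceptance = [0.72, 0.70, 0.69, 0.66, 0.68, 0.71]

# Center the explanatory variable on its own mean.
rel_mean = mean(religiosity)
centered = [r - rel_mean for r in religiosity]

raw_intercept, raw_slope = ols_fit(religiosity, acceptance)
cen_intercept, cen_slope = ols_fit(centered, acceptance)

print("slope (raw):         ", round(raw_slope, 6))
print("slope (centered):    ", round(cen_slope, 6))
print("intercept (raw):     ", round(raw_intercept, 6))
print("intercept (centered):", round(cen_intercept, 6))
print("mean of response:    ", round(mean(acceptance), 6))
```

The two slopes should match, and the centered intercept should equal the mean of the response, which is why the intercept in the output above is easier to interpret after centering.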
