bug-gnu-pspp


From: Matt
Subject: PSPP-BUG: [bug #55825] Feature Request: Improved Linear Regression Diagnostics
Date: Mon, 4 Mar 2019 09:27:25 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36

URL:
  <https://savannah.gnu.org/bugs/?55825>

                 Summary: Feature Request: Improved Linear Regression Diagnostics
                 Project: PSPP
            Submitted by: mattakatdat
            Submitted on: Mon 04 Mar 2019 02:27:23 PM UTC
                Category: Other
                Severity: 5 - Average
                  Status: None
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
                 Release: None
                  Effort: 0.00

    _______________________________________________________

Details:

PSPP is amazing, and I would love to use it in my college classrooms. However,
right now it lacks the diagnostics needed to conduct regression analyses
responsibly, that is, to check the model's assumptions. There are dozens of
possible diagnostics, of course, but I think that four in particular would go
a long way toward putting PSPP on the map for introductory statistics courses
looking for a free, GUI-based teaching tool. I am not a programmer, but I
would like to suggest what would be most useful for teaching a "second class"
on regression.

1) Assumption 1: Negligible multicollinearity. The ability to compute
Variance Inflation Factors (VIFs) would give users a tool for detecting
multicollinearity among the predictors.

References: Theil's Principles of Econometrics; basic computations can be
found here: https://newonlinecourses.science.psu.edu/stat501/node/347/
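
To make the request concrete, here is a minimal sketch of the usual VIF
computation, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
predictor j on all the other predictors. It is written in Python with NumPy
purely for illustration; the function name vif is hypothetical, and this is
not how PSPP does or would implement it.

```python
import numpy as np

def vif(X):
    """Return one variance inflation factor per column of X.

    X is an (n observations) x (p predictors) array WITHOUT an intercept
    column; an intercept is added to each auxiliary regression.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        # 1 / (1 - R^2) simplifies to SS_total / SS_residual.
        if ss_res <= 1e-12 * ss_tot:
            factors.append(float("inf"))  # predictor is (near-)perfectly collinear
        else:
            factors.append(ss_tot / ss_res)
    return factors
```

A VIF near 1 means a predictor is nearly uncorrelated with the others; larger
values mean more of its variance is explained by the other predictors.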

2) Assumption 2: Outliers are handled and non-influential. The ability to
calculate Cook's Distance (aka Cook's D statistic) would help identify
outlying and influential observations in models with multiple predictors.

Reference: Cook, R. Dennis (March 1979). "Influential Observations in Linear
Regression". Journal of the American Statistical Association. American
Statistical Association. 74 (365): 169–174. 
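
As an illustration of what the statistic involves, here is a minimal sketch
using the common textbook form D_i = e_i^2 / (p * s^2) * h_ii / (1 - h_ii)^2,
where e_i is the residual, h_ii the leverage, p the number of coefficients
including the intercept, and s^2 the residual mean square. The function name
cooks_distance is hypothetical, and Python with NumPy is used only for
illustration.

```python
import numpy as np

def cooks_distance(X, y):
    """Return Cook's D for every observation of an OLS fit of y on X.

    X holds the predictors only; an intercept column is added here.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]  # number of coefficients, intercept included
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    # Leverages: diagonal of the hat matrix H = X (X'X)^-1 X'.
    xtx_inv = np.linalg.pinv(design.T @ design)
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    s2 = float(resid @ resid) / (n - p)  # residual mean square
    return resid ** 2 / (p * s2) * leverage / (1.0 - leverage) ** 2
```

Observations whose D stands well above the rest are the ones worth inspecting;
any cutoff a course uses is a rule of thumb rather than a formal test.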

3) Assumption 3: Linearity. Component-plus-residual plots can reveal
non-linear associations visually in many cases.

Overview:
https://www.stat.washington.edu/pds/stat423/Documents/LectureNotes/notes.423.ch12.pdf
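
For reference, the quantity plotted on the vertical axis of a
component-plus-residual plot for predictor j is the partial residual
e_i + b_j * x_ij, plotted against x_ij. A minimal sketch of that calculation
follows (plotting itself omitted); the function name partial_residuals is
hypothetical, and Python with NumPy is used only for illustration.

```python
import numpy as np

def partial_residuals(X, y, j):
    """Return the partial residuals for predictor column j.

    Fits y on all columns of X plus an intercept, then adds the fitted
    component b_j * x_j back onto the ordinary residuals.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    # coef[0] is the intercept, so predictor j has coefficient coef[j + 1].
    return resid + coef[j + 1] * X[:, j]
```

If the relationship with predictor j really is linear, a scatterplot of these
values against X[:, j] should follow a straight line; visible curvature
suggests a non-linear association.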

4) Assumption 4: Homoskedasticity (constant error variance).
Residual-versus-fitted plots would help detect heteroskedasticity. PSPP all
but supports these already, since it can output residuals and predicted
values. It would just be a matter of temporarily taking those values,
standardizing them, and plotting one against the other in a scatterplot.
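
The steps described above can be sketched as follows, assuming that
"standardizing" means plain z-scores (subtract the mean, divide by the sample
standard deviation). The function name standardized_fitted_and_residuals is
hypothetical, and Python with NumPy is used only for illustration.

```python
import numpy as np

def standardized_fitted_and_residuals(X, y):
    """Return (z_fitted, z_resid) from an OLS fit of y on X plus intercept.

    Both outputs are simple z-scores; this is not the same as studentized
    residuals, which also adjust for each observation's leverage.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    resid = y - fitted

    def zscore(v):
        return (v - v.mean()) / v.std(ddof=1)

    return zscore(fitted), zscore(resid)
```

Scatter plotting z_resid against z_fitted gives the diagnostic plot: a
roughly even band around zero is consistent with constant variance, while a
funnel shape suggests heteroskedasticity.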

There are, of course, other assumptions and tests, but this is a good start. I
am happy to test these features if they are implemented. 




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?55825>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



