Some Logic on Regression
• John Vandivier
- The default should be linear regression.
- Consider the simple case of two points: either they are related or they are not.
- If they are related, we may want to describe the relationship with a mathematical formula. This is a main idea behind regression.
- Ceteris paribus, if we do not know the form of the relationship, we should assume it is linear, because this is the risk-minimizing assumption.
- We can treat more complex cases either as a series of linear relationships or as a single relationship.
- A series of linear relationships can be approximated by a Taylor polynomial, or each relationship can be described independently and a meta-pattern then sought. The single-relationship approach should also assume linearity prima facie.
- Other functional forms (that is, other shapes or kinds of formula) may be considered afterward, in particular if they offer better explanatory power or goodness of fit for the data, or for other theoretical or observation-based reasons, or indeed for any reason. Random testing is fine too, but the linear form should be the null hypothesis.
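
One way to make the "linear by default" rule concrete is sketched below. This is only an illustration, not a method stated in these notes: it assumes polynomial fits as the alternative functional forms, adjusted R^2 as the goodness-of-fit measure, and a hypothetical threshold `min_gain` that an alternative must beat before it displaces the linear fit. The function names and the sample data are made up for the example.

```python
import numpy as np


def adjusted_r2(y, y_hat, n_params):
    """Adjusted R^2 for a fit with n_params estimated coefficients (intercept included)."""
    n = len(y)
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)


def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns the coefficients and the adjusted R^2."""
    coeffs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coeffs, x)
    return coeffs, adjusted_r2(y, y_hat, degree + 1)


def choose_form(x, y, max_degree=3, min_gain=0.01):
    """Treat the linear fit as the default (the null hypothesis).

    A higher-degree polynomial replaces the current choice only if it improves
    adjusted R^2 by more than min_gain. Both the threshold and the use of
    polynomials as the alternatives are assumptions made for this sketch.
    """
    best_degree = 1
    _, best_score = fit_poly(x, y, 1)
    for degree in range(2, max_degree + 1):
        _, score = fit_poly(x, y, degree)
        if score - best_score > min_gain:
            best_degree, best_score = degree, score
    return best_degree, best_score


# Illustrative synthetic data (not from the notes): a noisy line and a noisy parabola.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y_line = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, size=x.size)
y_curve = x ** 2 + rng.normal(0.0, 0.1, size=x.size)

print("line  -> chosen degree:", choose_form(x, y_line)[0])
print("curve -> chosen degree:", choose_form(x, y_curve)[0])
```

The point of the design is that the linear form never has to justify itself; only the alternatives do. Any other goodness-of-fit measure or test could stand in for adjusted R^2 here.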