
Nonparametric regression using R

Saturday, April 24th, 2010


Nonparametric regression aims to model the relationship between the predictors and the dependent variable without assuming any specific form for the regression function:

E(y_i) = f(x_{1i}, ..., x_{pi})

Unlike classical linear regression, where the goal is to determine the parameters of an assumed linear function, nonparametric regression aims to estimate the entire regression function directly. Depending on the assumptions made about the structure of the underlying data, several estimation methods exist that are optimal under those assumptions. We give an overview of several methods and explain their practical usage in R. In doing so, we make use of the social graph data described in a recent post.

Local Regression

The LOWESS (Locally Weighted Scatterplot Smoothing) algorithm is based on the idea of local linear regression. The general approach of local regression is to fit simple models to “local” subsets of the data and to combine the results into an estimate of the regression function for the entire dataset. In this method, the “local” data are modelled with a weighted least squares polynomial fit of the general form:

y_i = a + b_1(x_i - x_0) + b_2(x_i - x_0)^2 + ... + b_p(x_i - x_0)^p + e_i

where p is the degree of the local polynomial and the “local” observations are weighted by their proximity to the “focal” value x_0.
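To make the local fit concrete, here is a minimal sketch of a single weighted polynomial fit around a focal value. It is not the exact algorithm inside lowess() (there are no robustness iterations, for instance). The helper name local_fit, the tricube weight function and the simulated data are all assumptions made for illustration; the post's own data is not used.

```r
# Simulated data (an assumption for illustration, not the post's data)
set.seed(1)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)

# Hypothetical helper: estimate f(x0) from a weighted least squares
# polynomial fit of degree p on the nearest span * n observations.
local_fit <- function(x0, x, y, span = 0.5, p = 1) {
  n  <- length(x)
  k  <- max(p + 2, ceiling(span * n))        # number of "local" observations
  d  <- abs(x - x0)                          # distance to the focal value
  h  <- sort(d)[k]                           # distance to the k-th nearest point
  w  <- ifelse(d < h, (1 - (d / h)^3)^3, 0)  # tricube weights (assumed choice)
  xc <- x - x0                               # centre the predictor at x0
  fit <- lm(y ~ poly(xc, degree = p, raw = TRUE), weights = w)
  unname(coef(fit)[1])                       # intercept a is the estimate at x0
}

# Repeating the local fit over a grid of focal values traces out the curve
grid <- seq(0.5, 9.5, length.out = 50)
fhat <- sapply(grid, local_fit, x = x, y = y, span = 0.2, p = 1)

plot(x, y, col = "grey")
lines(grid, fhat, col = 2)
```

Because the predictor is centred at x_0, the intercept a of each local fit is the estimate of the regression function at that focal value, which is why only the first coefficient is returned.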

groups_likes_lowess

> plot(num_groups, num_likes)
> # smooth num_likes against num_groups, matching the axes used in plot()
> lines(lowess(num_groups, num_likes, f = 2/3, iter = 4), col = 2)

The effect of the span (the fraction of points used in each local fit) for f = 1/16, 1/8, 1/4 and 1/2:

lowess_span_effect
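A comparison of this kind could be produced roughly as follows. This is only a sketch on simulated data (an assumption, since the original variables are not reproduced here), and the 2 x 2 panel layout is a guess at how the figure was arranged.

```r
# Simulated data (an assumption for illustration, not the post's data)
set.seed(42)
x <- runif(300, 0, 10)
y <- sin(x) + rnorm(300, sd = 0.4)

# One lowess fit per span value; smaller f uses fewer points per local fit
spans <- c(1/16, 1/8, 1/4, 1/2)
fits  <- lapply(spans, function(f) lowess(x, y, f = f, iter = 4))

# Draw each fit in its own panel
op <- par(mfrow = c(2, 2))
for (i in seq_along(spans)) {
  plot(x, y, col = "grey", main = paste("f =", format(spans[i], digits = 3)))
  lines(fits[[i]], col = 2)
}
par(op)
```

Small spans follow the data closely and give a wiggly curve, while large spans average over more points and give a smoother, flatter one.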

(in progress…)