Home AI Headlines [D] Recursive Least Squares vs Gradient Descent for Neural Networks

[D] Recursive Least Squares vs Gradient Descent for Neural Networks

[D] Recursive Least Squares vs Gradient Descent for Neural Networks


I have been captivated by Recursive Least Squares (RLS) methods, particularly the approach that employs error prediction instead of matrix inversion. This method is quite intuitive. Let's consider a scenario where you need to estimate the true effect of four factors (color, gender, age, and weight) on blood sugar. To find the true impact of weight on blood sugar, it's necessary to eliminate the influence of every other factor on weight. This can be accomplished by using simple least squares regression to predict the residual errors recursively, as shown in the diagram below:

Removing the effect of all factors on "weight" in a recursive manner

The fundamental contrast between RLS and Gradient-based methods lies in how errors are distributed across inputs based on their activity, leading to the subsequent update of weights. However, in the case of RLS, all inputs undergo decorrelation before evaluating prediction errors.

Comparison between error sharing in RLS and GD

This de-correlation can be done in few lines of python code:

for i in range(number_of_factors):

for j in range(i+1, number_of_factors):

wx = np.sum(x[i] * x[j]) / np.sum(x[i]**2)

x[j] -= wx * x[i]

This approach also bears relevance to predictive coding and can shed light on intriguing neuroscientific findings, such as the increase brain activity during surprising or novel events — attributable to prediction errors.

The prediction errors are increasing during the surprising events similar to how brain activity increases.

RLS learns very fast but it's still subpar to deep learning when it comes to non-linear hierarchical structures but that is probably because Gradient based methods enjoyed more attention and tinkering from the ML-community. I think RLS methods needs more attention and I have been working on some research projects that uses this method for signal prediction . If you're interested, you can find the source code here:

submitted by /u/brainxyz


Source link


Please enter your comment!
Please enter your name here