# Python regression finder

I've been working my way through Stanford's online Machine Learning course recently and I thought I should put some of what I've learnt to use.

The program I made (file below) reads a file of tab-delimited data, assuming the first column is the independent values (x) and the second is the dependent values (y). It then gives some options:

1. Simple summary
2. Plot data
3. Find linear regression
4. Find polynomial fit
5. Exit
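
As a rough idea of the loading step, here is a minimal sketch using only the standard library. It is not the code from the attached file: the function name `load_xy` and the sample values are made up for illustration, and it assumes every usable row has at least two numeric columns.

```python
import csv
import io


def load_xy(handle):
    """Read two tab-separated columns and return (xs, ys) as lists of floats."""
    xs, ys = [], []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        xs.append(float(row[0]))  # first column: independent values (x)
        ys.append(float(row[1]))  # second column: dependent values (y)
    return xs, ys


# In-memory stand-in for a data file (hypothetical values)
sample = io.StringIO("1.0\t2.1\n2.0\t3.9\n\n3.0\t6.2\n")
xs, ys = load_xy(sample)
```

With a real file, the same function could be called as `load_xy(open(path, newline=""))`.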

Simple summary currently just gives the mean of x and y, but I'll probably add standard deviation and maybe some other simple statistics.
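
A summary that includes the standard deviation might look something like the sketch below. The `summarise` helper and the data are hypothetical, and it uses the standard library's `statistics` module (sample standard deviation) rather than whatever the program ends up using.

```python
import statistics


def summarise(values):
    """Return the mean and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) standard deviation
    }


xs = [1.0, 2.0, 3.0, 4.0]  # made-up data
summary = summarise(xs)
```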

Plot data does just that using matplotlib. I might change this to use an SVG graph drawer I've been working on, but I wanted to try out matplotlib.

The function "find linear regression" initially worked by explicitly using the normal equation:

`p = (X.T * X).I * X.T * y`

But then I found, unsurprisingly, there is a function to do this in NumPy's linear algebra module (`numpy.linalg`), which also returns extra information:

```python
from numpy.linalg import lstsq
(p, residuals, rank, s) = lstsq(X, y)
```
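
To see that the two approaches agree, here is a small self-contained sketch on made-up data. It is an illustration under a few assumptions: it uses plain arrays with `@` instead of the matrix class, it solves the normal equation with `numpy.linalg.solve` rather than forming the inverse, and it passes `rcond=None` to `lstsq`.

```python
import numpy as np

# Made-up data lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix: a column of ones (intercept) next to the x values
X = np.column_stack([np.ones_like(x), x])

# Normal equation (X^T X) p = X^T y, solved without an explicit inverse
p_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Library routine, which also reports residuals, rank and singular values
p_lstsq, residuals, rank, s = np.linalg.lstsq(X, y, rcond=None)
```

Both `p_normal` and `p_lstsq` hold the intercept first and the slope second, because of the column order chosen for `X`.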

Similarly, I was going to write my own function to calculate a polynomial fit of a given degree, but it makes much more sense to use the function already there, namely polyfit:

`p = numpy.polyfit(x, y, degree)`
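
As a quick illustration on made-up data, the sketch below fits a quadratic and evaluates it again with `numpy.polyval`. Note that `polyfit` returns the coefficients with the highest power first.

```python
import numpy as np

# Made-up data lying exactly on y = x^2 - 3x + 2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x**2 - 3.0 * x + 2.0

degree = 2
p = np.polyfit(x, y, degree)  # coefficients, highest power first
y_hat = np.polyval(p, x)      # fitted values at the original x
```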

Most of my code is to display the resulting vector in a nice way.
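
One possible way to do that kind of display is sketched below. This is not the formatting code from the attached file: `format_polynomial` is a hypothetical helper, and it assumes the coefficients are ordered highest power first, as `polyfit` returns them.

```python
def format_polynomial(coeffs, precision=3):
    """Turn coefficients (highest power first) into a string like 'y = 2x^2 - 3x + 1'."""
    degree = len(coeffs) - 1
    terms = []
    for i, c in enumerate(coeffs):
        power = degree - i
        if c == 0:
            continue  # leave out terms that vanish
        sign = "-" if c < 0 else "+"
        magnitude = f"{abs(c):.{precision}g}"
        if power == 0:
            body = magnitude
        elif power == 1:
            body = f"{magnitude}x"
        else:
            body = f"{magnitude}x^{power}"
        terms.append((sign, body))
    if not terms:
        return "y = 0"
    # The leading term only shows its sign when it is negative
    first_sign, first_body = terms[0]
    text = ("-" if first_sign == "-" else "") + first_body
    for sign, body in terms[1:]:
        text += f" {sign} {body}"
    return "y = " + text


line = format_polynomial([1.0, -3.0, 2.0])
```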

There's still lots to add, including working with multivariate data and adding regularisation, but I hope it will already be a useful program.

Attachment: regression.txt (2.31 KB)

Hi, like it so far. I'm curious to see what it will look like when you've added to it. I'm working on a program to find interesting correlations in apparently random data (still a bit of a newb though).