I'm confused; we don't define our winrate as BR/hands.
Spoony has it right.
Winrate is the change in BR with respect to (w.r.t.) the change in hands.
As a differential equation:
{winrate} = d{BR}/d{hands}
My point is that if we look at a sample at the end of n-hands, we can apply math and make an estimate of that rate of change... but it's a poor estimate based on 1 data point. (Well, 2... we call the starting BR 0 and the starting hand number 0, making a data point)
If we look at the change over each individual hand, then we have (n-1) more data points to work with. All that extra data allows us to use more refined and robust methods of estimating the average slope... which is the rate of change in BR w.r.t. hands.
Now, my proposition is to assume that the winrate is a constant throughout the data set and to use a linear regression to model the slope of BR w.r.t. hands.
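Here's a rough sketch in Python of the two estimates side by side: the end-of-sample estimate (total change in BR over hands played) and the least-squares slope of BR against hand number. The per-hand results are made up for illustration, and the function names are just ones I picked; it's a sketch, not a finished tool.

```python
def endpoint_winrate(br):
    """Two-point estimate: total change in BR divided by hands played."""
    return (br[-1] - br[0]) / (len(br) - 1)


def regression_winrate(br):
    """Least-squares slope of BR against hand number (0, 1, 2, ...)."""
    n = len(br)
    mean_x = (n - 1) / 2
    mean_y = sum(br) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(br))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


# Made-up per-hand results (in big blinds), purely for illustration.
results = [2, -1, 3, -2, 1, 4, -3, 2]

# Running bankroll: starting BR is 0 at hand number 0.
br = [0]
for r in results:
    br.append(br[-1] + r)

print("endpoint estimate:  ", endpoint_winrate(br))
print("regression estimate:", regression_winrate(br))
```

The endpoint version only ever looks at the first and last BR values; the regression version uses every point on the graph.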
Here's a pretty good argument for why a linear regression is a reasonable choice:
TL;DR : it is
Spoiler:
We could do a quadratic regression (probably a bad idea) or a cubic regression (probably better than quad.). If we are looking at our BR over a very long period during which we've been moving up stakes, then we'd probably want to use an exponential regression. Of course, we could definitely do a Fourier analysis on the data to look for periodic (repeating) patterns.
I'm saying, there are lots of ways to show a trendline through data, but the one that yields the lowest residual variance (the leftover scatter around the line) is not necessarily the best model of the phenomenon. For polynomial regressions like x^2, x^3, ..., x^n, the bigger the n, the lower that variance, until you get n high enough that the curve passes through every data point. Sure, great line... but does it model the situation? No.
I think we're really interested in modeling our winrate as a constant over a sample size... so a linear regression works well, and gives us a good basis for extrapolation. A cubic fit would show a bit of wave in our winrate (which may or may not indicate anything other than card variance), but it would make a poor extrapolation (trending either to + or - very big very fast, just outside our data set, usually).
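To make the point about higher-degree fits concrete, here's a minimal Python sketch. It fits polynomials of degree 1 through 5 to six made-up BR values, prints the leftover squared error for each, and then extrapolates each fit to hand 10, well outside the data. The data and the helper names (polyfit, evaluate, sum_sq_error) are my own for illustration, and the fit uses plain normal equations, which is fine for a toy example but not what I'd use on a big data set.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y] for the Vandermonde matrix A.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))]
           for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= f * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = aug[i][m] - sum(aug[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = s / aug[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def sum_sq_error(xs, ys, coeffs):
    """Sum of squared residuals between the data and the fit."""
    return sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))


# Made-up sample: BR after hands 0..5, purely for illustration.
hands = [0, 1, 2, 3, 4, 5]
br = [0.0, 1.5, 0.5, 3.0, 2.0, 4.5]

errors = []
for degree in range(1, 6):
    coeffs = polyfit(hands, br, degree)
    errors.append(sum_sq_error(hands, br, coeffs))
    print(f"degree {degree}: error = {errors[-1]:.4f}, "
          f"extrapolated BR at hand 10 = {evaluate(coeffs, 10):.1f}")
```

The error can only go down as the degree goes up, and with six points the degree-5 curve hits every one of them. Compare the hand-10 numbers to see how much the higher-degree fits disagree with the straight line once you leave the data.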
Originally Posted by ImSavy
Slightly off topic do you know any good books about stats, never something I've been a huge fan of but I want to take a look at it, first year undergrad type stuff.
Sorry, but I don't. I didn't like my textbook on the subject, although it was slightly less annoying than almost all other math textbooks.
My favorite math book is umm... Here's the 7th edition (huge 1000+ page .pdf), but I have the 6th in hardback... apparently they took out the section on prob/stats.
This came up, among dozens of others that I didn't look at. It looks like a prob/stats textbook to me.
Just google "probability and statistics textbook pdf" and see the 636,000 results