If a player is 80/20 through 16 hands, how likely is it that he could actually be a 15/13 player? How do we calculate this?
I think getting an accurate answer to this is pretty difficult, so I'll simplify it a lot to: if a player is 80 vpip through 16 hands, how likely is it that he could actually be a 15 vpip player?
This is a lot easier and may lead the way to a better or more complete solution. This, as sthief09 says, is likely to be a simple binomial distribution problem. We assume the player really opens 15%, so for each trial (ie, a hand) the chance of opening is p = 0.15, and the chance of scoring exactly k = 13 (12.8 is 80%) out of n = 16 is given by: nCk * p^k * (1-p)^(n-k)
This is really unlikely to happen: the probability is 6.69316*10^-9, or about 1 in 150 million.
(This assumes the player is robotically opening 15%, ie, never tilts into being a maniac, or varies in any real way - not like real life)
We can do the same for pfr: scoring 3 out of 16 (3.2 is 20%) when we have a probability of 13% for each trial is quite likely, about a 20% chance.
edit: this 20% figure is for scoring exactly 3; the result for 3 or more is about 35%. The value for the vpip above is so small that it doesn't much matter if you ask for 13 or greater, as it's still about 1 in 144 million.
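For anyone who wants to check these figures, here is a minimal Python sketch of the binomial calculation. It assumes every hand is an independent trial with a fixed probability, and the helper names (binom_pmf, binom_tail) are just ones I made up for illustration. The formula used is nCk * p^k * (1-p)^(n-k).

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_tail(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

# VPIP case: 13 voluntary hands out of 16 for a player who truly opens 15%
vpip_exact = binom_pmf(13, 16, 0.15)
vpip_tail = binom_tail(13, 16, 0.15)

# PFR case: 3 raises out of 16 for a player who truly raises 13%
pfr_exact = binom_pmf(3, 16, 0.13)
pfr_tail = binom_tail(3, 16, 0.13)

print(f"vpip exactly 13/16:   {vpip_exact:.5e}")
print(f"vpip 13 or more / 16: {vpip_tail:.5e}")
print(f"pfr exactly 3/16:     {pfr_exact:.3f}")
print(f"pfr 3 or more / 16:   {pfr_tail:.3f}")
```

The "or more" version is usually the more useful question, since it asks how surprising a result at least this extreme would be.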
In this case I think using the binomial model is fine, but it is a simplification: we aren't using the extra information of the pfr for the vpip case. The pfr has a fairly complicated relationship to vpip (for one thing, pfr can never be higher than vpip) but it will have some bearing on the chance of seeing 80% for vpip, and the vpip will also have an influence on the pfr result. I am not sure how to tie these things together in a good manner; I think this is far too complicated to get an algorithmic answer.
I'm not looking for a specific formula, but more of a general framework for understanding how we can trust PFR as a stat at (for example) 500 hands, but can't trust 3-bet % until 1500 hands, etc.
The above answer is really not much use as in that case we somehow knew the player was actually 15/13.
A simple way of getting a feel for how much we can trust HUD stats from low samples is to use the "Margin Of Error" approach, also suggested by sthief09.
I wrote a script to calculate these and for a vpip of 40% you get:
No. Hands in sample: 20, openedHands: 8, sampledVpip: 40.0%, Approx90%CI: +/-18.544%
No. Hands in sample: 40, openedHands: 16, sampledVpip: 40.0%, Approx90%CI: +/-12.944%
No. Hands in sample: 60, openedHands: 24, sampledVpip: 40.0%, Approx90%CI: +/-10.524%
No. Hands in sample: 80, openedHands: 32, sampledVpip: 40.0%, Approx90%CI: +/-9.094%
No. Hands in sample: 100, openedHands: 40, sampledVpip: 40.0%, Approx90%CI: +/-8.124%
No. Hands in sample: 120, openedHands: 48, sampledVpip: 40.0%, Approx90%CI: +/-7.410%
No. Hands in sample: 140, openedHands: 56, sampledVpip: 40.0%, Approx90%CI: +/-6.856%
No. Hands in sample: 160, openedHands: 64, sampledVpip: 40.0%, Approx90%CI: +/-6.410%
No. Hands in sample: 180, openedHands: 72, sampledVpip: 40.0%, Approx90%CI: +/-6.042%
No. Hands in sample: 200, openedHands: 80, sampledVpip: 40.0%, Approx90%CI: +/-5.730%
Each time you get 4x the number of possible events you roughly halve the width of the confidence interval, because the width shrinks with the square root of the sample size.
(Also this approach breaks down if the sample size is too small, as you can get a CI that would push the stat below 0%)
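A rough Python sketch of the margin of error calculation follows. This is not the original script. The printed figures appear consistent with a z-value of 1.65 and a divisor of n - 1 under the square root, so the sketch assumes both; that is an inference from the numbers, and the function name margin_of_error is my own.

```python
from math import sqrt

Z_90 = 1.65  # assumed (rounded) z-value for a two-sided 90% interval

def margin_of_error(successes, n, z=Z_90):
    """Approximate half-width of the confidence interval for a proportion.

    Uses the normal approximation with an n - 1 divisor (an assumption
    chosen because it matches the figures in the table).
    """
    p_hat = successes / n
    return z * sqrt(p_hat * (1 - p_hat) / (n - 1))

# A 40% sampled vpip at sample sizes from 20 to 200 hands
for n in range(20, 201, 20):
    opened = round(0.4 * n)
    moe = margin_of_error(opened, n)
    print(
        f"No. Hands in sample: {n}, openedHands: {opened}, "
        f"sampledVpip: {100 * opened / n:.1f}%, "
        f"Approx90%CI: +/-{100 * moe:.3f}%"
    )
```

Going from 20 hands to 80 hands (4x the sample) takes the margin from about 18.5% to about 9.1%, which is the halving described above.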
You always get the chance to pfr each hand, but at 6 max you are only likely to be able to 3-bet on 20% of hands. This would imply that in a 1500 hand sample the 3bet CI accuracy is equivalent to having 1500/5 = 300 hands of pfr or vpip (for the same stat value).
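The same idea can be turned around to ask how many hands are needed before a stat reaches a given margin of error. Below is a minimal sketch under the normal approximation. The function name hands_needed, the 5% target margin, and the 15% stat value are all made up for illustration; the 20% opportunity rate is the 3-bet figure mentioned above.

```python
from math import ceil

def hands_needed(p, margin, opportunity_rate=1.0, z=1.645):
    """Hands required for a margin of error of +/- margin on a stat.

    p                true frequency of the stat (a guess is fine)
    margin           target half-width of the interval, e.g. 0.05
    opportunity_rate fraction of hands where the stat can occur at all
    z                z-value for the confidence level (1.645 for 90%)
    """
    opportunities = z**2 * p * (1 - p) / margin**2
    return ceil(opportunities / opportunity_rate)

# A stat available every hand vs. one available on 20% of hands,
# using the same stat value so only the opportunity rate differs
every_hand = hands_needed(0.15, 0.05)
one_in_five = hands_needed(0.15, 0.05, opportunity_rate=0.20)

print(f"stat available every hand:      {every_hand} hands")
print(f"stat available on 20% of hands: {one_in_five} hands")
```

The stat that only comes up on 20% of hands needs about five times as many hands for the same margin, which is the 1500 vs 300 comparison above.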
To get better accuracy I think you can use some Bayesian type analysis on these stats. We know the player's stat will come from a distribution (probably Normal) for the player population, and we can use this to influence our CI. I think this may be to do with joint distributions or covariance stuff, but my knowledge fails around here.
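One common way to do this kind of Bayesian update is sketched below, with the caveat that it swaps in a Beta distribution for the population prior rather than a Normal one, because a Beta prior combines neatly with binomial data and stays between 0% and 100%. The prior mean of 25% vpip and the prior weight of 20 "pseudo-hands" are hypothetical numbers picked for illustration, not measured population values, and the function names are my own.

```python
from math import sqrt

def beta_posterior(prior_mean, prior_strength, successes, n):
    """Combine a Beta prior with observed successes out of n trials.

    prior_mean     assumed population average for the stat
    prior_strength how many hands of evidence the prior counts as
    Returns the (a, b) parameters of the posterior Beta distribution.
    """
    a = prior_mean * prior_strength + successes
    b = (1 - prior_mean) * prior_strength + (n - successes)
    return a, b

def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(var)

# Hypothetical prior: population vpip around 25%, worth 20 hands of evidence.
# Observed: 13 voluntary hands out of 16 (about 80% vpip).
a, b = beta_posterior(0.25, 20, 13, 16)
mean, sd = beta_mean_sd(a, b)

print(f"posterior Beta({a:.1f}, {b:.1f})")
print(f"estimated vpip: {100 * mean:.1f}% (sd {100 * sd:.1f}%)")
```

With these made-up numbers the estimate gets pulled from the raw 80% down towards the population average, and the pull gets weaker as more hands come in.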
I had also best add my usual disclaimer that I might have calculated wrong and I am just reading up on stats, I am not an expert.