In Sabermetrics, the mathematical study of baseball, there are many ways to estimate what a team’s winning percentage (w) should be against its opponents, given the total runs (y) it has scored and the total runs (x) it has allowed in a certain number of games (G).
Once one has a formula for a potential Winning Percentage Estimator (wpe), one wonders how accurately it applies to real teams in Major League Baseball. I measure wpe's against a roughly 1260 team-season sample of MLB from 1903 to 2010, with an overall w of .50000 in the sample.
My sample uses only team-seasons with 149 games or
more, which improves the accuracy of all wpe's over what they
would be if used on shorter seasons. I use Excel to measure the root
mean square error (rmse) in the predicted winning percentages as compared to actual
results—but I don’t then convert to “Wins”, as many analysts do. Since
my sample isn’t all MLB team-seasons, and leaves out all shorter
ones, I
can’t tell which wpe's are “really” most
accurate—but the general relationships I find agree pretty much
with accuracy info about wpe's I've found online from other
sources who do use all data. I
just want (and use here) a rough estimate of which measures (including my newfound
ones) are substantially different from others.
As a benchmark, BJ's Pyth2 has for my sample a rmse of .0264 (the difference between a w of .500 and a w of .5264, or around 4 wins in a 162- or 154-game season), and the lowest rmse's for other formulas are a couple at .0256 and one at .0255. The "worst" (still halfway-decent, and often quite accurate for most "normal" teams) wpe's I consider can get up to a rmse of about .031.
The .026 or .025 rmse level is almost certainly the best possible, since
there is an ineradicable element of random chance in baseball.
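The rmse measurement described above (done in Excel in my case) is easy to reproduce; here is a minimal Python sketch, where the three team-seasons are made-up placeholders rather than rows from the actual 1903-2010 sample:

```python
import math

def pyth2(y, x):
    """Bill James's Pythagorean estimator with exponent 2."""
    return y**2 / (y**2 + x**2)

def rmse(wpe, seasons):
    """Root mean square error of a wpe over (runs, runs allowed, actual w) triples."""
    errs = [(wpe(y, x) - w)**2 for (y, x, w) in seasons]
    return math.sqrt(sum(errs) / len(errs))

# Placeholder team-seasons: (runs scored, runs allowed, actual winning pct)
sample = [(760, 740, .520), (700, 700, .494), (820, 650, .590)]
print(round(rmse(pyth2, sample), 4))  # small, but these rows are invented
```

The real measurement simply runs the same computation over the ~1260-row sample.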
6. The Uniform Run Distribution: DTE to Quasi-Pythagorean Results
7. DTE, Kross, and Tangent Lines to Pythagorean Formulas: Simpler derivations
8. Tangent Line Considerations for log5
9. Wins Per Run Models
10. Pythagoras to Natural Logarithms via Integration: a New Run Estimator
11. L'Hopital Enters the Fray
12. BJ's original derivation of log5 in 1981
13. List of all wpe's considered
14. Credits to other sabermetricians and their research
_____________________________________
That
is, if one team has k times as much quality as another, will it tend to win k
times as many games? This holds up for many different plausible
measures of quality, including u = y/x. However,
there is just enough random chance involved in the game to make
it true. This random chance comes from many factors,
but one major one is that a baseball team's "quality" is a (weighted)
average of that of all its players--but not all players play in any given
game. This is especially true of
pitchers--the team's 5th best pitcher may be pretty bad, even though
the overall average quality of the team may be better than any opponent's. When the 5th best pitcher pitches, the team
has a good chance of losing. Or, when
their best pitcher pitches, but happens to pitch by chance against the only
good (but very good!) pitcher the
opposing team has, the team may still lose. But in arm-wrestling, chance plays little role:
if I am twice as strong as my opponent, i.e., if my “quality” of
arm-strength enables me to lift 200 pounds while she can only lift 100, I will NOT
win twice as many of my matches with her as she will—instead, I will win essentially all of them. And in
basketball, the Bulls team that went 72-10 did not win in proportion to its
relative excess of quality over its opponents.
It was nowhere near “7 times as good” as its average opponents—not by
any measure (shooting percentage, speed, strength, points, rebounds, steals,
free-throws, etc.), nor even by the total sum of its (small) excesses in each
category. But, as in arm-wrestling,
there is less “chance” in basketball than in baseball, so even a modest surplus
of basketball talent over one’s opponents much more often allows the team to
demonstrate that surplus by winning the game.
This is because (basically) the "good" players on the team
always play in every game, and especially near the end, when close games are
decided--Michael Jordan was rarely "by chance" not around to help
determine a game's outcome when needed. However,
the fact that this proportionality of wins to quality IS roughly true in
baseball, under many different plausible measures of quality, leads to a basic
model for many different sabermetric formulas, which may at first seem ad hoc
and unrelated.
It is often convenient to normalize a quality measure to make sure that the "quality" of an average team (which is predicted to win half its games, with w = .500, and with x = y) is equal to a constant, and 1 is certainly the most convenient constant. I will do this normalization for some quality measures, particularly ones based on the run ratio y/x: if y = x, then Qy = 1 = Qx. However,
since this "normalization" is not at all necessary to make the model
work, and is merely an "aesthetic" feature, I will not always do it.
One can always force any quality measure into a different, normalized form that yields the same wpe via the Axiom. But when the logical features and rationale of a relative quality measure are already apparent, even though it isn't "normalized" to 1, I will often not bother to do so, since normalizing can introduce extra mathematical cumbersomeness, and never changes the final Quality Model prediction of "w".
***************************************************************************
For a given (plausible) measure of quality, with a baseball team Y having quality "Qy", and its opponent team X having quality "Qx", the Basic Axiom is that Y's predicted winning percentage against X is

w = Qy / (Qy + Qx)

*************************************************************************** In
baseball, there are many quality measures for which, when team Y has “k” times
the quality of its opponent according to that quantitative measure, it will
indeed generally win "k" times as many games in their matches. For various mathematical reasons, it may sometimes appear that w for a team Y in a league is

w = y^2 / (y^2 + x^2)     (Pyth2)

It is certainly a pretty good predictor, as wpe's go.
Here y is runs scored by Y, and x is runs allowed by Y, i.e., runs scored by its opponents (X). This result would trivially follow from the
Basic Axiom IF we chose runs^2 to be the quality measure of each team,
respectively. That results in y^2 being
the quality Qy of Team Y, and x^2 being Qx, the quality of the agglomerated
league opponent Team X when playing against Y.
I wondered for 28 years WHY runs-squared was used! Why aren't just runs themselves (without the squares) indicators of team quality, and hence predictors of winning percentage? One answer
would be that if Qy = y and Qx = x, the resulting formula via my Basic Axiom
would be w = y / (y
+ x), which has been shown to NOT predict real-life team winning percentages
very well. Pyth2 predicts w pretty accurately--refinements
and alternate formulas never do very much better than Pyth2. But, in fact, there is a much more plausible indicator of team Quality than y^2 and x^2, one which leads (via my Axiom) to the Pyth2 result: the run ratio y/x, developed below.
BJ's log5 formula was first developed in the late 70's, vaguely mentioned in his 1978 Baseball Abstract, and explicitly developed in his 1981 Baseball Abstract. BJ tried in the 1981 Abstract to "prove" the relationship between log5 and Pythagorean results, but unsuccessfully—his claims in the 1981 Abstract about how the two are related have a fundamental error.
But the formulas are nonetheless very related, as I will show soon. [A discussion of BJ’s original development
(with the error) in the 1981 Abstract is presented further down the webpage.]
That is to
say, if Team Y meets Team X in post-season play, and we know that Team Y went
60-40 in the regular season in its league, while Team X went 55-45 in its
league (perhaps the same league, perhaps not), what is the winning percentage w that
we should "expect" in the postseason series for Team Y as it plays
Team X? log5's answer, in odds-ratio form (with each team's odds ratio = Wins/Losses), is

w = (Wy/Ly) / [ Wy/Ly + Wx/Lx ]

So, when a 60-40 team plays a 55-45 team, it should have predicted winning percentage w of w = (60/40) / [60/40 + 55/45] = .551. This
discovery by BJ has long been shown to be a very useful and accurate result. (It was also immediately transformed
mathematically into an equivalent version using inputs of team Y’s and team X’s winning
percentages against their leagues, rather than their odds ratios. But the odds ratio version above is the
important one for other wpe's.) BJ's Pythagorean Formula for power 2 (Pyth2) says that winning percentage w is given by (with y = runs scored by Y, and x = runs allowed)

w = y^2 / (y^2 + x^2)

Define Qy, the quality of Y, as the "run ratio" of its runs scored to its runs allowed, Qy = y/x, and likewise Qx = x/y. This is certainly an intuitively plausible measure of the quality of a team. Plugging the above Qy and Qx into the Basic Axiom gives

w = (y/x) / [ y/x + x/y ] = y^2 / (y^2 + x^2)

I get a rmse for Pyth2 of .0271, which will be the "pretty good" benchmark against which to measure other wpe's.
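Both log5 and Pyth2 are instances of the Basic Axiom, differing only in the quality measure; a small Python sketch (the 760/740 season totals are hypothetical numbers, not from the sample):

```python
def axiom(qy, qx):
    """The Basic Axiom: w = Qy / (Qy + Qx)."""
    return qy / (qy + qx)

# log5: quality = odds ratio (Wins/Losses); the 60-40 vs 55-45 example
print(round(axiom(60/40, 55/45), 3))  # 0.551, matching the text

# Pyth2: quality = run ratio; Qy = y/x and Qx = x/y give y^2/(y^2 + x^2)
y, x = 760, 740  # hypothetical season totals
assert abs(axiom(y/x, x/y) - y**2 / (y**2 + x**2)) < 1e-12
```

The Pyth2 identity is just multiplying numerator and denominator of the axiom by xy.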
It was rapidly realized, even by BJ in 1981, that Pyth2 could be made more accurate (a little bit) by using a smaller exponent, around 1.82, giving Pyth1.82: w = y^1.82 / (y^1.82 + x^1.82). We can derive this from the Quality model by using Qy = (y/x)^(n/2) and Qx = (x/y)^(n/2). But now we ask why a smaller exponent fits real results better. The answer is the role of chance. This fact is well known in sabermetrics, and leads to wpe's that include RPG = (y + x)/G in them, by which the greater the RPG, the more the resulting w is due to actual quality rather than chance. Again, this
effect of chance at historically average MLB RPG levels means that the
Basic Axiom
isn’t “really” true—rather, it’s true for
“effective” quality, as opposed to “real”
quality--but it’s close enough for horseshoes, and for MLB
baseball. Again, the difference in accuracy between
Pyth2 and Pyth1.82 is very small, on average—a small fraction of a win per team per
year. What it
does mean is that many Quality measures result in wpe’s that can K:
Adding any constant in the numerator and twice the constant in the
denominator moves the overall result a bit closer to ½. BJ used K = 60,000, but I find with my data
that the best fit is around 38,000.
Doesn’t matter too much… This has
the same effect on overall accuracy as changing the 2 to 1.82 and not using any
K.
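Assuming the constant enters as (y^2 + K)/(y^2 + x^2 + 2K), which is how "a constant in the numerator and twice the constant in the denominator" reads for Pyth2, a quick Python check shows the pull toward .500 (run totals are hypothetical):

```python
def pyth2_k(y, x, k=0):
    """Pyth2 with an additive constant K pulling the estimate toward .500."""
    return (y**2 + k) / (y**2 + x**2 + 2*k)

y, x = 760, 740  # hypothetical season totals
plain = pyth2_k(y, x)          # plain Pyth2
bj    = pyth2_k(y, x, 60000)   # BJ's K
fit   = pyth2_k(y, x, 38000)   # the best-fit K for my sample
# each K moves the estimate a bit closer to .500
assert abs(bj - 0.5) < abs(plain - 0.5)
assert abs(fit - 0.5) < abs(plain - 0.5)
```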
BUT, from now on, we will NOT measure or model "effective" quality any more, since the real "quality" measures are still good predictors, and it is the mathematical relationships between the various fundamental wpe's in which I'm interested, not ad hoc tweaks that produce minutely better accuracy.
Define BJMWP as follows. Let A be the league-average total runs per team, and let the Quality of a team Y be Qy = (y - x) + A, with Qx = (x - y) + A for its (league-aggregate) opponent. Intuitively, this says that any team Y, no matter how many total runs it scores, that has a surplus of runs scored over allowed of (y - x) has Quality, relative to that of its opponents, of an average team's A runs plus that surplus. That is, the "near-average" Team would normally have A runs, but its excess run differential now gives it (y - x) + A runs. An average opponent would have A runs. This is
fairly plausible--in a league where average total runs per team are 700, a team
that scores 760 and gives up 740 should be of roughly similar overall quality
to one which scores 720 and gives up 700, as should a team that scores 669 and
gives up 649. In BJ's formula, the Basic Axiom gives the result below. [One also gets the same result by assuming that Qy = A / (A + (x – y)), i.e., that the quality is similar to a situation in which Y scored an Average amount of runs A, and X scores A + (x – y), i.e., below average, if y > x.]
w = (y - x + A) / [ (y - x + A) + (x - y + A) ] = 1/2 + (y - x)/(2A)

I don't actually have data on A handy, so I checked it against various constants in place of A, and for A = 750 got a "best fit" rmse of .0272. Thus, it is at least as good as Pyth2, which is
explicitly why BJ used it for Win Shares. But of course it is undoubtedly more accurate using A for each league/year separately.
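A sketch of BJMWP in Python, using the three hypothetical teams from the paragraph above; note that all three, having the same 20-run surplus, get the identical predicted w, which is exactly the "similar overall quality" claim:

```python
def bjmwp(y, x, a=750):
    """BJMWP: quality = run surplus added to an average team's A runs.
    Simplifies to 1/2 + (y - x)/(2*a)."""
    qy = (y - x) + a
    qx = (x - y) + a
    return qy / (qy + qx)

# all three example teams have a 20-run surplus, so identical predicted w
assert bjmwp(760, 740) == bjmwp(720, 700) == bjmwp(669, 649)
print(round(bjmwp(760, 740), 4))  # 0.5133
```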
Assume that
a difference of runs (y – x) is similar to a situation where each team is a
distance of half of that difference above or below an Average Team. That is, assume that it is as if Y scored A +
(1/2)(y – x) runs, and X scored A + (1/2) (x – y) runs. Then apply
the same Quality measure as in the Pyth2 case, i.e., the run ratio between the two teams. This, of course, leads to the Pyth2 result with the runs for each team squared (first doubling each runs amount for simplicity):

w = (2A + y - x)^2 / [ (2A + y - x)^2 + (2A + x - y)^2 ]

or, letting s = (y - x) and expanding the squares, w = (2A + s)^2 / (8A^2 + 2s^2). This is another wpe using the run differential s = (y - x), and will be discussed further in that section.
Note: H = half of 3 runs per game per team. If for H one used not a constant, but (1/2)[(y + x)/2]/G, i.e., half of the average single-team runs/game in Y's games, one actually gets Ben Vollmayr-Lee's Linear Prediction model "BVL2" (see next section). A very nice relationship!
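A sketch of this quasi-Pythagorean wpe in Python (A = 750 and the run totals are hypothetical). One nice check: when y + x = 2A exactly, 2A + s = 2y and 2A - s = 2x, so the formula reproduces Pyth2 exactly:

```python
def pyth2(y, x):
    return y**2 / (y**2 + x**2)

def half_shift_pyth(y, x, a=750):
    """Quasi-Pythagorean wpe: split the run differential s = y - x around
    an average team's A runs, then square as in Pyth2."""
    s = y - x
    return (2*a + s)**2 / ((2*a + s)**2 + (2*a - s)**2)

# hypothetical totals with y + x = 1500 = 2*750: agrees with Pyth2 exactly
y, x = 760, 740
assert half_shift_pyth(y, x) == pyth2(y, x)
```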
That is, if Qy = (y - A/2) / (y + x - A). Now define t = y/(y + x), Y's share of all the runs scored in its games. For an "average" team, where y = x, and thus t = 1/2, the winning percentage should be .500 = 1/2, so a linear predictor in t suggests itself; it could also be expressed (as BVL prefers) in point-slope form. So here is the BVLN Linear Prediction Model:

w = N t + (1 - N)/2

This will be accurate for various N's near 2.
Substituting t = y/(y + x) and using the LCD gives us

w = (2Ny + y + x - Ny - Nx) / [2(y + x)]

or, simplifying for a couple steps,

w = [y + x + N(y - x)] / [2(y + x)]

Here is our first natural occurrence of the parameter N.
Let Qy = (N + 1)y - (N - 1)x, and Qx = (N + 1)x - (N - 1)y. Intuitively, this again says that a team Y's Quality can be measured by a weighted difference of its runs scored and runs allowed; since Qy + Qx = 2(y + x), the Basic Axiom returns exactly the BVLN formula above. BVL also shows that whatever N best fits the data will also be the exponent in the Pythagorean Run Formula that best fits the data for that model--via a calculus tangent-line relationship, which will be discussed below. In fact, the "best fit" to real-life data says that N should equal around 1.8, not 2.
So we look for
wpe's of the form w = f(q). The And
secondly that, since by definition the w = f(q) = 1 / (1 + q^n ). Here, to be realistic for baseball scenarios, we'll limit ourselves to n > 0. [Note: if we use u = y/x, we get w = f(u) = u^n / (1 + u^n), slightly less elegant than the q-version. No real difference in approach or results, of course.] This (or the u-version, of course) also simplifies, using y and x, to w = y^n / (y^n + x^n). When the exponent n = 2, we get the "original" version Pyth2 proposed by Bill James in the 1970's, w =
y^2 / (y^2 + x^2), which, because of a superficial similarity to the
(real, mathematical) Pythagorean Theorem, he dubbed by the name "Pythagorean Run Estimator (or
Formula)". It soon became apparent that other exponents besides n = 2 could be used, and empirical tests have suggested that the most accurate (statistically) predictive n-value could be (depending on the data sample) any of various exponents from the range of roughly n = 1.7 to n = 2, with most being close to 1.8 or so. I will call
any such function
Before I discovered the above reason "why" the Pythagorean Run Estimator Pyth2 is true, based on the quality model, I tried many other approaches. Here is one that "almost" worked, based again on a very simple model.

6A) Uniform Run Distribution. Assume again that Team Yukon scores an average of Y runs per game, and gives up X to its opponents, Team Xavier. Here, X and Y are constants for the purposes of the rest of the inquiry, hence the capital letters, as we will need small y and small x for their routine mathematical role as "variable" coordinates in the usual Cartesian plane.

Further assume that the range of runs scored per game is twice the average, and assume a "uniform" distribution of run results for each team in each range (not a normal or bell-shaped one). That is, assume that the probability of any individual game's result equaling a number of runs for each team is the same for any number in the range 0 to 2X and 0 to 2Y. This is NOT a very good assumption, as most scores will be in the middle of the range, but it is certainly a simple one, and its inadequacies tend to cancel out for our purposes, given that its errors are in the same direction for both teams. Further assume a continuous distribution of possible runs, not a discrete one--this assumption is necessary to keep the simplicity of the model as well, and shouldn't distort the model too much (certainly not compared to the distortions of uniformity!).

Under all these assumptions, one can model the game results by a rectangle in the Cartesian plane with corner coordinates (0,0), (0, 2Y), (2X, 2Y), and (2X, 0), in which a game result is any point (x, y) in the rectangle, with y being the number of runs scored in the game by Yukon, and x being those scored by Xavier (i.e., allowed by Yukon). We first note that in this simple model, the probability that Yukon beats Xavier, i.e. its winning percentage w, is simply the area of the portion of the rectangle that is above the line y = x, divided by the total rectangle area 4XY. Assume WLOG that Y > X, and elementary algebra shows that the area above y = x is 4XY - 2X^2. Dividing by the total area 4XY and simplifying gives us:

w = 1 - X/(2Y) = (2Y - X)/(2Y)
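The area computation yields w = 1 - X/(2Y) (for Y >= X), i.e., the DTE formula discussed below. A Monte Carlo sketch in Python of the uniform model, with made-up per-game averages:

```python
import random

def dte(Y, X):
    """The area result: w = 1 - X/(2Y), valid for Y >= X."""
    return 1 - X / (2 * Y)

def simulate(Y, X, n=200_000, seed=0):
    """Monte Carlo version of the uniform-run-distribution model."""
    rng = random.Random(seed)
    wins = sum(rng.uniform(0, 2*Y) > rng.uniform(0, 2*X) for _ in range(n))
    return wins / n

Y, X = 5.0, 4.0  # made-up average runs scored / allowed per game
print(dte(Y, X))       # 0.6
print(simulate(Y, X))  # close to 0.6 (ties have probability 0 here)
```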
Using the hyperbolic piece on the other half of the domain forces the second constraint to now also be satisfied, at the expense of now having a piecewise function, with one half being the linear f(q), and the other half the hyperbolic g(q) = 1/(2q). This is a very nice example of a
piecewise function, because the transition between the two is
"seamless", with both having the same slope at q = 1 (and each being
a fairly good approximation of Pyth2, w = 1/(1 + q^2) on their respective
domains. Of course, both parts give a value of 1/2 at q = 1, as they should. The nice thing, though, is that for large values of q = x/y, which is to say teams that give up a lot more runs than the small (but non-zero) number they score, the winning percentage does not become exactly zero (or negative!), as it would with DTE--and as it would not in real life. I don't have my spreadsheet set up to evaluate Kross by using the different pieces as appropriate for y < x or y > x, so I can't tell how accurate it is, but it is certainly likely to be a little more accurate than either piece separately. But, I
thought, why not get a single function from the 2 pieces? That's what I did by simply averaging them, in:
This turns out to be a fairly accurate estimator, substantially better than either piece separately. It has some other versions, such as:

KrossAvg: w = (1/2) + (y - x) M, where M is a "winning percentage per run" estimator (see section 9), with M = (y + x)/(4xy)

6C) But in fact, it is better to "average them" incorrectly! If we take the "false average" of two fractions by simply adding their numerators and adding their denominators, and apply this to Kross1 (DTE) and Kross2, writing Kross1 as (2y - x)/(2y) and Kross2 as y/(2x), we get that the "false average" is BVL2. For comparison, in the w = 1/2 + (y - x)M form:

Kross2: w = 1/2 + (y - x)/2x
BJMWP: w = 1/2 + (y - x)/2A
BVL2: w = 1/2 + (y - x)/(y + x)
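The "false average" identity is easy to verify exactly with rational arithmetic; a Python sketch (run totals hypothetical):

```python
from fractions import Fraction

def false_average(n1, d1, n2, d2):
    """'False average' (the mediant): add numerators and add denominators."""
    return Fraction(n1 + n2, d1 + d2)

y, x = Fraction(760), Fraction(740)  # hypothetical run totals
# Kross1 (DTE) = (2y - x)/(2y);  Kross2 = y/(2x)
fa = false_average(2*y - x, 2*y, y, 2*x)
bvl2 = Fraction(1, 2) + (y - x) / (y + x)
assert fa == bvl2  # the false average of DTE and Kross2 is exactly BVL2
```

Algebraically: (2y - x + y)/(2y + 2x) = (3y - x)/(2(y + x)), which is BVL2 rewritten over a common denominator.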
Except...what is that formula? If one uses as one's variable not q = X/Y, but "p" = (X - 1/2) / (Y - 1/2), one gets the formula above as w = f(p). It is a "perfect" Pythagorean Power Run Formula, except as a function of a "shifted" variable p,
equal to the ratio of (X - 1/2) to
(Y - 1/2), instead of q, the ratio
of X to Y. I am naming this the Shifted Pythagorean
Run (SPR) formula. One can of course change back to X and Y if you want, yielding, with p = (X - 1/2)/(Y - 1/2):

SPR: w = f(p) = 1 / (1 + p^2) = (Y - .5)^2 / [ (Y - .5)^2 + (X - .5)^2 ]

Again, this is perfectly "Pythagorean" in form, just using the "shifted" run averages. Oh, well...definitely worse than Pyth2. The reason
it does worse than Pyth 2 is that, as we saw earlier, Pyth 2 predicts results
(based on a given run ratio q) without sufficiently taking into account the
effect of chance, which moves w a tad
closer to .500 than would be predicted by Pyth 2. That effect means that a run ratio value a
little closer to 1 predicts more accurately than the actual run ratio
does. But subtracting ½ from X and Y
moves the run ratio farther away from 1, not closer—that is, it further reduces
the role of “chance”, and is hence even less predictive. But why not
move in the opposite direction? Why not increase
Y and X, which moves the q ratio closer to 1, and should thus account more for
chance? Hence, let's use X + 1/2 and Y + 1/2, and see if we get better predictive results: and we do!
[Of course, this is now an “ad hoc” adjustment, not derived from my
model above—but it’s an interesting one, and is very accurate!] Also, let's
no longer treat Y and X as fixed runs per game, but as variables, and as total
runs per team--PRF's have no concern for which variable is used, total runs vs.
runs/G, as the G's cancel out. So we get the much more accurate new estimator (with y = total runs in a season for team Y, and x = total runs in a season allowed by team Y):
_________________________________

If n = 1.82, we get w = .955 - .455 q, or w = .955 - .455 x/y. This should be the "best fit" linear wpe based on q. RMSE = ???

7A) DTE: A Quality model with Qy = y - x/2 and Qx = x/2 gives w = 1 - (1/2)x/y, or w = (2y - x)/(2y). Similarly, Qy = y/2 and Qx = x - y/2 give w = y/(2x). But this is just Kross2! See section 6B). Hence, Kross2 also comes from the simple Quality model of BJMWP. One can also obtain both DTE and Kross2 from a "General Marginal Runs" formula I've found by playing around with the above idea:

GMR: w = (y - A/2) / (y + x - A)    [RMSE: ??? -- choose a best-fit A! See end of sec. 3 = BJA]

in which if, reversing the substitutions we used above for BJMWP, we replace A by y we get Kross2, and if we instead replace A by x we get DTE. Note that GMR is saying that the winning percentage is simply the ratio of the excess of runs scored by Y over half the league average, to the excess of total runs in Y's games (scored and allowed) over half the league average of total runs for both teams (which would be half of 2A, or A).
7B) Tangent lines, q, u, v, t, Inflection points, etc. x
and y q
u
t
v
w = 1/2 when: Range: q
=
x/y
1/u
(1 - t) / t
(1 + v) / (1 - v)
q = 1
[0 , oo) Note that all but Pyth2 are linear in at least one version: BVL2 is linear in t and v, DTE is linear in q, and Kross2 is linear in u. And, amazingly, in each linear case, the linear formula is the (calculus) tangent line of the other three wpe's in its column! The same is true for the "q" column: DTE using the q formula is the tangent line to Kross2, BVL2, and Pyth 2 in their respective q-versions. More generality: Each of these above formulas comes in a variation where in stead of using n = 2 in Pyth 2, we could use a "better fit" exponent n. This is generally taken to be around n = 1.82 for best fit to real results. I'm going to use capital N in the titles, but small "n"s in the formulas: This creates BVLN, PythN, DTEN, and Kross2N: For these, which are more complicated, I will give only those paramter's versions which are fairly simple: PythN: y^n/(y^n + x^n) = 1 / (1 + q^n) BVLN: [(n + 1)y - (n - 1)x] /
(2x + 2y)
=
(1 - n)/2 + nt
= 1/2 + (n/2) v
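The tangent-line claim for the q-column can be spot-checked numerically; a Python sketch comparing a finite-difference slope of PythN at q = 1 with DTEN's slope of -n/4:

```python
def pyth_q(q, n=2):
    """PythN as a function of q = x/y."""
    return 1 / (1 + q**n)

def dten(q, n=2):
    """DTEN: the tangent line of PythN at q = 1, with slope -n/4."""
    return 0.5 - (n/4) * (q - 1)

for n in (2, 1.82):
    h = 1e-6
    slope = (pyth_q(1 + h, n) - pyth_q(1 - h, n)) / (2*h)  # central difference
    assert abs(slope - (-n/4)) < 1e-6         # derivative at q = 1 is -n/4
    assert pyth_q(1, n) == dten(1, n) == 0.5  # both pass through (1, 1/2)
```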
Wins Per Run models (cf. Tango Tiger, BVL, Palmer): Historically, the average Wins/Run "Q" factor has been around 1/9.5 or so, and Games have long been around 160 per season or so, making the per-run change in w roughly M = Q/G, around 1/1500. Using an M(x,y) based on individual team runs x and y (or League Average runs A) will of course create non-constant M-values with a fair amount of variation around the rough mean of 1/1500. Non-constant M's should of course fine-tune and increase the predictive accuracy. Moreover, just in general, getting 1 extra win (above .500) from roughly 10 extra runs (above y = x) is a simple rule of thumb--and a pretty accurate one! Here, M = 1/1500. This
M = 1/(2x) simply switches from its DTE value of 1/2y to 1/(2x), same rough size for an average team as for
DTE. Again, this is a new form of
Kross2, but simple algebra reveals the equivalence. It is no wonder Kross came up with these two
parts of his piecewise function, DTE and Kross2, since they are mirror images--one
using x, the other y, in the denominator.
Since y > x, one will always be an overestimate compared to the
other, and naturally using an average of the two will give a middle value, less
likely to be either an underestimate or overestimate. Which is precisely what the next function
uses:
Here, M = (y + x)/(4xy), from KrossAvg. Is M here roughly in the neighborhood it should be? Yes: set x = y, and M = 2y/(4y^2) = 1/(2y), so on average it's around 1/1500 or so.
Here is the Tango Tiger (TT) estimator's M, best seen as a "false average" of two previous M's, as follows. The first is M = 1/(x + y), from BVL2. The second comes from simply using Q = 1/10, a rough historical average (empirically) for Wins/Run (though 1/9.7 or so gives more accuracy in my wpe's), and thus M = Q/G = 1/(10G). If we "false average" these two M values by simply adding numerators and adding denominators, we get the new M = (1 + 1)/(x + y + 10G) = 2/(x + y + 10G), the M used in TT. (I love false averages!) Note, TT didn't conceptualize it this way... And the formula is very accurate--more so than either BVL2 or the 1/(10G) are individually, and both of those are pretty good.
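A Python sketch comparing the three M's on one hypothetical team-season (the TT value lands between the other two, as a false average must):

```python
def wpe_linear(y, x, m):
    """Generic wins-per-run form: w = 1/2 + (y - x) * M."""
    return 0.5 + (y - x) * m

y, x, G = 760, 740, 162  # hypothetical team-season
m_bvl2 = 1 / (y + x)           # BVL2's M = 1/1500
m_flat = 1 / (10 * G)          # constant ~1 win per 10 runs
m_tt   = 2 / (y + x + 10 * G)  # the "false average" of the two
assert m_flat < m_tt < m_bvl2  # the false average lands between
for m in (m_bvl2, m_flat, m_tt):
    print(round(wpe_linear(y, x, m), 4))
```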
Here define DTEN as "the tangent line of PythN"! This is in their q = x/y forms, evaluated at q = 1. But then we note that by that
definition, DTEN has exactly the same structure as DTE, but with a different
constant N/4 instead of 1/2. And that it
then turns out to be the tangent line of BVLN justifies the
"definition" even more, since DTE (using N = 2) is the tangent line
of BVL2. But right now
we're dealing with (y - x) forms, not q-forms.
So let's go to:
[Notes: why not use M = 2/(x + y + 2A), which is the false average of 1/(x + y) and 1/(2A)? That's the Kross Average! Separate issue: the false average of 1/(x + y) and 1/1500 IS pretty good--but with 1625, not 1500. Best so far, except for BVL.]
_________________________________

10. Pythagoras to Natural Logarithms via Integration: a New Run Estimator

10A) Let Pyth(n) be w = y^n / (y^n + x^n). Let G = total Games played, and let n be fixed at some power we find "best". To formulate a Wins/Run function, we fix x, assume y = x has created w = .500, i.e., wG = (1/2)G, and then solve the equation wG = (1/2)G + 1 for y, and then for y - x. Doing so, we get

y^n / (y^n + x^n) = (G/2 + 1) / G = (G + 2)/(2G)

or, cross-multiplying and then collecting and factoring y^n,

y^n (1 - [G + 2]/[2G]) = x^n (G + 2)/(2G)

or, dividing, simplifying, and taking the nth root,

y = x ( [G + 2]/[G - 2] )^(1/n)

Therefore, y - x = [ ( [G + 2]/[G - 2] )^(1/n) - 1 ] x = (extra) runs per additional win. Taking the reciprocal gives (extra) Wins/Run = 1/(Kx), where K = ( [G + 2]/[G - 2] )^(1/n) - 1.

10B) Now assume a team has scored y runs and given up x runs. We ask how it accumulated extra wins during the process of accumulating the excess (y - x) surplus runs, since the Wins/Run changed at each step of the way. That is, our assumption above was that y = x when we found Wins/Run, but as each successive extra run was scored, there is a higher y, and hence an "assumed" higher x, which will influence the extent to which the next incremental win will come from the (new) level of x and y. So what we really need to do is INTEGRATE the Wins/Run function at every level of runs, from the x-value that really was what the opponents scored, up to the y that our team actually scored, accumulating "dw"s (win increments) along the way for the varying Wins/Run occurring at each different y. To do this, since y and x are now fixed total runs scored and can't play the role of a variable, integrate over a dummy variable p. Since the integral of 1/p is ln(p), substituting the limits of integration x and y gives

total "extra" Wins = Integral of [ 1/(Kp) ] dp from p = x to p = y = (1/K)[ ln y - ln x ]

with K as above. Divide these wins by G to convert to winning percentage, and add this "extra w" to the w = .500 level where y started (at x), and we get a new run estimator:

LNW: w = (1/2) + (1/KG)[ ln y - ln x ]

where KG, for n = 1.82 (best Pyth n) and G around 160, yields 2.2 or so. Best LNW: w = (1/2) + [ ln y - ln x ]/2.2. Rmse = .0259. [Note: 4/(KG) is approximately the Pythagorean exponent n we started with. This is not surprising when we look at tangent lines, as below.] This is quite an accurate estimator, according to my sample data. In u and q forms: w = (1/2) + [ ln u ]/2.2 = (1/2) - [ ln q ]/2.2.
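A Python sketch of the LNW construction (G = 162 and the run totals are hypothetical):

```python
import math

def lnw(y, x, G=162, n=1.82):
    """LNW: integrate the PythN wins-per-run function from x up to y."""
    k = ((G + 2) / (G - 2)) ** (1 / n) - 1
    return 0.5 + (math.log(y) - math.log(x)) / (k * G)

G, n = 162, 1.82
k = ((G + 2) / (G - 2)) ** (1 / n) - 1
print(round(k * G, 2))        # ≈ 2.21, the "2.2 or so" above
print(round(4 / (k * G), 2))  # ≈ 1.81, roughly recovering the exponent n
print(round(lnw(760, 740), 4))  # hypothetical run totals
```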
_________________________________
WPR = z = G(y^N - x^N) / [ 2(y - x)(y^N + x^N) ]. Or, even simpler, we can once again use the "false average" approach on the fractional parts of the two fractions that mis-estimate (one overestimating, the other underestimating). This is what we did in part 6C.
First developed in the late 70's, vaguely mentioned in his 1978 Baseball Abstract, and explicitly developed in his 1981 Abstract, BJ's (apparently misnamed, as it seems to have nothing to do with logarithms) log5 formula is an extremely useful formula, stemming from basic probability theory, describing how often Team Y could be predicted to beat Team X in a series of head-to-head games, if the only info we have about them is their Winning Percentages (or, alternately, W/L odds ratios) against their respective entire leagues of opponents.
opponents. That is to
say, if Team Y meets Team X in the World Series, and we know that Team Y went
60-40 in the regular season in its league [winning percentage = .600 = 60/(60 + 40) = Wins/(Wins +
Losses)], while Team X went 55-45 (winning percentage = .550) , what is the winning percentage w that
we should "expect" in the World Series for Team Y as it plays Team
X?
BJ went
through a method similar to mine to arrive at his conclusions back in
1981. A discussion of this follows, but
can be skipped by those who aren’t interested in his method. There are
many different plausible models that lead to the same log5 result, which is the
result BJ gave in his 1981 abstract. But
his stated method there is almost identical to my "Quality" model
method in spirit. So in a very real
sense, Log5 is the original/quintessential "quality" model for all
winning percentage estimators. However, BJ did not seem to explicitly realize that the basic quality model can underpin much more than log5. BJ said the
following in the 1981 Abstract: Assume
that Y as above has winning percentage j = .600 against its league opponents Z,
where Z is "all the opposing teams that Y played against, as instantiated
in whomever they chose to play in their games against Y only." Again, Z is conceptualized as a vast
"entire league" team Z that only plays certain of its players against
Y in any given game. Further
assume that an average team (roughly like Z here) has an arbitrary
"quality" level Qz of 1/2.
Then ask, "What quality level Qy would Y have to have, in relation to Z's quality level of 1/2, so that Y would win with a winning percentage j = .600 against Z, the league average team?" That is, he essentially asked, what quality level Qy would satisfy the Basic Axiom: Qy / (Qy + Qz) = .600, where Qz = 1/2 = .500? [He
asked this in English, and actually garbled (grammatically) the question, so it
is technically inaccurate--but it is clear from the immediately subsequent math
what he meant.] If you substitute and solve Qy / (Qy + .500) = .600, you get by basic algebra Qy = .750. So, for me, the "quality" of Y would in this case be simply 1.5 times that of the average team--which is exactly Y's odds ratio, 60/40 = 1.5.

13. List of all wpe's considered

1A) Pyth(n)
http://gosu02.tripod.com/id69.html |