Numbers Beget Numbers, Beget Greater Numbers
Once you have the box score, calculating either composite stat is relatively simple:
Win Shares
ScPoss = (FG_Part + AST_Part + FT_Part) * (1 - (Team_ORB / Team_Scoring_Poss) * Team_ORB_Weight * Team_Play%) + ORB_Part
FG_Part = FGM * (1 - 0.5 * ((PTS - FTM) / (2 * FGA)) * qAST)
qAST = ((MP / (Team_MP / 5)) * (1.14 * ((Team_AST - AST) / Team_FGM))) + ((((Team_AST / Team_MP) * MP * 5 - AST) / ((Team_FGM / Team_MP) * MP * 5 - FGM)) * (1 - (MP / (Team_MP / 5))))
AST_Part = 0.5 * (((Team_PTS - Team_FTM) - (PTS - FTM)) / (2 * (Team_FGA - FGA))) * AST
FT_Part = (1-(1-(FTM/FTA))^2)*0.4*FTA
Team_Scoring_Poss = Team_FGM + (1 - (1 - (Team_FTM / Team_FTA))^2) * Team_FTA * 0.4
Team_ORB_Weight = ((1 - Team_ORB%) * Team_Play%) / ((1 - Team_ORB%) * Team_Play% + Team_ORB% * (1 - Team_Play%))
Team_ORB% = Team_ORB / (Team_ORB + (Opponent_TRB - Opponent_ORB))
Team_Play% = Team_Scoring_Poss / (Team_FGA + Team_FTA * 0.4 + Team_TOV)
ORB_Part = ORB * Team_ORB_Weight * Team_Play%
FGxPoss = (FGA - FGM) * (1 - 1.07 * Team_ORB%)
FTxPoss = ((1 - (FTM / FTA))^2) * 0.4 * FTA
TotPoss = ScPoss + FGxPoss + FTxPoss + TOV
PProd = (PProd_FG_Part + PProd_AST_Part + FTM) * (1 - (Team_ORB / Team_Scoring_Poss) * Team_ORB_Weight * Team_Play%) + PProd_ORB_Part
PProd_FG_Part = 2 * (FGM + 0.5 * 3PM) * (1 - 0.5 * ((PTS - FTM) / (2 * FGA)) * qAST)
PProd_AST_Part = 2 * ((Team_FGM - FGM + 0.5 * (Team_3PM - 3PM)) / (Team_FGM - FGM)) * 0.5 * (((Team_PTS - Team_FTM) - (PTS - FTM)) / (2 * (Team_FGA - FGA))) * AST
PProd_ORB_Part = ORB * Team_ORB_Weight * Team_Play% * (Team_PTS / (Team_FGM + (1 - (1 - (Team_FTM / Team_FTA))^2) * 0.4 * Team_FTA))
Stops = Stops1 + Stops2
Stops1 = STL + BLK * FMwt * (1 - 1.07 * DOR%) + DRB * (1 - FMwt)
FMwt = (DFG% * (1 - DOR%)) / (DFG% * (1 - DOR%) + (1 - DFG%) * DOR%)
DOR% = Opponent_ORB / (Opponent_ORB + Team_DRB)
DFG% = Opponent_FGM / Opponent_FGA
Stops2 = (((Opponent_FGA - Opponent_FGM - Team_BLK) / Team_MP) * FMwt * (1 - 1.07 * DOR%) + ((Opponent_TOV - Team_STL) / Team_MP)) * MP + (PF / Team_PF) * 0.4 * Opponent_FTA * (1 - (Opponent_FTM / Opponent_FTA))^2
Stop% = (Stops * Opponent_MP) / (Team_Possessions * MP)
DRtg = Team_Defensive_Rating + 0.2 * (100 * D_Pts_per_ScPoss * (1 - Stop%) - Team_Defensive_Rating)
Team_Defensive_Rating = 100 * (Opponent_PTS / Team_Possessions)
D_Pts_per_ScPoss = Opponent_PTS / (Opponent_FGM + (1 - (1 - (Opponent_FTM / Opponent_FTA))^2) * Opponent_FTA*0.4)
+((((2*(B4+0.5*D4)*(1-0.25*(P4-F4)/C4*(Q4/48*1.14*(K$29-K4)/B$29+(K$29/48*Q4-K4)/(B$29/48*Q4-B4)*(1-Q4/48))))+((B$29-B4+0.5*(D$29-D4))/(B$29-B4)*(P$29-F$29-P4+F4)/(C$29-C4)*K4/2)+F4)*(1-((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29))*((1-H$29/J$29)*((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29))/((1-H$29/J$29)*((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29))+(H$29/J$29)*(1-((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29)))))*H$29/(B$29+(1-(1-F$29/G$29)^2)*G$29*0.4))+(H4*((1-H$29/J$29)*((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29))/((1-H$29/J$29)*((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29))+(H$29/J$29)*(1-((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29)))))*((B$29+(1-(1-F$29/G$29)^2)*G$29*0.4)/(C$29+0.4*G$29+N$29))*P$29/(B$29+(1-(1-F$29/G$29)^2)*0.4*G$29)))-0.92*((B4*(1-0.25*(P4-F4)/C4*(Q4/48*1.14*(K$29-K4)/B$29+(K$29/48*Q4-K4)/(B$29/48*Q4-B4)*(1-Q4/48)))+0.25*((P$29-F$29-P4+F4)/(C$29-C4))*K4+(1-(1-(F4/G4))^2)*0.4*G4)+(C4-B4)*(1-1.07*H$29/J$29)+(1-(F4/G4))^2*0.4*G4+N4)*(P$26/(C$26+N$26+G$26*0.4-1.07*H$26/(J$29+J$30)*(C$26-B$26))))/(0.32*(P$26/82/2))+Q4/48/5*(0.5*(C$29+0.4*G$29+N$29-1.07*H$29/J$29*(C$29-B$29)+C$30+0.4*G$30+N$30-1.07*H$30/J$30*(C$30-B$30)))*(1.08*(P$26/(C$26+N$26+G$26*0.4-1.07*H$26/(J$29+J$30)*(C$26-B$26)))-((P$30/(0.5*(C$29+0.4*G$29+N$29-1.07*H$29/J$29*(C$29-B$29)+C$30+0.4*G$30+N$30-1.07*H$30/J$30*(C$30-B$30))))+0.2*((P$30/(B$30+(1-(1-(F$30/G$30))^2)*G$30*0.4))*(1-(((L4+M4*((B$30/C$30*(1-H$29/J$29))/(B$30/C$30*(1-H$29/J$29)+(1-B$30/C$30)*H$29/J$29))*(1-1.07*H$29/J$29)+I4*(1-((B$30/C$30*(1-H$29/J$29))/(B$30/C$30*(1-H$29/J$29)+(1-B$30/C$30)*H$29/J$29))))+(((C$30-B$30-M$29)*((B$30/C$30*(1-H$29/J$29))/(B$30/C$30*(1-H$29/J$29)+(1-B$30/C$30)*H$29/J$29))*(1-1.07*H$29/J$29)+N$30-L$29)*Q4/48/5+O4/O$29*0.4*G$30*(1-(F$30/G$30))^2))*5*48/Q4/(0.5*(C$29+0.4*G$29+N$29-1.07*H$29/J$29*(C$29-B$29)+C$30+0.4*G$30+N$30-1.07*H$30/J$30*(C$30-B$30)))))-P$30/(0.5*(C$29+0.4*G$29+N$29-1.07*H$29/J$29*(C$
29-B$29)+C$30+0.4*G$30+N$30-1.07*H$30/J$30*(C$30-B$30))))))/(0.32*(P$26/82/2))
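For anyone who would rather not untangle that spreadsheet cell, the team-level pieces of the Win Shares formulas can be sketched in a few lines of Python. This is only a rough sketch, not the spreadsheet itself: the function names and the sample team totals are made up for illustration, and only the formulas for Team_Scoring_Poss, Team_Play%, Team_ORB%, Team_ORB_Weight and ORB_Part are taken from the list above.

```python
def team_scoring_poss(fgm, ftm, fta):
    # Team_Scoring_Poss = Team_FGM + (1 - (1 - Team_FTM / Team_FTA)^2) * Team_FTA * 0.4
    return fgm + (1 - (1 - ftm / fta) ** 2) * fta * 0.4


def team_play_pct(fgm, fga, ftm, fta, tov):
    # Team_Play% = Team_Scoring_Poss / (Team_FGA + Team_FTA * 0.4 + Team_TOV)
    return team_scoring_poss(fgm, ftm, fta) / (fga + fta * 0.4 + tov)


def team_orb_pct(orb, opp_trb, opp_orb):
    # Team_ORB% = Team_ORB / (Team_ORB + (Opponent_TRB - Opponent_ORB))
    return orb / (orb + (opp_trb - opp_orb))


def team_orb_weight(orb_pct, play_pct):
    # Team_ORB_Weight, as defined in the list above
    num = (1 - orb_pct) * play_pct
    return num / (num + orb_pct * (1 - play_pct))


def orb_part(player_orb, orb_weight, play_pct):
    # ORB_Part = ORB * Team_ORB_Weight * Team_Play%
    return player_orb * orb_weight * play_pct


# Hypothetical season totals, invented purely for illustration.
scoring_poss = team_scoring_poss(fgm=3200, ftm=1500, fta=2000)
play_pct = team_play_pct(fgm=3200, fga=7000, ftm=1500, fta=2000, tov=1100)
orb_pct = team_orb_pct(orb=850, opp_trb=3500, opp_orb=800)
orb_weight = team_orb_weight(orb_pct, play_pct)

print(round(scoring_poss, 1), round(play_pct, 3), round(orb_pct, 3), round(orb_weight, 3))
print(round(orb_part(200, orb_weight, play_pct), 1))
```

The player-level parts (FG_Part, AST_Part and so on) follow the same pattern: each line in the list becomes one small function that takes box score totals and returns a number.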
Wins Produced
PROD = 3FGM*0.064 + 2FGM*0.032 + FTM*0.017 + FGMS*-0.034 + FTMS*-0.015 + REBO*0.034 + REBD*0.034 + TO*-0.034 + STL*0.033 + FTM(opp.)*-0.017 + BLK*0.020
FTM(opp.) = PF / Team PF * Team FT allowed
TDRBPM = MP * .034 * .504 * (Team DRB – Player DRB) / (Team Minutes Played – Player Minutes Played)
TAPM = FGA * .032586 * 2 * .725 * (Team Assists – Player Assists) / (Team Minutes – Player Minutes)
Team Defense Adjustment = [(3FGM(opp.)*-0.064 + 2FGM(opp.)*-0.031 + TO(opp.)*0.033 + TOTM*-0.034 + REBTM*0.033 – BLKTM*0.200) / Minutes Played] * 48
DEFTM48 = League Average Team Defensive Adjustment – Team Defensive Adjustment
ADJ P48 = PROD * 48 / MP + (Team TDRBPM * Player DRB / Team DRB - TDRBPM) + Player AST / Team AST * Team TAPM + DEFTM48
Position Average Adj. P48
Point Guards 0.191
Shooting Guards 0.158
Small Forwards 0.186
Power Forwards 0.256
Centers 0.296
WP48 = ADJ P48 - Average ADJ P48 + .099
Wins Produced = WP48 * MP / 48
(((((D4*0.064+E4*0.032+F4*0.017+(C4-B4)*-0.034+(G4-F4)*-0.015+H4*0.034+I4*0.034+N4*-0.034+L4*0.033+(O4/$O$29*$G$30)*-0.017+M4*0.02)+0.034*((S$2*I4/I$29)-S4))+K4/K$29*T$2-T4)*48/(Q4*82)+((D$30*-0.064+E$30*-0.031+N$30*0.033+N$29*-0.034+J$29*0.033-M$29*0.2)/(82*5)-(D$29*-0.064+E$29*-0.031+N$29*0.033+N$30*-0.034+J$30*0.033-M$30*0.2)/(82*5))/2-U4)+0.099)*Q4*82/48
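The last two steps of Wins Produced, going from ADJ P48 to WP48 to Wins Produced, are simple enough to sketch in Python. The position averages and the .099 constant come straight from the list above; the function names, the position abbreviations used as keys, and the sample player are assumptions made up for the example.

```python
# Position averages of ADJ P48, copied from the table above.
POSITION_AVG_ADJ_P48 = {
    "PG": 0.191,
    "SG": 0.158,
    "SF": 0.186,
    "PF": 0.256,
    "C": 0.296,
}


def wp48(adj_p48, position):
    # WP48 = ADJ P48 - Average ADJ P48 + .099
    return adj_p48 - POSITION_AVG_ADJ_P48[position] + 0.099


def wins_produced(adj_p48, position, minutes):
    # Wins Produced = WP48 * MP / 48
    return wp48(adj_p48, position) * minutes / 48


# Hypothetical center: ADJ P48 of .396 over 2400 minutes (not a real stat line).
example_wp48 = wp48(0.396, "C")
example_wp = wins_produced(0.396, "C", 2400)
print(round(example_wp48, 3), round(example_wp, 2))
```

A player whose ADJ P48 exactly matches his position average comes out at a WP48 of .099 regardless of position, which is why getting the position right matters so much for this stat.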
Note that, unlike in real life, we never have to worry about which position a player really is for Wins Produced purposes, because we've defined it from the get-go. As you've no doubt noticed, personal fouls only come into Win Shares and Wins Produced as a proportion of team fouls, so having no non-shooting fouls doesn't hurt us. Before we look at how the values change from season to season, I thought we could whet our appetite a little by looking at how a player's WS and WP in a given season were related, if at all:
That looks a lot like two distinct lines to me. What if we split them up so that bigs are blue, wings are green, and points are orange?
Yep. The wing and point distributions are slightly different but obviously much more similar to each other than to the bigs. As you can see above, these are wildly dissimilar approaches to assigning value, and yet the different approaches turn out to be extremely well correlated with each other. There's more than one way to skin an apple, as the human saying goes.
Alright so let's look at our stats, all values given as per 48 minutes:
Player   Avg WS   Max WS   Min WS   SD WS    Avg WP   Max WP   Min WP   SD WP
1        .0835    .1172    .0346    .0191    .1475    .2231    .0628    .0315
2        .0612    .1101    .0212    .0169    .1131    .1887    .0377    .0274
3        .0981    .1479    .0515    .0180    .1711    .2524    .1034    .0289
4        .1064    .1434    .0611    .0167    .1251    .1986    .0480    .0293
5        .1700    .2141    .1237    .0174    .2290    .3057    .1468    .0279
6        .0824    .1431    .0108    .0264    .1387    .2479    .0256    .0420
7        .0571    .1211    .0009    .0281    .1076    .2141    .0011    .0473
8        .0934    .1526    .0219    .0249    .1668    .2735    .0592    .0383
9        .1015    .1427    .0409    .0231    .1216    .1852    .0312    .0374
10       .1705    .2131    .1142    .0221    .2351    .3079    .1362    .0362
Total    .1024    .2141    .0009    .0430    .1556    .3079    .0011    .0557
Role  Pos   WS max-avg   WS avg-min   WS stdevp   MP      WP max-avg   WP avg-min   WP stdevp
s     p     .0337        .0489        .0191       2788    .0756        .0847        .0315
s     w     .0489        .0400        .0169       2788    .0756        .0754        .0274
s     w     .0498        .0466        .0180       2788    .0813        .0677        .0289
s     b     .0370        .0453        .0167       2624    .0735        .0771        .0293
s     b     .0441        .0462        .0174       2624    .0767        .0823        .0279
b     p     .0607        .0716        .0264       1148    .1092        .1131        .0420
b     w     .0640        .0562        .0281       1148    .1064        .1065        .0473
b     w     .0592        .0716        .0249       1148    .1067        .1076        .0383
b     b     .0412        .0606        .0231       1312    .0636        .0905        .0374
b     b     .0426        .0563        .0221       1312    .0728        .0990        .0362
x     x     .0481        .0543        .0213       x       .0841        .0904        .0346
(Role: s = starter, b = backup. Pos: p = point, w = wing, b = big. Rows are in the same order as players 1 through 10; the last row is the average.)
There's a lot going on here.
Most relevant to our interests, the standard deviation shrinks as minutes played go up: the more minutes a player plays, the less uncertain we should be about his composite stats. That's good! It would be really disconcerting (not to mention annoying) if it were the opposite! At the same time, slight differences in minutes played show only small effects: all starters have about the same standard deviations, and the same goes for all backups.
For each stat the distribution is not quite symmetrical: on average the gap from the mean down to the minimum (avg-min) is slightly larger than the gap from the mean up to the maximum (max-avg). The computations are clearly nonlinear, so it's not surprising to have nonlinear effects here, but it's nice to know that they're relatively small, and some players even saw the opposite. If we had to have different rules for different player qualities it would be annoying.
Tagging on to that point, there is no relationship between overall values in either stat and standard deviations. The shooting guard produced about a third of the Win Shares per 48 minutes of the center, but their standard deviations are almost identical. There was no a priori reason to believe this would happen but it did, which is very nice.
The backups and starters are extremely well correlated in both stats, with R^2s of .9981 and .9955 respectively. We did have an a priori reason to believe this would be the case, because the backups and starters are identical to each other, so it's good that this happened. We didn't expect perfect correlation because the whole point of this exercise was that randomness was involved, but nearly perfect is great.
Taking the average of the standard deviations for starters for each stat and doubling it gives us .0352 for Win Shares and .0580 for Wins Produced, which are pretty much .035 and .060. We can be 95% sure that any given player with X Win Shares per 48 minutes has a true talent level in the range X ± .035, and likewise Y ± .060 for a player with Y Wins Produced per 48. This is a pretty big range, about one third of the average Win Shares per 48, but nowhere near the on/off figure of ± 11 that covers almost every player in the NBA. The values for backups are 41% higher for Win Shares and 39% higher for Wins Produced. It's probably 40% in each case, and it's another surprise and good sign that the methodology scales in the same way for each stat. But you may be asking, why does Win Shares have so much less uncertainty? I'll tell you.
I don't know.
But it does, and now we know it does. I've always preferred Win Shares over Wins Produced anyway, because having to argue about whether Magic Johnson or Ben Simmons is really a point guard is nobody's idea of a good time, but now we have another reason to prefer it! So we've got that going for us.
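If you want to check the ± .035 and ± .060 figures yourself, the arithmetic is short. The sketch below just re-enters the per-48 standard deviations from the table (players 1 through 5 as starters, 6 through 10 as backups); the variable and function names are mine, not anything from the spreadsheet.

```python
from statistics import mean

# Per-48 standard deviations copied from the table above.
SD_PER_48 = {
    "Win Shares": {
        "starters": [0.0191, 0.0169, 0.0180, 0.0167, 0.0174],
        "backups": [0.0264, 0.0281, 0.0249, 0.0231, 0.0221],
    },
    "Wins Produced": {
        "starters": [0.0315, 0.0274, 0.0289, 0.0293, 0.0279],
        "backups": [0.0420, 0.0473, 0.0383, 0.0374, 0.0362],
    },
}


def summarize(sd):
    """Return (two-standard-deviation band for starters, how much wider backups are)."""
    starters = mean(sd["starters"])
    backups = mean(sd["backups"])
    return 2 * starters, backups / starters - 1


for stat, sd in SD_PER_48.items():
    band, extra = summarize(sd)
    print(f"{stat}: +/- {band:.4f} for starters, backups {extra:.0%} wider")
```

This reproduces the .0352 and .0580 bands and the 41% and 39% figures quoted above.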
Illustration: James Harden had 15 Win Shares last year, most in the league. How far down the leaderboard do we have to go before we find someone who was worse to a statistically significant degree? We take his WS/48 of .245, subtract .035*sqrt(2) because we have two independent noises, one on his value and one on whomever we are comparing him to, then turn that back into overall Win Shares and get 12. Next we take everyone else, add .035*sqrt(2), multiply back through by their minutes, and see that the first player who doesn't get up to 12 turns out to be #22 (!!!) DeMar DeRozan. It is proven that if you ain't first you're last, but it turns out there are twenty-some-odd players who are indistinguishable for first place.
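Here is the same procedure as a rough Python sketch, following the steps exactly as described. Harden's minutes are not quoted above, so the code backs them out of his 15 Win Shares and .245 per 48; the function names and the challenger's stat line are invented for the example and are not real data.

```python
from math import sqrt

# Per-48 margin used in the text: the +/- .035 band times sqrt(2).
MARGIN = 0.035 * sqrt(2)


def leader_floor(ws, ws48):
    """Leader's Win Shares after subtracting the margin from his WS/48."""
    minutes = ws / ws48 * 48  # minutes implied by the two quoted figures
    return (ws48 - MARGIN) * minutes / 48


def challenger_ceiling(ws48, minutes):
    """Challenger's Win Shares after adding the margin to his WS/48."""
    return (ws48 + MARGIN) * minutes / 48


floor = leader_floor(ws=15, ws48=0.245)
print(round(floor))  # 12

# Hypothetical challenger: .150 WS/48 in 2600 minutes.
ceiling = challenger_ceiling(ws48=0.150, minutes=2600)
print("distinguishable from the leader:", ceiling < floor)
```

Running `challenger_ceiling` down a real leaderboard and stopping at the first player who falls short of the floor is all the illustration does.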
Illustration: But 2017 was pretty blah stats-wise. Let's look at a historic stats year like 2016, when Steph Curry had one of the all time great regular seasons with 18 WS and .318 per 48. Surely he was first to a statistically significant degree?
Nope. Six other players had statistically indistinguishable years: Kevin Durant, Russell Westbrook, Kawhi Leonard, LeBron James, James Harden, Chris Paul. Granted those players are all pretty solid too, but what I'm getting at is even in a year where one player seems to stand head and shoulders above the rest it could just as easily have been noise.
And that's it! The moral of the story is if anyone says "it's X games/weeks/months into the season, it's not a small sample anymore" they're almost certainly wrong, and it's your right and duty as an American to demand they show you their calculations.
In bed.