Let's Kick RPM Around A Little More
Post by eric on Jan 13, 2018 15:07:00 GMT -6
h/t ANK1990
ESPN's RPM is a composite stat that starts (mostly) from plus-minus and then does super secret math on it, math that plebes like you and me don't deserve to know, to split out which player is responsible for which pluses and minuses in each lineup. Great. Sounds great.
But hoooooo boy does it have issues.
For starters, let's look at the total minutes played and total wins credited by RPM, with Win Shares as a control, over the past five seasons:
Year    WS MP     WS Wins    RPM MP    RPM Wins
2014    595193    1256.4     588314     868.1
2015    595203    1256.9     591784     870.0
2016    594863    1255.6     578523    1170.3
2017    594404    1253.4     592964    1179.7
2018    302034     636.5     301614     593.6
Note there have been 595 actual wins so far this season, so Win Shares (636.5) is a little ahead of the game, but you can see how consistent it is in full years. About 3.5 wins of variation out of 1255ish is fantastic precision. Good job Win Shares. RPM, go to your room.
RPM, get back out here. What the f*** happened in 2016? How do you miss 16,000 minutes played? The answer is simple - 53 players just aren't listed for 2016 RPM.
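If you want to put numbers on those gaps yourself, here is a quick sketch in Python. The values are copied straight from the table above; the variable and function names are mine and purely illustrative.

```python
# Season totals copied from the table above: (WS MP, WS wins, RPM MP, RPM wins).
TOTALS = {
    2014: (595193, 1256.4, 588314, 868.1),
    2015: (595203, 1256.9, 591784, 870.0),
    2016: (594863, 1255.6, 578523, 1170.3),
    2017: (594404, 1253.4, 592964, 1179.7),
    2018: (302034, 636.5, 301614, 593.6),  # partial season
}


def missing_minutes(totals):
    """Minutes that Win Shares accounts for but RPM does not, by year."""
    return {year: ws_mp - rpm_mp for year, (ws_mp, _, rpm_mp, _) in totals.items()}


def full_season_ws_spread(totals, partial_years=(2018,)):
    """Largest minus smallest Win Shares win total, full seasons only."""
    wins = [ws_wins for year, (_, ws_wins, _, _) in totals.items()
            if year not in partial_years]
    return max(wins) - min(wins)


gaps = missing_minutes(TOTALS)
for year, gap in gaps.items():
    print(year, gap)
print("WS full-season spread:", round(full_season_ws_spread(TOTALS), 1))
```

On these numbers 2016 is the clear outlier at 16,340 missing minutes; the other years miss between 420 and 6,879.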
We talked before about how they abruptly and silently changed from doing Wins Over Replacement to just plain Wins in 2016, and we see the evidence again up there: RPM's win total jumps from about 870 to about 1170. That's pretty bad, but it turns out they've been monkeying with it in other ways too. Let's take a look at the max, min, and standard deviation of RPM and WS/48 over the last five seasons, looking only at players with 1000+ MP (500+ MP for 2018, since we're only halfway through) to keep things kosher.
        ------ WS/48 ------     ------- RPM -------
Year    Max     Min     SD      Max     Min     SD
2014    .295    -.036   .052    9.08    -8.44   2.98
2015    .288    -.026   .053    9.34    -6.87   2.94
2016    .318    -.049   .056    9.79    -6.27   2.77
2017    .278    -.023   .055    8.42    -5.69   2.57
2018    .302    -.057   .058    6.52    -5.40   2.19
Now, we expect the max and min to change year over year. Each season's best player isn't exactly as good as every other season's. Fine. But look at how the overall gap (max minus min) fluctuates relative to its five-year average:
Year    WS/48   RPM
2014     99%    116%
2015     94%    107%
2016    110%    106%
2017     90%     93%
2018    107%     79%
Win Shares goes up, goes down, it's random. Real Plus Minus, on the other hand, is clearly shrinking over time, and dramatically so this year. That's what we in the biz call systematic (as opposed to random) error: we're seeing an actual phenomenon, and the standard deviations by year show the same thing. Win Shares' barely changes; Real Plus Minus's is dramatically lower this year.
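The percentages can be reproduced from the max/min table if you read "the average" as the five-year mean of (max minus min). That reading is an assumption, though it matches every posted figure after rounding. A minimal Python sketch, with illustrative function names:

```python
# (max, min, standard deviation) copied from the table above.
WS48 = {
    2014: (0.295, -0.036, 0.052),
    2015: (0.288, -0.026, 0.053),
    2016: (0.318, -0.049, 0.056),
    2017: (0.278, -0.023, 0.055),
    2018: (0.302, -0.057, 0.058),
}
RPM = {
    2014: (9.08, -8.44, 2.98),
    2015: (9.34, -6.87, 2.94),
    2016: (9.79, -6.27, 2.77),
    2017: (8.42, -5.69, 2.57),
    2018: (6.52, -5.40, 2.19),
}


def gap_vs_average(stats):
    """Each year's (max - min) as a rounded percentage of the mean gap over all years."""
    gaps = {year: mx - mn for year, (mx, mn, _) in stats.items()}
    mean_gap = sum(gaps.values()) / len(gaps)
    return {year: round(100 * gap / mean_gap) for year, gap in gaps.items()}


def sd_change(stats, start=2014, end=2018):
    """Percent change in standard deviation between two seasons."""
    return round(100 * (stats[end][2] / stats[start][2] - 1), 1)


print(gap_vs_average(WS48))
print(gap_vs_average(RPM))
print(sd_change(WS48), sd_change(RPM))
```

The same data puts the 2014-to-2018 change in standard deviation at about +11.5% for WS/48 and about -26.5% for RPM.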
Well, Maybe It'll Get Better as the Year Goes On!
Maybe! But here's the thing: according to what ESPN has told us about RPM, it should do the opposite. One of the things these plus-minus stats do is use a prior, which means "okay we've got some data measured this year, but it's small, so we'll tell our formula that the real answer is probably the prior until we hear otherwise." Which is fine! It's a totally fine way of doing things.
So long as you don't change your priors midstream.
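To make the prior idea concrete, here is a toy sketch in Python. It is not ESPN's formula, which isn't public. The weighted-average form, the function name, the `prior_minutes` knob, and every number in it are assumptions, chosen only to show how a prior dominates early and fades as data piles up, and why swapping the prior moves the answer.

```python
def blended_rating(observed, minutes, prior, prior_minutes=1000):
    """Weighted average of this season's observed rating and a prior.

    prior_minutes is a made-up knob: the number of minutes of real data
    at which the observation counts exactly as much as the prior.
    """
    return (minutes * observed + prior_minutes * prior) / (minutes + prior_minutes)


# Hypothetical player who performs like a +6 all season.
# Compare a prior of +5 against a prior of 0 at several points in the year.
for minutes in (0, 250, 1000, 3000):
    with_plus5_prior = blended_rating(6.0, minutes, prior=5.0)
    with_zero_prior = blended_rating(6.0, minutes, prior=0.0)
    print(minutes, round(with_plus5_prior, 2), round(with_zero_prior, 2))
```

In this toy setup the same player reads as +5.2 after 250 minutes with the +5 prior but only +1.2 with the zero prior, so a prior of 0 for everyone squeezes early-season numbers toward the middle.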
And ESPN's (original?) prior was to use box score stats to model out a plus minus, so guys with great box score stats effectively get a bump to the relevant part of plus minus early in the season. I don't think I'm going out too far on a limb when I say guys like Steph Curry and James Harden have pretty good box score stats, so they shouldn't have any problem putting up great RPM numbers from the start...
...unless ESPN changed the prior for everyone to 0. In which case you might have guys with comparatively middling box score stats like Spencer Dinwiddie and Tyus Jones in the top ten of RPM.
Hey look! You do!
real plus minus
In memoriam
The Forgotten Boys of 2016
Andrea Bargnani
Anthony Bennett
Beno Udrih
Bryce Cotton
Bryce Dejean-Jones
Cameron Bairstow
Chris Copeland
Chris Johnson
Christian Wood
Chuck Hayes
Cory Jefferson
Coty Clarke
DeJuan Blair
Elijah Millsap
Elliot Williams
Erick Green
Gary Neal
Ian Clark
J.J. Hickson
J.J. O'Brien
James Ennis
Jared Cunningham
Jarnell Stokes
Jason Thompson
Jeff Ayres
Jimmer Fredette
Joe Harris
John Jenkins
Jordan Farmar
Jordan McRae
Jorge Gutierrez
Justin Harper
Keith Appling
Kendall Marshall
Kevin Martin
Kostas Papanikolaou
Lorenzo Brown
Mario Chalmers
Nate Robinson
Orlando Johnson
Phil Pressey
Rasual Butler
Ray McCallum
Russ Smith
Ryan Hollins
Sean Kilpatrick
Sonny Weems
Spencer Dinwiddie
Steve Novak
Thanasis Antetokounmpo
Tim Frazier
Toney Douglas
Tony Wroten
Hey Spencer Dinwiddie! From not even mentioned by the stat to top ten in two years. What a long spence dinwiddie it's been.
I have no idea why RPM missed these specific guys.
It kind of looks like minutes played plays into it:
every one of the 208 players with 1400+ minutes is in
only 3 of the 66 players with 1000-1399 minutes are missing
28 of the 65 players with 200 or fewer minutes are missing
...but Rakeem Christmas and Sam Dekker both made it with 6 minutes each.
It also kind of looks like midseason trades play into it:
21 of the 50 guys who were traded midseason are missing.
But the G-League All-Star (yes this is a real thing) Alex Stepheson and Briante Weber (easily a top ten VCU Ram in NBA history) made it after being on multiple teams for under 200 minutes each.
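For a side-by-side view, the counts quoted above can be turned into rates. A small Python sketch; the group labels and function name are mine, and the counts are the ones given in this post.

```python
# (missing from 2016 RPM, total players) for each group quoted above.
GROUPS = {
    "1400+ minutes": (0, 208),
    "1000-1399 minutes": (3, 66),
    "200 or fewer minutes": (28, 65),
    "midseason movers": (21, 50),
}


def missing_rate(missing, total):
    """Share of a group absent from the 2016 RPM list, as a percentage."""
    return round(100 * missing / total, 1)


rates = {name: missing_rate(*counts) for name, counts in GROUPS.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

That works out to 0%, 4.5%, 43.1% and 42.0%. Low minutes and changing teams each line up with roughly a four-in-ten chance of going missing, but the groups may overlap and neither one explains every case.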
I don't know.
It's bad.