|
Post by eric on Aug 26, 2016 12:38:29 GMT -6
So every game awards a Player of the Game, and I wanted to figure out how it works, so what I did was set up an exhibition game with two actual players on each team. The team that wins is more likely to produce the POTG to a statistically significant degree. I don't know yet if that's because the winning team gets a bonus in the formula or if the winning team wins because it has players who play better on it, but I can neutralize it by only looking at the two players on the team that is awarded POTG.
Here is a set of four games that tells us something important about the formula. The POTG is listed first, then their teammate.
game player   fga fg 3pa 3p fta ft reb ast stl blk pf tov pts
1    POTG      12  5   0  0   2  2  13   2   1   0  0   0  12
1    teammate   5  0   0  0   2  1  14   2   1   0  4   0   1
2    POTG       6  2   0  0   1  0  16   0   2   0  0   3   4
2    teammate  13  6   0  0   2  0  12   0   2   0  1   3  12
3    POTG      10  5   0  0   1  0  11   1   1   0  1   1  10
3    teammate   8  4   0  0   6  1  13   1   1   0  0   1   9
4    POTG       8  5   0  0   0  0  12   0   2   0  1   2  10
4    teammate  10  1   0  0   2  2  12   0   2   0  1   2   4

If we assume that makes/attempts are irrelevant and only from rebounds to points matter, then...
game 1 tells us that 11 points are worth more than a rebound and four fouls,
game 2 tells us that 4 rebounds are worth more than a foul and eight points,
game 3 tells us that a foul and a point is worth more than two rebounds,
and game 4 tells us that 6 points is worth more than nothing.
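As a rough illustration of how each game becomes an inequality, here is a minimal Python sketch that subtracts the teammate's line from the POTG's line. The column order is the one in the header above. The function name `difference` is just a label made up for this example, not anything from the sim.

```python
# Column order taken from the header row in the post.
COLUMNS = ["fga", "fg", "3pa", "3p", "fta", "ft", "reb",
           "ast", "stl", "blk", "pf", "tov", "pts"]

def difference(potg, teammate):
    """Return {stat: potg - teammate} for every stat that differs."""
    return {name: p - t
            for name, p, t in zip(COLUMNS, potg, teammate)
            if p != t}

# Game 2 from the post: POTG line first, then the teammate line.
potg_game2 = [6, 2, 0, 0, 1, 0, 16, 0, 2, 0, 0, 3, 4]
mate_game2 = [13, 6, 0, 0, 2, 0, 12, 0, 2, 0, 1, 3, 12]

diff = difference(potg_game2, mate_game2)

# Keep only rebounds through points, as the post assumes at this stage.
tail = {k: v for k, v in diff.items() if k in COLUMNS[6:]}
print(tail)  # {'reb': 4, 'pf': -1, 'pts': -8}
```

Positive entries sit on the POTG side and negative entries on the teammate side, so this reads as 4 reb > pf + 8 pts.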
Because we're dealing with inequalities rather than equations, we can't subtract inequalities directly, but we can add the inequalities together, add the same thing to both sides of an inequality, and so on. Also important to note is that while it's not possible for two players to be awarded POTG, it is possible for the formula to tie and break that tie with a non box score element such as Player ID. Because of that all these inequalities are greater than or equal to, but to reduce typing I'm just going to put >.

Taking the first and third games, we find...
11 pts > reb + 4 pf
pf + pts > 2 reb

Let's multiply the third game by four, so we get...
11 pts > reb + 4 pf
4 pf + 4 pts > 8 reb

Now we add the two inequalities...
15 pts + 4 pf > 9 reb + 4 pf

And the foul terms are equal on each side so we subtract them off and get...
15 pts > 9 reb

If we do the same thing with the second and third games we end up with...
2 reb > 7 pts

Or put another way...
9 reb > 9/2 * 7 pts
9 reb > 31.5 pts

But that means that...
15 pts > 9 reb > 31.5 pts
15 pts > 31.5 pts
Which can only be true if "pts" has a negative modifier; that is, the more points a player scores the less likely they are to be POTG. But from game 4 we know that's not true. Therefore some or all of FG, FGA, 3P, 3PA, FT, and FTA matter to POTG. It took me 2668 games to even get this far, so it's pretty annoying that I'll have to control for those too, but I'm going to leave the Machine running all weekend and we'll see what happens.
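The scale-and-add steps above can be checked mechanically. This is only a sketch. It stores each game as "POTG side minus teammate side >= 0" and allows positive multipliers only, since a negative one would flip the inequality. The helper name `combine` is invented for the example.

```python
from fractions import Fraction

def combine(*scaled):
    """Add inequalities of the form sum(coef * stat) >= 0.

    Each argument is (multiplier, {stat: coef}). Multipliers must be
    positive so the direction of every inequality is preserved.
    """
    total = {}
    for mult, ineq in scaled:
        assert mult > 0, "a negative multiplier would flip the inequality"
        for stat, coef in ineq.items():
            total[stat] = total.get(stat, 0) + Fraction(mult) * coef
    # Drop stats whose coefficients cancelled out.
    return {s: c for s, c in total.items() if c != 0}

# Rebounds-to-points differences only, as assumed in the post.
game1 = {"pts": 11, "reb": -1, "pf": -4}   # 11 pts > reb + 4 pf
game2 = {"reb": 4, "pf": -1, "pts": -8}    # 4 reb > pf + 8 pts
game3 = {"pf": 1, "pts": 1, "reb": -2}     # pf + pts > 2 reb

a = combine((1, game1), (4, game3))   # 15 pts - 9 reb >= 0
b = combine((1, game2), (1, game3))   # 2 reb - 7 pts >= 0

# 2*a + 9*b cancels rebounds and leaves a statement about points alone.
c = combine((2, a), (9, b))           # -33 pts >= 0
print(a, b, c)
```

The last result, -33 pts >= 0, is the same contradiction as 15 pts > 31.5 pts with everything moved to one side and doubled.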
Thoughts and questions welcome!
|
|
|
Post by Cinco de Wardo on Aug 26, 2016 18:56:08 GMT -6
|
|
|
Post by eric on Sept 6, 2016 15:18:42 GMT -6
So 28101 games later I've made a little progress. Here are some games...
game 1: 3 stl > 6 fga + 1 reb
game 2: 1 fga + 5 stl > 1 reb
game 3: 2 reb + 1 stl > 1 fga

game 1 and game 2 give us
33 stl > 7 reb

game 2 and game 3 give us
1 reb + 6 stl > 0

putting those two together gives us that
75 stl > 0
stl > 0
.
game 4: 1 ast > 3 fga + 2 stl
game 5: 2 fga + 3 ast > 1 stl
which gives us
11 ast > 7 stl
ast > 7/11 stl
ast > 0
.
game 6: 0 > 4 fga + 1 ast + 1 stl
which gives us
0 > fga
.
So far so good!
|
|
|
Post by 20s Navidad on Sept 6, 2016 15:44:31 GMT -6
I normally follow your stuff, but I have no idea what you're talking about.
|
|
|
Post by eric on Sept 6, 2016 16:48:16 GMT -6
"I normally follow your stuff, but I have no idea what you're talking about."

So in any given game there will be one player of the game and one teammate of the player of the game. We can never see the criteria used to decide player of the game, only that one player gets it and one doesn't, therefore we can only ever know that a certain line score produced a POTGscore equal to or greater than their teammate's. The hope is that POTGscore is a linear formula, so something like...
2 score per point
3 score per assist
5 score per steal
-1 score per field goal attempt
...and so on.

If we simulate enough games, we will eventually come across games where very few box score stats are different between POTG and Teammate. This is important because unlike in a system of multivariable equations, the more different games we add together the farther apart the two sides of the inequality will be.

Let's focus on the second part of the second post:
game 4: 1 ast > 3 fga + 2 stl
game 5: 2 fga + 3 ast > 1 stl

What this means is that in game 4, the POTG had 1 more assist than their teammate, and their teammate had 3 more FGA and 2 more steals. By doubling game 4, tripling game 5, and adding the results together, we can make the FGA term cancel (6 FGA on each side) and are only left with 11 assists are worth more than 7 steals. If we have already proven that steals are worth more than zero POTGscore, then we have just now proven that assists are also worth more than zero POTGscore.

This in and of itself doesn't help us much, but it's a necessary first step for finding the inequalities that will. If we had instead proven that steals were negative and later proved that assists were positive, the 11 assists > 7 steals inequality wouldn't give us any new information, and any inequality of the form X assists > Y steals would tell us nothing.
.
I also looked at the data a little more and found...
reb > 0
0 > tov

So we're really cooking with gas now.
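To make the linear-formula idea concrete, here is a small Python sketch using the illustrative weights from the explanation above (2 per point, 3 per assist, 5 per steal, -1 per field goal attempt). Those weights are only an example of the form the formula might take, and the two player lines are invented for the demonstration.

```python
# Illustrative weights from the example in the post. They show the
# *shape* of a linear formula and are not the game's real values.
WEIGHTS = {"pts": 2, "ast": 3, "stl": 5, "fga": -1}

def potg_score(line):
    """Linear score: sum of weight * stat over the weighted stats."""
    return sum(WEIGHTS[stat] * line.get(stat, 0) for stat in WEIGHTS)

# Two hypothetical teammates (made-up lines, not from the sim data).
player_a = {"pts": 10, "ast": 4, "stl": 1, "fga": 8}
player_b = {"pts": 12, "ast": 1, "stl": 2, "fga": 11}

score_a, score_b = potg_score(player_a), potg_score(player_b)
winner = "A" if score_a >= score_b else "B"

# The experimenter never sees the scores, only `winner`. That single
# observation is the inequality: sum(weight * (A - B)) >= 0.
diff = {s: player_a[s] - player_b[s] for s in WEIGHTS}
print(score_a, score_b, winner, diff)
```

With these made-up numbers A wins 29 to 26, and the only thing an observer learns is 3 ast > 2 pts + 1 stl + 3 fga.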
|
|
|
Post by eric on Sept 7, 2016 14:04:06 GMT -6
Why This is Tricky
Here are three relations I have found by combining various games.
11 assists > 7 steals
9 assists > 4 rebounds
33 steals > 7 rebounds

Or for rough purposes,
1.5 assists > steal
2 assists > rebound
5 steals > rebound

At first glance this could lead one to believe that rebounds are worth the most, then steals, then assists, and it is in fact possible to find a solution that fits that belief: 4 POTGscore per rebound, 3 per steal, 2 per assist, which when plugged into the original inequalities results in
22 > 21
18 > 16
99 > 28

All of which are true. It is also possible, however, to find a solution that fits the exact opposite order. 1 POTGscore per rebound, 2 per steal, and 3 per assist results in
33 > 14
27 > 4
66 > 7
All of which remain true. Chaining inequalities like this can only get us so far: an assist is worth 4/9 of a rebound or more, and a steal is worth 11/7 of an assist or less, but chaining those gives us that a steal is worth 11/7 of (4/9 of a rebound or more) or less, and "or more, or less" literally covers every number in the universe. This is what we in the biz call "not helpful".
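The two candidate orderings can be checked against the three relations with a few lines of Python. This is a sketch: the names `RELATIONS`, `value` and `satisfies` are made up here, and >= is used because ties are possible.

```python
# The three relations from the post, as (left side, right side).
RELATIONS = [
    ({"ast": 11}, {"stl": 7}),   # 11 assists > 7 steals
    ({"ast": 9},  {"reb": 4}),   # 9 assists > 4 rebounds
    ({"stl": 33}, {"reb": 7}),   # 33 steals > 7 rebounds
]

def value(side, weights):
    """Total POTGscore of one side of a relation under the given weights."""
    return sum(count * weights[stat] for stat, count in side.items())

def satisfies(weights):
    """True if every relation holds under these weights."""
    return all(value(left, weights) >= value(right, weights)
               for left, right in RELATIONS)

rebounds_first = {"reb": 4, "stl": 3, "ast": 2}
assists_first = {"reb": 1, "stl": 2, "ast": 3}

print(satisfies(rebounds_first), satisfies(assists_first))  # True True
```

Both orderings pass, which is the point: these three relations alone do not rank the stats.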
The only way to really nail down a possible range when statistics share the same sign is to have two inequalities of the form
X steals > Y rebounds
Z rebounds > W steals
and then we would know that a steal is worth from Y/X to Z/W rebounds, and that a rebound is worth from W/Z to X/Y steals.
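Working the algebra through, X steals > Y rebounds gives steal > Y/X rebound, and Z rebounds > W steals gives steal < Z/W rebound. Here is a minimal Python sketch of that bound, assuming both stats have positive weight. The second observation in the example is invented, since no game of that form appears in the data so far.

```python
from fractions import Fraction

def steal_value_in_rebounds(x, y, z, w):
    """Bounds on one steal, measured in rebounds.

    Assumes both stats have positive weight and that we observed
        x steals   >= y rebounds   ->  steal >= (y/x) rebound
        z rebounds >= w steals     ->  steal <= (z/w) rebound
    """
    low, high = Fraction(y, x), Fraction(z, w)
    if low > high:
        raise ValueError("the two observations contradict each other")
    return low, high

# "33 steals > 7 rebounds" is from the post. "2 rebounds > 5 steals" is
# a made-up second observation used only to show the mechanics.
low, high = steal_value_in_rebounds(33, 7, 2, 5)
print(low, high)  # 7/33 2/5
```

Inverting both bounds gives the matching range for a rebound measured in steals.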
But because these statistics do share the same sign, they will mostly be on the left side of the inequality because all else being equal a player amassing more of them is more likely to be POTG. For example, I have recorded 25 games where three or fewer stats differed and one was rebounds: 17 of those saw the rebound on the POTG side compared to only 8 on the teammate side.
.
Long story short (too late) we're gonna get there, it's just gonna take even more data.
|
|
|
Post by eric on Sept 13, 2016 12:47:41 GMT -6
I've run the count out to 45,000ish games and haven't run across any contradictions yet, which is a good sign, but something else is going on and it's weird. In those 45,000 or so games, I've had four with the exact same box scores:
6 points, 13 boards, 2 steals, 1 personal foul. One player goes 3 for 5 with 2 assists and 1 turnover, and the other goes 3 for 8 with 3 assists and 3 turnovers.
And it gets worse! These are the only four games that differ by only field goal attempts, assists, and turnovers. The only time that happens, it has these exact differences and this exact box score.
Usually when you see such an enormously unlikely coincidence in an experiment like this, it means you're running up against the limits of the pseudo random number generator involved. That would be an easy explanation, but it really seems like the software simulates possession by possession, which is also a much easier way to code a basketball simulation.
So I have no idea why this is happening. The good news is that it doesn't sabotage any findings of the experiment, it just means some relationships might take much longer to derive or worst case scenario never be derivable. Hopefully there won't be many such relationships and I'll be able to just guess and check my way to them. I'll keep you posted!
|
|
|
Post by eric on Sept 26, 2016 10:08:10 GMT -6
I'm up to 60,000 box scores now and I'm giving up. The box score duplication effect had been linear for the first 50,000:
up to 10,000 had 89% unique
up to 20,000 had 77%
up to 30,000 had 64%
up to 40,000 had 50%
up to 50,000 had 40%
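For a rough sense of where that trend runs out, here is a straight-line least-squares fit to those five figures in plain Python. It assumes each percentage can be treated as a point at the game count it is listed against, which is a simplification for the sake of the sketch.

```python
# Percent of box scores that were unique, by game count (from the post).
games = [10_000, 20_000, 30_000, 40_000, 50_000]
unique_pct = [89, 77, 64, 50, 40]

n = len(games)
mean_x = sum(games) / n
mean_y = sum(unique_pct) / n

# Ordinary least-squares slope and intercept, written out by hand.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(games, unique_pct))
sxx = sum((x - mean_x) ** 2 for x in games)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Game count at which the fitted line reaches 0% unique.
zero_at = -intercept / slope
print(f"{slope * 10_000:.1f} points per 10,000 games, zero near {zero_at:,.0f}")
# -12.5 points per 10,000 games, zero near 81,200
```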
This suggested I could go out to 80,000 and still be getting actually new data, but up to 60,000 only had 11% unique, so I've probably passed a critical point and will not get any more data no matter how many sims I run. It's frustrating and weird, but in the end no data means no hope.

I'm going to post just those box scores where three or fewer categories differed in case anyone else wants to take a crack at it. Obviously posting all 60,000 some odd box scores would be impractical. The first column denotes which of [fga fgm 3pa 3pm fta ftm reb ast stl blk pf tov pts] are involved, the second is the game #, and the last is how specifically the box scores differed.
type          game #  result
0000000110010  10751  > ast 1 stl 2 tov 4
0000001000010   2856  reb 5 tov 2 >
0000001000100  42859  reb 7 pf 1 >
0000001000100  47470  reb 9 > pf 1
0000001010010  13123  reb 11 stl 2 > tov 1
0000001010010   5815  reb 3 stl 1 tov 1 >
0000001010010  49866  reb 3 stl 1 tov 1 >
0000001010100  40460  reb 2 stl 2 > pf 2
0000001100010  27764  reb 11 ast 1 > tov 2
0000001100010  13124  reb 3 ast 2 tov 3 >
0000001100010   1573  reb 9 ast 1 tov 4 >
0000001100100  21116  reb 3 ast 2 > pf 2
0000001110000  39814  reb 6 ast 1 stl 4 >
0000001110000   6049  reb 9 stl 1 > ast 2
0000100000010   4914  > fta 2 tov 2
0000100000110   1931  > fta 2 pf 2 tov 2
0000100100010  28902  ast 1 > fta 1 tov 2
0000101000010  15255  reb 5 > fta 1 tov 4
0000101100000  15965  reb 2 ast 1 > fta 2
0000101100000  10515  reb 6 ast 1 > fta 1
0010000010100   3955  stl 1 > 3pa 1 pf 2
0100001000001   9074  fgm 2 pts 4 > reb 2
1000000000110   9440  fga 3 pf 1 > tov 5
1000000000110   6528  pf 1 > fga 7 tov 4
1000000010000  23617  fga 1 stl 3 >
1000000010010  16387  fga 2 stl 1 > tov 2
1000000010010  13418  fga 3 stl 1 > tov 3
1000000010010  18565  stl 1 tov 1 > fga 9
1000000100010   5364  > fga 3 ast 1 tov 2
1000000110000  25668  > fga 4 ast 1 stl 1
1000000110000  23765  ast 1 > fga 3 stl 2
1000000110000  18276  fga 2 ast 3 > stl 1
1000001000000   2769  reb 2 > fga 6
1000001000010  17690  > fga 1 reb 5 tov 4
1000001000010  13620  > fga 2 reb 3 tov 4
1000001000010  16114  > fga 7 reb 1 tov 4
1000001000010  34478  fga 2 > reb 3 tov 4
1000001000010   9155  reb 4 tov 1 > fga 3
1000001000100  37182  fga 1 reb 1 > pf 1
1000001000100  36688  fga 1 reb 14 > pf 1
1000001000100  19767  reb 6 > fga 4 pf 1
1000001001000   6657  blk 1 > fga 6 reb 4
1000001001000  27950  reb 3 blk 1 > fga 3
1000001010000  46971  fga 1 reb 5 > stl 1
1000001010000   7195  fga 1 stl 5 > reb 1
1000001010000  13199  reb 2 stl 1 > fga 1
1000001010000  40218  stl 3 > fga 1 reb 3
1000001010000   4698  stl 3 > fga 6 reb 1
1000001100000  29398  fga 2 ast 3 > reb 2
1000001100000  55799  fga 2 reb 6 ast 1 >
1000001100000  13295  reb 10 ast 1 > fga 6
1000001100000  31629  reb 5 ast 1 > fga 2
1000100000100  12588  pf 1 > fga 3 fta 1
1000100000100  34675  pf 1 > fga 7 fta 2
1000100100000  27423  fta 2 ast 1 > fga 5
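For anyone who wants to work with these rows programmatically, here is a minimal Python parsing sketch. It assumes each row is whitespace-separated as "type game# result", that stats left of ">" favoured the POTG, and that stats right of it favoured the teammate. The function name `parse_row` is made up for the example.

```python
STATS = ["fga", "fgm", "3pa", "3pm", "fta", "ftm", "reb",
         "ast", "stl", "blk", "pf", "tov", "pts"]

def parse_row(row):
    """Parse 'type game# result' into (game, {stat: signed difference}).

    Stats left of '>' get a positive sign (POTG had more); stats right
    of it get a negative sign (teammate had more).
    """
    mask, game, *tokens = row.split()
    diff, sign = {}, 1
    it = iter(tokens)
    for tok in it:
        if tok == ">":
            sign = -1
            continue
        diff[tok] = sign * int(next(it))  # the count follows the stat name
    # Sanity check: the stats present should match the 1s in the type mask.
    flagged = {s for s, bit in zip(STATS, mask) if bit == "1"}
    if flagged != set(diff):
        raise ValueError(f"mask and result disagree in game {game}")
    return int(game), diff

print(parse_row("1000000110000 23765 ast 1 > fga 3 stl 2"))
# (23765, {'ast': 1, 'fga': -3, 'stl': -2})
```

The signed differences are in the same "POTG minus teammate >= 0" form used when adding inequalities together.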
|
|