Post by eric on Apr 6, 2016 15:17:38 GMT -6
So when there aren't enough manually created players for a draft, the software will fill in randomly generated prospects... but are they randomly generated, or is there some higher power at work?
So I generated 200 drafts' worth of players (just under 20,000) and took a look. The first thing is that every position has its own minimum and maximum for every quantity, and these are almost never the same as that quantity's hard min and max:
Maximums by position:
 PG   SG   SF   PF    C  stat
 77   80   83   84   90  Height
219  240  255  320  320  Weight
 23   23   23   23   23  Age
 85  100   89  100   99  InsideScoring
 90   95   90   85   85  JumpShot
 75   75   80   55   65  ThreePointShot
 90   80   80   79   75  Handling
 80   60   60   60   60  Passing
100   85   75   65   55  Quickness
 55   55   55   80   85  PostDefense
 65   80   55   55   45  PerimeterDefense
 55   60   55   55   55  DriveDefense
 80   80   70   60   60  Stealing
 55   55   65   80   85  ShotBlocking
 50   55   55   65   70  OffenseRebound
 50   55   55   65   70  DefenseRebound
 65   75   85   95  100  Strength
 85  100   95   85   65  Jumping
Minimums by position:
 PG   SG   SF   PF    C  stat
 66   73   78   78   81  Height
150  160  210  200  202  Weight
 18   18   18   18   18  Age
 10   10   15   15   15  InsideScoring
 40   45   40   35   35  JumpShot
 20   25   15   10    5  ThreePointShot
  5    5    5    5    5  Handling
 25   15   15   15   15  Passing
 65   55   45   35   25  Quickness
 10   10   10   15   25  PostDefense
 20   10   10   10   10  PerimeterDefense
 10   10   10   10   10  DriveDefense
 35   35   25   15   15  Stealing
 10   10   20   15   20  ShotBlocking
 10   10   10   20   25  OffenseRebound
 10   10   10   20   25  DefenseRebound
 25   35   45   55   65  Strength
 35   55   45   35   30  Jumping
These kiiiind of make sense in that they follow what we'd generally think about positions (e.g. centers are strong and slow, point guards are weak and fast), but note that a software PG can have as low as 5 Handling, which is pretty crazy. It's also critical to note that these distributions are not symmetric, so the average values are not simply the midpoints. As an illustration, here are the point guard's max, avg, and min:
.
So that's how things look overall, but does the software treat each quantity independently? Consider two popular strategies for character creation in Dungeons and Dragons:
1. Roll three six-sided dice for every stat. Every stat therefore has a minimum of three (three times one) and a maximum of eighteen (three times six). Though unlikely (1 out of 216 tries for any single stat, so 1 out of 216^6, roughly 10^14, for all six at once), it is possible to have a character start at 18/18/18/18/18/18 (the Kina Grannis) or 3/3/3/3/3/3 (the Derrick Rose).
2. A player has a certain number of points to spend, for instance 28. Characters start with a minimum of 8 in every stat; the first six increases in a stat cost one point each, the next two cost two each, and the last two cost three each. This makes it impossible to start with more than one stat at 18 or more than two stats at 16, and so on, but it also makes it impossible for characters to start with 3s. The possible range for the sum of the six stats is constrained from 18 - 108 (average 63) to 48 - 76 (average not well defined), or even further to 68 - 76 if you force players to spend every point (and note how all of those values are above the random method's average).
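For anyone who wants to check the arithmetic on those two methods, here is a rough Python sketch. The function names (cost, reachable_totals) are just mine for illustration, and the point-buy rules are exactly the ones described in the list above (28 points, floor of 8, costs of 1/2/3 per step).

```python
from fractions import Fraction

# Method 1: three six-sided dice per stat.
p_single_18 = Fraction(1, 6) ** 3      # all three dice come up 6
p_all_six_18 = p_single_18 ** 6        # six independent stats all at 18


# Method 2: point buy with the costs described in the post.
def cost(score):
    """Points needed to raise one stat from 8 up to `score` (8..18)."""
    total = 0
    for step in range(9, score + 1):
        if step <= 14:
            total += 1      # first six increases
        elif step <= 16:
            total += 2      # next two
        else:
            total += 3      # last two
    return total


def reachable_totals(budget, n_stats=6, spend_all=False):
    """Every possible sum of the stats for a given point budget."""
    by_spent = {0: {0}}     # points spent so far -> set of stat sums
    for _ in range(n_stats):
        nxt = {}
        for spent, sums in by_spent.items():
            for score in range(8, 19):
                new_spent = spent + cost(score)
                if new_spent <= budget:
                    nxt.setdefault(new_spent, set()).update(
                        s + score for s in sums
                    )
        by_spent = nxt
    if spend_all:
        return by_spent.get(budget, set())
    return set().union(*by_spent.values())


any_spend = reachable_totals(28)
full_spend = reachable_totals(28, spend_all=True)
print(p_single_18, p_all_six_18)
print(min(any_spend), max(any_spend))
print(min(full_spend), max(full_spend))
```

The point is just that the second method ties the stats together (buying one high stat eats the budget for the others), while the first treats every stat on its own.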
Because our quantities have 95 possibilities instead of 6, it would be extremely difficult to generate enough data to demonstrate that the supermax build was impossible: 1 out of 857,375 tries in the best case (where each attribute has a flat distribution, and we already know that's not true). Instead, let's look at the power forward's Outside Scoring grade, which takes as inputs the Jump Shot and Three Point Shot attributes. They have ranges of [35 to 85] and [10 to 55] respectively. We will count how many times a power forward was created with a given Jump Shot value, then see what the maximum and minimum Three Point Shot were for each Jump Shot value:
JumpShot count ThreePointMax ThreePointMin
35 14 55 22
36 31 53 17
37 25 50 12
38 25 53 13
39 27 53 10
40 55 55 10
41 39 55 10
42 65 54 11
43 53 55 12
44 61 55 10
45 61 54 11
46 85 54 13
47 76 54 10
48 72 52 12
49 60 50 11
50 92 55 10
51 85 53 11
52 81 54 12
53 85 54 10
54 104 54 10
55 82 54 10
56 73 54 10
57 106 54 11
58 95 54 11
59 94 53 11
60 100 55 10
61 83 55 10
62 82 55 10
63 96 53 10
64 107 54 10
65 84 54 11
66 90 54 10
67 107 54 10
68 112 54 10
69 104 54 10
70 110 55 10
71 76 55 11
72 88 54 10
73 111 55 10
74 110 55 10
75 65 55 12
76 109 54 10
77 77 53 11
78 68 54 10
79 85 55 11
80 41 53 11
81 16 32 16
82 11 34 11
83 11 30 11
84 25 34 11
85 4 27 24
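We can see some flickers at the extreme ends of the Jump Shot distribution, but that's because the data dries up at those extremes (the counts there are 25 or fewer at the top end and 14 at the very bottom), not because of some underlying function. Through the well-populated middle, every Jump Shot value sees essentially the full 10 to 55 range of Three Point Shot, which is what we'd expect if the two attributes are rolled independently.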
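If you want to build the same kind of table from your own export, the grouping step is short. This is only a sketch: the (jump_shot, three_point_shot) pairs below are made up for illustration, and range_by_group is a name I picked, not anything from the software.

```python
from collections import defaultdict


def range_by_group(pairs):
    """Map each x value to (count, max y, min y) over all (x, y) pairs."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {
        x: (len(ys), max(ys), min(ys))
        for x, ys in sorted(groups.items())
    }


# Hypothetical sample of (jump_shot, three_point_shot) pairs, NOT real data.
sample = [
    (35, 22), (35, 55),
    (36, 17), (36, 53), (36, 40),
    (85, 24), (85, 27),
]

table = range_by_group(sample)
for jumper, (count, three_max, three_min) in table.items():
    print(jumper, count, three_max, three_min)
```

If the second attribute were tied to the first (like a point-buy budget), the max column would slope down as Jump Shot goes up instead of sitting flat.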
.
As with training camps, draft creation attributes are not on flat distributions, although they are also not correlated with training camp distributions. Here are three examples for the power forward using moving ten-point averages:
So values will be clustered near or slightly below the average, and there is no hard and fast rule about the length of the tails. Inside Scoring effectively cuts off at 60 (max 100, min 15), Jump Shot at 80 (max 85, min 35), Shot Blocking at 60 (max 80, min 15).
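Here is a small sketch of the smoothing, for anyone who wants to redo it. I'm assuming "moving ten-point average" means a plain sliding window over ten consecutive attribute values, which may not be exactly how the charts were made. The counts are the power forward Jump Shot counts from the table earlier in the post (values 35 through 85); moving_average is just an illustrative name.

```python
def moving_average(values, window=10):
    """Sliding-window mean; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]   # slide by one value
        out.append(running / window)
    return out


# Count column from the PF Jump Shot table, Jump Shot 35..85.
jump_shot_counts = [
    14, 31, 25, 25, 27, 55, 39, 65, 53, 61,
    61, 85, 76, 72, 60, 92, 85, 81, 85, 104,
    82, 73, 106, 95, 94, 100, 83, 82, 96, 107,
    84, 90, 107, 112, 104, 110, 76, 88, 111, 110,
    65, 109, 77, 68, 85, 41, 16, 11, 11, 25, 4,
]

smoothed = moving_average(jump_shot_counts, window=10)
for start, avg in zip(range(35, 35 + len(smoothed)), smoothed):
    print(f"{start}-{start + 9}: {avg:.1f}")
```

Smoothing like this hides the value-to-value noise, so the slow climb through the middle and the drop-off past 80 are easier to see.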
.
The end!