January 2, 2012
Linear Weights
Ranking the Formulas

by Neil Paine

For better and for worse, ever since former South Florida Sun-Sentinel writer Dave Heeren created TENDEX in the 1960s, linear weights have been a standard trope in the world of basketball stats.

The basic concept is simple--assign some set of positive weights to "good" actions (points, rebounds, assists, etc.) and negative weights to "bad" ones (turnovers, fouls, etc.). Add up the weighted total, then perhaps divide by games or minutes, and the result should give a vague approximation of a player's skill. No matter how much complexity the formula adds (and the currently reigning champ in this department might be John Hollinger's PER), this fundamental principle of linearity remains the same.

Although subsequent nonlinear metrics (like BP's WARP) have emerged as superior to linear-weighted equations, linear weights still have their place because it is difficult to match their simplicity and ease of use--simply put, it's a lot easier to calculate TENDEX on the back of an envelope than Win Shares. For this reason, it is still valuable to know which linear weights formula is the best, and that means we must test each metric's accuracy.

More specifically, to determine the "best" formula I will use its ability to predict (or, more accurately, retrodict) game outcomes based on the minutes given to the players used by the two opponents. If a metric is to be of any value, a team composed of players who excel in it should consistently win out-of-sample games against teams composed of players whose expected metric performance is inferior.

The linear weights I tested are as follows:

  • NBA Efficiency* = ((pts+reb+ast+stl+blk)-((fga-fg)+(fta-ft)+tov))/mp

  • Old Win Score** (David Berri) = (pts+reb+stl+0.5*blk+0.5*ast-fga-0.5*fta-tov-0.5*pf)/mp

  • Game Score (John Hollinger) = (pts+0.4*fg-0.7*fga-0.4*(fta-ft)+0.7*orb+0.3*(reb-orb)+stl+0.7*ast+0.7*blk-0.4*pf-tov)/mp

  • APMVAL (Neil Paine) = (45*pts-35*(fga+0.44*fta)+18*reb+30*ast+72*stl+41*blk-75*tov-39*pf)/mp

  • Alternate Win Score (David Lewin/Dan Rosenbaum) = (pts+0.7*orb+0.3*(reb-orb)+stl+0.5*blk+0.5*ast-0.7*(fga-fg)-fg-0.35*(fta-ft)-0.5*ft-tov-0.5*pf)/mp

  • Sports Illustrated (David Sabino) = (pts+ft-(fta-ft)+2*3pm-(3pa-3pm)+2*fg-(fga-fg)+2*blk+2*orb+2*ast-2*tov+1.5*stl+1.5*(reb-orb))/mp

  • TENDEX = (pts+reb+ast+blk-tov-(fga-fg)-(fta-ft))/mp

  • Production = (20*fg+20*(reb-orb)+20*ast+15*orb+15*stl+15*blk+10*3pm+10*ft-20*tov-10*(fga-fg)-10*(fta-ft))/mp

  • Thibodeau (Tom Thibodeau) = (2*fg-(fga-fg)+ft-0.5*(fta-ft)+3*3pm-1.5*(3pa-3pm)+reb+ast-pf+stl-tov+blk)/mp

  • Nash (John Nash) = (0.5*pts+0.5*reb+0.5*stl+0.5*blk+2*ast-tov)/mp

  • Points Created (Bob Bellotti) = (pts+0.9*reb+1.1*ast+0.9*stl+0.9*blk-0.9*(fga-fg)-0.9*(fta-ft)-0.9*tov-0.45*pf)/mp

  • VORP (Kevin Pelton) = (pts+0.75*orb+0.25*drb+0.5*ast+stl+0.33*blk)/(1.5*(fga+(0.44*fta)+tov+((0.75*orb+0.25*drb+0.5*ast+stl+0.33*blk)/2)+(mp/4)))

  • Old SPM (Neil Paine) = -12.9+0.3*3pa36+0.7*ast36+0.3*blk36+0.6*drb36+0.1*mpg+0.4*orb36+0.4*pf36+1.2*pts36+2.9*stl36-1.5*tov36-1.3*tsa36

  • New SPM (Neil Paine) = -12.5+0.3*3pa36+0.8*ast36+0.7*blk36+0.6*drb36+0.1*mpg+0.3*orb36+0.4*pf36+1.1*pts36+2.2*stl36-1.5*tov36-1.2*tsa36

* The formula the NBA uses was popularized by Kansas City Star writer Martin Manley in his "Basketball Heaven" series from the late 1980s.
** Berri released a new version of Win Score in December 2011, but it was not included in this study.
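
As a rough illustration of how formulas like these are applied, the sketch below evaluates three of them (NBA Efficiency, Old Win Score and Alternate Win Score) on a single stat line. The `line` dictionary and its numbers are made up for the example, and no pace adjustment is applied, so the outputs are not comparable to the figures used in the study.

```python
# Hypothetical season totals for one player (illustrative numbers only).
line = {
    "pts": 1500, "fg": 550, "fga": 1200, "ft": 300, "fta": 400,
    "orb": 100, "reb": 500, "ast": 300, "stl": 80, "blk": 40,
    "tov": 200, "pf": 180, "mp": 2500,
}

def nba_efficiency(s):
    """Positive box-score events minus missed shots and turnovers, per minute."""
    good = s["pts"] + s["reb"] + s["ast"] + s["stl"] + s["blk"]
    bad = (s["fga"] - s["fg"]) + (s["fta"] - s["ft"]) + s["tov"]
    return (good - bad) / s["mp"]

def old_win_score(s):
    """Berri's original Win Score, per minute, as listed above."""
    total = (s["pts"] + s["reb"] + s["stl"] + 0.5 * s["blk"] + 0.5 * s["ast"]
             - s["fga"] - 0.5 * s["fta"] - s["tov"] - 0.5 * s["pf"])
    return total / s["mp"]

def alt_win_score(s):
    """Lewin and Rosenbaum's Alternate Win Score, per minute, as listed above."""
    drb = s["reb"] - s["orb"]  # defensive rebounds
    total = (s["pts"] + 0.7 * s["orb"] + 0.3 * drb + s["stl"]
             + 0.5 * s["blk"] + 0.5 * s["ast"]
             - 0.7 * (s["fga"] - s["fg"]) - s["fg"]
             - 0.35 * (s["fta"] - s["ft"]) - 0.5 * s["ft"]
             - s["tov"] - 0.5 * s["pf"])
    return total / s["mp"]

for name, fn in [("NBA Efficiency", nba_efficiency),
                 ("Old Win Score", old_win_score),
                 ("Alt Win Score", alt_win_score)]:
    print(f"{name}: {fn(line):.3f}")
```

The other per-minute formulas in the list follow the same pattern: a weighted sum of box-score totals divided by minutes played.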

The first test is to see how well each metric predicts game outcomes out-of-sample but within the same season. To accomplish this, I sorted every team game of the 2010-11 season into "even" and "odd" bins based on the order they came in the schedule (i.e., the Sixers' first game is "odd"; their second is "even"). Pace-adjusted player stats from even games were then used to predict the outcomes of odd games; if a player played fewer than 250 minutes in a half-season, he was given the league's average in each metric for the purposes of prediction.
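
The binning step can be sketched as follows. This is a minimal illustration, not the code used for the study: the function names, the `(game_id, home, away)` schedule format and the team codes are assumptions made for the example.

```python
from collections import defaultdict

def split_even_odd(schedule):
    """Label each team-game 'odd' or 'even' by its order in that team's schedule.

    `schedule` is a chronologically ordered list of (game_id, home, away).
    Returns {(game_id, team): "odd" or "even"}.
    """
    games_played = defaultdict(int)
    bins = {}
    for game_id, home, away in schedule:
        for team in (home, away):
            games_played[team] += 1
            bins[(game_id, team)] = "odd" if games_played[team] % 2 else "even"
    return bins

def rating_or_average(rating, minutes, league_avg, min_minutes=250):
    """Use the league average for players under the half-season minutes cutoff."""
    return rating if minutes >= min_minutes else league_avg

# Hypothetical three-game schedule.
schedule = [(1, "PHI", "MIA"), (2, "PHI", "BOS"), (3, "MIA", "BOS")]
bins = split_even_odd(schedule)
print(bins[(1, "PHI")], bins[(2, "PHI")])  # PHI's first and second games
```

Because the label is assigned per team, the same game can be "odd" for one club and "even" for its opponent.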

To show how this works, here's an example using NBA Efficiency and the New Year's Day 2011 game between the Heat and Warriors, an odd-numbered game for both teams:

Player    Team   MP   NBA Eff
Ellis      GSW   46.6   .459
Wright     GSW   39.7   .412
Lee        GSW   36.5   .597
Curry      GSW   35.9   .564
Radmanovic GSW   21.6   .481
Udoh       GSW   17.1   .349
Law        GSW   14.0   .338
Amundson   GSW   13.3   .342
Carney     GSW    7.8   .460
Williams   GSW    7.5   .419
Team            240.0   .467

Player    Team   MP   NBA Eff
James      MIA   41.5   .743
Wade       MIA   39.4   .688
Bosh       MIA   39.3   .523
Arroyo     MIA   26.9   .312
Anthony    MIA   22.4   .303
Chalmers   MIA   21.6   .322
Jones      MIA   21.3   .292
Ilgauskas  MIA   16.7   .488
Howard     MIA   10.9   .387
Team            240.0   .497

In this game, NBA Efficiency would have predicted a Heat victory (.497 to .467); since Miami actually won, the game would count as an accurate forecast for the metric. (Also note that Carney received a league-average rating--.460--as he did not play 250 minutes in even-numbered games last season.)
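
The article does not spell out how the "Team" figure is computed, but a minutes-weighted average of the player ratings reproduces both values, so the sketch below assumes that method. The function name is hypothetical, and Howard's minutes are taken as 10.9, the value listed for him in the Old SPM table, which brings Miami's minutes to 240.

```python
def team_rating(players):
    """Minutes-weighted average of per-minute ratings.

    `players` is a list of (minutes, rating) pairs for one team in one game.
    """
    total_minutes = sum(mp for mp, _ in players)
    return sum(mp * rating for mp, rating in players) / total_minutes

# (minutes, NBA Efficiency) pairs from the example game.
gsw = [(46.6, .459), (39.7, .412), (36.5, .597), (35.9, .564), (21.6, .481),
       (17.1, .349), (14.0, .338), (13.3, .342), (7.8, .460), (7.5, .419)]
mia = [(41.5, .743), (39.4, .688), (39.3, .523), (26.9, .312), (22.4, .303),
       (21.6, .322), (21.3, .292), (16.7, .488), (10.9, .387)]

gsw_rating = team_rating(gsw)
mia_rating = team_rating(mia)
predicted_winner = "MIA" if mia_rating > gsw_rating else "GSW"
print(f"GSW {gsw_rating:.3f}, MIA {mia_rating:.3f}, pick: {predicted_winner}")
```

Rounded to three places, the two results match the .467 and .497 quoted above.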

By contrast, here's how "Old SPM" treated the game:

Player    Team   MP   Old SPM
Ellis      GSW   46.6   1.86
Wright     GSW   39.7   3.35
Lee        GSW   36.5   3.06
Curry      GSW   35.9   4.64
Radmanovic GSW   21.6   2.80
Udoh       GSW   17.1  -4.25
Law        GSW   14.0  -3.13
Amundson   GSW   13.3  -5.82
Carney     GSW    7.8   0.28
Williams   GSW    7.5  -0.37
Team            240.0   1.51

Player    Team   MP   Old SPM
James      MIA   41.5   6.91
Wade       MIA   39.4   5.18
Bosh       MIA   39.3   0.23
Arroyo     MIA   26.9  -3.69
Anthony    MIA   22.4  -2.81
Chalmers   MIA   21.6   1.18
Jones      MIA   21.3  -1.10
Ilgauskas  MIA   16.7  -2.67
Howard     MIA   10.9  -3.89
Team            240.0   1.06

Based on the distribution of minutes, Old SPM would have predicted a victory for Golden State (1.51 to 1.06), so this game represents an incorrect pick for that metric. Here's how each metric performed in all 1,230 games from the 2011 regular season:

Metric         % Correct
Pts Created        68.5%
Old Win Score      68.5%
NBA Efficiency     68.3%
TENDEX             68.2%
Production         68.0%
Alt Win Score      67.9%
Thibodeau          67.4%
APMVAL             67.3%
Sports Illustrated 66.8%
Old SPM            66.4%
Game Score         66.1%
New SPM            65.7%
VORP               61.5%
Nash               61.2%

By this test, Bellotti's Points Created and Berri's Old Win Score performed the best, with each correctly predicting 68.5% of games last season.

For its part, Old Win Score also showed by far the highest correlation (0.81) with defensive rebounds per minute. Without a team defensive adjustment, a metric that correlates highly with DReb rate will naturally do better at predicting opposite-half games within the same season (team defensive performance is relatively stable from half to half, and a large number of defensive rebounds by definition means a large number of defensive stops).

Because of this, a better out-of-sample test might be to use metrics from one season to predict the following season. To that end, I took the same pace-adjusted metrics from every regular season between 1987 and 2011--save for the strange, lockout-shortened 1999 campaign--and ran a study similar to the one above, predicting every game's outcome and seeing which metric correctly predicted them at the highest rate. (Once again, players with fewer than 250 minutes in the preceding season were assigned the league's average rate in each metric.)
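
A season-to-season version of the test might look like the sketch below. It is only an outline under stated assumptions: the data layout (`prior` mapping a player to his previous-season minutes and rating), the function names and the sample values are all hypothetical, and players with no prior-season record are treated as having zero minutes.

```python
def prior_rating(player, prior, league_avg, min_minutes=250):
    """Prior-season rating, or the league average below the minutes cutoff."""
    minutes, rating = prior.get(player, (0, league_avg))
    return rating if minutes >= min_minutes else league_avg

def predict_game(box_a, box_b, prior, league_avg):
    """Pick 'a' or 'b' from each side's {player: minutes} in the game."""
    def team(box):
        total = sum(box.values())
        return sum(mp * prior_rating(p, prior, league_avg)
                   for p, mp in box.items()) / total
    # Assumption: an exact tie goes to side 'b'; the article does not say.
    return "a" if team(box_a) > team(box_b) else "b"

def percent_correct(picks, winners):
    """Share of games, in percent, where the pick matched the actual winner."""
    hits = sum(pick == winner for pick, winner in zip(picks, winners))
    return 100.0 * hits / len(picks)

# Hypothetical prior-season data: player -> (minutes, per-minute rating).
prior = {"X": (2000, 0.60), "Y": (100, 0.90), "Z": (1500, 0.40)}
pick = predict_game({"X": 30, "Y": 18}, {"Z": 24, "W": 24}, prior, 0.45)
print(pick, percent_correct(["a", "b", "a", "a"], ["a", "a", "a", "b"]))
```

Here player "Y" falls under the cutoff and "W" has no prior season, so both are rated at the league average of 0.45.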

In a sample of 26,486 games, here were the most accurate metrics:

Metric         % Correct
New SPM            63.2%
Alt Win Score      63.1%
Old SPM            63.1%
APMVAL             62.9%
Production         62.8%
Thibodeau          62.7%
Pts Created        62.6%
NBA Efficiency     62.5%
Old Win Score      62.4%
TENDEX             62.3%
Game Score         62.0%
Sports Illustrated 62.0%
VORP               61.1%
Nash               60.6%

Of course, even this is not necessarily the ultimate test of a metric. In 2009, I wrote a Basketball-Reference post that dealt with team continuity; in it, I posited that the best metric would be one that accurately predicted the fates of teams with a great deal of roster turnover. As I said then:

"Instead of predicting every team, maybe we should only focus on teams with obvious personnel changes, where the unpredictability is going to be highest because of changing roles and team dynamics. ... It seems to me that predicting these types of teams would be more informative than the typical squad, because your metric is going to have to rise or fall on its ability to anticipate how a player will react to a different role on the team. We're always talking about 'Holy Grails' when it comes to player ratings, and figuring out a way to effectively predict situations like this would go a long way toward developing the best metric for building future teams."

With that in mind, I calculated "continuity scores" for every team since 1987 (continuity score being the percentage of team minutes given to players who were on the team the previous season). I then looked at each metric's predictive performance in games involving teams who scored low in continuity:
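
The continuity score as defined here reduces to a simple ratio. A minimal sketch, with a hypothetical function name and made-up roster data:

```python
def continuity_score(minutes_this_season, roster_last_season):
    """Percent of team minutes played by holdovers from last season's roster.

    minutes_this_season: {player: minutes}
    roster_last_season:  set of players on the team the previous season
    """
    total = sum(minutes_this_season.values())
    returning = sum(mp for player, mp in minutes_this_season.items()
                    if player in roster_last_season)
    return 100.0 * returning / total

# Hypothetical team: A and C return, B is new, D has left.
score = continuity_score({"A": 3000, "B": 2000, "C": 1000}, {"A", "C", "D"})
print(f"{score:.1f}%")
```

In this example 4,000 of the team's 6,000 minutes went to returning players, for a continuity score of about 66.7%.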

Metric         Below Avg.   Bottom 1/2   Bottom 1/3   Bottom 1/4   Bottom 10%   Less than 50%
New SPM        63.98% ( 2)  63.83% ( 1)  64.47% ( 2)  64.38% ( 5)  64.93% ( 4)  64.69% ( 6)
Alt Win Score  64.07% ( 1)  63.82% ( 2)  64.48% ( 1)  64.64% ( 1)  65.01% ( 2)  64.94% ( 2)
Old SPM        63.68% ( 7)  63.61% ( 4)  64.15% ( 7)  64.07% ( 9)  64.55% ( 9)  64.38% ( 9)
APMVAL         63.91% ( 3)  63.69% ( 3)  64.22% ( 5)  64.32% ( 6)  64.73% ( 7)  64.73% ( 5)
Production     63.87% ( 4)  63.60% ( 5)  64.32% ( 3)  64.39% ( 4)  64.67% ( 8)  64.66% ( 7)
-------------------------------------------------------------------------------------------
Thibodeau      63.86% ( 5)  63.53% ( 6)  64.28% ( 4)  64.55% ( 2)  65.50% ( 1)  65.06% ( 1)
Pts Created    63.70% ( 6)  63.41% ( 7)  64.19% ( 6)  64.48% ( 3)  64.83% ( 6)  64.81% ( 3)
NBA Efficiency 63.47% ( 8)  63.23% ( 8)  64.02% ( 8)  64.25% ( 7)  64.95% ( 3)  64.77% ( 4)
Old Win Score  63.24% (10)  63.03% (10)  63.61% (10)  63.74% (10)  63.98% (11)  64.02% (10)
TENDEX         63.34% ( 9)  63.10% ( 9)  63.88% ( 9)  64.23% ( 8)  64.89% ( 5)  64.62% ( 8)
-------------------------------------------------------------------------------------------
Game Score     63.22% (11)  62.93% (11)  63.57% (11)  63.65% (11)  63.94% (12)  63.98% (11)
Sports Ill.    62.84% (12)  62.66% (12)  63.18% (12)  63.23% (12)  64.14% (10)  63.85% (12)
VORP           62.07% (13)  61.90% (13)  62.51% (13)  62.60% (13)  63.07% (13)  63.15% (13)
Nash           61.52% (14)  61.32% (14)  61.84% (14)  61.67% (14)  62.02% (14)  61.95% (14)
-------------------------------------------------------------------------------------------
(Rank) in parentheses

Key:
* Below Avg. = Prediction% in games where at least 1 team
  was below the 1987-2011 average in continuity
* Bottom 1/2 = % in games where at least 1 team was in the
  bottom half of 1987-2011 teams in continuity
* Bottom 1/3 = % in games where at least 1 team was in the
  bottom third of 1987-2011 teams in continuity
* Bottom 1/4 = % in games where at least 1 team was in the
  bottom fourth of 1987-2011 teams in continuity
* Bottom 10% = % in games where at least 1 team was in the
  bottom tenth of 1987-2011 teams in continuity
* Less than 50% = % in games where at least 1 team had a 
  continuity score under 50%
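
One way to build the subsets in the key is sketched below. The names and sample data are hypothetical, continuity is keyed by team for brevity (a real study would key it by team-season), and whether the cutoff team itself counts as "bottom" is an assumption.

```python
def bottom_fraction_cutoff(scores, fraction):
    """Highest continuity score that still falls in the bottom `fraction`."""
    ordered = sorted(scores)
    k = max(1, int(len(ordered) * fraction))
    return ordered[k - 1]

def low_continuity_games(games, continuity, is_low):
    """Keep games in which at least one of the two teams is low-continuity."""
    return [(a, b) for a, b in games
            if is_low(continuity[a]) or is_low(continuity[b])]

# Hypothetical continuity scores (percent) and matchups.
continuity = {"T1": 40.0, "T2": 60.0, "T3": 75.0, "T4": 90.0}
games = [("T1", "T4"), ("T2", "T3"), ("T3", "T4")]

quarter = bottom_fraction_cutoff(continuity.values(), 0.25)
bottom_quarter = low_continuity_games(games, continuity, lambda c: c <= quarter)
under_50 = low_continuity_games(games, continuity, lambda c: c < 50)
print(bottom_quarter, under_50)
```

Each metric's prediction percentage would then be recomputed over the filtered list of games only.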

As the sample narrows to games featuring teams with less and less continuity, some trends emerge. First, the adjusted plus-minus-based metrics (Old & New SPM, APMVAL), while clustered near the top of the overall rankings in all games, seem to decline in predictive accuracy when asked to handle teams composed of players in new roles and/or situations. Meanwhile, a metric like Tom Thibodeau's, which isn't especially accurate across all teams, appears to become a more effective predictor when low-continuity teams are involved. "Thibodeau" ranked just sixth out of fourteen metrics overall, but rose to second in games featuring teams in the bottom quarter in continuity, and was first in predicting games involving the bottom ten percent of teams in continuity.

The metric that emerged most strongly from these tests, though, is Alternate Win Score. AWS was devised by David Lewin & Dan Rosenbaum for their 2007 paper "The Pot Calling the Kettle Black," as an answer to Berri's original version of Win Score. Lewin & Rosenbaum showed that AWS outperforms various advanced metrics in terms of predicting future wins, a finding that this research seems to further reinforce. AWS was the second-most effective overall predictor of future wins, and unlike SPM or APMVAL, it lost little of its predictive power when asked to assess teams that saw heavy personnel turnover from the previous season.

For these reasons, I have to conclude that Alternate Win Score is the "best" basic linear-weighted metric currently in the public domain. Obviously a more complex, nonlinear metric such as WARP would be preferable to AWS, but if you need to assess player performance using a simple, back-of-the-envelope calculation, Alternate Win Score should probably be your first choice.

Neil Paine is an author of Basketball Prospectus.
