[OC] Visual comparison of data for more than 10000 footballers in the top 7 European leagues across the last 7* seasons. (EXPLANATION & INTERACTIVE LINK IN COMMENTS)
August 9, 2024
https://public.tableau.com/app/profile/bb.throwaway/viz/Normalizedfootballdata/Instrumentpanel1
Normalized Football Data: “Normalized Football Data” is my platform for delivering advanced player rankings through a detailed analysis of performance metrics from the top seven European leagues over the past seven seasons (6 for Eredivisie and Primeira Liga/Portugal). My aim is to provide a fair and comprehensive comparison of players, regardless of league, team, or season.
How It (basically) Works:
Data Collection: I gather data from fbref.com, focusing on players who have played at least 33% of available minutes each season. This ensures the analysis reflects consistent and impactful contributions.
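As a rough sketch of what such a minutes filter could look like (assuming 90 minutes per match and ignoring stoppage time; the function name and sample players are hypothetical, not taken from the actual workbook):

```python
def meets_minutes_requirement(minutes_played, matches_in_season, threshold=1 / 3):
    """Return True if the player reached the required share of available minutes."""
    # Assumption: available minutes = matches * 90 (no stoppage time).
    available = matches_in_season * 90
    return minutes_played / available >= threshold


# Hypothetical players in a 38-match season (3420 available minutes).
players = [("Player A", 1200), ("Player B", 900)]
kept = [name for name, mins in players if meets_minutes_requirement(mins, 38)]
print(kept)  # ['Player A']
```

With the stricter 1/2 or 2/3 thresholds, the same function would simply be called with a different `threshold` value.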
Normalization Process:
Minutes Played: Statistics are adjusted on a per-minute basis for a more accurate comparison. In my experience, the scoring system becomes more accurate when metrics are normalized around players with more minutes played. You can currently choose between three minutes requirements (1/3, 1/2 or 2/3 of available minutes).
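A per-minute adjustment is often expressed "per 90 minutes". A minimal sketch of that conversion, assuming season totals and minutes played as inputs (the function name is made up for illustration):

```python
def per_90(total, minutes_played):
    """Convert a season total into a per-90-minutes rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 90 / minutes_played


# 10 goals in 1800 minutes is the same rate as 5 goals in 900 minutes.
print(per_90(10, 1800))  # 0.5
print(per_90(5, 900))    # 0.5
```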
League Elo Rating: I use a custom Elo rating to factor in the strength of each league. In essence, if a player in the Premier League (PL) scores 1.0 goals per 90, a player from the Eredivisie has to score 1.3 goals per 90 to achieve the same score.
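The post does not say how the Elo rating is turned into an adjustment, so the sketch below simply assumes a multiplicative factor per league. Only the 1.0 vs 1.3 ratio comes from the example above; the dictionary and function names are hypothetical.

```python
# Assumed league factors relative to the Premier League (1.0).
# Only the Eredivisie ratio (1.3 goals needed per 1.0 PL goal) is from the post.
LEAGUE_FACTOR = {
    "Premier League": 1.0,
    "Eredivisie": 1 / 1.3,
}


def league_adjusted(value_per_90, league):
    """Scale a per-90 value by the (assumed) strength factor of the league."""
    return value_per_90 * LEAGUE_FACTOR[league]


pl = league_adjusted(1.0, "Premier League")
ere = league_adjusted(1.3, "Eredivisie")
print(abs(pl - ere) < 1e-9)  # True: both end up with the same adjusted value
```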
Success Rate Adjustment: Metrics like dribbles and tackles are adjusted for their success rates to better represent player effectiveness.
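The exact formula is not given here. One possible reading, shown purely as an assumption, is to multiply the number of successful actions by the success rate, so that volume and efficiency both count:

```python
def success_adjusted(successes, attempts):
    """Assumed adjustment: successful actions weighted by their success rate."""
    if attempts <= 0:
        return 0.0
    return successes * (successes / attempts)


# Two hypothetical dribblers with the same number of successful dribbles.
print(success_adjusted(30, 40))  # 22.5  (75% success rate)
print(success_adjusted(30, 80))  # 11.25 (37.5% success rate)
```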
On-Field Expected Goals Against (xGA): For defensive metrics, I normalize based on goals conceded and expected goals against (xGA) while the player is on the field, giving a clearer picture of defensive impact. This means that a player with 10 interceptions and 5 onGA+onxGA (on-field goals against plus on-field xGA) scores higher than a player with 40 interceptions and 30 onGA+onxGA.
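The interception example can be checked with a simple ratio. This is only a sketch: it assumes the defensive count is divided by the sum of on-field goals against and on-field xGA, which matches the ordering in the example but may not be the exact formula used.

```python
def defensive_rate(actions, on_ga, on_xga):
    """Assumed normalization: defensive actions per unit of (onGA + onxGA)."""
    conceded = on_ga + on_xga
    if conceded <= 0:
        raise ValueError("onGA + onxGA must be positive")
    return actions / conceded


# Hypothetical splits of the totals from the example (5 and 30).
player_a = defensive_rate(10, on_ga=2, on_xga=3)     # 10 / 5
player_b = defensive_rate(40, on_ga=14, on_xga=16)   # 40 / 30
print(player_a > player_b)  # True
```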
Metric Weighting: I assign different weights to each performance metric based on its importance:
Non-Penalty Goals: Heavily weighted, up to 30 points.
Expected Goals (xG): Also highly valued, up to 20 points.
Assists, Key Passes, Progressive Carries, Defensive Actions etc.: These metrics are weighted at a maximum of either 10 or 15 points. Some minor “discipline” metrics, such as fouls committed (where fewer fouls give a higher score), are capped at 5 points.
These point values are my own, somewhat arbitrary choices, based on how I think each metric should be valued and on my experience with balancing the scoring system. I’m open to suggestions if you think any metric is wrongly valued.
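To illustrate how such caps could combine into a single score, here is a small sketch. It assumes every metric has already been scaled to a 0..1 range. The 30, 20 and 5 point caps come from the list above; which of the remaining metrics get 10 and which get 15 is not stated, so those two assignments are guesses.

```python
# Maximum points per metric. Assists=15 and key_passes=10 are assumed values.
MAX_POINTS = {
    "non_penalty_goals": 30,
    "xg": 20,
    "assists": 15,
    "key_passes": 10,
    "fouls_committed": 5,
}


def total_score(scaled):
    """Sum weighted points; `scaled` maps metric name -> value in 0..1."""
    return sum(MAX_POINTS[metric] * value for metric, value in scaled.items())


# Hypothetical player: best in goals, mid-table in xG and key passes.
example = {
    "non_penalty_goals": 1.0,
    "xg": 0.5,
    "assists": 0.0,
    "key_passes": 0.5,
    "fouls_committed": 1.0,
}
print(total_score(example))  # 50.0
```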
Feature Scaling: I use the statistical method of feature scaling to standardize various metrics, allowing for a unified scoring system.
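"Feature scaling" can mean several things. The sketch below assumes min-max scaling onto a 0..max_points range, with an optional inversion for metrics where lower is better (such as fouls). The function name and sample values are illustrative only.

```python
def min_max_scale(values, max_points, lower_is_better=False):
    """Min-max scale values onto 0..max_points (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale: all values are identical")
    scaled = [(v - lo) / (hi - lo) for v in values]
    if lower_is_better:
        # Invert so the smallest raw value receives the most points.
        scaled = [1 - s for s in scaled]
    return [s * max_points for s in scaled]


goals_per_90 = [0.0, 0.25, 0.5, 1.0]
print(min_max_scale(goals_per_90, 30))  # [0.0, 7.5, 15.0, 30.0]

fouls_per_90 = [1.0, 2.0, 3.0]
print(min_max_scale(fouls_per_90, 5, lower_is_better=True))  # [5.0, 2.5, 0.0]
```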
Application Features:
Filtering: You can dive deep into the data by filtering based on seasons, age, team, minutes played, nationality, league, role, and player names. This flexibility helps you focus on the specifics you’re interested in.
Comparisons: Compare multiple players side-by-side and see how they measure up against each other. The platform makes it easy to visualize their stats and understand their relative strengths.
Current Scope and Future Plans:
Right now, the data includes only league play, but I want to expand this to cover cup competitions, more leagues, and detailed goalkeeper statistics in the future.
With “Normalized Football Data,” I hope to help you uncover not just who the standout players are, but also what makes them excel.
PS. This text was mainly created by ChatGPT because English is not my native language and I’m really fucking tired. Cheers.