No matter the sport, studying and improving performance is a major asset for success, whether individual or collective. However, many factors influence performance, and sport scientists, coaches, and athletes themselves can control them only to varying degrees. For example, missing a rowing gold medal by a few centimeters because of a sub-optimal boat shape, losing a team game because the opposing team's strategy was not studied, or losing a 100m race by 3’’ because the starting position was not prepared well enough are all really frustrating results, considering that potential solutions were at our disposal.
This data challenge, on a smaller scale than the examples above, aims at studying one step of improvement in sport: the evaluation of performance.
Several steps and some preliminary questions are required to reach this goal:
- Knowing your data: visualize, explore, synthesize.
- Are there clusters of similar patterns among football players?
- Can we predict the number of goals scored?
- Can we assess the field position of the player?
- Can we create a score that evaluates the annual performance for each player?
- Can we draw a map of the performances for a team?
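As an illustration of the question about an annual performance score, here is a minimal sketch of one possible approach: standardize a few indicators across players and average them with equal weights. This is only one option among many, not a recommended scoring method. The field names (`minutes`, `assists`, `shots_per_game`, `pass_pct`), the choice of indicators, the equal weighting, and the three toy players with their values are all assumptions made up for the example.

```python
from statistics import mean, pstdev

# Toy records with invented values; field names are hypothetical,
# not the actual column names of the challenge database.
players = [
    {"name": "Player A", "minutes": 2700, "assists": 9, "shots_per_game": 2.5, "pass_pct": 82.0},
    {"name": "Player B", "minutes": 1800, "assists": 2, "shots_per_game": 1.0, "pass_pct": 88.0},
    {"name": "Player C", "minutes": 900, "assists": 1, "shots_per_game": 0.5, "pass_pct": 75.0},
]


def per_90(count, minutes):
    """Rate per 90 minutes played; 0.0 if the player never played."""
    return 90.0 * count / minutes if minutes > 0 else 0.0


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All players identical on this indicator: it carries no information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_scores(records):
    """Equal-weight average of standardized indicators, one score per player."""
    features = [
        [per_90(r["assists"], r["minutes"]) for r in records],
        [r["shots_per_game"] for r in records],
        [r["pass_pct"] for r in records],
    ]
    standardized = [z_scores(column) for column in features]
    return {
        r["name"]: mean(column[i] for column in standardized)
        for i, r in enumerate(records)
    }


scores = composite_scores(players)
# Player names ordered from highest to lowest composite score.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Standardizing first keeps an indicator measured in percent from dominating one measured in events per game. In practice the indicators and their weights would probably need to depend on the field position, since a defender and a striker are not expected to excel on the same statistics.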
The available data: name of the player, starting year of the championship (from 2009 to 2016), club, age, height, weight, field position, number of games played, number of minutes played, number of assists, number of yellow cards, number of red cards, average number of shots per game, average percentage of successful passes, average number of aerial duels won, and number of ‘man of the match’ awards.
The database is extracted from the website 'www.whoscored.com', and has been prepared by Adrien MARCK, who works at Insep (Institut National du Sport, de l'Expertise et de la Performance).
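To make the "knowing your data" step concrete, the listed variables could be represented as one record per player and season, with a few basic sanity checks run before any modeling. The sketch below is only an assumption about how such a record might look: the class name `PlayerSeason`, the attribute names, the types, and the `validate` helper are hypothetical and do not reflect the actual column names or units of the database. The example row uses invented values.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    """One player in one season; attribute names are hypothetical."""
    name: str
    season_start: int           # starting year of the championship
    club: str
    age: int
    height: float               # unit not stated in the challenge description
    weight: float               # unit not stated in the challenge description
    position: str
    games: int
    minutes: int
    assists: int
    yellow_cards: int
    red_cards: int
    shots_per_game: float
    pass_success_pct: float
    aerials_won_per_game: float
    man_of_the_match: int


def validate(row):
    """Return a list of human-readable problems; empty if the row looks sane."""
    problems = []
    if not 2009 <= row.season_start <= 2016:
        problems.append("season_start outside 2009-2016")
    if not 0.0 <= row.pass_success_pct <= 100.0:
        problems.append("pass_success_pct outside 0-100")
    if row.games < 0 or row.minutes < 0:
        problems.append("negative playing time")
    if row.games == 0 and row.minutes > 0:
        problems.append("minutes recorded without any game")
    return problems


# Invented example row, for illustration only.
example = PlayerSeason(
    name="Player A", season_start=2012, club="Club X", age=25,
    height=180.0, weight=75.0, position="Midfielder",
    games=30, minutes=2500, assists=6, yellow_cards=4, red_cards=0,
    shots_per_game=1.8, pass_success_pct=84.5,
    aerials_won_per_game=0.9, man_of_the_match=2,
)
issues = validate(example)
```

Checks like these are cheap and tend to surface extraction problems (shifted columns, missing values encoded as zeros) before they distort clustering or prediction results.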