Select Language

AI社区

公开数据集

世界足球实时数据馈送

世界足球实时数据馈送

2.68M
258 浏览
0 喜欢
0 次下载
0 条讨论
Arts and Entertainment,News,Sports,Football Classification

数据结构 ? 2.68M

    Data Structure ?

    * 以上分析是由系统提取分析形成的结果,具体实际数据为准。

    README.md

    Context This is the **first live data stream** on Kaggle providing a simple yet rich source of all soccer matches around the world **24/7 in real-time**. What makes it unique compared to other datasets? - It is the first live data feed on Kaggle and it is totally free - Unlike “Churn rate” datasets you do not have to wait months to evaluate your predictions; simply check the match’s outcome in a couple of hours - you can use your predictions/analysis for your own benefit instead of spending your time and resources on helping a company maximizing its profit - A Five year old laptop can do the calculations and you do not need high-end GPUs - Couldn’t make it to the top 3 submissions? Nevermind, you still have the chance to get your prize on your own - You can’t get accurate results on all samples? Do not worry, just filter out the hard ones (e.g. ignore international friendly) and simply choose the ones you are sure of. - Need help from human experts for each sample? Every sample comes with at least two opinions from experts - You wish you could add your complementary data? Just contact us and we will try to facilitate it. - Couldn’t win “Warren Buffett's 2018 March Madness Bracket Contest”? Here is your chance to make your accumulative profit. Simply train your algorithm on the first version of training dataset of approximately **11.5k** matches and predict the data provided in the following data feed. Fetch the data stream The CSV file is updated every 30 minutes at minutes 20’ and 50’ of every hour. **I kindly request not to download it more than twice per hour** as it incurs additional cost. You may download the csv data file from the following link from Amazon S3 server by changing the **FOLDER_NAME** as below, https://s3.amazonaws.com/FOLDER_NAME/amasters.csv *. Substitute the **FOLDER_NAME** with "**analyst-masters**" Content Our goal is to identify the outcome of a match as Home, Draw or Away. The variety of sources and nature of information provided in this data stream makes it a unique database. Currently, FIVE servers are collecting data from soccer matches around the world, communicating with each other and finally aggregating the data based on the dominant features learned from 400,000 matches over 7 years. I describe every column and the data collection below in two categories, **Category I – Current situation** and **Category II – Head-to-Head History**. Hence, we divide the type of data we have from each team to 4 modes, - Mode 1: we have both Category I and Category II available - Mode 2: we only have Category I available - Mode 3: we only have Category II available - Mode 4: none of Category I and II are available Below you can find a full illustration of each category. ***I. Current situation*** ---------- *Col 1 to 3:* Votes_for_Home Votes_for_Draw Votes_for_Away The most distinctive parts of the database are these 3 columns. We are releasing opinions of over 100 professional soccer analysts predicting the outcome of a match. Their votes is the result of every piece of information they receive on players, team line-up, injuries and the urge of a team to win a match to stay in the league. They are spread around the world in various time zones and are experts on soccer teams from various regions. Our servers aggregate their opinions to update the CSV file until kickoff. Therefore, even if 40 users predict Real-Madrid wins against Real-Sociedad in Santiago Bernabeu on January 6th, 2019 but 5 users predict Real-Sociedad (the away team) will be the winner, you should doubt the home win. Here, the “majority of votes” works in conjunction with other features. *Col 4 to 9:* Weekday Day Month Year Hour Minute There are over 60,000 matches during a year, and approximately 400 ones are usually held per day on weekends. More critical and exciting matches, which are usually less predictable, are held toward the evening in Europe. We are currently providing time in Central Europe Time (CET) equivalent to GMT +01:00. *. Please note that the 2nd row of the CSV file represents the time, data values are saved from all servers to the file. *Col 10 to 13:* Total_Bettors Bet_Perc_on_Home Bet_Perc_on_Draw Bet_Perc_on_Away This data is recorded a few hours before the match as people place bets emotionally when kickoff approaches. The percentage of the overall number of people denoted as “Total_Bettors” is indicated in each column for “Home,” “Draw” and “Away” outcomes. *Col 14 to 15:* Team_1 Team_2 The team playing “Home” is “Team_1” and the opponent playing “Away” is “Team_2”. *Col 16 to 36:* League_Rank_1 League_Rank_2 Total_teams Points_1 Points_2 Max_points Min_points Won_1 Draw_1 Lost_1 Won_2 Draw_2 Lost_2 Goals_Scored_1 Goals_Scored_2 Goals_Rec_1 Goal_Rec_2 Goals_Diff_1 Goals_Diff_2 If the match is between two teams in the same league or a group (e.g. a cup group) then details of teams are reported in the format “home” as index=1 and “Away” as index=2, respectively. The information provided contains, 1. League_Rank_1 & 2, Team ranks in the league 2. Total_teams, Total number of teams 3. Games_played_1 & 2, Games played for that league till now which shows if based on Total_teams there have been enough 4. matches to have meaningful rankings/points by now, e.g. if total_teams is 15 and only 10 matches played then it is just the beginning of the season. 5. Points_1 & 2 ; Max & Min_Points Max & Min_Points, indicating points for both teams, the strongest and weakest teams in the league respectively by that date 6. Won_Draw_Lost 1 & 2, Number of outcomes for that team in the group 7. Scored_Rec_Diff 1 & 2, Number of Goals scored, received and difference for that team till now ***II. Head-2-Head history*** ---------- *Col 37 to 38:* Team_1_Found Team_2_Found We search for these two teams’ history against each other. “Team_1_Found” and “Team_2_Found” are the team names found in their histories, IRRESPECTIVELY. For example, {'Man utd', 'Manchester United', 'Man united', 'Manchester U'} are all similar names referring to 'Manchester United FC'. Hence, you need to double check if these two columns are referring to the same teams, although we check them using string matching algorithms. *Col 39 to 40:* Rank_1 Rank_2 Here we provide rankings of either current or previous league of the teams. *Col 41 to 42:* Win_Perc_1 Win_Perc_2 In the past 15 matches that these teams played against similar teams (e.g. Team A Vs Team C & Team B Vs Team C) called Cross Comparison what was their win percentage. *Col 44 to 45:* Draw_Perc_1 Draw_Perc_2 In the past 15 matches that these teams played against similar teams called Cross Comparison what was their Draw percentage. *Col 43:* League_type_country What type of league or match does this information belong? A country league, international or FIFA world cup, etc. *Col 46 to 47:* Large_Diff_win Won_out_of_6 In the past 6 matches that Team_1 played against Team_2 how many times did one of them win the other? It is only non-zero if it is greater or equal to 4. If it is negative (e.g. -4) then they drew 4 times out of their 6 recent matches. *Col 48 to 49:* Avrage_FT_Goal Average_HT_Goal In the past 6 matches how many goals were scored in the average by both teams before Half-Time (HT) or Full-Time (FT)? *Col 50* Number_of_H2H_matches Since 2008, how many times did they play against each other including cup and friendly games? Note that team pairs with less than 6 H2H matches are hard to predict. *Col 51* Jumps What was the largest gap in years during the past 6 matches? If they played once in 2014 and in 2018, one of them was relegated. Jump values equal to or greater than 3 are harder to predict. *Col 52 to 54* Odds_Home Odds_Draw Odds_Away The probability of each outcome is represented in the form of European odds which also quotes the net total that will be paid out to the correct prediction relative to the stake. You can easily obtain the pr
    ×

    帕依提提提温馨提示

    该数据集正在整理中,为您准备了其他渠道,请您使用

    注:部分数据正在处理中,未能直接提供下载,还请大家理解和支持。
    暂无相关内容。
    暂无相关内容。
    • 分享你的想法
    去分享你的想法~~

    全部内容

      欢迎交流分享
      开始分享您的观点和意见,和大家一起交流分享.
    所需积分:0 去赚积分?
    • 258浏览
    • 0下载
    • 0点赞
    • 收藏
    • 分享