What’s a Kaggle Competition? I didn’t know, so I looked it up. Read on for what I learned, along with a list of active competitions.
First of all, what’s Kaggle?
Until a few months ago I didn’t know the answer to that question. If you don’t either, that’s okay; we’re going to answer it together. But first, you need a little background on this data science network.
Kaggle was founded in 2010 on the idea that data scientists need a place to come together and collaborate on projects. It has since grown into a network with more than 1,000,000 registered users and a safe place for data science learning, sharing, and competition.
Tapping into the human competitive spirit, Kaggle created a platform where organizations host competitions. These competitions have fueled new methods and techniques in data science and given the host organizations new insights from the data they provide.
Competitive person that I am, I was first drawn to the competition aspect, which made me want to learn the intricacies of a Kaggle competition.
How a Kaggle competition works
While combing through the Kaggle website and other informative articles, I found there are three basic steps in Kaggle Competitions.
1. Preparation: Each Kaggle competition has a host, and each host has to prepare and provide data. Along with the data, the host can give additional information such as a description, evaluation method, timeline, and prize for winning.
2. Experimentation: By now, you’ve had your morning coffee, you’ve read all the information in the overview 500 times, and you’re ready to win 1st place. Now is the time to experiment, submit, and learn. There are three ways to upload your work:
   - Kaggle Kernels
   - Manual Uploads
   - Kaggle API
   If you don’t want anyone to know what you’re doing, upload your experiments manually or through the Kaggle API. Kaggle Kernels are a way for competitors to share what they’ve accomplished and get feedback from their peers. Kernels will also give you ideas for how to conquer the data, and I suggest you go through some of the popular ones.
3. Results: Every Kaggle competition has a public and a private leaderboard. Be warned, the two can be VERY different. The public leaderboard is based on a small percentage of the test data, chosen by the host. Although it gives you a good idea of where you stand, it does not always reflect who will win and lose. The private leaderboard is what really matters. It is not calculated until the end of the competition, is based on a larger proportion of the data and, ultimately, decides the winners and losers.
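To see why the public and private leaderboards can disagree, here is a minimal sketch in Python. It is not Kaggle's actual scoring code. The 20% public share, the accuracy metric, the synthetic labels, and the function names `split_leaderboards` and `score_submission` are all assumptions made for illustration; in a real competition the host chooses the split and the evaluation method.

```python
import random


def split_leaderboards(n_rows, public_fraction, seed=0):
    """Randomly assign test rows to a public and a private subset.

    The split itself is an assumption: real hosts decide how it is made.
    """
    n_public = round(n_rows * public_fraction)
    if not 0 < n_public < n_rows:
        raise ValueError("both subsets must contain at least one row")
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return indices[:n_public], indices[n_public:]


def accuracy(y_true, y_pred, indices):
    """Fraction of correct predictions over the given row indices."""
    correct = sum(1 for i in indices if y_true[i] == y_pred[i])
    return correct / len(indices)


def score_submission(y_true, y_pred, public_idx, private_idx):
    """Score one submission separately on each leaderboard subset."""
    return {
        "public": accuracy(y_true, y_pred, public_idx),
        "private": accuracy(y_true, y_pred, private_idx),
    }


# Synthetic demo data: 1,000 hidden test labels and one noisy submission.
rng = random.Random(42)
y_true = [rng.randint(0, 1) for _ in range(1000)]
# Flip roughly 15% of the labels to imitate an imperfect model.
y_pred = [1 - y if rng.random() < 0.15 else y for y in y_true]

# Assumed split: 20% of rows feed the public board, the rest the private one.
public_idx, private_idx = split_leaderboards(len(y_true), public_fraction=0.2)
print(score_submission(y_true, y_pred, public_idx, private_idx))
```

Because the public score comes from a much smaller sample, it is noisier than the private score. Two submissions that look far apart on the public leaderboard can end up close together, or even swap places, once the private leaderboard is revealed.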
If you would like to dive deeper into the different competition types, formats, and datasets offered by Kaggle, take a look at Kaggle’s Help and Documentation.
Active Kaggle competitions
[Updated May 6, 2019]
Each Kaggle competition accepts submissions only for a limited time. This list does not show how much time is left to enter, or how difficult each posted dataset is. One rough way to gauge difficulty is to look at the prize: typically, the larger the prize, the more difficult or advanced the problem. You can also look at the type of competition. The four categories, with Kaggle’s description of each, are listed below.
- Featured: “These are full-scale machine learning challenges which pose difficult, generally commercially-purposed prediction problems.”
- Research: “Research competitions feature problems which are more experimental than featured competition problems.”
- Getting Started: “These are semi-permanent competitions that are meant to be used by new users just getting their foot in the door in the field of machine learning.”
- Playground: “These are competitions which often provide relatively simple machine learning tasks, and are similarly targeted at newcomers or Kagglers interested in practicing a new type of problem in a lower-stakes setting.”
I will try my best to keep this list as up-to-date as possible. Unfortunately, I’m not spending all my time on Kaggle’s website. So if you see something has ended, or a new competition has been added, please leave a comment below. Thanks and have fun!
- Two Sigma Using News to Predict Stock Movements
- Type: Featured
- Teams: 2,902
- Prize: $100,000
- LANL Earthquake Prediction
- Type: Research
- Teams: 573
- Prize: $50,000
- Google Landmark Recognition 2019
- Type: Research
- Teams: 96
- Prize: $25,000
- Google Landmark Retrieval 2019
- Type: Research
- Teams: 96
- Prize: $25,000
- Freesound Audio Tagging 2019
- Type: Research
- Teams: 521
- Prize: $5,000
- Digit Recognizer
- Type: Getting Started
- Teams: 2,680
- Prize: Knowledge
- Titanic: Machine Learning from Disaster
- Type: Getting Started
- Teams: 10,234
- Prize: Knowledge
- House Prices: Advanced Regression Techniques
- Type: Getting Started
- Teams: 4,443
- Prize: Knowledge
- ImageNet Object Localization Challenge
- Type: Research
- Teams: 31
- Prize: Knowledge
- Predict Future Sales
- Type: Playground
- Teams: 2,170
- Prize: Kudos
- iMaterialist
- Type: Research
- Teams: 36
- Prize: Kudos
- iNaturalist
- Type: Research
- Teams: 120
- Prize: Kudos
- iWildCam 2019 – FGVC6
- Type: Research
- Teams: 159
- Prize: Kudos
- iMet Collection 2019 – FGVC6
- Type: Research
- Teams: 369
- Prize: Kudos
- Aerial Cactus Identification
- Type: Playground
- Teams: 507
- Prize: Knowledge
- TMDB Box Office Prediction
- Type: Playground
- Teams: 971
- Prize: Knowledge
Know more about Kaggle competitions