These are some open datasets we think you might enjoy working on.
Pokémon started out as a Japanese video game and became a worldwide phenomenon. The link below is to a public API providing access to information about all Pokémon across all seven existing generations, including berries!
Two great projects for this dataset would be to create PokéBots: (1) a bot that you could compete against and (2) a bot that could help you train your Pokémon.
A link to the API: http://pokeapi.co/docsv2/#info
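As a starting point for either bot, the sketch below shows one way to pull a Pokémon's types and base stats from the API. The base URL, the `/pokemon/{name}/` path and the JSON field names reflect our understanding of the v2 API and should be checked against the docs linked above; `summarize`, `pokemon_url` and `fetch_pokemon` are hypothetical helper names.

```python
import json
from urllib.request import Request, urlopen

# Assumed base URL of the v2 REST API; verify it in the API docs.
API_ROOT = "https://pokeapi.co/api/v2"


def pokemon_url(name):
    """Build the (assumed) resource URL for one Pokémon, e.g. 'pikachu'."""
    return f"{API_ROOT}/pokemon/{name.lower()}/"


def summarize(payload):
    """Reduce a decoded Pokémon JSON payload to a few fields a bot might need."""
    return {
        "name": payload["name"],
        "types": [slot["type"]["name"] for slot in payload["types"]],
        "base_stats": {s["stat"]["name"]: s["base_stat"] for s in payload["stats"]},
    }


def fetch_pokemon(name):
    """Download and summarize one Pokémon (requires network access)."""
    request = Request(pokemon_url(name), headers={"User-Agent": "pokebot-sketch"})
    with urlopen(request, timeout=10) as response:
        return summarize(json.load(response))
```

With network access, `fetch_pokemon("pikachu")` should return a small dictionary that a bot can compare across opponents. Keeping `summarize` separate from the download makes it easy to test against saved responses.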
Some additional Pokémon resources are:
Reddit is one of the biggest social bulletin boards, hosting many communities. One of these is the datasets community, used for both requesting and publishing open datasets.
A link to the community: https://www.reddit.com/r/datasets/
Some examples are:
Open local budget is a project under The Public Knowledge Workshop aimed at making local authorities' budgets accessible to the public. While the project is in beta, many budgets are already online (some in a more accessible format than others) and can be browsed, one budget at a time, on the project's home page. Being in beta means that there is plenty of room for improvement - this is where you come in! Some project ideas based on this data are:
The budget data is available in this Google Drive folder.
Starting with Stack Overflow, the Stack Exchange network is a collection of Q&A websites, each dealing with a different topic - from programming to home improvement.
These vast knowledge bases, some containing several million answers, are available to download in XML format.
A link to the dataset: https://archive.org/details/stackexchange
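Because some of these XML files are very large, it helps to stream them rather than load them whole. The sketch below assumes the dump layout we are familiar with, where each post is a `<row>` element whose `PostTypeId` attribute is `1` for a question and `2` for an answer; verify this against the files you download. `count_post_types` is a hypothetical helper name and the sample rows are made up.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_post_types(source):
    """Stream a Posts.xml-style file and count rows per PostTypeId.

    `source` is a file path or a binary file object.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            counts[elem.get("PostTypeId", "unknown")] += 1
            elem.clear()  # free the row's attribute data once counted
    return counts


# Tiny in-memory stand-in for a real dump file (made-up rows).
sample = io.BytesIO(
    b'<posts>'
    b'<row Id="1" PostTypeId="1" Score="5" />'
    b'<row Id="2" PostTypeId="2" ParentId="1" Score="3" />'
    b'<row Id="3" PostTypeId="2" ParentId="1" Score="0" />'
    b'</posts>'
)
print(count_post_types(sample))
```

The same loop can collect scores, tags or answer text instead of counts by reading other attributes from each row.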
Some projects that you could attempt using this dataset are:
We provide a unique dataset of Facebook comments on statuses published by Israeli MKs during 2015-2016. In total there are about 5 million such comments, of which 1,600 are labeled according to the sentiment of the comment's text. A great challenge is to use the 1,600 labeled comments to infer the sentiment of all the other comments. In this folder you'll find the labeled data, some information about the labels, and the unlabeled data. This dataset was collected by the team of Kikar Hamedina, and they will be more than happy to help. Contact the data team if you would like us to put you in touch with them.
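One simple baseline for this challenge is to train a classifier on the labeled comments and apply it to the unlabeled ones. The sketch below is a small multinomial Naive Bayes in plain Python, not the Kikar Hamedina team's method. It assumes the labeled data can be loaded as `(text, label)` pairs; the example comments and the label names `positive` and `negative` are made up, so check the label information in the folder for the real values.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into runs of word characters (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def train(labeled):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of comments
    for text, label in labeled:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the most probable label for `text`, using add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior of the label
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up English examples; the real comments and label names will differ.
labeled = [
    ("great speech thank you", "positive"),
    ("well done keep going", "positive"),
    ("shame on you resign", "negative"),
    ("terrible decision shame", "negative"),
]
model = train(labeled)
print(predict(model, "thank you well done"))
print(predict(model, "resign shame"))
```

With only 1,600 labels, it is worth holding some of them out to measure accuracy before trusting predictions on the full set. A natural next step is self-training: add the unlabeled comments the model is most confident about to the training set and refit.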
Here are some additional resources which you can use to find open datasets. This is really just the tip of the iceberg, so if you don't find anything interesting here, it doesn't mean that it doesn't exist. If you have something specific in mind and need our help, mail us at data@datahack-il.com.