Free Public Datasets for Your Data Science Project in 2024

Data science is about finding interesting insights and stories in data, so the quality of your project depends heavily on the quality of your dataset. That’s why in this article I am going to share free public datasets for your data science project.

Thankfully, there are various online sources where you can get free, openly available datasets. In most cases you just need to download a dataset and start using it in your project.

So without further ado, let’s get started.

1. Data.gov

Data.gov is the US government’s open data repository. You can use its datasets for research and for data science projects such as data visualizations and mobile applications. Many datasets can be downloaded without registering on the site, but some require you to accept a license agreement first. You can search for a specific dataset or browse datasets by topic.
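If you prefer to search the catalog from code, here is a minimal sketch. It assumes that catalog.data.gov exposes the standard CKAN `package_search` action at the URL below; the function names are my own and not part of any official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Data.gov catalog serves the standard CKAN search action here.
CKAN_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Build a CKAN package_search URL for a free-text query."""
    return f"{CKAN_SEARCH_URL}?{urlencode({'q': query, 'rows': rows})}"


def dataset_titles(response):
    """Pull the dataset titles out of a parsed package_search response."""
    return [package["title"] for package in response["result"]["results"]]


def search_datasets(query, rows=5):
    """Call the catalog over HTTP and return the matching dataset titles."""
    with urlopen(build_search_url(query, rows), timeout=30) as resp:
        return dataset_titles(json.load(resp))
```

Calling `search_datasets("air quality")` should return a short list of dataset titles, provided the endpoint is reachable from your machine.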

2. Google Cloud Public Datasets

Google has a cloud hosting service named Google Cloud Platform (GCP), where you can explore public datasets using BigQuery. To find patterns and insights more easily, you can build data visualizations and interactive dashboards with Google Data Studio (since renamed Looker Studio).

You need a GCP account to access the data. The public datasets on GCP include data from GitHub, the United States Census Bureau, NASA, the Bitcoin blockchain, and the US Department of Transportation.
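As an illustration, the sketch below queries one of the public tables from Python. It assumes the `google-cloud-bigquery` package is installed, that your GCP credentials are already configured, and that the `usa_names` public table is still published under this name; the helper names are hypothetical.

```python
# Assumption: this public table exists with "name" and "number" columns.
PUBLIC_TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_query(limit=10):
    """Return SQL that ranks first names by total count in the public table."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{PUBLIC_TABLE}`\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def top_names(project_id=None, limit=10):
    """Run the query on BigQuery and return (name, total) pairs."""
    # Imported here so the rest of the sketch works without the package.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    rows = client.query(build_top_names_query(limit)).result()
    return [(row["name"], row["total"]) for row in rows]
```

Queries against public datasets still count toward your own project’s BigQuery usage, so it is worth keeping `limit` small while you experiment.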

3. Kaggle

Kaggle is one of the best-known platforms for data science, with roughly 68,000 public datasets that you can download for free. You need to create an account, and then you can look for a specific dataset using the search bar. You can also publish your own datasets on Kaggle, and other community members can vote on them and run kernels (notebooks and scripts) against them.

On Kaggle, you can also take part in various competitions and download the competition datasets.
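Downloads can also be scripted. The sketch below wraps the official `kaggle` command-line tool; it assumes the tool is installed (`pip install kaggle`) and that your API token is already in place. The dataset slug in the usage note is a placeholder, not a real dataset.

```python
import subprocess


def kaggle_download_command(dataset_slug, dest="data"):
    """Build the CLI command that downloads and unzips a Kaggle dataset."""
    # Kaggle identifies datasets as "<owner>/<dataset-name>".
    if dataset_slug.count("/") != 1:
        raise ValueError("expected a slug of the form 'owner/dataset-name'")
    return ["kaggle", "datasets", "download", "-d", dataset_slug, "-p", dest, "--unzip"]


def download_dataset(dataset_slug, dest="data"):
    """Run the download; raises CalledProcessError if the CLI reports failure."""
    subprocess.run(kaggle_download_command(dataset_slug, dest), check=True)
```

For example, `download_dataset("some-owner/some-dataset")` would place the unzipped files in a local `data` folder, where the slug is whatever appears in the dataset’s Kaggle URL.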

4. UCI Machine Learning Repository

The UCI Machine Learning Repository hosts public datasets for machine learning and data science. Its best feature is that datasets are tagged by task, such as classification, regression, and recommender systems.

These categories make it easier to find what you need. The datasets on UCI are contributed by various people, so if you are a machine learning practitioner, the repository is worth checking.

You don’t need to register on the site; you can download datasets directly from the UCI Machine Learning Repository.
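Because no login is needed, a UCI dataset can be read straight from its URL. Here is a rough sketch using the classic Iris file; the URL and the column names are assumptions on my part (the raw file has no header row), so check the dataset page before relying on them.

```python
import csv
import io
from urllib.request import urlopen

# Assumption: the Iris data file is still served from this path.
IRIS_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# The file has no header, so these column names are supplied by hand.
COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def parse_iris(text):
    """Turn the raw comma-separated text into a list of row dictionaries."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines such as trailing newlines
            continue
        *measurements, species = record
        values = [float(v) for v in measurements] + [species]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def load_iris(url=IRIS_URL):
    """Download the file and parse it."""
    with urlopen(url, timeout=30) as resp:
        return parse_iris(resp.read().decode("utf-8"))
```

If you already use pandas, `pandas.read_csv(IRIS_URL, header=None, names=COLUMNS)` does the same job in one line.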

5. AWS Public Datasets

Amazon has a huge number of datasets available in its open data registry. You can download the datasets and use them for your project, or analyze the data in the cloud on Amazon Elastic Compute Cloud (Amazon EC2).

Some of the popular datasets available on Amazon are the full Enron email dataset, Google Books n-grams, the NASA NEX datasets, and the Million Song Dataset.

6. Quandl

Quandl (now part of Nasdaq Data Link) is a platform for financial, economic, and alternative data. Some datasets are free, while others must be purchased. You can use Quandl datasets for tasks such as stock price prediction or forecasting economic indicators.

Stock exchange data from India is freely available on Quandl. If you search carefully, you will find some good free datasets.

7. The World Bank 

The World Bank is a global development organization that publishes open datasets. On its site you will find several data resources, such as DataBank, the Open Data Catalog, and the Microdata Library.

You can also browse datasets by region and country.
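World Bank indicators can also be fetched per country through its public Indicators API. The sketch below is a minimal example under a few assumptions: the v2 endpoint layout shown here, a JSON response shaped as a two-item list (paging information followed by the records), and `SP.POP.TOTL` as the total population indicator code. The function names are my own.

```python
import json
from urllib.request import urlopen

# Assumption: version 2 of the Indicators API is served from this root.
API_ROOT = "https://api.worldbank.org/v2"


def indicator_url(country, indicator, per_page=100):
    """Build the URL for one indicator and one country code (e.g. 'IN')."""
    return (
        f"{API_ROOT}/country/{country}/indicator/{indicator}"
        f"?format=json&per_page={per_page}"
    )


def parse_indicator(payload):
    """Map each period label (e.g. '2021') to its value, skipping missing ones."""
    _page_info, records = payload
    return {
        record["date"]: record["value"]
        for record in (records or [])
        if record["value"] is not None
    }


def fetch_indicator(country, indicator):
    """Download one indicator series for one country."""
    with urlopen(indicator_url(country, indicator), timeout=30) as resp:
        return parse_indicator(json.load(resp))
```

For instance, `fetch_indicator("IN", "SP.POP.TOTL")` should return India’s population by year, as long as the API still responds in this shape.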

8. Indian Government Open Datasets

data.gov.in is a website run by the Government of India where you can find free datasets from various sectors such as climate, health care, transport, education, and the economy.

9. Earthdata

Earthdata is run by NASA and provides datasets related to the Earth and space. If you are looking for this kind of data, Earthdata is a good place to start. You will find datasets covering the Earth’s atmosphere, oceans, cryosphere, geomagnetism, and solar flares.

The site is organized into sections such as Find Data, Use Data, and Visualize Data.

10. Awesome Public Datasets

This is a GitHub repository that lists datasets from various domains such as agriculture, biology, climate and weather, complex networks, computer networks, economics, education, and finance. Most of the datasets are freely available.

And here the list ends. These are 10 sources of free public datasets for your data science project. I would suggest you bookmark this article for future reference. Now it’s time to wrap up.

Conclusion

In this article, I tried to cover 10 sources of free public datasets for your data science project. If you have any doubts or questions, feel free to ask me in the comment section.

All the Best!

Enjoy Learning!

Thank YOU!

Thought of the Day…

‘It’s what you learn after you know it all that counts.’

John Wooden

Written By Aqsa Zafar

Founder of MLTUT, Machine Learning Ph.D. scholar at Dayananda Sagar University. Research on social media depression detection. Create tutorials on ML and data science for diverse applications. Passionate about sharing knowledge through website and social media.
