A repository is a storage location where things can be stored and retrieved.
A big data repository is a place where big data is stored and made available
for big data analytics, mining and similar work. Because such repositories
already exist, there is no need to build your own massive data repository
before starting with big data analytics. The following are the top 10
repositories for big data.
Google Trends is a public web facility provided by Google Inc., based on
Google Search. It shows how often a specific search term is entered relative
to the total search volume across various regions of the world. Google Trends
presents the result as a graph: the horizontal axis represents time, and the
vertical axis represents how often a term is searched for relative to the
total number of searches.
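The relative measure described above can be sketched in a few lines of Python. The weekly counts below are invented for illustration, and rescaling the series so that the peak reads 100 is an assumed presentation choice, not a description of Google's actual pipeline.

```python
def relative_interest(term_counts, total_counts):
    """Return search interest per period, scaled so the peak period is 100.

    term_counts[i]  -- searches for the term in period i (hypothetical data)
    total_counts[i] -- all searches in period i (hypothetical data)
    """
    if len(term_counts) != len(total_counts):
        raise ValueError("both series must cover the same periods")
    # Share of all searches that were for this term, per period.
    shares = [t / total if total else 0.0
              for t, total in zip(term_counts, total_counts)]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0 for _ in shares]
    # Rescale so the busiest period reads 100 (assumed presentation choice).
    return [round(100 * s / peak) for s in shares]


# Invented weekly figures: raw counts peak in week 3, but total traffic
# also grows, so the relative curve peaks in week 2 instead.
term = [50, 100, 120, 90]
total = [1000, 1000, 2000, 3000]
interest = relative_interest(term, total)
print(interest)
```

The point of dividing by the total is visible in the sample data: the week with the most raw searches is not the week with the highest relative interest.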
Data.gov http://data.gov
Data.gov is a U.S. government website for obtaining datasets, launched in late
May 2009 by Vivek Kundra, then the Federal Chief Information Officer (CIO) of
the United States. It hosts data on a wide range of subjects, including
climate, business, education and agriculture.
Healthdata.gov https://www.healthdata.gov/
Healthdata.gov provides a comprehensive catalog of data sets relevant to all
aspects of health, all available for free.
Facebook Graph https://developers.facebook.com/docs/graph-api
Many Facebook users make at least part of their profile public. Facebook
provides the Graph API as a way of querying the huge amount of information
that users choose to share with the world.
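As a rough sketch of what such a query looks like, the snippet below only builds a Graph API request URL and sends nothing over the network. The version string, the node ID "me", the field names and the token placeholder are assumptions chosen for illustration; the Graph API documentation linked above has the current versions and permission rules.

```python
from urllib.parse import urlencode

GRAPH_HOST = "https://graph.facebook.com"


def graph_url(node_id, fields, access_token, version="v19.0"):
    """Build a Graph API GET URL for one node (no request is sent).

    version and the example arguments below are illustrative assumptions.
    """
    query = urlencode(
        {"fields": ",".join(fields), "access_token": access_token},
        safe=",",  # keep the comma-separated field list readable
    )
    return f"{GRAPH_HOST}/{version}/{node_id}?{query}"


# "YOUR_ACCESS_TOKEN" is a placeholder, not a working credential.
url = graph_url("me", ["id", "name"], "YOUR_ACCESS_TOKEN")
print(url)
```

Fetching the URL with any HTTP client would return JSON for the requested fields, provided the token grants access to them.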
Google Finance https://www.google.com/finance
Google Finance is a website launched on March 21, 2006 by Google Inc. It
provides up-to-date, real-time stock data and also aggregates Google News and
Google Blog Search articles about each corporation.
New York Times http://developer.nytimes.com/docs
The New York Times offers a searchable, indexed archive of news articles. The
archive is available to developers through the New York Times APIs.
DBPedia http://wiki.dbpedia.org
Wikipedia contains millions of pieces of data on almost every subject. DBPedia
is a project that extracts this data into a public, freely distributable
database that anyone can analyze.
Google Books Ngrams http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
When you enter words into the Google Books Ngram Viewer, it searches and
analyzes the full text of the millions of books digitized as part of the
Google Books project.
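To make the term concrete: an n-gram is a run of n consecutive words. The sketch below counts n-grams in a short invented sentence. It illustrates the idea only and is not how Google processes its book corpus; the function name and sample text are made up for this example.

```python
from collections import Counter


def count_ngrams(text, n):
    """Count every run of n consecutive words in text (case-insensitive)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    words = text.lower().split()
    # zip over n shifted copies of the word list to form sliding windows.
    windows = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)


# Invented sample text; a real corpus would be millions of books.
sample = "the cat sat on the mat and the cat slept"
bigrams = count_ngrams(sample, 2)
print(bigrams.most_common(1))
```

Running the same count per publication year, and dividing by the number of n-grams in that year, would give the kind of frequency-over-time curve the Ngram data is used for.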
Amazon Web Services public datasets http://aws.amazon.com/datasets
Amazon Web Services hosts various public data sets that anyone can access for
free. The data sets are hosted in one or both of two formats: Amazon Elastic
Block Store snapshots and Amazon Simple Storage Service buckets.
The CIA World Factbook https://www.cia.gov/library/publications/the-world-factbook/
The CIA World Factbook provides facts on the history, population, economy,
government, infrastructure and military of 267 countries and other world
entities.