JSON datasets
Here are some JSON datasets that I gathered from various repositories around the Internet.
I have found them very useful for practicing with NoSQL databases, especially MongoDB, CouchDB, or Elasticsearch.
Archive | Number of documents | Extracted from | Initial format | Features |
---|---|---|---|---|
dblp.json | 118 015 | dblp.org | XML | Text, JSON array, nesting, ref |
Tour-Pedia_paris.json | 25 357 | tour-pedia.org (Paris: accommodation, restaurants, POI) | CSV, REST API, joins | 2D, text, nesting, JSON array, links, social network |
cities15000 | 22 948 | geonames (bl.ocks.org) | CSV | 2D |
Movies | 10 000 | Amazon Cloud Search | JSON | Text, nesting, JSON array |
restaurants | 25 357 | Restaurant inspections in New York (used in MongoDB tutorials) | JSON | Nesting, 2D, links |
reuters_26-09-1997 | 21 495 | Reuters-21578 in 1987 | XML | Text, nesting |
ArtWorks_collectionMaster | 31 558 + 3 538 | Tate collection from GitHub | CSV x2, 2 collections for joins | Text, nesting, JSON array, links |
companies | 18 801 | JSonAr | JSON, 2D conversion | Text, JSON array, 2D, nesting |
ottawa-json | 30 225 | JSonAr | JSON | 2D, text, nesting |
stocks | 6 576 | JSonAr | JSON | |
enron | 5 929 | JSonAr | JSON | Text, JSON array |
world_bank | 500 | JSonAr | JSON | Text, JSON array, nesting, links |
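
Before importing one of these files into a database, it can help to read it in a script and check the document count against the table. The sketch below is only an illustration: the on-disk layout of each archive is not stated here, so it assumes a file is either a single JSON array or one JSON document per line, and `load_documents` is a hypothetical helper name, not part of any of these datasets or databases.

```python
import json
from pathlib import Path


def load_documents(path):
    """Return a list of documents read from a JSON file.

    Assumption: the file is either one JSON array of documents,
    or "JSON Lines" (one JSON document per non-empty line).
    """
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        return []
    if text.startswith("["):
        # The whole file is a single JSON array.
        return json.loads(text)
    # Otherwise, parse each non-empty line as its own document.
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

For example, `len(load_documents("dblp.json"))` should match the count listed above if the file follows one of the two assumed layouts. A file containing a single pretty-printed object spread over several lines would need different handling.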