Do you need some real data to use in your projects?
Why not use real data from the best sources?!
Our goal is to let every developer find and use the best possible dataset for their projects, quickly and easily.
Make sure you have git and Docker Desktop installed.
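If you would like to confirm the prerequisites before cloning, a small check like the one below can help. It is only a sketch: the function name `missing_tools` is ours, and it assumes the commands are exposed on your PATH as `git`, `docker`, and `docker-compose`.

```python
import shutil


def missing_tools(tools=("git", "docker", "docker-compose")):
    """Return the command names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Please install: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```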
$ git clone https://github.com/diashenrique/iris-kaggle-socrata-generator.git
$ cd iris-kaggle-socrata-generator
$ docker-compose build
$ docker-compose up -d
For this initial release, we are using the Socrata APIs to search for and download a specific dataset.
Open your preferred API tool, such as Postman or Hoppscotch.
GET> https://api.us.socrata.com/api/catalog/v1?only=dataset&q=healthcare
This endpoint will return all healthcare-related datasets, like the image below:
Now, get the ID. In this case, the ID is “n9tp-i3k3”.
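If you prefer a script over an API tool, the same search can be sketched in Python using only the standard library. The function names are ours, and the code assumes the catalog response is a JSON object whose "results" entries each carry a "resource" object with "id" and "name" keys. Please treat it as a starting point rather than a definitive client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CATALOG_URL = "https://api.us.socrata.com/api/catalog/v1"


def build_search_url(query):
    """Build the catalog search URL, restricted to datasets."""
    params = {"only": "dataset", "q": query}
    return CATALOG_URL + "?" + urlencode(params)


def extract_datasets(payload):
    """Return (id, name) pairs from a parsed catalog response.

    Assumption: each entry in "results" has a "resource" object
    with "id" and "name" keys. Entries without an id are skipped.
    """
    pairs = []
    for entry in payload.get("results", []):
        resource = entry.get("resource", {})
        if "id" in resource:
            pairs.append((resource["id"], resource.get("name", "")))
    return pairs


def search_datasets(query):
    """Call the catalog endpoint and list matching datasets (needs network access)."""
    with urlopen(build_search_url(query), timeout=30) as response:
        return extract_datasets(json.load(response))
```

Calling `search_datasets("healthcare")` should give you a list of ID and name pairs. The ID is the value you pass as "datasetId" when installing the dataset.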
Go to the terminal.
IRISAPP>set api = ##class(dc.dataset.importer.service.socrata.SocrataApi).%New()
IRISAPP>do api.InstallDataset({"datasetId": "n9tp-i3k3", "verbose":true})
Compilation started on 01/07/2022 01:01:28 with qualifiers 'cuk'
Compiling class dc.dataset.imported.DsCommunityHealthcareCenters
Compiling table dc_dataset_imported.DsCommunityHealthcareCenters
Compiling routine dc.dataset.imported.DsCommunityHealthcareCenters.1
Compilation finished successfully in 0.108s.
Class name: dc.dataset.imported.DsCommunityHealthcareCenters
Header: Name VARCHAR(250),Description VARCHAR(250),Location VARCHAR(250),Phone_Number VARCHAR(250),geom VARCHAR(250)
Records imported: 26
After the command above, your dataset is ready to use!
To use the datasets from Kaggle, you need to register on the website. After that, you need to create an API token to use Kaggle’s API.
Now, just like with Socrata, you can use the API to search and download the dataset.
GET> https://www.kaggle.com/api/v1/datasets/list?search=appointments
This endpoint will return all datasets matching the search term “appointments”, like the image below:
Now, get the ref value. In this case, the ref is “joniarroba/noshowappointments”.
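The Kaggle search can also be scripted. The sketch below rests on two assumptions: the endpoint accepts HTTP Basic authentication with the username and key from your API token, and the response is a JSON array of objects that each have a "ref" key. The function names are ours.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

KAGGLE_LIST_URL = "https://www.kaggle.com/api/v1/datasets/list"


def build_list_request(search, username, key):
    """Build a request for the dataset list endpoint.

    Assumption: Kaggle accepts HTTP Basic authentication, with the
    API token's username and key as the credentials.
    """
    query = urlencode({"search": search})
    token = base64.b64encode((username + ":" + key).encode("utf-8")).decode("ascii")
    return Request(KAGGLE_LIST_URL + "?" + query,
                   headers={"Authorization": "Basic " + token})


def extract_refs(payload):
    """Return the "ref" value of every dataset in a parsed response.

    Assumption: the response is a JSON array of objects with a "ref" key.
    """
    return [item["ref"] for item in payload if "ref" in item]


def search_refs(search, username, key):
    """Call the list endpoint and return the dataset refs (needs network access)."""
    with urlopen(build_list_request(search, username, key), timeout=30) as response:
        return extract_refs(json.load(response))
```

The ref returned for the dataset you want is the value to pass as "datasetId" when installing it.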
The values “your-username” and “your-password” below are the credentials provided by Kaggle when you create the API token.
IRISAPP>Set credentials = ##class(dc.dataset.importer.service.CredentialsService).%New()
IRISAPP>Do credentials.SaveCredentials("kaggle", "<your-username>", "<your-password>")
IRISAPP>Set api = ##class(dc.dataset.importer.service.kaggle.KaggleApi).%New()
IRISAPP>Do api.InstallDataset({"datasetId":"joniarroba/noshowappointments", "credentials":"kaggle", "verbose":true})
Class name: dc.dataset.imported.DsNoshowappointments
Header: PatientId INTEGER,AppointmentID INTEGER,Gender VARCHAR(250),ScheduledDay DATE,AppointmentDay DATE,Age INTEGER,Neighbourhood VARCHAR(250),Scholarship INTEGER,Hipertension INTEGER,Diabetes INTEGER,Alcoholism INTEGER,Handcap INTEGER,SMS_received INTEGER,No-show VARCHAR(250)
Records imported: 259
After the command above, your dataset is ready to use!
We’re also offering a GUI to make installing datasets easier, but that is something we’d like to discuss in our next article. In the meantime, you can check out a sneak peek below while we polish a few things before the official release: