Using JSON schema to validate your data

We recently added a couple of neat features that let you work with your data more efficiently, and one of them is JSON Schema support. JSON Schema is useful in many situations: for example, when you need to make sure that a digger still works properly and the data you are getting is still in good shape, or when you want to keep only specific records and skip the rest. Say you are gathering events and want only those that are not cancelled or that still have open slots. If the website provides this information, you can set rules in a JSON schema to pick only the records you need.
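To make the event example concrete, here is a minimal Python sketch of such a rule. The field names (`title`, `status`, `open_slots`) and the status values are assumptions made up for illustration, and this particular schema keeps only events that are both not cancelled and not full. The `is_valid` helper is a toy checker that understands just a handful of keywords (`type`, `required`, `properties`, `enum`, `minimum`); it is not a full JSON Schema implementation and says nothing about how Diggernaut validates data internally.

```python
# Hypothetical schema: the field names and status values are assumptions.
EVENT_SCHEMA = {
    "type": "object",
    "required": ["title", "status", "open_slots"],
    "properties": {
        "title": {"type": "string"},
        # Only these statuses pass, so "cancelled" is rejected.
        "status": {"enum": ["scheduled", "postponed"]},
        # At least one open slot must remain.
        "open_slots": {"type": "integer", "minimum": 1},
    },
}

# Maps JSON Schema type names to Python types (small subset only).
_TYPES = {"object": dict, "string": str, "integer": int}


def is_valid(value, schema):
    """Toy check supporting only type, required, properties, enum, minimum."""
    expected = schema.get("type")
    if expected is not None:
        if not isinstance(value, _TYPES[expected]):
            return False
        # bool is a subclass of int in Python, but not an integer in JSON.
        if expected == "integer" and isinstance(value, bool):
            return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    if "minimum" in schema and isinstance(value, (int, float)):
        if value < schema["minimum"]:
            return False
    if isinstance(value, dict):
        if any(key not in value for key in schema.get("required", [])):
            return False
        for key, subschema in schema.get("properties", {}).items():
            if key in value and not is_valid(value[key], subschema):
                return False
    return True


records = [
    {"title": "Python meetup", "status": "scheduled", "open_slots": 12},
    {"title": "Go workshop", "status": "cancelled", "open_slots": 5},
    {"title": "Rust night", "status": "scheduled", "open_slots": 0},
]

kept = [r for r in records if is_valid(r, EVENT_SCHEMA)]
print([r["title"] for r in kept])  # prints ['Python meetup']
```

The cancelled event fails the `enum` rule and the full event fails the `minimum` rule, so only the first record is kept.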

So what is JSON Schema? As the official JSON Schema website states: “JSON Schema is a vocabulary that allows you to annotate and validate JSON documents.” I recommend learning more about it from that site, as this article does not cover JSON Schema syntax or usage. You can easily learn it and experiment with it in debug mode at Diggernaut without paying a dime.

So how do you set a JSON schema for a digger? First, log in to your Diggernaut account, go to Projects > Diggers, find the digger you need, and click the “Config” button.


This opens the editor panel where you usually enter the digger config. You will see that it now has 2 additional tabs. Click the “Validator” tab.


Then paste in your JSON schema and click the “Save” button.
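Before pasting a schema into the Validator tab, it can help to confirm locally that the text is at least well-formed JSON. The following Python sketch does only that; `check_schema_text` is a hypothetical helper name, it assumes the schema is a JSON object at the top level, and it does not check that the keywords themselves are valid JSON Schema.

```python
import json


def check_schema_text(text):
    """Return (ok, message) for a schema supplied as a string."""
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as err:
        # Report where parsing failed so the typo is easy to find.
        return False, f"not valid JSON: {err.msg} at line {err.lineno}, column {err.colno}"
    if not isinstance(schema, dict):
        # Assumption: this sketch expects an object at the top level.
        return False, "expected a JSON object at the top level"
    return True, "well-formed JSON object"


ok, message = check_schema_text('{"type": "object", "required": ["title"]}')
print(ok, message)
```

A stray trailing comma or an unquoted key, two common mistakes when editing a schema by hand, are both caught by the `json.loads` step.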


The next time your digger runs, it will use your JSON schema for data validation. To understand it better, you may want to look at the digger config we used for tests:

And the JSON schema we used for it:

Co-founder of Diggernaut, a cloud-based web scraping and data extraction platform
