Iterators

CSV Iterator

A CSV iterator is used when an argument has multiple values and the digger must run once for each value. The digger parses the passed CSV values and creates a list of argument sets. The following parameters are used to initialize the iterator:

Parameter    Description
type         Constant defining the iterator type; has the value csv.
name         Name of the argument.
value        Values of the argument in CSV format (a comma-separated set of values).

The following is an example of initializing an iterator:

iterator:
- type: csv
  # ARGUMENT NAME
  name: age
  # ARGUMENT VALUES
  value: 1,2,3,4
              

As a result, the following list of argument sets is formed, and the digger executes the main logic block once for each set:

[
    { "age": 1 },
    { "age": 2 },
    { "age": 3 },
    { "age": 4 }
]
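
The expansion described above can be sketched in Python. This is only an illustration of the idea, not the digger's actual implementation: the function name `expand_csv_iterator` is hypothetical, and the sketch keeps every value as a string, whereas the list above shows numbers.

```python
import csv


def expand_csv_iterator(name, value):
    """Turn one CSV string into a list of single-field argument sets.

    Hypothetical helper for illustration; values stay strings here.
    """
    # csv.reader handles quoting; we feed it a single "row" of input.
    row = next(csv.reader([value], skipinitialspace=True))
    return [{name: item} for item in row]


arg_sets = expand_csv_iterator("age", "1,2,3,4")
for arg_set in arg_sets:
    print(arg_set)
```

Each dictionary in `arg_sets` stands for one run of the main logic block.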
              

An example of using an iterator in a digger:

---
config:
    debug: 2
    agent: Firefox
iterator:
    type: csv
    name: age
    value: 18,19,25
do:
- walk:
    to: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=<%age%>
    do:

Digger log for this run (newest entries first):

Time Level Message
2017-10-23 15:09:19:464 info Scrape is done
2017-10-23 15:09:19:449 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:09:19:422 debug Referers: Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=19
2017-10-23 15:09:19:413 debug Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=19
2017-10-23 15:09:19:405 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=25
2017-10-23 15:09:19:390 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:09:19:364 debug Referers: Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=18
2017-10-23 15:09:19:353 debug Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=18
2017-10-23 15:09:19:345 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=19
2017-10-23 15:09:19:332 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:09:18:944 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=18
2017-10-23 15:09:18:935 info Starting scrape
2017-10-23 15:09:18:914 debug Setting up default proxy
2017-10-23 15:09:18:900 debug Setting up surf
2017-10-23 15:09:18:867 info Starting digger: meta-lang-iterator-csv [1860]
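
The three GET requests in the log correspond to substituting each value of age into the `<%age%>` placeholder of the walk URL. A minimal Python sketch of that substitution follows; the `render` helper and its regular expression are assumptions made for illustration and are not part of Diggernaut.

```python
import re


def render(template, args):
    """Replace every <%name%> placeholder with the value from args.

    Hypothetical helper; raises KeyError if a placeholder has no value.
    """
    return re.sub(r"<%(\w+)%>", lambda m: str(args[m.group(1)]), template)


template = "https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=<%age%>"
urls = [render(template, {"age": value}) for value in ("18", "19", "25")]
for url in urls:
    print(url)
```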

You can combine several iterators to build a list of all possible combinations of their arguments.

An example of using a combination of iterators in a digger:

---
config:
    debug: 2
    agent: Firefox
iterator:
# by dates
- type: date
  start: '2017-10-01'
  period: 2
  interval: 1
  template: '%Y-%m-%d'
# and ages
- type: csv
  name: age
  value: 30,40
do:
- walk:
    to: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=<%age%>&from=<%start_date%>
    do:

Digger log for this run (newest entries first):

Time Level Message
2017-10-23 15:36:56:946 info Scrape is done
2017-10-23 15:36:56:932 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:36:56:903 debug Referers: Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=30&from=2017-10-02
2017-10-23 15:36:56:894 debug Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=30&from=2017-10-02
2017-10-23 15:36:56:885 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=40&from=2017-10-02
2017-10-23 15:36:56:871 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:36:56:844 debug Referers: Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=40&from=2017-10-01
2017-10-23 15:36:56:836 debug Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=40&from=2017-10-01
2017-10-23 15:36:56:829 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=30&from=2017-10-02
2017-10-23 15:36:56:817 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:36:56:792 debug Referers: Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=30&from=2017-10-01
2017-10-23 15:36:56:784 debug Referer: https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=30&from=2017-10-01
2017-10-23 15:36:56:778 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=40&from=2017-10-01
2017-10-23 15:36:56:766 debug Page content: <!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8"/> <title>Diggernaut | Meta-language | Object sample</title> </head> <body> <h1>Title-1</h1> <p>Lorem ipsum dolor sit amet.</p> </body></html>
2017-10-23 15:36:56:311 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html?age=30&from=2017-10-01
2017-10-23 15:36:56:302 info Starting scrape
2017-10-23 15:36:56:286 debug Setting up default proxy
2017-10-23 15:36:56:277 debug Setting up surf
2017-10-23 15:36:56:247 info Starting digger: meta-lang-iterator-combo [1861]
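
The request order in this log is consistent with a Cartesian product of the two iterators, with the date iterator varying slowest. The following Python sketch reproduces that order under stated assumptions: the `combine` helper is hypothetical, and the two dates are hard-coded from the log rather than derived from the date iterator's parameters.

```python
from itertools import product

# Argument sets as each iterator would supply them (values taken from the log).
dates = [{"start_date": "2017-10-01"}, {"start_date": "2017-10-02"}]
ages = [{"age": "30"}, {"age": "40"}]


def combine(*iterators):
    """Merge one argument set from each iterator, for every combination."""
    combined = []
    for parts in product(*iterators):
        merged = {}
        for part in parts:
            merged.update(part)
        combined.append(merged)
    return combined


combinations = combine(dates, ages)
for args in combinations:
    print(args)
```

Two dates and two ages give four argument sets, which matches the four page requests in the log.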

Next, we learn more about fieldset iterators.