Runtime Entities

Objects

Objects are containers for storing collected data.
Diggernaut supports nested data structures as well as flat ones, so an object can be stored inside another object.

The main points that you should know about objects:

  1. Objects store the collected data
  2. Objects exist in all contexts and are context-independent
  3. An object must be created before fields can be written to it
  4. Fields can be written to an object from any block within the scope where the object was created
  5. For correct operation, an object must be saved in the same block where it was created
  6. An object can be saved either to a field of another object or to the database

Example of using a simple object:

---
config:
    debug: 2
    agent: Firefox
do:
- walk:
    to: https://www.diggernaut.com/sandbox/meta-lang-object-en.html
    do:
    # create a new object named `item`
    - object_new: item
    - find:
        path: h1
        do:
        - parse
        # store the register value in the `title` field of the object `item`
        - object_field_set:
            object: item
            field: title
    - find:
        path: p
        do:
        - parse
        # store the register value in the `description` field of the object `item`
        - object_field_set:
            object: item
            field: description
    # save the object to the database
    - object_save:
        name: item

Digger log (newest entries first):
              
Time Level Message
2017-10-21 03:20:43:412 info Scrape is done
2017-10-21 03:20:42:687 debug Saving object with name: item
2017-10-21 03:20:42:679 debug Saved field description for object item: Lorem ipsum dolor sit amet.
2017-10-21 03:20:42:671 debug Parsed content: Lorem ipsum dolor sit amet.
2017-10-21 03:20:42:663 debug Parsing block with arguments: map[]
2017-10-21 03:20:42:654 debug Block content: Lorem ipsum dolor sit amet.
2017-10-21 03:20:42:647 debug Number of found blocks: 1
2017-10-21 03:20:42:639 debug Looking for: p
2017-10-21 03:20:42:631 debug Saved field title for object item: Title-1
2017-10-21 03:20:42:621 debug Parsed content: Title-1
2017-10-21 03:20:42:614 debug Parsing block with arguments: map[]
2017-10-21 03:20:42:606 debug Block content: Title-1
2017-10-21 03:20:42:598 debug Number of found blocks: 1
2017-10-21 03:20:42:590 debug Looking for: h1
2017-10-21 03:20:42:582 debug Creating object with name: item
2017-10-21 03:20:42:568 debug Page content: <html lang="en">
<head>
<meta charset="UTF-8"/>
<title>Diggernaut | Meta-Language | Object sample</title>
</head>
<body>
<h1>Title-1</h1>
<p>Lorem ipsum dolor sit amet.</p>
</body>
</html>
2017-10-21 03:20:42:269 info Retrieving page (GET): https://www.diggernaut.com/sandbox/meta-lang-object-en.html
2017-10-21 03:20:42:263 info Starting scrape
2017-10-21 03:20:42:250 debug Setting up default proxy
2017-10-21 03:20:42:235 debug Setting up surf
2017-10-21 03:20:42:208 info Starting digger: meta-lang-object [1853]

Page source:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Diggernaut | Meta-Language | Object sample</title>
</head>
<body>
<h1>Title-1</h1>
<p>Lorem ipsum dolor sit amet.</p>
</body>
</html>
              
Result:

{
    "item": {
        "title": "Title-1",
        "description": "Lorem ipsum dolor sit amet."
    }
}
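
Example of a nested object:

Point 6 in the list above says that an object can be saved to a field of another object. The configuration below is a minimal sketch of that case, not a verified digger. It reuses the sandbox page from the simple example. The object name `details` and the field name `text` are hypothetical, and the `to` parameter of `object_save` is an assumption about how the parent object is named, so check it against the meta-language reference before relying on it.

```yaml
---
config:
    debug: 2
    agent: Firefox
do:
- walk:
    to: https://www.diggernaut.com/sandbox/meta-lang-object-en.html
    do:
    # parent object, created in the `walk` block
    - object_new: item
    - find:
        path: h1
        do:
        - parse
        - object_field_set:
            object: item
            field: title
    - find:
        path: p
        do:
        # child object (hypothetical name `details`), created inside this block
        - object_new: details
        - parse
        - object_field_set:
            object: details
            field: text
        # assumption: `to` names the parent object that receives the child;
        # the child is saved in the same block where it was created
        - object_save:
            name: details
            to: item
    # the parent is saved to the database in the block where it was created
    - object_save:
        name: item
```

Each object here is saved in the block that created it, as point 5 requires: `details` inside the `find` block and `item` inside the `walk` block. How the nested object appears in the resulting data is not shown here and is best confirmed by running the digger with `debug: 2` and reading the log.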