Methods for Working with DOM

Attributes

Nodes may have attributes set (for example, style or class). In some cases, you may need to remove them. To do so, use the attr_remove command, which removes the specified attributes from all nodes of the current block.

The command requires the selector parameter, which specifies the attributes to remove. To remove all attributes, pass the wildcard selector *.

Let's use the following HTML source:

<div class="container">
    <span style="width: 200px;">some text</span>
    <a href="link.html">some link</a>
    <span style="width: 400px;">another text</span>
</div>

Example of usage:

- find:
    path: div
    do:
    - attr_remove:
        selector: '*'
    - parse:
        format: html

    # ALL ATTRIBUTES WERE REMOVED, REGISTER VALUE:
    # <span>some text</span>
    # <a>some link</a>
    # <span>another text</span>
              
- find:
    path: div
    do:
    - attr_remove:
        selector: style
    - parse:
        format: html

    # REMOVED ONLY STYLE ATTRIBUTE, REGISTER VALUE:
    # <span>some text</span>
    # <a href="link.html">some link</a>
    # <span>another text</span>
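
As one more illustration, the sketch below removes only the href attribute from the same HTML source. It assumes that selector accepts any single attribute name in the same way it accepts style above. The register value in the comments is inferred from the two examples above, not captured from a real run.

```yaml
- find:
    path: div
    do:
    # Assumption: any attribute name can be used as the selector
    - attr_remove:
        selector: href
    - parse:
        format: html

    # EXPECTED BY ANALOGY WITH THE EXAMPLES ABOVE, REGISTER VALUE:
    # <span style="width: 200px;">some text</span>
    # <a>some link</a>
    # <span style="width: 400px;">another text</span>
```

Naming a single attribute this way leaves the rest of the markup untouched, which is usually preferable to the wildcard when only one attribute gets in the way.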
              

In the next chapter, we will learn how to split the contents of a block into multiple blocks manually.