Mikhail Sisin Co-founder of cloud-based web scraping and data extraction platform Diggernaut. Over 10 years of experience in data extraction, ETL, AI, and ML.

Scraping forestry supplies from the Ben Meadows online store


Ben Meadows was founded in 1956 with a single goal: to provide foresters with quality products for their professional work. The company soon added geodetic and firefighting equipment, as well as equipment for cartographers, to its range. For more than 60 years, Ben Meadows has been supplying professionals with the best equipment and providing excellent service. With this scraper, you can extract information about forestry supplies from the Ben Meadows online store.

Approximate number of goods: 4000
Approximate number of page requests: 4000
Recommended subscription plan: Free

Please note: the number of requests can exceed the number of products, because data about variations, images, and so on may be scraped from other resources that require additional requests. Part of the product data may also be delivered through XHR requests, which further increases the total number of page requests.

How to use the web scraper to extract data about goods and prices from benmeadows.com

To use the web scraper for the Ben Meadows store's website, you need an account with our Diggernaut service. Follow these steps:

  1. Follow the registration link to open a free Diggernaut account.
  2. After registering and confirming your email address, log in to your account.
  3. Create a project with any name and description.
  4. Switch to the created project and create a digger with any name.
  5. Copy the digger configuration below to the clipboard and paste it into the digger you created.
  6. Switch the digger's mode from Debug to Active.
  7. Run your digger and wait for it to finish.
  8. Download the scraped dataset in the format you need.

If you do not know how to perform any of steps 3 to 8, please refer to our documentation.

You can also set up a schedule to run your scraper and collect data regularly.

Scraping configuration for the digger

---
config:
    debug: 2
    agent: Firefox
do:
- walk:
    to: http://www.benmeadows.com
    do:
    - find:
        path: 'ul#topnav>li:has(a#productCategories)>div.subMenu a'
        do:
        - parse:
            attr: href
        - space_dedupe
        - trim
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: catalog
- walk:
    to: links
    pool: catalog
    do:
    - find: 
        path: .viewPaginationNext 
        do: 
        - parse:
            attr: href
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: catalog
            
    - find:
        path: 'a#hlNavigation'
        do:
        - parse:
            attr: href
        - space_dedupe
        - trim
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: catalog
    - find:
        path: 'a#hladd'
        do:
        - parse:
            attr: href
        - space_dedupe
        - trim
        - if:
            match: \w+
            do:
            - normalize:
                routine: url
            - link_add:
                pool: pages
- walk:
    to: links
    pool: pages
    do:
    - sleep: 2
    - find:
        path: 'div#prodWrap'
        do:
        - object_new: product
        - eval:
            routine: js
            body: '(function (){var d = new Date(); return d.toISOString()})();'
        - object_field_set:
            object: product
            field: date
        - static_get: url
        - object_field_set:
            object: product
            field: url
        - find:
            path: meta[itemprop="identifier"]
            do:
            - parse:
                attr: content
            - space_dedupe
            - trim
            - if:
                match: \d+
                do:
                - object_field_set:
                    object: product
                    field: sku
        - find:
            path: 'span#lblGroupTitle'
            do:
            - parse
            - space_dedupe
            - trim
            - object_field_set:
                object: product
                field: name
        - find:
            path: 'a#imgLink'
            do:
            - parse:
                attr: href
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - normalize:
                    routine: url
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: images
        - find:
            path: script:contains('loadProductPageDropDowns')
            do:
            - parse:
                filter: loadProductPageDropDowns\((.+)\)\;\$\('\#txtHeaderSearch'\)\.focus\(\)\;\}\)\;
            - normalize:
                routine: json2xml
            - to_block
            - find:
                path: body_safe>groupname
                do:
                - parse
                - space_dedupe
                - trim
                - object_field_set:
                    object: product
                    field: name
            - find:
                path: body_safe>groupid
                do:
                - parse
                - space_dedupe
                - trim
                - if:
                    match: \d+
                    do:
                    - object_field_set:
                        object: product
                        field: sku
            - find:
                path: largeimage,secimages>large
                do:
                - parse
                - space_dedupe
                - trim
                - if:
                    match: \w+
                    do:
                    - normalize:
                        routine: url
                    - object_field_set:
                        object: product
                        joinby: "|"
                        field: images
            - find:
                path: properties>children
                do:
                - variable_clear: sort
                - variable_clear: value
                - find:
                    path: sort
                    do:
                    - parse
                    - space_dedupe
                    - trim
                    - variable_set: sort
                - find:
                    path: value
                    do:
                    - parse
                    - space_dedupe
                    - trim
                    - variable_set: value
                - variable_get: sort
                - if:
                    match: \w+
                    do:
                    - variable_get: value
                    - if:
                        match: \w+
                        do:
                        - register_set: "<%sort%>: <%value%>"
                        - object_field_set:
                            object: product
                            joinby: "|"
                            field: variations
        - register_set: Ben Meadows
        - variable_set: brand
        - find:
            path: 'img[itemprop="brand"]'
            do:
            - parse:
                attr: content
            - space_dedupe
            - trim
            - variable_set: brand
        - variable_get: brand
        - object_field_set:
            object: product
            field: brand
        - find:
            path: span.currentCrumb>a
            slice: 0:-2
            do:
            - parse
            - space_dedupe
            - trim
            - if:
                match: \w+
                do:
                - object_field_set:
                    object: product
                    joinby: "|"
                    field: category
        - find:
            in: doc
            path: meta[name="description"]
            do:
            - parse:
                attr: content
            - space_dedupe
            - trim
            - variable_set: desc
        - find:
            path: 'div#prodDetailedBenefit>div.proDesc'
            do:
            - parse
            - space_dedupe
            - trim
            - variable_set: desc
        - variable_get: desc
        - object_field_set:
            object: product
            field: description
        - find:
            path: meta[itemprop="price"],meta[itemprop="lowPrice"]
            do:
            - parse:
                attr: content
                filter: ([0-9\.\,]+)
            - normalize:
                routine: replace_substring
                args:
                    \,: ''
            - space_dedupe
            - trim
            - object_field_set:
                object: product
                type: float
                field: price
        - find:
            path: meta[itemprop="currency"]
            do:
            - parse:
                attr: content
            - object_field_set:
                object: product
                field: currency
        - object_save:
            name: product
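
The price step of the configuration above captures a run of digits, dots, and commas, removes the commas, and stores the result as a float. Below is a minimal Python sketch of the same normalization, which may be handy if you need to re-parse price strings outside Diggernaut. The function name `parse_price` and the sample inputs are illustrative assumptions, not part of the digger, and the sketch assumes the filter keeps the first match.

```python
import re
from typing import Optional

# Mirrors the `filter: ([0-9\.\,]+)` step: keep the first run of digits,
# dots, and commas found in the attribute value.
PRICE_PATTERN = re.compile(r"([0-9.,]+)")


def parse_price(raw: str) -> Optional[float]:
    """Return the price as a float, or None when no usable number is found."""
    match = PRICE_PATTERN.search(raw)
    if match is None:
        return None
    # Same idea as the replace_substring step: drop thousands separators.
    cleaned = match.group(1).replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        # A lone "." or similar matched the pattern but is not a number.
        return None


# Hypothetical inputs, chosen only to exercise the function.
print(parse_price("1,289.95"))  # 1289.95
print(parse_price("$89.95"))    # 89.95
```

Note that this approach, like the config step, treats every comma as a thousands separator, so it would misread prices that use a decimal comma.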

Sample of scraped data

Below is a sample of the dataset with several products in JSON format, so you can easily review the data structure. The dataset can be downloaded as CSV, XLSX, XML, or any other text format using templates.

[{
    "product": {
        "brand": "Ben Meadows",
        "category": "Forestry Supplies and Equipment|Logging and Clearing Tools|Cable Pullers and Log Chains",
        "currency": "USD",
        "date": "2017-12-07T01:35:16.184Z",
        "description": "The compact design of this Swaged Wire Rope makes it up to 26% stronger than standard winch lines of the same diameter. The outer wires have a larger surface area than standard winch lines, providing better resistance to wear and tear. The already compact line is also resistant to abrasion, pig tailing, kinking and drum crushing. The 6 x 26 IWRC construction features a stainless steel duplex sleeve that maximizes sleeve to wire rope contact and a strong alloy Hook and Latch. Design Factor: 3.55:1 Ratio. NOTE: Match to your existing wire rope size or check your winch manufacturer's wire rope size recommendation before ordering.",
        "images": "https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_03.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_03.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_03.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_03.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_03.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MW89_03.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/36MX08_AS01.jpg",
        "name": "B/A Products Swaged IWRC Wire Rope with Hook and Latch",
        "price": 89.95,
        "sku": "37346391",
        "url": "https://www.benmeadows.com/ba-products-swaged-iwrc-wire-rope-with-hook-and-latch_37346391/",
        "variations": "Diameter: 3/8''|Length: 100'|Diameter: 3/8''|Length: 75'|Diameter: 3/8''|Length: 56'|Diameter: 3/8''|Length: 35'|Diameter: 3/8''|Length: 50'|Diameter: 1/2''|Length: 100'|Diameter: 7/16''|Length: 150'|Diameter: 7/16''|Length: 75'|Diameter: 3/8''|Length: 150'|Diameter: 1/2''|Length: 150'|Diameter: 7/16''|Length: 50'"
    }
}
,{
    "product": {
        "brand": "Ben Meadows",
        "category": "Forestry Supplies and Equipment|Logging and Clearing Tools|Cable Pullers and Log Chains",
        "currency": "USD",
        "date": "2017-12-07T01:35:19.635Z",
        "description": "Tough and durable, these lifting/pulling devices are designed for heavy-duty jobs. Puller handle bends as a safety warning when it is overloaded. The frame and pawls are made of ductile iron and the yoke is malleable iron. The frame hook and end hook are forged steel and tackle block hook is a steel casting. Each Puller comes with 5⁄16\" wire cable on a welded ductile iron reel (cast iron reel on No. 210016). Dimensions: 8\"H x 6\"W x 17\"L.Heavy-duty model Pullers are available with two or three-ton capacity with double line. Two-ton model comes with choice of cable length.",
        "images": "https://www.benmeadows.com/images/ir/s7product/8C839_AS04.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS04.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS07.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS07.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AA01.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS06.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS06.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS05.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS07.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS07.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS07.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_AS07.jpg|https://www.benmeadows.com/images/ir/s7product/8C839_web01.jpg",
        "name": "The More Power Puller® Puller",
        "price": 225.99,
        "sku": "36810185",
        "url": "https://www.benmeadows.com/the-more-power-puller-puller_36810185/",
        "variations": "Capacity: 2 Ton|Length: 35'|Capacity: 3 Ton|Length: 20'|Capacity: 2 Ton|Length: 30'|Capacity: 2 Ton|Length: 20'"
    }
}
,{
    "product": {
        "brand": "Ben Meadows",
        "category": "Forestry Supplies and Equipment|Logging and Clearing Tools|Cable Pullers and Log Chains",
        "currency": "USD",
        "date": "2017-12-07T01:35:22.272Z",
        "description": "Make hauling and securing heavy loads a little easier with these Cable Pullers. Galvanized aircraft-quality cable is virtually indestructible.One-piece aluminum-alloy ratchet wheels resist wear and last longer than laminated wheels. Electro-plated parts protect against rust. Drop-forged steel slip hooks rotate a full 360°. Notch-at-a-time letdown makes for trouble-free, positive-control lowering and releasing.1-Ton Cable Puller is the \"original.\" 3⁄16\"-dia. cable. 15:1 leverage. 12' max. lift.2-Ton Cable Puller adds more leverage (30:1) and a pulley for heavier loads. 3⁄16\"-dia. cable. 6' max. lift.3-Ton Cable Puller has 5⁄16\"-dia. cable with 35:1 leverage for lifting your heaviest loads. 12' max. lift.",
        "images": "https://www.benmeadows.com/images/ir/s7product/8CJJ7_AS03.jpg|https://www.benmeadows.com/images/ir/s7product/8CJJ7_AS03.jpg|https://www.benmeadows.com/images/ir/s7product/8YAA0_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8YAA0_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8CJJ7_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8CJJ7_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8CJJ8_web01.jpg|https://www.benmeadows.com/images/ir/s7product/8CJJ8_web01.jpg",
        "name": "MAASDAM POW'R-PULL® Cable Pullers",
        "price": 40.99,
        "sku": "36810155",
        "url": "https://www.benmeadows.com/maasdam-powr-pull-cable-pullers_36810155/",
        "variations": "Capacity: 2 Ton|Capacity: 1 Ton|Capacity: 3 Ton"
    }
}
,{
    "product": {
        "brand": "Ben Meadows",
        "category": "Forestry Supplies and Equipment|Logging and Clearing Tools|Cable Pullers and Log Chains",
        "currency": "USD",
        "date": "2017-12-07T01:35:24.906Z",
        "description": "Class A, Grade 1 castings are completely self-contained. Handyman Jack allows you to lift load on down stroke of handle. Safety shear pin protects load from being dropped. Steel is standard rolled for strength and rigidity; reversible for extra wear. 4\"L lifting nose allows pickup as close as 4-1/2\" from bottom of 28 sq. in. base plate. Adjustable for clamping use. 4660-lb. capacity. Accessories are also available.Loc-Rac® is a mounting and locking device that transports your jack securely in a pickup truck or utility vehicle. Includes lock and keys.Bumper Lift is designed to fit most vehicle bumpers.",
        "images": "https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg",
        "name": "Handyman® Jacks",
        "price": 86.99,
        "sku": "36810133",
        "url": "https://www.benmeadows.com/handyman-jacks_36810133/",
        "variations": "Height: 60''|Height: 48''"
    }
}]
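
The `images` and `variations` fields in the sample above are pipe-joined strings, and they contain many repeated entries. Below is a minimal sketch, assuming you post-process the downloaded JSON in Python, that splits these fields into lists and drops repeats while keeping first-seen order. The helper names `split_unique` and `tidy` are hypothetical, and the record is trimmed from the sample dataset.

```python
import json
from typing import Dict, List


def split_unique(value: str, sep: str = "|") -> List[str]:
    """Split a joined field and drop repeats, keeping first-seen order."""
    parts = (part.strip() for part in value.split(sep))
    # dict.fromkeys preserves insertion order, so it acts as an ordered set.
    return list(dict.fromkeys(part for part in parts if part))


def tidy(record: Dict) -> Dict:
    """Return a copy of one product with its joined fields turned into lists."""
    product = dict(record["product"])
    for field in ("images", "variations"):
        product[field] = split_unique(product.get(field, ""))
    # Breadcrumb order matters and repeats are not expected, so only split.
    product["category"] = [c for c in product.get("category", "").split("|") if c]
    return product


# One record trimmed from the sample dataset (image list shortened).
sample = json.loads("""
[{"product": {
    "name": "Handyman® Jacks",
    "category": "Forestry Supplies and Equipment|Logging and Clearing Tools|Cable Pullers and Log Chains",
    "images": "https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg|https://www.benmeadows.com/images/ir/s7product/8C978_AS01.jpg",
    "variations": "Height: 60''|Height: 48''"
}}]
""")

clean = tidy(sample[0])
print(clean["images"])
print(clean["variations"])
```

De-duplicating `variations` this way yields the distinct option values only. It does not recover which values were combined in a single variation, because the joined string no longer carries that grouping.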