{"id":389,"date":"2018-02-11T00:48:24","date_gmt":"2018-02-11T00:48:24","guid":{"rendered":"https:\/\/www.diggernaut.com\/blog\/?p=389"},"modified":"2019-01-12T15:52:57","modified_gmt":"2019-01-12T15:52:57","slug":"free-web-scraper-cabelas-extract-data-products","status":"publish","type":"post","link":"https:\/\/www.diggernaut.com\/blog\/free-web-scraper-cabelas-extract-data-products\/","title":{"rendered":"Free web scraper for Cabela&#8217;s to extract data about products"},"content":{"rendered":"<p>With this free web scraper for the Cabela\u2019s online store, you can collect data on tens of thousands of products for outdoor activities, hunting, fishing, and tourism. Cabela\u2019s is one of the leading American retailers of goods for outdoor activities and sports. The company was founded in 1961 by Richard Cabela in the city of Sidney, Nebraska, and is still managed by members of the Cabela family.<\/p>\n<p><strong>Approximate number of goods:<\/strong> 70000<br>\n<strong>Approximate number of page requests:<\/strong> 70000<br>\n<strong>Recommended subscription plan:<\/strong> Small<\/p>\n<p><strong>PLEASE NOTE!<\/strong> The number of requests can exceed the number of products because data about variations, images, etc. may be scraped from other resources, which requires additional requests. In addition, part of the product data may be delivered via XHR requests, which also increases the total number of page requests.<\/p>\n<h3>How to use the web scraper to extract data about goods and prices for cabelas.com<\/h3>\n<p>To use the web scraper for the Cabela\u2019s store website, you need an account with our Diggernaut service. 
Just follow these steps:<\/p>\n<ol>\n<li>Use this <a href=\"https:\/\/www.diggernaut.com\/accounts\/signup\/\">registration link<\/a> to open a free account with <a href=\"https:\/\/www.diggernaut.com\">Diggernaut<\/a><\/li>\n<li>After registering and confirming your email address, <a href=\"https:\/\/www.diggernaut.com\/accounts\/login\/\">log in to your account<\/a><\/li>\n<li>Create a project with any name and description. If you do not know how to do this, please refer to our <a href=\"https:\/\/www.diggernaut.com\/dev\/website-projects-create-new-project.html\">documentation<\/a><\/li>\n<li>Switch to the created project and create a digger with any name. If you do not know how to do this, please refer to our <a href=\"https:\/\/www.diggernaut.com\/dev\/website-projects-create-new-digger.html\">documentation<\/a><\/li>\n<li>Copy the digger configuration below to the clipboard and paste it into the digger you created. If you do not know how to do this, please refer to our <a href=\"https:\/\/www.diggernaut.com\/dev\/website-projects-digger-config.html\">documentation<\/a><\/li>\n<li>Switch the mode of the digger from Debug to Active. If you do not know how to do this, please refer to our <a href=\"https:\/\/www.diggernaut.com\/dev\/website-projects-edit-digger.html\">documentation<\/a><\/li>\n<li>Run your digger and wait for it to complete. If you do not know how to do this, please refer to our <a href=\"https:\/\/www.diggernaut.com\/dev\/website-projects-run-digger.html\">documentation<\/a><\/li>\n<li>Download the scraped dataset in the format you need. If you do not know how to do this, please refer to our <a href=\"https:\/\/www.diggernaut.com\/dev\/website-projects-scraped-data.html\">documentation<\/a><\/li>\n<\/ol>\n<p>You can also set up a schedule to run your scraper and collect data regularly.<\/p>\n<h3>Scraping configuration for the digger<\/h3>\n<pre class=\"language-yaml line-numbers\"><code 
class=\"language-yaml\">---\nconfig:\n    debug: 2\n    agent: Chrome\ndo:\n- walk:\n    to: http:\/\/www.cabelas.com\n    do:\n    - find: \n        path: div.shopDropdown>a\n        do: \n        - parse:\n            attr: href\n            filter: ^(.+\\\/\\_\\\/N\\-\\d+)\n        - space_dedupe\n        - trim\n        - if:\n            match: \\w+\n            do:\n            - normalize:\n                routine: url\n            - link_add:\n                pool: catalog\n- walk:\n    to: links\n    pool: catalog\n    do:\n    - sleep: 2\n    - variable_clear: good\n    - find:\n        path: div.leftnav_content\n        do:\n        - variable_set:\n            field: good\n            value: &quot;yes&quot;\n    - find:\n        path: div.leftnav_content a\n        do:\n        - parse:\n            attr: href\n            filter: ^(.+\\\/\\_\\\/N\\-\\d+)\n        - space_dedupe\n        - trim\n        - if:\n            match: \\w+\n            do:\n            - normalize:\n                routine: url\n            - link_add:\n                pool: catalog\n    - find:\n        path: a.entry:contains(&#039;Next&#039;)\n        do:\n        - parse:\n            attr: href\n        - space_dedupe\n        - trim\n        - if:\n            match: \\w+\n            do:\n            - normalize:\n                routine: url\n            - link_add:\n                pool: catalog\n    - find:\n        path: div.productContentBlock>a\n        do:\n        - variable_set:\n            field: good\n            value: &quot;yes&quot;\n        - parse:\n            attr: href\n        - space_dedupe\n        - trim\n        - if:\n            match: \\w+\n            do:\n            - normalize:\n                routine: url\n            - link_add:\n                pool: pages\n    - find:\n        path: html\n        do:\n        - variable_get: good\n        - if:\n            match: &quot;yes&quot;\n            else:\n            - 
nocontent_category\n- walk:\n    to: links\n    pool: pages\n    do:\n    - variable_clear: good\n    - sleep: 2\n    - find:\n        path: &#039;div#productDetailsTemplate&#039;\n        do:\n        - variable_set:\n            field: good\n            value: &quot;yes&quot;\n        - variable_clear: list\n        - variable_clear: desc\n        - variable_clear: pid\n        - variable_clear: cid\n        - object_new: product\n        - eval:\n            routine: js\n            body: &#039;(function (){var d = new Date(); return d.toISOString()})();&#039;\n        - object_field_set:\n            object: product\n            field: date\n        - static_get: url\n        - object_field_set:\n            object: product\n            field: url\n        - find:\n            path: span.itemNumber\n            do:\n            - node_remove: b\n            - parse\n            - space_dedupe\n            - trim\n            - variable_set: pid\n            - object_field_set:\n                object: product\n                field: sku\n        - find:\n            path: h1.label\n            do:\n            - parse\n            - space_dedupe\n            - trim\n            - object_field_set:\n                object: product\n                field: name\n        - register_set: Cabela&#039;s\n        - variable_set: brand\n        - find:\n            in: doc\n            path: script[type=&quot;application\/ld+json&quot;]\n            do:\n            - parse\n            - normalize:\n                routine: replace_substring\n                args:\n                - Luck\\s+&quot;E&quot;\\s+Strike: Luck E Strike\n                - \\&quot;@type\\&quot;\\:\\s*\\&quot;Product\\&quot;\\,\\s+\\&quot;name\\&quot;\\:.+: &#039;&#039;\n                - \\&quot;description\\&quot;\\:.+: &#039;&#039;\n                - \\n+: &#039;&#039;\n                - \\}\\,\\]: &#039;}]&#039;\n            - normalize:\n                routine: json2xml\n            - 
to_block\n            - find:\n                path: brand>name\n                do:\n                - parse\n                - space_dedupe\n                - trim\n                - variable_set: brand\n            - find:\n                path: offers:has(standardPrice:not(:contains(&quot;null&quot;)))\n                slice: 0\n                do:\n                - find:\n                    path: standardPrice\n                    do:\n                    - parse\n                    - object_field_set:\n                        object: product\n                        type: float\n                        field: price\n                - find:\n                    path: price\n                    do:\n                    - parse\n                    - object_field_set:\n                        object: product\n                        type: float\n                        field: price\n                - register_set: USD\n                - object_field_set:\n                    object: product\n                    field: currency\n        - variable_get: brand\n        - object_field_set:\n            object: product\n            field: brand\n        - find:\n            in: doc\n            path: meta[name=&quot;description&quot;]\n            do:\n            - parse:\n                attr: content\n            - space_dedupe\n            - trim\n            - variable_set: desc\n        - find:\n            path: div.pdp-desc-long\n            do:\n            - parse\n            - space_dedupe\n            - trim\n            - variable_set: desc\n        - find:\n            path: &#039;div#description&#039;\n            do:\n            - node_replace:\n                path: br\n                with: &quot;\\n&quot;\n            - split:\n                context: text\n                delimiter: \\n+\n            - find:\n                path: div.splitted\n                slice: 0\n                do:\n                - parse\n                - 
space_dedupe\n                - trim\n                - variable_set: desc\n        - variable_get: desc\n        - object_field_set:\n            object: product\n            field: description\n        - find:\n            path: select.js-dropdown:has(option:contains(&quot;Select COLOR&quot;))\n            do:\n            - find:\n                path: option\n                slice: 1:-1\n                do:\n                - parse\n                - space_dedupe\n                - trim\n                - if:\n                    match: \\w+\n                    do:\n                    - object_field_set:\n                        object: product\n                        joinby: &quot;|&quot;\n                        field: variations\n        - find:\n            path: script:contains(&#039;params_viewlarger.push(&quot;asset&quot;,&#039;)\n            do:\n            - parse:\n                filter: \\s+params_viewlarger\\.push\\(\\&quot;asset\\&quot;\\,\\s+&quot;([^&quot;]+)&quot;\\)\\;\n            - to_block\n            - split:\n                context: text\n                delimiter: \\s*[\\;\\,]\n            - find:\n                path: div.splitted\n                do:\n                - parse:\n                    filter: ^([^\\?]+)\n                - space_dedupe\n                - trim\n                - if:\n                    match: \\w+\n                    do:\n                    - if:\n                        match: _sw_\n                        else:\n                        - register_set: http:\/\/images.cabelas.com\/is\/image\/&lt;%register%&gt;?wid=1000\n                        - object_field_set:\n                            object: product\n                            joinby: &quot;|&quot;\n                            field: images\n        - find:\n            path: script:contains(&#039;params_viewlarger.push(&quot;asset&quot;,&#039;)\n            do:\n            - parse:\n                filter: 
\\s+altviewparams_viewlarger\\.push\\(\\&quot;asset\\&quot;\\,\\s+&quot;([^&quot;]+)&quot;\\)\\;\n            - to_block\n            - split:\n                context: text\n                delimiter: \\s*[\\;\\,]\n            - find:\n                path: div.splitted\n                do:\n                - parse:\n                    filter: ^([^\\?]+)\n                - space_dedupe\n                - trim\n                - if:\n                    match: \\w+\n                    do:\n                    - if:\n                        match: _sw_\n                        else:\n                        - register_set: http:\/\/images.cabelas.com\/is\/image\/&lt;%register%&gt;?wid=1000\n                        - object_field_set:\n                            object: product\n                            joinby: &quot;|&quot;\n                            field: images\n        - find:\n            path: ul.breadcrumb>li>a\n            do:\n            - parse\n            - space_dedupe\n            - trim\n            - if:\n                match: \\w+\n                do:\n                - object_field_set:\n                    object: product\n                    joinby: &quot;|&quot;\n                    field: categories\n        - object_save:\n            name: product\n    - find:\n        path: html\n        do:\n        - variable_get: good\n        - if:\n            match: &quot;yes&quot;\n            else:\n            - nocontent_product<\/code><\/pre>\n<h3>Sample of scraped data<\/h3>\n<p>Below is a sample of a dataset with several products in JSON format (so you can easily review it and see the data structure). 
The dataset can be downloaded as CSV, XLSX, XML, or any other text format using the templates.<\/p>\n<pre><code class=\"language-js\">[{\n    &quot;product&quot;: {\n        &quot;brand&quot;: &quot;Cabela&#039;s&quot;,\n        &quot;categories&quot;: &quot;Hunting|Hunting Bags & Packs&quot;,\n        &quot;currency&quot;: &quot;USD&quot;,\n        &quot;date&quot;: &quot;2017-12-27T13:24:43.531Z&quot;,\n        &quot;description&quot;: &quot;For total concealment and game-stalking quiet construction, look no further than our Traditional Hydration Pack. Incorporating low-nap poly tricot with PVC backing, this 70-oz.-capacity pack features comfort shoulder straps and a low-profile design. Taste-free hydration-tube system. External zippered pocket and shock-cord cargo panel. Imported.&quot;,\n        &quot;images&quot;: &quot;http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_518535_510_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_518535_510_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_518535_510_alt01_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_518535_510_alt02_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_518535_510_alt03_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_518535_510_alt04_01?wid=1000&quot;,\n        &quot;name&quot;: &quot;Cabela&#039;s Traditional Hydration Pack&quot;,\n        &quot;price&quot;: 23.88,\n        &quot;sku&quot;: &quot;IK-518535&quot;,\n        &quot;url&quot;: &quot;http:\/\/www.cabelas.com\/product\/hunting\/hunting-bags-packs\/pc\/104791680\/c\/104392080\/cabelas-traditional-hydration-pack\/728032.uts&quot;\n    }\n}\n,{\n    &quot;product&quot;: {\n        &quot;brand&quot;: &quot;Badlands&quot;,\n        &quot;categories&quot;: &quot;Hunting|Hunting Bags & Packs&quot;,\n        &quot;currency&quot;: &quot;USD&quot;,\n        &quot;date&quot;: &quot;2017-12-27T13:24:46.525Z&quot;,\n        &quot;description&quot;: &quot;Long, challenging 
hunting trips require you to be prepared for any obstacle you might encounter. With Cabela&#039;s-exclusive Badlands&#039; Release Day Pack, you&#039;ll have the pocket space, support and carry capabilities you need to be just that \u2013 prepared. The legendary Air-Track\u2122 Suspension system combines with the technical, ultralight construction to deliver easy mobility for long treks over the most rugged terrain. Made of super-quiet KXO-32 fabric with rugged Hypalon- and Kevlar-reinforced stress points for durability that stands up to years of use in the field, it also offers a multitude of pockets - five to be exact - that team up to keep all of your must-have hunting gear and accessories secure while you&#039;re on the move. It even boasts bow-carrying capabilities for added hands-free convenience. External bedroll and compression straps. Accepts up to a 110-oz. reservoir (sold separately). Comes with Badlands&#039; Unconditional Lifetime Warranty. Imported.&quot;,\n        &quot;images&quot;: &quot;http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_463111_999_04?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_463111_999_04?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_463111_999_alt01_04?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_463111_999_alt02_04?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_463111_999_alt03_04?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_463111_999_alt04_04?wid=1000&quot;,\n        &quot;name&quot;: &quot;Badlands Release Day Pack \u2013 Cabela&#039;s Exclusive&quot;,\n        &quot;price&quot;: 99.99,\n        &quot;sku&quot;: &quot;IK-463111&quot;,\n        &quot;url&quot;: &quot;http:\/\/www.cabelas.com\/product\/hunting\/hunting-bags-packs\/pc\/104791680\/c\/104392080\/badlands-release-day-pack\/2000208.uts&quot;\n    }\n}\n,{\n    &quot;product&quot;: {\n        &quot;brand&quot;: &quot;Herter&#039;s&quot;,\n        &quot;categories&quot;: 
&quot;Hunting|Hunting Bags & Packs&quot;,\n        &quot;currency&quot;: &quot;USD&quot;,\n        &quot;date&quot;: &quot;2017-12-27T13:24:48.912Z&quot;,\n        &quot;description&quot;: &quot;Transport your calls, ammo and other duck-hunting accessories safely and securely in Herter&#039;s Waterfowl Field Bag. A large zippered opening delivers access to the main compartment, which has an adjustable divider to separate your gear. Exterior pockets on the front, back and sides keep essentials at the ready. Adjustable shoulder strap and neoprene-wrapped handle for easy carry. Rugged 600-denier polyester fabric resists snags and tears. Imported.&quot;,\n        &quot;images&quot;: &quot;http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_421520_560_02?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_421520_560_02?wid=1000&quot;,\n        &quot;name&quot;: &quot;Herter&#039;s\u00ae Waterfowl Field Bag&quot;,\n        &quot;price&quot;: 24.99,\n        &quot;sku&quot;: &quot;IK-421520&quot;,\n        &quot;url&quot;: &quot;http:\/\/www.cabelas.com\/product\/hunting\/hunting-bags-packs\/pc\/104791680\/c\/104392080\/herters-waterfowl-field-bag\/1643686.uts&quot;\n    }\n}\n,{\n    &quot;product&quot;: {\n        &quot;brand&quot;: &quot;Cabela&#039;s&quot;,\n        &quot;categories&quot;: &quot;Hunting|Hunting Bags & Packs&quot;,\n        &quot;currency&quot;: &quot;USD&quot;,\n        &quot;date&quot;: &quot;2017-12-27T13:24:51.318Z&quot;,\n        &quot;description&quot;: &quot;You won\u2019t find a better gear bag for a lower price. Its durable, weather-resistant 600-denier polyester construction makes it ideal for toting everything from packable rain gear to extra odds and ends. Sturdy 1-1\u20442\\&quot; nylon web carry straps can be joined by a hand-friendly wrap handle. Six exterior pockets, including zippered mesh pockets on top and side, provide multiple storage and organization options. Embroidered Cabela\u2019s logo on front pocket. 
Imported.&quot;,\n        &quot;images&quot;: &quot;http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_999_03?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_999_03?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_014_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_014_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_022_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_022_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_027_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_027_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_148_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_148_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_928_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_928_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_896_05?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_896_05?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_014_alt01_01?wid=1000|http:\/\/images.cabelas.com\/is\/image\/Cabelas\/s7_580032_148_alt01_01?wid=1000&quot;,\n        &quot;name&quot;: &quot;Cabela&#039;s Catch-All Gear Bags&quot;,\n        &quot;price&quot;: 14.99,\n        &quot;sku&quot;: &quot;IK-580032&quot;,\n        &quot;url&quot;: &quot;http:\/\/www.cabelas.com\/product\/hunting\/hunting-bags-packs\/pc\/104791680\/c\/104392080\/cabelas-catch-all-gear-bags\/753422.uts&quot;,\n        &quot;variations&quot;: &quot;BLUE|TAN|PINK|GRAY|CAMO|O2 OCTANE&quot;\n    }\n}]\n<\/code><\/pre>","protected":false},"excerpt":{"rendered":"<p>With this free web scraper for the Cabela\u2019s online store, you can collect data on tens of thousands of products for outdoor activities, hunting, fishing, and tourism. 
Cabela\u2019s company is one of the leading American retailers in the field of goods for various outdoor activities and sports. The company was founded in 1961 by Richard [&hellip;]<\/p>","protected":false},"author":4,"featured_media":391,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31,30,2],"tags":[],"class_list":["post-389","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ecommerce-scraping","category-free-scrapers","category-web-scraping"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts\/389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/comments?post=389"}],"version-history":[{"count":2,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts\/389\/revisions"}],"predecessor-version":[{"id":643,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts\/389\/revisions\/643"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/media\/391"}],"wp:attachment":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/media?parent=389"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/categories?post=389"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/tags?post=389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}