{"id":813,"date":"2020-02-01T07:16:36","date_gmt":"2020-02-01T07:16:36","guid":{"rendered":"https:\/\/www.diggernaut.com\/blog\/?p=813"},"modified":"2020-05-03T17:20:09","modified_gmt":"2020-05-03T17:20:09","slug":"learning-how-to-scrape-the-data-from-ebay","status":"publish","type":"post","link":"https:\/\/www.diggernaut.com\/blog\/learning-how-to-scrape-the-data-from-ebay\/","title":{"rendered":"Learning how to scrape the data from eBay"},"content":{"rendered":"<p>eBay is a very famous and popular marketplace. Very often, it is used by small sellers to sell goods, the same way as on Amazon. Therefore, the data from it can be used to assess a trading niche when entering a market with a new product.<\/p>\n<p>The eBay site is quite simple, does not use JavaScript to display pages, and should pose no technical problems for scraping. However, eBay limits the number of listings it shows per query. Therefore, if you want to collect all the results, you have to build your search queries and filters so that each one returns no more listings than that limit.<\/p>\n<p>Let&#8217;s say we want to sell e-readers. Let&#8217;s try to find the category we need and configure the filters. eBay has the <a href=\"https:\/\/www.ebay.com\/sch\/i.html?_fsrp=1&#038;_dmd=1&#038;_sacat=171485&#038;rt=nc&#038;_ipg=25\">Tablets &#038; eReaders<\/a> category. That is what we need as a starting point. 
Open this page in Google Chrome.<\/p>\n<figure id=\"attachment_mmd_816\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1.png\"><img width=\"1794\" height=\"877\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1.png\" class=\"attachment-full size-full\" alt=\"eBay - Tablets and eReaders\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1.png 1794w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1-300x147.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1-1024x501.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1-768x375.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-1-1536x751.png 1536w\" sizes=\"auto, (max-width: 1794px) 100vw, 1794px\" \/><\/a><\/figure>\n<p>However, tablets are also shown in this category, and we are only interested in e-readers. To leave only e-readers, we need to filter by product type. 
Unfortunately, there is no such filter in the main set, so you need to click the \u201cMore Filters\u201d button, which opens a window with a list of all filters.<\/p>\n<figure id=\"attachment_mmd_817\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2.png\"><img width=\"1883\" height=\"623\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2.png\" class=\"attachment-full size-full\" alt=\"eBay - more filters\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2.png 1883w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2-300x99.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2-1024x339.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2-768x254.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-2-1536x508.png 1536w\" sizes=\"auto, (max-width: 1883px) 100vw, 1883px\" \/><\/a><\/figure>\n<p>There, find the \u201cType\u201d filter, select the \u201ceBook Reader\u201d option, and click the \u201cApply\u201d button.<\/p>\n<figure id=\"attachment_mmd_818\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-3.png\"><img width=\"706\" height=\"634\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-3.png\" class=\"attachment-full size-full\" alt=\"eBay - set the type filter\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-3.png 706w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-3-300x269.png 300w\" sizes=\"auto, (max-width: 706px) 100vw, 706px\" \/><\/a><\/figure>\n<p>We see that there are significantly fewer products, but we are only interested 
in new devices. Therefore, we need to select the \u201cNew\u201d option in the \u201cCondition\u201d filter.<\/p>\n<figure id=\"attachment_mmd_819\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-4.png\"><img width=\"259\" height=\"416\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-4.png\" class=\"attachment-full size-full\" alt=\"eBay - select only new products\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-4.png 259w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-4-187x300.png 187w\" sizes=\"auto, (max-width: 259px) 100vw, 259px\" \/><\/a><\/figure>\n<p>We are not interested in auctions either, so we\u2019ll select the \u201cBuy It Now\u201d option above the listings block.<\/p>\n<figure id=\"attachment_mmd_820\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5.png\"><img width=\"1754\" height=\"877\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5.png\" class=\"attachment-full size-full\" alt=\"eBay - get rid of auctions\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5.png 1754w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5-300x150.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5-1024x512.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5-768x384.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-5-1536x768.png 1536w\" sizes=\"auto, (max-width: 1754px) 100vw, 1754px\" \/><\/a><\/figure>\n<p>Now, to save on page requests, increase the number of displayed results on the page. By default, eBay displays 50 listings. 
We can choose 200, which cuts the number of page requests needed to walk the filtered category by a factor of four. To do this, pick 200 in the results-per-page selector below the results block.<\/p>\n<figure id=\"attachment_mmd_821\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6.png\"><img width=\"1612\" height=\"637\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6.png\" class=\"attachment-full size-full\" alt=\"eBay - increase number of results to show\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6.png 1612w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6-300x119.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6-1024x405.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6-768x303.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-6-1536x607.png 1536w\" sizes=\"auto, (max-width: 1612px) 100vw, 1612px\" \/><\/a><\/figure>\n<p>Once all the filters and options are set, copy the URL from the address bar:<br>\n<a href=\"https:\/\/www.ebay.com\/sch\/Tablets-eBook-Readers\/171485\/i.html?_dcat=171485&#038;_fsrp=1&#038;_sacat=171485&#038;rt=nc&#038;Type=eBook%2520Reader&#038;LH_ItemCondition=1000&#038;LH_BIN=1&#038;_ipg=200\">https:\/\/www.ebay.com\/sch\/Tablets-eBook-Readers\/171485\/i.html?_dcat=171485&#038;_fsrp=1&#038;_sacat=171485&#038;rt=nc&#038;Type=eBook%2520Reader&#038;LH_ItemCondition=1000&#038;LH_BIN=1&#038;_ipg=200<\/a><\/p>\n<p>This will be our starting URL.<\/p>\n<p>Next, we need to disable JavaScript on the page. As usual, we will do this with the Quick Javascript Switcher extension for Google Chrome. Now open the developer tools in Google Chrome by pressing Ctrl + Shift + I. 
Then, using the tool to select elements, we will find the blocks we need on the page and CSS selectors for them. Firstly, we are interested in the listing block, and secondly, the link to the next page.<\/p>\n<figure id=\"attachment_mmd_822\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7.png\"><img width=\"1902\" height=\"889\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7.png\" class=\"attachment-full size-full\" alt=\"eBay - Developer tools\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7.png 1902w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7-300x140.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7-1024x479.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7-768x359.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-7-1536x718.png 1536w\" sizes=\"auto, (max-width: 1902px) 100vw, 1902px\" \/><\/a><\/figure>\n<p>First, let&#8217;s collect all the blocks with listings, that is, define a CSS selector for them.<\/p>\n<figure id=\"attachment_mmd_823\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8.png\"><img width=\"1920\" height=\"1080\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8.png\" class=\"attachment-full size-full\" alt=\"eBay - looking for CSS selector of the product block\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8.png 1920w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8-300x169.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8-1024x576.png 1024w, 
https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8-768x432.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-8-1536x864.png 1536w\" sizes=\"auto, (max-width: 1920px) 100vw, 1920px\" \/><\/a><\/figure>\n<p>CSS selector: <code>ul &gt; li.s-item<\/code>. To check it, search for the selector in the \u201cElements\u201d panel (press Ctrl + F there) and make sure it matches all the listings. It matches more than 200 elements, although there should be exactly 200. This is because eBay shows sponsored ads in addition to regular listings.<\/p>\n<figure id=\"attachment_mmd_824\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9.png\"><img width=\"1846\" height=\"826\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9.png\" class=\"attachment-full size-full\" alt=\"eBay - sponsored ads\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9.png 1846w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9-300x134.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9-1024x458.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9-768x344.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-9-1536x687.png 1536w\" sizes=\"auto, (max-width: 1846px) 100vw, 1846px\" \/><\/a><\/figure>\n<p>Filtering them out is not easy, but we will try. Click on one of the letters in the word \u201cSPONSORED\u201d and see which element opens in the \u201cElements\u201d window. 
We will see a set of span elements containing a random set of characters.<\/p>\n<figure id=\"attachment_mmd_825\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10.png\"><img width=\"1567\" height=\"512\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10.png\" class=\"attachment-full size-full\" alt=\"eBay - sponsored label\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10.png 1567w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10-300x98.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10-1024x335.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10-768x251.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-10-1536x502.png 1536w\" sizes=\"auto, (max-width: 1567px) 100vw, 1567px\" \/><\/a><\/figure>\n<p>We see that these elements have different classes. Evidently, \u201cSPONSORED\u201d appears on the page because the spans with one class are displayed while the spans with the other class are hidden. But we cannot hard-code the class we need: judging by its name, it is generated dynamically rather than static, so we cannot rely on it. However, if one class is shown and the other is hidden, there must be CSS somewhere on the page that sets this rule. 
To find it, search the elements for the class name; the page turns out to contain the following code:<\/p>\n<pre><code class=\"language-html\">&lt;style type=&quot;text\/css&quot;&gt;\nspan.s-m1yuhh {\n    display: inline;\n}\nspan.s-o2xlx7k {\n    display: none;\n}\n&lt;\/style&gt;\n<\/code><\/pre>\n<p>Our task now is to construct a selector for this style element and check it in the \u201cElements\u201d window, making sure that it selects exactly one element on the page.<\/p>\n<p><code>style:contains(&quot;display: inline;&quot;)<\/code> \u2013 such a selector will work just fine for our task.<\/p>\n<p>Now that we have found the desired element, we need to pull out the name of the class that is displayed. To do this, we will use the <strong>filter<\/strong> option of the <strong>parse<\/strong> command. The following regular expression captures the class name we need:<\/p>\n<p><code>span\\.([^\\s\\{]+)\\s*\\{\\s*display\\:\\s*inline;<\/code><\/p>\n<p>Using this regular expression, we will extract the class name: <strong>s-m1yuhh<\/strong>.<br>\nNow that we have the class, we can go into the <code>span[role=&quot;text&quot;]<\/code> element and remove all <strong>span<\/strong> elements whose class differs from ours (which we will save to the <em>class<\/em> variable beforehand). 
You can do this with the <strong>node_remove<\/strong> command:<\/p>\n<pre><code class=\"language-yaml\">- node_remove: span:not(.&lt;%class%&gt;)\n<\/code><\/pre>\n<p>Then use the <strong>parse<\/strong> command and compare the result in the register with the string \u201cSPONSORED\u201d using the <strong>if<\/strong> command.<\/p>\n<p>This way we can skip sponsored listings.<\/p>\n<p>Now you can look into the listing block and select the fields that we will collect:<\/p>\n<figure id=\"attachment_mmd_826\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11.png\"><img width=\"1563\" height=\"831\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11.png\" class=\"attachment-full size-full\" alt=\"eBay - CSS selectors for product properties\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11.png 1563w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11-300x160.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11-1024x544.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11-768x408.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-11-1536x817.png 1536w\" sizes=\"auto, (max-width: 1563px) 100vw, 1563px\" \/><\/a><\/figure>\n<p>From this page we will collect:<\/p>\n<ul>\n<li>product name: <code>h3.s-item__title<\/code><\/li>\n<li>product price: <code>span.s-item__price<\/code><\/li>\n<li>shipping cost: <code>span.s-item__shipping.s-item__logisticsCost<\/code><\/li>\n<li>listing rating: <code>div.b-starrating &gt; span.clipped<\/code><\/li>\n<li>number of reviews: <code>span.s-item__reviews-count &gt; span:not(.clipped)<\/code><\/li>\n<li>number of people watching this listing: <code>span.s-item__hotness:contains(&quot;Watching&quot;)<\/code><\/li>\n
<li>number of items sold: <code>span.s-item__hotness:contains(&quot;Sold&quot;)<\/code><\/li>\n<li>product link: <code>a.s-item__link<\/code><\/li>\n<\/ul>\n<p>In this article, we restrict ourselves to collecting data from the catalog page without visiting the product pages. You can improve the eBay scraper yourself by adding logic for scraping the product page. To do this, use the <strong>walk<\/strong> command to go to the product page and the <strong>find<\/strong> command to go through the blocks using CSS selectors.<\/p>\n<p>Now that we have split the product data block into its constituent parts and got the selectors, let&#8217;s find a selector for the link to the next page of the category.<\/p>\n<figure id=\"attachment_mmd_827\" class=\"wp-block-image aligncenter\"><a href=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12.png\"><img width=\"1575\" height=\"678\" src=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12.png\" class=\"attachment-full size-full\" alt=\"eBay - CSS selector of the link to the next page\" decoding=\"async\" loading=\"lazy\" align=\"center\" srcset=\"https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12.png 1575w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12-300x129.png 300w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12-1024x441.png 1024w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12-768x331.png 768w, https:\/\/www.diggernaut.com\/blog\/wp-content\/uploads\/2020\/01\/ebay-12-1536x661.png 1536w\" sizes=\"auto, (max-width: 1575px) 100vw, 1575px\" \/><\/a><\/figure>\n<p>The image shows that the selector <code>a[rel=&quot;next&quot;]<\/code> is suitable for us. 
It matches a single element on the page, so all that remains is to parse its href attribute and add the value to the link pool.<\/p>\n<p>We are ready to write a digger configuration:<\/p>\n<pre><code class=\"language-yaml\">---\nconfig:\n    debug: 2\n    agent: Mozilla\/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/71.0.3578.98 Safari\/537.36\ndo:\n# Add starting URL to the pool of the links\n- link_add:\n    url:\n    - https:\/\/www.ebay.com\/sch\/Tablets-eBook-Readers\/171485\/i.html?_dcat=171485&amp;_fsrp=1&amp;_sacat=171485&amp;rt=nc&amp;Type=eBook%2520Reader&amp;LH_ItemCondition=1000&amp;LH_BIN=1&amp;_ipg=200\n# Iterate over the pool and visit every page\n- walk:\n    to: links\n    do:\n    # Clear variables\n    - variable_clear: class\n    # Find the link to the next page\n    - find:\n        path: a[rel=&quot;next&quot;]\n        do:\n        # Parse value from the href attribute\n        - parse:\n            attr: href\n        # Do standard register value clean-up\n        - space_dedupe\n        - trim\n        # Check if value is not empty\n        - if:\n            match: \\w+\n            do:\n            # Add value to the pool of the links\n            - link_add\n    # Find CSS block to extract the &quot;SPONSORED&quot; class\n    - find:\n        path: &#039;style:contains(&quot;display: inline;&quot;)&#039;\n        do:\n        # Extract class name and save it to the variable\n        - parse:\n            filter: &#039;span\\.([^\\s\\{]+)\\s*\\{\\s*display\\:\\s*inline;&#039;\n        - variable_set: class\n    # Find all blocks with listings\n    - find:\n        path: ul &gt; li.s-item\n        do:\n        # Clear variables\n        - variable_clear: sponsored\n        # Detect SPONSORED listing\n        - find:\n            path: span[role=&quot;text&quot;]\n            do:\n            - node_remove: span:not(.&lt;%class%&gt;)\n            - parse\n            - space_dedupe\n            - trim\n      
      - if:\n                match: SPONSORED\n                do:\n                - variable_set:\n                    field: sponsored\n                    value: 1\n        # Check if listing is SPONSORED\n        - variable_get: sponsored\n        - if:\n            match: 1\n            else:\n            # It&#039;s a regular listing, extract it\n            # Create an object to store the data\n            - object_new: item\n            # Extract the product name\n            - find:\n                path: h3.s-item__title\n                do:\n                - parse\n                - space_dedupe\n                - trim\n                - object_field_set:\n                    object: item\n                    field: name\n            # Extract price of the product\n            - find:\n                path: span.s-item__price\n                do:\n                # Extract only digits; there may be 2 prices (from\/to), so we use 2 regexes to handle both cases\n                - parse:\n                    filter:\n                    - \\$([0-9\\.]+)\\s+to\n                    - \\$([0-9\\.]+)\n                # Check if register value has digits\n                - if:\n                    match: \\d+\n                    do:\n                    # Save value to the price field as float type\n                    - object_field_set:\n                        object: item\n                        field: price\n                        type: float\n            # Extract delivery cost\n            - find:\n                path: span.s-item__shipping.s-item__logisticsCost\n                do:\n                # Check if delivery is free\n                - parse\n                - if:\n                    match: Free\n                    do:\n                    - register_set: 0.0\n                    - object_field_set:\n                        object: item\n                        field: delivery\n                        type: float\n           
         else:\n                    # Parse delivery cost\n                    - parse:\n                        filter:\n                        - \\$([0-9\\.]+)\n                    - if:\n                        match: \\d+\n                        do:\n                        - object_field_set:\n                            object: item\n                            field: delivery\n                            type: float\n            # Extract the listing rating\n            - find:\n                path: div.b-starrating &gt; span.clipped\n                do:\n                # Parse only digits\n                - parse:\n                    filter: ([0-9\\.]+)\\s+out\n                - if:\n                    match: \\d+\n                    do:\n                    - object_field_set:\n                        object: item\n                        field: rating\n                        type: float\n            # Extract the number of reviews (selector from the field list above, not the rating block)\n            - find:\n                path: span.s-item__reviews-count &gt; span:not(.clipped)\n                do:\n                # Only digits are what we need\n                - parse:\n                    filter: (\\d+)\n                - if:\n                    match: \\d+\n                    do:\n                    - object_field_set:\n                        object: item\n                        field: reviews\n                        type: int\n            # Extract the number of people watching this listing\n            - find:\n                path: span.s-item__hotness:contains(&quot;Watching&quot;)\n                do:\n                # Only digits again\n                - parse:\n                    filter: (\\d+)\n                - if:\n                    match: \\d+\n                    do:\n                    - object_field_set:\n                        object: item\n                        field: watching\n                        type: int\n            # Extract number of sold products\n            - find:\n     
           path: span.s-item__hotness:contains(&quot;Sold&quot;)\n                do:\n                # Digits, digits and nothing more\n                - parse:\n                    filter: (\\d+)\n                - if:\n                    match: \\d+\n                    do:\n                    - object_field_set:\n                        object: item\n                        field: sold\n                        type: int\n            # Link to the listing page\n            - find:\n                path: a.s-item__link\n                do:\n                - parse:\n                    attr: href\n                - space_dedupe\n                - trim\n                - if:\n                    match: \\w+\n                    do:\n                    - normalize:\n                        routine: url\n                    - object_field_set:\n                        object: item\n                        field: url\n            # Save the object with data\n            - object_save:\n                name: item\n    # Pause between pages so as not to overload eBay\n    - sleep: 3\n<\/code><\/pre>","protected":false},"excerpt":{"rendered":"<p>eBay is a very famous and popular marketplace. Very often, it is used by small sellers to sell goods, the same way as on Amazon. Therefore, the data from it can be used to assess a trading niche when entering a market with a new product. 
The eBay site is quite simple, does not use [&hellip;]<\/p>","protected":false},"author":4,"featured_media":832,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[31,30,9,2],"tags":[],"class_list":["post-813","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ecommerce-scraping","category-free-scrapers","category-learning-meta-language","category-web-scraping"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts\/813","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/comments?post=813"}],"version-history":[{"count":6,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts\/813\/revisions"}],"predecessor-version":[{"id":831,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/posts\/813\/revisions\/831"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/media\/832"}],"wp:attachment":[{"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/media?parent=813"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/categories?post=813"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.diggernaut.com\/blog\/wp-json\/wp\/v2\/tags?post=813"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}