Need some tips for web crawling

Question

Hello fellow hackers,

I am trying to create a tool for myself that can crawl a few websites I usually visit, so I can compare the price of the same item across them. They have no APIs.

A couple of questions:

1. Is this legal?
2. If I am crawling, what is the best way to approach this? Does each website's crawling mechanism have to be written manually since the sites are all unique, or is there some strategy for scaling if I need to expand the number of sites I crawl in the future?

Thank you!

-F75

elliewithcolor · Accepted Answer

Yes, it's legal. Just don't check the price every 2 seconds from 800 locations.

A simple way would be a headless browser [1]. But there are also hosted tools that work like a website builder.

The best way is: keep it simple and hold back (check once an hour or once a day, not every minute).

Many shops use Schema.org markup. So if the sites support it, you don't have to write a separate extractor for every site.

You could also use a library that works with raw HTML and CSS. Then you could just use CSS selectors for extraction.

[1] https://www.atlantbh.com/building-a-dynamic-crawler-with-pup...
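
To make the Schema.org suggestion a bit more concrete, here is a rough sketch in Python using only the standard library. It assumes the shop embeds its product data as JSON-LD in a `<script type="application/ld+json">` tag, which is one common way of publishing Schema.org markup but not the only one (some sites use microdata attributes instead). The names `JsonLdCollector`, `extract_offers` and `SAMPLE_HTML` are made up for this example, and fetching the page is left out on purpose.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_offers(html):
    """Return (name, price, currency) for each top-level Schema.org Product."""
    collector = JsonLdCollector()
    collector.feed(html)
    results = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole run
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict) or item.get("@type") != "Product":
                continue
            offers = item.get("offers", {})
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            if not isinstance(offers, dict):
                offers = {}
            results.append(
                (item.get("name"), offers.get("price"), offers.get("priceCurrency"))
            )
    return results


# Hypothetical page content, standing in for HTML you fetched yourself.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Widget",
 "offers": {"@type": "Offer", "price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body></body></html>"""

print(extract_offers(SAMPLE_HTML))
```

The same function can then be pointed at any shop that publishes this kind of markup, so only sites without it need hand-written CSS selectors. The sketch only looks at top-level `Product` objects; sites that nest them (for example under `@graph`) would need extra handling.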