HACKER Q&A
📣 thinkloop

Reliable scraper service that returns title, image, and blurb for a given URL?


I would like to show relevant information (primary image, title, blurb, video embed, etc.) for links my users submit, similar to what chat services like WhatsApp and Messenger do when a URL is pasted into the chat.

Are there any reliable services that provide this?


  👤 lomutinaci Accepted Answer ✓
I found one that works most of the time and helps especially with captchas. Have you tried proxycrawl?

👤 nyuszika7h
These chat services usually extract the info from OpenGraph tags: https://ogp.me/
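
To make that concrete, here is a minimal sketch of pulling Open Graph tags out of a page using only the Python standard library. It assumes you have already fetched the HTML yourself (fetching is left out on purpose), and the names `OpenGraphParser` and `extract_preview` are made up for this example, not part of any library. The fallback to `<title>` when `og:title` is missing is my own choice, not something the Open Graph protocol requires.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..."> tags and the <title> text."""

    def __init__(self):
        super().__init__()
        self.og = {}           # e.g. {"title": "...", "image": "..."}
        self.title = ""        # plain <title> text, used as a fallback
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            prop = attrs.get("property") or ""
            content = attrs.get("content")
            if prop.startswith("og:") and content is not None:
                # Keep the first value seen for each property.
                self.og.setdefault(prop[3:], content)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_preview(html):
    """Return a dict with title, image, description, and video (None if absent)."""
    parser = OpenGraphParser()
    parser.feed(html)
    og = parser.og
    return {
        "title": og.get("title") or parser.title.strip() or None,
        "image": og.get("image"),
        "description": og.get("description"),
        "video": og.get("video"),
    }


sample = """<html><head><title>Fallback title</title>
<meta property="og:title" content="The Rock" />
<meta property="og:image" content="https://example.com/rock.jpg" />
<meta property="og:description" content="A 1996 film." />
</head><body></body></html>"""

preview = extract_preview(sample)
print(preview["title"], preview["image"])
```

Pages without any `og:` tags will only give you the `<title>` fallback here, so a real previewer would probably want more fallbacks (meta description, first large image, and so on).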

👤 buboard
I'm not sure how feasible it is to do this reliably. Some services, e.g. Cloudflare, block scrapers like cURL and redirect them to a captcha page or to an "Attention Required" page that asks you to enable JavaScript. It seems the owners of these sites need to explicitly allow the scraper by IP! This pretty much precludes a link previewer from working reliably, and Cloudflare is used by A LOT of sites these days.

https://support.cloudflare.com/hc/en-us/articles/217720788-T...
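
If you roll your own previewer anyway, you can at least try to notice when you got a challenge page instead of the real one, and skip the preview rather than show "Attention Required" as the link title. This is only a rough heuristic sketch: the function name, the status codes, and the marker strings are all my assumptions, not anything documented by Cloudflare, and they will miss cases or misfire.

```python
import re

# Assumptions for illustration only: status codes and title phrases
# that often accompany a challenge page. Tune these for your own traffic.
CHALLENGE_STATUSES = {403, 503}
CHALLENGE_TITLE_MARKERS = ("attention required", "captcha")


def looks_like_challenge(status_code, html):
    """Guess whether a response is a block/challenge page, not the real content."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    title = match.group(1).lower() if match else ""
    if any(marker in title for marker in CHALLENGE_TITLE_MARKERS):
        return True
    # A captcha mentioned in the body only counts together with an error status,
    # so ordinary articles that talk about captchas are not flagged.
    return status_code in CHALLENGE_STATUSES and "captcha" in html.lower()
```

When this returns True, falling back to showing the bare URL is probably the safest option.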