ML architecture for text pattern detection?
I'd like to train a model that takes structured text (HTML) as input and outputs the relevant text. The goal is to build a pipeline that, given a manufacturer's web page for a technical product spec, outputs the relevant data. Each manufacturer's website (there are about 50 of them) is structured differently, but the data is generally in an HTML table, sometimes laid out in rows, sometimes in columns.
Anyone have links to papers or something I can read to get started? Or is this even a thing that exists?
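
To make the task concrete, here is a minimal rule-based sketch of the input and output I have in mind, not a trained model. It assumes the spec data sits in plain, non-nested `<table>` elements, and that a column-style table marks its header row with `<th>` cells. The names `TableParser` and `extract_specs` are made up for illustration; only the Python standard library is used.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> as a list of rows; each cell is (tag, text).

    Assumption: tables are not nested inside one another.
    """

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None        # cells of the row currently being read
        self._cell_tag = None   # "td" or "th" while inside a cell
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell_tag = tag
            self._text = []

    def handle_data(self, data):
        if self._cell_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell_tag is not None:
            # Collapse runs of whitespace inside the cell text.
            text = " ".join("".join(self._text).split())
            self._row.append((self._cell_tag, text))
            self._cell_tag = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell_tag = None


def extract_specs(html):
    """Return a {label: value} dict from row-style or column-style tables."""
    parser = TableParser()
    parser.feed(html)
    specs = {}
    for table in parser.tables:
        rows = [row for row in table if row]
        if not rows:
            continue
        header_first = all(tag == "th" for tag, _ in rows[0])
        if header_first and len(rows) >= 2:
            # Column layout: labels in the first row, values in the second.
            labels = [text for _, text in rows[0]]
            values = [text for _, text in rows[1]]
            specs.update(zip(labels, values))
        else:
            # Row layout: each row is a label cell followed by a value cell.
            for row in rows:
                if len(row) >= 2:
                    specs[row[0][1]] = row[1][1]
    return specs


row_html = ("<table><tr><th>Weight</th><td>2 kg</td></tr>"
            "<tr><th>Voltage</th><td>12 V</td></tr></table>")
col_html = ("<table><tr><th>Weight</th><th>Voltage</th></tr>"
            "<tr><td>2 kg</td><td>12 V</td></tr></table>")
print(extract_specs(row_html))
print(extract_specs(col_html))
```

Both sample tables should come out as the same label/value mapping. A heuristic like this breaks as soon as a site uses only `<td>` cells for a column layout, nests tables, or puts specs outside a table, which is the part I was hoping a learned model could handle.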