HACKER Q&A
📣 PaulHoule

Server-side HTML Templates based on DOM?


For side projects I've used BeautifulSoup in Python to break down HTML documents into the DOM, manipulate them at the DOM level, then serve them to a browser.
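
A minimal sketch of the parse, manipulate, serialize workflow described above, offered as an illustration rather than the poster's actual code. It swaps BeautifulSoup for the standard library's xml.etree.ElementTree so that it runs without dependencies, which means it assumes well-formed (XHTML-style) markup. The `content` placeholder id, the sample pages, and the `render` and `find_by_id` helpers are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical "main template": a page with a placeholder element for content.
TEMPLATE = """<html>
  <head><title>Site</title></head>
  <body>
    <h1>My Site</h1>
    <div id="content"></div>
  </body>
</html>"""

# Hypothetical guest page: itself a complete HTML document.
CONTENT = """<html>
  <head><title>Profile</title></head>
  <body>
    <div>First Name: <span id="insertFirstName"></span></div>
  </body>
</html>"""


def find_by_id(root, element_id):
    """Return the first descendant whose id attribute matches, or None."""
    return root.find(f".//*[@id='{element_id}']")


def render(template_src, content_src, values):
    """Fill placeholders in the guest page, then merge it into the template."""
    template = ET.fromstring(template_src)
    content = ET.fromstring(content_src)

    # Insert each value as a text node into the element with the matching id.
    for element_id, text in values.items():
        target = find_by_id(content, element_id)
        if target is not None:
            target.text = text

    # Move the child elements of the guest <body> into the placeholder.
    slot = find_by_id(template, "content")
    slot.extend(list(content.find("body")))

    # Merge metadata: let the guest page's <title> pass through.
    guest_title = content.find("head/title")
    if guest_title is not None:
        template.find("head/title").text = guest_title.text

    # Serialize the modified tree back to an HTML string.
    return ET.tostring(template, encoding="unicode", method="html")


if __name__ == "__main__":
    print(render(TEMPLATE, CONTENT, {"insertFirstName": "Ada <Lovelace>"}))
```

Because the value is set as a text node rather than spliced into a string, the serializer escapes it, so `Ada <Lovelace>` comes out as `Ada &lt;Lovelace&gt;` in the served page.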

For instance, the "main template" for a site could be an HTML page that has a placeholder element inside it for the content. The content is itself an HTML page: the relevant part of the content is inserted into that placeholder, and the metadata gets merged so that it passes through.<p>You might write<p><pre><code> &lt;div&gt;First Name: &lt;span id="insertFirstName"&gt;&lt;/span&gt;&lt;/div&gt; </code></pre> and the system would insert the firstName into that span, for instance.<p>I've done the same thing in Java with tools like jsoup and HTMLUnit.<p>This system can rewrite links and do other interesting transformations on the guest HTML.<p>One thing I'll concede is that this is slow compared to the alternatives. In a crude test, an unoptimized system like this might be able to transform 10-50 pages per second on a core, whereas it is easy to get 3000+ with string-based templates.<p>I haven't seen other people do this. Do you know of any examples of DOM-based templating? Any thoughts?</h4> </div> </div> <hr> <div class="card"> <div class="card-header"> <span alt="fabianholzer - Accepted Answer" title="fabianholzer - Accepted Answer" class="ps-1 btn btn-success text-white">   👤 fabianholzer</span> <span class="text-success"> Accepted Answer ✓</span> </div> <div class="card-body text-start"> <h5>If you consider static site generators as a special case of server-side rendering, then that is basically the principle that underlies my personal SSG [0].<p>High-level idea: it reads in all HTML files, parses them with JSDOM, and builds a metamodel of the overall website, which (ab-)uses standard HTML meta tags as an alternative to front matter (e.g. &lt;meta name="template" content="articles"&gt; declares the particular file as the template for the category articles, while &lt;meta name="category" content="article"&gt; declares a particular file as part of the category articles). It then uses a non-standard HTML tag to call JS plugins with the whole metamodel of the site, so it can plug together content and template and rewrite parts of the DOM (for example add bidirectional links when desired, run a syntax highlighter on &lt;code&gt; tags, count words). When all plugins have run, the DOM is serialized into HTML. 
Admittedly it could be a tad faster, but for my purposes it is good enough. I don't mind running a ton of JS on my machine, and I can still serve mostly static HTML with very few sprinkles of client-side JS.<p>[0] <a href="https://github.com/fnh/intertwingle/">https://github.com/fnh/intertwingle/</a></h5> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 sntran </div> <div class="card-body text-start"> <h6>Cloudflare has a thing called HTMLRewriter that is both a streaming HTML parser and a selector-based API for changing the HTML. It's very efficient and lets you change only the places you need.<p>For your example, you can just match on `[id^="insert"]` and add the actual firstName value in the handler.<p>I use it a lot on the server side, but it is for JavaScript, or for Rust if you use the lower-level lol-html.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 nchmy </div> <div class="card-body text-start"> <h6>I'm not quite sure what you're doing, but I wonder if a new-ish browser API, Declarative Shadow DOM, could be useful here. It allows for sending a sort of HTML shell to the browser that contains pre-defined placeholders, and then asynchronously sending the HTML fragments for those placeholders. No JS required.<p>It recently got support from all the major browsers.<p>- <a href="https://developer.chrome.com/docs/css-ui/declarative-shadow-dom" rel="nofollow">https://developer.chrome.com/docs/css-ui/declarative-shadow-...</a><p>- <a href="https://lamplightdev.com/blog/2024/01/10/streaming-html-out-of-order-without-javascript/" rel="nofollow">https://lamplightdev.com/blog/2024/01/10/streaming-html-out-...</a><p>I'm also considering trying something similar to what you describe, and thought that perhaps GoQuery (essentially jQuery written in much faster Go) could be useful here. <a href="https://github.com/PuerkitoBio/goquery">https://github.com/PuerkitoBio/goquery</a><p>If you need to do the scraping stuff, Colly could be used (which uses GoQuery under the hood). 
<a href="https://go-colly.org/" rel="nofollow">https://go-colly.org/</a><p>You could also create a sort of hypermedia "single page app" if you serve the shell once, and then use something like HTMX (or even jQuery) to swap the body and other HTML fragments without a full page reload.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 conradfr </div> <div class="card-body text-start"> <h6>Is this really different from Jinja2 or Twig with template inheritance and blocks?</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 MatmaRex </div> <div class="card-body text-start"> <h6>XSLT would be a classic answer, but it's a bit of a lost ancient technology these days. I've only used it in the browser for a lark, but I suppose you'd find mature (old) libraries for working with it easily enough.<p>PHPTAL is another system I've heard of (but never used) that seems to be built on this idea. It is also very mature (old). <a href="https://phptal.org/" rel="nofollow">https://phptal.org/</a><p>In general, working with the DOM doesn't seem to be very cool these days. Everyone uses template systems that are glorified string concatenators, like Mustache, and doesn't care about security issues due to bad escaping.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 solardev </div> <div class="card-body text-start"> <h6>I'm not sure what the advantage of this is vs any standard server-side templating (PHP, Twig, Mustache, Liquid, etc.).<p>Converting from a string to a DOM is very expensive because of all the non-XML, non-standard cruft HTML has accumulated over the years. If you can do a simple string find/replace instead, it should be less computationally expensive. 
There is no reason to parse the HTML unless you need to render it or at least work with it semantically/structurally, which doesn't sound like the case here.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 TheRealPomax </div> <div class="card-body text-start"> <h6>How does this differ from using a templating engine where the server gets a template, and instantiates it to "the final thing it should be" on a per-request basis?<p>(Also, if you need to transform 200 requests per second and that's slowing things down, there's almost certainly a cache layer missing?)</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 fzzzy </div> <div class="card-body text-start"> <h6>I made a web framework a long time ago that did this. <a href="https://twisted.org/documents/8.2.0/web/howto/woven.html" rel="nofollow">https://twisted.org/documents/8.2.0/web/howto/woven.html</a></h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 mk12 </div> <div class="card-body text-start"> <h6>The work-in-progress Zine SSG project <a href="https://zine-ssg.io/" rel="nofollow">https://zine-ssg.io/</a> calls its templating Super, and it works just like this: directives are written in HTML rather than on top of it.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 newzisforsukas </div> <div class="card-body text-start"> <h6>You would likely use other attributes aside from id, and there are a large number of JS-based templating frameworks that do this.<p><a href="https://github.com/Floofies/cdaTemplate">https://github.com/Floofies/cdaTemplate</a></h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 rbanffy </div> <div class="card-body text-start"> <h6>The Zope application server uses something like that, ZPT (Zope Page Templates) and TAL (Template Attribute Language), IIRC. 
Plone, a CMS built on Zope uses them extensively.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 cxr </div> <div class="card-body text-start"> <h6>> Any thoughts?<p>10–50 pages per second is already more output than a human can produce input for, so it shouldn't be considered a problem.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 frmdstryr </div> <div class="card-body text-start"> <h6>enaml-web does this using lxml. It works fine for a limited number of concurrent users.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 delanyoyoko </div> <div class="card-body text-start"> <h6>Probably raw Svelte can be used as a frontend template to your backend</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 cynicalsecurity </div> <div class="card-body text-start"> <h6>PHP.</h6> </div> </div> <hr> <div class="card"> <div class="card-header"> 👤 noahlt </div> <div class="card-body text-start"> <h6>The end of this road is React's Server Side Rendering and Next.js.<p>This approach only really has one major benefit over string-based templating, namely, it lets you re-use code across server and client side DOM logic.</h6> </div> </div> </div> <hr> </div> </body> </html>