HACKER Q&A
📣 bredren

Have you run Microsoft Word in headless mode?


Have you or your organization run Microsoft Word in headless mode? How do you do it, and why?

For example, "a node.js server on a Windows machine that takes a file, saves it onto disk and runs a Visual Basic application on it which itself starts Microsoft Word headless to [perform some operation]" [1]

I need to manipulate Word documents that include accurate page numbers for elements like Table of Contents and List of Figures. These dynamic elements are populated on first open by Word (with user approval).

I'd like to automate all of this, and potentially manipulate the resulting .docx afterward.

From my initial research, doing this well requires processing by Word's layout engine.

I need this to be available as a web service to another application written in Python.

Other solutions I've considered include using LibreOffice in headless mode, or Aspose.Words [2], a commonly mentioned but high-priced solution.

[1] https://news.ycombinator.com/item?id=17425712

[2] https://products.aspose.app/words


👤 ksherlock Accepted Answer ✓
This advice is old enough to drive, but Joel Spolsky suggested doing just that.

"Word and Excel have extremely complete object models, available via COM Automation, which allow you to programmatically do anything. In many situations, you are better off reusing the code inside Office rather than trying to reimplement it. Here are a few examples..."

https://www.joelonsoftware.com/2008/02/19/why-are-the-micros...
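
For what it's worth, here is a minimal Python sketch of the COM Automation route. It assumes Windows, an installed copy of Word, and the pywin32 package. The function name and the constant name are made up for illustration, and the sequence of object-model calls (Repaginate, then updating the tables of contents, tables of figures and remaining fields) is my guess at what the page-number use case needs, not a tested recipe.

```python
import os
import sys

# Numeric value of Word's wdFormatXMLDocument (.docx) save format.
WD_FORMAT_XML_DOCUMENT = 12


def update_fields_and_save(src_path, dst_path):
    """Open src_path in a hidden Word instance, refresh TOC/figure tables
    and all other fields, then save the result to dst_path as .docx."""
    if sys.platform != "win32":
        raise RuntimeError("Word COM automation needs Windows with Word installed")

    # Imported lazily so the module can at least be loaded on other platforms.
    import win32com.client  # provided by the pywin32 package

    # DispatchEx asks for a fresh Word process instead of attaching to one
    # the user may already have open.
    word = win32com.client.DispatchEx("Word.Application")
    word.Visible = False
    word.DisplayAlerts = 0  # wdAlertsNone: suppress prompts
    try:
        doc = word.Documents.Open(os.path.abspath(src_path))
        try:
            doc.Repaginate()  # let the layout engine settle page numbers
            for toc in doc.TablesOfContents:
                toc.Update()
            for tof in doc.TablesOfFigures:
                tof.Update()
            doc.Fields.Update()  # cross-references, page fields, etc.
            doc.SaveAs2(os.path.abspath(dst_path), WD_FORMAT_XML_DOCUMENT)
        finally:
            doc.Close(0)  # wdDoNotSaveChanges: already saved above
    finally:
        word.Quit()
```

Absolute paths matter here because Word resolves relative paths against its own working directory, not the caller's. If this sits behind a web service, it is probably safest to run one document at a time per Word instance.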


👤 nullindividual
Microsoft essentially says 'don't install Word server-side' [0]. The 3rd party you've linked to would be a more appropriate route.

You can automate in other ways, such as interacting with COM via a scripting language or development language client-side. That is supported. If you're dead-set on a web-based application, then yeah, go 3rd party.

[0] https://support.microsoft.com/en-us/topic/considerations-for...


👤 billylo
I've used this approach recently: put the logic into a Word macro and trigger it from the command line with the /m switch (/mmacroname).

https://support.microsoft.com/en-us/office/command-line-swit...
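
A rough sketch of driving that from Python, since the caller in the question is a Python app. The helper names, the example macro name and the timeout are hypothetical; only the /m switch itself comes from the page linked above. Putting the document before the switch is an assumption on my part, on the theory that the macro should run after the document is open.

```python
import subprocess


def build_word_command(winword_exe, document, macro_name):
    """Build the argument list for launching Word on a document with /m.

    The switch is written as /m immediately followed by the macro name,
    with no space in between.
    """
    return [winword_exe, document, f"/m{macro_name}"]


def run_word_macro(winword_exe, document, macro_name, timeout=300):
    """Launch Word and wait for it to exit (or raise on timeout)."""
    cmd = build_word_command(winword_exe, document, macro_name)
    return subprocess.run(cmd, check=True, timeout=timeout)


if __name__ == "__main__":
    # Hypothetical paths and macro name, printed rather than executed.
    print(build_word_command("WINWORD.EXE", r"C:\jobs\report.docx", "UpdateAllFields"))
```

Note that Word will not exit on its own after the macro finishes, so the macro presumably needs to save the document and quit the application; otherwise `run_word_macro` just blocks until the timeout fires.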


👤 x0x0
For reasons, I've written code using docx4j to convert to and from .docx. We did examine Aspose.

Given the complexity of all the things you can do with Word and the immense feature set, if high fidelity is the goal, I don't believe using anything but actual Word and the real Word layout engine will give you that. You'll have to figure out if the licensing problem is one you can live with.


👤 transfire
My company just bought a used Olympus microscope. It requires Word to generate metrology reports. So apparently it runs Word behind the scenes. Is that considered a headless mode?

Unfortunately M$ Word is the only option. Can’t say I’m happy about it. But it is what it is.


👤 jellykid
My first thought was PowerApps: https://www.youtube.com/watch?v=s6NaIYP3-_w

👤 solardev
What about the Word web app? Can you use Playwright or similar to fake a login to that and extract the document data there?

👤 wruza
I automated Excel and another COM app at work with `npm i winax`. Word should be similar. There are a few gotchas with reference counts and MS’s idiotic out-of-process behaviors, but otherwise it works.

LibreOffice cannot open even simple documents correctly IME, but YMMV.

In case the “user approval” isn’t part of the COM connection, I’d look at headful Word instead, with an AHK/etc. script triggering it.
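
On the “user approval” prompt: one place it can come from is the `w:updateFields` setting in `word/settings.xml` inside the .docx, which asks Word to recalculate fields when the file is opened. A small stdlib-only Python sketch for checking whether a given file carries that flag follows. The function names are made up, and the demo builds a stripped-down ZIP that is just enough for the check, not a valid Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/settings.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def asks_to_update_fields(docx_file):
    """Return True if the package's settings request a field update on open."""
    with zipfile.ZipFile(docx_file) as zf:
        if "word/settings.xml" not in zf.namelist():
            return False
        root = ET.fromstring(zf.read("word/settings.xml"))
    node = root.find(f"{{{W_NS}}}updateFields")
    if node is None:
        return False
    # For on/off elements, a missing w:val attribute means "true".
    val = node.get(f"{{{W_NS}}}val", "true")
    return val.lower() in ("true", "1", "on")


def make_demo_package(flag):
    """Build an in-memory ZIP holding only a minimal settings.xml (demo only)."""
    inner = '<w:updateFields w:val="true"/>' if flag else ""
    settings = f'<w:settings xmlns:w="{W_NS}">{inner}</w:settings>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/settings.xml", settings)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(asks_to_update_fields(make_demo_package(True)))
    print(asks_to_update_fields(make_demo_package(False)))
```

This only tells you whether the file asks for an update; the actual page numbers still have to come from a layout engine.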