I struggle to grok a system if I can’t read about the context. What are the goals/antigoals? What constraints was the system built under? What big architectural choices were evaluated (e.g. Serverless vs Containers)?
Digging deeper, I also see a lack of consideration for non-functional attributes like Performance, Capacity and High Availability.
Cloud paradigms can help to an extent with auto-scaling etc., but I feel there should still be a model for the expected Perf/Capacity, given the selected SW+HW stack.
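To make "a model for the expected Perf/Capacity" concrete, here is a back-of-envelope sketch. Everything in it is an assumption for illustration: the function name, the 60% utilisation target, the single spare instance, and the traffic numbers are hypothetical, and it treats each worker as handling one request at a time.

```python
import math

def instances_needed(peak_rps, workers_per_instance, mean_latency_s,
                     target_utilisation=0.6, spare_instances=1):
    """Rough instance count for a peak load (illustrative model only).

    Assumes each worker handles one request at a time, so one instance
    can sustain roughly workers / latency requests per second.
    """
    per_instance_rps = workers_per_instance / mean_latency_s
    # Leave headroom: plan to run each instance at a fraction of its maximum.
    usable_rps = per_instance_rps * target_utilisation
    # Round up, then add spares so losing one instance does not cause overload.
    return math.ceil(peak_rps / usable_rps) + spare_instances

# Hypothetical numbers: 1200 req/s peak, 16 workers per instance, 50 ms mean latency.
print(instances_needed(peak_rps=1200, workers_per_instance=16, mean_latency_s=0.05))
```

Even a model this crude gives you something to compare against what auto-scaling actually does in Prod.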
In the case of High Availability, I continually see teams running critical Prod services without clearly documenting the expected behaviour if an Instance, AZ/Datacentre, Region or an external dependency goes down.
All of these observations have been in highly regulated industries with significant tech budgets.
So HN:
Do your teams document these architectural designs?
Have I perhaps lost touch with modern practices?
How do you grow teams (or defend the design from Mgmt / Auditors) if this information is held in people's heads?
Appreciate your input and apologies as English is my second language.
Ultimately, I blame management and incentives. I think most line levels would like to spend a lot more time making things bullet proof, documented, commented, automated, and diagrammed. But management and leadership want features delivered and progress made, even at the cost of a less robust and understood system. This seems like a reasonable trade off, but it should be made consciously. I don't think that it usually is.
To answer your questions -
Do your teams document these architectural designs? Barebones wiki. Terraform.
Have I perhaps lost touch with modern practices? The tech industry is VERY heterogeneous in terms of practice. It's not the '80s and '90s, with the regimented "this is how we build and deliver software projects" waterfallesque processes. A lot of people have been burned by that over the years and have decided to go to the other extreme and throw a lot of process out of the window. A lot are doing scrum, or kanban, or agile, or any of a dozen other names for how we actually get shit done.
How do you grow teams (or defend the design from Mgmt / Auditors) if this information is held in people's heads? Tribal knowledge dissemination to new hires by various current employees.
I'm currently putting together a design document (very simple system and still high-level) to ensure my idea is communicated concisely and unambiguously.
The main motivation is I'll have a refined reference to point people to so I don't have to worry about repeating context to all possible stakeholders.
However, many organisations have limited technical leadership enforcing quality, and management layers that are not willing to invest in these assets.
Unfortunately, some 'modern' practices are being interpreted in a way that makes processes like Agile the most important thing, reducing the importance of things like code/system quality and knowledge management.
Having a good technical writer pays dividends.
Many times the factors involved are discriminatory starting at hiring time. Such factors may include graduate education, independent research/writing, invention or original solution creation. These people are more inclined to write and diagram, because they have prior practice doing so. The inclination to write is a behavior predicated upon that prior practice.
The larger population of software developers are not so inclined to write. There is no uniform foundation to define an acceptable baseline of qualification to practice writing software professionally. That means the distribution of competence is highly variable and not well measured. In case of poor structuring there is maximal opportunity for selection bias in candidate entry, which suggests preferential factors for candidate selection not aligned with performance (however that’s defined).
In economics this is called the blind leading the blind. Not only is the greater population not guaranteed to be competent but selection preferences reinforce that candidates are selected away from competence towards a medium distribution like that of a bell curve. If you are in this population, as most of us are, you must work harder to write and document anything only for much of it to be ignored.
Such best-practices are an interesting not-so-universal thing in a lot of areas. IMO this often has a lot to do with employer relationships.
I've found that a lot of people are really proprietary about documentation (not even getting to architecture yet) and treat their time spent documenting as time spent giving away their negotiating leverage.
Sometimes this is a good point to consider though. The situation where you are happy with your work and the situation where the organization needs to start paying more attention to your value can be examined separately and you can make decisions about sharing your documentation work separate from decisions about the documenting itself.
Others don't feel a need to hold back, but they would rather work on more apparent or pressing issues (i.e. more obvious to them because of their differing work perspective). It is then left to the contingency thinker to justify themselves, which can be really difficult.
Regardless, it's important to decide how you feel about it and how helpful it is to your perception of your value and efficacy, as well as the outcomes you desire in your relationship with the organization. Good luck.
As a secondary note, regulated companies may have architecture/network diagrams and disaster recovery plans but these may be located with the compliance documents and not in the general document store that engineers use.
I really love this process. As a new member of the team I can go back and read the design doc for any of our systems and understand not only the architecture, but also why certain decisions were made. The caveat is that this is a remote-first company with a strong asynchronous culture, which forces a lot of these discussions to take place in a Google Doc vs a conference room with no record.
As I see it, I have a lot of tiny items like servers, clusters, applications, database schemas, services, network things, ... all interlinked by relations. Relations are two way, and typically of the kind 'contains' or 'talks to'. The main problem: There are a lot of them, even if they are all tiny.
I'd love to have a good way to organize all this. Wikis require too much clicking around. I'd love to ask questions like: What goes down if this subsystem disappears? So anyone willing to share experience is appreciated.
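One way to get at that kind of question is to keep the relations as plain data and walk them as a graph. The sketch below is only an illustration under my own assumptions: the item names are made up, 'contains' is read as "if the container goes down, so does its content", and 'talks to' is read as "the caller is impacted if the callee goes down".

```python
from collections import defaultdict, deque

# Hypothetical inventory: (source, relation, target) triples.
RELATIONS = [
    ("cluster-a", "contains", "server-1"),
    ("cluster-a", "contains", "server-2"),
    ("server-1", "contains", "orders-db"),
    ("server-2", "contains", "orders-api"),
    ("orders-api", "talks to", "orders-db"),
    ("web-frontend", "talks to", "orders-api"),
    ("reporting", "talks to", "orders-db"),
]

def build_impact_graph(relations):
    """Map each item to the items directly impacted when it fails."""
    impacts = defaultdict(set)
    for source, relation, target in relations:
        if relation == "contains":
            impacts[source].add(target)   # container down -> content down
        elif relation == "talks to":
            impacts[target].add(source)   # callee down -> caller impacted
        else:
            raise ValueError(f"unknown relation: {relation}")
    return impacts

def what_goes_down(relations, failed):
    """Return every item transitively impacted if `failed` disappears."""
    impacts = build_impact_graph(relations)
    seen = {failed}
    queue = deque([failed])
    while queue:  # breadth-first walk over the impact edges
        item = queue.popleft()
        for impacted in impacts.get(item, ()):
            if impacted not in seen:
                seen.add(impacted)
                queue.append(impacted)
    seen.discard(failed)
    return seen

print(sorted(what_goes_down(RELATIONS, "server-1")))
```

The triples could just as well live in a spreadsheet or a YAML file next to the Terraform; the point is that once the relations are data rather than wiki pages, the "what goes down" question becomes a short traversal.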
Now all this is simply about documenting what is. The AskHN wants to go a step further: How to document what should be.
And because, well, gosh -- "reading". Especially all those icky complete sentences and stuff. Which has definitely gone out of style, especially among the smartphone-addicted set ("if I can't digest or respond to it while thumbing my phone, it doesn't really exist").
Have I perhaps lost touch with modern practices?
No, they've lost touch with you.