Is there a way to learn these things without building the software and seeing it mature? Directly jumping into mature projects doesn't make me appreciate small design decisions!
Therefore, I would like to request the community to link some good resources on software architecture and code organization!
Getting to the point you're referring to, e.g. Twitter having fail whale issues because they built almost entirely on vanilla Rails, is a good problem to have, and one you'll be able to get resources to fix once you get there.
Maybe it's just me but IMO your time would be better spent reading something like Predictably Irrational by Dan Ariely or brushing up on your psych 101 concepts. Learn to deal with the irrationality of your team, your customers, and your boss and your life will be a lot less stressful.
Good luck!
I think the best way is unfortunately just experience. Design patterns are just that, patterns, and you learn and sort of internalize them when you work on real projects.
You can also look at open-source projects with good architectures like LLVM, lichess, etc. (there are other threads with a list of these).
The most important thing to understand IMO is that there’s no one-size-fits-all architecture, and the difference between a great architecture and a “better” one is small and not worth wasting too much time on. A small project will burn out using Google’s practices, and a Google-sized project will fall apart using small-company practices. If you plan on making your own app, definitely sketch out an idea of the architecture before you start building it, but don’t spend too much time on it or worry too much, because it’s more important that you actually start.
If you are using an object-oriented language, then reading up on Design Patterns will be helpful; the most common one in web application development is MVC. To see why MVC is important, you have to realize that, back in the day, single-file Perl scripts (or PHP, Python, etc.) that did everything (get parameters, execute DB code, output HTML, etc.) were the norm. A lot of what counts as good software design/architecture comes from the history of bad design/architecture. Separation of concerns is a solid concept both within applications and for integrating applications.
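To make the separation of concerns concrete, here is a minimal sketch of the MVC split in Python. Every name in it (User, UserModel, render_user, UserController) is made up for illustration, and the in-memory dict stands in for a real database; it is not how any particular framework does it.

```python
import html
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


class UserModel:
    """Model: the only place that knows how users are stored."""

    def __init__(self):
        # Hypothetical in-memory store standing in for a database table.
        self._rows = {1: User(1, "Ada"), 2: User(2, "Grace")}

    def find(self, user_id):
        return self._rows.get(user_id)


def render_user(user):
    """View: turns data into HTML and does nothing else."""
    if user is None:
        return "<p>User not found</p>"
    return f"<h1>{html.escape(user.name)}</h1>"


class UserController:
    """Controller: parses request parameters and wires model to view."""

    def __init__(self, model):
        self.model = model

    def show(self, params):
        user_id = int(params.get("id", 0))
        return render_user(self.model.find(user_id))


controller = UserController(UserModel())
print(controller.show({"id": "1"}))
```

The old single-file scripts did all three jobs in one place. With the split, you can swap the storage or the markup without touching the other two pieces.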
Be wary of digging into what BigTech companies do - they have really large/complex systems that most of us won't ever scale up to. These may be solutions to problems you'll never have and could cause more complexities than you need.
More seriously, creating architectures in a systematic way produces 'good' architectures more often than not. Have a look at the Unified Process[0] for an approach.
1. read a lot. Design patterns, architectural patterns, microservices patterns. There is good material out there
2. apply what you have read in your day to day job
3. stick with a job for more than 2 or 3 years
Points 1 and 2 are obvious. Point 3 is necessary: if you are jumping from job to job every 1 or 2 years, you'll never experience how good (or bad) your "architectural" solutions are. People usually jump into a new job, implement all the cool patterns they have read about, and then after 1 or 2 years they leave the company. So you never learn to judge your own solutions, because the moment your solutions need to be maintained/extended, you are leaving.
https://martinfowler.com/articles/enterprisePatterns.html
More specifically this book: https://www.amazon.com/Patterns-Enterprise-Application-Archi...
The most I could get as an enthusiastic architect with a couple of systems under my belt was a book recommendation (that I borrowed from said corporation and never gave back) and a meeting for when I turn 35. If I still want to join by then.
It provided me with a way to curb complexity that I wasn’t aware I was adding to my projects. The symptoms were always similar: after a certain amount of time and code written, the codebase was such a mess that I didn’t want to touch it, and adding anything new became exponentially more complex with time.
It is possible to learn good architecture, but I think it's most important exactly when it's least sexy. For example, one of the most important things I learned at my last project was that one of the teams wasn't using database transactions for extremely important records. I learned this by noticing an oddity in the logs, then reading that area of the code to prove it. Finally, I had to teach them how to use database transactions and what the different types meant.
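For anyone who hasn't used transactions, here is a minimal sketch of what they buy you, using Python's built-in sqlite3 module. The accounts table and the transfer function are hypothetical, not the system described above; the point is only that both updates land together or not at all.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts atomically."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if row is None or row[0] < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds: both updates are committed
try:
    transfer(conn, 1, 2, 500)  # fails midway: the debit is rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)
```

Without the transaction, the failed transfer would leave account 1 debited and account 2 never credited, which is exactly the kind of half-written record that shows up as an oddity in the logs.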
While getting them to use transactions, I noticed they were POSTing a record to another team, then saving it in the database. What happened if the database INSERT failed? Extremely important data was now sent to another reporting system but was not in the system of record. I helped them add a listener that would pull records from the database after they were inserted.
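One common way to get "INSERT first, send second" is an outbox table, roughly sketched below with sqlite3. This is an assumption about the general shape of such a fix, not the listener that team actually built: the records and outbox tables, save_record, relay_pending, and the send callable are all hypothetical.

```python
import json
import sqlite3


def save_record(conn, payload):
    """Write the record and its outbox entry in one transaction."""
    with conn:
        cur = conn.execute(
            "INSERT INTO records (payload) VALUES (?)", (json.dumps(payload),)
        )
        conn.execute("INSERT INTO outbox (record_id) VALUES (?)", (cur.lastrowid,))
    return cur.lastrowid


def relay_pending(conn, send):
    """Forward committed records; mark each as sent only after send succeeds."""
    rows = conn.execute(
        "SELECT o.record_id, r.payload FROM outbox o "
        "JOIN records r ON r.id = o.record_id "
        "WHERE o.sent = 0 ORDER BY o.record_id"
    ).fetchall()
    sent = 0
    for record_id, payload in rows:
        send(json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute(
                "UPDATE outbox SET sent = 1 WHERE record_id = ?", (record_id,)
            )
        sent += 1
    return sent


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    "record_id INTEGER PRIMARY KEY REFERENCES records(id), "
    "sent INTEGER NOT NULL DEFAULT 0)"
)

delivered = []  # stands in for the other team's HTTP endpoint
save_record(conn, {"amount": 125})
first_pass = relay_pending(conn, delivered.append)
second_pass = relay_pending(conn, delivered.append)
```

Nothing is sent unless it is already committed in the system of record. The tradeoff is that a record can be sent twice if the process dies between the send and the UPDATE, so the receiving side should tolerate duplicates.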
Now, what about the data in the other system? Sure, we fixed the source, but were there records that had been POSTed successfully but never INSERTed? Time to find out by learning that system and checking. Sure enough, there were tens of thousands of records that were orphaned but still showing up in reports to customers. These are financial records, so it's extremely important that we not report on bad data. Cleaning this up was a six-month project involving 25 people and three teams.
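The check itself is conceptually just a set difference between the two systems' record ids. A toy sketch, assuming you can export the ids from both sides (the function name and the sample ids are made up):

```python
def find_orphans(system_of_record_ids, reporting_ids):
    """Return ids that exist in reporting but not in the system of record."""
    return sorted(set(reporting_ids) - set(system_of_record_ids))


source_ids = [101, 102, 104]
reporting_ids = [101, 102, 103, 104, 105]

orphans = find_orphans(source_ids, reporting_ids)
print(orphans)  # [103, 105]
```

The hard part in practice is everything around this: learning the other system well enough to export the right ids, and deciding what to do with each orphan.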
This was the kind of stuff that kills financial companies. The bad architecture would eventually have caused lawsuits and likely company collapse. However, most people weren't really that interested in it, the devs and product just wanted to get features out. I'd have to not only find the issues, but then be savvy enough to PowerPoint the issue to the right people to get the bandwidth to have these things fixed. It was thankless work, mostly tedious reading of code and logs and databases. Everyone helped but begrudgingly.
I will say the fastest way to learn to see these things is by learning to read code and document it in a way that helps you think about it. I found the original issue with the POST and INSERT simply by reading the whole codebase twice, top to bottom, in my first couple of months on the job, documenting it all, reviewing my documentation, and thinking about it. Things like that just start to stand out as you ask "what happens if X fails" and follow it along to a logical conclusion. When I'd see something surprising, like weird logs, if it seemed important I'd drill down until I found the cause.
Almost never in my experience as an architect have I needed to design something up front. I actually think the job title should be changed to something like "cross team bug finder and fixer". So many times a team will be consistent enough internally, but once data crosses to another team the weird issues happen.
Another issue I found was that data was being stored in two databases. A check of a few lookup tables showed that someone had put the wrong status codes in one of the databases, so when a record was status 1 (OK) in one database, that meant something else in the other. Why were there two databases with the same data? Oh, a conversion that never finished, well let's finish it. Oh, no one still works here who remembers it, well I'll learn enough to figure out what is missing and finish the project, deprecating the old database.
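A mismatch like that is cheap to detect once you think to look. Here is a small sketch that compares two status lookup tables loaded as dicts; the function name and the sample codes are invented for illustration.

```python
def diff_lookup(a, b):
    """Return codes whose meaning differs, or that exist on only one side."""
    return {
        code: (a.get(code), b.get(code))
        for code in sorted(set(a) | set(b))
        if a.get(code) != b.get(code)
    }


old_db_statuses = {1: "OK", 2: "FAILED"}
new_db_statuses = {1: "PENDING", 2: "FAILED", 3: "OK"}

mismatches = diff_lookup(old_db_statuses, new_db_statuses)
print(mismatches)  # {1: ('OK', 'PENDING'), 3: (None, 'OK')}
```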
I found another issue when I realized the devs on a team didn't realize that splitting relational data between two databases was a problem. They couldn't have foreign keys, so they couldn't use joins in most cases. Because of this, they would fetch all records and join in memory. This was extremely inefficient, and they were certain that they needed a document database (and a two-year rewrite) to deal with their performance issues. In reality, they only had a few hundred thousand records, and I showed them that merging the two databases and using JOIN could fetch their reports and screens almost instantly.
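Here is a small sketch of the two approaches side by side, using sqlite3 with made-up customers and orders tables. The schema and both report functions are assumptions for illustration, not that team's code; both produce the same report, but the second lets the database do the work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    total INTEGER NOT NULL
);
CREATE INDEX idx_orders_customer ON orders(customer_id);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1, 50), (11, 1, 25), (12, 2, 40);
""")


def report_in_memory(conn):
    """Fetch both tables in full and join them in application code."""
    customers = dict(conn.execute("SELECT id, name FROM customers"))
    totals = {}
    for customer_id, total in conn.execute("SELECT customer_id, total FROM orders"):
        name = customers[customer_id]
        totals[name] = totals.get(name, 0) + total
    return sorted(totals.items())


def report_with_join(conn):
    """Let the database join and aggregate, returning only the result rows."""
    return conn.execute("""
        SELECT c.name, SUM(o.total)
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
    """).fetchall()


print(report_with_join(conn))
```

With three rows it makes no difference, but the in-memory version has to pull every row over the wire before it can answer anything, while the JOIN can use the index and return only the handful of rows the screen needs.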
Almost every dollar of value I've earned as an architect was like this. Nothing sexy; most of the time it was telling younger devs "no, we don't need this new thing, it can be done with existing tools you already have in 1/10th the effort". Many times they want to rewrite everything, when in reality, if they aren't good enough to see and fix these issues, they'll just create new ones.
This post makes me look like a database architect, and I think that's simply because many devs can do a decent job keeping code clean, but very few think about messaging, ACID, databases, scaling, and data. They can write unit tests and organize code into modules that are fine, but don't realize what should be cached and what shouldn't. They'll reach for solutions they just read about like microservices or horizontal scaling with messaging, but not even know how to get 1% of the benefit from their existing tech stack.
I really like the book Designing Data-Intensive Applications for filling in the gaps about data, transactions, and performance.
Here is my recommended reading list: http://deliberate-software.com/page/books/
I would encourage you to approach architecture the way it's recommended here: http://www.norvig.com/21-days.html. It will take a long time for the benefits to start; it's not something you can cram into a couple of years. Learn to read code, document it, and read as much about code, data, and performance as you can.
Lastly, there are a lot of snake oil salesmen out there who will teach you the "secrets" of architecture for a low fee of 10k at a one-week conference they run alone. Udi Dahan, iDesign, etc. These guys will teach you one architecture, and will try to convince you that a simple rewrite to this messaging-first architecture will save everything. I've never seen any of those successfully applied, but I've seen devs go and become totally brainwashed until they can't even find obvious things to improve in an existing system, because they are so focused on trying to convince everyone to rewrite everything. I've seen great people be on a team but never notice huge issues with the project, because all they can see is "it's not what Udi said is good architecture".
Good architecture is 100% dependent on what the goal is, what the existing system is, who's working on it, and what the desired business outcomes are. A slow website isn't a big deal, if it's not hurting the business. Inconsistent data isn't a big deal in many different types of systems. It's all relative, so there's no One Perfect Design, and the only people saying otherwise are trying to sell you something.