There used to be a time, in the dark dark ages of history, 10 years ago or so, when I would encounter issues during the course of my work, and I could fairly confidently assume I was doing something wrong, or I just hadn’t read the manual correctly.
Contrast that to now, when I regularly write code, or include a library to do a certain thing, and I find it just does not work. Following my historical logic, I spend a day trying to figure out what I did wrong, only to discover that it was a bug, or an edge case that someone just hadn't covered (never mind that my life seems to consist of edge cases now), or that the documentation is just plain out of date (or nonexistent).
Is this a trend? And does it have to do with the average code, or myself? Have you experienced something similar? It’s driving me nuts.
I want to rely on other code, but more and more I find that it’s safer to just assume it’ll be broken from the start.
For me, it went like this:
- The nerds did their nerd shit in the garage. Nobody saw them.
- Then the nerds figured out how to revamp many business processes with computers.
- Then those businesses needed more of that, so they started hiring other nerds.
- Not too many nerds existed because they're nerds and who wants to be a nerd? So therefore demand was high. What happens to the nerds' salaries when demand is high? Salaries are high.
- Now the nerds are making money. Tons of it. People started to notice that the nerds could sit all day and type at a computer (hey I can do that too!) but they make 100k more than I do.
- Eager to cash in on that nerd money, I start to google "how to code". Codecademy, Khan Academy, Udemy, and a plethora of highly expensive code camps show up. I pick my poison and begin.
- A Github profile is set up, my LinkedIn is updated, I have a few webapps under my belt, I'm ready for my interview.
- I get a job at a Big Tech company in a junior position. A few weeks go by and I'm asked if I could help interview another candidate. Of course, I'm qualified enough, right?
And the cycle continues.
This is how I perceived the shift happening. Code camps were really detrimental since it became very difficult to vet actual skills vs. the ability to pass coding interviews. When I worked at Uber this was a huge deal - a lot of people that had just finished code camps nailed the interviews but only lasted a few months because they had no idea how to actually do anything.
Of course it's all very nuanced and this isn't the only thing that happened (making programming "easier" certainly hasn't helped). But this was a large factor.
Post "Social Network" (the movie), and the economic downturn of 2008-2012 there is a new generation of engineers that started programming/entered the field because that's where the money is....
There is also proliferation of the code-academies, who often encourage trainees to create a 'github portfolio', often with low quality libraries.
2001 tech bust acted like a great filter. During the 2001-2012, a lot of people that remained in the field were there because general passion towards it, and had the skills to be employed after the bust.
People that couldn't cut it, did leave the field. Hence, the software was often more of a work of passion, or craft. A lot of the open source libraries that we use today, were created during this times (and the 90s, for the databases, and Linux), also a lot of the culture of startups (good and bad), was cemented during this time.
I think the field massively expanded after 2012, and people started doing engineering because that's where the money was/became trendy. Hence you will naturally have a dilution of overall quality.
I find that code becomes exponentially more costly the more features you add, because of the number of architectural layers you have to add on. And if you haven't planned for that, well, now you have to deal with the pain of adding another architectural layer.
HN is likely in the sweet spot of good architecture, but it also means that it's not supporting things like infinite scrolling and reddit-like awards.
But in all fairness - from my personal experience - the average code is getting better. It is mostly thanks to tooling: languages (compare writing readable code in PHP vs Python, or C vs Rust), version control (no need to keep tons of unused code that one does not want to throw away), linters getting more widespread, and tests and CI becoming standard.
• Visibility and accessibility of software has increased. Repositories like npm and GitHub mean you can see anyone's weekend project. If you only got code from distros, curated repositories, that would have been a filter.
• It's survivorship bias. Old libraries that survived must have been good enough. New half-baked libraries written today will either get polished and survive, or be forgotten.
In PHP I would not assume that an external dependency is without flaws. In fact, I'd assume that it's mostly buggy simply due to the lack of a type system, but fine if I navigate the happy path.
In Haskell I would assume that I'm using the library in a wrong way. I also find that I can categorize Haskell libraries by quality and that, in most cases, I'm not running into bugs, but rather a lack of good examples or a lack of pushing the dependency bounds to Stackage (the package distro on top of Hackage), either because the author doesn't use Stackage or because they've become too busy.
---
Comparing code 10 years ago and now:
I find that there are many more libraries in almost every ecosystem.
That means: More stale packages, more low-grade packages, more critical selection required, but also more opportunities! Hopefully you also experience that you can do a lot more with packages than you were able to 10 years ago.
This, in some ways, has shaped modern software development a lot.
1. The world is passing through troublous times. The young people of today think of nothing but themselves. They have no reverence for parents or old age. They are impatient of all restraint. They talk as if they knew everything, and what passes for wisdom with us is foolishness with them. As for the girls, they are forward, immodest and unladylike in speech, behavior and dress.
2. The children now love luxury; they show disrespect for elders and love chatter in place of exercise. Children are tyrants, not servants of the households. They no longer rise when their elders enter the room. They contradict their parents, chatter before company, gobble up dainties at the table, cross their legs, and tyrannize over their teachers
3. I see no hope for the future of our people if they are dependent on frivolous youth of today, for certainly all youth are reckless beyond words... When I was young, we were taught to be discreet and respectful of elders, but the present youth are exceedingly wise [disrespectful] and impatient of restraint
#1 was written in the 13th century, #2 in the 5th century BC and #3 in the 8th century BC. Old people always think young people are worse than ever.
I say this not to invalidate your feelings of frustration with how people are coding these days compared to the good old days. But ask yourself if your skills haven't improved in that time. Bugs that you would miss 10 years ago are now obvious to you. Also, there's just a lot more code out there than there used to be, judging by number of repos. The friction to put something into a public repository is lower than ever. That gets us some gems that might not have existed otherwise, but also some turds.
Rest assured that 10 years from now, devs will be talking about the good old days of today and how much better it was compared to 2030.
The barrier to entry is much lower.
In 1995, getting a C++ program compiled, tested, shrink-wrapped and shipped (in waterfall fashion) required a lot more work. The "survivor" software you bought on your floppy disk or CD-ROM at the store meant it was high enough quality to jump through all those hoops. Some of those hoops meant the software developers needed to know more about computers. Some were that you couldn't fix an issue after it was shipped into the wild pre-internet. Some of it was that a lot of those people had an intrinsic passion for the craft, even though they weren't paid as well. They also did the work even though they were considered geeks, unlike now where a lot of "engineers" are indistinguishable from marketers and considered popular.
In the past, I had to write almost everything from scratch, and that meant I'd have worse code than what is open-sourced nowadays, at least for the time period that I was implementing it (and possibly longer if the problem is gnarly).
Nowadays, I can do a search on GitHub and find something that appears to solve my problem, but when I use it it'll turn out the author was solving a slightly different problem than mine and so his solution only partially solves my problem or doesn't handle certain edge-cases/bugs that will be a problem for me.
That doesn't mean that the code is worse. I am just using something that was created to solve another person's problem and expecting it to solve my own problems with no extra work.
Basically, you can still choose to implement everything from scratch if you want to. In these situations you should make sure to learn from the people that came before you, by reading other people's code first and spending time thinking about your design/implementation.
But if you want to use other people's code off-the-shelf to implement something faster, you can't just assume that because their solution solved their problem without running into edge-cases, that it'll do the same for yours. You have to be responsible for every line of code that you integrate with, and should assume that is not guaranteed to work for your problem/data.
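One way to make that responsibility concrete is to pin down what the dependency actually does with your own data using a few characterization tests before building on it. A minimal Python sketch of that habit, using only the standard library's unittest; the slugify function here is just a stand-in for whatever hypothetical third-party helper you're evaluating:

    # test_dependency_assumptions.py
    # Characterization tests: record what the dependency actually does
    # with *your* data before the rest of the code starts relying on it.
    import unittest

    # from somelib import slugify    # hypothetical third-party import
    def slugify(text):               # stand-in so the sketch is self-contained
        return "-".join(text.lower().split())

    class TestSlugifyOnOurData(unittest.TestCase):
        def test_ordinary_title(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

        def test_our_edge_cases(self):
            # The author's happy path may not include empty strings or the
            # oddly padded inputs our system actually produces.
            self.assertEqual(slugify(""), "")
            self.assertEqual(slugify("  padded   title "), "padded-title")

    if __name__ == "__main__":
        unittest.main()

If the library fails these, you find out in an afternoon instead of in production, and the tests double as a safety net when you upgrade it later.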
---
What is an issue now, and wasn't before, are problems that occur due to a particular combination of packages and incompatibilities. These problems are difficult because they can require impossible levels of coordination to fix upstream. I think in many situations we should rewrite things to be more monolithic than they are.
From my experience, what I have observed first and foremost is that the problems software is solving are getting more complex as time progresses, which drives up software complexity; and that complexity is often proportional to the number of bugs.
Many popular FORTRAN packages out there solve a particular well(ish)-documented mathematical or physical problem (or set of them): matrix manipulation, eigenvalues, weather simulation, etc. What I've observed is that these problems, although sometimes complex, require minimal dependencies. The primary dependency is some journal paper somewhere, which you don't have to install as a software dependency!
Contrast that to a couple of recent codebases I've worked on: easily >300MB of npm dependencies for a mix of react, typescript, webpack, sass (gyp!! >:(), jest, cypress, etc.
Having said all of that, for both of the above eras of software I've encountered annoying bugs. It's just that nowadays I see most bugs are package compatibility issues (looking at you, steaming pile of webpack 5), whereas with FORTRAN it has instead been a mathematical error.
My point is that software is solving very complex problems right now. Back then, a web application was 3 types of files, because web applications were simpler. But now entire businesses use them, all their processes can rely on a single web app. Important processes sometimes with regulatory involvement. This necessitated typescript (scalable js), testing (jest and cypress), better performance (webpack), etc....
My take is that code quality is concentrated in the FAANG-level companies. They have a strong moat in just really good software engineering that leads to stable, fast, reliable products.
I ask myself why this is?
I think there's something to:
- They hire the best technologists, have high standards when hiring... these folks put the effort in to make something they're proud of
- The leadership is tech literate, and understands how/where to build and payoff technical debt. What you need to build in-house, what to outsource/purchase, etc
- Because of higher tech literacy: the leadership/devs tend not to heavily cargo-cult buzzwords like "AI" etc (even though this is what their marketing depts push) and focus on the unsexy work that matters (reliable, maintainable software)
HOWEVER
Outside of the FAANG-verse, you have a variety of different cases with different incentives for code quality:
- The startups that just need to do whatever it takes to deliver an MVP before the money runs out. Lots of coding heroics to do this!
- The enterprisey big companies without strong tech leadership, which hire anyone with a pulse that can do development. At these places there are often a couple of smart people holding things together, with lots of mediocre devs at best augmenting the smart devs and at worst moving things backwards. You also have leadership that doesn't understand how to make good tech leadership decisions, and often doesn't weigh the tradeoffs carefully.
These are gross generalizations, of course, but on average with this pattern I think high quality code concentrates in the upper end of the market.
I find Rust and Perl both have very high quality libraries available. The JVM/Java ecosystem has a huge number of libraries, including some extremely high quality ones, but also a massive number of abandoned or low quality libraries left over from decades past.
The ecosystems I find troublesome are JS, Python and to a much lesser degree Ruby. Huge numbers of low quality libraries and dependency hell.
The differences seem to stem from the culture of said ecosystems. Java code is often flexible to a fault, Rust code is strictly correct and fast but with less emphasis on flexibility/ease of use. JS/Python hail from the "get shit done" culture and as such there are many half-baked libraries.
These are just generalisations. Even the ecosystems I don't enjoy working with have some very high quality libraries of course. JS has React, Python has Tensorflow, Requests, etc.
If you want to be able to rely on library code though I would go for Java/C#/Rust maybe Go if it doesn't annoy you for other reasons.
My small amount of experience has shown that people are astoundingly bad at intuiting generational change in crime, capability, and morality.
I tend to rely as much as possible on the standard library of the languages I'm using (especially for Python and C++, which ships with a pretty good set of libraries already), and when I have to go beyond that, I try to vet libraries as carefully as possible.
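To give a concrete flavour of what leaning on the standard library buys you in Python: fetching and decoding a JSON response needs nothing beyond urllib and json, no third-party HTTP client required. A minimal sketch (the URL below is only a placeholder):

    # fetch_json.py -- stdlib-only HTTP + JSON, nothing extra to vet or pin.
    import json
    import urllib.request

    def fetch_json(url, timeout=10):
        """Fetch a URL and decode its JSON body using only the standard library."""
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)

    if __name__ == "__main__":
        print(fetch_json("https://example.com/api.json"))  # placeholder URL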
If a library deals with problems I know are going to be tricky, I try to take a peek beforehand at how it approaches dealing with those problems. If the approaches look obviously simplistic/broken, I can save myself some time.
But when I want to deal with truly hellish code problems, I let my 8-year-old sweet-talk me into installing just one more mod into his Minecraft installation...
Compare the situation on the web.
The days of IE 6, polyfills and all that garbage are mostly gone. Browsers are an order of magnitude better than ever.
On the other hand, you've got the frontend JS situation, which is broken beyond belief. I dare you to pick any JS framework and try to update it by a version...
If you skip that and use the old and tested stuff, things are just way better than 10 years ago. Just compare Rails or Django to JSP...
The god-awful PHP std lib is still the same, but they bolted some sort of object orientation and static typing onto PHP. So it's as bad as ever and the libraries reflect that.
But in general the whole library situation improved massively. Nowadays I can file a bug report on GitHub, or just fork it, fix it and make a merge request.
Back in the day you had to join someone's IRC, talk to someone else who had the stuff in their CVS or whatever (hopefully), and then you wasted your day just fiddling around waiting for some random person to build a binary with some obscure version of a C compiler.
So there is more crap but the good old stuff matured and got way better.
There are better ways to discover problems; for example, it has become quite the norm to use version control, CI, and automated tests.
Hehe, I remember those days :D Yes, code has gotten A LOT worse, and right now, whenever I pick a new library, I expect it to have bugs.
A lot of people will start coding and bring in a TON of open-source libs, each with their dependencies. And then when something goes wrong, spend an insane amount of time just figuring out who's to blame. Or, just fix the symptom and continue.
Clearly, the above won't work in the long run. But who cares about that? At the current pace of business, it'll be someone else's problem in a few months. They'll get promoted or switch jobs :D
I'm on the opposite side of the spectrum: I choose VERY FEW libraries, go for mature ones, and do quite a bit of testing. For pretty much each library, I simply add an extra layer on top of it, so I don't need to deal directly with it, just in case I may want to later switch to another implementation, or perhaps do some workaround until the issue gets fixed.
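As a rough illustration of that extra layer in Python (all names here are made up for the example): the rest of the codebase only ever talks to the wrapper, so a buggy dependency can be patched around, or swapped out entirely, in a single file:

    # storage.py -- thin seam between our code and whatever library does the work.
    import os

    class Storage:
        """The interface the rest of the application programs against."""
        def save(self, key, data):
            raise NotImplementedError

        def load(self, key):
            raise NotImplementedError

    class LocalDiskStorage(Storage):
        """Today this just writes files; tomorrow the same interface could wrap
        a cloud SDK, a pinned fork of a buggy library, or an in-memory fake for
        tests, without any caller changing."""
        def __init__(self, root):
            self._root = root
            os.makedirs(root, exist_ok=True)

        def _path(self, key):
            return os.path.join(self._root, key)

        def save(self, key, data):
            with open(self._path(key), "wb") as f:
                f.write(data)

        def load(self, key):
            with open(self._path(key), "rb") as f:
                return f.read()

Workarounds for upstream bugs then live in this one file instead of being scattered across the codebase.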
And this brings me to Microsoft: the code since Windows 8 (2012) has been beyond horrible, and their direction has been, let's say, "flexible" (read: as the wind blows). I've discovered so many CRITICAL bugs, I've had to struggle with depression for months.
> I want to rely on other code, but more and more I find that it’s safer to just assume it’ll be broken from the start.
Yes, that is definitely a safe bet. The more you rely on 3rd party libs, the more you increase the risk of (critical) bugs. The more libs you use, the higher the risk. It's a really dim picture -- but something we unfortunately need to deal with. It's the new reality, and we just need to adapt :D
In the same way, when we push coding at kids barely above kindergarten, there is going to be lots of crap code, but also much more excellent code. In general, though, the amount of crap will overwhelm the amount of excellence, so the average will trend down. If you applied some kind of cutoff at the base, removing the absolute worst, the average might trend up. This is the case with anything that gets more democratized.
- We routinely depend on higher and higher level abstractions. These abstractions almost never cover 100% of the usecases you could imagine, but that doesn't mean they still won't get used inappropriately for a while.
- Corporate software is often subject to the technical equivalent of a pump and dump. The kind of situation where a developer wants to open source something but the company isn't really passionate about having them finish all of the features, only the ones that are pertinent to them.
- Companies are routinely understaffed, possibly by design. When I first started programming I heard all about programmers working in teams to accomplish things. When you don't have the option to pair on tough problems, those tough problems get a singular perspective.
It's very informative and engaging. Highly recommended.
1. The number of high-quality libraries available has increased substantially.
2. The variance has also increased substantially.
3. The AVERAGE quality of published libraries has gone down.
Basically, the number of open source libraries has exploded over the past decade or so but the majority of new libraries are of low quality because they are slapped together or only half-baked and then abandoned. But it is easier than ever to find high-quality open source libraries for any use case you can think of. In other words, things have never been better but you cannot assume that a library is of a reasonable quality just because it is published as an open source library.
Now you can push a hot fix or day 1 patch almost immediately, and in the case of some software like games, actually prevent people from using the old version, leading to software being released that has known issues. Obviously with web based applications the fix deployment is completely transparent to the users, making the cost of deploying a fix almost zero.
The many different communities of programmers made that transition at different times, but very few dependencies was basically the only model available in the 80s, while tons of dependencies is the model basically everybody uses nowadays (the amount of stuff embedded software imports nowadays would surprise a desktop GUI developer from the 90s).
In my experience, a problematic library tends to be one of two kinds:
- One that is so specifically designed to fix the authors' original problem that it is implemented in a way that doesn't cover a lot of use cases.
- One that is actually designed to be a general purpose library and to be used everywhere; but because of this there is a lot of feature creep and very complicated interfaces.
In 99% of the cases where I noticed problems, they were in third-party libraries.
The quality of first-party ones, like Microsoft, Apple or Linux system APIs, is IMO mostly good.
Some third-party libraries are good as well. The problem is some others are not. For this reason I review at least public API, and ideally source code, before introducing new third-party dependencies.
If you wrote code for desktop computers in the 1980s, the platform you wrote code for probably didn't exist for longer than 5-10 years and the code had to target the CPU/arch directly because anything else was too slow. Your C64 library wasn't going to be reused on an Amiga or IBM. So any code older than ~5 years old just sort of died off. No support needed.
If you wrote code in the late 80s/early 90s, the hardware platform was starting to stabilize around the IBM PC, but the GUI layer was in rapid flux and languages were relatively immature - Windows 3 or OS/2? Protected mode? Windows 95? 16bit vs 32bit? Would C++ win? Even basic stuff like playing sounds (Adlib? Soundblaster?) or what graphics card to target (EGA? VGA? etc.) was in constant flux. Software lived a short life.
By the mid/late 1990s, the OS was stabilizing (mostly around Windows 16-bit) and basic hardware for video and sound had shaken out, but now we have the internet on the scene and a whole new round of immature and one-off tools. Remember WinSock? Netscape Navigator 4? Java applets?
By the early 2000s, 32bit Windows and OSX were on the scene and desktop software was mature in a way we rarely see anymore, but we were in the last days of single-user desktop software. Momentum was moving towards the web and software designed with the internet in mind. The mature desktop software of the early 2000s would mostly be abandoned as people moved to the internet and expected different things from computers.
By the late 2000s, we have the full-on mobile device wars with iOS, Blackberry, Android, Windows Phone, Palm, etc, all fighting over new ways of presenting software to users and all using different software approaches to make that happen. Iteration was incredibly rapid. APIs and languages changed quickly. The UI bar was raised significantly.
But around 2011-ish (10 years ago), things really started to shake out. The internet was relatively mature. iOS and Android became a duopoly. Linux became the standard for deploying web backends. Desktop software is largely dead at this point except in niche/creative industries. Rapid development moved toward new areas (AI/ML, etc) but the foundation was pretty stable.
The difference between now and 10 years ago is OSX Lion vs. Big Sur and Windows 7 vs. Windows 10. Hardly a fundamental change. People used Chrome to browse the web then and they still do, now. People mostly used iOS and Android then and they still do.
In 2011, web devs were mainly targeting Linux to run Python, Ruby, PHP, and Javascript. HTML 5 was finally a real thing. MySQL and Postgres were hot. Memcached and redis were cool. Some people used Mongo. The Windows crowd was using C# on .Net 3.5. All of those things have evolved over the last 10 years, but aside from Javascript, they are just iterations and are basically what we still use today. (Javascript of course has gone crazy, but that's a different story.) Maybe now you target a container instead of a bare machine, but the ideas are basically the same.
So my premise is that mass-market computers have broken compatibility less over the past 10 years than ever before. Along the way, more open source code has been released than ever before by a huge factor.
We have a situation now where new generations of developers are coming up and writing their new code for the same basic platforms as the previous generation. The old code still more or less works fine, but may have lapsed into unsupported territory. The new code more or less does the same thing, maybe in a slightly different way. There's more code than ever to maintain and not enough time to maintain it.
So now the number of libraries you can choose between to solve any problem has multiplied to the point where the surface area needing bug fixes and maintenance has become untenable. The hottest Django library that everyone used for file attachment uploads in 2011 has been replaced by 10 newer libraries that half-work, and then those were replaced by 400 javascript versions of the same idea. And all that half-supported code is still sitting on Github, just waiting for you to include it in your project and cause yourself headaches. There's often no good way to know which library is most likely to work reliably, and sometimes none of them do.
Another symptom of this phenomenon is that when you google about a bug or error, you often get 10 year old Stack Overflow answers that sort of apply but are also totally out of date and lead you down the wrong path. Attempts to update the question get deleted as "already answered". So now we not only have bugs to fight, but we have the endless perpetuation of wrong answers to long-since fixed bugs getting copy/pasted into new code forever.
In the past, the constant platform churn let us avoid this problem because the old software would just break and be obviously useless. But now that the basic platform is more stable and software is sticking around longer, we need to figure out better ways to deal with gradual change.
I think that average or high quality code has increased in quality over time, because CI/CD is a lot more widespread. Increased computing power has improved tools such as IDEs, compilers, linters, testing tools, etc. So the ceiling is generally higher, I'd say.
However there are a ton more newbies in the field, because there's a need for more people, so there is also a lot more garbage and that garbage is even worse.
Plus, there are a ton of programming-adjacent folks who can now publish their stuff. Since their focus and background are not related to programming, their libraries and apps are generally of lower quality.
TL;DR: There is more of everything, and since 80% of everything is crap, there is more crap. It's also harder to find the gems, since we have to wade through crap. But good stuff is still out there.