Have you bought something that you had to scrap and build yourself anyways?
Yes, but that hasn't meant it was a mistake -- 'buy-then-build' can be a great strategy. Often the 'then-build' never happens, but if you go into a decision ready for 'then-build', you can learn from existing products, hit their limits and understand what custom version you'll need in your context. Recent examples are on a smaller scale, though - using a library that speeds work up early, hitting its limits, and replacing or extending it with DIY that does fewer things but goes deeper for my use case.
> What are services/products that you built and wished you had bought?
The most annoying recurring version of this has been being just a little too early - building something, then discovering a few months or years later a public product that does the same, but better. At that stage, rebuilding to use has low ROI, and one ends up maintaining a legacy monster. There was a period when public offering of supporting backend infra was maturing, ie things like secrets/configuration management, logging, observability, monitoring, a/b tests, a bit earlier even basic web frameworks (ie building anything on PHP before Laravel came out meant you built your framework first; iirc worse than the frontend framework landscape in 2022).
Use Temporal, StepFunctions, something and try to avoid this urge.
My assumptions around how much integration our custom product customiser/editor needed with the rest of the e-commerce platform were wrong. I thought I needed a user system and "saved designs" for the customers, but that's somewhat rarely used, and could have been bolted onto a standard system.
Maintaining and updating it is extra work over what the core business is, there is now a lot of custom code to fix old assumptions and implement features that we didn't previously expect. All of which come as standard with Shopify.
We also believe that customers are increasingly used to seeing the Shopify checkout, it is a reassuringly familiar experience. I suspect it has a measurable effect on dropouts.
If I were to start again now I would 100% just use Shopify, no question. We are considering a large project to move to it. It would be quite satisfying to delete all that code. But it would probably bring new problems, and there are things we are used to being able to customise that we would no longer be able to.
Do I regret doing it? No not really, hindsight is 20:20. A lot of lessons were learnt, but that enabled us to build a successful business.
The performance of Hibernate relative to plain SQL was abominable, and this directly caused us to lose at least one contract. It turns out - on reflection, of course - that it’s not even theoretically possible to get into the same performance ballpark.
After years of doing battle with the tools, I eventually kicked them all out, decided to work hand-in-glove with the database, and suddenly things became both straightforward and performant. I now think that ORMs are a code smell.
It was really disheartening to have 2.5 years of hard work dropped like that, but it was absolutely the right choice. They had designed it themselves and it was barely functional. They had a lot of upgrades planned to make it do what they really wanted.
- They ignore overhead. If an engineer makes $100k they calculate with that number. The reality is a 2X to 10X overhead depending on the company.
- Required return. You can't spend $100 to make $100, no investor will fund that. So you need to do activities that generate an adequate return.
- Opportunity costs. So say that you have an engineer that costs $100k, the overhead takes that to $400k and they need to generate at least $600k a year. That still doesn't mean that you should do the project if you have something else that's even better.
- Maintenance. People always underestimate the cost of maintenance. I'm skeptical of any estimate where maintenance is < build costs.
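To make the arithmetic above concrete, here is a rough sketch using the same figures ($100k salary, overhead taking it to $400k, at least $600k a year to generate). The function names, and the 4x and 1.5x multipliers backed out of those figures, are my own assumptions for illustration, not a standard formula.

```python
def required_annual_value(salary, overhead_multiplier, required_return):
    """Return (fully_loaded_cost, value_needed) for one engineer-year.

    overhead_multiplier and required_return are assumed inputs; the comment
    above only gives a 2X to 10X range for overhead.
    """
    fully_loaded = salary * overhead_multiplier
    value_needed = fully_loaded * required_return
    return fully_loaded, value_needed


def worth_doing(project_value, salary, overhead_multiplier, required_return,
                best_alternative_value=0.0):
    """A project must clear the required return AND beat the best alternative."""
    _, value_needed = required_annual_value(
        salary, overhead_multiplier, required_return
    )
    return project_value >= value_needed and project_value >= best_alternative_value


if __name__ == "__main__":
    cost, needed = required_annual_value(100_000, 4.0, 1.5)
    print(f"fully loaded: ${cost:,.0f}, must generate: ${needed:,.0f}")
    # Clears the bar on its own, but loses to a better alternative.
    print(worth_doing(700_000, 100_000, 4.0, 1.5))
    print(worth_doing(700_000, 100_000, 4.0, 1.5, best_alternative_value=900_000))
```

Maintenance is deliberately left out of this sketch; per the last bullet, any real estimate should add it on top, and it will likely not be smaller than the build cost.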
One thing I regret building is an analytics pipeline. We should have relied on Segment for that. I've also once built an analytics platform from scratch which was bad.
On the flip side, one time we bought an ETL tool and it was terrible compared to in-house solutions
I'd build one or two, test them for several months while making adjustments and fixing various weaknesses, then I'd build ten, twenty, however many were needed.
One particular client decided they wanted to skip all that and just spend the money on Dell. They didn't have many servers, certainly not enough to justify a separate server room. Their offices weren't air conditioned at night, and they had an entire summer where one server or another would become unresponsive over the course of a weekend, and sometimes at night between weekdays. Accessing iDRAC was beyond what they wanted to do, so of course I had to do that.
They had Dell support, but Dell had no "fix" for unstable machines other than to tell them to build a server room. Mind you - the ambient temperatures were always below 100º - any reasonable person would say that while that's not ideal for servers, "premium" servers should still be able to handle warm rooms, particularly when they're idle, and not crash or lock up.
After that fiasco, they gradually replaced the Dells, one at a time, with machines I built. I wish they had tried harder to return the Dells, but they just wrote them off.
I've learned that any savings in time and money (mostly - the Dells cost more than the machines I build, even accounting for extra billable time) aren't worth the loss in time, productivity and reliability in the long run. Of course, the opposite would be true if I just wanted more billable work, but I can't do that, unlike many others in the field.
BTW - when I build a new generation of servers, I do months of testing, but the same general platform can last a good five years, like Ryzen 1000 through 5000 systems with ECC have lasted since 2017.
I made this decision vs. building out a new gateway/router/switching/monitoring/SSIDRoaming infrastructure from scratch.
This was a bad decision.
Even now, nearly 2023, the UDM-PRO is a beta product and I am a beta tester. Further, we have all learned that Ubiquiti is a dysfunctional organization not focused on anything at all resembling technical/engineering goals.
Ubiquiti wifi APs are still, probably, some of the best available so I will probably keep those ... but everything else - including PoE switches is getting ripped out and replaced.
I like the control and I like tinkering. But as I've become busier, I realize how much revenue-producing/move-the-ball-forward time I've lost and am still losing by doing all of this myself.
- Same CTO somehow convinced investors there was no x86 machine available on the market with the right specs for what was needed… so they put together a team of hardware people to design and build a motherboard.
- Personal… bought a late 70’s / early 80’s Sol cat catamaran sail boat for 500 bucks while in college. Unbeknownst to me, the hulls were notorious for delaminating and I didn’t know what to look for at the time when I purchased it. Long story short - I spent months of effort fixing and painting it, a lot of money, and sailed it once before giving up on it (of course it partially sank during the maiden voyage). It wound up blowing away in a hurricane.
Should have bought a $1000 worth and then just post this from my Caribbean island.
The worst outcomes I've seen are when it's SOWs that are buy-to-build. That is - paying a vendor to build something that doesn't exist yet. It's like the blind leading the blind. You have to agree to these detailed specs and project plans and contract, and then at the end of the day hope they deliver, because there's so much nuance in software that it's not going to hold up in court or be worth your time. If you did write specs and acceptance criteria detailed enough to be bulletproof, you'd probably have been better off just writing the software instead.
My company did two of these recently, one ended up overrunning by 100% with 50% of requirements not met, never going to production and half the people on vendor side quit/fired. The other one went to production and then the vendor went bankrupt, lol.
The main reason I don't immediately jump to hiring is it takes weeks to get people out for estimates, then half of them ghost you, and if you do manage to convince someone to take your money they are booked months out (and they also ghost you).
I'm very close to with another product.
HashiCorp is way overhyped. Yes, their open source products aren't bad if you don't pay for them, but then you get an enterprise license and realize how bad their support is. Seems like the company thinks that "enterprise" means just a product with additional features, but you're still on your own with any help.
Consul, especially when running on k8s, is very complex, and it feels like support barely knows more than you. They won't answer any questions about how a given feature works, referring you instead to solutions architects, whom it takes months to get access to. Unlike Open Source, since you don't have the source, you can't just modify code yourself to add missing functionality, and if you ask your rep about it, they might tell you it could take a year (if it even happens) to implement it. WTF?
Anything I've seen built that was not contributing business value ended up being a distraction.
When I first arrived, and knew nothing, they told me we were going to build a replacement. That sounded great to me at the time. Obviously the right thing to do is replace something old and busted.
Well, I and the company learned over time that actually that was not really the best business strategy. It may have been possible, but not with the resources we had available. I ended up doing a whole lot of work that went unused, through no fault of my own.
A few years later, our internal version of this software was crappy and our data was still not something that we could monetize. In the future, I'd refrain from being too cautious at the beginning when you are facing bigger risks that may not be sexy, but are very dangerous.
Then, I switched to a web scraping API (I'm using https://scrapingfish.com as they have convenient pricing for my use case, but there are other alternatives). Now I only have to maintain parsing logic in scrapers. It also actually reduced my costs of scraping since I no longer pay for proxies, which are more expensive at my scale than a web scraping API.
If I could do it all again, I would have gone cloud-native (or at least leveraged K8s), and I'd use as many managed cloud services as humanly possible. At a later gig - we did just that, and we very very rarely had even 1% of the infra struggles we had with the solution I described above.
Nowadays, my basic advice is to always buy the best possible service when you start out, and only start to think about replacing it with DIY services when you have enough scale to pay talented engineers a salary to build AND support replacing it - and even then, the potential loss of focus and velocity might still make this a bad idea. There's a reason Netflix is still on AWS.
Then one day these products eventually go EOL and the company is often stuck maintaining a zombie product. Or the products undergo a major refactor that breaks their customizations and integrations, and they end up stranded forever on version X.
I hear ERP systems often go like this too.
The only thing worse to me is entire huge products built in stored procedures. I know one product that was written in about 2 million lines of PL/SQL. It did some amazing things, but we were locked into PL/SQL for all time, and the Oracle scaling bills and HW were astronomical…
I don't feel particularly bad, though, because in the past I've done more impressive things that were harder (building a 3D printer from scratch, putting a V8 into a Datsun Z, etc.), but for some reason could not figure out how to build this computer from parts.
“We can only be great at one thing. The rest we can only be good at.”
This doesn’t quite answer the question, but I think it’s related.
They were burning money and about to go belly up.
We spent 3k on Shopify plus some outsourced Shopify engineers to get 95% of their solution in ~4 weeks with zero engineers.
Turned the company around and ended up doing an M&A 2 years later.
When you do hit the fork in the road on build/buy I don't have a hard and fast rule, but generally I take the view that if it isn't directly related to revenue generation for the company then it shouldn't be built.
As a practical example, Heroku has been pretty unbearable support- and stability-wise since Salesforce bought them. But moving off their platform now that we integrated it into everything for 6 years is surprisingly difficult...
Similarly, all of our Allegorithmic automations became worthless when Adobe bought them and replaced the $100 one-time perpetual indie licenses with a $250 monthly rental. I've abandoned custom 3ds max and vray plugins for the same reason.
And ask anyone running a dropshipping store how renting their platform worked for them. Platform prices go up until you (the merchant) have practically no margin left.
I am almost reconsidering going back to the home grown solution even though it will require a bit of work since we haven't touched that code for 2+ years and almost retired that tool.
We ended up sinking two years into it, and never ended up with a particularly good compiler (although we did absolutely crush a couple toy benchmarks).
Arguably both sides of that tradeoff were wrong, though, as the eventually successful PyTorch 2.0 compiler (TorchInductor) was based on Triton (plus some custom higher-level scheduling logic).
The loss in productivity was so big that it was definitely not worth the money saved. And the promised “it’s open source, if it misses a feature, we will implement it” never materialised either, as the whole thing is really complex.
I don’t know why Mattermost was so slow and buggy. But it was. And nobody had time to research and profile that.
Spent hours messing around with the physical rig and software settings, but the balance was never quite right.
I'd imagine that the kits today are better all-around. At the time it was pretty cutting edge and the idea of being able to do steadicam shots with my DSLR but on a budget I could afford was too good to be true.
So, due to the lack of front-end exp (a while before I joined), they chose to buy the license for a moderately well-known UI component library to heavy-lift a big front-end rewrite. Well, due to the same inexperience, there was no due diligence done and it turned out said component library had tonnes of bugs, wasn't easily extendable, had no real TypeScript support, and on and on. The product suffered immensely for years, but dev leadership I think took on a sunk cost/hopeless mentality.
Years later, just before I joined, they chose to try and do something about it and pivot (I suppose as some political recoil), deciding to have a go at creating their own UI components and gradually strangler-pattern the existing external UI library. However, leadership thought themselves and the other CS grad dotnetter/java types could "learn on the job", so they didn't hire ANY devs with real experience, i.e. JS, TS, React/Angular/whatever, build tools, general front-end best practices, anything!
Fast-forward half a year and there were already even more bugs, a growing mountain of tech debt, etc. This is roughly when I joined, as their first developer with (back then) a couple years of f/e experience. It took me around a year to start turning around the general careless culture the business took towards its f/e dev work, and I argued for more learning, adoption of industry norms, craftsmanship, and all that jazz.
It's all in my rear view mirror now, but wow was that an interesting time.
I actually wrote about this experience and the ultimate refactor. [0]
[0] https://koptional.com/article/nuanced-strategy-build-vs-buy
IMO your product is the only thing you build; as much as possible you should buy everything else.
There is eventually an inflection point where your product is so mature that improving it operationally, rather than shipping the marginal next feature, will save you money, but the scale at which that happens has been growing YoY for a while (AWS mostly gets cheaper every generation so far).
When management already says buy instead of build, the actual technology being bought is much farther down the priority list than engineers want to admit.
1. is this service or product similar to what your company creates, or would you have to create an entirely new business unit to develop and support it
2. how complex would it be to integrate the off the shelf solution into your current system. if it's the same as building on your own, then why pay for it
3. is there competition in the market for the buy solution so pricing and innovation are competitive.
4. who is this company selling this product or service, what is their track record, do you serve as just their financier and guinea pig testers
My manager shared this opinion and said, we can't sell our software with the argument that users need specialized software and do the exact opposite ourselves by using software that wasn't even remotely designed and fit for project management and the like. Custom solutions do also have certain advantages. But you get like 80% of what you want with common software, usually for a reasonable price and without the headache.
A trap that sometimes gets laid out is treating it as a binary build or buy decision; there are typically options in between, in my opinion. Build isn't necessarily as onerous as it once was either: the use of cloud, frameworks, libraries, low code products and SaaS means you can often construct something from these legos.
In my experience, there are some in enterprises with procurement and IT management skills who tend not to have a clue about building modern software, and often push for buying stuff (keeps them busy), and sell it as a win: bought a thing, set it up, declared victory and fucked off to the next project, leaving the users and technologists to figure out how to unfuck the mess that has accumulated around this clunky COTS product that is now a critical part of the business.
Open Verse Media
When I first started, in 2016, at Open Verse Media, an ebook publisher, they asked me to look at their content management system (CMS/CRM). The staff had to rely on it, but it was very slow. The COO, whom I'll call Robin, had overseen the creation of this app. The actual work of creating the software was outsourced to one firm, but after two years Robin felt they were too expensive. She fired that first firm and then hired a firm in Ohio, which I'll call MegaStars.
The app had been built using a popular software framework called Ruby On Rails. Whenever Robin felt that a new feature was needed, she would ask MegaStars to add the feature. MegaStars billed $500 an hour, and over the course of seven years, a total of $3 million had been spent on the creation of this app.
The staff hated the app. When the head of marketing wanted to bring up the top 100 best-selling books, they would click on a link, and it would take a full 60 seconds for the page to come up. The staff had gotten used to the fact that they always needed to be engaged in two tasks, that is, something to keep them busy while they waited for the pages in the CMS to render. An advanced search, with multiple filters, could take up to five minutes to render a report. Many of the lower-level staff would simply go into Slack and engage in gossip with their peers while waiting for each page to slowly appear.
So on my first day I logged into our main web server, and right away I could see that the app was generating several thousand errors each hour, all of which were being written to a log file. Since this app was single-threaded, the work of writing the errors to the log file had to happen while each page was rendering. This was one reason why it was so slow.
This arms-length relationship needed to be closer.
Why did this app have so many problems? Well, when Robin requested a new feature, MegaStars would tell her exactly how much time was needed to get that feature done. If they felt a new feature needed 30 hours to build, they would simply quote $15,000 as the price tag. Sometimes the new work conflicted with old work and generated new problems, but that wasn't in the estimate and therefore the new problems needed to be ignored as much as possible. This tactic of ignoring new problems had been going on for many years. Additionally, much of the code base was now out of date and suffered version conflicts whenever some parts of the system needed to use newer libraries of code (which in Ruby On Rails are called “gems”).
MegaStars could have said, "Pay us $100,000 and we will clean up all of these problems." But then Robin might ask, "Why did you allow these problems to exist? What are we paying you for?" It might seem like a scam, if MegaStars asked for more money to fix the problems that they themselves had created.
Here was the central dynamic of the situation: Robin felt she held power because she could terminate the relationship at any time. In fact, all of the problems in the relationship were because she could end the relationship at any time and was leaning on that fact as her main way of getting compliance. MegaStars was unwilling to commit to the long-term health of the software while Robin was constantly threatening to fire them.
When you work with an outside agency, they typically can't or won't go back and clean up the code, because the customer is not willing to pay $500 an hour for that work. Some of the better agencies try to include the clean-up work in the overall price, but then those agencies seem expensive — and they get undercut by other agencies that are willing to do the absolute minimum, even if that means writing poor-quality code full of errors.
More one-on-one meetings would have helped
In many ways, the situation was worse than what I’ve already described. “Robin asked MegaStars to add a new feature” – what does this really mean? As a practical matter, the real process was something like this:
1. The staff hated the CMS.
2. Occasionally the frustration was so intense that it bubbled up to Robin.
3. Robin would convene a large meeting, including all team leads and their assistants.
4. Robin would give a speech emphasizing the need to control the budget, plus various warnings she had received from MegaStars – without doing a full re-write, MegaStars felt there was a limited amount they could do. Plus a full re-write would be too expensive.
5. Then Robin opened the floor to suggestions.
6. Everyone threw out some ideas, but without any knowledge of how much a feature might cost, and no real idea of what the budget was, the staff tended to engage in self-censorship.
7. Robin would pick three or four ideas that seemed interesting, then send them in written form to MegaStars.
8. MegaStars would send back a cost estimate.
9. Robin would then approve whatever items she felt were within the budget.
10. A new contract would be signed between Robin and MegaStars, regarding the next batch of work.
11. MegaStars would deliver the work, but without cleaning up some of the long-standing problems.
Please note, this is not a rant about out-sourcing. I’ve seen companies have great results while working with an outside agency. The real issue is this: if your company depends on an outside relationship, then that relationship needs to be a close, long-term, trusting relationship.
There were several factors that caused things to get so bad at Open Verse:
1. The CEO was an industry legend, but rather elderly, so she pushed most of her responsibilities onto her COO. Robin was therefore spread thin with too many responsibilities.
2. The CEO and COO had spent much of their careers in print publishing, and were slow to realize how different ebooks were. (Books that sold well were more topical, less based on the prestige of the writer.)
3. Robin was very slow to realize how much the organization depended on the CMS. She herself didn’t use it, so perhaps she didn’t realize how painful it was for staff to have to wait 60 seconds for a page to render.
4. Robin thought her power, regarding MegaStars, lay in the fact that she could fire them. In fact, this was a source of weakness in the relationship.
It does not matter if your company has an internal tech team or works with an external agency, if you are the COO, be prepared to have long one-on-one conversations with whoever is heading up your software development. Obviously the COO is going to push day-to-day management of the tech team to someone else (a CTO or a project manager who can operate at a high level) but then the COO needs to be in frequent contact with that person.
Who should accumulate requests for new features in the software? That should be the CTO or project manager, not the COO. It should be a regular, on-going process, not an occasional ad-hoc event. It needs to be someone who has the time to sit with those making the requests, talk to them one-on-one, and translate what they claim to need into what they really need.
One way or another, the only path forward for Open Verse Media was to find someone who could manage the software on a day-to-day basis. There were two possibilities:
1. Hire a project manager and let her manage the relationship with the outside agency. The project manager could focus on building a close, trusting relationship with that agency.
2. Hire a CTO, plus several software developers, and bring all software development in house.
Open Verse Media decided to go with the latter option. They fired MegaStars and instead hired a CTO plus several software developers. This should have given Open Verse Media the ability to move forward with faster and better software, as well as the ability to imagine software projects much more ambitious than anything that had been possible in the awkward and distrustful relationship with MegaStars.
As it happened, Open Verse Media hired the wrong person to be CTO. This person was an egomaniac and very controlling. This irritated the software developers, and after eight months there was a mass exodus where the whole tech team quit. If the goal was to get a team that could care about the software over the long-term, choosing the wrong person to be the team leader undermined the intent, a fact that keeps us from drawing any easy or simple conclusions from this story, regarding the benefits of out-sourcing versus in-sourcing. It evidently isn't true that bringing the work in-house ensures the project will go smoothly. There remain other factors that can sabotage the situation. However, the fact remains that the relationship between the COO and MegaStars was unable to be productive because of distrust between the parties.
excerpt from this book:
https://www.amazon.com/meetings-underrated-Group-waste-time-...
- Time/timezones
- Taxes
- Encryption
Just the graphing library alone would have been worth it to buy rather than develop ourselves.
As for the rest, obviously Prometheus and Grafana are way easier to set up, integrate with other third-party tools, and are much more powerful and battle-proven than something developed from scratch, at no cost either.
______ is hard, you should buy it. Btw, I happen to work for a company that sells it!
Company had bought the CloudShell test automation system from QualiSystems. After struggling with it for a few years and hitting pretty much every possible road bump along the way, it was scrapped along with all the artifacts that had accumulated inside it, because of course none of them are transferable. Or rather, I wish we had scrapped it already; in reality we are in the second year of this process and it will take at least one more year to finish. The replacement is just Python code with some sane, commonly used tooling where needed.
1- WebSocket Server / Service (Poorly designed, barely alive)
It was fine until it was not. It seems managing a lot of connections is harder than our team thought it was. I still don't get why we dedicated a couple of people to this for a very long time. We should have used one of the existing services like Pusher or SignalR etc.
2 - Mobile Push Notifications Service only for our usage.
To be honest this was working fine, but they designed it as if it were going to be one of the competitors. Was not worth the effort.
On the other hand, for a number of years we were running AWS managed redis. Way worse inspectability, worse performance, worse everything than just running it ourselves. And a tad more expensive too. It's easy to think "oh I'll just buy it", but any time there is tight integration, things get hairy quickly...
I will never get that $10k licensing fee back. Awful company, awful product.
I still miss sumologic. Expensive as it may be, I probably prevented many millions of dollars of churn by being able to investigate and fix issues quickly. Businesses aren't good at measuring this.
I've bought something many times only to find it did not sufficiently address my needs.
To be fair, I've also never built something in the time I originally estimated, often off by a factor of up to 10, but the end result is always better for me.
If things go well we made the right decision. If things go poorly we should have gone the other way.
They rebuilt all the buildings, bought all new equipment and hired all Germans, but still got something awful.
So one day they decided: "the place is cursed".
That's the history of my life: I usually buy discounted top products that are still officially supported, and usually they work excellently; the only issue is quickly outdated software. For example, I use an old SGS4 as a photo camera.
A few times I used new products from the cheapest category (costing about the same money as discounted top products), and this was a terrible experience. For example, I once had a corporate phone, the cheapest Sagem, and it was so awful that even an unlimited tariff did not help.
And yes, I am smart enough to DIY a cellular phone.
It did work better than Informatica or whatever, but now we run 20k Spark jobs a day in the cloud, and it is incredibly wasteful to start a whole Spark instance to touch a ten-megabyte CSV file or database table.
What I should have done is added three engineers to the project and made our own homegrown system better and more useful.
Terrible idea.
There is so little margin in them that it’s definitely better to buy!
Even if you own an EDM, and mill, and lathe, and anodizing line, and don’t value your time at all - still buy it unless you are going to make ten.
I'm a database systems guy fundamentally, so in the previous ~3-4 companies I have worked at my main achievement has usually been coming in and ripping out a proprietary, either custom built or poorly selected hosted database and porting it to OSS. Generally to PostgreSQL but also more specialized stores like Apache Druid (replacing a custom TSDB).
The issue is that folks seem to get enamored with shiny or think a certain feature of a proprietary database is too hard to replicate on PostgreSQL. Currently that is Google Firestore, which has a real-time capability. Replicating such capability on PostgreSQL isn't that difficult if you are aware of logical replication and the tools necessary for scaling it if need be (namely a distributed log like Apache Kafka or Pulsar). The end result is a system that is heinously expensive for what it does, poor at the sort of queries it needs to do, and thus the system is riddled with workarounds for its shortcomings.
In the past it also included RethinkDB, MongoDB, Cloud Bigtable, DynamoDB, etc. Some of which can be good datastores if they perfectly match what you are trying to do but most of which you should never touch with a 30ft pole cough MongoDB cough.
Generally speaking you should almost always use PostgreSQL unless you know you need something else.
The other big one for me is porting from runtimes like Lambda and CloudFunctions to k8s (either on EKS or GKE). Both of those runtimes result in terrible architecture and aren't worth the ~2-3 days you save with initial setup of k8s.
For the more general question of build vs buy, I see it this way. If something is core to the flow of your product, the selected solution should be maximized for control and ownership, i.e. favour build over buy unless there are overwhelming positives to buying or the component is so commoditized it really doesn't matter.
Everything purely line-of-business, however, should be Buy > Build, e.g. Slack, GMail/GSuite, ERP/CRM, etc.
An example of buy > build because of superiority is Cloudflare > self-baked CDN. You can't match CF's network; it has to be bought.
An example of buy because it's commoditized is low-level infra components from AWS/GCP, e.g. VMs, networking, LBs, DNS, k8s controllers, etc.
Stuff you should never outsource: authz/authn, runtimes (i.e. never use Lambda or shit like it), databases (except relatively portable ones like RDS/CloudSQL), etc.
Murky stuff where it's not clear: CI/CD (lots of tradeoffs and a wide spectrum here; a hosted controller plus self-hosted runners seems a good sweet spot); email (you generally need multiple upstreams configured with different sending domains to handle being occasionally and randomly blacklisted); observability (it's a bitch to run yourself, but hosted options are heinously expensive and generally come with restrictions that reduce fidelity, directly or indirectly through cost minimization); feature flagging (hosted services usually result in CORS requests, which are slow, and the SDKs often block application startup; better handled yourself, but maybe getting started with something hosted is OK).
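On the feature-flagging point, "handled yourself" can be quite small. The sketch below evaluates flags locally with a deterministic hash, so there is no network call at startup or per request. The flag names, the `FLAGS` table and the percentage-rollout scheme are all assumptions for illustration, not something any particular hosted service does.

```python
import hashlib

# Hypothetical flag table: flag name -> percentage of users (0-100) who get it.
# In practice this could be loaded from a config file shipped with the app.
FLAGS = {"new_checkout": 25, "dark_mode": 100, "beta_search": 0}


def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: str, user_id: str) -> bool:
    """Unknown flags are off; otherwise a user is in when their bucket
    falls below the flag's rollout percentage."""
    return bucket(flag, user_id) < FLAGS.get(flag, 0)


print(is_enabled("dark_mode", "user-123"))    # always on at 100%
print(is_enabled("beta_search", "user-123"))  # always off at 0%
```

Hashing the flag name together with the user id keeps each user's answer stable across restarts while spreading different flags across different users.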
Disclaimer: motivated by this, I'm now building Codeium (https://www.codeium.com), which is a free alternative to Copilot so that no one needs to consider cost as a reason for not using this tech.
Software that I (or the team) built just works. And is a perfect fit for our needs. External software usually isn’t a perfect fit. And bug fixes can take anything from months to infinity to get resolved.
An example was a commercial game engine we used for a PC/console game. It was buggy and slow. We sent the developers code fixes and speed improvements. They ignored them. We ended up ripping out their engine and replacing it with our own.
TLDR: I wanted to travel the US and explore national parks in a camper van. I'm handy as a builder and knew that I could, technically, build everything. Turns out that the challenge wasn't in the technical aspect but rather the sheer volume of work involved.
A little more detail: Having grown up around 4x4 vehicles, I wanted something four-wheel drive. No full-sized American-made vans come from the factory as four-wheel drive. Even if you find a new 4x4 van on the lot, it will be an aftermarket conversion. This means that they're harder to find and, thus, cost a bit more than your typical 4x4 truck. But I was determined. I found a 1987 Ford Econoline in rough shape for around a grand. I bought it, named it "Polar Bear," and set to work on it.
One of the biggest setbacks for the project was the ongoing expense and hassle of repairing an old van with a custom conversion. I learned more about automotive mechanics with this vehicle than anything else I've owned. Still, a lot of repairs were well beyond my scope and I ended up spending tens of thousands of dollars rebuilding various components. Repairs were a near constant problem and this drastically slowed my build process.
Another hitch in my plan was fuel economy. Polar Bear would take me a mere 9.5 miles on a gallon of gas. With a 3-speed transmission, the engine was always running at a high RPM. The alternative factory transmission with an overdrive wasn't as strong as the 3-speed and the gas mileage was only nominally better. After extensive research, I learned that a manual 5-speed swap could be done and would increase my MPG to over 20. However, with all of the expense and hassle of installing a clutch, I never took on this endeavor. Ultimately, I never did that national parks tour. I did, however, go to a whole heckuvalotta cool places.
The camper conversion, which was supposed to be the focus, took me a solid ten years to finally complete. There are a lot of rather sloppy-looking camper vans in the world and I really wanted mine to look good. This meant taking time on the details. And oh my, were there details. Unlike a box truck, a cargo van's walls are all continuously curved. Cutting wood to smoothly fit the curves of the walls is tricky and takes time to figure out. I also learned a ton about automotive and RV electrical systems in the process. When I began, I imagined a very complex system. I quickly learned that the simpler the design, the better the design. Everything has the potential to break, and the more complex it is, the more likely it is to break. In the end, I decided to entirely skip water pumps and simply went with a gravity-based system. And despite having a propane tank mounted to the van, I opted to use a portable camping stove instead of running more propane lines. In my opinion, these were good decisions.
Over the course of ten years, I spent enough money to have bought a nice completed rig right from the start. At the very least, I could have purchased a completed 2wd camper and had it converted to 4x4 for far less money. This would have also given me a more modern 4x4 drive train and suspension.
Still, I have no regrets. What I learned was priceless and the adventures I had along the way are some of my best memories in life. I finally sold the van last year for almost 10x what I paid for it - but far far less than I had invested in it. The following link is the build thread I posted to the Sportsmobile Forums for anyone who may be interested in seeing what she looked like:
https://www.sportsmobileforum.com/forums/f24/polar-bear-1-a-... (the last page has the final pictures)
I started blogging in earnest in 2005. At the time, I thought building my own blogging tool would be too much work, so I looked around for options that would allow me to store my content locally in source control. I settled on Blosxom, a Perl-based tool. Static site generators weren't really a thing in 2005, or I probably would have gone with that.
Blosxom served me well for 15 years, but it had problems. It ran on Apache, and every time I upgraded macOS, I had to fiddle with the Apache config to get it working again. It was customized with various plugins, and over the years I accumulated a lot of plugins and tweaks, to the point where I was afraid of making changes for fear of inadvertently breaking something. And it was very slow. I have a huge amount of content (I just checked, and it's currently at 6.9MB of text) and Blosxom couldn't handle the volume.
These problems meant that I eventually got to the point where I couldn't realistically make changes to the website any more. Blosxom had to be replaced.
Let's set that story aside for a moment. Back in 2012, I started a subscription-based screencast called "Let's Code JavaScript." To learn more about Node.js, I decided to build the website for the screencast myself, rather than using something like WordPress.
This code was a pleasure to work with. I had written it with tests, so it was easy to change and evolve. Starting around 2018, I began evolving it to support more than one site, with the goal of eventually migrating my blog off of Blosxom. I did this gradually, in my spare time, as kind of a fun side project. Mostly it involved reversing questionable design decisions involving globals, and cleaning up bad tests so I could do so.
By 2020, the codebase had evolved into a general-purpose content management engine. I migrated my content from Blosxom. In 2022, I migrated a PHP website a third party built for another business I was involved with. Now the system handles three websites, all on a single server: letscodejavascript.com, jamesshore.com, and agilefluency.org.
The result is so much better than using Blosxom, and I think it's better for my needs than anything I could buy. It's well tested, has a great development feedback loop, and any feature I want is a small code change away. For example, I replaced Google Analytics with simple email alerts that inform me when a piece of content gets a surge of attention. It was an hour or two of work.
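For a sense of how small such an alert can be, here is a rough sketch of the surge check alone. It is not the code from that site (which runs on Node.js); the function names, the hourly granularity and the thresholds are all assumptions, and sending the email is left out.

```python
from statistics import mean
from typing import List


def is_surge(hourly_hits: List[int], factor: float = 3.0, min_hits: int = 50) -> bool:
    """Return True when the latest hour is well above the earlier average.

    hourly_hits: hit counts per hour, oldest first; the last entry is the
    current hour. factor and min_hits are illustrative thresholds.
    """
    if len(hourly_hits) < 2:
        return False  # no baseline to compare against
    *history, current = hourly_hits
    baseline = mean(history)
    return current >= min_hits and current > factor * baseline


def alert_message(page: str, hourly_hits: List[int]) -> str:
    """Build the body of an alert for the most recent hour."""
    return f"{page} got {hourly_hits[-1]} hits in the last hour"


hits = [10, 12, 9, 200]
if is_surge(hits):
    print(alert_message("/blog/some-post", hits))
```

The `min_hits` floor is there so that a page going from one visit to ten does not trigger an email.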
I don't think there are any grand lessons to be learned from this, other than the one lesson everybody forgets when they're contemplating build vs. buy: it's not "the cost of building" vs. "the cost of buying". It's "the cost of building + maintaining" vs. "the cost of buying + the cost of keeping up with vendor updates + the opportunity cost of some feature changes being unsupported."
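That comparison can be written down as a toy calculation. Everything below is an assumption made for illustration: the class names, the choice to model buying as a yearly fee, and especially the numbers, which are made up.

```python
from dataclasses import dataclass


@dataclass
class BuildOption:
    build_cost: float            # one-time cost of building
    maintenance_per_year: float  # ongoing cost of maintaining it

    def total(self, years: int) -> float:
        return self.build_cost + self.maintenance_per_year * years


@dataclass
class BuyOption:
    license_per_year: float               # the cost of buying
    upgrade_effort_per_year: float        # keeping up with vendor updates
    missing_feature_cost_per_year: float  # opportunity cost of unsupported changes

    def total(self, years: int) -> float:
        per_year = (self.license_per_year
                    + self.upgrade_effort_per_year
                    + self.missing_feature_cost_per_year)
        return per_year * years


# Made-up numbers, purely for illustration.
build = BuildOption(build_cost=40_000, maintenance_per_year=5_000)
buy = BuyOption(license_per_year=6_000, upgrade_effort_per_year=2_000,
                missing_feature_cost_per_year=4_000)

for years in (1, 3, 5, 10):
    print(years, build.total(years), buy.total(years))
```

With these particular numbers buying is cheaper over five years and building is cheaper over ten, which is the usual shape of the tradeoff: the answer depends on the time horizon as much as on the price tag.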
In my case, in 2018-2020, "build it yourself" was the right decision, because I had a robust codebase that I could evolve into a content management engine. In 2005, "buying" Blosxom was the right decision. I don't regret either choice, and I'm happy with where I've ended up.