When I googled this myself, most answers were something like: each language has its own characteristics that make it more or less suited to solving certain problems. But why does that matter if they all end up as machine code? What am I missing, or what are the difficulties in creating such a universal representation?
The short answer is that any "intermediate" form is just another abstraction over machine language. It's turtles all the way down, as the old joke goes. The world of software is made of layers of abstraction, each with its own specific, constrained purpose and benefits. There isn't one abstraction to rule them all.
Having said that, there are several universal ways to represent the syntax of any individual language. My favorite is EBNF[0].
[0] https://en.m.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur...
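To make that concrete, here is a minimal sketch of a made-up arithmetic language: its syntax written in EBNF (in the comment), followed by a small hand-written recursive-descent evaluator in Python. The grammar and the names `tokenize` and `evaluate` are my own illustration, not taken from any real language. Note that the EBNF only pins down the syntax; what `+` or `*` actually mean is decided by the Python code.

```python
import re

# EBNF for a tiny, hypothetical arithmetic language:
#   expr   = term , { ( "+" | "-" ) , term } ;
#   term   = factor , { ( "*" | "/" ) , factor } ;
#   factor = number | "(" , expr , ")" ;
#   number = digit , { digit } ;

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Parse and evaluate text, with one function per grammar rule."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        # expr = term , { ( "+" | "-" ) , term } ;
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        # term = factor , { ( "*" | "/" ) , factor } ;
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        # factor = number | "(" , expr , ")" ;
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return result


print(evaluate("2 + 3 * (4 - 1)"))
```

Each grammar rule maps onto one function, which is why EBNF is handy for describing syntax. It says nothing about semantics, though, so it doesn't get you a universal language.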
As you say, the difference is really how hard it is to solve certain problems in each language, plus the way the languages expose I/O.