Edit: You could actually do this on pre-ALSA Linux, and possibly still can on modern BSD. OSS created a DSP device that you could write to directly, even from a shell. If memory serves, I remember writing to /dev/dsp from the shell, but it's possible there was some intermediary that I'm forgetting about.
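To make the /dev/dsp idea concrete, here's a rough sketch in Python rather than shell. It assumes the device accepts raw 8-bit unsigned mono samples at 8 kHz without any setup, which is what I recall the OSS default being; the `tone` and `play` helpers are names I made up for illustration, and on a machine without /dev/dsp the write is simply skipped.

```python
import math
import os

# Assumption: the device plays raw 8-bit unsigned mono PCM at 8 kHz by default.
SAMPLE_RATE = 8000


def tone(freq_hz, seconds, rate=SAMPLE_RATE):
    """Return raw 8-bit unsigned PCM samples for a sine tone."""
    n = int(seconds * rate)
    return bytes(
        int(127.5 + 127.5 * math.sin(2 * math.pi * freq_hz * i / rate))
        for i in range(n)
    )


def play(samples, device="/dev/dsp"):
    """Write samples to the device; return False if it is missing or unusable."""
    if not os.path.exists(device):
        return False
    try:
        with open(device, "wb") as dsp:
            dsp.write(samples)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Half a second of A440, if the device happens to exist.
    play(tone(440, 0.5))
```

The whole "audio API" here is opening a file and writing bytes to it, which is about as close to the one-liner as a general-purpose OS ever got.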
In the early 80s, the computer had approximately one sound-making device, and you could make it do its thing by writing some bytes to some memory address or whatever.
Now, "in iOS for example", there's a speaker, possibly any number of Bluetooth devices, maybe some docking sound device too, so you need some mixer to control them individually. And maybe the phone is in silent mode or vibrate, in which case it (probably?) shouldn't play a note, but maybe it should vibrate instead? Or maybe not, depending on what the note means...
So now... you need to either deal with all this stuff yourself, or have some complex framework that tries to hide some or all of it from you... but it's probably a leaky abstraction at best, and the framework requires some number of lines of code much greater than one to set up.