We've learned a lot about computers in the decades we've been producing them, and a lot of software assumes that computers work a certain way. See how Go handles file permissions. The Linux kernel still ships drivers for the IBM 3270 Display System, which at this point is over 50 years old. Terminal emulators mimic the VT100. All of this makes sense given the context in which these systems were developed. But if you were developing an OS in 2022 without these constraints, what would you remove or change? Would you keep 32-bit support? ARM, RISC-V, and/or x86? Monolithic or microkernel? C, Rust, Zig, D, C++, or something else? How would it talk to hardware? What would your executable format look like?
Also, if anyone has any interesting links to cutting-edge research into modern operating systems, I'd love to browse them!
Right now, if an important feature is missing from e.g. the built-in calendar app, your only option is to throw away the entire app and replace it with something new. This new app is probably less integrated into the rest of the system, or may be missing some other feature you use, or may come packed with features but also a cluttered and confusing interface.
This causes OS vendors to pack too many features into their default app suites. Today's default apps are at once too barebones for advanced users (like most people who read Hacker News) and too complicated for e.g. my grandmother, who keeps accidentally recording iPhone videos with slow motion enabled.
I want an OS with a base feature set similar to early versions of iOS or the original Macintosh, but which can be easily extended by third party software. Need to batch rename calendar events, or send an email at a specific time of day? There is not an app for that—but there is a plugin!
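A minimal sketch of what such a plugin hook might look like, in Rust, with entirely hypothetical names (no real OS exposes this interface):

    // Hypothetical sketch: a narrow plugin hook a built-in calendar app could
    // expose. None of these names come from a real OS; they only show the idea.

    struct Event {
        id: u64,
        title: String,
    }

    // A plugin extends the built-in app instead of replacing it wholesale.
    trait CalendarPlugin {
        fn name(&self) -> &str;
        // Called when the user selects some events and invokes this plugin.
        fn run(&self, selection: &mut [Event]);
    }

    // Example plugin: batch-rename the selected events.
    struct BatchRename {
        prefix: String,
    }

    impl CalendarPlugin for BatchRename {
        fn name(&self) -> &str {
            "Batch rename"
        }
        fn run(&self, selection: &mut [Event]) {
            for event in selection.iter_mut() {
                event.title = format!("{} {}", self.prefix, event.title);
            }
        }
    }

    fn main() {
        let plugin = BatchRename { prefix: "[2022]".to_string() };
        let mut events = vec![
            Event { id: 1, title: "Standup".into() },
            Event { id: 2, title: "Retro".into() },
        ];
        println!("running plugin: {}", plugin.name());
        plugin.run(&mut events);
        for e in &events {
            println!("{}: {}", e.id, e.title);
        }
    }

The point is that the plugin only gets a narrow, well-defined surface of the built-in app, so the base app can stay simple while advanced users bolt on what they need.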
Normal files would just be a 'bag of bits', the way current file systems treat files. Objects would have attributes and methods that make them different from the other objects used by the system.
For example, the 'boot object' would just point to the 'loader object', which contains the IDs of all the operating system modules needed to get the OS up and running. You could have multiple loader objects (one for each OS or version of the OS you wanted to boot) within the same container. Imagine being able to have every Windows or Linux version installed within the same volume and be able to boot up any one of them! Files that are shared between OS versions would simply have their IDs listed in both loader objects.
Configuration objects, logging objects, folder objects, policy objects, and other types of objects would be used by the OS for particular behaviors. Objects could also have attributes to distinguish the ones downloaded off the internet from the ones the user created. (Why back up all those cat videos you could easily re-download?) You could easily distinguish all the photos from the documents, even when there are millions of each, without having to memorize the file extensions for every format.
This is the system I have been building. https://www.Didgets.com
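To make that concrete, here is a much-simplified sketch; the types and names are illustrative only, not the actual Didgets data model:

    // Illustrative only; not the actual Didgets data model.

    use std::collections::HashMap;

    type ObjectId = u128;

    #[allow(dead_code)]
    #[derive(Debug, Clone, Copy)]
    enum Origin {
        UserCreated,
        Downloaded, // e.g. re-downloadable cat videos a backup could skip
    }

    #[allow(dead_code)]
    #[derive(Debug)]
    enum ObjectKind {
        // A "normal" file: just a bag of bits, as today.
        Bag(Vec<u8>),
        // A loader object: the IDs of every module one OS install needs to boot.
        Loader { os_name: String, module_ids: Vec<ObjectId> },
        // Other typed objects the OS understands directly.
        Config(HashMap<String, String>),
        Folder(Vec<ObjectId>),
    }

    #[derive(Debug)]
    struct Object {
        id: ObjectId,
        origin: Origin,
        kind: ObjectKind,
    }

    fn main() {
        // Two loader objects in the same container can share module IDs,
        // so two OS versions can coexist on one volume.
        let shared_module: ObjectId = 42;
        let loaders = vec![
            Object {
                id: 1,
                origin: Origin::UserCreated,
                kind: ObjectKind::Loader {
                    os_name: "OS v1".into(),
                    module_ids: vec![shared_module, 100, 101],
                },
            },
            Object {
                id: 2,
                origin: Origin::UserCreated,
                kind: ObjectKind::Loader {
                    os_name: "OS v2".into(),
                    module_ids: vec![shared_module, 200, 201],
                },
            },
        ];
        for obj in &loaders {
            if let ObjectKind::Loader { os_name, module_ids } = &obj.kind {
                println!("object {} ({:?}): {} boots from modules {:?}",
                    obj.id, obj.origin, os_name, module_ids);
            }
        }
    }
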
Imagine being able to easily step through the entire execution from user to kernel space. And also step backwards. And also record replay.
And then make instant changes during dev to any layer of the stack.
Think about how much time is spent across so many different tools trying to figure out perf bottlenecks.
Then think about if you wanted to patch something or write a driver, how difficult it is.
Then you would also want the entire system visualizable.
Imagine if the entire data flow and data structure of the system were automatically visualizable by instrumenting the source code. Why do we draw diagrams on whiteboards, if not because they are the best way to understand code?
Do this first and the rest becomes easy.
If you are aware of anyone working on something like this, please let me know!
I think that dbus does a much better job of this (and it should be possible to have a tool that interacts with the system bus on the CLI). I would definitely take more inspiration from dbus than POSIX for a modern OS.
HotOS is a workshop, but tends to be more creative and daring since it's just top researchers sharing their thoughts on OS design.
Off the top of my head, in recent years you've had unikernels (e.g., MirageOS), Barrelfish (a multikernel), omnix (for dealing with lots of ASICs and other accelerators), and lots of modifications to existing OS design, like an io_uring-style asynchronous system call mechanism (FlexSC), systems for adding new abstractions on top of modern hardware, like Dune, and of course ongoing work on microkernels, most significantly (and commercially) seL4.
Older research is still relevant. Capability systems and exokernels influence the above work and the original well-spring is still worth visiting.
Basically though, what you want to address are some of the following:
* Higher security requirements. Multitenancy is a given. Running giant code blobs without restriction in the same security domain (i.e., monolithic kernels) is a bad idea.
* Greater hardware support for differing security modes (virtualization, enclaves, trust zones, CHERI)
* Advances in compiler/language research and techniques for low-cost policy enforcement at the language level (to at least some degree), allowing for 'uploading' code into ring 0. Rust is the most popular example, but even the eBPF trend is an example of this.
* Heterogeneous hardware. How does the kernel handle differing cores with different performance characteristics, power requirements, capabilities, and even different ISAs? This is happening. Should the OS expose better primitives for ASICs and other accelerators?
* Cloud computing. Many big-money workloads run under someone else's control with a hypervisor. This brings up attestation, but also raises a question: if you're running a single application under a hypervisor (itself basically a microkernel), what do you need a full-fledged kernel for? It's similar to the userland duplication we see with containers. Unikernels are one response, but it's something that should be addressed.
All in all, pretty much anything I see in industry (conceptually) is trialed in one form or another in academia many years in advance. Industry is theoretically where the really hard work of making it practical happens, but sadly that seems to happen very rarely.
I'd like to play a game on my phone but have my desktop's GPU render frames for me like Stadia.
I want my server to pick the next song in the playlist on my phone to match the tempo of the heart rate it hears from my watch.
I'd like to type "make -j" and have it transparently use every core ...on every device I own.
These things can be done today with ad-hoc hacks, but an OS designed around these ideas would mean you could write lots of programs very differently.
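As a very loose sketch of how differently you could write such a program if the OS handled placement across devices; everything here is invented, and the "core enumeration" is faked with a fixed list and local threads:

    // Purely hypothetical: what "make -j across every device I own" might look
    // like if the OS offered location-transparent scheduling. Nothing here
    // corresponds to a real system; remote cores are faked with local threads.

    use std::thread;

    // Stand-in for an OS-provided handle to a core, local or remote.
    struct Core {
        device: &'static str,
    }

    // In the imagined OS this would enumerate cores on the phone, desktop,
    // laptop, etc.; here it is a fixed list.
    fn enumerate_all_cores() -> Vec<Core> {
        vec![
            Core { device: "phone" },
            Core { device: "desktop" },
            Core { device: "laptop" },
        ]
    }

    fn main() {
        // The program only says "run these jobs"; placement is the OS's problem.
        let jobs = vec!["compile a.c", "compile b.c", "compile c.c"];
        let cores = enumerate_all_cores();

        let handles: Vec<_> = jobs
            .into_iter()
            .zip(cores)
            .map(|(job, core)| {
                thread::spawn(move || {
                    // A real implementation would ship the job to core.device;
                    // here we only report where it would run.
                    println!("{} -> runs on {}", job, core.device);
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
    }
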
USENIX ATC '21 / OSDI '21 Joint Keynote Address: It's Time for Operating Systems to Rediscover Hardware
The alternative is to build a world where everyone is nice. This is the preferred solution but it seems more difficult to create than a secure system for computing. Maybe one approach to secure computing would be to invent a new paradigm for non-networked computing that does everything we can do now, but is just slower to use.
Your spreadsheets aren't in numbers.app or excel.exe. They're spreadsheets, and get opened in, guess what, the spreadsheets app. It shouldn't matter to the end user what's behind the spreadsheets app.
Your web browser? It's not chrome.exe, firefox.app, or Safari. It's just "the web" app.
And while we're at it re: branding. No, you don't have Excel files. Or Word files. They're spreadsheets, and documents.
And another thing:
Places where files go. Guess where your spreadsheets are? That's right, /files/spreadsheets/ -- and you'll never guess where your documents are saved!
Guess where your browser's preferences are saved?
No, not in opt/var/something/whatevz/users/your-name/lib/etc/browser/endless-UUID/prefs
It's in /preferences/browser/
This comment brought to you by someone who spent the better part of two days trying to figure out where his built-in apache2 installation, as well as the homebrew port, saved their prefs. Oh, and same with PHP.
E.O.R.
A machine would be equipped with a three-layer object system: one layer for core objects (any reusable part, locally and/or remotely stored); a personal layer, impenetrable by default (for any and all private data, personal config, tweaks, metadata, anything made and/or generated by the Owner's actions); and a third layer for deliberately shared data (open to anyone with the right key).
Modular core object interpreters could be added for Owners to access any code and/or stacks or packages, written in any language. Imagine a Windows interpreter, able to interact with compatible apps, at a cost for the Owner, should she/he choose to acquire such capabilities. Same for Mac or Linux or Android or any other interpreters.
The interconnecting links between machines would also be designed around the Owner's best interest, using a differentiated, air-gapped communication pipeline for each layer. The Owner would either choose to keep their own data locally, or store it in a Data Bank (and get some revenue from sharing it, in part or in totality).
As for the OS interface, to each their own: from CLI, to window-based, to personal assistants, to _Her_-style interactions.
The thing is, between layers, hardware, data handling, core objects and interfaces - and the prerequisite for Owners to really own their data and dispose of it as they wish - it would be very hard for any one company to serve the market vertically without lots of public scrutiny. As a paying customer, I'd be incentivized to buy hardware and services from the most reputable sellers: best machines, best data bank or broker, best interconnects, best storage, best interfaces, best software environment, etc. Privacy, modularity and compatibility, backed by solid legislative guarantees globally.
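A rough sketch of the three layers and the default access rule described above; the names and the policy details are invented purely for illustration:

    // Hypothetical sketch of the three layers; names and access rule invented.

    #[allow(dead_code)]
    #[derive(Debug, Clone, Copy)]
    enum Layer {
        Core,     // reusable parts, locally and/or remotely stored
        Personal, // impenetrable by default: only the Owner
        Shared,   // open to anyone holding the right key
    }

    struct Object {
        layer: Layer,
        owner: String,
        share_key: Option<String>,
    }

    fn can_access(obj: &Object, requester: &str, presented_key: Option<&str>) -> bool {
        match obj.layer {
            Layer::Core => true,
            Layer::Personal => requester == obj.owner,
            Layer::Shared => match (&obj.share_key, presented_key) {
                (Some(expected), Some(given)) => expected.as_str() == given,
                _ => false,
            },
        }
    }

    fn main() {
        let diary = Object {
            layer: Layer::Personal,
            owner: "alice".into(),
            share_key: None,
        };
        let dataset = Object {
            layer: Layer::Shared,
            owner: "alice".into(),
            share_key: Some("k123".into()),
        };
        assert!(can_access(&diary, "alice", None));
        assert!(!can_access(&diary, "data-broker", None));
        assert!(can_access(&dataset, "bob", Some("k123")));
        assert!(!can_access(&dataset, "bob", None));
        println!("access checks pass");
    }
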
https://en.wikipedia.org/wiki/Midori_(operating_system) https://web.archive.org/web/20210216182214/http://joeduffybl...
It's sometimes said that C/C++ are so performant because they are close to the metal, but I think often at this point we're seeing the reverse causal arrow: C/C++ remain so performant because that's still the "virtual machine" most modern processor/computer architectures are targeting. C/C++ ideas of memory management. C/C++ ideas of process/thread concurrency. No current processor architect would dare to break that model and do something in a new "modern" mold at the risk of sabotaging the performance of C/C++ apps that don't/can't/won't migrate to new paradigms/abstractions.
(Arguably that's part of the "betrayal" of the Meltdown and Spectre CPU timing bugs/hacks: the CPU does too many things to mimic a C/C++ "machine" even though the raw details of the underlying concurrency model in the hardware have had to change vastly, and that broken abstraction bleeds into a vulnerability. What if the hardware didn't have to work so hard to pretend to be a very fast C/C++ "strictly in order" machine? Could it be a faster machine if the abstractions it targeted were themselves more concurrent and more secure?)
Lisp machines in the 70s had hardware memory garbage collectors. We've advanced the state of the art of memory garbage collectors quite a bit in the decades after, but few since have thought to again try to express that in hardware.
There have been dalliances with chipsets designed to more closely mimic the JVM (Java Virtual Machine) or CLR (Common Language Runtime; .NET's VM) that got some play in embedded design spaces, but the common refrain was that they ran C/C++ code too slowly, so they were labeled "slow" and "not very useful for general purposes" (because we've tautologically defined C/C++ as "for general purposes").
It's easy to wonder if we could escape this trap of building our hardware machines to resemble our least capable, lowest common denominator "virtual machine" like the C/C++ model and push towards higher level concepts deep in hardware.
Lisp machines and JVM/CLR machines all have more interesting reflectivity over live objects than C/C++ machines. There are, at the very least, component-model benefits to being able to get general information about live object state. There are roads to more "Actor"-based designs, and more "object-oriented" possibility spaces than "files".
If you are talking about true, "clean slate", no concerns about backwards compatibility whatsoever, no concerns that interoperating with older software might need to be truly emulated and might possibly be slower in trade-off for better software design of new, non-emulated stuff, I think you might start at basic hardware assumptions and architectures. There's so much more to experiment there that we don't have the budget to explore if we have to continue to keep up the illusion that "C/C++ apps are fast and close to the hardware" for backwards compatibility's sake.
For example: a process trying to use all the memory on the machine should evict its own memory pages and swap them to disk, but not evict the memory pages of other processes.
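A toy sketch of that policy, assuming a per-process resident set with LRU inside it (illustrative only, not real kernel code):

    // Sketch only (not real kernel code): when a process exceeds its memory
    // budget, pick a victim page from that process's own resident set, never
    // from another process's.

    use std::collections::HashMap;

    #[derive(Debug, Clone, Copy)]
    struct Page {
        page_number: u64,
        last_access: u64, // coarse access clock, used for LRU
    }

    struct ResidentSets {
        // pid -> pages currently in RAM for that process
        per_process: HashMap<u32, Vec<Page>>,
    }

    impl ResidentSets {
        // Choose a victim only from the offending process (LRU within it).
        fn victim_for(&self, pid: u32) -> Option<Page> {
            self.per_process
                .get(&pid)?
                .iter()
                .min_by_key(|page| page.last_access)
                .copied()
        }
    }

    fn main() {
        let mut sets = ResidentSets { per_process: HashMap::new() };
        sets.per_process.insert(1, vec![
            Page { page_number: 10, last_access: 5 },
            Page { page_number: 11, last_access: 2 },
        ]);
        sets.per_process.insert(2, vec![
            Page { page_number: 99, last_access: 1 }, // colder, but belongs to pid 2
        ]);

        // Process 1 is thrashing: it must swap out its own coldest page (11),
        // even though process 2 owns an even colder one.
        let victim = sets.victim_for(1).unwrap();
        assert_eq!(victim.page_number, 11);
        println!("swap out page {} of pid 1", victim.page_number);
    }
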
I would add scripting and such to make it a better platform for servers.
I would have SSH, but I'd probably get rid of Bash completely in favor of one of the new modern shells. Or perhaps I'd have a pure logicless command prompt.
Most likely since we can do unlimited NIH here, I would aim for Dart as the system scripting language and add some JIT and caching to make it fast to compile, if I didn't just go with Python.
I'd have Ansible included natively, modified so that playbooks were a locally executable type like a shell script, so that random one-off backup scripts and the like could be totally declarative.
Then I'd make a point to have a strong focus on not needing separate images for different hardware.
I would probably focus on RISC-V, on the assumption that it will be big by the time this was all finished, but I'd be sure to have strong ARM support.
I would probably get rid of "everything is a file" completely. I'd have a stripped-down dbus-like thing for talking to devices. Everything is an object.
I would very aggressively hunt down and destroy anything that requires imperative modifications.
There would be a standard format for config files, where they go, and how different files override each other. Program defaults, OS-specific tweaks, and runtime values in RAM would all be supported as layers.
In fact, my package management would probably be based on a list of files describing what was wanted. Add a file that says I want apache, with this config setup, and the manager detects it and knows to install the package.
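A rough sketch of how that wanted-list reconciliation and the layered config overrides might fit together; the file format and field names are invented for illustration:

    // Rough sketch of a declarative "wanted list" plus layered config; the
    // format and field names are invented for illustration only.

    use std::collections::{BTreeMap, HashSet};

    // One entry in the wanted list: a package plus config overrides for it.
    struct Wanted {
        package: &'static str,
        config_override: BTreeMap<&'static str, &'static str>,
    }

    // Layered config: program defaults < OS-specific tweaks < wanted-list override.
    fn effective_config(
        defaults: &BTreeMap<&'static str, &'static str>,
        os_tweaks: &BTreeMap<&'static str, &'static str>,
        wanted: &Wanted,
    ) -> BTreeMap<&'static str, &'static str> {
        let mut merged = defaults.clone();
        merged.extend(os_tweaks.iter().map(|(k, v)| (*k, *v)));
        merged.extend(wanted.config_override.iter().map(|(k, v)| (*k, *v)));
        merged
    }

    fn main() {
        let wanted = vec![Wanted {
            package: "apache",
            config_override: BTreeMap::from([("listen_port", "8080")]),
        }];
        let installed: HashSet<&str> = HashSet::from(["openssh"]);

        for w in &wanted {
            // The manager notices apache is wanted but not installed.
            if !installed.contains(w.package) {
                println!("install {}", w.package);
            }
            let defaults = BTreeMap::from([("listen_port", "80"), ("workers", "4")]);
            let os_tweaks = BTreeMap::from([("workers", "8")]);
            let config = effective_config(&defaults, &os_tweaks, w);
            println!("{} effective config: {:?}", w.package, config);
        }
    }
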
If you need to edit a system file, rather than add a layered override, there's a problem. If you need to ever swap out a core package, there's a problem.
This would be an extreme cattle not pets OS, designed for zero tinkering, just copy some files over and you're running.
Packages would always have everything they need to clean reinstall stored on disk.
Offline installing packages is done by copying a package file to the cache dir, then adding its config to the wanted list. You can do it in a disk image. All needed actual setup will happen at next boot.
No true containers. Instead you have apps, like Android: there are API levels with everything you need, and that is your standard base to build on.
Since servers need real databases sometimes, "make me a postgresql database" would be part of the OS API.
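Purely hypothetical, but a "database as an OS API" call might look something like this (no such API exists; the names and the socket path are invented):

    // Hypothetical only: what an OS-level "make me a database" call might look
    // like. No such API exists; the names and the socket path are invented.

    #[derive(Debug)]
    struct DatabaseHandle {
        name: String,
        socket_path: String,
    }

    #[derive(Debug)]
    enum OsError {
        AlreadyExists,
    }

    // The OS owns provisioning, storage placement, and upgrades; the application
    // just asks for a database by name and gets back a place to connect to.
    fn request_database(name: &str, existing: &[String]) -> Result<DatabaseHandle, OsError> {
        if existing.iter().any(|n| n.as_str() == name) {
            return Err(OsError::AlreadyExists);
        }
        Ok(DatabaseHandle {
            name: name.to_string(),
            socket_path: format!("/run/os-db/{}.sock", name),
        })
    }

    fn main() {
        let existing = vec!["accounting".to_string()];
        match request_database("inventory", &existing) {
            Ok(db) => println!("connect to {} at {}", db.name, db.socket_path),
            Err(e) => println!("failed: {:?}", e),
        }
    }
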
Networking would be fully mesh-first. You'd get a Yggdrasil-like IP address, with a built-in firewall that blocks incoming connections. All software would be aware that these addresses are secure even without HTTPS.
And nothing would ever unnecessarily write to disk. There would be support for battery-backed RAM, and if it's available, log files and nonsense like the last position of a window or what the browser was doing would go there. It gets snapshotted when you shut down.
Nothing would ever touch disk except stuff that the user actually would be upset to lose.
Everything would be Rust rather than C, aside from existing legacy stuff. Nothing else in its class seems to have that level of dedication to safety above everything else. Crap software is easy; unbuggy software is what's hard, so we should use a language that focuses on the hard part, not on banging out random MVPs; you can do that in just about any language already.
Everything should just work. Everything should be standard and consistent. And it should protect the hardware and be able to run for 30 years on a cheap eMMC.
Here are some of my ideas (although there are a lot more that I have not listed here, which I may write about in the future):
- Real-time capabilities (which can be ignored if not needed). Some applications find them useful.
- Entirely non-Unicode. International text is supported, but it isn't based on Unicode and doesn't require any unified character encoding.
- A system call interface proxy. This can be used for such things as providing file permissions, system call tracing, maintaining some parts of a process's state, and other capabilities, rather than all of them being part of the kernel. The definition of the system call interface is as independent of the instruction set as possible; this makes writing them simpler. (The way the interface corresponds to actual features of the computer does depend on the instruction set, though.)
- Hypertext file system. Any file can inherently contain links to other files (rather than their being treated as byte sequences like other data, although of course they are still stored on disk as byte sequences), whether a normal link (which points to whatever version is current at the time of access), a link to a specific version of the file (resulting in a copy-on-write object), or a link to a part of a file.
- It has its own programming language (which can be used as a REPL as well as for fully compiled programs), but C can also be used, and assembly language is also possible. (If using C, then partial compatibility with POSIX and TRON may also be possible.)
- "Objects" are "files" (they are essentially the same thing), and have an "object/file descriptor" to access them. This can be used for disk files as well as for other dynamic features such as devices, process data, files dynamically translated upon access, etc. (Devices are not a separate object type as in POSIX; they are the same as anything else.) Like any file, they may contain links, which may allow you to obtain access to them.
- A common file format for most (although not quite all) purposes, similar to the TRON Application Databus, but with many differences. This would also be used for the command-line interface. The same data can be shown in both the command line and the GUI (e.g. if a data table or graph is shown, you can easily apply your own filters, import/export, etc., with any other program; if it is formatted text, you can easily use it with other programs too; if it is numbers with units of measurement, you can convert them; etc.).
- Files can have multiple streams, numbered by 32-bit numbers (which need not be consecutively assigned). Stream numbers 0 to 255 have a standardized use, and other numbers do not have a standardized use.
- You can run a program with any instruction set on any computer, whether or not the computer uses that instruction set; if it doesn't, the program will be emulated. (Whether the program is x86 or ARM or RISC-V, it will run on any of them.) Also, if the program mainly uses one instruction set but uses some extensions (from a newer version of that instruction set) that this computer does not have, then those instructions are emulated.
- Message bus for higher-level interfaces (these can be proxied like other interfaces, too, mainly by writing your own translator). Types are required, so that programs written for different instruction sets with different endianness can communicate with each other without being confused, and so that message passing can also include links. (A rough sketch of such a typed message appears at the end of this comment.)
I have many more ideas than only these above things, though!!!
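To make the typed message bus idea a bit more concrete, here is a rough sketch of one message type encoded with an explicit byte order, so sender and receiver endianness never matters; the layout is invented for illustration:

    // Rough sketch of why typed, fixed-byte-order messages matter: sender and
    // receiver can run on CPUs with different endianness and still agree.
    // The message layout here is invented for illustration.

    // A toy message: a UTF-8 sensor name plus one 32-bit temperature reading.
    struct Reading {
        sensor: String,
        millidegrees: i32,
    }

    // Encode with an explicit byte order (little-endian here), independent of
    // whatever the host CPU uses natively.
    fn encode(msg: &Reading) -> Vec<u8> {
        let mut out = Vec::new();
        let name = msg.sensor.as_bytes();
        out.extend_from_slice(&(name.len() as u32).to_le_bytes());
        out.extend_from_slice(name);
        out.extend_from_slice(&msg.millidegrees.to_le_bytes());
        out
    }

    fn decode(bytes: &[u8]) -> Reading {
        let mut len_bytes = [0u8; 4];
        len_bytes.copy_from_slice(&bytes[0..4]);
        let name_len = u32::from_le_bytes(len_bytes) as usize;

        let sensor = String::from_utf8(bytes[4..4 + name_len].to_vec()).unwrap();

        let mut value_bytes = [0u8; 4];
        value_bytes.copy_from_slice(&bytes[4 + name_len..8 + name_len]);
        let millidegrees = i32::from_le_bytes(value_bytes);

        Reading { sensor, millidegrees }
    }

    fn main() {
        let original = Reading { sensor: "cpu0".into(), millidegrees: 41_500 };
        let wire = encode(&original);
        let copy = decode(&wire);
        assert_eq!(copy.sensor, original.sensor);
        assert_eq!(copy.millidegrees, original.millidegrees);
        println!("round trip ok: {} = {} millidegrees", copy.sensor, copy.millidegrees);
    }
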