What other benchmarks do you use for your programming languages?
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
I haven't implemented a language or VM of my own, but when I write a library, even one for a very specialized purpose, I often write several performance tests to be on the safe side (correctness being the more important concern most of the time). I also typically want tests that involve other libraries (or other VMs, in your case) to get an idea of how fast I am in comparison. Otherwise I'd be stuck with "oh wow, doing something 1000 times takes just under a moment" kind of wisdom.
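To make that concrete, here is a minimal sketch of the kind of comparative test I mean, using Python's standard `timeit` module. Everything in it is illustrative: `my_sum` is a hypothetical stand-in for the library function under test, the built-in `sum` plays the role of the "other library", and `compare` / `report` are helper names I made up for this example.

```python
import timeit


def my_sum(values):
    """Hypothetical stand-in for the library function under test."""
    total = 0
    for v in values:
        total += v
    return total


def compare(candidates, data, number=200, repeat=5):
    """Time every candidate on the same input.

    Returns a dict mapping name -> best observed seconds per call.
    """
    results = {}
    for name, fn in candidates.items():
        # timeit.repeat returns one total time per repetition;
        # the minimum is the one least disturbed by background noise.
        times = timeit.repeat(lambda: fn(data), number=number, repeat=repeat)
        results[name] = min(times) / number
    return results


def report(results, baseline):
    """Format timings, fastest first, with a ratio against the baseline."""
    base = results[baseline]
    lines = []
    for name, secs in sorted(results.items(), key=lambda kv: kv[1]):
        lines.append(
            f"{name}: {secs * 1e6:.1f} us/call, {secs / base:.2f}x {baseline}"
        )
    return lines


if __name__ == "__main__":
    data = list(range(10_000))
    candidates = {"my_sum": my_sum, "builtin_sum": sum}

    # Correctness first: a fast wrong answer is not worth measuring.
    assert my_sum(data) == sum(data)

    for line in report(compare(candidates, data), baseline="builtin_sum"):
        print(line)
```

The ratio column is the part that matters: an absolute number like "30 us/call" says little on its own, while "Nx the baseline" on identical input tells you where you stand.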