Since I'm mostly interested in it for benchmarking, I'm looking for a search engine that lets me search by number of columns, number of rows, and sparsity.
Whether or not your favorite engine supports those search criteria, I'm still curious to hear what you like.
Your search criteria make perfect sense for benchmarking. I think we could indeed add that to https://slight.run/datasets, since it would also benefit companies searching their own internal data. Number of columns and rows would be simple enough. Depending on the DB, we could do a size range too ("I want to see our company's heaviest datasets", etc.), which would also be useful for your benchmark use case.
Sparsity is interesting. We could use % of empty cells, or MB / (cols × rows), or, probably what I'd like most, % empty per column, but I'm not yet sure how to make that an easy search criterion.
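To make those three options concrete, here is a rough sketch in plain Python. The function names and the toy rows are made up for illustration, "empty" is assumed to mean None, and the size in MB is assumed to come from wherever the catalog already gets it. This is not how Slight computes anything.

```python
from typing import Dict, List, Optional, Sequence

Row = Sequence[Optional[object]]  # assumption: None marks an empty cell


def percent_empty(rows: List[Row]) -> float:
    """Share of empty cells across the whole dataset, in percent."""
    total = sum(len(r) for r in rows)
    if total == 0:
        return 0.0
    empty = sum(1 for r in rows for v in r if v is None)
    return 100.0 * empty / total


def mb_per_cell(size_mb: float, n_cols: int, n_rows: int) -> float:
    """MB / (cols x rows): a crude density measure."""
    cells = n_cols * n_rows
    return size_mb / cells if cells else 0.0


def percent_empty_per_column(columns: List[str], rows: List[Row]) -> Dict[str, float]:
    """Percent of empty cells for each column, keyed by column name."""
    if not rows:
        return {name: 0.0 for name in columns}
    result = {}
    for i, name in enumerate(columns):
        empty = sum(1 for r in rows if r[i] is None)
        result[name] = 100.0 * empty / len(rows)
    return result


# Hypothetical toy dataset: 3 columns, 4 rows, 5 empty cells.
columns = ["id", "date", "score"]
rows = [
    (1, "2024-01-01", None),
    (2, None, None),
    (3, "2024-01-03", 0.5),
    (4, None, None),
]

overall = percent_empty(rows)
per_column = percent_empty_per_column(columns, rows)
density = mb_per_cell(3.0, len(columns), len(rows))
```

One way to turn the per-column numbers into a single searchable value might be to index a summary such as the emptiest column's percentage (`max(per_column.values())`), but that is only one option among several.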
I'd probably also like to search datasets by type of column ("I want to run some tests on date fields", "I want to test our new graph on data with both time and integers", and so on), which would suit both benchmarking and internal data searching. We can already do name searches, but it could be nice to support searching for a name within a column type, though this one probably isn't as relevant to benchmarking.
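A minimal sketch of what such a search could look like, combining the row and column ranges with the column-type ideas. Everything here is hypothetical: `DatasetEntry`, `search`, the parameter names, and the toy catalog are invented for illustration and are not an existing Slight API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class DatasetEntry:
    """Hypothetical catalog record; the fields are assumptions."""
    name: str
    n_rows: int
    column_types: Dict[str, str]  # column name -> type label


def search(
    catalog: Iterable[DatasetEntry],
    *,
    min_rows: int = 0,
    max_rows: Optional[int] = None,
    min_cols: int = 0,
    max_cols: Optional[int] = None,
    required_types: Iterable[str] = (),
    name_in_type: Optional[Tuple[str, str]] = None,  # (name substring, column type)
) -> List[DatasetEntry]:
    """Return the entries that satisfy every given criterion."""
    wanted = set(required_types)
    results = []
    for entry in catalog:
        n_cols = len(entry.column_types)
        if entry.n_rows < min_rows:
            continue
        if max_rows is not None and entry.n_rows > max_rows:
            continue
        if n_cols < min_cols:
            continue
        if max_cols is not None and n_cols > max_cols:
            continue
        # Every requested type must appear in at least one column.
        if not wanted <= set(entry.column_types.values()):
            continue
        # Name search restricted to columns of one type.
        if name_in_type is not None:
            needle, col_type = name_in_type
            if not any(
                t == col_type and needle.lower() in col.lower()
                for col, t in entry.column_types.items()
            ):
                continue
        results.append(entry)
    return results


# Toy catalog, invented for the example.
catalog = [
    DatasetEntry("orders", 1_000_000,
                 {"order_id": "integer", "created_at": "date", "amount": "float"}),
    DatasetEntry("users", 5_000,
                 {"user_id": "integer", "signup_date": "date", "email": "string"}),
    DatasetEntry("events", 20_000_000,
                 {"event_id": "integer", "ts": "time", "payload": "string", "count": "integer"}),
]

# "Data with both time and integers"
time_and_int = [e.name for e in search(catalog, required_types={"time", "integer"})]
# Large but narrow datasets, e.g. for a benchmark
big_narrow = [e.name for e in search(catalog, min_rows=100_000, max_cols=3)]
# Columns of type "date" whose name contains "date"
date_named = [e.name for e in search(catalog, name_in_type=("date", "date"))]
```

In a real catalog these filters would more likely be pushed down into the database query than applied in a Python loop. The loop is only meant to show how the criteria combine.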
Even with Slight, I sometimes try to find certain numbers or types by scrolling down the catalog, opening up the expanded view, and scrolling through the columns there. So it could be something to do sooner rather than later.