Now that I am looking into this problem, I am curious what other startups' solutions to this are.
- Is the CTO or one of the engineers taking requests and running queries and sending csv/excel files back?
- Do business people at your company know SQL so they can run their own?
- Do you run your own BI tool? If that's the case, which one? I guess most startups at this level don't need a BI tool, but we do, as we have a strong business side and no one outside the engineering team knows SQL.
The platform provides real-time collaborative notebooks so people can train, track, package, deploy, and monitor machine learning models on Kubernetes.
Depending on the client and their data sources, our people can get data using Airbyte, then build dashboards using Superset. They can also deploy a Streamlit application right from the notebook without worrying about spinning up a VM on GCP, setting up the environment, deploying, adding authentication, etc., as the platform does that. They can invoke the deployed models from there.
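To give a flavor of that last step, invoking a deployed model from an app is just an HTTP call. A stdlib-only sketch, where the endpoint URL and the JSON request shape are assumptions about a hypothetical model server, not the platform's actual API:

```python
import json
from urllib import request

# Hypothetical model endpoint -- replace with whatever your platform exposes.
MODEL_URL = "https://models.example.internal/churn/predict"

def build_request(features: dict) -> request.Request:
    """Build a JSON POST request for the (hypothetical) model endpoint."""
    body = json.dumps({"instances": [features]}).encode("utf-8")
    return request.Request(
        MODEL_URL, data=body, headers={"Content-Type": "application/json"}
    )

def predict(features: dict) -> dict:
    """Send the request and parse the JSON response."""
    with request.urlopen(build_request(features)) as resp:
        return json.loads(resp.read())
```

A Streamlit page would then just collect the feature values from input widgets and display `predict(...)`'s result.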
Especially with projects that move fast, the cycle needs to keep up so we can show results to our clients, and we shave off inefficiencies as we go.
I honestly would suggest using a BI tool rather than going with Excel, but you might be forced in one direction or the other based on what the people in your company are more comfortable with and how good or bad your data is. The quality of our data was bad enough that a manual step was required, and more people were comfortable with Excel.
I personally would go with a BI tool. Like I said, I liked Tableau, but from what I've seen MS Power BI is also nice, even though you will bind yourself to SQL Server. I would recommend Apache Superset as a no-cost/low-cost alternative. Tableau was expensive, and because of that the people who were most interested in using it didn't get the licenses, while the people that got the licenses weren't as interested. Also, the licensing structure for Tableau is more suited towards larger companies, and I personally think BI tools are way too useful to limit them to management because of licensing cost.
At my previous company, we self-hosted Metabase using Terraform on AWS and connected it to a read-only replica of our production db. We also connected all the 3rd-party sources (Intercom, Zendesk, etc.) using Segment by moving the data to Redshift, which was connected to Metabase as well.
We created a Cloudflare tunnel (legacy) so that only people from the org could have access to Metabase via their Google profile.
It was quite a simple setup, but it worked very well for us.
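The tunnel side of a setup like that is mostly configuration. A minimal `cloudflared` ingress sketch, where the tunnel ID, hostname, and port are placeholders (Metabase defaults to port 3000; the "only people from the org via Google" restriction lives in a separate Cloudflare Access policy, not in this file):

```yaml
# /etc/cloudflared/config.yml -- tunnel ID and hostname are placeholders
tunnel: 00000000-0000-0000-0000-000000000000
credentials-file: /etc/cloudflared/creds.json

ingress:
  # Route the public hostname to the local Metabase instance
  - hostname: metabase.example.com
    service: http://localhost:3000
  # Catch-all rule required by cloudflared
  - service: http_status:404
```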
- Licensed products such as Tableau, QlikView or Power BI are too big for us at the moment in size, pricing, complexity, etc.
- Scheduled exports to Excel/csv don't solve our use case, as they need our interaction every time there is a new export or a change in the queries.
- Accepting requests directly doesn't scale well either.
- Currently we are considering Metabase, Superset and Redash. Leaning towards Metabase: better UI, runs well on AWS, and connects to SQL databases, Mongo, Redshift and ClickHouse (not that we use all of them, but we may in the future).
Backend is BigQuery + dbt. I highly recommend BigQuery if you are on GSuite and use Sheets. You can create views or tables in BQ and have them included in Sheets, always up to date. With dbt you can have them tested and be sure the data is always correct.
A true single source of truth, even for coworkers who work with Excel only.
We then have Mixpanel for simple stuff, and a big SQL database that we use for research with SQL, Power BI and Grafana for more complex things.
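The BigQuery + dbt flow above can be sketched roughly like this; the model, source, and column names are made up for illustration. dbt builds the view in BigQuery, the generic tests fail the build if the data is wrong, and the view can then be pulled into Sheets through the BigQuery data connector (Connected Sheets):

```sql
-- models/orders_daily.sql: a dbt model materialized as a view in BigQuery
{{ config(materialized='view') }}

select
    order_date,
    count(*)        as orders,
    sum(amount_usd) as revenue_usd
from {{ source('shop', 'orders') }}
group by order_date
```

```yaml
# models/schema.yml: dbt tests that run on every build
version: 2
models:
  - name: orders_daily
    columns:
      - name: order_date
        tests: [not_null, unique]
      - name: revenue_usd
        tests: [not_null]
```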