We are planning to propose a department Databricks instance to live alongside Snowflake. Snowflake would stay the data warehouse of record, and Databricks would handle ML processing (R jobs, Python jobs, MLflow, AutoML, built-in notebooks and git integration), pushing results back into Snowflake when appropriate (e.g. when they're needed in a BI tool).
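For what it's worth, the write-back step is fairly small on the Databricks side. Here's a rough sketch using the spark-snowflake connector that ships with the Databricks runtime; the option names are the connector's, but the table, credential, and warehouse values are placeholders (and in practice you'd pull credentials from a secret scope, not pass them inline):

```python
def snowflake_options(user, password, account_url, database, schema, warehouse):
    """Build the option map the spark-snowflake connector expects.

    All argument values here are placeholders, e.g.
    account_url = "myorg.snowflakecomputing.com".
    """
    return {
        "sfURL": account_url,
        "sfUser": user,
        "sfPassword": password,   # prefer dbutils.secrets in real jobs
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def write_results(df, opts, table):
    """Overwrite a Snowflake table with a Spark DataFrame.

    Only runs on a live Databricks cluster with the connector available;
    shown here just to illustrate the shape of the write-back.
    """
    (df.write
        .format("snowflake")
        .options(**opts)
        .option("dbtable", table)
        .mode("overwrite")
        .save())
```

So the "push results back" part of the pipeline is a handful of lines per job; the interesting questions are more about cost and governance than plumbing.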
I'm expecting pushback on this, and I'm wondering if people could share the questions and problems I might run into going this route, so I can think them through and be prepared with answers.
The main one I know is coming is "why can't you use Snowpark?" My answer: we are heavy R users, Snowpark and UDFs are clunky for our workflows, we have no desire to convert everything to Python, and Snowflake has no built-in notebook interface.
On cost, there will be some overhead from paying for both platforms and from moving data back and forth, but I suspect compute for processing-intensive ML jobs may actually be cheaper on Databricks (I plan to benchmark this). Storage cost is not a factor for us.
There was a thread on reddit about this:
https://www.reddit.com/r/dataengineering/comments/121mm5c/ma...