I am a junior Python dev. I have been looking for a tool or framework that can support a large-scale image processing pipeline. Currently I have a pipeline built in pure Python that uses Pillow, NumPy, and other packages to download, manipulate, and upload images.
I have been looking into Apache Airflow, Spark, Kubeflow, and some other tools. None seem to really fit the bill for large-scale image processing, but I would love any suggestions for tools or frameworks I should consider or reconsider.
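For context, the pipeline described above (download, manipulate with Pillow/NumPy, upload) can be sketched roughly like this. This is a minimal illustration, not the poster's actual code: the URL handling uses the stdlib, the resize dimensions and brightness adjustment are arbitrary placeholder transforms, and the upload step is left as a re-encode to bytes since the storage backend isn't specified.

```python
import io
import urllib.request

import numpy as np
from PIL import Image


def download(url: str) -> Image.Image:
    # Fetch the source image over HTTP into memory (stdlib only).
    with urllib.request.urlopen(url, timeout=30) as resp:
        return Image.open(io.BytesIO(resp.read())).convert("RGB")


def transform(img: Image.Image, size=(512, 512)) -> Image.Image:
    # Example manipulation: resize, then a +10% brightness pass via NumPy.
    arr = np.asarray(img.resize(size), dtype=np.float32)
    arr = np.clip(arr * 1.1, 0, 255).astype(np.uint8)
    return Image.fromarray(arr)


def encode_jpeg(img: Image.Image) -> bytes:
    # Re-encode to bytes, ready to upload to whatever storage is in use.
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=90)
    return buf.getvalue()
```

Scaling this up usually means parallelizing `download`/`transform`/`encode_jpeg` across many images, which is exactly the scheduling/retry/monitoring problem the tools mentioned above try to solve.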
Shipyard isn't a workflow-as-code framework like some of those you mentioned. We let you automate your existing code (upload, copy/paste, or sync with GitHub) in the cloud, with built-in monitoring and alerts. You can install any packages at runtime, run each script in its own container, and connect them all together with complex pathing. No code changes or infrastructure management required.
If you're interested, feel free to reach me at blake[at]shipyardapp.com. Would love to help you tackle your use case!