So here's a tentative proposal: add an HTTP header like "X-Wants-Progress: 1" to the request indicating that the client wants progress information.
A compliant server then includes an "X-Has-Progress: 1" header in its response and streams newline-delimited numbers-as-strings in the [0.0, 100.0] range as progress is made.
Floats are allowed; values should be non-decreasing (the same number can be repeated), and clients should protect themselves by updating with max(old_value, new_value).
Once the progress part has completed, an empty line is inserted and the real HTTP result is appended: status code, headers, body and all.
This way clients and servers can opt in to being "progress capable", and if you ensure you have that capability all the way through the stack, you can switch from the dreaded spinner-of-death to a smooth progress bar.
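For what it's worth, here is a rough client-side sketch of the wire format described above, in Python. It assumes the response body is available as a binary file-like object; the function name read_progress and the callback on_progress are made up for illustration, and skipping malformed lines and clamping to [0, 100] are my own assumptions, not part of the proposal.

```python
import io


def read_progress(stream, on_progress):
    """Consume progress lines until the blank separator line.

    `stream` is a binary file-like object positioned at the start of the
    progress part. On return it is positioned at the embedded HTTP result.
    Returns the last progress value reported.
    """
    current = 0.0
    while True:
        raw = stream.readline()
        line = raw.decode("ascii", "replace").strip()
        if not line:
            # A blank line (or end of stream) ends the progress part.
            break
        try:
            value = float(line)
        except ValueError:
            # Assumption: silently skip lines that are not numbers.
            continue
        # Assumption: clamp to the advertised range.
        value = min(max(value, 0.0), 100.0)
        # Defensive update from the proposal: never move backwards.
        current = max(current, value)
        on_progress(current)
    return current


if __name__ == "__main__":
    body = (
        b"0\n12.5\n12.5\n10\nabc\n100.0\n"
        b"\n"
        b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
    )
    stream = io.BytesIO(body)
    read_progress(stream, lambda p: print(f"progress: {p:.1f}%"))
    # Whatever remains is the real HTTP result, to be parsed separately.
    print(stream.read().split(b"\r\n", 1)[0].decode())
```

The out-of-order "10" is reported as 12.5 because of the max() guard, and the "abc" line is dropped. Parsing the embedded status line and headers is left out here, which is probably where most of the pain with existing HTTP libraries would show up.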
Thoughts?
Other thoughts:
You could probably make this play better with HTTP libraries by returning the header on an existing HTTP message, instead of changing the overall reply semantics. It's a little less efficient, but probably much more broadly compatible.
You could also have min-est-completion-time and max-est-completion-time rather than a percentage (partly to address people's concerns about the uncertainty or unknowability of overall progress as a percentage). Perhaps these could customarily be described as 90% intervals (or they could explicitly include a confidence estimate, like "90% of requests at this approximate degree of completion were completed in between 7 and 32 seconds").
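To make the interval idea concrete, here is a small Python sketch of how a server might derive such a pair from past requests. It assumes the server keeps a list of remaining-time samples (in seconds) observed for requests at a similar stage; the function name completion_interval and the choice of a central percentile interval are illustrative assumptions, not something specified above.

```python
import statistics


def completion_interval(remaining_seconds, confidence=0.90):
    """Return (low, high) bounding the central `confidence` share of samples.

    `remaining_seconds` is a hypothetical history of how long similar
    requests still needed at this stage.
    """
    # n=100 yields the 99 cut points for the 1st through 99th percentiles.
    cuts = statistics.quantiles(remaining_seconds, n=100, method="inclusive")
    tail = (1.0 - confidence) / 2.0
    low_pct = round(tail * 100)            # 5 for a 90% interval
    high_pct = round((1.0 - tail) * 100)   # 95 for a 90% interval
    return cuts[low_pct - 1], cuts[high_pct - 1]


if __name__ == "__main__":
    history = list(range(0, 101))  # toy data: 0, 1, ..., 100 seconds
    low, high = completion_interval(history)
    print(f"min-est-completion-time: {low}")
    print(f"max-est-completion-time: {high}")
```

With real traffic the samples would need to be bucketed by "approximate degree of completion", and the interval is only as good as the history behind it, which is rather the point of reporting it as a range.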
> at some point in the stack it is / should be fairly well known how long something is going to take
You should cite sources for a statement like this, because it doesn't seem like it's always true for interacting with a third-party API