What do you think, folks?
However, the expensive public-key handshake should only happen at the beginning of the connection. After that, the client and server share a symmetric session key, which is much faster to use. Their load balancer should be caching these sessions so that subsequent connections don't need to renegotiate a key.
If this 110ms shows up only on the first request, i.e. on a session-cache miss, then it's probably something you should expect. If it shows up after the TLS session has been established, or on a cache hit, that sounds bad. It could also be that their session cache is too small and is evicting sessions too soon, causing more full TLS negotiations than necessary.
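A quick way to check whether the load balancer actually honours resumption is openssl's reconnect test (a sketch; YOUR_REMOTE_HOST_GOES_HERE is a placeholder, as in the curl example below):

# Does one full handshake, then reconnects five times trying to reuse
# the cached session; count the "Reused," vs "New," lines it prints.
openssl s_client -connect YOUR_REMOTE_HOST_GOES_HERE:443 -reconnect < /dev/null 2>/dev/null | grep -E '^(New|Reused),'

If every line says "New", resumption isn't working and each connection pays for a full handshake.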
Can you provide the timings this command produces for you?
curl -s -w 'TCP=%{time_connect} TLS=%{time_appconnect} ALL=%{time_total}\n' https://YOUR_REMOTE_HOST_GOES_HERE -o /dev/null
That should provide an easy-to-use ruler for comparing measurements.
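Since each curl invocation starts with a cold session cache, repeating it gives a feel for full-handshake variance (a sketch):

# Ten independent runs; each is a fresh TCP+TLS handshake, since curl's
# session cache doesn't persist across processes by default.
for i in $(seq 1 10); do
  curl -s -w 'TCP=%{time_connect} TLS=%{time_appconnect} ALL=%{time_total}\n' \
    https://YOUR_REMOTE_HOST_GOES_HERE -o /dev/null
done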
I'm not saying this to minimise it, as in "oh, it's only 100ms". I mean there are providers out there that will work with you on this. They're not cheap.
If you pay enough you can reshape the internet - I once had an issue with a backup process and ended up having two ISPs start peering with each other to fix it.
If the response to that is "there isn't one", well, there's your real problem.
A quick Google search indicates that with TLS the whole thing takes at least 4 x RTT_1, plus one RTT_2 because the LB still has to fetch your data from the backend, so your hard lower limit is 24ms, plus however long it takes to actually transfer the additional data (especially certificates).
That leaves about 86ms unaccounted for.
Now, does your client perform any checks for a revoked certificate? Assuming these checks are also done via TLS, against a "far away" server with an RTT of 15ms, that's a whopping 60ms your client spends just ensuring that the certificates are valid. In that case your LB would merely be 26ms slower than the theoretical optimum.
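To spell the arithmetic out (the 24ms floor and the 15ms revocation RTT are from above; the rest follows):

4 x RTT_1 + 1 x RTT_2            = 24ms  hard floor for handshake + backend fetch
110ms measured - 24ms            = 86ms  left to explain
revocation check: 4 x 15ms RTTs  = 60ms  if done over TLS to the "far away" server
86ms - 60ms                      = 26ms  remaining overhead vs. the theoretical optimum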
Give httping [0] a try for benchmarks.
To compare network latency in terms everyday techies can follow, use video gaming: 15ms of latency is a frame at roughly 60 fps, and 100ms is a frame at 10 fps.
Which game would they want to play?
That said, it's not clear if you mean "time to first byte" or "time to transfer 1000 bytes" or both combined. The first is less of an issue; figure out which part of the request and data transfer timing is changing.
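curl can separate those, if it helps (a sketch; same placeholder host as above):

# time_starttransfer is time-to-first-byte; the gap between it and
# time_total is the pure payload-transfer portion.
curl -s -o /dev/null -w 'TTFB=%{time_starttransfer} TOTAL=%{time_total}\n' https://YOUR_REMOTE_HOST_GOES_HERE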
(Disclosure: In a past life I built a white label global content delivery network.)
TLS has a non-trivial bootstrap cost, due to:

- the extra network round-trips during connection establishment (mitigated with 0-RTT, if available),
- the additional byte overhead of the certificate exchange (which can be substantial relative to small payloads if you have a long chain or are sending unnecessary certs, and more so if you're using RSA keys), and
- the cost of performing the public-key crypto operations for authentication and key exchange.
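The certificate-exchange overhead is easy to eyeball (a sketch; YOUR_REMOTE_HOST_GOES_HERE is a placeholder again):

# Counts the certificates the server sends in its chain; each one adds
# bytes to every full handshake, and long chains can spill into extra
# round-trips.
openssl s_client -connect YOUR_REMOTE_HOST_GOES_HERE:443 -showcerts < /dev/null 2>/dev/null | grep -c 'BEGIN CERTIFICATE'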
So if your use-case is "client arrives from the blue with a one-off API request and latency is important" then you are going to suffer with TLS.
If, on the other hand, this is a web application, then what's actually going to happen is that the browser will establish HTTP/1.1 connections over TLS to the load balancer, keep them alive, and reuse them. Assuming a well-configured load balancer, it will also do the same from its back-end to your actual service implementation.
However, once that's done, the only necessary overhead is the symmetric key stuff (microscopic) and maybe rekeying in long sessions.
Using curl to do a one-and-done connection to your service (as suggested by another poster) will give you an estimate which is only relevant to the first use case I described.
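To approximate the keep-alive case instead, ask curl for the same URL twice in one invocation; it reuses the connection, so the second transfer should report a TLS time of zero (a sketch):

# -w prints once per transfer; on the reused connection the second line
# should show TLS=0.000000 because no new handshake takes place.
curl -s -w 'TLS=%{time_appconnect} TOTAL=%{time_total}\n' \
  -o /dev/null -o /dev/null \
  https://YOUR_REMOTE_HOST_GOES_HERE https://YOUR_REMOTE_HOST_GOES_HERE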
However, it's possible Citrix is just not as optimized as other providers/services, or doesn't support newer protocols like TLS 1.3.
That said, 100ms doesn't seem critical.
Some of the settings there link to tips for more investigation. Look at the session resumption settings in particular.
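A quick external check on resumption support (a sketch; this forces TLS 1.2, where the ticket is visible in the handshake output):

# If the server issues session tickets, s_client prints a "TLS session
# ticket:" block along with a lifetime hint.
openssl s_client -tls1_2 -connect YOUR_REMOTE_HOST_GOES_HERE:443 < /dev/null 2>/dev/null | grep -i 'session ticket'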
If possible, use Elliptic Curve (ECDSA) certificates and ECDHE; they can be a lot faster than RSA certificates and RSA-based DHE, especially if the load balancer is CPU-constrained. Make sure TLS sessions or tickets are honored (tickets preferred over sessions, so servers don't need to maintain a session database).
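You can benchmark the raw signing cost on the LB's own hardware to see the difference (a sketch; absolute numbers vary wildly by CPU):

# openssl's built-in benchmark: RSA-2048 vs P-256 ECDSA signatures per
# second. The server performs one signature per full handshake.
openssl speed rsa2048 ecdsap256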
In my personal opinion, I'd rather not have the load balancer terminate TLS, but you lose a lot of features that way, and that might not be an option.
Tested with this: https://ybin.me/p/53f0aa4e1204b470#c561Frz6wpKkfvsTklDdZFOer...
and got this: https://ybin.me/p/1e68f80f6910ba0d#USqFo3Loksx5rHQIe16NHp299...
I can tell you from my personal experience that +85ms overhead is definitely not what I'd expect for TLS termination.
https://developers.google.com/speed/libraries
d3.js on my network is ~200kB and downloads in 20ms over SSL.