
For a long time, I couldn't figure out why I sometimes couldn't access my self-hosted Vaultwarden instance in a VPN via the browser (Brave), even though I could always access it fine via cURL or the Bitwarden CLI. When it didn't work, the site would just load until the connection timed out.
But today, I figured it out a little bit more: Post-Quantum Cryptography in the TLS v1.3 handshake made the packets so big that my network interface must have choked—or something like that.
I arrived at this intermediate conclusion by comparing the browser's TLS v1.3 handshake with the one from cURL. I noticed that the browser's Client Hello is a lot bigger (1866 vs 517 bytes) and comes with a lot of TCP retransmissions:
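If you want to reproduce the comparison, a sketch using Wireshark's CLI, plus the rough segment math showing why a 1866-byte Client Hello can't fit in one packet on a 1380-byte-MTU link (the capture interface is an assumption; the sizes are the ones above):

```shell
# Print the size of each Client Hello (TLS handshake type 1) seen on the
# wire; run this while loading the site:
#   tshark -i any -Y 'tls.handshake.type == 1' -T fields -e frame.len
#
# Rough segment math: the TCP payload per packet (MSS) is the MTU minus
# 20 bytes of IPv4 header and 20 bytes of TCP header (no options).
mtu=1380
mss=$((mtu - 40))
hello=1866                               # browser Client Hello size from above
segments=$(( (hello + mss - 1) / mss ))  # ceiling division
echo "mss=$mss segments=$segments"       # → mss=1340 segments=2
```

So on this link the browser's Client Hello needs two TCP segments, while cURL's 517-byte one fits in a single packet.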
I also noticed that if I forced Firefox to use TLS v1.2 (I couldn't figure out how to do this in Brave) by setting security.tls.version.max to 3 in the advanced config (which you can reach by typing about:config into the address bar), the site loaded immediately. So it was definitely related to TLS v1.3, but specifically to the implementation in Brave and Firefox, since cURL could use TLS v1.3 just fine.
I then looked further into why it was so big and noticed the unknown key share 4588. Thanks to this blog post, I learned that this is a post-quantum cryptography thing.
Fortunately, Firefox also had a setting to disable this via security.tls.enable_kyber.
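For reference, the two about:config prefs involved (pref names as mentioned above; the Kyber pref has been renamed across Firefox versions, so treat this as a sketch, not gospel):

```
security.tls.version.max = 3      // 3 caps at TLS v1.2; 4 allows TLS v1.3
security.tls.enable_kyber = false // disable the post-quantum key share
```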
When I did this, boom: multiple hours of debugging came to a conclusion I was happy with, at least for now. It was when I searched for "PQC vs MTU" that I found the blog post I linked to.
Apparently, PQC has the issue that it's a lot bigger over the wire than what we're used to:
In more concrete terms, for the server-sent messages, Cloudflare found that every 1K of additional data added to the server response caused median HTTPS handshake latency to increase by around 1.5%. For the ClientHello, Chrome saw a 4% increase in TLS handshake latency when they deployed ML-KEM, which takes up approximately 1K of additional space in the ClientHello. This pushed the size of the ClientHello greater than the standard maximum transmission unit (MTU) of packets on the Internet, ~1400 bytes, causing the ClientHello to be fragmented over two underlying transport layer (TCP or UDP) packets.
In some ways, debugging continues though: I don't understand why the Client Hello wasn't simply fragmented. Afaict, it wasn't, which would explain the TCP retransmissions? The MTU of my physical network interface is set to 1500, and that of my virtual network interface to 1380.
However, after I restarted my virtual network interface, it now works in Brave and in Firefox with Kyber enabled, and I think the Client Hello is still not fragmented. Or maybe Wireshark just doesn't show me that, or I don't know what to look for ¯\_(ツ)_/¯
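If anyone wants to check the same capture, these are the standard Wireshark display filters for IP fragments and TCP retransmissions, respectively:

```
ip.flags.mf == 1 || ip.frag_offset > 0
tcp.analysis.retransmission
```

If the first filter matches nothing while the Client Hello is larger than the MTU, the packet was likely dropped (or blocked) rather than fragmented, which would fit the retransmission pattern above.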
Anyway, at least I now know how to reliably access my password manager via the browser, lol.
800 sats \ 1 reply \ @waltmunny 16h
Have you tried to debug with 'ping'? If the packet size is really the culprit, ping will tell you.
reply
33 sats \ 0 replies \ @ek OP 16h
Ohhh, I did not! I didn’t know about -s:
-s packetsize
Specifies the number of data bytes to be sent. The default is 56, which translates into 64 ICMP data bytes when combined with the 8 bytes of ICMP header data. The maximum allowed value is 65507 for IPv4 (65467 when -R or -T or Intermediate hops) or 65527 for IPv6, but most systems limit this to a smaller, system-dependent number.
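A concrete way to use this (the hostname is a placeholder, and -M do is the Linux iputils flag that sets Don't-Fragment, so oversized pings fail instead of being silently fragmented):

```shell
# Largest ICMP payload that fits a given MTU without fragmentation:
# MTU - 20 (IPv4 header) - 8 (ICMP header).
mtu=1380
payload=$((mtu - 28))
echo "$payload"   # → 1352
# Probe the path with Don't-Fragment set (Linux iputils ping):
#   ping -M do -s 1352 vault.example.internal   # should get replies
#   ping -M do -s 1353 vault.example.internal   # should fail if path MTU is 1380
```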
Thank you!!
reply
Even though I understood very little of what was going on, I can still sense your satisfaction at having somewhat solved a fairly perplexing problem. Nice work!
reply
45 sats \ 0 replies \ @ek OP 7h
Thank you! Also my first encounter with PQC
reply
0 sats \ 0 replies \ @Macoy31 1h
Wow, this is an incredible deep-dive into the real-world side effects of post-quantum cryptography (PQC)—something most devs or users won’t even realize is happening under the hood. It’s wild how a slightly bigger TLS handshake due to Kyber can cause such elusive network bugs, especially over VPNs or constrained MTU environments. The fact that you tracked this down by comparing cURL vs browser-level handshakes is top-tier troubleshooting.
Also, the MTU fragmentation issue really highlights how fragile things can get when cryptography moves ahead of network standards. PQC is crucial for the future, but this kind of debugging story is a clear reminder that implementation matters just as much as theory.
Thanks for sharing this — I bet a lot of people are struggling with weird loading issues and have no idea it's due to oversized TLS packets. You just saved folks hours (maybe days!) of head-scratching.
reply
Firefox doesn't work well on Windows; maybe Brave Browser is the same, since they have the same company owner.
reply