vlang

/

v Public
0 commits 39 issues 0 pull requests 0 contributors Discussions Projects CI

net.http: TLS server shutdown fix (#27429) follow-ups — handshake timeout tied to accept_timeout, read_timeout no-op on HTTPS #13

Describe the bug

Follow-up review of the TLS server shutdown/leak fix (#27429, merged as 5a1b7b030c; dev branch fix-http-tls-lsan-leak). The leak fix itself looks sound — SSLConn.shutdown() frees ssl/conf/rng/certs/socket, accept error paths route through it, workers are drained (ch.close()idle_conns.close_idle()ws.wait()), and the timeout sentinels are correct. But the review surfaced several remaining issues in the same TLS-server path. Two are user-visible.

All line refs are against current master.


1. accept_timeout <= 0 silently disables the TLS handshake timeout (can hang the server)

tls_accept_timeouts derives the handshake bound directly from the accept timeout:

// vlib/net/http/server_tls_notd_use_openssl.v:14
handshake_timeout := accept_timeout

and ssl_timeout_deadline treats any value <= 0 as infinite:

// vlib/net/mbedtls/ssl_connection.c.v:468
fn ssl_timeout_deadline(timeout time.Duration) time.Time {
    if timeout <= 0 || timeout == net.infinite_timeout {
        return time.unix(0) // -> ssl_remaining_timeout returns infinite
    }
    ...
}

So a Server configured with accept_timeout: 0 (a legitimate "wait indefinitely to accept") unintentionally gets an infinite handshake timeout. Because the handshake runs on the accept thread (accept_with_timeoutsserver_handshake(handshake_timeout), ssl_connection.c.v:430), a single client that completes the TCP connect and then stalls mid-TLS-handshake wedges the accept loop forever: no new HTTPS connections are accepted, and stop() is never observed. This is the same hang class #27429 fixed, reintroduced for that configuration.

Suggested fix: give the handshake its own timeout field (or clamp handshake_timeout to a finite default when accept_timeout <= 0) so "block forever waiting to accept" does not also mean "never time out a handshake".


2. Server.read_timeout is a silent no-op on HTTPS

listen_and_serve_tls applies the user's read timeout per accepted connection:

// vlib/net/http/server_tls_notd_use_openssl.v:97
conn.set_read_timeout(s.read_timeout)

but set_read_timeout writes to the per-connection config:

// vlib/net/mbedtls/ssl_connection.c.v:495
C.mbedtls_ssl_conf_read_timeout(&s.conf, ssl_read_timeout_ms(timeout))

while every accepted ssl context is bound to the listener's config, not the per-conn one:

// vlib/net/mbedtls/ssl_connection.c.v:422
ret := C.mbedtls_ssl_setup(&conn.ssl, &l.conf)

l.conf's read timeout is set once at listener init to a fixed constant:

// vlib/net/mbedtls/ssl_connection.c.v:232
C.mbedtls_ssl_conf_read_timeout(&l.conf, mbedtls_server_read_timeout_ms)

So the per-conn conf that set_read_timeout mutates is never the one in effect. Result: Server.read_timeout is honored on plain HTTP but silently ignored on HTTPS, where all connections instead use the hardcoded mbedtls_server_read_timeout_ms (default 41s). Idle keep-alives then run on a timeout the user cannot change.

Corollary: the per-conn C.mbedtls_ssl_config_init(&conn.conf) / mbedtls_ssl_config_free(&conn.conf) on the accept paths (ssl_connection.c.v:338, :417) is dead — that config is initialized and freed but never bound to anything.

Suggested fix: apply the read timeout to the config actually bound to the ssl context (or stop advertising per-conn read_timeout for the TLS server and document the listener-wide constant).


3. Shutdown latency up to accept_timeout; handshakes serialized on the accept thread

Handshakes run inline on the accept thread inside accept_with_timeouts, parking it in a single wait_for_read(remaining) for up to the full handshake budget. The 100ms poll only bounds the accept-wait, not the handshake, so a handshake in flight delays stop() by up to accept_timeout (~30s default) and serializes all other clients behind it.

Suggested fix: accept raw TCP on the accept thread, hand the un-handshaken conn to a worker, and perform the (idle-tracked, interruptible) handshake there — restoring prompt shutdown and parallel handshakes.


4. (minor) fd-reuse TOCTOU in close_idle

TlsIdleConnTracker.handles stores bare int fds, and close_idle calls net.shutdown(fd) outside the lock (vlib/net/http/server_tls_idle.v). Concurrently a worker's defer conn.shutdown() may already have net.close()d that fd, and the OS may have reused it for a freshly accepted socket — so close_idle could shut down an unrelated connection. Shutdown-only and a small window, but a genuine raw-fd race; tracking a conn/generation rather than a bare fd would remove it.

5. (minor, perf) per-frame lock churn on the HTTP/2 path

read_idle_frame (vlib/net/http/h2_server.v) takes the shared idle_conns mutex and does handles << + index()/delete() (O(n)) on every HTTP/2 frame, serializing frame reads across all workers under load. Marking idle once around the blocking-read boundary (rather than per frame) would avoid it.


Expected Behavior

  • accept_timeout: 0 should not disable handshake timeouts (no stalled-client hang).
  • Server.read_timeout should take effect on HTTPS, consistent with HTTP.
  • stop() should return promptly even with a handshake in flight.

Current Behavior

As described above (items 1–3 are behavioral; 4–5 are latent/perf).

Reproduction Steps

For #1: start an HTTPS Server with accept_timeout: 0, open a TCP connection to the port, complete the TCP handshake but send no TLS ClientHello, then call stop() from another thread — the server does not shut down.

For #2: start an HTTPS Server with a small read_timeout (e.g. 2 * time.second), open a keep-alive TLS connection, send a request, then trickle/withhold the next request — the connection is held until the hardcoded mbedtls_server_read_timeout_ms (~41s), not read_timeout.

Possible Solution

See the per-item suggested fixes above. Happy to send a PR for #1 and #2 if the direction is agreed.

Additional Information/Context

Found while reviewing #27429 for the HTTP connection-pooling client work; #1/#3/#4 are the same blocking-read / cross-thread-fd traps the pooled client must also avoid.

V version

master @ 5a1b7b030c

Environment details (OS name and version, etc.)

N/A (code review against master)

[!NOTE] You can use the 👍 reaction to increase the issue's priority for developers.

Please note that only the 👍 reaction to the issue itself counts as a vote. Other reactions and those to comments will not be taken into account.