Routing

Reverse Proxy

Our infrastructure involves a multitude of servers running our customers’ applications. These servers are not directly reachable from the outside world. Instead, they sit behind frontend servers called reverse proxies.

A reverse proxy is a server that sits in front of web servers and forwards client (e.g. web browser) requests to those servers.

At Scalingo, our frontend reverse proxies have multiple responsibilities:

Filter out undesirable traffic.
Apply some constraints on ingress traffic to protect the platform and ensure good performance.
Terminate HTTPS and HTTP/2 traffic.
Forward HTTP requests to applications’ web containers, which act, as the name implies, as web servers.

Key Benefits

Running these frontend reverse proxies provides a lot of key benefits:

They add a layer of security by hiding the internal structure of the platform and reducing the attack surface available to malicious users. From the outside world, our frontend reverse proxies are the only visible component of our infrastructure.
They allow us to distribute ingress traffic across available servers hosting the applications’ containers. When a server becomes unavailable, the reverse proxies’ configuration is updated so they can re-route the traffic to the remaining available servers.
When an application is running multiple web containers, this also allows us to forward the requests to one of these containers.
Since our frontend reverse proxies also act as TLS termination points, they allow us to easily provide TLS encryption and simplify the certificate management and renewals via Let’s Encrypt automation.
They allow us to provide router logs and request metrics.

Routing Specifics

Requests Distribution

When an HTTP request is sent to an app hosted on Scalingo, it’s first received and analyzed by our frontend proxies to retrieve the targeted domain name, using either the Host HTTP header for simple HTTP requests, or the Server Name Indication (SNI) of the TLS handshake for encrypted HTTPS requests.

Once the domain name is known, our routers use it to determine the location of the application’s web container(s) and forward the request to one of them. The container is selected by strict round-robin scheduling across the application’s available web containers, which ensures even load balancing.
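As an illustration of strict round-robin selection, here is a minimal sketch. It is not Scalingo’s actual router code: the RoundRobinRouter class, the domain name and the container names are all hypothetical and only serve to show how requests rotate evenly across the available web containers.

```python
class RoundRobinRouter:
    """Hypothetical router that hands out web containers in strict rotation."""

    def __init__(self, routes):
        # routes: domain name -> list of available web containers (illustrative data).
        self._routes = {domain: list(containers) for domain, containers in routes.items()}
        self._cursor = {domain: 0 for domain in routes}

    def pick(self, domain):
        """Return the next container for this domain, cycling through all of them."""
        containers = self._routes.get(domain)
        if not containers:
            raise LookupError(f"no web container available for {domain}")
        index = self._cursor[domain] % len(containers)
        self._cursor[domain] += 1
        return containers[index]


router = RoundRobinRouter({"my-app.example.com": ["web-1", "web-2", "web-3"]})
picks = [router.pick("my-app.example.com") for _ in range(6)]
print(picks)  # each container appears twice, always in the same order
```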

The request is processed by the application, and the answer is sent back to the client through the reverse proxy.

Quarantine

Our reverse proxies also include a quarantine mechanism to prevent requests from being sent to misbehaving containers. If one of your application’s web containers unexpectedly closes the connection, it is automatically placed in quarantine. From this point on, our frontends stop routing requests to it and instead periodically attempt to contact the container using exponential backoff. Once the container responds successfully, it is removed from quarantine and begins receiving requests again.
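The quarantine logic can be pictured with the following sketch. It is only an approximation under stated assumptions: the Quarantine class is hypothetical, and the base delay (1 second), the doubling factor and the 60-second cap are illustrative values, since the actual backoff parameters are not documented here.

```python
class Quarantine:
    """Hypothetical quarantine tracker using exponential backoff between probes."""

    def __init__(self, base_delay=1.0, max_delay=60.0):
        # base_delay and max_delay are assumed values, not documented ones.
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._state = {}  # container -> (failure count, time of next probe)

    def report_failure(self, container, now):
        """Quarantine the container, doubling the probe delay after each failure."""
        failures, _ = self._state.get(container, (0, 0.0))
        delay = min(self.base_delay * 2 ** failures, self.max_delay)
        self._state[container] = (failures + 1, now + delay)

    def report_success(self, container):
        """A successful probe releases the container from quarantine."""
        self._state.pop(container, None)

    def is_quarantined(self, container):
        return container in self._state

    def probe_due(self, container, now):
        """True when the backoff delay has elapsed and the container should be probed."""
        entry = self._state.get(container)
        return entry is not None and now >= entry[1]


quarantine = Quarantine()
quarantine.report_failure("web-1", now=0.0)   # next probe scheduled 1 s later
quarantine.report_failure("web-1", now=1.0)   # probe failed: next one 2 s later
```

While a container is quarantined, a router built this way would skip it during selection and only call probe_due to decide when to test it again.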

Sticky Session

Our Sticky Sessions feature lets you slightly modify the default routing behavior by associating all incoming HTTP requests from a given client with a single, specific web container.

For more details about this feature, please read the dedicated documentation page.

Requests Queue

Each router of the infrastructure keeps a local request queue for each application running on the platform. For each application, this queue is limited to 50 requests per web container. If the queue is full and a new request is received, the router returns a 503 Service Unavailable HTTP error.

For instance, if your application is using 2 web containers, our reverse proxies queue up to 100 requests, and then reject any additional requests.

When requests are being queued, it means the application is not able to cope with the amount of received requests. It often means the application is not well sized compared to the traffic the instances need to handle. The application should either have more web containers, or it should be optimized to respond to requests faster.
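The queue limit described above boils down to simple arithmetic. The sketch below encodes it; the function names are hypothetical, and only the limit of 50 requests per web container and the 503 status come from this page.

```python
from typing import Optional

QUEUE_SLOTS_PER_CONTAINER = 50  # limit stated in this section


def queue_capacity(web_containers: int) -> int:
    """Maximum number of requests a router queues for an application."""
    return QUEUE_SLOTS_PER_CONTAINER * web_containers


def admission_status(queued_requests: int, web_containers: int) -> Optional[int]:
    """Return 503 when the queue is already full, or None when the request can be queued."""
    if queued_requests >= queue_capacity(web_containers):
        return 503
    return None


print(queue_capacity(2))          # 100
print(admission_status(100, 2))   # 503
```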

Timeouts

When a new connection has been established to a web container, the app is given an initial 30-second window to return a response. This response can be incomplete: the idea is to have some data transiting through the TCP socket so we know the web process is active.

After this first exchange, the window grows to 60 seconds. To keep the connection active (for instance, while your app is processing a response), it has to send data back to the client at least once every 59 seconds.

When these timeouts are reached, our frontend reverse proxies return a 504 Gateway Timeout HTTP error.
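To make the two windows concrete, here is a small sketch that checks whether a given sequence of writes would trigger a timeout. It is a simplified model, not the proxies’ implementation: the would_time_out function is hypothetical, times are expressed in seconds since the connection was established, and treating a gap of exactly 60 seconds as a timeout is an assumption.

```python
INITIAL_WINDOW = 30.0  # seconds allowed before the first data is sent
IDLE_WINDOW = 60.0     # seconds allowed between two subsequent writes


def would_time_out(send_times):
    """Return True if the writes at `send_times` (ascending, in seconds) miss a window.

    An empty list means the app never sent anything, which always times out.
    """
    if not send_times:
        return True
    deadline = INITIAL_WINDOW
    for sent_at in send_times:
        if sent_at >= deadline:
            return True
        # Each write resets the clock and opens a new 60-second window.
        deadline = sent_at + IDLE_WINDOW
    return False


print(would_time_out([5, 64, 100]))  # False: every write lands inside its window
print(would_time_out([31]))          # True: first data arrives after 30 seconds
print(would_time_out([10, 75]))      # True: 65 seconds of silence after the first write
```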

WebSockets, Server-Sent Events and Long-Running Connections

WebSockets, Server-Sent Events (SSE) and long-running connections are fully supported and available by default.

Timeouts still apply, so please make sure your application keeps the connection alive when necessary.

Headers

Our frontend reverse proxies add or update a few headers on each request:

X-Real-IP: The originating IP address of the client connecting to Scalingo.
X-Forwarded-For: The originating IP address of the client connecting to Scalingo.
X-Forwarded-Proto: The originating protocol of the HTTP request. Either http or https.
X-Forwarded-Port: The originating port of the HTTP request. Either 80 or 443.
X-Request-ID: UUID identifying the request, set only if not already existing.
X-Request-Start: Unix timestamp (in seconds, with millisecond precision) when the request was received by the front server.
Example: t=1693406590.527.
X-Scalingo-Error: Detailed error message when an error occurred on the router.
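On the application side, these headers can be read as in the sketch below. The helper names are hypothetical, the headers are modelled as a plain case-sensitive dictionary (real frameworks usually normalize header names), and the handling of a comma-separated X-Forwarded-For value is a defensive assumption rather than documented behavior.

```python
def client_ip(headers):
    """Return the originating client IP, preferring X-Forwarded-For over X-Real-IP."""
    forwarded = headers.get("X-Forwarded-For", "")
    # Assumption: if several addresses are listed, the first one is the client.
    first = forwarded.split(",")[0].strip()
    return first or headers.get("X-Real-IP")


def request_start_seconds(headers):
    """Parse X-Request-Start (for example 't=1693406590.527') into float seconds."""
    raw = headers["X-Request-Start"]
    if raw.startswith("t="):
        raw = raw[2:]
    return float(raw)


def queue_time_ms(headers, now):
    """Milliseconds elapsed between reception by the front server and `now`."""
    return (now - request_start_seconds(headers)) * 1000.0


headers = {
    "X-Forwarded-For": "203.0.113.7",
    "X-Forwarded-Proto": "https",
    "X-Request-Start": "t=1693406590.527",
}
print(client_ip(headers))
print(request_start_seconds(headers))
```

Passing time.time() as `now` when the application starts handling the request gives an estimate of the time spent between the front server and the application.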

Requests Compression

Our routers apply gzip compression to responses when the incoming request includes the Accept-Encoding: gzip HTTP header, but only for a limited set of resource types identified by the following MIME types:

application/atom+xml
application/javascript
application/json
application/rss+xml
application/x-javascript
application/xml
text/css
text/html
text/javascript
text/mathml
text/plain
text/xml

If you need a different compression algorithm, or support for additional file types, please let your application code handle it.
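The eligibility rule can be summarized by the following sketch. The should_gzip function is hypothetical and simplified: it ignores quality values (such as gzip;q=0) in Accept-Encoding, and it assumes that parameters such as charset in the Content-Type are ignored when matching the MIME type.

```python
# MIME types listed in this section.
COMPRESSIBLE_TYPES = frozenset({
    "application/atom+xml",
    "application/javascript",
    "application/json",
    "application/rss+xml",
    "application/x-javascript",
    "application/xml",
    "text/css",
    "text/html",
    "text/javascript",
    "text/mathml",
    "text/plain",
    "text/xml",
})


def should_gzip(accept_encoding, content_type):
    """Return True when the client accepts gzip and the MIME type is in the list."""
    encodings = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    mime_type = content_type.split(";")[0].strip().lower()
    return "gzip" in encodings and mime_type in COMPRESSIBLE_TYPES


print(should_gzip("gzip, deflate, br", "text/html; charset=utf-8"))  # True
print(should_gzip("gzip", "image/png"))                              # False
```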

HTTP/2

HTTP/2 is enabled by default and always available.

If the client supports HTTP/2, and if it is reaching the application using HTTPS, the connection is automatically upgraded to HTTP/2 by our frontend reverse proxies.

The HTTP/2 connection terminates at the Scalingo frontend reverse proxies. We then forward the requests from the routers to the application’s containers through HTTP/1.1. Consequently, your application should not expect HTTP/2 traffic.

The performance benefits of HTTP/2 still apply, especially for slow clients (e.g. mobile, xDSL) as the protocol is designed to improve speed over slow connections. Behind the scenes, multiplexed requests are handled in parallel using HTTP/1.1 when communicating with the containers. Since these containers operate within the same internal network, adopting HTTP/2 there would not provide any significant additional performance gains.

HTTP Errors

413 Payload Too Large

The request size exceeds the max allowed body size.
Was previously called Request Entity Too Large.

499 Client Closed Request

The client closed the connection before the server answered the request. It is usually caused by a client-side timeout.

502 Bad Gateway

The application sent an invalid response to our reverse proxy. This error is often sent when your application is abruptly cutting connections.
All containers running the app are in quarantine.

You can distinguish between these two cases based on the context: if you notice multiple consecutive requests returning a 502 error, it probably means one of your application containers is in quarantine.

503 Service Unavailable

The application’s request queue is full, so the incoming request could not be queued.
The application has been stopped by its owner.
The application has no web container responding to HTTP requests.

504 Gateway Timeout

The application timed out. Your application must accept connections and send the first byte in a limited amount of time.

Last update: 08 Jan 2026