Routing
Reverse Proxy
Our infrastructure involves a multitude of servers running our customers’ applications. These servers are not directly reachable from the outside world. Instead, they are hidden behind frontend servers called reverse proxies.
A reverse proxy is a server that sits in front of web servers and forwards client (e.g. web browser) requests to those servers.
At Scalingo, our frontend reverse proxies have multiple responsibilities:
- Filter out undesirable traffic.
- Apply some constraints on ingress traffic to protect the platform and ensure good performance.
- Terminate HTTPS and HTTP/2 traffic.
- Forward HTTP requests to applications’ web containers, which act, as the name implies, as web servers.
Key Benefits
Running these frontend reverse proxies provides several key benefits:
- They add a layer of security by hiding the internal structure of the platform and reducing the attack surface available to malicious users. From the outside world, our frontend reverse proxies are the only visible component of our infrastructure.
- They allow us to distribute ingress traffic across the available servers hosting the applications’ containers. When a server becomes unavailable, the reverse proxies’ configuration is updated so they can re-route the traffic to the remaining servers. When an application is running multiple web containers, this also allows us to forward each request to one of these containers.
- Since our frontend reverse proxies also act as TLS termination points, they allow us to easily provide TLS encryption and simplify certificate management and renewals via Let’s Encrypt automation.
- They allow us to provide router logs and request metrics.
Routing Specifics
Requests Distribution
When an HTTP request is sent to an app hosted on Scalingo, it’s first received
and analyzed by our frontend proxies to retrieve the
targeted domain name, using either the Host HTTP header for simple HTTP
requests, or the Server Name Indication (SNI) of the TLS handshake for
encrypted HTTPS requests.
Once the domain name is known, our routers use it to determine the location of
the application’s web container(s) and forward the request to one of them.
Selection is made by applying a strict round-robin scheduling across the
available web containers of the application, ensuring even load balancing.
The request is processed by the application, and the answer is sent back to the client through the reverse proxy.
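The selection step described above can be illustrated with a minimal sketch. It assumes the domain name has already been extracted from the Host header or the SNI; the class name, domain, and container names are hypothetical and do not reflect Scalingo's actual router implementation.

```python
from itertools import cycle


class RoundRobinRouter:
    """Toy router: maps a domain name to its web containers and rotates through them."""

    def __init__(self, routes):
        # routes: {"my-app.example.com": ["web-1", "web-2"]} (hypothetical names)
        self._cycles = {
            domain: cycle(containers) for domain, containers in routes.items()
        }

    def pick(self, domain):
        # Return the next container for this domain, in strict rotation.
        return next(self._cycles[domain])


router = RoundRobinRouter({"my-app.example.com": ["web-1", "web-2", "web-3"]})
picks = [router.pick("my-app.example.com") for _ in range(6)]
print(picks)  # ['web-1', 'web-2', 'web-3', 'web-1', 'web-2', 'web-3']
```

Each container receives every third request, which is what makes the load distribution even when all requests have a similar cost.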
Quarantine
Our reverse proxies also include a quarantine mechanism to prevent requests
from being sent to misbehaving containers. If one of your application’s web
containers unexpectedly closes the connection, it is automatically placed in
quarantine. From this point on, our frontends will stop routing requests to it
and will instead periodically attempt to contact the container using
exponential backoff. Once the container responds successfully, it is removed
from quarantine and begins receiving requests again.
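The quarantine logic can be sketched as a small state tracker. This is only an illustration: the class, its method names, and the base and maximum delays are assumptions, since the actual probe intervals are not documented here.

```python
class Quarantine:
    """Tracks quarantined containers and the delay before the next probe."""

    def __init__(self, base_delay=1.0, max_delay=60.0):
        # base_delay and max_delay are illustrative values, not real settings.
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._failed_probes = {}  # container name -> number of failed probes

    def on_connection_closed(self, container):
        # Unexpected close: place the container in quarantine (if not already).
        self._failed_probes.setdefault(container, 0)

    def is_quarantined(self, container):
        return container in self._failed_probes

    def next_probe_delay(self, container):
        # The delay doubles after each failed probe, capped at max_delay.
        delay = self.base_delay * 2 ** self._failed_probes[container]
        return min(delay, self.max_delay)

    def on_probe_result(self, container, ok):
        if ok:
            del self._failed_probes[container]  # healthy again: resume routing
        else:
            self._failed_probes[container] += 1


q = Quarantine()
q.on_connection_closed("web-1")
delays = []
for _ in range(4):
    delays.append(q.next_probe_delay("web-1"))
    q.on_probe_result("web-1", ok=False)
print(delays)  # [1.0, 2.0, 4.0, 8.0]
q.on_probe_result("web-1", ok=True)
print(q.is_quarantined("web-1"))  # False
```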
Sticky Session
Our Sticky Sessions feature lets you slightly modify the default routing
configuration by sending all incoming HTTP requests from a given client to a
specific, single web container.
For more details about this feature, please read the dedicated documentation page.
Requests Queue
Each router of the infrastructure keeps a local request queue for each
application running on the platform. For each application, this queue is
limited to 50 requests per web container. If the queue is full and a new
request is received, the router returns a 503 Service Unavailable HTTP
error.
For instance, if your application is using 2 web containers, our reverse
proxies queue up to 100 requests, and then reject any additional
requests.
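The arithmetic behind this limit is simple enough to sketch. The function names below are hypothetical; only the figure of 50 requests per web container and the 503 status come from this section.

```python
QUEUE_SLOTS_PER_CONTAINER = 50  # limit stated in this section


def queue_capacity(web_containers):
    """Maximum number of requests a router queues for one application."""
    return QUEUE_SLOTS_PER_CONTAINER * web_containers


def enqueue_or_reject(queued, web_containers):
    """Return (accepted, status); status is 503 when the queue is already full."""
    if queued >= queue_capacity(web_containers):
        return False, 503
    return True, None


print(queue_capacity(2))          # 100
print(enqueue_or_reject(99, 2))   # (True, None)
print(enqueue_or_reject(100, 2))  # (False, 503)
```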
When requests are being queued, it means the application is not able to cope
with the number of requests it receives. It often means the application is not
sized appropriately for the traffic its instances need to handle. The
application should either have more web containers, or it should be optimized
to respond to requests faster.
Timeouts
When a new connection has been established to a web container, the app is
given an initial 30-second window to return a response. This response can be
incomplete: the idea is to have some data transiting through the TCP
socket so we know the web process is active.
After this first exchange, the window grows to 60 seconds. To keep the connection active (for instance, while your app is still producing its response), it has to send data back to the client at least once every 59 seconds.
When one of these timeouts is reached, our frontend reverse proxies return a
504 Gateway Timeout HTTP error.
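To make the two windows concrete, here is a minimal sketch that checks a sequence of write times against them. The function name is hypothetical, and treating a write that lands exactly on the limit as a timeout is an assumption.

```python
FIRST_RESPONSE_WINDOW = 30.0  # seconds allowed before the first data is sent
IDLE_WINDOW = 60.0            # seconds allowed between two writes afterwards


def times_out(write_times):
    """Return True if one of the windows is exceeded.

    write_times: sorted offsets (seconds since the connection was established)
    at which the app sent data; the last entry is the end of the response.
    """
    if not write_times:
        return True  # the app never sent anything
    deadline = FIRST_RESPONSE_WINDOW
    for t in write_times:
        if t >= deadline:  # assumption: reaching the limit counts as a timeout
            return True
        deadline = t + IDLE_WINDOW
    return False


print(times_out([5, 64, 120]))  # False: no gap reaches its window
print(times_out([5, 70]))       # True: 65 seconds of silence after the first write
print(times_out([31]))          # True: first data arrives after 30 seconds
```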
WebSockets, Server-Sent Events and Long-Running Connections
WebSockets, Server-Sent Events (SSE) and long-running connections are fully supported and available by default.
Timeouts still apply, so please make sure your application keeps the connection alive when necessary.
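One common way to keep a Server-Sent Events connection alive is to emit an SSE comment line whenever no event has been sent for a while. The sketch below assumes events arrive through a `queue.Queue`, that each event is a single line of text, and that `None` ends the stream; the generator name and the 30-second interval are illustrative choices, not requirements of the platform.

```python
import queue

HEARTBEAT_INTERVAL = 30.0  # seconds; an assumption, chosen to stay well under 60


def sse_stream(events, heartbeat_interval=HEARTBEAT_INTERVAL):
    """Yield SSE frames from a queue.Queue, sending a comment line when idle."""
    while True:
        try:
            item = events.get(timeout=heartbeat_interval)
        except queue.Empty:
            # Lines starting with ":" are SSE comments, ignored by clients,
            # but they still put bytes on the wire and reset the idle window.
            yield ": keep-alive\n\n"
            continue
        if item is None:
            return  # sentinel: end of stream
        yield f"data: {item}\n\n"


events = queue.Queue()
events.put("hello")
events.put(None)
frames = list(sse_stream(events, heartbeat_interval=0.01))
print(frames)  # ['data: hello\n\n']
```

The same idea applies to WebSockets, where periodic ping frames play the role of the comment line.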
Headers
Our frontend reverse proxies add or update a few headers on each request:
- X-Real-IP: The originating IP address of the client connecting to Scalingo.
- X-Forwarded-For: The originating IP address of the client connecting to Scalingo.
- X-Forwarded-Proto: The originating protocol of the HTTP request. Either http or https.
- X-Forwarded-Port: The originating port of the HTTP request. Either 80 or 443.
- X-Request-ID: UUID identifying the request, set only if not already present.
- X-Request-Start: Unix timestamp (in seconds, with millisecond precision) of when the request was received by the front server. Example: t=1693406590.527.
- X-Scalingo-Error: Detailed error message when an error occurred on the router.
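As an illustration of how an application might use these headers, the sketch below estimates how long a request waited between the front server and the application, and checks the original protocol. It assumes the headers are available as a plain dictionary with the exact names listed above and that X-Request-Start follows the `t=<seconds>` format of the example; the function names are hypothetical.

```python
def router_delay_ms(headers, now):
    """Milliseconds elapsed between the front server and `now` (Unix seconds)."""
    # "t=1693406590.527" -> 1693406590.527
    start = float(headers["X-Request-Start"].split("=", 1)[1])
    return round((now - start) * 1000)


def was_https(headers):
    """True if the client reached the platform over HTTPS."""
    return headers.get("X-Forwarded-Proto") == "https"


headers = {
    "X-Request-Start": "t=1693406590.527",
    "X-Forwarded-Proto": "https",
    "X-Forwarded-Port": "443",
}
print(router_delay_ms(headers, now=1693406590.777))  # 250
print(was_https(headers))                            # True
```

In a real application, `now` would typically come from `time.time()` at the moment the request handler starts.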
Requests Compression
Our routers apply gzip compression to the responses of requests that include the
Accept-Encoding: gzip HTTP header, but only for a limited set of resource
types identified by the following MIME types:
- application/atom+xml
- application/javascript
- application/json
- application/rss+xml
- application/x-javascript
- application/xml
- text/css
- text/html
- text/javascript
- text/mathml
- text/plain
- text/xml
If you need a different compression algorithm, or support for additional file types, please let your application code handle it.
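The rule above can be sketched as a small predicate, which may help when deciding whether your own code needs to compress a given response. This is a simplified illustration: the function name is hypothetical, and quality values such as `gzip;q=0` are ignored.

```python
COMPRESSIBLE_TYPES = {
    "application/atom+xml", "application/javascript", "application/json",
    "application/rss+xml", "application/x-javascript", "application/xml",
    "text/css", "text/html", "text/javascript", "text/mathml",
    "text/plain", "text/xml",
}


def router_would_gzip(accept_encoding, content_type):
    """True if gzip is accepted and the response type is in the list above."""
    # "gzip, deflate, br" -> {"gzip", "deflate", "br"}; parameters are dropped.
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    # "text/html; charset=utf-8" -> "text/html"
    mime = content_type.split(";")[0].strip().lower()
    return "gzip" in accepted and mime in COMPRESSIBLE_TYPES


print(router_would_gzip("gzip, deflate, br", "text/html; charset=utf-8"))  # True
print(router_would_gzip("br", "text/html"))                                # False
print(router_would_gzip("gzip", "image/png"))                              # False
```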
HTTP/2
HTTP/2 is enabled by default and always available.
If the client supports HTTP/2, and if it is reaching the application using HTTPS, the connection is automatically upgraded to HTTP/2 by our frontend reverse proxies.
The HTTP/2 connection terminates at the Scalingo frontend reverse proxies. We then forward the requests from the routers to the application’s containers through HTTP/1.1. Consequently, your application should not expect HTTP/2 traffic.
The performance benefits of HTTP/2 still apply, especially for slow clients (e.g. mobile, xDSL), as the protocol is designed to improve speed over slow connections. Behind the scenes, multiplexed requests are handled in parallel using HTTP/1.1 when communicating with the containers. Since these containers operate within the same internal network, adopting HTTP/2 there would not provide any significant additional performance gains.
HTTP Errors
- 413 Payload Too Large: The request size exceeds the maximum allowed body size. This error was previously called Request Entity Too Large.
- 499 Client Closed Request: The client closed the connection before the server answered the request. It is usually caused by a client-side timeout.
- 502 Bad Gateway:
  - The application sent an invalid response to our reverse proxy. This error is often sent when your application is abruptly cutting connections.
  - All containers running the app are in quarantine.
  You can distinguish between these two cases based on the context: if you notice multiple consecutive requests returning a 502 error, it probably means one of your application containers is in quarantine.
- 503 Service Unavailable:
  - The application's request queue is full, so the incoming request could not be queued.
  - The application has been stopped by its owner.
  - The application has no web container responding to HTTP requests.
- 504 Gateway Timeout: The application timed out. Your application must accept connections and send the first byte within a limited amount of time.