How to Scale Your Application

Scalingo lets you scale your app in two different ways:

  • Horizontal Scaling (recommended): Use multiple containers for your app. Incoming requests are dispatched across all the available containers, which lets your app handle many more requests in parallel.
  • Vertical Scaling: Get bigger containers for your app, with more RAM and more CPU. Bigger applications require more resources, and this scaling operation provides them.

Both forms of scaling are complementary. You will often need to scale your application out (horizontally) and up (vertically) as its traffic evolves.

When and why should you use multiple containers?

Redundancy

With at least two containers, a problem with one of them is completely transparent for your users, as we redirect requests to the others. We recommend that any production application run at least 2 containers. The quality of service we can provide also depends on the number of containers you are using; see our Terms of Service for more information.

Load Balancing

When a lot of people are reaching your app, one container may not be enough to support the load. Here, the problem is not memory but parallelism. Most technologies maintain a queue of incoming requests. If this queue is not consumed fast enough, users may wait a long time for a response, and requests can even time out. A single instance of your application may therefore be unable to handle all the incoming requests. In this case, adding new containers is more important than getting more memory.
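
As a rough way to reason about how many containers keep the queue draining, the sketch below applies Little's law (requests in flight = arrival rate × average response time). The function name `containers_needed` and all input figures are hypothetical; you would replace them with numbers measured on your own app, and treat the result as a starting point rather than a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: estimate how many containers are needed so that
# incoming requests are processed as fast as they arrive.
#   $1: peak incoming requests per second (measured by you)
#   $2: average response time in milliseconds (measured by you)
#   $3: requests a single container can process concurrently
containers_needed() {
  rps=$1
  avg_ms=$2
  per_container=$3

  # Little's law, scaled by 1000 to stay in integer arithmetic:
  # (requests in flight) * 1000 = requests/second * milliseconds
  in_flight_x1000=$((rps * avg_ms))
  capacity_x1000=$((per_container * 1000))

  # Ceiling division: a partially used container still has to exist.
  echo $(( (in_flight_x1000 + capacity_x1000 - 1) / capacity_x1000 ))
}

# Example: 120 requests/s, 250 ms per request, 8 concurrent requests per container.
containers_needed 120 250 8
```

With these example figures about 30 requests are in flight at any moment, so four containers handling 8 requests each are needed. For a production app, the redundancy recommendation above still suggests running at least 2 containers even when the estimate is 1.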

When and why should you get bigger containers?

Getting bigger containers means scaling your application vertically. Whether you need to mostly depends on the size of your app and the memory it consumes. You can get this data from our CLI with the following command:

$ scalingo --app my-app stats
+----------+-----+------------------+-----------------+
|   NAME   | CPU |      MEMORY      |      SWAP       |
+----------+-----+------------------+-----------------+
| clock-1  | 0%  | 29% 151MB/512MB  | 0%   0B/1.5GB   |
|          |     | Highest: 167MB   | Highest:   0B   |
|          |     |                  |                 |
| web-1    | 1%  | 49% 254MB/512MB  | 0%   4KB/1.5GB  |
|          |     | Highest: 264MB   | Highest:   0B   |
|          |     |                  |                 |
| web-2    | 0%  | 52% 266MB/512MB  | 0%  20KB/1.5GB  |
|          |     | Highest: 280MB   | Highest:   0B   |
|          |     |                  |                 |
| worker-1 | 1%  | 91% 932MB/1024MB | 7% 234MB/3.0GB  |
|          |     | Highest: 1024MB  | Highest: 241MB  |
+----------+-----+------------------+-----------------+

You can see here, for each container, how much your app is consuming in real time. If your app starts swapping, performance will be strongly degraded. Two reasons are possible:

  • The container type is undersized for your application. You should consider using a larger container size, otherwise your end users' experience will suffer.

  • Your application is leaking memory. In this case you’ll see your app containers using more and more memory over time, with no upper limit. Getting a larger container won’t solve the problem, as your app's memory usage will keep growing. You should profile your application, find the source of the leak, fix it, and deploy a new version of your application.
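
To spot containers that are swapping or close to their memory limit without reading the table by eye, a small filter can be piped after the `stats` command. This is a minimal sketch that assumes the table layout shown above stays the same; the function name `flag_containers` and the default 80% threshold are hypothetical choices, not part of the CLI.

```shell
#!/bin/sh
# Hypothetical helper: read the `stats` table on stdin and print the
# containers that use swap or whose memory usage reaches a threshold.
#   $1: memory threshold in percent (defaults to 80)
flag_containers() {
  threshold=${1:-80}
  awk -F'|' -v limit="$threshold" '
    NF < 5 { next }                      # border lines such as +-----+
    {
      name = $2
      gsub(/ /, "", name)
      # Skip the header row and the "Highest:" continuation rows.
      if (name == "" || name == "NAME") next

      # Keep only the leading percentage of the MEMORY and SWAP columns.
      mem = $4;  sub(/^ +/, "", mem);  sub(/%.*/, "", mem)
      swap = $5; sub(/^ +/, "", swap); sub(/%.*/, "", swap)

      # The table rounds percentages, so a few KB of swap still reads as 0%.
      if (swap + 0 > 0)
        print name ": swapping (" swap "% of swap used)"
      else if (mem + 0 >= limit + 0)
        print name ": memory at " mem "%"
    }
  '
}

# Usage, assuming the scalingo CLI is installed and logged in:
#   scalingo --app my-app stats | flag_containers 80
```

Run against the sample output above, only `worker-1` is reported, because it is the one container using a measurable share of its swap. Running the filter periodically and comparing results over time can also help tell an undersized container from a leak, since a leaking container keeps climbing after each restart.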



©2025 Scalingo