Application crash

Your application can terminate abruptly for various reasons. Two kinds of crash can happen: boot errors and runtime crashes.

Boot errors

Boot errors are detected when you deploy your application: if it crashes before binding the allocated network port, the deployment fails.

The previous version of your application keeps running.
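As an illustration of the boot requirement, here is a minimal Python sketch of a web process that binds the port it was given as soon as it starts. It assumes the allocated port is exposed through a `PORT` environment variable; the fallback value 8000 and the names `allocated_port`, `make_server` and `main` are hypothetical.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def allocated_port(env=os.environ, default=8000):
    """Return the port to bind (assumption: exposed as the PORT variable)."""
    return int(env.get("PORT", default))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with a small plain-text body.
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port):
    """Create an HTTP server bound to the given port on all interfaces."""
    return HTTPServer(("0.0.0.0", port), Handler)


def main():
    # Bind the allocated port, then serve until the process is stopped.
    make_server(allocated_port()).serve_forever()
```

Calling `main()` as the entry point of the process starts the server. A process that raises an exception before `make_server` returns never binds the port, which is the boot error case described above.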

Runtime crashes

Two situations are the most common causes of runtime crashes:

  • Runtime error in your application (uncaught exception, segfault in a library or in the runtime)
  • Temporary error of an external resource
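A temporary error of an external resource does not have to bring the whole process down. The following is a minimal Python sketch, assuming the failure surfaces as a `ConnectionError` and that the operation is safe to retry; `call_with_retry` and its parameters are hypothetical names chosen for the example.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: let the error propagate
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

If the resource is still unavailable after the last attempt, the exception propagates, and it is up to the caller to handle it or let the process stop.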

Restart policy

When a runtime crash occurs, we automatically restart your container. If your application stops again within 5 minutes of being restarted, we don’t restart it immediately, but 5 minutes later.

  • 5 additional minutes are added after each crash that occurs in the first 5 minutes of runtime.
  • A crash that occurs after 5 minutes of runtime has no effect on the cool-down.
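The policy can be sketched as a small Python simulation. This is one interpretation of the rules above, not the platform’s actual implementation: it assumes the cool-down starts at zero, that "in the first 5 minutes" means an uptime strictly below 5 minutes, and that a late crash is restarted after the unchanged cool-down. The name `restart_delays` is hypothetical.

```python
STEP_MINUTES = 5  # cool-down increment and "early crash" window, per the text


def restart_delays(uptimes_minutes):
    """Return the delay before each restart, given the uptime before each crash."""
    cooldown = 0
    delays = []
    for uptime in uptimes_minutes:
        if uptime < STEP_MINUTES:
            cooldown += STEP_MINUTES  # early crash: wait 5 more minutes
        # A crash after 5 minutes of runtime leaves the cool-down unchanged.
        delays.append(cooldown)
    return delays


print(restart_delays([120, 1, 1, 30, 2]))  # [0, 5, 10, 10, 15]
```

In this run the first crash happens after two hours of runtime and is restarted at once, the two quick crashes raise the delay to 5 and then 10 minutes, and the crash after 30 minutes of runtime leaves it at 10.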

This limit is in place to avoid the situation where an application tries to boot and crashes immediately: there is no point in letting it crash again and again… and again.

There is a limit of 12 restart operations. This means that after 6 hours and 30 minutes, your application is no longer accessible (the last cool-down lasts 1 hour).
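One reading that reproduces these figures is twelve cool-downs growing by 5 minutes each, from 5 minutes up to 1 hour. The Python sketch below checks the arithmetic under that assumption; it ignores the time the application spends running between crashes, and `total_cooldown` is a hypothetical helper.

```python
def total_cooldown(restarts=12, step_minutes=5):
    """Sum cool-downs of 5, 10, ... minutes and return (hours, minutes, last)."""
    cooldowns = [step_minutes * n for n in range(1, restarts + 1)]
    hours, minutes = divmod(sum(cooldowns), 60)
    return hours, minutes, cooldowns[-1]


print(total_cooldown())  # (6, 30, 60)
```

The sum is 390 minutes, that is 6 hours and 30 minutes, and the last cool-down is 60 minutes.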

How do I know my app has crashed?

When such an event occurs, you and the collaborators of the application concerned receive notifications. A notification is sent after the 2nd, 5th and 12th crashes (following the restart policy described above).

This means that if your app crashes once for a temporary reason, it is restarted automatically, and if it does not crash again, you do not receive an email. If it crashes in a loop, however, you get a notification within a minute.
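The notification rule can be expressed in a few lines of Python. This is only an illustration of the thresholds stated above; `notified_crashes` is a hypothetical name and the crash counter is assumed to start at 1.

```python
NOTIFY_AT = (2, 5, 12)  # crash counts that trigger a notification, per the text


def notified_crashes(total_crashes):
    """Return the crash numbers, out of total_crashes, that send a notification."""
    return [n for n in range(1, total_crashes + 1) if n in NOTIFY_AT]


print(notified_crashes(1))  # []
print(notified_crashes(6))  # [2, 5]
```

A single crash sends nothing, while a crash loop triggers the first notification at the second crash.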

You’ll also find the ‘crash’ events in your app timeline on the dashboard.

What to do?

We strongly advise you to look at the logs of your application, using either the web dashboard or the CLI tool.

Based on the information gathered, you should then modify your application to fix the issues that lead to this instability.

After a successful deployment, a manual restart, or a scaling operation on your application, we cancel any queued restart job, and the cool-down time applied after the next crash is reset.

Need support?

Don’t hesitate to contact us at support@scalingo.com

06 Oct 2016