If your program is in Fail-safe status, it has been suspended because of an issue, typically with its resources or performance. This guide covers the main reasons why programs go into Fail-safe status and how to recover them.
Programs can be put into Fail-safe status for various reasons:
- In Interactions, programs might be put into Fail-safe status because resources in the program are not active, have issues, or no longer exist.
- Automation Center programs might be put into Fail-safe status due to performance-related or resource-related issues (e.g. configuration problems with Relational Data segments).
How to review Fail-safe programs
You can check whether your program has been put into Fail-safe status in the following ways:
- On the Automation Programs page, Fail-safe is displayed in the Status column next to your program’s name.
- The following notification appears at the top of the page: “One or more programs are in Fail-safe status. Please review the affected automation programs to resolve the issues.”
- When you open the program editor, Program is in Fail-safe is displayed in the top right corner and a notification appears in the program’s editor.
- All program actions (such as Test, Launch and Pause) are unavailable. Interactions programs can be reactivated or finished. Automation Center programs can only be resolved by the Emarsys Support team, so no status changes are available.
What happens to contacts and how to recover programs
When your Interactions program is in Fail-safe status, contacts and events are queued at the entry node and within the program.
You will receive a Notification Center message highlighting the root cause of the program going into Fail-safe status. You can reactivate Interactions programs after resolving the issue. If the problem persists, the program will be put into Fail-safe status again. If you need help with investigating the root cause, please contact Emarsys Support.
Important: The program’s status will automatically change to Fail-safe (Frozen) if it is not recovered within 72 hours. When this happens:
- Contacts and events won’t be able to enter the program.
- Contacts and events already in the program will exit immediately.
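To plan a recovery, it can help to work out when the 72-hour window closes. The following is a minimal sketch of that arithmetic only; the function names and the timestamp are hypothetical, it does not call any Emarsys API, and it assumes the program counts as Frozen from the exact moment the 72 hours have elapsed.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window comes from this guide; the helpers below are hypothetical.
RECOVERY_WINDOW = timedelta(hours=72)


def freeze_deadline(entered_fail_safe_at: datetime) -> datetime:
    """Return the moment the recovery window closes."""
    return entered_fail_safe_at + RECOVERY_WINDOW


def status_at(entered_fail_safe_at: datetime, now: datetime) -> str:
    """Return the status expected at `now` if nobody recovers the program.

    Assumption: the status changes exactly when 72 hours have elapsed.
    """
    if now - entered_fail_safe_at < RECOVERY_WINDOW:
        return "Fail-safe"
    return "Fail-safe (Frozen)"


# Example timestamp, taken for illustration from a Notification Center message.
entered = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)

print(freeze_deadline(entered))                            # 2024-03-04 09:00:00+00:00
print(status_at(entered, entered + timedelta(hours=71)))   # Fail-safe
print(status_at(entered, entered + timedelta(hours=73)))   # Fail-safe (Frozen)
```

Using timezone-aware timestamps avoids an off-by-hours error when the notification time and your local clock are in different time zones.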
In the Automation Center
When an Automation Center program is in Fail-safe status:
- In the case of transactional and recurring programs, contacts are queued at the entry node and within the program.
- In the case of Active recurring programs, every future recurrence will be canceled while the program is in Fail-safe status. When the Emarsys Support team reactivates the program, it will start running again according to the defined schedule.
- In the case of ad hoc batch programs, if the program was launched before it went into Fail-safe status, contacts already in the program are queued.
- A message will appear in the Notification Center, describing why the program has been put into Fail-safe status.
- The Emarsys Support team will be automatically notified and a support ticket will be opened, so that they can assist you with recovering the program. If enabled on your account, your Account Owner will also receive an automatically generated notification.
Queues can build quickly at the entry node and within the program. Programs that are in Fail-safe status for more than 30 days will be Aborted automatically.
Programs in Fail-safe status cannot be reactivated in the Automation Center. There are two ways to recover them:
- The Emarsys Support team can resolve the issues and reactivate the original program.
- You can copy the program that is in Fail-safe status and, after eliminating the root cause, activate the copied program. For more information, see Copying a program that is in Fail-safe status.
Copying a program that is in Fail-safe status
When you copy a program that is in Fail-safe status, you will lose the contacts who have entered the original program along with its contact participation settings. For example, if you used the Contacts can enter this program once ever participation setting in the original program, those contacts who have already entered it can enter the copied program again. You can prevent this by using a shared Participation check. For more information, see Using the same Participation check node settings in multiple programs.
The copied program will not inherit the reporting related to the original program.
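The participation behavior described above can be pictured with a small model. This is only an illustrative sketch: the `ParticipationCheck` class and the contact ID are hypothetical, and the code does not reflect how Emarsys stores participation data. It shows why a copy with its own participation history lets a contact in again, while a shared check does not.

```python
class ParticipationCheck:
    """Hypothetical stand-in for a "once ever" participation record."""

    def __init__(self) -> None:
        self._entered: set[str] = set()

    def try_enter(self, contact_id: str) -> bool:
        """Admit the contact only if this check has not seen it before."""
        if contact_id in self._entered:
            return False
        self._entered.add(contact_id)
        return True


# The original program keeps its own history; contact-1 has already entered.
original_check = ParticipationCheck()
original_check.try_enter("contact-1")

# A copied program starts with an empty history, so the same contact gets in again.
copy_check = ParticipationCheck()
print(copy_check.try_enter("contact-1"))    # True

# With a shared check, the copy sees the entry recorded by the original program.
shared_check = ParticipationCheck()
shared_check.try_enter("contact-1")         # entry via the original program
print(shared_check.try_enter("contact-1"))  # False
```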
Launching a copied program
If you have not eliminated the root cause that put the program in Fail-safe status, the copied program will be put into Fail-safe status again when you activate it.
After the corrected program is activated, you can abort and then delete the original program that was put into Fail-safe status, so that you do not end up with duplicate programs.