FailSafe - A safe way to monitor IBM i processes
FailSafe is a monitoring and control tool for the IBM i.
With FailSafe you can monitor and control programs and processes submitted on the IBM i system.
FailSafe frees the system operator from the exhausting task of constantly monitoring jobs that run on the IBM i.
FailSafe monitors your jobs even while you are out of the office.
FailSafe monitors both continuous (24/7) processes and one-time programs that run on the system.
Whenever a job encounters an error or enters a message-waiting (MSGW) state, FailSafe attempts to recover from the problem by stopping the job (by responding to the pending message) and then restarting it.
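The recovery sequence described above can be sketched roughly as follows. This is only an illustrative model in Python, not FailSafe's actual implementation or interface: the `Job` class, the `recover` function, and the status strings are hypothetical names introduced for this example, and the steps are recorded as text rather than run against a real system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical stand-in for a monitored batch job."""
    name: str
    status: str                      # assumed values: "ACTIVE", "MSGW", "ENDED"
    restart_requested: bool = True   # whether the job should be restarted


def recover(job: Job) -> List[str]:
    """Return the recovery steps taken for a job that is waiting on a message."""
    steps: List[str] = []
    if job.status != "MSGW":
        return steps                 # nothing to recover
    # Responding to the pending message is what stops the job.
    steps.append(f"reply:{job.name}")
    job.status = "ENDED"
    # Restart only when a restart was requested for this job.
    if job.restart_requested:
        steps.append(f"restart:{job.name}")
        job.status = "ACTIVE"
    return steps


job = Job(name="NIGHTLY", status="MSGW")
print(recover(job))
```

In this sketch a job that is not in MSGW status is left untouched, and a job without a restart request stays ended after the message is answered.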
FailSafe reports errors by sending messages through the following channels:
- PC break message to the PC operator
- E-mail message with the job log attached
- XML message to the SMS server of the organization
Important batch jobs such as backups or daily batches are logged automatically into the FailSafe archive for future inquiry or regulatory inspections.
FailSafe can keep track of the number of objects in a library or output queue, and send a warning message when the number exceeds a specified limit.
- Monitors 24/7 batch jobs (ensures at all times that the job is running)
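As a rough illustration of the limit check, the sketch below compares object counts against configured limits and produces one warning per library or output queue that exceeds its limit. The function name `over_limit`, the dictionary layout, and the sample names are assumptions made for this example; they do not reflect how FailSafe stores its configuration or collects counts.

```python
from typing import Dict, List


def over_limit(counts: Dict[str, int], limits: Dict[str, int]) -> List[str]:
    """Return a warning for each monitored name whose count exceeds its limit."""
    warnings: List[str] = []
    for name, limit in sorted(limits.items()):
        count = counts.get(name, 0)   # a name with no recorded count is treated as empty
        if count > limit:             # a count equal to the limit is not reported
            warnings.append(f"{name}: {count} objects exceeds limit {limit}")
    return warnings


# Hypothetical sample data: one library and one output queue.
counts = {"PRODLIB": 1200, "QPRINT": 40}
limits = {"PRODLIB": 1000, "QPRINT": 50}
for line in over_limit(counts, limits):
    print(line)
```

Only names that appear in the limits are checked, so libraries or queues without a configured limit never produce a warning in this sketch.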
- Monitors one-time batch jobs submitted through the FailSafe command interface
- Keeps track of the number of files in an IBM i library and reports when it exceeds the specified limit
- Keeps track of the number of spooled files in an output queue and reports when it exceeds the specified limit
- Automatically responds to job message waiting (MSGW) errors
- Restarts the batch job if requested
- Archives the logs of jobs that run on the system to a PC server for future inquiry
- Sends an error message to a PC screen, e-mail, and SMS in case of job failure
- Supports the LPAR environment on the IBM i (multi-system)