Hi everybody,
I have a server that periodically reaches 100% RAM usage for a few minutes (this is normal).
Its state changes from OK to Critical, and I would like to receive a notification only if the Critical state lasts for a certain amount of time.
I have looked at the documentation but haven't found an answer yet (max_check_attempts? notification period or options?).
Any help, hints or advice would be appreciated :-)
Thank you in advance,
Best,
Hervé
You need to play with the configuration options max_check_attempts and retry_interval:
max_check_attempts: This directive is used to define the number of times that Nagios will retry the service check command if it returns any state other than an OK state. Setting this value to 1 will cause Nagios to generate an alert without retrying the service check again.
retry_interval: This directive is used to define the number of "time units" to wait before scheduling a re-check of the service. Services are rescheduled at the retry interval when they have changed to a non-OK state. Once the service has been retried max_check_attempts times without a change in its status, it will revert to being scheduled at its "normal" rate as defined by the check_interval value. Unless you've changed the interval_length directive from the default value of 60, this number will mean minutes. More information on this value can be found in the check scheduling documentation.
Example:
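max_check_attempts=4, retry_interval=2 instructs Nagios to check the service up to four times in total (the first failed check plus three retries, two minutes apart) BEFORE sending a notification when a check changes to a non-OK state. The notification is only sent if the service is still non-OK on the fourth check, roughly six minutes after the first failure.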
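For illustration, here is a rough sketch of how these two directives could look in a service definition. The host name, service description, check command and the generic-service template are placeholders I made up for the example, so replace them with whatever your setup uses. As far as I understand it, the first failing check counts as attempt 1, so with these values the service has to stay non-OK through three retries before the state becomes "hard" and a notification goes out.

```
define service {
    use                     generic-service   ; assumed template, adjust to your setup
    host_name               myserver          ; hypothetical host name
    service_description     RAM usage         ; hypothetical description
    check_command           check_ram         ; hypothetical check command
    check_interval          5                 ; normal schedule: every 5 time units
    retry_interval          2                 ; after a non-OK result: re-check every 2 time units
    max_check_attempts      4                 ; notify only if still non-OK on the 4th check
}
```

With the default interval_length of 60, a RAM peak that clears within about six minutes of the first Critical result should not trigger a notification. If your peaks last longer, raising max_check_attempts or retry_interval widens that window.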
Tk,
Thank you for your answer. I will check asap and keep you informed.
Cheers,
Hervé