Tuesday, April 12, 2016

Check logfiles for relevant new messages

Supported Agents: Linux, Windows, AIX, Solaris
Distribution: official part of Check_MK
License: GPL
This check processes the output of agents with the logwatch plugin. The Windows agent has this extension built in. The logwatch extension of the Linux/UNIX agents needs a configuration file that lists all relevant logfiles and the patterns of log lines that should result in a warning or critical state. The Windows agent does not need any configuration; it sends all logs found in the Windows event log and uses the warning/error classification of Windows. 
Relevant log messages found by the agent are stored locally in a text file. The check is critical if at least one new or old log message exists that is classified as critical. If at least one warning message exists but no critical one, the check results in a warning state. 
The only way to bring the state back to OK is to delete the text file with the stored log messages. This file is stored below ~/var/check_mk/logwatch. Usually the logwatch webpage is used to browse and delete the messages. Please refer to the online documentation of Check_MK for more details about logwatch. 
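
The state rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the check's actual source; the function name logwatch_state and the numeric states 0/1/2 (OK/WARN/CRIT) are assumptions made for this example.

```python
def logwatch_state(levels):
    """Return 0 (OK), 1 (WARN) or 2 (CRIT) for the levels of all stored messages.

    Hypothetical helper: `levels` is a list such as ["I", "W", "C"],
    one entry per stored (new or old) message.
    """
    if "C" in levels:
        return 2  # at least one critical message
    if "W" in levels:
        return 1  # warnings, but no critical message
    return 0      # nothing stored, or only ignored messages


print(logwatch_state(["I", "W", "W"]))
```

Once the stored text file is deleted there are no messages left, which corresponds to calling the helper with an empty list.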

Item

The name of the logfile. For Linux/UNIX this is the complete absolute path name of the logfile. For Windows this is the name as shown in the Windows event log, for example Application (case sensitive!). 

Check parameters

Parameter   Type   Description
patterns    list   This parameter can be used to reclassify log messages sent by the agent. You can assign one of the three levels 'I' (ignore), 'W' (warning) or 'C' (critical) to each message. It is only possible to reclassify messages that have already been classified by the agent with an original level of I, W or C. As of version 1.1.13i1 the Windows agent supports always sending all messages and classifying non-problems as I. This is configured in check_mk.ini by setting the level for a logfile to all. Those messages can later be reclassified.
Each entry of the list is a pair of the new level and an extended regular expression that is searched for in the log line. Please note that, because logwatch checks are usually inventorized, it is easier to use the global configuration variable logwatch_patterns instead of explicit check parameters. Example: [ ('I', "sshd"), ('C', "foo.*bar") ] 
In the current version of this check a level can be substituted by a pair of numbers, e.g. instead of 'W' you can write (10, 20). In that case the message will be considered as 'I' for the first 9 appearances, as 'W' from the 10th appearance on, and as 'C' from the 20th appearance on. The counts are reset each time the host is being checked. 
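
To make the reclassification rules above more concrete, here is a minimal Python sketch. It is not the check's real implementation: the function name reclassify, the (level, text) input format and the "first matching pattern wins" rule are assumptions for illustration, and Python's re module stands in for the extended regular expressions mentioned above.

```python
import re


def reclassify(lines, patterns):
    """Apply a patterns list to messages from one check run.

    Hypothetical helper: `lines` is a list of (level, text) pairs as
    classified by the agent, `patterns` a list of (new_level, regex)
    pairs where new_level is 'I', 'W', 'C' or a (warn, crit) count pair.
    """
    counts = {}  # matches per pattern; starts empty on every check run
    result = []
    for level, text in lines:
        for index, (new_level, regex) in enumerate(patterns):
            if not re.search(regex, text):
                continue
            if isinstance(new_level, tuple):
                warn, crit = new_level
                counts[index] = counts.get(index, 0) + 1
                seen = counts[index]
                if seen >= crit:
                    level = "C"
                elif seen >= warn:
                    level = "W"
                else:
                    level = "I"
            else:
                level = new_level
            break  # assumption: the first matching pattern decides
        result.append((level, text))
    return result


messages = [("C", "login for root failed")] * 20
levels = [lvl for lvl, _ in reclassify(messages, [((10, 20), "login.*failed")])]
print(levels[8], levels[9], levels[19])
```

With the pair (10, 20), the 9th matching message is still ignored, the 10th becomes a warning and the 20th becomes critical, matching the description above.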

Performance data

None.

Inventory

All logfiles sent by the agent are automatically inventorized when the option logwatch_forward_to_ec is not configured or set to False. Please use standard inventory configuration methods if you want to ignore certain log files. 

Configuration variables

Variable                  Type     Description
logwatch_patterns         dict     A dictionary for reclassifying the level of certain messages. This is especially useful for the Windows agent, since it uses the pre-classification of Windows, which might not meet your needs. The key of each entry is the name of a logfile (see Item). The value of each entry is a list of pairs as described under Check parameters. It is also possible to restrict entries to single hosts or groups of hosts by prepending lists of hosts and host tags to the pairs, as seen in the example. 
logwatch_max_filesize     int      This is preset to 500000 and limits the maximum size of a logfile stored on the monitoring server (i.e. the amount of unacknowledged problem messages). As soon as the maximum size is reached, the check becomes critical (which it probably is anyway with so many problem messages lying around). 
logwatch_dir              string   Directory where log messages are stored. The default value is /var/lib/check_mk/logwatch. For each host a subdirectory is created. The names of the files therein are the names of the logfiles with slashes replaced by backslashes. 
logwatch_service_output   string   Changes the output of the service to only show the count of events without the last error message. The default is default, which shows the last error message and the count; use count if you only want to see the count.
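
As an illustration of the storage layout described for logwatch_dir, the following Python sketch builds the path of the stored message file for a host and an item. The function name stored_logfile_path and the host name myhost are made up for this example; the real check may assemble the path differently.

```python
import posixpath


def stored_logfile_path(logwatch_dir, hostname, item):
    """Return the path of the stored message file for one logfile.

    Hypothetical helper: one subdirectory per host, and the file name is
    the logfile name with slashes replaced by backslashes.
    """
    filename = item.replace("/", "\\")
    return posixpath.join(logwatch_dir, hostname, filename)


print(stored_logfile_path("/var/lib/check_mk/logwatch", "myhost", "/var/log/messages"))
```

For a Windows event log such as Application the item contains no slashes, so the file name is simply the item itself.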

Examples

main.mk
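logwatch_patterns = {
    # Windows event log "System", applies to all hosts
    "System":             [ ('W', "crash"),
                            ('I', "svchost"),
                            # ignored up to the 9th match, 'W' from the 10th, 'C' from the 30th
                            ((10,30), "login.*failed") ],
    # Only for hosts with the tag 'test': ignore sshd messages
    "/var/log/messages":  [ ( ['test'], ALL_HOSTS, 'I', "sshd") ]
}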
