Logging

Introduction

The log files for each module are stored in the folder %SiteController%/log. Log files are rotated according to your configuration, so whenever a problem occurs, collect the log files as soon as possible, before older entries are rotated away. The internal service name is useful when troubleshooting: each module writes to a log text file named after its internal service name, with the extension ".log".
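Collecting the log files of one module can be scripted. The following is a minimal sketch, not part of the SiteController itself: the helper name find_service_logs is hypothetical, the log folder has to be passed in explicitly, and it assumes that rotated backups carry the numeric suffixes (".log.1", ".log.2", ...) that Python's RotatingFileHandler produces.

```python
from pathlib import Path


def find_service_logs(log_dir, service_name):
    """Return the current log file of a service followed by its rotated backups.

    Hypothetical helper for illustration; log_dir is the resolved path of the
    SiteController log folder on your system.
    """
    log_dir = Path(log_dir)
    current = log_dir / f"{service_name}.log"
    # Rotated backups are assumed to be named <service>.log.1, <service>.log.2, ...
    backups = sorted(
        (p for p in log_dir.glob(f"{service_name}.log.*") if p.suffix[1:].isdigit()),
        key=lambda p: int(p.suffix[1:]),
    )
    return ([current] if current.exists() else []) + backups
```

The returned list is ordered from the newest file to the oldest backup, so it can be copied or archived in one go.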

Framework

Configuration

  • see SiteController.cfg (formerly in azeti_logging.cfg)

Standard settings:

  • logger=azeti_file
  • the default level should be level=INFO
  • SizeRotatingFileHandler
    • rotates a log file after 10490000 bytes and keeps 5 backup files

Other useful handlers:

Have a look at the official documentation for examples of file size based rotation.

Changing the size rotation parameters

The defaults for the size-based rotation parameters in SiteController.cfg are:

[handler_SizeRotatingFileHandler]
# Size based rotation of log files
formatter = simpleFormatter
class = handlers.RotatingFileHandler
# Rotate if a file exceeds 10490000 bytes (about 10 MiB) and keep 5 old files
args = (['%(logfilename)s', 'a', 10490000, 5])

Default parameters do not need to be present in a SiteController.cfg file. Default values are used when SiteController.cfg contains no entry for them.

To change the number of old backup files kept per log file from the default (5) to 15, add the following snippet to the specific SiteController.cfg:

[handler_SizeRotatingFileHandler]
# Rotate if a file exceeds 10490000 bytes (about 10 MiB) and keep 15 old files
args=(['%(logfilename)s', 'a', 10490000, 15])
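For orientation, the four values in args correspond to the positional parameters of Python's logging.handlers.RotatingFileHandler: file name, file mode, maximum size in bytes and number of backup files. The sketch below builds an equivalent handler directly in Python. The log path, the logger name and the message format are assumptions made for this illustration and are not taken from SiteController.cfg.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical log file location, used only for this illustration.
log_path = os.path.join(tempfile.mkdtemp(), "example_service.log")

# Same positional arguments as in the args entry:
# filename, mode, maxBytes, backupCount
handler = logging.handlers.RotatingFileHandler(log_path, "a", 10490000, 15)

# Assumed format; the real simpleFormatter definition is not shown here.
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

example_logger = logging.getLogger("example_service")
example_logger.setLevel(logging.INFO)
example_logger.addHandler(handler)
example_logger.info("Size based rotation configured")
```

With these values, up to 15 rotated files are kept next to the current log file, so a single module can occupy roughly 16 times the size limit on disk.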


Levels

Level (numeric value): intended purpose

  • CRITICAL (50): Used to log events so grave that the process cannot continue running (e.g. a runtime error that is not specifically handled).
  • ERROR (40): Used to log events that indicate an erroneous condition which the process can ignore and continue processing (e.g. ignoring a contradictory configuration entry).
  • WARNING (30): Used to log events that indicate erroneous or probably unwanted conditions for which the process takes workaround measures to continue processing (e.g. a file is missing and the process uses default values instead).
  • INFO (20): Used to log rare (!) informative events such as the starting and stopping of processes or configuration changes.
    • Note: INFO should never log more than a few lines a day. If INFO log messages on the SC in regular operation exceed 1 KiB per day, there are too many INFO messages; move the least important ones to DEBUG instead.
  • DEBUG (10): Used to log anything else, especially detailed log messages needed to debug a certain condition.
  • NOTSET (0): This is not a real log level; it causes the logger to determine the actual log level by looking somewhere else (refer to the Python documentation).
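The numeric values and the NOTSET behaviour can be checked with Python's standard logging module. In this small sketch the logger names are hypothetical; it shows that a logger left at NOTSET takes its effective level from its parent.

```python
import logging

# Numeric values of the standard levels
for name in ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET"):
    print(name, getattr(logging, name))

# Hypothetical logger names, used only for this illustration.
parent = logging.getLogger("demo_levels")
child = logging.getLogger("demo_levels.module")

parent.setLevel(logging.WARNING)

# The child logger was never given a level, so it stays at NOTSET
# and resolves its effective level by walking up to the parent.
effective = child.getEffectiveLevel()
print(logging.getLevelName(effective))
```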

General rules:

No log level (with the sole exception of DEBUG) is allowed to log messages periodically!

Example:

  • If a state persists, e.g. a connection keeps failing to recover, log it once as a WARNING and a second time, after a while, as an ERROR. As long as the state has not changed in between, stop repeating that error message in the logs.
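One possible way to follow this rule is to track the failure state and log only on transitions. The class below is a sketch under stated assumptions, not an existing SiteController component: the name StateChangeReporter is hypothetical, "after a while" is modelled as a number of failed checks (escalate_after, assumed to be greater than 1), and reporting the recovery at INFO is an additional assumption.

```python
import logging


class StateChangeReporter:
    """Log a persistent failure at most twice: once as WARNING, once as ERROR."""

    def __init__(self, logger, escalate_after):
        self.logger = logger
        # Number of consecutive failed checks after which the ERROR is logged.
        self.escalate_after = escalate_after
        self.failures = 0

    def report_failure(self, message):
        self.failures += 1
        if self.failures == 1:
            self.logger.warning(message)
        elif self.failures == self.escalate_after:
            self.logger.error(message)
        else:
            # Only DEBUG is allowed to log periodically.
            self.logger.debug(message)

    def report_recovery(self, message):
        # The state changed, so the next failure is reported from scratch.
        if self.failures:
            self.logger.info(message)
        self.failures = 0
```

A module would call report_failure on every failed check and report_recovery on every successful one; repeated failures beyond the second message are then visible only at DEBUG level.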

INFO

Every module should log when it starts and when it finishes, using a common structure (meaning we decide on a sentence like "Module ... starting").

Examples:

  • Start/Stop of module
  • a readable overview of configuration changes (that means not just a JSON dump)
  • Other things that may be interesting for a user to know but aren't warnings or errors
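A common structure for the start and stop messages is easiest to keep when every module uses the same helper. The following sketch uses a context manager for that; the helper name module_lifecycle and the exact wording of both sentences are assumptions, since the final sentence has not been decided here.

```python
import logging
from contextlib import contextmanager


@contextmanager
def module_lifecycle(logger, module_name):
    """Log a uniform INFO message when a module starts and when it stops."""
    logger.info("Module %s starting", module_name)
    try:
        yield
    finally:
        # Runs on normal exit as well as when an exception propagates.
        logger.info("Module %s stopped", module_name)
```

A module would wrap its main loop in "with module_lifecycle(logger, name):", which produces exactly two INFO lines per run.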

WARNING

A circumstance that is not expected but does not affect the functionality itself; however, the result may not be what the user expects, or the condition is likely just temporary.

Examples:

  • No mosquitto is found at the configured mosquitto path; however, by searching the path another one is found and the module uses that location instead.
  • There is no such configured serial port, but since the system has only one of them, the daemon takes it instead.
  • A host address is suddenly not reachable anymore (assuming a temporary network issue).

ERROR

Any circumstance that is not expected, affects the proper functioning of the module, and may lead to the user receiving wrong information or a false perception that something is working. The system is not able to resolve it on its own. The module is not able to continue checking that particular part; however, the rest of the module is not affected and it continues to check everything else.

Examples:

  • The configured serial interface is wrong but there is more than one of them in the system.
  • A host address is unreachable and has never been seen alive before
  • A host that stays offline for a longer period of time

CRITICAL

An error from which the module is not able to recover and which is expected to affect unrelated checks, too. This might lead to information loss.

Examples:

  • Out of memory
  • Corrupted database file
  • No mosquitto broker found
  • Unhandled exceptions
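Unhandled exceptions only reach the log file if something catches them at the top level. One common Python technique is to install a sys.excepthook that logs at CRITICAL. The sketch below illustrates that technique; the logger name is hypothetical, and whether the SiteController modules do this is not stated here.

```python
import logging
import sys

# Hypothetical logger name, used only for this illustration.
crash_logger = logging.getLogger("example_service.crash")


def log_unhandled_exception(exc_type, exc_value, exc_traceback):
    """Log any exception that was not handled elsewhere at CRITICAL level."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Keep the default behaviour for Ctrl+C.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    crash_logger.critical(
        "Unhandled exception",
        exc_info=(exc_type, exc_value, exc_traceback),
    )


# From here on, uncaught exceptions in the main thread are logged with traceback.
sys.excepthook = log_unhandled_exception
```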

DEBUG

Information that the developer considers necessary to debug the module, meaning that in this mode we show as much information as is required to show exactly what is happening at every point.

Next Steps