Publish/subscribe logging options
Zato publish/subscribe message queues and topics offer several ways to gain insight into the inner workings of all the components taking part in message delivery. This article presents an overview of the mechanisms available.
Two types of logging
Logging is broken out into two categories:
- Files or other destinations - there are several files where messages and events may be stored
- GUI in web-admin - dedicated parts of web-admin let one check runtime configuration and observe events taking place in internal components
Files or other destinations
By default, several files are employed to keep information about pub/sub messages, and each file stores data with its own characteristics. As with other logging files, details such as the logging level are kept in the logging.conf file of each server in a cluster.
Note that logging to files is just what pub/sub uses in default settings - everything in logging.conf is based on Python’s standard logging facilities, so it is possible to reconfigure it in any way desired to send logging entries to other destinations instead of, or in addition to, what the configuration says by default. Likewise, it is possible to disable these logs altogether if they are not needed.
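Because the configuration follows Python's standard logging model, redirecting entries amounts to attaching different handlers to a logger. The sketch below illustrates the idea with logging.config.dictConfig called directly rather than with an actual logging.conf file. The logger name zato_pubsub, the handler names, the format string and the file path are all assumptions made for illustration and may differ from what a given server uses.

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical location - a real server keeps its log files elsewhere
log_path = os.path.join(tempfile.mkdtemp(), 'pubsub.log')

config = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'default': {
            'format': '%(asctime)s - %(levelname)s - %(name)s - %(message)s'
        },
    },
    'handlers': {
        # Keeps writing to a file, as in default settings
        'pubsub_file': {
            'class': 'logging.handlers.RotatingFileHandler',
            'filename': log_path,
            'maxBytes': 20_000_000,
            'backupCount': 10,
            'formatter': 'default',
        },
        # An additional destination, here standard error
        'pubsub_console': {
            'class': 'logging.StreamHandler',
            'formatter': 'default',
        },
    },
    'loggers': {
        # Logger name assumed to match the name of the log discussed here
        'zato_pubsub': {
            'level': 'INFO',
            'handlers': ['pubsub_file', 'pubsub_console'],
            'propagate': False,
        },
    },
}

logging.config.dictConfig(config)

logger = logging.getLogger('zato_pubsub')
logger.info('Sample entry')  # Goes to both the file and the console
```

Removing a handler from the list stops entries from reaching that destination, and any other handler class from Python's logging.handlers module, such as SysLogHandler, can be configured in the same manner.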
zato_pubsub is the main file with internal events pertaining to the handling of pub/sub messages. Any time a message is published or received by one of the endpoints, a new log entry will be added here.
Moreover, the file contains plenty of extra information - whether there are any matching subscribers for a message, what their subscription keys are, if they are currently connected, what happened to the message if there were none, if the message could not be delivered, what the reason was, how many attempts have been made so far, all the relevant timestamps, headers and other metadata.
In short, it is the full life-cycle of publications and subscriptions for each message processed. This is where all the details of what is going on under the hood can be found.
Unlike the file above, the whole purpose of zato_pubsub_audit is to save data and metadata of all the messages received and published to topics - only that.
There is no additional information on what happened to each message, when and where it was delivered - it is purely an audit log of all the messages that Zato published.
It is at times convenient to use it precisely because it has basic information only, without any details.
Prevention of data loss
The whole purpose of zato_pubsub_overflow is to retain messages that would otherwise be dropped if topics reached their maximum depth yet subscribers were slow to consume them.
Consider a topic with a maximum depth of 10,000 in-RAM messages. If there is a steady inflow of messages to this topic but subscribers do not consume them in a continuous manner, which means that the messages will not be placed in their queues in a timely fashion, the maximum depth will eventually be reached.
The question then is, what should be done with such topics overflowing with messages? By default, Zato will save them to disk just in case they should be kept around for manual inspection or resubmission.
If this is not needed, zato_pubsub_overflow can be configured to log at the WARN or ERROR level. This will disable logging to disk and any such superfluous messages will be discarded.
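The effect of the logging level can be illustrated with plain Python logging. The sketch below is a toy model, not Zato's implementation: the publish function, the in-memory handler standing in for a file on disk, and the assumption that overflowing messages are logged at the INFO level are all hypothetical. Which message is set aside once a topic is full is also an assumption, here it is the newly arriving one.

```python
import logging

class ListHandler(logging.Handler):
    """ Collects log messages in memory, standing in for a file on disk.
    """
    def __init__(self):
        super().__init__()
        self.entries = []

    def emit(self, record):
        self.entries.append(record.getMessage())

# Logger name assumed to match the name of the overflow log
overflow_logger = logging.getLogger('zato_pubsub_overflow')
overflow_logger.propagate = False

storage = ListHandler()
overflow_logger.addHandler(storage)

def publish(topic, max_depth, msg):
    """ Hypothetical publication step - stores the message in the topic
    or, if the topic is already full, hands it over to the overflow logger.
    """
    if len(topic) >= max_depth:
        overflow_logger.info('Overflow: %s', msg)  # Assumed to be INFO level
        return False
    topic.append(msg)
    return True

topic = []

# With the level set to INFO, overflowing messages are retained
overflow_logger.setLevel(logging.INFO)
for idx in range(5):
    publish(topic, 3, f'msg-{idx}')

retained_at_info = list(storage.entries)

# With the level raised to WARNING, they are silently discarded
overflow_logger.setLevel(logging.WARNING)
publish(topic, 3, 'msg-5')

retained_at_warning = list(storage.entries)
```

In this model, the topic keeps its first three messages, the next two end up in the overflow log and the last one is lost because INFO entries are filtered out once the level is WARNING.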
GUI in web-admin
There are several web-admin screens that let one understand what kind of internal events and structures participate in publish/subscribe:
- Delivery tasks - their job is to deliver messages from subscriber queues to endpoints
- Sync tasks - they are responsible for internal synchronization of state, data and metadata between topics and queues
Of particular interest is the event log - it shows step by step what happens to each published message in terms of decision making and code branches.
Note that the log has a maximum size of 1,000 events, with new events replacing the oldest ones. On a busy server it will thus cover a few seconds at most, so it is primarily of importance in environments with low traffic to topics, such as development and test ones.
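The behaviour of a fixed-size log of this kind can be modelled with a bounded deque from Python's standard library. This is only an illustration of the concept, not the data structure Zato actually uses, and both the shape of an event and the rate of 500 events per second are hypothetical.

```python
from collections import deque

MAX_EVENTS = 1000

# Once the deque is full, appending a new event discards the oldest one
event_log = deque(maxlen=MAX_EVENTS)

# Simulate 2,500 events arriving in order
for idx in range(2500):
    event_log.append({'id': idx, 'action': 'published'})

oldest = event_log[0]['id']   # Events 0..1499 have been replaced
newest = event_log[-1]['id']

# At a hypothetical rate of 500 events per second,
# the log covers only the last two seconds of activity.
events_per_second = 500
retention_seconds = MAX_EVENTS / events_per_second
```

Since such a structure lives in the memory of a single process, it also starts out empty each time that process is restarted.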
Note that runtime pub/sub information in web-admin is always presented for each server process separately - for instance, if a server is restarted, its process ID (PID) will likely change and all the data structures will be repopulated. The same holds for the event log which is not preserved across server restarts.
Publish/subscribe messages and their corresponding internal mechanisms are covered by a comprehensive logging infrastructure that lets one understand both the overall picture and the low-level details of the functionality.
From an audit log, through events, to tracing of individual server processes, each and every angle is addressed in order to make sure that topics and queues work in a smooth and reliable way.