Now that Python 3 support is available as a preview for developers, this post summarizes the effort that went into making sure that Zato works smoothly using both Python 2.7 and 3.x.

In fact, the work required was remarkably straightforward and trouble-free, and this article discusses the thought process behind it, along with some of the techniques applied and tools used.


Zato is an enterprise API integration platform and backend application server. We support a couple dozen protocols, various data formats, several sorts of IPC and other means to exchange messages across applications.

In other words, on the lowest level, passing bytes around, transforming, extracting, changing, collecting, manipulating, converting, encoding, decoding and comparing them, including support for all kinds of natural languages from around the world, is what Zato is about at its core when it is considered from the perspective of the programming language it is implemented in.

The codebase is around 130,000 lines of code, out of which Python and Cython account for 60,000 lines. This is not everything, though, because we also have 170+ external dependencies that need to work with Python 2.7 and 3.x.

The work took two people a total of 80 hours. It was spread over a longer calendar period, except for the final sprint, which required full attention for several days in a row.


Since the very beginning, it was clear that Python 3 would have to be supported one day, so the number one thing that each and every Python module has always had is this preamble:

from __future__ import absolute_import, division, print_function, unicode_literals

This is what every Python file contains and it easily saved 90% of the potential work required to support Python 3 because, among other less demanding things, it enforced a separation, though still not as strict as in Python 3, between byte and Unicode objects. The separation is a good thing and the more one works with Python 3, the clearer this becomes.
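A minimal sketch of the separation, as observed under Python 3:

```python
from __future__ import unicode_literals

# With unicode_literals, a bare literal is text in both Python lines:
# unicode under Python 2, str under Python 3 - while b'...' stays bytes
text = 'abc'
data = b'abc'

# Under Python 3 the two never compare equal - they are distinct types;
# under Python 2 the implicit coercion still silently succeeds
print(text == data)  # Python 3: False
```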

In Python 2, it was sometimes possible to mix the two. Imagine a Python-derived language where JSON objects and Python dicts could sometimes be used interchangeably.

For instance, this is a JSON object: {"key1": "value1"} and it so happens that it is also a valid Python dict, so in this hypothetical language, this would work:

json = '{"key1": "value1"}'
python = {'key2': 'value2'}

result = json + python

Now the result is this:

{'key1': 'value1', 'key2': 'value2'}

Or wait, perhaps it should be this?

'{"key1": "value1", "key2": "value2"}'

This is the central thing - they are distinct types and they should not be mixed merely because they may be related or seem similar.

Conceptually, just like upon receiving a JSON request from the network a Python application will decode it into a canonical representation, such as a dict, list or another Python object, the same should happen to other bytes, including ones that happen to represent text or similar information. In this case, the canonical format is called Unicode, and that is the whole point of employing it in one's application.
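Returning to the hypothetical example above, the canonical-form approach makes the intent explicit; a sketch:

```python
import json

# First decode the JSON text into its canonical Python form ...
json_text = '{"key1": "value1"}'
python_dict = {'key2': 'value2'}

decoded = json.loads(json_text)  # now a plain dict
decoded.update(python_dict)      # merging two dicts is unambiguous

# ... and encode back to JSON only at the boundary, when needed
result = json.dumps(decoded, sort_keys=True)
print(result)  # {"key1": "value1", "key2": "value2"}
```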

All of this was clear from the outset and the from __future__ statements helped in its execution, even if theoretically one could still have mixed bytes and Unicode - it was simply a matter of using the correct canonical format in a given context, i.e. a case of making sure the architecture was clean.

This particular __future__ statement was first announced in 2008, so there was plenty of time to prepare for it.

As part of the preparations, it is good to read a book about Unicode. Not just a 'Unicode for overburdened developers' kind of article but an actual book that will let one truly appreciate the standard's breadth and scope. While reading it, do not resist the temptation to learn at least the basics of two or more natural languages that you never knew before. It will only help you develop into a better person and this is not a joke.

While programming with bytes and Unicode, it is convenient to forget about whether something is a 'str', 'bytes' or 'unicode' object - it is easier simply to think about bytes and text. There are bytes that can mean anything and there is text whose native, canonical form is Unicode. This is not always 100% accurate because Unicode can represent marvellous gems such as Byzantine musical notation and more, but if a given application's scope is mostly constrained to text then this will work - there are bytes and there is text.
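In code, this amounts to decoding at the edges and keeping text inside; a minimal sketch:

```python
# Bytes arrive from the outside world, text is what the application
# works with internally
incoming = b'caf\xc3\xa9'          # UTF-8 bytes off the wire

text = incoming.decode('utf8')     # decode once, at the boundary
assert text == u'caf\xe9'          # canonical form: text (Unicode)

outgoing = text.encode('utf8')     # encode once, on the way out
assert outgoing == incoming
```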

This is all fine with our own code but there are still the external libraries that Zato uses and some of them will want bytes, not text, or the other way around, in seemingly similar situations. There can even be cases like a library expecting protocol header keys to be text and protocol header values to be bytes for rather unclear reasons. Simply accept it as a fact of life and move on with your work; there is no need to pause even for a moment to think about it.

Side projects

It was good to try out Python 3 first in a few new, smaller side projects - GUI or command-line tools that are not part of the core yet are important in the overall picture. The most important part was that creating a Python 3 application from scratch was in no way different than in Python 2; this served as a gentle introduction to Python 3-specific constructs and the knowledge was easily transferred later on to the main porting job.


Out of a total of 170+ dependencies, around 10 were not Python 3-compatible. None of them had been updated in eight, twelve or more years. At this point, it is safe to assume that if a dependency was last updated in 2009 and has no Python 3 support, then it never will.

What to do next depended on the particular case - each of them was some kind of convenience library - sometimes they had to be dropped and sometimes forked. The most complex changes required in a fork were on the level of updating 'print' to 'print()' or doing away with complex installation setups that predated contemporary pip-based configuration options.

Other than that, there were no issues with dependencies, all of them were ready for Python 3.

Idioms and imports

Most of the reference information needed to make use of both Python 2 and 3 was available via the python-future project, which is itself of great assistance. Installing this library, along with its dependencies, sufficed for 99% of cases. There were some lesser requirements that were incorporated into a Zato-specific submodule directly, e.g. sys.maxint is at times useful as a loop terminator but ints in Python 3 have no limits, so an equivalent had to be added to our own code.
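For instance, such a sys.maxint equivalent can be as small as the sketch below; the name maxint here is illustrative rather than the actual Zato attribute:

```python
import sys

# Python 2 has sys.maxint but Python 3 does not - sys.maxsize exists
# in both and serves equally well as a loop terminator
maxint = getattr(sys, 'maxint', sys.maxsize)

print(maxint >= 2 ** 31 - 1)  # True
```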

Note that the page above does not show all the idioms and some changes were not always immediately obvious, such as modifications to __slots__ or the way metaclasses can be declared, but there were no really impossible cases, just different things to use, either built into Python 3 or available via the future or six libraries.
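As an example of one such not-immediately-obvious change, the Python 2 __metaclass__ attribute is gone, yet the three-argument form of type - which is what helpers like six.add_metaclass and future.utils.with_metaclass build upon - works identically in both versions:

```python
# A metaclass that tags the classes it creates
class Meta(type):
    def __new__(mcs, name, bases, namespace):
        namespace['created_by_meta'] = True
        return super(Meta, mcs).__new__(mcs, name, bases, namespace)

# Calling the metaclass directly works the same under Python 2 and 3,
# sidestepping the incompatible '__metaclass__ = Meta' (2.x) and
# 'class X(metaclass=Meta)' (3.x) syntaxes
MyClass = Meta('MyClass', (object,), {})

print(MyClass.created_by_meta)  # True
```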

A nice thing is that one is not required to change all the imports immediately in one go - they can be changed in smaller increments, e.g. 'basestring' is still available in the form of 'from past.builtins import basestring'.
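A sketch of that incremental style, with a fallback for environments where python-future is not installed:

```python
# Under Python 2, or wherever python-future is available, the import
# keeps old isinstance checks working unchanged; otherwise basestring
# is simply str
try:
    from past.builtins import basestring
except ImportError:
    basestring = str

print(isinstance('abc', basestring))  # True
```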


A really important aspect of the migration was the ability to test sub-components of an application in isolation. This includes not only unit tests, which may be too low-level, but also things such as starting only selected parts of Zato without having to boot up whole servers, which in turn meant each change could be tested within one second rather than ten. To a degree, this was an unexpected but really useful test of how modular our design was.

Intellectually, this was certainly the most challenging part because it required maintaining and traversing several trains of thought at once, sometimes for several days on end. This, in turn, means that it is really not a job for late afternoons only and it cannot be an afterthought - things can simply get complex very quickly.

String formatting

There is one thing that was not expected - the way str.format works with bytes and text.

For instance, this will fail in Python 3:

>>> 'aaa' + b'bbb'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'bytes' object to str implicitly

Just for reference, in Python 2 it does not fail:

>>> 'aaa' + b'bbb'
'aaabbb'

Still under Python 2, let's use string formatting:

>>> template = '{}.{}'
>>> template.format('aaa', b'bbb')
'aaa.bbb'

In Python 3, this is the result:

>>> template = '{}.{}'
>>> template.format('aaa', b'bbb')
"aaa.b'bbb'"

In the context of a Python 3 migration, it would probably have been more in line with other changes to the language if this had been special-cased to reject such constructs altogether.

Otherwise, it initially led to rather inexplicable error messages because the code that produces such string constants may be completely unaware of where they are used further on. But after witnessing it once or twice, the root cause was apparent and could easily be dealt with.

Things that are missed

One small, yet convenient, feature of Python 2 was the availability of some of the common codecs directly in string objects, e.g.:

>>> u'abc'.encode('hex')
>>> u'abc'.encode('base64')
>>> u'ελληνική'.encode('idna')

This will not work as-is in Python 3:

>>> u'abc'.encode('hex')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: 'hex' is not a text encoding; use codecs.encode() to handle arbitrary codecs

Naturally, the functionality as such is still available in Python 3, just not via the same means.
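For reference, the same transforms can be expressed through the codecs and base64 modules, which is the route Python 3 points to:

```python
import base64
import codecs

# Arbitrary bytes-to-bytes codecs moved out of str.encode in Python 3
print(codecs.encode(b'abc', 'hex'))  # b'616263'
print(base64.b64encode(b'abc'))      # b'YWJj'

# Text encodings such as idna still work directly on str objects
print(u'ελληνική'.encode('idna'))    # b'xn--...'
```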

Python 2.7

On the server side, Python 2.7 will be around for many years. After all, this is a great language that has let millions of people complete amazing projects, and most enterprise applications do not get rewritten solely because one of their technical components (here, Python) changes in a way that is partly incompatible with previous versions.

Both RHEL and Ubuntu ship with Python 2.7 and both of them have long-term support well into the 2020s so the language as such will not go away. Yet, piece by piece, all the applications will be changed, modified, modularized or rewritten and gradually Python 2.7's usage will diminish.

In Zato, Python 2.7 will be supported for as long as it is feasible, and one of the current migration's explicit goals was to make sure that existing user Zato environments based on Python 2.7 will continue to work out-of-the-box, so there is no difference which Python version one chooses - both are supported and can be used.


An extraordinary aspect of the migration is that it was so unextraordinary. There were no really hard-won battles, no true gotchas and no unlooked-for hurdles. This can likely be attributed to the facts that:

  • Python developers offered information on what to expect during such a job
  • Unicode was not treated as an afterthought
  • Zato reuses common libraries that are all ported to Python 3 already
  • The Internet offers guides, hints and other pieces of information about what to do
  • It was easy to test Zato components in isolation
  • Time was explicitly put aside for the most difficult parts without having to share it with other tasks

The next version of Zato, to be released in June 2019, will come with pre-built packages using Python 2.7 and 3, but for now installation from source is needed - visit this forum thread for more details.

This is part one of a mini-series about working with IBM MQ as a Zato and Python user. This installment will cover installation and configuration whereas the next one will delve into programming tasks.

Zato is a Python-based multi-protocol API integration platform, message broker and backend application server and IBM MQ is one of the protocols and products that it supports out of the box, making it possible to integrate with IBM MQ systems, including JMS ones, using little or no programming at all.


This is what the article will cover. The end result will be a working Zato-IBM MQ installation, capable of both receiving messages from and sending messages to IBM MQ queue managers.

  • Installing a Zato environment
  • Enabling IBM MQ connections in Zato
  • Installing IBM MQ Client
  • Installing PyMQI
  • Configuring Zato connection definitions
  • Zato outgoing connections
  • Zato channels

Installing a Zato environment

  • The first step is to install a Zato package. To work with IBM MQ, you can choose any of the systems supported by Zato. Ubuntu will be used here but everything works the same no matter if it is Ubuntu, RHEL or any other OS.

  • Next, you need to create a Zato environment. The easiest way is to use a quickstart cluster which sets up a working Zato cluster in one command.

  • Alternatively, you can follow the tutorial that will guide you through the same process in more detail

Enabling IBM MQ connections in Zato

  • Once you have a Zato environment ready, you need to stop all of its servers and enable IBM MQ connections in the server configuration files

  • Stop all the servers

  • Open the server.conf file of each server

  • Find the [component_enabled] stanza

  • Set ibm_mq=True in the stanza

  • Do not start all the servers back just yet
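After these steps, the relevant fragment of each server.conf will read as below (other entries in the stanza are left as they were):

```ini
[component_enabled]
ibm_mq=True
```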

Installing IBM MQ Client

  • To make it possible to connect to queue managers via TCP, it is required to install a software package called IBM MQ Client. This package contains runtime libraries that let applications, such as Zato, use TCP connections with IBM MQ.

  • Download the client and follow its installation procedure as described by IBM

Installing PyMQI

  • With the client package in place, it is now possible to install PyMQI, which is a low-level IBM MQ client library for Python - it could not be installed in the previous steps because it requires the IBM MQ Client as a prerequisite

  • To install PyMQI, navigate to Zato installation directory and use pip to download PyMQI:

  cd /opt/zato/current
  ./bin/pip install pymqi

  • This concludes the OS-level preparations and installation steps

  • Now all the servers can be brought back using the zato start command

Configuring Zato connection definitions

  • Log in to web-admin and navigate to Connections -> Definitions -> IBM MQ

  • A connection definition is a reusable piece of configuration - a common set of information that can be used in more than one place. In this context, it allows one to create both incoming MQ connections (Zato channels) and outgoing ones. Note that Zato channels share their name with MQ channels but they are an unrelated concept, only distantly similar.

  • Create a definition by filling out the form as below. You need to change the connection's password because by default it is a randomly generated one that cannot be used. Afterwards, you can click Ping to confirm that connections to your remote queue manager work correctly.

Zato outgoing connections

  • In web-admin, create a new IBM MQ outgoing connection via Connections -> Outgoing -> IBM MQ

  • An outgoing connection lets one push and send messages to other systems, in this case it will let you send messages to MQ queues

  • A single connection is tied to a particular connection definition, which means that it is related to a queue manager the definition points to, but it can be used with any number of MQ queues as needed

  • Once a connection is defined, it is possible to start using it from Python code but in this post, let's send a test message directly from web-admin

Zato channels

  • In web-admin, create a new IBM MQ channel through Connections -> Channels -> IBM MQ

  • A Zato channel acts as a message listener, accepting messages from a particular queue and invoking a user-defined API service that acts accordingly, e.g. by transforming the message and delivering it to its intended recipients

  • No programming is needed to accept messages from a queue - the very fact of creating a channel lets Zato automatically consume messages in the background from the channel's queue

  • Any user-defined service can be used in the channel but below, just for illustration purposes, a built-in service is employed. This is a convenience service that simply saves to Zato server logs all the information about each message taken off a queue, including the message's data and metadata, such as headers and MQMD.

  • To send a message to Zato, MQ's own command-line utility is used below; the full command, which can be executed from the system that MQ runs on, is /opt/mqm/samp/bin/amqsput DEV.QUEUE.1 QM1

  • If you do not have access to the MQ command line, you can simply create an outgoing connection in Zato and use the Send message form in web-admin to send a message that will be received by a channel. It is not shown here but it would work just as well.


This is it. Your Zato installation is configured to send and accept IBM MQ-originating messages and, in the next part, Python services will be used to actually process the messages in a useful manner, e.g. by enriching their contents and sending them out to other applications.

For more information - visit the main documentation site or go straight to the tutorial and stay tuned for the next article.

Zato publish/subscribe message queues and topics offer several ways to gain insight into the inner workings of all the components taking part in message delivery and this article presents an overview of the mechanisms available.

Two types of logging

Logging is broken out into two categories:

  • Files or other destinations - there are several files where messages and events may be stored
  • GUI in web-admin - dedicated parts of web-admin let one check runtime configuration and observe events taking place in internal components

Files or other destinations

By default, several files are employed to keep information about pub/sub messages, each file stores data with its own characteristics. As with other logging files, the details such as logging level are kept in the logging.conf file for each server in a cluster.

Note that logging to files is just what pub/sub uses in default settings - everything in logging.conf is based on Python's standard logging facilities so it is possible to reconfigure it in any way desired to send logging entries to other destinations instead of, or in addition to, what the configuration says by default. Likewise, it is possible to disable them altogether if they are not needed.
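For instance, because these are regular Python loggers, a handler can also be attached programmatically - a sketch, with the logger name zato_pubsub used for illustration:

```python
import logging

# Send pub/sub entries to an additional destination - here, stderr -
# on top of whatever logging.conf already configures
logger = logging.getLogger('zato_pubsub')

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))

logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info('Sample pub/sub entry')
```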

General log

zato_pubsub is the main file with internal events pertaining to the handling of pub/sub messages. Any time a message is published or received by one of the endpoints, a new log entry will be added here.

Moreover, the file contains plenty of extra information - whether there are any matching subscribers for a message, what their subscription keys are, if they are currently connected, what happened to the message if there were none, if the message could not be delivered, what the reason was, how many attempts have been made so far, all the relevant timestamps, headers and other metadata.

In short, it is the full life-cycle of publications and subscriptions for each message processed. This is the place where all the details of what is going on under the hood can be found.

Audit log

Unlike the file above, the whole purpose of zato_pubsub_audit is to save data and metadata of all the messages received and published to topics - only that.

There is no additional information on what happened to each message, when and where it was delivered - it is purely an audit log of all the messages that Zato published.

It is at times convenient to use it precisely because it has basic information only, without any details.

Prevention of data loss

The whole purpose of zato_pubsub_overflow is to retain messages that would be otherwise dropped if topics reached their maximum depth yet subscribers were slow to consume them.

Consider the topic below - it has a maximum depth of 10,000 in-RAM messages. If there is a steady inflow of messages to this topic but subscribers do not consume them in a continuous manner, which means that the messages will not be placed in their queues in a timely fashion, then this limit, the maximum depth, will eventually be reached.

The question then is, what should be done with such topics overflowing with messages? By default, Zato will save them to disk just in case they should be kept around for manual inspection or resubmission.

If this is not needed, zato_pubsub_overflow can be configured to log on level WARN or ERROR. This will disable logging to disk and any such superfluous messages will be ignored.

Web-admin GUI

There are several web-admin screens that let one understand what kind of internal events and structures participate in publish/subscribe:

  • Delivery tasks - their job is to deliver messages from subscriber queues to endpoints
  • Sync tasks - they are responsible for internal synchronization of state, data and metadata between topics and queues

Of particular interest is the event log - it shows step-by-step what happens to each message published in terms of decision making and code branches.

Note that the log has a maximum size of 1,000 events, with new events replacing any older ones, thus in a busy server it will suffice for a few seconds at most and it is primarily of importance in environments with low traffic to topics, such as development and test ones.

Note that runtime pub/sub information in web-admin is always presented for each server process separately - for instance, if a server is restarted, its process ID (PID) will likely change and all the data structures will be repopulated. The same holds for the event log which is not preserved across server restarts.


Publish/subscribe messages and their corresponding internal mechanisms are covered by a comprehensive logging infrastructure that lets one understand both the overall picture as well as low-level details of the functionality.

From an audit log, through events to tracing of individual server process, each and every angle is addressed in order to make sure that topics and queues work in a smooth and reliable way.

This article offers a high-level overview of the public services that Zato offers to users wishing to manage their environments in an API-driven manner in addition to web-admin and enmasse tools.


Most users start to interact with Zato via its web-based admin console. This works very well and is a great way to get started with the platform.

In terms of automation, the next natural step is to employ enmasse which lets one move data across environments using YAML import/export files.

The third way is to use the API services - anything that can be done in web-admin or enmasse is also available via dedicated API services. Indeed, both web-admin and enmasse are clients of the same services that users can put to work in their own integration needs.

The public API is built around a REST endpoint that accepts and produces JSON. Moreover, a purpose-built Python client can access all the services whereas an OpenAPI-based specification lets one generate clients in any language or framework that supports this popular format.

Python usage examples follow in the blog post but the full documentation has more information about REST and OpenAPI too.


The first thing needed is to set a password for the API client that will be used - an HTTP Basic Auth definition whose username is pubapi. Remember, however, that there are never any default secrets in Zato, so the automatically generated password cannot be used. To change the password, navigate in web-admin to Security -> HTTP Basic Auth and click Change password for the pubapi user.

Now, we can install the Python client package from PyPI. It does not matter how it is installed - it can be done under a virtual environment or not - but for simplicity, let's install it system-wide:

$ sudo pip install zato-client

This is it as far as prerequisites go, everything is ready to invoke the public services now.

Invoking API services

For illustration purposes, let's say we would like to be able to list and create ElasticSearch connections.

The easiest way to learn how to achieve it is to let web-admin do it first - each time a page in web-admin is accessed or an action such as creating a new connection is performed, one or more entries are stored in the admin.log files on the server that handles the call. That is, admin.log is the file that lists all the public API services invoked, along with their input and output.

For instance, when you list ElasticSearch connections, here is what is saved in admin.log:

INFO - name:``, request:`{'cluster_id': 1}`
INFO - name:``, response:`'{"zato_search_es_get_list_response": [],
  "_meta": {"next_page": null, "num_pages": 0, "prev_page": null,
  "has_prev_page": false, "cur_page": 1, "page_size": 50,
  "has_next_page": false, "total": 0}}'`

It is easy to discern that:

  • The service invoked was
  • Its sole input was the cluster ID to return connections for
  • There were no connections returned on output which makes sense because we have not created any yet

Let's do the same in Python now:

# Where to find the client
from zato.client import APIClient

# Credentials
username = 'pubapi'
password = '<secret>'

# Address to invoke
address = 'http://localhost:11223'

# Build the client
client = APIClient(address, username, password)

# Choose the service to invoke and its request
service_name = ''
request = {'cluster_id':1}

# Invoke the API service
response = client.invoke(service_name, request)

# And display the response
print(response.data)

Just like expected, the list of connections is empty:

$ python 

Navigate to web-admin and create a new connection via Connections -> Search -> ElasticSearch, as below:

Let's re-run the Python example now to witness that the newly created connection can in fact be obtained from the service:

$ python 
[{u'name': u'My Connection',
  u'is_active': True,
  u'hosts': u'\r\n',
  u'opaque1': u'{}',
  u'timeout': 5,
  u'body_as': u'POST',
  u'id': 1}]

But this is not over yet - we still need to create a new connection ourselves through an API service. If you kept admin.log open while the connection was being created in web-admin, you noticed that the service to do it was called and that its input was saved to admin.log too, so we can just modify our Python code accordingly:

# Where to find the client
from zato.client import APIClient

# Credentials
username = 'pubapi'
password = '<secret>'

# Address to invoke
address = 'http://localhost:11223'

# Build the client
client = APIClient(address, username, password)

# First, create a new connection
service_name = ''
request = {
    'name':'API-created connection',
    'hosts': '',
    'timeout': 10,
    'body_as': 'POST'
}
client.invoke(service_name, request)

# Now, get the list of connections, it should include the newly created one
service_name = ''
request = {'cluster_id':1}
response = client.invoke(service_name, request)

# And display the response
print(response.data)

This is a success again because on output we now have both the connection created in web-admin as well as the one created from the API client:

$ python 
[{u'name': u'API-created connection',
  u'is_active': True,
  u'hosts': u'',
  u'opaque1': u'{}',
  u'timeout': 10,
  u'body_as': u'POST',
  u'id': 2},
 {u'name': u'My Connection',
  u'is_active': True,
  u'hosts': u'\r\n',
  u'opaque1': u'{}',
  u'timeout': 5,
  u'body_as': u'POST',
  u'id': 1}]

Just to double-check it, we can also list the connections in web-admin and confirm that both are returned:


That is really it. The process is as straightforward as it can get - create a client object, choose a service to invoke, give it a dict request and a Python object is returned on output.

Note that this post covered Python only but everything applies to REST and OpenAPI-based clients too - the possibilities to interact with the public API are virtually limitless and may include deployment automation, tools to test installation procedures or custom command and control centers and administration dashboards.

This post describes a couple of new techniques that Zato 3.0 employs to make API servers start up faster.

When a Zato server starts, it carries out a series of steps, one of which is the deployment of internal API services. There are 550+ internal services, which means 550+ individual features that can be made use of - REST, publish/subscribe, SSO, AMQP, IBM MQ, Cassandra, caching, SAP, Odoo, and hundreds more pieces are available.

Yet, what the internal services have in common is that they change relatively infrequently. They do change from time to time, but this does not happen very often. This realization led to the creation of a start-up cache of internal services.

Auto-caching on first deployment

Observe the output when a server is started right after installation, with all the internal services about to be deployed along with some of the user-defined ones.

In this particular case, the server needed around 8.5 seconds to deploy its internal services but, while doing so, it also cached them all for later use.

Now, when the same server is stopped and started again, the output will be different. Nothing changed as far as user-defined services go but things changed with regard to the internal ones - the server still deployed the internal services but it did so by re-using the cache created above and, consequently, only 3 seconds were needed to deploy them.

Such a cache of internal services is created and maintained by Zato automatically, no user action is required.

Disabling internal services

Auto-caching is already a nice improvement but it is possible to go one better. By default, servers deploy all of the internal services that exist - this is because users may want to choose in their projects any and all of the features that the internal services represent.

However, in practice, most projects will use a select few technologies, e.g. REST and AMQP, or REST, IBM MQ, SAP and ElasticSearch, or any other combination, but not all of what is possible.

This explains the addition of a new feature which allows one to disable all the internal services that are known not to be needed in a particular project.

When you open a given server's server.conf file, you will find entries in the [deploy_internal] stanza whose subset is below. Note that if your Zato 3.0 version does not have it, you can copy the stanza over from a newly created server.

The list contains not the internal services as such but the Python modules the services belong to; each module concerns a particular feature or technology - AMQP, JMS IBM MQ, WebSockets, Amazon S3 and so on. Thus, if something is not needed, you can simply change True to False for each module that is not used.
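A fragment of such a stanza could look as below; the module names follow the naming pattern of the internal services but are shown for illustration rather than as an exhaustive list:

```ini
[deploy_internal]
zato.server.service.internal.channel.amqp=False
zato.server.service.internal.channel.jms_wmq=False
zato.server.service.internal.http_soap=True
```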

But you need to keep in mind that all the internal services were already cached before, so, having changed True to False in as many places as needed, we also need a way to recreate the cache.

This is done by specifying the --sync-internal flag when servers are started; observe below what happens when some of the internal services were disabled and the flag was provided.

All the user-defined services deployed as previously but the cache for the internal ones was recreated and only some of them were deployed - only the ones needed in this particular project, which happens to primarily include REST, WebSockets, Vault and publish/subscribe.

Note that even without the cache, the server needed only 4.1 seconds to deploy the internal services, which neatly dovetails with the fact that it previously needed 8.5 to deploy roughly twice as many.

This also means that with the cache already in place, the services will be deployed faster still, which is indeed the case below. This time the server deployed the internal services needed in this project in 1.3 seconds, which is much faster than the original 8.5 seconds.

This process can be applied as many times as needed - each time you need new functionality disabled or enabled, you just edit server.conf and restart the servers; that is it, the caches will be repopulated automatically.

With some of the services disabled, a caveat is that parts of web-admin will not be able to list or manage connections whose backend services were taken out, but this is to be expected - e.g. if FTP connections were disabled in server.conf then it will not be possible to access them in web-admin.

One final note: --sync-internal should really only be used when needed. The rationale behind the start-up cache is to make the process faster, so this flag should not be used all the time; rather, there are two cases where it needs to be used:

  • When changing which internal services to deploy, as detailed in this post
  • When applying updates to your Zato installation - some of the updates may change, delete or add new internal services, which is why the caches need to be recreated in such cases