Planning for production

This guide helps you design and install a production Zato environment that can support a wide range of business processes efficiently and effectively.


The initial questions that an architect designing a Zato environment has to answer are:

  • What business processes should the environment support? Can they all be identified in advance?
  • Is it an environment for a single project, a single initiative or a multi-year stream of initiatives?
  • Is it an environment for an entire department, a company or for multiple companies?
  • What kind of communication patterns, techniques and means should it support? Is it mainly request-response, mainly asynchronous or mainly batch transfer communication? Is it a mix of them all?
  • What are the high availability requirements? Depending on the business context, some environments cannot ever tolerate a second of downtime whereas some can be offline for a few minutes or hours.
  • What are the development and operations team's automation skills?
  • Where do you deploy? Is there going to be a single cloud provider or many providers? Do you deploy on premises or in hybrid environments as well?
  • What are your business continuity and disaster recovery procedures?
  • Are you thoroughly familiar with the architecture of the platform?

The rest of this document assumes that you have answers to these questions because the concrete, technical decisions outlined below follow from them.

Deployment to production should be a formality only

The vast majority of projects will require at least one environment other than production before a solution can be moved to production. That environment is typically called dev, development or testing.

It means that the minimum number of environments you need to create is two: one for the actual production needs and at least one other. If there are only two environments, it is up to you to decide whether the one in which development and testing were conducted becomes production or whether production is created separately.

However, deployment to production should be nothing more than a mere formality. In a well planned, and at least partly automated, environment, deploying a solution to production should not be much more than a repetition of steps that have already been conducted in another environment.

Whether to use two, three or more environments in addition to production will depend on a particular business and technical situation - in some cases there will be separate development environments for each developer, or separate integration, user acceptance testing (UAT), testing, pre-production or mirror environments.

Plan to automate your configuration and deployments and consistently use the same automation tools in different environments. Again, the expected end result is that a production environment, as important as it is, should be simply the same set of tools and procedures, parameterized differently, e.g. pointing to different databases or resources.
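
One way to picture "the same set of tools and procedures, parameterized differently" is a single deployment routine that reads its settings from a per-environment table. The sketch below is illustrative only: the environment names, database hosts and step descriptions are hypothetical and are not Zato settings or commands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvParams:
    # Hypothetical per-environment settings; only the values differ.
    name: str
    db_host: str
    server_count: int


# Hypothetical environments; a real site would load these from its own tooling.
ENVIRONMENTS = {
    'dev': EnvParams('dev', 'db.dev.example.com', 1),
    'uat': EnvParams('uat', 'db.uat.example.com', 2),
    'prod': EnvParams('prod', 'db.prod.example.com', 4),
}


def deployment_plan(env_name):
    """Return the ordered deployment steps for one environment.

    The steps are identical everywhere; only the parameters change.
    """
    params = ENVIRONMENTS[env_name]
    return [
        f'create {params.server_count} server(s) for {params.name}',
        f'point servers at database {params.db_host}',
        f'import configuration for {params.name}',
        f'run smoke tests against {params.name}',
    ]
```

Because `deployment_plan('prod')` returns the same steps as `deployment_plan('dev')` with different values filled in, a production deployment repeats what was already rehearsed elsewhere.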

Online transactions vs. batch processing

Processes that work mainly with small, short requests and immediate responses based on input from users are termed online transactions, whereas work such as daily processing of invoices or customer orders using file transfer is termed batch processing.

Both types of processes will inherently compete for server CPUs and RAM. E.g. a nightly job to process large files downloaded from SFTP may completely saturate several CPUs and many GBs of RAM. During that time, processes involved in online transactions will not have access to these server resources.

Hence, some mixing of the two types of processes in one environment is usually acceptable but, as a rule, if you design a high-performance online transaction processing environment, do not use it for batch processing, or the other way around. Instead, create two or more Zato clusters, each for a specific type of process. Your production environment will then consist of two or more Zato clusters - it is perfectly natural to have one environment composed of more than one cluster. If needed, they can communicate using WebSockets, REST or other protocols.

Take advantage of quickstart clusters

Quickstart clusters are a way of setting up a new Zato environment using a single command, "zato quickstart /path/to/environment". A quickstart cluster is a fully functional, self-contained environment that includes a server, a web-based Dashboard and a load-balancer.

The resulting cluster is exactly the same as if you had set up all the individual components on your own, except that it is already pre-configured and ready to use. Create a quickstart cluster whenever possible to save time.

Quickstart clusters are also available via Docker quickstart containers which let you configure a cluster via environment variables and enmasse files. They are a convenient way of setting up new environments that do not require uninterrupted HA (High Availability).

HA - if downtime can be accepted

Downtime is never a pleasant state and most environments would like to avoid it. At the same time, many sites would not like to invest too much time into preventing it because there may be no tangible return from such an investment of time and resources.

For instance, if the business nature of processes dictates that they run once a day, once a month or at intervals that are wide enough for semi-automated or manual tasks to be conducted, there may be a natural preference not to design and maintain a full HA environment.

In such cases, do consider the usage of quickstart clusters first, possibly running Zato under a Docker quickstart container, and monitor its availability using a tool external to the environment.

The monitoring check may be a REST request to /zato/ping or something more site-specific, e.g. you can have the built-in scheduler trigger a task that pings an external monitoring server once a minute.
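
A minimal sketch of such an external check, using only the Python standard library, could look as follows. The `/zato/ping` path comes from this guide; the function name, the five-second timeout and the rule that only HTTP 200 counts as available are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_available(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/zato/ping answers with HTTP 200.

    Any connection error, timeout or HTTP error status counts as
    unavailable. The opener parameter exists so the check can be
    exercised without a network connection.
    """
    url = base_url.rstrip('/') + '/zato/ping'
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

A monitoring job might call `is_available('http://zato.example.com:11223')` once a minute; the host and port here are placeholders for whatever address your environment exposes.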

In this manner, should the environment become unavailable for any reason, its replacement can be brought up very quickly. It is up to you to decide in a given context whether bringing up an environment should be fully automated, i.e. whether the lack of a response to a monitoring event should trigger a rebuild automatically or whether the rebuild should be triggered manually.
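
The choice between an automatic and a manual rebuild can be captured as a small policy object. This is a sketch under stated assumptions: the class name, the threshold of three consecutive failures and the action names are hypothetical, and the actual rebuild or notification is left to your own tooling.

```python
from dataclasses import dataclass


@dataclass
class RebuildPolicy:
    failures_before_action: int = 3  # hypothetical threshold
    automatic: bool = False          # True: rebuild without human approval
    consecutive_failures: int = 0

    def record(self, ping_ok):
        """Record one monitoring result and return the action to take.

        Returns 'none', 'notify' (a person decides) or 'rebuild'.
        The failure counter resets after a success and after an action
        so that one outage does not fire on every subsequent check.
        """
        if ping_ok:
            self.consecutive_failures = 0
            return 'none'
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failures_before_action:
            return 'none'
        self.consecutive_failures = 0
        return 'rebuild' if self.automatic else 'notify'
```

Requiring several consecutive failures before acting is a design choice that avoids rebuilding an environment because of a single dropped request.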

Regardless of how the rebuild is triggered, if using quickstart clusters and zato enmasse, it should not take more than a few minutes for a new environment to become available.

HA - if non-stop availability is required

If full HA, with no downtime, is required, you need to ensure that there is always at least one server available to accept and process requests. There are two major avenues for achieving it:

  • Create a multi-server cluster with a load-balancer in front of them. This utilizes the fact that all servers in a single Zato cluster always work in an active-active manner. The load-balancer is already part of Zato although an external one can be used as well.

  • Create many small, single-server quickstart clusters with an external load-balancer in front of them. This takes advantage of the fact that creating a new quickstart cluster is a very fast operation that should be already familiar to all developers and administrators, which means that it may take less time overall to design such an HA environment. However, in this setup each cluster will have its own independent configuration database which means that WebSocket and publish/subscribe should not be used in this approach.

Scaling horizontally and vertically

Zato environments can be scaled by adding new servers, new CPUs to existing servers, or both. In each case, the result is the same - more CPUs available for message processing.

However, one of the pillars of cloud computing is that environments should scale via the addition of relatively small, inexpensive computers. In turn, it means that Zato servers should not have more than 4 CPUs assigned for main processes.

For instance, if your processing needs require the usage of 24 CPUs, create an environment comprised of 24 servers with 1 CPU each, 12 servers with 2 CPUs each, or 6 servers with 4 CPUs each rather than 2 servers with 12 CPUs each.
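
The arithmetic behind this example can be sketched as a short function that lists every even split of a CPU budget across servers with at most 4 CPUs each. The function and constant names are illustrative and not part of Zato.

```python
MAX_CPUS_PER_SERVER = 4  # limit stated in this guide for main server processes


def server_layouts(total_cpus, max_cpus_per_server=MAX_CPUS_PER_SERVER):
    """Return (server_count, cpus_per_server) pairs that split total_cpus evenly."""
    layouts = []
    for cpus in range(1, max_cpus_per_server + 1):
        if total_cpus % cpus == 0:
            layouts.append((total_cpus // cpus, cpus))
    return layouts


print(server_layouts(24))  # [(24, 1), (12, 2), (8, 3), (6, 4)]
```

For 24 CPUs this yields the three layouts named above plus 8 servers with 3 CPUs each; 2 servers with 12 CPUs each never appears because it exceeds the per-server limit.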

Choice of the operating system

Production environments must use Linux systems. All systems should be the same, i.e. consistently use Ubuntu at the same version and patch level everywhere in the environment.

Windows systems can be used for development but they are not supported as production environments.

Capacity planning

While Zato servers are fast and light on resources, a general overview such as this one cannot foresee how much RAM and how many CPUs a particular solution or environment will need. In one instance it may be one small CPU, in another it may be many bigger CPUs - this kind of estimate can be produced only during performance testing specific to a particular workload.

The only default value that can be estimated is that of the disk space needed for production purposes:

  • Assign at least 30 GB of disk space to each VM if a single VM contains one server
  • Assign at least 15 GB of disk space to each server if a single VM contains more than one server
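
The two rules above can be expressed as a small helper for sizing scripts. The function name is illustrative; the 30 GB and 15 GB figures are the minimums given in this guide.

```python
def min_disk_gb(servers_on_vm):
    """Return the minimum disk space in GB to assign to one VM."""
    if servers_on_vm < 1:
        raise ValueError('a VM must host at least one server')
    if servers_on_vm == 1:
        return 30  # a single server on the VM
    return 15 * servers_on_vm  # 15 GB per server when the VM hosts several


print(min_disk_gb(1), min_disk_gb(2), min_disk_gb(3))  # 30 30 45
```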

IBM MQ - CPU usage

IBM MQ connections use additional processes, independent of the main server ones, which means that they require additional CPUs. Each IBM MQ channel or outgoing connection uses its own process. In a busy system, assign more CPUs to a VM so that the additional processes do not share CPUs with the main server processes.
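
Since each IBM MQ channel or outgoing connection runs in its own process, the number of extra processes is simply their sum. The sketch below turns that count into a CPU figure for a VM. Note that the default of one CPU per extra process is a worst-case assumption made for illustration, not a sizing rule from this guide; tune `mq_processes_per_cpu` from your own performance tests.

```python
import math


def mq_process_count(channels, outgoing_connections):
    """Each IBM MQ channel or outgoing connection uses its own process."""
    return channels + outgoing_connections


def cpus_for_vm(main_cpus, channels, outgoing_connections, mq_processes_per_cpu=1):
    """Return main CPUs plus CPUs set aside for the IBM MQ processes.

    mq_processes_per_cpu is an assumption: 1 means no sharing at all,
    higher values let several lightly loaded MQ processes share a CPU.
    """
    extra = mq_process_count(channels, outgoing_connections)
    return main_cpus + math.ceil(extra / mq_processes_per_cpu)


print(cpus_for_vm(4, channels=2, outgoing_connections=1))  # 7
```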

Oracle DB - do not use more than ten connections in a pool

Do not use more than 10 connections in a single Oracle DB pool, otherwise the underlying Oracle DB SQL library may begin to return error KPEDBG_HDL_PUSH_FCPTRMAX, which will automatically restart the main server process, breaking any requests that are already in progress.
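
A limit like this is easy to enforce before deployment with a check over the planned pool sizes. The sketch below assumes the pool definitions are available as a plain mapping of pool name to size; the mapping and the function name are hypothetical, and how you extract pool sizes from your configuration is site-specific.

```python
MAX_ORACLE_POOL_SIZE = 10  # limit stated in this guide


def oversized_oracle_pools(pools):
    """Return the sorted names of pools whose size exceeds the limit.

    pools is a hypothetical mapping of pool name to connection count.
    """
    return sorted(
        name for name, size in pools.items() if size > MAX_ORACLE_POOL_SIZE
    )


print(oversized_oracle_pools({'crm': 10, 'billing': 12}))  # ['billing']
```

If more than 10 connections are needed overall, one option consistent with the limit is to define several pools of at most 10 connections each.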

Publish/subscribe considerations

If you use publish/subscribe, each server may use 1 CPU. With such environments, scale them by adding more servers rather than by adding more CPUs.

WebSocket considerations

If you use WebSocket connections, each server may use 1 CPU. As with publish/subscribe, scale such environments by adding more servers rather than by adding more CPUs.