Have you ever asked yourself, “What would I do if my PLC processor stopped working?” If so, was your answer “I will just pull the cold standby off the shelf” or “The hot standby will take care of it”? Choosing the right option for your facility depends on the impact of downtime and on the budget you have available. If you are operating a water system, consider questions such as: How long will your reservoirs take to drain? Can you run or shut down your facilities in manual control, or is the PLC required for these operations? Would a couple of days without operation be a big deal?

This article discusses four redundancy options:

  • Hot standby
  • Cold standby
  • Critical spares
  • None

A hot standby system is the Cadillac of redundancy, as it involves two identical systems running in parallel. If one system fails, the backup takes over. In the automation domain this is often seen as two PLCs/PACs or two power supplies. In the example in Figure 1 there is a primary PAC (PAC A) and a redundant PAC (PAC B). If PAC A fails, PAC B takes over operating the process. Both processors are kept on the same version of the program and of the firmware. This is a great option when your facility cannot run in manual control, or when significant downtime is not acceptable. The example in Figure 1 has a redundant processor, redundant power supplies, and a redundant network loop. Because the system is so critical, this implementation uses multiple layers of redundancy to minimize the risk of downtime.

Figure 1: Hot Standby PLC
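
To make the takeover behaviour concrete, here is a minimal sketch of the decision a redundant pair makes. It is an illustration only: real redundant PACs handle health monitoring and program synchronization in their own hardware and firmware, and the `Controller` class and `select_active` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Controller:
    """Hypothetical stand-in for one PAC in a redundant pair."""
    name: str
    healthy: bool = True


def select_active(primary: Controller, standby: Controller) -> Optional[Controller]:
    """Return the controller that should be running the process.

    The primary keeps control while it is healthy; otherwise the standby
    takes over. None means both have failed and nothing is in control.
    """
    if primary.healthy:
        return primary
    if standby.healthy:
        return standby
    return None


pac_a = Controller("PAC A")
pac_b = Controller("PAC B")

active_before = select_active(pac_a, pac_b)  # PAC A is in control
pac_a.healthy = False                        # simulate a processor fault
active_after = select_active(pac_a, pac_b)   # PAC B takes over
```

The sketch also shows why the redundant pair must run the same program and firmware: the standby is only useful if it can pick up the process exactly where the primary left off.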

A cold standby system is similar to a hot standby, except that the backup system is not running until it is needed. In the automation domain, both PLCs/PACs still have the same programming and firmware, but keeping them in sync is done manually rather than automatically as in a hot standby configuration. The backup processor can sit on a shelf or already be mounted in the control panel, ready to be put into service when an incident occurs. This spare is process dependent; that is, the processor is intended to replace one specific PLC/PAC and not any other elsewhere.

For a system where a short outage is not a major concern, but an extended outage is, critical spares are a great option. Critical spares can include an array of spare parts sitting on the shelf, ready to be used should a component fail. The exact list of spare parts and the quantity of each will depend on your facility and can be determined by a risk assessment.
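
As a rough illustration of how such a risk assessment might be structured, the sketch below ranks components by the expected days of outage per year if no spare is held. The scoring formula (failure rate multiplied by replacement lead time) is an assumption chosen for simplicity, not a standard method, and every number in the example inventory is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    name: str
    failures_per_year: float  # assumed estimate, e.g. from maintenance records
    lead_time_days: float     # assumed time to obtain a replacement with no spare
    process_stops: bool       # True if the process cannot run without this part


def risk_score(c: Component) -> float:
    """Expected days of outage per year if no spare is held (assumed metric)."""
    if not c.process_stops:
        return 0.0
    return c.failures_per_year * c.lead_time_days


def rank_spares(components: List[Component]) -> List[Component]:
    """Order components from highest to lowest risk score."""
    return sorted(components, key=risk_score, reverse=True)


# Hypothetical inventory; replace with figures from your own facility.
inventory = [
    Component("PLC processor", 0.05, 60, True),
    Component("PLC power supply", 0.10, 14, True),
    Component("Office HMI", 0.20, 10, False),
]
ranked = rank_spares(inventory)
```

Components at the top of the ranking are the first candidates for the spares shelf. A real assessment would also weigh factors such as part cost, safety impact, and whether the process can run in manual control.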

Some common critical spares can include:

  1. Commonly used PLC power supplies, processors, and communication adapters
  2. PLC I/O cards
  3. HMIs
  4. VFDs
  5. Instrumentation critical to the continuous operation of your process
  6. SCADA server or computer
  7. Networking Equipment:
    1. Cellular gateways
    2. Network switches
    3. Radios

Figure 2: Spare Cellular Gateways

Instead of keeping extra parts in your facility, you can also subscribe to critical spare programs offered by vendors or manufacturers. Depending on which parts you need and how many, this may be an option worth considering. These relationships with suppliers can help you manage the inventory in your facility, or give you access to a shared inventory held at a vendor's facility. The latter may cost less but could increase the time to receive and install the spare.

One major consideration when relying on critical spares is whether you have up-to-date program backups. For PLCs/PACs and network equipment, a critical spare is only as valuable as the backup that will be downloaded to it. Some questions to consider:

  1. When was the last time the automation equipment (PLC/PAC, HMI, SCADA, etc.) was backed up?
  2. If the backups are done automatically, have they been validated? That is, are the files usable and free of corruption?
  3. Where are the backups stored?

If something goes wrong on any of these three points, the result may be a complete or partial rewrite of your programming. If your integrator has to rewrite the program before the critical spare can be installed, you may lose all the time and cost advantages you gained by having the spares in the first place.
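
Part of this validation can be automated. The sketch below checks that a backup file exists, is not empty, is recent enough, and still matches a SHA-256 checksum recorded when the backup was taken. The function names, the age threshold, and the idea of storing a checksum alongside each backup are assumptions made for illustration; they are not features of any particular PLC vendor's tooling.

```python
import hashlib
import time
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_backup(path: Path, expected_sha256: str, max_age_days: float) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    if not path.is_file():
        return [f"{path} is missing"]
    problems = []
    stat = path.stat()
    if stat.st_size == 0:
        problems.append(f"{path} is empty")
    age_days = (time.time() - stat.st_mtime) / 86400  # seconds per day
    if age_days > max_age_days:
        problems.append(f"{path} is {age_days:.0f} days old")
    if sha256_of(path) != expected_sha256:
        problems.append(f"{path} does not match its recorded checksum")
    return problems
```

A matching checksum only shows that the file has not changed since the checksum was recorded. It does not prove that the backup opens in the programming software or reflects the program currently running, so a periodic test restore onto spare hardware is still worthwhile.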

The final option for redundancy is none. That is, you knowingly accept the possibility of a failure and do not prepare any spares. Although the upfront cost is minimal, this is typically the costliest of the four options once an incident occurs. When an event occurs, you will have to place rush orders, pay for unplanned overtime, and suffer lost production. The duration of the outage is also unpredictable, as shipping times can fluctuate tremendously and may not work in your favour. With current worldwide shortages, some parts may have lead times of several months, which may mean you need to design and implement a workaround.

This table summarizes the options discussed above and compares their upfront and long-term costs, as well as the time needed to return the plant to operation after an outage.

Depending on your budget and allowable downtime, you may implement a combination of these options. To determine which option to use and where, we recommend evaluating the criticality of your current equipment and determining which components are critical to maintaining operational continuity. If you need assistance with this evaluation or with implementing any of these options, please give us a call at 250-372-1486 or contact us at https://xenoncyber.ca/contact-us/.

Jason Marchese, P.Eng PMP
Project Engineer