ICS security fundamentals: systems that move physics
What makes ICS security different from IT: availability-first priorities, decade-long asset lifecycles, physical consequences, and the control set that works.
Executive summary
ICS security is the discipline of protecting industrial control systems — PLCs, HMIs, SCADA servers, safety systems — where a security failure has physical consequences, not just data consequences. The priorities invert relative to IT: safety and availability lead, asset lifecycles run 15 to 30 years, and a control applied carelessly can trip the plant it was meant to protect. Drawing on four years administering both the office and process-control networks at a petrochemical plant, this article explains why OT demands different instincts, walks through the control set NIST SP 800-82r3 formalizes, and lays out where a plant starting from zero should begin.
Why OT is a different discipline, not a different subnet
ICS security protects systems whose failure modes are physical. When a file server goes down, someone cannot open a spreadsheet. When a controller goes down, a pump keeps running against a closed valve, a reactor loses its cooling loop, or a batch worth six figures goes to flare. For four years I administered both the office network and the process-control network at a Chevron Oronite petrochemical plant, and the most important thing those years taught me is that the two jobs share tools but not instincts.
Physics does not retry.
A mail queue drains when the server comes back. A process excursion does not wait politely for the reboot — it develops, on its own schedule, whether or not anyone is watching a screen. That single asymmetry drives everything else in this discipline, and it breaks down into three differences worth internalizing before you touch a control network.
The priority order inverts. IT security is taught as confidentiality-integrity-availability, in that order. In a plant it runs safety, then availability, then integrity, with confidentiality a distant fourth. Nobody at 2 a.m. cares whether flow-rate telemetry is encrypted. They care whether the operator’s screen is telling the truth and whether the control loop is still closing. Every control you propose gets evaluated against one question before any other: can this trip the process?
Asset lifecycles are measured in decades. A control system is purchased with the equipment it runs and lives 15 to 30 years. That Windows XP box in the rack is not negligence — it is the HMI that shipped with a compressor package in 2008, the vendor charges real money to replace it, and the replacement project competes with mechanical work for the same outage window. You will spend your career securing devices that predate the vulnerabilities you are defending against. “Just upgrade it” is a sentiment, not a plan.
The infrastructure is fragile in ways IT tooling does not expect. Some PLCs and older RTUs fall over when they receive traffic they were never designed to see — a full-connect port scan, a malformed packet, sometimes just an eager asset-discovery sweep. The vulnerability scan that is routine hygiene on the office network is a process-interruption risk on the control network. Anything active gets proven on a bench or during an outage before it touches production. No exceptions, including yours.
The consequences are physical, so safety systems get special treatment
Modern plants layer their protections deliberately: the basic process control system (BPCS) runs the process, alarms bring the operator into the loop, and a safety instrumented system (SIS) sits underneath as the independent layer that trips the process to a safe state when everything above it has failed. From a security architect’s chair, the SIS is the crown jewel — the last automated barrier between a cyber event and a physical accident. The TRITON malware, which I cover in the ICS threat landscape, targeted exactly this layer, and that is why it permanently changed the industry’s threat model.
The practical implications are unglamorous and non-negotiable: keep SIS engineering workstations separate from BPCS engineering workstations, keep safety controllers off shared networks wherever the design allows, and treat any key-switch or write-enable state on a safety controller as a monitored, logged, exceptional event. The independence of the safety layer is its entire value. Anything that couples it to the systems it protects against is spending that value down.
The core control set
NIST SP 800-82r3 — retitled in revision 3 as a guide to operational technology security, which reflects how far the field has widened past SCADA — is the reference I hand people, alongside the ISA/IEC 62443 series for zone-and-conduit design. Distilled to what actually moves risk, the control set looks like this, roughly in order of return on effort.
1. An asset inventory you believe. You cannot defend what you have not enumerated, and in OT you cannot enumerate by scanning. Start passive: span-port captures parsed by a protocol-aware tool, firewall logs, switch CAM tables, and the walk-down binder from the last turnaround. Reconcile against the P&IDs and the vendor’s system drawings. Expect surprises — every plant I have seen has devices nobody remembered installing.
2. Segmentation with an OT DMZ. The single highest-leverage architectural control: separate the control network from the enterprise network, and force every crossing through a DMZ where the data is brokered — historian replication, patch staging, remote-access jump hosts. No direct sessions from the office network to a controller. Ever. The zone logic is the same one I lay out in network segmentation strategy; the Purdue model gives you the OT-specific vocabulary for it.
3. Controlled remote access. Vendor and integrator remote access is the most common initial-access path into control networks I have seen firsthand. It deserves its own architecture — brokered, MFA-enforced, time-boxed, recorded — which I cover in detail in OT remote access architecture.
4. Monitoring that respects the process. Passive network monitoring on span ports is the OT-safe default: it sees Modbus writes, controller mode changes, new devices, and firmware downloads without putting a single packet onto the control network. Alert on the things that matter in OT terms — a write from an unexpected source, an engineering-protocol session outside a maintenance window, a controller moving from Run to Program mode. An IDS signature feed tuned for enterprise malware will sleep straight through all three.
5. Backups that include control logic. Back up the HMI servers and historians, yes — but also the PLC programs, controller configurations, drive parameters, and the license files and installation media for engineering software old enough that the vendor no longer distributes it. After a ransomware event or a failed controller, the recovery-time question is rarely “where is the server image.” It is “where is the last known-good program, and what changed since” — which is a version-control problem as much as a backup problem.
6. An incident plan operations has rehearsed. The plant’s decision during a cyber incident is not “contain and eradicate.” It is “can we keep running safely, and if not, how do we shut down in an orderly way.” That decision belongs to operations, informed by security — not the other way around. Write the playbook together and walk through it before the night you need it.
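The three alert conditions named under monitoring (item 4) can be sketched as plain rules over events that a passive sensor has already decoded. This is a minimal illustration, not any product's API: the `Event` fields, the `AUTHORIZED_WRITERS` set, the example addresses, and the maintenance window are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, time

# Hypothetical policy data; in practice this would come from the asset inventory.
AUTHORIZED_WRITERS = {"10.10.1.20"}              # engineering workstation allowed to write
MAINTENANCE_WINDOW = (time(8, 0), time(16, 0))   # assumed daily window, local time


@dataclass(frozen=True)
class Event:
    """One observation already decoded by a passive, protocol-aware sensor."""
    timestamp: datetime
    kind: str          # "write", "engineering_session", or "mode_change"
    source: str        # source IP address
    target: str        # controller name or address
    detail: str = ""   # for mode_change events, e.g. "Run->Program"


def in_maintenance_window(ts: datetime) -> bool:
    start, end = MAINTENANCE_WINDOW
    return start <= ts.time() < end


def evaluate(event: Event) -> list[str]:
    """Return the alert messages one event triggers (empty if none fire)."""
    alerts = []
    if event.kind == "write" and event.source not in AUTHORIZED_WRITERS:
        alerts.append(f"write to {event.target} from unexpected source {event.source}")
    if event.kind == "engineering_session" and not in_maintenance_window(event.timestamp):
        alerts.append(f"engineering session to {event.target} outside maintenance window")
    if event.kind == "mode_change" and event.detail == "Run->Program":
        alerts.append(f"{event.target} moved from Run to Program mode")
    return alerts
```

The rules only read events; nothing here sends traffic to a controller, which is the property that makes this style of monitoring safe to run against a live process.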
What patching actually looks like
The patching conversation is where IT-trained expectations go to die, so set them correctly up front. Control system vendors qualify OS and application patches against their products before approving them; applying an unqualified patch can void support on the system running your plant. Qualified patches then wait for a maintenance window, and for systems that can only come down with the process, that means a turnaround — potentially years out.
This is not an excuse to do nothing. It means the unit of work is different: maintain the vendor-approved patch list, stage approved patches in the DMZ, apply them opportunistically whenever equipment is down anyway, and cover the gap with compensating controls — tighter conduits, protocol filtering, application allow-listing on HMIs (which works unusually well in OT precisely because these systems change so rarely), and monitoring. NIST SP 800-82r3 is explicit that compensating controls are a legitimate, expected part of OT risk management. Keep that citation handy for the auditor who arrives with the enterprise patch policy in hand.
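The unit of work described above, applying vendor-approved patches only when equipment is down anyway, reduces to a small set computation. This is a sketch under stated assumptions: the `Asset` structure, asset names, and patch identifiers are hypothetical, and the approved list would in practice come from the vendor's qualification bulletins.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    installed: set[str] = field(default_factory=set)  # patch IDs already applied
    is_down: bool = False                             # True if out of service right now


def exposure_gap(asset: Asset, vendor_approved: set[str]) -> set[str]:
    """Approved-but-missing patches: the gap compensating controls must cover."""
    return vendor_approved - asset.installed


def patches_to_apply(asset: Asset, vendor_approved: set[str]) -> set[str]:
    """Approved patches to install now, and only if the asset is already down."""
    if not asset.is_down:
        return set()  # never take running equipment down just to patch
    return exposure_gap(asset, vendor_approved)


approved = {"P-001", "P-002", "P-003"}          # hypothetical vendor-approved list
hmi = Asset("HMI-A", installed={"P-001"})       # hypothetical running HMI
gap_while_running = exposure_gap(hmi, approved)
```

Keeping the gap and the apply list as separate functions mirrors the text: the gap always exists and needs compensating controls, while the apply list is empty until an outage makes the work possible.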
Tradeoffs worth being honest about
| Control | Risk reduced | What it costs you |
|---|---|---|
| Segmentation + DMZ | Lateral movement from IT | Firewall change discipline; every new data flow needs engineering |
| Passive monitoring | Detection gap | Sensor and tooling spend; tuning time to learn the process |
| Application allow-listing | Malware execution on HMIs | Vendor coordination on every software change |
| Brokered remote access | External initial access | Vendor friction; someone must administer access grants |
| Active scanning | Unknown vulnerabilities | Real process-interruption risk; bench-test first or don’t |
Every row costs something, and pretending otherwise is how security programs lose the plant’s trust. Present the cost with the control and operations will meet you halfway. Hide it and they will find it themselves, once, and remember.
Where to start
If the plant is starting from zero, the sequence that works: inventory passively, then build the IT/OT boundary and DMZ, then fix remote access, then add monitoring, then formalize backup and recovery of control logic. That order front-loads the controls that block the realistic threats — which, as the threat-landscape data shows, are ransomware spillover from the enterprise network and exposed remote access far more often than bespoke ICS malware.
And write it down as you go: the zone model, every permitted conduit and its business reason, the vendor access register, the location of the last known-good controller programs. In OT, the documentation is not bureaucracy. It is the recovery plan.
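One way to keep the conduit register both written down and checkable is to store it as data. The sketch below is illustrative only: the zone names, the register entries, and the helper functions are hypothetical, and real enforcement lives in the firewall rule base rather than in a script.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    src_zone: str
    dst_zone: str
    service: str
    reason: str  # the business reason, recorded next to the rule


# Hypothetical register: every crossing is brokered through the DMZ.
REGISTER = [
    Conduit("control", "dmz", "historian-replication", "process data for reporting"),
    Conduit("enterprise", "dmz", "historian-read", "engineering dashboards"),
    Conduit("enterprise", "dmz", "jump-host", "brokered remote access"),
]


def is_permitted(src_zone: str, dst_zone: str, service: str) -> bool:
    """A flow is allowed only if the register lists it explicitly."""
    return any(
        c.src_zone == src_zone and c.dst_zone == dst_zone and c.service == service
        for c in REGISTER
    )


def direct_it_ot_conduits(register: list[Conduit]) -> list[Conduit]:
    """Entries that bypass the DMZ; in this model there should be none."""
    pair = {"enterprise", "control"}
    return [c for c in register if {c.src_zone, c.dst_zone} == pair]
```

A register in this form can be diffed between reviews, and an empty result from `direct_it_ot_conduits(REGISTER)` is a quick check that no documented conduit connects the office network straight to the control network.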
The equipment will keep aging and the tooling will keep changing. The discipline underneath — know what you have, bound what can reach it, watch what it does, and be able to put it back — does not. That is not an OT insight, in the end. It is what security has always been, applied to systems that happen to move physics.
Frequently asked questions
- What is ICS security?
- ICS security is the protection of industrial control systems (the PLCs, RTUs, HMIs, SCADA servers, and safety instrumented systems) that run physical processes. Its priorities differ from IT security: safety and availability come first, because the worst-case outcome is not a leaked database but a tripped plant, damaged equipment, or an injured operator. In practice the discipline sits closer to process safety engineering than to enterprise security.
- How is OT security different from IT security?
- The priority order inverts. IT security protects data, so confidentiality leads. OT security protects a running process, so safety and availability lead and confidentiality barely places. OT assets also live 15 to 30 years, patch only during planned outages, and can crash when actively scanned. Every IT habit — scanning, pushing agents, rebooting to fix things — has to be re-qualified before it touches a control network.
- Can you patch industrial control systems?
- Yes, but on the plant's schedule, not the vendor's. Patches wait for vendor qualification, then for a maintenance window or, for systems that can only come down with the process, a turnaround that may be years away. The gap is covered with compensating controls: segmentation, protocol-aware filtering, application allow-listing on HMIs, brokered remote access, and monitoring. NIST SP 800-82r3 treats compensating controls as legitimate OT risk management, not as an admission of failure.
- Are air-gapped control systems still a thing?
- Mostly not, and believing in one is itself a risk. Historian replication, vendor remote support, patch media, and engineering laptops all create paths into networks everyone swears are isolated. The safer engineering posture is to assume connectivity exists, enumerate every path, and put a control on each one. An air gap you have not personally verified is a diagram, not a defense.