Intro to Systems Theory and STAMP John Thomas and Nancy Leveson. All rights reserved.

Why do we need something different?
- Fast pace of technological change
- Reduced ability to learn from experience
- Changing nature of accidents
- New types of hazards
- Increasing complexity and coupling
- Decreasing tolerance for single accidents
- Difficulty in selecting priorities and making tradeoffs
- More complex relationships between humans and automation
- Changing regulatory and public views of safety

STAMP (System-Theoretic Accident Model and Processes)
- A new, more powerful accident causation model
- Based on systems theory, not reliability theory
- Treats accidents as a dynamic control problem (vs. a failure problem)
- Includes
  - Entire socio-technical system (not just technical part)
  - Component interaction accidents
  - Software and system design errors
  - Human errors

Introduction to Systems Theory
Ways to cope with complexity
1. Analytic Reduction
2. Statistics
[Recommended reading: Peter Checkland, Systems Thinking, Systems Practice, John Wiley, 1981]

Analytic Reduction
- Divide system into distinct parts for analysis
  - Physical aspects: separate physical components
  - Behavior: events over time
- Examine parts separately
- Assumes such separation is possible:
  1. The division into parts will not distort the phenomenon
     - Each component or subsystem operates independently
     - Analysis results are not distorted when components are considered separately

Analytic Reduction (2)
  2. Components act the same when examined singly as when playing their part in the whole
     - Components or events are not subject to feedback loops and non-linear interactions
  3. Principles governing the assembling of components into the whole are themselves straightforward
     - Interactions among subsystems are simple enough that they can be considered separately from the behavior of the subsystems themselves
     - Precise nature of interactions is known
     - Interactions can be examined pairwise
Called "Organized Simplicity"

Statistics
- Treat system as a structureless mass with interchangeable parts
- Use Law of Large Numbers to describe behavior in terms of averages
- Assumes components are sufficiently regular and random in their behavior that they can be studied statistically
Called "Unorganized Complexity"
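As a rough illustration of the statistical approach (not part of the original slides), the sketch below treats a system as many interchangeable, independent parts and describes it only by an average. The failure probability, the seed, and the function name are assumptions made up for the example.

```python
import random


def observed_failure_fraction(n_parts, p_fail, seed=0):
    """Fraction of n_parts independent, identical parts that fail.

    Hypothetical model: every part fails with the same probability p_fail,
    independently of all the others.
    """
    rng = random.Random(seed)
    failures = sum(1 for _ in range(n_parts) if rng.random() < p_fail)
    return failures / n_parts


if __name__ == "__main__":
    # The observed fraction settles near p_fail as the number of parts grows.
    for n in (10, 1_000, 100_000):
        print(n, observed_failure_fraction(n, p_fail=0.1))
```

The average is informative here only because the parts are assumed to be independent and interchangeable. Once parts influence one another, that assumption no longer holds.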

Complex, Software-Intensive Systems
- Too complex for complete analysis
  - Separation into (interacting) subsystems distorts the results
  - The most important properties are emergent
- Too organized for statistics
  - Too much underlying structure that distorts the statistics
Called "Organized Complexity"

[Figure] From Leveson, Nancy (2012). Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press, Massachusetts Institute of Technology. Used with permission.

Systems Theory
- Developed for biology (von Bertalanffy) and engineering (Norbert Wiener)
- Basis of system engineering and system safety
  - ICBM systems of the 1950s
- Developed to handle systems with organized complexity

Systems Theory (2)
- Focuses on systems taken as a whole, not on parts taken separately
- Some properties can only be treated adequately in their entirety, taking into account all social and technical aspects
- These properties derive from relationships among the parts of the system
  - How they interact and fit together
- Two pairs of ideas
  1. Hierarchy and emergence
  2. Communication and control

Hierarchy and Emergence
- Complex systems can be modeled as a hierarchy of organizational levels
  - Each level more complex than the one below
  - Levels characterized by emergent properties
    - Irreducible
    - Represent constraints on the degree of freedom of components at lower level
- Safety is an emergent system property
  - It is NOT a component property
  - It can only be analyzed in the context of the whole

Example Safety Control Structure
[Figure] From Leveson, Nancy (2012). Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press, Massachusetts Institute of Technology. Used with permission.

[Figure] Courtesy of Qi D. Van Eikema Hommes. Used with permission.

Example High-Level Control Structure for ITP

Safety Constraints
- Each component in the control structure has
  - Assigned responsibilities, authority, accountability
  - Controls that can be used to enforce safety constraints
- Each component's behavior is influenced by
  - Context (environment) in which it is operating
  - Knowledge about current state of process

Communication and Control
- Hierarchies characterized by control processes working at the interfaces between levels
- Control in open systems implies need for communication

Control processes operate between levels of control
[Figure: control loop. The Controller (goal condition, model condition) issues control actions through an Actuator (action condition) to the Controlled Process; a Sensor (observability condition) returns feedback to the Controller.]

Every Controller Contains a Process Model
- Accidents occur when model of process is inconsistent with real state of process and controller provides inadequate control actions
- Feedback channels are critical
  - Design
  - Operation
[Figure: the Controller, containing a Model of Process, sends control actions to the Controlled Process and receives feedback from it.]
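The role of the process model can be sketched with a toy control loop. This is a minimal illustration under stated assumptions, not something taken from the slides: the tank, its levels, the fixed inflow of 2 units, and all class and function names are hypothetical. The controller decides from what it believes the level to be, so when feedback is lost the belief goes stale and the same algorithm drives the process past its limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tank:
    """Controlled process: a tank with a true fluid level (hypothetical)."""
    level: float = 5.0

    def apply(self, inflow: float) -> None:
        self.level += inflow


@dataclass
class Controller:
    """Controller that decides from its process model, not the real state."""
    believed_level: float = 5.0    # the process model
    max_safe_level: float = 10.0   # safety constraint: level must stay <= this

    def update_model(self, measured_level: Optional[float]) -> None:
        # Feedback updates the process model; None models a lost measurement.
        if measured_level is not None:
            self.believed_level = measured_level

    def control_action(self) -> float:
        # Add 2 units whenever the model says there is still room for them.
        return 2.0 if self.believed_level + 2.0 <= self.max_safe_level else 0.0


def run(steps: int, feedback_works: bool) -> float:
    """Run the loop and return the true final level of the tank."""
    tank, ctrl = Tank(), Controller()
    for _ in range(steps):
        tank.apply(ctrl.control_action())
        ctrl.update_model(tank.level if feedback_works else None)
    return tank.level


if __name__ == "__main__":
    print("with feedback:   ", run(5, feedback_works=True))
    print("without feedback:", run(5, feedback_works=False))
```

Nothing in the controller "fails" in the second run. The control algorithm is unchanged, and the unsafe outcome comes entirely from the mismatch between the model and the real state.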

Relationship Between Safety and Process Models
- How do they become inconsistent?
  - Wrong from beginning
  - Missing or incorrect feedback
  - Not updated correctly
  - Time lags not accounted for
- Resulting in
  - Uncontrolled disturbances
  - Unhandled process states
  - Inadvertently commanding system into a hazardous state
  - Unhandled or incorrectly handled system component failures

Relationship Between Safety and Process Models (2)
- Accidents occur when models do not match process and
  - Required control commands are not given
  - Incorrect (unsafe) ones are given
  - Correct commands given at wrong time (too early, too late)
  - Control stops too soon or applied too long
- Explains software errors, human errors, component interaction accidents
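The four ways control can be inadequate, as listed above, can be written down as a small classifier. This is only a sketch: the function signature, the idea of a permitted time window, and the permitted duration range are assumptions introduced for illustration, not definitions from the course.

```python
from enum import Enum
from typing import Optional, Tuple


class UnsafeControl(Enum):
    NOT_PROVIDED = "required control command not given"
    UNSAFE_PROVIDED = "incorrect (unsafe) command given"
    WRONG_TIMING = "correct command given too early or too late"
    WRONG_DURATION = "control stopped too soon or applied too long"


def classify(
    required: bool,
    provided: bool,
    issued_at: Optional[float] = None,
    window: Optional[Tuple[float, float]] = None,
    duration: Optional[float] = None,
    allowed_duration: Optional[Tuple[float, float]] = None,
) -> Optional[UnsafeControl]:
    """Return the kind of inadequate control, or None if none is detected.

    All parameters are hypothetical: `window` is the interval in which the
    command is safe to issue, `allowed_duration` the safe range for how long
    the control is applied.
    """
    if required and not provided:
        return UnsafeControl.NOT_PROVIDED
    if provided and not required:
        return UnsafeControl.UNSAFE_PROVIDED
    if not provided:
        return None  # not required and not given
    if window is not None and issued_at is not None:
        if not (window[0] <= issued_at <= window[1]):
            return UnsafeControl.WRONG_TIMING
    if allowed_duration is not None and duration is not None:
        if not (allowed_duration[0] <= duration <= allowed_duration[1]):
            return UnsafeControl.WRONG_DURATION
    return None


if __name__ == "__main__":
    print(classify(required=True, provided=False))
    print(classify(True, True, issued_at=12.0, window=(0.0, 10.0)))
```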

Relationship Between Safety and Human Mental Models
- Explains most human/computer interaction problems
- Explains many operator errors
- Also explains developer errors. May have incorrect model of
  - Required system or software behavior for safety
  - Development process
  - Physical laws
  - Etc.

Potential Control Flaws
- Controller
  - Inadequate control algorithm (flaws in creation, process changes, incorrect modification or adaptation)
  - Process model inconsistent, incomplete, or incorrect
  - Control input or external information wrong or missing
  - Missing or wrong communication with another controller
- Control path (controller to actuator to process)
  - Inappropriate, ineffective, or missing control action
  - Actuator: inadequate operation
  - Delayed operation
- Controlled process
  - Component failures
  - Changes over time
  - Conflicting control actions
  - Process input missing or wrong
  - Unidentified or out-of-range disturbance
  - Process output contributes to system hazard
- Feedback path (process to sensor to controller)
  - Sensor: inadequate operation
  - Incorrect or no information provided
  - Measurement inaccuracies
  - Feedback delays
  - Inadequate or missing feedback

STAMP: System-Theoretic Accident Model and Processes

STAMP: Safety as a Control Problem
- Safety is an emergent property that arises when system components interact with each other within a larger environment
  - A set of constraints related to behavior of system components (physical, human, social) enforces that property
  - Accidents occur when interactions violate those constraints (a lack of appropriate constraints on the interactions)
- Goal is to control the behavior of the components and systems as a whole to ensure safety constraints are enforced in the operating system.
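One way to picture "safety as a set of constraints" is to state the constraints over the state of the whole system and check which ones a given state violates. The sketch below does this for an invented batch reactor. The state fields, the two constraints, and the 150 C limit are all assumptions for illustration and are not taken from the course material.

```python
from typing import Callable, Dict, List

State = Dict[str, object]
Constraint = Callable[[State], bool]

# Hypothetical system-level safety constraints for a batch reactor.
# Each one refers to the system state, not to a single component in isolation.
CONSTRAINTS: Dict[str, Constraint] = {
    "cooling water must flow whenever the catalyst valve is open":
        lambda s: bool(s["cooling_on"]) or not s["catalyst_valve_open"],
    "reactor temperature must stay below 150 C":
        lambda s: s["temperature_c"] < 150,
}


def violated(state: State) -> List[str]:
    """Names of the safety constraints that do not hold in this state."""
    return [name for name, holds in CONSTRAINTS.items() if not holds(state)]


if __name__ == "__main__":
    # The valve opened as commanded and the cooling pump is off as commanded:
    # no component has failed, yet the combination violates a constraint.
    state = {"cooling_on": False, "catalyst_valve_open": True, "temperature_c": 90}
    print(violated(state))
```

The first constraint cannot be checked by looking at the valve or the cooling pump alone, which is the sense in which the property belongs to the interaction rather than to a component.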

STAMP (2)
- Treats safety as a dynamic control problem rather than a component failure problem.
  - O-ring did not control propellant gas release by sealing gap in field joint of Challenger Space Shuttle
  - Software did not adequately control descent speed of Mars Polar Lander
  - Temperature in batch reactor not adequately controlled in system design
  - Public health system did not adequately control contamination of the milk supply with melamine
  - Financial system did not adequately control the use of financial instruments
- Events are the result of the inadequate control
  - Result from lack of enforcement of safety constraints in system design and operations

STAMP (3)
- A change in emphasis: from "prevent failures" to "enforce safety constraints on system behavior"
- Losses are the result of complex dynamic processes, not simply chains of failure events
- Most major accidents arise from a slow migration of the entire system toward a state of high risk
  - Need to control and detect this migration
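Detecting a slow migration toward higher risk implies watching some indicator over time rather than waiting for a failure event. The slides do not prescribe a detection method, so the following is only one simple possibility: compare the recent average of an indicator with its early baseline. The indicator (overdue maintenance items per month), the window size, and the tolerance are all assumptions.

```python
from statistics import mean
from typing import Sequence


def drifting_toward_risk(
    samples: Sequence[float], window: int = 4, tolerance: float = 0.10
) -> bool:
    """True if the mean of the last `window` samples exceeds the mean of the
    first `window` samples by more than `tolerance` (as a fraction).

    Returns False when there are too few samples to compare two windows.
    """
    if len(samples) < 2 * window:
        return False
    baseline = mean(samples[:window])
    recent = mean(samples[-window:])
    return recent > baseline * (1 + tolerance)


if __name__ == "__main__":
    # Hypothetical count of overdue maintenance items per month.
    overdue = [2, 2, 3, 2, 3, 3, 4, 4, 5, 5, 6, 6]
    print(drifting_toward_risk(overdue))
```

No single month in the example looks alarming compared with the one before it. The drift only shows up when the recent period is compared with the baseline.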

Summary: Accident Causality
Accidents occur when
- Control structure or control actions do not enforce safety constraints
  - Unhandled environmental disturbances or conditions
  - Unhandled or uncontrolled component failures
  - Dysfunctional (unsafe) interactions among components
- Control actions inadequately coordinated among multiple controllers
- Control structure degrades over time (asynchronous evolution)

A Third Source of Risk
- Control actions inadequately coordinated among multiple controllers
  - Boundary areas [Figure: Controller 1 controls Process 1; Controller 2 controls Process 2]
  - Overlap areas (side effects of decisions and control actions) [Figure: Controller 1 and Controller 2 both control a single Process]
Copyright Nancy Leveson, Aug. 2006

Uncoordinated Control Agents
SAFE STATE: TCAS provides coordinated instructions to both planes
[Figure: Control Agent (TCAS) sends instructions to both planes; Control Agent (ATC) is also shown. Clip art source: Public Domain, OpenClipArt.]

Uncoordinated Control Agents
SAFE STATE: ATC provides coordinated instructions to both planes
[Figure: Control Agent (ATC) sends instructions to both planes; Control Agent (TCAS) is also shown. Clip art source: Public Domain, OpenClipArt.]

Uncoordinated Control Agents
UNSAFE STATE: BOTH TCAS and ATC provide uncoordinated and independent instructions
[Figure: Control Agent (TCAS) and Control Agent (ATC) each send instructions to both planes, with no coordination between the two agents. Clip art source: Public Domain, OpenClipArt.]
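The three cases above can be mimicked with a small check for aircraft that receive contradictory instructions from different control agents. This is a deliberately simplified sketch and does not describe how TCAS or ATC actually operate. The aircraft identifiers, the instruction vocabulary, and the function name are hypothetical.

```python
from typing import Dict, List

# aircraft id -> {control agent name: instruction}
Instructions = Dict[str, Dict[str, str]]


def conflicting_aircraft(instructions: Instructions) -> List[str]:
    """Aircraft that were given different instructions by different agents."""
    return sorted(
        aircraft
        for aircraft, by_agent in instructions.items()
        if len(set(by_agent.values())) > 1
    )


if __name__ == "__main__":
    # One agent instructs both aircraft: the instructions are coordinated.
    tcas_only = {"A": {"TCAS": "climb"}, "B": {"TCAS": "descend"}}
    # Two agents act independently: aircraft A gets contradictory instructions.
    both = {
        "A": {"TCAS": "climb", "ATC": "descend"},
        "B": {"TCAS": "descend"},
    }
    print(conflicting_aircraft(tcas_only))
    print(conflicting_aircraft(both))
```

Each agent's instructions are consistent when taken alone. The conflict exists only in the overlap, where two controllers act on the same process.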

[Figure] From Leveson, Nancy (2012). Engineering a Safer World: Systems Thinking Applied to Safety. MIT Press, Massachusetts Institute of Technology. Used with permission.

Uses for STAMP
- More comprehensive accident/incident investigation and root cause analysis
- Basis for new, more powerful hazard analysis techniques (STPA)
- Safety-driven design (physical, operational, organizational)
  - Can integrate safety into the system engineering process
  - Assists in design of human-system interaction and interfaces
- Organizational and cultural risk analysis
  - Identifying physical and project risks
  - Defining safety metrics and performance audits
  - Designing and evaluating potential policy and structural improvements
  - Identifying leading indicators of increasing risk ("canary in the coal mine")

MIT OpenCourseWare
http://ocw.mit.edu

16.63J / ESD.03J System Safety
Fall 2012

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.