ACTION PLAN TO AVOID FURTHER OUTAGES

Paris, October 18th, 2013

These past few months, France-IX has undergone a series of outages which highlighted the limits and weaknesses of the current network design.

France-IX wants to assure its members that the whole team, with the help of experts from the internet community, is taking all the necessary steps to avoid going through such incidents again.

The present action plan aims to give details about the events, the technical reasons behind them and the list of actions that will be taken as soon as possible to return to a resilient and available infrastructure.

1/ THE CURRENT DESIGN

FranceIX relies on a heterogeneous infrastructure including:
- MPLS in the core sites (Telehouse-2 and Interxion-5) and in the Interxion-2 and SFR Netcenter (Marseille) sites.
- Spanning-tree Ethernet on all the edge sites (Interxion-1, TelecityGroup Courbevoie, TelecityGroup Condorcet and Iliad-Vitry).

Our infrastructure backbone is based on MPLS logical static links (LSPs relying on RSVP), which ensures the best traffic and capacity engineering on the platform. We are using Brocade MLX 8/16/16e equipment with code version 5.2h because this version is considered stable.

The other edge sites are pure Layer 2 and host other equipment. We have old Force10 E1200 on the Iliad-Vitry, Interxion-1 and TelecityGroup Courbevoie sites. This equipment runs the code FTOS-ED-6.5.4.3, which is the latest stable code version available for this hardware.

In TelecityGroup Condorcet, a Force10 S4810 is installed with code version FTOS-SE-8.3.12.1.

The edge Layer 2 sites are connected directly to the VPLS fabric through 2 separate paths to ensure redundancy.

[Figure: FranceIX's current network topology]

2/ ENCOUNTERED ISSUES

Over the past few months we have faced some major outages on the platform and we identified several recurring issues.

Port flapping:

On several occasions, we had ports flapping between the MPLS routers, a phenomenon that can happen in any live network. On FranceIX's infrastructure, a defective link should not be a problem since the network has been overprovisioned. Our MPLS links (LSPs) are configured with Fast Reroute and BFD. The interior gateway protocol (IGP), IS-IS, also runs with BFD. These practices were recommended by the vendor in order to re-route the traffic effectively in the event of a link outage.

With the outbreak of flaps, it became obvious that when we lost a link in a LAG, we faced instability, specifically in the VPLS core. After discussion with several experts, and on their recommendation, we decided to remove BFD on MPLS and also on IS-IS. Since we made this configuration change, we have had multiple flaps (due to link failures from our supplier or to port relocations) but these had no impact on the overall platform, which is the expected behaviour.

Spanning-tree loops:

Members noticed multiple unusual broadcast storms on their ports facing the FranceIX infrastructure. On the Layer 2 infrastructure, we have 2 types of equipment:
- Force10 E1200 (EtherScale)
- Force10 S4810

The Force10 E1200 are old equipment on which we can't activate some Layer 2 protections such as storm-control limitation. The only available ACL protection consists of filtering MAC addresses. These past few weeks, we worked on enabling MAC-address limitation all over the infrastructure: only one explicit MAC address per member is now authorized and configured.

On the Force10 S4810, we can do both, and we do both: MAC-address filtering and broadcast-storm limitation (up to 1% of the port).

We also fixed the configuration on some ports where spanning-tree was still activated. We don't have any ports left with spanning-tree enabled on them.

However, some members made a loop on their ports facing the exchange. On Brocade equipment, these loops had no effect on the rest of the infrastructure, while on Force10 E1200 equipment, we were not able to contain these loops to the ports of the concerned members.

These latter loops caused a lot of instability on the whole platform for 2 reasons:
- We were not able to automatically shut down the concerned ports.
- As our VPLS core does not run spanning-tree, the core was unable to detect the loop emerging from edge sites as abnormal traffic.

Here we are facing a design incompatibility between Layer 2 and VPLS. The redundancy we wanted to implement on the edge sites with Layer 2 was in the end causing more trouble than it solved.
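The two edge protections described above (one authorized MAC address per member, and a broadcast limit of 1% of the port) can be sketched as a simple admission check. This is only an illustration in Python, not switch configuration: the `MemberPort` class, the function names and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberPort:
    """Hypothetical model of one member-facing port (not a vendor object)."""
    name: str
    speed_bps: int     # line rate in bits per second
    allowed_mac: str   # the single MAC address authorized for this member


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and strip the usual separators."""
    return mac.lower().replace(":", "").replace("-", "").replace(".", "")


def broadcast_limit_bps(port: MemberPort, percent: int = 1) -> int:
    """Broadcast ceiling as a percentage of the port speed (1% by default)."""
    return port.speed_bps * percent // 100


def frame_permitted(port: MemberPort, src_mac: str,
                    is_broadcast: bool, broadcast_rate_bps: int) -> bool:
    """Apply MAC filtering first, then the broadcast-storm limit."""
    if normalize_mac(src_mac) != normalize_mac(port.allowed_mac):
        return False  # unknown source MAC: filtered
    if is_broadcast and broadcast_rate_bps > broadcast_limit_bps(port):
        return False  # broadcast above the ceiling: dropped
    return True


# Assumed example: a 10G member port with one authorized MAC address.
port = MemberPort("member-42", 10_000_000_000, "00:11:22:AA:BB:CC")
limit = broadcast_limit_bps(port)
```

On a 10G port the 1% ceiling works out to 100 Mbit/s of broadcast traffic. The E1200 can only apply the first check, which is why a member-side loop could not be contained there.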

Port flapping AND spanning-tree loops (last incident on October 16th):

The root cause of the last incident was a combination of port flapping with a spanning-tree loop. We had a flapping card in our chassis in Interxion-2. This card was part of a 6*10G LAG between Interxion-2 and Interxion-5.

We discovered and understood a bug in the Brocade software: if you lose the first link of a LAG (this is specific to Brocade), the whole LAG goes down. With Brocade, the reload of a card is really fast, but this was causing more pain as the VPLS topology was also flapping.

As we don't run spanning-tree in the core, an infinite loop was created between the Layer 2 equipment and the fabric (because they have 2 links to 2 different VPLS nodes). We have moved the Layer 2 equipment to a single attachment to the VPLS nodes.

The incident was amplified by the physical loop we had between 2 ports on the VPLS fabric for our monitoring system (sFlow export). That was the reason why we disabled the statistics for a day while changing the stats topology on that same day.

3/ ACTION PLAN

The network is now stable thanks to several fixes we already applied on the infrastructure, and we encourage the members who deactivated their sessions to bring them up again.

In addition to that, we now plan a list of actions that will be taken:
- to ensure stability and continuity in the services delivered to the members,
- to have a homogeneous network, and
- to be able to include high-speed connections (100G).

All required actions will be taken to reach these goals.

For the time being, we will disable the redundant link between the Layer 2 infrastructure and the VPLS fabric until the Layer 2 equipment is replaced. In the meantime, the redundant links will be activated only if an issue occurs on the primary link. This point does not concern the sites interconnected in MPLS.

In addition to that, over the next weeks, we will:
- Upgrade the code of our Brocade MLX to 5.4.ca,
- Upgrade the Force10 S4810 to FTOS-SE-9.2.1.0,
- Remove the Force10 E1200 and replace them with other equipment,
- Migrate old connections of members connected to PaNAP switches onto FranceIX's own equipment (only the TelecityGroup Courbevoie site is concerned).

The configuration of the infrastructure will also be improved, and a new network topology, which is currently being studied, will be announced to the members and rolled out in due time.
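The reason a dual attachment loops without spanning-tree, while a single attachment cannot, can be illustrated with a small graph check. The sketch below is a deliberate simplification: it treats the fabric as plain bridged links and ignores any loop avoidance the real equipment performs, and the node names are hypothetical.

```python
def redundant_links(links):
    """Return the links that close a cycle in an undirected bridge graph.

    Uses a union-find structure: a link whose two ends are already
    connected by earlier links is redundant, so without spanning-tree
    (or a disabled backup link) it forms a forwarding loop.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    loops = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            loops.append((a, b))
        else:
            parent[root_a] = root_b
    return loops


# Hypothetical names: two VPLS nodes joined by the fabric, one edge switch.
dual_attached = [("vpls-a", "vpls-b"), ("edge", "vpls-a"), ("edge", "vpls-b")]
single_attached = [("vpls-a", "vpls-b"), ("edge", "vpls-a")]

dual_loops = redundant_links(dual_attached)
single_loops = redundant_links(single_attached)
```

With both uplinks active, the second uplink is reported as closing a cycle. With a single attachment the list is empty, which matches the decision to keep the redundant link disabled until the primary fails.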

What we learned:
- Don't rely solely on your equipment supplier for the configuration they advise,
- Trust the community and get help from them,
- Industrialize the processes,
- Apply drastic filtering rules to all the members (no exception!),
- Layer 2 on a multi-site IXP is bad!

FranceIX will keep its members informed about the progress of this action plan.

FranceIX thanks its members for their help and their continued trust through all the recent events.