Incident Report: 20181114: Difference between revisions

Created page with "{{Incident |brief=ITS broke |severity=Critical |impact=High (Dead air for around 2 minutes) |start=25/02/2017 16:29 |end=25/02/2017 17:30 |mitigation=ITS Broke |..."
 
 
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{Incident
{{Incident
   |brief=ITS broke
   |brief=ITS broke
   |severity=Critical
   |severity=Moderate
   |impact=High (Dead air for around 2 minutes)
   |impact=High (Dead air for around 2 minutes)
   |start=25/02/2017 16:29
   |start=25/02/2017 16:29
   |end=25/02/2017 17:30
   |end=25/02/2017 17:30
   |mitigation=ITS Broke
   |mitigation=Fix Nameservers
   |leader=[[Isaac Lowe]]
   |leader=[[Isaac Lowe]]
   |others=[[Jordan Cameron]]
   |others=[[Jordan Cameron]]
}}
}}


ITS had a major system error. This broke our stuff. More to follow. 
(Total dead air: 16:28:44 - 16:30:36)


(Total dead air: 16:28:44 - 16:30:36)
This page is under development because this literally just happened.
== Causes ==
Basically, ITS broke, so anything at URY that relies on their DNS/nameservers broke too: we lost Jukebox, MyRadio and even the studio selector. However, it couldn't have happened at a better time, with Head of Computing Jordan Cameron and Assistant Head of Computing Isaac Lowe both in the station.


At 19:22, output was successfully switched to OB (they were running late anyway), which then proceeded without incident.
Interestingly, Rednet (Dante) and BAPS continued to function properly, so Jordan and Tom Burrows (who just happened to be in the station) did an unplanned Chat and Such, and were later joined by Jacob Dicker.  


== Causes ==
Approximately 30 seconds before full services resumed (ITS turned themselves off and back on again), Isaac was able to manually restart Jukebox and the crisis was over.
Nameservers.


== Work Required ==
== Work Required ==
Nameservers.
Investigate reducing dependency on ITS systems (NTP/DNS).
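
As a starting point for that investigation, a small check like the one below could tell us which internal names stop resolving when the upstream nameservers go away. This is only a sketch: the function names `check_dns` and `all_down` are made up for illustration, and which hostnames to watch is an assumption that would need filling in with whatever Jukebox, MyRadio and the studio selector actually depend on.

```python
import socket
from typing import Callable, Dict, Iterable


def check_dns(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, bool]:
    """Try to resolve each hostname; map it to True if it resolved, else False.

    `resolve` defaults to the system resolver, but can be swapped out
    (e.g. for testing, or to query a specific nameserver some other way).
    """
    results: Dict[str, bool] = {}
    for name in hostnames:
        try:
            resolve(name)
            results[name] = True
        except OSError:
            # socket.gaierror is a subclass of OSError, so lookup
            # failures land here rather than crashing the check.
            results[name] = False
    return results


def all_down(results: Dict[str, bool]) -> bool:
    """True only if at least one name was checked and none resolved."""
    return bool(results) and not any(results.values())
```

Run periodically against the names our services rely on, `all_down` returning True would be a strong hint that the problem is the nameservers rather than any one service, which is what happened here.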
 
Honestly, we got lucky. If this had happened at 4am the dead-air time might have been measured in hours rather than minutes.
 
Also, don't walk into Studio Red and proclaim everything is broken.  


[[Category:Incident Reports]]
[[Category:Incident Reports]]