Line 7: |
Line 7: |
| |mitigation=See body | | |mitigation=See body |
| |leader=Marks Polakovs (MP) | | |leader=Marks Polakovs (MP) |
− | |others=Matthew Stratford (MS), Isaac Lowe (IL), Michael Grace (MG), Jacob Dicker (JD), Jess Schofield (JS) | + | |others=Matthew Stratford (MS), Isaac Lowe (IL), Michael Grace (MG), Jacob Dicker (JD), Jess Schofield (JS), Ben Allen (BA) |
| }} | | }} |
| | | |
Line 19: |
Line 19: |
| | | |
| * Given the scale of the hardware issues experienced, the fact that we had a total of 15 minutes of dead air is not ideal but could have been a lot worse. | | * Given the scale of the hardware issues experienced, the fact that we had a total of 15 minutes of dead air is not ideal but could have been a lot worse. |
− | :* However, a lot of that was off-air-loop, sine-wave, or Flagship News Sosij, which, while ''technically'' not dead air, isn't exactly broadcasting. TODO exact numbers, but somewhere around 10m of that. | + | :* However, a lot of that was off-air-loop, sine-wave, or Flagship News Sosij, which, while ''technically'' not dead air, isn't exactly quality broadcast output. TODO exact numbers, but somewhere around 10m of that. |
| | | |
| == What Did Not Go So Well == | | == What Did Not Go So Well == |
Line 42: |
Line 42: |
| * MP to remove song that caused Campus Playout bug and high Dolby CPU usage from rotation - '''Done''' | | * MP to remove song that caused Campus Playout bug and high Dolby CPU usage from rotation - '''Done''' |
| * MP to set up emergency back-up audio player on BSOD - '''Done''' | | * MP to set up emergency back-up audio player on BSOD - '''Done''' |
− | * MS to finish hardware dead-air detector | + | * MS to finish hardware dead-air detector - '''Done''' |
− | * JS and BA to contact Focusrite Support about RedNet issues | + | * JS and BA to contact Focusrite Support about RedNet issues - '''Done''' |
| * BA to replace clock coax with high-quality SDI cable | | * BA to replace clock coax with high-quality SDI cable |
− | * MP to set up Journald persistence on Dolby (and ideally all Debian boxen) - on hold until off-air | + | * MP to set up Journald persistence on Dolby (and ideally all Debian boxen) - '''Done''' on Dolby, on hold for the rest |
− | * Purchase alternative Dante<->ADAT interface | + | * Purchase alternative Dante<->ADAT interface - '''Done''' |
− | * Sort out SelectorListener startup - on hold until off-air | + | * MP and MS to sort out SelectorListener startup - on hold until off-air |
| | | |
| = Timeline = | | = Timeline = |
Line 60: |
Line 60: |
| | | |
| 21:18:35 Dante Controller reports “Device 001DC10208C1 (device name not known) is now a grandmaster” - the RedNet 3 has dropped off | | 21:18:35 Dante Controller reports “Device 001DC10208C1 (device name not known) is now a grandmaster” - the RedNet 3 has dropped off |
| + | |
| + | : In retrospect, I (MP) don't think this was the entry that confirmed R3 dropoff, as the timestamp was just when Dante Controller was started. However, it's hard to pin down the exact time, as the switch logs show the R3 flapping on and off the network repeatedly. |
| | | |
| 21:18:36 MP screams “DANTE!” | | 21:18:36 MP screams “DANTE!” |
Line 69: |
Line 71: |
| 21:28 The AM feed becomes noticeably sosig and slowly degrades in quality | | 21:28 The AM feed becomes noticeably sosig and slowly degrades in quality |
| | | |
− | 21:29-21:32 The AM goes to silence with occasional bursts of techno. At the same time, Dante Controller reports '''various clock switches and devices muting''' (which is Dante-speak for “oman i am no good with audio pls to halp”). The clock master cycles between Phil, Office, and Studio Red several times. MP says in Slack at 21:31 “[Dante] is not in a good place right now.” | + | 21:29-21:32 The AM goes to '''DEAD AIR''' with occasional bursts of techno. At the same time, Dante Controller reports '''various clock switches and devices muting''' (which is Dante-speak for “oman i am no good with audio pls to halp”). The clock master cycles between Phil, Office, and Studio Red several times. MP says in Slack at 21:31 “[Dante] is not in a good place right now.” |
| | | |
− | : Right as the clocking storm starts, the switch starts reporting <code>IFNET Error LINK_UPDOWN GigabitEthernet1/0/11 link status is DOWN.</code> - port 11 is uryStores | + | : Right as the clocking storm starts, the switch starts reporting <code>IFNET Error LINK_UPDOWN GigabitEthernet1/0/11 link status is DOWN.</code> - port 11 is uryStores. It later reports <code>IFNET Error LINK_UPDOWN GigabitEthernet1/0/11 link status is UP.</code>, then DOWN again, ad infinitum. |
| | | |
− | : Relevant log entry: 2020-03-04 21:30:53 GMT 1583357453638 Information "uryStores" "Timed out 3 times sending message 'UpdateRxChannels' to uryStores, giving up." | + | : Relevant log entry: <code>2020-03-04 21:30:53 GMT 1583357453638 Information "uryStores" "Timed out 3 times sending message 'UpdateRxChannels' to uryStores, giving up."</code> |
| | | |
| 21:33:34 MP, MS, BA set Studio Blue to clock master. Dante sort-of stabilises except not really | | 21:33:34 MP, MS, BA set Studio Blue to clock master. Dante sort-of stabilises except not really |
Line 91: |
Line 93: |
| : During the debrief IL reported that, looking at the front panel of the Scarlett, no lights were on - not even the AM return feed, which should have at least some signal (modulation noise) even in the event of dead air | | : During the debrief IL reported that, looking at the front panel of the Scarlett, no lights were on - not even the AM return feed, which should have at least some signal (modulation noise) even in the event of dead air |
| | | |
− | During this time, IL, MWP, and BA are setting up a TX OB (in layman's terms, shove a microphone directly into the transmitter to get ''some'' signal on air). | + | During this time IL and BA are setting up a TX OB (in layman's terms, shove a microphone directly into the transmitter to get ''some'' signal on air). Much running between office and Stores to gather equipment ensues. |
| | | |
− | 21:38:49 Dante Controller reports that uryStores is clock master, as the team is oblivious to the ongoing cataclysm. | + | 21:38:49 Dante Controller reports that uryStores is clock master, as the team is oblivious to the ongoing Scarlett cataclysm. |
| | | |
| 21:39:56 Studio Blue is switched to Clock Master, confirmed by MP in Slack at 21:40:07. Still dead air. | | 21:39:56 Studio Blue is switched to Clock Master, confirmed by MP in Slack at 21:40:07. Still dead air. |
Line 101: |
Line 103: |
| : Gracefully stopping Jack via systemd fails and MP has to kill it, probably due to driver issues. | | : Gracefully stopping Jack via systemd fails and MP has to kill it, probably due to driver issues. |
| | | |
− | : During the debrief BA reported that, although it was showing up as a device (although MP has no logs of this), the Scarlett had probably borked itself completely due to the double-clocking. | + | :: jack_lsp reported <code>jack_client_open() failed, status = 0x21</code> |
| + | |
| + | : During the debrief BA speculated that, although it was showing up as a device (although MP has no logs of this), the Scarlett had probably borked itself completely due to the double-clocking. |
| | | |
| 21:44 MP makes the call to reboot Dolby. | | 21:44 MP makes the call to reboot Dolby. |
Line 117: |
Line 121: |
| 21:52:00 horrible sosig Flagship News ends and AM has off-air loop | | 21:52:00 horrible sosig Flagship News ends and AM has off-air loop |
| | | |
− | 21:54-21:56 MP realises that SelectorListener isn’t running, and runs some commands to try and start it. | + | 21:54-21:56 MP realises that SelectorListener isn’t running, and runs some commands to try and start it. Selector cycles between Off-Air, Sine Wave, and Jukebox as MP tests that it works. |
| + | |
| + | 21:58:01 MP selects Jukebox. |
| | | |
| 21:58:59 '''MP declares the incident over''' | | 21:58:59 '''MP declares the incident over''' |
| | | |
| [[Category:Incident Reports]] | | [[Category:Incident Reports]] |