Incident Report: 20140116

Incident Report
A user error resulted in a total loss of the source code for the iTones Liquidsoap service.
Summary
Severity: Critical
Impact: Medium (several minutes of silence on output; Campus Playout unavailable for several hours)
Event Start: 16/01/2014 09:22
Event End: 16/01/2014 10:49
Recurrence Mitigation: Multiple actions taken to prevent recurrence.
Contacts
Recovery Leader: Lloyd Wallis <lpw@ury.org.uk>
Other Attendees: Andrew Durant <aj@ury.org.uk>, Stephen Clarke <sc1152@ury.org.uk>


During the routine procedure of turning an existing production configuration directory into a git repository, Lloyd Wallis ran `git pull` without first running `git commit`. As a result, a file with the same name in the remote repository overwrote the local /etc/liquidsoap/jukebox.liq file, which runs our production jukebox system.
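The ordering mistake described above can be guarded against mechanically. The following is a minimal sketch, not the procedure URY adopted: it refuses to run `git pull` while `git status --porcelain` still lists untracked or uncommitted files. The helper names `unsaved_paths` and `guarded_pull` are hypothetical and exist only for this illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def unsaved_paths(porcelain: str) -> List[str]:
    """Return the paths that `git status --porcelain` lists as untracked or uncommitted."""
    paths = []
    for line in porcelain.splitlines():
        # Each line is a two-character status code, a space, then the path
        # (renames appear as "old -> new" and are returned unchanged).
        if len(line) > 3:
            paths.append(line[3:])
    return paths


def guarded_pull(repo: Path) -> None:
    """Run `git pull` in `repo` only if the working tree has nothing uncommitted."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
    at_risk = unsaved_paths(status)
    if at_risk:
        raise RuntimeError(f"commit or back up these files before pulling: {at_risk}")
    subprocess.run(["git", "pull"], cwd=repo, check=True)
```

Under this sketch, calling `guarded_pull(Path("/etc/liquidsoap"))` before the first commit would raise an error naming jukebox.liq instead of contacting the remote.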

This issue was then compounded in two ways: because the code was still in the process of being added to the repository, it was not yet under version control, and the jukebox server was not backed up. There was therefore no way to restore the file to a recent state.
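A restore point can also be created by hand before any operation that touches files not yet under version control. As a rough sketch only, and assuming nothing about the backup script URY actually uses, the hypothetical `snapshot` helper below copies a directory into a timestamped folder using the standard library:

```python
import shutil
import time
from pathlib import Path


def snapshot(directory: Path, dest_root: Path) -> Path:
    """Copy `directory` into a timestamped folder under `dest_root` and return the copy's path."""
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = dest_root / f"{directory.name}-{stamp}"
    # copytree creates `target` (and any missing parents) and fails if it already exists,
    # so an earlier snapshot is never silently overwritten.
    shutil.copytree(directory, target)
    return target
```

For example, `snapshot(Path("/etc/liquidsoap"), Path("/root/snapshots"))` would leave a dated copy of the configuration directory to fall back on; the destination path here is an assumption chosen for illustration.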

During the incident, the file was partially reconstructed from an old version of the code from June 2012 and several pastebin snippets found in IRC logs. As far as is known, the system is now operational again with its original functionality; however, for reasons unknown, running jukebox as a service no longer works.

This incident brought an important issue to light: the lack of backups of our jukebox server. This has now been rectified, and urybackup0:/pool0/jukebox receives daily backups using the same script as our other servers. The Liquidsoap script is now under version control in a GitHub repository, which means the issue will not recur as long as the version control process is adhered to.