Cold-Starting URY Systems: Difference between revisions

initial content
 
No edit summary
 
(13 intermediate revisions by the same user not shown)
Line 5: Line 5:
So, you've done a full shutdown. Or, there was a power cut or zombie apocalypse that interrupted the ability of our physical servers to operate. The good news is that you now think you're ready to turn things back on.
So, you've done a full shutdown. Or, there was a power cut or zombie apocalypse that interrupted the ability of our physical servers to operate. The good news is that you now think you're ready to turn things back on.


''Remember - during any power failure it is advised to immediately switch off the transmitter. See [[Shutting Down URY in a Hurry]].''
''Remember - during any power failure it is advised to immediately switch off the transmitter. See [[Shutting Down URY In A Hurry]].''


== Before You Start - Is It Safe Checklist ==
== Before You Start - Is It Safe Checklist ==
Line 11: Line 11:
* Has the Head of Computing or Station Manager given consent to restoring service?
* Has the Head of Computing or Station Manager given consent to restoring service?
* Does information from Estates, YUSU or other relevant sources suggest all is okay?
* Does information from Estates, YUSU or other relevant sources suggest all is okay?
* [Preferred] If you are going to re-start AM Transmission, have you got consent from the Chief Engineer to power on the transmitter audio path?
* If you are going to re-start AM Transmission, have you got consent from the Chief Engineer to power on the transmitter audio path?
* [Preferred] Do you have at least two technical team members on site (ideally one engineer)?
* Do you have at least two technical team members on site (ideally one engineer)?


Great - let's give this a go.
Great - let's give this a go.
Line 19: Line 19:
We got all these servers right? Well they ain't no good until there's a network. You do this stage in [[The Hub]].
We got all these servers right? Well they ain't no good until there's a network. You do this stage in [[The Hub]].


* urysw4 should come up on its own, as it has PoE [???] - check the injector is on
* Power on urysw3 (The HP ProCurve 2626 [The top one])
* Power on urysw3 (The HP ProCurve 2626 [The top one])
* Power on urysw1 (The Netgear GS748T [The bottom one])
* Power on urysw1 (The Netgear GS748T [The bottom one])
Line 44: Line 45:
We identify critical servers as those that enable us to broadcast on AM. URY Policy states that we must have '''two''' operating loggers before restoration of AM service. You'll also want the jukebox to play some noise.
We identify critical servers as those that enable us to broadcast on AM. URY Policy states that we must have '''two''' operating loggers before restoration of AM service. You'll also want the jukebox to play some noise.


* Power on [[jukebox]] in [[The Hub]]
* Power on [[uryfw0]]
* Power on [[logger1]]
* No really, turn on uryfw0. Are you sure it’s on yet? Since this is the gateway for all URY systems, other servers may have trouble bringing up interfaces if it is not up.
* Power on [[logger2]]
* Power on [[uryred]]
* Power on [[uryred]]
* Power on [[uryblue]]
* Power on [[uryblue]]
* On logger1, run `/usr/local/etc/rc.d/audiolog.sh start`
* On both the loggers, run <code>sudo service loggerng status</code> and start it if it fails to auto-start
* On logger2, run `/etc/rc3.d/S40audiolog start`. You may also need to `modprobe es1370`
* Power on [[dolby]]
* Power on [[uryfw0]]
* If you don't get audio, run <code>cd /usr/local/etc/liquidsoap/scripts && sudo ./startAudio.sh</code>
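The logger check above ("run <code>sudo service loggerng status</code> and start it if it fails to auto-start") could be wrapped up roughly like this. It is a sketch, not an official URY script: the function name <code>ensure_loggerng</code> and the <code>SUDO</code> override are made up for illustration, and it assumes <code>service loggerng status</code> exits non-zero when the service is down.

```shell
#!/bin/sh
# Sketch only: run this on each logger in turn.
# Assumption: `service loggerng status` exits non-zero when loggerng is down.
# SUDO is a hypothetical override; set SUDO= if you are already root.

ensure_loggerng() {
    if ${SUDO-sudo} service loggerng status >/dev/null 2>&1; then
        echo "loggerng already running"
    else
        echo "loggerng not running, starting it"
        ${SUDO-sudo} service loggerng start
    fi
}

# Usage: ensure_loggerng
```

If the start fails as well, treat that logger as not operating and do not count it towards the two required for AM.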


If at any stage you are unhappy with the noises, smells, or LED blink patterns on logger1 or logger2, '''stop''', turn that machine back off, and come back to it. If you cannot get both running, you will need to investigate this before you continue - uryred is not currently considered a stable logger so is not a valid substitute. Generally, they'll work when you try again a little later (they don't like being cold).
The station should start outputting the world's most annoying loop, featuring a happy instrumental tune and someone telling you that we're off air right now. We're most definitely not.
 
:* If it tries to play a jingle, this might fail spectacularly and go to a loop of Monty Python's Intermission, featuring Alex Boyall giving a grammatically incorrect technical difficulties message.
Jukebox should boot up happily, and the station will start outputting a loop of Monty Python's ''Intermission'', occasionally overlaid with a grammatically incorrect technical difficulties message from Alex Boyall.


== AM Broadcast ==
== AM Broadcast ==
Line 76: Line 75:
Core Computing Services are defined as those which must be operational for URY to broadcast anything other than [[iTones]] (or, at this point, Intermission).
Core Computing Services are defined as those which must be operational for URY to broadcast anything other than [[iTones]] (or, at this point, Intermission).


* Power on [[uryfs1]] and [[themis]]
* Power on [[urybackup0]] and [[urysteve]]
* '''Wait''' for uryfs1 to finish booting as most other systems are dependent on it
* ''Wait'' for urysteve to finish booting
* Run `mount -t ext4 /dev/sdb1 /music` on uryfs1
* Power on [[ury]]
* Power on [[ury]]
* Fade up jukebox in [[Studio 1]], then switch to S1 then back to S3
* Ensure selector is powered in The Hub
* Fade up jukebox in [[Studio Red]], then switch to S1 then back to S3
** This ensures selector state is up to date
** This ensures selector state is up to date
** Jukebox should now be playing actual music. You might need to restart it if it's stuck on techlude - on Dolby, <code>cd /usr/local/etc/liquidsoap/scripts && sudo systemctl stop ury-jack && sudo ./startAudio.sh && sel 8 && sel 3</code>
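For the "wait for it to finish booting" steps above, a small polling loop saves staring at a console. This is a sketch under assumptions: <code>wait_for_host</code> and <code>POLL_INTERVAL</code> are names invented here, it assumes the hostname resolves from wherever you run it, and answering a ping only shows the network stack is up, not that every service has started.

```shell
#!/bin/sh
# Sketch only: poll a host with ping until it answers or we give up.
# Assumption: the hostname resolves from the machine running this.
# POLL_INTERVAL (seconds between attempts) is a hypothetical knob, default 5.

wait_for_host() {
    host=$1
    tries=${2:-60}
    interval=${POLL_INTERVAL:-5}
    i=0
    while [ "$i" -lt "$tries" ]; do
        if ping -c 1 "$host" >/dev/null 2>&1; then
            echo "$host is answering pings"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$host did not answer after $tries attempts" >&2
    return 1
}

# Usage: wait_for_host urysteve && echo "carry on with the next step"
```

Still check the machine itself before relying on it; a pingable host can be sitting at a failed mount or a login prompt with services down.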


That's the absolute basics. Themis is also pretty much optional, but as it provides some authentication and DNS services, it's best we bring it up too. Now verify the following are accessible and functioning:
That's the absolute basics. Now verify the following are accessible and functioning:


* http://ury.org.uk/
* http://ury.org.uk/
* https://ury.org.uk/myury/
* https://ury.org.uk/myradio/
* https://ury.org.uk/sis2/
* https://ury.org.uk/roundcube/ (including sending test emails both internally and externally)
* https://ury.org.uk/roundcube/ (including sending test emails both internally and externally)
** It is possible that mta.york.ac.uk is not yet back online, or that it is but refuses to route mail. You can test this with good ol' `telnet mta.york.ac.uk 25`
** It is possible that mta.york.ac.uk is not yet back online, or that it is but refuses to route mail. You can test this with good ol' <code>telnet mta.york.ac.uk 25</code>
* http://ury.org.uk/live/
* http://ury.org.uk/live/
* live-high, live-mobile, live-high-ogg, jukebox and campus-playout streams are visible at http://uryfs1.york.ac.uk:7070/
* live-high, live-mobile, live-high-ogg, jukebox streams are visible at https://audio.ury.org.uk/status
* iTones should now be playing music instead of intermission. If not, try remounting manually or restarting share daemons on uryfs1
* BAPS (all the presenter and guest PCs)
* BAPS (studio1, studio2 and production services)
* Timelord (might need a reboot on eccleston/tennant/smith)
* Timelord (will probably need an `F5` on jukebox)
* myradio_daemon
* myradio_daemon
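A quick first pass over the web addresses in the list above can be done with <code>curl</code>. This is only a sketch: <code>check_urls</code> is a name made up here, and a successful HTTP response shows that a page is being served, not that logins, email or streams actually work, so the manual checks still apply.

```shell
#!/bin/sh
# Sketch only: report which URLs answer with a successful HTTP status.
# -f fails on HTTP errors, -L follows redirects, --max-time caps each request.

check_urls() {
    failed=0
    for url in "$@"; do
        if curl -fsSL -o /dev/null --max-time 10 "$url" 2>/dev/null; then
            echo "OK   $url"
        else
            echo "FAIL $url"
            failed=1
        fi
    done
    return "$failed"
}

# Usage:
# check_urls http://ury.org.uk/ https://ury.org.uk/myradio/ \
#     https://ury.org.uk/roundcube/ http://ury.org.uk/live/ \
#     https://audio.ury.org.uk/status
```

The function returns non-zero if any address failed, so it can gate a later step if you script more of the checklist.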


We're now at a point where shows can go on and things will mostly be okay.
We're now at a point where shows can go on and things will mostly be okay.
==Dante==
* Open Dante Controller on one of the studio or production PCs (you might need Wogan up for this), and ensure that everything is happy and you see lots of green check marks. If you see any angry red X-es, mouse over them to check which box they're complaining about, and ensure it's powered on
* Check that you can hear
** Studio Red in Blue, and vice versa
** Jukebox and News (beeeeeeeep) in both studios
** AM on Phil in the office


== Additional Servers ==
== Additional Servers ==
Power on other systems:
Power on other systems:
* urybsod
* urybsod
** Don't forget to remount <code>urybackup0:/pool0/backup</code> to <code>/mnt/pool0</code>
** Use https://urybsod.york.ac.uk/xymon/ to monitor other services
** Use https://urybsod.york.ac.uk/xymon/ to monitor other services
** https://ury.org.uk/loggerng/ should now be available
** https://ury.org.uk/loggerng/ should now be available
* copperbox
** http://copperbox.york.ac.uk/ should now be available
** Webcams should now be available (motion and camd may need manually restarting)
* urybackup0
** Run `zfs mount -a`, `service nfsd restart`, `service mountd restart`
** Windows and backup filestores should now be available. Mount the backup filestores on ury and uryfs1.
* uryrrod
* uryrrod
** Only provides mixclouder service, will have no immediate noticeable impact
** Only provides mixclouder service and webcams, will have no immediate noticeable impact
** Note that in the event of a full power outage, the ITS Cloud may not be immediately available. Patience.
* wogan
** The Windows PCs may be a bit unhappy if it isn't around
** See above about the ITS Cloud.
* urystv
* moyles - you'll need ITS to do this
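The urybsod remount step above might look something like the following. Treat it as a guess at the shape rather than the real procedure: it assumes the share is an NFS export, that urybsod is a Linux box with <code>mountpoint</code> available, and that there is no fstab entry doing this for you. <code>remount_backup</code> and the <code>SUDO</code> override are names invented for the example.

```shell
#!/bin/sh
# Sketch only: mount the backup pool on urybsod if it is not already mounted.
# Assumptions: NFS export, Linux with util-linux `mountpoint`, no fstab entry.
# SUDO is a hypothetical override; set SUDO= if you are already root.

remount_backup() {
    src=urybackup0:/pool0/backup
    dst=/mnt/pool0
    if mountpoint -q "$dst"; then
        echo "$dst is already mounted"
    else
        ${SUDO-sudo} mount -t nfs "$src" "$dst" && echo "mounted $src on $dst"
    fi
}

# Usage: remount_backup
```

urybackup0 has to be up and exporting before this can succeed, so it fits after the Core Computing Services stage.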


== Nearly Done ==
== Nearly Done ==
Everything should be bright and cheery again now. You should now complete a full incident report, making it available online in [[:Category:Incident Reports]] and sending it to computing@ury.org.uk, engineering@ury.org.uk and management@ury.org.uk.
Everything should be bright and cheery again now. You should now complete a full incident report and make it available online in [[:Category:Incident Reports]].


Ideally, you'd also act on any recommendations this review brings up to make things run better in future.
Ideally, you'd also act on any recommendations this review brings up to make things run better in future.


[[Category:Technical How-Tos]]
[[Category:Technical How-Tos]]