Incident Report: 20200627 and Shutting Down URY In A Hurry: Difference between pages
Latest revision as of 12:23, 23 July 2020
This is an ops-critical document. A printed copy is kept in the Server Cupboard and should be updated whenever this online version changes.
Turn servers off in this order, waiting a few seconds between each button press:
- urystv [no need to wait]
- ury (thunderhorn)
- dolby
- urybsod
- urysteve
- urybackup0
- uryfw0
- transmitter, uryblue, uryred [call engineering now]
- uryrrod [VMWare]
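Since a printed copy of this list lives in the Server Cupboard, one way to keep the paper and online versions in step is to generate the checklist from a single list. The following is only a minimal sketch: the names `SHUTDOWN_ORDER` and `checklist_lines` are made up for illustration, and the entries are copied from the list above.

```python
# Shutdown order copied from the list above. Each entry is (machine, note);
# the note is the bracketed remark from the list, or "" if there is none.
SHUTDOWN_ORDER = [
    ("urystv", "no need to wait"),
    ("ury (thunderhorn)", ""),
    ("dolby", ""),
    ("urybsod", ""),
    ("urysteve", ""),
    ("urybackup0", ""),
    ("uryfw0", ""),
    ("transmitter, uryblue, uryred", "call engineering now"),
    ("uryrrod", "VMWare"),
]


def checklist_lines(order=SHUTDOWN_ORDER):
    """Return one numbered tick-box line per shutdown step."""
    lines = []
    for step, (machine, note) in enumerate(order, start=1):
        suffix = f" [{note}]" if note else ""
        lines.append(f"[ ] {step}. {machine}{suffix}")
    return lines


if __name__ == "__main__":
    # Print the checklist so it can be sent to a printer.
    print("\n".join(checklist_lines()))
```

Running the script prints nine tick-box lines in shutdown order, ready to be pinned up next to the servers.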
Note: The KVM is powered by a 12V brick, so it can’t go on UPS power. If a server seems to be having trouble going down, you will need to move the monitor cable over to that server by hand. The keyboard should still pass through using power drawn from the PS/2 ports, if you want to risk that. [not sure if this is still a thing?]
A big cause of delays when powering down is a machine hanging while it waits on an NFS/SMB mount. This happens if you’ve already shut down whatever was providing the mount, so stick to this order.
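To see which mounts could cause such a hang on a given Linux machine, you can look for network filesystems in `/proc/mounts`. The sketch below is illustrative only: the function name `network_mounts` and the sample export paths are hypothetical, and the set of filesystem types is an assumption that may need extending for URY's actual setup.

```python
# Filesystem types treated as network mounts (assumed set, extend as needed).
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs"}


def network_mounts(mounts_text):
    """Return (source, mount_point, fs_type) for each network mount.

    mounts_text is text in /proc/mounts format:
    "source mount_point fs_type options dump pass", one mount per line.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] in NETWORK_FS_TYPES:
            found.append((fields[0], fields[1], fields[2]))
    return found


if __name__ == "__main__":
    # Hypothetical sample; on a real machine, pass in the contents of /proc/mounts.
    sample = (
        "urysteve:/music /music nfs4 rw 0 0\n"
        "/dev/sda1 / ext4 rw 0 0\n"
    )
    for source, mount_point, fs_type in network_mounts(sample):
        print(f"{fs_type}: {source} on {mount_point}")
```

Any machine named in the output is a provider that should stay up until this one has shut down.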
As early as possible during this process, try to reach one of the Station Manager, Assistant Station Manager or Programme Controller to inform them of the service outage, so they can announce it through the necessary social media channels.
Remember: Once these servers are off, sending emails to @ury.org.uk email accounts doesn't work! Use Slack, @york.ac.uk addresses, Facebook or phone numbers.
You now won't have much to do until the power's back on, most likely. Using a manual writing implement, make a note of how the procedure went in preparation for Cold-Starting URY Systems later on.
Rationale
- urystv has no critical mounts, so it can be a quick way to shed some load
- ury goes after that since it doesn’t have any mounts elsewhere
- dolby after that, because of postgres
- urybsod has some exports pertaining to log generation, namely to uryrrod and ury
- urysteve now, because of /music
- urybackup0 now because urysteve backs up to it. [If the UPS is absolutely screaming about low battery, you can risk taking this down first as it does draw the most power - still accurate?]
- uryfw0 after all that, since it is handy to still have comms if servers need to cross networks (unmounting loggers)
- The loggers, in no particular order.
- The transmitter must be turned off if the loggers are powered down, and especially if the UPS power fails altogether, because logging is a legal requirement and is no longer possible. Call engineering to let them know this has happened.
- uryrrod mounts urybsod for mixclouder and urybackup0 for webcams, so it may be unhappy once those mounts have gone away. This is not critical though.
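The rationale above amounts to "shut down a client before the machine it depends on". As a rough sketch, the order can be checked against the relationships named in the list. The names `ORDER`, `DEPENDS_ON` and `early_providers` are made up for illustration, and the dependency table is only one reading of the bullets above (it treats "urysteve backs up to urybackup0" as a dependency), not a full map of URY's mounts.

```python
# Shutdown order from this page, one machine per entry.
ORDER = [
    "urystv", "ury", "dolby", "urybsod", "urysteve", "urybackup0",
    "uryfw0", "transmitter", "uryblue", "uryred", "uryrrod",
]

# client -> machines it mounts or writes to, as stated in the rationale.
DEPENDS_ON = {
    "ury": ["urybsod"],                   # urybsod exports to ury
    "urysteve": ["urybackup0"],           # urysteve backs up to urybackup0
    "uryrrod": ["urybsod", "urybackup0"], # mixclouder and webcams mounts
}


def early_providers(order, depends_on):
    """Return (client, provider) pairs where the provider goes down first."""
    position = {machine: index for index, machine in enumerate(order)}
    problems = []
    for client, providers in depends_on.items():
        for provider in providers:
            if position[provider] < position[client]:
                problems.append((client, provider))
    return problems


if __name__ == "__main__":
    for client, provider in early_providers(ORDER, DEPENDS_ON):
        print(f"{client} will lose its mount from {provider}")
```

With the data above, the only pairs reported are for uryrrod, which matches the last bullet: it loses its urybsod and urybackup0 mounts before it goes down, and that is accepted as non-critical.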