My server crashes when a HD fails
This has now happened twice in a short time. A HD fails and the Synology server enters a state of horribilis. The GUI does not work, but I can SSH into it. Most commands do not work, e.g. neither restarting nginx nor sudo reboot has any effect. But I can navigate the file system and get htop to run (hardly any processes are running).
A forced restart (power button) is the only thing I have got to work. I have users who require the server to be online, so there is not much time for experimenting and diagnosing.
The server is an RS18016xs+ running DSM 7.2.2. It is in a VMM cluster with two other similar servers. When this happens, all VMs also crash with a storage failure error.
The RAID is RAID-6 with 17 x 12 TB hard drives. It has an assigned spare drive with “automatic rebuild” checked. The disks are WD SAS He drives in an expansion module. Two Intel 1 TB SSDs are used as cache.
The server recognizes that the drive is dead, but instead of blocking the dead drive and starting an automatic rebuild on the designated spare disk, it just goes bonkers. Why is this? A failing drive in a RAID-6 should not make anything fail; the server should just log, alert and automatically rebuild the data onto the spare.
Which logs should I look at now, in retrospect, to get any clues as to what is going on?
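Assuming the storage pool is a standard Linux md array (an assumption; not confirmed for this setup), the kernel's view of the array in /proc/mdstat is one place to look alongside the logs, since it shows which members are marked failed and whether a spare is attached. Below is a minimal Python sketch that parses that text and reports degraded arrays. The function names (`parse_mdstat`, `degraded`) and the `SAMPLE` text are made up for illustration and are not output from this server.

```python
import re

# "md2 : active raid6 sda3[0] sdb3[1](F) ..." header line of one array
HEADER = re.compile(r"^(md\d+)\s*:\s*(\S+)(?:\s+(raid\d+|linear))?\s*(.*)$")
# "sdc3[2](F)" -> device name plus optional flag, F = faulty, S = spare
MEMBER = re.compile(r"(\S+?)\[\d+\](?:\((\w)\))?")
# "[4/3] [UU_U]" -> expected members / active members
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Hypothetical sample in the usual /proc/mdstat layout (not real output).
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid6 sda3[0] sdb3[1] sdc3[2](F) sdd3[3] sde3[4](S)
      23413000192 blocks super 1.2 level 6, 64k chunk, algorithm 2 [4/3] [UU_U]

md0 : active raid1 sda1[0] sdb1[1]
      2490176 blocks [2/2] [UU]
"""


def parse_mdstat(text):
    """Return one dict per md array found in /proc/mdstat-style text."""
    arrays = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            name, state, level, rest = header.groups()
            members = MEMBER.findall(rest)
            current = {
                "name": name,
                "state": state,
                "level": level,
                "failed": [dev for dev, flag in members if flag == "F"],
                "spares": [dev for dev, flag in members if flag == "S"],
                "expected": None,
                "active": None,
            }
            arrays.append(current)
            continue
        status = STATUS.search(line)
        if status and current is not None:
            current["expected"] = int(status.group(1))
            current["active"] = int(status.group(2))
    return arrays


def degraded(arrays):
    """Keep only arrays that have fewer active members than expected."""
    return [
        a for a in arrays
        if a["active"] is not None and a["active"] < a["expected"]
    ]


if __name__ == "__main__":
    # On the server itself, replace SAMPLE with open("/proc/mdstat").read().
    for array in degraded(parse_mdstat(SAMPLE)):
        print(array["name"], array["level"], "failed:", array["failed"],
              "spares:", array["spares"])
```

If an array shows a failed member and an attached spare but no rebuild ever starts, that narrows the question to why the rebuild was not triggered, which is what the system and kernel logs from around the time of the failure would need to explain.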
