Random crash of Backup server

I have been trying to nail down an issue with my backup server. I thought I had solved it with the noapic boot option, because the server worked fine afterwards and booted up reliably each time.

That lasted until yesterday, when it did something it had done before: the NIC seems to turn off all by itself and then the OS more or less hangs. Before this point the server had sent out its e-mails and even started a couple of backups. I’m not able to log in at the console, so the only option I have is to power off and boot again.

Checking through the logs reveals nothing. dmesg, syslog and messages show no panic, nothing. All I can see is that after some time BackupPC can no longer ping machines and then BackupPC soon stops, perhaps because the server is now hung. Pinging the backup server does not work either, so the server really is locked up.
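Since the logs stay silent, one way to at least narrow down when the hang happens might be a small heartbeat script that appends a timestamp to a file and forces it to disk. This is only a sketch: the function names are made up for illustration, and the file location and interval are assumptions you would choose yourself.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line to `path` and force it to disk."""
    # time.localtime(None) means "the current time".
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as handle:
        handle.write(stamp + "\n")
        handle.flush()
        # fsync so the line is not sitting in a cache when the box locks up.
        os.fsync(handle.fileno())
    return stamp


def last_heartbeat(path):
    """Return the last timestamp written, or None if the file is empty."""
    with open(path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    return lines[-1] if lines else None
```

Calling `write_heartbeat` once a minute (from cron or a simple loop) and checking `last_heartbeat` after the forced reboot would show roughly when the machine stopped responding, which can then be lined up against the last BackupPC log entries.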

It is very annoying, to say the least. The previous server was rock-solid in this regard. It was extremely slow, but at least it booted and stayed up. This may be because it had a more modern BIOS than the current unit, which makes me think I will now have to hunt down an updated BIOS.

This really is the first Ubuntu/Linux unreliability I have had in over four (4) years of using Linux.

If anyone has a suggestion for where to begin, please don’t hesitate to comment. I am running Ubuntu 10.04 LTS Server Edition. Its only purpose is to run BackupPC; the server is woken by WOL each night to start the backups and then shuts down early in the morning when all backups are done.
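For anyone unfamiliar with the wake-up step, a Wake-on-LAN "magic packet" is just six 0xFF bytes followed by the target MAC address repeated sixteen times, sent as a UDP broadcast. The following is a minimal sketch of that, not the tool I actually use; the function names, the broadcast address and port 9 are assumptions for illustration.

```python
import socket


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 48-bit MAC address")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes, then the MAC address sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Send the magic packet as a UDP broadcast (address and port assumed)."""
    packet = build_magic_packet(mac)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
    finally:
        sock.close()
```

Keeping the packet construction separate from the sending makes it easy to check the packet contents without touching the network.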

Edit: Just to let you know that this appeared to be a hardware issue, and I have switched everything over to the original backup machine, which is much slower but at least works. I still need to work out what the problem is.