
Post oVirt 3.5 Upgrade Issues

I recently upgraded my personal oVirt server (which is both the engine and the host for my VMs) from 3.4 to 3.5.

The upgrade process was relatively painless, and the basics are outlined in the Release Notes for oVirt 3.5. It wasn't until I tried to spin up a new VM that I noticed some issues.

Something is amiss...

I could not create any new disks in my storage domain. The dialog would come up, but the 'OK' button would do nothing. This appears to be because I'd lost connection to my storage domain.

I headed over to the storage domain and tried to bring it back up, but was not able to reactivate it. Like a Windows admin, I decided to just reboot instead of investigating. (ZING!!)

That's when things got worse.

Long story short, after the engine webpage came back, I'd lost contact with the 'local_datacenter' and no VMs would start.

Maybe it was a firewall issue?

I started searching and found an article with a similar issue to mine.

TL;DR

> please ssh to the host and try the following:
> - vdsClient -s 0 getVdsCaps (validity check making sure vdsm service is up and running and communicate with its network socket from localhost)
> - please ping between host and engine
> - please make sure there is no firewall on blocking tcp 54321 (on both host and engine)

My engine and the host reside on the same server and all traffic from 'lo' is allowed by default in my iptables, so I didn't think it was the firewall.
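
For anyone less sure of their rules than I was, here is a rough sketch of how the firewall check could be scripted. The function name rules_blocking_port is made up for this example. It assumes rules in the format printed by iptables -S, and it only looks for explicit DROP or REJECT rules on a single --dport, so a default DROP policy or a multiport rule would slip past it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of oVirt or iptables).
# Reads rules in `iptables -S` format on stdin and prints the ones that
# match the given destination port and jump to DROP or REJECT.
# Exit status: 0 if at least one such rule was found, 1 otherwise.
rules_blocking_port() {
    local port="$1"
    # `--` stops grep from treating the leading dashes as options.
    grep -E -- "--dport ${port}( |\$)" | grep -E -- '-j (DROP|REJECT)'
}

# Typical use on the host, as root:
#   iptables -S | rules_blocking_port 54321
```

No output (and a non-zero exit status) means no explicit blocking rule was found for that port, which is what I expected on my box.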

Either way I tried these steps.

Not Listening Hard Enough

Upon executing vdsClient -s 0 getVdsCaps, things started to get a little clearer.

vdsClient -s 0 getVdsCaps
Connection to 0.0.0.0:54321 refused

Connection refused huh? That does smell like a firewall, but since I already confirmed traffic from 'lo' is allowed, I was still unconvinced. I bet there is nothing listening on port 54321.

netstat -tnlp | grep 54321
(whole lot of nothing)
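
Grepping by eye works, but the same check can be turned into something a script can branch on. This is only a sketch: has_listener is a name invented here, and it assumes the usual netstat -tnl column layout (local address in column 4, state in column 6).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `netstat -tnl` style output on stdin and
# exits 0 if some socket is in LISTEN state on the given port, 1 if not.
has_listener() {
    local port="$1"
    awk -v port="$port" '
        # Column 4 is the local address (e.g. 0.0.0.0:54321 or :::54321),
        # column 6 is the socket state.
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Typical use:
#   netstat -tnl | has_listener 54321 || echo "nothing listening on 54321"
```

Matching on the end of the local address keeps a port like 154321 from counting as a hit.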

Well crap. Vdsm is not up and running.

service vdsmd status
VDS daemon is not running

And after trying to start it, my luck did not change.

service vdsmd start
initctl: Job is already running: libvirtd
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running check_is_configured
libvirt is not configured for vdsm yet
Modules libvirt are not configured
Error:  
One of the modules is not configured to work with VDSM.
To configure the module use the following:
'vdsm-tool configure [--module module-name]'.
If all modules are not configured try to use:
'vdsm-tool configure --force'
(The force flag will stop the module's service and start it
afterwards automatically to load the new configuration.)
 
 vdsm: stopped during execute check_is_configured task (task returned with error code 1).
vdsm start                                                 [FAILED]

Thankfully though, that error message is extremely helpful. It says "Modules libvirt are not configured" and suggests I run vdsm-tool configure [--module module-name].

vdsm-tool configure --module libvirt
Checking configuration status...
libvirt is not configured for vdsm yet
SUCCESS: ssl configured to true. No conflicts
Error:  
Cannot configure while service 'supervdsmd' is running.
 Stop the service manually or use the --force flag.
 

UGH! Ok. Yet another great error message.

service supervdsmd stop
Shutting down supervdsm daemon: 
supervdsm watchdog stop                               [  OK  ]
supervdsm stop                                        [  OK  ]

Once more, with feeling!

vdsm-tool configure --module libvirt
Checking configuration status...
libvirt is not configured for vdsm yet
SUCCESS: ssl configured to true. No conflicts
Error:  
Cannot configure while service 'libvirtd' is running.
 Stop the service manually or use the --force flag. 

I really love these error messages!!
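
Since the message names the offending service each time, the stop-and-retry dance could in principle be scripted. Below is a minimal sketch that pulls the service name out of that message. The function name blocking_service is invented for this example, and the pattern is based only on the wording shown in the output above, so it may not match other vdsm versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads vdsm-tool output on stdin and prints the
# service named in "Cannot configure while service 'NAME' is running."
# Prints nothing if no such line is present.
blocking_service() {
    sed -n "s/.*Cannot configure while service '\([^']*\)' is running.*/\1/p"
}

# Typical use (2>&1 in case the message is written to stderr):
#   vdsm-tool configure --module libvirt 2>&1 | blocking_service
```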

service libvirtd stop
Stopping libvirtd daemon: libvirtd: libvirtd is managed by upstart and started, use initctl instead

What the hell....really....FINE!

initctl stop libvirtd
libvirtd stop/waiting

YAY!!

vdsm-tool configure --module libvirt
Checking configuration status...

libvirt is not configured for vdsm yet
SUCCESS: ssl configured to true. No conflicts

Running configure...
Reconfiguration of libvirt is done.

And finally, I was able to start up vdsmd!!

service vdsmd start
initctl: Job is already running: libvirtd
vdsm: Running mkdirs
vdsm: Running configure_coredump
vdsm: Running configure_vdsm_logs
vdsm: Running run_init_hooks
vdsm: Running check_is_configured
libvirt is already configured for vdsm
vdsm: Running validate_configuration
SUCCESS: ssl configured to true. No conflicts
vdsm: Running prepare_transient_repository
vdsm: Running syslog_available
vdsm: Running nwfilter
vdsm: Running dummybr
vdsm: Running load_needed_modules
vdsm: Running tune_system
vdsm: Running test_space
vdsm: Running test_lo
vdsm: Running unified_network_persistence_upgrade
vdsm: Running restore_nets
libvir: Network Driver error : Network not found: no network with matching name 'vdsm-ovirtmgmt'
vdsm: Running upgrade_300_nets
Starting up vdsm daemon: 
vdsm start                                            [  OK  ]

And start libvirtd back up.

initctl start libvirtd
libvirtd start/running, process 8471

And start supervdsm back up too.

service supervdsmd start
supervdsm start                                       [  OK  ]
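
For future me, here is the whole sequence from this post collected in one place. Treat it as a sketch rather than a tested fix: the run and recover_vdsm names are made up, it assumes libvirtd is managed by upstart as it was on my box, and it defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Replays the recovery steps from this post, in the same order.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 runs it.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_vdsm() {
    # Stop everything vdsm-tool complained about, reconfigure libvirt,
    # then bring the services back up. Stops at the first failure.
    run service supervdsmd stop &&
    run initctl stop libvirtd &&
    run vdsm-tool configure --module libvirt &&
    run service vdsmd start &&
    run initctl start libvirtd &&
    run service supervdsmd start
}

# Preview:        recover_vdsm
# For real (root): DRY_RUN=0 recover_vdsm
```

The dry-run default is deliberate, since stopping libvirtd on a host with running VMs is not something to do by accident.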

Once the services were back up, all the lights came back on in the engine webpage!!

One More thing!

I'd forgotten that the whole reason I started this doc was because I could not add any disks to a VM!!

Turns out, that issue was mainly due to not having a storage profile defined in my storage domain.

After adding one, the OK button worked.

Kinda wish there was at least some form of error message for this, but there doesn't appear to be one.