The success and failure of upgrading Fedora with fedup

Post by stuarta » Wed Apr 23, 2014 7:08 pm

So I had a fun few days. I decided to upgrade the host under this site from f19 to f20. I mean, what could go wrong? The host started life running Fedora 17 and had survived upgrades from f17 -> f18 and f18 -> f19, all with fedup, and all worked successfully. Pretty much as the documentation says it will.

So what happened this time? OMG what a complete and utter debacle....

1. Firstly the upgrade kernel didn't appear to install correctly. Rebooting brought me straight back to the existing f19 installation. A few readings of the known issues (such as a separate /var) and a quick tweak here and there, then fedup was re-run. After the server was rebooted it never came back. Now as this server is in a data centre nowhere near me, this is quite problematic. Thank goodness for web interfaces which allow me to trigger remote ctrl-alt-dels or hardware resets. Many, many hardware resets were done to no avail. Booting into the rescue mode at least allowed me to make some progress, and tweak a few more things. Still, after a reboot, it went away and never came back. A few more resets and a few more rescue mode sessions later, I found the rpmdb had been completely corrupted and needed rebuilding. No problem, that's easy to fix:

# rpm --rebuilddb
Eventually I get into a state where the correct f20 kernel and other packages have been installed. Does it work? Don't be silly, that was just the start of the fun.
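
For anyone repeating the rpmdb rebuild from a rescue environment, here is a minimal sketch of a slightly more cautious version: copy the database aside first, clear the stale Berkeley DB environment files, then rebuild. The function names, the `DRY_RUN` switch and the `/mnt/sysimage` mount point in the usage note are assumptions for illustration and not what was actually run on this server.

```shell
#!/bin/sh
# Sketch only. ROOT (first argument) is a hypothetical parameter: the mount
# point of the installed system, "/" when run on the live system.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_rpmdb() {
    root="${1:-/}"
    dbdir="${root%/}/var/lib/rpm"
    # Keep a copy of the damaged database in case the rebuild makes it worse.
    run cp -a "$dbdir" "$dbdir.bak" || return 1
    # Remove the Berkeley DB environment files; rpm recreates them.
    run rm -f "$dbdir"/__db.* || return 1
    # Rebuild the indexes from the installed package headers.
    run rpm --root "$root" --rebuilddb
}
```

Running `DRY_RUN=1 rebuild_rpmdb /mnt/sysimage` first shows what would be executed; drop `DRY_RUN` to do it for real as root.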

1a. During one of these many reboot cycles I managed to get the server up and booted, but then found that none of the firewalls would load:

ERROR: Your kernel/iptables do not include state match support. No version of Shorewall will run on this system
wtf? I have other machines running f20 and these same firewalls. This is complete BS. After some poking around I worked out the reason: the kernel that is running is actually the kernel used during the upgrade process, not one of the ones designed for regular use. A bit of fiddling with the grub config later, and I had the system configured to use the f20 kernel. And so it was rebooted, and never came back......
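
One quick way to spot this situation is to check whether the running kernel has a module tree on disk at all. This is a rough sketch, and it rests on an assumption the post does not confirm: that the state match was missing because the upgrade kernel had no matching directory under /lib/modules. The function names and the optional ROOT argument are made up for the example.

```shell
#!/bin/sh
# kernel_has_modules RELEASE [ROOT]
# Succeeds if a module tree exists for the given kernel release.
# ROOT is a hypothetical prefix for checking a system mounted elsewhere.
kernel_has_modules() {
    release="$1"
    root="${2:-}"
    [ -d "${root%/}/lib/modules/$release" ]
}

# Report on the kernel that is running right now.
check_running_kernel() {
    if kernel_has_modules "$(uname -r)"; then
        echo "modules present for $(uname -r)"
    else
        echo "no module tree for $(uname -r): probably not an installed kernel"
    fi
}
```

If `check_running_kernel` reports no module tree, `grubby --default-kernel` shows which kernel grub will boot next, and `grubby --set-default=<path to vmlinuz>` changes it without hand-editing the grub config.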

2. Time for the remote console. Bit of a pain in the arse, because you have to request it, wait for someone in the datacentre to connect it, and then you have 2hrs before they nick it back and stick it onto somebody else's server. This is no problem. I can do this. Fire it up. Java, oh great, we all know how well Java, Remote Consoles and Linux play together: not well. Let's try it anyway. I can see the video \o/ woo hoo. Panic on boot:

Invalid MD raid superblock
wtf? These md raid devices assembled just fine under the rescue environment, so I know that they are fine. Okay, hit the remote reset and try to boot back into the f19 kernel. Bang the keyboard a bit, hmmm, it's not working. Did I mention how well Java, Remote Consoles and Linux work together? Now I have to get out the Windows laptop. Right, back on the console. wtf? The keyboard still isn't working. Some random keyboard bashing later and it magically starts working :o But that didn't end up helping much. Back into the rescue environment, mounting all the relevant bits under /mnt, and spend some time abusing the installation. Clearly that f20 kernel is broken. Remove that, and rebuild grub, so that we are booting into the f19 kernel. \o/ it lives

3. So while we have the remote console still (and we are fast approaching the 2hr limit) let's reinstall the proper f20 kernel and reboot. Nope, it's not playing games. It fails to mount half the filesystems and drops me to an emergency shell. These filesystems are on Logical Volumes (LVs) which aren't active. Okay, again a simple fix.

# lvchange -ay <lvname>
Immediately the filesystems are mounted by systemd. So clearly nothing wrong with them then. Some googling later, and I find a bugzilla which implies that there are changes to lvm.conf which are required if you use mirrored LVs. Guess what, I have mirrored LVs on this system. After a quick

# cp /etc/lvm/lvm.conf.rpmnew /etc/lvm/lvm.conf
I reboot again. Same issue. wtf? *facepalm*. The lvm.conf is included in the initrd, so the old copy I've just replaced on disk is still the one in use in the initrd

# dracut -f
fixes that, and reboot. Et voila. \o/ again it lives, and the firewall works :)
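
The two checks in this step can be sketched as small helpers: one picks out LVs that are not active from `lvs` output, the other tells you whether the lvm.conf baked into the initrd differs from the one on disk. The function names and temp file path are invented for the example, and the usage lines assume the installed `lsinitrd` supports `-f` for printing a single file.

```shell
#!/bin/sh
# inactive_lvs: read "lv_path lv_attr" pairs on stdin and print the path of
# every LV whose state character (the 5th in lv_attr) is not "a" (active).
# Lines with a blank lv_path (internal LVs) have one field and are skipped.
inactive_lvs() {
    awk 'NF >= 2 && substr($2, 5, 1) != "a" { print $1 }'
}

# lvmconf_stale ON_DISK FROM_INITRD
# Succeeds (exit 0) when the two files differ, i.e. the initrd copy is stale.
lvmconf_stale() {
    ! cmp -s "$1" "$2"
}

# Usage, as root:
#   lvs --noheadings -o lv_path,lv_attr | inactive_lvs
#   lsinitrd -f etc/lvm/lvm.conf > /tmp/lvm.conf.initrd
#   lvmconf_stale /etc/lvm/lvm.conf /tmp/lvm.conf.initrd && dracut -f
```

Note that a plain `dracut -f` rebuilds the initrd for the kernel that is currently running, which is what you want from the f20 emergency shell. From a different kernel or a rescue environment you would pass the image path and kernel version explicitly.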

4. Time to see what state the system is in

# yum check
It finds "issues": over 230 conflicting packages where both the f19 and f20 versions are installed in parallel. Sheesh, wtf did fedup do to my system? Some creative scripting, rpm -e and yum remove later, I have removed all traces of the f19 packages from the system and everything is all up to date and working.
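
The scripting used for the cleanup isn't shown here, but a starting point for finding the doubled-up packages might look like the sketch below. It is an illustration under assumptions, not the script that was run: the function name is made up, and it only lists candidates rather than removing anything.

```shell
#!/bin/sh
# mixed_release_pkgs: read "name.arch release" lines on stdin and print each
# name.arch that is installed with both an fc19 and an fc20 release.
mixed_release_pkgs() {
    awk '
        $2 ~ /\.fc19/ { f19[$1] = 1 }
        $2 ~ /\.fc20/ { f20[$1] = 1 }
        END { for (p in f19) if (p in f20) print p }
    ' | sort
}

# Usage:
#   rpm -qa --qf '%{NAME}.%{ARCH} %{RELEASE}\n' | mixed_release_pkgs
```

Kernel packages will show up in this list legitimately, since several kernels can be installed side by side, so review the output before feeding anything to rpm -e. `package-cleanup --dupes` from yum-utils reports much the same thing.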

All I can say is it's a good job I do this sort of thing as a day job. Most normal users would have reinstalled a *long* time ago....


Stuart

Post by stuarta » Wed Apr 23, 2014 8:26 pm

This bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=979723 led me to the unmerged changes to lvm.conf