Here's a little puzzle presented in the form of a dialogue between
two characters. The characters are fictional, but the situation really
did happen. If you can guess the solution before reading to the end,
award yourself the corresponding points listed on the right.
Bruce: What are you doing?
Lawrence: I'm converting this SuSE Linux machine to use software
RAID.
Bruce: Didn't that use to have a hardware RAID controller in it?
Lawrence: Yes it did--a Promise FastTrak TX2000. But the driver was
dropped from support in later kernels. I found some partial
source code at the Promise Web site and tried building it,
but it didn't work reliably, so I gave up.
Bruce: Closed source can be a bitch, can't it?
Lawrence: Yeah. But anyway, back to the job at hand, namely
reformatting the OS partition on the 40GB drive as one half
of a RAID-1 pair. The other half will come from the second
40GB drive which was previously used for the hardware RAID.
Bruce: You've got a place to back up the OS install, or are you
going to reinstall?
Lawrence: I'm going to temporarily copy it to another, 120GB, drive.
This is Linux--why do a reinstall when you don't have to?
I'll use this SuSE 9.3 boot CD as a "rescue disc" to boot
the system from, so I can freely mess around with the hard
disks. Then I use mdadm to turn the two 40GB partitions
into a RAID-1 pair. Finally I'll copy the OS installation
back again.
Bruce: Don't forget to change the partition type from 83 hex
(regular Linux filesystem) to FD hex (auto-detected Linux
RAID partition).
Lawrence: Oh yeah, thanks for reminding me about that.
Bruce: Bad things can happen otherwise.
Lawrence: All right, all right, I'm doing it already.
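The partition type Bruce is talking about is a single byte in each partition table entry, so it is easy to check after the fact. What follows is a minimal sketch in Python, assuming a classic MBR partition table; `partition_types` and `build_sample_mbr` are hypothetical helper names, and the sample sector is made up for illustration, not read from the machine in the story.

```python
# Facts about the classic MBR layout: four 16-byte partition entries start
# at byte offset 446, byte 4 of each entry is the type ID, and the sector
# ends with the signature bytes 0x55 0xAA.
TABLE_OFFSET = 446
ENTRY_SIZE = 16

TYPE_NAMES = {
    0x00: "empty",
    0x82: "Linux swap",
    0x83: "Linux filesystem",
    0xFD: "Linux RAID autodetect",
}


def partition_types(mbr):
    """Return the type IDs of the four primary partition entries."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    return [mbr[TABLE_OFFSET + i * ENTRY_SIZE + 4] for i in range(4)]


def build_sample_mbr(types):
    """Hypothetical helper: build a fake sector with only the type bytes set."""
    sector = bytearray(512)
    for i, type_id in enumerate(types):
        sector[TABLE_OFFSET + i * ENTRY_SIZE + 4] = type_id
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


# Made-up layout: a swap partition followed by a RAID member partition.
sample = build_sample_mbr([0x82, 0xFD, 0x00, 0x00])
for number, type_id in enumerate(partition_types(sample), start=1):
    print(number, format(type_id, "02X"), TYPE_NAMES.get(type_id, "other"))
```

To inspect a real disk, the same function could be given the first 512 bytes of the disk device, which normally requires root access.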
(time passes)
Lawrence: OK, I've set up the RAID-1 partitions, formatted the array
as an ext3 filesystem, and copied the files back.
Bruce: Does the kernel have the RAID drivers built in? Otherwise
how is it going to boot?
Lawrence: This is SuSE, not Gentoo. We don't build a custom kernel
to do things like that, we include the driver modules in
the initial RAM disk (initrd). Let's see, the basic "md"
RAID driver is built into the kernel, but the "raid1"
configuration support is a separate kernel module. I edit
my /etc/sysconfig/kernel file, look for the
INITRD_MODULES symbol definition, and make sure the
modules listed include raid1. Now I just run mkinitrd,
and that's done.
Bruce: Don't forget to change the root= kernel parameter in the boot
loader settings. It currently says /dev/hda2; it should be
/dev/md0.
Lawrence: Done.
Bruce: Same thing in /etc/fstab, where it says to mount the root
filesystem from /dev/hda2, that should be /dev/md0 as well.
Lawrence: OK, done. Anything else?
Bruce: And don't forget to set up the grub bootloader again. OK, I
think that's all the changes you have to make. Let's
take out that boot CD and see if it starts up from the
hard drive now...
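The /etc/fstab change mentioned a few lines up amounts to swapping the device field on the entry whose mount point is "/". Here is a minimal sketch in Python, assuming the usual whitespace-separated fstab fields; `set_root_device` is a hypothetical helper, and the sample fstab text is invented for illustration rather than taken from the machine in the story.

```python
def set_root_device(fstab_text, new_device):
    """Return fstab text with the device of the "/" entry replaced."""
    result = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_entry = len(fields) >= 2 and not fields[0].startswith("#")
        if is_entry and fields[1] == "/":
            # Replace only the first occurrence, which is the device field,
            # and leave the rest of the line untouched.
            line = line.replace(fields[0], new_device, 1)
        result.append(line)
    return "\n".join(result) + "\n"


# Invented sample; a real fstab will have more entries and other options.
SAMPLE_FSTAB = """\
/dev/hda1  swap  swap  defaults  0 0
/dev/hda2  /     ext3  defaults  1 1
"""

print(set_root_device(SAMPLE_FSTAB, "/dev/md0"), end="")
```

Working on a copy of the text instead of the live file keeps the original intact until the result has been checked by eye.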
Lawrence: OK, there're the BIOS messages ... grub is coming up ...
the choice of system to boot ... looking good ...
Bugger. That didn't work, did it?                                   100
Bruce: Let's see, there are the messages about autodetecting hda2
and hdb2, and adding them to the md0 array.
Lawrence: And right below that, where it tries to mount md0, it says
the device wasn't found!                                             80
Bruce: That's crazy. It goes through the whole process of setting that
device up, then it says it can't even find it.
Lawrence: Hmmm... is it perhaps not setting up the entry for /dev/md0
in the initrd?
Bruce: The kernel root= parameter lets us specify the root device
in a number of different ways. Besides the usual pathname
form, we can also directly enter a device number. The
device definitely is in the kernel, so using the number
should let us get around any missing /dev/md0 name.
Lawrence: /dev/md0 is major device 9, minor device 0. How do we enter
that as one number?
Bruce: Let's see, shift left, xor in the bits ... 2304.
Lawrence: OK. The kernel boot line now says root=2304. Rebooting...
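Bruce's mental arithmetic can be written out as follows. This is a small sketch in Python, assuming the traditional 16-bit device number with 8 bits each for major and minor; `old_style_dev` is a hypothetical name. One caveat to verify for your own kernel: the kernel documentation describes the numeric form of root= as hexadecimal, so the decimal value 2304 may not be read as major 9, minor 0.

```python
def old_style_dev(major, minor):
    """Pack major and minor into the traditional 16-bit device number."""
    if not (0 <= major < 256 and 0 <= minor < 256):
        raise ValueError("major and minor must each fit in 8 bits")
    # Shift the major number left by 8 bits and combine it with the minor.
    return (major << 8) | minor


dev = old_style_dev(9, 0)   # /dev/md0 is major 9, minor 0
print(dev)                  # 2304 (decimal)
print(format(dev, "04x"))   # 0900 (the same number in hexadecimal)
```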
Bruce: Interesting. Instead of a kernel panic like before, it has
now left us in the initrd, looking at a shell prompt.
Lawrence: Stuff this. Let's give up. The client is getting antsy
about how long the machine's been down. I'm going to put it
back the way it was, with the OS on /dev/hda2 and no RAID.
Bruce: OK, change the partition type back to 83 hex, and reformat
it directly as an ext3 filesystem.
Lawrence: Hmm, it threw up an error about not being able to write back
the modified partition table--must be because the RAID driver
still thinks it's got control.
Bruce: Oh well, a quick reboot will fix that.
Lawrence: Right. As we were. OK, copy all the files back. Re-do
the grub setup ... rebooting ...
Bruce: Interesting. Now it says it can't find /dev/hda2.             60
Lawrence: But that can't be! Not finding the RAID device I can
(sort of) understand, but how can it fail to find a simple
IDE disk!?
Bruce: Unless...
Lawrence: ...the message doesn't mean what it says it means?         40
Bruce: You're formatting the OS partition as ext3, right?
Lawrence: Yes. Previously it was reiserfs, but with all the doubts
over the future of that, I thought I'd try ext3 instead.
Bruce: Is ext3 built into the kernel, or is it loaded as a module?   20
Lawrence: I thought it was built--no, hang on, it's not! It's a
module!
Bruce: There we go. That message saying "device not found" really     0
means "I don't know how to mount this filesystem". You
need to include the ext3 module in your initrd.
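With the answer out in the open, a check like the following could have caught the problem before the first reboot. It is a minimal sketch in Python, assuming that INITRD_MODULES in /etc/sysconfig/kernel is a single quoted, space-separated assignment as the dialogue describes; the function names and the sample line (including the other module names in it) are hypothetical.

```python
import shlex


def initrd_modules(sysconfig_text):
    """Return the module names listed in the INITRD_MODULES assignment."""
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if line.startswith("INITRD_MODULES="):
            value = line.split("=", 1)[1]
            parts = shlex.split(value)  # strips the surrounding quotes
            return parts[0].split() if parts else []
    return []


def missing_modules(sysconfig_text, required):
    """Return the required modules that are absent from INITRD_MODULES."""
    present = initrd_modules(sysconfig_text)
    return [name for name in required if name not in present]


# Invented sample line; the real file contains many other settings.
SAMPLE_SYSCONFIG = 'INITRD_MODULES="piix reiserfs raid1"\n'

print(missing_modules(SAMPLE_SYSCONFIG, ["raid1", "ext3"]))  # ['ext3']
```

After adding whatever is missing, mkinitrd still has to be run again, as in the dialogue.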
Lawrence: *Sigh*. OK, let's go back and try setting up the software
RAID again.
(more time passes)
Lawrence: OK, here we go, second try. Done the initrd, done the
fstab and the bootloader, here we go...
Bruce: You've done it. It's booting off the RAID.
Lawrence: *Phew*. After all that time spent, it really was quite
simple, wasn't it?
Bruce: Just a little tricky with the misleading error message.
Lawrence: Yeah.
Your score:
0 -- Stick to Microsoft Windows.
20 -- By all means use GNOME or KDE, but stay away from
anything resembling a command line.
40 -- You can probably manage a few Shell commands and edit files
with vi.
60 -- You can probably manage a few Shell commands and edit files
with emacs.
80 -- You can tame balky Linux systems with one hand while
throwing lightning bolts with the other.
100 -- You rule. Other Linux mavens bow down before you. Feel free
to write the next Puzzle Page.