Ubuntu – How to install Ubuntu 14.04/16.04 64-bit with a dual-boot RAID 1 partition on a UEFI/GPT system


Update: Question and answer below also applies to Ubuntu 16.04

I have a computer with dual SSDs and Win (7) pre-installed on another disk. The pre-installation uses (U)EFI/GPT boot. I want to install Ubuntu 14.04 64-bit desktop on a RAID1 root partition on my SSDs and still be able to dual-boot my Win7 system. Is this possible?

This guide using the desktop installer did not work, probably because it (implicitly) assumes MBR booting. Neither did installing the server distribution, probably for the same reason.

Best Answer

  • UPDATE: I have verified that the description below also works for Ubuntu 16.04. Other users have reported it working on 17.10 and 18.04.1.

    NOTE: This HOWTO will not give you LVM. If you want LVM too, try Install Ubuntu 18.04 desktop with RAID 1 and LVM on machine with UEFI BIOS instead.

    After days of trying, I now have a working system! In brief, the solution consisted of the following steps:

    1. Boot using a Ubuntu Live CD/USB.
    2. Partition the SSDs as required.
    3. Install missing packages (mdadm and grub-efi).
    4. Create the RAID partitions.
    5. Run the Ubiquity installer (but do not boot into the new system).
    6. Patch the installed system (initramfs) to enable boot from a RAIDed root.
    7. Populate the EFI partition of the first SSD with GRUB and install it into the EFI boot chain.
    8. Clone the EFI partition to the other SSD and install it into the boot chain.
    9. Done! Your system will now have RAID 1 redundancy. Note that nothing special needs to be done after e.g. a kernel update, as the UEFI partitions are untouched.

    A key component of step 6 of the solution was a delay in the boot sequence. Without it, I was dumped squarely at the GRUB prompt (without keyboard!) if either of the SSDs was missing.

    Detailed HOWTO

    1. Boot

    Boot using EFI from the USB stick. Exactly how to do this varies by system. Select Try Ubuntu without installing.

    Start a terminal emulator, e.g. xterm to run the commands below.

    1.1 Login from another computer

    While trying this out, I often found it easier to log in from another, already fully configured computer. This simplified cut-and-paste of commands, etc. If you want to do the same, you can log in via ssh by doing the following:

    On the computer to be configured, install the openssh server:

    sudo apt-get install openssh-server
    

    Change password. The default password for user ubuntu is blank. You can probably pick a medium-strength password. It will be forgotten as soon as you reboot your new computer.

    passwd
    

    Now you can log into the ubuntu live session from another computer. The instructions below are for linux:

    ssh -l ubuntu <your-new-computer>
    

    If you get a warning about a suspected man-in-the-middle-attack, you need to clear the ssh keys used to identify the new computer. This is because openssh-server generates new server keys whenever it is installed. The command to use is typically printed and should look like

    ssh-keygen -f <path-to-.ssh/known_hosts> -R <your-new-computer>
    

    After executing that command, you should be able to log in to the ubuntu live session.

    2. Partition disks

    Clear any old partitions and boot blocks. Warning! This will destroy data on your disks!

    sudo sgdisk -z /dev/sda
    sudo sgdisk -z /dev/sdb
    

    Create new partitions on the smaller of your two drives: 100M for ESP, 32G for RAID SWAP, rest for RAID root. If your sda drive is the smaller one, follow Section 2.1, otherwise Section 2.2.
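
If you are unsure which drive is smaller, a small helper along these lines may save a lookup. It is only a sketch: the function name smaller_disk is made up for this example, and it assumes you pass in sizes in bytes, which lsblk can report without root.

```shell
# Hypothetical helper: print the name of the smaller of two disks.
# Arguments: name A, size A in bytes, name B, size B in bytes.
smaller_disk() {
    if [ "$2" -le "$4" ]; then
        echo "$1"
    else
        echo "$3"
    fi
}
```

Possible usage in the live session: smaller_disk /dev/sda "$(lsblk -bdno SIZE /dev/sda)" /dev/sdb "$(lsblk -bdno SIZE /dev/sdb)". If both drives are the same size it prints the first one, and either section will do.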

    2.1 Create partition tables (/dev/sda is smaller)

    Do the following steps:

    sudo sgdisk -n 1:0:+100M -t 1:ef00 -c 1:"EFI System" /dev/sda
    sudo sgdisk -n 2:0:+32G -t 2:fd00 -c 2:"Linux RAID" /dev/sda
    sudo sgdisk -n 3:0:0 -t 3:fd00 -c 3:"Linux RAID" /dev/sda
    

    Copy partition table to other disk and regenerate unique UUIDs (will actually regenerate UUIDs for sda).

    sudo sgdisk /dev/sda -R /dev/sdb -G
    

    2.2 Create partition tables (/dev/sdb is smaller)

    Do the following steps:

    sudo sgdisk -n 1:0:+100M -t 1:ef00 -c 1:"EFI System" /dev/sdb
    sudo sgdisk -n 2:0:+32G -t 2:fd00 -c 2:"Linux RAID" /dev/sdb
    sudo sgdisk -n 3:0:0 -t 3:fd00 -c 3:"Linux RAID" /dev/sdb
    

    Copy partition table to other disk and regenerate unique UUIDs (will actually regenerate UUIDs for sdb).

    sudo sgdisk /dev/sdb -R /dev/sda -G
    

    2.3 Create FAT32 file system on /dev/sda

    Create FAT32 file system for the EFI partition.

    sudo mkfs.fat -F 32 /dev/sda1
    mkdir /tmp/sda1
    sudo mount /dev/sda1 /tmp/sda1
    sudo mkdir /tmp/sda1/EFI
    sudo umount /dev/sda1
    

    3. Install missing packages

    The Ubuntu Live CD comes without two key packages: grub-efi and mdadm. Install them. (I'm not 100% sure grub-efi is needed here, but to maintain symmetry with the coming installation, bring it in as well.)

    sudo apt-get update
    sudo apt-get -y install grub-efi-amd64 # (or grub-efi-amd64-signed)
    sudo apt-get -y install mdadm
    

    You may need grub-efi-amd64-signed instead of grub-efi-amd64 if you have secure boot enabled. (See comment by Alecz.)

    4. Create the RAID partitions

    Create the RAID devices in degraded mode. The devices will be completed later. Creating a full RAID1 sometimes gave me problems during the ubiquity installation below; I'm not sure why. (mount/unmount? format?)

    sudo mdadm --create /dev/md0 --bitmap=internal --level=1 --raid-disks=2 /dev/sda2 missing
    sudo mdadm --create /dev/md1 --bitmap=internal --level=1 --raid-disks=2 /dev/sda3 missing
    

    Verify RAID status.

    cat /proc/mdstat
    
    Personalities : [raid1] 
    md1 : active raid1 sda3[0]
          216269952 blocks super 1.2 [2/1] [U_]
          bitmap: 0/2 pages [0KB], 65536KB chunk
    
    md0 : active raid1 sda2[0]
          33537920 blocks super 1.2 [2/1] [U_]
          bitmap: 0/1 pages [0KB], 65536KB chunk
    
    unused devices: <none>
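
The [2/1] [U_] markers above are what identify an array as degraded. If you prefer not to eyeball the output, something like the following could list the degraded arrays. This is a sketch only: the function name degraded_arrays is made up here, and it assumes the mdstat layout shown above (an "mdN :" line followed by a status line ending in a [U_]-style field).

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and
# print the name of every array whose status field contains "_".
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # Status lines end in a field such as [UU] or [U_]
        /\[[0-9]+\/[0-9]+\] \[[U_]+\]/ {
            if ($NF ~ /_/) print name
        }
    '
}
```

Possible usage: degraded_arrays < /proc/mdstat. At this point in the HOWTO it should print both md1 and md0, since each array was created with one member missing.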
    

    Partition the md devices.

    sudo sgdisk -z /dev/md0
    sudo sgdisk -z /dev/md1
    sudo sgdisk -N 1 -t 1:8200 -c 1:"Linux swap" /dev/md0
    sudo sgdisk -N 1 -t 1:8300 -c 1:"Linux filesystem" /dev/md1
    

    5. Run the installer

    Run the ubiquity installer, excluding the boot loader that will fail anyway. (Note: If you have logged in via ssh, you will probably want to execute this on your new computer instead.)

    sudo ubiquity -b
    

    Choose Something else as the installation type and modify the md1p1 type to ext4, format: yes, and mount point /. The md0p1 partition will automatically be selected as swap.

    Get a cup of coffee while the installation finishes.

    Important: After the installation has finished, select Continue testing as the system is not boot ready yet.

    Complete the RAID devices

    Attach the waiting sdb partitions to the RAID.

    sudo mdadm --add /dev/md0 /dev/sdb2
    sudo mdadm --add /dev/md1 /dev/sdb3
    

    Verify all RAID devices are ok (they may still be sync'ing).

    cat /proc/mdstat
    
    Personalities : [raid1] 
    md1 : active raid1 sdb3[1] sda3[0]
          216269952 blocks super 1.2 [2/1] [U_]
          [>....................]  recovery =  0.2% (465536/216269952)  finish=17.9min speed=200000K/sec
          bitmap: 2/2 pages [8KB], 65536KB chunk
    
    md0 : active raid1 sdb2[1] sda2[0]
          33537920 blocks super 1.2 [2/2] [UU]
          bitmap: 0/1 pages [0KB], 65536KB chunk
    
    unused devices: <none>
    

    The process below may continue during the sync, including the reboots.

    6. Configure the installed system

    Set up the mounts needed to chroot into the installed system.

    sudo -s
    mount /dev/md1p1 /mnt
    mount -o bind /dev /mnt/dev
    mount -o bind /dev/pts /mnt/dev/pts
    mount -o bind /sys /mnt/sys
    mount -o bind /proc /mnt/proc
    cat /etc/resolv.conf >> /mnt/etc/resolv.conf
    chroot /mnt
    

    Configure and install packages.

    apt-get install -y grub-efi-amd64 # (or grub-efi-amd64-signed; same as in step 3)
    apt-get install -y mdadm
    

    If your md devices are still sync'ing, you may see occasional warnings like:

    /usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
    

    This is normal and can be ignored (see answer at bottom of this question).

    nano /etc/grub.d/10_linux
    # change quick_boot and quiet_boot to 0
    

    Disabling quick_boot will avoid the Diskfilter writes are not supported bugs. Disabling quiet_boot is of personal preference only.
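
If you would rather not edit the file by hand, the change can be sketched as a sed filter. Note the assumptions: the function name disable_quick_quiet is made up for this example, and it assumes the two settings appear at the start of a line as quick_boot="1" and quiet_boot="1". Check your copy of 10_linux first, since the exact form may differ between releases.

```shell
# Hypothetical filter: read a 10_linux-style script on stdin and
# switch quick_boot and quiet_boot from "1" to "0".
# Assumes the assignments look exactly like quick_boot="1".
disable_quick_quiet() {
    sed -E 's/^(quick_boot|quiet_boot)="1"/\1="0"/'
}
```

Possible usage: disable_quick_quiet < /etc/grub.d/10_linux > /tmp/10_linux.new, then compare the two files with diff before copying the result over the original.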

    Modify /etc/mdadm/mdadm.conf to remove any label references, i.e. change

    ARRAY /dev/md/0 metadata=1.2 name=ubuntu:0 UUID=f0e36215:7232c9e1:2800002e:e80a5599
    ARRAY /dev/md/1 metadata=1.2 name=ubuntu:1 UUID=4b42f85c:46b93d8e:f7ed9920:42ea4623
    

    to

    ARRAY /dev/md/0 UUID=f0e36215:7232c9e1:2800002e:e80a5599
    ARRAY /dev/md/1 UUID=4b42f85c:46b93d8e:f7ed9920:42ea4623
    

    This step may be unnecessary, but I've seen some pages suggest that the naming schemes may be unstable (name=ubuntu:0/1) and this may stop a perfectly fine RAID device from assembling during boot.
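
The same edit can be expressed as a filter, which may be handy if you repeat this HOWTO a few times. This is a sketch: the function name strip_array_labels is made up here, and it only touches lines that start with ARRAY.

```shell
# Hypothetical filter: read mdadm.conf text on stdin and drop the
# metadata= and name= fields from ARRAY lines, keeping the UUID.
strip_array_labels() {
    sed -E '/^ARRAY /{s/ metadata=[^ ]+//; s/ name=[^ ]+//;}'
}
```

Possible usage: strip_array_labels < /etc/mdadm/mdadm.conf > /tmp/mdadm.conf.new, then review the result before copying it back.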

    Modify lines in /etc/default/grub to read

    #GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
    GRUB_CMDLINE_LINUX=""
    

    Again, this step may be unnecessary, but I prefer to boot with my eyes open...

    6.1. Add sleep script

    (It has been suggested by the community that this step might be unnecessary and can be replaced by setting GRUB_CMDLINE_LINUX="rootdelay=30" in /etc/default/grub. For reasons explained at the bottom of this HOWTO, I suggest sticking with the sleep script even though it is uglier than using rootdelay. Thus, we continue with our regular program...)

    Create a script that will wait for the RAID devices to settle. Without this delay, mounting of root may fail due to the RAID assembly not being finished in time. I found this out the hard way - the problem did not show up until I had disconnected one of the SSDs to simulate disk failure! The timing may need to be adjusted depending on available hardware, e.g. slow external USB disks, etc.

    Enter the following code into /usr/share/initramfs-tools/scripts/local-premount/sleepAwhile:

    #!/bin/sh
    echo
    echo "sleeping for 30 seconds while udevd and mdadm settle down"
    sleep 5
    echo "sleeping for 25 seconds while udevd and mdadm settle down"
    sleep 5
    echo "sleeping for 20 seconds while udevd and mdadm settle down"
    sleep 5
    echo "sleeping for 15 seconds while udevd and mdadm settle down"
    sleep 5
    echo "sleeping for 10 seconds while udevd and mdadm settle down"
    sleep 5
    echo "sleeping for 5 seconds while udevd and mdadm settle down"
    sleep 5
    echo "done sleeping"
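
Since the timing may need adjusting, a loop version of the same countdown might be easier to tune. This is just a sketch of an equivalent script body, not what I actually ran: the function name countdown and its two arguments (total seconds and step) are made up for this example.

```shell
# Hypothetical loop variant of the script above.
# $1 = total seconds to wait, $2 = seconds between messages.
countdown() {
    remaining=$1
    step=$2
    echo
    while [ "$remaining" -gt 0 ]; do
        echo "sleeping for $remaining seconds while udevd and mdadm settle down"
        sleep "$step"
        remaining=$((remaining - step))
    done
    echo "done sleeping"
}
```

To reproduce the script above, you would put this function in the file after the #!/bin/sh line and add countdown 30 5 as the last line. A longer delay for slow disks then only needs a different first argument.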
    

    Make the script executable and install it.

    chmod a+x /usr/share/initramfs-tools/scripts/local-premount/sleepAwhile
    update-grub
    update-initramfs -u
    

    7. Enable boot from the first SSD

    Now the system is almost ready; only the UEFI boot parameters need to be installed.

    mount /dev/sda1 /boot/efi
    grub-install --boot-directory=/boot --bootloader-id=Ubuntu --target=x86_64-efi --efi-directory=/boot/efi --recheck
    update-grub
    umount /dev/sda1
    

    This will install the boot loader in /boot/efi/EFI/Ubuntu (a.k.a. EFI/Ubuntu on /dev/sda1) and install it first in the UEFI boot chain on the computer.

    8. Enable boot from the second SSD

    We're almost done. At this point, we should be able to reboot on the sda drive. Furthermore, mdadm should be able to handle failure of either the sda or sdb drive. However, the EFI partition is not RAIDed, so we need to clone it.

    dd if=/dev/sda1 of=/dev/sdb1
    

    In addition to installing the boot loader on the second drive, this will make the UUID of the FAT32 file system on the sdb1 partition (as reported by blkid) match that of sda1 and /etc/fstab. (Note however that the UUIDs for the /dev/sda1 and /dev/sdb1 partitions will still be different - compare ls -la /dev/disk/by-partuuid | grep sd[ab]1 with blkid /dev/sd[ab]1 after the install to check for yourself.)
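
If you want to confirm that the clone really is byte-for-byte identical, cmp can do it. A sketch follows; the function name verify_clone is made up for this example, and it assumes both partitions are unmounted and readable (i.e. you are root, as in the chroot above).

```shell
# Hypothetical helper: report whether two files or block devices
# have identical contents. $1 = source, $2 = copy.
verify_clone() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}
```

Possible usage: verify_clone /dev/sda1 /dev/sdb1. It should print "identical" right after the dd above.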

    Finally, we must insert the sdb1 partition into the boot order. (Note: This step may be unnecessary, depending on your BIOS. I have gotten reports that some BIOSes automatically generate a list of valid ESPs.)

    efibootmgr -c -g -d /dev/sdb -p 1 -L "Ubuntu #2" -l '\EFI\ubuntu\grubx64.efi'
    

    I did not test it, but it is probably necessary to have unique labels (-L) between the ESP on sda and sdb.

    This will generate a printout of the current boot order, e.g.

    Timeout: 0 seconds
    BootOrder: 0009,0008,0000,0001,0002,000B,0003,0004,0005,0006,0007
    Boot0000  Windows Boot Manager
    Boot0001  DTO UEFI USB Floppy/CD
    Boot0002  DTO UEFI USB Hard Drive
    Boot0003* DTO UEFI ATAPI CD-ROM Drive
    Boot0004  CD/DVD Drive 
    Boot0005  DTO Legacy USB Floppy/CD
    Boot0006* Hard Drive
    Boot0007* IBA GE Slot 00C8 v1550
    Boot0008* Ubuntu
    Boot000B  KingstonDT 101 II PMAP
    Boot0009* Ubuntu #2
    

    Note that Ubuntu #2 (sdb) and Ubuntu (sda) are the first in the boot order.
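
Because the entries are printed in numeric order rather than boot order, it can be easier to read the list after re-sorting it. The following is a sketch under assumptions: the function name boot_order_labels is made up here, and it assumes the plain output format shown above (newer efibootmgr versions append device paths after the label, which would then be printed too).

```shell
# Hypothetical helper: read efibootmgr-style output on stdin and
# print "NNNN: label" lines in the order given by BootOrder.
boot_order_labels() {
    awk '
        /^BootOrder:/ { n = split($2, order, ","); next }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            id = substr($1, 5, 4)          # the four hex digits
            label = $0
            sub(/^[^ ]+ +/, "", label)     # drop "BootNNNN* "
            labels[id] = label
        }
        END { for (i = 1; i <= n; i++) print order[i] ": " labels[order[i]] }
    '
}
```

Possible usage: efibootmgr | boot_order_labels. With the printout above, the first two lines would be the Ubuntu #2 and Ubuntu entries.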

    Reboot

    Now we are ready to reboot.

    exit # from chroot
    exit # from sudo -s
    sudo reboot
    

    The system should now reboot into Ubuntu. (You may have to remove the Ubuntu Live installation media first.)

    After boot, you may run

    sudo update-grub
    

    to attach the Windows boot loader to the grub boot chain.

    Virtual machine gotchas

    If you want to try this out in a virtual machine first, there are some caveats: Apparently, the NVRAM that holds the UEFI information is remembered between reboots, but not between shutdown-restart cycles. In that case, you may end up at the UEFI Shell console. The following commands should boot you into your machine from /dev/sda1 (use FS1: for /dev/sdb1):

    FS0:
    \EFI\ubuntu\grubx64.efi
    

    The first solution in the top answer of UEFI boot in virtualbox - Ubuntu 12.04 might also be helpful.

    Simulating a disk failure

    Failure of either RAID component device can be simulated using mdadm. However, to verify that the boot stuff would survive a disk failure I had to shut down the computer and disconnect power from a disk. If you do so, first ensure that the md devices are sync'ed.

    cat /proc/mdstat 
    
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md1 : active raid1 sdb3[2] sda3[0]
          216269952 blocks super 1.2 [2/2] [UU]
          bitmap: 2/2 pages [8KB], 65536KB chunk
    
    md0 : active raid1 sda2[0] sdb2[2]
          33537920 blocks super 1.2 [2/2] [UU]
          bitmap: 0/1 pages [0KB], 65536KB chunk
    
    unused devices: <none>
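
A quick way to check for an ongoing sync without reading the whole printout is to look for a recovery or resync progress line. This is a sketch: the function name sync_state is made up for this example, and it assumes the progress line format shown earlier in this HOWTO.

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and
# print "syncing" if any array shows recovery/resync progress,
# otherwise "idle".
sync_state() {
    if grep -Eq '(recovery|resync) *='; then
        echo "syncing"
    else
        echo "idle"
    fi
}
```

Possible usage: sync_state < /proc/mdstat. Wait until it prints "idle" before pulling a drive. As far as I know, mdadm --wait /dev/md0 /dev/md1 can also be used to block until the arrays have finished.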
    

    In the instructions below, sdX is the failed device (X=a or b) and sdY is the ok device.

    Disconnect a drive

    Shut down the computer. Disconnect a drive. Restart. Ubuntu should now boot with the RAID drives in degraded mode. (Celebrate! This is what you were trying to achieve! ;)

    cat /proc/mdstat 
    
    Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
    md1 : active raid1 sda3[0]
          216269952 blocks super 1.2 [2/1] [U_]
          bitmap: 2/2 pages [8KB], 65536KB chunk
    
    md0 : active raid1 sda2[0]
          33537920 blocks super 1.2 [2/1] [U_]
          bitmap: 0/1 pages [0KB], 65536KB chunk
    
    unused devices: <none>
    

    Recover from a failed disk

    This is the process to follow if you have needed to replace a faulty disk. If you want to emulate a replacement, you may boot into a Ubuntu Live session and use

    sudo dd if=/dev/zero of=/dev/sdX
    

    to wipe the disk clean before rebooting into the real system. If you just tested the boot/RAID redundancy in the section above, you can skip this step. However, you must at least perform steps 2 and 4 below to recover full boot/RAID redundancy for your system.

    Restoring the RAID+boot system after a disk replacement requires the following steps:

    1. Partition the new drive.
    2. Add partitions to md devices.
    3. Clone the boot partition.
    4. Add an EFI record for the clone.

    1. Partition the new drive

    Copy the partition table from the healthy drive:

    sudo sgdisk /dev/sdY -R /dev/sdX
    

    Re-randomize UUIDs on the new drive.

    sudo sgdisk /dev/sdX -G
    

    2. Add to md devices

    sudo mdadm --add /dev/md0 /dev/sdX2
    sudo mdadm --add /dev/md1 /dev/sdX3
    

    3. Clone the boot partition

    Clone the ESP from the healthy drive. (Careful, maybe do a dump-to-file of both ESPs first to enable recovery if you really screw it up.)

    sudo dd if=/dev/sdY1 of=/dev/sdX1
    

    4. Insert the newly revived disk into the boot order

    Add an EFI record for the clone. Modify the -L label as required.

    sudo efibootmgr -c -g -d /dev/sdX -p 1 -L "Ubuntu #2" -l '\EFI\ubuntu\grubx64.efi'
    

    Now, rebooting the system should have it back to normal (the RAID devices may still be sync'ing)!
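
To avoid mixing up sdX and sdY when typing the four steps above, you could print the commands for review first. This is only a sketch: the function name recovery_plan is made up for this example, it prints the commands rather than running them, and it assumes the partition layout used throughout this HOWTO (partition 1 = ESP, 2 and 3 = RAID members of md0 and md1).

```shell
# Hypothetical helper: print (not run) the recovery commands.
# $1 = healthy disk (sdY), $2 = replacement disk (sdX),
# $3 = optional EFI label, defaults to "Ubuntu #2".
recovery_plan() {
    good=$1
    new=$2
    label=${3:-"Ubuntu #2"}
    printf 'sgdisk %s -R %s\n' "$good" "$new"
    printf 'sgdisk %s -G\n' "$new"
    printf 'mdadm --add /dev/md0 %s2\n' "$new"
    printf 'mdadm --add /dev/md1 %s3\n' "$new"
    printf 'dd if=%s1 of=%s1\n' "$good" "$new"
    printf 'efibootmgr -c -g -d %s -p 1 -L "%s" -l %s\n' \
        "$new" "$label" "'\\EFI\\ubuntu\\grubx64.efi'"
}
```

Possible usage: recovery_plan /dev/sda /dev/sdb. Read the printed lines carefully, then run them one at a time with sudo.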

    Why the sleep script?

    It has been suggested by the community that adding a sleep script might be unnecessary and could be replaced by using GRUB_CMDLINE_LINUX="rootdelay=30" in /etc/default/grub followed by sudo update-grub. This suggestion is certainly cleaner and does work in a disk failure/replace scenario. However, there is a caveat...

    I disconnected my second SSD and found out that with rootdelay=30, etc. instead of the sleep script:
    1) The system does boot in degraded mode without the "failed" drive.
    2) In non-degraded boot (both drives present), the boot time is reduced. The delay is only perceptible with the second drive missing.

    1) and 2) sounded great until I re-added my second drive. At boot, the RAID array failed to assemble and left me at the initramfs prompt without knowing what to do. It might have been possible to salvage the situation by a) booting to the Ubuntu Live USB stick, b) installing mdadm and c) re-assembling the array manually but...I messed up somewhere. Instead, when I re-ran this test with the sleep script (yes, I did start the HOWTO from the top for the nth time...), the system did boot. The arrays were in degraded mode and I could manually re-add the /dev/sdb[23] partitions without any extra USB stick. I don't know why the sleep script works whereas the rootdelay doesn't. Perhaps mdadm gets confused by two, slightly out-of-sync component devices, but I thought mdadm was designed to handle that. Anyway, since the sleep script works, I'm sticking to it.

    It could be argued that removing a perfectly healthy RAID component device, re-booting the RAID to degraded mode and then re-adding the component device is an unrealistic scenario: The realistic scenario is rather that one device fails and is replaced by a new one, leaving less opportunity for mdadm to get confused. I agree with that argument. However, I don't know how to test how the system tolerates a hardware failure except to actually disable some hardware! And after testing, I want to get back to a redundant, working system. (Well, I could attach my second SSD to another machine and wipe it before I re-add it, but that's not feasible.)

    In summary: To my knowledge, the rootdelay solution is clean, faster than the sleep script for non-degraded boots, and should work for a real drive failure/replace scenario. However, I don't know a feasible way to test it. So, for the time being, I will stick to the ugly sleep script.