Setting up full NVMe/TCP BFS with QEMU
How it works:
We need to create a target VM image that contains the RHEL root filesystem; it will be exported
by an NVMe/TCP target instance on a remote machine.
On the local machine we will run the UEFI firmware with QEMU: the firmware will connect to
the remote target and load the kernel, which will then take over the boot process using the
information provided by the UEFI firmware via the NBFT table.
Local Machine Remote Machine
------------------ -----------------------
| | | |
| QEMU + UEFI | | NVMe/TCP target |
| + EFIDISK | -------> LAN ---------> | RHEL rootfs |
| | | |
------------------ -----------------------
Prepare the target RHEL9.2 VM image:
Install a RHEL9.2 VM that we will use as a target:
$ qemu-system-x86_64 --enable-kvm -bios OVMF-pure-efi.fd \
    -drive file=rhel9disk_target,if=none,id=NVME1 \
    -device nvme,drive=NVME1,serial=nvme-1,physical_block_size=4096,logical_block_size=4096 \
    -cpu host -net user -net nic \
    -cdrom rhel9.2.iso -boot d -m 8G
WARNING: remember to set the disk's logical and physical block size to 4096 bytes.
You can download the OVMF-pure-efi BIOS image here.
Boot the newly installed VM and read the grub.cfg file:
# cat /boot/efi/EFI/redhat/grub.cfg
search --no-floppy --fs-uuid --set=dev 877ea44d-0b8b-4abd-b777-9294757abfd0 <---
set prefix=($dev)/grub2
Copy the device's UUID; we will need it later.
Download the RHEL9.2 kernel with the timberland patches and install it on the VM.
Download the timberland dracut RPM packages and install them.
Download the timberland libnvme RPM packages and install them.
Download the timberland nvme-cli RPM and install it.
Copy the hostnqn and hostid; we will need them later:
# cat /etc/nvme/hostnqn
# cat /etc/nvme/hostid
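For reference, these are the host values used throughout this guide; they will show up
again in the target's ACL and in the UEFI config file:
hostnqn: nqn.2014-08.org.nvmexpress:uuid:f687aaae-016f-4268-9fc3-5d8220e5a23b
hostid:  1acb7b81-0194-48f4-8794-587472b60bd2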
Note: on QEMU VMs the hostnqn sometimes contains an invalid UUID,
for example: nqn.2014-08.org.nvmexpress:uuid:0000000-0000-0000-0000-000000000000
If this happens, you can generate a valid UUID using the uuidgen command and
replace the content of the /etc/nvme/hostnqn file, as shown below.
This issue is likely due to a defect in the libnvme library that I will investigate soon.
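For example, a one-liner (run inside the VM) that rewrites the file with a freshly
generated UUID, keeping the standard nqn.2014-08.org.nvmexpress:uuid: prefix shown above:
# echo "nqn.2014-08.org.nvmexpress:uuid:$(uuidgen)" > /etc/nvme/hostnqn
# cat /etc/nvme/hostnqn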
Select the timberland kernel as the default kernel and reboot the VM:
# grubby --set-default /boot/vmlinuz-5.14.0-206_nbft4.el9.x86_64
# reboot
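After the reboot, you can confirm that the timberland kernel is the one running:
# uname -r
5.14.0-206_nbft4.el9.x86_64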
Update the system's initramfs, adding the nvmf dracut module:
# dracut -f -v --add nvmf
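You can check that the module actually landed in the new initramfs (lsinitrd ships
with dracut):
# lsinitrd -m | grep nvmf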
The VM image is now ready; you can shut the VM down.
------------------------------------------------------
Create an NVMe/TCP target on the remote machine:
Install the nvmetcli utility and start its interactive shell:
# dnf install nvmetcli
# nvmetcli
Create a host entry using the VM's NQN:
/> hosts/ create nqn=nqn.2014-08.org.nvmexpress:uuid:f687aaae-016f-4268-9fc3-5d8220e5a23b
Create a port entry for the target:
/> ports/ create portid=1
/> cd ports/1/
/ports/1> set addr adrfam=ipv4 traddr=10.37.153.132 trtype=tcp trsvcid=4420
/ports/1> cd /
Create the subsystem:
/> subsystems/ create nqn=nqn.2014-08.org.nvmexpress:uuid:4793044c-27c2-11b2-a85c-ec74d87fa65f
Add the ACL entry of the host:
/> cd subsystems/nqn.2014-08.org.nvmexpress:uuid:4793044c-27c2-11b2-a85c-ec74d87fa65f/
/subsystems/n...-ec74d87fa65f> allowed_hosts/ create nqn.2014-08.org.nvmexpress:uuid:f687aaae-016f-4268-9fc3-5d8220e5a23b
Create a namespace, add the VM disk and enable it:
/subsystems/n...-ec74d87fa65f> namespaces/ create nsid=1
/subsystems/n...-ec74d87fa65f> cd namespaces/1
/subsystems/n.../namespaces/1> set device path=/home/rhel9disk_target
/subsystems/n.../namespaces/1> enable
Link the subsystem to the port:
/subsystems/n.../namespaces/1> cd /ports/1/
/ports/1> subsystems/ create nqn.2014-08.org.nvmexpress:uuid:4793044c-27c2-11b2-a85c-ec74d87fa65f
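Optionally, save the configuration from the nvmetcli shell so it survives a reboot
(/etc/nvmet/config.json is nvmetcli's default config path):
/> saveconfig /etc/nvmet/config.json
It can be reloaded later with:
# nvmetcli restore /etc/nvmet/config.json
You can also verify that the target is reachable by running a discovery from another host
with nvme-cli:
# nvme discover -t tcp -a 10.37.153.132 -s 4420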
------------------------------------------------------
Run the QEMU VM on the local machine:
On the local machine, download the efidisk image, extract it and mount its EFI system
partition (the 1048576-byte offset skips to the partition start at 1 MiB):
# tar xvf efidisk.tar.bz2
# mkdir -p /mnt/efidisk
# mount -t vfat -o loop,offset=1048576 efidisk /mnt/efidisk
Fix the /mnt/efidisk/EFI/redhat/grub.cfg file in the efidisk image, using the device UUID you copied earlier.
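For example, with the UUID read earlier from the target VM, the relevant lines should
end up looking like this:
search --no-floppy --fs-uuid --set=dev 877ea44d-0b8b-4abd-b777-9294757abfd0
set prefix=($dev)/grub2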
Open the /mnt/efidisk/EFI/BOOT/config file that contains the parameters used by the UEFI firmware
to connect to the NVMe/TCP target.
The config file has the following format:
# cat config
HostNqn:nqn.2014-08.org.nvmexpress:uuid:f687aaae-016f-4268-9fc3-5d8220e5a23b
HostId:1acb7b81-0194-48f4-8794-587472b60bd2
$Start
AttemptName:Attempt1
MacString:52:54:00:12:34:56
TargetPort:4420
Enabled:1
InitiatorInfoFromDhcp:TRUE
IpMode:0
TargetIp:10.37.153.132
NQN:nqn.2014-08.org.nvmexpress:uuid:4793044c-27c2-11b2-a85c-ec74d87fa65f
ConnectTimeout:600
DnsMode:FALSE
$End
HostNqn is the VM's NQN.
HostId is the VM's hostid.
TargetPort is the target's TCP port.
InitiatorInfoFromDhcp:TRUE means that the DHCP mode is active.
TargetIp is the target's IP address.
NQN is the target's NQN.
Change the config file according to your configuration.
If your setup needs a static IP address, you can find a config file example
here; a sketch is also shown below.
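As a rough sketch, a static attempt would disable the DHCP option and carry the
addressing explicitly. Note that the LocalIp, SubnetMask and Gateway key names and all
addresses below are assumptions, not taken from this guide; check them against the
example config shipped with the firmware:
$Start
AttemptName:Attempt1
MacString:52:54:00:12:34:56
TargetPort:4420
Enabled:1
InitiatorInfoFromDhcp:FALSE
IpMode:0
LocalIp:10.37.153.200
SubnetMask:255.255.255.0
Gateway:10.37.153.1
TargetIp:10.37.153.132
NQN:nqn.2014-08.org.nvmexpress:uuid:4793044c-27c2-11b2-a85c-ec74d87fa65f
ConnectTimeout:600
DnsMode:FALSE
$End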
Note: the VM must be able to establish a TCP/IP connection to the target; you might need to set up
a bridged network device if the target is on the same subnet as the VM, as in the example below.
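A minimal sketch, assuming a preconfigured host bridge named br0 (the bridge name is an
assumption; this also requires qemu-bridge-helper, with br0 allowed in
/etc/qemu/bridge.conf). Replace the user-mode network options of the QEMU command below
with:
    -netdev bridge,id=host0,br=br0 \
    -device virtio-net-pci,netdev=host0,romfile=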
Unmount the efidisk image:
# umount /mnt/efidisk
Download the UEFI firmware files (OVMF_CODE.fd and OVMF_VARS.fd).
Run the QEMU virtual machine:
qemu-system-x86_64 --enable-kvm \
    -drive if=pflash,format=raw,readonly=on,file=OVMF_CODE.fd \
    -drive if=pflash,format=raw,file=OVMF_VARS.fd \
    -drive file=efidisk,if=none,id=NVME1 -device nvme,drive=NVME1,serial=nvme-1 \
    -netdev type=user,id=host0 -device virtio-net-pci,netdev=host0,romfile= \
    -m 8G -cpu host
Note: the romfile= option is needed to avoid a known bug in the timberland UEFI firmware.
As soon as the VM starts, press the ESC key to enter the UEFI setup menu and change the
device boot order so that the EFI Internal Shell starts first, then reboot the VM.

The UEFI Shell will execute the startup script; let the countdown expire.

The firmware will now try to connect to the target; the process may take a few seconds.

The UEFI boot menu will now appear; open the "Boot Manager" and select the "redhat" entry.

If the connection to the target was successful, the firmware should be able to
load the GRUB bootloader and execute it; loading the kernel may take several seconds.

The system boots: the "nvme" utility reads the NBFT table and connects to the target,
and the kernel then mounts the remote device as the root filesystem.
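As a quick sanity check from the booted system, "nvme list-subsys" should show the
subsystem NQN connected over tcp, and "findmnt /" should show the root filesystem
sitting on the NVMe namespace:
# nvme list-subsys
# findmnt /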
