---
OCP Version at Install Time: 48.84.202107202156-0
RHCOS Version at Install Time: 48.84.202107202156-0
Platform: bare metal
Architecture: x86_64

What are you trying to do? What is your use case?

Bare metal UPI OCP 4.8 deployment with a custom Ignition file, needed because there are 2 worker node groups (one group of workers has 2 disks attached, the other has 3 disks attached).

What happened? What went wrong or what did you expect?

The idea is to generate 2 Ignition files using Butane (one for workers with 2 disks and one for workers with 3 disks).

What are the steps to reproduce your issue? Please try to reduce these steps to something that can be reproduced with a single RHCOS node.

1) Generating Ignition files with the installer

#!/bin/bash
## Clean old clusterconfig
rm -Rf clusterconfig
mkdir clusterconfig
cp backup/install-config.yaml clusterconfig/
## Create Manifests
./openshift-install create manifests --dir clusterconfig/
## Set masters as unschedulable
sed -i 's/mastersSchedulable\: true/mastersSchedulable\: false/g' clusterconfig/manifests/cluster-scheduler-02-config.yml
## OPTIONAL MACHINECONFIGS
#cp backup/98-var-partition-worker.yaml clusterconfig/openshift/
cp backup/98-var-partition-master.yaml clusterconfig/openshift/
#cp backup/98-var-partition-infra.yaml clusterconfig/openshift/
## Create Ignition Files
./openshift-install create ignition-configs --dir clusterconfig/ ###4.6
## Remove old ignition files and replace with new
rm -f /var/www/html/openshift/ocp-datahub/*.ign
cp clusterconfig/*.ign /var/www/html/openshift/ocp-datahub/
chcon -R system_u:object_r:httpd_sys_content_t:s0 /var/www/html/
chown -R apache: /var/www/html/openshift/

./openshift-install version
./openshift-install 4.8.2
built from commit a5ddd2dd6c72d8a5ea0a5f17acd8b964b6a3d1be
release image quay.io/openshift-release-dev/ocp-release@sha256:0e82d17ababc79b10c10c5186920232810aeccbccf2a74c691487090a2c98ebc

2) Generating a custom worker.ign with Butane, using the merge+local feature to include the default worker.ign generated by the OpenShift installer in step 1)

cat test_ocp.bu
variant: openshift
version: 4.8.0
metadata:
  name: 98-worker-var-partition
  labels:
    machineconfiguration.openshift.io/role: worker
ignition:
  config:
    merge:
      - local: worker_default.ign
storage:
  disks:
    - device: /dev/sdb
      wipe_table: true
      partitions:
        - number: 1
          label: var
  filesystems:
    - path: /var
      device: /dev/disk/by-partlabel/var
      format: xfs
      wipe_filesystem: true
      label: var
      with_mount_unit: true
systemd:
  units:
    - name: var.mount
      enabled: true
      contents: |
        [Unit]
        Before=local-fs.target
        [Mount]
        Where=/var
        What=/dev/disk/by-partlabel/var
        [Install]
        WantedBy=local-fs.target

podman run --rm --tty --interactive --security-opt label=disable --volume ${PWD}:/pwd --workdir /pwd quay.io/coreos/butane:release --pretty --strict test_ocp.bu --files-dir . --raw > worker.ign

3) Transferring the generated worker.ign to the HTTP server, from which it is passed to coreos-installer, results in a worker node that cannot start kubelet-auto-node-size.service because /usr/local/sbin/dynamic-system-reserved-calc.sh does not exist:

[core@worker-001 ~]$ journalctl -u kubelet-auto-node-size.service
-- Reboot --
Aug 05 13:56:22worker-001 systemd[1]: Starting Dynamically sets the system reserved for the kubelet...
Aug 05 13:56:22worker-001 bash[3093]: /bin/bash: /usr/local/sbin/dynamic-system-reserved-calc.sh: No such file or directory
Aug 05 13:56:22worker-001 systemd[1]: kubelet-auto-node-size.service: Main process exited, code=exited, status=127/n/a
Aug 05 13:56:22worker-001 systemd[1]: kubelet-auto-node-size.service: Failed with result 'exit-code'.
Aug 05 13:56:22worker-001 systemd[1]: Failed to start Dynamically sets the system reserved for the kubelet.
Aug 05 13:56:22worker-001 systemd[1]: kubelet-auto-node-size.service: Consumed 1ms CPU time Script is included in mc 00-worker: oc get mc 00-worker -o yaml | grep -B10 dynamic-system-reserved-calc.sh path: /etc/modules-load.d/iptables.conf - contents: source: data:,NODE_SIZING_ENABLED%3Dfalse%0ASYSTEM_RESERVED_MEMORY%3D1Gi%0ASYSTEM_RESERVED_CPU%3D500m mode: 420 overwrite: true path: /etc/node-sizing-enabled.env - contents: source: data:,%23!%2Fbin%2Fbash%0Aset%20-e%0ANODE_SIZES_ENV%3D%24%7BNODE_SIZES_ENV%3A-%2Fetc%2Fnode-sizing.env%7D%0Afunction%20dynamic_memory_sizing%20%7B%0A%20%20%20%20total_memory%3D%24(free%20-g%7Cawk%20'%2F%5EMem%3A%2F%7Bprint%20%242%7D')%0A%20%20%20%20%23%20total_memory%3D8%20test%20the%20recommended%20values%20by%20modifying%20this%20value%0A%20%20%20%20recommended_systemreserved_memory%3D0%0A%20%20%20%20if%20((%24total_memory%20%3C%3D%204))%3B%20then%20%23%2025%25%20of%20the%20first%204GB%20of%20memory%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24total_memory%200.25%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D1%0A%20%20%20%20%20%20%20%20total_memory%3D%24((total_memory-4))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24total_memory%20%3C%3D%204))%3B%20then%20%23%2020%25%20of%20the%20next%204GB%20of%20memory%20(up%20to%208GB)%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%20%24(echo%20%24total_memory%200.20%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%200.80%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D%24((total_memory-4))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24tot
al_memory%20%3C%3D%208))%3B%20then%20%23%2010%25%20of%20the%20next%208GB%20of%20memory%20(up%20to%2016GB)%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%20%24(echo%20%24total_memory%200.10%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%200.80%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D%24((total_memory-8))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24total_memory%20%3C%3D%20112))%3B%20then%20%23%206%25%20of%20the%20next%20112GB%20of%20memory%20(up%20to%20128GB)%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%20%24(echo%20%24total_memory%200.06%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%206.72%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_memory%3D%24((total_memory-112))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24total_memory%20%3E%3D%200))%3B%20then%20%23%202%25%20of%20any%20memory%20above%20128GB%0A%20%20%20%20%20%20%20%20recommended_systemreserved_memory%3D%24(echo%20%24recommended_systemreserved_memory%20%24(echo%20%24total_memory%200.02%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20fi%0A%20%20%20%20echo%20%22SYSTEM_RESERVED_MEMORY%3D%24%7Brecommended_systemreserved_memory%7DGi%22%3E%3E%20%24%7BNODE_SIZES_ENV%7D%0A%7D%0Afunction%20dynamic_cpu_sizing%20%7B%0A%20%20%20%20total_cpu%3D%24(getconf%20_NPROCESSORS_ONLN)%0A%20%20%20%20recommended_systemreserved_cpu%3D0%0A%2
0%20%20%20if%20((%24total_cpu%20%3C%3D%201))%3B%20then%20%23%206%25%20of%20the%20first%20core%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D%24(echo%20%24total_cpu%200.06%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%0A%20%20%20%20%20%20%20%20total_cpu%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D0.06%0A%20%20%20%20%20%20%20%20total_cpu%3D%24((total_cpu-1))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24total_cpu%20%3C%3D%201))%3B%20then%20%23%201%25%20of%20the%20next%20core%20(up%20to%202%20cores)%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D%24(echo%20%24recommended_systemreserved_cpu%20%24(echo%20%24total_cpu%200.01%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_cpu%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D%24(echo%20%24recommended_systemreserved_cpu%200.01%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_cpu%3D%24((total_cpu-1))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24total_cpu%20%3C%3D%202))%3B%20then%20%23%200.5%25%20of%20the%20next%202%20cores%20(up%20to%204%20cores)%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D%24(echo%20%24recommended_systemreserved_cpu%20%24(echo%20%24total_cpu%200.005%20%7C%20awk%20'%7Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_cpu%3D0%0A%20%20%20%20else%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D%24(echo%20%24recommended_systemreserved_cpu%200.01%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20%20%20%20%20total_cpu%3D%24((total_cpu-2))%0A%20%20%20%20fi%0A%20%20%20%20if%20((%24total_cpu%20%3E%3D%200))%3B%20then%20%23%200.25%25%20of%20any%20cores%20above%204%20cores%0A%20%20%20%20%20%20%20%20recommended_systemreserved_cpu%3D%24(echo%20%24recommended_systemreserved_cpu%20%24(echo%20%24total_cpu%200.0025%20%7C%20awk%20'%7
Bprint%20%241%20*%20%242%7D')%20%7C%20awk%20'%7Bprint%20%241%20%2B%20%242%7D')%0A%20%20%20%20fi%0A%20%20%20%20echo%20%22SYSTEM_RESERVED_CPU%3D%24%7Brecommended_systemreserved_cpu%7D%22%3E%3E%20%24%7BNODE_SIZES_ENV%7D%0A%7D%0Afunction%20dynamic_ephemeral_sizing%20%7B%0A%20%20%20%20echo%20%22Not%20implemented%20yet%22%0A%7D%0Afunction%20dynamic_pid_sizing%20%7B%0A%20%20%20%20echo%20%22Not%20implemented%20yet%22%0A%7D%0Afunction%20dynamic_node_sizing%20%7B%0A%20%20%20%20rm%20-f%20%24%7BNODE_SIZES_ENV%7D%0A%20%20%20%20dynamic_memory_sizing%0A%20%20%20%20dynamic_cpu_sizing%0A%20%20%20%20%23dynamic_ephemeral_sizing%0A%20%20%20%20%23dynamic_pid_sizing%0A%7D%0Afunction%20static_node_sizing%20%7B%0A%20%20%20%20rm%20-f%20%24%7BNODE_SIZES_ENV%7D%0A%20%20%20%20echo%20%22SYSTEM_RESERVED_MEMORY%3D%241%22%20%3E%3E%20%24%7BNODE_SIZES_ENV%7D%0A%20%20%20%20echo%20%22SYSTEM_RESERVED_CPU%3D%242%22%20%3E%3E%20%24%7BNODE_SIZES_ENV%7D%0A%7D%0A%0Aif%20%5B%20%241%20%3D%3D%20%22true%22%20%5D%3B%20then%0A%20%20%20%20dynamic_node_sizing%0Aelif%20%5B%20%241%20%3D%3D%20%22false%22%20%5D%3B%20then%0A%20%20%20%20static_node_sizing%20%242%20%243%0Aelse%0A%20%20%20%20echo%20%22Unrecongnized%20command%20line%20option.%20Valid%20options%20are%20%5C%22true%5C%22%20or%20%5C%22false%5C%22%22%0Afi%0A mode: 493 overwrite: true path: /usr/local/sbin/dynamic-system-reserved-calc.sh -- If you're having problems booting/installing RHCOS, please provide: - the full contents of the serial console showing disk initialization, network configuration, and Ignition stage (see https://access.redhat.com/articles/7212 for information about configuring your serial console) - Ignition JSON - output of `journalctl -b`
The journal provided doesn't show the Ignition stages running, so it is not possible to determine why the `dynamic-system-reserved-calc.sh` is not present on the system. It looks like the system has already completed install and has rebooted. Please capture the output from the console during the first boot of the node, showing the Ignition stages executing.
(In reply to Micah Abbott from comment #2)
> The journal provided doesn't show the Ignition stages running, so it is not
> possible to determine why the `dynamic-system-reserved-calc.sh` is not
> present on the system. It looks like the system has already completed
> install and has rebooted.
>
> Please capture the output from the console during the first boot of the
> node, showing the Ignition stages executing.

I was under the impression that this would be needed, but I am struggling to get the logs from the console. The last thing I can see on the serial console is the GRUB menu; after that, I cannot see anything:

racadm>> console com2
Press the spacebar to pause...
KEY MAPPING FOR CONSOLE REDIRECTION:
Use the <ESC><1> key sequence for <F1>
Use the <ESC><2> key sequence for <F2>
Use the <ESC><3> key sequence for <F3>
Use the <ESC><0> key sequence for <F10>
Use the <ESC><!> key sequence for <F11>
Use the <ESC><@> key sequence for <F12>
Use the <ESC><Ctrl><M> key sequence for <Ctrl><M>
Use the <ESC><Ctrl><H> key sequence for <Ctrl><H>
Use the <ESC><Ctrl><I> key sequence for <Ctrl><I>
Use the <ESC><Ctrl><J> key sequence for <Ctrl><J>
Use the <ESC><X><X> key sequence for <Alt><x>, where x is any letter key, and X is the upper case of that key
Use the <ESC><R><ESC><r><ESC><R> key sequence for <Ctrl><Alt><Del>
F2 = System Setup
F10 = Lifecycle Controller
F11 = Boot Manager
F12 = PXE Boot
IPMI: Boot to
Initializing Serial ATA devices...
Avago Technologies MPT SAS3 BIOS
MPT3BIOS-8.37.02.00 (2020.03.02)
Copyright 2000-2020 Avago Technologies. All rights reserved
PCI ENCL LUN VENDOR PRODUCT PRODUCT SIZE \
SLOT SLOT NUM NAME IDENTIFIER REVISION NVDATA
---- ---- --- -------- ---------------- ------------ ----------
0 Dell Inc Dell SAS HBA 16.00.11.00 0E:01:00:39
0 7 0 ATA SSDSC2KG480G8R DL69 447.1 GiB
0 8 0 ATA SSDSC2KG480G8R DL69 447.1 GiB
2 supportable devices are presented for system boot selection!

Jumping to the GRUB entry editor:

load_video
set gfxpayload=keep
insmod gzio
linux ($root)/ostree/rhcos-4db93642d1d8298df8c1c9b655ba64df137a7572854a8dbc6ad\
c8ce6431a9cef/vmlinuz-4.18.0-305.10.2.el8_4.x86_64 random.trust_cpu=on console\
=tty0 console=ttyS0,115200n8 ignition.platform.id=metal ostree=/ostree/boot.0\
/rhcos/4db93642d1d8298df8c1c9b655ba64df137a7572854a8dbc6adc8ce6431a9cef/0 root\
=UUID=19ac6124-7c6e-4fba-bda9-7d9408cd7897 rw rootflags=prjquota
initrd ($root)/ostree/rhcos-4db93642d1d8298df8c1c9b655ba64df137a7572854a8dbc6a\
dc8ce6431a9cef/initramfs-4.18.0-305.10.2.el8_4.x86_64.img

This is a Dell PowerEdge R640 with iDRAC9. I followed the iDRAC settings from https://andrewladlow.co.uk/2019/09/28/idrac-serial-console-over-ssh/ but it didn't help: once the GRUB menu is gone (after entering the GRUB menu entry edit section and pressing Ctrl+X), I don't see any output.
I edited the GRUB menu entry and replaced console=tty0 console=ttyS0,115200n8 with console=tty1 console=ttyS1,115200n8. I can see the output now; it is provided in the attachment.
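For anyone repeating this on similar hardware, the manual edit can be sketched as a small filter over a kernel command line. This is only an illustration: the function name `switch_serial_console` is hypothetical, and the choice of ttyS1 reflects this particular iDRAC setup (console redirected on com2), not a general rule.

```shell
#!/bin/bash
# Hypothetical helper: rewrite the console= arguments of a kernel command
# line read from stdin, mirroring the edit made by hand in the GRUB editor.
# Assumption: the serial redirection is on the second port (ttyS1).
switch_serial_console() {
    sed -e 's/console=tty0/console=tty1/' \
        -e 's/console=ttyS0,/console=ttyS1,/'
}
```

Usage would be along the lines of `echo "console=tty0 console=ttyS0,115200n8" | switch_serial_console`; the baud rate and other arguments pass through unchanged.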
In the first log, we can see the Ignition stages running successfully: the script was written to disk, the service was written to disk, and the service was enabled:

```
[ OK 47.154218] ignition[2332]: INFO : files: createFilesystemsFiles: createFiles: op(1d): [started] writing file "/sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc"
0m] Reached targ[ 47.171268] ignition[2332]: INFO : files: createFilesystemsFiles: createFiles: op(1d): [finished] writing file "/sysroot/var/usrlocal/sbin/dynamic-system-res"
[ 47.650502] ignition[2332]: INFO : files: op(2c): [started] processing unit "kubelet-auto-node-size.service"
Startin[ 47.660915] ignition[2332]: INFO : files: op(2c): op(2d): [started] writing unit "kubelet-auto-node-size.service" at "/sysroot/etc/systemd/system/kubelet-au"
g Cleaning Up an[ 47.678617] ignition[2332]: INFO : files: op(2c): op(2d): [finished] writing unit "kubelet-auto-node-size.service" at "/sysroot/etc/systemd/system/kubelet-au"
d Shutting Down [ 47.696278] ignition[2332]: INFO : files: op(2c): [finished] processing unit "kubelet-auto-node-size.service"
[ 48.602430] ignition[2332]: INFO : files: op(49): [started] setting preset to enabled for "kubelet-auto-node-size.service"
[ 48.615155] ignition[2332]: INFO : files: op(49): [finished] setting preset to enabled for "kubelet-auto-node-size.service"
```

The system does eventually enter the real root and displays a login prompt; this is indicative of a successful first boot.

The second log shows the system coming up and then, in the middle of starting the network, the log appears to break off?

The `worker.ign` attached is missing the complete Ignition config that is being served to the node; the bulk of the config is being served from the cluster at `api-int.datahub-ocp4.prod.psi.redhat.com:22623/config/worker`

Where is the Ignition snippet that configures the `kubelet-auto-node-size.service`? Or the machine config YAML that defines it?
Could you hop on the failing node and do `systemctl cat kubelet-auto-node-size.service` and perhaps `ls -latrZ /usr/local/sbin`? It's still not clear based on the data provided what is going wrong.
[root@worker-dh-001 ~]# systemctl cat kubelet-auto-node-size.service
# /etc/systemd/system/kubelet-auto-node-size.service
[Unit]
Description=Dynamically sets the system reserved for the kubelet
Wants=network-online.target
After=network-online.target ignition-firstboot-complete.service
Before=kubelet.service crio.service
[Service]
# Need oneshot to delay kubelet
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=/etc/node-sizing-enabled.env
ExecStart=/bin/bash /usr/local/sbin/dynamic-system-reserved-calc.sh ${NODE_SIZING_ENABLED} ${SYSTEM_RESERVED_MEMORY} ${SYSTEM_RESERVED_CPU}
[Install]
RequiredBy=kubelet.service

[root@worker-dh-001 ~]# ls -latrZ /usr/local/sbin
total 4
drwxr-xr-x. 11 root root system_u:object_r:var_t:s0 114 Jul 16 16:03 ..
-rwxr-xr-x. 1 root root system_u:object_r:bin_t:s0 3003 Jul 16 16:03 set-valid-hostname.sh
drwxr-xr-x. 2 root root system_u:object_r:bin_t:s0 35 Jul 16 16:03 .

I've booted with rd.debug and can see that the script dynamic-system-reserved-calc.sh was actually created; however, it is missing from the preceding ls output:

[ 105.335206] ignition[2259]: INFO : files: createFilesystemsFiles: createFiles: op(1d): [started] writing file "/sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh"
[ 105.340782] ///usr/lib/dracut-lib.sh@291(getargnum): return
[ 105.355243] ignition[2259]: INFO : files: createFilesystemsFiles: createFiles: op(1d): [finished] writing file "/sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh"
If you boot with `rd.break` and do `ls /sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh` before switchroot, is it there? Also `lsblk` would help after the machine is fully booted to sanity-check mounts. Aside: note that you don't need to specify a separate `var.mount` in your Butane config if using `with_mount_unit: true`. It doesn't seem like you require any additional options, so you can let Butane generate the unit for you.
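To illustrate the aside, here is a sketch of the same Butane config with the hand-written `var.mount` unit dropped, so that `with_mount_unit: true` alone produces the mount unit. The YAML indentation is reconstructed from the pasted config, and the wrapper function `write_var_butane` is a hypothetical name used only to make the snippet self-contained.

```shell
#!/bin/bash
# Hypothetical helper: print the worker Butane config from the report,
# minus the explicit systemd section. with_mount_unit: true asks Butane
# to generate the mount unit for /var itself.
write_var_butane() {
    cat <<'EOF'
variant: openshift
version: 4.8.0
metadata:
  name: 98-worker-var-partition
  labels:
    machineconfiguration.openshift.io/role: worker
ignition:
  config:
    merge:
      - local: worker_default.ign
storage:
  disks:
    - device: /dev/sdb
      wipe_table: true
      partitions:
        - number: 1
          label: var
  filesystems:
    - path: /var
      device: /dev/disk/by-partlabel/var
      format: xfs
      wipe_filesystem: true
      label: var
      with_mount_unit: true
EOF
}
```

Something like `write_var_butane > test_ocp.bu` followed by the same `podman run ... butane` invocation from the report should then yield a `worker.ign` with a generated mount unit.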
This is when booted with rd.break when system was already provisioned using ignition file generated by butane (script is missing) Entering emergency mode. Exit the shell to continue. Type "journalctl" to view system logs. You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report. switch_root:/# ls /sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh ls: cannot access '/sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh': No such file or directory switch_root:/# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 447.1G 0 disk |-sda1 8:1 0 1M 0 part |-sda2 8:2 0 127M 0 part |-sda3 8:3 0 384M 0 part `-sda4 8:4 0 446.6G 0 part /sysroot/sysroot sdb 8:16 0 447.1G 0 disk `-sdb1 8:17 0 447.1G 0 part nvme0n1 259:0 0 1.5T 0 disk `-nvme0n1p1 259:1 0 1.5T 0 part switch_root:/# This is when system was PXE booted and sda disk got formatted and system got booted for the first time from disk (ignition file got applied and script is there): [ 46.877631] ignition[2338]: INFO : files: op(4f): [started] setting preset to enabled for "kubelet-auto-node-size.service" [ 46.889221] ignition[2338]: INFO : files: op(4f): [finished] setting preset to enabled for "kubelet-auto-node-size.service" Press Enter for [ 46.900849] ignition[2338]: INFO : files: op(50): [started] setting preset to enabled for "kubelet.service" emergency shell [ 46.912510] ignition[2338]: INFO : files: op(50): [finished] setting preset to enabled for "kubelet.service" or wait 5 minute[ 46.924176] ignition[2338]: INFO : files: op(51): [started] setting preset to enabled for "machine-config-daemon-firstboot.service" s for reboot. [ 46.937925] ignition[2338]: INFO : files: op(51): [finished] setting preset to enabled for "machine-config-daemon-firstboot.service" [ 46.951827] systemd[1]: Started Reload Configuration from the Real Root. 
[ 46.959862] dracut-pre-pivot[2451]: Warning: Break before switch_root [ 46.966392] d[ 46.966488] ignition[2338]: INFO : files: op(52): [started] relabeling 65 patterns racut-pre-pivot[[ 46.975960] ignition[2338]: DEBUG : files: op(52): executing: "setfiles" "-vF0" "-r" "/sysroot" "/sysroot/etc/selinux/targeted/contexts/files/file_contexts" "" 2451]: Warning: [ 46.992504] ignition[2338]: INFO : files: op(52): [finished] relabeling 65 patterns Break before swi[ 47.001995] ignition[2338]: INFO : files: files passed tch_root [ 47.008952] ignition[2338]: INFO : Ignition finished successfully [ 47.016347] systemd[1]: Reached target Initrd File Systems. [ 47.022056] systemd[1]: Reached target Initrd Default Target. [ 47.027923] systemd[1]: Starting dracut pre-pivot and cleanup hook... [ 47.034484] systemd[1]: Starting Setup Virtual Console... switch_root:/# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 447.1G 0 disk |-sda1 8:1 0 1M 0 part |-sda2 8:2 0 127M 0 part |-sda3 8:3 0 384M 0 part `-sda4 8:4 0 446.6G 0 part /sysroot/sysroot sdb 8:16 0 447.1G 0 disk `-sdb1 8:17 0 447.1G 0 part nvme0n1 259:0 0 1.5T 0 disk `-nvme0n1p1 259:1 0 1.5T 0 part /sysroot/var switch_root:/# [ 47.040006] systemd[1]: Stopped target Initrd Default Target. [ 47.045883] systemd[1]: Stopped target Ignition Complete. [ 47.051418] systemd[1]: Stopped target Ignition Boot Disk Setup. [ 47.057542] systemd[1]: systemd-vconsole-setup.service: Succeeded. [ 47.063836] systemd[1]: Started Setup Virtual Console. [ 47.069099] systemd[1]: Starting Dracut Emergency Shell... Press Enter for emergency shell or wait 4 minutes 45 seconds for reboot. Generating "/run/initramfs/rdsosreport.txt" Entering emergency mode. Exit the shell to continue. Type "journalctl" to view system logs. You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report. 
switch_root:/# ls /sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh /sysroot/var/usrlocal/sbin/dynamic-system-reserved-calc.sh switch_root:/#
Right yeah, the `rd.break` on subsequent boots won't work for this test, because it's only in the first boot that Ignition runs and that mountpoints are still active in the switchroot shell. Looking at your second paste of `lsblk` though in the switchroot shell of the PXE boot, it looks like `nvme0n1p1` was mounted at `/var`, and not `sdb1`. Yet, by your Butane config I suspect you want `/dev/sdb1` mounted at `/var`, not the NVMe device. So I think what's likely happening here is that `/dev/nvme0n1p1` also has partition label `var` (maybe from a previous installation attempt when that was the configuration?), and then in the real root `What=/dev/disk/by-partlabel/var` is racy and might sometimes point to the NVMe device (where nothing was written) instead of `/dev/sdb1`. Can you try either changing all instances of `/dev/disk/by-partlabel/var` to `/dev/sdb1` in the MC, or alternatively extend the MC to nuke all filesystems and/or partitions on the NVMe device (using `wipe_table: true`)?
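One way to confirm the duplicate-label theory is to look for partition labels that appear on more than one device. A minimal sketch follows, assuming the input has one `NAME PARTLABEL` pair per line, as `lsblk -rno NAME,PARTLABEL` is expected to print (with the label column left empty for unlabeled devices). The function name `find_duplicate_partlabels` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "NAME PARTLABEL" lines on stdin and report
# every partition label carried by more than one device.
find_duplicate_partlabels() {
    awk 'NF >= 2 {
             count[$2]++                    # devices seen per label
             names[$2] = names[$2] " " $1   # remember which devices
         }
         END {
             for (label in count)
                 if (count[label] > 1)
                     print label ":" names[label]
         }'
}
```

On the node this would be run as `lsblk -rno NAME,PARTLABEL | find_duplicate_partlabels`. Any output means `/dev/disk/by-partlabel/<label>` is ambiguous for that label.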
Correct, i've removed sdb1 and nvme0n1p1 partitions from inside CoreOS (cfdisk) and re-provision using worker.ign generated by butane shared in this bz and all worked fine: 1) provision worker-001 with 3 disks - sda, sdb, nvme0n1 - sda rootfs - sdb not used - nvme0n1 /var { "ignition": { "config": { "merge": [ { "source": "data:,%7B%22ignition%22%3A%7B%22config%22%3A%7B%22merge%22%3A%5B%7B%22source%22%3A%22https%3A%2F%2Fapi-int.XXXX.XXXX.XXXX.redhat.com%3A22623%2Fconfig%2Fworker%22%7D%5D%7D%2C%22security%22%3A%7B%22tls%22%3A%7B%22certificateAuthorities%22%3A%5B%7B%22source%22%3A%22data%3Atext%2Fplain%3Bcharset%3Dutf-8%3Bbase64%2CLS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFRENDQWZpZ0F3SUJBZ0lJQ3hWeGpnSkFPN2N3RFFZSktvWklodmNOQVFFTEJRQXdKakVTTUJBR0ExVUUKQ3hNSmIzQmxibk5vYVdaME1SQXdEZ1lEVlFRREV3ZHliMjkwTFdOaE1CNFhEVEl4TURnd05UQTVNelExT0ZvWApEVE14TURnd016QTVNelExT0Zvd0pqRVNNQkFHQTFVRUN4TUpiM0JsYm5Ob2FXWjBNUkF3RGdZRFZRUURFd2R5CmIyOTBMV05oTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUF1YTN5aDM1NE5FMFIKSktpR1lCZ2R3VUYzNVJBSW9wd2pKQW1LVXNPYzc4WDFKcnZCK3luZlVMbkZyazdoeEYxVXp6ekZmRGcxSnN6SQpaV1RRczN3azdDVHBocS90VWxxRkVyblE5c0FNNjBJZGYzRmRkMHJUMVFLN2RhelNxbGl6b2czRFFrYVFRT2tVCitqclJQZUJRd3JWejBuMml1S2dKR2l2ZVlJVzY5TWxWQWpXc3NQWDV6RDFmZjZBNTNsYXlRYU1wTlBjb1Z0dWIKZkhMZDZJV2RXOFNBM2Jnelg4R2V5YU90Z3RMT0QzNHJmS3pPRzFNU2dYRkhxZXNIbTl4OG5QSkJ1aXd5VnhVVAplMDhwU0Vib1ZJZko5WlJqSmlZWEJNM3hma2NHdi9yQm1ZVERUdjhyaFQ1OTZqMzBHZjZObnVoN1JpdzgwTDRkCkpGRFNwNkROSXdJREFRQUJvMEl3UURBT0JnTlZIUThCQWY4RUJBTUNBcVF3RHdZRFZSMFRBUUgvQkFVd0F3RUIKL3pBZEJnTlZIUTRFRmdRVXBkM1RTUGJTUEUrSXVFYVVXL0NHS21CdDFiVXdEUVlKS29aSWh2Y05BUUVMQlFBRApnZ0VCQUdIZGR1T2l2alJveThCM3VwU3dGMHhxSmM5NzROSitDT3AvUG5YL0l5MUM2Y0V2UEZ4QTJQelBLa2NOCldKSW1mcy8reDJwMEdIY0cwTDFjQ0dNeXl6TmIzNUdRNkpjYUx2c0N6dUpqWHhoeDAwQm82VlgyUXV6TzNocW0KNXNDMzhIWGoyMEI4Tyt6dXMzRjE4RlI1WFFiSGJxcVZRYmZ0Wi9aaHZRQ0cvUkYybDduSDR1bTZtT1BScm1kKwoycHlqeW4yL3dwVFRFUjRZYm5CNi9jR0JLK2FDeHhaVlRuazc5Q1V3U1lzMjJGNDFlOUhSMTAvNE56dlh3cUFEClZlTmRVTWxLOWwvcHZ1dDc2N1F
BOWRhZXkxSnlLdmlZTHZhajRvYnlnUWhOR0JCVWtnY0FwaElnWHFIQS85YXIKeFZSRnI2OEtLTy93OUVkYW9FVUtXYzFiMzZnPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg%3D%3D%22%7D%5D%7D%7D%2C%22version%22%3A%223.2.0%22%7D%7D" } ] }, "version": "3.2.0" }, "storage": { "disks": [ { "device": "/dev/nvme0n1", "partitions": [ { "label": "var", "number": 1 } ], "wipeTable": true } ], "filesystems": [ { "device": "/dev/disk/by-partlabel/var", "format": "xfs", "label": "var", "path": "/var", "wipeFilesystem": true } ] }, "systemd": { "units": [ { "contents": "[Unit]\nBefore=local-fs.target\n[Mount]\nWhere=/var\nWhat=/dev/disk/by-partlabel/var\n[Install]\nWantedBy=local-fs.target\n", "enabled": true, "name": "var.mount" } ] } } 2) provision worker-004...006 - sda rootfs - sdb /var - nvme0n1 - not present Have used the same worker.ign shown above, just changed the device name in the storage section from /dev/nvme0n1 to /dev/sdb, so far all works, deployed 4.8.2 like this, will try upgrade to 4.8.4 tomorrow to see how it behaves.
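The per-group variation described above (same worker.ign, only the disk in the storage section changed) can be sketched as a small filter. This is an illustration, not the procedure actually used: the function name `swap_var_device` is hypothetical, and it assumes the `"device": "..."` formatting that `butane --pretty` produced in the paste above.

```shell
#!/bin/bash
# Hypothetical helper: replace the disk named in "device": "<from>" with
# <to> in an Ignition JSON read from stdin. Only an exact match of the
# quoted device path is rewritten, so the by-partlabel filesystem entry
# is left untouched.
swap_var_device() {
    local from="$1" to="$2"
    sed "s#\"device\": \"${from}\"#\"device\": \"${to}\"#"
}
```

For example, `swap_var_device /dev/nvme0n1 /dev/sdb < worker-3disk.ign > worker-2disk.ign`, where both file names are placeholders.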
OK, i have another problem now, when using custom ignition file, however this one is not related exactly to custom ignition files... Master nodes on this bare metal system have one sas controller attached with 2 slots for the disks. SAS controller is attached to pci slot... When i use sda/sdb in the ignition config, the name for the disks is not persistent across the boots, first come first serves, so sometimes disk in slot 7 gets sda, sometimes the disk in slot 6 gets detected first and gets sda .... In an attempt to solve this i've used disk-by-id which are unique for each disk, so now i have dedicated ignition file for each master... control-dh-001.ign control-dh-002.ign control-dh-003.ign cat control-dh-001.ign { "ignition": { "config": { "merge": [ { "source": "data:,%7B%22ignition%22%3A%7B%22config%22%3A%7B%22merge%22%3A%5B%7B%22source%22%3A%22https%3A%2F%2Fapi-int.XXXXXXX.redhat.com%3A22623%2Fconfig%2Fmaster%22%7D%5D%7D%2C%22security%22%3A%7B%22tls%22%3A%7B%22certificateAuthorities%22%3A%5B%7B%22source%22%3A%22data%3Atext%2Fplain%3Bcharset%3Dutf-8%3Bbase64%2CLS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURFRENDQWZpZ0F3SUJBZ0lJTzlqaGpKU1pVZTB3RFFZSktvWklodmNOQVFFTEJRQXdKakVTTUJBR0ExVUUKQ3hNSmIzQmxibk5vYVdaME1SQXdEZ1lEVlFRREV3ZHliMjkwTFdOaE1CNFhEVEl4TURneU5qRTBOVGd4TVZvWApEVE14TURneU5ERTBOVGd4TVZvd0pqRVNNQkFHQTFVRUN4TUpiM0JsYm5Ob2FXWjBNUkF3RGdZRFZRUURFd2R5CmIyOTBMV05oTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUF5OHFRbFFlTXJvV0MKTXhjc1hvRWhJSTFKOTJLazUvRjgxYkJhc1hQeTNTZTN6bnhzWjlnTlkwR1N1ZE9uNGxGNE9hQllmUUtncGd4Swprd005YnloMjFYSmVnYmNaVXBrQWtaYzEvNTQzVVFQTG9pUzhFdzZ3U0plTkxBSzRqTVdmNGwvN1dYR0F6NFhqCjNYd0NvRUp4S0pxQWNsS0x5cytLZ2ttdkpEQlR2WWFJKy9CNWw0WVhaVzBxd01tQWtPSGhMTS8vSlhoRWdCNWEKcktsblZzZHdQbVFjVjAvWjg3Ym1rSVZLcFR3MFAwZmZhQUo3OFNrL2FTR3U5b3ZlbTNjWGpxNFZXd0hQYVdwOQpIWXc0SHVJQUtIQ2pqUmxBSnF2K2phMmxZaWJHS2pWRnFJSWdsZkFnV2hlQW1VZFdrblVKTHZJMy9kS3ZDN3MzCnhiYVhobmdybFFJREFRQUJvMEl3UURBT0JnTlZIUThCQWY4RUJBTUNBcVF3RHdZRFZSMFRBUUgvQkFVd0F3RUIKL3pBZEJnTlZIUT
RFRmdRVVcyM0ZibVpRRFZHTmh0d1Q1aFNBZCt5cVlMVXdEUVlKS29aSWh2Y05BUUVMQlFBRApnZ0VCQUZQaWpDMks3WVppNjkzaTUzTGs2T002YnM1eXNOb3JhdjUzU3FrODJ1bjJqSUNVbDllM2w3THNUb0FMCkVhM2k3eFZtT3ZYWS91UmtmMEhaTE9ubDJmMVNEN01aVXlCekZCVjVPUXpXcFFRWGpXajlnMTUzcUdMOTRUS2QKSnJrMnA1TE5FendKOEFFOENIZmRnVFgzS0Fmc2wxcDdGUmZNRXZzRVpWaHNEMm82NU9uUWNwUGlGeE1XSE9yVQpieU9CS3N5TXpUOEtOVmdRV0N6V3JMNW0ydnR6alhSd3cxekJ2VVBsVHB6TVFOajk3U0dwSUtlNExFcTdETTV3Cm5ZMk5TdGYyd0pKdURDeXBwWnVpVU5FMXJnTi9mZHIzS3M0ajcrejVvcll3N2UzS3Rwc0xhVzUvYnU0V2Z6YmcKOCtvNk9UblFPQ2VPRW1nc3B6ejVqa1pzZ3g4PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg%3D%3D%22%7D%5D%7D%7D%2C%22version%22%3A%223.2.0%22%7D%7D" } ] }, "version": "3.2.0" }, "storage": { "disks": [ { "device": "/dev/disk/by-id/scsi-355cd2e41530ac5de", "partitions": [ { "label": "var", "number": 1 } ], "wipeTable": true } ], "filesystems": [ { "device": "/dev/disk/by-partlabel/var", "format": "xfs", "label": "var", "path": "/var", "wipeFilesystem": true } ] }, "systemd": { "units": [ { "contents": "[Unit]\nBefore=local-fs.target\n[Mount]\nWhere=/var\nWhat=/dev/disk/by-partlabel/var\n[Install]\nWantedBy=local-fs.target\n", "enabled": true, "name": "var.mount" } ] } } However when ignition tries to create partition on the disk identified by "by-id" it times out, hitting timeout limit 1m 30s: 35.506140] ignition[1704]: Adding "root-ca" to list of CAs [ 35.511826] ignition[1704]: disks: createPartitions: op(1): [started] waiting for devices [/dev/disk/by-id/scsi-355cd2e41530ac5de] [ T[ 125.712806] systemd[1]: dev-disk-by\x2did-scsi\x2d355cd2e41530ac5de.device: Job dev-disk-by\x2did-scsi\x2d355cd2e41530ac5de.device/start timed out. IME ] Timed [ 125.727414] systemd[1]: Timed out waiting for device dev-disk-by\x2did-scsi\x2d355cd2e41530ac5de.device. out waiting for device dev-di…-scsi\x2d355cd2e41530ac5de.device. [ 125.746196] systemd[1]: dev-disk-by\x2did-scsi\x2d355cd2e41530ac5de.device: Job dev-disk-by\x2did-scsi\x2d355cd2e41530ac5de.device/start failed with result 'timeout'. 
[ 125.761201] systemd[1]: ignition-disks.service: Main process exited, code=exited, status=1/FAILURE [FAILED[ 125.770391] ignition[1704]: disks failedFull config: ] Failed to [ 125.776647] ignition[1704]: { start Ignition ([ 125.781001] ignition[1704]: "ignition": { disks). [ 125.786584] ignition[1704]: "config": { [ 125.791541] ignition[1704]: "merge": [ See 'systemctl s[ 125.795936] ignition[1704]: { tatus ignition-d[ 125.801023] ignition[1704]: "verification": {} isks.service' fo[ 125.807716] ignition[1704]: }, r details. [ 125.812847] ignition[1704]: { [DEPEND[ 125.817622] ignition[1704]: disks: createPartitions: op(1): [failed] waiting for devices [/dev/disk/by-id/scsi-355cd2e41530ac5de]: device unit dev-disk-by\x2did-scsi\x2d355ct ] Dependency[ 125.837161] systemd[1]: ignition-disks.service: Failed with result 'exit-code'. failed for Igni[ 125.845834] ignition[1704]: "source": "https://api-int.XXXXXXXX.redhat.com:22623/config/master", tion Complete. [ 125.857996] ignition[1704]: "verification": {} [ 125.864661] ignition[1704]: } [DEPEND ignition[1704]: ], [0m] Dependency [ 125.873513] ignition[1704]: "replace": { failed for Initr[ 125.879352] ignition[1704]: "verification": {} d Default Target[ 125.885889] ignition[1704]: } . [ 125.890765] ignition[1704]: }, [ 125.894426] ignition[1704]: "proxy": {}, [ 125.898797] ignition[1704]: "security": { [DEPEND ignition[1704]: "tls": { [0m] Dependency [ 125.908874] ignition[1704]: "certificateAuthorities": [ failed for Ignit[ 125.916193] ignition[1704]: { ion OSTree: Moun[ 125.921449] ignition[1704]: "verification": {} t (firstboot) /s[ 125.928302] ignition[1704]: } ysroot. [ 125.933534] ignition[1704]: ] [ 125.937970] ignition[1704]: } [ 125.941565] ignition[1704]: }, [ OK 125.945095] ignition[1704]: "timeouts": {}, 0m] Stopped targ[ 125.951036] ignition[1704]: "version": "3.3.0-experimental" et Timers. 
[ 125.958349] ignition[1704]: }, [ 125.962626] ignition[1704]: "passwd": { [ OK 125.966765] ignition[1704]: "users": [ 0m] Stopped Forw[ 125.972268] ignition[1704]: { ard Password Req[ 125.977244] ignition[1704]: Ignition failed: create partitions failed: failed to wait on disks devs: device unit dev-disk-by\x2did-scsi\x2d355cd2e41530ac5de.device tit uests to Clevis [ 125.993901] systemd[1]: Failed to start Ignition (disks). Directory Watch.[ 126.000658] ignition[1704]: "gecos": "CoreOS Admin",
There is an inconsistency between the coreos-installer environment and the dracut environment in how udev rules are handled for the disks.

1) Booting from PXE, specifying the scsi id for the disk (coreos.inst.install_dev=/dev/disk/by-id/scsi-355cd2e41530aad4c) and using the custom Ignition file generated by butane, control-dh-001.ign (content shared in comment 18):

kernel http://example.host.com/openshift/ocp-XXX/rhcos-4.8.2-x86_64-live-kernel-x86_64 coreos.live.rootfs_url=http://example.host.com/openshift/ocp-XXX/rhcos-4.8.2-x86_64-live-rootfs.x86_64.img coreos.inst.install_dev=/dev/disk/by-id/scsi-355cd2e41530aad4c coreos.inst.ignition_url=http://example.host.com/openshift/ocp-XXX/control-dh-001.ign ip=bond0:dhcp bond=bond0:eno1,eno2:mode=802.3ad,miimon=100,lacp_rate=fast rd.neednet=1 nameserver=10.11.5.19 nameserver=10.5.30.160
initrd http://example.host.com/openshift/ocp-XXX/rhcos-4.8.2-x86_64-live-initramfs.x86_64.img
boot

The installation passes, but when the system boots for the first time after the installation, the disks are not available inside the ramdisk under ls -la /dev/disk/by-id/scsi-* ...
Udev rules for scsi disks inside the ramdisk are defined (line 12):

1 :/# grep -e "sd\*" /lib/udev/rules.d/*.rules
2 /lib/udev/rules.d/40-redhat.rules:KERNEL=="sd*", SUBSYSTEMS=="ccw", DRIVERS=="zfcp", ENV{.ID_ZFCP_BUS}="1"
3 /lib/udev/rules.d/40-redhat.rules:KERNEL=="sd*[!0-9]", SUBSYSTEMS=="scsi", ENV{.ID_ZFCP_BUS}=="1", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-path/ccw-$attr{hba_id}-zfcp-$attr{wwpn}:$attr {"
4 /lib/udev/rules.d/40-redhat.rules:KERNEL=="sd*[0-9]", SUBSYSTEMS=="scsi", ENV{.ID_ZFCP_BUS}=="1", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-path/ccw-$attr{hba_id}-zfcp-$attr{wwpn}:$ a"
5 /lib/udev/rules.d/60-block.rules:ACTION!="remove", SUBSYSTEM=="block", KERNEL=="loop*|nvme*|sd*|vd*|xvd*|pmem*|mmcblk*|dasd*", OPTIONS+="watch"
6 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|scm*|pmem*|nbd*", GOTO="persistent_s t"
7 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="scsi", ATTRS{vendor}=="ATA", IMPORT{program}="ata_id --export $devnode"
8 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="scsi", ATTRS{type}=="5", ATTRS{scsi_level}=="[6-9]*", IMPORT{program}="ata_id -"
9 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", ATTR{removable}=="0", SUBSYSTEMS=="usb", IMPORT{program}="ata_id --export $devnode"
10 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", SUBSYSTEMS=="usb", IMPORT{builtin}="usb_id"
11 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $devnode", ENV{ID_BUS}="scsi"
12 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*|sr*|cciss*", ENV{DEVTYPE}=="disk", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}"
13 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*|cciss*", ENV{DEVTYPE}=="partition", ENV{ID_SERIAL}=="?*", SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}-part%n"
14 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[!0-9]|sr*", ATTRS{ieee1394_id}=="?*", SYMLINK+="disk/by-id/ieee1394-$attr{ieee1394_id}"
15 /lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="sd*[0-9]", ATTRS{ieee1394_id}=="?*", SYMLINK+="disk/by-id/ieee1394-$attr{ieee1394_id}-part%n"
16 /lib/udev/rules.d/61-scsi-sg3_id.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SCSI_INQUIRY}!="?*", IMPORT{program}="/usr/bin/sg_inq --export --inhex=/sys/block/$kernel/device/inquiry --raw", E"
17 /lib/udev/rules.d/61-scsi-sg3_id.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SCSI}!="1", IMPORT{program}="/usr/bin/sg_inq --export $tempnode", ENV{ID_SCSI}="1"
18 /lib/udev/rules.d/61-scsi-sg3_id.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SCSI}=="1", ENV{ID_SCSI_INQUIRY}=="1", IMPORT{program}="/usr/bin/sg_inq --export --inhex=/sys/block/$kernel/devic e"
19 /lib/udev/rules.d/61-scsi-sg3_id.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SCSI}=="1", ENV{ID_SCSI_INQUIRY}!="1", IMPORT{program}="/usr/bin/sg_inq --export --page=sn $tempnode"
20 /lib/udev/rules.d/61-scsi-sg3_id.rules:KERNEL=="sd*[!0-9]", ENV{ID_SCSI}=="1", ENV{ID_SCSI_INQUIRY}=="1", IMPORT{program}="/usr/bin/sg_inq --export --inhex=/sys/block/$kernel/device/vp d"
21 /lib/udev/rules.d/61-scsi-sg3_id.rules:KERNEL=="sd*[!0-9]|sr*", ENV{ID_SCSI}=="1", ENV{ID_SCSI_INQUIRY}!="1", IMPORT{program}="/usr/bin/sg_inq --export --page=di $tempnode"
22 /lib/udev/rules.d/62-multipath.rules:KERNEL!="sd*|dasd*|nvme*", GOTO="end_mpath"
23 :/#

The rule for scsi disks is probably not getting applied because those are ata disks ....
The ATA rules come before the generic scsi ones, so they run first and populate ENV{ID_SERIAL} with what ata_id returned (they also set ID_BUS to "ata" instead of "scsi"). Then the first rule for scsi runs:

KERNEL=="sd*[!0-9]|sr*", ENV{ID_SERIAL}!="?*", IMPORT{program}="scsi_id --export --whitelisted -d $devnode", ENV{ID_BUS}="scsi"

ENV{ID_SERIAL}!="?*" no longer matches, because ID_SERIAL is not empty any more.

2) Observing the disk ids inside the ramdisk, I can see this:

:/# ls -la /dev/disk/by-id/
ata-SSDSC2KG480G8R_PHYG040105X3480BGN
ata-SSDSC2KG480G8R_PHYG040105X3480BGN-part1
ata-SSDSC2KG480G8R_PHYG040105X3480BGN-part2
ata-SSDSC2KG480G8R_PHYG040105X3480BGN-part3
ata-SSDSC2KG480G8R_PHYG040105X3480BGN-part4
ata-SSDSC2KG480G8R_PHYG040200XE480BGN
ata-SSDSC2KG480G8R_PHYG040200XE480BGN-part1
wwn-0x55cd2e41530aad4c
wwn-0x55cd2e41530aad4c-part1
wwn-0x55cd2e41530aad4c-part2
wwn-0x55cd2e41530aad4c-part3
wwn-0x55cd2e41530aad4c-part4
wwn-0x55cd2e41530ac5de
wwn-0x55cd2e41530ac5de-part1

3) After removing all the partitions on the /dev/sda and /dev/sdb disks inside the ramdisk using cfdisk (to make sure I'm starting from scratch and that the installer formats the disk when sdX is not specified), and then PXE booting with coreos.inst.install_dev=/dev/disk/by-id/wwn-0x55cd2e41530aad4c, grub cannot boot after the PXE installation:

Booting from Hard drive C:
..
error: ../../grub-core/kern/disk.c:258:no such partition.
Entering rescue mode...
grub rescue>
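To illustrate the rule-ordering effect described under 1), here is a minimal simulation. It is only a sketch and not how udev is implemented: the function names and the dictionary-based device model are hypothetical, only rules 7, 11 and 12 from the grep output are modelled, and the value that scsi_id would have returned is an assumption.

```python
# Simplified model of the rule ordering in 60-persistent-storage.rules.
# Illustration only: real udev matching is far more involved.

def ata_rule(dev):
    # Rule 7: import ata_id output when ID_SERIAL is empty and vendor is "ATA".
    if not dev.get("ID_SERIAL") and dev.get("vendor") == "ATA":
        dev["ID_SERIAL"] = dev["ata_serial"]
        dev["ID_BUS"] = "ata"

def scsi_rule(dev):
    # Rule 11: import scsi_id output, but only when ID_SERIAL is still empty.
    if not dev.get("ID_SERIAL"):
        dev["ID_SERIAL"] = dev["scsi_serial"]
        dev["ID_BUS"] = "scsi"

def symlink_rule(dev):
    # Rule 12: SYMLINK+="disk/by-id/$env{ID_BUS}-$env{ID_SERIAL}"
    if dev.get("ID_SERIAL"):
        dev.setdefault("links", []).append(
            "disk/by-id/{ID_BUS}-{ID_SERIAL}".format(**dev))

def apply_rules(dev):
    # Rules run in file order, so the ATA rule gets the first chance.
    for rule in (ata_rule, scsi_rule, symlink_rule):
        rule(dev)
    return dev.get("links", [])

sata_disk = {
    "vendor": "ATA",
    "ata_serial": "SSDSC2KG480G8R_PHYG040105X3480BGN",
    "scsi_serial": "355cd2e41530aad4c",  # assumed scsi_id result
}
print(apply_rules(sata_disk))
```

In this model the SATA disk only ever gets an ata-* link, because the scsi_id rule is skipped once ID_SERIAL has been filled in.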
Team, any update? As stated above, if the suggestion is to use the scsi id (to circumvent the sdX first-come-first-served problem), it looks like it can only help with scsi disks. How are ATA disks to be handled? Thx, Anand
To make progress here as we do not have access to your hardware, we need the output from `ls -la /dev/disk/by-id/` and `fdisk -l` and `blkid` from each environment (liveiso, initramfs) to be able to compare and understand where the issue is.
Please also try to make sure to remove all RHCOS EFI boot entries from your firmware and wipe the beginning of both disks before the installation.
The output of `udevadm info <device-path>` from each environment would also be helpful. As an aside, have you checked whether /dev/disk/by-path contains any useful symlinks? That might allow you to avoid per-machine Ignition configs.
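If it helps with the comparison, the listings from the two environments can be diffed mechanically. The following is a minimal sketch, assuming the output of `ls -la /dev/disk/by-id/` from each environment has been saved as text; the helper names and the sample strings are illustrative, not taken from a real machine.

```python
def parse_links(ls_output):
    """Return the set of symlink names found in `ls -la` style output."""
    names = set()
    for line in ls_output.splitlines():
        if " -> " in line:
            left = line.split(" -> ")[0]
            names.add(left.split()[-1])  # last field before the arrow
    return names

def compare(env_a, env_b):
    """Return (only in env_a, only in env_b) as sorted lists."""
    a, b = parse_links(env_a), parse_links(env_b)
    return sorted(a - b), sorted(b - a)

# Illustrative sample input only.
live = """lrwxrwxrwx 1 root root 9 Sep 7 12:16 wwn-0x1 -> ../../sda
lrwxrwxrwx 1 root root 9 Sep 7 12:16 scsi-31 -> ../../sda"""
initramfs = """lrwxrwxrwx 1 root root 9 Sep 7 12:16 wwn-0x1 -> ../../sda"""

only_live, only_initramfs = compare(live, initramfs)
print("only in live environment:", only_live)
print("only in initramfs:", only_initramfs)
```

Any name that shows up in only one environment is a symlink that is unsafe to reference from both the installer kernel arguments and the Ignition config.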
Indeed, when i've used /dev/disk/by-id/wwn-0x55cd2e41530ac5de as install dev (sda device at that time) grub got installed successfully - grub menu appeared and ignition file got applied. Then i've rebooted and edited CoreOS entry and booted with rd.break for the initramfs: switch_root:/# ls -la /dev/disk/by-id/ total 0 drwxr-xr-x 2 root root 320 Sep 7 12:16 . drwxr-xr-x 8 root root 160 Sep 7 12:16 .. lrwxrwxrwx 1 root root 9 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040105X3480BGN -> ../../sdb lrwxrwxrwx 1 root root 10 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040105X3480BGN-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 9 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040200XE480BGN -> ../../sda lrwxrwxrwx 1 root root 10 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040200XE480BGN-part1 -> ../../sda1 lrwxrwxrwx 1 root root 10 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040200XE480BGN-part2 -> ../../sda2 lrwxrwxrwx 1 root root 10 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040200XE480BGN-part3 -> ../../sda3 lrwxrwxrwx 1 root root 10 Sep 7 12:16 ata-SSDSC2KG480G8R_PHYG040200XE480BGN-part4 -> ../../sda4 lrwxrwxrwx 1 root root 9 Sep 7 12:16 wwn-0x55cd2e41530aad4c -> ../../sdb lrwxrwxrwx 1 root root 10 Sep 7 12:16 wwn-0x55cd2e41530aad4c-part1 -> ../../sdb1 lrwxrwxrwx 1 root root 9 Sep 7 12:16 wwn-0x55cd2e41530ac5de -> ../../sda lrwxrwxrwx 1 root root 10 Sep 7 12:16 wwn-0x55cd2e41530ac5de-part1 -> ../../sda1 lrwxrwxrwx 1 root root 10 Sep 7 12:16 wwn-0x55cd2e41530ac5de-part2 -> ../../sda2 lrwxrwxrwx 1 root root 10 Sep 7 12:16 wwn-0x55cd2e41530ac5de-part3 -> ../../sda3 lrwxrwxrwx 1 root root 10 Sep 7 12:16 wwn-0x55cd2e41530ac5de-part4 -> ../../sda4 switch_root:/# fdisk -l sh: fdisk: command not found switch_root:/# udevadm info /dev/disk/by-id/wwn-0x55cd2e41530aad4c P: /devices/pci0000:17/0000:17:00.0/0000:18:00.0/host0/port-0:1/end_device-0:1/target0:0:1/0:0:1:0/block/sdb N: sdb S: disk/by-id/ata-SSDSC2KG480G8R_PHYG040105X3480BGN S: disk/by-id/wwn-0x55cd2e41530aad4c S: disk/by-path/pci-0000:18:00.0-sas-phy3-lun-0 E: 
DEVLINKS=/dev/disk/by-id/ata-SSDSC2KG480G8R_PHYG040105X3480BGN /dev/disk/by-path/pci-0000:18:00.0-sas-phy3-lun-0 /dev/disk/by-id/wwn-0x55cd2e41530aad4c E: DEVNAME=/dev/sdb E: DEVPATH=/devices/pci0000:17/0000:17:00.0/0000:18:00.0/host0/port-0:1/end_device-0:1/target0:0:1/0:0:1:0/block/sdb E: DEVTYPE=disk E: ID_ATA=1 E: ID_ATA_DOWNLOAD_MICROCODE=1 E: ID_ATA_FEATURE_SET_PM=1 E: ID_ATA_FEATURE_SET_PM_ENABLED=1 E: ID_ATA_FEATURE_SET_SMART=1 E: ID_ATA_FEATURE_SET_SMART_ENABLED=1 E: ID_ATA_ROTATION_RATE_RPM=0 E: ID_ATA_SATA=1 E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1 E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1 E: ID_ATA_WRITE_CACHE=1 E: ID_ATA_WRITE_CACHE_ENABLED=1 E: ID_BUS=ata E: ID_MODEL=SSDSC2KG480G8R E: ID_MODEL_ENC=SSDSC2KG480G8R\x20\x20 E: ID_PART_TABLE_TYPE=gpt E: ID_PART_TABLE_UUID=18b94a42-548e-4099-8c4d-b8372d4f7082 E: ID_PATH=pci-0000:18:00.0-sas-phy3-lun-0 E: ID_PATH_TAG=pci-0000_18_00_0-sas-phy3-lun-0 E: ID_REVISION=DL69 E: ID_SCSI=1 E: ID_SCSI_INQUIRY=1 E: ID_SERIAL=SSDSC2KG480G8R_PHYG040105X3480BGN E: ID_SERIAL_SHORT=PHYG040105X3480BGN E: ID_TYPE=disk E: ID_VENDOR=ATA E: ID_VENDOR_ENC=ATA\x20\x20\x20\x20\x20 E: ID_WWN=0x55cd2e41530aad4c E: ID_WWN_WITH_EXTENSION=0x55cd2e41530aad4c E: MAJOR=8 E: MINOR=16 E: SCSI_IDENT_LUN_NAA_REG=55cd2e41530aad4c E: SCSI_IDENT_SERIAL=PHYG040105X3480BGN E: SCSI_MODEL=SSDSC2KG480G8R E: SCSI_MODEL_ENC=SSDSC2KG480G8R\x20\x20 E: SCSI_REVISION=DL69 E: SCSI_TPGS=0 E: SCSI_TYPE=disk E: SCSI_VENDOR=ATA E: SCSI_VENDOR_ENC=ATA\x20\x20\x20\x20\x20 E: SUBSYSTEM=block E: TAGS=:systemd: E: USEC_INITIALIZED=13208214 switch_root:/# switch_root:/# udevadm info /dev/disk/by-id/wwn-0x55cd2e41530ac5de P: /devices/pci0000:17/0000:17:00.0/0000:18:00.0/host0/port-0:0/end_device-0:0/target0:0:0/0:0:0:0/block/sda N: sda S: disk/by-id/ata-SSDSC2KG480G8R_PHYG040200XE480BGN S: disk/by-id/wwn-0x55cd2e41530ac5de S: disk/by-path/pci-0000:18:00.0-sas-phy7-lun-0 E: DEVLINKS=/dev/disk/by-id/ata-SSDSC2KG480G8R_PHYG040200XE480BGN /dev/disk/by-path/pci-0000:18:00.0-sas-phy7-lun-0 
/dev/disk/by-id/wwn-0x55cd2e41530ac5de E: DEVNAME=/dev/sda E: DEVPATH=/devices/pci0000:17/0000:17:00.0/0000:18:00.0/host0/port-0:0/end_device-0:0/target0:0:0/0:0:0:0/block/sda E: DEVTYPE=disk E: ID_ATA=1 E: ID_ATA_DOWNLOAD_MICROCODE=1 E: ID_ATA_FEATURE_SET_PM=1 E: ID_ATA_FEATURE_SET_PM_ENABLED=1 E: ID_ATA_FEATURE_SET_SMART=1 E: ID_ATA_FEATURE_SET_SMART_ENABLED=1 E: ID_ATA_ROTATION_RATE_RPM=0 E: ID_ATA_SATA=1 E: ID_ATA_SATA_SIGNAL_RATE_GEN1=1 E: ID_ATA_SATA_SIGNAL_RATE_GEN2=1 E: ID_ATA_WRITE_CACHE=1 E: ID_ATA_WRITE_CACHE_ENABLED=1 E: ID_BUS=ata E: ID_MODEL=SSDSC2KG480G8R E: ID_MODEL_ENC=SSDSC2KG480G8R\x20\x20 E: ID_PART_TABLE_TYPE=gpt E: ID_PART_TABLE_UUID=7f5c6179-32ea-4cd1-bad1-1b4e5cebbcc8 E: ID_PATH=pci-0000:18:00.0-sas-phy7-lun-0 E: ID_PATH_TAG=pci-0000_18_00_0-sas-phy7-lun-0 E: ID_REVISION=DL69 E: ID_SCSI=1 E: ID_SCSI_INQUIRY=1 E: ID_SERIAL=SSDSC2KG480G8R_PHYG040200XE480BGN E: ID_SERIAL_SHORT=PHYG040200XE480BGN E: ID_TYPE=disk E: ID_VENDOR=ATA E: ID_VENDOR_ENC=ATA\x20\x20\x20\x20\x20 E: ID_WWN=0x55cd2e41530ac5de E: ID_WWN_WITH_EXTENSION=0x55cd2e41530ac5de E: MAJOR=8 E: MINOR=0 E: SCSI_IDENT_LUN_NAA_REG=55cd2e41530ac5de E: SCSI_IDENT_SERIAL=PHYG040200XE480BGN E: SCSI_MODEL=SSDSC2KG480G8R E: SCSI_MODEL_ENC=SSDSC2KG480G8R\x20\x20 E: SCSI_REVISION=DL69 E: SCSI_TPGS=0 E: SCSI_TYPE=disk E: SCSI_VENDOR=ATA E: SCSI_VENDOR_ENC=ATA\x20\x20\x20\x20\x20 E: SUBSYSTEM=block E: TAGS=:systemd: E: USEC_INITIALIZED=13209795 switch_root:/#
Part of the console log from the system, where install dev with disk-by-id/wwn was symlinked to sdb device: [ OK ] Reached target Network is Online. Starting CoreOS Installer... [ 189.237972] coreos-installer-[ 189.292421] sdb: service[2333]: coreos-installer [ 189.344006] sdb: install /dev/disk/by-id/wwn-0x55cd2e41530ae1f2 --ignition-url http://example.host.com/openshift/ocp-datahub/control-dh-003.ign --insecure-ignition --firstboo0 [ 189.715348] coreos-installer-service[2333]: Installing Red Hat Enterprise Linux CoreOS 48.84.202107202156-0 (Ootpa) x86_64 (512-byte sectors) [ 190.321015] coreos-installer-service[2333]: Read disk 182.1 MiB/3.7 GiB (4%) [ 191.320965] coreos-installer-service[2333]: Read disk 365.6 MiB/3.7 GiB (9%) [ 192.320749] coreos-installer-service[2333]: Read disk 557.0 MiB/3.7 GiB (14%) [ 193.321022] coreos-installer-service[2333]: Read disk 768.6 MiB/3.7 GiB (20%) [ 194.321182] coreos-installer-service[2333]: Read disk 980.9 MiB/3.7 GiB (26%) [ 195.321161] coreos-installer-service[2333]: Read disk 1.2 GiB/3.7 GiB (31%) [ 196.321111] coreos-installer-service[2333]: Read disk 1.3 GiB/3.7 GiB (36%) [ 197.321103] coreos-installer-service[2333]: Read disk 1.5 GiB/3.7 GiB (40%) [ 198.321228] coreos-installer-service[2333]: Read disk 1.6 GiB/3.7 GiB (43%) [ 199.321696] coreos-installer-service[2333]: Read disk 1.8 GiB/3.7 GiB (47%) [ 200.321656] coreos-installer-service[2333]: Read disk 1.9 GiB/3.7 GiB (51%) [ 201.321926] coreos-installer-service[2333]: Read disk 2.0 GiB/3.7 GiB (54%) [ 202.322248] coreos-installer-service[2333]: Read disk 2.1 GiB/3.7 GiB (58%) [ 203.322150] coreos-installer-service[2333]: Read disk 2.3 GiB/3.7 GiB (61%) [ 204.322330] coreos-installer-service[2333]: Read disk 2.4 GiB/3.7 GiB (65%) [ 205.322376] coreos-installer-service[2333]: Read disk 2.5 GiB/3.7 GiB (67%) [ 206.323324] coreos-installer-service[2333]: Read disk 2.6 GiB/3.7 GiB (70%) [ 207.323529] coreos-installer-service[2333]: Read disk 2.7 GiB/3.7 GiB (73%) [ 
208.324576] coreos-installer-service[2333]: Read disk 2.8 GiB/3.7 GiB (75%) [ 209.326337] coreos-installer-service[2333]: Read disk 2.9 GiB/3.7 GiB (78%) [ 210.327056] coreos-installer-service[2333]: Read disk 3.0 GiB/3.7 GiB (82%) [ 211.327145] coreos-installer-service[2333]: Read disk 3.1 GiB/3.7 GiB (83%) [ 212.327114] coreos-installer-service[2333]: Read disk 3.2 GiB/3.7 GiB (87%) [ 213.327038] coreos-installer-service[2333]: Read disk 3.3 GiB/3.7 GiB (90%) [ 214.327415] coreos-installer-service[2333]: Read disk 3.4 GiB/3.7 GiB (92%) [ 215.327601] coreos-installer-service[2333]: Read disk 3.5 GiB/3.7 GiB (94%) [ 216.327575] coreos-installer-service[2333]: Read disk 3.5 GiB/3.7 GiB (95%) [ 217.327712] coreos-installer-service[2333]: Read disk 3.6 GiB/3.7 GiB (97%) [ 218.327614] coreos-installer-service[2333]: Read disk 3.6 GiB/3.7 GiB (98%) [ 218.494214] coreos-installer-service[2333]: Read disk 3.7 GiB/3.7 GiB (100%) [ 218.579250] coreos-installer-service[2333]: Read disk 3.7 GiB/3.7 GiB (100%) [ 218.928490] GPT:Primary header thinks Alt. header is not at the end of the disk. [ 219.017685] GPT:7718911 != 937703087 [ 219.061119] GPT:Alternate GPT header not at the end of the disk. [ 219.133671] GPT:7718911 != 937703087 [ 219.177023] GPT: Use GNU Parted to correct GPT errors. [ 219.239106] sdb: sdb1 sdb2 sdb3 sdb4 [ 219.537840] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null) [ 219.632627] coreos-installer-[ 219.662263] GPT:Primary header thinks Alt. header is not at the end of the disk. service[2333]: W[ 219.755209] GPT:7718911 != 937703087 riting Ignition [ 219.814574] GPT:Alternate GPT header not at the end of the disk. config [ 219.903073] GPT:7718911 != 937703087 [ 219.954131] GPT: Use GNU Parted to correct GPT errors. [ 219.954140] sdb: sdb1 sdb2 sdb3 sdb4 [ OK ] Started CoreOS Installer. [ 219.953713] coreos-installer-service[2333]: Writing first-boot kernel arguments [ OK ] Reached target CoreOS Installer Target. 
[ 220.198271] coreos-installer-service[2333]: Install complete.
[ OK ] Started Reboot after CoreOS Installer.
[ OK ] Reached target Finalize CoreOS Installer Target.
[ OK ] Stopped target Network is Online.
[ OK ] Stopped target Finalize CoreOS Installer Target.

The system is unable to boot because GRUB is pointing to the hd0,gpt3 partition, which is what sda would be:

Booting from Hard drive C:
..
error: ../../grub-core/kern/disk.c:258:no such partition.
Entering rescue mode...

Indeed, there is no such partition in the ls output:

grub rescue> ls
(hd0) (hd1) (hd1,gpt4) (hd1,gpt3) (hd1,gpt2) (hd1,gpt1)

The correct partition is (hd1,gpt3), i.e. sdb, the disk used by coreos-installer above:

grub rescue> ls (hd1,gpt3)
(hd1,gpt3): Filesystem is ext2.

Grub is trying to boot from (hd0,gpt3), but there is no such partition according to the ls output above:

grub rescue> set
prefix=(hd0,gpt3)/grub2
root=hd0,gpt3
grub rescue>

Grub was able to boot from the disk after these steps:

grub rescue> set prefix=(hd1,gpt3)/grub2
grub rescue> set root=hd1,gpt3
grub rescue> insmod normal
grub rescue> normal

The grub menu appeared, the system booted, and the Ignition config got applied ...
This bug now contains descriptions of three different problems (in comment 0, comment 18, and comment 27). In general, please report separate problems as separate BZs to ease tracking. In this case, let's continue tracking the second problem here; please file a separate BZ for the most recent problem of GRUB trying to boot from the wrong disk.
OK, a lot going on here. So to summarize:

1. when using /dev/disk/by-id/scsi-* symlinks in the Ignition config, boot of the installed system fails at the Ignition stage because the symlink is not present
2. when using the /dev/disk/by-id/wwn-* symlink to install to /dev/sdb, boot of the installed system fails at the GRUB stage

For 1, I think likely we're just missing some udev rules to create the symlink in the initramfs. And... looking at the diff of /usr/lib/udev/rules.d now between the initramfs and the real root, looks like it's 63-scsi-sg3_symlink.rules missing. It contains:

    # 2: IEEE Registered
    ENV{SCSI_IDENT_LUN_NAA_REG}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_REG}"
    ENV{SCSI_IDENT_LUN_NAA_REG}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_REG}-part%n"

which I think is what creates the symlink we want here. It's not in the FCOS' initramfs either. Will look at adding it.

For 2, is it possible that the problem there is that GRUB is installed on both disks? When iterating on these tests, apart from the GPT partitions, did you also wipe the disks' MBRs?

Assuming that's the issue, I think you should be able to use the wwn-* links as they do exist in both the real root and the initramfs. But still, I'll look at adding the missing udev rules to FCOS and RHCOS.

As Benjamin mentioned, please open a separate BZ for any additional issues.
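As a rough sketch of what the two quoted rules do, the snippet below derives the scsi-3* name from the SCSI_IDENT_LUN_NAA_REG property reported by udevadm info earlier in this bug. The `expected_scsi3_link` helper is hypothetical and only mirrors the "IEEE Registered" rules quoted above, not the rest of 63-scsi-sg3_symlink.rules.

```python
def expected_scsi3_link(props, partition=None):
    """Mirror of the quoted rules: scsi-3$env{SCSI_IDENT_LUN_NAA_REG}[-part%n]."""
    naa = props.get("SCSI_IDENT_LUN_NAA_REG")
    if not naa:
        return None  # the rules only match when the property is non-empty
    link = "/dev/disk/by-id/scsi-3" + naa
    if partition is not None:
        link += "-part{}".format(partition)
    return link

# Property value taken from the udevadm info output for sdb in this bug.
props = {"DEVTYPE": "disk", "SCSI_IDENT_LUN_NAA_REG": "55cd2e41530aad4c"}
print(expected_scsi3_link(props))
print(expected_scsi3_link(props, partition=1))
```

For sdb this yields /dev/disk/by-id/scsi-355cd2e41530aad4c, which is the name used as coreos.inst.install_dev in the PXE configuration.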
new PRs https://github.com/coreos/fedora-coreos-config/pull/1299 -> https://github.com/openshift/os/pull/658
The PR with the fix was merged on 21/10/20. The fix is already available in the latest 4.10 version.
The fix for this bug will not be delivered to customers until it lands in an updated bootimage. That process is tracked in bug 2027501, which has status ASSIGNED. Moving this bug back to POST.
This bug has been reported fixed in a new RHCOS build and is ready for QE verification. To mark the bug verified, set the Verified field to Tested. This bug will automatically move to MODIFIED once the fix has landed in a new bootimage.
To validate it using qemu, you need to create the qemu disk as scsi. Here is the info from the tests I did: qemu-kvm -m 2048M -accel kvm -smp cores=4 -fw_cfg name=opt/com.coreos/config,file=/root/cosa.ign -drive file=/tmp/rhcos.qcow2 -net nic,model=virtio -net user,hostfwd=tcp::2222-:22 -nographic -device virtio-scsi-pci,id=scsi -drive file=cosa_disk,if=none,id=hd2 -device scsi-hd,drive=hd2 [core@localhost ~]$ lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 5G 0 disk `-sda1 8:1 0 5G 0 part /var sdb 8:16 0 16G 0 disk |-sdb1 8:17 0 1M 0 part |-sdb2 8:18 0 127M 0 part |-sdb3 8:19 0 384M 0 part /boot `-sdb4 8:20 0 15.5G 0 part /sysroot [core@localhost ~]$ ls -la /dev/disk/by-id/ total 0 drwxr-xr-x. 2 root root 480 Oct 19 01:13 . drwxr-xr-x. 8 root root 160 Oct 19 01:13 .. lrwxrwxrwx. 1 root root 9 Oct 19 01:15 ata-QEMU_HARDDISK_QM00001 -> ../../sdb lrwxrwxrwx. 1 root root 10 Oct 19 01:15 ata-QEMU_HARDDISK_QM00001-part1 -> ../../sdb1 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 ata-QEMU_HARDDISK_QM00001-part2 -> ../../sdb2 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 ata-QEMU_HARDDISK_QM00001-part3 -> ../../sdb3 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 ata-QEMU_HARDDISK_QM00001-part4 -> ../../sdb4 lrwxrwxrwx. 1 root root 9 Oct 19 01:15 scsi-0ATA_QEMU_HARDDISK_QM00001 -> ../../sdb lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-0ATA_QEMU_HARDDISK_QM00001-part1 -> ../../sdb1 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-0ATA_QEMU_HARDDISK_QM00001-part2 -> ../../sdb2 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-0ATA_QEMU_HARDDISK_QM00001-part3 -> ../../sdb3 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-0ATA_QEMU_HARDDISK_QM00001-part4 -> ../../sdb4 lrwxrwxrwx. 1 root root 9 Oct 19 01:15 scsi-0QEMU_QEMU_HARDDISK_hd2 -> ../../sda lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-0QEMU_QEMU_HARDDISK_hd2-part1 -> ../../sda1 lrwxrwxrwx. 1 root root 9 Oct 19 01:15 scsi-1ATA_QEMU_HARDDISK_QM00001 -> ../../sdb lrwxrwxrwx. 
1 root root 10 Oct 19 01:15 scsi-1ATA_QEMU_HARDDISK_QM00001-part1 -> ../../sdb1 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-1ATA_QEMU_HARDDISK_QM00001-part2 -> ../../sdb2 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-1ATA_QEMU_HARDDISK_QM00001-part3 -> ../../sdb3 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-1ATA_QEMU_HARDDISK_QM00001-part4 -> ../../sdb4 lrwxrwxrwx. 1 root root 9 Oct 19 01:15 scsi-SATA_QEMU_HARDDISK_QM00001 -> ../../sdb lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-SATA_QEMU_HARDDISK_QM00001-part1 -> ../../sdb1 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-SATA_QEMU_HARDDISK_QM00001-part2 -> ../../sdb2 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-SATA_QEMU_HARDDISK_QM00001-part3 -> ../../sdb3 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 scsi-SATA_QEMU_HARDDISK_QM00001-part4 -> ../../sdb4 [core@localhost ~]$ ls -la /dev/disk/by-partlabel/ total 0 drwxr-xr-x. 2 root root 140 Oct 19 01:13 . drwxr-xr-x. 8 root root 160 Oct 19 01:13 .. lrwxrwxrwx. 1 root root 10 Oct 19 01:15 BIOS-BOOT -> ../../sdb1 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 EFI-SYSTEM -> ../../sdb2 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 boot -> ../../sdb3 lrwxrwxrwx. 1 root root 10 Oct 19 01:15 root -> ../../sdb4 lrwxrwxrwx. 
1 root root 10 Oct 19 01:15 var -> ../../sda1 core@localhost ~]$ grep -e "scsi*" /lib/udev/rules.d/*.rules | grep 63 /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_SERIAL}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_SERIAL}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+=" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REGEXT}=="?*", ENV{DEVTYPE}=="disk", SYMLIN" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REGEXT}=="?*", ENV{DEVTYPE}=="partition", S" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REG}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+=" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REG}=="?*", ENV{DEVTYPE}=="partition", SYML" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_EXT}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+=" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_EXT}=="?*", ENV{DEVTYPE}=="partition", SYML" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_EUI64}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="d" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_EUI64}=="?*", ENV{DEVTYPE}=="partition", SYMLIN" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAME}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="di" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAME}=="?*", ENV{DEVTYPE}=="partition", SYMLINK" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_T10}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="dis" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_T10}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_LOCAL}=="?*", ENV{DEVTYPE}=="disk", SYMLINK" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_LOCAL}=="?*", ENV{DEVTYPE}=="partition", SY" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_VENDOR}=="?*", 
ENV{DEVTYPE}=="disk", SYMLINK+="" /lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_VENDOR}=="?*", ENV{DEVTYPE}=="partition", SYMLI" /lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781a-1e82-4818-a1c3-63d806ec15bb}", ENV{fabri" /lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781b-1e82-4818-a1c3-63d806ec15bb}", ENV{fabri" /lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781c-1e82-4818-a1c3-63d806ec15bb}", ENV{fabri" /lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781d-1e82-4818-a1c3-63d806ec15bb}", ENV{fabri"
Test with RHCOS 410.84.202112012203, result is passed Test with latest RHCOS 4.8(not fixed), can reproduce the issue. Steps: 1) Prepare test.ign $ cat test.ign { "ignition": { "version": "3.2.0" }, "storage": { "disks": [ { "device": "/dev/disk/by-id/scsi-0ATA_QEMU_HARDDISK_QM00002", "partitions": [ { "label": "var", "number": 1 } ], "wipeTable": true } ], "filesystems": [ { "device": "/dev/disk/by-partlabel/var", "format": "xfs", "label": "var", "path": "/var", "wipeFilesystem": true } ] }, "systemd": { "units": [ { "contents": "[Unit]\nBefore=local-fs.target\n[Mount]\nWhere=/var\nWhat=/dev/disk/by-partlabel/var\n[Install]\nWantedBy=local-fs.target\n", "enabled": true, "name": "var.mount" }, { "dropins": [ { "contents": "[Service]\n# Override Execstart in main unit\nExecStart=\n# Add new Execstart with `-` prefix to ignore failure`\nExecStart=-/usr/sbin/agetty --autologin core --noclear %I $TERM\n", "name": "autologin-core.conf" } ], "name": "serial-getty" } ] } } 2) Start VM with rhcos qcow2 image and another disk qemu-kvm -m 2048M -accel kvm -smp 4 -fw_cfg name=opt/com.coreos/config,file=./test.ign -net nic,model=virtio -net user,hostfwd=tcp::2222-:22 -nographic -drive file=./rhcos.qcow2 -device virtio-scsi-pci,id=scsi0 -drive file=cosa_disk -device virtio-scsi-pci,id=scsi1 3) Check VM can boot up and ignition apply successfully ============================================= - Test with latest rhcos-48.84.202112022303-0, ignition apply failed as expected Dec 03 13:12:33 ignition[663]: disks: createPartitions: op(1): [started] waiting for devices [/dev/disk/by-id/scsi-0ATA_QEMU_HARDDISK_QM00002] Dec 03 13:14:03 systemd[1]: ignition-disks.service: Main process exited, code=exited, status=1/FAILURE Dec 03 13:14:03 systemd[1]: ignition-disks.service: Failed with result 'exit-code'. Dec 03 13:14:03 systemd[1]: Failed to start Ignition (disks). :/# ls /dev/disk/by-id/ -al total 0 drwxr-xr-x 2 root root 200 Dec 3 13:12 . drwxr-xr-x 8 root root 160 Dec 3 13:12 .. 
lrwxrwxrwx 1 root root 9 Dec 3 13:12 ata-QEMU_DVD-ROM_QM00003 -> ../../sr0 lrwxrwxrwx 1 root root 9 Dec 3 13:12 ata-QEMU_HARDDISK_QM00001 -> ../../sda lrwxrwxrwx 1 root root 10 Dec 3 13:12 ata-QEMU_HARDDISK_QM00001-part1 -> ../../sda1 lrwxrwxrwx 1 root root 10 Dec 3 13:12 ata-QEMU_HARDDISK_QM00001-part2 -> ../../sda2 lrwxrwxrwx 1 root root 10 Dec 3 13:12 ata-QEMU_HARDDISK_QM00001-part3 -> ../../sda3 lrwxrwxrwx 1 root root 10 Dec 3 13:12 ata-QEMU_HARDDISK_QM00001-part4 -> ../../sda4 lrwxrwxrwx 1 root root 9 Dec 3 13:12 ata-QEMU_HARDDISK_QM00002 -> ../../sdb lrwxrwxrwx 1 root root 10 Dec 3 13:12 ata-QEMU_HARDDISK_QM00002-part1 -> ../../sdb1 ============================================= - Test with latest rhcos-410.84.202112012203-0, ignition apply successfully $ lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 16G 0 disk |-sda1 8:1 0 1M 0 part |-sda2 8:2 0 127M 0 part |-sda3 8:3 0 384M 0 part /boot `-sda4 8:4 0 15.5G 0 part /sysroot sdb 8:16 0 4G 0 disk `-sdb1 8:17 0 4G 0 part /var sr0 11:0 1 1024M 0 rom [core@ibm-p8-kvm-03-guest-02 ~]$ ls -la /dev/disk/by-id/ total 0 drwxr-xr-x. 2 root root 620 Dec 3 13:20 . drwxr-xr-x. 8 root root 160 Dec 3 13:20 .. lrwxrwxrwx. 1 root root 9 Dec 3 13:20 ata-QEMU_DVD-ROM_QM00003 -> ../../sr0 lrwxrwxrwx. 1 root root 9 Dec 3 13:20 ata-QEMU_HARDDISK_QM00001 -> ../../sda lrwxrwxrwx. 1 root root 10 Dec 3 13:20 ata-QEMU_HARDDISK_QM00001-part1 -> ../../sda1 lrwxrwxrwx. 1 root root 10 Dec 3 13:20 ata-QEMU_HARDDISK_QM00001-part2 -> ../../sda2 lrwxrwxrwx. 1 root root 10 Dec 3 13:20 ata-QEMU_HARDDISK_QM00001-part3 -> ../../sda3 lrwxrwxrwx. 1 root root 10 Dec 3 13:20 ata-QEMU_HARDDISK_QM00001-part4 -> ../../sda4 lrwxrwxrwx. 1 root root 9 Dec 3 13:20 ata-QEMU_HARDDISK_QM00002 -> ../../sdb lrwxrwxrwx. 1 root root 10 Dec 3 13:20 ata-QEMU_HARDDISK_QM00002-part1 -> ../../sdb1 lrwxrwxrwx. 1 root root 9 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00001 -> ../../sda lrwxrwxrwx. 
1 root root 10 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00001-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00001-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00001-part3 -> ../../sda3
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00001-part4 -> ../../sda4
lrwxrwxrwx. 1 root root 9 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00002 -> ../../sdb
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-0ATA_QEMU_HARDDISK_QM00002-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root 9 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00001 -> ../../sda
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00001-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00001-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00001-part3 -> ../../sda3
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00001-part4 -> ../../sda4
lrwxrwxrwx. 1 root root 9 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00002 -> ../../sdb
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-1ATA_QEMU_HARDDISK_QM00002-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root 9 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00001 -> ../../sda
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00001-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00001-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00001-part3 -> ../../sda3
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00001-part4 -> ../../sda4
lrwxrwxrwx. 1 root root 9 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00002 -> ../../sdb
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 scsi-SATA_QEMU_HARDDISK_QM00002-part1 -> ../../sdb1

[core@ibm-p8-kvm-03-guest-02 ~]$ ls -la /dev/disk/by-partlabel/
total 0
drwxr-xr-x. 2 root root 140 Dec 3 13:20 .
drwxr-xr-x. 8 root root 160 Dec 3 13:20 ..
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 BIOS-BOOT -> ../../sda1
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 EFI-SYSTEM -> ../../sda2
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 boot -> ../../sda3
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 root -> ../../sda4
lrwxrwxrwx. 1 root root 10 Dec 3 13:20 var -> ../../sdb1

$ grep -e "scsi*" /lib/udev/rules.d/*.rules | grep 63
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_SERIAL}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-S$env{SCSI_VENDOR}_$env{SCSI_MODEL}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_SERIAL}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-S$env{SCSI_VENDOR}_$env{SCSI_M"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REGEXT}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_RE"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REGEXT}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_N"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REG}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_REG}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_REG}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_EXT}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_EXT}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_EXT}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_EUI64}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-2$env{SCSI_IDENT_LUN_EUI64}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_EUI64}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-2$env{SCSI_IDENT_LUN_EUI64}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAME}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-8$env{SCSI_IDENT_LUN_NAME}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAME}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-8$env{SCSI_IDENT_LUN_NAME}-p"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_T10}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-1$env{SCSI_IDENT_LUN_T10}"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_T10}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-1$env{SCSI_IDENT_LUN_T10}-par"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_LOCAL}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NAA_LOC"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_NAA_LOCAL}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-3$env{SCSI_IDENT_LUN_NA"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_VENDOR}=="?*", ENV{DEVTYPE}=="disk", SYMLINK+="disk/by-id/scsi-0$env{SCSI_VENDOR}_$env{SCSI_MO"
/lib/udev/rules.d/63-scsi-sg3_symlink.rules:ENV{SCSI_IDENT_LUN_VENDOR}=="?*", ENV{DEVTYPE}=="partition", SYMLINK+="disk/by-id/scsi-0$env{SCSI_VENDOR}_$env{SC"
/lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781a-1e82-4818-a1c3-63d806ec15bb}", ENV{fabric_scsi_controller}="scsi0", GOTO="azure_datadis"
/lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781b-1e82-4818-a1c3-63d806ec15bb}", ENV{fabric_scsi_controller}="scsi1", GOTO="azure_datadis"
/lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781c-1e82-4818-a1c3-63d806ec15bb}", ENV{fabric_scsi_controller}="scsi2", GOTO="azure_datadis"
/lib/udev/rules.d/66-azure-storage.rules:ATTRS{device_id}=="{f8b3781d-1e82-4818-a1c3-63d806ec15bb}", ENV{fabric_scsi_controller}="scsi3", GOTO="azure_datadis"
Another question: do we have a plan to backport this to 4.9 or 4.8?
The fix for this bug has landed in a bootimage bump, as tracked in bug 2027501 (now in status MODIFIED). Moving this bug to MODIFIED.
Hi Jonathan, I failed to reproduce the issue with rhcos-48.84.202112092303-0 (which does not include the fix) using the command in Comment 36. Since the fix is now included in 4.10, do you have any suggestions? Thanks! Additionally, I asked the KVM QE for help, and he said the qemu command in Comment 37 is not correct, but I do not know why it can reproduce the issue.
(In reply to HuijingHei from comment #41)
> Hi Jonathan, I failed to reproduce the issue with
> rhcos-48.84.202112092303-0(not include the fixed patch) with command in
> Comment 36, as the fixed patch is now include in 4.10, do you have some
> suggestions? Thanks!

Can you retry the steps in comment 36, but interrupting GRUB to add `rd.break`. Then from the `rd.break` shell, you can do e.g. `ls -la /dev/disk/by-id/`. You should see that on 4.8 some of the symlinks are missing compared to doing this on 4.10.

It's normal that in the real root the symlinks are always present in both 4.8 and 4.10. This RHBZ was about including the rules in the initrd so they also show up there.

> Additionally, I asked help from KVM qe, and he said the qemu command in
> Comment 37 is not correct, but I do not why it can reproduce the issue.

Hmm, it looks OK to me, but I rarely directly use the QEMU cmdline these days. Another easy way to get SCSI disks added is to add a multipath disk. E.g.:

```
$ cosa run -c --add-disk 1G:mpath --kargs rd.break
...
switch_root:/# ls -la /dev/disk/by-id/scsi-*
lrwxrwxrwx 1 root root 9 Dec 14 16:05 /dev/disk/by-id/scsi-0NVME_VirtualMultipath_disk1 -> ../../sda
lrwxrwxrwx 1 root root 9 Dec 14 16:05 /dev/disk/by-id/scsi-3c5cadec86ff5ebf9 -> ../../sda
lrwxrwxrwx 1 root root 9 Dec 14 16:05 /dev/disk/by-id/scsi-SNVME_VirtualMultipath_disk1 -> ../../sda
```

Comparing against 4.8:

```
$ cosa run --qemu-image rhcos-4.8.14-x86_64-qemu.x86_64.qcow2 -c --add-disk 1G:mpath --kargs rd.break
...
switch_root:/# ls -la /dev/disk/by-id/scsi-*
lrwxrwxrwx 1 root root 9 Dec 14 16:05 /dev/disk/by-id/scsi-3d6fb0870b8a01c1f -> ../../sdb
```
(In reply to Jonathan Lebon from comment #42)
> Can you retry the steps in comment 36, but interrupting GRUB to add
> `rd.break`. Then from the `rd.break` shell, you can do e.g. `ls -la
> /dev/disk/by-id/`. You should see that on 4.8 some of the symlinks are
> missing compared to doing this on 4.10.
>
> It's normal that in the real root the symlinks are always present in both
> 4.8 and 4.10. This RHBZ was about including the rules in the initrd so they
> also show up there.

Thanks Jonathan for your reply!
1) With `rd.break`, some symlinks are missing in 4.8 but are shown in 4.10. Changing the bug status to VERIFIED.
2) I will try your cosa command and see how to write an automated script.

===========================
switch_root:/# cat /etc/os-release
VERSION="410.84.202112062002-0 dracut-049-135.git20210121.el8"
switch_root:/# ls /dev/disk/by-id/ -l
total 0
lrwxrwxrwx 1 root root 9 Dec 15 08:31 ata-QEMU_HARDDISK_QM00001 -> ../../sdb
lrwxrwxrwx 1 root root 10 Dec 15 08:31 ata-QEMU_HARDDISK_QM00001-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Dec 15 08:31 ata-QEMU_HARDDISK_QM00001-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Dec 15 08:31 ata-QEMU_HARDDISK_QM00001-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Dec 15 08:31 ata-QEMU_HARDDISK_QM00001-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 Dec 15 08:31 scsi-0ATA_QEMU_HARDDISK_QM00001 -> ../../sdb
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-0ATA_QEMU_HARDDISK_QM00001-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-0ATA_QEMU_HARDDISK_QM00001-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-0ATA_QEMU_HARDDISK_QM00001-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-0ATA_QEMU_HARDDISK_QM00001-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 Dec 15 08:31 scsi-0QEMU_QEMU_HARDDISK_hd2 -> ../../sda
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-0QEMU_QEMU_HARDDISK_hd2-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Dec 15 08:31 scsi-1ATA_QEMU_HARDDISK_QM00001 -> ../../sdb
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-1ATA_QEMU_HARDDISK_QM00001-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-1ATA_QEMU_HARDDISK_QM00001-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-1ATA_QEMU_HARDDISK_QM00001-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-1ATA_QEMU_HARDDISK_QM00001-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 Dec 15 08:31 scsi-SATA_QEMU_HARDDISK_QM00001 -> ../../sdb
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-SATA_QEMU_HARDDISK_QM00001-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-SATA_QEMU_HARDDISK_QM00001-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-SATA_QEMU_HARDDISK_QM00001-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Dec 15 08:31 scsi-SATA_QEMU_HARDDISK_QM00001-part4 -> ../../sdb4
===========================
switch_root:/# cat /etc/os-release
VERSION="48.84.202112142303-0 dracut-049-135.git20210121.el8"
switch_root:/# ls /dev/disk/by-id/* -l
lrwxrwxrwx 1 root root 9 Dec 15 08:41 /dev/disk/by-id/ata-QEMU_HARDDISK_QM00001 -> ../../sdb
lrwxrwxrwx 1 root root 10 Dec 15 08:41 /dev/disk/by-id/ata-QEMU_HARDDISK_QM00001-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Dec 15 08:41 /dev/disk/by-id/ata-QEMU_HARDDISK_QM00001-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Dec 15 08:41 /dev/disk/by-id/ata-QEMU_HARDDISK_QM00001-part3 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Dec 15 08:41 /dev/disk/by-id/ata-QEMU_HARDDISK_QM00001-part4 -> ../../sdb4
lrwxrwxrwx 1 root root 9 Dec 15 08:41 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_hd2 -> ../../sda
lrwxrwxrwx 1 root root 10 Dec 15 08:41 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_hd2-part1 -> ../../sda1
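A comparison like the one between the 4.10 and 4.8 listings above could be scripted roughly as follows. This is only a minimal sketch, assuming each `ls -l` listing from the `rd.break` shell has been saved to a text file; the function names `link_names` and `missing_links` are hypothetical helpers for illustration and are not part of cosa or RHCOS.

```shell
#!/bin/sh
# Hypothetical helpers (not part of cosa or RHCOS): compare two saved
# `ls -l /dev/disk/by-id/` listings taken from the rd.break shell.

# Print the sorted, unique symlink names found in a saved listing.
# Only lines whose mode starts with "l" are considered; the name is the
# field just before "->", with any leading directory stripped.
link_names() {
    awk '$1 ~ /^l/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "->") {
                n = split($(i - 1), parts, "/")
                print parts[n]
            }
        }
    }' "$1" | sort -u
}

# Print symlink names present in the second listing but absent from the first.
missing_links() {
    ml_old=$(mktemp)
    ml_new=$(mktemp)
    link_names "$1" > "$ml_old"
    link_names "$2" > "$ml_new"
    # comm -13 keeps only the lines unique to the second (sorted) file.
    comm -13 "$ml_old" "$ml_new"
    rm -f "$ml_old" "$ml_new"
}

# When called with two file arguments, run the comparison directly.
if [ "$#" -eq 2 ]; then
    missing_links "$1" "$2"
fi
```

Saved to a file with an assumed name such as `compare-by-id.sh`, it could be run as `sh compare-by-id.sh by-id-4.8.txt by-id-4.10.txt` (both listing file names are also assumptions). With the two listings above as input, it should print the `scsi-0ATA_*`, `scsi-1ATA_*`, and `scsi-SATA_*` names that are missing from the 4.8 initrd.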
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056