Description of problem:
configure-ovs is failing because jq is not available in RHEL.

From must-gather logs:

Feb 08 11:10:06.074973 ip-10-0-53-215 configure-ovs.sh[1267]: ++ ip -j a show dev ens5
Feb 08 11:10:06.075928 ip-10-0-53-215 configure-ovs.sh[1267]: ++ jq '.[0].addr_info | map(. | select(.family == "inet")) | length'
Feb 08 11:10:06.079449 ip-10-0-53-215 configure-ovs.sh[1267]: + num_ipv4_addrs=1
Feb 08 11:10:06.080303 ip-10-0-53-215 configure-ovs.sh[1267]: + '[' 1 -gt 0 ']'
Feb 08 11:10:06.080303 ip-10-0-53-215 configure-ovs.sh[1267]: + extra_if_brex_args+='ipv4.may-fail no '
Feb 08 11:10:06.082385 ip-10-0-53-215 configure-ovs.sh[1267]: ++ ip -j a show dev ens5
Feb 08 11:10:06.083353 ip-10-0-53-215 configure-ovs.sh[1267]: ++ jq '.[0].addr_info | map(. | select(.family == "inet6" and .scope != "link")) | length'
Feb 08 11:10:06.085296 ip-10-0-53-215 configure-ovs.sh[1267]: + num_ip6_addrs=0
Feb 08 11:10:06.086110 ip-10-0-53-215 configure-ovs.sh[1267]: + '[' 0 -gt 0 ']'

> oc get node -owide
NAME                                        STATUS     ROLES    AGE   VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                         KERNEL-VERSION                 CONTAINER-RUNTIME
ip-10-0-50-115.us-east-2.compute.internal   Ready      master   19h   v1.23.3+d99c04f   10.0.50.115   <none>        Red Hat Enterprise Linux CoreOS 410.84.202202070040-0 (Ootpa)    4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-50-155.us-east-2.compute.internal   Ready      worker   18h   v1.23.3+d99c04f   10.0.50.155   <none>        Red Hat Enterprise Linux CoreOS 410.84.202202070040-0 (Ootpa)    4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-54-39.us-east-2.compute.internal    NotReady   worker   12h   v1.23.3+d99c04f   10.0.54.39    <none>        Red Hat Enterprise Linux 8.5 (Ootpa)                             4.18.0-348.12.2.el8_5.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-55-224.us-east-2.compute.internal   Ready      worker   18h   v1.23.3+d99c04f   10.0.55.224   <none>        Red Hat Enterprise Linux CoreOS 410.84.202202070040-0 (Ootpa)    4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-57-184.us-east-2.compute.internal   NotReady   worker   12h   v1.23.3+d99c04f   10.0.57.184   <none>        Red Hat Enterprise Linux 8.5 (Ootpa)                             4.18.0-348.12.2.el8_5.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-59-38.us-east-2.compute.internal    Ready      master   19h   v1.23.3+d99c04f   10.0.59.38    <none>        Red Hat Enterprise Linux CoreOS 410.84.202202070040-0 (Ootpa)    4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-66-173.us-east-2.compute.internal   Ready      worker   18h   v1.23.3+d99c04f   10.0.66.173   <none>        Red Hat Enterprise Linux CoreOS 410.84.202202070040-0 (Ootpa)    4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8
ip-10-0-67-199.us-east-2.compute.internal   Ready      master   19h   v1.23.3+d99c04f   10.0.67.199   <none>        Red Hat Enterprise Linux CoreOS 410.84.202202070040-0 (Ootpa)    4.18.0-305.34.2.el8_4.x86_64   cri-o://1.23.0-112.rhaos4.10.gitb527000.el8

> oc get co network -oyaml
<--SNIP-->
status:
  conditions:
  - lastTransitionTime: "2022-02-07T09:01:00Z"
    status: "False"
    type: ManagementStateDegraded
  - lastTransitionTime: "2022-02-07T15:50:20Z"
    message: |-
      DaemonSet "openshift-ovn-kubernetes/ovn-ipsec" rollout is not making progress - last change 2022-02-07T15:49:32Z
      DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-dft7c is in CrashLoopBackOff State
      DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-n4kp4 is in CrashLoopBackOff State
      DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2022-02-07T15:49:32Z
    reason: RolloutHung
    status: "True"
    type: Degraded
<--SNIP-->

> oc get pod -n openshift-ovn-kubernetes | grep -v "Running" | grep -v "Completed"
NAME                 READY   STATUS             RESTARTS          AGE
ovn-ipsec-44b6s      0/1     Init:Error         107 (7m15s ago)   12h
ovn-ipsec-x926k      0/1     Init:0/1           107 (6m44s ago)   12h
ovnkube-node-dft7c   4/5     CrashLoopBackOff   148 (4m37s ago)   12h
ovnkube-node-n4kp4   4/5     CrashLoopBackOff   148 (3m32s ago)   12h

> oc logs -n openshift-ovn-kubernetes ovnkube-node-n4kp4 -c ovnkube-node
I0208 03:55:10.638403 346842 ovs.go:207] exec(1): stdout: ""
I0208 03:55:10.638434 346842 ovs.go:208] exec(1): stderr: ""
I0208 03:55:10.638454 346842 ovs.go:204] exec(2): /usr/bin/ovs-vsctl --timeout=15 -- clear bridge br-int netflow -- clear bridge br-int sflow -- clear bridge br-int ipfix
I0208 03:55:10.663033 346842 ovs.go:207] exec(2): stdout: ""
I0208 03:55:10.663060 346842 ovs.go:208] exec(2): stderr: ""
I0208 03:55:10.670858 346842 node.go:386] Node ip-10-0-57-184.us-east-2.compute.internal ready for ovn initialization with subnet 10.130.2.0/23
I0208 03:55:10.670890 346842 ovs.go:204] exec(3): /usr/bin/ovn-sbctl --private-key=/ovn-cert/tls.key --certificate=/ovn-cert/tls.crt --bootstrap-ca-cert=/ovn-ca/ca-bundle.crt --db=ssl:10.0.50.115:9642,ssl:10.0.59.38:9642,ssl:10.0.67.199:9642 --timeout=15 --columns=up list Port_Binding
I0208 03:55:10.706348 346842 ovs.go:207] exec(3): stdout: "up : false\n\nup : false\n\nup : true\n\nup : true\n\nup : false\n\nup : false\n\nup <--SNIP-->
I0208 03:55:10.706525 346842 ovs.go:208] exec(3): stderr: ""
I0208 03:55:10.706542 346842 node.go:315] Detected support for port binding with external IDs
I0208 03:55:10.706654 346842 ovs.go:204] exec(4): /usr/bin/ovs-vsctl --timeout=15 -- --if-exists del-port br-int k8s-ip-10-0-57- -- --may-exist add-port br-int ovn-k8s-mp0 -- set interface ovn-k8s-mp0 type=internal mtu_request=8855 external-ids:iface-id=k8s-ip-10-0-57-184.us-east-2.compute.internal
I0208 03:55:10.731323 346842 ovs.go:207] exec(4): stdout: ""
I0208 03:55:10.731348 346842 ovs.go:208] exec(4): stderr: ""
I0208 03:55:10.731365 346842 ovs.go:204] exec(5): /usr/bin/ovs-vsctl --timeout=15 --if-exists get interface ovn-k8s-mp0 mac_in_use
I0208 03:55:10.756306 346842 ovs.go:207] exec(5): stdout: "\"ba:e9:29:b9:49:ae\"\n"
I0208 03:55:10.756329 346842 ovs.go:208] exec(5): stderr: ""
I0208 03:55:10.756351 346842 ovs.go:204] exec(6): /usr/bin/ovs-vsctl --timeout=15 set interface ovn-k8s-mp0 mac=ba\:e9\:29\:b9\:49\:ae
I0208 03:55:10.780505 346842 ovs.go:207] exec(6): stdout: ""
I0208 03:55:10.780529 346842 ovs.go:208] exec(6): stderr: ""
I0208 03:55:10.818558 346842 gateway_init.go:261] Initializing Gateway Functionality
I0208 03:55:10.818801 346842 gateway_localnet.go:131] Node local addresses initialized to: map[10.0.57.184:{10.0.48.0 fffff000} 10.130.2.2:{10.130.2.0 fffffe00} 127.0.0.1:{127.0.0.0 ff000000} ::1:{::1 ffffffffffffffffffffffffffffffff} fe80::8e:3dff:fe6a:8ce4:{fe80:: ffffffffffffffff0000000000000000} fe80::acc3:8dff:fe0a:26db:{fe80:: ffffffffffffffff0000000000000000} fe80::b8e9:29ff:feb9:49ae:{fe80:: ffffffffffffffff0000000000000000}]
I0208 03:55:10.818994 346842 helper_linux.go:74] Found default gateway interface eth0 10.0.48.1
F0208 03:55:10.819075 346842 ovnkube.go:133] could not find IP addresses: failed to lookup link br-ex: Link not found

OCP Version: 4.10.0-0.nightly-2022-02-07-032308

How reproducible:
Always

Steps to Reproduce:
1. Create an OVN cluster
2. Scale up a RHEL machine

Actual results:
New RHEL machine not ready

Expected results:
Scaleup finished successfully

Suggestion:
Avoid using the jq tool in the configure-ovs.sh script, or update the documentation to state that jq is required when scaling up RHEL machines.

Additional info:
After installing the jq tool on the RHEL machine before the scaleup, the scaleup process finished successfully.
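For anyone hitting this before a fix lands, the workaround above amounts to installing jq on each RHEL worker before running the scaleup playbook. A minimal sketch, assuming a RHEL 8 host with the standard repositories attached (jq normally ships in AppStream):

# Run on each RHEL worker before scaleup so configure-ovs.sh
# can parse the JSON output of "ip -j".
sudo yum install -y jq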
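On the suggestion to drop the dependency: below is a minimal sketch of how the two counts from the log excerpt above could be computed with iproute2 alone, without jq. This is only an illustration of the idea, not the actual change shipped with the fix; the interface name ens5 is taken from the must-gather log.

# jq-free equivalent of:
#   ip -j a show dev ens5 | jq '.[0].addr_info | map(. | select(.family == "inet")) | length'
# "ip -o -4" prints exactly one line per IPv4 address on the device.
num_ipv4_addrs=$(ip -o -4 addr show dev ens5 | wc -l)

# jq-free equivalent of the IPv6 count, which excludes link-local addresses.
# "|| true" keeps the assignment from aborting under "set -e" when there is
# nothing to count (grep exits non-zero but still prints 0 with -c).
num_ip6_addrs=$(ip -o -6 addr show dev ens5 | grep -cv "scope link" || true)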
Setting back to blocker+ to respect the initial assessment made by @yunjiang
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069