There appears to be something wrong with etcd on this job. It has a 99% failure rate -- https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6 I have filed a PR to make it optional. Once we fix the issue, we can re-enable it.
> I have filed a PR to make it optional. Once we fix the issue, we can re-enable it.
It's a real regression, so I'd prefer we fix the issue. I pushed https://github.com/openshift/cluster-etcd-operator/pull/785, but investigation and testing are ongoing to confirm whether that's the only issue.
Stephen -- Agreed, this is a real regression. However, thanks to the interesting nature of prow and merge pools, it is essentially impossible to merge any PRs until this is fixed. Hence the blocker bug. With less than two weeks to go before feature freeze, we can't afford to be stuck for another week. Feel free to file a PR re-enabling the job when things are stable.
It seems my fix isn't sufficient, so I'll remove my assignment so that this can hopefully be triaged and investigated by the etcd team. I triggered e2e-metal-ipi-ovn-ipv6 on https://github.com/openshift/cluster-etcd-operator/pull/785 so we can hopefully collect more details about the remaining issues.
I spotted some similar issues with https://github.com/openshift/cluster-etcd-operator/pull/784, so I updated my PR with another fix and am re-testing.
I haven't got the fixes working yet, so I'm trying a revert: https://github.com/openshift/cluster-etcd-operator/pull/786 (this did work locally for me, but let's confirm in CI).
This was introduced with https://github.com/openshift/cluster-etcd-operator/pull/780 and fixed by https://github.com/openshift/cluster-etcd-operator/pull/790. The test is passing here: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/27033/pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6/1516778963907645440 cc @cdc @shardy Is anything else required here for further verification?
According to testgrid [1], this job finally went green on 4/22. So, yes, I think we can set it as blocking for CNO if desired. [1] https://testgrid.k8s.io/redhat-openshift-ocp-release-4.11-informing#periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6
@cdc Would you kindly change the component, if possible? I think it is no longer an etcd problem, is it? cc @dwest
Agreed, we can kick this back to the SDN team. (It should have been a BLOCKS bug for the etcd team anyway.) Thanks for the prompt fix.
This job is no longer failing at such a high rate. sippy [0] is showing that it's passing more than 50% of the time. This PR [1] will revert the initial change that made these jobs optional in CNO and OVNK. Also, it's good to see that the job is also a payload blocker again [2], as it was also moved to informing/optional by TRT when it was failing so often. [0] https://sippy.dptools.openshift.org/sippy-ng/jobs/4.11/analysis?filters=%7B%22items%22%3A%5B%7B%22columnField%22%3A%22name%22%2C%22operatorValue%22%3A%22equals%22%2C%22value%22%3A%22periodic-ci-openshift-release-master-nightly-4.11-e2e-metal-ipi-ovn-ipv6%22%7D%5D%7D [1] https://github.com/openshift/release/pull/28469 [2] https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/#4.11.0-0.nightly
At least these two need to merge to move things along: https://github.com/openshift/image-customization-controller/pull/49 https://github.com/openshift/installer/pull/5909
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069