Bug 1999891 - must-gather collects backup data even when the Pod fails to be created
Summary: must-gather collects backup data even when the Pod fails to be created
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Maciej Szulik
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-31 22:58 UTC by Michael Washer
Modified: 2022-08-10 10:37 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Some errors were not being printed. Consequence: must-gather output did not contain information about problems and thus made it hard to figure out what failed. Fix: Bubble errors to make them visible to the user at any stage in the must-gather run. Result: The output provided by must-gather contains more detailed information about what went wrong, when and why.
Clone Of:
Environment:
Last Closed: 2022-08-10 10:37:25 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift oc pull 1093 | 0 | None | open | Bug 1999891: bubble errors which happen prior to collection to BackupGathering | 2022-03-18 15:30:27 UTC
Github | openshift oc pull 916 | 0 | None | None | None | 2021-09-01 02:23:33 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:37:45 UTC

Description Michael Washer 2021-08-31 22:58:04 UTC
Description of problem:
The Must-Gather Pod can fail to be created because of problems with the parameters provided. When this happens, must-gather does not report the problem and instead attempts to collect the backup data.

Version-Release number of selected component (if applicable):
Current (OCP 4.6+)

How reproducible:
Every time

Steps to Reproduce:
1. Run `oc adm must-gather --node-name="node/abvbasib"`

Actual results:
must-gather attempts to collect the backup data, as if the Must-Gather Pod had failed.

Expected results:
An error stating that the --node-name parameter is formatted incorrectly.

Additional info:
When looking at the following snippets of code:

Where the error is returned from the Pod creation
https://github.com/openshift/oc/blob/master/pkg/cli/admin/mustgather/mustgather.go#L325-L328

This deferred function then runs to collect the data:
https://github.com/openshift/oc/blob/master/pkg/cli/admin/mustgather/mustgather.go#L259-L264

But here the intent appears to be that if the issue is caused by the user's arguments, the backup collection should not run.
https://github.com/openshift/oc/blob/master/pkg/cli/admin/mustgather/mustgather.go#L395-L399

This is an issue because the backup collection can take a long time and should not be needed if must-gather can be rerun with the correct arguments.

Comment 2 Maciej Szulik 2022-03-16 16:21:32 UTC
https://github.com/openshift/oc/pull/1013 improved the messages in must-gather to make it clear what's happening and why.

Comment 8 zhou ying 2022-06-24 08:37:53 UTC
Can't reproduce the issue now:

oc version --client
Client Version: 4.11.0-0.nightly-2022-06-24-041539
Kustomize Version: v4.5.4

oc adm must-gather --node-name="node/abvbasib"
[must-gather      ] OUT Using must-gather plug-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:82aa287dc11d558b9af6b261fe21b753c450649a8e8e3a7e0ef3e1440d8ec3c0
error: --node-name may not contain '/' or '%'
[root@localhost ~]# oc adm must-gather --node-name="sssdadkaadsl"
[must-gather      ] OUT Using must-gather plug-in image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:82aa287dc11d558b9af6b261fe21b753c450649a8e8e3a7e0ef3e1440d8ec3c0
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: 182312bf-b2b7-432a-a9e5-3c4e3eb42db1
ClusterVersion: Stable at "4.11.0-0.nightly-2022-06-23-153912"
ClusterOperators:
	All healthy and stable


[must-gather      ] OUT namespace/openshift-must-gather-6lqcd created
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-kvxqv created
[must-gather      ] OUT namespace/openshift-must-gather-6lqcd deleted
[must-gather      ] OUT clusterrolebinding.rbac.authorization.k8s.io/must-gather-kvxqv deleted


Error running must-gather collection:
    nodes "sssdadkaadsl" not found

Falling back to `oc adm inspect clusteroperators.v1.config.openshift.io` to collect basic cluster information.
....

Comment 9 errata-xmlrpc 2022-08-10 10:37:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

