2072154 – Secondary Scheduler operator panics

Summary: Secondary Scheduler operator panics

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-scheduler
Sub Component:
Version:	4.10
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Target Release:	4.11.0
Assignee:	Jan Chaloupka
QA Contact:	RamaKasturi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-04-05 17:17 UTC by RamaKasturi
Modified:	2022-08-10 11:03 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-08-10 11:03:38 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift secondary-scheduler-operator pull 43	0	None	open	bug 2072154: Rename deployment to secondary scheduler	2022-04-05 17:34:53 UTC

Description RamaKasturi 2022-04-05 17:17:53 UTC

Description of problem:
The cluster instance for the secondary scheduler operator does not get created, and the operator logs show a panic:

E0405 16:31:38.278247       1 runtime.go:78] Observed a panic: &errors.errorString{s:"v1.Deployment.Spec: v1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value: ReadString: expects \" or n, but found t, error found in #10 byte of ...|,\"value\":true}],\"ima|..., bigger context ...|],\"env\":[{\"name\":\"ENABLE_OPENSHIFT_AUTH\",\"value\":true}],\"image\":\"${IMAGE}\",\"name\":\"secondary-schedul|..."} (v1.Deployment.Spec: v1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value: ReadString: expects " or n, but found t, error found in #10 byte of ...|,"value":true}],"ima|..., bigger context ...|],"env":[{"name":"ENABLE_OPENSHIFT_AUTH","value":true}],"image":"${IMAGE}","name":"secondary-schedul|...)
goroutine 234 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1d26240, 0xc00084af90})
	k8s.io/apimachinery.2/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00004e260})
	k8s.io/apimachinery.2/pkg/util/runtime/runtime.go:48 +0x75
panic({0x1d26240, 0xc00084af90})
	runtime/panic.go:1038 +0x215
github.com/openshift/library-go/pkg/operator/resource/resourceread.ReadDeploymentV1OrDie({0xc000a8f200, 0x421, 0x480})
	github.com/openshift/library-go.0-20210331235027-66936e2fcc52/pkg/operator/resource/resourceread/apps.go:23 +0x137
github.com/openshift/secondary-scheduler-operator/pkg/operator.(*TargetConfigReconciler).manageDeployment(0xc000bc5cc8, 0xc000151440, 0x0)
	github.com/openshift/secondary-scheduler-operator/pkg/operator/target_config_reconciler.go:205 +0x6e
github.com/openshift/secondary-scheduler-operator/pkg/operator.TargetConfigReconciler.sync({{0x2388090, 0xc000312880}, {0x235f160, 0xc0001277d0}, 0xc000a753e0, {0x23e8c30, 0xc00063f8c0}, {0x235f098, 0xc000127890}, {0x2340c60, ...}, ...}, ...)
	github.com/openshift/secondary-scheduler-operator/pkg/operator/target_config_reconciler.go:131 +0x26a
github.com/openshift/secondary-scheduler-operator/pkg/operator.(*TargetConfigReconciler).processNextWorkItem(0xc000a13d00)
	github.com/openshift/secondary-scheduler-operator/pkg/operator/target_config_reconciler.go:312 +0x18a
github.com/openshift/secondary-scheduler-operator/pkg/operator.(*TargetConfigReconciler).runWorker(...)
	github.com/openshift/secondary-scheduler-operator/pkg/operator/target_config_reconciler.go:301
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x7f75d17618d0)
	k8s.io/apimachinery.2/pkg/util/wait/wait.go:155 +0x67
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x23c3db0, {0x2340800, 0xc000a7c480}, 0x1, 0xc000721140)
	k8s.io/apimachinery.2/pkg/util/wait/wait.go:156 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0, 0x3b9aca00, 0x0, 0x28, 0xc0007baf40)
	k8s.io/apimachinery.2/pkg/util/wait/wait.go:133 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(0xc000a13d00, 0x0, 0xc000721140)
	k8s.io/apimachinery.2/pkg/util/wait/wait.go:90 +0x25
created by github.com/openshift/secondary-scheduler-operator/pkg/operator.(*TargetConfigReconciler).Run
	github.com/openshift/secondary-scheduler-operator/pkg/operator/target_config_reconciler.go:295 +0x237
I0405 16:31:38.278325       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-secondary-scheduler-operator", Name:"secondary-scheduler-operator", UID:"e245427c-23ac-4e7b-bef8-613b161e0e75", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Warning' reason: 'Openshift-Secondary-Scheduler-OperatorPanic' Panic observed: v1.Deployment.Spec: v1.DeploymentSpec.Template: v1.PodTemplateSpec.Spec: v1.PodSpec.Containers: []v1.Container: v1.Container.Env: []v1.EnvVar: v1.EnvVar.Value: ReadString: expects " or n, but found t, error found in #10 byte of ...|,"value":true}],"ima|..., bigger context ...|],"env":[{"name":"ENABLE_OPENSHIFT_AUTH","value":true}],"image":"${IMAGE}","name":"secondary-schedul|...
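
The log above suggests that the deployment manifest sets the ENABLE_OPENSHIFT_AUTH env value to an unquoted boolean (true), while the Value field of v1.EnvVar is a string. The following is a minimal sketch of that type mismatch, not the operator's code: it uses Go's standard encoding/json package rather than the decoder the operator uses, so the error wording will differ from the log. The envVar struct and the decodeEnvVar helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envVar mirrors only the two fields of the Kubernetes v1.EnvVar type that
// matter here. Value is a string, as in the real type.
type envVar struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

// decodeEnvVar is a hypothetical helper that parses a single env entry.
func decodeEnvVar(raw string) (envVar, error) {
	var e envVar
	err := json.Unmarshal([]byte(raw), &e)
	return e, err
}

func main() {
	// Unquoted boolean, as in the manifest fragment quoted in the panic.
	// Decoding into a string field returns a non-nil error.
	_, err := decodeEnvVar(`{"name":"ENABLE_OPENSHIFT_AUTH","value":true}`)
	fmt.Println("unquoted:", err)

	// The same entry with the value quoted decodes without error.
	e, err := decodeEnvVar(`{"name":"ENABLE_OPENSHIFT_AUTH","value":"true"}`)
	fmt.Println("quoted:", e.Value, err)
}
```

In this sketch the error is returned to the caller. In the stack trace above, the equivalent failure surfaces as a panic inside ReadDeploymentV1OrDie, which is why the operator never gets as far as creating the deployment.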


Version-Release number of selected component (if applicable):
[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.8    True        False         96m     Cluster version is 4.10.8
[knarra@knarra ~]$ oc get csv -n openshift-secondary-scheduler-operator
NAME                                DISPLAY                                              VERSION   REPLACES   PHASE
secondaryscheduleroperator.v1.0.0   Secondary Scheduler Operator for Red Hat OpenShift   1.0.0                Succeeded


How reproducible:
Always

Steps to Reproduce:
1. Install the Secondary Scheduler Operator from OperatorHub.
2. Create the secondary-scheduler-config ConfigMap using the YAML below:
apiVersion: v1
kind: ConfigMap
metadata:
  name: "secondary-scheduler-config"
  namespace: "openshift-secondary-scheduler-operator"
data:
  "config.yaml": |
    apiVersion: kubescheduler.config.k8s.io/v1beta1
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false
    profiles:
      - schedulerName: secondary-scheduler
        plugins:
          score:
            disabled:
              - name: NodeResourcesBalancedAllocation
              - name: NodeResourcesLeastAllocated

3. Create a new cluster instance for the secondary scheduler, setting the image with schedulerImage: 'k8s.gcr.io/scheduler-plugins/kube-scheduler:v0.22.6'

Actual results:
The cluster pod for the secondary scheduler operator does not reach the Running state, and the operator logs show a panic.

Expected results:
Cluster pod for secondary scheduler operator should be running and there should be no panic.

Additional info:

Comment 2 RamaKasturi 2022-04-12 12:04:13 UTC

Verified with the latest secondary scheduler operator. I do not see any panic, so I am moving the bug to the verified state.

[knarra@knarra ~]$ oc get csv -n openshift-secondary-scheduler-operator
NAME                                DISPLAY                                              VERSION   REPLACES   PHASE
secondaryscheduleroperator.v1.0.0   Secondary Scheduler Operator for Red Hat OpenShift   1.0.0                Succeeded
[knarra@knarra ~]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-04-11-231524   True        False         4h17m   Cluster version is 4.10.0-0.nightly-2022-04-11-231524


[knarra@knarra ~]$ oc get pods -n openshift-secondary-scheduler-operator
NAME                                            READY   STATUS    RESTARTS   AGE
secondary-scheduler-5489b8f7fc-95jqg            1/1     Running   0          37s
secondary-scheduler-operator-6d5787f9bc-sfdrs   1/1     Running   0          77s
[knarra@knarra ~]$ oc logs -f secondary-scheduler-operator-6d5787f9bc-sfdrs -n openshift-secondary-scheduler-operator
I0412 11:59:28.875604       1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-secondary-scheduler-operator", Name:"secondary-scheduler-operator", UID:"e3dcf6ab-dd9a-4616-a859-c4e8efc0512d", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'DeploymentCreated' Created Deployment.apps/secondary-scheduler -n openshift-secondary-scheduler-operator because it was missing

oc logs -f secondary-scheduler-5489b8f7fc-95jqg -n openshift-secondary-scheduler-operator
I0412 12:02:29.765726       1 scheduler.go:675] "Successfully bound pod to node" pod="knarra/no-annotation-secondary" node="knarra04122-ld7s6-worker-c-2lbf8.c.openshift-qe.internal" evaluatedNodes=6 feasibleNodes=3

Comment 4 errata-xmlrpc 2022-08-10 11:03:38 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
