Bug 2040136 - external-dns-operator pod keeps restarting and reports error: timed out waiting for cache to be synced
Summary: external-dns-operator pod keeps restarting and reports error: timed out waiti...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Andrey Lebedev
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-13 03:59 UTC by Hongan Li
Modified: 2022-08-04 22:39 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-08 16:03:07 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift external-dns-operator pull 112 | 0 | None | open | Bug 2040136: notice about the extra roles | 2022-01-13 10:18:45 UTC
Red Hat Product Errata | RHEA-2022:0781 | 0 | None | None | None | 2022-03-08 16:03:14 UTC

Description Hongan Li 2022-01-13 03:59:06 UTC
Description of problem:
external-dns-operator pod keeps restarting and reports error: timed out waiting for cache to be synced

OpenShift release version:
4.10.0-0.nightly-2022-01-11-065245
external-dns-operator.v0.1.2

Cluster Platform:
AWS

How reproducible:
100%

Steps to Reproduce (in detail):
1. Install the ExternalDNS operator via Console > OperatorHub
2. Check the operator pod status
$ oc -n external-dns-operator get pod
NAME                                     READY   STATUS    RESTARTS      AGE
external-dns-operator-67479bcfd8-x2kdc   2/2     Running   5 (91s ago)   13m


Actual results:
$ oc -n external-dns-operator logs external-dns-operator-67479bcfd8-x2kdc -c operator -f
<---snip--->
E0113 03:41:42.998147       1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:245: Failed to watch *v1.Deployment: failed to list *v1.Deployment: deployments.apps is forbidden: User "system:serviceaccount:external-dns-operator:external-dns-operator" cannot list resource "deployments" in API group "apps" in the namespace "external-dns"
E0113 03:41:56.953632       1 reflector.go:138] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers_map.go:245: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:external-dns-operator:external-dns-operator" cannot list resource "secrets" in API group "" in the namespace "external-dns"
2022-01-13T03:42:15.222Z	ERROR	controller-runtime.manager.controller.credentials_secret_controller	Could not wait for Cache to sync	{"error": "failed to wait for credentials_secret_controller caches to sync: timed out waiting for cache to be synced"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/remote-source/workspace/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
	/remote-source/workspace/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:221
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1
	/remote-source/workspace/app/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:696
2022-01-13T03:42:15.222Z	ERROR	controller-runtime.manager.controller.external_dns_controller	Could not wait for Cache to sync	{"error": "failed to wait for external_dns_controller caches to sync: timed out waiting for cache to be synced"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	/remote-source/workspace/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:195
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
	/remote-source/workspace/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:221
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1
	/remote-source/workspace/app/vendor/sigs.k8s.io/controller-runtime/pkg/manager/internal.go:696
2022-01-13T03:42:15.223Z	ERROR	controller-runtime.manager	error received after stop sequence was engaged	{"error": "failed to wait for external_dns_controller caches to sync: timed out waiting for cache to be synced"}
2022-01-13T03:42:15.223Z	INFO	controller-runtime.webhook	shutting down webhook server
2022-01-13T03:42:15.224Z	ERROR	setup	failed to start externaldns operator	{"error": "failed to wait for credentials_secret_controller caches to sync: timed out waiting for cache to be synced"}
runtime.main
	/usr/lib/golang/src/runtime/proc.go:225
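
The two "forbidden" errors above indicate that the operator's service account cannot list deployments or secrets in the "external-dns" namespace, which is why the caches never sync. A minimal sketch for confirming this from the CLI follows. It assumes the caller is allowed to impersonate service accounts (cluster-admin usually is), and the function name check_operator_rbac is hypothetical, not part of the operator. Checking "watch" in addition to "list" is also an assumption, based on informers needing both.

```shell
# Hypothetical helper: reports which list/watch permissions the operator's
# service account is missing in the operand namespace (default: external-dns).
check_operator_rbac() {
  ns="${1:-external-dns}"
  # Service account name taken from the "forbidden" log lines.
  sa="system:serviceaccount:external-dns-operator:external-dns-operator"
  missing=0
  for res in deployments.apps secrets; do
    for verb in list watch; do
      # "oc auth can-i" exits non-zero when the answer is "no".
      if ! oc auth can-i "$verb" "$res" -n "$ns" --as="$sa" >/dev/null 2>&1; then
        echo "missing: $verb $res in $ns"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

Calling check_operator_rbac with no arguments checks the "external-dns" namespace; a non-zero exit status means at least one permission is missing.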


Expected results:
The external-dns-operator pod should run without restarting.

Impact of the problem:


Additional info:




Comment 2 Miciah Dashiel Butler Masters 2022-01-13 12:32:55 UTC
Setting blocker- as this is a documentation issue and isn't critical, but it would be good to get this in the 4.10.0 release, so we'll try to get Andrey's PR merged ASAP.

Comment 3 Hongan Li 2022-01-14 08:01:02 UTC
After running the manual steps below and then installing the operator via OperatorHub, the external-dns-operator pod works well.

$ oc create ns external-dns
$ oc apply -f https://raw.githubusercontent.com/openshift/external-dns-operator/main/config/rbac/extra-roles.yaml

Comment 6 Hongan Li 2022-01-19 06:22:19 UTC
Verified with 4.10.0-0.nightly-2022-01-18-044014 and external-dns-operator.v0.1.2. The "Prerequisites and Requirements" section is now shown as below:

The ExternalDNS Operator has minimal cluster level permissions. Unfortunately OLM doesn't allow the local role creation in the operand namespace yet. Therefore make sure the following commands are run against your cluster before the installation of the operator:

Create the operand namespace: oc create ns external-dns
Grant the operator access to manage operand resources: oc apply -f https://raw.githubusercontent.com/openshift/external-dns-operator/main/config/rbac/extra-roles.yaml
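
The two prerequisite commands above can be wrapped in a small helper that is safe to re-run. This is only a sketch: the function name prepare_externaldns_operand is hypothetical, and the existence check with "oc get ns" is an added convenience, not something the operator documentation prescribes.

```shell
# Hypothetical helper: runs the documented prerequisites before the operator
# is installed from OperatorHub.
prepare_externaldns_operand() {
  ns="external-dns"
  roles_url="https://raw.githubusercontent.com/openshift/external-dns-operator/main/config/rbac/extra-roles.yaml"
  # Create the operand namespace only if it does not exist yet.
  if ! oc get ns "$ns" >/dev/null 2>&1; then
    oc create ns "$ns" || return 1
  fi
  # Grant the operator access to operand resources; "oc apply" can be repeated.
  oc apply -f "$roles_url"
}
```

Run prepare_externaldns_operand while logged in with sufficient privileges, then install the operator.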

Comment 9 errata-xmlrpc 2022-03-08 16:03:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of ExternalDNS Operator on OperatorHub), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:0781

