pgsql is unavailable #432

Closed
goern opened this issue Sep 7, 2023 · 2 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.

Comments

goern commented Sep 7, 2023

Describe the bug
https://paddock.b4mad.racing/ returns an HTTP 500. The log says:

    self._fetch_all()
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/models/query.py", line 1881, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/models/query.py", line 91, in __iter__
    results = compiler.execute_sql(
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/models/sql/compiler.py", line 1560, in execute_sql
    cursor = self.connection.cursor()
  File "/opt/app-root/lib64/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/backends/base/base.py", line 330, in cursor
    return self._cursor()
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/backends/base/base.py", line 306, in _cursor
    self.ensure_connection()
  File "/opt/app-root/lib64/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/backends/base/base.py", line 288, in ensure_connection
    with self.wrap_database_errors:
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/backends/base/base.py", line 289, in ensure_connection
    self.connect()
  File "/opt/app-root/lib64/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/backends/base/base.py", line 270, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/opt/app-root/lib64/python3.10/site-packages/django_prometheus/db/backends/postgresql/base.py", line 9, in get_new_connection
    conn = super().get_new_connection(*args, **kwargs)
  File "/opt/app-root/lib64/python3.10/site-packages/django_prometheus/db/common.py", line 45, in get_new_connection
    return super().get_new_connection(*args, **kwargs)
  File "/opt/app-root/lib64/python3.10/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/app-root/lib64/python3.10/site-packages/django/db/backends/postgresql/base.py", line 275, in get_new_connection
    connection = self.Database.connect(**conn_params)
  File "/opt/app-root/lib64/python3.10/site-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: connection to server at "db-primary.b4mad-racing.svc" (172.30.33.72), port 5432 failed: Connection refused
Is the server running on that host and accepting TCP/IP connections?
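
The last line of the traceback says the TCP connection to `db-primary.b4mad-racing.svc` on port 5432 was refused, which points below Django and psycopg2. A minimal sketch for confirming that from a pod in the same namespace, using only the Python standard library. The helper name `can_connect` is hypothetical and not part of the project:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("db-primary.b4mad-racing.svc", 5432)` would be expected to return `False` while the database is down. A `True` result would suggest looking at credentials or the application configuration instead.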

Looking at the db-instance pod log:

2023-09-07 11:05:42.614 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:42.723 P00 DEBUG: common/io/http/request::httpRequestProcess: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:42.827 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:42.930 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:43.133 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:43.436 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:43.939 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:44.742 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:46.045 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
2023-09-07 11:05:46,122 INFO: Lock owner: db-instance-hnkw-0; I am db-instance-hnkw-0
2023-09-07 11:05:46,128 WARNING: manual failover: members list is empty
2023-09-07 11:05:46,128 INFO: updated leader lock during doing crash recovery in a single user mode
2023-09-07 11:05:48.150 P00 DEBUG: common/io/socket/client::sckClientOpen: retry HostConnectError: unable to get address for 'b4mad-racing-psql.192.169.178.22': [-2] Name or service not known
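
The name pgBackRest fails to resolve, `b4mad-racing-psql.192.169.178.22`, looks like a label joined to an IPv4 address. One possible reading, which is an assumption because the pgBackRest configuration is not shown here, is that a bucket name was prepended to an IP endpoint (host-style S3 addressing), producing a name that DNS cannot resolve. A small sketch that checks resolution and detects that shape. Both function names are hypothetical:

```python
import ipaddress
import socket
from typing import Optional, Tuple


def resolves(hostname: str) -> bool:
    """Return True if the system resolver can resolve hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised for "Name or service not known" and similar resolver errors.
        return False


def split_label_from_ip(hostname: str) -> Optional[Tuple[str, str]]:
    """If hostname has the shape '<label>.<IPv4>', return (label, ip), else None."""
    parts = hostname.split(".")
    if len(parts) < 5:
        # Need at least one label in front of the four IPv4 octets.
        return None
    candidate_ip = ".".join(parts[-4:])
    try:
        ipaddress.IPv4Address(candidate_ip)
    except ValueError:
        return None
    return ".".join(parts[:-4]), candidate_ip
```

For the name in the log, `split_label_from_ip` returns the pair `("b4mad-racing-psql", "192.169.178.22")`, while an ordinary service name such as `db-primary.b4mad-racing.svc` yields `None`.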

pgbackrest/pgbackrest#1778 might be related?!

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://paddock.b4mad.racing/ and see the HTTP 500.
  2. Read the paddock pod log: https://console-openshift-console.apps.phobos.b4mad.emea.operate-first.cloud/k8s/ns/b4mad-racing/pods/paddock-377-r66vk/logs
  3. Read the db-instance pod log: https://console-openshift-console.apps.phobos.b4mad.emea.operate-first.cloud/k8s/ns/b4mad-racing/pods/db-instance-hnkw-0/logs

Expected behavior
HTTP/200

Screenshots
n/a

Additional context
n/a

/priority critical-urgent
/assign durandom

goern added the kind/bug label Sep 7, 2023
op1st-prow bot added the priority/critical-urgent label Sep 7, 2023

goern commented Sep 7, 2023

durandom commented Sep 7, 2023

Yes, same issue. I'll post the steps to fix this in #245.

durandom closed this as completed Sep 7, 2023