Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reopen postgresql db connection to be more robust #12

Open
bernhardreiter opened this issue Jun 9, 2017 · 7 comments
Open

Reopen postgresql db connection to be more robust #12

bernhardreiter opened this issue Jun 9, 2017 · 7 comments

Comments

@bernhardreiter
Copy link
Member

ffb268e v0.5.1 will fail if a database connection is closed.
Apache's mod_wsgi may restart the processes onces in the while, so there is some self healing.

Simple test to reproduce:
systemctl restart postgresql.service in a running contactdb_api results into

cur = contactdb_conn.cursor(cursor_factory=RealDictCursor)
psycopg2.InterfaceError: connection already closed

Technical idea:

Use psycopg2.pool.ThreadedConnectionPool().

This might also help to speed up access and use multiple threads with wsgi,
which would need an update in the Apache config instructions.

@bernhardreiter
Copy link
Member Author

Did some planning and also talked to @bernhard-herzog:

This is a task which requires some code changes.
Thoughts:

  • Ideally some db related functions are put in a common utility library (see Reduce code duplication in backend parts #13)
  • A connection (and cursor and transaction) should be acquired once for request.
  • Even when using psycopg2.pool.ThreadedConnectionPool() there is going to be a global variable for the pool instance.

To make the code easier to read it would be good to use a function to get new connections.
One way is to pass the connection or cursor object down to the called functions.
If we want this to be implicit, someone could use a class which has the connection object
as attribute.

In addtion a @contextmanager could make the code more readable.

@bernhardreiter
Copy link
Member Author

Considered a workaround to limit the time an apache WSGI process runs, to make the process self healing. A simple change to modify the 001-fody.conf to read

WSGIDaemonProcess www-fody threads=1 maximum-requests=10000 restart-interval=3600 graceful-timeout=30

does not work on Ubuntu 16.04LTS, because it would require at least http://modwsgi.readthedocs.io/en/latest/release-notes/version-4.5.12.html#new-features
which we don't have.

@bernhardreiter
Copy link
Member Author

bernhardreiter commented Mar 18, 2019

For the evaluation of this issue: the operating instructions (currently part of the README.md) already state that the web server process has to be restarted if the database is to be restarted. Other cases so far were not reported in production setting. It is recommended to have log-monitoring in place for a fully automated system, so this should raise an alert in this (or other cases that are more likely).

@ghost
Copy link

ghost commented Mar 18, 2019

It is recommended to have log-monitoring in place for a fully automated system, so this should raise an alert in this (or other cases that are more likely).

That need administrator attention and manual intervention. However, it can be handled automatically.

@bernhardreiter
Copy link
Member Author

@wagner-certat we seem to agree that the current behavior can be improved. Which is the purpose of this issue after all. ;-) My remark was helping to prioritize it.

@wagner-intevation
Copy link
Member

This is still an issue. If the DB was gone and is back up, the backend continues throwing 500s:

intelmq-fody-backend    | 172.28.0.1 - - [18/Jul/2022 14:05:19] "POST /api/login/ HTTP/1.1" 200 72
intelmq-fody-backend    | Traceback (most recent call last):
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 949, in searchcidr
intelmq-fody-backend    |     query_results = __db_query_organisation_ids("""
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 259, in __db_query_organisation_ids
intelmq-fody-backend    |     description, results = _db_query(operation_str.format(""), parameters)
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 207, in _db_query
intelmq-fody-backend    |     cur = contactdb_conn.cursor(cursor_factory=RealDictCursor)
intelmq-fody-backend    | psycopg2.InterfaceError: connection already closed
intelmq-fody-backend    | 
intelmq-fody-backend    | During handling of the above exception, another exception occurred:
intelmq-fody-backend    | 
intelmq-fody-backend    | Traceback (most recent call last):
intelmq-fody-backend    |   File "/usr/lib/python3.8/wsgiref/handlers.py", line 137, in run
intelmq-fody-backend    |     self.result = application(self.environ, self.start_response)
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/falcon/api.py", line 269, in __call__
intelmq-fody-backend    |     responder(req, resp, **params)
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 947, in __call__
intelmq-fody-backend    |     raise exception
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 918, in __call__
intelmq-fody-backend    |     self.call_function(input_parameters), context, request, response, **kwargs
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 840, in call_function
intelmq-fody-backend    |     return self.interface(**parameters)
intelmq-fody-backend    |   File "/usr/local/lib/python3.8/dist-packages/hug/interface.py", line 129, in __call__
intelmq-fody-backend    |     return __hug_internal_self._function(*args, **kwargs)
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 967, in searchcidr
intelmq-fody-backend    |     __commit_transaction()
intelmq-fody-backend    |   File "/opt/intelmq-fody-backend/contactdb_api/contactdb_api/serve.py", line 162, in __commit_transaction
intelmq-fody-backend    |     contactdb_conn.commit()
intelmq-fody-backend    | psycopg2.InterfaceError: connection already closed

@wagner-intevation
Copy link
Member

This is what a DB restart (recovery after error in WAL) caused for the ticket api:

2024-07-29 13:45:46 [FALCON] [ERROR] GET /api/tickets/stats?sent-at_after=2024-07-29&sent-at_before=2024-07-30 => Traceback (most recent call last):, referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 818, in stats, referer: https://automatix.cert-bund.de:8000/
    results = query(prep), referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 558, in query, referer: https://automatix.cert-bund.de:8000/
    cur = eventdb_conn.cursor(cursor_factory=RealDictCursor), referer: https://automatix.cert-bund.de:8000/
psycopg2.InterfaceError: connection already closed, referer: https://automatix.cert-bund.de:8000/
, referer: https://automatix.cert-bund.de:8000/
During handling of the above exception, another exception occurred:, referer: https://automatix.cert-bund.de:8000/
, referer: https://automatix.cert-bund.de:8000/
Traceback (most recent call last):, referer: https://automatix.cert-bund.de:8000/
  File "falcon/app.py", line 361, in falcon.app.App.__call__, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 947, in hug.interface.HTTP.__call__, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 918, in hug.interface.HTTP.__call__, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 840, in hug.interface.HTTP.call_function, referer: https://automatix.cert-bund.de:8000/
  File "hug/interface.py", line 129, in hug.interface.Interfaces.__call__, referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 825, in stats, referer: https://automatix.cert-bund.de:8000/
    __rollback_transaction(), referer: https://automatix.cert-bund.de:8000/
  File "/usr/lib/python3/dist-packages/tickets_api/tickets_api/serve.py", line 133, in __rollback_transaction, referer: https://automatix.cert-bund.de:8000/
    eventdb_conn.rollback(), referer: https://automatix.cert-bund.de:8000/
psycopg2.InterfaceError: connection already closed, referer: https://automatix.cert-bund.de:8000/
, referer: https://automatix.cert-bund.de:8000/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants