openQA pitfalls

Table of Contents

Needle editing
403 messages when using scripts
Mixed production and development environment
Performance impact
DB migration from SQlite to postgreSQL
Steps to debug developer mode setup
- Configure firewalld

Needle editing

If a new needle is created based on a failed test, the new needle will not be listed in old tests. However, when opening the needle editor, a warning about the new needle will be shown and it can be selected as base.
If an existing needle is updated with a new image or different areas, the old test will display the new needle which might be confusing.
If a needle is deleted, old tests may display an error when viewing them in the web UI.

403 messages when using scripts

If you come across messages displaying ERROR: 403 - Forbidden, make sure that the correct API key is present in client.conf file.
If you are using a hostname other than localhost, pass --host foo to the script.
If you are using fake authentication method, and the message says also "api key expired" you can simply logout and log in again in the webUI and the expiration will be automatically updated

Mixed production and development environment

There are few things to take into account when running a development version and a packaged version of openqa:

If the setup for the development scenario involves sharing /var/lib/openqa, it would be wise to have a shared group openqa, that will have write and execute permissions over said directory, so that geekotest user and the normal development user can share the environment without problems.

This approach will lead to a problem when the openqa package is updated, since the directory permissions will be changed again, nothing a chmod -R g+rwx /var/lib/openqa/ and chgrp -R openqa /var/lib/openqa can not fix.

Performance impact

openQA workers can cause high I/O load, especially when creating VM snapshots. The impact therefore gets more severe when MAKETESTSNAPSHOTS is enabled. should not impact the stability of openQA jobs but can increase job execution time. If you run jobs on a machine where responsiveness of other services matter, for example your desktop machine, consider patching the IOSchedulingPriority of a workers service file as described in the systemd documentation, for example set IOSchedulingPriority=7 for the lowest priority. If not available then you can try to execute the worker processes with ionice to reduce the risk of your system becoming significantly impacted by snapshot creation. Loading VM snapshots can also have an impact on SUT behavior as the execution of the first step after loading a snapshot might be delayed. This can lead to problems if the executed tests do not foresee an appropriate timeout margin.

On some less powerful systems (like Raspberry Pi), reducing screenshots size with optipng may take significant amount of time. In this case, you can switch to pngquant which is significantly faster, but uses lossy compression. To do that, install pngquant package and set USE_PNGQUANT=1 in worker or job settings.

DB migration from SQlite to postgreSQL

As a first step to start using postgreSQL, please, configure postgreSQL database according to the postgreSQL setup guide

To migrate api keys run following commands:

Export data from the SQlite db:

sqlite3 db.sqlite -csv -separator ',' 'select * from api_keys;' > apikeys.csv

Note: SQlite database file is located in /var/lib/openqa/db by default.

Import data to the postgreSQL

# openqa is the postgreSQL database name and apikeys.csv is api keys export file
psql -U postgres -d openqa -c "copy api_keys from 'apikeys.csv' with (format csv);"

In case you need to migrate job groups, test suites, use openqa-dump-templates and openqa-load-templates scripts accordingly.

Steps to debug developer mode setup

This is basically a checklist to go through in case the developer mode is broken in your setup (e.g. you are getting the error message unable to upgrade ws to command server):

Be sure to have everything up to date. That includes relevant packages on the machine hosting the web UI and on the worker.
Check whether the web browser can reach the livehandler daemon. Go to a running test and open the live view. Then open the JavaScript console of the web browser. If it contains messages like Received message via ws proxy: … the livehandler daemon can be reached. Otherwise, try the following sub-steps:
1. The installation guide has been updated to cover the developer mode. In case you installed your instance before the developer mode has been introduced, make sure that the Apache module rewrite is enabled (via a2enmod rewrite). Also be sure the vhost configuration looks like the one found in the openQA Git repository (especially the part for the reverse proxies).
2. Check whether openqa-livehandler.service is running. It is supposed to be run on the same machine as the web UI and should actually be started automatically as a dependency of openqa-webui.service.
Check whether the livehandler can reach the os-autoinst command server. Go to a running test and open the live view. Then open the JavaScript console of the web browser. If it contains messages like Received message via ws proxy: {…,"type":"info","what":"cmdsrvmsg"} the os-autoinst command server can be reached. Otherwise there should be at least a message like Received message via ws proxy: {"what":"connecting to os-autoinst command server at ws:\/\/host:20053\/xhB84lUuPlMfhDEF\/ws",…} which contains the URL the livehandler is attempting to query. In this case try the following sub-steps:
1. If the host is wrong, add WORKER_HOSTNAME = correcthost to workers.ini. The worker should then tell the web UI that it is reachable via correcthost resulting in a correct URL for the os-autoinst command server. Be sure the setting appears after the [global] section header.
2. It might also be the case that the firewall is blocking the HTTP/websocket connection on the required port. The required port is QEMUPORT plus 1. By default, QEMUPORT is set to $worker_instance_number * 10 + 20002 by the worker. So to cover worker slots from 1 to 10 one would by default require the ports 20013, 20023, … and 20103.
It can also help to look at the architecture diagram which shows which component needs access to which other components and the ports used by default.

Configure firewalld

As mentioned, certain ports are required to be open on the worker host. If you are using firewalld and the ports are blocked, it is fairly simple to open them:

# add new service "isotovideo" with ports covering 50 worker slots as explained in debugging steps above
firewall-cmd --new-service isotovideo
for i in {1..50}; do firewall-cmd --service=isotovideo --add-port=$((i * 10 + 20003))/tcp ; done
# allow the service in your zone (public in this example) right now
firewall-cmd --zone=public --add-service=isotovideo
# ensure the settings to be effective after firewall or host restarts
firewall-cmd --runtime-to-permanent

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pitfalls.asciidoc

Pitfalls.asciidoc

openQA pitfalls

Needle editing

403 messages when using scripts

Mixed production and development environment

Performance impact

DB migration from SQlite to postgreSQL

Steps to debug developer mode setup

Configure firewalld

Files

Pitfalls.asciidoc

Latest commit

History

Pitfalls.asciidoc

File metadata and controls

openQA pitfalls

Needle editing

403 messages when using scripts

Mixed production and development environment

Performance impact

DB migration from SQlite to postgreSQL

Steps to debug developer mode setup

Configure firewalld