Robot crashes and reboots after canceling a protocol #15729

Open
jonasjonker opened this issue Jul 22, 2024 · 0 comments

jonasjonker commented Jul 22, 2024

Overview

Our lab personnel report that the Opentrons robot crashes and reboots with increasing frequency. At present, the robot crashes every time a protocol run is cancelled.

Steps to reproduce

Cancel a protocol run at any moment.

Current behavior

After the run is cancelled, the pipette returns to the trash bin, and then the robot crashes and reboots.

Expected behavior

The run is cancelled gracefully, without the robot crashing or rebooting.

Operating system

Windows

System and robot setup or anything else?

App Version

Both the app and the robot run version 7.3.1. I believe the lab personnel updated from version 7.1.1 in the hope that this would fix the error, but the problem already existed before the update.

Connection

We connect to the robot over USB.

Context

I took a quick look at the command line and logs through the terminal available in the built-in Jupyter server.

I noted the following: according to df, / is full. Directories such as /var and /mnt are on separate mounts and are not full.
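The df check can be repeated from a notebook cell on the Jupyter server. This is a minimal sketch using only the Python standard library; the list of paths is taken from the description above, and the percentage may differ slightly from df's Use% column because df accounts for reserved blocks.

```python
import shutil


def usage_report(paths):
    """Return {path: percent_used} for every path that can be inspected."""
    report = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            # Path missing or not accessible: skip it rather than fail.
            continue
        percent = 100 * usage.used / usage.total if usage.total else 0.0
        report[path] = round(percent, 1)
    return report


if __name__ == "__main__":
    # Mount points mentioned in this report; adjust as needed.
    for path, percent in usage_report(["/", "/var", "/mnt"]).items():
        print(f"{path}: {percent}% used")
```

A value at or near 100 for / would match what df showed.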

I found some errors using journalctl that might be related:

$ journalctl -b -1 -e | grep -i 'err'
File "/usr/lib/python3.10/site-packages/opentrons/drivers/asyncio/communication/serial_connection.py", line 249, in raise_on_error
opentrons.drivers.asyncio.communication.errors.AlarmResponse: /dev/ttyAMA0: 'Received error response 'L:703330305F6D756C74695F76322E310000000000000000000000000000000000okALARM: Kill button pressed - reset or M999 to continue'
opentrons.drivers.smoothie_drivers.errors.SmoothieAlarm: SmoothieAlarm: None returned L:703330305F6D756C74695F76322E310000000000000000000000000000000000okALARM: Kill button pressed - reset or M999 to continue
Jul 22 09:39:45 OT2CEP20230303R02 uvicorn[196]: [generated in 0.00292s] ('{"status": "failed", "errors": [{"id": "14e0c36a-00e3-4fbd-ab3b-6c7fffe639d8", "createdAt": "2024-07-22T09:39:45.385545+00:00", "errorCode": "4000",  ... (4447 characters truncated) ... cation": {"slotName": "12"}}], "pipettes": [], "modules": [], "labwareOffsets": [], "completedAt": "2024-07-22T09:39:39.089512+00:00", "liquids": []}', <EngineStatus.FAILED: 'failed'>, '2024-07-22 09:39:45.409968', '[]', 'a00ac7c3-f8ca-4601-be8b-708743f5829f')
Jul 22 09:39:52 OT2CEP20230303R02 uvicorn[196]: [cached since 6.831s ago] ('{"status": "failed", "errors": [{"id": "14e0c36a-00e3-4fbd-ab3b-6c7fffe639d8", "createdAt": "2024-07-22T09:39:45.385545+00:00", "errorCode": "4000",  ... (4447 characters truncated) ... cation": {"slotName": "12"}}], "pipettes": [], "modules": [], "labwareOffsets": [], "completedAt": "2024-07-22T09:39:39.089512+00:00", "liquids": []}', <EngineStatus.FAILED: 'failed'>, '2024-07-22 09:39:52.249090', '[]', 'a00ac7c3-f8ca-4601-be8b-708743f5829f')
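The long hex string after L: in the alarm response looks like NUL-padded ASCII. The sketch below decodes it under that assumption. The helper name is hypothetical, and reading the L: prefix as "left mount" is a guess, not something the log confirms. The sample uses the non-zero prefix of the payload from the log, followed by zero padding.

```python
import re


def decode_smoothie_payload(line):
    """Decode the hex run that follows 'L:' as NUL-padded ASCII text.

    Returns None when the line contains no such payload.
    """
    match = re.search(r"L:([0-9A-Fa-f]+)", line)
    if match is None:
        return None
    hex_text = match.group(1)
    # bytes.fromhex needs an even number of digits; drop a dangling nibble.
    hex_text = hex_text[: len(hex_text) - len(hex_text) % 2]
    raw = bytes.fromhex(hex_text)
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


# Non-zero part of the payload from the journal, plus zero padding.
sample = (
    "Received error response 'L:"
    + "703330305F6D756C74695F76322E31"
    + "00" * 17
    + "okALARM: Kill button pressed - reset or M999 to continue'"
)
print(decode_smoothie_payload(sample))  # p300_multi_v2.1
```

The decoded text resembles a pipette model name, which suggests the alarm was raised while the driver was reading data from the pipette, although the log alone does not prove that.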

About a second before this error I saw a potentially related uvicorn error:

● opentrons-robot-server.service - Opentrons Robot HTTP Server
     Loaded: loaded (/etc/systemd/system/opentrons-robot-server.service; enabled
     Active: active (running) since Mon 2024-07-22 09:41:27 UTC; 58min ago
   Main PID: 194 (uvicorn)
     CGroup: /system.slice/opentrons-robot-server.service
             └─194 /usr/bin/python /usr/bin/uvicorn robot_server.app:app --u
Jul 22 10:38:52 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
Jul 22 10:39:07 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
Jul 22 10:39:07 OT2CEP20230303R02 uvicorn[194]:  - "GET /sessions HTTP/1.1" 200
Jul 22 10:39:08 OT2CEP20230303R02 uvicorn[194]: Error response: 403 - LegacyErro
Jul 22 10:39:08 OT2CEP20230303R02 uvicorn[194]:  - "GET /subsystems/updates/curr
Jul 22 10:39:22 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
Jul 22 10:39:37 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
Jul 22 10:39:52 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
Jul 22 10:40:07 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
Jul 22 10:40:22 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200
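To line up entries like these with the alarm from the previous boot, the journal lines can be parsed into timestamps and filtered to a window around a chosen moment. This is a sketch that assumes journalctl's default short output format; the helper names are hypothetical, and the year is an assumption because that format omits it.

```python
import re
from datetime import datetime, timedelta

# Default short journal format: "Jul 22 10:39:08 HOST unit[pid]: message"
LINE_RE = re.compile(
    r"^(?P<stamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<unit>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: ?(?P<message>.*)$"
)


def parse_journal_line(line, year=2024):
    """Split one journal line into fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    when = datetime.strptime(f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S")
    return {
        "time": when,
        "host": match["host"],
        "unit": match["unit"],
        "pid": match["pid"],
        "message": match["message"].strip(),
    }


def entries_near(lines, center, seconds=5):
    """Return parsed entries whose timestamp is within +/- seconds of center."""
    window = timedelta(seconds=seconds)
    parsed = (parse_journal_line(line) for line in lines)
    return [entry for entry in parsed if entry and abs(entry["time"] - center) <= window]


sample_lines = [
    'Jul 22 10:38:52 OT2CEP20230303R02 uvicorn[194]:  - "GET /health HTTP/1.1" 200',
    'Jul 22 10:39:07 OT2CEP20230303R02 uvicorn[194]:  - "GET /sessions HTTP/1.1" 200',
    "Jul 22 10:39:08 OT2CEP20230303R02 uvicorn[194]: Error response: 403 - LegacyErro",
]
for entry in entries_near(sample_lines, datetime(2024, 7, 22, 10, 39, 8), seconds=1):
    print(entry["time"], entry["unit"], entry["message"])
```

Feeding the function the output of journalctl -b -1 with a center of 09:39:45 would show what each service logged in the seconds before the alarm.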

The logs downloaded from the robot don't seem to show any errors around the time of the crash:

$ cat combined.log | grep -iP ':0?9:39|:0?9:4[0-9]' -C 5
{"level":"debug","message":"Received action from main via IPC","timestamp":"2024-07-22T10:09:26.579Z","label":"renderer","meta":{"actionType":"robotUpdate:UPDATE_INFO"}}
{"level":"debug","message":"Received action from main via IPC","timestamp":"2024-07-22T10:09:26.580Z","label":"renderer","meta":{"actionType":"robotUpdate:UPDATE_VERSION"}}
{"level":"debug","message":"Files in robot update download directory","timestamp":"2024-07-22T10:09:26.581Z","label":"robotUpdate/release-files","meta":{"files":["ot3-system.zip","release-notes.md"]}}
{"level":"debug","message":"Received action from main via IPC","timestamp":"2024-07-22T10:09:26.584Z","label":"renderer","meta":{"actionType":"robotUpdate:UPDATE_INFO"}}
{"level":"debug","message":"No Flex serial port found.","timestamp":"2024-07-22T10:09:32.594Z","label":"usb","meta":{}}
{"level":"debug","message":"No Flex serial port found.","timestamp":"2024-07-22T10:09:42.608Z","label":"usb","meta":{}}
{"level":"debug","message":"No Flex serial port found.","timestamp":"2024-07-22T10:09:52.622Z","label":"usb","meta":{}}
{"level":"debug","message":"No Flex serial port found.","timestamp":"2024-07-22T10:10:02.623Z","label":"usb","meta":{}}
{"level":"debug","message":"No Flex serial port found.","timestamp":"2024-07-22T10:10:12.626Z","label":"usb","meta":{}}
{"level":"debug","message":"Sending mDNS discovery query","timestamp":"2024-07-22T10:10:13.719Z","label":"discovery","meta":{}}
{"level":"debug","message":"No Flex serial port found.","timestamp":"2024-07-22T10:10:22.628Z","label":"usb","meta":{}}
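One caveat about the filter above: the pattern requires a leading colon, so on ISO timestamps such as 2024-07-22T10:09:42.608Z it matches the minutes and seconds (:09:42) rather than the hour 09. That would explain why every line shown is from 10:09 or 10:10. Parsing the timestamp field is more reliable than a regular expression. The sketch below assumes the file is JSON lines with a UTC timestamp field, as in the excerpt; the helper names and the chosen window are assumptions.

```python
import json
from datetime import datetime, timezone


def parse_iso_z(stamp):
    """Parse an ISO 8601 timestamp that ends in 'Z' into an aware datetime."""
    # datetime.fromisoformat on Python 3.10 and older rejects a trailing 'Z'.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def entries_between(lines, start, end):
    """Return the JSON records whose 'timestamp' lies in [start, end]."""
    selected = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict) or "timestamp" not in record:
            continue
        if start <= parse_iso_z(record["timestamp"]) <= end:
            selected.append(record)
    return selected


if __name__ == "__main__":
    # Window around the alarm seen in the journal (09:39:45 UTC).
    start = datetime(2024, 7, 22, 9, 35, tzinfo=timezone.utc)
    end = datetime(2024, 7, 22, 9, 45, tzinfo=timezone.utc)
    try:
        with open("combined.log", encoding="utf-8") as handle:
            for record in entries_between(handle, start, end):
                print(record["timestamp"], record["level"], record["message"])
    except FileNotFoundError:
        print("combined.log not found in the current directory")
```

The labels in the excerpt (renderer, usb, discovery) also suggest that combined.log may come from the desktop app rather than from the robot itself, in which case the robot-side journal is the more relevant source.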

Can somebody help me find and fix the problem?
