turtlebot4 service started ros-jazzy-depthai_driver for oak-d-lite non-responsive at 100% CPU - how to debug? #657

Open
slowrunner opened this issue Jan 20, 2025 · 5 comments
Labels
question Further information is requested

Comments

slowrunner commented Jan 20, 2025

The Oak-D-Lite driver launched by turtlebot4.service became non-responsive and was taking 100% CPU after 23 hours of operation.

Robot Model
Turtlebot4 Lite

ROS distro
Jazzy

Networking Configuration
Discovery Server

OS
Ubuntu 24.04

Built from source or installed?
Installed

Package version
Turtlebot4 OS: Ubuntu 24.04.1 Server
ROS: Jazzy
Turtlebot4 Processor: Raspberry Pi 5 8GB
Create3: (Iron) I.1.0.0.fast_dds

Package: ros-jazzy-turtlebot4-bringup
Status: install ok installed
Priority: optional
Section: misc
Installed-Size: 81
Maintainer: rkreinin [email protected]
Architecture: arm64
Version: 2.0.1-2noble.20241228.080750
Depends: ros-jazzy-create3-republisher, ros-jazzy-depthai-bridge, ros-jazzy-depthai-examples, ros-jazzy-depthai-ros-driver, ros-jazzy-depthai-ros-msgs, ros-jazzy-joy-linux, ros-jazzy-nav2-common, ros-jazzy-rplidar-ros, ros-jazzy-teleop-twist-joy (>= 2.6.1), ros-jazzy-tf2-ros, ros-jazzy-turtlebot4-description, ros-jazzy-turtlebot4-diagnostics, ros-jazzy-turtlebot4-node, ros-jazzy-ros-workspace
Description: Turtlebot4 Robot Bringup

Package: ros-jazzy-depthai-ros-driver
Status: install ok installed
Priority: optional
Section: misc
Installed-Size: 3426
Maintainer: Adam Serafin [email protected]
Architecture: arm64
Version: 2.10.3-1noble.20241228.074725

Type of issue
Camera

Expected behaviour
The camera was taking less than 5% of one core of the Raspberry Pi 5 for most of the first twenty-some hours after starting the turtlebot4 service, and the total power consumed by the Raspberry Pi 5 and USB-powered devices (LIDAR, speaker/mic, gamepad WiFi dongle) was a steady 5.4 W.

Expected to see Diagnostics taking 15%, oakd taking 3-5% as reported in htop, and RPi5 power to be around 5.4W and /oakd/rgb/preview/image_raw to be published.

Actual behaviour
oakd is consuming 100% of one core (htop), RPi5 power is 6.8 W, and /oakd/rgb/preview/image_raw is no longer being published.
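
Because the 100% figure belongs to a multi-threaded `component_container` process, it may help to find out which thread is spinning. The following is a minimal sketch, assuming the standard Linux `/proc/<pid>/task/<tid>/stat` layout (utime and stime are fields 14 and 15, in clock ticks). The function names are illustrative and are not part of the DepthAI or TurtleBot 4 tooling.

```python
import os


def parse_thread_stat(stat_text):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The name is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')' rather than on whitespace.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    name = stat_text[open_idx + 1:close_idx]
    rest = stat_text[close_idx + 1:].split()
    # rest[0] is field 3 (state), so utime (field 14) and stime (field 15)
    # are at indices 11 and 12.
    utime, stime = int(rest[11]), int(rest[12])
    return name, utime + stime


def thread_cpu_ticks(pid):
    """Map thread id -> (name, cumulative CPU ticks) for every thread of pid."""
    ticks = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                ticks[int(tid)] = parse_thread_stat(f.read())
        except FileNotFoundError:
            pass  # the thread exited between listdir() and open()
    return ticks
```

Sampling `thread_cpu_ticks(pid)` twice a few seconds apart and comparing the tick counts should show which thread is accumulating CPU time. The PID to pass would be the `component_container` entry in the `systemctl status turtlebot4.service` output.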

Image

Error messages
Don't know where to look or how to retrieve them.

To Reproduce
Ran the turtlebot4 service for 23 hours until I noticed the camera using excess CPU.

Other notes
Restarting the turtlebot4 service returned all processes to their expected state: the camera was at around 2% CPU and publishing at 30 Hz. RPi5 power consumption dropped to 6.3 W, but not to the 5.4 W seen prior.

Image

This is the normal startup of the camera:

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ systemctl status turtlebot4.service 
● turtlebot4.service - "bringup turtlebot4"
     Loaded: loaded (/usr/lib/systemd/system/turtlebot4.service; enabled; preset: enabled)
     Active: active (running) since Sun 2025-01-19 19:18:33 EST; 33min ago
   Main PID: 19799 (turtlebot4-star)
      Tasks: 175 (limit: 9375)
     Memory: 340.2M (peak: 360.1M)
        CPU: 12min 2.987s
     CGroup: /system.slice/turtlebot4.service
             ├─19799 /bin/bash /usr/sbin/turtlebot4-start
             ├─19905 /usr/bin/python3 /opt/ros/jazzy/bin/ros2 launch /tmp/turtlebot4.launch.py
             ├─19909 /opt/ros/jazzy/lib/turtlebot4_node/turtlebot4_node --ros-args -r __ns:=/ --params-file /tmp/tmp0hj5fzrr --params-file /tmp/launch_params_w_vn_aby
             ├─19910 /opt/ros/jazzy/lib/create3_republisher/create3_republisher --ros-args -r __ns:=/ --params-file /opt/ros/jazzy/share/turtlebot4_bringup/config/republisher.yaml --params-file /tmp/launch_params_bt7d95zo -r cmd_ve>
             ├─19911 /opt/ros/jazzy/lib/joy_linux/joy_linux_node --ros-args -r __node:=joy_linux_node -r __ns:=/ --params-file /tmp/launch_params_krkamcn1 -r /diagnostics:=diagnostics
             ├─19912 /opt/ros/jazzy/lib/teleop_twist_joy/teleop_node --ros-args -r __node:=teleop_twist_joy_node -r __ns:=/ --params-file /tmp/tmp06g75gkg --params-file /tmp/launch_params_x6g_qu0l
             ├─19913 /opt/ros/jazzy/lib/rplidar_ros/sllidar_node --ros-args -r __node:=rplidar_composition -r __ns:=/ --params-file /tmp/launch_params_vy8_531s
             ├─19914 /opt/ros/jazzy/lib/robot_state_publisher/robot_state_publisher --ros-args -r __node:=robot_state_publisher -r __ns:=/ --params-file /tmp/launch_params__x42d_j7 --params-file /tmp/launch_params_wxp01_hb -r /tf:=>
             ├─19915 /usr/bin/python3 /opt/ros/jazzy/lib/joint_state_publisher/joint_state_publisher --ros-args -r __node:=joint_state_publisher -r __ns:=/ --params-file /tmp/launch_params_l0lyhlxi -r /tf:=tf -r /tf_static:=tf_stat>
             ├─19916 /opt/ros/jazzy/lib/diagnostic_aggregator/aggregator_node --ros-args -r __ns:=/ --params-file /tmp/tmp3udngrki -r /diagnostics:=diagnostics -r /diagnostics_agg:=diagnostics_agg -r /diagnostics_toplevel_state:=di>
             ├─19917 /usr/bin/python3 /opt/ros/jazzy/lib/turtlebot4_diagnostics/diagnostics_updater --ros-args -r __ns:=/ -r /diagnostics:=diagnostics -r /diagnostics_agg:=diagnostics_agg -r /diagnostics_toplevel_state:=diagnostics>
             └─20059 /opt/ros/jazzy/lib/rclcpp_components/component_container --ros-args -r __node:=oakd_container -r __ns:=/

Jan 19 19:19:07 TB5WaLI turtlebot4-start[19905]: [INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/oakd' in container '/oakd_container'
Jan 19 19:19:08 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332348.224115734] [oakd]: Starting camera.
Jan 19 19:19:08 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332348.235341119] [oakd]: No ip/mxid specified, connecting to the next available device.
Jan 19 19:19:10 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332350.741441013] [oakd]: Camera with MXID: 184430101175A41200 and Name: 3.1 connected!
Jan 19 19:19:10 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332350.742821830] [oakd]: USB SPEED: HIGH
Jan 19 19:19:10 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332350.764267950] [oakd]: Device type: OAK-D-LITE
Jan 19 19:19:10 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332350.767215602] [oakd]: Pipeline type: RGB
Jan 19 19:19:11 TB5WaLI turtlebot4-start[19905]: [component_container-10] [WARN] [1737332351.886279182] [oakd]: IMU enabled but not available!
Jan 19 19:19:11 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332351.886453330] [oakd]: Finished setting up pipeline.
Jan 19 19:19:12 TB5WaLI turtlebot4-start[19905]: [component_container-10] [INFO] [1737332352.103666574] [oakd]: Camera ready!

Since the Oak-D-Lite is a Luxonis product and the ROS 2 driver was not written by Clearpath, I am submitting this here as well.

turtlebot/turtlebot4#523

slowrunner added the "question" (Further information is requested) label Jan 20, 2025
@slowrunner
Author

Not sure if it is pertinent: this morning when the TurtleBot4 undocked, it called the /oakd/start_camera service, but the camera did not resume publishing /oakd/rgb/preview/image_raw.

I manually called /oakd/stop_camera with success response,
and then manually called /oakd/start_camera but no response was returned and the camera did not restart.

I issued turtlebot4-service-restart and the camera became active again.

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ ros2 service call /oakd/stop_camera std_srvs/srv/Trigger "{}"
waiting for service to become available...
requester: making request: std_srvs.srv.Trigger_Request()

response:
std_srvs.srv.Trigger_Response(success=True, message='')

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ ros2 service call /oakd/start_camera std_srvs/srv/Trigger "{}"
waiting for service to become available...
requester: making request: std_srvs.srv.Trigger_Request()

^C

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ turtlebot4-service-restart
ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ systemctl status turtlebot4.service
● turtlebot4.service - "bringup turtlebot4"
Loaded: loaded (/usr/lib/systemd/system/turtlebot4.service; enabled; preset: enabled)
Active: active (running) since Mon 2025-01-20 14:09:34 EST; 11min ago
Main PID: 25530 (turtlebot4-star)
Tasks: 175 (limit: 9375)
Memory: 342.8M (peak: 360.4M)
CPU: 4min 25.786s
CGroup: /system.slice/turtlebot4.service
├─25530 /bin/bash /usr/sbin/turtlebot4-start
├─25627 /usr/bin/python3 /opt/ros/jazzy/bin/ros2 launch /tmp/turtlebot4.launch.py
├─25631 /opt/ros/jazzy/lib/turtlebot4_node/turtlebot4_node --ros-args -r __ns:=/ --params-file /tmp/tmp0x2eghp5 --params-file /tmp/launch_params_scwcpv89
├─25632 /opt/ros/jazzy/lib/create3_republisher/create3_republisher --ros-args -r __ns:=/ --params-file /opt/ros/jazzy/share/turtlebot4_bringup/config/republ>
├─25633 /opt/ros/jazzy/lib/joy_linux/joy_linux_node --ros-args -r __node:=joy_linux_node -r __ns:=/ --params-file /tmp/launch_params_jvt_7iju -r /diagnostic>
├─25634 /opt/ros/jazzy/lib/teleop_twist_joy/teleop_node --ros-args -r __node:=teleop_twist_joy_node -r __ns:=/ --params-file /tmp/tmp68fo1yes --params-file >
├─25635 /opt/ros/jazzy/lib/rplidar_ros/sllidar_node --ros-args -r __node:=rplidar_composition -r __ns:=/ --params-file /tmp/launch_params_m81ysdc0
├─25636 /opt/ros/jazzy/lib/robot_state_publisher/robot_state_publisher --ros-args -r __node:=robot_state_publisher -r __ns:=/ --params-file /tmp/launch_para>
├─25637 /usr/bin/python3 /opt/ros/jazzy/lib/joint_state_publisher/joint_state_publisher --ros-args -r __node:=joint_state_publisher -r __ns:=/ --params-file>
├─25638 /opt/ros/jazzy/lib/diagnostic_aggregator/aggregator_node --ros-args -r __ns:=/ --params-file /tmp/tmpill7l5ha -r /diagnostics:=diagnostics -r /diagn>
├─25639 /usr/bin/python3 /opt/ros/jazzy/lib/turtlebot4_diagnostics/diagnostics_updater --ros-args -r __ns:=/ -r /diagnostics:=diagnostics -r /diagnostics_ag>
└─25783 /opt/ros/jazzy/lib/rclcpp_components/component_container --ros-args -r __node:=oakd_container -r __ns:=/

Jan 20 14:10:07 TB5WaLI turtlebot4-start[25627]: [INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/oakd' in container '/oakd_container'
Jan 20 14:10:08 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400208.412826112] [oakd]: Starting camera.
Jan 20 14:10:08 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400208.424654868] [oakd]: No ip/mxid specified, connecting to the next available de>
Jan 20 14:10:10 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400210.933070670] [oakd]: Camera with MXID: 184430101175A41200 and Name: 3.1 connec>
Jan 20 14:10:10 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400210.933967375] [oakd]: USB SPEED: HIGH
Jan 20 14:10:10 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400210.955233513] [oakd]: Device type: OAK-D-LITE
Jan 20 14:10:10 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400210.957892609] [oakd]: Pipeline type: RGB
Jan 20 14:10:11 TB5WaLI turtlebot4-start[25627]: [component_container-10] [WARN] [1737400211.577994051] [oakd]: IMU enabled but not available!
Jan 20 14:10:11 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400211.578176106] [oakd]: Finished setting up pipeline.
Jan 20 14:10:11 TB5WaLI turtlebot4-start[25627]: [component_container-10] [INFO] [1737400211.800059205] [oakd]: Camera ready!

Then I retried the stop/start calls and they both succeeded with the preview stopping and restarting as expected.

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ ros2 service call /oakd/stop_camera std_srvs/srv/Trigger "{}"
waiting for service to become available...
requester: making request: std_srvs.srv.Trigger_Request()

response:
std_srvs.srv.Trigger_Response(success=True, message='')

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ ros2 service call /oakd/start_camera std_srvs/srv/Trigger "{}"
waiting for service to become available...
requester: making request: std_srvs.srv.Trigger_Request()

response:
std_srvs.srv.Trigger_Response(success=True, message='')

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$

Image

@slowrunner
Author

slowrunner commented Jan 21, 2025

Reproducing it seems to require only a /stop_camera call, although it does not happen every time:

  • Jan 21 13:35:50 "Camera Ready" (confirmed /oakd/rgb/preview/image_raw publishing)
  • Jan 21 13:38:46 /stop_camera service called (confirmed preview no longer publishing)
  • htop showing 100% cpu for:
      /opt/ros/jazzy/lib/rclcpp_components/component_container --ros-args -r __node:=oakd_container -r __ns:=/
  • sending /start_camera service yields no response:
ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ ros2 service call /oakd/start_camera std_srvs/srv/Trigger "{}"
waiting for service to become available...
requester: making request: std_srvs.srv.Trigger_Request()

^C
ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ systemctl status turtlebot4.service
● turtlebot4.service - "bringup turtlebot4"
     Loaded: loaded (/usr/lib/systemd/system/turtlebot4.service; enabled; preset: enabled)
     Active: active (running) since Tue 2025-01-21 13:35:12 EST; 5min ago
   Main PID: 34724 (turtlebot4-star)
      Tasks: 166 (limit: 9375)
     Memory: 340.6M (peak: 359.7M)
        CPU: 2min 36.204s
     CGroup: /system.slice/turtlebot4.service
             ├─34724 /bin/bash /usr/sbin/turtlebot4-start
             ├─34823 /usr/bin/python3 /opt/ros/jazzy/bin/ros2 launch /tmp/turtlebot4.launch.py
             ├─34827 /opt/ros/jazzy/lib/turtlebot4_node/turtlebot4_node --ros-args -r __ns:=/ --params-file /tmp/tmp4mipjfth --params-file /tmp/launch_params_6esi0nzr
             ├─34828 /opt/ros/jazzy/lib/create3_republisher/create3_republisher --ros-args -r __ns:=/ --params-file /opt/ros/jazzy/share/turtlebot4_bringup/config/republ>
             ├─34829 /opt/ros/jazzy/lib/joy_linux/joy_linux_node --ros-args -r __node:=joy_linux_node -r __ns:=/ --params-file /tmp/launch_params_38oikj6f -r /diagnostic>
             ├─34830 /opt/ros/jazzy/lib/teleop_twist_joy/teleop_node --ros-args -r __node:=teleop_twist_joy_node -r __ns:=/ --params-file /tmp/tmpcnhud653 --params-file >
             ├─34831 /opt/ros/jazzy/lib/rplidar_ros/sllidar_node --ros-args -r __node:=rplidar_composition -r __ns:=/ --params-file /tmp/launch_params_qasvlw3l
             ├─34832 /opt/ros/jazzy/lib/robot_state_publisher/robot_state_publisher --ros-args -r __node:=robot_state_publisher -r __ns:=/ --params-file /tmp/launch_para>
             ├─34833 /usr/bin/python3 /opt/ros/jazzy/lib/joint_state_publisher/joint_state_publisher --ros-args -r __node:=joint_state_publisher -r __ns:=/ --params-file>
             ├─34834 /opt/ros/jazzy/lib/diagnostic_aggregator/aggregator_node --ros-args -r __ns:=/ --params-file /tmp/tmpw3sl4m6v -r /diagnostics:=diagnostics -r /diagn>
             ├─34835 /usr/bin/python3 /opt/ros/jazzy/lib/turtlebot4_diagnostics/diagnostics_updater --ros-args -r __ns:=/ -r /diagnostics:=diagnostics -r /diagnostics_ag>
             └─34989 /opt/ros/jazzy/lib/rclcpp_components/component_container --ros-args -r __node:=oakd_container -r __ns:=/

Jan 21 13:35:47 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484547.377175375] [oakd]: Starting camera.
Jan 21 13:35:47 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484547.387114999] [oakd]: No ip/mxid specified, connecting to the next available de>
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.882714421] [oakd]: Camera with MXID: 184430101175A41200 and Name: 3.1 connec>
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.883641329] [oakd]: USB SPEED: HIGH
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.905094244] [oakd]: Device type: OAK-D-LITE
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.907720451] [oakd]: Pipeline type: RGB
Jan 21 13:35:50 TB5WaLI turtlebot4-start[34823]: [component_container-10] [WARN] [1737484550.526893148] [oakd]: IMU enabled but not available!
Jan 21 13:35:50 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484550.527078463] [oakd]: Finished setting up pipeline.
Jan 21 13:35:50 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484550.750126919] [oakd]: Camera ready!
Jan 21 13:38:46 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484726.644495597] [oakd]: Stopping camera.
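
The hang shown above can also be detected from a script instead of waiting and pressing Ctrl-C. This is a minimal sketch, assuming the `ros2` CLI is on the PATH and using the same service names and type that appear in the transcript. The helper name `call_trigger`, the 10-second limit, and the injectable `runner` argument are illustrative choices, not part of any TurtleBot 4 or DepthAI tooling.

```python
import subprocess


def call_trigger(service, timeout_s=10.0, runner=subprocess.run):
    """Call a std_srvs/srv/Trigger service through the ros2 CLI.

    Returns "ok" if the CLI printed a success=True response, "failed" if it
    returned without one, and "timeout" if no response arrived in time.
    """
    cmd = ["ros2", "service", "call", service, "std_srvs/srv/Trigger", "{}"]
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        result = runner(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if "success=True" in result.stdout else "failed"
```

Calling `call_trigger("/oakd/stop_camera")` followed by `call_trigger("/oakd/start_camera")` would be expected to return `"timeout"` for the second call when the container is in the hung state described here, which makes the condition easy to log or count across repeated attempts.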

journalctl output (newest entries first):

ubuntu@TB5WaLI:~/TB5-WaLI/wali_ws$ journalctl -r
Jan 21 13:40:29 TB5WaLI systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Jan 21 13:40:29 TB5WaLI systemd[1]: sysstat-collect.service: Deactivated successfully.
Jan 21 13:40:29 TB5WaLI systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Jan 21 13:38:48 TB5WaLI kernel: usb 3-1: SerialNumber: 03e72485
Jan 21 13:38:48 TB5WaLI kernel: usb 3-1: Manufacturer: Movidius Ltd.
Jan 21 13:38:48 TB5WaLI kernel: usb 3-1: Product: Movidius MyriadX
Jan 21 13:38:48 TB5WaLI kernel: usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jan 21 13:38:48 TB5WaLI kernel: usb 3-1: New USB device found, idVendor=03e7, idProduct=2485, bcdDevice= 0.01
Jan 21 13:38:48 TB5WaLI kernel: usb 3-1: new high-speed USB device number 66 using xhci-hcd
Jan 21 13:38:46 TB5WaLI kernel: usb 3-1: USB disconnect, device number 65
Jan 21 13:38:46 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484726.644495597] [oakd]: Stopping camera.
Jan 21 13:38:39 TB5WaLI systemd[1]: fwupd.service: Deactivated successfully.
Jan 21 13:35:50 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484550.750126919] [oakd]: Camera ready!
Jan 21 13:35:50 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484550.527078463] [oakd]: Finished setting up pipeline.
Jan 21 13:35:50 TB5WaLI turtlebot4-start[34823]: [component_container-10] [WARN] [1737484550.526893148] [oakd]: IMU enabled but not available!
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.907720451] [oakd]: Pipeline type: RGB
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.905094244] [oakd]: Device type: OAK-D-LITE
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.883641329] [oakd]: USB SPEED: HIGH
Jan 21 13:35:49 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484549.882714421] [oakd]: Camera with MXID: 184430101175A41200 and Name: 3.1 connec>
Jan 21 13:35:49 TB5WaLI kernel: usb 3-1: SerialNumber: 184430101175A41200
Jan 21 13:35:49 TB5WaLI kernel: usb 3-1: Manufacturer: Intel Corporation
Jan 21 13:35:49 TB5WaLI kernel: usb 3-1: Product: Luxonis Device
Jan 21 13:35:49 TB5WaLI kernel: usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jan 21 13:35:49 TB5WaLI kernel: usb 3-1: New USB device found, idVendor=03e7, idProduct=f63b, bcdDevice= 1.00
Jan 21 13:35:49 TB5WaLI kernel: usb 3-1: new high-speed USB device number 65 using xhci-hcd
Jan 21 13:35:48 TB5WaLI kernel: usb 3-1: USB disconnect, device number 64
Jan 21 13:35:47 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484547.387114999] [oakd]: No ip/mxid specified, connecting to the next available de>
Jan 21 13:35:47 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484547.377175375] [oakd]: Starting camera.
Jan 21 13:35:46 TB5WaLI turtlebot4-start[34823]: [INFO] [launch_ros.actions.load_composable_nodes]: Loaded node '/oakd' in container '/oakd_container'
Jan 21 13:35:46 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484546.349610886] [oakd_container]: Instantiate class: rclcpp_components::NodeFacto>
Jan 21 13:35:46 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484546.349528145] [oakd_container]: Found class: rclcpp_components::NodeFactoryTemp>
Jan 21 13:35:46 TB5WaLI turtlebot4-start[34823]: [component_container-10] [INFO] [1737484546.224483012] [oakd_container]: Load Library: /opt/ros/jazzy/lib/libdepthai_ros>
Jan 21 13:35:45 TB5WaLI turtlebot4-start[34823]: [INFO] [component_container-10]: process started with pid [34989]
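
One way to read the journal excerpt above is to pull out only the camera's kernel USB events and the driver's start/stop messages and put them in chronological order. This is a minimal sketch, assuming input in the newest-first `journalctl -r` format shown. The function name `usb_timeline` and the regular expressions are illustrative.

```python
import re

_STAMP = r"^(\w{3} +\d+ [\d:]{8}) "
# e.g. "... kernel: usb 3-1: New USB device found, idVendor=03e7, idProduct=2485, ..."
_FOUND = re.compile(_STAMP + r".*kernel: usb [\d.-]+: New USB device found, "
                    r"idVendor=([0-9a-f]{4}), idProduct=([0-9a-f]{4})")
_DISCONNECT = re.compile(_STAMP + r".*kernel: usb [\d.-]+: USB disconnect")
_DRIVER = re.compile(_STAMP + r".*\[oakd\]: "
                     r"(Starting camera\.|Camera ready!|Stopping camera\.)")


def usb_timeline(reverse_journal_lines):
    """Return (timestamp, event) tuples, oldest first, from `journalctl -r` lines."""
    events = []
    # journalctl -r prints newest first, so walk the lines backwards.
    for line in reversed(list(reverse_journal_lines)):
        m = _FOUND.match(line)
        if m:
            events.append((m.group(1), f"usb found {m.group(2)}:{m.group(3)}"))
            continue
        m = _DISCONNECT.match(line)
        if m:
            events.append((m.group(1), "usb disconnect"))
            continue
        m = _DRIVER.match(line)
        if m:
            events.append((m.group(1), m.group(2)))
    return events
```

Applied to the excerpt above, this should show the device enumerating as 03e7:f63b ("Luxonis Device") while the pipeline is running and as 03e7:2485 ("Movidius MyriadX") after `Stopping camera.`, with no further `[oakd]` messages following the stop. Whether that re-enumeration is related to the hang is not established by the log alone.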

@slowrunner
Author

slowrunner commented Jan 21, 2025

Interesting - it just happened again, and perhaps there is a clue in the journal?

When I went to sleep, TB5-WaLI was docked and the camera was stopped.
When I got up this morning, TB5-WaLI was still docked but the camera was running for some unknown reason.
I issued a /stop_camera with success.
I issued a /start_camera with success.
I issued a /stop_camera with success to leave the camera stopped while TB5-WaLI was docked.
TB5-WaLI undocked; the oakd container went to 100% in htop and the preview did not resume publishing.

Jan 21 08:02:31 TB5WaLI turtlebot4-start[32144]: [create3_republisher-2] Action request _do_not_use/undock SUCCESS received from the robot
Jan 21 08:02:28 TB5WaLI turtlebot4-start[32144]: [turtlebot4_node-1] [INFO] [1737464548.206275459] [turtlebot4_node]: start_motor service completed.
Jan 21 08:02:27 TB5WaLI turtlebot4-start[32144]: [turtlebot4_node-1] [INFO] [1737464547.926505994] [turtlebot4_node]: start_motor service available, sending request
Jan 21 08:02:27 TB5WaLI turtlebot4-start[32144]: [turtlebot4_node-1] [INFO] [1737464547.926127735] [turtlebot4_node]: oakd/start_camera service available, sending request
Jan 21 08:02:26 TB5WaLI turtlebot4-start[32144]: [turtlebot4_node-1] [INFO] [1737464546.926128057] [turtlebot4_node]: RPLIDAR started
Jan 21 08:02:26 TB5WaLI turtlebot4-start[32144]: [turtlebot4_node-1] [INFO] [1737464546.925970575] [turtlebot4_node]: OAKD started
Jan 21 08:02:25 TB5WaLI turtlebot4-start[32144]: [create3_republisher-2] Action request _do_not_use/undock received goal handle from the robot
Jan 21 08:02:25 TB5WaLI turtlebot4-start[32144]: [create3_republisher-2] Forwarding action request to _do_not_use/undock
Jan 21 08:02:25 TB5WaLI turtlebot4-start[32144]: [create3_republisher-2] Received action request for _do_not_use/undock
Jan 21 08:00:39 TB5WaLI systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Jan 21 08:00:39 TB5WaLI systemd[1]: sysstat-collect.service: Deactivated successfully.
Jan 21 08:00:39 TB5WaLI systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Jan 21 08:00:02 TB5WaLI CRON[32887]: pam_unix(cron:session): session closed for user root
Jan 21 08:00:02 TB5WaLI CRON[32887]: (CRON) info (No MTA installed, discarding output)
Jan 21 08:00:01 TB5WaLI CRON[32888]: (root) CMD (/usr/bin/python3 /home/ubuntu/TB5-WaLI/plib/cleanlifelog.py)
Jan 21 08:00:01 TB5WaLI CRON[32887]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
Jan 21 07:55:40 TB5WaLI systemd[1]: fwupd.service: Deactivated successfully.
Jan 21 07:55:01 TB5WaLI CRON[32875]: pam_unix(cron:session): session closed for user root
Jan 21 07:55:01 TB5WaLI CRON[32876]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jan 21 07:55:01 TB5WaLI CRON[32875]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
Jan 21 07:50:39 TB5WaLI systemd[1]: Finished fwupd-refresh.service - Refresh fwupd metadata and update motd.
Jan 21 07:50:39 TB5WaLI systemd[1]: fwupd-refresh.service: Deactivated successfully.
Jan 21 07:50:39 TB5WaLI systemd[1]: Started fwupd.service - Firmware update daemon.
Jan 21 07:50:39 TB5WaLI dbus-daemon[727]: [system] Successfully activated service 'org.freedesktop.fwupd'
Jan 21 07:50:39 TB5WaLI fwupd[32861]: 12:50:39.924 FuMain               Daemon ready for requests (locale C.UTF-8)
Jan 21 07:50:39 TB5WaLI fwupd[32861]: 12:50:39.894 FuEngine             failed to add device /sys/devices/platform/axi/1000fff000.mmc/mmc_host/mmc0/mmc0:aaaa/block/mmcbl>
Jan 21 07:50:39 TB5WaLI fwupd[32861]: 12:50:39.805 FuPluginFlashrom     failed to set bios info: no structures with type 00
Jan 21 07:50:39 TB5WaLI systemd[1]: Starting fwupd.service - Firmware update daemon...
Jan 21 07:50:39 TB5WaLI dbus-daemon[727]: [system] Activating via systemd: service name='org.freedesktop.fwupd' unit='fwupd.service' requested by ':1.211' (uid=990 pid=3>
Jan 21 07:50:39 TB5WaLI systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Jan 21 07:50:39 TB5WaLI systemd[1]: sysstat-collect.service: Deactivated successfully.
Jan 21 07:50:39 TB5WaLI systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Jan 21 07:50:39 TB5WaLI systemd[1]: Starting fwupd-refresh.service - Refresh fwupd metadata and update motd...
Jan 21 07:47:12 TB5WaLI snapd[738]: storehelpers.go:954: cannot refresh snap "snapd": snap has no updates available
Jan 21 07:45:01 TB5WaLI CRON[32836]: pam_unix(cron:session): session closed for user root
Jan 21 07:45:01 TB5WaLI CRON[32837]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jan 21 07:45:01 TB5WaLI CRON[32836]: pam_unix(cron:session): session opened for user root(uid=0) by root(uid=0)
Jan 21 07:40:39 TB5WaLI systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Jan 21 07:40:39 TB5WaLI systemd[1]: sysstat-collect.service: Deactivated successfully.
Jan 21 07:40:39 TB5WaLI systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Jan 21 07:38:04 TB5WaLI kernel: usb 3-1: SerialNumber: 03e72485
Jan 21 07:38:04 TB5WaLI kernel: usb 3-1: Manufacturer: Movidius Ltd.
Jan 21 07:38:04 TB5WaLI kernel: usb 3-1: Product: Movidius MyriadX
Jan 21 07:38:04 TB5WaLI kernel: usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Jan 21 07:38:04 TB5WaLI kernel: usb 3-1: New USB device found, idVendor=03e7, idProduct=2485, bcdDevice= 0.01
Jan 21 07:38:04 TB5WaLI kernel: usb 3-1: new high-speed USB device number 60 using xhci-hcd
Jan 21 07:38:02 TB5WaLI kernel: usb 3-1: USB disconnect, device number 59
Jan 21 07:38:01 TB5WaLI turtlebot4-start[32144]: [component_container-10] [INFO] [1737463081.754153492] [oakd]: Stopping camera.
Jan 21 07:37:42 TB5WaLI turtlebot4-start[32144]: [component_container-10] [INFO] [1737463062.481160500] [oakd]: Camera ready!

slowrunner changed the title from "turtlebot4 service started ros-jazzy-depthai_driver for oak-d-lite found taking 100% of one core after 23 hours" to "turtlebot4 service started ros-jazzy-depthai_driver for oak-d-lite non-responsive at 100% CPU - how to debug?" on Jan 21, 2025
@Serafadam
Collaborator

Hi, thanks for the report. Does it also happen if you run the camera driver in isolation (via a launch file)?

@slowrunner
Author

slowrunner commented Jan 22, 2025

In my current test, the turtlebot4_node power_saver feature is turned off, which prevents the camera from being stopped while on the dock. So far:

  • the Oak-D-Lite has stayed alive for 36 hours,
  • it has survived four undocks,
  • it has survived four dockings with no interruption of /oakd/rgb/preview/image_raw, and
  • CPU usage has remained around 3% of one core of the Raspberry Pi 5.

Next I'll figure out how to properly stop the TB4 oakd container so I can "run the camera driver via a launch file", and then test repeated stop_camera and start_camera service calls.

(The camera is drawing 1 W from a separate 5 V power supply, with very good voltage regulation of +/-10 mV.)

Image

Image
