Drop of messages with History "keep all" but not with "keep last" #385

Open
theksg opened this issue Dec 19, 2024 · 7 comments

Comments

@theksg

theksg commented Dec 19, 2024

I am creating a publisher and a subscriber with QoS History "keep all" and Durability "transient local".

If the publisher publishes more than 5000 messages and I then start the subscriber, the subscriber only receives the last 5000 messages, not all of them.

However, if I change the History to "keep last" with depth 50000 and do the same thing (start the subscriber after more than 5000 messages have been published), the subscriber receives all the messages.

I want to know the maximum queue depth that Fast RTPS allows.

Publisher code:
#include <chrono>
#include <functional>
#include <memory>
#include <string>

#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/int32.hpp"

using namespace std::chrono_literals;

class MinimalPublisher : public rclcpp::Node
{
public:
  MinimalPublisher()
  : Node("minimal_publisher"), count_(0)
  {
    // Set up QoS with history keep last (depth 50000), reliability reliable, and durability transient local
    auto qos = rclcpp::QoS(rclcpp::KeepLast(50000))
        .reliable()
        .transient_local();

    publisher_ = this->create_publisher<std_msgs::msg::Int32>("topic", qos);
    timer_ = this->create_wall_timer(
        200ms, std::bind(&MinimalPublisher::timer_callback, this));
  }

private:
  void timer_callback()
  {
    if (count_ >= 7000) {
      rclcpp::shutdown();  // Stop publishing after 7000 messages
      return;
    }

    auto message = std_msgs::msg::Int32();
    message.data = count_;
    RCLCPP_INFO(this->get_logger(), "Publishing: '%d'", message.data);
    publisher_->publish(message);
    count_++;
  }

  rclcpp::TimerBase::SharedPtr timer_;
  rclcpp::Publisher<std_msgs::msg::Int32>::SharedPtr publisher_;
  size_t count_;
};

int main(int argc, char * argv[])
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<MinimalPublisher>());
  rclcpp::shutdown();
  return 0;
}

Subscriber code:
#include <memory>

#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/int32.hpp"

using std::placeholders::_1;

class MinimalSubscriber : public rclcpp::Node
{
public:
  MinimalSubscriber()
  : Node("minimal_subscriber")
  {
    // Set up QoS with history keep last (depth 50000), reliability reliable, and durability transient local
    auto qos = rclcpp::QoS(rclcpp::KeepLast(50000))
        .reliable()
        .transient_local();

    subscription_ = this->create_subscription<std_msgs::msg::Int32>(
        "/topic", qos, std::bind(&MinimalSubscriber::topic_callback, this, _1));
  }

private:
  void topic_callback(const std_msgs::msg::Int32::SharedPtr msg) const
  {
    RCLCPP_INFO(this->get_logger(), "I heard: '%d'", msg->data);
  }

  rclcpp::Subscription<std_msgs::msg::Int32>::SharedPtr subscription_;
};

int main(int argc, char * argv[])
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<MinimalSubscriber>());
  rclcpp::shutdown();
  return 0;
}

Please change the QoS values in the code as needed.
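
For reference, the "keep all" configuration the issue title refers to would replace the KeepLast(50000) line in both nodes with something like the following sketch (with KEEP_ALL the depth is ignored and the middleware's resource limits apply instead):

// KEEP_ALL history: no depth; the middleware's ResourceLimits (e.g. max_samples) bound the cache
auto qos = rclcpp::QoS(rclcpp::KeepAll())
    .reliable()
    .transient_local();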

@fujitatomoya
Collaborator

@theksg

As far as I know, this is the designed behavior.

The DDS specification defines ResourceLimitsQosPolicy (https://fast-dds.docs.eprosima.com/en/v3.0.0/fastdds/api_reference/dds_pim/core/policy/resourcelimitsqospolicy.html), which determines the limits:

int32_t max_samples
A value less than or equal to 0 means infinite resources. By default, 5000.

If you want to increase the limits, you can either:

  • specify the limits with KEEP_LAST, which you already confirmed, or
  • apply an XML configuration to change max_samples, as sketched below.

Hopefully this helps.
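
For the second option, a minimal sketch of a Fast DDS XML profile that raises max_samples (the profile name is illustrative; the element names follow the v3.x documentation linked above, while older Fast DDS versions use <publisher>/<subscriber> instead of <data_writer>/<data_reader>):

<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
  <profiles>
    <data_writer profile_name="large_history_writer" is_default_profile="true">
      <topic>
        <historyQos>
          <kind>KEEP_ALL</kind>
        </historyQos>
        <resourceLimitsQos>
          <!-- a value <= 0 means "infinite"; otherwise raise it above the 5000 default -->
          <max_samples>0</max_samples>
        </resourceLimitsQos>
      </topic>
    </data_writer>
  </profiles>
</dds>

To have rmw_fastrtps pick this up, point FASTRTPS_DEFAULT_PROFILES_FILE at the file before starting the nodes; depending on the rmw_fastrtps version you may also need RMW_FASTRTPS_USE_QOS_FROM_XML=1 for the history settings from the XML to actually be applied.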

@theksg
Author

theksg commented Dec 20, 2024

@fujitatomoya thanks for the reply.
If resource exhaustion happens, it will lead to the node crashing. Is that the expected behavior?

@fujitatomoya
Collaborator

If resource exhaustion happens, it will lead to the node crashing. Is that the expected behavior?

I am not quite following this; can you rephrase the question a bit?

@theksg
Author

theksg commented Dec 20, 2024

I am not quite following this; can you rephrase the question a bit?

With a publisher using QoS depth 50,000 and transient local durability, sending 10 MB images can lead to a very large queue (50,000 samples × 10 MB ≈ 500 GB in the worst case). If memory is overwhelmed, the node might crash. Is this expected, and should developers prevent such crashes themselves, or is there some mechanism to prevent such a failure?

@fujitatomoya
Collaborator

I guess that with POD data this should be detected at configuration time, before runtime, because the middleware would try to preallocate a buffer much bigger than the system actually has. If it is variable-length data, the DataWriter::write() call may fail and throw an exception. Honestly, I have never tried this, and it probably depends on the rmw implementation.

CC: @eboasson @MiguelCompany @trubio-rti
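
As a defensive measure on the application side, one option is to catch a failed publish instead of letting it propagate. Below is a minimal sketch of the publisher's timer_callback rewritten this way, assuming the rmw layer surfaces the failure as an rclcpp exception rather than terminating the process (whether it actually does depends on the rmw implementation, as noted above):

void timer_callback()
{
  auto message = std_msgs::msg::Int32();
  message.data = count_;
  try {
    publisher_->publish(message);
    count_++;
  } catch (const rclcpp::exceptions::RCLError & e) {
    // rcl/rmw reported an error, e.g. the middleware could not accept the sample
    RCLCPP_ERROR(this->get_logger(), "publish failed: %s", e.what());
  }
}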

@clalancette
Contributor

With a publisher using QoS depth 50,000 and transient local durability, sending 10 MB images can lead to a very large queue. If memory is overwhelmed, the node might crash. Is this expected, and should developers prevent such crashes themselves, or is there some mechanism to prevent such a failure?

So while we should do our best to not crash, I'm also curious what the use case is here. What are you trying to achieve with such a large depth of large images?

@theksg
Author

theksg commented Dec 21, 2024

So while we should do our best to not crash, I'm also curious what the use case is here. What are you trying to achieve with such a large depth of large images?

The scenario I am talking about is hypothetical and basically only intended to identify a case where a crash will happen.
From the use case's perspective, I am writing a publisher with transient local durability and KeepAll history. I wondered whether KeepAll can make the node crash, so I was looking into the limitations I need to take into account.

I understand that the default maximum depth for the KeepAll QoS history is 5000 for Fast DDS and 10000 for Cyclone DDS (although a hypothetical message size could exhaust even those, XD).

Then I tried KeepLast and was surprised to see that its capacity can be larger than what KeepAll can store, because by definition KeepAll should have more capacity.

I wanted to know whether a crash due to the message queue being overwhelmed can be detected in advance, or whether we rely on the developer to size the message queue so as to avoid crashing.
