[test] baseline failures #15632

danielxiangzl · 2024-12-18T19:35:18Z

Description

How Has This Been Tested?

Key Areas to Review

Type of Change

Which Components or Systems Does This Change Impact?

Checklist

I have read and followed the CONTRIBUTING doc
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I identified and added all stakeholders and component owners affected by this change as reviewers
I tested both happy and unhappy path of the functionality
I have made corresponding changes to the documentation

trunk-io · 2024-12-18T19:35:22Z

⏱️ 3h 18m total CI duration on this PR

Slowest 15 Jobs	Cumulative Duration	Recent Runs
adhoc-forge-test / forge	31m	🟥
test-target-determinator	28m	🟩 🟩 🟩 🟩 🟩 (+3 more)
rust-move-tests	14m	🟥 🟩
rust-cargo-deny	14m	🟩 🟩 🟩 🟩 🟩 (+3 more)
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	13m	🟩
rust-move-tests	12m	🟩
check-dynamic-deps	9m	🟩 🟩 🟩 🟩 🟩 (+4 more)
semgrep/ci	4m	🟩 🟩 🟩 🟩 🟩 (+4 more)
general-lints	4m	🟩 🟩 🟩 🟩 🟩 (+3 more)
file_change_determinator	2m	🟩 🟩 🟩 🟩 🟩 (+3 more)


graphite-app · 2024-12-18T19:36:12Z

testsuite/smoke-test/src/consensus/consensus_fault_tolerance.rs

+    )
+    .await
+    .unwrap();
+    panic!("test_fault_tolerance_of_leader_equivocation");


This panic! statement causes the test to fail unconditionally, regardless of whether the test logic succeeds or fails. Since the test appears to be validating leader equivocation fault tolerance, the test should be allowed to complete normally and verify the expected behavior through its assertions.
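A minimal sketch of what a conditional check could look like in place of the unconditional panic!. The names `RunOutcome` and `check_liveness` are hypothetical and not part of the smoke-test crate, and treating "the ledger version advanced" as the success criterion is an assumption rather than something this PR states.

```rust
/// Hypothetical summary of one fault-tolerance run: ledger versions sampled
/// before and after the equivocating leader was active.
pub struct RunOutcome {
    pub version_before: u64,
    pub version_after: u64,
}

/// Returns Ok only if the chain made progress despite the injected fault,
/// so the test fails on a real liveness problem instead of unconditionally.
pub fn check_liveness(outcome: &RunOutcome) -> Result<(), String> {
    if outcome.version_after > outcome.version_before {
        Ok(())
    } else {
        Err(format!(
            "no progress under leader equivocation: version {} -> {}",
            outcome.version_before, outcome.version_after
        ))
    }
}
```

In a test body this would typically end with something like `check_liveness(&outcome).unwrap();`, so the failure message appears only when the check does not hold.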

Spotted by Graphite Reviewer


graphite-app · 2024-12-18T21:34:24Z

testsuite/testcases/src/performance_test.rs

+#[async_trait]
+impl NetworkLoadTest for PerformanceBenchmark {
+    async fn test(
+        &self,
+        swarm: Arc<tokio::sync::RwLock<Box<dyn Swarm>>>,
+        _report: &mut TestReport,
+        duration: Duration,
+    ) -> Result<()> {
+        let validators = { swarm.read().await.get_validator_clients_with_names() };
+        // 10 vals, test 1,2,3 failures
+        let num_bad_leaders = 1;
+        for (name, validator)  in validators[..num_bad_leaders].iter() {
+            validator
+                    .set_failpoint(
+                        "consensus::leader_equivocation".to_string(),
+                        "return".to_string(),
+                    )
+                    .await
+                    .map_err(|e| {
+                        anyhow!(
+                            "set_failpoint to set consensus leader equivocation on {} failed, {:?}",
+                            name,
+                            e
+                        )
+                    })?;
+        };
+        Ok(())
+    }
+}


The test should honor the duration parameter by adding tokio::time::sleep(duration).await after setting the failpoints. This ensures the test runs for the expected duration and gives the failpoints time to take effect. Without this sleep, the test may complete prematurely before the fault injection has meaningful impact.
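The suggestion can be sketched as follows. This is a synchronous, std-only stand-in so it runs without the forge crates: `FailpointClient`, `CountingClient`, and `inject_and_hold` are hypothetical names, and `std::thread::sleep` stands in for the `tokio::time::sleep(duration).await` call that the async `test` method would use.

```rust
use std::cell::Cell;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a validator client that can arm a failpoint.
pub trait FailpointClient {
    fn set_failpoint(&self, name: &str, action: &str) -> Result<(), String>;
}

/// Hypothetical mock that only records whether a failpoint was armed.
pub struct CountingClient {
    pub armed: Cell<bool>,
}

impl FailpointClient for CountingClient {
    fn set_failpoint(&self, _name: &str, _action: &str) -> Result<(), String> {
        self.armed.set(true);
        Ok(())
    }
}

/// Arms the equivocation failpoint on the first `num_bad_leaders` validators,
/// then keeps the test open for `duration` and returns the elapsed time.
pub fn inject_and_hold<C: FailpointClient>(
    validators: &[(String, C)],
    num_bad_leaders: usize,
    duration: Duration,
) -> Result<Duration, String> {
    let started = Instant::now();
    // `take` stops early instead of panicking if fewer validators exist.
    for (name, validator) in validators.iter().take(num_bad_leaders) {
        validator
            .set_failpoint("consensus::leader_equivocation", "return")
            .map_err(|e| format!("set_failpoint on {} failed: {}", name, e))?;
    }
    // Synchronous placeholder; the async code would await tokio's sleep here.
    thread::sleep(duration);
    Ok(started.elapsed())
}
```

Because the sleep happens after the loop, the returned elapsed time is at least `duration`, which is the property the review comment asks for.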

Spotted by Graphite Reviewer


graphite-app · 2024-12-18T21:57:20Z

testsuite/testcases/src/performance_test.rs

+    async fn test(
+        &self,
+        swarm: Arc<tokio::sync::RwLock<Box<dyn Swarm>>>,
+        _report: &mut TestReport,
+        duration: Duration,
+    ) -> Result<()> {
+        let validators = { swarm.read().await.get_validator_clients_with_names() };
+        // 10 vals, test 1,2,3 failures
+        let num_bad_leaders = 3;
+        for (name, validator)  in validators[..num_bad_leaders].iter() {
+            validator
+                    .set_failpoint(
+                        "consensus::leader_equivocation".to_string(),
+                        "return".to_string(),
+                    )
+                    .await
+                    .map_err(|e| {
+                        anyhow!(
+                            "set_failpoint to set consensus leader equivocation on {} failed, {:?}",
+                            name,
+                            e
+                        )
+                    })?;
+        };
+        Ok(())
+    }
+}


The test currently returns immediately after setting the failpoints, without waiting for the specified duration. This means the test may complete before the injected failures have time to manifest and be observed. Adding tokio::time::sleep(duration).await before returning would ensure the test runs for the intended duration while the failpoints are active.
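If the goal is also to observe the effect while the failpoints are active, one option is to sample a metric at intervals until the duration elapses rather than only waiting. The sketch below is an assumption about how that might look, not code from this PR: `observe_for` is a hypothetical helper, the sampled value is a placeholder for whatever the test would read (for example a committed version), and it is written synchronously with std only.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `sample` immediately and then roughly every `interval` until
/// `duration` has elapsed, returning all collected samples in order.
pub fn observe_for<F: FnMut() -> u64>(
    duration: Duration,
    interval: Duration,
    mut sample: F,
) -> Vec<u64> {
    let deadline = Instant::now() + duration;
    let mut samples = Vec::new();
    loop {
        samples.push(sample());
        let now = Instant::now();
        if now >= deadline {
            break;
        }
        // Never sleep past the deadline.
        thread::sleep(interval.min(deadline - now));
    }
    samples
}
```

The collected samples can then feed an assertion at the end of the run, so a stalled chain shows up as a flat series instead of going unnoticed.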

Spotted by Graphite Reviewer


danielxiangzl added 2 commits December 18, 2024 11:33

test leader equivocation

e3eb8e3

test

8c69113

danielxiangzl added the CICD:build-failpoints-images Build failpoints docker image label Dec 18, 2024

graphite-app bot reviewed Dec 18, 2024


use rotating leader

328f7e8

danielxiangzl force-pushed the daniel-baseline-failures branch from dde8216 to 328f7e8 Compare December 18, 2024 21:56

graphite-app bot reviewed Dec 18, 2024


danielxiangzl added 4 commits December 18, 2024 14:35

1 failure

51d88e9

tps

d01d497

tps

21875f4

3 faults

527e90c

danielxiangzl force-pushed the daniel-baseline branch from a492fa9 to cc7eeb5 Compare December 21, 2024 01:01
