Skip to content

Fix: Lack of real-time network feedback (Zombie State)#31#40

Open
preetamspacee wants to merge 2 commits intoAOSSIE-Org:mainfrom
preetamspacee:fix/bug-lack-of-real-time-network-feedback-zombie-state
Open

Fix: Lack of real-time network feedback (Zombie State)#31#40
preetamspacee wants to merge 2 commits intoAOSSIE-Org:mainfrom
preetamspacee:fix/bug-lack-of-real-time-network-feedback-zombie-state

Conversation

@preetamspacee
Copy link

@preetamspacee preetamspacee commented Feb 9, 2026

This PR fixes the “Zombie State” issue (#31) where the trackpad UI remained connected after a network drop. It introduces an application-level heartbeat and real-time latency tracking, enabling proactive connection monitoring instead of relying on slow TCP timeouts.

Previously, network drops could leave the UI stuck in a false “Connected” state for up to 2 minutes, causing silent input failures. With this fix, connection issues are detected within ~4 seconds.

Changes Made

Core Connection Logic
• Added a heartbeat system in useRemoteConnection.ts that sends a ping every 2 seconds
• If no pong is received within 4 seconds, the connection transitions to a stale state
• Implemented a lightweight pong handler in server/websocket.ts

Latency Tracking
• Measured round-trip time (RTT) using ping/pong timestamps
• Displayed real-time latency in the trackpad UI

UI Improvements
• Added a live latency indicator
• Expanded status indicator to 4 states:
• Green: Healthy connection
• Amber (pulsing): Stale/unstable connection
• Blue: Connecting
• Red: Disconnected

Verification

Airplane Mode Test

Disabled Wi-Fi during an active session.

Result: Connection drop detected within ~4 seconds (previously 60–120 seconds).

Distance Test

Moved to the edge of Wi-Fi range.

Result: Latency indicator increased progressively and warned of instability before disconnection.
Impact
• Eliminates silent connection failures
• Provides real-time network feedback
• Improves user confidence and responsiveness
Fixes #31

Summary by CodeRabbit

  • New Features
    • Network latency is now measured and shown during active connections with a latency label and color coding.
    • Visual status bar and dynamic indicators reflect connection state (connected, stale) including a pulsing badge and stale messaging.
    • Connection stability monitoring with automatic reconnect handling and animated transitions/visual effects when recovering.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 9, 2026

📝 Walkthrough

Walkthrough

Implements an application-level WebSocket heartbeat and latency reporting: periodic ping/pong, RTT-based latency, a new 'stale' status with reconnection logic, server-side pong response, and UI changes to display latency and a stale-network state.

Changes

Cohort / File(s) Summary
Heartbeat & Connection Logic
src/hooks/useRemoteConnection.ts
Adds optional onMessage param, ping/pong heartbeat (2s), latency (RTT) state, missed-pong counting, 'stale' detection, forced reconnect after repeated misses, exposes latency in return value.
Server Ping Handler
src/server/websocket.ts
Handles incoming {type:'ping'} messages by replying with {type:'pong'} immediately and returning early.
UI Status & Latency Display
src/components/Trackpad/TouchArea.tsx
Adds `latency: number
Prop Threading
src/routes/trackpad.tsx
Destructures latency from useRemoteConnection and passes latency={latency} to TouchArea.

Sequence Diagram

sequenceDiagram
    participant Client as Client (useRemoteConnection)
    participant Server as Server (websocket.ts)
    participant UI as UI (TouchArea)

    Note over Client,Server: Heartbeat (every 2s)
    Client->>Server: send {type: "ping"}
    Server->>Client: reply {type: "pong"}
    Client->>Client: compute latency = now - pingStart
    Client->>UI: update status='connected', latency=RTT

    Note over Client,UI: Stale detection (no pong ≧4s)
    Client->>Client: increment missedPongs -> mark status='stale', clear latency
    Client->>UI: update status='stale'
    UI->>UI: show pulsing badge, apply blur/grayscale

    Note over Client,Server: Reconnect flow
    Client->>Client: after threshold close socket and schedule reconnect (3s)
    Client->>Server: open new WebSocket -> restart heartbeat
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Thump-thump, I ping beneath the moon,
I wait for pongs that come back soon,
When links go stale I call and peep,
Then hop to mend the twitching sleep,
Now gestures wake — the network hums in tune.

🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately references the main fix (real-time network feedback/Zombie State) and issue #31, clearly summarizing the primary change.
Linked Issues check ✅ Passed All coding requirements from #31 are met: heartbeat implemented (ping every 2s, stale after 4s), latency tracking with RTT measurement, server pong handler added, and UI enhanced with status color mapping and latency indicator.
Out of Scope Changes check ✅ Passed All changes are directly aligned with fixing the Zombie State issue; no unrelated modifications detected beyond heartbeat, latency tracking, and corresponding UI improvements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@src/components/Trackpad/TouchArea.tsx`:
- Around line 56-66: The "Reconnecting..." label in TouchArea.tsx is misleading
because status === 'stale' only sets a flag in useRemoteConnection and doesn't
trigger a socket reconnect; update the UI by changing the displayed text inside
the block that checks status === 'stale' from "Reconnecting..." to "Connection
Unstable" (or similar), and/or implement real reconnection logic in
useRemoteConnection (add a reconnect attempt with backoff that closes and
re-initializes the socket, updates status from 'stale' to
'connecting'/'connected' on success, and surfaces errors via the same status) so
the message matches actual behavior.
- Around line 33-38: The getLatencyColor function incorrectly treats latency===0
as falsy; change the initial check from a falsy check to an explicit
null/undefined check (e.g., latency == null or latency === null || latency ===
undefined) so a 0ms latency falls through to the <50ms branch; update the
getLatencyColor implementation to use that strict null check while keeping the
existing comparisons and return values.

In `@src/hooks/useRemoteConnection.ts`:
- Around line 41-53: The onmessage handler in useRemoteConnection currently
swallows every non-'pong' message; update socket.onmessage so it only intercepts
type === 'pong' (update setLatency, lastPongTime, setStatus as now) and
otherwise forwards the parsed message to the connection consumer (e.g., call a
provided callback like onServerMessage or emit an event) or at least log
unexpected types; locate socket.onmessage in useRemoteConnection and add a
branch that forwards event.data for types such as 'connected', 'server-ip', and
'config-updated' instead of discarding them.
- Around line 27-38: The heartbeat loop marks the connection 'stale' but never
closes the WebSocket, leaving a zombie; update useRemoteConnection's heartbeat
logic (around heartbeatInterval, lastPongTime, setStatus, setLatency) to track
consecutive missed pongs (e.g., missedPongCount) incrementing when Date.now() -
lastPongTime > 4000 and reset to 0 when a pong updates lastPongTime, and when
missedPongCount exceeds a small threshold (e.g., 2–3) call socket.close() to
force the onclose/reconnect path to run and then setStatus appropriately; ensure
you clear or reset missedPongCount when reopening/reinitializing the socket.
- Around line 72-76: The cleanup closure in the useRemoteConnection effect
captures the initial stale ws value (null), so ws?.close() is a no-op on
unmount; update the cleanup to close the actual socket created in the effect by
referencing the local socket variable (or a ref) instead of the state variable:
use the local socket (the variable assigned when creating the WebSocket) for
socket.close(), and ensure you clearTimeout(reconnectTimer) and
clearInterval(heartbeatInterval) for the timers created in the same effect;
locate the return cleanup in useRemoteConnection and replace ws?.close() with a
call to the locally-scoped socket (or a ref.current) to reliably close the
connection on unmount.
🧹 Nitpick comments (2)
src/server/websocket.ts (1)

72-75: Pong response lacks a correlation identifier — latency measurement can be inaccurate.

The server echoes a bare { type: 'pong' } without forwarding any identifier or timestamp from the ping. On the client side, lastPingStart is overwritten on every ping send (every 2 s). If a pong arrives after the next ping has already been sent (e.g., under high latency or a GC pause), the client computes now - lastPingStart against the newer ping, producing an artificially low latency reading.

Consider echoing a sequence number or the client-supplied timestamp so the client can correlate each pong to the correct ping:

Suggested change
 if (msg.type === 'ping') {
-    ws.send(JSON.stringify({ type: 'pong' }));
+    ws.send(JSON.stringify({ type: 'pong', seq: msg.seq }));
     return;
 }
src/hooks/useRemoteConnection.ts (1)

46-48: Unnecessary setStatus('connected') on every pong when already connected.

This calls setStatus('connected') every 2 seconds even when the status is already 'connected', causing avoidable re-renders. Guard with a conditional:

Suggested fix
                         setLatency(now - lastPingStart);
                         lastPongTime = now;
-                        setStatus('connected');
+                        // Only update status if recovering from stale
+                        setStatus((prev) => prev === 'stale' ? 'connected' : prev);

@preetamspacee
Copy link
Author

I’ve pushed an update addressing your feedback:

• Fixed the stale React closure by switching to functional status updates
• Improved the heartbeat logic so missed pongs are counted only after real timeouts
• Added proper stale → reconnect recovery behavior
• Ensured non-pong messages are forwarded instead of being dropped
• Fixed WebSocket cleanup to prevent zombie sockets

I tested the changes locally under unstable network conditions and the connection recovery now behaves as expected. Please let me know if there’s anything else you’d like me to adjust or improve.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/components/Trackpad/TouchArea.tsx`:
- Around line 40-47: This div in TouchArea.tsx is interactive (uses handleStart,
handlers.onTouchMove, handlers.onTouchEnd, handlePreventFocus) but lacks an ARIA
role; add role="application" to the outer <div> to signal to assistive tech that
this is a custom input surface, ensuring the role is applied alongside the
existing className and event handlers and does not alter the existing logic that
checks status.

In `@src/hooks/useRemoteConnection.ts`:
- Around line 76-82: The onclose handler schedules a reconnect unconditionally
which causes a stray reconnect when cleanup intentionally closes the socket;
update the logic to distinguish intentional vs unintentional closes by adding a
flag (e.g., isMounted or intentionalClose) and checking it inside socket.onclose
before calling setTimeout(connect, 3000). In the cleanup, set that flag
(intentionalClose = true or isMounted = false) before calling
activeSocket?.close(), and continue clearing reconnectTimer and nulling
activeSocket as you already do; reference the onclose handler, activeSocket,
reconnectTimer, connect, setStatus and setLatency so you locate where to add the
guard. Ensure onclose only schedules reconnect when the flag indicates the close
was unintentional.
🧹 Nitpick comments (1)
src/hooks/useRemoteConnection.ts (1)

35-54: Heartbeat escalation logic is sound; consider wrapping socket.send in try-catch.

The missed-pong escalation (1 = tolerate, 2 = stale, 3 = force close) is well-designed and addresses the prior review concern about zombie states never recovering.

One minor robustness concern: socket.send() on line 38 is unguarded. Although the readyState === WebSocket.OPEN check on line 36 reduces the risk, there's a small window where the socket could transition to CLOSING between the check and the send, causing an uncaught exception inside setInterval.

Suggested defensive wrap
                     if (socket.readyState === WebSocket.OPEN) {
                         lastPingStart = Date.now();
-                        socket.send(JSON.stringify({ type: 'ping' }));
+                        try {
+                            socket.send(JSON.stringify({ type: 'ping' }));
+                        } catch (e) {
+                            console.warn('Heartbeat send failed', e);
+                        }

Comment on lines 40 to 47
return (
<div
className="flex-1 bg-neutral-800 relative touch-none select-none flex items-center justify-center p-4"
className={`flex-1 relative touch-none select-none flex items-center justify-center p-4 transition-colors duration-500 ${status === 'stale' ? 'bg-neutral-900 opacity-80' : 'bg-neutral-800'}`}
onTouchStart={handleStart}
onTouchMove={handlers.onTouchMove}
onTouchEnd={handlers.onTouchEnd}
onMouseDown={handlePreventFocus}
>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a role attribute for accessibility on the interactive <div>.

Static analysis (Biome) correctly flags that this <div> has touch/mouse event handlers but no ARIA role. Screen readers and assistive technologies can't identify its purpose. Since this is a trackpad input surface, role="application" is appropriate.

Suggested fix
         <div
-            className={`flex-1 relative touch-none select-none flex items-center justify-center p-4 transition-colors duration-500 ${status === 'stale' ? 'bg-neutral-900 opacity-80' : 'bg-neutral-800'}`}
+            className={`flex-1 relative touch-none select-none flex items-center justify-center p-4 transition-colors duration-500 ${status === 'stale' ? 'bg-neutral-900 opacity-80' : 'bg-neutral-800'}`}
+            role="application"
+            aria-label={scrollMode ? 'Scroll input area' : 'Trackpad input area'}
             onTouchStart={handleStart}
             onTouchMove={handlers.onTouchMove}
             onTouchEnd={handlers.onTouchEnd}
             onMouseDown={handlePreventFocus}
         >
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return (
<div
className="flex-1 bg-neutral-800 relative touch-none select-none flex items-center justify-center p-4"
className={`flex-1 relative touch-none select-none flex items-center justify-center p-4 transition-colors duration-500 ${status === 'stale' ? 'bg-neutral-900 opacity-80' : 'bg-neutral-800'}`}
onTouchStart={handleStart}
onTouchMove={handlers.onTouchMove}
onTouchEnd={handlers.onTouchEnd}
onMouseDown={handlePreventFocus}
>
return (
<div
className={`flex-1 relative touch-none select-none flex items-center justify-center p-4 transition-colors duration-500 ${status === 'stale' ? 'bg-neutral-900 opacity-80' : 'bg-neutral-800'}`}
role="application"
aria-label={scrollMode ? 'Scroll input area' : 'Trackpad input area'}
onTouchStart={handleStart}
onTouchMove={handlers.onTouchMove}
onTouchEnd={handlers.onTouchEnd}
onMouseDown={handlePreventFocus}
>
🧰 Tools
🪛 Biome (2.3.13)

[error] 41-47: Static Elements should not be interactive.

To add interactivity such as a mouse or key event listener to a static element, give the element an appropriate role value.

(lint/a11y/noStaticElementInteractions)

🤖 Prompt for AI Agents
In `@src/components/Trackpad/TouchArea.tsx` around lines 40 - 47, This div in
TouchArea.tsx is interactive (uses handleStart, handlers.onTouchMove,
handlers.onTouchEnd, handlePreventFocus) but lacks an ARIA role; add
role="application" to the outer <div> to signal to assistive tech that this is a
custom input surface, ensuring the role is applied alongside the existing
className and event handlers and does not alter the existing logic that checks
status.

Comment on lines 76 to 82
socket.onclose = () => {
setStatus('disconnected');
setLatency(null);
clearInterval(heartbeatInterval);
activeSocket = null;
reconnectTimer = setTimeout(connect, 3000);
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Reconnect on close doesn't distinguish intentional vs. unintentional close.

When the component's cleanup calls activeSocket?.close() (line 97), it triggers onclose, which schedules a reconnect via setTimeout(connect, 3000) on line 81. The cleanup then clears reconnectTimer on line 95 — but the order matters: clearTimeout runs before activeSocket?.close(), so the reconnect timer set by onclose fires after the cleanup has already run.

In practice this is mitigated because the component is unmounted so setStatus/setLatency calls in the new connect() are no-ops on an unmounted component (React 19 doesn't warn, but it's still wasted work). Consider guarding with an isMounted flag or reversing the cleanup order.

Suggested fix with isMounted guard
+        let isMounted = true;
+
         const connect = () => {
+            if (!isMounted) return;
             console.log(`Connecting to ${wsUrl}`);
             setStatus('connecting');
             ...
         };

         connect();

         return () => {
+            isMounted = false;
             clearTimeout(reconnectTimer);
             clearInterval(heartbeatInterval);
             activeSocket?.close();
         };
🤖 Prompt for AI Agents
In `@src/hooks/useRemoteConnection.ts` around lines 76 - 82, The onclose handler
schedules a reconnect unconditionally which causes a stray reconnect when
cleanup intentionally closes the socket; update the logic to distinguish
intentional vs unintentional closes by adding a flag (e.g., isMounted or
intentionalClose) and checking it inside socket.onclose before calling
setTimeout(connect, 3000). In the cleanup, set that flag (intentionalClose =
true or isMounted = false) before calling activeSocket?.close(), and continue
clearing reconnectTimer and nulling activeSocket as you already do; reference
the onclose handler, activeSocket, reconnectTimer, connect, setStatus and
setLatency so you locate where to add the guard. Ensure onclose only schedules
reconnect when the flag indicates the close was unintentional.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Lack of Real-time Network Feedback (Zombie State)

2 participants