Add timeout support when connecting user #3303

Merged
merged 29 commits into from
Jul 18, 2024

nuno-vieira
Member

@nuno-vieira nuno-vieira commented Jul 12, 2024

🔗 Issue Links

https://stream-io.atlassian.net/browse/PBE-4780

🎯 Goal

Add support for timing out reconnection.

📝 Summary

SDK:

  • Add an option to configure reconnection timeout
  • Fix Channel List not hiding error view when data is available
  • Fix the SDK not attempting to connect when the network is reported as unavailable (according to Apple, skipping the attempt can cause edge cases)
  • Ignore the network status, since it is unreliable: even on a physical device it is sometimes reported incorrectly
  • Fix invalid token errors considered as possible to reconnect
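
As a rough sketch of the last three SDK points (not the SDK's actual code), the reconnection decision can be written as a function of the disconnection reason alone. The `DisconnectionReason` cases and the `shouldAttemptReconnection(after:)` name are hypothetical and chosen only for illustration. The function takes no "is the network available" input, and an invalid token is treated as non-recoverable.

```swift
/// Hypothetical disconnection reasons; the SDK's real error types are not shown in this PR description.
enum DisconnectionReason {
    case networkLost
    case invalidToken
    case userInitiated
}

/// Decides whether an automatic reconnection should be attempted.
/// There is deliberately no reachability parameter: the reported network
/// status is not consulted, the client simply tries to connect.
func shouldAttemptReconnection(after reason: DisconnectionReason) -> Bool {
    switch reason {
    case .networkLost:
        // Recoverable: keep trying, whatever the OS reports about the network.
        return true
    case .invalidToken, .userInitiated:
        // Not recoverable by retrying: a new token or an explicit connect is needed.
        return false
    }
}
```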

DemoApp:

  • Fix the demo app not showing the connection bar
  • Fix fake token failures not working in the demo app
  • Fix Demo App channel list error view animation

🛠 Implementation

An alternative approach to #3302 uses a separate entity to control the timeout, independent of the exponential backoff mechanism.

It uses a separate timeout handler to time out the reconnection logic instead of relying on a retry count in the existing backoff retry mechanism. The 2 main reasons for using a separate timeout handler are the following:
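
To illustrate the separation described in this section, here is a minimal sketch under stated assumptions. `ExponentialBackoff` and `ReconnectionTimeout` are hypothetical names, and the delay values are illustrative; none of this is taken from the SDK. The point is only that the deadline is tracked by its own type and never looks at the retry count.

```swift
import Foundation

/// Hypothetical retry strategy: capped exponential backoff. It knows nothing about timeouts.
struct ExponentialBackoff {
    var baseDelay: TimeInterval = 0.25
    var maxDelay: TimeInterval = 25
    private(set) var attempt = 0

    /// Returns the delay before the next retry and advances the attempt counter.
    mutating func nextDelay() -> TimeInterval {
        let delay = min(maxDelay, baseDelay * pow(2, Double(attempt)))
        attempt += 1
        return delay
    }
}

/// Hypothetical timeout handler: it only tracks a deadline and knows nothing about retries.
struct ReconnectionTimeout {
    /// `nil` means reconnection never times out.
    let timeout: TimeInterval?
    private(set) var startedAt: Date?

    init(timeout: TimeInterval?) {
        self.timeout = timeout
    }

    mutating func start(now: Date) { startedAt = now }
    mutating func stop() { startedAt = nil }

    /// True once `timeout` seconds have passed since `start(now:)`.
    func hasTimedOut(now: Date) -> Bool {
        guard let timeout = timeout, let startedAt = startedAt else { return false }
        return now.timeIntervalSince(startedAt) >= timeout
    }
}
```

In a design like this, a reconnection loop would ask the backoff how long to wait before the next attempt and ask the timeout handler whether to give up, so changing one policy does not affect the other.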

🧪 Manual Testing Notes

With timeout

  1. Go To Demo App Configuration Screen
  2. Set reconnection timeout, ex: 15s
  3. Disable Network on your device
  4. Login
  5. Should show "Connecting..." banner
  6. After 15s
  7. Should show "Disconnected" banner
  8. Enable Network on your device
  9. Should show "Connecting..." and then "Connected" and then disappear
  10. Tap on Error View to reload the channel list
  11. Should load channel list

Without timeout

  1. Go To Demo App Configuration Screen
  2. Disable timeout (By default)
  3. Disable Network on your device
  4. Login
  5. Should show "Connecting..." banner
  6. Wait 60s
  7. Should still show "Connecting..." banner
  8. Enable Network on your device
  9. Should show "Connected" and then disappear
  10. Tap on Error View to reload the channel list
  11. Should load channel list

Retest token refresh:

  1. Open Demo App Configuration
  2. Edit tokenRefreshDetails
  3. Add a fake App Secret
  4. Set the duration, ex: 10
  5. Set the number of refreshes, ex: 3
  6. Login
  7. Should see "Demo App Token Refreshing: Token refresh failed." 3x in Xcode Console
  8. Should see "Disconnected"
  9. Logout
  10. Put a valid App Secret from Stream's Dashboard
  11. Should see "Demo App Token Refreshing: Token refresh failed." 3x in Xcode Console
  12. Then you should see "Connected"
  13. Every 10 seconds, the banner should show "Connecting" -> "Connected"
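
The token refresh scenario above can be approximated in isolation with a fake provider that fails a fixed number of times before returning a token. This is a sketch based on one reading of the steps, not the demo app's code: `FakeTokenProvider`, `TokenRefreshFailed`, and the returned token string are hypothetical, and only the log message is taken from the steps.

```swift
/// Hypothetical error type for this sketch.
struct TokenRefreshFailed: Error {}

/// Hypothetical fake provider: fails a fixed number of times, then returns a token.
final class FakeTokenProvider {
    private let failuresBeforeSuccess: Int
    private(set) var attempts = 0

    init(failuresBeforeSuccess: Int) {
        self.failuresBeforeSuccess = failuresBeforeSuccess
    }

    /// Simulates one refresh attempt and logs failures like the demo app does.
    func refresh() -> Result<String, TokenRefreshFailed> {
        attempts += 1
        if attempts <= failuresBeforeSuccess {
            print("Demo App Token Refreshing: Token refresh failed.")
            return .failure(TokenRefreshFailed())
        }
        // Placeholder value; a real provider would return a signed token.
        return .success("token-\(attempts)")
    }
}
```

With `failuresBeforeSuccess: 3`, the first three calls to `refresh()` fail and the fourth succeeds, which matches the "3x in Xcode Console" expectation in the steps.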

☑️ Contributor Checklist

  • I have signed the Stream CLA (required)
  • This change should be manually QAed
  • Changelog is updated with client-facing changes
  • Changelog is updated with new localization keys
  • New code is covered by unit tests
  • Comparison screenshots added for visual changes
  • Affected documentation updated (docusaurus, tutorial, CMS)

@nuno-vieira nuno-vieira changed the title Add timeout when connecting user [Alternative Approach] Add timeout when connecting user Jul 16, 2024
@nuno-vieira nuno-vieira added 🐞 Bug An issue or PR related to a bug ✅ Feature An issue or PR related to a feature 🌐 SDK: StreamChat (LLC) Tasks related to the StreamChat LLC SDK labels Jul 16, 2024
@nuno-vieira nuno-vieira changed the title from "Add timeout when connecting user" to "Add timeout support when connecting user" Jul 16, 2024
@nuno-vieira nuno-vieira force-pushed the fix/add-timeout-when-connecting-user-alternative branch from 06fa6de to 04e0e2a Compare July 16, 2024 16:46
@nuno-vieira nuno-vieira marked this pull request as ready for review July 17, 2024 14:08
@nuno-vieira nuno-vieira requested a review from a team as a code owner July 17, 2024 14:08
@Stream-SDK-Bot
Collaborator

StreamChat XCMetrics

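| target | metric | benchmark | branch | performance | status |
|---|---|---|---|---|---|
| MessageList | Hitches total duration | 10 ms | 1.67 ms | 83.3% 🔼 | 🟢 |
| MessageList | Duration | 2.6 s | 2.55 s | 1.92% 🔼 | 🟢 |
| MessageList | Hitch time ratio | 4 ms per s | 0.65 ms per s | 83.75% 🔼 | 🟢 |
| MessageList | Frame rate | 75 fps | 78.53 fps | 4.71% 🔼 | 🟢 |
| MessageList | Number of hitches | 1 | 0.2 | 80.0% 🔼 | 🟢 |
| ChannelList | Hitches total duration | 12.5 ms | 5.84 ms | 53.28% 🔼 | 🟢 |
| ChannelList | Duration | 2.6 s | 2.55 s | 1.92% 🔼 | 🟢 |
| ChannelList | Hitch time ratio | 5 ms per s | 2.3 ms per s | 54.0% 🔼 | 🟢 |
| ChannelList | Frame rate | 72 fps | 74.81 fps | 3.9% 🔼 | 🟢 |
| ChannelList | Number of hitches | 1.2 | 0.4 | 66.67% 🔼 | 🟢 |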

Contributor

@martinmitrevski martinmitrevski left a comment

Looks good, just added a few questions. In any case, @testableapple can start testing.

@nuno-vieira nuno-vieira force-pushed the fix/add-timeout-when-connecting-user-alternative branch from 4791f04 to 6fb30b2 Compare July 17, 2024 17:12
Contributor

@martinmitrevski martinmitrevski left a comment

Some risky changes here and there. Can't we limit this PR to the timeout support only? This whole reconnection thing was very fragile already, not sure if we should do all these changes.

@nuno-vieira
Member Author

nuno-vieira commented Jul 17, 2024

Some risky changes here and there. Can't we limit this PR to the timeout support only? This whole reconnection thing was very fragile already, not sure if we should do all these changes.

@martinmitrevski All changes are required to pass all QA Scenarios. I tried to make the changes as minimal as possible, but even with a simple change, there was always something conflicting. As you said, this part is very fragile. This PR is already a step in making it less fragile, because it makes things simpler and more obvious.

@nuno-vieira nuno-vieira force-pushed the fix/add-timeout-when-connecting-user-alternative branch from 6fb30b2 to f635d07 Compare July 18, 2024 00:01
Contributor

@martinmitrevski martinmitrevski left a comment

Let's merge it then. Please check the failing e2e tests first - something around the offline mode, not sure if it's connected to your changes.
Afterwards, @testableapple should do a thorough QA around reconnection and token expiry.

@nuno-vieira nuno-vieira added the 🤞 Ready For QA A PR that is Ready for QA label Jul 18, 2024
@testableapple testableapple added 🟢 QAed A PR that was QAed and removed 🤞 Ready For QA A PR that is Ready for QA labels Jul 18, 2024
Contributor

@testableapple testableapple left a comment

LGTM ✅
Just one tiny issue, discussed in Slack.

@nuno-vieira nuno-vieira enabled auto-merge (squash) July 18, 2024 17:16

sonarcloud bot commented Jul 18, 2024

@nuno-vieira nuno-vieira merged commit de3376b into develop Jul 18, 2024
14 checks passed
@nuno-vieira nuno-vieira deleted the fix/add-timeout-when-connecting-user-alternative branch July 18, 2024 19:46
@Stream-SDK-Bot Stream-SDK-Bot mentioned this pull request Jul 18, 2024
@@ -379,6 +382,7 @@ public class ChatClient {
userInfo: UserInfo,
completion: ((Error?) -> Void)? = nil
) {
connectionRecoveryHandler?.start()

@nuno-vieira @martinmitrevski Hello, I found that this code is causing an issue, so I'm leaving a comment.

connectGuestUser calls logOut first. As a result, ConnectionRecoveryHandler.stop is invoked, causing the monitoring of appDidBecomeActive and appDidEnterBackground to stop. This leads to the issue where "the WebSocket does not reconnect when the app resumes from the background."
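
The ordering problem described in this comment can be reproduced with a small stand-alone model. This is a simplified sketch, not the SDK's code: `RecoveryHandler` and `Client` are hypothetical stand-ins, and the guest flow only mirrors the order reported here (start the handler, then an internal log out that stops it).

```swift
/// Hypothetical stand-in for the recovery handler: it only tracks whether it is observing app state.
final class RecoveryHandler {
    private(set) var isObserving = false
    func start() { isObserving = true }
    func stop() { isObserving = false }
}

/// Hypothetical stand-in for the client, reduced to the calls relevant to the report.
final class Client {
    let recoveryHandler = RecoveryHandler()

    func logOut() {
        // Stopping on logout is intended behaviour.
        recoveryHandler.stop()
    }

    func connectUser() {
        recoveryHandler.start()
        // ... open the WebSocket ...
    }

    /// Mirrors the reported order: start(), followed by an internal logOut().
    func connectGuestUser() {
        recoveryHandler.start()
        logOut() // stops the handler that was just started
        // ... fetch the guest token, open the WebSocket ...
    }
}
```

In this model, `connectUser()` leaves the handler observing, while `connectGuestUser()` leaves it stopped, which corresponds to the missing reconnection when the app returns from the background.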

I posted issue #3338

Member Author

Hi @masarusanjp, thanks for spotting this issue!

The actual issue is that logOut should not happen when reconnecting, so the logic should be changed here. Stopping the recovery handler on logout is intended, but it caused this unwanted side effect because logOutFirst does not seem to be properly implemented for guest users.

Labels
🐞 Bug An issue or PR related to a bug ✅ Feature An issue or PR related to a feature 🟢 QAed A PR that was QAed 🌐 SDK: StreamChat (LLC) Tasks related to the StreamChat LLC SDK