Conversation
Have made changes to the Flutter SDK to support VLM, following the Swift SDK. Once the example app also has the corresponding changes, I'll be able to fully test this out. Have test-built it.
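Until the example app changes land, here is a rough usage sketch of what driving the new Flutter VLM path might look like. It only reuses identifiers that show up later in this review (VLMImage.filePath, processImageStream, the streaming result's stream); the exact parameter names and return shapes are assumptions, not the finalized API.

```dart
// Hypothetical usage sketch. Names mirror identifiers mentioned elsewhere in
// this review (VLMImage.filePath, processImageStream); signatures are assumed.
import 'package:runanywhere/runanywhere.dart';

Future<void> describePhoto(String path) async {
  // Wrap a captured photo as a VLM image input.
  final image = VLMImage.filePath(path);

  // Streaming mode: tokens arrive incrementally via the result's stream.
  final result = await RunAnywhere.processImageStream(
    image: image,
    prompt: 'Describe this image in one sentence.',
    maxTokens: 128, // assumed parameter name
  );

  final buffer = StringBuffer();
  await for (final token in result.stream) {
    buffer.write(token); // append tokens as they are generated
  }
  print(buffer.toString());
}
```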
📝 Walkthrough
This PR adds comprehensive Vision Language Model (VLM) support to the RunAnywhere SDK and example applications across Flutter and React Native platforms, including model registration, camera integration, native bridges, type definitions, and image processing APIs with both streaming and non-streaming modes.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| } else if (imageFormat == RacVlmImageFormat.rgbPixels && | ||
| pixelData != null) { | ||
| // Allocate native memory for pixel data | ||
| pixelDataPtr = calloc<Uint8>(pixelData.length); | ||
| for (int i = 0; i < pixelData.length; i++) { | ||
| pixelDataPtr![i] = pixelData[i]; | ||
| } | ||
| imagePtr.ref.filePath = nullptr; | ||
| imagePtr.ref.pixelData = pixelDataPtr!; | ||
| imagePtr.ref.base64Data = nullptr; | ||
| imagePtr.ref.dataSize = pixelData.length; | ||
| } else if (imageFormat == RacVlmImageFormat.base64 && base64Data != null) { |
Pixel data is copied in a loop, which is inefficient for large images.
| } else if (imageFormat == RacVlmImageFormat.rgbPixels && | |
| pixelData != null) { | |
| // Allocate native memory for pixel data | |
| pixelDataPtr = calloc<Uint8>(pixelData.length); | |
| for (int i = 0; i < pixelData.length; i++) { | |
| pixelDataPtr![i] = pixelData[i]; | |
| } | |
| imagePtr.ref.filePath = nullptr; | |
| imagePtr.ref.pixelData = pixelDataPtr!; | |
| imagePtr.ref.base64Data = nullptr; | |
| imagePtr.ref.dataSize = pixelData.length; | |
| } else if (imageFormat == RacVlmImageFormat.base64 && base64Data != null) { | |
| // Allocate native memory for pixel data | |
| pixelDataPtr = calloc<Uint8>(pixelData.length); | |
| final uint8List = pixelDataPtr!.asTypedList(pixelData.length); | |
| uint8List.setAll(0, pixelData); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart
Line: 526:537
Comment:
pixel data is copied in a loop which is inefficient for large images
```suggestion
// Allocate native memory for pixel data
pixelDataPtr = calloc<Uint8>(pixelData.length);
final uint8List = pixelDataPtr!.asTypedList(pixelData.length);
uint8List.setAll(0, pixelData);
```
How can I resolve this? If you propose a fix, please make it concise.
| // Allocate native memory for pixel data | ||
| pixelDataPtr = calloc<Uint8>(params.pixelData!.length); | ||
| for (int i = 0; i < params.pixelData!.length; i++) { | ||
| pixelDataPtr![i] = params.pixelData![i]; | ||
| } |
Same inefficient pixel-data copying pattern.
| // Allocate native memory for pixel data | |
| pixelDataPtr = calloc<Uint8>(params.pixelData!.length); | |
| for (int i = 0; i < params.pixelData!.length; i++) { | |
| pixelDataPtr![i] = params.pixelData![i]; | |
| } | |
| // Allocate native memory for pixel data | |
| pixelDataPtr = calloc<Uint8>(params.pixelData!.length); | |
| final uint8List = pixelDataPtr!.asTypedList(params.pixelData!.length); | |
| uint8List.setAll(0, params.pixelData!); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart
Line: 687:691
Comment:
same inefficient pixel data copying pattern
```suggestion
// Allocate native memory for pixel data
pixelDataPtr = calloc<Uint8>(params.pixelData!.length);
final uint8List = pixelDataPtr!.asTypedList(params.pixelData!.length);
uint8List.setAll(0, params.pixelData!);
```
How can I resolve this? If you propose a fix, please make it concise.
| unawaited(_cameraController?.dispose()); | ||
| _cameraController = null; |
unawaited is used on the dispose call, but camera disposal errors should be handled.
| unawaited(_cameraController?.dispose()); | |
| _cameraController = null; | |
| _cameraController?.dispose().catchError((e) { | |
| debugPrint('Camera disposal error: $e'); | |
| }); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart
Line: 88:89
Comment:
`unawaited` used on dispose calls but camera disposal errors should be handled
```suggestion
_cameraController?.dispose().catchError((e) {
debugPrint('Camera disposal error: $e');
});
```
How can I resolve this? If you propose a fix, please make it concise.
| const performAutoStreamCapture = useCallback(async () => { | ||
| if (!cameraRef.current) return; | ||
| try { | ||
| // FIX: Removed 'qualityPrioritization' | ||
| const photo = await cameraRef.current.takePhoto({ | ||
| enableShutterSound: false | ||
| }); | ||
| let accumulatedText = ''; | ||
| await vlmService.describeImage( | ||
| photo.path, | ||
| AUTO_STREAM_PROMPT, | ||
| AUTO_STREAM_MAX_TOKENS, | ||
| (token) => { | ||
| accumulatedText += token; | ||
| setCurrentDescription(accumulatedText); | ||
| } | ||
| ); | ||
| } catch (err) { | ||
| console.warn('[useVLMCamera] Auto-stream skipped frame:', err); | ||
| } | ||
| }, [cameraRef, vlmService]); |
Consider adding a guard to prevent queuing captures if one is already in progress.
| const performAutoStreamCapture = useCallback(async () => { | |
| if (!cameraRef.current) return; | |
| try { | |
| // FIX: Removed 'qualityPrioritization' | |
| const photo = await cameraRef.current.takePhoto({ | |
| enableShutterSound: false | |
| }); | |
| let accumulatedText = ''; | |
| await vlmService.describeImage( | |
| photo.path, | |
| AUTO_STREAM_PROMPT, | |
| AUTO_STREAM_MAX_TOKENS, | |
| (token) => { | |
| accumulatedText += token; | |
| setCurrentDescription(accumulatedText); | |
| } | |
| ); | |
| } catch (err) { | |
| console.warn('[useVLMCamera] Auto-stream skipped frame:', err); | |
| } | |
| }, [cameraRef, vlmService]); | |
| const performAutoStreamCapture = useCallback(async () => { | |
| if (!cameraRef.current || isProcessing) return; | |
| setIsProcessing(true); | |
| try { | |
| const photo = await cameraRef.current.takePhoto({ | |
| enableShutterSound: false | |
| }); | |
| let accumulatedText = ''; | |
| await vlmService.describeImage( | |
| photo.path, | |
| AUTO_STREAM_PROMPT, | |
| AUTO_STREAM_MAX_TOKENS, | |
| (token) => { | |
| accumulatedText += token; | |
| setCurrentDescription(accumulatedText); | |
| } | |
| ); | |
| } catch (err) { | |
| console.warn('[useVLMCamera] Auto-stream skipped frame:', err); | |
| } finally { | |
| setIsProcessing(false); | |
| } | |
| }, [cameraRef, vlmService, isProcessing]); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: examples/react-native/RunAnywhereAI/src/hooks/useVLMCamera.ts
Line: 188:210
Comment:
consider adding a guard to prevent queuing captures if one is already in progress
```suggestion
const performAutoStreamCapture = useCallback(async () => {
if (!cameraRef.current || isProcessing) return;
setIsProcessing(true);
try {
const photo = await cameraRef.current.takePhoto({
enableShutterSound: false
});
let accumulatedText = '';
await vlmService.describeImage(
photo.path,
AUTO_STREAM_PROMPT,
AUTO_STREAM_MAX_TOKENS,
(token) => {
accumulatedText += token;
setCurrentDescription(accumulatedText);
}
);
} catch (err) {
console.warn('[useVLMCamera] Auto-stream skipped frame:', err);
} finally {
setIsProcessing(false);
}
}, [cameraRef, vlmService, isProcessing]);
```
How can I resolve this? If you propose a fix, please make it concise.
| */ | ||
| async findGGUFInDirectory(directory: string): Promise<string> { | ||
| if (!RNFS) { | ||
| throw new Error('react-native-fs not available'); | ||
| } | ||
| try { | ||
| const contents = await RNFS.readDir(directory); | ||
| // Look for .gguf files (not mmproj) | ||
| for (const item of contents) { | ||
| if (item.isFile() && item.name.endsWith('.gguf') && !item.name.includes('mmproj')) { | ||
| logger.info(`Found main GGUF model: ${item.name}`); | ||
| return item.path; | ||
| } | ||
| } | ||
| // If not found, check nested directories | ||
| for (const item of contents) { | ||
| if (item.isDirectory()) { | ||
| try { | ||
| return await this.findGGUFInDirectory(item.path); | ||
| } catch { | ||
| // Continue searching other directories | ||
| } | ||
| } | ||
| } | ||
| throw new Error(`No GGUF model file found in ${directory}`); | ||
| } catch (error) { | ||
| logger.error(`Error finding GGUF file: ${error}`); | ||
| throw error; | ||
| } | ||
| }, |
Consider adding a depth limit or breadth-first search to prevent excessive recursion for deeply nested directories.
Prompt To Fix With AI
This is a comment left during a code review.
Path: sdk/runanywhere-react-native/packages/core/src/services/FileSystem.ts
Line: 630:663
Comment:
consider adding a depth limit or breadth-first search to prevent excessive recursion for deeply nested directories
How can I resolve this? If you propose a fix, please make it concise.
@coderabbitai please do a detailed review for this PR. Thanks!
@greptile please do a detailed review for this PR. Thanks!
Based on the PR objectives, I'll review:
- Core SDK Changes
- Example Applications
- Architecture & Integration
I'll analyze the code quality, architectural patterns, potential issues, and provide actionable feedback.
✅ Actions performed: Full review triggered.
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sdk/runanywhere-flutter/packages/runanywhere_llamacpp/lib/llamacpp.dart (1)
274-280: ⚠️ Potential issue | 🟠 Major
dispose() does not reset _isVlmRegistered, causing stale state on re-registration.
dispose() clears _bindings, _registeredModels, and _isRegistered, but leaves _isVlmRegistered = true. If register() is called again after dispose(), _registerVlm() (Line 160) will see _isVlmRegistered == true and skip VLM registration entirely — even though the native bindings were destroyed.
🐛 Proposed fix
```diff
 static void dispose() {
   _bindings = null;
   _registeredModels.clear();
   _isRegistered = false;
+  _isVlmRegistered = false;
   _logger.info('LlamaCpp disposed');
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere_llamacpp/lib/llamacpp.dart` around lines 274 - 280, dispose() currently clears _bindings, _registeredModels, and _isRegistered but fails to reset _isVlmRegistered, which causes _registerVlm() to skip re-registering VLMs after dispose; update dispose() to also set _isVlmRegistered = false so that subsequent calls to register() and _registerVlm() will perform VLM registration against the new/native bindings, ensuring the internal state (_bindings, _registeredModels, _isRegistered, _isVlmRegistered) is fully reset.
🧹 Nitpick comments (22)
sdk/runanywhere-flutter/packages/runanywhere/lib/foundation/error_types/sdk_error.dart (2)
322-323: vlmCancelled recovery suggestion reads as a description, not actionable guidance.
Every other recoverySuggestion tells the user what to do. 'The VLM generation was cancelled by the user.' simply restates the error. Consider something like 'No action needed. Re-initiate VLM processing to try again.' to stay consistent with the pattern.
✏️ Suggested wording
```diff
 case SDKErrorType.vlmCancelled:
-  return 'The VLM generation was cancelled by the user.';
+  return 'No action needed. Re-initiate VLM processing to try again.';
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/foundation/error_types/sdk_error.dart` around lines 322 - 323, The recoverySuggestion for SDKErrorType.vlmCancelled currently restates the error; update the recovery guidance in the recoverySuggestion mapping (where SDKErrorType.vlmCancelled is handled) to provide an actionable suggestion such as "No action needed. Re-initiate VLM processing to try again." so it matches the pattern of other cases and gives clear next steps to the user.
653-693: VLM factory constructors omit underlyingError, unlike similar existing factories.
vlmModelLoadFailed and vlmProcessingFailed don't accept an optional underlyingError parameter, while analogous factories like modelLoadFailed (line 373), requestFailed (line 479), and databaseInitializationFailed (line 588) do. Preserving the exception chain aids debugging.
♻️ Proposed fix to add underlyingError support
```diff
 /// VLM model load failed error
-static SDKError vlmModelLoadFailed(String message) {
+static SDKError vlmModelLoadFailed(String message, {Object? underlyingError}) {
   return SDKError(
     'VLM model load failed: $message',
     SDKErrorType.vlmModelLoadFailed,
+    underlyingError: underlyingError,
   );
 }

 /// VLM processing failed error
-static SDKError vlmProcessingFailed(String message) {
+static SDKError vlmProcessingFailed(String message, {Object? underlyingError}) {
   return SDKError(
     'VLM processing failed: $message',
     SDKErrorType.vlmProcessingFailed,
+    underlyingError: underlyingError,
   );
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/foundation/error_types/sdk_error.dart` around lines 653 - 693, Add an optional dynamic underlyingError parameter to the VLM factory functions vlmModelLoadFailed and vlmProcessingFailed and pass it into the SDKError constructor so the exception chain is preserved (mirror the signatures/pattern used by modelLoadFailed, requestFailed, and databaseInitializationFailed); update vlmModelLoadFailed(String message) to vlmModelLoadFailed(String message, [dynamic? underlyingError]) and vlmProcessingFailed(String message, [dynamic? underlyingError]) and include underlyingError in the returned SDKError(...) call.examples/flutter/RunAnywhereAI/pubspec.yaml (1)
46-47: Using the pure-Dart image package for per-frame BGRA→RGB conversion will be a performance bottleneck in streaming VLM pipelines.
The Dart Image Library provides the ability to load, save, and manipulate images in a variety of image file formats. The library can be used with both dart:io and dart:html, for command-line, Flutter, and web applications. However, it performs all processing in pure Dart on the CPU with no native acceleration. For a VLM streaming use case (continuously converting BGRA camera frames), running this on the main Dart thread will cause dropped frames and UI jank.
Mitigations to consider:
- Push the conversion into the native FFI/C++ layer already used elsewhere in the VLM pipeline, so the pixel format conversion happens before the buffer reaches Dart.
- If it must stay in Dart, dispatch via compute() or a dedicated Isolate to avoid blocking the UI thread (a minimal sketch follows this list).
- Consider whether the camera streaming API can be configured to produce a format the model already accepts natively, avoiding the conversion entirely.
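Below is a minimal sketch of the compute() mitigation, assuming a tightly packed BGRA buffer. The conversion itself is a plain byte swizzle rather than an image-package call, and the function and parameter names are illustrative only; the real camera frame layout (plane strides, padding) would still need to be handled.

```dart
import 'dart:typed_data';

import 'package:flutter/foundation.dart';

/// Top-level function so it can be passed to compute().
/// Assumes a tightly packed BGRA buffer with no row padding; real camera
/// planes usually have a bytesPerRow stride that must be accounted for.
Uint8List bgraToRgb(Uint8List bgra) {
  final pixelCount = bgra.length ~/ 4;
  final rgb = Uint8List(pixelCount * 3);
  for (var i = 0; i < pixelCount; i++) {
    rgb[i * 3] = bgra[i * 4 + 2]; // R
    rgb[i * 3 + 1] = bgra[i * 4 + 1]; // G
    rgb[i * 3 + 2] = bgra[i * 4]; // B
  }
  return rgb;
}

/// Runs the conversion on a background isolate so the UI thread stays free.
Future<Uint8List> convertFrameOffUiThread(Uint8List bgraBytes) {
  return compute(bgraToRgb, bgraBytes);
}
```

Note that compute() still pays a per-frame cost to hand the bytes to another isolate, so for sustained live streaming the native-layer conversion is likely the better long-term option.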
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/pubspec.yaml` around lines 46 - 47, The pubspec currently depends on the pure-Dart "image" package for per-frame BGRA→RGB conversion (the "image" dependency), which will cause UI jank in streaming VLM pipelines; instead, move the conversion out of the main Dart thread by implementing the pixel-format conversion in the native FFI/C++ layer already used elsewhere in the pipeline so frames arrive to Dart in the model-native format, or if native changes are not possible, dispatch the per-frame conversion off the UI isolate via compute()/a dedicated Isolate when handling camera frames (where the conversion currently occurs), and/or configure the camera streaming API to emit a supported pixel format to eliminate the conversion step.
examples/flutter/RunAnywhereAI/lib/app/runanywhere_ai_app.dart (1)
231-231: Log message says "VLM models registered" (plural) but only one model is registered.
If this is the only VLM entry planned for this release, align the message with the code:
📝 Suggested correction
```diff
-debugPrint('✅ VLM models registered');
+debugPrint('✅ VLM model registered (SmolVLM 500M)');
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/app/runanywhere_ai_app.dart` at line 231, The log string used in debugPrint ('✅ VLM models registered') is plural but only one VLM model is registered; update the message to singular (e.g., '✅ VLM model registered') or, if the count is dynamic, interpolate the actual number when calling debugPrint. Locate the debugPrint call in runanywhere_ai_app.dart (the '✅ VLM models registered' string) and change it to the appropriate singular text or to use the model count variable in the message.examples/flutter/RunAnywhereAI/lib/core/services/permission_service.dart (1)
176-233: Consider extracting a shared permission-request helper to reduce duplication.
requestCameraPermission,requestMicrophonePermission, andrequestSpeechRecognitionPermissionare structurally identical — they differ only in thePermissiontype and user-facing strings. A private generic method would eliminate ~50 lines of repeated logic.♻️ Sketch of a shared helper
+ Future<bool> _requestPermission( + BuildContext context, { + required Permission permission, + required String title, + required String message, + required String deniedSnackbarMessage, + }) async { + final status = await permission.status; + if (status.isGranted) return true; + + if (status.isPermanentlyDenied) { + if (!context.mounted) return false; + final shouldOpenSettings = await _showSettingsDialog(context, title: title, message: message); + if (shouldOpenSettings) await openAppSettings(); + return false; + } + + final result = await permission.request(); + if (result.isGranted) return true; + if (!context.mounted) return false; + + if (result.isPermanentlyDenied) { + final shouldOpenSettings = await _showSettingsDialog(context, title: title, message: message); + if (shouldOpenSettings) await openAppSettings(); + } else if (result.isDenied) { + _showDeniedSnackbar(context, deniedSnackbarMessage); + } + return false; + }Then each public method becomes a one-liner delegating to
_requestPermission.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/core/services/permission_service.dart` around lines 176 - 233, Extract the duplicated logic in requestCameraPermission/requestMicrophonePermission/requestSpeechRecognitionPermission into a single private helper (e.g. _requestPermission) that accepts the Permission object and the title/message/snackbar text, then replace each public method with a call that passes Permission.camera/Permission.microphone/Permission.speech and the appropriate strings; ensure _requestPermission handles status.isGranted, status.isPermanentlyDenied (showing _showSettingsDialog + openAppSettings), requests permission, handles result.isGranted/result.isPermanentlyDenied/result.isDenied, checks context.mounted before UI actions, and returns the same boolean semantics as the original methods.examples/react-native/RunAnywhereAI/src/screens/VLMScreen.tsx (1)
93-99: handleMainAction also depends on [vlm] — recreated every render.
Same issue as above — the vlm object identity changes on every render. Since this is passed to a TouchableOpacity.onPress, the constant recreation is wasteful but not infinite-looping. A consistent fix would be to depend on the specific vlm properties used.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/react-native/RunAnywhereAI/src/screens/VLMScreen.tsx` around lines 93 - 99, The handleMainAction callback is being recreated every render because it depends on the whole vlm object; change its dependency to the specific properties/methods it uses instead of vlm. Update the useCallback for handleMainAction to depend on vlm.isAutoStreaming, vlm.toggleAutoStreaming and vlm.captureAndDescribe (or destructure those three from vlm above and depend on the destructured values) so the function is only recreated when the actual inputs change.examples/flutter/RunAnywhereAI/lib/features/vision/vision_hub_view.dart (1)
70-96: Prefer design-system constants over raw Colors.indigo and inline TextStyle.
The Vision Chat tile above correctly uses AppColors.primaryPurple, but this "coming soon" tile reaches outside the design system. Using Colors.indigo directly also makes future theming harder.
Additionally, enabled: false (Line 71) is redundant when no onTap is provided — ListTile.onTap defaults to null, which already prevents interaction.
♻️ Proposed fix
- ListTile( - enabled: false, + ListTile( leading: Container( width: 44, height: 44, decoration: BoxDecoration( - color: Colors.indigo.withValues(alpha: 0.5), + color: AppColors.primaryPurple.withValues(alpha: 0.4), borderRadius: BorderRadius.circular(10), ), child: Icon( Icons.auto_awesome, - color: Colors.white.withValues(alpha: 0.5), + color: Colors.white.withValues(alpha: 0.4), ), ), - title: Text( - 'Image Generation', - style: TextStyle( - color: AppColors.textSecondary(context).withValues(alpha: 0.5), - ), + title: Text( + 'Image Generation', + style: AppTypography.body(context).copyWith( + color: AppColors.textSecondary(context).withValues(alpha: 0.5), + ), ), - subtitle: Text( - 'Create images with Stable Diffusion', - style: TextStyle( - color: AppColors.textSecondary(context).withValues(alpha: 0.5), - ), + subtitle: Text( + 'Create images with Stable Diffusion', + style: AppTypography.caption(context).copyWith( + color: AppColors.textSecondary(context).withValues(alpha: 0.5), + ), ), ),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vision_hub_view.dart` around lines 70 - 96, Replace the hard-coded Colors.indigo and inline TextStyle usages in the ListTile with design-system tokens and remove the redundant enabled: false; specifically, in the Vision "Image Generation" ListTile replace Container.decoration.color: Colors.indigo.withValues(alpha: 0.5) with a design token (e.g., AppColors.primaryPurple.withOpacity(0.5) or the appropriate AppColors.* value), replace Icon and Text TextStyle color calls that use .withValues(alpha: 0.5) with the design system color helpers (e.g., AppColors.textSecondary(context).withOpacity(0.5) or an existing AppTextStyles variant) instead of inline TextStyle, and delete the enabled: false property since ListTile.onTap is null by default; update references in this widget (ListTile, the leading Container, Icon, title Text, subtitle Text) accordingly.examples/flutter/RunAnywhereAI/lib/features/vision/vlm_camera_view.dart (1)
316-328: Unnecessary unawaited wrapper around snackbar.
ScaffoldMessenger.of(context).showSnackBar(...) is synchronous (returns ScaffoldFeatureController, not a Future). The .closed.then((_) => null) chain creates a Future that is then unawaited — but there's no reason to reference .closed at all. Simplify to a direct showSnackBar call.
🔧 Simplified snackbar
onPressed: () { unawaited(Clipboard.setData(ClipboardData(text: _viewModel.currentDescription))); - unawaited( - ScaffoldMessenger.of(context).showSnackBar( - const SnackBar( - content: Text('Description copied to clipboard'), - duration: Duration(seconds: 2), - ), - ).closed.then((_) => null), + ScaffoldMessenger.of(context).showSnackBar( + const SnackBar( + content: Text('Description copied to clipboard'), + duration: Duration(seconds: 2), + ), ); },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_camera_view.dart` around lines 316 - 328, The onPressed handler wraps ScaffoldMessenger.of(context).showSnackBar(...).closed in an unnecessary unawaited Future chain; remove the .closed.then(...) and the unawaited wrapper for the snackbar call and just call ScaffoldMessenger.of(context).showSnackBar(...) directly while keeping the Clipboard.setData call (Clipboard.setData(ClipboardData(text: _viewModel.currentDescription))) as-is; update the onPressed closure containing Clipboard.setData and ScaffoldMessenger.of(context).showSnackBar to remove references to .closed and unawaited around the snackbar.sdk/runanywhere-flutter/packages/runanywhere/lib/public/extensions/runanywhere_frameworks.dart (1)
37-97: Inconsistent indentation within getFrameworks method body.
Lines 39–82 (the switch block) are indented at 2 spaces from the method boundary, while lines 86–96 (the model-filtering loop) use the expected 4-space indent. This makes the method body look like two different scopes. Likely a merge/rebase artifact.
🔧 Re-indent the switch block to match the rest of the method
static Future<List<InferenceFramework>> getFrameworks( - SDKComponent capability) async { - final frameworks = <InferenceFramework>{}; - - // Map capability to model categories - final Set<ModelCategory> relevantCategories; - - switch (capability) { - case SDKComponent.llm: - relevantCategories = { - ModelCategory.language, - ModelCategory.multimodal - }; - break; - - case SDKComponent.stt: - relevantCategories = {ModelCategory.speechRecognition}; - break; - - case SDKComponent.tts: - relevantCategories = {ModelCategory.speechSynthesis}; - break; - - case SDKComponent.vad: - relevantCategories = {ModelCategory.audio}; - break; - - case SDKComponent.voice: - relevantCategories = { - ModelCategory.language, - ModelCategory.speechRecognition, - ModelCategory.speechSynthesis - }; - break; - - case SDKComponent.embedding: - relevantCategories = { - ModelCategory.language, - ModelCategory.multimodal - }; - break; - - case SDKComponent.vlm: - relevantCategories = {ModelCategory.multimodal}; - break; - } - - + SDKComponent capability) async { + final frameworks = <InferenceFramework>{}; + + // Map capability to model categories + final Set<ModelCategory> relevantCategories; + + switch (capability) { + case SDKComponent.llm: + relevantCategories = { + ModelCategory.language, + ModelCategory.multimodal, + }; + break; + + case SDKComponent.stt: + relevantCategories = {ModelCategory.speechRecognition}; + break; + + case SDKComponent.tts: + relevantCategories = {ModelCategory.speechSynthesis}; + break; + + case SDKComponent.vad: + relevantCategories = {ModelCategory.audio}; + break; + + case SDKComponent.voice: + relevantCategories = { + ModelCategory.language, + ModelCategory.speechRecognition, + ModelCategory.speechSynthesis, + }; + break; + + case SDKComponent.embedding: + relevantCategories = { + ModelCategory.language, + ModelCategory.multimodal, + }; + break; + + case SDKComponent.vlm: + relevantCategories = {ModelCategory.multimodal}; + break; + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/public/extensions/runanywhere_frameworks.dart` around lines 37 - 97, The switch block in getFrameworks is mis-indented (2 spaces) causing inconsistent scope appearance; re-indent the entire switch and its case blocks (the assignment to relevantCategories for each SDKComponent case) to match the method's 4-space indentation, ensuring the declaration of relevantCategories and subsequent logic (the for loop over RunAnywhere.availableModels, use of frameworks, and building result) are consistently indented and aligned with the rest of the method body.sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCPP.ts (1)
230-297: addVLMModel largely duplicates addModel — consider extracting shared logic.
The file-existence check, ModelInfo construction, and ModelRegistry.registerModel call are repeated almost verbatim between addModel (Lines 143–206) and addVLMModel (Lines 230–297). The only differences are the hardcoded category, supportsThinking, and mmprojUrl in metadata.
Extracting a private helper (e.g., _registerModelInfo(options, overrides)) would reduce the duplication while keeping both public methods thin and readable. Not blocking, but worth addressing before the two paths inevitably drift out of sync.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCPP.ts` around lines 230 - 297, addVLMModel duplicates addModel: extract the shared file-existence, ModelInfo construction and ModelRegistry.registerModel logic into a private helper (e.g., _registerModelInfo or createModelInfoAndRegister) that takes the incoming options and an overrides object for category, supportsThinking, and metadata.mmprojUrl; have addModel and addVLMModel call this helper, passing LlamaCPPVLMModelOptions/ModelOptions respectively, reuse FileSystem checks (modelExists/getModelPath), populate common fields (id, name, format, localPath, isDownloaded, createdAt/updatedAt, etc.) inside the helper, and call ModelRegistry.registerModel(modelInfo) there to keep both public methods thin and prevent divergence.sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp (2)
100-120: base64Decode rebuilds its lookup table on every call.
The 256-element lookup table T is reconstructed on each invocation. For hot paths processing streaming frames, consider making it static const or moving it to namespace scope.
Move lookup table to static scope
std::vector<uint8_t> base64Decode(const std::string& encoded) { static const std::string base64_chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"; + static const auto T = []() { + std::vector<int> table(256, -1); + for (int i = 0; i < 64; i++) table[base64_chars[i]] = i; + return table; + }(); + std::vector<uint8_t> decoded; - std::vector<int> T(256, -1); - for (int i = 0; i < 64; i++) T[base64_chars[i]] = i;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp` around lines 100 - 120, The base64Decode function rebuilds the 256-entry lookup table T on every call; move the table out of the function so it is constructed once (e.g., declare a static const std::array<int,256> or a namespace-scope const vector initialized from base64_chars) and have base64Decode reference that table instead of creating T locally; keep base64_chars accessible (static or namespace-scope) and preserve the existing mapping logic used by base64Decode.
540-556: Stream callbacks capture references to stack-local variables — safe only because processStream blocks.
streamCallbacks.onToken and streamCallbacks.onError capture fullResponse and streamError by reference. This is safe only if VLMBridge::shared().processStream() is synchronous and blocks until streaming completes (matching the LLM generateStream pattern on Lines 324-346). If the streaming call is ever made asynchronous, these references become dangling.
Worth a brief comment to document this invariant.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp` around lines 540 - 556, The stream callbacks (VLMStreamCallbacks streamCallbacks with onToken and onError) currently capture fullResponse and streamError by reference which is only safe because VLMBridge::shared().processStream() is synchronous; add a concise comment above the streamCallbacks definition stating that invariant (i.e., these references must not outlive the call because processStream blocks, matching the generateStream pattern) and either change the captures to by-value or use an owning/shared string (e.g., std::shared_ptr<std::string>) if you plan to make processing asynchronous in the future; ensure the comment references onToken, onError, fullResponse, streamError and VLMBridge::shared().processStream() so future maintainers notice the requirement.sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart (2)
672-708: Duplicated image struct setup between _processInIsolate and _vlmStreamingIsolateEntry.
Lines 672-708 are nearly identical to Lines 514-548. This duplication increases maintenance burden — any format change needs updating in both places.
Consider extracting a shared helper function for image struct initialization.
Proposed extraction
/// Shared helper to populate image struct and allocate native memory. /// Returns allocated pointers that must be freed by the caller. ({Pointer<Utf8>? filePathPtr, Pointer<Uint8>? pixelDataPtr, Pointer<Utf8>? base64DataPtr}) _setupImageStruct( Pointer<RacVlmImageStruct> imagePtr, int imageFormat, String? filePath, Uint8List? pixelData, int width, int height, String? base64Data, ) { imagePtr.ref.format = imageFormat; imagePtr.ref.width = width; imagePtr.ref.height = height; Pointer<Utf8>? filePathPtr; Pointer<Uint8>? pixelDataPtr; Pointer<Utf8>? base64DataPtr; if (imageFormat == RacVlmImageFormat.filePath && filePath != null) { filePathPtr = filePath.toNativeUtf8(); imagePtr.ref.filePath = filePathPtr; imagePtr.ref.pixelData = nullptr; imagePtr.ref.base64Data = nullptr; imagePtr.ref.dataSize = 0; } else if (imageFormat == RacVlmImageFormat.rgbPixels && pixelData != null) { pixelDataPtr = calloc<Uint8>(pixelData.length); pixelDataPtr.asTypedList(pixelData.length).setAll(0, pixelData); imagePtr.ref.filePath = nullptr; imagePtr.ref.pixelData = pixelDataPtr; imagePtr.ref.base64Data = nullptr; imagePtr.ref.dataSize = pixelData.length; } else if (imageFormat == RacVlmImageFormat.base64 && base64Data != null) { base64DataPtr = base64Data.toNativeUtf8(); imagePtr.ref.filePath = nullptr; imagePtr.ref.pixelData = nullptr; imagePtr.ref.base64Data = base64DataPtr; imagePtr.ref.dataSize = base64Data.length; } return (filePathPtr: filePathPtr, pixelDataPtr: pixelDataPtr, base64DataPtr: base64DataPtr); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart` around lines 672 - 708, There is duplicated image-struct initialization in _processInIsolate and _vlmStreamingIsolateEntry; extract a shared helper (e.g., _setupImageStruct) that accepts Pointer<RacVlmImageStruct>, imageFormat, filePath, pixelData, width, height, base64Data and returns the allocated pointers (filePathPtr, pixelDataPtr, base64DataPtr) so both callers call the helper to populate imagePtr.ref and receive pointers to free later; update both _processInIsolate and _vlmStreamingIsolateEntry to use this helper and ensure callers still free any returned native allocations.
526-532: Byte-by-byte pixel data copy could use bulk copy.
The loop at Lines 530-532 copies pixel data one byte at a time. For large images (e.g., 1920×1080 RGB = ~6MB), this is slow. Dart FFI provides asTypedList for efficient bulk operations.
Proposed fix using bulk copy
```diff
 // Allocate native memory for pixel data
 pixelDataPtr = calloc<Uint8>(pixelData.length);
-for (int i = 0; i < pixelData.length; i++) {
-  pixelDataPtr![i] = pixelData[i];
-}
+pixelDataPtr!.asTypedList(pixelData.length).setAll(0, pixelData);
```
Same applies to the duplicated code in _vlmStreamingIsolateEntry (Lines 688-691).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart` around lines 526 - 532, The byte-by-byte loop copying pixelData into the allocated native buffer is slow; replace the per-byte loop in the branch handling RacVlmImageFormat.rgbPixels (where pixelDataPtr is allocated with calloc<Uint8>(pixelData.length)) with a bulk copy using pixelDataPtr!.asTypedList(pixelData.length).setAll(0, pixelData) or equivalent to copy the Uint8List in one operation; make the identical change in the duplicated copy inside _vlmStreamingIsolateEntry so both locations use asTypedList bulk copying instead of manual for-loops.examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart (3)
143-184: Same temp file cleanup concern in single-capture mode.
takePicture() on Line 155 also creates a temp file. For consistency, consider cleaning up after processing; a sketch follows the AI prompt below.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart` around lines 143 - 184, The describeCurrentFrame method currently calls _cameraController!.takePicture() which leaves a temp file (xFile); after processing the file (creating sdk.VLMImage.filePath and streaming results) ensure the temp file is deleted to avoid leaks by adding an await xFile.delete() (or File(xFile.path).delete()) in the finally block of describeCurrentFrame so the file is removed whether processing succeeds or fails, taking care not to delete it before the SDK has read it.
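As referenced above, a minimal sketch of that cleanup. The helper signature and the describe callback are illustrative stand-ins, not the view model's actual members:

```dart
import 'dart:io';

import 'package:camera/camera.dart';

/// Sketch only: capture one frame, let the SDK read it, then delete the temp file.
/// `describe` stands in for the existing VLM call in the view model.
Future<void> captureDescribeAndCleanUp(
  CameraController controller,
  Future<void> Function(String imagePath) describe,
) async {
  final xFile = await controller.takePicture();
  try {
    await describe(xFile.path);
  } finally {
    // takePicture() writes to the temp directory; remove the file once processed.
    try {
      await File(xFile.path).delete();
    } catch (_) {
      // Best-effort cleanup; ignore deletion failures.
    }
  }
}
```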
87-92: disposeCamera nulls the controller while the async dispose is still in flight.
_cameraController?.dispose() is async, but _cameraController is set to null synchronously on the next line. If any other code path checks _cameraController or _isCameraInitialized during the async disposal, it may see inconsistent state (e.g., controller is null but the underlying camera resource is still being released). For an example app this is low-risk, but ideally disposeCamera should be async and await the dispose.
Proposed fix
```diff
-void disposeCamera() {
-  unawaited(_cameraController?.dispose());
+Future<void> disposeCamera() async {
+  await _cameraController?.dispose();
   _cameraController = null;
   _isCameraInitialized = false;
   notifyListeners();
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart` around lines 87 - 92, disposeCamera currently calls _cameraController?.dispose() without awaiting and then immediately nulls _cameraController and clears _isCameraInitialized, which can leave observers seeing inconsistent state; change disposeCamera to be async, await _cameraController?.dispose() (remove unawaited), then set _cameraController = null and _isCameraInitialized = false, and finally call notifyListeners(); update any callers to await disposeCamera() where needed.
266-309: Auto-streaming takePicture() accumulates temp files without cleanup.
Each 2.5s cycle calls takePicture(), which writes to the device's temp directory. Over extended use of live mode, this accumulates files. Consider deleting the XFile after processing completes.
Proposed cleanup
```diff
 // Capture image from camera
 final xFile = await _cameraController!.takePicture();
+final capturedPath = xFile.path;

 // Create VLMImage from file path
 final image = sdk.VLMImage.filePath(xFile.path);

 // ... process ...

+// Clean up temp file
+try {
+  await File(capturedPath).delete();
+} catch (_) {}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart` around lines 266 - 309, In _describeCurrentFrameForAutoStream, the XFile returned by _cameraController!.takePicture() is not deleted causing temp-file accumulation; after processing (in the finally block) delete the captured file (use xFile.delete() or File(xFile.path).delete() awaited) and swallow/log any deletion errors without setting _error so auto-stream behavior remains non-critical; ensure you only attempt deletion if xFile is non-null and do the delete inside the existing finally block so cleanup always runs even on exceptions.sdk/runanywhere-flutter/packages/runanywhere/lib/public/runanywhere.dart (2)
1559-1615: Consider using a switch expression for exhaustive pattern matching on the sealed class.
The if/else chain handles all VLMImageFormat subtypes, but the final else at Line 1603 is technically unreachable since VLMImageFormat is sealed with only three variants. A switch expression would make this exhaustive at compile time.
Proposed refactor
- if (format is VLMImageFormatFilePath) { - bridgeResult = await DartBridge.vlm.processImage( - imageFormat: RacVlmImageFormat.filePath, - filePath: format.path, - ... - ); - } else if (format is VLMImageFormatRgbPixels) { - ... - } else if (format is VLMImageFormatBase64) { - ... - } else { - throw SDKError.vlmInvalidImage('Unsupported image format'); - } + bridgeResult = switch (format) { + VLMImageFormatFilePath(:final path) => await DartBridge.vlm.processImage( + imageFormat: RacVlmImageFormat.filePath, + filePath: path, + prompt: prompt, + maxTokens: maxTokens, + temperature: temperature, + topP: topP, + useGpu: useGpu, + ), + VLMImageFormatRgbPixels(:final data, :final width, :final height) => await DartBridge.vlm.processImage( + imageFormat: RacVlmImageFormat.rgbPixels, + pixelData: data, + width: width, + height: height, + prompt: prompt, + maxTokens: maxTokens, + temperature: temperature, + topP: topP, + useGpu: useGpu, + ), + VLMImageFormatBase64(:final encoded) => await DartBridge.vlm.processImage( + imageFormat: RacVlmImageFormat.base64, + base64Data: encoded, + prompt: prompt, + maxTokens: maxTokens, + temperature: temperature, + topP: topP, + useGpu: useGpu, + ), + };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/public/runanywhere.dart` around lines 1559 - 1615, The if/else chain in _processImageViaBridge that checks image.format (VLMImageFormatFilePath, VLMImageFormatRgbPixels, VLMImageFormatBase64) should be replaced with a switch expression on image.format to get compiler-enforced exhaustiveness; inside each case call the corresponding DartBridge.vlm.processImage with the correct parameters and assign to bridgeResult, and remove the final unreachable else/SDKError.vlmInvalidImage branch so the switch guarantees all variants are handled at compile time.
1277-1403: Broadcast StreamController can drop early tokens before the caller subscribes.
StreamController<String>.broadcast() (Line 1293) does not buffer events. The metricsFuture at Line 1339 subscribes via .toList() immediately, so it captures all tokens. However, the caller receives the VLMStreamingResult asynchronously — by the time they attach a listener to result.stream, some tokens may have already been emitted and lost from their perspective.
This matches the existing LLM generateStream pattern (Line 1867), so it's a pre-existing design choice. In practice the async gap is tiny and unlikely to cause visible token loss, but worth noting.
If this becomes an issue, switching to a single-subscription controller with a replay mechanism, or ensuring the bridge stream doesn't start emitting until the caller subscribes, would fix it.
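For reference, a minimal sketch of the deferred-start variant with a single-subscription controller; the names here are generic and not the SDK's actual internals:

```dart
import 'dart:async';

/// Defers starting the underlying token source until the caller listens,
/// so no tokens can be emitted (and lost) before a listener is attached.
/// `startBridgeStream` is a stand-in for whatever actually produces tokens.
Stream<String> deferredTokenStream(Stream<String> Function() startBridgeStream) {
  StreamSubscription<String>? subscription;
  late final StreamController<String> controller;

  controller = StreamController<String>(
    onListen: () {
      // Only now does the bridge start emitting.
      subscription = startBridgeStream().listen(
        controller.add,
        onError: controller.addError,
        onDone: controller.close,
      );
    },
    onCancel: () => subscription?.cancel(),
  );

  return controller.stream;
}
```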
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/public/runanywhere.dart` around lines 1277 - 1403, processImageStream currently starts _processImageStreamViaBridge and emits tokens before callers may subscribe to controller.stream, risking dropped tokens; change the logic to defer creating/subscribing to tokenStream until the returned stream is listened to by moving the call to _processImageStreamViaBridge and the subscription into controller.onListen (use controller.onCancel to cancel subscription and close controller), keep collecting allTokens and telemetry as before, and ensure the cancel closure still calls DartBridge.vlm.cancel() and unsubscribes the subscription created in onListen.sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere+VLM.ts (2)
324-330: promptTokens is a rough heuristic, not an actual token count.
Math.ceil(prompt.length / 4) is an approximation that could be quite inaccurate (especially for non-English text or with image tokens). This metric gets surfaced as VLMResult.promptTokens, which consumers may treat as authoritative.
Consider either documenting this as an estimate or setting it to 0 with a comment (as the Dart streaming path does on Line 1372 of runanywhere.dart).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere`+VLM.ts around lines 324 - 330, The VLMResult currently exposes promptTokens using a rough heuristic (Math.ceil(prompt.length / 4)); update the construction of finalResult in RunAnywhere+VLM.ts so promptTokens is not presented as an authoritative token count—either set promptTokens to 0 (matching the Dart streaming path behavior) and add a clear inline comment that accurate token counts are not available here, or keep the field but rename/annotate it as an estimated value; modify the finalResult assignment (symbol: VLMResult / variable: finalResult) accordingly so consumers aren't misled by the heuristic.
448-454: uint8ArrayToBase64 may be slow for large pixel buffers.
Building a string via String.fromCharCode in a byte-by-byte loop creates O(n²) intermediate strings in some JS engines. For VLM images that can be megabytes of RGB data, this could cause noticeable latency or GC pressure. Consider a chunked approach:
Proposed chunked implementation
function uint8ArrayToBase64(bytes: Uint8Array): string { - // Use btoa with binary string conversion - let binaryString = ''; - for (let i = 0; i < bytes.length; i++) { - binaryString += String.fromCharCode(bytes[i]); - } - return btoa(binaryString); + // Process in chunks to avoid stack overflow and reduce string concatenation overhead + const chunkSize = 8192; + const chunks: string[] = []; + for (let i = 0; i < bytes.length; i += chunkSize) { + const slice = bytes.subarray(i, Math.min(i + chunkSize, bytes.length)); + chunks.push(String.fromCharCode(...slice)); + } + return btoa(chunks.join('')); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere`+VLM.ts around lines 448 - 454, The current uint8ArrayToBase64 builds a single string by appending one char per byte which is O(n^2) and slow for large buffers; update uint8ArrayToBase64 to convert the Uint8Array to a binary string in fixed-size chunks (e.g., chunkSize ~ 0x8000/32768 bytes), for each chunk call String.fromCharCode on the chunk (via Function.apply or equivalent safe chunk-to-string conversion), push each chunk-string into an array, join the array once to produce the full binary string, then call btoa on that joined string; ensure you reference and update the function uint8ArrayToBase64 so large VLM image buffers are processed without creating quadratic intermediate strings.sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.hpp (1)
49-57: VLMImageInput::pixel_data is a non-owning raw pointer — document lifetime requirements.
pixel_data is const uint8_t* while file_path and base64_data are std::string (owning). Callers must ensure the underlying buffer outlives the VLMImageInput. This already caused a bug in HybridRunAnywhereLlama.cpp (the static buffer issue). Consider adding a brief comment or storing std::vector<uint8_t> for owned pixel data.
Option: use owned storage
```diff
 struct VLMImageInput {
   rac_vlm_image_format_t format;
   std::string file_path;
-  const uint8_t* pixel_data = nullptr;
+  std::vector<uint8_t> pixel_data;  // Owned pixel data (RGB)
   std::string base64_data;
   uint32_t width = 0;
   uint32_t height = 0;
-  size_t data_size = 0;
 };
```
This eliminates the lifetime management burden on callers. The C API call sites would use .pixel_data.data() and .pixel_data.size().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.hpp` around lines 49 - 57, The VLMImageInput::pixel_data field is a non-owning const uint8_t* but there is no documentation of its lifetime requirements which led to a static buffer bug; update VLMBridge.hpp by either (A) adding a clear comment on VLMImageInput::pixel_data stating callers must ensure the pointed-to buffer outlives the VLMImageInput instance (mention HybridRunAnywhereLlama.cpp as a cautionary example), or (B) change pixel storage to an owning std::vector<uint8_t> (e.g., add std::vector<uint8_t> pixel_data_owned and use pixel_data_owned.data()/size() for C API calls) and update usages that construct VLMImageInput (including code in HybridRunAnywhereLlama.cpp) to populate the vector so callers no longer manage lifetimes.
| CURRENT_PROJECT_VERSION = "$(FLUTTER_BUILD_NUMBER)"; | ||
| DEVELOPMENT_TEAM = L86FH3K93L; | ||
| ENABLE_BITCODE = NO; | ||
| ENABLE_RESOURCE_ACCESS_CAMERA = YES; |
🧩 Analysis chain
🌐 Web query:
ENABLE_RESOURCE_ACCESS_CAMERA iOS Xcode build setting
💡 Result:
ENABLE_RESOURCE_ACCESS_CAMERA is not an iOS permission setting. It’s an Xcode build setting used for macOS (and Mac Catalyst) sandboxed targets to control whether Xcode adds the camera entitlement (com.apple.security.device.camera) to the target’s entitlements when App Sandbox is enabled. (developer.apple.com)
For a normal iOS app, camera access is granted via:
- an Info.plist purpose string:
NSCameraUsageDescription, and - requesting authorization at runtime (via AVFoundation APIs).
If the purpose string is missing, the app can terminate when it tries to access the camera. (developer.apple.com)
Where to change it (macOS / Mac Catalyst):
- Target → Signing & Capabilities → App Sandbox → Hardware → Camera (checkbox). (developer.apple.com)
Citations:
- 1: https://developer.apple.com/library/archive/documentation/Miscellaneous/Reference/EntitlementKeyReference/Chapters/EnablingAppSandbox.html?utm_source=openai
- 2: https://developer.apple.com/library/archive/qa/qa1937/_index.html?utm_source=openai
- 3: https://developer.apple.com/documentation/xcode/configuring-the-macos-app-sandbox/?utm_source=openai
Remove ENABLE_RESOURCE_ACCESS_CAMERA from iOS build configurations.
ENABLE_RESOURCE_ACCESS_CAMERA is a macOS/Mac Catalyst-only build setting used to control the App Sandbox camera entitlement. It has no effect on iOS apps and should not be present in the Runner's iOS build configurations (Profile, Debug, Release at lines 475, 660, 685). iOS camera access is controlled exclusively via NSCameraUsageDescription in Info.plist and runtime permission requests—there is no build-time equivalent for iOS. Remove this setting from all iOS configurations to keep the build settings clean and prevent confusion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/flutter/RunAnywhereAI/ios/Runner.xcodeproj/project.pbxproj` at line
475, Remove the macOS-only build setting ENABLE_RESOURCE_ACCESS_CAMERA from the
Runner iOS build configurations (Profile, Debug, Release) in the Xcode project
so it is not present in iOS targets; locate occurrences of
ENABLE_RESOURCE_ACCESS_CAMERA in the project.pbxproj and delete those entries
for the iOS configurations, and ensure camera access is instead
documented/handled via NSCameraUsageDescription in Info.plist and runtime
permission requests.
| // SmolVLM 500M - Ultra-lightweight VLM for mobile (~500MB total) | ||
| RunAnywhere.registerModel( | ||
| id: 'smolvlm-500m-instruct-q8_0', | ||
| name: 'SmolVLM 500M Instruct', | ||
| url: Uri.parse( | ||
| 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-vlm-models-v1/smolvlm-500m-instruct-q8_0.tar.gz'), | ||
| framework: InferenceFramework.llamaCpp, | ||
| modality: ModelCategory.multimodal, | ||
| artifactType: ModelArtifactType.tarGzArchive( | ||
| structure: ArchiveStructure.directoryBased, | ||
| ), | ||
| memoryRequirement: 600000000, |
Inline comment says ~500MB total but memoryRequirement is 600 MB — correct the comment.
The SmolVLM-500M-Instruct Q8_0 GGUF weighs ~437 MB and the mmproj file is ~109 MB, for a combined artifact size of ~546 MB. The memoryRequirement of 600,000,000 (600 MB) is a reasonable runtime figure, but the adjacent inline comment ~500MB total is understated and inconsistent with it.
📝 Suggested comment correction
- // SmolVLM 500M - Ultra-lightweight VLM for mobile (~500MB total)
+ // SmolVLM 500M - Ultra-lightweight VLM for mobile (~600MB total: ~437MB model + ~109MB mmproj + runtime overhead)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/flutter/RunAnywhereAI/lib/app/runanywhere_ai_app.dart` around lines
218 - 229, Update the inline comment above the RunAnywhere.registerModel call
for id 'smolvlm-500m-instruct-q8_0' to reflect the actual combined artifact size
(~546 MB) instead of "~500MB total"; mention the breakdown (≈437 MB GGUF + ≈109
MB mmproj) and keep the memoryRequirement: 600000000 as the runtime figure. This
change should be made adjacent to the RunAnywhere.registerModel block so the
comment and the memoryRequirement value are consistent.
| # Camera access for VLM | ||
| camera: ^0.11.0 | ||
| # Image picker for gallery photos | ||
| image_picker: ^1.0.0 | ||
| # Image manipulation (BGRA to RGB pixel conversion) | ||
| image: ^4.0.0 |
🧩 Analysis chain
🌐 Web query:
latest stable version camera package pub.dev 2026
💡 Result:
The latest stable version of the Flutter camera package on pub.dev (as of February 17, 2026) is camera 0.11.3+1. [1]
🌐 Web query:
latest stable version image_picker package pub.dev 2026
💡 Result:
The latest stable version of Flutter’s image_picker package on pub.dev (as of February 17, 2026) is 1.2.1. [1][2]
Use:
dependencies:
image_picker: ^1.2.1Sources: [1] [2]
🌐 Web query:
latest stable version image package pub.dev 2026
💡 Result:
The latest stable version of the Dart image package on pub.dev (as of Feb 17, 2026) is 4.7.2 (published ~58 days ago). [1]
pubspec.yaml:
dependencies:
image: ^4.7.2🌐 Web query:
Flutter camera NSCameraUsageDescription Info.plist iOS requirements
💡 Result:
On iOS, if your Flutter app (or any bundled dependency) accesses the camera / microphone / photo library, you must add the corresponding privacy usage description keys to ios/Runner/Info.plist (with non-empty strings). If you don’t, the app can exit/crash at runtime when the API is accessed. [1]
Minimum required for Flutter camera plugin
Add (at least) these keys (the plugin requires iOS 10+): [2]
<key>NSCameraUsageDescription</key>
<string>We use the camera to take photos/videos.</string>
<key>NSMicrophoneUsageDescription</key>
<string>We use the microphone when recording video with audio.</string>If you only take photos (no audio/video)
If you disable audio in CameraController(enableAudio: false), you can avoid needing the microphone permission string; otherwise iOS will require it when audio is enabled/used. [2][3]
If you save to (or read from) the user’s Photos library
Apple’s guidance: [4][1]
- Only adding/saving images: use
NSPhotoLibraryAddUsageDescription - Reading (or read/write) the library: use
NSPhotoLibraryUsageDescription
Example:
<key>NSPhotoLibraryAddUsageDescription</key>
<string>We save your captured photos to your library.</string>Sources: Apple Info.plist privacy keys reference [1], Apple Photos privacy guidance [4], Flutter camera plugin iOS notes [2], Flutter/iOS camera audio toggle note [3].
🌐 Web query:
Flutter image_picker NSPhotoLibraryUsageDescription iOS requirements
💡 Result:
For Flutter’s image_picker on iOS, you must add privacy “purpose strings” to ios/Runner/Info.plist (inside the <dict>). The plugin’s iOS setup explicitly requires:
NSPhotoLibraryUsageDescription(Photo Library) — include it even if you passrequestFullMetadata: false; App Store processing can still reject builds that reference photo APIs without this key. [1][2]NSCameraUsageDescription(Camera) — required if you useImageSource.camera. [1]NSMicrophoneUsageDescription(Microphone) — required only if you record video with audio. (Commonly added alongside camera.) [3]
Example:
<key>NSPhotoLibraryUsageDescription</key>
<string>We need access to your photo library to let you choose a photo.</string>
<key>NSCameraUsageDescription</key>
<string>We need camera access to let you take a photo.</string>
<key>NSMicrophoneUsageDescription</key>
<string>We need microphone access to record video audio.</string>
Notes:
- On iOS 14+, image_picker uses PHPicker for library picking. [1]
- There's a known iOS 14+ Simulator limitation where picking HEIC images may not work; test on a real device or use non‑HEIC images. [1]
Sources: [1] image_picker iOS setup on pub.dev, [2] Flutter issue about App Store requiring NSPhotoLibraryUsageDescription, [3] Stack Overflow guidance on adding iOS privacy keys.
Update all three package versions to current stable releases.
The pinned versions are significantly outdated:
- camera: update from ^0.11.0 to ^0.11.3
- image_picker: update from ^1.0.0 to ^1.2.1
- image: update from ^4.0.0 to ^4.7.2
These outdated lower bounds may cause you to miss important security patches and bug fixes. The caret constraints will floor resolution at the specified versions, so updating them ensures your app benefits from recent patches.
Regarding iOS integration: the required Info.plist entries are accurate but incomplete. For completeness:
- camera plugin requires NSCameraUsageDescription and NSMicrophoneUsageDescription (the latter can be omitted if enableAudio: false is used)
- image_picker requires NSPhotoLibraryUsageDescription (even with requestFullMetadata: false) and NSCameraUsageDescription if the app uses ImageSource.camera; NSMicrophoneUsageDescription is needed only if recording video with audio
Verify that all required keys are present in ios/Runner/Info.plist with appropriate non-empty strings, as missing keys will cause runtime crashes when the respective APIs are accessed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/flutter/RunAnywhereAI/pubspec.yaml` around lines 42 - 47, Update the
pinned Flutter package versions in pubspec.yaml by bumping camera from ^0.11.0
to ^0.11.3, image_picker from ^1.0.0 to ^1.2.1, and image from ^4.0.0 to ^4.7.2;
then run flutter pub get to lock the new versions. Also verify iOS Info.plist
contains the required non-empty usage keys for these plugins—add
NSCameraUsageDescription and NSMicrophoneUsageDescription (omit microphone only
if you set enableAudio: false), add NSPhotoLibraryUsageDescription for
image_picker and NSCameraUsageDescription if using ImageSource.camera—so the app
won’t crash when accessing camera, microphone, or photos.
|
|
||
| <uses-permission android:name="android.permission.INTERNET" /> | ||
| <uses-permission android:name="android.permission.RECORD_AUDIO" /> | ||
| <uses-permission android:name="android.permission.CAMERA" /> |
Android photo library permissions are missing.
The iOS side adds NSPhotoLibraryUsageDescription and includes PhotoLibrary in setup_permissions, indicating photo library access is intentional. Android requires explicit manifest declarations for media read access (READ_MEDIA_IMAGES/READ_MEDIA_VIDEO on API 33+, or READ_EXTERNAL_STORAGE on older versions). Without these, any photo library permission request will silently return denied on Android.
🐛 Proposed fix
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.CAMERA" />
+ <!-- For photo library access on Android 13+ -->
+ <uses-permission android:name="android.permission.READ_MEDIA_IMAGES" />
+ <uses-permission android:name="android.permission.READ_MEDIA_VIDEO" />
+ <!-- For photo library access on Android < 13 -->
+ <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"
+ android:maxSdkVersion="32" />
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <uses-permission android:name="android.permission.CAMERA" /> | |
| <uses-permission android:name="android.permission.INTERNET" /> | |
| <uses-permission android:name="android.permission.RECORD_AUDIO" /> | |
| <uses-permission android:name="android.permission.CAMERA" /> | |
| <!-- For photo library access on Android 13+ --> | |
| <uses-permission android:name="android.permission.READ_MEDIA_IMAGES" /> | |
| <uses-permission android:name="android.permission.READ_MEDIA_VIDEO" /> | |
| <!-- For photo library access on Android < 13 --> | |
| <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" | |
| android:maxSdkVersion="32" /> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/react-native/RunAnywhereAI/android/app/src/main/AndroidManifest.xml`
at line 5, The AndroidManifest currently only requests CAMERA but lacks
photo-library permissions, causing Android photo access requests to be denied;
update the manifest by adding the appropriate read permissions: include
READ_EXTERNAL_STORAGE for older Android versions and READ_MEDIA_IMAGES (and
READ_MEDIA_VIDEO if video access is needed) for API 33+, alongside the existing
CAMERA <uses-permission> entries so runtime requests succeed (add both
permission names as <uses-permission> entries so the platform will use the
correct one per OS version).
| <key>NSCameraUsageDescription</key> | ||
| <string>Vision AI needs camera access to describe what you see</string> | ||
| <key>NSMicrophoneUsageDescription</key> | ||
| <string>RunAnywhere needs access to your microphone for speech-to-text transcription.</string> | ||
| <key>NSPhotoLibraryUsageDescription</key> | ||
| <string>Vision AI needs photo library access to describe images</string> |
Inconsistent app name in permission descriptions.
The new descriptions reference "Vision AI" while the existing NSMicrophoneUsageDescription references "RunAnywhere". Users will see different app names across permission prompts.
✏️ Proposed fix
- <string>Vision AI needs camera access to describe what you see</string>
+ <string>RunAnywhere needs camera access to describe what you see</string>
...
- <string>Vision AI needs photo library access to describe images</string>
+ <string>RunAnywhere needs photo library access to describe images</string>
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <key>NSCameraUsageDescription</key> | |
| <string>Vision AI needs camera access to describe what you see</string> | |
| <key>NSMicrophoneUsageDescription</key> | |
| <string>RunAnywhere needs access to your microphone for speech-to-text transcription.</string> | |
| <key>NSPhotoLibraryUsageDescription</key> | |
| <string>Vision AI needs photo library access to describe images</string> | |
| <key>NSCameraUsageDescription</key> | |
| <string>RunAnywhere needs camera access to describe what you see</string> | |
| <key>NSMicrophoneUsageDescription</key> | |
| <string>RunAnywhere needs access to your microphone for speech-to-text transcription.</string> | |
| <key>NSPhotoLibraryUsageDescription</key> | |
| <string>RunAnywhere needs photo library access to describe images</string> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/react-native/RunAnywhereAI/ios/RunAnywhereAI/Info.plist` around
lines 36 - 41, Permission strings use inconsistent app names ("Vision AI" and
"RunAnywhere"); make them consistent by using the actual app display name
"RunAnywhereAI" in all plist descriptions. Update the values for
NSCameraUsageDescription, NSMicrophoneUsageDescription, and
NSPhotoLibraryUsageDescription so each string references "RunAnywhereAI" (e.g.,
"RunAnywhereAI needs camera access..." and "RunAnywhereAI needs access to your
microphone...") to ensure uniform permission prompts.
| void VLMBridge::cancel() { | ||
| cancellationRequested_ = true; | ||
| if (handle_) { | ||
| rac_vlm_component_cancel(handle_); | ||
| } | ||
| } |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the header file for VLMBridge
find . -name "VLMBridge.h" -type f
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 56
🏁 Script executed:
#!/bin/bash
# Search for cancellationRequested_ to see its declaration and all usages
rg -n 'cancellationRequested_' --type=cpp -B2 -A2
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 6007
🏁 Script executed:
#!/bin/bash
# Read the VLMBridge.cpp file to see the cancel() method and tokenCallback around lines 96-101 and 207-213
wc -l sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.cpp
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 150
🏁 Script executed:
#!/bin/bash
# Read the tokenCallback implementation around lines 207-230
sed -n '207,230p' sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.cpp
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1010
🏁 Script executed:
#!/bin/bash
# Also check if StreamContext is defined and how cancellationRequested_ is accessed
rg -n 'StreamContext' --type=cpp -B2 -A5
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 17584
Use std::atomic<bool> for thread-safe cancellation flag.
cancel() sets cancellationRequested_ = true (likely from the UI/main thread), while the tokenCallback lambda reads it from the streaming/inference thread (line 211). A plain bool provides no cross-thread visibility guarantees — the streaming thread may never observe the cancellation. Update the declaration in VLMBridge.hpp to std::atomic<bool> cancellationRequested_ = false;.
Also applies to: LLMBridge.hpp, LLMBridge.cpp (lines 93-96, 166-169).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.cpp`
around lines 96 - 101, The cancellation flag is not thread-safe: update the
member declaration cancellationRequested_ in VLMBridge.hpp to use
std::atomic<bool> (initialize to false) and adjust any includes (add <atomic>),
then keep VLMBridge::cancel() as-is but rely on the atomic write so the
tokenCallback lambda on the streaming/inference thread will observe the change;
apply the same change to LLMBridge.hpp (and ensure LLMBridge.cpp logic that
reads/writes cancellationRequested_ at lines referenced uses the atomic) so both
bridges use std::atomic<bool> for cross-thread visibility.
| } else if (imageFormat == 1) { | ||
| // RGB pixels format (base64-encoded from JS) | ||
| input.format = RAC_VLM_IMAGE_FORMAT_RGB_PIXELS; | ||
| // Decode base64 to raw bytes | ||
| static std::vector<uint8_t> pixelBuffer; | ||
| pixelBuffer = base64Decode(imageData); | ||
| input.pixel_data = pixelBuffer.data(); | ||
| input.width = static_cast<uint32_t>(imageWidth); | ||
| input.height = static_cast<uint32_t>(imageHeight); | ||
| input.data_size = pixelBuffer.size(); | ||
| } else if (imageFormat == 2) { | ||
| // Base64 format | ||
| input.format = RAC_VLM_IMAGE_FORMAT_BASE64; | ||
| input.base64_data = imageData; | ||
| } else { | ||
| throw std::runtime_error("Invalid image format: " + std::to_string(imageFormat)); | ||
| } | ||
|
|
||
| return input; | ||
| } |
Static pixelBuffer in buildVLMImageInput is a data race and memory hazard.
pixelBuffer is declared static, meaning all concurrent invocations share the same buffer. If two processVLMImage or processVLMImageStream calls execute concurrently (they are not guarded by modelMutex_), the second call will overwrite the buffer while the first is still using input.pixel_data. This causes data corruption.
Additionally, the VLMImageInput stores a raw pointer (input.pixel_data = pixelBuffer.data()) into this static vector. If the vector reallocates on a subsequent call, the previously returned pointer becomes dangling.
Proposed fix: return the buffer alongside the input, or store it in VLMImageInput
-// Build VLMImageInput from JS bridge parameters
-VLMImageInput buildVLMImageInput(int imageFormat, const std::string& imageData,
- int imageWidth, int imageHeight) {
- VLMImageInput input;
+// Build VLMImageInput from JS bridge parameters.
+// For RGB pixel format, decoded data is stored in `pixelStorage` to keep it alive.
+struct VLMImageWithStorage {
+ VLMImageInput input;
+ std::vector<uint8_t> pixelStorage;
+};
+
+VLMImageWithStorage buildVLMImageInput(int imageFormat, const std::string& imageData,
+ int imageWidth, int imageHeight) {
+ VLMImageWithStorage result;
+ auto& input = result.input;
if (imageFormat == 0) {
// File path format
input.format = RAC_VLM_IMAGE_FORMAT_FILE_PATH;
input.file_path = imageData;
} else if (imageFormat == 1) {
// RGB pixels format (base64-encoded from JS)
input.format = RAC_VLM_IMAGE_FORMAT_RGB_PIXELS;
- // Decode base64 to raw bytes
- static std::vector<uint8_t> pixelBuffer;
- pixelBuffer = base64Decode(imageData);
- input.pixel_data = pixelBuffer.data();
+ result.pixelStorage = base64Decode(imageData);
+ input.pixel_data = result.pixelStorage.data();
input.width = static_cast<uint32_t>(imageWidth);
input.height = static_cast<uint32_t>(imageHeight);
- input.data_size = pixelBuffer.size();
+ input.data_size = result.pixelStorage.size();
} else if (imageFormat == 2) {
// ...
}
- return input;
+ return result;
}
Then update callers to use auto imageWithStorage = buildVLMImageInput(...) and pass imageWithStorage.input to the bridge.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp`
around lines 131 - 150, The buildVLMImageInput function currently uses a static
std::vector<uint8_t> pixelBuffer and assigns input.pixel_data to its data(),
causing data races and dangling pointers when concurrent calls (e.g.,
processVLMImage or processVLMImageStream) occur; fix by making the image bytes
owned with the returned object instead of static storage: change
buildVLMImageInput to either (A) embed a std::vector<uint8_t> member inside
VLMImageInput (e.g., VLMImageInput::pixel_buffer) and set input.pixel_data =
pixel_buffer.data() so the input owns the storage, or (B) change its return to a
small struct like ImageWithStorage { VLMImageInput input; std::vector<uint8_t>
storage; } and set input.pixel_data = storage.data(); then update callers
(processVLMImage/processVLMImageStream and any bridge invocations) to hold the
returned object for the lifetime of the call so no dangling pointer or shared
static buffer is used.
| export { | ||
| registerVLMBackend, | ||
| loadVLMModel, | ||
| isVLMModelLoaded, | ||
| unloadVLMModel, | ||
| describeImage, | ||
| askAboutImage, | ||
| processImage, | ||
| processImageStream, | ||
| cancelVLMGeneration, | ||
| } from './RunAnywhere+VLM'; |
VLM types are not re-exported, leaving consumers unable to type inputs/outputs.
The exported functions (describeImage, askAboutImage, processImage, processImageStream) accept VLMImage and return VLMResult / VLMStreamingResult. Without exporting these types, TypeScript consumers must resort to any, Parameters<typeof describeImage>[0], or imports from internal module paths.
🐛 Proposed fix — add VLM type re-exports
export {
registerVLMBackend,
loadVLMModel,
isVLMModelLoaded,
unloadVLMModel,
describeImage,
askAboutImage,
processImage,
processImageStream,
cancelVLMGeneration,
} from './RunAnywhere+VLM';
+export type {
+ VLMImage,
+ VLMResult,
+ VLMStreamingResult,
+ VLMErrorCode,
+} from './RunAnywhere+VLM';
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export { | |
| registerVLMBackend, | |
| loadVLMModel, | |
| isVLMModelLoaded, | |
| unloadVLMModel, | |
| describeImage, | |
| askAboutImage, | |
| processImage, | |
| processImageStream, | |
| cancelVLMGeneration, | |
| } from './RunAnywhere+VLM'; | |
| export { | |
| registerVLMBackend, | |
| loadVLMModel, | |
| isVLMModelLoaded, | |
| unloadVLMModel, | |
| describeImage, | |
| askAboutImage, | |
| processImage, | |
| processImageStream, | |
| cancelVLMGeneration, | |
| } from './RunAnywhere+VLM'; | |
| export type { | |
| VLMImage, | |
| VLMResult, | |
| VLMStreamingResult, | |
| VLMErrorCode, | |
| } from './RunAnywhere+VLM'; |
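For context, a minimal consumer-side sketch of what the re-exports enable. The module specifier and the exact describeImage signature are assumptions for illustration; only the type and function names come from the review.

```ts
// '@runanywhere/llamacpp' is a stand-in for the actual published package name.
import { describeImage } from '@runanywhere/llamacpp';
import type { VLMImage, VLMResult } from '@runanywhere/llamacpp';

export async function describePhoto(image: VLMImage): Promise<VLMResult> {
  // Assumes describeImage(image) resolves to a VLMResult, per the review's summary;
  // the real call may accept additional options.
  const result: VLMResult = await describeImage(image);
  console.log(result.text); // VLMResult is expected to carry the generated text plus token/timing stats
  return result;
}
```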
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/llamacpp/src/index.ts` around lines 59
- 69, The package re-exports VLM functions but not their types, forcing
consumers to use any or internal imports; update the export block in index.ts to
also re-export the VLM types (e.g., VLMImage, VLMResult, VLMStreamingResult and
any other related types/interfaces) from './RunAnywhere+VLM' so consumers can
import the function types directly (match the existing exported function names:
describeImage, askAboutImage, processImage, processImageStream, etc.). Ensure
the type names exactly match those declared in RunAnywhere+VLM so TypeScript
consumers can type inputs/outputs without reaching into internal paths.
| } catch { | ||
| if (resultJson.includes('error')) { | ||
| throw new Error(resultJson); | ||
| } | ||
| return { | ||
| text: resultJson, | ||
| promptTokens: 0, | ||
| completionTokens: 0, | ||
| totalTimeMs: 0, | ||
| tokensPerSecond: 0, | ||
| }; | ||
| } |
Fragile error detection in JSON parse fallback.
When JSON.parse fails, the code checks resultJson.includes('error') to decide whether to throw. This is too broad — any successful result text containing the word "error" (e.g., "Describe any error messages visible") would incorrectly throw an exception.
Proposed fix: check for a JSON error pattern instead
} catch {
- if (resultJson.includes('error')) {
+ // If the raw string looks like a JSON error object, throw it
+ if (resultJson.startsWith('{"error"')) {
throw new Error(resultJson);
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch { | |
| if (resultJson.includes('error')) { | |
| throw new Error(resultJson); | |
| } | |
| return { | |
| text: resultJson, | |
| promptTokens: 0, | |
| completionTokens: 0, | |
| totalTimeMs: 0, | |
| tokensPerSecond: 0, | |
| }; | |
| } | |
| } catch { | |
| // If the raw string looks like a JSON error object, throw it | |
| if (resultJson.startsWith('{"error"')) { | |
| throw new Error(resultJson); | |
| } | |
| return { | |
| text: resultJson, | |
| promptTokens: 0, | |
| completionTokens: 0, | |
| totalTimeMs: 0, | |
| tokensPerSecond: 0, | |
| }; | |
| } |
| /** | ||
| * Process an image with VLM (non-streaming) | ||
| * @param imageFormat 0=filePath, 1=rgbPixels, 2=base64 | ||
| * @param imageData The image data (file path string, or base64 string) | ||
| * @param imageWidth Width in pixels (for rgbPixels format, 0 otherwise) | ||
| * @param imageHeight Height in pixels (for rgbPixels format, 0 otherwise) | ||
| * @param prompt Text prompt | ||
| * @param optionsJson Optional JSON with max_tokens, temperature, top_p | ||
| * @returns JSON string with result: text, promptTokens, completionTokens, totalTimeMs, tokensPerSecond | ||
| */ | ||
| processVLMImage(imageFormat: number, imageData: string, imageWidth: number, imageHeight: number, prompt: string, optionsJson?: string): Promise<string>; | ||
|
|
||
| /** | ||
| * Process an image with streaming | ||
| * @param imageFormat Image format enum value | ||
| * @param imageData Image data string | ||
| * @param imageWidth Width (for rgbPixels) | ||
| * @param imageHeight Height (for rgbPixels) | ||
| * @param prompt Text prompt | ||
| * @param optionsJson Generation options JSON | ||
| * @param callback Token callback (token, isComplete) | ||
| * @returns Full generated text | ||
| */ | ||
| processVLMImageStream(imageFormat: number, imageData: string, imageWidth: number, imageHeight: number, prompt: string, optionsJson: string, callback: (token: string, isComplete: boolean) => void): Promise<string>; |
imageData: string is lossy for rgbPixels format (format=1).
For file paths (format=0) and base64 (format=2), string is natural. For raw RGB pixel data (format=1), binary data passed as a JS string will go through UTF-16 encoding and may corrupt non-text bytes. If rgbPixels is an expected input path from JS (not just C++-internal), consider using an ArrayBuffer-backed type or document that format=1 is only supported from native callers.
If rgbPixels is unused from the JS side, a doc comment clarifying this would prevent misuse.
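As a self-contained illustration of the hazard (plain TypeScript, not SDK code): bytes above 0x7F do not survive a naive byte-to-string round trip once any UTF-8 re-encoding is involved, which is effectively what happens when binary pixel data crosses a bridge as a string.

```ts
// Raw bytes, including values above 0x7F as real pixel data would contain.
const bytes = new Uint8Array([0x00, 0x7f, 0x80, 0xff]);

// "Binary string" built directly from the bytes (what passing raw pixels as `string` amounts to).
const asString = String.fromCharCode(...bytes);

// Re-encoding that string as UTF-8 turns 0x80 and 0xff into two-byte sequences,
// so the payload is no longer byte-identical to the original data.
const reEncoded = new TextEncoder().encode(asString);
console.log(bytes.length, reEncoded.length); // prints "4 6"
```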
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-react-native/packages/llamacpp/src/specs/RunAnywhereLlama.nitro.ts`
around lines 187 - 210, The imageData parameter is unsafe as plain string for
rgbPixels (format=1); update both processVLMImage and processVLMImageStream
signatures to accept binary (e.g., imageData: string | ArrayBuffer | Uint8Array)
and update their JSDoc to state that when imageFormat==1 callers must pass an
ArrayBuffer/Uint8Array (and native bridging code must handle that type), or
alternatively explicitly document that format=1 is native-only; adjust any
downstream marshalling code to correctly handle ArrayBuffer/Uint8Array for rgb
pixel data.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sdk/runanywhere-flutter/packages/runanywhere_llamacpp/lib/llamacpp.dart (1)
275-280: ⚠️ Potential issue | 🟠 Major
dispose() does not reset _isVlmRegistered, breaking re-registration after dispose.
After calling dispose() followed by register(), the _registerVlm() method will short-circuit at the _isVlmRegistered guard (line 160) and never actually re-register the VLM backend.
Proposed fix
static void dispose() {
  _bindings = null;
  _registeredModels.clear();
  _isRegistered = false;
+ _isVlmRegistered = false;
  _logger.info('LlamaCpp disposed');
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere_llamacpp/lib/llamacpp.dart` around lines 275 - 280, The dispose() method currently clears _bindings, _registeredModels and _isRegistered but fails to reset the _isVlmRegistered flag, causing _registerVlm() to short-circuit on subsequent register() calls; update dispose() to set _isVlmRegistered = false (in addition to clearing _bindings, _registeredModels and _isRegistered) so that _registerVlm() can run again after dispose() and register() are called.
🧹 Nitpick comments (19)
sdk/runanywhere-react-native/packages/llamacpp/src/native/NativeRunAnywhereLlama.ts (2)
19-21:requireNativeLlamaModuleshould be marked@deprecated.The PR explicitly documents that calling
requireNativeLlamaModule()directly creates throwaway HybridObject instances whose C++ destructors can tear down shared bridge state. However, the function remains a public export with no deprecation signal. Future callers have no indication they should prefergetNativeLlamaModule()instead.♻️ Proposed fix
/** * Get the native RunAnywhereLlama Hybrid Object + * `@deprecated` Use {`@link` getNativeLlamaModule} instead. Calling this directly + * creates a throwaway HybridObject whose C++ destructor can tear down shared + * bridge state. Prefer the singleton getter. */ export function requireNativeLlamaModule(): NativeRunAnywhereLlamaModule { return NitroModules.createHybridObject<RunAnywhereLlama>('RunAnywhereLlama'); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/native/NativeRunAnywhereLlama.ts` around lines 19 - 21, Mark the exported function requireNativeLlamaModule as deprecated by adding a deprecation JSDoc comment (e.g. /** `@deprecated` Use getNativeLlamaModule() instead. */) immediately above its declaration and update the message to recommend getNativeLlamaModule; ensure the text mentions the reason briefly (creating throwaway HybridObject instances) so callers see the risk, and optionally add a TypeScript `@deprecated` tag so IDEs surface the warning when requireNativeLlamaModule is used.
28-35:isNativeLlamaModuleAvailablesilently initializes the singleton as a side effect.Delegating to
getNativeLlamaModule()means a successful availability check eagerly sets_nativeModule. A predicate namedisAvailableis expected to be a read-only observation; callers who use it as a guard (e.g., "check before use") will unknowingly warm up the singleton. At minimum, document this in the JSDoc. Alternatively, fast-path on an already-initialized singleton to keep the observable behaviour identical but surface intent clearly:♻️ Proposed fix
/** * Check if the native Llama module is available. * Uses the singleton getter to avoid creating throwaway HybridObject instances * whose C++ destructors could tear down shared bridge state. + * Note: a successful check initializes the singleton as a side effect. */ export function isNativeLlamaModuleAvailable(): boolean { + // Fast-path: already initialized + if (_nativeModule !== undefined) return true; try { getNativeLlamaModule(); return true; } catch { return false; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/native/NativeRunAnywhereLlama.ts` around lines 28 - 35, isNativeLlamaModuleAvailable currently calls getNativeLlamaModule which eagerly initializes and assigns the singleton _nativeModule as a side effect; change it to first fast-path-check the existing _nativeModule and return true if present, and if not present, probe for the native module without mutating _nativeModule (e.g., add a non-mutating probe helper or call the underlying native detection logic rather than getNativeLlamaModule) so the availability predicate does not warm the singleton; update isNativeLlamaModuleAvailable to reference _nativeModule and use the new probe (or low-level NativeModules check) instead of directly invoking getNativeLlamaModule.examples/flutter/RunAnywhereAI/ios/Runner/Info.plist (1)
29-30: LGTM —NSCameraUsageDescriptioncorrectly added.The usage description is present and App Store review will accept it. Minor note: "vision language model features" is a bit developer-facing for an end-user permission dialog; consider something like "RunAnywhere AI needs camera access to analyze images." for better UX in a production app (fine as-is for an example).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/ios/Runner/Info.plist` around lines 29 - 30, Update the user-facing camera permission text in Info.plist by replacing the NSCameraUsageDescription string value so it reads a clearer end-user message (e.g., "RunAnywhere AI needs camera access to analyze images."); locate the NSCameraUsageDescription entry in the iOS Runner Info.plist and change the existing developer-facing phrase ("vision language model features to analyze images") to the suggested simpler phrasing to improve UX for App Store permission dialogs.sdk/runanywhere-flutter/packages/runanywhere/lib/foundation/error_types/sdk_error.dart (1)
664-669:vlmModelLoadFailedsilently drops the underlying error.Unlike
modelLoadFailed(String modelId, Object? error)which storesunderlyingErroron theSDKError, the VLM variant accepts only a pre-formattedString message, discarding the root-cause exception. This loses debugging context for load failures.♻️ Proposed fix
- static SDKError vlmModelLoadFailed(String message) { + static SDKError vlmModelLoadFailed(String message, [Object? error]) { return SDKError( 'VLM model load failed: $message', SDKErrorType.vlmModelLoadFailed, + underlyingError: error, ); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/foundation/error_types/sdk_error.dart` around lines 664 - 669, The vlmModelLoadFailed factory currently takes only a String and discards the root-cause; change vlmModelLoadFailed to accept an additional Object? underlyingError (or similar) and forward it into the returned SDKError so the underlyingError field is populated (mirror how modelLoadFailed(String modelId, Object? error) does); update the method signature and the SDKError constructor call to pass the underlyingError and keep the message and SDKErrorType.vlmModelLoadFailed unchanged.sdk/runanywhere-flutter/packages/runanywhere/lib/public/extensions/runanywhere_frameworks.dart (1)
38-84: Inconsistent indentation insidegetFrameworks— will faildart format.Lines 39–82 (new code) use 2-space method-body indentation while lines 86–96 (existing code) and every other method in the file use 4-space. Line 84 also has trailing whitespace. Running
dart formaton this file would normalise everything to 4-space, but as-is the mixed indentation makes the method visually discontinuous.♻️ Proposed fix (align to 4-space)
- static Future<List<InferenceFramework>> getFrameworks( - SDKComponent capability) async { - final frameworks = <InferenceFramework>{}; - - // Map capability to model categories - final Set<ModelCategory> relevantCategories; - - switch (capability) { - case SDKComponent.llm: - relevantCategories = { - ModelCategory.language, - ModelCategory.multimodal - }; - break; - // ... remaining cases ... - } - - + static Future<List<InferenceFramework>> getFrameworks( + SDKComponent capability) async { + final frameworks = <InferenceFramework>{}; + + // Map capability to model categories + final Set<ModelCategory> relevantCategories; + + switch (capability) { + case SDKComponent.llm: + relevantCategories = { + ModelCategory.language, + ModelCategory.multimodal, + }; + break; + // ... remaining cases at 6-space body ... + } +🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere/lib/public/extensions/runanywhere_frameworks.dart` around lines 38 - 84, The getFrameworks method has mixed 2-space indentation and a trailing whitespace which breaks dart format; re-indent the entire method body to the project's 4-space style (including all switch cases and the final closing brace) and remove the trailing whitespace (around the end of the switch/blank line) so the method aligns with other functions using SDKComponent and ModelCategory and passes dart format.sdk/runanywhere-flutter/packages/runanywhere_llamacpp/lib/native/llamacpp_bindings.dart (1)
138-139: NoisVlmAvailablegetter to match the existingisAvailablepattern.
isAvailablelets callers guard on LLM availability; there's no equivalent for VLM. Without it, callers must callregisterVlm(), inspect the return code, and infer support — asymmetric with the existing API surface.♻️ Proposed addition
/// Check if bindings are available. bool get isAvailable => _register != null; + + /// Check if VLM bindings are available. + bool get isVlmAvailable => _registerVlm != null;Also applies to: 166-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-flutter/packages/runanywhere_llamacpp/lib/native/llamacpp_bindings.dart` around lines 138 - 139, Add a symmetric VLM availability getter to match isAvailable: implement "bool get isVlmAvailable => _registerVlm != null;" (mirror the existing isAvailable which checks _register) and expose it alongside registerVlm so callers can guard on VLM support; update the doc comment above the new getter to match the style of the existing /// Check if bindings are available. and place it near the registerVlm/_registerVlm definitions to keep the API consistent.examples/flutter/RunAnywhereAI/lib/app/content_view.dart (1)
26-34: 7 bottom tabs may crowd the navigation bar on smaller devices.This is a demo app so it's acceptable, but be aware that Material Design recommends 3–5 bottom destinations. On narrow screens, the labels may truncate. Consider grouping related features (e.g., Transcribe/Speak/Voice under a single "Audio" hub) if this becomes an issue.
Also applies to: 52-62
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/app/content_view.dart` around lines 26 - 34, The bottom navigation currently defines seven pages in the _pages list (ChatInterfaceView, VisionHubView, SpeechToTextView, TextToSpeechView, VoiceAssistantView, ToolsView, CombinedSettingsView) which may crowd small screens; update the UI to condense audio-related pages into a single hub (e.g., create an AudioHubView that groups SpeechToTextView, TextToSpeechView, and VoiceAssistantView) or implement responsive logic that swaps multiple tabs for a single "Audio" tab on narrow screens, then replace the three individual entries in _pages with the new AudioHubView (or a conditional selection) so the bottom navigation shows 3–5 destinations on small devices.sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCppProvider.ts (1)
166-169: Silent error swallowing inautoRegister.The
catch(() => {})swallows all errors silently. This is pre-existing code but worth noting — if registration fails during auto-register, there's no diagnostic trail. Theregister()method itself logs errors, so this is mitigated, but a briefcatchlog would improve debuggability.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCppProvider.ts` around lines 166 - 169, autoRegister currently swallows all errors via catch(() => {}) which hides failures; update the catch on LlamaCppProvider.register() inside autoRegister to log the caught error (e.g., console.warn or a module logger) with a short message referencing auto-registration and include the error object so failures are visible; keep the existing register() call and its internal logging but ensure autoRegister's catch does not silently discard the error.examples/flutter/RunAnywhereAI/lib/core/services/permission_service.dart (1)
6-10: Class docstring omits camera permissions.📝 Suggested update
/// PermissionService - Centralized permission handling for the app /// -/// Handles microphone and speech recognition permissions with proper -/// user guidance for denied/permanently denied states. +/// Handles microphone, speech recognition, and camera permissions with +/// proper user guidance for denied/permanently denied states.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/core/services/permission_service.dart` around lines 6 - 10, Update the PermissionService class docstring to include camera permissions alongside microphone and speech recognition; locate the class declaration PermissionService and expand its top comment to mention that this service handles microphone, speech recognition, and camera permissions and provides guidance for denied/permanently denied states so the documentation accurately reflects the implemented responsibilities.examples/flutter/RunAnywhereAI/lib/features/vision/vision_hub_view.dart (1)
41-96: Hardcoded magic numbers and rawTextStylebreak design-system consistency.Two nits in the list tiles:
width: 44,height: 44, andBorderRadius.circular(10)repeat across both tiles — these should come fromAppSpacingconstants to stay consistent with the rest of the app.- The disabled "Image Generation" text (lines 84–93) constructs
TextStyle(...)directly. The rest of the file usesAppTypography.caption(context),AppTypography.subheadline(context), etc.♻️ Suggested approach (Vision Chat tile as example)
- width: 44, - height: 44, - decoration: BoxDecoration( - color: AppColors.primaryPurple, - borderRadius: BorderRadius.circular(10), - ), + width: AppSpacing.iconLarge, // or whichever constant maps to 44 + height: AppSpacing.iconLarge, + decoration: BoxDecoration( + color: AppColors.primaryPurple, + borderRadius: BorderRadius.circular(AppSpacing.cornerRadiusMedium), + ),- title: Text( - 'Image Generation', - style: TextStyle( - color: AppColors.textSecondary(context).withValues(alpha: 0.5), - ), - ), - subtitle: Text( - 'Create images with Stable Diffusion', - style: TextStyle( - color: AppColors.textSecondary(context).withValues(alpha: 0.5), - ), - ), + title: Text( + 'Image Generation', + style: AppTypography.subheadline(context).copyWith( + color: AppColors.textSecondary(context).withValues(alpha: 0.5), + ), + ), + subtitle: Text( + 'Create images with Stable Diffusion', + style: AppTypography.caption(context).copyWith( + color: AppColors.textSecondary(context).withValues(alpha: 0.5), + ), + ),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vision_hub_view.dart` around lines 41 - 96, Replace hardcoded sizes and raw TextStyle usage in the vision hub tiles: swap the repeated width: 44, height: 44, and BorderRadius.circular(10) with the app spacing constants (use AppSpacing.* for size and radius) for the leading Container in both ListTile instances, and replace the direct TextStyle(...) usages in the disabled "Image Generation" tile with the appropriate AppTypography helpers (e.g., AppTypography.subheadline(context) and AppTypography.caption(context)) .copyWith(color: AppColors.textSecondary(context).withOpacity(0.5)) to preserve the design system while keeping the faded color via opacity; keep color adjustments using AppColors and withOpacity/withAlpha rather than .withValues.examples/flutter/RunAnywhereAI/lib/features/vision/vlm_camera_view.dart (1)
472-488: Shadowedcontextparameter in builder may cause confusion.The
builder: (context)parameter shadows the widget'sBuildContext. You correctly usethis.contextto refer to the State's context, but this is easy to misread. Renaming the builder parameter (e.g.,sheetContext) would improve clarity.Optional rename
- await showModalBottomSheet<void>( - context: context, - isScrollControlled: true, - backgroundColor: Colors.transparent, - builder: (context) => ModelSelectionSheet( + await showModalBottomSheet<void>( + context: context, + isScrollControlled: true, + backgroundColor: Colors.transparent, + builder: (_) => ModelSelectionSheet( context: ModelSelectionContext.vlm, onModelSelected: (model) async { - await _viewModel.onModelSelected(model.id, model.name, this.context); + await _viewModel.onModelSelected(model.id, model.name, context);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_camera_view.dart` around lines 472 - 488, The builder callback in _onModelButtonTap uses a parameter named context which shadows the State's BuildContext (this.context); rename the builder parameter to something like sheetContext in the ModelSelectionSheet builder to eliminate shadowing and update any references inside the closure to use sheetContext when referring to the bottom-sheet-specific context while keeping this.context for the State (e.g., in the call to _viewModel.onModelSelected).examples/react-native/RunAnywhereAI/src/screens/VLMScreen.tsx (1)
102-105:handleDismissErroris dead code — never used in JSX.This callback is defined but not referenced anywhere in the component's render output. Remove it to reduce noise.
Remove dead code
- // Dismiss error - const handleDismissError = useCallback(() => { - // Reset error in next render to prevent flicker - // Since hook doesn't expose setError, we'll just let user retry - }, []);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/react-native/RunAnywhereAI/src/screens/VLMScreen.tsx` around lines 102 - 105, The function handleDismissError defined with useCallback is dead code and should be removed; delete the entire handleDismissError callback block (including its comment) from VLMScreen.tsx and, after removal, remove the now-unused useCallback import if it is no longer referenced elsewhere in the file. This cleans up unused symbols (handleDismissError and possibly useCallback) and reduces noise in the component.sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCPP.ts (1)
230-297:addVLMModellargely duplicatesaddModel— consider extracting shared logic.The two methods share ~80% of their code (ID generation, format inference, disk existence check, ModelInfo construction, registry call). Only the category,
supportsThinking, and metadata differ.A shared private helper (e.g.,
_buildAndRegisterModel) accepting the differing fields would reduce duplication and ensure future fixes (e.g., to disk-check logic) apply to both paths.As per coding guidelines, "focus on SIMPLICITY and SOLID principles" — the DRY violation here is straightforward to resolve and would prevent the two methods from diverging over time.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCPP.ts` around lines 230 - 297, The addVLMModel implementation duplicates most logic from addModel; extract the shared behavior into a private helper (suggested name: _buildAndRegisterModel) that handles ID generation (this._generateModelId), format inference (url -> ModelFormat), disk checks (FileSystem.modelExists, FileSystem.getModelPath), ModelInfo construction (common fields like id, name, format, downloadURL, localPath, createdAt/updatedAt, isDownloaded, isAvailable), and ModelRegistry.registerModel, then have addVLMModel and addModel call that helper while passing the differing bits (category, supportsThinking, metadata.mmprojUrl or other per-model fields) so the duplicated checks and registry call live in one place.sdk/runanywhere-react-native/packages/core/src/services/FileSystem.ts (1)
627-663:findGGUFInDirectorylooks correct; minor note on error-path logging.The recursive search with mmproj exclusion is sound. One small observation: the
throwat line 658 ("No GGUF model file found") is immediately caught by the outercatchat line 659, which logs it as "Error finding GGUF file" before rethrowing. This makes the "not found" case look like an unexpected error in logs. Consider throwing only from outside the try-catch, or distinguishing between "not found" and actual I/O errors.Optional: cleaner error flow
async findGGUFInDirectory(directory: string): Promise<string> { if (!RNFS) { throw new Error('react-native-fs not available'); } - try { - const contents = await RNFS.readDir(directory); - - // Look for .gguf files (not mmproj) - for (const item of contents) { - if (item.isFile() && item.name.endsWith('.gguf') && !item.name.includes('mmproj')) { - logger.info(`Found main GGUF model: ${item.name}`); - return item.path; - } + const contents = await RNFS.readDir(directory); + + // Look for .gguf files (not mmproj) + for (const item of contents) { + if (item.isFile() && item.name.endsWith('.gguf') && !item.name.includes('mmproj')) { + logger.info(`Found main GGUF model: ${item.name}`); + return item.path; } + } - // If not found, check nested directories - for (const item of contents) { - if (item.isDirectory()) { - try { - return await this.findGGUFInDirectory(item.path); - } catch { - // Continue searching other directories - } + // If not found, check nested directories + for (const item of contents) { + if (item.isDirectory()) { + try { + return await this.findGGUFInDirectory(item.path); + } catch { + // Continue searching other directories } } - - throw new Error(`No GGUF model file found in ${directory}`); - } catch (error) { - logger.error(`Error finding GGUF file: ${error}`); - throw error; } + + throw new Error(`No GGUF model file found in ${directory}`); },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/core/src/services/FileSystem.ts` around lines 627 - 663, The current findGGUFInDirectory method logs every thrown Error as an I/O error because the local "No GGUF model file found" is thrown inside the try and immediately caught; modify the flow so only real I/O/runtime errors are logged: move the final "throw new Error(`No GGUF model file found in ${directory}`)" out of the try block (or throw a distinct NotFoundError and only logger.error inside catch for non-NotFoundError cases), keep recursion via this.findGGUFInDirectory, and continue to rethrow actual errors so RNFS failures are logged while "not found" remains a normal control signal.sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere+VLM.ts (2)
71-88:loadVLMModeldoesn't catch errors — inconsistent withregisterVLMBackend.
registerVLMBackendwraps the native call in try/catch and returnsfalseon failure.loadVLMModellets the native exception propagate directly to the caller. This is a deliberate design choice (callers handle the error), but it's inconsistent and could surprise SDK consumers who expect a uniform boolean return pattern.Not a bug, but worth documenting or harmonizing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere`+VLM.ts around lines 71 - 88, The loadVLMModel function currently lets native.loadVLMModel throw, causing inconsistent behavior with registerVLMBackend; update loadVLMModel to mirror registerVLMBackend by wrapping the native call in a try/catch, logging the error via logger.warning or logger.error (include the error message) and returning false on failure while still returning the native result on success; locate symbols loadVLMModel, getNativeLlamaModule, isNativeLlamaModuleAvailable, and native.loadVLMModel to implement this change.
448-454:uint8ArrayToBase64may be very slow for large RGB images.Building a string one
String.fromCharCodeat a time is O(n²) in many JS engines due to string immutability. For a 1920×1080 RGB image (~6 MB), this could cause noticeable jank. A chunked approach is significantly faster:Proposed fix
function uint8ArrayToBase64(bytes: Uint8Array): string { - // Use btoa with binary string conversion - let binaryString = ''; - for (let i = 0; i < bytes.length; i++) { - binaryString += String.fromCharCode(bytes[i]); - } - return btoa(binaryString); + const CHUNK_SIZE = 0x8000; // 32KB chunks + const chunks: string[] = []; + for (let i = 0; i < bytes.length; i += CHUNK_SIZE) { + const slice = bytes.subarray(i, Math.min(i + CHUNK_SIZE, bytes.length)); + chunks.push(String.fromCharCode(...slice)); + } + return btoa(chunks.join('')); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere`+VLM.ts around lines 448 - 454, The uint8ArrayToBase64 function builds a huge string incrementally causing O(n²) behavior for large buffers; change it to a chunked conversion: iterate over the Uint8Array in chunks (e.g., 32k or 0x8000 bytes), for each chunk convert to a string with String.fromCharCode.apply or a fast typed-array-to-string method and push to an array, then join the array and call btoa once on the full string (or call btoa per chunk and concatenate) to avoid repeated string reallocation; update the uint8ArrayToBase64 function to use this chunked approach to improve performance for large RGB images.sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp (2)
100-120: Base64 lookup table is rebuilt on every call.The 256-element lookup table
Tis recomputed each invocation. For images this could be called with large payloads. Consider making itstatic const(thread-safe in C++11+) for a small but free performance win.Proposed fix
std::vector<uint8_t> base64Decode(const std::string& encoded) { static const std::string base64_chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"; - std::vector<uint8_t> decoded; - std::vector<int> T(256, -1); - for (int i = 0; i < 64; i++) T[base64_chars[i]] = i; + static const auto T = []() { + std::vector<int> table(256, -1); + for (int i = 0; i < 64; i++) table[base64_chars[i]] = i; + return table; + }(); + + std::vector<uint8_t> decoded;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp` around lines 100 - 120, The base64Decode function rebuilds the 256-entry lookup vector T on every call; change the lookup to a single, shared, pre-initialized static (e.g., static const std::array<int,256> or static std::vector<int>) so it is initialized once and reused across calls, and also make base64_chars static const; update references in base64Decode to use the static table (keep function signature and decoding logic unchanged) to avoid per-call allocation and improve performance and thread-safety.
448-459:unloadVLMModelsilently swallows errors, unlikeloadVLMModelandunloadModel.
loadVLMModel(line 437) re-throws aftersetLastError, and the LLMunloadModel(line 252) returns the result code. ButunloadVLMModelcatches exceptions and returnsfalse, meaning callers on the JS side never see the error reason — only afalseboolean. Consider being consistent: either re-throw (likeloadVLMModel) or at minimum log the exception message so it's available viagetLastError().🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp` around lines 448 - 459, unloadVLMModel currently swallows exceptions and returns false, hiding the error from JS; make its behavior consistent with loadVLMModel and unloadModel by capturing the exception, calling setLastError(e.what()), and then re-throwing the exception (or alternatively return a distinct error code/log the message) so callers can observe the failure; modify HybridRunAnywhereLlama::unloadVLMModel around the VLMBridge::shared().unload() call to mirror loadVLMModel's error handling pattern (invoke setLastError and re-throw) instead of silently returning false.examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart (1)
143-184: Three near-identical "capture → stream → append tokens" flows.
describeCurrentFrame,describePickedImage, and_describeCurrentFrameForAutoStreamshare the same pattern: buildVLMImage.filePath, callprocessImageStream, iterate tokens, append to buffer, and notify. This could be extracted to a shared helper acceptingimagePath,prompt,maxTokens, and a flag for error handling behavior. Fine as-is for an example app, but worth noting if this grows.Also applies to: 190-224, 266-309
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart` around lines 143 - 184, The three near-identical flows in describeCurrentFrame, describePickedImage, and _describeCurrentFrameForAutoStream should be extracted into a single helper (e.g., _processImageStreamAndAppend) that accepts imagePath, prompt, maxTokens, and an error-handling/notification flag; inside the helper create the VLMImage with VLMImage.filePath(imagePath), call sdk.RunAnywhere.processImageStream(...), iterate result.stream to append tokens into a StringBuffer and update the view-model state via notifyListeners (or delegate notification behavior based on the flag), and then replace the bodies of describeCurrentFrame, describePickedImage, and _describeCurrentFrameForAutoStream to call this helper with the appropriate arguments.
| @override | ||
| void dispose() { | ||
| _autoStreamTimer?.cancel(); | ||
| unawaited(_cameraController?.dispose()); | ||
| super.dispose(); | ||
| } |
dispose doesn't cancel in-flight VLM generation — risks notifyListeners() on a disposed ChangeNotifier.
If the ViewModel is disposed while describeCurrentFrame, describePickedImage, or _describeCurrentFrameForAutoStream is mid-stream, the await for loop will continue receiving tokens and call notifyListeners() after super.dispose(), throwing a FlutterError. The timer is cancelled but not the generation itself.
Proposed fix
+ bool _disposed = false;
+
`@override`
void dispose() {
+ _disposed = true;
_autoStreamTimer?.cancel();
+ // Cancel any in-flight VLM generation
+ sdk.RunAnywhere.cancelVLMGeneration();
unawaited(_cameraController?.dispose());
super.dispose();
}
Then guard notification calls:
void _safeNotify() {
if (!_disposed) notifyListeners();
}Replace all notifyListeners() calls with _safeNotify().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/flutter/RunAnywhereAI/lib/features/vision/vlm_view_model.dart`
around lines 321 - 326, The dispose method currently only cancels the timer and
camera controller but doesn’t stop in-flight VLM generation, so streams in
describeCurrentFrame, describePickedImage, and
_describeCurrentFrameForAutoStream can continue and call notifyListeners() after
disposal; add a private boolean _disposed = false and set it true in dispose,
store any active StreamSubscription(s) returned by your `await for` loops (from
describeCurrentFrame/describePickedImage/_describeCurrentFrameForAutoStream) and
cancel them in dispose, and replace all direct notifyListeners() calls with a
guarded helper (e.g., _safeNotify that calls notifyListeners() only if
!_disposed) to prevent notifications after disposal while ensuring subscriptions
are cancelled.
| // SmolVLM 500M - Ultra-lightweight VLM for mobile (~500MB total) | ||
| await LlamaCPP.addVLMModel({ | ||
| id: 'smolvlm-500m-instruct-q8_0', | ||
| name: 'SmolVLM 500M Instruct', | ||
| url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-vlm-models-v1/smolvlm-500m-instruct-q8_0.tar.gz', | ||
| memoryRequirement: 600_000_000, |
Comment says ~500MB total but memoryRequirement is 600 MB.
The inline comment understates the memory figure. memoryRequirement: 600_000_000 is the runtime value; the comment should reflect it.
📝 Suggested correction
- // SmolVLM 500M - Ultra-lightweight VLM for mobile (~500MB total)
+ // SmolVLM 500M - Ultra-lightweight VLM for mobile (~600MB runtime: model + mmproj + overhead)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // SmolVLM 500M - Ultra-lightweight VLM for mobile (~500MB total) | |
| await LlamaCPP.addVLMModel({ | |
| id: 'smolvlm-500m-instruct-q8_0', | |
| name: 'SmolVLM 500M Instruct', | |
| url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-vlm-models-v1/smolvlm-500m-instruct-q8_0.tar.gz', | |
| memoryRequirement: 600_000_000, | |
| // SmolVLM 500M - Ultra-lightweight VLM for mobile (~600MB runtime: model + mmproj + overhead) | |
| await LlamaCPP.addVLMModel({ | |
| id: 'smolvlm-500m-instruct-q8_0', | |
| name: 'SmolVLM 500M Instruct', | |
| url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-vlm-models-v1/smolvlm-500m-instruct-q8_0.tar.gz', | |
| memoryRequirement: 600_000_000, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/react-native/RunAnywhereAI/App.tsx` around lines 182 - 187, The
inline comment for the SmolVLM model is inconsistent with the runtime memory
setting; update the comment near the LlamaCPP.addVLMModel call (the model with
id 'smolvlm-500m-instruct-q8_0' and property memoryRequirement) to reflect the
actual memoryRequirement value (600_000_000 / ~600MB) so the comment accurately
matches memoryRequirement: 600_000_000.
| const handleMainAction = useCallback(() => { | ||
| if (vlm.isAutoStreaming) { | ||
| vlm.toggleAutoStreaming(); | ||
| } else { | ||
| vlm.captureAndDescribe(); | ||
| } | ||
| }, [vlm]); |
handleMainAction also depends on unstable [vlm] — memoization is ineffective.
Same root cause as the effects and handleModelSelected. The callback is recreated every render. Either extract stable refs from the hook or use the vlm object directly without useCallback.
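A minimal sketch of the stable-refs option (the hook result shape and names below are illustrative, not the actual useVLM API):

import { useCallback } from 'react';

// Hypothetical hook result shape, for illustration only; the real useVLM hook may differ.
type VLMControls = {
  isAutoStreaming: boolean;
  toggleAutoStreaming: () => void;
  captureAndDescribe: () => void;
};

export function useMainAction(vlm: VLMControls) {
  // Destructure so the memo depends on individual values rather than the whole object.
  const { isAutoStreaming, toggleAutoStreaming, captureAndDescribe } = vlm;

  return useCallback(() => {
    if (isAutoStreaming) {
      toggleAutoStreaming();
    } else {
      captureAndDescribe();
    }
  }, [isAutoStreaming, toggleAutoStreaming, captureAndDescribe]);
}

Note that this only pays off if useVLM itself returns memoized callbacks; if it does not, dropping useCallback and calling vlm.toggleAutoStreaming() / vlm.captureAndDescribe() directly is the simpler fix.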
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/react-native/RunAnywhereAI/src/screens/VLMScreen.tsx` around lines
93 - 99, The handleMainAction callback is being memoized with an unstable
dependency (vlm) so it’s recreated every render; fix by either removing
useCallback and call vlm.toggleAutoStreaming or vlm.captureAndDescribe directly
in the handler, or extract stable method refs (e.g., const toggleRef =
useRef(vlm.toggleAutoStreaming); const captureRef =
useRef(vlm.captureAndDescribe)) or stable callbacks that reference those methods
and then use those refs in handleMainAction so its dependency array can be
stable; update the handleMainAction implementation (and its dependency array)
accordingly to reference either stable refs or no longer memoize.
| final lib = PlatformLoader.loadCommons(); | ||
| final processFn = lib.lookupFunction< | ||
| Int32 Function(RacHandle, Pointer<RacVlmImageStruct>, Pointer<Utf8>, | ||
| Pointer<RacVlmOptionsStruct>, Pointer<RacVlmResultStruct>), | ||
| int Function( | ||
| RacHandle, | ||
| Pointer<RacVlmImageStruct>, | ||
| Pointer<Utf8>, | ||
| Pointer<RacVlmOptionsStruct>, | ||
| Pointer<RacVlmResultStruct>)>('rac_vlm_component_process'); | ||
|
|
||
| final status = processFn(handle, imagePtr, promptPtr, optionsPtr, resultPtr); | ||
|
|
||
| if (status != RAC_SUCCESS) { | ||
| return VlmBridgeResult( | ||
| text: '', | ||
| error: 'VLM processing failed: ${RacResultCode.getMessage(status)}', | ||
| ); | ||
| } | ||
|
|
||
| final result = resultPtr.ref; | ||
| final text = result.text != nullptr ? result.text.toDartString() : ''; | ||
|
|
||
| return VlmBridgeResult( | ||
| text: text, | ||
| promptTokens: result.promptTokens, | ||
| imageTokens: result.imageTokens, | ||
| completionTokens: result.completionTokens, | ||
| totalTokens: result.totalTokens, | ||
| timeToFirstTokenMs: result.timeToFirstTokenMs, | ||
| imageEncodeTimeMs: result.imageEncodeTimeMs, | ||
| totalTimeMs: result.totalTimeMs, | ||
| tokensPerSecond: result.tokensPerSecond, | ||
| ); |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
echo "=== Searching for rac_vlm_result_free in commons headers ==="
rg -n 'rac_vlm_result_free' sdk/runanywhere-commons/include
echo ""
echo "=== Searching for rac_vlm_result_free usage in Flutter native code ==="
rg -n 'VlmResultFree|rac_vlm_result_free' sdk/runanywhere-flutter/packages/runanywhere/lib/native
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 924
🏁 Script executed:
#!/bin/bash
echo "=== Searching for VLM-related code to understand structure ==="
rg -n 'RacVlmResultStruct' sdk/runanywhere-commons/include -A 3
echo ""
echo "=== Checking for memory management patterns in Flutter code ==="
rg -n 'resultPtr|RacVlmResultStruct' sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart | head -30
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 854
🏁 Script executed:
#!/bin/bash
echo "=== Checking the entire _processInIsolate function for context ==="
rg -n '_processInIsolate' sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart -A 50 | head -100
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 3300
🏁 Script executed:
#!/bin/bash
echo "=== Getting the complete _processInIsolate function ==="
sed -n '490,650p' sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 5295
🏁 Script executed:
#!/bin/bash
echo "=== Checking if rac_vlm_result_free appears anywhere in entire codebase ==="
rg -n 'rac_vlm_result_free' sdk/ --type dart --type cpp --type h
echo ""
echo "=== Checking C++ bridge for how result is freed ==="
rg -n 'rac_vlm_result_free|VlmResult.*free' sdk/runanywhere-commons -A 2 | head -50
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 4004
🏁 Script executed:
#!/bin/bash
echo "=== Implementation of rac_vlm_result_free ==="
sed -n '190,200p' sdk/runanywhere-commons/src/features/vlm/rac_vlm_service.cpp
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 315
🏁 Script executed:
#!/bin/bash
echo "=== Checking React Native usage for comparison ==="
sed -n '150,165p' sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.cpp
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 678
Call rac_vlm_result_free() to avoid memory leaks.
The _processInIsolate function extracts the result text but never calls rac_vlm_result_free(), leaking the C++-allocated result.text on every call. The C++ implementation (React Native) correctly calls this function after processing. Currently, only calloc.free(resultPtr) is called, which frees the struct container but not the owned memory inside it.
🐛 Suggested fix
final result = resultPtr.ref;
final text = result.text != nullptr ? result.text.toDartString() : '';
+
+ final freeResultFn = lib.lookupFunction<
+ Void Function(Pointer<RacVlmResultStruct>),
+ void Function(Pointer<RacVlmResultStruct>)>('rac_vlm_result_free');
+ freeResultFn(resultPtr);
  return VlmBridgeResult(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart`
around lines 562 - 595, In _processInIsolate, after calling
rac_vlm_component_process and reading resultPtr.ref (including converting
result.text to Dart via toDartString()), call the C cleanup function
rac_vlm_result_free(resultPtr) to free any C-allocated memory owned by the
struct (including result.text) before calling calloc.free(resultPtr) or
returning; do not manually free result.text if rac_vlm_result_free handles it to
avoid double-free.
| /// SendPort for the current streaming operation in the background isolate | ||
| SendPort? _vlmIsolateSendPort; | ||
|
|
||
| /// Entry point for the VLM streaming isolate | ||
| @pragma('vm:entry-point') | ||
| void _vlmStreamingIsolateEntry(_VlmStreamingIsolateParams params) { | ||
| // Store the SendPort for callbacks to use | ||
| _vlmIsolateSendPort = params.sendPort; | ||
|
|
||
| final handle = Pointer<Void>.fromAddress(params.handleAddress); | ||
| final promptPtr = params.prompt.toNativeUtf8(); | ||
| final imagePtr = calloc<RacVlmImageStruct>(); | ||
| final optionsPtr = calloc<RacVlmOptionsStruct>(); | ||
|
|
||
| Pointer<Utf8>? filePathPtr; | ||
| Pointer<Uint8>? pixelDataPtr; | ||
| Pointer<Utf8>? base64DataPtr; | ||
|
|
||
| try { | ||
| // Set up image struct based on format | ||
| imagePtr.ref.format = params.imageFormat; | ||
| imagePtr.ref.width = params.width; | ||
| imagePtr.ref.height = params.height; | ||
|
|
||
| if (params.imageFormat == RacVlmImageFormat.filePath && | ||
| params.filePath != null) { | ||
| filePathPtr = params.filePath!.toNativeUtf8(); | ||
| imagePtr.ref.filePath = filePathPtr!; | ||
| imagePtr.ref.pixelData = nullptr; | ||
| imagePtr.ref.base64Data = nullptr; | ||
| imagePtr.ref.dataSize = 0; | ||
| } else if (params.imageFormat == RacVlmImageFormat.rgbPixels && | ||
| params.pixelData != null) { | ||
| // Allocate native memory for pixel data | ||
| pixelDataPtr = calloc<Uint8>(params.pixelData!.length); | ||
| for (int i = 0; i < params.pixelData!.length; i++) { | ||
| pixelDataPtr![i] = params.pixelData![i]; | ||
| } | ||
| imagePtr.ref.filePath = nullptr; | ||
| imagePtr.ref.pixelData = pixelDataPtr!; | ||
| imagePtr.ref.base64Data = nullptr; | ||
| imagePtr.ref.dataSize = params.pixelData!.length; | ||
| } else if (params.imageFormat == RacVlmImageFormat.base64 && | ||
| params.base64Data != null) { | ||
| base64DataPtr = params.base64Data!.toNativeUtf8(); | ||
| imagePtr.ref.filePath = nullptr; | ||
| imagePtr.ref.pixelData = nullptr; | ||
| imagePtr.ref.base64Data = base64DataPtr!; | ||
| imagePtr.ref.dataSize = params.base64Data!.length; | ||
| } else { | ||
| params.sendPort.send( | ||
| _VlmStreamingMessage(error: 'Invalid image format or missing data'), | ||
| ); | ||
| return; | ||
| } | ||
|
|
||
| // Set options | ||
| optionsPtr.ref.maxTokens = params.maxTokens; | ||
| optionsPtr.ref.temperature = params.temperature; | ||
| optionsPtr.ref.topP = params.topP; | ||
| optionsPtr.ref.stopSequences = nullptr; | ||
| optionsPtr.ref.numStopSequences = 0; | ||
| optionsPtr.ref.streamingEnabled = RAC_TRUE; | ||
| optionsPtr.ref.systemPrompt = nullptr; | ||
| optionsPtr.ref.maxImageSize = 0; | ||
| optionsPtr.ref.nThreads = 0; | ||
| optionsPtr.ref.useGpu = params.useGpu ? RAC_TRUE : RAC_FALSE; | ||
|
|
||
| final lib = PlatformLoader.loadCommons(); | ||
|
|
||
| // Get callback function pointers | ||
| final tokenCallbackPtr = | ||
| Pointer.fromFunction<Int32 Function(Pointer<Utf8>, Pointer<Void>)>( | ||
| _vlmIsolateTokenCallback, 1); | ||
| final completeCallbackPtr = Pointer.fromFunction< | ||
| Void Function(Pointer<RacVlmResultStruct>, | ||
| Pointer<Void>)>(_vlmIsolateCompleteCallback); | ||
| final errorCallbackPtr = Pointer.fromFunction< | ||
| Void Function(Int32, Pointer<Utf8>, | ||
| Pointer<Void>)>(_vlmIsolateErrorCallback); | ||
|
|
||
| final processStreamFn = lib.lookupFunction< | ||
| Int32 Function( | ||
| RacHandle, | ||
| Pointer<RacVlmImageStruct>, | ||
| Pointer<Utf8>, | ||
| Pointer<RacVlmOptionsStruct>, | ||
| Pointer<NativeFunction<Int32 Function(Pointer<Utf8>, Pointer<Void>)>>, | ||
| Pointer< | ||
| NativeFunction< | ||
| Void Function(Pointer<RacVlmResultStruct>, Pointer<Void>)>>, | ||
| Pointer< | ||
| NativeFunction< | ||
| Void Function(Int32, Pointer<Utf8>, Pointer<Void>)>>, | ||
| Pointer<Void>, | ||
| ), | ||
| int Function( | ||
| RacHandle, | ||
| Pointer<RacVlmImageStruct>, | ||
| Pointer<Utf8>, | ||
| Pointer<RacVlmOptionsStruct>, | ||
| Pointer<NativeFunction<Int32 Function(Pointer<Utf8>, Pointer<Void>)>>, | ||
| Pointer< | ||
| NativeFunction< | ||
| Void Function(Pointer<RacVlmResultStruct>, Pointer<Void>)>>, | ||
| Pointer< | ||
| NativeFunction< | ||
| Void Function(Int32, Pointer<Utf8>, Pointer<Void>)>>, | ||
| Pointer<Void>, | ||
| )>('rac_vlm_component_process_stream'); | ||
|
|
||
| // This FFI call blocks until processing is complete | ||
| final status = processStreamFn( | ||
| handle, | ||
| imagePtr, | ||
| promptPtr, | ||
| optionsPtr, | ||
| tokenCallbackPtr, | ||
| completeCallbackPtr, | ||
| errorCallbackPtr, | ||
| nullptr, | ||
| ); | ||
|
|
||
| if (status != RAC_SUCCESS) { | ||
| params.sendPort.send(_VlmStreamingMessage( | ||
| error: | ||
| 'Failed to start streaming: ${RacResultCode.getMessage(status)}', | ||
| )); | ||
| } | ||
| } catch (e) { | ||
| params.sendPort.send(_VlmStreamingMessage(error: 'Streaming exception: $e')); | ||
| } finally { | ||
| calloc.free(promptPtr); | ||
| calloc.free(imagePtr); | ||
| calloc.free(optionsPtr); | ||
| if (filePathPtr != null) calloc.free(filePathPtr); | ||
| if (pixelDataPtr != null) calloc.free(pixelDataPtr); | ||
| if (base64DataPtr != null) calloc.free(base64DataPtr); | ||
| _vlmIsolateSendPort = null; | ||
| } |
Prevent concurrent streaming calls from overwriting the global SendPort.
_vlmIsolateSendPort is global; a second stream can overwrite it and misroute tokens. Consider guarding against concurrent streams (or documenting single-stream usage).
✅ Simple single-stream guard
class DartBridgeVLM {
+ bool _streamingActive = false;
...
Stream<String> processImageStream({
required String prompt,
required int imageFormat,
String? filePath,
Uint8List? pixelData,
int width = 0,
int height = 0,
String? base64Data,
int maxTokens = 2048,
double temperature = 0.7,
double topP = 0.9,
bool useGpu = true,
}) {
+ if (_streamingActive) {
+ throw StateError('VLM streaming already in progress');
+ }
+ _streamingActive = true;
final handle = getHandle();
...
Future<void> _startBackgroundStreaming(
...
receivePort.listen((message) {
if (controller.isClosed) return;
if (message is String) {
controller.add(message);
} else if (message is _VlmStreamingMessage) {
if (message.isComplete) {
controller.close();
receivePort.close();
+ _streamingActive = false;
} else if (message.error != null) {
controller.addError(StateError(message.error!));
controller.close();
receivePort.close();
+ _streamingActive = false;
}
}
  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-flutter/packages/runanywhere/lib/native/dart_bridge_vlm.dart`
around lines 654 - 793, The global _vlmIsolateSendPort is being overwritten by
concurrent isolates which can misroute tokens; in _vlmStreamingIsolateEntry
either (A) avoid using the global: remove the assignment to _vlmIsolateSendPort
and use a local final sendPort = params.sendPort everywhere in this function
(and in callbacks capture that sendPort) or (B) add a guard at the top of
_vlmStreamingIsolateEntry that checks if _vlmIsolateSendPort != null and, if so,
calls params.sendPort.send(...) with an error message and returns; ensure the
finally block no longer clears a sendPort you didn't set (or clears only when
you set it). Use the symbols _vlmStreamingIsolateEntry, _vlmIsolateSendPort,
_VlmStreamingIsolateParams and the callback functions referenced (e.g.,
_vlmIsolateTokenCallback/_vlmIsolateCompleteCallback/_vlmIsolateErrorCallback)
to locate and implement the change.
|
|
||
| rac_handle_t handle_ = nullptr; | ||
| std::string loadedModelId_; | ||
| bool cancellationRequested_ = false; |
cancellationRequested_ must be std::atomic<bool> — plain bool is a data race.
cancel() is designed to be called from a different thread than processStream(). Writing a plain bool from one thread while another thread reads it is undefined behaviour under C++11's memory model regardless of the platform's word size. Use std::atomic<bool> with the appropriate memory order.
🔒 Proposed fix
+#include <atomic>
private:
VLMBridge();
~VLMBridge();
VLMBridge(const VLMBridge&) = delete;
VLMBridge& operator=(const VLMBridge&) = delete;
rac_handle_t handle_ = nullptr;
std::string loadedModelId_;
- bool cancellationRequested_ = false;
+ std::atomic<bool> cancellationRequested_{false};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/llamacpp/cpp/bridges/VLMBridge.hpp` at
line 119, Replace the plain bool cancellationRequested_ with std::atomic<bool>
and include <atomic>; update its declaration in VLMBridge.hpp (symbol:
cancellationRequested_) to initialize to false as an atomic, and ensure cancel()
stores true (use store(true, std::memory_order_relaxed) or stronger if you
prefer) while processStream() reads it via load(...) (or uses the atomic in
conditions) to avoid data races—look for all accesses to cancellationRequested_
and switch them to atomic store/load semantics.
| return buildJsonObject({ | ||
| {"text", jsonString(result.text)}, | ||
| {"promptTokens", std::to_string(result.promptTokens)}, | ||
| {"completionTokens", std::to_string(result.completionTokens)}, | ||
| {"totalTimeMs", std::to_string(result.totalTimeMs)}, | ||
| {"tokensPerSecond", std::to_string(result.tokensPerSecond)} | ||
| }); |
JSON key mismatch between C++ and TypeScript — all VLM metrics will be zero.
The C++ side emits camelCase keys (promptTokens, completionTokens, totalTimeMs, tokensPerSecond), but the TypeScript consumer in RunAnywhere+VLM.ts (lines 198–201) reads snake_case keys (result.prompt_tokens, result.completion_tokens, result.total_time_ms, result.tokens_per_second). Since the keys never match, all parsed metrics will fall through to the ?? 0 defaults and every VLMResult will report zeros for tokens and timing.
Either align C++ keys to snake_case or update the TS parsing. I'd recommend fixing C++ to match the existing TS convention:
Proposed fix (align C++ keys to snake_case)
return buildJsonObject({
{"text", jsonString(result.text)},
- {"promptTokens", std::to_string(result.promptTokens)},
- {"completionTokens", std::to_string(result.completionTokens)},
- {"totalTimeMs", std::to_string(result.totalTimeMs)},
- {"tokensPerSecond", std::to_string(result.tokensPerSecond)}
+ {"prompt_tokens", std::to_string(result.promptTokens)},
+ {"completion_tokens", std::to_string(result.completionTokens)},
+ {"total_time_ms", std::to_string(result.totalTimeMs)},
+ {"tokens_per_second", std::to_string(result.tokensPerSecond)}
  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@sdk/runanywhere-react-native/packages/llamacpp/cpp/HybridRunAnywhereLlama.cpp`
around lines 495 - 501, The JSON emitted by buildJsonObject in
HybridRunAnywhereLlama.cpp uses camelCase keys (e.g., "promptTokens",
"completionTokens", "totalTimeMs", "tokensPerSecond") but the TypeScript
consumer RunAnywhere+VLM.ts expects snake_case (result.prompt_tokens,
result.completion_tokens, result.total_time_ms, result.tokens_per_second);
update the JSON keys in the buildJsonObject return (the block that constructs
{"text", jsonString(result.text)} ...) to use snake_case names that match the TS
parser (prompt_tokens, completion_tokens, total_time_ms, tokens_per_second) so
the metrics are parsed correctly.
| const native = requireNativeLlamaModule(); | ||
|
|
||
| // Unregister VLM first (matches iOS: unregister VLM before LLM) | ||
| if (this.isVLMRegistered) { | ||
| try { | ||
| await native.unloadVLMModel(); | ||
| this.isVLMRegistered = false; | ||
| vlmLog.info('LlamaCPP VLM backend unregistered'); | ||
| } catch (error) { | ||
| vlmLog.error(`LlamaCPP VLM unregistration failed: ${error instanceof Error ? error.message : String(error)}`); | ||
| } | ||
| } |
🧩 Analysis chain
🏁 Script executed:
# Search for unregisterVLMBackend in the codebase
rg -n "unregisterVLMBackend" --type=tsRepository: RunanywhereAI/runanywhere-sdks
Length of output: 56
🏁 Script executed:
# Also search for the native module interface/spec
rg -n "unloadVLMModel|registerVLMBackend" --type=ts -B2 -A2Repository: RunanywhereAI/runanywhere-sdks
Length of output: 6245
🏁 Script executed:
# Find the native module type definitions
fd -e "\.ts" -e "\.tsx" | xargs rg "requireNativeLlamaModule|NativeLlamaModule" | head -20Repository: RunanywhereAI/runanywhere-sdks
Length of output: 2315
🏁 Script executed:
# Check the full native module spec to understand the VLM API design
cat -n sdk/runanywhere-react-native/packages/llamacpp/src/specs/RunAnywhereLlama.nitro.ts | sed -n '160,190p'
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1500
🏁 Script executed:
# Also check if there are any comments or docs explaining the VLM registration/unregistration strategy
rg -n "registerVLMBackend|unloadVLMModel" -B5 -A5 sdk/runanywhere-react-native/packages/llamacpp/src/specs/RunAnywhereLlama.nitro.tsRepository: RunanywhereAI/runanywhere-sdks
Length of output: 881
🏁 Script executed:
# Check how iOS implements VLM cleanup for comparison
fd -e "swift" | xargs rg -l "VLM|registerVLM" 2>/dev/null | head -5Repository: RunanywhereAI/runanywhere-sdks
Length of output: 527
🏁 Script executed:
# Check the iOS Swift implementation of VLM cleanup
cat -n sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VLM/RunAnywhere+VisionLanguage.swift | head -150
Repository: RunanywhereAI/runanywhere-sdks
Length of output: 6880
🏁 Script executed:
# Search for unload/unregister patterns in iOS VLM implementation
rg -n "unload|unregister" sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/VLM/ -B3 -A3Repository: RunanywhereAI/runanywhere-sdks
Length of output: 1144
🏁 Script executed:
# Check the iOS CppBridge VLM implementation
fd -e "swift" -path "*/CppBridge*" | xargs rg -l "VLM" 2>/dev/nullRepository: RunanywhereAI/runanywhere-sdks
Length of output: 9163
🏁 Script executed:
# Search for VLM registration/unregistration in iOS CppBridge
rg -n "registerVLMBackend|unload|unregister" sdk/runanywhere-swift/ -B2 -A5 | grep -A10 -B10 "VLM"Repository: RunanywhereAI/runanywhere-sdks
Length of output: 12293
🏁 Script executed:
# Check how the LlamaCppProvider.unregister() compares to iOS implementation
rg -n "unregister\|deinit" sdk/runanywhere-swift/Sources/RunAnywhere/ --type=swift | head -30Repository: RunanywhereAI/runanywhere-sdks
Length of output: 56
Add unregisterVLMBackend() to native module and call it during cleanup.
The React Native implementation unloads only the VLM model but doesn't unregister the VLM backend—unlike iOS, which calls rac_backend_llamacpp_vlm_unregister() during unregister(). The C++ API for rac_backend_llamacpp_vlm_unregister() exists but is not exposed in the React Native native module interface. This leaves the VLM backend registered in C++ after the JS-side flag is cleared, creating state inconsistency with iOS.
Add unregisterVLMBackend() to RunAnywhereLlama.nitro.ts and call it in LlamaCppProvider.unregister() instead of (or in addition to) unloadVLMModel() to match iOS behavior.
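A rough sketch of what that could look like; the unregisterVLMBackend name and its wiring to rac_backend_llamacpp_vlm_unregister() are proposed here, not an existing API:

// Sketch only; the real nitro spec and provider differ in detail.
// Assumed: unregisterVLMBackend() is the new method to add; unloadVLMModel()
// already exists (it is awaited in the current provider code).
interface RunAnywhereLlamaVLM {
  unloadVLMModel(): Promise<void>;
  unregisterVLMBackend(): Promise<void>; // would wrap rac_backend_llamacpp_vlm_unregister()
}

interface VLMLogger {
  info(msg: string): void;
  error(msg: string): void;
}

// Teardown helper mirroring the iOS order: unload the model, then unregister
// the backend, then let the caller clear its isVLMRegistered flag.
export async function unregisterVLM(
  native: RunAnywhereLlamaVLM,
  vlmLog: VLMLogger,
): Promise<boolean> {
  try {
    await native.unloadVLMModel();
    await native.unregisterVLMBackend();
    vlmLog.info('LlamaCPP VLM backend unregistered');
    return true;
  } catch (error) {
    vlmLog.error(
      `LlamaCPP VLM unregistration failed: ${error instanceof Error ? error.message : String(error)}`,
    );
    return false;
  }
}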
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/llamacpp/src/LlamaCppProvider.ts`
around lines 121 - 132, The native VLM backend isn't being unregistered from the
C++ layer—add and wire a new method named unregisterVLMBackend on the native
module (in RunAnywhereLlama.nitro.ts) and call it from
LlamaCppProvider.unregister() instead of only calling unloadVLMModel(); locate
the native accessor via requireNativeLlamaModule(), add an exported
unregisterVLMBackend() that invokes the underlying
rac_backend_llamacpp_vlm_unregister() binding, then update
LlamaCppProvider.unregister() to await native.unregisterVLMBackend() (with the
same try/catch and logging using vlmLog) and clear this.isVLMRegistered
accordingly.
| try { | ||
| const result = JSON.parse(resultJson); | ||
| return { | ||
| text: result.text ?? '', | ||
| promptTokens: result.prompt_tokens ?? 0, | ||
| completionTokens: result.completion_tokens ?? 0, | ||
| totalTimeMs: result.total_time_ms ?? 0, | ||
| tokensPerSecond: result.tokens_per_second ?? 0, | ||
| }; | ||
| } catch { | ||
| if (resultJson.includes('error')) { | ||
| throw new Error(resultJson); | ||
| } | ||
| return { | ||
| text: resultJson, | ||
| promptTokens: 0, | ||
| completionTokens: 0, | ||
| totalTimeMs: 0, | ||
| tokensPerSecond: 0, | ||
| }; | ||
| } |
JSON key mismatch: parsed keys don't match C++ output.
Lines 198–201 read result.prompt_tokens, result.completion_tokens, etc. (snake_case), but the C++ bridge emits promptTokens, completionTokens, etc. (camelCase). All metrics will silently fall to the ?? 0 defaults. See the corresponding comment on HybridRunAnywhereLlama.cpp lines 495–501 for the proposed fix.
If the C++ keys are changed to snake_case, this code is correct as-is. Otherwise, update the keys here:
Alternative: fix on the TS side
return {
text: result.text ?? '',
- promptTokens: result.prompt_tokens ?? 0,
- completionTokens: result.completion_tokens ?? 0,
- totalTimeMs: result.total_time_ms ?? 0,
- tokensPerSecond: result.tokens_per_second ?? 0,
+ promptTokens: result.promptTokens ?? 0,
+ completionTokens: result.completionTokens ?? 0,
+ totalTimeMs: result.totalTimeMs ?? 0,
+ tokensPerSecond: result.tokensPerSecond ?? 0,
  };
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try { | |
| const result = JSON.parse(resultJson); | |
| return { | |
| text: result.text ?? '', | |
| promptTokens: result.prompt_tokens ?? 0, | |
| completionTokens: result.completion_tokens ?? 0, | |
| totalTimeMs: result.total_time_ms ?? 0, | |
| tokensPerSecond: result.tokens_per_second ?? 0, | |
| }; | |
| } catch { | |
| if (resultJson.includes('error')) { | |
| throw new Error(resultJson); | |
| } | |
| return { | |
| text: resultJson, | |
| promptTokens: 0, | |
| completionTokens: 0, | |
| totalTimeMs: 0, | |
| tokensPerSecond: 0, | |
| }; | |
| } | |
| try { | |
| const result = JSON.parse(resultJson); | |
| return { | |
| text: result.text ?? '', | |
| promptTokens: result.promptTokens ?? 0, | |
| completionTokens: result.completionTokens ?? 0, | |
| totalTimeMs: result.totalTimeMs ?? 0, | |
| tokensPerSecond: result.tokensPerSecond ?? 0, | |
| }; | |
| } catch { | |
| if (resultJson.includes('error')) { | |
| throw new Error(resultJson); | |
| } | |
| return { | |
| text: resultJson, | |
| promptTokens: 0, | |
| completionTokens: 0, | |
| totalTimeMs: 0, | |
| tokensPerSecond: 0, | |
| }; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere`+VLM.ts around
lines 194 - 214, The JSON parsing block in RunAnywhere+VLM.ts is reading
snake_case keys (result.prompt_tokens, result.completion_tokens,
result.total_time_ms, result.tokens_per_second) but the C++ bridge emits
camelCase (promptTokens, completionTokens, totalTimeMs, tokensPerSecond); update
the return mapping in the try branch to read the camelCase properties (e.g.,
result.promptTokens, result.completionTokens, result.totalTimeMs,
result.tokensPerSecond) and optionally keep a fallback to the snake_case names
(e.g., result.promptTokens ?? result.prompt_tokens ?? 0) so parsing in the
function that uses resultJson handles both key styles.
| const finalResult: VLMResult = { | ||
| text: fullText, | ||
| promptTokens: Math.ceil(prompt.length / 4), | ||
| completionTokens: tokenCount, | ||
| totalTimeMs: latencyMs, | ||
| tokensPerSecond, | ||
| }; |
promptTokens is fabricated in the streaming path.
Math.ceil(prompt.length / 4) is a rough character-based guess, not an actual token count. The non-streaming processVLMImage path returns real metrics from the C++ backend, so consumers comparing the two will see inconsistent values. Consider either:
- Returning 0 with a clear doc note that streaming doesn't report prompt tokens, or
- Propagating the actual count from the onComplete callback's rac_vlm_result_t* (which presumably contains real metrics); a sketch of this option follows.
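A sketch of the second option, assuming the native onComplete payload reaches JS as a JSON string with the same field names as the non-streaming result (the helper and parameter names are illustrative):

// Sketch under assumptions: the native onComplete callback surfaces its
// rac_vlm_result_t data to JS as JSON with the same fields as the
// non-streaming result. Helper and parameter names are illustrative.
interface VLMResult {
  text: string;
  promptTokens: number;
  completionTokens: number;
  totalTimeMs: number;
  tokensPerSecond: number;
}

export function buildStreamingResult(
  fullText: string,
  completionJson: string | undefined, // payload from the native onComplete callback
  fallback: { tokenCount: number; latencyMs: number; tokensPerSecond: number },
): VLMResult {
  let native: Partial<VLMResult> = {};
  if (completionJson) {
    try {
      native = JSON.parse(completionJson);
    } catch {
      // Keep fallback values if the payload is not valid JSON.
    }
  }
  return {
    text: fullText,
    // Prefer the backend's real count; report 0 rather than a character-based guess.
    promptTokens: native.promptTokens ?? 0,
    completionTokens: native.completionTokens ?? fallback.tokenCount,
    totalTimeMs: native.totalTimeMs ?? fallback.latencyMs,
    tokensPerSecond: native.tokensPerSecond ?? fallback.tokensPerSecond,
  };
}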
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk/runanywhere-react-native/packages/llamacpp/src/RunAnywhere`+VLM.ts around
lines 324 - 330, The current streaming path fabricates promptTokens using
Math.ceil(prompt.length / 4) in the VLMResult, causing inconsistent metrics vs.
processVLMImage's real backend values; fix by removing the heuristic and either
set promptTokens to 0 with a clear comment that streaming doesn't report prompt
token count, or (preferable) read the actual token counts from the native
callback (use the onComplete handler's rac_vlm_result_t* result structure) and
populate VLMResult.promptTokens from that real metric; update the code that
constructs finalResult (VLMResult) in the streaming flow to pull promptTokens
from the onComplete rac_vlm_result_t* instead of using Math.ceil.
|
Hey @VyasGuru |
Have talked this over with @shubhammalhotra28 on Slack; this version works. The bot report does have a few things worth checking over, but for now I'm keeping this as is because it is working.
Noted ✔️ |
The Flutter SDK and the example app have been updated to support VLM.
The React Native SDK has also been updated.
Summary by CodeRabbit
Release Notes
New Features
Dependencies
Greptile Summary
Added comprehensive Vision Language Model (VLM) support to both Flutter and React Native SDKs with example implementations.
Key Changes:
- DartBridgeVLM with FFI C++ interop, isolate-based streaming to prevent ANR; supports three image formats (file path, RGB pixels, base64)
- VLMBridge C++ implementation with proper lifecycle management, backend registration pattern, and streaming callbacks

Architecture:
- rac_vlm_component_* C APIs

Confidence Score: 4/5
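For orientation, a minimal React Native usage sketch of that flow; the registration call and config shape mirror App.tsx above, while the loadVLMModel/processVLMImage signatures and the import path are assumptions, not a confirmed public API:

// Minimal end-to-end sketch of the flow summarized above. The addVLMModel
// config mirrors App.tsx; loadVLMModel/processVLMImage signatures and the
// import path are assumptions.
import { LlamaCPP, RunAnywhere } from 'runanywhere-react-native'; // hypothetical import path

export async function describeImage(imagePath: string): Promise<string> {
  // 1. Register the VLM model with the llama.cpp backend.
  await LlamaCPP.addVLMModel({
    id: 'smolvlm-500m-instruct-q8_0',
    name: 'SmolVLM 500M Instruct',
    url: 'https://github.com/RunanywhereAI/sherpa-onnx/releases/download/runanywhere-vlm-models-v1/smolvlm-500m-instruct-q8_0.tar.gz',
    memoryRequirement: 600_000_000,
  });

  // 2. Load it; the SDK resolves the GGUF and mmproj paths internally.
  await RunAnywhere.loadVLMModel('smolvlm-500m-instruct-q8_0');

  // 3. Run a non-streaming description; the result carries backend metrics.
  const result = await RunAnywhere.processVLMImage(imagePath, 'Describe this image.');
  return result.text;
}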
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant App as Example App
    participant SDK as RunAnywhere SDK
    participant Bridge as VLM Bridge (Dart/TS)
    participant CPP as C++ VLM Component
    participant Model as VLM Model (llama.cpp)

    Note over App,Model: VLM Model Loading Flow
    App->>SDK: loadVLMModel(modelId)
    SDK->>SDK: Find model in registry
    SDK->>SDK: Resolve GGUF file path
    SDK->>SDK: Find mmproj file
    SDK->>Bridge: loadModel(modelPath, mmprojPath)
    Bridge->>CPP: rac_vlm_component_create()
    CPP-->>Bridge: handle
    Bridge->>CPP: rac_vlm_component_load_model()
    CPP->>Model: Load model & mmproj
    Model-->>CPP: Model ready
    CPP-->>Bridge: RAC_SUCCESS
    Bridge-->>SDK: Success
    SDK-->>App: Model loaded

    Note over App,Model: Image Processing (Streaming)
    App->>SDK: processImageStream(image, prompt)
    SDK->>Bridge: processImageStream()
    Bridge->>Bridge: Spawn background isolate
    Bridge->>CPP: rac_vlm_component_process_stream()
    CPP->>Model: Encode image
    CPP->>Model: Generate tokens
    loop Token Generation
        Model-->>CPP: Token
        CPP-->>Bridge: onToken callback
        Bridge-->>SDK: Stream token
        SDK-->>App: Display token
    end
    Model-->>CPP: Complete
    CPP-->>Bridge: onComplete callback
    Bridge-->>SDK: Stream complete
    SDK-->>App: Final metrics

    Note over App,Model: Camera Auto-Streaming Mode
    App->>App: Start 2.5s timer
    loop Every 2.5 seconds
        App->>App: Capture camera frame
        App->>SDK: processImageStream(frame)
        SDK->>Bridge: Process with streaming
        Bridge->>CPP: Generate description
        CPP-->>App: Stream tokens to UI
    end

Last reviewed commit: 9ed97e3
Context used:
dashboard- CLAUDE.md (source)