Tensor preprocessing library for Flutter/Dart. NumPy-like transforms pipeline for ONNX Runtime, TFLite, and other AI inference engines.
- PyTorch Compatible: Matches PyTorch/torchvision tensor operations
- Non-blocking: Isolate-based async execution prevents UI jank
- Type-safe: ONNX-compatible tensor types (Float32, Int64, Uint8, etc.)
- Zero-copy: View/stride manipulation for reshape/transpose operations
- Declarative: Chain operations into reusable pipelines
dependencies:
  dart_tensor_preprocessing: ^0.5.1

import 'dart:typed_data';

import 'package:dart_tensor_preprocessing/dart_tensor_preprocessing.dart';
// Create a tensor from image data (HWC format, Uint8)
final imageData = Uint8List.fromList([/* RGB pixel data, height * width * channels bytes */]);
final tensor = TensorBuffer.fromUint8List(imageData, [height, width, channels]);
// Use a preset pipeline for ImageNet models
final pipeline = PipelinePresets.imagenetClassification();
final result = await pipeline.runAsync(tensor);
// result.shape: [1, 3, 224, 224] (NCHW, Float32, normalized)

| Preset | Output Shape | Use Case |
|---|---|---|
| `imagenetClassification()` | [1, 3, 224, 224] | ResNet, VGG, etc. |
| `objectDetection()` | [1, 3, 640, 640] | YOLO, SSD |
| `faceRecognition()` | [1, 3, 112, 112] | ArcFace, FaceNet |
| `clip()` | [1, 3, 224, 224] | CLIP models |
| `mobileNet()` | [1, 3, 224, 224] | MobileNet family |
final pipeline = TensorPipeline([
  ResizeOp(height: 224, width: 224),
  ToTensorOp(normalize: true), // HWC -> CHW, scale to [0,1]
  NormalizeOp.imagenet(), // ImageNet mean/std
  UnsqueezeOp.batch(), // Add batch dimension
]);
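To make the `ToTensorOp` and `NormalizeOp.imagenet()` steps concrete, here is a minimal plain-Dart sketch of the arithmetic they are described as performing: HWC uint8 to CHW float32, scaling to [0, 1], then per-channel `(x - mean) / std`. The helper `hwcToNormalizedChw` and the constants are hypothetical names used for illustration, not part of this library's API, and the mean/std values are the commonly used ImageNet statistics rather than values read from the library.

```dart
import 'dart:typed_data';

// Commonly used ImageNet statistics in RGB order (assumed, not read from the library).
const imagenetMean = [0.485, 0.456, 0.406];
const imagenetStd = [0.229, 0.224, 0.225];

/// Hypothetical helper: converts HWC uint8 pixels to CHW float32,
/// scales to [0, 1], then applies per-channel (x - mean) / std.
Float32List hwcToNormalizedChw(
  Uint8List pixels,
  int height,
  int width,
  List<double> mean,
  List<double> std,
) {
  final channels = mean.length;
  if (std.length != channels || pixels.length != height * width * channels) {
    throw ArgumentError('pixel buffer, mean, and std must agree on channels');
  }
  final out = Float32List(channels * height * width);
  for (var y = 0; y < height; y++) {
    for (var x = 0; x < width; x++) {
      for (var c = 0; c < channels; c++) {
        final src = (y * width + x) * channels + c; // offset in HWC layout
        final dst = (c * height + y) * width + x; // offset in CHW layout
        out[dst] = (pixels[src] / 255.0 - mean[c]) / std[c];
      }
    }
  }
  return out;
}
```

Calling `hwcToNormalizedChw(pixels, 224, 224, imagenetMean, imagenetStd)` on an already resized RGB buffer yields a `[3, 224, 224]` float buffer; the batch dimension added by `UnsqueezeOp.batch()` only changes the shape metadata, not these values.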
// Sync execution
final result = pipeline.run(input);
// Async execution (runs in isolate)
final result = await pipeline.runAsync(input);
// Async with custom isolate threshold (default: 100,000 elements)
// Small tensors skip isolate overhead and run synchronously
final result = await pipeline.runAsync(input, isolateThreshold: 50000);

- `ResizeOp` - Resize to fixed dimensions (nearest, bilinear, bicubic)
- `ResizeShortestOp` - Resize preserving aspect ratio
- `CenterCropOp` - Center crop to fixed dimensions
- `ClipOp` - Element-wise value clamping (presets: unit, symmetric, uint8)
- `PadOp` - Padding with multiple modes (constant, reflect, replicate, circular)
- `SliceOp` - Python-like tensor slicing with negative index support
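The Python-like slicing with negative indices mentioned for `SliceOp` can be sketched as follows. This is an illustration of the indexing rule only, restricted to positive steps; `resolveSlice` is a hypothetical helper and says nothing about how `SliceOp` is actually implemented.

```dart
/// Hypothetical helper: resolves a Python-style slice against a dimension of
/// [length] and returns the selected indices. Only step > 0 is handled.
List<int> resolveSlice(int length, {int? start, int? end, int step = 1}) {
  if (step <= 0) {
    throw ArgumentError('this sketch only handles step > 0');
  }

  // Negative indices count from the end; out-of-range values are clamped.
  int resolve(int index) {
    final shifted = index < 0 ? index + length : index;
    if (shifted < 0) return 0;
    if (shifted > length) return length;
    return shifted;
  }

  final lo = resolve(start ?? 0);
  final hi = resolve(end ?? length);
  return [for (var i = lo; i < hi; i += step) i];
}
```

For a dimension of length 10, `resolveSlice(10, start: -3)` selects the last three indices, which mirrors `tensor[-3:]` in Python.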
- `NormalizeOp` - Channel-wise normalization (presets: ImageNet, CIFAR-10, symmetric)
- `ScaleOp` - Scale values (e.g., [0-255] to [0-1])
- `BatchNormOp` - Batch normalization for CNN inference (PyTorch compatible)
- `LayerNormOp` - Layer normalization for Transformer inference (presets: BERT, BERT-Large)

- `PermuteOp` - Axis reordering (e.g., HWC to CHW)
- `ToTensorOp` - HWC uint8 to CHW float32 with optional scaling
- `ToImageOp` - CHW float32 to HWC uint8

- `RandomCropOp` - Random cropping with deterministic seed support
- `GaussianBlurOp` - Gaussian blur using separable convolution

- `concat()` - Concatenates tensors along the specified axis
- `UnsqueezeOp` - Add dimension
- `SqueezeOp` - Remove size-1 dimensions
- `ReshapeOp` - Reshape tensor (supports -1 for inference)
- `FlattenOp` - Flatten dimensions
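The "-1 for inference" behaviour of `ReshapeOp` follows the usual NumPy/PyTorch rule: at most one dimension may be -1, and it is filled in so that the total element count is preserved. A minimal sketch of that rule, using a hypothetical `inferShape` helper that is not part of this library:

```dart
/// Hypothetical helper: resolves a single -1 entry in [shape] so that the
/// product of all dimensions equals [elementCount].
List<int> inferShape(List<int> shape, int elementCount) {
  final unknown = shape.indexOf(-1);
  if (unknown != shape.lastIndexOf(-1)) {
    throw ArgumentError('at most one dimension may be -1');
  }

  // Product of all explicitly given dimensions.
  var known = 1;
  for (var i = 0; i < shape.length; i++) {
    if (i == unknown) continue;
    if (shape[i] <= 0) {
      throw ArgumentError('dimensions must be positive, got ${shape[i]}');
    }
    known *= shape[i];
  }

  if (unknown == -1) {
    if (known != elementCount) {
      throw ArgumentError('shape $shape does not hold $elementCount elements');
    }
    return List.of(shape);
  }
  if (elementCount % known != 0) {
    throw ArgumentError('cannot infer -1: $elementCount is not divisible by $known');
  }
  final resolved = List.of(shape);
  resolved[unknown] = elementCount ~/ known;
  return resolved;
}
```

For a `[3, 224, 224]` tensor (150528 elements), `inferShape([1, -1], 150528)` resolves to `[1, 150528]`.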
- `TypeCastOp` - Convert between data types
Tensor with shape and stride metadata over physical storage.
// Create tensors
final zeros = TensorBuffer.zeros([3, 224, 224]);
final ones = TensorBuffer.ones([3, 224, 224], dtype: DType.float32);
final fromData = TensorBuffer.fromFloat32List(data, [3, 224, 224]);
// Access elements
final value = tensor[[0, 100, 100]];
// Zero-copy operations
final transposed = tensor.transpose([2, 0, 1]); // Changes strides only
final squeezed = tensor.squeeze();
// Copy operations
final contiguous = tensor.contiguous(); // Force contiguous memory
final cloned = tensor.clone();

ONNX-compatible data types with `onnxId` for runtime integration.
DType.float32 // ONNX ID: 1
DType.int64 // ONNX ID: 7
DType.uint8 // ONNX ID: 2

Memory pooling for buffer reuse, reducing GC pressure in hot paths.
final pool = BufferPool.instance;
// Acquire buffer (reuses from pool if available)
final buffer = pool.acquireFloat32(1000);
// ... use buffer ...
// Release back to pool for reuse
pool.release(buffer);
// Monitor pool usage
print('Pooled: ${pool.pooledCount} buffers, ${pool.pooledBytes} bytes');

TensorBuffer extension methods for zero-copy tensor manipulation:
// Slice along first dimension (batch slicing)
final batch = tensor.sliceFirst(2, 5); // Views elements 2..4
// Split tensor into views
final items = tensor.unbind(0); // List of views along dim 0
// Select single index (reduces rank)
final first = tensor.select(0, 0); // First item, shape reduced
// Narrow dimension
final narrowed = tensor.narrow(0, 1, 3); // 3 elements starting at 1
// Format conversion without copying
final nhwc = nchwTensor.toChannelsLast(); // NCHW -> NHWC view
final nchw = nhwcTensor.toChannelsFirst(); // NHWC -> NCHW view
// Flatten to 1D view
final flat = tensor.flatten();

| Format | Layout | Strides (for shape [1, 3, 224, 224]) |
|---|---|---|
| `contiguous` | NCHW | [150528, 50176, 224, 1] |
| `channelsLast` | NHWC | [150528, 1, 672, 3] |
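The stride values in the table can be derived by hand. The sketch below computes them in plain Dart and shows how a logical `[n, c, h, w]` index maps to a physical offset, which is why a layout change can be expressed as a view without copying. The function names are hypothetical and are not this library's API.

```dart
/// Row-major strides: the last dimension is densest in memory.
List<int> contiguousStrides(List<int> shape) {
  final strides = List<int>.filled(shape.length, 1);
  for (var i = shape.length - 2; i >= 0; i--) {
    strides[i] = strides[i + 1] * shape[i + 1];
  }
  return strides;
}

/// Strides for a logical [N, C, H, W] shape stored physically as NHWC.
List<int> channelsLastStrides(List<int> nchw) {
  if (nchw.length != 4) throw ArgumentError('expected [N, C, H, W]');
  final c = nchw[1], h = nchw[2], w = nchw[3];
  return [h * w * c, 1, w * c, c];
}

/// Physical offset of a logical index: the dot product of index and strides.
int offsetOf(List<int> index, List<int> strides) {
  var offset = 0;
  for (var i = 0; i < index.length; i++) {
    offset += index[i] * strides[i];
  }
  return offset;
}
```

For shape `[1, 3, 224, 224]` these functions reproduce both rows of the table. The same logical element `[0, 2, 1, 1]` sits at offset 100577 in the contiguous layout and at offset 677 in the channels-last layout.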
This library is designed to produce identical results to PyTorch/torchvision operations:
| Operation | PyTorch Equivalent |
|---|---|
| `TensorBuffer.zeros()` | `torch.zeros()` |
| `TensorBuffer.ones()` | `torch.ones()` |
| `tensor.transpose()` | `tensor.permute()` |
| `tensor.reshape()` | `tensor.reshape()` |
| `tensor.squeeze()` | `tensor.squeeze()` |
| `tensor.unsqueeze()` | `tensor.unsqueeze()` |
| `tensor.sum()` / `sumAxis()` | `tensor.sum()` |
| `tensor.mean()` / `meanAxis()` | `tensor.mean()` |
| `tensor.min()` / `max()` | `tensor.min()` / `max()` |
| `NormalizeOp.imagenet()` | `transforms.Normalize(mean, std)` |
| `ResizeOp(mode: bilinear)` | `F.interpolate(mode='bilinear')` |
| `ToTensorOp()` | `transforms.ToTensor()` |
| `ClipOp(min, max)` | `torch.clamp(min, max)` |
| `PadOp(mode: reflect)` | `F.pad(mode='reflect')` |
| `SliceOp([(start, end, step)])` | `tensor[start:end:step]` |
| `concat(tensors, axis)` | `torch.cat(tensors, dim)` |
| `RandomCropOp` | `transforms.RandomCrop()` |
| `GaussianBlurOp` | `transforms.GaussianBlur()` |
| `AddOp` / `SubOp` | `torch.add()` / `torch.sub()` |
| `MulOp` / `DivOp` | `torch.mul()` / `torch.div()` |
| `PowOp` | `torch.pow()` |
| `AbsOp` / `NegOp` | `torch.abs()` / `torch.neg()` |
| `SqrtOp` / `ExpOp` / `LogOp` | `torch.sqrt()` / `exp()` / `log()` |
| `ReLUOp` / `LeakyReLUOp` | `F.relu()` / `F.leaky_relu()` |
| `SigmoidOp` / `TanhOp` | `torch.sigmoid()` / `torch.tanh()` |
| `SoftmaxOp` | `F.softmax()` |
| `BatchNormOp` | `torch.nn.BatchNorm2d` (inference) |
| `LayerNormOp` | `torch.nn.LayerNorm` |
| `TensorBuffer.full()` | `torch.full()` |
| `TensorBuffer.random()` | `torch.rand()` |
| `TensorBuffer.randn()` | `torch.randn()` |
| `TensorBuffer.eye()` | `torch.eye()` |
| `TensorBuffer.linspace()` | `torch.linspace()` |
| `TensorBuffer.arange()` | `torch.arange()` |
| `tensor.select(dim, index)` | `tensor.select(dim, index)` |
| `tensor.narrow(dim, start, len)` | `tensor.narrow(dim, start, len)` |
| `tensor.unbind(dim)` | `tensor.unbind(dim)` |
| `tensor.flatten()` | `tensor.flatten()` |
Run benchmarks with `dart run benchmark/run_all.dart`.

| Operation | Time | Ops/sec |
|---|---|---|
| `transpose()` | ~1µs | 700K+ |
| `reshape()` | ~1µs | 1.6M+ |
| `squeeze()` | <1µs | 3.2M+ |
| `unsqueeze()` | ~1µs | 780K+ |
| Pipeline | Input Shape | Time |
|---|---|---|
| Simple (Normalize + Unsqueeze) | [3, 224, 224] | ~3.4ms |
| ImageNet Classification | [3, 224, 224] | ~3.0ms |
| Object Detection | [3, 640, 640] | ~25ms |
| Execution | 224x224 | 640x640 |
|---|---|---|
| `run()` (sync) | ~3.5ms | ~29ms |
| `runAsync()` (isolate) | ~11ms | ~93ms |
| Isolate overhead | ~7ms | ~64ms |

Note: Use `runAsync()` for large tensors or when UI responsiveness is critical.
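The isolate overhead in the table above is the reason `runAsync()` takes an `isolateThreshold`: below some size, copying data to another isolate costs more than the work itself. The sketch below shows one way such a switch could be written with `Isolate.run` from `dart:isolate`. The names `shouldUseIsolate`, `scaleToUnit`, and `scaleAsync` are hypothetical, and comparing with `>=` is an assumption; the library's actual dispatch logic may differ.

```dart
import 'dart:isolate';
import 'dart:typed_data';

/// Assumed rule: offload only when the tensor has at least [threshold] elements.
bool shouldUseIsolate(int elementCount, {int threshold = 100000}) =>
    elementCount >= threshold;

/// Stand-in for a pipeline step: scales [0, 255] values to [0, 1].
Float32List scaleToUnit(Float32List data) {
  final out = Float32List(data.length);
  for (var i = 0; i < data.length; i++) {
    out[i] = data[i] / 255.0;
  }
  return out;
}

/// Runs small inputs inline and large inputs in a short-lived isolate.
Future<Float32List> scaleAsync(
  Float32List data, {
  int isolateThreshold = 100000,
}) async {
  if (!shouldUseIsolate(data.length, threshold: isolateThreshold)) {
    return scaleToUnit(data); // small input: skip isolate startup and copy cost
  }
  // Captured data is copied into the new isolate; the result is sent back.
  return Isolate.run(() => scaleToUnit(data));
}
```

With the default threshold, a `[3, 224, 224]` tensor (150528 elements) would be offloaded, while a `[3, 112, 112]` tensor (37632 elements) would run inline. Note that `Isolate.run` is not available on the web platform.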
Requirements: Dart SDK ^3.0.0
License: MIT