Releases: google-ai-edge/mediapipe

MediaPipe v0.10.5

15 Sep 04:22

Framework and core calculator improvements

  • Fix crash in SavePngTestOutput
  • Log stack traces for combined CalculatorGraph statuses
  • Add a GpuOrigin parameter to TensorConverterCalculator
  • Replace some size EXPECTs by ASSERTs
  • Add support for label annotations (image/label/string and image/label/confidence). Also fix some clang-tidy issues.
  • Set confidence score of the bounding box label.
  • Add setGpuBufferVerticalFlip to GraphRunner TS API
  • Remove unsafe cast.
  • apply affine transform before drawing, in order to keep constant line width regardless of face cropping.
  • Migrate packet messages auto registration to rely on MEDIAPIPE_STATIC_REGISTRATOR_TEMPLATE
  • add end loop calculator for image size
  • Provide a way to disable static registration using MEDIAPIPE_DISABLE_STATIC_REGISTRATION
  • Header for callback_packet_calculator to allow dynamic registration for superusers
  • Support more GPU formats in tensor converter calculator.
  • Expose stream handlers in headers to allow dynamic registration for superusers
  • Expose tool calculators in headers to enable dynamic registration by superusers.
  • Dry-Run mode for static registration to make it easier to find all required static registrations
  • Fix MediaPipe build in Chromium.
  • Swap left and right hand labels.
  • Don't access "document" in WebWorker
  • Update PackMediaSequenceCalculator to support adding clip/media/id to the MediaSequence.
  • update pose rendering
  • Update the header information for EnsureMinimumDefaultExecutorStackSize.
  • Move stream API loopback to third_party.
  • Add pose landmarks constants
  • Add an API in model_task_graph to create or use cached model resources.
  • Move stream API image_size to third_party.
  • Add C++ converters for C Text Classifier API
  • Move stream API rect_transformation to third_party.
  • Change the image label input from Classification to Detection.
  • Update port includes with IWYU to fix clang warnings in code where corresponding ports are used.
  • New image test utilities and memory management fixes.
  • Add a custom op resolver for fused batch norm.
  • Improving throttling logs by providing a node info corresponding to a throttling stream.
  • Use ABSL_LOG in MediaPipe.
  • Remove reference pointer to prevent using a constant reference in the looped iteration variable
  • Remove unnecessary includes in threadpool_std_thread_impl.cc.
  • Make cache writes optional in InferenceCalculatorAdvancedGL
  • Update PackMediaSequenceCalculator to support setting clip/media/string, clip/media/confidence and clip/label/index.
  • Some spelling and grammar fixes in the comments.
  • Add notes/warnings for calculators which use dedicated GL contexts.
  • Remove video and stream mode in face stylizer.
  • Move stream API landmarks_projection to third_party.
  • Remove video and streaming mode for face stylizer.
  • landmarks_to_detection stream utility function.
  • Ensure that C headers don't import C++ types
  • Splitting GraphRunner into public API declared interfaces and private TS impls
  • Add option for nearest neighbor interpolation.
  • Fixes two issues with file handling on windows:
  • Remove unconditional texture params reset to make float textures handled correctly.
  • Fixes the non-unicode path of file_helpers on Windows
  • Modifying tensor_to_vector_float_calculator to take in D_BFLOAT16 values
  • Don't define field in ExternalFileHandler that's not used on Windows.
  • Clean up TensorConverterCalculator flipping behavior
  • Fix win32 build break in mediapipe.

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

  • Adds option to use tensor_ahwb in Android vendor processes
  • Add output size as parameters in Java ImageSegmenter
  • Change SegmentationOptions.builder() to be public
  • ImageGenerator Java API
  • Provide API/options to show intermediate results and generating progress for Java Image Generator.
  • Set enableFlowLimiting to false since only Image model is supported for face stylizer.
  • Move loading tasks-vision-jni to individual vision task class

iOS

  • Added refactored iOS vision task runner sources
  • Removed convenience initializer from refactored MPPVisionTaskRunner
  • Updated iOS docs to use Swift names in place of Objective-C names
  • Added gesture recognizer and hand landmarker to iOS vision framework
  • Fixed directory creation issues in build_ios_framework.sh
  • Changed delegate method to optional
  • Added iOS image segmenter implementation file
  • Updated image segmenter bazel target to add MPPImageSegmenter.mm
  • Renamed option in MPPImageSegmenterOptions
  • Updated iOS face detector to use refactored vision task runner
  • Updated iOS image classifier to use refactored vision task runner
  • Changed order of methods in MPPImageSegmenter.mm
  • Fixed method call in MPPImageSegmenter.mm
  • Updated face landmarker, gesture recognizer, hand landmarker, object detector to use refactored vision task runner
  • Replaced the old iOS vision task runner with the refactored task runner
  • Updated iOS gesture recognizer documentation to use Swift names
  • Updated iOS hand landmarker documentation to use Swift names
  • Moved iOS MPPHandLandmark enum to MPPHandLandmarker.h
  • Fixes iOS hand landmarker connections

Javascript

  • vlog default executor and its config usage
  • Updates the runners to support wasm-style binary assets files, and allows their URLs to be explicitly specified as part of the WasmFileset.
  • Add 'types' to package.json
  • Add externs to js_library targets
  • Add API exports for MPMask and MPImage
  • Add Handedness to JS, C++ and Android API
  • Fix missing exports for FilesetResolver and static constants
  • Add exports to ImageSegmenterResult and InteractiveSegmenterResult

Python

  • Set the default running mode to Image for face stylizer.

Bug fixes

  • Internal fixes

Model Maker changes

  • Add tensorflow-addons to model_maker requirements.txt

  • Change to add the w_avg latent code to style encoding before layer swapping. This was a bug in the previous code. Also set training=True for the encoder, since this affects the encoding performance.

  • add metadata writer into face stylizer.

  • Refactor text_classifier preprocessor to move away from using classifier_data_lib

  • Import image_util for using it in mediapipe face stylizer open sourcing.

  • Fix image_util shortcut import line

  • Change supported_ops to a Tuple instead of List to match the API definition.

  • Add a new from_image API to create face stylizer dataset from a single image. Also deprecate the from_folder API since we only support one-shot use case now.

  • Add an API to run inference with face stylizer TF model.

  • Check if the image contains valid face that can be aligned for stylization. If not, throw an exception for invalid input image. This is applied to both input stylized face and raw face.

  • Add allow_custom_ops to model_util.convert_to_tflite and enable custom ops for face stylizer.

MediaPipe Dependencies

  • Update WASM files for 10.5 release

MediaPipe v0.10.3

01 Aug 18:56

Build changes

  • Fix Halide BUILD rules
  • Fix Android build with any Protos

Framework changes

  • add symmetric color style option
  • InferenceCalculatorAdvancedGL save cache in Open().
  • MEDIAPIPE_NODE/SUBGRAPH_IMPLEMENTATION to use common define for registration
  • Generalize non-define registration with MEDIAPIPE_STATIC_REGISTRATOR_TEMPLATE
  • Discard outdated packets earlier in MuxInputStreamHandler.
  • Replace CHECK with RET_CHECK in GetContract() implementation from six calculators.
  • Move waitOnCpu and waitOnGpu out of the synchronized block, which can cause deadlock.
  • Adding support for 2 things in tensors_to_image_calculator:
  • C++ Image segmenter add output size parameters.
  • Add C Headers for Text Classifier

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

  • Add proto3 Any proto support for Java task api
  • Java API add visibility and presence for landmarks.

iOS

  • Changed left and right image orientation angles to match iOS UIImageOrientation
  • Updated documentation of MPImage
  • Added missing headers in ios vision framework build
  • Fixed swift name of iOS face landmarker delegate
  • Added iOS Image Segmenter Header

Javascript

  • Add angle to BoundingBox
  • Support WASM asset loading for MediaPipe Task Web
  • Update WASM binaries for 0.10.3 release

Model Maker changes

  • Allow the Model Maker core dataset library to handle datasets with unknown sizes.
  • Add class weights to core hyperparameters and classifier library.
  • Move evaluation onto GPU/TPU hardware if available.

MediaPipe Dependencies

  • Update glog to 0.6
  • Removed internal dependency on OpenCV 3.x, migrating it to OpenCV 4.x

MediaPipe v0.10.2

10 Jul 18:39

Build changes

  • Added gesture_recognizer.task to vision tasks test data

Framework and core calculator improvements

  • Log the Bazel build
  • Fix tests to work with arch haswell/sandybridge.
  • Update base audio/vision tasks API to support proto3 graph options.
  • Add an option to disable explicit CPU sync for ExternalTextureRenderer
  • Add support for int64 constant side packet value.
  • Deprecate GraphStatus()
  • Modify the TensorToImageFrameCalculator to support normalized outputs.
  • Add metadata for all PREFIX/image... prefixes.
  • Update Tensorflow dependency in MediaPipe
  • update face drawing function.
  • Speed up TimeSeriesFramerCalculator.
  • Add MatrixData as a packet option for ConstantSidePacketCalculatorOptions.
  • Fix timestamp computation when copying within first block.
  • Fix -Wsign-compare warning in api2/builder.h
  • Shows the recently added warning when WaitUntilIdle is called with source nodes only once. Otherwise, it is very spammy as it's shown every frame. Moreover, display the names of the sources, so the warning is more actionable.
  • Exposes OpenCV photo lib.
  • Add keys for the context that better match the featurelist for text.
  • Do not send PreviousLoopback output packets to closed streams
  • Add gpu to cpu fallback for tensors_to_detections_calculator.
  • Revert "Add location info in registry (debug mode only)"
  • Fix bounds calculation in RefineLandmarksFromHeatMapCalculator
  • Add concatenate image vector calculator

MediaPipe Tasks update

This section should highlight the changes that are done specifically for any platform and don't propagate to other platforms.

Android

  • Add delegate options to base options for java API and add unit tests for BaseOptions.

iOS

  • Added iOS Gesture Recognizer Protobuf utils
  • Added iOS Gesture Recognizer ObjC Test for simple recognition
  • Added more recognize tests to iOS Gesture Recognizer Objective C tests
  • Added convenience method for creating results for tests in MPPGestureRecognizerResult Helpers
  • Added MPPHandLandmarkerResult Helpers
  • Added MPPConnection
  • Added MPPHandLandmarker
  • Added MPPHandLandmark
  • Renamed iOS gesture recognizer protobuf utils
  • Fixed import in iOS gesture recognizer test utils
  • Updated MPPHandLandmarker.h to return the hand connections via class methods
  • Add FaceLandmarker iOS API
  • Add FaceLandmarker iOS Live Stream API
  • Update iOS Gesture Recognizer error assertion
  • Updated method names in MPPGestureRecognizer
  • Rename MPPFaceLandmarker.m to MPPFaceLandmarker.mm
  • Added hand landmarker implementation file and hand landmarker connections
  • Updated constant names in MPPHandLandmarkConnections
  • Add FaceLandmarker constants for iOS
  • Added hand landmarker protobuf utils
  • Added iOS Objective C hand landmarker tests
  • Added iOS segmentation mask
  • Updated documentation of MPPMask
  • Added live stream mode tests for iOS Hand Landmarker
  • Updated protobuf helper method name in iOS Gesture Recognizer Helpers
  • Updated iOS hand landmarker tests
  • Updated hand connections in iOS hand landmarker to class properties.
  • Updated signature of initializer in MPPMask
  • Fixed float calculations in MPPMask
  • Removed generic methods for alloc and memcpy from MPPMask
  • Updated init method implementations in MPPMask
  • Fixed implementation of init methods in MPPMask
  • Added MPPMask Tests
  • Added iOS Image Segmenter Result
  • Added iOS Image Segmenter Options
  • Updated image segmenter delegate method to be required
  • Added copying of running mode in NSCopying implementation in iOS tasks
  • Added iOS Image Segmenter Options Helpers
  • Added Image Segmenter Result Helpers

Javascript

  • Add CommonJS bundle for MediaPipe Tasks
  • Use .mjs for ESM Modules and use .cjs for CommonJS
  • Add "exports" field definitions to package.json

Model Maker changes

  • Use GFile for internal file systems.
  • Model Maker core classifier: change the _metric_function field to _metric_functions in order to support multiple metrics.
  • Add a face alignment preprocessor to face stylizer.
  • Support ExBert training and option to select between AdamW and LAMB optimizers for BertClassifier
  • Add MobileNetV2_I320 and MobileNetMultiHWAVG_I384 to support larger input image sizes.

MediaPipe Dependencies

  • Update WASM files for 0.10.2 release

iOS, Python GPU, and Windows Python 3.11 support for MediaPipe Tasks

05 Jun 18:51

Major Features and Improvements

iOS

  • Published MediaPipeTasksText and MediaPipeTasksVision CocoaPods at version 0.10.1-alpha-2
  • Add FaceDetector iOS API
  • Add Gesture Recognizer iOS API

Web

  • Add iOS support for GPU processing for Segmentation Tasks
  • Add .close() method to ImageSegmenterResult/InteractiveSegmenterResult/PoseLandmarkerResult
  • Add quality scores to Segmenter tasks
  • Make FaceLandmarker result non-optional

Bug Fixes and Other Changes

Android

  • Remove unused MediaPipe Tasks Android sample

iOS

  • Added validation of C++ image classification result packet in MPPImageClassifierResult+Helpers.mm
  • Fixed deps in iOS task BUILD file
  • Reverted addition of flow limiter calculator in image classifier iOS
  • Added delegates in iOS gesture recognizer options
  • Added MPPGestureRecognizerOptionsHelpers, MPPGestureRecognizerResultHelpers, MPPGestureRecognizer header
  • Updated the vision task runner to split the method that creates normalized rect based on ROI
  • Added C++ utils for parsing protos from text files for iOS tests
  • Added hand landmarker result, hand landmarker options, hand landmarker options helpers
  • Add FaceLandmarkerOptions and FaceLandmarker Result API
  • Added utils of containers and core to MPPTaskCommon to avoid warnings in xcode
  • Updated error tests to use XCTAssertEqualObjects

Javascript

  • Update WASM files for 0.10.1 release


Framework and Core Calculator Improvements

Bug Fixes and Other Changes

  • Added clearing of all graph options protos in MPPGestureRecognizerOptions Helpers, support to set delegates in MPPBaseOptions
  • Added method to create unique dispatch queue names in MPPVisionTaskRunner
  • Updated MPPImageClassifier to use delegates instead of completion blocks for callback
  • Updated documentation
  • Updated time out for image classifier async tests
  • Updated time out for object detector
  • Added flow limiter calculator in MediaPipeTasksCommon
  • Added clearing of all graph options protos in MPPGestureRecognizerOptions Helpers
  • Update base_options.py
  • Update Dockerfile
  • Add some helpful error messages in case GL texture creation fails
  • Updated Image classifier result to return empty results if packet can't be validated
  • Updated MPPObjectDetectorResult Helpers to return empty result instead of nil
  • Add needed enum type for choose fuse pipeline
  • Added C++ utils for parsing protos from text files for iOS tests
  • Updated face detector to use new methods from vision task runner
  • Updated variable names in MPPHandLandmarkerOptionsHelpers
  • Add MultiLandmarksSmoothingCalculator
  • Add MP_DISABLE_GPU to .so target
  • Updated CVPixelBuffer to support pixel format type of 32RGBA
  • Added support to set delegates in MPPBaseOptions

MediaPipe Dependencies

  • Removed OpenCV dependency from MPPVisionTaskRunner
  • Update MediaPipe to RE2 release 2023-06-01

MediaPipe Solutions Release and API updates

11 May 15:15

Major Features and Improvements

  • Released MediaPipe Solutions APIs for Java, Python and Web that offer end-to-end solutions for on-device ML.

Bug Fixes and Other Changes

Bazel changes

  • Updated bazelrc with required config

Framework and core calculator improvements

  • Added Language Detector Python API and fixed a typo in Interactive Segmenter Options' docstring, Update CalculatorOptions to encourage proto3 options
  • Updated roi not allowed check in ios vision task runner
  • Updated normalized rect calculation for some angles in MPPVisionTaskRunner
  • Added shell script for building cocoapods archive
  • Added more pose landmarker tests and updated face landmarker tests to cover all the results
  • Added Language Detector Python API and fixed a typo in Interactive Segmenter Options' docstring
  • Add nullable annotation to AudioDataProducer#setAudioConsumer
  • Add a default_applicable_licenses to model_maker/python/vision/core
  • Added podspec for CommonObjects and Vision tasks
  • Add customizable face stylizer module in MediaPipe model maker
  • Add custom metadata for object detection model with out-of-graph nms
  • Update MPImage to use containers
  • Update the face stylizer config to match the latest encoder and detector config
  • Add nose in facemesh drawing
  • Added config for fat simulator builds
  • Added http_archive to download opencv sources
  • Added config settings to select building iOS xcframework from source for certain configs
  • Updated BUILD files to use the open sourced Language Detector model
  • Add the TFLite conversion API to BlazeFaceStylizer in model maker
  • Add the "FACE_ALIGNMENT" output stream to the face stylizer graph
  • Add an extra op to rescale face stylizer generation output from [-1, 1] to [0, 1]
  • Add TransformerParameters proto
  • Updated documentation of MPPObjectDetector
  • Added hash implementation for iOS normalized keypoint
  • Updated wait time for object detector tests
  • Updated pixel format types in object detector
  • Added flow limiter calculator and conditionally selected xcframework in iOS framework targets
  • Added conditional building of opencv xc framework to test targets
  • Update CalculatorOptions to encourage proto3 options
  • Add support for single-channel images to MPImage
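
The rescaling item above (face stylizer output from [-1, 1] to [0, 1]) amounts to the affine map (v + 1) / 2. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and this is not the op that was added to the graph, only an illustration of the mapping.

```python
def rescale_to_unit_range(values):
    """Map each value from the range [-1, 1] to the range [0, 1].

    Hypothetical helper for illustration: applies (v + 1) / 2 element-wise.
    """
    return [(v + 1.0) / 2.0 for v in values]


# -1 maps to 0, 0 maps to 0.5, and 1 maps to 1.
print(rescale_to_unit_range([-1.0, 0.0, 1.0]))
```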

MediaPipe solutions update

Android

  • Move Java Connections arrays to Task class

iOS

  • Updated roi not allowed check in iOS vision task runner
  • Removed roi apis from iOS object detector
  • Added iOS Object Detector Objective-C tests
  • Removed detect in image with region of interest api from iOS Object Detector
  • Updated iOS tests to reflect the new orientation calculation
  • Updated iOS Image Classifier to reflect new calculation for normalized rect
  • Updated build rules for iOS frameworks to duplicate symbols
  • Updated iOS cocoapods build script
  • Updated iOS framework names
  • Added build file for iOS opencv from sources
  • Updated iOS object detector to use delegates instead of callbacks for async calls
  • Added hash implementation for iOS normalized keypoint
  • Added flow limiter calculator and conditionally selected xcframework in iOS framework targets
  • Updated deps names in iOS test targets
  • Added iOS task text cocoapods podspec
  • Added targets for iOS text frameworks
  • Added method for creating unique dispatch queue names in MPPVisionTaskRunner

Javascript

  • Add scribble support to InteractiveSegmenter Web API
  • Update WASM files for Alpha 14
  • Add .close() method to ImageSegmenterResult/InteractiveSegmenterResult/PoseLandmarkerResult
  • Update FaceStylizer, ImageSegmenter, InteractiveSegmenter, PoseLandmarker to return MPImage

Python

  • Added the PoseLandmarker Python API and a simple test
  • Populate labels using model metadata for the ImageSegmenter Python API
  • Added the Face Aligner Python API
  • Expose PoseLandmarker as a public MediaPipe Tasks Python API
  • Expose FaceAligner and LanguageDetector to be public MediaPipe Tasks Python API
  • Add image_segmenter_metadata_schema and object_detector_metadata_schema python files to the mediapipe python wheels
  • Add HAND_CONNECTIONS to HandLandmarker and GestureRecognizer
MediaPipe Dependencies

  • Added version of dependency to podspec template
  • Updated common dependencies to link in helpers

Major upgrades to MediaPipe - v0.9.3.0

18 Apr 04:06

Bazel changes

  • Bazel version upgrade to v6.1.1
  • Update Halide build rules for MediaPipe to use Halide v15.0.1
  • Use "x86_32" instead of "i386" for Bazel CPU ID

Framework and core calculator improvements

  • Added MPPImageClassifierOptionsHelpers, TensorsToSegmentationCalculatorOptionsProto.java into tasks core's maven package, MPPObjectDetectorOptions, MPPObjectDetectorOptionsHelpers, MPPClassifierOptions, MPPGestureRecognizerOptions, MPPGestureRecognizerOptions.m, support for more standard scaling options in GlSurfaceViewRenderer
  • Updated cosine similarity utility
  • Added method to send packet map to C++ task runner
  • Added methods to MPPVisionTaskRunner
  • Added methods to MPPVisionPacketCreator
  • Updated build targets of vision packet creator and task runner
  • Added MPPImageClassifierResultHelpers, MPPImageClassifierOptionsHelpers
  • Added MPPImageClassifier
  • Updated method signature in MPPTaskRunner
  • Added Face Detector implementation and tests
  • Added the AudioRecord API
  • Update audio_record_test.py
  • Add FaceLandmarker C++ API
  • Updated models
  • Add the dataset module for face stylizer in model maker
  • Update Node version to 16.19.0
  • Add metadata writer for image segmentation
  • Add Interactive Segmenter MediaPipe Task
  • Add label_map filtering into filter_detection drishti calculator
  • Add the source code TensorsToSegmentationCalculatorOptionsProto.java into tasks core's maven package
  • Add ImageData output to GraphRunner
  • Added MPPImage Utils for tests
  • Added stream info for some modes in MPPImageClassifier
  • Added flow limiting for live stream mode in MPPImageClassifier
  • Add WebGLTexture output for ImageSegmenter
  • Add face_landmarker to vision types
  • Add a function to convert CoreAudio buffers into a MediaPipe time series matrix
  • Add the model configuration and training hyperparameters for BlazeFaceStylizer
  • Add landmarks smoothing filter when requested face num is 1
  • Added MPPDetection
  • Added MPPObjectDetectionResult
  • Added MPPObjectDetectorOptions
  • Added MPPObjectDetectorOptionsHelpers, MPPObjectDetectionResultHelpers, MPPDetectionHelpers
  • Added MPPObjectDetector
  • Add FrameBuffer view on ImageFrame
  • Add EDGETPU_NNAPI delegate option in MediaPipe tasks API
  • Added MPPLandmark
  • Added MPPLandmarkHelpers
  • Added MPPGestureRecognizerResult
  • Added MPPGestureRecognizerOptions, MPPClassifierOptions
  • Added EndLoopImageCalculator and FaceToRectCalculator
  • Updated FaceStylizer API to align with the new Base Vision Task API changes
  • Added some face landmarks constants
  • Added pose landmarker C++ API
  • Update TF version to 2023-04-12
  • Added CoreAudio and MediaToolbox to BUILD file
  • Update Flatbuffers to 23.1.21
  • Updated error with info about unsupported mirrored orientations in MPPVisionTaskRunner
  • Add VEC32F4 support to ImageFrame
  • Add shaders that support better landscape rendering with GlSurfaceViewRenderer
  • Update TensorsToFaceLandmarksGraph to support face mesh v2 model
  • Add support for more standard scaling options in GlSurfaceViewRenderer

MediaPipe solutions update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

  • Add FaceDetector, Pose Landmarker, FaceLandmarker and FaceStylizer Java API
  • Add getLabels to ImageSegmenter Java API
  • Fix the vision tasks aar build rule to solve the "cannot find symbol" error:
  • Add LabelMapProto.java source code to MediaPipe AAR
  • Add interactive segmenter java API
  • Add face landmarker and face geometry java lite proto source code into mediapipe tasks AAR
  • Switch to use the isPresent() API since the isEmpty() is only available since java 11: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/Optional.html#isEmpty()
  • Update java image segmenter to always output confidence masks and optionally output category mask
  • Adds a LanguageDetector Java API
  • Update Java interactive segmenter to output both confidence masks and category mask optionally

iOS

  • Updated method calls to process packet map in iOS text tasks
  • Solve iOS build error for gpu_buffer.cc
  • Fixed iOS running mode display strings
  • Linked in OpenCV iOS framework with vision tasks
  • Added flow limiter calculator to iOS vision tasks

Javascript

  • Add FaceLandmarker Web API
  • Add the FaceStylizer Web API
  • Add InteractiveSegmenter Web API

Python

  • Python support for M1
  • Added Interactive Segmenter Python API and some tests
  • Expose face detector, face landmarker, face stylizer and interactive segmenter as MediaPipe Tasks Python API
  • Enable TextClassifier and TextEmbedder on Windows Python
  • Gracefully fail resource path lookup for Python on Windows
  • Expose as mediapipe python API
  • Make AudioTools compile when built from python:framework_bindings

Bug fixes

  • Upgrades and fixes for image segmentation category mask on GPU

MediaPipe Dependencies

  • Added dependency for image format
  • Disable OpenCL dependency for OpenCV
  • Add missing dependency library targets to mediapipe_task_aar

Updates to MP Tasks

23 Mar 19:00

Bazel changes

  • Add @ to all references to files in WORKSPACE.bazel

Framework and core calculator improvements

  • Added MPPTextEmbedderOptions, MPPTextEmbedderOptionsHelpers, MPPImageClassifierOptions
  • Added volume_gain_db option into AudioToTensorCalculator
  • Added MPPEmbedding, MPPEmbeddingResult, MPPTextEmbedderResult
  • Added iOS text embedder result files
  • Update test to reflect the recommended graph construction style:
  • Add FrameBuffer format
  • Updated documentation of embedding containers
  • Add YuvImage as a GpuBuffer storage backend
  • Updated the types of float and quantized embeddings
  • Add Text Embedder tests for text with different themes
  • Added MPPEmbeddingHelpers, MPPEmbeddingResultHelpers, MPPTextEmbedderOptionsHelpers, MPPTextEmbedderResultHelpers, MPPTextEmbedder
  • Add "noasan" to MPPTextClassifierObjcTest
  • Added MPPCosineSimilarity and cosine similarity to MPPTextEmbedder
  • Added text embedder objective c tests
  • Add ViewProvider to YuvImage storage backend
  • Update MP Tasks to observe timestamp bounds
  • Updated swift name for ImageSource Type
  • Updated list of designated initializers
  • Update TensorFlow to latest
  • Add more filtering methods to detection filter calculator
  • Update WASM files for 0.1.0-alpha-4 release
  • Updated the Begin/EndLoopCalculator to be able to handle mediapipe::Tensor
  • Add location info in registry (debug mode only)
  • Added vision task runner
  • Added designated initializer in vision task runner
  • Updated MPPImageUtils with methods to create image frame
  • Updated MPPVisionTaskRunner
  • Add mediapipe tasks face blendshapes graph
  • Add "java_package" and "java_outer_classname" to ImageTransformationCalculatorOptions
  • Updated method name in MPPVisionPacketCreator
  • Update MediaPipe TFLite code to use generic "shim" symbols and headers
  • Update detection result to include optional keypoints
  • Update face detector graph for downstream face landmarks graph
  • Add Bitmap image capture capability to GlSurfaceViewRenderer
  • Update ImageSegmenter API for image/video mode to have both callback API and returned result API
  • Small fixes to TensorsToImageCalculator
  • Add optional face blendshapes to face landmarks detector graph
  • Add a CHECK for the cases when null service is accessed unconditionally
  • Add FaceLandmarkerResult for FaceLandmarker API
  • Add ViewProvider for ImageFrame in GpuBufferStorageYuvImage
  • Add GetInputImageTensorSpecs into BaseVisionTaskApi for tasks api users to get input image tensor specifications
  • Add custom metadata in metadata_schema
  • Add FaceDetectorResult
  • Add volume_gain_db option to TensorsToAudioCalculator
  • Add build system for Halide and expose FrameBufferUtils
  • Add requiredInputBufferSize as an input argument of createAudioRecord
  • Update ImageFrameToGpuBufferCalculator to use api2 and GpuBuffer conversions
  • Add Empty Packet support to GraphRunner
  • Add support for [xmin, ymin, xmax, ymax] style of bbox output
  • Add TensorsToFaceLandmarksGraph to support two types of face mesh models
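
For readers unfamiliar with the [xmin, ymin, xmax, ymax] bounding-box style mentioned above, the following Python sketch shows how a corner-style box relates to an origin-plus-size box. The function name is hypothetical, and the (x, y, width, height) target layout is an assumption chosen for illustration; the release note does not say which other layouts the calculator handles.

```python
def corners_to_xywh(box):
    """Convert (xmin, ymin, xmax, ymax) to (x, y, width, height).

    Hypothetical helper: the top-left corner is kept as the origin and the
    size is the difference between the two corners.
    """
    xmin, ymin, xmax, ymax = box
    if xmax < xmin or ymax < ymin:
        raise ValueError("max corner must not be smaller than min corner")
    return (xmin, ymin, xmax - xmin, ymax - ymin)


# A box spanning the lower-right quadrant region of a normalized image.
print(corners_to_xywh((0.25, 0.5, 0.75, 1.0)))
```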

MediaPipe solutions update

This section should highlight the changes that are done specifically for any platform and don't propagate to
other platforms.

Android

  • Remove usage of var for ImageSegmenter.java
  • When "--define=MEDIAPIPE_NO_JNI=1" used in compilation, no implementation in libandroid.so is used

iOS

  • Added iOS text embedder result files
  • Added iOS test for different themes in text embedder
  • Added iOS test for quantized embedding
  • Added a note about swift test coverage in iOS text embedder tests
  • Added MPPTaskImage for iOS vision tasks
  • Open visibility of iOS TextClassifier & TextEmbedder
  • Solve Linking error for Hello World iOS example
  • Added swift tests for text embedder

Javascript

  • Fix incorrect uint8 -> int8 conversion in JS cosine similarity
  • Add MediaPipe Image Segmenter task for Web
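
The release note on the uint8 -> int8 fix gives no details. The Python sketch below only illustrates why the distinction matters for cosine similarity, assuming quantized embeddings are signed 8-bit values that arrive as unsigned bytes. It is not the JavaScript fix itself, and both function names are hypothetical.

```python
import math


def uint8_to_int8(value):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value (-128..127)."""
    return value - 256 if value > 127 else value


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero-norm vector")
    return dot / (norm_u * norm_v)


# Assumed raw bytes of two quantized embeddings; 255 encodes -1 as int8.
raw_u = [127, 255]
raw_v = [127, 1]
signed_u = [uint8_to_int8(b) for b in raw_u]  # [127, -1]
signed_v = [uint8_to_int8(b) for b in raw_v]  # [127, 1]

# Treating the bytes as unsigned gives a noticeably different similarity.
print(cosine_similarity(raw_u, raw_v))
print(cosine_similarity(signed_u, signed_v))
```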

Python

  • Enable Python Audio Classifier & Embedder on Windows

Bug fixes

  • Bug fixes in MPPImage
  • SSD anchors calculator: add fixed anchors

MediaPipe Dependencies

  • Bump Halide version from 14.0.0 to 15.0.0 and add MacOS Halide dependency

February 1st, 2023

01 Feb 16:20

Build changes

  • Allow split_vector_calculator to be built with iOS and MEDIAPIPE_DISABLE_GPU
  • Update mediapipe_aar.bzl to put more mediapipe framework java proto classes into AARs.

Bazel changes

  • Update Bazel dependencies for Apple

Framework and core calculator improvements

  • Add HandLandmarkerGraph, which connects HandDetectorGraph and HandLandmarkerSubgraph with landmarks tracking.
  • Updated image classifier to use a region of interest parameter
  • Add support for input image rotation in ImageClassifier and ObjectDetector C++ API
  • Adding BypassCalculator for use with SwitchContainer.
  • Add MergeDetectionsToVectorCalculator, CombinedPredictionCalculator, EndLoopMatrixCalculator, ConcatenateClassificationListCalculator, RegexPreprocessingCalculator and BERTPreprocessorCalculator, TextToTensorCalculator and UniversalSentenceEncoderPreprocessorCalculator
  • Added the TextClassifier C++ API, the TextPreprocessingSubgraph.
  • Rename "Bound" struct to "Rect" and remove unused "Landmark" struct.
  • Add tensor_index and tensor_name fields to ClassificationList
  • Replace numpy.float with the builtin float type as numpy removes its own float type in v1.24.
  • Add BGR -> RGB color conversion to ColorConvertCalculator.
  • Add SQRT_HANN window type to both SpectrogramCalculator and InverseSpectrogramCalculator.
  • Allow conversion of GlTextureBuffer to CVPixelBufferRef. This means that, if an iOS application sends in a GlTextureBuffer but expects a CVPixelBufferRef as output, everything will work even if the graph just forwards the same input. Also, access by Metal calculators will also work transparently.
  • Allowing BypassCalculator to accept InputSidePackets.
  • Enable unsigned quantized inference using XNNPACK.
  • Adds a preprocessor for Universal Sentence Encoder models.

MediaPipe solutions update

Android

  • Enable creating MediaPipe Image c++ packet directly from an Android media image object when its format is RGBA_8888.
  • Add Java ImageEmbedder API and TextEmbedder API.
  • Fix aar breakage caused by missing "//mediapipe/tasks/java/com/google/mediapipe/tasks/components/containers:normalized_landmark".
  • Fix aar breakage caused by missing "//mediapipe/tasks/cc/vision/image_segmenter/proto:segmenter_options_java_proto_lite".

Web

  • Hand Landmarker Web API
  • Allow Web developers to opt into CPU or GPU processing
  • Add support for browsers without SIMD
  • Add pre-compiled WASM files to NPM packages

Bug fixes

  • Fix RGBA vs RGB selection when creating GLTexture.
  • Fix accidental suppressions of GLSL linker error reporting
  • Fix for CHECK failure due to pointer description sometimes being larger than allocated string space
  • ClassificationAggregationCalculator and EmbeddingAggregationCalculator now fill in the timestamp_ms field of the classification results in the stream mode.
  • Fix ObjectDetector C++ flow limiter and improve documentation.
  • Better handling of empty packets in vector calculators.

MediaPipe Dependencies

  • Bump up the dependency library pybind11's version to 2.10.1.

Model download from GCS

09 Sep 15:37

Build changes

  • We are no longer adding *.tflite model files and other large binaries to our GitHub repository. Instead, these models are downloaded from Google Cloud Storage. This should speed up your getting-started experience with MediaPipe (especially if you can work off a shallow clone of the repository) and allows us to expand our feature set without significantly increasing the size of the repository. Please update your Python binaries if they are fetching models from GitHub (see download_utils.py).
  • We have made the build targets //mediapipe/objc:mediapipe_framework_ios, //mediapipe/objc:mediapipe_input_sources_ios, //mediapipe/objc:mediapipe_layer_renderer publicly visible. These targets can now be used in external iOS applications.
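
For Python code that previously fetched models from GitHub, a download-if-missing pattern along the following lines may be a useful starting point. This is a minimal sketch, not the contents of download_utils.py: the base URL is a placeholder and both function names are hypothetical, so consult download_utils.py for the actual bucket location and helpers.

```python
import os
import urllib.request

# Placeholder base URL for illustration only; the real bucket location is
# defined in MediaPipe's download_utils.py.
EXAMPLE_BASE_URL = "https://storage.example.com/models/"


def model_url(model_name, base_url=EXAMPLE_BASE_URL):
    """Build the download URL for a model file name (hypothetical helper)."""
    return base_url.rstrip("/") + "/" + model_name


def ensure_model(model_name, cache_dir, base_url=EXAMPLE_BASE_URL):
    """Return a local path to the model, downloading it only if it is missing.

    model_name is expected to be a plain file name without subdirectories.
    """
    local_path = os.path.join(cache_dir, model_name)
    if not os.path.exists(local_path):
        os.makedirs(cache_dir, exist_ok=True)
        urllib.request.urlretrieve(model_url(model_name, base_url), local_path)
    return local_path
```

Caching the file locally means repeated runs do not hit the network once the model has been fetched.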

Windows build system improvements

29 Jun 18:49

Build changes

  • Fixed a duplicate symbol conflict in the Windows build