Releases: CodeWithKyrian/transformers-php
TransformersPHP v0.5.3
This release brings new features, critical bug fixes, and improvements to enhance the functionality and performance of the package. Below is a summary of the changes.
What's New
- feat: Add support for PostProcessor Sequence by @CodeWithKyrian in 2cf18cc
- feat: New tensor method `random` by @CodeWithKyrian in 0564dd1
- feat: Correctly parse and use the Precompiled Normalizer by @CodeWithKyrian in 7e880dd
- feat: Update conversion notebook to include task by @CodeWithKyrian in ca6fc3b
Bug Fixes
- fix: Regex bug in Precompiled Normalizer by @CodeWithKyrian in 69089b1
- fix: Improve Unigram Tokenizer handling of multibyte strings by @CodeWithKyrian in e8a8a9a
- fix: Correct WhitespaceSplit Pretokenizer handling of invisible space characters by @CodeWithKyrian in 6ec3e3e
- fix: Correctly handle multibyte strings in Precompiled Normalizer by @CodeWithKyrian in bee47e0
- fix: Fuse function not combining unknown token IDs correctly by @CodeWithKyrian in 3008013
- fix: Tensor topK error when -1 is passed by @CodeWithKyrian in 0564dd1
Improvements
- fix: Precompiled Normalizer improvements by @CodeWithKyrian in b012400
I encourage everyone to update to this latest version and explore the new features. As always, feel free to report any issues or contribute to the project.
Full Changelog: 0.5.2...0.5.3
TransformersPHP v0.5.2
What's Changed
- Use a static list for byte-unicode and unicode-byte conversion by @CodeWithKyrian in 75f5d9c
- Fix Vips RGBA -> RGBA conversion error by @CodeWithKyrian in 54bdee0
- Show output progress when downloading the shared libraries by @CodeWithKyrian in d613a0d
- Obviate the need for autoload.php in establishing library base path by @timwhitlock in #65
New Contributors
- @timwhitlock made their first contribution in #65
Full Changelog: 0.5.1...0.5.2
TransformersPHP v0.5.1
What's new
- Tensor Operations: `magnitude`, `sqrt` and `cosSimilarity` added.
- Vips Binaries: Vips binaries are now bundled by default, eliminating the need to modify anything on the system to use libvips.
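For readers unfamiliar with the three tensor operations named above, they are related: cosine similarity is the dot product of two vectors divided by the product of their magnitudes, and a magnitude is the square root of a sum of squares. The following is a minimal sketch of that relationship on plain PHP arrays. The helper names `magnitude` and `cosineSimilarity` are illustrative assumptions and are not the package's Tensor API, whose exact signatures are not shown in these notes.

```php
<?php

declare(strict_types=1);

/**
 * Euclidean (L2) magnitude: square root of the sum of squared elements.
 * Illustrative helper on plain arrays, not the package's Tensor method.
 */
function magnitude(array $v): float
{
    $sumOfSquares = 0.0;
    foreach ($v as $x) {
        $sumOfSquares += $x * $x;
    }

    return sqrt($sumOfSquares);
}

/**
 * Cosine similarity: dot(a, b) / (|a| * |b|).
 * Hypothetical helper; throws on mismatched lengths or zero vectors.
 */
function cosineSimilarity(array $a, array $b): float
{
    // Re-index so both arrays can be walked by position.
    $a = array_values($a);
    $b = array_values($b);

    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same length.');
    }

    $dot = 0.0;
    foreach ($a as $i => $x) {
        $dot += $x * $b[$i];
    }

    $denominator = magnitude($a) * magnitude($b);
    if ($denominator == 0.0) {
        throw new InvalidArgumentException('Cosine similarity is undefined for a zero vector.');
    }

    return $dot / $denominator;
}
```

A typical use is comparing two embedding vectors: values near 1 indicate vectors pointing in almost the same direction, and values near 0 indicate nearly orthogonal vectors.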
Bug Fixes
- Error Handling: Adjusted the error level to a warning for unknown model types, providing clearer feedback without interrupting the workflow.
Reversions
- Dependencies: Reverted `rokka/vips` from dev back to normal dependencies. Since vips binaries are bundled by default, use of vips is now encouraged.
Full Changelog: 0.5.0...0.5.1
TransformersPHP v0.5.0
I'm excited to announce the latest version of TransformersPHP, packed with new features, improvements, and bug fixes. This release brings powerful enhancements to your machine-learning-driven PHP applications, enabling more efficient and versatile operations.
New Features
- New Pipeline: Audio Classification - Easily classify audio clips with a pre-trained model.

    $classifier = pipeline('audio-classification', 'Xenova/ast-finetuned-audioset-10-10-0.4593');
    $audioUrl = __DIR__ . '/../sounds/cat_meow.wav';
    $output = $classifier($audioUrl);
    // [
    //     [
    //         "label" => "Meow"
    //         "score" => 0.6109990477562
    //     ]
    // ]

- New Pipeline: Automatic Speech Recognition (ASR) - Supports models like `wav2vec` and `whisper` for transcribing speech to text. If a specific model is not officially supported, please open an issue with a feature request. Example:

    $transcriber = pipeline('asr', 'Xenova/whisper-tiny.en');
    $audioUrl = __DIR__ . '/../sounds/preamble.wav';
    $output = $transcriber($audioUrl, maxNewTokens: 256);
    // [
    //     "text" => "We, the people of the United States, ..."
    // ]
Enhancements
- Shared Libraries Dependencies - A revamped workflow for downloading shared library dependencies ensures they are versioned correctly, reducing download sizes. These binaries are now thoroughly tested on Apple Silicon, Intel Macs, Linux x86_64, Linux aarch64, and Windows platforms.
- `Transformers::setup` Simplified - `Transformers::setup()` is now optional. Default settings are automatically applied if it is not called. The `apply()` method is no longer necessary, but is still available for backward compatibility.
- Immutable Image Utility - The Image utility class is now immutable. Each operation returns a new instance, allowing for method chaining and a more predictable workflow.

    $image = Image::read($url);
    $resizedImage = $image->resize(100, 100); // $image remains unchanged

- New Tensor Operations - New operations were added: `copyTo`, `log`, `exp`, `pow`, `sum`, `reciprocal`, `stdMean`. Additionally, overall performance improvements have been made to Tensor operations.
- TextStreamer Improvements - TextStreamer now prints to stdout by default. You can override this behavior using the `onStream(callable $callback)` method. Consequently, the `StdoutStreamer` class is now obsolete.
- VIPS PHP Driver Update - The VIPS PHP driver is no longer bundled by default in `composer.json`. Detailed documentation is provided for installing the Vips PHP driver and setting up Vips on your machine.
- ONNX Runtime Upgrade - Upgraded to version 1.19.0, bringing better performance and compatibility with newer models.
- Bug Fixes & Performance Improvements - Various bug fixes have been implemented to enhance stability and performance across the package.
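The immutable Image behavior listed above (each operation returns a new instance and leaves the original unchanged) follows a common value-object pattern. The sketch below illustrates that pattern only. The `Picture` class and its methods are hypothetical stand-ins written for this example and do not reflect how the package's Image class is implemented.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical immutable value object, used only to illustrate the pattern.
 * State is private and every "modifying" method works on a clone.
 */
final class Picture
{
    public function __construct(
        private int $width,
        private int $height,
        private int $channels = 3,
    ) {
    }

    /** Returns a resized copy; the receiver is left untouched. */
    public function resize(int $width, int $height): self
    {
        $copy = clone $this;
        $copy->width = $width;
        $copy->height = $height;

        return $copy;
    }

    /** Returns a single-channel copy; the receiver is left untouched. */
    public function grayscale(): self
    {
        $copy = clone $this;
        $copy->channels = 1;

        return $copy;
    }

    /** Shape as [width, height, channels]. */
    public function shape(): array
    {
        return [$this->width, $this->height, $this->channels];
    }
}

$original = new Picture(640, 480);

// Because each call returns a new instance, calls can be chained.
$thumbnail = $original->resize(100, 100)->grayscale();
```

Since no method mutates its receiver, `$original` can be shared freely between code paths without one path's resize affecting another.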
I hope you enjoy these updates and improvements. If you encounter any issues or have any suggestions, please don’t hesitate to reach out through our Issue Tracker.
Full Changelog: 0.4.4...0.5.0
v0.4.4
v0.4.3
What's Changed
- Fix typo in docs by @BlackyDrum in #42
- fix: statically calling FFI::new deprecated in PHP 8.3 by @CodeWithKyrian in #48
- fix: improve regex for detecting language codes in NllbTokenizer by @CodeWithKyrian in #49
- fix: digits pre-tokenizer returning empty array for text with no digits by @CodeWithKyrian in #51
- feat: allow passing model filename when downloading a model from CLI
- fix: preTokenizer null error when there's no text pair (901a049)
- feat: implement enforce size divisibility for image feature extractor by @CodeWithKyrian in #53
New Contributors
- @BlackyDrum made their first contribution in #42
Full Changelog: 0.4.2...0.4.3
v0.4.2
What's Changed
- bugfix: Repository url resolution not working properly in Windows by @CodeWithKyrian in #41
Full Changelog: 0.4.1...0.4.2
v0.4.1
What's Changed
- configuration.md: fix indentation of Transformers::setup() by @k00ni in #35
- PretrainedTokenizer::truncateHelper: prevent array_slice() error for flawed text input (summarization) by @k00ni in #36
- Fix bug with Download CLI - use named parameters for model construct by @CodeWithKyrian in #39
New Contributors
Full Changelog: 0.4.0...0.4.1
v0.4.0
This release marks a significant milestone in enhancing the performance and functionality of the Tensor class while introducing convenient tools to streamline the installation of essential dependencies. These improvements not only optimize existing operations but also pave the way for future enhancements and expanded capabilities within the project.
What's Changed
- New Inference Session: The InferenceSession has been overhauled to receive Tensor inputs directly, which simplifies memory copying and makes it easier to convert Tensor objects to ONNX Tensors.
- Overhaul Tensor Buffer Implementation: The Tensor class has been revamped to use the OpenBLAS and Rindow Matlib C shared libraries, bringing massive performance improvements to Tensor operations.
- PHP Buffer Fallback: When the C-based buffer fails for some reason, a working PHP buffer is used as a fallback. It is slower, but it prevents errors.
- OpenMP Integration: Tensor operations can be optimized further by using OpenMP's parallel execution, with a fallback to the non-OpenMP alternatives when OpenMP isn't installed.
- New Tensor Methods: Several new methods, including `topk`, `divide`, and `slice`, have been added to the Tensor class, along with corresponding changes to existing implementations to leverage these methods.
- Refactor Stack Method: The `stack` method in the Tensor class has been refactored for enhanced performance.
- Move Thumbnail Method: The thumbnail method has been relocated from the feature extractor to the Image class for improved organization.
- Code Cleanup and Style Review: The codebase has undergone cleanup and style review to ensure consistency and readability.
- Optimize Image <-> Tensor Conversion: Conversion between Image and Tensor objects, in both directions, has been optimized for speed, improving overall performance for image-related tasks.
- Image Driver Configuration: While the Image driver can still be set in the `Transformers` class, it can now also be set directly on the `Image` class, allowing it to be used independently.
- Introduce Libraries Loader: A new library loader package has been introduced to automate the downloading of required shared libraries, such as `onnxruntime`, `openblas`, and `rindow-matlib`, during the Composer install process.
- TinyLlama Support: Add support for the TinyLlama model by @CodeWithKyrian
- Install Command Restored: The `install` command is back as an alternative way of getting the shared libraries if the download fails for any reason during composer install.
New Contributors
- @das-peter made their first contribution in #30
Full Changelog: 0.3.1...0.4.0
v0.3.1
What's Changed
- Add Qwen2 model support by @CodeWithKyrian in #20
- Add chat input detection for text generation, and refactor streamer API. by @CodeWithKyrian in #21
- bugfix: Fix error that occurs when streamer is not used by @CodeWithKyrian in #22
- bugfix: Decoder sequence not calling the right method by @CodeWithKyrian in #23
Full Changelog: 0.3.0...0.3.1