Conversation

@aszenz
Contributor

@aszenz aszenz commented Dec 22, 2025

| Q | A |
| --- | --- |
| Bug fix? | no |
| New feature? | yes |
| Docs? | no |
| Issues | Fix #1250 |
| License | MIT |

Not a good implementation, just exploring.

* @param non-empty-list<string> $separators
* @return iterable<string>
*/
private static function splitText(string $text, array $separators, int $chunkSize): iterable
Contributor Author

LangChain seems to do this with regex; maybe that's faster and more flexible.

Contributor

What about using UnicodeString?
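
For reference, the regex idea raised in this thread could be sketched roughly as below. This is only an illustration, not the PR's `splitText()` and not LangChain's code: the function name `splitRecursively` is hypothetical, and it relies on plain `preg_split` with the `u` modifier plus `mb_*` functions for multibyte safety instead of `UnicodeString`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the PR): splits on the first separator,
 * merges adjacent pieces while they fit, and recurses with the remaining
 * separators for any piece that is still too long.
 *
 * @param list<string> $separators ordered from coarsest to finest
 *
 * @return list<string>
 */
function splitRecursively(string $text, array $separators, int $chunkSize): array
{
    if (mb_strlen($text) <= $chunkSize) {
        return '' === $text ? [] : [$text];
    }

    if ([] === $separators) {
        // No separators left: hard-cut into fixed-size pieces.
        return mb_str_split($text, $chunkSize);
    }

    $separator = array_shift($separators);
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
    $pattern = '/'.preg_quote($separator, '/').'/u';
    $parts = preg_split($pattern, $text, -1, \PREG_SPLIT_NO_EMPTY) ?: [$text];

    $chunks = [];
    $current = '';
    foreach ($parts as $part) {
        $candidate = '' === $current ? $part : $current.$separator.$part;
        if (mb_strlen($candidate) <= $chunkSize) {
            // Still fits: keep growing the current chunk.
            $current = $candidate;
            continue;
        }
        if ('' !== $current) {
            $chunks[] = $current;
            $current = '';
        }
        if (mb_strlen($part) <= $chunkSize) {
            $current = $part;
        } else {
            // Piece is too long on its own: retry with finer separators.
            array_push($chunks, ...splitRecursively($part, $separators, $chunkSize));
        }
    }
    if ('' !== $current) {
        $chunks[] = $current;
    }

    return $chunks;
}
```

For example, `splitRecursively("aaa bbb\n\nccc ddd eee", ["\n\n", ' '], 7)` splits on the paragraph break first and only falls back to spaces for the second paragraph, which is longer than 7 characters.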

@aszenz aszenz changed the title feat: Add Recursive Character text splitter [Store] Adds Recursive Character text splitter Dec 22, 2025
@chr-hertel chr-hertel added the Store Issues & PRs about the AI Store component label Dec 22, 2025
@OskarStark OskarStark changed the title [Store] Adds Recursive Character text splitter [Store] Add RecursiveCharacterTextTransformer Dec 23, 2025
@chr-hertel chr-hertel added the Feature New feature label Dec 24, 2025
use Symfony\AI\Store\Document\TransformerInterface;
use Symfony\Component\Uid\Uuid;

class RecursiveCharacterTextTransformer implements TransformerInterface
Contributor

Please add an @author tag.

Comment on lines +32 to +35
public function transform(
iterable $documents,
array $options = [],
): iterable {
Contributor

Suggested change
- public function transform(
-     iterable $documents,
-     array $options = [],
- ): iterable {
+ public function transform(iterable $documents, array $options = []): iterable
+ {

Comment on lines +96 to +97


Contributor

Suggested change

{
#[Test]
#[DataProvider('provideDocumentsContents')]
public function it_works(array $inputDocumentsText, array $options, array $expectedSplittedTexts): void
Contributor

Suggested change
- public function it_works(array $inputDocumentsText, array $options, array $expectedSplittedTexts): void
+ public function ...(array $inputDocumentsText, array $options, array $expectedSplittedTexts): void

Please use camelCase and be more descriptive

@OskarStark
Contributor

Do you want to finish this PR?

@aszenz
Contributor Author

aszenz commented Jan 7, 2026

> Do you want to finish this PR?

I became a bit busy this month, so if anyone wants to pick this up, feel free to do so.

Labels

Feature (New feature), Status: Needs Work, Status: Waiting feedback, Store (Issues & PRs about the AI Store component)

Development

Successfully merging this pull request may close these issues.

[Store] Implement Recursive Character Text Transformer

4 participants