Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add stream support to IReader #3759

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

apreiml
Copy link

@apreiml apreiml commented Oct 5, 2023

When using backends like minio or s3 to store files, those files are usually retrieved into memory. This pull request tries to add stream support for IReaders. This will allow to load files in memory using a php://memory stream.

I've only started with changing the interface and adding stream support to csv. I'd like to know, if this is a desirable feature which has the chance of being merged, before I continue with changing the other readers.

This is:

  • a bugfix
  • a new feature
  • refactoring
  • additional unit tests

Checklist:

  • Changes are covered by unit tests
    • Changes are covered by existing unit tests
    • New unit tests have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change and a link to the pull request if applicable
  • Documentation is updated as necessary

Why this change is needed?

Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.

@oleibman
Copy link
Collaborator

This has arisen in the past (see issue #1931), but was rejected because it wasn't a good fit for Xlsx (or Ods) files. Which doesn't mean we can't think about it again. But ... your test case does something like:

        $stream = fopen('data://text/plain;base64,' . base64_encode($data), 'rb');
        self::assertNotFalse($stream);
        $reader = new Csv();
        $spreadsheet = $reader->load($stream);

But we already have a means to do this without any changes to PhpSpreadsheet:

        $stream = fopen('data://text/plain;base64,' . base64_encode($data), 'rb');
        self::assertNotFalse($stream);
        $reader = new Csv();
        $spreadsheet = $reader->loadSpreadsheetFromString(stream_get_contents($stream));

Do you feel that your change adds functionality or ease of use beyond that approach?

@apreiml
Copy link
Author

apreiml commented Oct 13, 2023

@oleibman my plan is trying to replace ZipArchive with ZipStream in the readers, which would support working with streams.

As for the Csv example: A stream may be something else than a string stream. It's just the most convenient one for testing stream support. I'd like to introduce stream support at the IReader interface level, so that it works for all readers with the same api. For CSV and XML it's easier to work from strings, so they basically get the support for little effort.

@oleibman
Copy link
Collaborator

It sounds like maybe you have an idea on how to solve the streaming problem for Xlsx. That would certainly be useful. Do you think you could attack that first, and leave xml/etc. till later (no need to undo what you've already done for csv)?

One important thing to remember - ZipStream3 supports Php8.1+ 64-bit, but we will support Php8.0 and 32-bit for some time to come, so whatever you come up with should support ZipStream2 if possible (might be willing to throw an exception if not possible, but would have to think about it).

Also, we no longer support Php7, so prefer to define parameter types in code rather than doc-blocks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants