fix: Allow to read data from a table located in a different sheet #3659
When I use your code to load the example spreadsheet you provided and save it as another spreadsheet, the result is not correct. Excel shows the formula in sheet Feul1 cell A1 as While running that test, I also got a deprecation warning on line 344 of StructuredReference - columnName is apparently null (not null-string) when the statement is executed. I have no idea whether this is related to the previous problem. One of several ways to avoid the Phpstan complaint is to use Finally, please add a test case to show that your change is working as desired.
Indeed, when saving the file with the library, an "@" sign is added and causes some trouble. But I don't think it is related to my proposed fix, so I'll add a test case for the fix and let you decide whether you want to merge it.
Implicit intersection - see https://support.microsoft.com/en-gb/office/implicit-intersection-operator-ce3be07b-0101-4450-a24e-c1c999be2b34 for an explanation of the at sign. I'm not sure exactly what is in the original spreadsheet, but is missing or different in the copy, that allows Excel to read it without adding in the at sign. I do know one difference for sure - in the original, the formula for cell A1 specifies Please go ahead and add the test case. I am not yet sure whether we should proceed with this change in light of the implicit intersection problem. If you think that the result is still reasonably usable for you, that is an argument for proceeding, but please do try to confirm that you are really okay with it.
I think I've figured out what I want in the test case. In your sample spreadsheet in sheet Feul1 cell A1, use the formula
Test added, calculation corrected.
Awesome! Sorry for the delay; I was facing another issue with my proposal and tried to fix it before writing the test and getting back to you. I see that along with the test you have made some more fixes, and they seem to be what I was looking for, as my issue now appears to be completely fixed. Thank you very much.
Another issue I found with some files I'm trying to import is on this line Do you want me to open a new issue for that? I'm not sure whether it is related to our fix here. My quick workaround is to add a __toString() method on StructuredReference, but I feel like it is not the right solution. What do you think?
Calculate correct result, at least for calculations with a single answer. For an array of answers, result will be similar to other functions which return an array (dynamic arrays not yet fully supported).
Without more context, I have no idea whether your toString problem has anything to do with this fix. Feel free to open a new issue. I too have something that I will need to report as an issue once this one has been merged.
Test file renamed. I think we can merge this one if that's OK with you. I'll open a new issue for the toString problem, with a reproducer if I can come up with one. Thanks for the help.
Thank you for doing the rename. I think I have figured out my issue, which I feel is closely enough related to the original that I will fix it with this PR. So expect at least one more change before merging. In case you were wondering about my issue, the test case currently tests for
See Issue3569Test, which behaves like Excel.
Scrutinizer is correct here.
Worksheet points to Table, Table points back to Worksheet. Circular reference prevents garbage collection. Remove Table collection in Worksheet at destruct time to avoid this problem.
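The circular reference described above can be illustrated with a minimal sketch. `SketchWorksheet`, `SketchTable`, and `disconnectTables()` are hypothetical stand-ins invented for this example, not the library's classes or its actual fix. The sketch assumes PHP 7.4 or later for `WeakReference`. It only shows that reference counting alone cannot free the pair until the back-references are cleared; PHP's cycle collector can still reclaim such cycles later when it runs.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins, not the library's Worksheet and Table classes.
final class SketchTable
{
    /** @var SketchWorksheet|null back-reference to the owning sheet */
    public $worksheet;
}

final class SketchWorksheet
{
    /** @var SketchTable[] */
    private $tables = [];

    public function addTable(SketchTable $table): void
    {
        $table->worksheet = $this; // table points to sheet
        $this->tables[] = $table;  // sheet points to table: the cycle is closed
    }

    /** Clear both directions of the cycle so refcounting can free the objects. */
    public function disconnectTables(): void
    {
        foreach ($this->tables as $table) {
            $table->worksheet = null;
        }
        $this->tables = [];
    }
}

/** Returns true if the sheet object is still alive after its variable is unset. */
function sheetSurvivesUnset(bool $breakCycleFirst): bool
{
    $sheet = new SketchWorksheet();
    $sheet->addTable(new SketchTable());
    $probe = WeakReference::create($sheet); // does not add to the refcount

    if ($breakCycleFirst) {
        $sheet->disconnectTables();
    }
    unset($sheet);

    return $probe->get() !== null;
}

$leaked = sheetSurvivesUnset(false); // cycle intact: object lingers
$freed = !sheetSurvivesUnset(true);  // cycle broken: object freed at once
gc_collect_cycles();                 // reclaim the lingering cycle from the first call
```

Clearing the collection explicitly keeps memory use predictable for long-running imports instead of relying on when the cycle collector happens to run.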
Avoiding memory leak triggered segfault in Phpunit 10. Research and fix later.
Absence of such a method seems to cause problems for untaken IF branches in Calculation.
The toString fix is now part of this PR.
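As a rough illustration of why a missing `__toString()` matters, the sketch below uses two hypothetical classes (`OpaqueOperand` and `PrintableOperand`, neither of which exists in the library) and assumes PHP 7.4 or later, where converting an object without `__toString()` to a string throws an `Error`. It does not reproduce the Calculation engine's IF handling; it only shows the language behaviour that such a method avoids.

```php
<?php
declare(strict_types=1);

// Hypothetical operand type with no string conversion.
final class OpaqueOperand
{
    /** @var string */
    private $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }
}

// Hypothetical operand type that can be used wherever a string is expected.
final class PrintableOperand
{
    /** @var string */
    private $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function __toString(): string
    {
        return $this->text;
    }
}

/** Concatenate an operand into a string, reporting the failure instead of crashing. */
function renderOperand(object $operand): string
{
    try {
        return 'operand=' . $operand;
    } catch (Error $e) {
        return 'not convertible: ' . get_class($operand);
    }
}

echo renderOperand(new PrintableOperand('Table1[Col]')), PHP_EOL;
echo renderOperand(new OpaqueOperand('Table1[Col]')), PHP_EOL;
```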
Correctly detected by Scrutinizer.
Fix PHPOffice#3679. That issue was opened for a problem which was already solved by PR PHPOffice#3659; however, there were additional problems with the spreadsheet.
The main problem is that Data Validations and Conditional Styles can each be supplied in the Xml in either "external" or "internal" formats. The code handling the "external" format for each assumes that it is the only "external" item on the worksheet in the Xml, but some of the worksheets in the sample spreadsheet provide both as "external". The code to fix this is verified against the supplied sample; however, no formal test has been added for it. The sample is much too large and complicated to be added to the test suite - it takes several minutes to read, and even longer to write (`setPreCalculateFormulas(false)` is highly recommended). I will leave this ticket open for a few days to see if I can hand-craft a suitable test case, but I am not hopeful.
A second problem is that something in the Xlsx Reader `$xmlSheetNS->sheetData->row` loop breaks the selected cell for the worksheet. This is easily fixed and verified by eye (and with the supplied sample), but, again, no explicit test case is added.
A third problem is that drawings which are part of the supplied sample use `srcRect` tags in the Xml to effectively produce a cropped version of the image. This tag has hitherto been ignored. It is now supported in Xlsx Reader, Xlsx Writer, and the Worksheet/BaseDrawing object. This is again verified with the supplied sample; unlike the other parts, it was easy to add a new formal test case for this part of the fix.
This has come up a number of times, most recently with issue PHPOffice#3901, and also issue PHPOffice#3659. It will certainly come up more often in days to come. Excel is changing formulas which PhpSpreadsheet has output as `=UNIQUE(A1:A19)`; Excel is processing the formula as if it were `=@UNIQUE(A1:A19)`. This behavior is explained, in part, by PHPOffice#3659 (comment). Excel does so in order to ensure that the function returns only a single value rather than an array of values, in case the spreadsheet is being processed (or possibly was created) by a less current version of Excel which cannot handle the array result.
PhpSpreadsheet follows Excel to a certain extent; it defaults to returning a single calculated value when an array would be returned. Further, its support for outputting an array even when that default is overridden is incomplete. I am not prepared to do everything that Excel does for the array functions (details below), but this PR is a start in that direction. The default can be changed via:
```php
use PhpOffice\PhpSpreadsheet\Calculation\Calculation;
Calculation::setArrayReturnType(Calculation::RETURN_ARRAY_AS_ARRAY);
```
When that is done, `getCalculatedValue` will return an array (no code change necessary). However, Writer/Xlsx will now be updated to look at that value, and if an array is returned in that circumstance, will indicate in the Xml that the result is an array *and* will include a reference to the bounds of the array. This gets us close, although not completely there, to what Excel does, and may be good enough for now. Excel will still mess with the formula, but now it will treat it as `{=UNIQUE(A1:A19)}`. This means that the spreadsheet will now look correct; there will be superficial differences, but all cells will have the expected value.
Technically, the major difference between what PhpSpreadsheet will output now, and what Excel does on its own, is that Excel supplies values in the xml for all the cells in the range. That would be difficult for PhpSpreadsheet to do; that could be a project for another day.
Excel will treat the output from PhpSpreadsheet as "Array Formulas" (a.k.a. CSE (control shift enter) formulas, because you need to use that combination of keys to manually enter them in older versions of Excel). Current versions of Excel will instead use "Dynamic Array Formulas". Dynamic Array Formulas can be changed by the user; Array Formulas need to be deleted and re-entered if you want to change them. I don't know what else might have to change to get Excel to use the latter for PhpSpreadsheet formulas, and I will probably not even try to look now, saving it for a future date.
Unit testing of this change uncovered a bug in Calculation::calculateCellValue. That routine saves off ArrayReturnType, and may change it, and is supposed to restore it. But it does not do the restore if the calculation throws an exception. It is changed to do so.
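The save-and-restore bug in the last paragraph is the kind of problem a `try`/`finally` block addresses. The following is a minimal sketch of that pattern only: `SketchCalculation` and its `calculate()` method are hypothetical and do not reproduce the real `Calculation::calculateCellValue` code, and the source does not say which construct the actual fix uses.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the calculation engine; only the restore pattern matters.
final class SketchCalculation
{
    public const RETURN_ARRAY_AS_VALUE = 'value';
    public const RETURN_ARRAY_AS_ARRAY = 'array';

    /** @var string */
    private static $returnType = self::RETURN_ARRAY_AS_VALUE;

    public static function getArrayReturnType(): string
    {
        return self::$returnType;
    }

    /**
     * Temporarily switch the return type, run the formula, and always restore
     * the saved setting, even when the formula throws.
     *
     * @return mixed
     */
    public static function calculate(callable $formula)
    {
        $saved = self::$returnType;
        self::$returnType = self::RETURN_ARRAY_AS_ARRAY;
        try {
            return $formula();
        } finally {
            self::$returnType = $saved; // runs on normal return and on exception
        }
    }
}

try {
    SketchCalculation::calculate(static function () {
        throw new RuntimeException('formula error');
    });
} catch (RuntimeException $e) {
    // The failure still propagates to the caller; only the setting is restored.
}

echo SketchCalculation::getArrayReturnType(), PHP_EOL;
```

Without the `finally` clause, the exception would skip the restore and leave every later calculation running with the temporary setting.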
Why this change is needed?
When a cell references a "Table" (structured reference) that is located in a different sheet than the cell itself, the library currently throws the error "{SheetName}!{cellName} -> Table {tableName} for Structured Reference cannot be located".
Here is a simple example Excel file that triggers this error:
bug_with_table_in_other_sheet.xlsx
This fix keeps the same behaviour, but also tries to find the table in all other sheets before throwing the error.
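The lookup order described above could be sketched roughly as follows. This is not the code in the PR: `SketchSheet`, `locateTable()`, the sheet title `Feul2`, and the table name `Tableau1` are all hypothetical, and matching names case-insensitively is an assumption made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a worksheet: a title plus its tables (name => cell range).
final class SketchSheet
{
    /** @var string */
    public $title;

    /** @var array<string, string> */
    public $tables;

    public function __construct(string $title, array $tables)
    {
        $this->title = $title;
        $this->tables = $tables;
    }
}

/**
 * Look in the formula's own sheet first, then in every sheet of the workbook,
 * and only fail once all of them have been searched.
 *
 * @param SketchSheet[] $allSheets
 *
 * @return array{0: string, 1: string} sheet title and table range
 */
function locateTable(string $tableName, SketchSheet $current, array $allSheets): array
{
    foreach (array_merge([$current], $allSheets) as $sheet) {
        foreach ($sheet->tables as $name => $range) {
            // Case-insensitive comparison is an assumption of this sketch.
            if (strcasecmp($name, $tableName) === 0) {
                return [$sheet->title, $range];
            }
        }
    }

    throw new RuntimeException("Table {$tableName} for Structured Reference cannot be located");
}

$feul1 = new SketchSheet('Feul1', []);
$feul2 = new SketchSheet('Feul2', ['Tableau1' => 'A1:B3']);

[$sheetTitle, $range] = locateTable('Tableau1', $feul1, [$feul1, $feul2]);
echo $sheetTitle . '!' . $range . PHP_EOL;
```

Searching the current sheet first preserves the existing behaviour for tables on the same sheet, which is the common case.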
Please tell me if you think the fix is needed or if there is a better solution.
Fix #3635.
Thanks