Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data validation not working when opening & saving a file #1432

Closed
tntsoft opened this issue Apr 3, 2020 · 22 comments
Closed

Data validation not working when opening & saving a file #1432

tntsoft opened this issue Apr 3, 2020 · 22 comments

Comments

@tntsoft
Copy link

tntsoft commented Apr 3, 2020

This is:

- [x ] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What am I trying to accomplish?

I want to open an XLSX file that has some List Dropdown validation rules in it, write some data in it and save it.

What is the expected behavior?

When opening and saving an Excel file, it should retain all the data validation inside it.

What is the current behavior?

It loses all data validation fields ( dropdowns, explanations etc)

What are the steps to reproduce?

Open the bellow excel file with data validation -> save it -> it loses validation.

Sample XLSX file to help reproduce

sample.xlsx

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\IOFactory;

// Declare the path of the file
$inputFileName = __DIR__ . '/sample.xlsx';

// Load the file
$spreadsheet = IOFactory::load($inputFileName);

// Get first sheet in preparation for changes
$sheetData = $spreadsheet->getSheet(0);

// Create an XLSX writer
$writer = IOFactory::createWriter($spreadsheet, "Xlsx");

// Save the file
$writer->save("05featuredemo.xlsx");

Which versions of PhpSpreadsheet and PHP are affected?

"phpoffice/phpspreadsheet": "^1.11"
PHP 7.3

@tntsoft
Copy link
Author

tntsoft commented Apr 27, 2020

No suggestions, anybody?

@zzhovo
Copy link

zzhovo commented Apr 30, 2020

1.12.0

still doesn't work

@PowerKiKi
Copy link
Member

Try to open-and-save without doing anything at all in between. Then unzip both xlsx files and compare the content of each files. Can you pinpoint what changed?

Second question, are you able to create a file from scratch, create the validation and save to for properly?

@zzhovo
Copy link

zzhovo commented May 31, 2020

Look at this line,

if ($xmlSheet && $xmlSheet->dataValidations && !$this->readDataOnly) {

it is expected from the $xmlSheet to have validations, but sometimes the validations are here $xmlSheet->extLst->ext->children('x14', TRUE)->dataValidations->dataValidation

so I've solved the problem by adding this code just before the line mentioned

// handle Microsoft extension if present
if (isset ($xmlSheet->extLst, $xmlSheet->extLst->ext, $xmlSheet->extLst->ext['uri'])
	&& $xmlSheet->extLst->ext['uri'] == "{CCE6A557-97BC-4b89-ADB6-D9C93CAAB3DF}" )
{
	if( ! $xmlSheet->dataValidations ){
		$xmlSheet->addChild('dataValidations');
	}
	// retreive MS extension data to create a node that matches expectations.
	foreach ($xmlSheet->extLst->ext->children('x14', TRUE)->dataValidations->dataValidation as $item)
	{
		$node = $xmlSheet->dataValidations->addChild('dataValidation');
		foreach ($item->attributes() as $attr)
			$node->addAttribute($attr->getName(), $attr);
		$node->addAttribute('sqref', $item->children('xm',TRUE)->sqref);
		$node->addChild('formula1', $item->formula1->children('xm',TRUE)->f);
	}
}

which checks for other possible validations to exist and if so, it creates the validations node and then clones the other validations into it.

I think this is just a workaround that works for my current project, so I'll not suggest this as a patch.

@tiggeymone
Copy link

tiggeymone commented Jun 6, 2020

Look at this line,

if ($xmlSheet && $xmlSheet->dataValidations && !$this->readDataOnly) {

it is expected from the $xmlSheet to have validations, but sometimes the validations are here $xmlSheet->extLst->ext->children('x14', TRUE)->dataValidations->dataValidation

so I've solved the problem by adding this code just before the line mentioned

// handle Microsoft extension if present
if (isset ($xmlSheet->extLst, $xmlSheet->extLst->ext, $xmlSheet->extLst->ext['uri'])
	&& $xmlSheet->extLst->ext['uri'] == "{CCE6A557-97BC-4b89-ADB6-D9C93CAAB3DF}" )
{
	if( ! $xmlSheet->dataValidations ){
		$xmlSheet->addChild('dataValidations');
	}
	// retreive MS extension data to create a node that matches expectations.
	foreach ($xmlSheet->extLst->ext->children('x14', TRUE)->dataValidations->dataValidation as $item)
	{
		$node = $xmlSheet->dataValidations->addChild('dataValidation');
		foreach ($item->attributes() as $attr)
			$node->addAttribute($attr->getName(), $attr);
		$node->addAttribute('sqref', $item->children('xm',TRUE)->sqref);
		$node->addChild('formula1', $item->formula1->children('xm',TRUE)->f);
	}
}

which checks for other possible validations to exist and if so, it creates the validations node and then clones the other validations into it.

I think this is just a workaround that works for my current project, so I'll not suggest this as a patch.

Thank you man! I put your changes in a "patches" folder, overriding through composer the default class autoload behavior for this specific class. It works like a charm. I also fixed the composer dependency to the current version, to avoid breaking changes. I hope it could be fixed in the near future.

@stale
Copy link

stale bot commented Aug 8, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If this is still an issue for you, please try to help by debugging it further and sharing your results.
Thank you for your contributions.

@stale stale bot added the stale label Aug 8, 2020
@stale stale bot closed this as completed Aug 16, 2020
@eugenio11
Copy link

Hello,
has this fix been implemented in the latest release of PhpSpreadsheet?
I've just installed it, I have Data validation list with values loaded from another sheet and those lists are lost after opening with PhpSpreadsheet.

Thanks.

@tntsoft
Copy link
Author

tntsoft commented Apr 14, 2021

Thank you @eugenio11!

@eugenio11
Copy link

More precisely, what happens in my case is:

  • data validation lists with values coming from another sheet are not preserved
  • "hardcoded" data validation lists are preserved but, for some reason, in the wrong cell (correct col, wrong row)

@otargetj2s
Copy link
Contributor

UP !

Hello,
This regression was reported in early 2018.
The correction code is provided.
In the end, this code is still not merged.
Can you do what's nessessary? I absolutely need it for my development.
Thanks.

See links :
#388
#991
#1432

@pich
Copy link

pich commented Aug 3, 2021

I'm also on the waitlist. Any update on that?

@tntsoft
Copy link
Author

tntsoft commented Aug 3, 2021

Let's get this fixed, please! It's been a while!

@oleibman
Copy link
Collaborator

oleibman commented Aug 4, 2021

I do not understand enough about this problem to work on it. But perhaps I can make a suggestion that will resonate with someone. The code to read the data validations from the alternate location in Xlsx Reader was merged some time ago, and I believe it is working correctly. I think the problem is happening in Xlsx writer. Xlsx Reader can now read from:

<extLst>
<ext uri=...>
<x14:dataValidations count=...>
<x14:dataValidation type="list" ...>
<x14:formula1><xm:f>Sheet2!$D$3:$D$4</xm:f></x14:formula1>
<xm:sqref>...</xm:sqref>
</x14:dataValidation>
</x14:dataValidations>
</ext>
</extlst>

That's how it reads the list from another sheet. But, from the same sheet, it reads:

<dataValidations count="1">
<dataValidation type="list" ... sqref="A2" xr:uid="{63893217-F310-4747-9BDC-1350626D1438}">
<formula1>$D$3:$D$4</formula1>
</dataValidation>
</dataValidations>

That's how it reads the list from the same sheet. (Just for clarity's sake, no extLst, no ext, no namespaces, except for xr:uid, which appears to not be required.) I don't know if the reader can handle both formats on the same worksheet.

At this point, I'll add that I haven't seen what happens when you have both types of list. I also don't know what happens when you combine the first format with other things that require extLst (e.g. conditional formatting). Nevertheless, the solution might be to have the writer generate the appropriate format for each data validation if you can figure out what the appropriate format should be in each case.

@oleibman
Copy link
Collaborator

oleibman commented Aug 8, 2021

I have tried changing Xlsx writer to use extLst in this case. Excel complains "Replaced Part: /xl/worksheets/sheet1.xml part with XML error. Load error. Line 2, column 0." Nevertheless, that part loads successfully as Xml, and looks pretty much identical to the corresponding part in the input dataset. No idea how to proceed without better diagnostics from Excel.

@oleibman
Copy link
Collaborator

oleibman commented Aug 9, 2021

Aha! I found a helpful comment in Writer\Xlsx\Worksheet.

        // ConditionalFormattingRuleExtensionList
        // (Must be inserted last. Not insert last, an Excel parse error will occur)

I am as baffled by this requirement as the original author for Conditional Formatting must have been. But ... when I use extLst to write data validations after everything else, Excel does read it correctly. There are still lots of edge cases to worry about.

  • Can you write "internal" data validations in extLst section?
  • If not, how do you distinguish internal from external? Easy if the sheet name is part of the formula or not present at all, not so easy if, e.g., a defined name is used for the list, or a more complicated formula.
  • Is there an ordering requirement for sheets that have data validations and conditional formatting? (And, apparently, data bars with conditional formatting are a separate case altogether.)

So, progress, but lots more work to go before slaying this one.

@oleibman oleibman reopened this Aug 9, 2021
@stale stale bot removed the stale label Aug 9, 2021
@oleibman
Copy link
Collaborator

oleibman commented Aug 9, 2021

It does look like "internal" data validations can be placed in extLst, so that makes things a little less complicated.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Aug 10, 2021
See issues PHPOffice#1432 and PHPOffice#2149. Data validations on an Xlsx worksheet can be specified in two manners - one (henceforth "internal") if a list is specified from the same sheet, and a different one (henceforth "external") if a list is specified from a different sheet. Xlsx worksheet reader formerly processed only the internal format; PR PHPOffice#2150 fixed this so that both would be processed correctly on read. However, Xlsx worksheet writer outputs data validators only in the internal format, and that does not work for external data validations; it appears, however, that internal data validations can be specified in external format.

This PR changes Xlsx worksheet writer to use only the external format. Somewhat surprisingly, this must come after most of the other XML tags that constitute a worksheet. It shares this characteristic (and XML tag) with conditional formatting. The new test case DataValidator2Test includes a worksheet which has both internal and external data validation, as well as conditional formatting.

There is some additional namespacing work supporting Data Validations that needs to happen on Xlsx reader. Since that is substantially unchanged with this PR, that work will happen in a future namespacing phase, probably phase 2. However, there are some non-namespace-related changes to Xlsx reader in this PR:
- Cell DataValidation adds support for a new property sqref, which is initialized through Xlsx reader using a setSqref method. If not initialized at write time, the code will work as it did before the introduction of this property. In particular, before this change, data validation applied to an entire column (as in the sample spreadsheet) would be applied only through the last populated row. In addition, this also allows a user to extend a Data Validation over a range of cells rather than just a single cell; the new method is added to the documentation.
- The topLeft property had formerly been used only for worksheets which use "freeze panes". However, as luck would have it, the sample dataset provided to demonstrate the Data Validations problem uses topLeft without freeze panes, slightly affecting the view when the spreadsheet is initially opened; PhpSpreadsheet will now do so as well.

It is worth noting issue PHPOffice#2262, which documents a problem with the hasValidValue method involving the calculation engine. That problem existed before this PR, and I do not yet have a handle on how it might be fixed.
oleibman added a commit that referenced this issue Aug 24, 2021
See issues #1432 and #2149. Data validations on an Xlsx worksheet can be specified in two manners - one (henceforth "internal") if a list is specified from the same sheet, and a different one (henceforth "external") if a list is specified from a different sheet. Xlsx worksheet reader formerly processed only the internal format; PR #2150 fixed this so that both would be processed correctly on read. However, Xlsx worksheet writer outputs data validators only in the internal format, and that does not work for external data validations; it appears, however, that internal data validations can be specified in external format.

This PR changes Xlsx worksheet writer to use only the external format. Somewhat surprisingly, this must come after most of the other XML tags that constitute a worksheet. It shares this characteristic (and XML tag) with conditional formatting. The new test case DataValidator2Test includes a worksheet which has both internal and external data validation, as well as conditional formatting.

There is some additional namespacing work supporting Data Validations that needs to happen on Xlsx reader. Since that is substantially unchanged with this PR, that work will happen in a future namespacing phase, probably phase 2. However, there are some non-namespace-related changes to Xlsx reader in this PR:
- Cell DataValidation adds support for a new property sqref, which is initialized through Xlsx reader using a setSqref method. If not initialized at write time, the code will work as it did before the introduction of this property. In particular, before this change, data validation applied to an entire column (as in the sample spreadsheet) would be applied only through the last populated row. In addition, this also allows a user to extend a Data Validation over a range of cells rather than just a single cell; the new method is added to the documentation.
- The topLeft property had formerly been used only for worksheets which use "freeze panes". However, as luck would have it, the sample dataset provided to demonstrate the Data Validations problem uses topLeft without freeze panes, slightly affecting the view when the spreadsheet is initially opened; PhpSpreadsheet will now do so as well.

It is worth noting issue #2262, which documents a problem with the hasValidValue method involving the calculation engine. That problem existed before this PR, and I do not yet have a handle on how it might be fixed.
@mupic
Copy link

mupic commented Oct 20, 2021

Version 1.18 - Doesn't work.
Version 1.17 - Works well.

@oleibman
Copy link
Collaborator

PR 2265 has been applied only in master, if you want to test with that. I am unsure of when the next official release will happen.

@PowerKiKi
Copy link
Member

Fixed in 1.19.0

@DeekshaPrabhub96
Copy link

DeekshaPrabhub96 commented Nov 3, 2021

Facing the Same issue with 1.19.

Steps to recreate,

Create a dummy excel with a data validation of type List,
Read this file using PHPSpreadsheet and save the file with a different name.
The output excel is corrupted and MSExcel tries to recover it and deletes the Sheet.

$spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($filePath);
$spreadsheet->getCalculationEngine()->suppressFormulaErrors=true;
$writer = new Xlsx($spreadsheet);
$writer->save('test.xlsm');

@mupic
Copy link

mupic commented Nov 3, 2021

@DeekshaPrabhub96 I got it broken in ms office 2007 version, have you tried opening the file in newer versions? For example, you can open it in Google Docs, it worked for me.

@DeekshaPrabhub96
Copy link

Yes Working Fine with Google Docs but same issue with latest version of MSExcel (16.54).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

10 participants