Add a feature to throw an exception for column count mismatches during CSV parsing #326

Open
kamiazya opened this issue Aug 18, 2024 · 0 comments
Labels
enhancement New feature or request

Overview

Currently, web-csv-toolbox continues parsing a CSV file even when a row's column count does not match the header's. This behavior is flexible, but some applications need column mismatches detected and reported as exceptions to ensure data integrity.

This issue proposes extending the existing ParseError class so that an exception is thrown during CSV parsing when a data row's column count does not match the header row's. An option will also be added to let users suppress this exception, so the behavior can follow application requirements.

Specification

  1. Throwing an exception for column count mismatches:

    • During CSV parsing, if a data row has a different number of columns than the header row, a ParseError will be thrown.
    • The ParseError message will include details such as the row number where the mismatch occurred, the expected number of columns from the header, and the actual number of columns in the data row.
  2. Option to suppress the exception:

    • To accommodate applications that tolerate column mismatches, an option will be added to suppress the exception.
    • The exception will be thrown by default; users opt out by setting the option.
    • The option will be named allowColumnMismatch (tentative) and will default to false. If it is set to true, no ParseError is thrown and parsing continues despite column count mismatches.
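
The two points above could look roughly like the sketch below. It is not the library's implementation: the ParseError class here is a local stand-in, toRecords is a hypothetical helper that works on rows that have already been split into fields, and the message wording and row numbering (1-based, with the header as row 1) are assumptions. Only the option name allowColumnMismatch and its default of false come from this proposal.

```typescript
// Local stand-in for the library's ParseError; the real class may differ.
class ParseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ParseError";
  }
}

// Option name from this proposal (tentative); the options shape is assumed.
interface ColumnCheckOptions {
  allowColumnMismatch?: boolean;
}

// Hypothetical helper: turns already-split rows into records keyed by header name.
function toRecords(
  header: string[],
  rows: string[][],
  options: ColumnCheckOptions = {},
): Record<string, string | undefined>[] {
  const { allowColumnMismatch = false } = options;
  return rows.map((row, index) => {
    if (!allowColumnMismatch && row.length !== header.length) {
      // Assumption: rows are numbered from 1 and the header is row 1.
      const rowNumber = index + 2;
      throw new ParseError(
        `Column count mismatch at row ${rowNumber}: ` +
          `expected ${header.length}, got ${row.length}`,
      );
    }
    // Lenient path in this sketch: missing fields stay undefined, extra fields are ignored.
    const record: Record<string, string | undefined> = {};
    header.forEach((name, i) => {
      record[name] = row[i];
    });
    return record;
  });
}
```

With the default options, a short row raises the error; passing { allowColumnMismatch: true } lets the same input through. How the lenient path should treat missing or extra fields is left open by this issue, so the handling in the sketch is only one possible choice.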

Expected Use Cases

  • When data integrity is crucial:

    • Users can treat column mismatches as ParseError exceptions, so malformed rows are rejected instead of passing through silently.
  • When flexible CSV parsing is required:

    • If the application needs to tolerate column count discrepancies, the user can suppress the exception using the option, allowing for more flexible data handling.
@kamiazya kamiazya added the enhancement New feature or request label Aug 18, 2024