Move newlines_in_values from FileScanConfig to CsvSource
This PR moves the CSV-specific `newlines_in_values` configuration
option from `FileScanConfig` (a shared format-agnostic configuration)
to `CsvSource` where it belongs.
Changes:
- Add `newlines_in_values` field and methods to `CsvSource`
- Add `has_newlines_in_values()` method to `FileSource` trait
- Update `FileSource::repartitioned()` to use the trait method
- Remove `new_lines_in_values` from `FileScanConfig` and builder
- Update proto serialization to use `CsvSource`
- Update tests and documentation
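
The trait-method approach listed above can be sketched as follows. This is a minimal illustration using simplified stand-in types, not DataFusion's actual definitions: the real `FileSource::repartitioned()` has a different signature and return type, and `ParquetSource` here is only a placeholder for any non-CSV source. The assumption is that a format-agnostic default returns `false`, that `CsvSource` overrides it, and that repartitioning is skipped when values may contain newlines.

```rust
// Simplified stand-ins for illustration only; these are NOT the real
// DataFusion types or signatures.

/// Hypothetical, cut-down version of a file source trait.
trait FileSource {
    /// Format-agnostic default: values are assumed not to contain newlines.
    fn has_newlines_in_values(&self) -> bool {
        false
    }

    /// Returns the number of partitions a scan would be split into.
    /// (Assumption: the real method returns a new config, not a count.)
    fn repartitioned(&self, target_partitions: usize) -> usize {
        if self.has_newlines_in_values() {
            // Splitting a file by byte range could cut a quoted value
            // in half, so keep the scan in a single partition.
            1
        } else {
            target_partitions.max(1)
        }
    }
}

/// Stand-in for the CSV source, which now owns the CSV-specific flag.
#[derive(Default)]
struct CsvSource {
    newlines_in_values: bool,
}

impl CsvSource {
    /// Builder-style setter, mirroring the name used in the migration guide.
    fn with_newlines_in_values(mut self, newlines_in_values: bool) -> Self {
        self.newlines_in_values = newlines_in_values;
        self
    }
}

impl FileSource for CsvSource {
    fn has_newlines_in_values(&self) -> bool {
        self.newlines_in_values
    }
}

/// Placeholder for a non-CSV source that relies on the trait default.
struct ParquetSource;

impl FileSource for ParquetSource {}
```

With this shape, the shared configuration no longer needs to know about a CSV-only option: callers ask the source through the trait, and only `CsvSource` answers with anything other than the default.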
Closes apache#18453
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
-    /// Specifies whether newlines in (quoted) values are supported.
-    ///
-    /// Parsing newlines in quoted values may be affected by execution behaviour such as
-    /// parallel file scanning. Setting this to `true` ensures that newlines in values are
-    /// parsed successfully, which may reduce performance.
-    ///
-    /// The default behaviour depends on the `datafusion.catalog.newlines_in_values` setting.
-    pub fn newlines_in_values(&self) -> bool {
-        self.new_lines_in_values
-    }
-
     #[deprecated(
         since = "52.0.0",
         note = "This method is no longer used, use eq_properties instead. It will be removed in 58.0.0 or 6 months after 52.0.0 is released, whichever comes first."
docs/source/library-user-guide/upgrading.md (30 additions, 0 deletions)
@@ -57,6 +57,36 @@ See <https://github.com/apache/datafusion/issues/19056> for more details.
 
 Note that the internal API has changed to use a trait `ListFilesCache` instead of a type alias.
 
+### `newlines_in_values` moved from `FileScanConfig` to `CsvSource`
+
+The CSV-specific `newlines_in_values` configuration option has been moved from `FileScanConfig` to `CsvSource`, as it only applies to CSV file parsing.
+
+**Who is affected:**
+
+- Users who set `newlines_in_values` via `FileScanConfigBuilder::with_newlines_in_values()`
+
+**Migration guide:**
+
+Set `newlines_in_values` on `CsvSource` instead of `FileScanConfigBuilder`:
+
+**Before:**
+
+```rust,ignore
+let source = Arc::new(CsvSource::new(file_schema.clone()));
+let config = FileScanConfigBuilder::new(object_store_url, source)
+    .with_newlines_in_values(true)
+    .build();
+```
+
+**After:**
+
+```rust,ignore
+let source = Arc::new(CsvSource::new(file_schema.clone())
+    .with_newlines_in_values(true));
+let config = FileScanConfigBuilder::new(object_store_url, source)
+    .build();
+```
+
 ### Removal of `pyarrow` feature
 
 The `pyarrow` feature flag has been removed. This feature has been migrated to