Commit 56e8208

alamb and westonpace authored
Improve documentation on ArrayData::offset (#7385)
* Improve documentation on ArrayData::offsets
* Apply suggestions from code review

Co-authored-by: Weston Pace <[email protected]>

---------

Co-authored-by: Weston Pace <[email protected]>
1 parent a5af643 commit 56e8208

1 file changed: +35 -10 lines changed

arrow-data/src/data.rs

Lines changed: 35 additions & 10 deletions
@@ -201,26 +201,50 @@ pub(crate) fn new_buffers(data_type: &DataType, capacity: usize) -> [MutableBuff

 #[derive(Debug, Clone)]
 pub struct ArrayData {
-    /// The data type for this array data
+    /// The data type
     data_type: DataType,

-    /// The number of elements in this array data
+    /// The number of elements
     len: usize,

-    /// The offset into this array data, in number of items
+    /// The offset in number of items (not bytes).
+    ///
+    /// The offset applies to [`Self::child_data`] and [`Self::buffers`]. It
+    /// does NOT apply to [`Self::nulls`].
     offset: usize,

-    /// The buffers for this array data. Note that depending on the array types, this
-    /// could hold different kinds of buffers (e.g., value buffer, value offset buffer)
-    /// at different positions.
+    /// The buffers that store the actual data for this array, as defined
+    /// in the [Arrow Spec].
+    ///
+    /// Depending on the array types, [`Self::buffers`] can hold different
+    /// kinds of buffers (e.g., value buffer, value offset buffer) at different
+    /// positions.
+    ///
+    /// The buffer may be larger than needed. Some items at the beginning may be skipped if
+    /// there is an `offset`. Some items at the end may be skipped if the buffer is longer than
+    /// we need to satisfy `len`.
+    ///
+    /// [Arrow Spec](https://arrow.apache.org/docs/format/Columnar.html#physical-memory-layout)
     buffers: Vec<Buffer>,

-    /// The child(ren) of this array. Only non-empty for nested types, currently
-    /// `ListArray` and `StructArray`.
+    /// The child(ren) of this array.
+    ///
+    /// Only non-empty for nested types, such as `ListArray` and
+    /// `StructArray`.
+    ///
+    /// The first logical element in each child element begins at `offset`.
+    ///
+    /// If the child element also has an offset then these offsets are
+    /// cumulative.
     child_data: Vec<ArrayData>,

-    /// The null bitmap. A `None` value for this indicates all values are non-null in
-    /// this array.
+    /// The null bitmap.
+    ///
+    /// `None` indicates all values are non-null in this array.
+    ///
+    /// [`Self::offset]` does not apply to the null bitmap. While the
+    /// BooleanBuffer may be sliced (have its own offset) internally, this
+    /// `NullBuffer` always represents exactly `len` elements.
     nulls: Option<NullBuffer>,
 }
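
The new comments on `offset`, `buffers`, and `nulls` describe an asymmetry that is easy to miss: the value buffer is indexed at `offset + i`, while the null bitmap is indexed at plain `i`. The following is a minimal std-only sketch of that rule. `ToyArrayData`, `get`, and `to_vec` are hypothetical names invented for illustration and are not the arrow-rs types or API; the real `NullBuffer` is a bitmap, modelled here as a `Vec<bool>`.

```rust
/// Hypothetical, simplified stand-in for `ArrayData` holding `i32` values.
/// It only models the offset rules stated in the doc comments.
struct ToyArrayData {
    /// Number of logical elements.
    len: usize,
    /// Offset in items (not bytes) into `values`.
    offset: usize,
    /// Value buffer; may be longer than `offset + len`.
    values: Vec<i32>,
    /// Validity mask with exactly `len` entries; `offset` is never applied to it.
    nulls: Option<Vec<bool>>,
}

impl ToyArrayData {
    /// Returns logical element `i`, or `None` if that element is null.
    fn get(&self, i: usize) -> Option<i32> {
        assert!(i < self.len, "index out of bounds");
        // The null mask is indexed without the offset...
        let valid = self.nulls.as_ref().map_or(true, |n| n[i]);
        // ...while the value buffer is indexed with it.
        valid.then(|| self.values[self.offset + i])
    }

    /// Collects all `len` logical elements.
    fn to_vec(&self) -> Vec<Option<i32>> {
        (0..self.len).map(|i| self.get(i)).collect()
    }
}

fn main() {
    let data = ToyArrayData {
        len: 3,
        offset: 2,
        // Items 10 and 11 are skipped by `offset`; item 15 lies past `offset + len`.
        values: vec![10, 11, 12, 13, 14, 15],
        nulls: Some(vec![true, false, true]),
    };
    assert_eq!(data.to_vec(), vec![Some(12), None, Some(14)]);
}
```

In this model the buffer holds six items but only three are visible, which mirrors the note that a buffer "may be larger than needed" at both ends.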

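The `child_data` comment says that a parent's `offset` applies to its children and that a child's own offset is cumulative with it. A rough sketch of that arithmetic for a struct-like parent follows. `ToyData` and `child_value` are hypothetical names used only for illustration; this is a simplified model of the comment's wording, not the arrow-rs implementation.

```rust
/// Hypothetical, simplified model of nested array data (not the arrow-rs type).
struct ToyData {
    /// Number of logical elements.
    len: usize,
    /// Offset in items (not bytes).
    offset: usize,
    /// Values, used by leaf nodes only.
    values: Vec<i32>,
    /// Children, used by struct-like parents only.
    children: Vec<ToyData>,
}

impl ToyData {
    /// Value of child column `col` at logical row `row` of a struct-like parent.
    fn child_value(&self, col: usize, row: usize) -> i32 {
        assert!(row < self.len, "row out of bounds");
        let child = &self.children[col];
        // The parent's offset and the child's own offset add up.
        child.values[self.offset + child.offset + row]
    }
}

fn main() {
    let child = ToyData {
        len: 3,
        offset: 2,
        values: vec![0, 1, 2, 3, 4, 5],
        children: vec![],
    };
    let parent = ToyData {
        len: 2,
        offset: 1,
        values: vec![],
        children: vec![child],
    };
    // Row 0 of the parent reads physical index 1 + 2 + 0 = 3 in the child buffer.
    assert_eq!(parent.child_value(0, 0), 3);
    assert_eq!(parent.child_value(0, 1), 4);
}
```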
@@ -555,6 +579,7 @@ impl ArrayData {
     }

     /// Returns the `buffer` as a slice of type `T` starting at self.offset
+    ///
     /// # Panics
     /// This function panics if:
     /// * the buffer is not byte-aligned with type T, or
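
The doc comment in this hunk describes viewing a raw buffer as a typed slice that starts at `self.offset`, and panicking when the bytes are not aligned for `T`. A std-only sketch of that idea, fixed to `u32` and built on `slice::align_to`, could look like the code below. `typed_from_offset` is a hypothetical helper written for illustration; it is not claimed to be how arrow-rs implements the method.

```rust
/// Hypothetical helper (not the arrow-rs API): view `bytes` as `u32` values
/// and skip the first `offset` items.
///
/// Panics if `bytes` is not aligned for `u32`, if its length is not a
/// multiple of 4, or if `offset` exceeds the number of items.
fn typed_from_offset(bytes: &[u8], offset: usize) -> &[u32] {
    // SAFETY: every bit pattern is a valid `u32`, so reinterpreting
    // initialized bytes is sound; `align_to` splits off unaligned parts.
    let (prefix, values, suffix) = unsafe { bytes.align_to::<u32>() };
    assert!(
        prefix.is_empty() && suffix.is_empty(),
        "buffer is not aligned for u32"
    );
    // The offset is counted in items, not bytes.
    &values[offset..]
}

fn main() {
    // Start from `u32` storage so the byte view is guaranteed to be aligned.
    let storage: Vec<u32> = vec![1, 2, 3, 4, 5];
    // SAFETY: `u8` has alignment 1, so a `u32` slice can always be viewed as bytes.
    let (_, bytes, _) = unsafe { storage.align_to::<u8>() };
    assert_eq!(typed_from_offset(bytes, 2), &[3, 4, 5]);
}
```

Note that the offset is applied after the reinterpretation, so it skips whole `u32` items rather than bytes, matching the "in number of items (not bytes)" wording on `offset`.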
