Contributor

@mhov commented Jun 24, 2025

Faster arrays!

Now that Array<T> is stable, this PR ports in performance enhancements for creating PostgreSQL int[] arrays, based on the optimizations in PostgreSQL's contrib/intarray extension (specifically new_intArrayType(int num)).

The goal is to significantly speed up the conversion from &[i32] to a PostgreSQL ArrayType Datum.

Background

Currently, to return a PostgreSQL int[] from a Rust function, we typically return a Vec<i32>, which then gets converted to a Datum via array_datum_from_iter(..) (implementation here).

This approach uses:

  • pg_sys::initArrayResult
  • pg_sys::accumArrayResult
  • pg_sys::makeArrayResult

These APIs build a varlena array by accumulating values and repeatedly reallocating memory as the capacity grows. For larger arrays, this results in substantial performance overhead.

Optimization

PostgreSQL's contrib/intarray takes a different approach for fixed-size datums: it pre-allocates the entire array up front and directly memcpys the data into ARR_DATA_PTR. This avoids the need for reallocations and is much faster.
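
To illustrate the pre-allocation idea outside of Postgres, here is a minimal plain-Rust sketch that sizes a one-dimensional, no-nulls int4 array once and copies the elements straight into its data area. It is not the code from this PR: build_int4_array_bytes is a hypothetical helper, the 8-byte MAXALIGN and 16-byte ArrayType header are assumptions about a typical 64-bit build, and the first four bytes are only a stand-in for what SET_VARSIZE would write.

```rust
use std::mem::size_of;

// Assumptions for this sketch: 8-byte MAXALIGN (typical 64-bit builds) and the
// 16-byte fixed ArrayType header (vl_len_, ndim, dataoffset, elemtype).
const MAXIMUM_ALIGNOF: usize = 8;
const ARRAY_HEADER_BYTES: usize = 16;
const INT4OID: u32 = 23;

/// Round `n` up to the next multiple of MAXIMUM_ALIGNOF.
fn maxalign(n: usize) -> usize {
    (n + MAXIMUM_ALIGNOF - 1) & !(MAXIMUM_ALIGNOF - 1)
}

/// Mirrors the C macro ARR_OVERHEAD_NONULLS(ndims): header + dims[] + lbound[].
fn arr_overhead_nonulls(ndims: usize) -> usize {
    maxalign(ARRAY_HEADER_BYTES + 2 * size_of::<i32>() * ndims)
}

/// Hypothetical helper: lay out a 1-D int4 array in one zero-filled buffer.
fn build_int4_array_bytes(values: &[i32]) -> Vec<u8> {
    let overhead = arr_overhead_nonulls(1);
    let nbytes = overhead + size_of::<i32>() * values.len();
    let mut buf = vec![0u8; nbytes]; // the only allocation

    // Stand-in for SET_VARSIZE; the real macro encodes the length differently.
    buf[0..4].copy_from_slice(&(nbytes as u32).to_ne_bytes());
    buf[4..8].copy_from_slice(&1i32.to_ne_bytes()); // ndim
    buf[8..12].copy_from_slice(&0i32.to_ne_bytes()); // dataoffset: 0 = no null bitmap
    buf[12..16].copy_from_slice(&INT4OID.to_ne_bytes()); // elemtype
    buf[16..20].copy_from_slice(&(values.len() as i32).to_ne_bytes()); // dims[0]
    buf[20..24].copy_from_slice(&1i32.to_ne_bytes()); // lbound[0]

    // Copy the elements directly into the data area, with no accumulation step.
    for (dst, v) in buf[overhead..].chunks_exact_mut(size_of::<i32>()).zip(values) {
        dst.copy_from_slice(&v.to_ne_bytes());
    }
    buf
}
```

Under these assumptions, a three-element array takes 24 bytes of overhead plus 12 bytes of data, and the buffer never needs to grow.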

What I've added

  • Implemented BoxRet for Array<'a, T> when T is a fixed-size numeric type (i8, i16, i32, f32, f64)
    • Enables using Array<T> directly as a return type from #[pg_extern] functions.
  • Adds optimized allocation logic for these numeric types
    • Pre-allocates the full ArrayType
    • Uses memcpy-like behavior for fast construction from slices
  • Adds helper functions for:
    • Creating empty arrays
    • Creating Array<T> from &[T]

Benchmarks

I've benchmarked two approaches for turning Vec<i32> -> Array<i32>:

  1. The current array_datum_from_iter(..)
  2. The new fast pre-allocated method

Each test ran over 10,000 rows of randomly generated int[] values in a temp table, with work_mem = '1GB'. I used Instant::now() to time only the conversion, in microseconds, and summed those timings across all rows.

    use pgrx::prelude::*;

    #[pg_extern]
    fn accum_alloc(input: Vec<i32>) -> i32 {
        let start = std::time::Instant::now();
        // Vec<i32>::into_datum() goes through array_datum_from_iter(..)
        let _array = input.into_datum().unwrap();
        let duration = start.elapsed();
        // as_micros() keeps whole seconds; subsec_micros() would drop them
        duration.as_micros() as i32
    }

    #[pg_extern]
    fn fast_alloc(input: Vec<i32>) -> i32 {
        let start = std::time::Instant::now();
        // The new pre-allocated path added by this PR
        let _array = Array::<i32>::new_from_slice(input.as_slice()).expect("couldn't allocate");
        let duration = start.elapsed();
        duration.as_micros() as i32
    }
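
| Row Count | int[] Length | accum_alloc (total μs) | fast_alloc (total μs) | Improvement |
|-----------|--------------|------------------------|-----------------------|-------------|
| 10,000    | 10           | 16,933.00              | 2,120.00              | 8.0×        |
| 10,000    | 100          | 155,352.00             | 10,228.00             | 15.2×       |
| 10,000    | 1,000        | 1,375,613.00           | 30,443.00             | 45.2×       |
| 10,000    | 10,000       | 13,666,835.00          | 251,825.00            | 54.3×       |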
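
For readers who want to re-derive the Improvement column, the arithmetic is just the ratio of the two totals. The helper names below are hypothetical and only restate the figures already reported above.

```rust
/// Speedup of the optimized path relative to the baseline (both in total µs).
fn speedup(baseline_us: f64, optimized_us: f64) -> f64 {
    baseline_us / optimized_us
}

/// Average microseconds per row, given a total over `rows` rows.
fn per_row_us(total_us: f64, rows: u32) -> f64 {
    total_us / f64::from(rows)
}
```

For the 10,000-element case this works out to roughly 1,367 µs per row with accum_alloc versus roughly 25 µs per row with fast_alloc.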

It's been a while since I contributed, so I'm a little rusty on matching the team's style/conventions, and as usual, the lifetime differences between PG-allocated and Rust-allocated memory still escape me. I'm sure the ergonomics of the new Array functions could be improved. Am I doing anything the wrong way here?

Comment on lines +395 to +399
where
T: IntoDatum,
T: UnboxDatum,
T: Sized,
{
Contributor

These bounds are not enough to block something like:

let a = pgrx::datum::new_array_with_len::<&[u8]>(5).expect("failed to create array");

It's better to mark this function as unsafe, and avoid exposing anything other than those specific Array::new_with_len methods.

Member

Hum. Could you elaborate?

Contributor

There is

let elem_size = std::mem::size_of::<T>(); // T = &[u8]

in the function body. This looks unreasonable. IntoDatum + UnboxDatum + Sized doesn't block types that are passed by value, and zero-initialization is not valid for every type.
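
A small plain-Rust sketch of why size_of::<T>() is misleading for T = &[u8]: it measures the fat pointer (address plus length), not any element payload, and the all-zero bit pattern is not a valid reference. data_bytes is a hypothetical stand-in for the byte-count computation, not the function from this PR.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `elem_size * len` in the function under review.
fn data_bytes<T>(len: usize) -> usize {
    size_of::<T>() * len
}

/// A slice reference is a fat pointer: one word for the address, one for the length.
fn slice_ref_is_two_words() -> bool {
    size_of::<&[u8]>() == 2 * size_of::<usize>()
}

/// References are never null, so `Option<&u8>` reuses the all-zero pattern for
/// `None`. An all-zero `&u8` is therefore not a valid value.
fn zero_pattern_is_reserved_for_none() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}
```

So data_bytes::<i32>(5) is the 20 bytes an int4 array needs, while data_bytes::<&[u8]>(5) is five pointer pairs and says nothing about the bytes those slices would refer to.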

Member

@workingjubilee workingjubilee left a comment

Some initial comments, mostly style nits. Let's discuss the motivation more.

If memory serves, I believe an array is a varlena allocation that does not have to match the precise size of the array's length, yes? That is, the varlena can be allocated bigger than the array needs. Why not something more like the spare_capacity_mut API that Vec has, so we can construct things without first zeroing them? I believe the zeroing overhead is in many cases inconsequential and often optimized-out anyways, but we can always make our code easier to optimize.
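
For reference, this is roughly what the Vec pattern mentioned above looks like in plain Rust. It is only a sketch of the standard-library API (Vec::spare_capacity_mut and MaybeUninit::write), not a proposal for the pgrx signature; copy_without_zeroing is a hypothetical name.

```rust
use std::mem::MaybeUninit;

/// Fill a freshly allocated Vec from `src` without zeroing it first.
fn copy_without_zeroing(src: &[i32]) -> Vec<i32> {
    let mut out: Vec<i32> = Vec::with_capacity(src.len());

    // The spare capacity is exposed as uninitialized slots.
    let spare: &mut [MaybeUninit<i32>] = out.spare_capacity_mut();
    for (slot, v) in spare.iter_mut().zip(src) {
        slot.write(*v);
    }

    // SAFETY: `with_capacity` guarantees room for at least `src.len()` elements,
    // and the loop above initialized exactly the first `src.len()` of them.
    unsafe { out.set_len(src.len()) };
    out
}
```

The caller writes into memory that was reserved but never zeroed, and the length is only updated once the elements are known to be initialized.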

let elem_size = std::mem::size_of::<T>();
let nbytes: usize = port::ARR_OVERHEAD_NONULLS(1) + elem_size * len;

unsafe {
Member

In new code, a safe function with internal unsafe should explain why the code is sound.

}

/// Creates an `Array<T>` of a fixed len, with 0 for all elements
/// Uses a single PG allocation rather than
Member

rather than?

Contributor Author

oops, rather than pg_sys::accumArrayResult(..)

Comment on lines +685 to +688
pub fn new_array_with_len<'a, T: Sized>(len: usize) -> Result<Array<'a, T>, ArrayAllocError>
where
T: IntoDatum,
T: UnboxDatum,
Member

I do not believe T: IntoDatum + UnboxDatum + Sized is a sufficient bound to establish that it is sound to create that type with all-zero initialization. This function is probably unsafe in actuality, albeit with a maybe-trivial precondition.

Comment on lines +685 to +688
pub fn new_array_with_len<'a, T: Sized>(len: usize) -> Result<Array<'a, T>, ArrayAllocError>
where
T: IntoDatum,
T: UnboxDatum,
Member

I prefer this sort of bound to be written like this

Suggested change
- pub fn new_array_with_len<'a, T: Sized>(len: usize) -> Result<Array<'a, T>, ArrayAllocError>
- where
-     T: IntoDatum,
-     T: UnboxDatum,
+ pub fn new_array_with_len<'a, T>(len: usize) -> Result<Array<'a, T>, ArrayAllocError>
+ where
+     T: Sized,
+     T: IntoDatum,
+     T: UnboxDatum,

or this if it works:

Suggested change
- pub fn new_array_with_len<'a, T: Sized>(len: usize) -> Result<Array<'a, T>, ArrayAllocError>
- where
-     T: IntoDatum,
-     T: UnboxDatum,
+ pub fn new_array_with_len<'a, T>(len: usize) -> Result<Array<'a, T>, ArrayAllocError>
+ where
+     T: IntoDatum + UnboxDatum + Sized,
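
A quick plain-Rust check that these spellings are interchangeable as far as the compiler is concerned, so the choice is purely stylistic. IntoDatumLike and UnboxDatumLike are hypothetical stand-ins for the pgrx traits. Note also that generic type parameters already carry an implicit Sized bound, so writing it out is documentation rather than an extra restriction.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for pgrx's IntoDatum and UnboxDatum.
trait IntoDatumLike {}
trait UnboxDatumLike {}
impl IntoDatumLike for i32 {}
impl UnboxDatumLike for i32 {}

// Original spelling: one bound inline, the rest in the where clause.
fn mixed_bounds<T: Sized>(len: usize) -> usize
where
    T: IntoDatumLike,
    T: UnboxDatumLike,
{
    len * size_of::<T>()
}

// First suggestion: every bound on its own where-clause line.
fn where_lines<T>(len: usize) -> usize
where
    T: Sized,
    T: IntoDatumLike,
    T: UnboxDatumLike,
{
    len * size_of::<T>()
}

// Second suggestion: a single combined bound.
fn where_plus<T>(len: usize) -> usize
where
    T: IntoDatumLike + UnboxDatumLike + Sized,
{
    len * size_of::<T>()
}
```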

Comment on lines +144 to +145
.add(std::mem::size_of::<pg_sys::ArrayType>())
.add(std::mem::size_of::<i32>() * ((*a).ndim as usize))
Member

style nit

Suggested change
-     .add(std::mem::size_of::<pg_sys::ArrayType>())
-     .add(std::mem::size_of::<i32>() * ((*a).ndim as usize))
+     .add(mem::size_of::<pg_sys::ArrayType>())
+     .add(mem::size_of::<i32>() * ((*a).ndim as usize))

@@ -126,3 +126,22 @@ pub(super) unsafe fn ARR_DATA_PTR(a: *mut pg_sys::ArrayType) -> *mut u8 {

unsafe { a.cast::<u8>().add(ARR_DATA_OFFSET(a)) }
}

/// Returns a pointer to the lower bounds of the array.
Member

...Maybe this should specify whether it actually points to the list of lower bounds given by Postgres as a series of integers, or the actual lower bound position in the actual data.

(The answer is the former.)

Member

@workingjubilee workingjubilee Sep 2, 2025

For ARR_DIMS I left the comment off entirely, on the tacit assumption that people either shouldn't use it or will read the original source closely enough to understand how to use it. I find that acceptable here too. Basically, the text should be correct or simply absent, not misleading.
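
To make the distinction in this thread concrete, here is a plain-Rust sketch of where the list of lower bounds sits in the array header, following the same offset arithmetic as the quoted diff (header size plus one i32 per dimension). The 16-byte header size is an assumption about the usual ArrayType layout, and read_lbounds is a hypothetical helper operating on a byte buffer, not on a real pg_sys::ArrayType.

```rust
use std::mem::size_of;

// Assumed size of the fixed ArrayType header (vl_len_, ndim, dataoffset, elemtype).
const ARRAY_HEADER_BYTES: usize = 16;

/// Byte offset of dims[], which directly follows the fixed header.
fn dims_offset() -> usize {
    ARRAY_HEADER_BYTES
}

/// Byte offset of lbound[], which follows the `ndim` entries of dims[].
fn lbound_offset(ndim: usize) -> usize {
    ARRAY_HEADER_BYTES + size_of::<i32>() * ndim
}

/// Hypothetical helper: read the per-dimension lower bounds out of a raw buffer.
/// These are header integers, not positions inside the element data.
fn read_lbounds(buf: &[u8], ndim: usize) -> Vec<i32> {
    let start = lbound_offset(ndim);
    let end = start + size_of::<i32>() * ndim;
    buf[start..end]
        .chunks_exact(size_of::<i32>())
        .map(|c| i32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```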

@mhov
Contributor Author

mhov commented Sep 3, 2025

@usamoi @workingjubilee instead of all the T: Sized + IntoDatum + UnboxDatum stuff, should we just make an opt-in trait, like we do for RangeSubType, that we specifically implement for i8, i16, i32, i64, f32, f64?

@workingjubilee
Member

@mhov Probably so, I think that would be a better starting point anyways as then it would be clearer what we're asking for.
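
One possible shape for such an opt-in trait, sketched in plain Rust. The names ZeroInitElem and zeroed_elems are hypothetical, and a Vec stands in for the Postgres allocation. The point is only that an unsafe marker trait moves the "all-zero bytes are valid" promise onto the implementor, so the constructor itself can stay safe.

```rust
/// Hypothetical opt-in marker for element types that may be zero-initialized.
///
/// # Safety
/// Implementors must be fixed-size, plain-data types for which the all-zero
/// bit pattern is a valid value.
pub unsafe trait ZeroInitElem: Copy {}

// Implement the marker only for the explicitly listed numeric types.
macro_rules! impl_zero_init_elem {
    ($($t:ty),* $(,)?) => {
        $(unsafe impl ZeroInitElem for $t {})*
    };
}
impl_zero_init_elem!(i8, i16, i32, i64, f32, f64);

/// Safe constructor: the trait bound carries the soundness argument.
pub fn zeroed_elems<T: ZeroInitElem>(len: usize) -> Vec<T> {
    // SAFETY: the ZeroInitElem contract promises that all-zero bytes are a valid T.
    let zero: T = unsafe { std::mem::zeroed() };
    vec![zero; len]
}
```

With this bound, a call such as zeroed_elems::<&[u8]>(5) is rejected at compile time because &[u8] does not implement the marker trait.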
