Skip to content

Lazier construction functions #29

@edmundnoble

Description

@edmundnoble

When implementing #28, I wrote filter and deleteSet as:

-- | Remove every element of a 'Set' from an 'IxSet'.
deleteSet :: Indexable ixs a => Set a -> IxSet ixs a -> IxSet ixs a
deleteSet deletes set = set `difference` fromSet deletes

-- | Remove elements from an `IxSet` not matching a predicate.
filter :: Indexable ixs a => (a -> Bool) -> IxSet ixs a -> IxSet ixs a
filter p ixset@(IxSet elements _ixs) =
  ixset `difference` fromSet (Set.filter (not . p) elements)

This allows for some nice reuse of code, but unfortunately runs afoul of the fact that fromSet strictly builds indices, and even rebuilds its input Set:

fromSet :: (Indexable ixs a) => Set a -> IxSet ixs a
fromSet = fromList . Set.toList

--

fromList :: (Indexable ixs a) => [a] -> IxSet ixs a
fromList list = insertList list empty

--

insertList :: forall ixs a. Indexable ixs a
           => [a] -> IxSet ixs a -> IxSet ixs a
insertList xs (IxSet a indexes) = IxSet (List.foldl' (\ b x -> Set.insert x b) a xs) v
  where
    v :: IxList ixs a
    v = mapIxList' update indexes
    -- ^ note the use of mapIxList', which is strict in all indices

    update :: forall ix. Ord ix => Ix ix a -> Ix ix a
    update (Ix index f) = ...

To improve performance, I propose that:

  • fromSet begins to reuse its input set, instead of recreating it from the input list.
  • Both fromSet and fromList lazily build indices, with the understanding that because these indices only close over the elements of the IxSet, they cannot cause a significant space leak.

This may require modifying the docs which say:

All operations on IxSet with the exception of queries are spine-strict in the indices as well. Query operations, however, are lazy in the indices, so querying a number of times and subsequently selecting the result will not unnecessarily rebuild all indices.

To say:

All operations that modify IxSet with the exception of queries are spine-strict in the indices as well. Query operations, however, are lazy in the indices, so querying a number of times and subsequently selecting the result will not unnecessarily rebuild all indices. Construction operations (fromSet and fromList) also lazily build the indices.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions