Skip to content

Commit

Permalink
Merge pull request #10 from emiflake/emiflake/bench
Browse files Browse the repository at this point in the history
add contrived benchmark
  • Loading branch information
kylixafonso authored Jun 29, 2023
2 parents 51fc5b7 + b9c5baa commit b542fbf
Show file tree
Hide file tree
Showing 5 changed files with 114 additions and 1 deletion.
21 changes: 20 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,10 +15,29 @@ A `Series` is a _sequence_ of `DataPoint`s, each `DataPoint` contains the _time_

## Performance

While there are as of yet no proper benchmarks in order to assess the performance of this library. The current implementation is backed by a `Vector`, and (ab)uses the `O(1)` slicing, and binary search in order to make operations more optimal than a naive implementation.
The current implementation is backed by a `Vector`, and (ab)uses the `O(1)` slicing, and binary search in order to make operations more optimal than a naive implementation.

As an example, slicing `Series` with a list leads to `O(n)` slice. With a Vector, and binary search, we get `O(log n)` slice.

Here is a comparison of naive and vector-backed slicing:

```
benchmarking slice/series
time 271.5 ns (270.1 ns .. 273.2 ns)
1.000 R² (1.000 R² .. 1.000 R²)
mean 272.5 ns (271.5 ns .. 273.6 ns)
std dev 3.600 ns (2.779 ns .. 4.583 ns)
variance introduced by outliers: 13% (moderately inflated)
benchmarking slice/naive
time 1.119 ms (1.113 ms .. 1.126 ms)
0.999 R² (0.999 R² .. 1.000 R²)
mean 1.106 ms (1.102 ms .. 1.114 ms)
std dev 21.08 μs (16.63 μs .. 30.04 μs)
```

More benchmarks may follow.

## Prior work

- [time-series](https://hackage.haskell.org/package/time-series) (github private, no active maintainer)
Expand Down
34 changes: 34 additions & 0 deletions bench/Main.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
module Main (main) where

import Criterion (bench, bgroup, env)
import Criterion.Main (defaultMain, whnf)
import Data.Series (Series, series)
import Data.Series qualified as Series
import Data.Time (Day (ModifiedJulianDay), UTCTime (UTCTime), secondsToDiffTime)
import Naive qualified

mkUTCTime :: Integral a => a -> UTCTime
mkUTCTime x =
UTCTime
(ModifiedJulianDay $ toInteger x `div` 86401)
(secondsToDiffTime $ toInteger x `mod` 86401)

setupEnv :: IO (Series Int, Naive.Series Int)
setupEnv = do
let largeSeries :: Series Int
largeSeries = series [(mkUTCTime x, x) | x <- [0 .. 1000000]]

naiveLargeSeries :: Naive.Series Int
naiveLargeSeries = Naive.series [(mkUTCTime x, x) | x <- [0 .. 1000000]]
pure (largeSeries, naiveLargeSeries)

main :: IO ()
main =
defaultMain
[ env setupEnv $ \ ~(largeSeries, naiveLargeSeries) ->
bgroup
"slice"
[ bench "series" $ whnf (Series.slice (mkUTCTime @Int 50000) (mkUTCTime @Int 100000)) largeSeries
, bench "naive" $ whnf (Naive.slice (mkUTCTime @Int 50000) (mkUTCTime @Int 100000)) naiveLargeSeries
]
]
22 changes: 22 additions & 0 deletions bench/Naive.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
module Naive (Series (..), slice, series) where

import Control.DeepSeq (NFData)
import Data.Kind (Type)
import Data.Series (DataPoint (DataPoint))
import Data.Time (UTCTime)
import GHC.Generics (Generic)

newtype Series (a :: Type) = Series [DataPoint a]
deriving stock (Show, Eq, Generic)
deriving newtype (NFData)

slice ::
forall (a :: Type).
UTCTime ->
UTCTime ->
Series a ->
Series a
slice start end (Series xs) = Series [DataPoint x y | DataPoint x y <- xs, x >= start && x <= end]

series :: [(UTCTime, a)] -> Series a
series = Series . fmap (uncurry DataPoint)
15 changes: 15 additions & 0 deletions series.cabal
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@ common shared
, these
, time
, vector
, deepseq

library
import: shared
Expand All @@ -73,6 +74,7 @@ common test-deps
, tasty
, tasty-hunit
, tasty-quickcheck
, criterion

common test-opts
ghc-options: -threaded -rtsopts -with-rtsopts=-N -O2
Expand All @@ -94,3 +96,16 @@ test-suite series-test
Utils

hs-source-dirs: src test


benchmark series-bench
import: shared, test-deps, test-opts
type: exitcode-stdio-1.0
main-is: Main.hs
other-modules:
Naive
Data.Series
Data.Series.TimeRange
Data.Series.Internal

hs-source-dirs: src bench
23 changes: 23 additions & 0 deletions src/Data/Series/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -10,11 +10,13 @@ module Data.Series.Internal (
emptySeries,
) where

import Control.DeepSeq (NFData, NFData1)
import Data.Kind (Type)
import Data.These (These (..), these)
import Data.Time (UTCTime)
import Data.Vector (Vector)
import Data.Vector qualified as Vector
import GHC.Generics (Generic, Generic1)
import Prelude hiding (lookup)

{- | Represents a data point in a 'Series'. It has a time and a value.
Expand All @@ -32,8 +34,18 @@ data DataPoint (a :: Type) = DataPoint
Show
, -- | @since 0.1.0.0
Eq
, -- | @since 0.1.0.0
Generic
, -- | @since 0.1.0.0
Generic1
)

-- | @since 0.1.0.0
instance NFData a => NFData (DataPoint a)

-- | @since 0.1.0.0
instance NFData1 DataPoint

{- | A collection of 'DataPoint's. For any given time, we may or may not have a data point.
The data points are sorted by time.
Expand All @@ -49,8 +61,19 @@ newtype Series (a :: Type) = Series
Show
, -- | @since 0.1.0.0
Eq
, -- | @since 0.1.0.0
Generic
, -- | @since 0.1.0.0
Generic1
)
deriving newtype
( -- | @since 0.1.0.0
NFData
)

-- | @since 0.1.0.0
instance NFData1 Series

-- | /O(log n)/. Perform a binary search for a time in the given 'Series'.
binarySearch ::
forall (a :: Type).
Expand Down

0 comments on commit b542fbf

Please sign in to comment.