Operate along vertical (time) dimension after isolating data above threshold #9788
Unanswered
geacomputing asked this question in Q&A
Replies: 1 comment
Update: This is my mask, after computing and applying an above/below threshold condition.
This is, in essence, how I would like to have my mask: a multidimensional array instead of a time series. I am in the process of running a cumsum along time. Maybe this?
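A cumulative sum along time can be turned into a per-cell counter of consecutive days above threshold by subtracting the value the cumsum had at the most recent below-threshold day. Below is a minimal NumPy sketch of that idea, assuming time is the leading axis. The function name `streak_length` and the toy array are illustrative only and do not come from the original post.

```python
import numpy as np

def streak_length(mask, axis=0):
    """Count consecutive True values along `axis`, resetting at each False."""
    m = np.moveaxis(np.asarray(mask, dtype=bool), axis, 0)
    total = np.cumsum(m, axis=0)
    # Cumsum value at the latest False seen so far. `total` never decreases,
    # so a running maximum carries that value forward in time.
    last_reset = np.maximum.accumulate(np.where(m, 0, total), axis=0)
    return np.moveaxis(total - last_reset, 0, axis)

# Toy mask (hypothetical data): 6 time steps x 2 grid cells.
toy = np.array([[1, 0],
                [1, 1],
                [0, 1],
                [1, 1],
                [1, 0],
                [1, 1]], dtype=bool)
streaks = streak_length(toy, axis=0)
```

Each cell of `streaks` then holds the number of days the current exceedance has lasted so far, and 0 where the mask is False. The whole array is processed at once, with no loop over grid cells.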
Hi all. My name's Marco, and this is my first post asking for help. I have searched far and wide but am still struggling to reach a decent solution to my problem. I hope to find some constructive comments and help here. Let me explain.
I have a sea surface temperature (SST) dataset for a patch of ocean, with no vertical levels: say 30 latitudes by 30 longitudes (at 0.05 deg resolution), with daily values over 40 years.
I want to investigate marine heatwaves. For that, I need to see when and to what extent my SST exceeds a (custom) climatology.
The climatology is quantile-based and involves two steps.
1st step:
t=sst.groupby("time.dayofyear").quantile(0.9)
I am creating a first threshold.
2nd step:
I apply a rolling mean to the result of the first step, with a one-sided width of 5 days, and name the result rt:
rt=t.rolling(time=11, center=True).mean()
The window length is 11 because 5+5+1=11.
In the rolling window I use the time dimension (not dayofyear) because I then combine the result with the original SST (grouped by day) by subtraction; as a result, the rolling climatology has exactly the same dimensions and number of elements as the SST.
Now, I am creating a mask by doing:
mask=sst-rt>0
This mask has the same dimensions as the SST and tells me when and where my SST exceeds the rolling threshold (rt); the difference sst-rt tells me to what extent. In essence, it shows when and where a marine heatwave MIGHT occur. This is where I stand now.
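The threshold and mask steps above can be sketched end to end on synthetic data. Note that `groupby("time.dayofyear").quantile(...)` returns an array indexed by `dayofyear`, so this sketch smooths along `dayofyear` and maps the result back onto `time` with a grouped subtraction. That choice is an assumption about the intent, not the author's exact code. The wrap-around padding (so the window spans the year boundary), the random SST field and the names `q90`, `half` and `exceed` are likewise assumptions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real SST cube (hypothetical values).
time = pd.date_range("2000-01-01", "2003-12-31", freq="D")
rng = np.random.default_rng(0)
sst = xr.DataArray(
    rng.normal(15.0, 1.0, size=(time.size, 3, 4)),
    coords={"time": time, "lat": np.arange(3), "lon": np.arange(4)},
    dims=("time", "lat", "lon"),
)

# Step 1: 90th percentile for each calendar day -> dims (dayofyear, lat, lon).
q90 = sst.groupby("time.dayofyear").quantile(0.9)
q90 = q90.drop_vars("quantile", errors="ignore")

# Step 2: centred 11-day rolling mean (5 + 5 + 1), padded cyclically so that
# early January and late December see each other (an assumption).
half = 5
rt = (
    q90.pad(dayofyear=(half, half), mode="wrap")
    .rolling(dayofyear=2 * half + 1, center=True)
    .mean()
    .isel(dayofyear=slice(half, -half))
)

# Map the day-of-year threshold back onto the time axis and build the mask.
exceed = sst.groupby("time.dayofyear") - rt
mask = (exceed > 0).transpose(*sst.dims)
```

Here `exceed` keeps the magnitude of the exceedance, while `mask` is the boolean when-and-where array with the same shape as `sst`.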
Question:
From here I am trying to work along the time dimension ONLY. That is, across my array I want to apply a function that, over the time domain alone, detects events whose uninterrupted duration is at least N days, separating heat spikes from marine heatwaves. Whatever survives this test is a heatwave. All the rest is a heat spike.
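One loop-free way to apply such a duration test is to count consecutive True days forwards and backwards along time; their sum minus one is the total length of the event each day belongs to. The sketch below is one possible approach under stated assumptions, not an established marine heatwave definition. The names `split_events` and `min_days` are hypothetical, and time is assumed to be axis 0.

```python
import numpy as np

def _streak(m, axis):
    """Running count of consecutive True values along `axis`."""
    total = np.cumsum(m, axis=axis)
    reset = np.maximum.accumulate(np.where(m, 0, total), axis=axis)
    return total - reset

def split_events(mask, min_days, axis=0):
    """Split a boolean exceedance mask into heatwaves and spikes."""
    m = np.asarray(mask, dtype=bool)
    fwd = _streak(m, axis)  # days elapsed since the event started (1-based)
    bwd = np.flip(_streak(np.flip(m, axis), axis), axis)  # days left, incl. today
    duration = np.where(m, fwd + bwd - 1, 0)  # full event length at every day
    heatwave = m & (duration >= min_days)
    spike = m & ~heatwave
    return heatwave, spike, duration

# Toy example (hypothetical data): one grid cell, 8 days, N = 3.
toy = np.array([1, 1, 0, 1, 1, 1, 0, 1], dtype=bool)[:, None]
heatwave, spike, duration = split_events(toy, min_days=3)
```

In the toy series only the three-day run is kept as a heatwave; the two-day and one-day runs end up in `spike`. The `duration` array could later feed a taxonomy, since it carries the event length at every grid cell and time step.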
Expected outcome:
By so doing, and by selecting a time, I could:
see on a map the spatial distribution of events (heatwaves).
quantify their spatial extent, for instance by counting the grid cells within a shapefile (a buffer zone of 10 km off the coast).
compile a taxonomy. I would add another variable, with same dimensions, to the result of the operation. I could populate that variable with categories (mild, moderate, intense). Also, I could append the traits and fingerprints of each event (parameters defining taxonomy).
I am trying to vectorize everything. At the moment, as a test, I'm only working with a time-varying box of 30*30 lats and lons, but my dataset is far bigger than that, so everything should stay away from for loops as much as possible, in favour of a vectorized approach.
I have a function that works on 1D time series, but I would prefer to perform (vectorized) bulk operations on the whole dataset.
I am trying to use apply_ufunc, but the syntax is not so clear to me (input and output core dimensions).
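For reference, here is a minimal sketch of how a 1D function can be applied along time with `xr.apply_ufunc`. The function `long_runs_1d` and the toy mask are hypothetical stand-ins for the author's own 1D routine and data. `input_core_dims` names the dimension the function consumes, `output_core_dims` names the dimension it returns, and `vectorize=True` makes xarray loop the 1D function over all other dimensions.

```python
import numpy as np
import xarray as xr

def long_runs_1d(m, min_days):
    """1-D helper: True where `m` sits in a run of at least `min_days` Trues."""
    m = np.asarray(m, dtype=bool)
    padded = np.concatenate(([False], m, [False]))
    # Indices where the series flips; they alternate run start, run end.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    out = np.zeros(m.size, dtype=bool)
    for start, end in zip(edges[::2], edges[1::2]):
        if end - start >= min_days:
            out[start:end] = True
    return out

# Toy mask (hypothetical data): 8 days, 1 lat, 2 lons.
data = np.ones((8, 1, 2), dtype=bool)
data[:, 0, 0] = [1, 1, 0, 1, 1, 1, 0, 1]
mask = xr.DataArray(data, dims=("time", "lat", "lon"))

hw = xr.apply_ufunc(
    long_runs_1d,
    mask,
    kwargs={"min_days": 3},
    input_core_dims=[["time"]],   # the function receives a 1-D array over time
    output_core_dims=[["time"]],  # and returns a 1-D array over time
    vectorize=True,               # loop over every (lat, lon) pair
    output_dtypes=[bool],
)
# Core dimensions are moved to the end, so restore the original order.
hw = hw.transpose(*mask.dims)
```

`vectorize=True` is a convenience built on `np.vectorize`, so it still loops in Python under the hood. For a very large dataset, a function that already works on whole arrays (or chunked data with `dask="parallelized"`) is likely to scale better.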
What is the best strategy? What should I consider or avoid? Any constructive comment is more than welcome.
Thank you so much.
Marco