- Project course : Incheon National University, Dept. of Industrial & Management Engineering, Capstone Design (College of Engineering graduation project)
- Project name : Bitcoin Price Prediction with Machine Learning
- Project period : 2021.04.01 ~ 2021.05.31
This "Bitcoin Price Prediction with Machine Learning" project was started by fourth-year students of the Dept. of Industrial & Management Engineering at Incheon National University as their graduation project. We wanted to tackle a problem accessible in everyday life rather than an on-campus one, and chose Bitcoin, a topic that has been drawing a lot of attention recently.
Accordingly, we entered the DACON AI Bitcoin Trader competition for our graduation project. We did not achieve strong results, but we would appreciate it if you view the work as meaningful for the variety of academic approaches we attempted.
This project is organized as follows.
- Chapter. 1 - EDA
- Chapter. 2 - Season 1 pilot
- Chapter. 3 - Personal modeling prediction
- Chapter. 4 - Data preprocess
- Chapter. 5 - Pytorch modeling prediction
- Chapter. 6 - Experiments & Simulation
- Reference
We carried out the project under the assumption that this is a forecasting problem. Early in the project we used EDA to understand the data we would be working with (this was necessary because, beyond the ordinary price data, some fields were hard to interpret intuitively). Since a season 1 of the competition existed, we also reviewed how earlier teams had approached the problem; we then set the basic ARIMA and Prophet models as baselines, and went on to build neural network models on both the PyTorch and TensorFlow frameworks.
To briefly summarize the process: the deep neural network models (LSTM, Conv1d, Seq2Seq) did not outperform the baseline ARIMA and Prophet models. We judged that the price data fluctuates too strongly for generic forecasting modeling to work directly, so we did additional data handling (simple exponential smoothing, moving average smoothing, and data discretization; we also tried fractional differencing, but only with the ARIMA model).
We expected better results after the data handling, but it was still not enough to explain the volatility of the coin data. In conclusion, despite the many experiments, we did not achieve a model that was predictive even in part.
YouTube link : https://www.youtube.com/watch?v=-ZSlri43b5A
- sample_id : one sequence sample; each sequence consists of 1,380 minutes of time-series data
Figure. Example data sample
- X : 1,380 minutes (23 hours) of continuous data
- Y : 120 minutes (2 hours) of continuous data
- Task: watch 23 hours of data flow, then predict the following 2 hours (see the sketch after this list)
- sample_id spans 7,661 sets; each set is an independent dataset
- coin_index covers 10 coin types in total (index numbers 0-9)
- The number of samples differs per coin
- Coins 9 and 8 have the most samples
Figure. Number of data samples per coin index
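To make the shapes concrete, here is a minimal sketch of how one (X, y) pair could be pulled from the data; the array names are assumptions, and the open-price column index follows the code later in this report.

```python
import numpy as np

# Assumed shapes: train_x_array (7661, 1380, n_features), train_y_array (7661, 120, n_features);
# the open price is assumed to sit at column index 1, as used elsewhere in this report.
def get_xy_pair(train_x_array, train_y_array, sample_id, open_col=1):
    x_series = train_x_array[sample_id, :, open_col]  # 23 h of inputs, shape (1380,)
    y_series = train_y_array[sample_id, :, open_col]  # 2 h to predict, shape (120,)
    return x_series, y_series
```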
- 'Volume' - 'Taker buy base asset volume' = 'Maker buy base asset volume'
Source : https://www.binance.kr/apidocs/#individual-symbol-mini-ticker-stream
- quote asset volume = coin volume / btc volume
quote asset volume = Volume expressed in quote asset units. For pair DOGE/BTC the volume is shown in BTC, instead of DOGE.
e.g., for a base/quote currency pair, the amount expressed in the quote currency
Rough calculation in Korean won (1,000,000 KRW):
ex) for BTC/USDT (BTC at 57,000 USDT, 1 USDT = 1,200 KRW), qav = 1,000,000 / 1,200 ≈ 833 USDT
for BTC/KRW (BTC worth 74,000,000 KRW), qav = 1,000,000
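A quick arithmetic check of the example above; all figures are the rough assumptions used in the text.

```python
# Spend 1,000,000 KRW on the BTC/USDT pair, assuming 1 USDT = 1,200 KRW.
krw_budget = 1_000_000
krw_per_usdt = 1_200
qav_usdt = krw_budget / krw_per_usdt
print(qav_usdt)  # ≈ 833.33: volume expressed in the quote asset (USDT)

# For BTC/KRW the quote asset is KRW itself, so qav is simply 1,000,000.
```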
tb_base_av ratio by coin (volume / quote_av):

| coin | volume / quote_av |
| --- | --- |
| 0 | 19.xxxxx |
| 1 | 0.028xxxxx |
| 2 | 0.268xxxxx |
| 3 | 0.238xxxxx |
| 4 | 2.1312xxxx |
| 5 | 52.1123xxxx (maximum coin) |
| 6 | 0.22421 |
| 7 | 19.3821 |
| 8 | 0.003426 |
| 9 | 0.00013 (minimum coin) |

====> presumably, the smaller the value, the more expensive the coin
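A hedged sketch of how the ratios above could be reproduced; it assumes a long-format dataframe train_x_df with 'coin_index', 'volume', and 'quote_av' columns, which is not shown in the original code.

```python
# Mean volume / quote_av ratio per coin; rows with zero quote_av are dropped first.
ratio_by_coin = (
    train_x_df[train_x_df['quote_av'] != 0]
    .assign(ratio=lambda df: df['volume'] / df['quote_av'])
    .groupby('coin_index')['ratio']
    .mean()
)
print(ratio_by_coin.sort_values())  # smaller ratio -> presumably a more expensive coin
```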
- Outliers occur too infrequently within the samples, making them hard to learn by regression (no meaningful difference among raw, smoothing, and log smoothing)
Figure. open price distribution plot
- Temporary method for open price outlier detection:
```python
import matplotlib.pyplot as plt

# plot each suspected outlier series against the overall open-price range
for temp_arr in outlier_arr:
    plt.plot(temp_arr, label='True series')
    plt.ylim(open_arr.min(), open_arr.max())
    plt.legend()
    plt.show()

# drop the outlier samples from the label dataframe
filtered_y_df = raw_y_df[~raw_y_df["sample_id"].isin(outlier_list)]
```
Coin EDA code link : here
- Greedy feature added, based on the taker volume data
```python
# greedy feature handling
test_df = train_x_df[train_x_df['volume'] != 0]
test_df['rest_asset'] = test_df['volume'] - test_df['tb_base_av']
test_df['greedy'] = test_df['tb_base_av'] / test_df['volume']
test_df2 = test_df[['time', 'coin_index', 'open', 'high', 'low', 'close', 'volume',
                    'trades', 'tb_base_av', 'rest_asset', 'greedy']]
test_df2[['coin_index', 'trades', 'volume', 'tb_base_av', 'rest_asset', 'greedy']].head()
test_df2[test_df2['greedy'] == 1][['coin_index', 'trades', 'volume', 'tb_base_av', 'rest_asset', 'greedy']].head()
```
- Volatility-range feature added, based on the difference between the high and low prices
```python
print(
    f'''
    {df.high.max()}
    {df.low.max()}
    {df.open.max()}
    {df.close.max()}
    {df.high.min()}
    {df.low.min()}
    {df.open.min()}
    {df.close.min()}
    '''
    ''' high - low = volatility range \n'''
    ''' a bullish/bearish candle flag could be added as well'''
)
```
- Modeling was run on sample_id = 0, using only the open data series
- ARIMA argument meanings : https://otexts.com/fppkr/arima-forecasting.html
- ARIMA python code
```python
# ARIMA model fitting: the model arguments were chosen arbitrarily
model = ARIMA(x_series, order=(3, 0, 1))
fit = model.fit()
pred_by_arima = fit.predict(1381, 1380 + 120, typ='levels')
```
- Time Series Forecasting - ARIMA vs Prophet : https://medium.com/analytics-vidhya/time-series-forecasting-arima-vs-prophet-5015928e402a
- Facebook Prophet GitHub : https://facebook.github.io/prophet/docs/quick_start.html
- Prophet explainer blog (Korean) : https://zzsza.github.io/data/2019/02/06/prophet/
- Prophet python code
```python
# fit the Prophet model
prophet = Prophet(seasonality_mode='multiplicative',
                  yearly_seasonality=False,
                  weekly_seasonality=False,
                  daily_seasonality=True,
                  changepoint_prior_scale=0.06)
prophet.fit(x_df)
future_data = prophet.make_future_dataframe(periods=120, freq='min')
forecast_data = prophet.predict(future_data)
```
Season 1 pilot code link : here
- As in the earlier approach, only the open column of train_x is used to predict yhat.
- First, the existing ARIMA method was set as the baseline.
- The hyperparameters p, d, q were chosen arbitrarily.
- ARIMA python code
```python
def train(x_series, y_series, args):
    model = ARIMA(x_series, order=(2, 0, 2))
    fit = model.fit()
    y_pred = fit.predict(1381, 1380 + 120, typ='levels')
    error = mean_squared_error(y_series, y_pred)
    plotting(y_series, y_pred, args.sample_id)
    return error * 10E5
```
Figure. open price ARIMA prediction plot
Colab link : https://colab.research.google.com/drive/1x28Mi9MSqqkSTO2a8UU0wXDzgXNy2WT9?usp=sharing
- Hyperparameters were set arbitrarily; for seasonality, we judged multiplicative to fit the coin data better than additive
- prophet python code
```python
prophet = Prophet(seasonality_mode='multiplicative',
                  yearly_seasonality='auto',
                  weekly_seasonality='auto',
                  daily_seasonality='auto',
                  changepoint_range=0.9,
                  changepoint_prior_scale=0.1)  # tuned to avoid overfitting and underfitting
prophet.add_seasonality(name='first_seasonality', period=1/12, fourier_order=7)   # add custom seasonality
prophet.add_seasonality(name='second_seasonality', period=1/8, fourier_order=15)  # add custom seasonality
prophet.fit(x_df)
future_data = prophet.make_future_dataframe(periods=120, freq='min')
forecast_data = prophet.predict(future_data)
```
Figure. open price prophet prediction plot
Colab link : https://colab.research.google.com/drive/1dDf6AIln31catWWDsrB_lbL-0M5DsZTd?usp=sharing
- Hyperparameters were chosen arbitrarily; the seasonality mode was multiplicative, as in the previous Prophet model
- NeuralProphet python code
```python
def prophet_preprocessor(x_series):
    # start time initialization
    start_time = '2021-01-01 00:00:00'
    start_dt = datetime.datetime.strptime(start_time, '%Y-%m-%d %H:%M:%S')
    # build the dataframe
    x_df = pd.DataFrame()
    # per-minute timestamp series
    x_df['ds'] = [start_dt + datetime.timedelta(minutes=time_min)
                  for time_min in np.arange(1, x_series.shape[0] + 1).tolist()]
    # price data series
    x_df['y'] = x_series.tolist()
    return x_df

def train(x_series, y_series, **paras):
    x_df = prophet_preprocessor(x_series)
    model = NeuralProphet(
        n_changepoints=paras['n_changepoints'],
        changepoints_range=paras['changepoints_range'],
        num_hidden_layers=paras['num_hidden_layers'],
        learning_rate=0.1, epochs=40, batch_size=32,
        seasonality_mode='multiplicative',
        yearly_seasonality=False, weekly_seasonality=False, daily_seasonality=False,
        normalize='minmax'
    )
    model.add_seasonality(name='first_seasonality', period=1/24, fourier_order=5)
    model.add_seasonality(name='second_seasonality', period=1/12, fourier_order=10)
    metrics = model.fit(x_df, freq="min")
    future = model.make_future_dataframe(x_df, periods=120)
    forecast = model.predict(future)
    error = mean_squared_error(y_series, forecast.yhat1.values[-120:])
    return error
```
Colab link : https://colab.research.google.com/drive/1E38kkH2mfFgnGKj89t2mLZV6xg7rPQl8?usp=sharing
- Generally, differencing makes a series stationary, but it also transforms the original data, so information is lost. The concept of fractional differencing was introduced to cover this.
- Fractional-order differencing of time series (Korean blog) : https://m.blog.naver.com/chunjein/222072460703
- Fractional differencing ARIMA code
```python
# weight function for fractional differencing
def getWeights_FFD(d, size, thres):
    w = [1.]  # initial value of w = 1
    for k in range(1, size):
        w_ = -w[-1] * (d - k + 1) / k  # recursive weight formula
        if abs(w[-1]) >= thres and abs(w_) <= thres:
            break
        else:
            w.append(w_)
    # reverse w
    w = np.array(w[::-1]).reshape(-1, 1)
    return w

def fracDiff_FFD(series, d, thres=0.002):
    '''
    Constant width window (new solution)
    Note 1: thres determines the cut-off weight for the window
    Note 2: d can be any positive fractional, not necessarily bounded [0,1]
    '''
    # 1) Compute weights for the longest series
    w = getWeights_FFD(d, series.shape[0], thres)
    width = len(w) - 1
    # 2) Apply weights to values
    df = []
    seriesF = series
    for iloc in range(len(w), seriesF.shape[0]):
        k = np.dot(w.T[::-1], seriesF[iloc - len(w):iloc])
        df.append(k)
    df = np.array(df)
    return df, w
```
```python
# fractional differencing example
x_series = train_x_array[idx, :, data_col_idx]
# fractional differencing
fdiff, fdiff_weight = fracDiff_FFD(x_series, d=0.2, thres=0.002)
differencing_x_series = fdiff.reshape(fdiff.shape[0],)
# ARIMA modeling
model = ARIMA(differencing_x_series, order=(2, 0, 2))
fitted_model = model.fit()
pred_y_series = fitted_model.predict(1, 120, typ='levels')
# scale control: fractional differencing largely preserves the temporal structure and
# information, but changes the scale; rescale so the series lines up with the
# minute-1380 price of 1.
first_value = pred_y_series[0]
scale_controler = 1 / first_value
scaled_pred_y_series = scale_controler * pred_y_series
```
Colab link : https://colab.research.google.com/drive/19hrQP6nI-KgVwWu9Udp2fbntYCjpnHG9?usp=sharing
- In a similar way, the modeling was driven not only with the open price data but also with the other features
- Keras LSTM and GRU attempts
```python
# model training
class CustomHistory(keras.callbacks.Callback):
    # note: init() is called manually below (it is not __init__)
    def init(self):
        self.train_loss = []
        self.val_loss = []

    def on_epoch_end(self, batch, logs={}):
        self.train_loss.append(logs.get('loss'))
        self.val_loss.append(logs.get('val_loss'))

def train(x_train, y_train, n_epoch, n_batch, x_val, y_val):
    # model definition
    model = Sequential()
    model.add(LSTM(128, return_sequences=True, input_shape=(x_train.shape[1], x_train.shape[2])))
    model.add(LSTM(64, return_sequences=False))
    model.add(Dense(25, activation='relu'))
    model.add(Dense(1))
    # configure the training process
    model.compile(loss='mean_squared_error', optimizer='adam')
    # train the model
    custom_hist = CustomHistory()
    custom_hist.init()
    model.fit(x_train, y_train, epochs=n_epoch, batch_size=n_batch, shuffle=True,
              callbacks=[custom_hist], validation_data=(x_val, y_val), verbose=1)
    return model
```
Figure. Keras LSTM prediction plot
Colab link : https://colab.research.google.com/drive/1oCCXpJSlLXDs6x968eYrIPQtzEo0klMq?usp=sharing
```python
# also tried a GRU
model = keras.models.Sequential([
    keras.layers.Bidirectional(keras.layers.GRU(units=50, return_sequences=True),
                               input_shape=(x_frames, 1)),
    keras.layers.GRU(units=50),
    keras.layers.Dense(1)
])
model.compile(optimizer='adam', loss='mse')
model.summary()
```
Figure. Keras GRU prediction plot
Colab link : https://colab.research.google.com/drive/1w2GZXVXSjRX-tlI49WAcC77szQaK_H6R?usp=sharing
Afterwards, we attempted DNN-family modeling, but the regression did not work properly. -> We judged that the original data fluctuates too strongly for the model to fit a regression.
- smoothing method 1 : simple exponential smoothing
Exponential smoothing is a time series forecasting method for univariate data that can be extended to support data with a systematic trend or seasonal component. It is a powerful forecasting method that may be used as an alternative to the popular Box-Jenkins ARIMA family of methods.
- smoothing method 2 : moving average
Smoothing is a technique applied to time series to remove the fine-grained variation between time steps. The hope of smoothing is to remove noise and better expose the signal of the underlying causal processes. Moving averages are a simple and common type of smoothing used in time series analysis and time series forecasting.
- smoothing python code
```python
def simple_exponetial_smoothing(arr, alpha=0.3):
    y_series = list()
    for temp_arr in arr:
        target_series = temp_arr[:, 1].reshape(-1)  # open col is index 1
        smoother = SimpleExpSmoothing(target_series,
                                      initialization_method="heuristic").fit(
            smoothing_level=alpha, optimized=False)
        smoothing_series = smoother.fittedvalues
        y_series.append(smoothing_series)
    return np.array(y_series)

def moving_average(arr, window_size=10):
    # window length for the moving average
    length = window_size
    ma = np.zeros((arr.shape[0], arr.shape[1] - length, arr.shape[2]))
    for idx in range(arr.shape[0]):
        for i in range(length, arr.shape[1]):
            for col in range(arr.shape[2]):
                ma[idx, i - length, col] = arr[idx, i - length:i, col].mean()
    return ma[:, :, 1]  # open col is index 1
```
Figure. price data smoothing plot
- Since the y open data contains too many large-amplitude outlier samples, we switched from predicting the true y to learning only the pattern elements of y
- Discretization method : KBinsDiscretizer (scikit-learn)
- KBinsDiscretizer python code
```python
from sklearn.preprocessing import KBinsDiscretizer

kb = KBinsDiscretizer(n_bins=10, strategy='uniform', encode='ordinal')
kb.fit(open_y_series)
# the stored bin boundaries can be inspected via the bin_edges_ attribute
print("bin edges :\n", kb.bin_edges_)
```
Figure. kbinsdiscretizer before & after plot
To use the features other than the open data in the data set, normalization was applied as shown below. A standard scikit-learn normalizer could not be used directly because, within the competition data, each set had already been through one global preprocessing pass that pins the open price at minute 1380 to 1; hence the method below.
- log normalizer python code
```python
data = data.apply(lambda x: np.log(x + 1) - np.log(x[self.x_frames - 1] + 1))
```
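For clarity, the same normalization written out on a plain NumPy window; a sketch assuming x_frames is the input length and the final input-step price is the reference.

```python
import numpy as np

def log_normalize(window, x_frames):
    """log(1+p) difference against the price at the last input step."""
    ref = np.log(window[x_frames - 1] + 1.0)
    return np.log(window + 1.0) - ref

# the window's last step maps to exactly 0; earlier steps become log-relative offsets
print(log_normalize(np.array([90.0, 95.0, 100.0]), x_frames=3))
```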
- Time-series normalization method reference : https://github.com/heartcored98/Standalone-DeepLearning/blob/master/Lec8/Lab10_Stock_Price_Prediction_with_LSTM.ipynb (2019 KAIST Standalone Deep Learning lecture)
- Conditions
  - Only coin 9 data used
  - Data preprocessing : simple exponential smoothing
  - 1 LSTM layer
- PyTorch LSTM python code
```python
class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers, dropout, use_bn):
        super(LSTM, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        self.num_layers = num_layers
        self.dropout = dropout
        self.use_bn = use_bn
        self.lstm = nn.LSTM(self.input_dim, self.hidden_dim, self.num_layers)
        self.regressor = self.make_regressor()

    def init_hidden(self, batch_size):
        return (torch.zeros(self.num_layers, batch_size, self.hidden_dim),
                torch.zeros(self.num_layers, batch_size, self.hidden_dim))

    def make_regressor(self):
        layers = []
        if self.use_bn:
            layers.append(nn.BatchNorm1d(self.hidden_dim))
        layers.append(nn.Dropout(self.dropout))
        layers.append(nn.Linear(self.hidden_dim, self.hidden_dim))
        layers.append(nn.ReLU())
        layers.append(nn.Linear(self.hidden_dim, self.output_dim))
        regressor = nn.Sequential(*layers)
        return regressor

    def forward(self, X):
        # X is (seq_len, batch, features); regress on the last timestep's hidden state
        lstm_out, self.hidden = self.lstm(X)
        y_pred = self.regressor(lstm_out[-1].view(X.shape[1], -1))
        return y_pred
```
- Visualization of the model training scheme
Figure. Multistep LSTM modeling (source : TensorFlow tutorial)
When the model emits all 120 y values at once, it ends up producing the same y values regardless of the input pattern, as in the figure above. -> failure
Colab link : https://colab.research.google.com/drive/1I0Arck8qkV4FTXnOOYMxkpZGIRKCGj7J?usp=sharing
Next, we sliced the data within each sample and converted the task into predicting the following 120-step series from the preceding 120-step window, but this also failed.
- One-sample data slicing python code
```python
class WindowGenerator():
    '''Dataset generator'''
    def __init__(self, input_width, label_width, stride, data_arr,
                 column_indices=column_indices, shift=None, label_columns=None):
        # Store the raw data
        self.data_arr = data_arr
        # Work out the label column indices.
        self.label_columns = label_columns
        if label_columns is not None:
            self.label_columns_indices = {name: i for i, name in enumerate(label_columns)}
        self.column_indices = column_indices
        # Work out the window parameters.
        self.input_width = input_width
        self.label_width = label_width
        self.shift = 1
        if shift is not None:
            self.shift = shift
        self.stride = stride
        self.label_start = self.input_width + self.shift
        self.total_window_size = self.label_start + self.label_width
        # input, label indices
        self.input_slice = slice(0, self.input_width)
        self.input_indices = np.arange(self.total_window_size)[self.input_slice]
        self.labels_slice = slice(self.label_start, None)
        self.label_indices = np.arange(self.total_window_size)[self.labels_slice]
        self.X_arr, self.y_arr = self.split_windows()

    def __repr__(self):
        return '\n'.join([
            f'Total window size: {self.total_window_size}',
            f'Input indices: {self.input_indices}',
            f'Label indices: {self.label_indices}',
            f'Label column name(s): {self.label_columns}'])

    def split_windows(self):
        X, y = list(), list()
        sample_length = int(self.data_arr.shape[0])
        split_length = int((self.data_arr.shape[1] - self.total_window_size) / self.stride) + 1
        for temp_id in range(sample_length):
            for i in range(split_length):
                X.append(self.data_arr[temp_id, (i * self.stride):(i * self.stride) + self.input_width])
                y.append(self.data_arr[temp_id, (i * self.stride) + self.label_start:(i * self.stride) + self.total_window_size])
        return np.array(X), np.array(y)

    def __len__(self):
        return len(self.X_arr)

    def __getitem__(self, idx):
        X = self.X_arr[idx, :, :]
        y = self.y_arr[idx, :, :]
        return X, y
```
Colab link : https://colab.research.google.com/drive/11s1KCtT8NPvsaOR-1mYaR66lneQ1yxU7?usp=sharing
Next, we extended the experiment to all coins.
- Conditions
  - All coin data used
  - No data preprocessing
  - Log normalization
  - 1 LSTM layer
The previous model structure was kept; only the data set changed, from coin 9 alone to all coins regardless of coin index. -> The expectation was that, rather than per-coin pattern elements, the price movements would simply split into a few common patterns independent of the coin. -> Training failed outright, and we concluded that the coin prices have no seasonal component at all, so LSTM regression on them is meaningless.
Figure. LSTM prediction with all coin data
Colab link : https://colab.research.google.com/drive/1blDNKqxy6GvTkR-rq8pjn9eUL0IUpShi?usp=sharing
Next, judging that the all-coin regression failed to train because of samples that deviate severely from any regular pattern, we retried training after excluding samples whose y-series min-max range exceeds a given threshold (outlier criterion).
- Outlier removal python code
```python
def outlier_detecter(raw_y_arr, outlier_criteria=0.03):
    open_arr = raw_y_arr[:, :, 1]  # open col is index 1
    outlier_list = []
    openrange_list = []
    for idx, temp_arr in enumerate(open_arr):
        temp_min = temp_arr.min()
        temp_max = temp_arr.max()
        temp_arr_range = temp_max - temp_min
        openrange_list.append(temp_arr_range)
        if temp_arr_range > outlier_criteria:
            outlier_list.append(idx)
            print(f'open series {idx} is an outlier sample!')
            print(f'temp array range is {temp_arr_range:.3}\n')
    return outlier_list, np.array(openrange_list)
```
Figure. outlier remove & LSTM prediction
Instead of removing the outliers, we also tried stratifying the y series to be predicted into discrete levels, hoping the model could learn a handful of patterns, but this failed as well. -> Applied this way, the problem seems better converted into a classification task.
- KBinsDiscretizer python code
```python
def kbin_discretizer(input_array):
    kb = KBinsDiscretizer(n_bins=10, strategy='uniform', encode='ordinal')
    processed_data = np.zeros((input_array.shape[0], input_array.shape[1], 1))
    for i in range(input_array.shape[0]):
        # fit and transform the open column (index 1) of each sample separately
        open_series = input_array[i, :, 1].reshape(input_array.shape[1], 1)
        kb.fit(open_series)
        processed_data[i, :, :] = kb.transform(open_series)
    return processed_data
```
We then suspected that the data sequence is simply too long for the plain LSTM approach to learn (LSTM time-sequence length = 1380), so we drove a model in which Conv1d extracts features over split windows and feeds them into an LSTM, hoping this would make training feasible.
- Conv1d-LSTM modeling code
```python
class CNN_LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, n_layers):
        super(CNN_LSTM, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        self.num_layers = n_layers
        self.conv1 = nn.Conv1d(input_dim, hidden_dim, kernel_size=10)
        self.pooling1 = nn.MaxPool1d(2, stride=5)
        self.conv2 = nn.Conv1d(hidden_dim, hidden_dim // 2, kernel_size=5)
        self.pooling2 = nn.MaxPool1d(4, stride=4)
        self.norm = nn.BatchNorm1d(32)
        self.lstm = nn.LSTM(32, 128, self.num_layers, batch_first=True, bidirectional=True)
        self.linear = nn.Linear(256, output_dim)
        self.flatten = nn.Flatten()

    def init_hidden(self, batch_size):
        return (torch.zeros(self.num_layers, batch_size, self.hidden_dim),
                torch.zeros(self.num_layers, batch_size, self.hidden_dim))

    def forward(self, X):
        # input X is (batch, feature_dim, time_steps)
        output = F.relu(self.conv1(X))
        output = self.pooling1(output)
        output = F.relu(self.conv2(output))
        output = self.pooling2(output)
        # reshape the conv features to (batch, seq_len, 32) for the LSTM
        output, self.hidden = self.lstm(output.reshape(X.size(0), -1, 32))
        y_pred = self.linear(output[:, -1, :])
        return y_pred
```
Figure. Conv1d-LSTM prediction
- The problem is not normalization or smoothing: the data has no periodicity in the first place, so regressing the data sample by sample was the wrong direction.
- RNN-family models such as LSTM learn patterns, so in a multistep (rather than one-step) setting they end up emitting nearly identical outputs.
- To make this problem learn specific patterns, one option is to discretize it and approach it as a classification problem (to examine in season 3)
- To keep it as a regression problem, we should, like the usual time-series forecasting models (ARIMA or Prophet), train one step at a time on the open data series within a sample and loop that prediction over the target length (120 min), as in the sketch below (to examine in season 3)
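As a sketch of that last idea: a recursive one-step loop where each prediction is appended to the history and fed back in. Here `one_step_model` is a hypothetical regressor with a scikit-learn-style predict(), and the window length is an arbitrary choice.

```python
import numpy as np

def recursive_forecast(one_step_model, x_series, horizon=120, window=60):
    history = list(x_series)
    preds = []
    for _ in range(horizon):
        x_in = np.array(history[-window:]).reshape(1, -1)  # last `window` prices
        next_val = float(one_step_model.predict(x_in)[0])  # one step ahead
        preds.append(next_val)
        history.append(next_val)  # feed the prediction back as the next input
    return np.array(preds)
```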
- ARIMA Experiment code : here
- Neural Prophet codes
- PyTorch Study
- PyTorch Colabs
  - LSTM & smoothing : https://colab.research.google.com/drive/1uXBoRAMEza3Q0MRIrY33FKDJmN3lSKsi?usp=sharing
  - Conv1d-LSTM : https://colab.research.google.com/drive/1UfPfdf6WSuYl4JYR2lMgdqRq7rAW8qIz
  - LSTM & outlier remover : https://colab.research.google.com/drive/1lnj7t92-yEGE-U4NngMSIyu72pgvSQ34?usp=sharing
  - LSTM & log normal : https://colab.research.google.com/drive/1blDNKqxy6GvTkR-rq8pjn9eUL0IUpShi?usp=sharing

※ Links may stop working if paths change, so the folder link is provided as well. Colab notebook folder : https://drive.google.com/drive/folders/1UNQQqKb_b2bhm7vpyjj_WZbtko4LFuBY?usp=sharing
Coin investing simulator code : here
- Convert the problem into classifying the best (peak) pattern and train on that
- Label the point of the highest open price within the y interval
- The model classifies the peak label from the 1,380-minute input pattern
- PyTorch Conv1d + bidirectional LSTM
- Like standard time-series forecasting models (ARIMA or Prophet), train one step at a time on the open data series within each sample, then loop over the target length (120 min)
- Apply smoothing, fractional differencing, and log normalization
- Re-apply the moving average
- Remove outlier data samples whose features seem impossible to classify
- PyTorch Conv1d + bidirectional LSTM
- Residual modeling
In time-series analysis it is common to build models that, instead of predicting the next value, predict how the value changes at the next timestep. Likewise, in deep learning, "residual networks" (ResNets) denote architectures in which each layer adds onto the model's accumulated result. This exploits the fact that the change should be small.
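A minimal PyTorch sketch of this residual idea (illustrative only, not the project's code): wrap any forecasting module so it predicts per-step changes that are added back onto the last observed value.

```python
import torch
import torch.nn as nn

class ResidualWrapper(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model  # any module mapping (batch, time, features) -> (batch, horizon)

    def forward(self, x):
        delta = self.model(x)            # predicted changes
        last = x[:, -1, 0].unsqueeze(1)  # last observed price; feature 0 assumed to be price
        return last + delta              # forecast = last observed value + predicted change
```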
- Golden cross strategy
If the model predicts that a golden cross will occur within the forecast interval, buy at minute 1381 (label 1); otherwise pass (label 0). Selling everything before the dead cross occurs after buying could make a stable strategy. -> Unclear whether this is actually feasible.
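A sketch of how the golden-cross signal could be computed on a predicted series, assuming pandas rolling means; the window lengths 5 and 20 are arbitrary placeholders.

```python
import pandas as pd

def has_golden_cross(pred_series, short=5, long=20):
    s = pd.Series(pred_series)
    short_ma = s.rolling(short).mean()
    long_ma = s.rolling(long).mean()
    # cross: short MA was at or below the long MA, and is now above it
    crossed = (short_ma.shift(1) <= long_ma.shift(1)) & (short_ma > long_ma)
    return bool(crossed.any())  # True -> buy at minute 1381, False -> pass
```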
- Simple classification
Label the peak y value not over all 120 steps but over a compressed time-length. Also, when treating it as classification, do not buy unconditionally; put a cap on the predicted probability and buy only the high-likelihood cases. (A sketch combining both ideas follows.)
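All names here are illustrative: compress the peak position into coarse buckets for labeling, and at inference only act when the classifier's probability clears a cap.

```python
import numpy as np

def make_peak_labels(train_y_array, n_buckets=12, open_col=1):
    """Bucket the position of the 120-min open-price peak into coarse labels."""
    peak_idx = train_y_array[:, :, open_col].argmax(axis=1)  # values 0..119
    return peak_idx // (120 // n_buckets)                    # values 0..n_buckets-1

def decide_to_buy(class_probs, prob_cap=0.7):
    """Buy only when the most likely bucket clears the confidence cap."""
    best = int(np.argmax(class_probs))
    return best, class_probs[best] >= prob_cap
```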
- Conv1d
A 1D CNN may use a larger filter size, and likewise a larger window size, than is typical; filter sizes of 7 or 9 are commonly chosen.
- Time-Series Forecasting: NeuralProphet vs AutoML : https://towardsdatascience.com/time-series-forecasting-neuralprophet-vs-automl-fa4dfb2c3a9e
- Techniques to Handle Very Long Sequences with LSTMs : https://machinelearningmastery.com/handle-long-sequences-long-short-term-memory-recurrent-neural-networks/
  "A reasonable limit of 250-500 time steps is often used in practice with large LSTM models."
- Neural Prophet baseline : https://dacon.io/codeshare/2492
- Minute-candle data preprocessing and linear interpolation : https://dacon.io/competitions/official/235720/codeshare/2499?page=1&dtype=recent
  Method: extract the trend, apply interpolation, then feed the result back.
- Explanation of how ARIMA works (Korean) : https://youngjunyi.github.io/analytics/2020/02/27/forecasting-in-marketing-arima.html
- Facebook Prophet : https://facebook.github.io/prophet/docs/quick_start.html#python-api
  Meaning of changepoint_range: setting it to 100% would likely overfit. "By default changepoints are only inferred for the first 80% of the time series in order to have plenty of runway for projecting the trend forward and to avoid overfitting fluctuations at the end of the time series. This default works in many situations but not all, and can be changed using the changepoint_range argument."
  Meaning of changepoint_prior_scale: roughly, how strongly trend changes (e.g., around outliers) are weighted. "If the trend changes are being overfit (too much flexibility) or underfit (not enough flexibility), you can adjust the strength of the sparse prior using the input argument changepoint_prior_scale. By default, this parameter is set to 0.05."
- Cryptocurrency price prediction using LSTMs | TensorFlow for Hackers (Part III) : https://towardsdatascience.com/cryptocurrency-price-prediction-using-lstms-tensorflow-for-hackers-part-iii-264fcdbccd3f
- TensorFlow time-series forecasting tutorial : https://www.tensorflow.org/tutorials/structured_data/time_series?hl=ko
- Introduction to the Prophet forecasting package (Korean) : https://hyperconnect.github.io/2020/03/09/prophet-package.html
- Fourier order meaning in Prophet : https://medium.com/analytics-vidhya/how-does-prophet-work-part-2-c47a6ceac511
  m.add_seasonality(name='first_seasonality', period=1/24, fourier_order=7) splits a day into 24, i.e., an hourly seasonality; m.add_seasonality(name='second_seasonality', period=1/6, fourier_order=15) splits a day into 6, i.e., a seasonality every 4 hours.
- [ML with Python] 4. Binning/discretization & interactions/polynomials (Korean) : https://jhryu1208.github.io/data/2021/01/11/ML_segmentation/
- A Simple LSTM-Based Time-Series Classifier : https://www.kaggle.com/purplejester/a-simple-lstm-based-time-series-classifier
- PyTorch RNN blog post (Korean) : https://seducinghyeok.tistory.com/8
- [PyTorch] Deep Time Series Classification : https://www.kaggle.com/purplejester/pytorch-deep-time-series-classification/notebook
- Introduction to Deep Learning with PyTorch, wikidocs (Korean) : https://wikidocs.net/64703
- scikit-learn KBinsDiscretizer docs : https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KBinsDiscretizer.html
- Implementing CNNs in PyTorch, blog (Korean) : https://justkode.kr/deep-learning/pytorch-cnn
- Stock price direction prediction using CNNs (Korean) : https://direction-f.tistory.com/19
- Bitcoin Time Series Prediction with LSTM : https://www.kaggle.com/jphoon/bitcoin-time-series-prediction-with-lstm
- Season 1, CNN model team (DACON) : https://dacon.io/competitions/official/235740/codeshare/2486?page=1&dtype=recent
- A Gentle Introduction to Exponential Smoothing for Time Series Forecasting in Python : https://machinelearningmastery.com/exponential-smoothing-for-time-series-forecasting-in-python/