diff --git a/Raytune/Multi_HRHN_SML2010_RAYTUNE-TPU.ipynb b/Raytune/Multi_HRHN_SML2010_RAYTUNE-TPU.ipynb
index 9a6b7d8..4c44b63 100644
--- a/Raytune/Multi_HRHN_SML2010_RAYTUNE-TPU.ipynb
+++ b/Raytune/Multi_HRHN_SML2010_RAYTUNE-TPU.ipynb
@@ -535,31 +535,42 @@
 "execution_count": null,
 "outputs": []
 },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "9ZTF58ZnmpCo"
+ },
+ "source": [
+ ""
+ ],
+ "execution_count": null,
+ "outputs": []
+ },
 {
 "cell_type": "markdown",
 "metadata": {
 "id": "_KvuPUL1nXpX"
 },
 "source": [
- "# Création du modèle DSTP-RNN"
+ "# Création du modèle HRHN"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "c0lDwKjfzzPh"
+ "id": "LBvAzS8KtFlp"
 },
 "source": [
- "Le modèle DSTP-RNN implanté est le suivant :"
+ "Le modèle HRHN est décrit dans ce document de recherche : [Hierarchical Attention-Based Recurrent Highway Networks for Time Series Prediction](https://arxiv.org/pdf/1806.00685)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "pFTrTbJ5zuqj"
+ "id": "gKq0mqg2ts2w"
 },
 "source": [
- ""
+ ""
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
 "id": "IrCCnwy8nXp4"
 },
 "source": [
- "**1. Création de la couche d'attention spatiale de l'étage n°1 / Phase 1**"
+ "**1. Création de l'encodeur**"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "mmThmaJnoyEF"
+ "id": "9tn5xnfnuehY"
 },
 "source": [
- "On commence par créer la couche permettant de calculer le score. Cette fonction calcule le score de l'encodeur, c'est-à-dire le score à attribuer à chaque série d'entrée. \n",
- "Cette fonction est appellée par l'encodeur à l'aide de la méthode TimeDistribued de Keras, pour chaque série d'entrée."
+ "L'encodeur a pour but de créer des représentations cachées des séries exogènes qui prennent en compte les relations spatiales entre ces séries ainsi que les relations temporelles. \n",
+ "Les relations spatiales sont extraites à l'aide d'un ensemble de réseaux de convolution qui produisent des représentations w1, w2... w(T-1). \n",
+ "Ces représentations sont ensuite codées par un réseau RHN à 3 couches afin d'en extraire les relations temporelles. En sortie de ce réseau RHN, on extrait 3 tenseurs dont chacun contient les (T-1) états cachés de chaque couche du réseau RHN."
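+ ,"\n",
+ "À titre d'illustration (ajout de l'éditeur, hors notebook d'origine, avec des dimensions purement hypothétiques), la cellule suivante esquisse une seule branche CNN-1D + Max-pooling appliquée à une coupe temporelle des séries exogènes, pour visualiser la forme d'un motif w_t :"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {},
+ "source": [
+ "# Esquisse minimale (hypothétique) d'une branche de convolution de l'encodeur.\n",
+ "# Hypothèses : batch_size=32, #dim=16 séries exogènes, 16 filtres, noyau=3, pooling=3.\n",
+ "import tensorflow as tf\n",
+ "\n",
+ "x_t = tf.random.normal((32, 16, 1)) # coupe temporelle t : (batch_size,#dim,1)\n",
+ "y = tf.keras.layers.Conv1D(filters=16, kernel_size=3, activation='relu', padding='same')(x_t)\n",
+ "y = tf.keras.layers.MaxPool1D(pool_size=3, padding='same')(y) # (32,6,16)\n",
+ "print(y.shape) # motif spatial w_t, aplati ensuite puis encodé temporellement par le RHN"
+ ],
+ "execution_count": null,
+ "outputs": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Revenons maintenant au détail de l'implémentation."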
] }, { "cell_type": "markdown", "metadata": { - "id": "Rx3ac1_G00Ga" + "id": "I0mq5BSauQUq" }, "source": [ - "" + "" ] }, { - "cell_type": "code", + "cell_type": "markdown", "metadata": { - "id": "EiCdmR0dnXp4" + "id": "IJ-boowSGDp3" }, "source": [ - "class CalculScores_Encodeur_Phase1(tf.keras.layers.Layer):\n", - " def __init__(self, dim_LSTM):\n", - " self.dim_LSTM = dim_LSTM\n", - " super().__init__() # Appel du __init__() de la classe Layer\n", - " \n", - " def build(self,input_shape):\n", - " self.Wf = self.add_weight(shape=(input_shape[1],2*self.dim_LSTM),initializer=\"normal\",name=\"Wf\") # (Tin, 2x#LSTM)\n", - " self.Uf = self.add_weight(shape=(input_shape[1],input_shape[1]),initializer=\"normal\",name=\"Uf\") # (Tin, Tin)\n", - " self.bf = self.add_weight(shape=(input_shape[1],1),initializer=\"normal\",name=\"bf\") # (Tin, 1)\n", - " self.vf = self.add_weight(shape=(input_shape[1],1),initializer=\"normal\",name=\"vf\") # (Tin, 1)\n", - " super().build(input_shape) # Appel de la méthode build()\n", - " \n", - " def compute_output_shape(self, input_shape):\n", - " return (input_shape[0], 1)\n", - "\n", - " # hidd_state: hidden_state : (batch_size,#LSTM)\n", - " # cell_state: Cell state : (batch_size,#LSTM)]\n", - " def SetStates(self,hidd_state, cell_state):\n", - " self.hidd_state = hidd_state\n", - " self.cell_state = cell_state\n", - "\n", - " # Entrées :\n", - " # input: Entrées X : (batch_size,Tin,1)\n", - " # Sorties :\n", - " # score: Score : (batch_size,1,1)\n", - " def call(self, input):\n", - " if self.hidd_state is not None:\n", - " hs = tf.keras.layers.concatenate([self.hidd_state,self.cell_state],axis=1) # (batch_size,2x#LSTM)\n", - " hs = tf.expand_dims(hs,-1) # (batch_size,2x#LSTM) => (batch_size,2#LSTM,1)\n", - " e = tf.matmul(self.Wf,hs) # (Tin,2x#LSTM)x(batch_size,2x#LSTM,1) = (batch_size,Tin,1)\n", - " e = e + tf.matmul(self.Uf,input) # (Tin,Tin)x(batch_size,Tin,1) = (batch_size,Tin,1)\n", - " e = e + self.bf # (batch_size,Tin,1)\n", - " else:\n", - " e = tf.matmul(self.Uf,input) # (Tin,Tin)x(batch_size,Tin,1) = (batch_size,Tin,1)\n", - " e = e + self.bf # (batch_size,Tin,1)\n", - " e = K.tanh(e)\n", - " score = tf.matmul(tf.transpose(self.vf),e) # (1,Tin)x(batch_size,Tin,1) = (batch_size,1,1)\n", - " return tf.squeeze(score,-1) # (batch_size,1)" - ], - "execution_count": null, - "outputs": [] + "***a. 
Création des CNN parallélisés***"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
 "id": "s6dF5Cp9vzWB"
 },
 "source": [
 "La structure d'un réseau de convolution est composée de trois couches CNN-1D + Max-pooling :"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
 "id": "jga5_ZzKv6CI"
 },
 "source": [
 ""
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "SM8tVDxynXp4"
+ "id": "0CMXl2tgwJ76"
 },
 "source": [
- "Puis maintenant la couche d'attention :"
+ "L'intégration de chaque réseau dans Keras est parallélisée :"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "IMlE2tNF1XRB"
+ "id": "gtR-u7ZqwjTL"
 },
 "source": [
- ""
+ ""
 ]
 },
 {
 "cell_type": "code",
 "metadata": {
- "id": "XX8n_qKEnXp4"
+ "id": "5ETqqvAIGKLi"
 },
 "source": [
- "class Encodeur_Phase1(tf.keras.layers.Layer):\n",
- " def __init__(self, dim_LSTM, regul=0.0, drop=0.0):\n",
- " self.regul = regul\n",
- " self.dim_LSTM = dim_LSTM # Dimension des vecteurs cachés\n",
- " self.drop = drop\n",
+ "# Arguments de la méthode __init__\n",
+ "# dim_filtres_cnn : liste dimension des filtres ex: [3,3,3]\n",
+ "# nbr_filtres_cnn : liste nbr de filtres sur chaque couche ex: [16,32,64]\n",
+ "# dim_max_pooling : liste dimension max pooling après chaque couche ex: [3,3,3]\n",
+ "\n",
+ "class Encodeur_CNN(tf.keras.layers.Layer):\n",
+ " def __init__(self, dim_filtres_cnn, nbr_filtres_cnn, dim_max_pooling, dim_motif):\n",
+ " self.dim_filtres_cnn = dim_filtres_cnn\n",
+ " self.nbr_filtres_cnn = nbr_filtres_cnn\n",
+ " self.dim_max_pooling = dim_max_pooling\n",
+ " self.dim_motif = dim_motif\n",
 " super().__init__() # Appel du __init__() de la classe Layer\n",
 " \n",
+ " # Création de Tin réseaux de convolution + max_pooling en //\n",
+ " ############################################################\n",
 " def build(self,input_shape):\n",
- " self.couche_LSTM = tf.keras.layers.LSTM(self.dim_LSTM,kernel_regularizer=tf.keras.regularizers.l2(self.regul),return_sequences=False,return_state=True,dropout=self.drop,recurrent_dropout=self.drop, name=\"LSTM_Encodeur\")\n",
- " self.CalculScores_Encodeur_Phase1 = CalculScores_Encodeur_Phase1(dim_LSTM=self.dim_LSTM)\n",
- " super().build(input_shape) # Appel de la méthode build()\n",
+ " convs = []\n",
+ " input_cnns = []\n",
+ "\n",
+ " # Création des Tin entrées des réseaux CNN\n",
+ " for i in range(input_shape[1]):\n",
+ " input_cnns.append(tf.keras.Input(shape=(input_shape[2],1))) # input = Tin*(batch_size,#dim,1)\n",
+ "\n",
+ " # Création des Tin réseaux CNN\n",
+ " for i in range(input_shape[1]):\n",
+ " conv = tf.keras.layers.Conv1D(filters=self.nbr_filtres_cnn[0], # conv : (batch_size,#dim,16)\n",
+ " kernel_size=self.dim_filtres_cnn[0],\n",
+ " activation='relu',\n",
+ " padding='same',\n",
+ " strides=1)(input_cnns[i])\n",
+ " conv = tf.keras.layers.MaxPool1D(pool_size=self.dim_max_pooling[0], # conv : (batch_size,#pooling1,16)\n",
+ " padding='same')(conv)\n",
+ " for n in range(1,len(self.dim_filtres_cnn)):\n",
+ " conv = tf.keras.layers.Conv1D(filters=self.nbr_filtres_cnn[n], # conv : (batch_size,#pooling_x,dim_filtres_cnn[n])\n",
+ " kernel_size=self.dim_filtres_cnn[n],\n",
+ " activation='relu',\n",
+ " padding='same',\n",
+ " strides=1)(conv)\n",
+ " conv = tf.keras.layers.MaxPool1D(pool_size=self.dim_max_pooling[n], # conv : (batch_size,#pooling_x,dim_filtres_cnn[n])\n",
+ " padding='same')(conv)\n",
+ " convs.append(conv)\n",
+ " \n",
+ " # Création de la sortie concaténée des Tin réseaux CNN\n",
+ " out = tf.convert_to_tensor(convs) # out : 
(Tin,batch_size,#pooling,64)\n", + " out = tf.transpose(out,perm=[1,0,2,3]) # out : (batch_size,Tin,#pooling,64)\n", + " out = tf.keras.layers.Reshape( # out : (batch_size,Tin,#pooling*64)\n", + " target_shape=(out.shape[1],out.shape[2]*out.shape[3]))(out)\n", + "\n", + " if self.dim_motif == 0:\n", + " out = tf.keras.layers.Dense(units=out.shape[2])(out) # out : (batch_size,Tin,dim_motif = #pooling*64) \n", + " else:\n", + " out = tf.keras.layers.Dense(units=self.dim_motif)(out) # out : (batch_size,Tin,dim_motif) \n", + "\n", + " # Création du modèle global\n", + " self.conv_model = tf.keras.Model(inputs=input_cnns,outputs=out)\n", "\n", + " super().build(input_shape) # Appel de la méthode build()\n", + " \n", " # Entrées :\n", - " # input: Entrées X : (batch_size,Tin,#dim)\n", - " # hidd_state: hidden_state : (batch_size,#LSTM)\n", - " # cell_state: Cell state : (batch_size,#LSTM)]\n", - " # index: index série : (1)\n", + " # input: Entrée séries exogènes : (batch_size,Tin,#dim)\n", " # Sorties :\n", - " # out_hid : Sortie vecteur caché : (batch_size,#LSTM)\n", - " # out_cell: Sortie cell state : (btach_size,#LSTM)\n", - " # x_tilda : Coupe temporelle pondérée : (batch_size,1,#dim)\n", - " def call(self, input, hidd_state, cell_state, index):\n", - " # Calcul des scores\n", - " input_TD = tf.transpose(input,perm=[0,2,1]) # (batch_size,Tin,#dim) => (batch_size,#dim,Tin)\n", - " input_TD = tf.expand_dims(input_TD,axis=-1) # (batch_size,#dim,Tin) => (batch_size,#dim,Tin,1)\n", - " self.CalculScores_Encodeur_Phase1.SetStates(hidd_state,cell_state)\n", - " a = tf.keras.layers.TimeDistributed(\n", - " self.CalculScores_Encodeur_Phase1)(input_TD) # (batch_size,#dim,Tin,1) : Timestep=#dim\n", - " # (batch_size,Tin,1) envoyé #dim fois en //\n", - " # (batch_size,#dim,1) retourné\n", - " # Normalisation des scores alpha\n", - " a = tf.keras.activations.softmax(a,axis=1) # (batch_size,#dim,1)\n", - "\n", - " # Applique les poids normalisés à la coupe temporelle des séries exogènes\n", - " x_tilda = tf.multiply(tf.expand_dims(input[:,index,:],-1),a) # (batch_size,#dim,1) _x_ (batch_size,#dim,1) = (batch_size,#dim,1)\n", - " x_tilda = tf.transpose(x_tilda,perm=[0,2,1]) # (batch_size,1,#dim)\n", - "\n", - " # Applique x_tilda à la cellule LSTM\n", - " x_tilda = tf.transpose(x_tilda,perm=[0,2,1]) # (batch_size,#dim,1)\n", - " out_dec, out_hid, out_cell = self.couche_LSTM(x_tilda) # out_dec et out_cell : (batch_size,#LSTM)\n", - " x_tilda = tf.transpose(x_tilda,perm=[0,2,1]) # (batch_size,1,#dim)\n", - "\n", - " return out_hid, out_cell, x_tilda\n" + " # w: Sorties des motifs CNN : (batch_size,Tin,#dim_motif)\n", + " # (taille dernier filtre=64)\n", + " def call(self, input):\n", + " # Coupes temporelles sur les séries exogènes\n", + " # au format : Tin*(batch_size,#dim,1)\n", + " input_list = []\n", + " for i in range(input.shape[1]):\n", + " input_list.append(tf.transpose(input[:,i:i+1,:],perm=[0,2,1])) # (batch_size,#dim,1)\n", + " # Convolutions spatiales des séries exogènes\n", + " w = self.conv_model(input_list) # (batch_size,Tin,dim_motif)\n", + " return w" ], "execution_count": null, "outputs": [] @@ -712,83 +732,164 @@ { "cell_type": "markdown", "metadata": { - "id": "dI5YwkzX84Vi" + "id": "0_y68mmiRpR1" }, "source": [ - "**2. Création de la couche d'attention spatiale de l'étage n°1 / Phase 2**" + "***b. 
Création des cellules RHN***"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "SN79pZ349FR6"
+ "id": "2FQ47zOsxHpx"
 },
 "source": [
- ""
+ ""
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "H6pu4agmAtqB"
+ "id": "lvNrS-wQ1B5C"
 },
 "source": [
- "On commence par créer le calcul du score :"
+ "On crée une cellule RHN en reprenant le code précédent, auquel on apporte deux modifications : \n",
+ "- On ajoute la possibilité de retourner tous les états cachés de chaque couche\n",
+ "- On ajoute la prise en compte de la dimension d'entrée, correspondant à la dimension des motifs en sortie des réseaux CNN (dim_motif)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {
- "id": "d5pA1z8N-koF"
+ "id": "CygD9DXbBTDJ"
 },
 "source": [
- ""
+ ""
 ]
 },
 {
 "cell_type": "code",
 "metadata": {
- "id": "r-IiIBuJ-SKi"
+ "id": "9HldCwl9z3fY"
 },
 "source": [
- "class CalculScores_Encodeur_Phase2(tf.keras.layers.Layer):\n",
- " def __init__(self, dim_LSTM):\n",
- " self.dim_LSTM = dim_LSTM\n",
+ "class Cellule_RHN(tf.keras.layers.Layer):\n",
+ " def __init__(self, dim_RHN, nbr_couches, return_all_states = False, dim_input=1):\n",
+ " self.dim_RHN = dim_RHN\n",
+ " self.nbr_couches = nbr_couches\n",
+ " self.dim_input = dim_input\n",
+ " self.return_all_states = return_all_states\n",
 " super().__init__() # Appel du __init__() de la classe Layer\n",
 " \n",
 " def build(self,input_shape):\n",
- " self.Ws = self.add_weight(shape=(input_shape[1],2*self.dim_LSTM),initializer=\"normal\",name=\"Ws\") # (Tin, 2x#LSTM)\n",
- " self.Us = self.add_weight(shape=(input_shape[1],input_shape[1]),initializer=\"normal\",name=\"Us\") # (Tin, Tin)\n",
- " self.bs = self.add_weight(shape=(input_shape[1],1),initializer=\"normal\",name=\"bs\") # (Tin, 1)\n",
- " self.vs = self.add_weight(shape=(input_shape[1],1),initializer=\"normal\",name=\"vs\") # (Tin, 1)\n",
+ " self.Wh = self.add_weight(shape=(input_shape[2],self.dim_RHN),initializer=\"normal\",name=\"Wh\") # (#dim, #RHN)\n",
+ " self.Wt = self.add_weight(shape=(input_shape[2],self.dim_RHN),initializer=\"normal\",name=\"Wt\") # (#dim, #RHN)\n",
+ " self.Wc = self.add_weight(shape=(input_shape[2],self.dim_RHN),initializer=\"normal\",name=\"Wc\") # (#dim, #RHN)\n",
+ "\n",
+ " self.Rh = self.add_weight(shape=(self.nbr_couches,self.dim_RHN,self.dim_RHN),initializer=\"normal\",name=\"Rh\") # (n_couches,#RHN, #RHN)\n",
+ " self.Rt = self.add_weight(shape=(self.nbr_couches,self.dim_RHN,self.dim_RHN),initializer=\"normal\",name=\"Rt\") # (n_couches,#RHN, #RHN)\n",
+ " self.Rc = self.add_weight(shape=(self.nbr_couches,self.dim_RHN,self.dim_RHN),initializer=\"normal\",name=\"Rc\") # (n_couches,#RHN, #RHN)\n",
+ "\n",
+ " self.bh = self.add_weight(shape=(self.nbr_couches,self.dim_RHN,1),initializer=\"normal\",name=\"bh\") # (n_couches,#RHN, 1)\n",
+ " self.bt = self.add_weight(shape=(self.nbr_couches,self.dim_RHN,1),initializer=\"normal\",name=\"bt\") # (n_couches,#RHN, 1)\n",
+ " self.bc = self.add_weight(shape=(self.nbr_couches,self.dim_RHN,1),initializer=\"normal\",name=\"bc\") # (n_couches,#RHN, 1)\n",
+ "\n",
 " super().build(input_shape) # Appel de la méthode build()\n",
- " \n",
- " def compute_output_shape(self, input_shape):\n",
- " return (input_shape[0], 1)\n",
 "\n",
- " # hidd_state: hidden_state : (batch_size,#LSTM)\n",
- " # cell_state: Cell state : (batch_size,#LSTM)]\n",
- " def SetStates(self,hidd_state, cell_state):\n",
- " self.hidd_state = hidd_state\n",
- " self.cell_state = cell_state\n",
+ " # Initialisation des masques de dropout\n",
+ " def InitMasquesDropout(self,drop=0.0):\n",
+ " self.Wh_ = 
tf.convert_to_tensor(np.random.binomial(n=1,p=1.0-drop,size=(self.dim_input,1)),dtype=tf.float32) # (#dim,1)\n", + " self.Wt_ = tf.convert_to_tensor(np.random.binomial(n=1,p=1.0-drop,size=(self.dim_input,1)),dtype=tf.float32) # (#dim,1)\n", + " self.Wc_ = tf.convert_to_tensor(np.random.binomial(n=1,p=1.0-drop,size=(self.dim_input,1)),dtype=tf.float32) # (#dim,1)\n", + " self.Rh_ = tf.convert_to_tensor(np.random.binomial(n=1,p=1.0-drop,size=(self.nbr_couches,self.dim_RHN,1)),dtype=tf.float32) # (n_couches,#RHN,1)\n", + " self.Rt_ = tf.convert_to_tensor(np.random.binomial(n=1,p=1.0-drop,size=(self.nbr_couches,self.dim_RHN,1)),dtype=tf.float32) # (n_couches,#RHN,1)\n", + " self.Rc_ = tf.convert_to_tensor(np.random.binomial(n=1,p=1.0-drop,size=(self.nbr_couches,self.dim_RHN,1)),dtype=tf.float32) # (n_couches,#RHN,1)\n", "\n", " # Entrées :\n", - " # input: Entrées Z : (batch_size,Tin,1)\n", + " # input: Entrées X[t] : (batch_size,1,#dim)\n", + " # init_hidden: Etat caché Init. : (batch_size,#RHN)\n", " # Sorties :\n", - " # score: Score : (batch_size,1,1)\n", - " def call(self, input):\n", - " if self.hidd_state is not None:\n", - " hs = tf.keras.layers.concatenate([self.hidd_state,self.cell_state],axis=1) # (batch_size,2x#LSTM)\n", - " hs = tf.expand_dims(hs,-1) # (batch_size,2x#LSTM) => (batch_size,2#LSTM,1)\n", - " e = tf.matmul(self.Ws,hs) # (Tin,2x#LSTM)x(batch_size,2x#LSTM,1) = (batch_size,Tin,1)\n", - " e = e + tf.matmul(self.Us,input) # (Tin,Tin)x(batch_size,Tin,1) = (batch_size,Tin,1)\n", - " e = e + self.bs # (batch_size,Tin,1)\n", + " # sL: Etat caché de la dernière couche : (batch_size,#RHN) \n", + " # ou Etats cachés de chaque couche SL[t] : (batch_size,nbr_couches,#RHN)\n", + " def call(self, input, init_hidden=None):\n", + " # Construction d'un vecteur d'état nul si besoin\n", + " if init_hidden == None:\n", + " init_hidden = tf.matmul(tf.zeros(shape=(self.dim_RHN,input.shape[2])), # (#RHN,#dim)X(batch_size,#dim,1) = (batch_size,#RHN,1)\n", + " tf.transpose(input,perm=[0,2,1]))\n", + " init_hidden = tf.squeeze(init_hidden,-1) # (batch_size,#RHN,1) => (batch_size,#RHN)\n", + " \n", + " liste_sl = [] # Liste pour enregistrer les états cachés de chaque couche\n", + " # Calcul de hl, tl et cl\n", + " for i in range(self.nbr_couches):\n", + " if i==0:\n", + " # Applique le masque aux poids\n", + " Rh = tf.multiply(self.Rh_[0,:,:],self.Rh[0,:,:]) # (#RHN,1)_x_(#RHN,#RHN) = (#RHN,#RHN)\n", + " Rt = tf.multiply(self.Rt_[0,:,:],self.Rt[0,:,:])\n", + " Rc = tf.multiply(self.Rc_[0,:,:],self.Rc[0,:,:])\n", + "\n", + " Wh = tf.multiply(self.Wh_,self.Wh) # (#dim,1)_x_(#dim,#RHN) = (#dim,#RHN)\n", + " Wt = tf.multiply(self.Wt_,self.Wt)\n", + " Wc = tf.multiply(self.Wc_,self.Wc)\n", + " \n", + " # Calcul de hl\n", + " hl = tf.matmul(Rh,tf.expand_dims(init_hidden,-1)) # (#RHN,#RHN)X(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " hl = hl + self.bh[0,:,:] # (batch_size,#RHN,1) + (#RHN,1) = (batch_size,#RHN,1)\n", + " hl = hl + tf.matmul(tf.transpose(Wh),\n", + " tf.transpose(input,perm=[0,2,1])) # (#RHN,#dim)X(batch_size,#dim,1) = (batch_size,#RHN,1)\n", + " hl = tf.squeeze(hl,-1) # (batch_size,#RHN)\n", + " hl = K.tanh(hl)\n", + "\n", + " # Calcul de tl\n", + " tl = tf.matmul(Rt,tf.expand_dims(init_hidden,-1)) # (#RHN,#RHN)X(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " tl = tl + self.bt[0,:,:] # (batch_size,#RHN,1) + (#RHN,1) = (batch_size,#RHN,1)\n", + " tl = tl + tf.matmul(tf.transpose(Wt),\n", + " tf.transpose(input,perm=[0,2,1])) # (#RHN,#dim)X(batch_size,#dim,1) = 
(batch_size,#RHN,1)\n", + " tl = tf.squeeze(tl,-1) # (batch_size,#RHN)\n", + " tl = tf.keras.activations.sigmoid(tl)\n", + "\n", + " # Calcul de cl\n", + " cl = tf.matmul(Rc,tf.expand_dims(init_hidden,-1)) # (#RHN,#RHN)X(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " cl = cl + self.bc[0,:,:] # (batch_size,#RHN,1) + (#RHN,1) = (batch_size,#RHN,1)\n", + " cl = cl + tf.matmul(tf.transpose(Wc),\n", + " tf.transpose(input,perm=[0,2,1])) # (#RHN,#dim)X(batch_size,#dim,1) = (batch_size,#RHN,1)\n", + " cl = tf.squeeze(cl,-1) # (batch_size,#RHN)\n", + " cl = tf.keras.activations.sigmoid(cl)\n", + "\n", + " else:\n", + " # Applique le masque aux poids\n", + " Rh = tf.multiply(self.Rh_[i,:,:],self.Rh[i,:,:])\n", + " Rt = tf.multiply(self.Rt_[i,:,:],self.Rt[i,:,:])\n", + " Rc = tf.multiply(self.Rc_[i,:,:],self.Rc[i,:,:])\n", + "\n", + " # Calcul de hl\n", + " hl = tf.matmul(Rh,tf.expand_dims(init_hidden,-1)) # (#RHN,#RHN)X(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " hl = hl + self.bh[i,:,:] # (batch_size,#RHN,1) + (#RHN,1) = (batch_size,#RHN,1)\n", + " hl = tf.squeeze(hl,-1) # (batch_size,#RHN)\n", + " hl = K.tanh(hl)\n", + "\n", + " # Calcul de tl\n", + " tl = tf.matmul(Rt,tf.expand_dims(init_hidden,-1)) # (#RHN,#RHN)X(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " tl = tl + self.bt[i,:,:] # (batch_size,#RHN,1) + (#RHN,1) = (batch_size,#RHN,1)\n", + " tl = tf.squeeze(tl,-1) # (batch_size,#RHN)\n", + " tl = tf.keras.activations.sigmoid(tl)\n", + "\n", + " # Calcul de cl\n", + " cl = tf.matmul(Rc,tf.expand_dims(init_hidden,-1)) # (#RHN,#RHN)X(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " cl = cl + self.bc[i,:,:] # (batch_size,#RHN,1) + (#RHN,1) = (batch_size,#RHN,1)\n", + " cl = tf.squeeze(cl,-1) # (batch_size,#RHN)\n", + " cl = tf.keras.activations.sigmoid(cl)\n", + " \n", + " # Calcul de sl\n", + " sl = tf.keras.layers.multiply([hl,tl]) # (batch_size,#RHN)\n", + " sl = sl + tf.keras.layers.multiply([init_hidden,cl]) # (batch_size,#RHN)\n", + " liste_sl.append(sl) # Sauvegarde l'état caché de la couche courante\n", + " init_hidden = sl\n", + " if self.return_all_states == False:\n", + " return sl\n", " else:\n", - " e = tf.matmul(self.Us,input) # (Tin,Tin)x(batch_size,Tin,1) = (batch_size,Tin,1)\n", - " e = e + self.bs # (batch_size,Tin,1)\n", - " e = K.tanh(e)\n", - " score = tf.matmul(tf.transpose(self.vs),e) # (1,Tin)x(batch_size,Tin,1) = (batch_size,1,1)\n", - " return tf.squeeze(score,-1) # (batch_size,1)" + " liste_sl = tf.convert_to_tensor(liste_sl) # (nbr_couches,batch_size,#RHN)\n", + " liste_sl = tf.transpose(liste_sl,perm=[1,0,2]) # (batch_size,nbr_couches,#RHN)\n", + " return liste_sl" ], "execution_count": null, "outputs": [] @@ -796,69 +897,81 @@ { "cell_type": "markdown", "metadata": { - "id": "pe10l-HI_Nsb" + "id": "wJY-TY7W55ZF" }, "source": [ - "Puis maintenant la couche d'attention :" + "***c. 
Création de l'encodeur : Convolutions + RHN***" ] }, { "cell_type": "markdown", "metadata": { - "id": "5WmZzmBV_p4D" + "id": "wMT8C8-UVe2O" }, "source": [ - "" + "" ] }, { "cell_type": "code", "metadata": { - "id": "P7elzJ8S_Rrd" - }, - "source": [ - "class Encodeur_Phase2(tf.keras.layers.Layer):\n", - " def __init__(self, dim_LSTM, regul=0.0, drop=0.0):\n", - " self.regul = regul\n", - " self.dim_LSTM = dim_LSTM # Dimension des vecteurs cachés\n", - " self.drop = drop\n", + "id": "HYQu67IfBdel" + }, + "source": [ + "# Arguments de la méthode __init__\n", + "# dim_filtres_cnn : liste dimension des filtres ex: [3,3,3]\n", + "# nbr_filtres_cnn : liste nbr de filtre sur chaque couche ex: [16,32,64]\n", + "# dim_max_pooling : liste dimension max pooling après chaque couche ex: [3,3,3]\n", + "# dim_motif : dimension du motif en sortie du CNN\n", + "# dim_RHN : dimension du vecteur caché RHN\n", + "# nbr_couches_RHN : nombre de couches du RHN\n", + "# dropout : dropout variationnel pour le RHN ex: [0.1]\n", + "\n", + "class Encodeur(tf.keras.layers.Layer):\n", + " def __init__(self, dim_filtres_cnn, nbr_filtres_cnn, dim_max_pooling, dim_motif,dim_RHN,nbr_couches_RHN, dropout=0.0):\n", + " self.dim_filtres_cnn = dim_filtres_cnn\n", + " self.nbr_filtres_cnn = nbr_filtres_cnn\n", + " self.dim_max_pooling = dim_max_pooling\n", + " self.dim_motif = dim_motif\n", + " self.dim_RHN = dim_RHN\n", + " self.nbr_couches_RHN = nbr_couches_RHN\n", + " self.dropout = dropout\n", " super().__init__() # Appel du __init__() de la classe Layer\n", " \n", " def build(self,input_shape):\n", - " self.couche_LSTM = tf.keras.layers.LSTM(self.dim_LSTM,kernel_regularizer=tf.keras.regularizers.l2(self.regul),return_sequences=False,return_state=True,dropout=self.drop,recurrent_dropout=self.drop, name=\"LSTM_Encodeur\")\n", - " self.CalculScores_Encodeur_Phase2 = CalculScores_Encodeur_Phase2(dim_LSTM=self.dim_LSTM)\n", + " self.encodeur_cnn = Encodeur_CNN(dim_filtres_cnn=self.dim_filtres_cnn,nbr_filtres_cnn=self.nbr_filtres_cnn,dim_max_pooling=self.dim_max_pooling,dim_motif=self.dim_motif)\n", + " self.RHN = Cellule_RHN(dim_RHN=self.dim_RHN,nbr_couches=self.nbr_couches_RHN,return_all_states=True,dim_input=self.dim_motif)\n", " super().build(input_shape) # Appel de la méthode build()\n", - "\n", + " \n", " # Entrées :\n", - " # input: Entrées Z : (batch_size,Tin,#dim+1)\n", - " # hidd_state: hidden_state : (batch_size,#LSTM)\n", - " # cell_state: Cell state : (batch_size,#LSTM)]\n", - " # index: index série : (1)\n", + " # input: Entrées X : (batch_size,Tin,#dim)\n", " # Sorties :\n", - " # out_hid : Sortie vecteur caché : (batch_size,#LSTM)\n", - " # out_cell: Sortie cell state : (btach_size,#LSTM)\n", - " def call(self, input, hidd_state, cell_state, index):\n", - " # Calcul des scores\n", - " input_TD = tf.transpose(input,perm=[0,2,1]) # (batch_size,Tin,#dim+1) => (batch_size,#dim+1,Tin)\n", - " input_TD = tf.expand_dims(input_TD,axis=-1) # (batch_size,#dim+1,Tin) => (batch_size,#dim+1,Tin,1)\n", - " self.CalculScores_Encodeur_Phase2.SetStates(hidd_state,cell_state)\n", - " b = tf.keras.layers.TimeDistributed(\n", - " self.CalculScores_Encodeur_Phase2)(input_TD) # (batch_size,#dim+1,Tin,1) : Timestep=#dim+1\n", - " # (batch_size,Tin,1) envoyé #dim+1 fois en //\n", - " # (batch_size,#dim+1,1) retourné\n", - " # Normalisation des scores beta\n", - " b = tf.keras.activations.softmax(b,axis=1) # (batch_size,#dim+1,1)\n", - "\n", - " # Applique les poids normalisés à la série\n", - " z_tilda = 
tf.multiply(tf.expand_dims(input[:,index,:],-1),b) # (batch_size,#dim+1,1) _x_ (batch_size,#dim+1,1) = (batch_size,#dim+1,1)\n", - " z_tilda = tf.transpose(z_tilda,perm=[0,2,1]) # (batch_size,1,#dim+1)\n", - "\n", - " # Applique z_tilda à la cellule LSTM\n", - " z_tilda = tf.transpose(z_tilda,perm=[0,2,1]) # (batch_size,#dim+1,1)\n", - " out_dec, out_hid, out_cell = self.couche_LSTM(z_tilda) # out_dec et out_cell : (batch_size,#LSTM)\n", - " z_tilda = tf.transpose(z_tilda,perm=[0,2,1]) # (batch_size,1,#dim+1)\n", - "\n", - " return out_hid, out_cell\n" + " # hidden_states Vecteurs cachés : (batch_size,nbr_couches,Tin,#RHN)\n", + " def call(self, input):\n", + " # Convolutions spatiales des séries exogènes\n", + " w = self.encodeur_cnn(input) # (batch_size,Tin,dim_motif)\n", + "\n", + " # Encodage des motifs CNN avec les cellules RHN\n", + " sequence = []\n", + " hidden = None\n", + "\n", + " # Initialisation des masques de dropout pour tous les pas de temps\n", + " self.RHN.InitMasquesDropout(self.dropout)\n", + "\n", + " # Applique la cellule RHN à chaque pas de temps\n", + " for i in range(input.shape[1]):\n", + " hidden = self.RHN(w[:,i:i+1,:],hidden) # Envoie (batch_size,1,dim_motif)\n", + " sequence.append(hidden) # Sauve (batch_size,nbr_couches,#RHN)\n", + "\n", + " # Le premier état caché du prochain instant\n", + " # est l'état caché de la dernière couche précédente\n", + " hidden = hidden[:,self.nbr_couches_RHN-1,:] # (batch_size,#RHN)\n", + "\n", + " # Traite le format des vecteurs cachés de l'encodeur\n", + " sequence = tf.convert_to_tensor(sequence) # (Tin,batch_size,nbr_couches,#RHN)\n", + " hidden_states = tf.transpose(sequence,perm=[1,2,0,3]) # (batch_size,nbr_couches,Tin,#RHN) \n", + "\n", + " return hidden_states" ], "execution_count": null, "outputs": [] @@ -866,150 +979,95 @@ { "cell_type": "markdown", "metadata": { - "id": "qLuRur14AWxz" + "id": "__CJ4O7yJne3" }, "source": [ - "**3. Création de la couche d'attention du décodeur**" + "**2. Création du décodeur**" ] }, { "cell_type": "markdown", "metadata": { - "id": "7-_zhkmaDucJ" + "id": "Lt2yWQeaJwNn" }, "source": [ - "" + "Le décodeur prend en entrée et à chaque pas de temps : \n", + "- Le tenseur en sortie de l'encodeur RHN qui contient l'ensemble des vecteurs cachés des différentes couches : (batch_size,Nbr_couches,Tin,#RHN)\n", + "- L'état caché de la dernière couche du décodeur RHN précédent : (batch_size,#RHN)\n", + "- La valeur de la série cible à l'instant courant : (batch_size,1,1)" ] }, { "cell_type": "markdown", "metadata": { - "id": "LyjAVbSbAama" + "id": "DtYLxAoIK8Xn" }, "source": [ - "On commence par créer la couche de calcul du score du décodeur. \n", - "Cette fonction calcule le score du décodeur, c'est-à-dire le score à attribuer à chaque hidden-state en sortie de l'encodeur. \n", - "Cette fonction est appellée par la couche d'attention temporelle du décodeur à l'aide de la méthode TimeDistribued de Keras." + "" ] }, { "cell_type": "markdown", "metadata": { - "id": "FPqAY_fBByOC" + "id": "vjHiRZZbLief" }, "source": [ - "" + "**a. 
Création de la couche d'attention hiérarchique**" ] }, - { - "cell_type": "code", - "metadata": { - "id": "ohWFJ93YCc9i" - }, - "source": [ - "class CalculScores_Decodeur(tf.keras.layers.Layer):\n", - " def __init__(self,dim_LSTM):\n", - " self.dim_LSTM = dim_LSTM # Dimension des vecteurs cachés\n", - " super().__init__() # Appel du __init__() de la classe Layer\n", - " \n", - " def build(self,input_shape):\n", - " self.Wd = self.add_weight(shape=(self.dim_LSTM,2*self.dim_LSTM),initializer=\"normal\",name=\"Wd\") # (#LSTM, 2x#LSTM)\n", - " self.Ud = self.add_weight(shape=(self.dim_LSTM,self.dim_LSTM),initializer=\"normal\",name=\"Ud\") # (#LSTM, #LSTM)\n", - " self.bd = self.add_weight(shape=(self.dim_LSTM,1),initializer=\"normal\",name=\"bd\") # (#LSTM, 1)\n", - " self.vd = self.add_weight(shape=(self.dim_LSTM,1),initializer=\"normal\",name=\"vd\") # (#LSTM, 1)\n", - " super().build(input_shape) # Appel de la méthode build()\n", - "\n", - " def compute_output_shape(self, input_shape):\n", - " return (input_shape[0], 1)\n", - "\n", - "\n", - " # hidd_state: hidden_state : (batch_size,#LSTM)\n", - " # cell_state: Cell state : (batch_size,#LSTM)\n", - " def SetStates(self,hidd_state, cell_state):\n", - " self.hidd_state = hidd_state\n", - " self.cell_state = cell_state\n", - "\n", - " # Entrées :\n", - " # input: Entrée score décodeur : (batch_size,#LSTM)\n", - " # Sorties :\n", - " # score: score : (batch_size,1)\n", - " def call(self,input):\n", - " input = tf.expand_dims(input,-1)\n", - " if self.hidd_state is not None:\n", - " hs = tf.keras.layers.concatenate([self.hidd_state,self.cell_state],axis=1) # (batch_size,2x#LSTM)\n", - " hs = tf.expand_dims(hs,-1) # (batch_size,2x#LSTM) => (batch_size,2#LSTM,1)\n", - " e = tf.matmul(self.Wd,hs) # (#LSTM,2x#LSTM)x(batch_size,2x#LSTM,1) = (batch_size,#LSTM,1)\n", - " e = e + tf.matmul(self.Ud,input) # (#LSTM,#LSTM)x(batch_size,#LSTM,1) = (batch_size,#LSTM,1)\n", - " e = e + self.bd # (batch_size,#LSTM,1)\n", - " else:\n", - " e = tf.matmul(self.Ud,input) # (#LSTM,#LSTM)x(batch_size,#LSTM,1) = (batch_size,#LSTM,1)\n", - " e = e + self.bd # (batch_size,#LSTM,1)\n", - " e = K.tanh(e)\n", - " score = tf.matmul(tf.transpose(self.vd),e) # (1,#LSTM)x(batch_size,#LSTM,1) = (batch_size,1,1)\n", - " score = tf.squeeze(score,-1) # (batch_size,1)\n", - " return score" - ], - "execution_count": null, - "outputs": [] - }, { "cell_type": "markdown", "metadata": { - "id": "4BeYO8p6DZ6b" + "id": "9JX5hGeWNN8w" }, "source": [ - "Puis maintenant la couche d'attention :" + "" ] }, { "cell_type": "markdown", "metadata": { - "id": "diFuNNIpEf6i" + "id": "l9p7ylHmY6gS" }, "source": [ - "" + "On commence par créer la fonction permettant de calculer les scores. Cette fonction sera appelée avec la méthode TimeDistributed de Keras." 
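+ ,"\n",
+ "En reprenant les notations des poids du code ci-dessous (T, U, b, v), la formulation suivante est reconstituée à partir du code et donnée à titre indicatif : le score attribué à l'état caché $h_t$ d'une couche de l'encodeur, étant donné l'état précédent $s$ du décodeur, s'écrit\n",
+ "\n",
+ "$$e_t = v^{\\top} \\tanh(U h_t + T s + b), \\qquad \\alpha_t = \\mathrm{softmax}_t(e_t)$$\n",
+ "\n",
+ "Le sous-vecteur contexte de la couche vaut ensuite $c = \\sum_t \\alpha_t h_t$, calculé plus bas dans la couche d'attention hiérarchique."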
] }, { "cell_type": "code", "metadata": { - "id": "2v5qRdB9DpJC" + "id": "MvaGAb0uY1XL" }, "source": [ - "class CalculAttention_Decodeur(tf.keras.layers.Layer):\n", - " def __init__(self, dim_LSTM):\n", - " self.dim_LSTM = dim_LSTM # Dimension des vecteurs cachés\n", + "class CalculScore(tf.keras.layers.Layer):\n", + " def __init__(self):\n", " super().__init__() # Appel du __init__() de la classe Layer\n", " \n", " def build(self,input_shape):\n", - " self.couche_CalculScores_Decodeur = CalculScores_Decodeur(dim_LSTM=self.dim_LSTM)\n", + " self.T = self.add_weight(shape=(input_shape[1],input_shape[1]),initializer=\"normal\",name=\"T\") # (#RHN, #RHN)\n", + " self.U = self.add_weight(shape=(input_shape[1],input_shape[1]),initializer=\"normal\",name=\"U\") # (#RHN, #RHN)\n", + " self.b = self.add_weight(shape=(input_shape[1],1),initializer=\"normal\",name=\"b\") # (#RHN, 1)\n", + " self.v = self.add_weight(shape=(input_shape[1],1),initializer=\"normal\",name=\"v\") # (#RHN, 1)\n", " super().build(input_shape) # Appel de la méthode build()\n", "\n", - " # hidd_state: hidden_state : (batch_size,#LSTM)\n", - " # cell_state: Cell state : (batch_size,#LSTM)\n", - " def SetStates(self,hidd_state, cell_state):\n", - " self.hidd_state = hidd_state\n", - " self.cell_state = cell_state\n", + " # hid_state: Etat initial RHN : (batch_size,#RHN)\n", + " def SetInitState(self,hid_state):\n", + " self.hid_state = hid_state\n", + "\n", + " def compute_output_shape(self,input_shape):\n", + " return(input_shape[0],1)\n", "\n", " # Entrées :\n", - " # input: Entrées X : (batch_size,Tin,#LSTM)\n", + " # input: 1 sortie encodeur RHN : (batch_size,#RHN)\n", " # Sorties :\n", - " # vect_contexte Vecteur Contexte : (batch_size,1,#LSTM)\n", + " # score: score : (batch_size,1,1)\n", " def call(self, input):\n", - " # Calcul des scores\n", - " self.couche_CalculScores_Decodeur.SetStates(self.hidd_state,self.cell_state)\n", - " g = tf.keras.layers.TimeDistributed(\n", - " self.couche_CalculScores_Decodeur)(input) # (batch_size,Tin,#LSTM) : Timestep=Tin\n", - " # (batch_size,#LSTM) envoyé Tin fois en //\n", - " # (batch_size,Tin,1) retourné\n", - " # Normalisation des scores gama\n", - " g = tf.keras.activations.softmax(g,axis=1) # (batch_size,Tin,1)\n", - "\n", - " # Calcul du vecteur contexte\n", - " C = tf.multiply(input,g) # (batch_size,Tin,#LSTM)_x_(batch_size,Tin,1) = (batch_size,Tin,#LSTM)\n", - " C = K.sum(C,axis=1) # (batch_size,#LSTM)\n", - " C = tf.expand_dims(C,1) # (batch_size,1,#LSTM)\n", - " return C\n" + " score = tf.matmul(self.U,tf.expand_dims(input,-1)) # (#RHN,#RHN)x(batch_size,#RHN,1) = (batch_size,#RHN,1)\n", + " score = score + tf.matmul(self.T,tf.expand_dims(self.hid_state,-1)) # (batch_size,#RHN,1)\n", + " score = score + self.b # (batch_size,#RHN,1)\n", + " score = K.tanh(score)\n", + " score = tf.matmul(tf.transpose(self.v),score) # (1,#RHN)x(batch_size,#RHN,1) = (batch_size,1,1)\n", + " return tf.squeeze(score,-1) # (batch_size,1)" ], "execution_count": null, "outputs": [] @@ -1017,68 +1075,45 @@ { "cell_type": "markdown", "metadata": { - "id": "Xbk9Nbn8EvEx" + "id": "0pF02ysdbxWU" }, "source": [ - "**4. 
Création de la couche de décodeur**" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "L72CzheDExup" - }, - "source": [ - "" + "On crée maintenant la couche d'attention hiérarchique :" ] }, { "cell_type": "code", "metadata": { - "id": "1tJnICP9FStv" + "id": "8kxnR9fSVXDC" }, "source": [ - "class Decodeur(tf.keras.layers.Layer):\n", - " def __init__(self,dim_LSTM, regul=0.0, drop=0.0):\n", - " self.regul = regul\n", - " self.dim_LSTM = dim_LSTM # Dimension des vecteurs cachés\n", - " self.drop = drop\n", + "class AttentionHierarchique(tf.keras.layers.Layer):\n", + " def __init__(self):\n", " super().__init__() # Appel du __init__() de la classe Layer\n", " \n", " def build(self,input_shape):\n", - " self.couche_Attention_Decodeur = CalculAttention_Decodeur(dim_LSTM=self.dim_LSTM)\n", - " self.couche_LSTM = tf.keras.layers.LSTM(self.dim_LSTM,kernel_regularizer=tf.keras.regularizers.l2(self.regul),return_sequences=False,return_state=True,dropout=self.drop,recurrent_dropout=self.drop, name=\"LSTM_Decodeur\")\n", - " self.W = self.add_weight(shape=(self.dim_LSTM+1,1),initializer=\"normal\",name=\"W\") # (#LSTM+1, 1)\n", - " self.b = self.add_weight(shape=(1,1),initializer=\"normal\",name=\"b\") # (1, 1)\n", + " self.couche_score = CalculScore()\n", " super().build(input_shape) # Appel de la méthode build()\n", - "\n", + " \n", " # Entrées :\n", - " # input: Entrée décodeur : (batch_size,Tin,#LSTM)\n", - " # Y: Yt : (batch_size,1,1)\n", - " # hid_state: hidden state : (batch_size,#LSTM)\n", - " # cell_state: cell_state : (batch_size,#LSTM)\n", + " # input: Sorties d'une couche encodeur RHN : (batch_size,Tin,#RHN)\n", + " # hid_state: Etat initial RHN : (batch_size,#RHN)\n", " # Sorties :\n", - " # out_hid : hidden_state : (batch_size,#LSTM)\n", - " # out_cell : cell_state : (batch_size,#LSTM)\n", - " # v_contexte: vecteur contexte : (batch_size,#LSTM)\n", - " def call(self,input,Y,hid_state,cell_state):\n", - " # Calcul du vecteur contexte\n", - " self.couche_Attention_Decodeur.SetStates(hid_state,cell_state)\n", - " C = self.couche_Attention_Decodeur(input) # (batch_size,1,#LSTM)\n", - "\n", - " # Calcul de Y_tilda\n", - " add = tf.keras.layers.concatenate([Y,C],axis=2) # (batch_size,1,#LSTM+1)\n", - " add = tf.transpose(add,perm=[0,2,1]) # (batch_size,#LSTM+1,1)\n", - " Y_tilda = tf.matmul(tf.transpose(self.W),add) # (1,#LSTM+1) x (batch_size,#LSTM+1,1) = (batch_size,1,1)\n", - " Y_tilda = Y_tilda + self.b\n", - "\n", - " # Calcul des hidden state et cell state\n", - " if hid_state is not None:\n", - " out_, out_hid, out_cell = self.couche_LSTM(Y_tilda,initial_state=[hid_state,cell_state])\n", - " else:\n", - " out_, out_hid, out_cell = self.couche_LSTM(Y_tilda)\n", + " # vc: SousVecteur contexte : (batch_size,1,RHN)\n", + " def call(self, input, hid_state):\n", + " # Calcul des scores\n", + " self.couche_score.SetInitState(hid_state)\n", + " scores = tf.keras.layers.TimeDistributed(self.couche_score)(input) # (batch_size,Tin,#RHN) : Timestep = Tin\n", + " # (batch_size,#RHN) envoyé Tin fois\n", + " # (batch_size,Tin,1) retourné\n", + " scores = tf.keras.activations.softmax(scores,axis=1) # (batch_size,Tin,1)\n", "\n", - " return out_hid,out_cell, C" + " # Applique les scores aux sorties de la couche RHN\n", + " poids = tf.multiply(input,scores) # (batch_size,Tin,#RHN)_x_(batch_size,Tin,1) = (batch_size,Tin,#RHN)\n", + "\n", + " # Calcul le sous-vecteur contexte\n", + " vc = K.sum(poids,axis=1) # (batch_size,#RHN)\n", + " return tf.expand_dims(vc,1) # (batch_size,1,#RHN)" ], 
"execution_count": null, "outputs": [] @@ -1086,146 +1121,183 @@ { "cell_type": "markdown", "metadata": { - "id": "cJz4ghULFn6a" + "id": "slCSUTmyifEY" }, "source": [ - "**5. Création de la couche de décodeur**" + "**b. Création du décodeur**" ] }, { "cell_type": "markdown", "metadata": { - "id": "b8pXmw_bFxoJ" + "id": "l_efbOikfRwt" }, "source": [ - "Il ne reste plus qu'à créer l'architecture complète et d'ajouter l'estimation de la sortie :" + "Dans le décodeur, on parallélise autant de couches d'attention que nécessaire afin de créer un modèle d'attention multi-entrées." ] }, { "cell_type": "markdown", "metadata": { - "id": "E4qpMf0rFtsJ" + "id": "wCElCxBcnUHj" }, "source": [ - "" + "" ] }, + { + "cell_type": "code", + "metadata": { + "id": "k2nZG3Rrjv3O" + }, + "source": [ + "class Decodeur(tf.keras.layers.Layer):\n", + " def __init__(self,dim_RHN,nbr_couches_RHN,dropout=0.0):\n", + " self.dim_RHN = dim_RHN\n", + " self.nbr_couches_RHN = nbr_couches_RHN\n", + " self.dropout = dropout\n", + " super().__init__() # Appel du __init__() de la classe Layer\n", + " \n", + " def build(self,input_shape):\n", + " attentions = []\n", + " inputs_attention = []\n", + "\n", + " # Création des \"nbr_couches\" entrées des attentions\n", + " # Chaque entrée est une liste : [input,init_state] = [((batch_size,Tin,#RHN)),((batch_size,#RHN))]\n", + " for i in range(input_shape[1]):\n", + " inputs_attention.append([tf.keras.Input(shape=(input_shape[2],input_shape[3])), # input = \"nbr_couches\"*(batch_size,Tin,#RHN)\n", + " tf.keras.Input(shape=(input_shape[3]))]) # init_state = \"nbr_couches\"*(batch_size,#RHN)\n", + "\n", + " # Création des \"nbr_couches\" couches d'attentions hierarchiques\n", + " for i in range(input_shape[1]):\n", + " att = AttentionHierarchique()(inputs_attention[i][0], # inputs_attention[i][0] : (batch_size,Tin,#RHN)\n", + " inputs_attention[i][1]) # inputs_attention[i][1] : (batch_size,#RHN)\n", + " attentions.append(att)\n", + "\n", + " # Création de la sortie concaténée des \"nbr_couches\" couches d'attentions\n", + " out = tf.convert_to_tensor(attentions) # out : (nbr_couches,batch_size,1,#RHN)\n", + " out = tf.transpose(out,perm=[1,0,2,3]) # out : (batch_size,nbr_couches,1,#RHN)\n", + "\n", + " # Création du modèle global\n", + " self.att_model = tf.keras.Model(inputs=inputs_attention,outputs=out)\n", + "\n", + " # Création des poids\n", + " self.Wtilda = tf.keras.layers.Dense(units=1,activation=None,use_bias=None)\n", + " self.Vtilda = tf.keras.layers.Dense(units=1,activation=None,use_bias=True)\n", + "\n", + " # Création du décodeur RHN\n", + " self.dec_RHN = Cellule_RHN(dim_RHN=self.dim_RHN,nbr_couches=self.nbr_couches_RHN,return_all_states=False,dim_input=1)\n", + " \n", + " super().build(input_shape) # Appel de la méthode build()\n", + " \n", + " # Entrées :\n", + " # input: Sorties des couches de l'encodeur RHN : (batch_size,nbr_couches,Tin,#RHN)\n", + " # hid_state: Etat initial RHN : (batch_size,#RHN)\n", + " # Y: Valeur de la série cible : (batch_size,1)\n", + " # only_att Si =True ne calcul que le vecteur ctx : True/False\n", + " # Sorties :\n", + " # d: Vecteur contexte : (batch_size,nbr_couches*RHN)\n", + " # s: Vecteur caché décodeur RHN : (batch_size,#RHN)\n", + " def call(self, input, hid_state, Y,only_att):\n", + " # Initialisation de l'état caché à 0 si besoin\n", + " # Construit le tenseur nul au format (batch_size,#RHN)\n", + " if hid_state == None:\n", + " coef = tf.expand_dims(input[:,0,0,0],-1) # (batch_size,1)\n", + " coef = tf.expand_dims(coef,-1) # 
(batch_size,1,1)\n", + " hid_state = tf.matmul(coef,tf.zeros(shape=(1,input.shape[3]))) # (batch_size,1,1)X(1,#RHN) = (batch_size,1,#RHN)\n", + " hid_state = tf.squeeze(hid_state,axis=1) # (batch_size,#RHN)\n", + "\n", + " # Construction de l'entrée du modèle\n", + " # nbr_couches*[((batch_size,Tin,#RHN)),((batch_size,#RHN))]\n", + " input_model = []\n", + " for i in range(input.shape[1]):\n", + " input_model.append([input[:,i,:,:],hid_state]) # [((batch_size,Tin,#RHN)),((batch_size,#RHN))]\n", + " \n", + " # Calcul des sous-vecteurs contextes\n", + " # avec le modèle d'attention hiérarchique parallélisé\n", + " d = self.att_model(input_model) # d : (batch_size,nbr_couches,1,#RHN)\n", + "\n", + " # Concaténation des sous-vecteurs contextes\n", + " d = tf.squeeze(d,axis=2) # (batch_size,nbr_couches,#RHN)\n", + " d = tf.keras.layers.Flatten()(d) # (batch_size,nbr_couches*RHN)\n", + "\n", + " if only_att == False :\n", + " # Calcul de y_tilda\n", + " ytilda = self.Wtilda(Y) # (batch_size,1)\n", + " ytilda = ytilda + self.Vtilda(d) # (batch_size,1)\n", + "\n", + " # Initialisation des masques de dropout pour tous les pas de temps\n", + " self.dec_RHN.InitMasquesDropout(self.dropout)\n", + "\n", + " # Décodage avec le réseau RHN\n", + " s = self.dec_RHN(tf.expand_dims(ytilda,-1),hid_state) # (batch_size,#RHN)\n", + " return d,s\n", + " else:\n", + " return d" + ], + "execution_count": null, + "outputs": [] + }, { "cell_type": "markdown", "metadata": { - "id": "JUEWJblwv3BI" + "id": "UYOTdM7fZT65" }, "source": [ - "Prédictions des valeurs multi-step :" + "**3. Création de la couche HRHN**" ] }, { "cell_type": "markdown", "metadata": { - "id": "W7eE1A-8GrvC" + "id": "UhML2b5bFsZB" }, "source": [ - "" + "" ] }, { "cell_type": "code", "metadata": { - "id": "cD5OoZc4K7SJ" + "id": "8PCEHUDEZ1bt" }, "source": [ - "class Net_DSTPRNN(tf.keras.layers.Layer):\n", - " def __init__(self,encodeur_phase1, encodeur_phase2,decodeur,longueur_sequence, longueur_sortie, dim_LSTM, regul=0.0, drop = 0.0):\n", - " self.encodeur_phase1 = encodeur_phase1\n", - " self.encodeur_phase2 = encodeur_phase2\n", + "class Net_HRHN(tf.keras.layers.Layer):\n", + " def __init__(self,encodeur,decodeur,longueur_sequence, dim_RHN, regul=0.0, drop = 0.0):\n", + " self.encodeur = encodeur\n", " self.decodeur = decodeur\n", " self.longueur_sequence = longueur_sequence\n", - " self.longueur_sortie = longueur_sortie\n", " self.regul = regul\n", " self.drop = drop\n", - " self.dim_LSTM = dim_LSTM\n", + " self.dim_RHN = dim_RHN\n", " super().__init__() # Appel du __init__() de la classe Layer\n", " \n", " def build(self,input_shape):\n", - " self.Wy = self.add_weight(shape=(self.longueur_sortie,self.dim_LSTM,2*self.dim_LSTM),initializer=\"normal\",name=\"Wy\") # (longueur_sortie,#LSTM, 2x#LSTM)\n", - " self.by = self.add_weight(shape=(self.longueur_sortie,self.dim_LSTM,1),initializer=\"normal\",name=\"by\") # (longueur_sortie,#LSTM, 1)\n", - " self.vy = self.add_weight(shape=(self.longueur_sortie,self.dim_LSTM,1),initializer=\"normal\",name=\"vy\") # (longueur_sortie,#LSTM,1)\n", + " self.W = tf.keras.layers.Dense(units=1,activation=None,use_bias=None)\n", + " self.V = tf.keras.layers.Dense(units=1,activation=None,use_bias=True)\n", " super().build(input_shape) # Appel de la méthode build()\n", "\n", " # Entrées :\n", " # input: Entrées X : (batch_size,Tin,#dim)\n", " # output_seq: Sortie séquence Y : (batch_size,Tin,1)\n", " # Sorties :\n", - " # sortie: Prédiction Y : (batch_size,longueur_sortie,1)\n", + " # sortie: Prédiction Y : 
(batch_size,1,1)\n", " def call(self,input,output_seq):\n", - " # Phase n°1 d'encodage\n", - " # Calcul les représentations spatiales pondérées\n", - " # des coupes temporelles des séries exogènes en entrée\n", - " # x_tilda\n", - " x_tilda = []\n", - " hid_state = None\n", - " cell_state = None\n", - " for i in range(input.shape[1]):\n", - " hid_state, cell_state, x_t = self.encodeur_phase1(input,hid_state,cell_state,i)\n", - " x_t = tf.squeeze(x_t,1) # (batch_size,1,#dim) => (batch_size,#dim)\n", - " x_tilda.append(x_t) # (batch_size,#dim)\n", - " x_tilda = tf.convert_to_tensor(x_tilda) # (Tin,batch_size,#dim)\n", - " x_tilda = tf.transpose(x_tilda,perm=[1,0,2]) # (batch_size,Tin,#dim)\n", - "\n", - " # Concaténation des sorties de la phase 1 avec la série cible\n", - " Z = []\n", - "\n", - " for i in range(input.shape[1]):\n", - " z = tf.keras.layers.concatenate([x_tilda[:,i,:], # (batch_size,#dim+1)\n", - " output_seq[:,i,:]],axis=1)\n", - " Z.append(z)\n", - " Z = tf.convert_to_tensor(Z) # (Tin,batch_size,#dim+1)\n", - " Z = tf.transpose(Z,perm=[1,0,2]) # (batch_size,Tin,#dim+1)\n", - "\n", - " # Phase n°2 d'encodage\n", - " # Création des représentations cachées des\n", - " # concaténations précédentes\n", - " hid = []\n", - " hid_state = None\n", - " cell_state = None\n", - " for i in range(input.shape[1]):\n", - " hid_state, cell_state = self.encodeur_phase2(Z,hid_state,cell_state,i)\n", - " hid.append(hid_state)\n", - " hid = tf.convert_to_tensor(hid) # (Tin,batch_size,#LSTM)\n", - " hid = tf.transpose(hid,perm=[1,0,2]) # (batch_size,Tin,#LSTM)\n", - "\n", - "\n", - " # Phase de décodage\n", - " # Récupère les états cachés à (T-1)\n", - " hid_ = None\n", - " cell_ = None\n", - " for i in range(0,output_seq.shape[1]-1):\n", - " hid_, cell_, vc = self.decodeur(hid,output_seq[:,i:i+1,:],hid_,cell_)\n", + " # Appel de l'encodeur\n", + " # Récupère l'ensemble des états cachés de l'encodeur RHN\n", + " H = self.encodeur(input) # (batch_size,nbr_couches,Tin,#RHN)\n", + "\n", + " # Décodage\n", + " hidden_state = None\n", + " for t in range(input.shape[1]):\n", + " vc, hidden_state = self.decodeur(H,hidden_state,output_seq[:,t:t+1,0],only_att = False)\n", " \n", - " # hid_ : hT-1 : hidden state à t=T-1\n", - " # cell_ : sT-1 : cell state à t=T-1\n", - " # vc : CT-1 : vecteur contexte à t=T-1\n", - " \n", - " # Estimation des sorties\n", - " # hid_ : (batch_size,#LSTM)\n", - " # vc : (batch_size,1,#LSTM)\n", - " Y = []\n", - " y = tf.expand_dims(output_seq[:,-1,:],-1) # y = YT : (batch_size,1,1)\n", - " \n", - " for i in range(0,self.longueur_sortie):\n", - " hid_, cell_, vc = self.decodeur(hid,y,hid_,cell_)\n", - " add = tf.keras.layers.concatenate([tf.expand_dims(hid_,1),vc],axis=2) # (batch_size,1,2x#LSTM)\n", - " add = tf.transpose(add,perm=[0,2,1]) # (batch_size,2x#LSTM,1)\n", - " sortie = tf.matmul(self.Wy[i,:,:],add) # (#LSTM,2x#LSTM) x (batch_size,2x#LSTM+1,1) = (batch_size,#LSTM,1)\n", - " sortie = sortie + self.by[i,:,:] # (batch_size,#LSTM,1)\n", - " sortie = tf.matmul(tf.transpose(self.vy[i,:,:]),sortie) # (1,#LSTM)x(batch_size,#LSTM,1) = (batch_size,1,1)\n", - " y = sortie\n", - " Y.append(y)\n", - "\n", - " Y = tf.convert_to_tensor(Y) # Y = (longueur_sortie,batch_size,1,1)\n", - " Y = tf.transpose(Y,perm=[1,0,2,3]) # Y = (batch_size,longueur_sortie,1,1)\n", - " Y = tf.squeeze(Y,-1) # Y = (batch_size,longueur_sortie,1)\n", - " return Y" + " # Couche d'attention finale\n", + " vc = self.decodeur(H,hidden_state,output_seq[:,0,0],only_att=True)\n", + "\n", + " # Génération de la 
prédiction\n", + " sortie = self.W(hidden_state) + self.V(vc) # (batch_size,1)\n", + " return tf.expand_dims(sortie,-1) # (batch_size,1,1)" ], "execution_count": null, "outputs": [] @@ -1236,7 +1308,7 @@ "id": "Nj12PHWgPTjD" }, "source": [ - "# Mise en place du modèle DSTP-RNN" + "# Mise en place du modèle HRHN" ] }, { @@ -1246,14 +1318,35 @@ }, "source": [ "def get_model(config):\n", + " dim_cnn_ = config['dim_filtres_cnn']\n", + " nbr_cnn_ = config['nbr_filtres_cnn']\n", + " max_pool_ = config['max_pooling_cnn']\n", + "\n", + " dim_cnn=[]\n", + " nbr_cnn=[]\n", + " max_pool=[]\n", + "\n", + " val = dim_cnn_[0].split(',')\n", + " for c in val:\n", + " dim_cnn.append(int(c))\n", + " \n", + " val = nbr_cnn_[0].split(',')\n", + " for c in val:\n", + " nbr_cnn.append(int(c))\n", + "\n", + " val = max_pool_[0].split(',')\n", + " for c in val:\n", + " max_pool.append(int(c))\n", + "\n", " entrees_sequences = tf.keras.layers.Input(shape=(config['longueur_sequence'],x_train[0].shape[2]))\n", " sorties_sequence = tf.keras.layers.Input(shape=(config['longueur_sequence'],1))\n", "\n", - " encodeur_P1 = Encodeur_Phase1(dim_LSTM=config['dim_LSTM'],drop=config['drop'],regul=config['l2reg'])\n", - " encodeur_P2 = Encodeur_Phase2(dim_LSTM=config['dim_LSTM'],drop=config['drop'],regul=config['l2reg'])\n", - " decodeur = Decodeur(dim_LSTM=config['dim_LSTM'],drop=config['drop'],regul=config['l2reg'])\n", + " dim_motif = Encodeur_CNN(dim_filtres_cnn=dim_cnn,nbr_filtres_cnn=nbr_cnn,dim_max_pooling=max_pool,dim_motif=0)(x_train[0][0:1,:,:]).shape[2]\n", + "\n", + " encodeur = Encodeur(dim_filtres_cnn=dim_cnn,nbr_filtres_cnn=nbr_cnn,dim_max_pooling=max_pool,dim_motif=dim_motif,dim_RHN=config['dim_RHN'],nbr_couches_RHN=config['nbr_couches_RHN'],dropout=config['drop'])\n", + " decodeur = Decodeur(dim_RHN=config['dim_RHN'],nbr_couches_RHN=config['nbr_couches_RHN'],dropout=config['drop'])\n", "\n", - " sortie = Net_DSTPRNN(encodeur_P1,encodeur_P2,decodeur,longueur_sequence=config['longueur_sequence'],longueur_sortie=longueur_sortie, dim_LSTM=config['dim_LSTM'],regul=config['l2reg'],drop=config['drop'])(entrees_sequences,sorties_sequence)\n", + " sortie = Net_HRHN(encodeur,decodeur,longueur_sequence=config['longueur_sequence'],drop=config['drop'],dim_RHN=config['dim_RHN'])(entrees_sequences,sorties_sequence)\n", "\n", " model = tf.keras.Model([entrees_sequences,sorties_sequence],sortie)\n", " return model" @@ -1279,6 +1372,156 @@ "**1. 
Espace des hyperparamètres**" ] }, + { + "cell_type": "code", + "metadata": { + "id": "zhHTHSdgnEEP" + }, + "source": [ + "liste_dim_filtres_cnn = [\"1\",\"2\",\"3\",\"4\"]\n", + "liste_nbr_filtres_cnn = [\"2\",\"4\",\"8\",\"16\",\"32\",\"64\",\"128\",\"256\"]\n", + "liste_rapport_max_pooling_cnn = [\"1\",\"2\",\"3\",\"4\"]\n", + "\n", + "def create_filtres_x2():\n", + " dim_filtres_cnn_x2 = []\n", + " nbr_filtres_cnn_x2 = []\n", + " liste_rapport_max_pooling_cnn_x2 = []\n", + "\n", + " for i in liste_dim_filtres_cnn:\n", + " for j in liste_dim_filtres_cnn:\n", + " dim_filtres_cnn_x2.append([i,j])\n", + " for i in liste_nbr_filtres_cnn:\n", + " for j in liste_nbr_filtres_cnn:\n", + " nbr_filtres_cnn_x2.append([i,j])\n", + " for i in liste_rapport_max_pooling_cnn:\n", + " for j in liste_rapport_max_pooling_cnn:\n", + " liste_rapport_max_pooling_cnn_x2.append([i,j])\n", + " \n", + " return dim_filtres_cnn_x2,nbr_filtres_cnn_x2,liste_rapport_max_pooling_cnn_x2\n", + "\n", + "def create_filtres_x3():\n", + " dim_filtres_cnn_x3 = []\n", + " nbr_filtres_cnn_x3 = []\n", + " liste_rapport_max_pooling_cnn_x3 = []\n", + "\n", + " for i in liste_dim_filtres_cnn:\n", + " for j in liste_dim_filtres_cnn:\n", + " for k in liste_dim_filtres_cnn:\n", + " dim_filtres_cnn_x3.append([i,j,k])\n", + " for i in liste_nbr_filtres_cnn:\n", + " for j in liste_nbr_filtres_cnn:\n", + " for k in liste_nbr_filtres_cnn:\n", + " nbr_filtres_cnn_x3.append([i,j,k])\n", + " for i in liste_rapport_max_pooling_cnn:\n", + " for j in liste_rapport_max_pooling_cnn:\n", + " for k in liste_rapport_max_pooling_cnn:\n", + " liste_rapport_max_pooling_cnn_x3.append([i,j,k])\n", + " \n", + " return dim_filtres_cnn_x3,nbr_filtres_cnn_x3,liste_rapport_max_pooling_cnn_x3\n", + "\n", + "def create_liste_filtres():\n", + " all_dim_filtres_cnn = [[],[]]\n", + " all_nbr_filtres_cnn = [[],[]]\n", + " all_max_pooling_cnn = [[],[]]\n", + "\n", + " liste_dim_filtres_x2,liste_nbr_filtres_x2,liste_rapport_max_pooling_cnn_x2 = create_filtres_x2()\n", + " liste_dim_filtres_x3,liste_nbr_filtres_x3,liste_rapport_max_pooling_cnn_x3 = create_filtres_x3()\n", + "\n", + " for i in liste_dim_filtres_x2:\n", + " all_dim_filtres_cnn[0].append(i)\n", + "\n", + " for i in liste_dim_filtres_x3:\n", + " all_dim_filtres_cnn[1].append(i)\n", + "\n", + " for i in liste_nbr_filtres_x2:\n", + " all_nbr_filtres_cnn[0].append(i)\n", + "\n", + " for i in liste_nbr_filtres_x3:\n", + " all_nbr_filtres_cnn[1].append(i)\n", + "\n", + " for i in liste_rapport_max_pooling_cnn_x2:\n", + " all_max_pooling_cnn[0].append(i)\n", + "\n", + " for i in liste_rapport_max_pooling_cnn_x3:\n", + " all_max_pooling_cnn[1].append(i)\n", + "\n", + "\n", + " return all_dim_filtres_cnn,all_nbr_filtres_cnn,all_max_pooling_cnn" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "kx-aUxmEnGKi" + }, + "source": [ + "all_dim_filtres_cnn = [[],[]]\n", + "all_nbr_filtres_cnn = [[],[]]\n", + "all_max_pooling_cnn = [[],[]]\n", + "\n", + "\n", + "all_dim_filtres_cnn_, all_nbr_filtres_cnn_ , all_max_pooling_cnn_= create_liste_filtres()\n", + "\n", + "val_str = \"\"\n", + "\n", + "for i in all_dim_filtres_cnn_[0]:\n", + " for j in i:\n", + " val_str = val_str+str(j)+\",\"\n", + " val_str = val_str[:-1]\n", + " all_dim_filtres_cnn[0].append(val_str)\n", + " val_str = \"\"\n", + "\n", + "for i in all_nbr_filtres_cnn_[0]:\n", + " for j in i:\n", + " val_str = val_str+j+\",\"\n", + " val_str = val_str[:-1]\n", + " all_nbr_filtres_cnn[0].append(val_str)\n", + 
" val_str = \"\"\n", + "\n", + "for i in all_dim_filtres_cnn_[1]:\n", + " for j in i:\n", + " val_str = val_str+str(j)+\",\"\n", + " val_str = val_str[:-1]\n", + " all_dim_filtres_cnn[1].append(val_str)\n", + " val_str = \"\"\n", + "\n", + "for i in all_nbr_filtres_cnn_[1]:\n", + " for j in i:\n", + " val_str = val_str+j+\",\"\n", + " val_str = val_str[:-1]\n", + " all_nbr_filtres_cnn[1].append(val_str)\n", + " val_str = \"\"\n", + "\n", + "for i in all_max_pooling_cnn_[0]:\n", + " for j in i:\n", + " val_str = val_str+str(j)+\",\"\n", + " val_str = val_str[:-1]\n", + " all_max_pooling_cnn[0].append(val_str)\n", + " val_str = \"\"\n", + "\n", + "for i in all_max_pooling_cnn_[1]:\n", + " for j in i:\n", + " val_str = val_str+str(j)+\",\"\n", + " val_str = val_str[:-1]\n", + " all_max_pooling_cnn[1].append(val_str)\n", + " val_str = \"\"\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "9qWo0E2ynILr" + }, + "source": [ + "all_nbr_filtres_cnn[0]" + ], + "execution_count": null, + "outputs": [] + }, { "cell_type": "code", "metadata": { @@ -1289,21 +1532,27 @@ "\n", "def create_search_space():\n", " config = {\n", - " \"longueur_sequence\": tune.choice([5,10,15,20,25,30]),\n", - " \"dim_LSTM\": tune.choice([16,32,64,128,256]),\n", + " \"longueur_sequence\": tune.choice([3,4,5,6,7,8,9,10,15,20,25,30]),\n", + " \"dim_filtres_cnn\": tune.choice(all_dim_filtres_cnn[1][:]),\n", + " \"nbr_filtres_cnn\": tune.choice(all_nbr_filtres_cnn[1][:]),\n", + " \"max_pooling_cnn\": tune.choice(all_max_pooling_cnn[1][:]),\n", + " \"dim_RHN\": tune.choice([8,16,32,64,128,256]),\n", + " \"nbr_couches_RHN\": tune.choice([1,2,3,4]),\n", " \"drop\": tune.choice([0.0,0.01,0.1,0.3,0.6]),\n", - " \"l2reg\": tune.choice([0.0,0.0001,0.001,0.01]),\n", " \"batch_size\": tune.choice([128,256,512]),\n", " \"lr\": tune.loguniform(1e-4, 1e-1)\n", " }\n", " \n", " initial_best_config = [{\n", - " \"longueur_sequence\": 20,\n", - " \"dim_LSTM\": 128,\n", + " \"longueur_sequence\":20,\n", + " \"dim_filtres_cnn\": \"3,3,3\",\n", + " \"nbr_filtres_cnn\": \"16,32,64\",\n", + " \"max_pooling_cnn\": \"3,3,3\",\n", + " \"dim_RHN\": 64,\n", + " \"nbr_couches_RHN\": 3,\n", " \"drop\": 0.0,\n", - " \"l2reg\": 0.0,\n", " \"batch_size\": 128,\n", - " \"lr\": 0.001\n", + " \"lr\": 1e-3\n", " }]\n", "\n", " return config,initial_best_config" @@ -1392,6 +1641,7 @@ "\n", " with strategy.scope():\n", " model = get_model(config)\n", + " model.build(input_shape=([(int(config['batch_size']/8),config['longueur_sequence'],x_train[0].shape[2]),(int(config['batch_size']/8),config['longueur_sequence'],1)]))\n", " model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=config['lr']),loss='mse')\n", " \n", " callbacks = create_callbacks()\n", @@ -1426,7 +1676,7 @@ "\n", "class SendFileToGoogleDrive(Callback):\n", " def on_trial_complete(self,iteration,trials,trial,**info):\n", - " os.system(\"zip -r /content/googledrive/MyDrive/RayTuneDSTP_SML2010.zip /content/ray_results\")" + " os.system(\"zip -r /content/googledrive/MyDrive/RayTuneHRHN_SML2010.zip /content/ray_results\")" ], "execution_count": null, "outputs": [] @@ -1459,7 +1709,7 @@ "id": "6XvAMmojZrTg" }, "source": [ - "os.system(\"unzip /content/googledrive/MyDrive/RayTuneDSTP_SML2010.zip -d /\")" + "os.system(\"unzip /content/googledrive/MyDrive/RayTuneHRHN_SML2010.zip -d /\")" ], "execution_count": null, "outputs": [] @@ -1521,16 +1771,12 @@ { "cell_type": "code", "metadata": { - "id": "FRa7aHc7lvV-", - "outputId": 
"65d1f47d-7666-4127-c12c-fa8e3848993b", - "colab": { - "base_uri": "https://localhost:8080/" - } + "id": "FRa7aHc7lvV-" }, "source": [ "analysis = tune.run(train_model, \n", " local_dir=results_dir,\n", - " name=\"RayTuneDSTP_SML2010\",\n", + " name=\"RayTuneHRHN_SML2010\",\n", " verbose=0,\n", " num_samples=10000,\n", " scheduler=scheduler,\n", @@ -1544,133 +1790,7 @@ " callbacks=[SendFileToGoogleDrive()])\n" ], "execution_count": null, - "outputs": [ - { - "output_type": "stream", - "text": [ - "WARNING:ray.tune.function_runner:Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m 2021-08-15 20:53:37.991547: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m 2021-08-15 20:53:39.753965: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m 2021-08-15 20:53:39.764577: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m 2021-08-15 20:53:39.764613: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (db78049c8fd9): /proc/driver/nvidia/version does not exist\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m WARNING:tensorflow:AutoGraph could not transform . at 0x7fbebf5f5950> and will run it as-is.\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m Cause: could not parse the source code of . at 0x7fbebf5f5950>: no matching AST found\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m WARNING:tensorflow:AutoGraph could not transform . at 0x7fbebf5f5d40> and will run it as-is.\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m Cause: could not parse the source code of . at 0x7fbebf5f5d40>: no matching AST found\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m WARNING:tensorflow:AutoGraph could not transform . at 0x7fbebf5f5e60> and will run it as-is.\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m Cause: could not parse the source code of . at 0x7fbebf5f5e60>: no matching AST found\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m WARNING:tensorflow:AutoGraph could not transform . at 0x7fbe94e10320> and will run it as-is.\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m Cause: could not parse the source code of . at 0x7fbe94e10320>: no matching AST found\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m WARNING:tensorflow:AutoGraph could not transform . at 0x7fbe94e10b90> and will run it as-is.\n", - "\u001b[2m\u001b[36m(pid=1184)\u001b[0m Cause: could not parse the source code of . 
@@ -1521,16 +1771,12 @@
{
"cell_type": "code",
"metadata": {
- "id": "FRa7aHc7lvV-",
- "outputId": "65d1f47d-7666-4127-c12c-fa8e3848993b",
- "colab": {
- "base_uri": "https://localhost:8080/"
- }
+ "id": "FRa7aHc7lvV-"
},
"source": [
"analysis = tune.run(train_model, \n",
"                    local_dir=results_dir,\n",
- "                    name=\"RayTuneDSTP_SML2010\",\n",
+ "                    name=\"RayTuneHRHN_SML2010\",\n",
"                    verbose=0,\n",
"                    num_samples=10000,\n",
"                    scheduler=scheduler,\n",
@@ -1544,133 +1790,7 @@
"                    callbacks=[SendFileToGoogleDrive()])\n"
],
"execution_count": null,
- "outputs": [
- [... captured stderr/stdout removed here: a Ray Tune function-checkpointing warning, TensorFlow startup and TPU gRPC initialization messages, many repeated AutoGraph "could not transform ... will run it as-is" warnings, and the opening "Epoch 1/1000" line ...]
- ]
+ "outputs": []
}
]
-}
+}
\ No newline at end of file