1 file changed, +8 -5 lines changed

@@ -115,13 +115,16 @@ class LongRoPEModel(nn.Module):
         d_model (int): Dimension of the model.
         n_heads (int): Number of attention heads.
         num_layers (int): Number of transformer layers.
-        max_len (int): Maximum sequence length.
+        vocab_size (int): Size of the vocabulary.
+        base_context_length (int): Original context window length of the model.
         rope (RoPEPositionalEncoding): Rotary Position Encoding (RoPE) module.
         transformers (nn.ModuleList): List of transformer encoder layers.
-        lambda_factors (list): Lambda factors for non-uniform interpolation.
-        lambda_factors_base (list): Lambda factors for the base model.
-        extension_ratio (float): Extension ratio for the context window.
-        n_hat (int): Threshold for applying interpolation.
+        lambda_factors (dict): Lambda factors for non-uniform interpolation for different context lengths.
+        n_hat (dict): Threshold for applying interpolation for different context lengths.
+        lambda_factors_base (list): Base lambda factors for the original context length.
+        n_hat_base (int): Base n_hat for the original context length.
+        extension_ratio (float): Ratio of the extended context length to the base context length.
+
 
     Methods:
         forward(input_ids):
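
The revised docstring changes `lambda_factors` and `n_hat` from single values to dicts covering different context lengths, and defines `extension_ratio` as extended length over base length. As a rough sketch of how such a layout might be consumed, the snippet below keys both dicts by target context length and falls back to the base values at or below `base_context_length`. The helper name `select_rope_params`, the choice of dict keys, and every numeric value are assumptions made for illustration; none of them come from the repository.

```python
# Hypothetical illustration of the attribute layout described in the docstring.
# All values and the helper name are assumptions, not repository code.

base_context_length = 4096  # assumed original context window

# Base values for the original context length (list and int per the docstring).
lambda_factors_base = [1.0, 1.0, 1.0, 1.0]
n_hat_base = 0

# Per-length values (dicts per the docstring), keyed here by target length.
lambda_factors = {
    8192: [1.0, 1.2, 1.6, 2.0],
    16384: [1.0, 1.5, 2.8, 4.0],
}
n_hat = {8192: 2, 16384: 4}


def select_rope_params(target_length):
    """Return (lambda factors, n_hat, extension_ratio) for a target length."""
    if target_length <= base_context_length:
        # No extension needed: use the base parameters with a ratio of 1.
        return lambda_factors_base, n_hat_base, 1.0
    if target_length not in lambda_factors or target_length not in n_hat:
        raise KeyError(f"no parameters stored for length {target_length}")
    # Ratio of the extended context length to the base context length.
    extension_ratio = target_length / base_context_length
    return lambda_factors[target_length], n_hat[target_length], extension_ratio
```

Calling `select_rope_params(8192)` with these assumed values yields the 8192 entry of each dict together with a ratio of 2.0, while any length up to 4096 returns the base list, the base threshold, and a ratio of 1.0.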