Commit 4589b66

Update the comment for the model
1 parent 028b4a0 commit 4589b66

1 file changed: +8 -5 lines changed

src/main.py

@@ -115,13 +115,16 @@ class LongRoPEModel(nn.Module):
 d_model (int): Dimension of the model.
 n_heads (int): Number of attention heads.
 num_layers (int): Number of transformer layers.
-max_len (int): Maximum sequence length.
+vocab_size (int): Size of the vocabulary.
+base_context_length (int): Original context window length of the model.
 rope (RoPEPositionalEncoding): Rotary Position Encoding (RoPE) module.
 transformers (nn.ModuleList): List of transformer encoder layers.
-lambda_factors (list): Lambda factors for non-uniform interpolation.
-lambda_factors_base (list): Lambda factors for the base model.
-extension_ratio (float): Extension ratio for the context window.
-n_hat (int): Threshold for applying interpolation.
+lambda_factors (dict): Lambda factors for non-uniform interpolation for different context lengths.
+n_hat (dict): Threshold for applying interpolation for different context lengths.
+lambda_factors_base (list): Base lambda factors for the original context length.
+n_hat_base (int): Base n_hat for the original context length.
+extension_ratio (float): Ratio of the extended context length to the base context length.
+
 
 Methods:
 forward(input_ids):
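
The diff only touches the docstring, so the code that consumes these attributes is not shown. As a rough illustration of what the updated descriptions imply, the sketch below assumes that `lambda_factors` and `n_hat` are dicts keyed by target context length, that positions below `n_hat` are left uninterpolated, and that each rotary dimension's angle is divided by its lambda factor otherwise. The function names, the base of 10000, and the example values are all hypothetical and are not taken from `src/main.py`.

```python
import math


def extension_ratio(extended_len: int, base_context_length: int) -> float:
    """Ratio of the extended context length to the base context length."""
    return extended_len / base_context_length


def rescaled_angles(pos, d_model, lambdas, n_hat, base=10000.0):
    """Rotary angles for one position (assumed scheme, not the repo's code).

    Positions below n_hat keep the plain RoPE angle; later positions have
    the angle of dimension pair i divided by lambdas[i].
    """
    angles = []
    for i in range(d_model // 2):
        theta = base ** (-2 * i / d_model)  # standard RoPE frequency
        scale = 1.0 if pos < n_hat else 1.0 / lambdas[i]
        angles.append(pos * theta * scale)
    return angles


# Hypothetical per-context-length settings, keyed by target length.
lambda_factors = {8192: [1.0, 1.5, 2.0, 4.0]}
n_hat = {8192: 4}

target = 8192
angles = rescaled_angles(8, 8, lambda_factors[target], n_hat[target])
ratio = extension_ratio(target, 2048)
```

Keying both dicts by context length lets one model object hold a separate set of factors and a separate threshold for each extension step, with `lambda_factors_base` and `n_hat_base` covering the original window.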
