On the test case from your comment, final_window_start is greater than full_seq_len:
    full_seq_len = 16
    max_pieces = 8
    start_tokens = 1
    end_tokens = 1

    # Next, select indices of the sequence such that it will result in embeddings representing the original
    # sentence. To capture maximal context, the indices will be the middle part of each embedded window
    # sub-sequence (plus any leftover start and final edge windows), e.g.,
    #  0     1 2    3  4   5    6    7     8     9   10   11   12    13 14  15
    # "[CLS] I went to the very fine [SEP] [CLS] the very fine store to eat [SEP]"
    # with max_pieces = 8 should produce max context indices [2, 3, 4, 10, 11, 12] with additional start
    # and final windows with indices [0, 1] and [14, 15] respectively.

    # Find the stride as half the max pieces, ignoring the special start and end tokens
    # Calculate an offset to extract the centermost embeddings of each window
    stride = (max_pieces - start_tokens - end_tokens) // 2
    stride_offset = stride // 2 + start_tokens

    first_window = list(range(stride_offset))

    max_context_windows = [i for i in range(full_seq_len)
                           if stride_offset - 1 < i % max_pieces < stride_offset + stride]

    final_window_start = full_seq_len - (full_seq_len % max_pieces) + stride_offset + stride
    final_window = list(range(final_window_start, full_seq_len))

    select_indices = first_window + max_context_windows + final_window
    print(select_indices)
Output is [0, 1, 2, 3, 4, 10, 11, 12] and [14, 15] is missing.
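To make the arithmetic behind this explicit, here is a small diagnostic sketch. The helper name `final_window_for` is hypothetical (it is not part of udify); it only wraps the `final_window_start` formula quoted above so it can be evaluated for several sequence lengths. It shows the symptom and is not a proposed fix.

```python
def final_window_for(full_seq_len, max_pieces=8, start_tokens=1, end_tokens=1):
    """Return (final_window_start, final_window) using the formula quoted in this issue.

    Hypothetical helper for illustration only; not part of the udify code base.
    """
    stride = (max_pieces - start_tokens - end_tokens) // 2
    stride_offset = stride // 2 + start_tokens
    final_window_start = (full_seq_len - (full_seq_len % max_pieces)
                          + stride_offset + stride)
    # range() yields nothing when its start is >= its stop, so the final
    # window silently becomes empty whenever final_window_start >= full_seq_len.
    return final_window_start, list(range(final_window_start, full_seq_len))


# Try a few lengths, including exact multiples of max_pieces (16 and 24).
for full_seq_len in (16, 20, 22, 23, 24):
    start, window = final_window_for(full_seq_len)
    print(full_seq_len, start, window)
```

With these parameters `stride_offset + stride` is 5, so the formula only yields a non-empty final window when `full_seq_len % max_pieces` is greater than 5. For the test case the remainder is 0, which gives `final_window_start = 21` and an empty final window.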
Hi, there seems to be a bug in the calculation of final_window_start:
udify/udify/modules/bert_pretrained.py
Lines 488 to 509 in cbabef6