Questions About Frames extracting #35

Open
fringe-k opened this issue Sep 29, 2022 · 2 comments

@fringe-k

At line 85 of SwinBERT/create_image_frame_tsv.py there is this assignment:
"current_image_path = previous_image_path"

Does this mean that when fewer than num_frames images are extracted, you pad the sequence up to num_frames by repeating the last image? This step is a little confusing to me. Is the result different from what you get without copying the last image?
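
To make the question concrete, the padding behavior I have in mind can be sketched as below. This is only a minimal illustration of "repeat the last frame until there are num_frames of them"; the function name `pad_with_last_frame` and the file names are hypothetical and are not taken from the repository.

```python
from typing import List


def pad_with_last_frame(frame_paths: List[str], num_frames: int) -> List[str]:
    """Return exactly num_frames paths, repeating the last one if too few exist.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    if not frame_paths:
        raise ValueError("need at least one extracted frame to pad from")
    # Keep at most num_frames of the frames that were actually extracted.
    padded = list(frame_paths[:num_frames])
    # Repeat the final available frame until the target length is reached.
    while len(padded) < num_frames:
        padded.append(padded[-1])
    return padded


if __name__ == "__main__":
    frames = ["frame_0001.jpg", "frame_0002.jpg", "frame_0003.jpg"]
    print(pad_with_last_frame(frames, 5))
```

If this reading is right, every video ends up with the same number of frame entries, while skipping the copy would leave short videos with fewer entries than num_frames.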

@liyaowei-stu

liyaowei-stu commented Oct 31, 2022

I followed "/prepro/extract_youcook2_frms.sh" to run "./prepro/extract_frames.py", but it doesn't seem to work. This is the command and the output I get:

python ./prepro/extract_frames.py \
    --video_root_dir ./datasets/MSRVTT-v2/videos \
    --save_dir ./datasets/MSRVTT-v2/ \
    --video_info_tsv ./datasets/MSRVTT-v2/val.img.tsv \
    --num_frames 32
0it [00:00, ?it/s]

Am I doing something wrong?
Thank you very much ~
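
A progress bar that stops at 0it means the loop never ran a single iteration, so whatever was being iterated over was empty. One quick check is whether the file passed as --video_info_tsv exists and has any rows. The sketch below is a generic diagnostic under that assumption; `count_tsv_rows` is a hypothetical helper, and it makes no assumption about which columns the script expects.

```python
import os
import sys


def count_tsv_rows(tsv_path: str) -> int:
    """Count non-empty lines in a TSV file.

    Hypothetical diagnostic helper; raises if the file is missing.
    """
    if not os.path.isfile(tsv_path):
        raise FileNotFoundError(f"TSV not found: {tsv_path}")
    with open(tsv_path, "r", encoding="utf-8") as f:
        # Read line by line so large files are never loaded whole.
        return sum(1 for line in f if line.strip())


if __name__ == "__main__":
    # Usage: python check_tsv.py ./datasets/MSRVTT-v2/val.img.tsv
    if len(sys.argv) != 2:
        sys.exit("usage: python check_tsv.py <path-to-tsv>")
    n_rows = count_tsv_rows(sys.argv[1])
    print(f"{sys.argv[1]}: {n_rows} non-empty rows")
    if n_rows == 0:
        print("The file is empty, which would explain a 0it progress bar.")
```

If the file does contain rows, the empty input is somewhere else, for example the video directory or the way the script derives its list of videos.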

@tiesanguaixia

> At line 85 of SwinBERT/create_image_frame_tsv.py there is this assignment: "current_image_path = previous_image_path"
>
> Does this mean that when fewer than num_frames images are extracted, you pad the sequence up to num_frames by repeating the last image? This step is a little confusing to me. Is the result different from what you get without copying the last image?

Hi! Have you reproduced the results in the paper? May I ask whether you adjusted the values of 'loss_sparse_w' and 'learning_rate' in the command? I guess 'loss_sparse_w' is the regularization hyperparameter of $Loss_{SPARSE}$, i.e. the $\lambda$ in the paper. According to the appendix, the model seems to perform best on MSR-VTT when $\lambda = 5$. But why is the default value of 'loss_sparse_w' in the command 0.5? Do I need to change it to 5? Thanks a lot!
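
To show what I mean by the regularization weight, here is a minimal sketch. It assumes the total loss is the captioning loss plus 'loss_sparse_w' times an L1 penalty on the learned attention mask. The function names and the choice of summing rather than averaging the mask values are my assumptions, not taken from the repository.

```python
from typing import Sequence


def sparse_penalty(mask_values: Sequence[float], weight: float) -> float:
    """Weighted L1 penalty: weight * sum(|v|) over the mask values."""
    return weight * sum(abs(v) for v in mask_values)


def total_loss(caption_loss: float,
               mask_values: Sequence[float],
               loss_sparse_w: float) -> float:
    """Captioning loss plus the weighted sparsity penalty (assumed form)."""
    return caption_loss + sparse_penalty(mask_values, loss_sparse_w)


if __name__ == "__main__":
    mask = [0.5, -0.5]  # toy mask values, for illustration only
    for w in (0.5, 5.0):
        print(f"loss_sparse_w={w}: total loss = {total_loss(2.0, mask, w)}")
```

Under this assumed form, going from 0.5 to 5 makes the sparsity term ten times larger for the same mask, so the two settings are not interchangeable unless the penalty is normalized differently in the code than in the paper.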
