When training a gait recognition model on a single GPU with OpenGait, every accuracy I get is 2%. What is going on? #234

Open
heyihao2018 opened this issue Jul 27, 2024 · 6 comments

Comments

@heyihao2018

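Hi everyone. I'm reproducing the GaitSet code from OpenGait. Since I only have a single GPU, I commented out all of the distributed-training code. I believe the training process itself is fine. My main suspicion is that I'm on Windows: could that have an effect? The test code might also have a small problem. If anyone has run into this, please let me know. I'll keep digging into it myself as well. Thanks, everyone!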

@world1987

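It has little to do with the operating system. I have reproduced the results on both Windows and Linux. For a single GPU you need to adjust the parameters.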

@wj1tr0y
Contributor

wj1tr0y commented Aug 11, 2024

For a single GPU you don't need to change the code. Just change the train and test commands: set CUDA_VISIBLE_DEVICES to 0 and nproc_per_node to 1.
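As a rough illustration of that suggestion, the two settings can be assembled in a small launcher. This is only a sketch: `single_gpu_launch` is a hypothetical helper, and the script path, config path and `--cfgs`/`--phase` options in the usage line are placeholders that are assumed here, so substitute the arguments from the train or test command you already run. It also assumes the command goes through `torch.distributed.launch`, which is where the `--nproc_per_node` option comes from.

```python
import os
import sys


def single_gpu_launch(script_args, gpu_id=0):
    """Return (env, argv) for a one-process torch.distributed.launch run."""
    env = dict(os.environ)
    # Expose exactly one GPU to the launched process.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    argv = [
        sys.executable, "-m", "torch.distributed.launch",
        "--nproc_per_node=1",  # one process, matching the one visible GPU
        *script_args,
    ]
    return env, argv


# Placeholder arguments (assumed): replace with the script and options
# from your own train or test command.
env, argv = single_gpu_launch(
    ["opengait/main.py", "--cfgs", "path/to/config.yaml", "--phase", "train"]
)
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv[1:]))
# To actually start the run: subprocess.run(argv, env=env, check=True)
```

Building the command this way leaves the repository code untouched, which is the point of the suggestion: only the number of visible GPUs and the number of launched processes change.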

@JJ-Luov

JJ-Luov commented Sep 12, 2024

Hi, have you solved this problem?

@heyihao2018
Author

heyihao2018 commented Sep 18, 2024

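import torch

def ddp_all_gather(features, dim=0, requires_grad=False):
    '''
        inputs: [n, ...]
    '''

    # Single-GPU run: hard-code one process instead of querying torch.distributed.
    world_size = 1
    rank = 0
    feature_list = [features for _ in range(world_size)]
    if requires_grad:
        feature_list[rank] = features
    # With world_size == 1 this concatenates only the one local tensor.
    feature = torch.cat(feature_list, dim=dim)
    return feature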

I modified this function in lib/utils/common.py, and after that it worked.
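For anyone who wants the same behavior without hard-coding `world_size = 1`, a possible variant is to fall back only when no process group exists. This is a minimal sketch under assumptions, not the repository's implementation: `all_gather_or_passthrough` is a hypothetical name, and unlike the function above it always puts the local tensor back into the gathered list rather than checking `requires_grad`.

```python
import torch
import torch.distributed as dist


def all_gather_or_passthrough(features: torch.Tensor, dim: int = 0) -> torch.Tensor:
    """Gather features across ranks, or return them unchanged without DDP."""
    # No initialized process group (plain single-process run): nothing to gather.
    if not (dist.is_available() and dist.is_initialized()):
        return features
    world_size = dist.get_world_size()
    if world_size == 1:
        return features
    gathered = [torch.zeros_like(features) for _ in range(world_size)]
    dist.all_gather(gathered, features.contiguous())
    # all_gather fills the list with copies that carry no gradient, so put the
    # local tensor back to keep gradients flowing through this rank's slice.
    gathered[dist.get_rank()] = features
    return torch.cat(gathered, dim=dim)


# Single-process check: without init_process_group the input passes through.
x = torch.arange(6.0).reshape(3, 2)
out = all_gather_or_passthrough(x)
print(out.shape)
```

With this shape the same file can serve both a launcher-based multi-GPU run and a plain `python` run, at the cost of one extra branch.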

@yyj-001

yyj-001 commented Nov 11, 2024

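Hello, has your problem been solved? I've run into a similar issue and would like to learn from how you handled it. I tried both a single GPU and two GPUs on Linux, and in both cases training reaches only about 2% rank-1 accuracy. I did not change any code, only the parameters in the .yaml file. At this point I can't figure out where the problem is.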

@JJ-Luov

JJ-Luov commented Nov 11, 2024 via email
