Trident 3 branch fused? #335

Open
hezhu1996 opened this issue Jun 6, 2020 · 4 comments

Comments

@hezhu1996

Hi, thank you for your great work.
I was wondering: after applying Trident blocks in the conv4 stage (which works best according to your paper), how do you fuse the three branches into one to feed the RPN when the scale-specific setting is not applied? By concatenation, or by element-wise product? I ask because I want to try it in a single-shot detector, which has no RPN or anything like it. Thanks for your reply :)

@xmyqsh

xmyqsh commented Jun 6, 2020

Try an analogy with FPN: FPN can be used in both two-stage and one-stage detectors.

@hezhu1996
Author

@xmyqsh So do you mean that you don't actually fuse them into one branch, but that each branch goes to its own RPN and R-CNN head, even without the scale-aware training scheme? Thanks

@xmyqsh

xmyqsh commented Jun 7, 2020

@TWDH
Aha, I get you now.
TridentNet is built on a two-stage detector. It inherits from Faster R-CNN rather than FPN, but it can be viewed as another take on FPN. It adopts a training scheme similar to the one SNIP introduced, although SNIP uses Faster R-CNN or R-FCN, not FPN. TridentNet's innovation is that it uses dilation to build a feature pyramid, instead of the image pyramid used in SNIP or SNIPER, and that it is pretrained on ImageNet. I'd like to see someone pretrain FPN on ImageNet to see how much gain that brings.

I cannot say whether TridentDilation is better than FPN or vice versa; both use a feature pyramid. TridentDilation can detect small objects at a lower resolution than FPN, but for extremely small objects it has to fall back on an image pyramid. FPN has a similar problem, though it offers higher resolution for small objects. For large objects, TridentDilation keeps the same resolution, which is neither flexible nor efficient. For extremely large objects, TridentNet again has to fall back on an image pyramid. For one specific object scale, though, TridentNet is definitely better than FPN. For diverse scales, an image pyramid suits TridentNet better because of its scale-aware training scheme.

What is the scale-aware training scheme? It shouts at the detector: Be stupid! Do what you should do! Do what you are good at! Be a scale-specific detector! :)

If I remember correctly, the scale-aware training scheme mainly applies to the RPN phase: it removes the harder, extreme-scale examples for a given feature map to ease model learning. The dropped extreme-scale objects can then be handled by other, more suitable feature maps or by an image pyramid.
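To make the idea concrete, here is a minimal sketch of scale-based filtering, assuming each branch only trains on boxes whose scale sqrt(w * h) falls inside that branch's valid range. The names `VALID_RANGES`, `box_scale` and `assign_boxes_to_branches` are hypothetical, and the range values are illustrative placeholders, not values read from this repository.

```python
import math

# Illustrative valid ranges (in pixels), one per branch.
# These numbers are assumptions for the example, not the repository's config.
VALID_RANGES = [(0.0, 90.0), (30.0, 160.0), (90.0, math.inf)]


def box_scale(box):
    """Return sqrt(width * height) for a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return math.sqrt(max(x2 - x1, 0.0) * max(y2 - y1, 0.0))


def assign_boxes_to_branches(boxes, valid_ranges=VALID_RANGES):
    """For each branch, keep only the boxes whose scale lies in its range."""
    per_branch = []
    for low, high in valid_ranges:
        per_branch.append([b for b in boxes if low <= box_scale(b) <= high])
    return per_branch


# Three boxes with scales 20, 100 and 400 pixels.
boxes = [(0, 0, 20, 20), (0, 0, 100, 100), (0, 0, 400, 400)]
small, medium, large = assign_boxes_to_branches(boxes)
```

Because the ranges overlap, a mid-sized box can be a training target for more than one branch, while extreme-scale boxes are simply dropped from the branches that are a poor fit for them.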

The R-CNN stage is the same in all two-stage detectors. The RPN runs on several branches/feature maps, and RoI pooling resamples every proposal to the same 7x7 size, which should be the fusion you are looking for.
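A single-shot detector has no RoI pooling stage, so one possible stand-in for this fusing step is to pool the detections from all branches and run NMS over them. The sketch below is only an illustration under that assumption; `iou` and `merge_branches` are hypothetical helpers and are not taken from this repository.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_branches(per_branch_dets, iou_thresh=0.5):
    """Pool (box, score) detections from all branches, then apply greedy NMS."""
    pooled = [det for dets in per_branch_dets for det in dets]
    pooled.sort(key=lambda det: det[1], reverse=True)  # highest score first
    kept = []
    for box, score in pooled:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(box, kept_box) <= iou_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# One detection per branch; the first two overlap heavily.
dets = [
    [((0, 0, 10, 10), 0.9)],
    [((1, 1, 11, 11), 0.8)],
    [((50, 50, 90, 90), 0.7)],
]
merged = merge_branches(dets)
```

Here the second branch's box is suppressed by the higher-scoring box from the first branch, and the non-overlapping box from the third branch survives.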

To conclude: TridentNet and its scale-aware training scheme can be used in a one-stage detector. You can find some clues in the FCOS anchor selection scheme, which has more or less adopted scale-aware training.

Lastly, I have developed a detector called CropNet, which can double APs without an extra order of computation, targeting autonomous driving scenarios. Instead of pretraining it on ImageNet, we could train it on a larger autonomous driving dataset.

I'm not the author of TridentNet, so I may have misinterpreted some of it. I'd love for the authors to correct me :)

Oops...
I missed an important feature of TridentNet: the weight sharing in TridentDilation. I have to say, this is the most innovative part of the design and the one I like best. It lets objects of different scales train the same weights. As a result, using only one branch, which was trained on the objects of all three branches, can give very promising performance at high speed.
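As a toy illustration of the weight sharing, the sketch below applies one shared kernel at three dilation rates, so each branch sees a different receptive field while training the same weights. It is a 1D pure-Python simplification, not the repository's implementation; `dilated_conv1d`, `trident_block` and the dilation rates (1, 2, 3) are assumptions made for the example.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with the given dilation rate."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def trident_block(signal, shared_kernel, dilations=(1, 2, 3)):
    """Apply one shared kernel at several dilation rates, one output per branch."""
    return [dilated_conv1d(signal, shared_kernel, d) for d in dilations]


signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
shared_kernel = [1, 0, -1]  # a single set of weights used by every branch
branches = trident_block(signal, shared_kernel)

# Receptive field of each branch: (kernel_size - 1) * dilation + 1.
receptive_fields = [(len(shared_kernel) - 1) * d + 1 for d in (1, 2, 3)]
```

The branches differ only in dilation, so at inference time a single branch can be run on its own with the shared weights, which is where the speed benefit mentioned above comes from.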

@hezhu1996
Author

Thanks for the comments. It seems TridentNet splits the original ResNet into 3 branches, and each branch connects to its own RPN and R-CNN head, which means there are 3 RPN/R-CNN pairs altogether that do not interfere with each other. I noticed that scale-aware training actually only improves results by about 0.3%, so it is not that important :) Not sure if I'm right.


2 participants