Why we need another loop to compute the rest anchor matches? #258

troyliu0105 · 2020-05-29T01:11:55Z

Hi, Eric. Thanks for your amazing work.
I have a small question about the code.

MobileNet-YOLO/src/caffe/layers/yolov3_layer.cpp

Lines 647 to 678 in 6a4db28

    
    if (mask_n >= 0) {
      bool overlap = false;
      float iou;
      best_n = mask_n;
      //LOG(INFO) << best_n;
      best_index = best_n*len*stride + pos + b * bottom[0]->count(1);
      iou = delta_region_box(truth, swap_data, biases_,mask_[best_n], best_index, i, j, side_w_, side_h_, side_w_*anchors_scale_, side_h_*anchors_scale_,
        diff, coord_scale_*(2 - truth[2] * truth[3]), stride,iou_loss_,iou_normalizer_,max_delta_,accumulate_);
      if (iou > 0.5)
        recall += 1;
      if (iou > 0.75)
        recall75 += 1;
      avg_iou += iou;
      avg_iou_loss += (1 - iou);
      avg_obj += swap_data[best_index + 4 * stride];
      if (use_logic_gradient_) {
        diff[best_index + 4 * stride] = (-1.0) * (1 - swap_data[best_index + 4 * stride]) * object_scale_;
      }
      else {
        diff[best_index + 4 * stride] = (-1.0) * (1 - swap_data[best_index + 4 * stride]);
        //diff[best_index + 4 * stride] = (-1) * (1 - exp(input_data[best_index + 4 * stride] - exp(input_data[best_index + 4 * stride])));
      }
      //diff[best_index + 4 * stride] = (-1.0) * (1 - swap_data[best_index + 4 * stride]) ;
      delta_region_class_v3(swap_data, diff, best_index + 5 * stride, class_label, num_class_, class_scale_, &avg_cat, stride, use_focal_loss_,label_smooth_eps_); //softmax_tree_
      ++count;
      ++class_count_;
    }

At line 674, we find the anchor that best matches the GT, treat it as a positive sample, and compute its gradient.

MobileNet-YOLO/src/caffe/layers/yolov3_layer.cpp

Lines 680 to 698 in 6a4db28

    
    for (int n = 0; n < biases_size_; ++n) {
      int mask_n = int_index(mask_, n, num_);
      if (mask_n >= 0 && n != best_n && iou_thresh_ < 1.0f) {
        vector<Dtype> pred(4);
        pred[2] = biases_[2 * n] / (float)(side_w_*anchors_scale_);
        pred[3] = biases_[2 * n + 1] / (float)(side_h_*anchors_scale_);
        pred[0] = 0;
        pred[1] = 0;
        float iou = box_iou(pred, truth_shift,iou_loss_);
        if (iou > iou_thresh_) {
          bool overlap = false;
          float iou;
          //LOG(INFO) << best_n;
          best_index = mask_n*len*stride + pos + b * bottom[0]->count(1);
          iou = delta_region_box(truth, swap_data, biases_,mask_[mask_n], best_index, i, j, side_w_, side_h_, side_w_*anchors_scale_, side_h_*anchors_scale_,
            diff, coord_scale_*(2 - truth[2] * truth[3]), stride,iou_loss_,iou_normalizer_,max_delta_,accumulate_);

But at line 680, we iterate over the remaining anchors, and any anchor whose IoU with the GT is larger than iou_thresh_ is also treated as a positive sample.

So here is my confusion: why don't we just iterate over all the anchors once and compute the gradient for every anchor whose IoU is larger than the threshold?

eric612 · 2020-05-29T02:12:03Z

This is a trick to increase mAP that is used in the AlexeyAB project, but I think it may produce some false bounding boxes in real cases.

The first phase is the original YOLOv3 implementation: it only does regression on the one anchor with the best IoU. The second phase searches the anchor pool for other anchors, excluding the first-phase anchor, that can do the same thing (accumulating the delta). If the size of the GT box falls between two (or more) anchors, this is helpful, especially for small objects, because it gives more chances to detect them.
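To make the two phases concrete, here is a minimal C++ sketch. It is not the code from yolov3_layer.cpp: it assumes anchors are compared to the GT by shape only (both boxes centered at the origin, as the `pred[0] = 0; pred[1] = 0;` lines in the quoted loop suggest), and it ignores the per-layer mask. The names `shape_iou` and `match_anchors` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// IoU of two boxes that share the same center, so only width/height matter.
inline float shape_iou(float w1, float h1, float w2, float h2) {
  const float inter = std::min(w1, w2) * std::min(h1, h2);
  const float uni = w1 * h1 + w2 * h2 - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}

// Returns the indices of the anchors treated as positive for one GT box.
// Phase 1: the single best-matching anchor (always the first element).
// Phase 2: every other anchor whose shape IoU exceeds iou_thresh.
// With iou_thresh >= 1.0f phase 2 is skipped, mirroring the
// `iou_thresh_ < 1.0f` condition in the quoted loop.
std::vector<int> match_anchors(
    const std::vector<std::pair<float, float>>& anchors,  // (w, h) per anchor
    float gt_w, float gt_h, float iou_thresh) {
  std::vector<int> positives;
  if (anchors.empty()) return positives;

  // Phase 1: pick the anchor with the highest shape IoU.
  int best_n = 0;
  float best_iou = -1.0f;
  for (std::size_t n = 0; n < anchors.size(); ++n) {
    const float iou =
        shape_iou(anchors[n].first, anchors[n].second, gt_w, gt_h);
    if (iou > best_iou) {
      best_iou = iou;
      best_n = static_cast<int>(n);
    }
  }
  positives.push_back(best_n);

  // Phase 2: add the remaining anchors that are "good enough".
  if (iou_thresh < 1.0f) {
    for (std::size_t n = 0; n < anchors.size(); ++n) {
      if (static_cast<int>(n) == best_n) continue;
      const float iou =
          shape_iou(anchors[n].first, anchors[n].second, gt_w, gt_h);
      if (iou > iou_thresh) positives.push_back(static_cast<int>(n));
    }
  }
  return positives;
}
```

For anchors (10,13), (16,30), (33,23) and a 14x22 GT, a threshold of 0.5 keeps only anchor 1, while 0.4 also adds anchors 0 and 2. A GT that is far from every anchor shape still gets its best anchor from phase 1, which a single thresholded loop over all anchors would not guarantee.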

troyliu0105 · 2020-05-29T08:14:32Z

Thanks for the reply, I figured it out with your help. I think the main point is to increase the number of positive samples.

In the original implementation, an image with N GT boxes has only 3 * N positive anchors (every loss layer must have one anchor matched to each GT).

With AlexeyAB's implementation, the extra anchors that match a GT well enough are added as positives too.

But I noticed that AlexeyAB's implementation also accumulates the MSE gradients, and yours does not. Is there any reason?

https://github.com/AlexeyAB/darknet/blob/e6469eb071521a4ff5be8c0e9cceb524d0a65a95/src/yolo_layer.c#L181-L184
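The difference between overwriting and accumulating can be illustrated with a small sketch. It is a hypothetical helper, not code from either repository, and the sign convention (target minus prediction) is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: writes an MSE-style gradient for one coordinate.
// accumulate == false: overwrite whatever an earlier match stored in the slot.
// accumulate == true : add to it, so every match on that slot contributes.
inline void write_mse_delta(std::vector<float>& delta, std::size_t index,
                            float target, float prediction, float scale,
                            bool accumulate) {
  const float g = scale * (target - prediction);
  if (accumulate) {
    delta[index] += g;
  } else {
    delta[index] = g;
  }
}
```

If two matches write to the same slot with gradients 0.5 and 0.25, accumulating leaves 0.75 there, while overwriting leaves only the last value, 0.25.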

eric612 · 2020-05-29T08:41:46Z

Yes, I usually use GIoU here, so I forgot to check the MSE term.
