
Reproducing your results #1

Closed
ludovic-carre opened this issue Aug 27, 2018 · 11 comments
Labels
Stale: stale and scheduled for closing soon

Comments


ludovic-carre commented Aug 27, 2018

Hi,

I am working on a similar project, xview and yolo, and I would like to reproduce your results. I have a few questions:

  1. Why do you use 30 anchors? Did you run some analysis, or is it intuition?
  2. I noticed that you have multiple cfg files, and that the symmetric 30-anchor one is the default in train, so I figure it is the one that gave you the best results? What input size should I use with your cfg file?
  3. I know you started from the eriklindernoren repo. Can you mention the major changes you made to the training process (or elsewhere) so I know what to pay attention to?
  4. I haven't looked at everything yet, but from a first look you don't seem to use a loss that emphasizes mistakes on under-represented xview classes. How do you deal with the fact that buildings and small cars make up a huge part of the dataset? My network only learns to predict these two classes.
  5. I haven't noticed any data augmentation; do you use any?

Finally, if you can mention/explain anything else that you think could help someone reproduce your results, it would be really helpful!


glenn-jocher commented Aug 29, 2018

Hi @PiggyGenius, good questions, here are your answers!

  1. I used 30 anchors (compared to 9 in YOLOv3) because I've read that, in general, higher anchor counts correlate with higher mAPs. Here is an example from http://machinethink.net/blog/object-detection/
    [figure from the linked post showing the effect of the number of centroids]

  2. I actually achieved the best results with c60_a30.cfg, but I reasoned that since this is overhead imagery, if you had infinite examples of each class the anchor boxes should be vertical-horizontal symmetric, so to force this idea I duplicated the data, transposing the boxes in the duplicated set, before running k-means for the 'symmetric' cfg files (a rough sketch of this step appears at the end of this comment). In the end it's probably much ado about nothing, as the two cfg files (symmetric and non-symmetric) ended up very similar to each other. I doubt your results will change materially depending on which you use, but if you want to duplicate my results use c60_a30.cfg.

  3. Yes, the eriklindernoren repo was great for inference but did not train correctly, so I modified the cost functions in models.py and build_targets() in detect.py. I also essentially rewrote much of datasets.py, switching it from PIL to OpenCV and adding augmentation, which of course is necessary for training but not for inference.

  4. I use a weighted loss for the classification loss term, so buildings and cars, for example, are much less important. The weight is the inverse of the class frequency. In utils.py you'll find the weights as a lookup table. The numbers here are the number of occurrences of each class in the dataset; the weights are their inverses, normalized to sum to 1 (a minimal sketch of applying such weights follows this list).

def xview_class_weights(indices):  # weights of each class in the training set, normalized to sum to 1
    weights = 1 / torch.FloatTensor(
        [74, 364, 713, 71, 2925, 209767, 6925, 1101, 3612, 12134, 5871, 3640, 860, 4062, 895, 149, 174, 17, 1624, 1846, 125, 122, 124, 662, 1452, 697, 222, 190, 786, 200, 450, 295, 79, 205, 156, 181, 70, 64, 337, 1352, 336, 78, 628, 841, 287, 83, 702, 1177, 313865, 195, 1081, 882, 1059, 4175, 123, 1700, 2317, 1579, 368, 85])
    weights /= weights.sum()
    return weights[indices]
  5. Yes, there is significant data augmentation in datasets.py: both spatial augmentation (translation, rotation, skew, zoom, flipping) and lighting augmentation (variation of the S and V channels after projecting the RGB image to HSV). Bounding boxes are automatically augmented along with the image. Note, however, that the lighting augmentation actually hurt the results on xview: mAP dropped from 0.16 to 0.12 when I used it. Also note that rotating bounding boxes can get a bit dicey at rotation angles around 45 degrees, as the box may become much larger around the object than desirable.
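
For illustration, here is a minimal sketch of how inverse-frequency class weights like the ones from xview_class_weights() can be plugged into a classification loss term in PyTorch. This is not the repo's exact loss code; the counts and tensor names below are made up.

import torch
import torch.nn.functional as F

# Illustration only: applying per-class weights to a cross-entropy classification term.
num_classes = 60
class_counts = torch.randint(50, 5000, (num_classes,)).float()  # stand-in for the real frequency table
class_weights = 1.0 / class_counts
class_weights /= class_weights.sum()

cls_logits = torch.randn(8, num_classes)            # predicted class scores for 8 matched boxes
cls_targets = torch.randint(0, num_classes, (8,))   # their ground-truth class indices
cls_loss = F.cross_entropy(cls_logits, cls_targets, weight=class_weights)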

To reproduce the results, you should just be able to start training. You should notice right away after a few epochs if the results are similar, as the results posted to results.txt should match the image on the repo home page. You can use plotResults() in utils.py to plot your results.
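
For reference, here is a minimal sketch of the duplicate-and-transpose k-means step from answer 2, assuming wh is an N x 2 array of ground-truth box widths and heights in pixels. It uses scikit-learn and is only an illustration, not the exact script used to generate the cfg files.

import numpy as np
from sklearn.cluster import KMeans

def symmetric_anchors(wh, n_anchors=30):
    # wh: N x 2 array of box widths and heights in pixels.
    # Append a transposed copy of every box so the centroids come out
    # roughly symmetric under a width/height swap.
    wh_sym = np.concatenate([wh, wh[:, ::-1]], axis=0)
    km = KMeans(n_clusters=n_anchors, n_init=10, random_state=0).fit(wh_sym)
    anchors = km.cluster_centers_
    return anchors[np.argsort(anchors.prod(axis=1))]  # sort anchors by area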

@abidmalikwaterloo

@glenn-jocher I am also trying to reproduce the results. I have the following graphs for precision and recall.
[attached figure: precision and recall curves over training]

The behavior over the last 100 epochs is not in line with the behavior you show on the web. I am getting mAP = 0.20 on the training set, compared to the 0.30 you report. Any comments?

I see you turned off the CUDA flag in detect.py. Any special reason for this? It is pretty slow on CPU.

I am trying to reduce the classes for my experiments. You have 61 classes and labels for them; I want to reduce it to 10 classes. I see that you have

def xview_classes2indices(classes):  # remap xview classes 11-94 to 0-61
    indices = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, -1, 3, -1, 4, 5, 6, 7, 8, -1, 9, 10, 11, 12, 13, 14,
               15, -1, -1, 16, 17, 18, 19, 20, 21, 22, -1, 23, 24, 25, -1, 26, 27, -1, 28, -1, 29, 30, 31, 32, 33, 34,
               35, 36, 37, -1, 38, 39, 40, 41, 42, 43, 44, 45, -1, -1, -1, -1, 46, 47, 48, 49, -1, 50, 51, -1, 52, -1,
               -1, -1, 53, 54, -1, 55, -1, -1, 56, -1, 57, -1, 58, 59]
    return [indices[int(c)] for c in classes]

My understanding is that I should change the indices of unnecessary classes to -1 and they will be filtered out. Am I on the right track, or do I have to do more?

@glenn-jocher

@abidmalikwaterloo you're free to set the CUDA flag as you like. The graphs look good; your specific results may vary, as I was making changes to the repository after uploading those results to try to optimize it.

Yes, if you want to use custom classes and data you will need to redefine the relevant sections of the code, like the one you highlighted. There are many ways to do this. The purpose of the function you see there is to handle arbitrary class numbers; you do not need it if your classes are numbered simply, such as 0, 1, 2, 3, etc. In xview the classes skip numbers, e.g. 5, 6, 17, 20, etc. A sketch of one way to build a reduced mapping is below.
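
For illustration, one minimal way to remap a chosen subset of raw xview class IDs to 0..9 and drop everything else, along the lines of xview_classes2indices(). The ten IDs and the helper names below are placeholders, not a recommended subset.

# Hypothetical 10-class subset of raw xview class IDs; substitute your own.
KEEP = [11, 12, 13, 15, 17, 18, 19, 20, 21, 23]
REMAP = {xview_id: i for i, xview_id in enumerate(KEEP)}  # raw ID -> 0..9

def subset_classes2indices(classes):
    # Returns -1 for classes outside the subset, matching the existing convention.
    return [REMAP.get(int(c), -1) for c in classes]

def filter_labels(labels):
    # labels: iterable of (class_id, x, y, w, h) tuples with raw xview class IDs;
    # keep only rows whose class survives the remap.
    out = []
    for cls, *box in labels:
        idx = REMAP.get(int(cls), -1)
        if idx != -1:
            out.append((idx, *box))
    return out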


abidmalikwaterloo commented Dec 4, 2018

@glenn-jocher I am playing with the parameters but am unable to get mAP = 0.16 on the validation set (images not included in training). I am using 791 images for training and 85 for validation. The max mAP I get is 0.09. Do you have any specific parameter values I can use to get the mAP close to 0.16?

@abidmalikwaterloo

@PiggyGenius Were you able to get mAP = 0.16? What parameters did you use for your architecture?

@glenn-jocher

Be advised that the https://github.com/ultralytics/xview-yolov3 repository is not under active development anymore. We recommend you use https://github.com/ultralytics/yolov3 instead, our main YOLOv3 repository.

@sawhney-medha

@abidmalikwaterloo I am also trying to train on a subset of the data for around 9-10 classes. Could you please tell me if you were successful in doing it and how?

@glenn-jocher

@sawhney-medha please be advised that the https://github.com/ultralytics/xview-yolov3 repository is not under active development anymore. We recommend you use https://github.com/ultralytics/yolov3 instead, our main YOLOv3 repository.

glenn-jocher pinned this issue Aug 9, 2019
@github-actions

This issue is stale because it has been open 30 days with no activity. Remove the Stale label or comment, or this will be closed in 5 days.

github-actions bot added the Stale label Jun 10, 2020
@im-tanyasuri

Hi, I want to use resized xview images, i.e. decrease their resolution first and then use your model on them. I think you are cropping patches from the original images, which are around 3k x 3k. I want to do the same with 1000 x 1000 sized images. Please help. Thanks.

@glenn-jocher

@im-tanyasuri you can achieve this by resizing the images with any image processing library, such as OpenCV or PIL, before feeding them into the model. Use cv2.resize() in OpenCV or Image.resize() in PIL to bring the images to the desired dimensions, then proceed with the resized images as inputs to the model.
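
For example, a minimal OpenCV sketch (the file names, target size, and box format are placeholders); if your labels are in absolute pixel coordinates you would scale them by the same factors, while normalized labels need no change:

import cv2

# Minimal example, not code from the repo: shrink one chip to 1000 x 1000
# and scale pixel-coordinate boxes by the same factors.
img = cv2.imread('xview_chip.tif')
h0, w0 = img.shape[:2]
new_w, new_h = 1000, 1000
resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)

sx, sy = new_w / w0, new_h / h0
boxes = [(120, 340, 180, 410)]  # (xmin, ymin, xmax, ymax) in original pixel coordinates
scaled = [(x1 * sx, y1 * sy, x2 * sx, y2 * sy) for x1, y1, x2, y2 in boxes]
cv2.imwrite('xview_chip_1000.png', resized)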
