There are two reasons we perform weighted optimization:

1. we receive weights from the selector (manual prioritization)
2. a downsampling strategy outputs weights (e.g. loss/gradnorm)
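For context, weighted optimization here means computing per-sample losses and scaling them by the weights before reducing. A minimal PyTorch sketch (the `weighted_loss` helper and the single `weights` tensor standing in for either weight source are illustrative, not the actual trainer code):

```python
import torch
import torch.nn.functional as F

def weighted_loss(output: torch.Tensor, target: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    # per-sample losses require reduction="none" on the criterion
    per_sample_loss = F.cross_entropy(output, target, reduction="none")
    # scale each sample's loss by its weight, then reduce to a scalar
    return (per_sample_loss * weights).mean()
```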
However, I think some things in the current weight handling are a bit weird:
a) If we receive weights from the selector and then do downsampling, we lose the weights
b) We do the following logic:

```python
weighted_optimization = (
    retrieve_weights_from_dataloader or self._downsampling_mode == DownsamplingMode.BATCH_THEN_SAMPLE
)
```
and I don't know why we do this. First, do we never use weighted optimization in StB? Second, why do we always use weights in BtS? I think whether we should perform weighted optimization is a property of the downsampling strategy (if we don't receive weights from the selector). If we don't receive weights from the selector and e.g. use RHO-LOSS, we should not use weighted optimization. While we currently set the weights to 1, we still use the no-reduction loss function, which I think can have performance implications for training with downsampling. I think we should do the following (see the sketch after this list):
- if we receive weights from the selector and use no downsampling, use the weights from the selector
- if we don't receive weights from the selector and use no downsampling OR a downsampling strategy that does not output weights (we probably need to add a flag), use no weighted optimization
- if we don't receive weights from the selector and use a downsampling strategy that outputs weights (e.g. loss/gradnorm), use the downsampling weights in both StB and BtS
- if we receive weights from the selector and use a downsampler that outputs weights, I'm not sure. Either we use one of the two weight sets or we multiply them.
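A rough sketch of that decision logic could look as follows. Note that `outputs_weights` is the hypothetical flag proposed above, and all names are illustrative rather than the actual Modyn API:

```python
from enum import Enum

class WeightSource(Enum):
    NONE = 0          # plain (unweighted) optimization
    SELECTOR = 1      # weights from the selector
    DOWNSAMPLER = 2   # weights produced by the downsampling strategy

def resolve_weight_source(selector_provides_weights: bool, downsampler) -> WeightSource:
    # hypothetical flag on the strategy: does it output per-sample weights?
    downsampler_outputs_weights = downsampler is not None and downsampler.outputs_weights

    if downsampler_outputs_weights:
        # applies to both StB and BtS; if the selector also provided weights,
        # whether to multiply the two sets is the open question above
        return WeightSource.DOWNSAMPLER
    if selector_provides_weights:
        return WeightSource.SELECTOR
    # no downsampling, or a strategy like RHO-LOSS that selects but does not weight
    return WeightSource.NONE
```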
Right now, the weight handling uses the expensive no-reduction loss even when it's not necessary, I think.
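Concretely, the criterion could be chosen based on whether weighted optimization is actually needed; a sketch assuming a PyTorch-style criterion (as used in the trainer):

```python
import torch

if weighted_optimization:
    # per-sample losses, to be weighted and reduced manually
    criterion = torch.nn.CrossEntropyLoss(reduction="none")
else:
    # fused mean reduction inside the criterion; avoids materializing per-sample losses
    criterion = torch.nn.CrossEntropyLoss()
```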
@XianzheMa maybe, at your convenience, you could check this out, since it may have downsampling performance implications :) thank you! But not highest priority, of course.