Missing sorting of features? #284

rcheu-quora · 2020-07-02T22:28:06Z

I was looking at reagent/preprocessing/preprocessor.py, and it seems that the Preprocessor expects its input to already be sorted according to the normalization parameters. I believe that's not actually the case: the input columns are simply in increasing feature ID order.

One of the first lines of the forward pass of the Preprocessor is:

split_input = torch.split(input, self.split_sections, dim=1)

This appears to expect that the columns of the input tensor are ordered as in sorted_features.
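To make the suspected mismatch concrete, here is a minimal sketch in plain Python (no torch). The feature IDs, the feature types, and the "group by type, then by ID" ordering are all hypothetical, chosen only to illustrate the problem; they are not taken from ReAgent. The helper split_columns mimics the positional behaviour of a split along dim=1.

```python
from itertools import accumulate

# Hypothetical feature types keyed by feature ID (illustrative only).
feature_types = {1: "BINARY", 2: "CONTINUOUS", 3: "BINARY", 4: "CONTINUOUS"}

# Assumed preprocessor ordering: features grouped by type, then by ID.
type_order = ["CONTINUOUS", "BINARY"]
sorted_features = sorted(
    feature_types, key=lambda f: (type_order.index(feature_types[f]), f)
)
# One section per type, sized by how many features have that type.
split_sections = [
    sum(1 for t in feature_types.values() if t == name) for name in type_order
]


def split_columns(columns, sections):
    """Split a list positionally into consecutive chunks of the given sizes."""
    ends = list(accumulate(sections))
    starts = [0] + ends[:-1]
    return [columns[s:e] for s, e in zip(starts, ends)]


# Column order as produced upstream: ascending feature ID.
input_columns = sorted(feature_types)

expected = split_columns(sorted_features, split_sections)
actual = split_columns(input_columns, split_sections)
print(expected)  # [[2, 4], [1, 3]]
print(actual)    # [[1, 2], [3, 4]]
```

Under these assumptions, the first section would receive features 1 and 2 even though it is meant for features 2 and 4, so a BINARY feature would be normalized as if it were CONTINUOUS.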

The input to the preprocessor is generated by reagent/workflow/data_fetcher.py. Inside that file, the order is generated by:

def infer_states_names(df, multi_steps: Optional[int]):
    """ Infer possible state names from states and next state features. """
    state_keys = get_distinct_keys(df, "state_features")
    next_states_is_col_arr_map = not (multi_steps is None)
    next_state_keys = get_distinct_keys(
        df, "next_state_features", is_col_arr_map=next_states_is_col_arr_map
    )
    return sorted(set(state_keys) | set(next_state_keys))

This is later passed to make_sparse2dense(df, col_name: str, possible_keys: List) as possible_keys and used to generate the dense feature array input.
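As a rough illustration of why the order of possible_keys matters, the sketch below densifies a sparse feature map in plain Python. The function sparse_to_dense and its missing_value default are hypothetical stand-ins, not the actual make_sparse2dense implementation; the point is only that the dense column order is exactly the order of possible_keys.

```python
from typing import Dict, List


def sparse_to_dense(
    row: Dict[int, float], possible_keys: List[int], missing_value: float = 0.0
) -> List[float]:
    """Lay out a sparse {feature_id: value} map as a dense list.

    Column i of the result always corresponds to possible_keys[i].
    missing_value is an assumed placeholder for absent features.
    """
    return [row.get(key, missing_value) for key in possible_keys]


# Keys inferred the same way as in infer_states_names: a sorted union.
state_keys = [3, 1]
next_state_keys = [2, 3]
possible_keys = sorted(set(state_keys) | set(next_state_keys))

dense = sparse_to_dense({3: 0.5, 1: 2.0}, possible_keys)
print(possible_keys)  # [1, 2, 3]
print(dense)          # [2.0, 0.0, 0.5]
```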

I believe either the preprocessor needs to first re-arrange the input to match the sorted feature ordering, or the sorted ordering needs to be used when generating the datasets as the possible_keys variable.
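The first option (re-arranging the input) could look roughly like the sketch below, again in plain Python. The function reorder_columns and the example orderings are hypothetical. With a real tensor, the equivalent would presumably be indexing the columns with a precomputed permutation.

```python
from typing import List, Sequence


def reorder_columns(
    row: Sequence[float], current_order: List[int], target_order: List[int]
) -> List[float]:
    """Permute a dense row from current_order to target_order.

    Both orders are lists of feature IDs and must contain the same IDs.
    """
    position = {feature_id: i for i, feature_id in enumerate(current_order)}
    return [row[position[feature_id]] for feature_id in target_order]


# Hypothetical orderings: ascending feature ID vs. the order the
# preprocessor is assumed to expect.
current_order = [1, 2, 3, 4]
target_order = [2, 4, 1, 3]

row = [10.0, 20.0, 30.0, 40.0]
reordered = reorder_columns(row, current_order, target_order)
print(reordered)  # [20.0, 40.0, 10.0, 30.0]
```

The second option avoids the permutation entirely: if the preprocessor's ordering were passed as possible_keys when building the dataset, the dense array would already come out in the expected order.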

kaiwenw · 2020-07-10T05:11:49Z

Hi @rcheu-quora, thanks for the detailed analysis! I think you're right. Either query_data or DataLoader can handle the sorting. I'll be sure to fix this. Right now the examples work because they're generated in the same order.

rcheu-quora changed the title ~~Possible missing sorting of features?~~ Missing sorting of features? Jul 2, 2020
