Description
What is your work environment?
OS: macOS 15.5 (24F74)
Python version: 3.13.3
Package versions: See requirements.txt section below
SCML version: 0.7.6
Describe the topic you would like to discuss
It seems that context_switch in the OneShotRLAgent may not be functioning as intended in the SCML-OneShot RL skeleton code.
In particular, the SupplierContext is being used even when the agent's production_level is 1, which should correspond to a consumer. This suggests that the context-switching mechanism might not be selecting the correct context based on the agent's level in the production graph.
To investigate, I added logging in the context_switch method of OneShotRLAgent (in scml/oneshot/rl/agent.py) like so:
    def context_switch(self):
        aolist = zip(
            self._action_managers, self._obs_managers, range(len(self._obs_managers))
        )
        if self._randomize_test_order:
            aolist = list(aolist)
            shuffle(aolist)
        self._valid_index = -1
        for a, o, i in aolist:
            if a.context.is_valid_awi(
                self.awi, types=(type(self),), raise_on_failure=True
            ) and o.context.is_valid_awi(
                self.awi, types=(type(self),), raise_on_failure=True
            ):
                self._valid_index = i
                # Added logging: report which context pair was selected.
                print(f"\nproduction level:{self.awi.level}")
                print(
                    f"Agent {self.id} is using model {i} with action manager {a.context.__class__.__name__} and observation manager {o.context.__class__.__name__}"
                )
                break

However, based on the logs, even when the production level is 1, the selected context is still SupplierContext, as shown below:
production level:1
Agent 06My@1 is using model 0 with action manager SupplierContext and observation manager SupplierContext
production level:1
Agent 04My@1 is using model 0 with action manager SupplierContext and observation manager SupplierContext
...
This suggests that the consumer-specific context (e.g., ConsumerContext) is not being selected properly.
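To make the suspected failure mode concrete, here is a minimal, self-contained sketch of a first-match selection loop like the one in context_switch. It does not use SCML at all: FakeAWI, PermissiveContext, LevelContext, and first_valid_index are hypothetical stand-ins I made up for illustration. Whether the real SupplierContext behaves like PermissiveContext (that is, whether its is_valid_awi check ignores the level) is an assumption I have not verified.

```python
from dataclasses import dataclass


@dataclass
class FakeAWI:
    """Hypothetical stand-in for the agent-world interface; only `level` is modeled."""

    level: int


class PermissiveContext:
    """Hypothetical context whose validity check ignores the production level."""

    def is_valid_awi(self, awi) -> bool:
        return True


class LevelContext:
    """Hypothetical context that accepts exactly one production level."""

    def __init__(self, level: int):
        self.level = level

    def is_valid_awi(self, awi) -> bool:
        return awi.level == self.level


def first_valid_index(contexts, awi) -> int:
    """Return the index of the first context that accepts `awi`, or -1 if none does."""
    for i, ctx in enumerate(contexts):
        if ctx.is_valid_awi(awi):
            return i
    return -1


consumer_awi = FakeAWI(level=1)

# If the first context accepts every level, it always wins and index 1 is never reached.
print(first_valid_index([PermissiveContext(), LevelContext(1)], consumer_awi))  # 0

# If the first context rejects level 1, the loop falls through to the second one.
print(first_valid_index([LevelContext(0), LevelContext(1)], consumer_awi))  # 1
```

If the real checks behave like the first case, the logs above would be the expected outcome of the loop rather than a failure of the switching logic itself, and the fix would lie in the validity check or in the order of the contexts.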
Additional context
All models and contexts are initialized correctly and passed to the agent.
I confirmed that production_level reflects the correct consumer/supplier assignment.
The full simulation and evaluation run correctly, but the model may be learning under the wrong context.
If this is not a bug but rather a misconfiguration on my part, I would appreciate any guidance on how to properly configure or validate the context-switching mechanism.
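One way I could validate the configuration is to log the validity of every candidate context instead of only the first match. The sketch below is a rough idea under stated assumptions: validity_report is a hypothetical helper (not part of SCML), and the two stub classes only imitate an is_valid_awi method that accepts a raise_on_failure keyword, as seen in the call above. How the real method behaves with raise_on_failure=False is something I have not checked, so the helper also catches exceptions.

```python
from types import SimpleNamespace


def validity_report(contexts, awi, **kwargs):
    """Check every context against `awi` and return (index, class name, valid, error) tuples.

    Hypothetical helper: extra keyword arguments are forwarded to `is_valid_awi` unchanged.
    """
    report = []
    for i, ctx in enumerate(contexts):
        try:
            ok = bool(ctx.is_valid_awi(awi, **kwargs))
            err = None
        except Exception as exc:  # a failing check may raise instead of returning False
            ok, err = False, repr(exc)
        report.append((i, type(ctx).__name__, ok, err))
    return report


class AcceptsAll:
    """Stub context (illustration only) that accepts any AWI."""

    def is_valid_awi(self, awi, raise_on_failure=False):
        return True


class AcceptsLevel:
    """Stub context (illustration only) that accepts a single production level."""

    def __init__(self, level):
        self.level = level

    def is_valid_awi(self, awi, raise_on_failure=False):
        if awi.level == self.level:
            return True
        if raise_on_failure:
            raise ValueError(f"expected level {self.level}, got {awi.level}")
        return False


awi = SimpleNamespace(level=1)  # stand-in for an AWI at production level 1
for row in validity_report([AcceptsAll(), AcceptsLevel(0), AcceptsLevel(1)], awi):
    print(row)
```

Calling a helper like this from context_switch, with the contexts of the action and observation managers, would show whether more than one context reports itself valid for a level-1 agent, which is the situation in which the order of the models decides the outcome.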
(For privacy, I have removed identifiable user and path information from the log below.)
Sample log output
production level:0
Agent 00My@0 is using model 0 with action manager SupplierContext and observation manager SupplierContext
production level:1
Agent 06My@1 is using model 0 with action manager SupplierContext and observation manager SupplierContext
requirements.txt