Add Iterative Predictor for Improved SWE-bench Issue Resolution #1397

### Is this a new feature, an improvement, or a change to existing functionality?

Improvement

### How would you describe the priority of this feature request

Medium

### Please provide a clear description of problem this feature solves

The current `full` predictor in the `swe_bench` evaluation example uses a **one-shot generation** approach, which is not robust enough for complex SWE-bench tasks. In my evaluation of 8 instances drawn from major SWE-bench projects (including SymPy, Astropy, Django, and Matplotlib), the success rate was **0%**.

The current predictor has the following core limitations:

- Generates fixes without running tests to validate them
- Lacks feedback loops to refine solutions based on execution errors
- Cannot recover from failures or adjust strategies
- Relies on static code analysis without dynamic execution feedback

The evaluation command and its summary output:

```bash
nat eval --config_file examples/evaluation_and_profiling/swe_bench/configs/config_full.yml
```
```text
=== EVALUATION SUMMARY ===
Workflow Status: COMPLETED (workflow_output.json)
Total Runtime: 132.62s

Per evaluator results:
| Evaluator   |   Avg Score | Output File           |
|-------------|-------------|-----------------------|
| swe_bench   |           0 | swe_bench_output.json |
```

### Describe your ideal solution

I propose implementing an **Iterative Predictor** that introduces a dynamic feedback loop into the SWE-bench resolution process. This would move the agent from a "one-shot" model to a "reason-action-observation" model.

### Key Components of the Solution:

- **Step-by-step execution:** Executes commands incrementally and observes results
- **Test-driven validation:** Runs tests after each fix attempt and uses failure signals to guide refinement
- **Error recovery:** Handles failures gracefully with retry mechanisms and strategy adjustments
- **Dynamic feedback:** Uses runtime errors, test outputs, and execution results instead of static analysis
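
To make the components above concrete, here is a minimal sketch of the reason-action-observation loop. It is an illustration under stated assumptions, not the planned implementation: `iterative_predict`, `Observation`, `IterativeResult`, and the `policy` and `execute` callables are all hypothetical names and are not part of the toolkit's API. In a real predictor, `policy` would wrap an LLM call and `execute` would run commands in the repository environment.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Observation:
    """One executed command and what came back from it."""
    command: str
    exit_code: int
    output: str


@dataclass
class IterativeResult:
    resolved: bool
    steps: int  # number of actions the policy took
    history: List[Observation] = field(default_factory=list)


# Hypothetical interfaces (assumptions made for this sketch):
# - Policy: given the issue text and all observations so far, returns the
#   next command to run, or None to give up.
# - Executor: runs a command and returns (exit_code, combined_output).
Policy = Callable[[str, List[Observation]], Optional[str]]
Executor = Callable[[str], Tuple[int, str]]


def iterative_predict(
    issue: str,
    policy: Policy,
    execute: Executor,
    test_command: str,
    max_steps: int = 10,
) -> IterativeResult:
    history: List[Observation] = []
    steps = 0
    while steps < max_steps:
        # Reason: choose the next action from the issue and prior feedback.
        command = policy(issue, history)
        if command is None:
            break
        steps += 1

        # Act, then observe: record the result so the policy can use it.
        exit_code, output = execute(command)
        history.append(Observation(command, exit_code, output))

        # Error recovery: a failed command stays in the history, and the
        # policy gets another turn to adjust its strategy.
        if exit_code != 0:
            continue

        # Test-driven validation: run the tests after each successful action.
        test_code, test_output = execute(test_command)
        history.append(Observation(test_command, test_code, test_output))
        if test_code == 0:
            return IterativeResult(True, steps, history)

    return IterativeResult(False, steps, history)
```

Because the test output is appended to `history` before the next call to `policy`, a failing test run directly informs the next fix attempt. `max_steps` bounds the loop so an unresolved instance terminates.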

### Additional context

I plan to implement this iterative predictor myself. It will extend the `SweBenchPredictorBase` class and reuse the existing environment interaction logic to stay consistent with the current framework. Once the implementation is verified, I will submit a PR for review.

### Code of Conduct

- [x] I agree to follow this project's Code of Conduct
- [x] I have searched the [open feature requests](https://github.com/NVIDIA/NeMo-Agent-Toolkit/issues?q=is%3Aopen+is%3Aissue+label%3A%22feature+request%22%2Cimprovement%2Cenhancement) and have found no duplicates for this feature request
