| 0 | 0 | ◆ BASELINE | 0.0000 | 0.0000 | — | — | — | — | 2026-05-23 08:54:35 | — | Baseline — no changes made |
| 1 | 1 | ▼ REVERTED | 0.0000 | 0.0000 | — | — | — | — | 2026-05-23 08:55:53 | experiment/iter-1-20260523-085542 | Fixed YAML parse error by removing the colon after "reason" in the system prompt to ensure valid YAML syntax. |
| 2 | 0 | ◆ BASELINE | 0.0000 | 0.0000 | — | — | — | — | 2026-05-23 09:12:11 | — | Baseline — no changes made |
| 3 | 1 | ▼ REVERTED | 0.0000 | 0.0000 | — | — | — | — | 2026-05-23 09:12:29 | experiment/iter-1-20260523-091217 | Simplified the user_prompt_template by removing the complex multi-route selection logic and hardcoded domain-specific constraints to reduce prompt confusion and improve JSON output consistency across eval types. |
| 4 | 0 | ◆ BASELINE | 0.0000 | 0.0000 | — | — | — | — | 2026-05-23 09:14:49 | — | Baseline — no changes made |
| 5 | 1 | ▼ REVERTED | 0.0000 | 0.0000 | — | — | — | — | 2026-05-23 09:15:07 | experiment/iter-1-20260523-091456 | Simplified the complex multi-route selection logic in user_prompt_template by removing all domain-specific catalog routing, hard constraints, and detailed selection rules, replacing them with a concise general recommendation strategy focused on relevance and complementarity to reduce prompt confusion and improve JSON consistency across all eval types. |
| 6 | 0 | ◆ BASELINE | 0.7500 | 0.7500 | 0.7500 | 0.2000 | 1.0000 | 1.0000 | 2026-05-23 09:21:41 | — | Baseline — no changes made |
| 7 | 1 | ▼ REVERTED | 0.7500 | 0.7500 | 0.7500 | 0.4000 | 0.9977 | 0.0667 | 2026-05-23 09:24:33 | experiment/iter-1-20260523-092149 | Simplified the user_prompt_template by removing the complex multi-route selection logic and hardcoded domain-specific constraints to reduce prompt confusion and improve JSON output consistency across eval types. |
| 8 | 0 | ◆ BASELINE | 1.0000 | 1.0000 | 0.7500 | 0.4000 | 0.9947 | 0.9333 | 2026-05-23 13:34:51 | — | Baseline — no changes made |
| 9 | 1 | ▼ REVERTED | 1.0000 | 1.0000 | 0.7500 | 0.4000 | 0.9935 | 0.8667 | 2026-05-23 13:37:38 | experiment/iter-1-20260523-133456 | Simplified user_prompt_template by removing the entire multi-route selection logic, domain-specific catalog routing, and hardcoded constraints to reduce prompt confusion and improve JSON output consistency across all eval types. |