Add counterfactual dataset #2119
CI.yml
on: pull_request
select-category (3s)
lint-and-test (16s)
Matrix: mock-evaluation
summarize-results / Results (37s)
Annotations
5 errors and 4 warnings
lint-and-test
Process completed with exit code 1.

ruff (UP042):
src/bcbench/analysis/family.py#L16
src/bcbench/analysis/family.py:16:7: UP042 Class FamilyType inherits from both `str` and `enum.Enum`
help: Inherit from `enum.StrEnum`

ruff (ANN003):
evaluator/counterfactual_scores.py#L17
evaluator/counterfactual_scores.py:17:43: ANN003 Missing type annotation for `**kwargs`

ruff (ANN003):
evaluator/counterfactual_scores.py#L12
evaluator/counterfactual_scores.py:12:43: ANN003 Missing type annotation for `**kwargs`

ruff (ANN003):
evaluator/counterfactual_scores.py#L7
evaluator/counterfactual_scores.py:7:43: ANN003 Missing type annotation for `**kwargs`

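The three ANN003 errors are resolved by annotating `**kwargs` in each flagged signature. The annotation describes the type of each keyword value, not the dict as a whole. The sketch below uses a hypothetical scorer, since the real functions in `counterfactual_scores.py` are not shown; `object` is used here as a conservative choice, and `typing.Any` also satisfies ANN003 but may be flagged if the project enables ANN401.

```python
def exact_match(prediction: str, reference: str, **kwargs: object) -> float:
    """Hypothetical scorer: extra keyword arguments are accepted and ignored."""
    # Compare after trimming surrounding whitespace.
    return 1.0 if prediction.strip() == reference.strip() else 0.0
```

If the extra arguments are known to share a narrower type (for example `str`), that type is a better annotation than `object`.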
bcbench.results.base
Result for microsoftInternal__NAV-224668__cf-1 missing metrics: execution_time, llm_duration, turn_count, prompt_tokens, completion_tokens, tool_usage

bcbench.results.base
Creating result for microsoft__BCApps-4699__cf-1 with no agent metrics - performance data will be unavailable

bcbench.results.base
Result for microsoftInternal__NAV-203923__cf-1 missing metrics: execution_time, llm_duration, turn_count, prompt_tokens, completion_tokens, tool_usage

bcbench.results.base
Creating result for microsoftInternal__NAV-175765__cf-1 with no agent metrics - performance data will be unavailable

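The warnings above name six metric fields that were absent from the mock-evaluation results. As a rough illustration of how such a check could work, the sketch below reports which expected fields are missing from a result. The metric names come from the warning text; `EXPECTED_METRICS`, `missing_metrics`, and the dict-based result shape are assumptions, not the actual `bcbench.results.base` code.

```python
# Metric names as listed in the warning messages.
EXPECTED_METRICS = (
    "execution_time",
    "llm_duration",
    "turn_count",
    "prompt_tokens",
    "completion_tokens",
    "tool_usage",
)


def missing_metrics(result: dict[str, object]) -> list[str]:
    """Return the expected metric names that are absent or None (hypothetical helper)."""
    return [name for name in EXPECTED_METRICS if result.get(name) is None]
```

A mock run that records no agent metrics would report all six names, which matches the warnings emitted for the `__cf-1` entries.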
Artifacts
Produced during runtime
| Name | Status | Size | Digest |
|---|---|---|---|
| evaluation-summary | Expired | 527 Bytes | sha256:7b1b89f71dc895a9dbbfe0ef0948fd66e7f509f2401b4db77805ed2ff0b2b912 |
| microsoftInternal__NAV-175765__cf-1 | Expired | 470 Bytes | sha256:09d8b808a0d816c8fd7bf3b9fdf687646e7b32c938ef9e6be7abec0ab2b0ea31 |
| microsoftInternal__NAV-203923__cf-1 | Expired | 527 Bytes | sha256:15999c12201789a89505f984fbd76ef679b2e6a6c8f732c006a807a14adebd60 |
| microsoftInternal__NAV-224668__cf-1 | Expired | 526 Bytes | sha256:c56425eac079902519a83ed427bb0aa120a7700b8af3c3713f9511c62d0ac154 |
| microsoft__BCApps-4699__cf-1 | Expired | 400 Bytes | sha256:dbd02ff2e1e5b006d4e9fee9be7fe0dbcbb7389ea374c7b2fd1fbfaab2d2b1a1 |