Dataset Construction
TSRBench combines the open UCR-R benchmark and the new industrial CU-RCA dataset. This page explains the UCR-R reconstruction logic, surfaces a traceability example, and summarizes the key facts that define both benchmarks.
Our goal is to make retrieval ground truth more reliable while preserving retrieval difficulty.
Reduce retrieval label noise
Raw classification labels can include morphology-inconsistent instances or highly ambiguous cross-class matches. UCR-R removes these cases before retrieval benchmarking.
Keep semantic and morphology alignment
Retained positives should agree in both semantic meaning and morphology, so retrieval labels match the intended TSR task.
Preserve realistic challenge
The filtering removes misleading labels rather than all variation. The remaining benchmark still contains non-trivial shape diversity and realistic retrieval difficulty.
UCR-R is reconstructed in two stages so that retrieval labels better match morphology and physical meaning.
Stage 1
Within each original UCR class, annotators identify the dominant morphology pattern and remove instances that clearly deviate from the class's core shape.
Stage 2
After intraclass filtering, if two original classes remain nearly indistinguishable in morphology, only one of them is retained to construct the retrieval pool.
Why this figure is schematic
The Stage 2 illustration is an explanatory schematic. It clarifies the rule: when two original classes have nearly indistinguishable morphology, they are not treated as clean retrieval labels.
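The two stages can be sketched as a filter over recorded decisions. This is a minimal illustration, not the UCR-R tooling: the function name, the `(instance_id, class_label)` layout, and the toy data are all hypothetical, and the morphology judgments themselves are assumed to come from annotators rather than from code.

```python
from collections import defaultdict


def build_retrieval_pool(instances, rejected_ids, excluded_classes):
    """Apply recorded reconstruction decisions to raw labelled instances.

    instances: iterable of (instance_id, class_label) pairs (hypothetical layout).
    rejected_ids: instance ids that annotators removed in Stage 1.
    excluded_classes: class labels dropped in Stage 2.
    """
    pool = defaultdict(list)
    for instance_id, label in instances:
        if label in excluded_classes:    # Stage 2: class-level exclusion
            continue
        if instance_id in rejected_ids:  # Stage 1: instance-level filtering
            continue
        pool[label].append(instance_id)
    return dict(pool)


# Toy data (made up): class 1 loses one deviating instance, class 3 is excluded.
raw = [("a1", 1), ("a2", 1), ("a3", 1), ("b1", 2), ("b2", 2), ("c1", 3), ("c2", 3)]
pool = build_retrieval_pool(raw, rejected_ids={"a3"}, excluded_classes={3})
print(pool)  # {1: ['a1', 'a2'], 2: ['b1', 'b2']}
```

Keeping the decisions as plain data (a set of rejected ids and a set of excluded classes) means the same records can drive both the pool construction and an audit table.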
A real reconstruction case and a compact audit excerpt show how reconstruction decisions are surfaced on the website.
Real reconstruction case
In this real-world reconstruction case, the first image below shows five curves from the same class in the initial set. The curve marked in red deviates significantly in its latter half and is therefore removed; the remaining curves, which share a consistent shape, are retained to construct the retrieval pool. The second image shows curves from two other classes whose main shapes are similar, so they are removed entirely.


Traceability example
| Dataset | Original class | Decision stage | Decision | Raw class size | Retained in UCR-R |
|---|---|---|---|---|---|
| BME | 1 | Instance-level filtering | Keep | 60 | 19 |
| BME | 2 | Fully retained | Keep | 60 | 60 |
| BME | 3 | Class-level exclusion | Remove | 60 | 0 |
| Beef | 1 | Class-level exclusion | Remove | 12 | 0 |
| Beef | 2 | Fully retained | Keep | 12 | 12 |
| CBF | 1 | Instance-level filtering | Keep | 310 | 40 |
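As a quick check on the excerpt, the per-class retention rate is the retained count divided by the raw class size. The sketch below recomputes it from the six rows shown above; the tuple layout and the `retention` helper are illustrative assumptions, not part of the benchmark code.

```python
# Rows copied from the traceability excerpt:
# (dataset, original class, raw class size, retained in UCR-R)
rows = [
    ("BME", 1, 60, 19),
    ("BME", 2, 60, 60),
    ("BME", 3, 60, 0),
    ("Beef", 1, 12, 0),
    ("Beef", 2, 12, 12),
    ("CBF", 1, 310, 40),
]


def retention(raw_size, retained):
    """Fraction of a raw class that survives reconstruction."""
    return retained / raw_size


for dataset, cls, raw_size, retained in rows:
    rate = retention(raw_size, retained)
    print(f"{dataset} class {cls}: {retained}/{raw_size} = {rate:.1%}")

# Totals over the excerpt only, not over the full benchmark.
total_raw = sum(r[2] for r in rows)
total_retained = sum(r[3] for r in rows)
print(f"excerpt total: {total_retained}/{total_raw}")
```

For these rows, BME class 1 keeps 19 of 60 instances (about 31.7%) and CBF class 1 keeps 40 of 310 (about 12.9%), which shows how strongly instance-level filtering can shrink a class.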
TSRBench uses two complementary datasets: one open benchmark for shape-consistent retrieval and one new industrial benchmark for incident-centric RCA retrieval.
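| Dataset | Statistic | Value |
|---|---|---|
| UCR-R (open retrieval benchmark) | Reconstructed classes | 46 |
| UCR-R (open retrieval benchmark) | Time series | 2,600 |
| CU-RCA (new industrial dataset) | Length per raw long series | 11,520 |
| CU-RCA (new industrial dataset) | Incident windows | 103 |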
Why both datasets matter
UCR-R emphasizes open, shape-consistent retrieval, while CU-RCA emphasizes noisy industrial telemetry, incident-centric tasks, and RCA utility.