ZHU Zhibin, ZHANG Jinming
Slab number recognition in heavy-plate production lines operates under extreme conditions such as high temperature, dust, strong glare, and occlusion. Irregular chalk handwriting introduces substantial domain shifts, causing generic OCR systems to miss or misclassify hard samples, generalize poorly across plants, and lead to high costs for fine-grained annotations. To address these challenges, this paper proposes an industrial-constraint-compatible two-stage pipeline. In the training phase, a two-level detector automatically crops slab-side regions and character boxes. This paper introduces a vertical projection prior to enhance the separability of faint strokes and connected handwriting. Then, a closed loop of self-supervised classification, core sample selection, and self-training update iteratively optimizes the ViT classification head. In the inference phase, this paper reuse the slab-side/character-box detector and the classifier, and impose sequence constraints using process-specific rules and the spatial ordering of character boxes. A sequence-level multimodal post-processing module subsequently performs error correction, missing-position completion, and candidate re-ranking to output structured results. Importantly, this post-processing module only consumes structured inputs produced by the front-end (character candidates, confidence scores, spatial order, and process rules); it does not operate on image pixels and does not replace visual recognition. When front-end candidates are insufficient due to detection misses, it outputs ″review required/reject release″ rather than hallucinating characters. This paper evaluates on real production-line subsets covering occlusion, smearing, glare, low illumination, and tilt. Compared with generic OCR and small-sample supervised baselines, the proposed method achieves significant gains at both character-level and sequence-level metrics, with notably fewer misses on the hardest subsets. Under edge-compute budgets, the approach jointly balances resolution, throughput, and confidence calibration, reduces manual annotation effort by about 70%, and measurably lowers manual transcription/verification workload and misloading risk at the reheating-furnace stage. The same paradigm can be extended to other metallurgical identifiers, such as continuous casting numbers, coil IDs, and furnace IDs.