VHSMarker

A High-Precision Annotation Tool for Canine Cardiac key point Detection and VHS Estimation

VHSMarker Interface

Abstract

We present VHSMarker, a web-based human-computer interaction annotation tool designed to streamline the creation of high-quality ground truth data for canine cardiac analysis. VHSMarker provides an intuitive interface for precise labeling of six critical cardiac key points in dog heart X-ray images, significantly reducing annotation time and improving consistency. It also supports real-time vertebral heart score (VHS) calculation as points are placed, enables predictions using pretrained models, and allows side-by-side comparison of ground truth and model predictions for performance evaluation. In addition, we introduce the canine cardiac key point (CCK) dataset, a meticulously curated collection of 21,465 annotated X-ray images for accurate VHS prediction. To facilitate labeling, we developed MambaVHS (Mamba-Enhanced Vertebral Heart Score Detection), a custom model that integrates Mamba blocks for efficient long-range sequence modeling alongside complementary convolutional components for precise spatial feature extraction. Together, this comprehensive framework sets a new benchmark for canine cardiology research.

Key Contributions

CCK Dataset

21,465 annotated canine thoracic radiographs with comprehensive anatomical coverage and clinical diversity.

  • Diverse breeds and conditions
  • Standardized annotations
  • Rich metadata

VHSMarker Tool

Web-based annotation platform for precise labeling of canine cardiac keypoints with real-time VHS calculation and model comparison capabilities.

  • Intuitive human-computer interface
  • Six anatomical keypoint labeling
  • Automated data export

MambaVHS Model

Hybrid architecture combining Mamba blocks for sequence modeling with convolutional networks for spatial feature extraction.

  • State-of-the-art accuracy
  • Efficient long-range modeling
  • Real-time inference

Canine Cardiac Keypoint (CCK) Dataset

The CCK dataset is a meticulously curated collection of 21,465 annotated canine thoracic radiographs, capturing a diverse range of anatomical variations and clinical conditions.

Dataset Features

  • 21,465 images with six precisely annotated cardiac keypoints each
  • Standardized lateral radiographs only for consistent VHS measurement
  • Multiple export formats (JSON, CSV, .mat) for flexibility
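
As a rough illustration of the JSON and CSV export formats listed above, the sketch below serializes one annotation record both ways. The record layout, the file name, and the labels A to F are assumptions made for this example; the tool's actual export schema is not described here. A .mat export would additionally need a third-party library such as SciPy and is left out.

```python
import csv
import io
import json

# Hypothetical keypoint labels and record layout, used for illustration only.
KEYPOINT_LABELS = ["A", "B", "C", "D", "E", "F"]


def to_json(records):
    """Serialize annotation records to a JSON string."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Flatten each record to one CSV row: image, A_x, A_y, ..., F_x, F_y."""
    header = ["image"]
    for label in KEYPOINT_LABELS:
        header += [f"{label}_x", f"{label}_y"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for record in records:
        row = [record["image"]]
        for label in KEYPOINT_LABELS:
            x, y = record["keypoints"][label]
            row += [x, y]
        writer.writerow(row)
    return buffer.getvalue()


# One made-up record with six (x, y) keypoints in pixel coordinates.
example = [{
    "image": "example_0001.png",
    "keypoints": {
        "A": [120.0, 80.0], "B": [200.0, 260.0], "C": [90.0, 190.0],
        "D": [230.0, 150.0], "E": [60.0, 40.0], "F": [260.0, 40.0],
    },
}]

json_text = to_json(example)
csv_text = to_csv(example)
```

The CSV form trades nesting for one flat row per image, which is convenient for spreadsheets, while the JSON form keeps the keypoints grouped by label.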

Demographic Information

Table 1: Sex Distribution

Sex      Count
Female   7941
Male     4395
Unknown  49
Total    12385

Table 2: Age Group Distribution

Age Group (years)  Count
0 to 5             2961
6 to 11            6272
12 to 17           2827
18 to 30           86
Unknown            239
Total              12385

CCK Dataset Sample Images

Sample radiographs from the CCK dataset with annotations

Table 3: Dataset Distribution

Split       Number of Images  Percentage
Training    15,026            70%
Validation  2,155             10%
Testing     4,275             20%
Total       21,465            100%

Breed Information

The CCK dataset includes a diverse range of breeds, ensuring comprehensive coverage of canine cardiac anatomy.

# Breed Count
1 Mixed Dog 1256
2 Labrador Retriever 479
3 Golden Retriever 191
4 German Shepherd 164
5 Chihuahua 100
6 Boxer 79
7 Shih Tzu 77
8 Yorkshire Terrier 77
9 French Bulldog 76
10 English Bulldog 72
11 Canine, Nos 67
12 Miniature Poodle 62
13 Siberian Husky 61
14 Border Collie 57
15 Beagle Hound 56
16 Pomeranian 52
17 Cavalier King Chas S… 51
18 Pug 48
19 Boston Terrier 44
20 Jack Russell Terrier 44
21 Maltese 44
22 Australian Shepherd 42
23 Shetland Sheepdog 42
24 Rottweiler 41
25 English Cocker Spani… 35
26 Great Dane 34
27 Bernese Mountain Dog 32
28 Miniature Schnauzer 32
29 Cock-A-Poo 31
30 Irish Wolfhound 9
31 Swiss Mountain Dog 4
32 English Shepherd 4
33 Nova Scotia Duck Tol… 4
34 Saluki 4
35 Italian Greyhound 4
36 Flat-Coated Retriever 4
37 Shiba Inu 4
38 Smooth Miniature Dach… 23
39 English Setter 21
40 Australian Cattle Dog 20
41 Toy Poodle 20
42 Chinese Sharpei 20
43 Bichon Frise 19
44 American Bulldog 18
45 Pembroke Welsh Corgi 18
46 West Highland Terrier 18
47 Rhodesian Ridgeback 18
48 English Springer Spa… 17
49 American Staffordshi… 17
50 Miniature Pinscher 16
51 Brittany Spaniel 16
52 Long-Haired Std Dach… 14
53 German Short-Haired … 14
54 Terrier, Nos 14
55 Basset Hound 13
56 Newfoundland 13
57 Bull Mastiff 12
58 Long-Haired Mini Dac… 12
59 Bulldog, Nos 12
60 Belgian Malinois 12
61 Lhasa Apso 11
62 Greyhound 11
63 Bull Terrier 10
64 Irish Setter 10
65 Catahoula Leopard Dog 9
66 Saint Bernard 9
67 Treeing Walker Coonh… 4
68 Bloodhound 3
69 Chinese Crested 3
70 American Foxhound 3
71 Tibetan Terrier 3
72 Neapolitan Mastiff 3
73 Australian Heeler 2
74 Red Bone Hound 8
75 Samoyed 7
76 Chesapeake Bay Retri… 7
77 Vizsla 7
78 Smooth Standard Dach… 7
79 American Pit Bull Te… 7
80 Whippet 7
81 Akita 6
82 Leonberger 6
83 Schipperke 6
84 American Eskimo Dog 6
85 Mexican Hairless 6
86 Coonhound 5
87 English Mastiff 5
88 Silky Terrier 5
89 German Wire-Haired P… 5
90 Weimaraner 5
91 Papillon 5
92 Scottish Terrier 5
93 Staffordshire Bull T… 5
94 Mastiff, Nos 5
95 Hound, Nos 5
96 Keeshond 5
97 Giant Schnauzer 4
98 Airedale Terrier 4
99 Coton De Tulear 4
100 Cocker Spaniel, Nos 9
101 Cairn Terrier 9
102 Rat Terrier 9
103 Spinone Italiano 2
104 Briard 2
105 Old English Sheepdog 2
106 Borzoi 2
107 Alaskan Malamute 2
108 Norwegian Elkhound 2
109 German Long-Haired P… 2
110 Affenpinscher 2
111 Toy Manchester Terri… 2
112 Clumber Spaniel 2
113 Standard Schnauzer 2
114 Irish Water Spaniel 1
115 Shiloh Shepherd 1
116 Cardigan Welsh Corgi 1
117 American Bully 1
118 Japanese Chin 1
119 English Coonhound 1
120 Border Terrier 1
121 Setter, Nos 1
122 Tibetan Spaniel 1
123 American Cocker Span… 1
124 Australian Terrier 1
125 Welsh Terrier 1
126 Norfolk Terrier 1
127 Dalmatian 1
128 Pharaoh Hound 1
129 Springer Spaniel 1
130 Silken Windsprite 1
131 Wirehaired Standard … 1
132 Retriever, Nos 1
133 Soft-Coated Wheaten … 1
134 Maremma Sheepdog 1
135 Standard Poodle 31
136 Havanese 30
137 Dachshund, Nos 28
138 Collie, Nos 9
139 Peke-A-Poo 2
140 Anatolian Shepherd 2
141 Wirehaired Pointing … 2
142 Doberman Pinscher 26
143 Labradoodle 26
144 Great Pyrenees 24
145 Cane Corso 8
146 Unknown 8039
Total 12385

Table 4: Breed-wise distribution of annotated canine X-ray images in the CCK dataset, detailing the number of samples per breed for demographic and diversity analysis.

VHSMarker Annotation Tool

A specialized web-based platform for precise canine cardiac analysis, combining clinical expertise with advanced computer vision capabilities.

Technical Architecture

  • Backend: Flask (Python) with OpenCV for image processing
  • Frontend: HTML5 Canvas + JavaScript for responsive annotation
  • Real-time API: real-time communication for instant VHS calculation
  • ML Integration: ONNX runtime for model inference
  • Versioning: Git-like annotation history tracking

VHSMarker's design rests on three core elements. Smart Annotation supports labeling of six anatomical keypoints (cardiac apex, tracheal bifurcation, and vertebral reference points) with pixel-level accuracy. Real-Time Analysis computes the Vertebral Heart Score as points are placed, using the formula VHS = 6 × (AB + CD)/EF, where AB is the long axis (cardiac apex to tracheal bifurcation), CD is the short axis (perpendicular width), and EF is the vertebral reference length. The hybrid web architecture combines Flask backend processing with HTML5 Canvas frontend interactivity, delivering annotation times of 10-12 seconds per image while maintaining sub-pixel precision in measurements.
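The VHS formula can be sketched in a few lines of Python. This is a minimal illustration, assuming each keypoint is an (x, y) pixel coordinate and that AB, CD, and EF are plain Euclidean distances between the paired points; the function names are made up for this example and are not the tool's API.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def vertebral_heart_score(a, b, c, d, e, f):
    """Compute VHS = 6 * (AB + CD) / EF from six keypoints.

    AB is the long axis, CD the short axis, and EF the vertebral
    reference length, as defined in the text above.
    """
    ab = distance(a, b)
    cd = distance(c, d)
    ef = distance(e, f)
    if ef == 0:
        raise ValueError("E and F coincide; vertebral reference length is zero")
    return 6 * (ab + cd) / ef


# Made-up coordinates: AB = 50, CD = 40, EF = 60, so VHS = 6 * 90 / 60.
vhs = vertebral_heart_score(
    a=(0, 0), b=(0, 50), c=(-20, 25), d=(20, 25), e=(0, 0), f=(60, 0)
)
```

Because the score is a ratio of lengths measured in the same image, it does not depend on the pixel spacing of the radiograph.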

Tool Demonstration

Video 1: Keypoint placement (A-F), real-time VHS calculation, and model prediction comparison

Video 2: Brightness and contrast adjustment

MambaVHS Model

The MambaVHS model is a hybrid architecture that combines Mamba blocks for efficient long-range sequence modeling with complementary convolutional components for precise spatial feature extraction. This innovative design enables the model to achieve state-of-the-art accuracy in canine cardiac key point detection and VHS estimation.

Model Features

  • State-of-the-art accuracy in canine cardiac analysis
  • Efficient long-range modeling with Mamba blocks
  • Real-time inference capabilities

Figure: MambaVHS Model Architecture

Mamba Stem

The initial feature extractor uses two 3×3 convolutional layers with SiLU activation and stride-2 downsampling. This reduces spatial dimensions while preserving critical cardiac boundaries, forming a robust foundation for subsequent processing stages.
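To make the downsampling concrete, the sketch below works through the output-size arithmetic for a stride-2 3×3 convolution, together with the SiLU activation. It assumes a padding of 1 and that both stem layers use stride 2; neither detail is stated above, and the 512-pixel input size is only an example.

```python
import math


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_output_size(size, num_layers=2):
    """Apply the assumed stride-2 3x3 convolution num_layers times."""
    for _ in range(num_layers):
        size = conv_output_size(size)
    return size


# Under these assumptions a 512-pixel side shrinks to 256, then to 128.
after_one = conv_output_size(512)
after_stem = stem_output_size(512)
```

Under these assumptions the stem reduces each spatial dimension by a factor of four, so the later stages operate on one sixteenth as many positions.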

Mamba Stage

Four progressive stages combine residual blocks for local feature extraction with Mamba state-space layers for efficient long-range modeling. This hybrid approach captures both detailed cardiac structures and their relationships across thoracic vertebrae.

SE Layer & Regression Head

Squeeze-Excitation layers dynamically recalibrate channel-wise features, while the final regression head uses global average pooling and fully-connected layers to predict keypoints, with an inference time of 42 ms.
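The channel recalibration performed by a Squeeze-Excitation layer can be sketched in plain Python. This follows the generic squeeze, excite, and scale pattern with biases omitted for brevity; the weight shapes and values are placeholders and do not come from MambaVHS.

```python
import math


def squeeze_excite(features, w1, w2):
    """Recalibrate channels of a feature map.

    features: list of C channels, each an H x W list of lists.
    w1: R x C weights of the reduction layer (placeholder values).
    w2: C x R weights of the expansion layer (placeholder values).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features
    ]
    # Excite: linear -> ReLU -> linear -> sigmoid gives a gate in (0, 1).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch] for ch, g in zip(features, gates)]


# Two 2x2 channels; all-zero weights make every gate sigmoid(0) = 0.5.
feature_map = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
scaled = squeeze_excite(feature_map, w1=[[0.0, 0.0]], w2=[[0.0], [0.0]])
```

With learned weights the gates differ per channel, so informative channels are passed through nearly unchanged while others are attenuated.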

Experimental Results

This section presents the experimental evaluation of the VHSMarker framework for vertebral heart score (VHS) estimation from canine thoracic radiographs. The primary evaluation metric is test accuracy, based on VHS classification into three clinically significant categories: normal heart size (VHS < 8.2), borderline cardiomegaly (8.2 ≤ VHS ≤ 10), and severe cardiomegaly (VHS > 10). These thresholds provide a clear basis for assessing the presence and severity of cardiomegaly, which is critical for accurate veterinary diagnostics.
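The three-way categorization and the resulting accuracy metric can be sketched as follows. The thresholds are the ones given above; the category strings and the assumption that accuracy is the fraction of images whose predicted and ground-truth VHS fall in the same category are illustrative, not a description of the authors' evaluation code.

```python
def vhs_category(vhs):
    """Map a VHS value to one of the three categories used for evaluation."""
    if vhs < 8.2:
        return "normal"
    if vhs <= 10:
        return "borderline"
    return "severe"


def classification_accuracy(predicted, actual):
    """Fraction of samples whose predicted and true VHS share a category."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        vhs_category(p) == vhs_category(t) for p, t in zip(predicted, actual)
    )
    return matches / len(actual)


# Made-up values: the first two pairs agree in category, the last two do not.
accuracy = classification_accuracy(
    predicted=[8.0, 9.5, 10.5, 8.3], actual=[7.9, 9.0, 9.9, 8.1]
)
```

A metric of this kind penalizes only errors that cross a threshold, so a small VHS error near 8.2 or 10 counts as a miss while a larger error inside one category does not.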

Quantitative Results

MambaVHS is compared against baseline models on test accuracy:

Key findings: Table 5 compares the performance of various state-of-the-art models on the Canine Cardiac key point (CCK) Dataset. Notably, the proposed MambaVHS model achieves the highest test accuracy of 91.8%, demonstrating its effectiveness in precise key point localization and VHS estimation. This performance surpasses several well-established baselines, including ConvNeXt (89.4%), EfficientNetB7 (88.41%), and CDA (86.4%), highlighting the advantage of Mamba-based architectures in capturing complex anatomical structures.

Accuracy Comparison Chart

Table 5: Comparative accuracy across models (MambaVHS vs baselines)

Prediction Analysis

Visual comparisons from different models, including MambaVHS, ConvNeXt, EfficientNetB7, and CDA, on canine thoracic radiographs. MambaVHS consistently generates predictions closer to the actual VHS, particularly for less common cases with irregular thoracic structures and unusual imaging angles. This highlights its superior ability to capture long and short axes accurately, outperforming other models in challenging scenarios, making it a reliable choice for real-world veterinary diagnostics.

* Each sample overlays the ground truth and the model predictions in distinct colors.

VHS < 8.2: Prediction Samples
VHS ≥ 8.2 and ≤ 10: Prediction Samples
VHS > 10: Prediction Samples

The model demonstrates strong performance across challenging scenarios: it accurately classifies borderline VHS values near clinical thresholds (8.2 and 10), handles irregular anatomies including spinal deformities and thoracic abnormalities, and processes images in 42ms (A100 GPU) for real-time clinical workflow integration.

Ablation Study

Systematic evaluation of architectural components reveals each element's contribution:

Model Variant            Val Acc (%)  Test Acc (%)  Performance Impact
Without SE Layers        88.0         88.5          3.3% accuracy drop shows importance of channel attention
With L1 Loss Only        88.4         88.7          3.1% drop highlights VHSAwareLoss benefits
With Attention + MLP     80.1         84.7          7.1% gain from Mamba's selective scanning
Without Residual Blocks  82.0         84.5          7.3% drop demonstrates need for skip connections
Full MambaVHS            89.5         91.8          Optimal configuration

Table 6: Complete ablation study with performance deltas

The ablation study demonstrates the contribution of each architectural component, with the full MambaVHS model achieving 91.8% test accuracy. Removing SE layers reduced test accuracy by 3.3 percentage points, while using only L1 loss reduced it by 3.1 points. Replacing Mamba blocks with standard attention and an MLP resulted in a 7.1-point drop, highlighting the importance of efficient sequence modeling, and removing residual blocks caused the largest drop at 7.3 points. The complete architecture provides the best balance between accuracy and computational efficiency.
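The deltas reported in Table 6 are differences in test accuracy relative to the full model, in percentage points. A short check, using only the test accuracies from the table (the variable names are made up for this example):

```python
FULL_TEST_ACC = 91.8

# Test accuracy (%) of each ablated variant, copied from Table 6.
ABLATIONS = {
    "Without SE Layers": 88.5,
    "With L1 Loss Only": 88.7,
    "With Attention + MLP": 84.7,
    "Without Residual Blocks": 84.5,
}


def accuracy_drops(full, variants):
    """Drop in test accuracy (percentage points) of each variant vs. the full model."""
    return {name: round(full - acc, 1) for name, acc in variants.items()}


drops = accuracy_drops(FULL_TEST_ACC, ABLATIONS)
largest = max(drops, key=drops.get)
```

The computed drops match the 3.3, 3.1, 7.1, and 7.3 figures in the table, and the variant without residual blocks shows the largest one.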

The VHSMarker framework advances canine cardiac analysis through its efficient annotation tool (10-12s/image), comprehensive CCK dataset (21,465 images), and high-accuracy MambaVHS model (91.8% test accuracy, 42ms inference), establishing a new benchmark for automated veterinary diagnostics.