VHSMarker: A High-Precision Annotation Tool for Canine Cardiac key point Detection and VHS Estimation

Abstract

We present VHSMarker, a web-based human-computer interaction annotation tool designed to streamline the creation of high-quality ground truth data for canine cardiac analysis. VHSMarker provides an intuitive interface for precise labeling of six critical cardiac key points in dog heart X-ray images, significantly reducing annotation time and improving consistency. It also supports real-time vertebral heart score (VHS) calculation as points are placed, enables predictions using pretrained models, and allows side-by-side comparison of ground truth and model predictions for performance evaluation. In addition, we introduce the canine cardiac key point (CCK) dataset, a meticulously curated collection of 21,465 annotated X-ray images for accurate VHS prediction. To facilitate labeling, we developed MambaVHS (Mamba-Enhanced Vertebral Heart Score Detection), a custom model that integrates Mamba blocks for efficient long-range sequence modeling with complementary convolutional components for precise spatial feature extraction. Together, the tool, dataset, and model form a comprehensive framework that sets a new benchmark for canine cardiology research.

Canine Cardiac Keypoint (CCK) Dataset

The CCK dataset is a meticulously curated collection of 21,465 annotated canine thoracic radiographs, capturing a diverse range of anatomical variations and clinical conditions.

Dataset Features

21,465 images with six precisely annotated cardiac keypoints each
Standardized lateral radiographs only for consistent VHS measurement
Multiple export formats (JSON, CSV, .mat) for flexibility

Demographic Information

Table 1: Sex Distribution
Sex	Count
Female	7941
Male	4395
Unknown	49
Total	12385

Table 2: Age Group Distribution
Age Group (years)	Count
0 to 5	2961
6 to 11	6272
12 to 17	2827
18 to 30	86
Unknown	239
Total	12385

CCK Dataset Sample Images

Figure: Sample radiographs from the CCK dataset with annotations

Table 3: Dataset Distribution
Split	Number of Images	Percentage
Training	15,026	70%
Validation	2,155	10%
Testing	4,275	20%
Total	21,465	100%

Breed Information

The CCK dataset includes a diverse range of breeds, ensuring comprehensive coverage of canine cardiac anatomy.

#	Breed	Count
1	Mixed Dog	1256
2	Labrador Retriever	479
3	Golden Retriever	191
4	German Shepherd	164
5	Chihuahua	100
6	Boxer	79
7	Shih Tzu	77
8	Yorkshire Terrier	77
9	French Bulldog	76
10	English Bulldog	72
11	Canine, Nos	67
12	Miniature Poodle	62
13	Siberian Husky	61
14	Border Collie	57
15	Beagle Hound	56
16	Pomeranian	52
17	Cavalier King Chas S…	51
18	Pug	48
19	Boston Terrier	44
20	Jack Russell Terrier	44
21	Maltese	44
22	Australian Shepherd	42
23	Shetland Sheepdog	42
24	Rottweiler	41
25	English Cocker Spani…	35
26	Great Dane	34
27	Bernese Mountain Dog	32
28	Miniature Schnauzer	32
29	Cock-A-Poo	31
30	Irish Wolfhound	9
31	Swiss Mountain Dog	4
32	English Shepherd	4
33	Nova Scotia Duck Tol…	4
34	Saluki	4
35	Italian Greyhound	4
36	Flat-Coated Retriever	4
37	Shiba Inu	4
38	Smooth Minature Dach…	23
39	English Setter	21
40	Australian Cattle Dog	20
41	Toy Poodle	20
42	Chinese Sharpei	20
43	Bichon Frise	19
44	American Bulldog	18
45	Pembroke Welsh Corgi	18
46	West Highland Terrier	18
47	Rhodesian Ridgeback	18
48	English Springer Spa…	17
49	American Staffordshi…	17
50	Miniature Pinscher	16
51	Brittany Spaniel	16
52	Long-Haired Std Dach…	14
53	German Short-Haired …	14
54	Terrier, Nos	14
55	Basset Hound	13
56	Newfoundland	13
57	Bull Mastiff	12
58	Long-Haired Mini Dac…	12
59	Bulldog, Nos	12
60	Belgian Malinois	12
61	Lhasa Apso	11
62	Greyhound	11
63	Bull Terrier	10
64	Irish Setter	10
65	Catahula Leopard Dog	9
66	Saint Bernard	9
67	Treeing Walker Coonh…	4
68	Bloodhound	3
69	Chinese Crested	3
70	American Foxhound	3
71	Tibetan Terrier	3
72	Neapolitan Mastiff	3
73	Australian Heeler	2
74	Red Bone Hound	8
75	Samoyed	7
76	Chesapeake Bay Retri…	7
77	Vizsla	7
78	Smooth Standard Dach…	7
79	American Pit Bull Te…	7
80	Whippet	7
81	Akita	6
82	Leonberger	6
83	Schipperke	6
84	American Eskimo Dog	6
85	Mexican Hairless	6
86	Coonhound	5
87	English Mastiff	5
88	Silky Terrier	5
89	German Wire-Haired P…	5
90	Weimaraner	5
91	Papillon	5
92	Scottish Terrier	5
93	Staffordshire Bull T…	5
94	Mastiff, Nos	5
95	Hound, Nos	5
96	Keeshond	5
97	Giant Schnauzer	4
98	Airedale Terrier	4
99	Coton De Tulear	4
100	Cocker Spaniel, Nos	9
101	Cairn Terrier	9
102	Rat Terrier	9
103	Spinone Italiano	2
104	Briard	2
105	Old English Sheepdog	2
106	Borzoi	2
107	Alaskan Malamute	2
108	Norwegian Elkhound	2
109	German Long-Haired P…	2
110	Affenpinscher	2
111	Toy Manchester Terri…	2
112	Clumber Spaniel	2
113	Standard Schnauzer	2
114	Irish Water Spaniel	1
115	Shiloh Shepherd	1
116	Cardigan Welsh Corgi	1
117	American Bully	1
118	Japanese Chin	1
119	English Coonhound	1
120	Border Terrier	1
121	Setter, Nos	1
122	Tibetan Spaniel	1
123	American Cocker Span…	1
124	Australian Terrier	1
125	Welsh Terrier	1
126	Norfolk Terrier	1
127	Dalmatian	1
128	Pharaoh Hound	1
129	Springer Spaniel	1
130	Silken Windsprite	1
131	Wirehaired Standard …	1
132	Retriever, Nos	1
133	Soft-Coated Wheaten …	1
134	Maremma Sheepdog	1
135	Standard Poodle	31
136	Havanese	30
137	Dachshund, Nos	28
138	Collie, Nos	9
139	Peke-A-Poo	2
140	Anatolian Shepherd	2
141	Wirehaired Pointing …	2
142	Doberman Pinscher	26
143	Labradoodle	26
144	Great Pyrenees	24
145	Cane Corso	8
146	Unknown	8039
	Total	12385

Table 4: Breed-wise distribution of annotated canine X-ray images in the CCK dataset, detailing the number of samples per breed for demographic and diversity analysis.

The CCK dataset is available for download from Hugging Face and Zenodo.

VHSMarker Annotation Tool

A specialized web-based platform for precise canine cardiac analysis, combining clinical expertise with advanced computer vision capabilities.

Technical Architecture

Backend: Flask (Python) with OpenCV for image processing
Frontend: HTML5 Canvas + JavaScript for responsive annotation
Real-time API: real-time communication for instant VHS calculation
ML Integration: ONNX runtime for model inference
Versioning: Git-like annotation history tracking

VHSMarker supports canine cardiac analysis through three core features. Smart Annotation enables precise labeling of six anatomical keypoints (cardiac apex, tracheal bifurcation, and vertebral reference points) with pixel-level accuracy. Real-Time Analysis instantly computes the vertebral heart score using the formula VHS = 6 × (AB + CD)/EF, where AB is the long axis (cardiac apex to tracheal bifurcation), CD is the short axis (the cardiac width perpendicular to the long axis), and EF is the vertebral reference length. The hybrid web architecture combines Flask backend processing with HTML5 Canvas frontend interactivity, delivering annotation times of 10-12 seconds while maintaining sub-pixel precision in measurements.
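The VHS formula can be illustrated with a minimal Python sketch. The function name, the keypoint order (A to F), and the pixel coordinates are assumptions made for illustration; this is not the tool's actual code.

```python
import math


def vhs_from_keypoints(points):
    """Compute VHS = 6 * (AB + CD) / EF from six (x, y) keypoints.

    `points` is assumed to be ordered A, B, C, D, E, F, where A-B is the
    long axis, C-D the short axis, and E-F the vertebral reference segment.
    """
    a, b, c, d, e, f = points
    ab = math.dist(a, b)  # long-axis length
    cd = math.dist(c, d)  # short-axis length
    ef = math.dist(e, f)  # vertebral reference length
    if ef == 0:
        raise ValueError("E and F coincide; vertebral reference length is zero")
    return 6.0 * (ab + cd) / ef


# Hypothetical pixel coordinates: AB = 300, CD = 200, EF = 300.
example_points = [(0, 0), (0, 300), (-100, 150), (100, 150), (400, 0), (400, 300)]
print(round(vhs_from_keypoints(example_points), 2))  # 10.0
```

Because the score is a ratio of lengths, it does not change when all coordinates are scaled by the same factor, so it can be computed directly in pixel units.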

Tool Demonstration

Video 1: Keypoint placement (A-F), real-time VHS calculation, and model prediction comparison

Video 2: Brightness and contrast adjustment

MambaVHS Model

The MambaVHS model is a hybrid architecture that combines Mamba blocks for efficient long-range sequence modeling with complementary convolutional components for precise spatial feature extraction. With this design, the model reaches 91.8% test accuracy on the CCK dataset, the highest among the compared models for canine cardiac key point detection and VHS estimation.

Model Features

State-of-the-art accuracy in canine cardiac analysis
Efficient long-range modeling with Mamba blocks
Real-time inference capabilities

Figure: MambaVHS model architecture

Mamba Stem

The initial feature extractor uses two 3×3 convolutional layers with SiLU activation and stride-2 downsampling. This reduces spatial dimensions while preserving critical cardiac boundaries, forming a robust foundation for subsequent processing stages.
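The effect of the stem on spatial size can be sketched with the standard convolution output-size formula. The padding of 1, the choice of stride 2 for both layers, and the 512-pixel input are assumptions for illustration; the source does not specify them.

```python
import math


def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Standard output-size formula for a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_output_size(size, strides=(2, 2)):
    """Apply one 3x3 convolution per entry in `strides` (assumed padding 1)."""
    for stride in strides:
        size = conv_output_size(size, kernel=3, stride=stride, padding=1)
    return size


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


# Hypothetical 512-pixel input side; two stride-2 layers reduce it by 4x.
print(stem_output_size(512))                  # 128
print(stem_output_size(512, strides=(2, 1)))  # 256 if only one layer downsamples
```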

Mamba Stage

Four progressive stages combine residual blocks for local feature extraction with Mamba state-space layers for efficient long-range modeling. This hybrid approach captures both detailed cardiac structures and their relationships across thoracic vertebrae.

SE Layer & Regression Head

Squeeze-and-Excitation (SE) layers dynamically recalibrate channel-wise features, while the final regression head uses global average pooling and fully connected layers to predict the keypoints, with an inference time of 42 ms.
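The channel recalibration performed by an SE layer can be sketched in plain Python. This follows the generic squeeze-and-excitation pattern (global average pooling, a reduction layer with ReLU, an expansion layer with sigmoid, then channel scaling). The weights are hypothetical, biases are omitted, and the sketch is not the MambaVHS implementation.

```python
import math


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Recalibrate channels of `feature_maps` (a list of C 2-D lists).

    w_reduce: hidden x C weight rows; w_expand: C x hidden weight rows.
    Returns the rescaled feature maps and the per-channel gates.
    """
    # Squeeze: global average pooling gives one value per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: reduction layer with ReLU, expansion layer with sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, pooled))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return scaled, gates


# Hypothetical two-channel 2x2 feature map and hand-picked weights.
maps = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
scaled_maps, channel_gates = squeeze_excite(maps, [[0.5, 0.5]], [[1.0], [-1.0]])
print([round(g, 3) for g in channel_gates])  # first channel kept, second suppressed
```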

Experimental Results

This section presents the experimental evaluation of the VHSMarker framework for vertebral heart score (VHS) estimation from canine thoracic radiographs. The primary evaluation metric is test accuracy, based on VHS classification into three clinically significant categories: normal heart size (VHS < 8.2), borderline cardiomegaly (8.2 ≤ VHS ≤ 10), and severe cardiomegaly (VHS > 10). These thresholds provide a clear basis for assessing the presence and severity of cardiomegaly, which is critical for accurate veterinary diagnostics.
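The three-way classification and the resulting accuracy can be sketched as follows. The function names, the category labels, and the example VHS values are hypothetical; only the thresholds (8.2 and 10) come from the text.

```python
def vhs_category(vhs):
    """Map a VHS value to one of the three categories used for evaluation."""
    if vhs < 8.2:
        return "normal"
    if vhs <= 10:
        return "borderline"
    return "severe"


def category_accuracy(predicted_vhs, true_vhs):
    """Fraction of images whose predicted category matches the ground truth."""
    if not predicted_vhs or len(predicted_vhs) != len(true_vhs):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        vhs_category(p) == vhs_category(t) for p, t in zip(predicted_vhs, true_vhs)
    )
    return hits / len(predicted_vhs)


# Hypothetical values: the third prediction falls on the wrong side of 10.
predicted = [7.9, 8.2, 10.0, 10.4]
ground_truth = [8.1, 8.5, 10.1, 11.0]
print(category_accuracy(predicted, ground_truth))  # 0.75
```

Under this metric a prediction counts as correct when it lands in the same category as the ground truth, so small numeric errors matter most near the two thresholds.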

Quantitative Results

Key findings: Table 5 compares the performance of various state-of-the-art models on the Canine Cardiac Keypoint (CCK) dataset. The proposed MambaVHS model achieves the highest test accuracy, 91.8%, demonstrating its effectiveness in precise key point localization and VHS estimation. It surpasses several well-established baselines, including ConvNeXt (89.4%), EfficientNetB7 (88.41%), and CDA (86.4%), which highlights the advantage of Mamba-based architectures in capturing complex anatomical structures.

Table 5: Comparative accuracy across models (MambaVHS vs baselines)

Prediction Analysis

Visual comparisons of predictions from MambaVHS, ConvNeXt, EfficientNetB7, and CDA on canine thoracic radiographs show that MambaVHS consistently generates predictions closer to the ground-truth VHS, particularly for less common cases with irregular thoracic structures and unusual imaging angles. This reflects its ability to capture the long and short axes accurately in challenging scenarios, which makes it a reliable choice for real-world veterinary diagnostics.

* Ground truth is shown in red; predictions are shown in yellow.

VHS(<8.2)

VHS(>=8.2 and <=10)

VHS(>10)

The model performs well in challenging scenarios: it accurately classifies borderline VHS values near the clinical thresholds (8.2 and 10), handles irregular anatomies including spinal deformities and thoracic abnormalities, and processes images in 42 ms on an A100 GPU, which is fast enough for integration into real-time clinical workflows.

Ablation Study

Systematic evaluation of architectural components reveals each element's contribution:

Model Variant	Val Acc (%)	Test Acc (%)	Performance Impact
Without SE Layers	88.0	88.5	3.3-point test accuracy drop shows importance of channel attention
With L1 Loss Only	88.4	88.7	3.1-point drop highlights VHSAwareLoss benefits
With Attention + MLP	80.1	84.7	7.1-point gain from Mamba's selective scanning
Without Residual Blocks	82.0	84.5	7.3-point drop demonstrates need for skip connections
Full MambaVHS	89.5	91.8	Optimal configuration

Table 6: Complete ablation study with performance deltas

The ablation study shows that each architectural component contributes to the 91.8% test accuracy of the full MambaVHS model. Removing the SE layers reduced test accuracy by 3.3 percentage points, and training with L1 loss only reduced it by 3.1 points. Replacing the Mamba blocks with standard attention and an MLP caused a 7.1-point drop, highlighting the importance of efficient sequence modeling. Removing the residual blocks caused the largest drop, 7.3 points. The complete architecture is the best-performing configuration among those tested.
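The reported deltas can be reproduced from the test accuracies in Table 6. The short Python check below uses only the numbers in the table; the variable names are illustrative.

```python
# Test accuracies (%) taken from Table 6.
FULL_MODEL_TEST_ACC = 91.8
variant_test_acc = {
    "Without SE Layers": 88.5,
    "With L1 Loss Only": 88.7,
    "With Attention + MLP": 84.7,
    "Without Residual Blocks": 84.5,
}

# Drop in percentage points relative to the full MambaVHS model.
drops = {
    name: round(FULL_MODEL_TEST_ACC - acc, 1)
    for name, acc in variant_test_acc.items()
}

for name, drop in drops.items():
    print(f"{name}: -{drop} points")

largest = max(drops, key=drops.get)
print("Largest drop:", largest)
```

The differences are absolute percentage points of test accuracy, not relative changes.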
