RobustDeiT: Noise-Robust Vision Transformers for Medical Image Classification

Authors

  • Mehdi Taassori, Obuda University

DOI:

https://doi.org/10.5566/ias.3561

Keywords:

classification, data-efficient image transformer, medical imaging, noise robustness

Abstract

Effective classification of medical images is vital for accurate diagnosis and treatment, but noisy datasets remain a significant challenge, obscuring critical features and leading to unreliable predictions. To address this, we propose RobustDeiT, a noise-robust architecture based on the Data-efficient Image Transformer (DeiT), tailored for medical image classification in noisy environments. By integrating a multi-stage preprocessing pipeline, our approach systematically reduces noise, enhances contrast, and highlights fine details, ensuring the preservation of essential features. Advanced denoising methods, contrast enhancement with Contrast Limited Adaptive Histogram Equalization, and sharpening via unsharp masking collectively improve image quality, enabling the model to extract meaningful patterns. Extensive evaluations demonstrate that RobustDeiT achieves superior performance across diverse metrics, establishing its effectiveness in handling noisy medical imaging datasets and paving the way for reliable and accurate classification in real-world scenarios.
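The preprocessing order named in the abstract (denoise, enhance contrast, sharpen) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the function names, the 3x3 median filter standing in for the "advanced denoising methods", the global clip-limited histogram equalization (a simplification of CLAHE, which equalizes per tile and interpolates between tiles), the box blur used inside the unsharp mask, and all parameter values are hypothetical choices made for brevity.

```python
import numpy as np


def _windows(padded, shape, k):
    """Yield the k*k shifted views that together cover every k x k window."""
    for i in range(k):
        for j in range(k):
            yield padded[i:i + shape[0], j:j + shape[1]]


def median_denoise(img, k=3):
    """Stage 1 (assumed stand-in denoiser): k x k median filter."""
    padded = np.pad(img, k // 2, mode="reflect")
    stack = np.stack(list(_windows(padded, img.shape, k)))
    return np.median(stack, axis=0).astype(np.uint8)


def clipped_hist_equalize(img, clip_limit=2.0):
    """Stage 2 (simplified, global): clip-limited histogram equalization.

    Real CLAHE applies this idea per tile; here one histogram covers the
    whole image to keep the sketch short.
    """
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    limit = clip_limit * hist.mean()
    excess = np.clip(hist - limit, 0, None).sum()
    # Clip tall bins and spread the clipped mass evenly over all bins.
    hist = np.minimum(hist, limit) + excess / 256.0
    cdf = np.cumsum(hist)
    cdf /= cdf[-1]
    lut = np.round(255.0 * cdf).astype(np.uint8)
    return lut[img]


def unsharp_mask(img, amount=1.0, k=3):
    """Stage 3: add back the difference between the image and a blurred copy.

    A k x k box blur is assumed here in place of the usual Gaussian blur.
    """
    padded = np.pad(img.astype(np.float64), k // 2, mode="reflect")
    blurred = sum(_windows(padded, img.shape, k)) / (k * k)
    sharpened = img + amount * (img - blurred)
    return np.clip(np.round(sharpened), 0, 255).astype(np.uint8)


def preprocess(img):
    """Run the three stages in the order given in the abstract."""
    return unsharp_mask(clipped_hist_equalize(median_denoise(img)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
    result = preprocess(image)
    print(result.shape, result.dtype)
```

The input is assumed to be a single-channel 8-bit image. In practice each stage would likely be replaced by a library routine, and the output would be resized and normalized before being passed to a DeiT classifier.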

References

[1] Ling, Y., Wang, Y., Dai, W., Yu, J., Liang, P., & Kong, D. (2023). MTANet: Multi-task attention network for automatic medical image segmentation and classification. IEEE Transactions on Medical Imaging.

[2] Penso, C., Frenkel, L., & Goldberger, J. (2024). Confidence calibration of a medical imaging classification system that is robust to label noise. IEEE Transactions on Medical Imaging.

[3] Liu, J., Li, R., & Sun, C. (2021). Co-correcting: noise-tolerant medical image classification via mutual label correction. IEEE Transactions on Medical Imaging, 40(12), 3580-3592.

[4] Ju, L., Wang, X., Wang, L., Mahapatra, D., Zhao, X., Zhou, Q., ... & Ge, Z. (2022). Improving medical images classification with label noise using dual-uncertainty estimation. IEEE Transactions on Medical Imaging, 41(6), 1533-1546.

[5] Zhu, C., Chen, W., Peng, T., Wang, Y., & Jin, M. (2021). Hard sample aware noise robust learning for histopathology image classification. IEEE Transactions on Medical Imaging, 41(4), 881-894.

[6] Xue, C., Yu, L., Chen, P., Dou, Q., & Heng, P. A. (2022). Robust medical image classification from noisy labeled data with global and local representation guided co-training. IEEE Transactions on Medical Imaging, 41(6), 1371-1382.

[7] Li, Q., Shen, L., Guo, S., & Lai, Z. (2021). WaveCNet: Wavelet integrated CNNs to suppress aliasing effect for noise-robust image classification. IEEE Transactions on Image Processing, 30, 7074-7089.

[8] Yang, Y., Hui, H., Zeng, L., Zhao, Y., Zhan, Y., & Yan, T. (2021). Edge-preserving image filtering based on soft clustering. IEEE Transactions on Circuits and Systems for Video Technology, 32(7), 4150-4162.

[9] Wang, D., Nieto, J. J., Li, X., & Li, Y. (2020). A spatially adaptive edge-preserving denoising method based on fractional-order variational PDEs. IEEE Access, 8, 163115-163128.

[10] Zhou, S., Wang, J., Wang, L., Zhang, J., Wang, F., Huang, D., & Zheng, N. (2020). Hierarchical and interactive refinement network for edge-preserving salient object detection. IEEE Transactions on Image Processing, 30, 1-14.

[11] Zhu, H., & Ng, M. K. (2020). Structured dictionary learning for image denoising under mixed Gaussian and impulse noise. IEEE Transactions on Image Processing, 29, 6680-6693.

[12] Taassori, M., & Vizvári, B. (2024). Enhancing Medical Image Denoising: A Hybrid Approach Incorporating Adaptive Kalman Filter and Non-Local Means with Latin Square Optimization. Electronics, 13(13), 2640.

[13] Mishiba, K. (2023). Fast guided median filter. IEEE Transactions on Image Processing, 32, 737-749.

[14] Chang, Y., Jung, C., Ke, P., Song, H., & Hwang, J. (2018). Automatic contrast-limited adaptive histogram equalization with dual gamma correction. IEEE Access, 6, 11782-11792.

[15] Chen, R. C., Dewi, C., Zhuang, Y. C., & Chen, J. K. (2023). Contrast limited adaptive histogram equalization for recognizing road marking at night based on YOLO models. IEEE Access.

[16] Ye, W., & Ma, K. K. (2018). Blurriness-guided unsharp masking. IEEE Transactions on Image Processing, 27(9), 4465-4477.

[17] Kansal, S., Purwar, S., & Tripathi, R. K. (2018). Image contrast enhancement using unsharp masking and histogram equalization. Multimedia Tools and Applications, 77, 26919-26938.

[18] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.

[19] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., ... & Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.

[20] Manzari, O. N., Ahmadabadi, H., Kashiani, H., Shokouhi, S. B., & Ayatollahi, A. (2023). MedViT: a robust vision transformer for generalized medical image classification. Computers in Biology and Medicine, 157, 106791.

[21] Dalmaz, O., Yurt, M., & Çukur, T. (2022). ResViT: residual vision transformers for multimodal medical image synthesis. IEEE Transactions on Medical Imaging, 41(10), 2598-2614.

[22] Song, Y., He, Z., Qian, H., & Du, X. (2023). Vision transformers for single image dehazing. IEEE Transactions on Image Processing, 32, 1927-1941.

[23] Azad, R., Kazerouni, A., Heidari, M., Aghdam, E. K., Molaei, A., Jia, Y., ... & Merhof, D. (2023). Advances in medical image analysis with vision transformers: a comprehensive review. Medical Image Analysis, 103000.

[24] Shamshad, F., Khan, S., Zamir, S. W., Khan, M. H., Hayat, M., Khan, F. S., & Fu, H. (2023). Transformers in medical imaging: A survey. Medical Image Analysis, 88, 102802.

[25] Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jégou, H. (2021). Training data-efficient image transformers & distillation through attention. In International conference on machine learning (pp. 10347-10357). PMLR.

[26] Taassori, M. (2024). Enhanced Wavelet-Based Medical Image Denoising with Bayesian-Optimized Bilateral Filtering. Sensors, 24(21), 6849.

[27] Al-Dhabyani, W., Gomaa, M., Khaled, H., & Fahmy, A. (2020). Dataset of breast ultrasound images. Data in Brief, 28, 104863.

Published

2025-06-28

Data Availability Statement

The dataset used for training the CNN model is based on Al-Dhabyani et al. (2020) [27] and is available upon request from the corresponding author.

Section

Original Research Paper

How to Cite

Taassori, M. (2025). RobustDeiT: Noise-Robust Vision Transformers for Medical Image Classification. Image Analysis and Stereology, 44(2), 111-129. https://doi.org/10.5566/ias.3561