Open Access

Vision Transformer and CNN-Based Models for Image Analysis of Plant Diseases: An Approach for Agricultural Decision Support Systems

Cukurova University, Faculty of Arts and Sciences, Department of Computer Sciences, Adana

Abstract

Plant diseases remain a major threat to global food security, making reliable and scalable diagnostic systems increasingly important. This study compares three model families for plant disease classification: Convolutional Neural Networks (CNNs), Vision Transformers (ViTs), and hybrid CNN–ViT architectures. The goal is to evaluate both accuracy and computational efficiency, two factors that strongly influence how suitable these models are for Agricultural Decision Support Systems (ADSS), especially those running on edge devices. Six representative architectures were trained under the same experimental setup, including transfer learning and data augmentation. All models performed well on the controlled dataset, but the hybrid models performed best, achieving 99.29% accuracy and a 99.18% F1-score by combining local and global feature extraction. ViT models also reached high accuracy (98.92%) but required far more computation, making them less practical for real-time use. Lightweight CNNs had slightly lower accuracy (about 97.44%) but were far more efficient, with fewer parameters and very low FLOPs, which makes them strong candidates for mobile or IoT-based systems. Future work should incorporate multispectral data, add object-level localization to reduce background bias, and adopt Explainable AI to increase interpretability and trust. Overall, this work offers a clear comparison of leading deep learning architectures and provides practical guidelines for selecting efficient and reliable models for next-generation ADSS aimed at early plant disease detection.

How to Cite

ÖZDEN, C. (2026). Vision Transformer and CNN-Based Models for Image Analysis of Plant Diseases: An Approach for Agricultural Decision Support Systems. ISPEC Journal of Agricultural Sciences. https://doi.org/10.5281/zenodo.18228928
