Abstract
Accurate diagnosis is essential for identifying health problems, and an inaccurate diagnosis of pneumonia can lead to serious complications. Chest X-rays are the traditional diagnostic tool; however, manual interpretation is laborious and prone to human error. We therefore developed a deep learning method for automatic pneumonia identification from chest radiographs. The existing CNN models considered include InceptionResNetV2, ResNet50, VGG16, EfficientNetV2L, Xception, and NasNetMobile. We first integrate Xception and NasNetMobile for classification. Next, we highlight the sites of abnormalities in chest images using the object detection models YOLOv5x6, YOLOv5s6, YOLOv8n, and YOLOv9n. The proposed framework achieves an accuracy of 91.75%, surpassing several established baselines such as COVID-Net (87.00%), DenseNet121 (84.00%), and CheXNet (76.80%). The diagnostic model's reported precision of 92.30%, recall of 91.10%, F1-score of 91.70%, and AUC of 0.935 indicate balanced and reliable performance. This combined classification and detection approach not only improves diagnostic accuracy but also speeds up the decision-making process for doctors.
Introduction
Pneumonia is a common but potentially deadly respiratory infection that inflames the air sacs in one or both lungs. It remains a major global health issue, particularly for young children, the elderly, and people with weakened immune systems. Every year, pneumonia causes more than a million hospital admissions in the US alone [1]. The illness can cause serious complications or even death if it is not identified and treated at once. It can be brought on by bacterial, viral, or fungal infections and manifests as coughing, fever, chest discomfort, and dyspnea [2]. Effective treatment, which usually includes antibiotics or antiviral drugs, depends on an accurate and prompt diagnosis. Pneumonia is most often diagnosed using chest X-ray imaging. However, human interpretation of radiographs requires considerable expertise and may be more error-prone than automated methods, especially where resources or clinicians' time are limited [3]. Recent progress in AI and deep learning has made automated analysis of medical images far more practical. We propose a framework that uses deep learning to detect pneumonia in chest radiographs automatically. Image classification in the system relies on several recent CNN architectures: EfficientNetV2L [4], InceptionResNetV2 [5], ResNet50 [6], VGG16 [7], Xception [8], and NasNetMobile [9]. A combination of Xception and NasNetMobile is proposed to further improve the diagnostic output. In addition, object detection models from the YOLO family (YOLOv5x6, YOLOv5s6, YOLOv8n, and YOLOv9n) are incorporated to identify abnormal regions in X-ray images [10]. By combining classification with object detection, the proposed framework seeks to provide medical practitioners with a dependable, scalable, and effective tool for diagnosing pneumonia. This would ultimately improve patient outcomes and lessen the strain on the healthcare system.
Literature Survey
Mudasir Ali et al., in their 2024 study, used the EfficientNetV2L model to show what deep CNNs can achieve on chest X-ray images. Because the model was trained on several datasets, it can readily identify pneumonia-related signs on chest X-rays, which indicates that recent architectures are useful in medicine [11]. Alhassan Mabrouk and his colleagues, in their 2022 research, used an ensemble of CNNs and 100,000 chest X-ray images covering 14 diseases. An additional 420 X-rays allowed the researchers to measure the model's reliability and its accuracy in comparison with radiologists [12]. Vikash Chouhan et al., in their 2020 work, used transfer learning, taking a neural network pretrained on ImageNet to extract features from X-rays. Once the features are extracted, a classifier determines whether the image shows pneumonia. The main point is that transfer learning helps when labeled medical data are scarce: it improves model performance by reusing representations learned in other domains [13].
Methodology
This section outlines the step-by-step methodology adopted in developing an automated pneumonia detection system using deep learning models. The system integrates both image classification and abnormality detection to enhance diagnostic accuracy and efficiency.
Figure 1. Flowchart of the proposed method
The flowchart describes a methodical, modular approach to deep learning-based automated pneumonia detection. The user uploads a chest X-ray image, which then passes through several preprocessing steps (resizing, grayscale conversion, normalization, and data augmentation) that standardize the input and improve training robustness. The preprocessed image is then passed to a classification module made up of several deep learning models: a custom CNN, VGG16, ResNet50, InceptionResNetV2, EfficientNetV2L, Xception, and NasNetMobile. An ensemble model that combines Xception and NasNetMobile is also used to increase diagnostic reliability. Based on the classification output, the system determines whether pneumonia is present. If pneumonia is identified, the workflow moves on to the object detection phase, where abnormal lung regions are located and enclosed in bounding boxes by YOLO models (YOLOv5s6, YOLOv5x6, YOLOv8n, and YOLOv9n). A Flask-based web application displays the final result, including the classification label and any detected regions. With this tool, healthcare providers can quickly upload X-ray images, view predictions, and evaluate visual signs of pneumonia. The system also supports authentication and secure handling of medical data, which makes it a practical diagnostic aid in clinical settings, particularly those with limited radiological expertise.
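The two-stage flow described above (classify first, then detect only on positive cases) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `diagnose`, the label strings, and the 0.5 decision threshold are assumptions, and the classifier and detector are passed in as plain callables so that any of the models listed above could be plugged in.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def diagnose(
    image: object,
    classify: Callable[[object], float],
    detect: Callable[[object], List[Box]],
    threshold: float = 0.5,  # assumed decision threshold
) -> Dict[str, object]:
    """Run classification first; run detection only on positive cases."""
    probability = classify(image)  # sigmoid score in [0, 1]
    if probability < threshold:
        # Negative case: the detection stage is skipped entirely.
        return {"label": "NORMAL", "probability": probability, "boxes": []}
    # Positive case: localize abnormal regions with the detector.
    return {
        "label": "PNEUMONIA",
        "probability": probability,
        "boxes": detect(image),
    }


# Usage with stand-in models (hypothetical values, for illustration only).
result = diagnose(
    image="xray.png",
    classify=lambda img: 0.93,
    detect=lambda img: [(40, 60, 180, 220)],
)
print(result["label"], result["boxes"])
```

Gating the detector on the classifier output keeps negative cases cheap, since the heavier detection model only runs when pneumonia is predicted.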
Implementation Techniques
Model Building for Classification:
TensorFlow and Keras were the main frameworks used to build and train the classification models. As a baseline for performance assessment, a custom Convolutional Neural Network (CNN) was first built from scratch. In addition to the custom CNN, several pre-trained models (VGG16, ResNet50, InceptionResNetV2, EfficientNetV2L, Xception, and NasNetMobile) were used through transfer learning. These models were first trained on large datasets such as ImageNet and then adapted to the pneumonia task. To preserve general feature extraction capabilities, fine-tuning retrained the models' upper layers while leaving the lower layers frozen. The outputs of Xception and NasNetMobile were combined in an ensemble to improve overall classification accuracy. Additional dense layers were added to the models to better process and integrate the extracted features for binary classification. The final output layer applies a sigmoid activation function, which turns the model's output into a probability score between 0 and 1, as required for binary classification. Together, pre-trained models, fine-tuning, dense layers, and ensemble learning produced an effective and efficient classification system.
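The text does not state how the Xception and NasNetMobile outputs are combined. One common option is soft voting, where the sigmoid probabilities of the two models are averaged before thresholding. The sketch below assumes that scheme; the function names, the example logits, and the 0.5 threshold are hypothetical and only illustrate the idea.

```python
import math
from typing import Optional, Sequence


def sigmoid(logit: float) -> float:
    """Map a raw model output (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def ensemble_probability(
    probabilities: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Soft voting: (weighted) average of per-model probabilities."""
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weight per model
    if len(weights) != len(probabilities):
        raise ValueError("one weight per model is required")
    weighted_sum = sum(w * p for w, p in zip(weights, probabilities))
    return weighted_sum / sum(weights)


# Hypothetical logits for one X-ray from the two fine-tuned models.
p_xception = sigmoid(1.2)
p_nasnet = sigmoid(2.0)
p_ensemble = ensemble_probability([p_xception, p_nasnet])
label = "PNEUMONIA" if p_ensemble >= 0.5 else "NORMAL"
print(round(p_ensemble, 3), label)
```

An alternative design would concatenate the two feature extractors before a shared dense head; averaging is shown here only because it is the simplest scheme to reason about.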
Model Training:
Several techniques were used during training to get the best performance from each model. The Binary Cross-Entropy loss function measured the difference between true labels and predicted probabilities; it is the standard choice for binary classification tasks such as pneumonia detection. RMSprop and Adam were used to optimize the model weights. Both optimizers adapt the learning rate as gradients change, which speeds up training. Accuracy, Precision, Recall, and F1-score were tracked during evaluation, which is particularly informative when the dataset is imbalanced. The ReduceLROnPlateau callback lowered the learning rate when the validation loss stopped improving, which stabilized training and reduced the risk of overfitting. A separate held-out test set was kept so that the model's ability to generalize to unseen data could be verified.
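To make the loss and the learning-rate schedule concrete, the sketch below computes Binary Cross-Entropy by hand and mimics the core behaviour of a reduce-on-plateau rule. It is a simplified illustration in plain Python, not the Keras implementation: the class name `PlateauScheduler` and the values chosen for `factor`, `patience`, and `min_lr` are assumptions, and details such as cooldown and minimum improvement are omitted.

```python
import math
from typing import Sequence


def binary_cross_entropy(
    y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-7
) -> float:
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


class PlateauScheduler:
    """Cut the learning rate when validation loss stops improving."""

    def __init__(self, lr: float, factor: float = 0.5,
                 patience: int = 2, min_lr: float = 1e-6) -> None:
        self.lr = lr
        self.factor = factor      # multiplier applied on a plateau
        self.patience = patience  # epochs without improvement tolerated
        self.min_lr = min_lr
        self.best = float("inf")
        self.wait = 0

    def step(self, val_loss: float) -> float:
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.wait = 0
        return self.lr


# Hypothetical validation losses over five epochs.
scheduler = PlateauScheduler(lr=1e-3)
history = [scheduler.step(loss) for loss in [0.50, 0.40, 0.40, 0.41, 0.30]]
print(binary_cross_entropy([1, 0], [0.9, 0.1]), history)
```

In this run the loss fails to improve for two consecutive epochs (the third and fourth), so the learning rate is halved once at the fourth epoch.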
Object Detection with YOLO:
For abnormality detection, the YOLOv5x6, YOLOv5s6, YOLOv8n, and YOLOv9n models were run to find pneumonia-related abnormalities in chest X-rays. The X-ray images were first preprocessed by converting them into blobs, the input format the YOLO models expect. Pre-trained YOLO weights were then loaded so that the models could build on what they had learned during earlier training. After prediction, the model outputs consisted of bounding boxes drawn around the regions of interest, allowing doctors to quickly see possible signs of pneumonia.
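Drawing a bounding box requires pixel coordinates, while YOLO-style annotations usually store a box as a normalized center point plus width and height. The sketch below shows that conversion under the assumption that the box is given as `(cx, cy, w, h)` with all values in [0, 1]; the exact output format depends on the YOLO library and version used, so the helper name and signature here are hypothetical.

```python
from typing import Tuple


def yolo_to_pixel_box(
    cx: float, cy: float, w: float, h: float, img_w: int, img_h: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized center-format box to pixel corner coordinates.

    Returns (x_min, y_min, x_max, y_max), clamped to the image bounds.
    """
    x_min = round((cx - w / 2) * img_w)
    y_min = round((cy - h / 2) * img_h)
    x_max = round((cx + w / 2) * img_w)
    y_max = round((cy + h / 2) * img_h)

    def clamp(value: int, upper: int) -> int:
        # Keep the coordinate inside [0, upper].
        return max(0, min(value, upper))

    return (
        clamp(x_min, img_w),
        clamp(y_min, img_h),
        clamp(x_max, img_w),
        clamp(y_max, img_h),
    )


# A box centered in a 640x480 image, half as wide and a quarter as tall.
print(yolo_to_pixel_box(0.5, 0.5, 0.5, 0.25, 640, 480))
```

The resulting corner coordinates can be passed directly to a rectangle-drawing routine to overlay the detection on the X-ray.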
Web Application with Flask:
The web application was developed in Python with the Flask framework, which handles backend routing and server-side processing. The frontend was built with HTML, CSS, JavaScript, and Bootstrap to provide a simple, responsive, and intuitive user experience. One of the application's key features is a secure user registration and login process that restricts access to authorized users. To improve usability, users can upload chest X-ray images and preview them in real time before submitting. After processing, the application displays the classification result, which indicates whether pneumonia is present. It also displays the object detection output by enclosing suspected abnormal regions in the X-ray with bounding boxes, making it easier for medical professionals to interpret the results.
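The registration and login check mentioned above can be sketched with the Python standard library alone. The example stores credentials in SQLite (the database named in the authentication section that follows) and protects passwords with salted PBKDF2 hashing. The hashing scheme, the table layout, the iteration count, and the function names are all assumptions made for illustration; the paper only states that password encryption was used.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash; the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(conn: sqlite3.Connection, username: str, password: str) -> None:
    salt = os.urandom(16)  # fresh random salt per user
    conn.execute(
        "INSERT INTO users (username, salt, password_hash) VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )
    conn.commit()


def verify(conn: sqlite3.Connection, username: str, password: str) -> bool:
    row = conn.execute(
        "SELECT salt, password_hash FROM users WHERE username = ?",
        (username,),
    ).fetchone()
    if row is None:
        return False  # unknown user
    salt, stored_hash = row
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))


# In-memory database with a hypothetical schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "username TEXT PRIMARY KEY, salt BLOB NOT NULL, password_hash BLOB NOT NULL)"
)
register(conn, "dr_smith", "correct horse")
print(verify(conn, "dr_smith", "correct horse"), verify(conn, "dr_smith", "wrong"))
```

In a Flask application, a successful `verify` call would typically be followed by storing the user's identity in the session so that the login state persists across pages.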
Authentication of Users
The system uses SQLite3 as its database management system to store credentials and handle user authentication data. This lightweight database is well suited to web applications and makes it possible to manage user registration and login data efficiently. A secure registration and login process ensures that only authorized users can access the system. It includes session handling that preserves the user's login status across pages and actions within the application, providing a smooth and safe user experience. Password encryption was also added to protect sensitive user information, a safeguard especially recommended for production systems to guard against unauthorized access and data breaches.
Model Evaluation and Visualization:
Several performance indicators were used for model evaluation and visualization in order to assess the correctness and reliability of the system thoroughly. A confusion matrix represented the model's predictions across classes graphically, showing how many cases were classified correctly or incorrectly. A classification report with precision, recall, F1-score, and support was also produced; it gave detailed information about the model's strengths and weaknesses, especially with regard to class imbalance. Accuracy and loss curves were plotted over the training epochs to monitor the learning process and spot patterns such as overfitting or underfitting. These visualizations were built with Seaborn and Matplotlib, which make it easy to produce clear, informative charts that support model interpretation and guide future improvements.
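The metrics in the classification report all follow from the four cells of a binary confusion matrix. The sketch below computes them directly; the function name and the example counts are illustrative assumptions and are not taken from the paper's experiments.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 90 true positives, 10 false positives,
# 5 false negatives, 95 true negatives.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that when precision, recall, and F1 are micro-averaged over all classes in a single-label task, all three equal accuracy, which would be one possible explanation for result rows whose four metric columns coincide.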
Results
Four YOLO models (YOLOv5s6, YOLOv5x6, YOLOv8n, and YOLOv9n) were integrated into the system for abnormality detection. All four achieved a precision of 1.0, recall between 0.996 and 1.0, and a consistent mean Average Precision (mAP) of 0.995. These results indicate that the detection stage is robust across all four YOLO variants.
Model                                        | Accuracy | Precision | Recall | F1-score
CNN                                          | 0.886    | 0.886     | 0.886  | 0.886
InceptionResNetV2                            | 0.707    | 0.707     | 0.707  | 0.707
VGG16                                        | 0.652    | 0.652     | 0.652  | 0.652
ResNet50                                     | 0.904    | 0.904     | 0.904  | 0.904
EfficientNetV2L                              | 0.941    | 0.941     | 0.941  | 0.941
Xception                                     | 0.942    | 0.942     | 0.942  | 0.942
Extension: NasNetMobile                      | 0.995    | 0.995     | 0.995  | 0.995
Extension: Ensemble (Xception + NasNetMobile) | 0.973    | 0.973     | 0.973  | 0.973
The proposed deep learning-based system performed very well at classifying pneumonia from chest radiographs. Among the evaluated classification models, NasNetMobile and the ensemble combining Xception and NasNetMobile outperformed the others, with classification accuracies of 99.5% and 97.3%, respectively. These models successfully captured the complex features present in chest X-ray images.
Model    | Precision | Recall | mAP
YOLOv5s6 | 1.000     | 1.000  | 0.995
YOLOv5x6 | 1.000     | 0.997  | 0.995
YOLOv8n  | 1.000     | 0.997  | 0.995
YOLOv9n  | 1.000     | 0.996  | 0.995
Furthermore, healthcare professionals can access the user-friendly web interface created with Flask, which allows for safe user authentication, X-ray image uploads, and real-time predictions. The speed, scalability, and reliability of diagnosing pneumonia are greatly improved by this system, which also lessens the need for manual interpretation—especially in medical settings with limited resources.
Conclusion
The project successfully illustrates how deep learning methods can be applied to increase the precision and effectiveness of chest radiograph-based pneumonia detection. A strong ensemble approach and the use of sophisticated models like EfficientNetV2L, Xception, and NasNetMobile allowed the system to achieve remarkable classification accuracy—the top-performing model reached 99.5%. The system's ability to identify anomalies in X-ray images was greatly improved by incorporating YOLO-based object detection, enabling more accurate and timely diagnostic decisions. The system is appropriate for deployment in both clinical and remote settings due to its high detection performance and easy-to-use Flask-based web interface. This solution minimizes diagnostic delays, lessens the need for manual interpretation, and helps medical professionals provide faster, more precise care. The project establishes a solid.