SYSTEM AND METHOD FOR PREDICTING MONKEYPOX INFECTION
ORDINARY APPLICATION
Published
Filed on 9 November 2024
Abstract
A system (100) for predicting monkeypox infection is disclosed. The system (100) receives and processes symptom data from a subject to assess a likelihood of monkeypox infection. The system (100) comprises an input unit (102) for capturing symptom data, a processing unit (104) for analyzing the data, a set of machine learning models (106) for classification, and a display interface (108) for presenting results to the user. The processing unit (104) extracts relevant features from the input data, processes the extracted features through scaling and dimensionality reduction techniques, and applies a set of machine learning models (106) to classify a likelihood of the monkeypox infection. The display interface (108) provides a user-friendly output by presenting the likelihood of the monkeypox infection in an accessible format.
Claims: 10, Figures: 24. Figure 1 is selected.
Patent Information
Application ID | 202441086349 |
Invention Field | BIO-MEDICAL ENGINEERING |
Date of Application | 09/11/2024 |
Publication Number | 46/2024 |
Inventors
Name | Address | Country | Nationality |
---|---|---|---|
Dr. Raghvendra S, Dubey | SR University, Ananthasagar, Hasanparthy (PO), Warangal, Telangana, India-506371. | India | India |
Lagishetty Nanditha | SR University, Ananthasagar, Hasanparthy (PO), Warangal, Telangana, India-506371. | India | India |
Applicants
Name | Address | Country | Nationality |
---|---|---|---|
SR University | SR University, Ananthasagar, Warangal Telangana India 506371 patent@sru.edu.in 08702818333 | India | India |
Specification
Description:
BACKGROUND
Field of Invention
[001] Embodiments of the present invention generally relate to a disease detection system and particularly to a system for predicting monkeypox infection.
Description of Related Art
[002] Monkeypox is a viral infection caused by the monkeypox virus, part of the same family as the variola virus that causes smallpox. While typically less severe than smallpox, monkeypox can lead to a range of symptoms, including fever, rash, and swollen lymph nodes, and may result in complications, especially in individuals with weakened immune systems. Traditional diagnostic approaches are challenged by varied symptom presentations and overlap with other infectious diseases, thus necessitating advanced machine learning models to optimize early detection and prediction of monkeypox outbreaks. Machine learning (ML) and artificial intelligence (AI) have become critical tools in enhancing disease diagnosis and outbreak forecasting. Numerous models, including Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Perceptron Learning, have been successfully employed to model complex epidemiological data, improving classification accuracy and adaptability in dynamic disease landscapes. However, achieving high accuracy while maintaining generalizability across various datasets remains challenging. The integration of multiple ML approaches, such as Principal Component Analysis (PCA) and hierarchical clustering, has proven effective in reducing dimensionality and enhancing clustering, which is critical to managing large and diverse datasets in healthcare applications.
[003] The SVM model is widely recognized for its ability to classify diseases based on distinct symptoms by identifying an optimal hyperplane for separation. In monkeypox detection, SVM has achieved moderate accuracy, distinguishing between symptomatic and non-symptomatic cases by emphasizing critical features like rectal pain, HIV infection, and sexually transmitted infections. Logistic regression, another popular approach, approximates probabilities of disease presence based on symptom profiles. While effective in certain applications, logistic regression has limitations in handling complex datasets due to its linear nature, often resulting in moderate classification accuracy for diseases like monkeypox.
[004] The KNN model is particularly useful for clustering similar cases, especially in predicting disease onset. However, it presents scalability challenges, as predictions require distance calculations for all training data points. The optimal performance in predicting monkeypox cases was observed at K=16, beyond which overfitting or underfitting tendencies were noted. KNN is, therefore, best suited for smaller, well-defined datasets. Further, PCA and hierarchical clustering have been instrumental in simplifying complex datasets by reducing features without losing variance, which is essential for identifying high-risk patients quickly. In monkeypox predictions, hierarchical clustering models successfully grouped symptomatic features, achieving 100% accuracy under specific configurations. The perceptron model, emulating neuron behavior, offers a rudimentary approach for establishing decision boundaries between symptomatic and asymptomatic cases. Although moderately accurate, the perceptron model's false-positive rate necessitates further refinement for real-world applications, such as high-stakes disease prediction where accurate early detection is essential. Despite notable advancements, existing approaches face limitations in the specificity and accuracy required for monkeypox prediction in diverse populations.
[005] There is thus a need for a system for predicting monkeypox infection that can address the aforementioned limitations in a more efficient manner.
SUMMARY
[006] Embodiments in accordance with the present invention provide a system for predicting monkeypox infection, comprising an input unit configured to receive symptom data from a subject; a processing unit operatively connected to the input unit, wherein the processing unit is characterized by its configuration to extract relevant features from the received symptom data, the features being selected based on a predictive correlation with monkeypox infection; process the extracted features using scaling and selection techniques to standardize and refine the data for improved analysis; apply a set of machine learning models to the processed features; classify a likelihood of monkeypox infection based on the outputs derived from the machine learning models; and a display interface operatively connected to the processing unit, configured to display the likelihood to an end user based on the classified likelihood.
[007] Embodiments in accordance with the present invention further provide a method for predicting monkeypox infection using a system, comprising steps of receiving symptom data from a subject through an input unit, wherein the symptom data is selected from rectal pain, penile edema, oral lesions, sexually transmitted infections, or a combination thereof; extracting, by a processing unit, relevant features from the received symptom data based on a predictive correlation with monkeypox infection; processing the extracted features by applying scaling and selection techniques; applying a set of machine learning models to the processed features, wherein the models are selected from at least one of Support Vector Machine (SVM), Logistic Regression, Decision Tree, Perceptron Learning, K-Nearest Neighbors (KNN), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN); classifying a likelihood of the monkeypox infection based on outputs from the machine learning models; and displaying the classified likelihood to an end user through a display interface.
[008] Embodiments of the present invention may provide a number of advantages depending on their particular configuration. First, embodiments of the present application may provide a system for predicting monkeypox infection.
[009] Next, embodiments of the present application may provide a system for enhancing diagnostic accuracy by enabling symptom data standardization through scaling and selection techniques, which reduce noise and improve feature relevancy. Next, embodiments of the present application may provide a system for improving machine learning model interpretability by integrating a feature ranking module, allowing the system to prioritize symptom data based on a correlation matrix and inform end users of the symptom impact on model predictions. Next, embodiments of the present application may provide a system for rapid symptom-based classification by incorporating real-time interactive interfaces, allowing users to select and input symptoms easily for faster and more accurate disease likelihood estimation. Next, embodiments of the present application may provide a system for model performance monitoring and optimization through a confusion matrix generation, enabling the system to evaluate key metrics including true positive, false positive, true negative, and false negative rates.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The above and still further features and advantages of embodiments of the present invention will become apparent upon consideration of the following detailed description of embodiments thereof, especially when taken in conjunction with the accompanying drawings, and wherein:
[0011] FIG. 1 illustrates a block diagram of a system for predicting monkeypox infection;
[0012] FIG. 2 illustrates a block diagram of a processing unit of the system, according to an embodiment of the present invention;
[0013] FIG. 3A illustrates a correlation matrix with correlation coefficients;
[0014] FIG. 3B illustrates a pie chart depicting a distribution of target values;
[0015] FIG. 3C illustrates a first confusion matrix for a Support Vector Machine (SVM) model;
[0016] FIG. 3D illustrates a first table depicting performance metrics using the Support Vector Machine (SVM) model;
[0017] FIG. 3E illustrates a second confusion matrix for a Perceptron Learning Model;
[0018] FIG. 3F illustrates a second table depicting performance metrics using the Perceptron Learning Model;
[0019] FIG. 3G illustrates a third confusion matrix for a Logistic Regression Model;
[0020] FIG. 3H illustrates a third table depicting performance metrics using the Logistic Regression Model;
[0021] FIG. 3I illustrates a first graph depicting an accuracy of a K-Nearest Neighbours model;
[0022] FIG. 3J illustrates graphs for features of the K-Nearest Neighbours model;
[0023] FIG. 3K illustrates a second graph depicting a Density-Based Spatial Clustering of Applications with Noise (DBSCAN) Model for monkeypox prediction;
[0024] FIG. 3L illustrates a third graph depicting an outcome of a decision tree model for the monkeypox prediction;
[0025] FIG. 3M illustrates a fourth table depicting performance metrics using the decision tree model;
[0026] FIG. 3N illustrates a fourth graph depicting a Principal Component Analysis (PCA) model for the monkeypox infection prediction;
[0027] FIG. 3O illustrates a fifth table depicting performance metrics using the Principal Component Analysis (PCA) model;
[0028] FIG. 3P illustrates a Hierarchical Clustering Dendrogram (HCD) for monkeypox prediction;
[0029] FIG. 3Q illustrates a sixth table depicting performance metrics using a Hierarchical Clustering model;
[0030] FIG. 4 illustrates a set of graphs showing a variation of features with a target;
[0031] FIG. 5 illustrates a set of graphs exemplifying a distribution of specific symptoms in patients affected by the monkeypox infection;
[0032] FIG. 6A illustrates a drop-down box to choose and visualize the relationship between two selected data;
[0033] FIG. 6B illustrates an information widget that predicts whether the person is suffering from the monkeypox infection; and
[0034] FIG. 7 illustrates a flowchart of a method for predicting the monkeypox infection using the system.
DETAILED DESCRIPTION
[0035] FIG. 1 illustrates a block diagram of a system 100 for predicting monkeypox infection, according to an embodiment of the present invention. Embodiments of the present invention are intended to address and overcome challenges and limitations encountered by existing systems. According to the embodiments of the present invention, the system 100 may incorporate non-limiting hardware components to enhance processing speed and efficiency. For example, the system 100 may comprise an input unit 102, a processing unit 104, machine learning models 106, and a display interface 108. In an embodiment of the present invention, the hardware components of the system 100 may be integrated with computer-executable instructions for overcoming the challenges and the limitations of the existing systems.
[0036] In an embodiment of the present invention, the input unit 102 may be configured to receive symptom data. The symptom data may include, but is not limited to, specific indicators such as rectal pain, penile edema, oral lesions, sexually transmitted infections, fever, headache, muscle aches, swollen lymph nodes, or rash. In an embodiment of the present invention, the symptom data may be indicators that may be utilized based on a predictive relevance to monkeypox infection. Additionally, the input unit 102 may capture other relevant data types, such as patient history and recent exposure details, enhancing the comprehensiveness of the data collected. In an embodiment of the present invention, the input unit 102 may enable the end user to select symptoms via checkboxes and capture the selected symptoms for receiving the symptom data. The checkboxes may be loaded with all possible symptoms that may or may not be directly related to the monkeypox infection. In an embodiment of the present invention, the input unit 102 may be, but is not limited to, a keyboard (not shown), a database interface (not shown), a data acquisition device (not shown), a network interface (not shown) for receiving data from remote sources (not shown), or a combination thereof. The input unit 102 may ensure that diverse types of the symptom data may be integrated into the system 100 for comprehensive analysis and model generation. Embodiments of the present invention are intended to include or otherwise cover any type of the input unit 102 including known, related art, and/or later developed technologies.
[0037] In an embodiment of the present invention, the processing unit 104 may be connected to the input unit 102. The processing unit 104 may be configured to receive the symptom data from the input unit 102. The received symptom data may be preprocessed by the processing unit 104. The preprocessing of the symptom data may involve removing errors, handling missing values, normalizing data to ensure consistency, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of preprocessing including known, related art, and/or later developed technologies. In an embodiment of the present invention, the processing unit 104 may be configured to classify the preprocessed symptom data to predict a likelihood of the monkeypox infection based on the outputs derived from the machine learning models 106. In an embodiment of the present invention, the processing unit 104 may be configured to run on a distributed computing architecture to enhance a processing speed and the efficiency of the system 100. The processing unit 104 may further be configured to execute the computer-executable instructions to generate an output relating to the system 100. According to embodiments of the present invention, the processing unit 104 may be, but is not limited to, a Programmable Logic Control (PLC) unit, a microprocessor, a development board, and so forth. In an embodiment of the present invention, the processing unit 104 may further be explained in conjunction with FIG. 2.
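As an illustration of the checkbox-based input described above, the following is a minimal sketch of how ticked symptoms might be turned into a numeric feature vector. The symptom list, its ordering, and the function name `encode_symptoms` are assumptions made for this example and are not part of the specification.

```python
# Hypothetical symptom list; the names follow indicators mentioned in the text,
# but the selection and ordering are assumptions for illustration only.
SYMPTOMS = [
    "rectal_pain",
    "penile_edema",
    "oral_lesions",
    "sexually_transmitted_infection",
    "fever",
    "swollen_lymph_nodes",
]


def encode_symptoms(selected):
    """Turn a set of ticked checkboxes into a 0/1 feature vector."""
    unknown = set(selected) - set(SYMPTOMS)
    if unknown:
        raise ValueError(f"unknown symptoms: {sorted(unknown)}")
    return [1 if name in selected else 0 for name in SYMPTOMS]


vector = encode_symptoms({"rectal_pain", "fever"})
print(vector)  # [1, 0, 0, 0, 1, 0]
```

A fixed ordering keeps each position of the vector tied to the same symptom, which is what a downstream model would need.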
[0038] In an embodiment of the present invention, the machine learning models 106 may be, but are not limited to, algorithms such as Support Vector Machine (SVM), Logistic Regression, Decision Tree, Perceptron Learning, K-Nearest Neighbors (KNN), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The machine learning models 106 may be selected based on their ability to process complex symptom data and effectively classify the likelihood of monkeypox infection. Each of the machine learning models 106 may be trained on a dataset that may comprise relevant features and target values associated with the monkeypox infection. In an embodiment of the present invention, the machine learning models 106 may be fetched by the processing unit 104 in a sequential or parallel manner, depending on computational requirements and model selection criteria. The processing unit 104 may comprise model selection instructions that may enable the processing unit 104 to dynamically load, execute, and switch between different machine learning models 106 based on performance metrics and/or specific characteristics of the received symptom data.
[0039] In an embodiment of the present invention, the display interface 108 may be operatively connected to the processing unit 104. In an embodiment of the present invention, the display interface 108 may be configured to receive the classified likelihood as processed by the processing unit 104. The display interface 108 is further configured to display results of the monkeypox infection to an end user based on the classified likelihood. The display interface 108 may further present an optimal disease prediction model and related information to the user. The display interface 108 may include, but is not limited to, a Graphical User Interface (GUI), a display screen, a printer, or a network interface for transmitting the results of the system 100 to other systems or devices. The display interface 108 may also be equipped with features for visualizing performance metrics, such as charts, graphs, and other visual aids that may help users to understand an effectiveness and accuracy of the optimal disease prediction model.
[0040] FIG. 2 illustrates a block diagram of the processing unit 104 of the system 100, according to an embodiment of the present invention. The processing unit 104 may comprise the computer-executable instructions in the form of programming modules such as a data receiving module 200, a feature extraction module 202, a data preprocessing module 204, a machine learning application module 206, a model evaluation module 208, and a result display module 210.
[0041] In an embodiment of the present invention, the data receiving module 200 may be configured to receive and validate the symptom data from the input unit 102. The data receiving module 200 may ensure that the symptom data is accurately formatted and checked for any inconsistencies or errors before further processing. Upon receiving the symptom data, the data receiving module 200 may generate a feature extraction signal. In an embodiment of the present invention, the feature extraction module 202 may be activated upon receiving the feature extraction signal from the data receiving module 200. In an embodiment of the present invention, the feature extraction module 202 may be configured to analyze the received symptom data and extract relevant features that may be predictive of the monkeypox infection. The feature extraction module 202 utilizes statistical methods and domain-specific algorithms to identify key indicators within the data, ensuring that the features selected are both informative and significant for the subsequent analysis.
[0042] In an embodiment of the present invention, the data preprocessing module 204 may be configured to apply scaling and normalization techniques to the extracted features. The data preprocessing module 204 may refine the extracted features. Additionally, the data preprocessing module 204 may perform a dimensionality reduction using techniques such as Principal Component Analysis (PCA) that may simplify the extracted features while preserving critical information for accurate predictions. In an embodiment of the present invention, the machine learning application module 206 may be configured to implement a set of the machine learning models 106. The machine learning application module 206 may select the set of the machine learning models 106 based on the predictivity of the extracted features. The machine learning application module 206 may further process the extracted features to classify the likelihood of the monkeypox infection.
[0043] In an embodiment of the present invention, the model evaluation module 208 may be configured to assess the performance of the machine learning models 106. In an embodiment of the present invention, the model evaluation module 208 may be configured to evaluate outputs from the applied set of the machine learning models 106. The model evaluation module 208 may generate various performance metrics, such as accuracy, precision, and recall, alongside confusion matrices that may illustrate an effectiveness of the machine learning models 106. In an embodiment of the present invention, the model evaluation module 208 may check whether all the outputs of the applied set of the machine learning models 106 indicate the likelihood of the monkeypox infection and, if so, may generate a result display signal. In an embodiment of the present invention, if the outputs of the applied set of the machine learning models 106 vary such that the indicated likelihood of the monkeypox infection ranges from 0.4 to 0.6, the model evaluation module 208 may select another set of the machine learning models 106 to re-evaluate the likelihood of the monkeypox infection. Once the likelihood of the monkeypox infection is indicated below 0.4, the model evaluation module 208 may classify the likelihood of the monkeypox infection as 'Low' and may generate the result display signal. Once the likelihood of the monkeypox infection is indicated above 0.6, the model evaluation module 208 may classify the likelihood of the monkeypox infection as 'High' and may generate the result display signal. In an embodiment of the present invention, the model evaluation module 208 may utilize a feedback mechanism that may allow for continuous optimization of the machine learning models 106 for enhancing a predictive accuracy over time.
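The scaling step described above can be sketched as a z-score standardization of one feature column. This is only one possible scaling technique; the specification does not state which one is used, and the function name `standardize` and the sample values are assumptions. The dimensionality reduction (PCA) step is not covered by this sketch.

```python
from statistics import mean, pstdev


def standardize(column):
    """Scale one feature column to zero mean and unit variance (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical feature column (for example, body temperature in degrees Celsius).
temperatures = [36.5, 37.0, 38.5, 39.0]
scaled = standardize(temperatures)
print(scaled)
```

After standardization, features measured on different scales contribute comparably to distance-based models such as KNN.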
In an embodiment of the present invention, the result display module 210 may be configured to be activated upon receiving the result display signal from the model evaluation module 208. In an embodiment of the present invention, the result display module 210 may be configured to present the classified likelihood of the monkeypox infection to the end user through the display interface 108. The result display module 210 may format the outputs in a user-friendly manner, potentially incorporating visual elements such as charts and graphs to facilitate easy interpretation of the results.
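The threshold logic of the model evaluation module 208 can be sketched as follows. The specification does not say how the outputs of several models are combined, so this sketch assumes a single likelihood value between 0 and 1; the function name `classify_likelihood` and the label "Re-evaluate" are also assumptions. The boundary values 0.4 and 0.6 are treated as part of the re-evaluation band, following the phrase "ranging from 0.4 to 0.6".

```python
def classify_likelihood(p, low=0.4, high=0.6):
    """Map a likelihood in [0, 1] to 'Low', 'High', or 'Re-evaluate'."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("likelihood must lie between 0 and 1")
    if p < low:
        return "Low"
    if p > high:
        return "High"
    # Ambiguous band: another set of models would be applied here.
    return "Re-evaluate"


for value in (0.2, 0.5, 0.9):
    print(value, classify_likelihood(value))
```

Only the 'Low' and 'High' outcomes would trigger the result display signal; the ambiguous band sends the data back for another round of model selection.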
[0044] FIG. 3A illustrates a correlation matrix 300 with correlation coefficients, according to an embodiment of the present invention. The correlation matrix 300 may depict symptoms associated with the monkeypox infection and the relations between them. The correlation matrix 300 may be an n×n matrix that may represent correlation coefficients between pairs of variables. Each cell of the correlation matrix 300 may represent a cross-correlation coefficient between two symptoms. The cross-correlation coefficient may vary between -1 and 1. Diagonal elements may exhibit a perfect correlation; therefore, the value of the diagonal elements may be 1. Off-diagonal elements of the matrix may exhibit correlations between the pairs of symptoms; the closer the value is to 1 or -1, the stronger the correlation may be. The closer an element may be to the diagonal element, the more positive correlation may be exhibited. Moreover, the further an element may be from the diagonal element, the weaker or less significant the correlation may be. As depicted in the correlation matrix 300, HIV Infection (0.15), Rectal Pain (0.14), and Sexually Transmitted Infection (0.12) indicate the highest positive correlations with Monkeypox. On the other hand, Systemic Illness (-0.01) shows a very weak or almost no correlation with Monkeypox. Swollen Tonsils (0.01) and Solitary Lesion (0.04) also show weak correlations. These values in the correlation matrix 300 may indicate that HIV Infection, Rectal Pain, and Sexually Transmitted Infection may be the features most related to the target (Monkeypox), while Systemic Illness may be the least related based on the correlation values.
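To make the correlation-based ranking concrete, the sketch below computes a Pearson correlation coefficient and ranks the symptoms by the coefficients quoted above. The specification does not state which correlation measure was used, so Pearson is an assumption, as are the function name `pearson` and the ranking by absolute value.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Coefficients with the target as quoted for the correlation matrix 300.
reported = {
    "HIV Infection": 0.15,
    "Rectal Pain": 0.14,
    "Sexually Transmitted Infection": 0.12,
    "Solitary Lesion": 0.04,
    "Swollen Tonsils": 0.01,
    "Systemic Illness": -0.01,
}

# Rank by strength of association, ignoring the sign.
ranked = sorted(reported, key=lambda name: abs(reported[name]), reverse=True)
print(ranked[:3])
```

Even the strongest coefficient here (0.15) is small in absolute terms, so the ranking indicates relative rather than strong association.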
[0045] FIG. 3B illustrates a pie chart 302 depicting a distribution of target values, according to an embodiment of the present invention. The pie chart 302 may show the distribution of the target values in a dataset. The target variable may have two possible values: 1 and 0. As depicted in the pie chart 302, 63.6% of data points may have a target value of 1 whereas 36.4% of the data points have a target value of 0.
[0046] FIG. 3C illustrates a first confusion matrix 304 for the Support Vector Machine (SVM) model, according to an embodiment of the present invention. The Support Vector Machine (SVM) model may incorporate a Support Vector Classifier (SVC). The Support Vector Classifier (SVC) may be an algorithm that may be used in classified based learning. The Support Vector Machine (SVM) model may operate by capturing the best line of division between data sets that may belong to diverse categories. The output of the Support Vector Classifier (SVC) may be used as an assessment to indicate how effectively the Support Vector Machine (SVM) model may be predicting the monkeypox infection. In an embodiment of the present invention, the accuracy may determine correct instances from all the monkeypox infection instances predicted, that may be either from the training set or from the test set. For predictive modeling, a precision may imply how much of the observed cases predicted may be the monkeypox infection. The first confusion matrix 304 for the Support Vector Machine (SVM) model may be distributed among True Negative (TN), False Positive (FP), False Negative (FN), and True Positive (TP). In an embodiment of the present invention, the True Negative (TN) may indicate no instance of the monkeypox infection. Further, the true Negative (TN) region in the first confusion matrix 304 may be where the Support Vector Machine (SVM) model may have made a correct prediction of zero cases of the monkeypox infection. In an embodiment of the present invention, the False Positive (FP) may indicate false acclaimed cases of the monkeypox infection. Further, the False Positive (FP) region in the first confusion matrix 304 may be where the Support Vector Machine (SVM) model may have made incorrect prediction of zero cases of the monkeypox infection.
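The four regions of a confusion matrix can be counted directly from true and predicted labels. The sketch below assumes binary labels where 1 means monkeypox infection and 0 means no infection; the function name `confusion_counts` and the sample labels are hypothetical and do not reproduce the values of the first confusion matrix 304. A False Negative is an actual positive predicted as negative, and a True Positive is an actual positive predicted as positive.

```python
def confusion_counts(y_true, y_pred):
    """Count TN, FP, FN and TP for binary labels (1 = infection, 0 = none)."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return {"TN": tn, "FP": fp, "FN": fn, "TP": tp}


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))
# {'TN': 2, 'FP': 2, 'FN': 1, 'TP': 3}
```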
[0047] FIG. 3D illustrates a first table 306 depicting performance metrics using the Support Vector Machine (SVM) model, according to an embodiment of the present invention. The first table 306 may illustrate performance indices that may be obtained to predict monkeypox disease through the Support Vector Classifier (SVC) model. Of the total cases predicted as positive, 64% are positive, as marked by Precision 0.64 (for class 1). Recall 1.00 of class 1 means that in all the actual positive samples, 100% was correctly classified as positive. F1-Score 0.78 (for class 1) may be used when equal weight may be given to precision and recall thus giving a balanced conclusion. Total sample in DATA Set =5000, may represent the total samples in the data set while Accuracy 0.64 may represent the fact that 64% of all cases in the sample may have been predicted accurately. FIG. 3E illustrates a second confusion matrix 308 for the Perceptron Learning Model, according to an embodiment of the present invention. The Perceptron Learning Model may mimic biological neurons and may possess characteristics that may enable the Perceptron Learning Model to learn decision boundaries between the classes of inputs. The Perceptron Learning Model may begin with choosing the weights and choosing the biases.
[0048] FIG. 3F illustrates a second table 310 depicting performance metrics using the Perceptron Learning Model, according to an embodiment of the present invention. The second table 310 may depict performance metrics using the Perceptron Learning Model. Precision 0.67 may be calculated, that may conclude the monkeypox infection may be predicted, then 67% of cases were actually correct in the cases of class 1. Recall 0.82 (for class 1) means that of all the actual positive cases only 82% were said to be positive. An F1 Score of 0.74 may be achieved for class 1, that may combine both the precision and recall in a midpoint measure. Total Sample 2500 and testify that a ratio of correct predictions of all the cases of the selected samples has 63% accuracy. Further, the learning efficiency of the Perceptron Learning Model may have the recall of 0.82 meaning that 82% of real positive samples were identified. However, there were also 646 False Positive (FP) cases that may reveal a relatively high false positive rate in comparison with the other analyzed cases. The lower prediction accuracy in patients might indicate that the Perceptron Learning Model could label some of them as having the monkeypox infection. It may be 0.67 for class 1 showing that the Perceptron Learning Model was right 67% of the time of the cases that the Perceptron Learning Model erroneously predicted to be positive are actually positive but also incorrectly predicted negative cases to be positive. The total accuracy of 0.63 was moderate that means that the Perceptron Learning Model of the classifier may be rather effective. However, there are large possibilities for increasing the accuracy of the predictions. Hence, the F1-score of 0.74 offered a trade-off between the precision and recall assessments.
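The F1-score is the harmonic mean of precision and recall, so the reported scores can be checked against the reported precision and recall values for class 1. The sketch below is only a consistency check; the function name `f1_score` is chosen for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Class 1 values as reported for the SVM and Perceptron Learning models.
svm_f1 = f1_score(0.64, 1.00)
perceptron_f1 = f1_score(0.67, 0.82)
print(round(svm_f1, 2), round(perceptron_f1, 2))  # 0.78 0.74
```

Both results match the F1-scores given in the tables, which suggests the reported precision, recall, and F1 values are mutually consistent.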
[0049] FIG. 3G illustrates a third confusion matrix 312 for the Logic Regression Model, according to an embodiment of the present invention. The third confusion matrix 312 for the Logic Regression Model may be distributed among True Negative (TN), False Positive (FP), False Negative (FN), and True Positive (TP). In an embodiment of the present invention, the True Negative (TN) may indicate no instance of the monkeypox infection. Further, the true Negative (TN) region in the first confusion matrix 304 may be where the Support Vector Machine (SVM) model may have made the correct prediction of 1383 cases of the monkeypox infection. In an embodiment of the present invention, the False Positive (FP) may indicate false acclaimed cases of the monkeypox infection. Further, the False Positive (FP) region in the first confusion matrix 304 may be where the Support Vector Machine (SVM) model may have made the incorrect prediction of 339 cases of the monkeypox infection. In an embodiment of the present invention, the False Negative (FN) may indicate a negative yield of positive cases of the monkeypox infection. Further, the False Negative (FN) region in the first confusion matrix 304 may be where the Support Vector Machine (SVM) model may have made the negative yield of 285 positive cases of the monkeypox infection. In an embodiment of the present invention, the True Positive (TP) may indicate a prediction of positive cases of the monkeypox infection.
[0050] FIG. 3H illustrates a third table 314 depicting performance metrics using the Logic Regression Model, according to an embodiment of the present invention. The third table 314 depicts the performance metrics based on the Logic Regression Model. FIG. 3I illustrates a graph 316 depicting the accuracy of a K-Nearest Neighbors model, according to an embodiment of the present invention. The graph 316 shows a plot of accuracy versus number of neighbors based on the K-nearest neighbors (KNN) model employed to predict monkeypox infection. The K-nearest neighbors (KNN) model may depend on the number of neighbors assumed. There may be a perfect neighbor where the majority of predictions may have optimal accuracy, and beyond this optimal number of neighbors, they are over- or undertrained predictions.
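A plot of accuracy versus number of neighbors can be produced by evaluating a KNN classifier for several values of k. The following is a minimal pure-Python sketch under stated assumptions: symptoms are encoded as 0/1 vectors, Hamming distance is used, and the tiny data set is invented for illustration. None of these choices is taken from the specification.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    # Hamming distance suits 0/1 symptom vectors.
    by_distance = sorted(
        range(len(train_x)),
        key=lambda i: sum(a != b for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


def accuracy_for_k(train_x, train_y, test_x, test_y, k):
    """Fraction of test points classified correctly for a given k."""
    hits = sum(
        knn_predict(train_x, train_y, query, k) == label
        for query, label in zip(test_x, test_y)
    )
    return hits / len(test_x)


# Hypothetical toy data: three binary symptoms per subject.
train_x = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1]]
train_y = [1, 1, 0, 0]
test_x = [[1, 1, 1], [0, 0, 0]]
test_y = [1, 0]

for k in (1, 3):
    print(k, accuracy_for_k(train_x, train_y, test_x, test_y, k))
```

On a realistic data set the same loop, run over a wider range of k, would yield the kind of accuracy curve that the graph 316 depicts.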
[0051] FIG. 3J illustrates graphs 318 for features of the K-Nearest Neighbors model, according to an embodiment of the present invention. The graphs 318 may depict the K-nearest neighbors (KNN) model regression plots for predicting monkeypox cases based on nine different features (1-9). Each plot may represent a regression model built using one of these features. The red curve may represent the K-nearest neighbors (KNN) regression model predictions, while the black dots may represent the actual data points. Based on these visualizations, feature 4, feature 5, feature 6, feature 7, feature 8, and feature 9 may be highly predictive of monkeypox cases. The K-nearest neighbors (KNN) regression model may further seem to effectively capture the underlying relationships between these features and the target variable.
[0052] FIG. 3K illustrates a second graph 320 depicting a Density-Based Spatial Clustering of Applications with Noise (DBSCAN) Model for monkeypox prediction, according to an embodiment of the present invention. The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) may be a versatile clustering algorithm widely applied in data mining and machine learning to detect clusters in large databases containing noise points. The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) Model may differ from other clustering methods like K-Means in that the number of clusters does not have to be defined in advance and no assumptions may be made about the shapes of clusters.
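The two properties noted above (no preset number of clusters, explicit noise points) can be seen in a minimal sketch of the general DBSCAN procedure. The function names, the parameter values, and the sample points are assumptions chosen for illustration and are not taken from the specification.

```python
import math

NOISE = -1  # label given to points that belong to no cluster

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be relabelled as a border point
            continue
        cluster += 1                   # point i is a core point: start a cluster
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
                continue
            if labels[j] is not None:
                continue               # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)   # j is also a core point: keep growing
    return labels

# Hypothetical 2-D points: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The number of clusters is never passed in; it emerges from `eps` and `min_pts`, and the isolated point is reported as noise rather than forced into a cluster.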
[0053] FIG. 3L illustrates a third graph 322 depicting an outcome of a decision tree model for monkeypox prediction, according to an embodiment of the present invention. The decision tree model may be used for regression and classification. The third graph 322 may be a scatter graph of actual and predicted values of monkeypox disease in the decision tree model. Each dot may be a data point, where a measure along an x-axis may be an actual value and a measure along a y-axis may be a predicted value. The red dashed line represents the perfect forecast line, where the actual and the forecasted values are the same. The dots may be closely packed around the red dashed line, which signifies that the decision tree model may be accurate. The decision tree model may be capable of analyzing the features and trends of the data accurately and giving correct predictions.
[0054] FIG. 3M illustrates a fourth table 324 depicting performance metrics using the decision tree model, according to an embodiment of the present invention. A precision of 1.00 in the fourth table 324 may imply that all cases predicted in the positive class were positive. A recall of 1.00 means that all actual positive cases were predicted by the decision tree model. An F1-Score of 1.00 is the harmonic mean of both Precision and Recall, and so may be more reliable for measuring the overall performance of the decision tree model.
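For reference, the relationship between the confusion matrix regions and these metrics can be written as a short sketch. The function name and the counts used below are hypothetical and are not the values reported in the tables of the specification.

```python
def classification_metrics(tn, fp, fn, tp):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # actual positives that are found
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical counts: a classifier with no false positives or false negatives
# scores 1.00 on every metric.
perfect = classification_metrics(tn=50, fp=0, fn=0, tp=50)

# Hypothetical counts with some errors of both kinds.
imperfect = classification_metrics(tn=40, fp=10, fn=5, tp=45)
```

Because the F1-score is a harmonic mean, it drops sharply when either precision or recall is low, which is why it is often reported alongside accuracy.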
[0055] FIG. 3N illustrates a fourth graph 326 depicting a Principal Component Analysis (PCA) model for the monkeypox infection prediction, according to an embodiment of the present invention. The fourth graph 326 illustrates results obtained using the Principal Component Analysis (PCA) model. FIG. 3O illustrates a fifth table 328 depicting performance metrics using the Principal Component Analysis (PCA) model, according to an embodiment of the present invention. The fifth table 328 represents the evaluation scores obtained by the Principal Component Analysis (PCA) model. The high values of the various metrics indicate that the Principal Component Analysis (PCA) model may be well-suited to predict monkeypox infection.
[0056] FIG. 3P illustrates a Hierarchical Clustering Dendrogram (HCD) 330 for monkeypox prediction, according to an embodiment of the present invention. The Hierarchical Clustering Dendrogram (HCD) 330 may be based on a cluster analysis algorithm that may be intended to develop a hierarchy of clusters. This method may be categorized into two types, agglomerative hierarchical clustering and divisive hierarchical clustering. Using the Hierarchical Clustering Dendrogram (HCD) 330 approach, each feature may start in its own cluster, and pairs of clusters may be combined while moving up in the hierarchy. This continues until all points are in the same cluster. FIG. 3Q illustrates a sixth table 332 depicting performance metrics using the Hierarchical Clustering model, according to an embodiment of the present invention.
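The bottom-up merging described above may be sketched as follows. This is an illustrative agglomerative procedure using single linkage (distance between the closest members of two clusters); the linkage choice, the function name, and the sample points are assumptions, since the specification does not state them.

```python
import math

def agglomerative(points):
    """Merge the two closest clusters repeatedly; return the merge history.

    Each history entry is (members_a, members_b, distance), which is the
    information a dendrogram draws as one join at a given height.
    """
    clusters = [[i] for i in range(len(points))]   # every point starts alone
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: smallest pairwise distance between clusters.
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical 2-D points: two tight pairs and one distant point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
history = agglomerative(points)
```

With `n` points the history always contains `n - 1` merges, and the final merge joins everything into a single cluster, matching the stopping condition described above.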
[0057] FIG. 4 illustrates a set of graphs 400 showing the variation of every feature with the target, according to an embodiment of the present invention. The set of graphs 400 may illustrate a relationship between each feature (on the x-axis) and the target variable (on the y-axis), presumably 0 or 1. In an embodiment of the present invention, if the line slopes upward from left to right, it may indicate a positive correlation between the feature and the target. In yet another embodiment of the present invention, if the line of best fit is relatively horizontal, it may indicate that the feature, for example the patient's identity, has little influence on the target variable. As shown, the Systemic Illness, Rectal Pain, Sore Throat, Penile Oedema, Oral Lesions, Solitary Lesion, Swollen Tonsils, HIV Infection, and Sexually Transmitted Infection plots are intended for comparing the slopes of these lines to identify which of these features are more or less related to the target. For instance, a steeper upward slope for "Rectal Pain" may indicate that it is highly related to the probability of the target being 1. If monkeypox occurs frequently among subjects exhibiting a feature, then this feature may be highly relevant to, or predictive of, the result.
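The slope comparison described above can be reproduced with an ordinary least-squares fit. The sketch below is illustrative only: the function name and the eight-subject data are hypothetical, and the specification does not state how the lines in the set of graphs 400 were fitted.

```python
def best_fit_slope(xs, ys):
    """Slope of the least-squares line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0                     # constant feature: no slope to estimate
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data for eight subjects; target 1 = monkeypox positive.
target      = [0, 0, 1, 0, 1, 1, 1, 0]
rectal_pain = [0, 0, 0, 0, 1, 1, 1, 1]   # 0/1 symptom flag
patient_id  = [1, 2, 3, 4, 5, 6, 7, 8]   # identifier, not a symptom

slope_symptom = best_fit_slope(rectal_pain, target)
slope_id = best_fit_slope(patient_id, target)
```

For a 0/1 feature the fitted slope equals the difference between the positive rate among subjects with the symptom and the positive rate among subjects without it, so a steeper line corresponds directly to a larger gap in infection frequency.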
[0058] FIG. 5 illustrates a set of charts 500 illustrating a distribution of specific symptoms in the patients affected by monkeypox, according to an embodiment of the present invention.
[0059] FIG. 6A illustrates a drop-down box 600 to choose and visualize the relationship between two selected data, according to an embodiment of the present invention. The drop-down box 600 may illustrate a screenshot of a widget, that may allow a user to generate the plot based on any two columns of the dataset. The first drop-down list provides options for the user to choose a data column (e.g., Patient_ID) to be plotted on the x-axis. Likewise, from the second drop-down list, the user may select a data column (e.g., Oral Lesions) to be plotted on the y-axis. After choosing the two columns of interest, the user may click the 'Update Plot' button to plot the selected columns to visualize the relationship.
[0060] FIG. 6B illustrates an information widget 602 that predicts whether the person may be suffering from the monkeypox or not along with the accuracy of the test by just selecting the checkboxes of existing symptoms, according to an embodiment of the present invention. In an embodiment of the present invention, the information widget 602 may comprise elements such as, but not limited to, checkboxes, the Support Vector Machine (SVM) model, a kernel, a gamma, and an accuracy.
[0061] FIG. 7 depicts a flowchart of a method 700 for predicting monkeypox infection using the system 100, according to an embodiment of the present invention. At step 702, the system 100 may receive the symptom data from the subject through the input unit 102. At step 704, the system 100 may extract the relevant features from the received symptom data using the processing unit 104 based on the predictive correlation with the monkeypox infection. At step 706, the system 100 may process the extracted features by applying the scaling and selection techniques. At step 708, the system 100 may apply the set of machine learning models 106 to the processed features. The machine learning models 106 may classify the likelihood of monkeypox infection based on the processed data.
[0062] At step 710, the system 100 may evaluate the outputs from the applied set of the machine learning models 106. This step may include generating the performance metrics, such as accuracy, precision, recall, and the confusion matrices, to assess the predictive accuracy. If all the outputs of the applied set of the machine learning models 106 are indicating the likelihood of monkeypox infection, the system 100 may proceed to the step 712. If the outputs of the applied set of the machine learning models 106 are varied, the system 100 may return to the step 708. At step 712, the system 100 may display the classified likelihood of monkeypox infection to the end user through the display interface 108.
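The control flow of steps 702 to 712 may be sketched as below. This is a simplified illustration under several assumptions that are not part of the specification: the models are stand-in callables rather than trained classifiers, agreement among all model outputs is used as the condition for proceeding to step 712, and a `max_rounds` limit is added so that the return to step 708 cannot loop indefinitely.

```python
# Symptom names taken from the specification; their order here is arbitrary.
RELEVANT = ("Rectal Pain", "Penile Oedema", "Oral Lesions",
            "Sexually Transmitted Infection")

def extract_features(symptoms):
    """Step 704: keep only the relevant symptoms, as 0.0/1.0 values.

    Step 706 (scaling) is trivial in this sketch because every feature
    is already a 0/1 flag.
    """
    return [1.0 if symptoms.get(name) else 0.0 for name in RELEVANT]

def predict_infection(symptoms, models, max_rounds=3):
    """Steps 708-710: apply every model and check whether they agree.

    Returns the agreed label, or None if the models still disagree
    after max_rounds attempts (hypothetical safeguard).
    """
    features = extract_features(symptoms)
    for _ in range(max_rounds):
        outputs = [model(features) for model in models]   # step 708
        if len(set(outputs)) == 1:                        # step 710
            return outputs[0]                             # ready for step 712
    return None

# Hypothetical stand-in models: simple rules in place of trained classifiers.
def at_least_two(features):
    return sum(features) >= 2

def majority(features):
    return sum(features) > len(features) / 2 - 0.5

def always_positive(features):
    return True

checked = {"Rectal Pain": True, "Oral Lesions": True}     # step 702 input
agreed = predict_infection(checked, [at_least_two, majority])
undecided = predict_infection({}, [at_least_two, always_positive])
```

In a fuller implementation, the return to step 708 would presumably involve retraining or re-tuning the models, since re-applying unchanged deterministic models to the same features yields the same outputs.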
[0063] While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
CLAIMS
I/We Claim:
1. A system (100) for predicting monkeypox infection, comprising:
an input unit (102) configured to receive symptom data from a subject;
a processing unit (104) operatively connected to the input unit (102), wherein the processing unit (104) is characterized by its configuration to:
extract relevant features from the received symptom data, the features being selected based on a predictive correlation with the monkeypox infection;
process the extracted features using scaling and selection techniques to standardize and refine the symptom data;
apply a set of machine learning models (106) to the processed features;
classify a likelihood of monkeypox infection based on the output derived from the machine learning models (106); and
a display interface (108) operatively connected to the processing unit (104), configured to display the likelihood to an end user based on the classified likelihood.
2. The system (100) as claimed in claim 1, wherein the processing unit (104) further comprises a feature ranking module configured to prioritize the symptom data based on a correlation matrix.
3. The system (100) as claimed in claim 1, wherein the machine learning models (106) are selected from at least one of Support Vector Machine (SVM), Logistic Regression, Decision Tree, Perceptron Learning, K-Nearest Neighbors (KNN), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), or a combination thereof.
4. The system (100) as claimed in claim 1, wherein the processing unit (104) is configured to perform dimensionality reduction on the symptom data using Principal Component Analysis (PCA) before applying the set of machine learning models (106).
5. The system (100) as claimed in claim 1, wherein the machine learning models (106) are trained on a dataset containing target values for monkeypox to allow each of the models to calculate a likelihood score.
6. The system (100) as claimed in claim 1, wherein the machine learning models (106) include a support vector machine (SVM) configured with a linear kernel that adjusts a gamma parameter to optimize prediction accuracy.
7. The system (100) as claimed in claim 1, wherein the input unit (102) enables the end user to select symptoms via checkboxes and captures the selected symptoms for receiving the symptom data.
8. The system (100) as claimed in claim 1, wherein the processing unit (104) generates a confusion matrix for evaluating model performance, calculating metrics including true positive, false positive, true negative, and false negative rates for model optimization.
9. The system (100) as claimed in claim 1, wherein the symptom data comprises at least one of rectal pain, penile edema, oral lesions, sexually transmitted infections, or a combination thereof.
10. A method (700) for predicting monkeypox infection using a system (100), the method (700) comprising steps of:
receiving symptom data from a subject through an input unit (102), wherein the symptom data is selected from a rectal pain, a penile edema, oral lesions, sexually transmitted infections, or a combination thereof;
extracting, by a processing unit (104), relevant features from the received symptom data based on a predictive correlation with the monkeypox infection;
processing the extracted features by applying scaling and selection techniques;
applying a set of machine learning models (106) to the processed features, wherein the models are selected from at least one of Support Vector Machine (SVM), Logistic Regression, Decision Tree, Perceptron Learning, K-Nearest Neighbors (KNN), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN);
classifying a likelihood of the monkeypox infection based on outputs from the machine learning models (106); and
displaying the classified likelihood to an end user through a display interface (108).
Date: November 07, 2024
Place: Noida
Nainsi Rastogi
Patent Agent (IN/PA-2372)
Agent for the Applicant
Documents
Name | Date |
---|---|
202441086349-COMPLETE SPECIFICATION [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-DECLARATION OF INVENTORSHIP (FORM 5) [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-DRAWINGS [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-EDUCATIONAL INSTITUTION(S) [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-FORM 1 [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-FORM FOR SMALL ENTITY(FORM-28) [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-FORM-9 [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-OTHERS [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-POWER OF AUTHORITY [09-11-2024(online)].pdf | 09/11/2024 |
202441086349-REQUEST FOR EARLY PUBLICATION(FORM-9) [09-11-2024(online)].pdf | 09/11/2024 |