A SYSTEM AND METHOD FOR BREAST CANCER DIAGNOSIS
ORDINARY APPLICATION
Published
Filed on 15 November 2024
Abstract
The present disclosure discloses a system (100) for diagnosing breast cancer that utilizes topological data analysis to transform mammogram images into meaningful diagnostic insights. It includes a data preprocessing module (102) for image standardization and enhancement and a feature extraction module (104) to create histograms for topological analysis. The topological data analysis module (106) converts these histograms into Persistent Homology Diagrams (PHDs), representing topological features. An Earth Mover's Distance (EMD) matrix is generated by a similarity metric module (108) to compare PHDs. Representative PHDs are identified using a representative selection module (110), enabling accurate classification by the classification module (112). The system's performance is assessed through various metrics by a performance analysis module (114), and a web service module (116) provides an intuitive interface for users to upload images and receive diagnostic results. This approach enhances breast cancer detection by focusing on persistent topological features, offering improved precision and interpretability.
Patent Information
Application ID | 202441088356 |
Invention Field | CHEMICAL |
Date of Application | 15/11/2024 |
Publication Number | 47/2024 |
Inventors
Name | Address | Country | Nationality |
---|---|---|---|
GHOSH ANIRBAN | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
PHANINDRA RAYAPUDI VENKATA | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
BASWALA SRUJANA | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
GADDE SARANYA | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
ABBURI SOWGANDHI | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
CHINNADURAI SUNIL | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
JANARDHANAN RAJIV | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
NANDI DHRUVA | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
AIYAPPAN SENTHIL KUMAR | SRM University-AP, Neerukonda, Mangalagiri Mandal, Guntur- 522502, Andhra Pradesh, India | India | India |
Applicants
Name | Address | Country | Nationality |
---|---|---|---|
SRM UNIVERSITY | Amaravati, Mangalagiri, Andhra Pradesh-522502, India | India | India |
Specification
Description:
FIELD
The present disclosure relates to the healthcare domain and, more particularly, to diagnosing breast cancer using advanced data analysis techniques.
DEFINITION
As used in the present disclosure, the following terms are generally intended to have the meaning as set forth below, except to the extent that the context in which they are used indicates otherwise.
• Topological Features: The term "topological features" refers to the properties of a geometric shape or space that remain unchanged under continuous transformations, such as stretching or bending, without tearing or gluing. In the context of image analysis, topological features are used to study the shape and structure of data, allowing the extraction of meaningful patterns from complex datasets. These features help to understand the spatial distribution and arrangement of elements within the image.
• Earth Mover's Distance (EMD) Matrix: The term "Earth Mover's Distance (EMD) matrix" refers to a matrix of pairwise Earth Mover's Distances, where EMD is a metric used to measure the similarity between two probability distributions over a region. EMD calculates the minimal cost of transforming one distribution into another, where the cost is defined as the amount of work required to move the mass. In image processing, EMD is often used to compare different histograms or distributions of features, enabling more accurate image analysis and comparison.
• Persistent Homology Diagrams (PHDs): The term "Persistent Homology Diagrams (PHDs)" refers to a mathematical tool used in topological data analysis to study the shape and features of data at different spatial resolutions. PHDs capture the persistence of topological features such as connected components, holes, and voids as a function of a scale parameter. They provide a way to visualize and quantify the significance of these features, offering insights into the underlying structure of complex datasets.
• Birth (Topological Feature): The term "birth" in Persistent Homology Diagrams (PHDs) refers to the point at which a topological feature, such as a connected component, loop, or void, first emerges as the scale or threshold of the analysis increases. It represents the initial appearance of the feature within the dataset, indicating when it begins to become significant in the topological analysis. In this context, birth signifies the starting point of a feature's existence as the data is examined at finer levels of detail.
• Death (Topological Feature): The term "death" in Persistent Homology Diagrams (PHDs) denotes the point at which a topological feature disappears or merges with another feature as the scale or threshold continues to increase. It marks the stage where the feature loses its distinct identity, either by blending into a larger structure or becoming irrelevant in the overall data analysis. Death indicates the end of the feature's significance in the dataset, highlighting when it ceases to be a meaningful topological component.
• Region of Interest (ROI): The term "Region of Interest (ROI)" refers to a specific area within an image that is selected for further analysis or processing. In medical imaging, an ROI is often chosen because it contains relevant anatomical structures or abnormalities, such as tumours or lesions, that require detailed examination. Focusing on the ROI helps improve the accuracy and efficiency of image analysis techniques by reducing the amount of data to be processed.
• Microcalcifications: The term "microcalcifications" refers to tiny deposits of calcium in the breast tissue that appear as small white spots on mammogram images. They are often an early sign of breast cancer or other benign breast conditions. Detecting and analysing microcalcifications is crucial in diagnosing and determining the nature of breast abnormalities, as they can indicate the presence of ductal carcinoma in situ (DCIS) or other precancerous changes.
• Accuracy: The term "accuracy" refers to the proportion of correctly classified instances (both true positives and true negatives) out of the total instances evaluated. It indicates how often the system makes the right predictions overall.
• Sensitivity (Recall or True Positive Rate): The term "sensitivity," also known as "recall" or "true positive rate," refers to the ability of a system to correctly identify positive cases. It measures the proportion of actual positives that are accurately detected by the model.
• Specificity (True Negative Rate): The term "specificity," also known as "true negative rate," refers to the ability of a system to correctly identify negative cases. It measures the proportion of actual negatives that are correctly identified by the model.
• Precision (Positive Predictive Value): The term "precision," also known as "positive predictive value," refers to the proportion of true positive cases out of all the cases that the system has predicted as positive. It indicates how many of the predicted positives are actually positive.
• F1 Score: The term "F1 score" refers to the harmonic mean of precision and sensitivity (recall). It is used to balance the trade-off between precision and recall, providing a single metric that takes both into account, especially when the data is imbalanced.
• Similarity Metric Module: The term "similarity metric module" refers to a component within a system that is configured to quantify the degree of similarity or dissimilarity between different sets of data. In the context of image analysis, this module calculates how closely one image or feature representation matches another. It employs mathematical metrics, such as the Earth Mover's Distance (EMD), to compare patterns, structures, or distributions of features extracted from the data. By doing so, the similarity metric module aids in tasks like image classification, clustering, and anomaly detection, providing a numerical value that reflects the level of correspondence between data elements.
The above definitions are in addition to those expressed in the art.
BACKGROUND
The background information herein below relates to the present disclosure but is not necessarily prior art.
Breast cancer remains one of the leading causes of cancer-related mortality worldwide, and early detection is critical to improving patient outcomes. Mammogram imaging is the most common method for screening and diagnosing breast cancer, allowing clinicians to identify anomalies such as lumps, microcalcifications, and other changes in breast tissue. However, the interpretation of mammogram images can be challenging due to the complexity of breast tissue structures and the subtle nature of early-stage abnormalities.
Conventional methods for mammogram analysis primarily rely on visual inspection and manual interpretation by radiologists, which can be subjective and prone to errors. Recent advancements in data analysis techniques have enabled the use of computational tools to enhance image processing and feature extraction. Techniques like topological data analysis (TDA) and Earth Mover's Distance (EMD) offer innovative approaches to quantifying complex patterns in image data, providing a more objective and data-driven basis for diagnosis.
Therefore, there is a need for a system and method for breast cancer diagnosis that alleviates the aforementioned drawbacks.
OBJECTS
Some of the objects of the present disclosure, which at least one embodiment herein satisfies, are as follows:
It is an object of the present disclosure to ameliorate one or more problems of the prior art or to at least provide a useful alternative.
An object of the present disclosure is to provide a system for breast cancer diagnosis.
Another object of the present disclosure is to provide a system that enhances diagnostic capabilities by utilizing advanced data analysis techniques in medical imaging.
Still another object of the present disclosure is to provide a system that facilitates more efficient and accurate interpretation of mammogram images through the use of computational tools.
Yet another object of the present disclosure is to provide a system that improves decision-making processes in clinical environments by providing a reliable and reproducible diagnostic framework.
Still another object of the present disclosure is to provide a system that integrates state-of-the-art image processing techniques to support radiologists and clinicians in identifying potential breast cancer cases.
Yet another object of the present disclosure is to provide a system that enables seamless interaction between healthcare providers and diagnostic tools for effective data sharing and patient management.
Still another object of the present disclosure is to provide a system that remains adaptable to new data inputs and continues to refine its diagnostic models over time for improved performance.
Yet another object of the present disclosure is to provide a system that promotes accessibility and user-friendliness in diagnostic tools, making them suitable for both experts and non-experts in medical imaging.
Still another object of the present disclosure is to provide a method for breast cancer diagnosis.
Other objects and advantages of the present disclosure will be more apparent from the following description, which is not intended to limit the scope of the present disclosure.
SUMMARY
The present disclosure is a system and method for breast cancer diagnosis, the system comprising: a data preprocessing module, a feature extraction module, a topological data analysis module, a similarity metric module, a representative selection module, a classification module, a performance analysis module and a web service module.
The data preprocessing module is configured to receive mammogram images, standardize their size, and enhance the region of interest (ROI) that indicates potential areas of lump formation.
The feature extraction module is configured to transform the standardized mammogram images into histograms to facilitate topological feature extraction.
The topological data analysis module is configured to convert the histograms into Persistent Homology Diagrams (PHDs) to represent the birth-death cycles of the topological features.
The similarity metric module is configured to generate an Earth Mover's Distance (EMD) matrix to quantify the similarity between PHDs of different mammogram images.
The representative selection module is configured to identify representative PHDs for healthy and unhealthy cohorts based on intra-group similarity and inter-group dissimilarity metrics using the EMD matrix.
The classification module is configured to perform binary classification by comparing the PHDs of new mammogram images against the selected representative PHDs using the EMD metric.
The performance analysis module is configured to evaluate the system's performance using metrics including accuracy, sensitivity, specificity, precision, F1 score, and computational efficiency.
The web service module is configured to provide a user interface enabling clinicians and non-experts to upload mammogram images and receive diagnostic results with interpretive visualizations.
In an embodiment, the data preprocessing module is configured to employ adaptive thresholding techniques to automatically detect and highlight areas with dense tissue that may indicate tumour regions.
In an embodiment, the topological data analysis module implements a multi-scale analysis using different filtration parameters to ensure the detection of significant topological features while minimizing noise artifacts.
In an embodiment, the similarity metric module includes a parallel processing mechanism to compute the Earth Mover's Distance (EMD) more efficiently, reducing computation time without compromising accuracy.
In an embodiment, the representative selection module dynamically updates the representative PHDs based on new incoming mammogram data to refine the classification model over time.
In an embodiment, the classification module integrates a hybrid machine learning technique, combining the topological data analysis with a supervised learning model to improve classification accuracy.
In an embodiment, the web service module is configured with an AI-based chatbot that provides users with step-by-step guidance through the diagnostic process and interpretation of the results.
The present disclosure provides a method for diagnosing breast cancer using topological data analysis, comprising:
• receiving mammogram images and preprocessing them with a data preprocessing module to standardize the size and focus on the region of interest (ROI);
• converting the pre-processed mammogram images into histograms with a feature extraction module to highlight significant patterns in the image data;
• transforming the histograms into Persistent Homology Diagrams (PHDs) using a topological data analysis module to capture birth-death processes of topological features;
• generating an Earth Mover's Distance (EMD) matrix with a similarity metric module to measure the similarity between PHDs of different mammograms;
• selecting representative PHDs for healthy and unhealthy cohorts using a representative selection module by assessing both intra-group cohesion and inter-group separation based on EMD values;
• classifying mammogram images with a classification module by comparing their PHDs to the representative PHDs and assigning diagnostic labels accordingly;
• evaluating diagnostic accuracy using a performance analysis module with metrics such as accuracy, sensitivity, specificity, precision, F1 score, and confusion matrix analysis; and
• displaying the classification results through a web service module, including visual representations of the mammogram data and interpretive feedback for the user.
In an embodiment, the topological data analysis in the method involves applying persistent homology with varying filtration parameters to differentiate between noise and significant topological structures in the dataset.
In an embodiment, the method further comprises the step of optimizing the Earth Mover's Distance (EMD) computation by leveraging GPU acceleration to reduce processing time for large datasets.
In an embodiment, the representative PHD selection in the method incorporates a confidence scoring mechanism that quantifies the reliability of the representative PHDs based on their EMD metrics.
In an embodiment, the classification process in the method includes a voting-based ensemble technique to improve the robustness and accuracy of the binary classification.
In an embodiment, the method further comprises generating a performance report that includes both quantitative metrics and qualitative analysis to guide healthcare providers in making clinical decisions.
In an embodiment, the method includes a web service module that automatically updates its model parameters using active learning techniques as new mammogram data is continuously fed into the system.
In an embodiment, the method includes classification results that are further used to create a geospatial map of breast cancer incidence, helping identify emerging hotspots and support clinical resource allocation in underserved regions.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWING
A system and method for breast cancer diagnosis of the present disclosure will now be described with the help of the accompanying drawing, in which:
Figure 1 illustrates a block diagram of the present disclosure;
Figures 2A and 2B illustrate a flowchart of the process, in accordance with the present disclosure;
Figure 3 illustrates a block diagram depicting the conversion of a mammogram into its corresponding Persistent Homology Diagram (PHD) through stages of resizing, feature extraction, and cropping, in accordance with the present disclosure;
Figure 4 illustrates a flowchart depicting the decision-making process for classifying mammogram images into healthy and unhealthy categories using distance matrices, in accordance with the present disclosure;
Figures 5A and 5B illustrate a graphical plot of the initial Persistent Homology Diagram (PHD) depicting the birth and death of topological features with a color gradient indicating feature intensity, in accordance with the present disclosure; and
Figures 6A and 6B illustrate a graphical plot of the cropped Persistent Homology Diagram (PHD) depicting the most significant topological features after noise reduction, in accordance with the present disclosure.
LIST OF REFERENCE NUMERALS
100 - System
102 - Data preprocessing module
104 - Feature extraction module
106 - Topological data analysis module
108 - Similarity metric module
110 - Representative selection module
112 - Classification module
114 - Performance analysis module
116 - Web service module
200 - Method
DETAILED DESCRIPTION
The present disclosure relates to systems and methods for medical image analysis, particularly focusing on diagnosing breast cancer using advanced data analysis techniques. The disclosure utilizes topological data analysis, feature extraction, and statistical metrics to enhance the accuracy and reliability of mammogram image interpretation. It is configured to provide healthcare professionals with improved diagnostic tools that can offer insights into breast tissue abnormalities.
Embodiments of the present disclosure will now be described with reference to the accompanying drawing.
Embodiments are provided so as to thoroughly and fully convey the scope of the present disclosure to the person skilled in the art. Numerous details are set forth, relating to specific components, and methods, to provide a complete understanding of embodiments of the present disclosure. It will be apparent to the person skilled in the art that the details provided in the embodiments should not be construed to limit the scope of the present disclosure. In some embodiments, well known processes, well known apparatus structures, and well known techniques are not described in detail.
The terminology used, in the present disclosure, is only for the purpose of explaining a particular embodiment and such terminology shall not be considered to limit the scope of the present disclosure. As used in the present disclosure, the forms "a," "an," and "the" may be intended to include the plural forms as well, unless the context clearly suggests otherwise. The terms "including," and "having," are open ended transitional phrases and therefore specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not forbid the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The particular order of steps disclosed in the method and process of the present disclosure is not to be construed as necessarily requiring their performance as described or illustrated. It is also to be understood that additional or alternative steps may be employed.
When an element is referred to as being "engaged to," "connected to," or "coupled to" another element, it may be directly engaged, connected, or coupled to the other element. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed elements.
Referring to Figure 1, the present disclosure provides a system and method for breast cancer diagnosis, the system (100) comprising: a data preprocessing module (102), a feature extraction module (104), a topological data analysis module (106), a similarity metric module (108), a representative selection module (110), a classification module (112), a performance analysis module (114) and a web service module (116).
The data preprocessing module (102) is configured to receive mammogram images, standardize their size, and enhance the region of interest (ROI) that indicates potential areas of lump formation.
The feature extraction module (104) is configured to transform the standardized mammogram images into histograms to facilitate topological feature extraction.
The topological data analysis module (106) is configured to convert the histograms into Persistent Homology Diagrams (PHDs) to represent the birth-death cycles of the topological features.
The similarity metric module (108) is configured to generate an Earth Mover's Distance (EMD) matrix to quantify the similarity between PHDs of different mammogram images.
The representative selection module (110) is configured to identify representative PHDs for healthy and unhealthy cohorts based on intra-group similarity and inter-group dissimilarity metrics using the EMD matrix.
The classification module (112) is configured to perform binary classification by comparing the PHDs of new mammogram images against the selected representative PHDs using the EMD metric.
The performance analysis module (114) is configured to evaluate the system's performance using metrics including accuracy, sensitivity, specificity, precision, F1 score, and computational efficiency.
The web service module (116) is configured to provide a user interface enabling clinicians and non-experts to upload mammogram images and receive diagnostic results with interpretive visualizations.
In an embodiment, the data preprocessing module (102) is configured to employ adaptive thresholding techniques to automatically detect and highlight areas with dense tissue that may indicate tumour regions.
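As a rough illustration of adaptive thresholding, the sketch below marks a pixel as dense when it is brighter than the mean of its local neighbourhood by a margin. The function name `adaptive_threshold`, the `window` and `offset` parameters, and the local-mean rule are assumptions made for this example; the disclosure does not specify which thresholding technique is used.

```python
def adaptive_threshold(image, window=3, offset=0):
    """Return a binary mask: 1 where a pixel exceeds its local mean plus `offset`.

    `image` is a list of rows of grayscale intensities (hypothetical input format).
    """
    rows, cols = len(image), len(image[0])
    half = window // 2
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Neighbourhood bounds, clipped at the image border.
            r0, r1 = max(0, r - half), min(rows, r + half + 1)
            c0, c1 = max(0, c - half), min(cols, c + half + 1)
            neighbourhood = [image[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            local_mean = sum(neighbourhood) / len(neighbourhood)
            if image[r][c] > local_mean + offset:
                mask[r][c] = 1
    return mask


# Toy 3x4 image with one bright pixel standing in for dense tissue.
image = [
    [10, 10, 10, 10],
    [10, 200, 10, 10],
    [10, 10, 10, 10],
]
mask = adaptive_threshold(image, window=3, offset=20)
```

Because the threshold follows the local mean, a uniformly bright or uniformly dark region produces no highlighted pixels; only local contrast does.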
In an embodiment, the feature extraction module (104) is configured to convert mammograms into histograms based on pixel intensity distribution.
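A pixel-intensity histogram of this kind could be built as in the minimal sketch below. The bin count, the 8-bit intensity range, and the normalisation to unit mass are assumptions chosen for illustration, and `intensity_histogram` is a hypothetical name.

```python
def intensity_histogram(image, bins=8, max_value=255):
    """Count pixels per equal-width intensity bin and normalise so the bins sum to 1."""
    counts = [0] * bins
    for row in image:
        for value in row:
            # Map an intensity in [0, max_value] to a bin index in [0, bins - 1].
            index = min(value * bins // (max_value + 1), bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [count / total for count in counts]


# Toy 2x3 image of 8-bit intensities.
hist = intensity_histogram([[0, 31, 32], [255, 128, 130]], bins=8)
```

Normalising the counts makes histograms from images of different sizes comparable as distributions, which is what a distance such as EMD expects.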
In an embodiment, the topological data analysis module (106) implements a multi-scale analysis using different filtration parameters to ensure the detection of significant topological features while minimizing noise artifacts.
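One way to picture how a histogram gives rise to birth-death pairs is 0-dimensional sublevel-set persistence on a one-dimensional sequence: a connected component is born at each local minimum and dies when it merges into a component that was born earlier. The sketch below is an illustration under that assumption only; the disclosure does not state which filtration or homology dimension is used, and `sublevel_persistence` is a hypothetical name.

```python
def sublevel_persistence(values):
    """Return 0-dimensional (birth, death) pairs of a 1D sequence.

    Positions enter the filtration in order of increasing value; neighbouring
    positions that are both present belong to the same component.
    """
    n = len(values)
    parent = [None] * n  # None means "not yet in the filtration"
    birth = {}           # component root -> value at which it was born

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component born later dies at the merge value.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if birth[young] < values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((min(values), float("inf")))
    return pairs


pairs = sublevel_persistence([2, 0, 3, 1, 4])  # -> [(1, 3), (0, inf)]
```

In the toy sequence, the minimum at value 1 starts a component that merges into the older one (born at 0) when the peak at value 3 enters, giving the pair (1, 3).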
In an embodiment, the similarity metric module (108) includes a parallel processing mechanism to compute the Earth Mover's Distance (EMD) more efficiently, reducing computation time without compromising accuracy.
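The sketch below shows what a pairwise EMD matrix might look like under a strong simplification: each diagram is reduced to its list of finite persistence values (death minus birth), and the one-dimensional Earth Mover's Distance between those lists is computed from their cumulative distributions. This reduction, and the names `persistence_values`, `emd_1d`, and `emd_matrix`, are assumptions for illustration; the disclosure does not specify how EMD is computed on PHDs, and the parallel processing mechanism is not shown.

```python
import math


def persistence_values(diagram):
    """Finite persistence (death - birth) of each point in a diagram."""
    return [d - b for b, d in diagram if math.isfinite(d)]


def emd_1d(a, b):
    """1D Earth Mover's Distance between two non-empty samples with uniform weights.

    Equals the area between the two empirical cumulative distribution functions.
    """
    sa, sb = sorted(a), sorted(b)
    points = sorted(set(sa) | set(sb))
    total = 0.0
    ia = ib = 0
    for left, right in zip(points, points[1:]):
        # Advance each pointer past every sample value <= left.
        while ia < len(sa) and sa[ia] <= left:
            ia += 1
        while ib < len(sb) and sb[ib] <= left:
            ib += 1
        total += abs(ia / len(sa) - ib / len(sb)) * (right - left)
    return total


def emd_matrix(diagrams):
    """Symmetric matrix of pairwise distances; only the upper triangle is computed."""
    samples = [persistence_values(d) for d in diagrams]
    n = len(samples)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = emd_1d(samples[i], samples[j])
    return matrix


# Three toy diagrams; the first two are identical.
diagrams = [
    [(0, 1), (0, 3)],
    [(0, 1), (0, 3)],
    [(0, 2), (0, 4)],
]
matrix = emd_matrix(diagrams)
```

Each entry depends only on its own pair of diagrams, so the upper-triangle loop is the natural place to distribute work across processes or devices.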
In an embodiment, the representative selection module (110) dynamically updates the representative PHDs based on new incoming mammogram data to refine the classification model over time.
In an embodiment, the classification module (112) integrates a hybrid machine learning technique, combining the topological data analysis with a supervised learning model to improve classification accuracy.
In an embodiment, the performance analysis module (114) utilizes advanced statistical techniques to assess the system's diagnostic performance.
In an embodiment, the web service module (116) is configured with an AI-based chatbot that provides users with step-by-step guidance through the diagnostic process and interpretation of the results.
Figures 2A and 2B illustrate a flowchart that includes the steps involved in a method (200) for breast cancer diagnosis, in accordance with an embodiment of the present disclosure. The order in which method (200) is described is not intended to be construed as a limitation, and any number of the described method (200) steps may be combined in any order to implement method (200), or an alternative method. Furthermore, method (200) may be implemented by a processing resource or electronic device(s) through any suitable hardware, non-transitory machine-readable medium/instructions, or a combination thereof. The method (200) comprises the following steps:
At step (202), the method (200) includes receiving mammogram images and preprocessing them with a data preprocessing module (102) to standardize the size and focus on the region of interest (ROI).
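Size standardization can be pictured with a nearest-neighbour resampler, sketched below. The choice of nearest-neighbour interpolation and the name `resize_nearest` are assumptions for illustration; the disclosure does not name a resizing technique or a target size.

```python
def resize_nearest(image, new_rows, new_cols):
    """Resample a 2D list of intensities to new_rows x new_cols by nearest neighbour."""
    rows, cols = len(image), len(image[0])
    return [
        # Each output pixel copies the source pixel at the proportional position.
        [image[r * rows // new_rows][c * cols // new_cols] for c in range(new_cols)]
        for r in range(new_rows)
    ]


small = [[1, 2], [3, 4]]
enlarged = resize_nearest(small, 4, 4)
restored = resize_nearest(enlarged, 2, 2)
```

Bringing every image to the same dimensions means that histograms built later are computed over the same number of pixels.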
At step (204), the method (200) includes converting the pre-processed mammogram images into histograms with a feature extraction module (104) to highlight significant patterns in the image data.
At step (206), the method (200) includes transforming the histograms into Persistent Homology Diagrams (PHDs) using a topological data analysis module (106) to capture birth-death processes of topological features.
At step (208), the method (200) includes generating an Earth Mover's Distance (EMD) matrix with a similarity metric module (108) to measure the similarity between PHDs of different mammograms.
At step (210), the method (200) includes selecting representative PHDs for healthy and unhealthy cohorts using a representative selection module (110) by assessing both intra-group cohesion and inter-group separation based on EMD values.
At step (212), the method (200) includes classifying mammogram images with a classification module (112) by comparing their PHDs to the representative PHDs and assigning diagnostic labels accordingly.
At step (214), the method (200) includes evaluating diagnostic accuracy using a performance analysis module (114) with metrics such as accuracy, sensitivity, specificity, precision, F1 score, and confusion matrix analysis.
At step (216), the method (200) includes displaying the classification results through a web service module (116), including visual representations of the mammogram data and interpretive feedback for the user.
In an embodiment, the method (200) further comprises the step of implementing a feature extraction technique to enhance the detection of microcalcifications in mammogram images.
In an embodiment, the method (200) includes topological data analysis that involves applying persistent homology with varying filtration parameters to differentiate between noise and significant topological structures in the dataset.
In an embodiment, the method (200) further comprises the step of optimizing the Earth Mover's Distance (EMD) computation by leveraging GPU acceleration to reduce processing time for large datasets.
In an embodiment, the representative PHD selection in the method (200) incorporates a confidence scoring mechanism that quantifies the reliability of the representative PHDs based on their EMD metrics.
In an embodiment, the classification process in the method (200) includes a voting-based ensemble technique to improve the robustness and accuracy of the binary classification.
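A voting-based ensemble for a binary label can be as simple as a majority vote over several individual decisions, as sketched below. The disclosure does not describe what the individual voters are or how ties are handled; the `majority_vote` name and the tie rule (fall back to "unhealthy" so that borderline cases are flagged for review) are assumptions made only for this example.

```python
from collections import Counter


def majority_vote(labels, tie_label="unhealthy"):
    """Combine several 'healthy'/'unhealthy' votes into a single label."""
    counts = Counter(labels)
    healthy, unhealthy = counts["healthy"], counts["unhealthy"]
    if healthy == unhealthy:
        # Assumed tie rule, not taken from the disclosure.
        return tie_label
    return "healthy" if healthy > unhealthy else "unhealthy"


decision = majority_vote(["healthy", "unhealthy", "healthy"])
tied = majority_vote(["healthy", "unhealthy"])
```

Using an odd number of voters avoids ties altogether, which keeps the outcome independent of the assumed tie rule.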
In an embodiment, the method (200) further comprises generating a performance report that includes both quantitative metrics and qualitative analysis to guide healthcare providers in making clinical decisions.
In an embodiment, the method (200) includes a web service module that automatically updates its model parameters using active learning techniques as new mammogram data is continuously fed into the system.
In an embodiment, the method (200) includes classification results that are further used to create a geospatial map of breast cancer incidence, helping identify emerging hotspots and support clinical resource allocation in underserved regions.
Figures 3 and 4 are described in turn below. Figure 3 covers the conversion of mammogram images into Persistent Homology Diagrams (PHDs) for topological data analysis, and Figure 4 covers the decision-making flow for classifying mammograms as healthy or unhealthy.
Figure 3 shows the step-by-step process of converting a mammogram into a Persistent Homology Diagram (PHD) through a series of transformations. The process begins with the Original Mammogram, which is the initial raw image obtained from the screening procedure. To standardize the dataset and focus on the key regions, the mammogram undergoes resizing, resulting in a Resized Mammogram that ensures consistency in dimensions across all images. Next, the resized image is analysed to create the histogram of extracted features, which captures the significant data points and patterns detected within the image. This histogram is then used to generate a PHD, representing the birth-death cycles of the topological features identified in the mammogram. The final step involves cropping the PHD to remove redundant information, resulting in a cropped PHD that contains only the most relevant topological details for analysis. This sequence highlights the transformation of the mammogram into its topological representation, which is essential for the classification process.
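The cropping step is not specified in detail. One plausible reading, sketched below purely as an assumption, is to discard points whose persistence (death minus birth) falls below a threshold, since short-lived features are commonly treated as noise. The `crop_diagram` name, the `min_persistence` parameter, and the decision to drop features that never die are all hypothetical.

```python
def crop_diagram(pairs, min_persistence):
    """Keep finite (birth, death) pairs that persist for at least `min_persistence`."""
    return [
        (birth, death)
        for birth, death in pairs
        if death != float("inf") and death - birth >= min_persistence
    ]


diagram = [(0.0, 0.1), (0.2, 0.9), (0.0, float("inf")), (0.5, 0.55)]
cropped = crop_diagram(diagram, min_persistence=0.3)  # -> [(0.2, 0.9)]
```

The threshold controls the trade-off between removing noise and discarding faint but genuine structure, so it would normally be tuned on the training cohorts.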
Figure 4 represents the flow diagram that guides the classification of mammogram images into either healthy or unhealthy categories using the generated topological data. The process starts with Initialization, where the system prepares both the healthy training dataset and the unhealthy training dataset for analysis. Each dataset is separately subjected to Pre-processing to enhance image quality and focus on the regions of interest. Following this, the system generates two Distance Matrices: the intra-cohort matrix is generated in the same manner as the normal method, while the inter-cohort matrix is generated by comparing the images of one cohort with those of the complementary cohort. The two matrices are used to rank the images based on their similarity to their own group and their dissimilarity to the other group. The image with the lowest average rank is selected as the representative of its cohort. The analysis then proceeds to the evaluation phase, where a Test Mammogram is introduced. Pairwise distance calculations are performed for the test mammogram against the Healthy Representative and the Unhealthy Representative, yielding similarity scores H and U, respectively. The system reaches a Decision Node that compares these scores to determine whether the test mammogram is more similar to the healthy or the unhealthy representative. Based on this comparison, the mammogram is either Tagged as Healthy if the score H is higher or Tagged as Unhealthy if the score U is higher. This structured decision-making process ensures precise classification of mammograms using topological data analysis techniques.
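The ranking and decision flow can be sketched as follows. To keep the example self-contained, each image is stood in for by a single number and the distance is an absolute difference; in the disclosed system the items would be PHDs and the distance would be EMD. The function names and the conversion of a distance into a similarity score, 1 / (1 + distance), are assumptions introduced so that a higher score means a closer match, as the text describes for H and U.

```python
def rank(values, reverse=False):
    """Rank of each entry (0 = best); ascending unless `reverse` is set."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def select_representative(own, other, distance):
    """Pick the item closest to its own cohort and farthest from the other one."""
    intra = [sum(distance(x, y) for y in own) / len(own) for x in own]
    inter = [sum(distance(x, y) for y in other) / len(other) for x in own]
    intra_rank = rank(intra)                # small intra-cohort distance is better
    inter_rank = rank(inter, reverse=True)  # large inter-cohort distance is better
    average = [(a + b) / 2 for a, b in zip(intra_rank, inter_rank)]
    return own[average.index(min(average))]  # first item wins a tie


def classify(test, healthy_rep, unhealthy_rep, distance):
    """Label a test item by comparing similarity scores H and U."""
    h = 1.0 / (1.0 + distance(test, healthy_rep))    # assumed similarity score
    u = 1.0 / (1.0 + distance(test, unhealthy_rep))
    return "healthy" if h > u else "unhealthy"


def toy_distance(a, b):
    # Stand-in for EMD between two PHDs.
    return abs(a - b)


healthy = [1, 4, 5]
unhealthy = [9, 10, 13]
healthy_rep = select_representative(healthy, unhealthy, toy_distance)    # -> 4
unhealthy_rep = select_representative(unhealthy, healthy, toy_distance)  # -> 10
label = classify(6, healthy_rep, unhealthy_rep, toy_distance)            # -> "healthy"
```

Averaging the two ranks rather than the raw distances keeps the intra-cohort and inter-cohort criteria on the same scale, at the cost of occasional ties that need a deterministic rule.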
The confusion matrix for this technique is shown below.
The performance of this technique in terms of the previously mentioned metrics can be calculated as follows.
Accuracy = 94.59%
Sensitivity = 0.958
Specificity = 0.94
Precision = 0.884
F1 Score = 0.92
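By way of non-limiting illustration, the metrics above follow from the four cells of a binary confusion matrix as sketched below. The counts used in the example are hypothetical: they were chosen only because they reproduce the reported figures to within rounding, and they should not be read as the confusion matrix of the disclosure. The sketch assumes that "unhealthy" is the positive class.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=23, fn=1, fp=3, tn=47)
```

The F1 score is the harmonic mean of precision and sensitivity, which is why it lies between the two.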
Figures 5A and 5B are Persistent Homology Diagrams (PHDs) that represent the birth and death of topological features within mammogram data.
Figure 5A, for healthy mammograms, shows a broader distribution of features across the birth-death plane, with many points dispersed away from the diagonal. This suggests a wide range of feature persistence. The intensity bar on the right quantifies feature significance, with darker regions indicating higher persistence values, highlighting the more prominent topological features in the data.
Figure 5B, for mammograms with lesions, shows a more refined concentration of features, with most points clustered closer to the diagonal line, implying that the remaining features are more persistent and relevant. The intensity scale, similar to that of Figure 5A, reflects feature significance, emphasizing highly persistent topological attributes. This refinement reduces noise and focuses on the most critical features, improving the ability to classify mammogram data effectively.
Similarly, Figure 6A represents the cropped version of Figure 5A (healthy mammograms), and Figure 6B represents the cropped version of Figure 5B (mammograms with lesions).
In an embodiment, the present disclosure provides an outline for constructing simplicial complexes from pixel-intensity data derived from mammograms. It details the systematic tracking of topological features (such as connected components, loops, and voids) during the filtration process as the scale parameter increases. The primary objective is to analyse the birth and death of these features to enhance breast cancer detection.
1. Data Preprocessing
i. Input Data: Begin with a dataset consisting of mammogram images in a digital format. Each image is represented as a two-dimensional array of pixel intensities, where pixel values correspond to varying shades of grey.
ii. Normalization: Standardize all mammograms to a uniform resolution (e.g., 512x512 pixels) to ensure consistency across the dataset, facilitating comparative analysis.
iii. Histogram Representation: Generate a histogram for each mammogram based on pixel intensity values. This histogram serves as a 1D representation that aggregates the frequency of pixel intensities, providing a foundational input for subsequent topological analysis.
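By way of non-limiting illustration, the normalization and histogram steps may be sketched as follows. The sketch assumes 8-bit grey levels (0 to 255), one histogram bin per grey level, and nearest-neighbour resampling; the description does not fix the interpolation method or the number of bins. The function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of grey levels, assumed to lie in 0..255


def resize_nearest(image: Image, size: int) -> Image:
    """Resize to size x size pixels using nearest-neighbour sampling."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def intensity_histogram(image: Image, bins: int = 256) -> List[int]:
    """Count how many pixels take each grey level (a 1D summary of the image)."""
    hist = [0] * bins
    for row in image:
        for value in row:
            hist[value] += 1
    return hist


# Toy 2x2 image enlarged to 4x4; a real pipeline would use e.g. size=512.
toy = [[0, 255], [255, 255]]
resized = resize_nearest(toy, 4)
hist = intensity_histogram(resized)
```

Because every image is brought to the same size before counting, the histograms of different mammograms sum to the same total and can be compared directly.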
2. Construction of Simplicial Complexes
• Initial Set of Points: Define each pixel or intensity value as a point in a higher-dimensional space, effectively mapping the two-dimensional image into a multi-dimensional point cloud;
• Ball Growth: Introduce a scale parameter r that determines the radius of the balls centred on each data point (pixel):
o For each pixel, grow a ball of radius r ; and
o As the radius r increases, monitor and document the intersections between these balls, establishing connections among the points.
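By way of non-limiting illustration, the intersection test behind ball growth may be sketched as follows: two balls of radius r centred on two points intersect once the distance between the points is at most 2r. How the points are derived from the image or its histogram is not specified in the description, so the coordinates below are placeholders, and the function name is hypothetical.

```python
import math
from itertools import combinations


def edges_at_scale(points, r):
    """Index pairs (i, j) whose balls of radius r intersect (distance <= 2r)."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if math.dist(p, q) <= 2 * r
    ]


# Placeholder point cloud: three points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
small = edges_at_scale(points, 0.5)   # only the closest pair is connected
medium = edges_at_scale(points, 1.5)  # the middle point now reaches both ends
large = edges_at_scale(points, 2.0)   # every pair is connected
```

Evaluating the function over an increasing sequence of radii yields the nested sequence of edge sets from which the filtration is built.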
3. Filtration Process: Filtration is defined as a series of simplicial complexes constructed as the radius r expands incrementally.
Step-by-Step Construction:
1. Initialization: Begin with an empty simplicial complex, which will be populated as the filtration progresses.
2. Iterative Growth: For each increment in the radius r:
• New Edges Formation: Identify and record new edges formed between points as their corresponding balls overlap;
• Connected Components (H₀): Detect and mark the emergence of new connected components when isolated points begin to merge;
• Loops (H₁): Identify loops that form when edges create cycles, indicating the presence of circular structures within the data; and
• Voids (H₂): Monitor the formation of voids when higher-dimensional simplices start enclosing space, thus indicating topological cavities.
4. Birth and Death of Topological Features
Tracking Features:
• Birth of Features: For each identified feature (connected component, loop, or void), record the scale parameter r at which the feature first appears. This value is termed the birth value.
• Death of Features: As the radius r continues to increase, track when features merge or disappear. The scale at which a feature ceases to exist is recorded as the death value.
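By way of non-limiting illustration, the birth and death values of connected components (H₀) may be tracked with a union-find structure as sketched below. The sketch covers H₀ only; loops and voids require a full persistent homology computation that is not shown. It assumes every point is born at scale 0 and that a component dies at the radius at which it merges into another. The function name is hypothetical.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Birth/death pairs of connected components under ball growth."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two balls of radius r meet when r reaches half the point distance.
    edges = sorted(
        (math.dist(points[i], points[j]) / 2, i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for r, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            pairs.append((0.0, r))   # one component dies in this merge
    pairs.append((0.0, math.inf))    # the final component never dies
    return pairs


diagram = h0_persistence([(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)])
```

For the three collinear points, the two nearest points merge at r = 0.5, the third joins at r = 1.5, and one component persists indefinitely.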
5. Persistence Diagrams
Representation of Features:
Construct persistence diagrams based on the recorded birth and death values of the topological features:
X-axis: Represents the birth value of each topological feature.
Y-axis: Represents the death value of each topological feature.
Each point (b, d) on the diagram corresponds to a feature that was born at scale b and died at scale d.
6. Analysis of Persistence Diagrams
Differentiation of Features:
• Significant Features: Points situated far from the diagonal line in the diagram indicate features that persist over a wide range of scales. These persistent features are often indicative of significant structures within the mammograms (e.g., lesions).
• Noise Features: Conversely, points near the diagonal suggest features that are transient and likely represent noise rather than meaningful topological structures.
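By way of non-limiting illustration, the distinction between significant and noise features may be expressed through the persistence d - b of each point, which is proportional to its distance from the diagonal (that distance equals (d - b)/√2). The threshold below is a hypothetical tuning parameter; the description does not state a cut-off value.

```python
def split_by_persistence(diagram, threshold):
    """Separate (birth, death) points into long-lived and short-lived features."""
    significant, noise = [], []
    for birth, death in diagram:
        # Persistence is the range of scales over which the feature exists.
        target = significant if death - birth >= threshold else noise
        target.append((birth, death))
    return significant, noise


example = [(0.0, 0.1), (0.2, 1.5), (1.0, 1.05)]
significant, noise = split_by_persistence(example, threshold=0.5)
```

Only the point (0.2, 1.5) survives the threshold here; the other two lie close to the diagonal and are treated as noise.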
7. Application to Breast Cancer Detection
• Feature Extraction: The persistence diagrams generated (PHDs) serve as feature vectors in the classification pipeline. Each diagram encapsulates vital information about the topological structure of the mammograms.
• Comparison of Mammograms: By systematically analysing the PHDs, mammograms can be classified as healthy or unhealthy based on the presence and significance of the persistent topological features identified within them.
In the operative configuration, the disclosed system (100) for diagnosing breast cancer utilizes topological data analysis to process mammogram images efficiently. The system comprises a data preprocessing module (102) that standardizes and enhances the mammogram images, focusing on the region of interest (ROI) to ensure uniformity. These pre-processed images are then passed to the feature extraction module (104), where they are transformed into histograms that facilitate the extraction of key topological features. The topological data analysis module (106) converts these histograms into Persistent Homology Diagrams (PHDs) that capture the birth-death cycles of significant topological features in the mammogram data. The similarity metric module (108) computes an Earth Mover's Distance (EMD) matrix to quantify the similarity between PHDs, while the representative selection module (110) identifies the most characteristic PHDs for both healthy and unhealthy cohorts. The classification module (112) utilizes these representative PHDs to classify new mammogram images by comparing their PHDs with those of the identified representatives, determining the health status of the tissue. The system's performance is continuously assessed using the performance analysis module (114), which calculates diagnostic metrics such as accuracy, sensitivity, specificity, precision, and F1 score. Finally, the web service module (116) provides an intuitive user interface that allows healthcare practitioners and non-experts to upload mammogram images and receive clear, interpretable diagnostic results, thus ensuring ease of use and accessibility.
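By way of non-limiting illustration, a pairwise EMD matrix may be sketched as follows. The description does not specify how the Earth Mover's Distance is applied to a PHD, so the sketch makes a simplifying assumption: each diagram is reduced to the list of its finite lifetimes (d - b), and the one-dimensional EMD between the resulting empirical distributions is computed from the area between their cumulative distribution functions. All function names are hypothetical.

```python
import math


def lifetimes(diagram):
    """Finite lifetimes d - b of a list of (birth, death) points."""
    return [d - b for b, d in diagram if d != math.inf]


def emd_1d(a, b):
    """1D Earth Mover's Distance between two uniformly weighted samples."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total, ia, ib = 0.0, 0, 0
    for left, right in zip(grid, grid[1:]):
        # Advance each counter to the number of samples <= left.
        while ia < len(a) and a[ia] <= left:
            ia += 1
        while ib < len(b) and b[ib] <= left:
            ib += 1
        # Area between the two step-function CDFs on [left, right).
        total += abs(ia / len(a) - ib / len(b)) * (right - left)
    return total


def emd_matrix(diagrams):
    """Symmetric matrix of pairwise EMD values between diagrams."""
    lives = [lifetimes(d) for d in diagrams]
    n = len(lives)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = emd_1d(lives[i], lives[j])
    return matrix


toy_diagrams = [[(0.0, 1.0)], [(0.0, 2.0)], [(0.0, 1.0), (0.0, 3.0)]]
distances = emd_matrix(toy_diagrams)
```

A matrix of this form is the input expected by the representative selection step: rows restricted to one cohort give the intra-cohort matrix, and rows of one cohort against columns of the other give the inter-cohort matrix.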
Advantageously, the disclosed system leverages topological data analysis through persistent homology diagrams (PHDs) combined with the earth mover's distance (EMD) metric, which provides a robust and interpretable alternative to conventional machine learning and deep learning techniques for breast cancer detection. Unlike black-box models that often lack transparency, this method offers a clear representation of topological features, making it easier for clinicians to understand the basis of diagnostic decisions. The system's ability to update and refine representative PHDs dynamically based on new mammogram data ensures that the classification process remains accurate and relevant over time, even as medical data evolves. Furthermore, the integration with a web-based platform enhances accessibility, enabling resource-limited healthcare settings to utilize advanced diagnostic tools without the need for specialized technical expertise. This approach not only improves early detection rates but also supports the development of targeted interventions, ultimately contributing to better clinical outcomes and optimized allocation of healthcare resources, especially in underserved regions. The combination of real-time processing, adaptability, and user-friendly design makes this system a valuable tool in the fight against breast cancer, promoting widespread adoption in both urban and rural healthcare environments.
The functions described herein may be implemented in hardware, executed by a processor, firmware, or any combination thereof. Other examples and implementations are within the scope and spirit of the disclosure and appended claims. The present disclosure can be implemented by a processor, hardware, firmware, hardwiring, or combinations of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations.
The foregoing description of the embodiments has been provided for purposes of illustration and is not intended to limit the scope of the present disclosure. Individual components of a particular embodiment are generally not limited to that particular embodiment but are interchangeable. Such variations are not to be regarded as a departure from the present disclosure, and all such modifications are considered to be within the scope of the present disclosure.
TECHNICAL ADVANCEMENTS
The present disclosure described hereinabove has several technical advantages including, but not limited to, a system and method for breast cancer diagnosis, which:
• enhances diagnostic capabilities by utilizing advanced data analysis techniques in medical imaging;
• enables more efficient and accurate interpretation of mammogram images through the use of computational tools;
• improves decision-making processes in clinical environments by providing a reliable and reproducible diagnostic framework;
• integrates state-of-the-art image processing techniques to support radiologists and clinicians in identifying potential breast cancer cases;
• enables seamless interaction between healthcare providers and diagnostic tools for effective data sharing and patient management;
• remains adaptable to new data inputs and continues to refine its diagnostic models over time for improved performance; and
• promotes accessibility and user-friendliness in diagnostic tools, making them suitable for both experts and non-experts in medical imaging.
The foregoing disclosure has been described with reference to the accompanying embodiments which do not limit the scope and ambit of the disclosure. The description provided is purely by way of example and illustration.
The embodiments herein and the various features and advantageous details thereof are explained with reference to the non-limiting embodiments in the following description. Descriptions of well known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
The foregoing description of the specific embodiments so fully reveals the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the embodiments as described herein.
Any discussion of devices, articles or the like that has been included in this specification is solely for the purpose of providing a context for the disclosure. It is not to be taken as an admission that any or all of these matters form a part of the prior art base or were common general knowledge in the field relevant to the disclosure as it existed anywhere before the priority date of this application.
While considerable emphasis has been placed herein on the components and component parts of the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the disclosure. These and other changes in the preferred embodiment as well as other embodiments of the disclosure will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the disclosure and not as a limitation.
WE CLAIM:
1. A system (100) for breast cancer diagnosis, comprising:
• a data preprocessing module (102) configured to receive mammogram images, standardize their size, and enhance the region of interest (ROI) that indicates potential areas of lump formation;
• a feature extraction module (104) configured to transform the standardized mammogram images into histograms to facilitate topological feature extraction;
• a topological data analysis module (106) configured to convert the histograms into Persistent Homology Diagrams (PHDs) to represent the birth-death cycles of the topological features;
• a similarity metric module (108) configured to generate an Earth Mover's Distance (EMD) matrix to quantify the similarity between PHDs of different mammogram images;
• a representative selection module (110) configured to identify representative PHDs for healthy and unhealthy cohorts based on intra-group similarity and inter-group dissimilarity metrics using the EMD matrix;
• a classification module (112) configured to perform binary classification by comparing the PHDs of new mammogram images against the selected representative PHDs using the EMD metric;
• a performance analysis module (114) configured to evaluate the system's performance using metrics including accuracy, sensitivity, specificity, precision, F1 score, and computational efficiency; and
• a web service module (116) configured to provide a user interface enabling clinicians and non-experts to upload mammogram images and receive diagnostic results with interpretive visualizations.
2. The system (100) of claim 1, wherein said data preprocessing module (102) is configured to employ adaptive thresholding techniques to automatically detect and highlight areas with dense tissue that may indicate tumour regions.
3. The system (100) of claim 1, wherein said feature extraction module (104) includes a step of texture analysis.
4. The system (100) of claim 1, wherein said topological data analysis module (106) implements a multi-scale analysis using different filtration parameters to ensure the detection of significant topological features while minimizing noise artifacts.
5. The system (100) of claim 1, wherein said similarity metric module (108) includes a parallel processing mechanism to compute the Earth Mover's Distance (EMD) more efficiently, reducing computation time without compromising accuracy.
6. The system (100) of claim 1, wherein said representative selection module (110) dynamically updates the representative PHDs based on new incoming mammogram data to refine the classification model over time.
7. The system (100) of claim 1, wherein said classification module (112) integrates a hybrid machine learning technique, combining the topological data analysis with a supervised learning model to improve classification accuracy.
8. The system (100) of claim 1, wherein said performance analysis module (114) utilizes advanced statistical techniques to assess the system's diagnostic performance.
9. The system (100) of claim 1, wherein said web service module (116) is configured with an AI-based chatbot that provides users with step-by-step guidance through the diagnostic process and interpretation of the results.
10. A method (200) for breast cancer diagnosis, comprising:
• receiving mammogram images and preprocessing them with a data preprocessing module (102) to standardize the size and focus on the region of interest (ROI);
• converting the pre-processed mammogram images into histograms with a feature extraction module (104) to highlight significant patterns in the image data;
• transforming the histograms into Persistent Homology Diagrams (PHDs) using a topological data analysis module (106) to capture birth-death processes of topological features;
• generating an Earth Mover's Distance (EMD) matrix with a similarity metric module (108) to measure the similarity between PHDs of different mammograms;
• selecting representative PHDs for healthy and unhealthy cohorts using a representative selection module (110) by assessing both intra-group cohesion and inter-group separation based on EMD values;
• classifying mammogram images with a classification module (112) by comparing their PHDs to the representative PHDs and assigning diagnostic labels accordingly;
• evaluating diagnostic accuracy using a performance analysis module (114) with metrics such as accuracy, sensitivity, specificity, precision, F1 score, and confusion matrix analysis; and
• displaying the classification results through a web service module (116), including visual representations of the mammogram data and interpretive feedback for the user.
11. The method (200) as claimed in claim 10, further comprising the step of implementing a feature extraction technique based on wavelet decomposition to enhance the detection of microcalcifications in the mammogram images.
12. The method (200) as claimed in claim 10, wherein the topological data analysis involves applying persistent homology with varying filtration parameters to differentiate between noise and significant topological structures in the dataset.
13. The method (200) as claimed in claim 10, further comprising the step of optimizing the Earth Mover's Distance (EMD) computation by leveraging GPU acceleration to reduce processing time for large datasets.
14. The method (200) as claimed in claim 10, wherein said representative PHD selection incorporates a confidence scoring mechanism that quantifies the reliability of the representative PHDs based on their EMD metrics.
15. The method (200) as claimed in claim 10, wherein said classification process includes a voting-based ensemble technique to improve the robustness and accuracy of the binary classification.
16. The method (200) as claimed in claim 10, further comprising generating a performance report that includes both quantitative metrics and qualitative analysis to guide healthcare providers in making clinical decisions.
17. The method (200) as claimed in claim 10, wherein said web service module automatically updates its model parameters using active learning techniques as new mammogram data is continuously fed into the system.
Dated this 15th Day of November, 2024
_______________________________
MOHAN RAJKUMAR DEWAN, IN/PA - 25
of R.K. DEWAN & CO.
Authorized Agent of Applicant
TO,
THE CONTROLLER OF PATENTS
THE PATENT OFFICE, AT CHENNAI
Documents
Name | Date |
---|---|
202441088356-AMMENDED DOCUMENTS [10-12-2024(online)].pdf | 10/12/2024 |
202441088356-FORM 13 [10-12-2024(online)].pdf | 10/12/2024 |
202441088356-MARKED COPIES OF AMENDEMENTS [10-12-2024(online)].pdf | 10/12/2024 |
202441088356-AMMENDED DOCUMENTS [20-11-2024(online)].pdf | 20/11/2024 |
202441088356-FORM 13 [20-11-2024(online)].pdf | 20/11/2024 |
202441088356-MARKED COPIES OF AMENDEMENTS [20-11-2024(online)].pdf | 20/11/2024 |
202441088356-FORM-26 [16-11-2024(online)].pdf | 16/11/2024 |
202441088356-COMPLETE SPECIFICATION [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-DECLARATION OF INVENTORSHIP (FORM 5) [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-DRAWINGS [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-EDUCATIONAL INSTITUTION(S) [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-EVIDENCE FOR REGISTRATION UNDER SSI [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-FORM 1 [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-FORM 18 [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-FORM FOR SMALL ENTITY(FORM-28) [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-FORM-9 [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-PROOF OF RIGHT [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-REQUEST FOR EARLY PUBLICATION(FORM-9) [15-11-2024(online)].pdf | 15/11/2024 |
202441088356-REQUEST FOR EXAMINATION (FORM-18) [15-11-2024(online)].pdf | 15/11/2024 |