METHOD FOR THE ANALYSIS OF GENE EXPRESSION PROFILES USING HIGH-THROUGHPUT SEQUENCING


ORDINARY APPLICATION

Published

Filed on 28 October 2024

Abstract

This invention discloses a method for analyzing gene expression profiles using high-throughput sequencing. The method involves sequencing RNA from biological samples, preprocessing the data, aligning sequences to a reference genome, and quantifying gene expression levels. The method enhances accuracy by accounting for errors and alternative splicing during alignment and employs robust normalization techniques to reduce batch effects. Advanced statistical models, including machine learning approaches, are used to identify differentially expressed genes. This method offers improved scalability, sensitivity, and applicability across various biological research fields, including disease biomarker discovery and drug response analysis.

Patent Information

Application ID	202441082194
Invention Field	BIO-CHEMISTRY
Date of Application	28/10/2024
Publication Number	44/2024

Inventors

Name	Address	Country	Nationality
Dr. N. Sarithadevi	Associate Professor St. Pauls College of Pharmacy, Sy. No. 603, 604 & 605 Turkayamjal (V), Abdullapurmet (M), R.R. Dist. - 501510, Telangana, India.	India	India
Dr. Kiranmai Mandava	Professor & Principal St. Pauls College of Pharmacy, Sy. No. 603, 604 & 605 Turkayamjal (V), Abdullapurmet (M), R.R. Dist. - 501510, Telangana, India.	India	India
Mr. A. Santhosh	Assistant Professor St. Pauls College of Pharmacy, Sy. No. 603, 604 & 605 Turkayamjal (V), Abdullapurmet (M), R.R. Dist. - 501510, Telangana, India.	India	India

Applicants

Name	Address	Country	Nationality
St. Pauls College of Pharmacy	TURKAYAMJAL, NAGARJUNA SAGAR ROAD, HYDERABAD, TELANGANA 501510	India	India
Dr. Are Anusha	Associate Professor ST. PAULS COLLEGE OF PHARMACY, TURKAYAMJAL, NAGARJUNA SAGAR ROAD, HYDERABAD, TELANGANA 501510	India	India

Specification

Description: METHOD FOR THE ANALYSIS OF GENE EXPRESSION PROFILES USING HIGH-THROUGHPUT SEQUENCING

FIELD OF THE INVENTION

The present invention relates to the field of molecular biology and bioinformatics. Specifically, it pertains to a method for analyzing gene expression profiles using high-throughput sequencing technologies. This invention provides a systematic approach to quantify and interpret RNA sequences in biological samples, enabling the identification of differentially expressed genes across various conditions, disease states, or treatments.

BACKGROUND OF THE INVENTION

Gene expression profiling is a crucial method in molecular biology for understanding how genes are regulated under different physiological and pathological conditions. Traditional methods such as microarrays have been widely used; however, these techniques are limited by low sensitivity and specificity and by their reliance on predefined probes.
The advent of high-throughput sequencing technologies, such as RNA-seq (RNA sequencing), has revolutionized gene expression analysis. RNA-seq provides an unbiased, comprehensive, and highly accurate view of the transcriptome, enabling researchers to quantify gene expression levels across entire genomes. High-throughput sequencing technology can also identify novel transcripts, alternative splicing events, and gene fusion events, which were challenging to detect using earlier methods.
However, existing methods for analyzing gene expression profiles using high-throughput sequencing often lack efficiency, accuracy, and scalability. These methods are prone to errors due to issues such as the large size of sequencing data, errors during sequence alignment, and biases introduced during sample preparation. Furthermore, the interpretation of gene expression data remains complex, requiring sophisticated computational algorithms to ensure accurate results.
There is a need for an improved method that enhances the accuracy, efficiency, and scalability of gene expression analysis using high-throughput sequencing. This invention addresses these challenges by introducing a robust method that simplifies data processing, optimizes computational workflows, and improves the overall interpretation of gene expression profiles.

SUMMARY OF THE INVENTION

The present invention provides a novel method for analyzing gene expression profiles using high-throughput sequencing data, such as RNA-seq. The method comprises the steps of sequencing RNA from biological samples, preprocessing the sequencing data, aligning the sequences to a reference genome or transcriptome, and quantifying gene expression levels using a combination of normalization and statistical techniques.
In an embodiment, the method involves the use of specialized algorithms to improve sequence alignment accuracy by accounting for read errors, biases, and alternative splicing events. The method further includes a novel normalization approach to minimize batch effects and ensure consistent expression measurements across multiple experiments or sample sets.
Another embodiment of the invention introduces advanced statistical techniques for identifying differentially expressed genes between experimental conditions. This method leverages machine learning-based models to improve the sensitivity and specificity of gene expression analysis, thus providing a comprehensive and accurate profile of the transcriptome.
The invention is scalable, allowing the analysis of large-scale sequencing data with minimal computational resources. It can be applied across a wide range of biological studies, including disease biomarker discovery, drug response analysis, and developmental biology research.

BRIEF DESCRIPTION OF THE FIGURES

The accompanying figures, where like reference numerals refer to identical or functionally similar elements throughout the separate views, and which together with the detailed description below are incorporated in and form part of the specification, serve to further illustrate various embodiments and to explain various principles and advantages, all in accordance with the invention.
A diagram (Figure 1) illustrates the workflow of the method for gene expression analysis using high-throughput sequencing. The figure depicts the following steps:
1. RNA extraction from biological samples.
2. Library preparation and sequencing of RNA.
3. Preprocessing of raw sequencing data, including quality control and trimming.
4. Alignment of sequences to a reference genome or transcriptome.
5. Quantification of gene expression levels.
6. Statistical analysis for differential expression.
7. Visualization and interpretation of results.
Skilled artisans will appreciate that the elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help improve understanding of embodiments of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

While various embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed. It shall be understood that different aspects of the invention can be appreciated individually, collectively, or in combination with each other.
In an embodiment, the method begins with the extraction of RNA from biological samples, which may include tissues, cells, or bodily fluids. RNA is then processed into sequencing libraries using established protocols such as poly(A) selection or ribosomal RNA depletion, depending on the specific research objective.
The sequencing libraries are subjected to high-throughput sequencing using platforms such as Illumina, Ion Torrent, or Pacific Biosciences. The generated raw sequencing data typically contains millions of short reads representing the RNA molecules present in the samples.
Once the sequencing data is obtained, the raw reads undergo preprocessing steps, including quality assessment, trimming of adapter sequences, and removal of low-quality reads. This step ensures that only high-quality sequencing data is used for downstream analysis, reducing biases and improving accuracy.
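The preprocessing step described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes Phred+33 quality encoding, exact adapter matching, and illustrative thresholds, and the names ADAPTER, phred33, and trim_read are hypothetical. Dedicated trimming tools handle partial and mismatched adapters, which this sketch does not.

```python
from typing import List, Optional, Tuple

# Assumed adapter prefix, used here for illustration only.
ADAPTER = "AGATCGGAAGAGC"


def phred33(quality: str) -> List[int]:
    """Convert a Phred+33 encoded quality string to integer scores."""
    return [ord(ch) - 33 for ch in quality]


def trim_read(seq: str, qual: str, adapter: str = ADAPTER,
              min_q: int = 20, min_len: int = 20) -> Optional[Tuple[str, str]]:
    """Return the trimmed (sequence, quality) pair, or None if the read is discarded."""
    # 1. Cut the read at the first exact occurrence of the adapter, if any.
    cut = seq.find(adapter)
    if cut != -1:
        seq, qual = seq[:cut], qual[:cut]

    # 2. Remove trailing bases whose quality score is below the threshold.
    scores = phred33(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    seq, qual = seq[:end], qual[:end]

    # 3. Discard reads that are too short after trimming (assumed cutoff).
    if len(seq) < min_len:
        return None
    return seq, qual


if __name__ == "__main__":
    read = "ACGT" * 6 + ADAPTER
    quality = "I" * 22 + "##" + "I" * len(ADAPTER)
    print(trim_read(read, quality))
```

The thresholds min_q and min_len are placeholders; suitable values depend on the sequencing platform and the read length.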
The preprocessed reads are aligned to a reference genome or transcriptome using specialized software such as STAR, HISAT2, or Bowtie. In an embodiment, the method incorporates an alignment algorithm that accounts for read mismatches, insertions, deletions, and alternative splicing events. This results in more accurate mapping of the sequencing reads to the genome.
After alignment, gene expression levels are quantified by counting the number of reads that map to specific genes. In an embodiment, the method uses a normalization technique such as TPM (Transcripts Per Million) or FPKM (Fragments Per Kilobase of transcript per Million mapped reads) to standardize expression levels across samples by correcting for sequencing depth and transcript length, which supports the reduction of batch effects.
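As an illustration of the two normalizations named above, the following sketch computes FPKM and TPM from per-gene fragment counts and gene lengths using their standard definitions. The function names and the example genes are hypothetical, and the gene length is taken as a fixed number of base pairs, which is a simplification (effective transcript length is often used in practice).

```python
from typing import Dict


def fpkm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped fragments")
    # count / (length in kb) / (total in millions) = count * 1e9 / (length * total)
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total) for g in counts}


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {g: counts[g] / lengths_bp[g] for g in counts}
    scale = sum(rates.values())
    if scale == 0:
        raise ValueError("no mapped fragments")
    return {g: rates[g] / scale * 1e6 for g in counts}


if __name__ == "__main__":
    # Hypothetical counts and gene lengths for one sample.
    counts = {"geneA": 100, "geneB": 300, "geneC": 600}
    lengths = {"geneA": 1000, "geneB": 2000, "geneC": 3000}
    print(fpkm(counts, lengths))
    print(tpm(counts, lengths))
```

TPM values sum to one million within every sample, which makes the relative abundance of a gene directly comparable between samples; FPKM values do not share a fixed total. Both correct for sequencing depth and gene length only.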
To identify differentially expressed genes, the method applies advanced statistical models, including machine learning-based approaches that improve the sensitivity and specificity of the analysis. This allows researchers to accurately identify genes that are significantly upregulated or downregulated between experimental conditions, treatments, or disease states.
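The specification does not name a particular statistical model. As a stand-in, the sketch below shows two standard building blocks of differential expression analysis: a log2 fold change with a pseudocount, and the Benjamini-Hochberg adjustment of per-gene p-values for multiple testing. The function names and the pseudocount of 1.0 are assumptions, and the p-values are taken as given from whichever per-gene test is used.

```python
import math
from typing import List


def log2_fold_change(mean_a: float, mean_b: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of condition B over condition A; the pseudocount avoids division by zero."""
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(log2_fold_change(3.0, 15.0))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A gene would typically be reported as differentially expressed when its adjusted p-value falls below a chosen false discovery rate and its absolute log2 fold change exceeds a chosen cutoff; both cutoffs are study-specific choices.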
The results are then visualized using tools such as heatmaps, volcano plots, and principal component analysis (PCA), enabling the clear interpretation of gene expression patterns. These visualizations aid in the discovery of key biological insights, including gene regulatory networks, disease biomarkers, and potential therapeutic targets.
Exemplary Embodiment 1: Cancer Biomarker Discovery
In this embodiment, the method is utilized to analyze gene expression profiles in cancer tissues to identify potential biomarkers for diagnosis and treatment response.
1. Sample Collection: RNA is extracted from tumor and adjacent normal tissues of patients diagnosed with a specific type of cancer, such as breast cancer.
2. Library Preparation and Sequencing: RNA libraries are prepared using poly(A) selection, followed by high-throughput sequencing on an Illumina platform.
3. Preprocessing: The raw sequencing data undergoes quality control checks, trimming of adapter sequences, and removal of low-quality reads to ensure high-quality data for subsequent analysis.
4. Alignment: Preprocessed reads are aligned to the human reference genome (e.g., GRCh38) using a read alignment algorithm (e.g., STAR) that accounts for splicing events, improving alignment accuracy.
5. Quantification: Gene expression levels are quantified using FPKM normalization to account for differences in sequencing depth and gene length.
6. Differential Expression Analysis: Statistical models, including a machine learning-based approach, identify differentially expressed genes between tumor and normal tissues. The analysis reveals a set of genes significantly upregulated in tumor samples.
7. Biomarker Validation: Potential biomarkers identified through sequencing are validated using RT-qPCR on an independent cohort of patients to confirm their relevance in cancer diagnosis and prognosis.
8. Clinical Application: The identified biomarkers may be used for developing diagnostic assays or therapeutic targets, ultimately contributing to personalized medicine strategies for cancer treatment.
Exemplary Embodiment 2: Drug Response Analysis in Neurological Disorders
This embodiment focuses on analyzing gene expression profiles in response to a novel drug treatment in patients with neurological disorders, such as Alzheimer's disease.
1. Patient Stratification: RNA is collected from patients diagnosed with Alzheimer's disease before and after treatment with a new drug aimed at enhancing cognitive function.
2. Library Preparation and Sequencing: RNA libraries are generated from patient samples and sequenced using a high-throughput platform, such as Ion Torrent, allowing for rapid data generation.
3. Data Preprocessing: The raw sequencing reads are subjected to stringent quality control, including the removal of low-quality reads and trimming, ensuring that only high-quality data is used for analysis.
4. Sequence Alignment: The cleaned reads are aligned to the human transcriptome using an alignment algorithm that accounts for the presence of novel isoforms and alternative splicing events.
5. Expression Quantification: The aligned reads are quantified, and expression levels of genes are normalized using a method such as TPM to minimize biases related to sequencing depth and gene length.
6. Differential Gene Expression Analysis: Advanced statistical techniques, including linear models and machine learning approaches, are employed to compare gene expression levels before and after treatment. The analysis identifies genes that show significant changes in expression correlated with treatment response.
7. Functional Enrichment Analysis: Identified genes are subjected to functional enrichment analysis to elucidate biological pathways affected by the drug, providing insights into the mechanism of action.
8. Therapeutic Insights: Results from this analysis can inform further drug development and optimize treatment protocols for Alzheimer's disease, enhancing the understanding of drug effects on gene regulation in neurological conditions.
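Because this embodiment compares the same patients before and after treatment, the comparison in step 6 is a paired one. The sketch below illustrates that idea for a single gene with an exact sign-flip permutation test on per-patient differences. It is a substitute chosen for simplicity, not the linear or machine learning models mentioned above; the function name and the example values are hypothetical, and the expression values are assumed to be on a log2 scale.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_pvalue(before: Sequence[float], after: Sequence[float]) -> float:
    """Two-sided exact sign-flip test for a paired before/after difference.

    Under the null hypothesis of no treatment effect, each patient's
    difference is equally likely to be positive or negative, so every
    sign assignment is enumerated (2**n of them; practical only for small n).
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        n_total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding.
        if stat >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


if __name__ == "__main__":
    # Hypothetical log2 expression of one gene in six patients.
    before = [5.0, 6.0, 5.5, 6.5, 5.0, 6.0]
    after = [6.0, 7.5, 6.5, 8.0, 6.5, 7.0]
    print(paired_sign_flip_pvalue(before, after))
```

With six patients the smallest attainable two-sided p-value is 2/64, so small cohorts limit how strongly any single gene can be flagged. The resulting per-gene p-values would still need correction for multiple testing across all genes.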
Advantages of the Invention
1. Enhanced Accuracy: The method improves the accuracy of gene expression analysis by accounting for sequencing errors, alternative splicing, and read alignment biases.
2. Scalability: It is highly scalable and can process large volumes of sequencing data efficiently, making it suitable for large-scale transcriptomic studies.
3. Improved Sensitivity and Specificity: The method employs advanced statistical and machine learning techniques to enhance the detection of differentially expressed genes with high precision.
4. Robust Normalization: The normalization approach reduces batch effects, ensuring consistent and reliable gene expression measurements across different experimental conditions.
5. Broad Applicability: The method can be used in various biological research areas, including disease diagnostics, drug response analysis, and biomarker discovery.
Claims: I/WE CLAIM:
1. A method for the analysis of gene expression profiles using high-throughput sequencing, comprising the steps of: a) sequencing RNA from a biological sample; b) preprocessing raw sequencing data, including quality control and trimming; c) aligning the sequencing data to a reference genome or transcriptome; d) quantifying gene expression levels; and e) identifying differentially expressed genes using statistical techniques.

2. The method of claim 1, wherein the preprocessing step includes removing adapter sequences and low-quality reads.

3. The method of claim 1, wherein the alignment step uses an algorithm that accounts for alternative splicing events.

4. The method of claim 1, wherein the quantification step uses TPM or FPKM normalization techniques.

5. The method of claim 1, wherein the sequencing is performed using an Illumina platform.

6. The method of claim 1, further comprising the step of visualizing the results using heatmaps or PCA.

7. The method of claim 1, wherein the biological sample is selected from tissue, cells, or bodily fluids.

8. The method of claim 1, wherein the statistical techniques include machine learning-based models for differential expression analysis.

9. The method of claim 1, further comprising the step of validating the gene expression results using RT-qPCR.

Documents

Name	Date
202441082194-FORM-5 [05-11-2024(online)].pdf	05/11/2024
202441082194-COMPLETE SPECIFICATION [28-10-2024(online)].pdf	28/10/2024
202441082194-DRAWINGS [28-10-2024(online)].pdf	28/10/2024
202441082194-EDUCATIONAL INSTITUTION(S) [28-10-2024(online)].pdf	28/10/2024
202441082194-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [28-10-2024(online)].pdf	28/10/2024
202441082194-FORM 1 [28-10-2024(online)].pdf	28/10/2024
202441082194-FORM FOR SMALL ENTITY(FORM-28) [28-10-2024(online)].pdf	28/10/2024
202441082194-FORM-9 [28-10-2024(online)].pdf	28/10/2024
202441082194-POWER OF AUTHORITY [28-10-2024(online)].pdf	28/10/2024