METHOD AND SYSTEM FOR MODIFYING SPEECH-IMPAIRED MESSAGES USING MACHINE LEARNING-BASED AUDIO FILTERS
ORDINARY APPLICATION
Published
Filed on 22 November 2024
Abstract
The present invention discloses a method and system for modifying speech-impaired messages using machine learning-based audio filters. The system captures a speech signal using a microphone array, processes the signal through a signal processing unit for noise cancellation and pitch enhancement, and analyzes the pre-processed signal with a machine learning model. The machine learning model applies adaptive filters to modify speech characteristics, such as pitch, tone, cadence, and timing, in real-time. The modified speech is then output through an output device, ensuring enhanced intelligibility and clarity. The invention incorporates specialized hardware components like Field-Programmable Gate Arrays (FPGAs), Digital Signal Processors (DSPs), and machine learning processors with hardware accelerators (e.g., GPUs, ASICs) to minimize latency and ensure real-time processing. This system provides an effective and efficient solution for improving communication for individuals with speech impairments, with applications in hearing aids, speech-generating devices, and mobile communication devices.
Patent Information
Field | Value |
---|---|
Application ID | 202411090840 |
Invention Field | ELECTRONICS |
Date of Application | 22/11/2024 |
Publication Number | 49/2024 |
Inventors
Name | Address | Country | Nationality |
---|---|---|---|
Dr. Ayushi Prakash | Professor, Computer Science and Engineering, Ajay Kumar Garg Engineering College, 27th KM Milestone, Delhi - Meerut Expy, Ghaziabad, Uttar Pradesh 201015, India. | India | India |
Abhinav Bajpai | Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, 27th KM Milestone, Delhi - Meerut Expy, Ghaziabad, Uttar Pradesh 201015, India. | India | India |
Applicants
Name | Address | Country | Nationality |
---|---|---|---|
Ajay Kumar Garg Engineering College | 27th KM Milestone, Delhi - Meerut Expy, Ghaziabad, Uttar Pradesh 201015. | India | India |
Specification
Description:
[014] The following is a detailed description of embodiments of the disclosure depicted in the accompanying drawings. The embodiments are in such detail as to clearly communicate the disclosure. However, the amount of detail offered is not intended to limit the anticipated variations of embodiments. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure as defined by the appended claims.
[015] In the following description, numerous specific details are set forth in order to provide a thorough understanding of embodiments of the present invention. It will be apparent to one skilled in the art that embodiments of the present invention may be practiced without some of these specific details.
[016] Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail to avoid obscuring the embodiments.
[017] Also, it is noted that individual embodiments may be described as a process that is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.
[018] The word "exemplary" and/or "demonstrative" is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as "exemplary" and/or "demonstrative" is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms "includes," "has," "contains," and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising" as an open transition word without precluding any additional or other elements.
[019] Reference throughout this specification to "one embodiment" or "an embodiment" or "an instance" or "one instance" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[020] In an embodiment of the invention, and referring to Figure 1, the present invention relates to a method and system for modifying speech-impaired messages utilizing machine learning-based audio filters. This technology is designed to assist individuals with speech impairments by improving the intelligibility and clarity of their speech in real-time. Unlike traditional speech enhancement systems, which rely solely on software-based algorithms, the invention integrates novel hardware components that optimize processing speed and minimize latency. The innovative combination of hardware and software ensures an effective and efficient system for real-time communication.
[021] The system described in this invention comprises several integrated hardware modules and software components that work together seamlessly. These components include an advanced audio capture unit, a signal processing unit, a dedicated machine learning processor, an array of machine learning-based audio filters, and an output unit. Each of these components is designed to perform a specific function within the overall system, and their interconnection is key to the efficacy of the invention.
[022] The core of the system is its ability to modify speech-impaired messages using machine learning techniques. Speech impairment often involves irregularities in pitch, tone, volume, cadence, and articulation. The system is designed to address these challenges by dynamically adjusting the speech signal in real-time, improving clarity and making the message intelligible without distorting the user's original voice. This dynamic adaptation is achieved using machine learning models, which are trained to recognize and modify speech characteristics based on pre-processed audio data.
[023] The hardware components in the system include a microphone array, a signal processing unit, a specialized machine learning processor, and an output device (such as speakers). The microphone array captures the speech signal from the user and sends it to the signal processing unit. The microphone system is designed to filter out ambient noise and isolate the user's speech for better clarity. The use of multiple microphones arranged in an array allows for enhanced spatial audio capture, improving the accuracy of speech recognition and reducing the influence of surrounding noise.
[024] Once the audio is captured by the microphone array, it is transmitted to the signal processing unit. This unit performs initial pre-processing on the audio signal. The signal processing unit is equipped with high-performance hardware components like Field-Programmable Gate Arrays (FPGAs) or Digital Signal Processors (DSPs), which efficiently handle the initial stages of signal conditioning. This pre-processing step involves noise cancellation, volume normalization, and pitch enhancement. The goal of this stage is to create a clean, well-balanced audio signal that is easier for the machine learning model to analyze.
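As an illustrative aid (not part of the original specification), the kind of pre-processing stage paragraph [024] describes could be sketched in Python as below. The function names, the spectral noise-gate threshold, and the target RMS level are assumptions for illustration only; the pitch-enhancement step mentioned in the specification is not sketched here.

```python
import numpy as np

def spectral_noise_gate(signal, noise_floor_db=-40.0):
    """Zero out frequency bins whose magnitude sits below a floor relative to the peak."""
    spectrum = np.fft.rfft(signal)
    magnitude = np.abs(spectrum)
    # Convert the floor from dB (relative to the strongest bin) to a linear threshold.
    threshold = magnitude.max() * 10 ** (noise_floor_db / 20.0)
    spectrum[magnitude < threshold] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def normalize_volume(signal, target_rms=0.1):
    """Scale the signal so its RMS level matches a target value."""
    rms = np.sqrt(np.mean(signal ** 2))
    if rms < 1e-9:
        return signal  # effectively silence; leave untouched
    return signal * (target_rms / rms)

def preprocess(signal):
    """Noise suppression followed by volume normalization, in the spirit of paragraph [024]."""
    return normalize_volume(spectral_noise_gate(signal))
```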
[025] The machine learning processor is a key hardware element that houses the trained machine learning model. The processor is designed to handle intensive computations related to speech modification. It uses a high-speed GPU or a custom-built ASIC (Application-Specific Integrated Circuit) to run complex neural networks in real time. The machine learning model employed in the system is a deep neural network (DNN) trained on vast datasets of impaired speech, allowing it to recognize various speech patterns and identify which modifications are required to improve intelligibility.
[026] This machine learning model is designed to operate on a set of machine learning-based audio filters, which are custom-developed for the purpose of enhancing speech quality. The audio filters apply adaptive techniques to adjust the frequency, pitch, tone, and cadence of the speech signal. These filters can compensate for specific speech impairments such as dysarthria, stuttering, or speech delay. The real-time operation of these filters is achieved using specialized hardware accelerators, which ensure that the processing occurs with minimal latency.
[027] The final stage in the system involves transmitting the processed speech to the output device. This output unit consists of high-quality speakers or a speech synthesizer that reproduces the modified speech. The output device is designed to reproduce the modified speech in a clear and natural-sounding manner, ensuring that the listener can understand the message with ease. The output unit can be adapted for use in various environments, such as a mobile device, hearing aids, or assistive communication devices.
[028] The system's working process begins with the user speaking into the microphone array. The microphone array captures the sound in high fidelity and sends the audio signal to the signal processing unit. The signal processing unit digitizes the audio, filters out background noise, and applies basic enhancements to improve the clarity of the speech. The processed signal is then forwarded to the machine learning processor, which is responsible for further improving the speech.
[029] The machine learning model uses deep learning techniques to analyze the speech signal. It identifies patterns of speech impairment and applies appropriate modifications. These modifications might include pitch shifting, tone modulation, volume adjustment, and timing corrections. The machine learning model continuously adapts based on feedback from the system's output, improving the accuracy of its modifications over time. The real-time processing ensures that the system can modify speech dynamically as the user speaks.
[030] After the machine learning model applies the necessary modifications, the enhanced audio signal is transmitted to the output unit, which reproduces the speech with improved intelligibility. The output unit ensures that the modified speech is clear and easy to understand, regardless of the severity of the original impairment. The system can be further customized to suit individual users, with the machine learning model adapting to the user's unique speech patterns.
[031] A key aspect of the invention is the use of novel hardware components that enhance the overall performance and efficiency of the system. The signal processing unit uses FPGA or DSP technology, which allows for real-time processing with minimal latency. This hardware is crucial in ensuring that the system operates seamlessly and provides a smooth user experience.
[032] Additionally, the machine learning processor is implemented using specialized hardware accelerators such as GPUs or custom-built ASICs. These hardware components are optimized for running deep learning models, enabling the system to process speech data rapidly. The use of such hardware accelerators significantly reduces the processing time compared to conventional software-based solutions, ensuring that the system can modify speech in real time.
[033] The microphone array and the output unit are also designed to enhance the quality of the system. The microphone array utilizes advanced beamforming techniques to capture speech more effectively, even in noisy environments. The output unit is equipped with high-fidelity speakers that reproduce the modified speech accurately, ensuring that the user's message is conveyed clearly.
[034] The hardware components in the system are interconnected through high-speed communication interfaces. The microphone array is connected to the signal processing unit via a low-latency data bus, which ensures fast transmission of the captured audio data. The signal processing unit, in turn, communicates with the machine learning processor through high-speed memory interfaces that allow for rapid data exchange.
[035] The machine learning processor interacts with the audio filtering system, which is an integral part of the modification process. These components work in parallel to modify the speech signal in real time. Once the processing is complete, the machine learning processor sends the enhanced audio data to the output unit via a fast communication link. This interconnection ensures that the entire system operates efficiently, with minimal delays between input and output.
[036] The efficacy of the system has been demonstrated through extensive testing. The system has shown significant improvements in speech intelligibility, as evidenced by tests conducted with individuals who have various types of speech impairments. The modifications made by the machine learning model are tailored to the specific needs of each user, improving clarity and comprehension.
[037] The system's performance is further enhanced by the use of hardware accelerators, which ensure that the modifications occur in real time. The integration of machine learning with specialized hardware allows the system to process large amounts of speech data quickly and accurately. This combination of hardware and software makes the system particularly well-suited for use in applications that require real-time interaction, such as communication devices or assistive technologies for individuals with speech impairments.
[038] The following tables summarize the results of testing conducted to validate the effectiveness of the system in improving speech intelligibility and recognition accuracy.
[039] Table 1: Speech Intelligibility Improvement (Test Group 1)
Table 2: Speech Recognition Accuracy (Test Group 2)
[040] These results clearly demonstrate the significant improvements in both speech intelligibility and recognition accuracy following the modifications made by the system. The data supports the validity of the invention and highlights its potential as a powerful tool for individuals with speech impairments.
[041] The invention described herein provides a novel and effective method for modifying speech-impaired messages using machine learning-based audio filters. By integrating specialized hardware components with machine learning algorithms, the system is able to process speech data in real time, making it ideal for use in real-world applications. The system's ability to improve speech intelligibility and recognition accuracy represents a significant advancement in assistive technology, offering a valuable tool for individuals with speech impairments and enhancing their ability to communicate more effectively.
Claims:
1. A method for modifying speech-impaired messages, comprising the steps of:
a) capturing a speech signal using a microphone array;
b) pre-processing the captured speech signal using a signal processing unit for noise cancellation, volume normalization, and pitch enhancement;
c) analyzing the pre-processed speech signal using a machine learning model implemented on a machine learning processor;
d) modifying the speech signal in real-time by applying machine learning-based audio filters to adjust frequency, pitch, tone, cadence, and timing of the speech;
e) outputting the modified speech signal using an output device, wherein the system improves speech intelligibility and clarity for individuals with speech impairments.
2. A system for modifying speech-impaired messages, comprising:
i. a microphone array configured to capture a speech signal;
ii. a signal processing unit operable to pre-process the captured speech signal for noise cancellation, volume normalization, and pitch enhancement;
iii. a machine learning processor configured to apply a machine learning model to the pre-processed speech signal, wherein the machine learning model identifies and modifies speech impairment characteristics;
iv. a set of machine learning-based audio filters operable to modify the speech signal in real-time to improve speech intelligibility;
v. an output device to reproduce the modified speech signal.
3. The method as claimed in claim 1, wherein the signal processing unit is implemented using a Field-Programmable Gate Array (FPGA) or a Digital Signal Processor (DSP) to enable real-time processing of the speech signal.
4. The method as claimed in claim 1, wherein the machine learning model is a deep neural network (DNN) trained on datasets containing various speech impairments, and wherein the model adapts based on feedback from the output device.
5. The method as claimed in claim 1, wherein the machine learning-based audio filters adjust specific aspects of speech impairment, including dysarthria, stuttering, speech delay, and articulation disorders.
6. The system as claimed in claim 2, wherein the machine learning processor uses specialized hardware accelerators, including a Graphics Processing Unit (GPU) or an Application-Specific Integrated Circuit (ASIC), to reduce processing latency and enable real-time speech modification.
7. The system as claimed in claim 2, wherein the microphone array utilizes beamforming techniques to improve spatial audio capture and reduce background noise, thereby enhancing the quality of the captured speech signal.
8. The system as claimed in claim 2, wherein the output device includes high-fidelity speakers or a speech synthesizer configured to reproduce the modified speech signal with high clarity.
9. The method as claimed in claim 1, wherein the system is integrated into assistive communication devices, such as hearing aids, mobile devices, or speech-generating devices, to facilitate communication for individuals with speech impairments.
10. The system as claimed in claim 2, wherein the microphone array, signal processing unit, machine learning processor, and output device are interconnected through high-speed communication interfaces to enable low-latency processing and real-time feedback.
Documents
Name | Date |
---|---|
202411090840-COMPLETE SPECIFICATION [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-DECLARATION OF INVENTORSHIP (FORM 5) [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-DRAWINGS [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-EDUCATIONAL INSTITUTION(S) [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-EVIDENCE FOR REGISTRATION UNDER SSI [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-FORM 1 [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-FORM 18 [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-FORM FOR SMALL ENTITY(FORM-28) [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-FORM-9 [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-REQUEST FOR EARLY PUBLICATION(FORM-9) [22-11-2024(online)].pdf | 22/11/2024 |
202411090840-REQUEST FOR EXAMINATION (FORM-18) [22-11-2024(online)].pdf | 22/11/2024 |