A System and Method for Enhancing Spoken English Proficiency through Real-Time Feedback

ORDINARY APPLICATION

Published

Filed on 21 November 2024

Abstract

The present disclosure proposes an advanced system (100) for improving spoken English proficiency. The advanced system (100) comprises a plurality of modules (108), including an input module (109), a feedback module (111), a facial expression analysis module (112), a speech recognition module (114), an adaptive learning module (116), an accent comparison module (118), a mother tongue influence (MTI) reduction module (120), and a performance analytics module (122). The facial expression analysis module (112) is configured to interpret user gestures and facial expressions. The speech recognition module (114) is configured to detect user speech patterns and pauses in the user's speech. The accent comparison module (118) is configured to compare the user's accent with integrated accents. The MTI reduction module (120) identifies and corrects mother tongue influence. The performance analytics module (122) generates detailed reports based on user performance.

Patent Information

Application ID: 202441090655
Invention Field: ELECTRONICS
Date of Application: 21/11/2024
Publication Number: 48/2024

Inventors

Name: Dr. P. Asha
Address: Assistant Professor, English and Other Languages, GITAM School of Humanities and Social Sciences, GITAM Deemed to be University, Gandhi Nagar, Rushikonda, Visakhapatnam-530045, Andhra Pradesh, India
Country: India
Nationality: India

Name: Dr. S. Sushma Raj
Address: In-Charge Head of the Institute, GITAM School of Humanities and Social Sciences, GITAM Deemed to be University, Gandhi Nagar, Rushikonda, Visakhapatnam-530045, Andhra Pradesh, India
Country: India
Nationality: India

Applicants

Name: Gandhi Institute of Technology and Management (GITAM)
Address: GITAM Deemed to be University, Gandhi Nagar, Rushikonda, Visakhapatnam-530045, Andhra Pradesh, India
Country: India
Nationality: India

Specification

Description:
Field of the invention:
[0001] The present disclosure generally relates to the technical field of language learning systems and, in particular, to an advanced system for improving spoken English proficiency through real-time feedback, facial expression analysis, adaptive learning, and accent comparison using dynamic time warping (DTW), thereby enhancing accuracy, engagement, and personalization.
Background of the invention:
[0002] Language learning has long been a fundamental pursuit for individuals seeking to broaden their horizons, advance their careers, or simply engage more deeply with global communities. Across the world, the demand for effective language education continues to grow as globalization accelerates and communication barriers persist. Particularly, the acquisition of spoken language skills, such as mastering pronunciation, intonation, and fluency, remains a significant challenge for learners of all ages and backgrounds.

[0003] Currently, various English-speaking instructional techniques, applications, and gadgets are available. While these tools assist learners with basic English learning practices, many do not incorporate features for word and tense correction. Existing apps and tools that do offer such corrections aim to help users practice English by providing guidance on these aspects. However, learners often face challenges in applying these corrections and effectively communicating with unfamiliar individuals in real-time scenarios. This gap highlights the need for more integrated and supportive tools that not only correct language errors but also facilitate practical communication skill development in real-world settings.

[0004] In recent years, advancements in artificial intelligence (AI) and natural language processing (NLP) have spurred innovation in language learning technologies. AI-driven systems now offer promising solutions to address longstanding challenges in language acquisition, particularly in the realm of real-time speech assistance. These systems leverage sophisticated techniques to analyze speech patterns, detect errors, and provide immediate feedback tailored to individual learners' needs.

[0005] The evolution of AI-powered language learning tools marks a paradigm shift from static, one-size-fits-all approaches to dynamic, adaptive learning environments. Such tools integrate voice recognition, machine learning models, and interactive interfaces to simulate immersive language learning experiences. By harnessing the power of AI, these systems can identify nuances in pronunciation, intonation, and grammar that traditional methods often overlook, thereby enabling learners to refine their speaking skills with unprecedented precision and efficacy.

[0006] Despite these advancements, significant challenges persist in the field of AI-driven language learning. Issues such as the accuracy of speech recognition systems across diverse accents and languages, the integration of cultural context into language instruction, and the ethical implications of AI usage in education continue to be topics of rigorous debate and research. Moreover, the scalability and sustainability of AI-powered solutions in resource-constrained environments pose additional barriers to widespread adoption and effectiveness.

[0007] A prior art document, CN117877333A, discloses an experimental teaching method, server, and system based on virtual simulation for enhancing language learning through voice recognition, gesture recognition, and facial expression analysis. The system provides real-time feedback, grammar corrections, and pronunciation suggestions to students during spoken language practice. Multi-mode interaction technologies enable students to interact with a virtual environment using voice, gestures, and facial expressions, enhancing learning experiences and language skills. However, the system lacks provisions for distinguishing between normal and advanced levels of English, missing an opportunity to customize learning experiences to different proficiency levels. There is no feature to suggest words related to the topic when students pause or forget words while speaking, which could aid in maintaining fluency and continuity in speech. The patent also does not include a system for comparing the user's accent with standard or target accents, which could be beneficial for improving pronunciation and achieving more accurate language skills.

[0008] By addressing all the above-mentioned problems, there is a need for an advanced system for improving spoken English proficiency through real-time feedback, facial expression analysis, adaptive learning, and accent comparison using dynamic time warping (DTW), thereby enhancing accuracy, engagement, and personalization. There is also a need for an advanced system that is capable of providing instant and accurate feedback on language pronunciation, grammar, and vocabulary during speech practice. There is also a need for an advanced system that provides an adaptive learning environment that adjusts to the proficiency level of the user, thereby offering customized feedback and exercises.

[0009] Additionally, there is also a need for an advanced system that integrates artificial intelligence (AI) techniques to analyze and correct pronunciation errors in real-time, thereby utilizing advanced speech recognition and processing techniques. There is also a need for an advanced system that integrates one or more technologies for analyzing facial expressions and gestures to enhance feedback precision and user engagement during language practice. There is also a need for an advanced system that offers a scalable solution that can be used by individuals, educational institutions, and organizations to improve language skills.

[0010] There is also a need for an advanced system that provides an interactive language learning platform that offers personalized pronunciation correction based on the individual learner's speech patterns. There is also a need for an advanced system for suggesting words based on context and user proficiency level, aiding in maintaining fluency and coherence in speech. There is also a need for an advanced system that provides an interactive user interface that supports multiple modes of interaction (voice, gesture, and text) for seamless language learning experiences. Further, there is also a need for an advanced system that generates detailed performance reports and analytics, facilitates personalized learning pathways, and tracks progress.
Objectives of the invention:
[0011] The primary objective of the present invention is to provide an advanced system for improving spoken English proficiency through real-time feedback, facial expression analysis, adaptive learning, and accent comparison using dynamic time warping (DTW), thereby enhancing accuracy, engagement, and personalization.

[0012] Another objective of the present invention is to provide an advanced system that is capable of providing instant and accurate feedback on language pronunciation, grammar, and vocabulary during speech practice.

[0013] Another objective of the present invention is to provide an advanced system that provides an adaptive learning environment that adjusts to the proficiency level of the user, thereby offering customized feedback and exercises.

[0014] Another objective of the present invention is to provide an advanced system that integrates artificial intelligence (AI) techniques to analyze and correct pronunciation errors in real-time, thereby utilizing advanced speech recognition and processing techniques.

[0015] Another objective of the present invention is to provide an advanced system that integrates one or more technologies for analyzing facial expressions and gestures to enhance feedback precision and user engagement during language practice.

[0016] Another objective of the present invention is to provide an advanced system that offers a scalable solution that can be used by individuals, educational institutions, and organizations to improve language skills.

[0017] Another objective of the present invention is to provide an advanced system that provides an interactive language learning platform that offers personalized pronunciation correction based on the individual learner's speech patterns.

[0018] Another objective of the present invention is to provide an advanced system for suggesting words based on context and user proficiency level, aiding in maintaining fluency and coherence in speech.

[0019] Yet another objective of the present invention is to provide an advanced system that provides an interactive user interface that supports multiple modes of interaction (voice, gesture, and text) for seamless language learning experiences.

[0020] Further objective of the present invention is to provide an advanced system that generates detailed performance reports and analytics, facilitates personalized learning pathways, and tracks progress.
Summary of the invention:
[0021] The present disclosure proposes an advanced system and method for enhancing spoken English proficiency through real-time feedback. The following presents a simplified summary in order to provide a basic understanding of some aspects of the claimed subject matter. This summary is not an extensive overview. It is not intended to identify key/critical elements or to delineate the scope of the claimed subject matter. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.

[0022] In order to overcome the above deficiencies of the prior art, the present disclosure is to solve the technical problem to provide an advanced system for improving spoken English proficiency through real-time feedback, facial expression analysis, adaptive learning, and accent comparison using dynamic time warping (DTW), thereby enhancing accuracy, engagement, and personalization.

[0023] According to one aspect, the invention provides the advanced system for improving spoken English proficiency. The system includes a computing device with a processor and a memory for storing one or more instructions executable by the processor. The computing device communicates with an application server via a network. The processor is configured to provide instant and accurate feedback on language pronunciation, grammar, and vocabulary during a user's speech practice via a plurality of modules.

[0024] In one embodiment herein, the computing device is configured to perform operations for enhancing spoken English proficiency through a plurality of modules. The plurality of modules comprises an input module, a feedback module, a facial expression analysis module, a speech recognition module, an adaptive learning module, an accent comparison module, a mother tongue influence (MTI) reduction module, and a performance analytics module.

[0025] In one embodiment herein, the facial expression analysis module is configured to detect user gestures and facial expressions through an artificial intelligence (AI) camera to identify user pauses and emotional data using facial expression analysis.

[0026] In one embodiment herein, the speech recognition module is configured to analyze a user's speech received through an input module to detect user speech patterns and pauses in the user's speech. The speech recognition module is configured to predict the intended topic and suggest relevant words and phrases based on the pauses in the user's speech, context, and the emotional data received from the facial expression analysis module.

[0027] In one embodiment herein, the speech recognition module uses a natural language processing (NLP) model to analyze the context of the user's speech, predict pauses, and generate word suggestions based on the level of English proficiency selected by the user. In one embodiment herein, the speech recognition module uses a Markov decision process (MDP) model, which is configured for predicting potential words and phrases that the user needs during pauses in the speech. The speech recognition module is trained on a large dataset of diverse accents and dialects to enhance transcription accuracy.

[0028] In one embodiment herein, the input module is configured to enable the user to enter the one or more inputs. The one or more inputs include a topic, subject details, level of English proficiency, and type of accent. The one or more inputs are provided through at least one of audio input, visual or gesture input, and text input.

[0029] In one embodiment herein, the feedback module is configured to provide real-time feedback on pronunciation, grammar, and vocabulary based on the analyzed user speech via at least one user interface module. The feedback module is configured to differentiate between normal and advanced English proficiency levels. The feedback module is configured to provide visual and auditory feedback via the at least one user interface module to correct pronunciation errors. The at least one user interface module is configured to support multi-modal inputs such as touch, voice, and gesture for a more engaging user experience.

[0030] In one embodiment herein, the adaptive learning module is configured to adjust feedback and exercises based on the user's proficiency level and learning pace. The adaptive learning module employs one or more machine learning techniques to dynamically adjust the difficulty level of exercises based on the user performance data.

[0031] In one embodiment herein, the accent comparison module is configured to compare the user's accent with target and integrated accent versions using a dynamic time warping (DTW) model, thereby providing detailed visual feedback and auditory feedback in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents. The accent improvement suggestions are provided in the user's voice through at least one of detailed visual feedback and auditory feedback via the at least one user interface module.

[0032] In one embodiment herein, the mother tongue influence (MTI) reduction module is configured to identify and correct mother tongue influence in pronunciation, grammar, and sentence structure of the user's speech and provides visual feedback and auditory feedback via the at least one user interface module.

[0033] In one embodiment herein, the MTI reduction module comprises a database of common MTI errors and corresponding corrective measures. In one embodiment herein, the MTI reduction module incorporates specific exercises designed to reduce the influence of the user's native language on their spoken English. These exercises target common errors related to mother tongue interference.

[0034] In one embodiment herein, the performance analytics module is configured for generating one or more detailed reports and analytics based on the user performance, thereby transferring the one or more detailed reports to the teacher for facilitating personalized learning pathways. The performance analytics module generates weekly and monthly reports to track user progress and highlight areas for improvement. These reports help in assessing the effectiveness of the learning process.

[0035] According to another aspect, the invention provides a method for improving spoken English proficiency using an advanced system. At one step, the input module receives the one or more inputs from a user and a teacher via at least one user interface module through at least one of voice commands, gestures, and text. At another step, the facial expression analysis module detects user gestures and facial expressions during the user's speech with an artificial intelligence (AI) camera to identify user pauses and emotional data using facial expression analysis.

[0036] At another step, the speech recognition module analyzes the user's speech received through the input module to detect user speech patterns and pauses in the user's speech, thereby predicting the intended topic and suggesting relevant words and phrases based on the pauses in the user's speech, context, and the emotional data received from the facial expression analysis module.

[0037] At another step, the feedback module provides real-time feedback on pronunciation, grammar, and vocabulary based on the analyzed user speech via at least one user interface module, thereby differentiating between normal and advanced English proficiency levels.

[0038] At another step, the adaptive learning module adjusts the feedback and exercises based on the user's proficiency level and learning pace. At another step, the accent comparison module compares the user's accent with target and integrated accent versions using a dynamic time warping (DTW) model, thereby providing detailed visual and auditory feedback in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents.

[0039] At another step, the mother tongue influence (MTI) reduction module identifies and corrects the mother tongue influence in pronunciation, grammar, and sentence structure and provides visual and auditory feedback via the at least one user interface module. Further, at another step, the performance analytics module generates the one or more detailed reports and analytics based on the user performance, and the one or more detailed reports are transferred to the teacher for facilitating personalized learning pathways.
[0040] Further, objects and advantages of the present invention will be apparent from a study of the following portion of the specification, the claims, and the attached drawings.
Detailed description of drawings:
[0041] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate an embodiment of the invention, and, together with the description, explain the principles of the invention.

[0042] FIG. 1 illustrates a block diagram representing the advanced system for improving spoken English proficiency, in accordance with an exemplary embodiment of the invention.

[0043] FIG. 2 illustrates a schematic view of the at least one user practicing the speech in a classroom, in accordance with an exemplary embodiment of the invention.

[0044] FIG. 3 illustrates a flowchart of a method for improving spoken English proficiency using the advanced system, in accordance with an exemplary embodiment of the invention.
Detailed invention disclosure:
[0045] Various embodiments of the present invention will be described with reference to the accompanying drawings. Wherever possible, the same or similar reference numerals are used in the drawings and the description to refer to the same or like parts or steps.

[0046] The present disclosure has been made with a view towards solving the problem with the prior art described above, and it is an object of the present invention to provide an advanced system for improving spoken English proficiency through real-time feedback, facial expression analysis, adaptive learning, and accent comparison using dynamic time warping (DTW), thereby enhancing accuracy, engagement, and personalization.

[0047] According to an exemplary embodiment of the invention, FIG. 1 refers to a block diagram representing an advanced system 100 for improving spoken English proficiency. The advanced system 100 is capable of providing instant and accurate feedback on language pronunciation, grammar, and vocabulary during speech practice. The advanced system 100 provides an adaptive learning environment that adjusts to the proficiency level of the user, thereby offering customized feedback and exercises. The advanced system 100 integrates artificial intelligence (AI) techniques to analyze and correct pronunciation errors in real-time, thereby utilizing advanced speech recognition and processing techniques. The advanced system 100 integrates one or more technologies for analyzing facial expressions and gestures to enhance feedback precision and user engagement during language practice.

[0048] The advanced system 100 offers a scalable solution that can be used by individuals, educational institutions, and organizations to improve language skills. The advanced system 100 provides an interactive language learning platform that offers personalized pronunciation correction based on the individual learner's speech patterns. The advanced system 100 is used for suggesting words based on context and user proficiency level, aiding in maintaining fluency and coherence in speech. The advanced system 100 provides an interactive user interface that supports multiple modes of interaction (voice, gesture, and text) for seamless language learning experiences. The advanced system 100 generates detailed performance reports and analytics, facilitates personalized learning pathways, and tracks progress.

[0049] In one embodiment, the system 100 includes a computing device 102 with a processor 104 and a memory 106 for storing one or more instructions executable by the processor 104. The computing device 102 communicates with a server 124 via a network 126. The processor 104 is configured to provide instant and accurate feedback on language pronunciation, grammar, and vocabulary during a user's speech practice via a plurality of modules 108.

[0050] The one or more instructions may be executed to cause the system 100 to perform the various functionalities. The processor 104 acts as the central processing unit (CPU) of the system 100, responsible for coordinating different tasks and carrying out complex operations, data processing, and decision-making by fetching instructions from the memory 106, thereby decoding the instructions and executing the necessary actions.

[0051] In one embodiment herein, the memory 106 serves as the storage component of the system 100, holding the executable instructions as well as any data or information required by the processor 104 to perform its tasks. The data includes user inputs, system configurations, and any other relevant data needed for the system's operations. Through the communication between the processor 104 and the memory 106, the system 100 is able to process the user inputs, access stored information, perform computations, and make decisions accordingly.

[0052] In one embodiment herein, the computing device 102 represents any electronic device that the user can utilize to interact with the system 100. The computing device 102 can be, but is not limited to, a smartphone, a laptop, a tablet, a personal computer, or any other suitable electronic device. The computing device 102 serves as the user's gateway to accessing and interacting with the system 100.

[0053] In one embodiment herein, the computing device 102 is in communication with the server 124 via the network 126. The network 126 acts as a communication medium that allows the computing device 102 to interact with the other components of the system 100, thereby facilitating the exchange of data, commands, and information. In one embodiment herein, the network 126 can be a wireless communication infrastructure, which offers the user flexibility and convenience when interacting with the system 100. This wireless connectivity enables the users to access the system 100 from various locations without being tethered to a fixed physical connection.

[0054] In one embodiment herein, the network 126 can be, but is not limited to, a local area network (LAN), a cellular network, a wide area network (WAN), an intranet, a virtual private network (VPN), and wireless networks that use radio frequency (RF) or infrared (IR) technology to transmit data without the need for physical cables, thereby providing mobility and flexibility. The versatility of the network 126 ensures that the computing device 102 can seamlessly connect to the server 124, thereby enabling the users to access the system's 100 functionalities and resources from a variety of locations and devices. This wireless connectivity enhances the overall accessibility and convenience of the system 100 for users.

[0055] In one embodiment herein, the computing device 102 is configured to perform operations for enhancing spoken English proficiency through a plurality of modules 108. The plurality of modules 108 comprises an input module 109, a feedback module 111, a facial expression analysis module 112, a speech recognition module 114, an adaptive learning module 116, an accent comparison module 118, a mother tongue influence (MTI) reduction module 120, and a performance analytics module 122.

[0056] In one embodiment herein, the facial expression analysis module 112 is configured to detect user gestures and facial expressions through an artificial intelligence (AI) camera to identify user pauses and emotional data using facial expression analysis based on at least one deep learning model. In some embodiments, the pauses may indicate hesitation or uncertainty in the user's speech.

[0057] In one embodiment herein, the speech recognition module 114 is configured to analyze a user's speech received through an input module 109 to detect user speech patterns and pauses in the user's speech. The speech recognition module 114 is configured to predict the intended topic and suggest relevant words and phrases based on the pauses in the user's speech, context, and the emotional data received from the facial expression analysis module 112.
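The pause detection described above can be illustrated with a minimal sketch. The word-timestamp format, the 0.8-second threshold, and the function name below are assumptions made for illustration only and are not taken from the disclosure:

```python
# Hypothetical sketch of pause detection over word-level timestamps,
# as the speech recognition module (114) might receive them from a
# transcription front end. Threshold and data shape are assumptions.

PAUSE_THRESHOLD_S = 0.8  # gaps longer than this are treated as hesitation

def detect_pauses(words):
    """words: list of (word, start_s, end_s) tuples in time order.
    Returns (index, gap_seconds) for each detected pause, where index
    is the position of the word that follows the pause."""
    pauses = []
    for i in range(1, len(words)):
        gap = words[i][1] - words[i - 1][2]  # silence between two words
        if gap > PAUSE_THRESHOLD_S:
            pauses.append((i, round(gap, 2)))
    return pauses

transcript = [
    ("climate", 0.0, 0.5),
    ("change", 0.6, 1.0),
    ("is", 1.1, 1.2),
    # long hesitation before the next word
    ("a", 2.5, 2.6),
    ("problem", 2.7, 3.2),
]
print(detect_pauses(transcript))  # → [(3, 1.3)]
```

In a deployed system, each detected pause would trigger the word-suggestion step described in the following paragraph.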

[0058] In one embodiment herein, the speech recognition module 114 uses a natural language processing (NLP) model to analyze the context of the user's speech, predict pauses, and generate word suggestions based on the level of English proficiency selected by the user. In one embodiment herein, the speech recognition module 114 uses a Markov decision process (MDP) model, which is configured for predicting potential words and phrases that the user needs during pauses in the speech. The speech recognition module 114 is trained on a large dataset of diverse accents and dialects to enhance transcription accuracy.
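As an illustration of the word-prediction idea, the sketch below substitutes a simple bigram (first-order Markov) table for the full MDP model named in the disclosure; the training corpus, class name, and interface are invented for this example:

```python
from collections import defaultdict, Counter

# Simplified stand-in for the MDP-based word prediction of the speech
# recognition module (114): a bigram table counts which word tends to
# follow which, and the most frequent continuations are offered when
# the user pauses. The tiny corpus below is invented for illustration.

class NextWordSuggester:
    def __init__(self):
        self.bigrams = defaultdict(Counter)

    def train(self, sentences):
        for sentence in sentences:
            tokens = sentence.lower().split()
            for a, b in zip(tokens, tokens[1:]):
                self.bigrams[a][b] += 1

    def suggest(self, last_word, k=3):
        """Most likely continuations after `last_word`, to fill a pause."""
        return [w for w, _ in self.bigrams[last_word.lower()].most_common(k)]

model = NextWordSuggester()
model.train([
    "climate change is a big problem",
    "climate change is a pressing issue",
    "pollution is a big concern",
])
print(model.suggest("a"))  # → ['big', 'pressing']
```

A true MDP formulation would additionally weigh suggestions by proficiency level and context state, as the disclosure describes.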

[0059] In one embodiment herein, the input module 109 is configured to enable the user to enter the one or more inputs. The one or more inputs include a topic, subject details, level of English proficiency, and type of accent. The one or more inputs are provided through at least one of audio input, visual or gesture input, and text input.

[0060] In one embodiment herein, the feedback module 111 is configured to provide real-time feedback on pronunciation, grammar, and vocabulary based on the analyzed user speech via at least one user interface module 110. The feedback module 111 is configured to differentiate between normal and advanced English proficiency levels. The feedback module 111 is configured to provide visual and auditory feedback via the at least one user interface module 110 to correct pronunciation errors. The at least one user interface module 110 is configured to support multi-modal inputs such as touch, voice, and gesture for a more engaging user experience.

[0061] In one embodiment herein, the adaptive learning module 116 is configured to adjust feedback and exercises based on the user's proficiency level and learning pace. The adaptive learning module 116 employs one or more machine learning techniques to dynamically adjust the difficulty level of exercises based on the user performance data.
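The dynamic difficulty adjustment performed by the adaptive learning module 116 can be sketched with a simple rule over recent scores; the thresholds, five-level scale, and function signature below are assumptions for illustration, not details from the disclosure:

```python
# Minimal sketch of adaptive difficulty adjustment: raise the exercise
# level when recent scores are high, lower it when they are low.
# Thresholds and the 1-5 level scale are illustrative assumptions.

def adjust_level(current_level, recent_scores,
                 up=0.85, down=0.5, min_level=1, max_level=5):
    """recent_scores: fractions in [0, 1] from the latest exercises."""
    if not recent_scores:
        return current_level
    avg = sum(recent_scores) / len(recent_scores)
    if avg >= up:
        return min(current_level + 1, max_level)   # user is ready for more
    if avg < down:
        return max(current_level - 1, min_level)   # ease off
    return current_level

print(adjust_level(2, [0.9, 0.95, 0.88]))  # → 3 (promoted)
print(adjust_level(2, [0.3, 0.4]))         # → 1 (demoted)
```

A machine-learning variant, as the disclosure suggests, would learn these thresholds from user performance data instead of fixing them.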

[0062] In one embodiment herein, the accent comparison module 118 is configured to compare the user's accent with target and integrated accent versions using a dynamic time warping (DTW) model, thereby providing detailed visual feedback and auditory feedback in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents. The accent improvement suggestions are provided in the user's voice through at least one of detailed visual feedback and auditory feedback via the at least one user interface module 110.
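The dynamic time warping (DTW) comparison named above can be illustrated with the classic DTW recurrence. Real inputs would be acoustic feature frames (e.g. MFCCs); scalar intonation-like sequences are used here only to keep the sketch self-contained:

```python
# Classic O(n*m) dynamic time warping distance with absolute-difference
# cost, the alignment technique the accent comparison module (118) is
# said to employ. A low distance means the user's contour closely tracks
# the target accent despite differences in speaking rate.

def dtw_distance(a, b):
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # user frame repeated
                                 d[i][j - 1],      # target frame repeated
                                 d[i - 1][j - 1])  # frames aligned
    return d[n][m]

user_contour   = [1.0, 1.2, 1.4, 2.0, 2.0]   # learner's intonation (invented)
target_contour = [1.0, 1.4, 2.0]             # reference accent (invented)
print(dtw_distance(user_contour, target_contour))
```

Because DTW warps the time axis, the slower five-frame user contour still aligns well with the three-frame target, which is exactly why DTW suits accent comparison across speaking speeds.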

[0063] In one embodiment herein, the mother tongue influence (MTI) reduction module 120 is configured to identify and correct mother tongue influence in pronunciation, grammar, and sentence structure of the user's speech and provides visual feedback and auditory feedback via the at least one user interface module 110.

[0064] In one embodiment herein, the MTI reduction module 120 comprises a database of common MTI errors and corresponding corrective measures. In one embodiment herein, the MTI reduction module 120 incorporates specific exercises designed to reduce the influence of the user's native language on their spoken English. These exercises target common errors related to mother tongue interference.
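The database of common MTI errors and corresponding corrective measures can be sketched as a lookup of patterns and replacements; the example entries below are invented for illustration and do not come from the disclosure:

```python
import re

# Hypothetical sketch of the MTI error database of the MTI reduction
# module (120): each rule pairs a mother-tongue-influenced pattern with
# a correction and an explanation shown as feedback. Entries are
# illustrative examples only.

MTI_RULES = [
    (r"\bdiscuss about\b", "discuss",
     "'discuss' does not take 'about' in standard English."),
    (r"\brevert back\b", "reply",
     "'revert back' is a redundant MTI usage; 'reply' is preferred."),
    (r"\bprepone\b", "advance",
     "'prepone' is regional; 'advance' or 'bring forward' is standard."),
]

def correct_mti(sentence):
    """Apply each rule in turn; return the corrected sentence plus
    the explanations for every rule that fired."""
    notes = []
    for pattern, repl, why in MTI_RULES:
        fixed = re.sub(pattern, repl, sentence, flags=re.IGNORECASE)
        if fixed != sentence:
            notes.append(why)
            sentence = fixed
    return sentence, notes

fixed, notes = correct_mti("We will discuss about the plan")
print(fixed)  # → "We will discuss the plan"
```

The returned explanations correspond to the visual and auditory feedback the module delivers through the user interface module 110.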

[0065] In one embodiment herein, the performance analytics module 122 is configured for generating one or more detailed reports and analytics based on the user performance, thereby transferring the one or more detailed reports to the teacher for facilitating personalized learning pathways. The performance analytics module 122 generates weekly and monthly reports to track user progress and highlight areas for improvement. These reports help in assessing the effectiveness of the learning process.
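The weekly aggregation performed by the performance analytics module 122 can be sketched as follows; the session record format and 0-100 scoring scale are assumptions made for illustration:

```python
from collections import defaultdict
from datetime import date

# Illustrative sketch of weekly report generation for the performance
# analytics module (122): sessions are grouped by ISO calendar week and
# averaged, giving the teacher a per-week progress summary.

def weekly_report(sessions):
    """sessions: list of dicts with 'day' (datetime.date) and 'score'
    (0-100). Returns {(year, week): average_score}."""
    buckets = defaultdict(list)
    for s in sessions:
        year, week, _ = s["day"].isocalendar()
        buckets[(year, week)].append(s["score"])
    return {wk: round(sum(v) / len(v), 1) for wk, v in sorted(buckets.items())}

sessions = [
    {"day": date(2024, 11, 18), "score": 70},
    {"day": date(2024, 11, 20), "score": 80},
    {"day": date(2024, 11, 25), "score": 90},  # falls in the next ISO week
]
print(weekly_report(sessions))  # → {(2024, 47): 75.0, (2024, 48): 90.0}
```

Monthly reports would follow the same pattern, keyed on (year, month) instead of the ISO week.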

[0066] According to an exemplary embodiment of the invention, FIG. 2 refers to a schematic view of the at least one user practicing the speech in a classroom. In one embodiment herein, during the initial setup process facilitated by the system 100, the user (student) wears an audio device (microphone) and inputs topic or subject details through either voice or text input. The selection of English proficiency level is configured to be adjusted by the teacher on the display or the at least one user interface module 110. In one embodiment herein, the English proficiency levels are classified as normal and advanced. Normal English assists in forming sentences with simpler vocabulary, while advanced English aids in constructing sentences with more complex and sophisticated vocabulary.

[0067] In one embodiment herein, during sentence formation, the system 100 utilizes real-time speech processing techniques involving a plurality of deep learning models such as convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and Markov decision processes (MDPs). These models transcribe spoken words into text in real time, thereby facilitating subsequent processing steps such as word suggestions and accent comparison. The system 100 assists in sentence formation with appropriate vocabulary and syntax based on the selected proficiency level.

[0068] In one embodiment herein, the speech recognition module 114 converts the student's speech into text, thereby allowing for the application of subsequent processing steps. Depending on the user's selected level, the system 100 assists in forming sentences with the appropriate vocabulary and syntax. In one embodiment herein, for normal English, if the user says, "Climate change is a big problem," the system 100 might suggest simpler words like "issue" or "concern" as alternatives to "problem". This helps in constructing sentences with basic grammar and vocabulary suitable for everyday communication.

[0069] In one embodiment herein, for advanced English, if the user says, "Climate change is a significant problem," the system 100 might suggest more sophisticated words such as "substantial," "consequential," or "pressing" as alternatives to "problem". This provides assistance with more complex sentence structures, advanced vocabulary, and nuanced expressions. In one embodiment herein, the facial expression analysis module 112 tracks the user's gestures and facial expressions using an AI camera as they speak. Simultaneously, the audio system monitors the clarity of their speech to ensure accurate input. In one embodiment herein, the system 100 provides real-time word suggestions on the at least one user interface module 110 based on the user's selected level. This feature assists the user in forming grammatically correct and contextually appropriate sentences, enhancing overall communication effectiveness.
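The level-dependent word-suggestion behavior in the two paragraphs above can be sketched, purely as an illustration, with a small alternatives table keyed by word and proficiency level. The table contents mirror the examples given in the text; the function name is a hypothetical convenience:

```python
# Illustrative alternatives table: word -> proficiency level -> suggestions.
# Entries reproduce the "problem" examples from the specification.
ALTERNATIVES = {
    "problem": {
        "normal": ["issue", "concern"],
        "advanced": ["substantial", "consequential", "pressing"],
    },
}

def suggest_alternatives(word: str, level: str) -> list[str]:
    """Return level-appropriate alternatives for a word, or [] if none."""
    return ALTERNATIVES.get(word, {}).get(level, [])
```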

[0070] In one embodiment herein, the speech recognition module 114 addresses scenarios where the user pauses during speech. For example, if a student begins speaking about "climate change" and says, "To mitigate the effects of climate change, we need to..." and then pauses, the system 100 detects this hesitation. In one embodiment herein, the facial expression analysis module 112 utilizes an AI camera to identify the user's hesitation and recognize it as a pause. The system 100 analyzes the user's facial expressions to determine if they are confused or in need of assistance. In one embodiment herein, the speech recognition module 114 processes the user's spoken sentence. The speech recognition system captures the spoken segment, and the NLP model identifies "climate change" as the topic and "mitigate the effects" as the context.
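The pause-triggered suggestion behavior described above might be sketched as follows. The suggestion table mirrors the examples in the next paragraph; the pause threshold of 1.5 seconds and the function name are assumptions for illustration only:

```python
# Hypothetical suggestion table keyed by (topic, proficiency level);
# entries mirror the specification's climate-change examples.
SUGGESTIONS = {
    ("climate change", "normal"): ["recycle", "reduce waste", "plant trees"],
    ("climate change", "advanced"): ["implement sustainable practices",
                                     "reduce carbon emissions",
                                     "promote renewable energy"],
}

def suggest_on_pause(topic: str, level: str, pause_seconds: float,
                     threshold: float = 1.5) -> list[str]:
    """Offer word suggestions only once a detected pause exceeds a threshold."""
    if pause_seconds < threshold:
        return []  # user is still speaking fluently; do not interrupt
    return SUGGESTIONS.get((topic, level), [])
```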

[0071] In one embodiment herein, the user interface module 110 displays suggested words on the screen as the user speaks. For normal English, suggestions might include "recycle," "reduce waste," or "plant trees". For advanced English, suggestions could be "implement sustainable practices," "reduce carbon emissions," or "promote renewable energy". These suggestions are based on the identified topic and context. In one embodiment herein, the system 100 displays these suggestions to the user via the user interface module 110. The user selects "recycle" in normal English or "reduce carbon emissions" in advanced English, and then resumes their speech incorporating the selected word "to mitigate the effects of climate change, we need to recycle" or "to mitigate the effects of climate change, we need to reduce carbon emissions".

[0072] In one embodiment herein, the system 100 continues to monitor the user's speech, remaining prepared to assist again if another pause occurs. This ensures continuous support throughout the user's speaking process. In one embodiment herein, the continuous improvement and feedback enhance the system's capabilities through machine learning. The system 100 learns from each interaction, thereby improving its prediction and suggestion accuracy over time. Users can provide feedback on the usefulness of the suggestions, allowing the system 100 to refine the machine-learning models.

[0073] In one embodiment herein, the benefits of the system 100 include providing a seamless speaking experience by helping users maintain their speech flow, reducing anxiety, and enhancing confidence. It offers contextually relevant suggestions that are appropriate for the ongoing conversation and supports enhanced learning by providing real-time assistance, helping users learn new vocabulary and sentence structures to improve their language skills. In one embodiment herein, the adaptive learning module 116 adjusts feedback and exercises based on the user's proficiency level and learning pace. This ensures that the learning content is appropriately challenging and tailored to individual needs. In one embodiment herein, the adaptive learning module 116 uses machine learning techniques to dynamically adjust the difficulty level of exercises based on user performance data. This approach ensures that the exercises remain suitably challenging.
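The dynamic difficulty adjustment described for the adaptive learning module 116 could be sketched, under stated assumptions, as a simple score-driven controller. The score scale (0.0-1.0), the thresholds, and the 1-10 level range are illustrative choices, not values given in the disclosure:

```python
def adjust_difficulty(current_level: int, recent_scores: list[float],
                      raise_at: float = 0.85, lower_at: float = 0.55) -> int:
    """Move the exercise difficulty up or down one step based on the mean
    of recent accuracy scores; hold steady in the middle band."""
    if not recent_scores:
        return current_level
    avg = sum(recent_scores) / len(recent_scores)
    if avg >= raise_at:
        return min(current_level + 1, 10)  # cap at an assumed max level
    if avg < lower_at:
        return max(current_level - 1, 1)   # floor at an assumed min level
    return current_level
```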

[0074] In one embodiment herein, the accent comparison module 118 involves analyzing the user's pronunciation and comparing it to a target accent, such as General American English, Received Pronunciation (British English), or another native-speaker model, using the dynamic time warping (DTW) model. This feature helps users identify and correct pronunciation differences. For example, when the user says, "Climate change is a significant problem," the system 100 records the user's speech and analyzes the pronunciation of each word. The system 100 breaks down the user's pronunciation into phonetic components and compares these components against a database of the target accent's phonetic characteristics. It highlights specific phonemes where the user's pronunciation deviates from the target and provides suggestions on how to adjust tongue position, lip shape, and vocal cord usage to match the target accent. For instance, if the user pronounces "significant" as "sɪɡnɪfɪkant" instead of "sɪɡnɪfɪkənt," the feedback would be, "Your pronunciation of 'significant' is close, but try to open your mouth slightly more on the second syllable to match the target sound".
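The DTW comparison referenced above follows the standard dynamic-programming formulation: two feature sequences (e.g., per-frame pitch or MFCC-derived scalars) are aligned so that stretches and compressions in timing do not inflate the distance. The following is a minimal textbook sketch over one-dimensional features, not the claimed implementation:

```python
def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic dynamic time warping distance between two 1-D feature
    sequences, using absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]
```

A distance near zero indicates the user's feature trajectory closely tracks the target accent's trajectory even if the two utterances differ in tempo.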

[0075] In one embodiment herein, intonation refers to the rise and fall in pitch during speech, which affects the meaning and emotional tone of the spoken language. Intonation feedback helps users adjust their pitch to sound more natural and convey the intended meaning effectively. For example, when the user says, "Climate change is a significant problem," the system 100 records the user's speech and analyzes the pitch contour throughout the sentence. It maps the pitch changes in the user's speech and compares them with a target intonation pattern for the sentence. The system 100 identifies where the user's intonation differs from the target and provides guidance on where to raise or lower the pitch to match natural intonation patterns. For instance, if the user's intonation is flat on "significant," the feedback would be, "Try raising your pitch on 'significant' to emphasize the word and convey the seriousness of the issue".
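The flat-intonation check described above might be sketched as follows, assuming per-word pitch contours in hertz are already available from a pitch tracker. The 10 Hz flatness threshold and the function name are illustrative assumptions:

```python
def intonation_feedback(word_pitches: dict, emphasis_words: list) -> list:
    """Flag emphasis words whose pitch contour is flat.

    word_pitches maps each word to its per-frame pitch values (Hz);
    a contour varying by less than an assumed 10 Hz is treated as flat.
    """
    notes = []
    for word in emphasis_words:
        contour = word_pitches.get(word, [])
        if contour and max(contour) - min(contour) < 10.0:
            notes.append(f"Try raising your pitch on '{word}'.")
    return notes
```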

[0076] In one embodiment herein, the combined accent and intonation feedback provides integrated feedback on both accent and intonation. For example, if the user says, "To mitigate the effects of climate change, we need to recycle," the system detects that the user pronounced "mitigate" as "ˈmɪtɪɡeɪt" but with a slightly different vowel sound in the first syllable. The feedback would be, "For 'mitigate,' make sure the vowel sound in the first syllable is a short 'i' as in 'sit'. Try saying 'ˈmɪtɪɡeɪt' with a sharper 'i' sound". Additionally, if the system 100 notes that the user spoke with a flat pitch, especially on the key phrase "mitigate the effects," the feedback would be, "Raise your pitch slightly on 'mitigate' and 'effects' to highlight the importance of the action: 'To mitigate the effects of climate change, we need to recycle'". In one embodiment herein, the intonation feedback also provides emotional tone feedback. For instance, if the user says, "To mitigate the effects of climate change, we need to recycle," the feedback might be, "To convey urgency, try a rising intonation on 'climate change' and a falling intonation on 'recycle': 'To mitigate the effects of climate change, we need to recycle'".

[0077] In one embodiment herein, the system comprises a voice integration and playback module, which is configured to adjust the pronunciation and intonation in the student's speech to match the standard accent. The student listens to both their original output and the accent-adjusted version rendered in their own voice. The system 100 plays both the original recorded speech and the adjusted version sequentially. The student first listens to their original speech, noting any areas where they might have pronounced words incorrectly or used inappropriate intonation. Next, the student listens to the adjusted version, which provides a clear example of how the speech should sound with the correct accent in their own voice. By comparing the two versions, the student can easily identify the differences in pronunciation, stress, and intonation. The student can replay both versions multiple times to better understand the correct accent. This comparison helps reinforce learning by providing a tangible example of the improvements needed. The auditory feedback is complemented by visual cues on the display, such as highlighted words or phonetic transcriptions, to further aid the student's understanding and retention.

[0078] In one embodiment herein, the MTI reduction module 120 addresses challenges posed by mother tongue influence (MTI) in pronunciation, grammar, and sentence structure. For example, when a user says, "To reduce pollution, people should recycle more," the system identifies MTI challenges. Pronunciation issues are detected, such as pronouncing "reduce" as "ri-duus" instead of the target "ri-dyoos," and "recycle" as "ree-sai-kul" instead of the target "ri-saikl". The system provides feedback like "For 'reduce,' ensure the 'e' sound is short and stressed correctly. Try pronouncing it as 'ri-dyoos'," and "For 'recycle,' stress the second syllable correctly: 'ri-sai-kl'".

[0079] In one embodiment herein, the MTI reduction module 120 also addresses grammar and syntax errors caused by direct translations from the user's native language, such as "People must recycle more" instead of the correct structure, "People should recycle more". The system provides guidance on the correct word order to improve sentence construction. In one embodiment herein, the intonation feedback helps users adjust their pitch to sound more natural and convey the intended meaning effectively. For instance, when the user says, "To reduce pollution, people should recycle more," with a flat intonation, the system 100 suggests raising the pitch slightly on "reduce" and "recycle" to emphasize key actions: "To reduce pollution, people should recycle more". In one embodiment herein, the MTI reduction module 120 provides customized feedback and practical suggestions to help users overcome the challenges posed by MTI. This enables more accurate and fluent communication in the second language. The user corrects their pronunciation and intonation, producing a sentence closer to native speaker norms, thereby reducing the influence of their mother tongue.

[0080] In one embodiment herein, the performance analytics module 122 generates detailed reports and analytics based on user performance. The system sends a detailed report to the teacher's mobile device (teacher terminal system), highlighting specific areas for improvement. The teacher reviews the report and plans targeted exercises to help the student improve their pronunciation and intonation. In one embodiment herein, the teacher plays a crucial role in helping students overcome MTI and improve their accents in a second language. For instance, during an observation session, the teacher notes that the student, John, pronounces "reduce" with a longer vowel sound and "recycle" with incorrect stress. The teacher provides correct pronunciations: "For 'reduce,' it should be 'ri-dyoos'. And for 'recycle,' stress the second syllable: 'ri-sai-kl'".

[0081] In one embodiment herein, the teacher uses aids such as visual guides and audio examples, encouraging the student to repeat after native speakers. For intonation guidance, the teacher advises raising the pitch on "reduce" and "recycle" to emphasize these words: "To reduce pollution, people should recycle more". The teacher encourages daily practice and self-recording, comparing recordings with examples, and monitors and adjusts the learning plan based on the student's progress, providing additional support for challenging sounds as needed. In one embodiment herein, the system 100 provides benefits such as guided learning, reduced fear of speaking, enhanced speaking abilities, and teacher involvement. The system's integration of AI, audio, and visual technologies delivers a comprehensive solution for improving English speaking skills, offering real-time assistance and continuous guidance to enhance user confidence and fluency.

[0082] According to an exemplary embodiment of the invention, FIG. 3 refers to a flowchart 300 of a method for improving spoken English proficiency using an advanced system 100. At step 302, the input module 109 receives the one or more inputs from a user and a teacher via at least one user interface module 110 through at least one of voice commands, gestures, and text. At step 304, the facial expression analysis module 112 detects user gestures and facial expressions during the user's speech with an artificial intelligence (AI) camera to identify user pauses and emotional data using facial expression analysis.

[0083] At step 306, the speech recognition module 114 analyzes the user's speech received through the input module 109 to detect user speech patterns, and pauses in the user's speech, thereby predicting the intended topic and suggesting relevant words and phrases based on the pauses in the user speech, context, and the emotion data received from the facial expression analysis module 112. At step 308, the feedback module 111 provides real-time feedback on pronunciation, grammar, and vocabulary based on the analyzed user speech via at least one user interface module 110, thereby differentiating between normal and advanced English proficiency levels.

[0084] At step 310, the adaptive learning module 116 adjusts the feedback and exercises based on the user's proficiency level and learning pace. At step 312, the accent comparison module 118 compares the user's accent with target and integrated accent versions using a dynamic time warping (DTW) model, thereby providing detailed visual and auditory feedback in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents. At step 314, the mother tongue influence (MTI) reduction module 120 identifies and corrects the mother tongue influence in pronunciation, grammar, and sentence structure and provides visual and auditory feedback via the at least one user interface module 110. Further at step 316, the performance analytics module 122 generates the one or more detailed reports and analytics based on the user performance, and the one or more detailed reports are transferred to the teacher for facilitating personalized learning pathways.

[0085] Numerous advantages of the present disclosure may be apparent from the discussion above. In accordance with the present disclosure, an advanced system for improving spoken English proficiency is disclosed. The proposed invention provides an advanced system 100 for improving spoken English proficiency through real-time feedback, facial expression analysis, adaptive learning, and accent comparison using dynamic time warping (DTW), thereby enhancing accuracy, engagement, and personalization. The advanced system 100 is capable of providing instant and accurate feedback on language pronunciation, grammar, and vocabulary during speech practice. The advanced system 100 provides an adaptive learning environment that adjusts to the proficiency level of the user, thereby offering customized feedback and exercises. The advanced system 100 integrates artificial intelligence (AI) techniques to analyze and correct pronunciation errors in real-time, thereby utilizing advanced speech recognition and processing techniques. The advanced system 100 integrates one or more technologies for analyzing facial expressions and gestures to enhance feedback precision and user engagement during language practice.

[0086] The advanced system 100 offers a scalable solution that can be used by individuals, educational institutions, and organizations to improve language skills. The advanced system 100 provides an interactive language learning platform that offers personalized pronunciation correction based on the individual learner's speech patterns. The advanced system 100 is used for suggesting words based on context and user proficiency level, aiding in maintaining fluency and coherence in speech. The advanced system 100 provides an interactive user interface that supports multiple modes of interaction (voice, gesture, and text) for seamless language learning experiences. The advanced system 100 generates detailed performance reports and analytics, facilitates personalized learning pathways, and tracks progress.

[0087] It will readily be apparent that numerous modifications and alterations can be made to the processes described in the foregoing examples without departing from the principles underlying the invention, and all such modifications and alterations are intended to be embraced by this application.
CLAIMS:
I/We Claim:
1. An advanced system (100) for improving spoken English proficiency, comprising:
a computing device (102) having a processor (104) and a memory (106) for storing one or more instructions executable by the processor (104), wherein the computing device (102) is in communication with a server (124) via a network (126),
wherein the processor (104) is configured to perform operations using a plurality of modules (108), wherein the plurality of modules (108) comprises:
a facial expression analysis module (112) configured to detect user gestures and facial expressions through an artificial intelligence (AI) camera to identify user pauses and emotional data using facial expression analysis;
a speech recognition module (114) configured to analyze a user's speech received through an input module (109) to detect user speech patterns, and pauses in the user's speech, wherein the speech recognition module (114) is configured to predict the intended topic and suggest relevant words and phrases based on the pauses in the user speech, context, and the emotion data received from the facial expression analysis module (112);
a feedback module (111) configured to provide real-time feedback on pronunciation, grammar, and vocabulary based on the analyzed user speech via at least one user interface module (110), wherein the feedback module (111) is configured to differentiate between normal and advanced English proficiency levels;
an adaptive learning module (116) configured to adjust feedback and exercises based on the user's proficiency level and learning pace;
an accent comparison module (118) configured to compare the user accents with target and integrated accent versions using a dynamic time warping (DTW) model, thereby providing detailed visual feedback and auditory feedback in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents;
a mother tongue influence (MTI) reduction module (120) configured to identify and correct mother tongue influence in pronunciation, grammar, and sentence structure of the user's speech and provides visual feedback and auditory feedback via the at least one user interface module (110); and
a performance analytics module (122) configured for generating one or more detailed reports and analytics based on the user performance, thereby transferring the one or more detailed reports to the teacher for facilitating personalized learning pathways.
2. The advanced system (100) for language learning as claimed in claim 1,
wherein
the input module (109) is configured to enable the user to enter the one or more inputs,
the one or more inputs include a topic, subject details, a level of English proficiency, and a type of accent, and
the one or more inputs are provided through at least one of audio input, visual or gesture input, and text input.
3. The advanced system (100) for language learning as claimed in claim 1, wherein the accent improvement suggestions are provided through at least one of detailed visual feedback and auditory feedback through the at least one user interface module (110) in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents.
4. The advanced system (100) for language learning as claimed in claim 1, wherein the feedback module (111) is configured to provide visual and auditory feedback via the at least one user interface module (110) to correct pronunciation errors.
5. The advanced system (100) for language learning as claimed in claim 1,
wherein
the speech recognition module (114) uses a natural language processing (NLP) model to analyze the context of the user's speech, predict pauses, and generate word suggestions based on the level of English proficiency selected by the user,
the speech recognition module (114) uses a Markov decision processes (MDP) model, which is configured for predicting potential words and phrases that the user needs during pauses in the speech, and
the speech recognition module (114) is trained on a large dataset of diverse accents and dialects to enhance transcription accuracy.
6. The advanced system (100) for language learning as claimed in claim 1, wherein the adaptive learning module (116) employs one or more machine learning techniques to dynamically adjust the difficulty level of exercises based on the user performance data.
7. The advanced system (100) for language learning as claimed in claim 1, wherein
the MTI reduction module (120) comprises a database of common MTI errors and corresponding corrective measures, and
the MTI reduction module (120) includes specific exercises to target and reduce the influence of the user's native language on their spoken English.
8. The advanced system (100) for language learning as claimed in claim 1, wherein the performance analytics module (122) generates weekly and monthly reports to track user progress and highlight areas for improvement.
9. The advanced system (100) for language learning as claimed in claim 1, wherein the at least one user interface module (110) is configured to support multi-modal inputs such as touch, voice, and gesture for a more engaging user experience.
10. A method for improving spoken English proficiency using an advanced system (100), comprising:
receiving, via an input module (109), one or more inputs from a user and a teacher via at least one user interface module (110) through at least one of voice commands, gestures, and text;
detecting, by a facial expression analysis module (112), user gestures and facial expressions during the user's speech with an artificial intelligence (AI) camera to identify user pauses and emotional data using facial expression analysis;
analyzing, by a speech recognition module (114), the user's speech received through an input module (109) to detect user speech patterns, and pauses in the user's speech, thereby predicting the intended topic and suggesting relevant words and phrases based on the pauses in the user speech, context, and the emotion data received from the facial expression analysis module (112);
providing, by a feedback module (111), real-time feedback on pronunciation, grammar, and vocabulary based on the analyzed user speech via at least one user interface module (110), thereby differentiating between normal and advanced English proficiency levels;
adjusting feedback and exercises based on the user's proficiency level and learning pace using an adaptive learning module (116);
comparing, by an accent comparison module (118), the user's accent with target and integrated accent versions using a dynamic time warping (DTW) model, thereby providing detailed visual and auditory feedback in the user's voice for pronunciation adjustments and intonation patterns to achieve more natural and expressive accents;
identifying and correcting mother tongue influence in pronunciation, grammar, and sentence structure using a mother tongue influence (MTI) reduction module (120) and providing visual and auditory feedback via the at least one user interface module (110); and
generating one or more detailed reports and analytics based on the user performance using a performance analytics module (122), and transferring the one or more detailed reports to the teacher for facilitating personalized learning pathways.

