HAND GESTURE RECOGNITION AND SPEECH SYNTHESIS FOR THE IMPAIRED LEVERAGING DEEP NEURAL NETWORKS AND MOTION TRIGGER FILTERING
ORDINARY APPLICATION
Published
Filed on 20 November 2024
Abstract
This invention focuses on facilitating phone calls between a person with sensory disabilities (deaf-mutes) and a person without such disabilities. Speech-to-text technology converts the speech of the hearing person into text for the special person, while the text input from the special person is transformed into speech for the hearing individual (106). The other mode enables video functionality exclusively, where the camera (101) captures the sign language of the individual with sensory disabilities; the gestures are recognized using Google's MediaPipe (104) and converted into understandable text for the hearing person, who therefore does not need to view the video. The model is trained on different types of sign languages (102) to improve its efficiency. Thus, hand gesture recognition (105) and voice conversion are used to bridge the communication gap between deaf-mutes and others.
Patent Information
Field | Value |
---|---|
Application ID | 202441090030 |
Invention Field | ELECTRONICS |
Date of Application | 20/11/2024 |
Publication Number | 48/2024 |
Inventors
Name | Address | Country | Nationality |
---|---|---|---|
Dr.V.VIDHYA | Department of Artificial Intelligence and Data science, Easwari Engineering College, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
G.KRISHNA PRIYA | Department of Artificial Intelligence and Data science, Easwari Engineering College, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
D.KARTHIKA PRIYA | Department of Artificial Intelligence and Data science, Easwari Engineering College, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
HARINI. J | Department of Artificial Intelligence and Data science, Easwari Engineering College, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
KAVIN A S | Department of Artificial Intelligence and Data science, Easwari Engineering College, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
LAKSHETHA S | Department of Artificial Intelligence and Data science, Easwari Engineering College, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
Applicants
Name | Address | Country | Nationality |
---|---|---|---|
EASWARI ENGINEERING COLLEGE | 162, Bharathi Salai, Ramapuram, Chennai-600089. | India | India |
Specification
DESCRIPTION:
[0001] The title of the invention is Hand Gesture Recognition and Speech Synthesis for the Impaired: Leveraging Deep Neural Networks and Motion Trigger Filtering
PRIOR ART AND BACKGROUND:
[0002] CN113723327A: This patent describes a framework that uses hardware equipment such as hand gloves to coordinate the face and hand key points. Our invention instead extracts the features via a camera.
[0003] CN105976675A: This patent describes a framework with voice and image acquisition terminals that process and recognize gestures, lip movements, and facial expressions, converting them into text or speech to facilitate communication; it also incorporates glove-based gesture recognition and algorithms such as wavelet transforms and neural networks to enhance accuracy. Our invention instead performs hand gesture recognition by enabling video, with speech synthesis using deep neural networks and Google's MediaPipe integrated into smartphones.
[0004] CN111208907A: This patent discloses a sign language recognition system and method that combines electromyographic (EMG) signals and finger-joint deformation signals. Our invention instead combines a deep neural network and Google's MediaPipe to recognize sign language, with text-to-speech conversion and vice versa.
[0005] CN111768786A: This patent describes a framework in which the whole model is integrated on an intelligent terminal. In our invention, the whole application is integrated on the smartphone.
OBJECTIVE:
[0006] The primary objective is to develop a machine learning model, in particular a hand gesture recognition and voice conversion method for the deaf and mute, integrated into a smartphone.
SUMMARY:
[0007] The invention contains a dataset of 200x200-pixel images with 29 classes, including letters A-Z and special classes. For voice-to-text, we utilized a Kaggle dataset of 3,168 speech samples from various speakers. The data was preprocessed using acoustic analysis, examining frequencies from 0 to 280 Hz. These datasets support our gesture recognition and voice conversion models.
[0008] Data preprocessing is crucial for the gathered datasets. One key technique is RGB-to-grayscale conversion, using either the average method or the weighted method. Other important preprocessing steps are image resizing, noise reduction, and data normalization (min-max normalization, z-score normalization). The important hardware components are the CPU processing unit and the sensor unit.
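A minimal sketch of these preprocessing steps is given below, assuming OpenCV and NumPy; the function names, the Gaussian kernel size, and the 200x200 target size (taken from the dataset in [0007]) are illustrative choices, not values published in this specification.

```python
import cv2
import numpy as np

def to_grayscale(rgb: np.ndarray, weighted: bool = True) -> np.ndarray:
    """RGB to grayscale using the weighted (luminosity) or average method."""
    rgb = rgb.astype(np.float32)
    if weighted:
        # Weighted method: 0.299 R + 0.587 G + 0.114 B
        return rgb @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Average method: mean of the three channels
    return rgb.mean(axis=-1)

def preprocess(image: np.ndarray, size: int = 200) -> np.ndarray:
    gray = to_grayscale(image)
    gray = cv2.resize(gray, (size, size))      # uniform input size
    gray = cv2.GaussianBlur(gray, (5, 5), 0)   # noise reduction
    # Min-max normalization to [0, 1]; z-score would be (x - mean) / std
    return (gray - gray.min()) / (gray.max() - gray.min() + 1e-8)
```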
[0009] The sensor unit is crucial for gesture and voice recognition. Cameras capture gestures, while accelerometers and gyroscopes aid interpretation; for voice, sensors analyze speech and filter noise. The software components are: hand landmark detection, which locates 21 hand-knuckle coordinates using real and synthetic images; convolutional neural networks, which identify hand motions and aid speech recognition; and support vector machines, which classify data points by determining optimal decision boundaries and process the hand motions captured by web cameras. Finally, the recognize_google() method transcribes audio using Google's web speech API; it requires an AudioData object from the speech_recognition module. Voice conversion modifies speech features from the source to the target speaker, altering timbre and prosody elements such as duration and intonation. The invention uses camera-recorded hand motions and converts them to text, then to speech.
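The hand landmark detection step can be sketched as follows using Google's MediaPipe Hands solution, which returns 21 three-dimensional knuckle coordinates per detected hand; the camera loop uses OpenCV, and the classify_gesture() call is a hypothetical stub, since the trained gesture classifier is not published here.

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False,
                                 max_num_hands=1,
                                 min_detection_confidence=0.5)

cap = cv2.VideoCapture(0)                        # camera input (101)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV delivers BGR frames.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            # 21 (x, y, z) hand-knuckle coordinates, normalized to the frame (103)
            keypoints = [(lm.x, lm.y, lm.z) for lm in hand.landmark]
            # classify_gesture(keypoints)        # hypothetical classifier (105)
cap.release()
```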
DETAILED TECHNICAL DESCRIPTION:
[0010] Hand Motion Detection Systems: Camera-based hand motion tracking technology uses sensors and cameras to follow and understand user movements in real time.
[0011] Network Architecture
1. Input Layer: Accepts grayscale images converted from RGB, enhancing contrast for better feature extraction.
2. Preprocessing: Resizes images to a uniform size, applies Gaussian filtering for noise reduction, and normalizes pixel values to stabilize training.
3. Feature Extraction: Utilizes convolutional layers to capture spatial hierarchies, pooling layers to downsample features, and activation functions for non-linearity.
4. Hand Motion Detection: Employs visual-based systems for direct analysis, using deformable templates for gesture tracking and key point detection for critical features.
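A minimal Keras sketch of the network architecture described above, assuming 200x200 grayscale inputs and the 29 gesture classes mentioned in [0007]; the filter counts and layer sizes are illustrative assumptions, not published values.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(200, 200, 1)),           # grayscale input layer
    layers.Conv2D(32, 3, activation="relu"),     # spatial feature extraction
    layers.MaxPooling2D(),                       # downsampling
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),                         # overfitting control
    layers.Dense(128, activation="relu"),        # non-linearity
    layers.Dense(29, activation="softmax"),      # one unit per gesture class
])
```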
[0012] Training Strategy
1. Implements mini-batch gradient descent and adaptive learning rates for efficient training, monitoring loss and accuracy.
2. Integrates dropout layers and weight decay to prevent overfitting, employing early stopping based on validation performance.
3. Conducts hyperparameter tuning and evaluates performance on validation sets to ensure robustness and efficiency for deployment.
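A minimal sketch of this training strategy, continuing the Keras model above and assuming NumPy arrays x_train/y_train and x_val/y_val; the batch size, learning rate, and patience are illustrative hyperparameters, not the patent's.

```python
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=1e-3),   # adaptive learning rate
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])                 # monitor loss and accuracy

model.fit(
    x_train, y_train,
    batch_size=32,                                  # mini-batch gradient descent
    epochs=50,
    validation_data=(x_val, y_val),                 # validation-set evaluation
    # weight decay could be added via the AdamW optimizer; dropout is in the model
    callbacks=[EarlyStopping(monitor="val_loss",    # early stopping on validation
                             patience=5,
                             restore_best_weights=True)],
)
```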
BRIEF DESCRIPTION OF THE DRAWING:
Fig 1: A flow diagram of hand gesture recognition and text-to-speech conversion
LIST OF REFERENCE NUMERALS
100 - Hand images
101 - Hand images captured using the camera by enabling the video functionality
102 - Training on Different types of sign languages
103 - 3D hand key points on the palm are marked (Hand landmark detection)
104 - Hand landmark detection is done using MediaPipe.
105 - Identify hand gestures based on hand key points.
106 - Simultaneous Text-to-Speech conversion for the recognised gesture.
107 - Output is generated at the user end, i.e. smartphones
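A minimal sketch of the speech side of this flow: recognize_google() from the speech_recognition module transcribes the hearing person's speech, and the text-to-speech step (106) is illustrated with pyttsx3, an assumed offline engine, since the specification does not name a particular TTS library.

```python
import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
engine = pyttsx3.init()                     # assumed offline TTS engine

def speech_to_text() -> str:
    """Transcribe microphone audio via Google's web speech API."""
    with sr.Microphone() as source:
        audio = recognizer.listen(source)   # yields an AudioData object
    return recognizer.recognize_google(audio)

def text_to_speech(text: str) -> None:
    """Speak the text produced by the gesture recognizer (106)."""
    engine.say(text)
    engine.runAndWait()
```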
WE CLAIM:
1. A method for Hand Gesture Recognition and Speech Synthesis for the Impaired: Leveraging Deep Neural Networks and Motion Trigger Filtering, comprising:
a. Extracting the gestures that are captured through the camera
b. Training the model with hand gesture images
c. Matching and coordinating the training and real time images
d. Converting the output of the hand gesture recognition system simultaneously
into speech, i.e. from text to speech.
2. The system as claimed in claim 1, wherein the whole deep learning model is compressed into a mobile application integrated on the smartphone.
3. The system as claimed in claim 1, wherein the model is trained on different types of sign language such as British Sign Language and American Sign Language.
4. The system as claimed in claim 1, wherein the accuracy of hand gesture recognition is improved by hand landmark detection and motion trigger filtering.
Documents
Name | Date |
---|---|
202441090030-Form 1-201124.pdf | 25/11/2024 |
202441090030-Form 18-201124.pdf | 25/11/2024 |
202441090030-Form 2(Title Page)-201124.pdf | 25/11/2024 |
202441090030-Form 3-201124.pdf | 25/11/2024 |
202441090030-Form 5-201124.pdf | 25/11/2024 |
202441090030-Form 9-201124.pdf | 25/11/2024 |