VISION IMPAIRMENT ASSISTANCE SYSTEM AND METHOD THEREOF
ORDINARY APPLICATION
Published
Filed on 26 November 2024
Abstract
Disclosed herein is a vision impairment assistance system and method thereof, designed to assist visually impaired individuals in identifying objects, reading text, and recognizing currency in real-time. The system comprises a user device (102) with an integrated camera that captures live video of objects, text, and currency. This data is transmitted via a communication network (104) to a processing unit (106), which employs a you only look once (YOLO)-based object detection model and TensorFlow for currency recognition. The system further includes an object detection module (108), a currency detection module (110), and a text reading module (112), all configured to process the captured images and convert them into corresponding audio outputs through an audio output module (114). A memory unit (116) stores datasets and processed results for future updates. This system provides real-time audio feedback, enabling visually impaired users to navigate their environment independently and efficiently.
Patent Information
Field | Value |
---|---|
Application ID | 202441091966 |
Invention Field | PHYSICS |
Date of Application | 26/11/2024 |
Publication Number | 49/2024 |
Inventors
Name | Address | Country | Nationality |
---|---|---|---|
MR. DEVIDAS | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
MR. RIZWAN | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
MR. MOHD ANUS | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
MR. MAHIR | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
MR. SINAN | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
MS. APEKSHA NAYAK | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
MR. NAGENDRA PAI | DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING, NMAM INSTITUTE OF TECHNOLOGY, NITTE, KARKALA, KARNATAKA 576110 | India | India |
Applicants
Name | Address | Country | Nationality |
---|---|---|---|
NITTE (DEEMED TO BE UNIVERSITY) | 6TH FLOOR, UNIVERSITY ENCLAVE, MEDICAL SCIENCES COMPLEX, DERALAKATTE, MANGALURU, KARNATAKA 575018 | India | India |
Specification
Description:FIELD OF DISCLOSURE
[0001] The present disclosure relates generally to assistive technologies and, more specifically, to a vision impairment assistance system and method thereof.
BACKGROUND OF THE DISCLOSURE
[0002] The system continuously empowers individuals with vision impairments to perform daily activities independently by providing them with real-time information about their surroundings, enhancing their ability to interact with the physical environment.
[0003] It consistently enables seamless access to important visual data by converting it into audio formats, ensuring that users efficiently engage in tasks such as reading, navigating spaces, and managing transactions without needing assistance.
[0004] The system regularly enhances personal safety and mobility by delivering continuous audio feedback on nearby objects or obstacles, allowing users to move through different environments more securely and confidently.
[0005] Existing systems frequently rely on bulky or cumbersome hardware, making them less convenient for users to carry or integrate seamlessly into their daily routines, leading to decreased portability and usability.
[0006] Many current solutions consistently suffer from limited accuracy in detecting objects or recognizing text, which causes frustration for users and reduces the overall reliability of the assistance provided during navigation or task completion.
[0007] Existing inventions often depend on constant internet connectivity for functionality, leading to disruptions in performance when users are in areas with poor or no network access, limiting their independence in various environments.
[0008] Thus, in light of the above-stated discussion, there exists a need for a vision impairment assistance system and method thereof.
SUMMARY OF THE DISCLOSURE
[0009] The following is a summary description of illustrative embodiments of the invention. It is provided as a preface to assist those skilled in the art to more rapidly assimilate the detailed design discussion which ensues and is not intended in any way to limit the scope of the claims which are appended hereto in order to particularly point out the invention.
[0010] According to illustrative embodiments, the present disclosure focuses on a vision impairment assistance system and method thereof which overcomes the above-mentioned disadvantages or provides the users with a useful or commercial choice.
[0011] An objective of the present disclosure is to enhance the mobility and independence of visually impaired individuals by providing real-time object detection and obstacle identification.
[0012] An objective of the present disclosure is to assist users in recognizing and identifying various objects in their surroundings through audio feedback.
[0013] Another objective of the present disclosure is to improve accessibility for visually impaired users by enabling them to interact with digital and printed text through a text-reading function.
[0014] Another objective of the present disclosure is to facilitate accurate currency recognition, allowing visually impaired individuals to handle financial transactions independently.
[0015] Another objective of the present disclosure is to provide a user-friendly interface that integrates seamlessly into everyday activities, promoting ease of use for individuals with vision impairments.
[0016] Another objective of the present disclosure is to reduce dependency on external assistance by allowing users to navigate and interact with their environment autonomously.
[0017] Yet another objective of the present disclosure is to ensure consistent and reliable performance across various environments and lighting conditions, enhancing user confidence.
[0018] Yet another objective of the present disclosure is to increase the efficiency of visually impaired users in completing tasks such as reading, navigating, and managing financial activities.
[0019] Yet another objective of the present disclosure is to provide a lightweight and portable system that is easily accessible to users in diverse settings, improving their quality of life.
[0020] Yet another objective of the present disclosure is to offer a comprehensive solution that combines object detection, currency identification, and text reading into a single, unified system.
[0021] In light of the above, in one aspect of the present disclosure, a vision impairment assistance system is disclosed herein. The system comprises a user device configured to capture live video through an integrated camera, wherein the user device captures images of objects, text, and currency, and transmits them for further processing. The system includes a communication network operatively connected to the user device and the processing unit, wherein the communication network enables real-time data transmission between the user device and the processing unit for seamless processing and feedback. The system also includes a processing unit operatively connected to the user device, wherein the processing unit is configured to detect and identify objects, text, and currency in real-time from the captured images using an object detection model based on the you only look once (YOLO) architecture, and to convert the detected objects into corresponding text or audio signals. The system also includes an object detection module integrated into the processing unit, wherein the object detection module identifies objects within the captured image frames using a you only look once (YOLO)-based object detection framework and converts the detected objects into audio labels, transmitted to the user device. The system also includes a currency detection module operatively connected to the processing unit, wherein the currency detection module captures images of currency, segments the images for feature extraction, and compares them against a pre-trained model using the TensorFlow library, providing the corresponding audio output to the user device. The system also includes a text reading module integrated into the processing unit, wherein the text reading module extracts text from captured images, converts the text into speech, and transmits the speech output to the user device for the user to hear. The system also includes an audio output module configured to transmit the detected object, currency, and text results as real-time audio signals to the user device, wherein the audio output module ensures that visually impaired users receive timely and accurate information regarding their surroundings. The system also includes a memory unit operatively connected to the processing unit, wherein the memory unit stores datasets for object detection, currency recognition, and text reading, along with captured video frames and processed results, facilitating future updates and optimization.
[0022] In one embodiment, the user device is further configured to capture audio commands from the user through an integrated microphone, enabling voice-based control and interaction with the system's functionalities, such as requesting object identification, currency recognition, or text reading.
[0023] In one embodiment, the communication network supports wireless transmission protocols, including wireless fidelity (Wi-Fi) and Bluetooth for real-time communication between the user device and the processing unit, ensuring uninterrupted data transmission and feedback.
[0024] In one embodiment, the object detection module identifies specific obstacles such as doors, chairs, and tables, and categorizes them based on pre-defined object types, transmitting the category information to the user as audio feedback for enhanced navigation assistance.
[0025] In one embodiment, the currency detection module is further configured to recognize multiple currencies, distinguishing between different denominations and currency types, and transmitting this information as speech output for visually impaired users to manage financial transactions independently.
[0026] In one embodiment, the memory unit stores user interaction history, including detected objects, recognized text, and currency identification results, allowing the system to optimize future responses based on user preferences and frequently encountered objects or tasks.
[0027] In light of the above, in one aspect of the present disclosure, a method for vision impairment assistance is disclosed herein. The method comprises capturing live video using a user device operatively connected to a camera, wherein the user device captures images of objects, text, and currency in real-time. The method includes transmitting the captured images through a communication network operatively connected to the user device and a processing unit, wherein the communication network provides real-time data transmission to the processing unit for further analysis. The method also includes processing the transmitted images in the processing unit operatively connected to the communication network, wherein the processing unit detects objects using a you only look once (YOLO)-based object detection model, extracts text, and identifies currency in the captured images. The method also includes identifying objects through an object detection module integrated into the processing unit, wherein the object detection module analyses the image frames, detects the objects, and converts the identified objects into corresponding audio signals. The method also includes recognizing currency using a currency detection module integrated into the processing unit, wherein the currency detection module segments the captured images, extracts features, and compares the features with pre-trained models using the TensorFlow library to generate an audio description of the currency. The method also includes extracting text from the captured images using a text reading module integrated into the processing unit, wherein the text reading module converts the extracted text into speech output, enabling the user to receive the text as an audio message. The method also includes transmitting the audio signals corresponding to detected objects, recognized currency, and extracted text to the user device via an audio output module operatively connected to the processing unit, wherein the audio signals are communicated in real-time to provide immediate auditory feedback to the visually impaired user.
[0028] In one embodiment, the step of capturing live video further includes adjusting the frame rate and resolution based on the user's environment, optimizing the processing efficiency and accuracy of object detection, currency recognition, and text extraction.
[0029] In one embodiment, the processing unit dynamically updates the object detection model by continuously training the you only look once (YOLO)-based model with newly captured images, improving the accuracy of object identification over time based on real-time environmental changes.
[0030] In one embodiment, the step of transmitting audio signals to the user device includes adjusting the volume and speed of the speech output according to the user's preferences, ensuring optimal user experience and accessibility for different hearing conditions.
[0031] These and other advantages will be apparent from the present application of the embodiments described herein.
[0032] The preceding is a simplified summary to provide an understanding of some embodiments of the present invention. This summary is neither an extensive nor exhaustive overview of the present invention and its various embodiments. The summary presents selected concepts of the embodiments of the present invention in a simplified form as an introduction to the more detailed description presented below. As will be appreciated, other embodiments of the present invention are possible utilizing, alone or in combination, one or more of the features set forth above or described in detail below.
[0033] These elements, together with the other aspects of the present disclosure and various features are pointed out with particularity in the claims annexed hereto and form a part of the present disclosure. For a better understanding of the present disclosure, its operating advantages, and the specified object attained by its uses, reference should be made to the accompanying drawings and descriptive matter in which there are illustrated exemplary embodiments of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0034] To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the following briefly describes the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description merely show some embodiments of the present disclosure, and a person of ordinary skill in the art can derive other implementations from these accompanying drawings without creative efforts. All of the embodiments or the implementations shall fall within the protection scope of the present disclosure.
[0035] The advantages and features of the present disclosure will become better understood with reference to the following detailed description taken in conjunction with the accompanying drawing, in which:
[0036] FIG. 1 illustrates a block diagram of a vision impairment assistance system, in accordance with an exemplary embodiment of the present disclosure;
[0037] FIG. 2 illustrates a flow chart of a vision impairment assistance system, in accordance with an exemplary embodiment of the present disclosure;
[0038] FIG. 3 illustrates a block diagram of a method for vision impairment assistance, in accordance with an exemplary embodiment of the present disclosure;
[0039] FIG. 4 illustrates a flow diagram of an object detection and audio feedback system for visually impaired users, in accordance with an exemplary embodiment of the present disclosure;
[0040] FIG. 5 illustrates a process flow of an object detection, currency recognition, and text reading system, in accordance with an exemplary embodiment of the present disclosure;
[0041] FIG. 6 illustrates a workflow of a deep learning-based vision assistance system for the visually impaired, in accordance with an exemplary embodiment of the present disclosure.
[0042] Like reference numerals refer to like parts throughout the description of the several views of the drawing.
[0043] The vision impairment assistance system and method thereof is illustrated in the accompanying drawings, in which like reference letters indicate corresponding parts in the various figures. It should be noted that the accompanying figure is intended to present illustrations of exemplary embodiments of the present disclosure. This figure is not intended to limit the scope of the present disclosure. It should also be noted that the accompanying figure is not necessarily drawn to scale.
DETAILED DESCRIPTION OF THE DISCLOSURE
[0044] The following is a detailed description of embodiments of the disclosure depicted in the accompanying drawings. The embodiments are in such detail as to communicate the disclosure. However, the amount of detail offered is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure.
[0045] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be apparent to one skilled in the art that embodiments of the present disclosure may be practiced without some of these specific details.
[0046] Various terms as used herein are shown below. To the extent a term is used, it should be given the broadest definition persons in the pertinent art have given that term as reflected in printed publications and issued patents at the time of filing.
[0047] The terms "a" and "an" herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.
[0048] The terms "having", "comprising", "including", and variations thereof signify the presence of a component.
[0049] Referring now to FIG. 1 to FIG. 6 to describe various exemplary embodiments of the present disclosure, FIG. 1 illustrates a block diagram of a vision impairment assistance system, in accordance with an exemplary embodiment of the present disclosure.
[0050] The system 100 may include a user device 102 configured to capture live video through an integrated camera, wherein the user device 102 captures images of objects, text, and currency, and transmits them for further processing. The system 100 may include a communication network 104 operatively connected to the user device 102 and the processing unit 106, wherein the communication network 104 enables real-time data transmission between the user device 102 and the processing unit 106 for seamless processing and feedback. The system 100 may include a processing unit 106 operatively connected to the user device 102, wherein the processing unit 106 is configured to detect and identify objects, text, and currency in real-time from the captured images using an object detection model based on the you only look once (YOLO) architecture, and to convert the detected objects into corresponding text or audio signals. The system 100 may include an object detection module 108 integrated into the processing unit 106, wherein the object detection module 108 identifies objects within the captured image frames using a you only look once (YOLO)-based object detection framework and converts the detected objects into audio labels, transmitted to the user device 102. The system 100 may include a currency detection module 110 operatively connected to the processing unit 106, wherein the currency detection module 110 captures images of currency, segments the images for feature extraction, and compares them against a pre-trained model using the TensorFlow library, providing the corresponding audio output to the user device 102. The system 100 may include a text reading module 112 integrated into the processing unit 106, wherein the text reading module 112 extracts text from captured images, converts the text into speech, and transmits the speech output to the user device 102 for the user to hear. The system 100 may include an audio output module 114 configured to transmit the detected object, currency, and text results as real-time audio signals to the user device 102, wherein the audio output module 114 ensures that visually impaired users receive timely and accurate information regarding their surroundings. The system 100 may include a memory unit 116 operatively connected to the processing unit 106, wherein the memory unit 116 stores datasets for object detection, currency recognition, and text reading, along with captured video frames and processed results, facilitating future updates and optimization.
[0051] The user device 102 is further configured to capture audio commands from the user through an integrated microphone, enabling voice-based control and interaction with the system's functionalities, such as requesting object identification, currency recognition, or text reading.
[0052] The communication network 104 supports wireless transmission protocols, including wireless fidelity (Wi-Fi) and Bluetooth, for real-time communication between the user device 102 and the processing unit, ensuring uninterrupted data transmission and feedback.
[0053] The object detection module 108 identifies specific obstacles such as doors, chairs, and tables, and categorizes them based on pre-defined object types, transmitting the category information to the user as audio feedback for enhanced navigation assistance.
[0054] The currency detection module 110 is further configured to recognize multiple currencies, distinguishing between different denominations and currency types, and transmitting this information as speech output for visually impaired users to manage financial transactions independently.
[0055] The memory unit 116 stores user interaction histories, including detected objects, recognized text, and currency identification results, allowing the system to optimize future responses based on user preferences and frequently encountered objects or tasks.
[0056] The method may include capturing live video using a user device 102 operatively connected to a camera, wherein the user device 102 captures images of objects, text, and currency in real-time. The method may include transmitting the captured images through a communication network 104 operatively connected to the user device 102 and a processing unit 106, wherein the communication network 104 provides real-time data transmission to the processing unit 106 for further analysis. The method may include processing the transmitted images in the processing unit 106 operatively connected to the communication network 104, wherein the processing unit 106 detects objects using a you only look once (YOLO)-based object detection model, extracts text, and identifies currency in the captured images. The method may include identifying objects through an object detection module 108 integrated into the processing unit 106, wherein the object detection module 108 analyses the image frames, detects the objects, and converts the identified objects into corresponding audio signals. The method may include recognizing currency using a currency detection module 110 integrated into the processing unit 106, wherein the currency detection module 110 segments the captured images, extracts features, and compares the features with pre-trained models using the TensorFlow library to generate an audio description of the currency. The method may include extracting text from the captured images using a text reading module 112 integrated into the processing unit 106, wherein the text reading module 112 converts the extracted text into speech output, enabling the user to receive the text as an audio message. The method may include transmitting the audio signals corresponding to detected objects, recognized currency, and extracted text to the user device 102 via an audio output module 114 operatively connected to the processing unit 106, wherein the audio signals are communicated in real-time to provide immediate auditory feedback to the visually impaired user.
[0057] The step of capturing live video further includes adjusting the frame rate and resolution based on the user's environment, optimizing the processing efficiency and accuracy of object detection, currency recognition, and text extraction.
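The disclosure does not fix how this adjustment is made. The following is a minimal sketch, assuming an OpenCV-style capture interface, in which a simple scene-brightness check selects a lower resolution and frame rate in dim environments; the brightness cutoff and the chosen resolutions are illustrative assumptions only.

```python
# Illustrative sketch: adapting capture resolution and frame rate to the
# environment, as the capturing step describes. The brightness threshold
# and chosen resolutions are assumptions for demonstration only.
import cv2
import numpy as np

def configure_camera(cap: cv2.VideoCapture, sample_frame) -> None:
    """Pick a lower resolution/frame rate in dim scenes to ease processing."""
    brightness = float(np.mean(cv2.cvtColor(sample_frame, cv2.COLOR_BGR2GRAY)))
    if brightness < 60:                       # dim environment (assumed cutoff)
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
        cap.set(cv2.CAP_PROP_FPS, 15)
    else:                                     # well-lit environment
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)
        cap.set(cv2.CAP_PROP_FPS, 30)

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    configure_camera(cap, frame)
cap.release()
```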
[0058] The processing unit 106 dynamically updates the object detection model by continuously training the you only look once (YOLO)-based model with newly captured images, improving the accuracy of object identification over time based on real-time environmental changes.
[0059] The step of transmitting audio signals to the user device 102 includes adjusting the volume and speed of the speech output according to the user's preferences, ensuring optimal user experience and accessibility for different hearing conditions.
[0060] The user device 102 serves as the primary interface for visually impaired individuals, designed to capture and transmit real-time data for processing and feedback. The user device 102 integrates a camera that continuously captures live video of the user's surroundings, focusing on objects, text, and currency. The captured video frames are processed and used for detecting, recognizing, and converting these elements into corresponding audio feedback. The integrated camera within the user device 102 operates with high precision to ensure that even minute details in the environment are captured accurately, facilitating real-time processing.
[0061] The user device 102 maintains a continuous connection with a communication network 104, which ensures that the data captured by the camera is transmitted seamlessly to the processing unit 106. The communication network 104 provides uninterrupted real-time transmission, allowing the processing unit 106 to analyse the captured video efficiently.
[0062] Furthermore, the user device 102 is equipped with an intuitive interface, allowing users to switch modes and receive real-time audio feedback. The user device 102 continuously interacts with the processing unit 106, where the detection and recognition algorithms are deployed, transforming visual information into audible cues that the user receives through connected audio accessories like headphones or speakers.
[0063] The ergonomic design of the user device 102 ensures ease of use for visually impaired individuals, supporting them during navigation and interaction with their surroundings. The user device 102 constantly communicates with the object detection module 108, currency detection module 110, and text reading module 112, each performing specialized functions.
[0064] Through seamless integration with these modules, the user device 102 enhances the user experience by ensuring accurate and timely feedback regarding objects, currency, and text in the user's environment. The design and functionality of the user device 102 continuously adapt to the user's needs, ensuring efficient and reliable assistance for visually impaired individuals.
[0065] The communication network 104 functions as the essential link between the user device 102 and the processing unit 106, ensuring real-time data exchange for seamless operation. The communication network 104 continuously enables the live video captured by the user device 102 to be transmitted directly to the processing unit 106, where detection and recognition tasks are performed. This uninterrupted transmission is critical for providing visually impaired users with timely feedback regarding their surroundings.
[0066] The communication network 104 supports high-speed data transfer, ensuring that all captured video frames reach the processing unit 106 without delay. This connection enables the processing unit 106 to execute object detection, currency recognition, and text reading functions in real-time, delivering accurate results back to the user device 102. By maintaining an efficient data flow, the communication network 104 ensures that the entire system operates smoothly and without interruptions.
[0067] Additionally, the communication network 104 is designed to handle multiple data streams simultaneously, facilitating the transmission of data from various modules within the system, including the object detection module 108, currency detection module 110, and text reading module 112. The network ensures that all processed results are returned promptly to the user device 102, allowing users to receive immediate audio feedback.
[0068] The communication network 104 constantly manages the connection between all components, ensuring a stable and reliable link even in dynamic environments. By facilitating smooth data exchange between the user device 102 and the processing unit 106, the communication network 104 plays a crucial role in enhancing the overall performance of the vision impairment assistance system. Its robust design ensures that visually impaired users receive accurate and timely assistance, providing them with critical information regarding their surroundings.
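The patent specifies only that frames move from the user device 102 to the processing unit 106 over a wireless link such as Wi-Fi or Bluetooth. As a non-authoritative sketch, one common realisation is to JPEG-encode each frame and push it over a TCP socket; the endpoint address and the simple length-prefixed framing below are assumptions, not disclosed details.

```python
# Illustrative sketch: sending one JPEG-encoded frame from the user device
# to the processing unit over a TCP socket. Host, port, and the
# length-prefixed framing are assumptions; the patent specifies only that
# a Wi-Fi/Bluetooth transport is used.
import socket
import struct
import cv2

PROCESSING_UNIT_ADDR = ("192.168.1.10", 5000)   # hypothetical endpoint

def send_frame(frame) -> None:
    """JPEG-encode a frame and send it with a 4-byte length prefix."""
    ok, jpeg = cv2.imencode(".jpg", frame)
    if not ok:
        return
    payload = jpeg.tobytes()
    with socket.create_connection(PROCESSING_UNIT_ADDR) as sock:
        sock.sendall(struct.pack("!I", len(payload)) + payload)

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    send_frame(frame)
cap.release()
```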
[0069] The processing unit 106 acts as the central hub for managing the detection and identification processes in the vision impairment assistance system. The processing unit 106 receives live video footage transmitted by the user device 102 through the communication network 104. This footage, containing images of objects, text, and currency, is processed in real-time, ensuring the system provides accurate and immediate feedback to the user.
[0070] The processing unit 106 is configured to utilize a you only look once (YOLO)-based object detection model for analysing the captured images. The object detection module 108, integrated within the processing unit 106, detects various objects within the image frames, identifies them, and converts them into audio labels. These audio labels are then transmitted back to the user device 102 for the user's awareness of nearby objects. The object detection process involves segmenting the images, detecting shapes, and classifying objects, all handled efficiently by the processing unit 106.
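The specification names the YOLO architecture but no particular implementation. A minimal sketch, assuming the open-source `ultralytics` package and a generic pre-trained model file, shows how the object detection module 108 might turn one captured frame into audio-ready labels; the model file and label handling are assumptions, not part of the disclosure.

```python
# Illustrative sketch only: one possible way to realise the YOLO-based
# object detection module (108) with the open-source `ultralytics` package.
from ultralytics import YOLO
import cv2

model = YOLO("yolov8n.pt")  # hypothetical pre-trained model file

def detect_objects(frame):
    """Return a list of (label, confidence) pairs for one video frame."""
    results = model(frame, verbose=False)[0]
    labels = []
    for box in results.boxes:
        class_id = int(box.cls[0])
        confidence = float(box.conf[0])
        labels.append((results.names[class_id], confidence))
    return labels

cap = cv2.VideoCapture(0)          # camera integrated in the user device
ok, frame = cap.read()
if ok:
    for name, conf in detect_objects(frame):
        print(f"Detected {name} ({conf:.2f})")  # later converted to audio
cap.release()
```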
[0071] In addition to object detection, the processing unit 106 manages the currency detection module 110. The currency detection module 110 uses the TensorFlow library to analyse and compare currency images against a pre-trained model. The processing unit 106 segments the currency images for feature extraction and generates corresponding audio feedback, which is transmitted to the user device 102. This feature allows visually impaired users to identify currency denominations accurately.
[0072] Furthermore, the processing unit 106 operates the text reading module 112, which extracts text from captured images and converts it into speech. The processing unit 106 manages the entire text extraction process, ensuring that all text is accurately identified and converted into an audible format for the user. The audio output is transmitted to the user device 102, providing real-time assistance for text reading tasks.
[0073] The processing unit 106 integrates and manages all these modules, ensuring the system functions smoothly to assist visually impaired users.
[0074] The object detection module 108 plays a pivotal role in the vision impairment assistance system by identifying and processing objects captured within the image frames. Integrated into the processing unit 106, the object detection module 108 functions using a you only look once (YOLO)-based object detection framework. This framework is responsible for recognizing a wide range of objects from the captured live video transmitted by the user device 102 through the communication network 104.
[0075] Upon receiving the images, the object detection module 108 segments the image frames to identify key features and objects. The YOLO-based architecture, known for its real-time detection capability, processes each frame to determine the presence and count of objects. The object detection module 108 uses deep learning techniques to classify objects, recognizing patterns and shapes efficiently. The module then assigns a label to each detected object, which is sent to the audio output module 114 for conversion into audio feedback.
[0076] The object detection module 108 not only identifies objects but also counts the number of instances of a particular object. This ensures that visually impaired users receive accurate information about the surrounding environment. The module is designed to handle various object classes, such as people, vehicles, furniture, and other obstacles, ensuring a comprehensive detection process.
[0077] The object detection module 108 operates seamlessly in conjunction with other modules like the currency detection module 110 and text reading module 112 within the processing unit 106. By integrating with these modules, the object detection module 108 ensures that users are provided with a complete and real-time understanding of their environment. All detected objects and their respective classifications are transmitted to the user device 102, offering real-time feedback to visually impaired individuals for safer navigation and interaction with their surroundings.
[0078] The currency detection module 110 is an integral part of the vision impairment assistance system, specifically designed to assist visually impaired users in identifying and handling currency. The currency detection module 110 operates by capturing images of currency through the camera integrated within the user device 102 and transmitting these images via the communication network 104 to the processing unit 106 for analysis.
[0079] Once the currency image is received, the currency detection module 110 segments the image into distinct regions that are essential for accurate feature extraction. The segmentation process enables the module to isolate key characteristics of the currency, such as numerical values, unique symbols, and colour patterns. The currency detection module 110 employs deep learning models, utilizing the TensorFlow library, to recognize these features and compare them against a pre-trained dataset stored within the memory unit 116.
[0080] The currency detection module 110 is designed to support multiple currencies, ensuring that users can identify currency types and denominations accurately. The module provides real-time analysis, rapidly processing the captured images to identify the specific denomination of the currency. The system is built to recognize variations in currency design, such as different notes and coins, ensuring consistent accuracy regardless of the type of currency presented.
[0081] After detecting and identifying the currency, the currency detection module 110 transmits the results to the audio output module 114, where the information is converted into an audio signal. This audio feedback is sent back to the user device 102, allowing visually impaired individuals to receive immediate and clear feedback about the currency they are handling.
[0082] The currency detection module 110 works in parallel with other modules, such as the object detection module 108 and text reading module 112, to ensure that users can manage everyday tasks effectively, providing comprehensive assistance for various real-world scenarios, including financial transactions and shopping.
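The disclosure states only that segmented currency images are compared against a pre-trained model via the TensorFlow library. A minimal sketch of such a classifier call is shown below; the model file `currency_cnn.h5`, the 224×224 input size, and the denomination labels are assumptions for illustration, not disclosed details.

```python
# Illustrative sketch: classifying a currency image with a pre-trained
# TensorFlow/Keras model, as the currency detection module (110) might do.
# Model path, input size, and denomination labels are assumed, not disclosed.
import numpy as np
import tensorflow as tf

DENOMINATIONS = ["10 rupees", "20 rupees", "50 rupees",
                 "100 rupees", "200 rupees", "500 rupees"]  # assumed label order
model = tf.keras.models.load_model("currency_cnn.h5")       # hypothetical model

def recognize_currency(image_path: str) -> str:
    """Return the most likely denomination for a captured currency image."""
    img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
    x = tf.keras.utils.img_to_array(img) / 255.0             # normalise pixels
    probs = model.predict(np.expand_dims(x, axis=0), verbose=0)[0]
    return DENOMINATIONS[int(np.argmax(probs))]

print(recognize_currency("note.jpg"))  # e.g. "100 rupees", spoken downstream
```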
[0083] The text reading module 112 functions as a critical component within the vision impairment assistance system, designed to convert text from images into speech for visually impaired users. The text reading module 112 starts by receiving images captured by the user device 102, which transmits live video of various objects, text, and documents through the communication network 104 to the processing unit 106. The text reading module 112 then extracts text content from the image by using advanced optical character recognition (OCR) algorithms embedded within the processing unit 106.
[0084] Once the image is received and processed, the text reading module 112 initiates the process of identifying letters, words, and sentences from the visual data. The module segments the image to isolate the text portions, focusing on distinct letters and characters, even when they appear on complex backgrounds or surfaces. The segmentation process ensures accurate identification of the text, regardless of the surrounding environment or lighting conditions. The text reading module 112 employs sophisticated language models that understand the structure of words and sentences, enabling it to provide accurate recognition of printed and handwritten text.
[0085] After the text extraction is completed, the text reading module 112 performs a conversion of the identified text into speech. The text reading module 112 is configured to handle multiple languages, ensuring that users can hear accurate pronunciations in their preferred language. It also supports different text formats, such as printed documents, signs, labels, and digital screens, providing a wide range of assistance across diverse use cases.
[0086] Once the text is converted into speech, the text reading module 112 transmits the audio output to the user device 102 through the audio output module 114. The module ensures that the spoken information is delivered in a clear and concise manner, helping visually impaired individuals comprehend written content, navigate signs, and read documents with ease. The text reading module 112 plays an essential role in empowering users with enhanced accessibility in various daily life situations.
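The patent refers generically to OCR and speech conversion without naming libraries. A minimal sketch, assuming the widely used `pytesseract` and `pyttsx3` packages as stand-ins, shows how the text reading module 112 might extract text from a frame and read it aloud.

```python
# Illustrative sketch: OCR followed by text-to-speech, one way the text
# reading module (112) could be realised. pytesseract and pyttsx3 are
# assumed stand-ins; the patent names no specific OCR or TTS library.
import cv2
import pytesseract
import pyttsx3

def read_text_aloud(frame) -> str:
    """Extract printed text from a video frame and speak it."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)       # simplify background
    text = pytesseract.image_to_string(gray).strip()     # OCR step
    if text:
        engine = pyttsx3.init()
        engine.say(text)                                  # queue speech output
        engine.runAndWait()                               # play synchronously
    return text

frame = cv2.imread("sign.jpg")       # stand-in for a captured video frame
print(read_text_aloud(frame))
```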
[0087] The audio output module 114 serves as a pivotal component within the vision impairment assistance system, designed to deliver real-time audio feedback to users. The audio output module 114 receives processed data from the processing unit 106, which handles various types of recognition tasks, including object detection from the object detection module 108, currency identification from the currency detection module 110, and text-to-speech conversion from the text reading module 112. Once the processing unit 106 completes these tasks, the audio output module 114 ensures that the results are conveyed to the user in an audible format.
[0088] The audio output module 114 plays a crucial role in making the environment accessible to visually impaired users by converting visual information into clear, understandable audio signals. The audio output module 114 is responsible for transmitting audio labels for identified objects, enabling users to understand the nature and identity of the objects in their immediate surroundings. Similarly, the audio output module 114 communicates the results of currency recognition, allowing the user to discern between different denominations and currencies accurately, facilitating easier financial transactions and handling.
[0089] In addition to object and currency recognition, the audio output module 114 relays text-based information after the text reading module 112 processes the content. The audio output module 114 ensures the real-time delivery of spoken words or sentences extracted from captured images, whether they are signs, documents, or any other form of written material. This instantaneous delivery of information enhances user experience, ensuring that individuals receive timely, accurate, and relevant data about their surroundings.
[0090] The audio output module 114 integrates seamlessly with the user device 102, providing uninterrupted feedback even in dynamic environments. By employing advanced audio processing techniques, the audio output module 114 ensures clarity and volume adjustments appropriate for different settings, making the auditory experience accessible and user-friendly. Its ability to handle diverse audio outputs, ranging from simple object labels to more complex sentences, allows the audio output module 114 to meet a wide variety of user needs in real time.
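The volume and speaking-rate adjustments attributed to the audio output module 114 (and to the transmitting step in paragraph [0030]) could be realised with standard text-to-speech engine properties; the sketch below uses `pyttsx3` purely as an assumed example engine, not as the disclosed mechanism.

```python
# Illustrative sketch: adjusting speech rate and volume to user preference
# before announcing a detection result, as described for the audio output
# module (114). pyttsx3 is an assumed example engine.
import pyttsx3

def announce(message: str, rate_wpm: int = 150, volume: float = 0.9) -> None:
    """Speak a message with user-preferred rate (words/min) and volume (0-1)."""
    engine = pyttsx3.init()
    engine.setProperty("rate", rate_wpm)      # slower or faster speech
    engine.setProperty("volume", volume)      # 0.0 (mute) to 1.0 (full)
    engine.say(message)
    engine.runAndWait()

announce("Chair detected two metres ahead", rate_wpm=130, volume=1.0)
```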
[0091] The memory unit 116 functions as an essential storage hub within the vision impairment assistance system, responsible for managing and preserving all relevant datasets and information crucial for smooth operations. The memory unit 116 stores a wide array of data, including the datasets for object detection utilized by the object detection module 108, currency recognition datasets leveraged by the currency detection module 110, and text extraction datasets applied by the text reading module 112. By maintaining these pre-trained models and datasets, the memory unit 116 ensures that the system can perform accurate real-time recognition of objects, currencies, and text.
[0092] In addition to datasets, the memory unit 116 also holds captured video frames from the user device 102. These stored video frames provide a reference point for the system's processing unit 106 to analyse and identify the objects, currencies, and text. The memory unit 116 plays a critical role in facilitating the object detection module 108 by ensuring that each image frame is accessible for further analysis and processing. The ability of the memory unit 116 to store this information contributes to the seamless and continuous processing of data across the system.
[0093] The memory unit 116 further manages processed results generated by the processing unit 106. After the system completes the recognition processes, the results are stored in the memory unit 116, enabling future retrieval and analysis. This storage capability also supports system optimization, as the stored data allows the system to update its models and algorithms with new information, ensuring that the object detection module 108, currency detection module 110, and text reading module 112 stay accurate and effective over time.
[0094] Moreover, the memory unit 116 supports system updates and enhancements. By maintaining an organized structure of datasets, video frames, and processed outputs, the memory unit 116 allows for efficient system improvements and adjustments without compromising existing functionalities. As the system evolves, the memory unit 116 ensures that all updates integrate seamlessly into its architecture, providing consistent and reliable performance.
[0095] FIG. 2 illustrates a flow chart of a vision impairment assistance system, in accordance with an exemplary embodiment of the present disclosure.
[0096] At 202, the user device captures live video of objects, text, and currency using its integrated camera.
[0097] At 204, the captured images are transmitted from the user device to the processing unit through the communication network for real-time analysis.
[0098] At 206, the object detection module identifies objects within the received image frames and converts them into corresponding audio labels.
[0099] At 208, the currency detection module segments the currency images, extracts features, and identifies the denomination for audio output.
[0100] At 210, the text reading module extracts text from the captured images and converts it to speech.
[0101] At 212, the memory unit stores datasets for future updates and optimization, along with video frames and processed results.
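Tying the FIG. 2 steps together, the following sketch shows one possible control loop for the pipeline; the helper functions are hypothetical stand-ins for the modules 108 to 116 and are not drawn from the disclosure.

```python
# Illustrative sketch of the FIG. 2 flow: capture (202), transmit (204),
# per-mode processing (206-210), and storage (212).
import cv2

# Hypothetical stand-ins for modules 108-116; real implementations would
# call YOLO, TensorFlow, OCR, and TTS routines such as those sketched above.
def detect_objects(frame):       return "objects: placeholder"
def recognize_currency(frame):   return "currency: placeholder"
def read_text(frame):            return "text: placeholder"
def speak(message):              print("AUDIO:", message)
def store_result(frame, result): pass

def assistance_loop(mode: str = "objects", max_frames: int = 10) -> None:
    cap = cv2.VideoCapture(0)                       # step 202: live capture
    try:
        for _ in range(max_frames):
            ok, frame = cap.read()
            if not ok:
                break
            # step 204: a deployed system would transmit the frame to the
            # processing unit 106 here; this sketch processes it locally.
            if mode == "currency":
                result = recognize_currency(frame)  # step 208
            elif mode == "text":
                result = read_text(frame)           # step 210
            else:
                result = detect_objects(frame)      # step 206
            speak(result)                           # audio output module 114
            store_result(frame, result)             # step 212: memory unit 116
    finally:
        cap.release()

assistance_loop()
```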
[0102] FIG. 3 illustrates a block diagram of a method for vision impairment assistance, in accordance with an exemplary embodiment of the present disclosure.
[0103] At 302, capturing live video using a user device operatively connected to a camera, wherein the user device captures images of objects, text, and currency in real-time.
[0104] At 304, transmitting the captured images through a communication network operatively connected to the user device and a processing unit, wherein the communication network provides real-time data transmission to the processing unit for further analysis.
[0105] At 306, processing the transmitted images in the processing unit operatively connected to the communication network, wherein the processing unit detects objects using a YOLO-based object detection model, extracts text, and identifies currency in the captured images.
[0106] At 308, identifying objects through an object detection module integrated into the processing unit, wherein the object detection module analyses the image frames, detects the objects, and converts the identified objects into corresponding audio signals.
[0107] At 310, recognizing currency using a currency detection module integrated into the processing unit, wherein the currency detection module segments the captured images, extracts features, and compares the features with pre-trained models using the TensorFlow library to generate an audio description of the currency.
[0108] At 312, extracting text from the captured images using a text reading module integrated into the processing unit, wherein the text reading module converts the extracted text into speech output, enabling the user to receive the text as an audio message.
[0109] At 314, transmitting the audio signals corresponding to detected objects, recognized currency, and extracted text to the user device via an audio output module operatively connected to the processing unit, wherein the audio signals are communicated in real-time to provide immediate auditory feedback to the visually impaired user.
[0110] FIG. 4 illustrates a flow diagram of an object detection and audio feedback system for visually impaired users, in accordance with an exemplary embodiment of the present disclosure.
[0111] At 402, the input from the camera in the user device 102 captures live video frames, continuously providing the system with visual data necessary for real-time analysis. The camera actively records images of objects, currency, and text, serving as the primary source of raw data for the system. Through seamless integration with the user device 102, the camera enables constant data acquisition, ensuring that each captured frame reaches the communication network 104 for transmission to the processing unit 106. This input allows the system to perform essential functions, including object identification, text extraction, and currency recognition, offering crucial support to visually impaired users.
[0112] At 404, the capture image function within the user device 102 involves obtaining high-quality visual frames of objects, text, and currency, which serve as essential input for processing and analysis. Each capture image from the user device 102 undergoes transmission through the communication network 104 to the processing unit 106, where modules for object detection, currency recognition, and text extraction analyse the image data. By capturing precise and detailed frames, the system ensures accuracy in detecting, identifying, and converting visual information into audio outputs, thus providing valuable real-time feedback for visually impaired users.
[0113] At 406, the keep images function in the memory unit 116 enables storage of all captured images processed by the user device 102 and analysed by the processing unit 106. Storing captured images allows the system to maintain a record of visual data, supporting future optimization, model updates, and reference during processing tasks. The memory unit 116 securely keeps images to facilitate accurate detection, text extraction, and currency recognition over time. This stored data becomes instrumental for refining object detection module 108, currency detection module 110, and text reading module 112, ensuring consistent and effective assistance for visually impaired users.
[0114] At 408, the image pre-processing function within the processing unit 106 prepares captured images from the user device 102 for precise analysis. This function enhances the clarity, contrast, and format of images, enabling efficient processing by the object detection module 108, currency detection module 110, and text reading module 112. By refining image quality, pre-processing facilitates accurate object detection, feature extraction, and text recognition. The processing unit 106 systematically applies these adjustments to each captured image, supporting reliable real-time feedback for visually impaired users and ensuring smooth functionality across modules integrated within the system.
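The pre-processing at 408 is described only in terms of clarity, contrast, and format adjustments. A common OpenCV recipe, sketched below with assumed CLAHE parameters and target size, is one plausible instance of such a step rather than the disclosed procedure.

```python
# Illustrative sketch of the image pre-processing step (408): contrast
# enhancement and resizing before the detection modules run. The CLAHE
# parameters and target size are assumptions, not disclosed values.
import cv2

def preprocess(frame, target_size=(640, 640)):
    """Return a contrast-enhanced, resized copy of a captured frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)                      # boost local contrast
    resized = cv2.resize(enhanced, target_size)       # fixed model input size
    return cv2.cvtColor(resized, cv2.COLOR_GRAY2BGR)  # back to 3 channels

frame = cv2.imread("capture.jpg")
if frame is not None:
    ready = preprocess(frame)
    print(ready.shape)   # (640, 640, 3), ready for the detection modules
```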
[0115] At 410, the object recognition process in the object detection module 108 identifies and labels objects within the captured image from the user device 102 in real-time. Using a you only look once (YOLO)-based object detection framework, the object detection module 108 analyses the processed image from the processing unit 106 to detect specific objects and assigns corresponding labels to each object. Following detection, the object detection module 108 converts the object labels into audio signals, which the audio output module 114 transmits to the user device 102. This continuous process provides visually impaired users with immediate auditory awareness of their surroundings.
[0116] At 412, the processing unit 106 estimates the distance of detected objects from the user by analysing spatial data within the captured image received from the user device 102. Using image depth estimation techniques, the processing unit 106 calculates the relative proximity of objects, helping to convey critical spatial awareness information to the visually impaired user. The processing unit 106 then transmits the calculated distance data to the audio output module 114, converting the information into clear, real-time auditory feedback. This continuous process enables users to receive accurate auditory cues about object distances, enhancing safe navigation and spatial awareness.
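The disclosure leaves the depth estimation technique at 412 open. One simple, widely used approximation relates distance to bounding-box size through the pinhole-camera model, distance ≈ (real object width × focal length in pixels) / box width in pixels; the sketch below uses assumed calibration constants purely for illustration.

```python
# Illustrative sketch of step 412: estimating object distance from bounding
# box size with the pinhole-camera approximation. The focal length and the
# table of assumed real-world widths are placeholders, not disclosed values.
ASSUMED_FOCAL_LENGTH_PX = 700.0          # would come from camera calibration
ASSUMED_REAL_WIDTH_M = {"chair": 0.45, "door": 0.90, "person": 0.50}

def estimate_distance_m(label: str, box_width_px: float) -> float:
    """Approximate distance (metres) to a detected object of known class."""
    real_width = ASSUMED_REAL_WIDTH_M.get(label, 0.5)   # fallback width
    return (real_width * ASSUMED_FOCAL_LENGTH_PX) / max(box_width_px, 1.0)

# e.g. a chair whose bounding box spans 150 pixels of the frame
print(f"{estimate_distance_m('chair', 150):.1f} m")     # about 2.1 m
```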
[0117] At 414, the audio output module 114 generates an audio signal corresponding to each identified object in real-time, using data from the object detection module 108 to ensure accurate identification. The audio output module 114 converts this identification into a clear auditory description, effectively translating visual data into accessible sound. This conversion allows the visually impaired user to recognize objects around them, receiving distinct audio cues for each detected item. The audio output module 114 promptly transmits the auditory feedback through the user device 102, facilitating continuous interaction with the surrounding environment and supporting safe and independent navigation.
[0118] At 416, the audio output module 114 transfers the generated audio signal to the user device 102, maintaining seamless and immediate delivery of auditory feedback. The communication network 104 facilitates this transfer, ensuring that the audio description of detected objects, currency, or text reaches the user device 102 without delay. By continuously transmitting the audio signal in real-time, the audio output module 114 enhances user awareness and supports interaction with the environment. This consistent audio feedback from the audio output module 114 empowers visually impaired users, allowing intuitive navigation and interaction through auditory cues processed by the processing unit 106.
[0119] At 418, the audio output module delivers clear, real-time auditory information about the surroundings to the user device 102. By providing audio feedback on identified objects, detected currency, and recognized text, the audio output module 114 translates visual data into accessible audio information. This module ensures that each detected item's details, including type, position, and possibly distance, are communicated directly to the user. Through the precise processing of the processing unit 106 and seamless data relay via the communication network 104, the audio output module 114 creates an immersive auditory experience, supporting the visually impaired user in real-world navigation and decision-making.
[0120] FIG. 5 illustrates a process flow of an object detection, currency recognition, and text reading system, in accordance with an exemplary embodiment of the present disclosure.
[0121] At 502, the start phase initiates the operation of the vision impairment assistance system, activating each core component to prepare for capturing, processing, and delivering information. During this phase, the user device powers on and synchronizes with the communication network, establishing the necessary connectivity to the processing unit. This initial phase configures the camera within the user device to begin capturing live video frames, setting the stage for subsequent data analysis and audio feedback generation. Once the start phase activates the processing unit, all connected modules, including object detection, currency recognition, and text reading, stand ready for real-time data processing and audio output. The system ensures all components are primed for seamless interaction, marking the transition to real-time operation and assistance for the visually impaired user.
[0122] At 504, the currency detection process involves capturing an image of currency through the user device's integrated camera and transmitting it to the processing unit for analysis. Within the processing unit, the currency detection module identifies and segments the captured image for feature extraction, isolating distinctive characteristics of the currency. The module then compares these extracted features to a pre-trained model using the TensorFlow library, which enhances accuracy in identifying different denominations. Once identification completes, the module converts the recognized currency information into an audio output. This audio output transmits back to the user device, allowing visually impaired users to discern currency type independently.
[0123] At 506, the Keras model functions as an integral part of the processing unit, facilitating deep learning tasks essential for object detection and recognition in real time. By leveraging neural network architectures, the Keras model processes input data from the user device and identifies specific features within captured images. The model trains on extensive datasets, optimizing its accuracy in recognizing objects, text, and currency. During operation, the Keras model analyzes visual data, using layers to classify and detect items based on learned patterns. This information converts into audio signals within the processing unit, providing visually impaired users with immediate auditory feedback on their surroundings.
[0124] At 508, the nearby currency recognition system operates by detecting and identifying currency within the user's immediate environment. The user device captures images of currency through its integrated camera, transmitting this visual data to the processing unit for analysis. The currency detection module then processes these images, segmenting them and comparing each segment to pre-trained currency models stored in the memory unit. The TensorFlow library enhances this identification process by providing machine learning tools that accurately classify currency types based on features like size, shape, and markings. The processing unit subsequently converts this identification into an audio description, delivering real-time feedback to the user.
[0125] At 510, the currency name identification feature functions to recognize and announce the specific currency type detected in the user's surroundings. The user device captures images of currency, sending them to the processing unit for immediate analysis. The currency detection module within the processing unit compares these images against pre-trained currency datasets in the memory unit. Using a TensorFlow-based classification process, the currency detection module accurately identifies distinct features, such as symbols or inscriptions, that correspond to particular currencies. Once identified, the processing unit generates an audio output with the recognized currency name, delivering this information directly to the user device as a clear auditory signal for the user.
[0126] At 512, the currency name identification feature functions to recognize and announce the specific currency type detected in the user's surroundings. The user device captures images of currency, sending them to the processing unit for immediate analysis. The currency detection module within the processing unit compares these images against pre-trained currency datasets in the memory unit. Using a TensorFlow-based classification process, the currency detection module accurately identifies distinct features, such as symbols or inscriptions, that correspond to particular currencies. Once identified, the processing unit generates an audio output with the recognized currency name, delivering this information directly to the user device as a clear auditory signal for the user.
[0127] At 514, the audio output plays a crucial role in delivering real-time information to visually impaired users. Generated through the text-to-speech (TTS) model within the processing unit, the audio communicates essential details about objects, text, and currency captured by the user device. The audio output module manages this information, converting text and detected elements into spoken descriptions that reach users with clarity. By conveying auditory cues, the audio allows users to engage with their surroundings independently, enhancing both situational awareness and safety. Audio generated for each recognized item supports seamless interaction, ensuring accessibility and aiding in daily tasks for visually impaired users.
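As an illustration of this text-to-speech step (the specification does not name a particular TTS engine), a minimal sketch using the pyttsx3 offline TTS library is given below; the library choice and the spoken message are assumptions.

```python
# Illustrative sketch only: speak a recognized label aloud with pyttsx3.
# The specification does not name a TTS library; pyttsx3 is an assumption.
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 150)   # speech speed; adjustable per user preference

def speak(message: str) -> None:
    # Queue the message and block until playback finishes.
    engine.say(message)
    engine.runAndWait()

speak("Detected: chair, two metres ahead")
```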
[0128] At 516, object detection begins when the user device captures live video, activating the object detection module within the processing unit. The object detection module, employing the YOLO architecture, processes the incoming images to recognize and identify objects present in the user's surroundings. The processing unit segments each captured frame, allowing the object detection module to assess and classify detected items based on pre-defined categories. Once identified, the object detection module sends the processed data to the audio output module, converting recognized objects into audio cues. This step enhances spatial awareness, enabling visually impaired users to navigate their environment confidently.
[0129] At 518, the You Only Look Once (YOLO) algorithm enables real-time object detection within the processing unit by dividing each image into a grid and analyzing each cell for potential objects. Each grid cell predicts bounding boxes and confidence scores, indicating the likelihood of an object's presence. The YOLO algorithm optimizes detection speed and accuracy by evaluating the entire image at once, reducing the complexity often associated with scanning each part individually. Through its streamlined, grid-based structure, the YOLO algorithm ensures efficient processing and accurate identification, enhancing object detection capabilities. This approach supports fast decision-making, making it suitable for real-time applications for visually impaired users.
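For illustration, a minimal sketch of single-frame YOLO inference is shown below using the ultralytics package and publicly available weights; the specification names only the YOLO architecture, so the library, weights file, and confidence handling here are assumptions.

```python
# Illustrative sketch only: run a pre-trained YOLO detector on one frame.
# The ultralytics package and the yolov8n.pt weights are assumptions.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                 # small general-purpose COCO model

frame = cv2.imread("captured_frame.jpg")
results = model(frame)[0]                  # single-image inference

for box in results.boxes:
    label = results.names[int(box.cls)]    # class name, e.g. "chair"
    confidence = float(box.conf)           # detection confidence score
    print(f"{label}: {confidence:.2f}")
```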
[0130] At 520, the nearby object represents an essential aspect of spatial awareness within the vision impairment assistance system. The processing unit continuously analyzes captured images to identify objects in proximity to the user device, facilitating safe navigation. Each nearby object undergoes classification through the object detection module, ensuring that visually impaired users receive real-time information about their surroundings. The audio output module communicates details of each nearby object, delivering audio cues that alert users to obstacles or relevant items. By efficiently distinguishing nearby objects, the system enhances mobility and situational awareness, allowing users to interact confidently with their environment.
[0131] At 522, the object name is identified by the object detection module integrated within the processing unit, which uses the You Only Look Once (YOLO)-based detection framework to recognize specific objects in the captured image. Once an object is detected, the object detection module assigns a unique label to the object, corresponding to its name, such as "chair," "table," or "door." The processing unit then translates the object name into audio signals through the audio output module, which transfers these signals to the user device. This real-time auditory identification of object names provides visually impaired users with clear, immediate awareness of their environment, significantly improving spatial orientation and navigation.
[0132] At 524, the text reader within the vision impairment assistance system serves as a vital tool for enabling access to written information. The text reading module integrated into the processing unit captures text from images taken by the user device, transforming printed words into accessible audio output. This module efficiently extracts and processes textual content, converting it to speech that the audio output module communicates to users. Through seamless integration with the processing unit, the text reader enhances users' ability to understand documents, labels, and signage. The user device, paired with the text reading module, enriches the experience for visually impaired individuals by providing real-time auditory feedback on textual information in their surroundings.
[0133] At 526, the capture image process within the user device initiates the core functionality of the vision impairment assistance system. The user device captures images of the surroundings, including objects, text, and currency, and transmits these images to the processing unit through the communication network. This live image capture continuously gathers essential visual data, enabling the system to identify items in real time. The captured image provides a basis for subsequent processing by the processing unit, allowing modules such as the object detection module, currency detection module, and text reading module to analyze and interpret the content. Through the capture image functionality, the system efficiently bridges the visual data and audio feedback stages, ensuring visually impaired users receive timely information about their environment.
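A minimal sketch of the continuous image-capture loop that feeds the downstream modules is given below, assuming an OpenCV-accessible camera on the user device; the camera index and stop key are illustrative.

```python
# Illustrative sketch only: capture frames from the device camera and pass
# each one to a processing callback (object, currency, or text pipeline).
import cv2

def capture_loop(process_frame) -> None:
    cap = cv2.VideoCapture(0)              # default camera index (assumption)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break                       # camera unavailable or stream ended
            process_frame(frame)            # hand the frame to the processing unit
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break                       # allow a manual stop during testing
    finally:
        cap.release()
```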
[0134] At 528, the extract text process engages the text reading module integrated into the processing unit to identify and retrieve written content from captured images. The text reading module uses optical character recognition (OCR) techniques to recognize characters and words within images transmitted by the user device through the communication network. Extracted text undergoes processing, converting it into readable, structured information. The processing unit prepares the extracted text to be transformed into audio, which the audio output module then transmits back to the user device. This process empowers visually impaired users to access printed or digital text in their surroundings through auditory feedback, enhancing their ability to interact with and interpret written information independently.
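As an illustration of the text-extraction step (the specification refers to OCR generally, not to a specific engine), a minimal sketch using pytesseract follows; the engine choice and pre-processing are assumptions.

```python
# Illustrative sketch only: extract text from a captured image with
# pytesseract; the OCR engine choice and pre-processing are assumptions.
import cv2
import pytesseract

def extract_text(image_path: str) -> str:
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # simple pre-processing
    return pytesseract.image_to_string(gray).strip()

print(extract_text("captured_label.jpg"))
```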
[0135] FIG. 6 illustrates the workflow of a deep learning-based vision assistance system for the visually impaired, in accordance with an exemplary embodiment of the present disclosure.
[0136] FIG. 6 depicts the workflow of a deep learning-based vision assistance system designed for visually impaired individuals. The system begins by collecting images and dividing them into training, validation, and testing sets. These images undergo augmentation and annotation, where labels are assigned for tasks like object detection and currency recognition. The annotated data is then fed into a deep learning model for training. Once trained, the model is evaluated for performance before being deployed on a single-board computer integrated with accessories like a distance sensor, camera, headphones, and power source.
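Purely as an illustration of this collect/split/train/evaluate workflow, a minimal sketch using Keras image-folder utilities is given below; the directory layout, image size, tiny model, and epoch count are assumptions introduced here.

```python
# Illustrative sketch only: train/validation/test split, training, evaluation,
# and export of a classifier. Paths, sizes, and the tiny model are assumptions.
import tensorflow as tf

img_size, batch = (224, 224), 32
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/train", image_size=img_size, batch_size=batch)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/val", image_size=img_size, batch_size=batch)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/test", image_size=img_size, batch_size=batch)

num_classes = len(train_ds.class_names)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=img_size + (3,)),
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(train_ds, validation_data=val_ds, epochs=10)
loss, accuracy = model.evaluate(test_ds)            # performance check before deployment
model.save("vision_assist_model.keras")             # artifact copied to the single-board computer
```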
[0137] When the device is activated, the user plugs in headphones and selects the desired mode. The system processes various data inputs, including objects, text, and obstacles. It performs object detection to identify and count objects, converts captured text into speech, and measures distances using the sensor. The corresponding information, such as object names, text, or proximity warnings, is converted into audio feedback, which is relayed to the user through headphones. This system provides real-time assistance in recognizing objects, reading text, and navigating obstacles, offering a comprehensive solution for visually impaired individuals to interact with their surroundings independently.
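For illustration of the mode selection and audio feedback loop described above, a hedged sketch is given in which the camera capture, detector, OCR reader, distance sensor, and speech output are passed in as placeholder callables; all names and the 50 cm proximity threshold are assumptions, not details from the specification.

```python
# Illustrative sketch only: dispatch between object, text, and obstacle modes
# and relay the result as speech. All callables and thresholds are assumptions.
MODES = {"1": "object", "2": "text", "3": "distance"}

def run_assistant(capture_frame, detect_objects, read_text, read_distance_cm, speak):
    choice = input("Select mode (1=object, 2=text, 3=distance): ")
    mode = MODES.get(choice, "object")
    while True:
        frame = capture_frame()
        if mode == "object":
            names = detect_objects(frame)                      # e.g. ["chair", "door"]
            speak(f"Detected {len(names)} objects: " + ", ".join(names))
        elif mode == "text":
            speak(read_text(frame))                            # OCR result read aloud
        else:
            distance = read_distance_cm()                      # ultrasonic sensor reading
            if distance < 50:                                  # illustrative proximity threshold
                speak(f"Obstacle ahead at {distance:.0f} centimetres")
```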
[0138] While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it will be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
A person of ordinary skill in the art may be aware that, in combination with the examples described in the embodiments disclosed in this specification, units and algorithm steps may be implemented by electronic hardware, computer software, or a combination thereof.
Claims:
I/We Claim:
1. A vision impairment assistance system (100) comprising:
a user device (102) configured to capture live video through an integrated camera, wherein the user device (102) captures images of objects, text, and currency, and transmits them for further processing;
a communication network (104) operatively connected to the user device (102) and the processing unit, wherein the communication network (104) enables real-time data transmission between the user device (102) and the processing unit (106) for seamless processing and feedback;
a processing unit (106) operatively connected to the user device, wherein the processing unit (106) is configured to detect and identify objects, text, and currency in real-time from the captured images using an object detection model based on the You Only Look Once (YOLO) architecture, and to convert the detected objects into corresponding text or audio signals;
an object detection module (108) integrated into the processing unit, wherein the object detection module (108) identifies objects within the captured image frames using a You Only Look Once (YOLO)-based object detection framework and converts the detected objects into audio labels, transmitted to the user device;
a currency detection module (110) operatively connected to the processing unit, wherein the currency detection module (110) captures images of currency, segments the images for feature extraction, and compares them against a pre-trained model using the TensorFlow library, providing the corresponding audio output to the user device;
a text reading module (112) integrated into the processing unit, wherein the text reading module (112) extracts text from captured images, converts the text into speech, and transmits the speech output to the user device (102) for the user to hear;
an audio output module (114) configured to transmit the detected object, currency, and text results as real-time audio signals to the user device, wherein the audio output module (114) ensures that visually impaired users receive timely and accurate information regarding their surroundings;
a memory unit (116) operatively connected to the processing unit, wherein the memory unit (116) stores datasets for object detection, currency recognition, and text reading, along with captured video frames and processed results, facilitating future updates and optimization.
2. The system (100) as claimed in claim 1, wherein the user device (102) is further configured to capture audio commands from the user through an integrated microphone, enabling voice-based control and interaction with the system's functionalities, such as requesting object identification, currency recognition, or text reading.
3. The system (100) as claimed in claim 1, wherein the communication network (104) supports wireless transmission protocols, including wireless fidelity (Wi-Fi) and Bluetooth, for real-time communication between the user device (102) and the processing unit, ensuring uninterrupted data transmission and feedback.
4. The system (100) as claimed in claim 1, wherein the object detection module (108) identifies specific obstacles such as doors, chairs, and tables, and categorizes them based on pre-defined object types, transmitting the category information to the user as audio feedback for enhanced navigation assistance.
5. The system (100) as claimed in claim 1, wherein the currency detection module (110) is further configured to recognize multiple currencies, distinguishing between different denominations and currency types, and transmitting this information as speech output for visually impaired users to manage financial transactions independently.
6. The system (100) as claimed in claim 1, wherein the memory unit (116) stores user interaction history, including detected objects, recognized text, and currency identification results, allowing the system to optimize future responses based on user preferences and frequently encountered objects or tasks.
7. A method for vision impairment assistance comprising:
capturing live video using a user device (102) operatively connected to a camera, wherein the user device (102) captures images of objects, text, and currency in real-time;
transmitting the captured images through a communication network (104) operatively connected to the user device (102) and a processing unit, wherein the communication network (104) provides real-time data transmission to the processing unit (106) for further analysis;
processing the transmitted images in the processing unit (106) operatively connected to the communication network, wherein the processing unit (106) detects objects using a You Only Look Once (YOLO)-based object detection model, extracts text, and identifies currency in the captured images;
identifying objects through an object detection module (108) integrated into the processing unit, wherein the object detection module (108) analyses the image frames, detects the objects, and converts the identified objects into corresponding audio signals;
recognizing currency using a currency detection module (110) integrated into the processing unit, wherein the currency detection module (110) segments the captured images, extracts features, and compares the features with pre-trained models using the TensorFlow library to generate an audio description of the currency;
extracting text from the captured images using a text reading module (112) integrated into the processing unit, wherein the text reading module (112) converts the extracted text into speech output, enabling the user to receive the text as an audio message;
transmitting the audio signals corresponding to detected objects, recognized currency, and extracted text to the user device (102) via an audio output module (114) operatively connected to the processing unit, wherein the audio signals are communicated in real-time to provide immediate auditory feedback to the visually impaired user.
8. The method (100) as claimed in claim 7, wherein the step of capturing live video further includes adjusting the frame rate and resolution based on the user's environment, optimizing the processing efficiency and accuracy of object detection, currency recognition, and text extraction.
9. The method (100) as claimed in claim 7, wherein the processing unit (106) dynamically updates the object detection model by continuously training the You Only Look Once (YOLO)-based model with newly captured images, improving the accuracy of object identification over time based on real-time environmental changes.
10. The method (100) as claimed in claim 7, wherein the step of transmitting audio signals to the user device (102) includes adjusting the volume and speed of the speech output according to the user's preferences, ensuring optimal user experience and accessibility for different hearing conditions.
Documents
Name | Date |
---|---|
202441091966-COMPLETE SPECIFICATION [26-11-2024(online)].pdf | 26/11/2024 |
202441091966-DECLARATION OF INVENTORSHIP (FORM 5) [26-11-2024(online)].pdf | 26/11/2024 |
202441091966-DRAWINGS [26-11-2024(online)].pdf | 26/11/2024 |
202441091966-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [26-11-2024(online)].pdf | 26/11/2024 |
202441091966-FORM 1 [26-11-2024(online)].pdf | 26/11/2024 |
202441091966-FORM FOR SMALL ENTITY(FORM-28) [26-11-2024(online)].pdf | 26/11/2024 |
202441091966-REQUEST FOR EARLY PUBLICATION(FORM-9) [26-11-2024(online)].pdf | 26/11/2024 |