OFFLINE VOICE-ENABLED GEOSPATIAL SYSTEM FOR MULTIMODAL INTERACTION AND REAL-TIME DATA PROCESSING


ORDINARY APPLICATION | Published | Filed on 22 November 2024

Abstract

The present invention relates to an offline voice-enabled geospatial system for multimodal interaction and real-time data processing. The system integrates multilingual capabilities, augmented reality (AR), and sensor data to enhance geospatial interaction, and works with GIS libraries such as Leaflet and OpenLayers to provide functionalities like real-time traffic visualization, layer management, and geospatial queries. The system comprises: voice recognition technologies such as Vosk and Mozilla DeepSpeech, which enable offline voice interaction in languages like Hindi and English; augmented reality (AR) components that provide a contextual AR experience for geospatial navigation and real-time data visualization; a gesture-based control mechanism through MediaPipe that allows users to interact with the geospatial map without physical touch; edge devices (e.g., Nvidia Jetson, Raspberry Pi) and advanced sensors (e.g., LiDAR, hyperspectral, multispectral, and thermal sensors) that facilitate real-time data processing for terrain analysis, environmental monitoring, and 3D mapping; and adaptive control mechanisms and deep learning models that optimize the performance of the application based on real-time sensor inputs. Real-time collaboration is supported by Socket.IO and Supabase, and predictive analytics are implemented using Prophet and PyCaret for advanced geospatial forecasting. The integration of edge computing devices and advanced sensors ensures high performance and real-time responsiveness, further enhancing the offline capabilities of the system.

Patent Information

Application ID: 202411090864
Invention Field: COMPUTER SCIENCE
Date of Application: 22/11/2024
Publication Number: 49/2024

Inventors

Name: Prashant Hemrajani
Address: Department of IOT and Intelligent Systems, School of Computing and Intelligent Systems, Manipal University Jaipur
Country: India
Nationality: India

Applicants

Name: Manipal University Jaipur
Address: Manipal University Jaipur, Off Jaipur-Ajmer Expressway, Post: Dehmi Kalan, Jaipur-303007, Rajasthan, India
Country: India
Nationality: India

Specification

Description:
Field of the Invention
The present invention relates to the field of geospatial web applications, and more particularly to an offline voice-enabled geospatial system for multimodal interaction and real-time data processing with multilingual, AR, and sensor integration.
Background of the Invention
The invention addresses several significant limitations of existing geospatial web applications. Current technologies often depend on cloud-based voice recognition services, which necessitate a stable internet connection and raise concerns about data privacy as user data is transmitted to third-party servers. This invention overcomes these issues by employing on-device voice recognition technologies such as Vosk and Mozilla DeepSpeech, enabling offline functionality that ensures uninterrupted operation even in areas with limited connectivity. Additionally, by processing voice commands locally, the invention enhances data privacy and security.
Moreover, while many existing applications offer basic map functionalities, they lack advanced features like seamless integration of voice commands with gesture recognition and augmented reality (AR). This invention fills that gap by incorporating gesture recognition through MediaPipe and AR capabilities using AR.js and Three.js, providing an immersive and interactive user experience. It also addresses real-time collaboration limitations by utilizing Socket.IO and Supabase for effective multi-user interactions and collaborative editing, avoiding common synchronization issues. The integration of edge computing devices and advanced sensors ensures high performance and real-time responsiveness, further enhancing the offline capabilities of the system.
CN110580273B: Map GIS data processing and storing method and device and readable storage medium, discloses a map GIS data processing and storing method and device, a computer-readable storage medium, a two-dimensional image display method, and a method for calling block data formed by the method; it belongs to the field of data processing and storage.
JP3073176U: Image data generator for audio-driven video plane, discloses an apparatus for generating audio-driven moving-image data of a face, for use in a telephone device or a standalone image generation device; it requires only one, or a small number of, images of a single face and does not need to send a large amount of data at the time of transmission. The apparatus comprises a sampling storage unit that pre-samples and stores representative examples of mouth-state changes corresponding to specific sounds of a speaker, a reading unit that reads one image of the speaker's face, a change-adding unit that changes the mouth of the image based on the mouth-state changes stored in the sampling storage unit, and storage means for the mouth states changed by the change-adding unit.
US10249298B2: Method and apparatus for providing global voice-based entry of geographic information in a device, discloses an approach for global voice-based entry of location information. The approach involves partitioning a global speech decoding graph into spatial partitions, determining the key entities occurring in each spatial partition to construct a combined set of key entities, and creating a retrieval index that maps the key entities in the combined set to their corresponding partitions. A first partition, the combined set of key entities, and the retrieval index are stored in a memory of a device for processing a voice input signal. A second partition that is not in the memory of the device is retrieved, based on the combined set of key entities and the retrieval index, to automatically re-process the voice input signal when an out-of-vocabulary result is obtained from the first partition.
US20080312826A1: Mobile phone having GPS navigation system, discloses a method of providing navigational instructions on a mobile phone having an integrated GPS receiver, a geographic information system, and access to at least one database of addresses of geographic locations, each address having an association to a telephone number. The method includes the steps of establishing the present geographic location of the mobile phone; inputting a telephone number identifying a destination; retrieving the address of the destination from the at least one database of addresses based on the input telephone number; and using the retrieved address of the destination and the present geographic location of the mobile phone to request navigational instructions from the present geographic location to the destination.
None of the prior art documents indicated above, either alone or in combination with one another, discloses what the present invention discloses.
Drawings
Fig. 1 illustrates the block diagram of the present invention.
Fig. 2 illustrates the flow chart of the present invention.
Detailed Description of the Invention
The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description of the invention that the invention is not limited to these illustrated embodiments but that the invention also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood that there is no intention to limit the invention to the specific form disclosed; on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention as defined in the claims.
In any embodiment described herein, the open-ended terms "comprising," "comprises," and the like (which are synonymous with "including," "having," and "characterized by") may be replaced by the respective partially closed phrases "consisting essentially of," "consists essentially of," and the like, or the respective closed phrases "consisting of," "consists of," and the like. As used herein, the singular forms "a", "an", and "the" designate both the singular and the plural, unless expressly stated to designate the singular only.

The invention addresses several significant limitations of existing geospatial web applications. The present invention relates to an offline voice-enabled geospatial system for multimodal interaction and real-time data processing. The system comprises:
1. Offline Voice Recognition with Multilingual Support:
• The invention leverages on-device voice recognition technologies such as Vosk and Mozilla DeepSpeech, which enable offline voice interaction in languages like Hindi and English. Most existing geospatial applications rely on cloud-based voice processing, which requires continuous internet access. This invention ensures user data privacy and offline functionality, distinguishing it from conventional systems that depend on external servers.
2. Augmented Reality (AR) for Enhanced Geospatial Interaction:
• The system integrates AR.js and Three.js to offer a contextual AR experience that overlays geospatial data on the real-world view. Current applications seldom utilize AR in this manner for geospatial navigation or real-time data visualization. Users can view points of interest, traffic conditions, and landmarks superimposed on real-world visuals using a mobile device or AR glasses.
3. Gesture-Based Interaction:
• This invention introduces a gesture-based control mechanism through MediaPipe that allows users to interact with the geospatial map without physical touch. Users can zoom, pan, or switch between map layers using simple gestures, which is particularly useful in situations where hands-free operation is necessary.
4. Edge Computing for Local Processing:
• The invention makes use of edge computing devices like Nvidia Jetson or Raspberry Pi for local processing of geospatial data. This eliminates the need for cloud servers, enabling the system to operate entirely offline while maintaining high performance and real-time data processing. The local processing capability allows for the integration of advanced sensors, such as LiDAR, as described in the next point.
5. Advanced Sensors for Real-Time Geospatial Data Integration:
• The invention incorporates advanced sensors such as LiDAR, hyperspectral cameras, thermal cameras, and environmental sensors to collect real-time data and enhance the accuracy of geospatial maps. These sensors provide valuable insights into environmental conditions, such as temperature variations, air quality, and terrain mapping, all of which are processed locally on the device.
6. Adaptive Control and Deep Learning for Optimized Performance:
• The system utilizes adaptive control mechanisms and deep learning models to optimize the performance of the application based on real-time sensor inputs. The adaptive control system adjusts parameters such as sensor sensitivity and processing rates to maintain optimal performance in varying conditions. Additionally, deep learning algorithms enable the system to perform tasks such as anomaly detection, predictive maintenance, and user preference adaptation; a simple control-loop sketch is given after this list.
7. Real-Time Collaboration and Predictive Analytics:
• The invention supports real-time collaboration through Socket.IO and Supabase, allowing multiple users to interact with the map and see each other's updates in real time. It also incorporates predictive analytics using Prophet and PyCaret, enabling the system to forecast future trends, such as traffic patterns or environmental changes, based on historical data.
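As noted in item 6 above, the specification does not prescribe a particular control law, so the Python sketch below shows one plausible form: a proportional controller that raises or lowers a sensor's sampling rate to keep local processing within a real-time budget. The function name, budget, and gain are illustrative assumptions, not part of the specification.

```python
def adapt_sample_rate(current_hz, frame_ms, budget_ms=50.0,
                      min_hz=1.0, max_hz=30.0, gain=0.1):
    """Proportional controller (illustrative): if processing one sensor
    frame exceeds the real-time budget, lower the sampling rate; if there
    is headroom, raise it back up toward the hardware maximum."""
    error = (budget_ms - frame_ms) / budget_ms   # positive when under budget
    new_hz = current_hz * (1.0 + gain * error)
    return max(min_hz, min(max_hz, new_hz))      # clamp to hardware limits
```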
Current technologies in the field of geospatial web applications include various GIS libraries like Leaflet and OpenLayers, which provide robust tools for interactive map functionalities and layer management. Voice recognition in these applications is typically supported by cloud-based services such as Google Speech-to-Text and Microsoft Azure Speech Service. However, these cloud-based solutions have several limitations: they require a continuous internet connection, raise concerns about data privacy, and may introduce latency in voice command processing.
Existing voice-enabled GIS applications often rely on these online APIs, which can be problematic in low-connectivity environments. Moreover, the integration of augmented reality (AR) and gesture recognition features remains limited, with few solutions offering a seamless and immersive user experience. In this invention, AR projections are overlaid onto real-world views through mobile devices or smart glasses, allowing users to visualize geospatial data (such as terrain, traffic, or topography) in real-time, providing contextual information directly within the physical environment.
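To make the offline voice path concrete, the following minimal Python sketch transcribes a recorded command entirely on-device using the open-source Vosk bindings; the model path, audio file, and the keyword dispatch at the end are illustrative assumptions.

```python
import json
import wave
from vosk import Model, KaldiRecognizer  # pip install vosk

# Path to a locally downloaded offline model (Hindi here) -- an assumption.
model = Model("models/vosk-model-small-hi-0.22")

wf = wave.open("command.wav", "rb")  # 16 kHz mono PCM voice command
rec = KaldiRecognizer(model, wf.getframerate())

while True:
    chunk = wf.readframes(4000)
    if not chunk:
        break
    rec.AcceptWaveform(chunk)

text = json.loads(rec.FinalResult()).get("text", "")

# Hypothetical dispatch of the transcript to map actions.
if "traffic" in text:
    print("-> toggle traffic layer")
elif "route" in text or "रास्ता" in text:
    print("-> start route planning")
```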
Advanced sensors such as LiDAR, hyperspectral, multispectral, and thermal sensors are incorporated into the system and are placed on edge devices like Nvidia Jetson and Raspberry Pi. These sensors serve to capture critical environmental and geospatial data such as terrain elevation, vegetation health, soil properties, and temperature variations. This data is processed locally to generate accurate 3D maps, terrain analysis, and environmental monitoring, enhancing precision and responsiveness without requiring a constant internet connection.
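As one concrete example of such local processing, the NumPy sketch below rasterizes raw LiDAR returns into a simple digital surface model (highest return per grid cell) of the kind an edge device could compute offline; the array layout and cell size are assumptions.

```python
import numpy as np

def elevation_grid(points, cell=0.5):
    """Rasterize LiDAR returns (N x 3 array of x, y, z in metres) into a
    simple digital surface model: the highest return per grid cell."""
    xy = points[:, :2]
    origin = xy.min(axis=0)
    ij = ((xy - origin) // cell).astype(int)   # grid cell index per point
    shape = tuple(ij.max(axis=0) + 1)
    grid = np.full(shape, -np.inf)
    np.maximum.at(grid, (ij[:, 0], ij[:, 1]), points[:, 2])
    grid[np.isinf(grid)] = np.nan              # cells with no returns
    return grid
```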
Furthermore, inertial measurement units (IMUs) and other sensors track movement and positioning, which can be crucial for gesture-based interactions within the system. These gestures are recognized through MediaPipe and enable intuitive controls like zooming, panning, and layer adjustments. The system uses these sensor inputs to adapt dynamically to real-world changes, employing adaptive control mechanisms that fine-tune system performance and adjust sensor sensitivity based on the surrounding environment.
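A minimal sketch of this gesture path is shown below, using the MediaPipe Hands Python solution with an illustrative pinch-based rule; the distance threshold and the mapping to map actions are assumptions, not part of the specification.

```python
import cv2                     # pip install opencv-python mediapipe
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1,
                                 min_detection_confidence=0.6)
cap = cv2.VideoCapture(0)      # on-device camera; no network required

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        lm = result.multi_hand_landmarks[0].landmark
        thumb, index = lm[4], lm[8]   # thumb tip and index-finger tip
        pinch = ((thumb.x - index.x) ** 2 + (thumb.y - index.y) ** 2) ** 0.5
        # Illustrative rule: a tight pinch zooms the map, otherwise pan.
        print("zoom" if pinch < 0.05 else "pan")
```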
Finally, real-time collaboration is supported by Socket.IO and Supabase, which allow multiple users to simultaneously interact with the map, ensuring synchronized updates and live data sharing. This combination of sensors, edge computing, and AR provides a comprehensive offline solution that enhances both functionality and user experience.
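The relay half of such a collaboration layer can be sketched with the python-socketio server package as follows; the event name and payload shape are assumptions, and persistence through Supabase is omitted for brevity.

```python
import socketio  # pip install python-socketio

sio = socketio.Server(cors_allowed_origins="*")
app = socketio.WSGIApp(sio)  # serve with any WSGI server, e.g. on the local network

@sio.event
def map_update(sid, data):
    # Relay one client's edit (a moved marker, a toggled layer, ...)
    # to every other connected client so all map views stay in sync.
    sio.emit("map_update", data, skip_sid=sid)
```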
Working Example 1: Commuting and Navigation
A user is commuting to work in a city where internet coverage may be spotty or unavailable. They can use the app to get directions, find alternative routes, and check real-time traffic conditions. Using voice commands, they ask for the quickest route to work in Hindi or English, and the app provides navigation assistance. If the user is driving, gesture recognition allows them to interact with the map without touching the screen.
In this scenario, the GPS on the user's device provides accurate location tracking, while offline maps and predictive analytics suggest the best route, taking into account usual traffic patterns, even without a live internet connection.
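A minimal sketch of the kind of offline forecast this scenario relies on, assuming Prophet and a hypothetical on-device log of travel times (the file name and columns are illustrative; Prophet expects a dataframe with ds and y columns):

```python
import pandas as pd
from prophet import Prophet  # pip install prophet

# Hypothetical on-device log of hourly travel times for the commute route.
history = pd.read_csv("route_travel_times.csv")  # columns: ds (timestamp), y (minutes)

model = Prophet(daily_seasonality=True, weekly_seasonality=True)
model.fit(history)

future = model.make_future_dataframe(periods=24, freq="h")
forecast = model.predict(future)
print(forecast[["ds", "yhat"]].tail(24))  # expected travel times for the next day
```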
Working Example 2: Local Discovery
A user visiting a new town wants to explore nearby restaurants and parks. They open the app, and through augmented reality, they point their camera at a street to see virtual markers for restaurants, cafes, and other points of interest overlaid onto the real-world view. The app provides detailed information such as hours of operation, distance, and ratings without needing an active internet connection. Voice commands like "show nearby parks" make the app easy to engage with.
The app integrates the smartphone's camera and GPS sensors with AR to offer this immersive experience, making it ideal for locals and tourists alike who want to navigate new areas conveniently.
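The geometric core of this AR placement can be sketched in a few lines of Python: given the device's GPS fix and a point of interest, compute the bearing and distance at which the virtual marker should be rendered. This is a standard great-circle calculation; the function name is ours.

```python
import math

def bearing_and_distance(lat1, lon1, lat2, lon2):
    """Initial bearing (degrees from north) and great-circle distance (metres)
    from the device at (lat1, lon1) to a point of interest at (lat2, lon2)."""
    R = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # Haversine distance
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    dist = 2 * R * math.asin(math.sqrt(a))
    # Initial bearing
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    brg = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0
    return brg, dist
```

An AR layer such as AR.js performs an equivalent placement internally when given a marker's coordinates; comparing the computed bearing against the device's compass heading tells the renderer where on screen the marker belongs.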

Claims:
1. An offline voice-enabled geospatial system for multimodal interaction and real-time data processing, the system comprising:
a) voice recognition technologies such as Vosk and Mozilla DeepSpeech, which enable offline voice interaction in languages like Hindi and English;
b) an augmented reality (AR) module integrating AR.js and Three.js to offer a contextual AR experience that overlays geospatial data on the real-world view;
c) a gesture-based control mechanism through MediaPipe that allows users to interact with the geospatial map without physical touch;
d) edge computing devices like Nvidia Jetson or Raspberry Pi for local processing of geospatial data that eliminates the need for cloud servers, enabling the system to operate entirely offline while maintaining high performance and real-time data processing;
e) advanced sensors such as LiDAR, hyperspectral cameras, thermal cameras, and environmental sensors to collect real-time data and enhance the accuracy of geospatial maps; and
f) adaptive control mechanisms and deep learning models to optimize the performance of the application based on real-time sensor inputs.
2. The offline voice-enabled geospatial system for multimodal interaction and real-time data processing as claimed in claim 1, wherein real-time collaboration is supported by Socket.IO and Supabase, and predictive analytics are implemented using Prophet and PyCaret for advanced geospatial forecasting.
3. The offline voice-enabled geospatial system for multimodal interaction and real-time data processing as claimed in claim 1, wherein the integration of edge computing devices and advanced sensors ensures high performance and real-time responsiveness, further enhancing the offline capabilities of the system.

Documents

Name | Date
202411090864-COMPLETE SPECIFICATION [22-11-2024(online)].pdf | 22/11/2024
202411090864-DRAWINGS [22-11-2024(online)].pdf | 22/11/2024
202411090864-FIGURE OF ABSTRACT [22-11-2024(online)].pdf | 22/11/2024
202411090864-FORM 1 [22-11-2024(online)].pdf | 22/11/2024
202411090864-FORM-9 [22-11-2024(online)].pdf | 22/11/2024
