A METHOD FOR HARDWARE-ACCELERATED DATA PROCESSING AND RENDERING IN MOBILE AND EMBEDDED SYSTEMS


ORDINARY APPLICATION

Published


Filed on 23 November 2024

Abstract

The present invention relates to a method for hardware-accelerated processing and rendering to enable real-time communication between guest software and host hardware in mobile and embedded systems. The method employs a shared memory driver, guest encoder, guest-to-host and host-to-guest buffers, and a host decoder to facilitate the efficient transmission, decoding, and rendering of commands and data. This configuration maximizes host hardware efficiency, reduces latency, and enhances performance, making it particularly suitable for mobile and embedded systems emulation environments. (Figure 1)

Patent Information

Application ID: 202441091373
Invention Field: COMPUTER SCIENCE
Date of Application: 23/11/2024
Publication Number: 48/2024

Inventors

Name | Address | Country | Nationality
Dr. J. Senthilkumar | Department of Information Technology, Sona College of Technology, TPT Road, Salem - 636 005, Tamil Nadu | India | India
Dr. Selvaraj Kesavan | Department of Information Technology, Sona College of Technology, TPT Road, Salem - 636 005, Tamil Nadu | India | India
Dr. V. Mohanraj | Department of Information Technology, Sona College of Technology, TPT Road, Salem - 636 005, Tamil Nadu | India | India
Dr. Y. Suresh | Department of Information Technology, Sona College of Technology, TPT Road, Salem - 636 005, Tamil Nadu | India | India

Applicants

Name | Address | Country | Nationality
SONA COLLEGE OF TECHNOLOGY | Sona College of Technology, TPT Road, Salem - 636 005 | India | India

Specification

Description: A METHOD FOR HARDWARE-ACCELERATED DATA PROCESSING AND RENDERING IN MOBILE AND EMBEDDED SYSTEMS

FIELD OF THE INVENTION
The present invention relates to a method for hardware-accelerated processing and rendering, aimed at enhancing performance for mobile and embedded applications that require high-performance graphics, gaming, and video processing. More specifically, the present invention relates to a method for achieving efficient host-to-guest transmission and real-time processing of commands and data, optimizing resource utilization for emulation and prototyping purposes without relying on physical hardware.

BACKGROUND OF THE INVENTION
In the development of complex mobile and embedded applications, there is a significant need for efficient, real-time processing and rendering capabilities, particularly for high-performance graphics, gaming, and video processing tasks. Such applications frequently rely on hardware acceleration to manage these intensive tasks effectively. However, development often encounters limitations due to the non-availability of physical target hardware, necessitating the use of emulators. These emulators often fail to fully utilize the host system's resources, leading to inefficient processing, slower command execution, and a suboptimal user experience.
Existing solutions, such as the method described in patent CN114035967 A, attempt to address these challenges by facilitating communication between the host and guest systems. However, CN114035967 A lacks certain key features that this invention introduces, specifically the use of host-to-guest buffers and dedicated mechanisms to enhance the guest composition system. Moreover, the shared memory mechanism in CN114035967 A employs a different approach to real-time command processing, which does not fully optimize the flow of commands and data required for seamless guest-host interactions.
The present invention overcomes these limitations by establishing a robust, hardware-enabled method for real-time command and data processing between the guest (emulator) and host. This method ensures that guest software efficiently leverages the host system's processing, composition, and rendering capabilities, thereby enhancing performance, accelerating development, and optimizing resource management within emulation environments.

SUMMARY OF THE INVENTION
The present invention relates to a method for hardware-accelerated processing and rendering that facilitates effective, real-time communication between guest software (or emulators) and host hardware. This method ensures that commands and data are transmitted and processed with minimal latency, allowing emulation environments to more accurately reflect the capabilities of physical hardware and to optimize resource utilization effectively.
In one embodiment, the present invention relates to a method where a shared memory driver is employed to manage memory allocation between the guest and host efficiently. This method also incorporates a guest encoder that packages commands for efficient transmission, as well as a series of guest-to-host buffers that facilitate the seamless flow of data to the host system. The host system includes a decoder that processes incoming data and interprets the guest commands, while a host-to-guest buffer returns processed output to the guest, enabling real-time rendering and minimizing processing delays.
In another embodiment, the present invention uses virtual device creation and synchronization algorithms to dynamically manage memory allocation. The method optimizes the host-to-guest buffer for seamless surface data transmission, allowing the guest software to render components directly from the shared memory in real time. This configuration provides a smoother, more responsive emulation experience that closely replicates physical hardware interactions, improving the accuracy and performance of complex applications within an emulated environment.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 depicts the overall process involved in the hardware-accelerated real-time data processing in mobile and embedded systems, enabling efficient host-to-guest communication.
Figure 2 illustrates the implementation of the method with host hardware support for an Android emulator.
Figure 3 illustrates the detailed flow of the method.
Figure 4 illustrates an example of software-based rendering.
Figure 5 illustrates an example of hardware-accelerated rendering (ARM Mali GPU).
Referring to the drawings, the embodiments of the present invention are further described. The figures are not necessarily drawn to scale, and in some instances the drawings have been exaggerated or simplified for illustrative purposes only. One of ordinary skill in the art may appreciate the many possible applications and variations of the present invention based on the following examples of possible embodiments of the present invention.

DETAILED DESCRIPTION OF THE INVENTION
The following description outlines various embodiments of the invention for illustrative purposes, without limiting its scope. Skilled persons in the field will appreciate that other configurations may also fall within the scope of this disclosure. Terms used herein carry their standard meanings in the relevant field, and synonyms may be used interchangeably. Examples provided are illustrative and do not limit the scope of the invention.
The present invention relates to a method for hardware-accelerated processing and rendering, specifically designed to enhance real-time communication between guest software (such as emulators) and host hardware. This method leverages a shared memory driver, an encoding and decoding process, and buffer mechanisms to facilitate seamless data transmission and rendering. It aims to reduce latency, improve performance, and replicate physical hardware capabilities within emulated environments for mobile and embedded applications.
In one embodiment, the present invention relates to a method for hardware-accelerated processing and rendering, facilitating real-time communication between guest software (e.g., emulators) and host hardware. A shared memory driver manages memory allocation between the guest and host efficiently, allowing commands and data to be transmitted and processed with minimal latency. This method enables emulation environments to more accurately mirror the capabilities of physical hardware, optimizing resource utilization. The method also incorporates a guest encoder for command packaging, guest-to-host buffers for data transmission, a host decoder for processing incoming data, and a host-to-guest buffer for returning processed output to the guest, thus enabling real-time rendering and minimizing delays.
In another embodiment, the invention utilizes virtual device creation and synchronization algorithms to dynamically manage memory allocation. This method optimizes the host-to-guest buffer for seamless surface data transmission, allowing guest software to render components directly from the shared memory in real time. This configuration provides a smoother, more responsive emulation experience, replicating physical hardware interactions accurately.
Mobile and embedded applications, particularly those requiring advanced graphics and video processing, often rely on hardware acceleration. However, limitations arise when physical hardware is unavailable, necessitating the use of emulators. Existing emulators do not fully utilize host hardware capabilities, resulting in slower command processing and reduced efficiency. The present invention addresses these limitations through a specialized method that optimizes host resources for guest applications, using distinct components and procedures.
The method comprises essential components, including a shared memory driver, a guest encoder, guest-to-host buffers, a host decoder and parser, host-to-guest buffers, and an enhanced guest composition system. These components work in unison to enable real-time, hardware-optimized rendering and processing within the emulation environment.
The shared memory driver is integral to the method, as it allocates memory space that is accessible to both the guest and host systems. This shared memory facilitates data exchange without extensive copying, thus reducing latency and enhancing real-time performance. Through efficient memory allocation, this driver ensures that data remains accessible for rapid processing, which is essential for real-time command and data rendering.
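As an illustrative, non-limiting sketch of the shared memory driver's role, the following Python model (all names, sizes, and the segment name are invented for illustration, not taken from the specification) shows how two processes can exchange bytes through one region without copying:

```python
# Conceptual sketch only: models a guest/host shared region using Python's
# multiprocessing.shared_memory. REGION_SIZE and open_shared_region are
# illustrative names, not part of the described method.
from multiprocessing import shared_memory

REGION_SIZE = 64 * 1024  # hypothetical size of the guest/host region

def open_shared_region(name="emu_shm", create=True):
    """Allocate (or attach to) a region visible to both guest and host."""
    if create:
        return shared_memory.SharedMemory(name=name, create=True, size=REGION_SIZE)
    return shared_memory.SharedMemory(name=name)

# The "guest" writes; the "host" attaches by name and reads the same bytes,
# so no copy is made between the two sides.
region = open_shared_region()
region.buf[0:5] = b"hello"

peer = open_shared_region(create=False)
assert bytes(peer.buf[0:5]) == b"hello"

peer.close()
region.close()
region.unlink()
```

In a real emulator the region would be exposed to the guest through a virtual device driver rather than a named POSIX segment, but the zero-copy property the paragraph describes is the same.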
The guest encoder plays a critical role in preparing data for transmission from the guest to the host. It compresses and encodes commands and data to reduce transmission size, which improves processing speed. This encoding also ensures that the data format aligns with the host decoding process, allowing efficient unpacking on the host side. By minimizing the data size, the guest encoder significantly enhances transmission efficiency, reducing delays.
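A minimal sketch of such a command framing scheme follows; the frame layout (16-bit opcode, 32-bit payload length, then payload) is an assumption for illustration, since the specification does not fix a wire format:

```python
# Illustrative encoder/decoder pair: packs an opcode and payload into a
# compact binary frame, as the guest encoder described above might.
import struct

HEADER = struct.Struct("<HI")  # opcode (uint16), payload length (uint32)

def encode_command(opcode: int, payload: bytes) -> bytes:
    """Prefix the payload with a fixed header so the host can frame it."""
    return HEADER.pack(opcode, len(payload)) + payload

def decode_command(frame: bytes):
    """Inverse of encode_command; used on the host side."""
    opcode, length = HEADER.unpack_from(frame)
    return opcode, frame[HEADER.size : HEADER.size + length]

frame = encode_command(0x10, b"\x01\x02\x03")
assert decode_command(frame) == (0x10, b"\x01\x02\x03")
```

Because the header is fixed-size, the host can frame each command in the buffer without scanning, which is the alignment between encoding and decoding the paragraph refers to.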
The guest-to-host buffers facilitate the transmission of encoded commands and data. These buffers are allocated within the shared memory as continuous memory blocks, enabling efficient data flow without risking overflow. This buffer system ensures that data packets are sequentially processed and promptly transferred, maintaining real-time responsiveness essential for graphics and data-intensive applications.
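One common realization of such an overflow-safe, sequential buffer is a single-producer/single-consumer ring buffer; the sketch below (capacity and class name are illustrative) shows the refuse-rather-than-overflow behavior the paragraph describes:

```python
# Minimal SPSC ring buffer over a fixed byte array, sketching how a
# guest-to-host buffer could live inside the shared region.
class RingBuffer:
    def __init__(self, capacity: int):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.head = 0   # next write position
        self.tail = 0   # next read position
        self.count = 0  # bytes currently stored

    def write(self, data: bytes) -> bool:
        """Copy data in; refuse (rather than overflow) when full."""
        if len(data) > self.capacity - self.count:
            return False
        for b in data:
            self.buf[self.head] = b
            self.head = (self.head + 1) % self.capacity
        self.count += len(data)
        return True

    def read(self, n: int) -> bytes:
        """Remove and return up to n bytes in FIFO order."""
        n = min(n, self.count)
        out = bytearray(n)
        for i in range(n):
            out[i] = self.buf[self.tail]
            self.tail = (self.tail + 1) % self.capacity
        self.count -= n
        return bytes(out)

rb = RingBuffer(8)
assert rb.write(b"abcdef")
assert rb.read(4) == b"abcd"
assert rb.write(b"ghij")       # wraps around the end of the array
assert rb.read(6) == b"efghij"
```

In the actual method the head/tail indices would themselves sit in shared memory so that guest and host each update one side.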
The host decoder is responsible for unpacking and interpreting the commands and data received from the guest. Using a custom decoding algorithm, it translates encoded commands into operations executable by the host hardware. Following decoding, a parser organizes and routes commands to appropriate rendering modules on the host hardware. This decoding and parsing process is optimized for rapid interpretation, supporting real-time graphics rendering on the host system.
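The decode-then-route step can be sketched as a dispatch table mapping opcodes to handler "rendering modules"; the opcodes and handlers below are invented for illustration, since the specification names no specific command set:

```python
# Sketch of host-side parsing: route each decoded opcode to the module
# that executes it. `executed` records calls so the flow is observable.
executed = []

def handle_clear(payload):
    executed.append(("clear", payload))

def handle_draw(payload):
    executed.append(("draw", payload))

PARSER_TABLE = {0x01: handle_clear, 0x02: handle_draw}

def parse_and_route(opcode: int, payload: bytes):
    """Look up the rendering module for this opcode and invoke it."""
    handler = PARSER_TABLE.get(opcode)
    if handler is None:
        raise ValueError(f"unknown opcode {opcode:#x}")
    handler(payload)

parse_and_route(0x01, b"")
parse_and_route(0x02, b"\x05")
assert executed == [("clear", b""), ("draw", b"\x05")]
```

A table lookup keeps per-command routing cost constant, which matches the "optimized for rapid interpretation" property claimed above.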
Host-to-guest buffers return processed data from the host to the guest software, providing immediate access to rendered output. This synchronized data flow between the host and guest prevents data loss and enhances performance. By managing buffer availability and preventing overwriting, the host-to-guest buffers ensure a continuous and reliable communication loop, critical for real-time rendering updates in the guest software.
The guest composition system is modified to enable real-time access to rendered components directly from the shared memory. This enhancement allows guest software to refresh display elements promptly, reflecting the host's output without delay. By reading components directly from shared memory, the guest composition system achieves high rendering efficiency and responsiveness, providing a near-realistic hardware emulation experience.

The process flow involved in the method for hardware-accelerated processing and rendering that facilitates effective, real-time communication between guest software (or emulators) and host hardware includes: allocating a shared memory space between the guest and host systems using a shared memory driver, wherein the shared memory space enables data exchange between the guest and host without extensive data copying, thereby reducing latency; encoding guest commands and data on the guest side using a guest encoder, wherein the encoder minimizes data size for transmission while ensuring compatibility with host decoding processes; transmitting encoded commands and data from the guest to the host through guest-to-host buffers, wherein continuous memory allocation for the buffers ensures efficient data handling and prevents overflow; decoding and parsing the received commands and data on the host side using a host decoder and parser, wherein the host decoder translates encoded commands into host-executable operations, and the parser routes the decoded commands to rendering modules within the host hardware; rendering and processing the parsed commands on the host hardware, wherein host GPU resources are utilized to perform graphics-intensive tasks, improving processing speed and performance; transmitting the processed data from the host back to the guest through host-to-guest buffers, wherein the host-to-guest buffers synchronize data flow to avoid data loss and enable immediate access to rendered output by the guest software; and rendering components directly from the shared memory on the guest side using an enhanced guest composition system, wherein the guest software renders updated components in real-time based on the processed data received from the host.
Process Flow of the Method
1. Initialization of Virtual Device and Shared Memory:
The method begins by creating a virtual device and allocating shared memory accessible to both the guest and host systems. This shared space supports command and data transmission, minimizing delay and optimizing resource allocation for real-time operation.
2. Encoding and Transmission from Guest to Host:
The guest encoder packages commands and data for efficient transmission. The encoded data is then loaded into guest-to-host buffers within the shared memory, which facilitates a rapid and reliable transfer to the host.
3. Host-Side Decoding and Processing:
On the host side, the host decoder decodes the commands and data, translating them into executable instructions. A parser routes these commands to specific rendering modules, leveraging the host's GPU resources for high-performance tasks, such as 3D rendering.
4. Data Transmission from Host to Guest:
Processed data from the host is transmitted back to the guest through the host-to-guest buffers. This loop provides real-time feedback, allowing the guest system to render updated components as they are processed by the host.
5. Real-Time Rendering and Synchronization:
The guest composition system reads the latest components from shared memory and renders them in real time. Synchronization algorithms manage buffer availability and prevent data loss, ensuring that the system operates smoothly, even under high load.

Detailed Step-by-Step Flow of the Method
The detailed flow of the method is described in Figure 3.
The proposed method begins by decoding commands and extracting OpenGL ES (GLES) commands along with their associated states. This step involves parsing the incoming data to interpret rendering instructions effectively. By extracting and organizing these commands, the method ensures that subsequent operations align with the rendering requirements.
Once the commands are decoded, a rendering surface is created. The rendering surface serves as the target display area for graphical output, such as a window or a frame buffer. This surface is intricately tied to a specific rendering context, ensuring a seamless link between graphical operations and their output.
The decoded commands and data are then validated to ensure compatibility with the host processing units. This validation step is crucial for mapping the commands to the appropriate host resources, such as GPU cores. The method cross-verifies the data to prevent errors during execution, ensuring that only correct and optimized commands are processed.
Following validation, a custom renderer is implemented. The custom renderer handles the issuance of GLES commands and manages the rendering pipeline, dictating how objects are rendered for each frame. By tailoring the rendering process, the method optimizes graphical output and resource utilization.
To facilitate platform-specific rendering, the method establishes a rendering context. This context manages the state and resources required for the rendering pipeline, ensuring compatibility with the operating system and hardware environment. The context serves as a foundation for efficient rendering, streamlining resource allocation and state management.
Shared memory is then allocated and managed between the guest and host systems. This shared memory enables both systems to access the same resources efficiently, bridging the virtualized guest environment with the physical host system. A dedicated shared memory command buffer is created within this memory space to facilitate the transfer of GLES commands, state changes, and texture updates from the guest to the host.
A virtual device driver is employed to manage the shared memory space, orchestrating the interaction between the virtualized guest environment and the underlying host hardware. This driver ensures efficient allocation, deallocation, and coordination of resources. Commands are translated into objects with a defined format, and state and operation data are packaged for seamless transfer via the shared memory.
The encoded commands, operation data, and objects are transferred using a custom virtual memory driver, which manages the flow of data between the guest and host environments. In the host engine context, these commands and data are processed further. A custom decoder parses the received data to extract and interpret the GLES commands and states.
After decoding, the commands and data are validated again to ensure compatibility with the host processing units, mapping them accurately to the hardware resources. The host GPU interface and driver are engaged to execute the rendering commands, with the GPU hardware handling the operations efficiently. The processed data is validated and mapped to the pixel buffer, ensuring the graphical output is prepared for display.
Finally, the processed data is transferred back to the guest rendering context through a dedicated data pipeline. This step enables the guest system to manage its rendering context and buffers, finalizing the process. The method ensures efficient interaction between the guest and host systems, leveraging hardware acceleration for optimal rendering performance.

Example
Real-World Implementation
The proposed method is exemplified through its application in Android OpenGL ES 3.0 implementation. A graphics application utilizing OpenGL ES 3.0 commands is developed, with the rendering environment managed through an EGL context. Resources are allocated, and the rendering pipeline is initialized for smooth operation. The SurfaceFlinger system service and Gralloc are configured for managing graphics memory buffers.
A virtual device is initialized to handle rendering tasks, and shared memory is allocated to optimize resource access between the guest and host systems. A command decoder on the host parses and decodes rendering instructions from the guest. Real-world tests validate rendering capabilities, including fragment and vertex shader functionality and texture binding.

Performance Benchmarking
The performance benchmarking of the proposed method underscores its effectiveness in optimizing graphical processing tasks through hardware acceleration. A comparative analysis between traditional software-based rendering and the proposed hardware-accelerated rendering method highlights significant advancements in rendering quality and system efficiency.
In conventional software-based rendering, all GPU simulation tasks are handled by the host CPU, resulting in high CPU utilization, limited frame rates, increased latency, and suboptimal energy efficiency. The CPU load in this scenario typically ranges from 70-91%, with frame rates averaging between 12-24 FPS. Such limitations become particularly evident in complex rendering tasks, where high latency values of 60 ms to 180 ms further degrade performance. Power consumption is also relatively high, averaging 6-8 W due to the intensive CPU operations.
By contrast, the proposed method leverages the host GPU to offload graphical rendering tasks, substantially reducing the computational burden on the CPU. This shift enables hardware-accelerated rendering for OpenGL ES 3.0 commands, resulting in enhanced performance metrics. CPU utilization decreases significantly, averaging between 30-43%, as the GPU processes the bulk of the graphical workload. Frame rates are notably improved, averaging between 50-70 FPS, ensuring smoother rendering and a more realistic simulation of real-world application performance. Latency per frame is significantly reduced, varying between 10 ms to 45 ms, which is crucial for delivering responsive and visually rich user experiences. While power usage by the GPU increases to 10-16 W, this trade-off is justified by the substantial performance gains and reduced CPU energy consumption.
Table 1: Tabulation of the metrics captured

Benchmark metric | Software rendering | Proposed method (host GPU acceleration)
CPU Utilization | High CPU load, varying from 70-91% | Low CPU utilization, averaging 30-43%, as the GPU handles processing
Frame Rates (FPS) | Low FPS, especially when rendering complex scenes (avg 12-24 FPS) | Smooth rendering with higher FPS as the GPU handles processing (avg 50-70 FPS)
Power Consumption | High power usage during processing (average 6-8 W) | Lower CPU power usage, but higher GPU usage (10-16 W)
Latency | Higher latency per frame (varies between 60 ms and 180 ms) | Low latency per frame (varies between 10 ms and 45 ms)
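Taking the midpoints of the ranges reported in Table 1 (the document's own figures, not new measurements), the relative improvement can be computed directly:

```python
# Mid-range comparison from Table 1: software rendering vs. the proposed
# host-GPU-accelerated method.
sw = {"cpu": (70 + 91) / 2, "fps": (12 + 24) / 2, "latency_ms": (60 + 180) / 2}
hw = {"cpu": (30 + 43) / 2, "fps": (50 + 70) / 2, "latency_ms": (10 + 45) / 2}

fps_gain = hw["fps"] / sw["fps"]                   # roughly 3.3x more frames
latency_cut = sw["latency_ms"] / hw["latency_ms"]  # roughly 4.4x lower latency
cpu_drop = sw["cpu"] - hw["cpu"]                   # roughly 44 percentage points

assert fps_gain > 3 and latency_cut > 4 and cpu_drop > 40
```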
These findings were validated using a sample graphics application tested on an Android emulator with ARM Mali GPU integration. For example, a real-time shooting game was tested to showcase the practical impact of hardware acceleration. In the software-based rendering mode (Figure 4), the game's graphical user interface (GUI) was rendered at an average frame rate of 12 FPS. The limited computational resources led to blurry and inconsistent visuals. Conversely, in the hardware-accelerated rendering mode (Figure 5), the ARM Mali GPU efficiently handled the rendering tasks, producing clear and smooth visuals at a frame rate of 70 FPS.
The improved clarity and responsiveness in hardware-accelerated mode underscore the potential of the proposed method to transform rendering processes in Android environments. By utilizing advanced GPU capabilities, the method ensures not only better graphical performance but also a realistic approximation of application behavior on dedicated hardware. This innovation provides a robust framework for developers aiming to deliver high-quality graphical applications efficiently.

Advantages and Benefits of the Invention
The method described herein offers several advantages, including:
Enhanced Real-Time Processing: By utilizing host hardware capabilities, the invention significantly accelerates command processing and data rendering.
Reduced Latency: The shared memory and optimized buffers minimize data transmission delays, allowing for real-time updates and interactions.
Improved Resource Efficiency: Memory allocation and buffer synchronization maximize the host's resources, enhancing performance and enabling complex tasks without additional hardware.
Accelerated Development and Testing: Emulating hardware interactions without physical hardware accelerates prototyping and reduces time-to-market.
Scalability and Flexibility: The method adapts to various mobile and embedded systems, providing a versatile solution for diverse applications.
Higher Accuracy in Emulation: By closely replicating the behavior of physical hardware, this method offers a more accurate and reliable emulation experience.

It may be appreciated by those skilled in the art that the drawings, examples and detailed description herein are to be regarded in an illustrative rather than a restrictive manner.

Claims:

We Claim:

1. A method for hardware-accelerated processing and rendering to facilitate real-time communication between guest software and host hardware, the method comprising:
a. Allocating a shared memory space between the guest and host systems using a shared memory driver;
b. Encoding guest commands and data on the guest side using a guest encoder;
c. Transmitting encoded commands and data from the guest to the host through guest-to-host buffers;
d. Decoding and parsing the received commands and data on the host side using a host decoder and parser;
e. Rendering and processing the parsed commands on the host hardware;
f. Transmitting the processed data from the host back to the guest through host-to-guest buffers; and
g. Rendering components directly from the shared memory on the guest side using an enhanced guest composition system,
characterized in that the method enables efficient processing of high-performance tasks by optimizing host hardware, making it ideal for emulation environments that simulate hardware for mobile and embedded applications.
2. The method as claimed in claim 1, wherein the shared memory space enables data exchange between the guest and host without extensive data copying, thereby reducing latency;
3. The method as claimed in claim 1, wherein the encoder minimizes data size for transmission while ensuring compatibility with host decoding processes;
4. The method as claimed in claim 1, wherein continuous memory allocation for the buffers ensures efficient data handling and prevents overflow;
5. The method as claimed in claim 1, wherein the host decoder translates encoded commands into host-executable operations, and the parser routes the decoded commands to rendering modules within the host hardware;
6. The method as claimed in claim 1, wherein host GPU resources are utilized to perform graphics-intensive tasks, improving processing speed and performance;
7. The method as claimed in claim 1, wherein the host-to-guest buffers synchronize data flow to avoid data loss and enable immediate access to rendered output by the guest software;
8. The method as claimed in claim 1, wherein the guest software renders updated components in real-time based on the processed data received from the host.

Documents

Name | Date
202441091373-COMPLETE SPECIFICATION [23-11-2024(online)].pdf | 23/11/2024
202441091373-DECLARATION OF INVENTORSHIP (FORM 5) [23-11-2024(online)].pdf | 23/11/2024
202441091373-DRAWINGS [23-11-2024(online)].pdf | 23/11/2024
202441091373-EDUCATIONAL INSTITUTION(S) [23-11-2024(online)].pdf | 23/11/2024
202441091373-FORM 1 [23-11-2024(online)].pdf | 23/11/2024
202441091373-FORM 18 [23-11-2024(online)].pdf | 23/11/2024
202441091373-FORM-9 [23-11-2024(online)].pdf | 23/11/2024
202441091373-OTHERS [23-11-2024(online)].pdf | 23/11/2024
202441091373-POWER OF AUTHORITY [23-11-2024(online)].pdf | 23/11/2024
202441091373-REQUEST FOR EXAMINATION (FORM-18) [23-11-2024(online)].pdf | 23/11/2024
