University of Washington and UC San Diego Researchers Introduce Tensor Query Processor (TQP) with Tensor Computation Runtimes for Query Processing – 20x Faster
If data is the fuel of artificial intelligence (AI), computation is its engine. The ever-increasing computational demands of modern AI systems have prompted investment and R&D in specialized hardware, along with the development and support of AI runtimes and compilers, with leading industry players and open-source communities devoting significant resources to software for AI workloads.
TQP is the world’s first query processor that runs on tensor computation runtimes, delivering performance up to 20x faster than CPU-only systems.
Tensor Query Processor (TQP) is a query processor that runs on tensor computation runtimes (TCRs) such as PyTorch, TVM, and ONNX Runtime. It was prototyped by a research team from the University of Washington, UC San Diego, and Microsoft and is presented in the new paper Query Processing on Tensor Computation Runtimes.
TQP is the first query processor to run on TCRs, the researchers say, and has been shown to speed up query execution by up to 20x compared to CPU-only systems and up to 5x compared to specialized GPU solutions.
The researchers show that the TCRs’ tensor interface is expressive enough to accommodate all commonly used relational operators. They propose a set of algorithms and a compiler stack for converting relational operators into tensor computations, and they benchmark TQP against state-of-the-art systems. Data scientists already build and deploy deep neural networks (DNNs) with TCRs such as PyTorch and TensorFlow, which lets them tap the potential offered by new hardware.
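To make the idea concrete, here is a minimal sketch of how one relational operator, a selection (SQL `WHERE`), can be expressed as tensor operations in PyTorch. The function name and the example column are invented for illustration; TQP’s actual operator implementations are more general.

```python
import torch

def tensor_selection(column: torch.Tensor, threshold: float) -> torch.Tensor:
    """Tensor analogue of `SELECT col FROM t WHERE col > threshold`:
    evaluate the predicate as a vectorized comparison, then apply
    the resulting boolean mask to filter the column."""
    mask = column > threshold  # vectorized predicate evaluation
    return column[mask]        # boolean-mask indexing acts as selection

# Hypothetical example data
prices = torch.tensor([50.0, 150.0, 99.0, 200.0])
print(tensor_selection(prices, 100.0))  # tensor([150., 200.])
```

Because the whole operation is ordinary tensor code, the same program runs unchanged on CPU or GPU simply by placing the tensors on the desired device.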
The growing adoption of TCRs suggests that hardware aimed specifically at data-intensive ML workloads is becoming more common, raising the question of how databases might benefit from these advances. According to the researchers, TQP was designed to achieve three goals:
- Performance: the query processor should be competitive with specialized engines (for example, it should perform as well as GPU databases on GPU devices).
- Portability: the query processor should run on a variety of hardware devices, from CPUs and GPUs to bespoke ASICs, across generations and manufacturers.
- Minimal engineering effort: building high-performance bespoke operators for each device backend is a huge undertaking, so the researchers targeted an O(1) strategy, rather than O(n), in the number of hardware backends supported.
TQP uses a common architecture for expressing both relational operators and machine learning models as tensor programs. The workflow has two steps:
1) In the compilation phase, the input query is translated into an executable tensor program;
2) In the execution phase, the input data is first converted into tensors and then fed into the compiled program to produce the final query result.
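The two-phase workflow can be sketched as follows. This is a toy illustration with invented names: “compilation” here just returns a Python callable for one fixed query shape, whereas TQP’s real compiler handles arbitrary SQL.

```python
import torch

def compile_query(threshold: float):
    """Compilation phase (toy version): translate a fixed query shape,
    `SELECT SUM(amount) FROM t WHERE amount > threshold`,
    into an executable tensor program."""
    def tensor_program(amount: torch.Tensor) -> torch.Tensor:
        return amount[amount > threshold].sum()  # filter, then aggregate
    return tensor_program

# Execution phase: convert the input data into tensors,
# then feed it into the compiled program.
program = compile_query(10.0)
amounts = torch.tensor([5.0, 20.0, 30.0])
print(program(amounts))  # tensor(50.)
```

The key property is that, after compilation, query execution is nothing but tensor computation, so it inherits whatever hardware support the underlying runtime provides.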
There are four main levels in the compilation phase:
The analysis layer converts an input SQL statement into an internal intermediate representation (IR) graph describing the query’s physical plan. Next, the canonicalization and optimization layer performs IR-to-IR transformations; the planning layer translates the resulting IR graph into an operator plan; and the execution layer generates an executor from that operator plan.
At runtime, the executor produced by the compilation phase controls the conversion of input data to tensor format by invoking the feed operators, manages data transfers to and from device memory, and schedules operators on the target device.
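The data-feeding and device-transfer steps look roughly like the following standalone sketch (this is plain PyTorch, not TQP’s actual API): host-side columnar data is converted to tensors, moved to the target device, processed there, and the result copied back.

```python
import numpy as np
import torch

# Pick a target device; fall back to CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

# "Feed" step: convert a host-side column into a tensor
# and transfer it to device memory.
host_column = np.array([1.0, 2.0, 3.0], dtype=np.float32)
tensor_column = torch.from_numpy(host_column).to(device)

# Operators then run on the device; the result is copied back to the host.
result = (tensor_column * 2).cpu()
print(result)  # tensor([2., 4., 6.])
```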
According to the results, TQP accelerates query execution by up to 20x over CPU-only systems and up to 5x over specialized GPU solutions. TQP can also run queries that combine ML prediction with SQL end to end, delivering performance up to 5x faster than CPU baselines.
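Running SQL and ML prediction in one tensor program can be sketched as below. The model and data are invented for illustration; the point is that relational filtering and model inference execute in the same runtime, on the same device, with no hand-off between engines.

```python
import torch

# A toy model standing in for a trained predictor (weights are random here).
model = torch.nn.Linear(2, 1)

# Hypothetical table: feature columns plus a boolean column to filter on.
features = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
is_active = torch.tensor([True, False, True])

# Relational selection as boolean masking, then ML inference on the
# surviving rows -- both expressed as tensor operations.
selected = features[is_active]
with torch.no_grad():
    predictions = model(selected)

print(predictions.shape)  # torch.Size([2, 1])
```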
Overall, the work demonstrates that TQP can capitalize on TCR advancements and run efficiently on all the hardware platforms those runtimes support. The paper Query Processing on Tensor Computation Runtimes is available on arXiv.