Ultra-fast Big Data processing technology:
Zap-In Technology Overview

 

Overview
Zap-In: the most advanced form of in-memory database

Among databases, the most successful and widely used type is the relational database. In-memory databases, which store data in main memory, were developed to make relational databases faster. Even so, they are only about 10 times faster.

Turbo Data Laboratories has developed a high-speed in-memory database by rebuilding the data structures and processing algorithms from the ground up. Built on algorithms based on our own linear filtering method*, this database system delivers processing speeds 10 to 1000 times, and in some cases even 100,000 times, faster than other database systems.

In addition, thanks to this data structure, it can handle massive data sets of up to 2 billion lines even though it is an in-memory database.

Zap-In Technology’s strengths

Ultra-high-speed database processing

Processing runs 10 to 1000 times faster than on an ordinary database system, and up to 100,000 times faster for JOIN processing. In one case, processing that had taken a full day and night on an ordinary database system was shortened to one minute or less. Speed increases of this kind can greatly improve the quality of business operations.

In an ordinary relational database system, processing time grows faster than the volume of data (O(n log n), where n is the volume of data). This makes such systems impractical for Big Data, where processing takes a very long time.

In contrast, with Zap-In Technology processing time is proportional to the volume of data (O(n), where n is the volume of data). The speed advantage over an ordinary database therefore widens as the data grows and is most pronounced with Big Data.
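The two growth rates can be illustrated with a minimal Python sketch that groups records by key in two ways: by sorting first (O(n log n) comparisons) and in a single pass over the data (O(n) on average). This only illustrates the complexity classes. It is not Zap-In's linear filtering method, and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def group_by_sorting(rows):
    """Group (key, value) rows by sorting first: O(n log n)."""
    ordered = sorted(rows, key=itemgetter(0))  # stable sort keeps value order per key
    return {
        key: [value for _, value in group]
        for key, group in groupby(ordered, key=itemgetter(0))
    }


def group_in_one_pass(rows):
    """Group (key, value) rows in a single pass: O(n) on average."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return dict(groups)


# Hypothetical sample data: (region, sales) pairs.
rows = [("tokyo", 3), ("osaka", 5), ("tokyo", 7), ("nagoya", 1), ("osaka", 2)]

# Both approaches produce the same groups; only the cost differs as n grows.
assert group_by_sorting(rows) == group_in_one_pass(rows)
```

Both functions return the same result, so the choice between them affects only how the running time scales with the number of rows.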

It can load data in CSV format 100 times faster than an ordinary database.

Ultra-high speeds even for Big Data

It’s also compatible with Big Data of up to 2 billion lines. Although it’s an in-memory database that stores data in main memory instead of on a hard disk, it can handle massive volumes of data thanks to its highly efficient data structure.

In a cluster implementation it can run on 16 servers, making it possible to handle massive volumes of Big Data totaling 32 billion lines.
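The capacity figures above follow from simple arithmetic, sketched below in Python. The sketch assumes that lines are spread evenly across servers, which the text does not state; the constant and function names are hypothetical.

```python
ROWS_PER_SERVER = 2_000_000_000  # per-server limit stated above
MAX_SERVERS = 16                 # cluster limit stated above


def servers_needed(total_rows):
    """Return how many servers a data set needs, assuming even distribution."""
    needed = max(1, -(-total_rows // ROWS_PER_SERVER))  # integer ceiling division
    if needed > MAX_SERVERS:
        raise ValueError("data set exceeds the stated cluster capacity")
    return needed


cluster_capacity = ROWS_PER_SERVER * MAX_SERVERS
print(cluster_capacity)               # 32000000000
print(servers_needed(5_000_000_000))  # 3
```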

High cost-performance

It achieves the same performance on smaller-scale hardware at a much lower cost (1/10 to 1/1000).

System development in a very short period of time: Automated programming through macro recording

It records the user’s database operations in the graphical user interface as macros and converts them to programs automatically. This is possible because it uses the procedural language Python instead of the non-procedural language SQL.

This function makes it possible to complete development of a database system in a short timeframe without any hand-written programming. In some cases, system development that would have taken one month with SQL programming has been completed in three days.
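The idea of turning recorded operations into a procedural program can be sketched as follows. This is a minimal illustration under stated assumptions, not Zap-In's actual recorder or API: the `MacroRecorder` class, the `search` and `sort` operation names, and the `table` variable are all hypothetical.

```python
class MacroRecorder:
    """Hypothetical recorder: logs GUI operations and emits Python source."""

    def __init__(self):
        self.steps = []

    def record(self, operation, **params):
        # Each GUI action is stored as an operation name plus its parameters.
        self.steps.append((operation, params))

    def to_python(self, table_var="table"):
        # Turn the recorded steps into a sequence of method calls, one per line.
        lines = []
        for operation, params in self.steps:
            args = ", ".join(f"{name}={value!r}" for name, value in params.items())
            lines.append(f"{table_var}.{operation}({args})")
        return "\n".join(lines)


recorder = MacroRecorder()
recorder.record("search", column="region", value="Tokyo")
recorder.record("sort", column="sales", descending=True)
print(recorder.to_python())
# table.search(column='region', value='Tokyo')
# table.sort(column='sales', descending=True)
```

Because each recorded step maps to one statement executed in order, a procedural language is a natural target for this kind of generation.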

Comparisons with other database technologies (functions, performance)

The functions and speed of Zap-In Technology are compared below with those of various other database systems currently in use around the world.

Functional comparisons:
Zap-In (in-memory ultra-high-speed database)

Features: Ultra-high speed, high performance comparable to that of an RDB. Enables high-performance, high-speed full-text retrieval. Does not have transaction functions.

RDB (disk) database

Example products: MySQL, Oracle Database
Features: Currently the mainstream in databases. Offers high performance but reduced speeds when handling Big Data.

RDB (in-memory) database

Example products: HANA (SAP), TimesTen (Oracle), Spark (Apache)
Features: Faster (roughly 10 times) than RDB (disk).

NoSQL database

Example products: DynamoDB (Amazon), Cassandra (Apache)
Features: High speed but few functions

Full-text retrieval database

Example products: Secure Enterprise Search (Oracle), Namazu
Features: Although providing only one function (full-text retrieval), delivers high-speed performance.

Batch-processing database

Example products:
Features: Simple and easy programming, but low speeds.

Speed comparisons:


Values are relative speeds, with the RDB (disk) database as 1.

| Processing | RDB (disk) database | RDB (in-memory) database | NoSQL database | Full-text retrieval database | Turbo Zap-In | Note: Zap-In's features |
| --- | --- | --- | --- | --- | --- | --- |
| Loading CSV data | 1 | 10 | 50 | 0.05 | 100 | |
| JOIN computation | 1 | 10 | | | 1000-100,000 | High speeds even with high cardinality |
| SORT | 1 | 10 | | | 100-100,000 | High speeds even with high cardinality |
| SEARCH | 1 | 10 | 100-1000 (perfect match of key term) | | 10-1000 | High speeds even with large data volumes |
| BOM deployment | 1 | 10 | | | 500-700 | |
| Categorizing | 1 | 10 | | | 10-1000,000 | Particularly high speed, O(n) |
| Totaling | 1 | 10 | | | 1000-100,000 | High speeds even with high cardinality |
| Calculation, updating | 1 | 10 | | | 0.1-10,000 | Updating one item at a time is slow; bulk updating is ultra-high-speed |
| EXPORT | 1 | 10 | 1 (when one data item fully matches the key term) | 1-10 (pulling hit documents) | 100-1000 | Particularly high speed when the number of hits is high |
| Full-text retrieval | 1 | 10 | ? | ? | 10-1000 | High speed even when the number of hits is high |

Note: Cardinality is the number of distinct values an item takes. If the item has the value "male" or "female," the cardinality is two. It becomes much larger with data such as names, and massive with numerical and similar data. In an ordinary database, speed decreases as cardinality increases.
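As a small illustration of the definition in this note, the cardinality of a column can be computed in Python by counting its distinct values. The function name and sample columns are hypothetical.

```python
def cardinality(values):
    """Return the number of distinct values in a column."""
    return len(set(values))


gender = ["male", "female", "female", "male"]
names = ["Sato", "Suzuki", "Takahashi", "Sato"]

print(cardinality(gender))  # 2
print(cardinality(names))   # 3
```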


System configurations

Standalone configuration

Consists of one PC only.


Server-client configuration

Up to 16 client PCs can connect to one server.
Database processing is performed on the server, while the client PCs handle display.


Cluster server configuration

An expanded version of the server-client configuration, with greater data and processing capacity.

Up to 16 client PCs can be connected to up to 16 servers. One service manager machine is also needed. Database processing is performed on the servers, while the client PCs handle display.


Web service configuration

This uses a server from the server-client configuration, set up to provide Web services.
It is used via Web browsers on client PCs and smartphones.

