Ultra-fast Big Data processing technology:
Zap-In Technology Overview
Zap-In: the most advanced form of in-memory database
Among databases, the most successful and widely used type is the relational database. In-memory databases, which store data in main memory, were developed to increase the speed of relational databases. Even so, they are only about 10 times faster.
Turbo Data Laboratories has developed a high-speed in-memory database by rebuilding the data structures and processing algorithms from the ground up. Built on algorithms based on our own linear filtering method*, this database system delivers processing speeds 10 to 1000 times, and in some cases even 100,000 times, faster than other database systems.
In addition, thanks to this data structure, it can handle massive datasets of up to 2 billion lines even though it is an in-memory database.
Zap-In Technology’s strengths
Enables processing speeds 10 to 1000 times faster than those of an ordinary database system, and up to 100,000 times faster in JOIN processing. In one case, processing that had taken a full day and night on an ordinary database system was shortened to one minute or less. Speed increases of this kind can greatly improve the quality of business operations.
In an ordinary relational database system, processing time grows faster than the volume of data (O(n log n), where n is the volume of data). This is why such systems are impractical for Big Data: processing takes a very long time.
In contrast, with Zap-In Technology processing time is proportional to the volume of data (O(n), where n is the volume of data). The speed advantage over an ordinary database therefore widens as the volume of data grows.
It can load data (.csv format) at speeds 100 times faster than those of an ordinary database.
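The scaling comparison above, O(n log n) versus O(n), can be illustrated with a minimal cost model. This is a sketch under stated assumptions: both models use a constant factor of 1, and the function names are illustrative only. It does not measure Zap-In or any real database.

```python
import math


def sort_based_cost(n: int) -> float:
    """Idealized cost of an O(n log n) operation (constant factor assumed to be 1)."""
    return n * math.log2(n)


def linear_cost(n: int) -> float:
    """Idealized cost of an O(n) operation (constant factor assumed to be 1)."""
    return float(n)


def cost_ratio(n: int) -> float:
    """How many times more work the O(n log n) model does than the O(n) model."""
    return sort_based_cost(n) / linear_cost(n)


if __name__ == "__main__":
    # Compare the two growth rates at increasing data volumes.
    for n in (1_000, 1_000_000, 1_000_000_000):
        print(f"n = {n:>13,}: ratio = {cost_ratio(n):.1f}")
```

In this model the ratio equals log2(n), so it keeps growing with the volume of data. Real systems also differ in constant factors, which the model ignores.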
Ultra-high speeds even for Big Data
It can also handle Big Data of up to 2 billion lines. Although it is an in-memory database that stores data in main memory instead of on a hard disk, it can handle massive volumes of data thanks to its highly efficient data structure.
In a cluster implementation it can run on 16 servers, making it possible to handle massive volumes of Big Data totaling 32 billion lines.
It makes it possible to achieve the same performance on smaller-scale hardware at much lower costs (1/10 – 1/1000).
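The cluster capacity figure above follows from simple arithmetic. The sketch below assumes that capacity scales linearly with the number of servers, which is what the stated figures (2 billion lines per server, 16 servers, 32 billion lines in total) imply. The function and constant names are illustrative.

```python
ROWS_PER_SERVER = 2_000_000_000  # single-server limit stated in the text
MAX_SERVERS = 16                 # maximum cluster size stated in the text


def cluster_capacity(servers: int, rows_per_server: int = ROWS_PER_SERVER) -> int:
    """Total lines a cluster can hold, assuming capacity scales linearly with servers."""
    if not 1 <= servers <= MAX_SERVERS:
        raise ValueError(f"servers must be between 1 and {MAX_SERVERS}")
    return servers * rows_per_server


if __name__ == "__main__":
    print(f"{cluster_capacity(16):,}")  # → 32,000,000,000
```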
System development in a very short period of time: Automated programming through recording macros
It records the user’s database operations in the graphical user interface as macros and converts them to programs automatically. This is possible because it uses the procedural language Python instead of the non-procedural language SQL.
This function makes it possible to complete development of a database system in a short timeframe without any programming. In some cases, system development that would have taken one month with SQL programming was completed in three days.
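The idea of recording operations and converting them to a program can be sketched as follows. This is a minimal illustration, not Zap-In's actual implementation or API: the `MacroRecorder` class, the `db` object, and the operation names `load_csv` and `sort` are all hypothetical. It shows why a procedural language suits this approach: each recorded step maps directly onto one statement of the generated program.

```python
from dataclasses import dataclass, field


@dataclass
class MacroRecorder:
    """Hypothetical recorder: stores each GUI operation as (operation name, arguments)."""
    steps: list = field(default_factory=list)

    def record(self, operation: str, **kwargs) -> None:
        """Append one user operation, in the order it was performed."""
        self.steps.append((operation, kwargs))

    def to_python(self, func_name: str = "replay") -> str:
        """Emit Python source that replays the recorded steps against a `db` object."""
        lines = [f"def {func_name}(db):"]
        for operation, kwargs in self.steps:
            args = ", ".join(f"{key}={value!r}" for key, value in kwargs.items())
            lines.append(f"    db.{operation}({args})")
        if not self.steps:
            lines.append("    pass")  # keep the generated function valid when empty
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    recorder = MacroRecorder()
    # Operation names and file name below are made up for illustration.
    recorder.record("load_csv", path="sales.csv")
    recorder.record("sort", column="amount", descending=True)
    print(recorder.to_python())
```

The generated source is an ordinary Python function, so it can be saved, edited, and rerun like any hand-written program.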
Comparisons with other database technologies (functions, performance)
The functions and speed of Zap-In Technology are compared below with those of various other database systems currently in use around the world.
Zap-In (in-memory ultra-high-speed database)
Features: Ultra-high speed, high performance comparable to that of an RDB. Enables high-performance, high-speed full-text retrieval. Does not have transaction functions.
RDB (disk) database
Example products: MySQL, Oracle Database
Features: Currently the mainstream in databases. Offers high performance but reduced speeds when handling Big Data.
RDB (in-memory) database
Example products: HANA (SAP), TimesTen (Oracle), Spark (Apache)
Features: Faster (roughly 10 times) than RDB (disk).
NoSQL database
Example products: DynamoDB (Amazon), Cassandra (Apache)
Features: High speed but few functions.
Full-text retrieval database
Example products: Secure Enterprise Search (Oracle), Namazu
Features: Although providing only one function (full-text retrieval), delivers high-speed performance.
Features: Simple and easy programming, but low speeds.
|Processing||RDB (disk) database||RDB (in-memory) database||NoSQL database||Full-text retrieval database||Turbo Zap-In||(Note: Zap-In’s features)|
|Loading CSV data||1||10||50||0.05||100|
|JOIN computation||1||10||–||–||1000-100,000||High speeds even with high cardinality|
|SORT||1||10||–||–||100-100,000||High speeds even with high cardinality|
|(perfect match of key term)|| || || ||–||10-1000||High speeds even with large data volumes|
|Categorizing||1||10||–||–||10-1000,000||Particularly high speed O(n)|
|Totaling||1||10||–||–||1000-100,000||High speeds even with high cardinality|
|Calculation, updating||1||10||–||–||0.1-10,000||Slow when updating one item at a time; ultra-high-speed for bulk updates|
|(when one data item fully matches the key term) (pulling hit documents)|| || || || ||100-1000||Particularly high-speed when the number of hits is high|
|Full-text retrieval||1||10||?||?||10-1000||High-speed even when the number of hits is high|
Note: Cardinality is the number of distinct values an item can take. If the item has the value “male” or “female,” the cardinality is two. It becomes much larger with data such as names, and massive with numerical and similar data. In an ordinary database, speed decreases as cardinality increases.
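The definition of cardinality in the note above can be made concrete with a short example. The sample data is made up for illustration, and `categorize` uses Python's hash-based `Counter` as a stand-in for single-pass categorizing; it is not Zap-In's linear filtering method.

```python
from collections import Counter


def cardinality(values) -> int:
    """Number of distinct values in a column."""
    return len(set(values))


def categorize(values) -> Counter:
    """Count rows per distinct value in a single pass over the data."""
    return Counter(values)


if __name__ == "__main__":
    # Illustrative sample columns.
    gender = ["male", "female", "female", "male", "female"]
    names = ["Sato", "Suzuki", "Takahashi", "Sato", "Tanaka"]

    print(cardinality(gender))  # → 2
    print(cardinality(names))   # → 4
    print(categorize(gender)["female"])  # → 3
```

A two-valued item such as gender keeps a cardinality of two however many rows there are, while a name column approaches one distinct value per row.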
Consists of one PC only.
Server-client configuration
Up to 16 client PCs can connect to one server.
Database processing is conducted on the server, while the client PCs handle display functions.
Cluster server configuration
An expanded version of the server-client configuration, capable of enhanced data and processing capacities.
Up to 16 client PCs can be connected to up to 16 servers. One service manager machine is also needed. Database processing is conducted on the servers, while the client PCs handle display functions.
Web service configuration
This employs a server-client configuration set up for Web services.
It is used via Web browsers on client PCs and smartphones.