May 26, 2012

Post: MySQL: Data Storage or Data Processing

… used to process the data, at least not on the large scale – instead it is used for data storage and light duty data retrieval. Even… map to relational data structure to well, but also because of lack of build in parallel processing. When you need high data processing speed you simply…

Post: State of MySQL Market and will Replication live ?

… inhouse and opensource solutions for tasks for large scale file storage, queuing, data processing etc. But seriously if you look at this – people… and non relational data processing and storage system – BigTable with MapReduce, Amazon Dynamo and SimpleDB, Hadoop Another angle of customization of data store and processing was using…

Post: Recovery beyond data restore

…, especially for complex systems. Instead of looking just at data restore process you better look at the whole process which is required to bring…. It will be needed in the worst case of data loss which is trashing the data which gets to all slaves via replication… storage engine, add some tables beyond tables which are being replicated which all has to be accounted for for in the process of

Post: The perils of InnoDB with Debian and startup scripts

storage. Since they have a lot of tables, InnoDB uses over 3.5G of memory for its data … | user | Sending data | select … | 6359 | user | statistics | select … | 6360 | user | | NULL | 6361 | user | | NULL +——+——————+—————-+————- Notice all those processes in…

Post: High-Performance Click Analysis with MySQL

… types of data you’ll need in those aggregate tables, and include columns to support these queries. But beware of denormalizing with character data… this kind of data processing in MySQL, you’re going to end up heavily I/O bound.  Listen to any of the talks… row.  An impression per day becomes a fixed overhead of storage size.  So, you actually have as many rows as…

Post: How fast can MySQL Process Data

…product is about beating data processing limitations of current systems. This raises valid question how fast can MySQL process (filter) data using it current … which is not that bad considering abstraction of storage engine which requires “row by row” processing which means function calls for each …

Post: Data mart or data warehouse?

… aggregation process. Comparing the performance of OLAP with and without aggregation over multiple MySQL storage engines at various data scales. What is a data warehouse…. This process is usually called “conforming” the source data into the warehouse schema. Another important aspect of the definition is aggregation. A data warehouse…

Post: Distributed Set Processing with Shard-Query

… is done in parallel and there is very little data shipping because of the distributed reduction discussed above. If you have any… of compute resource which speaks SQL, but right now only MySQL storage nodes are supported. Amdahl’s law applies to the distributed processing… source query. Adding a new storage node SQL dialect is almost trivial. I’ll document that process shortly, I think I need…

Post: Living with backups

… reading a lot of data very quickly, just as the archiving process does when it runs, it causes a huge number of requests being… the browser. Moreover, reads sent from backup process usually want many sequential blocks of data and such access pattern may be preferred by… a storage. It allows to reduce I/O to such devices on frequently accessed information. After a successful read the block of data

Post: Shard-Query EC2 images available

… performance of the splitter. It is multi-threaded(actually multi-process) and is able to hash split up to 50GB/hour of input data… 2.5GB of data, so ICE gets over 16:1 compression ratio(compared to Innodb, 8:1 compared to raw input data), which is quite nice. Each shard contains only 128MB of data! Storage engine makes a big difference In…