
Violin Memory: making memory music 
19 November, 2008 By David Hill (Mesabi Group) and Jim Handy (Objective Analysis)

The solid state disk (SSD) market for enterprise-class storage has heated up recently.
Over the past few months, we have seen a flurry of announcements from major companies. EMC kicked off the heightened interest, particularly in flash memory, early this year by offering flash as a disk drive option within its Symmetrix product line and, more recently, within its CLARiiON line. Intel and Samsung, as well as smaller companies like STEC and Texas Memory Systems, have also jumped in. While IBM, Sun Microsystems, and HP have all evinced strong interest in SSD technology, none has announced any products. Now, Violin Memory has entered the fray with the introduction of its Violin Memory Appliance for the enterprise-class space.
This interest comes about for a simple reason: SSDs address a growing problem with I/O bottlenecks in performance-intensive applications, where response times climb dramatically because overloaded disks cannot process I/O requests fast enough. Applications that do not run into an I/O bottleneck can take advantage of high-capacity drives, hence the rise of SATA drives. Applications that are bottlenecked on storage I/O often work around the problem by spreading the I/O workload over as many disk spindles as possible. This means buying a great many disks but using relatively little of their available storage capacity in order to deliver the necessary I/Os per second (IOPS).
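The spindle-spreading workaround can be made concrete with a little arithmetic. The Python sketch below is illustrative only: the function name `drives_needed` and the workload figures (30,000 IOPS, a 2 TB data set, 300 IOPS and 300 GB per drive) are assumptions chosen for the example, not numbers from any vendor.

```python
import math

def drives_needed(target_iops, capacity_gb, iops_per_drive, gb_per_drive):
    """Return (drive_count, capacity_utilization) for a workload.

    The drive count is set by whichever is larger: the number of drives
    needed to reach the IOPS target or the number needed to hold the data.
    """
    for_iops = math.ceil(target_iops / iops_per_drive)
    for_capacity = math.ceil(capacity_gb / gb_per_drive)
    count = max(for_iops, for_capacity)
    utilization = capacity_gb / (count * gb_per_drive)
    return count, utilization

# Hypothetical workload: 30,000 IOPS against a 2 TB data set,
# on assumed 300-IOPS, 300 GB drives.
count, used = drives_needed(30_000, 2_000, 300, 300)
print(count, f"{used:.1%}")  # prints: 100 6.7%
```

With these assumed figures the IOPS target, not the size of the data set, sets the drive count, and most of the purchased capacity sits idle.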
Flash memory-based SSDs offer an alternative, but since flash's cost per gigabyte is three or more times that of an HDD, many wonder whether the performance of this technology is really worth the price. SSD vendors argue that it is. Disk drive speeds are stuck where they have been (15K RPM being the fastest) and are not likely to improve. Flash has very short access latency and no mechanical seek time, so it can deliver data at rates two or more orders of magnitude higher than HDDs (say, 150K IOPS versus 300 IOPS).
Flash does have its problems, though. Flash writes are very slow, sometimes even slower than writes to an HDD. This and its high price have hindered the technology's adoption in the PC. Benchmark agencies found that PCs with flash SSDs were no faster than PCs with consumer HDDs, and in some cases were even slower. But slow writes are only one drawback of this type of flash, known as NAND flash. Another important issue is wear-related reliability: flash memory cells wear out if overused. SSD vendors use wear-leveling algorithms to lessen wear-related failures, but flash still faces serious challenges in moving from undemanding niches, such as memory sticks, cameras, and MP3 players, to the enterprise space, where reliability is a critical concern.
The write performance and wear issues were mitigated by second-generation SSDs and page-mapping techniques, which use metadata to map between the user address of a block and the actual physical address of the flash page. However, this introduced a third problem (and one not generally discussed in polite company): the flash garbage collection process. The problem occurs in drives that have been in use for a long time, where space on the drive (called a "block") must be freed up (erased) before a write can take place. The erase, in which all bits in the 128 to 256 KB block are reset to one, takes 10 milliseconds, a very long time in read/write terms, during which no reads to the same flash chip can occur.
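A minimal sketch of the page-mapping idea follows. It is a toy model, not any vendor's controller: the class name `PageMappedFlash` and its fields are hypothetical, and it only tracks where each logical page currently lives. The point it illustrates is that an update is written to a fresh page while the old copy is merely marked stale, which is what eventually makes erases necessary.

```python
class PageMappedFlash:
    """Toy page-mapping layer: logical page -> (block, page) in flash."""

    def __init__(self, num_blocks, pages_per_block):
        self.num_blocks = num_blocks
        self.pages_per_block = pages_per_block
        self.mapping = {}    # logical page number -> (block, page)
        self.stale = set()   # physical pages holding superseded data
        self.next_free = 0   # linear index of the next never-written page

    def write(self, logical_page):
        """Place the logical page in the next erased physical page."""
        if self.next_free >= self.num_blocks * self.pages_per_block:
            raise RuntimeError("no erased pages left; a block must be erased")
        old = self.mapping.get(logical_page)
        if old is not None:
            # Flash cannot overwrite in place, so the old copy becomes stale.
            self.stale.add(old)
        location = divmod(self.next_free, self.pages_per_block)
        self.mapping[logical_page] = location
        self.next_free += 1
        return location

    def read(self, logical_page):
        """Look up the current physical location of a logical page."""
        return self.mapping[logical_page]


flash = PageMappedFlash(num_blocks=2, pages_per_block=4)
first = flash.write(7)    # initial write of logical page 7
second = flash.write(7)   # update lands on a new physical page
print(first, second, flash.read(7), sorted(flash.stale))
```

In this model stale pages accumulate until no erased pages remain, which is the situation the article describes for drives that have been in use for a long time.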
After an erase, the system can initiate a write of a subset of the block called a page (512 bytes to 2 KB), but this also takes a long time (one millisecond) during which no reads can take place. Interestingly enough, writes into a page are sequential (just like a mini-tape drive) whereas reads are random. This means that the system must perform a lot of block-to-page mapping with metadata. All in all, the garbage collection process requires a lot of overhead.
Once the drive fills up, "garbage collection" has to take place: a complex process of reclaiming partially used blocks through page consolidation. This requires gathering active pages from partially used blocks, erasing the old blocks, and then sequentially writing the gathered pages to a newly erased block. Garbage collection is therefore a major determinant of flash performance once a drive is full; an empty or relatively empty drive is likely to perform better than a full one. For enterprise-class drives, that could cause a severe "your mileage may vary" problem, so let the buyer beware!
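Why a full drive performs worse can be seen from a back-of-the-envelope cost model built on the timings quoted above (a 10 millisecond erase and a one millisecond page write). The sketch below is an assumption-laden simplification: it takes a 128 KB block with 2 KB pages (64 pages per block), ignores the time to read the pages being moved, and uses a hypothetical function name.

```python
ERASE_MS = 10.0        # block erase time quoted in the article
PAGE_WRITE_MS = 1.0    # page write time quoted in the article
PAGES_PER_BLOCK = 64   # assumed: 128 KB block / 2 KB page

def gc_overhead_ms_per_freed_page(valid_pages):
    """Time spent relocating data and erasing, per page of space reclaimed.

    `valid_pages` is the number of still-active pages in the block being
    reclaimed; they must be rewritten elsewhere before the block is erased.
    """
    if not 0 <= valid_pages < PAGES_PER_BLOCK:
        raise ValueError("valid_pages must be between 0 and PAGES_PER_BLOCK - 1")
    freed = PAGES_PER_BLOCK - valid_pages
    return (valid_pages * PAGE_WRITE_MS + ERASE_MS) / freed

# The fuller the reclaimed block, the more each freed page costs.
for valid in (0, 32, 60):
    print(valid, gc_overhead_ms_per_freed_page(valid))
```

Under these assumptions, reclaiming a block that holds only stale data costs about 0.16 ms per freed page, while reclaiming one with 60 of 64 pages still active costs 17.5 ms per freed page, which is consistent with the article's point that a nearly full drive behaves very differently from an empty one.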
Violin Memory has tackled these problems in a number of innovative ways. The company finessed speed problems by combining DRAM and flash memory, along with other special techniques, such as the use of massive parallelism, into an integrated solution. The result: 100K sustained-write IOPS in a 2U rack-mount, 4-terabyte, scalable memory appliance.
In addition to automatic wear leveling, the Violin Memory Appliance also uses storage-like RAID to improve memory-fault tolerance: data is massively striped over 4000 flash devices.
Each module can be hot swapped, reports the status of its flash back to the operator, and can cope with flash device and block failures without requiring service.
The Violin Memory Appliance also addresses garbage collection as well as other issues, including scalability. The company's secret sauce is in its switched memory architecture, which uses block mapping and massive RAID striping to improve bandwidth, IOPS, and latency.
Each of the modules manages garbage collection independently, so while one or some modules may be cleaning house others are free to accept incoming data at full speed.
The examples above demonstrate that optimizing flash memory requires a lot of "management" through a variety of controllers, and doing this well is Violin's chief focus.
Violin's controllers perform numerous tasks including scheduling where and when erases go on, ensuring that erases never impact reads or writes (through the use of a modified RAID 3 with non-blocking erases) and managing the metadata that tells where all the data resides.
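The article does not describe how Violin's modified RAID 3 keeps erases from blocking reads. One generic way a parity-protected stripe can serve a read while a device is busy is to rebuild the missing chunk from the other chunks plus parity. The sketch below illustrates that general XOR-parity technique only; the function names are hypothetical and it should not be read as Violin's implementation.

```python
from functools import reduce

def xor_bytes(chunks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def read_stripe(data_chunks, parity, busy_index=None):
    """Return all data chunks, rebuilding the one on a busy device from parity.

    `busy_index` marks the device that cannot be read right now (for example,
    because it is erasing); its entry in `data_chunks` is ignored.
    """
    if busy_index is None:
        return list(data_chunks)
    available = [c for i, c in enumerate(data_chunks) if i != busy_index]
    result = list(data_chunks)
    result[busy_index] = xor_bytes(available + [parity])
    return result

# Three data chunks plus one parity chunk, as in a small RAID 3 style stripe.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_bytes(data)

# Device 1 is busy, so its chunk is unavailable (None) and gets rebuilt.
recovered = read_stripe([data[0], None, data[2]], parity, busy_index=1)
print(recovered == data)  # prints: True
```

The trade-off in this generic scheme is extra reads and XOR work on the remaining devices in exchange for not waiting out a 10 millisecond erase.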
The company has measured or modeled a number of leading devices that enterprise customers may choose to evaluate. These include standard enterprise HDDs and different sorts of SSDs, and the company found that latency increases in proportion to IOPS requested for all the products it tested. But while an enterprise HDD's latency increased from 3 milliseconds to 4 milliseconds (as it was pushed from modest demands to 100 IOPS) and standard enterprise SSDs' latency rose to more than 2 milliseconds at 2,000 IOPS, the latency of the Violin Memory Appliance remained below half of a millisecond right up to 64,000 IOPS.
What does this mean in practical terms? In a typical data center, the Violin Memory Appliance offers a significant value proposition, since it can support the equivalent IOPS of over 300 enterprise HDDs in a single 2U rack-mount form factor. As with other SSD-based solutions, Violin's solutions will dramatically reduce the cost per IOPS and power per IOPS in speed-constrained systems. They will also slash floor space requirements to a fraction of that required by enterprise HDDs. Violin claims its Memory Appliance delivers advantages including a 50 percent lower cost and a sixfold density improvement, while consuming only 20 percent of the power of competing SSD-based solutions.
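One way to sanity-check the "over 300 enterprise HDDs" figure is to pair two numbers quoted earlier in the article: the appliance's 100K sustained-write IOPS and the illustrative 300 IOPS for an HDD. Treating those two figures as the basis of the comparison is an assumption made here, not something the article states, and the function name is hypothetical.

```python
def equivalent_hdds(appliance_iops, hdd_iops):
    """Number of whole HDDs whose combined IOPS the appliance can match."""
    return appliance_iops // hdd_iops

# 100K sustained-write IOPS for the appliance, 300 IOPS per HDD (assumed pairing).
print(equivalent_hdds(100_000, 300))  # prints: 333
```

On that reading, a single 2U appliance stands in for roughly 333 drives' worth of IOPS.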
Violin has essentially thrown down the gauntlet in defining what enterprises should expect from a well-crafted flash memory appliance. Overall, we believe the company's efforts represent a positive step towards addressing the issues that might otherwise hinder the acceptance of enterprise-class flash memory. To compete effectively with the Violin Memory Appliance, all vendors, large and small, will need to keep their noses to the innovation grindstone -- a situation that will benefit both potential customers and the industry itself.
The Mesabi Group (www.mesabigroup.com) helps organizations make their complex storage, storage management, and interrelated IT infrastructure decisions easier by making the choices simpler and clearer to understand. Objective Analysis (www.Objective-Analysis.com) offers third-party independent market research and data for the semiconductor industry and investors in the semiconductor industry. This article first appeared in the Pund-IT Weekly Review.