Mixed Random Performance

Real-world storage workloads usually aren't pure reads or writes but a mix of both. It is completely impractical to test and graph the full range of possible mixed I/O workloads—varying the proportion of reads vs writes, sequential vs random and differing block sizes leads to far too many configurations. Instead, we're going to focus on just a few scenarios that are most commonly referred to by vendors, when they provide a mixed I/O performance specification at all. We tested a range of 4kB random read/write mixes at queue depth 32, the maximum supported by SATA SSDs. This gives us a good picture of the maximum throughput these drives can sustain for mixed random I/O, but in many cases the queue depth will be far higher than necessary, so we can't draw meaningful conclusions about latency from this test. As with our tests of pure random reads or writes, we are using 32 threads each issuing one read or write request at a time. This spreads the work over many CPU cores, and for NVMe drives it also spreads the I/O across the drive's several queues.

The full range of read/write mixes is graphed below, but we'll primarily focus on the 70% read, 30% write case that is a fairly common stand-in for moderately read-heavy mixed workloads.

4kB Mixed Random Read/Write

4kB Mixed Random Read/Write
Power Efficiency in MB/s/W Average Power in W

The Kingston and Samsung SATA drives are rather evenly matched for performance with a 70% read/30% write random IO workload: the DC500M is tied with the 883 DCT, and the DC500R is a bit slower than the 860 DCT. The two Kingston drives use the same amount of power, so the slower DC500R has a much worse efficiency score. The two Samsung drives have roughly the same great efficiency score; the slower 860 DCT also uses less power.

The Kingston DC500M and Samsung 883 DCT perform similarly for read-heavy mixes, but once the workload is more than about 30% writes, the Samsung falls behind. Their power consumption is very different: the Samsung plateaus at just over 3W while the DC500M starts at 3W and steadily climbs to over 5W as more writes are added to the mix.

Between the DC500R and the 860 DCT, the Samsung drive has better performance until the workload has shifted to be much more write-heavy than either drive is intended for. The Samsung drive's power consumption also never gets as high as the lowest power draw recorded from the DC500R during this test.

Aerospike Certification Tool

Aerospike is a high-performance NoSQL database designed for use with solid state storage. The developers of Aerospike provide the Aerospike Certification Tool (ACT), a benchmark that emulates the typical storage workload generated by the Aerospike database. This workload consists of a mix of large-block 128kB reads and writes, and small 1.5kB reads. When the ACT was initially released back in the early days of SATA SSDs, the baseline workload was defined to consist of 2000 reads per second and 1000 writes per second. A drive is considered to pass the test if it meets the following latency criteria:

  • fewer than 5% of transactions exceed 1ms
  • fewer than 1% of transactions exceed 8ms
  • fewer than 0.1% of transactions exceed 64ms

Drives can be scored based on the highest throughput they can sustain while satisfying the latency QoS requirements. Scores are normalized relative to the baseline 1x workload, so a score of 50 indicates 100,000 reads per second and 50,000 writes per second. Since this test uses fixed IO rates, the queue depths experienced by each drive will depend on their latency, and can fluctuate during the test run if the drive slows down temporarily for a garbage collection cycle. The test will give up early if it detects the queue depths growing excessively, or if the large block IO threads can't keep up with the random reads.

We used the default settings for queue and thread counts and did not manually constrain the benchmark to a single NUMA node, so this test produced a total of 64 threads scheduled across all 72 virtual (36 physical) cores.

The usual runtime for ACT is 24 hours, which makes determining a drive's throughput limit a long process. For fast NVMe SSDs, this is far longer than necessary for drives to reach steady-state. In order to find the maximum rate at which a drive can pass the test, we start at an unsustainably high rate (at least 150x) and incrementally reduce the rate until the test can run for a full hour, and the decrease the rate further if necessary to get the drive under the latency limits.

Aerospike Certification Tool Score

The Kingston drives don't handle the Aerospike test well at all. The DC500R can only pass the test at twice the throughput of a baseline standard that was set years ago, and DC500M's score of 4x the base load is still much worse than even the Samsung 860 DCT. The Kingston drives can provide comparable throughput to the Samsung SATA drives (as seen on the 70/30 test above), but they don't pass the strict latency QoS requirements imposed by the Aerospike test. This test is more write-intensive than the above 70/30 test and is definitely beyond what the DC500R is intended to be used for, but the DC500M should be able to do better.

Aerospike ACT: Power Efficiency
Power Efficiency Average Power in W

The Kingston drives don't draw any more power than the Samsung drives for once, but since they are running at much lower throughput their efficiency scores are a fraction of what the Samsung drives earn.

Peak Throughput and Steady State Conclusion
POST A COMMENT

28 Comments

View All Comments

  • KAlmquist - Tuesday, June 25, 2019 - link

    Good points. I'd add that there is an upgrade to SATA called "SATA Express" which basically combines two PCIe lanes and traditional SATA into a single cable. It never really took off for the reasons you explained: it's simpler just to switch to PCIe. Reply
  • MDD1963 - Tuesday, June 25, 2019 - link

    It would be nice indeed to see a new SATA4 spec at SAS speeds, 12 Gbps.... Reply
  • TheUnhandledException - Saturday, June 29, 2019 - link

    Why? Why not just use PCIe directly. Flash drives don't need the SATA interface and ultimately the SATA interface becomes PCIe at the SATA controller. It is just adding additional pointless translation to fit a round peg into a square hole. Connect your flash drive to PCIe and it is as slow or fast as you want it to be. 2x PCIe 3.0 you got ~2GB/s to work with, 4 lanes gets you 4GB/s. Upgrade to PCIe 4 and you now have 8 GB/s. Reply
  • jabber - Wednesday, June 26, 2019 - link

    They could stay with 6GBps just fine. I'd say work on reducing the latency.

    Bandwidth is done. Latency is more important now IMO. Ultra low latency SATA would do fine for years to come.
    Reply
  • RogerAndOut - Friday, July 12, 2019 - link

    In an enterprise environment, the 6Gbps speed is not much of an issue as deployment does not involve individual drives. Once you have 8,16,32 etc. in some form of RAID configuration the overall bandwidth increase. Such systems may also have NVMe based modules acting as a cache to allow fast retrieval of frequently access blocks and to speed up the 'commit' time of writes. Reply
  • Dug - Tuesday, June 25, 2019 - link

    I would like to see the Intel and Micron Pro included.
    We need drives with power loss protection.
    And I don't think write heavy is regulated to nvme territory. That's just not in the cards for small businesses or even large businesses. 1 because of cost, 2 because of size, 3 because of scalability.
    Reply
  • MDD1963 - Tuesday, June 25, 2019 - link

    1.3 DWPD endurance (9100+ TB of writes! for a 3.8 TB drive? Impressive! $800+...lower it to $399, and count me in! :) Reply
  • m4063 - Tuesday, September 8, 2020 - link

    LISTEN! The most important feature, and reason to buy these drives, is they have power-loss-protected (PLP) cache, not for protecting your data, BUT FOR SPEED!
    In my believe the most important thing about PLP is it should improve direct synchronous I/O (ESX and SQL) because the drive can report back that the data is "written to disk" as soon as the data hit the cache, where a non PLP drive actually need to write the data to the nand before reporting "OK"!
    And for that reason it's obvious the size of the PLP protected cache is pretty important.
    None of those two features are considered and tested in this review, which is very criticizable.
    This is the main-reasons you should go for these drives. I've asked Kingston about the PLP protected cache size and I got:
    SEDC500M/480 - 1GB
    SEDC500M/960 - 2GB
    SEDC500M/1920 - 4GB
    These sizes could play a huge different in synchronous I/O intensive systems/applications.
    Anatech: please cover these factors in your tests/review!
    (admittedly, I have done any benchmark myself In lack of PLP drives)
    Reply

Log in

Don't have an account? Sign up now