I discussed my previous post on O_DIRECT_NO_FSYNC with the InnoDB team and they corrected my understanding of a few parts of the code that contribute to the stalls I have been reporting. We also discussed a problem I have been ignoring. The InnoDB code that does an fsync for a tablespace (fil_flush) can make a thread sleep when there are concurrent attempts to do the fsync. The amount of time to sleep, 20 milliseconds, was probably chosen in 1995. It is too long for fast storage, including HW RAID cards with battery-backed write cache and flash-based devices.

I created bug 68588 for the call to sleep and today I have performance numbers to share from MySQL 5.6.10. The workload is the same as described in the previous post. Three binaries were tested:

  • O_DIRECT,condvar – use innodb_flush_method=O_DIRECT and patch the binary to use condition variable waits rather than sleep when there are concurrent threads in fil_flush.
  • O_DIRECT,sleep – use innodb_flush_method=O_DIRECT with the unpatched sleep in fil_flush
  • O_DIRECT_NO_FSYNC – use innodb_flush_method=O_DIRECT_NO_FSYNC
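
To make the difference between the sleep and condvar binaries concrete, here is a minimal C++ sketch of the two waiting strategies. It is not the InnoDB source and it is not the patch attached to the bug. FlushGate, do_fsync_stub, flush_with_sleep, flush_with_condvar and run_flushers are hypothetical names made up for this illustration, and the 200 microsecond stub stands in for an fsync on fast storage. The only detail taken from InnoDB is the 20 millisecond sleep.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-tablespace flush state (not InnoDB's struct).
struct FlushGate {
    std::mutex mtx;
    std::condition_variable cv;
    bool flush_in_progress = false;
    int flushes_done = 0;
};

// Placeholder for the real fsync call; assumes fast storage (~200 microseconds).
inline void do_fsync_stub() {
    std::this_thread::sleep_for(std::chrono::microseconds(200));
}

// Sleep-based wait: a thread that finds a flush in progress sleeps for a
// fixed 20 ms before checking again, even if the flush finished much sooner.
void flush_with_sleep(FlushGate& g) {
    std::unique_lock<std::mutex> lock(g.mtx);
    while (g.flush_in_progress) {
        lock.unlock();
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        lock.lock();
    }
    g.flush_in_progress = true;
    lock.unlock();
    do_fsync_stub();
    lock.lock();
    g.flush_in_progress = false;
    ++g.flushes_done;
}

// Condition-variable wait: a waiting thread is woken as soon as the
// in-progress flush completes, so the wait tracks the real fsync latency.
void flush_with_condvar(FlushGate& g) {
    std::unique_lock<std::mutex> lock(g.mtx);
    g.cv.wait(lock, [&g] { return !g.flush_in_progress; });
    g.flush_in_progress = true;
    lock.unlock();
    do_fsync_stub();
    lock.lock();
    g.flush_in_progress = false;
    ++g.flushes_done;
    lock.unlock();
    g.cv.notify_one();  // wake one waiter, if any
}

// Run n_threads concurrent flushers and return how many flushes completed.
int run_flushers(int n_threads, bool use_condvar) {
    FlushGate gate;
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([&gate, use_condvar] {
            if (use_condvar) {
                flush_with_condvar(gate);
            } else {
                flush_with_sleep(gate);
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return gate.flushes_done;
}
```

With the sleep variant a waiter is idle for at least 20 milliseconds no matter how quickly the other thread's fsync returns. With the condvar variant the wait is bounded by the fsync itself. Calling run_flushers(8, false) and run_flushers(8, true) under a timer should show the gap when the stub is short.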

updates/second for sysbench updating 1 row by PK
    8      16      32      64     128     256   concurrent clients
18234   23513   22542   21967   21941   22135   O_DIRECT,condvar
18382   25290   10464    9868   10059   10917   O_DIRECT,sleep
18237   26332   30318   29695   29633   29380   O_DIRECT_NO_FSYNC

There is still a benefit from using O_DIRECT_NO_FSYNC, but the difference is smaller once the sleep is removed: at 32 or more clients it does about 1.3X the updates/second of O_DIRECT,condvar compared to roughly 2.7X to 3X for O_DIRECT,sleep. I didn’t see any obvious stalls when using PMP (Poor Man's Profiler) with the O_DIRECT,condvar binary. However, at the end of the test the average rate for innodb_pages_written was about 19,000/second for O_DIRECT,condvar versus 24,000/second for O_DIRECT_NO_FSYNC. My explanation at this point is that the page cleaner thread is able to write pages faster when it doesn’t have to wait to call fsync and go through fil_flush().