qemu-patch-raspberry4/block
Roman Pen 0ed93d84ed linux-aio: process completions from ioq_submit()
In order to reduce completion latency it makes sense to harvest completed
requests ASAP.  Very fast backend device can complete requests just after
submission, so it is worth trying to check ring buffer in order to peek
completed requests directly after io_submit() has been called.

Indeed, this patch reduces the completions latencies and increases the
overall throughput, e.g. the following is the percentiles of number of
completed requests at once:

        1th 10th  20th  30th  40th  50th  60th  70th  80th  90th  99.99th
Before    2    4    42   112   128   128   128   128   128   128    128
 After    1    1     4    14    33    45    47    48    50    51    108

That means, that before the current patch is applied the ring buffer is
observed as full (128 requests were consumed at once) in 60% of calls.

After patch is applied the distribution of number of completed requests
is "smoother" and the queue (requests in-flight) is almost never full.

The fio read results are the following (write results are almost the
same and are not showed here):

  Before
  ------
job: (groupid=0, jobs=8): err= 0: pid=2227: Tue Jul 19 11:29:50 2016
  Description  : [Emulation of Storage Server Access Pattern]
  read : io=54681MB, bw=1822.7MB/s, iops=179779, runt= 30001msec
    slat (usec): min=172, max=16883, avg=338.35, stdev=109.66
    clat (usec): min=1, max=21977, avg=1051.45, stdev=299.29
     lat (usec): min=317, max=22521, avg=1389.83, stdev=300.73
    clat percentiles (usec):
     |  1.00th=[  346],  5.00th=[  596], 10.00th=[  708], 20.00th=[  852],
     | 30.00th=[  932], 40.00th=[  996], 50.00th=[ 1048], 60.00th=[ 1112],
     | 70.00th=[ 1176], 80.00th=[ 1256], 90.00th=[ 1384], 95.00th=[ 1496],
     | 99.00th=[ 1800], 99.50th=[ 1928], 99.90th=[ 2320], 99.95th=[ 2672],
     | 99.99th=[ 4704]
    bw (KB  /s): min=205229, max=553181, per=12.50%, avg=233278.26, stdev=18383.51

  After
  ------
job: (groupid=0, jobs=8): err= 0: pid=2220: Tue Jul 19 11:31:51 2016
  Description  : [Emulation of Storage Server Access Pattern]
  read : io=57637MB, bw=1921.2MB/s, iops=189529, runt= 30002msec
    slat (usec): min=169, max=20636, avg=329.61, stdev=124.18
    clat (usec): min=2, max=19592, avg=988.78, stdev=251.04
     lat (usec): min=381, max=21067, avg=1318.42, stdev=243.58
    clat percentiles (usec):
     |  1.00th=[  310],  5.00th=[  580], 10.00th=[  748], 20.00th=[  876],
     | 30.00th=[  908], 40.00th=[  948], 50.00th=[ 1012], 60.00th=[ 1064],
     | 70.00th=[ 1080], 80.00th=[ 1128], 90.00th=[ 1224], 95.00th=[ 1288],
     | 99.00th=[ 1496], 99.50th=[ 1608], 99.90th=[ 1960], 99.95th=[ 2256],
     | 99.99th=[ 5408]
    bw (KB  /s): min=212149, max=390160, per=12.49%, avg=245746.04, stdev=11606.75

Throughput increased from 1822MB/s to 1921MB/s, average completion latencies
decreased from 1051us to 988us.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Message-id: 1468931263-32667-4-git-send-email-roman.penyaev@profitbricks.com
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-13 11:00:56 +01:00
..
accounting.c block: Clean up includes 2016-01-20 13:36:23 +01:00
archipelago.c coccinelle: Remove unnecessary variables for function return value 2016-06-20 16:38:13 +02:00
backup.c drive-backup: added support for data compression 2016-09-05 19:06:48 +02:00
blkdebug.c block/blkdebug: Store config filename 2016-08-15 15:52:28 +02:00
blkreplay.c blkreplay: Switch .bdrv_co_discard() to byte-based 2016-07-20 14:11:55 +01:00
blkverify.c block: Convert bdrv_aio_writev() to BdrvChild 2016-07-05 16:46:26 +02:00
block-backend.c block: remove BlockDriver.bdrv_write_compressed 2016-09-05 19:06:48 +02:00
bochs.c block: Convert bdrv_co_preadv/pwritev to BdrvChild 2016-07-05 16:46:27 +02:00
cloop.c block: Convert bdrv_pread(v) to BdrvChild 2016-07-05 16:46:27 +02:00
commit.c Improve block job rate limiting for small bandwidth values 2016-07-13 13:41:38 +02:00
crypto.c block: export LUKS specific data to qemu-img info 2016-07-26 17:46:37 +02:00
curl.c curl: Cast fd to int for DPRINTF 2016-08-17 19:57:54 +08:00
dirty-bitmap.c dirty-bitmap: operate with int64_t amount 2016-07-19 16:54:46 -04:00
dmg.c block: Convert bdrv_pread(v) to BdrvChild 2016-07-05 16:46:27 +02:00
gluster.c Pull request 2016-07-21 11:00:36 +01:00
io.c block/io: turn on dirty_bitmaps for the compressed writes 2016-09-05 19:06:48 +02:00
iscsi.c iscsi: Switch .bdrv_co_discard() to byte-based 2016-07-20 14:24:25 +01:00
linux-aio.c linux-aio: process completions from ioq_submit() 2016-09-13 11:00:56 +01:00
Makefile.objs block: Move bdrv_commit() to block/commit.c 2016-07-05 16:46:27 +02:00
mirror.c mirror: finish earlier on error 2016-08-08 13:05:43 +02:00
nbd-client.c nbd: Convert to byte-based interface 2016-07-20 14:24:25 +01:00
nbd-client.h nbd: Limit nbdflags to 16 bits 2016-08-03 18:44:56 +02:00
nbd.c block/nbd: Store runtime option values 2016-08-15 15:52:29 +02:00
nfs.c coroutine: move entry argument to qemu_coroutine_create 2016-07-13 13:26:02 +02:00
null.c block/null: Implement bdrv_refresh_filename() 2016-06-16 15:20:37 +02:00
parallels.c block/parallels: check new image size 2016-08-05 09:59:06 +01:00
qapi.c qapi: Add new visit_complete() function 2016-07-06 10:52:04 +02:00
qcow.c qcow: cleanup qcow_co_pwritev_compressed to avoid the recursion 2016-09-05 19:06:48 +02:00
qcow2-cache.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
qcow2-cluster.c qcow2: avoid memcpy(dst, NULL, len) 2016-09-13 11:00:55 +01:00
qcow2-refcount.c block: Convert bdrv_discard() to byte-based 2016-07-20 14:11:55 +01:00
qcow2-snapshot.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
qcow2.c qcow2: avoid memcpy(dst, NULL, len) 2016-09-13 11:00:55 +01:00
qcow2.h qcow2: Implement .bdrv_co_pwritev() 2016-06-16 15:19:55 +02:00
qed-check.c qed: Use DIV_ROUND_UP 2016-06-07 18:19:24 +03:00
qed-cluster.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-gencb.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-l2-cache.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-table.c block: Convert bdrv_aio_writev() to BdrvChild 2016-07-05 16:46:26 +02:00
qed.c coroutine: move entry argument to qemu_coroutine_create 2016-07-13 13:26:02 +02:00
qed.h util: move declarations out of qemu-common.h 2016-03-22 22:20:17 +01:00
quorum.c block: Convert bdrv_aio_writev() to BdrvChild 2016-07-05 16:46:26 +02:00
raw-posix.c block: Convert .bdrv_aio_discard() to byte-based 2016-07-20 14:11:55 +01:00
raw-win32.c raw-posix: Switch paio_submit() to byte-based 2016-07-20 14:11:55 +01:00
raw_bsd.c raw_bsd: Convert to byte-based interface 2016-07-20 14:24:25 +01:00
rbd.c block: Convert .bdrv_aio_discard() to byte-based 2016-07-20 14:11:55 +01:00
sheepdog.c sheepdog: Switch .bdrv_co_discard() to byte-based 2016-07-20 14:24:25 +01:00
snapshot.c error: Remove NULL checks on error_propagate() calls 2016-06-20 16:38:13 +02:00
ssh.c block/ssh: Use QemuOpts for runtime options 2016-08-15 15:52:28 +02:00
stream.c Improve block job rate limiting for small bandwidth values 2016-07-13 13:41:38 +02:00
throttle-groups.c block: Move I/O throttling configuration functions to BlockBackend 2016-05-19 16:45:30 +02:00
trace-events trace-events: fix first line comment in trace-events 2016-08-12 10:36:01 +01:00
vdi.c block: Convert bdrv_co_preadv/pwritev to BdrvChild 2016-07-05 16:46:27 +02:00
vhdx-endian.c qemu-common: stop including qemu/bswap.h from qemu-common.h 2016-05-19 16:42:28 +02:00
vhdx-log.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
vhdx.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
vhdx.h block: vhdx - update PAYLOAD_BLOCK_UNMAPPED value to match 1.00 spec 2014-12-12 15:42:22 +00:00
vmdk.c vmdk: add vmdk_co_pwritev_compressed 2016-09-05 19:06:48 +02:00
vpc.c block: Convert bdrv_co_preadv/pwritev to BdrvChild 2016-07-05 16:46:27 +02:00
vvfat.c vvfat: Fix qcow write target driver specification 2016-07-13 13:41:39 +02:00
win32-aio.c linux-aio: share one LinuxAioState within an AioContext 2016-07-18 15:09:31 +01:00
write-threshold.c block: Clean up includes 2016-01-20 13:36:23 +01:00