Commit graph

23 commits

Author SHA1 Message Date
Stefan Hajnoczi da5e1de95b Revert "iothread: release iothread around aio_poll"
This reverts commit a0710f7995.

In qemu-devel email message <556DBF87.2020908@de.ibm.com>, Christian
Borntraeger writes:

  Having many guests all with a kernel/ramdisk (via -kernel) and
  several null block devices will result in hangs. All hanging
  guests are in partition detection code waiting for an I/O to return
  so very early maybe even the first I/O.

  Reverting that commit "fixes" the hangs.

Reverting this commit for the 2.4 release.  More time is needed to
investigate and correct this patch.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-06-12 13:58:33 +01:00
Paolo Bonzini a0710f7995 iothread: release iothread around aio_poll
This is the first step towards having fine-grained critical sections in
dataplane threads, which resolves lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Chrysostomos Nanakos 2f78e491d7 async: aio_context_new(): Handle event_notifier_init failure
On a system with a low limit of open files the initialization
of the event notifier could fail and QEMU exits without printing any
error information to the user.

The problem can be easily reproduced by enforcing a low limit of open
files and start QEMU with enough I/O threads to hit this limit.

The same problem raises, without the creation of I/O threads, while
QEMU initializes the main event loop by enforcing an even lower limit of
open files.

This commit adds an error message on failure:

 # qemu [...] -object iothread,id=iothread0 -object iothread,id=iothread1
 qemu: Failed to initialize event notifier: Too many open files in system

Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:48 +01:00
Paolo Bonzini 363285d4b3 test-aio: test timers on Windows too
Use EventNotifier instead of a pipe, which makes it trivial to test
timers on Windows.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-29 10:46:58 +01:00
Stefan Weil 748bfb4eee tests: Add missing 'static' attributes (fix warnings from smatch)
Smatch also complains about 0 used for pointers, so replace those by
NULL in test-visitor-serialization.c, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-07-18 17:45:37 +04:00
Paolo Bonzini acfb23ad3d AioContext: do not rely on aio_poll(ctx, true) result to end a loop
Currently, whenever aio_poll(ctx, true) has completed all pending
work it returns true *and* the next call to aio_poll(ctx, true)
will not block.

This invariant has its roots in qemu_aio_flush()'s implementation
as "while (qemu_aio_wait()) {}".  However, qemu_aio_flush() does
not exist anymore and bdrv_drain_all() is implemented differently;
and this invariant is complicated to maintain and subtly different
from the return value of GMainLoop's g_main_context_iteration.

All calls to aio_poll(ctx, true) except one are guarded by a
while() loop checking for a request to be incomplete, or a
BlockDriverState to be idle.  The one remaining call (in
iothread.c) uses this to delay the aio_context_release/acquire
pair until the AioContext is quiescent, however:

- we can do the same just by using non-blocking aio_poll,
  similar to how vl.c invokes main_loop_wait

- it is buggy, because it does not ensure that the AioContext
  is released between an aio_notify and the next time the
  iothread goes to sleep.  This leads to hangs when stopping
  the dataplane thread.

In the end, these semantics are a bad match for the current
users of AioContext.  So modify that one exception in iothread.c,
which also fixes the hangs, as well as the testcase so that
it use the same idiom as the actual QEMU code.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-14 12:03:20 +02:00
Paolo Bonzini ef508f427b test-aio: fix GSource-based timer test
The current test depends too much on the implementation of the AioContext
GSource.  Just iterate on the main loop until the callback has been invoked
the right number of times.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-07-09 15:50:11 +02:00
Stefan Weil 0875709429 tests: Remove unsupported tests for MinGW
test_timer_schedule and test_source_timer_schedule don't compile for MinGW
because some functions are not implemented for MinGW (qemu_pipe,
aio_set_fd_handler).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-31 22:35:02 +02:00
Stefan Hajnoczi 98563fc3ec aio: add aio_context_acquire() and aio_context_release()
It can be useful to run an AioContext from a thread which normally does
not "own" the AioContext.  For example, request draining can be
implemented by acquiring the AioContext and looping aio_poll() until all
requests have been completed.

The following pattern should work:

  /* Event loop thread */
  while (running) {
      aio_context_acquire(ctx);
      aio_poll(ctx, true);
      aio_context_release(ctx);
  }

  /* Another thread */
  aio_context_acquire(ctx);
  bdrv_read(bs, 0x1000, buf, 1);
  aio_context_release(ctx);

This patch implements aio_context_acquire() and aio_context_release().

Note that existing aio_poll() callers do not need to worry about
acquiring and releasing - it is only needed when multiple threads will
call aio_poll() on the same AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Stefan Hajnoczi d3fa923044 aio: make aio_poll(ctx, true) block with no fds
This patch drops a special case where aio_poll(ctx, true) returns false
instead of blocking if no file descriptors are waiting on I/O.  Now it
is possible to block in aio_poll() to wait for aio_notify().

This change eliminates busy waiting.  bdrv_drain_all() used to rely on
busy waiting to completed throttled I/O requests but this is no longer
required so we can simplify aio_poll().

Note that aio_poll() still returns false when aio_notify() was used.  In
other words, stopping a blocking aio_poll() wait is not considered
making progress.

Adjust test-aio /aio/bh/callback-delete/one which assumed aio_poll(ctx,
true) would immediately return false instead of blocking.

Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-06 16:53:51 +01:00
Alex Bligh a94a3fac19 aio / timers: fix build of test/test-aio.c on non-linux platforms
tests/test-aio.c used pipe2 which is Linux only. Use qemu_pipe
and qemu_set_nonblock for portabillity. Addition of O_CLOEXEC
is a harmless bonus.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Alex Bligh fcdda211f9 aio / timers: use g_usleep() not sleep()
sleep() apparently doesn't exist under mingw. Use g_usleep for
portability.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-01 20:02:45 +04:00
Alex Bligh 91c68f143d aio / timers: remove dummy_io_handler_flush from tests/test-aio.c
Remove dummy_io_handler_flush from tests/test-aio.c as it does
nothing now.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 22:03:47 +02:00
Alex Bligh b53edf971f aio / timers: Add test harness for AioContext timers
Add a test harness for AioContext timers. The g_source equivalent is
unsatisfactory as it suffers from false wakeups.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:14:24 +02:00
Alex Bligh dae21b98b9 aio / timers: Add QEMUTimerListGroup to AioContext
Add a QEMUTimerListGroup each AioContext (meaning a QEMUTimerList
associated with each clock is added) and delete it when the
AioContext is freed.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 19:10:27 +02:00
Stefan Hajnoczi f2e5dca46b aio: drop io_flush argument
The .io_flush() handler no longer exists and has no users.  Drop the
io_flush argument to aio_set_fd_handler() and related functions.

The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no
longer used and are dropped too.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi 1b9ecdb164 tests: drop event_active_cb()
Drop the io_flush argument to aio_set_event_notifier().

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:52:19 +02:00
Stefan Hajnoczi 164a101f28 aio: stop using .io_flush()
Now that aio_poll() users check their termination condition themselves,
it is no longer necessary to call .io_flush() handlers.

The behavior of aio_poll() changes as follows:

1. .io_flush() is no longer invoked and file descriptors are *always*
monitored.  Previously returning 0 from .io_flush() would skip this file
descriptor.

Due to this change it is essential to check that requests are pending
before calling qemu_aio_wait().  Failure to do so means we block, for
example, waiting for an idle iSCSI socket to become readable when there
are no requests.  Currently all qemu_aio_wait()/aio_poll() callers check
before calling.

2. aio_poll() now returns true if progress was made (BH or fd handlers
executed) and false otherwise.  Previously it would return true whenever
'busy', which means that .io_flush() returned true.  The 'busy' concept
no longer exists so just progress is returned.

Due to this change we need to update tests/test-aio.c which asserts
aio_poll() return values.  Note that QEMU doesn't actually rely on these
return values so only tests/test-aio.c cares.

Note that ctx->notifier, the EventNotifier fd used for aio_notify(), is
now handled as a special case.  This is a little ugly but maintains
aio_poll() semantics, i.e. aio_notify() does not count as 'progress' and
aio_poll() avoids blocking when the user has not set any fd handlers yet.

Patches after this remove .io_flush() handler code until we can finally
drop the io_flush arguments to aio_set_fd_handler() and friends.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:35 +02:00
Stefan Hajnoczi 24d1a6d9d5 tests: adjust test-aio to new aio_poll() semantics
aio_poll(ctx, true) will soon block if any fd handlers have been set.
Previously it would only block when .io_flush() returned true.

This means that callers must check their wait condition *before*
aio_poll() to avoid deadlock.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-19 15:45:34 +02:00
Kevin Wolf 2ea9b58f0b aio: Fix return value of aio_poll()
aio_poll() must return true if any work is still pending, even if it
didn't make progress, so that bdrv_drain_all() doesn't stop waiting too
early. The possibility of stopping early occasionally lead to a failed
assertion in bdrv_drain_all(), when some in-flight request was missed
and the function didn't really drain all requests.

In order to make that change, the return value as specified in the
function comment must change for blocking = false; fortunately, the
return value of blocking = false callers is only used in test cases, so
this change shouldn't cause any trouble.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-17 10:51:42 +01:00
Paolo Bonzini 737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Stefan Hajnoczi 9fe3781f09 tests: use aio_poll() instead of aio_flush() in test-aio.c
There has been confusion between various aio wait and flush functions.
It's time to get rid of qemu_aio_flush() but in the aio test cases we
really do want this low-level functionality.

Therefore declare a local wait_for_aio() helper for the test cases.
Drop the aio_flush() test case.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-11 11:04:25 +01:00
Paolo Bonzini b2ea25d7ae tests: add AioContext unit tests
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-11-26 09:37:51 -06:00