Commit graph

907 commits

Author SHA1 Message Date
Dr. David Alan Gilbert 9952e807fd vhost-user+postcopy: Use qemu_set_nonblock
Use qemu_set_nonblock rather than a simple fcntl; cleaner
and I have no reason to change other flags.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:02:02 +03:00
Tiwei Bie 6f80e6170e virtio: support setting memory region based host notifier
This patch introduces the support for setting memory region
based host notifiers for virtio device. This is helpful when
using a hardware accelerator for a virtio device, because
hardware heavily depends on the notification, this will allow
the guest driver in the VM to notify the hardware directly.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:01:54 +03:00
Tiwei Bie 1f3a4519b1 vhost-user: support receiving file descriptors in slave_read
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 17:01:54 +03:00
Peter Xu ffcbbe722f vhost: add trace for IOTLB miss
Add some trace points for IOTLB translation for vhost. After vhost-user
is setup, the only IO path that QEMU will participate should be the
IOMMU translation, so it'll be good we can track this with explicit
timestamps when needed to see how long time we take to do the
translation, and whether there's anything stuck inside.  It might be
useful for triaging vhost-user problems.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-23 03:14:41 +03:00
Jonathan Helman b7b1264429 virtio-balloon: add hugetlb page allocation counts
qemu should read and report hugetlb page allocation
counts exported in the following kernel patch:

    commit 4c3ca37c4a4394978fd0f005625f6064ed2b9a64
    Author: Jonathan Helman <jonathan.helman@oracle.com>
    Date:   Mon Mar 19 11:00:35 2018 -0700

    virtio_balloon: export hugetlb page allocation counts

    Export the number of successful and failed hugetlb page
    allocations via the virtio balloon driver. These 2 counts
    come directly from the vm_events HTLB_BUDDY_PGALLOC and
    HTLB_BUDDY_PGALLOC_FAIL.

Signed-off-by: Jonathan Helman <jonathan.helman@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2018-05-23 03:14:40 +03:00
Jason Wang aebbdbee55 vhost: do not verify ring mappings when IOMMU is enabled
When IOMMU is enabled, we store virtqueue metadata as iova (though it
may has _phys suffix) and access them through dma helpers. Any
translation failures could be reported by IOMMU.

In this case, trying to validate iova against gpa won't work and will
cause a false error reporting. So this patch bypasses the ring
verification if IOMMU is enabled which is similar to the behavior
before 0ca1fd2d68 that calls vhost_memory_map() which is a nop when
IOMMU is enabled.

Fixes: 0ca1fd2d68 ("vhost: Simplify ring verification checks")
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-04-16 19:11:38 +03:00
Peter Maydell 915d34c5f9 Miscellaneous bugfixes, including crash fixes from Alexey, Peter M. and
Thomas.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJay3qbAAoJEL/70l94x66D7LwIAIDjHDULzCy/u+m/uFyTn7rD
 zyDhQTWgHP6OQ+TqixIDDszeasev/PWmiC6Bp+NG6ZIG102+XTREciSW+X7B6mct
 OqI/5xpjoqzKj2LrTeCnm754Xv7Ilz9kxZ1MKlGqjnRzdmykDRx7RNLqGBohL4EI
 nnF3iiOiT4ECY/aLgeRLfufJqj9zHr8hQ3om+2zMqntPfqc3Eg0eCpgb7uGMRDq8
 nWLecnDtqmBWhXDJCPngxDavBQqHDAmq1aj9ppJPLS+nB6pez0DvHMI6Gg3K4fIl
 2ybJse5FbOj/+PsM1Ae5g8TcWz607mVgtE+crKxLDmffg+YjbO9raqWigZoIw2Y=
 =aMIC
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Miscellaneous bugfixes, including crash fixes from Alexey, Peter M. and
Thomas.

# gpg: Signature made Mon 09 Apr 2018 15:37:15 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  Add missing bit for SSE instr in VEX decoding
  maint: Add .mailmap entries for patches claiming list authorship
  dump: Fix build with newer gcc
  device-crash-test: Remove fixed isa-fdc entry
  qemu-pr-helper: Write pidfile more often
  qemu-pr-helper: Daemonize before dropping privileges
  virtio-serial: fix heapover-flow
  kvmclock: fix clock_is_reliable on migration from QEMU < 2.9
  hw/dma/i82374: Avoid double creation of the 82374 controller
  hw/scsi: support SCSI-2 passthrough without PI
  scsi-disk: allow customizing the SCSI version
  scsi-disk: Don't enlarge min_io_size to max_io_size
  configure: Add missing configure options to help text
  i386/hyperv: error out if features requested but unsupported
  i386/hyperv: add hv-frequencies cpu property
  target/i386: WHPX: set CPUID_EXT_HYPERVISOR bit
  memfd: fix vhost-user-test on non-memfd capable host
  scripts/checkpatch.pl: Bug fix
  target/i386: Fix andn instruction
  sys_membarrier: fix up include directives

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-04-09 17:29:10 +01:00
Dr. David Alan Gilbert e7b94a84b6 vhost: Allow adjoining regions
My rework of section adding combines overlapping or adjoining regions,
but checks they're actually the same underlying RAM block.
Fix the case where two blocks adjoin but don't overlap; that new region
should get added (but not combined), but my previous patch was disallowing it.

Fixes: c1ece84e7c

Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-04-09 17:35:46 +03:00
Maxime Coquelin 1c3e5a2617 vhost-user: back SET/GET_CONFIG requests with a protocol feature
Without a dedicated protocol feature, QEMU cannot know whether
the backend can handle VHOST_USER_SET_CONFIG and
VHOST_USER_GET_CONFIG messages.

This patch adds a protocol feature that is only advertised by
QEMU if the device implements the config ops. Vhost user init
fails if the device support the feature but the backend doesn't.

The backend should only send VHOST_USER_SLAVE_CONFIG_CHANGE_MSG
requests if the protocol feature has been negotiated.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Changpeng Liu <changpeng.liu@intel.com>
2018-04-09 17:35:46 +03:00
Maxime Coquelin bc6abcff7c vhost-user-blk: set config ops before vhost-user init
As soon as vhost-user init is done, the backend may send
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, so let's set the
notification callback before it.

Also, it will be used to know whether the device supports
the config feature to advertize it or not.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Changpeng Liu <changpeng.liu@intel.com>
2018-04-09 17:35:45 +03:00
Marc-André Lureau 648abbfbaa memfd: fix vhost-user-test on non-memfd capable host
On RHEL7, memfd is not supported, and vhost-user-test fails:
TEST: tests/vhost-user-test... (pid=10248)
  /x86_64/vhost-user/migrate:
  qemu-system-x86_64: -object memory-backend-memfd,id=mem,size=2M,: failed to create memfd
FAIL

There is a qemu_memfd_check() to prevent running memfd path, but it
also checks for fallback implementation. Let's specialize
qemu_memfd_check() to check memfd only, while qemu_memfd_alloc_check()
checks for the qemu_memfd_alloc() API.

Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Tested-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180328121804.16203-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2018-04-09 12:57:06 +02:00
Dr. David Alan Gilbert c1ece84e7c vhost: Huge page align and merge
Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.

This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert 46343570c0 vhost+postcopy: Wire up POSTCOPY_END notify
Wire up a call to VHOST_USER_POSTCOPY_END message to the vhost clients
right before we ask the listener thread to shutdown.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert c639187e33 vhost-user: Add VHOST_USER_POSTCOPY_END message
This message is sent just before the end of postcopy to get the
client to stop using userfault since we wont respond to any more
requests.  It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert c07e36158f vhost+postcopy: Add vhost waker
Register a waker function in vhost-user code to be notified when
pages arrive or requests to previously mapped pages get requested.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:37 +02:00
Dr. David Alan Gilbert 375318d03f vhost+postcopy: Resolve client address
Resolve fault addresses read off the clients UFD into RAMBlock
and offset, and call back to the postcopy code to ask for the page.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 16:40:19 +02:00
Dr. David Alan Gilbert 905125d0e2 vhost+postcopy: Stash RAMBlock and offset
Stash the RAMBlock and offset for later use looking up
addresses.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert 9bb3801994 vhost+postcopy: Send address back to qemu
We need a better way, but at the moment we need the address of the
mappings sent back to qemu so it can interpret the messages on the
userfaultfd it reads.

This is done as a 3 stage set:
   QEMU -> client
      set_mem_table

   mmap stuff, get addresses

   client -> qemu
       here are the addresses

   qemu -> client
       OK - now you can use them

That ensures that qemu has registered the new addresses in it's
userfault code before the client starts accessing them.

Note: We don't ask for the default 'ack' reply since we've got our own.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert 55d754b307 postcopy+vhost-user: Split set_mem_table for postcopy
Split the set_mem_table routines in both qemu and libvhost-user
because the postcopy versions are going to be quite different
once changes in the later patches are added. However, this patch
doesn't produce any functional change, just the split.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert 6864a7b5ac vhost+postcopy: Transmit 'listen' to slave
Notify the vhost-user slave on reception of the 'postcopy-listen'
event from the source.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert f82c11165f vhost+postcopy: Register shared ufd with postcopy
Register the UFD that comes in as the response to the 'advise' method
with the postcopy code.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:28 +02:00
Dr. David Alan Gilbert d3dff7a5a1 vhost-user: Add 'VHOST_USER_POSTCOPY_ADVISE' message
Wire up a notifier to send a VHOST_USER_POSTCOPY_ADVISE
message on an incoming advise.

Later patches will fill in the behaviour/contents of the
message.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Dr. David Alan Gilbert 9ccbfe14dd postcopy: Add vhost-user flag for postcopy and check it
Add a vhost feature flag for postcopy support, and
use the postcopy notifier to check it before allowing postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 05:03:27 +02:00
Markus Armbruster 112ed241f5 qapi: Empty out qapi-schema.json
The previous commit improved compile time by including less of the
generated QAPI headers.  This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that that pull in
everything.

Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.

It's possible everywhere, except:

* monitor.c needs qmp-command.h to get qmp_init_marshal()

* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
  qapi-event.h to get enum QAPIEvent

Perhaps we'll get rid of those some other day.

Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-02 13:45:50 -06:00
Gonglei efbfeb8180 cryptodev-vhost-user: add crypto session handler
Introduce two vhost-user meassges: VHOST_USER_CREATE_CRYPTO_SESSION
and VHOST_USER_CLOSE_CRYPTO_SESSION. At this point, the QEMU side
support crypto operation in cryptodev host-user backend.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-01 18:26:17 +02:00
Gonglei 5da73dabe8 cryptodev: add vhost support
Impliment the vhost-crypto's funtions, such as startup,
stop and notification etc. Introduce an enum
QCryptoCryptoDevBackendOptionsType in order to
identify the cryptodev vhost backend is vhost-user
or vhost-kernel-module (If exist).

At this point, the cryptdoev-vhost-user works.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-01 18:26:17 +02:00
Jia He 9fac50c88d vhost: fix incorrect check in vhost_verify_ring_mappings
In commit 0ca1fd2d68 ("vhost: Simplify ring verification checks"),
it checks the virtqueue desc mapping for 3 times.

Fixed: commit 0ca1fd2d68 ("vhost: Simplify ring verification checks")
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2018-03-01 18:17:47 +02:00
Jia He fb20fbb764 vhost: avoid to start/stop virtqueue which is not ready
In our Armv8a server, we try to configure the vhost scsi but fail
to boot up the guest (-machine virt-2.10). The guest's boot failure
is very early, even earlier than grub.

There are 3 virtqueues (ctrl, event and cmd) for virtio scsi device,
but ovmf and seabios will only set the physical address for the 3rd
one (cmd). Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr
will be 0 for ctrl and event vq when qemu negotiates with ovmf. So
vhost_memory_map fails with ENOMEM.

This patch just fixs it by early quitting the virtqueue start/stop
when virtio_queue_get_desc_addr is 0.

Btw, after guest kernel starts, all the 3 queues will be initialized
and set address correctly.

Already tested on Arm64 and X86_64 qemu.

Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-01 18:17:47 +02:00
Jay Zhou 9e2a2a3e08 vhost: fix memslot limit check
Since used_memslots will be updated to the actual value after
registering memory listener for the first time, move the
memslots limit checking to the right place.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-01 16:25:37 +02:00
Greg Kurz 2080a29f0e virtio-pci: trivial fixes in error message
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-01 16:25:36 +02:00
Peter Maydell b734ed9de1 virtio,vhost,pci,pc: features, fixes and cleanups
- new stats in virtio balloon
 - virtio eventfd rework for boot speedup
 - vhost memory rework for boot speedup
 - fixes and cleanups all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJagxKDAAoJECgfDbjSjVRp5qAH/3gmgBaIzL3KRHd5i0RZifJv
 PvyAVYgZd7h0+/1r9GM7guHKyEPZ08JtbHSm/HuDV4BD/Vf3/8joy8roExIfde2A
 6k8fd6ANVQmE3t5zUxNXi9qiG4pO4xDIu4cMAbixzgN9x5ttlcfTw7fTT0e0VJxJ
 8SN02/uCPPR/DY4/cpjah+slSyv6rBKT1v1ONy7djyRTYHi6h3Meoh05YfEALkwA
 goxTKBZHi0L1IZ3HP/ZpXJDohQ5n2P09DX0fQgb8PgmW6WIWB/Qpi5pD53LZpMCV
 n9waTF0U0ahneFd2FHo22QMMrwWvQyrjv+w5uXVr+qmHb/OyH2tUt7PgGF9+QKA=
 =78s5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pci,pc: features, fixes and cleanups

- new stats in virtio balloon
- virtio eventfd rework for boot speedup
- vhost memory rework for boot speedup
- fixes and cleanups all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 13 Feb 2018 16:29:55 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (22 commits)
  virtio-balloon: include statistics of disk/file caches
  acpi-test: update FADT
  lpc: drop pcie host dependency
  tests: acpi: fix FADT not being compared to reference table
  hw/pci-bridge: fix pcie root port's IO hints capability
  libvhost-user: Support across-memory-boundary access
  libvhost-user: Fix resource leak
  virtio-balloon: unref the memory region before continuing
  pci: removed the is_express field since a uniform interface was inserted
  virtio-blk: enable multiple vectors when using multiple I/O queues
  pci/bus: let it has higher migration priority
  pci-bridge/i82801b11: clear bridge registers on platform reset
  vhost: Move log_dirty check
  vhost: Merge and delete unused callbacks
  vhost: Clean out old vhost_set_memory and friends
  vhost: Regenerate region list from changed sections list
  vhost: Merge sections added to temporary list
  vhost: Simplify ring verification checks
  vhost: Build temporary section list and deref after commit
  virtio: improve virtio devices initialization time
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-02-13 16:33:31 +00:00
Tomáš Golembiovský bf1e7140ef virtio-balloon: include statistics of disk/file caches
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-13 18:29:35 +02:00
Tiwei Bie b86107ab43 virtio-balloon: unref the memory region before continuing
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-13 18:25:48 +02:00
Markus Armbruster e688df6bc4 Include qapi/error.h exactly where needed
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.

While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
2018-02-09 13:50:17 +01:00
Changpeng Liu 0ebf9a7488 virtio-blk: enable multiple vectors when using multiple I/O queues
Currently virtio-pci driver hardcoded 2 vectors for virtio-blk device,
for multiple I/O queues scenario, all the I/O queues will share one
interrupt vector, while here, enable multiple vectors according to
the number of I/O queues.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Dr. David Alan Gilbert aa3c40f6bf vhost: Move log_dirty check
Move the log_dirty check into vhost_section.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Dr. David Alan Gilbert 938eeb640c vhost: Merge and delete unused callbacks
Now that the olf vhost_set_memory code is gone, the _nop and _add
callbacks are identical and can be merged.  The _del callback is
no longer needed.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:41 +02:00
Dr. David Alan Gilbert 06709c120c vhost: Clean out old vhost_set_memory and friends
Remove the old update mechanism, vhost_set_memory, and the functions
and flags it used.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert ade6d081fc vhost: Regenerate region list from changed sections list
Compare the sections list that's just been generated, and if it's
different from the old one regenerate the region list.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert 48d7c97577 vhost: Merge sections added to temporary list
As sections are reported by the listener to the _nop and _add
methods, add them to the temporary section list but now merge them
with the previous section if the new one abuts and the backend allows.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert 0ca1fd2d68 vhost: Simplify ring verification checks
vhost_verify_ring_mappings() were used to verify that
rings are still accessible and related memory hasn't
been moved after flatview is updated.

It was doing checks by mapping ring's GPA+len and
checking that HVA hadn't changed with new memory map.
To avoid maybe expensive mapping call, we were
identifying address range that changed and were doing
mapping only if ring was in changed range.

However it's not neccessary to perform ring's GPA
mapping as we already have its current HVA and all
we need is to verify that ring's GPA translates to
the same HVA in updated flatview.

This will allow the following patches to simplify the range
comparison that was previously needed to avoid expensive
verify_ring_mapping calls.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
with modifications by:
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Dr. David Alan Gilbert c44317efec vhost: Build temporary section list and deref after commit
Igor spotted that there's a race, where a region that's unref'd
in a _del callback might be free'd before the set_mem_table call in
the _commit callback, and thus the vhost might end up using free memory.

Fix this by building a complete temporary sections list, ref'ing every
section (during add and nop) and then unref'ing the whole list right
at the end of commit.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:40 +02:00
Gal Hammer 710fccf80d virtio: improve virtio devices initialization time
The loading time of a VM is quite significant when its virtio
devices use a large amount of virt-queues (e.g. a virtio-serial
device with max_ports=511). Most of the time is spend in the
creation of all the required event notifiers (ioeventfd and memory
regions).

This patch pack all the changes to the memory regions in a
single memory transaction.

Reported-by: Sitong Liu
Reported-by: Xiaoling Gao
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
2018-02-08 21:06:40 +02:00
Gal Hammer 76143618a5 virtio: remove event notifier cleanup call on de-assign
The virtio_bus_set_host_notifier function no longer calls
event_notifier_cleanup when a event notifier is removed.

The commit updates the code to match the new behavior and calls
virtio_bus_cleanup_host_notifier after the notifier was de-assign
and no longer in use.

This change is a preparation to allow executing the
virtio_bus_set_host_notifier function in a memory region
transaction.

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 21:06:26 +02:00
Michael S. Tsirkin f41d912023 Revert "vhost: add traces for memory listeners"
This reverts commit 0750b06021.

Follow up patches are reworking the memory listeners, the new mechanism
will add its own set of traces.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-02-08 19:26:51 +02:00
Marc-André Lureau 0f2956f915 memfd: add error argument, instead of perror()
This will allow callers to silence error report when the call is
allowed to failed.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:25 +01:00
Peter Xu d25836cafd memory: do explicit cleanup when remove listeners
When unregister memory listeners, we should call, e.g.,
region_del() (and possibly other undo operations) on every existing
memory region sections there, otherwise we may leak resources that are
held during the region_add(). This patch undo the stuff for the
listeners, which emulates the case when the address space is set from
current to an empty state.

I found this problem when debugging a refcount leak issue that leads to
a device unplug event lost (please see the "Bug:" line below).  In that
case, the leakage of resource is the PCI BAR memory region refcount.
And since memory regions are not keeping their own refcount but onto
their owners, so the vfio-pci device's (who is the owner of the PCI BAR
memory regions) refcount is leaked, and event missing.

We had encountered similar issues before and fixed in other
way (ee4c112846, "vhost: Release memory references on cleanup"). This
patch can be seen as a more high-level fix of similar problems that are
caused by the resource leaks from memory listeners. So now we can remove
the explicit unref of memory regions since that'll be done altogether
during unregistering of listeners now.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1531393
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-5-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Peter Xu 0750b06021 vhost: add traces for memory listeners
Trace these operations on two memory listeners.  It helps to verify the
new memory listener fix, and good to keep them there.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-2-peterx@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-07 14:09:24 +01:00
Philippe Mathieu-Daudé bf85388169 qdev: use device_class_set_parent_realize/unrealize/reset()
changes generated using the following Coccinelle patch:

  @@
  type DeviceParentClass;
  DeviceParentClass *pc;
  DeviceClass *dc;
  identifier parent_fn;
  identifier child_fn;
  @@
  (
  +device_class_set_parent_realize(dc, child_fn, &pc->parent_fn);
  -pc->parent_fn = dc->realize;
  ...
  -dc->realize = child_fn;
  |
  +device_class_set_parent_unrealize(dc, child_fn, &pc->parent_fn);
  -pc->parent_fn = dc->unrealize;
  ...
  -dc->unrealize = child_fn;
  |
  +device_class_set_parent_reset(dc, child_fn, &pc->parent_fn);
  -pc->parent_fn = dc->reset;
  ...
  -dc->reset = child_fn;
  )

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180114020412.26160-4-f4bug@amsat.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-02-05 13:54:38 +01:00
Michael S. Tsirkin 1ef8185a06 Revert "virtio: postpone the execution of event_notifier_cleanup function"
This reverts commit 4fe6d78b2e as it is
reported to break cleanup and migration.

Cc: Gal Hammer <ghammer@redhat.com>
Cc: Sitong Liu <siliu@redhat.com>
Cc: Xiaoling Gao <xiagao@redhat.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
2018-01-24 19:20:19 +02:00
Michael S. Tsirkin ce3a9eaff4 Revert "virtio: improve virtio devices initialization time"
This reverts commit 6f0bb23072.

This reverts commit f87d72f5c5 as that is
reported to break cleanup and migration.

Cc: Gal Hammer <ghammer@redhat.com>
Cc: Sitong Liu <siliu@redhat.com>
Cc: Xiaoling Gao <xiagao@redhat.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
2018-01-24 19:20:19 +02:00
Jay Zhou f4bf56fb78 vhost: remove assertion to prevent crash
QEMU will assert on vhost-user backed virtio device hotplug if QEMU is
using more RAM regions than VHOST_MEMORY_MAX_NREGIONS (for example if
it were started with a lot of DIMM devices).

Fix it by returning error instead of asserting and let callers of
vhost_set_mem_table() handle error condition gracefully.

Cc: qemu-stable@nongnu.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:39 +02:00
Michael S. Tsirkin 69aff03064 vhost-user: fix misaligned access to payload
We currently take a pointer to a misaligned field of a packed structure.
clang reports this as a build warning.
A fix is to keep payload in a separate structure, and access is it
from there using a vectored write.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:39 +02:00
Michael S. Tsirkin 24e34754eb vhost-user: factor out msg head and payload
split header and payload into separate structures,
to enable easier handling of alignment issues.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:39 +02:00
Gal Hammer 6f0bb23072 virtio: improve virtio devices initialization time
The loading time of a VM is quite significant when its virtio
devices use a large amount of virt-queues (e.g. a virtio-serial
device with max_ports=511). Most of the time is spend in the
creation of all the required event notifiers (ioeventfd and memory
regions).

This patch pack all the changes to the memory regions in a
single memory transaction.

Reported-by: Sitong Liu <siliu@redhat.com>
Reported-by: Xiaoling Gao <xiagao@redhat.com>
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:38 +02:00
Gal Hammer 4fe6d78b2e virtio: postpone the execution of event_notifier_cleanup function
Use the EventNotifier's cleanup callback function to execute the
event_notifier_cleanup function after kvm unregistered the eventfd.

This change supports running the virtio_bus_set_host_notifier
function inside a memory region transaction. Otherwise, a closed
fd is sent to kvm, which results in a failure.

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:37 +02:00
Changpeng Liu 00343e4b54 vhost-user-blk: introduce a new vhost-user-blk host device
This commit introduces a new vhost-user device for block, it uses a
chardev to connect with the backend, same with Qemu virito-blk device,
Guest OS still uses the virtio-blk frontend driver.

To use it, start QEMU with command line like this:

qemu-system-x86_64 \
    -chardev socket,id=char0,path=/path/vhost.socket \
    -device vhost-user-blk-pci,chardev=char0,num-queues=2, \
            bootindex=2... \

Users can use different parameters for `num-queues` and `bootindex`.

Different with exist Qemu virtio-blk host device, it makes more easy
for users to implement their own I/O processing logic, such as all
user space I/O stack against hardware block device. It uses the new
vhost messages(VHOST_USER_GET_CONFIG) to get block virtio config
information from backend process.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:37 +02:00
Changpeng Liu 4c3e257b5e vhost-user: add new vhost user messages to support virtio config space
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages which can be
used for live migration of vhost user devices, also vhost user devices
can benefit from the messages to get/set virtio config space from/to the
I/O target. For the purpose to support virtio config space change,
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier
in case virtio config space change in the slave I/O target.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-18 21:52:37 +02:00
Michael S. Tsirkin acc95bc850 Merge remote-tracking branch 'origin/master' into HEAD
Resolve conflicts around apb.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-01-11 22:03:50 +02:00
Ladi Prosek f2bc54de47 virtio-pci: Don't force Subsystem Vendor ID = Vendor ID
The statement being removed doesn't change anything as virtio PCI devices already
have Subsystem Vendor ID set to pci_default_sub_vendor_id (0x1af4), same as Vendor
ID. And the Virtio spec does not require the two to be equal, either:

  "The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect the PCI
  Vendor and Device ID of the environment (for informational purposes by the driver)."

Background:

Following the recent virtio-win licensing change, several vendors are planning to
ship their own certified version of Windows guest Virtio drivers, potentially taking
advantage of Windows Update as a distribution channel. It is therefore critical that
each vendor uses their own PCI Subsystem Vendor ID for Virtio devices to prevent
drivers from other vendors binding to it.

This would be trivially done by adding:

  k->subsystem_vendor_id = ...

to virtio_pci_class_init(). Except for the problematic statement deleted by this
patch, which reverts the Subsystem Vendor ID back to 0x1af4 for legacy devices for
no good reason.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-12-22 01:42:03 +02:00
Michael S. Tsirkin 8fc47c876d virtio_error: don't invoke status callbacks
Backends don't need to know what frontend requested a reset,
and notifying then from virtio_error is messy because
virtio_error itself might be invoked from backend.

Let's just set the status directly.

Cc: qemu-stable@nongnu.org
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-19 23:41:00 +02:00
Philippe Mathieu-Daudé 2070aaebd2 hw/virtio-balloon: remove old i386 dependency
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
Philippe Mathieu-Daudé e9808d0969 hw: use "qemu/osdep.h" as first #include in source files
applied using ./scripts/clean-includes

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-12-18 17:07:02 +03:00
David Gibson fd56e0612b pci: Eliminate redundant PCIDevice::bus pointer
The bus pointer in PCIDevice is basically redundant with QOM information.
It's always initialized to the qdev_get_parent_bus(), the only difference
is the type.

Therefore this patch eliminates the field, instead creating a pci_get_bus()
helper to do the type mangling to derive it conveniently from the QOM
Device object underneath.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-12-05 19:13:45 +02:00
Prasad J Pandit 758ead31c7 virtio: check VirtQueue Vring object is set
A guest could attempt to use an uninitialised VirtQueue object
or unset Vring.align leading to a arithmetic exception. Add check
to avoid it.

Reported-by: Zhangboxian <zhangboxian@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2017-12-01 19:05:58 +02:00
Greg Kurz 2fe45ec3bf vhost: fix error check in vhost_verify_ring_mappings()
Since commit f1f9e6c5 "vhost: adapt vhost_verify_ring_mappings() to
virtio 1 ring layout", we check the mapping of each part (descriptor
table, available ring and used ring) of each virtqueue separately.

The checking of a part is done by the vhost_verify_ring_part_mapping()
function: it returns either 0 on success or a negative errno if the
part cannot be mapped at the same place.

Unfortunately, the vhost_verify_ring_mappings() function checks its
return value the other way round. It means that we either:
- only verify the descriptor table of the first virtqueue, and if it
  is valid we ignore all the other mappings
- or ignore all broken mappings until we reach a valid one

ie, we only raise an error if all mappings are broken, and we consider
all mappings are valid otherwise (false success), which is obviously
wrong.

This patch ensures that vhost_verify_ring_mappings() only returns
success if ALL mappings are okay.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-01 19:05:58 +02:00
Maxime Coquelin 2ae39a113a vhost: restore avail index from vring used index on disconnection
vhost_virtqueue_stop() gets avail index value from the backend,
except if the backend is not responding.

It happens when the backend crashes, and in this case, internal
state of the virtio queue is inconsistent, making packets
to corrupt the vring state.

With a Linux guest, it results in following error message on
backend reconnection:

[   22.444905] virtio_net virtio0: output.0:id 0 is not a head!
[   22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
[   22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5

Fixes: 283e2c2adc ("net: virtio-net discards TX data after link down")
Cc: qemu-stable@nongnu.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-01 19:05:58 +02:00
Maxime Coquelin 2d4ba6cc74 virtio: Add queue interface to restore avail index from vring used index
In case of backend crash, it is not possible to restore internal
avail index from the backend value as vhost_get_vring_base
callback fails.

This patch provides a new interface to restore internal avail index
from the vring used index, as done by some vhost-user backend on
reconnection.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-12-01 19:05:58 +02:00
linzhecheng 7abea552ab fix: unrealize virtio device if we fail to hotplug it
If we fail to hotplug virtio-blk device and then suspend
or shutdown VM, qemu is likely to crash.

Re-production steps:
1. Run VM named vm001
2. Create a virtio-blk.xml which contains wrong configurations:
<disk device="lun" rawio="yes" type="block">
  <driver cache="none" io="native" name="qemu" type="raw" />
  <source dev="/dev/mapper/11-dm" />
  <target bus="virtio" dev="vdx" />
</disk>
3. Run command : virsh attach-device vm001 virtio-blk.xml
error: Failed to attach device from blk-scsi.xml
error: internal error: unable to execute QEMU command 'device_add': Please set scsi=off for virtio-blk devices in order to use virtio 1.0
it means hotplug virtio-blk device failed.
4. Suspend or shutdown VM will leads to qemu crash

Problem happens in virtio_vmstate_change which is called by
vm_state_notify:
vdev’s parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash.
virtio_vmstate_change is added to the list vm_change_state_head at virtio_blk_device_realize(virtio_init),
but after hotplug virtio-blk failed, virtio_vmstate_change will not be removed from vm_change_state_head.
Adding unrealize function of virtio-blk device can solve this problem.

Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-11-16 17:46:53 +02:00
Alexey Kardashevskiy a93c8d828a virtio-pci: Replace modern_as with direct access to modern_bar
The modern bar is accessed now via yet another address space created just
for that purpose and it does not really need FlatView and dispatch tree
as it has a single memory region so it is just a waste of memory. Things
get even worse when there are dozens or hundreds of virtio-pci devices -
since these address spaces are global, changing any of them triggers
rebuilding all address spaces.

This replaces indirect accesses to the modern BAR with a simple lookup
and direct calls to memory_region_dispatch_read/write.

This is expected to save lots of memory at boot time after applying:
[Qemu-devel] [PULL 00/32] Misc changes for 2017-09-22

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15 05:54:44 +03:00
Wolfgang Bumiller 37ef70be6a virtio: fix descriptor counting in virtqueue_pop
While changing the s/g list allocation, commit 3b3b0628
also changed the descriptor counting to count iovec entries
as split by cpu_physical_memory_map(). Previously only the
actual descriptor entries were counted and the split into
the iovec happened afterwards in virtqueue_map().
Count the entries again instead to avoid erroneous
"Looped descriptor" errors.

Reported-by: Hans Middelhoek <h.middelhoek@ospito.nl>
Link: https://forum.proxmox.com/threads/vm-crash-with-memory-hotplug.35904/
Fixes: 3b3b062821 ("virtio: slim down allocation of VirtQueueElements")
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15 05:54:44 +03:00
Eduardo Habkost a5fa336f11 pci: Add interface names to hybrid PCI devices
The following devices support both PCI Express and Conventional
PCI, by including special code to handle the QEMU_PCI_CAP_EXPRESS
flag and/or conditional pcie_endpoint_cap_init() calls:

* vfio-pci (is_express=1, but legacy PCI handled by
  vfio_populate_device())
* vmxnet3 (is_express=0, but PCIe handled by vmxnet3_realize())
* pvscsi (is_express=0, but PCIe handled by pvscsi_realize())
* virtio-pci (is_express=0, but PCIe handled by
  virtio_pci_dc_realize(), and additional legacy PCI code at
  virtio_pci_realize())
* base-xhci (is_express=1, but pcie_endpoint_cap_init() call
  is conditional on pci_bus_is_express(dev->bus)
  * Note that xhci does not clear QEMU_PCI_CAP_EXPRESS like the
    other hybrid devices

Cc: Dmitry Fleytman <dmitry@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15 05:54:42 +03:00
Dr. David Alan Gilbert b81b948ecc virtio/pci/migration: Convert to VMState
Convert the 'modern_state' part of virtio-pci to modern migration
macros.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15 05:54:41 +03:00
Felipe Franciosi 5c0ba1be37 virtio/vhost: reset dev->log after syncing
vhost_log_put() is called to decomission the dirty log between qemu and
a vhost device when stopping the device. Such a call can happen from
migration_completion().

Present code sets dev->log_size to zero too early in vhost_log_put(),
causing the sync check to always return false. As a consequence, the
last pass on the dirty bitmap never happens at the end of migration.

If a vhost device was busy (writing to guest memory) until the last
moments before vhost_virtqueue_stop(), this error will result in guest
memory corruption (at least) following migrations.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-10-15 05:54:41 +03:00
Dr. David Alan Gilbert 2f168d0708 migration: Route more error paths
vmstate_save_state is called in lots of places.
Route error returns from the easier cases back up;  there are lots
of more complex cases where their own error paths need fixing.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-7-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Commit message fix up as Peter's review
2017-09-27 11:44:18 +01:00
Dr. David Alan Gilbert 44b1ff319c migration: pre_save return int
Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.

Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.

Note: If you add an error exit in your pre_save you must emit
an error_report to say why.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-09-27 11:35:59 +01:00
Eric Blake 262a69f428 osdep.h: Prohibit disabling assert() in supported builds
We already have several files that knowingly require assert()
to work, sometimes because refactoring the code for proper
error handling has not been tackled yet; there are probably
other files that have a similar situation but with no comments
documenting the same.  In fact, we have places in migration
that handle untrusted input with assertions, where disabling
the assertions risks a worse security hole than the current
behavior of losing the guest to SIGABRT when migration fails
because of the assertion.  Promote our current per-file
safety-valve to instead be project-wide, and expand it to also
cover glib's g_assert().

Note that we do NOT want to encourage 'assert(side-effects);'
(that is a bad practice that prevents copy-and-paste of code to
other projects that CAN disable assertions; plus it costs
unnecessary reviewer mental cycles to remember whether a project
special-cases the crippling of asserts); and we would LIKE to
fix migration to not rely on asserts (but that takes a big code
audit).  But in the meantime, we DO want to send a message
that anyone that disables assertions has to tweak code in order
to compile, making it obvious that they are taking on additional
risk that we are not going to support.  At the same time, leave
comments mentioning NDEBUG in files that we know still need to
be scrubbed, so there is at least something to grep for.

It would be possible to come up with some other mechanism for
doing runtime checking by default, but which does not abort
the program on failure, while leaving side effects in place
(unlike how crippling assert() avoids even the side effects),
perhaps under the name q_verify(); but it was not deemed worth
the effort (developers should not have to learn a replacement
when the standard C macro works just fine, and it would be a lot
of churn for little gain).  The patch specifically uses #error
rather than #warn so that a user is forced to tweak the header
to acknowledge the issue, even when not using a -Werror
compilation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>

Message-Id: <20170911211320.25385-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 16:20:49 +02:00
Alistair Francis 2ab4b13563 Convert single line fprintf(.../n) to warn_report()
Convert all the single line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.

All of the warnings were changed using this command:
  find ./* -type f -exec sed -i \
    's|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig' \
    {} +

Some of the lines were manually edited to reduce the line length to below
80 charecters.

The #include lines were manually updated to allow the code to compile.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com> [mips]
Message-Id: <ae8f8a7f0a88ded61743dff2adade21f8122a9e7.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-09-19 14:09:34 +02:00
Alex Williamson ee4c112846 vhost: Release memory references on cleanup
vhost registers a MemoryListener where it adds and removes references
to MemoryRegions as the MemoryRegionSections pass through.  The
region_add callback is invoked for each existing section when the
MemoryListener is registered, but unregistering the MemoryListener
performs no reciprocal region_del callback.  It's therefore the
owner of the MemoryListener's responsibility to cleanup any persistent
changes, such as these memory references, after unregistering.

The consequence of this bug is that if we have both a vhost device
and a vfio device, the vhost device will reference any mmap'd MMIO of
the vfio device via this MemoryListener.  If the vhost device is then
removed, those references remain outstanding.  If we then attempt to
remove the vfio device, it never gets finalized and the only way to
release the kernel file descriptors is to terminate the QEMU process.

Fixes: dfde4e6e1a ("memory: add ref/unref calls")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org # v1.6.0+
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-09-08 16:15:17 +03:00
Marc-André Lureau 33c5793bd9 vhost: use QEMU_ALIGN_DOWN
I used the clang-tidy qemu-round check to generate the fix:
https://github.com/elmarco/clang-tools-extra

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-08-31 12:29:07 +02:00
Marc-André Lureau e6a74868d9 build-sys: add --disable-vhost-user
Learn to compile out vhost-user (net, scsi & upcoming users). Keep it
enabled by default on non-win32, that is assumed to be POSIX. Fail if
trying to enable it on win32.

When trying to make a vhost-user netdev, it gives the following error:

-netdev vhost-user,id=foo,chardev=chr-test: Parameter 'type' expects a netdev backend type

And similar error with the HMP/QMP monitors.

While at it, rename CONFIG_VHOST_NET_TEST CONFIG_VHOST_USER_NET_TEST
since it's a vhost-user specific variable.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-08-03 15:55:41 +03:00
Felipe Franciosi 5df04f1762 vhost-user: fix legacy cross-endian configurations
Currently, vhost-user does not implement any means for notifying the
backend about guest endianess. This commit introduces a new message
called VHOST_USER_SET_VRING_ENDIAN which is analogous to the ioctl()
called VHOST_SET_VRING_ENDIAN used for kernel vhost backends. Such
message is necessary for backends supporting legacy (pre-1.0) virtio
devices running in big-endian guests.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Mike Cui <cui@nutanix.com>
2017-08-02 00:13:25 +03:00
Peng Hao 08b9e0ba62 vhost: fix a memory leak
vhost exists a call for g_file_get_contents, but not call g_free.

Signed-off-by: Peng Hao<peng.hao2@zte.com.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-08-02 00:13:25 +03:00
Vladimir Sementsov-Ogievskiy 8908eb1a4a trace-events: fix code style: print 0x before hex numbers
The only exception are groups of numers separated by symbols
'.', ' ', ':', '/', like 'ab.09.7d'.

This patch is made by the following:

> find . -name trace-events | xargs python script.py

where script.py is the following python script:
=========================
 #!/usr/bin/env python

import sys
import re
import fileinput

rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)

files = sys.argv[1:]

for fname in files:
    for line in fileinput.input(fname, inplace=True):
        arr = re.split(rgroup, line)
        for i in range(0, len(arr), 2):
            arr[i] = re.sub(rbad, '0x\g<0>', arr[i])

        sys.stdout.write(''.join(arr))
=========================

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-08-01 12:13:07 +01:00
Philippe Mathieu-Daudé 87e0331c5a docs: fix broken paths to docs/devel/tracing.txt
With the move of some docs/ to docs/devel/ on ac06724a71,
no references were updated.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-07-31 13:12:53 +03:00
Fam Zheng aa8f057e74 virtio-crypto: Convert to DEFINE_PROP_LINK
Unlike other object_property_add_link() occurrences in virtio devices,
virtio-crypto checks the "in use" state of the linked backend object in
addition to qdev_prop_allow_set_link_before_realize. To convert it
without needing to specialize DEFINE_PROP_LINK which always uses the
qdev callback, move the "in use" check to device realize time.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170714021509.23681-10-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:43 +02:00
Fam Zheng d1fd7f775e virtio-rng: Convert to DEFINE_PROP_LINK
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170714021509.23681-9-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:42 +02:00
Fam Zheng 08f1ecd873 virtio-scsi: Convert to DEFINE_PROP_LINK
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170714021509.23681-8-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:42 +02:00
Fam Zheng d679ac09f0 virtio-blk: Convert to DEFINE_PROP_LINK
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170714021509.23681-7-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:42 +02:00
Igor Mammedov 8f5d58ef2c qom: enforce readonly nature of link's check callback
link's check callback is supposed to verify/permit setting it,
however currently nothing restricts it from misusing it
and modifying target object from within.
Make sure that readonly semantics are checked by compiler
to prevent callback's misuse.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170714021509.23681-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:42 +02:00
Maxime Coquelin b9ec9bd468 vhost-user: unregister slave req handler at cleanup time
If the backend sends a request just before closing the socket,
the aio dispatcher might schedule its reading after the vhost
device has been cleaned, leading to a NULL pointer dereference
in slave_read();

vhost_user_cleanup() already closes the socket but it is not
enough, the handler has to be unregistered.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Maxime Coquelin 384b557da1 vhost: ensure vhost_ops are set before calling iotlb callback
This patch fixes a crash that happens when vhost-user iommu
support is enabled and vhost-user socket is closed.

When it happens, if an IOTLB invalidation notification is sent
by the IOMMU, vhost_ops's NULL pointer is dereferenced.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Mao Zhongyi 9a7c2a5970 pci: Make errp the last parameter of pci_add_capability()
Add Error argument for pci_add_capability() to leverage the errp
to pass info on errors. This way is helpful for its callers to
make a better error handling when moving to 'realize'.

Cc: pbonzini@redhat.com
Cc: rth@twiddle.net
Cc: ehabkost@redhat.com
Cc: mst@redhat.com
Cc: jasowang@redhat.com
Cc: marcel@redhat.com
Cc: alex.williamson@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Stefan Hajnoczi c324fd0a39 virtio-pci: use ioeventfd even when KVM is disabled
Old kvm.ko versions only supported a tiny number of ioeventfds so
virtio-pci avoids ioeventfds when kvm_has_many_ioeventfds() returns 0.

Do not check kvm_has_many_ioeventfds() when KVM is disabled since it
always returns 0.  Since commit 8c56c1a592
("memory: emulate ioeventfd") it has been possible to use ioeventfds in
qtest or TCG mode.

This patch makes -device virtio-blk-pci,iothread=iothread0 work even
when KVM is disabled.

I have tested that virtio-blk-pci works under TCG both with and without
iothread.

This patch fixes qemu-iotests 068, which was accidentally merged early
despite the dependency on ioeventfd.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170628184724.21378-7-stefanha@redhat.com
Message-id: 20170615163813.7255-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-30 11:03:45 +01:00
Felipe Franciosi f12c1ebddf vhost-user-scsi: Introduce vhost-user-scsi host device
This commit introduces a vhost-user device for SCSI. This is based
on the existing vhost-scsi implementation, but done over vhost-user
instead. It also uses a chardev to connect to the backend. Unlike
vhost-scsi (today), VMs using vhost-user-scsi can be live migrated.

To use it, start Qemu with a command line equivalent to:

qemu-system-x86_64 \
       -chardev socket,id=vus0,path=/tmp/vus.sock \
       -device vhost-user-scsi-pci,chardev=vus0,bus=pci.0,addr=...

A separate commit presents a sample application linked with libiscsi to
provide a backend for vhost-user-scsi.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Message-Id: <1488479153-21203-4-git-send-email-felipe@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-15 11:18:40 +02:00
Peter Maydell cb8b8ef457 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZMbiwAAoJENro4Ql1lpzlEm0P/RViCB92pz62wdcsjpXvqwBO
 ddhfQqPaGE+BD1tRcrOxDFKTWEnxFN4Oj9zNQ+/FgLcckZ5qAy6PkuCnesG2eJjG
 c43y07e+u89G8L+7zLoAw+fYt8tnb5+ood6Q6WWH9rbNHLaIlMvwzomm3Rkf+1L/
 PKVixcnFBxulTTftLgVUqpFSRrDDlywrmK9gjoDRHeIp7fZTxUib5T720ShszZe5
 C2iym5Ucb5ohEoq4LfY+mPWVGkTdMtsX3Vz5eHsTqoYYY1akTg5CwgGQuqbS/kXN
 7/4nkX6otkxTlU1+ydWEQCpD3orWEUJUeKUlLA48rAckqJyX3mfx94AeIOYseILM
 vMdxQo549ofcann1RtHyPkgfwyl8rrFRZ2xdkYGeSJ2zyv6Ekcxs0r5NXX7DMZBY
 oyjH38UTpHBhZp5l/0KAff0y0FNnDz6JJttgkcqHz8Qd4chE6JP9X9gx+9xkakst
 ytbiFP5NuG9RTwWjFSXObgZl9QU/n+JuJ/kr/LMMTgmPYwONW3NglBGgAoRDu/gC
 4YAmdSWf2UYlQg1IxTn2KT4U4Dn56DEmcaL9yDBH+UJLhWpxle9+D2MVruJN7MPS
 H0/8ytXvmb99DMWOhIdRmOzWCmQ/YAg/CAXmUZPX7sul0b3TNhSDqEi2qnLR84GT
 GSOh5aGbQ2H8DGQioGLV
 =5eL6
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/elmarco/tags/chrfe-pull-request' into staging

# gpg: Signature made Fri 02 Jun 2017 20:12:48 BST
# gpg:                using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/chrfe-pull-request:
  char: move char devices to chardev/
  char: make chr_fe_deinit() optionaly delete backend
  char: rename functions that are not part of fe
  char: move CharBackend handling in char-fe unit
  char: generalize qemu_chr_write_all()
  be-hci: use backend functions
  chardev: serial & parallel declaration to own headers
  chardev: move headers to include/chardev
  Remove/replace sysemu/char.h inclusion
  char-win: close file handle except with console
  char-win: rename hcom->file
  char-win: rename win_chr_init/poll win_chr_serial_init/poll
  char-win: remove WinChardev.len
  char-win: simplify win_chr_read()
  char: cast ARRAY_SIZE() as signed to silent warning on empty array

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-05 10:09:14 +01:00
Maxime Coquelin 6dcdd06e3b spec/vhost-user spec: Add IOMMU support
This patch specifies and implements the master/slave communication
to support device IOTLB in slave.

The vhost_iotlb_msg structure introduced for kernel backends is
re-used, making the design close between the two backends.

An exception is the use of the secondary channel to enable the
slave to send IOTLB miss requests to the master.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau 4bbeeba023 vhost-user: add slave-req-fd support
Learn to give a socket to the slave to let him make requests to the
master.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau 2152f3fead vhost-user: add vhost_user to hold the chr
Next patches will add more fields to the structure

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Maxime Coquelin 020e571b8b vhost: rework IOTLB messaging
This patch reworks IOTLB messaging to prepare for vhost-user
device IOTLB support.

IOTLB messages handling is extracted from vhost-kernel backend,
so that only the messages transport remains backend specifics.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Maxime Coquelin fc58bd0d97 vhost: propagate errors in vhost_device_iotlb_miss()
Some backends might want to know when things went wrong.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Paolo Bonzini b0ac429f13 virtio: add virtqueue_alloc_element tracepoint
This tracepoint can help diagnosing failures due to memory
fragmentation in the guest.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-02 18:57:17 +03:00
Marc-André Lureau 4d43a603c7 char: move CharBackend handling in char-fe unit
Move all the frontend struct and methods to a seperate unit. This avoids
accidentally mixing backend and frontend calls, and helps with readabilty.

Make qemu_chr_replay() a macro shared by both char and char-fe.

Export qemu_chr_write(), and use a macro for qemu_chr_write_all()

(nb: yes, CharBackend is for char frontend :)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:53 +04:00
Marc-André Lureau 8228e353d8 chardev: move headers to include/chardev
So they are all in one place. The following patch will move serial &
parallel declarations to the respective headers.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-06-02 11:33:52 +04:00
Maxime Coquelin 3cf7daf8c3 vhost-user: pass message as a pointer to process_message_reply()
process_message_reply() was recently updated to get full message
content instead of only its request field.

There is no need to copy all the struct content into the stack,
so just pass its pointer as const.

Reviewed-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-25 21:25:28 +03:00
Juan Quintela 68ba3b0743 migration: migration.h was not needed
This files don't use any function from migration.h, so drop it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-05-18 19:20:59 +02:00
Stefan Hajnoczi 2ccbd47c1d migration/next for 20170517
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZHCoMAAoJEPSH7xhYctcjyOcQAN82GDYgXj93k40rU/SmZTP7
 blelisGsY5UNo33bLZq07fVwwdk1vIR0OIZvjMyGVWptAX49QJ6BVwX2E5zmb9LW
 AT3rVeyqz8nnC6OwWBxN9bu+sPJ13ibGs1l2j5Kn9jZ6a9rJCC7LOKdo4Dxbs3Uk
 Obw4f7swsozTQPxeHfrsBgFIvcB8qXLjdxsVhj+IWkmp1KDKVg+TWfNFJx30dK0G
 ktVsV0Xu6exEzcnzpTf93Bcv8vt49JRrCka9N5YryPTZmFuGgW291lqviPWiZg/W
 39F3cga5QfDzcs4Z6Lrz3Qeo/q+2n5G5O23UmrJccZ//UQMdeW9sd5udj211aMeq
 I7UdrarIHWRCCVTVdVL7AGJ8xmMIKHsvKRWstw7FEMHQ+lD/sFSfpWBtYdGhAotF
 mf/yncMKb52QbNyIuanoKi8UjU+RCvuslCac87U3fPqz/qYGvhnmO145S/wai1mR
 +FQQXORJOhdsWDqRRz9q8/uXqPwm173+rHHzMgFa3P1X9u1jfLhjJk0g9sDFtyAb
 If4IzOwfuCLJyelcuzzy9SSOzDsGu1LcrBoRgqTugX+MSJXFjWOKKfA1wxnAKkPf
 T2fQIqny2N7VCfpDB1iaCfxnkizIwrYEI3YRkMuJpYU3489x/BJQIILoLo1yEj4G
 vNhq+qJ9V/Uj8X+X5/cL
 =A5DU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'quintela/tags/migration/20170517' into staging

migration/next for 20170517

# gpg: Signature made Wed 17 May 2017 11:46:36 AM BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* quintela/tags/migration/20170517:
  migration: Move check_migratable() into qdev.c
  migration: Move postcopy stuff to postcopy-ram.c
  migration: Move page_cache.c to migration/
  migration: Create migration/blocker.h
  ram: Rename RAM_SAVE_FLAG_COMPRESS to RAM_SAVE_FLAG_ZERO
  migration: Pass Error ** argument to {save,load}_vmstate
  migration: Fix regression with compression threads

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 10:05:52 +01:00
Greg Kurz 66453cff9e virtio: allow broken device to notify guest
According to section 2.1.2 of the virtio-1 specification:

"The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that
a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET,
the device MUST send a device configuration change notification to the
driver."

Commit "f5ed36635d8f virtio: stop virtqueue processing if device is broken"
introduced a virtio_error() call that just does that:

- internally mark the device as broken
- set the DEVICE_NEEDS_RESET bit in the status
- send a configuration change notification

Unfortunately, virtio_notify_vector(), called by virtio_notify_config(),
returns right away when the device is marked as broken and the notification
isn't sent in this case.

The spec doesn't say whether a broken device can send notifications
in other situations or not. But since the driver isn't supposed to do
anything but to reset the device, it makes sense to keep the check in
virtio_notify_config().

Marking the device as broken AFTER the configuration change notification was
sent is enough to fix the issue.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-18 00:35:15 +03:00
Juan Quintela 795c40b8bd migration: Create migration/blocker.h
This allows us to remove lots of includes of migration/migration.h

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-05-17 12:04:59 +02:00
Zhiyong Yang 60cd11024f hw/virtio: fix vhost user fails to startup when MQ
Qemu2.7~2.9 and vhost user for dpdk 17.02 release work together
to cause failures of new connection when negotiating to set MQ.
(one queue pair works well).
   Because there exist some bugs in qemu code when introducing
VHOST_USER_PROTOCOL_F_REPLY_ACK to qemu. When vhost_user_set_mem_table
is invoked to deal with the vhost message VHOST_USER_SET_MEM_TABLE
for the second time, qemu indeed doesn't send the messge (The message
needs to be sent only once)but still will be waiting for dpdk's reply
ack, then, qemu is always freezing, while DPDK is always waiting for
next vhost message from qemu.
  The patch aims to fix the bug, MQ can work well.
  The same bug is found in function vhost_user_net_set_mtu, it is fixed
at the same time.
  DPDK related patch is as following:
  http://www.dpdk.org/dev/patchwork/patch/23955/

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Cc: qemu-stable@nongnu.org
Fixes: ca525ce561 ("vhost-user: Introduce a new protocol feature REPLY_ACK.")
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-05-10 22:04:23 +03:00
Peter Maydell 32c7e0ab75 migration/next for 20170421
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJY+d69AAoJEPSH7xhYctcj/4oQAIFFEyWaqrL9ve5ySiJgdtcY
 zYtiIhZQ+nPuy2i1oDSX+vbMcmkJDDyfO5qLovxyHGkZHniR8HtxNHP+MkZQa07p
 DiSIvd51HvcixIouhbGcoUCU63AYxqNL3o5/TyNpUI72nvsgwl3yfOot7PtutE/F
 r384j8DrOJ9VwC5GGPg27mJvRPvyfDQWfxDCyMYVw153HTuwVYtgiu/layWojJDV
 D2L1KV45ezBuGckZTHt9y6K4J5qz8qHb/dJc+whBBjj4j9T9XOILU9NPDAEuvjFZ
 gHbrUyxj7EiApjHcDZoQm9Raez422ALU30yc9Kn7ik7vSqTxk2Ejq6Gz7y9MJrDn
 KdMj75OETJNjBL+0T9MmbtWts28+aalpTUXtBpmi3eWQV5Hcox2NF1RP42jtD9Pa
 lkrM6jv0nsdNfBPlQ+ZmBTJxysWECcMqy487nrzmPNC8vZfokjXL5be12puho9fh
 ziU4gx9C6/k82S+/H6WD/AUtRiXJM7j4oTU2mnjrsSXQC1JNWqODBOFUo9zsDufl
 vtcrxfPhSD1DwOInFSIBHf/RylcgTkPCL0rPoJ8npNDly6rHFYJ+oIbsn84Z4uYY
 RWvH8xB9wgRlK9L1WdRgOd2q7PaeHQoSSdPOiS9YVEVMVvSW8Es5CRlhcAsw/M/T
 1Tl65cNrjETAuZKL3dLH
 =EsZ5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging

migration/next for 20170421

# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170421: (65 commits)
  hmp: info migrate_parameters format tunes
  hmp: info migrate_capability format tunes
  migration: rename max_size to threshold_size
  migration: set current_active_state once
  virtio-rng: stop virtqueue while the CPU is stopped
  migration: don't close a file descriptor while it can be in use
  ram: Remove migration_bitmap_extend()
  migration: Disable hotplug/unplug during migration
  qdev: Move qdev_unplug() to qdev-monitor.c
  qdev: Export qdev_hot_removed
  qdev: qdev_hotplug is really a bool
  migration: Remove MigrationState parameter from migration_is_idle()
  ram: Use RAMBitmap type for coherence
  ram: rename last_ram_offset() last_ram_pages()
  ram: Use ramblock and page offset instead of absolute offset
  ram: Change offset field in PageSearchStatus to page
  ram: Remember last_page instead of last_offset
  ram: Use page number instead of an address for the bitmap operations
  ram: reorganize last_sent_block
  ram: ram_discard_range() don't use the mis parameter
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 15:59:27 +01:00
Laurent Vivier a23a6d1839 virtio-rng: stop virtqueue while the CPU is stopped
If we modify the virtio-rng virqueue while the
vmstate is already migrated we can have some
inconsistencies between the virtqueue state and
the memory content.

To avoid this, stop the virtqueue while the CPU
is stopped.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by:  Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:40 +02:00
Peter Xu 698feb5e13 memory: add section range info for IOMMU notifier
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.

This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Jason Wang 375f74f473 vhost: generalize iommu memory region
We assumes the iommu_ops were attached to the root region of address
space. This may not be true for all kinds of IOMMU implementation and
especially after commit 3716d5902d ("pci: introduce a bus master
container"). So fix this by not assuming as->root has iommu_ops,
instead depending on the regions reported by memory listener through:

- register a memory listener to dma_as
- during region_add, if it's a region of IOMMU, register a specific
  IOMMU notifier, and store all notifiers in a list.
- during region_del, compare and delete the IOMMU notifier from the list

This is also a must for making vhost device IOTLB works for all types
of IOMMUs. Note, since we register one notifier during each
.region_add, the IOTLB may be flushed more than one times, this is
suboptimal and could be optimized in the future.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: 3716d5902d ("pci: introduce a bus master container")
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-03-30 19:09:16 +03:00
Paolo Bonzini e49a661840 virtio: always use handle_aio_output if registered
Commit ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is
active", 2016-10-30) and 9ffe337 ("virtio-blk: always use dataplane
path if ioeventfd is active", 2016-10-30) broke the virtio 1.0
indirect access registers.

The indirect access registers bypass the ioeventfd, so that virtio-blk
and virtio-scsi now repeatedly try to initialize dataplane instead of
triggering the guest->host EventNotifier.  Detect the situation by
checking vq->handle_aio_output; if it is not NULL, trigger the
EventNotifier, which is how the device expects to get notifications
and in fact the only thread-safe manner to deliver them.

Fixes: ad07cd6
Fixes: 9ffe337
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-22 17:56:00 +02:00
Fam Zheng a77690c41d virtio: Fix error handling in virtio_bus_device_plugged
For one thing we shouldn't continue if an error happened, for the other
two steps failing can cause an abort() in error_setg because we reuse
the same errp blindly.

Add error handling checks to fix both issues.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-22 17:54:32 +02:00
Marcel Apfelbaum 27ce0f3afc hw/virtio: fix Power Management Control Register for PCI Express virtio devices
Make Power Management State flag writable to conform
with the PCI Express spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:41 +02:00
Marcel Apfelbaum d584f1b9ca hw/virtio: fix Link Control Register for PCI Express virtio devices
Make several Link Control Register flags writable to conform
with the PCI Express spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:41 +02:00
Marcel Apfelbaum c2cabb3422 hw/virtio: fix error enabling flags in Device Control register
When the virtio devices are PCI Express, make error-enabling flags
writable to respect the PCIe spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:40 +02:00
Jason Wang 60a8d80234 virtio-pci: reset modern vq meta data
We don't reset proxy->vqs[].{num|desc[]|avail[]|used[]}. This means if
a driver enable the vq without setting vq address after reset. The old
addresses were leaked. Fixing this by resetting modern vq meta data
during device reset.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:59:18 +02:00
Jason Wang f0edf23978 Revert "virtio: unbreak virtio-pci with IOMMU after caching ring translations"
This reverts commit
96a8821d21. Previous patch is a better
solution which does not require a strict order between virtio and IOMMU.

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-15 19:59:00 +02:00
Jason Wang e45da65322 virtio: validate address space cache during init
We don't check the return value of address_space_cache_init(), this
may lead buggy driver use incorrect region caches. Instead of
triggering an assert, catch and warn this early in
virtio_init_region_cache().

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang e0e2d64409 virtio: destroy region cache during reset
We don't destroy region cache during reset which can make the maps
of previous driver leaked to a buggy or malicious driver that don't
set vring address before starting to use the device. Fix this by
destroy the region cache during reset and validate it before trying to
see them.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang 168e4af3c1 virtio: guard against NULL pfn
To avoid access stale memory region cache after reset, this patch
check the existence of virtqueue pfn for all exported virtqueue access
helpers before trying to use them.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang 96a8821d21 virtio: unbreak virtio-pci with IOMMU after caching ring translations
Commit c611c76417 ("virtio: add MemoryListener to cache ring
translations") registers a memory listener to dma_as. This may not
work when IOMMU is enabled: dma_as(bus_master_as) were initialized in
pcibus_machine_done() after virtio_realize(). This will cause a
segfault. Fixing this by using pci_device_iommu_address_space()
instead to make sure address space were initialized at this time.

With this fix, IOMMU device were required to be initialized before any
virtio-pci devices.

Fixes: c611c76417 ("virtio: add MemoryListener to cache ring translations")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:28 +02:00
Stefan Hajnoczi 874adf45db virtio: add missing region cache init in virtio_load()
Commit 97cd965c07 ("virtio: use
VRingMemoryRegionCaches for avail and used rings") switched to a memory
region cache to avoid repeated map/unmap operations.

The virtio_load() process is a little tricky because vring addresses are
serialized in two separate places.  VIRTIO 1.0 devices serialize desc
and then a subsection with used and avail.  Legacy devices only
serialize desc.

Live migration of VIRTIO 1.0 devices fails on the destination host with:

  VQ 0 size 0x80 < last_avail_idx 0x12f8 - used_idx 0x0
  Failed to load virtio-blk:virtio
  error while loading state for instance 0x0 of device '0000:00:04.0/virtio-blk'

This happens because the memory region cache is only initialized after
desc is loaded and not after the used and avail subsection is loaded.
If the guest chose memory addresses that don't match the legacy ring
layout then the wrong guest memory location is accessed.

Wait until all ring addresses are known before trying to initialize the
region cache.  Also clarify the incomplete comment about VIRTIO-1 ring
address subsection.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
2017-03-02 07:14:28 +02:00
Stefan Hajnoczi 3cdf847329 virtio: invalidate memory in vring_set_avail_event()
Remember to invalidate the avail event field so the memory pages are
marked dirty.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
2017-03-02 07:14:27 +02:00
Cornelia Huck 34c6bf22a8 virtio: guard vring access when setting notification
Switching to vring caches exposed an existing bug in
virtio_queue_set_notification(): We can't access vring structures
if they have not been set up yet. This may happen, for example,
for virtio-blk devices with multiple queues: The code will try to
switch notifiers for every queue, but the guest may have only set up
a subset of them.

Fix this by guarding access to the vring memory by checking for
vring.desc. The first aio poll will iron out any remaining
inconsistencies for later-configured queues (buggy legacy drivers).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:27 +02:00
Paolo Bonzini dd3dd4ba7b virtio: check for vring setup in virtio_queue_empty
If the vring has not been set up, there is nothing in the virtqueue.
virtio_queue_host_notifier_aio_poll calls virtio_queue_empty even in
this case; we have to filter it out just like virtio_queue_notify_aio_vq.

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-03-02 07:14:27 +02:00
Michael S. Tsirkin b4b9862b53 virtio: Fix no interrupt when not creating msi controller
For ARM virt machine, if we use virt-2.7 which will not create ITS node,
the virtio-net can not recieve interrupts so it can't get ip address
through dhcp.
This fixes commit 83d768b(virtio: set ISR on dataplane notifications).

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini 97cd965c07 virtio: use VRingMemoryRegionCaches for avail and used rings
The virtio-net change is necessary because it uses virtqueue_fill
and virtqueue_flush instead of the more convenient virtqueue_push.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini ca0176ad83 virtio: check for vring setup in virtio_queue_update_used_idx
If the vring has not been set up, it is not necessary for vring_used_idx
to do anything (as is already the case when the caller is virtio_load).
This is harmless for now, but it will be a problem when the
MemoryRegionCache has not been set up.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini 991976f751 virtio: use VRingMemoryRegionCaches for descriptor ring
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini c611c76417 virtio: add MemoryListener to cache ring translations
The cached translations are RCU-protected to allow efficient use
when processing virtqueues.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini 5eba0404b9 virtio: use MemoryRegionCache to access descriptors
For now, the cache is created on every virtqueue_pop.  Later on,
direct descriptors will be able to reuse it.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Paolo Bonzini 9796d0ac8f virtio: use address_space_map/unmap to access descriptors
This makes little difference, but it makes the code change smaller
for the next patch that introduces MemoryRegionCache.  This is
because map/unmap are similar to MemoryRegionCache init/destroy.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Fam Zheng 0793169870 virtio: Report real progress in VQ aio poll handler
In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()"
cases are making true progress.

Currently the offending one is virtio-scsi event queue, whose handler
does nothing if no event is pending. As a result aio_poll() will spin on
the "non-empty" VQ and take 100% host CPU.

Fix this by reporting actual progress from virtio queue aio handlers.

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Ed Swierk <eswierk@skyportsystems.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-17 21:52:30 +02:00
Michael S. Tsirkin d56ec1e98c vhost: skip ROM sections
vhost does not support RO protections on memory at the moment - adding
ROMs would mean that e.g. a buggy guest might change them in-memory - a
condition from which guest reset does not recover. Not nice.

We also definitely don't want to try logging writes into ROMs -
in particular guests set very high addresses for ROM BARs
so logging these writes would waste a lot of memory.

Maybe ROMs could be supported with the iotlb variant -
not sure, but there seems to be no good reason for virtio
to try to do DMA from ROM. So let's just skip ROM memory.

Suggested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
2017-02-01 03:37:18 +02:00
Paolo Bonzini c25d97c4ff virtio: make virtio_should_notify static
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-01 03:37:18 +02:00
Cao jin ee640c625e pci: Convert msix_init() to Error and fix callers
msix_init() reports errors with error_report(), which is wrong when
it's used in realize().  The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.

For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.

Bonus: add comment for msix_init.

CC: Jiri Pirko <jiri@resnulli.us>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Dmitry Fleytman <dmitry@daynix.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Hannes Reinecke <hare@suse.de>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-02-01 03:37:18 +02:00
Peter Maydell ffb5a69c31 trivial patches for 2017-01-24
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJYh7icAAoJEHAbT2saaT5ZixMH/2qr2TPaAARnTPFzf/mfpHvR
 jYKZary6L//DTCqjrys5zAVzKUg8rCPGwWI2T2FDsos7Ku4MKBBSfDmnabc+iu0P
 7Rkr18dPGi5ozAiHcGzNXivODVrXBqZT3KcJZ1aYo04Bl0xszxO+fWp2B6n9aXIs
 g4HFq98XGXut8Rs7wNcsUOGHTkIupnzxt+TYXFhezRPq/6bRWZj8pPjwiPReZJBP
 w6IhlVkIxsMdW1tpy+Im21aKCWO23mvQYj+ZiS2eb2F/jcSshL9xp1vqlbNU65H1
 w/zQaUE+m0yJhF7sVKM76101vnDJ1DPxiD/45BnF5p/xwiYcUwpS5UG53riFxAA=
 =B6et
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2017-01-24

# gpg: Signature made Tue 24 Jan 2017 20:27:08 GMT
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch: (31 commits)
  hw/isa/isa-bus: Set category of the "isabus-bridge" device
  usb: Set category and description of the MTP device
  gdbstub.c: update old error report statements
  gdbstub.c: fix GDB connection segfault caused by empty machines
  scsi-disk: add 'fall through' comment to switch VERIFY cases
  Drop duplicate display option documentation
  hw/display/framebuffer.c: Avoid overflow for framebuffers > 4GB
  win32: use glib gpoll if glib >= 2.50
  util/mmap-alloc: refactor a little bit for readability
  util/mmap-alloc: check parameter before using
  vfio: remove a duplicated word in comments
  docs: sync pci-ids.txt
  disas/cris.c: Fix Coverity warning about unchecked NULL
  lm32: milkymist-tmu2: fix another integer overflow
  hw/i386/kvmvapic: Remove dead code in patch_hypercalls()
  doc/usb2: fix typo
  qga: fix erroneous argument to strerror
  block: remove dead check
  pci-assign: avoid pointless stat
  qemu-img: remove dead check
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-01-25 10:42:26 +00:00
Stefan Weil b12227afb1 hw: Fix typos found by codespell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-01-24 23:26:52 +03:00
Ashijeet Acharya fe44dc9180 migration: disallow migrate_add_blocker during migration
If a migration is already in progress and somebody attempts
to add a migration blocker, this should rightly fail.

Add an errp parameter and a retcode return value to migrate_add_blocker.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-5-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Merged with recent 'Allow invtsc migration' change
2017-01-24 18:00:30 +00:00
Jianjun Duan 2c21ee769e migration: extend VMStateInfo
Current migration code cannot handle some data structures such as
QTAILQ in qemu/queue.h. Here we extend the signatures of put/get
in VMStateInfo so that customized handling is supported. put now
will return int type.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-2-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-01-24 17:54:47 +00:00
Peter Maydell 598cf1c805 * QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
 * Memory leak fixes (Li Qiang, me)
 * Ctrl-a b regression (Marc-André)
 * Stubs cleanups and fixes (Leif, me)
 * hxtool tweak (me)
 * HAX support (Vincent)
 * QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
 * PC_COMPAT_2_8 fix (Marcelo)
 * stronger bitmap assertions (Peter)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYggc9FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 5pMH/092iVHw1la8VmphQd8W7hkCHckvVbwaEJ+n4BP8MjeUNmYFJX+op9Qlpqfe
 ekYqQgK69v2UwuofVK2gqS+Y2EyFHivTESk5pS3SM3lTewV1fzCM/HVG3pTxV/ol
 V+eBnp+shrfNG3Eg7YThTqx4LkDUp24Pd3HJVblQZMVpqGzL2xUuUQzSf8F/eeQJ
 xO61pm0ovpCY5MCg3kPLx8GIkPAmcXo5jhMCTz5aLnQW6TO/mwx271a4UE2RTLZ7
 cFjNhxdGSzlnn2RwId4HVYWGU42taW6mpa8NX1hVVUXa1A2qlAfi5N/WLaH0aGYR
 J5ZTIaXdPUBx2SrUmd8udj4a818=
 =H5BQ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
* Memory leak fixes (Li Qiang, me)
* Ctrl-a b regression (Marc-André)
* Stubs cleanups and fixes (Leif, me)
* hxtool tweak (me)
* HAX support (Vincent)
* QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
* PC_COMPAT_2_8 fix (Marcelo)
* stronger bitmap assertions (Peter)

# gpg: Signature made Fri 20 Jan 2017 12:49:01 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (35 commits)
  pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8
  bitmap: assert that start and nr are non negative
  Revert "win32: don't run subprocess tests on Mingw32 platform"
  hax: add Darwin support
  Plumb the HAXM-based hardware acceleration support
  target/i386: Add Intel HAX files
  kvm: move cpu synchronization code
  KVM: PPC: eliminate unnecessary duplicate constants
  ramblock-notifier: new
  char: fix ctrl-a b not working
  exec: Add missing rcu_read_unlock
  x86: ioapic: fix fail migration when irqchip=split
  x86: ioapic: dump version for "info ioapic"
  x86: ioapic: add traces for ioapic
  hxtool: emit Texinfo headings as @subsection
  qemu-thread: fix qemu_thread_set_name() race in qemu_thread_create()
  serial: fix memory leak in serial exit
  scsi-block: fix direction of BYTCHK test for VERIFY commands
  pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged
  acpi: filter based on CONFIG_ACPI_X86 rather than TARGET
  ...

# Conflicts:
#	include/hw/i386/pc.h
2017-01-20 16:42:07 +00:00
Peter Maydell d1c82f7cc3 First set of s390x patches for 2.9:
- rework of the zpci code, giving us proper multibus support
 - introduction of the 2.9 machine
 - fixes and improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYgdReAAoJEN7Pa5PG8C+vDggP/i3eviyb2mFlnIiwazlAfBuw
 Uc6vBFDh/WWMthpzHl4PF+yujM3XbuvUN3VejdnqWLQ1PYq2p3n7rHNlR2XlBovu
 f8l2LpPZGsj1VtAr1QGBj5ipOmRs3qydXY7EDCKORbKuPeor1VW7TbeaKbfpvpZM
 rZHWMlV1UGA6kxM/B+zd9+kxBM3IYnHy3o+Gaq+cfuKyc0VRWRJmalqonjkR7EZj
 InaIyOtGonpPTlMD1GTbM71Wx/NnCugYUEX1Eq4yHX4DV15rM3B83LgTJu72txzr
 ObJmzT3XU2DKwtzo87Y6cWJ3GoxQQbwgiU6VL+l8JVtrzGfllpUdcdInQjSqxXp2
 OW8NuV6Ie02YOrczBXbBAv46PKmoLTf63hvsC4f6nNLa2O6FqxAXzYGKtOpvgOq5
 j1Q6VyzAb/vbyyW2lyMice4XJXGMxitaMGxvJG0lq/iscRpNdpz6E+dgkzO7lieF
 +ETpDsGd5miMdsAUqmIREjBCCjOzOGpC4WX0mg8Te8LmR3Rt8WYIgWuowMvbq2iG
 /qmv9a8ea2XqB+/g2ta+YqS9cPChsPJSN03Q0bo1244DMwBKuVwyXNsC9lRIkiHJ
 4b1Msoseohv9D4ghU8q6gSOU+T5nxLRT1TWBByqhkONU1C4UyKHEblop/c1oHE5k
 UZtiaQvyWFhVU4QtXeE8
 =fzmu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170120-v2' into staging

First set of s390x patches for 2.9:
- rework of the zpci code, giving us proper multibus support
- introduction of the 2.9 machine
- fixes and improvements

# gpg: Signature made Fri 20 Jan 2017 09:11:58 GMT
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170120-v2:
  virtio-ccw: fix ring sizing
  s390x/pci: merge msix init functions
  s390x/pci: handle PCIBridge bus number
  s390x/pci: use hashtable to look up zpci via fh
  s390x/pci: PCI multibus bridge handling
  s390x/pci: optimize calling s390_get_phb()
  s390x/pci: change the device array to a list
  s390x/pci: dynamically allocate iommu
  s390x/pci: make S390PCIIOMMU inherit Object
  s390x/kvm: use kvm_gsi_routing_enabled in flic
  s390x: add compat machine for 2.9
  s390x: remove double compat statement

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-01-20 15:53:58 +00:00
Michael S. Tsirkin 8c797e758a virtio-ccw: fix ring sizing
Current code seems to assume ring size is
always decreased but this is not required by spec:
what spec says is just that size can not exceed
the maximum. Fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <1484256243-1982-1-git-send-email-mst@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-01-20 10:02:02 +01:00
Jason Wang 2943b53f68 virtio: force VIRTIO_F_IOMMU_PLATFORM
We allow vhost to clear VIRITO_F_IOMMU_PLATFORM which is wrong since
VIRTIO_F_IOMMU_PLATFORM is mandatory for security. Fixing this by
enforce it after vdc->get_features().

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-19 23:00:31 +02:00
Michael S. Tsirkin 6bdc21c050 virtio: fix up max size checks
Coverity reports that ARRAY_SIZE(elem->out_sg) (and all the others too)
is wrong because elem->out_sg is a pointer.

However, the check is not in the right place and the max_size argument
of virtqueue_map_iovec can be removed.  The check on in_num/out_num
should be moved to qemu_get_virtqueue_element instead, before the call
to virtqueue_alloc_element.

Cc: qemu-stable@nongnu.org
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes: 3724650db0 ("virtio: introduce virtqueue_alloc_element")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-01-19 23:00:31 +02:00
Michael S. Tsirkin 7e71da7f12 virtio-mmio: switch to linux headers
Switch to virtio_mmio.h from Linux - will make it
easier to implement virtio 1.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-18 22:59:53 +02:00
Michael S. Tsirkin 1aea7a5b7e virtio: drop an obsolete comment
virtio core has code to revert queue number
to maximum on reset. Drop TODO to add that.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-18 22:59:53 +02:00
Jason Wang c471ad0e9b vhost_net: device IOTLB support
This patches implements Device IOTLB support for vhost kernel. This is
done through:

1) switch to use dma helpers when map/unmap vrings from vhost codes
2) introduce a set of VhostOps to:
   - setting up device IOTLB request callback
   - processing device IOTLB request
   - processing device IOTLB invalidation
2) kernel support for Device IOTLB API:

- allow vhost-net to query the IOMMU IOTLB entry through eventfd
- enable the ability for qemu to update a specified mapping of vhost
- through ioctl.
- enable the ability to invalidate a specified range of iova for the
  device IOTLB of vhost through ioctl. In x86/intel_iommu case this is
  triggered through iommu memory region notifier from device IOTLB
  invalidation descriptor processing routine.

With all the above, kernel vhost_net can co-operate with userspace
IOMMU. For vhost-user, the support could be easily done on top by
implementing the VhostOps.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-18 22:59:53 +02:00
Stefan Hajnoczi 1448c133e1 virtio: disable notifications again after poll succeeded
While AioContext is in polling mode virtqueue notifications are not
necessary.  Some device virtqueue handlers enable notifications.  Make
sure they stay disabled to avoid unnecessary vmexits.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-18 22:59:53 +02:00
Stefan Hajnoczi 332fa82d09 Revert "virtio: turn vq->notification into a nested counter"
This reverts commit aff8fd18f1.

Both virtio-net and virtio-crypto do not balance
virtio_queue_set_notification() enable and disable calls.  This makes
the notifications_disabled counter unreliable and Doug Goldstein
reported the following assertion failure:

  #3  0x00007ffff44d1c62 in __GI___assert_fail (
      assertion=assertion@entry=0x555555ae8e8a "vq->notification_disabled > 0",
      file=file@entry=0x555555ae89c0 "/home/doug/work/qemu/hw/virtio/virtio.c",
      line=line@entry=215,
      function=function@entry=0x555555ae9630 <__PRETTY_FUNCTION__.43707>
      "virtio_queue_set_notification") at assert.c:101
  #4  0x00005555557f25d6 in virtio_queue_set_notification (vq=0x55555666aa90,
      enable=enable@entry=1) at /home/doug/work/qemu/hw/virtio/virtio.c:215
  #5  0x00005555557dc311 in virtio_net_has_buffers (q=<optimized out>,
      q=<optimized out>, bufsize=102)
      at /home/doug/work/qemu/hw/net/virtio-net.c:1008
  #6  virtio_net_receive (nc=<optimized out>, buf=0x555557386b88 "", size=102)
      at /home/doug/work/qemu/hw/net/virtio-net.c:1148
  #7  0x00005555559cad33 in nc_sendv_compat (flags=<optimized out>, iovcnt=1,
      iov=0x7fffead746d0, nc=0x55555788b340) at net/net.c:705
  #8  qemu_deliver_packet_iov (sender=<optimized out>, flags=<optimized out>,
      iov=0x7fffead746d0, iovcnt=1, opaque=0x55555788b340) at net/net.c:732
  #9  0x00005555559cd929 in qemu_net_queue_deliver (size=<optimized out>,
      data=<optimized out>, flags=<optimized out>, sender=<optimized out>,
      queue=0x55555788b550) at net/queue.c:164
  #10 qemu_net_queue_flush (queue=0x55555788b550) at net/queue.c:261

This patch is safe to revert since it's just an optimization for
virtqueue polling.  The next patch will improve the situation again
without resorting to nesting.

Reported-by: Doug Goldstein <cardoe@cardoe.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-18 22:59:53 +02:00
Paolo Bonzini 4a3f03ba8d virtio-net: enable ioeventfd even if vhost=off
virtio-net-pci does not enable ioeventfd for historical reasons (and
nobody ever checked whether it should be revisited).  Note that other
backends do enable ioeventfd for virtio-net.

However, it has a major effect on performance.  On Windows, throughput is
_multiplied_ by 2 or 3 on TCP_STREAM (on small packets it is "only" a 30%
improvement) and a little less so on TCP_MAERTS albeit still very much
statistically significant.  Latency also has a single digit improvement.

This is not visible when using vhost, which forces ioeventfd=on, but it
is substantial without vhost.  In addition, also on Windows and with the
RHEL 7.3 kernel, APICv seems to slow down virtio-net performance a bit,
but the penalty with this patch goes from -25% to -7%.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-18 22:59:53 +02:00
Paolo Bonzini d6da1e9eca event_notifier: cleanups around event_notifier_set_handler
Remove the useless is_external argument.  Since the iohandler
AioContext is never used for block devices, aio_disable_external
is never called on it.  This lets us remove stubs/iohandler.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Paolo Bonzini a0f80010b3 stubs: move vhost stubs to stubs/vhost.o
No need to include them in libqemustub.a, since only system emulators
need them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Maxime Coquelin c5f048d8fb vhost-user: Add MTU protocol feature and op
This patch implements VHOST_USER_PROTOCOL_F_NET_MTU
protocol feature and VHOST_USER_NET_SET_MTU request so
that the backend gets notified of the user defined host
MTU.

If backend supports VHOST_USER_PROTOCOL_F_REPLY_ACK,
QEMU assumes MTU is valid if success is returned.

Vhost-net driver sends this request through a new
vhost_net_set_mtu vhost_ops entry.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Aaron Conole <aconole@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:53 +02:00
Yuri Benditovich 54e17709ac virtio: Introduce virtqueue_drop_all procedure
Add procedure for fast drop of queued packets, acting like
pop and push without mapping the buffers into memory.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:53 +02:00
Yuri Benditovich aa94d52142 net: vhost stop updates virtio queue state
Make virtio queue suitable for push operation from qemu
after vhost was stopped.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:53 +02:00
Yuri Benditovich 312d3b3534 net: Add virtio queue interface to update used index from vring state
Bring virtio queue to correct internal  state for host-to-guest
operations when vhost is temporary stopped.

Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:53 +02:00
Dr. David Alan Gilbert f2fd57db36 balloon: Don't balloon roms
A broken guest can specify physical addresses that correspond
to any memory region, but it shouldn't be able to change ROM.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Halil Pasic e66bcc4081 virtio: fix vq->inuse recalc after migr
Correct recalculation of vq->inuse after migration for the corner case
where the avail_idx has already wrapped but used_idx not yet.

Also change the type of the VirtQueue.inuse to unsigned int. This is
done to be consistent with other members representing sizes (VRing.num),
and because C99 guarantees max ring size < UINT_MAX but does not
guarantee max ring size < INT_MAX.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: bccdef6b ("virtio: recalculate vq->inuse after migration")
CC: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei 02ed3e7c16 virtio-crypto: zeroize the key material before free
Common practice with sensitive information (key material, passwords,
etc). Prevents sensitive information from being exposed by accident later in
coredumps, memory disclosure bugs when heap memory is reused, etc.

Sensitive information is sometimes also held in mlocked pages to prevent
it being swapped to disk but that's not being done here.

Let's zeroize the memory of CryptoDevBackendSymOpInfo structure pointed
for key material security.

[Thanks to Stefan for help with crafting the commit message]

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei ef69d971cd virtio-crypto-pci: tag virtio-crypto device hot pluggable
After resolving the relationship with cryptodev backend,
the virtio crypto device supports hotplug now.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei b89f8c80cc virtio-crypto: avoid one cryptodev device is used by multiple virtio crypto devices
Add the check condition for cryptodev device in order
to avoid one cryptodev device is used by multiple
virtio crypto devices.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei 305f5131ac virtio-crypto-pci: add check for cryptodev object
We must assure each virtio crypto pci device has
an vaild cryptodev backend object.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei 6138dbda5a cryptodev: wrap the ready flag
The ready flag should be set by the children of
cryptodev backend interface. Warp the setter/getter
functions for it.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei 46fd170545 cryptodev: introduce a new is_used property
This property is used to Tag the cryptodev backend
is used by virtio-crypto or not. Making cryptodev
can't be hot unplugged when it's in use. Cleanup
resources when cryptodev is finalized.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Gonglei c159a4d1d0 virtio-crypto: use the correct length for cipher operation
In some modes of cipher algorithms, the length of destination data
maybe larger then source data, such as ciphertext stealing (CTS).

For symmetric algorithms, the length of ciphertext is definitly
equal to the plaintext for each crypto operation. So we should
use the src_len instead of dst_len avoid to pass the incorrect
cryptographical results to the frontend driver.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 07:02:52 +02:00
Jason Wang 615c4ed205 virtio-pci: address space translation service (ATS) support
This patches enable the Address Translation Service support for virtio
pci devices. This is needed for a guest visible Device IOTLB
implementation and will be required by vhost device IOTLB API
implementation for intel IOMMU.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:59 +02:00
Jason Wang 8607f5c307 virtio: convert to use DMA api
Currently, all virtio devices bypass IOMMU completely. This is because
address_space_memory is assumed and used during DMA emulation. This
patch converts the virtio core API to use DMA API. This idea is

- introducing a new transport specific helper to query the dma address
  space. (only pci version is implemented).
- query and use this address space during virtio device guest memory
  accessing when iommu platform (VIRTIO_F_IOMMU_PLATFORM) was enabled
  for this device.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:58 +02:00
Gonglei a08aaff811 virtio-crypto: fix possible integer and heap overflow
Because the 'size_t' type is 4 bytes in 32-bit platform, which
is the same with 'int'. It's easy to make 'max_len' to zero when
integer overflow and then cause heap overflow if 'max_len' is zero.

Using uint_64 instead of size_t to avoid the integer overflow.

Cc: qemu-stable@nongnu.org
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Tested-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-01-10 05:56:58 +02:00
Peter Maydell 77424a452a virtio, vhost, pc: fixes
Here are some bugfixes that didn't make 2.8.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJYVFkHAAoJECgfDbjSjVRpdc0H/1JMgQn0/J6vjKfeRY7720y8
 /Bihf4gjKN2bPtu6pTGY1KQBvK76ShyvZZBFCa5bf8a4V9HP4BgSQ8mQ7ZUURzJJ
 OslYbSzK1R2LiWJ40e9xdFOoKWKB3lK1lBF/Xb8QPZPoJ0D50Fo2xpymt4hZFdkF
 oSnXxHmYoKMsMmmqJZd3aaqyffFLYGmcm1dbJOxninwu/nBzOBY2SQmmaaacSoGn
 3D5988i6OcD1qbavfk4rNCQV4avJA+H7c/FmXH3WarPE8M9/jgnAaUDdknFicUA8
 hGwMsLrO/7sMnMYtSQIxWg743LhrAm93HCr2u/TgLzhYbPydmLCpup5eAWj+jvo=
 =nN2D
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio, vhost, pc: fixes

Here are some bugfixes that didn't make 2.8.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 16 Dec 2016 21:13:43 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  virtio: avoid using guest_notifier_mask in vhost-user mode
  pci: fix error message for express slots
  i386: amd_iommu: fix MMIO register count and access
  tests/vhost-user-bridge: use contrib/libvhost-user
  contrib: add libvhost-user
  tests/vhost-user-bridge: do not accept more than one connection
  tests/vhost-user-bridge: indicate peer disconnected
  tests/vhost-user-bridge: remove unnecessary dispatcher_remove
  tests/vhost-user-bridge: remove false comment

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-01-09 15:30:45 +00:00
Stefan Hajnoczi a7c8215e3b virtio: disable virtqueue notifications during polling
This is a performance optimization to eliminate vmexits during polling.
It also avoids spurious ioeventfd processing after polling ends.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-12-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03 16:38:50 +00:00
Stefan Hajnoczi aff8fd18f1 virtio: turn vq->notification into a nested counter
Polling should disable virtqueue notifications but that requires nested
virtio_queue_set_notification() calls.  Turn vq->notification into a
counter so it is possible to do nesting.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-10-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03 16:38:49 +00:00
Stefan Hajnoczi 0062ea0fd6 virtio: poll virtqueues for new buffers
Add an AioContext poll handler to detect new virtqueue buffers without
waiting for a guest->host notification.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03 16:38:48 +00:00
Stefan Hajnoczi f6a51c84cd aio: add AioPollFn and io_poll() interface
The new AioPollFn io_poll() argument to aio_set_fd_handler() and
aio_set_event_handler() is used in the next patch.

Keep this code change separate due to the number of files it touches.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-03 16:38:48 +00:00
Wei Huang 2858bc6870 virtio: avoid using guest_notifier_mask in vhost-user mode
Because guest mask notifier cannot be used in vhost-user mode, a boolean
flag "use_guest_notifier_mask" was added in commit 5669655aaf to disable
the use of guest mask notifier under virtio-pci. However this flag wasn't
checked in other virtio devices, such as virtio-mmio. In our tests, it
caused assertion error under "vhost-user + virtio-mmio". This patch
addresses this problem by adding a check before guest_notifier_mask is
called.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-12-16 01:14:54 +02:00
Maxime Coquelin 66d1c4c19f virtio-pci: Fix cross-version migration with older machines
This patch fixes a cross-version migration regression introduced
by commit d1b4259f ("virtio-bus: Plug devices after features are
negotiated").

The problem is encountered when host's vhost backend does not support
VIRTIO_F_VERSION_1, and migration is initiated from a v2.7 or prior
machine with virtio-pci modern capabilities enabled to a v2.8 machine.

In this case, modern capabilities get exposed to the guest by the source,
whereas the target will detect version 1 is not supported so will only
expose legacy capabilities.

The problem is fixed by introducing a new "x-ignore-backend-features"
property, which is set in v2.7 and prior compatibility modes. Doing this,
v2.7 machine keeps its broken behaviour (enabling modern while version
is not supported), and newer machines will behave correctly.

Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-id: 20161214163035.3297-1-maxime.coquelin@redhat.com
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-12-15 07:35:19 +00:00
Gonglei 9730280d54 virtio-crypto: fix uninitialized variables
Though crypto_cfg.reserve is an unused field, let me
initialize the structure in order to make coverity happy.

*** CID 1365923:  Uninitialized variables  (UNINIT)
/hw/virtio/virtio-crypto.c: 851 in virtio_crypto_get_config()
845         stl_le_p(&crypto_cfg.mac_algo_h, c->conf.mac_algo_h);
846         stl_le_p(&crypto_cfg.aead_algo, c->conf.aead_algo);
847         stl_le_p(&crypto_cfg.max_cipher_key_len, c->conf.max_cipher_key_len);
848         stl_le_p(&crypto_cfg.max_auth_key_len, c->conf.max_auth_key_len);
849         stq_le_p(&crypto_cfg.max_size, c->conf.max_size);
850
>>>     CID 1365923:  Uninitialized variables  (UNINIT)
>>>     Using uninitialized value "crypto_cfg". Field "crypto_cfg.reserve"
       is uninitialized when calling "memcpy".
      [Note: The source code implementation of the function
       has been overridden by a builtin model.]
851         memcpy(config, &crypto_cfg, c->config_size);
852     }
853

Rported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-30 04:22:18 +02:00
Paolo Bonzini 83d768b564 virtio: set ISR on dataplane notifications
Dataplane has been omitting forever the step of setting ISR when
an interrupt is raised.  This caused little breakage, because the
specification actually says that ISR may not be updated in MSI mode.

Some versions of the Windows drivers however didn't clear MSI mode
correctly, and proceeded using polling mode (using ISR, not the used
ring index!) for crashdump and hibernation.  If it were just crashdump
and hibernation it would not be a big deal, but recent releases of
Windows do not really shut down, but rather log out and hibernate to
make the next startup faster.  Hence, this manifested as a more serious
hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs.
Newer versions fixed this, while older versions do not use MSI at all.

The failure has always been there for virtio dataplane, but it became
visible after commits 9ffe337 ("virtio-blk: always use dataplane path
if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always
use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk
and virtio-scsi always use the dataplane code under KVM.  The good news
therefore is that it was not a bug in the patches---they were doing
exactly what they were meant for, i.e. shake out remaining dataplane bugs.

The fix is not hard, so it's worth arranging for the broken drivers.
The virtio_should_notify+event_notifier_set pair that is common to
virtio-blk and virtio-scsi dataplane is replaced with a new public
function virtio_notify_irqfd that also sets ISR.  The irqfd emulation
code now need not set ISR anymore, so virtio_irq is removed.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18 17:29:25 +02:00
Paolo Bonzini 0687c37c5e virtio: access ISR atomically
This will be needed once dataplane will be able to set it outside
the big QEMU lock.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18 17:29:25 +02:00
Paolo Bonzini 310837de6c virtio: introduce grab/release_ioeventfd to fix vhost
Following the recent refactoring of virtio notifiers [1], more specifically
the patch ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to
start/stop ioeventfd") that uses virtio_bus_set_host_notifier [2]
by default, core virtio code requires 'ioeventfd_started' to be set
to true/false when the host notifiers are configured.

When vhost is stopped and started, however, there is a stop followed by
another start. Since ioeventfd_started was never set to true, the 'stop'
operation triggered by virtio_bus_set_host_notifier() will not result
in a call to virtio_pci_ioeventfd_assign(assign=false). This leaves
the memory regions with stale notifiers and results on the next start
triggering the following assertion:

  kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
  Aborted

This patch reintroduces (hopefully in a cleaner way) the concept
that was present with ioeventfd_disabled before the refactoring.
When ioeventfd_grabbed>0, ioeventfd_started tracks whether ioeventfd
should be enabled or not, but ioeventfd is actually not started at
all until vhost releases the host notifiers.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07748.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07760.html

Reported-by: Felipe Franciosi <felipe@nutanix.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd")
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-18 17:29:25 +02:00
Stefan Hajnoczi 600f5ce356 virtio-crypto: fix virtio_queue_set_notification() race
We must check for new virtqueue buffers after re-enabling notifications.
This prevents the race condition where the guest added buffers just
after we stopped popping the virtqueue but before we re-enabled
notifications.

I think the virtio-crypto code was based on virtio-net but this crucial
detail was missed.  virtio-net does not have the race condition because
it processes the virtqueue one more time after re-enabling
notifications.

Cc: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
2016-11-18 17:14:10 +02:00
Greg Kurz 435346d748 virtio: drop virtio_queue_get_ring_{size,addr}()
These are not used anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:38 +02:00
Greg Kurz 1cdce7c54d vhost: drop legacy vring layout bits
The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.

This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:

hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
{
    return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
           virtio_queue_get_used_size(vdev, n);
}

and the mapping fails.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:38 +02:00
Greg Kurz f1f9e6c596 vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.

In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.

This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.

This works for legacy mappings as well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:38 +02:00
Rafael David Tinoco 0d34fbabc1 vhost: migration blocker only if shared log is used
Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).

Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:37 +02:00
Michael S. Tsirkin 9b706dbbbb virtio: allow per-device-class legacy features
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.

Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-11-15 17:20:36 +02:00
Gonglei 6e724d9d99 virtio-crypto: tag as not hotpluggable and migration
Currently the virtio-crypto device hasn't supported
hotpluggable and live migration well. Let's tag it
as not hotpluggable and migration actively and reopen
them once we support them well.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:36 +02:00
Ladi Prosek bf91bd2792 virtio: make virtqueue_alloc_element static
The function does not fully initialize the returned VirtQueueElement and should
be used only internally from the virtio module.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:36 +02:00
Ladi Prosek 27e57efe32 virtio: rename virtqueue_discard to virtqueue_unpop
The function undoes the effect of virtqueue_pop and doesn't do anything
destructive or irreversible so virtqueue_unpop is a more fitting name.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-15 17:20:36 +02:00
Gonglei 20cb2ffd5f virtio-crypto: using bh to handle dataq's requests
Make crypto operations are executed asynchronously,
so that other QEMU threads and monitor couldn't
be blocked at the virtqueue handling context.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Gonglei d6634ac09a cryptodev: introduce an unified wrapper for crypto operation
We use an opaque point to the VirtIOCryptoReq which
can support different packets based on different
algorithms.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Gonglei 04b9b37edd virtio-crypto: add data queue processing handler
Introduces VirtIOCryptoReq structure to store
crypto request so that we can easily support
asynchronous crypto operation in the future.

At present, we only support cipher and algorithm
chaining.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Gonglei 59c360ca42 virtio-crypto: add control queue handler
Realize the symmetric algorithm control queue handler,
including plain cipher and chainning algorithms.

Currently the control queue is used to create and
close session for symmetric algorithm.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Gonglei 050652d9be virtio-crypto: set capacity of algorithms supported
Expose the capacity of algorithms supported by
virtio crypto device to the frontend driver using
pci configuration space.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Gonglei b307d308c9 virtio-crypto-pci: add virtio crypto pci support
This patch adds virtio-crypto-pci, which is the pci proxy for the virtio
crypto device.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Gonglei ea4d8ac2da virtio-crypto: add virtio crypto device emulation
Introduce the virtio crypto realization, I'll
finish the core code in the following patches. The
thoughts came from virtio net realization.

For more information see:
http://qemu-project.org/Features/VirtioCrypto

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-11-01 19:21:08 +02:00
Paolo Bonzini 2bd3c31a60 virtio: inline set_host_notifier_internal
This is only called from virtio_bus_set_host_notifier.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:21 +02:00
Paolo Bonzini fa283a4a8b virtio: inline virtio_queue_set_host_notifier_fd_handler
Of the three possible parameter combinations for
virtio_queue_set_host_notifier_fd_handler:

- assign=true/set_handler=true is only called from
  virtio_device_start_ioeventfd

- assign=false/set_handler=false is called from
  set_host_notifier_internal but it only does something when
  reached from virtio_device_stop_ioeventfd_impl; otherwise
  there is no EventNotifier set on qemu_get_aio_context().

- assign=true/set_handler=false is called from
  set_host_notifier_internal, but it is not doing anything:
  with the new start_ioeventfd and stop_ioeventfd methods,
  there is never an EventNotifier set on qemu_get_aio_context()
  at this point.  This is enforced by the assertion in
  virtio_bus_set_host_notifier.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:21 +02:00
Paolo Bonzini ed08a2a0ba virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd
ioeventfd_disabled was the only reason for the default
implementation of virtio_device_start_ioeventfd not to use
virtio_bus_set_host_notifier.  This is now fixed, and the sole entry
point to set up ioeventfd can be virtio_bus_set_host_notifier.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:21 +02:00
Paolo Bonzini e616c2f390 virtio: remove ioeventfd_disabled altogether
Now that there is not anymore a switch from the generic ioeventfd handler
to the dataplane handler, virtio_bus_set_host_notifier(assign=true) is
always called with !bus->ioeventfd_started, hence virtio_bus_stop_ioeventfd
does nothing in this case.  Move the invocation to vhost.c, which is the
only place that needs it.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:20 +02:00
Paolo Bonzini 6019f3b966 virtio: remove set_handler argument from set_host_notifier_internal
Make virtio_device_start_ioeventfd_impl use the same logic as
dataplane to set up the host notifier.  This removes the need
for the set_handler argument in set_host_notifier_internal.

This is a first step towards using virtio_bus_set_host_notifier
as the sole entry point to set up ioeventfds.  At least now
the functions have the same interface, but they still differ
in that virtio_bus_set_host_notifier sets ioeventfd_disabled.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:20 +02:00
Paolo Bonzini f1ac6a5522 Revert "virtio: Introduce virtio_add_queue_aio"
This reverts commit 872dd82c83.
virtio_add_queue_aio is unused.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 20:06:20 +02:00
Paolo Bonzini 8e93cef14e virtio: introduce virtio_device_ioeventfd_enabled
This will be used to forbid iothread configuration when the
proxy does not allow using ioeventfd.  To simplify the implementation,
change the direction of the ioeventfd_disabled callback too.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini ff4c07df67 virtio: add start_ioeventfd and stop_ioeventfd to VirtioDeviceClass
Allow customization of the start and stop of ioeventfd.  This will
allow direct start of dataplane without passing through the default
ioeventfd handlers, which in turn allows using the dataplane logic
instead of virtio_add_queue_aio.  It will also enable some code
simplification, because the sole entry point to ioeventfd setup
will be virtio_bus_set_host_notifier.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini b13d396227 virtio: move ioeventfd_started flag to VirtioBusState
This simplifies the code and removes the ioeventfd_started
and ioeventfd_set_started callback.  The only difference is
in how virtio-ccw handles an error---it doesn't disable
ioeventfd forever anymore.  It was the only backend to do
so, and if desired this behavior should be implemented in

virtio-bus.c.

Instead of ioeventfd_started, the ioeventfd_assign callback now
determines whether the virtio bus supports host notifiers.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini 4ddcc2d5cb virtio: move ioeventfd_disabled flag to VirtioBusState
This simplifies the code and removes the ioeventfd_set_disabled
callback.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:32 +02:00
Paolo Bonzini ca2b413c39 virtio: disable ioeventfd as early as possible
Avoid "tricking" virtio-blk-dataplane into thinking that ioeventfd will be
available when it is not.  This bug has always been there, but it will break
TCG+ioeventfd=on once the dataplane code will be always used when ioeventfd=on.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:31 +02:00
Dr. David Alan Gilbert 019518a80e virtio/migration: Migrate balloon to VMState
Replace the load/save with a vmsd.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:31 +02:00
Dr. David Alan Gilbert ea43e25987 virtio/migration: Add VMStateDescription to VirtioDeviceClass
Provide a vmsd pointer for VirtIO devices to use instead of the
load/save methods.

We'll eventually kill off the load/save methods.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-30 19:51:31 +02:00
Marc-André Lureau 5345fdb446 char: use qemu_chr_fe* functions with CharBackend argument
This also switches from qemu_chr_add_handlers() to
qemu_chr_fe_set_handlers(). Note that qemu_chr_fe_set_handlers() now
takes the focus when fe_open (qemu_chr_add_handlers() did take the
focus)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-16-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-24 15:27:21 +02:00
Peter Maydell 627eae7d72 virtio, pc: fixes and features
more guest error handling for virtio devices
 virtio migration rework
 pc fixes
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJX+tUfAAoJECgfDbjSjVRpIGMH/Ri+bnKF9zD6jQXfzYY+neSF
 SqR0BsFUqR+8C1Yxx45tFRC/kMpJy3n5PZunoDwAXcSlN/uoWvzp05/s44praFDc
 5FDcj3SvFhvOpBFnO5sTMBTkmGOCG/f/lnej+Fea0X8KjtOvVE6Yxek8CS+/dS3K
 t70hxLaTO93Z63olOxhAZSVX9wYKLovB0PXAu9Uj9LsnXl8o8gQLxM9WgKnI/0vD
 1V/ZGZY0lfFaHrvIgkgKy3/L7QJ91A/jU9jypNJOEdV52EDfkV97hA2ibcIQ+7Y1
 w/S3gzVmKM3dtxdS9DiQJ3riBT8XcPUWI6sIEjpfKGFGoOjazai3m9e3bcEx3Rg=
 =f//+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio, pc: fixes and features

more guest error handling for virtio devices
virtio migration rework
pc fixes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 10 Oct 2016 00:39:11 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (33 commits)
  intel-iommu: Check IOAPIC's Trigger Mode against the one in IRTE
  virtio: cleanup VMSTATE_VIRTIO_DEVICE
  vhost-vsock: convert VMSTATE_VIRTIO_DEVICE
  virtio-rng: convert VMSTATE_VIRTIO_DEVICE
  virtio-balloon: convert VMSTATE_VIRTIO_DEVICE
  virtio-scsi: convert VMSTATE_VIRTIO_DEVICE
  virtio-input: convert VMSTATE_VIRTIO_DEVICE
  virtio-gpu: convert VMSTATE_VIRTIO_DEVICE
  virtio-serial: convert VMSTATE_VIRTIO_DEVICE
  virtio-9p: convert VMSTATE_VIRTIO_DEVICE
  virtio-net: convert VMSTATE_VIRTIO_DEVICE
  virtio-blk: convert VMSTATE_VIRTIO_DEVICE
  virtio: prepare change VMSTATE_VIRTIO_DEVICE macro
  net: don't poke at chardev internal QemuOpts
  virtio-scsi: handle virtio_scsi_set_config() error
  virtio-scsi: convert virtio_scsi_bad_req() to use virtio_error()
  virtio-net: handle virtio_net_flush_tx() errors
  virtio-net: handle virtio_net_receive() errors
  virtio-net: handle virtio_net_handle_ctrl() error
  virtio-blk: handle virtio_blk_handle_request() errors
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-10 16:23:40 +01:00
Halil Pasic 5705653ff8 virtio: cleanup VMSTATE_VIRTIO_DEVICE
Now all the usages of the old version of VMSTATE_VIRTIO_DEVICE are gone,
so we can get rid of the conditionals, and the old macro.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 02:21:43 +03:00
Halil Pasic 81cc8a6566 vhost-vsock: convert VMSTATE_VIRTIO_DEVICE
Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 02:21:43 +03:00
Halil Pasic b7de81f697 virtio-rng: convert VMSTATE_VIRTIO_DEVICE
Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 02:21:43 +03:00
Halil Pasic c5dc16b726 virtio-balloon: convert VMSTATE_VIRTIO_DEVICE
Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 02:21:43 +03:00
Halil Pasic 1a665855d7 virtio: prepare change VMSTATE_VIRTIO_DEVICE macro
In most cases the functions passed to VMSTATE_VIRTIO_DEVICE
only call the virtio_load and virtio_save wrappers. Some include some
pre- and post- massaging too. The massaging is better expressed
as such in the VMStateDescription.

Let us prepare for changing the semantic of the VMSTATE_VIRTIO_DEVICE
macro so that it is more similar to the other VMSTATE_*_DEVICE macros
in a sense that it is a field definition.

The preprocessor conditionals are going to be removed as soon as
every usage is converted to the new semantic.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 02:21:42 +03:00
Stefan Hajnoczi 2640d2a5ff virtio: add virtio_detach_element()
During device reset or similar situations a VirtQueueElement needs to be
freed without pushing it onto the used ring or rewinding the virtqueue.
Extract a new function to do this.

Later patches add virtio_detach_element() calls to existing device so
that scatter-gather lists are unmapped and vq->inuse goes back to zero
during device reset.  Currently some devices don't bother and simply
call g_free(elem) which is not a clean way to throw away a
VirtQueueElement.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 01:16:58 +03:00
Liang Li 17871f71fd virtio-balloon: Remove needless precompiled directive
Since there in wrapper around madvise(), the virtio-balloon
code is able to work without the precompiled directive, the
directive can be removed.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewd-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 01:16:57 +03:00
Chen Fan 7a25126d8a virtio: rename the bar index field name in VirtIOPCIProxy
the bar index names are much similar to the bar memory regions,
distinguish them to improve the code readability.

Signed-off-by: Chen Fan <fan.chen@easystack.cn>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-08 11:25:29 +03:00
Daniel P. Berrange 331f5eb28a trace: move hw/virtio/virtio-balloon.c trace points into correct file
The trace points for hw/virtio/virtio-balloon.c were mistakenly put
in the top level trace-events file, instead of util/trace-events in

  commit 270ab88f7c
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Thu Jun 16 09:39:57 2016 +0100

    trace: split out trace events for hw/virtio/ directory

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1473872624-23285-5-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-09-28 19:17:55 +01:00
Stefan Hajnoczi fb1131b674 virtio: handle virtqueue_get_head() errors
Stop processing the vring if virtqueue_get_head() fetches an
out-of-bounds head index.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:56 +03:00
Stefan Hajnoczi 4355c1abca virtio: handle virtqueue_num_heads() errors
If the avail ring index is bogus virtqueue_num_heads() must return
-EINVAL.

The only caller is virtqueue_get_avail_bytes().  Return saying no bytes
are available when virtqueue_num_heads() fails.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:56 +03:00
Stefan Hajnoczi 412e0e81b1 virtio: handle virtqueue_read_next_desc() errors
Stop processing the vring if an avail ring index is invalid.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:56 +03:00
Stefan Hajnoczi b1c7c07f2d virtio: use unsigned int for virtqueue_get_avail_bytes() index
The virtio code uses int, unsigned int, and uint16_t for virtqueue
indices.  The uint16_t is used for the low-level descriptor layout in
virtio_ring.h while code that isn't concerned with descriptor layout can
use unsigned int.

Use of int is problematic because it can result in signed/unsigned
comparison and incompatible int*/unsigned int* pointer types.

Make the virtqueue_get_avail_bytes() 'i' variable unsigned int.  This
eliminates the need to introduce casts and modify code further in the
patches that follow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:56 +03:00
Stefan Hajnoczi d65abf85e7 virtio: handle virtqueue_get_avail_bytes() errors
If the vring is invalid, tell the caller no bytes are available and mark
the device broken.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:56 +03:00
Stefan Hajnoczi ec55da1924 virtio: handle virtqueue_map_desc() errors
Errors can occur during virtqueue_pop(), especially in
virtqueue_map_desc().  In order to handle this we must unmap iov[]
before returning NULL.  The caller will consider the virtqueue empty and
the virtio_error() call will have marked the device broken.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-23 19:03:55 +03:00
Stefan Hajnoczi 791b1daf72 virtio: migrate vdev->broken flag
Send a subsection if the vdev->broken flag is set.  This allows live
migration of broken virtio devices.

The subsection is only sent if vdev->broken has been set.  In most cases
the flag will be clear and no subsection will be sent.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:55 +03:00
Stefan Hajnoczi f5ed36635d virtio: stop virtqueue processing if device is broken
QEMU prints an error message and exits when the device enters an invalid
state.  Terminating the process is heavy-handed.  The guest may still be
able to function even if there is a bug in a virtio guest driver.

Moreover, exiting is a bug in nested virtualization where a nested guest
could DoS other nested guests by killing a pass-through virtio device.
I don't think this configuration is possible today but it is likely in
the future.

If the broken flag is set, do not process virtqueues or write back used
descriptors.  The broken flag can be cleared again by resetting the
device.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:55 +03:00
Stefan Hajnoczi 8275e2f6be virtio: fix stray tab character
Fix a single occurrence of a tab character in a file that otherwise uses
spaces for indentation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-09-23 19:03:55 +03:00
Prasad J Pandit 973e7170dd virtio: add check for descriptor's mapped address
virtio back end uses set of buffers to facilitate I/O operations.
If its size is too large, 'cpu_physical_memory_map' could return
a null address. This would result in a null dereference while
un-mapping descriptors. Add check to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2016-09-23 18:51:40 +03:00
Maxime Coquelin d1b4259f1a virtio-bus: Plug devices after features are negotiated
Currently, devices are plugged before features are negotiated.
If the backend doesn't support VIRTIO_F_VERSION_1, the transport
needs to rewind some settings.

This is the case for CCW, for which a post_plugged callback had
been introduced, where max_rev field is just updated if
VIRTIO_F_VERSION_1 is not supported by the backend.
For PCI, implementing post_plugged would be much more
complicated, so it needs to know whether the backend supports
VIRTIO_F_VERSION_1 at plug time.

Currently, nothing is done for PCI. Modern capabilities get
exposed to the guest even if VIRTIO_F_VERSION_1 is not supported
by the backend, which confuses the guest.

This patch replaces existing post_plugged solution with an
approach that fits with both transports.
Features negotiation is performed before ->device_plugged() call.
A pre_plugged callback is introduced so that the transports can
set their supported features.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com> [ccw]
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-15 17:30:03 +03:00
Stefan Hajnoczi fc0b9b0e1c vhost-vsock: add virtio sockets device
Implement the new virtio sockets device for host<->guest communication
using the Sockets API.  Most of the work is done in a vhost kernel
driver so that virtio-vsock can hook into the AF_VSOCK address family.
The QEMU vhost-vsock device handles configuration and live migration
while the rx/tx happens in the vhost_vsock.ko Linux kernel driver.

The vsock device must be given a CID (host-wide unique address):

  # qemu -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 ...

For more information see:
http://qemu-project.org/Features/VirtioVsock

[Endianness fixes and virtio-ccw support by Claudio Imbrenda
<imbrenda@linux.vnet.ibm.com>]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[mst: rebase to master]
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-10 00:28:08 +03:00
Michael S. Tsirkin 71d19fc513 virtio-pci: minor refactoring
!legacy && !modern is shorter than !(legacy || modern).
I also perfer this (less ()s) as a matter of taste.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Jason Wang 96a3d98d2c vhost: don't set vring call if no vector
We used to set vring call fd unconditionally even if guest driver does
not use MSIX for this vritqueue at all. This will cause lots of
unnecessary userspace access and other checks for drivers does not use
interrupt at all (e.g virtio-net pmd). So check and clean vring call
fd if guest does not use any vector for this virtqueue at
all.

Perf diffs (on rx) shows lots of cpus wasted on vhost_signal() were saved:

#
    28.12%  -27.82%  [vhost]           [k] vhost_signal
    14.44%   -1.69%  [kernel.vmlinux]  [k] copy_user_generic_string
     7.05%   +1.53%  [kernel.vmlinux]  [k] __free_page_frag
     6.51%   +5.53%  [vhost]           [k] vhost_get_vq_desc
...

Pktgen tests shows 15.8% improvement on rx pps and 6.5% on tx pps.

Before: RX 2.08Mpps TX 1.35Mpps
After:  RX 2.41Mpps TX 1.44Mpps

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Greg Kurz 3eff376977 virtio-pci: error out when both legacy and modern modes are disabled
Without presuming if we got there because of a user mistake or some
more subtle bug in the tooling, it really does not make sense to
implement a non-functional device.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Ladi Prosek 4a1e48beca virtio-balloon: fix stats vq migration
The statistics virtqueue is not migrated properly because virtio-balloon
does not include s->stats_vq_elem in the migration stream.

After migration the statistics virtqueue hangs because the host never
completes the last element (s->stats_vq_elem is NULL on the destination
QEMU).  Therefore the guest never submits new elements and the virtqueue
is hung.

Instead of changing the migration stream format in an incompatible way,
detect the migration case and rewind the virtqueue so the last element
can be completed.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Stefan Hajnoczi 297a75e6c5 virtio: add virtqueue_rewind()
virtqueue_discard() requires a VirtQueueElement but virtio-balloon does
not migrate its in-use element.  Introduce a new function that is
similar to virtqueue_discard() but doesn't require a VirtQueueElement.

This will allow virtio-balloon to access element again after migration
with the usual proviso that the guest may have modified the vring since
last time.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Ladi Prosek 104e70cae7 virtio-balloon: discard virtqueue element on reset
The one pending element is being freed but not discarded on device
reset, which causes svq->inuse to creep up, eventually hitting the
"Virtqueue size exceeded" error.

Properly discarding the element on device reset makes sure that its
buffers are unmapped and the inuse counter stays balanced.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Stefan Hajnoczi 4b7f91ed02 virtio: zero vq->inuse in virtio_reset()
vq->inuse must be zeroed upon device reset like most other virtqueue
fields.

In theory, virtio_reset() just needs assert(vq->inuse == 0) since
devices must clean up in-flight requests during reset (requests cannot
not be leaked!).

In practice, it is difficult to achieve vq->inuse == 0 across reset
because balloon, blk, 9p, etc implement various different strategies for
cleaning up requests.  Most devices call g_free(elem) directly without
telling virtio.c that the VirtQueueElement is cleaned up.  Therefore
vq->inuse is not decremented during reset.

This patch zeroes vq->inuse and trusts that devices are not leaking
VirtQueueElements across reset.

I will send a follow-up series that refactors request life-cycle across
all devices and converts vq->inuse = 0 into assert(vq->inuse == 0) but
this more invasive approach is not appropriate for stable trees.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
2016-09-09 20:58:34 +03:00
Marcel Apfelbaum d9997d89a4 virtio-pci: reduce modern_mem_bar size
Currently each VQ Notification Virtio Capability is allocated
on a different page. The idea is to enable split drivers within
guests, however there are no known plans to do that.
The allocation will result in a 8MB BAR, more than various
guest firmwares pre-allocates for PCI Bridges hotplug process.

Reserve 4 bytes per VQ by default and add a new parameter
"page-per-vq" to be used with split drivers.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Michael S. Tsirkin e3aab6c7f3 virtio-pci: use size from correct structure
PIO MR registration should use size from the correct notify struct.
Doesn't affect any visible behaviour because the field values are the
same (both are 4).

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Thomas Huth a8bba0ada4 virtio: Tell the user what went wrong when event_notifier_init failed
event_notifier_init() can fail in real life, for example when there
are not enough open file handles available (EMFILE) when using a lot
of devices. So instead of leaving the average user with a cryptic
error number only, print out a proper error message with strerror()
instead, so that the user has a better way to figure out what is
going on and that using "ulimit -n" might help here for example.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Peter Maydell e00da552a0 virtio: fixes
some bugfixes for virtio
 balloon is still broken wrt migration
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXvHrHAAoJECgfDbjSjVRpup4IAKFS/2miwD9OJNy8UieLmXTg
 PVL8twWgYUPBLRFUx6h7r+VnsFXY3NPSiKZhdXpKjnW9WIV/ru9i7UCk5OOt/4mj
 BiS3kztMrrs7RRPCQVgyjuWterkllICoIT38muo6Q7iOAP6iUgTyjdzUh+u9leUX
 IeevtsttyOBW+SrH7ug7VzmYWODHOgkycBwNDyPCNcEMTiZKdhREQo45FnRaKB+Q
 H/BWn5yvjyVXp8NRCm4fBX9TGoU/qERU0k+aTltCv7ctlQR8BOmQ/r5glMUHu8Kj
 6tpf6WowsGmDl7IH3lX6An4GsGLfM5AwHVn4Aa9dd0C7C7cVJmPudPFsd9tv6Y4=
 =I/lz
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio: fixes

some bugfixes for virtio
balloon is still broken wrt migration

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 23 Aug 2016 17:33:11 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  virtio: decrement vq->inuse in virtqueue_discard()
  virtio: recalculate vq->inuse after migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-08-24 17:21:03 +01:00
Stefan Hajnoczi 58a83c6149 virtio: decrement vq->inuse in virtqueue_discard()
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again.  It's necessary to decrement vq->inuse to avoid "leaking"
the element count.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-23 19:20:24 +03:00
Stefan Hajnoczi bccdef6b1a virtio: recalculate vq->inuse after migration
The vq->inuse field is not migrated.  Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.

At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements.  For these devices we need to recalculate
vq->inuse upon load so the value is correct.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-08-23 19:20:10 +03:00
Peter Maydell aba5d97664 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJXraljAAoJEJykq7OBq3PIgCsIAKix9uyPGZQqL8sxMjpxz4ck
 cQ+hjStWSDZGw+eJ4R7DCO3MW/b0O8JUzT5OL+h0y5qiR/M9QkqfYyzC2Lwn45UO
 Zz6iRrRMLfAGlfnJPXC5a1s4/tBn39rJtYcKkrDmpJwRZg8PUw7LC27k7Rr0Jpi0
 PYe9o8buwsAjuH0O2Q4UC2PtSX06s1aQf06CAHB9jfvZlHaRM3o8msan66u+FkJg
 Tz+IUNj+AUakM2uMptagoxRcEsqwH4XbnbJtyFb9VcxIVW7BX3WxVuNUvVQkCIvD
 A1wMy2mFjBi9i3uBMT9Zos5cE3QTFLFTdlV9qhLuzJFcmEjyLCdiPfvzFl81AHw=
 =zWck
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Fri 12 Aug 2016 11:48:03 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace-events: fix first line comment in trace-events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-08-15 18:27:51 +01:00