Commit graph

49794 commits

Author SHA1 Message Date
Li Qiang dd28fbbc2e 9pfs: add xattrwalk_fid field in V9fsXattr struct
Currently, 9pfs sets the 'copied_len' field in V9fsXattr
to -1 to tag xattr walk fid. As the 'copied_len' is also
used to account for copied bytes, this may make confusion. This patch
add a bool 'xattrwalk_fid' to tag the xattr walk fid.

Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2016-11-01 12:00:40 +01:00
Peter Maydell 0e35636651 Update OpenBIOS images
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJYF6dpAAoJEFvCxW+uDzIflxIH/3dhtyZ82YG/xfOiWte+ro5P
 VE7ZSLzYxU+Z8lkSeh+RmQ06JBTmexa75t0Fz4GXc7264CCxMVi7AFEjd0a/dPVz
 nxOj8mnj56ZfIUfjtNf2Qhj2QR8iPgL7yhtraP/9z6BhUuN5H0XnQ1GeG7ou613W
 taIBtuEwI48O87menOSlrEbblL0VSKkyOBHe783KTZrmircqSCybYtCmJSp1GrQ1
 FfAnxarxyquZfwUUcDaBa8f5lDzfQfQeeNXrCM3f7sUDuldPrBcFGF3amDZZzhZv
 aEhCAsQiA29y7btyY1ulOpTXltRWrA+XrCngoG26mnrrKsTUPvB2mNyEJMhctc8=
 =WXec
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging

Update OpenBIOS images

# gpg: Signature made Mon 31 Oct 2016 20:19:53 GMT
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to 1dc4f16 built from submodule.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-11-01 10:24:44 +00:00
Jeff Cody 02ba9265e8 migration: fix compiler warning on uninitialized variable
Some older GCC versions (e.g. 4.4.7) report a warning on an
uninitialized variable for 'request', even though all possible code
paths that reference 'request' will be initialized.   To appease
these versions, initialize the variable to 0.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-id: 259818682e41b95ae60f1423b87954a3fe377639.1477950393.git.jcody@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-11-01 09:31:53 +00:00
Stefan Hajnoczi 586ef5dee7 qga: add vsock-listen method
Add AF_VSOCK (virtio-vsock) support as an alternative to virtio-serial.

  $ qemu-system-x86_64 -device vhost-vsock-pci,guest-cid=3 ...
  (guest)# qemu-ga -m vsock-listen -p 3:1234

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-31 19:49:33 -05:00
Stefan Hajnoczi 6a02c8069f sockets: add AF_VSOCK support
Add the AF_VSOCK address family so that qemu-ga will be able to use
virtio-vsock.

The AF_VSOCK address family uses <cid, port> address tuples.  The cid is
the unique identifier comparable to an IP address.  AF_VSOCK does not
use name resolution so it's easy to convert between struct sockaddr_vm
and strings.

This patch defines a VsockSocketAddress instead of trying to piggy-back
on InetSocketAddress.  This is cleaner in the long run since it avoids
lots of IPv4 vs IPv6 vs vsock special casing.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* treat trailing commas as garbage when parsing (Eric Blake)
* add configure check instead of checking AF_VSOCK directly
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-31 19:49:33 -05:00
Stefan Hajnoczi f06b2031a3 qga: drop unnecessary GA_CHANNEL_UNIX_LISTEN checks
Throughout the code there are c->listen_channel checks which manage the
listen socket file descriptor (waiting for accept(2), closing the file
descriptor, etc).  These checks are currently preceded by explicit
c->method == GA_CHANNEL_UNIX_LISTEN checks.

Explicit GA_CHANNEL_UNIX_LISTEN checks are not necessary since serial
channel types do not create the listen channel (c->listen_channel).

As more listen channel types are added, explicitly checking all of them
becomes messy.  Rely on c->listen_channel to determine whether or not a
listen socket file descriptor is used.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-31 19:21:22 -05:00
Stefan Hajnoczi b8093d38e8 qga: drop unused sockaddr in accept(2) call
ga_channel_listen_accept() is currently hard-coded to support only
AF_UNIX because the struct sockaddr_un type is used.  This function
should work with any address family.

Drop the sockaddr since the client address is unused and is an optional
argument to accept(2).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-31 19:21:22 -05:00
Denis V. Lunev 91274487a9 qga: minimal support for fstrim for Windows guests
Unfortunately, there is no public Windows API to start trimming the
filesystem. The only viable way here is to call 'defrag.exe /L' for
each volume.

This is working since Win8 and Win2k12.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
CC: Michael Roth <mdroth@linux.vnet.ibm.com>
CC: Stefan Weil <sw@weilnetz.de>
CC: Marc-André Lureau <marcandre.lureau@gmail.com>
* check g_utf16_to_utf8() return value for GError handling instead
  of GError directly (Marc-André)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-10-31 19:09:21 -05:00
Richard Henderson 5a7267b6a9 target-sparc: Use tcg_gen_atomic_cmpxchg_tl
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:48 -06:00
Richard Henderson da1bcae652 target-sparc: Use tcg_gen_atomic_xchg_tl
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:48 -06:00
Richard Henderson 47b2696b97 target-sparc: Remove MMU_MODE*_SUFFIX
The functions that these generate are no longer used.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:48 -06:00
Richard Henderson cb21b4da6c target-sparc: Allow 4-byte alignment on fp mem ops
The cpu is allowed to require stricter alignment on these 8- and 16-byte
operations, and the OS is required to fix up the accesses as necessary,
so the previous code was not wrong.

However, we can easily handle this misalignment for all direct 8-byte
operations and for direct 16-byte loads.

We must retain 16-byte alignment for 16-byte stores, so that we don't have
to probe for writability of a second page before performing the first of
two 8-byte stores.  We also retain 8-byte alignment for no-fault loads,
since they are rare and it's not worth extending the helpers for this.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:47 -06:00
Richard Henderson f939ffe5a0 target-sparc: Implement ldqf and stqf inline
At the same time, fix a problem with stqf_asi, when
a write might access two pages.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:47 -06:00
Richard Henderson 918d9a2c9d target-sparc: Remove asi helper code handled inline
Now that we never call out to helpers when direct accesses can
handle an asi, remove the corresponding code in those helpers.
For ldda, this removes the entire helper.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:47 -06:00
Richard Henderson 34810610ac target-sparc: Implement BCOPY/BFILL inline
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-10-31 14:46:47 -06:00
Mark Cave-Ayland 625ed4be4b Update OpenBIOS images to 1dc4f16 built from submodule.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2016-10-31 20:01:25 +00:00
Peter Maydell b90da81d9f x86 and machine queue, 2016-10-31
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJYF41+AAoJECgHk2+YTcWmFZYP/2onwnvrb2JaHdPFHVfwR3Uc
 AAOs7DDCP9v7lnkaeTtcjxSACaBbrn83aT5oUty6hh7Fq+ASRjh5v275b16Vw2B7
 xa3IPypypQSz4s3vWJy/LhsdulOh+HBvmkTVmJ290w0LBnoO/xuIhpBszYZJfRkQ
 XxwTQNZlJ2/TZK2wZQNhAgpDaW+Qd7AJIcmt9ApglxOUEvVNhQE0z9+h0yoz7ww9
 lV9XhTg9fqtPbbHZGVcN2FWfRcfhH3BaWY7X2S+ilKhZswmipS1hPS7SE99l4RL6
 Czr4flhTuC1DDGg4/cMWK375np17zohdyzhtm5zYF0C6WeYWMrrLaRLKY+tWBcvu
 fU1FTX3ibiRPi9cHkiyaxaAhmDrGu43uqf2JSnc1jAK6x6R0tI8vxPcxbw236tsT
 jJTziew9HyaLWwYTXNZk00S+Xy89lj3Upl/DQ6xmZ8i6pYWUcxhtT7iWljVp3Rxc
 mGP7vEABMyKnVlShchtBOaiXWRiYNXsVyh1feTeaga/uNcLsdRFGbH5zusqhWGTz
 CkuFRhE80tPNgiFdbdR3a8IW8fRs0HSkN/8bupgikPz0eVHDdGMYzxy6BhhtCctD
 3havhZ8VjPqVWBHsHEkBUCCCEr/6hv6u7SoPbezZAxd2q++/0XEeeF3T7h2PtyV0
 w3h6Gj7RataZBtCwqieu
 =ZwQ5
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging

x86 and machine queue, 2016-10-31

# gpg: Signature made Mon 31 Oct 2016 18:29:18 GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-and-machine-pull-request:
  target-i386: Print warning when mixing [+-]foo and foo=(on|off)
  tests: Remove unneeded "-vnc none" option

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 19:06:09 +00:00
Eduardo Habkost 83a00f6095 target-i386: Print warning when mixing [+-]foo and foo=(on|off)
Print a warning when mixing [+-]foo and foo=(on|off) in the -cpu
argument in a way that will break in the future.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-31 16:20:59 -02:00
Peter Maydell e80b4b8fb6 VFIO updates 2016-10-31
- Replace skip_dump with ram_device to denote device memory and mark
    as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
  - Skip zero-length sparse mmaps - avoids unnecessary warning
    (Alex Williamson)
  - Clear BARs on reset so guest doesn't assume programming on return
    from S3 (Ido Yariv)
  - Enable sub-page MMIO mmaps - performance improvement for devices
    with smaller BARs, iff both host and guest map them to full,
    aligned pages (Yongji Xie)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJYF37XAAoJECObm247sIsi9okP/jT/UBqR1G7RVuxQ8AZPPAsU
 mBClGw5lC2lQ70M/t9HNxMMpceHSmAIC4doauOhVNGn7yl3MgHywhEmuxvdQQBAV
 WQYkrZsAIyNhg4I0/92PybsppccEgXgGjz7tW+56udgPhU4ChSsbUwrt8uxZ6/M5
 R/rIGBe/46QVKCAPes3PvOLq19LErUnN0uSasP0QxacD0aFnO9vRSlT3Ake6mnqv
 u+Z1p8d9DM5LYkZPV0wcDWBlosda+cWFH+RhEp1UH4d+2hpW4+WB6bMG6SneguAV
 9P6Dl7z8dJUZauFXw+/ctYDHLOKmul6wb7fLR8n09kqLsgxveH3xEw3tILEDBMvn
 W9xBc1Rp5luH7vZio8ZUYvRO0+/MGEyzQwUPcOiw/VOWl0w8IYyA2UVpHQZk5Esi
 r+DsrkxdonrhqXuB4vrJg7TdlbBEh2cAciy2zrSsYADB2ine/op7O+68+kqwsrlP
 tQOz+wIEi+72G7S6jdnVUQAYu+01Fae55K8gR2OPwGQO5SWgliYY7AZbE3l6eMZ7
 UtgG8YfJpJbZ5wQnshkF5NlNO9HwUS3bp+YgaSdF+NiZC+lz1nKpsqEx/JXRST7V
 A9hvK5so5mZ69EmEz7ruijBIblF3nte+Pfrm+FTjwqMUklvbwsElJGKf/fI6f+kl
 xYyUWkiYOoZXmSkjCanm
 =ZMwj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20161031.0' into staging

VFIO updates 2016-10-31

 - Replace skip_dump with ram_device to denote device memory and mark
   as non-direct to avoid memcpy to MMIO - fixes RTL (Alex Williamson)
 - Skip zero-length sparse mmaps - avoids unnecessary warning
   (Alex Williamson)
 - Clear BARs on reset so guest doesn't assume programming on return
   from S3 (Ido Yariv)
 - Enable sub-page MMIO mmaps - performance improvement for devices
   with smaller BARs, iff both host and guest map them to full,
   aligned pages (Yongji Xie)

# gpg: Signature made Mon 31 Oct 2016 17:26:47 GMT
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20161031.0:
  vfio: Add support for mmapping sub-page MMIO BARs
  vfio/pci: fix out-of-sync BAR information on reset
  vfio: Handle zero-length sparse mmap ranges
  memory: Don't use memcpy for ram_device regions
  memory: Replace skip_dump flag with "ram_device"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 18:19:06 +00:00
Peter Maydell 8ff7fd8a29 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJYF2zfAAoJEH8JsnLIjy/WOsAP/1TLKU8ZZ/TLpONfUD6SYcdt
 rPxhuxlxRl7k1xP/Y4oYa+Nl6Q9emgIcTA3V3hqKNEmaqxQWEM56Q+l2cFhaqqVm
 CO/bNJ0vGDogAG0ahgPCs9XKx7IPByb/iQiE+FNL68ZPZAWIaEHYgoiCjukr6e8B
 Q/OXN+WQLM4SIxnfLCYKSCycdGsaSTSx1T/8IjQb0DbpMFtDHOPkXcf/7AsiNJdP
 wdcx0uDkFk3GK2hN4A/ODFQQpEoi6ehm/RaaiDqGEX93IWnFlXRpq4h+dih79jaI
 Xuf9vt+uBKqTW5EiwY/lUmalf+zxj6wHNkvNze9paiacAvuT4N4Vq4+niDvRAbeI
 gkeC2GNRdO0iD+iBMHKQQ0JelBn0y/B+txzurcbZd71d0GF3r6jC17BEwnM39veR
 iIARK/dZEvygYRQknTG/EkbYHBdpWNmjKeVBr/08L7v7r9pN15huNx6Cmr+klz08
 ibstYxyCdmhvAf+UFELDZh3MF+65dpv8+Sa2OYS1SeE7jO9IhkF1SSyaGtSglvtO
 AN9xrJnaOBlLdVLycLpuH+1bcpXIYwMaoHeVsDmeBI5N9QPYlD73Y0ns7vENLR59
 2kUtyhOORdaE+y69hjuRmNYIN8svszzDk6or95aGZB3r4yv7uu2w+sGyy2dR+pdZ
 m+80c3xshudX9rUyHlJb
 =Hglo
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Mon 31 Oct 2016 16:10:07 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (29 commits)
  qapi: allow blockdev-add for NFS
  block/nfs: Introduce runtime_opts in NFS
  block: Mention replication in BlockdevDriver enum docs
  qemu-iotests: test 'offset' and 'size' options in raw driver
  raw_bsd: add offset and size options
  qemu-iotests: Test the 'base-node' parameter of 'block-stream'
  block: Add 'base-node' parameter to the 'block-stream' command
  qemu-iotests: Test streaming to a Quorum child
  qemu-iotests: Add iotests.supports_quorum()
  qemu-iotests: Test block-stream and block-commit in parallel
  qemu-iotests: Test overlapping stream and commit operations
  qemu-iotests: Test block-stream operations in parallel
  qemu-iotests: Test streaming to an intermediate layer
  docs: Document how to stream to an intermediate layer
  block: Add QMP support for streaming to an intermediate layer
  block: Support streaming to an intermediate layer
  block: Block all intermediate nodes in commit_active_start()
  block: Block all nodes involved in the block-commit operation
  block: Check blockers in all nodes involved in a block-commit job
  block: Use block_job_add_bdrv() in backup_start()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 17:29:04 +00:00
Eduardo Habkost 2df35773cb tests: Remove unneeded "-vnc none" option
Some tests use the "-vnc none" option without any clear reason,
making those tests break when --disable-vnc is specified on
./configure.  Remove the unnecessary option.

Reviewed-by: John Snow <jsnow@redhat.com>
Tested-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-31 15:09:59 -02:00
Yongji Xie 95251725e3 vfio: Add support for mmapping sub-page MMIO BARs
Now the kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap
sub-page MMIO BARs if the mmio page is exclusive") allows VFIO
to mmap sub-page BARs. This is the corresponding QEMU patch.
With those patches applied, we could passthrough sub-page BARs
to guest, which can help to improve IO performance for some devices.

In this patch, we expand MemoryRegions of these sub-page
MMIO BARs to PAGE_SIZE in vfio_pci_write_config(), so that
the BARs could be passed to KVM ioctl KVM_SET_USER_MEMORY_REGION
with a valid size. The expanding size will be recovered when
the base address of sub-page BAR is changed and not page aligned
any more in guest. And we also set the priority of these BARs'
memory regions to zero in case of overlap with BARs which share
the same page with sub-page BARs in guest.

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-31 09:53:04 -06:00
Ido Yariv a52a4c4717 vfio/pci: fix out-of-sync BAR information on reset
When a PCI device is reset, pci_do_device_reset resets all BAR addresses
in the relevant PCIDevice's config buffer.

The VFIO configuration space stays untouched, so the guest OS may choose
to skip restoring the BAR addresses as they would seem intact. The PCI
device may be left non-operational.
One example of such a scenario is when the guest exits S3.

Fix this by resetting the BAR addresses in the VFIO configuration space
as well.

Signed-off-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-31 09:53:04 -06:00
Alex Williamson 24acf72b9a vfio: Handle zero-length sparse mmap ranges
As reported in the link below, user has a PCI device with a 4KB BAR
which contains the MSI-X table.  This seems to hit a corner case in
the kernel where the region reports being mmap capable, but the sparse
mmap information reports a zero sized range.  It's not entirely clear
that the kernel is incorrect in doing this, but regardless, we need
to handle it.  To do this, fill our mmap array only with non-zero
sized sparse mmap entries and add an error return from the function
so we can tell the difference between nr_mmaps being zero based on
sparse mmap info vs lack of sparse mmap info.

NB, this doesn't actually change the behavior of the device, it only
removes the scary "Failed to mmap ... Performance may be slow" error
message.  We cannot currently create an mmap over the MSI-X table.

Link: http://lists.nongnu.org/archive/html/qemu-discuss/2016-10/msg00009.html
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-10-31 09:53:03 -06:00
Alex Williamson 4a2e242bbb memory: Don't use memcpy for ram_device regions
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors.  If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap.  Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion.  When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.

This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU.  If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap.  Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems.  I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces.  It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.

To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.

With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access.  The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities.  If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access.  A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device.  Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).

Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Alex Williamson 21e00fa55f memory: Replace skip_dump flag with "ram_device"
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that.  If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it.  Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-31 09:53:03 -06:00
Ashijeet Acharya aa2623d817 qapi: allow blockdev-add for NFS
Introduce new object 'BlockdevOptionsNFS' in qapi/block-core.json to
support blockdev-add for NFS network protocol driver. Also make a new
struct NFSServer to support tcp connection.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Ashijeet Acharya 94d6a7a76e block/nfs: Introduce runtime_opts in NFS
Make NFS block driver use various fine grained runtime_opts.
Set .bdrv_parse_filename() to nfs_parse_filename() and introduce two
new functions nfs_parse_filename() and nfs_parse_uri() to help parsing
the URI.
Add a new option "server" which then accepts a new struct NFSServer.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
[ kwolf: Fixed client->path ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Eric Blake 68875e9fda block: Mention replication in BlockdevDriver enum docs
Missed in commit 82ac554.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Tomáš Golembiovský ccc47808fd qemu-iotests: test 'offset' and 'size' options in raw driver
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Tomáš Golembiovský 2fdc70452a raw_bsd: add offset and size options
Added two new options 'offset' and 'size'. This makes it possible to use
only part of the file as a device. This can be used e.g. to limit the
access only to single partition in a disk image or use a disk inside a
tar archive (like OVA).

When 'size' is specified we do our best to honour it.

Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia 7eb13c9daa qemu-iotests: Test the 'base-node' parameter of 'block-stream'
The block-stream command has traditionally used the 'base' parameter
to indicate the image to copy the data from. This test checks that the
'base-node' parameter can also be used for the same purpose.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia 312fe09cc8 block: Add 'base-node' parameter to the 'block-stream' command
The way to specify the node from which to copy data in the
block-stream operation is by using the 'base' parameter. This
parameter however takes a file name, not a node name.

Since we want to be able to perform this operation using only node
names, this patch adds a new 'base-node' parameter.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia 48361afba9 qemu-iotests: Test streaming to a Quorum child
Quorum children are special in the sense that they're not directly
attached to a block backend but they're not used as backing images
either. However the intermediate block streaming code supports
streaming to them. This is a test case for that scenario.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia b0f904950c qemu-iotests: Add iotests.supports_quorum()
There's many tests that need Quorum support in order to run. At the
moment each test implements its own check to see if Quorum is
enabled. This patch centralizes all those checks in a new function
called iotests.supports_quorum().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia 704d59f13d qemu-iotests: Test block-stream and block-commit in parallel
As with test_stream_parallel(), we allow mixing block-stream and
block-commit operations in the same backing chain as long as there's
no overlap among the involved nodes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia eb290b78ff qemu-iotests: Test overlapping stream and commit operations
These test cases check that it's not possible to perform two
block-stream or block-commit operations if there are nodes involved in
both.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia c1a34322d8 qemu-iotests: Test block-stream operations in parallel
This test case checks that it's possible to launch several stream
operations in parallel in the same snapshot chain, each one involving
a different set of nodes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia 7b8a9e5ab4 qemu-iotests: Test streaming to an intermediate layer
This adds test_stream_intermediate(), similar to test_stream() but
streams to the intermediate image instead.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:39 +01:00
Alberto Garcia 1029641bef docs: Document how to stream to an intermediate layer
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia 554b614765 block: Add QMP support for streaming to an intermediate layer
This patch makes the 'device' parameter of the 'block-stream' command
accept a node name that is not a root node. The presence of this
feature can't be directly tested with introspection; soon we'll
introduce a 'base-node' parameter whose presence can be checked for
this purpose.

In addition to that, operation blockers will be checked in all
intermediate nodes between the top and the base node.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia 61b49e48b3 block: Support streaming to an intermediate layer
This makes sure that the image we are streaming into is open in
read-write mode during the operation.

Operation blockers are also set in all intermediate nodes, since they
will be removed from the chain afterwards.

Finally, this also unblocks the stream operation in backing files.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia f3ede4b05d block: Block all intermediate nodes in commit_active_start()
When block-commit is launched without the top parameter, it uses
internally a mirror block job. In that case all intermediate nodes
between the active and base nodes must be blocked as well.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia 3e4c5122cb block: Block all nodes involved in the block-commit operation
After a successful block-commit operation all nodes between top and
base are removed from the backing chain, and top's overlay needs to
be updated to point to base. Because of that we should prevent other
block jobs from messing with them.

This patch blocks all operations in these nodes in commit_start().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia 058223a6e3 block: Check blockers in all nodes involved in a block-commit job
qmp_block_commit() checks for op blockers in the active and
destination (base) images. However all nodes between top_bs and base
are also involved, and they are removed from the chain afterwards.

In addition to that, if top_bs is not the active layer then top_bs's
overlay also needs to be checked because it's involved in the job (its
backing image string needs to be updated to point to 'base').

This patch checks that none of those nodes are blocked.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia b7340d002e block: Use block_job_add_bdrv() in backup_start()
Use block_job_add_bdrv() instead of blocking all operations in
backup_start() and unblocking them in backup_run().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia cee3c6b5ca block: Use block_job_add_bdrv() in mirror_start_job()
Use block_job_add_bdrv() instead of blocking all operations in
mirror_start_job() and unblocking them in mirror_exit().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia 23d402d42b block: Add block_job_add_bdrv()
When a block job is created on a certain BlockDriverState, operations
are blocked there while the job exists. However, some block jobs may
involve additional BDSs, which must be blocked separately when the job
is created and unblocked manually afterwards.

This patch adds block_job_add_bdrv(), that simplifies this process by
keeping a list of BDSs that are involved in the specified block job.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia 40840e419b block: Pause all jobs during bdrv_reopen_multiple()
When a BlockDriverState is about to be reopened it can trigger certain
operations that need to write to disk. During this process a different
block job can be woken up. If that block job completes and also needs
to call bdrv_reopen() it can happen that it needs to do it on the same
BlockDriverState that is still in the process of being reopened.

This can have fatal consequences, like in this example:

  1) Block job A starts and sleeps after a while.
  2) Block job B starts and tries to reopen node1 (a qcow2 file).
  3) Reopening node1 means flushing and replacing its qcow2 cache.
  4) While the qcow2 cache is being flushed, job A wakes up.
  5) Job A completes and reopens node1, replacing its cache.
  6) Job B resumes, but the cache that was being flushed no longer
     exists.

This patch splits the bdrv_drain_all() call to keep all block jobs
paused during bdrv_reopen_multiple(), so that step 4 can never happen
and the operation is safe.

Note that this scenario can only happen if both bdrv_reopen() calls
are made by block jobs on the same backing chain. Otherwise there's no
chance that the same BlockDriverState appears in both reopen queues.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:52:38 +01:00
Alberto Garcia c0778f6693 block: Add bdrv_drain_all_{begin,end}()
bdrv_drain_all() doesn't allow the caller to do anything after all
pending requests have been completed but before block jobs are
resumed.

This patch splits bdrv_drain_all() into _begin() and _end() for that
purpose. It also adds aio_{disable,enable}_external() calls to disable
external clients in the meantime.

An important restriction of this split is that no new block jobs or
BlockDriverStates can be created between the bdrv_drain_all_begin()
and bdrv_drain_all_end() calls. This is not a concern now because
we'll only be using this in bdrv_reopen_multiple(), but it must be
dealt with if we ever have other uses cases in the future.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-10-31 16:51:14 +01:00