Commit graph

6021 commits

Author SHA1 Message Date
Daniel P. Berrange 1cd9a787a2 block: pass option prefix down to crypto layer
While the crypto layer uses a fixed option name "key-secret",
the upper block layer may have a prefix on the options. e.g.
"encrypt.key-secret", in order to avoid clashes between crypto
option names & other block option names. To ensure the crypto
layer can report accurate error messages, we must tell it what
option name prefix was used.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-19-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange c01c214b69 block: remove all encryption handling APIs
Now that all encryption keys must be provided upfront via
the QCryptoSecret API and associated block driver properties
there is no need for any explicit encryption handling APIs
in the block layer. Encryption can be handled transparently
within the block driver. We only retain an API for querying
whether an image is encrypted or not, since that is a
potentially useful piece of metadata to report to the user.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-18-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange 788cf9f8c8 block: rip out all traces of password prompting
Now that qcow & qcow2 are wired up to get encryption keys
via the QCryptoSecret object, nothing is relying on the
interactive prompting for passwords. All the code related
to password prompting can thus be ripped out.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-17-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange 0cb8d47ba9 block: deprecate "encryption=on" in favor of "encrypt.format=aes"
Historically the qcow & qcow2 image formats supported a property
"encryption=on" to enable their built-in AES encryption. We'll
soon be supporting LUKS for qcow2, so need a more general purpose
way to enable encryption, with a choice of formats.

This introduces an "encrypt.format" option, which will later be
joined by a number of other "encrypt.XXX" options. The use of
a "encrypt." prefix instead of "encrypt-" is done to facilitate
mapping to a nested QAPI schema at later date.

e.g. the preferred syntax is now

  qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-8-berrange@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:55 +02:00
Peter Maydell 29741be341 VFIO fixes 2017-07-10
- Don't iterate over non-realized devices (Alex Williamson)
  - Add PCIe capability version fixup (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIbBAABAgAGBQJZY9AjAAoJECObm247sIsiQs4P9Ai2tGMihJDJUj71lb/5nQOw
 2yvzfJaSaB1136K9BXRiCEFj1x44e1EdweiPur3AqZ8mDVGY/WXWSqnVV/iCLTVR
 nM7GWtIXnA52LHjwFYomDD/9iz6gj6bPIPGPGAF7iv1A9lfDvgb1Yr9F5bHLE+DQ
 GBlfT5WM0SzfrN5bFI+yN82elDOTHlbRPtrM9P2GDvj9H9Zvd2RDUn5Uv9yGUbQF
 mBYMyBZoL3FL3ij09zFFSwuJGYTwxfWTPnc73BgRAyKMyuy2dnv19XwHB3DQSQ70
 EbWSzcTYjsuquVKni7hfCD+XrP3NMb2U/42hyOAhXVbKAQ5bVmKFPzwA3bHOWBpR
 yWhALZVtl1bTQG4J5nrN+VYQfEv3Yr0dyhVUKZXGvrmP8SoA5WWkdeSrDcEANtew
 VHT7eOsBsCYlqH/0kO772K2XNuvj2XgkDY9da4c5O88WHRk7fUYBZgkjaEQ1uNrq
 +DV4Eixc4UsQiQOTmfCXFWT6hC8vRaM5NA25EvsRXFn8DNEyD/mlRRkndD9Ujawc
 LW1enhleeMXOtR/b6M5TqfPmBMGzQm4ITvM9EumLX/1nF0wDG/Ia+9qNVwZmf2qK
 6/riDUIpbKMzsj9XerDayLp1vTWEuh8Y3ExoOjadKxNYPsN+xcguOknHEVxA0suh
 SR20xhlho/Lq+rzTvcY=
 =FrIF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20170710.0' into staging

VFIO fixes 2017-07-10

 - Don't iterate over non-realized devices (Alex Williamson)
 - Add PCIe capability version fixup (Alex Williamson)

# gpg: Signature made Mon 10 Jul 2017 20:06:11 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20170710.0:
  vfio/pci: Fixup v0 PCIe capabilities
  vfio: Test realized when using VFIOGroup.device_list iterator

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-11 13:47:28 +01:00
Juan Quintela acb5ea8697 migration: Create load_setup()/cleanup() methods
We need to do things at load time and at cleanup time.

Signed-off-by: Juan Quintela <quintela@redhat.com>

--

Move the printing of the error message so we can print the device
giving the error.
Add call to postcopy stuff
Message-Id: <20170628095228.4661-4-quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-07-10 17:52:21 +01:00
Juan Quintela 70f794fcfa migration: Rename cleanup() to save_cleanup()
We need a cleanup for loads, so we rename here to be consistent.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

Rename htab_cleanup to htap_save_cleanup as dave suggestion
Message-Id: <20170628095228.4661-3-quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-07-10 17:52:21 +01:00
Juan Quintela 9907e842d7 migration: Rename save_live_setup() to save_setup()
We are going to use it now for more than save live regions.
Once there rename qemu_savevm_state_begin() to qemu_savevm_state_setup().

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170628095228.4661-2-quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-07-10 17:52:21 +01:00
Peter Xu b605c47b57 migration: fix handling for --only-migratable
MigrateState object is not ready at that time, so we'll get an
assertion. Use qemu_global_option() instead.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: 3df663e ("migration: move only_migratable to MigrationState")
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1499242883-2184-2-git-send-email-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-07-10 17:52:21 +01:00
Alex Williamson 7da624e26a vfio: Test realized when using VFIOGroup.device_list iterator
VFIOGroup.device_list is effectively our reference tracking mechanism
such that we can teardown a group when all of the device references
are removed.  However, we also use this list from our machine reset
handler for processing resets that affect multiple devices.  Generally
device removals are fully processed (exitfn + finalize) when this
reset handler is invoked, however if the removal is triggered via
another reset handler (piix4_reset->acpi_pcihp_reset) then the device
exitfn may run, but not finalize.  In this case we hit asserts when
we start trying to access PCI helpers since much of the PCI state of
the device is released.  To resolve this, add a pointer to the Object
DeviceState in our common base-device and skip non-realized devices
as we iterate.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-07-10 10:39:43 -06:00
Eric Blake 51b0a48888 block: Make bdrv_is_allocated_above() byte-based
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, for the most part this patch is just the
addition of scaling at the callers followed by inverse scaling at
bdrv_is_allocated().  But some code, particularly stream_run(),
gets a lot simpler because it no longer has to mess with sectors.
Leave comments where we can further simplify by switching to
byte-based iterations, once later patches eliminate the need for
sector-aligned operations.

For ease of review, bdrv_is_allocated() was tackled separately.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake d6a644bbfe block: Make bdrv_is_allocated() byte-based
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned
on input and that *pnum is sector-aligned on return to the caller,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, this code adds usages like
DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned
values, where the call might reasonbly give non-aligned results
in the future; on the other hand, no rounding is needed for callers
that should just continue to work with byte alignment.

For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_is_allocated().  But
some code, particularly bdrv_commit(), gets a lot simpler because it
no longer has to mess with sectors; also, it is now possible to pass
NULL if the caller does not care how much of the image is allocated
beyond the initial offset.  Leave comments where we can further
simplify once a later patch eliminates the need for sector-aligned
requests through bdrv_is_allocated().

For ease of review, bdrv_is_allocated_above() will be tackled
separately.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake f6ac207893 backup: Switch block_backup.h to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting
the public interface to backup jobs (no semantic change), including
a change to CowRequest to track by bytes instead of cluster indices.

Note that this does not change the difference between the public
interface (starting point, and size of the subsequent range) and
the internal interface (starting and end points).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake e8a81e9cad block: Drop unused bdrv_round_sectors_to_clusters()
Now that the last user [mirror_iteration()] has converted to using
bytes, we no longer need a function to round sectors to clusters.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake f3e4ce4af3 blockjob: Track job ratelimits via bytes, not sectors
The user interface specifies job rate limits in bytes/second.
It's pointless to have our internal representation track things
in sectors/second, particularly since we want to move away from
sector-based interfaces.

Fix up a doc typo found while verifying that the ratelimit
code handles the scaling difference.

Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be
cleaned up later when functions are converted to iterate over
images by bytes rather than by sectors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake d5254033da block: Simplify use of BDRV_BLOCK_RAW
The lone caller that cares about a return of BDRV_BLOCK_RAW
(namely, io.c:bdrv_co_get_block_status) completely replaces the
return value, so there is no point in passing BDRV_BLOCK_DATA.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Stefano Stabellini 9f2130f58d xenfb: remove xen_init_display "temporary" hack
Initialize xenfb properly, as all other backends, from its own
"initialise" function.

Remove the dependency of vkbd on vfb: use qemu_console_lookup_by_index
to find the principal console (to get the size of the screen) instead of
relying on a vfb backend to be available (which adds a dependency
between the two).

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
2017-07-07 11:10:03 -07:00
Peter Maydell b113658675 s390x/kvm/migration: fixes, enhancements and cleanups
- new email address for Cornelia
 - Fixes: 3270, flic, virtio-scsi-ccw, ipl
 - Enhancements, cpumodel, migration
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJZXeQ7AAoJEBF7vIC1phx8QNkP/j/dGk7rw8jtvBQ2wMej2ytD
 8Tv+pq6BqvB7pwjAJ44B05YHdmisPfFkyEZM7cRWvd9M+Tavlltb8cBOgsBHlIK3
 Qg5AIYFwMsfuqclY6aT50lCH/a6ELtblAZJaASVdWmJbeLhRyBZMVM6UeBtoEj7T
 UwaTCxe9oJ3qow+5WrP1GASo3nr256oVGE/nG05wcQ27dv624Ieb8UVy8DN+I5Kj
 nqxRMNVvzmn3VC+BFyn5xOprPP+c73fWTj/ifZVtFRMIeodtZVcBjlCexISznnck
 MhJkqQaZpsXK00ULHtqAMjJXUGBkIxwMxBy1pJ3ozKZ5gKFBDNKoffa5fXxkcjQf
 xqozynZfXqZxSRJaLXMMe4lWB8/fdi2fPI77wmir7wyfOUk4AFp6NYOqwLTdqTop
 P17oTg8JfwBDSje1lGXY0vcZ+4WYUVQswr4tzrgjDD4NEtGjrzm4aSPAGekNmjai
 RfQLi/VG8On0c9b2g40Zgn73vJTS7nWnOkCM8EflIu1ga3E1E9/SKvkD871hh+AS
 OSz81mLFpnWEKYyxtyQ0BVGXwAOKb12LrA3mjpVh9LUmG0Xl6H3uoV5m8O8d2DSU
 upziIPiLjPBlAHdDrjvOSbPmrdzpt42B17MSD6okgvZuNuCzu9DZW7iJKx0CWpeT
 yZiPbDqQvFh1ZwNEvAsL
 =OrKS
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/borntraeger/tags/s390x-20170706' into staging

s390x/kvm/migration: fixes, enhancements and cleanups

- new email address for Cornelia
- Fixes: 3270, flic, virtio-scsi-ccw, ipl
- Enhancements, cpumodel, migration

# gpg: Signature made Thu 06 Jul 2017 08:18:19 BST
# gpg:                using RSA key 0x117BBC80B5A61C7C
# gpg: Good signature from "Christian Borntraeger (IBM) <borntraeger@de.ibm.com>"
# Primary key fingerprint: F922 9381 A334 08F9 DBAB  FBCA 117B BC80 B5A6 1C7C

* remotes/borntraeger/tags/s390x-20170706:
  hw/s390x/ipl: Fix endianness problem with netboot_start_addr
  virtio-scsi-ccw: use ioeventfd even when KVM is disabled
  s390x: return unavailable features via query-cpu-definitions
  s390x/MAINTAINERS: Update my email address
  s390x: fix realize inheritance for kvm-flic
  s390x: fix error propagation in kvm-flic's realize
  s390x/3270: fix instruction interception handler
  s390x: vmstatify config migration for virtio-ccw

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-06 11:42:59 +01:00
Halil Pasic 517ff12c7d s390x: vmstatify config migration for virtio-ccw
Let's vmstatify virtio_ccw_save_config and virtio_ccw_load_config for
flexibility (extending using subsections) and for fun.

To achieve this we need to hack the config_vector, which is VirtIODevice
(that is common virtio) state, in the middle of the VirtioCcwDevice state
representation.  This is somewhat ugly, but we have no choice because the
stream format needs to be preserved.

Almost no changes in behavior. Exception is everything that comes with
vmstate like extra bookkeeping about what's in the stream, and maybe some
extra checks and better error reporting.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-Id: <20170703213414.94298-1-pasic@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-07-05 12:16:55 +02:00
Yang Zhong b11ec7f2e4 tcg: add CONFIG_TCG guards in headers
Add CONFIG_TCG around TLB-related functions and structure declarations.
Some of these functions are defined in ./accel/tcg/cputlb.c, which will
not be linked in if TCG is disabled, and have no stubs; therefore, their
callers will also be compiled out for --disable-tcg.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-05 09:11:08 +02:00
Paolo Bonzini beeaef55e4 tcg: move tb_lock out of translate-all.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 16:01:16 +02:00
Yang Zhong 8e2b72990e tcg: make tcg_allowed global
Change the tcg_enabled() and make sure user build still enable tcg
even x86 softmmu disable tcg.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 16:01:16 +02:00
Paolo Bonzini 8b3ae692b8 vl: convert -tb-size to qemu_strtoul
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:39:28 +02:00
Fam Zheng c096358e74 qemu-thread: Assert locks are initialized before using
Not all platforms check whether a lock is initialized before used.  In
particular Linux seems to be more permissive than OSX.

Check initialization state explicitly in our code to catch such bugs
earlier.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170704122325.25634-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:39:28 +02:00
Peter Maydell de5f852f38 main_loop: Make main_loop_wait() return void
The last users of main_loop_wait() that cared about the return value
have now been changed to no longer use it. Drop the now-useless return
value and make the function return void.

We avoid the awkwardness of ifdeffery to handle the 'ret'
variable in main_loop_wait() only being wanted if CONFIG_SLIRP
by simply dropping all the ifdefs. There are stub implementations
of slirp_pollfds_poll() and slirp_pollfds_fill() already in
stubs/slirp.c which do nothing, as required.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1498584769-12439-3-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:39:28 +02:00
Thomas Huth 47507383c6 include/exec/poison: Mark CONFIG_SOFTMMU as poisoned
CONFIG_SOFTMMU should never be used in common code, so mark
it as poisoned, too.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-6-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:39:11 +02:00
Thomas Huth 2cd5394311 cpu: Introduce a wrapper for tlb_flush() that can be used in common code
Commit 1f5c00cfdb ("qom/cpu: move tlb_flush to cpu_common_reset")
moved the call to tlb_flush() from the target-specific reset handlers
into the common code qom/cpu.c file, and protected the call with
"#ifdef CONFIG_SOFTMMU" to avoid that it is called for linux-user
only targets. But since qom/cpu.c is common code, CONFIG_SOFTMMU is
*never* defined here, so the tlb_flush() was simply never executed
anymore. Fix it by introducing a wrapper for tlb_flush() in a file
that is re-compiled for each target, i.e. in translate-all.c.

Fixes: 1f5c00cfdb
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-5-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Thomas Huth cbca3722a3 include/exec/poison: Mark CONFIG_KVM as poisoned, too
CONFIG_KVM is only defined for target-specific code, so nobody should
use it by accident in common code. To avoid such subtle bugs,
CONFIG_KVM is now marked as poisoned in common code. The header
include/sysemu/kvm.h is somewhat special since it is included
all over the place from common code, too, so we need some extra
logic via "#ifdef NEED_CPU_H" here to make sure that we can
compile all files without problems.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-4-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Thomas Huth 2099935dbf Move CONFIG_KVM related definitions to kvm_i386.h
pc.h and sysemu/kvm.h are also included from common code (where
CONFIG_KVM is not available), so the #defines that depend on CONFIG_KVM
should not be declared here to avoid that anybody is using them in a
wrong way. Since we're also going to poison CONFIG_KVM for common code,
let's move them to kvm_i386.h instead. Most of the dummy definitions
from sysemu/kvm.h are also unused since the code that uses them is
only compiled for CONFIG_KVM (e.g. target/i386/kvm.c), so the unused
defines are also simply dropped here instead of being moved.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Thomas Huth 50b8a2d326 include/exec/poison: Add some more missing TARGET and CONFIG defines
The defines of some *-linux-user targets were still missing.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-2-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Sergio Andres Gomez Del Real 99f318322e vcpu_dirty: share the same field in CPUState for all accelerators
This patch simply replaces the separate boolean field in CPUState that
kvm, hax (and upcoming hvf) have for keeping track of vcpu dirtiness
with a single shared field.

Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170618191101.3457-1-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Mao Zhongyi 344475e77d pci: Convert shpc_init() to Error
In order to propagate error message better, convert shpc_init() to
Error also convert the pci_bridge_dev_initfn() to realize.

Cc: mst@redhat.com
Cc: marcel@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Mao Zhongyi f8cd1b0201 pci: Convert to realize
Convert i82801b11, io3130_upstream, io3130_downstream and
pcie_root_port devices to realize.

Cc: mst@redhat.com
Cc: marcel@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Mao Zhongyi 2784127857 pci: Replace pci_add_capability2() with pci_add_capability()
After the patch 'Make errp the last parameter of pci_add_capability()',
pci_add_capability() and pci_add_capability2() now do exactly the same.
So drop the wrapper pci_add_capability() of pci_add_capability2(), then
replace the pci_add_capability2() with pci_add_capability() everywhere.

Cc: pbonzini@redhat.com
Cc: rth@twiddle.net
Cc: ehabkost@redhat.com
Cc: mst@redhat.com
Cc: dmitry@daynix.com
Cc: jasowang@redhat.com
Cc: marcel@redhat.com
Cc: alex.williamson@redhat.com
Cc: armbru@redhat.com
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Mao Zhongyi 9a7c2a5970 pci: Make errp the last parameter of pci_add_capability()
Add Error argument for pci_add_capability() to leverage the errp
to pass info on errors. This way is helpful for its callers to
make a better error handling when moving to 'realize'.

Cc: pbonzini@redhat.com
Cc: rth@twiddle.net
Cc: ehabkost@redhat.com
Cc: mst@redhat.com
Cc: jasowang@redhat.com
Cc: marcel@redhat.com
Cc: alex.williamson@redhat.com
Cc: armbru@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:49 +03:00
Wei Wang 9b02e1618c virtio-net: enable configurable tx queue size
This patch enables the virtio-net tx queue size to be configurable
between 256 (the default queue size) and 1024 by the user when the
vhost-user backend is used.

Currently, the maximum tx queue size for other backends is 512 due
to the following limitations:
- QEMU backend: the QEMU backend implementation in some cases may
send 1024+1 iovs to writev.
- Vhost_net backend: there are possibilities that the guest sends
a vring_desc of memory which crosses a MemoryRegion thereby
generating more than 1024 iovs after translation from guest-physical
address in the backend.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-07-03 22:29:48 +03:00
Emilio G. Cota f3ced3c592 tcg: consistently access cpu->tb_jmp_cache atomically
Some code paths can lead to atomic accesses racing with memset()
on cpu->tb_jmp_cache, which can result in torn reads/writes
and is undefined behaviour in C11.

These torn accesses are unlikely to show up as bugs, but from code
inspection they seem possible. For example, tb_phys_invalidate does:
    /* remove the TB from the hash list */
    h = tb_jmp_cache_hash_func(tb->pc);
    CPU_FOREACH(cpu) {
        if (atomic_read(&cpu->tb_jmp_cache[h]) == tb) {
            atomic_set(&cpu->tb_jmp_cache[h], NULL);
        }
    }
Here atomic_set might race with a concurrent memset (such as the
ones scheduled via "unsafe" async work, e.g. tlb_flush_page) and
therefore we might end up with a torn pointer (or who knows what,
because we are under undefined behaviour).

This patch converts parallel accesses to cpu->tb_jmp_cache to use
atomic primitives, thereby bringing these accesses back to defined
behaviour. The price to pay is to potentially execute more instructions
when clearing cpu->tb_jmp_cache, but given how infrequently they happen
and the small size of the cache, the performance impact I have measured
is within noise range when booting debian-arm.

Note that under "safe async" work (e.g. do_tb_flush) we could use memset
because no other vcpus are running. However I'm keeping these accesses
atomic as well to keep things simple and to avoid confusing analysis
tools such as ThreadSanitizer.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1497486973-25845-1-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-30 11:40:59 -07:00
Emilio G. Cota 53f6672bcf gen-icount: use tcg_ctx.tcg_env instead of cpu_env
We are relying on cpu_env being defined as a global, yet most
targets (i.e. all but arm/a64) have it defined as a local variable.
Luckily all of them use the same "cpu_env" name, but really
compilation shouldn't break if the name of that local variable
changed.

Fix it by using tcg_ctx.tcg_env, which all targets set in their
translate_init function. This change also helps paving the way
for the upcoming "translation loop common to all targets" work.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1497639397-19453-3-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-30 11:40:59 -07:00
Emilio G. Cota ae06cb46b2 gen-icount: add missing inline to gen_tb_end
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1497639397-19453-2-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-06-30 11:40:59 -07:00
Peter Maydell 82d76dc7fc -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
 iQEtBAABCAAXBQJZVlttEBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccaSJQf/
 aKBxpeES6l4zYoa09+x7eJwjQXj6RdIpUNL5N4a/dhUsVJ2keWo6lPcjC/kcbwPR
 TJ4zYplm+suVzbNZG4XJGXLryo6ODIaHhpa/Ctsf3i6vQkRipxManpTbqqnyjt/e
 fAnwdFu0dFKbnqJECujDQgaZo1qWLuyZP++ZFt2kiZgOX/OdHpnQPH2U4l+22Cp6
 LB2FAFv0TwDjxmzM6EuOjsuLr9Rq3ckx7CVoyCnZuWkcaqn3/2cXhdNDErUB1nl7
 Pa/N3khz6PJs1Q8/H3GPH+BJHfORLRFR6dZ/eD8JU6qqBJfguvkSmi9cQUlbVsko
 KEBqUmL+bKJaK257lkUkfg==
 =wO2z
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging

# gpg: Signature made Fri 30 Jun 2017 15:08:45 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/block-pull-request:
  block: Exploit BDRV_BLOCK_EOF for larger zero blocks
  block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30 16:29:51 +01:00
Peter Maydell 6db174aed1 -----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZVkR0AAoJEPMMOL0/L748IKoP/jhlofUx3HjjS403EWOJEVHZ
 0EpvqjB9DCDGmAVcxw3eJ/0lqQJhWKEDpUuMR0MzCPnvSJRJlPZL8Kk55eWgRKSd
 3mmnhd8unFdBmxP3lv2QRd/zhU4shqLk5a2PbQNxpchLjklbOVLYQMHGwK3R5MXL
 o4B/c66ag/GH/aticHbU5Tz1xxIJ4sS/0dendJ3t088gW9tPlJLsSSxCttIKCoNv
 PtQSNT/rGjCoOZDOVQi9CPZ/AuFujvwh96Mkjx50TcFNpg/h88jRaw3f3FyBuRtz
 EOEhRN4vb/adLZtwVXzo9CrPeOccKxY0jk2gSQa/hJVaVGoC2dEgx29+Am2uULUZ
 Kp0il2rjMFMtDzRzswccuAy/mH5N5H+oPPR2U3/ygNK67twYALKE8LMt+j1AqGug
 l9zf8wRg/JEm2F+HU2Qe4ErG4WhHeNKi2h+6Vx1NXGiy0OAzL9hFuUxBrVw6kX9k
 ivzDJETq6d5+h8CI/SlaV/ljtDwwXDQoOgZz8Cq11zZHfitdiDDVMFiUX6nWBqto
 4e4W/tLVqgHV+rLKoFbju5cpw4wBvvliQqTs/KC/CfRtawNDWNpy7MTUb6j5M1q1
 Z+KJoqt4kJ6EG8DBsa9MS6qFDJ9Elt4Wv3Znby1GCjdTLb3fBOOfP6V7HoujwcB0
 0oUufyqJCfnCqDu0DmZX
 =ueS4
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier/tags/m68k-for-2.10-pull-request' into staging

# gpg: Signature made Fri 30 Jun 2017 13:30:44 BST
# gpg:                using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier/tags/m68k-for-2.10-pull-request:
  target/m68k: add fmovem
  target/m68k: add explicit single and double precision operations (part 2)
  target/m68k: add fsglmul and fsgldiv
  softfloat: define floatx80_round()
  target/m68k: add explicit single and double precision operations
  target/m68k: add fmovecr
  target/m68k: add fscc.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-30 14:59:01 +01:00
Eric Blake fb0d8654ff block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()
Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a
shortcut for subsequent operations, there are also some optimizations
that are made easier if we can quickly tell that *pnum will advance
us to the end of a file, via a new BDRV_BLOCK_EOF which gets set
by the block layer.

This just plumbs up the new bit; subsequent patches will make use
of it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170505021500.19315-2-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-30 21:48:06 +08:00
David Gibson 0dfabd39d5 spapr: Clean up DRC set_isolation_state() path
There are substantial differences in the various paths through
set_isolation_state(), both for setting to ISOLATED versus UNISOLATED
state and for logical versus physical DRCs.

So, split the set_isolation_state() method into isolate() and unisolate()
methods, and give it different implementations for the two DRC types.

Factor some minimal common checks, including for valid indicator values
(which we weren't previously checking) into rtas_set_isolation_state().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson 617367321e spapr: Clean up DRC set_allocation_state path
The allocation-state indicator should only actually be implemented for
"logical" DRCs, not physical ones.  Factor a check for this, and also for
valid indicator state values into rtas_set_allocation_state().  Because
they don't exist for physical DRCs, there's no reason that we'd ever want
more than one method implementation, so it can just be a plain function.

In addition, the setting to USABLE and setting to UNUSABLE paths in
set_allocation_state() don't actually have much in common.  So, split the
method separate functions for each parameter value (drc_set_usable()
and drc_set_unusable()).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
David Gibson 307b7715d0 spapr: Eliminate DRC 'signalled' state variable
The 'signalled' field in the DRC appears to be entirely a torturous
workaround for the fact that PCI devices were started in UNISOLATED state
for unclear reasons.

1) 'signalled' is already meaningless for logical (so far, all non PCI)
DRCs.  It's always set to true (at least at any point it might be tested),
and can't be assigned any real meaning due to the way signalling works for
logical DRCs.

2) For PCI DRCs, the only time signalled would be false is when non-zero
functions of a multifunction device are hotplugged, followed by function
zero (the other way around is explicitly not permitted). In that case the
secondary function DRCs are attached, but the notification isn't sent to
the guest until function 0 is plugged.

3) signalled being false is used to allow a DRC detach to switch mode
back to ISOLATED state, which allows a secondary function to be hotplugged
then unplugged with function 0 never inserted.  Without this a secondary
function starting in UNISOLATED state couldn't be detached again without
function 0 being inserted, all the functions configured by the guest, then
sent back to ISOLATED state.

4) But now that PCI DRCs start in ISOLATED state, there's nothing to be
done.  If the guest doesn't get the notification, it won't switch the
device to UNISOLATED state, so nothing prevents it from being unplugged.
If the guest does move it to UNISOLATED state without the signal (due to
a manual drmgr call, for instance) then it really isn't safe to unplug it.

So, this patch removes the signalled variable and all code related to it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-06-30 14:03:32 +10:00
Greg Kurz 46f7afa370 spapr: fix migration of ICPState objects from/to older QEMU
Commit 5bc8d26de2 ("spapr: allocate the ICPState object from under
sPAPRCPUCore") moved ICPState objects from the machine to CPU cores.
This is an improvement since we no longer allocate ICPState objects
that will never be used. But it has the side-effect of breaking
migration of older machine types from older QEMU versions.

This patch allows spapr to register dummy "icp/server" entries to vmstate.
These entries use a dedicated VMStateDescription that can swallow and
discard state of an incoming migration stream, and that don't send anything
on outgoing migration.

As for real ICPState objects, the instance_id is the cpu_index of the
corresponding vCPU, which happens to be equal to the generated instance_id
of older machine types.

The machine can unregister/register these entries when CPUs are dynamically
plugged/unplugged.

This is only available for pseries-2.9 and older machines, thanks to a
compat property.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-06-30 14:03:31 +10:00
David Gibson 7843c0d60d pseries: Move CPU compatibility property to machine
Server class POWER CPUs have a "compat" property, which is used to set the
backwards compatibility mode for the processor.  However, this only makes
sense for machine types which don't give the guest access to hypervisor
privilege - otherwise the compatibility level is under the guest's control.

To reflect this, this removes the CPU 'compat' property and instead
creates a 'max-cpu-compat' property on the pseries machine.  Strictly
speaking this breaks compatibility, but AFAIK the 'compat' option was
never (directly) used with -device or device_add.

The option was used with -cpu.  So, to maintain compatibility, this
patch adds a hack to the cpu option parsing to strip out any compat
options supplied with -cpu and set them on the machine property
instead of the now deprecated cpu property.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Tested-by: Andrea Bolognani <abologna@redhat.com>
2017-06-30 14:03:31 +10:00
Laurent Vivier 0f72129281 softfloat: define floatx80_round()
Add a function to round a floatx80 to the defined precision
(floatx80_rounding_precision)

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <20170628204241.32106-5-laurent@vivier.eu>
2017-06-29 20:27:39 +02:00
Haozhong Zhang 084140bd49 exec: fix access to ram_list.dirty_memory when sync dirty bitmap
In cpu_physical_memory_sync_dirty_bitmap(rb, start, ...), the 2nd
argument 'start' is relative to the start of the ramblock 'rb'. When
it's used to access the dirty memory bitmap of ram_list (i.e.
ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]->blocks[]), an offset to
the start of all RAM (i.e. rb->offset) should be added to it, which has
however been missed since c/s 6b6712efcc. For a ramblock of host memory
backend whose offset is not zero, cpu_physical_memory_sync_dirty_bitmap()
synchronizes the incorrect part of the dirty memory bitmap of ram_list
to the per ramblock dirty bitmap. As a result, a guest with host
memory backend may crash after migration.

Fix it by adding the offset of ramblock when accessing the dirty memory
bitmap of ram_list in cpu_physical_memory_sync_dirty_bitmap().

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20170628083704.24997-1-haozhong.zhang@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Juan Quintela <quintela@redhat.com>
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-28 12:23:58 +02:00
Halil Pasic d2164ad35c vmstate: error hint for failed equal checks
In some cases a failing VMSTATE_*_EQUAL does not mean we detected a bug,
but it's actually the best we can do. Especially in these cases a verbose
error message is required.

Let's introduce infrastructure for specifying a error hint to be used if
equal check fails. Let's do this by adding a parameter to the _EQUAL
macros called _err_hint. Also change all current users to pass NULL as
last parameter so nothing changes for them.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>

Message-Id: <20170623144823.42936-1-pasic@linux.vnet.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-06-28 11:18:44 +02:00