Commit graph

151 commits

Author SHA1 Message Date
Benjamin Herrenschmidt 90da0d5a70 ppc/spapr: Add "ibm,pa-features" property to the device-tree
LoPAPR defines a "ibm,pa-features" per-CPU device tree property which
describes extended features of the Processor Architecture.

This adds the property to the device tree. At the moment this is the
copy of what pHyp advertises except "I=1 (cache inhibited) Large Pages"
which is enabled for TCG and disabled when running under HV KVM host
with 4K system page size.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: rebased, changed commit log, moved ci_large_pages initialization,
renamed pa_features arrays]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 12:22:40 +11:00
Benjamin Herrenschmidt aa4bb58752 ppc: Add mmu_model defines for arch 2.03 and 2.07
This removes unused POWERPC_MMU_2_06a/POWERPC_MMU_2_06d.

This replaces POWERPC_MMU_64B with POWERPC_MMU_2_03 for POWER5+ to be
more explicit about the version of the PowerISA supported.

This defines POWERPC_MMU_2_07 and uses it for the POWER8 CPU family.
This will not have an immediate effect now but it will in the following
patch.

This should cause no behavioural change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: rebased, changed commit log]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 12:22:40 +11:00
David Gibson 6a81dd172c spapr_iommu: Rename vfio_accel parameter
The vfio_accel parameter used when creating a new TCE table (guest IOMMU
context) has a confusing name.  What it really means is whether we need the
TCE table created to be able to support VFIO devices.

VFIO is relevant, because when available we use in-kernel acceleration of
the TCE table, but that may not work with VFIO devices because updates to
the table are handled in kernel, bypass qemu and so don't hit qemu's
infrastructure for keeping the VFIO host IOMMU state in sync with the guest
IOMMU state.

Rename the parameter to "need_vfio" throughout.  This is a cosmetic change,
with no impact on the logic.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23 10:38:10 +11:00
Pavel Fedin dc9f06ca81 kvm: Pass PCI device pointer to MSI routing functions
In-kernel ITS emulation on ARM64 will require to supply requester IDs.
These IDs can now be retrieved from the device pointer using new
pci_requester_id() function.

This patch adds pci_dev pointer to KVM GSI routing functions and makes
callers passing it.

x86 architecture does not use requester IDs, but hw/i386/kvm/pci-assign.c
also made passing PCI device pointer instead of NULL for consistency with
the rest of the code.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Message-Id: <ce081423ba2394a4efc30f30708fca07656bc500.1444916432.git.p.fedin@samsung.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-19 10:13:07 +02:00
Markus Armbruster 4c315c2766 qdev: Protect device-list-properties against broken devices
Several devices don't survive object_unref(object_new(T)): they crash
or hang during cleanup, or they leave dangling pointers behind.

This breaks at least device-list-properties, because
qmp_device_list_properties() needs to create a device to find its
properties.  Broken in commit f4eb32b "qmp: show QOM properties in
device-list-properties", v2.1.  Example reproducer:

    $ qemu-system-aarch64 -nodefaults -display none -machine none -S -qmp stdio
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 4, "major": 2}, "package": ""}, "capabilities": []}}
    { "execute": "qmp_capabilities" }
    {"return": {}}
    { "execute": "device-list-properties", "arguments": { "typename": "pxa2xx-pcmcia" } }
    qemu-system-aarch64: /home/armbru/work/qemu/memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.
    Aborted (core dumped)
    [Exit 134 (SIGABRT)]

Unfortunately, I can't fix the problems in these devices right now.
Instead, add DeviceClass member cannot_destroy_with_object_finalize_yet
to mark them:

* Hang during cleanup (didn't debug, so I can't say why):
  "realview_pci", "versatile_pci".

* Dangling pointer in cpus: most CPUs, plus "allwinner-a10", "digic",
  "fsl,imx25", "fsl,imx31", "xlnx,zynqmp", because they create such
  CPUs

* Assert kvm_enabled(): "host-x86_64-cpu", host-i386-cpu",
  "host-powerpc64-cpu", "host-embedded-powerpc-cpu",
  "host-powerpc-cpu" (the powerpc ones can't currently reach the
  assertion, because the CPUs are only registered when KVM is enabled,
  but the assertion is arguably in the wrong place all the same)

Make qmp_device_list_properties() fail cleanly when the device is so
marked.  This improves device-list-properties from "crashes, hangs or
leaves dangling pointers behind" to "fails".  Not a complete fix, just
a better-than-nothing work-around.  In the above reproducer,
device-list-properties now fails with "Can't list properties of device
'pxa2xx-pcmcia'".

This also protects -device FOO,help, which uses the same machinery
since commit ef52358 "qdev-monitor: include QOM properties in -device
FOO, help output", v2.2.  Example reproducer:

    $ qemu-system-aarch64 -machine none -device pxa2xx-pcmcia,help

Before:

    qemu-system-aarch64: .../memory.c:1307: memory_region_finalize: Assertion `((&mr->subregions)->tqh_first == ((void *)0))' failed.

After:

    Can't list properties of device 'pxa2xx-pcmcia'

Cc: "Andreas Färber" <afaerber@suse.de>
Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Anthony Green <green@moxielogic.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Jia Liu <proljc@gmail.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: qemu-ppc@nongnu.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1443689999-12182-10-git-send-email-armbru@redhat.com>
2015-10-09 15:25:57 +02:00
Shraddha Barke f9b8e7f63a target-ppc: Remove unnecessary variable
Compress lines and remove the variable.

Signed-off-by: Shraddha Barke <shraddha.6596@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-10-08 19:46:01 +03:00
Thomas Huth 4d9392be6c ppc/spapr: Implement H_RANDOM hypercall in QEMU
The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

 qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

 qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
                   -device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Alexey Kardashevskiy ef9971dd69 spapr: Enable in-kernel H_SET_MODE handling
For setting debug watchpoints, sPAPR guests use H_SET_MODE hypercall.
The existing QEMU H_SET_MODE handler does not support this but
the KVM handler in HV KVM does. However it is not enabled.

This enables the in-kernel H_SET_MODE handler which handles:
- Completed Instruction Address Breakpoint Register
- Watch point 0 registers.

The rest is still handled in QEMU.

Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:10 +10:00
Michael Roth 2d103aae87 target-ppc: fix hugepage support when using memory-backend-file
Current PPC code relies on -mem-path being used in order for
hugepage support to be detected. With the introduction of
MemoryBackendFile we can now handle this via:
  -object memory-file-backend,mem-path=...,id=hugemem0 \
  -numa node,id=mem0,memdev=hugemem0

Management tools like libvirt treat the 2 approaches as
interchangeable in some cases, which can lead to user-visible
regressions even for previously supported guest configurations.

Fix these by also iterating through any configured memory
backends that may be backed by hugepages.

Since the old code assumed hugepages always backed the entirety
of guest memory, play it safe an pick the minimum across the
max pages sizes for all backends, even ones that aren't backed
by hugepages.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-07-07 17:44:49 +02:00
Jan Kiszka 4b8523ee89 kvm: First step to push iothread lock out of inner run loop
This opens the path to get rid of the iothread lock on vmexits in KVM
mode. On x86, the in-kernel irqchips has to be used because we otherwise
need to synchronize APIC and other per-cpu state accesses that could be
changed concurrently.

Regarding pre/post-run callbacks, s390x and ARM should be fine without
specific locking as the callbacks are empty. MIPS and POWER require
locking for the pre-run callback.

For the handle_exit callback, it is non-empty in x86, POWER and s390.
Some POWER cases could do without the locking, but it is left in
place for now.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-7-git-send-email-pbonzini@redhat.com>
2015-07-01 15:45:51 +02:00
Peter Maydell 3b730f570c Patch queue for ppc - 2015-06-03
Highlights this time around:
 
   - sPAPR: endian fixes, speedups, bug fixes, hotplug basics
   - add default ram size capability for machines (sPAPR defaults to 512MB now)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJVb3itAAoJECszeR4D/txgGncQAIz7tPRvMlCJyaGdYIkySUh4
 vbwAf4Z2Ddjv/gA/3G3oY1lC5RnhOJucxCbobzdayKecrdkdAJa/O/6RbKij4zMD
 svXseSpk8aKr4yrfNItxrjysJsp4cMS7APim7HcF5mOBJJqp0COkr1q97VteTfY1
 AdiSfBU5IEj0RZ+J1pSnMVf837gLiKSv+L2gTyGkb66VBMqZOZzu5UuoUhIOfa+R
 /tlm2VMRKe7vrU7Q4TL8Syn9UZnB03aNrKIXYN0VJy5WTePSMWPSQ6fbImTELEQB
 En87DGYt/QVs0eB7XNwzhF0REFblHECOzFhbOovCrvGZIa4xai8HJaJHMeaxQfkx
 4Aiby7Kv8wJgjn13OuBTvG7YWtw3hJcO1i0ePs2MmGz9sJNzhz0tyRSRglc3xN1Q
 RBrqyl3lOsnvNRzj/py7kYxCKtG8xlkaTSkO6FfXmt9UMW91pqWo4/2LCTON0zkx
 +gd2UW7JPw2u6ttzCu+b8BZv1ATovHoj2wXPP4iEYpe1sGT6qp4moZZ6CtWex/O3
 4Lhd9jJVJurMZl6e1pn/4bkcEhNvT2B484GmmerrZXrtlKm9wcepqMJC2bVCtzjT
 JBLNGTk6z8QKN5WRD+LWD3LgEjAEqV6nvqrmiwovMUtC0lJSHJTTAoeurM3h6jJn
 eaR4tzdEqHgDhzkOCHux
 =zWZp
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/agraf/tags/signed-ppc-for-upstream' into staging

Patch queue for ppc - 2015-06-03

Highlights this time around:

  - sPAPR: endian fixes, speedups, bug fixes, hotplug basics
  - add default ram size capability for machines (sPAPR defaults to 512MB now)

# gpg: Signature made Wed Jun  3 22:59:09 2015 BST using RSA key ID 03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"

* remotes/agraf/tags/signed-ppc-for-upstream: (40 commits)
  softmmu: support up to 12 MMU modes
  tcg: add TCG_TARGET_TLB_DISPLACEMENT_BITS
  tci: do not use CPUArchState in tcg-target.h
  Add David Gibson for sPAPR in MAINTAINERS file
  pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations
  spapr: override default ram size to 512MB
  machine: add default_ram_size to machine class
  spapr_pci: emit hotplug add/remove events during hotplug
  spapr_pci: enable basic hotplug operations
  pci: make pci_bar useable outside pci.c
  spapr_pci: create DRConnectors for each PCI slot during PHB realize
  spapr_pci: add dynamic-reconfiguration option for spapr-pci-host-bridge
  spapr_drc: add spapr_drc_populate_dt()
  spapr_events: event-scan RTAS interface
  spapr_events: re-use EPOW event infrastructure for hotplug events
  spapr_rtas: add ibm, configure-connector RTAS interface
  spapr: add rtas_st_buffer_direct() helper
  spapr_rtas: add get-sensor-state RTAS interface
  spapr_rtas: add set-indicator RTAS interface
  spapr_rtas: add get/set-power-level RTAS interfaces
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-04 14:04:14 +01:00
David Gibson 026bfd89cb pseries: Enable in-kernel H_LOGICAL_CI_{LOAD, STORE} implementations
qemu currently implements the hypercalls H_LOGICAL_CI_LOAD and
H_LOGICAL_CI_STORE as PAPR extensions.  These are used by the SLOF firmware
for IO, because performing cache inhibited MMIO accesses with the MMU off
(real mode) is very awkward on POWER.

This approach breaks when SLOF needs to access IO devices implemented
within KVM instead of in qemu.  The simplest example would be virtio-blk
using an iothread, because the iothread / dataplane mechanism relies on
an in-kernel implementation of the virtio queue notification MMIO.

To fix this, an in-kernel implementation of these hypercalls has been made,
(kernel commit 99342cf "kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM"
however, the hypercalls still need to be enabled from qemu.  This performs
the necessary calls to do so.

It would be nice to provide some warning if we encounter a problematic
device with a kernel which doesn't support the new calls.  Unfortunately,
I can't see a way to detect this case which won't either warn in far too
many cases that will probably work, or which is horribly invasive.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-06-03 23:56:55 +02:00
Eric Auger 1850b6b7d0 kvm: introduce kvm_arch_msi_data_to_gsi
On ARM the MSI data corresponds to the shared peripheral interrupt (SPI)
ID. This latter equals to the SPI index + 32. to retrieve the SPI index,
matching the gsi, an architecture specific function is introduced.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-06-02 14:56:25 +01:00
Paolo Bonzini 4c66375252 kvm: add support for memory transaction attributes
Let kvm_arch_post_run convert fields in the kvm_run struct to MemTxAttrs.
These are then passed to address_space_rw.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-04-30 16:55:32 +02:00
Marcel Apfelbaum b16565b396 kvm: add machine state to kvm_arch_init
Needed to query machine's properties.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2015-03-11 18:16:17 +01:00
Frank Blaschka 9e03a0405d kvm: extend kvm_irqchip_add_msi_route to work on s390
on s390 MSI-X irqs are presented as thin or adapter interrupts
for this we have to reorganize the routing entry to contain
valid information for the adapter interrupt code on s390.
To minimize impact on existing code we introduce an architecture
function to fixup the routing entry.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-01-12 10:14:04 +01:00
Cédric Le Goater e094c4c12f target-ppc: explicitly save page table headers in big endian
Currently, when the page tables are saved, the kvm_get_htab_header structs
and the ptes are assumed being big endian and dumped as a indistinct blob
in the statefile. This is no longer true when the host is little endian
and this breaks restoration.

This patch unfolds the kvmppc_save_htab routine to write explicitly the
kvm_get_htab_header structs in big endian. The ptes are left untouched.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:26 +01:00
Chen Gang cc64b1a194 target-ppc: kvm: Fix memory overflow issue about strncat()
strncat() will append additional '\0' to destination buffer, so need
additional 1 byte for it, or may cause memory overflow, just like other
area within QEMU have done.

And can use g_strdup_printf() instead of strncat(), which may be more
easier understanding.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:13 +01:00
Alexander Graf 6fd33a7502 PPC: KVM: Use vm check_extension for pv hcall
To find out whether we support the KVM hypercall interface we need to ask KVM
on the VM level rather than the global KVM level, because Book3S HV KVM does
not support it and we play conservative when both HV and PR are loaded.

So instead, use the VM helper that falls back to global KVM enumeration. That
should cover all cases.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Bharat Bhushan 88365d17d5 ppc: Add hw breakpoint watchpoint support
This patch adds hardware breakpoint and hardware watchpoint support
for ppc.

On BOOKE architecture we cannot share debug resources between QEMU
and guest because:
    When QEMU is using debug resources then debug exception must
    be always enabled. To achieve this we set MSR_DE and also set
    MSRP_DEP so guest cannot change MSR_DE.

    When emulating debug resource for guest we want guest
    to control MSR_DE (enable/disable debug interrupt on need).

    So above mentioned two configuration cannot be supported
    at the same time. So the result is that we cannot share
    debug resources between QEMU and Guest on BOOKE architecture.

In the current design QEMU gets priority over guest,
this means that if QEMU is using debug resources then guest
cannot use them and if guest is using debug resource then
qemu can overwrite them.

When QEMU is not able to handle debug exception then we inject program
exception to guest. Yes program exception NOT debug exception and the
reason is:
 1) QEMU and guest not sharing debug resources
 2) For software breakpoint QEMU uses a ehpriv-1 instruction;

 So there cannot be any reason that we are in qemu with exit reason
 KVM_EXIT_DEBUG  for guest set debug exception, only possibility is
 guest executed ehpriv-1 privilege instruction and that's why we are
 injecting program exception.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan 8a0548f94e ppc: Add software breakpoint support
This patch allow insert/remove software breakpoint.

When QEMU is not able to handle debug exception then we inject
program exception to guest because for software breakpoint QEMU
uses a ehpriv-1 instruction;
So there cannot be any reason that we are in qemu with exit reason
KVM_EXIT_DEBUG  for guest set debug exception, only possibility is
guest executed ehpriv-1 privilege instruction and that's why we are
injecting program exception.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
[agraf: make deflect comment booke/book3s agnostic]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan c371c2e3e0 ppc: synchronize excp_vectors for injecting exception
This patch synchronizes env->excp_vectors[] with env->iovr[].
This is required for using the existing interrupt injection mechanism
for kvm.

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Bharat Bhushan 3c902d4469 ppc: debug stub: Get trap instruction opcode from KVM
Get trap instruction opcode from KVM and this opcode will
be used for setting software breakpoint in following patch

Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Nikunj A Dadhania ef9514431d spapr: add uuid/host details to device tree
Useful for identifying the guest/host uniquely within the
guest. Adding following properties to the guest root node.

vm,uuid - uuid of the guest
host-model - Host model number
host-serial - Host machine serial number
hypervisor type - Tells its "kvm"

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexey Kardashevskiy 658fa66b81 spapr: Move RMA memory region registration code
PPC970 does not support VRMA (virtual RMA) so real memory required
for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl.
Later this memory is used as a part of the guest RAM area.
The RMA allocating code also registers a memory region for this piece
of RAM.

We are going to simplify memory regions layout: RMA memory region
will be a subregion in the RAM memory region, both starting from zero.
This way we will not have to take care of start address alignment for
the piece of RAM next to the RMA.

This moves memory region business closer to the RAM memory region
creation/allocation code.

As this is a mechanical patch, no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on non-kvm systems]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy 9bb62a0702 spapr_iommu: Make in-kernel TCE table optional
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handle H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.

Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is requited.

So until the host kernel gets VFIO support for H_PUT_TCE, we better not
to register VFIO's TCE in the host.

This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.

This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.

This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic
is the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.

This is a preparation patch so no change in behaviour is expected

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexander Graf d13fc32ecf PPC: KVM: Make pv hcall endian agnostic
There were a few revisions of the Linux kernel that incorrectly swapped
the hcall instructions when they saw ePAPR compliant hypercalls.

We already have fixups for those in place when running with PR KVM, but
HV KVM and systems that don't implement hypercalls at all are still broken
because they fall back to the QEMU implementation of fallback hypercalls.

So let's make the fallback hypercall instruction path endian agnostic. This
only really works well for 64bit guests, but I don't think there are any 32bit
systems left that don't implement real pv hcall support, so we'll never get
into this code path.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Alexey Kardashevskiy 80b3f79b99 KVM: target-ppc: Enable TM state migration
This adds migration support for registers saved before Transactional
Memory (TM) transaction started.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00
Alexander Graf 87a91de61a KVM: PPC: Expose fixup hcall capability
New kvm versions expose a PPC_FIXUP_HCALL capability. Make it visible to
machine code so we can take decisions based on it.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexey Kardashevskiy 523e7b8ab8 spapr_iommu: Get rid of window_size in sPAPRTCETable
This removes window_size as it is basically a copy of nb_table
shifted by SPAPR_TCE_PAGE_SHIFT. As new dynamic DMA windows are
going to support windows as big as the entire RAM and this number
will be bigger that 32 capacity, we will have to do something
about @window_size anyway and removal seems to be the right way to go.

This removes dma_window_start/dma_window_size from sPAPRPHBState as
they are no longer used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy da95324ebe spapr_iommu: Enable multiple TCE requests
Currently only single TCE entry per request is supported (H_PUT_TCE).
However PAPR+ specification allows multiple entry requests such as
H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host
kernel via ioctls, support of these calls can accelerate IOMMU operations.

This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT.

This advertises "multi-tce" capability to the guest if the host kernel
supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy 6db5bb0f54 KVM: PPC: Enable compatibility mode
The host kernel implements a KVM_REG_PPC_ARCH_COMPAT register which
this uses to enable a compatibility mode if any chosen.

This sets the KVM_REG_PPC_ARCH_COMPAT register in KVM. ppc_set_compat()
signals the caller if the mode cannot be enabled by the host kernel.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix TCG compat setting]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00
Alexey Kardashevskiy 98a8b52442 spapr: Add support for time base offset migration
This allows guests to have a different timebase origin from the host.

This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.

This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.

This adds kvm_access_one_reg() to access a special register which is not
in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch.

The feature must be present in the host kernel.

This bumps vmstate_spapr::version_id and enables new vmstate_ppc_timebase
only for it. Since the vmstate_spapr::minimum_version_id remains
unchanged, migration from older QEMU is supported but without
vmstate_ppc_timebase.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:35 +02:00
Alexander Graf 08215d8fd8 KVM: PPC: Don't secretly add 1T segment feature to CPU
When we select a CPU type that does not support 1TB segments, we should
not expose 1TB just because KVM supports 1TB segments. User configuration
always wins over feature availability.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:33 +02:00
Alexey Kardashevskiy 5b79b1cadd target-ppc: Create versionless CPU class per family if KVM
At the moment generic version-less CPUs are supported via hardcoded aliases.
For example, POWER7 is an alias for POWER7_v2.1. So when QEMU is started
with -cpu POWER7, the POWER7_v2.1 class instance is created.

This approach works for TCG and KVMs other than HV KVM. HV KVM cannot emulate
PVR value so the guest always sees the real PVR. HV KVM will not allow setting
PVR other that the host PVR because of that (the kernel patch for it is on
its way). So in most cases it is impossible to run QEMU with -cpu POWER7
unless the host PVR is exactly the same as the one from the alias (which
is now POWER7_v2.3). It was decided that under HV KVM QEMU should use
-cpu host.

Using "host" CPU type creates a problem for management tools such as libvirt
because they want to know in advance if the destination guest can possibly
run on the destination. Since the "host" type is really not a type and will
always work with any KVM, there is no way for libvirt to know if the migration
will success.

This registers additional CPU class derived from the host CPU family.
The name for it is taken from @desc field of the CPU family class.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:27 +02:00
Paolo Bonzini 50a2c6e55f kvm: reset state from the CPU's reset method
Now that we have a CPU object with a reset method, it is better to
keep the KVM reset close to the CPU reset.  Using qemu_register_reset
as we do now keeps them far apart.

With this patch, PPC no longer calls the kvm_arch_ function, so
it can get removed there.  Other arches call it from their CPU
reset handler, and the function gets an ARMCPU/X86CPU/S390CPU.

Note that ARM- and s390-specific functions are called kvm_arm_*
and kvm_s390_*, while x86-specific functions are called kvm_arch_*.
That follows the convention used by the different architectures.
Changing that is the topic of a separate patch.

Reviewed-by: Gleb Natapov <gnatapov@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-05-13 13:12:40 +02:00
Cornelia Huck 48add816cf ppc: use kvm_vcpu_enable_cap()
Convert existing users of KVM_ENABLE_CAP to new helper.

Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-04-30 14:39:58 +02:00
Andreas Färber a47dddd734 exec: Change cpu_abort() argument to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:52:28 +01:00
Andreas Färber 27103424c4 cpu: Move exception_index field from CPU_COMMON to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2014-03-13 19:20:46 +01:00
Alexey Kardashevskiy 0f20ba62c3 target-ppc: spapr: e500: fix to use cpu_dt_id
This makes use of @cpu_dt_id and related API in:
1. emulated XICS hypercall handlers as they receive fixed CPU indexes;
2. XICS-KVM to enable in-kernel XICS on right CPU;
3. device-tree renderer.

This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
can accept command-line CPU indexes again.

This changes kvm_arch_vcpu_id() to use ppc_get_vcpu_dt_id() as at the moment
KVM CPU id and device tree ID are calculated using the same algorithm.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:07:04 +01:00
Alexey Kardashevskiy 0ce470cd4c target-ppc: add PowerPCCPU::cpu_dt_id
Normally CPUState::cpu_index is used to pick the right CPU for various
operations. However default consecutive numbering does not always work
for POWERPC.

These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
and used to call KVM VCPU's ioctls. In order to achieve this,
kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
cpu_index by the number of threads per core.

This approach has disadvantages such as:
1. NUMA configuration stays broken after the fixup;
2. CPU-targeted commands from the QEMU Monitor do not work properly as
CPU indexes have been fixed and there is no clear way for the user to
know what the new CPU indexes are.

This introduces a @cpu_dt_id field in the CPUPPCState struct which
is initialized from @cpu_index by default and can be fixed later
to meet the device tree requirements.

This adds an API to handle @cpu_dt_id.

This removes kvmppc_fixup_cpu() as it is not more needed, @cpu_dt_id
is calculated in ppc_cpu_realize().

This will be used later in machine code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:07:03 +01:00
Aneesh Kumar K.V c138593380 target-ppc: Update ppc_hash64_store_hpte to support updating in-kernel htab
This support updating htab managed by the hypervisor. Currently we don't have
any user for this feature. This actually bring the store_hpte interface
in-line with the load_hpte one. We may want to use this when we want to
emulate henter hcall in qemu for HV kvm.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[ folded fix for the "warn_unused_result" build break in
  kvmppc_hash64_write_pte(), Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:07:03 +01:00
Aneesh Kumar K.V 7c43bca004 target-ppc: Fix page table lookup with kvm enabled
With kvm enabled, we store the hash page table information in the hypervisor.
Use ioctl to read the htab contents. Without this we get the below error when
trying to read the guest address

 (gdb) x/10 do_fork
 0xc000000000098660 <do_fork>:   Cannot access memory at address 0xc000000000098660
 (gdb)

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[ fixes for 32 bit build (casts!), ldq_phys() API change,
  Greg Kurz <gkurz@linux.vnet.ibm.com ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:07:02 +01:00
Aneesh Kumar K.V f3c75d42ad target-ppc: Fix htab_mask calculation
Correctly update the htab_mask using the return value of
KVM_PPC_ALLOCATE_HTAB ioctl. Also we don't update sdr1
on GET_SREGS for HV. We check for external htab and if
found true, we don't need to update sdr1

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[ fixed pte group offset computation in ppc_hash64_htab_lookup() that
  caused TCG to fail, Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:07:02 +01:00
Alexey Kardashevskiy b36f100e17 PPC: KVM: suppress warnings about not supported SPRs
PR KVM lacks support of many SPRs in set/get one register API but it does
really break PR KVM. So convert them to switchable traces for now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:06:45 +01:00
Alexey Kardashevskiy 69b31b907b PPC: KVM: store SLB slot number
When ppc_store_slb() is called from kvm_arch_get_registers(), it stores
a SLB in CPUPPCState::slb[slot]. However it drops the slot number from
ESID so when kvm_arch_put_registers() puts SLBs back to KVM, they do not
have correct "index" field anymore. This broke migration with LPCR_AIR
enabled as now the guest is handling interrupts in virtual mode and unable
to reconstruct correct SLBs anymore.

This adds "index" field for valid SLBs when putting them to KVM.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-03-05 03:06:44 +01:00
Alexander Graf 933b19ea97 PPC: KVM: Add missing address space to ldl_phys helper
We now have to pass an address space to our _phys helpers. During the
transition apparently the EPR exit path missed out, so let's put it there.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-14 10:42:31 +00:00
Alexey Kardashevskiy 3bc9ccc054 powerpc: add PVR mask support
IBM POWERPC processors encode PVR as a CPU family in higher 16 bits and
a CPU version in lower 16 bits. Since there is no significant change
in behavior between versions, there is no point to add every single CPU
version in QEMU's CPU list. Also, new CPU versions of already supported
CPU won't break the existing code.

This adds PVR value/mask support for KVM, i.e. for -cpu host option.

As CPU family class name for POWER7 is "POWER7-family", there is no need
to touch aliases.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-12-20 01:57:45 +01:00
Greg Kurz c65f9a07a7 target-ppc: add stubs for KVM breakpoints
The latest update to v3.13-rc3 (bf63839f) breaks the
ppc build with KVM:

kvm-all.o: In function `kvm_update_guest_debug':
kvm-all.c:1910: undefined reference to `kvm_arch_update_guest_debug'
kvm-all.o: In function `kvm_insert_breakpoint':
kvm-all.c:1937: undefined reference to `kvm_arch_insert_sw_breakpoint'
kvm-all.c:1945: undefined reference to `kvm_arch_insert_hw_breakpoint'
kvm-all.o: In function `kvm_remove_breakpoint':
kvm-all.c:1977: undefined reference to `kvm_arch_remove_sw_breakpoint'
kvm-all.c:1985: undefined reference to `kvm_arch_remove_hw_breakpoint'
kvm-all.o: In function `kvm_remove_all_breakpoints':
kvm-all.c:2009: undefined reference to `kvm_arch_remove_sw_breakpoint'
kvm-all.c:2006: undefined reference to `kvm_arch_remove_sw_breakpoint'
kvm-all.c:2017: undefined reference to `kvm_arch_remove_all_hw_breakpoints'

We need stubs until something gets implemented.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-12-20 01:57:44 +01:00
Aneesh Kumar K.V d83af16786 target-ppc: Use #define for max slb entries
Instead of opencoding 64 use MAX_SLB_ENTRIES. We don't update the kernel
header here.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-25 23:25:48 +02:00