Commit graph

548 commits

Author SHA1 Message Date
David Gibson 2309832afd spapr: Maximum (HPT) pagesize property
The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous.  In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.

At present we handle this by changing what we advertise to the guest based
on the backing pagesizes.  This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.

As a start on fixing this, we add a new capability parameter to the
pseries machine type which gives the maximum allowed pagesizes for an
HPT guest.  For now we just create and validate the parameter without
making it do anything.

For backwards compatibility, on older machine types we set it to the max
available page size for the host.  For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-06-22 14:19:07 +10:00
Cédric Le Goater 71b5c8d26e spapr: remove unused spapr_irq routines
spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
Cédric Le Goater 4fe75a8ccd spapr: split the IRQ allocation sequence
Today, when a device requests for IRQ number in a sPAPR machine, the
spapr_irq_alloc() routine first scans the ICSState status array to
find an empty slot and then performs the assignement of the selected
numbers. Split this sequence in two distinct routines : spapr_irq_find()
for lookups and spapr_irq_claim() for claiming the IRQ numbers.

This will ease the introduction of a static layout of IRQ numbers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
David Gibson 9f6edd066e spapr: Compute effective capability values earlier
Previously, the effective values of the various spapr capability flags
were only determined at machine reset time.  That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.

But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it.  That lets us move the resolution of the
capability defaults much earlier.

This is going to be necessary for some future capabilities.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-06-21 21:22:53 +10:00
David Gibson ad99d04c76 target/ppc: Allow cpu compatiblity checks based on type, not instance
ppc_check_compat() is used in a number of places to check if a cpu object
supports a certain compatiblity mode, subject to various constraints.

It takes a PowerPCCPU *, however it really only depends on the cpu's class.
We have upcoming cases where it would be useful to make compatibility
checks before we fully instantiate the cpu objects.

ppc_type_check_compat() will now make an equivalent check, but based on a
CPU's QOM typename instead of an instantiated CPU object.

We make use of the new interface in several places in spapr, where we're
essentially making a global check, rather than one specific to a particular
cpu.  This avoids some ugly uses of first_cpu to grab a "representative"
instance.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-06-21 21:22:53 +10:00
Greg Kurz b94020268e spapr_cpu_core: migrate per-CPU data
A per-CPU machine data pointer was recently added to PowerPCCPU. The
motivation is to to hide platform specific details from the core CPU
code. This per-CPU data can hold state which is relevant to the guest
though, eg, Virtual Processor Areas, and we should migrate this state.

This patch adds the plumbing so that we can migrate the per-CPU data
for PAPR guests. We only do this for newer machine types for the sake
of backward compatibility. No state is migrated for the moment: the
vmstate_spapr_cpu_state structure will be populated by subsequent
patches.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix some trivial spelling and spacing errors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
Greg Kurz 844afc54ae spapr: fix xics_system_init() error path
Commit 3d85885a1b tried to fix error handling, but it actually
went into the wrong direction by dropping the local Error *.

In the default KVM case, the rationale is to try the in-kernel XICS first,
and if not possible, to fallback to userland XICS. Passing errp everywhere
makes this fallback impossible if errp is &error_fatal (which happens to
be the case). And anyway, if the caller would pass a regular &local_err,
things would be worse: we could possibly pass an already set *errp to
error_setg() and crash, or return an error even in case of success.

So we definitely need a local Error * and only propagate it when we're
done with the fallback logic. This is what this patch does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-18 09:43:19 +10:00
David Hildenbrand a4261be172 spapr: handle cpu core unplug via hotplug handler chain
Factor out cpu core unplug into separate function from
spapr_core_release(). Then use generic hotplug_handler_unplug() to trigger
cpu core unplug, which would call spapr_machine_device_unplug() ->
spapr_core_unplug() in the end.

This way unplug operation is not buried in spapr internals and located
in the same place like in other targets, following similar
logic/call chain across targets.

Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 3ec71474ca spapr: handle pc-dimm unplug via hotplug handler chain
Factor out memory unplug into separate function from spapr_lmb_release().
Then use generic hotplug_handler_unplug() to trigger memory unplug,
which will call spapr_machine_device_unplug() -> spapr_memory_unplug()
in the end.

This way unplug operation is not buried in lmb internals and located in
the same place like in other targets, following similar logic/call chain
across targets.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 88432f44aa spapr: introduce machine unplug handler
We'll be handling unplug of e.g. CPUs and PCDIMMs  via the general
hotplug handler soon, so let's add that handler function.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 4e8a01bdb2 spapr: move memory hotplug support check into spapr_memory_pre_plug()
Let's finish cleaning up the hotplug handler. This check can be
performed in the pre_plug code as the very first thing.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 81985f3be9 spapr: move lookup of the node into spapr_memory_plug()
Let's clean the hotplug handler up by moving lookup of the node into
the function where it is actually being used.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand fcc8ef17e2 spapr: no need to verify the node
The node property can always be queried and the value has already been
verified in pc_dimm_realize().

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
Peter Maydell afd76ffba9 * Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
 * IPMI migration fix (Corey)
 * QOM improvements (Alexey, Philippe, me)
 * Memory API cleanups (Jay, me, Tristan, Peter)
 * WHPX fixes and improvements (Lucian)
 * Chardev fixes (Marc-André)
 * IOMMU documentation improvements (Peter)
 * Coverity fixes (Peter, Philippe)
 * Include cleanup (Philippe)
 * -clock deprecation (Thomas)
 * Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
 * Configurability improvements (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlsRd2UUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPG8Qf+M85E8xAQ/bhs90tAymuXkUUsTIFF
 uI76K8eM0K3b2B+vGckxh1gyN5O3GQaMEDL7vITfqbX+EOH5U2lv8V9JRzf2YvbG
 Zahjd4pOCYzR0b9JENA1r5U/J8RntNrBNXlKmGTaXOaw9VCXlZyvgVd9CE3z/e2M
 0jSXMBdF4LB3UzECI24Va8ejJxdSiJcqXA2j3J+pJFxI698i+Z5eBBKnRdo5TVe5
 jl0TYEsbS6CLwhmbLXmt3Qhq+ocZn7YH9X3HjkHEdqDUeYWyT9jwUpa7OHFrIEKC
 ikWm9er4YDzG/vOC0dqwKbShFzuTpTJuMz5Mj4v8JjM/iQQFrp4afjcW2g==
 =RS/B
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
* IPMI migration fix (Corey)
* QOM improvements (Alexey, Philippe, me)
* Memory API cleanups (Jay, me, Tristan, Peter)
* WHPX fixes and improvements (Lucian)
* Chardev fixes (Marc-André)
* IOMMU documentation improvements (Peter)
* Coverity fixes (Peter, Philippe)
* Include cleanup (Philippe)
* -clock deprecation (Thomas)
* Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
* Configurability improvements (me)

# gpg: Signature made Fri 01 Jun 2018 17:42:13 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (56 commits)
  hw: make virtio devices configurable via default-configs/
  hw: allow compiling out SCSI
  memory: Make operations using MemoryRegionIoeventfd struct pass by pointer.
  char: Remove unwanted crlf conversion
  qdev: Remove DeviceClass::init() and ::exit()
  qdev: Simplify the SysBusDeviceClass::init path
  hw/i2c: Use DeviceClass::realize instead of I2CSlaveClass::init
  hw/i2c/smbus: Use DeviceClass::realize instead of SMBusDeviceClass::init
  target/i386/kvm.c: Remove compatibility shim for KVM_HINTS_REALTIME
  Update Linux headers to 4.17-rc6
  target/i386/kvm.c: Handle renaming of KVM_HINTS_DEDICATED
  scripts/update-linux-headers: Handle kernel license no longer being one file
  scripts/update-linux-headers: Handle __aligned_u64
  virtio-gpu-3d: Define VIRTIO_GPU_CAPSET_VIRGL2 elsewhere
  gdbstub: Prevent fd leakage
  docs/interop: add "firmware.json"
  ipmi: Use proper struct reference for KCS vmstate
  vmstate: Add a VSTRUCT type
  tcg: remove softfloat from --disable-tcg builds
  qemu-options: Mark the non-functional -clock option as deprecated
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-01 18:24:16 +01:00
Philippe Mathieu-Daudé 0304f9ec9c hw: Do not include "sysemu/block-backend.h" if it is not necessary
Remove those unneeded includes to speed up the compilation
process a little bit. (Continue 7eceff5b5a cleanup)

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-13-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 14:15:10 +02:00
Peter Maydell d8c0c7af80 ppc: Rename 2.13 machines to 3.0
Rename the 2.13 machines to match the number we're going to
use for the next release.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-id: 20180522104000.9044-5-peter.maydell@linaro.org
2018-05-29 11:28:46 +01:00
Igor Mammedov debbdc0018 make sure that we aren't overwriting mc->get_hotplug_handler by accident
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1525691524-32265-5-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-10 18:10:56 +01:00
David Hildenbrand 0c9269a52d spapr: rename "hotplug memory" terminology to "device memory"
Let's make it clear at relevant places that we are dealing with device
memory. That it can be used for memory hotplug is just a special case.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-11-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand bd6c3e4a49 pc-dimm: pass in the machine and to the MemoryHotplugState
We use the machine internally either way, so let's just pass it in then.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand acc7fa17e6 pc-dimm: no need to pass the memory region
We can just query it ourselves. When unplugging, we should always be
able to the region (as it was previously plugged). E.g. PPC already
assumed that and used &error_abort.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand b0c14ec4ef machine: make MemoryHotplugState accessible via the machine
Let's allow to query the MemoryHotplugState directly from the machine.
If the pointer is NULL, the machine does not support memory devices. If
the pointer is !NULL, the machine supports memory devices and the
data structure contains information about the applicable physical
guest address space region.

This allows us to generically detect if a certain machine has support
for memory devices, and to generically manage it (find free address
range, plug/unplug a memory region).

We will rename "MemoryHotplugState" to something more meaningful
("DeviceMemory") after we completed factoring out the pc-dimm code into
MemoryDevice code.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
[ehabkost: squashed fix to use g_malloc0()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand 2cc0e2e814 pc-dimm: factor out MemoryDevice interface
On the qmp level, we already have the concept of memory devices:
    "query-memory-devices"
Right now, we only support NVDIMM and PCDIMM.

We want to map other devices later into the address space of the guest.
Such device could e.g. be virtio devices. These devices will have a
guest memory range assigned but won't be exposed via e.g. ACPI. We want
to make them look like memory device, but not glued to pc-dimm.

Especially, it will not always be possible to have TYPE_PC_DIMM as a parent
class (e.g. virtio devices). Let's use an interface instead. As a first
part, convert handling of
- qmp_pc_dimm_device_list
- get_plugged_memory_size
to our new model. plug/unplug stuff etc. will follow later.

A memory device will have to provide the following functions:
- get_addr(): Necessary, as the property "addr" can e.g. not be used for
              virtio devices (already defined).
- get_plugged_size(): The amount this device offers to the guest as of
                      now.
- get_region_size(): Because this can later on be bigger than the
                     plugged size.
- fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
Greg Kurz 0550b1206a spapr: don't advertise radix GTSE if max-compat-cpu < power9
On a POWER9 host, if a guest runs in pre POWER9 compat mode, it necessarily
uses the hash MMU mode. In this case, we shouldn't advertise radix GTSE in
the ibm,arch-vec-5-platform-support DT property as the current code does.
The first reason is that it doesn't make sense, and the second one is that
causes the CAS-negotiated options subsection to be migrated. This breaks
backward migration to QEMU 2.7 and older versions on POWER8 hosts:

qemu-system-ppc64: error while loading state for instance 0x0 of device
 'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory

This patch hence initialize CPUs a bit earlier so that we can check the
requested compat mode, and don't set OV5_MMU_RADIX_GTSE for power8 and
older.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04 15:00:37 +10:00
Greg Kurz aef19c04bf spapr: don't migrate "spapr_option_vector_ov5_cas" to pre 2.8 machines
a324d6f166 "spapr: Support ibm,dynamic-memory-v2 property" added
a new feature in the set of CAS-negotiatable options. This causes
the CAS-negotiated options subsection to be migrated, even for old
machine types that don't know about it, and breaks backward migration
to QEMU 2.7 and older versions:

qemu-system-ppc64: error while loading state for instance 0x0 of device
 'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory

Since this feature only affects boot time behaviour, it should be
filtered out when we decide to migrate CAS-negotiated options, like
we already do with OV5_FORM1_AFFINITY and OV5_DRCONF_MEMORY.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04 15:00:37 +10:00
David Gibson 84369f639e spapr: Make a helper to set up cpu entry point state
Under PAPR, only the boot CPU is active when the system starts.  Other cpus
must be explicitly activated using an RTAS call.  The entry state for the
boot and secondary cpus isn't identical, but it has some things in common.
We're going to add a bit more common setup later, too, so to simplify
make a helper which sets up the common entry state for both boot and
secondary cpu threads.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-05-04 15:00:37 +10:00
David Gibson 090052aa08 spapr: Remove support for explicitly allocated RMAs
Current POWER cpus allow for a VRMA, a special mapping which describes a
guest's view of memory when in real mode (MMU off, from the guest's point
of view).  Older cpus didn't have that which meant that to support a guest
a special host-contiguous region of memory was needed to give the guest its
Real Mode Area (RMA).

KVM used to provide special calls to allocate a contiguous RMA for those
cases.  This was useful in the early days of KVM on Power to allow it to be
tested on PowerPC 970 chips as used in Macintosh G5 machines.  Now, those
machines are so old as to be almost irrelevant.

The normal qemu deprecation process would require this to be marked
deprecated then removed in 2 releases.  However, this can only be used
with corresponding support in the host kernel - which was dropped
years ago (in c17b98cf "KVM: PPC: Book3S HV: Remove code for PPC970
processors" of 2014-12-03 to be precise).  Therefore it should be ok
to drop this immediately.

Just to be clear this only affects *KVM HV* guests with PowerPC 970,
and those already require an ancient host kernel.  TCG and KVM PR
guests with PowerPC 970 should still work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Thomas Huth <thuth@redhat.com>
2018-05-04 11:15:18 +10:00
Bharata B Rao a324d6f166 spapr: Support ibm,dynamic-memory-v2 property
The new property ibm,dynamic-memory-v2 allows memory to be represented
in a more compact manner in device tree.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:23 +10:00
Serhii Popovych da9f80fbad spapr: Add ibm,max-associativity-domains property
Now recent kernels (i.e. since linux-stable commit a346137e9142
("powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes")
support this property to mark initially memory-less NUMA nodes as "possible"
to allow further memory hot-add to them.

Advertise this property for pSeries machines to let guest kernels detect
maximum supported node configuration and benefit from kernel side change
when hot-add memory to specific, possibly empty before, NUMA node.

Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:23 +10:00
David Gibson 67d7d66f27 target/ppc: Fold slb_nr into PPCHash64Options
The env->slb_nr field gives the size of the SLB (Segment Lookaside Buffer).
This is another static-after-initialization parameter of the specific
version of the 64-bit hash MMU in the CPU.  So, this patch folds the field
into PPCHash64Options with the other hash MMU options.

This is a bit more complicated that the things previously put in there,
because slb_nr was foolishly included in the migration stream.  So we need
some of the usual dance to handle backwards compatible migration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-04-27 18:05:22 +10:00
David Gibson 26cd35b861 target/ppc: Fold ci_large_pages flag into PPCHash64Options
The ci_large_pages boolean in CPUPPCState is only relevant to 64-bit hash
MMU machines, indicating whether it's possible to map large (> 4kiB) pages
as cache-inhibitied (i.e. for IO, rather than memory).  Fold it as another
flag into the PPCHash64Options structure.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-04-27 18:05:22 +10:00
David Gibson 58969eeece target/ppc: Move 1T segment and AMR options to PPCHash64Options
Currently env->mmu_model is a bit of an unholy mess of an enum of distinct
MMU types, with various flag bits as well.  This makes which bits of the
field should be compared pretty confusing.

Make a start on cleaning that up by moving two of the flags bits -
POWERPC_MMU_1TSEG and POWERPC_MMU_AMR - which are specific to the 64-bit
hash MMU into a new flags field in PPCHash64Options structure.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-04-27 18:05:22 +10:00
David Gibson 644a2c99a9 target/ppc: Pass cpu instead of env to ppc_create_page_sizes_prop()
As a rule we prefer to pass PowerPCCPU instead of CPUPPCState, and this
change will make some things simpler later on.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-04-27 18:05:22 +10:00
Greg Kurz b2692d5fed spapr: drop useless dynamic sysbus device sanity check
Since commit 7da79a167a, the machine class init function registers
dynamic sysbus device types it supports. Passing an unsupported device
type on the command line causes QEMU to exit with an error message
just after machine init.

It is hence not needed to do the same sanity check at machine reset.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:22 +10:00
Serhii Popovych e47f1d2786 Revert "spapr: Don't allow memory hotplug to memory less nodes"
This reverts commit b556854bd8.

Leave change @node type from uint32_t to to int from reverted commit
because node < 0 is always false.

Note that implementing capability or some trick to detect if guest
kernel does not support hot-add to memory: this returns previous
behavour where memory added to first non-empty node.

Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:22 +10:00
Greg Kurz 1d36c75a9e spapr: drop useless sanity check in spapr_irq_alloc*()
Both spapr_irq_alloc() and spapr_irq_alloc_block() have an errp
parameter, but they don't use it if XICS hasn't been initialized
yet.

This is doubly wrong:

- all callers do pass a non-null Error **, ie, they expect an error
  to be propagated in case of failure

- XICS obviously needs to be initialized before anything starts allocating
  IRQs

So this patch turns the check into an assert.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:22 +10:00
David Gibson 8a4fd427fe spapr: Introduce pseries-2.13 machine type
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:22 +10:00
Peter Maydell b8846a4d63 vl.c: new function serial_max_hds()
Create a new function serial_max_hds() which returns the number of
serial ports defined by the user. This is needed only by spapr.

This allows us to remove the MAX_SERIAL_PORTS define.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180420145249.32435-14-peter.maydell@linaro.org
2018-04-26 13:58:29 +01:00
Peter Maydell 9bca0edb28 Change references to serial_hds[] to serial_hd()
Change all the uses of serial_hds[] to go via the new
serial_hd() function. Code change produced with:
 find hw -name '*.[ch]' | xargs sed -i -e 's/serial_hds\[\([^]]*\)\]/serial_hd(\1)/g'

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20180420145249.32435-8-peter.maydell@linaro.org
2018-04-26 13:57:00 +01:00
Alexey Kardashevskiy 127f03e442 spapr: Initialize reserved areas list in FDT in H_CAS handler
At the moment the device tree produced by the H_CAS handler has no
reserved map initialized at all which is not correct as at least one
empty record is required to be present as a marker of the end.
This does not cause problems now as the only consumer is SLOF which
does not look at the reserved map area.

However when DTC's "Improve libfdt's memory safety" changeset hits
the QEMU upstream, there will be errors reported and crashes observed.

This fixes the problem by adding an empty entry to the reserved map,
just like create_device_tree() does already.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-10 10:05:38 +10:00
Peter Maydell ed627b2ad3 virtio,vhost,pci,pc: features, cleanups
SRAT tables for DIMM devices
 new virtio net flags for speed/duplex
 post-copy migration support in vhost
 cleanups in pci
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJasR1rAAoJECgfDbjSjVRpOocH/R9A3g/TkpGjmLzJBrrX1NGO
 I/iq0ttHjqg4OBIChA4BHHjXwYUMs7XQn26B3efrk1otLAJhuqntZIIo3uU0WraA
 5J+4DT46ogs5rZWNzDCZ0zAkSaATDA6h9Nfh7TvPc9Q2WpcIT0cTa/jOtrxRc9Vq
 32hbUKtJSpNxRjwbZvk6YV21HtWo3Tktdaj9IeTQTN0/gfMyOMdgxta3+bymicbJ
 FuF9ybHcpXvrEctHhXHIL4/YVGEH/4shagZ4JVzv1dVdLeHLZtPomdf7+oc0+07m
 Qs+yV0HeRS5Zxt7w5blGLC4zDXczT/bUx8oln0Tz5MV7RR/+C2HwMOHC69gfpSc=
 =vomK
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

virtio,vhost,pci,pc: features, cleanups

SRAT tables for DIMM devices
new virtio net flags for speed/duplex
post-copy migration support in vhost
cleanups in pci

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 20 Mar 2018 14:40:43 GMT
# gpg:                using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (51 commits)
  postcopy shared docs
  libvhost-user: Claim support for postcopy
  postcopy: Allow shared memory
  vhost: Huge page align and merge
  vhost+postcopy: Wire up POSTCOPY_END notify
  vhost-user: Add VHOST_USER_POSTCOPY_END message
  libvhost-user: mprotect & madvises for postcopy
  vhost+postcopy: Call wakeups
  vhost+postcopy: Add vhost waker
  postcopy: postcopy_notify_shared_wake
  postcopy: helper for waking shared
  vhost+postcopy: Resolve client address
  postcopy-ram: add a stub for postcopy_request_shared_page
  vhost+postcopy: Helper to send requests to source for shared pages
  vhost+postcopy: Stash RAMBlock and offset
  vhost+postcopy: Send address back to qemu
  libvhost-user+postcopy: Register new regions with the ufd
  migration/ram: ramblock_recv_bitmap_test_byte_offset
  postcopy+vhost-user: Split set_mem_table for postcopy
  vhost+postcopy: Transmit 'listen' to slave
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	scripts/update-linux-headers.sh
2018-03-20 15:48:34 +00:00
Haozhong Zhang 52c95cae4e pc-dimm: make qmp_pc_dimm_device_list() sort devices by address
Make qmp_pc_dimm_device_list() return sorted by start address
list of devices so that it could be reused in places that
would need sorted list*. Reuse existing pc_dimm_built_list()
to get sorted list.

While at it hide recursive callbacks from callers, so that:

  qmp_pc_dimm_device_list(qdev_get_machine(), &list);

could be replaced with simpler:

  list = qmp_pc_dimm_device_list();

* follow up patch will use it in build_srat()

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au> for ppc part
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-03-20 03:34:52 +02:00
Thomas Huth 3c3a4e7afa hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth"
With the new "--nic" command line parameter option, the "old" way of
specifying a NIC model via the nd_table[] is becoming more prominent
again. But for the pseries "spapr-vlan" device, there is a confusing
discrepancy between the model name that is used for "--device" (i.e.
"spapr-vlan") and the model name that has to be used for "--net nic"
or the new "--nic" parameter (i.e. "ibmveth"). Since "spapr-vlan" is
the "real" name of the device, let's allow "spapr-vlan" to be used
as model name for the nd_table[] entries, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-18 18:27:23 +11:00
Alexey Kardashevskiy fcad0d2121 ppc/spapr, vfio: Turn off MSIX emulation for VFIO devices
This adds a possibility for the platform to tell VFIO not to emulate MSIX
so MMIO memory regions do not get split into chunks in flatview and
the entire page can be registered as a KVM memory slot and make direct
MMIO access possible for the guest.

This enables the entire MSIX BAR mapping to the guest for the pseries
platform in order to achieve the maximum MMIO preformance for certain
devices.

Tested on:
LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2018-03-13 11:17:31 -06:00
Nikunj A Dadhania 90ee4e01a1 hw/ppc/spapr,e500: Use new property "stdout-path" for boot console
Linux kernel commit 2a9d832cc9aae21ea827520fef635b6c49a06c6d
(of: Add bindings for chosen node, stdout-path) deprecated chosen property
"linux,stdout-path" and "stdout".

Introduce the new property "stdout-path" and continue supporting the older
property to remain compatible with existing/older firmware. This older property
can be deprecated after 5 years.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-06 13:16:29 +11:00
Suraj Jitindar Singh 813f3cf655 ppc/spapr-caps: Define the pseries-2.12-sxxm machine type
The sxxm (speculative execution exploit mitigation) machine type is a
variant of the 2.12 machine type with workarounds for speculative
execution vulnerabilities enabled by default.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-06 13:16:29 +11:00
Greg Kurz 1a5008fc17 spapr: harden code that depends on VSMT
VSMT must be set in order to compute VCPU ids. This means that the
following functions must not be called before spapr_set_vsmt_mode()
was called:
- spapr_vcpu_id()
- spapr_is_thread0_in_vcore()
- xics_max_server_number()

We had a recent regression where the latter would be called before VSMT
was set, and broke migration of some old machine types. This patch
adds assert() in the above functions to avoid problems in the future.

Also, since VSMT is really a CPU related thing, spapr_set_vsmt_mode() is
now called from spapr_init_cpus(), just before the first VSMT user.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-06 13:16:29 +11:00
Greg Kurz 72fdd4de8e spapr: register dummy ICPs later
Some older machine types create more ICPs than needed. We hence
need to register up to xics_max_server_number() dummy ICPs to
accomodate the migration of these machine types.

Recent VSMT rework changed xics_max_server_number() to return

    DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads)

instead of

    DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);

The change is okay but it requires spapr->vsmt to be set, which
isn't the case with the current code. This causes the formula to
return zero and we don't create dummy ICPs. This breaks migration
of older guests as reported here:

    https://bugzilla.redhat.com/show_bug.cgi?id=1549087

The dummy ICP workaround doesn't really have a dependency on XICS
itself. But it does depend on proper VCPU id numbering and it must
be applied before creating vCPUs (ie, creating real ICPs). So this
patch moves the workaround to spapr_init_cpus(), which already
assumes VSMT to be set.

Fixes: 72194664c8 ("spapr: use spapr->vsmt to compute VCPU ids")
Reported-by: Lukas Doktor <ldoktor@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-06 13:16:29 +11:00
Greg Kurz b1a568c1c2 spapr: fix missing CPU core nodes in DT when running with TCG
Commit 5d0fb1508e "spapr: consolidate the VCPU id numbering logic
in a single place" introduced a helper to detect thread0 of a virtual
core based on its VCPU id. This is used to create CPU core nodes in
the DT, but it is broken in TCG.

$ qemu-system-ppc64 -nographic -accel tcg -machine dumpdtb=dtb.bin \
                    -smp cores=16,maxcpus=16,threads=1
$ dtc -f -O dts dtb.bin | grep POWER8
                PowerPC,POWER8@0 {
                PowerPC,POWER8@8 {

instead of the expected 16 cores that we get with KVM:

$ dtc -f -O dts dtb.bin | grep POWER8
                PowerPC,POWER8@0 {
                PowerPC,POWER8@8 {
                PowerPC,POWER8@10 {
                PowerPC,POWER8@18 {
                PowerPC,POWER8@20 {
                PowerPC,POWER8@28 {
                PowerPC,POWER8@30 {
                PowerPC,POWER8@38 {
                PowerPC,POWER8@40 {
                PowerPC,POWER8@48 {
                PowerPC,POWER8@50 {
                PowerPC,POWER8@58 {
                PowerPC,POWER8@60 {
                PowerPC,POWER8@68 {
                PowerPC,POWER8@70 {
                PowerPC,POWER8@78 {

This happens because spapr_get_vcpu_id() maps VCPU ids to
cs->cpu_index in TCG mode. This confuses the code in
spapr_is_thread0_in_vcore(), since it assumes thread0 VCPU
ids to have a spapr->vsmt spacing.

    spapr_get_vcpu_id(cpu) % spapr->vsmt == 0

Actually, there's no real reason to expose cs->cpu_index instead
of the VCPU id, since we also generate it with TCG. Also we already
set it explicitly in spapr_set_vcpu_id(), so there's no real reason
either to call kvm_arch_vcpu_id() with KVM.

This patch unifies spapr_get_vcpu_id() to always return the computed
VCPU id both in TCG and KVM. This is one step forward towards KVM<->TCG
migration.

Fixes: 5d0fb1508e
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-03-06 13:16:29 +11:00
Greg Kurz 5d0fb1508e spapr: consolidate the VCPU id numbering logic in a single place
Several places in the code need to calculate a VCPU id:

    (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
    (core_id / smp_threads) * spapr->vsmt (1 user)
    index * spapr->vsmt (2 users)

or guess that the VCPU id of a given VCPU is the first thread of a virtual
core:

    index % spapr->vsmt != 0

Even if the numbering logic isn't that complex, it is rather fragile to
have these assumptions open-coded in several places. FWIW this was
proved with recent issues related to VSMT.

This patch moves the VCPU id formula to a single function to be called
everywhere the code needs to compute one. It also adds an helper to
guess if a VCPU is the first thread of a VCORE.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Rename spapr_is_vcore() to spapr_is_thread0_in_vcore() for clarity]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-02-16 12:14:26 +11:00
Greg Kurz 14bb4486c8 spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()
The spapr_vcpu_id() function is an accessor actually. Let's rename it
for symmetry with the recently added spapr_set_vcpu_id() helper.

The motivation behind this is that a later patch will consolidate
the VCPU id formula in a function and spapr_vcpu_id looks like an
appropriate name.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-02-16 12:14:26 +11:00