Commit graph

360 commits

Author SHA1 Message Date
Alexander Graf b981289c49 PPC: Cuda: Use cuda timer to expose tbfreq to guest
Mac OS X calibrates a number of frequencies on bootup based on reading
tb values on bootup and comparing them to via cuda timer values.

The only variable we can really steer well (thanks to KVM) is the cuda
frequency. So let's use that one to fake Mac OS X into believing the
bus frequency is tbfreq * 4. That way Mac OS X will automatically
calculate the correct timebase frequency.

With this patch and the patch set I posted earlier I can successfully
run Mac OS X 10.2, 10.3 and 10.4 guests with -M mac99 on TCG and KVM.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf caae6c9611 PPC: Mac: Move tbfreq into local variable
We already expose the real CPU's tb frequency to the guest via fw_cfg. Soon
we will need to also expose it to the MacIO, so let's move it to a variable
that we can leverage every time we need the frequency.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:52 +02:00
Alexander Graf a8b0503701 PPC: mac_nvram: Remove unused functions
The macio_nvram_read and macio_nvram_write functions are never called,
just remove them.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:51 +02:00
Nikunj A Dadhania 9674a35626 ppc/spapr: Fix MAX_CPUS to 255
MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This
MAX_CPUS number was percolated back to "virsh capabilities" with wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Benjamin Herrenschmidt b7d1f77ada spapr: Locate RTAS and device-tree based on real RMA
We currently calculate the final RTAS and FDT location based on
the early estimate of the RMA size, cropped to 256M on KVM since
we only know the real RMA size at reset time which happens much
later in the boot process.

This means the FDT and RTAS end up right below 256M while they
could be much higher, using precious RMA space and limiting
what the OS bootloader can put there which has proved to be
a problem with some OSes (such as when using very large initrd's)

Fortunately, we do the actual copy of the device-tree into guest
memory much later, during reset, late enough to be able to do it
using the final RMA value, we just need to move the calculation
to the right place.

However, RTAS is still loaded too early, so we change the code to
load the tiny blob into qemu memory early on, and then copy it into
guest memory at reset time. It's small enough that the memory usage
doesn't matter.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy c3b4f589d8 spapr: Fix ibm, associativity for memory nodes
We want the associtivity lists of memory and CPU nodes to match but
memory nodes have incorrect domain#3 which is zero for CPU so they won't
match.

This clears domain#3 in the list to match CPUs associtivity lists.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy b082d65a30 spapr: Add a helper for node0_size calculation
In multiple places there is a node0_size variable calculation
which assumes that NUMA node #0 and memory node #0 are the same
things which they are not. Since we are going to change it and
do not want to change it in multiple places, let's make a helper.

This adds a spapr_node0_size() helper and makes use of it.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy 6010818c30 spapr: Split memory nodes to power-of-two blocks
Linux kernel expects nodes to have power-of-two size and
does WARN_ON if this is not the case:
[    0.041456] WARNING: at drivers/base/memory.c:115
which is:

===
	/* Validate blk_sz is a power of 2 and not less than section size */
	if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) {
        	WARN_ON(1);
	        block_sz = MIN_MEMORY_BLOCK_SIZE;
	}
===

This splits memory nodes into set of smaller blocks with
a size which is a power of two. This makes sure the start
address of every node is aligned to the node size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: squash windows compile fix in]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy 7db8a127e3 spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however
actual hardware may have them so it makes sense to have a way
to emulate them in QEMU. This prepares SPAPR for that.

This moves 2 calls of spapr_populate_memory_node() into
the existing loop over numa nodes so first several nodes may
have no memory and this still will work.

If there is no numa configuration, the code assumes there is just
a single node at 0 and it has all the guest memory.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy 81014ac2b8 spapr: Use DT memory node rendering helper for other nodes
This finishes refactoring by using the spapr_populate_memory_node helper
for all nodes and removing leftovers from spapr_populate_memory().

This is not a part of the previous patch because the patches look
nicer apart.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexey Kardashevskiy 26a8c353bf spapr: Move DT memory node rendering to a helper
This moves recurring bits of code related to memory@xxx nodes
creation to a helper.

This makes use of the new helper for node@0.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Gonglei a21a7a7012 spapr: fix possible memory leak
get_boot_devices_list() will malloc memory, spapr_finalize_fdt
doesn't free it.

Signed-off-by: Chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexander Graf 261265cc91 PPC: mac99: Move NVRAM to page boundary when necessary
When running KVM we have to adhere to host page boundaries for memory slots.
Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash
area.

So if our host is configured with 64k page size, we can't use the mac99 target
with KVM. This is a real shame, as this limitation is not really an issue - we
can easily map NVRAM somewhere else and at least Linux and Mac OS X use it
at their new location.

So in that emergency case when it's about failing to run at all and moving NVRAM
to a place it shouldn't be at, choose the latter.

This patch enables -M mac99 with KVM on 64k page size hosts.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Nikunj A Dadhania ef9514431d spapr: add uuid/host details to device tree
Useful for identifying the guest/host uniquely within the
guest. Adding following properties to the guest root node.

vm,uuid - uuid of the guest
host-model - Host model number
host-serial - Host machine serial number
hypervisor type - Tells its "kvm"

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Peter Maydell 7d0cd464a7 hw/ppc/spapr_hcall.c: Fix typo in function names
Fix a typo in the names of a couple of functions
(s/resouce/resource/).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Nikunj A Dadhania 2e14072f9e ppc: spapr-rtas - implement os-term rtas call
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.

Linux kernel calls ibm,os-term when extended property of os-term is set.
This makes sure that a return to the linux kernel is gauranteed.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[agraf: reduce RTAS_TOKEN_MAX]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
Alexander Graf 277c7a4d71 PPC: KVM: Fix g3beige and mac99 when HV is loaded
On PPC we have 2 different styles of KVM: PR and HV. HV can only virtualize
sPAPR guests while PR can virtualize everything that's reasonably close to
the host hardware platform.

As long as only one kernel module (PR or HV) is loaded, the "default" kvm type
is the module that's loaded. So if your hardware only supports PR mode you can
easily spawn a Mac VM.

However, if both HV and PR are loaded we default to HV mode. And in that case
the Mac machines have to explicitly ask for PR mode to get a working VM.

Fix this up by explicitly having the Mac machines ask for PR style KVM. This
fixes bootup of Mac VMs on systems where bot HV and PR kvm modules are loaded
for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
Alexey Kardashevskiy 3431648272 spapr: Add support for new NMI interface
This implements an NMI interface POWERPC SPAPR machine.
This enables an "nmi" HMP/QMP command supported on SPAPR.

This calls POWERPC_EXCP_RESET (vector 0x100) in the guest to deliver NMI
to every CPU. The expected result is XMON (in-kernel debugger) invocation.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Peter Maydell 0e4a773705 SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z
 Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy
 ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh
 J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn
 B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG
 Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8
 NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms
 PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU
 Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull
 Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY
 XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN
 8vfEsVLd1X7MFkDBUmWp
 =M+/m
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

SCSI changes that enable sending vendor-specific commands via virtio-scsi.

Memory changes for QOMification and automatic tracking of MR lifetime.

# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg:                 aka "Paolo Bonzini <bonzini@gnu.org>"

* remotes/bonzini/tags/for-upstream:
  mtree: remove write-only field
  memory: Use canonical path component as the name
  memory: Use memory_region_name for name access
  memory: constify memory_region_name
  exec: Abstract away ref to memory region names
  loader: Abstract away ref to memory region names
  tpm_tis: remove instance_finalize callback
  memory: remove memory_region_destroy
  memory: convert memory_region_destroy to object_unparent
  ioport: split deletion and destruction
  nic: do not destroy memory regions in cleanup functions
  vga: do not dynamically allocate chain4_alias
  sysbus: remove unused function sysbus_del_io
  qom: object: move unparenting to the child property's release callback
  qom: object: delete properties before calling instance_finalize
  virtio-scsi: implement parse_cdb
  scsi-block, scsi-generic: implement parse_cdb
  scsi-block: extract scsi_block_is_passthrough
  scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
  scsi-bus: prepare scsi_req_new for introduction of parse_cdb

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-08-19 13:00:57 +01:00
Paolo Bonzini d8d9581460 memory: convert memory_region_destroy to object_unparent
Explicitly call object_unparent in the few places where we
will re-create the memory region.  If the memory region is
simply being destroyed as part of device teardown, let QOM
handle it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-18 12:06:20 +02:00
Peter Crosthwaite aa2ac1dac3 ppc: convert g_new(qemu_irq usages to g_new0
To indicate the IRQs are initially disconnected.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-08-15 18:54:50 +04:00
Hu Tao e206ad4833 ppc: fix -mem-path failure
commit e938ba0c tried to enable -mem-path for ppc but breaked some ppc
boards.

The problems are:

1. it fails when allocating memory for rom, sram whose sizes are less
   than huge page size:

   ./ppc-softmmu/qemu-system-ppc  -m 512 -mem-path /hugepages/ \
   -kernel /home/hutao/Downloads/vmlinux-ppc -initrd \
   /home/hutao/Downloads/initrd-ppc.gz
   qemu-system-ppc: /mnt/data/projects/qemu/exec.c:1184: qemu_ram_set_idstr: Assertion `new_block' failed.

2. if there is a numa node backed by memory backend object, qemu fails
   with message:

   ./ppc-softmmu/qemu-system-ppc  -m 512 \
   -object memory-backend-file,size=512M,mem-path=/hugepages,id=f0 \
   -numa node,nodeid=0,memdev=f0 \
   -kernel /home/hutao/Downloads/vmlinux-ppc \
   -initrd /home/hutao/Downloads/initrd-ppc.gz
   qemu-system-ppc: memory backend f0 is used multiple times. Each -numa option must use a different memdev value.

This patch does following:

1. replaces memory_region_allocate_system_memory() with
   memory_region_init_ram() for rom, sram. Then only system memory
   is backed by hugepages when specifying mem-path.

2. for memory banks, allocates all ram with
   one memory_region_allocate_system_memory(), and use
   memory_region_init_alias() to initialize memory banks.

Tested machines: default(g3beige), mac99, taihu, bamboo, ref405ep.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-22 17:37:25 +02:00
Gavin Shan 27e27782f7 sPAPR/IOMMU: Fix TCE entry permission
The permission of TCE entry should exclude physical base address.
Otherwise, unmapping TCE entry can be interpreted to mapping TCE
entry wrongly for VFIO devices.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy f92f5da108 spapr: Enable use of huge pages
0b183fc87 "memory: move mem_path handling to
memory_region_allocate_system_memory" disabled -mempath use for all
machines that do not use memory_region_allocate_system_memory() to
register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
support was disabled for it.

This replaces memory_region_init_ram()+vmstate_register_ram_global() with
memory_region_allocate_system_memory() to get huge pages back.

This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as
the previous patch moved RMA memory region allocation after RAM allocation
and therefore this change does not have immediate effect but simplifies
the code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy 658fa66b81 spapr: Move RMA memory region registration code
PPC970 does not support VRMA (virtual RMA) so real memory required
for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl.
Later this memory is used as a part of the guest RAM area.
The RMA allocating code also registers a memory region for this piece
of RAM.

We are going to simplify memory regions layout: RMA memory region
will be a subregion in the RAM memory region, both starting from zero.
This way we will not have to take care of start address alignment for
the piece of RAM next to the RMA.

This moves memory region business closer to the RAM memory region
creation/allocation code.

As this is a mechanical patch, no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on non-kvm systems]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Shreyas B. Prabhu e938ba0c35 ppc: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory
Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:58 +02:00
Peter Maydell b653282ecc hw/ppc/spapr_hcall.c: Add ULL suffix to 64 bit constant
Add ULL suffix to 64 bit constant to prevent compiler warnings
on some 32 bit platforms.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-07-08 16:03:19 +01:00
Laurent Dufour 4bce526ec4 target-ppc: KVMPPC_H_CAS fix cpu-version endianess
During KVMPPC_H_CAS processing, the cpu-version updated value is stored
without taking care of the current endianess. As a consequence, the guest
may not switch to the right CPU model, leading to unexpected results.

If needed, the value is now converted.

Fixes: 6d9412ea81 ("target-ppc: Implement "compat" CPU option")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Hervé Poussineau 56de2e5269 prep: Remove CPU reset entry point hack related to OpenHack'Ware
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Hervé Poussineau 97db046678 prep: Remove PCI memory hack related to OpenHack'Ware
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2014-07-07 16:46:35 +02:00
Alexander Graf 79c0ff2cae PPC: e500: Only create dt entries for existing serial ports
When the user specifies -nodefaults he can tell us that he doesn't want any
serial ports spawned by default. While we do honor that wish, we still create
device tree entries for those non-existent devices.

Make device tree generation depend on whether the device is actually available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy 9a321e9234 spapr_pci: Use XICS interrupt allocator and do not cache interrupts in PHB
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.

This makes use of new XICS ability to reuse interrupts.

This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.

This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.

This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.

This fixed traces to be more informative.

This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:27 +02:00
Alexey Kardashevskiy ba0e5bf8de spapr: Remove @next_irq
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy bee763dbfb spapr: Move interrupt allocator to xics
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.

This moves an allocator from SPAPR to XICS.

This switches IRQ users to use new API.

This uses LSI/MSI flags to know if interrupt is allocated.

The interrupt release function will be posted as a separate patch.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff 3b50d8974b spapr: Add RTAS sysparm SPLPAR Characteristics
Add support for the SPLPAR Characteristics parameter to the emulated
RTAS call ibm,get-system-parameter.

The support provides just enough information to allow "cat
/proc/powerpc/lparcfg" to succeed without generating a kernel error
message.

Without this patch the above command will produce the following kernel
message: arch/powerpc/platforms/pseries/lparcfg.c \
parse_system_parameter_string Error calling get-system-parameter \
(0xfffffffd)

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff b907d7b0fd spapr: Add RTAS sysparm UUID
Add support for the UUID parameter to the emulated RTAS call
ibm,get-system-parameter.

Return the guest's UUID as the value for the RTAS UUID system
parameter, or null (a zero length result) if it is not set.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Sam bobroff 3052d95190 spapr: Fix RTAS sysparm DIAGNOSTICS_RUN_MODE
This allows the ibm,get-system-parameter RTAS call to succeed for the
DIAGNOSTICS_RUN_MODE system parameter.

The problem can be seen with "ppc64_cpu --run-mode" from the
powerpc-utils package which fails before this patch with "Machine does
not support diagnostic run mode".

This is corrected by using the rtas_st_buffer() function to write to
the buffer.

The RTAS constants are also moved out into a header file, some new
constants added and the surrounding code slightly simplified.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[agraf: remove some commentary]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy 6026db4501 spapr: Define a 2.1 pseries machine
This adds a v2.1 machine to support backward compatibility
for newer macines in the case if they ever be implemented.

This adds a "pseries-2.1" machine as a child of the "pseries"
machine and only changes visible machine name.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy 6ca1502e36 spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)
Every single sPAPR QOM object has small first "s".
Most (not all yet) QOM objects have "State" suffix.

This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU
code style and removes redundant empty line.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
BALATON Zoltan a0bb2a5fa0 mac99: Add motherboard devices before PCI cards
Change the order of creating devices for New World Mac emulation so
that devices on the motherboard are added first and PCI cards (VGA and
NIC) come later. As a side effect, this also causes OpenBIOS to map
the motherboard devices into the MMIO space to the same addresses as
on real hardware and allow clients that hardcode these addresses (e.g.
MorphOS) to find and use them until OpenBIOS is tought to map devices
to specific addresses. (On real hardware the graphics and network
cards are really on separate buses but we don't model that yet.) This
brings the memory map closer to what is found on PowerMac3,1.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:24 +02:00
Alexey Kardashevskiy 9fc34ada7e spapr_pci_vfio: Add spapr-pci-vfio-host-bridge to support vfio
The patch adds a spapr-pci-vfio-host-bridge device type
which is a PCI Host Bridge with VFIO support. The new device
inherits from the spapr-pci-host-bridge device and adds an "iommu"
property which is an IOMMU id. This ID represents a minimal entity
for which IOMMU isolation can be guaranteed. In SPAPR architecture IOMMU
group is called a Partitionable Endpoint (PE).

Current implementation supports one IOMMU id per QEMU VFIO PHB. Since
SPAPR allows multiple PHB for no extra cost, this does not seem to
be a problem. This limitation may change in the future though.

Example of use:
Configure and Add 3 functions of a multifunctional device to QEMU:
(the NEC PCI USB card is used as an example here):
-device spapr-pci-vfio-host-bridge,id=USB,iommu=4,index=7 \
-device vfio-pci,host=4:0:1.0,addr=1.0,bus=USB,multifunction=true
-device vfio-pci,host=4:0:1.1,addr=1.1,bus=USB
-device vfio-pci,host=4:0:1.2,addr=1.2,bus=USB

where:
* index=7 is a QEMU PHB index (used as source for MMIO/MSI/IO windows
offset);
* iommu=4 is an IOMMU id which can be found in sysfs:
[aik@vpl2 ~]$ cd /sys/bus/pci/devices/0004:00:00.0/
[aik@vpl2 0004:00:00.0]$ ls -l iommu_group
lrwxrwxrwx 1 root root 0 Jun  5 12:49 iommu_group -> ../../../kernel/iommu_groups/4

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy 9bb62a0702 spapr_iommu: Make in-kernel TCE table optional
POWER KVM supports an KVM_CAP_SPAPR_TCE capability which allows allocating
TCE tables in the host kernel memory and handle H_PUT_TCE requests
targeted to specific LIOBN (logical bus number) right in the host without
switching to QEMU. At the moment this is used for emulated devices only
and the handler only puts TCE to the table. If the in-kernel H_PUT_TCE
handler finds a LIOBN and corresponding table, it will put a TCE to
the table and complete hypercall execution. The user space will not be
notified.

Upcoming VFIO support is going to use the same sPAPRTCETable device class
so KVM_CAP_SPAPR_TCE is going to be used as well. That means that TCE
tables for VFIO are going to be allocated in the host as well.
However VFIO operates with real IOMMU tables and simple copying of
a TCE to the real hardware TCE table will not work as guest physical
to host physical address translation is requited.

So until the host kernel gets VFIO support for H_PUT_TCE, we better not
to register VFIO's TCE in the host.

This adds a place holder for KVM_CAP_SPAPR_TCE_VFIO capability. It is not
in upstream yet and being discussed so now it is always false which means
that in-kernel VFIO acceleration is not supported.

This adds a bool @vfio_accel flag to the sPAPRTCETable device telling
that sPAPRTCETable should not try allocating TCE table in the host kernel
for VFIO. The flag is false now as at the moment there is no VFIO.

This adds an vfio_accel parameter to spapr_tce_new_table(), the semantic
is the same. Since there is only emulated PCI and VIO now, the flag is set
to false. Upcoming VFIO support will set it to true.

This is a preparation patch so no change in behaviour is expected

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:23 +02:00
Alexey Kardashevskiy 3a3b8502e6 spapr: Fix RTAS token numbers
At the moment spapr_rtas_register() allocates a new token number for every
new RTAS callback so numbers are not fixed and depend on the number of
supported RTAS handlers and the exact order of spapr_rtas_register() calls.
These tokens are copied into the device tree and remain the same during
the guest lifetime.

When we start another guest to receive a migration, it calls
spapr_rtas_register() as well. If the number of RTAS handlers or their
order is different in QEMU on source and destination sides, the "/rtas"
node in the device tree will differ. Since migration overwrites the device
tree (as it overwrites the entire RAM), the actual RTAS config on
the destination side gets broken.

This defines global contant values for every RTAS token which QEMU
is using today.

This changes spapr_rtas_register() to accept a token number instead of
allocating one. This changes all users of spapr_rtas_register().

This changes XICS-KVM not to cache tokens registered with KVM as they
constant now.

This makes TOKEN_BASE global as RTAS_XXX use TOKEN_BASE as
a base. TOKEN_MAX is moved and renamed too and its value is changed
to the last token + 1. Boundary checks for token values are adjusted.

This reserves token numbers for "os-term" handlers and PCI hotplug
which we are working on.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Avik Sil cc84c0f357 spapr: Add "qemu, boot-menu" property to /chosen
This is required to enable boot menu display during booting

Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Wenchao Xia e010ad8f1e qapi event: convert RTC_CHANGE
This patch also eliminates build time warning caused by no caller
of monitor_qapi_event_throttle().

Signed-off-by: Wenchao Xia <wenchaoqemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-06-23 11:12:27 -04:00
Wanlong Gao 8c85901ed3 NUMA: Add numa_info structure to contain numa nodes info
Add the numa_info structure to contain the numa nodes memory,
VCPUs information and the future added numa nodes host memory
policies.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
[Fix hw/ppc/spapr.c - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Badari Pulavarty 9dbae97723 spapr_pci: Advertise MSI quota
Hotplug of multiple disks fails due to MSI vector quota check.
Number of MSI vectors default to 8 allowing only 4 devices.
This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback
to INTX.

One way to workaround the issue is to increase total MSIs,
so that MSI quota check allows us to hotplug multiple disks.

This sets the quota to the maximum number of interupts XICS has
which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c
to xics.h for wider visibility.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
[aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Eduardo Habkost 23825581d7 spapr: Add kvm-type property
The kvm-type machine option was left out when MachineState was
introduced, preventing the kvm-type option from being used. Add the
missing property to the sPAPR machine class, so it can be used.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Eduardo Habkost 748abce94f spapr: Create SPAPRMachine struct
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Alexey Kardashevskiy d5ac4f5433 spapr_hcall: Add address-translation-mode-on-interrupt resource in H_SET_MODE
This adds handling of the RESOURCE_ADDR_TRANS_MODE resource from
the H_SET_MODE, for POWER8 (PowerISA 2.07) only.

This defines AIL flags for LPCR special register.

This changes @excp_prefix according to the mode, takes effect in TCG.

This turns support of a new capability PPC2_ISA207S flag for TCG.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Tom Musta <tommusta@gmail.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:45 +02:00