Commit graph

30272 commits

Author SHA1 Message Date
Mark Cave-Ayland 4e136629f0 macfb: fix VRAM dirty memory region logging
The macfb VRAM memory region was configured with coalescing rather than dirty
memory logging enabled, causing some areas of the screen not to redraw after
a full screen update.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 8ac919a065 ("hw/m68k: add Nubus macfb video card")
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220108164147.30813-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-09 12:04:30 +01:00
Laurent Vivier 0969e00b39 q800: fix segfault with invalid MacROM
"qemu-system-m68k -M q800 -bios /dev/null" crashes with a segfault
in q800_init().
This happens because the code doesn't check that rom_ptr() returned
a non-NULL pointer .

To avoid NULL pointer, don't allow 0 sized file and use bios_size with
rom_ptr().

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/756
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20220107105049.961489-1-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-01-09 12:03:21 +01:00
Laurent Vivier 214bdf8e71 hw: m68k: Add virt compat machine type for 7.0
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20211218114340.1856757-1-laurent@vivier.eu>
2022-01-09 12:02:53 +01:00
Alistair Francis 8f972e5b4b hw/riscv: Use error_fatal for SoC realisation
When realising the SoC use error_fatal instead of error_abort as the
process can fail and report useful information to the user.

Currently a user can see this:

   $ ../qemu/bld/qemu-system-riscv64 -M sifive_u -S -monitor stdio -display none -drive if=pflash
    QEMU 6.1.93 monitor - type 'help' for more information
    (qemu) Unexpected error in sifive_u_otp_realize() at ../hw/misc/sifive_u_otp.c:229:
    qemu-system-riscv64: OTP drive size < 16K
    Aborted (core dumped)

Which this patch addresses

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-8-alistair.francis@opensource.wdc.com>
2022-01-08 15:46:09 +10:00
Alistair Francis 41bcc44a25 hw/intc: sifive_plic: Cleanup remaining functions
We can remove the original sifive_plic_irqs_pending() function and
instead just use the sifive_plic_claim() function (renamed to
sifive_plic_claimed()) to determine if any interrupts are pending.

This requires move the side effects outside of sifive_plic_claimed(),
but as they are only invoked once that isn't a problem.

We have also removed all of the old #ifdef debugging logs, so let's
cleanup the last remaining debug function while we are here.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-5-alistair.francis@opensource.wdc.com>
2022-01-08 15:46:09 +10:00
Alistair Francis b79e1c76c0 hw/intc: sifive_plic: Cleanup the read function
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-4-alistair.francis@opensource.wdc.com>
2022-01-08 15:46:09 +10:00
Alistair Francis fb926d57cc hw/intc: sifive_plic: Cleanup the write function
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-3-alistair.francis@opensource.wdc.com>
2022-01-08 15:46:09 +10:00
Alistair Francis 83b92b8efc hw/intc: sifive_plic: Add a reset function
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20220105213937.1113508-2-alistair.francis@opensource.wdc.com>
2022-01-08 15:46:09 +10:00
Jim Shu e6b0408a17 hw/dma: sifive_pdma: permit 4/8-byte access size of PDMA registers
It's obvious that PDMA supports 64-bit access of 64-bit registers, and
in previous commit, we confirm that PDMA supports 32-bit access of
both 32/64-bit registers. Thus, we configure 32/64-bit memory access
of PDMA registers as valid in general.

Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20220104063408.658169-3-jim.shu@sifive.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:09 +10:00
Jim Shu 6fd3f397ca hw/dma: sifive_pdma: support high 32-bit access of 64-bit register
Real PDMA supports high 32-bit read/write memory access of 64-bit
register.

The following result is PDMA tested in U-Boot on Unmatched board:

1. Real PDMA allows high 32-bit read/write to 64-bit register.
=> mw.l 0x3000000 0x0                      <= Disclaim channel 0
=> mw.l 0x3000000 0x1                      <= Claim channel 0
=> mw.l 0x3000010 0x80000000               <= Write low 32-bit NextDest (NextDest = 0x280000000)
=> mw.l 0x3000014 0x2                      <= Write high 32-bit NextDest
=> md.l 0x3000010 1                        <= Dump low 32-bit NextDest
03000010: 80000000
=> md.l 0x3000014 1                        <= Dump high 32-bit NextDest
03000014: 00000002
=> mw.l 0x3000018 0x80001000               <= Write low 32-bit NextSrc (NextSrc = 0x280001000)
=> mw.l 0x300001c 0x2                      <= Write high 32-bit NextSrc
=> md.l 0x3000018 1                        <= Dump low 32-bit NextSrc
03000010: 80001000
=> md.l 0x300001c 1                        <= Dump high 32-bit NextSrc
03000014: 00000002

2. PDMA transfer from 0x280001000 to 0x280000000 is OK.
=> mw.q 0x3000008 0x4                      <= NextBytes = 4
=> mw.l 0x3000004 0x22000000               <= wsize = rsize = 2 (2^2 = 4 bytes)
=> mw.l 0x280000000 0x87654321             <= Fill test data to dst
=> mw.l 0x280001000 0x12345678             <= Fill test data to src
=> md.l 0x280000000 1; md.l 0x280001000 1  <= Dump src/dst memory contents
280000000: 87654321                              !Ce.
280001000: 12345678                              xV4.
=> md.l 0x3000000 8                        <= Dump PDMA status
03000000: 00000001 22000000 00000004 00000000    ......."........
03000010: 80000000 00000002 80001000 00000002    ................
=> mw.l 0x3000000 0x3                      <= Set channel 0 run and claim bits
=> md.l 0x3000000 8                        <= Dump PDMA status
03000000: 40000001 22000000 00000004 00000000    ...@..."........
03000010: 80000000 00000002 80001000 00000002    ................
=> md.l 0x280000000 1; md.l 0x280001000 1  <= Dump src/dst memory contents
280000000: 12345678                               xV4.
280001000: 12345678                               xV4.

Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20220104063408.658169-2-jim.shu@sifive.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:09 +10:00
Richard Henderson d70075373a virtio,pci,pc: features,fixes,cleanups
New virtio mem options.
 A vhost-user cleanup.
 Control over smbios entry point type.
 Config interrupt support for vdpa.
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmHY2zEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpCiEH/jv5tHUffDdGz5M2pN7FTWPQ9UAMQZXbn5AS
 PPVutOI/B+ILYBuNjYLvMGeq6ymG4/0DM940/jkQwCWD4ku1OG0ReM5T5klUR8lY
 df5y1SCDv3Yoq0vxpQCnssKqbgm8Kf9tnAFjni7Lvbu3oo6DCq77m6MWEapLoEUu
 IkM+l60NKmHAClnE6RF4KobLa5srIlDTho1iBXH5S39CRF1LvP9NgnYzl7nqiEkq
 ZYQEqkKO5XGxZji9banZPJD2kxt1iL7s24QI6OJG2Lz8Hf86b0Yo7XJpmw4ShP9h
 Vl1SL3m/HhHSMBuXOb7w/EkCm59b7whXCmoyYBF/GqaxtZkvVnM=
 =4VIN
 -----END PGP SIGNATURE-----

Merge tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pci,pc: features,fixes,cleanups

New virtio mem options.
A vhost-user cleanup.
Control over smbios entry point type.
Config interrupt support for vdpa.
Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 07 Jan 2022 04:30:41 PM PST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of git://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (55 commits)
  tests: acpi: Add updated TPM related tables
  acpi: tpm: Add missing device identification objects
  tests: acpi: prepare for updated TPM related tables
  virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanup
  hw/scsi/vhost-scsi: don't double close vhostfd on error
  hw/scsi/vhost-scsi: don't leak vqs on error
  docs: reSTify virtio-balloon-stats documentation and move to docs/interop
  hw/i386/pc: Add missing property descriptions
  acpihp: simplify acpi_pcihp_disable_root_bus
  tests: acpi: SLIC: update expected blobs
  tests: acpi: add SLIC table test
  tests: acpi: whitelist expected blobs before changing them
  acpi: fix QEMU crash when started with SLIC table
  intel-iommu: correctly check passthrough during translation
  virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86
  virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
  linux-headers: sync VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
  MAINTAINERS: Add a separate entry for acpi/VIOT tables
  virtio: signal after wrapping packed used_idx
  virtio-mem: Support "prealloc=on" option
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:24:24 -08:00
Stefan Berger 5903646d39 acpi: tpm: Add missing device identification objects
Add missing TPM device identification objects _STR and _UID. They will
appear as files 'description' and 'uid' under Linux sysfs.

Following inspection of sysfs entries for hardware TPMs we chose
uid '1'.

Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Ani Sinha <ani@anisinha.ca>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/708
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Shannon Zhao <shannon.zhaosl@gmail.com>
Message-id: 20211223022310.575496-3-stefanb@linux.ibm.com
Message-Id: <20220104175806.872996-3-stefanb@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2022-01-07 19:30:13 -05:00
Daniil Tatianin d731ab3119 virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanup
In case of an error during initialization in vhost_dev_init, vhostfd is
closed in vhost_dev_cleanup. Remove close from err_virtio as it's both
redundant and causes a double close on vhostfd.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20211129125204.1108088-1-d-tatianin@yandex-team.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Daniil Tatianin 539ba1acac hw/scsi/vhost-scsi: don't double close vhostfd on error
vhost_dev_init calls vhost_dev_cleanup on error, which closes vhostfd,
don't double close it.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20211129132358.1110372-2-d-tatianin@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Daniil Tatianin b259772afc hw/scsi/vhost-scsi: don't leak vqs on error
vhost_dev_init calls vhost_dev_cleanup in case of an error during
initialization, which zeroes out the entire vsc->dev as well as the
vsc->dev.vqs pointer. This prevents us from properly freeing it in free_vqs.
Keep a local copy of the pointer so we can free it later.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20211129132358.1110372-1-d-tatianin@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Thomas Huth 44bff3767c hw/i386/pc: Add missing property descriptions
When running "qemu-system-x86_64 -M pc,help" I noticed that some
properties were still missing their description. Add them now so
that users get at least a slightly better idea what they are all
about.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20211206134255.94784-1-thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Ani Sinha 784802689f acpihp: simplify acpi_pcihp_disable_root_bus
Get rid of the static variable that keeps track of whether hotplug has been
disabled on the root pci bus. Simply use qbus_is_hotpluggable() api to
perform the same check. This eliminates additional if conditional and
simplifies the function.

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1640764674-7784-1-git-send-email-ani@anirban.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Igor Mammedov 8cdb99af45 acpi: fix QEMU crash when started with SLIC table
if QEMU is started with used provided SLIC table blob,

  -acpitable sig=SLIC,oem_id='CRASH ',oem_table_id="ME",oem_rev=00002210,asl_compiler_id="",asl_compiler_rev=00000000,data=/dev/null
it will assert with:

  hw/acpi/aml-build.c:61:build_append_padded_str: assertion failed: (len <= maxlen)

and following backtrace:

  ...
  build_append_padded_str (array=0x555556afe320, str=0x555556afdb2e "CRASH ME", maxlen=0x6, pad=0x20) at hw/acpi/aml-build.c:61
  acpi_table_begin (desc=0x7fffffffd1b0, array=0x555556afe320) at hw/acpi/aml-build.c:1727
  build_fadt (tbl=0x555556afe320, linker=0x555557ca3830, f=0x7fffffffd318, oem_id=0x555556afdb2e "CRASH ME", oem_table_id=0x555556afdb34 "ME") at hw/acpi/aml-build.c:2064
  ...

which happens due to acpi_table_begin() expecting NULL terminated
oem_id and oem_table_id strings, which is normally the case, but
in case of user provided SLIC table, oem_id points to table's blob
directly and as result oem_id became longer than expected.

Fix issue by handling oem_id consistently and make acpi_get_slic_oem()
return NULL terminated strings.

PS:
After [1] refactoring, oem_id semantics became inconsistent, where
NULL terminated string was coming from machine and old way pointer
into byte array coming from -acpitable option. That used to work
since build_header() wasn't expecting NULL terminated string and
blindly copied the 1st 6 bytes only.

However commit [2] broke that by replacing build_header() with
acpi_table_begin(), which was expecting NULL terminated string
and was checking oem_id size.

1) 602b45820 ("acpi: Permit OEM ID and OEM table ID fields to be changed")
2)
Fixes: 4b56e1e4eb ("acpi: build_fadt: use acpi_table_begin()/acpi_table_end() instead of build_header()")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/786
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20211227193120.1084176-2-imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Denis Lisov <dennis.lissov@gmail.com>
Tested-by: Alexander Tsoy <alexander@tsoy.me>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Jason Wang 5178d78f4b intel-iommu: correctly check passthrough during translation
When scalable mode is enabled, the passthrough more is not determined
by the context entry but PASID entry, so switch to use the logic of
vtd_dev_pt_enabled() to determine the passthrough mode in
vtd_do_iommu_translate().

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220105041945.13459-2-jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand 60f1f77cab virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86
Set the new default to "auto", keeping it set to "off" for compat
machines. This property is only available for x86 targets.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand 23ad8dec8d virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading
unplugged memory is not supported. We have to fail feature negotiation
in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.

First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle
memory backends (or architectures) without support for the shared zeropage
in the hypervisor cleanly. Without the shared zeropage, even reading an
unpopulated virtual memory location can populate real memory and
consequently consume memory in the hypervisor. We have a guaranteed shared
zeropage only on MAP_PRIVATE anonymous memory.

Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default
long-term as even populating the shared zeropage can be problematic: for
example, without THP support (possible) or without support for the shared
huge zeropage with THP (unlikely), the PTE page tables to hold the shared
zeropage entries can consume quite some memory that cannot be reclaimed
easily.

Third, there are other optimizations+features (e.g., protection of
unplugged memory, reducing the total memory slot size and bitmap sizes)
that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.

We really only support x86 targets with virtio-mem for now (and
Linux similarly only support x86), but that might change soon, so prepare
for different targets already.

Add a new "unplugged-inaccessible" tristate property for x86 targets:
- "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy
  guests working.
- "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests
  from using the device.
- "auto" selects the default based on support for the shared zeropage.

Warn in case the property is set to "off" and we don't have support for the
shared zeropage.

For existing compat machines, the property will default to "off", to
not change the behavior but eventually warn about a problematic setup.
Short-term, we'll set the property default to "auto" for new QEMU machines.
Mid-term, we'll set the property default to "on" for new QEMU machines.
Long-term, we'll deprecate the parameter and disallow legacy
guests completely.

The property has to match on the migration source and destination. "auto"
will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long
as the qemu command line (esp. memdev) match -- so "auto" is good enough
for migration purposes and the parameter doesn't have to be migrated
explicitly.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Stefan Hajnoczi 750539c4c4 virtio: signal after wrapping packed used_idx
Packed Virtqueues wrap used_idx instead of letting it run freely like
Split Virtqueues do. If the used ring wraps more than once there is no
way to compare vq->signalled_used and vq->used_idx in
virtio_packed_should_notify() since they are modulo vq->vring.num.

This causes the device to stop sending used buffer notifications when
when virtio_packed_should_notify() is called less than once each time
around the used ring.

It is possible to trigger this with virtio-blk's dataplane
notify_guest_bh() irq coalescing optimization. The call to
virtio_notify_irqfd() (and virtio_packed_should_notify()) is deferred to
a BH. If the guest driver is polling it can complete and submit more
requests before the BH executes, causing the used ring to wrap more than
once. The result is that the virtio-blk device ceases to raise
interrupts and I/O hangs.

Cc: Tiwei Bie <tiwei.bie@intel.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211130134510.267382-1-stefanha@redhat.com>
Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand 09b3b7e092 virtio-mem: Support "prealloc=on" option
For scarce memory resources, such as hugetlb, we want to be able to
prealloc such memory resources in order to not crash later on access. On
simple user errors we could otherwise easily run out of memory resources
an crash the VM -- pretty much undesired.

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

A common use case for hugetlb will be using "reserve=off,prealloc=off" for
the memory backend and "prealloc=on" for the virtio-mem device. This
way, no huge pages will be reserved for the process, but we can recover
if there are no actual huge pages when plugging memory. Libvirt is
already prepared for this.

Note that preallocation cannot protect from the OOM killer -- which
holds true for any kind of preallocation in QEMU. It's primarily useful
only for scarce memory resources such as hugetlb, or shared file-backed
memory. It's of little use for ordinary anonymous memory that can be
swapped, KSM merged, ... but we won't forbid it.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Patrick Venture b8905cc2dd hw/arm: kudo add lm75s on bus 13
Add the four lm75s behind the mux on bus 13.

Tested by booting the firmware:
lm75 42-0048: hwmon0: sensor 'lm75'
lm75 43-0049: supply vs not found, using dummy regulator
lm75 43-0049: hwmon1: sensor 'lm75'
lm75 44-0048: supply vs not found, using dummy regulator
lm75 44-0048: hwmon2: sensor 'lm75'
lm75 45-0049: supply vs not found, using dummy regulator
lm75 45-0049: hwmon3: sensor 'lm75'

Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Titus Rwantare <titusr@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220102215844.2888833-5-venture@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-07 17:08:01 +00:00
Patrick Venture 5b0829d38c hw/arm: add i2c muxes to kudo-bmc
Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220102215844.2888833-4-venture@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-07 17:08:00 +00:00
Shengtan Mao b27de2c57b hw/arm: attach MMC to kudo-bmc
Signed-off-by: Shengtan Mao <stmao@google.com>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Chris Rauer <crauer@google.com>
Message-id: 20220102215844.2888833-3-venture@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-07 17:08:00 +00:00
Chris Rauer 560223dcf0 hw/arm: Add kudo i2c eeproms.
Signed-off-by: Chris Rauer <crauer@google.com>
Reviewed-by: Hao Wu <wuhaotsh@google.com>
Reviewed-by: Patrick Venture <venture@google.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20220102215844.2888833-2-venture@google.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-07 17:08:00 +00:00
Peter Maydell 7f18ac3ab3 hw/intc/arm_gicv3_its: Rename max_l2_entries to num_l2_entries
In several places we have a local variable max_l2_entries which is
the number of entries which will fit in a level 2 table.  The
calculations done on this value are correct; rename it to
num_l2_entries to fit the convention we're using in this code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-01-07 17:08:00 +00:00
Peter Maydell 80dcd37feb hw/intc/arm_gicv3_its: Fix various off-by-one errors
The ITS code has to check whether various parameters passed in
commands are in-bounds, where the limit is defined in terms of the
number of bits that are available for the parameter.  (For example,
the GITS_TYPER.Devbits ID register field specifies the number of
DeviceID bits minus 1, and device IDs passed in the MAPTI and MAPD
command packets must fit in that many bits.)

Currently we have off-by-one bugs in many of these bounds checks.
The typical problem is that we define a max_foo as 1 << n. In
the Devbits example, we set
  s->dt.max_ids = 1UL << (GITS_TYPER.Devbits + 1).
However later when we do the bounds check we write
  if (devid > s->dt.max_ids) { /* command error */ }
which incorrectly permits a devid of 1 << n.

These bugs will not cause QEMU crashes because the ID values being
checked are only used for accesses into tables held in guest memory
which we access with address_space_*() functions, but they are
incorrect behaviour of our emulation.

Fix them by standardizing on this pattern:
 * bounds limits are named num_foos and are the 2^n value
   (equal to the number of valid foo values)
 * bounds checks are either
   if (fooid < num_foos) { good }
   or
   if (fooid >= num_foos) { bad }

In this commit we fix the handling of the number of IDs
in the device table and the collection table, and the number
of commands that will fit in the command queue.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2022-01-07 17:08:00 +00:00
Peter Maydell 437dc0ea98 hw/intc/arm_gicv3_its: Use FIELD macros for CTEs
Use FIELD macros to handle CTEs, rather than ad-hoc mask-and-shift.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:59 +00:00
Peter Maydell 257bb6501c hw/intc/arm_gicv3_its: Correct comment about CTE RDBase field size
The comment says that in our CTE format the RDBase field is 36 bits;
in fact for us it is only 16 bits, because we use the RDBase format
where it specifies a 16-bit CPU number. The code already uses
RDBASE_PROCNUM_LENGTH (16) as the field width, so fix the comment
to match it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:59 +00:00
Peter Maydell e07f844599 hw/intc/arm_gicv3_its: Use FIELD macros for DTEs
Currently the ITS code that reads and writes DTEs uses open-coded
shift-and-mask to assemble the various fields into the 64-bit DTE
word.  The names of the macros used for mask and shift values are
also somewhat inconsistent, and don't follow our usual convention
that a MASK macro should specify the bits in their place in the word.
Replace all these with use of the FIELD macro.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:59 +00:00
Peter Maydell b87fab1c8e hw/intc/arm_gicv3_its: Correct handling of MAPI
The MAPI command takes arguments DeviceID, EventID, ICID, and is
defined to be equivalent to MAPTI DeviceID, EventID, EventID, ICID.
(That is, where MAPTI takes an explicit pINTID, MAPI uses the EventID
as the pINTID.)

We didn't quite get this right.  In particular the error checks for
MAPI include "EventID does not specify a valid LPI identifier", which
is the same as MAPTI's error check for the pINTID field.  QEMU's code
skips the pINTID error check entirely in the MAPI case.

We can fix this bug and in the process simplify the code by switching
to the obvious implementation of setting pIntid = eventid early
if ignore_pInt is true.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:59 +00:00
Peter Maydell 764d6ba10c hw/intc/arm_gicv3_its: Don't misuse GITS_TYPE_PHYSICAL define
The GITS_TYPE_PHYSICAL define is the value we set the
GITS_TYPER.Physical field to -- this is 1 to indicate that we support
physical LPIs.  (Support for virtual LPIs is the GITS_TYPER.Virtual
field.) We also use this define as the *value* that we write into an
interrupt translation table entry's INTTYPE field, which should be 1
for a physical interrupt and 0 for a virtual interrupt.  Finally, we
use it as a *mask* when we read the interrupt translation table entry
INTTYPE field.

Untangle this confusion: define an ITE_INTTYPE_VIRTUAL and
ITE_INTTYPE_PHYSICAL to be the valid values of the ITE INTTYPE
field, and replace the ad-hoc collection of ITE_ENTRY_* defines with
use of the FIELD() macro to define the fields of an ITE and the
FIELD_EX64() and FIELD_DP64() macros to read and write them.
We use ITE in the new setup, rather than ITE_ENTRY, because
ITE stands for "Interrupt translation entry" and so the extra
"entry" would be redundant.

We take the opportunity to correct the name of the field that holds
the GICv4 'doorbell' interrupt ID (this is always the value 1023 in a
GICv3, which is why we were calling it the 'spurious' field).

The GITS_TYPE_PHYSICAL define is then used in only one place, where
we set the initial GITS_TYPER value.  Since GITS_TYPER.Physical is
essentially a boolean, hiding the '1' value behind a macro is more
confusing than helpful, so expand out the macro there and remove the
define entirely.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:59 +00:00
Peter Maydell 9ae8543190 hw/intc/arm_gicv3_its: Correct setting of TableDesc entry_sz
We set the TableDesc entry_sz field from the appropriate
GITS_BASER.ENTRYSIZE field.  That ID register field specifies the
number of bytes per table entry minus one.  However when we use
td->entry_sz we assume it to be the number of bytes per table entry
(for instance we calculate the number of entries in a page by
dividing the page size by the entry size).

The effects of this bug are:
 * we miscalculate the maximum number of entries in the table,
   so our checks on guest index values are wrong (too lax)
 * when looking up an entry in the second level of an indirect
   table, we calculate an incorrect index into the L2 table.
   Because we make the same incorrect calculation on both
   reads and writes of the L2 table, the guest won't notice
   unless it's unlucky enough to use an index value that
   causes us to index off the end of the L2 table page and
   cause guest memory corruption in whatever follows

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:58 +00:00
Peter Maydell e5487a4139 hw/intc/arm_gicv3_its: Reduce code duplication in extract_table_params()
The extract_table_params() decodes the fields in the GITS_BASER<n>
registers into TableDesc structs.  Since the fields are the same for
all the GITS_BASER<n> registers, there is currently a lot of code
duplication within the switch (type) statement.  Refactor so that the
cases include only what is genuinely different for each type:
the calculation of the number of bits in the ID value that indexes
into the table.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-01-07 17:07:58 +00:00
Peter Maydell 62df780e3d hw/intc/arm_gicv3_its: Don't return early in extract_table_params() loop
In extract_table_params() we process each GITS_BASER<n> register.  If
the register's Valid bit is not set, this means there is no
in-guest-memory table and so we should not try to interpret the other
fields in the register.  This was incorrectly coded as a 'return'
rather than a 'break', so instead of looping round to process the
next GITS_BASER<n> we would stop entirely, treating any later tables
as being not valid also.

This has no real guest-visible effects because (since we don't have
GITS_TYPER.HCC != 0) the guest must in any case set up all the
GITS_BASER<n> to point to valid tables, so this only happens in an
odd misbehaving-guest corner case.

Fix the check to 'break', so that we leave the case statement and
loop back around to the next GITS_BASER<n>.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:58 +00:00
Peter Maydell 6c1db43de4 hw/intc/arm_gicv3_its: Remove maxids union from TableDesc
The TableDesc struct defines properties of the in-guest-memory tables
which the guest tells us about by writing to the GITS_BASER<n>
registers.  This struct currently has a union 'maxids', but all the
fields of the union have the same type (uint32_t) and do the same
thing (record one-greater-than the maximum ID value that can be used
as an index into the table).

We're about to add another table type (the GICv4 vPE table); rather
than adding another specifically-named union field for that table
type with the same type as the other union fields, remove the union
entirely and just have a 'uint32_t max_ids' struct field.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:58 +00:00
Peter Maydell 8d2d6dd9bb hw/intc/arm_gicv3_its: Remove redundant ITS_CTLR_ENABLED define
We currently define a bitmask for the GITS_CTLR ENABLED bit in
two ways: as ITS_CTLR_ENABLED, and via the FIELD() macro as
R_GITS_CTLR_ENABLED_MASK. Consistently use the FIELD macro version
everywhere and remove the redundant ITS_CTLR_ENABLED define.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-01-07 17:07:58 +00:00
Peter Maydell a120157b24 hw/intc/arm_gicv3_its: Correct off-by-one bounds check on rdbase
The checks in the ITS on the rdbase values in guest commands are
off-by-one: they permit the guest to pass us a value equal to
s->gicv3->num_cpu, but the valid values are 0...num_cpu-1.  This
meant the guest could cause us to index off the end of the
s->gicv3->cpu[] array when calling gicv3_redist_process_lpi(), and we
would probably crash.

(This is not a security bug, because this code is only usable
with emulation, not with KVM.)

Cc: qemu-stable@nongnu.org
Fixes: 17fb5e36aa ("hw/intc: GICv3 redistributor ITS processing")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2022-01-07 17:07:57 +00:00
Troy Lee d9e9cd59df Add dummy Aspeed AST2600 Display Port MCU (DPMCU)
AST2600 Display Port MCU introduces 0x18000000~0x1803FFFF as it's memory
and io address. If guest machine try to access DPMCU memory, it will
cause a fatal error.

Signed-off-by: Troy Lee <troy_lee@aspeedtech.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20211210083034.726610-1-troy_lee@aspeedtech.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-07 17:07:57 +00:00
Andy Pei 0a963af3e3 hw/vhost-user-blk: turn on VIRTIO_BLK_F_SIZE_MAX feature for virtio blk device
Turn on pre-defined feature VIRTIO_BLK_F_SIZE_MAX for virtio blk device to
avoid guest DMA request sizes which are too large for hardware spec.

Signed-off-by: Andy Pei <andy.pei@intel.com>
Message-Id: <1641202092-149677-1-git-send-email-andy.pei@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost 0e4edb3b3b hw/i386: expose a "smbios-entry-point-type" PC machine property
The i440fx and Q35 machine types are both hardcoded to use the
legacy SMBIOS 2.1 (32-bit) entry point. This is a sensible
conservative choice because SeaBIOS only supports SMBIOS 2.1

EDK2, however, can also support SMBIOS 3.0 (64-bit) entry points,
and QEMU already uses this on the ARM virt machine type.

This adds a property to allow the choice of SMBIOS entry point
versions For example to opt in to 64-bit SMBIOS entry point:

   $QEMU -machine q35,smbios-entry-point-type=64

Based on a patch submitted by Daniel Berrangé.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-4-ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07 05:19:55 -05:00
Eduardo Habkost 10be11d0b4 smbios: Rename SMBIOS_ENTRY_POINT_* enums
Rename the enums to match the naming style used by QAPI, and to
use "32" and "64" instead of "20" and "31".  This will allow us
to more easily move the enum to the QAPI schema later.

About the naming choice: "SMBIOS 2.1 entry point"/"SMBIOS 3.0
entry point" and "32-bit entry point"/"64-bit entry point" are
synonymous in the SMBIOS specification.  However, the phrases
"32-bit entry point" and "64-bit entry point" are used more often.

The new names also avoid confusion between the entry point format
and the actual SMBIOS version reported in the entry point
structure.  For example: currently the 32-bit entry point
actually report SMBIOS 2.8 support, not 2.1.

Based on portions of a patch submitted by Daniel P. Berrangé.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20211026151100.1691925-2-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Frederic Barrat 20766514d6 pcie_aer: Don't trigger a LSI if none are defined
Skip triggering an LSI when the AER root error status is updated if no
LSI is defined for the device. We can have a root bridge with no LSI,
MSI and MSI-X defined, for example on POWER systems.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20211116170133.724751-4-fbarrat@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2022-01-07 05:19:55 -05:00
Frederic Barrat 2fedf46e34 pci: Export the pci_intx() function
Move the pci_intx() definition to the PCI header file, so that it can
be called from other PCI files. It is used by the next patch.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20211116170133.724751-3-fbarrat@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2022-01-07 05:19:55 -05:00
Roman Kagan fb76785934 vhost-user-blk: propagate error return from generic vhost
Fix the only callsite that doesn't propagate the error code from the
generic vhost code.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-11-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 5d33ae4b7a vhost: stick to -errno error return convention
The generic vhost code expects that many of the VhostOps methods in the
respective backends set errno on errors.  However, none of the existing
backends actually bothers to do so.  In a number of those methods errno
from the failed call is clobbered by successful later calls to some
library functions; on a few code paths the generic vhost code then
negates and returns that errno, thus making failures look as successes
to the caller.

As a result, in certain scenarios (e.g. live migration) the device
doesn't notice the first failure and goes on through its state
transitions as if everything is ok, instead of taking recovery actions
(break and reestablish the vhost-user connection, cancel migration, etc)
before it's too late.

To fix this, consolidate on the convention to return negated errno on
failures throughout generic vhost, and use it for error propagation.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-10-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 025faa872b vhost-user: stick to -errno error return convention
VhostOps methods in user_ops are not very consistent in their error
returns: some return negated errno while others just -1.

Make sure all of them consistently return negated errno.  This also
helps error propagation from the functions being called inside.
Besides, this synchronizes the error return convention with the other
two vhost backends, kernel and vdpa, and will therefore allow for
consistent error propagation in the generic vhost code (in a followup
patch).

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 3631151b3e vhost-vdpa: stick to -errno error return convention
Almost all VhostOps methods in vdpa_ops follow the convention of
returning negated errno on error.

Adjust the few that don't.  To that end, rework vhost_vdpa_add_status to
check if setting of the requested status bits has succeeded and return
the respective error code it hasn't, and propagate the error codes
wherever it's appropriate.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-8-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 2d88d9c65c vhost-backend: stick to -errno error return convention
Almost all VhostOps methods in kernel_ops follow the convention of
returning negated errno on error.

Adjust the only one that doesn't.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-7-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 6dcae534e8 vhost-backend: avoid overflow on memslots_limit
Fix the (hypothetical) potential problem when the value parsed out of
the vhost module parameter in sysfs overflows the return value from
vhost_kernel_memslots_limit.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-6-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan b7107e758f vhost-user-blk: reconnect on any error during realize
vhost-user-blk realize only attempts to reconnect if the previous
connection attempt failed on "a problem with the connection and not an
error related to the content (which would fail again the same way in the
next attempt)".

However this distinction is very subtle, and may be inadvertently broken
if the code changes somewhere deep down the stack and a new error gets
propagated up to here.

OTOH now that the number of reconnection attempts is limited it seems
harmless to try reconnecting on any error.

So relax the condition of whether to retry connecting to check for any
error.

This patch amends a527e312b5 "vhost-user-blk: Implement reconnection
during realize".

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-2-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
2022-01-07 05:19:55 -05:00
Laurent Vivier deeb956c40 trace-events,pci: unify trace events format
Unify format used by trace_pci_update_mappings_del(),
trace_pci_update_mappings_add(), trace_pci_cfg_write() and
trace_pci_cfg_read() to print the device name and bus number,
slot number and function number.

For instance:

  pci_cfg_read virtio-net-pci 00:0 @0x20 -> 0xffffc00c
  pci_cfg_write virtio-net-pci 00:0 @0x20 <- 0xfea0000c
  pci_update_mappings_del d=0x555810b92330 01:00.0 4,0xffffc000+0x4000
  pci_update_mappings_add d=0x555810b92330 01:00.0 4,0xfea00000+0x4000

becomes

  pci_cfg_read virtio-net-pci 01:00.0 @0x20 -> 0xffffc00c
  pci_cfg_write virtio-net-pci 01:00.0 @0x20 <- 0xfea0000c
  pci_update_mappings_del virtio-net-pci 01:00.0 4,0xffffc000+0x4000
  pci_update_mappings_add virtio-net-pci 01:00.0 4,0xfea00000+0x4000

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20211105192541.655831-1-lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu d5d24d859c virtio-pci: add support for configure interrupt
Add support for configure interrupt, The process is used kvm_irqfd_assign
to set the gsi to kernel. When the configure notifier was signal by
host, qemu will inject a msix interrupt to guest

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-11-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu d48185f1a4 virtio-mmio: add support for configure interrupt
Add configure interrupt support for virtio-mmio bus. This
interrupt will be working while the backend is vhost-vdpa

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-10-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu 497679d510 virtio-net: add support for configure interrupt
Add functions to support configure interrupt in virtio_net
The functions are config_pending and config_mask, while
this input idx is VIRTIO_CONFIG_IRQ_IDX will check the
function of configure interrupt.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-9-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu f7220a7ce2 vhost: add support for configure interrupt
Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.

Also add the functions to support vhost_config_pending and
vhost_config_mask, for masked_config_notifier, we only
use the notifier saved in vq 0.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-8-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu 081f864f56 virtio: add support for configure interrupt
Add the functions to support the configure interrupt in virtio
The function virtio_config_guest_notifier_read will notify the
guest if there is an configure interrupt.
The function virtio_config_set_guest_notifier_fd_handler is
to set the fd hander for the notifier

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-7-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu 634f7c89fb vhost-vdpa: add support for config interrupt
Add new call back function in vhost-vdpa, this function will
set the event fd to kernel. This function will be called
in the vhost_dev_start and vhost_dev_stop

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-6-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu 316011b8a7 virtio-pci: decouple the single vector from the interrupt process
To reuse the interrupt process in configure interrupt
Need to decouple the single vector from the interrupt process. Add new function
kvm_virtio_pci_vector_use_one and _release_one. These functions are use
for the single vector, the whole process will finish in a loop for the vq number.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-4-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu e3480ef81f virtio-pci: decouple notifier from interrupt process
To reuse the notifier process in configure interrupt.
Use the virtio_pci_get_notifier function to get the notifier.
the INPUT of this function is the IDX, the OUTPUT is notifier and
the vector

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-3-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu bf1d85c166 virtio: introduce macro IRTIO_CONFIG_IRQ_IDX
To support configure interrupt for vhost-vdpa
Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index,
Then we can reuse the functions guest_notifier_mask and guest_notifier_pending.
Add the check of queue index in these drivers, if the driver does not support
configure interrupt, the function will just return

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-2-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Michael S. Tsirkin 9bd6565cce acpi: validate hotplug selector on access
When bus is looked up on a pci write, we didn't
validate that the lookup succeeded.
Fuzzers thus can trigger QEMU crash by dereferencing the NULL
bus pointer.

Fixes: b32bd763a1 ("pci: introduce acpi-index property for PCI device")
Fixes: CVE-2021-4158
Cc: "Igor Mammedov" <imammedo@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/770
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
2022-01-06 06:11:38 -05:00
David Hildenbrand 7656d9ce09 virtio-mem: Don't skip alignment checks when warning about block size
If we warn about the block size being smaller than the default, we skip
some alignment checks.

This can currently only fail on x86-64, when specifying a block size of
1 MiB, however, we detect the THP size of 2 MiB.

Fixes: 228957fea3 ("virtio-mem: Probe THP size to determine default block size")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211011173305.13778-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 04:16:58 -05:00
Cornelia Huck 01854af2cf hw: Add compat machines for 7.0
Add 7.0 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211217143948.289995-1-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-01-05 09:06:36 +01:00
Frank Chang b66f73a0cb hw/sd: Add SDHC support for SD card SPI-mode
In SPI-mode, SD card's OCR register: Card Capacity Status (CCS) bit
is not set to 1 correclty when the assigned SD image size is larger
than 2GB (SDHC). This will cause the SD card to be indentified as SDSC
incorrectly. CCS bit should be set to 1 if we are using SDHC.

Also, as there's no power up emulation in SPI-mode.
The OCR register: Card power up status bit bit (busy) should also
be set to 1 when reset. (busy bit is set to LOW if the card has not
finished the power up routine.)

Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211228125719.14712-1-frank.chang@sifive.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2022-01-04 08:50:28 +01:00
Philippe Mathieu-Daudé 6947feca58 hw/sd/sdcard: Rename Write Protect Group variables
'wp_groups' holds a bitmap, rename it as 'wp_group_bmap'.
'wpgrps_size' is the bitmap size (in bits), rename it as
'wp_group_bits'.

Patch created mechanically using:

  $ sed -i -e s/wp_groups/wp_group_bmap/ \
           -e s/wpgrps_size/wp_group_bits/ hw/sd/sd.c

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20210728181728.2012952-4-f4bug@amsat.org>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
2022-01-04 08:50:27 +01:00
Cédric Le Goater c316203c1e ppc/ppc405: Fix timer initialization
Timers are already initialized in ppc4xx_init(). No need to do it a
second time with a wrong set.

Fixes: d715ea9612 ("PPC: 405: Fix ppc405ep initialization")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211222064025.1541490-7-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220103063441.3424853-8-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Cédric Le Goater b1273a5e13 ppc/ppc405: Rework ppc_40x_timers_init() to use a PowerPCCPU
This is a small cleanup to ease reading. It includes the removal of a
check done on the returned value of g_malloc0(), which can not fail.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211222064025.1541490-6-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220103063441.3424853-7-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Cédric Le Goater cbd8f17d16 ppc/ppc405: Restore TCR and STR write handlers
The 405 timers were broken when booke support was added. Assumption
was made that the register numbers were the same but it's not :

    SPR_BOOKE_TSR         (0x150)
    SPR_BOOKE_TCR         (0x154)
    SPR_40x_TSR           (0x3D8)
    SPR_40x_TCR           (0x3DA)

Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Fixes: ddd1055b07 ("PPC: booke timers")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211222064025.1541490-5-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220103063441.3424853-6-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Cédric Le Goater b3b5c5d38f ppc/ppc4xx: Convert printfs()
Use a QEMU log primitive for errors and trace events for debug.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.drobear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211222064025.1541490-3-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20220103063441.3424853-4-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Daniel Henrique Barboza 9747d061ca pnv_phb4.c: do not set 'root-bus' as bus name
This change has the same motivation as the one done for pnv-phb3-root-bus
buses previously. Defaulting every bus to 'root-bus' makes it impossible to attach
root ports to specific buses and it doesn't allow for custom bus
naming because we're ignoring the 'id' value when registering the root
bus.

After this patch, creating pnv-phb4 devices with 'id' being set will
result in the following qtree:

qemu-system-ppc64 -m 4G -machine powernv9,accel=tcg \
   -device pnv-phb4,chip-id=0,index=0,id=pcie.0 \
   -device pnv-phb4,chip-id=1,index=4,id=pcie.1

bus: main-system-bus
  type System
  dev: pnv-phb4, id "pcie.1"
    index = 4 (0x4)
    chip-id = 1 (0x1)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pcie.1
      type pnv-phb4-root-bus
  dev: pnv-phb4, id "pcie.0"
    index = 0 (0x0)
    chip-id = 0 (0x0)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pcie.0
      type pnv-phb4-root-bus

And without setting any ids:

qemu-system-ppc64 -m 4G -machine powernv9,accel=tcg \
   -device pnv-phb4,chip-id=0,index=0,id=pcie.0 \
   -device pnv-phb4,chip-id=1,index=4,id=pcie.1

bus: main-system-bus
  type System
  dev: pnv-phb4, id ""
    index = 4 (0x4)
    chip-id = 1 (0x1)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb4-root-bus.1
      type pnv-phb4-root-bus
  dev: pnv-phb4, id ""
    index = 0 (0x0)
    chip-id = 0 (0x0)
    version = 704374636546 (0xa400000002)
    device-id = 1217 (0x4c1)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb4-root-bus.0
      type pnv-phb4-root-bus

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211228193806.1198496-17-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Daniel Henrique Barboza dec4e2897c pnv_phb3.c: do not set 'root-bus' as bus name
All pnv-phb3-root-bus buses are being created as 'root-bus'. This
makes it impossible to, for example, add a pnv-phb3-root-port in
a specific root bus, since they all have the same name. By default
the device will be parented by the pnv-phb3 device that precedeced it in
the QEMU command line.

Moreover, this doesn't all for custom bus naming. Libvirt, for instance,
likes to name these buses as 'pcie.N', where 'N' is the index value of
the controller in the domain XML, by using the 'id' command line
attribute. At this moment this is also being ignored - the created root
bus will always be named 'root-bus'.

This patch fixes both scenarios by removing the 'root-bus' name from the
pci_register_root_bus() call. If an "id" is provided, use that.
Otherwise use 'NULL' as bus name. The 'NULL' value will be handled in
qbus_init_internal() and it will defaulted as lowercase bus type + the
global bus_id value.

After this path we can define the bus name by using the 'id' attribute:

qemu-system-ppc64 -m 4G -machine powernv8,accel=tcg \
    -device pnv-phb3,chip-id=0,index=1,id=pcie.0

  dev: pnv-phb3, id "pcie.0"
    index = 1 (0x1)
    chip-id = 0 (0x0)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pcie.0
      type pnv-phb3-root-bus

And without an 'id' we will have the following default:

qemu-system-ppc64 -m 4G -machine powernv8,accel=tcg \
    -device pnv-phb3,chip-id=0,index=1

  dev: pnv-phb3, id ""
    index = 1 (0x1)
    chip-id = 0 (0x0)
    x-config-reg-migration-enabled = true
    bypass-iommu = false
    bus: pnv-phb3-root-bus.0
      type pnv-phb3-root-bus

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211228193806.1198496-3-danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Cédric Le Goater c42b9c8b33 ppc/pnv: Remove the PHB4 "device-id" property
It's unused.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211222063817.1541058-4-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Cédric Le Goater 81fbb57b7b ppc/pnv: Remove PHB4 reset handler
The PHB4 reset handler was preparing ground for PHB5 to set
appropriately the device id. We don't need it for the PHB4 since the
device id is already set in the root port complex. PH5 will introduce
its own.

"device-id" property is now useless. It should be removed.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211222063817.1541058-3-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:34 +01:00
Cédric Le Goater 316717feb3 ppc/pnv: Change the maximum of PHB3 devices for Power8NVL
The POWER8 processors with a NVLink logic unit have 4 PHB3 devices per
chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20211222063817.1541058-2-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
2022-01-04 07:55:33 +01:00
Yanan Wang 864c3b5c32 hw/core/machine: Introduce CPU cluster topology support
The new Cluster-Aware Scheduling support has landed in Linux 5.16,
which has been proved to benefit the scheduling performance (e.g.
load balance and wake_affine strategy) on both x86_64 and AArch64.

So now in Linux 5.16 we have four-level arch-neutral CPU topology
definition like below and a new scheduler level for clusters.
struct cpu_topology {
    int thread_id;
    int core_id;
    int cluster_id;
    int package_id;
    int llc_id;
    cpumask_t thread_sibling;
    cpumask_t core_sibling;
    cpumask_t cluster_sibling;
    cpumask_t llc_sibling;
}

A cluster generally means a group of CPU cores which share L2 cache
or other mid-level resources, and it is the shared resources that
is used to improve scheduler's behavior. From the point of view of
the size range, it's between CPU die and CPU core. For example, on
some ARM64 Kunpeng servers, we have 6 clusters in each NUMA node,
and 4 CPU cores in each cluster. The 4 CPU cores share a separate
L2 cache and a L3 cache tag, which brings cache affinity advantage.

In virtualization, on the Hosts which have pClusters (physical
clusters), if we can design a vCPU topology with cluster level for
guest kernel and have a dedicated vCPU pinning. A Cluster-Aware
Guest kernel can also make use of the cache affinity of CPU clusters
to gain similar scheduling performance.

This patch adds infrastructure for CPU cluster level topology
configuration and parsing, so that the user can specify cluster
parameter if their machines support it.

Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Message-Id: <20211228092221.21068-3-wangyanan55@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[PMD: Added '(since 7.0)' to @clusters in qapi/machine.json]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-31 13:42:39 +01:00
Philippe Mathieu-Daudé 3e2f14981c hw/core: Rename smp_parse() -> machine_parse_smp_config()
All methods related to MachineState are prefixed with "machine_".
smp_parse() does not need to be an exception. Rename it and
const'ify the SMPConfiguration argument, since it doesn't need
to be modified.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211216132015.815493-9-philmd@redhat.com>
2021-12-31 13:35:10 +01:00
Philippe Mathieu-Daudé 2ebd9ce19a hw/qdev: Rename qdev_connect_gpio_out*() 'input_pin' parameter
@pin is an input where we connect a device output.
Rename it @input_pin to simplify the documentation.

Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20211218130437.1516929-5-f4bug@amsat.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-31 13:21:36 +01:00
Philippe Mathieu-Daudé 4a63054bce pci: Let ld*_pci_dma() propagate MemTxResult
ld*_dma() returns a MemTxResult type. Do not discard
it, return it to the caller.

Update the few callers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-24-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé 398f9a84ac pci: Let ld*_pci_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling ld*_pci_dma().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-22-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé a423a1b523 pci: Let st*_pci_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling st*_pci_dma().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-21-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé cd1db8df74 dma: Let ld*_dma() propagate MemTxResult
dma_memory_read() returns a MemTxResult type. Do not discard
it, return it to the caller.

Update the few callers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-19-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé 34cdea1db6 dma: Let ld*_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling ld*_dma().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-17-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé 2280c27afc dma: Let st*_dma() take MemTxAttrs argument
Let devices specify transaction attributes when calling st*_dma().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-16-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé 1e5a3f8b2a dma: Let dma_buf_read() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_buf_read().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-13-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé 392e48af34 dma: Let dma_buf_write() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_buf_write().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-12-philmd@redhat.com>
2021-12-31 01:05:27 +01:00
Philippe Mathieu-Daudé e2d784b67d pci: Let pci_dma_rw() take MemTxAttrs argument
Let devices specify transaction attributes when calling pci_dma_rw().

Keep the default MEMTXATTRS_UNSPECIFIED in the few callers.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-10-philmd@redhat.com>
2021-12-31 01:05:23 +01:00
Philippe Mathieu-Daudé 5e468a36dc dma: Have dma_buf_read() / dma_buf_write() take a void pointer
DMA operations are run on any kind of buffer, not arrays of
uint8_t. Convert dma_buf_read/dma_buf_write functions to take
a void pointer argument and save us pointless casts to uint8_t *.

Remove this pointless casts in the megasas device model.

Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-9-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Philippe Mathieu-Daudé a1d4b0a305 dma: Let dma_memory_map() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_memory_map().

Patch created mechanically using spatch with this script:

  @@
  expression E1, E2, E3, E4;
  @@
  - dma_memory_map(E1, E2, E3, E4)
  + dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-7-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Philippe Mathieu-Daudé ba06fe8add dma: Let dma_memory_read/write() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_memory_read() or dma_memory_write().

Patch created mechanically using spatch with this script:

  @@
  expression E1, E2, E3, E4;
  @@
  (
  - dma_memory_read(E1, E2, E3, E4)
  + dma_memory_read(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED)
  |
  - dma_memory_write(E1, E2, E3, E4)
  + dma_memory_write(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED)
  )

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-6-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Philippe Mathieu-Daudé 23faf5694f dma: Let dma_memory_rw() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_memory_rw().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-5-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Philippe Mathieu-Daudé 7a36e42d91 dma: Let dma_memory_set() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_memory_set().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-3-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Philippe Mathieu-Daudé 41d5e8da3d hw/scsi/megasas: Use uint32_t for reply queue head/tail values
While the reply queue values fit in 16-bit, they are accessed
as 32-bit:

  661:    s->reply_queue_head = ldl_le_pci_dma(pcid, s->producer_pa);
  662:    s->reply_queue_head %= MEGASAS_MAX_FRAMES;
  663:    s->reply_queue_tail = ldl_le_pci_dma(pcid, s->consumer_pa);
  664:    s->reply_queue_tail %= MEGASAS_MAX_FRAMES;

Having:

  41:#define MEGASAS_MAX_FRAMES 2048         /* Firmware limit at 65535 */

In order to update the ld/st*_pci_dma() API to pass the address
of the value to access, it is simpler to have the head/tail declared
as 32-bit values. Replace the uint16_t by uint32_t, wasting 4 bytes in
the MegasasState structure.

Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211223115554.3155328-20-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Laurent Vivier 1b529d908d failover: Silence warning messages during qtest
virtio-net-failover test tries several device combinations that produces
some expected warnings.
These warning can be confusing, so we disable them during the qtest
sequence.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20211220145314.390697-1-lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
[thuth: Fix memory leak by using error_free()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-22 08:12:45 +01:00
Marc-André Lureau f6ef71bded ui: move qemu_spice_fill_device_address to ui/util.c
Other backends can use it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-21 10:50:21 +04:00
Marc-André Lureau f6413cbfd0 ui: simplify gl unblock & flush
GraphicHw.gl_flushed was introduced to notify the
device (vhost-user-gpu) that the GL resources (the display scanout) are
no longer needed.

It was decoupled from QEMU own gl-blocking mechanism, but that
difference isn't helping. Instead, we can reuse QEMU gl-blocking and
notify virtio_gpu_gl_flushed() when unblocking (to unlock
vhost-user-gpu).

An extra block/unblock is added arount dpy_gl_update() so existing
backends that don't block will have the flush event handled. It will
also help when there are no backends associated.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-21 10:50:21 +04:00
Marc-André Lureau 46e4609e33 virtio-gpu: use VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP
It's part of Linux headers for a while now.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-21 10:50:21 +04:00
Marc-André Lureau 8f5f1ea0c0 hw/display: report an error if virgl initialization failed
Currently, virgl initialization error is silent. Make it verbose instead.

(this is likely going to bug later on, as the device isn't fully
initialized)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-12-21 10:50:21 +04:00