Compare commits

...

52 Commits

Author SHA1 Message Date
Michael Roth 54e1f5be86 Update version for v6.1.1 release
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-23 09:52:09 -06:00
Cole Robinson fddd169de5 tests: tcg: Fix PVH test with binutils 2.36+
binutils started adding a .note.gnu.property ELF section which
makes the PVH test fail:

  TEST    hello on x86_64
qemu-system-x86_64: Error loading uncompressed kernel without PVH ELF Note

Discard .note.gnu* while keeping the PVH .note bits intact.

This also strips the build-id note, so drop the related comment.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-Id: <5ab2a54c262c61f64c22dbb49ade3e2db8a740bb.1633708346.git.crobinso@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8e751e9c38)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-15 07:13:37 -06:00
Richard Henderson 711bd602cc tcg/arm: Reduce vector alignment requirement for NEON
With arm32, the ABI gives us 8-byte alignment for the stack.
While it's possible to realign the stack to provide 16-byte alignment,
it's far easier to simply not encode 16-byte alignment in the
VLD1 and VST1 instructions that we emit.

Remove the assertion in temp_allocate_frame, limit natural alignment
to the provided stack alignment, and add a comment.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1999878
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210912174925.200132-1-richard.henderson@linaro.org>
Message-Id: <20211206191335.230683-2-richard.henderson@linaro.org>
(cherry picked from commit b9537d5904)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-15 07:13:27 -06:00
Daniel P. Berrangé e88636b4d4 target/i386: add missing bits to CR4_RESERVED_MASK
Booting Fedora kernels with -cpu max hangs very early in boot. Disabling
the la57 CPUID bit fixes the problem. git bisect traced the regression to

  commit 213ff024a2 (HEAD, refs/bisect/bad)
  Author: Lara Lazier <laramglazier@gmail.com>
  Date:   Wed Jul 21 17:26:50 2021 +0200

    target/i386: Added consistency checks for CR4

    All MBZ bits in CR4 must be zero. (APM2 15.5)
    Added reserved bitmask and added checks in both
    helper_vmrun and helper_write_crN.

    Signed-off-by: Lara Lazier <laramglazier@gmail.com>
    Message-Id: <20210721152651.14683-2-laramglazier@gmail.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

In this commit CR4_RESERVED_MASK is missing CR4_LA57_MASK and
two others. Adding this lets Fedora kernels boot once again.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20210831175033.175584-1-berrange@redhat.com>
[Removed VMXE/SMXE, matching the commit message. - Paolo]
Fixes: 213ff024a2 ("target/i386: Added consistency checks for CR4", 2021-07-22)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 69e3895f9d)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-15 07:13:18 -06:00
Gerd Hoffmann 34833f361b qxl: fix pre-save logic
Oops.  Logic is backwards.

Fixes: 39b8a183e2 ("qxl: remove assert in qxl_pre_save.")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/610
Resolves: https://bugzilla.redhat.com//show_bug.cgi?id=2002907
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210910094203.3582378-1-kraxel@redhat.com>
(cherry picked from commit eb94846280)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-15 07:13:12 -06:00
Jon Maloy 43583f0c07 e1000: fix tx re-entrancy problem
The fact that the MMIO handler is not re-entrant causes an infinite
loop under certain conditions:

Guest write to TDT ->  Loopback -> RX (DMA to TDT) -> TX

We now eliminate the effect of this problem locally in e1000, by adding
a boolean in struct E1000State indicating when the TX side is busy. This
will cause any entering new call to return early instead of interfering
with the ongoing work, and eliminates any risk of looping.

This is intended to address CVE-2021-20257.

Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 25ddb946e6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 17:40:06 -06:00
Prasad J Pandit 1ce084af08 net: vmxnet3: validate configuration values during activate (CVE-2021-20203)
While activating device in vmxnet3_acticate_device(), it does not
validate guest supplied configuration values against predefined
minimum - maximum limits. This may lead to integer overflow or
OOB access issues. Add checks to avoid it.

Fixes: CVE-2021-20203
Buglink: https://bugs.launchpad.net/qemu/+bug/1913873
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit d05dcd94ae)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 17:39:20 -06:00
Mark Mielke fec12fc888 virtio-blk: Fix clean up of host notifiers for single MR transaction.
The code that introduced "virtio-blk: Configure all host notifiers in
a single MR transaction" introduced a second loop variable to perform
cleanup in second loop, but mistakenly still refers to the first
loop variable within the second loop body.

Fixes: d0267da614 ("virtio-blk: Configure all host notifiers in a single MR transaction")
Signed-off-by: Mark Mielke <mark.mielke@gmail.com>
Message-id: CALm7yL08qarOu0dnQkTN+pa=BSRC92g31YpQQNDeAiT4yLZWQQ@mail.gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5b807181c2)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 15:10:56 -06:00
Philippe Mathieu-Daudé ef0cf1887e tests/qtest/fdc-test: Add a regression test for CVE-2021-20196
Without the previous commit, when running 'make check-qtest-i386'
with QEMU configured with '--enable-sanitizers' we get:

  AddressSanitizer:DEADLYSIGNAL
  =================================================================
  ==287878==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000344
  ==287878==The signal is caused by a WRITE memory access.
  ==287878==Hint: address points to the zero page.
      #0 0x564b2e5bac27 in blk_inc_in_flight block/block-backend.c:1346:5
      #1 0x564b2e5bb228 in blk_pwritev_part block/block-backend.c:1317:5
      #2 0x564b2e5bcd57 in blk_pwrite block/block-backend.c:1498:11
      #3 0x564b2ca1cdd3 in fdctrl_write_data hw/block/fdc.c:2221:17
      #4 0x564b2ca1b2f7 in fdctrl_write hw/block/fdc.c:829:9
      #5 0x564b2dc49503 in portio_write softmmu/ioport.c:201:9

Add the reproducer for CVE-2021-20196.

Suggested-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-4-philmd@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit cc20926e9b)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 15:05:11 -06:00
Philippe Mathieu-Daudé 71ba2adfeb hw/block/fdc: Kludge missing floppy drive to fix CVE-2021-20196
Guest might select another drive on the bus by setting the
DRIVE_SEL bit of the DIGITAL OUTPUT REGISTER (DOR).
The current controller model doesn't expect a BlockBackend
to be NULL. A simple way to fix CVE-2021-20196 is to create
an empty BlockBackend when it is missing. All further
accesses will be safely handled, and the controller state
machines keep behaving correctly.

Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20196
Reported-by: Gaoning Pan (Ant Security Light-Year Lab) <pgn@zju.edu.cn>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-3-philmd@redhat.com
BugLink: https://bugs.launchpad.net/qemu/+bug/1912780
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/338
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 1ab95af033)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 15:05:05 -06:00
Philippe Mathieu-Daudé 7629818574 hw/block/fdc: Extract blk_create_empty_drive()
We are going to re-use this code in the next commit,
so extract it as a new blk_create_empty_drive() function.

Inspired-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20211124161536.631563-2-philmd@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit b154791e7b)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 15:04:59 -06:00
Daniil Tatianin 4658dfcbc0 chardev/wctable: don't free the instance in wctablet_chr_finalize
Object is supposed to be freed by invoking obj->free, and not
obj->instance_finalize. This would lead to use-after-free followed by
double free in object_unref/object_finalize.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20211117142349.836279-1-d-tatianin@yandex-team.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit fdc6e16818)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:54:14 -06:00
Klaus Jensen 2b2eb343a0 hw/nvme: fix buffer overrun in nvme_changed_nslist (CVE-2021-3947)
Fix missing offset verification.

Cc: qemu-stable@nongnu.org
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Fixes: f432fdfa12 ("support changed namespace asynchronous event")
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit e2c57529c9)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:52:50 -06:00
Eric Blake 932333c5f0 nbd/server: Don't complain on certain client disconnects
When a client disconnects abruptly, but did not have any pending
requests (for example, when using nbdsh without calling h.shutdown),
we used to output the following message:

$ qemu-nbd -f raw file
$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to read request: Unexpected end-of-file before all bytes were read

Then in commit f148ae7, we refactored nbd_receive_request() to use
nbd_read_eof(); when this returns 0, we regressed into tracing
uninitialized memory (if tracing is enabled) and reporting a
less-specific:

qemu-nbd: Disconnect client, due to: Request handling failed in intermediate state

Note that with Unix sockets, we have yet another error message,
unchanged by the 6.0 regression:

$ qemu-nbd -k /tmp/sock -f raw file
$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.trim(1,0)'
qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe

But in all cases, the error message goes away if the client performs a
soft shutdown by using NBD_CMD_DISC, rather than a hard shutdown by
abrupt disconnect:

$ nbdsh -u 'nbd://localhost:10809' -c 'h.trim(1,0)' -c 'h.shutdown()'

This patch fixes things to avoid uninitialized memory, and in general
avoids warning about a client that does a hard shutdown when not in
the middle of a packet.  A client that aborts mid-request, or which
does not read the full server's reply, can still result in warnings,
but those are indeed much more unusual situations.

CC: qemu-stable@nongnu.org
Fixes: f148ae7d36 ("nbd/server: Quiesce coroutines on context switch", v6.0.0)
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: defer unrelated typo fixes to later patch]
Message-Id: <20211117170230.1128262-2-eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 1644cccea5)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:51:17 -06:00
Peng Liang 8c2d5911de vfio: Fix memory leak of hostwin
hostwin is allocated and added to hostwin_list in vfio_host_win_add, but
it is only deleted from hostwin_list in vfio_host_win_del, which causes
a memory leak.  Also, freeing all elements in hostwin_list is missing in
vfio_disconnect_container.

Fix: 2e4109de8e ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)")
CC: qemu-stable@nongnu.org
Signed-off-by: Peng Liang <liangpeng10@huawei.com>
Link: https://lore.kernel.org/r/20211117014739.1839263-1-liangpeng10@huawei.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit f3bc3a73c9)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:49:55 -06:00
Jason Wang 08e46e6d92 virtio: use virtio accessor to access packed event
We used to access packed descriptor event and off_wrap via
address_space_{write|read}_cached(). When we hit the cache, memcpy()
is used which is not atomic which may lead a wrong value to be read or
wrote.

This patch fixes this by switching to use
virito_{stw|lduw}_phys_cached() to make sure the access is atomic.

Fixes: 683f766567 ("virtio: event suppression support for packed ring")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211111063854.29060-2-jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d152cdd6f6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:43:25 -06:00
Jason Wang df1c9c3039 virtio: use virtio accessor to access packed descriptor flags
We used to access packed descriptor flags via
address_space_{write|read}_cached(). When we hit the cache, memcpy()
is used which is not an atomic operation which may lead a wrong value
is read or wrote.

So this patch switches to use virito_{stw|lduw}_phys_cached() to make
sure the aceess is atomic.

Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211111063854.29060-1-jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f463e761a4)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:43:18 -06:00
Igor Mammedov 7204b8f3c6 pcie: rename 'native-hotplug' to 'x-native-hotplug'
Mark property as experimental/internal adding 'x-' prefix.

Property was introduced in 6.1 and it should have provided
ability to turn on native PCIE hotplug on port even when
ACPI PCI hotplug is in use is user explicitly sets property
on CLI. However that never worked since slot is wired to
ACPI hotplug controller.
Another non-intended usecase: disable native hotplug on slot
when APCI based hotplug is disabled, which works but slot has
'hotplug' property for this taks.

It should be relatively safe to rename it to experimental
as no users should exist for it and given that the property
is broken we don't really want to leave it around for much
longer lest users start using it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20211112110857.3116853-2-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 2aa1842d6d)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:38:08 -06:00
Greg Kurz 36c651c226 accel/tcg: Register a force_rcu notifier
A TCG vCPU doing a busy loop systematicaly hangs the QEMU monitor
if the user passes 'device_add' without argument. This is because
drain_cpu_all() which is called from qmp_device_add() cannot return
if readers don't exit read-side critical sections. That is typically
what busy-looping TCG vCPUs do:

int cpu_exec(CPUState *cpu)
{
[...]
    rcu_read_lock();
[...]
    while (!cpu_handle_exception(cpu, &ret)) {
        // Busy loop keeps vCPU here
    }
[...]
    rcu_read_unlock();

    return ret;
}

For MTTCG, have all vCPU threads register a force_rcu notifier that will
kick them out of the loop using async_run_on_cpu(). The notifier is called
with the rcu_registry_lock mutex held, using async_run_on_cpu() ensures
there are no deadlocks.

For RR, a single thread runs all vCPUs. Just register a single notifier
that kicks the current vCPU to the next one.

For MTTCG:
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

For RR:
Suggested-by: Richard Henderson <richard.henderson@linaro.org>

Fixes: 7bed89958b ("device_core: use drain_call_rcu in in qmp_device_add")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/650
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211109183523.47726-3-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit dd47a8f654)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:26:01 -06:00
Greg Kurz fceaefb43f rcu: Introduce force_rcu notifier
The drain_rcu_call() function can be blocked as long as an RCU reader
stays in a read-side critical section. This is typically what happens
when a TCG vCPU is executing a busy loop. It can deadlock the QEMU
monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 .

This can be avoided by allowing drain_rcu_call() to enforce an RCU grace
period. Since each reader might need to do specific actions to end a
read-side critical section, do it with notifiers.

Prepare ground for this by adding a notifier list to the RCU reader
struct and use it in wait_for_readers() if drain_rcu_call() is in
progress. An API is added for readers to register their notifiers.

This is largely based on a draft from Paolo Bonzini.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211109183523.47726-2-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ef149763a8)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:25:55 -06:00
Laurent Vivier 7d71e6bfb0 hw: m68k: virt: Add compat machine for 6.1
Add the missing machine type for m68k/virt

Cc: qemu-stable@nongnu.org
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20211106194158.4068596-2-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 6837f29976)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:23:21 -06:00
Mauro Matteo Cascella c2c7f108b8 hw/scsi/scsi-disk: MODE_PAGE_ALLS not allowed in MODE SELECT commands
This avoids an off-by-one read of 'mode_sense_valid' buffer in
hw/scsi/scsi-disk.c:mode_sense_page().

Fixes: CVE-2021-3930
Cc: qemu-stable@nongnu.org
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: a8f4bbe290 ("scsi-disk: store valid mode pages in a table")
Fixes: #546
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b3af7fdf9c)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:22:44 -06:00
Paolo Bonzini 3488bb205d target-i386: mmu: fix handling of noncanonical virtual addresses
mmu_translate is supposed to return an error code for page faults; it is
not able to handle other exceptions.  The #GP case for noncanonical
virtual addresses is not handled correctly, and incorrectly raised as
a page fault with error code 1.  Since it cannot happen for nested
page tables, move it directly to handle_mmu_fault, even before the
invocation of mmu_translate.

Fixes: #676
Fixes: 661ff4879e ("target/i386: extract mmu_translate", 2021-05-11)
Cc: qemu-stable@nongnu.org
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b04dc92e01)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:19:00 -06:00
Paolo Bonzini cddfaf96ab target-i386: mmu: use pg_mode instead of HF_LMA_MASK
Correctly look up the paging mode of the hypervisor when it is using 64-bit
mode but the guest is not.

Fixes: 68746930ae ("target/i386: use mmu_translate for NPT walk", 2021-05-11)
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 93eae35832)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:18:25 -06:00
Jessica Clarke 43a457841f Partially revert "build: -no-pie is no functional linker flag"
This partially reverts commit bbd2d5a812.

This commit was misguided and broke using --disable-pie on any distro
that enables PIE by default in their compiler driver, including Debian
and its derivatives. Whilst -no-pie is not a linker flag, it is a
compiler driver flag that ensures -pie is not automatically passed by it
to the linker. Without it, all compile_prog checks will fail as any code
built with the explicit -fno-pie will fail to link with the implicit
default -pie due to trying to use position-dependent relocations. The
only bug that needed fixing was LDFLAGS_NOPIE being used as a flag for
the linker itself in pc-bios/optionrom/Makefile.

Note this does not reinstate exporting LDFLAGS_NOPIE, as it is unused,
since the only previous use was the one that should not have existed. I
have also updated the comment for the -fno-pie and -no-pie checks to
reflect what they're actually needed for.

Fixes: bbd2d5a812
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Message-Id: <20210805192545.38279-1-jrtc27@jrtc27.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ffd205ef29)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:13:38 -06:00
Ari Sundholm ebf660beb1 block/file-posix: Fix return value translation for AIO discards
AIO discards regressed as a result of the following commit:
	0dfc7af2 block/file-posix: Optimize for macOS

When trying to run blkdiscard within a Linux guest, the request would
fail, with some errors in dmesg:

---- [ snip ] ----
[    4.010070] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK
driverbyte=DRIVER_SENSE
[    4.011061] sd 2:0:0:0: [sda] tag#0 Sense Key : Aborted Command
[current]
[    4.011061] sd 2:0:0:0: [sda] tag#0 Add. Sense: I/O process
terminated
[    4.011061] sd 2:0:0:0: [sda] tag#0 CDB: Unmap/Read sub-channel 42
00 00 00 00 00 00 00 18 00
[    4.011061] blk_update_request: I/O error, dev sda, sector 0
---- [ snip ] ----

This turns out to be a result of a flaw in changes to the error value
translation logic in handle_aiocb_discard(). The default return value
may be left untranslated in some configurations, and the wrong variable
is used in one translation.

Fix both issues.

Fixes: 0dfc7af2b2 ("block/file-posix: Optimize for macOS")
Cc: qemu-stable@nongnu.org
Signed-off-by: Ari Sundholm <ari@tuxera.com>
Signed-off-by: Emil Karlson <jkarlson@tuxera.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211019110954.4170931-1-ari@tuxera.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 13a028336f)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:12:50 -06:00
Ani Sinha bbbdedb386 tests/acpi/bios-tables-test: update DSDT blob for multifunction bridge test
We added a new unit test for testing acpi hotplug on multifunction bridges in
q35 machines. Here, we update the DSDT table gloden master blob for this unit
test.

The test adds the following devices to qemu and then checks the changes
introduced in the DSDT table due to the addition of the following devices:

(a) a multifunction bridge device
(b) a bridge device with function 1
(c) a non-bridge device with function 2

In the DSDT table, we should see AML hotplug descriptions for (a) and (b).
For (a) we should find a hotplug AML description for function 0.

Following is the ASL diff between the original DSDT table and the modified DSDT
table due to the unit test. We see that multifunction bridge on bus 2 and single
function bridge on bus 3 function 1 are described, not the non-bridge balloon
device on bus 4, function 2.

@@ -1,30 +1,30 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20190509 (64-bit version)
  * Copyright (c) 2000 - 2019 Intel Corporation
  *
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of tests/data/acpi/q35/DSDT, Thu Oct  7 18:29:19 2021
+ * Disassembly of /tmp/aml-C7JCA1, Thu Oct  7 18:29:19 2021
  *
  * Original Table Header:
  *     Signature        "DSDT"
- *     Length           0x00002061 (8289)
+ *     Length           0x00002187 (8583)
  *     Revision         0x01 **** 32-bit table (V1), no 64-bit math support
- *     Checksum         0xF9
+ *     Checksum         0x8D
  *     OEM ID           "BOCHS "
  *     OEM Table ID     "BXPC    "
  *     OEM Revision     0x00000001 (1)
  *     Compiler ID      "BXPC"
  *     Compiler Version 0x00000001 (1)
  */
 DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
 {
     Scope (\)
     {
         OperationRegion (DBG, SystemIO, 0x0402, One)
         Field (DBG, ByteAcc, NoLock, Preserve)
         {
             DBGB,   8
         }

@@ -3265,23 +3265,95 @@
                 Method (_S1D, 0, NotSerialized)  // _S1D: S1 Device State
                 {
                     Return (Zero)
                 }

                 Method (_S2D, 0, NotSerialized)  // _S2D: S2 Device State
                 {
                     Return (Zero)
                 }

                 Method (_S3D, 0, NotSerialized)  // _S3D: S3 Device State
                 {
                     Return (Zero)
                 }
             }

+            Device (S10)
+            {
+                Name (_ADR, 0x00020000)  // _ADR: Address
+                Name (BSEL, One)
+                Device (S00)
+                {
+                    Name (_SUN, Zero)  // _SUN: Slot User Number
+                    Name (_ADR, Zero)  // _ADR: Address
+                    Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device, x=0-9
+                    {
+                        PCEJ (BSEL, _SUN)
+                    }
+
+                    Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
+                    {
+                        Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                    }
+                }
+
+                Method (DVNT, 2, NotSerialized)
+                {
+                    If ((Arg0 & One))
+                    {
+                        Notify (S00, Arg1)
+                    }
+                }
+
+                Method (PCNT, 0, NotSerialized)
+                {
+                    BNUM = One
+                    DVNT (PCIU, One)
+                    DVNT (PCID, 0x03)
+                }
+            }
+
+            Device (S19)
+            {
+                Name (_ADR, 0x00030001)  // _ADR: Address
+                Name (BSEL, Zero)
+                Device (S00)
+                {
+                    Name (_SUN, Zero)  // _SUN: Slot User Number
+                    Name (_ADR, Zero)  // _ADR: Address
+                    Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device, x=0-9
+                    {
+                        PCEJ (BSEL, _SUN)
+                    }
+
+                    Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
+                    {
+                        Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                    }
+                }
+
+                Method (DVNT, 2, NotSerialized)
+                {
+                    If ((Arg0 & One))
+                    {
+                        Notify (S00, Arg1)
+                    }
+                }
+
+                Method (PCNT, 0, NotSerialized)
+                {
+                    BNUM = Zero
+                    DVNT (PCIU, One)
+                    DVNT (PCID, 0x03)
+                }
+            }
+
             Method (PCNT, 0, NotSerialized)
             {
+                ^S19.PCNT ()
+                ^S10.PCNT ()
             }
         }
     }
 }

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20211007135750.1277213-4-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit a8339e07f9)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:05:24 -06:00
Ani Sinha 8319de607f tests/acpi/pcihp: add unit tests for hotplug on multifunction bridges for q35
commit d7346e614f ("acpi: x86: pcihp: add support hotplug on multifunction bridges")
added ACPI hotplug descriptions for cold plugged bridges for functions other
than 0. For all other devices, the ACPI hotplug descriptions are limited to
function 0 only. This change adds unit tests for this feature.

This test adds the following devices to qemu and then checks the changes
introduced in the DSDT table due to the addition of the following devices:

(a) a multifunction bridge device
(b) a bridge device with function 1
(c) a non-bridge device with function 2

In the DSDT table, we should see AML hotplug descriptions for (a) and (b).
For (a) we should find a hotplug AML description for function 0.

The following diff compares the DSDT table AML with the new unit test before
and after the change d7346e614f is introduced. In other words,
this diff reflects the changes that occurs in the DSDT table due to the change
d7346e614f .

@@ -1,60 +1,38 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20190509 (64-bit version)
  * Copyright (c) 2000 - 2019 Intel Corporation
  *
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of tests/data/acpi/q35/DSDT.multi-bridge, Thu Oct  7 18:56:05 2021
+ * Disassembly of /tmp/aml-AN0DA1, Thu Oct  7 18:56:05 2021
  *
  * Original Table Header:
  *     Signature        "DSDT"
- *     Length           0x000020FE (8446)
+ *     Length           0x00002187 (8583)
  *     Revision         0x01 **** 32-bit table (V1), no 64-bit math support
- *     Checksum         0xDE
+ *     Checksum         0x8D
  *     OEM ID           "BOCHS "
  *     OEM Table ID     "BXPC    "
  *     OEM Revision     0x00000001 (1)
  *     Compiler ID      "BXPC"
  *     Compiler Version 0x00000001 (1)
  */
 DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
 {
-    /*
-     * iASL Warning: There was 1 external control method found during
-     * disassembly, but only 0 were resolved (1 unresolved). Additional
-     * ACPI tables may be required to properly disassemble the code. This
-     * resulting disassembler output file may not compile because the
-     * disassembler did not know how many arguments to assign to the
-     * unresolved methods. Note: SSDTs can be dynamically loaded at
-     * runtime and may or may not be available via the host OS.
-     *
-     * In addition, the -fe option can be used to specify a file containing
-     * control method external declarations with the associated method
-     * argument counts. Each line of the file must be of the form:
-     *     External (<method pathname>, MethodObj, <argument count>)
-     * Invocation:
-     *     iasl -fe refs.txt -d dsdt.aml
-     *
-     * The following methods were unresolved and many not compile properly
-     * because the disassembler had to guess at the number of arguments
-     * required for each:
-     */
-    External (_SB_.PCI0.S19_.PCNT, MethodObj)    // Warning: Unknown method, guessing 1 arguments
-
     Scope (\)
     {
         OperationRegion (DBG, SystemIO, 0x0402, One)
         Field (DBG, ByteAcc, NoLock, Preserve)
         {
             DBGB,   8
         }

         Method (DBUG, 1, NotSerialized)
         {
             ToHexString (Arg0, Local0)
             ToBuffer (Local0, Local0)
             Local1 = (SizeOf (Local0) - One)
             Local2 = Zero
             While ((Local2 < Local1))
             {
@@ -3322,24 +3300,60 @@
                 Method (DVNT, 2, NotSerialized)
                 {
                     If ((Arg0 & One))
                     {
                         Notify (S00, Arg1)
                     }
                 }

                 Method (PCNT, 0, NotSerialized)
                 {
                     BNUM = One
                     DVNT (PCIU, One)
                     DVNT (PCID, 0x03)
                 }
             }

+            Device (S19)
+            {
+                Name (_ADR, 0x00030001)  // _ADR: Address
+                Name (BSEL, Zero)
+                Device (S00)
+                {
+                    Name (_SUN, Zero)  // _SUN: Slot User Number
+                    Name (_ADR, Zero)  // _ADR: Address
+                    Method (_EJ0, 1, NotSerialized)  // _EJx: Eject Device, x=0-9
+                    {
+                        PCEJ (BSEL, _SUN)
+                    }
+
+                    Method (_DSM, 4, Serialized)  // _DSM: Device-Specific Method
+                    {
+                        Return (PDSM (Arg0, Arg1, Arg2, Arg3, BSEL, _SUN))
+                    }
+                }
+
+                Method (DVNT, 2, NotSerialized)
+                {
+                    If ((Arg0 & One))
+                    {
+                        Notify (S00, Arg1)
+                    }
+                }
+
+                Method (PCNT, 0, NotSerialized)
+                {
+                    BNUM = Zero
+                    DVNT (PCIU, One)
+                    DVNT (PCID, 0x03)
+                }
+            }
+
             Method (PCNT, 0, NotSerialized)
             {
-                ^S19.PCNT (^S10.PCNT ())
+                ^S19.PCNT ()
+                ^S10.PCNT ()
             }
         }
     }
 }

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20211007135750.1277213-3-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit 04dd78b9e8)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:05:20 -06:00
Ani Sinha a759dc19ec tests/acpi/bios-tables-test: add and allow changes to a new q35 DSDT table blob
We are adding a new unit test to cover the acpi hotplug support in q35 for
multi-function bridges. This test uses a new table DSDT.multi-bridge.
We need to allow changes in DSDT acpi table for addition of this new
unit test.

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <20211007135750.1277213-2-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
(cherry picked from commit 6dcb1cc951)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:05:15 -06:00
Michael S. Tsirkin 24101e36f1 pci: fix PCI resource reserve capability on BE
PCI resource reserve capability should use LE format as all other PCI
things. If we don't then seabios won't boot:

=== PCI new allocation pass #1 ===
PCI: check devices
PCI: QEMU resource reserve cap: size 10000000000000 type io
PCI: secondary bus 1 size 10000000000000 type io
PCI: secondary bus 1 size 00200000 type mem
PCI: secondary bus 1 size 00200000 type prefmem
=== PCI new allocation pass #2 ===
PCI: out of I/O address space

This became more important since we started reserving IO by default,
previously no one noticed.

Fixes: e2a6290aab ("hw/pcie-root-port: Fix hotplug for PCI devices requiring IO")
Cc: marcel.apfelbaum@gmail.com
Fixes: 226263fb5c ("hw/pci: add QEMU-specific PCI capability to the Generic PCI Express Root Port")
Cc: zuban32s@gmail.com
Fixes: 6755e618d0 ("hw/pci: add PCI resource reserve capability to legacy PCI bridge")
Cc: jing2.liu@linux.intel.com
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit 0e464f7d99)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 14:01:41 -06:00
Paolo Bonzini a43e057bd6 block: introduce max_hw_iov for use in scsi-generic
Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel
sources, IOV_MAX in POSIX).  Because of this, on some host adapters
requests with many iovecs are rejected with -EINVAL by the
io_submit() or readv()/writev() system calls.

In fact, the same limit applies to SG_IO as well.  To fix both the
EINVAL and the possible performance issues from using fewer iovecs
than allowed by Linux (some HBAs have max_segments as low as 128),
introduce a separate entry in BlockLimits to hold the max_segments
value from sysfs.  This new limit is used only for SG_IO and clamped
to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to
bs->bl.max_transfer.

Reported-by: Halil Pasic <pasic@linux.ibm.com>
Cc: Hanna Reitz <hreitz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Cc: qemu-stable@nongnu.org
Fixes: 18473467d5 ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cc07162953)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:53:15 -06:00
Ani Sinha 3aa2c2cd67 bios-tables-test: Update ACPI DSDT table golden blobs for q35
We have modified the IO address range for ACPI pci hotplug in q35. See change:

5adcc9e39e6a5 ("hw/i386/acpi: fix conflicting IO address range for acpi pci hotplug in q35")

The ACPI DSDT table golden blobs must be regenrated in order to make the unit tests
pass. This change updates the golden ACPI DSDT table blobs.

Following is the ASL diff between the blobs:

@@ -1,30 +1,30 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20190509 (64-bit version)
  * Copyright (c) 2000 - 2019 Intel Corporation
  *
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of tests/data/acpi/q35/DSDT, Tue Sep 14 09:04:06 2021
+ * Disassembly of /tmp/aml-52DP90, Tue Sep 14 09:04:06 2021
  *
  * Original Table Header:
  *     Signature        "DSDT"
  *     Length           0x00002061 (8289)
  *     Revision         0x01 **** 32-bit table (V1), no 64-bit math support
- *     Checksum         0xE5
+ *     Checksum         0xF9
  *     OEM ID           "BOCHS "
  *     OEM Table ID     "BXPC    "
  *     OEM Revision     0x00000001 (1)
  *     Compiler ID      "BXPC"
  *     Compiler Version 0x00000001 (1)
  */
 DefinitionBlock ("", "DSDT", 1, "BOCHS ", "BXPC    ", 0x00000001)
 {
     Scope (\)
     {
         OperationRegion (DBG, SystemIO, 0x0402, One)
         Field (DBG, ByteAcc, NoLock, Preserve)
         {
             DBGB,   8
         }

@@ -226,46 +226,46 @@
             Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
             {
                 IO (Decode16,
                     0x0070,             // Range Minimum
                     0x0070,             // Range Maximum
                     0x01,               // Alignment
                     0x08,               // Length
                     )
                 IRQNoFlags ()
                     {8}
             })
         }
     }

     Scope (_SB.PCI0)
     {
-        OperationRegion (PCST, SystemIO, 0x0CC4, 0x08)
+        OperationRegion (PCST, SystemIO, 0x0CC0, 0x08)
         Field (PCST, DWordAcc, NoLock, WriteAsZeros)
         {
             PCIU,   32,
             PCID,   32
         }

-        OperationRegion (SEJ, SystemIO, 0x0CCC, 0x04)
+        OperationRegion (SEJ, SystemIO, 0x0CC8, 0x04)
         Field (SEJ, DWordAcc, NoLock, WriteAsZeros)
         {
             B0EJ,   32
         }

-        OperationRegion (BNMR, SystemIO, 0x0CD4, 0x08)
+        OperationRegion (BNMR, SystemIO, 0x0CD0, 0x08)
         Field (BNMR, DWordAcc, NoLock, WriteAsZeros)
         {
             BNUM,   32,
             PIDX,   32
         }

         Mutex (BLCK, 0x00)
         Method (PCEJ, 2, NotSerialized)
         {
             Acquire (BLCK, 0xFFFF)
             BNUM = Arg0
             B0EJ = (One << Arg1)
             Release (BLCK)
             Return (Zero)
         }

@@ -3185,34 +3185,34 @@
                     0x0620,             // Range Minimum
                     0x0620,             // Range Maximum
                     0x01,               // Alignment
                     0x10,               // Length
                     )
             })
         }

         Device (PHPR)
         {
             Name (_HID, "PNP0A06" /* Generic Container Device */)  // _HID: Hardware ID
             Name (_UID, "PCI Hotplug resources")  // _UID: Unique ID
             Name (_STA, 0x0B)  // _STA: Status
             Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
             {
                 IO (Decode16,
-                    0x0CC4,             // Range Minimum
-                    0x0CC4,             // Range Maximum
+                    0x0CC0,             // Range Minimum
+                    0x0CC0,             // Range Maximum
                     0x01,               // Alignment
                     0x18,               // Length
                     )
             })
         }
     }

     Scope (\)
     {
         Name (_S3, Package (0x04)  // _S3_: S3 System State
         {
             One,
             One,
             Zero,
             Zero
         })

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210916132838.3469580-4-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 500eb21cff)
*drop dependency on 75539b886a ("tests: acpi: tpm1.2: Add expected TPM 1.2 ACPI blobs")
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:27:44 -06:00
Ani Sinha 9e80a430ed hw/i386/acpi: fix conflicting IO address range for acpi pci hotplug in q35
Change caf108bc58 ("hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35")
selects an IO address range for acpi based PCI hotplug for q35 arbitrarily. It
starts at address 0x0cc4 and ends at 0x0cdb. At the time when the patch was
written but the final version of the patch was not yet pushed upstream, this
address range was free and did not conflict with any other IO address ranges.
However, with the following change, this address range was no
longer conflict free as in this change, the IO address range
(value of ACPI_PCIHP_SIZE) was incremented by four bytes:

b32bd763a1 ("pci: introduce acpi-index property for PCI device")

This can be seen from the output of QMP command 'info mtree' :

0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt
0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt
0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr
0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0
0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi
0000000000000cc4-0000000000000cdb (prio 0, i/o): acpi-pci-hotplug
0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-cpu-hotplug

It shows that there is a region of conflict between IO regions of acpi
pci hotplug and acpi cpu hotplug.

Unfortunately, the change caf108bc58 did not update the IO address range
appropriately before it was pushed upstream to accommodate the increased
length of the IO address space introduced in change b32bd763a1.

Due to this bug, windows guests complain 'This device cannot find
enough free resources it can use' in the device manager panel for extended
IO buses. This issue also breaks the correct functioning of pci hotplug as the
following shows that the IO space for pci hotplug has been truncated:

(qemu) info mtree -f
FlatView #0
 AS "I/O", root: io
 Root memory region: io
  0000000000000cc4-0000000000000cd7 (prio 0, i/o): acpi-pci-hotplug
  0000000000000cd8-0000000000000cf7 (prio 0, i/o): acpi-cpu-hotplug

Therefore, in this fix, we adjust the IO address range for the acpi pci
hotplug so that it does not conflict with cpu hotplug and there is no
truncation of IO spaces. The starting IO address of PCI hotplug region
has been decremented by four bytes in order to accommodate four byte
increment in the IO address space introduced by change
b32bd763a1 ("pci: introduce acpi-index property for PCI device")

After fixing, the following are the corrected IO ranges:

0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt
0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt
0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr
0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0
0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi
0000000000000cc0-0000000000000cd7 (prio 0, i/o): acpi-pci-hotplug
0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-cpu-hotplug

This change has been tested using a Windows Server 2019 guest VM. Windows
no longer complains after this change.

Fixes: caf108bc58 ("hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/561

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Julia Suvorova <jusual@redhat.com>
Message-Id: <20210916132838.3469580-3-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0e780da76a)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:12:38 -06:00
Ani Sinha c66f5dfc12 bios-tables-test: allow changes in DSDT ACPI tables for q35
We are going to commit a change to fix IO address range allocated for acpi pci
hotplug in q35. This affects DSDT tables. This change allows DSDT table
modification so that unit tests are not broken.

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210916132838.3469580-2-ani@anisinha.ca>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9f29e872d5)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:12:31 -06:00
Jean-Philippe Brucker 5cf977a2a1 hw/i386: Rename default_bus_bypass_iommu
Since commit d8fb7d0969 ("vl: switch -M parsing to keyval"), machine
parameter definitions cannot use underscores, because keyval_dashify()
transforms them to dashes and the parser doesn't find the parameter.

This affects option default_bus_bypass_iommu which was introduced in the
same release:

$ qemu-system-x86_64 -M q35,default_bus_bypass_iommu=on
qemu-system-x86_64: Property 'pc-q35-6.1-machine.default-bus-bypass-iommu' not found

Rename the parameter to "default-bus-bypass-iommu". Passing
"default_bus_bypass_iommu" is still valid since the underscore are
transformed automatically.

Fixes: c9e96b04fc ("hw/i386: Add a default_bus_bypass_iommu pc machine option")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211025104737.1560274-1-jean-philippe@linaro.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 739b38630c)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:07:48 -06:00
Jean-Philippe Brucker 36cfd11a86 hw/arm/virt: Rename default_bus_bypass_iommu
Since commit d8fb7d0969 ("vl: switch -M parsing to keyval"), machine
parameter definitions cannot use underscores, because keyval_dashify()
transforms them to dashes and the parser doesn't find the parameter.

This affects option default_bus_bypass_iommu which was introduced in the
same release:

$ qemu-system-aarch64 -M virt,default_bus_bypass_iommu=on
qemu-system-aarch64: Property 'virt-6.1-machine.default-bus-bypass-iommu' not found

Rename the parameter to "default-bus-bypass-iommu". Passing
"default_bus_bypass_iommu" is still valid since the underscore are
transformed automatically.

Fixes: 6d7a85483a ("hw/arm/virt: Add default_bus_bypass_iommu machine option")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211026093733.2144161-1-jean-philippe@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 9dad363a22)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:07:27 -06:00
Stefano Garzarella 246ccfbf44 vhost-vsock: fix migration issue when seqpacket is supported
Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
enabled the SEQPACKET feature bit.
This commit is released with QEMU 6.1, so if we try to migrate a VM where
the host kernel supports SEQPACKET but machine type version is less than
6.1, we get the following errors:

    Features 0x130000002 unsupported. Allowed features: 0x179000000
    Failed to load virtio-vhost_vsock:virtio
    error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
    load of migration failed: Operation not permitted

Let's disable the feature bit for machine types < 6.1.
We add a new OnOffAuto property for this, called `seqpacket`.
When it is `auto` (default), QEMU behaves as before, trying to enable the
feature, when it is `on` QEMU will fail if the backend (vhost-vsock
kernel module) doesn't support it.

Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
Cc: qemu-stable@nongnu.org
Reported-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-2-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d6a9378f47)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 13:04:13 -06:00
Michael Tokarev 3ee93e456d qemu-sockets: fix unix socket path copy (again)
Commit 4cfd970ec1 added an
assert which ensures the path within an address of a unix
socket returned from the kernel is at least one byte and
does not exceed sun_path buffer. Both of this constraints
are wrong:

A unix socket can be unnamed, in this case the path is
completely empty (not even \0)

And some implementations (notable linux) can add extra
trailing byte (\0) _after_ the sun_path buffer if we
passed buffer larger than it (and we do).

So remove the assertion (since it causes real-life breakage)
but at the same time fix the usage of sun_path. Namely,
we should not access sun_path[0] if kernel did not return
it at all (this is the case for unnamed sockets),
and use the returned salen when copyig actual path as an
upper constraint for the amount of bytes to copy - this
will ensure we wont exceed the information provided by
the kernel, regardless whenever there is a trailing \0
or not. This also helps with unnamed sockets.

Note the case of abstract socket, the sun_path is actually
a blob and can contain \0 characters, - it should not be
passed to g_strndup and the like, it should be accessed by
memcpy-like functions.

Fixes: 4cfd970ec1
Fixes: http://bugs.debian.org/993145
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
CC: qemu-stable@nongnu.org
(cherry picked from commit 118d527f2e)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 12:55:36 -06:00
Paolo Bonzini ec08035102 plugins: do not limit exported symbols if modules are active
On Mac --enable-modules and --enable-plugins are currently incompatible, because the
Apple -Wl,-exported_symbols_list command line options prevents the export of any
symbols needed by the modules.  On x86 -Wl,--dynamic-list does not have this effect,
but only because the -Wl,--export-dynamic option provided by gmodule-2.0.pc overrides
it.  On Apple there is no -Wl,--export-dynamic, because it is the default, and thus
no override.

Either way, when modules are active there is no reason to include the plugin_ldflags.
While at it, avoid the useless -Wl,--export-dynamic when --enable-plugins is
specified but --enable-modules is not; this way, the GNU and Apple configurations
are more similar.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/516
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: fix noexport to no-export]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210811100550.54714-1-pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
(cherry picked from commit b906acace2)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 12:49:09 -06:00
Mahmoud Mandour f97853c8cb plugins/execlog: removed unintended "s" at the end of log lines.
Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210803151428.125323-1-ma.mandourr@gmail.com>
Message-Id: <20210806141015.2487502-2-alex.bennee@linaro.org>
Cc: qemu-stable@nongnu.org
(cherry picked from commit b40310616d)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 12:48:25 -06:00
Christian Schoenebeck abeee2a470 9pfs: fix crash in v9fs_walk()
v9fs_walk() utilizes the v9fs_co_run_in_worker({...}) macro to run the
supplied fs driver code block on a background worker thread.

When either the 'Twalk' client request was interrupted or if the client
requested fid for that 'Twalk' request caused a stat error then that
fs driver code block was left by 'break' keyword, with the intention to
return from worker thread back to main thread as well:

    v9fs_co_run_in_worker({
        if (v9fs_request_cancelled(pdu)) {
            err = -EINTR;
            break;
        }
        err = s->ops->lstat(&s->ctx, &dpath, &fidst);
        if (err < 0) {
            err = -errno;
            break;
        }
        ...
    });

However that 'break;' statement also skipped the v9fs_co_run_in_worker()
macro's final and mandatory

    /* re-enter back to qemu thread */
    qemu_coroutine_yield();

call and thus caused the rest of v9fs_walk() to be continued being
executed on the worker thread instead of main thread, eventually
leading to a crash in the transport virtio transport driver.

To fix this issue and to prevent the same error from happening again by
other users of v9fs_co_run_in_worker() in future, auto wrap the supplied
code block into its own

    do { } while (0);

loop inside the 'v9fs_co_run_in_worker' macro definition.

Full discussion and backtrace:
https://lists.gnu.org/archive/html/qemu-devel/2021-08/msg05209.html
https://lists.gnu.org/archive/html/qemu-devel/2021-09/msg00174.html

Fixes: 8d6cb10073
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <E1mLTBg-0002Bh-2D@lizzy.crudebyte.com>
(cherry picked from commit f83df00900)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 12:46:48 -06:00
Yang Zhong ff6d391e10 i386/cpu: Remove AVX_VNNI feature from Cooperlake cpu model
The AVX_VNNI feature is not in Cooperlake platform, remove it
from cpu model.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20210820054611.84303-1-yang.zhong@intel.com>
Fixes: c1826ea6a0 ("i386/cpu: Expose AVX_VNNI instruction to guest")
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit f429dbf8fc)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 12:43:09 -06:00
Helge Deller b19de1137b hw/display/artist: Fix bug in coordinate extraction in artist_vram_read() and artist_vram_write()
The CDE desktop on HP-UX 10 shows wrongly rendered pixels when the local screen
menu is closed. This bug was introduced by commit c7050f3f16
("hw/display/artist: Refactor x/y coordination extraction") which converted the
coordinate extraction in artist_vram_read() and artist_vram_write() to use the
ADDR_TO_X and ADDR_TO_Y macros, but forgot to right-shift the address by 2 as
it was done before.

Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: c7050f3f16 ("hw/display/artist: Refactor x/y coordination extraction")
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <YK1aPb8keur9W7h2@ls3530>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 01f750f5fe)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:57:12 -06:00
David Hildenbrand 3c6e5df1f6 libvhost-user: fix VHOST_USER_REM_MEM_REG skipping mmap_addr
We end up not copying the mmap_addr of all existing regions, resulting
in a SEGFAULT once we actually try to map/access anything within our
memory regions.

Fixes: 875b9fd97b ("Support individual region unmap in libvhost-user")
Cc: qemu-stable@nongnu.org
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Coiby Xu <coiby.xu@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211011201047.62587-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6889eb2d43)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:57:07 -06:00
Xueming Li 695c25e167 vhost-user: fix duplicated notifier MR init
In case of device resume after suspend, VQ notifier MR still valid.
Duplicated registrations explode memory block list and slow down device
resume.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: tiwei.bie@intel.com
Cc: qemu-stable@nongnu.org
Cc: Yuwei Zhang <zhangyuwei.9149@bytedance.com>

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20211008080215.590292-1-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a1ed9ef1de)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:57:02 -06:00
Gerd Hoffmann 23ba9f170f uas: add stream number sanity checks.
The device uses the guest-supplied stream number unchecked, which can
lead to guest-triggered out-of-band access to the UASDevice->data3 and
UASDevice->status3 fields.  Add the missing checks.

Fixes: CVE-2021-3713
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reported-by: Chen Zhe <chenzhe@huawei.com>
Reported-by: Tan Jingguo <tanjingguo@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210818120505.1258262-2-kraxel@redhat.com>
(cherry picked from commit 13b250b12a)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:53 -06:00
David Hildenbrand f0dee5a40d virtio-mem-pci: Fix memory leak when creating MEMORY_DEVICE_SIZE_CHANGE event
Apparently, we don't have to duplicate the string.

Fixes: 722a3c783e ("virtio-pci: Send qapi events when the virtio-mem size changes")
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210929162445.64060-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 75b98cb9f6)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:49 -06:00
Markus Armbruster 7637373b23 hmp: Unbreak "change vnc"
HMP command "change vnc" can take the password as argument, or prompt
for it:

    (qemu) change vnc password 123
    (qemu) change vnc password
    Password: ***
    (qemu)

This regressed in commit cfb5387a1d "hmp: remove "change vnc TARGET"
command", v6.0.0.

    (qemu) change vnc passwd 123
    Password: ***
    (qemu) change vnc passwd
    (qemu)

The latter passes NULL to qmp_change_vnc_password(), which is a no-no.
Looks like it puts the display into "password required, but none set"
state.

The logic error is easy to miss in review, but testing should've
caught it.

Fix the obvious way.

Fixes: cfb5387a1d
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210909081219.308065-2-armbru@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 6193344f93)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:41 -06:00
Nir Soffer 4c34ef3d34 qemu-nbd: Change default cache mode to writeback
Both qemu and qemu-img use writeback cache mode by default, which is
already documented in qemu(1). qemu-nbd uses writethrough cache mode by
default, and the default cache mode is not documented.

According to the qemu-nbd(8):

   --cache=CACHE
          The  cache  mode  to be used with the file.  See the
          documentation of the emulator's -drive cache=... option for
          allowed values.

qemu(1) says:

    The default mode is cache=writeback.

So users have no reason to assume that qemu-nbd is using writethough
cache mode. The only hint is the painfully slow writing when using the
defaults.

Looking in git history, it seems that qemu used writethrough in the past
to support broken guests that did not flush data properly, or could not
flush due to limitations in qemu. But qemu-nbd clients can use
NBD_CMD_FLUSH to flush data, so using writethrough does not help anyone.

Change the default cache mode to writback, and document the default and
available values properly in the online help and manual.

With this change converting image via qemu-nbd is 3.5 times faster.

    $ qemu-img create dst.img 50g
    $ qemu-nbd -t -f raw -k /tmp/nbd.sock dst.img

Before this change:

    $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
    Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
      Time (mean ± σ):     83.639 s ±  5.970 s    [User: 2.733 s, System: 6.112 s]
      Range (min … max):   76.749 s … 87.245 s    3 runs

After this change:

    $ hyperfine -r3 "./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock"
    Benchmark #1: ./qemu-img convert -p -f raw -O raw -T none -W fedora34.img nbd+unix:///?socket=/tmp/nbd.sock
      Time (mean ± σ):     23.522 s ±  0.433 s    [User: 2.083 s, System: 5.475 s]
      Range (min … max):   23.234 s … 24.019 s    3 runs

Users can avoid the issue by using --cache=writeback[1] but the defaults
should give good performance for the common use case.

[1] https://bugzilla.redhat.com/1990656

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Message-Id: <20210813205519.50518-1-nsoffer@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 0961525705)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:36 -06:00
Jason Wang 9e41f16fca virtio-net: fix use after unmap/free for sg
When mergeable buffer is enabled, we try to set the num_buffers after
the virtqueue elem has been unmapped. This will lead several issues,
E.g a use after free when the descriptor has an address which belongs
to the non direct access region. In this case we use bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().

Fixing this by storing the elems temporarily in an array and delay the
unmap after we set the the num_buffers.

This addresses CVE-2021-3748.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit bedd7e93d0)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:31 -06:00
Peter Maydell 3054f772de target/arm: Don't skip M-profile reset entirely in user mode
Currently all of the M-profile specific code in arm_cpu_reset() is
inside a !defined(CONFIG_USER_ONLY) ifdef block.  This is
unintentional: it happened because originally the only
M-profile-specific handling was the setup of the initial SP and PC
from the vector table, which is system-emulation only.  But then we
added a lot of other M-profile setup to the same "if (ARM_FEATURE_M)"
code block without noticing that it was all inside a not-user-mode
ifdef.  This has generally been harmless, but with the addition of
v8.1M low-overhead-loop support we ran into a problem: the reset of
FPSCR.LTPSIZE to 4 was only being done for system emulation mode, so
if a user-mode guest tried to execute the LE instruction it would
incorrectly take a UsageFault.

Adjust the ifdefs so only the really system-emulation specific parts
are covered.  Because this means we now run some reset code that sets
up initial values in the FPCCR and similar FPU related registers,
explicitly set up the registers controlling FPU context handling in
user-emulation mode so that the FPU works by design and not by
chance.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/613
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210914120725.24992-2-peter.maydell@linaro.org
(cherry picked from commit b62ceeaf80)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:25 -06:00
David Hildenbrand aa77e375a5 virtio-balloon: don't start free page hinting if postcopy is possible
Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:

1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
   and consequently won't release free pages back to the OS once
   migration finishes.

   The issue is that for postcopy, we won't do a final bitmap sync while
   the guest is stopped on the source and
   virtio_balloon_free_page_hint_notify() will only call
   virtio_balloon_free_page_done() on the source during
   PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
   the destination.

2) Once the VM touches a page on the destination that has been excluded
   from migration on the source via qemu_guest_free_page_hint() while
   postcopy is active, that thread will stall until postcopy finishes
   and all threads are woken up. (with older Linux kernels that won't
   retry faults when woken up via userfaultfd, we might actually get a
   SEGFAULT)

   The issue is that the source will refuse to migrate any pages that
   are not marked as dirty in the dirty bmap -- for example, because the
   page might just have been sent. Consequently, the faulting thread will
   stall, waiting for the page to be migrated -- which could take quite
   a while and result in guest OS issues.

While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].

As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.

Note 1: in the future we might either adjust migration code on the source
        to track pages that have actually been sent or adjust
        migration code on source and destination  to eventually send
        pages multiple times from the source and and deal with pages
        that are sent multiple times on the destination.

Note 2: virtio-mem has similar issues, however, access to "unplugged"
        memory by the guest is very rare and we would have to be very
        lucky for it to happen during migration. The spec states
        "The driver SHOULD NOT read from unplugged memory blocks ..."
        and "The driver MUST NOT write to unplugged memory blocks".
        virtio-mem will move away from virtio_balloon_free_page_done()
        soon and handle this case explicitly on the destination.

[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com

Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fd51e54fa1)
Signed-off-by: Michael Roth <michael.roth@amd.com>
2021-12-14 08:56:18 -06:00
68 changed files with 410 additions and 105 deletions

View File

@ -1 +1 @@
6.1.0
6.1.1

View File

@ -28,6 +28,7 @@
#include "sysemu/tcg.h"
#include "sysemu/replay.h"
#include "qemu/main-loop.h"
#include "qemu/notify.h"
#include "qemu/guest-random.h"
#include "exec/exec-all.h"
#include "hw/boards.h"
@ -35,6 +36,26 @@
#include "tcg-accel-ops.h"
#include "tcg-accel-ops-mttcg.h"
typedef struct MttcgForceRcuNotifier {
Notifier notifier;
CPUState *cpu;
} MttcgForceRcuNotifier;
static void do_nothing(CPUState *cpu, run_on_cpu_data d)
{
}
static void mttcg_force_rcu(Notifier *notify, void *data)
{
CPUState *cpu = container_of(notify, MttcgForceRcuNotifier, notifier)->cpu;
/*
* Called with rcu_registry_lock held, using async_run_on_cpu() ensures
* that there are no deadlocks.
*/
async_run_on_cpu(cpu, do_nothing, RUN_ON_CPU_NULL);
}
/*
* In the multi-threaded case each vCPU has its own thread. The TLS
* variable current_cpu can be used deep in the code to find the
@ -43,12 +64,16 @@
static void *mttcg_cpu_thread_fn(void *arg)
{
MttcgForceRcuNotifier force_rcu;
CPUState *cpu = arg;
assert(tcg_enabled());
g_assert(!icount_enabled());
rcu_register_thread();
force_rcu.notifier.notify = mttcg_force_rcu;
force_rcu.cpu = cpu;
rcu_add_force_rcu_notifier(&force_rcu.notifier);
tcg_register_thread();
qemu_mutex_lock_iothread();
@ -100,6 +125,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
tcg_cpus_destroy(cpu);
qemu_mutex_unlock_iothread();
rcu_remove_force_rcu_notifier(&force_rcu.notifier);
rcu_unregister_thread();
return NULL;
}

View File

@ -28,6 +28,7 @@
#include "sysemu/tcg.h"
#include "sysemu/replay.h"
#include "qemu/main-loop.h"
#include "qemu/notify.h"
#include "qemu/guest-random.h"
#include "exec/exec-all.h"
@ -135,6 +136,11 @@ static void rr_deal_with_unplugged_cpus(void)
}
}
static void rr_force_rcu(Notifier *notify, void *data)
{
rr_kick_next_cpu();
}
/*
* In the single-threaded case each vCPU is simulated in turn. If
* there is more than a single vCPU we create a simple timer to kick
@ -145,10 +151,13 @@ static void rr_deal_with_unplugged_cpus(void)
static void *rr_cpu_thread_fn(void *arg)
{
Notifier force_rcu;
CPUState *cpu = arg;
assert(tcg_enabled());
rcu_register_thread();
force_rcu.notify = rr_force_rcu;
rcu_add_force_rcu_notifier(&force_rcu);
tcg_register_thread();
qemu_mutex_lock_iothread();
@ -257,6 +266,7 @@ static void *rr_cpu_thread_fn(void *arg)
rr_deal_with_unplugged_cpus();
}
rcu_remove_force_rcu_notifier(&force_rcu);
rcu_unregister_thread();
return NULL;
}

View File

@ -1978,6 +1978,12 @@ uint32_t blk_get_max_transfer(BlockBackend *blk)
return ROUND_DOWN(max, blk_get_request_alignment(blk));
}
int blk_get_max_hw_iov(BlockBackend *blk)
{
return MIN_NON_ZERO(blk->root->bs->bl.max_hw_iov,
blk->root->bs->bl.max_iov);
}
int blk_get_max_iov(BlockBackend *blk)
{
return blk->root->bs->bl.max_iov;

View File

@ -1273,7 +1273,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
ret = hdev_get_max_segments(s->fd, &st);
if (ret > 0) {
bs->bl.max_iov = ret;
bs->bl.max_hw_iov = ret;
}
}
}
@ -1807,7 +1807,7 @@ static int handle_aiocb_copy_range(void *opaque)
static int handle_aiocb_discard(void *opaque)
{
RawPosixAIOData *aiocb = opaque;
int ret = -EOPNOTSUPP;
int ret = -ENOTSUP;
BDRVRawState *s = aiocb->bs->opaque;
if (!s->has_discard) {
@ -1829,7 +1829,7 @@ static int handle_aiocb_discard(void *opaque)
#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
aiocb->aio_offset, aiocb->aio_nbytes);
ret = translate_err(-errno);
ret = translate_err(ret);
#elif defined(__APPLE__) && (__MACH__)
fpunchhole_t fpunchhole;
fpunchhole.fp_flags = 0;

View File

@ -136,6 +136,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
dst->min_mem_alignment = MAX(dst->min_mem_alignment,
src->min_mem_alignment);
dst->max_iov = MIN_NON_ZERO(dst->max_iov, src->max_iov);
dst->max_hw_iov = MIN_NON_ZERO(dst->max_hw_iov, src->max_hw_iov);
}
typedef struct BdrvRefreshLimitsState {

View File

@ -320,7 +320,6 @@ static void wctablet_chr_finalize(Object *obj)
TabletChardev *tablet = WCTABLET_CHARDEV(obj);
qemu_input_handler_unregister(tablet->hs);
g_free(tablet);
}
static void wctablet_chr_open(Chardev *chr,

10
configure vendored
View File

@ -2246,9 +2246,11 @@ static THREAD int tls_var;
int main(void) { return tls_var; }
EOF
# Check we support --no-pie first; we will need this for building ROMs.
# Check we support -fno-pie and -no-pie first; we will need the former for
# building ROMs, and both for everything if --disable-pie is passed.
if compile_prog "-Werror -fno-pie" "-no-pie"; then
CFLAGS_NOPIE="-fno-pie"
LDFLAGS_NOPIE="-no-pie"
fi
if test "$static" = "yes"; then
@ -2264,6 +2266,7 @@ if test "$static" = "yes"; then
fi
elif test "$pie" = "no"; then
CONFIGURE_CFLAGS="$CFLAGS_NOPIE $CONFIGURE_CFLAGS"
CONFIGURE_LDFLAGS="$LDFLAGS_NOPIE $CONFIGURE_LDFLAGS"
elif compile_prog "-Werror -fPIE -DPIE" "-pie"; then
CONFIGURE_CFLAGS="-fPIE -DPIE $CONFIGURE_CFLAGS"
CONFIGURE_LDFLAGS="-pie $CONFIGURE_LDFLAGS"
@ -3187,9 +3190,8 @@ glib_req_ver=2.56
glib_modules=gthread-2.0
if test "$modules" = yes; then
glib_modules="$glib_modules gmodule-export-2.0"
fi
if test "$plugins" = "yes"; then
glib_modules="$glib_modules gmodule-2.0"
elif test "$plugins" = "yes"; then
glib_modules="$glib_modules gmodule-no-export-2.0"
fi
for i in $glib_modules; do

View File

@ -67,7 +67,7 @@ static void vcpu_insn_exec(unsigned int cpu_index, void *udata)
/* Print previous instruction in cache */
if (s->len) {
qemu_plugin_outs(s->str);
qemu_plugin_outs("s\n");
qemu_plugin_outs("\n");
}
/* Store new instruction in cache */

View File

@ -98,8 +98,10 @@ driver options if ``--image-opts`` is specified.
.. option:: --cache=CACHE
The cache mode to be used with the file. See the documentation of
the emulator's ``-drive cache=...`` option for allowed values.
The cache mode to be used with the file. Valid values are:
``none``, ``writeback`` (the default), ``writethrough``,
``directsync`` and ``unsafe``. See the documentation of
the emulator's ``-drive cache=...`` option for more info.
.. option:: -n, --nocache

View File

@ -51,7 +51,9 @@
*/ \
qemu_coroutine_yield(); \
qemu_bh_delete(co_bh); \
code_block; \
do { \
code_block; \
} while (0); \
/* re-enter back to qemu thread */ \
qemu_coroutine_yield(); \
} while (0)

View File

@ -2677,10 +2677,10 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
"Set the IOMMU type. "
"Valid values are none and smmuv3");
object_class_property_add_bool(oc, "default_bus_bypass_iommu",
object_class_property_add_bool(oc, "default-bus-bypass-iommu",
virt_get_default_bus_bypass_iommu,
virt_set_default_bus_bypass_iommu);
object_class_property_set_description(oc, "default_bus_bypass_iommu",
object_class_property_set_description(oc, "default-bus-bypass-iommu",
"Set on/off to enable/disable "
"bypass_iommu for default root bus");

View File

@ -222,7 +222,7 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev)
memory_region_transaction_commit();
while (j--) {
virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i);
virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), j);
}
goto fail_host_notifiers;
}

View File

@ -61,6 +61,12 @@
} while (0)
/* Anonymous BlockBackend for empty drive */
static BlockBackend *blk_create_empty_drive(void)
{
return blk_new(qemu_get_aio_context(), 0, BLK_PERM_ALL);
}
/********************************************************/
/* qdev floppy bus */
@ -486,8 +492,7 @@ static void floppy_drive_realize(DeviceState *qdev, Error **errp)
}
if (!dev->conf.blk) {
/* Anonymous BlockBackend for an empty drive */
dev->conf.blk = blk_new(qemu_get_aio_context(), 0, BLK_PERM_ALL);
dev->conf.blk = blk_create_empty_drive();
ret = blk_attach_dev(dev->conf.blk, qdev);
assert(ret == 0);
@ -1161,7 +1166,19 @@ static FDrive *get_drv(FDCtrl *fdctrl, int unit)
static FDrive *get_cur_drv(FDCtrl *fdctrl)
{
return get_drv(fdctrl, fdctrl->cur_drv);
FDrive *cur_drv = get_drv(fdctrl, fdctrl->cur_drv);
if (!cur_drv->blk) {
/*
* Kludge: empty drive line selected. Create an anonymous
* BlockBackend to avoid NULL deref with various BlockBackend
* API calls within this model (CVE-2021-20196).
* Due to the controller QOM model limitations, we don't
* attach the created to the controller device.
*/
cur_drv->blk = blk_create_empty_drive();
}
return cur_drv;
}
/* Status A register : 0x00 (read-only) */

View File

@ -43,6 +43,7 @@ GlobalProperty hw_compat_6_0[] = {
{ "nvme-ns", "eui64-default", "off"},
{ "e1000", "init-vet", "off" },
{ "e1000e", "init-vet", "off" },
{ "vhost-vsock-device", "seqpacket", "off" },
};
const size_t hw_compat_6_0_len = G_N_ELEMENTS(hw_compat_6_0);

View File

@ -1170,8 +1170,8 @@ static void artist_vram_write(void *opaque, hwaddr addr, uint64_t val,
}
buf = vram_write_buffer(s);
posy = ADDR_TO_Y(addr);
posx = ADDR_TO_X(addr);
posy = ADDR_TO_Y(addr >> 2);
posx = ADDR_TO_X(addr >> 2);
if (!buf->size) {
return;
@ -1232,8 +1232,8 @@ static uint64_t artist_vram_read(void *opaque, hwaddr addr, unsigned size)
return 0;
}
posy = ADDR_TO_Y(addr);
posx = ADDR_TO_X(addr);
posy = ADDR_TO_Y(addr >> 2);
posx = ADDR_TO_X(addr >> 2);
if (posy > buf->height || posx > buf->width) {
return 0;

View File

@ -2252,7 +2252,7 @@ static int qxl_pre_save(void *opaque)
} else {
d->last_release_offset = (uint8_t *)d->last_release - ram_start;
}
if (d->last_release_offset < d->vga.vram_size) {
if (d->last_release_offset >= d->vga.vram_size) {
return 1;
}

View File

@ -1763,7 +1763,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
object_class_property_add_bool(oc, "hpet",
pc_machine_get_hpet, pc_machine_set_hpet);
object_class_property_add_bool(oc, "default_bus_bypass_iommu",
object_class_property_add_bool(oc, "default-bus-bypass-iommu",
pc_machine_get_default_bus_bypass_iommu,
pc_machine_set_default_bus_bypass_iommu);

View File

@ -243,7 +243,7 @@ static void pc_q35_init(MachineState *machine)
NULL);
if (acpi_pcihp) {
object_register_sugar_prop(TYPE_PCIE_SLOT, "native-hotplug",
object_register_sugar_prop(TYPE_PCIE_SLOT, "x-native-hotplug",
"false", true);
}

View File

@ -304,7 +304,14 @@ type_init(virt_machine_register_types)
} \
type_init(machvirt_machine_##major##_##minor##_init);
static void virt_machine_6_0_options(MachineClass *mc)
static void virt_machine_6_1_options(MachineClass *mc)
{
}
DEFINE_VIRT_MACHINE(6, 0, true)
DEFINE_VIRT_MACHINE(6, 1, true)
static void virt_machine_6_0_options(MachineClass *mc)
{
virt_machine_6_1_options(mc);
compat_props_add(mc->compat_props, hw_compat_6_0, hw_compat_6_0_len);
}
DEFINE_VIRT_MACHINE(6, 0, false)

View File

@ -107,6 +107,7 @@ struct E1000State_st {
e1000x_txd_props props;
e1000x_txd_props tso_props;
uint16_t tso_frames;
bool busy;
} tx;
struct {
@ -763,6 +764,11 @@ start_xmit(E1000State *s)
return;
}
if (s->tx.busy) {
return;
}
s->tx.busy = true;
while (s->mac_reg[TDH] != s->mac_reg[TDT]) {
base = tx_desc_base(s) +
sizeof(struct e1000_tx_desc) * s->mac_reg[TDH];
@ -789,6 +795,7 @@ start_xmit(E1000State *s)
break;
}
}
s->tx.busy = false;
set_ics(s, 0, cause);
}

View File

@ -1746,10 +1746,13 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
VirtIONet *n = qemu_get_nic_opaque(nc);
VirtIONetQueue *q = virtio_net_get_subqueue(nc);
VirtIODevice *vdev = VIRTIO_DEVICE(n);
VirtQueueElement *elems[VIRTQUEUE_MAX_SIZE];
size_t lens[VIRTQUEUE_MAX_SIZE];
struct iovec mhdr_sg[VIRTQUEUE_MAX_SIZE];
struct virtio_net_hdr_mrg_rxbuf mhdr;
unsigned mhdr_cnt = 0;
size_t offset, i, guest_offset;
size_t offset, i, guest_offset, j;
ssize_t err;
if (!virtio_net_can_receive(nc)) {
return -1;
@ -1780,6 +1783,12 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
total = 0;
if (i == VIRTQUEUE_MAX_SIZE) {
virtio_error(vdev, "virtio-net unexpected long buffer chain");
err = size;
goto err;
}
elem = virtqueue_pop(q->rx_vq, sizeof(VirtQueueElement));
if (!elem) {
if (i) {
@ -1791,7 +1800,8 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
n->guest_hdr_len, n->host_hdr_len,
vdev->guest_features);
}
return -1;
err = -1;
goto err;
}
if (elem->in_num < 1) {
@ -1799,7 +1809,8 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
"virtio-net receive queue contains no in buffers");
virtqueue_detach_element(q->rx_vq, elem, 0);
g_free(elem);
return -1;
err = -1;
goto err;
}
sg = elem->in_sg;
@ -1836,12 +1847,13 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
if (!n->mergeable_rx_bufs && offset < size) {
virtqueue_unpop(q->rx_vq, elem, total);
g_free(elem);
return size;
err = size;
goto err;
}
/* signal other side */
virtqueue_fill(q->rx_vq, elem, total, i++);
g_free(elem);
elems[i] = elem;
lens[i] = total;
i++;
}
if (mhdr_cnt) {
@ -1851,10 +1863,23 @@ static ssize_t virtio_net_receive_rcu(NetClientState *nc, const uint8_t *buf,
&mhdr.num_buffers, sizeof mhdr.num_buffers);
}
for (j = 0; j < i; j++) {
/* signal other side */
virtqueue_fill(q->rx_vq, elems[j], lens[j], j);
g_free(elems[j]);
}
virtqueue_flush(q->rx_vq, i);
virtio_notify(vdev, q->rx_vq);
return size;
err:
for (j = 0; j < i; j++) {
g_free(elems[j]);
}
return err;
}
static ssize_t virtio_net_do_receive(NetClientState *nc, const uint8_t *buf,

View File

@ -1441,6 +1441,7 @@ static void vmxnet3_activate_device(VMXNET3State *s)
vmxnet3_setup_rx_filtering(s);
/* Cache fields from shared memory */
s->mtu = VMXNET3_READ_DRV_SHARED32(d, s->drv_shmem, devRead.misc.mtu);
assert(VMXNET3_MIN_MTU <= s->mtu && s->mtu < VMXNET3_MAX_MTU);
VMW_CFPRN("MTU is %u", s->mtu);
s->max_rx_frags =
@ -1486,6 +1487,9 @@ static void vmxnet3_activate_device(VMXNET3State *s)
/* Read rings memory locations for TX queues */
pa = VMXNET3_READ_TX_QUEUE_DESCR64(d, qdescr_pa, conf.txRingBasePA);
size = VMXNET3_READ_TX_QUEUE_DESCR32(d, qdescr_pa, conf.txRingSize);
if (size > VMXNET3_TX_RING_MAX_SIZE) {
size = VMXNET3_TX_RING_MAX_SIZE;
}
vmxnet3_ring_init(d, &s->txq_descr[i].tx_ring, pa, size,
sizeof(struct Vmxnet3_TxDesc), false);
@ -1496,6 +1500,9 @@ static void vmxnet3_activate_device(VMXNET3State *s)
/* TXC ring */
pa = VMXNET3_READ_TX_QUEUE_DESCR64(d, qdescr_pa, conf.compRingBasePA);
size = VMXNET3_READ_TX_QUEUE_DESCR32(d, qdescr_pa, conf.compRingSize);
if (size > VMXNET3_TC_RING_MAX_SIZE) {
size = VMXNET3_TC_RING_MAX_SIZE;
}
vmxnet3_ring_init(d, &s->txq_descr[i].comp_ring, pa, size,
sizeof(struct Vmxnet3_TxCompDesc), true);
VMXNET3_RING_DUMP(VMW_CFPRN, "TXC", i, &s->txq_descr[i].comp_ring);
@ -1537,6 +1544,9 @@ static void vmxnet3_activate_device(VMXNET3State *s)
/* RX rings */
pa = VMXNET3_READ_RX_QUEUE_DESCR64(d, qd_pa, conf.rxRingBasePA[j]);
size = VMXNET3_READ_RX_QUEUE_DESCR32(d, qd_pa, conf.rxRingSize[j]);
if (size > VMXNET3_RX_RING_MAX_SIZE) {
size = VMXNET3_RX_RING_MAX_SIZE;
}
vmxnet3_ring_init(d, &s->rxq_descr[i].rx_ring[j], pa, size,
sizeof(struct Vmxnet3_RxDesc), false);
VMW_CFPRN("RX queue %d:%d: Base: %" PRIx64 ", Size: %d",
@ -1546,6 +1556,9 @@ static void vmxnet3_activate_device(VMXNET3State *s)
/* RXC ring */
pa = VMXNET3_READ_RX_QUEUE_DESCR64(d, qd_pa, conf.compRingBasePA);
size = VMXNET3_READ_RX_QUEUE_DESCR32(d, qd_pa, conf.compRingSize);
if (size > VMXNET3_RC_RING_MAX_SIZE) {
size = VMXNET3_RC_RING_MAX_SIZE;
}
vmxnet3_ring_init(d, &s->rxq_descr[i].comp_ring, pa, size,
sizeof(struct Vmxnet3_RxCompDesc), true);
VMW_CFPRN("RXC queue %d: Base: %" PRIx64 ", Size: %d", i, pa, size);

View File

@ -4164,6 +4164,11 @@ static uint16_t nvme_changed_nslist(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
int i = 0;
uint32_t nsid;
if (off >= sizeof(nslist)) {
trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(nslist));
return NVME_INVALID_FIELD | NVME_DNR;
}
memset(nslist, 0x0, sizeof(nslist));
trans_len = MIN(sizeof(nslist) - off, buf_len);

View File

@ -448,11 +448,11 @@ int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset,
PCIBridgeQemuCap cap = {
.len = cap_len,
.type = REDHAT_PCI_CAP_RESOURCE_RESERVE,
.bus_res = res_reserve.bus,
.io = res_reserve.io,
.mem = res_reserve.mem_non_pref,
.mem_pref_32 = res_reserve.mem_pref_32,
.mem_pref_64 = res_reserve.mem_pref_64
.bus_res = cpu_to_le32(res_reserve.bus),
.io = cpu_to_le64(res_reserve.io),
.mem = cpu_to_le32(res_reserve.mem_non_pref),
.mem_pref_32 = cpu_to_le32(res_reserve.mem_pref_32),
.mem_pref_64 = cpu_to_le64(res_reserve.mem_pref_64)
};
int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR,

View File

@ -148,7 +148,7 @@ static Property pcie_slot_props[] = {
DEFINE_PROP_UINT8("chassis", PCIESlot, chassis, 0),
DEFINE_PROP_UINT16("slot", PCIESlot, slot, 0),
DEFINE_PROP_BOOL("hotplug", PCIESlot, hotplug, true),
DEFINE_PROP_BOOL("native-hotplug", PCIESlot, native_hotplug, true),
DEFINE_PROP_BOOL("x-native-hotplug", PCIESlot, native_hotplug, true),
DEFINE_PROP_END_OF_LIST()
};

View File

@ -1087,6 +1087,7 @@ static int mode_sense_page(SCSIDiskState *s, int page, uint8_t **p_outbuf,
uint8_t *p = *p_outbuf + 2;
int length;
assert(page < ARRAY_SIZE(mode_sense_valid));
if ((mode_sense_valid[page] & (1 << s->qdev.type)) == 0) {
return -1;
}
@ -1428,6 +1429,11 @@ static int scsi_disk_check_mode_select(SCSIDiskState *s, int page,
return -1;
}
/* MODE_PAGE_ALLS is only valid for MODE SENSE commands */
if (page == MODE_PAGE_ALLS) {
return -1;
}
p = mode_current;
memset(mode_current, 0, inlen + 2);
len = mode_sense_page(s, page, &p, 0);

View File

@ -180,7 +180,7 @@ static int scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s, int len)
page = r->req.cmd.buf[2];
if (page == 0xb0) {
uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
uint32_t max_iov = blk_get_max_iov(s->conf.blk);
uint32_t max_iov = blk_get_max_hw_iov(s->conf.blk);
assert(max_transfer);
max_transfer = MIN_NON_ZERO(max_transfer, max_iov * qemu_real_host_page_size)

View File

@ -840,6 +840,9 @@ static void usb_uas_handle_data(USBDevice *dev, USBPacket *p)
}
break;
case UAS_PIPE_ID_STATUS:
if (p->stream > UAS_MAX_STREAMS) {
goto err_stream;
}
if (p->stream) {
QTAILQ_FOREACH(st, &uas->results, next) {
if (st->stream == p->stream) {
@ -867,6 +870,9 @@ static void usb_uas_handle_data(USBDevice *dev, USBPacket *p)
break;
case UAS_PIPE_ID_DATA_IN:
case UAS_PIPE_ID_DATA_OUT:
if (p->stream > UAS_MAX_STREAMS) {
goto err_stream;
}
if (p->stream) {
req = usb_uas_find_request(uas, p->stream);
} else {
@ -902,6 +908,11 @@ static void usb_uas_handle_data(USBDevice *dev, USBPacket *p)
p->status = USB_RET_STALL;
break;
}
err_stream:
error_report("%s: invalid stream %d", __func__, p->stream);
p->status = USB_RET_STALL;
return;
}
static void usb_uas_unrealize(USBDevice *dev)

View File

@ -551,6 +551,7 @@ static int vfio_host_win_del(VFIOContainer *container, hwaddr min_iova,
QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) {
if (hostwin->min_iova == min_iova && hostwin->max_iova == max_iova) {
QLIST_REMOVE(hostwin, hostwin_next);
g_free(hostwin);
return 0;
}
}
@ -2230,6 +2231,7 @@ static void vfio_disconnect_container(VFIOGroup *group)
if (QLIST_EMPTY(&container->group_list)) {
VFIOAddressSpace *space = container->space;
VFIOGuestIOMMU *giommu, *tmp;
VFIOHostDMAWindow *hostwin, *next;
QLIST_REMOVE(container, next);
@ -2240,6 +2242,12 @@ static void vfio_disconnect_container(VFIOGroup *group)
g_free(giommu);
}
QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next,
next) {
QLIST_REMOVE(hostwin, hostwin_next);
g_free(hostwin);
}
trace_vfio_disconnect_container(container->fd);
close(container->fd);
g_free(container);

View File

@ -1469,8 +1469,9 @@ static int vhost_user_slave_handle_vring_host_notifier(struct vhost_dev *dev,
name = g_strdup_printf("vhost-user/host-notifier@%p mmaps[%d]",
user, queue_idx);
memory_region_init_ram_device_ptr(&n->mr, OBJECT(vdev), name,
page_size, addr);
if (!n->mr.ram) /* Don't init again after suspend. */
memory_region_init_ram_device_ptr(&n->mr, OBJECT(vdev), name,
page_size, addr);
g_free(name);
if (virtio_queue_set_host_notifier_mr(vdev, queue_idx, &n->mr, true)) {

View File

@ -114,10 +114,21 @@ static uint64_t vhost_vsock_get_features(VirtIODevice *vdev,
Error **errp)
{
VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(vdev);
VHostVSock *vsock = VHOST_VSOCK(vdev);
virtio_add_feature(&requested_features, VIRTIO_VSOCK_F_SEQPACKET);
return vhost_get_features(&vvc->vhost_dev, feature_bits,
requested_features);
if (vsock->seqpacket != ON_OFF_AUTO_OFF) {
virtio_add_feature(&requested_features, VIRTIO_VSOCK_F_SEQPACKET);
}
requested_features = vhost_get_features(&vvc->vhost_dev, feature_bits,
requested_features);
if (vsock->seqpacket == ON_OFF_AUTO_ON &&
!virtio_has_feature(requested_features, VIRTIO_VSOCK_F_SEQPACKET)) {
error_setg(errp, "vhost-vsock backend doesn't support seqpacket");
}
return requested_features;
}
static const VMStateDescription vmstate_virtio_vhost_vsock = {
@ -218,6 +229,8 @@ static void vhost_vsock_device_unrealize(DeviceState *dev)
static Property vhost_vsock_properties[] = {
DEFINE_PROP_UINT64("guest-cid", VHostVSock, conf.guest_cid, 0),
DEFINE_PROP_STRING("vhostfd", VHostVSock, conf.vhostfd),
DEFINE_PROP_ON_OFF_AUTO("seqpacket", VHostVSock, seqpacket,
ON_OFF_AUTO_AUTO),
DEFINE_PROP_END_OF_LIST(),
};

View File

@ -30,6 +30,7 @@
#include "trace.h"
#include "qemu/error-report.h"
#include "migration/misc.h"
#include "migration/migration.h"
#include "hw/virtio/virtio-bus.h"
#include "hw/virtio/virtio-access.h"
@ -662,6 +663,18 @@ virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
return 0;
}
/*
* Pages hinted via qemu_guest_free_page_hint() are cleared from the dirty
* bitmap and will not get migrated, especially also not when the postcopy
* destination starts using them and requests migration from the source; the
* faulting thread will stall until postcopy migration finishes and
* all threads are woken up. Let's not start free page hinting if postcopy
* is possible.
*/
if (migrate_postcopy_ram()) {
return 0;
}
switch (pnd->reason) {
case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC:
virtio_balloon_free_page_stop(dev);

View File

@ -88,13 +88,8 @@ static void virtio_mem_pci_size_change_notify(Notifier *notifier, void *data)
size_change_notifier);
DeviceState *dev = DEVICE(pci_mem);
const uint64_t * const size_p = data;
const char *id = NULL;
if (dev->id) {
id = g_strdup(dev->id);
}
qapi_event_send_memory_device_size_change(!!id, id, *size_p);
qapi_event_send_memory_device_size_change(!!dev->id, dev->id, *size_p);
}
static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)

View File

@ -249,13 +249,10 @@ static void vring_packed_event_read(VirtIODevice *vdev,
hwaddr off_off = offsetof(VRingPackedDescEvent, off_wrap);
hwaddr off_flags = offsetof(VRingPackedDescEvent, flags);
address_space_read_cached(cache, off_flags, &e->flags,
sizeof(e->flags));
e->flags = virtio_lduw_phys_cached(vdev, cache, off_flags);
/* Make sure flags is seen before off_wrap */
smp_rmb();
address_space_read_cached(cache, off_off, &e->off_wrap,
sizeof(e->off_wrap));
virtio_tswap16s(vdev, &e->off_wrap);
e->off_wrap = virtio_lduw_phys_cached(vdev, cache, off_off);
virtio_tswap16s(vdev, &e->flags);
}
@ -265,8 +262,7 @@ static void vring_packed_off_wrap_write(VirtIODevice *vdev,
{
hwaddr off = offsetof(VRingPackedDescEvent, off_wrap);
virtio_tswap16s(vdev, &off_wrap);
address_space_write_cached(cache, off, &off_wrap, sizeof(off_wrap));
virtio_stw_phys_cached(vdev, cache, off, off_wrap);
address_space_cache_invalidate(cache, off, sizeof(off_wrap));
}
@ -275,8 +271,7 @@ static void vring_packed_flags_write(VirtIODevice *vdev,
{
hwaddr off = offsetof(VRingPackedDescEvent, flags);
virtio_tswap16s(vdev, &flags);
address_space_write_cached(cache, off, &flags, sizeof(flags));
virtio_stw_phys_cached(vdev, cache, off, flags);
address_space_cache_invalidate(cache, off, sizeof(flags));
}
@ -509,11 +504,9 @@ static void vring_packed_desc_read_flags(VirtIODevice *vdev,
MemoryRegionCache *cache,
int i)
{
address_space_read_cached(cache,
i * sizeof(VRingPackedDesc) +
offsetof(VRingPackedDesc, flags),
flags, sizeof(*flags));
virtio_tswap16s(vdev, flags);
hwaddr off = i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags);
*flags = virtio_lduw_phys_cached(vdev, cache, off);
}
static void vring_packed_desc_read(VirtIODevice *vdev,
@ -566,8 +559,7 @@ static void vring_packed_desc_write_flags(VirtIODevice *vdev,
{
hwaddr off = i * sizeof(VRingPackedDesc) + offsetof(VRingPackedDesc, flags);
virtio_tswap16s(vdev, &desc->flags);
address_space_write_cached(cache, off, &desc->flags, sizeof(desc->flags));
virtio_stw_phys_cached(vdev, cache, off, desc->flags);
address_space_cache_invalidate(cache, off, sizeof(desc->flags));
}

View File

@ -702,6 +702,13 @@ typedef struct BlockLimits {
*/
uint64_t max_hw_transfer;
/* Maximal number of scatter/gather elements allowed by the hardware.
* Applies whenever transfers to the device bypass the kernel I/O
* scheduler, for example with SG_IO. If larger than max_iov
* or if zero, blk_get_max_hw_iov will fall back to max_iov.
*/
int max_hw_iov;
/* memory alignment, in bytes so that no bounce buffer is needed */
size_t min_mem_alignment;

View File

@ -29,7 +29,7 @@
#include "hw/acpi/acpi_dev_interface.h"
#include "hw/acpi/tco.h"
#define ACPI_PCIHP_ADDR_ICH9 0x0cc4
#define ACPI_PCIHP_ADDR_ICH9 0x0cc0
typedef struct ICH9LPCPMRegs {
/*

View File

@ -30,6 +30,9 @@ struct VHostVSock {
VHostVSockCommon parent;
VHostVSockConf conf;
/* features */
OnOffAuto seqpacket;
/*< public >*/
};

View File

@ -27,6 +27,7 @@
#include "qemu/thread.h"
#include "qemu/queue.h"
#include "qemu/atomic.h"
#include "qemu/notify.h"
#include "qemu/sys_membarrier.h"
#ifdef __cplusplus
@ -66,6 +67,13 @@ struct rcu_reader_data {
/* Data used for registry, protected by rcu_registry_lock */
QLIST_ENTRY(rcu_reader_data) node;
/*
* NotifierList used to force an RCU grace period. Accessed under
* rcu_registry_lock. Note that the notifier is called _outside_
* the thread!
*/
NotifierList force_rcu;
};
extern __thread struct rcu_reader_data rcu_reader;
@ -180,6 +188,13 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(RCUReadAuto, rcu_read_auto_unlock)
#define RCU_READ_LOCK_GUARD() \
g_autoptr(RCUReadAuto) _rcu_read_auto __attribute__((unused)) = rcu_read_auto_lock()
/*
* Force-RCU notifiers tell readers that they should exit their
* read-side critical section.
*/
void rcu_add_force_rcu_notifier(Notifier *n);
void rcu_remove_force_rcu_notifier(Notifier *n);
#ifdef __cplusplus
}
#endif

View File

@ -210,6 +210,7 @@ uint32_t blk_get_request_alignment(BlockBackend *blk);
uint32_t blk_get_max_transfer(BlockBackend *blk);
uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
int blk_get_max_iov(BlockBackend *blk);
int blk_get_max_hw_iov(BlockBackend *blk);
void blk_set_guest_block_size(BlockBackend *blk, int align);
void *blk_try_blockalign(BlockBackend *blk, size_t size);
void *blk_blockalign(BlockBackend *blk, size_t size);

View File

@ -1496,7 +1496,7 @@ void hmp_change(Monitor *mon, const QDict *qdict)
}
if (strcmp(target, "passwd") == 0 ||
strcmp(target, "password") == 0) {
if (arg) {
if (!arg) {
MonitorHMP *hmp_mon = container_of(mon, MonitorHMP, common);
monitor_read_password(hmp_mon, hmp_change_read_arg, NULL);
return;

View File

@ -1413,6 +1413,9 @@ static int nbd_receive_request(NBDClient *client, NBDRequest *request,
if (ret < 0) {
return ret;
}
if (ret == 0) {
return -EIO;
}
/* Request
[ 0 .. 3] magic (NBD_REQUEST_MAGIC)

View File

@ -1,9 +1,11 @@
if 'CONFIG_HAS_LD_DYNAMIC_LIST' in config_host
plugin_ldflags = ['-Wl,--dynamic-list=' + (meson.build_root() / 'qemu-plugins-ld.symbols')]
elif 'CONFIG_HAS_LD_EXPORTED_SYMBOLS_LIST' in config_host
plugin_ldflags = ['-Wl,-exported_symbols_list,' + (meson.build_root() / 'qemu-plugins-ld64.symbols')]
else
plugin_ldflags = []
plugin_ldflags = []
# Modules need more symbols than just those in plugins/qemu-plugins.symbols
if not enable_modules
if 'CONFIG_HAS_LD_DYNAMIC_LIST' in config_host
plugin_ldflags = ['-Wl,--dynamic-list=' + (meson.build_root() / 'qemu-plugins-ld.symbols')]
elif 'CONFIG_HAS_LD_EXPORTED_SYMBOLS_LIST' in config_host
plugin_ldflags = ['-Wl,-exported_symbols_list,' + (meson.build_root() / 'qemu-plugins-ld64.symbols')]
endif
endif
specific_ss.add(when: 'CONFIG_PLUGIN', if_true: [files(

View File

@ -135,7 +135,9 @@ static void usage(const char *name)
" 'snapshot.id=[ID],snapshot.name=[NAME]', or\n"
" '[ID_OR_NAME]'\n"
" -n, --nocache disable host cache\n"
" --cache=MODE set cache mode (none, writeback, ...)\n"
" --cache=MODE set cache mode used to access the disk image, the\n"
" valid options are: 'none', 'writeback' (default),\n"
" 'writethrough', 'directsync' and 'unsafe'\n"
" --aio=MODE set AIO mode (native, io_uring or threads)\n"
" --discard=MODE set discard mode (ignore, unmap)\n"
" --detect-zeroes=MODE set detect-zeroes mode (off, on, unmap)\n"
@ -552,7 +554,7 @@ int main(int argc, char **argv)
bool alloc_depth = false;
const char *tlscredsid = NULL;
bool imageOpts = false;
bool writethrough = true;
bool writethrough = false; /* Client will flush as needed. */
bool fork_process = false;
bool list = false;
int old_stderr = -1;

View File

@ -816,6 +816,7 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
shadow_regions[j].gpa = dev->regions[i].gpa;
shadow_regions[j].size = dev->regions[i].size;
shadow_regions[j].qva = dev->regions[i].qva;
shadow_regions[j].mmap_addr = dev->regions[i].mmap_addr;
shadow_regions[j].mmap_offset = dev->regions[i].mmap_offset;
j++;
} else {

View File

@ -265,12 +265,15 @@ static void arm_cpu_reset(DeviceState *dev)
env->uncached_cpsr = ARM_CPU_MODE_SVC;
}
env->daif = PSTATE_D | PSTATE_A | PSTATE_I | PSTATE_F;
#endif
if (arm_feature(env, ARM_FEATURE_M)) {
#ifndef CONFIG_USER_ONLY
uint32_t initial_msp; /* Loaded from 0x0 */
uint32_t initial_pc; /* Loaded from 0x4 */
uint8_t *rom;
uint32_t vecbase;
#endif
if (cpu_isar_feature(aa32_lob, cpu)) {
/*
@ -324,6 +327,8 @@ static void arm_cpu_reset(DeviceState *dev)
env->v7m.fpccr[M_REG_S] = R_V7M_FPCCR_ASPEN_MASK |
R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK;
}
#ifndef CONFIG_USER_ONLY
/* Unlike A/R profile, M profile defines the reset LR value */
env->regs[14] = 0xffffffff;
@ -352,8 +357,22 @@ static void arm_cpu_reset(DeviceState *dev)
env->regs[13] = initial_msp & 0xFFFFFFFC;
env->regs[15] = initial_pc & ~1;
env->thumb = initial_pc & 1;
#else
/*
* For user mode we run non-secure and with access to the FPU.
* The FPU context is active (ie does not need further setup)
* and is owned by non-secure.
*/
env->v7m.secure = false;
env->v7m.nsacr = 0xcff;
env->v7m.cpacr[M_REG_NS] = 0xf0ffff;
env->v7m.fpccr[M_REG_S] &=
~(R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK);
env->v7m.control[M_REG_S] |= R_V7M_CONTROL_FPCA_MASK;
#endif
}
#ifndef CONFIG_USER_ONLY
/* AArch32 has a hard highvec setting of 0xFFFF0000. If we are currently
* executing as AArch32 then check if highvecs are enabled and
* adjust the PC accordingly.

View File

@ -3102,7 +3102,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY | MSR_ARCH_CAP_MDS_NO |
MSR_ARCH_CAP_PSCHANGE_MC_NO | MSR_ARCH_CAP_TAA_NO,
.features[FEAT_7_1_EAX] =
CPUID_7_1_EAX_AVX_VNNI | CPUID_7_1_EAX_AVX512_BF16,
CPUID_7_1_EAX_AVX512_BF16,
/* XSAVES is added in version 2 */
.features[FEAT_XSAVE] =
CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |

View File

@ -257,6 +257,7 @@ typedef enum X86Seg {
| CR4_DE_MASK | CR4_PSE_MASK | CR4_PAE_MASK \
| CR4_MCE_MASK | CR4_PGE_MASK | CR4_PCE_MASK \
| CR4_OSFXSR_MASK | CR4_OSXMMEXCPT_MASK |CR4_UMIP_MASK \
| CR4_LA57_MASK \
| CR4_FSGSBASE_MASK | CR4_PCIDE_MASK | CR4_OSXSAVE_MASK \
| CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK))

View File

@ -90,19 +90,10 @@ static int mmu_translate(CPUState *cs, hwaddr addr, MMUTranslateFunc get_hphys_f
target_ulong pdpe_addr;
#ifdef TARGET_X86_64
if (env->hflags & HF_LMA_MASK) {
if (pg_mode & PG_MODE_LMA) {
bool la57 = pg_mode & PG_MODE_LA57;
uint64_t pml5e_addr, pml5e;
uint64_t pml4e_addr, pml4e;
int32_t sext;
/* test virtual address sign extension */
sext = la57 ? (int64_t)addr >> 56 : (int64_t)addr >> 47;
if (get_hphys_func && sext != 0 && sext != -1) {
env->error_code = 0;
cs->exception_index = EXCP0D_GPF;
return 1;
}
if (la57) {
pml5e_addr = ((cr3 & ~0xfff) +
@ -287,7 +278,7 @@ do_check_protect_pse36:
*prot |= PAGE_EXEC;
}
if (!(env->hflags & HF_LMA_MASK)) {
if (!(pg_mode & PG_MODE_LMA)) {
pkr = 0;
} else if (ptep & PG_USER_MASK) {
pkr = pg_mode & PG_MODE_PKE ? env->pkru : 0;
@ -423,6 +414,18 @@ static int handle_mmu_fault(CPUState *cs, vaddr addr, int size,
page_size = 4096;
} else {
pg_mode = get_pg_mode(env);
if (pg_mode & PG_MODE_LMA) {
int32_t sext;
/* test virtual address sign extension */
sext = (int64_t)addr >> (pg_mode & PG_MODE_LA57 ? 56 : 47);
if (sext != 0 && sext != -1) {
env->error_code = 0;
cs->exception_index = EXCP0D_GPF;
return 1;
}
}
error_code = mmu_translate(cs, addr, get_hphys, env->cr[3], is_write1,
mmu_idx, pg_mode,
&paddr, &page_size, &prot);

View File

@ -2477,8 +2477,13 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg arg,
tcg_out_vldst(s, INSN_VLD1 | 0x7d0, arg, arg1, arg2);
return;
case TCG_TYPE_V128:
/* regs 2; size 8; align 16 */
tcg_out_vldst(s, INSN_VLD1 | 0xae0, arg, arg1, arg2);
/*
* We have only 8-byte alignment for the stack per the ABI.
* Rather than dynamically re-align the stack, it's easier
* to simply not request alignment beyond that. So:
* regs 2; size 8; align 8
*/
tcg_out_vldst(s, INSN_VLD1 | 0xad0, arg, arg1, arg2);
return;
default:
g_assert_not_reached();
@ -2497,8 +2502,8 @@ static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
tcg_out_vldst(s, INSN_VST1 | 0x7d0, arg, arg1, arg2);
return;
case TCG_TYPE_V128:
/* regs 2; size 8; align 16 */
tcg_out_vldst(s, INSN_VST1 | 0xae0, arg, arg1, arg2);
/* See tcg_out_ld re alignment: regs 2; size 8; align 8 */
tcg_out_vldst(s, INSN_VST1 | 0xad0, arg, arg1, arg2);
return;
default:
g_assert_not_reached();

View File

@ -3060,7 +3060,13 @@ static void temp_allocate_frame(TCGContext *s, TCGTemp *ts)
g_assert_not_reached();
}
assert(align <= TCG_TARGET_STACK_ALIGN);
/*
* Assume the stack is sufficiently aligned.
* This affects e.g. ARM NEON, where we have 8 byte stack alignment
* and do not require 16 byte vector alignment. This seems slightly
* easier than fully parameterizing the above switch statement.
*/
align = MIN(TCG_TARGET_STACK_ALIGN, align);
off = ROUND_UP(s->current_frame_offset, align);
/* If we've exhausted the stack frame, restart with a smaller TB. */

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@ -859,6 +859,23 @@ static void test_acpi_q35_tcg_bridge(void)
free_test_data(&data);
}
static void test_acpi_q35_multif_bridge(void)
{
test_data data = {
.machine = MACHINE_Q35,
.variant = ".multi-bridge",
};
test_acpi_one("-device pcie-root-port,id=pcie-root-port-0,"
"multifunction=on,"
"port=0x0,chassis=1,addr=0x2,bus=pcie.0 "
"-device pcie-root-port,id=pcie-root-port-1,"
"port=0x1,chassis=2,addr=0x3.0x1,bus=pcie.0 "
"-device virtio-balloon,id=balloon0,"
"bus=pcie.0,addr=0x4.0x2",
&data);
free_test_data(&data);
}
static void test_acpi_q35_tcg_mmio64(void)
{
test_data data = {
@ -1528,6 +1545,7 @@ int main(int argc, char *argv[])
test_acpi_piix4_no_acpi_pci_hotplug);
qtest_add_func("acpi/q35", test_acpi_q35_tcg);
qtest_add_func("acpi/q35/bridge", test_acpi_q35_tcg_bridge);
qtest_add_func("acpi/q35/multif-bridge", test_acpi_q35_multif_bridge);
qtest_add_func("acpi/q35/mmio64", test_acpi_q35_tcg_mmio64);
qtest_add_func("acpi/piix4/ipmi", test_acpi_piix4_tcg_ipmi);
qtest_add_func("acpi/q35/ipmi", test_acpi_q35_tcg_ipmi);

View File

@ -32,6 +32,9 @@
/* TODO actually test the results and get rid of this */
#define qmp_discard_response(...) qobject_unref(qmp(__VA_ARGS__))
#define DRIVE_FLOPPY_BLANK \
"-drive if=floppy,file=null-co://,file.read-zeroes=on,format=raw,size=1440k"
#define TEST_IMAGE_SIZE 1440 * 1024
#define FLOPPY_BASE 0x3f0
@ -546,6 +549,40 @@ static void fuzz_registers(void)
}
}
static bool qtest_check_clang_sanitizer(void)
{
#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
return true;
#else
g_test_skip("QEMU not configured using --enable-sanitizers");
return false;
#endif
}
static void test_cve_2021_20196(void)
{
QTestState *s;
if (!qtest_check_clang_sanitizer()) {
return;
}
s = qtest_initf("-nographic -m 32M -nodefaults " DRIVE_FLOPPY_BLANK);
qtest_outw(s, 0x3f4, 0x0500);
qtest_outb(s, 0x3f5, 0x00);
qtest_outb(s, 0x3f5, 0x00);
qtest_outw(s, 0x3f4, 0x0000);
qtest_outb(s, 0x3f5, 0x00);
qtest_outw(s, 0x3f1, 0x0400);
qtest_outw(s, 0x3f4, 0x0000);
qtest_outw(s, 0x3f4, 0x0000);
qtest_outb(s, 0x3f5, 0x00);
qtest_outb(s, 0x3f5, 0x01);
qtest_outw(s, 0x3f1, 0x0500);
qtest_outb(s, 0x3f5, 0x00);
qtest_quit(s);
}
int main(int argc, char **argv)
{
int fd;
@ -576,6 +613,7 @@ int main(int argc, char **argv)
qtest_add_func("/fdc/read_no_dma_18", test_read_no_dma_18);
qtest_add_func("/fdc/read_no_dma_19", test_read_no_dma_19);
qtest_add_func("/fdc/fuzz-registers", fuzz_registers);
qtest_add_func("/fdc/fuzz/cve_2021_20196", test_cve_2021_20196);
ret = g_test_run();

View File

@ -16,7 +16,10 @@ SECTIONS {
*(.rodata)
} :text
/* Keep build ID and PVH notes in same section */
/DISCARD/ : {
*(.note.gnu*)
}
.notes : {
*(.note.*)
} :note

View File

@ -1345,25 +1345,22 @@ socket_sockaddr_to_address_unix(struct sockaddr_storage *sa,
SocketAddress *addr;
struct sockaddr_un *su = (struct sockaddr_un *)sa;
assert(salen >= sizeof(su->sun_family) + 1 &&
salen <= sizeof(struct sockaddr_un));
addr = g_new0(SocketAddress, 1);
addr->type = SOCKET_ADDRESS_TYPE_UNIX;
salen -= offsetof(struct sockaddr_un, sun_path);
#ifdef CONFIG_LINUX
if (!su->sun_path[0]) {
if (salen > 0 && !su->sun_path[0]) {
/* Linux abstract socket */
addr->u.q_unix.path = g_strndup(su->sun_path + 1,
salen - sizeof(su->sun_family) - 1);
addr->u.q_unix.path = g_strndup(su->sun_path + 1, salen - 1);
addr->u.q_unix.has_abstract = true;
addr->u.q_unix.abstract = true;
addr->u.q_unix.has_tight = true;
addr->u.q_unix.tight = salen < sizeof(*su);
addr->u.q_unix.tight = salen < sizeof(su->sun_path);
return addr;
}
#endif
addr->u.q_unix.path = g_strndup(su->sun_path, sizeof(su->sun_path));
addr->u.q_unix.path = g_strndup(su->sun_path, salen);
return addr;
}
#endif /* WIN32 */

View File

@ -46,6 +46,7 @@
unsigned long rcu_gp_ctr = RCU_GP_LOCKED;
QemuEvent rcu_gp_event;
static int in_drain_call_rcu;
static QemuMutex rcu_registry_lock;
static QemuMutex rcu_sync_lock;
@ -107,6 +108,8 @@ static void wait_for_readers(void)
* get some extra futex wakeups.
*/
qatomic_set(&index->waiting, false);
} else if (qatomic_read(&in_drain_call_rcu)) {
notifier_list_notify(&index->force_rcu, NULL);
}
}
@ -339,8 +342,10 @@ void drain_call_rcu(void)
* assumed.
*/
qatomic_inc(&in_drain_call_rcu);
call_rcu1(&rcu_drain.rcu, drain_rcu_callback);
qemu_event_wait(&rcu_drain.drain_complete_event);
qatomic_dec(&in_drain_call_rcu);
if (locked) {
qemu_mutex_lock_iothread();
@ -363,6 +368,20 @@ void rcu_unregister_thread(void)
qemu_mutex_unlock(&rcu_registry_lock);
}
void rcu_add_force_rcu_notifier(Notifier *n)
{
qemu_mutex_lock(&rcu_registry_lock);
notifier_list_add(&rcu_reader.force_rcu, n);
qemu_mutex_unlock(&rcu_registry_lock);
}
void rcu_remove_force_rcu_notifier(Notifier *n)
{
qemu_mutex_lock(&rcu_registry_lock);
notifier_remove(n);
qemu_mutex_unlock(&rcu_registry_lock);
}
static void rcu_init_complete(void)
{
QemuThread thread;