Commit graph

78191 commits

Author SHA1 Message Date
Peter Maydell 3591ddd399 * Various fixes
* libdaxctl support to correctly align devdax character devices (Jingqi)
 * initial-all-set support for live migration (Jay)
 * forbid '-numa node, mem' for 5.1 and newer machine types (Igor)
 * x87 fixes (Joseph)
 * Tighten memory_region_access_valid (Michael) and fix fallout (myself)
 * Replay fixes (Pavel)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl71+zkUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO4MAgAo/aPLzCXJTzFOP88TclEETfSUeyG
 GFs6mAEJpoNnkAzY+y6ZIjtbp346UZB2KMxHTQcd7p2tO+jXSDPpr0UBLqU95j0/
 ucOnP1X9E5ee8P5Z7bXeGCtkfEippI5/TU+gHlx/SKeyVHdMKBsWCg/9LN5JXMJR
 ncQ6MxkU8huOksOLL32dxh1OqtdDiBoq9rswmHFXwDcRuIkteTlQo3Ze9BSb8t04
 7ZImKXNr+wIaq/xXAqltYNGhHoi31Rz+W8W7T84tYNr7wI1LWaLi2jzQ2qJthAdq
 25zXVz5QJjcfIemlrV03PN8IZfKqTfnOvf+DNW1ns/EdflQem/Mb0Q9KOg==
 =NfSA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Various fixes
* libdaxctl support to correctly align devdax character devices (Jingqi)
* initial-all-set support for live migration (Jay)
* forbid '-numa node, mem' for 5.1 and newer machine types (Igor)
* x87 fixes (Joseph)
* Tighten memory_region_access_valid (Michael) and fix fallout (myself)
* Replay fixes (Pavel)

# gpg: Signature made Fri 26 Jun 2020 14:42:17 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (31 commits)
  i386: Mask SVM features if nested SVM is disabled
  ibex_uart: fix XOR-as-pow
  vmport: move compat properties to hw_compat_5_0
  hyperv: vmbus: Remove the 2nd IRQ
  kvm: i386: allow TSC to differ by NTP correction bounds without TSC scaling
  numa: forbid '-numa node, mem' for 5.1 and newer machine types
  osdep: Make MIN/MAX evaluate arguments only once
  target/i386: Add notes for versioned CPU models
  target/i386: reimplement fpatan using floatx80 operations
  target/i386: reimplement fyl2x using floatx80 operations
  target/i386: reimplement fyl2xp1 using floatx80 operations
  target/i386: reimplement fprem, fprem1 using floatx80 operations
  softfloat: return low bits of quotient from floatx80_modrem
  softfloat: do not set denominator high bit for floatx80 remainder
  softfloat: do not return pseudo-denormal from floatx80 remainder
  softfloat: fix floatx80 remainder pseudo-denormal check for zero
  softfloat: merge floatx80_mod and floatx80_rem
  target/i386: reimplement f2xm1 using floatx80 operations
  xen: Actually fix build without passthrough
  Makefile: Install qemu-[qmp/ga]-ref.* into the directory "interop"
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 16:55:20 +01:00
Eduardo Habkost 730319aef0 i386: Mask SVM features if nested SVM is disabled
QEMU incorrectly validates FEAT_SVM feature flags against
GET_SUPPORTED_CPUID even if SVM features are being masked out by
cpu_x86_cpuid().  This can make QEMU print warnings on most AMD
CPU models, even when SVM nesting is disabled (which is the
default).

This bug was never detected before because of a Linux KVM bug:
until Linux v5.6, KVM was not filtering out SVM features in
GET_SUPPORTED_CPUID when nested was disabled.  This KVM bug was
fixed in Linux v5.7-rc1, on Linux commit a50718cc3f43 ("KVM:
nSVM: Expose SVM features to L1 iff nested is enabled").

Fix the problem by adding a CPUID_EXT3_SVM dependency to all
FEAT_SVM feature flags in the feature_dependencies table.

Reported-by: Yanan Fu <yfu@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20200623230116.277409-1-ehabkost@redhat.com>
[Fix testcase. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Paolo Bonzini c8d7fd059d ibex_uart: fix XOR-as-pow
The xor-as-pow warning in clang actually detected a genuine bug.
Fix it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Paolo Bonzini f983ff95f4 vmport: move compat properties to hw_compat_5_0
The patches that introduced the properties were submitted when QEMU 5.0
had not been released yet, so they got merged under the wrong heading.
Move them to hw_compat_5_0 so that 5.0 machine types get the pre-patch
behavior.

Fixes: b889212973 ("hw/i386/vmport: Propagate IOPort read to vCPU EAX register")
Fixes: 0342ee761e ("hw/i386/vmport: Set EAX to -1 on failed and unsupported commands")
Fixes: f8bdc55037 ("hw/i386/vmport: Report vmware-vmx-type in CMD_GETVERSION")
Fixes: aaacf1c15a ("hw/i386/vmport: Add support for CMD_GETBIOSUUID")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Jon Doron 8f06f22f38 hyperv: vmbus: Remove the 2nd IRQ
It seems like Windows does not really require 2 IRQs to have a
functioning VMBus.

Signed-off-by: Jon Doron <arilou@gmail.com>
Message-Id: <20200617160904.681845-2-arilou@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Marcelo Tosatti 74aaddc628 kvm: i386: allow TSC to differ by NTP correction bounds without TSC scaling
The Linux TSC calibration procedure is subject to small variations
(its common to see +-1 kHz difference between reboots on a given CPU, for example).

So migrating a guest between two hosts with identical processor can fail, in case
of a small variation in calibrated TSC between them.

Allow a conservative 250ppm error between host TSC and VM TSC frequencies,
rather than requiring an exact match. NTP daemon in the guest can
correct this difference.

Also change migration to accept this bound.

KVM_SET_TSC_KHZ depends on a kernel interface change. Without this change,
the behaviour remains the same: in case of a different frequency
between host and VM, KVM_SET_TSC_KHZ will fail and QEMU will exit.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Message-Id: <20200616165805.GA324612@fuller.cnet>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Igor Mammedov 32a354dc6c numa: forbid '-numa node, mem' for 5.1 and newer machine types
Deprecation period is run out and it's a time to flip the switch
introduced by cd5ff8333a.  Disable legacy option for new machine
types (since 5.1) and amend documentation.

'-numa node,memdev' shall be used instead of disabled option
with new machine types.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200609135635.761587-1-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Eric Blake f9919116b8 osdep: Make MIN/MAX evaluate arguments only once
I'm not aware of any immediate bugs in qemu where a second runtime
evaluation of the arguments to MIN() or MAX() causes a problem, but
proactively preventing such abuse is easier than falling prey to an
unintended case down the road.  At any rate, here's the conversation
that sparked the current patch:
https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html

Update the MIN/MAX macros to only evaluate their argument once at
runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are
promoting the temporaries to the same type as the final comparison (we
have to trigger type promotion, as typeof(bitfield) won't compile; and
we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our
uses of MAX are on void* pointers where such addition is undefined).

However, we are unable to work around gcc refusing to compile ({}) in
a constant context (such as the array length of a static variable),
even when only used in the dead branch of a __builtin_choose_expr(),
so we have to provide a second macro pair MIN_CONST and MAX_CONST for
use when both arguments are known to be compile-time constants and
where the result must also be usable as a constant; this second form
evaluates arguments multiple times but that doesn't matter for
constants.  By using a void expression as the expansion if a
non-constant is presented to this second form, we can enlist the
compiler to ensure the double evaluation is not attempted on
non-constants.

Alas, as both macros now rely on compiler intrinsics, they are no
longer usable in preprocessor #if conditions; those will just have to
be open-coded or the logic rewritten into #define or runtime 'if'
conditions (but where the compiler dead-code-elimination will probably
still apply).

I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all
forms of macro mis-use.  As the errors can sometimes be cryptic, I'm
demonstrating the gcc output:

Use of MIN when MIN_CONST is needed:

In file included from /home/eblake/qemu/qemu-img.c:25:
/home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function
  249 |     ({                                                  \
      |     ^
/home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’
   92 | char array[MIN(1, 2)] = "";
      |            ^~~

Use of MIN_CONST when MIN is needed:

/home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’:
/home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be
 1225 |             i = MIN_CONST(i, n);
      |               ^

Use of MIN in the preprocessor:

In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20:
/home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’:
/home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions
  249 |     ({                                                  \
      |      ^

Fix the resulting callsites that used #if or computed a compile-time
constant min or max to use the new macros.  cpu-defs.h is interesting,
as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes
dynamic.

It may be worth improving glib's MIN/MAX definitions to be saner, but
that is a task for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200625162602.700741-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Tao Xu 47f0d11d21 target/i386: Add notes for versioned CPU models
Add which features are added or removed in this version.

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200324051034.30541-1-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Joseph Myers ff57bb7b63 target/i386: reimplement fpatan using floatx80 operations
The x87 fpatan emulation is currently based around conversion to
double.  This is inherently unsuitable for a good emulation of any
floatx80 operation.  Reimplement using the soft-float operations, as
for other such instructions.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

Message-Id: <alpine.DEB.2.21.2006230000340.24721@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Joseph Myers 1f18a1e6ab target/i386: reimplement fyl2x using floatx80 operations
The x87 fyl2x emulation is currently based around conversion to
double.  This is inherently unsuitable for a good emulation of any
floatx80 operation.  Reimplement using the soft-float operations,
building on top of the reimplementation of fyl2xp1 and factoring out
code to be shared between the two instructions.

The included test assumes that the result in round-to-nearest mode
should always be one of the two closest floating-point numbers to the
mathematically exact result (including that it should be exact, in the
exact cases which cover more cases than for fyl2xp1).

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2006172321530.20587@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Joseph Myers 5eebc49d2d target/i386: reimplement fyl2xp1 using floatx80 operations
The x87 fyl2xp1 emulation is currently based around conversion to
double.  This is inherently unsuitable for a good emulation of any
floatx80 operation, even before considering that it is a particularly
naive implementation using double (adding 1 then using log rather than
attempting a better emulation using log1p).

Reimplement using the soft-float operations, as was done for f2xm1; as
in that case, m68k has related operations but not exactly this one and
it seemed safest to implement directly rather than reusing the m68k
code to avoid accumulation of errors.

A test is included with many randomly generated inputs.  The
assumption of the test is that the result in round-to-nearest mode
should always be one of the two closest floating-point numbers to the
mathematical value of y * log2(x + 1); the implementation aims to do
somewhat better than that (about 70 correct bits before rounding).  I
haven't investigated how accurate hardware is.

Intel manuals describe a narrower range of valid arguments to this
instruction than AMD manuals.  The implementation accepts the wider
range (it's needed anyway for the core code to be reusable in a
subsequent patch reimplementing fyl2x), but the test only has inputs
in the narrower range so that it's valid on hardware that may reject
or produce poor results for inputs outside that range.

Code in the previous implementation that sets C2 for some out-of-range
arguments is not carried forward to the new implementation; C2 is
undefined for this instruction and I suspect that code was just
cut-and-pasted from the trigonometric instructions (fcos, fptan, fsin,
fsincos) where C2 *is* defined to be set for out-of-range arguments.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

Message-Id: <alpine.DEB.2.21.2006172320190.20587@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:38 -04:00
Joseph Myers 5ef396e2ba target/i386: reimplement fprem, fprem1 using floatx80 operations
The x87 fprem and fprem1 emulation is currently based around
conversion to double, which is inherently unsuitable for a good
emulation of any floatx80 operation.  Reimplement using the soft-float
floatx80 remainder operations.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081657200.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:38 -04:00
Joseph Myers 445810ec91 softfloat: return low bits of quotient from floatx80_modrem
Both x87 and m68k need the low parts of the quotient for their
remainder operations.  Arrange for floatx80_modrem to track those bits
and return them via a pointer.

The architectures using float32_rem and float64_rem do not appear to
need this information, so the *_rem interface is left unchanged and
the information returned only from floatx80_modrem.  The logic used to
determine the low 7 bits of the quotient for m68k
(target/m68k/fpu_helper.c:make_quotient) appears completely bogus (it
looks at the result of converting the remainder to integer, the
quotient having been discarded by that point); this patch does not
change that, but the m68k maintainers may wish to do so.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081656500.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:38 -04:00
Joseph Myers 566601f1f9 softfloat: do not set denominator high bit for floatx80 remainder
The floatx80 remainder implementation unnecessarily sets the high bit
of bSig explicitly.  By that point in the function, arguments that are
invalid, zero, infinity or NaN have already been handled and
subnormals have been through normalizeFloatx80Subnormal, so the high
bit will already be set.  Remove the unnecessary code.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081656220.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:38 -04:00
Joseph Myers b662495dca softfloat: do not return pseudo-denormal from floatx80 remainder
The floatx80 remainder implementation sometimes returns the numerator
unchanged when the denominator is sufficiently larger than the
numerator.  But if the value to be returned unchanged is a
pseudo-denormal, that is incorrect.  Fix it to normalize the numerator
in that case.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081655520.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:38 -04:00
Joseph Myers 499a2f7b55 softfloat: fix floatx80 remainder pseudo-denormal check for zero
The floatx80 remainder implementation ignores the high bit of the
significand when checking whether an operand (numerator) with zero
exponent is zero.  This means it mishandles a pseudo-denormal
representation of 0x1p-16382L by treating it as zero.  Fix this by
checking the whole significand instead.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081655180.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:38 -04:00
Joseph Myers 6b8b0136ab softfloat: merge floatx80_mod and floatx80_rem
The m68k-specific softfloat code includes a function floatx80_mod that
is extremely similar to floatx80_rem, but computing the remainder
based on truncating the quotient toward zero rather than rounding it
to nearest integer.  This is also useful for emulating the x87 fprem
and fprem1 instructions.  Change the floatx80_rem implementation into
floatx80_modrem that can perform either operation, with both
floatx80_rem and floatx80_mod as thin wrappers available for all
targets.

There does not appear to be any use for the _mod operation for other
floating-point formats in QEMU (the only other architectures using
_rem at all are linux-user/arm/nwfpe, for FPA emulation, and openrisc,
for instructions that have been removed in the latest version of the
architecture), so no change is made to the code for other formats.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <alpine.DEB.2.21.2006081654280.23637@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:37 -04:00
Joseph Myers eca30647fc target/i386: reimplement f2xm1 using floatx80 operations
The x87 f2xm1 emulation is currently based around conversion to
double.  This is inherently unsuitable for a good emulation of any
floatx80 operation, even before considering that it is a particularly
naive implementation using double (computing with pow and then
subtracting 1 rather than attempting a better emulation using expm1).

Reimplement using the soft-float operations, including additions and
multiplications with higher precision where appropriate to limit
accumulation of errors.  I considered reusing some of the m68k code
for transcendental operations, but the instructions don't generally
correspond exactly to x87 operations (for example, m68k has 2^x and
e^x - 1, but not 2^x - 1); to avoid possible accumulation of errors
from applying multiple such operations each rounding to floatx80
precision, I wrote a direct implementation of 2^x - 1 instead.  It
would be possible in principle to make the implementation more
efficient by doing the intermediate operations directly with
significands, signs and exponents and not packing / unpacking floatx80
format for each operation, but that would make it significantly more
complicated and it's not clear that's worthwhile; the m68k emulation
doesn't try to do that.

A test is included with many randomly generated inputs.  The
assumption of the test is that the result in round-to-nearest mode
should always be one of the two closest floating-point numbers to the
mathematical value of 2^x - 1; the implementation aims to do somewhat
better than that (about 70 correct bits before rounding).  I haven't
investigated how accurate hardware is.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

Message-Id: <alpine.DEB.2.21.2006112341010.18393@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:37 -04:00
Anthony PERARD b00de3a51f xen: Actually fix build without passthrough
Fix typo.

Fixes: acd0c9416d ("xen: fix build without pci passthrough")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20200619103115.254127-1-anthony.perard@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:37 -04:00
Liao Pingfang c13dba2c77 Makefile: Install qemu-[qmp/ga]-ref.* into the directory "interop"
We need install qemu-[qmp/ga]-ref.* files into the subdirectory of qemu docs: interop.

If we visit the following address and click the link to qemu-qmp-ref.html:
https://www.qemu.org/docs/master/interop/bitmaps.html#basic-qmp-usage

It will report following error:
"
Not Found
The requested URL /docs/master/interop/qemu-qmp-ref.html was not found on this server.
"

Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1591663670-47712-1-git-send-email-wang.yi59@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:37 -04:00
Thomas Huth ee760ac80a hw/scsi/megasas: Fix possible out-of-bounds array access in tracepoints
Some tracepoints in megasas.c use a guest-controlled value as an index
into the mfi_frame_desc[] array. Thus a malicious guest could cause an
out-of-bounds error here. Fortunately, the impact is very low since this
can only happen when the corresponding tracepoints have been enabled
before, but the problem should be fixed anyway with a proper check.

Buglink: https://bugs.launchpad.net/qemu/+bug/1882065
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20200615072629.32321-1-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:37 -04:00
Jingqi Liu 5f509751f7 docs/nvdimm: add description of alignment requirement of device dax
For device dax (e.g., /dev/dax0.0), the NUM of 'align=NUM' option
needs to match the alignment requirement of the device dax.
It must be larger than or equal to the 'align' of device dax.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Message-Id: <20200429085011.63752-3-jingqi.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:36 -04:00
Peter Maydell 87fb952da8 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAl7zJJUACgkQnKSrs4Gr
 c8ix3Qf/ZpEKTCWJcZZuJPEI4CSgHZTsmDilkhnI/SoSBIK+6do+oBtCWrNdfP/m
 BpAZspaGsKUu5kJe6HGl4Rvmjd/sTg+9+F6UnQVrWccttwmJgr+y0r9uTMEgxgdm
 2xeTzkzfwfxRLn4wb8k1kX/weQUcsbJUe2F9Nvm3HzeKGkaxWlYsRwqXAluC7gjx
 ZK0yHBz9JXKAreAfBRmNduLDElyzc6yYikY2gsJEOYTA7/h/ksmuNWYqNPRzWYGQ
 wRjAPyRMg+q+pZhoir5+6qgKLt6vNk5uQOjPaiLYhSMi7fiTIXrrVrO0dSx1Pkun
 2vlb2WOF7nbj5T1veJQE29/onKPhzA==
 =IYfR
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Wed 24 Jun 2020 11:01:57 BST
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  block/nvme: support nested aio_poll()
  block/nvme: keep BDRVNVMeState pointer in NVMeQueuePair
  block/nvme: clarify that free_req_queue is protected by q->lock
  block/nvme: switch to a NVMeRequest freelist
  block/nvme: don't access CQE after moving cq.head
  block/nvme: drop tautologous assertion
  block/nvme: poll queues without q->lock
  check-block: enable iotests with SafeStack
  configure: add flags to support SafeStack
  coroutine: add check for SafeStack in sigaltstack
  coroutine: support SafeStack in ucontext backend
  minikconf: explicitly set encoding to UTF-8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 13:48:54 +01:00
Jingqi Liu ce317be98d exec: fetch the alignment of Linux devdax pmem character device nodes
If the backend file is devdax pmem character device, the alignment
specified by the option 'align=NUM' in the '-object memory-backend-file'
needs to match the alignment requirement of the devdax pmem character device.

This patch uses the interfaces of libdaxctl to fetch the devdax pmem file
'align', so that we can compare it with the NUM of 'align=NUM'.
The NUM needs to be larger than or equal to the devdax pmem file 'align'.

It also fixes the problem that mmap() returns failure in qemu_ram_mmap()
when the NUM of 'align=NUM' is less than the devdax pmem file 'align'.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Message-Id: <20200429085011.63752-2-jingqi.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 07:21:24 -04:00
Peter Maydell 10f7ffabf9 qemu-macppc patches
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCgA8FiEEzGIauY6CIA2RXMnEW8LFb64PMh8FAl71vLgeHG1hcmsuY2F2
 ZS1heWxhbmRAaWxhbmRlLmNvLnVrAAoJEFvCxW+uDzIfRx8H/Ao3EPvVW562Y0tq
 yZNAJ/wCmhTM7+ekOzeImD/G/Tf8ugHgV0BlPfDlOajD6bxrlS6pnv3ncaKfN7G/
 d59ERGWUUqM8MsJ7t/tg/JGvLi0wBAj4ekSCe/MD5xxa2b9GzDGThjZASQvKHhMP
 q4EXMN5SgrRTRIqZ0w+ItPNoihLULiYq1LsNV6VgzBqhVhk349caOxjQ0PSdO64Q
 M7ymcVmiPg0KQPEzbcZDmiQEHP8AbnrJ+RImRrHOyL6pq87du7kPrb7ENpqgt4cA
 XLRzOccPvAp7BshEJIPgjaxNbCN555TMPswHp32lBRvDCF5lZ620TmwtWvb5pKXY
 kTxPd+4=
 =VN0J
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mcayland/tags/qemu-macppc-20200626' into staging

qemu-macppc patches

# gpg: Signature made Fri 26 Jun 2020 10:15:36 BST
# gpg:                using RSA key CC621AB98E82200D915CC9C45BC2C56FAE0F321F
# gpg:                issuer "mark.cave-ayland@ilande.co.uk"
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>" [full]
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-macppc-20200626: (22 commits)
  adb: add ADB bus trace events
  adb: use adb_device prefix for ADB device trace events
  adb: only call autopoll callbacks when autopoll is not blocked
  mac_via: rework ADB state machine to be compatible with both MacOS and Linux
  mac_via: move VIA1 portB write logic into mos6522_q800_via1_write()
  pmu: add adb_autopoll_block() and adb_autopoll_unblock() functions
  cuda: add adb_autopoll_block() and adb_autopoll_unblock() functions
  adb: add autopoll_blocked variable to block autopoll
  adb: use adb_request() only for explicit requests
  adb: add status field for holding information about the last ADB request
  adb: keep track of devices with pending data
  adb: introduce new ADBDeviceHasData method to ADBDeviceClass
  mac_via: convert to use ADBBusState internal autopoll variables
  pmu: convert to use ADBBusState internal autopoll variables
  cuda: convert to use ADBBusState internal autopoll variables
  adb: create autopoll variables directly within ADBBusState
  adb: introduce realize/unrealize and VMStateDescription for ADB bus
  pmu: honour autopoll_rate_ms when rearming the ADB autopoll timer
  pmu: fix duplicate autopoll mask variable
  cuda: convert ADB autopoll timer from ns to ms
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-26 12:14:18 +01:00
Jingqi Liu 21b2eca6fc configure: add libdaxctl support
Add a pair of configure options --{enable,disable}-libdaxctl to control
whether QEMU is compiled with libdaxctl [1]. Libdaxctl is a utility
library for managing the device dax subsystem.

QEMU uses mmap(2) to maps vNVDIMM backends and aligns the mapping
address to the page size (getpagesize(2)) by default. However, some
types of backends may require an alignment different than the page
size. The 'align' option is provided to memory-backend-file to allow
users to specify the proper alignment.

For device dax (e.g., /dev/dax0.0), the 'align' option needs to match
the alignment requirement of the device dax, which can be fetched
through the APIs of libdaxctl version 57 or up.

[1] Libdaxctl is a part of ndctl project.
The project's repository is: https://github.com/pmem/ndctl

For more information about libdaxctl APIs, you can refer to the
comments in source code of: pmem/ndctl/daxctl/lib/libdaxctl.c.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Message-Id: <20200429085011.63752-4-jingqi.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:31 -04:00
Pavel Dovgalyuk 677a3baba4 replay: synchronize on every virtual timer callback
Sometimes virtual timer callbacks depend on order
of virtual timer processing and warping of virtual clock.
Therefore every callback should be logged to make replay deterministic.
This patch creates a checkpoint before every virtual timer callback.
With these checkpoints virtual timers processing and clock warping
events order is completely deterministic.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Alex Bennée <alex.bennee@linaro.org>

--

v2:
  - remove mutex lock/unlock for virtual clock checkpoint since it is
    not process any asynchronous events (commit ca9759c2a9)
  - bump record/replay log file version
Message-Id: <159012932716.27256.8854065545365559921.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:30 -04:00
Pavel Dovgalyuk 255ae6e215 replay: notify the main loop when there are no instructions
When QEMU is executed in console mode without any external event sources,
main loop may sleep for a very long time. But in case of replay
there is another event source - event log.
This patch adds main loop notification when the vCPU loop has nothing
to do and main loop should process the inputs from the event log.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <159013007895.28110.2020104406699709721.stgit@pasha-ThinkPad-X280>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:30 -04:00
Michael S. Tsirkin 5d971f9e67 memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"
Memory API documentation documents valid .min_access_size and .max_access_size
fields and explains that any access outside these boundaries is blocked.

This is what devices seem to assume.

However this is not what the implementation does: it simply
ignores the boundaries unless there's an "accepts" callback.

Naturally, this breaks a bunch of devices.

Revert to the documented behaviour.

Devices that want to allow any access can just drop the valid field,
or add the impl field to have accesses converted to appropriate
length.

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fixes: CVE-2020-13754
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1842363
Fixes: a014ed07bd ("memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200610134731.1514409-1-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:30 -04:00
Paolo Bonzini 4b7c06837a libqos: pci-pc: use 32-bit write for EJ register
The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:30 -04:00
Paolo Bonzini 89ed83d8b2 libqos: usb-hcd-ehci: use 32-bit write for config register
The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:29 -04:00
David CARLIER ae2b72072b util/getauxval: Porting to FreeBSD getauxval feature
From d7f9d40777d1ed7c9450b0be4f957da2993dfc72 Mon Sep 17 00:00:00 2001
From: David Carlier <devnexen@gmail.com>
Date: Fri, 12 Jun 2020 09:39:17 +0100
Subject: [PATCH] util/getauxval: Porting to FreeBSD getauxval feature

FreeBSD has a similar API for auxiliary vector.

Signed-off-by: David Carlier <devnexen@gmail.com>
Message-Id: <CA+XhMqxTU6PUSQBpbA9VrS1QZfqgrCAKUCtUF-x2aF=fCMTDOw@mail.gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:29 -04:00
Jay Zhou 494cd11d76 kvm: support to get/set dirty log initial-all-set capability
Since the new capability KVM_DIRTY_LOG_INITIALLY_SET of
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 has been introduced in the
kernel, tweak the userspace side to detect and enable this
capability.

Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200304025554.2159-1-jianjay.zhou@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 06:45:29 -04:00
Mark Cave-Ayland e590e7f014 adb: add ADB bus trace events
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-23-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland fa6c953964 adb: use adb_device prefix for ADB device trace events
This is to allow us to distinguish between ADB device events and ADB
bus events separately.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-22-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland 913f47ef96 adb: only call autopoll callbacks when autopoll is not blocked
Handle this at the ADB bus level so that individual implementations do not need
to handle this themselves.

Finally add an assert() into adb_request() to prevent developers from accidentally
making an explicit ADB request without blocking autopoll.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-21-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland 975fcedd31 mac_via: rework ADB state machine to be compatible with both MacOS and Linux
The existing ADB state machine is designed to work with Linux which has a different
interpretation of the state machine detailed in "Guide to the Macintosh Family
Hardware". In particular the current Linux implementation includes an extra change
to IDLE state when switching the VIA between send and receive modes which does not
occur in MacOS, and omitting this transition causes the current mac_via ADB state
machine to fail.

Rework the ADB state machine accordingly so that it can enumerate and autopoll the
ADB under both Linux and MacOS, including the addition of the new adb_autopoll_block()
and adb_autopoll_unblock() functions.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-20-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland 378a503479 mac_via: move VIA1 portB write logic into mos6522_q800_via1_write()
Currently the logic is split between the mos6522 portB_write() callback and
the memory region used to capture the VIA1 MMIO accesses. Move everything
into the latter mos6522_q800_via1_write() function to keep all the logic in
one place to make it easier to follow.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-19-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland cf093b0772 pmu: add adb_autopoll_block() and adb_autopoll_unblock() functions
Ensure that the PMU buffer is protected from autopoll requests overwriting
its contents whilst existing PMU requests are in progress.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-18-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland 45c9d721ef cuda: add adb_autopoll_block() and adb_autopoll_unblock() functions
Ensure that the CUDA buffer is protected from autopoll requests overwriting
its contents whilst existing CUDA requests are in progress.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-17-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:52 +01:00
Mark Cave-Ayland 4e5df0369f adb: add autopoll_blocked variable to block autopoll
Whilst autopoll is enabled it is necessary to prevent the ADB buffer contents
from being overwritten until the host has read back the response in its
entirety.

Add adb_autopoll_block() and adb_autopoll_unblock() functions in preparation
for ensuring that the ADB buffer contents are protected for explicit ADB
requests.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-16-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland d2288b7584 adb: use adb_request() only for explicit requests
Currently adb_request() is called both for explicit ADB requests and internal
autopoll requests via adb_poll().

Move the current functionality into do_adb_request() to be used internally and
add a simple adb_request() wrapper for explicit requests.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-15-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland 3fe02cc8b3 adb: add status field for holding information about the last ADB request
Currently only 2 bits are defined: one to indicate if the request timed out (no
reply) and another to indicate whether the request was the result of an autopoll
operation.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-14-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland 244a0ee965 adb: keep track of devices with pending data
Add a new pending variable to ADBBusState which is a bitmask indicating which
ADB devices have data to send. Update the bitmask every time that an ADB
request is executed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-13-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland 969ca2f7a1 adb: introduce new ADBDeviceHasData method to ADBDeviceClass
This is required later to allow devices to assert a service request (SRQ)
signal to indicate that it has data to send, without having to consume it.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-12-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland f3d61457e8 mac_via: convert to use ADBBusState internal autopoll variables
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-11-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland df381d584c pmu: convert to use ADBBusState internal autopoll variables
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-10-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland b12a0b164c cuda: convert to use ADBBusState internal autopoll variables
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-9-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00
Mark Cave-Ayland da52c083ac adb: create autopoll variables directly within ADBBusState
Rather than each ADB implementation requiring its own functions to manage
autopoll state, timers, and autopoll masks prepare to move this information
directly into ADBBusState.

Add external functions within adb.h to allow each ADB implementation to
manage the new autopoll variables.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200623204936.24064-8-mark.cave-ayland@ilande.co.uk>
2020-06-26 10:13:51 +01:00