Commit graph

119 commits

Author SHA1 Message Date
David Hildenbrand a384bfa32e util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.

While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.

More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].

This resolves the TODO in do_touch_pages().

In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.

[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com

Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Paolo Bonzini 7db492a1b6 osdep: fix HAVE_BROKEN_SIZE_MAX case
While config-host.mak entries are expanded to "1" for compatibility with
create-config.sh, tests done directly in meson.build expand to the empty
string and cannot be placed to the right of the && operator.  Adjust
osdep.h after commit e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX
to meson", 2021-07-06) changed the way HAVE_BROKEN_SIZE_MAX is defined.

Reported-by: Frederic Bezies <fredbezies@gmail.com>
Fixes: e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX to meson", 2021-07-06)
Resolves: #463
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-09 18:19:00 +02:00
Vladimir Sementsov-Ogievskiy 4d324c0bf6 introduce QEMU_AUTO_VFREE
Introduce a convenient macro, that works for qemu_memalign() like
g_autofree works with g_malloc.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210628121133.193984-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:21 +02:00
Paolo Bonzini c9797456f6 osdep: provide ROUND_DOWN macro
osdep.h provides a ROUND_UP macro to hide bitwise operations for the
purpose of rounding a number up to a power of two; add a ROUND_DOWN
macro that does the same with truncation towards zero.

While at it, change the formatting of some comments.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:12 +02:00
David Hildenbrand d94e0bc9ef util/mmap-alloc: Support RAM_NORESERVE via MAP_NORESERVE under Linux
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.

Linux man page:
  "MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
  space is reserved, one has the guarantee that it is possible to modify
  the mapping. When swap space is not reserved one might get SIGSEGV
  upon a write if no physical memory is available. See also the discussion
  of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
  2.6, this flag had effect only for private writable mappings."

Note that the "guarantee" part is wrong with memory overcommit in Linux.

Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).

The rough behavior is [1]:
a) !Hugetlbfs:

  1) Without MAP_NORESERVE *or* with memory overcommit under Linux
     disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
     accounting/reservation happens:
      For a file backed map
       SHARED or READ-only - 0 cost (the file is the map not swap)
       PRIVATE WRITABLE - size of mapping per instance

      For an anonymous or /dev/zero map
       SHARED   - size of mapping
       PRIVATE READ-only - 0 cost (but of little use)
       PRIVATE WRITABLE - size of mapping per instance

  2) With MAP_NORESERVE, no accounting/reservation happens.

b) Hugetlbfs:

  1) Without MAP_NORESERVE, huge pages are reserved.

  2) With MAP_NORESERVE, no huge pages are reserved.

Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.

[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting

Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand 8dbe22c686 memory: Introduce RAM_NORESERVE and wire it up in qemu_ram_mmap()
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:

"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"

Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()

... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both, POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.

The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand b444f5c079 util/mmap-alloc: Pass flags instead of separate bools to qemu_ram_mmap()
Let's pass flags instead of bools to prepare for passing other flags and
update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_
flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify
it.

We expose only flags that are currently supported by qemu_ram_mmap().
Maybe, we'll see qemu_mmap() in the future as well that can implement these
flags.

Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only
defined for some systems and we want to always be able to identify
these flags reliably inside qemu_ram_mmap() -- for example, to properly
warn when some future flags are not available or effective on a system.
Also, this way we can simplify PROT_ handling as well.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:38 +02:00
David Hildenbrand cdfa56c551 softmmu/physmem: Fix ram_block_discard_range() to handle shared anonymous memory
We can create shared anonymous memory via
    "-object memory-backend-ram,share=on,..."
which is, for example, required by PVRDMA for mremap() to work.

Shared anonymous memory is weird, though. Instead of MADV_DONTNEED, we
have to use MADV_REMOVE: MADV_DONTNEED will only remove / zap all
relevant page table entries of the current process, the backend storage
will not get removed, resulting in no reduced memory consumption and
a repopulation of previous content on next access.

Shared anonymous memory is internally really just shmem, but without a
fd exposed. As we cannot use fallocate() without the fd to discard the
backing storage, MADV_REMOVE gets the same job done without a fd as
documented in "man 2 madvise". Removing backing storage implicitly
invalidates all page table entries with relevant mappings - an additional
MADV_DONTNEED is not required.

Fixes: 06329ccecf ("mem: add share parameter to memory-backend-ram")
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210406080126.24010-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-15 20:27:37 +02:00
Richard Henderson d7107fc00a util/osdep: Add qemu_mprotect_rw
For --enable-tcg-interpreter on Windows, we will need this.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-06-13 17:42:40 -07:00
Peter Maydell 415a9fb880 osdep: Make os-win32.h and os-posix.h handle 'extern "C"' themselves
Both os-win32.h and os-posix.h include system header files. Instead
of having osdep.h include them inside its 'extern "C"' block, make
these headers handle that themselves, so that we don't include the
system headers inside 'extern "C"'.

This doesn't fix any current problems, but it's conceptually the
right way to handle system headers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
2021-05-10 17:21:54 +01:00
Peter Maydell ec63ca2d35 include/qemu/osdep.h: Move system includes to top
Mostly osdep.h puts the system includes at the top of the file; but
there are a couple of exceptions where we include a system header
halfway through the file.  Move these up to the top with the rest
so that all the system headers we include are included before
we include os-win32.h or os-posix.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-4-peter.maydell@linaro.org
Message-id: 20210414184343.26235-1-peter.maydell@linaro.org
2021-04-17 18:44:30 +01:00
Paolo Bonzini 875df03b22 osdep: protect qemu/osdep.h with extern "C"
System headers may include templates if compiled with a C++ compiler,
which cause the compiler to complain if qemu/osdep.h is included
within a C++ source file's 'extern "C"' block.  Add
an 'extern "C"' block directly to qemu/osdep.h, so that
system headers can be kept out of it.

There is a stray declaration early in qemu/osdep.h, which needs
to be special cased.  Add a definition in qemu/compiler.h to
make it look nice.

config-host.h, CONFIG_TARGET, exec/poison.h and qemu/compiler.h
are included outside the 'extern "C"' block; that is not
an issue because they consist entirely of preprocessor directives.

This allows us to move the include of osdep.h in our two C++
source files outside the extern "C" block they were previously
using for it, which in turn means that they compile successfully
against newer versions of glib which insist that glib.h is
*not* inside an extern "C" block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-3-peter.maydell@linaro.org
[PMM: Moved disas/arm-a64.cc osdep.h include out of its extern "C" block;
 explained in commit message why we're doing this]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-17 18:44:30 +01:00
Paolo Bonzini af1bb59c07 osdep: include glib-compat.h before other QEMU headers
glib-compat.h is sort of like a system header, and it needs to include
system headers (glib.h) that may dislike being included under
'extern "C"'.  Move it right after all system headers and before
all other QEMU headers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210416135543.20382-2-peter.maydell@linaro.org
[PMM: Added comment about why glib-compat.h is special]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-17 18:44:30 +01:00
Joelle van Dyne 1ad27f7d93 osdep: build with non-working system() function
Build without error on hosts without a working system(). If system()
is called, return -1 with ENOSYS.

Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-id: 20210126012457.39046-6-j@getutm.app
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-01-29 10:47:28 +00:00
Roman Bolshakov 653b87eb36 tcg: Toggle page execution for Apple Silicon
Pages can't be both write and executable at the same time on Apple
Silicon. macOS provides public API to switch write protection [1] for
JIT applications, like TCG.

1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon

Tested-by: Alexander Graf <agraf@csgraf.de>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com>
[rth: Inline the qemu_thread_jit_* functions;
 drop the MAP_JIT change for a follow-on patch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-01-23 12:13:00 -10:00
Michael Forney c0cb758eec osdep.h: Remove <sys/signal.h> include
Prior to 2a4b472c3c, sys/signal.h was only included on OpenBSD
(apart from two .c files). The POSIX standard location for this
header is just <signal.h> and in fact, OpenBSD's signal.h includes
sys/signal.h itself.

Unconditionally including <sys/signal.h> on musl causes warnings
for just about every source file:

  /usr/include/sys/signal.h:1:2: warning: #warning redirecting incorrect #include <sys/signal.h> to <signal.h> [-Wcpp]
      1 | #warning redirecting incorrect #include <sys/signal.h> to <signal.h>
        |  ^~~~~~~

Since there don't seem to be any platforms which require including
<sys/signal.h> in addition to <signal.h>, and some platforms like
Haiku lack it completely, just remove it.

Tested building on OpenBSD after removing this include.

Signed-off-by: Michael Forney <mforney@mforney.org>
Tested-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210113215600.16100-1-mforney@mforney.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-01-20 10:42:03 +01:00
Paolo Bonzini a4c13869f9 oslib: do not call g_strdup from qemu_get_exec_dir
Just return the directory without requiring the caller to free it.
This also removes a bogus check for NULL in os_find_datadir and
module_load_one; g_strdup of a static variable cannot return NULL.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-30 19:11:36 +02:00
Peter Maydell 5e0a8fda65 * Fix "readlink -f" problem in iotests on macOS (to fix the Cirrus-CI tests)
* Some minor qtest improvements
 * Fix the unit tests to work on MSYS2, too
 * Enable building and testing on MSYS2 in the Cirrus-CI
 * Build FreeBSD with one task again in the Cirrus-CI
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAl9h9e0RHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbXXMA/6AvNOOEgYeW+YrkIjMgh+jjgrmBK5FH0J
 REJiJ5CxQBh9v3gPV5ehWv4/R9pmaEPtbsZ4Bc1jmRwLHcAWIJ/JTYo11M4vTYa3
 IjS9+dlqgznzxZHFavwJ8USjcyeVjkqyaUTE7CNPgzE2b0237oQ8MHzFGlsHwGZV
 AiRhDHI0StCE3QeKICnpB91Us+KF/+UjZnCwSaC/SM8Sq+6LnTF0bEYYUH44SfZe
 AX3ax9kxzWFtzpXXh/3qL0gdGwiVqwv35V7MYpQWZJAPA3TdxVnUDE7/XP1RTOjL
 hhJLf6IqgPwbRWLszmYmTiUCDGE8kqO8wj5MkKlJcjLY9n4zv0ErOjy6Nhnr8b5Q
 TA9hjRfkRkUoquVRm7ZBOE9l2jIkWV9olxYFqBipqBMujSlt9T0seUi+eaY6NuAA
 Z8NOQslqi8xP7wN4Lw3DpGOfbeTvtOlDtA7O7HwwTChTlhCJX7FCoNmpqhCiFRpH
 s7VkNCXoc6l8NDI+Py5sjpRRHMQIsFWUCnZLWJQ+UJWZvfnNoLTM3ErdqzIasVLt
 vW/behHRd7L/hGMa7zNtQa+wv2bgXY/hbFFpNK6RUEaPBzUq3ZixFrMW2Fw6X7mg
 eIVPNrh/LloiJGQfpUuNkqiZ4vdgUeBq7Z89TCU49xskQAgHb0KglnveU42nP8Yf
 pO8OCBOjfJg=
 =ErBp
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/huth-gitlab/tags/pull-request-2020-09-16' into staging

* Fix "readlink -f" problem in iotests on macOS (to fix the Cirrus-CI tests)
* Some minor qtest improvements
* Fix the unit tests to work on MSYS2, too
* Enable building and testing on MSYS2 in the Cirrus-CI
* Build FreeBSD with one task again in the Cirrus-CI

# gpg: Signature made Wed 16 Sep 2020 12:24:29 BST
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/huth-gitlab/tags/pull-request-2020-09-16: (24 commits)
  cirrus: Building freebsd in a single shot
  ci: Enable msys2 ci in cirrus
  tests: Fixes test-qdev-global-props.c
  tests: fix test-util-sockets.c
  tests: Fixes test-io-channel-file by mask only owner file state mask bits
  tests: fixes aio-win32 about aio_remove_fd_handler, get it consistence with aio-posix.c
  tests: Fixes test-io-channel-socket.c tests under msys2/mingw
  vmstate: Fixes test-vmstate.c on msys2/mingw
  meson: remove empty else and duplicated gio deps
  meson: Use -b to ignore CR vs. CR-LF issues on Windows
  osdep: file locking functions are not available on Win32
  tests: test-replication disable /replication/secondary/* on msys2/mingw.
  tests: Fixes test-replication.c on msys2/mingw.
  meson: disable crypto tests are empty under win32
  meson: Disable test-char on msys2/mingw for fixing tests stuck
  rcu: fixes test-logging.c by call drain_call_rcu before rmdir_full
  tests: Convert g_free to g_autofree macro in test-logging.c
  rcu: Implement drain_call_rcu
  qga/commands-win32: Fix problem with redundant protype declaration
  Simplify the .gitignore file
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-09-17 13:38:08 +01:00
Daniel P. Berrangé c490af57cb util: introduce qemu_open and qemu_create with error reporting
qemu_open_old() works like open(): set errno and return -1 on failure.
It has even more failure modes, though.  Reporting the error clearly
to users is basically impossible for many of them.

Our standard cure for "errno is too coarse" is the Error object.
Introduce two new helper methods:

  int qemu_open(const char *name, int flags, Error **errp);
  int qemu_create(const char *name, int flags, mode_t mode, Error **errp);

Note that with this design we no longer require or even accept the
O_CREAT flag. Avoiding overloading the two distinct operations
means we can avoid variable arguments which would prevent 'errp' from
being the last argument. It also gives us a guarantee that the 'mode' is
given when creating files, avoiding a latent security bug.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-16 10:33:48 +01:00
Daniel P. Berrangé 448058aa99 util: rename qemu_open() to qemu_open_old()
We want to introduce a new version of qemu_open() that uses an Error
object for reporting problems and make this it the preferred interface.
Rename the existing method to release the namespace for the new impl.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-16 10:33:48 +01:00
Daniel P. Berrangé 60efffa41b monitor: simplify functions for getting a dup'd fdset entry
Currently code has to call monitor_fdset_get_fd, then dup
the return fd, and then add the duplicate FD back into the
fdset. This dance is overly verbose for the caller and
introduces extra failure modes which can be avoided by
folding all the logic into monitor_fdset_dup_fd_add and
removing monitor_fdset_get_fd entirely.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2020-09-16 10:33:48 +01:00
Yonggang Luo 6333da0f07 osdep: file locking functions are not available on Win32
Do not declare the following locking functions on Win32:
int qemu_lock_fd(int fd, int64_t start, int64_t len, bool exclusive);
int qemu_unlock_fd(int fd, int64_t start, int64_t len);
int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive);
bool qemu_has_ofd_lock(void);

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200915121318.247-10-luoyonggang@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2020-09-16 08:41:06 +02:00
Paolo Bonzini 2becc36a3e meson: infrastructure for building emulators
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-08-21 06:30:17 -04:00
Alex Bennée 2667e069e7 linux-user: don't use MAP_FIXED in pgd_find_hole_fallback
Plain MAP_FIXED has the undesirable behaviour of splatting exiting
maps so we don't actually achieve what we want when looking for gaps.
We should be using MAP_FIXED_NOREPLACE. As this isn't always available
we need to potentially check the returned address to see if the kernel
gave us what we asked for.

Fixes: ad592e37df ("linux-user: provide fallback pgd_find_hole for bare chroots")
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200724064509.331-9-alex.bennee@linaro.org>
2020-07-27 09:41:18 +01:00
Alex Bennée ad06ef0efb util: add qemu_get_host_physmem utility function
This will be used in a future patch. For POSIX systems _SC_PHYS_PAGES
isn't standardised but at least appears in the man pages for
Open/FreeBSD. The result is advisory so any users of it shouldn't just
fail if we can't work it out.

The win32 stub currently returns 0 until someone with a Windows system
can develop and test a patch.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Message-Id: <20200724064509.331-5-alex.bennee@linaro.org>
2020-07-27 09:40:12 +01:00
Philippe Mathieu-Daudé d450cccc9a qemu/osdep: Reword qemu_get_exec_dir() documentation
This comment is confuse, reword it a bit.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael Rolnik <mrolnik@gmail.com>
Tested-by: Michael Rolnik <mrolnik@gmail.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200714164257.23330-3-f4bug@amsat.org>
2020-07-21 16:13:04 +02:00
Michal Privoznik e47f4765af util: Introduce qemu_get_host_name()
This function offers operating system agnostic way to fetch host
name. It is implemented for both POSIX-like and Windows systems.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2020-07-13 17:44:58 -05:00
David CARLIER 8bf0f1754a osdep.h: For Haiku, define SIGIO as equivalent to SIGPOLL
Haiku doesn't provide SIGIO; fix this up in osdep.h by defining it as
equal to SIGPOLL.

Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200703145614.16684-6-peter.maydell@linaro.org
[PMM: Expanded commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-13 14:36:09 +01:00
David CARLIER 2a4b472c3c osdep.h: Always include <sys/signal.h> if it exists
Regularize our handling of <sys/signal.h>: currently we include it in
osdep.h, but only for OpenBSD, and we include it without an ifdef
guard in a couple of C files.  This causes problems for Haiku, which
doesn't have that header.

Instead, check in configure whether sys/signal.h exists, and if it
does then always include it from osdep.h.

Signed-off-by: David Carlier <devnexen@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200703145614.16684-5-peter.maydell@linaro.org
[PMM: Expanded commit message; rename to HAVE_SYS_SIGNAL_H]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-07-13 14:36:09 +01:00
Eric Blake 6553aa1d11 coverity: provide Coverity-friendly MIN_CONST and MAX_CONST
Coverity has problems seeing through __builtin_choose_expr, which
result in it abandoning analysis of later functions that utilize a
definition that used MIN_CONST or MAX_CONST, such as in qemu-file.c:

 50    DECLARE_BITMAP(may_free, MAX_IOV_SIZE);

CID 1429992 (#1 of 1): Unrecoverable parse warning (PARSE_ERROR)1.
expr_not_constant: expression must have a constant value

As has been done in the past (see 07d66672), it's okay to dumb things
down when compiling for static analyzers.  (Of course, now the
syntax-checker has a false positive on our reference to
__COVERITY__...)

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: CID 1429992, CID 1429995, CID 1429997, CID 1429999
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200629162804.1096180-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 18:02:18 -04:00
Eric Blake f9919116b8 osdep: Make MIN/MAX evaluate arguments only once
I'm not aware of any immediate bugs in qemu where a second runtime
evaluation of the arguments to MIN() or MAX() causes a problem, but
proactively preventing such abuse is easier than falling prey to an
unintended case down the road.  At any rate, here's the conversation
that sparked the current patch:
https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html

Update the MIN/MAX macros to only evaluate their argument once at
runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are
promoting the temporaries to the same type as the final comparison (we
have to trigger type promotion, as typeof(bitfield) won't compile; and
we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our
uses of MAX are on void* pointers where such addition is undefined).

However, we are unable to work around gcc refusing to compile ({}) in
a constant context (such as the array length of a static variable),
even when only used in the dead branch of a __builtin_choose_expr(),
so we have to provide a second macro pair MIN_CONST and MAX_CONST for
use when both arguments are known to be compile-time constants and
where the result must also be usable as a constant; this second form
evaluates arguments multiple times but that doesn't matter for
constants.  By using a void expression as the expansion if a
non-constant is presented to this second form, we can enlist the
compiler to ensure the double evaluation is not attempted on
non-constants.

Alas, as both macros now rely on compiler intrinsics, they are no
longer usable in preprocessor #if conditions; those will just have to
be open-coded or the logic rewritten into #define or runtime 'if'
conditions (but where the compiler dead-code-elimination will probably
still apply).

I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all
forms of macro mis-use.  As the errors can sometimes be cryptic, I'm
demonstrating the gcc output:

Use of MIN when MIN_CONST is needed:

In file included from /home/eblake/qemu/qemu-img.c:25:
/home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function
  249 |     ({                                                  \
      |     ^
/home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’
   92 | char array[MIN(1, 2)] = "";
      |            ^~~

Use of MIN_CONST when MIN is needed:

/home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’:
/home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be
 1225 |             i = MIN_CONST(i, n);
      |               ^

Use of MIN in the preprocessor:

In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20:
/home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’:
/home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions
  249 |     ({                                                  \
      |      ^

Fix the resulting callsites that used #if or computed a compile-time
constant min or max to use the new macros.  cpu-defs.h is interesting,
as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes
dynamic.

It may be worth improving glib's MIN/MAX definitions to be saner, but
that is a task for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200625162602.700741-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Mikhail Gusarov 949da1eb9d chardev: Add macOS to list of OSes that support -chardev serial
macOS API for dealing with serial ports/ttys is identical to BSDs.

Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200426210956.17324-1-dottedmag@dottedmag.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-05-04 14:35:23 +02:00
Peter Maydell c160f17cd6 osdep.h: Drop no-longer-needed Coverity workarounds
In commit a1a98357e3 in 2018 we added some workarounds for Coverity
not being able to handle the _Float* types introduced by recent
glibc.  Newer versions of the Coverity scan tools have support for
these types, and will fail with errors about duplicate typedefs if we
have our workaround.  Remove our copy of the typedefs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200319193323.2038-2-peter.maydell@linaro.org
2020-04-14 09:44:31 +01:00
Marc-André Lureau ee13240e60 osdep: add qemu_unlink()
Add a helper function to match qemu_open() which may return files
under the /dev/fdset prefix. Those shouldn't be removed, since it's
only a qemu namespace.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-02 16:29:32 +04:00
Wei Yang 038adc2f58 core: replace getpagesize() with qemu_real_host_page_size
There are three page size in qemu:

  real host page size
  host page size
  target page size

All of them have dedicate variable to represent. For the last two, we
use the same form in the whole qemu project, while for the first one we
use two forms: qemu_real_host_page_size and getpagesize().

qemu_real_host_page_size is defined to be a replacement of
getpagesize(), so let it serve the role.

[Note] Not fully tested for some arch or device.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:06 +02:00
Stefan Hajnoczi 72d41eb4b8 memory: fetch pmem size in get_file_size()
Neither stat(2) nor lseek(2) report the size of Linux devdax pmem
character device nodes.  Commit 314aec4a6e
("hostmem-file: reject invalid pmem file sizes") added code to
hostmem-file.c to fetch the size from sysfs and compare against the
user-provided size=NUM parameter:

  if (backend->size > size) {
      error_setg(errp, "size property %" PRIu64 " is larger than "
                 "pmem file \"%s\" size %" PRIu64, backend->size,
                 fb->mem_path, size);
      return;
  }

It turns out that exec.c:qemu_ram_alloc_from_fd() already has an
equivalent size check but it skips devdax pmem character devices because
lseek(2) returns 0:

  if (file_size > 0 && file_size < size) {
      error_setg(errp, "backing store %s size 0x%" PRIx64
                 " does not match 'size' option 0x" RAM_ADDR_FMT,
                 mem_path, file_size, size);
      return NULL;
  }

This patch moves the devdax pmem file size code into get_file_size() so
that we check the memory size in a single place:
qemu_ram_alloc_from_fd().  This simplifies the code and makes it more
general.

This also fixes the problem that hostmem-file only checks the devdax
pmem file size when the pmem=on parameter is given.  An unchecked
size=NUM parameter can lead to SIGBUS in QEMU so we must always fetch
the file size for Linux devdax pmem character device nodes.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190830093056.12572-1-stefanha@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-16 12:32:21 +02:00
Cao Jiaxi 946376c21b osdep: Fix mingw compilation regarding stdio formats
I encountered the following compilation error on mingw:

/mnt/d/qemu/include/qemu/osdep.h:97:9: error: '__USE_MINGW_ANSI_STDIO' macro redefined [-Werror,-Wmacro-redefined]
 #define __USE_MINGW_ANSI_STDIO 1
        ^
/mnt/d/llvm-mingw/aarch64-w64-mingw32/include/_mingw.h:433:9: note: previous definition is here
 #define __USE_MINGW_ANSI_STDIO 0      /* was not defined so it should be 0 */

It turns out that __USE_MINGW_ANSI_STDIO must be set before any
system headers are included, not just before stdio.h.

Signed-off-by: Cao Jiaxi <driver1998@foxmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20190503003719.10233-1-driver1998@foxmail.com
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-07 12:55:03 +01:00
Stefan Hajnoczi 314aec4a6e hostmem-file: reject invalid pmem file sizes
Guests started with NVDIMMs larger than the underlying host file produce
confusing errors inside the guest.  This happens because the guest
accesses pages beyond the end of the file.

Check the pmem file size on startup and print a clear error message if
the size is invalid.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1669053
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Zhang Yi <yi.z.zhang@linux.intel.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190214031004.32522-3-stefanha@redhat.com>
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-03-11 10:44:19 -03:00
Richard W.M. Jones d339d766d1 qemu-io: Add generic function for reinitializing optind.
On FreeBSD 11.2:

  $ nbdkit memory size=1M --run './qemu-io -f raw -c "aio_write 0 512" $nbd'
  Parsing error: non-numeric argument, or extraneous/unrecognized suffix -- aio_write

After main option parsing, we reinitialize optind so we can parse each
command.  However reinitializing optind to 0 does not work on FreeBSD.
What happens when you do this is optind remains 0 after the option
parsing loop, and the result is we try to parse argv[optind] ==
argv[0] == "aio_write" as if it was the first parameter.

The FreeBSD manual page says:

  In order to use getopt() to evaluate multiple sets of arguments, or to
  evaluate a single set of arguments multiple times, the variable optreset
  must be set to 1 before the second and each additional set of calls to
  getopt(), and the variable optind must be reinitialized.

(From the rest of the man page it is clear that optind must be
reinitialized to 1).

The glibc man page says:

  A program that scans multiple argument vectors,  or  rescans  the  same
  vector  more than once, and wants to make use of GNU extensions such as
  '+' and '-' at  the  start  of  optstring,  or  changes  the  value  of
  POSIXLY_CORRECT  between scans, must reinitialize getopt() by resetting
  optind to 0, rather than the traditional value of 1.  (Resetting  to  0
  forces  the  invocation  of  an  internal  initialization  routine that
  rechecks POSIXLY_CORRECT and checks for GNU extensions in optstring.)

This commit introduces an OS-portability function called
qemu_reset_optind which provides a way of resetting optind that works
on FreeBSD and platforms that use optreset, while keeping it the same
as now on other platforms.

Note that the qemu codebase sets optind in many other places, but in
those other places it's setting a local variable and not using getopt.
This change is only needed in places where we are using getopt and the
associated global variable optind.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20190118101114.11759-2-rjones@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-01-31 00:38:19 +01:00
Marc-André Lureau 56cdca1d7a build-sys: build with Vista API by default
Both qemu & qga build with Vista API by default already, by defining
_WIN32_WINNT 0x0600. Set it globally in osdep.h instead.

This replaces WINVER by _WIN32_WINNT in osdep.h. WINVER doesn't seem
to be really useful these days.
(see also https://blogs.msdn.microsoft.com/oldnewthing/20070411-00/?p=27283)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181122110039.15972-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-11 13:57:25 +01:00
Marc-André Lureau 007e722c34 build-sys: move windows defines in osdep.h header
This removes some clutter in compilation logging, and allows some
easier tweaking per compilation unit/CFLAGS overriding.

Note that we can't move those define in os-win32.h, since they must be
set before the first system headers are included.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20181122110039.15972-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-11 13:57:25 +01:00
Richard Henderson 3ebee3b191 osdep: Work around MinGW assert
In several places we use assert(FEATURE), and assume that if FEATURE
is disabled, all following code is removed as unreachable.  Which allows
us to compile-out functions that are only present with FEATURE, and
have a link-time failure if the functions remain used.

MinGW does not mark its internal function _assert() as noreturn, so the
compiler cannot see when code is unreachable, which leads to link errors
for this host that are not present elsewhere.

The current build-time failure concerns 62823083b8, but I remember
having seen this same error before.  Fix it once and for all for MinGW.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20181022181623.8810-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-23 10:12:46 +01:00
Marc-André Lureau 9e6bdef224 util: add qemu_write_pidfile()
There are variants of qemu_create_pidfile() in qemu-pr-helper and
qemu-ga. Let's have a common implementation in libqemuutil.

The code is initially based from pr-helper write_pidfile(), with
various improvements and suggestions from Daniel Berrangé:

  QEMU will leave the pidfile existing on disk when it exits which
  initially made me think it avoids the deletion race. The app
  managing QEMU, however, may well delete the pidfile after it has
  seen QEMU exit, and even if the app locks the pidfile before
  deleting it, there is still a race.

  eg consider the following sequence

        QEMU 1        libvirtd        QEMU 2

  1.    lock(pidfile)

  2.    exit()

  3.                 open(pidfile)

  4.                 lock(pidfile)

  5.                                  open(pidfile)

  6.                 unlink(pidfile)

  7.                 close(pidfile)

  8.                                  lock(pidfile)

  IOW, at step 8 the new QEMU has successfully acquired the lock, but
  the pidfile no longer exists on disk because it was deleted after
  the original QEMU exited.

  While we could just say no external app should ever delete the
  pidfile, I don't think that is satisfactory as people don't read
  docs, and admins don't like stale pidfiles being left around on
  disk.

  To make this robust, I think we might want to copy libvirt's
  approach to pidfile acquisition which runs in a loop and checks that
  the file on disk /after/ acquiring the lock matches the file that
  was locked. Then we could in fact safely let QEMU delete its own
  pidfiles on clean exit..

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180831145314.14736-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02 18:47:55 +02:00
Emilio G. Cota 5fe2103429 cacheinfo: add i/d cache_linesize_log
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180910232752.31565-2-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-02 18:47:55 +02:00
Paolo Bonzini a1a98357e3 osdep: work around Coverity parsing errors
Coverity does not like the new _Float* types that are used by
recent glibc, and croaks on every single file that includes
stdlib.h.  Add dummy typedefs to please it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:34 +02:00
Nicholas Piggin 0c1272cc7c osdep: powerpc64 align memory to allow 2MB radix THP page tables
This allows KVM with the Book3S radix MMU mode to take advantage of
THP and install larger pages in the partition scope page tables (the
host translation).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
Michael S. Tsirkin 28012e190e osdep: add wait.h compat macros
Man page for WCOREDUMP says:

  WCOREDUMP(wstatus) returns true if the child produced a core dump.
  This macro should be employed only if WIFSIGNALED returned true.

  This  macro  is  not  specified  in POSIX.1-2001 and is not
  available on some UNIX implementations (e.g., AIX, SunOS).  Therefore,
  enclose its use inside #ifdef WCOREDUMP ... #endif.

Let's do exactly this.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2018-05-24 21:14:11 +03:00
Marcel Apfelbaum 06329ccecf mem: add share parameter to memory-backend-ram
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.

Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.

Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.

There are no functional changes if the new flag is not used.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
2018-02-19 13:03:24 +02:00
Peter Maydell 57d1f6d7ce sparc: Make sure we mmap at SHMLBA alignment
SPARC Linux has an oddity that it insists that mmap()
of MAP_FIXED memory must be at an alignment defined by
SHMLBA, which is more aligned than the page size
(typically, SHMLBA alignment is to 16K, and pages are 8K).
This is a relic of ancient hardware that had cache
aliasing constraints, but even on modern hardware the
kernel still insists on the alignment.

To ensure that we get mmap() alignment sufficient to
make the kernel happy, change QEMU_VMALLOC_ALIGN,
qemu_fd_getpagesize() and qemu_mempath_getpagesize()
to use the maximum of getpagesize() and SHMLBA.

In particular, this allows 'make check' to pass on Sparc:
we were previously failing the ivshmem tests.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512752248-17857-1-git-send-email-peter.maydell@linaro.org
2017-12-15 15:26:24 +00:00
Peter Maydell e7b47c22e2 osdep.h: Make TIME_MAX handle different time_t types
In our various supported host OSes, the time_t type may be either 32
or 64 bit, and could in theory also be either signed or unsigned.
Notably, in OpenBSD time_t is a 64 bit type even if 'long' is 32
bits, so using LONG_MAX for TIME_MAX is incorrect.

Use an approach suggested by Paolo Bonzini which calculates
the maximum value of the type rather than hardcoding it;
to do this we use the TYPE_MAXIMUM macro from Gnulib.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1511452598-6077-1-git-send-email-peter.maydell@linaro.org
2017-11-24 13:23:36 +00:00