Commit graph

1403 commits

Author SHA1 Message Date
Benoît Canet d55dee2044 quorum: Add quorum_getlength().
Check that every bs file returns the same length.
Otherwise, return -EIO to disable the quorum and
avoid length discrepancy.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:50 +01:00
Benoît Canet 95c6bff356 quorum: Add quorum mechanism.
This patchset enables the core of the quorum mechanism.
The num_children reads are compared to get the majority version and if this
version exists more than threshold times the guest won't see the error at all.

If a block is corrupted or if an error occurs during an IO or if the quorum
cannot be established QMP events are used to report to the management.

Use gnutls's SHA-256 to compare versions.

--enable-quorum must be used to enable the feature.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:50 +01:00
Benoît Canet 7db6982a19 quorum: Add quorum_aio_readv.
Add code to do num_children reads in parallel and cleanup the structures
afterwards.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:49 +01:00
Benoît Canet f70d7f7e4d blkverify: Extract qemu_iovec_clone() and qemu_iovec_compare() from blkverify.
qemu_iovec_compare() will be used to compare IOs vectors in quorum blkverify
mode. The patch extracts these functions in order to factorize the code.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:49 +01:00
Benoît Canet 13e7956e31 quorum: Add quorum_aio_writev and its dependencies.
Writes are mirrored num_children times on num_children devices.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:49 +01:00
Benoît Canet cadebd7a2a quorum: Create BDRVQuorumState and BlkDriver and do init.
Create the structure holding the quorum settings and write the minimal block
driver instanciation boilerplate.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:48 +01:00
Benoît Canet 27cec15e4e quorum: Create quorum.c, add QuorumChildRequest and QuorumAIOCB.
Quorum is a block filter mirroring writes to num_children children.
For reads quorum reads each children and does a vote.
If more than vote_threshold versions are identical the quorum is reached and
this winning version is returned to the guest. So quorum prevents bit corruption.
For high availability purpose minority errors are reported via QMP but the guest
does not see them.

This patch creates the driver C source file and introduces the structures that
will be used in asynchronous reads and writes.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 22:29:48 +01:00
Paolo Bonzini 5b7aa9b56d vdi: say why an image is bad
Instead of just putting it in debugging output, we can now put the
value in an Error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini 76abe4071d block: do not abuse EMEDIUMTYPE
Returning "Wrong medium type" for an image that does not have a valid
header is a bit weird.  Improve the error by mentioning what format
was trying to open it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini 89ac8480a8 vmdk: correctly propagate errors
Now that we can return the "right" errors, use the Error** parameter
to pass them back instead of just printing them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini 37f09e5e3d vmdk: do not try opening a file as both image and descriptor
This prepares for propagating errors from vmdk_open_sparse and
vmdk_open_desc_file up to the caller of vmdk_open.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini d1833ef52b vmdk: push vmdk_read_desc up to caller
Currently, we just try reading a VMDK file as both image and descriptor.
This makes it hard to choose which of the two attempts gave the best error.
We'll decide in advance if the file looks like an image or a descriptor,
and this patch is the first step to that end.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini a8842e6d2a vmdk: extract vmdk_read_desc
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Paolo Bonzini c0f92b526d vvfat: correctly propagate errors
Before:
    $ ./qemu-io-old
    qemu-io-old> open -r -o driver=vvfat,fat-type=24,dir=i386-softmmu
    Valid FAT types are only 12, 16 and 32
    qemu-io-old: can't open device (null): Could not open image: Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o driver=vvfat,fat-type=24,dir=i386-softmmu
    qemu-io: can't open device (null): Valid FAT types are only 12, 16 and 32

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini 6890aad46b vhdx: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini 0fea6b7972 qed: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini b6d5066d32 qcow: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini 2a94fee3f6 curl: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini f8d924e481 cow: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini a7451cb850 gluster: correctly propagate errors
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini 24897a767b gluster: default scheme to gluster:// and host to localhost.
Currently, "gluster:///volname/img" and (using file. options)
"file.driver=gluster,file.filename=foo" will segfault.  Also,
"//host/volname/img" will be rejected, but it is a valid URL
that should be accepted just fine with "file.driver=gluster".
Accept all of these, by inferring missing transport and host
as TCP and localhost respectively.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini f2917853f7 iscsi: correctly propagate errors in iscsi_open
Before:
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=iscsi,file.filename=foo
    Failed to parse URL : foo
    qemu-io-old: can't open device (null): Could not open 'foo': Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o file.driver=iscsi,file.filename=foo
    qemu-io: can't open device (null): Failed to parse URL : foo

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini 35cb1748d5 iscsi: fix indentation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:23 +01:00
Paolo Bonzini 77e8b9ca64 nbd: correctly propagate errors
Before:
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd
    one of path and host must be specified.
    qemu-io-old: can't open device (null): Could not open image: Invalid argument
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    path and host may not be used at the same time.
    qemu-io-old: can't open device (null): Could not open image: Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd
    qemu-io: can't open device (null): one of path and host must be specified.
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    qemu-io: can't open device (null): path and host may not be used at the same time.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Paolo Bonzini a69d9af449 nbd: produce a better error if neither host nor port is passed
Before:
    $ qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd
    qemu-io-old: can't open device (null): Could not open image: Invalid argument
    $ ./qemu-io-old
    qemu-io-old> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    path and host may not be used at the same time.
    qemu-io-old: can't open device (null): Could not open image: Invalid argument

After:
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd
    one of path and host must be specified.
    qemu-io: can't open device (null): Could not open image: Invalid argument
    $ ./qemu-io
    qemu-io> open -r -o file.driver=nbd,file.host=foo,file.path=bar
    path and host may not be used at the same time.
    qemu-io: can't open device (null): Could not open image: Invalid argument

Next patch will fix the error propagation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz f7d9fd8c72 block: Remove bdrv_open_image()'s force_raw option
This option is now unnecessary since specifying BDRV_O_PROTOCOL as flag
will do exactly the same.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz 2e40134bfd block: Make bdrv_file_open() static
Add the bdrv_open() option BDRV_O_PROTOCOL which results in passing the
call to bdrv_file_open(). Additionally, make bdrv_file_open() static and
therefore bdrv_open() the only way to call it.

Consequently, all existing calls to bdrv_file_open() have to be adjusted
to use bdrv_open() with the BDRV_O_PROTOCOL flag instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz ddf5636dc9 block: Add reference parameter to bdrv_open()
Allow bdrv_open() to handle references to existing block devices just as
bdrv_file_open() is already capable of.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz f67503e5bd block: Change BDS parameter of bdrv_open() to **
Make bdrv_open() take a pointer to a BDS pointer, similarly to
bdrv_file_open(). If a pointer to a NULL pointer is given, bdrv_open()
will create a new BDS with an empty name; if the BDS pointer is not
NULL, that existing BDS will be reused (in the same way as bdrv_open()
already did).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:21 +01:00
Kevin Wolf a71835a0cc qcow2: Set zero flag for discarded clusters
Instead of making the backing file contents visible again after a discard
request, set the zero flag if possible (i.e. on version >= 3).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-02-21 21:02:21 +01:00
Fam Zheng 6ebc91e5d0 block: use per-object cflags and libs
No longer adds flags and libs for them to global variables, instead
create config-host.mak variables like FOO_CFLAGS and FOO_LIBS, which is
used as per object cflags and libs.

This removes unwanted dependencies from libcacard.

Signed-off-by: Fam Zheng <famz@redhat.com>
[Split from Fam's patch to enable modules. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-20 13:12:54 +01:00
Peter Maydell 4c0c9bbe78 Merge remote-tracking branch 'remotes/qmp-unstable/queue/qmp' into staging
* remotes/qmp-unstable/queue/qmp:
  monitor: Add object_add class argument completion.
  monitor: Add object_del id argument completion.
  monitor: Add device_add device argument completion.
  monitor: Add device_del id argument completion.
  qmp: expose list of supported character device backends
  Use error_is_set() only when necessary
  QMP: allow JSON dict arguments in qmp-shell
  hmp: migrate command (without -d) now blocks correctly

Conflicts:
	blockdev.c

[PMM: resolved trivial conflict in blockdev.c]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-02-20 12:10:23 +00:00
Markus Armbruster 84d18f065f Use error_is_set() only when necessary
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out.  Unnecessarily hard for
optimizers, static checkers, and human readers.  Dumb it down to
obvious.

Gets rid of several dozen Coverity false positives.

Note that the obvious form is already used in many places.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-02-17 11:57:23 -05:00
Jeff Cody cc67f4d1f9 block: mirror - use local_err to avoid NULL errp
When starting a block job, commit_active_start() relies on whether *errp
is set by mirror_start_job.  This allows it to determine if the mirror
job start failed, so that it can clean up any changes to open flags from
the bdrv_reopen().  If errp is NULL, then it will not be able to
determine if mirror_start_job failed or not.

To avoid this, use a local Error variable, and then propagate the error
(if any) to errp.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14 18:05:39 +01:00
Jeff Cody 39a611a3e0 block: Don't throw away errno via error_setg
There are a handful of places in the block layer where a failure path
has a valid -errno value, yet error_setg() is used.  Those instances
should instead use error_setg_errno(), to preserve as much error
information as possible.

This patch replaces those instances with error_setg_errno(), so that
errno is passed up the stack in the error message.

Reported-By: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14 18:05:38 +01:00
Jeff Cody 28f106afb3 block: Add notes to iSCSI's .bdrv_open and .bdrv_reopen_prepare
iSCSI currently does not need to do any actions to support the
current usage of bdrv_reopen().  However, it is important to note
a couple of things: 1.) A connection will not be re-established to
an iSCSI target, and 2.) If iscsi_open() is changed to parse 'flags',
then iscsi_reopen_prepare() may need to be more than a stub.

In light of the above, this commit adds comments above both of the
functions to bring attention to these facts.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-02-14 13:38:35 +01:00
Kevin Wolf eaf944a438 blkdebug: Don't leak bs->file on failure
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09 09:12:39 +01:00
Kevin Wolf ad6aef43d3 raw: Fix BlockLimits passthrough
raw copies over the BlockLimits of bs->file during bdrv_open().
However, since commit d34682cd it is immediately overwritten during
bdrv_refresh_limits(). This caused all fields except for
opt_transfer_length and opt_mem_alignment (which happen to be correctly
inherited in generic code) to be zeroed.

Move the BlockLimit assignment to a .bdrv_refresh_limits() callback to
make it work again for all fields.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao 7c2bbf4aa6 qcow2: check for NULL l2meta
In the case of a metadata preallocation with a large cluster size,
qcow2_alloc_cluster_offset() can allocate nothing and returns a
NULL l2meta. This patch checks for it and link2 l2 with only valid
l2meta.

Replace 9 and 512 with BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE
respectively while at the function.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao 33304ec9fa qcow2: fix offset overflow in qcow2_alloc_clusters_at()
When cluster size is big enough it can lead to an offset overflow
in qcow2_alloc_clusters_at(). This patch fixes it.

The allocation is stopped each time at L2 table boundary
(see handle_alloc()), so the possible maximum bytes could be

  2^(cluster_bits - 3 + cluster_bits)

cluster_bits - 3 is used to compute the number of entry by L2
and the additional cluster_bits is to take into account each
clusters referenced by the L2 entries.

so int is safe for cluster_bits<=17, unsafe otherwise.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao 16f0587e0a qcow2: remove n_start and n_end of qcow2_alloc_cluster_offset()
n_start can be actually calculated from offset. The number of
sectors to be allocated(n_end - n_start) can be passed in in
num. By removing n_start and n_end, we can save two parameters.

The side effect is there is a bug in qcow2.c:preallocate() that
passes incorrect n_start to qcow2_alloc_cluster_offset() is
fixed. The bug can be triggerred by a larger cluster size than
the default value(65536), for example:

./qemu-img create -f qcow2 \
  -o 'cluster_size=131072,preallocation=metadata' file.img 4G

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Peter Lieven 5d259fc7da block/iscsi: always fill bs->bl.opt_transfer_length
the opt_transfer_length has nothing to do with logical
block provisioning stuff so always copy it from
the block limits VPD page.

Reported-By: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:38 +01:00
Peter Lieven 6542aa9c75 block: add native support for NFS
This patch adds native support for accessing images on NFS
shares without the requirement to actually mount the entire
NFS share on the host.

NFS Images can simply be specified by an url of the form:
nfs://<host>/<export>/<filename>[?param=value[&param2=value2[&...]]]

For example:
qemu-img create -f qcow2 nfs://10.0.0.1/qemu-images/test.qcow2

You need LibNFS from Ronnie Sahlberg available at:
   git://github.com/sahlberg/libnfs.git
for this to work.

During configure it is automatically probed for libnfs and support
is enabled on-the-fly. You can forbid or enforce libnfs support
with --disable-libnfs or --enable-libnfs respectively.

Due to NFS restrictions you might need to execute your binaries
as root, allow them to open priviledged ports (<1024) or specify
insecure option on the NFS server.

For additional information on ROOT vs. non-ROOT operation and URL
format + parameters see:
   https://raw.github.com/sahlberg/libnfs/master/README

Supported by qemu are the uid, gid and tcp-syncnt URL parameters.

LibNFS currently support NFS version 3 only.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:38 +01:00
Markus Armbruster f50159fa9b block/vhdx: Error checking fixes
Errors are inadvertently ignored in a few places.  Has always been
broken.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Peter Lieven f43aa8e18a block/vmdk: add basic .bdrv_check support
this adds a basic vmdk corruption check. it should detect severe
table corruptions and file truncation.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Jeff Cody 14b4a8b9c6 block: remove qcow2 .bdrv_make_empty implementation
The QCOW2 .bdrv_make_empty implementation always returns 0 for success,
but does not actually do anything.

The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Jeff Cody 55aff7f133 block: remove QED .bdrv_make_empty implementation
The QED .bdrv_make_empty() implementation does nothing but return
-ENOTSUP, which causes problems in bdrv_commit().  Since the function
stub exists for QED, it is called, which then always returns an error.

The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Anthony Liguori e9f526ab7b Merge remote-tracking branch 'bonzini/scsi-next' into staging
* bonzini/scsi-next:
  scsi: Support TEST UNIT READY in the dummy LUN0
  block: add .bdrv_reopen_prepare() stub for iscsi
  virtio-scsi: Prevent assertion on missed events
  virtio-scsi: Cleanup of I/Os that never started
  scsi: Assign cancel_io vector for scsi_disk_emulate_ops

Conflicts:
	block/iscsi.c

aliguori: resolve trivial merge conflict in block/iscsi.c

Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-24 15:50:14 -08:00
Kevin Wolf 9e1cb96d9a qemu-iotests: Test pwritev RMW logic
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:25 +01:00
Kevin Wolf b35ee7fb23 blkdebug: Make required alignment configurable
The new 'align' option of blkdebug can be used in order to emulate
backends with a required 4k alignment on hosts which only really require
512 byte alignment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 17:40:03 +01:00
Paolo Bonzini 2c9880c45e iscsi: Set bs->request_alignment
The iSCSI backend already gets the block size from the READ CAPACITY
command it sends.  Save it so that the generic block layer gets it
too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:03 +01:00
Kevin Wolf 793ed47a7a block: Switch BdrvTrackedRequest to byte granularity
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:02 +01:00
Paolo Bonzini c25f53b06e raw: Probe required direct I/O alignment
Add a bs->request_alignment field that contains the required
offset/length alignment for I/O requests and fill it in the raw block
drivers. Use ioctls if possible, else see what alignment it takes for
O_DIRECT to succeed.

While at it, also expose the memory alignment requirements, which may be
(and in practice are) different from the disk alignment requirements.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-01-24 17:40:02 +01:00
Kevin Wolf 355ef4ac95 block: Update BlockLimits when they might have changed
When reopening with different flags, or when backing files disappear
from the chain, the limits may change. Make sure they get updated in
these cases.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Kevin Wolf d34682cd4a block: Move initialisation of BlockLimits to bdrv_refresh_limits()
This function separates filling the BlockLimits from bdrv_open(), which
allows it to call it from other operations which may change the limits
(e.g. modifications to the backing file chain or bdrv_reopen)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Jeff Cody 4da8358596 block: resize backing image during active layer commit, if needed
If the top image to commit is the active layer, and also larger than
the base image, then an I/O error will likely be returned during
block-commit.

For instance, if we have a base image with a virtual size 10G, and a
active layer image of size 20G, then committing the snapshot via
'block-commit' will likely fail.

This will automatically attempt to resize the base image, if the
active layer image to be committed is larger.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:12:49 +01:00
Peter Maydell 031fd1be56 block/curl: Implement the libcurl timer callback interface
libcurl versions 7.16.0 and later have a timer callback interface which
must be implemented in order for libcurl to make forward progress (it
will sometimes rely on being called back on the timeout if there are
no file descriptors registered). Implement the callback, and use a
QEMU AIO timer to ensure we prod libcurl again when it asks us to.

Based on Peter's original patch plus my fix to add curl_multi_timeout_do.
Should compile just fine even on older versions of libcurl.

I also tried copy-on-read and streaming:

    $ ./qemu-img create -f qcow2 -o \
         backing_file=http://download.fedoraproject.org/pub/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso \
         foo.qcow2 1G
    $ x86_64-softmmu/qemu-system-x86_64 \
         -drive if=none,file=foo.qcow2,copy-on-read=on,id=cd \
         -device ide-cd,drive=cd --enable-kvm -m 1024

Direct http usage is probably too slow, but with copy-on-read ultimately
the image does boot!

After some time, streaming gets canceled by an EIO, which needs further
investigation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:07:08 +01:00
Benoît Canet 212a5a8f09 block: Create authorizations mechanism for external snapshot and resize.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:07:08 +01:00
Benoît Canet c13163fba1 qmp: Add QMP query-named-block-nodes to list the named BlockDriverState nodes.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 16:07:08 +01:00
Fam Zheng c8059b97e1 qapi: Add "backing" to BlockStats
Currently there is no way to query BlockStats of the backing chain. This
adds "backing" field into BlockStats to make it possible.

The comment of "parent" is reworded.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:01 +01:00
Fam Zheng d8a7b061ae vmdk: Fix format specific information (create type) for streamOptimized
Previously the field is wrong:

    $ ./qemu-img create -f vmdk -o subformat=streamOptimized /tmp/a.vmdk 1G

    $ ./qemu-img info /tmp/a.vmdk
    image: /tmp/a.vmdk
    file format: vmdk
    virtual size: 1.0G (1073741824 bytes)
    disk size: 12K
    Format specific information:
        cid: 1390460459
        parent cid: 4294967295
>>>     create type: monolithicSparse
        <snip>

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Zhang Min 6df3bf8eb3 drive mirror:fix memory leak
In the function mirror_iteration() -> qemu_iovec_init(),
it allocates memory for op->qiov.iov, when the write request calls back,
but in the function mirror_iteration_done(), it only frees the op,
not free the op->qiov.iov, so this causes memory leak.

It should use qemu_iovec_destroy() to free op->qiov.

Signed-off-by: Zhang Min <rudy.zhangmin@huawei.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Liu Yuan 9cd767376f sheepdog: fix 'qemu-img map'
It was muted in the previous commit 4bc74be9. Let's revive it since nothing
prevents us to do it.

With this patch, following command will work as other formats:

$ qemu-img map sheepdog:image

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Fam Zheng 34ceed81f9 vmdk: Check for overhead when opening
Report an error if file size is even smaller than metadata.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Hu Tao 46bae92713 qcow2: fix wrong value of L1E_OFFSET_MASK, L2E_OFFSET_MASK and REFT_OFFSET_MASK
Accoring to qcow spec, the offset fields in l1e, l2e and ref table entry
start at bit 9. The offset is cluster offset, and the smallest possible
cluster size is 512 bytes.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-24 14:33:00 +01:00
Max Reitz 22511ad681 blkverify: Don't require protocol filename
If the filename is not prefixed by "blkverify:" in
blkverify_parse_filename(), the blkverify driver was not selected
through that protocol prefix, but by an explicit command line (or QMP)
option (like driver=blkverify).

If blkverify_parse_filename() has been called, a filename has been
given. If it is not prefixed, it is probably really just a plain
filename. This is no problem, since we can use it as the test image
filename and rely on the user to specify the raw image filename through
the new corresponding option.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:18 +01:00
Max Reitz 70b6198acc blkverify: Allow command-line configuration
Introduce the "test" and "raw" options for specifying images.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:18 +01:00
Max Reitz 4373593d51 blkdebug: Allow command-line file configuration
Introduce the "image" option as an alternative to specifying the image
through the filename.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:18 +01:00
Max Reitz 72daa72eee block: Allow reference for bdrv_file_open()
Allow specifying a reference to an existing block device (by name) for
bdrv_file_open() instead of a filename and/or options.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz 89f2b21e36 blkdebug: Use command-line in read_config()
Use qemu_config_parse_qdict() to parse the command-line options in
addition to the config file.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz 85a040e548 blkdebug: Always call read_config()
Move the check whether there actually is a config file into the
read_config() function.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz d4881b9bcb blkdebug: Don't require sophisticated filename
If the filename is not prefixed by "blkdebug:" in
blkdebug_parse_filename(), the blkdebug driver was not selected through
that protocol prefix, but by an explicit command line option
(file.driver=blkdebug or something similar). Contrary to the current
reaction, this is not a problem at all; we just need to store the
filename (in the x-image option) and can go on; the user just has to
manually specify the config option.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Max Reitz 466b49f276 blkdebug: Use errp for read_config()
Use an Error variable in the read_config() function.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Fam Zheng 585ea0c841 vmdk: Fix big flat extent IO
Local variable "n" as int64_t avoids overflow with large sector number
calculation. See test case change for failure case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Liu Yuan 9f23fce7b2 sheepdog: fix clone operation by 'qemu-img create -b'
We should pass base_inode->vdi_id to base_vdi_id of SheepdogVdiReq so that sheep
can create a clone instead a fresh volume.

This fixes following command:

qemu-create -b sheepdog:base sheepdog:clone

so users can boot sheepdog:clone as a normal volume.

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Bharata B Rao cf7f616b9d gluster: Add support for creating zero-filled image
GlusterFS supports creation of zero-filled file on GlusterFS volume
by means of an API called glfs_zerofill(). Use this API from QEMU to
create an image that is filled with zeroes by using the preallocation
option of qemu-img.

qemu-img create gluster://server/volume/image -o preallocation=full 10G

The allowed values for preallocation are 'full' and 'off'. By default
preallocation is off and image is not zero-filled.

glfs_zerofill() offloads the writing of zeroes to the server and if
the storage supports SCSI WRITESAME, GlusterFS server can issue
BLKZEROOUT ioctl to achieve the zeroing.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Bharata B Rao 7c815372f3 gluster: Implement .bdrv_co_write_zeroes for gluster
Support .bdrv_co_write_zeroes() from gluster driver by using GlusterFS API
glfs_zerofill() that off-loads the writing of zeroes to GlusterFS server.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Bharata B Rao 15744b0b8f gluster: Convert aio routines into coroutines
Convert the read, write, flush and discard implementations from aio-based
ones to coroutine based ones.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Peter Lieven 92397116a6 block/iscsi: return -ENOMEM if an async call fails immediately
if an async libiscsi call fails directly it can only be due
to an out of memory condition. All other errors are returned
through the callback.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Stefan Hajnoczi e04fb07fd1 rbd: switch from pipe to QEMUBH completion notification
rbd callbacks are called from non-QEMU threads.  Up until now a pipe was
used to signal completion back to the QEMU iothread.

The pipe writer code handles EAGAIN using select(2).  The select(2) API
is not scalable since fd_set size is static.  FD_SET() can write beyond
the end of fd_set if the file descriptor number is too high.  (QEMU's
main loop uses poll(2) to avoid this issue with select(2).)

Since the pipe itself is quite clumsy to use and QEMUBH is now
thread-safe, just schedule a BH from the rbd callback function.  This
way we can simplify I/O completion in addition to eliminating the
potential FD_SET() crash when file descriptor numbers become too high.

Crash scenario: QEMU already has 1024 file descriptors open.  Hotplug an
rbd drive and get the pipe writer to take the select(2) code path.

Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Tested-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:16 +01:00
Jeff Cody dc6afb99b3 block: add .bdrv_reopen_prepare() stub for iscsi
To suppport reopen(), the .bdrv_reopen_prepare() stub must exist.
iSCSI does not have anything that needs to be done to support reopen,
so we can just implement the _prepare() stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-15 10:44:52 +01:00
Edgar E. Iglesias 133fe77437 Merge remote branch 'luiz/queue/qmp' into qmpq
* luiz/queue/qmp:
  migration: qmp_migrate(): keep working after syntax error
  qerror: Remove assert_no_error()
  qemu-option: Remove qemu_opts_create_nofail
  target-i386: Remove assert_no_error usage
  hw: Remove assert_no_error usages
  qdev: Delete dead code
  error: Add error_abort
  monitor: add object-add (QMP) and object_add (HMP) command
  monitor: add object-del (QMP) and object_del (HMP) command
  qom: catch errors in object_property_add_child
  qom: fix leak for objects created with -object
  rng: initialize file descriptor to -1
  qemu-monitor: HMP cpu-add wrapper
  vl: add missing transition debug->finish_migrate

Message-Id: 1389045795-18706-1-git-send-email-lcapitulino@redhat.com
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2014-01-14 12:10:08 +10:00
Anthony Liguori eedc1a5db5 Merge remote-tracking branch 'bonzini/scsi-next' into staging
* bonzini/scsi-next:
  scsi-disk: add UNMAP limits to block limits VPD page
  block/iscsi: use a bh to schedule co reentrance

Message-id: 1387720926-11421-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2014-01-10 11:05:17 -08:00
Peter Crosthwaite 87ea75d5e1 qemu-option: Remove qemu_opts_create_nofail
This is a boiler-plate _nofail variant of qemu_opts_create. Remove and
use error_abort in call sites.

null/0 arguments needs to be added for the id and fail_if_exists fields
in affected callsites due to argument inconsistency between the normal and
no_fail variants.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-01-06 15:02:30 -05:00
Fam Zheng 18da7f94cd commit: Remove unused check
We support top == active for commit now, remove the check and add an
assertion here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng 20a63d2cec commit: Support commit active layer
If active is top, it will be mirrored to base, (with block/mirror.c
code), then the image is switched when user completes the block job.

QMP documentation is updated.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng 03544a6e9e block: Add commit_active_start()
commit_active_start is implemented in block/mirror.c, It will create a
job with "commit" type and designated base in block-commit command. This
will be used for committing active layer of device.

Sync mode is removed from MirrorBlockJob because there's no proper type
for commit. The used information is is_none_mode.

The common part of mirror_start and commit_active_start is moved to
mirror_start_job().

Fix the comment wording for commit_start.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng 5bc361b813 mirror: Move base to MirrorBlockJob
This allows setting the base before entering mirror_run, commit will
make use of it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng f95c625ce4 mirror: Don't close target
Let reference count manage target and don't call bdrv_close here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 16:26:16 +01:00
Fam Zheng 917703c179 vmdk: Allow vmdk_create to work with protocol
This improves vmdk_create to use bdrv_* functions to replace qemu_open
and other fd functions. The error handling are improved as well. One
difference is that bdrv_pwrite will round up buffer to sectors, so for
description file, an extra bdrv_truncate is used in the end to drop
inding zeros.

Notes:

 - A bonus bug fix is correct endian is used in initializing GD entries.

 - ROUND_UP and DIV_ROUND_UP are used where possible.

I tested that new code produces exactly the same file as previously.

Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 13:56:56 +01:00
Fam Zheng b47053bd03 vmdk: Check VMFS extent line field number
VMFS extent line in description file should be with 4 fields:

    RW <size> VMFS "file-name.vmdk"

Check the number explicitly and report error if offset is appended as
FLAT, which should be invalid format.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:58 +01:00
Jeff Cody 7e30e6a674 block: vhdx - improve error message, and .bdrv_check implementation
If there is a dirty log file to be replayed in a VHDX image, it is
replayed in .vhdx_open().  However, if the file is opened read-only,
then a somewhat cryptic error message results.

This adds a more helpful error message for the user.  If an image file
contains a log to be replayed, and is opened read-only, the user is
instructed to run 'qemu-img check -r all' on the image file.

Running qemu-img check -r all will cause the image file to be opened
r/w, which will replay the log file.  If a log file replay is detected,
this is flagged, and bdrv_check will increase the corruptions_fixed
count for the image.

[Fixed typo in error message that was pointed out by Eric Blake
<eblake@redhat.com>.
--Stefan]

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:58 +01:00
Stefan Weil 219c252193 block/iscsi: Fix compilation for libiscsi 1.4.0 (API change)
Function iscsi_read10_task got additional parameters starting with version
libiscsi 1.5.0.

libiscsi 1.4.0 is still widely used (Debian wheezy, jessie and other Linux
distributions currently provide packages for QEMU which use it), so we
still need support for this older API.

Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:58 +01:00
Liu Yuan e50d7607f1 sheepdog: fix dynamic grow for running qcow2 format
When running qcow2 over sheepdog, we might meet following problem

  qemu-system-x86_64: shrinking is not supported

And cause IO errors to Guest. This is because we abuse bs->total_sectors, which
is manipulated by generic block layer and race with sheepdog code.

We should directly check if offset > vdi_size to dynamically enlarge the volume
instead of 'offset > bs->total_sectors', which will cause problem when following
case happens:

   vdi_size > offset > bs->total_sectors

   # then trigger sd_truncate() to shrink the volume wrongly.

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Hadrien KOHL <hadrien.kohl@gmail.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-20 09:11:57 +01:00
Anthony Liguori b91f93243b Collection of little cleanups anf bugfixes.
nbd patches in preparation of spice-nbd.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJSrseRAAoJEEy22O7T6HE4vzIP/AwLTXZ1E73ZIF8t/e+1exYJ
 coQpvFkLgHeXvDPc/Ml5CJWFcUdEHMpprm3hIQgvsUyAujswSgZiO4Wn5vgMId+B
 fgjMX342j7P/lK27r8iOMCN7KZBMwh972DqTVzyzFdJAL9wgpUsN4Fq1vjQXCJiW
 jalLRS/xcqdRsu8ZNIvaLth+NgBOp7N0pgWOzYFPBJzgxGJw/pGTCG+rZaCJHyar
 F7T0K4CjHNGAGe255T4qzA5hOt6x9xDgd9zlDlWMzoqGtc0SkmomWyFWKx4fGv3x
 6WIZQjho17Sb7oe878OJUI6Ct5Bz1NzvE6WnaiQuedyM3CHO1/ynxqu5r68/vCDs
 fMMccnNCM/lUjmGrggi3PiRn1XeOBhH9ltEoehAqv/wTtgT7dUf97YV8I4zZAmUU
 0uKmEiyCHKktpPP8NXl+pAeSZVnI7LLDjIeQyVUjDx29G+GKuAxzYEH+m9ZW5BdQ
 DYDwvO0nm2WfpKlezUngffEPFOixEMJcsS+GeRUNLSjcA2eGErb2L3seGul8uLnM
 O1RBXZzTpFq34gmIUhgN5wYFKQu7jlO4tgNGbbG9j+CuEMVab/7eZEgOpT2iedRx
 sWvBtd2R0SB4/D3VEOsybjU3sgZVoIMlbreEUHxh6v5ntYkd81TlVi/Fuvp0Lrzm
 NGG74+6iyXSWdQfHemuo
 =J7aY
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'spice/tags/pull-spice-1' into staging

Collection of little cleanups anf bugfixes.
nbd patches in preparation of spice-nbd.

# gpg: Signature made Mon 16 Dec 2013 01:27:45 AM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found

# By Marc-André Lureau (12) and Gerd Hoffmann (4)
# Via Gerd Hoffmann
* spice/tags/pull-spice-1:
  spice: stop server for qxl hard reset
  spice: move spice_server_vm_{start,stop} calls into qemu_spice_display_*()
  spice: move qemu_spice_display_*() from spice-graphics to spice-core
  nbd: avoid uninitialized warnings
  nbd: finish any pending coroutine
  nbd: make nbd_client_session_close() idempotent
  nbd: pass export name as init argument
  nbd: don't change socket block during negotiate
  Split nbd block client code
  spice-char: implement chardev port event
  char: add qemu_chr_fe_event()
  include: add missing config-host.h include
  qmp_change_blockdev() remove unused has_format
  spice-char: remove unused field
  vscclient: do not add a socket watch if there is not data to send
  spice: flip streaming video mode to off by default
2013-12-16 09:44:13 -08:00
Anthony Liguori 80d6f5eae7 Block patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQIcBAABAgAGBQJSq0gXAAoJEH8JsnLIjy/WRosQAL+l9phOdbeyOkybR6fvt+i1
 ZuR8FQTS01xtnVvqeVEueVFrWvFiovBtS4N5ubie1O6CQGF/4HIrFTgRSeUzpPaG
 qYjMY9pmg2QXP4Kg24z+wMlVbn7uzK0Akrlerr47emLtMLrHdsLrs2m0DAlW8KZN
 6umGk+HQ5wPwmF8b4w3BGymOQk22oDCiTXmuBCzdZ+GpsTAhSr+XupvNxo8b+aFk
 kpphwulOPFntoHlOmcypgjVM8CfUUfjlkQZrZOXj5v63FPhwfeZ3DiVZIWGhOeJl
 qEaWRtoH2nELe67Py/vvGoITLdH2gS0l1JBkk2wLIqAjA0GiAXpZJrskop4NYAqm
 NvHaBHUPob5m3JVuh35hAMEbyFaIxS5teC6q1Pirjskiks0jvxYYlTiZw+RDkzqX
 gqBmBacyuSAJB9UlvlH3si3zSJ4MqN4feVTv8XWXoprq4+gC0xpyCtkZIb4ZrM9/
 oAmmYcEytf4Z2fqjPkmcD5lvLl48j3hYkd5zDwEy5TxWvyfqQSkkVx21AQMsqQvq
 PokiKFJUuyPfEGQgugWDsWM4FdXRhOrK9Q6479lAWP7PE9j9UTI24Z4JJVfLQz+z
 0m5IBuaQlBqWbN48twi4DSMTo5jFkxp/cKTXNGeV2cVDYyUuveAj1atTqUh8rZby
 95kteXfJhVGiaMQTniIl
 =XYe2
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'kwolf/tags/for-anthony' into staging

Block patches

# gpg: Signature made Fri 13 Dec 2013 09:47:03 AM PST using RSA key ID C88F2FD6
# gpg: Can't check signature: public key not found

# By Peter Lieven (2) and others
# Via Kevin Wolf
* kwolf/tags/for-anthony:
  blkdebug: Use QLIST_FOREACH_SAFE to resume IO
  qemu-img: make progress output more accurate during convert
  block: expect get_block_status errors in bdrv_make_zero
  block/vvfat: Fix compiler warnings for OpenBSD
  qapi-schema.json: Change 1.8 reference to 2.0
  sheepdog: check if '-o redundancy' is passed from user

Message-id: 1386956943-19474-1-git-send-email-kwolf@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-12-16 09:43:28 -08:00
Peter Lieven 8b9dfe9098 block/iscsi: use a bh to schedule co reentrance
this fixes a potential segfault and performance regression.

If the coroutine is reentered directly in the iscsi_co_generic_cb
iscsi_process_{read,write} are interrupted and reentered any
time later. One the one hand this could happen after an iscsi_close
where the iscsi context is already gone (segfault). On the
other hand this limits the number of processed callbacks
in each aio_dispatch to one (potential performance regression).

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-16 11:25:51 +01:00
Marc-André Lureau b1b27b6426 nbd: avoid uninitialized warnings
==15815== Thread 1:
==15815== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==15815==    at 0x65AD5CB: send (send.c:31)
==15815==    by 0x37F84B: nbd_wr_sync (nbd.c:145)
==15815==    by 0x37F94B: write_sync (nbd.c:186)
==15815==    by 0x380FA9: nbd_send_request (nbd.c:681)
==15815==    by 0x1C4A2D: nbd_teardown_connection (nbd-client.c:337)
==15815==    by 0x1C4AD8: nbd_client_session_close (nbd-client.c:354)
==15815==    by 0x1ED2D8: close_socketpair (spicebd.c:132)
==15815==    by 0x1EE265: spice_close (spicebd.c:457)
==15815==    by 0x1ACBF6: bdrv_close (block.c:1519)
==15815==    by 0x1AD804: bdrv_delete (block.c:1772)
==15815==    by 0x1B4136: bdrv_unref (block.c:4476)
==15815==    by 0x1ACCE0: bdrv_close (block.c:1541)
==15815==  Address 0x7feffef98 is on thread 1's stack

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau 69152c09d3 nbd: finish any pending coroutine
Make sure all pending coroutines are finished when closing the session.

Signed-off-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau 5ad283ebb8 nbd: make nbd_client_session_close() idempotent
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau e2bc625f9b nbd: pass export name as init argument
There is no need to keep the export name around, and it seems a better
fit as an argument in the init() call.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau e53a18e488 nbd: don't change socket block during negotiate
The caller might handle non-blocking using coroutine. Leave the choice
to the caller to use a blocking or non-blocking negotiate.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Marc-André Lureau 2302c1cafb Split nbd block client code
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2013-12-16 10:12:20 +01:00
Fam Zheng c547e5640d blkdebug: Use QLIST_FOREACH_SAFE to resume IO
Qemu-iotest 030 was broken.

When the coroutine runs and finishes, it will remove itself from the req
list, so let's use safe version of foreach to avoid use after free.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-13 17:11:19 +01:00
Stefan Weil f671d173c7 block/vvfat: Fix compiler warnings for OpenBSD
The buildbot shows these compiler warnings:

block/vvfat.c: In function 'create_short_and_long_name':
block/vvfat.c:620: warning: array size (8) smaller than bound length (11)
block/vvfat.c:620: warning: array size (8) smaller than bound length (11)
block/vvfat.c:635: warning: array size (8) smaller than bound length (11)
block/vvfat.c:635: warning: array size (8) smaller than bound length (11)

They are caused by tricky code where 8 characters for the name are followed
by 3 characters for the extension, and some operations touch both name and
extension.

Using an 11 character name which includes the extension fixes the compiler
warning, satisfies cppcheck, valgrind and maybe other static and dynamic
code checkers, and even simplifies some parts of the code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-13 14:49:50 +01:00
Liu Yuan a3120deee5 sheepdog: check if '-o redundancy' is passed from user
This fix a segfault (that is caused by b3af018f3) of following command:

$ qemu-img convert some_img sheepdog:some_img

Cc: qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-12-13 14:49:50 +01:00
Peter Lieven 063c3378a9 block/iscsi: introduce bdrv_co_{readv, writev, flush_to_disk}
this converts read, write and flush functions from aio to coroutines
eliminating almost 200 lines of code.

The requirement for libiscsi is bumped to version 1.4.0 which was
released in may 2012.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-09 11:28:16 +01:00
Hu Tao ac95acdb8e qcow2: use start_of_cluster() and offset_into_cluster() everywhere
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-06 16:53:50 +01:00
Peter Lieven 7572ddc8db block/iscsi: set bs->bl.opt_transfer_length
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05 11:45:24 +01:00
Peter Lieven 1c0704a556 block/iscsi: set bdi->cluster_size
this patch aims to set bdi->cluster_size to the internal page size
of the iscsi target so that enabled callers can align requests
properly.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-05 11:45:24 +01:00
Wenchao Xia 8c116b0e41 qemu-nbd: support internal snapshot export
Now it is possible to directly export an internal snapshot, which
can be used to probe the snapshot's contents without qemu-img
convert.

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 15:19:00 +01:00
Wenchao Xia 7b4c4781e3 snapshot: distinguish id and name in load_tmp
Since later this function will be used so improve it. The only caller of it
now is qemu-img, and it is not impacted by introduce function
bdrv_snapshot_load_tmp_by_id_or_name() that call bdrv_snapshot_load_tmp()
twice to keep old search logic. bdrv_snapshot_load_tmp_by_id_or_name() return
int to let caller know the errno, and errno will be used later.
Also fix a typo in comments of bdrv_snapshot_delete().

Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 15:19:00 +01:00
Kevin Wolf f8413b3c23 qcow2: Zero-initialise first cluster for new images
Strictly speaking, this is only required for has_zero_init() == false,
but it's easy enough to just do a cluster-aligned write that is padded
with zeros after the header.

This fixes that after 'qemu-img create' header extensions are attempted
to be parsed that are really just random leftover data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 11:29:37 +01:00
Paolo Bonzini 97a2ae3453 raw-posix: add support for write_zeroes on XFS and block devices
The code is similar to the implementation of discard and write_zeroes
with UNMAP.  However, failure must be propagated up to block.c.

The stale page cache problem can be reproduced as follows:

    # modprobe scsi-debug lbpws=1 lbprz=1
    # ./qemu-io /dev/sdXX
    qemu-io> write -P 0xcc 0 2M
    qemu-io> write -z 0 1M
    qemu-io> read -P 0x00 0 512
    Pattern verification failed at offset 0, 512 bytes
    qemu-io> read -v 0 512
    00000000:  cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc  ................
    ...

    # ./qemu-io --cache=none /dev/sdXX
    qemu-io> write -P 0xcc 0 2M
    qemu-io> write -z 0 1M
    qemu-io> read -P 0x00 0 512
    qemu-io> read -v 0 512
    00000000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    ...

And similarly with discard instead of "write -z".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini d0b4503ed2 raw-posix: implement write_zeroes with MAY_UNMAP for block devices
See the next commit for the description of the Linux kernel problem
that is worked around in raw_open_common.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini 260a82e524 raw-posix: implement write_zeroes with MAY_UNMAP for files
Writing zeroes to a file can be done by punching a hole if
MAY_UNMAP is set.

Note that in this case ENOTSUP is not ignored, but makes
the block layer fall back to the generic implementation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini fa6252b056 block/iscsi: check WRITE SAME support differently depending on MAY_UNMAP
The current check is right for MAY_UNMAP=1.  For MAY_UNMAP=0, just
try and fall back to regular writes as soon as a WRITE SAME command
fails.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Peter Lieven 2af8a1a704 block/iscsi: updated copyright
added myself to reflect recent work on the iscsi block driver.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Peter Lieven 4b52498e62 block/iscsi: remove .bdrv_has_zero_init
since commit 3ac21627 the default value changed to 0.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini cffb1ec600 block drivers: expose requirement for write same alignment from formats
This will let misaligned but large requests use zero clusters.  This
is important because the cluster size is not guest visible.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini 95de6d7078 block drivers: add discard/write_zeroes properties to bdrv_get_info implementation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini 97b00e2851 vpc, vhdx: add get_info
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini 7ce21016b6 block: handle ENOTSUP from discard in generic code
Similar to write_zeroes, let the generic code receive a ENOTSUP for
discard operations.  Since bdrv_discard has advisory semantics,
we can just swallow the error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Fam Zheng af057fe740 vmdk: Fix creating big description file
The buffer for description file was 4096 which only covers a few
hundred of extents. This changes the buffer to dynamic allocated with
g_strdup_printf in order to support bigger cases.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 09:54:16 +01:00
Fam Zheng 509d39aa22 vmdk: Allow read only open of VMDK version 3
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 17:41:14 +01:00
Kevin Wolf c9fbb99d41 block: Use BDRV_O_NO_BACKING where appropriate
If you open an image temporarily just because you want to check its size
or get it flushed, there's no real reason to open the whole backing file
chain.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2013-11-29 17:41:09 +01:00
Fam Zheng 4cc70e9337 blkdebug: add "remove_break" command
This adds "remove_break" command which is the reverse of blkdebug
command "break": it removes all breakpoints with given tag and resumes
all the requests.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:37 +01:00
Liu Yuan b3af018f3b sheepdog: support user-defined redundancy option
Sheepdog support two kinds of redundancy, full replication and erasure coding.

# create a fully replicated vdi with x copies
 -o redundancy=x (1 <= x <= SD_MAX_COPIES)

# create a erasure coded vdi with x data strips and y parity strips
 -o redundancy=x:y (x must be one of {2,4,8,16} and 1 <= y < SD_EC_MAX_STRIP)

E.g, to convert a vdi into sheepdog vdi 'test' with 8:3 erasure coding scheme

$ qemu-img convert -o redundancy=8:3 linux-0.2.img sheepdog:test

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:37 +01:00
Liu Yuan c31d482f29 sheepdog: refactor do_sd_create()
We can actually use BDRVSheepdogState *s to pass most of the parameters.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:37 +01:00
Charlie Shepherd 091b1108ca COW: Extend checking allocated bits to beyond one sector
cow_co_is_allocated() only checks one sector's worth of allocated bits
before returning. This is allowed but (slightly) inefficient, so extend
it to check all of the file's metadata sectors.

Signed-off-by: Charlie Shepherd <charlie@ctshepherd.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

[kwolf: silenced compiler warning (-Wmaybe-uninitialized for changed)]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:36 +01:00
Charlie Shepherd 14b98fdaf3 COW: Speed up writes
Process a whole sector's worth of COW bits by reading a sector, setting
the bits after skipping any already set bits, then writing it out again.
Make sure we only flush once before writing metadata, and only if we
need to write metadata.

Signed-off-by: Charlie Shepherd <charlie@ctshepherd.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-29 13:40:36 +01:00
Fam Zheng 21b5683508 qapi: Change BlockDirtyInfo to list
We have multiple dirty bitmaps in BDS now, switch QAPI to allow query
it (BlockInfo.dirty_bitmaps), and also drop old BlockInfo.dirty.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:36 +01:00
Fam Zheng e4654d2d94 block: per caller dirty bitmap
Previously a BlockDriverState has only one dirty bitmap, so only one
caller (e.g. a block job) can keep track of writing. This changes the
dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller, the
lifecycle is managed with these new functions:

    bdrv_create_dirty_bitmap
    bdrv_release_dirty_bitmap

Where BdrvDirtyBitmap is a linked list wrapper structure of HBitmap.

In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument
is added to these functions, since each caller has its own dirty bitmap:

    bdrv_get_dirty
    bdrv_dirty_iter_init
    bdrv_get_dirty_count

bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will
internally walk the list of all dirty bitmaps and set them one by one.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:33 +01:00
Max Reitz f4a193e717 block/stream: Don't stream unbacked devices
If a block device is unbacked, a streaming blockjob should immediately
finish instead of beginning to try to stream, then noticing the backing
file does not contain even the first sector (since it does not exist)
and then finishing normally.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-28 11:39:31 +01:00
Liu Yuan 8582972227 sheepdog: implement .bdrv_get_allocated_file_size
With this patch, qemu-img info sheepdog:image will show disk size for sheepdog
images.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven d4cd961507 iscsi: add bdrv_co_write_zeroes
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven 01a6a238a3 iscsi: simplify iscsi_co_discard
now that bdrv_co_discard can handle limits we do not need
the request split logic here anymore.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven ba6c59191f iscsi: set limits in BlockDriverState
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:52 +01:00
Peter Lieven 04f19e4d2d block/raw: copy BlockLimits on raw_open
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven 186d4f2b1d block/iscsi: add .bdrv_get_info
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven d32f35cbc5 block: introduce BDRV_REQ_MAY_UNMAP request flag
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven aa7bfbfff7 block: add flags to bdrv_*_write_zeroes
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven 78a52ad5ac qcow2: fix possible corruption when reading multiple clusters
if multiple sectors spanning multiple clusters are read the
function count_contiguous_clusters should ensure that the
cluster type should not change between the clusters.

Especially the for-loop should break when we have one
or more normal clusters followed by a compressed cluster.

Unfortunately the wrong macro was used in the mask to
compare the flags.

This was discovered while debugging a data corruption
issue when converting a compressed qcow2 image to raw.
qemu-img reads 2MB chunks which span multiple clusters.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-14 13:09:07 +01:00
Fam Zheng b04b6b6ec3 block: Print its file name if backing file opening failed
If backing file doesn't exist, the error message is confusing and
misleading:

    $ qemu /tmp/a.qcow2
    qemu: could not open disk image /tmp/a.qcow2: Could not open file: No
    such file or directory

But...

    $ ls /tmp/a.qcow2
    /tmp/a.qcow2

    $ qemu-img info /tmp/a.qcow2
    image: /tmp/a.qcow2
    file format: qcow2
    virtual size: 8.0G (8589934592 bytes)
    disk size: 196K
    cluster_size: 65536
    backing file: /tmp/b.qcow2

Because...

    $ ls /tmp/b.qcow2
    ls: cannot access /tmp/b.qcow2: No such file or directory

This is not intuitive. It's better to have the missing file's name in
the error message. With this patch:

    $ qemu-io -c 'read 0 512' /tmp/a.qcow2
    qemu-io: can't open device /tmp/a.qcow2: Could not open backing
    file: Could not open '/stor/vm/arch.raw': No such file or directory
    no file open, try 'help open'

Which is a little bit better.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-14 13:09:06 +01:00
Jeff Cody 3412f7b1bd block: vhdx - add .bdrv_create() support
This adds support for VHDX image creation, for images of type "Fixed"
and "Dynamic".  "Differencing" types (i.e., VHDX images with backing
files) are currently not supported.

Options for image creation include:
    * log size:
        The size of the journaling log for VHDX.  Minimum is 1MB,
        and it must be a multiple of 1MB. Invalid log sizes will be
        silently fixed by rounding up to the nearest MB.

        Default is 1MB.

    * block size:
        This is the size of a payload block.  The range is 1MB to 256MB,
        inclusive, and must be a multiple of 1MB as well.  Invalid sizes
        and multiples will be silently fixed.  If '0' is passed, then
        a sane size is chosen (depending on virtual image size).

        Default is 0 (Auto-select).

    * subformat:
        - "dynamic"
            An image without data pre-allocated.
        - "fixed"
            An image with data pre-allocated.

        Default is "dynamic"

When creating the image file, the lettered sections are created:

-----------------------------------------------------------------.
|   (A)    |   (B)    |    (C)    |     (D)       |     (E)
|  File ID |  Header1 |  Header 2 |  Region Tbl 1 |  Region Tbl 2
|          |          |           |               |
.-----------------------------------------------------------------.
0         64KB      128KB       192KB           256KB          320KB

.---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
|     (F)     |     (G)       |    (H)    |
| Journal Log |  BAT / Bitmap |  Metadata |  .... data ......
|             |               |           |
.---- ~ ----------- ~ ------------ ~ ---------------- ~ -----------.
1MB         (var.)          (var.)      (var.)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody 61c02e5687 block: vhdx - fix comment typos in header, fix incorrect struct fields
VHDXPage83Data and VHDXParentLocatorHeader both incorrectly had their
MSGUID fields set as arrays of 16.  This is incorrect (it stems from
an early version where those fields were uint_8 arrays).  Those fields
were, up to this patch, unused.

Also, there were a couple of typos and incorrect wording in comments,
and those have been fixed up as well.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody 1e74a971cb block: vhdx - break out code operations to functions
This is preperation for vhdx_create().  The ability to write headers,
and calculate the number of BAT entries will be needed within the
create() functions, so move this relevant code into helper functions.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody c325ee1de8 block: vhdx - move more endian translations to vhdx-endian.c
In preparation for vhdx_create(), move more endian translation
functions out to vhdx-endian.c.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody 0b7da092b4 block: vhdx - remove BAT file offset bit shifting
Bit shifting can be fun, but in this case it was unnecessary.  The
upper 44 bits of the 64-bit BAT entry is specifies the File Offset,
so we shifted the bits to get access to the value.

However, per the spec the value is in MB.  So we dutifully shifted back
to the left by 20 bits, to convert to a true uint64_t file offset.

This replaces those steps with just a bit mask, to get rid of the lower
20 bits instead.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00
Jeff Cody d92aa8833c block: vhdx write support
This adds support for writing to VHDX image files, using coroutines.
Writes into the BAT table goes through the VHDX log.  Currently, BAT
table writes occur when expanding a dynamic VHDX file, and allocating a
new BAT entry.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-07 13:58:59 +01:00