| Commit message (Collapse) | Author | Age | Files | Lines |
|
Don't try to be clever by freeing all temporary data and calling all callbacks
when the return value (an error) is certain. Doing so has at least two
important problems:
* The temporary data that is freed (qiov, possibly zero buffer) is still used
by the requests that have not yet completed.
* Calling the callbacks for all requests in the multiwrite means for the caller
that it may free buffers etc. which are still in use.
Just remember the error value and do the cleanup when all requests have
completed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit de189a1b4a471d37a2909e97646654fc9751b52f)
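The approach described above (remember the error, clean up only once the last request has completed) can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the `Batch` struct, its fields and `sub_request_done()` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a multiwrite batch; not QEMU's real structure. */
typedef struct Batch {
    int pending;     /* sub-requests still in flight */
    int error;       /* first error seen, 0 if none */
    int completed;   /* set once batch-level completion has run */
    void *tmp_buf;   /* temporary data shared by the sub-requests */
} Batch;

/* Completion handler for a single sub-request of the batch. */
static void sub_request_done(Batch *b, int ret)
{
    assert(b->pending > 0);

    /* Only remember the error; other requests may still use tmp_buf. */
    if (ret < 0 && b->error == 0) {
        b->error = ret;
    }

    /* Clean up exactly once, after the last request has finished. */
    if (--b->pending == 0) {
        free(b->tmp_buf);
        b->tmp_buf = NULL;
        b->completed = 1;   /* the caller's callbacks would run here */
    }
}
```

Because cleanup is tied to the pending counter reaching zero, a second failing request cannot trigger it again.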
|
bdrv_aio_writev may call the callback immediately (and it will commonly do so
in error cases). Current code doesn't consider this. For details see the
comment added by this patch.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 453f9a1652629e5805995b165be2e634c8487139)
Conflicts:
block.c
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
|
Add new functions that write and flush the written data to disk immediately.
This is what needs to be used for image format metadata to maintain integrity
for cache=... modes that don't use O_DSYNC. (Actually, we only need barriers,
and therefore the functions are defined as such, but flushes are what is
implemented in this patch; we can try to change that later.)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f08145fe16470aca09304099888f68cfbc5d1de7)
|
With overlapping requests, the total number of sectors is smaller than the sum
of the nb_sectors of both requests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cbf1dff2f1033cadcb15c0ffc9c0a3d039d8ed42)
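As a small worked illustration of the statement above: when two requests overlap, the merged length is the distance from the first start to the furthest end, not the sum of the two lengths. The helper name `merged_nb_sectors()` is hypothetical, and the sketch assumes the requests are already sorted by start sector.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of sectors covered by two requests [s1, s1+n1) and [s2, s2+n2).
 * Assumes s1 <= s2 and that the second request starts no later than the
 * end of the first (i.e. they overlap or are directly adjacent).
 */
static int64_t merged_nb_sectors(int64_t s1, int n1, int64_t s2, int n2)
{
    assert(s1 <= s2 && s2 <= s1 + n1);

    int64_t end1 = s1 + n1;
    int64_t end2 = s2 + n2;
    int64_t end = end1 > end2 ? end1 : end2;

    return end - s1;
}
```

For example, requests (0, 8) and (4, 8) cover 12 sectors in total, whereas the naive sum would give 16.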
|
The difference between the start sectors of two requests can be larger
than what the "int" type can hold, which can lead to an incorrectly
sorted multiwrite array and thus spurious I/O errors and filesystem
corruption due to incorrect request merges.
So instead of doing the cute sector arithmetic trick, spell out the
exact comparisons.
Spotted by Kevin Wolf based on a testcase from Michael Tokarev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 77be4366baface6613cfc312ba281f8e5860997c)
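A comparator with the comparisons spelled out might look like the sketch below. The `Req` struct and function names are hypothetical and only illustrate the idea; returning `(int)(a - b)` instead would truncate a 64-bit difference and could report the wrong order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical request type for this example: only the start sector matters. */
typedef struct Req {
    int64_t sector;
} Req;

/* qsort() comparator using explicit comparisons instead of subtraction. */
static int req_compare(const void *a, const void *b)
{
    const Req *ra = a;
    const Req *rb = b;

    if (ra->sector > rb->sector) {
        return 1;
    }
    if (ra->sector < rb->sector) {
        return -1;
    }
    return 0;
}

/* Sort an array of requests by ascending start sector. */
static void sort_requests(Req *reqs, size_t n)
{
    assert(reqs != NULL || n == 0);
    qsort(reqs, n, sizeof(*reqs), req_compare);
}
```

With start sectors that are more than 2^31 apart, this still yields a correctly ordered array.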
|
A new iovec array is allocated when creating a merged write request.
This patch ensures that the iovec array is deleted in addition to its
qiov owner.
Reported-by: Leszek Urbanski <tygrys@moo.pl>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 1e1ea48d42e011b9bdd0d689d184e7cac4617b66)
|
Previously multiwrite_user_cb was never called if a request in the multiwrite
batch failed right away because it did set mcb->error immediately. Make it look
more like a normal callback to fix this.
Reported-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 7eb58a6c556c3880e6712cbf6d24d681261c5095)
|
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 0f0b604b00851f2c7160b4195136c1fd27418088)
|
When two requests of the same multiwrite batch failed, the callbacks of all
requests in that batch were called twice. This could have all kinds of nasty
effects; in my case it led to a use after free and eventually a segfault.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit cb6d3ca07b8f62b47ef30c6a92caa3e8bd71248b)
|
If we go over the maximum number of iovecs supported by the syscall we get
back EINVAL from the kernel, which translates to I/O errors for the guest.
Add a MAX_IOV definition for platforms that don't have it. For now we use
the same 1024 define that's used on Linux and various other platforms,
but until the Windows block backend implements some kind of vectored I/O
it doesn't matter.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit e2a305fb13ff0f5cf6ff805555aaa90a5ed5954c)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
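A fallback definition plus a merge guard could be sketched as follows. POSIX names this limit `IOV_MAX` in `<limits.h>`, so that is the macro used here; the helper `can_merge_iovs()` is a hypothetical name for this example, not a function from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Fallback for platforms whose headers do not provide the limit. */
#ifndef IOV_MAX
#define IOV_MAX 1024
#endif

/*
 * Returns non-zero if two requests with n1 and n2 iovec entries can be
 * merged without exceeding the per-syscall iovec limit.
 */
static int can_merge_iovs(int n1, int n2)
{
    assert(n1 >= 0 && n2 >= 0);
    /* Written as a subtraction so that n1 + n2 cannot overflow. */
    return n1 <= IOV_MAX - n2;
}
```

Checking before merging keeps the combined request within what the kernel accepts, so the guest never sees the EINVAL-induced I/O error.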
|
Don't assume -EIO but return the real error.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 9a8c4cceaf670193270995b95378faa3867db999)
|
Win32 suffers from a very big memory leak when dealing with SCSI devices.
Each read/write request allocates memory with qemu_memalign (i.e.
VirtualAlloc) but frees it with qemu_free (i.e. free).
Pair all qemu_memalign() calls with qemu_vfree() to prevent such leaks.
Signed-off-by: Herve Poussineau <hpoussin@reactos.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit f8a83245d9ec685bc6aa6173d6765fe03e20688f)
|
The statistics for each device are stored in a QDict and
the returned QObject is a QList of all devices.
This commit should not change user output.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 218a536a7a7c6d3679d5eca0103f32fd11fbfaf0)
|
The information for each block device is stored in a QDict and the
returned QObject is a QList of all devices.
This commit should not change user output.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit d15e546567d75fca36d852c39e30adaab02121a7)
|
This switches the dirty bitmap to a true bitmap, reducing its footprint
(specifically in caches). It moreover fixes off-by-one bugs in
set_dirty_bitmap (nb_sectors+1 were marked) and bdrv_get_dirty (limit
check allowed one sector behind end of drive). It also drops the redundant
dirty_tracking field from BlockDriverState.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
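To illustrate the two off-by-one issues mentioned above, here is a minimal sketch of a one-bit-per-unit dirty bitmap. The function names and the flat "unit index" granularity are assumptions made for this example; the actual patch may track dirtiness at a different granularity.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define BITS_PER_LONG ((int64_t)(sizeof(unsigned long) * CHAR_BIT))

/* Mark exactly nb units starting at first; the end index is exclusive. */
static void set_dirty_range(unsigned long *bitmap, int64_t first, int nb,
                            int dirty)
{
    assert(first >= 0 && nb >= 0);

    int64_t end = first + nb;   /* exclusive, so nb units and not nb + 1 */
    for (int64_t i = first; i < end; i++) {
        unsigned long mask = 1UL << (i % BITS_PER_LONG);
        if (dirty) {
            bitmap[i / BITS_PER_LONG] |= mask;
        } else {
            bitmap[i / BITS_PER_LONG] &= ~mask;
        }
    }
}

/* Query one unit; indices at or beyond total are reported as clean. */
static int get_dirty(const unsigned long *bitmap, int64_t total, int64_t idx)
{
    if (idx < 0 || idx >= total) {   /* '>=' rejects the unit past the end */
        return 0;
    }
    return (int)((bitmap[idx / BITS_PER_LONG] >> (idx % BITS_PER_LONG)) & 1);
}
```

Using an exclusive end in the setter and `>=` in the getter keeps both operations within the tracked range.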
|
Instead of duplicating the definition of constants or introducing
trivial retrieval functions move the SECTOR constants into the public
block API. This also obsoletes sector_per_block in BlkMigState.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
No functional changes.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
To support live migration without shared storage we need to be able to track
writes to disk while migrating. This patch exposes per-device dirty block
tracking so that it can be polled from the upper layer.
Changes from v4:
- Register dirty tracking for each block device.
- Minor coding style issues.
- Block.c will now manage a dirty bitmap per device once
bdrv_set_dirty_tracking() is called. Bitmap is polled by the upper
layer (block-migration.c).
Signed-off-by: Liran Schour <lirans@il.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
We have code for quite a few block formats. While I trust that all
of these formats are useful at least for some people in some
circumstances, some of them are of a kind that friends don't let
friends use in production.
This patch provides an optional block format whitelist, default off.
If a whitelist is configured with --block-drv-whitelist, QEMU proper
can use only whitelisted formats. Other programs, like qemu-img, are
not affected.
Drivers for formats off the whitelist still participate in format
probing, to ensure all programs probe exactly the same. Without that,
QEMU proper would be prone to treat images with a format off the
whitelist as raw when the image's format is probed.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
This is a slightly revised patch for adding a readonly flag to the -drive command.
Even though this patch is "stand-alone", it assumes that a previous related patch (in Anthony's staging tree), which passes
the readonly attribute of the drive to the guest OS, is applied first.
This enables sharing the same image between guests, with readonly access.
The implementation marks the drive as read_only and changes the flags when actually opening the file.
The readonly attribute of a qcow image is also passed to its base file.
For ide, which cannot pass the readonly attribute to the guest OS, disallow the readonly flag.
Also, return an error code from bdrv_truncate for a readonly drive.
Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
bdrv_read/write emulation is used as the perfect example why we need something
like AsyncContexts. So maybe they better start using it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Problem: Our file sys-queue.h is a copy of the BSD file, but there are
some additions and it's not entirely compatible. Because of that, there have
been conflicts with system headers on BSD systems. Some hacks have been
introduced in the commits 15cc9235840a22c289edbe064a9b3c19c5f49896,
f40d753718c72693c5f520f0d9899f6e50395e94,
96555a96d724016e13190b28cffa3bc929ac60dc and
3990d09adf4463eca200ad964cc55643c33feb50 but the fixes were fragile.
Solution: Avoid the conflict entirely by renaming the functions and the
file. Revert the previous hacks.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
Instead of stalling the VCPU while serving a cache flush, try to do it
asynchronously. Use our good old helper thread pool to issue an
asynchronous fdatasync for raw-posix. Note that while Linux AIO
implements a fdatasync operation it is not useful for us because
it isn't actually implemented in an asynchronous fashion.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Add an enable_write_cache flag in the block driver state, and use it to
decide if we claim to have a volatile write cache that needs controlled
flushing from the guest. The flag is off if cache=writethrough is
defined because O_DSYNC guarantees that every write goes to stable
storage, and it is on for cache=none and cache=writeback.
Both scsi-disk and ide now use the new flag, changing from their
defaults of always off (ide) or always on (scsi-disk).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
One performance problem of qcow2 during the initial image growth is
sequential writes that are not cluster aligned. In this case, when a first
request needs to allocate a new cluster but writes only to the first
couple of sectors in that cluster, the rest of the cluster is zeroed, just
to be overwritten by the following second request that fills up the cluster.
Let's try to merge sequential write requests to the same cluster, so we can
avoid writing the zero padding to the disk in the first place.
As a nice side effect, other formats also take advantage of dealing with fewer
and larger requests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Now that we have a nicer interface to work against we can add Linux native
AIO support. It's an extremely thin layer just setting up an iocb for
the io_submit system call in the submission path, and registering an
eventfd with the qemu poll handler to complete the iocbs directly
from there.
This started out based on Anthony's earlier AIO patch, but after
an estimated 42,000 rewrites and just as many build system changes
there's not much left of it.
To enable native kernel aio use the aio=native sub-command on the
drive command line. I have also added an option to qemu-io to
test the aio support without needing a guest.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
The VM state offset is a concept internal to the image format. Replace
the old bdrv_{get,put}_buffer methods, which require an index into the
image file that is constructed from the VM state offset and an offset
into the vmstate, with the bdrv_{load,save}_vmstate methods, which just
take an offset into the VM state.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Commit 6a7ad299 ("Call qemu_bh_delete at bdrv_aio_bh_cb") deletes emulated
aio bottom halves to prevent endless accumulation. However, it leaves a
stale ->bh pointer, which is then waited on when the aio is reused.
Zeroing the pointer fixes the issue, allowing vmdk format images to be used.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
This reverts commit 707c0dbc97cddfe8d2441b8259c6c526d99f2dd8.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Fix missing strnlen (a GNU extension) problems by using qemu_strnlen,
which is already used for user emulators, for system emulators as well.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
|
Problem: It is impossible to pass filenames containing a colon character because
qemu interprets such names as a protocol. For example, the filename scsi:0 is
interpreted as a protocol named "scsi".
This patch allows the user to escape colon characters. For example, the above
filename can now be expressed either as 'scsi\:0' or as file:scsi:0.
Anything following the "file:" tag is interpreted verbatim. However, if the "file:"
tag is omitted then any colon characters in the string must be escaped using
a backslash.
Here are a couple of examples:
scsi\:0\:abc is a local file scsi:0:abc
http\://myweb is a local file by name http://myweb
file:scsi:0:abc is a local file scsi:0:abc
file:http://myweb is a local file by name http://myweb
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
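The escaping rules described above can be sketched as a small parser. This is an illustration written for this log, not the patch's code: `parse_local_filename()` and its return convention are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 and stores the local path in out if name denotes a local file,
 * 0 if name contains an unescaped colon (i.e. it names a protocol),
 * and -1 if out is too small.
 */
static int parse_local_filename(const char *name, char *out, size_t outsz)
{
    size_t n = 0;

    assert(name != NULL && out != NULL);
    if (outsz == 0) {
        return -1;
    }

    /* Everything after the "file:" tag is taken verbatim. */
    if (strncmp(name, "file:", 5) == 0) {
        name += 5;
        if (strlen(name) >= outsz) {
            return -1;
        }
        strcpy(out, name);
        return 1;
    }

    for (; *name; name++) {
        if (name[0] == '\\' && name[1] == ':') {
            name++;              /* drop the backslash, keep the colon */
        } else if (*name == ':') {
            return 0;            /* unescaped colon: protocol prefix */
        }
        if (n + 1 >= outsz) {
            return -1;
        }
        out[n++] = *name;
    }
    out[n] = '\0';
    return 1;
}
```

Note that in C source the escaped form has to be written with a doubled backslash, for example `"scsi\\:0\\:abc"`.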
|
Section 10.8.25 ("START/STOP UNIT Command") of SFF-8020i states that
we should refuse to eject if the device is locked.
ASC_MEDIA_REMOVAL_PREVENTED is the appropriate return in this case.
In order to stop itself from ejecting the media it is running from,
Fedora's installer (anaconda) requires the CDROMEJECT ioctl() to fail
if the drive has been previously locked.
See also https://bugzilla.redhat.com/501412
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Also replace qemu_bh_cancel with qemu_bh_delete in bdrv_aio_cancel_em.
Otherwise the bh will live forever in the bh list.
Signed-off-by: Dor Laor <dor@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Add a bdrv_probe_device method to all BlockDriver instances implementing
host devices to move matching of host device types into the actual drivers.
For now we keep exactly the old matching behaviour based on the device names,
although we really should have better detection methods based on device
information in the future.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Instead of declaring one BlockDriver for all host devices, declare one
for each type: a generic one for normal disk devices, a Linux floppy
driver and a CDROM driver for Linux and FreeBSD. This gets rid of a lot
of messy ifdefs and switching based on the type in the various removable
device methods.
block.c grows a new method to find the correct host device driver based
on OS-specific criteria, which will move into the actual drivers in a
later patch in this series.
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
Now that we have a separate aio pool structure we can remove those
aio pool details from BlockDriver.
Every driver supporting AIO now needs to declare a static AIOPool
with the aiocb size and the cancellation method. This cleans up the
current code considerably and will make it cleaner and more obvious
to support two different aio implementations behind a single
BlockDriver.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
This patch converts the remaining users of bdrv_create2 to bdrv_create and
removes the now unused function.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Now we can make use of the newly introduced option structures. Instead of
having bdrv_create carry more and more parameters (which are format specific in
most cases), just pass an option structure as defined by the driver itself.
bdrv_create2() contains an emulation of the old interface to simplify the
transition.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
|
This patch makes the range checks for block requests more strict: It fixes a
potential integer overflow and checks for negative offsets. Also, it adds the
check for compressed writes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
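A range check of the kind described above might be sketched like this. The function name, the 512-byte `SECTOR_SIZE` and the `-EIO` return value are assumptions for the example; the point is that the comparison is done by subtraction so that `sector_num + nb_sectors` is never computed and cannot overflow.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/*
 * Validate that [sector_num, sector_num + nb_sectors) lies within a device
 * of total_bytes bytes. Returns 0 if the request is valid, -EIO otherwise.
 */
static int check_request(int64_t total_bytes, int64_t sector_num,
                         int nb_sectors)
{
    assert(total_bytes >= 0);

    int64_t total_sectors = total_bytes / SECTOR_SIZE;

    /* Reject negative offsets and lengths outright. */
    if (sector_num < 0 || nb_sectors < 0) {
        return -EIO;
    }
    if (sector_num > total_sectors) {
        return -EIO;
    }
    /* Compare against the remaining space instead of adding the operands. */
    if (nb_sectors > total_sectors - sector_num) {
        return -EIO;
    }
    return 0;
}
```

The same check can be applied to every request type, including compressed writes, by calling it at the top of each entry point.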
|
This patch adds a buffer_alignment field to BlockDriverState and
implements a qemu_blockalign function that uses that field to allocate a
memory aligned buffer to be used by the block driver.
buffer_alignment is initialized to 512 but each block driver can set
a different value (at the moment none of them do).
This patch modifies ide.c, block-qcow.c, block-qcow2.c and block.c to
use qemu_blockalign instead of qemu_memalign.
There is only one place left that still uses qemu_memalign to allocate
buffers used by block drivers, namely posix-aio-compat:handle_aiocb_rw,
because it is not possible to get the BlockDriverState from that
function. However, I think this is not important because posix-aio-compat
already deals with driver-specific code, so it is supposed to know its
own needs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7229 c046a42c-6fe2-441c-8c8c-71466251a162
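The idea of a per-device alignment that the allocator consults can be sketched with POSIX `posix_memalign()`. The `Dev` struct and `dev_blockalign()` are hypothetical stand-ins for BlockDriverState and qemu_blockalign, and the fallback to 512 when no device is given is an assumption of this example.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device state; only the alignment matters. */
typedef struct Dev {
    size_t buffer_alignment;   /* e.g. 512; must be a power of two */
} Dev;

/*
 * Allocate a buffer aligned to the device's requirement, or to 512 bytes
 * when no device is given. Returns NULL on failure; release with free().
 */
static void *dev_blockalign(const Dev *dev, size_t size)
{
    void *p = NULL;
    size_t align = dev ? dev->buffer_alignment : 512;

    /* posix_memalign() needs a power of two that is a multiple of
     * sizeof(void *). */
    assert(align >= sizeof(void *) && (align & (align - 1)) == 0);

    if (posix_memalign(&p, align, size) != 0) {
        return NULL;
    }
    return p;
}
```

Routing all buffer allocations through one helper means a driver only has to change its alignment field to affect every caller.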
|
From: Kevin Wolf <kwolf@redhat.com>
Introduce a new bdrv_check function pointer for block drivers. Modify qcow2 to
return an error status in check_refcounts(), so it can implement bdrv_check.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7214 c046a42c-6fe2-441c-8c8c-71466251a162
|
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7103 c046a42c-6fe2-441c-8c8c-71466251a162
|
This ties up the preadv/pwritev syscalls to qemu if they are declared in
unistd.h. This is the case currently on at least NetBSD and OpenBSD and
will hopefully soon be the case on Linux.
Thanks to Blue Swirl and Gerd Hoffmann for the configure autodetection
of preadv/pwritev.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7021 c046a42c-6fe2-441c-8c8c-71466251a162
|
Make all AIO requests vectored and defer linearization until the actual
I/O thread. This prepares for using native preadv/pwritev.
Also enables asynchronous direct I/O by handling that case in the I/O thread.
Qcow and qcow2 probably want to be adapted to deal directly with multi-segment
requests, but that can be implemented later.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7020 c046a42c-6fe2-441c-8c8c-71466251a162
|
Always use the vectored APIs to reduce code churn once we switch the BlockDriver
API to be vectored.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7019 c046a42c-6fe2-441c-8c8c-71466251a162
|
We now enforce that you cannot write beyond the end of a non-growable file.
qcow2 files are not growable but we rely on them being growable to do
savevm/loadvm. Temporarily allow them to be growable by introducing a new
API specifically for savevm read/write operations.
Reported-by: malc
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6994 c046a42c-6fe2-441c-8c8c-71466251a162
|
All the bdrv_ helpers should check for bs->drv being zero as that means
there is no backend image open. bdrv_flush fails to perform that check
and can thus cause NULL pointer dereferences.
Found using qemu-io.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6943 c046a42c-6fe2-441c-8c8c-71466251a162
|