* [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs
@ 2011-02-19 16:19 Stefan Hajnoczi
2011-02-20 18:29 ` [Qemu-devel] [Bug 721825] " Stefan Weil
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2011-02-19 16:19 UTC (permalink / raw)
To: qemu-devel
Public bug reported:
Chunqiang Tang reports the following issues with the VDI block driver;
they are present in QEMU 0.14:
"Bug 1. The most serious bug is caused by a race condition in updating a
new bmap entry in memory and on disk. Consider the following operation
sequence.
O1: VM issues a write to sector X
O2: VDI allocates a new bmap entry and updates in-memory s->bmap
O3: VDI writes data to disk
O4: The disk I/O for writing sector X fails
O5: VDI reports error to VM and returns.
Note that the bmap entry is updated in memory, but not persisted on disk.
Now consider another write that immediately follows:
P1: VM issues a write to sector X+1, which is located in the same block as
the previously used sector X.
P2: s->bmap already has one entry for the block, and hence VDI writes
data directly without persisting the new s->bmap entry on disk.
P3: The write disk I/O succeeds
P4: VDI reports success to VM, but the bitmap entry is still not
persisted on disk.
Now suppose the VM powers off gracefully (i.e., the QEMU process quits)
and reboots. The second write to sector X+1, which is reported as finished
successfully, is simply lost, because the corresponding in-memory s->bmap
entry is never persisted on disk. This is exactly what FVD's testing tool
discovers. After the block device is closed and then re-opened, disk
content verification fails.
This is just one example of the problem. A race condition plus a host
crash also causes problems. Consider another example below.
Q1: VM issues a write to sector X
Q2: VDI allocates a new bmap entry and updates in-memory s->bmap
Q3: VDI writes sector X to disk and waits for the callback
Q4: VM issues a write to another sector X+1, which is in the same block
as sector X.
Q5: VDI sees the bitmap entry in s->bmap is already allocated, and
writes sector X+1 to disk.
Q6: Write to sector X+1 finishes, and VDI's callback is invoked.
Q7: VDI acknowledges to the VM the completion of writing sector X+1
Q8: After observing the completion of writing sector X+1, VM issues a
flush to ensure that sector X+1 is persisted on disk.
Q9: VDI finishes the flush and acknowledges the completion of the
operation.
Q10: ... (some other arbitrary operations, but the disk I/O for writing
sector X is still not finished....)
Q11: The host crashes
Now the new bitmap entry is not persisted on disk, while both the write to
sector X+1 and the flush have been acknowledged as finished. Sector X+1 is
lost, which is a corruption. This problem exists even if it uses O_DSYNC.
The root cause of the problem is that, if a request updates in-memory
s->bmap, another request that sees this update assumes that the update is
already persisted on disk, which it is not.
Bug 2: Similar to the bugs the FVD testing tool found for QCOW2, there are
several places where the failure handling path, as in the code below, does
not set an error return code, so a failure is mistakenly reported as
success. This mistake is caught by FVD when doing image content validation.
    if (acb->hd_aiocb == NULL) {
        /* missing ret = -EIO; */
        goto done;
    }
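The pattern described for Bug 2 can be illustrated with a minimal, self-contained sketch. It is not the code from block/vdi.c: struct toy_acb, submit_buggy() and submit_fixed() are hypothetical names invented here, and only the hd_aiocb field and the done label mirror the quoted snippet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-request state. */
struct toy_acb {
    void *hd_aiocb;   /* NULL models "the lower layer refused the request" */
};

/* Buggy shape: ret keeps its initial value 0 on the failure path,
 * so the caller sees success even though nothing was submitted. */
int submit_buggy(const struct toy_acb *acb)
{
    int ret = 0;

    if (acb->hd_aiocb == NULL) {
        goto done;
    }
    /* ... the request would be in flight here ... */
done:
    return ret;
}

/* Fixed shape: the failure path sets an error code before jumping. */
int submit_fixed(const struct toy_acb *acb)
{
    int ret = 0;

    if (acb->hd_aiocb == NULL) {
        ret = -EIO;
        goto done;
    }
    /* ... the request would be in flight here ... */
done:
    return ret;
}
```

With hd_aiocb set to NULL, submit_buggy() returns 0 while submit_fixed() returns -EIO, which is the difference a content-validation test would notice.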
Bug 3: Similar to the bugs the FVD testing tool found for QCOW2,
vdi_aio_cancel does not perform a complete cleanup and there are several
related bugs. First, the memory buffers acb->orig_buf and
acb->block_buffer are not freed. Second, acb->bh is not cancelled. Third,
vdi_aio_setup() does not initialize acb->bh to NULL, so when a request
acb is cancelled and later reused for another request, its acb->bh !=
NULL and the new request fails in vdi_schedule_bh(). This is caught by
FVD's testing tool, when it observes that no I/O failure is injected but
VDI reports a failed I/O request, which indicates a bug in the driver."
http://permalink.gmane.org/gmane.comp.emulators.qemu/94340
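The three cleanup problems listed under Bug 3 can be modelled in a small sketch. Everything here is a toy written for illustration: struct toy_acb, struct toy_bh, toy_setup(), toy_schedule_bh() and toy_cancel() are hypothetical, and the real driver uses QEMU's bottom-half API rather than calloc()/free(). Only the field names orig_buf, block_buffer and bh come from the report.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bottom half. */
struct toy_bh {
    int scheduled;
};

/* Hypothetical request control block with the fields named in the report. */
struct toy_acb {
    void *orig_buf;
    void *block_buffer;
    struct toy_bh *bh;
};

/* Setup initializes every pointer, including bh, so a reused acb
 * never carries stale state from an earlier request. */
void toy_setup(struct toy_acb *acb)
{
    acb->orig_buf = NULL;
    acb->block_buffer = NULL;
    acb->bh = NULL;
}

/* Models the check described in the report: scheduling fails if a
 * bottom half is already attached to the request. */
int toy_schedule_bh(struct toy_acb *acb)
{
    if (acb->bh != NULL) {
        return -EIO;
    }
    acb->bh = calloc(1, sizeof(*acb->bh));
    if (acb->bh == NULL) {
        return -ENOMEM;
    }
    acb->bh->scheduled = 1;
    return 0;
}

/* Cancel releases both buffers and the bottom half, then clears the
 * pointers so nothing dangles. free(NULL) is a no-op. */
void toy_cancel(struct toy_acb *acb)
{
    free(acb->orig_buf);
    acb->orig_buf = NULL;
    free(acb->block_buffer);
    acb->block_buffer = NULL;
    free(acb->bh);
    acb->bh = NULL;
}
```

In this model a cancelled request can be set up and scheduled again without a spurious -EIO, because both the cancel path and the setup path leave bh as NULL.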
** Affects: qemu
Importance: Undecided
Status: New
** Tags: block vdi
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/721825
Title:
VDI block driver bugs
Status in QEMU:
New
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
@ 2011-02-20 18:29 ` Stefan Weil
2017-05-19 19:36 ` Thomas Huth
` (5 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Stefan Weil @ 2011-02-20 18:29 UTC (permalink / raw)
To: qemu-devel
** Changed in: qemu
Status: New => Confirmed
** Changed in: qemu
Assignee: (unassigned) => Stefan Weil (ubuntu-weilnetz)
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
2011-02-20 18:29 ` [Qemu-devel] [Bug 721825] " Stefan Weil
@ 2017-05-19 19:36 ` Thomas Huth
2017-05-19 20:08 ` Stefan Hajnoczi
2017-05-19 23:25 ` Thomas Huth
` (4 subsequent siblings)
6 siblings, 1 reply; 9+ messages in thread
From: Thomas Huth @ 2017-05-19 19:36 UTC (permalink / raw)
To: qemu-devel
Is this still an issue with the latest version of QEMU, or could we
close this bug nowadays?
** Changed in: qemu
Status: Confirmed => Incomplete
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/721825/+subscriptions
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [Bug 721825] Re: VDI block driver bugs
2017-05-19 19:36 ` Thomas Huth
@ 2017-05-19 20:08 ` Stefan Hajnoczi
0 siblings, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-05-19 20:08 UTC (permalink / raw)
To: qemu-devel
On Fri, May 19, 2017 at 8:36 PM, Thomas Huth <721825@bugs.launchpad.net> wrote:
> Is this still an issue with the latest version of QEMU, or could we
> close this bug nowadays?
A quick check of block/vdi.c shows that error handling is still
lacking. Updates to in-memory data structures are not reverted if the
write to disk fails.
Let's leave this open in case someone is interested in fixing the bugs
sometime. VDI is not heavily used, and typically only in read-only mode,
so these bugs are not urgent.
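As a rough illustration of what "reverting" could mean here, the following sketch rolls back a newly allocated in-memory map entry when the write fails. It is a toy model, not block/vdi.c: struct toy_state, toy_init(), toy_write() and the write_ok flag (which simulates the outcome of the disk I/O) are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_UNALLOCATED UINT32_MAX
#define TOY_BLOCKS 4

/* Hypothetical driver state: bmap is the in-memory block map,
 * disk_bmap models what would be read back after a reopen. */
struct toy_state {
    uint32_t bmap[TOY_BLOCKS];
    uint32_t disk_bmap[TOY_BLOCKS];
    uint32_t next_free;
};

void toy_init(struct toy_state *s)
{
    for (int i = 0; i < TOY_BLOCKS; i++) {
        s->bmap[i] = TOY_UNALLOCATED;
        s->disk_bmap[i] = TOY_UNALLOCATED;
    }
    s->next_free = 0;
}

/* Write to a block; write_ok simulates whether the disk I/O
 * (data plus the new map entry) succeeds. */
int toy_write(struct toy_state *s, uint32_t block, int write_ok)
{
    int allocated_here = 0;

    if (s->bmap[block] == TOY_UNALLOCATED) {
        s->bmap[block] = s->next_free;   /* tentative in-memory update */
        allocated_here = 1;
    }

    if (!write_ok) {
        if (allocated_here) {
            /* Revert, so a later write does not assume the entry
             * was already persisted. */
            s->bmap[block] = TOY_UNALLOCATED;
        }
        return -EIO;
    }

    if (allocated_here) {
        s->disk_bmap[block] = s->bmap[block];   /* entry is now on disk */
        s->next_free++;
    }
    return 0;
}
```

This only covers the failed-write sequence (O1 to O5 followed by P1 to P4 in the report). The second sequence, where a concurrent write sees the entry while the first write is still in flight, would need additional ordering or tracking of pending allocations, which this sketch does not attempt.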
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
2011-02-20 18:29 ` [Qemu-devel] [Bug 721825] " Stefan Weil
2017-05-19 19:36 ` Thomas Huth
@ 2017-05-19 23:25 ` Thomas Huth
2018-08-31 4:35 ` Thomas Huth
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Thomas Huth @ 2017-05-19 23:25 UTC (permalink / raw)
To: qemu-devel
** Changed in: qemu
Status: Incomplete => Triaged
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
` (2 preceding siblings ...)
2017-05-19 23:25 ` Thomas Huth
@ 2018-08-31 4:35 ` Thomas Huth
2021-05-03 9:58 ` Thomas Huth
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Thomas Huth @ 2018-08-31 4:35 UTC (permalink / raw)
To: qemu-devel
** Changed in: qemu
Importance: Undecided => Low
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
` (3 preceding siblings ...)
2018-08-31 4:35 ` Thomas Huth
@ 2021-05-03 9:58 ` Thomas Huth
2021-05-26 14:46 ` Thomas Huth
2021-07-26 4:17 ` Launchpad Bug Tracker
6 siblings, 0 replies; 9+ messages in thread
From: Thomas Huth @ 2021-05-03 9:58 UTC (permalink / raw)
To: qemu-devel
Hi Stefan (Weil) - this bug has now been assigned to you for more than 10
years ... do you still plan to tackle it at some point? If not, I'd
suggest unassigning it. Also, it's been four years again since the
last comment ... should we maybe close this as "Won't Fix"?
** Changed in: qemu
Status: Triaged => Incomplete
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/721825
Title:
VDI block driver bugs
Status in QEMU:
Incomplete
Bug description:
Chunqiang Tang reports the following issues with the VDI block driver,
which are present in QEMU 0.14:
"Bug 1. The most serious bug is caused by a race condition in updating a
new bmap entry in memory and on disk. Consider the following operation
sequence.
O1: VM issues a write to sector X
O2: VDI allocates a new bmap entry and updates in-memory s->bmap
O3: VDI writes data to disk
O4: The disk I/O for writing sector X fails
O5: VDI reports error to VM and returns.
Note that the bmap entry is updated in memory, but not persisted on disk.
Now consider another write that immediately follows:
P1: VM issues a write to sector X+1, which is located in the same block as
the previously used sector X.
P2: s->bmap already has one entry for the block, and hence VDI writes
data directly without persisting the new s->bmap entry on disk.
P3: The write disk I/O succeeds
P4: VDI reports success to VM, but the bitmap entry is still not
persisted on disk.
Now suppose the VM powers off gracefully (i.e., the QEMU process quits)
and reboots. The second write to sector X+1, which is reported as finished
successfully, is simply lost, because the corresponding in-memory s->bmap
entry is never persisted on disk. This is exactly what FVD's testing tool
discovers. After the block device is closed and then re-opened, disk
content verification fails.
This is just one example of the problem. A race condition plus a host
crash also causes problems. Consider another example below.
Q1: VM issues a write to sector X
Q2: VDI allocates a new bmap entry and updates in-memory s->bmap
Q3: VDI writes sector X to disk and waits for the callback
Q4: VM issues a write to another sector X+1, which is in the same block
as sector X.
Q5: VDI sees the bitmap entry in s->bmap is already allocated, and
writes sector X+1 to disk.
Q6: Write to sector X+1 finishes, and VDI's callback is invoked.
Q7: VDI acknowledges to the VM the completion of writing sector X+1
Q8: After observing the completion of writing sector X+1, VM issues a
flush to ensure that sector X+1 is persisted on disk.
Q9: VDI finishes the flush and acknowledges the completion of the
operation.
Q10: ... (some other arbitrary operations, but the disk I/O for writing
sector X is still not finished....)
Q11: The host crashes
Now the new bitmap entry is not persisted on disk, while both the write to
sector X+1 and the flush have been acknowledged as finished. Sector X+1 is
lost, which is a corruption. This problem exists even if it uses O_DSYNC.
The root cause of the problem is that, if a request updates the in-memory
s->bmap, another request that sees this update assumes that the update is
already persisted on disk, which it is not.
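As an illustration only, the O1-O5 / P1-P4 sequence can be reproduced with a
toy model that keeps the in-memory map and the on-disk map as two separate
arrays. This is a sketch under stated assumptions, not the driver's code:
ToyImage, toy_open, toy_write, toy_reopen and toy_is_allocated are
hypothetical names, and only bmap_mem is meant to correspond to s->bmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNALLOCATED UINT32_MAX
#define TOY_BLOCKS 4

/* Hypothetical toy model of an image; not QEMU's driver state. */
typedef struct {
    uint32_t bmap_mem[TOY_BLOCKS];  /* in-memory map, plays the role of s->bmap */
    uint32_t bmap_disk[TOY_BLOCKS]; /* map entries that reached the image file */
    uint32_t next_free;             /* next unused physical block */
} ToyImage;

void toy_open(ToyImage *img)
{
    for (int i = 0; i < TOY_BLOCKS; i++) {
        img->bmap_mem[i] = TOY_UNALLOCATED;
        img->bmap_disk[i] = TOY_UNALLOCATED;
    }
    img->next_free = 0;
}

/*
 * Write one sector of logical block blk. data_io_ok stands in for the
 * outcome of the data write (step O4 or P3). Returns true when success
 * is reported to the guest.
 */
bool toy_write(ToyImage *img, uint32_t blk, bool data_io_ok)
{
    bool newly_allocated = false;

    assert(blk < TOY_BLOCKS);
    if (img->bmap_mem[blk] == TOY_UNALLOCATED) {
        /* O2: the in-memory entry is updated before any I/O completes. */
        img->bmap_mem[blk] = img->next_free++;
        newly_allocated = true;
    }
    if (!data_io_ok) {
        /* O5: error reported, but the in-memory entry is left in place. */
        return false;
    }
    if (newly_allocated) {
        /* Only the allocating request ever persists the map entry. */
        img->bmap_disk[blk] = img->bmap_mem[blk];
    }
    return true; /* P4: success, even if the entry was never persisted */
}

/* Close and re-open the image: only the on-disk map survives. */
void toy_reopen(ToyImage *img)
{
    for (int i = 0; i < TOY_BLOCKS; i++) {
        img->bmap_mem[i] = img->bmap_disk[i];
    }
}

bool toy_is_allocated(const ToyImage *img, uint32_t blk)
{
    return img->bmap_mem[blk] != TOY_UNALLOCATED;
}
```

Calling toy_write() for the same block first with a failing and then with a
succeeding data write, followed by toy_reopen(), leaves the block
unallocated even though the second write was reported as successful.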
Bug 2: Similar to the bugs the FVD testing tool found for QCOW2, there are
several places where a failure-handling path like the code below does not
set an error return code, so a failure is mistakenly reported as success.
This mistake is caught by FVD when doing image content validation.
    if (acb->hd_aiocb == NULL) {
        /* missing ret = -EIO; */
        goto done;
    }
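A minimal sketch of the pattern, assuming a simplified request type:
ToyRequest, toy_submit_buggy and toy_submit_fixed are hypothetical names
used only for illustration, and the only difference between the two
functions is the assignment that the snippet above marks as missing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not the driver's real AIOCB type. */
typedef struct {
    void *hd_aiocb; /* NULL means the lower layer refused the request */
} ToyRequest;

/* Failure path leaves ret at 0, so the caller sees success. */
int toy_submit_buggy(ToyRequest *acb)
{
    int ret = 0;

    if (acb->hd_aiocb == NULL) {
        goto done;
    }
    /* ... normal request processing would go here ... */
done:
    return ret;
}

/* Failure path sets a negative errno value before jumping to done. */
int toy_submit_fixed(ToyRequest *acb)
{
    int ret = 0;

    if (acb->hd_aiocb == NULL) {
        ret = -EIO;
        goto done;
    }
    /* ... normal request processing would go here ... */
done:
    return ret;
}
```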
Bug 3: Similar to the bugs the FVD testing tool found for QCOW2,
vdi_aio_cancel does not perform a complete cleanup, and there are several
related bugs. First, the memory buffers acb->orig_buf and
acb->block_buffer are not freed. Second, acb->bh is not cancelled. Third,
vdi_aio_setup() does not initialize acb->bh to NULL, so when a request
acb is cancelled and later reused for another request, its acb->bh !=
NULL and the new request fails in vdi_schedule_bh(). This is caught by
FVD's testing tool when it observes that no I/O failure was injected but
VDI reports a failed I/O request, which indicates a bug in the driver."
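The three cleanup points in Bug 3 could look roughly like the sketch below.
It is a toy under stated assumptions, not a patch: ToyAcb, toy_setup,
toy_schedule_bh and toy_cancel are hypothetical names, the "bottom half" is
just a heap allocation, and the non-NULL check in toy_schedule_bh only
mirrors the behaviour the report attributes to vdi_schedule_bh().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the request structure described in the report. */
typedef struct {
    void *orig_buf;     /* caller data buffer copy */
    void *block_buffer; /* scratch buffer for a newly allocated block */
    void *bh;           /* pending bottom half, NULL when none is scheduled */
} ToyAcb;

/* Reset every per-request field, including bh, before (re)using an acb. */
void toy_setup(ToyAcb *acb)
{
    acb->orig_buf = NULL;
    acb->block_buffer = NULL;
    acb->bh = NULL;
}

/* Refuse to schedule when a bottom half is already recorded. */
int toy_schedule_bh(ToyAcb *acb)
{
    if (acb->bh != NULL) {
        return -EIO;
    }
    acb->bh = malloc(1); /* placeholder for creating a real bottom half */
    return acb->bh != NULL ? 0 : -ENOMEM;
}

/* Complete cancel: release both buffers and drop the pending bottom half. */
void toy_cancel(ToyAcb *acb)
{
    free(acb->orig_buf);
    acb->orig_buf = NULL;
    free(acb->block_buffer);
    acb->block_buffer = NULL;
    free(acb->bh); /* placeholder for cancelling and deleting the bottom half */
    acb->bh = NULL;
}
```

With both the reset in toy_setup() and the cleanup in toy_cancel(), a
cancelled acb can be reused without toy_schedule_bh() failing on a stale
pointer.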
http://permalink.gmane.org/gmane.comp.emulators.qemu/94340
* [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
` (4 preceding siblings ...)
2021-05-03 9:58 ` Thomas Huth
@ 2021-05-26 14:46 ` Thomas Huth
2021-07-26 4:17 ` Launchpad Bug Tracker
6 siblings, 0 replies; 9+ messages in thread
From: Thomas Huth @ 2021-05-26 14:46 UTC (permalink / raw)
To: qemu-devel
** Changed in: qemu
Assignee: Stefan Weil (ubuntu-weilnetz) => (unassigned)
Title:
VDI block driver bugs
Status in QEMU:
Incomplete
* [Bug 721825] Re: VDI block driver bugs
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
` (5 preceding siblings ...)
2021-05-26 14:46 ` Thomas Huth
@ 2021-07-26 4:17 ` Launchpad Bug Tracker
6 siblings, 0 replies; 9+ messages in thread
From: Launchpad Bug Tracker @ 2021-07-26 4:17 UTC (permalink / raw)
To: qemu-devel
[Expired for QEMU because there has been no activity for 60 days.]
** Changed in: qemu
Status: Incomplete => Expired
Title:
VDI block driver bugs
Status in QEMU:
Expired
2011-02-19 16:19 [Qemu-devel] [Bug 721825] [NEW] VDI block driver bugs Stefan Hajnoczi
2011-02-20 18:29 ` [Qemu-devel] [Bug 721825] " Stefan Weil
2017-05-19 19:36 ` Thomas Huth
2017-05-19 20:08 ` Stefan Hajnoczi
2017-05-19 23:25 ` Thomas Huth
2018-08-31 4:35 ` Thomas Huth
2021-05-03 9:58 ` Thomas Huth
2021-05-26 14:46 ` Thomas Huth
2021-07-26 4:17 ` Launchpad Bug Tracker