* Re: [Qemu-devel] [PATCH] backup: block-job error BUG
[not found] <1468876979-13537-1-git-send-email-vsementsov@virtuozzo.com>
@ 2016-07-18 21:26 ` Vladimir Sementsov-Ogievskiy
2016-07-19 2:12 ` Fam Zheng
0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-07-18 21:26 UTC (permalink / raw)
To: qemu-block; +Cc: stefanha, famz, mreitz, jcody, kwolf, den, qemu-devel
forget qemu-devel@nongnu.org.. add it.
On 19.07.2016 00:22, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
>
> This is a variant of existing test case which produces test failure.
>
> It looks like the reason is:
>
> one block job is in backup_complete, in synchronous bdrv_flush (success job)
> other (job with injected io err) tries to synchronously cancel "success job"
> It looks like some kind of dead-lock
>
> -..........
> +........EEE
> +======================================================================
> +ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
> +Test: Verify backups made from a transaction that partially fails.
> +----------------------------------------------------------------------
> +Traceback (most recent call last):
> + File "124", line 478, in test_transaction_failure
> + self.wait_qmp_backup_cancelled(drive0['id'])
> + File "124", line 173, in wait_qmp_backup_cancelled
> + match={'data': {'device': device}})
> + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 308, in event_wait
> + event = self._qmp.pull_event(wait=timeout)
> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 194, in pull_event
> + self.__get_events(wait)
> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 109, in __get_events
> + raise QMPTimeoutError("Timeout waiting for event")
> +QMPTimeoutError: Timeout waiting for event
> +
> +======================================================================
> +ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
> +Test: Verify backups made from a transaction that partially fails.
> +----------------------------------------------------------------------
> +Traceback (most recent call last):
> + File "124", line 272, in tearDown
> + self.vm.shutdown()
> + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 260, in shutdown
> + self._qmp.cmd('quit')
> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 172, in cmd
> + return self.cmd_obj(qmp_cmd)
> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 157, in cmd_obj
> + return self.__json_read()
> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 66, in __json_read
> + data = self.__sockfile.readline()
> + File "/usr/lib64/python2.7/socket.py", line 447, in readline
> + data = self._sock.recv(self._rbufsize)
> +timeout: timed out
> +
> +======================================================================
> +ERROR: test_incremental_failure (__main__.TestIncrementalBackupBlkdebug)
> +Test: Verify backups made after a failure are correct.
> +----------------------------------------------------------------------
> +Traceback (most recent call last):
> + File "124", line 549, in setUp
> + self.vm.launch()
> + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 246, in launch
> + self._qmp = qmp.QEMUMonitorProtocol(self._monitor_path, server=True)
> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 44, in __init__
> + self.__sock.bind(self.__address)
> + File "/usr/lib64/python2.7/socket.py", line 224, in meth
> + return getattr(self._sock,name)(*args)
> +error: [Errno 98] Address already in use
> +
> ----------------------------------------------------------------------
> Ran 10 tests
>
> -OK
> +FAILED (errors=3)
> Failures: 124
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> tests/qemu-iotests/124 | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
> index de7cdbe..74de117 100644
> --- a/tests/qemu-iotests/124
> +++ b/tests/qemu-iotests/124
> @@ -448,9 +448,9 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
> self.assertFalse(self.vm.get_qmp_events(wait=False))
>
> # Emulate some writes
> - self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
> - ('0xfe', '16M', '256k'),
> - ('0x64', '32736k', '64k')))
> + #self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
> + # ('0xfe', '16M', '256k'),
> + # ('0x64', '32736k', '64k')))
> self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
> ('0xef', '16M', '256k'),
> ('0x46', '32736k', '64k')))
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Qemu-devel] [PATCH] backup: block-job error BUG
2016-07-18 21:26 ` [Qemu-devel] [PATCH] backup: block-job error BUG Vladimir Sementsov-Ogievskiy
@ 2016-07-19 2:12 ` Fam Zheng
2016-07-19 11:58 ` Vladimir Sementsov-Ogievskiy
0 siblings, 1 reply; 3+ messages in thread
From: Fam Zheng @ 2016-07-19 2:12 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy
Cc: qemu-block, kwolf, jcody, qemu-devel, mreitz, stefanha, den
On Tue, 07/19 00:26, Vladimir Sementsov-Ogievskiy wrote:
> forget qemu-devel@nongnu.org.. add it.
>
> On 19.07.2016 00:22, Vladimir Sementsov-Ogievskiy wrote:
> > Hi all!
> >
> > This is a variant of existing test case which produces test failure.
> >
> > It looks like the reason is:
> >
> > one block job is in backup_complete, in synchronous bdrv_flush (success job)
> > other (job with injected io err) tries to synchronously cancel "success job"
> > It looks like some kind of dead-lock
> >
> > -..........
> > +........EEE
> > +======================================================================
> > +ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
> > +Test: Verify backups made from a transaction that partially fails.
> > +----------------------------------------------------------------------
> > +Traceback (most recent call last):
> > + File "124", line 478, in test_transaction_failure
> > + self.wait_qmp_backup_cancelled(drive0['id'])
> > + File "124", line 173, in wait_qmp_backup_cancelled
> > + match={'data': {'device': device}})
> > + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 308, in event_wait
> > + event = self._qmp.pull_event(wait=timeout)
> > + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 194, in pull_event
> > + self.__get_events(wait)
> > + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 109, in __get_events
> > + raise QMPTimeoutError("Timeout waiting for event")
> > +QMPTimeoutError: Timeout waiting for event
> > +
> > +======================================================================
> > +ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
> > +Test: Verify backups made from a transaction that partially fails.
> > +----------------------------------------------------------------------
> > +Traceback (most recent call last):
> > + File "124", line 272, in tearDown
> > + self.vm.shutdown()
> > + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 260, in shutdown
> > + self._qmp.cmd('quit')
> > + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 172, in cmd
> > + return self.cmd_obj(qmp_cmd)
> > + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 157, in cmd_obj
> > + return self.__json_read()
> > + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 66, in __json_read
> > + data = self.__sockfile.readline()
> > + File "/usr/lib64/python2.7/socket.py", line 447, in readline
> > + data = self._sock.recv(self._rbufsize)
> > +timeout: timed out
> > +
> > +======================================================================
> > +ERROR: test_incremental_failure (__main__.TestIncrementalBackupBlkdebug)
> > +Test: Verify backups made after a failure are correct.
> > +----------------------------------------------------------------------
> > +Traceback (most recent call last):
> > + File "124", line 549, in setUp
> > + self.vm.launch()
> > + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 246, in launch
> > + self._qmp = qmp.QEMUMonitorProtocol(self._monitor_path, server=True)
> > + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 44, in __init__
> > + self.__sock.bind(self.__address)
> > + File "/usr/lib64/python2.7/socket.py", line 224, in meth
> > + return getattr(self._sock,name)(*args)
> > +error: [Errno 98] Address already in use
> > +
> > ----------------------------------------------------------------------
> > Ran 10 tests
> >
> > -OK
> > +FAILED (errors=3)
> > Failures: 124
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > ---
> > tests/qemu-iotests/124 | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
> > index de7cdbe..74de117 100644
> > --- a/tests/qemu-iotests/124
> > +++ b/tests/qemu-iotests/124
> > @@ -448,9 +448,9 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
> > self.assertFalse(self.vm.get_qmp_events(wait=False))
> > # Emulate some writes
> > - self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
> > - ('0xfe', '16M', '256k'),
> > - ('0x64', '32736k', '64k')))
> > + #self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
> > + # ('0xfe', '16M', '256k'),
> > + # ('0x64', '32736k', '64k')))
This patch doesn't make any sense to me.
If this is a deadlock in QEMU, shouldn't we fix the code instead of randomly
commenting out a few lines in the test case?
I cannot reproduce this on master. Could you post a backtrace where the block
job is stopped?
Fam
> > self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
> > ('0xef', '16M', '256k'),
> > ('0x46', '32736k', '64k')))
>
>
> --
> Best regards,
> Vladimir
>
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [Qemu-devel] [PATCH] backup: block-job error BUG
2016-07-19 2:12 ` Fam Zheng
@ 2016-07-19 11:58 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2016-07-19 11:58 UTC (permalink / raw)
To: Fam Zheng; +Cc: qemu-block, kwolf, jcody, qemu-devel, mreitz, stefanha, den
On 19.07.2016 05:12, Fam Zheng wrote:
> On Tue, 07/19 00:26, Vladimir Sementsov-Ogievskiy wrote:
>> forget qemu-devel@nongnu.org.. add it.
>>
>> On 19.07.2016 00:22, Vladimir Sementsov-Ogievskiy wrote:
>>> Hi all!
>>>
>>> This is a variant of existing test case which produces test failure.
>>>
>>> It looks like the reason is:
>>>
>>> one block job is in backup_complete, in synchronous bdrv_flush (success job)
>>> other (job with injected io err) tries to synchronously cancel "success job"
>>> It looks like some kind of dead-lock
>>>
>>> -..........
>>> +........EEE
>>> +======================================================================
>>> +ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
>>> +Test: Verify backups made from a transaction that partially fails.
>>> +----------------------------------------------------------------------
>>> +Traceback (most recent call last):
>>> + File "124", line 478, in test_transaction_failure
>>> + self.wait_qmp_backup_cancelled(drive0['id'])
>>> + File "124", line 173, in wait_qmp_backup_cancelled
>>> + match={'data': {'device': device}})
>>> + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 308, in event_wait
>>> + event = self._qmp.pull_event(wait=timeout)
>>> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 194, in pull_event
>>> + self.__get_events(wait)
>>> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 109, in __get_events
>>> + raise QMPTimeoutError("Timeout waiting for event")
>>> +QMPTimeoutError: Timeout waiting for event
>>> +
>>> +======================================================================
>>> +ERROR: test_transaction_failure (__main__.TestIncrementalBackup)
>>> +Test: Verify backups made from a transaction that partially fails.
>>> +----------------------------------------------------------------------
>>> +Traceback (most recent call last):
>>> + File "124", line 272, in tearDown
>>> + self.vm.shutdown()
>>> + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 260, in shutdown
>>> + self._qmp.cmd('quit')
>>> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 172, in cmd
>>> + return self.cmd_obj(qmp_cmd)
>>> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 157, in cmd_obj
>>> + return self.__json_read()
>>> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 66, in __json_read
>>> + data = self.__sockfile.readline()
>>> + File "/usr/lib64/python2.7/socket.py", line 447, in readline
>>> + data = self._sock.recv(self._rbufsize)
>>> +timeout: timed out
>>> +
>>> +======================================================================
>>> +ERROR: test_incremental_failure (__main__.TestIncrementalBackupBlkdebug)
>>> +Test: Verify backups made after a failure are correct.
>>> +----------------------------------------------------------------------
>>> +Traceback (most recent call last):
>>> + File "124", line 549, in setUp
>>> + self.vm.launch()
>>> + File "/work/src/qemu/tests/qemu-iotests/iotests.py", line 246, in launch
>>> + self._qmp = qmp.QEMUMonitorProtocol(self._monitor_path, server=True)
>>> + File "/work/src/qemu/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 44, in __init__
>>> + self.__sock.bind(self.__address)
>>> + File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>> + return getattr(self._sock,name)(*args)
>>> +error: [Errno 98] Address already in use
>>> +
>>> ----------------------------------------------------------------------
>>> Ran 10 tests
>>>
>>> -OK
>>> +FAILED (errors=3)
>>> Failures: 124
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>> tests/qemu-iotests/124 | 6 +++---
>>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
>>> index de7cdbe..74de117 100644
>>> --- a/tests/qemu-iotests/124
>>> +++ b/tests/qemu-iotests/124
>>> @@ -448,9 +448,9 @@ class TestIncrementalBackup(TestIncrementalBackupBase):
>>> self.assertFalse(self.vm.get_qmp_events(wait=False))
>>> # Emulate some writes
>>> - self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
>>> - ('0xfe', '16M', '256k'),
>>> - ('0x64', '32736k', '64k')))
>>> + #self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
>>> + # ('0xfe', '16M', '256k'),
>>> + # ('0x64', '32736k', '64k')))
> This patch doesn't make any sense to me.
>
> If this is a deadlock in QEMU, shouldn't we fix the code instead of randomly
> commenting out a few lines in the test case?
>
> I cannot reproduce this on master. Could you post a backtrace where the block
> job is stopped?
>
> Fam
This patch is not for commiting, but only to reproduce the bug.
with --enable-debug:
bt:
Thread 3 (Thread 0x7ff6e5ac7700 (LWP 3300)):
#0 0x00007ff6eb7da469 in syscall () from /lib64/libc.so.6
#1 0x00007ff6f33f6845 in futex_wait (ev=0x7ff6f3e844a4
<rcu_call_ready_event>, val=4294967295)
at util/qemu-thread-posix.c:292
#2 0x00007ff6f33f69ae in qemu_event_wait (ev=0x7ff6f3e844a4
<rcu_call_ready_event>)
at util/qemu-thread-posix.c:399
#3 0x00007ff6f340bf80 in call_rcu_thread (opaque=0x0) at util/rcu.c:250
#4 0x00007ff6ee67fdc5 in start_thread () from /lib64/libpthread.so.0
#5 0x00007ff6eb7dfced in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7ff6e43bd700 (LWP 3302)):
#0 0x00007ff6ee686e91 in sigwait () from /lib64/libpthread.so.0
#1 0x00007ff6f2f9aace in qemu_dummy_cpu_thread_fn (arg=0x7ff6f42c2250)
at /work/src/qemu/cpus.c:1123
#2 0x00007ff6ee67fdc5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007ff6eb7dfced in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7ff6f2d34c00 (LWP 3299)):
#0 rfifolock_lock (r=0x7ff6f4280e00) at util/rfifolock.c:66
#1 0x00007ff6f3319dd8 in aio_context_acquire (ctx=0x7ff6f4280da0) at
async.c:373
#2 0x00007ff6f33236ed in block_job_completed_txn_abort
(job=0x7ff6f59e91c0) at blockjob.c:240
#3 0x00007ff6f332399d in block_job_completed (job=0x7ff6f59e91c0,
ret=-5) at blockjob.c:295
#4 0x00007ff6f3106776 in backup_complete (job=0x7ff6f59e91c0,
opaque=0x7ff6f6058660)
at block/backup.c:288
#5 0x00007ff6f3324389 in block_job_defer_to_main_loop_bh
(opaque=0x7ff6f62073d0) at blockjob.c:608
#6 0x00007ff6f3319477 in aio_bh_call (bh=0x7ff6f42f86d0) at async.c:66
#7 0x00007ff6f3319539 in aio_bh_poll (ctx=0x7ff6f4280da0) at async.c:94
#8 0x00007ff6f3326ae9 in aio_dispatch (ctx=0x7ff6f4280da0) at
aio-posix.c:308
#9 0x00007ff6f3319907 in aio_ctx_dispatch (source=0x7ff6f4280da0,
callback=0x0, user_data=0x0)
at async.c:233
#10 0x00007ff6ec51d7aa in g_main_context_dispatch () from
/lib64/libglib-2.0.so.0
#11 0x00007ff6f3324b19 in glib_pollfds_poll () at main-loop.c:213
#12 0x00007ff6f3324bf6 in os_host_main_loop_wait (timeout=0) at
main-loop.c:258
#13 0x00007ff6f3324ca6 in main_loop_wait (nonblocking=1) at main-loop.c:506
#14 0x00007ff6f30d3faa in main_loop () at vl.c:1908
#15 0x00007ff6f30db97c in main (argc=16, argv=0x7ffec0b0e3b8,
envp=0x7ffec0b0e440) at vl.c:4601
>
>>> self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
>>> ('0xef', '16M', '256k'),
>>> ('0x46', '32736k', '64k')))
>>
>> --
>> Best regards,
>> Vladimir
>>
>>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-07-19 11:59 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <1468876979-13537-1-git-send-email-vsementsov@virtuozzo.com>
2016-07-18 21:26 ` [Qemu-devel] [PATCH] backup: block-job error BUG Vladimir Sementsov-Ogievskiy
2016-07-19 2:12 ` Fam Zheng
2016-07-19 11:58 ` Vladimir Sementsov-Ogievskiy
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.