All of lore.kernel.org
 help / color / mirror / Atom feed
From: dongli.zhang@oracle.com
To: Julien Grall <julien@xen.org>,
	xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org
Cc: jgross@suse.com, boris.ostrovsky@oracle.com,
	sstabellini@kernel.org, joe.jin@oracle.com
Subject: Re: [Xen-devel] [PATCH v2 1/2] xenbus: req->body should be updated before req->state
Date: Tue, 3 Mar 2020 12:36:11 -0800	[thread overview]
Message-ID: <89e891be-2572-fdbf-d627-d1b71284e50d@oracle.com> (raw)
In-Reply-To: <4ed129f9-ff23-f228-6833-77e37c2bb7b2@xen.org>



On 3/3/20 11:37 AM, Julien Grall wrote:
> Hi,
> 
> On 03/03/2020 18:47, Dongli Zhang wrote:
>> The req->body should be updated before req->state is updated and the
>> order should be guaranteed by a barrier.
>>
>> Otherwise, read_reply() might return req->body = NULL.
>>
>> Below is sample callstack when the issue is reproduced on purpose by
>> reordering the updates of req->body and req->state and adding delay in
>> code between updates of req->state and req->body.
>>
>> [   22.356105] general protection fault: 0000 [#1] SMP PTI
>> [   22.361185] CPU: 2 PID: 52 Comm: xenwatch Not tainted 5.5.0xen+ #6
>> [   22.366727] Hardware name: Xen HVM domU, BIOS ...
>> [   22.372245] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60
>> ... ...
>> [   22.392163] RSP: 0018:ffffb2d64023fdf0 EFLAGS: 00010246
>> [   22.395933] RAX: 0000000000000000 RBX: 75746e7562755f6d RCX: 0000000000000000
>> [   22.400871] RDX: 0000000000000000 RSI: ffffb2d64023fdfc RDI: 75746e7562755f6d
>> [   22.405874] RBP: 0000000000000000 R08: 00000000000001e8 R09: 0000000000cdcdcd
>> [   22.410945] R10: ffffb2d6402ffe00 R11: ffff9d95395eaeb0 R12: ffff9d9535935000
>> [   22.417613] R13: ffff9d9526d4a000 R14: ffff9d9526f4f340 R15: ffff9d9537654000
>> [   22.423726] FS:  0000000000000000(0000) GS:ffff9d953bc80000(0000)
>> knlGS:0000000000000000
>> [   22.429898] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   22.434342] CR2: 000000c4206a9000 CR3: 00000001ea3fc002 CR4: 00000000001606e0
>> [   22.439645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   22.444941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   22.450342] Call Trace:
>> [   22.452509]  simple_strtoull+0x27/0x70
>> [   22.455572]  xenbus_transaction_start+0x31/0x50
>> [   22.459104]  netback_changed+0x76c/0xcc1 [xen_netfront]
>> [   22.463279]  ? find_watch+0x40/0x40
>> [   22.466156]  xenwatch_thread+0xb4/0x150
>> [   22.469309]  ? wait_woken+0x80/0x80
>> [   22.472198]  kthread+0x10e/0x130
>> [   22.474925]  ? kthread_park+0x80/0x80
>> [   22.477946]  ret_from_fork+0x35/0x40
>> [   22.480968] Modules linked in: xen_kbdfront xen_fbfront(+) xen_netfront
>> xen_blkfront
>> [   22.486783] ---[ end trace a9222030a747c3f7 ]---
>> [   22.490424] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60
>>
>> The barrier() in test_reply() is changed to virt_rmb(). The "while" is
>> changed to "do while" so that test_reply() is used as a read memory
>> barrier.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>> Changed since v1:
>>    - change "barrier()" to "virt_rmb()" in test_reply()
>>
>>   drivers/xen/xenbus/xenbus_comms.c |  2 ++
>>   drivers/xen/xenbus/xenbus_xs.c    | 11 +++++++----
>>   2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/xen/xenbus/xenbus_comms.c
>> b/drivers/xen/xenbus/xenbus_comms.c
>> index d239fc3c5e3d..852ed161fc2a 100644
>> --- a/drivers/xen/xenbus/xenbus_comms.c
>> +++ b/drivers/xen/xenbus/xenbus_comms.c
>> @@ -313,6 +313,8 @@ static int process_msg(void)
>>               req->msg.type = state.msg.type;
>>               req->msg.len = state.msg.len;
>>               req->body = state.body;
>> +            /* write body, then update state */
>> +            virt_wmb();
>>               req->state = xb_req_state_got_reply;
>>               req->cb(req);
>>           } else
>> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
>> index ddc18da61834..1e14c2118861 100644
>> --- a/drivers/xen/xenbus/xenbus_xs.c
>> +++ b/drivers/xen/xenbus/xenbus_xs.c
>> @@ -194,15 +194,18 @@ static bool test_reply(struct xb_req_data *req)
>>       if (req->state == xb_req_state_got_reply || !xenbus_ok())
>>           return true;
>>   -    /* Make sure to reread req->state each time. */
>> -    barrier();
>> +    /*
>> +     * read req->state before other fields of struct xb_req_data
>> +     * in the caller of test_reply(), e.g., read_reply()
>> +     */
>> +    virt_rmb();
> 
> Looking at the code again, I am afraid the barrier only happen in the false
> case. Should not the new barrier added in the 'true' case?

I would leave the original "barrier()" in the 'false' case and add the new
barrier only in the 'true' case.

Thank you very much!

Dongli Zhang

WARNING: multiple messages have this Message-ID (diff)
From: dongli.zhang@oracle.com
To: Julien Grall <julien@xen.org>,
	xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org
Cc: jgross@suse.com, boris.ostrovsky@oracle.com,
	sstabellini@kernel.org, joe.jin@oracle.com
Subject: Re: [Xen-devel] [PATCH v2 1/2] xenbus: req->body should be updated before req->state
Date: Tue, 3 Mar 2020 12:36:11 -0800	[thread overview]
Message-ID: <89e891be-2572-fdbf-d627-d1b71284e50d@oracle.com> (raw)
In-Reply-To: <4ed129f9-ff23-f228-6833-77e37c2bb7b2@xen.org>



On 3/3/20 11:37 AM, Julien Grall wrote:
> Hi,
> 
> On 03/03/2020 18:47, Dongli Zhang wrote:
>> The req->body should be updated before req->state is updated and the
>> order should be guaranteed by a barrier.
>>
>> Otherwise, read_reply() might return req->body = NULL.
>>
>> Below is sample callstack when the issue is reproduced on purpose by
>> reordering the updates of req->body and req->state and adding delay in
>> code between updates of req->state and req->body.
>>
>> [   22.356105] general protection fault: 0000 [#1] SMP PTI
>> [   22.361185] CPU: 2 PID: 52 Comm: xenwatch Not tainted 5.5.0xen+ #6
>> [   22.366727] Hardware name: Xen HVM domU, BIOS ...
>> [   22.372245] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60
>> ... ...
>> [   22.392163] RSP: 0018:ffffb2d64023fdf0 EFLAGS: 00010246
>> [   22.395933] RAX: 0000000000000000 RBX: 75746e7562755f6d RCX: 0000000000000000
>> [   22.400871] RDX: 0000000000000000 RSI: ffffb2d64023fdfc RDI: 75746e7562755f6d
>> [   22.405874] RBP: 0000000000000000 R08: 00000000000001e8 R09: 0000000000cdcdcd
>> [   22.410945] R10: ffffb2d6402ffe00 R11: ffff9d95395eaeb0 R12: ffff9d9535935000
>> [   22.417613] R13: ffff9d9526d4a000 R14: ffff9d9526f4f340 R15: ffff9d9537654000
>> [   22.423726] FS:  0000000000000000(0000) GS:ffff9d953bc80000(0000)
>> knlGS:0000000000000000
>> [   22.429898] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   22.434342] CR2: 000000c4206a9000 CR3: 00000001ea3fc002 CR4: 00000000001606e0
>> [   22.439645] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   22.444941] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   22.450342] Call Trace:
>> [   22.452509]  simple_strtoull+0x27/0x70
>> [   22.455572]  xenbus_transaction_start+0x31/0x50
>> [   22.459104]  netback_changed+0x76c/0xcc1 [xen_netfront]
>> [   22.463279]  ? find_watch+0x40/0x40
>> [   22.466156]  xenwatch_thread+0xb4/0x150
>> [   22.469309]  ? wait_woken+0x80/0x80
>> [   22.472198]  kthread+0x10e/0x130
>> [   22.474925]  ? kthread_park+0x80/0x80
>> [   22.477946]  ret_from_fork+0x35/0x40
>> [   22.480968] Modules linked in: xen_kbdfront xen_fbfront(+) xen_netfront
>> xen_blkfront
>> [   22.486783] ---[ end trace a9222030a747c3f7 ]---
>> [   22.490424] RIP: 0010:_parse_integer_fixup_radix+0x6/0x60
>>
>> The barrier() in test_reply() is changed to virt_rmb(). The "while" is
>> changed to "do while" so that test_reply() is used as a read memory
>> barrier.
>>
>> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
>> ---
>> Changed since v1:
>>    - change "barrier()" to "virt_rmb()" in test_reply()
>>
>>   drivers/xen/xenbus/xenbus_comms.c |  2 ++
>>   drivers/xen/xenbus/xenbus_xs.c    | 11 +++++++----
>>   2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/xen/xenbus/xenbus_comms.c
>> b/drivers/xen/xenbus/xenbus_comms.c
>> index d239fc3c5e3d..852ed161fc2a 100644
>> --- a/drivers/xen/xenbus/xenbus_comms.c
>> +++ b/drivers/xen/xenbus/xenbus_comms.c
>> @@ -313,6 +313,8 @@ static int process_msg(void)
>>               req->msg.type = state.msg.type;
>>               req->msg.len = state.msg.len;
>>               req->body = state.body;
>> +            /* write body, then update state */
>> +            virt_wmb();
>>               req->state = xb_req_state_got_reply;
>>               req->cb(req);
>>           } else
>> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
>> index ddc18da61834..1e14c2118861 100644
>> --- a/drivers/xen/xenbus/xenbus_xs.c
>> +++ b/drivers/xen/xenbus/xenbus_xs.c
>> @@ -194,15 +194,18 @@ static bool test_reply(struct xb_req_data *req)
>>       if (req->state == xb_req_state_got_reply || !xenbus_ok())
>>           return true;
>>   -    /* Make sure to reread req->state each time. */
>> -    barrier();
>> +    /*
>> +     * read req->state before other fields of struct xb_req_data
>> +     * in the caller of test_reply(), e.g., read_reply()
>> +     */
>> +    virt_rmb();
> 
> Looking at the code again, I am afraid the barrier only happen in the false
> case. Should not the new barrier added in the 'true' case?

I would leave the original "barrier()" in the 'false' case and add the new
barrier only in the 'true' case.

Thank you very much!

Dongli Zhang

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  reply	other threads:[~2020-03-03 20:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-03 18:47 [PATCH v2 1/2] xenbus: req->body should be updated before req->state Dongli Zhang
2020-03-03 18:47 ` [Xen-devel] " Dongli Zhang
2020-03-03 18:47 ` [PATCH v2 2/2] xenbus: req->err " Dongli Zhang
2020-03-03 18:47   ` [Xen-devel] " Dongli Zhang
2020-03-03 19:37 ` [PATCH v2 1/2] xenbus: req->body " Julien Grall
2020-03-03 19:37   ` [Xen-devel] " Julien Grall
2020-03-03 20:36   ` dongli.zhang [this message]
2020-03-03 20:36     ` dongli.zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=89e891be-2572-fdbf-d627-d1b71284e50d@oracle.com \
    --to=dongli.zhang@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=jgross@suse.com \
    --cc=joe.jin@oracle.com \
    --cc=julien@xen.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.