linux-kernel.vger.kernel.org archive mirror
* [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
@ 2018-09-22 19:55 Boris Ostrovsky
  2018-09-27  7:12 ` Juergen Gross
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Ostrovsky @ 2018-09-22 19:55 UTC (permalink / raw)
  To: jgross, konrad.wilk, roger.pau, axboe
  Cc: xen-devel, linux-kernel, Boris Ostrovsky

Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
added support for purging persistent grants when they are not in use. As
part of the purge, the grants were removed from the grant buffer. This
eventually causes the buffer to become empty, with BUG_ON triggered in
get_free_grant(). This can be observed even on an idle system, within
20-30 minutes.

We should keep the grants in the buffer when purging, and only free the
grant ref.

Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
 drivers/block/xen-blkfront.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a71d817e900d..3b441fe69c0d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2667,11 +2667,9 @@ static void purge_persistent_grants(struct blkfront_info *info)
 			    gnttab_query_foreign_access(gnt_list_entry->gref))
 				continue;
 
-			list_del(&gnt_list_entry->node);
 			gnttab_end_foreign_access(gnt_list_entry->gref, 0, 0UL);
+			gnt_list_entry->gref = GRANT_INVALID_REF;
 			rinfo->persistent_gnts_c--;
-			__free_page(gnt_list_entry->page);
-			kfree(gnt_list_entry);
 		}
 
 		spin_unlock_irqrestore(&rinfo->ring_lock, flags);
-- 
2.17.0



* Re: [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-22 19:55 [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer Boris Ostrovsky
@ 2018-09-27  7:12 ` Juergen Gross
  2018-09-27 14:26   ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Juergen Gross @ 2018-09-27  7:12 UTC (permalink / raw)
  To: Boris Ostrovsky, konrad.wilk, roger.pau, axboe; +Cc: xen-devel, linux-kernel

On 22/09/18 21:55, Boris Ostrovsky wrote:
> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
> added support for purging persistent grants when they are not in use. As
> part of the purge, the grants were removed from the grant buffer, This
> eventually causes the buffer to become empty, with BUG_ON triggered in
> get_free_grant(). This can be observed even on an idle system, within
> 20-30 minutes.
> 
> We should keep the grants in the buffer when purging, and only free the
> grant ref.
> 
> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen


* Re: [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27  7:12 ` Juergen Gross
@ 2018-09-27 14:26   ` Jens Axboe
  2018-09-27 18:52     ` [Xen-devel] " Sander Eikelenboom
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2018-09-27 14:26 UTC (permalink / raw)
  To: Juergen Gross, Boris Ostrovsky, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 9/27/18 1:12 AM, Juergen Gross wrote:
> On 22/09/18 21:55, Boris Ostrovsky wrote:
>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>> added support for purging persistent grants when they are not in use. As
>> part of the purge, the grants were removed from the grant buffer, This
>> eventually causes the buffer to become empty, with BUG_ON triggered in
>> get_free_grant(). This can be observed even on an idle system, within
>> 20-30 minutes.
>>
>> We should keep the grants in the buffer when purging, and only free the
>> grant ref.
>>
>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> Reviewed-by: Juergen Gross <jgross@suse.com>

Since Konrad is out, I'm going to queue this up for 4.19.

-- 
Jens Axboe



* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 14:26   ` Jens Axboe
@ 2018-09-27 18:52     ` Sander Eikelenboom
  2018-09-27 18:56       ` Jens Axboe
  0 siblings, 1 reply; 12+ messages in thread
From: Sander Eikelenboom @ 2018-09-27 18:52 UTC (permalink / raw)
  To: Jens Axboe, Juergen Gross, Boris Ostrovsky, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 27/09/18 16:26, Jens Axboe wrote:
> On 9/27/18 1:12 AM, Juergen Gross wrote:
>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>> added support for purging persistent grants when they are not in use. As
>>> part of the purge, the grants were removed from the grant buffer, This
>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>> get_free_grant(). This can be observed even on an idle system, within
>>> 20-30 minutes.
>>>
>>> We should keep the grants in the buffer when purging, and only free the
>>> grant ref.
>>>
>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>
>> Reviewed-by: Juergen Gross <jgross@suse.com>
> 
> Since Konrad is out, I'm going to queue this up for 4.19.
> 

Hi Boris/Juergen,

Last week I tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top.
Unfortunately it made a VM hang (probably because its rootFS is shuffled from under its feet)
and it gave these messages in the dom0 dmesg:

[ 9251.696090] xen-blkback: requesting a grant already in use
[ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
[ 9251.715781] xen-blkback: requesting a grant already in use
[ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
[ 9251.735698] xen-blkback: requesting a grant already in use
[ 9251.745573] xen-blkback: trying to add a gref that's already in the tree

The VM was an HVM guest with 4 vCPUs and 2 phy disks:
xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants


Currently I have been running 4.19-rc5 with xen-next on top and commit a46b53672b2c reverted
for a couple of days. That seems to run stable for me (since it's a small box, I'm not hit
by what a46b53672b2c tried to fix).

If you can come up with a debug patch, I can give that a spin tomorrow evening or over the weekend,
so we are hopefully still in time for the 4.19 release.

--
Sander


* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 18:52     ` [Xen-devel] " Sander Eikelenboom
@ 2018-09-27 18:56       ` Jens Axboe
  2018-09-27 19:06         ` Boris Ostrovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2018-09-27 18:56 UTC (permalink / raw)
  To: Sander Eikelenboom, Juergen Gross, Boris Ostrovsky, konrad.wilk,
	roger.pau
  Cc: xen-devel, linux-kernel

On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
> On 27/09/18 16:26, Jens Axboe wrote:
>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>> added support for purging persistent grants when they are not in use. As
>>>> part of the purge, the grants were removed from the grant buffer, This
>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>> get_free_grant(). This can be observed even on an idle system, within
>>>> 20-30 minutes.
>>>>
>>>> We should keep the grants in the buffer when purging, and only free the
>>>> grant ref.
>>>>
>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>
>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>
>> Since Konrad is out, I'm going to queue this up for 4.19.
>>
> 
> Hi Boris/Juergen.
> 
> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 
> and it gave these in dom0 dmesg:
> 
> [ 9251.696090] xen-blkback: requesting a grant already in use
> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
> [ 9251.715781] xen-blkback: requesting a grant already in use
> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
> [ 9251.735698] xen-blkback: requesting a grant already in use
> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
> 
> The VM was a HVM with 4 vcpu's and 2 phy disks:
> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
> 
> 
> Currently i have been running 4.19-rc5 with xen-next on top and commit
> a46b53672b2c reverted, for a couple of days. That seems to run stable
> for me (since it's a small box so i'm not hit by what a46b53672b2c
> tried to fix.
> 
> If you can come up with a debug patch i can give that a spin tomorrow
> evening or in the weekend, so we are hopefully still in time for the
> 4.19 release.

This late in the game, it might make more sense to simply revert the
buggy commit, especially since what is currently out there doesn't fix
the issue for you.

-- 
Jens Axboe



* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 18:56       ` Jens Axboe
@ 2018-09-27 19:06         ` Boris Ostrovsky
  2018-09-27 19:16           ` Jens Axboe
  2018-09-27 20:33           ` Sander Eikelenboom
  0 siblings, 2 replies; 12+ messages in thread
From: Boris Ostrovsky @ 2018-09-27 19:06 UTC (permalink / raw)
  To: Jens Axboe, Sander Eikelenboom, Juergen Gross, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 9/27/18 2:56 PM, Jens Axboe wrote:
> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>> On 27/09/18 16:26, Jens Axboe wrote:
>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>> added support for purging persistent grants when they are not in use. As
>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>> 20-30 minutes.
>>>>>
>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>> grant ref.
>>>>>
>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>
>> Hi Boris/Juergen.
>>
>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 

What do you mean by "rootFS is shuffled from under it's feet " ?

>> and it gave these in dom0 dmesg:
>>
>> [ 9251.696090] xen-blkback: requesting a grant already in use
>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>> [ 9251.715781] xen-blkback: requesting a grant already in use
>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>> [ 9251.735698] xen-blkback: requesting a grant already in use
>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>
>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>
>>
>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>> tried to fix.
>>
>> If you can come up with a debug patch i can give that a spin tomorrow
>> evening or in the weekend, so we are hopefully still in time for the
>> 4.19 release.
> At this late in the game, might make more sense to simply revert the
> buggy commit.  Especially since what is currently out there doesn't fix
> the issue for you.

If the decision is to revert, then I think the whole series needs to be
reverted.

-boris



* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 19:06         ` Boris Ostrovsky
@ 2018-09-27 19:16           ` Jens Axboe
  2018-09-27 20:33           ` Sander Eikelenboom
  1 sibling, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2018-09-27 19:16 UTC (permalink / raw)
  To: Boris Ostrovsky, Sander Eikelenboom, Juergen Gross, konrad.wilk,
	roger.pau
  Cc: xen-devel, linux-kernel

On 9/27/18 1:06 PM, Boris Ostrovsky wrote:
> On 9/27/18 2:56 PM, Jens Axboe wrote:
>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>> 20-30 minutes.
>>>>>>
>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>> grant ref.
>>>>>>
>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>
>>> Hi Boris/Juergen.
>>>
>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 
> 
> What do you mean by "rootFS is shuffled from under it's feet " ?
> 
>>> and it gave these in dom0 dmesg:
>>>
>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>
>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>
>>>
>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>> tried to fix.
>>>
>>> If you can come up with a debug patch i can give that a spin tomorrow
>>> evening or in the weekend, so we are hopefully still in time for the
>>> 4.19 release.
>> At this late in the game, might make more sense to simply revert the
>> buggy commit.  Especially since what is currently out there doesn't fix
>> the issue for you.
> 
> If decision is to revert then I think the whole series needs to be
> reverted.

Yes, definitely.

-- 
Jens Axboe



* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 19:06         ` Boris Ostrovsky
  2018-09-27 19:16           ` Jens Axboe
@ 2018-09-27 20:33           ` Sander Eikelenboom
  2018-09-27 21:37             ` Jens Axboe
  1 sibling, 1 reply; 12+ messages in thread
From: Sander Eikelenboom @ 2018-09-27 20:33 UTC (permalink / raw)
  To: Boris Ostrovsky, Jens Axboe, Juergen Gross, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 27/09/18 21:06, Boris Ostrovsky wrote:
> On 9/27/18 2:56 PM, Jens Axboe wrote:
>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>> 20-30 minutes.
>>>>>>
>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>> grant ref.
>>>>>>
>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>
>>> Hi Boris/Juergen.
>>>
>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 
> 
> What do you mean by "rootFS is shuffled from under it's feet " ?

My assumption was that blkfront got borked, resulting in either a kernel crash or the rootfs being remounted read-only. I didn't try to check, though.

>>> and it gave these in dom0 dmesg:
>>>
>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>
>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>
>>>
>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>> tried to fix.
>>>
>>> If you can come up with a debug patch i can give that a spin tomorrow
>>> evening or in the weekend, so we are hopefully still in time for the
>>> 4.19 release.
>> At this late in the game, might make more sense to simply revert the
>> buggy commit.  Especially since what is currently out there doesn't fix
>> the issue for you.
I don't know if Boris or Juergen have a hunch about the issue; if not, perhaps a revert is best.

> If decision is to revert then I think the whole series needs to be
> reverted.
> 
> -boris
> 

For Boris and Juergen:
Would it make sense to have a "xen-next" branch in the xen-tip tree that:
- is based on the previous stable kernel;
- has the for-linus branches for the upcoming kernel release on top;
- has the patches for net(-next) and block changes on top (since these don't go via the tree but only via mailing-list patches,
  which are scattered, difficult to track and hard to use for automated testing);
- has dependency patches for the above if necessary to be able to build.

So that there is one branch that can be used to test ALL pending kernel-related Xen patches and which could be used in OSStest without as
many potential false alarms as linux-next will have?

--
Sander


* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 20:33           ` Sander Eikelenboom
@ 2018-09-27 21:37             ` Jens Axboe
  2018-09-27 21:48               ` Boris Ostrovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Axboe @ 2018-09-27 21:37 UTC (permalink / raw)
  To: Sander Eikelenboom, Boris Ostrovsky, Juergen Gross, konrad.wilk,
	roger.pau
  Cc: xen-devel, linux-kernel

On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
> On 27/09/18 21:06, Boris Ostrovsky wrote:
>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>> 20-30 minutes.
>>>>>>>
>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>> grant ref.
>>>>>>>
>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>
>>>> Hi Boris/Juergen.
>>>>
>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 
>>
>> What do you mean by "rootFS is shuffled from under it's feet " ?
> 
> Assumption that block-front getting borked and either a kernel crash or rootfs becoming mounted readonly. Didn't (try) to check though.
> 
>>>> and it gave these in dom0 dmesg:
>>>>
>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>
>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>
>>>>
>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>> tried to fix.
>>>>
>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>> evening or in the weekend, so we are hopefully still in time for the
>>>> 4.19 release.
>>> At this late in the game, might make more sense to simply revert the
>>> buggy commit.  Especially since what is currently out there doesn't fix
>>> the issue for you.
>
> Don't know if Boris or Juergen have a hunch about the issue, if not
> perhaps a revert is the best.

Anyone? Unless I hear otherwise, I'll revert the series tomorrow.

-- 
Jens Axboe



* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 21:37             ` Jens Axboe
@ 2018-09-27 21:48               ` Boris Ostrovsky
  2018-09-27 22:03                 ` Sander Eikelenboom
  0 siblings, 1 reply; 12+ messages in thread
From: Boris Ostrovsky @ 2018-09-27 21:48 UTC (permalink / raw)
  To: Jens Axboe, Sander Eikelenboom, Juergen Gross, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 9/27/18 5:37 PM, Jens Axboe wrote:
> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>> 20-30 minutes.
>>>>>>>>
>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>> grant ref.
>>>>>>>>
>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>
>>>>> Hi Boris/Juergen.
>>>>>
>>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
>>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 
>>> What do you mean by "rootFS is shuffled from under it's feet " ?
>> Assumption that block-front getting borked and either a kernel crash or rootfs becoming mounted readonly. Didn't (try) to check though.
>>
>>>>> and it gave these in dom0 dmesg:
>>>>>
>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>
>>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>
>>>>>
>>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>>> tried to fix.
>>>>>
>>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>>> evening or in the weekend, so we are hopefully still in time for the
>>>>> 4.19 release.
>>>> At this late in the game, might make more sense to simply revert the
>>>> buggy commit.  Especially since what is currently out there doesn't fix
>>>> the issue for you.
>> Don't know if Boris or Juergen have a hunch about the issue, if not
>> perhaps a revert is the best.
> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.

Juergen may have something to say by tomorrow, but from my perspective,
given that we are coming up on rc6 --- yes.

I looked at the patches again and didn't see anything obvious.

-boris




* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 21:48               ` Boris Ostrovsky
@ 2018-09-27 22:03                 ` Sander Eikelenboom
  2018-09-28  6:44                   ` Juergen Gross
  0 siblings, 1 reply; 12+ messages in thread
From: Sander Eikelenboom @ 2018-09-27 22:03 UTC (permalink / raw)
  To: Boris Ostrovsky, Jens Axboe, Juergen Gross, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 27/09/18 23:48, Boris Ostrovsky wrote:
> On 9/27/18 5:37 PM, Jens Axboe wrote:
>> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>>> part of the purge, the grants were removed from the grant buffer, This
>>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>>> 20-30 minutes.
>>>>>>>>>
>>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>>> grant ref.
>>>>>>>>>
>>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>>
>>>>>> Hi Boris/Juergen.
>>>>>>
>>>>>> Last week i tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top. 
>>>>>> Unfortunately it made a VM hang (probably because it's rootFS is shuffled from under it's feet 
>>>> What do you mean by "rootFS is shuffled from under it's feet " ?
>>> Assumption that block-front getting borked and either a kernel crash or rootfs becoming mounted readonly. Didn't (try) to check though.
>>>
>>>>>> and it gave these in dom0 dmesg:
>>>>>>
>>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>>
>>>>>> The VM was a HVM with 4 vcpu's and 2 phy disks:
>>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>
>>>>>>
>>>>>> Currently i have been running 4.19-rc5 with xen-next on top and commit
>>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>>> for me (since it's a small box so i'm not hit by what a46b53672b2c
>>>>>> tried to fix.
>>>>>>
>>>>>> If you can come up with a debug patch i can give that a spin tomorrow
>>>>>> evening or in the weekend, so we are hopefully still in time for the
>>>>>> 4.19 release.
>>>>> At this late in the game, might make more sense to simply revert the
>>>>> buggy commit.  Especially since what is currently out there doesn't fix
>>>>> the issue for you.
>>> Don't know if Boris or Juergen have a hunch about the issue, if not
>>> perhaps a revert is the best.
>> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.
> 
> Juergen may have something to say by tomorrow, but from my perspective,
> given that we are coming up on rc6 --- yes.
> 
> I looked at the patches again and didn't see anything obvious.
> 
> -boris

It could also be that what I hit is a latent bug
that is not caused by these patches but merely got uncovered by them.

xl dmesg also shows quite a few of these:
    (XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 grant table from 19 to 20 frames
    (XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 grant table from 20 to 21 frames
(and it has done that for ages on my box, without leading to any direct problems to my knowledge)

I don't know if these could be related, or whether something around the (persistent) grants for block devices could be leaking under some conditions?

--
Sander



* Re: [Xen-devel] [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer
  2018-09-27 22:03                 ` Sander Eikelenboom
@ 2018-09-28  6:44                   ` Juergen Gross
  0 siblings, 0 replies; 12+ messages in thread
From: Juergen Gross @ 2018-09-28  6:44 UTC (permalink / raw)
  To: Sander Eikelenboom, Boris Ostrovsky, Jens Axboe, konrad.wilk, roger.pau
  Cc: xen-devel, linux-kernel

On 28/09/2018 00:03, Sander Eikelenboom wrote:
> On 27/09/18 23:48, Boris Ostrovsky wrote:
>> On 9/27/18 5:37 PM, Jens Axboe wrote:
>>> On 9/27/18 2:33 PM, Sander Eikelenboom wrote:
>>>> On 27/09/18 21:06, Boris Ostrovsky wrote:
>>>>> On 9/27/18 2:56 PM, Jens Axboe wrote:
>>>>>> On 9/27/18 12:52 PM, Sander Eikelenboom wrote:
>>>>>>> On 27/09/18 16:26, Jens Axboe wrote:
>>>>>>>> On 9/27/18 1:12 AM, Juergen Gross wrote:
>>>>>>>>> On 22/09/18 21:55, Boris Ostrovsky wrote:
>>>>>>>>>> Commit a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>>> added support for purging persistent grants when they are not in use. As
>>>>>>>>>> part of the purge, the grants were removed from the grant buffer. This
>>>>>>>>>> eventually causes the buffer to become empty, with BUG_ON triggered in
>>>>>>>>>> get_free_grant(). This can be observed even on an idle system, within
>>>>>>>>>> 20-30 minutes.
>>>>>>>>>>
>>>>>>>>>> We should keep the grants in the buffer when purging, and only free the
>>>>>>>>>> grant ref.
>>>>>>>>>>
>>>>>>>>>> Fixes: a46b53672b2c ("xen/blkfront: cleanup stale persistent grants")
>>>>>>>>>> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>>>>>>> Reviewed-by: Juergen Gross <jgross@suse.com>
>>>>>>>> Since Konrad is out, I'm going to queue this up for 4.19.
>>>>>>>>
>>>>>>> Hi Boris/Juergen.
>>>>>>>
>>>>>>> Last week I tested a linux-4.19-rc4 kernel with xen-next and this patch from Boris pulled on top.
>>>>>>> Unfortunately it made a VM hang (probably because its rootFS is shuffled from under its feet)
>>>>> What do you mean by "rootFS is shuffled from under its feet"?
>>>> My assumption is that block-front got borked, resulting in either a kernel crash or the rootfs being remounted read-only. I didn't try to check, though.
>>>>
>>>>>>> and it gave these in dom0 dmesg:
>>>>>>>
>>>>>>> [ 9251.696090] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.705861] xen-blkback: trying to add a gref that's already in the tree
>>>>>>> [ 9251.715781] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.725756] xen-blkback: trying to add a gref that's already in the tree
>>>>>>> [ 9251.735698] xen-blkback: requesting a grant already in use
>>>>>>> [ 9251.745573] xen-blkback: trying to add a gref that's already in the tree
>>>>>>>
>>>>>>> The VM was an HVM guest with 4 vCPUs and 2 phy disks:
>>>>>>> xen-blkback: backend/vbd/14/768: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>> xen-blkback: backend/vbd/14/832: using 4 queues, protocol 1 (x86_64-abi) persistent grants
>>>>>>>
>>>>>>>
>>>>>>> Currently I have been running 4.19-rc5 with xen-next on top and commit
>>>>>>> a46b53672b2c reverted, for a couple of days. That seems to run stable
>>>>>>> for me (since it's a small box, I'm not hit by what a46b53672b2c
>>>>>>> tried to fix).
>>>>>>>
>>>>>>> If you can come up with a debug patch I can give that a spin tomorrow
>>>>>>> evening or over the weekend, so we are hopefully still in time for the
>>>>>>> 4.19 release.
>>>>>> This late in the game, it might make more sense to simply revert the
>>>>>> buggy commit.  Especially since what is currently out there doesn't fix
>>>>>> the issue for you.
>>>> I don't know if Boris or Juergen have a hunch about the issue; if not,
>>>> perhaps a revert is best.
>>> Anyone? Unless I hear otherwise, I'll revert the series tomorrow.
>>
>> Juergen may have something to say by tomorrow, but from my perspective,
>> given that we are coming up on rc6 --- yes.
>>
>> I looked at the patches again and didn't see anything obvious.
>>
>> -boris
> 
> It could also be that what I hit is a latent bug
> that is not caused by these patches but merely got uncovered by them.
> 
> xl dmesg also shows quite a few of these:
>     (XEN) [2018-09-24 03:15:46.847] grant_table.c:1755:d14v0 Expanding d14 grant table from 19 to 20 frames
>     (XEN) [2018-09-24 03:15:46.849] grant_table.c:1755:d14v0 Expanding d14 grant table from 20 to 21 frames
> (and it has done that for ages on my box, without leading to any direct problems to my knowledge)
> 
> I don't know if these could be related, or whether something around the (persistent) grants for block devices could be leaking under some conditions?

I could reproduce the issue Boris has seen and I have found the fault
in his patch. Just testing a fix.


Juergen

Thread overview: 12+ messages
2018-09-22 19:55 [PATCH] xen/blkfront: When purging persistent grants, keep them in the buffer Boris Ostrovsky
2018-09-27  7:12 ` Juergen Gross
2018-09-27 14:26   ` Jens Axboe
2018-09-27 18:52     ` [Xen-devel] " Sander Eikelenboom
2018-09-27 18:56       ` Jens Axboe
2018-09-27 19:06         ` Boris Ostrovsky
2018-09-27 19:16           ` Jens Axboe
2018-09-27 20:33           ` Sander Eikelenboom
2018-09-27 21:37             ` Jens Axboe
2018-09-27 21:48               ` Boris Ostrovsky
2018-09-27 22:03                 ` Sander Eikelenboom
2018-09-28  6:44                   ` Juergen Gross
