All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: willy@infradead.org, wei.w.wang@intel.com
Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
	kvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,
	mhocko@kernel.org, akpm@linux-foundation.org,
	mawilcox@microsoft.com, david@redhat.com,
	cornelia.huck@de.ibm.com, mgorman@techsingularity.net,
	aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com,
	liliang.opensource@gmail.com, yang.zhang.wz@gmail.com,
	quan.xu0@gmail.com, nilal@redhat.com, riel@redhat.com
Subject: Re: [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG
Date: Sun, 24 Dec 2017 13:45:10 +0900	[thread overview]
Message-ID: <201712241345.DIG21823.SLFOOJtQFOMVFH@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20171224032121.GA5273@bombadil.infradead.org>

Matthew Wilcox wrote:
> > +	unsigned long pfn = page_to_pfn(page);
> > +	int ret;
> > +
> > +	*pfn_min = min(pfn, *pfn_min);
> > +	*pfn_max = max(pfn, *pfn_max);
> > +
> > +	do {
> > +		if (xb_preload(GFP_NOWAIT | __GFP_NOWARN) < 0)
> > +			return -ENOMEM;
> > +
> > +		ret = xb_set_bit(&vb->page_xb, pfn);
> > +		xb_preload_end();
> > +	} while (unlikely(ret == -EAGAIN));
> 
> OK, so you don't need a spinlock because you're under a mutex?  But you
> can't allocate memory because you're in the balloon driver, and so a
> GFP_KERNEL allocation might recurse into your driver?

Right. We can't (directly or indirectly) depend on __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
allocations because the balloon driver needs to handle OOM notifier callback.

>                                                        Would GFP_NOIO
> do the job?  I'm a little hazy on exactly how the balloon driver works.

GFP_NOIO implies __GFP_DIRECT_RECLAIM. In the worst case, it can lockup due to
the too small to fail memory allocation rule. GFP_NOIO | __GFP_NORETRY would work
if there is really a guarantee that GFP_NOIO | __GFP_NORETRY never depend on
__GFP_DIRECT_RECLAIM && !__GFP_NORETRY allocations, which is too subtle for me to
validate. The direct reclaim dependency is too complicated to validate.
I consider that !__GFP_DIRECT_RECLAIM is the future-safe choice.

> 
> If you can't preload with anything better than that, I think that
> xb_set_bit() should attempt an allocation with GFP_NOWAIT | __GFP_NOWARN,
> and then you can skip the preload; it has no value for you.

Yes, that's why I suggest directly using kzalloc() at xb_set_bit().

> 
> > @@ -173,8 +292,15 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >  
> >  	while ((page = balloon_page_pop(&pages))) {
> >  		balloon_page_enqueue(&vb->vb_dev_info, page);
> > +		if (use_sg) {
> > +			if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
> > +				__free_page(page);
> > +				continue;
> > +			}
> > +		} else {
> > +			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> > +		}
> 
> Is this the right behaviour?

I don't think so. In the worst case, we can set no bit using xb_set_page().

>                               If we can't record the page in the xb,
> wouldn't we rather send it across as a single page?
> 

I think that we need to be able to fallback to !use_sg path when OOM.

WARNING: multiple messages have this Message-ID (diff)
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: willy@infradead.org, wei.w.wang@intel.com
Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
	kvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,
	mhocko@kernel.org, akpm@linux-foundation.org,
	mawilcox@microsoft.com, david@redhat.com,
	cornelia.huck@de.ibm.com, mgorman@techsingularity.net,
	aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com,
	liliang.opensource@gmail.com, yang.zhang.wz@gmail.com,
	quan.xu0@gmail.com, nilal@redhat.com, riel@redhat.com
Subject: Re: [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG
Date: Sun, 24 Dec 2017 13:45:10 +0900	[thread overview]
Message-ID: <201712241345.DIG21823.SLFOOJtQFOMVFH@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20171224032121.GA5273@bombadil.infradead.org>

Matthew Wilcox wrote:
> > +	unsigned long pfn = page_to_pfn(page);
> > +	int ret;
> > +
> > +	*pfn_min = min(pfn, *pfn_min);
> > +	*pfn_max = max(pfn, *pfn_max);
> > +
> > +	do {
> > +		if (xb_preload(GFP_NOWAIT | __GFP_NOWARN) < 0)
> > +			return -ENOMEM;
> > +
> > +		ret = xb_set_bit(&vb->page_xb, pfn);
> > +		xb_preload_end();
> > +	} while (unlikely(ret == -EAGAIN));
> 
> OK, so you don't need a spinlock because you're under a mutex?  But you
> can't allocate memory because you're in the balloon driver, and so a
> GFP_KERNEL allocation might recurse into your driver?

Right. We can't (directly or indirectly) depend on __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
allocations because the balloon driver needs to handle OOM notifier callback.

>                                                        Would GFP_NOIO
> do the job?  I'm a little hazy on exactly how the balloon driver works.

GFP_NOIO implies __GFP_DIRECT_RECLAIM. In the worst case, it can lockup due to
the too small to fail memory allocation rule. GFP_NOIO | __GFP_NORETRY would work
if there is really a guarantee that GFP_NOIO | __GFP_NORETRY never depend on
__GFP_DIRECT_RECLAIM && !__GFP_NORETRY allocations, which is too subtle for me to
validate. The direct reclaim dependency is too complicated to validate.
I consider that !__GFP_DIRECT_RECLAIM is the future-safe choice.

> 
> If you can't preload with anything better than that, I think that
> xb_set_bit() should attempt an allocation with GFP_NOWAIT | __GFP_NOWARN,
> and then you can skip the preload; it has no value for you.

Yes, that's why I suggest directly using kzalloc() at xb_set_bit().

> 
> > @@ -173,8 +292,15 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >  
> >  	while ((page = balloon_page_pop(&pages))) {
> >  		balloon_page_enqueue(&vb->vb_dev_info, page);
> > +		if (use_sg) {
> > +			if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
> > +				__free_page(page);
> > +				continue;
> > +			}
> > +		} else {
> > +			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> > +		}
> 
> Is this the right behaviour?

I don't think so. In the worst case, we can set no bit using xb_set_page().

>                               If we can't record the page in the xb,
> wouldn't we rather send it across as a single page?
> 

I think that we need to be able to fallback to !use_sg path when OOM.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: willy@infradead.org, wei.w.wang@intel.com
Cc: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
	kvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,
	mhocko@kernel.org, akpm@linux-foundation.org,
	mawilcox@microsoft.com, david@redhat.com,
	cornelia.huck@de.ibm.com, mgorman@techsingularity.net,
	aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com,
	liliang.opensource@gmail.com, yang.zhang.wz@gmail.com,
	quan.xu0@gmail.com, nilal@redhat.com, riel@redhat.com
Subject: Re: [Qemu-devel] [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG
Date: Sun, 24 Dec 2017 13:45:10 +0900	[thread overview]
Message-ID: <201712241345.DIG21823.SLFOOJtQFOMVFH@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20171224032121.GA5273@bombadil.infradead.org>

Matthew Wilcox wrote:
> > +	unsigned long pfn = page_to_pfn(page);
> > +	int ret;
> > +
> > +	*pfn_min = min(pfn, *pfn_min);
> > +	*pfn_max = max(pfn, *pfn_max);
> > +
> > +	do {
> > +		if (xb_preload(GFP_NOWAIT | __GFP_NOWARN) < 0)
> > +			return -ENOMEM;
> > +
> > +		ret = xb_set_bit(&vb->page_xb, pfn);
> > +		xb_preload_end();
> > +	} while (unlikely(ret == -EAGAIN));
> 
> OK, so you don't need a spinlock because you're under a mutex?  But you
> can't allocate memory because you're in the balloon driver, and so a
> GFP_KERNEL allocation might recurse into your driver?

Right. We can't (directly or indirectly) depend on __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
allocations because the balloon driver needs to handle OOM notifier callback.

>                                                        Would GFP_NOIO
> do the job?  I'm a little hazy on exactly how the balloon driver works.

GFP_NOIO implies __GFP_DIRECT_RECLAIM. In the worst case, it can lockup due to
the too small to fail memory allocation rule. GFP_NOIO | __GFP_NORETRY would work
if there is really a guarantee that GFP_NOIO | __GFP_NORETRY never depend on
__GFP_DIRECT_RECLAIM && !__GFP_NORETRY allocations, which is too subtle for me to
validate. The direct reclaim dependency is too complicated to validate.
I consider that !__GFP_DIRECT_RECLAIM is the future-safe choice.

> 
> If you can't preload with anything better than that, I think that
> xb_set_bit() should attempt an allocation with GFP_NOWAIT | __GFP_NOWARN,
> and then you can skip the preload; it has no value for you.

Yes, that's why I suggest directly using kzalloc() at xb_set_bit().

> 
> > @@ -173,8 +292,15 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> >  
> >  	while ((page = balloon_page_pop(&pages))) {
> >  		balloon_page_enqueue(&vb->vb_dev_info, page);
> > +		if (use_sg) {
> > +			if (xb_set_page(vb, page, &pfn_min, &pfn_max) < 0) {
> > +				__free_page(page);
> > +				continue;
> > +			}
> > +		} else {
> > +			set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
> > +		}
> 
> Is this the right behaviour?

I don't think so. In the worst case, we can set no bit using xb_set_page().

>                               If we can't record the page in the xb,
> wouldn't we rather send it across as a single page?
> 

I think that we need to be able to fallback to !use_sg path when OOM.

  reply	other threads:[~2017-12-24  4:45 UTC|newest]

Thread overview: 143+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-19 12:17 [PATCH v20 0/7] Virtio-balloon Enhancement Wei Wang
2017-12-19 12:17 ` [virtio-dev] " Wei Wang
2017-12-19 12:17 ` [Qemu-devel] " Wei Wang
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 1/7] xbitmap: Introduce xbitmap Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-19 15:58   ` Philippe Ombredanne
2017-12-19 15:58     ` [Qemu-devel] " Philippe Ombredanne
2017-12-19 15:58     ` Philippe Ombredanne
2017-12-19 15:58   ` Philippe Ombredanne
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 2/7] xbitmap: potential improvement Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 3/7] xbitmap: add more operations Wei Wang
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 4/7] virtio-balloon: VIRTIO_BALLOON_F_SG Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-24  3:21   ` Matthew Wilcox
2017-12-24  3:21     ` [Qemu-devel] " Matthew Wilcox
2017-12-24  3:21     ` Matthew Wilcox
2017-12-24  3:21     ` Matthew Wilcox
2017-12-24  4:45     ` Tetsuo Handa [this message]
2017-12-24  4:45       ` [Qemu-devel] " Tetsuo Handa
2017-12-24  4:45       ` Tetsuo Handa
2017-12-24  7:42       ` Wei Wang
2017-12-24  7:42       ` Wei Wang
2017-12-24  7:42         ` [virtio-dev] " Wei Wang
2017-12-24  7:42         ` [Qemu-devel] " Wei Wang
2017-12-24  7:42         ` Wei Wang
2017-12-24  8:16         ` [virtio-dev] " Wei Wang
2017-12-24  8:16         ` Wei Wang
2017-12-24  8:16           ` Wei Wang
2017-12-24  8:16           ` [Qemu-devel] " Wei Wang
2017-12-24  8:16           ` Wei Wang
2017-12-25 14:51           ` Tetsuo Handa
2017-12-25 14:51             ` [Qemu-devel] " Tetsuo Handa
2017-12-25 14:51             ` Tetsuo Handa
2017-12-26  3:06             ` Wei Wang
2017-12-26  3:06               ` [virtio-dev] " Wei Wang
2017-12-26  3:06               ` [Qemu-devel] " Wei Wang
2017-12-26  3:06               ` Wei Wang
2017-12-26 10:38               ` Tetsuo Handa
2017-12-26 10:38                 ` [Qemu-devel] " Tetsuo Handa
2017-12-26 10:38                 ` Tetsuo Handa
2017-12-26 11:36                 ` Wei Wang
2017-12-26 11:36                 ` Wei Wang
2017-12-26 11:36                   ` [virtio-dev] " Wei Wang
2017-12-26 11:36                   ` [Qemu-devel] " Wei Wang
2017-12-26 11:36                   ` Wei Wang
2017-12-26 13:40                 ` Tetsuo Handa
2017-12-26 13:40                   ` [Qemu-devel] " Tetsuo Handa
2017-12-26 13:40                   ` Tetsuo Handa
2017-12-26  3:06             ` Wei Wang
2018-01-02 13:24         ` Matthew Wilcox
2018-01-02 13:24           ` [Qemu-devel] " Matthew Wilcox
2018-01-02 13:24           ` Matthew Wilcox
2018-01-03  2:29           ` Tetsuo Handa
2018-01-03  2:29             ` [Qemu-devel] " Tetsuo Handa
2018-01-03  2:29             ` Tetsuo Handa
2018-01-03  9:00             ` Wei Wang
2018-01-03  9:00               ` [virtio-dev] " Wei Wang
2018-01-03  9:00               ` [Qemu-devel] " Wei Wang
2018-01-03  9:00               ` Wei Wang
2018-01-03 10:19               ` Tetsuo Handa
2018-01-03 10:19                 ` [Qemu-devel] " Tetsuo Handa
2018-01-03 10:19                 ` Tetsuo Handa
2018-01-03  9:00             ` Wei Wang
2018-01-02 13:24         ` Matthew Wilcox
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 5/7] mm: support reporting free page blocks Wei Wang
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 6/7] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-19 12:17 ` Wei Wang
2017-12-19 12:17 ` [PATCH v20 7/7] virtio-balloon: don't report free pages when page poisoning is enabled Wei Wang
2017-12-19 12:17   ` [virtio-dev] " Wei Wang
2017-12-19 12:17   ` [Qemu-devel] " Wei Wang
2017-12-19 12:17   ` Wei Wang
2017-12-19 12:17 ` Wei Wang
2017-12-19 14:05 ` [PATCH v20 0/7] Virtio-balloon Enhancement Tetsuo Handa
2017-12-19 14:05   ` [Qemu-devel] " Tetsuo Handa
2017-12-19 14:05   ` Tetsuo Handa
2017-12-19 14:40   ` Matthew Wilcox
2017-12-19 14:40   ` Matthew Wilcox
2017-12-19 14:40     ` [Qemu-devel] " Matthew Wilcox
2017-12-19 14:40     ` Matthew Wilcox
2017-12-20  2:33     ` Tetsuo Handa
2017-12-20  2:33       ` [Qemu-devel] " Tetsuo Handa
2017-12-20  2:33       ` Tetsuo Handa
2017-12-19 18:08   ` Michael S. Tsirkin
2017-12-19 18:08     ` [virtio-dev] " Michael S. Tsirkin
2017-12-19 18:08     ` [Qemu-devel] " Michael S. Tsirkin
2017-12-19 18:08     ` Michael S. Tsirkin
2017-12-19 18:08   ` Michael S. Tsirkin
2017-12-20 10:34   ` Wei Wang
2017-12-20 10:34     ` [virtio-dev] " Wei Wang
2017-12-20 10:34     ` [Qemu-devel] " Wei Wang
2017-12-20 10:34     ` Wei Wang
2017-12-20 12:25     ` Matthew Wilcox
2017-12-20 12:25     ` Matthew Wilcox
2017-12-20 12:25       ` [Qemu-devel] " Matthew Wilcox
2017-12-20 12:25       ` Matthew Wilcox
2017-12-20 16:13       ` Wang, Wei W
2017-12-20 16:13       ` Wang, Wei W
2017-12-20 16:13         ` [virtio-dev] " Wang, Wei W
2017-12-20 16:13         ` [Qemu-devel] " Wang, Wei W
2017-12-20 16:13         ` Wang, Wei W
2017-12-20 16:13         ` Wang, Wei W
2017-12-20 17:10         ` Matthew Wilcox
2017-12-20 17:10         ` Matthew Wilcox
2017-12-20 17:10           ` [Qemu-devel] " Matthew Wilcox
2017-12-20 17:10           ` Matthew Wilcox
2017-12-20 17:10           ` Matthew Wilcox
2017-12-21  2:49           ` Wei Wang
2017-12-21  2:49           ` Wei Wang
2017-12-21  2:49             ` [virtio-dev] " Wei Wang
2017-12-21  2:49             ` [Qemu-devel] " Wei Wang
2017-12-21  2:49             ` Wei Wang
2017-12-21  2:49             ` Wei Wang
2017-12-21 12:14             ` Matthew Wilcox
2017-12-21 12:14             ` Matthew Wilcox
2017-12-21 12:14               ` [Qemu-devel] " Matthew Wilcox
2017-12-21 12:14               ` Matthew Wilcox
2017-12-21 12:14               ` Matthew Wilcox
2017-12-21 12:56             ` Tetsuo Handa
2017-12-21 12:56               ` [Qemu-devel] " Tetsuo Handa
2017-12-21 12:56               ` Tetsuo Handa
2017-12-20 10:34   ` Wei Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201712241345.DIG21823.SLFOOJtQFOMVFH@I-love.SAKURA.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=amit.shah@redhat.com \
    --cc=cornelia.huck@de.ibm.com \
    --cc=david@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=liliang.opensource@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mawilcox@microsoft.com \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mst@redhat.com \
    --cc=nilal@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quan.xu0@gmail.com \
    --cc=riel@redhat.com \
    --cc=virtio-dev@lists.oasis-open.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=wei.w.wang@intel.com \
    --cc=willy@infradead.org \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.