linux-kernel.vger.kernel.org archive mirror
From: Greg Kurz <groug@kaod.org>
To: Dominique Martinet <asmadeus@codewreck.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	v9fs-developer@lists.sourceforge.net,
	Latchesar Ionkov <lucho@ionkov.net>,
	Eric Van Hensbergen <ericvh@gmail.com>,
	Ron Minnich <rminnich@sandia.gov>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 5/6] 9p: Use a slab for allocating requests
Date: Mon, 23 Jul 2018 13:52:20 +0200
Message-ID: <20180723135220.08ec45bf@bahia>
In-Reply-To: <20180718100554.GA21781@nautica>

On Wed, 18 Jul 2018 12:05:54 +0200
Dominique Martinet <asmadeus@codewreck.org> wrote:

> +Cc Greg, I could use your opinion on this if you have a moment.
> 

Hi Dominique,

The patch is quite big and I'm not sure I can find time to review it
carefully, but I'll try to help anyway.

> Matthew Wilcox wrote on Wed, Jul 11, 2018:
> > Replace the custom batch allocation with a slab.  Use an IDR to store
> > pointers to the active requests instead of an array.  We don't try to
> > handle P9_NOTAG specially; the IDR will happily shrink all the way back
> > once the TVERSION call has completed.  
> 
> Sorry for coming back to this patch now, I just noticed something that's
> actually probably a fairly big hit on performance...
> 
> While the slab is just as good as the array for the request itself, this
> makes every single request allocate "fcalls" every time instead of
> reusing a cached allocation.
> The default msize is 8k and these allocs probably are fairly efficient,
> but some transports like RDMA allow increasing this up to 1MB... And

It can be even bigger with virtio:

#define VIRTQUEUE_NUM	128

	.maxsize = PAGE_SIZE * (VIRTQUEUE_NUM - 3),

On a typical ppc64 server class setup with 64KB pages, this is nearly 8MB.
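
To make the arithmetic explicit, here is a small userspace sketch. The
helper virtio_maxsize() is a made-up name for illustration, not a kernel
function, and the page size is passed in by hand rather than taken from
PAGE_SIZE:

```c
#include <stddef.h>

#define VIRTQUEUE_NUM 128

/*
 * Upper bound on msize for the virtio transport, following the
 * expression quoted above: PAGE_SIZE * (VIRTQUEUE_NUM - 3).
 * page_size is a parameter so different architectures can be compared.
 */
static size_t virtio_maxsize(size_t page_size)
{
	return page_size * (VIRTQUEUE_NUM - 3);
}
```

With 64KB pages this gives 65536 * 125 = 8192000 bytes, i.e. about
7.8 MiB per buffer. With 4KB pages it gives 512000 bytes (500 KiB).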

> doing this kind of allocation twice for every packet is going to be very
> slow.
> (not that hogging megabytes of memory was a great practice either!)
> 
> 
> One thing is that the buffers are all going to be the same size for a
> given client (.... except virtio zc buffers, I wonder what I'm missing
> or why that didn't blow up before?)

ZC allocates a 4KB buffer, which is more than enough to hold the 7-byte 9P
header and the "dqd" part of all messages that may use ZC, i.e., 16 bytes.
So I'm not sure I see what could blow up.
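
A rough userspace check of those numbers, assuming the usual meaning of
the format characters ('b' = 1 byte, 'w' = 2, 'd' = 4, 'q' = 8). The
names HDR_BYTES, ZC_BUF_BYTES and fmt_bytes() are invented for this
sketch and are not the kernel's identifiers:

```c
#include <stddef.h>

#define HDR_BYTES    7     /* size[4] + type[1] + tag[2] */
#define ZC_BUF_BYTES 4096  /* the 4KB buffer mentioned above */

/*
 * Sum the sizes of the fixed-width fields named by a format string.
 * Only 'b', 'w', 'd' and 'q' are understood; any other character
 * (e.g. a variable-length field) makes the function return 0.
 */
static size_t fmt_bytes(const char *fmt)
{
	size_t total = 0;

	for (; *fmt; fmt++) {
		switch (*fmt) {
		case 'b': total += 1; break;
		case 'w': total += 2; break;
		case 'd': total += 4; break;
		case 'q': total += 8; break;
		default:  return 0;
		}
	}
	return total;
}
```

HDR_BYTES + fmt_bytes("dqd") comes to 7 + 16 = 23 bytes, far below
ZC_BUF_BYTES.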

> Err, that aside I was going to ask if we couldn't find a way to keep a
> pool of these somehow.
> Ideally putting them in another slab so they could be reclaimed if
> necessary, but the size could vary from one client to another. Can we
> create a kmem_cache object per client? The KMEM_CACHE macro is not very
> flexible so I don't think that is encouraged... :)
> 
> 
> It's a shame because I really like that patch. I'll try to run some
> light benchmarks with varying msizes eventually, but I'm not sure
> when I'll find time for that... Hopefully before the 4.19 merge window!
> 

Yeah, the open-coded cache we have now really obfuscates things.

Maybe have a per-client kmem_cache object for non-ZC requests with
size msize [*], and a global kmem_cache object for ZC requests with
fixed size P9_ZC_HDR_SZ.

[*] the server can require a smaller msize during version negotiation,
    so maybe we should replace the kmem_cache object in this case.
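
To illustrate the idea only, here is a userspace stand-in for such a
per-client cache. It is NOT the kernel kmem_cache API: struct buf_cache
and the cache_*() helpers are invented for this sketch, malloc()/free()
stand in for the slab allocator, and there is no locking. It assumes no
buffers are outstanding when the size changes:

```c
#include <assert.h>
#include <stdlib.h>

/* Per-client cache of fixed-size buffers (userspace analogy). */
struct buf_cache {
	size_t obj_size;  /* size of every buffer handed out */
	void *free_list;  /* idle buffers, linked through their first word */
};

static void cache_init(struct buf_cache *c, size_t obj_size)
{
	/* Each idle buffer must be able to hold the "next" pointer. */
	assert(obj_size >= sizeof(void *));
	c->obj_size = obj_size;
	c->free_list = NULL;
}

/* Reuse an idle buffer if there is one, otherwise allocate a new one. */
static void *cache_alloc(struct buf_cache *c)
{
	void *buf = c->free_list;

	if (buf) {
		c->free_list = *(void **)buf;
		return buf;
	}
	return malloc(c->obj_size);
}

/* Return a buffer to the cache instead of freeing it. */
static void cache_free(struct buf_cache *c, void *buf)
{
	*(void **)buf = c->free_list;
	c->free_list = buf;
}

/* Release all idle buffers. */
static void cache_destroy(struct buf_cache *c)
{
	while (c->free_list) {
		void *next = *(void **)c->free_list;

		free(c->free_list);
		c->free_list = next;
	}
}

/*
 * If version negotiation ends with a different msize, drop the idle
 * buffers and start over with the new size.
 */
static void cache_resize(struct buf_cache *c, size_t new_size)
{
	if (new_size == c->obj_size)
		return;
	cache_destroy(c);
	cache_init(c, new_size);
}
```

A client would call cache_init() with its requested msize, then
cache_resize() once the server's reply is known. In the kernel the
equivalent step would presumably mean destroying and re-creating the
cache, which is why it may be simpler to create it only after
negotiation has completed.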

Cheers,

--
Greg

> 
> >  /**
> > - * p9_tag_alloc - lookup/allocate a request by tag
> > - * @c: client session to lookup tag within
> > - * @tag: numeric id for transaction
> > - *
> > - * this is a simple array lookup, but will grow the
> > - * request_slots as necessary to accommodate transaction
> > - * ids which did not previously have a slot.
> > - *
> > - * this code relies on the client spinlock to manage locks, its
> > - * possible we should switch to something else, but I'd rather
> > - * stick with something low-overhead for the common case.
> > + * p9_req_alloc - Allocate a new request.
> > + * @c: Client session.
> > + * @type: Transaction type.
> > + * @max_size: Maximum packet size for this request.
> >   *
> > + * Context: Process context.
> > + * Return: Pointer to new request.
> >   */
> > -
> >  static struct p9_req_t *
> > -p9_tag_alloc(struct p9_client *c, u16 tag, unsigned int max_size)
> > +p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
> >  {
> > -	unsigned long flags;
> > -	int row, col;
> > -	struct p9_req_t *req;
> > +	struct p9_req_t *req = kmem_cache_alloc(p9_req_cache, GFP_NOFS);
> >  	int alloc_msize = min(c->msize, max_size);
> > +	int tag;
> >  
> > -	/* This looks up the original request by tag so we know which
> > -	 * buffer to read the data into */
> > -	tag++;
> > -
> > -	if (tag >= c->max_tag) {
> > -		spin_lock_irqsave(&c->lock, flags);
> > -		/* check again since original check was outside of lock */
> > -		while (tag >= c->max_tag) {
> > -			row = (tag / P9_ROW_MAXTAG);
> > -			c->reqs[row] = kcalloc(P9_ROW_MAXTAG,
> > -					sizeof(struct p9_req_t), GFP_ATOMIC);
> > -
> > -			if (!c->reqs[row]) {
> > -				pr_err("Couldn't grow tag array\n");
> > -				spin_unlock_irqrestore(&c->lock, flags);
> > -				return ERR_PTR(-ENOMEM);
> > -			}
> > -			for (col = 0; col < P9_ROW_MAXTAG; col++) {
> > -				req = &c->reqs[row][col];
> > -				req->status = REQ_STATUS_IDLE;
> > -				init_waitqueue_head(&req->wq);
> > -			}
> > -			c->max_tag += P9_ROW_MAXTAG;
> > -		}
> > -		spin_unlock_irqrestore(&c->lock, flags);
> > -	}
> > -	row = tag / P9_ROW_MAXTAG;
> > -	col = tag % P9_ROW_MAXTAG;
> > +	if (!req)
> > +		return NULL;
> >  
> > -	req = &c->reqs[row][col];
> > -	if (!req->tc)
> > -		req->tc = p9_fcall_alloc(alloc_msize);
> > -	if (!req->rc)
> > -		req->rc = p9_fcall_alloc(alloc_msize);
> > +	req->tc = p9_fcall_alloc(alloc_msize);
> > +	req->rc = p9_fcall_alloc(alloc_msize);
> >  	if (!req->tc || !req->rc)
> > -		goto grow_failed;
> > +		goto free;
> >  
> >  	p9pdu_reset(req->tc);
> >  	p9pdu_reset(req->rc);
> > -
> > -	req->tc->tag = tag-1;
> >  	req->status = REQ_STATUS_ALLOC;
> > +	init_waitqueue_head(&req->wq);
> > +	INIT_LIST_HEAD(&req->req_list);
> > +
> > +	idr_preload(GFP_NOFS);
> > +	spin_lock_irq(&c->lock);
> > +	if (type == P9_TVERSION)
> > +		tag = idr_alloc(&c->reqs, req, P9_NOTAG, P9_NOTAG + 1,
> > +				GFP_NOWAIT);
> > +	else
> > +		tag = idr_alloc(&c->reqs, req, 0, P9_NOTAG, GFP_NOWAIT);
> > +	req->tc->tag = tag;
> > +	spin_unlock_irq(&c->lock);
> > +	idr_preload_end();
> > +	if (tag < 0)
> > +		goto free;  
> 


