All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benny Halevy <benny@tonian.com>
To: tao.peng@emc.com
Cc: bergwolf@gmail.com, rees@umich.edu, linux-nfs@vger.kernel.org,
	honey@citi.umich.edu
Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
Date: Fri, 10 Jun 2011 15:23:32 -0400	[thread overview]
Message-ID: <4DF26F34.8080008@tonian.com> (raw)
In-Reply-To: <F19688880B763E40B28B2B462677FBF805BED9D2AC@MX09A.corp.emc.com>

On 2011-06-10 10:09, tao.peng@emc.com wrote:
> Hi, Benny,
> 
> -----Original Message-----
> From: Benny Halevy [mailto:benny@tonian.com] 
> Sent: Friday, June 10, 2011 8:33 PM
> To: Peng, Tao
> Cc: bergwolf@gmail.com; rees@umich.edu; linux-nfs@vger.kernel.org; honey@citi.umich.edu
> Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
> 
> On 2011-06-10 02:00, tao.peng@emc.com wrote:
>> Hi, Benny,
>>
>> Cheers,
>> -Bergwolf
>>
>>
>> -----Original Message-----
>> From: linux-nfs-owner@vger.kernel.org [mailto:linux-nfs-owner@vger.kernel.org] On Behalf Of Benny Halevy
>> Sent: Friday, June 10, 2011 5:23 AM
>> To: Peng Tao
>> Cc: Jim Rees; linux-nfs@vger.kernel.org; peter honeyman
>> Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
>>
>> On 2011-06-09 08:07, Peng Tao wrote:
>>> Hi, Jim and Benny,
>>>
>>> On Thu, Jun 9, 2011 at 9:58 PM, Jim Rees <rees@umich.edu> wrote:
>>>> Benny Halevy wrote:
>>>>
>>>>  > My understanding is that layoutget specifies a min and max, and the server
>>>>
>>>>  There's a min.  What do you consider the max?
>>>>  Whatever gets into csa_fore_chan_attrs.ca_maxresponsesize?
>>>>
>>>> The spec doesn't say max, it says "desired."  I guess I assumed the server
>>>> wouldn't normally return more than desired.
>>> In fact server is returning "desired" length. The problem is that we
>>> call pnfs_update_layout in nfs_write_begin, and it will end up setting
>>> both minlength and length to page size. There is no space for client
>>> to collapse layoutget range in nfs_write_begin.
>>>
>>
>> That's a different issue.  Waiting with pnfs_update_layout to flush
>> time rather than write_begin if the whole page is written would help
>> sending a more meaningful desired range as well as avoiding needless
>> read-modify-writes in case the application also wrote the whole
>> preallocated block.
>> [PT] It is also the reason why we want to introduce layout prefetching, to get more segment than the page passed in nfs_write_begin.
>>
> 
> Peng, I understand what you want to achieve but the proposed way
> just doesn't fly. The server knows better than the client its allocation policies
> and it knows better the combined workload of different client and possible
> conflicts between them therefore it should be making the ultimate decision
> about the actual segment sizes.
> [PT] Yes, you are right. Server should know combined workload of all clients and make its decision based on that.
> And it always has the right to return more than (or less than) specified in loga_length.
>  
> That said, the client should indeed do its best to ask for the most appropriate
> segments size for its use and we should be making a better job at that.
> It's just that blindly asking for more is not a good strategy and requiring
> manual admin help to tune the clients is not acceptable.
> [PT] yeah, determing the most appropriate is always the hart part. Do you have any suggestions to that?

A simple algorithm I can suggest is:
- on initialization, calculate and save, per layout driver
  - maximum layout size
    - take into account csr_fore_chan_attrs.ca_maxresponsesize and possible other parameters
  - keep a working copy of the maximum value and the calculated copy.
  - alignment value.
- on miss, see if there's an adjacent layout segment in cache
- if found, ask for twice the found segment size, up to the maximum value,
  aligned on the alignment value.
- if the server returns less the layoutget range, keep note of the returned length
  (but not adjust maximum yet, as the server may return a short segment for various
   reasons)
- if the server is consistent about returning less than was asked, adjust the
  - working copy of the maximum length
- if the maximum was adjusted try bumping it up after X (TBD) layoutgets or T seconds
  to see if that was just due to high load or conflicts on the server
- on any error returned for LAYOUTGET reset the algorithm parameters
- on session reestablishment recalculate maximums.

Benny

> 
> Thanks,
> Tao

  reply	other threads:[~2011-06-10 19:23 UTC|newest]

Thread overview: 136+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-07 17:24 [PATCH 00/88] pnfs block layout driver rees
2011-06-07 17:26 ` [PATCH 01/88] pnfs: add set-clear layoutdriver interface Jim Rees
2011-06-07 17:26 ` [PATCH 02/88] pnfs: let layoutcommit code handle multiple segments Jim Rees
2011-06-07 17:26 ` [PATCH 03/88] pnfs_post_submit: Restore "pnfs: pnfs_do_flush" part 1 Jim Rees
2011-06-07 17:26 ` [PATCH 04/88] pnfs_post_submit: Restore the pnfs_write_end part of "pnfs: commit and pnfs_write_end" Jim Rees
2011-06-07 17:26 ` [PATCH 05/88] pnfs: xdr support for three word attribute bitmap Jim Rees
2011-06-07 17:26 ` [PATCH 06/88] pnfs: HACK: ask for layout_blksize on mount Jim Rees
2011-06-07 17:26 ` [PATCH 07/88] pnfs: HACK: modify write_end_cleanup Jim Rees
2011-06-07 17:26 ` [PATCH 08/88] HACK: propagate fsdata into nfs_writepage_setup Jim Rees
2011-06-07 17:26 ` [PATCH 09/88] pnfs: HACK: adjust eof handling Jim Rees
2011-06-07 17:27 ` [PATCH 10/88] pnfsblock: define PNFS_BLOCK Kconfig option Jim Rees
2011-06-07 17:27 ` [PATCH 11/88] pnfsblock: blocklayout stub Jim Rees
2011-06-07 17:27 ` [PATCH 12/88] pnfsblock: expose scsi interface Jim Rees
2011-06-07 17:27 ` [PATCH 13/88] pnfsblock: scan scsi devices Jim Rees
2011-06-07 17:27 ` [PATCH 14/88] pnfsblock: call and parse getdevicelist Jim Rees
2011-06-07 17:27 ` [PATCH 15/88] pnfsblock: dm kernel interface Jim Rees
2011-06-07 17:27 ` [PATCH 16/88] pnfsblock: select BLK_DEV_DM when PNFS_BLOCK is configured Jim Rees
2011-06-07 17:27 ` [PATCH 17/88] pnfsblock: create and destroy dm metadevice Jim Rees
2011-06-07 17:27 ` [PATCH 18/88] pnfsblock: construct and load md table Jim Rees
2011-06-07 17:28 ` [PATCH 19/88] pnfsblock: layout alloc and free Jim Rees
2011-06-07 17:28 ` [PATCH 20/88] pnfsblock: basic extent code Jim Rees
2011-06-07 17:28 ` [PATCH 21/88] pnfsblock: lseg alloc and free Jim Rees
2011-06-07 17:28 ` [PATCH 22/88] pnfsblock: xdr decode pnfs_block_layout4 Jim Rees
2011-06-07 17:28 ` [PATCH 23/88] pnfsblock: merge extents Jim Rees
2011-06-07 17:28 ` [PATCH 24/88] pnfsblock: find_get_extent Jim Rees
2011-06-07 17:28 ` [PATCH 25/88] pnfsblock: bl_read_pagelist Jim Rees
2011-06-07 17:28 ` [PATCH 26/88] pnfsblock: allow use of PG_owner_priv_1 flag Jim Rees
2011-06-07 17:29 ` [PATCH 27/88] pnfsblock: read path error handling Jim Rees
2011-06-07 17:29 ` [PATCH 28/88] pnfsblock: SPLITME: add extent manipulation functions Jim Rees
2011-06-07 17:29 ` [PATCH 29/88] pnfsblock: write_begin Jim Rees
2011-06-07 17:29 ` [PATCH 30/88] pnfsblock: write_end Jim Rees
2011-06-07 17:29 ` [PATCH 31/88] pnfsblock: write_end_cleanup Jim Rees
2011-06-07 17:29 ` [PATCH 32/88] pnfsblock: bl_write_pagelist support functions Jim Rees
2011-06-07 17:29 ` [PATCH 33/88] pnfsblock: bl_write_pagelist Jim Rees
2011-06-07 17:29 ` [PATCH 34/88] pnfsblock: note written INVAL areas for layoutcommit Jim Rees
2011-06-07 17:30 ` [PATCH 35/88] pnfsblock: bl_setup_layoutcommit Jim Rees
2011-06-07 17:30 ` [PATCH 36/88] pnfsblock: encode_layoutcommit Jim Rees
2011-06-07 17:30 ` [PATCH 37/88] pnfsblock: cleanup_layoutcommit Jim Rees
2011-06-07 17:30 ` [PATCH 38/88] pnfsblock: merge rw extents Jim Rees
2011-06-07 17:30 ` [PATCH 39/88] pnfsblock: debugging dprintks for clist info Jim Rees
2011-06-07 17:30 ` [PATCH 40/88] SQAUSHME: blocklayoutdriver: NULL pointer reference when committing too many extents Jim Rees
2011-06-07 17:30 ` [PATCH 41/88] SQUASHME: pnfs-block: remove of CONFIG_PNFS fallout Jim Rees
2011-06-07 17:30 ` [PATCH 42/88] SQUASHME: pnfsblock: Fix a memory leak Jim Rees
2011-06-07 17:31 ` [PATCH 43/88] SQUASHME: pnfsblock: fix bug when decoding block device info Jim Rees
2011-06-07 17:31 ` [PATCH 44/88] SQUASHME: pnfsblock: Wrong extent refcount in block extents list Jim Rees
2011-06-07 17:31 ` [PATCH 45/88] SQUASHME: pnfsblock: Implement release_inval_marks Jim Rees
2011-06-07 17:31 ` [PATCH 46/88] SQUASHME: pnfsblock: Fix missing extent in commit list Jim Rees
2011-06-07 17:31 ` [PATCH 47/88] pnfsblock: use the session max response size for getdeviceinfo's maxcount Jim Rees
2011-06-07 17:31 ` [PATCH 48/88] SQUASHME: pnfs-block: fix compile breakage Jim Rees
2011-06-07 17:31 ` [PATCH 49/88] SQUASHME: pnfs-block: convert APIs pnfs-post-submit Jim Rees
2011-06-07 17:32 ` [PATCH 50/88] pnfsblock: Lookup list entry of layouts and tags in reverse order Jim Rees
2011-06-07 17:32 ` [PATCH 51/88] pnfsblock: expose block_class interface Jim Rees
2011-06-07 17:32 ` [PATCH 52/88] pnfsblock: iterating all local block disks instead of only scsi disks when initializing mount point Jim Rees
2011-06-07 17:32 ` [PATCH 53/88] SQUASHME: pnfsblock: set pnfs_blksize before calling set_pnfs_layoutdriver Jim Rees
2011-06-07 17:32 ` [PATCH 54/88] SQUASHME: pnfsblock: get rid of threshold policy ops Jim Rees
2011-06-07 17:32 ` [PATCH 55/88] SQUASHME: pnfsblock: write_begin adjust for removed fields Jim Rees
2011-06-07 17:32 ` [PATCH 56/88] SQUASHME: pnfsblock: write_end adjust for removed ok_to_use_pnfs Jim Rees
2011-06-07 17:32 ` [PATCH 57/88] SQUASHME: pnfsblock: write_end_cleanup " Jim Rees
2011-06-07 17:32 ` [PATCH 58/88] SQUASHME: pnfsblock: bl_write_pagelist support functions adjust for missing PG_USE_PNFS Jim Rees
2011-06-07 17:33 ` [PATCH 59/88] SQUASHME: pnfsblock: bl_write_pagelist " Jim Rees
2011-06-07 17:33 ` [PATCH 60/88] SQUASHME: pnfs-block: nfs4_blk_add_block_disk ret must be signed Jim Rees
2011-06-07 17:33 ` [PATCH 61/88] SQUASHME: pnfs-block: use new alloc/free_layout API Jim Rees
2011-06-07 17:33 ` [PATCH 62/88] SQUASHME: pnfs-block: use new commit api Jim Rees
2011-06-07 17:33 ` [PATCH 63/88] SQUASHME: pnfs-block: use new read_pagelist api Jim Rees
2011-06-07 17:33 ` [PATCH 64/88] SQUASHME: pnfs-block: use new write_pagelist api Jim Rees
2011-06-07 17:33 ` [PATCH 65/88] pnfs-block: Add support for simple rpc pipefs Jim Rees
2011-06-07 17:33 ` [PATCH 66/88] pnfs-block: Remove device creation from kernel Jim Rees
2011-06-07 17:33 ` [PATCH 67/88] SQUASHME: pnfs-block: apply types rename Jim Rees
2011-06-07 17:34 ` [PATCH 68/88] SQUASHME: pnfs-block: Revert "pnfsblock: expose block_class interface" Jim Rees
2011-06-07 17:34 ` [PATCH 69/88] SQUASHME: pnfsblock: remove obsolete include file from blocklayout.h Jim Rees
2011-06-07 17:34 ` [PATCH 70/88] SQUASHME: pnfsblock: use nfs4_deviceid Jim Rees
2011-06-07 17:34 ` [PATCH 71/88] SQUASHME: pnfsblock: no callback ops Jim Rees
2011-06-07 17:34 ` [PATCH 72/88] SQAUSHME: pnfsblock: no PNFS_NFS_SERVER Jim Rees
2011-06-07 17:34 ` [PATCH 73/88] SQUASHME: pnfsblock: no dev_notify_types Jim Rees
2011-06-07 17:34 ` [PATCH 74/88] SQUASHME: pnfsblock: use new struct pnfs_layout_hdr Jim Rees
2011-06-07 17:34 ` [PATCH 75/88] SQUASHME: pnfsblock: compile error in blocklayout code Jim Rees
2011-06-07 17:34 ` [PATCH 76/88] SQUASHME: pnfs-block: deprecate get_stripesize Jim Rees
2011-06-07 17:35 ` [PATCH 77/88] move include lines out of include file Jim Rees
2011-06-07 17:35 ` [PATCH 78/88] SQUASHME: pnfs-block: use {set,clear}_layoutdriver Jim Rees
2011-06-07 17:35 ` [PATCH 79/88] SQUASHME: pnfs-block: Return failure from bl_initialize_mountpoint Jim Rees
2011-06-07 17:35 ` [PATCH 80/88] SQUASHME: pnfs-block: fixup setup_layoutcommit arguments Jim Rees
2011-06-07 17:35 ` [PATCH 81/88] SQUASHME: pnfs-block: fixup cleanup_layoutcommit arguments Jim Rees
2011-06-07 17:35 ` [PATCH 82/88] SQUASHME: pnfs-block: fixup encode_layoutcommit arguments Jim Rees
2011-06-07 17:35 ` [PATCH 83/88] SQUASHME: pnfs-block: fixup layoutcommit methods args Jim Rees
2011-06-07 17:35 ` [PATCH 84/88] pnfs-block: fix blocklayoutdev.c for new blkdev_get_by_dev() Jim Rees
2011-06-07 17:35 ` [PATCH 85/88] SQUASHME: pnfs-block: use pnfs_layout_hdr field prefix Jim Rees
2011-06-07 17:35 ` [PATCH 86/88] SQUASHME: pnfs: blocklayout: port block layout code Jim Rees
2011-06-08  1:27   ` Benny Halevy
2011-06-08  2:06   ` Benny Halevy
2011-06-08  7:38     ` Peng Tao
2011-06-07 17:36 ` [PATCH 87/88] Add configurable prefetch size for layoutget Jim Rees
2011-06-08  2:01   ` Benny Halevy
2011-06-08  2:18     ` Jim Rees
2011-06-08  7:15       ` Peng Tao
2011-06-09  6:06         ` Benny Halevy
2011-06-09 11:49           ` Jim Rees
2011-06-09 13:32             ` Benny Halevy
2011-06-09 13:58               ` Jim Rees
2011-06-09 15:07                 ` Peng Tao
2011-06-09 21:22                   ` Benny Halevy
2011-06-10  6:00                     ` tao.peng
2011-06-10 12:33                       ` Benny Halevy
2011-06-10 14:09                         ` tao.peng
2011-06-10 19:23                           ` Benny Halevy [this message]
2011-06-10 20:03                             ` Fred Isaman
2011-06-10 21:15                               ` Benny Halevy
2011-06-11  1:46                                 ` Peng Tao
2011-06-10 23:20                             ` Boaz Harrosh
2011-06-11  2:19                               ` Peng Tao
2011-06-12 14:40                                 ` Boaz Harrosh
2011-06-12 18:46                                   ` Peng Tao
2011-06-11  1:35                             ` Peng Tao
2011-06-09 21:23                 ` Benny Halevy
2011-06-10  5:36                   ` tao.peng
2011-06-10 12:36                     ` Benny Halevy
2011-06-10 14:17                       ` tao.peng
2011-06-10 19:02                         ` Benny Halevy
2011-06-09 15:01             ` Peng Tao
2011-06-09 14:54           ` Peng Tao
2011-06-09 21:30             ` Benny Halevy
2011-06-10  6:02               ` tao.peng
2011-06-10 12:47                 ` Benny Halevy
2011-06-10 14:30                   ` tao.peng
2011-06-10 19:07                     ` Benny Halevy
2011-06-10 16:23                   ` Boaz Harrosh
2011-06-10 16:44                     ` Boaz Harrosh
2011-06-09  6:08         ` Benny Halevy
2011-06-07 17:36 ` [PATCH 88/88] NFS41: do not update isize if inode needs layoutcommit Jim Rees
2011-06-08  2:05   ` Benny Halevy
2011-06-08  7:06     ` Peng Tao
2011-06-08  7:29       ` Peng Tao
2011-06-09 21:52 ` [PATCH 00/88] pnfs block layout driver Boaz Harrosh
2011-06-09 22:15   ` Jim Rees
2011-06-10  2:16     ` Boaz Harrosh
2011-06-10  2:20       ` Boaz Harrosh
2011-06-10  4:04     ` Benny Halevy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DF26F34.8080008@tonian.com \
    --to=benny@tonian.com \
    --cc=bergwolf@gmail.com \
    --cc=honey@citi.umich.edu \
    --cc=linux-nfs@vger.kernel.org \
    --cc=rees@umich.edu \
    --cc=tao.peng@emc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.