* Linux kernel 3.18+ blocklayout driver support only 4k pNFS block size
@ 2015-03-31 15:16 Gál, Balázs
  2015-04-01  8:47 ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Gál, Balázs @ 2015-03-31 15:16 UTC (permalink / raw)
  To: linux-nfs

Hi all!

Starting with version 3.18, the Linux kernel blocklayout driver supports only pNFS servers whose block size is <= PAGE_SIZE.

You can find this restriction in fs/nfs/blocklayout/blocklayout.c, in the bl_set_layoutdriver function:

	if (server->pnfs_blksize > PAGE_SIZE) {
		printk(KERN_ERR "%s: pNFS blksize %d not supported.\n",
			__func__, server->pnfs_blksize);
		return -EINVAL;
	}

The page size is 4k on x64, but we use an EMC VNX5300 storage system as a pNFS server, which uses a hardcoded 8k file system block size.
So I can no longer use the VNX pNFS functionality with the current kernel.
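
For illustration, the effect of that check can be modelled in userspace. This is only a sketch: check_pnfs_blksize() and its page_size parameter are hypothetical names, not kernel code; in the kernel the comparison is against the compile-time PAGE_SIZE.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical userspace model of the check in bl_set_layoutdriver().
 * Returns 0 if the server block size is usable by the client,
 * -EINVAL if it is larger than the client's page size.
 */
int check_pnfs_blksize(unsigned int pnfs_blksize, unsigned int page_size)
{
	if (pnfs_blksize > page_size) {
		fprintf(stderr, "pNFS blksize %u not supported (page size %u)\n",
			pnfs_blksize, page_size);
		return -EINVAL;
	}
	return 0;
}
```

With page_size = 4096, a server block size of 4096 is accepted and 8192 is rejected. The same 8192 would pass only on a client built with a larger page size, for example 64k.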

As far as I know, EMC is not the only solution with an 8k block size, so this can break a lot of products on the market.

There is another issue with the client's layout get request implementation: it limits the pNFS max file size to 524MB, with a maximum layout length of 4k in the request.
Is this a normal limitation with other block layout pNFS servers as well?

I'm just a simple user from the broadcast industry and only want to report this, but it would be great if someone could handle this problem somehow.

Thanks for your help!

Best regards

BALÁZS GÁL




* Re: Linux kernel 3.18+ blocklayout driver support only 4k pNFS block size
  2015-03-31 15:16 Linux kernel 3.18+ blocklayout driver support only 4k pNFS block size Gál, Balázs
@ 2015-04-01  8:47 ` Christoph Hellwig
  2015-04-01 16:21   ` Gál, Balázs
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2015-04-01  8:47 UTC (permalink / raw)
  To: Gál, Balázs; +Cc: linux-nfs

On Tue, Mar 31, 2015 at 03:16:41PM +0000, Gál, Balázs wrote:
> The page size is 4k on x64, but we use an EMC VNX5300 storage system as a pNFS server, which uses a hardcoded 8k file system block size.
> So I can no longer use the VNX pNFS functionality with the current kernel.
> 
> As far as I know, EMC is not the only solution with an 8k block size, so this can break a lot of products on the market.

Which other pNFS block servers are around?

> There is another issue with the client's layout get request implementation: it limits the pNFS max file size to 524MB, with a maximum layout length of 4k in the request.

There is no concept of a max pNFS file size.  Do you mean a maximum
layout size?  There is no real deliberate limit.

> I'm just a simple user from the broadcast industry and only want to report this, but it would be great if someone could handle this problem somehow.

Linux can't guarantee writeback in units bigger than 4k.  The old
code tried to work around this deep down in the stack in an extremely
deadlock-prone way.  We could do this in a less deadlock-prone way,
but it would also be a lot less performant when reading in data.
Either way we'd need some deeply technical resource with access to
the 8k servers to help test and debug the support.


* RE: Linux kernel 3.18+ blocklayout driver support only 4k pNFS block size
  2015-04-01  8:47 ` Christoph Hellwig
@ 2015-04-01 16:21   ` Gál, Balázs
  2015-04-02  8:27     ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Gál, Balázs @ 2015-04-01 16:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-nfs

> There is no concept of a max pNFS file size.  Do you mean a maximum layout size?  There is no real deliberate limit.

This is the maximum file size that the block layout driver can handle, because of the maximum layout size limit (at least with EMC).
Every file bigger than this will be handled over the IP network.
I'm not familiar with the layout; maybe an offset could be used, but it seems this is not the case.

The client layout request:
layout type: LAYOUT4_BLOCK_VOLUME
IO mode: IOMODE_RW
offset: 0 !!!
length: 524288000 !!!
min length: 4096
stateid.maxcount: 4096 !!!

The server responds with NFS4ERR_TOOSMALL status to indicate that the layout can't fit in 4k, which seems to be acceptable behavior.
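
To put numbers on that, here is a rough back-of-the-envelope sketch. It assumes the XDR encoding of pnfs_block_extent4 from RFC 5663 (16-byte device ID, three 8-byte offset/length fields, 4-byte state, so 44 bytes per extent) plus a 4-byte extent count, and it ignores any other overhead the server may count against maxcount. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumed XDR size of one pnfs_block_extent4 (RFC 5663): 16 + 8 + 8 + 8 + 4 */
#define BLOCK_EXTENT_XDR_SIZE 44u
/* Assumed XDR size of the extent array length that precedes the extents */
#define EXTENT_COUNT_XDR_SIZE 4u

/* How many extents fit in a layout body of at most maxcount bytes. */
uint32_t max_extents_in_layout(uint32_t maxcount)
{
	if (maxcount < EXTENT_COUNT_XDR_SIZE)
		return 0;
	return (maxcount - EXTENT_COUNT_XDR_SIZE) / BLOCK_EXTENT_XDR_SIZE;
}

/*
 * Smallest average extent size (in bytes, rounded up) the server would
 * need to describe `length` bytes within maxcount; 0 if nothing fits.
 */
uint64_t min_avg_extent_bytes(uint64_t length, uint32_t maxcount)
{
	uint32_t n = max_extents_in_layout(maxcount);

	if (n == 0)
		return 0;
	return (length + n - 1) / n;
}
```

Under these assumptions a 4096-byte reply holds at most 93 extents, so covering the requested 524288000 bytes would need extents averaging roughly 5.4 MiB each. A server with a more fragmented allocation could not describe the range in one reply.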

> Do you mean a maximum layout size?  

Yes. (The stateid.maxcount)

> Linux can't guarantee writeback in units bigger than 4k.  The old code tried to work around this deep down in the stack in an extremely deadlock-prone way.  We could do this in a less deadlock-prone way,
> but it would also be a lot less performant when reading in data.  Either way we'd need some deeply technical resource with access to the 8k servers to help test and debug the support.

This would be great. Maybe EMC can support this with a test environment. We are in contact with EMC, and we can push them to provide one.

Our pNFS story with EMC started on 01/22/2015, and we are still waiting for a patch. Currently we will not use pNFS, because it seems it's not production ready.

Thx

Balazs



* Re: Linux kernel 3.18+ blocklayout driver support only 4k pNFS block size
  2015-04-01 16:21   ` Gál, Balázs
@ 2015-04-02  8:27     ` Christoph Hellwig
  0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2015-04-02  8:27 UTC (permalink / raw)
  To: Gál, Balázs; +Cc: Christoph Hellwig, linux-nfs

On Wed, Apr 01, 2015 at 04:21:31PM +0000, Gál, Balázs wrote:
> > There is no concept of a max pNFS file size.  Do you mean a maximum layout size?  There is no real deliberate limit.
> 
> This is the maximum file size that the block layout driver can handle, because of the maximum layout size limit (at least with EMC).

So it's a server-side limitation.  That also explains why no EMC user
ever saw the instant crash with > 2TB volumes in the old driver.

> Our pNFS story with EMC started on 01/22/2015, and we are still waiting for a patch. Currently we will not use pNFS, because it seems it's not production ready.

The NFS client driver was in very bad shape when I started testing
it against my new server, with crashes, lockups and lots of data
corruption.  This is why I ended up rewriting it entirely for Linux
3.18.  I would not recommend using the older one at all.
