linux-kernel.vger.kernel.org archive mirror
* [blockdevices/NBD] huge read/write-operations are splitted by the kernel
@ 2003-09-08  0:02 Sven Köhler
  2003-09-08  8:58 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Sven Köhler @ 2003-09-08  0:02 UTC (permalink / raw)
  To: linux-kernel

Hi,

I discussed a problem of the NBD protocol with Pavel Machek. The problem
I saw is that there is no maximum for the length field in the requests
that the NBD kernel module sends to the NBD server. This length field is
simply the length of the read/write operation that the kernel delegates
to the block device implementation.
I did some tests like
   dd if=/dev/nbd/0 of=/dev/null bs=10M
and our NBD server implementation printed out the length field of each
request. There was a very regular pattern like
   0x1fc00 (127KB)
   0x00400 (1KB)
   0x1fc00
   0x00400
   ...
Can anybody explain that to me?
(Why the "little" 1KB requests? But that's not important.)
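
For reference, the two length values can be decoded as below. This is only
a small sketch that assumes the conventional 512-byte sector; the helper
name describe_length is made up for illustration and is not part of any
NBD tool.

```python
SECTOR_SIZE = 512  # bytes; assumed conventional sector size


def describe_length(length):
    """Return (bytes, sectors, KB) for an NBD request length field."""
    sectors, remainder = divmod(length, SECTOR_SIZE)
    if remainder:
        raise ValueError("length is not a whole number of sectors")
    return length, sectors, length / 1024


# The two values seen in the server log.
for value in (0x1fc00, 0x00400):
    nbytes, sectors, kb = describe_length(value)
    print(f"{value:#07x}: {nbytes} bytes = {sectors} sectors = {kb:g} KB")
```

Under that assumption the two requests are 254 and 2 sectors long, so each
pair adds up to 256 sectors (128KB).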

I also tested
   dd if=/dev/nbd/0 of=/dev/null bs=1
which means that the device is read in chunks of 1 byte.
The result was the same: 127KB, 1KB, 127KB, 1KB...

I guess the caching layer sits in between and divides the huge 10MB
requests into smaller 127KB ones, and also joins the small 1-byte
requests by using read-ahead.
Perhaps you could tell me how I can turn off caching. Then I will test
again without the cache.

What I want to know is whether any part of the kernel guarantees that a
read/write request will not be bigger than a certain value. If there is
no such upper limit, NBD itself would need to split things up, which
might become a complicated task. The splitting needs to be done because
it can become very difficult for the NBD server to handle huge values,
and one huge request will block all other pending small ones due to
limitations of the NBD protocol.
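
To illustrate what "splitting things up" would mean, here is a minimal
sketch of cutting one (offset, length) request into bounded pieces. The
function split_request and the 128KB limit are hypothetical choices for
this example; this is not how the NBD module or any existing server does
it.

```python
def split_request(offset, length, max_bytes):
    """Yield (offset, length) pieces, each no larger than max_bytes."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    end = offset + length
    while offset < end:
        piece = min(max_bytes, end - offset)
        yield offset, piece
        offset += piece


# A 10MB request cut into pieces of at most 128KB (illustrative limit).
pieces = list(split_request(0, 10 * 1024 * 1024, 128 * 1024))
print(len(pieces), pieces[0], pieces[-1])
```

Splitting like this only bounds the size of each piece. It does not by
itself solve the ordering problem, since the pieces of one huge request
would still sit in front of the small ones.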

Thx
   Sven




* Re: [blockdevices/NBD] huge read/write-operations are splitted by the kernel
  2003-09-08  0:02 [blockdevices/NBD] huge read/write-operations are splitted by the kernel Sven Köhler
@ 2003-09-08  8:58 ` Jens Axboe
  2003-09-08 12:42   ` Sven Köhler
  2003-09-08 13:26   ` Sven Köhler
  0 siblings, 2 replies; 6+ messages in thread
From: Jens Axboe @ 2003-09-08  8:58 UTC (permalink / raw)
  To: Sven Köhler; +Cc: linux-kernel

On Mon, Sep 08 2003, Sven Köhler wrote:
> Hi,
> 
> I discussed a problem of the NBD protocol with Pavel Machek. The problem
> I saw is that there is no maximum for the length field in the requests
> that the NBD kernel module sends to the NBD server. This length field is
> simply the length of the read/write operation that the kernel delegates
> to the block device implementation.
> I did some tests like
>   dd if=/dev/nbd/0 of=/dev/null bs=10M
> and our NBD server implementation printed out the length field of each
> request. There was a very regular pattern like
>   0x1fc00 (127KB)
>   0x00400 (1KB)
>   0x1fc00
>   0x00400
>   ...
> Can anybody explain that to me?
> (Why the "little" 1KB requests? But that's not important.)
> 
> I also tested
>   dd if=/dev/nbd/0 of=/dev/null bs=1
> which means that the device is read in chunks of 1 byte.
> The result was the same: 127KB, 1KB, 127KB, 1KB...
> 
> I guess the caching layer sits in between and divides the huge 10MB
> requests into smaller 127KB ones, and also joins the small 1-byte
> requests by using read-ahead.
> Perhaps you could tell me how I can turn off caching. Then I will test
> again without the cache.
> 
> What I want to know is whether any part of the kernel guarantees that a
> read/write request will not be bigger than a certain value. If there is
> no such upper limit, NBD itself would need to split things up, which
> might become a complicated task. The splitting needs to be done because
> it can become very difficult for the NBD server to handle huge values,
> and one huge request will block all other pending small ones due to
> limitations of the NBD protocol.

You'll probably find that if you bump the max_sectors count of your
drive to 256 from 255 (that is the default if you haven't set it), then
you'll see 128KB chunks all the time.

See max_sectors[] array.
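
A toy model of why a limit of 255 could give the 127KB/1KB pattern while
256 gives plain 128KB chunks. The assumptions are loud ones: a 128KB
(256-sector) read-ahead window and 1KB (2-sector) blocks are guesses
chosen to match the reported numbers, and chunk_sizes is a made-up helper,
not the kernel's request merging code.

```python
SECTOR_SIZE = 512  # bytes; assumed conventional sector size


def chunk_sizes(window_sectors, max_sectors, block_sectors):
    """Split a window into requests capped at max_sectors, in whole blocks."""
    # Largest request that is a whole number of blocks and within the cap.
    cap = (max_sectors // block_sectors) * block_sectors
    if cap == 0:
        raise ValueError("max_sectors is smaller than one block")
    sizes = []
    remaining = window_sectors
    while remaining > 0:
        take = min(cap, remaining)
        sizes.append(take * SECTOR_SIZE)
        remaining -= take
    return sizes


# Assumed: 256-sector window, 2-sector blocks; compare both limits.
for limit in (255, 256):
    print(limit, [hex(n) for n in chunk_sizes(256, limit, 2)])
```

In this model an odd limit of 255 rounds down to 254 sectors per request,
which leaves a 2-sector remainder in every window.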

-- 
Jens Axboe



* Re: [blockdevices/NBD] huge read/write-operations are splitted by the kernel
  2003-09-08  8:58 ` Jens Axboe
@ 2003-09-08 12:42   ` Sven Köhler
  2003-09-08 14:33     ` Jens Axboe
  2003-09-08 13:26   ` Sven Köhler
  1 sibling, 1 reply; 6+ messages in thread
From: Sven Köhler @ 2003-09-08 12:42 UTC (permalink / raw)
  To: linux-kernel

> You'll probably find that if you bump the max_sectors count of your
> drive to 256 from 255 (that is the default if you haven't set it), then
> you'll see 128KB chunks all the time.

Why is 255 the default? It seems to be an inefficient value. Perhaps
NBD itself should set it to 256.

> See max_sectors[] array.

Well, I found the declaration, but I can't see how to set the values
in it.




* Re: [blockdevices/NBD] huge read/write-operations are splitted by the kernel
  2003-09-08  8:58 ` Jens Axboe
  2003-09-08 12:42   ` Sven Köhler
@ 2003-09-08 13:26   ` Sven Köhler
  2003-09-08 14:34     ` Jens Axboe
  1 sibling, 1 reply; 6+ messages in thread
From: Sven Köhler @ 2003-09-08 13:26 UTC (permalink / raw)
  To: linux-kernel

> You'll probably find that if you bump the max_sectors count of your
> drive to 256 from 255 (that is the default if you haven't set it), then
> you'll see 128KB chunks all the time.
> 
> See max_sectors[] array.

To make it clear:
the kernel will never read or write more sectors at once than specified 
in the max_sectors array (where every device has its own value), right?




* Re: [blockdevices/NBD] huge read/write-operations are splitted by the kernel
  2003-09-08 12:42   ` Sven Köhler
@ 2003-09-08 14:33     ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2003-09-08 14:33 UTC (permalink / raw)
  To: Sven Köhler; +Cc: linux-kernel

On Mon, Sep 08 2003, Sven Köhler wrote:
> >You'll probably find that if you bump the max_sectors count of your
> >drive to 256 from 255 (that is the default if you haven't set it), then
> >you'll see 128KB chunks all the time.
> 
> Why is 255 the default? It seems to be an inefficient value. Perhaps
> NBD itself should set it to 256.

To avoid 8-bit wraparounds, basically. Not sure that's still very valid;
you are free to compile your kernel with it set to 256. 2.6 uses 256 by
default.
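
A small illustration of the wraparound, assuming the concern is a sector
count stored in an unsigned 8-bit field somewhere (which field or driver
is not spelled out here). The helper as_8bit is made up for this example.

```python
def as_8bit(count):
    """Return the value left after storing count in an unsigned 8-bit field."""
    return count & 0xFF


# 255 survives, 256 does not fit and wraps.
for count in (255, 256):
    print(count, "->", as_8bit(count))
```

255 is the largest count that survives such a field unchanged; 256 wraps
to 0.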

> >See max_sectors[] array.
> 
> Well, I found the declaration, but I can't see how to set the values
> in it.

You can grep for other examples in the kernel, I would imagine?

-- 
Jens Axboe



* Re: [blockdevices/NBD] huge read/write-operations are splitted by the kernel
  2003-09-08 13:26   ` Sven Köhler
@ 2003-09-08 14:34     ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2003-09-08 14:34 UTC (permalink / raw)
  To: Sven Köhler; +Cc: linux-kernel

On Mon, Sep 08 2003, Sven Köhler wrote:
> >You'll probably find that if you bump the max_sectors count of your
> >drive to 256 from 255 (that is the default if you haven't set it), then
> >you'll see 128KB chunks all the time.
> >
> >See max_sectors[] array.
> 
> To make it clear:
> the kernel will never read or write more sectors at once than specified 
> in the max_sectors array (where every device has its own value), right?

Correct
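
In byte terms, a server could turn that limit into a sanity check on the
length field. A minimal sketch, assuming 512-byte sectors; the function
names are hypothetical and the server would have to be told the device's
max_sectors value by some other means.

```python
SECTOR_SIZE = 512  # bytes; assumed conventional sector size


def max_request_bytes(max_sectors):
    """Upper bound on a request length implied by max_sectors."""
    return max_sectors * SECTOR_SIZE


def check_length(length, max_sectors):
    """Return True if length is non-zero and within the implied limit."""
    return 0 < length <= max_request_bytes(max_sectors)


# Limits for the two max_sectors values discussed in this thread.
for limit in (255, 256):
    print(limit, "sectors ->", max_request_bytes(limit), "bytes")
```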

-- 
Jens Axboe


