From: "kanchan" <joshi.k@samsung.com>
To: "'Dave Chinner'" <david@fromorbit.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-block@vger.kernel.org>,
	<linux-nvme@lists.infradead.org>, <linux-fsdevel@vger.kernel.org>,
	<linux-ext4@vger.kernel.org>, <axboe@fb.com>,
	<prakash.v@samsung.com>, <anshul@samsung.com>,
	<joshiiitr@gmail.com>
Subject: RE: [PATCH v3 6/7] fs: introduce write-hint start point for in-kernel hints
Date: Wed, 3 Apr 2019 20:00:22 +0530
Message-ID: <11ce01d4ea29$cb576c90$620645b0$@samsung.com>
In-Reply-To: <20190401051239.GP26298@dastard>

> Which means that when a new userspace hint is defined, all the kernel
> hints change numbers and, AIUI, that changes how the kernel hints are mapped
> to the underlying device.

Currently, adding a new user-space hint requires modifying the code and
installing a modified kernel. So I felt that situation would be unlikely to
arise in the middle of a production workload.


> The kernel hints need to be mapped to the highest supported number and work
> down, while userspace starts at the lowest and works up.

Actually, I initially implemented the "blk_write_hint_to_streamid" function
that way, i.e. as per the table you've put. But that code needed more
condition checks/branches than the current one.
Also, the request queue contains a statically defined array called
"write_hints", which the nvme driver updates to gather stream stats.
Snippet below:

	if (streamid < ARRAY_SIZE(req->q->write_hints))
		req->q->write_hints[streamid] += blk_rq_bytes(req) >> 9;

If kernel hints get mapped to the highest possible stream numbers, the nvme
driver has to do a reverse conversion from streamid to array index (some
more conditional checks) to update that array.
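
To make the reverse conversion concrete, here is a minimal sketch. It is not
code from the patch series: the slot counts, the dev_max parameter and the
function name streamid_to_stat_index are assumptions made up for
illustration. It only shows the kind of extra branches that a top-down
kernel mapping would add to the stats update.

```c
/* Hypothetical sizes, chosen only for this sketch. */
#define NR_USER_STREAMS	4
#define NR_KERN_STREAMS	2
#define NR_STAT_SLOTS	(NR_USER_STREAMS + NR_KERN_STREAMS)

/*
 * Map a device stream id back to a slot in a small, densely packed
 * stats array. User streams sit at the bottom of the device range and
 * map 1:1; kernel streams sit at the top (dev_max, dev_max - 1, ...)
 * and are folded into the slots that follow the user slots.
 * Returns -1 when the stream id belongs to neither range.
 */
int streamid_to_stat_index(unsigned int streamid, unsigned int dev_max)
{
	if (streamid < NR_USER_STREAMS)
		return (int)streamid;

	if (streamid <= dev_max && dev_max - streamid < NR_KERN_STREAMS)
		return NR_USER_STREAMS + (int)(dev_max - streamid);

	return -1;
}
```

With a bottom-up layout for both ranges, the single bounds check in the
snippet above is enough; the second branch here is the additional cost being
discussed.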


Overall, would this not add run-time checks to the I/O path (which we would
always execute) for a condition that arises only if someone later extends
the user-space hint count?


Thanks,

-----Original Message-----
From: Dave Chinner [mailto:david@fromorbit.com] 
Sent: Monday, April 01, 2019 10:43 AM
To: Kanchan Joshi <joshi.k@samsung.com>
Cc: linux-kernel@vger.kernel.org; linux-block@vger.kernel.org;
linux-nvme@lists.infradead.org; linux-fsdevel@vger.kernel.org;
linux-ext4@vger.kernel.org; axboe@fb.com; prakash.v@samsung.com;
anshul@samsung.com; joshiiitr@gmail.com
Subject: Re: [PATCH v3 6/7] fs: introduce write-hint start point for
in-kernel hints

On Fri, Mar 29, 2019 at 01:23:51PM +0530, Kanchan Joshi wrote:
> kernel-mode components can define own write-hints using 
> "WRITE_LIFE_KERN_MIN" as base.
> 
> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
> ---
>  include/linux/fs.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 29d8e2c..6a2673e 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -291,6 +291,8 @@ enum rw_hint {
>  	WRITE_LIFE_MEDIUM	= RWH_WRITE_LIFE_MEDIUM,
>  	WRITE_LIFE_LONG		= RWH_WRITE_LIFE_LONG,
>  	WRITE_LIFE_EXTREME	= RWH_WRITE_LIFE_EXTREME,
> +/* Kernel should use write-hint starting from this */
> +	WRITE_LIFE_KERN_MIN,

Which means that when a new userspace hint is defined, all the kernel hints
change numbers and, AIUI, that changes how the kernel hints are mapped to
the underlying device.
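
As an illustration of the renumbering, here is a simplified pair of enums.
They are not the real enum rw_hint: the V1_/V2_ prefixes and the
V2_WRITE_LIFE_NEW_USER entry are hypothetical, and only the values 0 to 5
mirror the existing RWH_WRITE_LIFE_* user hints.

```c
/* Before: the layout proposed by the patch. */
enum rw_hint_v1 {
	V1_WRITE_LIFE_NOT_SET	= 0,
	V1_WRITE_LIFE_NONE	= 1,
	V1_WRITE_LIFE_SHORT	= 2,
	V1_WRITE_LIFE_MEDIUM	= 3,
	V1_WRITE_LIFE_LONG	= 4,
	V1_WRITE_LIFE_EXTREME	= 5,
	V1_WRITE_LIFE_KERN_MIN,		/* implicitly 6 */
};

/* After: one hypothetical new user hint is appended. */
enum rw_hint_v2 {
	V2_WRITE_LIFE_NOT_SET	= 0,
	V2_WRITE_LIFE_NONE	= 1,
	V2_WRITE_LIFE_SHORT	= 2,
	V2_WRITE_LIFE_MEDIUM	= 3,
	V2_WRITE_LIFE_LONG	= 4,
	V2_WRITE_LIFE_EXTREME	= 5,
	V2_WRITE_LIFE_NEW_USER	= 6,	/* hypothetical addition */
	V2_WRITE_LIFE_KERN_MIN,		/* implicitly 7: every kernel hint shifts */
};
```

Any kernel hint defined as KERN_MIN + k moves up by one, so it would land on
a different device stream after the change.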

The kernel hints need to be mapped to the highest supported number and work
down, while userspace starts at the lowest and works up. The "kernel to
device stream id" needs to translate the kernel hints down to the upper
range of the device hints.

I think the mapping range the code uses should be:

    HINT            Type            device
     0              USER 0          0
     1              USER 1          1
     ......
     n              USER MAX        n

     {n,65535-m}    UNUSED          {n,dev_max-m}

     65535 - m      KERN_MIN        dev_max - m
     ......
     65532          KERN 3          dev_max - 3
     65533          KERN 2          dev_max - 2
     65534          KERN 1          dev_max - 1
     65535          KERN 0          dev_max

i.e. if you look at the mapping as a signed short, >= 0 are user hints, < 0
are kernel hints. This provides an obvious, simple way to map the kernel
hints to the upper range of the device hint range. It also provides a simple
way to compress both user and kernel hints into a limited device hint range
- kernel always uses the top device hint, user is limited to the rest of the
range....

This means the ranges don't overlap or change at either the code or the
device level as we add more user and kernel hint channels in the future.
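
A rough sketch of that translation, under assumptions: the function name
hint_to_device_stream and the nr_kern parameter (how many device streams are
reserved at the top for the kernel) are made up for illustration, and the
clamping policy is just one possible way to do the compression. It expects
1 <= nr_kern <= dev_max.

```c
#include <stdint.h>

/*
 * Translate a 16-bit hint into a device stream id.
 * Hints above INT16_MAX (negative when viewed as a signed short) are
 * kernel hints and count down from dev_max; all others are user hints
 * and count up from 0.
 */
unsigned int hint_to_device_stream(uint16_t hint, unsigned int dev_max,
				   unsigned int nr_kern)
{
	unsigned int user_max = dev_max - nr_kern;
	unsigned int k;

	if (hint > INT16_MAX) {
		k = UINT16_MAX - hint;	/* KERN 0 -> 0, KERN 1 -> 1, ... */
		if (k >= nr_kern)
			k = nr_kern - 1;	/* compress into the reserved top range */
		return dev_max - k;
	}

	/* User hints are limited to the rest of the range. */
	return hint < user_max ? hint : user_max;
}
```

With dev_max = 1 and nr_kern = 1 this degenerates to the minimal case:
every kernel hint goes to stream 1 and every user hint to stream 0.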

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

