From: Jens Axboe <axboe@kernel.dk>
To: Dave Chinner <david@fromorbit.com>, Jens Axboe <axboe@fb.com>
Cc: Ming Lin <mlin@kernel.org>, lkml <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, ming.l@ssi.samsung.com,
	"Kwan (Hingkwan) Huen-SSI" <kwan.huen@ssi.samsung.com>
Subject: Re: [PATCH 3/6] direct-io: add support for write stream IDs
Date: Fri, 17 Apr 2015 20:00:53 -0600
Message-ID: <5531BAD5.4030104@kernel.dk>
In-Reply-To: <20150417235142.GH15810@dastard>

On 04/17/2015 05:51 PM, Dave Chinner wrote:
> On Fri, Apr 17, 2015 at 05:11:40PM -0600, Jens Axboe wrote:
>> On 04/17/2015 05:06 PM, Dave Chinner wrote:
>>> On Thu, Apr 16, 2015 at 11:20:45PM -0700, Ming Lin wrote:
>>>> On Sat, Apr 11, 2015 at 4:59 AM, Dave Chinner <david@fromorbit.com> wrote:
>>>>> On Fri, Apr 10, 2015 at 04:50:05PM -0700, Ming Lin wrote:
>>>>>> On Wed, Mar 25, 2015 at 7:26 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>> If iocb->ki_filp->f_streamid is not set, then it should fall back to
>>>>>>>> whatever is set on the inode->i_streamid.
>>>>>>
>>>>>> Why should it do the fallback?
>>>>>
>>>>> Because then you have a method of using streams with applications
>>>>> that aren't aware of streams.
>>>>>
>>>>> Or perhaps you have a file you know has different access patterns to
>>>>> the rest of the files in a directory, and you don't want to have to
>>>>> set the stream on every process that opens and uses that file. e.g.
>>>>> database writeahead log files (sequential write, never read) vs
>>>>> database index/table files (random read/write).....
>>>>>
>>>>>>> Good point, agree. Will make that change.
>>>>>>
>>>>>> That change causes a problem for direct IO, for example:
>>>>>>
>>>>>> process 1:
>>>>>> fd = open("/dev/nvme0n1", O_DIRECT...);
>>>>>> //set stream_id 1
>>>>>> fadvise(fd, 1, 0, POSIX_FADV_STREAMID);
>>>>>> pwrite(fd, ....);
>>>>>>
>>>>>> process 2:
>>>>>> fd = open("/dev/nvme0n1", O_DIRECT...);
>>>>>> //should be legacy stream_id 0
>>>>>> pwrite(fd, ....);
>>>>>>
>>>>>> But now process 2 also sees stream_id 1, which is wrong.
>>>>>
>>>>> It's not wrong, your behaviour model is just different. You have
>>>>> defined a process/fd based stream model and not considered
>>>>> that admins and applications might want to use a file
>>>>> based stream model instead, so applications don't need to even be
>>>>> aware that write streams are in use...
>>>>
>>>> The stream must be opened, otherwise the device will return an error if
>>>> an application writes to a stream that has not been opened.
>>>
>>> That's an extremely device specific *implementation* of a write
>>> stream. The *concept* of a write stream being passed from userspace to
>>> the block layer doesn't have such constraints, and I get really
>>> concerned when implementations of a generic concept are so tightly
>>> focussed around one type of hardware implementation of the
>>> concept...
>>
>> Indeed, which is why the implementation posted cares ONLY about the
>> stream ID itself, and passing that through.
>>
>> The point about fallback is valid; however, for some use cases
>> that will not be what you want. But we have to make some sort of
>> decision, and falling back to the inode set value (if one is set) is
>> probably the right thing to do in most use cases.
>
> Right, the question is then whether fadvise should set the value on
> the inode at all, because then the effect of setting it on a fd also
> changes the fallback. Perhaps we need a distinction between
> "setting the stream for this fd" which lasts as long as the fd is
> active, and "setting the default inode stream" which is potentially
> a persistent operation if the filesystem stores it on disk...

Yes, that might be a good compromise. The easiest would be to define a
second fadvise advice, where the stronger advice would set the stream on
both the file and the inode. Another option would be to change the file
approach to use fcntl(), and keep fadvise for the inode. I'll be happy
to take input on what people would prefer here.
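
To make the two scopes concrete, here is a minimal user-space sketch of
the fallback rule and of the "file only" versus "file + inode" advice.
It is only a model under stated assumptions, not the code from the patch
set: the struct names (inode_model, file_model), the advice names
(ADV_STREAMID_FILE, ADV_STREAMID_FILE_INODE) and the helpers
(set_streamid, effective_streamid) are all hypothetical, and stream ID 0
is assumed to mean "no stream set".

```c
/* Hypothetical user-space model; these are NOT the kernel structures. */
struct inode_model {
	unsigned int i_streamid;	/* inode default, 0 = not set (assumed) */
};

struct file_model {
	unsigned int f_streamid;	/* per open file, 0 = not set (assumed) */
	struct inode_model *f_inode;
};

/* Hypothetical advice values for the two scopes under discussion. */
enum stream_advice {
	ADV_STREAMID_FILE,		/* affects this open file only */
	ADV_STREAMID_FILE_INODE,	/* stronger: file plus inode default */
};

/* Record a stream ID at the scope selected by the advice. */
static void set_streamid(struct file_model *f, enum stream_advice adv,
			 unsigned int id)
{
	f->f_streamid = id;
	if (adv == ADV_STREAMID_FILE_INODE)
		f->f_inode->i_streamid = id;
}

/* Stream ID a write through this file would carry: file first, then inode. */
static unsigned int effective_streamid(const struct file_model *f)
{
	if (f->f_streamid)
		return f->f_streamid;
	return f->f_inode->i_streamid;
}
```

In this model, the two-process case quoted earlier in the thread behaves
as Ming expected when process 1 uses the file-only advice: a second open
file on the same inode still resolves to 0. With the stronger advice,
the second open file picks up the inode default instead.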

>>>> The device has a limited number of streams, for example, 16 streams.
>>>> There are 2 APIs to open/close the stream.
>>>
>>> What's to stop me writing something for DM-thinp that understands
>>> write streams in bios and uses it to separate out the write streams
>>> into different regions of the thinp device to improve locality of
>>> its data placement and hence reduce fragmentation?
>>
>> Absolutely nothing, in fact that's one of the use cases that I had
>> in mind. Or for caching software.
>
> *nod*. We are on the same page, then :)

Yes, completely. I basically just wanted to clarify that.

-- 
Jens Axboe

