linux-ext4.vger.kernel.org archive mirror
From: Boaz Harrosh <boaz@plexistor.com>
To: Dave Chinner <david@fromorbit.com>, Boaz Harrosh <boaz@plexistor.com>
Cc: ira.weiny@intel.com, linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>,
	"Theodore Y. Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/5] Enable per-file/directory DAX operations
Date: Thu, 24 Oct 2019 17:05:45 +0300
Message-ID: <fb4f8be7-bca6-733a-7f16-ced6557f7108@plexistor.com>
In-Reply-To: <20191024073446.GA4614@dread.disaster.area>

On 24/10/2019 10:34, Dave Chinner wrote:
> On Thu, Oct 24, 2019 at 05:31:13AM +0300, Boaz Harrosh wrote:
<>
> 
> The on disk DAX flag is inherited from the parent directory at
> create time. Hence an admin only need to set it on the data
> directory of the application when first configuring it, and
> everything the app creates will be configured for DAX access
> automatically.
> 

Yes, I said that as well. But again, I am concerned that this is the
opposite of our intention. As you said, the WRITEs are slow and
do not scale, so what we would like, and the reason we have this whole
problem, is to WRITE *non*-DAX. And if so, then how do we turn the bit
ON later for the fast READs?

> Or, alternatively, mkfs sets the flag on the root dir so that
> everything in the filesystem uses DAX by default (through
> inheritance) unless the admin turns off the flag on a directory
> before it starts to be used or on a set of files after they have
> been created (because DAX causes problems)...
> 

Yes, exactly. This cannot be done currently.

> So, yeah, there's another problem with the basic assertion that we
> only need to allow the on disk flag to be changed on zero length
> files: we actually want to be able to -clear- the DAX flag when the
> file has data attached to it, not just when it is an empty file...
> 

Exactly, this is my concern. And the case that I see as most useful is the
opposite one, where I want to turn it ON for fast DAX READs.

>> What if, say in XFS when setting the DAX-bit we take all the three write-locks
>> same as a truncate. Then we check that there are no active page-cache mappings
>> ie. a single opener. Then allow to set the bit. Else return EBUSY. (file is in use)
> 
> DAX doesn't have page cache mappings, so anything that relies on
> checking page cache state isn't going to work reliably. 

I meant the opposite case, where the flag was OFF and I want it ON for
fast READs. In that case, if I have any users, there are pages in the
xarray.
BTW the opposite is also true: if we have active DAX users, we will have
DAX entries in the xarray. What we want is that there are *no* active
users while we change the file-DAX-mode. Else we fail the change.
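
To make the rule concrete, here is a toy userspace model of it. This is
only a sketch: struct toy_file, its fields and toy_set_dax() are
hypothetical names for illustration, not kernel code, and a real
implementation would have to do the check under the truncate-style locks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode; none of these names are kernel APIs. */
struct toy_file {
	int  openers;      /* open file handles, including the caller's */
	int  mapped_pages; /* page cache pages or DAX entries in the xarray */
	bool dax;          /* current effective DAX mode */
};

/*
 * Switch the mode only when the caller is the sole user and nothing is
 * left in the mapping; otherwise fail the change with -EBUSY.
 */
static int toy_set_dax(struct toy_file *f, bool want_dax)
{
	assert(f != NULL);

	if (f->dax == want_dax)
		return 0;               /* nothing to change */
	if (f->openers > 1 || f->mapped_pages > 0)
		return -EBUSY;          /* file is in use */

	f->dax = want_dax;              /* here the op vectors would be swapped */
	return 0;
}
```

The same check covers both directions (OFF to ON and ON to OFF), which
is the point: the mode change is refused while anybody else is using
the file.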

> I also seem
> to recall that there was a need to take some vm level lock to really
> prevent page fault races, and that we can't safely take that in a
> safe combination with all the filesystem locks we need.
> 

We do not really care about page fault races in the Kernel as long
as I protect the xarray access, and these are protected well if we
take truncate locking. But we have a bigger problem, which you pointed
out: the change of the operations vector pointer.

I was thinking about this last night. One way to do this is with a
file-exclusive-lock. Correct me if I'm wrong:
file-exclusive-readwrite-lock means any other openers will fail, and
if there are openers already the lock will fail. Which is what we want,
no? To make sure we are the exclusive user of the file while we change
the op vector.
Now the question is whether we force the application to take the lock
and the Kernel only checks that we are locked, or the Kernel takes the
lock within the IOCTL.

Let's touch base. As I understand the protocol we want to establish with
the administration tool is:

- File is created, written, read...

- ALL file handles are closed, there are no active users
- File is opened by a single opener for the purpose of changing the DAX-mode
- lock out all other opens
- change the DAX-mode, op vectors
- unlock exclusiveness

- File activity can resume...

That's easy to say, but how can we enforce this protocol?
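
For reference, the admin-tool side of the steps above could look roughly
like the sketch below. Assumptions: set_file_dax() is a hypothetical
helper name, the flag is toggled through the existing FS_IOC_FSGETXATTR /
FS_IOC_FSSETXATTR ioctls with FS_XFLAG_DAX, and flock() stands in for the
"lock" step. Whether the filesystem accepts the flag change on a file
with data is exactly what this thread is about.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <stddef.h>
#include <sys/file.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, negative errno on failure. */
static int set_file_dax(const char *path, int enable)
{
	struct fsxattr fsx;
	int ret = 0;
	int fd;

	assert(path != NULL);

	/* Single opener, only for the purpose of changing the DAX-mode. */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* Advisory only: excludes other flock() users, not other open()s. */
	if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
		ret = -errno;
		goto out;
	}

	/* Read the current flags, flip the DAX bit, write them back. */
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
		ret = -errno;
		goto unlock;
	}
	if (enable)
		fsx.fsx_xflags |= FS_XFLAG_DAX;
	else
		fsx.fsx_xflags &= ~FS_XFLAG_DAX;
	if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0)
		ret = -errno;

unlock:
	flock(fd, LOCK_UN);
out:
	close(fd);
	return ret;
}
```

Note that flock() is advisory, so it only keeps out processes that also
take the lock. It does not by itself guarantee that there are no other
openers, which is why the enforcement question stays open.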

> Cheers,
> Dave.
> 

Thanks Dave
Boaz

