From: Jan Kara <jack@suse.cz>
To: Boaz Harrosh <boazh@netapp.com>
Cc: linux-xfs@vger.kernel.org, Jan Kara <jack@suse.cz>,
    linux-nvdimm <linux-nvdimm@lists.01.org>,
    Dave Chinner <david@fromorbit.com>,
    Christoph Hellwig <hch@infradead.org>,
    Andy Lutomirski <luto@kernel.org>,
    Linux FS Devel <linux-fsdevel@vger.kernel.org>,
    "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
    Amit Golander <amitg@netapp.com>
Subject: Re: [RFC PATCH 0/7] dax, ext4: Synchronous page faults
Date: Thu, 17 Aug 2017 18:08:09 +0200
Message-ID: <20170817160809.GA28850@quack2.suse.cz>
In-Reply-To: <ccf82f19-7693-25da-eaa6-a654830ca81e@netapp.com>

On Mon 14-08-17 17:04:17, Boaz Harrosh wrote:
>
> Thank you Jan, I'm patiently waiting for this MAP_SYNC flag since I asked for
> it in 2014. I'm so glad its time is finally do.
>
> Thank you for working on this. Please CC me on future patches.
> (note the new Netapp email)
>
> On 13/08/17 12:25, Christoph Hellwig wrote:
> > On Sat, Aug 12, 2017 at 07:44:14PM -0700, Dan Williams wrote:
> >> How about MAP_SYNC == (MAP_SHARED|MAP_PRIVATE)? On older kernels that
> >> should get -EINVAL, and on new kernels it means SYNC+SHARED.
> >
> > Cute trick, but I'd hate to waster it just for our little flag.
> >
> > How about:
> >
> > #define __MAP_VALIDATE MAP_SHARED|MAP_PRIVATE
> > #define MAP_SYNC 0x??? | __MAP_VALIDATE
> >
> > so that we can reuse that trick for any new flag?
> >
>
> YES! And please create a mask for all new flags and in validation
> code if ((m_flags & __MAP_VALIDATE) == __MAP_VALIDATE) then you
> want that (m_flags & __MAP_NEWFLAGS) does not come empty, this
> way you actually preserve the old check that SHARED and PRIVATE
> do not co exist.

For now I did just a crude hack. Dan is working on new mmap syscall which
checks flags which will be cleaner...

> Few Comments on this new MAP_ flag
>
> 0] The name at least needs to be MAP_MSYNC because only meta-data is
> synced not the data pointed to. That is the responsibility of the app

So we actually do normal fdatasync() call so we do flush data as well. This
way we don't have to be afraid of stale data exposure or other strange
effects. So I've kept the name to be MAP_SYNC.

> 1] This flag you have named MAP_SYNC but it is very much related to
> dax and the ability for user-mode to "flush" the data pointed by this
> now "synced" meta data.
> For example in ext4, this flag set on an inode that is *not* IS_DAX
> should fail the mmap. Because there is no point of synced meta if the
> data is actually in page-cache and we know for sure it was not yet synced,
> And there is no way for user-mode to directly "sync" the data as well.

Yes, done.

> 2] The code should be constructed that the default check for the MAP_SYNC
> should fail, and only Hopped in FSs are allowed.
> (So not to modify all Implementations of file_operations->mmap() )

Agreed but for now I've skipped this as I wait for new mmap syscall and how
Dan implements flag checking there.

> 3] /dev/pmem could start serving DAX pages in mmap, if asked for MAP_MSYNC
> (which is also an API that says "I know I need to cl_flush". See 1. )

MAP_SYNC is rather more like: I can also use clflush instead of
fdatasync(2). And this is rather important as all legacy applications are
100% safe in the new scheme.

> 4] Once we have this flag. And properly implemented at least in one FS
> and optionally in /dev/pmemX we no longer have any justification for
> /dev/daxX and it can die a slow and happy death.

This will be more complex I guess - see MAP_DIRECT proposal...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
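
For readers following the flag discussion quoted above, the validation rule Boaz describes could be sketched roughly as below. This is only an illustration of the idea, not code from the patch series: the `X_`-prefixed names, the function `check_map_flags()`, and the bit value chosen for the sync flag are all hypothetical, and the macros are parenthesized here so they combine safely in expressions.

```c
#include <errno.h>

/* All names and values below are hypothetical, for illustration only. */
#define X_MAP_SHARED    0x01u
#define X_MAP_PRIVATE   0x02u

/* Both mapping-type bits set at once: rejected by older kernels with -EINVAL. */
#define X_MAP_VALIDATE  (X_MAP_SHARED | X_MAP_PRIVATE)

/* Placeholder bit for the new flag (the thread leaves it as 0x???). */
#define X_MAP_SYNC_BIT  0x80000u

/* Mask of every flag that may only be used together with X_MAP_VALIDATE. */
#define X_MAP_NEWFLAGS  (X_MAP_SYNC_BIT)

/* What an application would pass: the new bit plus the validate pattern. */
#define X_MAP_SYNC      (X_MAP_SYNC_BIT | X_MAP_VALIDATE)

/*
 * Returns 0 if the flag combination is acceptable, -EINVAL otherwise.
 * Assumption of this sketch: on the legacy path (exactly one of
 * SHARED/PRIVATE) other bits are not examined, mirroring the old behavior.
 */
int check_map_flags(unsigned int flags)
{
	unsigned int type = flags & X_MAP_VALIDATE;

	if (type == 0)
		return -EINVAL;		/* neither SHARED nor PRIVATE */

	if (type == X_MAP_VALIDATE) {
		/*
		 * SHARED and PRIVATE together are only meaningful when at
		 * least one new flag is requested; otherwise keep the old
		 * rule that the two must not coexist.
		 */
		if ((flags & X_MAP_NEWFLAGS) == 0)
			return -EINVAL;
		return 0;
	}

	return 0;			/* plain SHARED or plain PRIVATE */
}
```

With these definitions, `check_map_flags(X_MAP_SYNC)` is accepted, while a bare `X_MAP_SHARED | X_MAP_PRIVATE` with no new flag is still refused, which is the property Boaz asks to preserve.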