All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
@ 2015-07-16 23:14 Oleg Nesterov
  2015-07-16 23:22 ` Stephen Rothwell
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-16 23:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joonsoo Kim, Benjamin LaHaise, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite()
is not defined in this case. Add yet another "must not be called"
helper into nommu.c to make the linker happy.

I still think this is pointless, afaics sys_io_setup() simply can't
succeed if CONFIG_MMU=n. Instead we should make CONFIG_AIO depend
on CONFIG_MMU.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 mm/nommu.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mm/nommu.c b/mm/nommu.c
index e3026fd..979afad 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -2008,6 +2008,12 @@ void filemap_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf)
 }
 EXPORT_SYMBOL(filemap_map_pages);
 
+int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	BUG();
+	return 0;
+}
+
 static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
 		unsigned long addr, void *buf, int len, int write)
 {
-- 
1.5.5.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-16 23:14 [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
@ 2015-07-16 23:22 ` Stephen Rothwell
  2015-07-16 23:24 ` Andrew Morton
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Stephen Rothwell @ 2015-07-16 23:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Joonsoo Kim, Benjamin LaHaise, Fengguang Wu,
	Jeff Moyer, Johannes Weiner, linux-next, linux-kernel

Hi Oleg,

On Fri, 17 Jul 2015 01:14:05 +0200 Oleg Nesterov <oleg@redhat.com> wrote:
>
> fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite()
> is not defined in this case. Add yet another "must not be called"
> helper into nommu.c to make the linker happy.
> 
> I still think this is pointless, afaics sys_io_setup() simply can't
> succeed if CONFIG_MMU=n. Instead we should make CONFIG_AIO depend
> on CONFIG_MMU.
> 
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Added as a fix to linux-next today.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-16 23:14 [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
  2015-07-16 23:22 ` Stephen Rothwell
@ 2015-07-16 23:24 ` Andrew Morton
  2015-07-16 23:52   ` Oleg Nesterov
  2015-07-21 15:29 ` [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
  2015-07-21 16:20 ` [PATCH v3] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
  3 siblings, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2015-07-16 23:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Joonsoo Kim, Benjamin LaHaise, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On Fri, 17 Jul 2015 01:14:05 +0200 Oleg Nesterov <oleg@redhat.com> wrote:

> fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite()
> is not defined in this case. Add yet another "must not be called"
> helper into nommu.c to make the linker happy.
> 
> I still think this is pointless, afaics sys_io_setup() simply can't
> succeed if CONFIG_MMU=n. Instead we should make CONFIG_AIO depend
> on CONFIG_MMU.
> 
> ..
>
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -2008,6 +2008,12 @@ void filemap_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf)
>  }
>  EXPORT_SYMBOL(filemap_map_pages);
>  
> +int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +	BUG();
> +	return 0;
> +}
> +
>  static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
>  		unsigned long addr, void *buf, int len, int write)
>  {

So if anyone starts testing aio on NOMMU, this patch will make the
whole thing immediately go BUG.  This isn't helpful :(

Yes, making AIO depend on MMU sounds better.  Because if it wasn't
busted before, it sure is now!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-16 23:24 ` Andrew Morton
@ 2015-07-16 23:52   ` Oleg Nesterov
  2015-07-17 14:06     ` Benjamin LaHaise
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-16 23:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joonsoo Kim, Benjamin LaHaise, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/16, Andrew Morton wrote:
>
> On Fri, 17 Jul 2015 01:14:05 +0200 Oleg Nesterov <oleg@redhat.com> wrote:
>
> > fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite()
> > is not defined in this case. Add yet another "must not be called"
> > helper into nommu.c to make the linker happy.
> >
> > I still think this is pointless, afaics sys_io_setup() simply can't
> > succeed if CONFIG_MMU=n. Instead we should make CONFIG_AIO depend
> > on CONFIG_MMU.
> >
> > ..
> >
> > --- a/mm/nommu.c
> > +++ b/mm/nommu.c
> > @@ -2008,6 +2008,12 @@ void filemap_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf)
> >  }
> >  EXPORT_SYMBOL(filemap_map_pages);
> >
> > +int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > +{
> > +	BUG();
> > +	return 0;
> > +}
> > +
> >  static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
> >  		unsigned long addr, void *buf, int len, int write)
> >  {
>
> So if anyone starts testing aio on NOMMU, this patch will make the
> whole thing immediately go BUG.  This isn't helpful :(

Well, I'm afraid I could miss something, but _afaics_ this can not
happen. filemap_page_mkwrite() can't be called if NOMMU.

In particular, simply because sys_io_setup() is the only user (if
NOMMU) and it can't succeed. But even if I missed something and it
can succeed, ->page_mkwrite() must not be called anyway. But this,
again, unless I missed something ;)

> Yes, making AIO depend on MMU sounds better.

Perhaps Benjamin can change his mind or correct me.

> Because if it wasn't
> busted before, it sure is now!

I hope this change can't make any difference.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-16 23:52   ` Oleg Nesterov
@ 2015-07-17 14:06     ` Benjamin LaHaise
  2015-07-17 17:27       ` Oleg Nesterov
  2015-07-17 22:31       ` Oleg Nesterov
  0 siblings, 2 replies; 27+ messages in thread
From: Benjamin LaHaise @ 2015-07-17 14:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On Fri, Jul 17, 2015 at 01:52:28AM +0200, Oleg Nesterov wrote:
> On 07/16, Andrew Morton wrote:
> >
> > On Fri, 17 Jul 2015 01:14:05 +0200 Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite()
> > > is not defined in this case. Add yet another "must not be called"
> > > helper into nommu.c to make the linker happy.
> > >
> > > I still think this is pointless, afaics sys_io_setup() simply can't
> > > succeed if CONFIG_MMU=n. Instead we should make CONFIG_AIO depend
> > > on CONFIG_MMU.
> > >
> > > ..
> > >
> > > --- a/mm/nommu.c
> > > +++ b/mm/nommu.c
> > > @@ -2008,6 +2008,12 @@ void filemap_map_pages(struct vm_area_struct *vma, struct vm_fault *vmf)
> > >  }
> > >  EXPORT_SYMBOL(filemap_map_pages);
> > >
> > > +int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > +{
> > > +	BUG();
> > > +	return 0;
> > > +}
> > > +
> > >  static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
> > >  		unsigned long addr, void *buf, int len, int write)
> > >  {
> >
> > So if anyone starts testing aio on NOMMU, this patch will make the
> > whole thing immediately go BUG.  This isn't helpful :(
> 
> Well, I'm afraid I could miss something, but _afaics_ this can not
> happen. filemap_page_mkwrite() can't be called if NOMMU.
> 
> In particular, simply because sys_io_setup() is the only user (if
> NOMMU) and it can't succeed. But even if I missed something and it
> can succeed, ->page_mkwrite() must not be called anyway. But this,
> again, unless I missed something ;)
> 
> > Yes, making AIO depend on MMU sounds better.
> 
> Perhaps Benjamin can change his mind or correct me.

Either try to fix it correctly, or disable the config.  Making it just 
compile but be knowingly broken is worse than either of those 2 options.  
My point was that it is valid for someone to want to use the functionality 
on a nommu system, and given that it should have worked before the page 
migration code was added, It Would Be Nice(tm) to return it to that state.  
Adding a BUG() like that to the code is just plain broken.

		-ben

> > Because if it wasn't
> > busted before, it sure is now!
> 
> I hope this change can't make any difference.
> 
> Oleg.

-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 14:06     ` Benjamin LaHaise
@ 2015-07-17 17:27       ` Oleg Nesterov
  2015-07-17 17:37         ` Benjamin LaHaise
  2015-07-17 22:31       ` Oleg Nesterov
  1 sibling, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-17 17:27 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Benjamin,

it seems that we do not understand each other,

On 07/17, Benjamin LaHaise wrote:
>
> On Fri, Jul 17, 2015 at 01:52:28AM +0200, Oleg Nesterov wrote:
> > > >
> > > > +int filemap_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > > +{
> > > > +	BUG();
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  static int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
> > > >  		unsigned long addr, void *buf, int len, int write)
> > > >  {
> > >
> > > So if anyone starts testing aio on NOMMU, this patch will make the
> > > whole thing immediately go BUG.  This isn't helpful :(
> >
> > Well, I'm afraid I could miss something, but _afaics_ this can not
> > happen. filemap_page_mkwrite() can't be called if NOMMU.
> >
> > In particular, simply because sys_io_setup() is the only user (if
> > NOMMU) and it can't succeed. But even if I missed something and it
> > can succeed, ->page_mkwrite() must not be called anyway. But this,
> > again, unless I missed something ;)
> >
> > > Yes, making AIO depend on MMU sounds better.
> >
> > Perhaps Benjamin can change his mind or correct me.
>
> Either try to fix it correctly,

And I think this fix is correct. In a sense that we only add
filemap_page_mkwrite() to make the linker happy, it can never be called
and thus we can never hit this BUG().

Please look at filemap_fault() in nommu.c,

	int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
	{
		BUG();
		return 0;
	}

this is the same thing. If nothing else, mm/memory.c is not even compiled
if NOMMU.

> or disable the config.

Yes, I think this makes more sense. but see below...

> Making it just
> compile but be knowingly broken is worse than either of those 2 options.

Why? See above. I think this change makes no difference except it fixes
the build.

Again, of course I could miss something. Could you explain your point?

> My point was that it is valid for someone to want to use the functionality
> on a nommu system, and given that it should have worked before the page
> migration code was added, It Would Be Nice(tm) to return it to that state.

Perhaps it worked on NOMMU before, I have no idea. But currently, afaics,
it can not. Even sys_io_setup() can't suceed. So I do not understand why
do we allow NOMMU && CONFIG_AIO.

But this is another issue. Of course I won't insist, please forget.

> Adding a BUG() like that to the code is just plain broken.

Why? Could you explain what I have missed?

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 17:27       ` Oleg Nesterov
@ 2015-07-17 17:37         ` Benjamin LaHaise
  2015-07-17 17:55           ` Oleg Nesterov
  0 siblings, 1 reply; 27+ messages in thread
From: Benjamin LaHaise @ 2015-07-17 17:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On Fri, Jul 17, 2015 at 07:27:26PM +0200, Oleg Nesterov wrote:
> Benjamin,
> 
> it seems that we do not understand each other,
...
> >
> > Either try to fix it correctly,
> 
> And I think this fix is correct. In a sense that we only add
> filemap_page_mkwrite() to make the linker happy, it can never be called
> and thus we can never hit this BUG().
> 
> Please look at filemap_fault() in nommu.c,
> 
> 	int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> 	{
> 		BUG();
> 		return 0;
> 	}
> 
> this is the same thing. If nothing else, mm/memory.c is not even compiled
> if NOMMU.

Using BUG() is the wrong approach.  If the code is not needed in NOMMU, then 
#ifdef it out.  Think about it: NOMMU systems are very low memory systems 
and they should not have dead code compiled in if it is not needed.

> > or disable the config.
> 
> Yes, I think this makes more sense. but see below...
> 
> > Making it just
> > compile but be knowingly broken is worse than either of those 2 options.
> 
> Why? See above. I think this change makes no difference except it fixes
> the build.
> 
> Again, of course I could miss something. Could you explain your point?

Don't add BUG().  It's the equivalent approach of saying "I think this code 
isn't needed, but I'm lazy and not going to remove it properly."

> > My point was that it is valid for someone to want to use the functionality
> > on a nommu system, and given that it should have worked before the page
> > migration code was added, It Would Be Nice(tm) to return it to that state.
> 
> Perhaps it worked on NOMMU before, I have no idea. But currently, afaics,
> it can not. Even sys_io_setup() can't suceed. So I do not understand why
> do we allow NOMMU && CONFIG_AIO.
> 
> But this is another issue. Of course I won't insist, please forget.

As I said in my earlier email: aio on NOMMU was broken by the page migration 
code that was added in mid 3.1x.  Fixing it should not be that hard.
> 
> > Adding a BUG() like that to the code is just plain broken.
> 
> Why? Could you explain what I have missed?

It's doing half the job.  Either the code should be #if'd out or not.

		-ben

> Oleg.

-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 17:37         ` Benjamin LaHaise
@ 2015-07-17 17:55           ` Oleg Nesterov
  2015-07-17 18:12             ` Austin S Hemmelgarn
  2015-07-17 22:56             ` Oleg Nesterov
  0 siblings, 2 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-17 17:55 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/17, Benjamin LaHaise wrote:
>
> On Fri, Jul 17, 2015 at 07:27:26PM +0200, Oleg Nesterov wrote:
> > Benjamin,
> >
> > it seems that we do not understand each other,
> ...
> > >
> > > Either try to fix it correctly,
> >
> > And I think this fix is correct. In a sense that we only add
> > filemap_page_mkwrite() to make the linker happy, it can never be called
> > and thus we can never hit this BUG().
> >
> > Please look at filemap_fault() in nommu.c,
> >
> > 	int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > 	{
> > 		BUG();
> > 		return 0;
> > 	}
> >
> > this is the same thing. If nothing else, mm/memory.c is not even compiled
> > if NOMMU.
>
> Using BUG() is the wrong approach.  If the code is not needed in NOMMU, then
> #ifdef it out.  Think about it: NOMMU systems are very low memory systems
> and they should not have dead code compiled in if it is not needed.

OK, at least I hope you no longer think that this patch makes this code
knowingly broken.

> Don't add BUG().  It's the equivalent approach of saying "I think this code
> isn't needed, but I'm lazy and not going to remove it properly."

There is another interpretation: I think this code must be never called,
if it is actually called we have a serious problem which should be loudly
reported.

> > Why? Could you explain what I have missed?
>
> It's doing half the job.  Either the code should be #if'd out or not.

Again, filemap_page_mkwrite() added to nommu.c matches filemap_fault()
and filemap_map_pages() we already have.


But I won't argue, you are maintainer. What exactly do you want me to
ifdef? Will you agree with the patch which adds ifdef into
aio_ring_vm_ops,

	static const struct vm_operations_struct aio_ring_vm_ops = {
	       .mremap         = aio_ring_mremap,
	#ifdef CONFIG_MMU
	       .fault          = filemap_fault,
	       .map_pages      = filemap_map_pages,
	       .page_mkwrite   = filemap_page_mkwrite,
	#endif
	};

?

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 17:55           ` Oleg Nesterov
@ 2015-07-17 18:12             ` Austin S Hemmelgarn
  2015-07-17 18:19               ` Oleg Nesterov
  2015-07-17 22:56             ` Oleg Nesterov
  1 sibling, 1 reply; 27+ messages in thread
From: Austin S Hemmelgarn @ 2015-07-17 18:12 UTC (permalink / raw)
  To: Oleg Nesterov, Benjamin LaHaise
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 572 bytes --]

On 2015-07-17 13:55, Oleg Nesterov wrote:
> On 07/17, Benjamin LaHaise wrote:
>>
>> Don't add BUG().  It's the equivalent approach of saying "I think this code
>> isn't needed, but I'm lazy and not going to remove it properly."
>
> There is another interpretation: I think this code must be never called,
> if it is actually called we have a serious problem which should be loudly
> reported.
>
And not compiling it at all _will_ loudly report it, it'll just report 
it during linking instead of at run-time, which is a much better time to 
shout about it.


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 2967 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 18:12             ` Austin S Hemmelgarn
@ 2015-07-17 18:19               ` Oleg Nesterov
  2015-07-17 18:39                 ` Austin S Hemmelgarn
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-17 18:19 UTC (permalink / raw)
  To: Austin S Hemmelgarn
  Cc: Benjamin LaHaise, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Jeff Moyer, Johannes Weiner, Stephen Rothwell, linux-next,
	linux-kernel

On 07/17, Austin S Hemmelgarn wrote:
>
> On 2015-07-17 13:55, Oleg Nesterov wrote:
>> On 07/17, Benjamin LaHaise wrote:
>>>
>>> Don't add BUG().  It's the equivalent approach of saying "I think this code
>>> isn't needed, but I'm lazy and not going to remove it properly."
>>
>> There is another interpretation: I think this code must be never called,
>> if it is actually called we have a serious problem which should be loudly
>> reported.
>>
> And not compiling it at all _will_ loudly report it, it'll just report
> it during linking instead of at run-time, which is a much better time to
> shout about it.

And how can we do this?

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 18:19               ` Oleg Nesterov
@ 2015-07-17 18:39                 ` Austin S Hemmelgarn
  2015-07-17 18:54                   ` Oleg Nesterov
  0 siblings, 1 reply; 27+ messages in thread
From: Austin S Hemmelgarn @ 2015-07-17 18:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Benjamin LaHaise, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Jeff Moyer, Johannes Weiner, Stephen Rothwell, linux-next,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1182 bytes --]

On 2015-07-17 14:19, Oleg Nesterov wrote:
> On 07/17, Austin S Hemmelgarn wrote:
>>
>> On 2015-07-17 13:55, Oleg Nesterov wrote:
>>> On 07/17, Benjamin LaHaise wrote:
>>>>
>>>> Don't add BUG().  It's the equivalent approach of saying "I think this code
>>>> isn't needed, but I'm lazy and not going to remove it properly."
>>>
>>> There is another interpretation: I think this code must be never called,
>>> if it is actually called we have a serious problem which should be loudly
>>> reported.
>>>
>> And not compiling it at all _will_ loudly report it, it'll just report
>> it during linking instead of at run-time, which is a much better time to
>> shout about it.
>
> And how can we do this?
>
If a function that isn't defined (for example, you use a #if block to 
comment it out under certain circumstances), then the link will fail 
rather noisily something references it.  We already know during the 
compile that it's a NOMMU kernel, so anything that calls it on a MMU 
enabled kernel can have a compile time check added instead of doing the 
check at runtime (or even just calling it without checking), thus even 
further reducing code size.



[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 2967 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 18:39                 ` Austin S Hemmelgarn
@ 2015-07-17 18:54                   ` Oleg Nesterov
  2015-07-17 19:09                     ` Austin S Hemmelgarn
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-17 18:54 UTC (permalink / raw)
  To: Austin S Hemmelgarn
  Cc: Benjamin LaHaise, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Jeff Moyer, Johannes Weiner, Stephen Rothwell, linux-next,
	linux-kernel

On 07/17, Austin S Hemmelgarn wrote:
>
> On 2015-07-17 14:19, Oleg Nesterov wrote:
>> On 07/17, Austin S Hemmelgarn wrote:
>>>
>>> On 2015-07-17 13:55, Oleg Nesterov wrote:
>>>> On 07/17, Benjamin LaHaise wrote:
>>>>>
>>>>> Don't add BUG().  It's the equivalent approach of saying "I think this code
>>>>> isn't needed, but I'm lazy and not going to remove it properly."
>>>>
>>>> There is another interpretation: I think this code must be never called,
>>>> if it is actually called we have a serious problem which should be loudly
>>>> reported.
>>>>
>>> And not compiling it at all _will_ loudly report it, it'll just report
>>> it during linking instead of at run-time, which is a much better time to
>>> shout about it.
>>
>> And how can we do this?
>>
> If a function that isn't defined (for example, you use a #if block to
> comment it out under certain circumstances), then the link will fail
> rather noisily something references it.

This is what we are trying to fix.

> We already know during the
> compile that it's a NOMMU kernel, so anything that calls it on a MMU
> enabled kernel can have a compile time check added

It already has. memory.c is not compiled if NOMMU.

The problem is aio_ring_vm_ops which references this function. And btw
filemap_fault() too.

And just in case, I won't mind to add ifdef(CONFIG_MMU) there, I am
waiting for reply from Benjamin.

> instead of doing the
> check at runtime (or even just calling it without checking), thus even
> further reducing code size.

So what exactly do you suggest to fix the problem?

I agree with any solution which satisfies the maintainers.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 18:54                   ` Oleg Nesterov
@ 2015-07-17 19:09                     ` Austin S Hemmelgarn
  0 siblings, 0 replies; 27+ messages in thread
From: Austin S Hemmelgarn @ 2015-07-17 19:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Benjamin LaHaise, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Jeff Moyer, Johannes Weiner, Stephen Rothwell, linux-next,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2276 bytes --]

On 2015-07-17 14:54, Oleg Nesterov wrote:
> On 07/17, Austin S Hemmelgarn wrote:
>>
>> On 2015-07-17 14:19, Oleg Nesterov wrote:
>>> On 07/17, Austin S Hemmelgarn wrote:
>>>>
>>>> On 2015-07-17 13:55, Oleg Nesterov wrote:
>>>>> On 07/17, Benjamin LaHaise wrote:
>>>>>>
>>>>>> Don't add BUG().  It's the equivalent approach of saying "I think this code
>>>>>> isn't needed, but I'm lazy and not going to remove it properly."
>>>>>
>>>>> There is another interpretation: I think this code must be never called,
>>>>> if it is actually called we have a serious problem which should be loudly
>>>>> reported.
>>>>>
>>>> And not compiling it at all _will_ loudly report it, it'll just report
>>>> it during linking instead of at run-time, which is a much better time to
>>>> shout about it.
>>>
>>> And how can we do this?
>>>
>> If a function that isn't defined (for example, you use a #if block to
>> comment it out under certain circumstances), then the link will fail
>> rather noisily something references it.
>
> This is what we are trying to fix.
Ah, I misunderstood the intent of the patch, looking at it again this 
makes perfect sense.
>
>> We already know during the
>> compile that it's a NOMMU kernel, so anything that calls it on a MMU
>> enabled kernel can have a compile time check added
>
> It already has. memory.c is not compiled if NOMMU.
>
> The problem is aio_ring_vm_ops which references this function. And btw
> filemap_fault() too.
>
> And just in case, I won't mind to add ifdef(CONFIG_MMU) there, I am
> waiting for reply from Benjamin.
>
>> instead of doing the
>> check at runtime (or even just calling it without checking), thus even
>> further reducing code size.
>
> So what exactly do you suggest to fix the problem?
Some sort of COMPILE_BUG_ON usage (I think that's the name of the macro 
that I'm thinking of, not certain though) possibly?  My only point is 
that it's more useful if the function should never be called to find a 
way to enforce this at build time instead of runtime (even more so for 
NOMMU systems, they tend (in my experience at least) to be even more of 
a pain to debug when they crash).  I do agree though that a helpful 
message is much preferred to a link error.



[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 2967 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 14:06     ` Benjamin LaHaise
  2015-07-17 17:27       ` Oleg Nesterov
@ 2015-07-17 22:31       ` Oleg Nesterov
  2015-07-20 14:22         ` Jeff Moyer
  1 sibling, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-17 22:31 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/17, Benjamin LaHaise wrote:
>
> it should have worked before the page
> migration code was added,

This is off-topic, but the whole "vm" logic in aio_setup_ring()
looks sub-optimal. I do not mean the code, just it seems to me it
is pointless to pollute the page cache, and expose the pages we
can not swap/free to lru. Afaics we _only_ need this for migration.

Perhaps I missed something, doesn't matter. But this means that
this memory is not accounted, so if I increase aio-max-nr then
this test-case

	#define __NR_io_setup	206

	int main(void)
	{
		int nr;

		for (nr = 0; ;++nr) {
			void *ctx = NULL;
			int ret = syscall(__NR_io_setup, 1, &ctx);
			if (ret) {
				printf("failed %d %m: ", nr);
				getchar();
			}
		}

		return 0;
	}

triggers OOM-killer which kills sshd and other daemons on my machine.
These pages were not even faulted in (or the shrinker can unmap them),
the kernel can not know who should be blamed.

Shouldn't we account aio events/pages somehow, say per-user, or in
mm->pinned_vm ?

I do not think this is unkown, and probably this all is fine. IOW,
this is just a question, not a bug-report or something like this.

And of course, this is not exploitable because aio-max-nr limits
the number of pages you can steal.

But otoh, aio_max_nr is system-wide, so the unpriviliged user can
ddos (say) mysqld. And this leads to the same question: shouldn't
we account nr_events at least?

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 17:55           ` Oleg Nesterov
  2015-07-17 18:12             ` Austin S Hemmelgarn
@ 2015-07-17 22:56             ` Oleg Nesterov
  1 sibling, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-17 22:56 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Benjamin,

This discussion was a bit confusing, so let me try to summarize and
please correct me if I misunderstood.

And sorry for bothering you, I just want to fix and forget about the
problem which was introduced by me (build failure with NOMMU).

So _iiuc_ you are starting to agree that tehcnically this change is
correct and we can never hit this BUG(). Like we can never hit another
BUG() in nommu.c:filemap_fault(), also referenced by aio_ring_vm_ops.
And this change should not hurt even if you make aio work with NOMMU.

No?

However, you still dislike this change because you think it is sub-
optimal and/or not clean enough. I won't argue with maintainer.

So what do you suggest instead? Will you agree with ifdef(CONFIG_MMU)
in aio_ring_vm_ops?

I aggree in advance with any suggestion.

On 07/17, Oleg Nesterov wrote:
>
> On 07/17, Benjamin LaHaise wrote:
> >
> > On Fri, Jul 17, 2015 at 07:27:26PM +0200, Oleg Nesterov wrote:
> > > Benjamin,
> > >
> > > it seems that we do not understand each other,
> > ...
> > > >
> > > > Either try to fix it correctly,
> > >
> > > And I think this fix is correct. In a sense that we only add
> > > filemap_page_mkwrite() to make the linker happy, it can never be called
> > > and thus we can never hit this BUG().
> > >
> > > Please look at filemap_fault() in nommu.c,
> > >
> > > 	int filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > 	{
> > > 		BUG();
> > > 		return 0;
> > > 	}
> > >
> > > this is the same thing. If nothing else, mm/memory.c is not even compiled
> > > if NOMMU.
> >
> > Using BUG() is the wrong approach.  If the code is not needed in NOMMU, then
> > #ifdef it out.  Think about it: NOMMU systems are very low memory systems
> > and they should not have dead code compiled in if it is not needed.
>
> OK, at least I hope you no longer think that this patch makes this code
> knowingly broken.
>
> > Don't add BUG().  It's the equivalent approach of saying "I think this code
> > isn't needed, but I'm lazy and not going to remove it properly."
>
> There is another interpretation: I think this code must be never called,
> if it is actually called we have a serious problem which should be loudly
> reported.
>
> > > Why? Could you explain what I have missed?
> >
> > It's doing half the job.  Either the code should be #if'd out or not.
>
> Again, filemap_page_mkwrite() added to nommu.c matches filemap_fault()
> and filemap_map_pages() we already have.
>
>
> But I won't argue, you are maintainer. What exactly do you want me to
> ifdef? Will you agree with the patch which adds ifdef into
> aio_ring_vm_ops,
>
> 	static const struct vm_operations_struct aio_ring_vm_ops = {
> 	       .mremap         = aio_ring_mremap,
> 	#ifdef CONFIG_MMU
> 	       .fault          = filemap_fault,
> 	       .map_pages      = filemap_map_pages,
> 	       .page_mkwrite   = filemap_page_mkwrite,
> 	#endif
> 	};
>
> ?
>
> Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-17 22:31       ` Oleg Nesterov
@ 2015-07-20 14:22         ` Jeff Moyer
  2015-07-20 17:33           ` Oleg Nesterov
  0 siblings, 1 reply; 27+ messages in thread
From: Jeff Moyer @ 2015-07-20 14:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Benjamin LaHaise, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Hi, Oleg,

Oleg Nesterov <oleg@redhat.com> writes:

> Shouldn't we account aio events/pages somehow, say per-user, or in
> mm->pinned_vm ?

Ages ago I wrote a patch to account the completion ring to a process'
memlock limit:
  "[patch] aio: remove aio-max-nr and instead use the memlock rlimit to
   limit the number of pages pinned for the aio completion ring"
  http://marc.info/?l=linux-aio&m=123661380807041&w=2

The problem with that patch is that it modifies the user/kernel
interface.  It could be done over time, as Andrew outlined in that
thread, but I've been reluctant to take that on.

If you just mean we should account the memory so that the right process
can be killed, that sounds like a good idea to me.

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 14:22         ` Jeff Moyer
@ 2015-07-20 17:33           ` Oleg Nesterov
  2015-07-20 17:51             ` Benjamin LaHaise
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-20 17:33 UTC (permalink / raw)
  To: Jeff Moyer
  Cc: Benjamin LaHaise, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Hi Jeff,

On 07/20, Jeff Moyer wrote:
>
> Hi, Oleg,
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > Shouldn't we account aio events/pages somehow, say per-user, or in
> > mm->pinned_vm ?
>
> Ages ago I wrote a patch to account the completion ring to a process'
> memlock limit:
>   "[patch] aio: remove aio-max-nr and instead use the memlock rlimit to
>    limit the number of pages pinned for the aio completion ring"
>   http://marc.info/?l=linux-aio&m=123661380807041&w=2
>
> The problem with that patch is that it modifies the user/kernel
> interface.  It could be done over time, as Andrew outlined in that
> thread, but I've been reluctant to take that on.

See also the usage of mm->pinned_vm and user->locked_vm in perf_mmap(),
perhaps aio can do the same...

> If you just mean we should account the memory so that the right process
> can be killed, that sounds like a good idea to me.

Not sure we actually need this. I only meant that this looks confusing
because this memory is actually locked but the kernel doesn't know this.

And btw, I forgot to mention that I triggered OOM on the testing machine
with only 512mb ram, and aio-max-nr was huge. So, once again, while this
all doesn't look right to me, I do not think this is the real problem.

Except the fact that an unpriviliged user can steal all aio-max-nr events.
This probably worth fixing in any case.



And if we accept the fact this memory is locked and if we properly account
it, then may be we can just kill aio_migratepage(), aio_private_file(), and
change aio_setup_ring() to simply use install_special_mapping(). This will
greatly simplify the code. But let me remind that I know nothing about aio,
so please don't take my thoughts seriously.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 17:33           ` Oleg Nesterov
@ 2015-07-20 17:51             ` Benjamin LaHaise
  2015-07-20 18:30               ` Jeff Moyer
  2015-07-20 18:31               ` Oleg Nesterov
  0 siblings, 2 replies; 27+ messages in thread
From: Benjamin LaHaise @ 2015-07-20 17:51 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jeff Moyer, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On Mon, Jul 20, 2015 at 07:33:11PM +0200, Oleg Nesterov wrote:
> Hi Jeff,
> 
> On 07/20, Jeff Moyer wrote:
> >
> > Hi, Oleg,
> >
> > Oleg Nesterov <oleg@redhat.com> writes:
> >
> > > Shouldn't we account aio events/pages somehow, say per-user, or in
> > > mm->pinned_vm ?
> >
> > Ages ago I wrote a patch to account the completion ring to a process'
> > memlock limit:
> >   "[patch] aio: remove aio-max-nr and instead use the memlock rlimit to
> >    limit the number of pages pinned for the aio completion ring"
> >   http://marc.info/?l=linux-aio&m=123661380807041&w=2
> >
> > The problem with that patch is that it modifies the user/kernel
> > interface.  It could be done over time, as Andrew outlined in that
> > thread, but I've been reluctant to take that on.
> 
> See also the usage of mm->pinned_vm and user->locked_vm in perf_mmap(),
> perhaps aio can do the same...
> 
> > If you just mean we should account the memory so that the right process
> > can be killed, that sounds like a good idea to me.
> 
> Not sure we actually need this. I only meant that this looks confusing
> because this memory is actually locked but the kernel doesn't know this.
> 
> And btw, I forgot to mention that I triggered OOM on the testing machine
> with only 512mb ram, and aio-max-nr was huge. So, once again, while this
> all doesn't look right to me, I do not think this is the real problem.
> 
> Except the fact that an unpriviliged user can steal all aio-max-nr events.
> This probably worth fixing in any case.
> 
> 
> 
> And if we accept the fact this memory is locked and if we properly account
> it, then may be we can just kill aio_migratepage(), aio_private_file(), and
> change aio_setup_ring() to simply use install_special_mapping(). This will
> greatly simplify the code. But let me remind that I know nothing about aio,
> so please don't take my thoughts seriously.

No, you can't get rid of that code.  The page migration is required when 
CPUs/memory is offlined and data needs to be moved to another node.  
Similarly, support for mremap() is also required for container migration / 
restoration.

As for accounting locked memory, we don't do that for memory pinned by 
O_DIRECT either.  Given how small the amount of memory aio can pin is 
compared to O_DIRECT or mlock(), it is unlikely that the accounting of 
how much aio has pinned will make any real difference in the big picture.  
A single O_DIRECT i/o can pin megabytes of memory.

		-ben

> Oleg.

-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 17:51             ` Benjamin LaHaise
@ 2015-07-20 18:30               ` Jeff Moyer
  2015-07-20 18:31               ` Oleg Nesterov
  1 sibling, 0 replies; 27+ messages in thread
From: Jeff Moyer @ 2015-07-20 18:30 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Oleg Nesterov, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Benjamin LaHaise <bcrl@kvack.org> writes:

> As for accounting locked memory, we don't do that for memory pinned by 
> O_DIRECT either.  Given how small the amount of memory aio can pin is 
> compared to O_DIRECT or mlock(), it is unlikely that the accounting of 
> how much aio has pinned will make any real difference in the big picture.  
> A single O_DIRECT i/o can pin megabytes of memory.

Actually, you can pin a lot of memory with aio.  Worst case, aio-max-nr
represents the maximum number of pages you can lock in memory (assuming
you pass 1 to io_setup in a loop).  So, for the default of 65536 events,
that comes to 256MB.

My system has libvirt installed, which changes the default to 1 million:
$ grep aio-max-nr /usr/lib/sysctl.d/libvirtd.conf
fs.aio-max-nr = 1048576

So, that means up to 4GB of memory can be tied up by aio.  Oracle's
installation guide recommends the same.

The difference between the aio ring and something like DIO is that the
ring is long-lived.  I/O, ideally, doesn't take that long, so that's a
poor comparison to make.

I tend to agree with Oleg, a system-wide setting just isn't a good fit
for this.  Changing it now that it's in place is difficult, though, and
obviously low priority.

-Jeff

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 17:51             ` Benjamin LaHaise
  2015-07-20 18:30               ` Jeff Moyer
@ 2015-07-20 18:31               ` Oleg Nesterov
  2015-07-20 19:24                 ` Oleg Nesterov
  1 sibling, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-20 18:31 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Jeff Moyer, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/20, Benjamin LaHaise wrote:
>
> On Mon, Jul 20, 2015 at 07:33:11PM +0200, Oleg Nesterov wrote:
> >
> > And if we accept the fact this memory is locked and if we properly account
> > it, then may be we can just kill aio_migratepage(), aio_private_file(), and
> > change aio_setup_ring() to simply use install_special_mapping(). This will
> > greatly simplify the code. But let me remind that I know nothing about aio,
> > so please don't take my thoughts seriously.
>
> No, you can't get rid of that code.  The page migration is required when
> CPUs/memory is offlined and data needs to be moved to another node.

Of course, if we remove aio_migratepage() then aio can't be moved,

> Similarly, support for mremap() is also required for container migration /
> restoration.

This is not the problem. And one of the reasons to move ->mremap() into
vm_operations_struct was that install_special_mapping() can use it.

> Given how small the amount of memory aio can pin

I agree, but why should we worry about migration then? let this memory be
unmovable, don't use GFP_RECLAIMABLE/MOVABLE, etc.

But again, again, please ignore. This all is off-topic and my understanding
is very limited.

> it is unlikely that the accounting of
> how much aio has pinned will make any real difference in the big picture.

Agreed, but this can help to remove the system-wide aio-max-nr. Again,
unpriviliged user can steal aio.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 18:31               ` Oleg Nesterov
@ 2015-07-20 19:24                 ` Oleg Nesterov
  2015-07-20 19:39                   ` Benjamin LaHaise
  0 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-20 19:24 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Jeff Moyer, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/20, Oleg Nesterov wrote:
>
> But again, again, please ignore. This all is off-topic and my understanding
> is very limited.

Yes, yes, but sorry for noise and let me repeat...

This memory lives in page-cache/lru, it is visible for shrinker which
will unmap these pages for no reason on memory shortage. IOW, aio fools
the kernel, this memory looks reclaimable but it is not. And we only do
this for migration.

Even if this is not a problem, this does not look right. So perhaps at
least mapping_set_unevictable() makes sense. But I simply do not know
if migration will work with this change.


And I should have changes the subject a long ago... So what do you think
we should do with the build failure?

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 19:24                 ` Oleg Nesterov
@ 2015-07-20 19:39                   ` Benjamin LaHaise
  2015-07-20 20:03                     ` Oleg Nesterov
  0 siblings, 1 reply; 27+ messages in thread
From: Benjamin LaHaise @ 2015-07-20 19:39 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jeff Moyer, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On Mon, Jul 20, 2015 at 09:24:40PM +0200, Oleg Nesterov wrote:
> On 07/20, Oleg Nesterov wrote:
> >
> > But again, again, please ignore. This all is off-topic and my understanding
> > is very limited.
> 
> Yes, yes, but sorry for noise and let me repeat...
> 
> This memory lives in page-cache/lru, it is visible for shrinker which
> will unmap these pages for no reason on memory shortage. IOW, aio fools
> the kernel, this memory looks reclaimable but it is not. And we only do
> this for migration.

And we have the same problem with O_DIRECT.  Given the size of the LRU in 
a modern system, I highly doubt a handful of pages getting scanned is a 
major problem.  If you want to improve this, go ahead, but we need to 
retain support for page migration as people have run into the need for it.

> Even if this is not a problem, this does not look right. So perhaps at
> least mapping_set_unevictable() makes sense. But I simply do not know
> if migration will work with this change.

Nor do I know if that will work.

> And I should have changes the subject a long ago... So what do you think
> we should do with the build failure?

I honestly don't care what of the options you do -- please just don't go 
about adding BUG()s.

		-ben

> Oleg.

-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-20 19:39                   ` Benjamin LaHaise
@ 2015-07-20 20:03                     ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-20 20:03 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Jeff Moyer, Andrew Morton, Joonsoo Kim, Fengguang Wu,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/20, Benjamin LaHaise wrote:
>
> On Mon, Jul 20, 2015 at 09:24:40PM +0200, Oleg Nesterov wrote:
> > On 07/20, Oleg Nesterov wrote:
> > >
> > > But again, again, please ignore. This all is off-topic and my understanding
> > > is very limited.
> >
> > Yes, yes, but sorry for noise and let me repeat...
> >
> > This memory lives in page-cache/lru, it is visible for shrinker which
> > will unmap these pages for no reason on memory shortage. IOW, aio fools
> > the kernel, this memory looks reclaimable but it is not. And we only do
> > this for migration.
>
> And we have the same problem with O_DIRECT.

I don't this so, or I misunderstood. Sure, dio_get_page() will pin the page.
But not forever and this is unavoidable. This is not the same. OK, nevermind,
lets forget this.

> > And I should have changes the subject a long ago... So what do you think
> > we should do with the build failure?
>
> I honestly don't care what of the options you do -- please just don't go
> about adding BUG()s.

OK, thanks.

Then I'll send v2 which adds ifdef(MMU) into aio_ring_vm_ops tomorrow.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-16 23:14 [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
  2015-07-16 23:22 ` Stephen Rothwell
  2015-07-16 23:24 ` Andrew Morton
@ 2015-07-21 15:29 ` Oleg Nesterov
  2015-07-21 15:38   ` Benjamin LaHaise
  2015-07-21 16:20 ` [PATCH v3] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
  3 siblings, 1 reply; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-21 15:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joonsoo Kim, Benjamin LaHaise, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Andrew,

this replaces

	mm-move-mremap-from-file_operations-to-vm_operations_struct-fix.patch

in -mm tree nacked by Benjamin.

fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite() is
not defined in this case. Add ifdef(CONFIG_MMU) into aio_ring_vm_ops.

Note that this only fixes the build error, currently aio doesn't work
if !CONFIG_MMU anyway.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/aio.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index c122624..822c199 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -338,9 +338,11 @@ static int aio_ring_mremap(struct vm_area_struct *vma)
 
 static const struct vm_operations_struct aio_ring_vm_ops = {
 	.mremap		= aio_ring_mremap,
+#ifdef CONFIG_MMU
 	.fault		= filemap_fault,
 	.map_pages	= filemap_map_pages,
 	.page_mkwrite	= filemap_page_mkwrite,
+#endif
 };
 
 static int aio_ring_mmap(struct file *file, struct vm_area_struct *vma)
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-21 15:29 ` [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
@ 2015-07-21 15:38   ` Benjamin LaHaise
  2015-07-21 16:18     ` Oleg Nesterov
  0 siblings, 1 reply; 27+ messages in thread
From: Benjamin LaHaise @ 2015-07-21 15:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On Tue, Jul 21, 2015 at 05:29:03PM +0200, Oleg Nesterov wrote:
> Andrew,
> 
> this replaces
> 
> 	mm-move-mremap-from-file_operations-to-vm_operations_struct-fix.patch
> 
> in -mm tree nacked by Benjamin.
> 
> fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite() is
> not defined in this case. Add ifdef(CONFIG_MMU) into aio_ring_vm_ops.
> 
> Note that this only fixes the build error, currently aio doesn't work
> if !CONFIG_MMU anyway.
> 
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>  fs/aio.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/aio.c b/fs/aio.c
> index c122624..822c199 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -338,9 +338,11 @@ static int aio_ring_mremap(struct vm_area_struct *vma)
>  
>  static const struct vm_operations_struct aio_ring_vm_ops = {
>  	.mremap		= aio_ring_mremap,
> +#ifdef CONFIG_MMU
>  	.fault		= filemap_fault,
>  	.map_pages	= filemap_map_pages,
>  	.page_mkwrite	= filemap_page_mkwrite,
> +#endif
>  };

One nit: using #if IS_ENABLED(CONFIG_MMU) is recommended these days not 
#ifdef CONFIG_MMU.

		-ben

>  static int aio_ring_mmap(struct file *file, struct vm_area_struct *vma)
> -- 
> 1.7.1
> 

-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-21 15:38   ` Benjamin LaHaise
@ 2015-07-21 16:18     ` Oleg Nesterov
  0 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-21 16:18 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Andrew Morton, Joonsoo Kim, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

On 07/21, Benjamin LaHaise wrote:
>
> On Tue, Jul 21, 2015 at 05:29:03PM +0200, Oleg Nesterov wrote:
> >
> >  static const struct vm_operations_struct aio_ring_vm_ops = {
> >  	.mremap		= aio_ring_mremap,
> > +#ifdef CONFIG_MMU
> >  	.fault		= filemap_fault,
> >  	.map_pages	= filemap_map_pages,
> >  	.page_mkwrite	= filemap_page_mkwrite,
> > +#endif
> >  };
>
> One nit: using #if IS_ENABLED(CONFIG_MMU) is recommended these days not
> #ifdef CONFIG_MMU.

OK, this matches IS_ENABLED(CONFIG_MIGRATION). I am sending v3.

Oleg.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix
  2015-07-16 23:14 [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
                   ` (2 preceding siblings ...)
  2015-07-21 15:29 ` [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
@ 2015-07-21 16:20 ` Oleg Nesterov
  3 siblings, 0 replies; 27+ messages in thread
From: Oleg Nesterov @ 2015-07-21 16:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joonsoo Kim, Benjamin LaHaise, Fengguang Wu, Jeff Moyer,
	Johannes Weiner, Stephen Rothwell, linux-next, linux-kernel

Andrew,

this replaces

	mm-move-mremap-from-file_operations-to-vm_operations_struct-fix.patch

in -mm tree nacked by Benjamin.

fs/aio.c can't be compiled if CONFIG_MMU=n, filemap_page_mkwrite() is
not defined in this case. Add ifdef(CONFIG_MMU) into aio_ring_vm_ops.

Note that this only fixes the build error, currently aio doesn't work
if !CONFIG_MMU anyway.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/aio.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index c122624..822c199 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -338,9 +338,11 @@ static int aio_ring_mremap(struct vm_area_struct *vma)
 
 static const struct vm_operations_struct aio_ring_vm_ops = {
 	.mremap		= aio_ring_mremap,
+#if IS_ENABLED(CONFIG_MMU)
 	.fault		= filemap_fault,
 	.map_pages	= filemap_map_pages,
 	.page_mkwrite	= filemap_page_mkwrite,
+#endif
 };
 
 static int aio_ring_mmap(struct file *file, struct vm_area_struct *vma)
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2015-07-21 16:23 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-16 23:14 [PATCH] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
2015-07-16 23:22 ` Stephen Rothwell
2015-07-16 23:24 ` Andrew Morton
2015-07-16 23:52   ` Oleg Nesterov
2015-07-17 14:06     ` Benjamin LaHaise
2015-07-17 17:27       ` Oleg Nesterov
2015-07-17 17:37         ` Benjamin LaHaise
2015-07-17 17:55           ` Oleg Nesterov
2015-07-17 18:12             ` Austin S Hemmelgarn
2015-07-17 18:19               ` Oleg Nesterov
2015-07-17 18:39                 ` Austin S Hemmelgarn
2015-07-17 18:54                   ` Oleg Nesterov
2015-07-17 19:09                     ` Austin S Hemmelgarn
2015-07-17 22:56             ` Oleg Nesterov
2015-07-17 22:31       ` Oleg Nesterov
2015-07-20 14:22         ` Jeff Moyer
2015-07-20 17:33           ` Oleg Nesterov
2015-07-20 17:51             ` Benjamin LaHaise
2015-07-20 18:30               ` Jeff Moyer
2015-07-20 18:31               ` Oleg Nesterov
2015-07-20 19:24                 ` Oleg Nesterov
2015-07-20 19:39                   ` Benjamin LaHaise
2015-07-20 20:03                     ` Oleg Nesterov
2015-07-21 15:29 ` [PATCH v2] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov
2015-07-21 15:38   ` Benjamin LaHaise
2015-07-21 16:18     ` Oleg Nesterov
2015-07-21 16:20 ` [PATCH v3] mm-move-mremap-from-file_operations-to-vm_operations_struct-fix Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.