From: Oleg Nesterov <oleg@redhat.com>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Hugh Dickins <hughd@google.com>, Andrey Vagin <avagin@openvz.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Pavel Emelyanov <xemul@virtuozzo.com>,
	Dmitry Safonov <dsafonov@virtuozzo.com>,
	Andrew Morton <akpm@linuxfoundation.org>,
	Adrian Reber <areber@redhat.com>
Subject: Re: [criu] 1M guard page ruined restore
Date: Thu, 22 Jun 2017 16:23:00 +0200
Message-ID: <20170622142300.GA762@redhat.com>
In-Reply-To: <20170621170129.GA32752@redhat.com>

Cyrill,

I am replying to my own email because I got lost in the numerous threads/emails
connected to the stack guard/gap problems. IIRC you confirmed that the 1st store
doesn't fail and that the patch fixes the problem. So everything is clear, and we
will discuss this change in another thread.

But let me add that (imo) you should not change this test-case. You simply
should not run it if kerndat_mm_guard_page_maps() detects the new kernel at
startup.

The new version makes no sense for criu, afaics. Yes, yes, thank you very
much for this test-case, it found the kernel regression ;) But criu has
nothing to do with this problem, and it is not clear right now if we are
going to fix it or not.

With the recent kernel changes criu should never look outside of the
start-end region reported by /proc/pid/maps. And restore doesn't even need
to know whether a GROWSDOWN region will actually grow or not: (iiuc) you do
not need to auto-grow the stack vma during restore, because criu re-creates
the whole vma with the same length using MAP_FIXED, and it should never
write below the addr returned by mmap(MAP_FIXED).

So (afaics) the only complication is that the process can be dumped on
a system running with (say) the stack_guard_gap=4K kernel parameter, and then
restored on another system running with stack_guard_gap=1M. In this case
the application may fail after restore if it tries to auto-grow the stack,
but this is unlikely and this is another story.

Oleg.

On 06/21, Oleg Nesterov wrote:
>
> On 06/21, Cyrill Gorcunov wrote:
> >
> > On Wed, Jun 21, 2017 at 05:57:30PM +0200, Oleg Nesterov wrote:
> > > >
> > > > 	p = fake_grow_down;
> > > > 	*p-- = 'c';
> > >
> > > I guess this works? I mean, *p-- = 'c' should not fail...
> >
> > It fails.
>
> Hmm. Impossible ;) could you add the additional printf's to re-check?
>
> > Here is the complete code. It is supposed to _extend_ the stack but it fails
> > on the latest master + Hugh's [PATCH] mm: fix new crash in unmapped_area_topdown()
> > ---
> > [root@fc2 criu]# ~/st2
> > start_addr 7fe6162a8000
> > start_addr 7fe6163d9000
> > Segmentation fault (core dumped)
> > ---
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <errno.h>
> > #include <string.h>
> > #include <unistd.h>
> >
> > #include <sys/mman.h>
> >
> > #define PAGE_SIZE 4096
> >
> > int main(int argc, char **argv)
> > {
> > 	char *start_addr, *start_addr1, *fake_grow_down, *test_addr, *grow_down;
> > 	volatile char *p;
> >
> > 	start_addr = mmap(NULL, PAGE_SIZE * 512, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > 	if (start_addr == MAP_FAILED) {
> > 		printf("Can't map a new region\n");
> > 		return 1;
> > 	}
> > 	printf("start_addr %lx\n", (unsigned long)start_addr);
> > 	munmap(start_addr, PAGE_SIZE * 512);
> >
> > 	start_addr += PAGE_SIZE * 300;
> >
> > 	fake_grow_down = mmap(start_addr + PAGE_SIZE * 5, PAGE_SIZE,
> > 			 PROT_READ | PROT_WRITE,
> > 			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
> > 	if (fake_grow_down == MAP_FAILED) {
> > 		printf("Can't map a new region\n");
> > 		return 1;
> > 	}
> > 	printf("start_addr %lx\n", (unsigned long)fake_grow_down);
> >
> > 	p = fake_grow_down;
> > 	*p-- = 'c';
>
> once again, I can't believe this STORE can fail...
>
> > 	*p = 'b';
>
> Ah. I forgot about another kernel "feature" ;) not related to the recent guard
> page changes...
>
> Could you test the patch below?
>
> Oleg.
>
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 8ad91a0..edc5d68 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1416,7 +1416,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>  		 * and pusha to work. ("enter $65535, $31" pushes
>  		 * 32 pointers and then decrements %sp by 65535.)
>  		 */
> -		if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) {
> +if (0)		if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) {
>  			bad_area(regs, error_code, address);
>  			return;
>  		}
