* ckpt-v20-dev, ckpt-v21-rc1
@ 2010-03-30  6:26 Oren Laadan
       [not found] ` <4BB1997B.1010901-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Oren Laadan @ 2010-03-30  6:26 UTC (permalink / raw)
  To: Linux Containers


I pulled all the recent patches in linux-cr (except for ipv6 fixup
set), and created the following two branches:

ckpt-v20-dev - patches applied on top of v20
ckpt-v21-rc1 - patches folded into a clean patchset

Likewise with user-cr (but with more exceptions - working now on
pulling more patches in).

This is totally untested except for successful compilation...

Oren.


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found] ` <4BB1997B.1010901-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2010-03-30 20:56   ` Serge E. Hallyn
  2010-03-31 17:33   ` Serge E. Hallyn
  1 sibling, 0 replies; 10+ messages in thread
From: Serge E. Hallyn @ 2010-03-30 20:56 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Linux Containers

Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
> 
> I pulled all the recent patches in linux-cr (except for ipv6 fixup
> set), and created the following two branches:
> 
> ckpt-v20-dev - patches applied onto pof v20
> ckpt-v21-rc1 - patches folded into a clean patchset
> 
> Likewise with user-cr (but with more exceptions - working now on
> pulling more patches in).
> 
> This is totally untested except for successful compilation...
> 
> Oren.

ckpt-v21-rc1 (aside from the export_symbol patch) passes all
tests for me on powerpc, x86-64, and s390, and all manner of
config combos with LSMs, namespaces, checkpoint=y/n, and
debug=y/n compile with no errors.

-serge


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found] ` <4BB1997B.1010901-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  2010-03-30 20:56   ` Serge E. Hallyn
@ 2010-03-31 17:33   ` Serge E. Hallyn
       [not found]     ` <20100331173339.GA19371-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Serge E. Hallyn @ 2010-03-31 17:33 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Linux Containers

Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
> 
> I pulled all the recent patches in linux-cr (except for ipv6 fixup
> set), and created the following two branches:
> 
> ckpt-v20-dev - patches applied onto pof v20
> ckpt-v21-rc1 - patches folded into a clean patchset
> 
> Likewise with user-cr (but with more exceptions - working now on
> pulling more patches in).
> 
> This is totally untested except for successful compilation...
> 
> Oren.

v21 with the recent patches applied seems rock-solid to me, on
x86-64, s390x, and powerpc.

recent patches means:
	skip down interfaces v2,
	put fops->checkpoint under ifdef
	get rid of ckpt_hdr_vpids,
	export net checkpoint fns

for kernel and

	Add --nonetns switch to user-cr checkpoint
	fix vpids
	user-c/r: get rid of ckpt_hdr_vpids

to user-cr#ckpt-v20-dev

Do we want to try and incorporate Matt's patchset to clear out
linux-2.6/checkpoint/ next, or ship what we have?

-serge


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found]     ` <20100331173339.GA19371-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-04-01  5:31       ` Oren Laadan
       [not found]         ` <4BB42FBB.8030206-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Oren Laadan @ 2010-04-01  5:31 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers


Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
>> I pulled all the recent patches in linux-cr (except for ipv6 fixup
>> set), and created the following two branches:
>>
>> ckpt-v20-dev - patches applied onto pof v20
>> ckpt-v21-rc1 - patches folded into a clean patchset
>>
>> Likewise with user-cr (but with more exceptions - working now on
>> pulling more patches in).
>>
>> This is totally untested except for successful compilation...
>>
>> Oren.
> 
> v21 with the recent patches applied seems rock-solid to me, on
> x86-64, s390x, and powerpc.
> 
> recent patches means:
> 	skip down interfaces v2,
> 	put fops->checkpoint under ifdef
> 	get rid of ckpt_hdr_vpids,
> 	export net checkpoint fns
> 
> for kernel and
> 
> 	Add --nonetns switch to user-cr checkpoint
> 	fix vpids
> 	user-c/r: get rid of ckpt_hdr_vpids
> 
> to user-cr#ckpt-v20-dev


I pushed linux-cr:ckpt-v21-rc2 with:

> 	skip down interfaces v2,
> 	put fops->checkpoint under ifdef
> 	get rid of ckpt_hdr_vpids,
> 	export net checkpoint fns

and user-cr:ckpt-v20-dev with:

> 	fix vpids
> 	user-c/r: get rid of ckpt_hdr_vpids
plus Suka's two recent patches.

Dan's --nonetns is pending - see my reply.

> 
> Do we want to try and incorporate Matt's patchset to clear out
> linux-2.6/checkpoint/ next, or ship what we have?

IIRC there is at least one more comment from fsdevel that I need
to address, need to dig into emails again (file_pos_ ?).

I hope to start tackling the transformation to matt's-format by
tomorrow, and will see how fast it goes.

Oren.


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found]         ` <4BB42FBB.8030206-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2010-04-01 14:17           ` Serge E. Hallyn
       [not found]             ` <20100401141740.GB22648-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-04-01 21:15           ` Matt Helsley
  1 sibling, 1 reply; 10+ messages in thread
From: Serge E. Hallyn @ 2010-04-01 14:17 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Linux Containers

Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
> 
> Serge E. Hallyn wrote:
> > Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
> >> I pulled all the recent patches in linux-cr (except for ipv6 fixup
> >> set), and created the following two branches:
> >>
> >> ckpt-v20-dev - patches applied onto pof v20
> >> ckpt-v21-rc1 - patches folded into a clean patchset
> >>
> >> Likewise with user-cr (but with more exceptions - working now on
> >> pulling more patches in).
> >>
> >> This is totally untested except for successful compilation...
> >>
> >> Oren.
> > 
> > v21 with the recent patches applied seems rock-solid to me, on
> > x86-64, s390x, and powerpc.
> > 
> > recent patches means:
> > 	skip down interfaces v2,
> > 	put fops->checkpoint under ifdef
> > 	get rid of ckpt_hdr_vpids,
> > 	export net checkpoint fns
> > 
> > for kernel and
> > 
> > 	Add --nonetns switch to user-cr checkpoint
> > 	fix vpids
> > 	user-c/r: get rid of ckpt_hdr_vpids
> > 
> > to user-cr#ckpt-v20-dev
> 
> 
> I pushed linux-cr:ckpt-v21-rc2 with:
> 
> > 	skip down interfaces v2,
> > 	put fops->checkpoint under ifdef
> > 	get rid of ckpt_hdr_vpids,
> > 	export net checkpoint fns
> 
> and user-cr:ckpt-v20-dev with:
> 
> > 	fix vpids
> > 	user-c/r: get rid of ckpt_hdr_vpids
> plus Suka's two recent patches.
> 
> Dan's --nonetns is pending - see my reply.
> 
> > 
> > Do we want to try and incorporate Matt's patchset to clear out
> > linux-2.6/checkpoint/ next, or ship what we have?
> 
> IIRC there is at least one more comment from fsdevel that I need
> to address, need to dig into emails again (file_pos_ ?).
> 
> I hope to start tackling the transformation to matt's-format by
> tomorrow, and will see how fast it goes.

Alas, after I sent this Nathan reported trouble on x86.  I haven't
gotten an x86-32 partition running yet so can't reproduce.  Does
anyone else have x86-32 with f12 they can test on?

-serge


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found]             ` <20100401141740.GB22648-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-04-01 18:09               ` Nathan Lynch
  2010-04-01 18:54                 ` Nathan Lynch
  0 siblings, 1 reply; 10+ messages in thread
From: Nathan Lynch @ 2010-04-01 18:09 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers

On Thu, 2010-04-01 at 09:17 -0500, Serge E. Hallyn wrote:
> Alas after I sent this Nathan reported trouble on x86.  I haven't
> gotten a x86-32 partition running yet so can't reproduce.  Does
> anyone else have x86-32 with f12 they can test on?

More detail on this.  Seeing programs crash after restart on an i386 KVM
guest with Fedora 12 userspace, e.g.

bash-simple.sh[3627] general protection ip:b76d8197 sp:bfe73904 error:0
in libc-2.11.1.so[b760e000+16f000]

kernel: ckpt-v21-rc1 (v2.6.33-108-g9f4401c)

user-cr: ckpt-v20 (70a5c7630ce8dd933e60174aba6dc5cb08ea4b41) with
CHECKPOINT_NONETNS flag added to checkpoint calls

cr_tests: master (7bd883e17f25d36d66f9b550194c445df2f63e7e) with
CHECKPOINT_NONETNS flag added to simple/ckpt.c.

The "simple" (self-checkpoint) testcase restarts successfully without
crashing.  Everything else seems to get crashes as above.  As far as I
can tell the memory map is restored correctly and from the kernel's POV
the restart is successful.

FWIW, I don't see these crashes on another KVM guest where the only
difference is that it is x86_64.
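
For anyone trying to narrow this down, the fault line quoted above can be
turned into an offset inside libc by subtracting the mapping base from the
faulting ip.  A minimal sketch, assuming the log line keeps the
"ip:... in object[base+size]" layout shown in the report; the helper name
fault_offset is made up for illustration and is not part of any tool
mentioned in this thread:

```python
import re

# Matches the "ip:... sp:... error:... in object[base+size]" layout of the
# fault line quoted in the report.  All numbers are hex without a 0x prefix.
FAULT_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+)"
    r"\s+in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, offset of the faulting ip inside that mapping)."""
    m = FAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised fault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    # Sanity check: the ip must fall inside the reported mapping.
    if not base <= ip < base + size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("obj"), ip - base


# The line from the report, joined back onto one line.
line = (
    "bash-simple.sh[3627] general protection ip:b76d8197 sp:bfe73904 error:0 "
    "in libc-2.11.1.so[b760e000+16f000]"
)
obj, off = fault_offset(line)
print(obj, hex(off))  # libc-2.11.1.so 0xca197
```

The resulting offset is only meaningful against the exact libc build that
was mapped in the guest; looking it up in a different build would point at
unrelated code.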


* Re: ckpt-v20-dev, ckpt-v21-rc1
  2010-04-01 18:09               ` Nathan Lynch
@ 2010-04-01 18:54                 ` Nathan Lynch
  2010-04-01 19:10                   ` Serge E. Hallyn
  0 siblings, 1 reply; 10+ messages in thread
From: Nathan Lynch @ 2010-04-01 18:54 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers

On Thu, 2010-04-01 at 13:09 -0500, Nathan Lynch wrote:
> On Thu, 2010-04-01 at 09:17 -0500, Serge E. Hallyn wrote:
> > Alas after I sent this Nathan reported trouble on x86.  I haven't
> > gotten a x86-32 partition running yet so can't reproduce.  Does
> > anyone else have x86-32 with f12 they can test on?
> 
> More detail on this.  Seeing programs crash after restart on an i386 KVM
> guest with Fedora 12 userspace, e.g.
> 
> bash-simple.sh[3627] general protection ip:b76d8197 sp:bfe73904 error:0
> in libc-2.11.1.so[b760e000+16f000]
> 
> kernel: ckpt-v21-rc1 (v2.6.33-108-g9f4401c)
> 
> user-cr: ckpt-v20 ( 70a5c7630ce8dd933e60174aba6dc5cb08ea4b41) with
> CHECKPOINT_NONETNS flag added to checkpoint calls

Confirmed that behavior is the same with user-cr ckpt-v20-dev
(7bd883e17f25d36d66f9b550194c445df2f63e7e).

> 
> cr_tests: master (7bd883e17f25d36d66f9b550194c445df2f63e7e) with
> CHECKPOINT_NONETNS flag added to simple/ckpt.c.
> 
> The "simple" (self-checkpoint) testcase restarts successfully without
> crashing.  Everything else seems to get crashes as above.  As far as I
> can tell the memory map is restored correctly and from the kernel's POV
> the restart is successful.
> 
> FWIW, I don't see these crashes on another KVM guest where the only
> difference is that it is x86_64.


* Re: ckpt-v20-dev, ckpt-v21-rc1
  2010-04-01 18:54                 ` Nathan Lynch
@ 2010-04-01 19:10                   ` Serge E. Hallyn
       [not found]                     ` <20100401191001.GA8882-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Serge E. Hallyn @ 2010-04-01 19:10 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: Linux Containers

Quoting Nathan Lynch (ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org):
> On Thu, 2010-04-01 at 13:09 -0500, Nathan Lynch wrote:
> > On Thu, 2010-04-01 at 09:17 -0500, Serge E. Hallyn wrote:
> > > Alas after I sent this Nathan reported trouble on x86.  I haven't
> > > gotten a x86-32 partition running yet so can't reproduce.  Does
> > > anyone else have x86-32 with f12 they can test on?
> > 
> > More detail on this.  Seeing programs crash after restart on an i386 KVM
> > guest with Fedora 12 userspace, e.g.
> > 
> > bash-simple.sh[3627] general protection ip:b76d8197 sp:bfe73904 error:0
> > in libc-2.11.1.so[b760e000+16f000]
> > 
> > kernel: ckpt-v21-rc1 (v2.6.33-108-g9f4401c)
> > 
> > user-cr: ckpt-v20 ( 70a5c7630ce8dd933e60174aba6dc5cb08ea4b41) with
> > CHECKPOINT_NONETNS flag added to checkpoint calls
> 
> Confirmed that behavior is the same with user-cr ckpt-v20-dev
> (7bd883e17f25d36d66f9b550194c445df2f63e7e).
> 
> > 
> > cr_tests: master (7bd883e17f25d36d66f9b550194c445df2f63e7e) with
> > CHECKPOINT_NONETNS flag added to simple/ckpt.c.
> > 
> > The "simple" (self-checkpoint) testcase restarts successfully without
> > crashing.  Everything else seems to get crashes as above.  As far as I
> > can tell the memory map is restored correctly and from the kernel's POV
> > the restart is successful.
> > 
> > FWIW, I don't see these crashes on another KVM guest where the only
> > difference is that it is x86_64.
> 

I just (finally) finished install and setup of f12 x86 kvm container,
and it seems to be passing.  (well cloop_parallel seems to need more
than the 10 second timeout to check all tasks are restarted, but they
do restart - I do have kvm constrained with memory and cpu cgroups :).

-serge


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found]                     ` <20100401191001.GA8882-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-04-01 20:57                       ` Nathan Lynch
  0 siblings, 0 replies; 10+ messages in thread
From: Nathan Lynch @ 2010-04-01 20:57 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers

On Thu, 2010-04-01 at 14:10 -0500, Serge E. Hallyn wrote:
> Quoting Nathan Lynch (ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org):
> > On Thu, 2010-04-01 at 13:09 -0500, Nathan Lynch wrote:
> > > On Thu, 2010-04-01 at 09:17 -0500, Serge E. Hallyn wrote:
> > > > Alas after I sent this Nathan reported trouble on x86.  I haven't
> > > > gotten a x86-32 partition running yet so can't reproduce.  Does
> > > > anyone else have x86-32 with f12 they can test on?
> > > 
> > > More detail on this.  Seeing programs crash after restart on an i386 KVM
> > > guest with Fedora 12 userspace, e.g.
> > > 
> > > bash-simple.sh[3627] general protection ip:b76d8197 sp:bfe73904 error:0
> > > in libc-2.11.1.so[b760e000+16f000]
> > > 
> > > kernel: ckpt-v21-rc1 (v2.6.33-108-g9f4401c)
> > > 
> > > user-cr: ckpt-v20 ( 70a5c7630ce8dd933e60174aba6dc5cb08ea4b41) with
> > > CHECKPOINT_NONETNS flag added to checkpoint calls
> > 
> > Confirmed that behavior is the same with user-cr ckpt-v20-dev
> > (7bd883e17f25d36d66f9b550194c445df2f63e7e).
> > 
> > > 
> > > cr_tests: master (7bd883e17f25d36d66f9b550194c445df2f63e7e) with
> > > CHECKPOINT_NONETNS flag added to simple/ckpt.c.
> > > 
> > > The "simple" (self-checkpoint) testcase restarts successfully without
> > > crashing.  Everything else seems to get crashes as above.  As far as I
> > > can tell the memory map is restored correctly and from the kernel's POV
> > > the restart is successful.
> > > 
> > > FWIW, I don't see these crashes on another KVM guest where the only
> > > difference is that it is x86_64.
> > 
> 
> I just (finally) finished install and setup of f12 x86 kvm container,
> and it seems to be passing.  (well cloop_parallel seems to need more
> than the 10 second timeout to check all tasks are restarted, but they
> do restart - I do have kvm constrained with memory and cpu cgroups :).

On the problem guest I booted a 64-bit kernel configured similarly to
the i386 one, and the crashes do not occur.  That is, I'm running 32-bit
userspace on a 64-bit kernel.  So the issue is likely specific to the
i386 c/r implementation and my particular version of kvm or host kernel
(which is also Fedora 12).


* Re: ckpt-v20-dev, ckpt-v21-rc1
       [not found]         ` <4BB42FBB.8030206-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
  2010-04-01 14:17           ` Serge E. Hallyn
@ 2010-04-01 21:15           ` Matt Helsley
  1 sibling, 0 replies; 10+ messages in thread
From: Matt Helsley @ 2010-04-01 21:15 UTC (permalink / raw)
  To: Oren Laadan; +Cc: Linux Containers

On Thu, Apr 01, 2010 at 01:31:39AM -0400, Oren Laadan wrote:
> 
> Serge E. Hallyn wrote:
> > Quoting Oren Laadan (orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org):
> >> I pulled all the recent patches in linux-cr (except for ipv6 fixup
> >> set), and created the following two branches:
> >>
> >> ckpt-v20-dev - patches applied onto pof v20
> >> ckpt-v21-rc1 - patches folded into a clean patchset
> >>
> >> Likewise with user-cr (but with more exceptions - working now on
> >> pulling more patches in).
> >>
> >> This is totally untested except for successful compilation...
> >>
> >> Oren.
> > 
> > v21 with the recent patches applied seems rock-solid to me, on
> > x86-64, s390x, and powerpc.
> > 
> > recent patches means:
> > 	skip down interfaces v2,
> > 	put fops->checkpoint under ifdef
> > 	get rid of ckpt_hdr_vpids,
> > 	export net checkpoint fns
> > 
> > for kernel and
> > 
> > 	Add --nonetns switch to user-cr checkpoint
> > 	fix vpids
> > 	user-c/r: get rid of ckpt_hdr_vpids
> > 
> > to user-cr#ckpt-v20-dev
> 
> 
> I pushed linux-cr:ckpt-v21-rc2 with:
> 
> > 	skip down interfaces v2,
> > 	put fops->checkpoint under ifdef
> > 	get rid of ckpt_hdr_vpids,
> > 	export net checkpoint fns
> 
> and user-cr:ckpt-v20-dev with:
> 
> > 	fix vpids
> > 	user-c/r: get rid of ckpt_hdr_vpids
> plus Suka's two recent patches.
> 
> Dan's --nonetns is pending - see my reply.
> 
> > 
> > Do we want to try and incorporate Matt's patchset to clear out
> > linux-2.6/checkpoint/ next, or ship what we have?
> 
> IIRC there is at least one more comment from fsdevel that I need
> to address, need to dig into emails again (file_pos_ ?).
> 
> I hope to start tackling the transformation to matt's-format by
> tomorrow, and will see how fast it goes.

Well, my transformation doesn't touch eclone or the cgroup-freezer-related
patches so the initial 20 patches won't need to change.

It would be great if we could reduce the number of patches. Some ideas:

Can we push Dave's Namespace menu patch independently? I suspect it's
good enough even without c/r at this point.

The cgroup freezer fix patch has gone to Rafael. It only affects frozen
cgroups if we do:
	echo FROZEN > /cgroup/foo/freezer.state
	<suspend system>
	<resume system>  <-- unfreezes cgroup too!
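
The check in the sequence above can be scripted so the cgroup state is
compared before and after a suspend/resume cycle.  A rough sketch, assuming
a cgroup v1 freezer hierarchy whose freezer.state file accepts FROZEN and
THAWED; the helper names are invented for illustration, and the suspend
step itself is left to the tester:

```python
import os


def read_freezer_state(cgroup_dir):
    """Return the current contents of freezer.state (e.g. 'FROZEN')."""
    with open(os.path.join(cgroup_dir, "freezer.state")) as f:
        return f.read().strip()


def set_freezer_state(cgroup_dir, state):
    """Request a state change, like 'echo FROZEN > .../freezer.state'."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError("state must be FROZEN or THAWED")
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state + "\n")


def still_frozen(cgroup_dir):
    """True if the cgroup still reports FROZEN, e.g. after a resume."""
    return read_freezer_state(cgroup_dir) == "FROZEN"


# Intended use on a test machine (needs root and a mounted freezer cgroup):
#   set_freezer_state("/cgroup/foo", "FROZEN")
#   ... suspend and resume the system by hand ...
#   print(still_frozen("/cgroup/foo"))  # False would indicate the bug
```

On a real hierarchy the file may briefly report FREEZING after the write,
so a careful test would poll until it reads FROZEN before suspending.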

I've only submitted patches via one tree and then "waited" patiently
-- can/should we drop this patch? My instinct is to trust the maintainers
to merge things properly and keep it so we don't get a cgroup freezer bug
report. But I would like to reduce the number of non-c/r patches in this
series...

Could Serge's "c/r: split core function out of some set*{u,g}id functions"
or "c/r: break out new_user_ns()" be considered code cleanups or
at all useful without c/r? If so, perhaps we can push those separately
and with c/r.

I think Nick's comments and the LWN article clearly suggest that
the current c/r code organization is a barrier to review. I didn't
get the impression that any of the patches added for v21 were complicated
enough to warrant another round of limited review amongst ourselves. The
longer we wait the harder it will be to do. So I think reorganizing the code
now, before v21, is the way to go.

Cheers,
	-Matt Helsley

