* [Linux-ia64] mprotect problem
From: Hoeflinger, Jay P @ 2001-12-06 23:56 UTC
  To: linux-ia64

We are seeing an apparent problem with mprotect on Itanium.  We have seen
the problem on two different machines, one running RedHat 7.1 (Seawolf)
[2.4.3-12smp] and one running Turbolinux [2.4.1-010131-8smp].

The mprotect is called from user space, as part of the implementation of a
distributed virtual shared memory (DVSM) system that uses the virtual memory
mechanism to implement a shared address space between two or more nodes.

The code works correctly under RedHat 7.1 for IA32 (and a variety of other
OSes and platforms), so we feel that there aren't coding errors, although
maybe there is some slightly different way to use mprotect on Itanium
(additional parameters, or flags?).

The problem we see is this:

During the course of running the user's program on top of our DVSM, the
program touches a "shared" page that was previously mprotect'ed against
reading and writing because it is not up-to-date with respect to the same
page on other nodes in the system.  The access faults, our SEGV handler is
called, we do the appropriate message-passing operations to make the data
on the page consistent and up-to-date, then do an mprotect allowing READ
and WRITE this time, and return from the SEGV handler.  At this point, the
original instruction (a READ) is restarted and immediately faults, causing
control to go to the SEGV handler again.  This time, since we know the page
is up-to-date, we do nothing and return; the instruction is again
restarted, again faults, again jumps to the SEGV handler . . . an infinite
loop.
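
The fault-and-restart scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: Linux, an anonymous private mapping standing in for the shared heap, and hypothetical names (`segv_handler`, `faults_for_one_read`); it is not the DVSM code. Note that POSIX does not list mprotect as async-signal-safe, although calling it from a SIGSEGV handler is the usual practice for this technique.

```c
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on some toolchains */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t fault_count;   /* faults seen by the handler */
static size_t page_size;

/* Unprotect the page containing the faulting address, then return so
 * that the kernel restarts the faulting instruction. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    void *page = (void *)(addr & ~((uintptr_t)page_size - 1));

    fault_count++;
    /* Give up after a few repeats instead of looping forever. */
    if (fault_count > 3 ||
        mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Returns the number of faults taken by one read of a PROT_NONE page,
 * or -1 on setup failure. */
int faults_for_one_read(void)
{
    struct sigaction sa, old;
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;
    page_size = (size_t)ps;
    assert((page_size & (page_size - 1)) == 0);   /* mask needs a power of two */
    fault_count = 0;

    void *m = mmap(NULL, page_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(m, page_size);
        return -1;
    }

    volatile char *p = m;
    char c = p[0];      /* faults; restarted after the handler returns */
    (void)c;

    sigaction(SIGSEGV, &old, NULL);
    munmap(m, page_size);
    return (int)fault_count;
}
```

When mprotect takes effect as expected, the read faults exactly once. The behaviour reported in this message corresponds to the restarted read faulting again even though mprotect returned success.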

The interesting thing is that this particular user code fails at random
points, sometimes working correctly at points where it failed before.  We
have never seen the code work correctly all the way through, though.  It
always fails very soon after it begins, just at different points on
different runs.  We theorized that this was a timing problem, such that it
just took some time for mprotect to take effect, so we put in
10-millisecond delays after each mprotect, but this changed nothing.

One potential clue is that the code is a pthreads program, and multiple
threads are running while one thread is doing the mprotect, and these
machines are both dual-processor machines.

We would appreciate any help that anyone can give.

Jay

Jay Hoeflinger, jay.p.hoeflinger@intel.com
KAI Software, A Division of Intel Americas, Inc., http://www.kai.com
Phone 217/356-2288, Direct 217/356-5052 x 140, Fax 217/356-5199




* Re: [Linux-ia64] mprotect problem
From: n0ano @ 2001-12-07  0:11 UTC
  To: linux-ia64

Jay-

Is this an IA32 binary you are running?  Due to issues with the way
the shared library loader uses the `mprotect' call I had to play
a little fast and loose with page protections for IA32 programs.
What happens is an `mprotect' will change the permissions for the
address range you specify but it might also change the permissions
for addresses just before and just after the specified range, depending
upon the kernel's page size and the address range specified.  It
doesn't sound like this should break your application but it's
possible.
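
The spill-over described above is just page rounding. As a rough sketch, here is how a requested range widens to whole kernel pages; the helper name `covering_pages` and the 16 KB kernel page size used in the usage note are assumptions for illustration, not values stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a byte range widened to whole pages. */
struct page_range {
    uintptr_t start;  /* rounded down to a page boundary */
    size_t    len;    /* a multiple of page_size */
};

/* Widen [addr, addr + len) to the pages that contain it. */
struct page_range covering_pages(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t mask  = (uintptr_t)page_size - 1;
    uintptr_t start = addr & ~mask;                 /* round start down */
    uintptr_t end   = (addr + len + mask) & ~mask;  /* round end up */
    struct page_range r = { start, (size_t)(end - start) };
    return r;
}
```

For example, a 4 KB request at 0x5000 on a kernel with 16 KB pages widens to 0x4000..0x8000, so 4 KB before and 8 KB after the requested range change protection too. With 4 KB pages the range is unchanged.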

One solution would be to run this on a kernel compiled for 4K pages.
This should give you exact IA32 operation and potentially solve your
problem.

If you're running an IA64 program then this is even more mysterious :-)

-- 
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
n0ano@indstorage.com
Ph: 303/652-0870x117


* RE: [Linux-ia64] mprotect problem
From: Hoeflinger, Jay P @ 2001-12-07 14:53 UTC
  To: linux-ia64

No, we haven't checked that this is not a page size or alignment issue.
The page size we get from getpagesize().

The alignment we get from the following:

	if ((shmid = shmget(IPC_PRIVATE, len, IPC_CREAT|0600)) < 0)
		Tmk_perrexit("Tmk_page_initialize<shmget>: can't allocate the shared memory");

	if ((page->vadr = shmat(shmid, addr, 0)) == (caddr_t) -1L)
		Tmk_perrexit("Tmk_page_initialize<shmat>: can't map the shared memory");

page->vadr is the address of the base of the area that we are managing
via our distributed shared virtual memory mechanism (the address of page 0).

We then attempt to open a swap file and mmap the whole range of addresses we
will manage (the length is "len"), map a set of alias pages, then mprotect 
the non-alias portion either PROT_READ or no-access.

	if ((fd = open(name, O_RDWR|O_CREAT|O_TRUNC, 0600)) < 0)
		Tmk_perrexit("Tmk_page_initialize<open>: open(\"%s\", ... )", name);

	if (0 > unlink(name))
		Tmk_perrexit("Tmk_page_initialize<unlink>");

	if ((len - sizeof(c)) != lseek(fd, len - sizeof(c), SEEK_END))
		Tmk_perrexit("Tmk_page_initialize<lseek>");

	if (0 > write(fd, &c, sizeof(c)))
		Tmk_perrexit("Tmk_page_initialize<write>");
#else
	if ((fd = open("/dev/zero", O_RDWR)) == -1)
		flags |= MAP_ANONYMOUS;
	else
		flags |= MAP_FILE;
#endif
	if ((page->vadr = mmap(addr, len, prot, flags, fd, 0)) == (caddr_t) -1L)
		Tmk_perrexit("Tmk_page_initialize<mmap>: can't allocate the shared memory");

	if ((page->v_alias = mmap(addr+2*len, len, prot, flags&~MAP_FIXED, fd, 0)) == (caddr_t) -1L)
		Tmk_perrexit("Tmk_page_initialize<mmap>: can't allocate the shared memory");

	if (fd != -1)
		close(fd);
#endif
	if (Tmk_page_init_to_valid) {
		if (Tmk_nprocs > 1)
			if (0 > mprotect(page->vadr, len, PROT_READ))
				Tmk_perrexit("Tmk_page_initialize<mprotect>");
	}
	else {
		if (Tmk_proc_id)
			if (0 > mprotect(page->vadr, len, 0))
				Tmk_perrexit("Tmk_page_initialize<mprotect>");
	}
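
The alias mapping in the excerpt above lets the runtime update a page while the application's view of it stays protected. A self-contained sketch of that idea follows, assuming Linux, MAP_SHARED mappings of an unlinked temporary file, and a hypothetical function name and path template; the mapping flags actually used by the DVSM are not shown in the excerpt.

```c
#define _POSIX_C_SOURCE 200809L   /* for mkstemp and ftruncate */
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes a byte through the alias while the primary mapping is
 * PROT_NONE, then reads it back through the primary mapping.
 * Returns the byte read, or -1 on any failure. */
int alias_roundtrip(void)
{
    char path[] = "/tmp/alias-demo-XXXXXX";   /* hypothetical name */
    long ps = sysconf(_SC_PAGESIZE);
    int result = -1;
    if (ps <= 0)
        return -1;
    size_t len = (size_t)ps;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                    /* the file lives until it is unmapped */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* Two views of the same file pages. */
    char *primary = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *alias   = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                       /* mappings stay valid after close */
    if (primary == MAP_FAILED || alias == MAP_FAILED) {
        if (primary != MAP_FAILED) munmap(primary, len);
        if (alias != MAP_FAILED) munmap(alias, len);
        return -1;
    }

    if (mprotect(primary, len, PROT_NONE) == 0) {
        alias[0] = 42;               /* update without touching primary */
        if (mprotect(primary, len, PROT_READ) == 0)
            result = primary[0];     /* the update is visible here */
    }
    munmap(primary, len);
    munmap(alias, len);
    return result;
}
```

Because both mappings share the same file pages, protection on one view does not restrict access through the other.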

We compute the address of the rest of the pages by looping and successively
adding the page size to this address.
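
The page-address computation just described, and its inverse (mapping a fault address back to a page number), might look like the following sketch. The function names are hypothetical and do not come from the DVSM source.

```c
#include <assert.h>
#include <stddef.h>

/* Address of page `index`, given the base address of page 0. */
char *page_address(char *base, size_t index, size_t page_size)
{
    return base + index * page_size;
}

/* Index of the page that contains `addr`. */
size_t page_index(const char *base, const char *addr, size_t page_size)
{
    assert(addr >= base && page_size != 0);
    return (size_t)(addr - base) / page_size;
}
```

Both helpers only give meaningful results if `base` itself is page-aligned, which is why the alignment of page->vadr matters here.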

This part seems to work, since we see the original fault happening as it
should, but once this initial protection is set up, we seem to be unable to
change it, even though we use:

	if (0 > mprotect(page->vadr, Tmk_page_size, PROT_READ|PROT_WRITE))
		Tmk_perrexit("segv_handler<mprotect>");

Also, as I said before, this is on a dual-processor Itanium box.  Currently
we are running two processes, both on the same box, that are cooperating to
do this DVSM operation.

Any ideas?

Jay

-----Original Message-----
From: Boehm, Hans [mailto:hans_boehm@hp.com]
Sent: Thursday, December 06, 2001 6:20 PM
To: 'Hoeflinger, Jay P'
Cc: 'n0ano@indstorage.com'; MOSBERGER, DAVID (HP-PaloAlto,unix3)
Subject: RE: [Linux-ia64] mprotect problem


I assume you checked that this is not a page size or alignment issue?

My garbage collector does something very similar when running in incremental
mode.  The standard test consistently passes on Itanium.  There are some
differences:  It only write protects pages.  It immediately unprotects the
page, and does little else, in the signal handler.  And of course the timing
is all different.  It's running in native IA64 mode.

Hans


* Re: [Linux-ia64] mprotect problem
From: n0ano @ 2001-12-07 15:13 UTC
  To: linux-ia64

Jay-

Well that's a relief (for me anyway :-)

To my knowledge there is no change in any of these calls for IA64,
they should work the same way, with the same arguments, as they do
on IA32.

Maybe David has an idea?

On Fri, Dec 07, 2001 at 06:27:27AM -0800, Hoeflinger, Jay P wrote:
> In fact, this is an IA64 program.  Both the runtime library and the
> "user" code are compiled on IA64.  
> 
> Is there anything unusual about the various virtual memory calls on Itanium 
> (mmap, getpagesize, mprotect, shmget, shmat, etc) as compared to
> for IA32 (maybe there are extra args, or the flags don't mean
> exactly the same things, or there are new flags we have to use)?
> 
> Can you give us anything to try to help zero in on the problem?
> 
> 
> Jay

-- 
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
n0ano@indstorage.com
Ph: 303/652-0870x117


* RE: [Linux-ia64] mprotect problem
From: Hoeflinger, Jay P @ 2001-12-07 15:18 UTC
  To: linux-ia64

Is there a clue in the fact that it fails for both RedHat 7.1 and 
Turbolinux?

Jay



* RE: [Linux-ia64] mprotect problem
From: David Mosberger @ 2001-12-07 16:10 UTC
  To: linux-ia64

>>>>> On Fri, 7 Dec 2001 06:53:14 -0800 , "Hoeflinger, Jay P" <jay.p.hoeflinger@intel.com> said:

  Jay> No, we haven't checked that this is not a page size or
  Jay> alignment issue.  The page size we get from getpagesize().

Can you find out *why* the later page faults occur?  Printing the ISR
should help.  The page fault handler does not (yet) set up si_isr in
the siginfo, but you could just hack
arch/ia64/mm/fault.c:ia64_do_page_fault() to print the ISR when you
detect the problematic case.
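
The ISR is only visible from inside the kernel, but a user-space handler can still record part of the picture. The sketch below is an assumption-laden complement, not a substitute for the kernel hack: it assumes Linux, a handler installed with SA_SIGINFO, and that the kernel fills in si_addr and si_code (SEGV_MAPERR for an unmapped address, SEGV_ACCERR for a protection violation). All names are hypothetical.

```c
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on some toolchains */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *volatile last_fault_addr;         /* si_addr of the last fault */
static volatile sig_atomic_t last_fault_code;  /* si_code of the last fault */
static volatile sig_atomic_t diag_faults;
static size_t diag_page_size;

/* Record what the kernel reported, then unprotect the page so the
 * restarted access can succeed. */
static void diag_handler(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    last_fault_addr = info->si_addr;
    last_fault_code = info->si_code;

    uintptr_t page = (uintptr_t)info->si_addr
                     & ~((uintptr_t)diag_page_size - 1);
    if (++diag_faults > 3 ||
        mprotect((void *)page, diag_page_size, PROT_READ) != 0)
        _exit(1);
}

/* Faults once on a PROT_NONE page and reports what the handler saw.
 * Returns 0 on success, -1 on setup failure. */
int probe_fault(void **addr_out, int *code_out, void **expected_out)
{
    struct sigaction sa, old;
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return -1;
    diag_page_size = (size_t)ps;
    assert((diag_page_size & (diag_page_size - 1)) == 0);
    diag_faults = 0;

    void *m = mmap(NULL, diag_page_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = diag_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(m, diag_page_size);
        return -1;
    }

    volatile char *p = m;
    char c = p[8];      /* faults; the handler records the details */
    (void)c;

    sigaction(SIGSEGV, &old, NULL);
    *addr_out = last_fault_addr;
    *code_out = (int)last_fault_code;
    *expected_out = (char *)m + 8;
    munmap(m, diag_page_size);
    return 0;
}
```

Comparing the reported address with the address later passed to mprotect shows whether the handler is unprotecting the page that actually faulted.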

If you have a (small) test program that replicates the problem, I'd be
happy to look into it early next week.

	--david


* RE: [Linux-ia64] mprotect problem
From: Hoeflinger, Jay P @ 2001-12-07 16:23 UTC
  To: linux-ia64

I don't know how to do this.  I'm not a kernel hacker and I've
never rebuilt the kernel.

Is there any way I can view this with a debugger or anything else?

Jay



* RE: [Linux-ia64] mprotect problem
From: David Mosberger @ 2001-12-07 17:34 UTC
  To: linux-ia64

Jay,

If you don't have a minimal test program or can't share the source
code, it might be useful to collect a syscall trace.  That way, we
could see what addresses and sizes are involved in the mprotect()
call.  Something like:

	strace -o /tmp/out PROGNAME

should do (I'm assuming your program is not multithreaded; if it is,
you'd need to use the -f option and make sure you're running the
latest version of strace).

If the resulting output is very big, you won't be able to send it to
the mailing list.  You can either trim the output or just make sure
you mail it to me directly (davidm@hpl.hp.com).

Thanks,

	--david



* RE: [Linux-ia64] mprotect problem
From: Hoeflinger, Jay P @ 2001-12-07 19:47 UTC
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 1459 bytes --]

OK, here's a trace.  The program is multi-threaded, so I used
strace -f.  Fortunately the problem happens almost right away.
At the very end, you can see the infinite loop starting.
One thing I noticed is that our printf's in the segv handler are printing
the fault address as 20004000 while the mprotect is giving the address as
6000000020004000.  This last address seems to be correct in relation to
the original mprotect for the shared heap, which was

mprotect(0x6000000020000000, 268435456, PROT_NONE) = 0
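
The two addresses above differ only in their upper 32 bits. One way such a mismatch can arise (an assumption for illustration; this message does not establish the cause) is a 64-bit address passing through a 32-bit type, or being printed with a 32-bit format such as `%x`. The helper names below are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* What a 64-bit address looks like after passing through a 32-bit type. */
uint32_t truncate_to_32(uint64_t addr)
{
    return (uint32_t)addr;      /* the upper 32 bits are discarded */
}

/* Formats the full 64-bit value in hex; returns the number of
 * characters that snprintf would write. */
int format_full(char *buf, size_t n, uint64_t addr)
{
    assert(buf != NULL);
    return snprintf(buf, n, "%" PRIx64, addr);
}
```

Applied to the values in this message, truncating 0x6000000020004000 yields 0x20004000, which matches what the handler's printf shows. Whether the truncation is confined to the printf or also affects the fault address the handler works with cannot be told from the output alone.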

Please let me know if this tells you anything interesting.

Jay


[-- Attachment #2: strace-f.txt --]
[-- Type: text/plain, Size: 17512 bytes --]

28563 execve("./stress.udp", ["stress.udp", "--", "-n2", "-N1", "-hodie", "-hodie", "-p5001", "-i1"], [/* 49 vars */]) = 0
28563 uname({sys="Linux", node="odie", ...}) = 0
28563 brk(0)                            = 0x60000000016efd90
28563 mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000030000
28563 open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
28563 open("/opt/intel/compiler60/ia32/lib/libpthread.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
28563 SYS_1210(0x80000fffffffa5d0, 0x80000fffffffa920, 0x4000000000000ec8, 0x200000000000ecc0, 0xc000000000001125, 0x80000fffffffa5c0, 0x80000fffffffa5d0, 0) = -1 ENOENT (No such file or directory)
28563 open("/opt/intel/compiler60/ia64/lib/libpthread.so.0", O_RDONLY) = -1 ENOENT (No such file or directory)
28563 SYS_1210(0x80000fffffffa5d0, 0x80000fffffffa920, 0x4000000000000ec8, 0x200000000000ecc0, 0xc000000000001125, 0x80000fffffffa5c0, 0x80000fffffffa5d0, 0) = 0
28563 open("/etc/ld.so.cache", O_RDONLY) = 3
28563 SYS_1212(0x3, 0x80000fffffffa920, 0x200000000003db30, 0x200000000000f810, 0xc000000000000898, 0x80000fffffffa9a0, 0x169bb, 0x1) = 0
28563 mmap(NULL, 83870, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2000000000040000
28563 close(3)                          = 0
28563 open("/lib/libpthread.so.0", O_RDONLY) = 3
28563 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\0z\0\0"..., 1024) = 1024
28563 SYS_1212(0x3, 0x80000fffffffa920, 0x80000fffffffa920, 0x4000000000000ec8, 0x200000000000ecc0, 0xc000000000001125, 0x80000fffffffa5c0, 0x80000fffffffa5d0) = 0
28563 mmap(NULL, 223544, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2000000000058000
28563 mprotect(0x2000000000074000, 108856, PROT_NONE) = 0
28563 mmap(0x2000000000078000, 98304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x10000) = 0x2000000000078000
28563 close(3)                          = 0
28563 open("/opt/intel/compiler60/ia64/lib/libc.so.6.1", O_RDONLY) = -1 ENOENT (No such file or directory)
28563 open("/lib/libc.so.6.1", O_RDONLY) = 3
28563 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0\220\210"..., 1024) = 1024
28563 SYS_1212(0x3, 0x80000fffffffa8f0, 0x80000fffffffa5b0, 0x20000000000305b0, 0x200000000000ecc0, 0xc000000000001125, 0x80000fffffffa590, 0x80000fffffffa5a0) = 0
28563 mmap(NULL, 2422816, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x2000000000090000
28563 mprotect(0x20000000002c0000, 129056, PROT_NONE) = 0
28563 mmap(0x20000000002c0000, 114688, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x220000) = 0x20000000002c0000
28563 mmap(0x20000000002dc000, 14368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000002dc000
28563 close(3)                          = 0
28563 mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000034000
28563 munmap(0x2000000000040000, 83870) = 0
28563 getpid()                          = 28563
28563 rt_sigaction(SIGRT0, {0x2000000000035230, [], 0}, NULL, 8) = 0
28563 rt_sigaction(SIGRT1, {0x2000000000035248, [], 0}, NULL, 8) = 0
28563 rt_sigaction(SIGRT2, {0x2000000000035260, [], 0}, NULL, 8) = 0
28563 rt_sigprocmask(SIG_BLOCK, [RT0], NULL, 8) = 0
28563 _sysctl(0x80000fffffffb3c0)       = 0
28563 getpid()                          = 28563
28563 gettimeofday({1007753574, 244290}, NULL) = 0
28563 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
28563 ioctl(3, SIOCGIFCONF, 0x80000fffffffa620) = 0
28563 close(3)                          = 0
28563 brk(0)                            = 0x60000000016efd90
28563 brk(0x60000000016efdf0)           = 0x60000000016efdf0
28563 brk(0x60000000016f0000)           = 0x60000000016f0000
28563 brk(0x60000000016f4000)           = 0x60000000016f4000
28563 gettimeofday({1007753574, 245618}, NULL) = 0
28563 getpid()                          = 28563
28563 open("/etc/resolv.conf", O_RDONLY) = 3
28563 SYS_1212(0x3, 0x80000fffffff7ec0, 0, 0x1f7, 0x80000fffffffa5b0, 0x20000000000305b0, 0x200000000000ecc0, 0xc000000000001125) = 0
28563 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000040000
28563 read(3, "nameserver 172.31.250.101\nnamese"..., 16384) = 52
28563 read(3, "", 16384)                = 0
28563 close(3)                          = 0
28563 munmap(0x2000000000040000, 65536) = 0
28563 uname({sys="Linux", node="odie", ...}) = 0
28563 socket(PF_UNIX, SOCK_STREAM, 0)   = 3
28563 connect(3, {sin_family=AF_UNIX, path="                                                                                       /var/run/.nscd_socket"}, 110) = -1 ENOENT (No such file or directory)
28563 close(3)                          = 0
28563 open("/etc/nsswitch.conf", O_RDONLY) = 3
28563 SYS_1212(0x3, 0x80000fffffffa0e0, 0x60000000016f0200, 0x60000000016f01f8, 0x200000000017b830, 0xc000000000000205, 0x60000000016f0260, 0x2000000000040000) = 0
28563 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000040000
28563 read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 16384) = 1782
28563 read(3, "", 16384)                = 0
28563 close(3)                          = 0
28563 munmap(0x2000000000040000, 65536) = 0
28563 open("/opt/intel/compiler60/ia64/lib/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
28563 open("/etc/ld.so.cache", O_RDONLY) = 3
28563 SYS_1212(0x3, 0x80000ffffffee030, 0x200000000003db30, 0x200000000000f810, 0xc000000000000898, 0x80000ffffffee0b0, 0x26abb, 0x80000ffffffedce0) = 0
28563 mmap(NULL, 83870, PROT_READ, MAP_PRIVATE, 3, 0) = 0x2000000000040000
28563 close(3)                          = 0
28563 open("/lib/libnss_files.so.2", O_RDONLY) = 3
28563 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0002\0\1\0\0\0 <\0\0"..., 1024) = 1024
28563 SYS_1212(0x3, 0x80000ffffffee030, 0x80000ffffffedd08, 0x80000ffffffeeb68, 0x200000000000ecc0, 0xc000000000001125, 0x80000ffffffedcd0, 0x80000ffffffedce0) = 0
28563 mmap(NULL, 152288, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x20000000002e0000
28563 mprotect(0x20000000002f8000, 53984, PROT_NONE) = 0
28563 mmap(0x2000000000300000, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x10000) = 0x2000000000300000
28563 close(3)                          = 0
28563 munmap(0x2000000000040000, 83870) = 0
28563 open("/etc/host.conf", O_RDONLY)  = 3
28563 SYS_1212(0x3, 0x80000fffffffa010, 0x2000000000290520, 0xc000000000000207, 0x80000ffffffee810, 0, 0x200000000003da40, 0x200000000028e6b0) = 0
28563 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000040000
28563 read(3, "order hosts,bind\n", 16384) = 17
28563 read(3, "", 16384)                = 0
28563 close(3)                          = 0
28563 munmap(0x2000000000040000, 65536) = 0
28563 open("/etc/hosts", O_RDONLY)      = 3
28563 fcntl(3, F_GETFD)                 = 0
28563 fcntl(3, F_SETFD, FD_CLOEXEC)     = 0
28563 SYS_1212(0x3, 0x80000fffffff9e20, 0x80000ffffffee028, 0x20000000003052e0, 0x80000ffffffedf08, 0x252e0, 0x80000ffffffee290, 0x80000ffffffedee0) = 0
28563 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000040000
28563 read(3, "################################"..., 16384) = 14421
28563 close(3)                          = 0
28563 munmap(0x2000000000040000, 65536) = 0
28563 write(2, "Tmk_startup: SWAPDIR=(null), put"..., 58) = 58
28563 getpid()                          = 28563
28563 open("/tmp/Tmk_swap.28563", O_RDWR|O_CREAT|O_TRUNC, 0600) = 3
28563 unlink("/tmp/Tmk_swap.28563")     = 0
28563 lseek(3, 268435455, SEEK_END)     = 268435455
28563 write(3, "\0", 1)                 = 1
28563 mmap(0x6000000020000000, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x6000000020000000
28563 mmap(0x6000000040000000, 268435456, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x6000000040000000
28563 close(3)                          = 0
28563 mprotect(0x6000000020000000, 268435456, PROT_NONE) = 0
28563 mmap(NULL, 278528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000308000
28563 mmap(NULL, 1064960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x200000000034c000
28563 mmap(NULL, 1064960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000450000
28563 rt_sigaction(SIGSEGV, {0x20000000000352a8, [], SA_SIGINFO}, NULL, 8) = 0
28563 rt_sigaction(SIGBUS, {0x20000000000352a8, [], SA_SIGINFO}, NULL, 8) = 0
28563 brk(0x60000000016f8000)           = 0x60000000016f8000
28563 brk(0x60000000016fc000)           = 0x60000000016fc000
28563 brk(0x6000000001700000)           = 0x6000000001700000
28563 brk(0x6000000001704000)           = 0x6000000001704000
28563 brk(0x6000000001708000)           = 0x6000000001708000
28563 brk(0x600000000170c000)           = 0x600000000170c000
28563 brk(0x6000000001710000)           = 0x6000000001710000
28563 brk(0x6000000001714000)           = 0x6000000001714000
28563 brk(0x6000000001718000)           = 0x6000000001718000
28563 brk(0x600000000171c000)           = 0x600000000171c000
28563 brk(0x6000000001720000)           = 0x6000000001720000
28563 brk(0x6000000001724000)           = 0x6000000001724000
28563 brk(0x6000000001728000)           = 0x6000000001728000
28563 brk(0x600000000172c000)           = 0x600000000172c000
28563 brk(0x6000000001730000)           = 0x6000000001730000
28563 brk(0x6000000001734000)           = 0x6000000001734000
28563 rt_sigaction(SIGALRM, {0x20000000000352c0, [], 0}, NULL, 8) = 0
28563 rt_sigprocmask(SIG_BLOCK, [ALRM], NULL, 8) = 0
28563 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
28563 getsockopt(3, SOL_SOCKET, SO_RCVBUF, [65535], [4]) = 0
28563 getsockopt(3, SOL_SOCKET, SO_SNDBUF, [65535], [4]) = 0
28563 bind(3, {sin_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}}, 16) = 0
28563 getsockname(3, {sin_family=AF_INET, sin_port=htons(32811), sin_addr=inet_addr("0.0.0.0")}}, [16]) = 0
28563 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
28563 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
28563 getsockopt(4, SOL_SOCKET, SO_RCVBUF, [65535], [4]) = 0
28563 getsockopt(4, SOL_SOCKET, SO_SNDBUF, [65535], [4]) = 0
28563 bind(4, {sin_family=AF_INET, sin_port=htons(5001), sin_addr=inet_addr("0.0.0.0")}}, 16) = 0
28563 recvmsg(4, {msg_name(16)={sin_family=AF_INET, sin_port=htons(32810), sin_addr=inet_addr("192.168.83.151")}}, msg_iov(2)=[{" \0\0\0\0\4\0\0", 8}, {")\200\0\0", 4}], msg_controllen=0, msg_flags=0}, 0) = 12
28563 connect(4, {sin_family=AF_INET, sin_port=htons(32810), sin_addr=inet_addr("192.168.83.151")}}, 16) = 0
28563 send(4, " \0\0\0", 4, 0)          = 4
28563 close(4)                          = 0
28563 socket(PF_UNIX, SOCK_STREAM, 0)   = 4
28563 connect(4, {sin_family=AF_UNIX, path="                                                                                       /var/run/.nscd_socket"}, 110) = -1 ENOENT (No such file or directory)
28563 close(4)                          = 0
28563 open("/etc/hosts", O_RDONLY)      = 4
28563 fcntl(4, F_GETFD)                 = 0
28563 fcntl(4, F_SETFD, FD_CLOEXEC)     = 0
28563 SYS_1212(0x4, 0x80000fffffffa0f0, 0x80000ffffffee0c8, 0x3, 0x200000000002a700, 0x200000000003dbf8, 0x80000ffffffee030, 0x80000ffffffedfd0) = 0
28563 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000040000
28563 read(4, "################################"..., 16384) = 14421
28563 close(4)                          = 0
28563 munmap(0x2000000000040000, 65536) = 0
28563 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
28563 getsockopt(4, SOL_SOCKET, SO_RCVBUF, [65535], [4]) = 0
28563 getsockopt(4, SOL_SOCKET, SO_SNDBUF, [65535], [4]) = 0
28563 bind(4, {sin_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}}, 16) = 0
28563 connect(4, {sin_family=AF_INET, sin_port=htons(32809), sin_addr=inet_addr("192.168.83.151")}}, 16) = 0
28563 sendmsg(4, {msg_name(0)=NULL, msg_iov(3)=[{"!\0\0\0\10\4\342\345", 8}, {"+\200\0\0", 4}, {"", 0}], msg_controllen=0, msg_flags=0}, 0) = 12
28563 setitimer(ITIMER_REAL, {it_interval={1, 0}, it_value={1, 0}}, NULL) = 0
28563 rt_sigprocmask(SIG_UNBLOCK, [ALRM], [ALRM RT0], 8) = 0
28563 recv(4, "!\0\0\0", 4, 0)          = 4
28563 rt_sigprocmask(SIG_SETMASK, [ALRM RT0], NULL, 8) = 0
28563 recvmsg(3, {msg_name(16)={sin_family=AF_INET, sin_port=htons(32813), sin_addr=inet_addr("192.168.83.151")}}, msg_iov(3)=[{"@\0\0\0\0\4\342\345", 8}, {")\200\0\0", 4}, {"", 0}], msg_controllen=0, msg_flags=0}, 0) = 12
28563 connect(3, {sin_family=AF_INET, sin_port=htons(32813), sin_addr=inet_addr("192.168.83.151")}}, 16) = 0
28563 send(3, "@\0\0\0", 4, 0)          = 4
28563 getrlimit(RLIMIT_STACK, {rlim_cur=60000*1024, rlim_max=RLIM_INFINITY}) = 0
28563 setrlimit(RLIMIT_STACK, {rlim_cur=1008*1024, rlim_max=RLIM_INFINITY}) = 0
28563 brk(0x6000000001740000)           = 0x6000000001740000
28563 pipe([653280, 536870912])         = 5
28563 SYS_1213(0x2000000000035140, 0x6000000001733320, 0x7fe0, 0xf00, 0x5, 0x2000000000035140, 0x5, 0xf00) = 28566
28563 write(6, "\310`n\1\0\0\0`\5\0\0\0\0\0\0\0\0\0\0\0\0\0\200\210\7\0"..., 168) = 168
28563 rt_sigprocmask(SIG_SETMASK, NULL, [ALRM RT0], 8) = 0
28563 write(6, "\0\220\10\0\0\0\0 \0\0\0\0\0\0\0\0\300\245\377\377\377"..., 168) = 168
28563 rt_sigprocmask(SIG_SETMASK, NULL, [ALRM RT0], 8) = 0
28563 rt_sigsuspend([ALRM] <unfinished ...>
28563 --- SIGRT0 (Real-time signal 0) ---
28563 <... rt_sigsuspend resumed> )     = -1 EINTR (Interrupted system call)
28563 rt_sigreturn()                    = ? (mask now [HUP QUIT USR1 USR2 PIPE TERM TTOU XCPU XFSZ PROF RT2 RT4 RT5 RT7 RT14 RT18 RT22 RT23 RT24 RT25 RT27 RT30 RT31])
28563 SYS_1212(0x1, 0x80000fffffffb030, 0x20000000002e6fb0, 0xc000000000000a98, 0x6000000001733320, 0x60000000016efe00, 0x3df, 0xa) = 0
28563 mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2000000000040000
28563 ioctl(1, TCGETS, {B38400 opost isig icanon -echo ...}) = 0
28563 write(1, "Phase 0: distribute\n", 20) = 20
28563 write(1, "Phase 1: Locks and barriers\n", 28) = 28
28563 sendmsg(4, {msg_name(0)=NULL, msg_iov(5)=[{"A\0\0\0\10\0\0\0", 8}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 64}, {"\0\0\0\0\0\0\0\0", 8}, {NULL, 0}, {"\0\0", 2}], msg_controllen=0, msg_flags=0}, 0) = 82
28563 setitimer(ITIMER_REAL, {it_interval={1, 0}, it_value={1, 0}}, NULL) = 0
28563 rt_sigprocmask(SIG_UNBLOCK, [ALRM], [ALRM RT0], 8) = 0
28563 recvmsg(4, {msg_name(0)=NULL, msg_iov(3)=[{"A\0\0\0\0\0\0\0", 8}, {NULL, 0}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768}], msg_controllen=0, msg_flags=0}, 0) = 10
28563 rt_sigprocmask(SIG_SETMASK, [ALRM RT0], NULL, 8) = 0
28563 getpid()                          = 28563
28563 write(1, "Trying to acquire a lock:0\n", 27) = 27
28563 sendmsg(4, {msg_name(0)=NULL, msg_iov(3)=[{"a\0\0\0\10\16\0\0", 8}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 64}, {"\0\0", 2}], msg_controllen=0, msg_flags=0}, 0) = 74
28563 setitimer(ITIMER_REAL, {it_interval={1, 0}, it_value={1, 0}}, NULL) = 0
28563 rt_sigprocmask(SIG_UNBLOCK, [ALRM], [ALRM RT0], 8) = 0
28563 select(5, [4], NULL, NULL, NULL)  = 1 (in [4])
28563 recv(4, "a\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 32768, 0) = 136
28563 rt_sigprocmask(SIG_SETMASK, [ALRM RT0], NULL, 8) = 0
28563 write(1, "Accessing widgets index:0\n", 26) = 26
28563 write(1, "Address:20004000\n", 17) = 17
28563 --- SIGSEGV (Segmentation fault) ---
28563 write(1, "segv handler\n", 13)    = 13
28563 write(1, "Virt addr:20004000 1hello1\n", 27) = 27
28563 write(1, "addr:20004000\n", 14)   = 14
28563 write(1, "hello4\n", 7)           = 7
28563 send(4, "\201\0\0\0\10\20\1\0", 8, 0) = 8
28563 setitimer(ITIMER_REAL, {it_interval={1, 0}, it_value={1, 0}}, NULL) = 0
28563 rt_sigprocmask(SIG_UNBLOCK, [ALRM], [SEGV ALRM RT0], 8) = 0
28563 recvmsg(4, {msg_name(0)=NULL, msg_iov(2)=[{"\201\0\0\0", 4}, {"\0\0\0\0\0\0\360?\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 16388
28563 rt_sigprocmask(SIG_SETMASK, [SEGV ALRM RT0], NULL, 8) = 0
28563 write(1, "hello7\n", 7)           = 7
28563 write(1, "hello6\n", 7)           = 7
28563 write(1, "hello8\n", 7)           = 7
28563 write(1, "hello10\n", 8)          = 8
28563 write(1, "hello13\n", 8)          = 8
28563 write(1, "hello12\n", 8)          = 8
28563 write(1, "hello11\n", 8)          = 8
28563 mprotect(0x6000000020004000, 16384, PROT_READ) = 0
28563 gettimeofday({1007753574, 334744}, NULL) = 0
28563 rt_sigprocmask(SIG_BLOCK, NULL, [SEGV ALRM RT0], 8) = 0
28563 rt_sigprocmask(SIG_UNBLOCK, [RT0], [SEGV ALRM RT0], 8) = 0
28563 gettimeofday({1007753574, 336194}, NULL) = 0
28563 nanosleep({0, 8550000}, NULL)     = 0
28563 rt_sigprocmask(SIG_SETMASK, [SEGV ALRM RT0], NULL, 8) = 0
28563 write(1, "hello9\n", 7)           = 7
28563 write(1, "exiting segv handler\n", 21) = 21
28563 rt_sigreturn()                    = ? (mask now [ALRM RT0])
28563 --- SIGSEGV (Segmentation fault) ---
28563 write(1, "segv handler\n", 13)    = 13
28563 write(1, "Virt addr:20004000 1hello1\n", 27) = 27
28563 write(1, "addr:20004000\n", 14)   = 14
28563 write(1, "hello2\n", 7)           = 7
28563 write(1, "hello9\n", 7)           = 7
28563 write(1, "exiting segv handler\n", 21) = 21
28563 rt_sigreturn()                    = ? (mask now [ALRM RT0])
28563 --- SIGSEGV (Segmentation fault) ---


* RE: [Linux-ia64] mprotect problem
  2001-12-06 23:56 [Linux-ia64] mprotect problem Hoeflinger, Jay P
                   ` (7 preceding siblings ...)
  2001-12-07 19:47 ` Hoeflinger, Jay P
@ 2001-12-07 20:13 ` Boehm, Hans
  2001-12-07 20:27 ` David Mosberger
  9 siblings, 0 replies; 11+ messages in thread
From: Boehm, Hans @ 2001-12-07 20:13 UTC (permalink / raw)
  To: linux-ia64

What's the instruction that's faulting and what are the contents of the
register used for addressing?  You should easily be able to get that out of
gdb.

I assume the printf in the handler is using an incorrect (32-bit int) format
to print the address.  It should be using something like %p.  That's
probably irrelevant to the real problem.
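
As a small illustration of the truncation, the sketch below formats the
address from the trace two ways.  The helper names format_truncated and
format_full are made up for this example, and it assumes unsigned int is
32 bits wide, as it is on IA-64 Linux.  For an actual pointer, passing
(void *)ptr to %p is the portable choice; its exact output form is
implementation-defined.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats addr the way a "%x" conversion with an
 * int-sized argument would, so only the low 32 bits survive
 * (assuming a 32-bit unsigned int). */
int format_truncated(uint64_t addr, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%x", (unsigned int)addr);
}

/* Hypothetical helper: formats all 64 bits of addr in hex. */
int format_full(uint64_t addr, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%" PRIx64, addr);
}

/* Prints a pointer with %p, which takes a void * argument. */
void print_pointer(const void *ptr)
{
    printf("%p\n", (void *)ptr);
}
```

With addr = 0x6000000020004000, format_truncated produces "20004000",
matching what the handler printed in the trace, while format_full produces
"6000000020004000", matching the address passed to mprotect().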

Hans

> -----Original Message-----
> From: Hoeflinger, Jay P [mailto:jay.p.hoeflinger@intel.com]
> Sent: Friday, December 07, 2001 11:47 AM
> To: 'davidm@hpl.hp.com'; Hoeflinger, Jay P
> Cc: 'Boehm, Hans'; 'n0ano@indstorage.com'; 'linux-ia64@linuxia64.org'
> Subject: RE: [Linux-ia64] mprotect problem
> 
> 
> OK, here's a trace.  The program is multi-threaded, so I used
> strace -f.  Fortunately the problem happens almost right away.
> At the very end, you can see the infinite loop starting.
> One thing I noticed is that our printf's in the segv handler
> are printing the fault
> address as 20004000 while the mprotect is giving the address
> as 6000000020004000.  This last address seems to be correct in 
> relation to the original mprotect for the shared heap, which was
> 
> mprotect(0x6000000020000000, 268435456, PROT_NONE) = 0
> 
> Please let me know if this tells you anything interesting.
> 
> Jay
> 
> -----Original Message-----
> From: David Mosberger [mailto:davidm@hpl.hp.com]
> Sent: Friday, December 07, 2001 11:35 AM
> To: Hoeflinger, Jay P
> Cc: 'Boehm, Hans'; 'n0ano@indstorage.com'; 'linux-ia64@linuxia64.org'
> Subject: RE: [Linux-ia64] mprotect problem
> 
> 
> Jay,
> 
> If you don't have a minimal test program or can't share the source
> code, it might be useful to collect a syscall trace.  That way, we
> could see what addresses and sizes are involved in the mprotect()
> call.  Something like:
> 
> 	strace -o /tmp/out PROGNAME
> 
> should do (I'm assuming your program is not multithreaded; if it is,
> you'd need to use the -f option and make sure you're running the
> latest version of strace).
> 
> If the resulting output is very big, you won't be able to send it to
> the mailing list.  You can either trim the output or just make sure
> you mail it to me directly (davidm@hpl.hp.com).
> 
> Thanks,
> 
> 	--david
> 
> 



* RE: [Linux-ia64] mprotect problem
  2001-12-06 23:56 [Linux-ia64] mprotect problem Hoeflinger, Jay P
                   ` (8 preceding siblings ...)
  2001-12-07 20:13 ` Boehm, Hans
@ 2001-12-07 20:27 ` David Mosberger
  9 siblings, 0 replies; 11+ messages in thread
From: David Mosberger @ 2001-12-07 20:27 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 7 Dec 2001 12:13:26 -0800, "Boehm, Hans" <hans_boehm@hp.com> said:

  Hans> What's the instruction that's faulting and what are the
  Hans> contents of the register used for addressing?  You should
  Hans> easily be able to get that out of gdb.

Also, what's the address of the faulting instruction?  I assume it's
in normal (not shared) memory, right?

	--david



Thread overview: 11+ messages
2001-12-06 23:56 [Linux-ia64] mprotect problem Hoeflinger, Jay P
2001-12-07  0:11 ` n0ano
2001-12-07 14:53 ` Hoeflinger, Jay P
2001-12-07 15:13 ` n0ano
2001-12-07 15:18 ` Hoeflinger, Jay P
2001-12-07 16:10 ` David Mosberger
2001-12-07 16:23 ` Hoeflinger, Jay P
2001-12-07 17:34 ` David Mosberger
2001-12-07 19:47 ` Hoeflinger, Jay P
2001-12-07 20:13 ` Boehm, Hans
2001-12-07 20:27 ` David Mosberger
