* Segfault in __c_f_f_c during strace of nptl application.
@ 2009-06-21  6:27 Carlos O'Donell
  2009-06-21 15:20 ` John David Anglin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-06-21  6:27 UTC (permalink / raw)
  To: John David Anglin, linux-parisc

Dave,

I saw strace segfault in __canonicalize_funcptr_for_compare while
trying to trace an nptl enabled hppa application.

Program received signal SIGSEGV, Segmentation fault.
0x0002b3bc in __canonicalize_funcptr_for_compare ()
Current language:  auto; currently asm
(gdb) bt
#0  0x0002b3bc in __canonicalize_funcptr_for_compare ()
#1  0x00025fec in sys_rt_sigaction (tcp=0x4e070) at signal.c:1886
#2  0x00017aec in trace_syscall (tcp=0x4e070) at syscall.c:2549
#3  0x00016c98 in main (argc=<value optimized out>, argv=0xc032f01c)
at strace.c:2475
(gdb)

It's 100% reproducible. What should I try to debug this?

Cheers,
Carlos.


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21  6:27 Segfault in __c_f_f_c during strace of nptl application Carlos O'Donell
@ 2009-06-21 15:20 ` John David Anglin
  2009-06-21 16:18   ` Carlos O'Donell
                     ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: John David Anglin @ 2009-06-21 15:20 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> I saw strace segfault in __canonicalize_funcptr_for_compare while
> trying to trace an nptl enabled hppa application.
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x0002b3bc in __canonicalize_funcptr_for_compare ()
> Current language:  auto; currently asm
> (gdb) bt
> #0  0x0002b3bc in __canonicalize_funcptr_for_compare ()
> #1  0x00025fec in sys_rt_sigaction (tcp=0x4e070) at signal.c:1886
> #2  0x00017aec in trace_syscall (tcp=0x4e070) at syscall.c:2549
> #3  0x00016c98 in main (argc=<value optimized out>, argv=0xc032f01c)
> at strace.c:2475
> (gdb)
> 
> It's 100% reproducible. What should I try to debug this?

Look at the comparison in sys_rt_sigaction.  The last time this
happened, this involved a comparison with a special value that
wasn't a function pointer.  I think you are seeing the same problem
as Kyle.

I believe that the segv can be avoided by casting the values in the
comparison:

> > > which is:
> > > 
> > >                 if (sa.__sigaction_handler.__sa_handler == SIG_ERR)
> > >                         tprintf("{SIG_ERR, ");
> > 
> > Is __canonicalize_funcptr_for_compare choking on SIG_ERR?  This is
> > a special value (-1).  The plabel bit is set.

However, this is a cop-out.  __canonicalize_funcptr_for_compare
actually faults on sa.__sigaction_handler.__sa_handler.

You can put a break on __canonicalize_funcptr_for_compare and look at
what's being passed in.

Looking at my email archive, I see the real cause involves kernel memory
maps:

  > > > On Wed, May 06, 2009 at 01:39:49PM -0400, John David Anglin wrote:
  > > > > > The tombstone is:
  > > > > > 
  > > > > > do_page_fault() pid=10205 command='strace' type=15 address=0x407d2f18
  > > > > > vm_start = 0x4068d000, vm_end = 0x4068f000
  > > > > 
  > > > > So, the pointer passed to __canonicalize_funcptr_for_compare is outside
  > > > > the vm range.
  > > > > 
  > > > > Maybe "info sharedlib" will show something.  Need to find out why the
  > > > > address of the function descriptor is outside the vm range.
  > > > > 
  > > > > > > 405c0000-405c2000 rwxp 405c0000 00:00 0 
  > > 
  > > The function pointer address is also outside this range.
  > > 
  > 
  > Sorry, this was with a rebuilt binary, and it lies within this range.
  
  It's marked rwxp, so why the fault?

We never figured out why the fault actually occurred (Kyle got busy).
It seems like there is a problem with the address mapping during signals.
However, there were some rebuilds in the above and I'm not sure the
analysis is correct.  However, I'm sure the problem isn't with
__canonicalize_funcptr_for_compare.

So, the quick fix to get strace going is to rebuild casting the function
pointers to long.  However, I think you will find that it has other problems.
You might have more success with the old version that Kyle patched a
year or so ago (posted in debian people).  Randolph was working on a program
called atrace.  I tried it but didn't have much luck with it.

PS: How's NPTL coming?

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 15:20 ` John David Anglin
@ 2009-06-21 16:18   ` Carlos O'Donell
  2009-06-21 17:16     ` John David Anglin
  2009-06-22  1:46   ` Randolph Chung
  2009-09-09 14:39   ` Carlos O'Donell
  2 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-06-21 16:18 UTC (permalink / raw)
  To: John David Anglin, John David Anglin, linux-parisc

On Sun, Jun 21, 2009 at 11:20 AM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
> We never figured out why the fault actually occurred (Kyle got busy).
> It seems like there is a problem with the address mapping during signals.
> However, there were some rebuilds in the above and I'm not sure the
> analysis is correct.  However, I'm sure the problem isn't with
> __canonicalize_funcptr_for_compare.

Occam's razor. It's a gcc bug.

The arg0 to __c_f_f_c is being clobbered by the previous call.
Rearranging the if-then-else cases into a set of if cases fixes the
clobbering of arg0 and fixes strace.

The move of the fptr into r26 is moved before the call to umove, then
umove clobbers r26, then __c_f_f_c is called and crashes.

> So, the quick fix to get strace going is to rebuild casting the function
> pointers to long.  However, I think you will find that it has other problems.
> You might have more success with the old version that Kyle patched a
> year or so ago (posted in debian people).  Randolph was working on a program
> called atrace.  I tried it but didn't have much luck with it.

The quick fix is to reorganize the if-then-else statement.

> PS: How's NPTL coming?

The implementation is done. The testing shows no regressions. The
custom compat testsuite I wrote also passes every test.

Unfortunately last night I was up late working and I managed to erase
(be careful of bind mounts) both half of my custom compat testsuite
and the chroot I was going to use for more advanced testing.

Today's schedule is to rebuild the chroot and test gnome using
vnc4server with the new libs.

At a later date I'll have to rewrite the missing pieces of the custom
compat testsuite.

Cheers,
Carlos.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 16:18   ` Carlos O'Donell
@ 2009-06-21 17:16     ` John David Anglin
  2009-06-21 20:01       ` Carlos O'Donell
  0 siblings, 1 reply; 21+ messages in thread
From: John David Anglin @ 2009-06-21 17:16 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> On Sun, Jun 21, 2009 at 11:20 AM, John David
> Anglin<dave@hiauly1.hia.nrc.ca> wrote:
> > We never figured out why the fault actually occurred (Kyle got busy).
> > It seems like there is a problem with the address mapping during signals.
> > However, there were some rebuilds in the above and I'm not sure the
> > analysis is correct.  However, I'm sure the problem isn't with
> > __canonicalize_funcptr_for_compare.
>
> Occam's razor. It's a gcc bug.
>
> The arg0 to __c_f_f_c is being clobbered by the previous call.
> Rearranging the if-then-else cases into a set of if cases fixes the
> clobbering of arg0 and fixes strace.
>
> The move of the fptr into r26 is moved before the call to umove, then
> umove clobbers r26, then __c_f_f_c is called and crashes.

This needs a GCC bug report.  It's an important defect.  As usual,
preprocessed source and the compiler version are needed.  The version of
strace in lenny/testing doesn't have any calls to __c_f_f_c.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 17:16     ` John David Anglin
@ 2009-06-21 20:01       ` Carlos O'Donell
  2009-06-21 20:21         ` John David Anglin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-06-21 20:01 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On Sun, Jun 21, 2009 at 1:16 PM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
>> The move of the fptr into r26 is moved before the call to umove, then
>> umove clobbers r26, then __c_f_f_c is called and crashes.
>
> This needs a GCC bug report.  It's an important defect.  As usual,
> preprocessed source and the compiler version are needed.  The version of
> strace in lenny/testing doesn't have any calls to __c_f_f_c.

Sorry, this turns out not to be correct. After tracing the assembly
completely (including loads in delayed branches), it looks like the
location read from the pid through ptrace might be wrong. I'll have to
keep debugging this.

Cheers,
Carlos.

* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 20:01       ` Carlos O'Donell
@ 2009-06-21 20:21         ` John David Anglin
  2009-06-21 21:37           ` Carlos O'Donell
  0 siblings, 1 reply; 21+ messages in thread
From: John David Anglin @ 2009-06-21 20:21 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> Sorry, this turns out not to be correct. After tracing the assembly
> completely (including loads in delayed branches), it looks like the
> location read from the pid through ptrace might be wrong. I'll have to
> keep debugging this.

Delayed branches are a nasty invention.  I am testing a patch to
fix a problem that appeared last week.

When a conditional branch with an unfilled delay slot is followed
by an asm, bad things may end up in the delay slot.  Even worse,
the branch may branch into the delay slot.  GCC has no idea what's
in an asm.  So, I'm trying to add a nop in the delay slot when a
conditional branch is followed by an asm.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 20:21         ` John David Anglin
@ 2009-06-21 21:37           ` Carlos O'Donell
  2009-06-21 22:18             ` John David Anglin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-06-21 21:37 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On Sun, Jun 21, 2009 at 4:21 PM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
>> Sorry, this turns out not to be correct. After tracing the assembly
>> completely (including loads in delayed branches), it looks like the
>> location read from the pid through ptrace might be wrong. I'll have to
>> keep debugging this.
>
> Delayed branches are a nasty invention.  I am testing a patch to
> fix a problem that appeared last week.

Interesting.

> When a conditional branch with an unfilled delay slot is followed
> by an asm, bad things may end up in the delay slot.  Even worse,
> the branch may branch into the delay slot.  GCC has no idea what's
> in an asm.  So, I'm trying to add a nop in the delay slot when a
> conditional branch is followed by an asm.

There is an on-and-off-again bug in glibc's vfprintf.c implementation
that causes DBR to miscompile that file, but it comes and goes
depending on the gcc version. I filed a bug once, but because I
couldn't produce a reduced testcase Pinski closed it. If I see it
again I'll try again to produce a reduced testcase.

How's 4.4 on hppa?

Cheers,
Carlos.

* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 21:37           ` Carlos O'Donell
@ 2009-06-21 22:18             ` John David Anglin
  0 siblings, 0 replies; 21+ messages in thread
From: John David Anglin @ 2009-06-21 22:18 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> There is an on-and-off-again bug in glibc's vfprintf.c implementation
> that causes DBR to miscompile that file, but it comes and goes
> depending on the gcc version. I filed a bug once, but because I
> couldn't produce a reduced testcase Pinski closed it. If I see it
> again I'll try again to produce a reduced testcase.

That one is a reorg bug.  There have been some fixes in that area.  If
you can reproduce with 4.3 or later, I will look at it.  Don't worry
about producing a reduced testcase.  Just attach the entire preprocessed
source (compressed if necessary).  You can reopen the old bug.

In PR40468, the generated assembler code looked like this:

0x000104e4 <f+56>:      cmpb,<>,n ret0,r19,0x104e8 <f+60>
0x000104e8 <f+60>:      b,l 0x104a8 <ff>,rp
0x000104ec <f+64>:      ldi 1,r26

I don't really understand how the hardware handles the above, but
the b,l didn't correctly update rp, and ff returned to f.  I would
have thought the instruction sequence would go:

f+56, f+60 nullified, f+60, f+64, ff, ..., f+68.

> How's 4.4 on hppa?

Probably as good as any other version (a few bugs fixed, a few new
ones).  It has better error checking, but that may be annoying.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 15:20 ` John David Anglin
  2009-06-21 16:18   ` Carlos O'Donell
@ 2009-06-22  1:46   ` Randolph Chung
  2009-06-22 13:35     ` John David Anglin
  2009-09-09 14:39   ` Carlos O'Donell
  2 siblings, 1 reply; 21+ messages in thread
From: Randolph Chung @ 2009-06-22  1:46 UTC (permalink / raw)
  To: John David Anglin; +Cc: Carlos O'Donell, dave.anglin, linux-parisc


> Randolph was working on a program
> called atrace.  I tried it but didn't have much luck with it.
>   

Wow, somebody tried it! :)

Can you tell me more about what is not working?

randolph


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-22  1:46   ` Randolph Chung
@ 2009-06-22 13:35     ` John David Anglin
  0 siblings, 0 replies; 21+ messages in thread
From: John David Anglin @ 2009-06-22 13:35 UTC (permalink / raw)
  To: Randolph Chung; +Cc: carlos, dave.anglin, linux-parisc

> Can you tell me more about what is not working?

At this point, I'm fuzzy as to what happened.  I believe that I was
trying to trace the syscalls by sshd during login attempts to try to
gain information on the uid/gid authentication problem.

One little thing I noticed this morning (bad typing):
dave@hiauly6:~/opt/gnu/bin$ atrace --trace ls
Segmentation fault (core dumped)

Core was generated by `atrace --trace ls'.
Program terminated with signal 11, Segmentation fault.
#0  0x40520d90 in strncmp () from /lib/libc.so.6
(gdb) bt
#0  0x40520d90 in strncmp () from /lib/libc.so.6
#1  0x4055c848 in _getopt_internal_r () from /lib/libc.so.6
#2  0x4055d52c in _getopt_internal () from /lib/libc.so.6
#3  0x4055d644 in getopt_long () from /lib/libc.so.6
#4  0x00011ba8 in option_parse (argc=3, argv=0xfb66201c, opt=0x2dd80,
    proc=0xfb662208) at option.c:102
#5  0x00011780 in main (argc=<value optimized out>, argv=0xfb66201c)
    at atrace.c:58

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-06-21 15:20 ` John David Anglin
  2009-06-21 16:18   ` Carlos O'Donell
  2009-06-22  1:46   ` Randolph Chung
@ 2009-09-09 14:39   ` Carlos O'Donell
  2009-09-09 14:50     ` Mike Frysinger
  2009-09-09 15:16     ` John David Anglin
  2 siblings, 2 replies; 21+ messages in thread
From: Carlos O'Donell @ 2009-09-09 14:39 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On Sun, Jun 21, 2009 at 11:20 AM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
> Looking at my email archive, I see the real cause involves kernel memory
> maps:
>
>  > > > On Wed, May 06, 2009 at 01:39:49PM -0400, John David Anglin wrote:
>  > > > > > The tombstone is:
>  > > > > >
>  > > > > > do_page_fault() pid=10205 command='strace' type=15 address=0x407d2f18
>  > > > > > vm_start = 0x4068d000, vm_end = 0x4068f000
>  > > > >
>  > > > > So, the pointer passed to __canonicalize_funcptr_for_compare is outside
>  > > > > the vm range.

The strace problem is a compiler flaw.

This is the problem:
* Strace examines the application's syscall.
* Strace extracts, via PTRACE, application addresses, addresses that
don't exist in the strace address space (and should not exist).
* Strace compares the extracted address to the constant SIG_ERR.
* The compiler generates a call to __c_f_f_c, which dereferences the
extracted address, and strace faults.

Strace and the application have completely different address spaces,
and __c_f_f_c can't assume that an address is in the current address
space.

The solution is to detect that a comparison between two pointers is a
comparison between pointer and small constant, and avoid calling
__c_f_f_c for both.

The workaround is to cast both to long. I tested this and it works. I'll
submit this to debian as the fix.
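
A minimal sketch of the cast workaround, assuming the handler value has
already been copied out of the traced process. The names handler_t,
handler_kind and classify_handler are hypothetical and are not taken from
strace. The only point is that every comparison is done on long values, so,
as described in this thread, no function pointer comparison (and hence no
__c_f_f_c call) should be generated on hppa.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical names for illustration; not strace's own types. */
typedef void (*handler_t)(int);

enum handler_kind {
    HANDLER_ERR,     /* value equals SIG_ERR */
    HANDLER_DFL,     /* value equals SIG_DFL */
    HANDLER_IGN,     /* value equals SIG_IGN */
    HANDLER_ADDRESS  /* anything else: an address in the traced process */
};

/* Classify a handler value fetched from another process.  Both sides of
 * every comparison are cast to long, so the values are compared as plain
 * integers and the foreign address is never dereferenced. */
static enum handler_kind classify_handler(handler_t h)
{
    /* Assumption: long is wide enough to hold a function pointer. */
    assert(sizeof(long) >= sizeof(handler_t));
    long raw = (long)h;

    if (raw == (long)SIG_ERR)
        return HANDLER_ERR;
    if (raw == (long)SIG_DFL)
        return HANDLER_DFL;
    if (raw == (long)SIG_IGN)
        return HANDLER_IGN;
    return HANDLER_ADDRESS;
}
```

On other architectures this should behave like the plain pointer
comparison; the casts only matter where the compiler would otherwise
canonicalize function pointers before comparing them.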

Cheers,
Carlos.

* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-09 14:39   ` Carlos O'Donell
@ 2009-09-09 14:50     ` Mike Frysinger
  2009-09-09 15:16     ` John David Anglin
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Frysinger @ 2009-09-09 14:50 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: John David Anglin, dave.anglin, linux-parisc


On Wednesday 09 September 2009 10:39:30 Carlos O'Donell wrote:
> On Sun, Jun 21, 2009 at 11:20 AM, John David
> Anglin<dave@hiauly1.hia.nrc.ca> wrote:
> > Looking at my email archive, I see the real cause involves kernel memory
> > maps:
> >
> >  > > > On Wed, May 06, 2009 at 01:39:49PM -0400, John David Anglin wrote:
> >  > > > > > The tombstone is:
> >  > > > > >
> >  > > > > > do_page_fault() pid=10205 command='strace' type=15 address=0x407d2f18
> >  > > > > > vm_start = 0x4068d000, vm_end = 0x4068f000
> >  > > > >
> >  > > > > So, the pointer passed to __canonicalize_funcptr_for_compare is outside
> >  > > > > the vm range.
> 
> The strace problem is a compiler flaw.
> 
> This is the problem:
> * Strace examines the application's syscall.
> * Strace extracts, via PTRACE, application addresses, addresses that
> don't exist in the strace address space (and should not exist).
> * Strace compares the extracted address to the constant SIG_ERR.
> * The compiler generates a call to __c_f_f_c, which dereferences the
> extracted address, and strace faults.
> 
> Strace and the application have completely different address spaces,
> and __c_f_f_c can't assume that an address is in the current address
> space.
> 
> The solution is to detect that a comparison between two pointers is a
> comparison between pointer and small constant, and avoid calling
> __c_f_f_c for both.
> 
> The workaround is to cast both to long. I tested this and it works. I'll
> submit this to debian as the fix.

and include this explanation in a comment right above the cast? ;)

also, the strace list is active currently, so posting a patch there should get 
it merged (and since it's an important bugfix, it should get added before the 
next release).
-mike


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-09 14:39   ` Carlos O'Donell
  2009-09-09 14:50     ` Mike Frysinger
@ 2009-09-09 15:16     ` John David Anglin
  2009-09-09 15:24       ` Carlos O'Donell
  1 sibling, 1 reply; 21+ messages in thread
From: John David Anglin @ 2009-09-09 15:16 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> The strace problem is a compiler flaw.
> 
> This is the problem:
> * Strace examines the application's syscall.
> * Strace extracts, via PTRACE, application addresses, addresses that
> don't exist in the strace address space (and should not exist).
> * Strace compares the extracted address to the constant SIG_ERR.
> * The compiler generates a call to __c_f_f_c, which dereferences the
> extracted address, and strace faults.
> 
> Strace and the application have completely different address spaces,
> and __c_f_f_c can't assume that an address is in the current address
> space.

Unfortunately, this is necessary to canonicalize function pointers.

> The solution is to detect that a comparison between two pointers is a
> comparison between pointer and small constant, and avoid calling
> __c_f_f_c for both.

__c_f_f_c would have to be passed both pointers to detect comparisons
with small constants, or the GCC middle-end would have to detect
comparisons with constants and avoid canonicalization in that case.
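
One way to picture the first option is a two-operand helper that checks for
special values before canonicalizing anything. This is only an illustrative
sketch: guarded_funcptr_equal, is_special_value and fake_canonicalize are
hypothetical names, the stub canonicalizer merely masks low bits so the
example can run on any host (it does not model hppa function descriptors),
and treating -1 and page-zero values as special is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts how often the (stub) canonicalizer actually runs. */
static int canon_calls = 0;

/* Assumed definition of "special": (uintptr_t)-1 or any value in page zero. */
static bool is_special_value(uintptr_t v)
{
    return v == (uintptr_t)-1 || v < 4096;
}

/* Hypothetical stand-in for a canonicalizer.  It only masks the two low
 * bits; a real one would have to read memory at the given address. */
static uintptr_t fake_canonicalize(uintptr_t fptr)
{
    /* The guard below is expected to filter special values out first. */
    assert(!is_special_value(fptr));
    canon_calls++;
    return fptr & ~(uintptr_t)3;
}

/* Compare two function pointer values, skipping canonicalization
 * entirely when either operand is a special value. */
static bool guarded_funcptr_equal(uintptr_t a, uintptr_t b)
{
    if (is_special_value(a) || is_special_value(b))
        return a == b;
    return fake_canonicalize(a) == fake_canonicalize(b);
}
```

With a guard like this, a comparison against SIG_ERR would never reach the
canonicalizer, whatever the other operand holds. The price is a helper
that takes both operands instead of one.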

> The workaround is to cast both to long. I tested this and it works. I'll
> submit this to debian as the fix.

This is the simplest fix.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-09 15:16     ` John David Anglin
@ 2009-09-09 15:24       ` Carlos O'Donell
  2009-09-10  1:10         ` John David Anglin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-09-09 15:24 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On Wed, Sep 9, 2009 at 11:16 AM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
>> The workaround is to cast both to long. I tested this and it works. I'll
>> submit this to debian as the fix.
>
> This is the simplest fix.

Yes, it would appear that strace/gdb (applications that ptrace
addresses from another process) are the applications most likely to suffer
from this problem, and it may be easier to work around the problem
there instead of in the compiler.

Cheers,
Carlos.


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-09 15:24       ` Carlos O'Donell
@ 2009-09-10  1:10         ` John David Anglin
  2009-09-10 16:03           ` Carlos O'Donell
  0 siblings, 1 reply; 21+ messages in thread
From: John David Anglin @ 2009-09-10  1:10 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> On Wed, Sep 9, 2009 at 11:16 AM, John David
> Anglin<dave@hiauly1.hia.nrc.ca> wrote:
> >> The workaround is to cast both to long. I tested this and it works. I'll
> >> submit this to debian as the fix.
> >
> > This is the simplest fix.
>
> Yes, it would appear that strace/gdb (applications that ptrace
> addresses from another process) are the applications most likely to suffer
> from this problem, and it may be easier to work around the problem
> there instead of in the compiler.

Actually, it probably would not be too hard to disable canonicalization
with a compiler option.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-10  1:10         ` John David Anglin
@ 2009-09-10 16:03           ` Carlos O'Donell
  2009-09-11 22:57             ` John David Anglin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-09-10 16:03 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On Wed, Sep 9, 2009 at 9:10 PM, John David
Anglin<dave@hiauly1.hia.nrc.ca> wrote:
>> On Wed, Sep 9, 2009 at 11:16 AM, John David
>> Anglin<dave@hiauly1.hia.nrc.ca> wrote:
>> >> The workaround is to cast both to long. I tested this and it works. I'll
>> >> submit this to debian as the fix.
>> >
>> > This is the simplest fix.
>>
>> Yes, it would appear that strace/gdb (applications that ptrace
>> addresses from another process) are the applications most likely to suffer
>> from this problem, and it may be easier to work around the problem
>> there instead of in the compiler.
>
> Actually, it probably would not be too hard to disable canonicalization
> with a compiler option.

No, it's not necessary. This way I simply have to audit the incorrect
comparisons of an inferior's function pointer to a constant. If I were
to disable canonicalization I'd have to audit the entire compilation
unit to prove I haven't broken any valid function pointer comparisons.

Cheers,
Carlos.


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-10 16:03           ` Carlos O'Donell
@ 2009-09-11 22:57             ` John David Anglin
  2009-09-13 20:56               ` Carlos O'Donell
  0 siblings, 1 reply; 21+ messages in thread
From: John David Anglin @ 2009-09-11 22:57 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: dave.anglin, linux-parisc

> No, it's not necessary. This way I simply have to audit the incorrect
> comparisons of an inferior's function pointer to a constant. If I were
> to disable canonicalization I'd have to audit the entire compilation
> unit to prove I haven't broken any valid function pointer comparisons.

I'm thinking we have OPD support with recent binutils, so the
above shouldn't be necessary.  Wasn't that something you added?

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-11 22:57             ` John David Anglin
@ 2009-09-13 20:56               ` Carlos O'Donell
  2009-09-13 21:00                 ` Kyle McMartin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-09-13 20:56 UTC (permalink / raw)
  To: John David Anglin; +Cc: dave.anglin, linux-parisc

On Fri, Sep 11, 2009 at 6:57 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
>> No, it's not necessary. This way I simply have to audit the incorrect
>> comparisons of an inferior's function pointer to a constant. If I were
>> to disable canonicalization I'd have to audit the entire compilation
>> unit to prove I haven't broken any valid function pointer comparisons.
>
> I'm thinking we have OPD support with recent binutils, so the
> above shouldn't be necessary.  Wasn't that something you added?

I did the work, yes, I rewrote the PLABEL support to include adding
OPDs in the executable (instead of pointing the local PLABEL relocs
at the relevant PLT entry). However, with the start of my Masters I
didn't have time to clean it up, and check it into binutils. This is
*certainly* on my list, we need some work in binutils, but *right now*
I'm working on resurrecting strace which seems to be completely
broken. Fixing the function descriptor problem is only the first step,
after that strace still fails with an invalid upeek generated from
hppa-specific code. I'm working on it right now (today actually).

AFAICT "strace" *is* our "defense against the dark arts" if you catch
my Harry Potter reference :-)

Cheers,
Carlos.

* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-13 20:56               ` Carlos O'Donell
@ 2009-09-13 21:00                 ` Kyle McMartin
  2009-09-13 21:32                   ` Carlos O'Donell
  0 siblings, 1 reply; 21+ messages in thread
From: Kyle McMartin @ 2009-09-13 21:00 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: John David Anglin, dave.anglin, linux-parisc

On Sun, Sep 13, 2009 at 04:56:20PM -0400, Carlos O'Donell wrote:
> after that strace still fails with an invalid upeek generated from
> hppa-specific code. I'm working on it right now (today actually).
> 
> AFAICT "strace" *is* our "defense against the dark arts" if you catch
> my Harry Potter reference :-)
> 

http://shortfin.cabal.ca/~kyle/strace-fix-hppa-syscalls.diff


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-13 21:00                 ` Kyle McMartin
@ 2009-09-13 21:32                   ` Carlos O'Donell
  2009-09-13 21:37                     ` Kyle McMartin
  0 siblings, 1 reply; 21+ messages in thread
From: Carlos O'Donell @ 2009-09-13 21:32 UTC (permalink / raw)
  To: Kyle McMartin; +Cc: John David Anglin, dave.anglin, linux-parisc, Helge Deller

On Sun, Sep 13, 2009 at 5:00 PM, Kyle McMartin <kyle@mcmartin.ca> wrote:
> On Sun, Sep 13, 2009 at 04:56:20PM -0400, Carlos O'Donell wrote:
>> after that strace still fails with an invalid upeek generated from
>> hppa-specific code. I'm working on it right now (today actually).
>>
>> AFAICT "strace" *is* our "defense against the dark arts" if you catch
>> my Harry Potter reference :-)
>>
>
> http://shortfin.cabal.ca/~kyle/strace-fix-hppa-syscalls.diff

Thanks, that solves my problem.

Is it OK if I submit your patch to debian and upstream *again* and
keep doing so until they apply the patches? >:)

I see this ancient bug entry, which I'll update, and bug Roland...

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=437928

Cheers,
Carlos.


* Re: Segfault in __c_f_f_c during strace of nptl application.
  2009-09-13 21:32                   ` Carlos O'Donell
@ 2009-09-13 21:37                     ` Kyle McMartin
  0 siblings, 0 replies; 21+ messages in thread
From: Kyle McMartin @ 2009-09-13 21:37 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Kyle McMartin, John David Anglin, dave.anglin, linux-parisc,
	Helge Deller

On Sun, Sep 13, 2009 at 05:32:41PM -0400, Carlos O'Donell wrote:
> On Sun, Sep 13, 2009 at 5:00 PM, Kyle McMartin <kyle@mcmartin.ca> wrote:
> > On Sun, Sep 13, 2009 at 04:56:20PM -0400, Carlos O'Donell wrote:
> >> after that strace still fails with an invalid upeek generated from
> >> hppa-specific code. I'm working on it right now (today actually).
> >>
> >> AFAICT "strace" *is* our "defense against the dark arts" if you catch
> >> my Harry Potter reference :-)
> >>
> >
> > http://shortfin.cabal.ca/~kyle/strace-fix-hppa-syscalls.diff
> 
> Thanks, that solves my problem.
> 
> Is it OK if I submit your patch to debian and upstream *again* and
> keep doing so until they apply the patches? >:)

gopher it.
