* RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report()) (Kazuhito Hagio)
From: Dave Anderson @ 2019-11-07 20:18 UTC
  To: kexec; +Cc: k-hagio



----- Original Message -----
> Date: Thu, 7 Nov 2019 16:12:06 +0000
> From: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> To: Dave Jones <davej@codemonkey.org.uk>
> Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>
> Subject: RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report())
> Message-ID: <4AE2DC15AC0B8543882A74EA0D43DBEC035949A4@BPXM09GP.gisp.nec.co.jp>
> Content-Type: text/plain; charset="iso-2022-jp"
> 
> Hi,
> 
> > -----Original Message-----
> > >  > > There are some other failure cases with non-null data, so maybe
> > >  > > there's >1 bug here.
> > >  > > I've not seen an obvious pattern to this. eg...
> > >  > >
> > >  > > https://pastebin.com/2uM4sBCF
> > >  > >
> > >  >
> > >  > As for this case, I suspect that Elf64_Ehdr.e_phnum overflows
> > >  > (i.e. num_loads_dumpfile > 65535):
> > >
> > > Oh, good catch.  These are 256GB machines, so after discarding
> > > everything, that explains why we end up with so many sections.
> > > This also explains why it sometimes works I think, when the discarding
> > > manages to get the total nr headers <64k.
> 
> I could also reproduce this issue on a system with 192GB of memory.
> The note data was in fact overwritten by the program headers that follow it.
> -----
> num_loads_dumpfile=76318                # more than 64k
> ehdr64.e_phnum=10783                    # overflowed
> note.p_offset=0x93708 .p_filesz=0x2958  # The note data is at 0x93708
> note cd_header->offset=0x40
> ...
>     head->off=     90040 load.p_addr= 44552e000 .p_off=  ed270060 ...
>                    ^^^^^ # these headers overwrote the note data.
>     head->off=     a0040 load.p_addr= 445630000 .p_off=  ed272060 ...
> ...
> The dumpfile is saved to dump.Ed25.devel.
> 
> makedumpfile Completed.
> 
> # readelf -a dump.Ed25.devel
> ...
>   Number of program headers:         10783
> ...
> Displaying notes found at file offset 0x00093708 with length 0x00002958:
>   Owner                 Data size       Description
>                        0x00000007       Unknown note type: (0xdbce6060)
>    description data: 00 00 7a 39 fffffff2 ffffff8a ffffffff
> # ../crash vmlinux dump.Ed25.devel
> 
> WARNING: possibly corrupt Elf64_Nhdr: n_namesz: 4185522176 n_descsz: 3
> n_type: f4000
> ...
> WARNING: cannot read linux_banner string
> crash: vmlinux and dump.Ed25.devel do not match!
> -----
> 
> > I think this will be one of the causes, and I had a look at how
> > we can fix it.  If you get a vmcore where this pattern occurs,
> > you can try this tree:
> > https://github.com/k-hagio/makedumpfile/tree/support-extended-elf
> > 
> > Then, the crash utility also needs a patch to support a dumpfile
> > that has more than 64k program headers:
> > https://github.com/k-hagio/crash/tree/support-extended-elf
> 
> These trees appear to work well, though they need more testing and tweaks.
> -----
> # readelf -a dump.Ed25.test
> ...
>   Number of program headers:         65535 (76319)  <<-- note + loads
> ...
> Displaying notes found at file offset 0x00413748 with length 0x00002958:
>   Owner                 Data size       Description
>   CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
>   CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
>   CORE                 0x00000150       NT_PRSTATUS (prstatus structure)
> ...
> # ../crash-test vmlinux dump.Ed25.test
> 
> crash-test> help -D
> vmcore_data:
>                   flags: c0 (KDUMP_LOCAL|KDUMP_ELF64)
>                    ndfd: 3
>                     ofp: 3141560
>             header_size: 4284576
>    num_pt_load_segments: 76318   <<-- loads
>      pt_load_segment[0]:
> -----
> 
> This issue can occur on ordinary systems if they have a large amount
> of memory, so I'm going to proceed with those patches.
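
The numbers in the quoted log are consistent with a plain 16-bit truncation of e_phnum. The sketch below is only a check of that arithmetic, not makedumpfile code. It assumes the program header table starts right after the 64-byte Elf64_Ehdr and that the note data is placed right after the table; the function names are made up for illustration.

```python
EHDR_SIZE = 64   # sizeof(Elf64_Ehdr)
PHDR_SIZE = 56   # sizeof(Elf64_Phdr)
SHDR_SIZE = 64   # sizeof(Elf64_Shdr)


def stored_e_phnum(num_headers):
    """Value left in the 16-bit e_phnum field if no range check is done."""
    return num_headers & 0xFFFF


def end_of_phdr_table(num_headers):
    """File offset just past the table (assumes it follows the ELF header)."""
    return EHDR_SIZE + num_headers * PHDR_SIZE


total_headers = 76318 + 1          # num_loads_dumpfile plus one PT_NOTE
truncated = stored_e_phnum(total_headers)

print(truncated)                                # e_phnum seen in the log
print(hex(end_of_phdr_table(truncated)))        # where the note was placed
print(hex(end_of_phdr_table(total_headers)))    # where the headers really end
# Assumption: the patched layout adds one Elf64_Shdr before the notes.
print(hex(end_of_phdr_table(total_headers) + SHDR_SIZE))
```

Under those assumptions the truncated count is 10783 and the note offset comes out as 0x93708, matching the log, while the full table of 76319 headers runs to 0x413708 and so covers the note. Adding one section header gives 0x413748, the note offset reported for dump.Ed25.test.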

Hi Kazu,

Do you want me to go ahead with the crash utility patch?  It looks
safe enough to apply, and I did test it to make sure there were no
ill-effects with sample ELF dumpfiles.

Thanks,
  Dave
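
For reference, the "65535 (76319)" in the readelf output above is the standard ELF extended numbering: e_phnum holds PN_XNUM (0xffff) and the real count is kept in sh_info of section header 0. The following is a minimal sketch of how a reader could resolve it, not the code in either tree. It handles little-endian ELF64 only, and build_header() is a hypothetical helper that fabricates a tiny header purely for the demonstration.

```python
import struct

PN_XNUM = 0xFFFF
EHDR_FMT = "<16sHHIQQQIHHHHHH"   # Elf64_Ehdr, little-endian, 64 bytes
SHDR_FMT = "<IIQQQQIIQQ"         # Elf64_Shdr, little-endian, 64 bytes


def real_phnum(image):
    """Return the program header count, following PN_XNUM if present."""
    ehdr = struct.unpack_from(EHDR_FMT, image, 0)
    e_shoff, e_phnum = ehdr[6], ehdr[10]
    if e_phnum != PN_XNUM:
        return e_phnum
    if e_shoff == 0:
        raise ValueError("e_phnum is PN_XNUM but there is no section header")
    shdr0 = struct.unpack_from(SHDR_FMT, image, e_shoff)
    return shdr0[7]              # sh_info of section header 0


def build_header(num_headers):
    """Hypothetical helper: ELF header plus optional section header 0."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)   # ELFCLASS64, LSB
    extended = num_headers >= PN_XNUM
    shdr0 = b""
    if extended:
        shdr0 = struct.pack(SHDR_FMT, 0, 0, 0, 0, 0, 0, 0, num_headers, 0, 0)
    ehdr = struct.pack(
        EHDR_FMT, ident, 4, 62, 1, 0,        # ET_CORE, EM_X86_64
        64 + len(shdr0),                     # e_phoff (layout is assumed)
        64 if extended else 0,               # e_shoff
        0, 64, 56,
        min(num_headers, PN_XNUM),           # e_phnum
        64, 1 if extended else 0, 0)
    return ehdr + shdr0


print(real_phnum(build_header(76319)))
print(real_phnum(build_header(10)))
```

A reader that ignores PN_XNUM would see 65535 headers here, which is why the crash side needs a matching change.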


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* RE: makedumpfile: ELF format issues (RE: makedumpfile: Fix divide by zero in print_report()) (Kazuhito Hagio)
From: Kazuhito Hagio @ 2019-11-07 21:28 UTC
  To: Dave Anderson; +Cc: kexec

Hi Dave,

> -----Original Message-----
> > > I think this will be one of the causes, and I had a look at how
> > > we can fix it.  If you get a vmcore where this pattern occurs,
> > > you can try this tree:
> > > https://github.com/k-hagio/makedumpfile/tree/support-extended-elf
> > >
> > > Then, the crash utility also needs a patch to support a dumpfile
> > > that has more than 64k program headers:
> > > https://github.com/k-hagio/crash/tree/support-extended-elf

> > This issue can occur on ordinary systems if they have a large amount
> > of memory, so I'm going to proceed with those patches.
> 
> Hi Kazu,
> 
> Do you want me to go ahead with the crash utility patch?  It looks
> safe enough to apply, and I did test it to make sure there were no
> ill-effects with sample ELF dumpfiles.

Oh, thank you for your attention and testing.

I'm dropping the ELF32 parts of the patches, because I don't think they
will be used in the future.  (I estimate that the theoretical minimum
memory size at which makedumpfile could use the extended numbering is
64GB+256MB on a system with 4k pages.)
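
The message does not show how this estimate is derived. One reading that reproduces the figure is sketched below. It assumes that makedumpfile starts a new PT_LOAD only after a run of at least 256 consecutive excluded pages, so each segment costs at least one kept page plus 256 excluded ones, and that 65536 such segments are needed. Both values are assumptions for the arithmetic, not statements taken from the thread.

```python
PAGE_SIZE = 4096          # 4k pages, as stated in the message
MIN_EXCLUDED_RUN = 256    # assumption: shortest excluded run that splits a load
SEGMENTS = 1 << 16        # assumption: 65536 segments


def min_memory_bytes(segments, excluded_run, page_size):
    """Smallest memory size that can yield that many PT_LOAD segments."""
    pages_per_segment = 1 + excluded_run   # one kept page + one excluded run
    return segments * pages_per_segment * page_size


total = min_memory_bytes(SEGMENTS, MIN_EXCLUDED_RUN, PAGE_SIZE)
gib, remainder = divmod(total, 1 << 30)
print(gib, "GiB +", remainder // (1 << 20), "MiB")
```

With these inputs the result is 64 GiB + 256 MiB, which matches the estimate above.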

I will let you know when it is ready.

Thanks!
Kazu