* makedumpfile: question about memory hole
@ 2013-03-15  0:31 HATAYAMA Daisuke
  2013-03-19  1:41 ` Atsushi Kumagai
  0 siblings, 1 reply; 8+ messages in thread
From: HATAYAMA Daisuke @ 2013-03-15  0:31 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: oomichi, kexec

Hello Kumagai-san,

I have a question about memory holes.

For example, create_1st_bitmap() calculates memory holes in the part
below:

int
create_1st_bitmap(void)
{
...
        /*
         * If page is on memory hole, set bit on the 1st-bitmap.
         */
        pfn_bitmap1 = 0;
        for (i = 0; get_pt_load(i, &phys_start, &phys_end, NULL, NULL); i++) {

                print_progress(PROGRESS_HOLES, i, num_pt_loads);

                pfn_start = paddr_to_pfn(phys_start);
                pfn_end   = paddr_to_pfn(phys_end);

                if (!is_in_segs(pfn_to_paddr(pfn_start)))
                        pfn_start++;
                for (pfn = pfn_start; pfn < pfn_end; pfn++) {
                        set_bit_on_1st_bitmap(pfn);
                        pfn_bitmap1++;
                }
        }
        pfn_memhole = info->max_mapnr - pfn_bitmap1;

What I don't understand well is this part:

                pfn_start = paddr_to_pfn(phys_start);
                pfn_end   = paddr_to_pfn(phys_end);

                if (!is_in_segs(pfn_to_paddr(pfn_start)))
                        pfn_start++;

phys_start and pfn_to_paddr(pfn_start) should belong to the same page
frame, so I suspect that pfn_start should be included in the vmcore.

Looking at the kexec-tools side, I don't see any additional modification
made to phys_start after it's parsed from /proc/iomem or its counterpart
on the EFI interface. Is there some kernel-side assumption about memory
holes?

Thanks.
HATAYAMA, Daisuke


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* Re: makedumpfile: question about memory hole
  2013-03-15  0:31 makedumpfile: question about memory hole HATAYAMA Daisuke
@ 2013-03-19  1:41 ` Atsushi Kumagai
  2013-03-19  2:42   ` HATAYAMA Daisuke
  0 siblings, 1 reply; 8+ messages in thread
From: Atsushi Kumagai @ 2013-03-19  1:41 UTC (permalink / raw)
  To: d.hatayama; +Cc: oomichi, kexec

Hello HATAYAMA-san,

On Fri, 15 Mar 2013 09:31:46 +0900 (JST)
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:

> Hello Kumagai-san,
> 
> I have a question about memory hole.
> 
> For example, create_1st_bitmap() calculates memory holes in the part
> below:
> 
> int
> create_1st_bitmap(void)
> {
> ...
>         /*
>          * If page is on memory hole, set bit on the 1st-bitmap.
>          */
>         pfn_bitmap1 = 0;
>         for (i = 0; get_pt_load(i, &phys_start, &phys_end, NULL, NULL); i++) {
> 
>                 print_progress(PROGRESS_HOLES, i, num_pt_loads);
> 
>                 pfn_start = paddr_to_pfn(phys_start);
>                 pfn_end   = paddr_to_pfn(phys_end);
> 
>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
>                         pfn_start++;
>                 for (pfn = pfn_start; pfn < pfn_end; pfn++) {
>                         set_bit_on_1st_bitmap(pfn);
>                         pfn_bitmap1++;
>                 }
>         }
>         pfn_memhole = info->max_mapnr - pfn_bitmap1;
> 
> What I don't understand well is that the part here:
> 
>                 pfn_start = paddr_to_pfn(phys_start);
>                 pfn_end   = paddr_to_pfn(phys_end);
> 
>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
>                         pfn_start++;
> 
> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
> frame, so I suspect the pfn_start should be included in vmcore.
> 
> Looking into kexec-tool side, I don't see additional modification made
> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
> interface. Is there any assumption about memory holes behind kernel?

Here is a PT_LOAD segment from an ia64 machine that I actually use:

  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
[...]
  LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
                 0x000000000005a000 0x000000000005a000  RWE    0

In this case, pfn_to_paddr(pfn_start) is rounded down to 0x40ffda4000
because the page size is 16KiB, and this address is outside the PT_LOAD
segment.

         phys_start
         = 0x40ffda5000                        
            |------------- PT_LOAD ----------------
     ----+----------+----------+----------+--------
         |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
     ----+----------+----------+----------+--------
         |
   pfn_to_paddr(pfn:N)
   = 0x40ffda4000

The statement you pointed out handles the case where phys_start isn't
aligned to the page size.
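
As an illustration of the rounding described above, here is a minimal
sketch. It is not taken from makedumpfile: the example_* helpers are
hypothetical stand-ins for paddr_to_pfn(), pfn_to_paddr() and a
single-segment version of is_in_segs(), and the 16KiB page size is
assumed from the machine shown above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 KiB pages, as on the ia64 machine in this thread. */
#define EXAMPLE_PAGE_SHIFT 14

/* Hypothetical stand-in for paddr_to_pfn(): drops the in-page offset. */
static uint64_t example_paddr_to_pfn(uint64_t paddr)
{
        return paddr >> EXAMPLE_PAGE_SHIFT;
}

/* Hypothetical stand-in for pfn_to_paddr(): yields the page base address. */
static uint64_t example_pfn_to_paddr(uint64_t pfn)
{
        return pfn << EXAMPLE_PAGE_SHIFT;
}

/*
 * Simplified check against ONE segment, treated as the half-open range
 * [seg_start, seg_end). The real is_in_segs() looks at every PT_LOAD.
 */
static int example_in_segment(uint64_t paddr, uint64_t seg_start,
                              uint64_t seg_end)
{
        assert(seg_start <= seg_end);
        return paddr >= seg_start && paddr < seg_end;
}
```

With phys_start = 0x40ffda5000 and MemSiz = 0x5a000, the round trip
through the two helpers gives 0x40ffda4000, which lies before the start
of the segment, so the condition in the quoted code becomes true.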

BTW, I'll add a comment here to explain this intention.


Thanks
Atsushi Kumagai


* Re: makedumpfile: question about memory hole
  2013-03-19  1:41 ` Atsushi Kumagai
@ 2013-03-19  2:42   ` HATAYAMA Daisuke
  2013-03-19  7:49     ` Atsushi Kumagai
  0 siblings, 1 reply; 8+ messages in thread
From: HATAYAMA Daisuke @ 2013-03-19  2:42 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: oomichi, kexec

From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Subject: Re: makedumpfile: question about memory hole
Date: Tue, 19 Mar 2013 10:41:37 +0900

> On Fri, 15 Mar 2013 09:31:46 +0900 (JST)
> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:

>> What I don't understand well is that the part here:
>> 
>>                 pfn_start = paddr_to_pfn(phys_start);
>>                 pfn_end   = paddr_to_pfn(phys_end);
>> 
>>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
>>                         pfn_start++;
>> 
>> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
>> frame, so I suspect the pfn_start should be included in vmcore.
>> 
>> Looking into kexec-tool side, I don't see additional modification made
>> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
>> interface. Is there any assumption about memory holes behind kernel?
> 
> Here is a PT_LOAD segment of ia64 machine which I actually use:
> 
>   Type           Offset             VirtAddr           PhysAddr
>                  FileSiz            MemSiz              Flags  Align
> [...]
>   LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
>                  0x000000000005a000 0x000000000005a000  RWE    0
> 
> In this case, pfn_to_paddr(pfn_start) is aligned to 0x40ffda4000 
> because the page size is 16KiB, and this address is out of PT_LOAD
> segment.
> 
>          phys_start
>          = 0x40ffda5000                        
>             |------------- PT_LOAD ----------------
>      ----+----------+----------+----------+--------
>          |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
>      ----+----------+----------+----------+--------
>          |
>    pfn_to_paddr(pfn:N)
>    = 0x40ffda4000
> 
> The statement you said is for care the case that phys_start isn't aligned
> with the page size.
> 
> BTW, I'll add a comment to explain this intention into here.

Thanks for the pictorial explanation. It's easy to understand.

Still, I think pfn:N should be included in the vmcore. The current
implementation drops [0x40ffda5000, 0x40ffda8000), which is contained in
the PT_LOAD. Or must that range be a hole or some other kind of
unnecessary memory because of some kernel-side assumption?
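
To make the size of the dropped range concrete, here is a small sketch.
The function name is hypothetical, and it assumes that the first page
frame of the segment is not covered by any other PT_LOAD, so that the
pfn_start++ in the quoted code takes effect.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bytes of the segment [phys_start, phys_end) that live in its
 * first page frame when phys_start is not page-aligned. These are the
 * bytes that are lost when that frame is skipped.
 */
static uint64_t example_dropped_bytes(uint64_t phys_start, uint64_t phys_end,
                                      unsigned int page_shift)
{
        uint64_t page_size, page_base, page_end;

        assert(page_shift < 64);
        assert(phys_start <= phys_end);

        page_size = 1ULL << page_shift;
        page_base = phys_start & ~(page_size - 1);
        page_end = page_base + page_size;

        if (page_base == phys_start)
                return 0;               /* aligned start: nothing is skipped */
        if (phys_end < page_end)
                page_end = phys_end;    /* segment ends inside the first frame */
        return page_end - phys_start;
}
```

For the PT_LOAD above with a page shift of 14, this gives 0x3000, that
is, 12KiB of the segment.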

Thanks.
HATAYAMA, Daisuke



* Re: makedumpfile: question about memory hole
  2013-03-19  2:42   ` HATAYAMA Daisuke
@ 2013-03-19  7:49     ` Atsushi Kumagai
  2013-03-19  8:47       ` HATAYAMA Daisuke
  0 siblings, 1 reply; 8+ messages in thread
From: Atsushi Kumagai @ 2013-03-19  7:49 UTC (permalink / raw)
  To: d.hatayama; +Cc: oomichi, kexec

On Tue, 19 Mar 2013 11:42:20 +0900 (JST)
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:

> From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
> Subject: Re: makedumpfile: question about memory hole
> Date: Tue, 19 Mar 2013 10:41:37 +0900
> 
> > On Fri, 15 Mar 2013 09:31:46 +0900 (JST)
> > HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> 
> >> What I don't understand well is that the part here:
> >> 
> >>                 pfn_start = paddr_to_pfn(phys_start);
> >>                 pfn_end   = paddr_to_pfn(phys_end);
> >> 
> >>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
> >>                         pfn_start++;
> >> 
> >> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
> >> frame, so I suspect the pfn_start should be included in vmcore.
> >> 
> >> Looking into kexec-tool side, I don't see additional modification made
> >> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
> >> interface. Is there any assumption about memory holes behind kernel?
> > 
> > Here is a PT_LOAD segment of ia64 machine which I actually use:
> > 
> >   Type           Offset             VirtAddr           PhysAddr
> >                  FileSiz            MemSiz              Flags  Align
> > [...]
> >   LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
> >                  0x000000000005a000 0x000000000005a000  RWE    0
> > 
> > In this case, pfn_to_paddr(pfn_start) is aligned to 0x40ffda4000 
> > because the page size is 16KiB, and this address is out of PT_LOAD
> > segment.
> > 
> >          phys_start
> >          = 0x40ffda5000                        
> >             |------------- PT_LOAD ----------------
> >      ----+----------+----------+----------+--------
> >          |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
> >      ----+----------+----------+----------+--------
> >          |
> >    pfn_to_paddr(pfn:N)
> >    = 0x40ffda4000
> > 
> > The statement you said is for care the case that phys_start isn't aligned
> > with the page size.
> > 
> > BTW, I'll add a comment to explain this intention into here.
> 
> Thanks for the pictorial explanation. It's easy to understand.
> 
> Still I think pfn:N should be included in vmcore. The current
> implementation drops [0x40ffda5000, 0x40ffda8000] that is contained in
> the PT_LOAD. Or, the range must be hole or other kinds of unnecessary
> memory from some kernel-side assumption?

Oh, now I understand your question correctly.

When Ohmichi-san wrote this code, he assumed that a page which includes
a memory hole isn't used. This came from the fact that the basic
unit of memory management is the *page*, but there was no detailed
investigation.

So, if there is any case where pfn:N is actually used, this statement
should be removed. Does this question come from having such a case in
mind?


Thanks
Atsushi Kumagai


* Re: makedumpfile: question about memory hole
  2013-03-19  7:49     ` Atsushi Kumagai
@ 2013-03-19  8:47       ` HATAYAMA Daisuke
  2013-03-29  8:13         ` Atsushi Kumagai
  0 siblings, 1 reply; 8+ messages in thread
From: HATAYAMA Daisuke @ 2013-03-19  8:47 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: oomichi, kexec

From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Subject: Re: makedumpfile: question about memory hole
Date: Tue, 19 Mar 2013 16:49:26 +0900

> On Tue, 19 Mar 2013 11:42:20 +0900 (JST)
> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> 
>> From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
>> Subject: Re: makedumpfile: question about memory hole
>> Date: Tue, 19 Mar 2013 10:41:37 +0900
>> 
>> > On Fri, 15 Mar 2013 09:31:46 +0900 (JST)
>> > HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
>> 
>> >> What I don't understand well is that the part here:
>> >> 
>> >>                 pfn_start = paddr_to_pfn(phys_start);
>> >>                 pfn_end   = paddr_to_pfn(phys_end);
>> >> 
>> >>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
>> >>                         pfn_start++;
>> >> 
>> >> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
>> >> frame, so I suspect the pfn_start should be included in vmcore.
>> >> 
>> >> Looking into kexec-tool side, I don't see additional modification made
>> >> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
>> >> interface. Is there any assumption about memory holes behind kernel?
>> > 
>> > Here is a PT_LOAD segment of ia64 machine which I actually use:
>> > 
>> >   Type           Offset             VirtAddr           PhysAddr
>> >                  FileSiz            MemSiz              Flags  Align
>> > [...]
>> >   LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
>> >                  0x000000000005a000 0x000000000005a000  RWE    0
>> > 
>> > In this case, pfn_to_paddr(pfn_start) is aligned to 0x40ffda4000 
>> > because the page size is 16KiB, and this address is out of PT_LOAD
>> > segment.
>> > 
>> >          phys_start
>> >          = 0x40ffda5000                        
>> >             |------------- PT_LOAD ----------------
>> >      ----+----------+----------+----------+--------
>> >          |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
>> >      ----+----------+----------+----------+--------
>> >          |
>> >    pfn_to_paddr(pfn:N)
>> >    = 0x40ffda4000
>> > 
>> > The statement you said is for care the case that phys_start isn't aligned
>> > with the page size.
>> > 
>> > BTW, I'll add a comment to explain this intention into here.
>> 
>> Thanks for the pictorial explanation. It's easy to understand.
>> 
>> Still I think pfn:N should be included in vmcore. The current
>> implementation drops [0x40ffda5000, 0x40ffda8000] that is contained in
>> the PT_LOAD. Or, the range must be hole or other kinds of unnecessary
>> memory from some kernel-side assumption?
> 
> Oh, I understand your question correctly now.
> 
> When Ohmichi-san wrote this code, he thought the page which include
> memory hole isn't be used. This came from the fact that the basic
> unit of memory management is *page*, but there is no detailed
> investigation. 

You mean that, at least on IA64, such parts are always holes?

> 
> So, if there is any case where pfn:N is actually used, this statement
> should be removed. Maybe, does this question come from an idea of such
> cases ?

I'm wondering whether such a case can actually happen.

Even apart from the IA64 case, regions that are not page-size
aligned can occur if some parts of System RAM are converted into other
types of memory at runtime.

So, ideally, we should handle the page frames that correspond to the start
and end of each PT_LOAD entry specially, filling the ranges not
covered by any PT_LOAD entry with 0.
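
A rough sketch of that idea, for illustration only: the function below is
hypothetical, not makedumpfile code, and it handles a single PT_LOAD
range per page. A real implementation would have to consider every
PT_LOAD entry that overlaps the page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Zero the parts of one page buffer that fall outside the half-open
 * range [seg_start, seg_end). page_base is the physical address of the
 * first byte in the buffer.
 */
static void example_zero_uncovered(unsigned char *page, uint64_t page_base,
                                   size_t page_size, uint64_t seg_start,
                                   uint64_t seg_end)
{
        uint64_t page_end = page_base + page_size;
        uint64_t lo = seg_start > page_base ? seg_start : page_base;
        uint64_t hi = seg_end < page_end ? seg_end : page_end;

        assert(seg_start <= seg_end);

        if (lo >= hi) {
                /* The segment does not overlap this page at all. */
                memset(page, 0, page_size);
                return;
        }
        /* Bytes before the covered part, then bytes after it. */
        memset(page, 0, (size_t)(lo - page_base));
        memset(page + (hi - page_base), 0, (size_t)(page_end - hi));
}
```

For pfn:N in the example above, the first 0x1000 bytes of the page would
be zeroed and the remaining 0x3000 bytes kept.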

Thanks.
HATAYAMA, Daisuke



* Re: makedumpfile: question about memory hole
  2013-03-19  8:47       ` HATAYAMA Daisuke
@ 2013-03-29  8:13         ` Atsushi Kumagai
  2013-05-14  1:55           ` Atsushi Kumagai
  0 siblings, 1 reply; 8+ messages in thread
From: Atsushi Kumagai @ 2013-03-29  8:13 UTC (permalink / raw)
  To: d.hatayama; +Cc: oomichi, kexec

Hello HATAYAMA-san,

Sorry for the delayed response.

On Tue, 19 Mar 2013 17:47:45 +0900 (JST)
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:

> >> >> What I don't understand well is that the part here:
> >> >> 
> >> >>                 pfn_start = paddr_to_pfn(phys_start);
> >> >>                 pfn_end   = paddr_to_pfn(phys_end);
> >> >> 
> >> >>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
> >> >>                         pfn_start++;
> >> >> 
> >> >> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
> >> >> frame, so I suspect the pfn_start should be included in vmcore.
> >> >> 
> >> >> Looking into kexec-tool side, I don't see additional modification made
> >> >> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
> >> >> interface. Is there any assumption about memory holes behind kernel?
> >> > 
> >> > Here is a PT_LOAD segment of ia64 machine which I actually use:
> >> > 
> >> >   Type           Offset             VirtAddr           PhysAddr
> >> >                  FileSiz            MemSiz              Flags  Align
> >> > [...]
> >> >   LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
> >> >                  0x000000000005a000 0x000000000005a000  RWE    0
> >> > 
> >> > In this case, pfn_to_paddr(pfn_start) is aligned to 0x40ffda4000 
> >> > because the page size is 16KiB, and this address is out of PT_LOAD
> >> > segment.
> >> > 
> >> >          phys_start
> >> >          = 0x40ffda5000                        
> >> >             |------------- PT_LOAD ----------------
> >> >      ----+----------+----------+----------+--------
> >> >          |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
> >> >      ----+----------+----------+----------+--------
> >> >          |
> >> >    pfn_to_paddr(pfn:N)
> >> >    = 0x40ffda4000
> >> > 
> >> > The statement you said is for care the case that phys_start isn't aligned
> >> > with the page size.
> >> > 
> >> > BTW, I'll add a comment to explain this intention into here.
> >> 
> >> Thanks for the pictorial explanation. It's easy to understand.
> >> 
> >> Still I think pfn:N should be included in vmcore. The current
> >> implementation drops [0x40ffda5000, 0x40ffda8000] that is contained in
> >> the PT_LOAD. Or, the range must be hole or other kinds of unnecessary
> >> memory from some kernel-side assumption?
> > 
> > Oh, I understand your question correctly now.
> > 
> > When Ohmichi-san wrote this code, he thought the page which include
> > memory hole isn't be used. This came from the fact that the basic
> > unit of memory management is *page*, but there is no detailed
> > investigation. 
> 
> You mean on at least IA64 case such parts are always holes?

I showed the IA64 case just to show that the statement can actually be
executed and is meaningful code; that came from my misunderstanding
of your question.
Whether such parts are holes or not is another matter, and I don't have
enough information to decide that now.
 
> > 
> > So, if there is any case where pfn:N is actually used, this statement
> > should be removed. Maybe, does this question come from an idea of such
> > cases ?
> 
> I'm wondering if such case can actually happens.

I checked the memory map on another IA64 machine and found regions
that are not aligned to the page size:

  # cat /proc/iomem  | grep System
  ...
  4040000000-40fea09fff : System RAM
  40fea0a000-40fef5ffff : System RAM       // start address isn't page-aligned
  40fef60000-40fef63fff : System RAM

According to this, it seems that such regions can exist normally,
at least on IA64. So, what we should investigate is how the kernel
manages such regions (e.g. [0x40fea0a000, 0x40fea0c000)).
And this is the "kernel-side assumption" you mentioned at first, right?

Since multiple page sizes are supported, I suppose cases like the one
below may happen, so I'll check.


      |-------------------------- PT_LOAD ------------------------------

      | 4k page| 4k page|            16k page            |
  ----+--------+--------+--------------------------------+--------------
      | pfn:N  | pfn:N+1|            pfn:N+2             |  ...
  ----+--------+--------+--------------------------------+--------------
      |        |        |                                |
0x40fea0a000   |   0x40fea0c000                    0x40fea10000
               |
          0x40fea0b000

> Even apart from the IA64 case, the regions that is not page-size
> aligned can occur if some parts of System RAM are converted into other
> types of memory at runtime.
> 
> So, ideally, we should handle page frames that corresponds to start
> and end of each PT_LOAD entries specially, filling the ranges not
> covered by any PT_LOAD entries with 0.
 
If cases like the one I showed above can happen, makedumpfile should be
fixed as you said.


Thanks
Atsushi Kumagai


* Re: makedumpfile: question about memory hole
  2013-03-29  8:13         ` Atsushi Kumagai
@ 2013-05-14  1:55           ` Atsushi Kumagai
  2013-05-15  1:00             ` HATAYAMA Daisuke
  0 siblings, 1 reply; 8+ messages in thread
From: Atsushi Kumagai @ 2013-05-14  1:55 UTC (permalink / raw)
  To: d.hatayama; +Cc: oomichi, kexec

Hello HATAYAMA-san,

Sorry for the delayed response, again...

On Fri, 29 Mar 2013 17:13:11 +0900
Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> wrote:

> Hello HATAYAMA-san,
> 
> Sorry for the delayed response.
> 
> On Tue, 19 Mar 2013 17:47:45 +0900 (JST)
> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> 
> > >> >> What I don't understand well is that the part here:
> > >> >> 
> > >> >>                 pfn_start = paddr_to_pfn(phys_start);
> > >> >>                 pfn_end   = paddr_to_pfn(phys_end);
> > >> >> 
> > >> >>                 if (!is_in_segs(pfn_to_paddr(pfn_start)))
> > >> >>                         pfn_start++;
> > >> >> 
> > >> >> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
> > >> >> frame, so I suspect the pfn_start should be included in vmcore.
> > >> >> 
> > >> >> Looking into kexec-tool side, I don't see additional modification made
> > >> >> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
> > >> >> interface. Is there any assumption about memory holes behind kernel?
> > >> > 
> > >> > Here is a PT_LOAD segment of ia64 machine which I actually use:
> > >> > 
> > >> >   Type           Offset             VirtAddr           PhysAddr
> > >> >                  FileSiz            MemSiz              Flags  Align
> > >> > [...]
> > >> >   LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
> > >> >                  0x000000000005a000 0x000000000005a000  RWE    0
> > >> > 
> > >> > In this case, pfn_to_paddr(pfn_start) is aligned to 0x40ffda4000 
> > >> > because the page size is 16KiB, and this address is out of PT_LOAD
> > >> > segment.
> > >> > 
> > >> >          phys_start
> > >> >          = 0x40ffda5000                        
> > >> >             |------------- PT_LOAD ----------------
> > >> >      ----+----------+----------+----------+--------
> > >> >          |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
> > >> >      ----+----------+----------+----------+--------
> > >> >          |
> > >> >    pfn_to_paddr(pfn:N)
> > >> >    = 0x40ffda4000
> > >> > 
> > >> > The statement you said is for care the case that phys_start isn't aligned
> > >> > with the page size.
> > >> > 
> > >> > BTW, I'll add a comment to explain this intention into here.
> > >> 
> > >> Thanks for the pictorial explanation. It's easy to understand.
> > >> 
> > >> Still I think pfn:N should be included in vmcore. The current
> > >> implementation drops [0x40ffda5000, 0x40ffda8000] that is contained in
> > >> the PT_LOAD. Or, the range must be hole or other kinds of unnecessary
> > >> memory from some kernel-side assumption?
> > > 
> > > Oh, I understand your question correctly now.
> > > 
> > > When Ohmichi-san wrote this code, he thought the page which include
> > > memory hole isn't be used. This came from the fact that the basic
> > > unit of memory management is *page*, but there is no detailed
> > > investigation. 
> > 
> > You mean on at least IA64 case such parts are always holes?
> 
> I showed the IA64 case just to say that the statement can be executed
> actually and it's meaningful code, and this is from my misunderstanding
> of your question.
> Whether such parts are holes or not is another matter, and I haven't
> enough information to decide it now.
>  
> > > 
> > > So, if there is any case where pfn:N is actually used, this statement
> > > should be removed. Maybe, does this question come from an idea of such
> > > cases ?
> > 
> > I'm wondering if such case can actually happens.
> 
> I checked a memory map on another IA64 machine and found the regions
> that not be aligned by page-size:
> 
>   # cat /proc/iomem  | grep System
>   ...
>   4040000000-40fea09fff : System RAM
>   40fea0a000-40fef5ffff : System RAM       // include"pfn:N" 40fea0a000-
>   40fef60000-40fef63fff : System RAM
> 
> According to this, it seems that such regions can be exist normally
> at least on IA64. So, what we should investigate is how does kernel
> manage such regions (e.g. [0x40fea0a000, 0x40fea0c000]).
> And this is the "kernel-side assumption" you said first, right ?

First, the memory map (iomem_resource) is built from the EFI memory map
by efi_initialize_iomem_resources(), and no rounding occurs there.
The EFI page size is 4KB (EFI_PAGE_SHIFT == 12), so it is natural
that some regions aren't aligned to the Linux kernel page size.
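
The mismatch can be checked with simple arithmetic. In the sketch below,
the helper name is hypothetical, and the kernel page shift of 14 is
assumed from the 16KiB page size mentioned earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of (1 << shift), 0 otherwise. */
static int example_is_aligned(uint64_t addr, unsigned int shift)
{
        assert(shift < 64);
        return (addr & ((1ULL << shift) - 1)) == 0;
}
```

The region start 0x40fea0a000 is aligned for a shift of 12 (the EFI page
size) but not for a shift of 14, while 0x40fef60000 is aligned for both.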

Anyway, I found a case where the "pfn:N" mentioned in my previous mail is
actually used on the IA64 machine.

> > >> >             |------------- PT_LOAD ----------------
> > >> >      ----+----------+----------+----------+--------
> > >> >          |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
> > >> >      ----+----------+----------+----------+--------

Here is the machine's /proc/iomem and dmesg:

  # cat /proc/iomem  | grep System
  ...
  4040000000-40fea09fff : System RAM
  40fea0a000-40fef5ffff : System RAM       // start address corresponds to "pfn:N"
  40fef60000-40fef63fff : System RAM

  # dmesg
  ...
  rsvd_region[0]: [0xe000000001000000, 0xe0000000010000a8)
  rsvd_region[1]: [0xe000000004000000, 0xe000000004e94e68)
  rsvd_region[2]: [0xe0000040fea0a010, 0xe0000040fea0a060)  // stored in "pfn:N"
  rsvd_region[3]: [0xe0000040fea0dfd8, 0xe0000040fea0e010)
  rsvd_region[4]: [0xe0000040fea10000, 0xe0000040fef5fc79)
  rsvd_region[5]: [0xe0000040fefd0010, 0xe0000040fefd0790)
  rsvd_region[6]: [0xffffffffffffffff, 0xffffffffffffffff)
  
  // these are virtual addresses, __pa(0xe0000040fea0a010) = 0x40fea0a010

According to reserve_memory(), rsvd_region[2] is used to save
ia64_boot_param->command_line. This means that "pfn:N" can
contain valid data, so we shouldn't exclude it as a hole.
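
The address arithmetic behind this can be sketched as follows. The
helpers are hypothetical: the offset constant is an assumption chosen to
match the __pa() conversion noted above, and the page shift of 14 is
assumed from the 16KiB page size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: offset that maps 0xe0000040fea0a010 to 0x40fea0a010. */
#define EXAMPLE_KERNEL_OFFSET 0xe000000000000000ULL
/* Assumption: 16 KiB kernel pages. */
#define EXAMPLE_KERNEL_PAGE_SHIFT 14

/* Simplified stand-in for __pa() on addresses in this region. */
static uint64_t example_pa(uint64_t vaddr)
{
        assert(vaddr >= EXAMPLE_KERNEL_OFFSET);
        return vaddr - EXAMPLE_KERNEL_OFFSET;
}

/* Base physical address of the page frame that contains paddr. */
static uint64_t example_frame_base(uint64_t paddr)
{
        return paddr & ~((1ULL << EXAMPLE_KERNEL_PAGE_SHIFT) - 1);
}
```

Both the start of rsvd_region[2] and the unaligned System RAM start
0x40fea0a000 fall in the frame whose base is 0x40fea08000, which is the
frame that the pfn_start++ statement would skip.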

Thank you for pointing out this issue. I'll fix it.


Atsushi Kumagai


* Re: makedumpfile: question about memory hole
  2013-05-14  1:55           ` Atsushi Kumagai
@ 2013-05-15  1:00             ` HATAYAMA Daisuke
  0 siblings, 0 replies; 8+ messages in thread
From: HATAYAMA Daisuke @ 2013-05-15  1:00 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: oomichi, kexec

(2013/05/14 10:55), Atsushi Kumagai wrote:
> Hello HATAYAMA-san,
> 
> Sorry for the delayed response, again...
> 
> On Fri, 29 Mar 2013 17:13:11 +0900
> Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> wrote:
> 
>> Hello HATAYAMA-san,
>>
>> Sorry for the delayed response.
>>
>> On Tue, 19 Mar 2013 17:47:45 +0900 (JST)
>> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
>>
>>>>>>> What I don't understand well is that the part here:
>>>>>>>
>>>>>>>                  pfn_start = paddr_to_pfn(phys_start);
>>>>>>>                  pfn_end   = paddr_to_pfn(phys_end);
>>>>>>>
>>>>>>>                  if (!is_in_segs(pfn_to_paddr(pfn_start)))
>>>>>>>                          pfn_start++;
>>>>>>>
>>>>>>> phys_start and pfn_to_paddr(pfn_start) should belong to the same page
>>>>>>> frame, so I suspect the pfn_start should be included in vmcore.
>>>>>>>
>>>>>>> Looking into kexec-tool side, I don't see additional modification made
>>>>>>> to phys_start after it's parsed from /proc/iomem or counterpart on EFI
>>>>>>> interface. Is there any assumption about memory holes behind kernel?
>>>>>>
>>>>>> Here is a PT_LOAD segment of ia64 machine which I actually use:
>>>>>>
>>>>>>    Type           Offset             VirtAddr           PhysAddr
>>>>>>                   FileSiz            MemSiz              Flags  Align
>>>>>> [...]
>>>>>>    LOAD           0x000000015fd0b490 0xe0000040ffda5000 0x00000040ffda5000
>>>>>>                   0x000000000005a000 0x000000000005a000  RWE    0
>>>>>>
>>>>>> In this case, pfn_to_paddr(pfn_start) is aligned to 0x40ffda4000
>>>>>> because the page size is 16KiB, and this address is out of PT_LOAD
>>>>>> segment.
>>>>>>
>>>>>>           phys_start
>>>>>>           = 0x40ffda5000
>>>>>>              |------------- PT_LOAD ----------------
>>>>>>       ----+----------+----------+----------+--------
>>>>>>           |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
>>>>>>       ----+----------+----------+----------+--------
>>>>>>           |
>>>>>>     pfn_to_paddr(pfn:N)
>>>>>>     = 0x40ffda4000
>>>>>>
>>>>>> The statement you said is for care the case that phys_start isn't aligned
>>>>>> with the page size.
>>>>>>
>>>>>> BTW, I'll add a comment to explain this intention into here.
>>>>>
>>>>> Thanks for the pictorial explanation. It's easy to understand.
>>>>>
>>>>> Still I think pfn:N should be included in vmcore. The current
>>>>> implementation drops [0x40ffda5000, 0x40ffda8000] that is contained in
>>>>> the PT_LOAD. Or, the range must be hole or other kinds of unnecessary
>>>>> memory from some kernel-side assumption?
>>>>
>>>> Oh, I understand your question correctly now.
>>>>
>>>> When Ohmichi-san wrote this code, he thought the page which include
>>>> memory hole isn't be used. This came from the fact that the basic
>>>> unit of memory management is *page*, but there is no detailed
>>>> investigation.
>>>
>>> You mean on at least IA64 case such parts are always holes?
>>
>> I showed the IA64 case just to say that the statement can be executed
>> actually and it's meaningful code, and this is from my misunderstanding
>> of your question.
>> Whether such parts are holes or not is another matter, and I haven't
>> enough information to decide it now.
>>   
>>>>
>>>> So, if there is any case where pfn:N is actually used, this statement
>>>> should be removed. Maybe, does this question come from an idea of such
>>>> cases ?
>>>
>>> I'm wondering if such case can actually happens.
>>
>> I checked a memory map on another IA64 machine and found the regions
>> that not be aligned by page-size:
>>
>>    # cat /proc/iomem  | grep System
>>    ...
>>    4040000000-40fea09fff : System RAM
>>    40fea0a000-40fef5ffff : System RAM       // include"pfn:N" 40fea0a000-
>>    40fef60000-40fef63fff : System RAM
>>
>> According to this, it seems that such regions can be exist normally
>> at least on IA64. So, what we should investigate is how does kernel
>> manage such regions (e.g. [0x40fea0a000, 0x40fea0c000]).
>> And this is the "kernel-side assumption" you said first, right ?
> 
> First, the memory map(iomem_resource) is made from EFI memory map
> with efi_initialize_iomem_resources(), then no rounding occurs.
> And EFI page size is 4KB(EFI_PAGE_SHIFT == 12), so it is natural
> that some regions aren't aligned by linux kernel page size.
> 
> Anyway, I found the case that "pfn:N" mentioned in previous mail was
> actually used on the IA64 machine.
> 
>>>>>>              |------------- PT_LOAD ----------------
>>>>>>       ----+----------+----------+----------+--------
>>>>>>           |   pfn:N  |  pfn:N+1 | pfn:N+2  |  ...
>>>>>>       ----+----------+----------+----------+--------
> 
> Here is the machine's /proc/iomem and dmesg:
> 
>    # cat /proc/iomem  | grep System
>    ...
>    4040000000-40fea09fff : System RAM
>    40fea0a000-40fef5ffff : System RAM       // start address corresponds to "pfn:N"
>    40fef60000-40fef63fff : System RAM
> 
>    # dmesg
>    ...
>    rsvd_region[0]: [0xe000000001000000, 0xe0000000010000a8)
>    rsvd_region[1]: [0xe000000004000000, 0xe000000004e94e68)
>    rsvd_region[2]: [0xe0000040fea0a010, 0xe0000040fea0a060)  // stored in "pfn:N"
>    rsvd_region[3]: [0xe0000040fea0dfd8, 0xe0000040fea0e010)
>    rsvd_region[4]: [0xe0000040fea10000, 0xe0000040fef5fc79)
>    rsvd_region[5]: [0xe0000040fefd0010, 0xe0000040fefd0790)
>    rsvd_region[6]: [0xffffffffffffffff, 0xffffffffffffffff)
>    
>    // these are virtual addresses, __pa(0xe0000040fea0a010) = 0x40fea0a010
> 
> According to reserve_memory(), rsvd_region[2] is used to save
> ia64_boot_param->command_line. This means that "pfn:N" can
> include valid dates, we shouldn't remove it as holes.
> 
> Thank you for pointing out this issue, I'll fix it.

Thanks for your investigation. It's now very clear to me what's happening there.

-- 
Thanks.
HATAYAMA, Daisuke

