linux-mm.kvack.org archive mirror
* [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
@ 2019-05-14  2:56 Aneesh Kumar K.V
  2019-05-14  3:58 ` Dan Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2019-05-14  2:56 UTC (permalink / raw)
  To: dan.j.williams; +Cc: linux-nvdimm, linux-mm, linuxppc-dev, Aneesh Kumar K.V

The npfns-related change is needed to fix the kernel message

"number of pfns truncated from 2617344 to 163584"

The change makes sure the npfns value stored in the superblock is correct.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 drivers/nvdimm/pfn_devs.c    | 6 +++---
 drivers/nvdimm/region_devs.c | 8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 347cab166376..6751ff0296ef 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -777,8 +777,8 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 		 * when populating the vmemmap. This *should* be equal to
 		 * PMD_SIZE for most architectures.
 		 */
-		offset = ALIGN(start + reserve + 64 * npfns,
-				max(nd_pfn->align, PMD_SIZE)) - start;
+		offset = ALIGN(start + reserve + sizeof(struct page) * npfns,
+			       max(nd_pfn->align, PMD_SIZE)) - start;
 	} else if (nd_pfn->mode == PFN_MODE_RAM)
 		offset = ALIGN(start + reserve, nd_pfn->align) - start;
 	else
@@ -790,7 +790,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 		return -ENXIO;
 	}
 
-	npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
+	npfns = (size - offset - start_pad - end_trunc) / PAGE_SIZE;
 	pfn_sb->mode = cpu_to_le32(nd_pfn->mode);
 	pfn_sb->dataoff = cpu_to_le64(offset);
 	pfn_sb->npfns = cpu_to_le64(npfns);
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index b4ef7d9ff22e..2d8facea5a03 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -994,10 +994,10 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
 		struct nd_mapping_desc *mapping = &ndr_desc->mapping[i];
 		struct nvdimm *nvdimm = mapping->nvdimm;
 
-		if ((mapping->start | mapping->size) % SZ_4K) {
-			dev_err(&nvdimm_bus->dev, "%s: %s mapping%d is not 4K aligned\n",
-					caller, dev_name(&nvdimm->dev), i);
-
+		if ((mapping->start | mapping->size) % PAGE_SIZE) {
+			dev_err(&nvdimm_bus->dev,
+				"%s: %s mapping%d is not %ld aligned\n",
+				caller, dev_name(&nvdimm->dev), i, PAGE_SIZE);
 			return NULL;
 		}
 
-- 
2.21.0
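
For reference, the two figures in the kernel message quoted in the
changelog differ by exactly a factor of 16. That is what one would
expect if the kernel runs with a 64K PAGE_SIZE while npfns is computed
with SZ_4K; the 64K value is an assumption here, since the changelog
does not state the page size. A minimal userspace sketch of the
arithmetic, with hypothetical names:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's SZ_4K and a 64K PAGE_SIZE. */
#define SKETCH_SZ_4K	UINT64_C(4096)
#define SKETCH_SZ_64K	UINT64_C(65536)

/* Number of page frames needed to cover `bytes` at a given page size. */
static uint64_t sketch_npfns(uint64_t bytes, uint64_t page_size)
{
	return bytes / page_size;
}

/* Usable capacity implied by the smaller figure, assuming 64K pages. */
static uint64_t sketch_capacity(void)
{
	return UINT64_C(163584) * SKETCH_SZ_64K;
}
```

Dividing that capacity by 4096 gives 2617344 and dividing it by 65536
gives 163584, the two values in the message.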



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-14  2:56 [PATCH] mm/nvdimm: Use correct #defines instead of opencoding Aneesh Kumar K.V
@ 2019-05-14  3:58 ` Dan Williams
  2019-05-14  4:05   ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2019-05-14  3:58 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

On Mon, May 13, 2019 at 7:56 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> The npfns-related change is needed to fix the kernel message
>
> "number of pfns truncated from 2617344 to 163584"
>
> The change makes sure the npfns value stored in the superblock is correct.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  drivers/nvdimm/pfn_devs.c    | 6 +++---
>  drivers/nvdimm/region_devs.c | 8 ++++----
>  2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> index 347cab166376..6751ff0296ef 100644
> --- a/drivers/nvdimm/pfn_devs.c
> +++ b/drivers/nvdimm/pfn_devs.c
> @@ -777,8 +777,8 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>                  * when populating the vmemmap. This *should* be equal to
>                  * PMD_SIZE for most architectures.
>                  */
> -               offset = ALIGN(start + reserve + 64 * npfns,
> -                               max(nd_pfn->align, PMD_SIZE)) - start;
> +               offset = ALIGN(start + reserve + sizeof(struct page) * npfns,
> +                              max(nd_pfn->align, PMD_SIZE)) - start;

No, I think we need to record the page-size into the superblock format
otherwise this breaks in debug builds where the struct-page size is
extended.

>         } else if (nd_pfn->mode == PFN_MODE_RAM)
>                 offset = ALIGN(start + reserve, nd_pfn->align) - start;
>         else
> @@ -790,7 +790,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>                 return -ENXIO;
>         }
>
> -       npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
> +       npfns = (size - offset - start_pad - end_trunc) / PAGE_SIZE;

Similar comment, if the page size is variable then the superblock
needs to explicitly account for it.



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-14  3:58 ` Dan Williams
@ 2019-05-14  4:05   ` Aneesh Kumar K.V
  2019-05-14  4:12     ` Dan Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2019-05-14  4:05 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

On 5/14/19 9:28 AM, Dan Williams wrote:
> On Mon, May 13, 2019 at 7:56 PM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> The npfns-related change is needed to fix the kernel message
>>
>> "number of pfns truncated from 2617344 to 163584"
>>
>> The change makes sure the npfns value stored in the superblock is correct.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   drivers/nvdimm/pfn_devs.c    | 6 +++---
>>   drivers/nvdimm/region_devs.c | 8 ++++----
>>   2 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
>> index 347cab166376..6751ff0296ef 100644
>> --- a/drivers/nvdimm/pfn_devs.c
>> +++ b/drivers/nvdimm/pfn_devs.c
>> @@ -777,8 +777,8 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>>                   * when populating the vmemmap. This *should* be equal to
>>                   * PMD_SIZE for most architectures.
>>                   */
>> -               offset = ALIGN(start + reserve + 64 * npfns,
>> -                               max(nd_pfn->align, PMD_SIZE)) - start;
>> +               offset = ALIGN(start + reserve + sizeof(struct page) * npfns,
>> +                              max(nd_pfn->align, PMD_SIZE)) - start;
> 
> No, I think we need to record the page-size into the superblock format
> otherwise this breaks in debug builds where the struct-page size is
> extended.
> 
>>          } else if (nd_pfn->mode == PFN_MODE_RAM)
>>                  offset = ALIGN(start + reserve, nd_pfn->align) - start;
>>          else
>> @@ -790,7 +790,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>>                  return -ENXIO;
>>          }
>>
>> -       npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
>> +       npfns = (size - offset - start_pad - end_trunc) / PAGE_SIZE;
> 
> Similar comment, if the page size is variable then the superblock
> needs to explicitly account for it.
> 

PAGE_SIZE is not really variable. What we can run into is the issue you 
mentioned above. The size of struct page can change which means the 
reserved space for keeping vmemmap in device may not be sufficient for 
certain kernel builds.

I was planning to add another patch that fails namespace init if we 
don't have enough space to keep the struct page.

Why do you suggest we need to have PAGE_SIZE as part of pfn superblock?

-aneesh
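
The follow-up check mentioned above (failing namespace init when the
reserved area cannot hold the struct page array) could look roughly
like the following. This is a userspace sketch only; the function name
and its parameters are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when `reserved_bytes` can hold one struct page of
 * `struct_page_size` bytes for each of `npfns` page frames.
 */
static bool sketch_vmemmap_fits(uint64_t reserved_bytes, uint64_t npfns,
				uint64_t struct_page_size)
{
	/* Reject inputs whose product would overflow 64 bits. */
	if (struct_page_size != 0 && npfns > UINT64_MAX / struct_page_size)
		return false;

	return npfns * struct_page_size <= reserved_bytes;
}
```

A reservation sized for a 64-byte struct page passes with 64 and fails
once a debug build grows struct page to, say, 80 bytes.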



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-14  4:05   ` Aneesh Kumar K.V
@ 2019-05-14  4:12     ` Dan Williams
  2019-05-14  4:46       ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2019-05-14  4:12 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

On Mon, May 13, 2019 at 9:05 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> On 5/14/19 9:28 AM, Dan Williams wrote:
> > On Mon, May 13, 2019 at 7:56 PM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> The npfns-related change is needed to fix the kernel message
> >>
> >> "number of pfns truncated from 2617344 to 163584"
> >>
> >> The change makes sure the npfns value stored in the superblock is correct.
> >>
> >> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> >> ---
> >>   drivers/nvdimm/pfn_devs.c    | 6 +++---
> >>   drivers/nvdimm/region_devs.c | 8 ++++----
> >>   2 files changed, 7 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> >> index 347cab166376..6751ff0296ef 100644
> >> --- a/drivers/nvdimm/pfn_devs.c
> >> +++ b/drivers/nvdimm/pfn_devs.c
> >> @@ -777,8 +777,8 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
> >>                   * when populating the vmemmap. This *should* be equal to
> >>                   * PMD_SIZE for most architectures.
> >>                   */
> >> -               offset = ALIGN(start + reserve + 64 * npfns,
> >> -                               max(nd_pfn->align, PMD_SIZE)) - start;
> >> +               offset = ALIGN(start + reserve + sizeof(struct page) * npfns,
> >> +                              max(nd_pfn->align, PMD_SIZE)) - start;
> >
> > No, I think we need to record the page-size into the superblock format
> > otherwise this breaks in debug builds where the struct-page size is
> > extended.
> >
> >>          } else if (nd_pfn->mode == PFN_MODE_RAM)
> >>                  offset = ALIGN(start + reserve, nd_pfn->align) - start;
> >>          else
> >> @@ -790,7 +790,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
> >>                  return -ENXIO;
> >>          }
> >>
> >> -       npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
> >> +       npfns = (size - offset - start_pad - end_trunc) / PAGE_SIZE;
> >
> > Similar comment, if the page size is variable then the superblock
> > needs to explicitly account for it.
> >
>
> PAGE_SIZE is not really variable. What we can run into is the issue you
> mentioned above. The size of struct page can change which means the
> reserved space for keeping vmemmap in device may not be sufficient for
> certain kernel builds.
>
> I was planning to add another patch that fails namespace init if we
> don't have enough space to keep the struct page.
>
> Why do you suggest we need to have PAGE_SIZE as part of pfn superblock?

So that the kernel has a chance to identify cases where the superblock
it is handling was created on a system with different PAGE_SIZE
assumptions.



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-14  4:12     ` Dan Williams
@ 2019-05-14  4:46       ` Aneesh Kumar K.V
  2019-05-21  7:47         ` Dan Williams
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2019-05-14  4:46 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

On 5/14/19 9:42 AM, Dan Williams wrote:
> On Mon, May 13, 2019 at 9:05 PM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> PAGE_SIZE is not really variable. What we can run into is the issue you
>> mentioned above. The size of struct page can change which means the
>> reserved space for keeping vmemmap in device may not be sufficient for
>> certain kernel builds.
>>
>> I was planning to add another patch that fails namespace init if we
>> don't have enough space to keep the struct page.
>>
>> Why do you suggest we need to have PAGE_SIZE as part of pfn superblock?
> 
> So that the kernel has a chance to identify cases where the superblock
> it is handling was created on a system with different PAGE_SIZE
> assumptions.
> 

Is the reason for doing that that we may not have enough space to hold
the struct pages backing the total number of pfns? If so, what I
suggested above should handle that.

Or do you see any other reason why we should fail a namespace init
with a different PAGE_SIZE value?

Another patch of mine handles the devdax alignment details for the case
where a devdax namespace was created with a 4K PAGE_SIZE and we are now
trying to load it in a kernel with a 64K PAGE_SIZE.

-aneesh



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-14  4:46       ` Aneesh Kumar K.V
@ 2019-05-21  7:47         ` Dan Williams
  2019-05-21  9:50           ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Dan Williams @ 2019-05-21  7:47 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

On Mon, May 13, 2019 at 9:46 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> On 5/14/19 9:42 AM, Dan Williams wrote:
> > On Mon, May 13, 2019 at 9:05 PM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> PAGE_SIZE is not really variable. What we can run into is the issue you
> >> mentioned above. The size of struct page can change which means the
> >> reserved space for keeping vmemmap in device may not be sufficient for
> >> certain kernel builds.
> >>
> >> I was planning to add another patch that fails namespace init if we
> >> don't have enough space to keep the struct page.
> >>
> >> Why do you suggest we need to have PAGE_SIZE as part of pfn superblock?
> >
> > So that the kernel has a chance to identify cases where the superblock
> > it is handling was created on a system with different PAGE_SIZE
> > assumptions.
> >
>
> The reason to do that is we don't have enough space to keep struct page
> backing the total number of pfns? If so, what i suggested above should
> handle that.
>
> or are you finding any other reason why we should fail a namespace init
> with a different PAGE_SIZE value?

I want the kernel to be able to start understanding cross-architecture
and cross-configuration geometries. To me that means incrementing the
info-block version and recording PAGE_SIZE and sizeof(struct page) in
the info-block directly.

> My another patch handle the details w.r.t devdax alignment for which
> devdax got created with PAGE_SIZE 4K but we are now trying to load that
> in a kernel with PAGE_SIZE 64k.

Sure, but what about the reverse? These info-block format assumptions
are as fundamental as the byte-order of the info-block; the format needs
to be cross-arch compatible, and the x86 assumptions need to be fully
lifted.
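
The byte-order comparison can be made concrete. The info-block fields
are written through cpu_to_le32()/cpu_to_le64(), so the on-media bytes
are the same whatever the host endianness. A userspace equivalent of
that idea, with hypothetical helper names, might look like this:

```c
#include <stdint.h>

/* Store a 32-bit value as little-endian bytes, whatever the host order. */
static void sketch_put_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read a 32-bit little-endian value back into host order. */
static uint32_t sketch_get_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

The argument in this reply is that PAGE_SIZE and sizeof(struct page)
deserve the same treatment: recorded explicitly in the info-block
instead of being implied by the architecture that created it.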



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-21  7:47         ` Dan Williams
@ 2019-05-21  9:50           ` Aneesh Kumar K.V
  2019-05-21 14:49             ` Elliott, Robert (Servers)
  2019-05-21 16:07             ` Dan Williams
  0 siblings, 2 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2019-05-21  9:50 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

Dan Williams <dan.j.williams@intel.com> writes:

> On Mon, May 13, 2019 at 9:46 PM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 5/14/19 9:42 AM, Dan Williams wrote:
>> > On Mon, May 13, 2019 at 9:05 PM Aneesh Kumar K.V
>> > <aneesh.kumar@linux.ibm.com> wrote:
>> >>
>> >> PAGE_SIZE is not really variable. What we can run into is the issue you
>> >> mentioned above. The size of struct page can change which means the
>> >> reserved space for keeping vmemmap in device may not be sufficient for
>> >> certain kernel builds.
>> >>
>> >> I was planning to add another patch that fails namespace init if we
>> >> don't have enough space to keep the struct page.
>> >>
>> >> Why do you suggest we need to have PAGE_SIZE as part of pfn superblock?
>> >
>> > So that the kernel has a chance to identify cases where the superblock
>> > it is handling was created on a system with different PAGE_SIZE
>> > assumptions.
>> >
>>
>> The reason to do that is we don't have enough space to keep struct page
>> backing the total number of pfns? If so, what i suggested above should
>> handle that.
>>
>> or are you finding any other reason why we should fail a namespace init
>> with a different PAGE_SIZE value?
>
> I want the kernel to be able to start understand cross-architecture
> and cross-configuration geometries. Which to me means incrementing the
> info-block version and recording PAGE_SIZE and sizeof(struct page) in
> the info-block directly.
>
>> My another patch handle the details w.r.t devdax alignment for which
>> devdax got created with PAGE_SIZE 4K but we are now trying to load that
>> in a kernel with PAGE_SIZE 64k.
>
> Sure, but what about the reverse? These info-block format assumptions
> are as fundamental as the byte-order of the info-block, it needs to be
> cross-arch compatible and the x86 assumptions need to be fully lifted.

Something like the below (not tested). I am not sure what value we should
initialize page_size to for minor version < 3. This will mark the
namespace disabled if PAGE_SIZE and sizeof(struct page) don't match the
values used when the namespace was created.

diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index dde9853453d3..d6e0933d0dd4 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -36,6 +36,9 @@ struct nd_pfn_sb {
 	__le32 end_trunc;
 	/* minor-version-2 record the base alignment of the mapping */
 	__le32 align;
+	/* minor-version-3 record the page size and struct page size */
+	__le32 page_size;
+	__le32 page_struct_size;
 	u8 padding[4000];
 	__le64 checksum;
 };
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 6f9f78858018..bbc1d792d7f3 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -477,6 +477,15 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
 	if (__le16_to_cpu(pfn_sb->version_minor) < 2)
 		pfn_sb->align = 0;
 
+	if (__le16_to_cpu(pfn_sb->version_minor) < 3) {
+		/*
+		 * For a large part we use PAGE_SIZE. But we
+		 * do have some accounting code using SZ_4K.
+		 */
+		pfn_sb->page_size = cpu_to_le32(PAGE_SIZE);
+		pfn_sb->page_struct_size = cpu_to_le32(64);
+	}
+
 	switch (le32_to_cpu(pfn_sb->mode)) {
 	case PFN_MODE_RAM:
 	case PFN_MODE_PMEM:
@@ -504,6 +513,12 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
 		return -EOPNOTSUPP;
 	}
 
+	if (le32_to_cpu(pfn_sb->page_size) != PAGE_SIZE)
+		return -EOPNOTSUPP;
+
+	if (le32_to_cpu(pfn_sb->page_struct_size) != sizeof(struct page))
+		return -EOPNOTSUPP;
+
 	if (!nd_pfn->uuid) {
 		/*
 		 * When probing a namepace via nd_pfn_probe() the uuid
@@ -798,7 +813,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	memcpy(pfn_sb->uuid, nd_pfn->uuid, 16);
 	memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
 	pfn_sb->version_major = cpu_to_le16(1);
-	pfn_sb->version_minor = cpu_to_le16(2);
+	pfn_sb->version_minor = cpu_to_le16(3);
 	pfn_sb->start_pad = cpu_to_le32(start_pad);
 	pfn_sb->end_trunc = cpu_to_le32(end_trunc);
 	pfn_sb->align = cpu_to_le32(nd_pfn->align);



* RE: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-21  9:50           ` Aneesh Kumar K.V
@ 2019-05-21 14:49             ` Elliott, Robert (Servers)
  2019-05-21 16:07             ` Dan Williams
  1 sibling, 0 replies; 10+ messages in thread
From: Elliott, Robert (Servers) @ 2019-05-21 14:49 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Dan Williams; +Cc: Linux MM, linuxppc-dev, linux-nvdimm



> -----Original Message-----
> From: Linux-nvdimm <linux-nvdimm-bounces@lists.01.org> On Behalf Of
> Aneesh Kumar K.V
> Sent: Tuesday, May 21, 2019 4:51 AM
> Subject: Re: [PATCH] mm/nvdimm: Use correct #defines instead of
> opencoding
> 
...
> @@ -36,6 +36,9 @@ struct nd_pfn_sb {
>  	__le32 end_trunc;
>  	/* minor-version-2 record the base alignment of the mapping */
>  	__le32 align;
> +	/* minor-version-3 record the page size and struct page size
> */
> +	__le32 page_size;
> +	__le32 page_struct_size;
>  	u8 padding[4000];
>  	__le64 checksum;
>  };

You might need to reduce the padding size to offset the extra added
fields.
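
To illustrate the padding point: if the info-block is meant to stay at
a fixed 4096 bytes (an assumption here, inferred from the 4000-byte
padding), then adding two 4-byte fields means shrinking the padding by
8. The sketch below is a userspace mirror of the layout using
fixed-width types. The fields ahead of end_trunc are not shown in the
quoted hunk, so the ones marked "assumed" are a guess at the driver
header and may not match it.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace mirror of the info-block; fields marked "assumed" are guesses. */
struct sketch_pfn_sb {
	uint8_t  signature[16];		/* assumed */
	uint8_t  uuid[16];
	uint8_t  parent_uuid[16];
	uint32_t flags;			/* assumed */
	uint16_t version_major;
	uint16_t version_minor;
	uint64_t dataoff;
	uint64_t npfns;
	uint32_t mode;
	uint32_t start_pad;
	uint32_t end_trunc;
	uint32_t align;
	uint32_t page_size;		/* new in the proposed minor version 3 */
	uint32_t page_struct_size;	/* new in the proposed minor version 3 */
	uint8_t  padding[4000 - 8];	/* shrunk by the 8 bytes added above */
	uint64_t checksum;
};

/* Compile-time guard: the on-media size must not change. */
static_assert(sizeof(struct sketch_pfn_sb) == 4096,
	      "info-block must remain 4096 bytes");
```

Under these layout assumptions, leaving the padding at 4000 would grow
the struct to 4104 bytes and the assertion would fail at build time.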




* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-21  9:50           ` Aneesh Kumar K.V
  2019-05-21 14:49             ` Elliott, Robert (Servers)
@ 2019-05-21 16:07             ` Dan Williams
  2019-05-22  5:41               ` Aneesh Kumar K.V
  1 sibling, 1 reply; 10+ messages in thread
From: Dan Williams @ 2019-05-21 16:07 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linux-nvdimm, Linux MM, linuxppc-dev

On Tue, May 21, 2019 at 2:51 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
> > On Mon, May 13, 2019 at 9:46 PM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> On 5/14/19 9:42 AM, Dan Williams wrote:
> >> > On Mon, May 13, 2019 at 9:05 PM Aneesh Kumar K.V
> >> > <aneesh.kumar@linux.ibm.com> wrote:
> >> >>
> >> >> PAGE_SIZE is not really variable. What we can run into is the issue you
> >> >> mentioned above. The size of struct page can change which means the
> >> >> reserved space for keeping vmemmap in device may not be sufficient for
> >> >> certain kernel builds.
> >> >>
> >> >> I was planning to add another patch that fails namespace init if we
> >> >> don't have enough space to keep the struct page.
> >> >>
> >> >> Why do you suggest we need to have PAGE_SIZE as part of pfn superblock?
> >> >
> >> > So that the kernel has a chance to identify cases where the superblock
> >> > it is handling was created on a system with different PAGE_SIZE
> >> > assumptions.
> >> >
> >>
> >> The reason to do that is we don't have enough space to keep struct page
> >> backing the total number of pfns? If so, what i suggested above should
> >> handle that.
> >>
> >> or are you finding any other reason why we should fail a namespace init
> >> with a different PAGE_SIZE value?
> >
> > I want the kernel to be able to start understand cross-architecture
> > and cross-configuration geometries. Which to me means incrementing the
> > info-block version and recording PAGE_SIZE and sizeof(struct page) in
> > the info-block directly.
> >
> >> My another patch handle the details w.r.t devdax alignment for which
> >> devdax got created with PAGE_SIZE 4K but we are now trying to load that
> >> in a kernel with PAGE_SIZE 64k.
> >
> > Sure, but what about the reverse? These info-block format assumptions
> > are as fundamental as the byte-order of the info-block, it needs to be
> > cross-arch compatible and the x86 assumptions need to be fully lifted.
>
> Something like the below (Not tested). I am not sure what we will init the page_size
> for minor version < 3. This will mark the namespace disabled if the
> PAGE_SIZE and sizeof(struct page) doesn't match with the values used
> during namespace create.

Yes, this is on the right track.

I would special-case page_size == 0 as 4096 and page_struct_size == 0
as 64. If either of those is non-zero then the info-block version
needs to be revved and it needs to be crafted to make older kernels
fail to parse it.

There was an earlier attempt to implement minimum info-block versions here:

https://lore.kernel.org/lkml/155000670159.348031.17631616775326330606.stgit@dwillia2-desk3.amr.corp.intel.com/

...but that was dropped in favor of the "sub-section" patches.
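
The zero special-casing suggested above might be sketched as follows.
This is a userspace illustration with hypothetical names. It assumes
that a recorded value of zero means the info-block predates the new
fields, and it takes the running kernel's values as parameters instead
of reading PAGE_SIZE and sizeof(struct page) directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Legacy defaults named in the reply: 4096-byte pages, 64-byte struct page. */
#define SKETCH_LEGACY_PAGE_SIZE		4096u
#define SKETCH_LEGACY_PAGE_STRUCT_SIZE	64u

struct sketch_geometry {
	uint32_t page_size;
	uint32_t page_struct_size;
};

/* Map zero (field never written) to the legacy defaults. */
static struct sketch_geometry
sketch_effective_geometry(uint32_t page_size, uint32_t page_struct_size)
{
	struct sketch_geometry g = {
		.page_size = page_size ? page_size : SKETCH_LEGACY_PAGE_SIZE,
		.page_struct_size = page_struct_size ?
			page_struct_size : SKETCH_LEGACY_PAGE_STRUCT_SIZE,
	};

	return g;
}

/* Return true when the recorded geometry matches the running configuration. */
static bool sketch_geometry_ok(struct sketch_geometry recorded,
			       uint32_t running_page_size,
			       uint32_t running_page_struct_size)
{
	return recorded.page_size == running_page_size &&
	       recorded.page_struct_size == running_page_struct_size;
}
```

With both fields zero, the effective geometry is 4096/64, which matches
a 4K-page kernel with a 64-byte struct page and is rejected on a
64K-page kernel.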



* Re: [PATCH] mm/nvdimm: Use correct #defines instead of opencoding
  2019-05-21 16:07             ` Dan Williams
@ 2019-05-22  5:41               ` Aneesh Kumar K.V
  0 siblings, 0 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2019-05-22  5:41 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-nvdimm, Linux MM, linuxppc-dev, Oliver O'Halloran

On 5/21/19 9:37 PM, Dan Williams wrote:
> On Tue, May 21, 2019 at 2:51 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:


....

>>
>> Something like the below (Not tested). I am not sure what we will init the page_size
>> for minor version < 3. This will mark the namespace disabled if the
>> PAGE_SIZE and sizeof(struct page) doesn't match with the values used
>> during namespace create.
> 
> Yes, this is on the right track.
> 
> I would special-case page_size == 0 as 4096 and page_struct_size == 0
> as 64. If either of those is non-zero then the info-block version
> needs to be revved and it needs to be crafted to make older kernels
> fail to parse it.
> 

page_size = SZ_4K implies we would fail to enable namespaces created on
ppc64 until now. We do work fine with page_size = PAGE_SIZE. Only a few
error checks and pfn_sb->npfns got wrong values. We did reserve the
correct space for the required pfns even when we recorded a wrong
pfn_sb->npfns.


> There was an earlier attempt to implement minimum info-block versions here:
> 
> https://lore.kernel.org/lkml/155000670159.348031.17631616775326330606.stgit@dwillia2-desk3.amr.corp.intel.com/
> 
> ...but that was dropped in favor of the "sub-section" patches.
> 

OK, I will pick that up too.

-aneesh

