* [PATCH] mm/sparse.c: fix error path in sparse_add_one_section
@ 2018-07-06 19:06 ` Ross Zwisler
0 siblings, 0 replies; 19+ messages in thread
From: Ross Zwisler @ 2018-07-06 19:06 UTC (permalink / raw)
To: pasha.tatashin, linux-nvdimm
Cc: Ross Zwisler, osalvador, bhe, Dave Hansen, LKML, Linux MM,
Michal Hocko, Vlastimil Babka, Andrew Morton, Kirill A. Shutemov,
osalvador
The following commit in -next:
commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
remove check")
changed how the error handling in sparse_add_one_section() works.
Previously sparse_index_init() could return -EEXIST, and the function would
continue on happily. 'ret' would get unconditionally overwritten by the
result from sparse_init_one_section() and the error code after the 'out:'
label wouldn't be triggered.
With the above referenced commit, though, an -EEXIST error return from
sparse_index_init() now takes us through the function and into the error
case after 'out:'. This eventually causes a kernel BUG, probably because
we've just freed a memory section that we successfully set up and marked as
present:
BUG: unable to handle kernel paging request at ffffea0005000080
RIP: 0010:memmap_init_zone+0x154/0x1cf
Call Trace:
move_pfn_range_to_zone+0x168/0x180
devm_memremap_pages+0x29b/0x480
pmem_attach_disk+0x1ae/0x6c0 [nd_pmem]
? devm_memremap+0x79/0xb0
nd_pmem_probe+0x7e/0xa0 [nd_pmem]
nvdimm_bus_probe+0x6e/0x160 [libnvdimm]
driver_probe_device+0x310/0x480
__device_attach_driver+0x86/0x100
? __driver_attach+0x110/0x110
bus_for_each_drv+0x6e/0xb0
__device_attach+0xe2/0x160
device_initial_probe+0x13/0x20
bus_probe_device+0xa6/0xc0
device_add+0x41b/0x660
? lock_acquire+0xa3/0x210
nd_async_device_register+0x12/0x40 [libnvdimm]
async_run_entry_fn+0x3e/0x170
process_one_work+0x230/0x680
worker_thread+0x3f/0x3b0
kthread+0x12f/0x150
? process_one_work+0x680/0x680
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x3a/0x50
Fix this by clearing 'ret' back to 0 if sparse_index_init() returns
-EEXIST. This restores the previous behavior.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
mm/sparse.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/sparse.c b/mm/sparse.c
index 9574113fc745..d254bd2d3289 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -753,8 +753,12 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
 	 * plus, it does a kmalloc
 	 */
 	ret = sparse_index_init(section_nr, pgdat->node_id);
-	if (ret < 0 && ret != -EEXIST)
-		return ret;
+	if (ret < 0) {
+		if (ret == -EEXIST)
+			ret = 0;
+		else
+			return ret;
+	}
 	memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, altmap);
 	if (!memmap)
 		return -ENOMEM;
--
2.14.4
^ permalink raw reply related [flat|nested] 19+ messages in thread
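The control flow described in the patch above can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct model_section, add_section_regressed(), add_section_fixed() and model_demo() are hypothetical names, index_ret stands in for the value returned by sparse_index_init(), and the memmap_freed flag stands in for whatever cleanup runs in the error case after 'out:'.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for section state; not the kernel's struct. */
struct model_section {
	bool memmap_freed;	/* set when the error case after 'out:' runs */
};

/*
 * Model of the regressed flow: nothing overwrites 'ret' after the
 * -EEXIST check, so -EEXIST reaches the error case.
 */
int add_section_regressed(int index_ret, struct model_section *ms)
{
	int ret = index_ret;

	if (ret < 0 && ret != -EEXIST)
		return ret;

	/* ... section is set up and marked present here ... */

	/* out: */
	if (ret < 0)
		ms->memmap_freed = true;	/* frees a section still in use */
	return ret;
}

/* Model of the fixed flow: -EEXIST is cleared back to 0 first. */
int add_section_fixed(int index_ret, struct model_section *ms)
{
	int ret = index_ret;

	if (ret < 0 && ret != -EEXIST)
		return ret;
	ret = 0;

	/* ... section is set up and marked present here ... */

	/* out: */
	if (ret < 0)
		ms->memmap_freed = true;
	return ret;
}

/* Small usage check for the two models. */
void model_demo(void)
{
	struct model_section bad = { false };
	struct model_section good = { false };

	assert(add_section_regressed(-EEXIST, &bad) == -EEXIST);
	assert(bad.memmap_freed);
	assert(add_section_fixed(-EEXIST, &good) == 0);
	assert(!good.memmap_freed);
}
```

In this model only the -EEXIST case differs between the two variants; any other negative value returns early in both.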
* Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section
2018-07-06 19:06 ` Ross Zwisler
@ 2018-07-06 21:23 ` Oscar Salvador
-1 siblings, 0 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-06 21:23 UTC (permalink / raw)
To: Ross Zwisler
Cc: Michal Hocko, bhe, linux-nvdimm, Dave Hansen, LKML,
pasha.tatashin, Linux MM, Kirill A. Shutemov, Andrew Morton,
Vlastimil Babka, osalvador
On Fri, Jul 06, 2018 at 01:06:58PM -0600, Ross Zwisler wrote:
> The following commit in -next:
>
> commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
> remove check")
>
> changed how the error handling in sparse_add_one_section() works.
>
> Previously sparse_index_init() could return -EEXIST, and the function would
> continue on happily. 'ret' would get unconditionally overwritten by the
> result from sparse_init_one_section() and the error code after the 'out:'
> label wouldn't be triggered.
My bad, I missed that.
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 9574113fc745..d254bd2d3289 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -753,8 +753,12 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
> * plus, it does a kmalloc
> */
> ret = sparse_index_init(section_nr, pgdat->node_id);
> - if (ret < 0 && ret != -EEXIST)
> - return ret;
> + if (ret < 0) {
> + if (ret == -EEXIST)
> + ret = 0;
> + else
> + return ret;
> + }
sparse_index_init() can return:
-ENOMEM, -EEXIST or 0.
So what about this?:
diff --git a/mm/sparse.c b/mm/sparse.c
index f55e79fda03e..eb188eb6b82d 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -770,6 +770,7 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
 	ret = sparse_index_init(section_nr, pgdat->node_id);
 	if (ret < 0 && ret != -EEXIST)
 		return ret;
+	ret = 0;
Does this look more clean?
--
Oscar Salvador
SUSE L3
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 19+ messages in thread
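The two proposals in this exchange can be compared with a short sketch. It assumes, as stated in the reply above, that sparse_index_init() only returns -ENOMEM, -EEXIST or 0. The helper names handle_v1(), handle_v2() and compare_demo() are hypothetical; each helper returns 1 when the caller would bail out and 0 when it would continue, updating *ret in place.

```c
#include <assert.h>
#include <errno.h>

/* Nested form from the original patch. */
int handle_v1(int *ret)
{
	if (*ret < 0) {
		if (*ret == -EEXIST)
			*ret = 0;
		else
			return 1;	/* caller returns *ret */
	}
	return 0;
}

/* Shorter form from the reply: keep the check, then clear 'ret'. */
int handle_v2(int *ret)
{
	if (*ret < 0 && *ret != -EEXIST)
		return 1;	/* caller returns *ret */
	*ret = 0;
	return 0;
}

/* Check that both forms agree for the three documented return values. */
void compare_demo(void)
{
	int codes[] = { -ENOMEM, -EEXIST, 0 };
	int i;

	for (i = 0; i < 3; i++) {
		int a = codes[i];
		int b = codes[i];

		assert(handle_v1(&a) == handle_v2(&b));
		assert(a == b);
	}
}
```

The two forms would differ only for a positive return value, which the nested form leaves untouched and the shorter form resets to 0. Under the stated assumption about the return values, that case does not arise.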
* Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section
2018-07-06 21:23 ` Oscar Salvador
@ 2018-07-06 21:54 ` Ross Zwisler
-1 siblings, 0 replies; 19+ messages in thread
From: Ross Zwisler @ 2018-07-06 21:54 UTC (permalink / raw)
To: Oscar Salvador
Cc: Michal Hocko, bhe, linux-nvdimm, Dave Hansen, LKML,
pasha.tatashin, Linux MM, Kirill A. Shutemov, Andrew Morton,
Vlastimil Babka, osalvador
On Fri, Jul 06, 2018 at 11:23:27PM +0200, Oscar Salvador wrote:
> On Fri, Jul 06, 2018 at 01:06:58PM -0600, Ross Zwisler wrote:
> > The following commit in -next:
> >
> > commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
> > remove check")
> >
> > changed how the error handling in sparse_add_one_section() works.
> >
> > Previously sparse_index_init() could return -EEXIST, and the function would
> > continue on happily. 'ret' would get unconditionally overwritten by the
> > result from sparse_init_one_section() and the error code after the 'out:'
> > label wouldn't be triggered.
>
> My bad, I missed that.
>
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index 9574113fc745..d254bd2d3289 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -753,8 +753,12 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
> > * plus, it does a kmalloc
> > */
> > ret = sparse_index_init(section_nr, pgdat->node_id);
> > - if (ret < 0 && ret != -EEXIST)
> > - return ret;
> > + if (ret < 0) {
> > + if (ret == -EEXIST)
> > + ret = 0;
> > + else
> > + return ret;
> > + }
>
> sparse_index_init() can return:
>
> -ENOMEM, -EEXIST or 0.
>
> So what about this?:
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index f55e79fda03e..eb188eb6b82d 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -770,6 +770,7 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
> ret = sparse_index_init(section_nr, pgdat->node_id);
> if (ret < 0 && ret != -EEXIST)
> return ret;
> + ret = 0;
>
> Does this look more clean?
Sure, that's probably better.
Andrew, what's the easiest way forward? I can send out a v2, you can fold
this into his previous patch, or something else?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section
2018-07-06 21:54 ` Ross Zwisler
@ 2018-07-06 21:58 ` Andrew Morton
-1 siblings, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2018-07-06 21:58 UTC (permalink / raw)
To: Ross Zwisler
Cc: Oscar Salvador, bhe, linux-nvdimm, Dave Hansen, LKML,
pasha.tatashin, Linux MM, Michal Hocko, Kirill A. Shutemov,
Vlastimil Babka, osalvador
On Fri, 6 Jul 2018 15:54:37 -0600 Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
> On Fri, Jul 06, 2018 at 11:23:27PM +0200, Oscar Salvador wrote:
> > On Fri, Jul 06, 2018 at 01:06:58PM -0600, Ross Zwisler wrote:
> > > The following commit in -next:
> > >
> > > commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
> > > remove check")
> > >
> > > changed how the error handling in sparse_add_one_section() works.
> > >
> > > Previously sparse_index_init() could return -EEXIST, and the function would
> > > continue on happily. 'ret' would get unconditionally overwritten by the
> > > result from sparse_init_one_section() and the error code after the 'out:'
> > > label wouldn't be triggered.
> >
> > My bad, I missed that.
> >
> > > diff --git a/mm/sparse.c b/mm/sparse.c
> > > index 9574113fc745..d254bd2d3289 100644
> > > --- a/mm/sparse.c
> > > +++ b/mm/sparse.c
> > > @@ -753,8 +753,12 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
> > > * plus, it does a kmalloc
> > > */
> > > ret = sparse_index_init(section_nr, pgdat->node_id);
> > > - if (ret < 0 && ret != -EEXIST)
> > > - return ret;
> > > + if (ret < 0) {
> > > + if (ret == -EEXIST)
> > > + ret = 0;
> > > + else
> > > + return ret;
> > > + }
> >
> > sparse_index_init() can return:
> >
> > -ENOMEM, -EEXIST or 0.
> >
> > So what about this?:
> >
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index f55e79fda03e..eb188eb6b82d 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -770,6 +770,7 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
> > ret = sparse_index_init(section_nr, pgdat->node_id);
> > if (ret < 0 && ret != -EEXIST)
> > return ret;
> > + ret = 0;
> >
> > Does this look more clean?
>
> Sure, that's probably better.
>
> Andrew, what's the easiest way forward? I can send out a v2, you can fold
> this into his previous patch, or something else?
Whatever ;) v2 works.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] mm/sparse.c: fix error path in sparse_add_one_section
2018-07-06 19:06 ` Ross Zwisler
@ 2018-07-06 21:32 ` Andrew Morton
-1 siblings, 0 replies; 19+ messages in thread
From: Andrew Morton @ 2018-07-06 21:32 UTC (permalink / raw)
To: Ross Zwisler
Cc: osalvador, bhe, linux-nvdimm, Dave Hansen, LKML, pasha.tatashin,
Linux MM, Michal Hocko, Kirill A. Shutemov, Vlastimil Babka,
osalvador
On Fri, 6 Jul 2018 13:06:58 -0600 Ross Zwisler <ross.zwisler@linux.intel.com> wrote:
> commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
> remove check")
>
> changed how the error handling in sparse_add_one_section() works.
>
> Previously sparse_index_init() could return -EEXIST, and the function would
> continue on happily. 'ret' would get unconditionally overwritten by the
> result from sparse_init_one_section() and the error code after the 'out:'
> label wouldn't be triggered.
>
> With the above referenced commit, though, an -EEXIST error return from
> sparse_index_init() now takes us through the function and into the error
> case after 'out:'. This eventually causes a kernel BUG, probably because
> we've just freed a memory section that we successfully set up and marked as
> present:
Thanks.
And gee it would be nice if some of this code was commented. I
*assume* what's happening with that -EEXIST is that
sparse_add_one_section() is discovering that the root mem_section was
already initialized so things are OK. Maybe. My mind-reading skills
aren't so good on Fridays.
And sparse_index_init() sure looks like it needs locking to avoid races
around mem_section[root]. Or perhaps we're known to be single-threaded
here.
^ permalink raw reply [flat|nested] 19+ messages in thread
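The reading of -EEXIST assumed in the message above can be sketched as a check-then-allocate on a table of root pointers. This is a userspace model under stated assumptions, not the kernel implementation: model_roots, model_index_init(), the table sizes and the -EINVAL bounds check are all made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MODEL_NR_ROOTS		4	/* hypothetical table size */
#define MODEL_SECTIONS_PER_ROOT	8	/* hypothetical sections per root */

struct model_mem_section {
	unsigned long section_mem_map;
};

/* One pointer per root; NULL means the root is not populated yet. */
static struct model_mem_section *model_roots[MODEL_NR_ROOTS];

/*
 * Populate the root that covers section_nr.
 * Returns 0 on success, -EEXIST if the root was already populated,
 * -ENOMEM on allocation failure (-EINVAL is a model-only bounds check).
 */
int model_index_init(unsigned long section_nr)
{
	unsigned long root = section_nr / MODEL_SECTIONS_PER_ROOT;
	struct model_mem_section *sections;

	if (root >= MODEL_NR_ROOTS)
		return -EINVAL;

	/* Unlocked check-then-set: concurrent callers could both pass. */
	if (model_roots[root])
		return -EEXIST;

	sections = calloc(MODEL_SECTIONS_PER_ROOT, sizeof(*sections));
	if (!sections)
		return -ENOMEM;

	model_roots[root] = sections;
	return 0;
}
```

In this model, adding a second section that falls under an already populated root reports -EEXIST even though nothing went wrong, which is why a caller would treat that value as success. The unlocked check is also where the race mentioned above would occur if callers were not serialized.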
* [PATCH v2] mm/sparse.c: fix error path in sparse_add_one_section
2018-07-06 19:06 ` Ross Zwisler
@ 2018-07-06 22:33 ` Ross Zwisler
-1 siblings, 0 replies; 19+ messages in thread
From: Ross Zwisler @ 2018-07-06 22:33 UTC (permalink / raw)
To: pasha.tatashin, linux-nvdimm
Cc: osalvador, bhe, Dave Hansen, LKML, Linux MM, Michal Hocko,
Kirill A. Shutemov, Andrew Morton, Vlastimil Babka, osalvador
The following commit in -next:
commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
remove check")
changed how the error handling in sparse_add_one_section() works.
Previously sparse_index_init() could return -EEXIST, and the function would
continue on happily. 'ret' would get unconditionally overwritten by the
result from sparse_init_one_section() and the error code after the 'out:'
label wouldn't be triggered.
With the above referenced commit, though, an -EEXIST error return from
sparse_index_init() now takes us through the function and into the error
case after 'out:'. This eventually causes a kernel BUG, probably because
we've just freed a memory section that we successfully set up and marked as
present:
BUG: unable to handle kernel paging request at ffffea0005000080
RIP: 0010:memmap_init_zone+0x154/0x1cf
Call Trace:
move_pfn_range_to_zone+0x168/0x180
devm_memremap_pages+0x29b/0x480
pmem_attach_disk+0x1ae/0x6c0 [nd_pmem]
? devm_memremap+0x79/0xb0
nd_pmem_probe+0x7e/0xa0 [nd_pmem]
nvdimm_bus_probe+0x6e/0x160 [libnvdimm]
driver_probe_device+0x310/0x480
__device_attach_driver+0x86/0x100
? __driver_attach+0x110/0x110
bus_for_each_drv+0x6e/0xb0
__device_attach+0xe2/0x160
device_initial_probe+0x13/0x20
bus_probe_device+0xa6/0xc0
device_add+0x41b/0x660
? lock_acquire+0xa3/0x210
nd_async_device_register+0x12/0x40 [libnvdimm]
async_run_entry_fn+0x3e/0x170
process_one_work+0x230/0x680
worker_thread+0x3f/0x3b0
kthread+0x12f/0x150
? process_one_work+0x680/0x680
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x3a/0x50
Fix this by clearing 'ret' back to 0 if sparse_index_init() returns
-EEXIST. This restores the previous behavior.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
mm/sparse.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/sparse.c b/mm/sparse.c
index f55e79fda03e..eb188eb6b82d 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -770,6 +770,7 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat,
 	ret = sparse_index_init(section_nr, pgdat->node_id);
 	if (ret < 0 && ret != -EEXIST)
 		return ret;
+	ret = 0;
 	memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, altmap);
 	if (!memmap)
 		return -ENOMEM;
--
2.14.4
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] mm/sparse.c: fix error path in sparse_add_one_section
2018-07-06 22:33 ` Ross Zwisler
@ 2018-07-07 6:01 ` Oscar Salvador
-1 siblings, 0 replies; 19+ messages in thread
From: Oscar Salvador @ 2018-07-07 6:01 UTC (permalink / raw)
To: Ross Zwisler
Cc: Michal Hocko, bhe, linux-nvdimm, Dave Hansen, LKML,
pasha.tatashin, Linux MM, Kirill A. Shutemov, Andrew Morton,
Vlastimil Babka, osalvador
On Fri, Jul 06, 2018 at 04:33:58PM -0600, Ross Zwisler wrote:
> The following commit in -next:
>
> commit 054620849110 ("mm/sparse.c: make sparse_init_one_section void and
> remove check")
>
> changed how the error handling in sparse_add_one_section() works.
>
> Previously sparse_index_init() could return -EEXIST, and the function would
> continue on happily. 'ret' would get unconditionally overwritten by the
> result from sparse_init_one_section() and the error code after the 'out:'
> label wouldn't be triggered.
>
> With the above referenced commit, though, an -EEXIST error return from
> sparse_index_init() now takes us through the function and into the error
> case after 'out:'. This eventually causes a kernel BUG, probably because
> we've just freed a memory section that we successfully set up and marked as
> present:
>
> BUG: unable to handle kernel paging request at ffffea0005000080
> RIP: 0010:memmap_init_zone+0x154/0x1cf
>
> Call Trace:
> move_pfn_range_to_zone+0x168/0x180
> devm_memremap_pages+0x29b/0x480
> pmem_attach_disk+0x1ae/0x6c0 [nd_pmem]
> ? devm_memremap+0x79/0xb0
> nd_pmem_probe+0x7e/0xa0 [nd_pmem]
> nvdimm_bus_probe+0x6e/0x160 [libnvdimm]
> driver_probe_device+0x310/0x480
> __device_attach_driver+0x86/0x100
> ? __driver_attach+0x110/0x110
> bus_for_each_drv+0x6e/0xb0
> __device_attach+0xe2/0x160
> device_initial_probe+0x13/0x20
> bus_probe_device+0xa6/0xc0
> device_add+0x41b/0x660
> ? lock_acquire+0xa3/0x210
> nd_async_device_register+0x12/0x40 [libnvdimm]
> async_run_entry_fn+0x3e/0x170
> process_one_work+0x230/0x680
> worker_thread+0x3f/0x3b0
> kthread+0x12f/0x150
> ? process_one_work+0x680/0x680
> ? kthread_create_worker_on_cpu+0x70/0x70
> ret_from_fork+0x3a/0x50
>
> Fix this by clearing 'ret' back to 0 if sparse_index_init() returns
> -EEXIST. This restores the previous behavior.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
--
Oscar Salvador
SUSE L3
^ permalink raw reply [flat|nested] 19+ messages in thread