* [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
       [not found] <CGME20221215170915uscas1p262ccdf32fb2ccd3840189376c2793d06@uscas1p2.samsung.com>
@ 2022-12-15 17:09 ` Fan Ni
  2023-01-13 11:01   ` Jonathan Cameron
                     ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Fan Ni @ 2022-12-15 17:09 UTC (permalink / raw)
  To: alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, Jonathan.Cameron, dan.carpenter
  Cc: linux-cxl, Adam Manzanares, dave, linux-kernel, Fan Ni

Not all decoders have a reset callback.

The CXL specification allows a host bridge with a single root port to
have no explicit HDM decoders. Currently the region driver assumes there
are none.  As such the CXL core creates a special pass through decoder
instance without a commit/reset callback.

Prior to this patch, the ->reset() callback was called unconditionally in
cxl_region_decode_reset(). Thus a configuration with 1 Host Bridge,
1 Root Port, and one directly attached CXL type 3 device or multiple CXL
type 3 devices attached to downstream ports of a switch can cause a null
pointer dereference.

Before the fix, a kernel crash was observed when a region was destroyed
and a pass through decoder was reset.

The issue can be reproduced as follows:
    1) create a region with a CXL setup which includes a HB with a
    single root port under which a memdev is attached directly.
    2) destroy the region with cxl destroy-region regionX -f.

Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
---
 drivers/cxl/core/region.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index f9ae5ad284ff..3931793a13ac 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -131,7 +131,7 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
 		struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
 		struct cxl_port *iter = cxled_to_port(cxled);
 		struct cxl_ep *ep;
-		int rc;
+		int rc = 0;
 
 		while (!is_cxl_root(to_cxl_port(iter->dev.parent)))
 			iter = to_cxl_port(iter->dev.parent);
@@ -143,7 +143,8 @@ static int cxl_region_decode_reset(struct cxl_region *cxlr, int count)
 
 			cxl_rr = cxl_rr_load(iter, cxlr);
 			cxld = cxl_rr->decoder;
-			rc = cxld->reset(cxld);
+			if (cxld->reset)
+				rc = cxld->reset(cxld);
 			if (rc)
 				return rc;
 		}
@@ -186,7 +187,8 @@ static int cxl_region_decode_commit(struct cxl_region *cxlr)
 			     iter = ep->next, ep = cxl_ep_load(iter, cxlmd)) {
 				cxl_rr = cxl_rr_load(iter, cxlr);
 				cxld = cxl_rr->decoder;
-				cxld->reset(cxld);
+				if (cxld->reset)
+					cxld->reset(cxld);
 			}
 
 			cxled->cxld.reset(&cxled->cxld);
-- 
2.25.1
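
The guard added by the patch can be illustrated outside the kernel. What
follows is a minimal userspace sketch, not the driver code: struct decoder,
hdm_reset() and reset_all() are hypothetical stand-ins for the kernel's
decoder object and reset loop, and only the optional ->reset callback is
modelled. It also shows why rc has to start at 0 once the call becomes
conditional: when the callback is skipped, rc would otherwise be read
uninitialized.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's decoder object. Only the
 * optional reset callback is modelled here.
 */
struct decoder {
	const char *name;
	int (*reset)(struct decoder *d);	/* NULL for a pass through decoder */
	int reset_count;			/* how often reset actually ran */
};

/* Stand-in for a decoder that does have a reset implementation. */
int hdm_reset(struct decoder *d)
{
	d->reset_count++;
	return 0;
}

/*
 * Reset every decoder in the list, skipping those without a callback.
 * Returns 0 on success or the first non-zero return code.
 */
int reset_all(struct decoder **decoders, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct decoder *d = decoders[i];
		int rc = 0;	/* stays 0 when ->reset is NULL */

		if (d->reset)
			rc = d->reset(d);
		if (rc)
			return rc;
	}
	return 0;
}
```

With a list holding one decoder whose reset is NULL and one that uses
hdm_reset(), reset_all() is expected to return 0 and to run the callback
only for the second decoder.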


* Re: [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
  2022-12-15 17:09 ` [PATCH] cxl/region: Fix null pointer dereference for resetting decoder Fan Ni
@ 2023-01-13 11:01   ` Jonathan Cameron
  2023-02-01 17:58     ` Dan Williams
  2023-01-17 17:12   ` Dave Jiang
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Jonathan Cameron @ 2023-01-13 11:01 UTC (permalink / raw)
  To: Fan Ni
  Cc: alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, dan.carpenter, linux-cxl, Adam Manzanares, dave,
	linux-kernel

On Thu, 15 Dec 2022 17:09:14 +0000
Fan Ni <fan.ni@samsung.com> wrote:

> Not all decoders have a reset callback.
> 
> The CXL specification allows a host bridge with a single root port to
> have no explicit HDM decoders. Currently the region driver assumes there
> are none.  As such the CXL core creates a special pass through decoder
> instance without a commit/reset callback.
> 
> Prior to this patch, the ->reset() callback was called unconditionally when
> calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
> 1 Root Port, and one directly attached CXL type 3 device or multiple CXL
> type 3 devices attached to downstream ports of a switch can cause a null
> pointer dereference.
> 
> Before the fix, a kernel crash was observed when we destroy the region, and
> a pass through decoder is reset.
> 
> The issue can be reproduced as below,
>     1) create a region with a CXL setup which includes a HB with a
>     single root port under which a memdev is attached directly.
>     2) destroy the region with cxl destroy-region regionX -f.
> 
> Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> Signed-off-by: Fan Ni <fan.ni@samsung.com>

Explanation seems correct to me.  Only question (and it's one for the
Maintainers) is whether they prefer optionality here or a stub reset()
implementation for the pass through decoder.

either way
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
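
The two alternatives raised in this review, an optional callback versus a
stub reset(), can be contrasted with a small userspace sketch. All names
here (struct decoder, reset_if_present(), passthrough_reset_stub(),
init_passthrough()) are hypothetical and only illustrate the trade-off;
they are not the kernel's API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a decoder with a reset callback. */
struct decoder {
	int (*reset)(struct decoder *d);
};

/* Option A: the callback is optional, so every caller tests it first. */
int reset_if_present(struct decoder *d)
{
	return d->reset ? d->reset(d) : 0;
}

/* Option B: a no-op stub, so callers may call ->reset unconditionally. */
int passthrough_reset_stub(struct decoder *d)
{
	(void)d;	/* nothing to undo for a pass through decoder */
	return 0;
}

/* Install the stub when a pass through decoder is set up. */
void init_passthrough(struct decoder *d)
{
	d->reset = passthrough_reset_stub;
}
```

Option A keeps the decoder setup untouched but puts a check at every call
site; option B puts the cost in one place, the setup of the pass through
decoder.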




* Re: [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
  2022-12-15 17:09 ` [PATCH] cxl/region: Fix null pointer dereference for resetting decoder Fan Ni
  2023-01-13 11:01   ` Jonathan Cameron
@ 2023-01-17 17:12   ` Dave Jiang
  2023-02-01 15:57   ` Davidlohr Bueso
  2023-02-06 11:23   ` Gregory Price
  3 siblings, 0 replies; 7+ messages in thread
From: Dave Jiang @ 2023-01-17 17:12 UTC (permalink / raw)
  To: Fan Ni, alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, Jonathan.Cameron, dan.carpenter
  Cc: linux-cxl, Adam Manzanares, dave, linux-kernel



On 12/15/22 10:09 AM, Fan Ni wrote:
> Not all decoders have a reset callback.
> 
> The CXL specification allows a host bridge with a single root port to
> have no explicit HDM decoders. Currently the region driver assumes there
> are none.  As such the CXL core creates a special pass through decoder
> instance without a commit/reset callback.
> 
> Prior to this patch, the ->reset() callback was called unconditionally when
> calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
> 1 Root Port, and one directly attached CXL type 3 device or multiple CXL
> type 3 devices attached to downstream ports of a switch can cause a null
> pointer dereference.
> 
> Before the fix, a kernel crash was observed when we destroy the region, and
> a pass through decoder is reset.
> 
> The issue can be reproduced as below,
>      1) create a region with a CXL setup which includes a HB with a
>      single root port under which a memdev is attached directly.
>      2) destroy the region with cxl destroy-region regionX -f.
> 
> Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> Signed-off-by: Fan Ni <fan.ni@samsung.com>

Makes sense, especially with the emulated decoders coming w/o ->reset().

Reviewed-by: Dave Jiang <dave.jiang@intel.com>



* Re: [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
  2022-12-15 17:09 ` [PATCH] cxl/region: Fix null pointer dereference for resetting decoder Fan Ni
  2023-01-13 11:01   ` Jonathan Cameron
  2023-01-17 17:12   ` Dave Jiang
@ 2023-02-01 15:57   ` Davidlohr Bueso
  2023-02-06 11:23   ` Gregory Price
  3 siblings, 0 replies; 7+ messages in thread
From: Davidlohr Bueso @ 2023-02-01 15:57 UTC (permalink / raw)
  To: Fan Ni
  Cc: alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, Jonathan.Cameron, dan.carpenter, linux-cxl,
	Adam Manzanares, linux-kernel

On Thu, 15 Dec 2022, Fan Ni wrote:

>Not all decoders have a reset callback.
>
>The CXL specification allows a host bridge with a single root port to
>have no explicit HDM decoders. Currently the region driver assumes there
>are none.  As such the CXL core creates a special pass through decoder
>instance without a commit/reset callback.
>
>Prior to this patch, the ->reset() callback was called unconditionally when
>calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
>1 Root Port, and one directly attached CXL type 3 device or multiple CXL
>type 3 devices attached to downstream ports of a switch can cause a null
>pointer dereference.
>
>Before the fix, a kernel crash was observed when we destroy the region, and
>a pass through decoder is reset.
>
>The issue can be reproduced as below,
>    1) create a region with a CXL setup which includes a HB with a
>    single root port under which a memdev is attached directly.
>    2) destroy the region with cxl destroy-region regionX -f.
>
>Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
>Signed-off-by: Fan Ni <fan.ni@samsung.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>


* Re: [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
  2023-01-13 11:01   ` Jonathan Cameron
@ 2023-02-01 17:58     ` Dan Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Dan Williams @ 2023-02-01 17:58 UTC (permalink / raw)
  To: Jonathan Cameron, Fan Ni
  Cc: alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, dan.carpenter, linux-cxl, Adam Manzanares, dave,
	linux-kernel

Jonathan Cameron wrote:
> On Thu, 15 Dec 2022 17:09:14 +0000
> Fan Ni <fan.ni@samsung.com> wrote:
> 
> > Not all decoders have a reset callback.
> > 
> > The CXL specification allows a host bridge with a single root port to
> > have no explicit HDM decoders. Currently the region driver assumes there
> > are none.  As such the CXL core creates a special pass through decoder
> > instance without a commit/reset callback.
> > 
> > Prior to this patch, the ->reset() callback was called unconditionally when
> > calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
> > 1 Root Port, and one directly attached CXL type 3 device or multiple CXL
> > type 3 devices attached to downstream ports of a switch can cause a null
> > pointer dereference.
> > 
> > Before the fix, a kernel crash was observed when we destroy the region, and
> > a pass through decoder is reset.
> > 
> > The issue can be reproduced as below,
> >     1) create a region with a CXL setup which includes a HB with a
> >     single root port under which a memdev is attached directly.
> >     2) destroy the region with cxl destroy-region regionX -f.
> > 
> > Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> > Signed-off-by: Fan Ni <fan.ni@samsung.com>
> 
> Explanation seems correct to me.  Only question (and it's one for the
> Maintainers) is whether they prefer optionality here or a stub reset()
> implementation for the pass through decoder.

Yeah, I think this fix as is works for the purposes of the -stable
backport and then a follow-on can add the optionality.


* Re: [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
  2022-12-15 17:09 ` [PATCH] cxl/region: Fix null pointer dereference for resetting decoder Fan Ni
                     ` (2 preceding siblings ...)
  2023-02-01 15:57   ` Davidlohr Bueso
@ 2023-02-06 11:23   ` Gregory Price
  2023-02-06 19:16     ` Dan Williams
  3 siblings, 1 reply; 7+ messages in thread
From: Gregory Price @ 2023-02-06 11:23 UTC (permalink / raw)
  To: Fan Ni
  Cc: alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, Jonathan.Cameron, dan.carpenter, linux-cxl,
	Adam Manzanares, dave, linux-kernel

On Thu, Dec 15, 2022 at 05:09:14PM +0000, Fan Ni wrote:
> Not all decoders have a reset callback.
> 
> The CXL specification allows a host bridge with a single root port to
> have no explicit HDM decoders. Currently the region driver assumes there
> are none.  As such the CXL core creates a special pass through decoder
> instance without a commit/reset callback.
> 
> Prior to this patch, the ->reset() callback was called unconditionally when
> calling cxl_region_decode_reset. Thus a configuration with 1 Host Bridge,
> 1 Root Port, and one directly attached CXL type 3 device or multiple CXL
> type 3 devices attached to downstream ports of a switch can cause a null
> pointer dereference.
> 
> Before the fix, a kernel crash was observed when we destroy the region, and
> a pass through decoder is reset.
> 
> The issue can be reproduced as below,
>     1) create a region with a CXL setup which includes a HB with a
>     single root port under which a memdev is attached directly.
>     2) destroy the region with cxl destroy-region regionX -f.
> 
> Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
> Signed-off-by: Fan Ni <fan.ni@samsung.com>


Should we try to get this upstreamed in 6.2-final?  Seems like a good
stable addition. Probably doesn't affect real hardware, but it certainly
affects QEMU.


Tested-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>


* Re: [PATCH] cxl/region: Fix null pointer dereference for resetting decoder
  2023-02-06 11:23   ` Gregory Price
@ 2023-02-06 19:16     ` Dan Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Dan Williams @ 2023-02-06 19:16 UTC (permalink / raw)
  To: Gregory Price, Fan Ni
  Cc: alison.schofield, vishal.l.verma, ira.weiny, bwidawsk,
	dan.j.williams, Jonathan.Cameron, dan.carpenter, linux-cxl,
	Adam Manzanares, dave, linux-kernel

Gregory Price wrote:
[..]
> Should we try to get this upstreamed in 6.2-final?  Seems like a good
> stable addition. Probably doesn't affect real hardware, but it certainly
> affects QEMU.

Yes, that's the plan.

https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/commit/?h=fixes&id=01d2cb2593b1


Thread overview: 7+ messages
2022-12-15 17:09 ` [PATCH] cxl/region: Fix null pointer dereference for resetting decoder Fan Ni
2023-01-13 11:01   ` Jonathan Cameron
2023-02-01 17:58     ` Dan Williams
2023-01-17 17:12   ` Dave Jiang
2023-02-01 15:57   ` Davidlohr Bueso
2023-02-06 11:23   ` Gregory Price
2023-02-06 19:16     ` Dan Williams
