linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
@ 2019-11-13 10:24 Jayshri Pawar
  2019-11-14  2:50 ` Peter Chen
  0 siblings, 1 reply; 8+ messages in thread
From: Jayshri Pawar @ 2019-11-13 10:24 UTC (permalink / raw)
  To: linux-usb
  Cc: gregkh, felipe.balbi, heikki.krogerus, rogerq, linux-kernel,
	jbergsagel, nsekhar, nm, peter.chen, kurahul, pawell, sparmar,
	jpawar

There is a problem when function driver allocate memory for buffer
used by DMA from outside dma_mask space.
It appears during testing f_tcm driver with cdns3 controller.
In the result cdns3 driver was not able to map virtual buffer to DMA.
This fix should be improved depending on dma_mask associated with device.
Adding GFP_DMA32 flag while allocationg command data buffer only for 32
bit controllers.

Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
---
 drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
 include/linux/usb/gadget.h          |  2 ++
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/gadget/function/f_tcm.c b/drivers/usb/gadget/function/f_tcm.c
index 36504931b2d1..a78d5fad3d84 100644
--- a/drivers/usb/gadget/function/f_tcm.c
+++ b/drivers/usb/gadget/function/f_tcm.c
@@ -213,7 +213,8 @@ static int bot_send_read_response(struct usbg_cmd *cmd)
 	}
 
 	if (!gadget->sg_supported) {
-		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
+		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
+					gadget->dma_flag);
 		if (!cmd->data_buf)
 			return -ENOMEM;
 
@@ -257,7 +258,8 @@ static int bot_send_write_request(struct usbg_cmd *cmd)
 	}
 
 	if (!gadget->sg_supported) {
-		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL);
+		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL |
+					gadget->dma_flag);
 		if (!cmd->data_buf)
 			return -ENOMEM;
 
@@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep, struct usb_request *req)
 static int bot_prepare_reqs(struct f_uas *fu)
 {
 	int ret;
+	struct usb_gadget *gadget = fuas_to_gadget(fu);
 
 	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
 	if (!fu->bot_req_in)
@@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
 	fu->bot_status.req->complete = bot_status_complete;
 	fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
 
-	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
+	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
+				gadget->dma_flag);
 	if (!fu->cmd.buf)
 		goto err_buf;
 
@@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct usbg_cmd *cmd)
 	struct uas_stream *stream = cmd->stream;
 
 	if (!gadget->sg_supported) {
-		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
+		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
+					gadget->dma_flag);
 		if (!cmd->data_buf)
 			return -ENOMEM;
 
@@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas *fu, struct uas_stream *stream)
 
 static int uasp_alloc_cmd(struct f_uas *fu)
 {
+	struct usb_gadget *gadget = fuas_to_gadget(fu);
 	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
 	if (!fu->cmd.req)
 		goto err;
 
-	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
+	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
+				gadget->dma_flag);
 	if (!fu->cmd.buf)
 		goto err_buf;
 
@@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct usbg_cmd *cmd, struct usb_request *req)
 	struct usb_gadget *gadget = fuas_to_gadget(fu);
 
 	if (!gadget->sg_supported) {
-		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
+		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
+					gadget->dma_flag);
 		if (!cmd->data_buf)
 			return -ENOMEM;
 
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index 124462d65eac..d6c9cd222600 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -373,6 +373,7 @@ struct usb_gadget_ops {
  * @connected: True if gadget is connected.
  * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
  *	indicates that it supports LPM as per the LPM ECN & errata.
+ * @dma_flag: dma zone to be used for buffer allocation.
  *
  * Gadgets have a mostly-portable "gadget driver" implementing device
  * functions, handling all usb configurations and interfaces.  Gadget
@@ -427,6 +428,7 @@ struct usb_gadget {
 	unsigned			deactivated:1;
 	unsigned			connected:1;
 	unsigned			lpm_capable:1;
+	unsigned int			dma_flag;
 };
 #define work_to_gadget(w)	(container_of((w), struct usb_gadget, work))
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-13 10:24 [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer Jayshri Pawar
@ 2019-11-14  2:50 ` Peter Chen
  2019-11-14  9:48   ` Roger Quadros
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Chen @ 2019-11-14  2:50 UTC (permalink / raw)
  To: Jayshri Pawar
  Cc: linux-usb, gregkh, felipe.balbi, heikki.krogerus, rogerq,
	linux-kernel, jbergsagel, nsekhar, nm, kurahul, pawell, sparmar

On 19-11-13 10:24:32, Jayshri Pawar wrote:
> There is a problem when function driver allocate memory for buffer
> used by DMA from outside dma_mask space.
> It appears during testing f_tcm driver with cdns3 controller.
> In the result cdns3 driver was not able to map virtual buffer to DMA.
> This fix should be improved depending on dma_mask associated with device.
> Adding GFP_DMA32 flag while allocationg command data buffer only for 32
> bit controllers.

Hi Jayshri,

This issue should be fixed by setting DMA_MASK correctly for controller,
you can't limit user's memory region. At usb_ep_queue, the UDC driver
will call DMA MAP API, for Cadence, it is usb_gadget_map_request_by_dev. 
For the system without SMMU (IO-MMU), it will use swiotlb to make sure
the data buffer used for DMA transfer is within DMA mask for controller,
There is a reserved low memory region for debounce buffer in swiotlb 
use case.

Peter

> 
> Signed-off-by: Pawel Laszczak <pawell@cadence.com>
> Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
> ---
>  drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
>  include/linux/usb/gadget.h          |  2 ++
>  2 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/f_tcm.c b/drivers/usb/gadget/function/f_tcm.c
> index 36504931b2d1..a78d5fad3d84 100644
> --- a/drivers/usb/gadget/function/f_tcm.c
> +++ b/drivers/usb/gadget/function/f_tcm.c
> @@ -213,7 +213,8 @@ static int bot_send_read_response(struct usbg_cmd *cmd)
>  	}
>  
>  	if (!gadget->sg_supported) {
> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
> +					gadget->dma_flag);
>  		if (!cmd->data_buf)
>  			return -ENOMEM;
>  
> @@ -257,7 +258,8 @@ static int bot_send_write_request(struct usbg_cmd *cmd)
>  	}
>  
>  	if (!gadget->sg_supported) {
> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL);
> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL |
> +					gadget->dma_flag);
>  		if (!cmd->data_buf)
>  			return -ENOMEM;
>  
> @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep, struct usb_request *req)
>  static int bot_prepare_reqs(struct f_uas *fu)
>  {
>  	int ret;
> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
>  
>  	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
>  	if (!fu->bot_req_in)
> @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
>  	fu->bot_status.req->complete = bot_status_complete;
>  	fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
>  
> -	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
> +	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
> +				gadget->dma_flag);
>  	if (!fu->cmd.buf)
>  		goto err_buf;
>  
> @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct usbg_cmd *cmd)
>  	struct uas_stream *stream = cmd->stream;
>  
>  	if (!gadget->sg_supported) {
> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
> +					gadget->dma_flag);
>  		if (!cmd->data_buf)
>  			return -ENOMEM;
>  
> @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas *fu, struct uas_stream *stream)
>  
>  static int uasp_alloc_cmd(struct f_uas *fu)
>  {
> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
>  	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
>  	if (!fu->cmd.req)
>  		goto err;
>  
> -	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
> +	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
> +				gadget->dma_flag);
>  	if (!fu->cmd.buf)
>  		goto err_buf;
>  
> @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct usbg_cmd *cmd, struct usb_request *req)
>  	struct usb_gadget *gadget = fuas_to_gadget(fu);
>  
>  	if (!gadget->sg_supported) {
> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
> +					gadget->dma_flag);
>  		if (!cmd->data_buf)
>  			return -ENOMEM;
>  
> diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
> index 124462d65eac..d6c9cd222600 100644
> --- a/include/linux/usb/gadget.h
> +++ b/include/linux/usb/gadget.h
> @@ -373,6 +373,7 @@ struct usb_gadget_ops {
>   * @connected: True if gadget is connected.
>   * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
>   *	indicates that it supports LPM as per the LPM ECN & errata.
> + * @dma_flag: dma zone to be used for buffer allocation.
>   *
>   * Gadgets have a mostly-portable "gadget driver" implementing device
>   * functions, handling all usb configurations and interfaces.  Gadget
> @@ -427,6 +428,7 @@ struct usb_gadget {
>  	unsigned			deactivated:1;
>  	unsigned			connected:1;
>  	unsigned			lpm_capable:1;
> +	unsigned int			dma_flag;
>  };
>  #define work_to_gadget(w)	(container_of((w), struct usb_gadget, work))
>  
> -- 
> 2.20.1
> 

-- 

Thanks,
Peter Chen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-14  2:50 ` Peter Chen
@ 2019-11-14  9:48   ` Roger Quadros
  2019-11-15 10:14     ` Jayshri Dajiram Pawar
  0 siblings, 1 reply; 8+ messages in thread
From: Roger Quadros @ 2019-11-14  9:48 UTC (permalink / raw)
  To: Peter Chen, Jayshri Pawar
  Cc: linux-usb, gregkh, felipe.balbi, heikki.krogerus, linux-kernel,
	jbergsagel, nsekhar, nm, kurahul, pawell, sparmar

Jayshri,

On 14/11/2019 04:50, Peter Chen wrote:
> On 19-11-13 10:24:32, Jayshri Pawar wrote:
>> There is a problem when function driver allocate memory for buffer
>> used by DMA from outside dma_mask space.
>> It appears during testing f_tcm driver with cdns3 controller.
>> In the result cdns3 driver was not able to map virtual buffer to DMA.
>> This fix should be improved depending on dma_mask associated with device.
>> Adding GFP_DMA32 flag while allocationg command data buffer only for 32
>> bit controllers.
> 
> Hi Jayshri,
> 
> This issue should be fixed by setting DMA_MASK correctly for controller,
> you can't limit user's memory region. At usb_ep_queue, the UDC driver
> will call DMA MAP API, for Cadence, it is usb_gadget_map_request_by_dev.
> For the system without SMMU (IO-MMU), it will use swiotlb to make sure
> the data buffer used for DMA transfer is within DMA mask for controller,
> There is a reserved low memory region for debounce buffer in swiotlb
> use case.
> 

/**
  * struct usb_request - describes one i/o request
  * @buf: Buffer used for data.  Always provide this; some controllers
  *	only use PIO, or don't use DMA for some endpoints.
  * @dma: DMA address corresponding to 'buf'.  If you don't set this
  *	field, and the usb controller needs one, it is responsible
  *	for mapping and unmapping the buffer.
<snip>
  */

So if dma is not set in the usb_request then controller driver is
responsible to do a dma_map of the buffer pointed by 'buf' before
it attemps to do DMA. This should take care of DMA mask and swiotlb.

This patch is not correct.

cheers,
-roger

> Peter
> 
>>
>> Signed-off-by: Pawel Laszczak <pawell@cadence.com>
>> Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
>> ---
>>   drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
>>   include/linux/usb/gadget.h          |  2 ++
>>   2 files changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/usb/gadget/function/f_tcm.c b/drivers/usb/gadget/function/f_tcm.c
>> index 36504931b2d1..a78d5fad3d84 100644
>> --- a/drivers/usb/gadget/function/f_tcm.c
>> +++ b/drivers/usb/gadget/function/f_tcm.c
>> @@ -213,7 +213,8 @@ static int bot_send_read_response(struct usbg_cmd *cmd)
>>   	}
>>   
>>   	if (!gadget->sg_supported) {
>> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
>> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
>> +					gadget->dma_flag);
>>   		if (!cmd->data_buf)
>>   			return -ENOMEM;
>>   
>> @@ -257,7 +258,8 @@ static int bot_send_write_request(struct usbg_cmd *cmd)
>>   	}
>>   
>>   	if (!gadget->sg_supported) {
>> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL);
>> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL |
>> +					gadget->dma_flag);
>>   		if (!cmd->data_buf)
>>   			return -ENOMEM;
>>   
>> @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep, struct usb_request *req)
>>   static int bot_prepare_reqs(struct f_uas *fu)
>>   {
>>   	int ret;
>> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
>>   
>>   	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
>>   	if (!fu->bot_req_in)
>> @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
>>   	fu->bot_status.req->complete = bot_status_complete;
>>   	fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
>>   
>> -	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
>> +	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
>> +				gadget->dma_flag);
>>   	if (!fu->cmd.buf)
>>   		goto err_buf;
>>   
>> @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct usbg_cmd *cmd)
>>   	struct uas_stream *stream = cmd->stream;
>>   
>>   	if (!gadget->sg_supported) {
>> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
>> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
>> +					gadget->dma_flag);
>>   		if (!cmd->data_buf)
>>   			return -ENOMEM;
>>   
>> @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas *fu, struct uas_stream *stream)
>>   
>>   static int uasp_alloc_cmd(struct f_uas *fu)
>>   {
>> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
>>   	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
>>   	if (!fu->cmd.req)
>>   		goto err;
>>   
>> -	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
>> +	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
>> +				gadget->dma_flag);
>>   	if (!fu->cmd.buf)
>>   		goto err_buf;
>>   
>> @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct usbg_cmd *cmd, struct usb_request *req)
>>   	struct usb_gadget *gadget = fuas_to_gadget(fu);
>>   
>>   	if (!gadget->sg_supported) {
>> -		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC);
>> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_ATOMIC |
>> +					gadget->dma_flag);
>>   		if (!cmd->data_buf)
>>   			return -ENOMEM;
>>   
>> diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
>> index 124462d65eac..d6c9cd222600 100644
>> --- a/include/linux/usb/gadget.h
>> +++ b/include/linux/usb/gadget.h
>> @@ -373,6 +373,7 @@ struct usb_gadget_ops {
>>    * @connected: True if gadget is connected.
>>    * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
>>    *	indicates that it supports LPM as per the LPM ECN & errata.
>> + * @dma_flag: dma zone to be used for buffer allocation.
>>    *
>>    * Gadgets have a mostly-portable "gadget driver" implementing device
>>    * functions, handling all usb configurations and interfaces.  Gadget
>> @@ -427,6 +428,7 @@ struct usb_gadget {
>>   	unsigned			deactivated:1;
>>   	unsigned			connected:1;
>>   	unsigned			lpm_capable:1;
>> +	unsigned int			dma_flag;
>>   };
>>   #define work_to_gadget(w)	(container_of((w), struct usb_gadget, work))
>>   
>> -- 
>> 2.20.1
>>
> 

-- 
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-14  9:48   ` Roger Quadros
@ 2019-11-15 10:14     ` Jayshri Dajiram Pawar
  2019-11-15 11:11       ` Roger Quadros
  0 siblings, 1 reply; 8+ messages in thread
From: Jayshri Dajiram Pawar @ 2019-11-15 10:14 UTC (permalink / raw)
  To: Roger Quadros, Peter Chen
  Cc: linux-usb, gregkh, felipe.balbi, heikki.krogerus, linux-kernel,
	jbergsagel, nsekhar, nm, Rahul Kumar, Pawel Laszczak,
	Sanket Parmar

 
> >> There is a problem when function driver allocate memory for buffer
> >> used by DMA from outside dma_mask space.
> >> It appears during testing f_tcm driver with cdns3 controller.
> >> In the result cdns3 driver was not able to map virtual buffer to DMA.
> >> This fix should be improved depending on dma_mask associated with
> device.
> >> Adding GFP_DMA32 flag while allocationg command data buffer only for
> >> 32 bit controllers.
> >
> > Hi Jayshri,
> >
> > This issue should be fixed by setting DMA_MASK correctly for
> > controller, you can't limit user's memory region. At usb_ep_queue, the
> > UDC driver will call DMA MAP API, for Cadence, it is
> usb_gadget_map_request_by_dev.
> > For the system without SMMU (IO-MMU), it will use swiotlb to make sure
> > the data buffer used for DMA transfer is within DMA mask for
> > controller, There is a reserved low memory region for debounce buffer
> > in swiotlb use case.
> >
> 
> /**
>   * struct usb_request - describes one i/o request
>   * @buf: Buffer used for data.  Always provide this; some controllers
>   *	only use PIO, or don't use DMA for some endpoints.
>   * @dma: DMA address corresponding to 'buf'.  If you don't set this
>   *	field, and the usb controller needs one, it is responsible
>   *	for mapping and unmapping the buffer.
> <snip>
>   */
> 
> So if dma is not set in the usb_request then controller driver is responsible to
> do a dma_map of the buffer pointed by 'buf' before it attemps to do DMA.
> This should take care of DMA mask and swiotlb.
> 
> This patch is not correct.
> 
Hi Roger,

We have scatter-gather disabled.
We are getting below error while allocation of cmd data buffer with length 524288 or greater, while writing large size files to device.
This error occurred on x86 platform.
Because of this reason we have added DMA flag while allocation of buffer.

[ 1602.977532] swiotlb_tbl_map_single: 26 callbacks suppressed
[ 1602.977536] cdns-usb3 cdns-usb3.1: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 0 (slots)
[ 1602.977542] cdns-usb3 cdns-usb3.1: overflow 0x00000007eee00000+524288 of DMA mask ffffffff bus mask 0
[ 1602.977555] WARNING: CPU: 6 PID: 285 at kernel/dma/direct.c:43 report_addr+0x37/0x60
[ 1602.977556] Modules linked in: target_core_user uio target_core_pscsi target_core_file target_core_iblock usb_f_tcm(OE) target_core_mod cdns3(OE) cdns3_pci_wrap(OE) roles(E) libcomposite(OE) udc_core(OE) xt_multiport iptable_filter bpfilter snd_hda_codec_hdmi nls_iso8859_1 i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel irqbypass snd_hda_codec snd_hda_core snd_hwdep snd_pcm drm_kms_helper snd_seq_midi snd_seq_midi_event crct10dif_pclmul snd_rawmidi crc32_pclmul drm snd_seq ghash_clmulni_intel snd_seq_device aesni_intel snd_timer mei_me i2c_algo_bit aes_x86_64 crypto_simd cryptd fb_sys_fops glue_helper snd mei input_leds syscopyarea intel_cstate sysfillrect intel_rapl_perf sysimgblt hp_wmi soundcore sparse_keymap serio_raw wmi_bmof tpm_infineon mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid e1000e psmouse ahci lpc_ich libahci i2c_i801 wmi
[ 1602.977605]  video
[ 1602.977613] CPU: 6 PID: 285 Comm: kworker/6:2 Tainted: G           OE     5.2.0-rc3cdns3-jayshri-stream-common+ #7
[ 1602.977615] Hardware name: Hewlett-Packard HP EliteDesk 800 G1 TWR/18E4, BIOS L01 v02.21 12/17/2013
[ 1602.977623] Workqueue: tcm_usb_gadget usbg_cmd_work [usb_f_tcm]
[ 1602.977628] RIP: 0010:report_addr+0x37/0x60
[ 1602.977631] Code: 48 8b 87 28 02 00 00 48 89 75 f8 48 85 c0 74 2a 4c 8b 00 b8 fe ff ff ff 49 39 c0 76 11 80 3d af 61 72 01 00 0f 84 df 06 00 00 <0f> 0b c9 c3 48 83 bf 38 02 00 00 00 74 f2 eb e3 80 3d 93 61 72 01
[ 1602.977634] RSP: 0018:ffffa0a6834dfc00 EFLAGS: 00010046
[ 1602.977636] RAX: 0000000000000000 RBX: ffff8ec574aeb010 RCX: 0000000000000000
[ 1602.977638] RDX: 0000000000000007 RSI: 0000000000000086 RDI: 0000000000000000
[ 1602.977640] RBP: ffffa0a6834dfc08 R08: 0000000000000569 R09: ffffffffa2189fb8
[ 1602.977642] R10: 0000000000000069 R11: ffffa0a6834df940 R12: 0000000000080000
[ 1602.977644] R13: ffff8ec5ad536218 R14: ffff8ec5ad693800 R15: ffff8ec5ad693800
[ 1602.977647] FS:  0000000000000000(0000) GS:ffff8ec5be980000(0000) knlGS:0000000000000000
[ 1602.977649] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1602.977651] CR2: 00007f05a56b7000 CR3: 000000036fc0a006 CR4: 00000000001606e0
[ 1602.977653] Call Trace:
[ 1602.977660]  dma_direct_map_page+0xdf/0xf0
[ 1602.977669]  usb_gadget_map_request_by_dev+0x17a/0x190 [udc_core]
[ 1602.977679]  __cdns3_gadget_ep_queue.isra.30+0x149/0x2e0 [cdns3]
[ 1602.977686]  ? kmalloc_order+0x18/0x40
[ 1602.977693]  cdns3_gadget_ep_queue+0x53/0x100 [cdns3]
[ 1602.977698]  usb_ep_queue+0x36/0xa0 [udc_core]
[ 1602.977704]  usbg_send_write_request+0x1ae/0x250 [usb_f_tcm]
[ 1602.977731]  transport_generic_new_cmd+0x1bc/0x320 [target_core_mod]
[ 1602.977749]  transport_handle_cdb_direct+0x42/0x60 [target_core_mod]
[ 1602.977766]  target_submit_cmd_map_sgls+0x176/0x230 [target_core_mod]
[ 1602.977771]  ? __switch_to_asm+0x40/0x70
[ 1602.977788]  target_submit_cmd+0x26/0x30 [target_core_mod]
[ 1602.977794]  usbg_cmd_work+0x60/0xd0 [usb_f_tcm]
[ 1602.977799]  process_one_work+0x20f/0x410
[ 1602.977802]  worker_thread+0x34/0x400
[ 1602.977807]  kthread+0x120/0x140
[ 1602.977811]  ? process_one_work+0x410/0x410
[ 1602.977815]  ? __kthread_parkme+0x70/0x70
[ 1602.977818]  ret_from_fork+0x35/0x40
[ 1602.977822] ---[ end trace 70f27f846049ae32 ]---
[ 1602.977826] cdns-usb3 cdns-usb3.1: failed to map buffer
[ 1602.977853] uasp_send_write_request(695)

Regards,
Jayshri

> cheers,
> -roger
> 
> > Peter
> >
> >>
> >> Signed-off-by: Pawel Laszczak <pawell@cadence.com>
> >> Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
> >> ---
> >>   drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
> >>   include/linux/usb/gadget.h          |  2 ++
> >>   2 files changed, 16 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/usb/gadget/function/f_tcm.c
> >> b/drivers/usb/gadget/function/f_tcm.c
> >> index 36504931b2d1..a78d5fad3d84 100644
> >> --- a/drivers/usb/gadget/function/f_tcm.c
> >> +++ b/drivers/usb/gadget/function/f_tcm.c
> >> @@ -213,7 +213,8 @@ static int bot_send_read_response(struct
> usbg_cmd *cmd)
> >>   	}
> >>
> >>   	if (!gadget->sg_supported) {
> >> -		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_ATOMIC);
> >> +		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_ATOMIC |
> >> +					gadget->dma_flag);
> >>   		if (!cmd->data_buf)
> >>   			return -ENOMEM;
> >>
> >> @@ -257,7 +258,8 @@ static int bot_send_write_request(struct
> usbg_cmd *cmd)
> >>   	}
> >>
> >>   	if (!gadget->sg_supported) {
> >> -		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_KERNEL);
> >> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL
> |
> >> +					gadget->dma_flag);
> >>   		if (!cmd->data_buf)
> >>   			return -ENOMEM;
> >>
> >> @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep,
> struct usb_request *req)
> >>   static int bot_prepare_reqs(struct f_uas *fu)
> >>   {
> >>   	int ret;
> >> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
> >>
> >>   	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
> >>   	if (!fu->bot_req_in)
> >> @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
> >>   	fu->bot_status.req->complete = bot_status_complete;
> >>   	fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
> >>
> >> -	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
> >> +	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
> >> +				gadget->dma_flag);
> >>   	if (!fu->cmd.buf)
> >>   		goto err_buf;
> >>
> >> @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct
> usbg_cmd *cmd)
> >>   	struct uas_stream *stream = cmd->stream;
> >>
> >>   	if (!gadget->sg_supported) {
> >> -		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_ATOMIC);
> >> +		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_ATOMIC |
> >> +					gadget->dma_flag);
> >>   		if (!cmd->data_buf)
> >>   			return -ENOMEM;
> >>
> >> @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas
> >> *fu, struct uas_stream *stream)
> >>
> >>   static int uasp_alloc_cmd(struct f_uas *fu)
> >>   {
> >> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
> >>   	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
> >>   	if (!fu->cmd.req)
> >>   		goto err;
> >>
> >> -	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
> >> +	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
> >> +				gadget->dma_flag);
> >>   	if (!fu->cmd.buf)
> >>   		goto err_buf;
> >>
> >> @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct
> usbg_cmd *cmd, struct usb_request *req)
> >>   	struct usb_gadget *gadget = fuas_to_gadget(fu);
> >>
> >>   	if (!gadget->sg_supported) {
> >> -		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_ATOMIC);
> >> +		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_ATOMIC |
> >> +					gadget->dma_flag);
> >>   		if (!cmd->data_buf)
> >>   			return -ENOMEM;
> >>
> >> diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
> >> index 124462d65eac..d6c9cd222600 100644
> >> --- a/include/linux/usb/gadget.h
> >> +++ b/include/linux/usb/gadget.h
> >> @@ -373,6 +373,7 @@ struct usb_gadget_ops {
> >>    * @connected: True if gadget is connected.
> >>    * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
> >>    *	indicates that it supports LPM as per the LPM ECN & errata.
> >> + * @dma_flag: dma zone to be used for buffer allocation.
> >>    *
> >>    * Gadgets have a mostly-portable "gadget driver" implementing device
> >>    * functions, handling all usb configurations and interfaces.
> >> Gadget @@ -427,6 +428,7 @@ struct usb_gadget {
> >>   	unsigned			deactivated:1;
> >>   	unsigned			connected:1;
> >>   	unsigned			lpm_capable:1;
> >> +	unsigned int			dma_flag;
> >>   };
> >>   #define work_to_gadget(w)	(container_of((w), struct usb_gadget,
> work))
> >>
> >> --
> >> 2.20.1
> >>
> >
> 
> --
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-15 10:14     ` Jayshri Dajiram Pawar
@ 2019-11-15 11:11       ` Roger Quadros
  2019-11-15 18:02         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 8+ messages in thread
From: Roger Quadros @ 2019-11-15 11:11 UTC (permalink / raw)
  To: Jayshri Dajiram Pawar, Peter Chen, konrad.wilk
  Cc: linux-usb, gregkh, felipe.balbi, heikki.krogerus, linux-kernel,
	jbergsagel, nsekhar, nm, Rahul Kumar, Pawel Laszczak,
	Sanket Parmar, iommu

+Konrad

Jayshri,

On 15/11/2019 12:14, Jayshri Dajiram Pawar wrote:
>   
>>>> There is a problem when function driver allocate memory for buffer
>>>> used by DMA from outside dma_mask space.
>>>> It appears during testing f_tcm driver with cdns3 controller.
>>>> In the result cdns3 driver was not able to map virtual buffer to DMA.
>>>> This fix should be improved depending on dma_mask associated with
>> device.
>>>> Adding GFP_DMA32 flag while allocationg command data buffer only for
>>>> 32 bit controllers.
>>>
>>> Hi Jayshri,
>>>
>>> This issue should be fixed by setting DMA_MASK correctly for
>>> controller, you can't limit user's memory region. At usb_ep_queue, the
>>> UDC driver will call DMA MAP API, for Cadence, it is
>> usb_gadget_map_request_by_dev.
>>> For the system without SMMU (IO-MMU), it will use swiotlb to make sure
>>> the data buffer used for DMA transfer is within DMA mask for
>>> controller, There is a reserved low memory region for debounce buffer
>>> in swiotlb use case.
>>>
>>
>> /**
>>    * struct usb_request - describes one i/o request
>>    * @buf: Buffer used for data.  Always provide this; some controllers
>>    *	only use PIO, or don't use DMA for some endpoints.
>>    * @dma: DMA address corresponding to 'buf'.  If you don't set this
>>    *	field, and the usb controller needs one, it is responsible
>>    *	for mapping and unmapping the buffer.
>> <snip>
>>    */
>>
>> So if dma is not set in the usb_request then controller driver is responsible to
>> do a dma_map of the buffer pointed by 'buf' before it attemps to do DMA.
>> This should take care of DMA mask and swiotlb.
>>
>> This patch is not correct.
>>
> Hi Roger,
> 
> We have scatter-gather disabled.
> We are getting below error while allocation of cmd data buffer with length 524288 or greater, while writing large size files to device.
> This error occurred on x86 platform.
> Because of this reason we have added DMA flag while allocation of buffer.
> 
> [ 1602.977532] swiotlb_tbl_map_single: 26 callbacks suppressed
> [ 1602.977536] cdns-usb3 cdns-usb3.1: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 0 (slots)

Why is swiotlb buffer getting full? How much is it on your system?
Are you sure that dma_unmap is happening on requests that complete? else we'll just keep hogging the swiotlb buffer.

cheers,
-roger

> [ 1602.977542] cdns-usb3 cdns-usb3.1: overflow 0x00000007eee00000+524288 of DMA mask ffffffff bus mask 0
> [ 1602.977555] WARNING: CPU: 6 PID: 285 at kernel/dma/direct.c:43 report_addr+0x37/0x60
> [ 1602.977556] Modules linked in: target_core_user uio target_core_pscsi target_core_file target_core_iblock usb_f_tcm(OE) target_core_mod cdns3(OE) cdns3_pci_wrap(OE) roles(E) libcomposite(OE) udc_core(OE) xt_multiport iptable_filter bpfilter snd_hda_codec_hdmi nls_iso8859_1 i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel irqbypass snd_hda_codec snd_hda_core snd_hwdep snd_pcm drm_kms_helper snd_seq_midi snd_seq_midi_event crct10dif_pclmul snd_rawmidi crc32_pclmul drm snd_seq ghash_clmulni_intel snd_seq_device aesni_intel snd_timer mei_me i2c_algo_bit aes_x86_64 crypto_simd cryptd fb_sys_fops glue_helper snd mei input_leds syscopyarea intel_cstate sysfillrect intel_rapl_perf sysimgblt hp_wmi soundcore sparse_keymap serio_raw wmi_bmof tpm_infineon mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid e1000e psmouse ahci lpc_ich libahci i2c_i801 wmi
> [ 1602.977605]  video
> [ 1602.977613] CPU: 6 PID: 285 Comm: kworker/6:2 Tainted: G           OE     5.2.0-rc3cdns3-jayshri-stream-common+ #7
> [ 1602.977615] Hardware name: Hewlett-Packard HP EliteDesk 800 G1 TWR/18E4, BIOS L01 v02.21 12/17/2013
> [ 1602.977623] Workqueue: tcm_usb_gadget usbg_cmd_work [usb_f_tcm]
> [ 1602.977628] RIP: 0010:report_addr+0x37/0x60
> [ 1602.977631] Code: 48 8b 87 28 02 00 00 48 89 75 f8 48 85 c0 74 2a 4c 8b 00 b8 fe ff ff ff 49 39 c0 76 11 80 3d af 61 72 01 00 0f 84 df 06 00 00 <0f> 0b c9 c3 48 83 bf 38 02 00 00 00 74 f2 eb e3 80 3d 93 61 72 01
> [ 1602.977634] RSP: 0018:ffffa0a6834dfc00 EFLAGS: 00010046
> [ 1602.977636] RAX: 0000000000000000 RBX: ffff8ec574aeb010 RCX: 0000000000000000
> [ 1602.977638] RDX: 0000000000000007 RSI: 0000000000000086 RDI: 0000000000000000
> [ 1602.977640] RBP: ffffa0a6834dfc08 R08: 0000000000000569 R09: ffffffffa2189fb8
> [ 1602.977642] R10: 0000000000000069 R11: ffffa0a6834df940 R12: 0000000000080000
> [ 1602.977644] R13: ffff8ec5ad536218 R14: ffff8ec5ad693800 R15: ffff8ec5ad693800
> [ 1602.977647] FS:  0000000000000000(0000) GS:ffff8ec5be980000(0000) knlGS:0000000000000000
> [ 1602.977649] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1602.977651] CR2: 00007f05a56b7000 CR3: 000000036fc0a006 CR4: 00000000001606e0
> [ 1602.977653] Call Trace:
> [ 1602.977660]  dma_direct_map_page+0xdf/0xf0
> [ 1602.977669]  usb_gadget_map_request_by_dev+0x17a/0x190 [udc_core]
> [ 1602.977679]  __cdns3_gadget_ep_queue.isra.30+0x149/0x2e0 [cdns3]
> [ 1602.977686]  ? kmalloc_order+0x18/0x40
> [ 1602.977693]  cdns3_gadget_ep_queue+0x53/0x100 [cdns3]
> [ 1602.977698]  usb_ep_queue+0x36/0xa0 [udc_core]
> [ 1602.977704]  usbg_send_write_request+0x1ae/0x250 [usb_f_tcm]
> [ 1602.977731]  transport_generic_new_cmd+0x1bc/0x320 [target_core_mod]
> [ 1602.977749]  transport_handle_cdb_direct+0x42/0x60 [target_core_mod]
> [ 1602.977766]  target_submit_cmd_map_sgls+0x176/0x230 [target_core_mod]
> [ 1602.977771]  ? __switch_to_asm+0x40/0x70
> [ 1602.977788]  target_submit_cmd+0x26/0x30 [target_core_mod]
> [ 1602.977794]  usbg_cmd_work+0x60/0xd0 [usb_f_tcm]
> [ 1602.977799]  process_one_work+0x20f/0x410
> [ 1602.977802]  worker_thread+0x34/0x400
> [ 1602.977807]  kthread+0x120/0x140
> [ 1602.977811]  ? process_one_work+0x410/0x410
> [ 1602.977815]  ? __kthread_parkme+0x70/0x70
> [ 1602.977818]  ret_from_fork+0x35/0x40
> [ 1602.977822] ---[ end trace 70f27f846049ae32 ]---
> [ 1602.977826] cdns-usb3 cdns-usb3.1: failed to map buffer
> [ 1602.977853] uasp_send_write_request(695)
> 
> Regards,
> Jayshri
> 
>> cheers,
>> -roger
>>
>>> Peter
>>>
>>>>
>>>> Signed-off-by: Pawel Laszczak <pawell@cadence.com>
>>>> Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
>>>> ---
>>>>    drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
>>>>    include/linux/usb/gadget.h          |  2 ++
>>>>    2 files changed, 16 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/gadget/function/f_tcm.c
>>>> b/drivers/usb/gadget/function/f_tcm.c
>>>> index 36504931b2d1..a78d5fad3d84 100644
>>>> --- a/drivers/usb/gadget/function/f_tcm.c
>>>> +++ b/drivers/usb/gadget/function/f_tcm.c
>>>> @@ -213,7 +213,8 @@ static int bot_send_read_response(struct
>> usbg_cmd *cmd)
>>>>    	}
>>>>
>>>>    	if (!gadget->sg_supported) {
>>>> -		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_ATOMIC);
>>>> +		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_ATOMIC |
>>>> +					gadget->dma_flag);
>>>>    		if (!cmd->data_buf)
>>>>    			return -ENOMEM;
>>>>
>>>> @@ -257,7 +258,8 @@ static int bot_send_write_request(struct
>> usbg_cmd *cmd)
>>>>    	}
>>>>
>>>>    	if (!gadget->sg_supported) {
>>>> -		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_KERNEL);
>>>> +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL
>> |
>>>> +					gadget->dma_flag);
>>>>    		if (!cmd->data_buf)
>>>>    			return -ENOMEM;
>>>>
>>>> @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep,
>> struct usb_request *req)
>>>>    static int bot_prepare_reqs(struct f_uas *fu)
>>>>    {
>>>>    	int ret;
>>>> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
>>>>
>>>>    	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
>>>>    	if (!fu->bot_req_in)
>>>> @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
>>>>    	fu->bot_status.req->complete = bot_status_complete;
>>>>    	fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
>>>>
>>>> -	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
>>>> +	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
>>>> +				gadget->dma_flag);
>>>>    	if (!fu->cmd.buf)
>>>>    		goto err_buf;
>>>>
>>>> @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct
>> usbg_cmd *cmd)
>>>>    	struct uas_stream *stream = cmd->stream;
>>>>
>>>>    	if (!gadget->sg_supported) {
>>>> -		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_ATOMIC);
>>>> +		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_ATOMIC |
>>>> +					gadget->dma_flag);
>>>>    		if (!cmd->data_buf)
>>>>    			return -ENOMEM;
>>>>
>>>> @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas
>>>> *fu, struct uas_stream *stream)
>>>>
>>>>    static int uasp_alloc_cmd(struct f_uas *fu)
>>>>    {
>>>> +	struct usb_gadget *gadget = fuas_to_gadget(fu);
>>>>    	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
>>>>    	if (!fu->cmd.req)
>>>>    		goto err;
>>>>
>>>> -	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
>>>> +	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
>>>> +				gadget->dma_flag);
>>>>    	if (!fu->cmd.buf)
>>>>    		goto err_buf;
>>>>
>>>> @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct
>> usbg_cmd *cmd, struct usb_request *req)
>>>>    	struct usb_gadget *gadget = fuas_to_gadget(fu);
>>>>
>>>>    	if (!gadget->sg_supported) {
>>>> -		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_ATOMIC);
>>>> +		cmd->data_buf = kmalloc(se_cmd->data_length,
>> GFP_ATOMIC |
>>>> +					gadget->dma_flag);
>>>>    		if (!cmd->data_buf)
>>>>    			return -ENOMEM;
>>>>
>>>> diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
>>>> index 124462d65eac..d6c9cd222600 100644
>>>> --- a/include/linux/usb/gadget.h
>>>> +++ b/include/linux/usb/gadget.h
>>>> @@ -373,6 +373,7 @@ struct usb_gadget_ops {
>>>>     * @connected: True if gadget is connected.
>>>>     * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
>>>>     *	indicates that it supports LPM as per the LPM ECN & errata.
>>>> + * @dma_flag: dma zone to be used for buffer allocation.
>>>>     *
>>>>     * Gadgets have a mostly-portable "gadget driver" implementing device
>>>>     * functions, handling all usb configurations and interfaces.
>>>> Gadget @@ -427,6 +428,7 @@ struct usb_gadget {
>>>>    	unsigned			deactivated:1;
>>>>    	unsigned			connected:1;
>>>>    	unsigned			lpm_capable:1;
>>>> +	unsigned int			dma_flag;
>>>>    };
>>>>    #define work_to_gadget(w)	(container_of((w), struct usb_gadget,
>> work))
>>>>
>>>> --
>>>> 2.20.1
>>>>
>>>
>>
>> --
>> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
>> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

-- 
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-15 11:11       ` Roger Quadros
@ 2019-11-15 18:02         ` Konrad Rzeszutek Wilk
  2019-11-22 11:55           ` Jayshri Dajiram Pawar
  0 siblings, 1 reply; 8+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-11-15 18:02 UTC (permalink / raw)
  To: Roger Quadros
  Cc: Jayshri Dajiram Pawar, Peter Chen, linux-usb, gregkh,
	felipe.balbi, heikki.krogerus, linux-kernel, jbergsagel, nsekhar,
	nm, Rahul Kumar, Pawel Laszczak, Sanket Parmar, iommu

On Fri, Nov 15, 2019 at 01:11:41PM +0200, Roger Quadros wrote:
> +Konrad

You can run Linux with CONFIG_DEBU_DMA and use debug_dma_dump_mappings() to dump
and figure things out. See http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/dump/dump_dma.c;h=2ba251a2f8c36c24c68762b3e4c9f76ea178238f;hb=HEAD

> 
> Jayshri,
> 
> On 15/11/2019 12:14, Jayshri Dajiram Pawar wrote:
> > > > > There is a problem when function driver allocate memory for buffer
> > > > > used by DMA from outside dma_mask space.
> > > > > It appears during testing f_tcm driver with cdns3 controller.
> > > > > In the result cdns3 driver was not able to map virtual buffer to DMA.
> > > > > This fix should be improved depending on dma_mask associated with
> > > device.
> > > > > Adding GFP_DMA32 flag while allocationg command data buffer only for
> > > > > 32 bit controllers.
> > > > 
> > > > Hi Jayshri,
> > > > 
> > > > This issue should be fixed by setting DMA_MASK correctly for
> > > > controller, you can't limit user's memory region. At usb_ep_queue, the
> > > > UDC driver will call DMA MAP API, for Cadence, it is
> > > usb_gadget_map_request_by_dev.
> > > > For the system without SMMU (IO-MMU), it will use swiotlb to make sure
> > > > the data buffer used for DMA transfer is within DMA mask for
> > > > controller, There is a reserved low memory region for debounce buffer
> > > > in swiotlb use case.
> > > > 
> > > 
> > > /**
> > >    * struct usb_request - describes one i/o request
> > >    * @buf: Buffer used for data.  Always provide this; some controllers
> > >    *	only use PIO, or don't use DMA for some endpoints.
> > >    * @dma: DMA address corresponding to 'buf'.  If you don't set this
> > >    *	field, and the usb controller needs one, it is responsible
> > >    *	for mapping and unmapping the buffer.
> > > <snip>
> > >    */
> > > 
> > > So if dma is not set in the usb_request then controller driver is responsible to
> > > do a dma_map of the buffer pointed by 'buf' before it attemps to do DMA.
> > > This should take care of DMA mask and swiotlb.
> > > 
> > > This patch is not correct.
> > > 
> > Hi Roger,
> > 
> > We have scatter-gather disabled.
> > We are getting below error while allocation of cmd data buffer with length 524288 or greater, while writing large size files to device.
> > This error occurred on x86 platform.
> > Because of this reason we have added DMA flag while allocation of buffer.
> > 
> > [ 1602.977532] swiotlb_tbl_map_single: 26 callbacks suppressed
> > [ 1602.977536] cdns-usb3 cdns-usb3.1: swiotlb buffer is full (sz: 524288 bytes), total 32768 (slots), used 0 (slots)
> 
> Why is swiotlb buffer getting full? How much is it on your system?
> Are you sure that dma_unmap is happening on requests that complete? else we'll just keep hogging the swiotlb buffer.
> 
> cheers,
> -roger
> 
> > [ 1602.977542] cdns-usb3 cdns-usb3.1: overflow 0x00000007eee00000+524288 of DMA mask ffffffff bus mask 0
> > [ 1602.977555] WARNING: CPU: 6 PID: 285 at kernel/dma/direct.c:43 report_addr+0x37/0x60
> > [ 1602.977556] Modules linked in: target_core_user uio target_core_pscsi target_core_file target_core_iblock usb_f_tcm(OE) target_core_mod cdns3(OE) cdns3_pci_wrap(OE) roles(E) libcomposite(OE) udc_core(OE) xt_multiport iptable_filter bpfilter snd_hda_codec_hdmi nls_iso8859_1 i915 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel irqbypass snd_hda_codec snd_hda_core snd_hwdep snd_pcm drm_kms_helper snd_seq_midi snd_seq_midi_event crct10dif_pclmul snd_rawmidi crc32_pclmul drm snd_seq ghash_clmulni_intel snd_seq_device aesni_intel snd_timer mei_me i2c_algo_bit aes_x86_64 crypto_simd cryptd fb_sys_fops glue_helper snd mei input_leds syscopyarea intel_cstate sysfillrect intel_rapl_perf sysimgblt hp_wmi soundcore sparse_keymap serio_raw wmi_bmof tpm_infineon mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid e1000e psmouse ahci lpc_ich libahci i2c_i801 wmi
> > [ 1602.977605]  video
> > [ 1602.977613] CPU: 6 PID: 285 Comm: kworker/6:2 Tainted: G           OE     5.2.0-rc3cdns3-jayshri-stream-common+ #7
> > [ 1602.977615] Hardware name: Hewlett-Packard HP EliteDesk 800 G1 TWR/18E4, BIOS L01 v02.21 12/17/2013
> > [ 1602.977623] Workqueue: tcm_usb_gadget usbg_cmd_work [usb_f_tcm]
> > [ 1602.977628] RIP: 0010:report_addr+0x37/0x60
> > [ 1602.977631] Code: 48 8b 87 28 02 00 00 48 89 75 f8 48 85 c0 74 2a 4c 8b 00 b8 fe ff ff ff 49 39 c0 76 11 80 3d af 61 72 01 00 0f 84 df 06 00 00 <0f> 0b c9 c3 48 83 bf 38 02 00 00 00 74 f2 eb e3 80 3d 93 61 72 01
> > [ 1602.977634] RSP: 0018:ffffa0a6834dfc00 EFLAGS: 00010046
> > [ 1602.977636] RAX: 0000000000000000 RBX: ffff8ec574aeb010 RCX: 0000000000000000
> > [ 1602.977638] RDX: 0000000000000007 RSI: 0000000000000086 RDI: 0000000000000000
> > [ 1602.977640] RBP: ffffa0a6834dfc08 R08: 0000000000000569 R09: ffffffffa2189fb8
> > [ 1602.977642] R10: 0000000000000069 R11: ffffa0a6834df940 R12: 0000000000080000
> > [ 1602.977644] R13: ffff8ec5ad536218 R14: ffff8ec5ad693800 R15: ffff8ec5ad693800
> > [ 1602.977647] FS:  0000000000000000(0000) GS:ffff8ec5be980000(0000) knlGS:0000000000000000
> > [ 1602.977649] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 1602.977651] CR2: 00007f05a56b7000 CR3: 000000036fc0a006 CR4: 00000000001606e0
> > [ 1602.977653] Call Trace:
> > [ 1602.977660]  dma_direct_map_page+0xdf/0xf0
> > [ 1602.977669]  usb_gadget_map_request_by_dev+0x17a/0x190 [udc_core]
> > [ 1602.977679]  __cdns3_gadget_ep_queue.isra.30+0x149/0x2e0 [cdns3]
> > [ 1602.977686]  ? kmalloc_order+0x18/0x40
> > [ 1602.977693]  cdns3_gadget_ep_queue+0x53/0x100 [cdns3]
> > [ 1602.977698]  usb_ep_queue+0x36/0xa0 [udc_core]
> > [ 1602.977704]  usbg_send_write_request+0x1ae/0x250 [usb_f_tcm]
> > [ 1602.977731]  transport_generic_new_cmd+0x1bc/0x320 [target_core_mod]
> > [ 1602.977749]  transport_handle_cdb_direct+0x42/0x60 [target_core_mod]
> > [ 1602.977766]  target_submit_cmd_map_sgls+0x176/0x230 [target_core_mod]
> > [ 1602.977771]  ? __switch_to_asm+0x40/0x70
> > [ 1602.977788]  target_submit_cmd+0x26/0x30 [target_core_mod]
> > [ 1602.977794]  usbg_cmd_work+0x60/0xd0 [usb_f_tcm]
> > [ 1602.977799]  process_one_work+0x20f/0x410
> > [ 1602.977802]  worker_thread+0x34/0x400
> > [ 1602.977807]  kthread+0x120/0x140
> > [ 1602.977811]  ? process_one_work+0x410/0x410
> > [ 1602.977815]  ? __kthread_parkme+0x70/0x70
> > [ 1602.977818]  ret_from_fork+0x35/0x40
> > [ 1602.977822] ---[ end trace 70f27f846049ae32 ]---
> > [ 1602.977826] cdns-usb3 cdns-usb3.1: failed to map buffer
> > [ 1602.977853] uasp_send_write_request(695)
> > 
> > Regards,
> > Jayshri
> > 
> > > cheers,
> > > -roger
> > > 
> > > > Peter
> > > > 
> > > > > 
> > > > > Signed-off-by: Pawel Laszczak <pawell@cadence.com>
> > > > > Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
> > > > > ---
> > > > >    drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
> > > > >    include/linux/usb/gadget.h          |  2 ++
> > > > >    2 files changed, 16 insertions(+), 6 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/usb/gadget/function/f_tcm.c
> > > > > b/drivers/usb/gadget/function/f_tcm.c
> > > > > index 36504931b2d1..a78d5fad3d84 100644
> > > > > --- a/drivers/usb/gadget/function/f_tcm.c
> > > > > +++ b/drivers/usb/gadget/function/f_tcm.c
> > > > > @@ -213,7 +213,8 @@ static int bot_send_read_response(struct
> > > usbg_cmd *cmd)
> > > > >    	}
> > > > > 
> > > > >    	if (!gadget->sg_supported) {
> > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC);
> > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC |
> > > > > +					gadget->dma_flag);
> > > > >    		if (!cmd->data_buf)
> > > > >    			return -ENOMEM;
> > > > > 
> > > > > @@ -257,7 +258,8 @@ static int bot_send_write_request(struct
> > > usbg_cmd *cmd)
> > > > >    	}
> > > > > 
> > > > >    	if (!gadget->sg_supported) {
> > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_KERNEL);
> > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length, GFP_KERNEL
> > > |
> > > > > +					gadget->dma_flag);
> > > > >    		if (!cmd->data_buf)
> > > > >    			return -ENOMEM;
> > > > > 
> > > > > @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep *ep,
> > > struct usb_request *req)
> > > > >    static int bot_prepare_reqs(struct f_uas *fu)
> > > > >    {
> > > > >    	int ret;
> > > > > +	struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > > 
> > > > >    	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in, GFP_KERNEL);
> > > > >    	if (!fu->bot_req_in)
> > > > > @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
> > > > >    	fu->bot_status.req->complete = bot_status_complete;
> > > > >    	fu->bot_status.csw.Signature = cpu_to_le32(US_BULK_CS_SIGN);
> > > > > 
> > > > > -	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL);
> > > > > +	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL |
> > > > > +				gadget->dma_flag);
> > > > >    	if (!fu->cmd.buf)
> > > > >    		goto err_buf;
> > > > > 
> > > > > @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct
> > > usbg_cmd *cmd)
> > > > >    	struct uas_stream *stream = cmd->stream;
> > > > > 
> > > > >    	if (!gadget->sg_supported) {
> > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC);
> > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC |
> > > > > +					gadget->dma_flag);
> > > > >    		if (!cmd->data_buf)
> > > > >    			return -ENOMEM;
> > > > > 
> > > > > @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct f_uas
> > > > > *fu, struct uas_stream *stream)
> > > > > 
> > > > >    static int uasp_alloc_cmd(struct f_uas *fu)
> > > > >    {
> > > > > +	struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > >    	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd, GFP_KERNEL);
> > > > >    	if (!fu->cmd.req)
> > > > >    		goto err;
> > > > > 
> > > > > -	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL);
> > > > > +	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket, GFP_KERNEL |
> > > > > +				gadget->dma_flag);
> > > > >    	if (!fu->cmd.buf)
> > > > >    		goto err_buf;
> > > > > 
> > > > > @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct
> > > usbg_cmd *cmd, struct usb_request *req)
> > > > >    	struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > > 
> > > > >    	if (!gadget->sg_supported) {
> > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC);
> > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > GFP_ATOMIC |
> > > > > +					gadget->dma_flag);
> > > > >    		if (!cmd->data_buf)
> > > > >    			return -ENOMEM;
> > > > > 
> > > > > diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
> > > > > index 124462d65eac..d6c9cd222600 100644
> > > > > --- a/include/linux/usb/gadget.h
> > > > > +++ b/include/linux/usb/gadget.h
> > > > > @@ -373,6 +373,7 @@ struct usb_gadget_ops {
> > > > >     * @connected: True if gadget is connected.
> > > > >     * @lpm_capable: If the gadget max_speed is FULL or HIGH, this flag
> > > > >     *	indicates that it supports LPM as per the LPM ECN & errata.
> > > > > + * @dma_flag: dma zone to be used for buffer allocation.
> > > > >     *
> > > > >     * Gadgets have a mostly-portable "gadget driver" implementing device
> > > > >     * functions, handling all usb configurations and interfaces.
> > > > > Gadget @@ -427,6 +428,7 @@ struct usb_gadget {
> > > > >    	unsigned			deactivated:1;
> > > > >    	unsigned			connected:1;
> > > > >    	unsigned			lpm_capable:1;
> > > > > +	unsigned int			dma_flag;
> > > > >    };
> > > > >    #define work_to_gadget(w)	(container_of((w), struct usb_gadget,
> > > work))
> > > > > 
> > > > > --
> > > > > 2.20.1
> > > > > 
> > > > 
> > > 
> > > --
> > > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> > > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
> 
> -- 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-15 18:02         ` Konrad Rzeszutek Wilk
@ 2019-11-22 11:55           ` Jayshri Dajiram Pawar
  2019-11-25 16:31             ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 8+ messages in thread
From: Jayshri Dajiram Pawar @ 2019-11-22 11:55 UTC (permalink / raw)
  To: Roger Quadros
  Cc: Peter Chen, linux-usb, gregkh, felipe.balbi, heikki.krogerus,
	linux-kernel, jbergsagel, nsekhar, nm, Rahul Kumar,
	Pawel Laszczak, Sanket Parmar, iommu, Konrad Rzeszutek Wilk

> > +Konrad
> 
> You can run Linux with CONFIG_DEBU_DMA and use
> debug_dma_dump_mappings() to dump and figure things out. See
> https://urldefense.proofpoint.com/v2/url?u=http-
> 3A__xenbits.xen.org_gitweb_-3Fp-3Dxentesttools_bootstrap.git-3Ba-3Dblob-
> 3Bf-3Droot-5Fimage_drivers_dump_dump-5Fdma.c-3Bh-
> 3D2ba251a2f8c36c24c68762b3e4c9f76ea178238f-3Bhb-
> 3DHEAD&d=DwIFAg&c=aUq983L2pue2FqKFoP6PGHMJQyoJ7kl3s3GZ-
> _haXqY&r=yUzzs89gsBIbqjpopjycywmchLJKpKHDc_YLMD1daI0&m=uPhjYpgZJ
> ovroCszC7ZGapdrx4F72MK7pqXmzpRyGmA&s=dSO49c-
> githIzhiwgvwOt0m00M2trGWB3AnKL3OKpkw&e=
> 
> >
> > Jayshri,
> >
> > On 15/11/2019 12:14, Jayshri Dajiram Pawar wrote:
> > > > > > There is a problem when function driver allocate memory for
> > > > > > buffer used by DMA from outside dma_mask space.
> > > > > > It appears during testing f_tcm driver with cdns3 controller.
> > > > > > In the result cdns3 driver was not able to map virtual buffer to DMA.
> > > > > > This fix should be improved depending on dma_mask associated
> > > > > > with
> > > > device.
> > > > > > Adding GFP_DMA32 flag while allocationg command data buffer
> > > > > > only for
> > > > > > 32 bit controllers.
> > > > >
> > > > > Hi Jayshri,
> > > > >
> > > > > This issue should be fixed by setting DMA_MASK correctly for
> > > > > controller, you can't limit user's memory region. At
> > > > > usb_ep_queue, the UDC driver will call DMA MAP API, for Cadence,
> > > > > it is
> > > > usb_gadget_map_request_by_dev.
> > > > > For the system without SMMU (IO-MMU), it will use swiotlb to
> > > > > make sure the data buffer used for DMA transfer is within DMA
> > > > > mask for controller, There is a reserved low memory region for
> > > > > debounce buffer in swiotlb use case.
> > > > >
> > > >
> > > > /**
> > > >    * struct usb_request - describes one i/o request
> > > >    * @buf: Buffer used for data.  Always provide this; some controllers
> > > >    *	only use PIO, or don't use DMA for some endpoints.
> > > >    * @dma: DMA address corresponding to 'buf'.  If you don't set this
> > > >    *	field, and the usb controller needs one, it is responsible
> > > >    *	for mapping and unmapping the buffer.
> > > > <snip>
> > > >    */
> > > >
> > > > So if dma is not set in the usb_request then controller driver is
> > > > responsible to do a dma_map of the buffer pointed by 'buf' before it
> attemps to do DMA.
> > > > This should take care of DMA mask and swiotlb.
> > > >
> > > > This patch is not correct.
> > > >
> > > Hi Roger,
> > >
> > > We have scatter-gather disabled.
> > > We are getting below error while allocation of cmd data buffer with
> length 524288 or greater, while writing large size files to device.
> > > This error occurred on x86 platform.
> > > Because of this reason we have added DMA flag while allocation of
> buffer.
> > >
> > > [ 1602.977532] swiotlb_tbl_map_single: 26 callbacks suppressed [
> > > 1602.977536] cdns-usb3 cdns-usb3.1: swiotlb buffer is full (sz:
> > > 524288 bytes), total 32768 (slots), used 0 (slots)
> >
Hi Roger,

> > Why is swiotlb buffer getting full? How much is it on your system?

On our system swiotlb max mapping size is 256KB.
UASP receive data state tries to queue and map buffer of length 524288 (512KB), which is greater than 256KB that's why swiotlb buffer is getting full.

> > Are you sure that dma_unmap is happening on requests that complete?
> else we'll just keep hogging the swiotlb buffer.

Yes, dma_unmap is happening on requests that complete.

I could map buffer of length 512KB with  IO_TLB_SEGSIZE changed to 256.
With this max mapping size is increased to  256*2048 = 512KB. 

+++ b/include/linux/swiotlb.h
@@ -21,7 +21,7 @@ enum swiotlb_force {
  * must be a power of 2.  What is the appropriate value ?
  * The complexity of {map,unmap}_single is linearly dependent on this value.
  */
-#define IO_TLB_SEGSIZE 128
+#define IO_TLB_SEGSIZE 256

Regards,
Jayshri

> >
> > cheers,
> > -roger



> >
> > > [ 1602.977542] cdns-usb3 cdns-usb3.1: overflow
> > > 0x00000007eee00000+524288 of DMA mask ffffffff bus mask 0 [
> > > 1602.977555] WARNING: CPU: 6 PID: 285 at kernel/dma/direct.c:43
> > > report_addr+0x37/0x60 [ 1602.977556] Modules linked in:
> target_core_user uio target_core_pscsi target_core_file target_core_iblock
> usb_f_tcm(OE) target_core_mod cdns3(OE) cdns3_pci_wrap(OE) roles(E)
> libcomposite(OE) udc_core(OE) xt_multiport iptable_filter bpfilter
> snd_hda_codec_hdmi nls_iso8859_1 i915 intel_rapl x86_pkg_temp_thermal
> intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek
> snd_hda_codec_generic ledtrig_audio snd_hda_intel irqbypass
> snd_hda_codec snd_hda_core snd_hwdep snd_pcm drm_kms_helper
> snd_seq_midi snd_seq_midi_event crct10dif_pclmul snd_rawmidi
> crc32_pclmul drm snd_seq ghash_clmulni_intel snd_seq_device aesni_intel
> snd_timer mei_me i2c_algo_bit aes_x86_64 crypto_simd cryptd fb_sys_fops
> glue_helper snd mei input_leds syscopyarea intel_cstate sysfillrect
> intel_rapl_perf sysimgblt hp_wmi soundcore sparse_keymap serio_raw
> wmi_bmof tpm_infineon mac_hid sch_fq_codel parport_pc ppdev lp parport
> ip_tables x_tables autofs4 hid_generic usbhid hid e1000e psmouse ahci
> lpc_ich libahci i2c_i801 wmi [ 1602.977605]  video
> > > [ 1602.977613] CPU: 6 PID: 285 Comm: kworker/6:2 Tainted: G           OE
> 5.2.0-rc3cdns3-jayshri-stream-common+ #7
> > > [ 1602.977615] Hardware name: Hewlett-Packard HP EliteDesk 800 G1
> > > TWR/18E4, BIOS L01 v02.21 12/17/2013 [ 1602.977623] Workqueue:
> > > tcm_usb_gadget usbg_cmd_work [usb_f_tcm] [ 1602.977628] RIP:
> > > 0010:report_addr+0x37/0x60 [ 1602.977631] Code: 48 8b 87 28 02 00 00
> > > 48 89 75 f8 48 85 c0 74 2a 4c 8b 00 b8 fe ff ff ff 49 39 c0 76 11 80
> > > 3d af 61 72 01 00 0f 84 df 06 00 00 <0f> 0b c9 c3 48 83 bf 38 02 00
> > > 00 00 74 f2 eb e3 80 3d 93 61 72 01 [ 1602.977634] RSP:
> > > 0018:ffffa0a6834dfc00 EFLAGS: 00010046 [ 1602.977636] RAX:
> > > 0000000000000000 RBX: ffff8ec574aeb010 RCX: 0000000000000000 [
> > > 1602.977638] RDX: 0000000000000007 RSI: 0000000000000086 RDI:
> > > 0000000000000000 [ 1602.977640] RBP: ffffa0a6834dfc08 R08:
> > > 0000000000000569 R09: ffffffffa2189fb8 [ 1602.977642] R10:
> > > 0000000000000069 R11: ffffa0a6834df940 R12: 0000000000080000 [
> > > 1602.977644] R13: ffff8ec5ad536218 R14: ffff8ec5ad693800 R15:
> ffff8ec5ad693800 [ 1602.977647] FS:  0000000000000000(0000)
> GS:ffff8ec5be980000(0000) knlGS:0000000000000000 [ 1602.977649] CS:
> 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1602.977651] CR2:
> 00007f05a56b7000 CR3: 000000036fc0a006 CR4: 00000000001606e0 [
> 1602.977653] Call Trace:
> > > [ 1602.977660]  dma_direct_map_page+0xdf/0xf0 [ 1602.977669]
> > > usb_gadget_map_request_by_dev+0x17a/0x190 [udc_core] [
> 1602.977679]
> > > __cdns3_gadget_ep_queue.isra.30+0x149/0x2e0 [cdns3] [ 1602.977686]
> > > ? kmalloc_order+0x18/0x40 [ 1602.977693]
> > > cdns3_gadget_ep_queue+0x53/0x100 [cdns3] [ 1602.977698]
> > > usb_ep_queue+0x36/0xa0 [udc_core] [ 1602.977704]
> > > usbg_send_write_request+0x1ae/0x250 [usb_f_tcm] [ 1602.977731]
> > > transport_generic_new_cmd+0x1bc/0x320 [target_core_mod] [
> > > 1602.977749]  transport_handle_cdb_direct+0x42/0x60
> > > [target_core_mod] [ 1602.977766]
> > > target_submit_cmd_map_sgls+0x176/0x230 [target_core_mod] [
> > > 1602.977771]  ? __switch_to_asm+0x40/0x70 [ 1602.977788]
> > > target_submit_cmd+0x26/0x30 [target_core_mod] [ 1602.977794]
> > > usbg_cmd_work+0x60/0xd0 [usb_f_tcm] [ 1602.977799]
> > > process_one_work+0x20f/0x410 [ 1602.977802]
> > > worker_thread+0x34/0x400 [ 1602.977807]  kthread+0x120/0x140 [
> > > 1602.977811]  ? process_one_work+0x410/0x410 [ 1602.977815]  ?
> > > __kthread_parkme+0x70/0x70 [ 1602.977818]  ret_from_fork+0x35/0x40
> [
> > > 1602.977822] ---[ end trace 70f27f846049ae32 ]--- [ 1602.977826]
> > > cdns-usb3 cdns-usb3.1: failed to map buffer [ 1602.977853]
> > > uasp_send_write_request(695)
> > >
> > > Regards,
> > > Jayshri
> > >
> > > > cheers,
> > > > -roger
> > > >
> > > > > Peter
> > > > >
> > > > > >
> > > > > > Signed-off-by: Pawel Laszczak <pawell@cadence.com>
> > > > > > Signed-off-by: Jayshri Pawar <jpawar@cadence.com>
> > > > > > ---
> > > > > >    drivers/usb/gadget/function/f_tcm.c | 20 ++++++++++++++------
> > > > > >    include/linux/usb/gadget.h          |  2 ++
> > > > > >    2 files changed, 16 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/usb/gadget/function/f_tcm.c
> > > > > > b/drivers/usb/gadget/function/f_tcm.c
> > > > > > index 36504931b2d1..a78d5fad3d84 100644
> > > > > > --- a/drivers/usb/gadget/function/f_tcm.c
> > > > > > +++ b/drivers/usb/gadget/function/f_tcm.c
> > > > > > @@ -213,7 +213,8 @@ static int bot_send_read_response(struct
> > > > usbg_cmd *cmd)
> > > > > >    	}
> > > > > >
> > > > > >    	if (!gadget->sg_supported) {
> > > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_ATOMIC);
> > > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_ATOMIC |
> > > > > > +					gadget->dma_flag);
> > > > > >    		if (!cmd->data_buf)
> > > > > >    			return -ENOMEM;
> > > > > >
> > > > > > @@ -257,7 +258,8 @@ static int bot_send_write_request(struct
> > > > usbg_cmd *cmd)
> > > > > >    	}
> > > > > >
> > > > > >    	if (!gadget->sg_supported) {
> > > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_KERNEL);
> > > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> GFP_KERNEL
> > > > |
> > > > > > +					gadget->dma_flag);
> > > > > >    		if (!cmd->data_buf)
> > > > > >    			return -ENOMEM;
> > > > > >
> > > > > > @@ -305,6 +307,7 @@ static void bot_cmd_complete(struct usb_ep
> > > > > > *ep,
> > > > struct usb_request *req)
> > > > > >    static int bot_prepare_reqs(struct f_uas *fu)
> > > > > >    {
> > > > > >    	int ret;
> > > > > > +	struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > > >
> > > > > >    	fu->bot_req_in = usb_ep_alloc_request(fu->ep_in,
> GFP_KERNEL);
> > > > > >    	if (!fu->bot_req_in)
> > > > > > @@ -327,7 +330,8 @@ static int bot_prepare_reqs(struct f_uas *fu)
> > > > > >    	fu->bot_status.req->complete = bot_status_complete;
> > > > > >    	fu->bot_status.csw.Signature =
> > > > > > cpu_to_le32(US_BULK_CS_SIGN);
> > > > > >
> > > > > > -	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket,
> GFP_KERNEL);
> > > > > > +	fu->cmd.buf = kmalloc(fu->ep_out->maxpacket, GFP_KERNEL
> |
> > > > > > +				gadget->dma_flag);
> > > > > >    	if (!fu->cmd.buf)
> > > > > >    		goto err_buf;
> > > > > >
> > > > > > @@ -515,7 +519,8 @@ static int uasp_prepare_r_request(struct
> > > > usbg_cmd *cmd)
> > > > > >    	struct uas_stream *stream = cmd->stream;
> > > > > >
> > > > > >    	if (!gadget->sg_supported) {
> > > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_ATOMIC);
> > > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_ATOMIC |
> > > > > > +					gadget->dma_flag);
> > > > > >    		if (!cmd->data_buf)
> > > > > >    			return -ENOMEM;
> > > > > >
> > > > > > @@ -763,11 +768,13 @@ static int uasp_alloc_stream_res(struct
> > > > > > f_uas *fu, struct uas_stream *stream)
> > > > > >
> > > > > >    static int uasp_alloc_cmd(struct f_uas *fu)
> > > > > >    {
> > > > > > +	struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > > >    	fu->cmd.req = usb_ep_alloc_request(fu->ep_cmd,
> GFP_KERNEL);
> > > > > >    	if (!fu->cmd.req)
> > > > > >    		goto err;
> > > > > >
> > > > > > -	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket,
> GFP_KERNEL);
> > > > > > +	fu->cmd.buf = kmalloc(fu->ep_cmd->maxpacket,
> GFP_KERNEL |
> > > > > > +				gadget->dma_flag);
> > > > > >    	if (!fu->cmd.buf)
> > > > > >    		goto err_buf;
> > > > > >
> > > > > > @@ -980,7 +987,8 @@ static int usbg_prepare_w_request(struct
> > > > usbg_cmd *cmd, struct usb_request *req)
> > > > > >    	struct usb_gadget *gadget = fuas_to_gadget(fu);
> > > > > >
> > > > > >    	if (!gadget->sg_supported) {
> > > > > > -		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_ATOMIC);
> > > > > > +		cmd->data_buf = kmalloc(se_cmd->data_length,
> > > > GFP_ATOMIC |
> > > > > > +					gadget->dma_flag);
> > > > > >    		if (!cmd->data_buf)
> > > > > >    			return -ENOMEM;
> > > > > >
> > > > > > diff --git a/include/linux/usb/gadget.h
> > > > > > b/include/linux/usb/gadget.h index 124462d65eac..d6c9cd222600
> > > > > > 100644
> > > > > > --- a/include/linux/usb/gadget.h
> > > > > > +++ b/include/linux/usb/gadget.h
> > > > > > @@ -373,6 +373,7 @@ struct usb_gadget_ops {
> > > > > >     * @connected: True if gadget is connected.
> > > > > >     * @lpm_capable: If the gadget max_speed is FULL or HIGH, this
> flag
> > > > > >     *	indicates that it supports LPM as per the LPM ECN & errata.
> > > > > > + * @dma_flag: dma zone to be used for buffer allocation.
> > > > > >     *
> > > > > >     * Gadgets have a mostly-portable "gadget driver" implementing
> device
> > > > > >     * functions, handling all usb configurations and interfaces.
> > > > > > Gadget @@ -427,6 +428,7 @@ struct usb_gadget {
> > > > > >    	unsigned			deactivated:1;
> > > > > >    	unsigned			connected:1;
> > > > > >    	unsigned			lpm_capable:1;
> > > > > > +	unsigned int			dma_flag;
> > > > > >    };
> > > > > >    #define work_to_gadget(w)	(container_of((w), struct usb_gadget,
> > > > work))
> > > > > >
> > > > > > --
> > > > > > 2.20.1
> > > > > >
> > > > >
> > > >
> > > > --
> > > > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> > > > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
> >
> > --
> > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
> > Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer
  2019-11-22 11:55           ` Jayshri Dajiram Pawar
@ 2019-11-25 16:31             ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 8+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-11-25 16:31 UTC (permalink / raw)
  To: Jayshri Dajiram Pawar
  Cc: Roger Quadros, Peter Chen, linux-usb, gregkh, felipe.balbi,
	heikki.krogerus, linux-kernel, jbergsagel, nsekhar, nm,
	Rahul Kumar, Pawel Laszczak, Sanket Parmar, iommu

. massive snip..
> > > Why is swiotlb buffer getting full? How much is it on your system?
> 
> On our system swiotlb max mapping size is 256KB.
> UASP receive data state tries to queue and map buffer of length 524288 (512KB), which is greater than 256KB that's why swiotlb buffer is getting full.

What is the reason for the UASP not being able to break the buffer in say two
256KB sg entries?

> 
> > > Are you sure that dma_unmap is happening on requests that complete?
> > else we'll just keep hogging the swiotlb buffer.
> 
> Yes, dma_unmap is happening on requests that complete.
> 
> I could map buffer of length 512KB with  IO_TLB_SEGSIZE changed to 256.
> With this max mapping size is increased to  256*2048 = 512KB.

If we go this route (which I rather dislike as this is a workaround, because
what if the next time there is 1MB buffer? Do we keep on increasing this?) - then
this should be dynamic and an option on the 'swiotlb' command line.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-11-25 16:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-13 10:24 [RFC PATCH] usb: gadget: f_tcm: Added DMA32 flag while allocation of command buffer Jayshri Pawar
2019-11-14  2:50 ` Peter Chen
2019-11-14  9:48   ` Roger Quadros
2019-11-15 10:14     ` Jayshri Dajiram Pawar
2019-11-15 11:11       ` Roger Quadros
2019-11-15 18:02         ` Konrad Rzeszutek Wilk
2019-11-22 11:55           ` Jayshri Dajiram Pawar
2019-11-25 16:31             ` Konrad Rzeszutek Wilk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).