* [PATCH v6 0/4] Re-introduce TX FIFO resize for larger EP bursting
From: Wesley Cheng @ 2021-01-22 4:01 UTC
To: balbi, gregkh, robh+dt, agross, bjorn.andersson
Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp, Wesley Cheng

Changes in V6:
 - Rebased patches to usb-testing.
 - Renamed to PATCH series instead of RFC.
 - Checking for fs_descriptors instead of ss_descriptors for determining the
   endpoint count for a particular configuration.
 - Re-ordered patch series to fix patch dependencies.

Changes in V5:
 - Added check_config() logic, which is used to communicate the number of EPs
   used in a particular configuration. Based on this, the DWC3 gadget driver
   has the ability to know the maximum number of eps utilized in all configs.
   This helps reduce unnecessary allocation to unused eps, and will catch fifo
   allocation issues at bind() time.
 - Fixed variable declaration to single line per variable, and reverse xmas.
 - Created a helper for fifo clearing, which is used by ep0.c

Changes in V4:
 - Removed struct dwc3* as an argument for dwc3_gadget_resize_tx_fifos()
 - Removed WARN_ON(1) in case we run out of fifo space

Changes in V3:
 - Removed "Reviewed-by" tags
 - Renamed series back to RFC
 - Modified logic to ensure that fifo_size is reset if we pass the minimum
   threshold. Tested with binding multiple FDs requesting 6 FIFOs.

Changes in V2:
 - Modified TXFIFO resizing logic to ensure that each EP is reserved a
   FIFO.
 - Removed dev_dbg() prints and fixed typos from patches
 - Added some more description on the dt-bindings commit message

Currently, there is no way to resize the TXFIFOs, so the driver relies on the
HW default setting for the TXFIFO depth.
In most cases, the HW default is probably sufficient, but for USB compositions
that contain multiple functions that require EP bursting, the default settings
might not be enough. Also note that the current SW will assign an EP to a
function driver without checking whether the TXFIFO size for that particular
EP is large enough. (This is a problem if there are multiple HW-defined values
for the TXFIFO size.)

The SNPS databook states that a minimum TX FIFO depth of 3 is required for an
EP that supports bursting. Otherwise, bursts may end frequently. For
high-bandwidth functions, such as data tethering (protocols that support data
aggregation), mass storage, and media transfer protocol (over FFS), the
bMaxBurst value can be large, and a bigger TXFIFO depth may prove beneficial
in terms of USB throughput (which can be associated with system access
latency, etc.). It allows for a more consistent burst of traffic, without
interruptions, as data is readily available in the FIFO.

Testing with the mass storage function driver shows that a larger TXFIFO depth
increases bandwidth significantly (from roughly 190 MB/s to roughly 290 MB/s
for sequential reads in the results below).

Test Parameters:
 - Platform: Qualcomm SM8150
 - bMaxBurst = 6
 - USB req size = 256kB
 - Num of USB reqs = 16
 - USB Speed = Super-Speed
 - Function Driver: Mass Storage (w/ ramdisk)
 - Test Application: CrystalDiskMark

Results:

TXFIFO Depth = 3 max packets

Test Case | Data Size | AVG tput (in MB/s)
-------------------------------------------
Sequential|1 GB x     |
Read      |9 loops    | 193.60
          |           | 195.86
          |           | 184.77
          |           | 193.60
-------------------------------------------

TXFIFO Depth = 6 max packets

Test Case | Data Size | AVG tput (in MB/s)
-------------------------------------------
Sequential|1 GB x     |
Read      |9 loops    | 287.35
          |           | 304.94
          |           | 289.64
          |           | 293.61
-------------------------------------------

Wesley Cheng (4):
  usb: gadget: udc: core: Introduce check_config to verify USB configuration
  usb: gadget: configfs: Check USB configuration before adding
  usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  arm64: boot: dts: qcom: sm8150: Enable dynamic TX FIFO resize logic

 arch/arm64/boot/dts/qcom/sm8150.dtsi |   1 +
 drivers/usb/dwc3/core.c              |   2 +
 drivers/usb/dwc3/core.h              |   8 ++
 drivers/usb/dwc3/ep0.c               |   2 +
 drivers/usb/dwc3/gadget.c            | 194 +++++++++++++++++++++++++++++++++++
 drivers/usb/gadget/configfs.c        |  22 ++++
 drivers/usb/gadget/udc/core.c        |   9 ++
 include/linux/usb/gadget.h           |   2 +
 8 files changed, 240 insertions(+)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

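Averaging the four runs in each results table gives about 192 MB/s at a TXFIFO depth of 3 and about 294 MB/s at a depth of 6, a gain of roughly 53%. The sketch below reproduces that arithmetic. The helper names mean() and gain_percent() are illustrative only and are not part of the series.

```c
#include <assert.h>

/* Throughput samples in MB/s, copied from the two results tables. */
static const double depth3_mbps[] = { 193.60, 195.86, 184.77, 193.60 };
static const double depth6_mbps[] = { 287.35, 304.94, 289.64, 293.61 };

/* Arithmetic mean of n samples (illustrative helper). */
static double mean(const double *v, int n)
{
	double sum = 0.0;
	int i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / n;
}

/* Relative improvement of 'improved' over 'base', in percent. */
static double gain_percent(double base, double improved)
{
	return (improved / base - 1.0) * 100.0;
}
```

Calling gain_percent(mean(depth3_mbps, 4), mean(depth6_mbps, 4)) yields a value just above 53.
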
* [PATCH v6 1/4] usb: gadget: udc: core: Introduce check_config to verify USB configuration
From: Wesley Cheng @ 2021-01-22 4:01 UTC
To: balbi, gregkh, robh+dt, agross, bjorn.andersson
Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp, Wesley Cheng

Some UDCs may have constraints on how many high bandwidth endpoints it can
support in a certain configuration. This API allows for the composite
driver to pass down the total number of endpoints to the UDC so it can verify
it has the required resources to support the configuration.

Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
---
 drivers/usb/gadget/udc/core.c | 9 +++++++++
 include/linux/usb/gadget.h    | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 4173acd..469962f 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1003,6 +1003,15 @@ int usb_gadget_ep_match_desc(struct usb_gadget *gadget,
 }
 EXPORT_SYMBOL_GPL(usb_gadget_ep_match_desc);
 
+int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map)
+{
+	if (!gadget->ops->check_config)
+		return 0;
+
+	return gadget->ops->check_config(gadget, ep_map);
+}
+EXPORT_SYMBOL_GPL(usb_gadget_check_config);
+
 /* ------------------------------------------------------------------------- */
 
 static void usb_gadget_state_work(struct work_struct *work)
diff --git a/include/linux/usb/gadget.h b/include/linux/usb/gadget.h
index ee04ef2..8393fa8 100644
--- a/include/linux/usb/gadget.h
+++ b/include/linux/usb/gadget.h
@@ -328,6 +328,7 @@ struct usb_gadget_ops {
 	struct usb_ep *(*match_ep)(struct usb_gadget *,
 			struct usb_endpoint_descriptor *,
 			struct usb_ss_ep_comp_descriptor *);
+	int	(*check_config)(struct usb_gadget *gadget, unsigned long ep_map);
 };
 
 /**
@@ -607,6 +608,7 @@ int usb_gadget_connect(struct usb_gadget *gadget);
 int usb_gadget_disconnect(struct usb_gadget *gadget);
 int usb_gadget_deactivate(struct usb_gadget *gadget);
 int usb_gadget_activate(struct usb_gadget *gadget);
+int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map);
 #else
 static inline int usb_gadget_frame_number(struct usb_gadget *gadget)
 { return 0; }

* Re: [PATCH v6 1/4] usb: gadget: udc: core: Introduce check_config to verify USB configuration
From: Jack Pham @ 2021-01-22 5:17 UTC
To: Wesley Cheng
Cc: balbi, gregkh, robh+dt, agross, bjorn.andersson, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen

Hi Wesley,

On Thu, Jan 21, 2021 at 08:01:37PM -0800, Wesley Cheng wrote:
> Some UDCs may have constraints on how many high bandwidth endpoints it can
> support in a certain configuration. This API allows for the composite
> driver to pass down the total number of endpoints to the UDC so it can verify
> it has the required resources to support the configuration.
>
> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
> ---
>  drivers/usb/gadget/udc/core.c | 9 +++++++++
>  include/linux/usb/gadget.h    | 2 ++
>  2 files changed, 11 insertions(+)
>
> diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
> index 4173acd..469962f 100644
> --- a/drivers/usb/gadget/udc/core.c
> +++ b/drivers/usb/gadget/udc/core.c
> @@ -1003,6 +1003,15 @@ int usb_gadget_ep_match_desc(struct usb_gadget *gadget,
>  }
>  EXPORT_SYMBOL_GPL(usb_gadget_ep_match_desc);
>
> +int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map)

You should probably add a kernel-doc for this function.

Jack

* Re: [PATCH v6 1/4] usb: gadget: udc: core: Introduce check_config to verify USB configuration
From: Wesley Cheng @ 2021-01-26 1:01 UTC
To: Jack Pham
Cc: balbi, gregkh, robh+dt, agross, bjorn.andersson, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen

On 1/21/2021 9:17 PM, Jack Pham wrote:
> Hi Wesley,
>
> On Thu, Jan 21, 2021 at 08:01:37PM -0800, Wesley Cheng wrote:
>> Some UDCs may have constraints on how many high bandwidth endpoints it can
>> support in a certain configuration. This API allows for the composite
>> driver to pass down the total number of endpoints to the UDC so it can verify
>> it has the required resources to support the configuration.
>>
>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
>> ---
>>  drivers/usb/gadget/udc/core.c | 9 +++++++++
>>  include/linux/usb/gadget.h    | 2 ++
>>  2 files changed, 11 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
>> index 4173acd..469962f 100644
>> --- a/drivers/usb/gadget/udc/core.c
>> +++ b/drivers/usb/gadget/udc/core.c
>> @@ -1003,6 +1003,15 @@ int usb_gadget_ep_match_desc(struct usb_gadget *gadget,
>>  }
>>  EXPORT_SYMBOL_GPL(usb_gadget_ep_match_desc);
>>
>> +int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map)
>
> You should probably add a kernel-doc for this function.
>
> Jack
>

Hi Jack,

Sure, I'll update a bit more about how this API can be used.

Thanks
Wesley Cheng

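A kernel-doc comment of the kind Jack asks for might look roughly like the one below. The wording is a guess at what such a comment could say, not the text of any later revision. The struct definitions are trimmed userspace stand-ins so the snippet compiles outside the kernel, and demo_reject_all() is a hypothetical UDC callback added only to exercise the delegation path.

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed stand-ins for the kernel types; illustration only. */
struct usb_gadget;

struct usb_gadget_ops {
	int (*check_config)(struct usb_gadget *gadget, unsigned long ep_map);
};

struct usb_gadget {
	const struct usb_gadget_ops *ops;
};

/**
 * usb_gadget_check_config - ask the UDC whether it can support a configuration
 * @gadget: controller to query
 * @ep_map: bitmap of the endpoints used by the configuration
 *
 * Hypothetical wording. A UDC that does not implement the check_config()
 * op accepts every configuration.
 *
 * Returns zero on success, else the error reported by the UDC driver.
 */
int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map)
{
	if (!gadget->ops->check_config)
		return 0;

	return gadget->ops->check_config(gadget, ep_map);
}

/* Hypothetical UDC callback that refuses every configuration. */
static int demo_reject_all(struct usb_gadget *gadget, unsigned long ep_map)
{
	assert(gadget != NULL);
	(void)ep_map;
	return -12;	/* numeric value of ENOMEM on Linux */
}
```
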
* Re: [PATCH v6 1/4] usb: gadget: udc: core: Introduce check_config to verify USB configuration
From: Alan Stern @ 2021-01-22 16:24 UTC
To: Wesley Cheng
Cc: balbi, gregkh, robh+dt, agross, bjorn.andersson, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On Thu, Jan 21, 2021 at 08:01:37PM -0800, Wesley Cheng wrote:
> Some UDCs may have constraints on how many high bandwidth endpoints it can
> support in a certain configuration. This API allows for the composite
> driver to pass down the total number of endpoints to the UDC so it can verify
> it has the required resources to support the configuration.
>
> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>


> --- a/include/linux/usb/gadget.h
> +++ b/include/linux/usb/gadget.h
> @@ -328,6 +328,7 @@ struct usb_gadget_ops {
>  	struct usb_ep *(*match_ep)(struct usb_gadget *,
>  			struct usb_endpoint_descriptor *,
>  			struct usb_ss_ep_comp_descriptor *);
> +	int	(*check_config)(struct usb_gadget *gadget, unsigned long ep_map);
>  };
>
>  /**
> @@ -607,6 +608,7 @@ int usb_gadget_connect(struct usb_gadget *gadget);
>  int usb_gadget_disconnect(struct usb_gadget *gadget);
>  int usb_gadget_deactivate(struct usb_gadget *gadget);
>  int usb_gadget_activate(struct usb_gadget *gadget);
> +int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map);
>  #else
>  static inline int usb_gadget_frame_number(struct usb_gadget *gadget)
>  { return 0; }

Don't you also need an entry for the case where CONFIG_USB_GADGET isn't
enabled?

Alan Stern

* Re: [PATCH v6 1/4] usb: gadget: udc: core: Introduce check_config to verify USB configuration
From: Wesley Cheng @ 2021-01-26 1:02 UTC
To: Alan Stern
Cc: balbi, gregkh, robh+dt, agross, bjorn.andersson, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On 1/22/2021 8:24 AM, Alan Stern wrote:
> On Thu, Jan 21, 2021 at 08:01:37PM -0800, Wesley Cheng wrote:
>> Some UDCs may have constraints on how many high bandwidth endpoints it can
>> support in a certain configuration. This API allows for the composite
>> driver to pass down the total number of endpoints to the UDC so it can verify
>> it has the required resources to support the configuration.
>>
>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
>
>
>> --- a/include/linux/usb/gadget.h
>> +++ b/include/linux/usb/gadget.h
>> @@ -328,6 +328,7 @@ struct usb_gadget_ops {
>>  	struct usb_ep *(*match_ep)(struct usb_gadget *,
>>  			struct usb_endpoint_descriptor *,
>>  			struct usb_ss_ep_comp_descriptor *);
>> +	int	(*check_config)(struct usb_gadget *gadget, unsigned long ep_map);
>>  };
>>
>>  /**
>> @@ -607,6 +608,7 @@ int usb_gadget_connect(struct usb_gadget *gadget);
>>  int usb_gadget_disconnect(struct usb_gadget *gadget);
>>  int usb_gadget_deactivate(struct usb_gadget *gadget);
>>  int usb_gadget_activate(struct usb_gadget *gadget);
>> +int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map);
>>  #else
>>  static inline int usb_gadget_frame_number(struct usb_gadget *gadget)
>>  { return 0; }
>
> Don't you also need an entry for the case where CONFIG_USB_GADGET isn't
> enabled?
>
> Alan Stern
>

Hi Alan,

Thanks for pointing that out. I missed that, and will add it to the next rev.

Thanks
Wesley Cheng

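The entry Alan is asking about would presumably be a static inline stub in the #else branch of gadget.h, next to the existing usb_gadget_frame_number() stub. The following is a minimal sketch of that shape, not the code from a later revision. It uses a made-up macro, CONFIG_USB_GADGET_DEMO, in place of the kernel's IS_ENABLED(CONFIG_USB_GADGET) test so that it builds in userspace.

```c
struct usb_gadget;	/* opaque here; the real definition lives in gadget.h */

/* CONFIG_USB_GADGET_DEMO is a hypothetical stand-in for the kernel option. */
#ifdef CONFIG_USB_GADGET_DEMO
int usb_gadget_check_config(struct usb_gadget *gadget, unsigned long ep_map);
#else
/* Without a gadget core there is nothing to verify, so report success. */
static inline int usb_gadget_check_config(struct usb_gadget *gadget,
					   unsigned long ep_map)
{
	(void)gadget;
	(void)ep_map;
	return 0;
}
#endif
```

Returning 0 matches the convention of the neighbouring stubs, so callers such as configfs keep compiling and see every configuration as acceptable when gadget support is compiled out.
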
* [PATCH v6 2/4] usb: gadget: configfs: Check USB configuration before adding
From: Wesley Cheng @ 2021-01-22 4:01 UTC
To: balbi, gregkh, robh+dt, agross, bjorn.andersson
Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp, Wesley Cheng

Ensure that the USB gadget is able to support the configuration being
added based on the number of endpoints required from all interfaces. This
is for accounting for any bandwidth or space limitations.

Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
---
 drivers/usb/gadget/configfs.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/usb/gadget/configfs.c b/drivers/usb/gadget/configfs.c
index 0d56f33..e6de3ca5 100644
--- a/drivers/usb/gadget/configfs.c
+++ b/drivers/usb/gadget/configfs.c
@@ -1368,6 +1368,7 @@ static int configfs_composite_bind(struct usb_gadget *gadget,
 		struct usb_function *f;
 		struct usb_function *tmp;
 		struct gadget_config_name *cn;
+		unsigned long ep_map = 0;
 
 		if (gadget_is_otg(gadget))
 			c->descriptors = otg_desc;
@@ -1397,7 +1398,28 @@ static int configfs_composite_bind(struct usb_gadget *gadget,
 				list_add(&f->list, &cfg->func_list);
 				goto err_purge_funcs;
 			}
+			if (f->fs_descriptors) {
+				struct usb_descriptor_header **d;
+
+				d = f->fs_descriptors;
+				for (; *d; ++d) {
+					struct usb_endpoint_descriptor *ep;
+					int addr;
+
+					if ((*d)->bDescriptorType != USB_DT_ENDPOINT)
+						continue;
+
+					ep = (struct usb_endpoint_descriptor *)*d;
+					addr = ((ep->bEndpointAddress & 0x80) >> 3) |
+						(ep->bEndpointAddress & 0x0f);
+					set_bit(addr, &ep_map);
+				}
+			}
 		}
+		ret = usb_gadget_check_config(cdev->gadget, ep_map);
+		if (ret)
+			goto err_purge_funcs;
+
 		usb_ep_autoconfig_reset(cdev->gadget);
 	}
 	if (cdev->use_os_string) {

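The addr computation in this patch moves the direction bit (0x80) down to bit 4, so OUT endpoint N lands on bit N of ep_map and IN endpoint N lands on bit 16 + N. That layout is why patch 3/4 can isolate the IN endpoints with ep_map >> 16. The userspace sketch below illustrates the encoding. The helper names are illustrative, and the hand-written bit loop stands in for the kernel's hweight_long().

```c
#include <assert.h>

#define USB_DIR_IN	0x80	/* direction bit of bEndpointAddress */

/* Bit position used for an endpoint in ep_map, as computed in the patch. */
static unsigned int ep_map_bit(unsigned char bEndpointAddress)
{
	return ((bEndpointAddress & USB_DIR_IN) >> 3) |
	       (bEndpointAddress & 0x0f);
}

/* Build a bitmap from a list of endpoint addresses (illustrative helper). */
static unsigned long build_ep_map(const unsigned char *addrs, int n)
{
	unsigned long map = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		map |= 1UL << ep_map_bit(addrs[i]);
	return map;
}

/* Number of IN endpoints: population count of bits 16..31. */
static int count_in_eps(unsigned long map)
{
	unsigned long in = (map >> 16) & 0xffffUL;
	int count = 0;

	for (; in; in &= in - 1)	/* clear the lowest set bit each pass */
		count++;
	return count;
}
```

For a function with endpoints 0x81, 0x01, 0x82 and 0x83, build_ep_map() sets bits 17, 1, 18 and 19, and count_in_eps() reports three IN endpoints.
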
* [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
From: Wesley Cheng @ 2021-01-22 4:01 UTC
To: balbi, gregkh, robh+dt, agross, bjorn.andersson
Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp, Wesley Cheng

Some devices have USB compositions which may require multiple endpoints
that support EP bursting. HW defined TX FIFO sizes may not always be
sufficient for these compositions. By utilizing flexible TX FIFO
allocation, this allows for endpoints to request the required FIFO depth to
achieve higher bandwidth. With some higher bMaxBurst configurations, using
a larger TX FIFO size results in better TX throughput.

By introducing the check_config() callback, the resizing logic can fetch
the maximum number of endpoints used in the USB composition (can contain
multiple configurations), which helps ensure that the resizing logic can
fulfill the configuration(s), or return an error to the gadget layer
otherwise during bind time.

Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
---
 drivers/usb/dwc3/core.c   |   2 +
 drivers/usb/dwc3/core.h   |   8 ++
 drivers/usb/dwc3/ep0.c    |   2 +
 drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 206 insertions(+)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 6969196..e7fa6af 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
 				&tx_thr_num_pkt_prd);
 	device_property_read_u8(dev, "snps,tx-max-burst-prd",
 				&tx_max_burst_prd);
+	dwc->needs_fifo_resize = device_property_read_bool(dev,
+							   "tx-fifo-resize");
 
 	dwc->disable_scramble_quirk = device_property_read_bool(dev,
 				"snps,disable_scramble_quirk");
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index eec1cf4..983b2fd4 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -1223,6 +1223,7 @@ struct dwc3 {
 	unsigned		is_utmi_l1_suspend:1;
 	unsigned		is_fpga:1;
 	unsigned		pending_events:1;
+	unsigned		needs_fifo_resize:1;
 	unsigned		pullups_connected:1;
 	unsigned		setup_packet_pending:1;
 	unsigned		three_stage_setup:1;
@@ -1257,6 +1258,10 @@ struct dwc3 {
 	unsigned		dis_split_quirk:1;
 
 	u16			imod_interval;
+
+	int			max_cfg_eps;
+	int			last_fifo_depth;
+	int			num_ep_resized;
 };
 
 #define INCRX_BURST_MODE 0
@@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
 		struct dwc3_gadget_ep_cmd_params *params);
 int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd,
 		u32 param);
+void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc);
 #else
 static inline int dwc3_gadget_init(struct dwc3 *dwc)
 { return 0; }
@@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
 static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc,
 		int cmd, u32 param)
 { return 0; }
+static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
+{ }
 #endif
 
 #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE)
diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index 8b668ef..4f216bd 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
 		return -EINVAL;
 
 	case USB_STATE_ADDRESS:
+		dwc3_gadget_clear_tx_fifos(dwc);
+
 		ret = dwc3_ep0_delegate_req(dwc, ctrl);
 		/* if the cfg matches and the cfg is non zero */
 		if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) {
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 86f257f..26f9d64 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
 static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
 		bool interrupt);
 
+static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult)
+{
+	int max_packet = 1024;
+	int fifo_size;
+	int mdwidth;
+
+	mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0);
+	/* MDWIDTH is represented in bits, we need it in bytes */
+	mdwidth >>= 3;
+
+	fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1;
+	return fifo_size;
+}
+
+void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
+{
+	struct dwc3_ep *dep;
+	int fifo_depth;
+	int size;
+	int num;
+
+	if (!dwc->needs_fifo_resize)
+		return;
+
+	/* Read ep0IN related TXFIFO size */
+	dep = dwc->eps[1];
+	size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0));
+	if (DWC3_IP_IS(DWC31))
+		fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size);
+	else
+		fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size);
+
+	dwc->last_fifo_depth = fifo_depth;
+	/* Clear existing TXFIFO for all IN eps except ep0 */
+	for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM);
+	     num += 2) {
+		dep = dwc->eps[num];
+		/* Don't change TXFRAMNUM on usb31 version */
+		size = DWC3_IP_IS(DWC31) ?
+			dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) &
+				   DWC31_GTXFIFOSIZ_TXFRAMNUM : 0;
+
+		dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size);
+	}
+	dwc->num_ep_resized = 0;
+}
+
+/*
+ * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case
+ * @dwc: pointer to our context structure
+ *
+ * This function will a best effort FIFO allocation in order
+ * to improve FIFO usage and throughput, while still allowing
+ * us to enable as many endpoints as possible.
+ *
+ * Keep in mind that this operation will be highly dependent
+ * on the configured size for RAM1 - which contains TxFifo -,
+ * the amount of endpoints enabled on coreConsultant tool, and
+ * the width of the Master Bus.
+ *
+ * In general, FIFO depths are represented with the following equation:
+ *
+ * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1
+ *
+ * Conversions can be done to the equation to derive the number of packets that
+ * will fit to a particular FIFO size value.
+ */
+static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep)
+{
+	struct dwc3 *dwc = dep->dwc;
+	int fifo_0_start;
+	int ram1_depth;
+	int fifo_size;
+	int min_depth;
+	int num_in_ep;
+	int remaining;
+	int mult = 1;
+	int fifo;
+	int tmp;
+
+	if (!dwc->needs_fifo_resize)
+		return 0;
+
+	/* resize IN endpoints except ep0 */
+	if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1)
+		return 0;
+
+	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
+
+	if ((dep->endpoint.maxburst > 1 &&
+	     usb_endpoint_xfer_bulk(dep->endpoint.desc)) ||
+	    usb_endpoint_xfer_isoc(dep->endpoint.desc))
+		mult = 3;
+
+	if (dep->endpoint.maxburst > 6 &&
+	    usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31))
+		mult = 6;
+
+	/* FIFO size for a single buffer */
+	fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1);
+
+	/* Calculate the number of remaining EPs w/o any FIFO */
+	num_in_ep = dwc->max_cfg_eps;
+	num_in_ep -= dwc->num_ep_resized;
+
+	/* Reserve at least one FIFO for the number of IN EPs */
+	min_depth = num_in_ep * (fifo + 1);
+	remaining = ram1_depth - min_depth - dwc->last_fifo_depth;
+
+	/*
+	 * We've already reserved 1 FIFO per EP, so check what we can fit in
+	 * addition to it. If there is not enough remaining space, allocate
+	 * all the remaining space to the EP.
+	 */
+	fifo_size = (mult - 1) * fifo;
+	if (remaining < fifo_size) {
+		if (remaining > 0)
+			fifo_size = remaining;
+		else
+			fifo_size = 0;
+	}
+
+	fifo_size += fifo;
+	/* Last increment according to the TX FIFO size equation */
+	fifo_size++;
+
+	/* Check if TXFIFOs start at non-zero addr */
+	tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0));
+	fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp);
+
+	fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16));
+	if (DWC3_IP_IS(DWC31))
+		dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
+	else
+		dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size);
+
+	/* Check fifo size allocation doesn't exceed available RAM size. */
+	if (dwc->last_fifo_depth >= ram1_depth) {
+		dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n",
+			dwc->last_fifo_depth, ram1_depth,
+			dep->endpoint.name, fifo_size);
+		if (DWC3_IP_IS(DWC31))
+			fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
+		else
+			fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size);
+		dwc->last_fifo_depth -= fifo_size;
+		return -ENOMEM;
+	}
+
+	dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size);
+	dwc->num_ep_resized++;
+
+	return 0;
+}
+
 /**
  * __dwc3_gadget_ep_enable - initializes a hw endpoint
  * @dep: endpoint to be initialized
@@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
 	int			ret;
 
 	if (!(dep->flags & DWC3_EP_ENABLED)) {
+		ret = dwc3_gadget_resize_tx_fifos(dep);
+		if (ret)
+			return ret;
+
 		ret = dwc3_gadget_start_config(dep);
 		if (ret)
 			return ret;
@@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
 
 	spin_lock_irqsave(&dwc->lock, flags);
 	dwc->gadget_driver	= NULL;
+	dwc->max_cfg_eps = 0;
 	spin_unlock_irqrestore(&dwc->lock, flags);
 
 	free_irq(dwc->irq_gadget, dwc->ev_buf);
@@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA)
 	return 0;
 }
 
+static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map)
+{
+	struct dwc3 *dwc = gadget_to_dwc(g);
+	unsigned long in_ep_map;
+	int fifo_size = 0;
+	int ram1_depth;
+	int ep_num;
+
+	if (!dwc->needs_fifo_resize)
+		return 0;
+
+	/* Only interested in the IN endpoints */
+	in_ep_map = ep_map >> 16;
+	ep_num = hweight_long(in_ep_map);
+
+	if (ep_num <= dwc->max_cfg_eps)
+		return 0;
+
+	/* Update the max number of eps in the composition */
+	dwc->max_cfg_eps = ep_num;
+
+	fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps);
+	/* Based on the equation, increment by one for every ep */
+	fifo_size += dwc->max_cfg_eps;
+
+	/* Check if we can fit a single fifo per endpoint */
+	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
+	if (fifo_size > ram1_depth)
+		return -ENOMEM;
+
+	return 0;
+}
+
 static const struct usb_gadget_ops dwc3_gadget_ops = {
 	.get_frame		= dwc3_gadget_get_frame,
 	.wakeup			= dwc3_gadget_wakeup,
@@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = {
 	.udc_set_speed		= dwc3_gadget_set_speed,
 	.get_config_params	= dwc3_gadget_config_params,
 	.vbus_draw		= dwc3_gadget_vbus_draw,
+	.check_config		= dwc3_gadget_check_config,
 };
 
 /* -------------------------------------------------------------------------- */

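To make the sizing arithmetic in this patch concrete, the userspace sketch below mirrors dwc3_gadget_calc_tx_fifo_size(), the bind-time check in dwc3_gadget_check_config(), and the depth an endpoint receives from dwc3_gadget_resize_tx_fifos() when enough RAM remains. The 64-bit master bus width and the RAM1 depth of 1024 used in the usage note are assumed values chosen for illustration, not figures from the SM8150. The function names are likewise illustrative.

```c
#include <assert.h>

#define MAX_PACKET	1024	/* max packet size in bytes, as hard-coded in the patch */

/* Mirrors dwc3_gadget_calc_tx_fifo_size(); mdwidth_bits stands in for DWC3_MDWIDTH(). */
static int calc_tx_fifo_size(int mdwidth_bits, int mult)
{
	int mdwidth = mdwidth_bits >> 3;	/* bits to bytes */

	assert(mdwidth > 0);
	return mult * ((MAX_PACKET + mdwidth) / mdwidth) + 1;
}

/* Mirrors the bind-time test: can RAM1 hold one FIFO per IN endpoint? */
static int fits_in_ram1(int mdwidth_bits, int num_in_eps, int ram1_depth)
{
	int fifo_size = calc_tx_fifo_size(mdwidth_bits, num_in_eps);

	fifo_size += num_in_eps;	/* one extra entry per endpoint */
	return fifo_size <= ram1_depth;
}

/* Depth programmed for an endpoint when RAM is not exhausted: mult * fifo + 1. */
static int full_ep_depth(int mdwidth_bits, int mult)
{
	int fifo = calc_tx_fifo_size(mdwidth_bits, 1);

	return (mult - 1) * fifo + fifo + 1;
}
```

With a 64-bit bus, one packet buffer takes (1024 + 8) / 8 = 129 entries, so calc_tx_fifo_size(64, 1) is 130 and full_ep_depth(64, 3) is 3 * 130 + 1 = 391, which agrees with the equation in the comment above dwc3_gadget_resize_tx_fifos(). Under the assumed RAM1 depth of 1024, four IN endpoints pass the bind-time check (4 * 129 + 1 + 4 = 521) while eight do not (8 * 129 + 1 + 8 = 1041).
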
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
From: Bjorn Andersson @ 2021-01-22 17:12 UTC
To: Wesley Cheng
Cc: balbi, gregkh, robh+dt, agross, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On Thu 21 Jan 22:01 CST 2021, Wesley Cheng wrote:

> Some devices have USB compositions which may require multiple endpoints
> that support EP bursting. HW defined TX FIFO sizes may not always be
> sufficient for these compositions. By utilizing flexible TX FIFO
> allocation, this allows for endpoints to request the required FIFO depth to
> achieve higher bandwidth. With some higher bMaxBurst configurations, using
> a larger TX FIFO size results in better TX throughput.
>
> By introducing the check_config() callback, the resizing logic can fetch
> the maximum number of endpoints used in the USB composition (can contain
> multiple configurations), which helps ensure that the resizing logic can
> fulfill the configuration(s), or return an error to the gadget layer
> otherwise during bind time.
>
> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
> ---
>  drivers/usb/dwc3/core.c   |   2 +
>  drivers/usb/dwc3/core.h   |   8 ++
>  drivers/usb/dwc3/ep0.c    |   2 +
>  drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 206 insertions(+)
>
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index 6969196..e7fa6af 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
>  				&tx_thr_num_pkt_prd);
>  	device_property_read_u8(dev, "snps,tx-max-burst-prd",
>  				&tx_max_burst_prd);
> +	dwc->needs_fifo_resize = device_property_read_bool(dev,
> +							   "tx-fifo-resize");

Under what circumstances should we specify this? And in particular are
there scenarios (in the Qualcomm platforms) where this must not be set?

In particular, the composition can be changed in runtime, so should we
set this for all Qualcomm platforms?

And if that's the case, can we not just set it from the qcom driver?

Regards,
Bjorn

>
>  	dwc->disable_scramble_quirk = device_property_read_bool(dev,
>  				"snps,disable_scramble_quirk");
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index eec1cf4..983b2fd4 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -1223,6 +1223,7 @@ struct dwc3 {
>  	unsigned		is_utmi_l1_suspend:1;
>  	unsigned		is_fpga:1;
>  	unsigned		pending_events:1;
> +	unsigned		needs_fifo_resize:1;
>  	unsigned		pullups_connected:1;
>  	unsigned		setup_packet_pending:1;
>  	unsigned		three_stage_setup:1;
> @@ -1257,6 +1258,10 @@ struct dwc3 {
>  	unsigned		dis_split_quirk:1;
>
>  	u16			imod_interval;
> +
> +	int			max_cfg_eps;
> +	int			last_fifo_depth;
> +	int			num_ep_resized;
>  };
>
>  #define INCRX_BURST_MODE 0
> @@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>  		struct dwc3_gadget_ep_cmd_params *params);
>  int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd,
>  		u32 param);
> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc);
>  #else
>  static inline int dwc3_gadget_init(struct dwc3 *dwc)
>  { return 0; }
> @@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>  static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc,
>  		int cmd, u32 param)
>  { return 0; }
> +static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
> +{ }
>  #endif
>
>  #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE)
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index 8b668ef..4f216bd 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
>  		return -EINVAL;
>
>  	case USB_STATE_ADDRESS:
> +		dwc3_gadget_clear_tx_fifos(dwc);
> +
>  		ret = dwc3_ep0_delegate_req(dwc, ctrl);
>  		/* if the cfg matches and the cfg is non zero */
>  		if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) {
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 86f257f..26f9d64 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
>  static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>  		bool interrupt);
>
> +static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult)
> +{
> +	int max_packet = 1024;
> +	int fifo_size;
> +	int mdwidth;
> +
> +	mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0);
> +	/* MDWIDTH is represented in bits, we need it in bytes */
> +	mdwidth >>= 3;
> +
> +	fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1;
> +	return fifo_size;
> +}
> +
> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
> +{
> +	struct dwc3_ep *dep;
> +	int fifo_depth;
> +	int size;
> +	int num;
> +
> +	if (!dwc->needs_fifo_resize)
> +		return;
> +
> +	/* Read ep0IN related TXFIFO size */
> +	dep = dwc->eps[1];
> +	size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0));
> +	if (DWC3_IP_IS(DWC31))
> +		fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size);
> +	else
> +		fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size);
> +
> +	dwc->last_fifo_depth = fifo_depth;
> +	/* Clear existing TXFIFO for all IN eps except ep0 */
> +	for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM);
> +	     num += 2) {
> +		dep = dwc->eps[num];
> +		/* Don't change TXFRAMNUM on usb31 version */
> +		size = DWC3_IP_IS(DWC31) ?
> +			dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) &
> +				   DWC31_GTXFIFOSIZ_TXFRAMNUM : 0;
> +
> +		dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size);
> +	}
> +	dwc->num_ep_resized = 0;
> +}
> +
> +/*
> + * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case
> + * @dwc: pointer to our context structure
> + *
> + * This function will a best effort FIFO allocation in order
> + * to improve FIFO usage and throughput, while still allowing
> + * us to enable as many endpoints as possible.
> + *
> + * Keep in mind that this operation will be highly dependent
> + * on the configured size for RAM1 - which contains TxFifo -,
> + * the amount of endpoints enabled on coreConsultant tool, and
> + * the width of the Master Bus.
> + *
> + * In general, FIFO depths are represented with the following equation:
> + *
> + * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1
> + *
> + * Conversions can be done to the equation to derive the number of packets that
> + * will fit to a particular FIFO size value.
> + */
> +static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep)
> +{
> +	struct dwc3 *dwc = dep->dwc;
> +	int fifo_0_start;
> +	int ram1_depth;
> +	int fifo_size;
> +	int min_depth;
> +	int num_in_ep;
> +	int remaining;
> +	int mult = 1;
> +	int fifo;
> +	int tmp;
> +
> +	if (!dwc->needs_fifo_resize)
> +		return 0;
> +
> +	/* resize IN endpoints except ep0 */
> +	if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1)
> +		return 0;
> +
> +	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
> +
> +	if ((dep->endpoint.maxburst > 1 &&
> +	     usb_endpoint_xfer_bulk(dep->endpoint.desc)) ||
> +	    usb_endpoint_xfer_isoc(dep->endpoint.desc))
> +		mult = 3;
> +
> +	if (dep->endpoint.maxburst > 6 &&
> +	    usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31))
> +		mult = 6;
> +
> +	/* FIFO size for a single buffer */
> +	fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1);
> +
> +	/* Calculate the number of remaining EPs w/o any FIFO */
> +	num_in_ep = dwc->max_cfg_eps;
> +	num_in_ep -= dwc->num_ep_resized;
> +
> +	/* Reserve at least one FIFO for the number of IN EPs */
> +	min_depth = num_in_ep * (fifo + 1);
> +	remaining = ram1_depth - min_depth - dwc->last_fifo_depth;
> +
> +	/*
> +	 * We've already reserved 1 FIFO per EP, so check what we can fit in
> +	 * addition to it. If there is not enough remaining space, allocate
> +	 * all the remaining space to the EP.
> + */ > + fifo_size = (mult - 1) * fifo; > + if (remaining < fifo_size) { > + if (remaining > 0) > + fifo_size = remaining; > + else > + fifo_size = 0; > + } > + > + fifo_size += fifo; > + /* Last increment according to the TX FIFO size equation */ > + fifo_size++; > + > + /* Check if TXFIFOs start at non-zero addr */ > + tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); > + fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp); > + > + fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16)); > + if (DWC3_IP_IS(DWC31)) > + dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); > + else > + dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); > + > + /* Check fifo size allocation doesn't exceed available RAM size. */ > + if (dwc->last_fifo_depth >= ram1_depth) { > + dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n", > + dwc->last_fifo_depth, ram1_depth, > + dep->endpoint.name, fifo_size); > + if (DWC3_IP_IS(DWC31)) > + fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); > + else > + fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); > + dwc->last_fifo_depth -= fifo_size; > + return -ENOMEM; > + } > + > + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size); > + dwc->num_ep_resized++; > + > + return 0; > +} > + > /** > * __dwc3_gadget_ep_enable - initializes a hw endpoint > * @dep: endpoint to be initialized > @@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action) > int ret; > > if (!(dep->flags & DWC3_EP_ENABLED)) { > + ret = dwc3_gadget_resize_tx_fifos(dep); > + if (ret) > + return ret; > + > ret = dwc3_gadget_start_config(dep); > if (ret) > return ret; > @@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g) > > spin_lock_irqsave(&dwc->lock, flags); > dwc->gadget_driver = NULL; > + dwc->max_cfg_eps = 0; > spin_unlock_irqrestore(&dwc->lock, flags); > > free_irq(dwc->irq_gadget, dwc->ev_buf); > @@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned 
int mA) > return 0; > } > > +static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map) > +{ > + struct dwc3 *dwc = gadget_to_dwc(g); > + unsigned long in_ep_map; > + int fifo_size = 0; > + int ram1_depth; > + int ep_num; > + > + if (!dwc->needs_fifo_resize) > + return 0; > + > + /* Only interested in the IN endpoints */ > + in_ep_map = ep_map >> 16; > + ep_num = hweight_long(in_ep_map); > + > + if (ep_num <= dwc->max_cfg_eps) > + return 0; > + > + /* Update the max number of eps in the composition */ > + dwc->max_cfg_eps = ep_num; > + > + fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps); > + /* Based on the equation, increment by one for every ep */ > + fifo_size += dwc->max_cfg_eps; > + > + /* Check if we can fit a single fifo per endpoint */ > + ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); > + if (fifo_size > ram1_depth) > + return -ENOMEM; > + > + return 0; > +} > + > static const struct usb_gadget_ops dwc3_gadget_ops = { > .get_frame = dwc3_gadget_get_frame, > .wakeup = dwc3_gadget_wakeup, > @@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = { > .udc_set_speed = dwc3_gadget_set_speed, > .get_config_params = dwc3_gadget_config_params, > .vbus_draw = dwc3_gadget_vbus_draw, > + .check_config = dwc3_gadget_check_config, > }; > > /* -------------------------------------------------------------------------- */ > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project > ^ permalink raw reply [flat|nested] 20+ messages in thread
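The bind-time check done by dwc3_gadget_check_config() in the patch above can be modelled in user space. What follows is only a sketch under stated assumptions, not the driver code: mdwidth is passed in as bytes instead of being read from hwparams0, the max_cfg_eps caching is left out, the helper names (calc_tx_fifo_size, count_in_eps, check_config_fits) are hypothetical, and it takes the patch's own comment at face value that the IN endpoints sit in the upper 16 bits of ep_map.

```c
#include <assert.h>

/*
 * Mirrors dwc3_gadget_calc_tx_fifo_size() from the patch.
 * Assumption: mdwidth_bytes stands in for DWC3_MDWIDTH(hwparams0) >> 3
 * and is supplied by the caller.
 */
static int calc_tx_fifo_size(int mdwidth_bytes, int mult)
{
	int max_packet = 1024;

	assert(mdwidth_bytes > 0);	/* guard the division below */
	return mult * ((max_packet + mdwidth_bytes) / mdwidth_bytes) + 1;
}

/* User-space stand-in for hweight_long(ep_map >> 16). */
static int count_in_eps(unsigned long ep_map)
{
	unsigned long in_ep_map = ep_map >> 16;
	int n = 0;

	while (in_ep_map) {
		in_ep_map &= in_ep_map - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Returns 1 if one FIFO per IN endpoint fits in ram1_depth, 0 otherwise. */
static int check_config_fits(unsigned long ep_map, int mdwidth_bytes,
			     int ram1_depth)
{
	int ep_num = count_in_eps(ep_map);
	/* As in the patch: one extra entry per endpoint on top of the formula. */
	int fifo_size = calc_tx_fifo_size(mdwidth_bytes, ep_num) + ep_num;

	return fifo_size <= ram1_depth;
}
```

With an 8-byte (64-bit) bus, a single 1024-byte packet works out to 1 * ((1024 + 8) / 8) + 1 = 130 entries. For an ep_map of 0x000E0006 (three IN endpoints), the check needs 3 * 129 + 1 + 3 = 391 entries of RAM1 to pass.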
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-22 17:12 ` Bjorn Andersson
@ 2021-01-26 1:14 ` Wesley Cheng
  2021-01-26 1:55 ` Bjorn Andersson
  0 siblings, 1 reply; 20+ messages in thread
From: Wesley Cheng @ 2021-01-26 1:14 UTC
To: Bjorn Andersson
Cc: balbi, gregkh, robh+dt, agross, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On 1/22/2021 9:12 AM, Bjorn Andersson wrote:
> On Thu 21 Jan 22:01 CST 2021, Wesley Cheng wrote:
>

Hi Bjorn,

>
> Under what circumstances should we specify this? And in particular are
> there scenarios (in the Qualcomm platforms) where this must not be set?
>

The TXFIFO dynamic allocation is actually a feature within the DWC3
controller, and isn't specifically for QCOM based platforms.  It won't
do any harm functionally if this flag is not set, as this is meant for
enhancing performance/bandwidth.

> In particular, the composition can be changed in runtime, so should we
> set this for all Qualcomm platforms?
>

Ideally yes, if we want to increase bandwidth for situations where SS
endpoint bursting is set to a higher value.

> And if that's the case, can we not just set it from the qcom driver?
>

Since this is a common DWC3 core feature, I think it would make more
sense to have it in DWC3 core instead of a vendor's DWC3 glue driver.

Thanks
Wesley Cheng

> Regards,
> Bjorn

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-26 1:14 ` Wesley Cheng
@ 2021-01-26 1:55 ` Bjorn Andersson
  2021-01-26 4:32 ` Wesley Cheng
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Andersson @ 2021-01-26 1:55 UTC
To: Wesley Cheng
Cc: balbi, gregkh, robh+dt, agross, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On Mon 25 Jan 19:14 CST 2021, Wesley Cheng wrote:

>
>
> On 1/22/2021 9:12 AM, Bjorn Andersson wrote:
> > On Thu 21 Jan 22:01 CST 2021, Wesley Cheng wrote:
> >
>
> Hi Bjorn,
> >
> > Under what circumstances should we specify this? And in particular are
> > there scenarios (in the Qualcomm platforms) where this must not be set?
> >
> The TXFIFO dynamic allocation is actually a feature within the DWC3
> controller, and isn't specifically for QCOM based platforms.  It won't
> do any harm functionally if this flag is not set, as this is meant for
> enhancing performance/bandwidth.
>
> > In particular, the composition can be changed in runtime, so should we
> > set this for all Qualcomm platforms?
> >
> Ideally yes, if we want to increase bandwidth for situations where SS
> endpoint bursting is set to a higher value.
>
> > And if that's the case, can we not just set it from the qcom driver?
> >
> Since this is a common DWC3 core feature, I think it would make more
> sense to have it in DWC3 core instead of a vendor's DWC3 glue driver.
>

I don't have any objections to implementing it in the core driver, but
my question is can we just skip the DT binding and just enable it from
the vendor driver?

Regards,
Bjorn
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-26 1:55 ` Bjorn Andersson
@ 2021-01-26 4:32 ` Wesley Cheng
  2021-01-26 5:15 ` Bjorn Andersson
  0 siblings, 1 reply; 20+ messages in thread
From: Wesley Cheng @ 2021-01-26 4:32 UTC
To: Bjorn Andersson
Cc: balbi, gregkh, robh+dt, agross, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On 1/25/2021 5:55 PM, Bjorn Andersson wrote:
> On Mon 25 Jan 19:14 CST 2021, Wesley Cheng wrote:
>
>>
>>
>> On 1/22/2021 9:12 AM, Bjorn Andersson wrote:
>>> On Thu 21 Jan 22:01 CST 2021, Wesley Cheng wrote:
>>>
>>
>> Hi Bjorn,
>>>
>>> Under what circumstances should we specify this? And in particular are
>>> there scenarios (in the Qualcomm platforms) where this must not be set?
>>>
>> The TXFIFO dynamic allocation is actually a feature within the DWC3
>> controller, and isn't specifically for QCOM based platforms.  It won't
>> do any harm functionally if this flag is not set, as this is meant for
>> enhancing performance/bandwidth.
>>
>>> In particular, the composition can be changed in runtime, so should we
>>> set this for all Qualcomm platforms?
>>>
>> Ideally yes, if we want to increase bandwidth for situations where SS
>> endpoint bursting is set to a higher value.
>>
>>> And if that's the case, can we not just set it from the qcom driver?
>>>
>> Since this is a common DWC3 core feature, I think it would make more
>> sense to have it in DWC3 core instead of a vendor's DWC3 glue driver.
>>
>
> I don't have any objections to implementing it in the core driver, but
> my question is can we just skip the DT binding and just enable it from
> the vendor driver?
>
> Regards,
> Bjorn
>

Hi Bjorn,

I see.  I think there are some designs which don't have a DWC3 glue
driver, so assuming there may be other platforms using this, there may
not always be a vendor driver to set this.

Thanks
Wesley Cheng

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-26 4:32 ` Wesley Cheng
@ 2021-01-26 5:15 ` Bjorn Andersson
  2021-01-28 23:08 ` Wesley Cheng
  0 siblings, 1 reply; 20+ messages in thread
From: Bjorn Andersson @ 2021-01-26 5:15 UTC
To: Wesley Cheng
Cc: balbi, gregkh, robh+dt, agross, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On Mon 25 Jan 22:32 CST 2021, Wesley Cheng wrote:
> On 1/25/2021 5:55 PM, Bjorn Andersson wrote:
> > On Mon 25 Jan 19:14 CST 2021, Wesley Cheng wrote:
> >
> >>
> >>
> >> On 1/22/2021 9:12 AM, Bjorn Andersson wrote:
> >>> On Thu 21 Jan 22:01 CST 2021, Wesley Cheng wrote:
> >>>
> >>
> >> Hi Bjorn,
> >>>
> >>> Under what circumstances should we specify this? And in particular are
> >>> there scenarios (in the Qualcomm platforms) where this must not be set?
> >>>
> >> The TXFIFO dynamic allocation is actually a feature within the DWC3
> >> controller, and isn't specifically for QCOM based platforms.  It won't
> >> do any harm functionally if this flag is not set, as this is meant for
> >> enhancing performance/bandwidth.
> >>
> >>> In particular, the composition can be changed in runtime, so should we
> >>> set this for all Qualcomm platforms?
> >>>
> >> Ideally yes, if we want to increase bandwidth for situations where SS
> >> endpoint bursting is set to a higher value.
> >>
> >>> And if that's the case, can we not just set it from the qcom driver?
> >>>
> >> Since this is a common DWC3 core feature, I think it would make more
> >> sense to have it in DWC3 core instead of a vendor's DWC3 glue driver.
> >>
> >
> > I don't have any objections to implementing it in the core driver, but
> > my question is can we just skip the DT binding and just enable it from
> > the vendor driver?
> >
> > Regards,
> > Bjorn
> >
>
> Hi Bjorn,
>
> I see.  I think there are some designs which don't have a DWC3 glue
> driver, so assuming there may be other platforms using this, there may
> not always be a vendor driver to set this.
>

You mean that there are implementations of dwc3 without an associated
glue driver that haven't yet realized that they need this feature?

I would suggest then that we implement the core code necessary, we
enable it from the Qualcomm glue layer and when someone realizes that
they need this without a glue driver it's going to be trivial to add the
DT binding.

The alternative is that we're lugging around a requirement to specify
this property in all past, present and future Qualcomm dts files - and
then we'll need to hard code it for ACPI anyways.

Regards,
Bjorn
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-26 5:15 ` Bjorn Andersson
@ 2021-01-28 23:08 ` Wesley Cheng
  0 siblings, 0 replies; 20+ messages in thread
From: Wesley Cheng @ 2021-01-28 23:08 UTC
To: Bjorn Andersson
Cc: balbi, gregkh, robh+dt, agross, linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On 1/25/2021 9:15 PM, Bjorn Andersson wrote:
> On Mon 25 Jan 22:32 CST 2021, Wesley Cheng wrote:
>> On 1/25/2021 5:55 PM, Bjorn Andersson wrote:
>>> On Mon 25 Jan 19:14 CST 2021, Wesley Cheng wrote:
>>>
>>>>
>>>>
>>>> On 1/22/2021 9:12 AM, Bjorn Andersson wrote:
>>>>> On Thu 21 Jan 22:01 CST 2021, Wesley Cheng wrote:
>>>>>
>>>>
>>>> Hi Bjorn,
>>>>>
>>>>> Under what circumstances should we specify this? And in particular are
>>>>> there scenarios (in the Qualcomm platforms) where this must not be set?
>>>>>
>>>> The TXFIFO dynamic allocation is actually a feature within the DWC3
>>>> controller, and isn't specifically for QCOM based platforms.  It won't
>>>> do any harm functionally if this flag is not set, as this is meant for
>>>> enhancing performance/bandwidth.
>>>>
>>>>> In particular, the composition can be changed in runtime, so should we
>>>>> set this for all Qualcomm platforms?
>>>>>
>>>> Ideally yes, if we want to increase bandwidth for situations where SS
>>>> endpoint bursting is set to a higher value.
>>>>
>>>>> And if that's the case, can we not just set it from the qcom driver?
>>>>>
>>>> Since this is a common DWC3 core feature, I think it would make more
>>>> sense to have it in DWC3 core instead of a vendor's DWC3 glue driver.
>>>>
>>>
>>> I don't have any objections to implementing it in the core driver, but
>>> my question is can we just skip the DT binding and just enable it from
>>> the vendor driver?
>>>
>>> Regards,
>>> Bjorn
>>>
>>
>> Hi Bjorn,
>>
>> I see.  I think there are some designs which don't have a DWC3 glue
>> driver, so assuming there may be other platforms using this, there may
>> not always be a vendor driver to set this.
>>
>
> You mean that there are implementations of dwc3 without an associated
> glue driver that haven't yet realized that they need this feature?
>
> I would suggest then that we implement the core code necessary, we
> enable it from the Qualcomm glue layer and when someone realizes that
> they need this without a glue driver it's going to be trivial to add the
> DT binding.
>
> The alternative is that we're lugging around a requirement to specify
> this property in all past, present and future Qualcomm dts files - and
> then we'll need to hard code it for ACPI anyways.
>

Hi Bjorn,

Can we utilize the of_add_property() call to add the "tx-fifo-resize"
property from the dwc3_qcom_register_core() API?  That way at least the
above concern would be addressed.

I'm not too familiar with the ACPI design, but I do see that the
dwc3-qcom does have an array carrying some DWC3 core properties.  Looks
like we can add the tx-fifo-resize property here too.

static const struct property_entry dwc3_qcom_acpi_properties[] = {
	PROPERTY_ENTRY_STRING("dr_mode", "host"),
	{}
};

Thanks
Wesley Cheng

> Regards,
> Bjorn
>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-22 4:01 ` [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements Wesley Cheng
  2021-01-22 17:12 ` Bjorn Andersson
@ 2021-01-23 0:15 ` Thinh Nguyen
  2021-01-26 9:51 ` Wesley Cheng
  1 sibling, 1 reply; 20+ messages in thread
From: Thinh Nguyen @ 2021-01-23 0:15 UTC
To: Wesley Cheng, balbi, gregkh, robh+dt, agross, bjorn.andersson
Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

Hi,

Wesley Cheng wrote:
> Some devices have USB compositions which may require multiple endpoints
> that support EP bursting.  HW defined TX FIFO sizes may not always be
> sufficient for these compositions.  By utilizing flexible TX FIFO
> allocation, this allows for endpoints to request the required FIFO depth to
> achieve higher bandwidth.  With some higher bMaxBurst configurations, using
> a larger TX FIFO size results in better TX throughput.
>
> By introducing the check_config() callback, the resizing logic can fetch
> the maximum number of endpoints used in the USB composition (can contain
> multiple configurations), which helps ensure that the resizing logic can
> fulfill the configuration(s), or return an error to the gadget layer
> otherwise during bind time.
>
> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
> ---
>  drivers/usb/dwc3/core.c   |   2 +
>  drivers/usb/dwc3/core.h   |   8 ++
>  drivers/usb/dwc3/ep0.c    |   2 +
>  drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 206 insertions(+)
>
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index 6969196..e7fa6af 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
>  				    &tx_thr_num_pkt_prd);
>  	device_property_read_u8(dev, "snps,tx-max-burst-prd",
>  				&tx_max_burst_prd);
> +	dwc->needs_fifo_resize = device_property_read_bool(dev,
> +							   "tx-fifo-resize");
>
>  	dwc->disable_scramble_quirk = device_property_read_bool(dev,
>  				"snps,disable_scramble_quirk");
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index eec1cf4..983b2fd4 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -1223,6 +1223,7 @@ struct dwc3 {
>  	unsigned		is_utmi_l1_suspend:1;
>  	unsigned		is_fpga:1;
>  	unsigned		pending_events:1;
> +	unsigned		needs_fifo_resize:1;

The prefix "need" sounds like a requirement, but I don't think it is the
case here. I think "do" would be a better prefix here.

>  	unsigned		pullups_connected:1;
>  	unsigned		setup_packet_pending:1;
>  	unsigned		three_stage_setup:1;
> @@ -1257,6 +1258,10 @@ struct dwc3 {
>  	unsigned		dis_split_quirk:1;
>
>  	u16			imod_interval;
> +
> +	int			max_cfg_eps;
> +	int			last_fifo_depth;
> +	int			num_ep_resized;
>  };

Please document these new fields.

>
>  #define INCRX_BURST_MODE 0
> @@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>  		struct dwc3_gadget_ep_cmd_params *params);
>  int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd,
>  		u32 param);
> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc);
>  #else
>  static inline int dwc3_gadget_init(struct dwc3 *dwc)
>  { return 0; }
> @@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>  static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc,
>  		int cmd, u32 param)
>  { return 0; }
> +static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
> +{ }
>  #endif
>
>  #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE)
> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
> index 8b668ef..4f216bd 100644
> --- a/drivers/usb/dwc3/ep0.c
> +++ b/drivers/usb/dwc3/ep0.c
> @@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
>  		return -EINVAL;
>
>  	case USB_STATE_ADDRESS:
> +		dwc3_gadget_clear_tx_fifos(dwc);
> +
>  		ret = dwc3_ep0_delegate_req(dwc, ctrl);
>  		/* if the cfg matches and the cfg is non zero */
>  		if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) {
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 86f257f..26f9d64 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
>  static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>  		bool interrupt);
>
> +static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult)

Can you document what this function does?

> +{
> +	int max_packet = 1024;

Maybe you can also document why you chose 1024 (e.g. applicable to
Enhanced SuperSpeed only?).

> +	int fifo_size;
> +	int mdwidth;
> +
> +	mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0);
> +	/* MDWIDTH is represented in bits, we need it in bytes */
> +	mdwidth >>= 3;

mdwidth for DWC32 requires reading hwparams6 for the upper 2
significant bits. Can we add a check for DWC32 also? You can check how
we're doing it now in the current code.

> +
> +	fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1;
> +	return fifo_size;
> +}
> +
> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
> +{
> +	struct dwc3_ep *dep;
> +	int fifo_depth;
> +	int size;
> +	int num;
> +
> +	if (!dwc->needs_fifo_resize)
> +		return;
> +
> +	/* Read ep0IN related TXFIFO size */
> +	dep = dwc->eps[1];
> +	size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0));
> +	if (DWC3_IP_IS(DWC31))
> +		fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size);
> +	else
> +		fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size);

The driver handles 3 IPs. Getting the fifo depth for DWC32 is the same
as DWC31. So the condition should be

	if (DWC3_IP_IS(DWC3))
		fifo_depth = ...
	else
		fifo_depth = ...

> +
> +	dwc->last_fifo_depth = fifo_depth;
> +	/* Clear existing TXFIFO for all IN eps except ep0 */
> +	for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM);
> +	     num += 2) {
> +		dep = dwc->eps[num];
> +		/* Don't change TXFRAMNUM on usb31 version */
> +		size = DWC3_IP_IS(DWC31) ?
> +			dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) &
> +				   DWC31_GTXFIFOSIZ_TXFRAMNUM : 0;
> +

Same here. Check for DWC32.

> +		dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size);
> +	}
> +	dwc->num_ep_resized = 0;
> +}
> +
> +/*
> + * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case
> + * @dwc: pointer to our context structure
> + *
> + * This function will a best effort FIFO allocation in order
> + * to improve FIFO usage and throughput, while still allowing
> + * us to enable as many endpoints as possible.
> + *
> + * Keep in mind that this operation will be highly dependent
> + * on the configured size for RAM1 - which contains TxFifo -,
> + * the amount of endpoints enabled on coreConsultant tool, and
> + * the width of the Master Bus.
> + *
> + * In general, FIFO depths are represented with the following equation:
> + *
> + * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1
> + *
> + * Conversions can be done to the equation to derive the number of packets that
> + * will fit to a particular FIFO size value.
> + */
> +static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep)
> +{
> +	struct dwc3 *dwc = dep->dwc;
> +	int fifo_0_start;
> +	int ram1_depth;
> +	int fifo_size;
> +	int min_depth;
> +	int num_in_ep;
> +	int remaining;
> +	int mult = 1;
> +	int fifo;
> +	int tmp;
> +
> +	if (!dwc->needs_fifo_resize)
> +		return 0;

Maybe add a condition to check for Enhanced SuperSpeed only?

> +
> +	/* resize IN endpoints except ep0 */
> +	if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1)
> +		return 0;
> +
> +	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
> +
> +	if ((dep->endpoint.maxburst > 1 &&
> +	     usb_endpoint_xfer_bulk(dep->endpoint.desc)) ||
> +	    usb_endpoint_xfer_isoc(dep->endpoint.desc))
> +		mult = 3;
> +
> +	if (dep->endpoint.maxburst > 6 &&
> +	    usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31))
> +		mult = 6;

You checked maxburst > 1 for isoc, but not when maxburst > 6. Why?
Also, "mult" is the term we usually use for isoc endpoints. Applying it
to bulk is confusing here.

How did we decide on 3 and 6? Are they arbitrary?

> +
> +	/* FIFO size for a single buffer */
> +	fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1);
> +
> +	/* Calculate the number of remaining EPs w/o any FIFO */
> +	num_in_ep = dwc->max_cfg_eps;
> +	num_in_ep -= dwc->num_ep_resized;
> +
> +	/* Reserve at least one FIFO for the number of IN EPs */
> +	min_depth = num_in_ep * (fifo + 1);
> +	remaining = ram1_depth - min_depth - dwc->last_fifo_depth;

Can "remaining" be a negative value? If so, I think it's clearer if you do

	remaining = max_t(int, 0, remaining);

> +
> +	/*
> +	 * We've already reserved 1 FIFO per EP, so check what we can fit in
> +	 * addition to it.  If there is not enough remaining space, allocate
> +	 * all the remaining space to the EP.
> +	 */
> +	fifo_size = (mult - 1) * fifo;
> +	if (remaining < fifo_size) {
> +		if (remaining > 0)
> +			fifo_size = remaining;
> +		else
> +			fifo_size = 0;

Then use this condition instead:

	if (remaining < fifo_size)
		fifo_size = remaining;

> +	}
> +
> +	fifo_size += fifo;
> +	/* Last increment according to the TX FIFO size equation */
> +	fifo_size++;
> +
> +	/* Check if TXFIFOs start at non-zero addr */
> +	tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0));
> +	fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp);
> +
> +	fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16));
> +	if (DWC3_IP_IS(DWC31))
> +		dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
> +	else
> +		dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size);

Take account of DWC32.

> +
> +	/* Check fifo size allocation doesn't exceed available RAM size. */
> +	if (dwc->last_fifo_depth >= ram1_depth) {
> +		dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n",
> +			dwc->last_fifo_depth, ram1_depth,
> +			dep->endpoint.name, fifo_size);
> +		if (DWC3_IP_IS(DWC31))
> +			fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
> +		else
> +			fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size);

Same here.

> +		dwc->last_fifo_depth -= fifo_size;
> +		return -ENOMEM;
> +	}
> +
> +	dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size);
> +	dwc->num_ep_resized++;
> +
> +	return 0;
> +}
> +
>  /**
>   * __dwc3_gadget_ep_enable - initializes a hw endpoint
>   * @dep: endpoint to be initialized
> @@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
>  	int			ret;
>
>  	if (!(dep->flags & DWC3_EP_ENABLED)) {
> +		ret = dwc3_gadget_resize_tx_fifos(dep);
> +		if (ret)
> +			return ret;
> +
>  		ret = dwc3_gadget_start_config(dep);
>  		if (ret)
>  			return ret;
> @@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
>
>  	spin_lock_irqsave(&dwc->lock, flags);
>  	dwc->gadget_driver	= NULL;
> +	dwc->max_cfg_eps = 0;
>  	spin_unlock_irqrestore(&dwc->lock, flags);
>
>  	free_irq(dwc->irq_gadget, dwc->ev_buf);
> @@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA)
>  	return 0;
>  }
>
> +static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map)

What's in ep_map? Can you document more to help with the review?

Thanks,
Thinh

> +{
> +	struct dwc3 *dwc = gadget_to_dwc(g);
> +	unsigned long in_ep_map;
> +	int fifo_size = 0;
> +	int ram1_depth;
> +	int ep_num;
> +
> +	if (!dwc->needs_fifo_resize)
> +		return 0;
> +
> +	/* Only interested in the IN endpoints */
> +	in_ep_map = ep_map >> 16;
> +	ep_num = hweight_long(in_ep_map);
> +
> +	if (ep_num <= dwc->max_cfg_eps)
> +		return 0;
> +
> +	/* Update the max number of eps in the composition */
> +	dwc->max_cfg_eps = ep_num;
> +
> +	fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps);
> +	/* Based on the equation, increment by one for every ep */
> +	fifo_size += dwc->max_cfg_eps;
> +
> +	/* Check if we can fit a single fifo per endpoint */
> +	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
> +	if (fifo_size > ram1_depth)
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
>  static const struct usb_gadget_ops dwc3_gadget_ops = {
>  	.get_frame		= dwc3_gadget_get_frame,
>  	.wakeup			= dwc3_gadget_wakeup,
> @@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = {
>  	.udc_set_speed		= dwc3_gadget_set_speed,
>  	.get_config_params	= dwc3_gadget_config_params,
>  	.vbus_draw		= dwc3_gadget_vbus_draw,
> +	.check_config		= dwc3_gadget_check_config,
>  };
>
>  /* -------------------------------------------------------------------------- */
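The review comment about clamping "remaining" can be checked in isolation. The following is a minimal user-space sketch, assuming mult >= 1 and fifo > 0 as in the patch; both function names are hypothetical and not part of the driver, and the kernel's max_t() is replaced by a plain conditional so the snippet builds outside the kernel tree.

```c
#include <assert.h>

/* Extra depth on top of the reserved FIFO, as written in the patch. */
static int extra_depth_original(int mult, int fifo, int remaining)
{
	int fifo_size = (mult - 1) * fifo;

	if (remaining < fifo_size) {
		if (remaining > 0)
			fifo_size = remaining;
		else
			fifo_size = 0;
	}
	return fifo_size;
}

/* Same computation with the clamp suggested in the review. */
static int extra_depth_clamped(int mult, int fifo, int remaining)
{
	int fifo_size;

	assert(mult >= 1 && fifo > 0);	/* assumption stated in the lead-in */
	fifo_size = (mult - 1) * fifo;

	/* stands in for: remaining = max_t(int, 0, remaining); */
	if (remaining < 0)
		remaining = 0;

	if (remaining < fifo_size)
		fifo_size = remaining;
	return fifo_size;
}
```

Under those assumptions the two forms agree: a negative or zero "remaining" yields no extra depth, a small positive value is handed out in full, and a large value is capped at (mult - 1) * fifo.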
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
  2021-01-23 0:15 ` Thinh Nguyen
@ 2021-01-26 9:51 ` Wesley Cheng
  2021-01-26 20:43 ` Thinh Nguyen
  0 siblings, 1 reply; 20+ messages in thread
From: Wesley Cheng @ 2021-01-26 9:51 UTC
To: Thinh Nguyen, balbi, gregkh, robh+dt, agross, bjorn.andersson
Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp

On 1/22/2021 4:15 PM, Thinh Nguyen wrote:
> Hi,
>
> Wesley Cheng wrote:
>> Some devices have USB compositions which may require multiple endpoints
>> that support EP bursting.  HW defined TX FIFO sizes may not always be
>> sufficient for these compositions.  By utilizing flexible TX FIFO
>> allocation, this allows for endpoints to request the required FIFO depth to
>> achieve higher bandwidth.  With some higher bMaxBurst configurations, using
>> a larger TX FIFO size results in better TX throughput.
>>
>> By introducing the check_config() callback, the resizing logic can fetch
>> the maximum number of endpoints used in the USB composition (can contain
>> multiple configurations), which helps ensure that the resizing logic can
>> fulfill the configuration(s), or return an error to the gadget layer
>> otherwise during bind time.
>>
>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
>> ---
>>  drivers/usb/dwc3/core.c   |   2 +
>>  drivers/usb/dwc3/core.h   |   8 ++
>>  drivers/usb/dwc3/ep0.c    |   2 +
>>  drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 206 insertions(+)
>>
>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>> index 6969196..e7fa6af 100644
>> --- a/drivers/usb/dwc3/core.c
>> +++ b/drivers/usb/dwc3/core.c
>> @@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
>>  				    &tx_thr_num_pkt_prd);
>>  	device_property_read_u8(dev, "snps,tx-max-burst-prd",
>>  				&tx_max_burst_prd);
>> +	dwc->needs_fifo_resize = device_property_read_bool(dev,
>> +							   "tx-fifo-resize");
>>
>>  	dwc->disable_scramble_quirk = device_property_read_bool(dev,
>>  				"snps,disable_scramble_quirk");
>> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
>> index eec1cf4..983b2fd4 100644
>> --- a/drivers/usb/dwc3/core.h
>> +++ b/drivers/usb/dwc3/core.h
>> @@ -1223,6 +1223,7 @@ struct dwc3 {
>>  	unsigned		is_utmi_l1_suspend:1;
>>  	unsigned		is_fpga:1;
>>  	unsigned		pending_events:1;
>> +	unsigned		needs_fifo_resize:1;
>
> The prefix "need" sounds like a requirement, but I don't think it is the
> case here. I think "do" would be a better prefix here.
>

Hi Thinh,

Sure, that is true, since this may be an optional flag for certain
platforms.

>>  	unsigned		pullups_connected:1;
>>  	unsigned		setup_packet_pending:1;
>>  	unsigned		three_stage_setup:1;
>> @@ -1257,6 +1258,10 @@ struct dwc3 {
>>  	unsigned		dis_split_quirk:1;
>>
>>  	u16			imod_interval;
>> +
>> +	int			max_cfg_eps;
>> +	int			last_fifo_depth;
>> +	int			num_ep_resized;
>>  };
>
> Please document these new fields.
>

Will do.

>>
>>  #define INCRX_BURST_MODE 0
>> @@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>>  		struct dwc3_gadget_ep_cmd_params *params);
>>  int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd,
>>  		u32 param);
>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc);
>>  #else
>>  static inline int dwc3_gadget_init(struct dwc3 *dwc)
>>  { return 0; }
>> @@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd,
>>  static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc,
>>  		int cmd, u32 param)
>>  { return 0; }
>> +static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc)
>> +{ }
>>  #endif
>>
>>  #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE)
>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
>> index 8b668ef..4f216bd 100644
>> --- a/drivers/usb/dwc3/ep0.c
>> +++ b/drivers/usb/dwc3/ep0.c
>> @@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl)
>>  		return -EINVAL;
>>
>>  	case USB_STATE_ADDRESS:
>> +		dwc3_gadget_clear_tx_fifos(dwc);
>> +
>>  		ret = dwc3_ep0_delegate_req(dwc, ctrl);
>>  		/* if the cfg matches and the cfg is non zero */
>>  		if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) {
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 86f257f..26f9d64 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action)
>>  static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force,
>>  		bool interrupt);
>>
>> +static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult)
>
> Can you document what this function does?
>

Will do.

>> +{
>> +	int max_packet = 1024;
>
> Maybe you can also document why you chose 1024 (e.g. applicable to
> Enhanced SuperSpeed only?).
>

Sure.
Its basically applicable for SS and isoc (hs/ss) use cases since max packet size is 1024 in both cases. >> + int fifo_size; >> + int mdwidth; >> + >> + mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0); >> + /* MDWIDTH is represented in bits, we need it in bytes */ >> + mdwidth >>= 3; > > mdwidth for DWC32 requires to read hwparams6 for the upper 2 significant > bits. Can we add a check for DWC32 also? You can check how we're doing > it now in the current code. > Sure. I'll make sure to get the correct registers for the DWC32 case. >> + >> + fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1; >> + return fifo_size; >> +} >> + >> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >> +{ >> + struct dwc3_ep *dep; >> + int fifo_depth; >> + int size; >> + int num; >> + >> + if (!dwc->needs_fifo_resize) >> + return; >> + >> + /* Read ep0IN related TXFIFO size */ >> + dep = dwc->eps[1]; >> + size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >> + if (DWC3_IP_IS(DWC31)) >> + fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size); >> + else >> + fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size); > > The driver handles 3 IPs. Getting the fifo depth for DWC32 is the same > as DWC31. So the condition should be > if (DWC3_IP_IS(DWC3)) > fifo_depth = ... > else > fifo_depth = ... > Understood. >> + >> + dwc->last_fifo_depth = fifo_depth; >> + /* Clear existing TXFIFO for all IN eps except ep0 */ >> + for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM); >> + num += 2) { >> + dep = dwc->eps[num]; >> + /* Don't change TXFRAMNUM on usb31 version */ >> + size = DWC3_IP_IS(DWC31) ? >> + dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) & >> + DWC31_GTXFIFOSIZ_TXFRAMNUM : 0; >> + > > Same here. Check for DWC32. 
> >> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size); >> + } >> + dwc->num_ep_resized = 0; >> +} >> + >> +/* >> + * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case >> + * @dwc: pointer to our context structure >> + * >> + * This function will a best effort FIFO allocation in order >> + * to improve FIFO usage and throughput, while still allowing >> + * us to enable as many endpoints as possible. >> + * >> + * Keep in mind that this operation will be highly dependent >> + * on the configured size for RAM1 - which contains TxFifo -, >> + * the amount of endpoints enabled on coreConsultant tool, and >> + * the width of the Master Bus. >> + * >> + * In general, FIFO depths are represented with the following equation: >> + * >> + * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1 >> + * >> + * Conversions can be done to the equation to derive the number of packets that >> + * will fit to a particular FIFO size value. >> + */ >> +static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) >> +{ >> + struct dwc3 *dwc = dep->dwc; >> + int fifo_0_start; >> + int ram1_depth; >> + int fifo_size; >> + int min_depth; >> + int num_in_ep; >> + int remaining; >> + int mult = 1; >> + int fifo; >> + int tmp; >> + >> + if (!dwc->needs_fifo_resize) >> + return 0; > > Maybe add a condition to check for Enhanced SuperSpeed only? > Since this logic applies for isoc endpoints as well in high speed mode, for high bandwidth use cases, we can't limit it to SS only. 
>> + >> + /* resize IN endpoints except ep0 */ >> + if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1) >> + return 0; >> + >> + ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); >> + >> + if ((dep->endpoint.maxburst > 1 && >> + usb_endpoint_xfer_bulk(dep->endpoint.desc)) || >> + usb_endpoint_xfer_isoc(dep->endpoint.desc)) >> + mult = 3; >> + >> + if (dep->endpoint.maxburst > 6 && >> + usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31)) >> + mult = 6; > > You checked maxburst > 1 for isoc, but not when maxburst > 6. Why? > Also, "mult" is the term we usually use for isoc endpoints. Applying it > to bulk is confusing here. > Ok, let me rename it to something else that makes more sense. The isoc endpoint check was targeted for mainly high-speed high bandwidth isoc use cases, and I don't believe our results improved with a larger fifo allocation, ie 6. (refer to below) > How did we decide on 3 and 6? Are they arbitrary? > So actually in the databook, they have some recommendations for TXFIFO sizes to use in "Table 4-3 Device Config Parameters." It mentions that for burst capable endpoints to have a fifo size to fit at least 3 packets of maxpacket size. There's also "Chapter 3 Cache, FIFO RAMs, and Bandwidth Requirements," which goes over a lot of optimizations that could be done based off your system's overall latency. The sizes were chosen after we ran our peak throughput and performance testing on our devices, and these values netted the best throughput while also allowing enough fifo space for our other USB endpoints. Also, there's going to be a limit on how much improvement you see with respects to increasing the fifo size, since your system will eventually be able to pull data out of the internal fifo faster than it is being filled. 
>> + >> + /* FIFO size for a single buffer */ >> + fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1); >> + >> + /* Calculate the number of remaining EPs w/o any FIFO */ >> + num_in_ep = dwc->max_cfg_eps; >> + num_in_ep -= dwc->num_ep_resized; >> + >> + /* Reserve at least one FIFO for the number of IN EPs */ >> + min_depth = num_in_ep * (fifo + 1); >> + remaining = ram1_depth - min_depth - dwc->last_fifo_depth; > > Can "remaining" be a negative value? If so, I think it's clearer if you do > remaining = max_t(int, 0, remaining); > Sure. >> + >> + /* >> + * We've already reserved 1 FIFO per EP, so check what we can fit in >> + * addition to it. If there is not enough remaining space, allocate >> + * all the remaining space to the EP. >> + */ >> + fifo_size = (mult - 1) * fifo; >> + if (remaining < fifo_size) { >> + if (remaining > 0) >> + fifo_size = remaining; >> + else >> + fifo_size = 0; > > Then use this condition instead: > > if (remaining < fifo_size) > fifo_size = remaining; > >> + } >> + >> + fifo_size += fifo; >> + /* Last increment according to the TX FIFO size equation */ >> + fifo_size++; >> + >> + /* Check if TXFIFOs start at non-zero addr */ >> + tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >> + fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp); >> + >> + fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16)); >> + if (DWC3_IP_IS(DWC31)) >> + dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); >> + else >> + dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); > > Take account of DWC32. > Got it. >> + >> + /* Check fifo size allocation doesn't exceed available RAM size. */ >> + if (dwc->last_fifo_depth >= ram1_depth) { >> + dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n", >> + dwc->last_fifo_depth, ram1_depth, >> + dep->endpoint.name, fifo_size); >> + if (DWC3_IP_IS(DWC31)) >> + fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); >> + else >> + fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); > > Same here. 
> >> + dwc->last_fifo_depth -= fifo_size; >> + return -ENOMEM; >> + } >> + >> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size); >> + dwc->num_ep_resized++; >> + >> + return 0; >> +} >> + >> /** >> * __dwc3_gadget_ep_enable - initializes a hw endpoint >> * @dep: endpoint to be initialized >> @@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action) >> int ret; >> >> if (!(dep->flags & DWC3_EP_ENABLED)) { >> + ret = dwc3_gadget_resize_tx_fifos(dep); >> + if (ret) >> + return ret; >> + >> ret = dwc3_gadget_start_config(dep); >> if (ret) >> return ret; >> @@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g) >> >> spin_lock_irqsave(&dwc->lock, flags); >> dwc->gadget_driver = NULL; >> + dwc->max_cfg_eps = 0; >> spin_unlock_irqrestore(&dwc->lock, flags); >> >> free_irq(dwc->irq_gadget, dwc->ev_buf); >> @@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA) >> return 0; >> } >> >> +static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map) > > What's in ep_map? Can you document more to help with the review? > Yeah, I will add some more comments. Just to explain here briefly, this check config callback is to address the concern pointed out by Felipe where we could run out of txfifo memory while our USB composition is being enabled. This would lead to an enumerated device, with non-functioning endpoints. The check config will be called during the function driver bind stage (before we enumerate), and ep_map carries the number of endpoints (both in and out) being used in a particular configuration. With this information, the logic will ensure that there is enough txfifo space for at least 1 fifo per endpoint. If not, we can catch the failure at the composition bind stages. 
(It would be odd, though, to see a DWC3 controller without enough TXFIFO
RAM for one FIFO per EP.) The code that actually resizes the FIFO
allocations always checks that enough FIFO space is left for the
remaining endpoints after every resize.

Thanks
Wesley Cheng

> Thanks,
> Thinh
>
>> +{
>> +	struct dwc3 *dwc = gadget_to_dwc(g);
>> +	unsigned long in_ep_map;
>> +	int fifo_size = 0;
>> +	int ram1_depth;
>> +	int ep_num;
>> +
>> +	if (!dwc->needs_fifo_resize)
>> +		return 0;
>> +
>> +	/* Only interested in the IN endpoints */
>> +	in_ep_map = ep_map >> 16;
>> +	ep_num = hweight_long(in_ep_map);
>> +
>> +	if (ep_num <= dwc->max_cfg_eps)
>> +		return 0;
>> +
>> +	/* Update the max number of eps in the composition */
>> +	dwc->max_cfg_eps = ep_num;
>> +
>> +	fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps);
>> +	/* Based on the equation, increment by one for every ep */
>> +	fifo_size += dwc->max_cfg_eps;
>> +
>> +	/* Check if we can fit a single fifo per endpoint */
>> +	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
>> +	if (fifo_size > ram1_depth)
>> +		return -ENOMEM;
>> +
>> +	return 0;
>> +}
>> +
>>  static const struct usb_gadget_ops dwc3_gadget_ops = {
>>  	.get_frame		= dwc3_gadget_get_frame,
>>  	.wakeup			= dwc3_gadget_wakeup,
>> @@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = {
>>  	.udc_set_speed		= dwc3_gadget_set_speed,
>>  	.get_config_params	= dwc3_gadget_config_params,
>>  	.vbus_draw		= dwc3_gadget_vbus_draw,
>> +	.check_config		= dwc3_gadget_check_config,
>>  };
>>
>>  /* -------------------------------------------------------------------------- */
>

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
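
For readers following the arithmetic in this reply, the bind-time check can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the driver code: the function names are hypothetical, `mdwidth_bytes` stands in for `DWC3_MDWIDTH(hwparams0) >> 3`, `ram1_depth` stands in for `DWC3_RAM1_DEPTH(hwparams7)`, the return value -1 stands in for -ENOMEM, and the `max_cfg_eps` bookkeeping that the patch keeps in struct dwc3 is left out. The only layout assumption about `ep_map` is the one visible in the patch, namely that the IN endpoints sit above bit 16.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the patch's dwc3_gadget_calc_tx_fifo_size(): depth, in
 * MDWIDTH-sized words, needed to hold "mult" packets of 1024 bytes.
 */
static int calc_tx_fifo_size(int mdwidth_bytes, int mult)
{
	const int max_packet = 1024;

	assert(mdwidth_bytes > 0);
	return mult * ((max_packet + mdwidth_bytes) / mdwidth_bytes) + 1;
}

/* Count the IN endpoints, i.e. the bits above bit 16 (hweight_long() in the patch). */
static int count_in_eps(uint32_t ep_map)
{
	uint32_t in_ep_map = ep_map >> 16;
	int n = 0;

	while (in_ep_map) {
		n += (int)(in_ep_map & 1u);
		in_ep_map >>= 1;
	}
	return n;
}

/* Return 0 if one FIFO per IN endpoint fits in RAM1, -1 otherwise. */
static int check_config_fits(uint32_t ep_map, int mdwidth_bytes, int ram1_depth)
{
	int ep_num = count_in_eps(ep_map);
	/* One packet per endpoint, plus one extra word for every endpoint. */
	int fifo_size = calc_tx_fifo_size(mdwidth_bytes, ep_num) + ep_num;

	return fifo_size > ram1_depth ? -1 : 0;
}
```

With an assumed 64-bit master bus (8 bytes), a single-packet FIFO works out to 130 words, so a hypothetical RAM1 depth of 2048 words would pass this check for 15 IN endpoints (1951 words) and fail it for 16 (2081 words).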
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements 2021-01-26 9:51 ` Wesley Cheng @ 2021-01-26 20:43 ` Thinh Nguyen 2021-01-26 23:26 ` Wesley Cheng 0 siblings, 1 reply; 20+ messages in thread From: Thinh Nguyen @ 2021-01-26 20:43 UTC (permalink / raw) To: Wesley Cheng, Thinh Nguyen, balbi, gregkh, robh+dt, agross, bjorn.andersson Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp Wesley Cheng wrote: > > On 1/22/2021 4:15 PM, Thinh Nguyen wrote: >> Hi, >> >> Wesley Cheng wrote: >>> Some devices have USB compositions which may require multiple endpoints >>> that support EP bursting. HW defined TX FIFO sizes may not always be >>> sufficient for these compositions. By utilizing flexible TX FIFO >>> allocation, this allows for endpoints to request the required FIFO depth to >>> achieve higher bandwidth. With some higher bMaxBurst configurations, using >>> a larger TX FIFO size results in better TX throughput. >>> >>> By introducing the check_config() callback, the resizing logic can fetch >>> the maximum number of endpoints used in the USB composition (can contain >>> multiple configurations), which helps ensure that the resizing logic can >>> fulfill the configuration(s), or return an error to the gadget layer >>> otherwise during bind time. 
>>> >>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org> >>> --- >>> drivers/usb/dwc3/core.c | 2 + >>> drivers/usb/dwc3/core.h | 8 ++ >>> drivers/usb/dwc3/ep0.c | 2 + >>> drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++ >>> 4 files changed, 206 insertions(+) >>> >>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c >>> index 6969196..e7fa6af 100644 >>> --- a/drivers/usb/dwc3/core.c >>> +++ b/drivers/usb/dwc3/core.c >>> @@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc) >>> &tx_thr_num_pkt_prd); >>> device_property_read_u8(dev, "snps,tx-max-burst-prd", >>> &tx_max_burst_prd); >>> + dwc->needs_fifo_resize = device_property_read_bool(dev, >>> + "tx-fifo-resize"); >>> >>> dwc->disable_scramble_quirk = device_property_read_bool(dev, >>> "snps,disable_scramble_quirk"); >>> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h >>> index eec1cf4..983b2fd4 100644 >>> --- a/drivers/usb/dwc3/core.h >>> +++ b/drivers/usb/dwc3/core.h >>> @@ -1223,6 +1223,7 @@ struct dwc3 { >>> unsigned is_utmi_l1_suspend:1; >>> unsigned is_fpga:1; >>> unsigned pending_events:1; >>> + unsigned needs_fifo_resize:1; >> The prefix "need" sounds like a requirement, but I don't think it is the >> case here. I think "do" would be a better prefix here. >> > Hi Thinh, > > Sure, that is true, since this may be an optional flag for certain > platforms. > >>> unsigned pullups_connected:1; >>> unsigned setup_packet_pending:1; >>> unsigned three_stage_setup:1; >>> @@ -1257,6 +1258,10 @@ struct dwc3 { >>> unsigned dis_split_quirk:1; >>> >>> u16 imod_interval; >>> + >>> + int max_cfg_eps; >>> + int last_fifo_depth; >>> + int num_ep_resized; >>> }; >> Please document these new fields. >> > Will do. 
> >>> >>> #define INCRX_BURST_MODE 0 >>> @@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd, >>> struct dwc3_gadget_ep_cmd_params *params); >>> int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd, >>> u32 param); >>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc); >>> #else >>> static inline int dwc3_gadget_init(struct dwc3 *dwc) >>> { return 0; } >>> @@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd, >>> static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc, >>> int cmd, u32 param) >>> { return 0; } >>> +static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >>> +{ } >>> #endif >>> >>> #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE) >>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c >>> index 8b668ef..4f216bd 100644 >>> --- a/drivers/usb/dwc3/ep0.c >>> +++ b/drivers/usb/dwc3/ep0.c >>> @@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl) >>> return -EINVAL; >>> >>> case USB_STATE_ADDRESS: >>> + dwc3_gadget_clear_tx_fifos(dwc); >>> + >>> ret = dwc3_ep0_delegate_req(dwc, ctrl); >>> /* if the cfg matches and the cfg is non zero */ >>> if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) { >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index 86f257f..26f9d64 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action) >>> static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, >>> bool interrupt); >>> >>> +static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult) >> Can you document what this function does? >> > Will do. > >>> +{ >>> + int max_packet = 1024; >> Maybe you can also document why you chose 1024 (e.g. applicable to >> Enhanced SuperSpeed only?). >> > Sure. 
Its basically applicable for SS and isoc (hs/ss) use cases since > max packet size is 1024 in both cases. Highspeed bulk MPS is 512. Fullspeed varies more (bulk max MPS is 64 and isoc is 1023). However, we can keep it simple with 1024 as if it's ok to over estimate. Just need to note that here. > >>> + int fifo_size; >>> + int mdwidth; >>> + >>> + mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0); >>> + /* MDWIDTH is represented in bits, we need it in bytes */ >>> + mdwidth >>= 3; >> mdwidth for DWC32 requires to read hwparams6 for the upper 2 significant >> bits. Can we add a check for DWC32 also? You can check how we're doing >> it now in the current code. >> > Sure. I'll make sure to get the correct registers for the DWC32 case. > >>> + >>> + fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1; >>> + return fifo_size; >>> +} >>> + >>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >>> +{ >>> + struct dwc3_ep *dep; >>> + int fifo_depth; >>> + int size; >>> + int num; >>> + >>> + if (!dwc->needs_fifo_resize) >>> + return; >>> + >>> + /* Read ep0IN related TXFIFO size */ >>> + dep = dwc->eps[1]; >>> + size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >>> + if (DWC3_IP_IS(DWC31)) >>> + fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size); >>> + else >>> + fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size); >> The driver handles 3 IPs. Getting the fifo depth for DWC32 is the same >> as DWC31. So the condition should be >> if (DWC3_IP_IS(DWC3)) >> fifo_depth = ... >> else >> fifo_depth = ... >> > Understood. > >>> + >>> + dwc->last_fifo_depth = fifo_depth; >>> + /* Clear existing TXFIFO for all IN eps except ep0 */ >>> + for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM); >>> + num += 2) { >>> + dep = dwc->eps[num]; >>> + /* Don't change TXFRAMNUM on usb31 version */ >>> + size = DWC3_IP_IS(DWC31) ? >>> + dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) & >>> + DWC31_GTXFIFOSIZ_TXFRAMNUM : 0; >>> + >> Same here. Check for DWC32. 
>> >>> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size); >>> + } >>> + dwc->num_ep_resized = 0; >>> +} >>> + >>> +/* >>> + * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case >>> + * @dwc: pointer to our context structure >>> + * >>> + * This function will a best effort FIFO allocation in order >>> + * to improve FIFO usage and throughput, while still allowing >>> + * us to enable as many endpoints as possible. >>> + * >>> + * Keep in mind that this operation will be highly dependent >>> + * on the configured size for RAM1 - which contains TxFifo -, >>> + * the amount of endpoints enabled on coreConsultant tool, and >>> + * the width of the Master Bus. >>> + * >>> + * In general, FIFO depths are represented with the following equation: >>> + * >>> + * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1 >>> + * >>> + * Conversions can be done to the equation to derive the number of packets that >>> + * will fit to a particular FIFO size value. >>> + */ >>> +static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) >>> +{ >>> + struct dwc3 *dwc = dep->dwc; >>> + int fifo_0_start; >>> + int ram1_depth; >>> + int fifo_size; >>> + int min_depth; >>> + int num_in_ep; >>> + int remaining; >>> + int mult = 1; >>> + int fifo; >>> + int tmp; >>> + >>> + if (!dwc->needs_fifo_resize) >>> + return 0; >> Maybe add a condition to check for Enhanced SuperSpeed only? >> > Since this logic applies for isoc endpoints as well in high speed mode, > for high bandwidth use cases, we can't limit it to SS only. Ok. I was asking because you use 1024 as MPS in your calculation. 
> >>> + >>> + /* resize IN endpoints except ep0 */ >>> + if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1) >>> + return 0; >>> + >>> + ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); >>> + >>> + if ((dep->endpoint.maxburst > 1 && >>> + usb_endpoint_xfer_bulk(dep->endpoint.desc)) || >>> + usb_endpoint_xfer_isoc(dep->endpoint.desc)) >>> + mult = 3; >>> + >>> + if (dep->endpoint.maxburst > 6 && >>> + usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31)) >>> + mult = 6; >> You checked maxburst > 1 for isoc, but not when maxburst > 6. Why? >> Also, "mult" is the term we usually use for isoc endpoints. Applying it >> to bulk is confusing here. >> > Ok, let me rename it to something else that makes more sense. The isoc > endpoint check was targeted for mainly high-speed high bandwidth isoc > use cases, and I don't believe our results improved with a larger fifo > allocation, ie 6. (refer to below) It should improve for isoc and reduce missed isoc error also. Most applications only use 1-16KB max per interval, so you don't see the impact as much. > >> How did we decide on 3 and 6? Are they arbitrary? >> > So actually in the databook, they have some recommendations for TXFIFO > sizes to use in "Table 4-3 Device Config Parameters." It mentions that > for burst capable endpoints to have a fifo size to fit at least 3 > packets of maxpacket size. > > There's also "Chapter 3 Cache, FIFO RAMs, and Bandwidth Requirements," > which goes over a lot of optimizations that could be done based off your > system's overall latency. The sizes were chosen after we ran our peak > throughput and performance testing on our devices, and these values > netted the best throughput while also allowing enough fifo space for our > other USB endpoints. 
Also, there's going to be a limit on how much > improvement you see with respects to increasing the fifo size, since > your system will eventually be able to pull data out of the internal > fifo faster than it is being filled. I added a patch a while back to check the max packet limit based on the recommended minimum Rx/TxFIFO size d94ea5319813 ("usb: dwc3: gadget: Properly set maxpacket limit") The driver wouldn't match the endpoint if it doesn't meet the minimum requirement of 3 MPS if the device is operating in SuperSpeed or SuperSpeed Plus. Was the check for "3" necessary? "6" is your tested value right? It may be different for different setup. Can we pass this as a setting parameter from the devicetree property? >>> + >>> + /* FIFO size for a single buffer */ >>> + fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1); >>> + >>> + /* Calculate the number of remaining EPs w/o any FIFO */ >>> + num_in_ep = dwc->max_cfg_eps; >>> + num_in_ep -= dwc->num_ep_resized; >>> + >>> + /* Reserve at least one FIFO for the number of IN EPs */ >>> + min_depth = num_in_ep * (fifo + 1); >>> + remaining = ram1_depth - min_depth - dwc->last_fifo_depth; >> Can "remaining" be a negative value? If so, I think it's clearer if you do >> remaining = max_t(int, 0, remaining); >> > Sure. > >>> + >>> + /* >>> + * We've already reserved 1 FIFO per EP, so check what we can fit in >>> + * addition to it. If there is not enough remaining space, allocate >>> + * all the remaining space to the EP. 
>>> + */ >>> + fifo_size = (mult - 1) * fifo; >>> + if (remaining < fifo_size) { >>> + if (remaining > 0) >>> + fifo_size = remaining; >>> + else >>> + fifo_size = 0; >> Then use this condition instead: >> >> if (remaining < fifo_size) >> fifo_size = remaining; >> >>> + } >>> + >>> + fifo_size += fifo; >>> + /* Last increment according to the TX FIFO size equation */ >>> + fifo_size++; >>> + >>> + /* Check if TXFIFOs start at non-zero addr */ >>> + tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >>> + fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp); >>> + >>> + fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16)); >>> + if (DWC3_IP_IS(DWC31)) >>> + dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); >>> + else >>> + dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); >> Take account of DWC32. >> > Got it. >>> + >>> + /* Check fifo size allocation doesn't exceed available RAM size. */ >>> + if (dwc->last_fifo_depth >= ram1_depth) { >>> + dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n", >>> + dwc->last_fifo_depth, ram1_depth, >>> + dep->endpoint.name, fifo_size); >>> + if (DWC3_IP_IS(DWC31)) >>> + fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); >>> + else >>> + fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); >> Same here. 
>> >>> + dwc->last_fifo_depth -= fifo_size; >>> + return -ENOMEM; >>> + } >>> + >>> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size); >>> + dwc->num_ep_resized++; >>> + >>> + return 0; >>> +} >>> + >>> /** >>> * __dwc3_gadget_ep_enable - initializes a hw endpoint >>> * @dep: endpoint to be initialized >>> @@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action) >>> int ret; >>> >>> if (!(dep->flags & DWC3_EP_ENABLED)) { >>> + ret = dwc3_gadget_resize_tx_fifos(dep); >>> + if (ret) >>> + return ret; >>> + >>> ret = dwc3_gadget_start_config(dep); >>> if (ret) >>> return ret; >>> @@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g) >>> >>> spin_lock_irqsave(&dwc->lock, flags); >>> dwc->gadget_driver = NULL; >>> + dwc->max_cfg_eps = 0; >>> spin_unlock_irqrestore(&dwc->lock, flags); >>> >>> free_irq(dwc->irq_gadget, dwc->ev_buf); >>> @@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA) >>> return 0; >>> } >>> >>> +static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map) >> What's in ep_map? Can you document more to help with the review? >> > Yeah, I will add some more comments. Just to explain here briefly, this > check config callback is to address the concern pointed out by Felipe > where we could run out of txfifo memory while our USB composition is > being enabled. This would lead to an enumerated device, with > non-functioning endpoints. > > The check config will be called during the function driver bind stage > (before we enumerate), and ep_map carries the number of endpoints (both > in and out) being used in a particular configuration. With this > information, the logic will ensure that there is enough txfifo space for > at least 1 fifo per endpoint. If not, we can catch the failure at the > composition bind stages. 
> (although it would be odd to see a DWC3
> controller with not enough txfifo ram for 1 fifo per ep)  The point at
> which we actually resize the fifo allocations, will always check to make
> sure that there is enough fifo space for the remaining endpoints after
> every resize.
>
> Thanks
> Wesley Cheng

Thanks for the info. I'll review more after you add more detail.

BR,
Thinh

>> Thanks,
>> Thinh
>>
>>> +{
>>> +	struct dwc3 *dwc = gadget_to_dwc(g);
>>> +	unsigned long in_ep_map;
>>> +	int fifo_size = 0;
>>> +	int ram1_depth;
>>> +	int ep_num;
>>> +
>>> +	if (!dwc->needs_fifo_resize)
>>> +		return 0;
>>> +
>>> +	/* Only interested in the IN endpoints */
>>> +	in_ep_map = ep_map >> 16;
>>> +	ep_num = hweight_long(in_ep_map);
>>> +
>>> +	if (ep_num <= dwc->max_cfg_eps)
>>> +		return 0;
>>> +
>>> +	/* Update the max number of eps in the composition */
>>> +	dwc->max_cfg_eps = ep_num;
>>> +
>>> +	fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps);
>>> +	/* Based on the equation, increment by one for every ep */
>>> +	fifo_size += dwc->max_cfg_eps;
>>> +
>>> +	/* Check if we can fit a single fifo per endpoint */
>>> +	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
>>> +	if (fifo_size > ram1_depth)
>>> +		return -ENOMEM;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>  static const struct usb_gadget_ops dwc3_gadget_ops = {
>>>  	.get_frame		= dwc3_gadget_get_frame,
>>>  	.wakeup			= dwc3_gadget_wakeup,
>>> @@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = {
>>>  	.udc_set_speed		= dwc3_gadget_set_speed,
>>>  	.get_config_params	= dwc3_gadget_config_params,
>>>  	.vbus_draw		= dwc3_gadget_vbus_draw,
>>> +	.check_config		= dwc3_gadget_check_config,
>>>  };
>>>
>>>  /* -------------------------------------------------------------------------- */
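
The per-endpoint budget discussed in this review (reserve one FIFO for every IN endpoint still to be enabled, then hand out whatever extra fits) can be sketched in userspace C with the suggested clamp of "remaining" to zero applied. This is a sketch under assumptions, not the driver code: all names are hypothetical, `fifo` is the single-packet depth that the patch gets from dwc3_gadget_calc_tx_fifo_size(dwc, 1), `used_depth` models dwc->last_fifo_depth, `eps_left` models max_cfg_eps minus num_ep_resized, and `num_fifos` models the value the patch calls "mult" (1, 3 or 6). The register packing and the final RAM-size check are not modelled.

```c
#include <assert.h>

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Depth to program for one IN endpoint.
 *
 * ram1_depth: total TX RAM depth in words
 * used_depth: depth already handed out to earlier FIFOs
 * eps_left:   IN endpoints without a FIFO yet, including this one
 * fifo:       depth of a single-packet FIFO
 * num_fifos:  number of packets the endpoint would like to buffer
 */
static int ep_tx_fifo_depth(int ram1_depth, int used_depth, int eps_left,
			    int fifo, int num_fifos)
{
	/* Reserve one FIFO (plus one word) for every endpoint still waiting. */
	int min_depth = eps_left * (fifo + 1);
	/* Clamp so that a shortfall never shrinks the base allocation. */
	int remaining = max_int(0, ram1_depth - min_depth - used_depth);
	int extra = (num_fifos - 1) * fifo;

	assert(fifo > 0 && num_fifos >= 1);

	/* Best effort: give the endpoint whatever extra space is available. */
	if (remaining < extra)
		extra = remaining;

	/* Base FIFO plus the final increment from the size equation. */
	return extra + fifo + 1;
}
```

With the clamp in place, the nested "remaining > 0" test from the patch collapses into a single comparison, which is the simplification requested in the review. When RAM1 is too small to give anything extra, the endpoint still gets its reserved single FIFO.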
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements 2021-01-26 20:43 ` Thinh Nguyen @ 2021-01-26 23:26 ` Wesley Cheng 2021-01-27 1:47 ` Thinh Nguyen 0 siblings, 1 reply; 20+ messages in thread From: Wesley Cheng @ 2021-01-26 23:26 UTC (permalink / raw) To: Thinh Nguyen, balbi, gregkh, robh+dt, agross, bjorn.andersson Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp On 1/26/2021 12:43 PM, Thinh Nguyen wrote: > Wesley Cheng wrote: >> >> On 1/22/2021 4:15 PM, Thinh Nguyen wrote: >>> Hi, >>> >>> Wesley Cheng wrote: >>>> Some devices have USB compositions which may require multiple endpoints >>>> that support EP bursting. HW defined TX FIFO sizes may not always be >>>> sufficient for these compositions. By utilizing flexible TX FIFO >>>> allocation, this allows for endpoints to request the required FIFO depth to >>>> achieve higher bandwidth. With some higher bMaxBurst configurations, using >>>> a larger TX FIFO size results in better TX throughput. >>>> >>>> By introducing the check_config() callback, the resizing logic can fetch >>>> the maximum number of endpoints used in the USB composition (can contain >>>> multiple configurations), which helps ensure that the resizing logic can >>>> fulfill the configuration(s), or return an error to the gadget layer >>>> otherwise during bind time. 
>>>> >>>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org> >>>> --- >>>> drivers/usb/dwc3/core.c | 2 + >>>> drivers/usb/dwc3/core.h | 8 ++ >>>> drivers/usb/dwc3/ep0.c | 2 + >>>> drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++ >>>> 4 files changed, 206 insertions(+) >>>> >>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c >>>> index 6969196..e7fa6af 100644 >>>> --- a/drivers/usb/dwc3/core.c >>>> +++ b/drivers/usb/dwc3/core.c >>>> @@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc) >>>> &tx_thr_num_pkt_prd); >>>> device_property_read_u8(dev, "snps,tx-max-burst-prd", >>>> &tx_max_burst_prd); >>>> + dwc->needs_fifo_resize = device_property_read_bool(dev, >>>> + "tx-fifo-resize"); >>>> >>>> dwc->disable_scramble_quirk = device_property_read_bool(dev, >>>> "snps,disable_scramble_quirk"); >>>> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h >>>> index eec1cf4..983b2fd4 100644 >>>> --- a/drivers/usb/dwc3/core.h >>>> +++ b/drivers/usb/dwc3/core.h >>>> @@ -1223,6 +1223,7 @@ struct dwc3 { >>>> unsigned is_utmi_l1_suspend:1; >>>> unsigned is_fpga:1; >>>> unsigned pending_events:1; >>>> + unsigned needs_fifo_resize:1; >>> The prefix "need" sounds like a requirement, but I don't think it is the >>> case here. I think "do" would be a better prefix here. >>> >> Hi Thinh, >> >> Sure, that is true, since this may be an optional flag for certain >> platforms. >> >>>> unsigned pullups_connected:1; >>>> unsigned setup_packet_pending:1; >>>> unsigned three_stage_setup:1; >>>> @@ -1257,6 +1258,10 @@ struct dwc3 { >>>> unsigned dis_split_quirk:1; >>>> >>>> u16 imod_interval; >>>> + >>>> + int max_cfg_eps; >>>> + int last_fifo_depth; >>>> + int num_ep_resized; >>>> }; >>> Please document these new fields. >>> >> Will do. 
>> >>>> >>>> #define INCRX_BURST_MODE 0 >>>> @@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd, >>>> struct dwc3_gadget_ep_cmd_params *params); >>>> int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd, >>>> u32 param); >>>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc); >>>> #else >>>> static inline int dwc3_gadget_init(struct dwc3 *dwc) >>>> { return 0; } >>>> @@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd, >>>> static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc, >>>> int cmd, u32 param) >>>> { return 0; } >>>> +static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >>>> +{ } >>>> #endif >>>> >>>> #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE) >>>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c >>>> index 8b668ef..4f216bd 100644 >>>> --- a/drivers/usb/dwc3/ep0.c >>>> +++ b/drivers/usb/dwc3/ep0.c >>>> @@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl) >>>> return -EINVAL; >>>> >>>> case USB_STATE_ADDRESS: >>>> + dwc3_gadget_clear_tx_fifos(dwc); >>>> + >>>> ret = dwc3_ep0_delegate_req(dwc, ctrl); >>>> /* if the cfg matches and the cfg is non zero */ >>>> if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) { >>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>> index 86f257f..26f9d64 100644 >>>> --- a/drivers/usb/dwc3/gadget.c >>>> +++ b/drivers/usb/dwc3/gadget.c >>>> @@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action) >>>> static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, >>>> bool interrupt); >>>> >>>> +static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult) >>> Can you document what this function does? >>> >> Will do. >> >>>> +{ >>>> + int max_packet = 1024; >>> Maybe you can also document why you chose 1024 (e.g. 
applicable to >>> Enhanced SuperSpeed only?). >>> >> Sure. Its basically applicable for SS and isoc (hs/ss) use cases since >> max packet size is 1024 in both cases. > > Highspeed bulk MPS is 512. Fullspeed varies more (bulk max MPS is 64 and > isoc is 1023). > > However, we can keep it simple with 1024 as if it's ok to over estimate. > Just need to note that here. > >> >>>> + int fifo_size; >>>> + int mdwidth; >>>> + >>>> + mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0); >>>> + /* MDWIDTH is represented in bits, we need it in bytes */ >>>> + mdwidth >>= 3; >>> mdwidth for DWC32 requires to read hwparams6 for the upper 2 significant >>> bits. Can we add a check for DWC32 also? You can check how we're doing >>> it now in the current code. >>> >> Sure. I'll make sure to get the correct registers for the DWC32 case. >> >>>> + >>>> + fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1; >>>> + return fifo_size; >>>> +} >>>> + >>>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >>>> +{ >>>> + struct dwc3_ep *dep; >>>> + int fifo_depth; >>>> + int size; >>>> + int num; >>>> + >>>> + if (!dwc->needs_fifo_resize) >>>> + return; >>>> + >>>> + /* Read ep0IN related TXFIFO size */ >>>> + dep = dwc->eps[1]; >>>> + size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >>>> + if (DWC3_IP_IS(DWC31)) >>>> + fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size); >>>> + else >>>> + fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size); >>> The driver handles 3 IPs. Getting the fifo depth for DWC32 is the same >>> as DWC31. So the condition should be >>> if (DWC3_IP_IS(DWC3)) >>> fifo_depth = ... >>> else >>> fifo_depth = ... >>> >> Understood. >> >>>> + >>>> + dwc->last_fifo_depth = fifo_depth; >>>> + /* Clear existing TXFIFO for all IN eps except ep0 */ >>>> + for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM); >>>> + num += 2) { >>>> + dep = dwc->eps[num]; >>>> + /* Don't change TXFRAMNUM on usb31 version */ >>>> + size = DWC3_IP_IS(DWC31) ? 
>>>> + dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) & >>>> + DWC31_GTXFIFOSIZ_TXFRAMNUM : 0; >>>> + >>> Same here. Check for DWC32. >>> >>>> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size); >>>> + } >>>> + dwc->num_ep_resized = 0; >>>> +} >>>> + >>>> +/* >>>> + * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case >>>> + * @dwc: pointer to our context structure >>>> + * >>>> + * This function will a best effort FIFO allocation in order >>>> + * to improve FIFO usage and throughput, while still allowing >>>> + * us to enable as many endpoints as possible. >>>> + * >>>> + * Keep in mind that this operation will be highly dependent >>>> + * on the configured size for RAM1 - which contains TxFifo -, >>>> + * the amount of endpoints enabled on coreConsultant tool, and >>>> + * the width of the Master Bus. >>>> + * >>>> + * In general, FIFO depths are represented with the following equation: >>>> + * >>>> + * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1 >>>> + * >>>> + * Conversions can be done to the equation to derive the number of packets that >>>> + * will fit to a particular FIFO size value. >>>> + */ >>>> +static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) >>>> +{ >>>> + struct dwc3 *dwc = dep->dwc; >>>> + int fifo_0_start; >>>> + int ram1_depth; >>>> + int fifo_size; >>>> + int min_depth; >>>> + int num_in_ep; >>>> + int remaining; >>>> + int mult = 1; >>>> + int fifo; >>>> + int tmp; >>>> + >>>> + if (!dwc->needs_fifo_resize) >>>> + return 0; >>> Maybe add a condition to check for Enhanced SuperSpeed only? >>> >> Since this logic applies for isoc endpoints as well in high speed mode, >> for high bandwidth use cases, we can't limit it to SS only. > > Ok. I was asking because you use 1024 as MPS in your calculation. 
> >> >>>> + >>>> + /* resize IN endpoints except ep0 */ >>>> + if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1) >>>> + return 0; >>>> + >>>> + ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); >>>> + >>>> + if ((dep->endpoint.maxburst > 1 && >>>> + usb_endpoint_xfer_bulk(dep->endpoint.desc)) || >>>> + usb_endpoint_xfer_isoc(dep->endpoint.desc)) >>>> + mult = 3; >>>> + >>>> + if (dep->endpoint.maxburst > 6 && >>>> + usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31)) >>>> + mult = 6; >>> You checked maxburst > 1 for isoc, but not when maxburst > 6. Why? >>> Also, "mult" is the term we usually use for isoc endpoints. Applying it >>> to bulk is confusing here. >>> >> Ok, let me rename it to something else that makes more sense. The isoc >> endpoint check was targeted for mainly high-speed high bandwidth isoc >> use cases, and I don't believe our results improved with a larger fifo >> allocation, ie 6. (refer to below) > > It should improve for isoc and reduce missed isoc error also. Most > applications only use 1-16KB max per interval, so you don't see the > impact as much. > >> >>> How did we decide on 3 and 6? Are they arbitrary? >>> >> So actually in the databook, they have some recommendations for TXFIFO >> sizes to use in "Table 4-3 Device Config Parameters." It mentions that >> for burst capable endpoints to have a fifo size to fit at least 3 >> packets of maxpacket size. >> >> There's also "Chapter 3 Cache, FIFO RAMs, and Bandwidth Requirements," >> which goes over a lot of optimizations that could be done based off your >> system's overall latency. The sizes were chosen after we ran our peak >> throughput and performance testing on our devices, and these values >> netted the best throughput while also allowing enough fifo space for our >> other USB endpoints. 
Also, there's going to be a limit on how much >> improvement you see with respect to increasing the fifo size, since >> your system will eventually be able to pull data out of the internal >> fifo faster than it is being filled. > > I added a patch a while back to check the max packet limit based on the > recommended minimum Rx/TxFIFO size > d94ea5319813 ("usb: dwc3: gadget: Properly set maxpacket limit") > > The driver wouldn't match the endpoint if it doesn't meet the minimum > requirement of 3 MPS if the device is operating in SuperSpeed or > SuperSpeed Plus. Was the check for "3" necessary? > Hi Thinh, Please correct me if I'm wrong, but when a function driver requests an endpoint using usb_ep_autoconfig(), most of the time it passes in an FS descriptor that doesn't have the wMaxPacketSize parameter set (0), which means it will probably fetch the first unused endpoint. (at least this is what I noticed for the BULK ep cases) In this case, it would be hard to determine if the endpoint selected would already have the constraint you added factored in. > "6" is your tested value right? It may be different for different setups. > Can we pass this as a setting parameter from the devicetree property? > Definitely can agree to that. We can make things a bit more flexible depending on the system. Maybe create a new property, and default it to 6 if "do_fifo_resize" is set, but this parameter is not defined. Thanks Wesley Cheng >>>> + >>>> + /* FIFO size for a single buffer */ >>>> + fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1); >>>> + >>>> + /* Calculate the number of remaining EPs w/o any FIFO */ >>>> + num_in_ep = dwc->max_cfg_eps; >>>> + num_in_ep -= dwc->num_ep_resized; >>>> + >>>> + /* Reserve at least one FIFO for the number of IN EPs */ >>>> + min_depth = num_in_ep * (fifo + 1); >>>> + remaining = ram1_depth - min_depth - dwc->last_fifo_depth; >>> Can "remaining" be a negative value?
If so, I think it's clearer if you do >>> remaining = max_t(int, 0, remaining); >>> >> Sure. >> >>>> + >>>> + /* >>>> + * We've already reserved 1 FIFO per EP, so check what we can fit in >>>> + * addition to it. If there is not enough remaining space, allocate >>>> + * all the remaining space to the EP. >>>> + */ >>>> + fifo_size = (mult - 1) * fifo; >>>> + if (remaining < fifo_size) { >>>> + if (remaining > 0) >>>> + fifo_size = remaining; >>>> + else >>>> + fifo_size = 0; >>> Then use this condition instead: >>> >>> if (remaining < fifo_size) >>> fifo_size = remaining; >>> >>>> + } >>>> + >>>> + fifo_size += fifo; >>>> + /* Last increment according to the TX FIFO size equation */ >>>> + fifo_size++; >>>> + >>>> + /* Check if TXFIFOs start at non-zero addr */ >>>> + tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >>>> + fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp); >>>> + >>>> + fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16)); >>>> + if (DWC3_IP_IS(DWC31)) >>>> + dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); >>>> + else >>>> + dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); >>> Take account of DWC32. >>> >> Got it. >>>> + >>>> + /* Check fifo size allocation doesn't exceed available RAM size. */ >>>> + if (dwc->last_fifo_depth >= ram1_depth) { >>>> + dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n", >>>> + dwc->last_fifo_depth, ram1_depth, >>>> + dep->endpoint.name, fifo_size); >>>> + if (DWC3_IP_IS(DWC31)) >>>> + fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size); >>>> + else >>>> + fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size); >>> Same here. 
>>> >>>> + dwc->last_fifo_depth -= fifo_size; >>>> + return -ENOMEM; >>>> + } >>>> + >>>> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size); >>>> + dwc->num_ep_resized++; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> /** >>>> * __dwc3_gadget_ep_enable - initializes a hw endpoint >>>> * @dep: endpoint to be initialized >>>> @@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action) >>>> int ret; >>>> >>>> if (!(dep->flags & DWC3_EP_ENABLED)) { >>>> + ret = dwc3_gadget_resize_tx_fifos(dep); >>>> + if (ret) >>>> + return ret; >>>> + >>>> ret = dwc3_gadget_start_config(dep); >>>> if (ret) >>>> return ret; >>>> @@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g) >>>> >>>> spin_lock_irqsave(&dwc->lock, flags); >>>> dwc->gadget_driver = NULL; >>>> + dwc->max_cfg_eps = 0; >>>> spin_unlock_irqrestore(&dwc->lock, flags); >>>> >>>> free_irq(dwc->irq_gadget, dwc->ev_buf); >>>> @@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA) >>>> return 0; >>>> } >>>> >>>> +static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map) >>> What's in ep_map? Can you document more to help with the review? >>> >> Yeah, I will add some more comments. Just to explain here briefly, this >> check config callback is to address the concern pointed out by Felipe >> where we could run out of txfifo memory while our USB composition is >> being enabled. This would lead to an enumerated device, with >> non-functioning endpoints. >> >> The check config will be called during the function driver bind stage >> (before we enumerate), and ep_map carries the number of endpoints (both >> in and out) being used in a particular configuration. With this >> information, the logic will ensure that there is enough txfifo space for >> at least 1 fifo per endpoint. If not, we can catch the failure at the >> composition bind stages. 
(although it would be odd to see a DWC3 >> controller with not enough txfifo ram for 1 fifo per ep) The point at >> which we actually resize the fifo allocations, will always check to make >> sure that there is enough fifo space for the remaining endpoints after >> every resize. >> >> Thanks >> Wesley Cheng > > Thanks for the info. I'll review more after you add more detail. > > BR, > Thinh > >>> Thanks, >>> Thinh >>> >>>> +{ >>>> + struct dwc3 *dwc = gadget_to_dwc(g); >>>> + unsigned long in_ep_map; >>>> + int fifo_size = 0; >>>> + int ram1_depth; >>>> + int ep_num; >>>> + >>>> + if (!dwc->needs_fifo_resize) >>>> + return 0; >>>> + >>>> + /* Only interested in the IN endpoints */ >>>> + in_ep_map = ep_map >> 16; >>>> + ep_num = hweight_long(in_ep_map); >>>> + >>>> + if (ep_num <= dwc->max_cfg_eps) >>>> + return 0; >>>> + >>>> + /* Update the max number of eps in the composition */ >>>> + dwc->max_cfg_eps = ep_num; >>>> + >>>> + fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps); >>>> + /* Based on the equation, increment by one for every ep */ >>>> + fifo_size += dwc->max_cfg_eps; >>>> + >>>> + /* Check if we can fit a single fifo per endpoint */ >>>> + ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); >>>> + if (fifo_size > ram1_depth) >>>> + return -ENOMEM; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> static const struct usb_gadget_ops dwc3_gadget_ops = { >>>> .get_frame = dwc3_gadget_get_frame, >>>> .wakeup = dwc3_gadget_wakeup, >>>> @@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = { >>>> .udc_set_speed = dwc3_gadget_set_speed, >>>> .get_config_params = dwc3_gadget_config_params, >>>> .vbus_draw = dwc3_gadget_vbus_draw, >>>> + .check_config = dwc3_gadget_check_config, >>>> }; >>>> >>>> /* -------------------------------------------------------------------------- */ > -- The Qualcomm Innovation Center, Inc. 
is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
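The per-endpoint sizing discussed in this reply (reserve one FIFO for every IN endpoint still to be configured, then hand out whatever extra space is left) can be written as a small pure function. This is a hedged sketch only: ep_tx_fifo_depth() and its parameters are hypothetical names, the register packing (start address, TXFRAMNUM) is omitted, and the clamp follows the review suggestion of forcing "remaining" to zero before comparing.

```c
#include <assert.h>

/*
 * Depth to give one IN endpoint, in the same units the patch computes.
 *
 * fifo:            depth of a single-packet FIFO
 * mult:            packets the endpoint would like buffered (1, 3 or 6 in the patch)
 * eps_left:        IN endpoints not yet resized, including this one
 * ram1_depth:      total RAM1 depth
 * last_fifo_depth: depth already handed out to earlier FIFOs
 */
static int ep_tx_fifo_depth(int fifo, int mult, int eps_left,
			    int ram1_depth, int last_fifo_depth)
{
	int min_depth = eps_left * (fifo + 1);	/* one FIFO per remaining EP */
	int remaining = ram1_depth - min_depth - last_fifo_depth;
	int extra = (mult - 1) * fifo;		/* space wanted beyond that */

	assert(fifo > 0 && mult >= 1 && eps_left >= 1);

	if (remaining < 0)
		remaining = 0;		/* equivalent to max_t(int, 0, remaining) */
	if (remaining < extra)
		extra = remaining;	/* take whatever is left */

	/* the reserved single FIFO plus the final "+ 1" of the equation */
	return extra + fifo + 1;
}
```

For example, with fifo = 130, mult = 3, two endpoints left and 130 entries already used by ep0, a RAM1 depth of 2000 grants the full 391 entries, a depth of 500 grants 239, and a depth of 300 falls back to the reserved minimum of 131.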
* Re: [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements 2021-01-26 23:26 ` Wesley Cheng @ 2021-01-27 1:47 ` Thinh Nguyen 0 siblings, 0 replies; 20+ messages in thread From: Thinh Nguyen @ 2021-01-27 1:47 UTC (permalink / raw) To: Wesley Cheng, Thinh Nguyen, balbi, gregkh, robh+dt, agross, bjorn.andersson Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen, jackp Wesley Cheng wrote: > > On 1/26/2021 12:43 PM, Thinh Nguyen wrote: >> Wesley Cheng wrote: >>> On 1/22/2021 4:15 PM, Thinh Nguyen wrote: >>>> Hi, >>>> >>>> Wesley Cheng wrote: >>>>> Some devices have USB compositions which may require multiple endpoints >>>>> that support EP bursting. HW defined TX FIFO sizes may not always be >>>>> sufficient for these compositions. By utilizing flexible TX FIFO >>>>> allocation, this allows for endpoints to request the required FIFO depth to >>>>> achieve higher bandwidth. With some higher bMaxBurst configurations, using >>>>> a larger TX FIFO size results in better TX throughput. >>>>> >>>>> By introducing the check_config() callback, the resizing logic can fetch >>>>> the maximum number of endpoints used in the USB composition (can contain >>>>> multiple configurations), which helps ensure that the resizing logic can >>>>> fulfill the configuration(s), or return an error to the gadget layer >>>>> otherwise during bind time. 
>>>>> >>>>> Signed-off-by: Wesley Cheng <wcheng@codeaurora.org> >>>>> --- >>>>> drivers/usb/dwc3/core.c | 2 + >>>>> drivers/usb/dwc3/core.h | 8 ++ >>>>> drivers/usb/dwc3/ep0.c | 2 + >>>>> drivers/usb/dwc3/gadget.c | 194 ++++++++++++++++++++++++++++++++++++++++++++++ >>>>> 4 files changed, 206 insertions(+) >>>>> >>>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c >>>>> index 6969196..e7fa6af 100644 >>>>> --- a/drivers/usb/dwc3/core.c >>>>> +++ b/drivers/usb/dwc3/core.c >>>>> @@ -1284,6 +1284,8 @@ static void dwc3_get_properties(struct dwc3 *dwc) >>>>> &tx_thr_num_pkt_prd); >>>>> device_property_read_u8(dev, "snps,tx-max-burst-prd", >>>>> &tx_max_burst_prd); >>>>> + dwc->needs_fifo_resize = device_property_read_bool(dev, >>>>> + "tx-fifo-resize"); >>>>> >>>>> dwc->disable_scramble_quirk = device_property_read_bool(dev, >>>>> "snps,disable_scramble_quirk"); >>>>> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h >>>>> index eec1cf4..983b2fd4 100644 >>>>> --- a/drivers/usb/dwc3/core.h >>>>> +++ b/drivers/usb/dwc3/core.h >>>>> @@ -1223,6 +1223,7 @@ struct dwc3 { >>>>> unsigned is_utmi_l1_suspend:1; >>>>> unsigned is_fpga:1; >>>>> unsigned pending_events:1; >>>>> + unsigned needs_fifo_resize:1; >>>> The prefix "need" sounds like a requirement, but I don't think it is the >>>> case here. I think "do" would be a better prefix here. >>>> >>> Hi Thinh, >>> >>> Sure, that is true, since this may be an optional flag for certain >>> platforms. >>> >>>>> unsigned pullups_connected:1; >>>>> unsigned setup_packet_pending:1; >>>>> unsigned three_stage_setup:1; >>>>> @@ -1257,6 +1258,10 @@ struct dwc3 { >>>>> unsigned dis_split_quirk:1; >>>>> >>>>> u16 imod_interval; >>>>> + >>>>> + int max_cfg_eps; >>>>> + int last_fifo_depth; >>>>> + int num_ep_resized; >>>>> }; >>>> Please document these new fields. >>>> >>> Will do. 
>>> >>>>> >>>>> #define INCRX_BURST_MODE 0 >>>>> @@ -1471,6 +1476,7 @@ int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd, >>>>> struct dwc3_gadget_ep_cmd_params *params); >>>>> int dwc3_send_gadget_generic_command(struct dwc3 *dwc, unsigned int cmd, >>>>> u32 param); >>>>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc); >>>>> #else >>>>> static inline int dwc3_gadget_init(struct dwc3 *dwc) >>>>> { return 0; } >>>>> @@ -1490,6 +1496,8 @@ static inline int dwc3_send_gadget_ep_cmd(struct dwc3_ep *dep, unsigned int cmd, >>>>> static inline int dwc3_send_gadget_generic_command(struct dwc3 *dwc, >>>>> int cmd, u32 param) >>>>> { return 0; } >>>>> +static inline void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >>>>> +{ } >>>>> #endif >>>>> >>>>> #if IS_ENABLED(CONFIG_USB_DWC3_DUAL_ROLE) >>>>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c >>>>> index 8b668ef..4f216bd 100644 >>>>> --- a/drivers/usb/dwc3/ep0.c >>>>> +++ b/drivers/usb/dwc3/ep0.c >>>>> @@ -616,6 +616,8 @@ static int dwc3_ep0_set_config(struct dwc3 *dwc, struct usb_ctrlrequest *ctrl) >>>>> return -EINVAL; >>>>> >>>>> case USB_STATE_ADDRESS: >>>>> + dwc3_gadget_clear_tx_fifos(dwc); >>>>> + >>>>> ret = dwc3_ep0_delegate_req(dwc, ctrl); >>>>> /* if the cfg matches and the cfg is non zero */ >>>>> if (cfg && (!ret || (ret == USB_GADGET_DELAYED_STATUS))) { >>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>>> index 86f257f..26f9d64 100644 >>>>> --- a/drivers/usb/dwc3/gadget.c >>>>> +++ b/drivers/usb/dwc3/gadget.c >>>>> @@ -615,6 +615,161 @@ static int dwc3_gadget_set_ep_config(struct dwc3_ep *dep, unsigned int action) >>>>> static void dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, >>>>> bool interrupt); >>>>> >>>>> +static int dwc3_gadget_calc_tx_fifo_size(struct dwc3 *dwc, int mult) >>>> Can you document what this function does? >>>> >>> Will do. 
>>> >>>>> +{ >>>>> + int max_packet = 1024; >>>> Maybe you can also document why you chose 1024 (e.g. applicable to >>>> Enhanced SuperSpeed only?). >>>> >>> Sure. Its basically applicable for SS and isoc (hs/ss) use cases since >>> max packet size is 1024 in both cases. >> Highspeed bulk MPS is 512. Fullspeed varies more (bulk max MPS is 64 and >> isoc is 1023). >> >> However, we can keep it simple with 1024 as if it's ok to over estimate. >> Just need to note that here. >> >>>>> + int fifo_size; >>>>> + int mdwidth; >>>>> + >>>>> + mdwidth = DWC3_MDWIDTH(dwc->hwparams.hwparams0); >>>>> + /* MDWIDTH is represented in bits, we need it in bytes */ >>>>> + mdwidth >>= 3; >>>> mdwidth for DWC32 requires to read hwparams6 for the upper 2 significant >>>> bits. Can we add a check for DWC32 also? You can check how we're doing >>>> it now in the current code. >>>> >>> Sure. I'll make sure to get the correct registers for the DWC32 case. >>> >>>>> + >>>>> + fifo_size = mult * ((max_packet + mdwidth) / mdwidth) + 1; >>>>> + return fifo_size; >>>>> +} >>>>> + >>>>> +void dwc3_gadget_clear_tx_fifos(struct dwc3 *dwc) >>>>> +{ >>>>> + struct dwc3_ep *dep; >>>>> + int fifo_depth; >>>>> + int size; >>>>> + int num; >>>>> + >>>>> + if (!dwc->needs_fifo_resize) >>>>> + return; >>>>> + >>>>> + /* Read ep0IN related TXFIFO size */ >>>>> + dep = dwc->eps[1]; >>>>> + size = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0)); >>>>> + if (DWC3_IP_IS(DWC31)) >>>>> + fifo_depth = DWC31_GTXFIFOSIZ_TXFDEP(size); >>>>> + else >>>>> + fifo_depth = DWC3_GTXFIFOSIZ_TXFDEP(size); >>>> The driver handles 3 IPs. Getting the fifo depth for DWC32 is the same >>>> as DWC31. So the condition should be >>>> if (DWC3_IP_IS(DWC3)) >>>> fifo_depth = ... >>>> else >>>> fifo_depth = ... >>>> >>> Understood. 
>>> >>>>> + >>>>> + dwc->last_fifo_depth = fifo_depth; >>>>> + /* Clear existing TXFIFO for all IN eps except ep0 */ >>>>> + for (num = 3; num < min_t(int, dwc->num_eps, DWC3_ENDPOINTS_NUM); >>>>> + num += 2) { >>>>> + dep = dwc->eps[num]; >>>>> + /* Don't change TXFRAMNUM on usb31 version */ >>>>> + size = DWC3_IP_IS(DWC31) ? >>>>> + dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1)) & >>>>> + DWC31_GTXFIFOSIZ_TXFRAMNUM : 0; >>>>> + >>>> Same here. Check for DWC32. >>>> >>>>> + dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(num >> 1), size); >>>>> + } >>>>> + dwc->num_ep_resized = 0; >>>>> +} >>>>> + >>>>> +/* >>>>> + * dwc3_gadget_resize_tx_fifos - reallocate fifo spaces for current use-case >>>>> + * @dwc: pointer to our context structure >>>>> + * >>>>> + * This function will a best effort FIFO allocation in order >>>>> + * to improve FIFO usage and throughput, while still allowing >>>>> + * us to enable as many endpoints as possible. >>>>> + * >>>>> + * Keep in mind that this operation will be highly dependent >>>>> + * on the configured size for RAM1 - which contains TxFifo -, >>>>> + * the amount of endpoints enabled on coreConsultant tool, and >>>>> + * the width of the Master Bus. >>>>> + * >>>>> + * In general, FIFO depths are represented with the following equation: >>>>> + * >>>>> + * fifo_size = mult * ((max_packet + mdwidth)/mdwidth + 1) + 1 >>>>> + * >>>>> + * Conversions can be done to the equation to derive the number of packets that >>>>> + * will fit to a particular FIFO size value. >>>>> + */ >>>>> +static int dwc3_gadget_resize_tx_fifos(struct dwc3_ep *dep) >>>>> +{ >>>>> + struct dwc3 *dwc = dep->dwc; >>>>> + int fifo_0_start; >>>>> + int ram1_depth; >>>>> + int fifo_size; >>>>> + int min_depth; >>>>> + int num_in_ep; >>>>> + int remaining; >>>>> + int mult = 1; >>>>> + int fifo; >>>>> + int tmp; >>>>> + >>>>> + if (!dwc->needs_fifo_resize) >>>>> + return 0; >>>> Maybe add a condition to check for Enhanced SuperSpeed only? 
>>>> >>> Since this logic applies for isoc endpoints as well in high speed mode, >>> for high bandwidth use cases, we can't limit it to SS only. >> Ok. I was asking because you use 1024 as MPS in your calculation. >> >>>>> + >>>>> + /* resize IN endpoints except ep0 */ >>>>> + if (!usb_endpoint_dir_in(dep->endpoint.desc) || dep->number <= 1) >>>>> + return 0; >>>>> + >>>>> + ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7); >>>>> + >>>>> + if ((dep->endpoint.maxburst > 1 && >>>>> + usb_endpoint_xfer_bulk(dep->endpoint.desc)) || >>>>> + usb_endpoint_xfer_isoc(dep->endpoint.desc)) >>>>> + mult = 3; >>>>> + >>>>> + if (dep->endpoint.maxburst > 6 && >>>>> + usb_endpoint_xfer_bulk(dep->endpoint.desc) && DWC3_IP_IS(DWC31)) >>>>> + mult = 6; >>>> You checked maxburst > 1 for isoc, but not when maxburst > 6. Why? >>>> Also, "mult" is the term we usually use for isoc endpoints. Applying it >>>> to bulk is confusing here. >>>> >>> Ok, let me rename it to something else that makes more sense. The isoc >>> endpoint check was targeted for mainly high-speed high bandwidth isoc >>> use cases, and I don't believe our results improved with a larger fifo >>> allocation, ie 6. (refer to below) >> It should improve for isoc and reduce missed isoc error also. Most >> applications only use 1-16KB max per interval, so you don't see the >> impact as much. >> >>>> How did we decide on 3 and 6? Are they arbitrary? >>>> >>> So actually in the databook, they have some recommendations for TXFIFO >>> sizes to use in "Table 4-3 Device Config Parameters." It mentions that >>> for burst capable endpoints to have a fifo size to fit at least 3 >>> packets of maxpacket size. >>> >>> There's also "Chapter 3 Cache, FIFO RAMs, and Bandwidth Requirements," >>> which goes over a lot of optimizations that could be done based off your >>> system's overall latency. 
The sizes were chosen after we ran our peak >>> throughput and performance testing on our devices, and these values >>> netted the best throughput while also allowing enough fifo space for our >>> other USB endpoints. Also, there's going to be a limit on how much >>> improvement you see with respect to increasing the fifo size, since >>> your system will eventually be able to pull data out of the internal >>> fifo faster than it is being filled. >> I added a patch a while back to check the max packet limit based on the >> recommended minimum Rx/TxFIFO size >> d94ea5319813 ("usb: dwc3: gadget: Properly set maxpacket limit") >> >> The driver wouldn't match the endpoint if it doesn't meet the minimum >> requirement of 3 MPS if the device is operating in SuperSpeed or >> SuperSpeed Plus. Was the check for "3" necessary? >> > Hi Thinh, > > Please correct me if I'm wrong, but when a function driver requests > an endpoint using usb_ep_autoconfig(), most of the time it passes in an > FS descriptor that doesn't have the wMaxPacketSize parameter set (0), > which means it will probably fetch the first unused endpoint. (at least > this is what I noticed for the BULK ep cases) > > In this case, it would be hard to determine if the endpoint selected > would already have the constraint you added factored in. The gadget driver should pass in the descriptor that matches the selected gadget_driver->max_speed or maximum HW speed capability to usb_ep_autoconfig*(). Otherwise, usb_ep_autoconfig*() doesn't match the right endpoint for the selected speed. This is something that needs to be fixed. I'm not sure how it will impact some existing setups though. As you said, most of the gadget drivers previously ignored matching wMaxPacketSize (and other parameters). IMO, this should be fixed regardless. Thanks, Thinh >> "6" is your tested value right? It may be different for different setups. >> Can we pass this as a setting parameter from the devicetree property?
>>
> Definitely can agree to that. We can make things a bit more flexible
> depending on the system. Maybe create a new property, and default it to
> 6 if "do_fifo_resize" is set, but this parameter is not defined.
>
> Thanks
> Wesley Cheng
>>>>> +
>>>>> +	/* FIFO size for a single buffer */
>>>>> +	fifo = dwc3_gadget_calc_tx_fifo_size(dwc, 1);
>>>>> +
>>>>> +	/* Calculate the number of remaining EPs w/o any FIFO */
>>>>> +	num_in_ep = dwc->max_cfg_eps;
>>>>> +	num_in_ep -= dwc->num_ep_resized;
>>>>> +
>>>>> +	/* Reserve at least one FIFO for the number of IN EPs */
>>>>> +	min_depth = num_in_ep * (fifo + 1);
>>>>> +	remaining = ram1_depth - min_depth - dwc->last_fifo_depth;
>>>> Can "remaining" be a negative value? If so, I think it's clearer if you do
>>>> remaining = max_t(int, 0, remaining);
>>>>
>>> Sure.
>>>
>>>>> +
>>>>> +	/*
>>>>> +	 * We've already reserved 1 FIFO per EP, so check what we can fit in
>>>>> +	 * addition to it. If there is not enough remaining space, allocate
>>>>> +	 * all the remaining space to the EP.
>>>>> +	 */
>>>>> +	fifo_size = (mult - 1) * fifo;
>>>>> +	if (remaining < fifo_size) {
>>>>> +		if (remaining > 0)
>>>>> +			fifo_size = remaining;
>>>>> +		else
>>>>> +			fifo_size = 0;
>>>> Then use this condition instead:
>>>>
>>>> if (remaining < fifo_size)
>>>> 	fifo_size = remaining;
>>>>
>>>>> +	}
>>>>> +
>>>>> +	fifo_size += fifo;
>>>>> +	/* Last increment according to the TX FIFO size equation */
>>>>> +	fifo_size++;
>>>>> +
>>>>> +	/* Check if TXFIFOs start at non-zero addr */
>>>>> +	tmp = dwc3_readl(dwc->regs, DWC3_GTXFIFOSIZ(0));
>>>>> +	fifo_0_start = DWC3_GTXFIFOSIZ_TXFSTADDR(tmp);
>>>>> +
>>>>> +	fifo_size |= (fifo_0_start + (dwc->last_fifo_depth << 16));
>>>>> +	if (DWC3_IP_IS(DWC31))
>>>>> +		dwc->last_fifo_depth += DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
>>>>> +	else
>>>>> +		dwc->last_fifo_depth += DWC3_GTXFIFOSIZ_TXFDEP(fifo_size);
>>>> Take account of DWC32.
>>>>
>>> Got it.
>>>>> +
>>>>> +	/* Check fifo size allocation doesn't exceed available RAM size. */
>>>>> +	if (dwc->last_fifo_depth >= ram1_depth) {
>>>>> +		dev_err(dwc->dev, "Fifosize(%d) > RAM size(%d) %s depth:%d\n",
>>>>> +			dwc->last_fifo_depth, ram1_depth,
>>>>> +			dep->endpoint.name, fifo_size);
>>>>> +		if (DWC3_IP_IS(DWC31))
>>>>> +			fifo_size = DWC31_GTXFIFOSIZ_TXFDEP(fifo_size);
>>>>> +		else
>>>>> +			fifo_size = DWC3_GTXFIFOSIZ_TXFDEP(fifo_size);
>>>> Same here.
>>>>
>>>>> +		dwc->last_fifo_depth -= fifo_size;
>>>>> +		return -ENOMEM;
>>>>> +	}
>>>>> +
>>>>> +	dwc3_writel(dwc->regs, DWC3_GTXFIFOSIZ(dep->number >> 1), fifo_size);
>>>>> +	dwc->num_ep_resized++;
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  /**
>>>>>   * __dwc3_gadget_ep_enable - initializes a hw endpoint
>>>>>   * @dep: endpoint to be initialized
>>>>> @@ -632,6 +787,10 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
>>>>>  	int ret;
>>>>>
>>>>>  	if (!(dep->flags & DWC3_EP_ENABLED)) {
>>>>> +		ret = dwc3_gadget_resize_tx_fifos(dep);
>>>>> +		if (ret)
>>>>> +			return ret;
>>>>> +
>>>>>  		ret = dwc3_gadget_start_config(dep);
>>>>>  		if (ret)
>>>>>  			return ret;
>>>>> @@ -2418,6 +2577,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
>>>>>
>>>>>  	spin_lock_irqsave(&dwc->lock, flags);
>>>>>  	dwc->gadget_driver = NULL;
>>>>> +	dwc->max_cfg_eps = 0;
>>>>>  	spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>
>>>>>  	free_irq(dwc->irq_gadget, dwc->ev_buf);
>>>>> @@ -2485,6 +2645,39 @@ static int dwc3_gadget_vbus_draw(struct usb_gadget *g, unsigned int mA)
>>>>>  	return 0;
>>>>>  }
>>>>>
>>>>> +static int dwc3_gadget_check_config(struct usb_gadget *g, unsigned long ep_map)
>>>> What's in ep_map? Can you document more to help with the review?
>>>>
>>> Yeah, I will add some more comments. Just to explain here briefly, this
>>> check config callback is to address the concern pointed out by Felipe
>>> where we could run out of txfifo memory while our USB composition is
>>> being enabled. This would lead to an enumerated device, with
>>> non-functioning endpoints.
>>>
>>> The check config will be called during the function driver bind stage
>>> (before we enumerate), and ep_map carries the number of endpoints (both
>>> in and out) being used in a particular configuration. With this
>>> information, the logic will ensure that there is enough txfifo space for
>>> at least 1 fifo per endpoint. If not, we can catch the failure at the
>>> composition bind stages. (although it would be odd to see a DWC3
>>> controller with not enough txfifo ram for 1 fifo per ep) The point at
>>> which we actually resize the fifo allocations, will always check to make
>>> sure that there is enough fifo space for the remaining endpoints after
>>> every resize.
>>>
>>> Thanks
>>> Wesley Cheng
>> Thanks for the info. I'll review more after you add more detail.
>>
>> BR,
>> Thinh
>>
>>>> Thanks,
>>>> Thinh
>>>>
>>>>> +{
>>>>> +	struct dwc3 *dwc = gadget_to_dwc(g);
>>>>> +	unsigned long in_ep_map;
>>>>> +	int fifo_size = 0;
>>>>> +	int ram1_depth;
>>>>> +	int ep_num;
>>>>> +
>>>>> +	if (!dwc->needs_fifo_resize)
>>>>> +		return 0;
>>>>> +
>>>>> +	/* Only interested in the IN endpoints */
>>>>> +	in_ep_map = ep_map >> 16;
>>>>> +	ep_num = hweight_long(in_ep_map);
>>>>> +
>>>>> +	if (ep_num <= dwc->max_cfg_eps)
>>>>> +		return 0;
>>>>> +
>>>>> +	/* Update the max number of eps in the composition */
>>>>> +	dwc->max_cfg_eps = ep_num;
>>>>> +
>>>>> +	fifo_size = dwc3_gadget_calc_tx_fifo_size(dwc, dwc->max_cfg_eps);
>>>>> +	/* Based on the equation, increment by one for every ep */
>>>>> +	fifo_size += dwc->max_cfg_eps;
>>>>> +
>>>>> +	/* Check if we can fit a single fifo per endpoint */
>>>>> +	ram1_depth = DWC3_RAM1_DEPTH(dwc->hwparams.hwparams7);
>>>>> +	if (fifo_size > ram1_depth)
>>>>> +		return -ENOMEM;
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  static const struct usb_gadget_ops dwc3_gadget_ops = {
>>>>>  	.get_frame = dwc3_gadget_get_frame,
>>>>>  	.wakeup = dwc3_gadget_wakeup,
>>>>> @@ -2495,6 +2688,7 @@ static const struct usb_gadget_ops dwc3_gadget_ops = {
>>>>>  	.udc_set_speed = dwc3_gadget_set_speed,
>>>>>  	.get_config_params = dwc3_gadget_config_params,
>>>>>  	.vbus_draw = dwc3_gadget_vbus_draw,
>>>>> +	.check_config = dwc3_gadget_check_config,
>>>>>  };
>>>>>
>>>>>  /* -------------------------------------------------------------------------- */
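
Editor's illustration (not part of the thread): the two points settled in the review above, counting IN endpoints from ep_map and clamping "remaining" before sizing the extra FIFO depth, can be sketched in plain userspace C. This is only a sketch under stated assumptions: count_in_eps() and extra_fifo_depth() are hypothetical helper names, __builtin_popcountl() (a GCC/Clang builtin) stands in for the kernel's hweight_long(), and the IN endpoints are assumed to sit above bit 15 of ep_map as in the quoted patch. It is not the driver's code.

```c
#include <assert.h>

/*
 * Hypothetical helper: count the IN endpoints in a configuration.
 * Assumes, as in the quoted patch, that IN endpoints occupy the bits
 * above bit 15 of ep_map.
 */
int count_in_eps(unsigned long ep_map)
{
	unsigned long in_ep_map = ep_map >> 16;

	/* __builtin_popcountl() stands in for the kernel's hweight_long() */
	return __builtin_popcountl(in_ep_map);
}

/*
 * Hypothetical helper: depth to add on top of the single FIFO already
 * reserved for the endpoint, following the simplification suggested in
 * the review (clamp "remaining" at zero, then cap fifo_size by it).
 */
int extra_fifo_depth(int ram1_depth, int min_depth, int last_fifo_depth,
		     int mult, int fifo)
{
	int remaining = ram1_depth - min_depth - last_fifo_depth;
	int fifo_size;

	assert(mult >= 1 && fifo >= 0);

	/* Same effect as remaining = max_t(int, 0, remaining) */
	if (remaining < 0)
		remaining = 0;

	fifo_size = (mult - 1) * fifo;
	if (remaining < fifo_size)
		fifo_size = remaining;

	return fifo_size;
}
```

With made-up inputs, count_in_eps(0x00050003UL) gives 2, and extra_fifo_depth(1000, 800, 300, 3, 130) gives 0, because the space left is negative before clamping, so only the one reserved FIFO would be used.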
* [PATCH v6 4/4] arm64: boot: dts: qcom: sm8150: Enable dynamic TX FIFO resize logic
  2021-01-22  4:01 [PATCH v6 0/4] Re-introduce TX FIFO resize for larger EP bursting Wesley Cheng
                   ` (2 preceding siblings ...)
  2021-01-22  4:01 ` [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements Wesley Cheng
@ 2021-01-22  4:01 ` Wesley Cheng
  3 siblings, 0 replies; 20+ messages in thread
From: Wesley Cheng @ 2021-01-22  4:01 UTC
  To: balbi, gregkh, robh+dt, agross, bjorn.andersson
  Cc: linux-arm-msm, devicetree, linux-usb, linux-kernel, peter.chen,
	jackp, Wesley Cheng

Enable the flexible TX FIFO resize logic on SM8150. Using a larger TX FIFO
SZ can help account for situations when system latency is greater than the
USB bus transmission latency.

Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
---
 arch/arm64/boot/dts/qcom/sm8150.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index 5270bda..c7706f4 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -1569,6 +1569,7 @@
 				iommus = <&apps_smmu 0x140 0>;
 				snps,dis_u2_susphy_quirk;
 				snps,dis_enblslpm_quirk;
+				tx-fifo-resize;
 				phys = <&usb_1_hsphy>, <&usb_1_ssphy>;
 				phy-names = "usb2-phy", "usb3-phy";
 			};
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

end of thread, other threads:[~2021-01-28 23:09 UTC | newest]

Thread overview: 20+ messages
2021-01-22  4:01 [PATCH v6 0/4] Re-introduce TX FIFO resize for larger EP bursting Wesley Cheng
2021-01-22  4:01 ` [PATCH v6 1/4] usb: gadget: udc: core: Introduce check_config to verify USB configuration Wesley Cheng
2021-01-22  5:17   ` Jack Pham
2021-01-26  1:01     ` Wesley Cheng
2021-01-22 16:24   ` Alan Stern
2021-01-26  1:02     ` Wesley Cheng
2021-01-22  4:01 ` [PATCH v6 2/4] usb: gadget: configfs: Check USB configuration before adding Wesley Cheng
2021-01-22  4:01 ` [PATCH v6 3/4] usb: dwc3: Resize TX FIFOs to meet EP bursting requirements Wesley Cheng
2021-01-22 17:12   ` Bjorn Andersson
2021-01-26  1:14     ` Wesley Cheng
2021-01-26  1:55       ` Bjorn Andersson
2021-01-26  4:32         ` Wesley Cheng
2021-01-26  5:15           ` Bjorn Andersson
2021-01-28 23:08             ` Wesley Cheng
2021-01-23  0:15   ` Thinh Nguyen
2021-01-26  9:51     ` Wesley Cheng
2021-01-26 20:43       ` Thinh Nguyen
2021-01-26 23:26         ` Wesley Cheng
2021-01-27  1:47           ` Thinh Nguyen
2021-01-22  4:01 ` [PATCH v6 4/4] arm64: boot: dts: qcom: sm8150: Enable dynamic TX FIFO resize logic Wesley Cheng