From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758687AbdACMfU (ORCPT ); Tue, 3 Jan 2017 07:35:20 -0500 Received: from mga11.intel.com ([192.55.52.93]:34080 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752449AbdACMfM (ORCPT ); Tue, 3 Jan 2017 07:35:12 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,451,1477983600"; d="asc'?scan'208";a="804607785" From: Felipe Balbi To: Baolin Wang , Janusz Dziedzic Cc: Greg KH , USB , LKML , Linaro Kernel Mailman List , Mark Brown Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler In-Reply-To: References: <0d79eb1f34e409749a136173b68f365b545c4789.1482738764.git.baolin.wang@linaro.org> <87wpel1vac.fsf@linux.intel.com> Date: Tue, 03 Jan 2017 14:33:23 +0200 Message-ID: <8737h0z5kc.fsf@linux.intel.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable Hi, Baolin Wang writes: > On 28 December 2016 at 20:30, Janusz Dziedzic = wrote: >> 2016-12-27 13:16 GMT+01:00 Baolin Wang : >>> Hi, >>> >>> On 27 December 2016 at 19:11, Felipe Balbi wrote: >>>> >>>> Hi, >>>> >>>> Baolin Wang writes: >>>>> Hi, >>>>> >>>>> On 27 December 2016 at 18:52, Janusz Dziedzic wrote: >>>>>> 2016-12-26 9:01 GMT+01:00 Baolin Wang : >>>>>>> On some platfroms(like x86 platform), when one core is running the = USB gadget >>>>>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another co= re also can >>>>>>> respond other interrupts from dwc3 controller and modify the event = buffer by >>>>>>> dwc3_interrupt() function, that will cause getting the wrong event = count in >>>>>>> irq thread handler to make the USB function abnormal. >>>>>>> >>>>>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid= this race. >>>>>>> >>>>>> Interesting, I always think we mask interrupt in dwc3_interrupt() by= setting >>>>>> DWC3_GEVNTSIZ_INTMASK >>>>>> And unmask interrupt when we end dwc3_thread_interrupt(). >>>>>> >>>>>> So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(), >>>>>> or I miss something? >>>>>> Do you have some traces that indicate this masking will not work cor= rectly? >>>>> >>>>> Yes, but we just masked the interrupts described in DEVTEN register, >>>>> and we did not mask all the interrupts, like the endpoint command >>>>> complete event, transfer complete event and so on, so we can still get >>>>> interrupts. >>>> >>>> not true, we masked interrupts for the entire event buffer: >>> >>> Yes, you are right and I missed that. I should reproduce this problem >>> and analyse the real reason. >>> >>>> >>>>> static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) >>>>> { >>>>> struct dwc3 *dwc =3D evt->dwc; >>>>> u32 count; >>>>> u32 reg; >>>>> >>>>> if (pm_runtime_suspended(dwc->dev)) { >>>>> pm_runtime_get(dwc->dev); >>>>> disable_irq_nosync(dwc->irq_gadget); >>>>> dwc->pending_events =3D true; >>>>> return IRQ_HANDLED; >>>>> } >>>>> >>>>> count =3D dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); >>>>> count &=3D DWC3_GEVNTCOUNT_MASK; >>>>> if (!count) >>>>> return IRQ_NONE; >>>>> >>>>> evt->count =3D count; >>>>> evt->flags |=3D DWC3_EVENT_PENDING; >>>>> >>>>> /* Mask interrupt */ >>>>> reg =3D dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0)); >>>>> reg |=3D DWC3_GEVNTSIZ_INTMASK; >>>> >>>> See here ?!? >>>> >>>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg); >>>>> >>>>> return IRQ_WAKE_THREAD; >>>>> } >>>> >>>>>> BTW, what value you get when problem occured, 0xFFFC? >>>>> >>>>> Yes, something like this, the event count become huge. >>>> >> Probably you have little bit different code than current community >> version (depends how your PM works). >> >> This is possible when we write: >> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); >> And after that >> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 4); >> >> After that we will get 0xFFFC (-4). >> >> Possible races: >> 1) dwc3_event_buffers_setup/dwc3_event_buffers_cleanup - write 0 >> 2) dwc3_thread - write 4 >> >> While [1] could be called in PM work or UM context (init in Android >> case) spin_lock_irqsave() will only disable local irqs and still we >> could get IRQ on different core, next update evt->count and run >> thread... > > Yeah, that's the possible races. and you have triggered this with mailine? How? We don't write to GEVNT* registers from PM code and we only allow runtime_suspend with cable dettached. =2D-=20 balbi --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEElLzh7wn96CXwjh2IzL64meEamQYFAlhrmhMACgkQzL64meEa mQbC8w//bRfh7uisKxwusndB2q0JBZAfPih6zmSfoBACN+1Ve3LYPgqezdfdfkz+ mco1PnXrNBLe9BW3wzLkGgTELQOew1rPJMH3izc0dM5XGhzjW/k0DZQ9nHty8cTu vtNBotEs7DYF0L/WNvsAWPfxvTJK/YxEpXbl+aoV4QtuaizAkoNqO4W5wRtnFGp4 9SedRNdv+IOmWObXIkzS1snO5dcWVremZqu/hTqKHiPsZCOwwagOvCVA/MdA8+Of RHhXh7lfYerVb6OyVurvxnG3JDQXQfVyppoYrIUxIlZS0gIFbVKw4TnFeadlPNVX uLx7hoJ4Ey5gyI4n1+LVbYF1C5yNQnI4ssBdhlNE4dPSin6dVG6PSMlYAleulV8c zsvCmAY+J9jvq813+9VNl3mM7QSQ/nVVKy3ygrbM7bIteta0JwB0SYZcYY7hUD1o Rq7SelG05jWblFCVxNaLkP6F2YHhH25k0y5b3Sjj1l3JKCrvd7Ti9F1DsQI4YkNC b0rmNJ7iPmCRKjcDWVUNec99InOq/6il+P+DGzqeav53PvQ+qsm4zbmfGEd5r6cN /AtDiEU8qUzNdr2MpPBFltdpXOaRzb5qbv95yts/FrMVqcP09JhvDjgIlnsJhoqI 3c8wFXuZ0wpflihn2VTz0uko/sdK9+aJiCgq0f5BOKydUrPFRt0= =jZqK -----END PGP SIGNATURE----- --=-=-=--