All of lore.kernel.org
 help / color / mirror / Atom feed
* Locking issues w/ functionfs gadget and aio?
@ 2015-06-09  1:14 John Stultz
  2015-06-13  0:51 ` John Stultz
  0 siblings, 1 reply; 5+ messages in thread
From: John Stultz @ 2015-06-09  1:14 UTC (permalink / raw)
  To: Felipe Balbi, Al Viro, Andrzej Pietrasiewicz, Krzysztof Opasiak
  Cc: Greg Kroah-Hartman, Michal Nazarewicz, Robert Baldyga, lkml, linux-usb

After setting up functionfs for adb w/ 4.1-rc7, I noticed some flakey behavior.
I enabled some lock debugging and got the following:

[   91.648093] read strings
[   91.650264] g_ffs gadget: g_ffs ready
[   91.652551] ci_hdrc ci_hdrc.0: CI_HDRC_CONTROLLER_RESET_EVENT received
[   96.068693] BUG: spinlock lockup suspected on CPU#0, adbd/2791
[   96.068751]  lock: 0xe7764880, .magic: e7764880, .owner: <none>/-1,
.owner_cpu: -407539900
[   96.073448] CPU: 0 PID: 2791 Comm: adbd Not tainted
4.1.0-rc1-00032-g359b12f #147
[   96.081688] Hardware name: Qualcomm (Flattened Device Tree)
[   96.089266] [<c0216ac8>] (unwind_backtrace) from [<c02136a8>]
(show_stack+0x10/0x14)
[   96.094635] [<c02136a8>] (show_stack) from [<c075d9fc>]
(dump_stack+0x70/0xbc)
[   96.102627] [<c075d9fc>] (dump_stack) from [<c026ef90>]
(do_raw_spin_lock+0x114/0x1a0)
[   96.109661] [<c026ef90>] (do_raw_spin_lock) from [<c0764cb8>]
(_raw_spin_lock_irqsave+0x50/0x5c)
[   96.117560] [<c0764cb8>] (_raw_spin_lock_irqsave) from [<c037c1a0>]
(kiocb_set_cancel_fn+0x1c/0x60)
[   96.126519] [<c037c1a0>] (kiocb_set_cancel_fn) from [<c05ae568>]
(ffs_epfile_read_iter+0x8c/0x140)
[   96.135289] [<c05ae568>] (ffs_epfile_read_iter) from [<c0332018>]
(__vfs_read+0xb0/0xd4)
[   96.144290] [<c0332018>] (__vfs_read) from [<c0332ef8>] (vfs_read+0x7c/0x100)
[   96.152535] [<c0332ef8>] (vfs_read) from [<c0332fbc>] (SyS_read+0x40/0x8c)
[   96.159571] [<c0332fbc>] (SyS_read) from [<c020ff20>]
(ret_fast_syscall+0x0/0x4c)
[  117.678633] INFO: rcu_preempt detected stalls on CPUs/tasks:
[  117.683069]  0: (1 GPs behind) idle=805/140000000000000/0
softirq=7187/7189 fqs=2601
[  117.683316]  (detected by 3, t=2603 jiffies, g=3028, c=3027, q=474)
[  117.692345] Task dump for CPU 0:
[  117.697202] adbd            R running      0  2791      1 0x00000002
[  117.704296] [<c075f234>] (__schedule) from [<ffffffff>] (0xffffffff)


It seems we're stuck on the kioctx.ctx_lock, and reviewing that lock
usage I don't see any problems in fs/aio.c right off.

So I'm guessing the f_fs.c code is somehow not initializing the lock
structure, or maybe calling kiocb_set_cancel_fn() from the wrong
context?

Anyway, I was curious if anyone else has seen similar issues or had
suggestions for further debugging.  I've seen this issue as well with
4.1-rc1.

thanks
-john

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Locking issues w/ functionfs gadget and aio?
  2015-06-09  1:14 Locking issues w/ functionfs gadget and aio? John Stultz
@ 2015-06-13  0:51 ` John Stultz
  2015-06-21  5:24   ` Al Viro
  0 siblings, 1 reply; 5+ messages in thread
From: John Stultz @ 2015-06-13  0:51 UTC (permalink / raw)
  To: Felipe Balbi, Al Viro, Andrzej Pietrasiewicz, Krzysztof Opasiak
  Cc: Greg Kroah-Hartman, Michal Nazarewicz, Robert Baldyga, lkml, linux-usb

On Mon, Jun 8, 2015 at 6:14 PM, John Stultz <john.stultz@linaro.org> wrote:
> After setting up functionfs for adb w/ 4.1-rc7, I noticed some flakey behavior.
> I enabled some lock debugging and got the following:
>
> [   91.648093] read strings
> [   91.650264] g_ffs gadget: g_ffs ready
> [   91.652551] ci_hdrc ci_hdrc.0: CI_HDRC_CONTROLLER_RESET_EVENT received
> [   96.068693] BUG: spinlock lockup suspected on CPU#0, adbd/2791
> [   96.068751]  lock: 0xe7764880, .magic: e7764880, .owner: <none>/-1,
> .owner_cpu: -407539900
> [   96.073448] CPU: 0 PID: 2791 Comm: adbd Not tainted
> 4.1.0-rc1-00032-g359b12f #147
> [   96.081688] Hardware name: Qualcomm (Flattened Device Tree)
> [   96.089266] [<c0216ac8>] (unwind_backtrace) from [<c02136a8>]
> (show_stack+0x10/0x14)
> [   96.094635] [<c02136a8>] (show_stack) from [<c075d9fc>]
> (dump_stack+0x70/0xbc)
> [   96.102627] [<c075d9fc>] (dump_stack) from [<c026ef90>]
> (do_raw_spin_lock+0x114/0x1a0)
> [   96.109661] [<c026ef90>] (do_raw_spin_lock) from [<c0764cb8>]
> (_raw_spin_lock_irqsave+0x50/0x5c)
> [   96.117560] [<c0764cb8>] (_raw_spin_lock_irqsave) from [<c037c1a0>]
> (kiocb_set_cancel_fn+0x1c/0x60)
> [   96.126519] [<c037c1a0>] (kiocb_set_cancel_fn) from [<c05ae568>]
> (ffs_epfile_read_iter+0x8c/0x140)
> [   96.135289] [<c05ae568>] (ffs_epfile_read_iter) from [<c0332018>]
> (__vfs_read+0xb0/0xd4)
> [   96.144290] [<c0332018>] (__vfs_read) from [<c0332ef8>] (vfs_read+0x7c/0x100)
> [   96.152535] [<c0332ef8>] (vfs_read) from [<c0332fbc>] (SyS_read+0x40/0x8c)
> [   96.159571] [<c0332fbc>] (SyS_read) from [<c020ff20>]
> (ret_fast_syscall+0x0/0x4c)
> [  117.678633] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [  117.683069]  0: (1 GPs behind) idle=805/140000000000000/0
> softirq=7187/7189 fqs=2601
> [  117.683316]  (detected by 3, t=2603 jiffies, g=3028, c=3027, q=474)
> [  117.692345] Task dump for CPU 0:
> [  117.697202] adbd            R running      0  2791      1 0x00000002
> [  117.704296] [<c075f234>] (__schedule) from [<ffffffff>] (0xffffffff)
>
>
> It seems we're stuck on the kioctx.ctx_lock, and reviewing that lock
> usage I don't see any problems in fs/aio.c right off.
>
> So I'm guessing the f_fs.c code is somehow not initializing the lock
> structure, or maybe calling kiocb_set_cancel_fn() from the wrong
> context?

So looking further at this, it seems that the __vfs_read() calls
new_sync_read(), which allocates a struct kiocb kiocb on the stack and
passes it to the ffs_epfile_read_iter() funciton. That then calls
kiocb_set_cancel_fn() passing a pointer to that kiocb. However,
kiocb_set_cancel_fn assumes the kiocb is a sub-element of a struct
aio_kiocb, and it tries to grab the kioctx from that parent structure.
However it seems there is no aio_kiocb structure here, so the
spin_lock_irqsave hangs trying to lock random data on the stack.

I'm not super sure what the right fix is, but if do something like the
following (sorry, whitespace corrupted via copy/paste), I don't seem
to run into the problem.

diff --git a/drivers/usb/gadget/function/f_fs.c
b/drivers/usb/gadget/function/f_fs.c
index 3507f88..e322818 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -924,7 +924,8 @@ static ssize_t ffs_epfile_write_iter(struct kiocb
*kiocb, struct iov_iter *from)

        kiocb->private = p;

-       kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
+       if (p->aio)
+               kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);

        res = ffs_epfile_io(kiocb->ki_filp, p);
        if (res == -EIOCBQUEUED)
@@ -968,7 +969,8 @@ static ssize_t ffs_epfile_read_iter(struct kiocb
*kiocb, struct iov_iter *to)

        kiocb->private = p;

-       kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
+       if(p->aio)
+               kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);

        res = ffs_epfile_io(kiocb->ki_filp, p);
        if (res == -EIOCBQUEUED)


Is there a better solution here? I'm not sure I see if the
is_sync_kiocb(kiocb) check would ever be false from here, so I'm not
sure if all the p->aio checking is really needed or not.

This issue seems to have been introduced with 70e60d917e91fff
(gadget/function/f_fs.c: switch to ->{read,write}_iter())

thanks
-john

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Locking issues w/ functionfs gadget and aio?
  2015-06-13  0:51 ` John Stultz
@ 2015-06-21  5:24   ` Al Viro
  2015-06-22 17:20     ` John Stultz
  0 siblings, 1 reply; 5+ messages in thread
From: Al Viro @ 2015-06-21  5:24 UTC (permalink / raw)
  To: John Stultz
  Cc: Felipe Balbi, Andrzej Pietrasiewicz, Krzysztof Opasiak,
	Greg Kroah-Hartman, Michal Nazarewicz, Robert Baldyga, lkml,
	linux-usb

On Fri, Jun 12, 2015 at 05:51:12PM -0700, John Stultz wrote:

> I'm not super sure what the right fix is, but if do something like the
> following (sorry, whitespace corrupted via copy/paste), I don't seem
> to run into the problem.

Looks sane.  Which tree would you prefer it to go through, vfs or usb?
BTW, in either case you'd need Signed-off-by: on that patch...

> diff --git a/drivers/usb/gadget/function/f_fs.c
> b/drivers/usb/gadget/function/f_fs.c
> index 3507f88..e322818 100644
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -924,7 +924,8 @@ static ssize_t ffs_epfile_write_iter(struct kiocb
> *kiocb, struct iov_iter *from)
> 
>         kiocb->private = p;
> 
> -       kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
> +       if (p->aio)
> +               kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
> 
>         res = ffs_epfile_io(kiocb->ki_filp, p);
>         if (res == -EIOCBQUEUED)
> @@ -968,7 +969,8 @@ static ssize_t ffs_epfile_read_iter(struct kiocb
> *kiocb, struct iov_iter *to)
> 
>         kiocb->private = p;
> 
> -       kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
> +       if(p->aio)
> +               kiocb_set_cancel_fn(kiocb, ffs_aio_cancel);
> 
>         res = ffs_epfile_io(kiocb->ki_filp, p);
>         if (res == -EIOCBQUEUED)
> 
> 
> Is there a better solution here? I'm not sure I see if the
> is_sync_kiocb(kiocb) check would ever be false from here, so I'm not
> sure if all the p->aio checking is really needed or not.
> 
> This issue seems to have been introduced with 70e60d917e91fff
> (gadget/function/f_fs.c: switch to ->{read,write}_iter())
> 
> thanks
> -john
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Locking issues w/ functionfs gadget and aio?
  2015-06-21  5:24   ` Al Viro
@ 2015-06-22 17:20     ` John Stultz
  2015-07-01  9:59       ` Andrzej Pietrasiewicz
  0 siblings, 1 reply; 5+ messages in thread
From: John Stultz @ 2015-06-22 17:20 UTC (permalink / raw)
  To: Al Viro
  Cc: Felipe Balbi, Andrzej Pietrasiewicz, Krzysztof Opasiak,
	Greg Kroah-Hartman, Michal Nazarewicz, Robert Baldyga, lkml,
	linux-usb

On Sat, Jun 20, 2015 at 10:24 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, Jun 12, 2015 at 05:51:12PM -0700, John Stultz wrote:
>
>> I'm not super sure what the right fix is, but if do something like the
>> following (sorry, whitespace corrupted via copy/paste), I don't seem
>> to run into the problem.
>
> Looks sane.  Which tree would you prefer it to go through, vfs or usb?
> BTW, in either case you'd need Signed-off-by: on that patch...

Heh. I assumed my hack would be the wrong thing, but if there's no
better suggestion, I'm happy to submit it.

I have no strong preference of which tree it goes through, but I'd
like for Andrzej to weigh in to make sure he agrees.

I'll write up a proper changelog and resubmit here shortly.

thanks for the review!
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Locking issues w/ functionfs gadget and aio?
  2015-06-22 17:20     ` John Stultz
@ 2015-07-01  9:59       ` Andrzej Pietrasiewicz
  0 siblings, 0 replies; 5+ messages in thread
From: Andrzej Pietrasiewicz @ 2015-07-01  9:59 UTC (permalink / raw)
  To: John Stultz, Al Viro
  Cc: Felipe Balbi, Krzysztof Opasiak, Greg Kroah-Hartman,
	Michal Nazarewicz, Robert Baldyga, lkml, linux-usb

W dniu 22.06.2015 o 19:20, John Stultz pisze:
> On Sat, Jun 20, 2015 at 10:24 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> On Fri, Jun 12, 2015 at 05:51:12PM -0700, John Stultz wrote:
>>
>>> I'm not super sure what the right fix is, but if do something like the
>>> following (sorry, whitespace corrupted via copy/paste), I don't seem
>>> to run into the problem.
>>
>> Looks sane.  Which tree would you prefer it to go through, vfs or usb?
>> BTW, in either case you'd need Signed-off-by: on that patch...
>
> Heh. I assumed my hack would be the wrong thing, but if there's no
> better suggestion, I'm happy to submit it.
>
> I have no strong preference of which tree it goes through, but I'd
> like for Andrzej to weigh in to make sure he agrees.
>

I have no preference of which tree it goes through either.

As far as the changes themselves are concerned, they have
been reviewed by Robert, whose desk is next to mine and
I trust his opinion.

Thanks,

AP


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-07-01 10:00 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-09  1:14 Locking issues w/ functionfs gadget and aio? John Stultz
2015-06-13  0:51 ` John Stultz
2015-06-21  5:24   ` Al Viro
2015-06-22 17:20     ` John Stultz
2015-07-01  9:59       ` Andrzej Pietrasiewicz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.