* I/O issues with writing to mtdblock devices on kirkwood
@ 2015-08-21 12:24 Ian Campbell
  2015-08-21 13:07 ` Andrew Lunn
  2015-09-05 21:08 ` Andrew Lunn
  0 siblings, 2 replies; 32+ messages in thread
From: Ian Campbell @ 2015-08-21 12:24 UTC (permalink / raw)
  To: linux-arm-kernel

Hi kirkwood-upstream,

We (Debian) have had a couple of reports of I/O errors running Debian
on kirkwood, specifically it seems to relate to later kernels (e.g.
4.0+) and I _suspect_ (without proof) that it may be due to the switch
from board files to the DTS based kernel, or some change implied by
this (e.g. different SATA driver now or timeouts have changed
perhaps?).

There are two reports that writing to mtdblock (as a raw device, where
the bootloader expects to find the kernel etc) causes SATA to fall
apart. Please see this thread:
http://mid.gmane.org/<CAMLx2CqT+2fzfa9_Bc0fXLAJ1wJ+yQuJvUThJwN48kGJTJhJxg@mail.gmail.com>

There was an earlier iteration of the same thing as the above:
http://mid.gmane.org/<CAKzmTe1vUGjiPt8k3VHNUP9aHs7zF9wCXoXOo8XFjD-YEH-U5A@mail.gmail.com>

Thanks for any pointers you can give. I've copied the reporters as well
as the debian-arm list.

@debian-arm: I had the impression there had been more reports than
these two/three but I'm not seeing them in my folder, if you know of
others then please chime in.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-08-21 12:24 I/O issues with writing to mtdblock devices on kirkwood Ian Campbell
@ 2015-08-21 13:07 ` Andrew Lunn
  2015-08-21 20:23   ` Ian Campbell
  2015-09-05 21:08 ` Andrew Lunn
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Lunn @ 2015-08-21 13:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
> Hi kirkwood-upstream,

Hi Ian

Thanks for forwarding the reports to us. I will try to reproduce it on my
TS-119P.

I'm travelling for the next 10 days, so it may take a while.

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-08-21 13:07 ` Andrew Lunn
@ 2015-08-21 20:23   ` Ian Campbell
  0 siblings, 0 replies; 32+ messages in thread
From: Ian Campbell @ 2015-08-21 20:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2015-08-21 at 15:07 +0200, Andrew Lunn wrote:
> On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
> > Hi kirkwood-upstream,
> 
> Hi Ian
> 
> Thanks for forwarding the reports to us. I will try to reproduce it on my
> TS-119P.

No problem.

BTW maybe Debian bug #794265[0] is related, it is "flash-kernel
subsequent actions result in segfault" where "flash-kernel" is the tool
which writes the mtdblock device (causing SATA issues in the original
mail).

> I'm travelling for the next 10 days, so it may take a while.

Ack. FYI I'll be on vacation from Sunday for a week.

Cheers,
Ian.

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=794265

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-08-21 12:24 I/O issues with writing to mtdblock devices on kirkwood Ian Campbell
  2015-08-21 13:07 ` Andrew Lunn
@ 2015-09-05 21:08 ` Andrew Lunn
  2015-09-06 12:11   ` Ian Campbell
  2015-10-11 14:37   ` JM
  1 sibling, 2 replies; 32+ messages in thread
From: Andrew Lunn @ 2015-09-05 21:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
> Hi kirkwood-upstream,
> 
> We (Debian) have had a couple of reports of I/O errors running Debian
> on kirkwood, specifically it seems to relate to later kernels (e.g.
> 4.0+) and I _suspect_ (without proof) that it may be due to the switch
> from board files to the DTS based kernel, or some change implied by
> this (e.g. different SATA driver now or timeouts have changed
> perhaps?).

Hi Ian

I've not reproduced this exactly, but something similar.

I do a find / and in parallel a cat /dev/mtdblock3 > /dev/null

While the cat is active, the find grinds to a halt. I don't get any
SATA timeouts, but that could be because /dev/mtdblock3 is small
enough that the timers don't expire.

So I will now try to track down whether there is a lock getting contended,
or at least why only the cat process makes progress.

   Andrew
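
The reproduction described above (a directory walk running while a
flash partition is read) can be approximated with a small timing
harness. This is only a sketch under stated assumptions: the function
name walk_while_reading is hypothetical, and the paths are supplied by
the caller. It reports the longest pause seen between two directories
during the walk, which should grow sharply if the walk stalls while
the read is in progress.

```python
import os
import threading
import time


def walk_while_reading(read_path, walk_root, chunk_size=64 * 1024):
    """Walk walk_root while a background thread reads read_path.

    Returns (entries_seen, longest_gap_seconds), where the gap is the
    longest pause between two consecutive directories visited.
    """

    def reader():
        # Equivalent in spirit to: cat read_path > /dev/null
        with open(read_path, "rb") as src:
            while src.read(chunk_size):
                pass

    thread = threading.Thread(target=reader)
    thread.start()

    entries = 0
    longest_gap = 0.0
    last = time.monotonic()
    # Equivalent in spirit to: find walk_root
    for _dirpath, dirnames, filenames in os.walk(walk_root):
        now = time.monotonic()
        longest_gap = max(longest_gap, now - last)
        last = now
        entries += len(dirnames) + len(filenames)

    thread.join()
    return entries, longest_gap
```

On an affected box one would call it as
walk_while_reading("/dev/mtdblock3", "/") as root. Note that the walk
is not repeated until the read finishes, so the walk root needs to be
large enough to overlap with the read.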

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-09-05 21:08 ` Andrew Lunn
@ 2015-09-06 12:11   ` Ian Campbell
  2015-10-11 14:37   ` JM
  1 sibling, 0 replies; 32+ messages in thread
From: Ian Campbell @ 2015-09-06 12:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, 2015-09-05 at 23:08 +0200, Andrew Lunn wrote:
> On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
> > Hi kirkwood-upstream,
> > 
> > We (Debian) have had a couple of reports of I/O errors running Debian
> > on kirkwood, specifically it seems to relate to later kernels (e.g.
> > 4.0+) and I _suspect_ (without proof) that it may be due to the switch
> > from board files to the DTS based kernel, or some change implied by
> > this (e.g. different SATA driver now or timeouts have changed
> > perhaps?).
> 
> Hi Ian
> 
> I've not reproduced this exactly, but something similar.

It sounds likely to be the same underlying issue.

> I do a find / and in parallel a cat /dev/mtdblock3 > /dev/null

FYI it was reported this morning[0] that using flashcp doesn't exhibit
this behaviour. Since AIUI flashcp goes directly at the /dev/mtdN
device, rather than via mtdblockN, this seems to point towards
something in the mtdblock layer perhaps?

> While the cat is active, the find grinds to a halt. I don't get any
> SATA timeouts, but that could be because /dev/mtdblock3 is small
> enough that the timers don't expire.

Agreed.

Thanks,
Ian.

[0] http://article.gmane.org/gmane.linux.debian.ports.arm/15767

> So i will now try to track down if there is a lock getting contended,
> or at least why only the cat process makes progress.
> 
>    Andrew
> 
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-09-05 21:08 ` Andrew Lunn
  2015-09-06 12:11   ` Ian Campbell
@ 2015-10-11 14:37   ` JM
  2015-10-11 15:35     ` Andrew Lunn
  1 sibling, 1 reply; 32+ messages in thread
From: JM @ 2015-10-11 14:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Sep 5, 2015 at 11:08 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
>> Hi kirkwood-upstream,
>>
>> We (Debian) have had a couple of reports of I/O errors running Debian
>> on kirkwood, specifically it seems to relate to later kernels (e.g.
>> 4.0+) and I _suspect_ (without proof) that it may be due to the switch
>> from board files to the DTS based kernel, or some change implied by
>> this (e.g. different SATA driver now or timeouts have changed
>> perhaps?).
>
> Hi Ian
>
> I've not reproduced this exactly, but something similar.
>
> I do a find / and in parallel a cat /dev/mtdblock3 > /dev/null
>
> While the cat is active, the find grinds to a halt. I don't get any
> SATA timeouts, but that could be because /dev/mtdblock3 is small
> enough that the timers don't expire.
>
> So i will now try to track down if there is a lock getting contended,
> or at least why only the cat process makes progress.
>
>    Andrew

Is there any update on this? The bug persists with kernel 4.2.1

Best regards,
Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-11 14:37   ` JM
@ 2015-10-11 15:35     ` Andrew Lunn
  2015-10-12 14:29       ` JM
  2016-01-11 23:00       ` Martin Michlmayr
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Lunn @ 2015-10-11 15:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 11, 2015 at 04:37:52PM +0200, JM wrote:
> On Sat, Sep 5, 2015 at 11:08 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> > On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
> >> Hi kirkwood-upstream,
> >>
> >> We (Debian) have had a couple of reports of I/O errors running Debian
> >> on kirkwood, specifically it seems to relate to later kernels (e.g.
> >> 4.0+) and I _suspect_ (without proof) that it may be due to the switch
> >> from board files to the DTS based kernel, or some change implied by
> >> this (e.g. different SATA driver now or timeouts have changed
> >> perhaps?).
> >
> > Hi Ian
> >
> > I've not reproduced this exactly, but something similar.
> >
> > I do a find / and in parallel a cat /dev/mtdblock3 > /dev/null
> >
> > While the cat is active, the find grinds to a halt. I don't get any
> > SATA timeouts, but that could be because /dev/mtdblock3 is small
> > enough that the timers don't expire.
> >
> > So i will now try to track down if there is a lock getting contended,
> > or at least why only the cat process makes progress.
> >
> >    Andrew
> 
> Is there any update on this? The bug persists with kernel 4.2.1

Sorry, not had time to look.

Do you have any idea when the issue started? My QNAP box got device
tree support somewhere around 3.5. So I would be tempted to see if
that has the issue, and then git bisect from there. Could you do the
same with your hardware?

     Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-11 15:35     ` Andrew Lunn
@ 2015-10-12 14:29       ` JM
  2015-10-12 16:05         ` Rob J. Epping
  2016-01-11 23:00       ` Martin Michlmayr
  1 sibling, 1 reply; 32+ messages in thread
From: JM @ 2015-10-12 14:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 11, 2015 at 5:35 PM, Andrew Lunn <andrew@lunn.ch> wrote:
> On Sun, Oct 11, 2015 at 04:37:52PM +0200, JM wrote:
>> On Sat, Sep 5, 2015 at 11:08 PM, Andrew Lunn <andrew@lunn.ch> wrote:
>> > On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
>> >> Hi kirkwood-upstream,
>> >>
>> >> We (Debian) have had a couple of reports of I/O errors running Debian
>> >> on kirkwood, specifically it seems to relate to later kernels (e.g.
>> >> 4.0+) and I _suspect_ (without proof) that it may be due to the switch
>> >> from board files to the DTS based kernel, or some change implied by
>> >> this (e.g. different SATA driver now or timeouts have changed
>> >> perhaps?).
>> >
>> > Hi Ian
>> >
>> > I've not reproduced this exactly, but something similar.
>> >
>> > I do a find / and in parallel a cat /dev/mtdblock3 > /dev/null
>> >
>> > While the cat is active, the find grinds to a halt. I don't get any
>> > SATA timeouts, but that could be because /dev/mtdblock3 is small
>> > enough that the timers don't expire.
>> >
>> > So i will now try to track down if there is a lock getting contended,
>> > or at least why only the cat process makes progress.
>> >
>> >    Andrew
>>
>> Is there any update on this? The bug persists with kernel 4.2.1
>
> Sorry, not had time to look.
>
> Do you have any idea when the issue started? My QNAP box got device
> tree support somewhere around 3.5. So i would be tempted to see if
> that has the issue, and then git bisect from there. Could you do the
> same with you hardware?
>
>      Andrew

Thanks for the reply.

I'm afraid my only box is 'in production' and I'd be very reluctant to
test older kernels there as I use fairly recent filesystem features. I
have a feeling this bug has been around for a long time (or perhaps is
even inherent to the way the mtdblock driver is written) and has only
been brought out of its dormancy due to some more recent changes, as I
remember my device momentarily freezing during flashing with Debian
Wheezy (kernel 3.2) (although at that time it wouldn't result in a
complete SATA reset). This could use an independent confirmation
though.

Best regards,
Jan

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-12 14:29       ` JM
@ 2015-10-12 16:05         ` Rob J. Epping
  2015-10-12 16:21           ` Andrew Lunn
  0 siblings, 1 reply; 32+ messages in thread
From: Rob J. Epping @ 2015-10-12 16:05 UTC (permalink / raw)
  To: linux-arm-kernel

On October 12, 2015 4:29:51 PM GMT+02:00, JM <fijam@archlinux.us> wrote:
>On Sun, Oct 11, 2015 at 5:35 PM, Andrew Lunn <andrew@lunn.ch> wrote:
>> On Sun, Oct 11, 2015 at 04:37:52PM +0200, JM wrote:
>>> On Sat, Sep 5, 2015 at 11:08 PM, Andrew Lunn <andrew@lunn.ch> wrote:
>>> > On Fri, Aug 21, 2015 at 01:24:17PM +0100, Ian Campbell wrote:
>>> >> Hi kirkwood-upstream,
>>> >>
>>> >> We (Debian) have had a couple of reports of I/O errors running Debian
>>> >> on kirkwood, specifically it seems to relate to later kernels (e.g.
>>> >> 4.0+) and I _suspect_ (without proof) that it may be due to the switch
>>> >> from board files to the DTS based kernel, or some change implied by
>>> >> this (e.g. different SATA driver now or timeouts have changed
>>> >> perhaps?).
>>> >
>>> > Hi Ian
>>> >
>>> > I've not reproduced this exactly, but something similar.
>>> >
>>> > I do a find / and in parallel a cat /dev/mtdblock3 > /dev/null
>>> >
>>> > While the cat is active, the find grinds to a halt. I don't get any
>>> > SATA timeouts, but that could be because /dev/mtdblock3 is small
>>> > enough that the timers don't expire.
>>> >
>>> > So i will now try to track down if there is a lock getting contended,
>>> > or at least why only the cat process makes progress.
>>> >
>>> >    Andrew
>>>
>>> Is there any update on this? The bug persists with kernel 4.2.1
>>
>> Sorry, not had time to look.
>>
>> Do you have any idea when the issue started? My QNAP box got device
>> tree support somewhere around 3.5. So i would be tempted to see if
>> that has the issue, and then git bisect from there. Could you do the
>> same with you hardware?
>>
>>      Andrew
>
>Thanks for the reply.
>
>I'm afraid my only box is 'in production' and I'd be very reluctant to
>test older kernels there as I use fairly recent filesystem features. I
>have a feeling this bug has been around for a long time (or perhaps is
>even inherent to the way the mtdblock driver is written) and has only
>been brought out of its dormancy due to some more recent changes, as I
>remember my device momentarily freezing during flashing with Debian
>Wheezy (kernel 3.2) (although at that time it wouldn't result in a
>complete SATA reset). This could use an independent confirmation
>though.
>
>Best regards,
>Jan

I've been on backports for some time. AFAIR the problems started with the change to dtb/dts.

I have a box that I can test with, but it has no console connected. I can steal the console connection from my "production" box, but that is cumbersome.

GRTNX,
RobJE
-- 
Sent from my mobile device. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-12 16:05         ` Rob J. Epping
@ 2015-10-12 16:21           ` Andrew Lunn
  2015-10-13  7:51             ` Ian Campbell
  0 siblings, 1 reply; 32+ messages in thread
From: Andrew Lunn @ 2015-10-12 16:21 UTC (permalink / raw)
  To: linux-arm-kernel

> I've been on backports for some time. AFAIR the problems started
> with the change to dtb/dts.

Which kernel version? Can you give me a kernel version when it was
good and a version when it was bad?

For a few kernel versions it was possible to boot both DT and the old
setup file. It would be great to confirm it was the swap to DT, and
not some other kernel change at about the same time.

My gut feeling is that the change to DT is not the issue. But this is
all speculation at the moment, we need some hard evidence.

   Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-12 16:21           ` Andrew Lunn
@ 2015-10-13  7:51             ` Ian Campbell
       [not found]               ` <5626B4DC.8000407@mcfarlanes.me>
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2015-10-13  7:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 2015-10-12 at 18:21 +0200, Andrew Lunn wrote:
> > I've been on backports for some time. AFAIR the problems started
> > with the change to dtb/dts.
> 
> Which kernel version? Can you give me a kernel version when it was
> good and a version when it was bad?
> 
> For a few kernel versions it was possible to boot both DT and the old
> setup file. It would be great to confirm it was the swap to DT, and
> not some other kernel change at about the same time.
> 
> My gut feeling is that the change to DT is not the issue. But this is
> all speculation at the moment, we need some hard evidence.

IIRC someone mentioned that the SATA driver which got used changed with
boardfile vs DT? Not sure if that is true though.

If anyone would like to perform the suggested experiments, note that
flash-kernel in Debian currently appends a DTB for Linux >= v3.17-rc1.
However, this can be adjusted by editing /etc/flash-kernel/db and
adding:

Machine: <...>
DTB-Append-From: v3.16

Where the Machine stanza is the one for your platform, check
/usr/share/flash-kernel/db/all.db. You may need to repeat the Machine
stanza to cope with board vs. dtb mode names, e.g.

    Machine: QNAP TS-41x
    Machine: QNAP TS419 family
    DTB-Append-From: 3.17-rc1

One name is from /proc/device-tree/model under DTB the other is
/proc/cpuinfo:Hardware under board file. Under DTB the
/proc/cpuinfo:Hardware field is generic and not useful here.

You can also do just DTB-Append: Yes.

In any case, having fiddled with the db you need to rerun flash-kernel
to make it take effect.

I think the most profitable first test would be to boot the v3.16
Jessie kernel in both board (default in Jessie) and DTB mode (by
modifying /etc/flash-kernel/db as above) and see if the issue
reproduces in both modes.

Older kernel binaries can be found at
http://snapshot.debian.org/package/linux/. I'm not sure how far back it
is safe to go with Debian's binaries while appending a DTB, so I'd
recommend having a serial console available if you are going to try.

Ian.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
       [not found]                 ` <5627F4E8.3030907@mcfarlanes.me>
@ 2015-10-21 21:11                   ` Ian Campbell
  2015-10-21 21:22                     ` Iain McFarlane
  0 siblings, 1 reply; 32+ messages in thread
From: Ian Campbell @ 2015-10-21 21:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2015-10-21 at 21:26 +0100, Iain McFarlane wrote:
> Summary of progress:

Thanks for the info. I've reinstated the CC list here.

> I ended up connecting using a serial cable which was invaluable as my
> initrd was not working (see below)
> 
> I tried going backwards:
> 4.2.0 - SATA errors when flashing initrd
> 4.1.0 - SATA errors when flashing initrd
> 4.0.0 - SATA errors when flashing initrd
> 3.16 - No SATA errors when updating initrd - the kernel is flashed using
> dtb but how do I tell if it is used?

/proc/device-tree will exist. Also the Hardware field in /proc/cpuinfo
would differ, IIRC with a board file it says "QNAP TS-119/TS-219" and
with DTB it says "QNAP TS219 family".
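
The two checks above can be scripted. What follows is a minimal
sketch, not part of flash-kernel or any Debian tool: the helper names
hardware_field and boot_mode are hypothetical, and the only facts
taken from this thread are that /proc/device-tree exists under DTB
boot and that /proc/cpuinfo carries a Hardware line.

```python
import os


def hardware_field(cpuinfo_text):
    """Return the value of the 'Hardware' line from /proc/cpuinfo text, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Hardware":
            return value.strip()
    return None


def boot_mode(proc_root="/proc"):
    """Return 'dtb' if <proc_root>/device-tree exists, else 'board'."""
    if os.path.exists(os.path.join(proc_root, "device-tree")):
        return "dtb"
    return "board"
```

On the target one would print boot_mode() together with
hardware_field(open("/proc/cpuinfo").read()) and include both in the
report.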

Ian.

> it didn't help that somewhere along the line udev/initramfs/systemd is
> broken so that I had to remove all the UUID values from my fstab and go
> back to device names.
> 
> 
> On 20/10/15 22:40, Iain McFarlane wrote:
> > Posting from archive as previously not subscribed.
> >
> > I have a qnap ts-219p (kirkwood) in pieces at the moment as it is
> > failing to boot the 4.2 kernel from a brand new install of stretch.
> >
> > I am also seeing problems with the sata controller interacting with the
> > mtdblock driver
> >
> > I am going to see if rolling back a version or 2 of the kernel will work
> > tomorrow and will feedback to the list.
> >
> > Regards
> >
> > Iain
> >
> 
> 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-21 21:11                   ` Ian Campbell
@ 2015-10-21 21:22                     ` Iain McFarlane
  2015-10-21 21:28                       ` JM
                                         ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Iain McFarlane @ 2015-10-21 21:22 UTC (permalink / raw)
  To: linux-arm-kernel

On 21/10/15 22:11, Ian Campbell wrote:
> On Wed, 2015-10-21 at 21:26 +0100, Iain McFarlane wrote:
>> Summary of progress:
> Thanks for the info. I've reinstated the CC list here.
>
>> I ended up connecting using a serial cable which was invaluable as my
>> initrd was not working (see below)
>>
>> I tried going backwards:
>> 4.2.0 - SATA errors when flashing initrd
>> 4.1.0 - SATA errors when flashing initrd
>> 4.0.0 - SATA errors when flashing initrd
>> 3.16 - No SATA errors when updating initrd - the kernel is flashed using
>> dtb but how do I tell if it is used?
> /proc/device-tree will exist. Also the Hardware field in /proc/cpuinfo
> would differ, IIRC with a board file it says "QNAP TS-119/TS-219" and
> with DTB it says "QNAP TS219 family".
>
> Ian.
>
>> it didn't help that somewhere along the line udev/initramfs/systemd is
>> broken so that I had to remove all the UUID values from my fstab and go
>> back to device names.
>>
>>
>> On 20/10/15 22:40, Iain McFarlane wrote:
>>> Posting from archive as previously not subscribed.
>>>
>>> I have a qnap ts-219p (kirkwood) in pieces at the moment as it is
>>> failing to boot the 4.2 kernel from a brand new install of stretch.
>>>
>>> I am also seeing problems with the sata controller interacting with the
>>> mtdblock driver
>>>
>>> I am going to see if rolling back a version or 2 of the kernel will work
>>> tomorrow and will feedback to the list.
>>>
>>> Regards
>>>
>>> Iain
>>>
>>
In which case it is using the board file

How do I force it to use the dtb?

On a different note, does anyone know why udev would not be running modprobe?
Every time I reboot this box I have to reload any of the modules not
specified in the initrd, even things like mvmdio, to get the network working.

Iain

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-21 21:22                     ` Iain McFarlane
@ 2015-10-21 21:28                       ` JM
  2015-10-21 21:31                         ` Iain McFarlane
  2015-10-22  0:38                       ` Andrew Lunn
                                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 32+ messages in thread
From: JM @ 2015-10-21 21:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Oct 21, 2015 at 11:22 PM, Iain McFarlane <iain@mcfarlanes.me> wrote:
> On a different note anyone know why udev would not be running modprobe?
> Every time I reboot this box I have to the reload any of the modules not
> specified in the initrd even things like mvmdio to get the network working.
>

If you see something like ' kmod_search_moddep() could not open moddep
file' in your logs, you need to run depmod -a manually.

I've had it happen once for not entirely clear reasons, but probably
related to the interrupted kernel package installation.

HTH,
Jan

PS Now we have Ian, Iain and Jan posting in this thread.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-21 21:28                       ` JM
@ 2015-10-21 21:31                         ` Iain McFarlane
  0 siblings, 0 replies; 32+ messages in thread
From: Iain McFarlane @ 2015-10-21 21:31 UTC (permalink / raw)
  To: linux-arm-kernel

On 21/10/15 22:28, JM wrote:
> On Wed, Oct 21, 2015 at 11:22 PM, Iain McFarlane <iain@mcfarlanes.me> wrote:
>> On a different note anyone know why udev would not be running modprobe?
>> Every time I reboot this box I have to the reload any of the modules not
>> specified in the initrd even things like mvmdio to get the network working.
>>
> If you see something like ' kmod_search_moddep() could not open moddep
> file' in your logs, you need to run depmod -a manually.
>
> I've had it happen once for not entirely clear reasons, but probably
> related to the interrupted kernel package installation.
>
> HTH,
> Jan
>
> PS Now we have Ian, Iain and Jan posting in this thread.
>
I tried that but no luck - any other ideas?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-21 21:22                     ` Iain McFarlane
  2015-10-21 21:28                       ` JM
@ 2015-10-22  0:38                       ` Andrew Lunn
  2015-10-22  6:40                       ` Ian Campbell
  2015-10-22  6:40                       ` Ian Campbell
  3 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2015-10-22  0:38 UTC (permalink / raw)
  To: linux-arm-kernel

> In which case it is using the board file
> 
> How do I force it to use the dtb?

FYI: Mainline 3.16 kernel does support device tree for QNAP. So it
should be possible to boot using it.

My gut feeling is that DT is not part of the problem. So if you cannot
quickly find a way to force DT booting, can i suggest you binary chop
kernel versions between what does work and what does not, independent
of if it used DT or not.

Thanks
	Andrew
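
The binary chop suggested above can be sketched as follows. This is
an illustration only: first_bad is a hypothetical helper, and the
is_bad callback stands in for the manual step of booting a given
kernel, flashing, and watching for SATA errors. It assumes the first
list entry is known good, the last is known bad, and that every
version after the first bad one is also bad.

```python
def first_bad(versions, is_bad):
    """Binary-chop an ordered list of versions.

    versions[0] must be known good and versions[-1] known bad.
    Returns the first version for which is_bad() is true.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid      # regression is at mid or earlier
        else:
            good = mid     # regression is after mid
    return versions[bad]
```

With five candidate versions this needs at most two test boots, which
matters when each test means reflashing a NAS. git bisect applies the
same idea at commit granularity.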

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-21 21:22                     ` Iain McFarlane
  2015-10-21 21:28                       ` JM
  2015-10-22  0:38                       ` Andrew Lunn
@ 2015-10-22  6:40                       ` Ian Campbell
  2015-10-22  6:40                       ` Ian Campbell
  3 siblings, 0 replies; 32+ messages in thread
From: Ian Campbell @ 2015-10-22  6:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2015-10-21 at 22:22 +0100, Iain McFarlane wrote:

> >> 3.16 - No SATA errors when updating initrd - the kernel is flashed using
> >> dtb but how do I tell if it is used?
> [...]
> In which case it is using the board file
> 
> How do I force it to use the dtb?

If the DTB has been appended then it should Just Happen on reboot, I
think. I'm 99% certain that the 3.16 kernel in Jessie had the options
to enable DTB on this platform turned on. Specifically it should have
 CONFIG_ARCH_KIRKWOOD_DT=y.

You say "the kernel is flashed using the dtb", please could you post
the logs from flash-kernel to confirm this. And the dmesg from the
subsequent boot (whether board or DTB) would be useful too.

> On a different note anyone know why udev would not be running modprobe? 
> Every time I reboot this box I have to the reload any of the modules not
> specified in the initrd even things like mvmdio to get the network working.

I've no smart ideas, but you could maybe workaround by listing them in
/etc/modules?

Ian

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2015-10-11 15:35     ` Andrew Lunn
  2015-10-12 14:29       ` JM
@ 2016-01-11 23:00       ` Martin Michlmayr
  2016-01-11 23:22         ` Mark Brown
  1 sibling, 1 reply; 32+ messages in thread
From: Martin Michlmayr @ 2016-01-11 23:00 UTC (permalink / raw)
  To: linux-arm-kernel

A few months ago Debian users with QNAP devices (ARM Kirkwood)
reported issues (mostly SATA timeouts) when doing kernel upgrades,
specifically when the new ramdisk was being written to flash.

cat file > /dev/mtdblockX worked fine on a 2 MB flash partition but
resulted in SATA timeouts on a 9 MB flash partition.

flashcp file /dev/mtd2 works fine.

I've now bisected it down to this change:

 commit 0461a4149836c792d186027c8c859637a4cfb11a
 Author: Mark Brown <broonie@kernel.org>
 Date:   Tue Dec 9 21:38:05 2014 +0000

     spi: Pump transfers inside calling context for spi_sync()

-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-11 23:00       ` Martin Michlmayr
@ 2016-01-11 23:22         ` Mark Brown
  2016-01-11 23:43           ` Andrew Lunn
  2016-01-12  0:07           ` Martin Michlmayr
  0 siblings, 2 replies; 32+ messages in thread
From: Mark Brown @ 2016-01-11 23:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 03:00:59PM -0800, Martin Michlmayr wrote:
> A few months ago Debian users with QNAP devices (ARM Kirkwood)
> reported issues (mostly SATA timeouts) when doing kernel upgrades,
> specifically when the new ramdisk was being written to flash.

> cat file > /dev/mtdblockX worked fine on a 2 MB flash partition but
> resulted in SATA timeouts on a 9 MB flash partition.

> flascp file /dev/mtd2 works fine.

> I've now bisected it down to this change:

>  commit 0461a4149836c792d186027c8c859637a4cfb11a
>  Author: Mark Brown <broonie@kernel.org>
>  Date:   Tue Dec 9 21:38:05 2014 +0000

>      spi: Pump transfers inside calling context for spi_sync()

Can you please clarify?  You're saying this causes SATA timeouts but
this is a change in the SPI subsystem and you're talking about MTD
devices.  You've also not said which kernel version this is with...

In any case, please provide traces from ftrace with all the SPI trace
enabled (via /sys/kernel/debug/trace/events/spi/enable).
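
For anyone collecting the requested traces, the step can be sketched
as below. This is a hedged example rather than an official procedure:
the helper names set_spi_events and read_trace are hypothetical, it
assumes debugfs is mounted at /sys/kernel/debug, and it must run as
root. The enable path is the one given above; the trace buffer is
read from the "trace" file in the same tracing directory.

```python
import os

TRACING_ROOT = "/sys/kernel/debug/trace"


def set_spi_events(enabled, tracing_root=TRACING_ROOT):
    """Write 1 or 0 to events/spi/enable under the tracing directory."""
    path = os.path.join(tracing_root, "events", "spi", "enable")
    with open(path, "w") as fh:
        fh.write("1" if enabled else "0")
    return path


def read_trace(tracing_root=TRACING_ROOT):
    """Return the current contents of the trace buffer as text."""
    with open(os.path.join(tracing_root, "trace")) as fh:
        return fh.read()
```

A typical sequence would be set_spi_events(True), run the flash
write, save the output of read_trace() to a file, then
set_spi_events(False).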

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-11 23:22         ` Mark Brown
@ 2016-01-11 23:43           ` Andrew Lunn
  2016-01-12  1:21             ` Mark Brown
  2016-01-12  0:07           ` Martin Michlmayr
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Lunn @ 2016-01-11 23:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 11:22:31PM +0000, Mark Brown wrote:
> On Mon, Jan 11, 2016 at 03:00:59PM -0800, Martin Michlmayr wrote:
> > A few months ago Debian users with QNAP devices (ARM Kirkwood)
> > reported issues (mostly SATA timeouts) when doing kernel upgrades,
> > specifically when the new ramdisk was being written to flash.
> 
> > cat file > /dev/mtdblockX worked fine on a 2 MB flash partition but
> > resulted in SATA timeouts on a 9 MB flash partition.
> 
> > flashcp file /dev/mtd2 works fine.
> 
> > I've now bisected it down to this change:
> 
> >  commit 0461a4149836c792d186027c8c859637a4cfb11a
> >  Author: Mark Brown <broonie@kernel.org>
> >  Date:   Tue Dec 9 21:38:05 2014 +0000
> 
> >      spi: Pump transfers inside calling context for spi_sync()
> 
> Can you please clarify?  You're saying this causes SATA timeouts but
> this is a change in the SPI subsystem and you're talking about MTD
> devices.  You've also not said which kernel version this is with...

Hi Mark

I've done a little testing. What appears to happen is that while the
cat file > /dev/mtdblockX is going on, all access to filesystems on
SATA are blocked. I set off a "find ." and it busily prints
filenames. But as soon as i start the cat, it grinds to a halt, and
only continues once the cat has finished.

My guess is that the locking behaviour has changed somehow. SPI or MTD
is now holding onto a lock so preventing other filesystems making
progress? Maybe before this change the lock was released and grabbed again
for every message?

   Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-11 23:22         ` Mark Brown
  2016-01-11 23:43           ` Andrew Lunn
@ 2016-01-12  0:07           ` Martin Michlmayr
  2016-01-12  0:47             ` Mark Brown
  1 sibling, 1 reply; 32+ messages in thread
From: Martin Michlmayr @ 2016-01-12  0:07 UTC (permalink / raw)
  To: linux-arm-kernel

* Mark Brown <broonie@debian.org> [2016-01-11 23:22]:
> >      spi: Pump transfers inside calling context for spi_sync()
> 
> Can you please clarify?  You're saying this causes SATA timeouts but
> this is a change in the SPI subsystem and you're talking about MTD
> devices.  You've also not said which kernel version this is with...

Sorry for being unclear.  The problem is that other activities get
blocked (most notably SATA) when writing a 9 MB file to an SPI flash
chip.  The problem does not happen when writing a smaller (2 MB) file.
It only happens when writing to mtdblock, not to mtd (maybe flashcp
writes the file in smaller blocks?).

The problem still exists in 4.4.  I started with 3.16 which was known
to be good.  This was using the ARM board file for QNAP.  I then tried
Device Tree with 3.16 since someone suggested to try that.  I verified
that 3.19 works and that 4.0 shows the problem, and bisected it to the
commit I mentioned.

> In any case, please provide traces from ftrace with all the SPI trace
> events enabled (via /sys/kernel/debug/tracing/events/spi/enable).

I've never used ftrace so I need to look into that (or maybe Andrew
can help).

Anyway, here's the original log:

root@debian:~# cat mtd2 > /dev/mtdblock2
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: failed command: FLUSH CACHE EXT
ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 20
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl F300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: hard resetting link
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl F300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: limiting SATA link speed to 1.5 Gbps
ata1: hard resetting link

-- 
Martin Michlmayr
http://www.cyrius.com/

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12  0:07           ` Martin Michlmayr
@ 2016-01-12  0:47             ` Mark Brown
  2016-01-12  1:19               ` Andrew Lunn
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Brown @ 2016-01-12  0:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jan 11, 2016 at 04:07:21PM -0800, Martin Michlmayr wrote:
> * Mark Brown <broonie@debian.org> [2016-01-11 23:22]:

I've just noticed that you're asking about the kernel SPI subsystem on
the Debian ARM list, please don't do things like this - please include
the kernel community, you'll be able to get more help that way.

> > >      spi: Pump transfers inside calling context for spi_sync()

> > Can you please clarify?  You're saying this causes SATA timeouts but
> > this is a change in the SPI subsystem and you're talking about MTD
> > devices.  You've also not said which kernel version this is with...

> Sorry for being unclear.  The problem is that other activities get
> blocked (most notably SATA) when writing a 9 MB file to an SPI flash
> chip.  The problem does not happen when writing a smaller (2 MB) file.
> It only happens when writing to mtdblock, not to mtd (maybe flashcp
> writes the file in smaller blocks?).

Oh, right.  This sounds like everything is working fine with SPI - that
commit was supposed to improve throughput with single threaded workloads
by avoiding pointless context switches and it seems it is in fact doing
that.  Most likely you are using a bitbanging SPI controller driver and
that's causing lots of I/O wait states which is upsetting the scheduler
but it's hard to be sure.

Possibly whatever SPI driver this system uses is doing something really
rude (perhaps limited by the hardware), possibly it isn't using DMA when
it should be, or possibly the scheduler just isn't doing a good job with
the workload you're giving it.

> > In any case, please provide traces from ftrace with all the SPI trace
> > events enabled (via /sys/kernel/debug/tracing/events/spi/enable).

> I've never used ftrace so I need to look into that (or maybe Andrew
> can help).

cat /sys/kernel/debug/tracing/trace
https://www.kernel.org/doc/Documentation/trace/ftrace.txt
http://www.sirena.org.uk/2011/01/22/tracing-asoc-with-trace-points/
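
To make the ftrace steps concrete, here is a minimal sketch in C. It
assumes debugfs is mounted at /sys/kernel/debug and that the kernel was
built with tracing and the SPI trace events; write_tracefs() is a
hypothetical helper name chosen for illustration, not an existing API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a short string to an existing control file. O_CREAT is left out
 * on purpose: a missing file means the trace event is not available.
 * Returns 0 on success, -1 on error.
 */
static int write_tracefs(const char *path, const char *value)
{
    assert(path != NULL && value != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t len = strlen(value);
    ssize_t n = write(fd, value, len);
    int rc = (n == (ssize_t)len) ? 0 : -1;

    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

As root, one would call
write_tracefs("/sys/kernel/debug/tracing/events/spi/enable", "1"),
reproduce the failing cat, and then read the result from
/sys/kernel/debug/tracing/trace.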

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12  0:47             ` Mark Brown
@ 2016-01-12  1:19               ` Andrew Lunn
  2016-01-12  1:31                 ` Mark Brown
  2016-01-12 16:29                 ` Arnd Bergmann
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Lunn @ 2016-01-12  1:19 UTC (permalink / raw)
  To: linux-arm-kernel

> Oh, right.  This sounds like everything is working fine with SPI - that
> commit was supposed to improve throughput with single threaded workloads
> by avoiding pointless context switches and it seems it is in fact doing
> that.  Most likely you are using a bitbanging SPI controller driver and
> that's causing lots of I/O wait states which is upsetting the scheduler
> but it's hard to be sure.

drivers/spi/spi-orion.c

Not bitbanging, but it is polled IO, not DMA.
 
> Possibly whatever SPI driver this system uses is doing something really
> rude (perhaps limited by the hardware), possibly it isn't using DMA when
> it should be, or possibly the scheduler just isn't doing a good job with
> the workload you're giving it.

When i played with this, i added a reschedule point at the end of the
driver's transfer_one_message() function, to see if that would help. It
did not, which is why i made a guess it has something to do with a
lock.

	Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-11 23:43           ` Andrew Lunn
@ 2016-01-12  1:21             ` Mark Brown
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Brown @ 2016-01-12  1:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 12, 2016 at 12:43:37AM +0100, Andrew Lunn wrote:

> I've done a little testing. What appears to happen is that while the
> cat file > /dev/mtdblockX is going on, all access to filesystems on
> SATA are blocked. I set off a "find ." and it busily prints
> filenames. But as soon as i start the cat, it grinds to a halt, and
> only continues once the cat has finished.

> My guess is that the locking behaviour has changed somehow. SPI or MTD
> is now holding onto a lock so preventing other filesystems making
> progress? Maybe before this change the lock was released and grabbed again
> for every message?

It's not likely to be the SPI core - it has nothing to do with
filesystems or SATA and the workload being presented to it isn't going
to have changed (that workload generally being single threaded one
message at a time for MTD so SPI will go idle between operations).  What
that commit does is avoid needless context switches before and after we
hand things off to the SPI driver so like I said in the other mail most
likely either the SPI driver is being very rude somehow, the scheduler
isn't coping or some combination of the two.  My guess is that you'd
always have been able to trigger these issues if you did a sufficiently
large flash read at once, or had sufficiently many simultaneous flash
operations going on in parallel to create a queue.

Guessing this is the spi-orion driver it looks like it's busy waiting
for the full transfer, doing register I/O interspersed with udelay()
calls for delays up to 2ms in between words.  That's never going to be
terribly friendly to other users though I don't know if the hardware
allows us to do much better.  

I would expect the scheduler to let the SATA subsystem use the CPU but
it's possible all the register I/O is getting in the way here or the
timeouts are too short.  If that is the cause then some schedule() calls in
the inner loop of the driver might help (eg, in _wait_till_ready() or
_write_read()); alternatively, call schedule() or even insert artificial
sleeps at the end of _transfer_one() to simulate the previous
behaviour (that'd hurt throughput for other users though).  You could
also artificially slow down the userspace program that accesses the
flash, that seems really undesirable though.
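
As an illustration of the last option (slowing down the userspace
writer), here is a minimal sketch in C. The names throttled_copy() and
write_all() are hypothetical, and the chunk size and pause are
assumptions to be tuned. Because mtdblock rewrites whole erase blocks,
chunk should probably be a multiple of the flash erase size so that the
per-chunk fsync() does not erase the same block repeatedly. Whether this
avoids the SATA timeouts has not been tested here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/*
 * Copy src to an existing dst in chunk-sized pieces. After each chunk,
 * flush it with fsync() and sleep pause_ms milliseconds so that other
 * tasks get CPU time between bursts of flash I/O.
 * Returns the number of bytes copied, or -1 on error.
 */
static long throttled_copy(const char *src, const char *dst,
                           size_t chunk, long pause_ms)
{
    assert(chunk > 0 && pause_ms >= 0);

    struct timespec pause = {
        .tv_sec = pause_ms / 1000,
        .tv_nsec = (pause_ms % 1000) * 1000000L,
    };
    long total = -1;
    char *buf = malloc(chunk);
    int in = open(src, O_RDONLY);
    int out = open(dst, O_WRONLY);

    if (buf != NULL && in >= 0 && out >= 0) {
        ssize_t n;

        total = 0;
        while (total >= 0 && (n = read(in, buf, chunk)) != 0) {
            if (n < 0 || write_all(out, buf, (size_t)n) < 0 ||
                fsync(out) < 0) {
                total = -1;     /* stop on the first error */
            } else {
                total += n;
                nanosleep(&pause, NULL);
            }
        }
    }

    free(buf);
    if (in >= 0)
        close(in);
    if (out >= 0 && close(out) < 0)
        total = -1;
    return total;
}
```

A call such as throttled_copy("mtd2", "/dev/mtdblock2", 65536, 10) would
then replace the plain cat; the 64 KiB chunk and 10 ms pause are
placeholders, not measured values.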

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12  1:19               ` Andrew Lunn
@ 2016-01-12  1:31                 ` Mark Brown
  2016-01-12 16:29                 ` Arnd Bergmann
  1 sibling, 0 replies; 32+ messages in thread
From: Mark Brown @ 2016-01-12  1:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 12, 2016 at 02:19:04AM +0100, Andrew Lunn wrote:

> When i played with this, i added a reschedule point at the end of the
> driver's transfer_one_message() function, to see if that would help. It
> did not, which is why i made a guess it has something to do with a
> lock.

The SPI core doesn't hold any locks while it pushes things into the
driver.  It has a spin lock to queue and dequeue messages but it drops
that before it hands off to the driver (many of which sleep so we'd
expect people to notice if we forgot to drop the lock).  It's possible
there's something going on there but I'd be very surprised if we messed
up the locking and only managed to notice over a period of years via an
interaction with another subsystem with no direct ties...

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12  1:19               ` Andrew Lunn
  2016-01-12  1:31                 ` Mark Brown
@ 2016-01-12 16:29                 ` Arnd Bergmann
  2016-01-12 18:02                   ` Mark Brown
  1 sibling, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-01-12 16:29 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 12 January 2016 02:19:04 Andrew Lunn wrote:
> > Oh, right.  This sounds like everything is working fine with SPI - that
> > commit was supposed to improve throughput with single threaded workloads
> > by avoiding pointless context switches and it seems it is in fact doing
> > that.  Most likely you are using a bitbanging SPI controller driver and
> > that's causing lots of I/O wait states which is upsetting the scheduler
> > but it's hard to be sure.
> 
> drivers/spi/spi-orion.c
> 
> Not bitbanging, but it is polled IO, not DMA.
>  
> > Possibly whatever SPI driver this system uses is doing something really
> > rude (perhaps limited by the hardware), possibly it isn't using DMA when
> > it should be, or possibly the scheduler just isn't doing a good job with
> > the workload you're giving it.
> 
> When i played with this, i added a reschedule point at the end of the
> > driver's transfer_one_message() function, to see if that would help. It
> did not, which is why i made a guess it has something to do with a
> lock.

Can you try using usleep_range() instead of udelay()? It might also
be worth trying what the actual delay is for each byte to see if a
longer sleep time would help, but I guess that time is highly
device specific.

	Arnd
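
The difference between spinning and sleeping while polling can be
sketched in userspace C. This is only an analogue: nanosleep() stands in
for the kernel's usleep_range(), a zero sleep stands in for a busy-wait,
and countdown_ready() is a made-up stand-in for reading the controller's
ready bit. The real spi-orion loop is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Stand-in for a status register read: ready after *ctx failed polls. */
static bool countdown_ready(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return false;
    }
    return true;
}

/*
 * Poll ready() up to max_polls times. With sleep_us > 0 the caller
 * sleeps between polls and gives up the CPU; with sleep_us == 0 it
 * retries immediately, i.e. it busy-waits.
 * Returns the number of failed polls before success, or -1 on timeout.
 */
static int poll_until_ready(bool (*ready)(void *), void *ctx,
                            int max_polls, long sleep_us)
{
    assert(ready != NULL && max_polls > 0 && sleep_us >= 0);

    struct timespec ts = {
        .tv_sec = sleep_us / 1000000,
        .tv_nsec = (sleep_us % 1000000) * 1000L,
    };

    for (int i = 0; i < max_polls; i++) {
        if (ready(ctx))
            return i;
        if (sleep_us > 0)
            nanosleep(&ts, NULL);
    }
    return -1;
}
```

The trade-off is the one raised in this thread: sleeping lets other work
run, but each wakeup can arrive later than the hardware was ready, which
may lower SPI throughput.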

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12 16:29                 ` Arnd Bergmann
@ 2016-01-12 18:02                   ` Mark Brown
  2016-01-12 21:49                     ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Brown @ 2016-01-12 18:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 12, 2016 at 05:29:40PM +0100, Arnd Bergmann wrote:
> On Tuesday 12 January 2016 02:19:04 Andrew Lunn wrote:

> > When i played with this, i added a reschedule point at the end of the
> > driver's transfer_one_message() function, to see if that would help. It
> > did not, which is why i made a guess it has something to do with a
> > lock.

> Can you try using usleep_range() instead of udelay()? It might also

Oh, indeed - should've thought of that!

> be worth trying what the actual delay is for each byte to see if a
> longer sleep time would help, but I guess that time is highly
> device specific.

I expect that the delay will depend on the clock speed of the SPI bus
which is system dependent.  Might actually be worth checking if the bus
is clocked excessively slowly.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12 18:02                   ` Mark Brown
@ 2016-01-12 21:49                     ` Arnd Bergmann
  2016-01-12 22:00                       ` Mark Brown
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-01-12 21:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 12 January 2016 18:02:23 Mark Brown wrote:
> On Tue, Jan 12, 2016 at 05:29:40PM +0100, Arnd Bergmann wrote:
> > On Tuesday 12 January 2016 02:19:04 Andrew Lunn wrote:
> 
> > > When i played with this, i added a reschedule point at the end of the
> > > drivers transfer_one_message() function, to see if that would help. It
> > > did not, which is why i made a guess it has something to do with a
> > > lock.
> 
> > Can you try using usleep_range() instead of udelay()? It might also
> 
> Oh, indeed - should've thought of that!
> 
> > be worth trying what the actual delay is for each byte to see if a
> > longer sleep time would help, but I guess that time is highly
> > device specific.
> 
> I expect that the delay will depend on the clock speed of the SPI bus
> which is system dependent.  Might actually be worth checking if the bus
> is clocked excessively slowly.

Definitely worth checking, but if we are transferring large enough amounts
of data, we will always incur a long delay even if the SPI bus is
clocked relatively fast. Doing a usleep_range() will likely make the
throughput much lower, as we will have longer times during which we
are not transferring any data here.

I see in the Armada 370 datasheet that the controller actually supports an
interrupt driven mode, which should be much better in principle than blocking
the CPU during the entire transfer. There may of course be a good reason
why the driver has never used interrupts, but it was originally introduced
in Orion5x, so maybe there was a bug that has been fixed in the meantime.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12 21:49                     ` Arnd Bergmann
@ 2016-01-12 22:00                       ` Mark Brown
  2016-01-12 22:41                         ` Arnd Bergmann
  0 siblings, 1 reply; 32+ messages in thread
From: Mark Brown @ 2016-01-12 22:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 12, 2016 at 10:49:57PM +0100, Arnd Bergmann wrote:

> I see in the Armada 370 datasheet that the controller actually supports an
> interrupt driven mode, which should be much better in principle than blocking
> the CPU during the entire transfer. There may of course be a good reason
> why the driver has never used interrupts, but it was originally introduced
> in Orion5x, so maybe there was a bug that has been fixed in the meantime.

There's also the potential for the overhead from interrupts to be worse
than the overhead for busy waiting, but given how badly we're coping now
it's worth another look.

I did also see a patch today that suggested that at least some of these
controllers have some kind of memory mapped mode but I don't know
exactly how that works (the patch was a little unclear).  There was some
suggestion that it performed well and if there's a free general purpose
DMA controller it might be able to work with this mode too.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12 22:00                       ` Mark Brown
@ 2016-01-12 22:41                         ` Arnd Bergmann
  2016-01-13 11:42                           ` Mark Brown
  0 siblings, 1 reply; 32+ messages in thread
From: Arnd Bergmann @ 2016-01-12 22:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 12 January 2016 22:00:58 Mark Brown wrote:
> On Tue, Jan 12, 2016 at 10:49:57PM +0100, Arnd Bergmann wrote:
> 
> > I see in the Armada 370 datasheet that the controller actually supports an
> > interrupt driven mode, which should be much better in principle than blocking
> > the CPU during the entire transfer. There may of course be a good reason
> > why the driver has never used interrupts, but it was originally introduced
> > in Orion5x, so maybe there was a bug that has been fixed in the meantime.
> 
> There's also the potential for the overhead from interrupts to be worse
> than the overhead for busy waiting, but given how badly we're coping now
> it's worth another look.

Possible, yes. Right now, we do a udelay(1), which I think waits between one and
two microseconds normally, so we are probably adding extra latency of under
one microsecond per byte by not polling constantly. If the interrupt latency
from device-ready to entering the code is more than that, we can assume that
the transfer ends up being slower. There is also some time spent for getting
back out of the interrupt handler, but that's no worse than doing nothing
in udelay().
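
To put rough numbers on that estimate, here is a small C helper. It
assumes that "9 MB" means 9 MiB (9437184 bytes) and that every byte
costs one udelay(1) of between one and two microseconds; wait_seconds()
is a hypothetical name, and flash erase and program times are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Seconds spent waiting if each byte costs us_per_byte microseconds. */
static double wait_seconds(size_t bytes, double us_per_byte)
{
    assert(us_per_byte >= 0.0);
    return (double)bytes * us_per_byte / 1e6;
}
```

Under those assumptions, wait_seconds(9437184, 1.0) is about 9.4 s and
wait_seconds(9437184, 2.0) is about 18.9 s of CPU time spent inside
udelay() for one image, and the extra latency from not polling
constantly is bounded by the first figure.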

> I did also see a patch today that suggested that at least some of these
> controllers have some kind of memory mapped mode but I don't know
> exactly how that works (the patch was a little unclear).  There was some
> suggestion that it performed well and if there's a free general purpose
> DMA controller it might be able to work with this mode too.

There is the "xor" DMA engine that has a memcpy mode, but setting this up
to point at the right mbus translation window is going to be interesting.

	Arnd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* I/O issues with writing to mtdblock devices on kirkwood
  2016-01-12 22:41                         ` Arnd Bergmann
@ 2016-01-13 11:42                           ` Mark Brown
  0 siblings, 0 replies; 32+ messages in thread
From: Mark Brown @ 2016-01-13 11:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jan 12, 2016 at 11:41:45PM +0100, Arnd Bergmann wrote:
> On Tuesday 12 January 2016 22:00:58 Mark Brown wrote:

> > There's also the potential for the overhead from interrupts to be worse
> > than the overhead for busy waiting, but given how badly we're coping now
> > it's worth another look.

> Possible, yes. Right now, we do a udelay(1), which I think waits between one and
> two microseconds normally, so we are probably adding extra latency of under
> one microsecond per byte by not polling constantly. If the interrupt latency
> from device-ready to entering the code is more than that, we can assume that
> the transfer ends up being slower. There is also some time spent for getting
> back out of the interrupt handler, but that's no worse than doing nothing
> in udelay().

Indeed.  It may also make the cache performance worse as a result of
frequently switching to the interrupt handler but if nothing else it
should make it easier for something else to get access to the CPU which
seems to be the issue here.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-01-13 11:42 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-21 12:24 I/O issues with writing to mtdblock devices on kirkwood Ian Campbell
2015-08-21 13:07 ` Andrew Lunn
2015-08-21 20:23   ` Ian Campbell
2015-09-05 21:08 ` Andrew Lunn
2015-09-06 12:11   ` Ian Campbell
2015-10-11 14:37   ` JM
2015-10-11 15:35     ` Andrew Lunn
2015-10-12 14:29       ` JM
2015-10-12 16:05         ` Rob J. Epping
2015-10-12 16:21           ` Andrew Lunn
2015-10-13  7:51             ` Ian Campbell
     [not found]               ` <5626B4DC.8000407@mcfarlanes.me>
     [not found]                 ` <5627F4E8.3030907@mcfarlanes.me>
2015-10-21 21:11                   ` Ian Campbell
2015-10-21 21:22                     ` Iain McFarlane
2015-10-21 21:28                       ` JM
2015-10-21 21:31                         ` Iain McFarlane
2015-10-22  0:38                       ` Andrew Lunn
2015-10-22  6:40                       ` Ian Campbell
2015-10-22  6:40                       ` Ian Campbell
2016-01-11 23:00       ` Martin Michlmayr
2016-01-11 23:22         ` Mark Brown
2016-01-11 23:43           ` Andrew Lunn
2016-01-12  1:21             ` Mark Brown
2016-01-12  0:07           ` Martin Michlmayr
2016-01-12  0:47             ` Mark Brown
2016-01-12  1:19               ` Andrew Lunn
2016-01-12  1:31                 ` Mark Brown
2016-01-12 16:29                 ` Arnd Bergmann
2016-01-12 18:02                   ` Mark Brown
2016-01-12 21:49                     ` Arnd Bergmann
2016-01-12 22:00                       ` Mark Brown
2016-01-12 22:41                         ` Arnd Bergmann
2016-01-13 11:42                           ` Mark Brown
