linux-kernel.vger.kernel.org archive mirror
* block layer bug with 4.4-rc3+
@ 2015-12-15 11:05 Andre Przywara
  2015-12-15 11:54 ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2015-12-15 11:05 UTC (permalink / raw)
  To: Jens Axboe, Ming Lei
  Cc: linux-block, linux-kernel, Rob Herring, Eric Auger, linux-arm-kernel

Hi,

I've been experiencing issues with at least 4.4-rc3 (including current
HEAD) on a Calxeda Midway (4*ARM Cortex-A15 (32-bit), 8GB RAM, SATA
spinning disk or SSD).
After some disk I/O load (kernel compile with -j6) I see the kernel
screaming:

[  103.736982] ata1.00: exception Emask 0x0 SAct 0x3ffff0 SErr 0x0 action 0x6 frozen
[  103.744476] ata1.00: failed command: WRITE FPDMA QUEUED
[  103.749707] ata1.00: cmd 61/00:20:48:6b:41/08:00:0a:00:00/40 tag 4 ncq 1048576 out
[  103.749707]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[  103.764659] ata1.00: status: { DRDY }
[  103.768321] ata1.00: failed command: WRITE FPDMA QUEUED
[  103.773547] ata1.00: cmd 61/98:28:48:73:41/42:00:0a:00:00/40 tag 5 ncq 8728576 out
[  103.773547]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
<repeated with increasing tag numbers>

This repeats for a while, but then seems to recover later, though I
haven't checked if there are more issues and rebooted instead to avoid
filesystem damage.
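
For readers decoding the log above: the "cmd" field is libata's dump of the
taskfile registers for the failed command. The following is a minimal Python
sketch, not part of the original report. It assumes 512-byte logical sectors
and the field order command/feature:nsect:lbal:lbam:lbah/hob_feature:
hob_nsect:hob_lbal:hob_lbam:hob_lbah/device; the function names are made up
for illustration. For NCQ (FPDMA) commands the sector count sits in the
feature registers and the tag in the upper bits of nsect.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def decode_ncq_cmd(dump):
    """Decode a libata 'cmd' taskfile dump of an NCQ read/write command."""
    command, low, high, _device = dump.split("/")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    # NCQ: the sector count is split over feature (low) and hob_feature (high);
    # a count of 0 means 65536 sectors.
    sectors = ((hob_feature << 8) | feature) or 65536
    # 48-bit LBA, most significant byte first.
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {
        "command": int(command, 16),
        "tag": nsect >> 3,  # NCQ tag lives in bits 7:3 of nsect
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_SIZE,
    }


def active_tags(sact):
    """List the NCQ tags whose bit is set in the SAct value."""
    return [bit for bit in range(32) if (sact >> bit) & 1]


first = decode_ncq_cmd("61/00:20:48:6b:41/08:00:0a:00:00/40")
second = decode_ncq_cmd("61/98:28:48:73:41/42:00:0a:00:00/40")
print(first["tag"], first["sectors"], first["bytes"])
print(second["tag"], second["sectors"], second["bytes"])
print(len(active_tags(0x3FFFF0)), "commands outstanding")
```

Decoded this way, the byte counts agree with the "ncq ... out" values libata
printed, the second command starts exactly where the first one ends, and SAct
0x3ffff0 corresponds to tags 4 through 21 being in flight.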

While I agree that this looks like a disk error at first glance, I
never saw this before 4.4-rc2, had the very same error on different
nodes (with another spinning disk and even an SSD) and I can make it
vanish by reverting the commit I identified after bisection:

commit 578270bfbd2803dc7b0b03fbc2ac119efbc73195
Author: Ming Lei <ming.lei@canonical.com>
Date:   Tue Nov 24 10:35:29 2015 +0800

    block: fix segment split
...
I understand that this fix seems sane, but reverting it actually fixes
the issue for me: 4.4-rc5 crashed within a few minutes with the above
log, while 4.4-rc5 with 578270bfbd reverted survived 19 hours of
continuous kernel compiles without issues.
Looking at the git history of that file I see quite a few recent changes
there, but it's beyond my understanding of the code to spot the real
culprit.

Can anyone point me to a change in blk-merge.c I could try to revert to
identify the real root cause? I can run tests quickly, though a real
positive case would need some hours of runtime to be sure it's fine.

Many thanks!
Cheers,
Andre.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: block layer bug with 4.4-rc3+
  2015-12-15 11:05 block layer bug with 4.4-rc3+ Andre Przywara
@ 2015-12-15 11:54 ` Ming Lei
  2015-12-15 12:23   ` Andre Przywara
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2015-12-15 11:54 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi,
>
> I've been experiencing issues with at least 4.4-rc3 (including current

I'd suggest testing the latest Linus tree first, since at least two fix
patches have been merged for blk-merge issues.  If the issue is still
there with Linus' tree, I am happy to take a look.

Thanks,


* Re: block layer bug with 4.4-rc3+
  2015-12-15 11:54 ` Ming Lei
@ 2015-12-15 12:23   ` Andre Przywara
  2015-12-15 13:39     ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2015-12-15 12:23 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

Hi Ming,

thanks for the answer!

On 15/12/15 11:54, Ming Lei wrote:
> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Hi,
>>
>> I've been experiencing issues with at least 4.4-rc3 (including current
> 
> I'd suggest you to test the latest linus tree first, and at least two
> fix patches
> have been merged for blk-merge issue.  If there is still the issue
> with linus tree,
> I am happy to take a look.

Mmh, as said ("including current HEAD") this still happens with the
latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
Just tested yesterday.
Is there another branch/tree with block fixes I should test? Is it worth
trying any of the upcoming branches in linux-block.git (for-4.5/core,
maybe?)

Thanks,
Andre.


* Re: block layer bug with 4.4-rc3+
  2015-12-15 12:23   ` Andre Przywara
@ 2015-12-15 13:39     ` Ming Lei
  2015-12-16 14:55       ` Andre Przywara
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2015-12-15 13:39 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

On Tue, Dec 15, 2015 at 8:23 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi Ming,
>
> thanks for the answer!
>
> On 15/12/15 11:54, Ming Lei wrote:
>> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>> Hi,
>>>
>>> I've been experiencing issues with at least 4.4-rc3 (including current
>>
>> I'd suggest you to test the latest linus tree first, and at least two
>> fix patches
>> have been merged for blk-merge issue.  If there is still the issue
>> with linus tree,
>> I am happy to take a look.
>
> Mmh, as said ("including current HEAD") this happens still with the
> latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
> Just tested yesterday.
> Is there another branch/tree with block fixes I should test? Is it worth
> to try any of the upcoming branches in linux-block.git (for-4.5/core,
> maybe?)

Both fixes are already in Linus' tree, and reverting the commit
basically makes merging impossible, so the real issue must be somewhere
else.

Can you also see the issue on another 32-bit ARM platform?  I don't see
it on x86 or arm64, and the commit itself is correct, IMO.


* Re: block layer bug with 4.4-rc3+
  2015-12-15 13:39     ` Ming Lei
@ 2015-12-16 14:55       ` Andre Przywara
  2015-12-16 15:43         ` Arnd Bergmann
  2015-12-17  3:52         ` Ming Lei
  0 siblings, 2 replies; 13+ messages in thread
From: Andre Przywara @ 2015-12-16 14:55 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

Hi,

On 15/12/15 13:39, Ming Lei wrote:
> On Tue, Dec 15, 2015 at 8:23 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Hi Ming,
>>
>> thanks for the answer!
>>
>> On 15/12/15 11:54, Ming Lei wrote:
>>> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>> Hi,
>>>>
>>>> I've been experiencing issues with at least 4.4-rc3 (including current
>>>
>>> I'd suggest you to test the latest linus tree first, and at least two
>>> fix patches
>>> have been merged for blk-merge issue.  If there is still the issue
>>> with linus tree,
>>> I am happy to take a look.
>>
>> Mmh, as said ("including current HEAD") this happens still with the
>> latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
>> Just tested yesterday.
>> Is there another branch/tree with block fixes I should test? Is it worth
>> to try any of the upcoming branches in linux-block.git (for-4.5/core,
>> maybe?)
> 
> Both the fixes have been in linus tree already, and reverting the commit
> basically makes merge not possible, so there must be issues somewhere.
> 
> And can you see the issue on other 32bit ARM platform?  I don't see the
> issue on x86 and arm64, and the commit itself is correct, IMO.

Quick tests on a Cubietruck didn't show the issue, but this board is
nowhere near the Midway (2 in-order cores with 2GB RAM vs. 4
out-of-order cores with 8 GB RAM), so the load isn't the same.
I could rule out .config issues by using multi_v7_defconfig - with LPAE
enabled on top, that is.
Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
lose half of the RAM on that box) didn't show the bug so far.
One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
turn to 64-bit, with long, int and void* still being 32-bit. Can you
think of any issues that could be related to that?

Also can you briefly sketch what that patch (578270bfbd) eventually
changes? I see that the fix looks right, I am just wondering what the
impact is: Do we get more blocks or fewer, bigger ones or smaller?

I will try to do more experiments and to find the real culprit.

Thanks,
Andre.


* Re: block layer bug with 4.4-rc3+
  2015-12-16 14:55       ` Andre Przywara
@ 2015-12-16 15:43         ` Arnd Bergmann
  2015-12-17  3:55           ` Ming Lei
  2015-12-17  9:28           ` Andre Przywara
  2015-12-17  3:52         ` Ming Lei
  1 sibling, 2 replies; 13+ messages in thread
From: Arnd Bergmann @ 2015-12-16 15:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Andre Przywara, Ming Lei, Jens Axboe, Rob Herring, Eric Auger,
	Linux Kernel Mailing List, linux-block

On Wednesday 16 December 2015 14:55:43 Andre Przywara wrote:
> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
> lose half of the RAM on that box) didn't show the bug so far.
> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
> turn to 64-bit, with long, int and void* still being 32-bit. Can you
> think of any issues that could be related to that?
> 

Another difference between the platforms is highmem. On Midway,
almost all of RAM is highmem, which needs a lot of special handling
that we don't need on platforms with less RAM.

To rule out both highmem and LPAE, it might be interesting to
test again on Midway with CONFIG_HIGHMEM and CONFIG_LPAE disabled.
This will limit the RAM to 768MB, but if the bug still shows up,
you know it's something else.
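
When juggling several test kernels it is easy to lose track of which options
a given build really has. A small Python sketch, assuming the standard
.config text format, reports the state of an option; the helper name and the
sample text are made up. Note that the Kconfig symbol for LPAE on 32-bit ARM
is spelled CONFIG_ARM_LPAE.

```python
def config_state(config_text, symbol):
    """Return the value assigned to symbol, or 'n' if it is unset or absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == "# %s is not set" % symbol:
            return "n"
        if line.startswith(symbol + "="):
            return line.split("=", 1)[1]
    return "n"


# Made-up sample; in practice read the .config of the kernel under test.
SAMPLE = """\
CONFIG_HIGHMEM=y
# CONFIG_ARM_LPAE is not set
"""

for sym in ("CONFIG_HIGHMEM", "CONFIG_ARM_LPAE"):
    print(sym, config_state(SAMPLE, sym))
```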

	Arnd


* Re: block layer bug with 4.4-rc3+
  2015-12-16 14:55       ` Andre Przywara
  2015-12-16 15:43         ` Arnd Bergmann
@ 2015-12-17  3:52         ` Ming Lei
  2015-12-17 12:32           ` Andre Przywara
  2015-12-17 12:33           ` Andre Przywara
  1 sibling, 2 replies; 13+ messages in thread
From: Ming Lei @ 2015-12-17  3:52 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

On Wed, Dec 16, 2015 at 10:55 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi,
>
> On 15/12/15 13:39, Ming Lei wrote:
>> On Tue, Dec 15, 2015 at 8:23 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>> Hi Ming,
>>>
>>> thanks for the answer!
>>>
>>> On 15/12/15 11:54, Ming Lei wrote:
>>>> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I've been experiencing issues with at least 4.4-rc3 (including current
>>>>
>>>> I'd suggest you to test the latest linus tree first, and at least two
>>>> fix patches
>>>> have been merged for blk-merge issue.  If there is still the issue
>>>> with linus tree,
>>>> I am happy to take a look.
>>>
>>> Mmh, as said ("including current HEAD") this happens still with the
>>> latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
>>> Just tested yesterday.
>>> Is there another branch/tree with block fixes I should test? Is it worth
>>> to try any of the upcoming branches in linux-block.git (for-4.5/core,
>>> maybe?)
>>
>> Both the fixes have been in linus tree already, and reverting the commit
>> basically makes merge not possible, so there must be issues somewhere.
>>
>> And can you see the issue on other 32bit ARM platform?  I don't see the
>> issue on x86 and arm64, and the commit itself is correct, IMO.
>
> Quick tests on a Cubietruck didn't show the issue, but this board is
> nowhere near the Midway (2 in-order cores with 2GB RAM vs. 4
> out-of-order cores with 8 GB RAM), so the load isn't the same.
> I could rule out .config issues by using multi_v7_defconfig - with LPAE
> enabled on top, that is.
> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
> lose half of the RAM on that box) didn't show the bug so far.
> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
> turn to 64-bit, with long, int and void* still being 32-bit. Can you
> think of any issues that could be related to that?
>
> Also can you briefly sketch what that patch (578270bfbd) eventually
> changes? I see that the fix looks right, I am just wondering what the
> impact is: Do we get more blocks or less or bigger ones or smaller?

Without the change, 'bvprvp' always points to 'bv', so a bio vector can
never be merged with another bio vector. Each bvec then becomes one
single physical segment (converted to one single sg element in the
driver). As a result, the transfer size of each bio and the size of each
segment become much smaller, while the number of segments may become
bigger.
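
The effect described above can be modelled in a few lines. This is a
simplified Python sketch, not the kernel code: bio vectors are reduced to
(physical address, length) pairs, the function and parameter names are made
up, and only physical contiguity plus a maximum segment size are checked.
Setting prev_tracks_current=True mimics 'bvprvp' pointing at 'bv' itself.

```python
def count_segments(bvecs, max_seg_size, prev_tracks_current=False):
    """Count physical segments for a list of (phys_addr, length) vectors."""
    segments = 0
    seg_size = 0
    prev = None
    for bv in bvecs:
        if prev_tracks_current:
            # Models the pre-fix behaviour: the "previous" vector is really
            # the current one, so the contiguity test can never succeed.
            prev = bv
        addr, length = bv
        if prev is not None:
            prev_addr, prev_len = prev
            contiguous = prev_addr + prev_len == addr
            if contiguous and seg_size + length <= max_seg_size:
                seg_size += length  # merge into the current segment
                prev = bv
                continue
        segments += 1  # start a new segment
        seg_size = length
        prev = bv
    return segments


# Four physically contiguous 4 KiB pages (hypothetical addresses).
pages = [(0x10000 + i * 4096, 4096) for i in range(4)]
print(count_segments(pages, 65536))
print(count_segments(pages, 65536, prev_tracks_current=True))
```

In this model the contiguous pages collapse into one segment when the
previous vector is tracked properly, and stay as four separate segments when
it is not, which matches the "more but smaller segments" behaviour of the
kernel without the commit.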

>
> I will try to do more experiments and to find the real culprit.

It may be helpful to enable 'block:*' trace events, and get/analyze the
traces close to the kernel warning.
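
One way to switch these events on, sketched in Python under the assumption
that the tracing directory is available at /sys/kernel/debug/tracing (it may
be mounted elsewhere on a given system) and that the script runs as root. The
function name is made up; the files it writes are the regular ftrace control
files.

```python
from pathlib import Path


def enable_block_events(tracing_root="/sys/kernel/debug/tracing"):
    """Enable all block:* trace events and return the path of the trace file."""
    root = Path(tracing_root)
    # Writing "1" to events/block/enable turns on every event in the
    # "block" subsystem at once.
    (root / "events" / "block" / "enable").write_text("1")
    # Make sure the ring buffer is actually recording.
    (root / "tracing_on").write_text("1")
    return root / "trace"
```

Calling enable_block_events() before starting the compile load and saving
the contents of the returned file once the ata errors show up gives the
block-layer events leading up to the timeout.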


* Re: block layer bug with 4.4-rc3+
  2015-12-16 15:43         ` Arnd Bergmann
@ 2015-12-17  3:55           ` Ming Lei
  2015-12-17  9:28           ` Andre Przywara
  1 sibling, 0 replies; 13+ messages in thread
From: Ming Lei @ 2015-12-17  3:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Andre Przywara, Jens Axboe, Rob Herring,
	Eric Auger, Linux Kernel Mailing List, linux-block

On Wed, Dec 16, 2015 at 11:43 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 16 December 2015 14:55:43 Andre Przywara wrote:
>> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
>> lose half of the RAM on that box) didn't show the bug so far.
>> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
>> turn to 64-bit, with long, int and void* still being 32-bit. Can you
>> think of any issues that could be related to that?
>>
>
> Another difference between the platforms is highmem. On Midway,
> almost all of RAM is highmem, which needs a lot of special handling
> that we don't need on platforms with less RAM.

Yeah, block bounce can be triggered for highmem generally.

>
> To rule out both of highmem and LPAE, it might be interesting to
> test again on Midway with CONFIG_HIGHMEM and CONFIG_LPAE disabled.
> This will limit the RAM to 768MB, but if the bug still shows up,
> you know it's something else.
>
>         Arnd

* Re: block layer bug with 4.4-rc3+
  2015-12-16 15:43         ` Arnd Bergmann
  2015-12-17  3:55           ` Ming Lei
@ 2015-12-17  9:28           ` Andre Przywara
  2015-12-17  9:46             ` Arnd Bergmann
  1 sibling, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2015-12-17  9:28 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arm-kernel
  Cc: Ming Lei, Jens Axboe, Rob Herring, Eric Auger,
	Linux Kernel Mailing List, linux-block

Tach Arnd,

On 16/12/15 15:43, Arnd Bergmann wrote:
> On Wednesday 16 December 2015 14:55:43 Andre Przywara wrote:
>> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
>> lose half of the RAM on that box) didn't show the bug so far.
>> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
>> turn to 64-bit, with long, int and void* still being 32-bit. Can you
>> think of any issues that could be related to that?
>>
> 
> Another difference between the platforms is highmem. On Midway,
> almost all of RAM is highmem, which needs a lot of special handling
> that we don't need on platforms with less RAM.

Good point, but ...

> To rule out both of highmem and LPAE, it might be interesting to
> test again on Midway with CONFIG_HIGHMEM and CONFIG_LPAE disabled.
> This will limit the RAM to 768MB, but if the bug still shows up,
> you know it's something else.

So it was running for almost a day without LPAE now, but with highmem,
and the bug didn't show up. So for the time being I'd avoid another test
run without highmem, as LPAE alone is sufficient to trigger it.
Also it seems that CONFIG_HIGHMEM is selected by MULTI_V7, so I can't
configure it out for this box, but would have to restrict memory to
768M, I guess.

Next I will try some tracing as Ming suggested.

Cheers,
Andre.


* Re: block layer bug with 4.4-rc3+
  2015-12-17  9:28           ` Andre Przywara
@ 2015-12-17  9:46             ` Arnd Bergmann
  0 siblings, 0 replies; 13+ messages in thread
From: Arnd Bergmann @ 2015-12-17  9:46 UTC (permalink / raw)
  To: Andre Przywara
  Cc: linux-arm-kernel, Ming Lei, Jens Axboe, Rob Herring, Eric Auger,
	Linux Kernel Mailing List, linux-block

On Thursday 17 December 2015 09:28:36 Andre Przywara wrote:
> 
> So it was running for almost a day without LPAE now, but with highmem,
> and the bug didn't show up. So for the time being I'd avoid another test
> run without highmem, as LPAE alone is sufficient to trigger it.

There is clearly no need to try with both LPAE and HIGHMEM disabled, that
would almost certainly not bring the bug back.

A more useful test might be to try Cubieboard with LPAE enabled, possibly
with LPAE and no highmem, to rule out the possibility that the bug just
gets less likely when you have less highmem.

> Also it seems that CONFIG_HIGHMEM is selected by MULTI_V7, so I can't
> configure it out for this box, but would have to restrict memory to
> 768M, I guess.
> 
> Next I will try some tracing as Ming suggested.

It should not be selected by MULTI_V7, the only 'select HIGHMEM' statement
I see is for ARCH_OMAP2PLUS_TYPICAL, and that is not a mandatory option
for anything, it just turns on a few things that are generally useful.

	Arnd


* Re: block layer bug with 4.4-rc3+
  2015-12-17  3:52         ` Ming Lei
@ 2015-12-17 12:32           ` Andre Przywara
  2015-12-17 12:33           ` Andre Przywara
  1 sibling, 0 replies; 13+ messages in thread
From: Andre Przywara @ 2015-12-17 12:32 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

Hi Ming,

On 17/12/15 03:52, Ming Lei wrote:
> On Wed, Dec 16, 2015 at 10:55 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Hi,
>>
>> On 15/12/15 13:39, Ming Lei wrote:
>>> On Tue, Dec 15, 2015 at 8:23 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>> Hi Ming,
>>>>
>>>> thanks for the answer!
>>>>
>>>> On 15/12/15 11:54, Ming Lei wrote:
>>>>> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've been experiencing issues with at least 4.4-rc3 (including current
>>>>>
>>>>> I'd suggest you to test the latest linus tree first, and at least two
>>>>> fix patches
>>>>> have been merged for blk-merge issue.  If there is still the issue
>>>>> with linus tree,
>>>>> I am happy to take a look.
>>>>
>>>> Mmh, as said ("including current HEAD") this happens still with the
>>>> latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
>>>> Just tested yesterday.
>>>> Is there another branch/tree with block fixes I should test? Is it worth
>>>> to try any of the upcoming branches in linux-block.git (for-4.5/core,
>>>> maybe?)
>>>
>>> Both the fixes have been in linus tree already, and reverting the commit
>>> basically makes merge not possible, so there must be issues somewhere.
>>>
>>> And can you see the issue on other 32bit ARM platform?  I don't see the
>>> issue on x86 and arm64, and the commit itself is correct, IMO.
>>
>> Quick tests on a Cubietruck didn't show the issue, but this board is
>> nowhere near the Midway (2 in-order cores with 2GB RAM vs. 4
>> out-of-order cores with 8 GB RAM), so the load isn't the same.
>> I could rule out .config issues by using multi_v7_defconfig - with LPAE
>> enabled on top, that is.
>> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
>> lose half of the RAM on that box) didn't show the bug so far.
>> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
>> turn to 64-bit, with long, int and void* still being 32-bit. Can you
>> think of any issues that could be related to that?
>>
>> Also can you briefly sketch what that patch (578270bfbd) eventually
>> changes? I see that the fix looks right, I am just wondering what the
>> impact is: Do we get more blocks or less or bigger ones or smaller?
> 
> Without the change, 'bvprvp' always points to 'bv', then each bio vector
> can't be merged to other bio vector, so each bvec becomes one single
> physical segment(convert to one single sg element in driver), finally the
> transfer size for each bio becomes much smaller, and size of each
> segment becomes much smaller, but segment number may become
> bigger.

Thanks a lot, that is exactly the explanation I was looking for.
Tracing this function didn't show any significant difference between the
two versions at first glance, though.
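
The explanation quoted above can be illustrated with a toy model. This is
a minimal Python sketch, not kernel code: the names Bvec, mergeable and
count_segments are made up for illustration, and the model ignores segment
size limits and boundary masks. It only shows why comparing a bvec with
itself (the "previous" pointer always pointing at the current bvec) never
merges anything.

```python
from typing import List, Optional, Tuple

# (physical address, length in bytes); a hypothetical stand-in for struct bio_vec
Bvec = Tuple[int, int]


def mergeable(prev: Bvec, cur: Bvec) -> bool:
    """True if cur starts exactly where prev ends (physically contiguous)."""
    return prev[0] + prev[1] == cur[0]


def count_segments(bvecs: List[Bvec], prev_tracks_current: bool) -> int:
    """Count physical segments in a list of bvecs.

    prev_tracks_current=True models the behaviour described for the kernel
    without the fix: the "previous" pointer always refers to the current
    bvec, so the contiguity check compares a bvec with itself and fails.
    """
    segments = 0
    prev: Optional[Bvec] = None
    for cur in bvecs:
        candidate = cur if prev_tracks_current else prev
        if candidate is None or not mergeable(candidate, cur):
            segments += 1  # cur starts a new physical segment
        prev = cur
    return segments


# Four physically contiguous 4 KiB pages followed by one page elsewhere.
pages = [(0x10000 + i * 4096, 4096) for i in range(4)] + [(0x80000, 4096)]
print(count_segments(pages, prev_tracks_current=True))   # 5
print(count_segments(pages, prev_tracks_current=False))  # 2
```

In this model the unfixed variant produces one small segment per bvec,
while the fixed variant produces fewer but larger segments, which matches
the description of more, smaller segments without the change.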

I will try to catch the actual differences and take it from there.

Cheers,
Andre.


* Re: block layer bug with 4.4-rc3+
  2015-12-17  3:52         ` Ming Lei
  2015-12-17 12:32           ` Andre Przywara
@ 2015-12-17 12:33           ` Andre Przywara
  2015-12-18  8:36             ` Ming Lei
  1 sibling, 1 reply; 13+ messages in thread
From: Andre Przywara @ 2015-12-17 12:33 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Rob Herring,
	Eric Auger, linux-arm-kernel

Hi Ming,

On 17/12/15 03:52, Ming Lei wrote:
> On Wed, Dec 16, 2015 at 10:55 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Hi,
>>
>> On 15/12/15 13:39, Ming Lei wrote:
>>> On Tue, Dec 15, 2015 at 8:23 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>> Hi Ming,
>>>>
>>>> thanks for the answer!
>>>>
>>>> On 15/12/15 11:54, Ming Lei wrote:
>>>>> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've been experiencing issues with at least 4.4-rc3 (including current
>>>>>
>>>>> I'd suggest you to test the latest linus tree first, and at least two
>>>>> fix patches
>>>>> have been merged for blk-merge issue.  If there is still the issue
>>>>> with linus tree,
>>>>> I am happy to take a look.
>>>>
>>>> Mmh, as said ("including current HEAD") this happens still with the
>>>> latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
>>>> Just tested yesterday.
>>>> Is there another branch/tree with block fixes I should test? Is it worth
>>>> to try any of the upcoming branches in linux-block.git (for-4.5/core,
>>>> maybe?)
>>>
>>> Both the fixes have been in linus tree already, and reverting the commit
>>> basically makes merge not possible, so there must be issues somewhere.
>>>
>>> And can you see the issue on other 32bit ARM platform?  I don't see the
>>> issue on x86 and arm64, and the commit itself is correct, IMO.
>>
>> Quick tests on a Cubietruck didn't show the issue, but this board is
>> nowhere near the Midway (2 in-order cores with 2GB RAM vs. 4
>> out-of-order cores with 8 GB RAM), so the load isn't the same.
>> I could rule out .config issues by using multi_v7_defconfig - with LPAE
>> enabled on top, that is.
>> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
>> lose half of the RAM on that box) didn't show the bug so far.
>> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
>> turn to 64-bit, with long, int and void* still being 32-bit. Can you
>> think of any issues that could be related to that?
>>
>> Also can you briefly sketch what that patch (578270bfbd) eventually
>> changes? I see that the fix looks right, I am just wondering what the
>> impact is: Do we get more blocks or less or bigger ones or smaller?
> 
> Without the change, 'bvprvp' always points to 'bv', then each bio vector
> can't be merged to other bio vector, so each bvec becomes one single
> physical segment(convert to one single sg element in driver), finally the
> transfer size for each bio becomes much smaller, and size of each
> segment becomes much smaller, but segment number may become
> bigger.
> 
>>
>> I will try to do more experiments and to find the real culprit.
> 
> It may be helpful to enable 'block:*' trace events, and get/analyze the
> traces close to the kernel warning.

Good hint.
I just enabled all block events, so it's a lot of data and I guess I
didn't catch the actual "bug moment" before the buffer was overwritten.
Do you know of any specific event that would be useful?

Anyway I see a _lot_ of these in there, even before the bug triggers:

block_dirty_buffer: 8,7 sector=18446744073709486080 size=4096
block_dirty_buffer: 8,8 sector=18446744073709486080 size=4096

So that long number is 0xffffffffffff0000. Is that some special value
for struct buffer_head.b_blocknr?
I see this in all versions, though, so with and without LPAE and on both
4.4-rc5 and with the patch in question reverted.

The type of this variable is sector_t, which is u64 with LBDAF defined
(which is enabled for me), but "unsigned long" without it.
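
The arithmetic behind that value can be checked with a few lines of
Python. This is only a sketch: the 32-bit truncation is shown purely as an
illustration of what a 32-bit "unsigned long" could hold, not as a claim
about what the kernel actually does with the value.

```python
# Value printed by the block_dirty_buffer trace events
SECTOR = 18446744073709486080

as_hex = hex(SECTOR)                # unsigned 64-bit view
as_signed = SECTOR - (1 << 64)      # same bit pattern read as signed 64-bit
low_32 = SECTOR & 0xFFFFFFFF        # lower 32 bits only (illustration)

print(as_hex)        # 0xffffffffffff0000
print(as_signed)     # -65536
print(hex(low_32))   # 0xffff0000
```

So the value is 2^64 - 65536, i.e. -65536 when interpreted as a signed
64-bit number.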

Does that ring a bell?

Thanks,
Andre.


* Re: block layer bug with 4.4-rc3+
  2015-12-17 12:33           ` Andre Przywara
@ 2015-12-18  8:36             ` Ming Lei
  0 siblings, 0 replies; 13+ messages in thread
From: Ming Lei @ 2015-12-18  8:36 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Jens Axboe, Rob Herring, Eric Auger, Linux Kernel Mailing List,
	linux-block, linux-arm-kernel

On Thu, Dec 17, 2015 at 8:33 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi Ming,
>
> On 17/12/15 03:52, Ming Lei wrote:
>> On Wed, Dec 16, 2015 at 10:55 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>> Hi,
>>>
>>> On 15/12/15 13:39, Ming Lei wrote:
>>>> On Tue, Dec 15, 2015 at 8:23 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>>> Hi Ming,
>>>>>
>>>>> thanks for the answer!
>>>>>
>>>>> On 15/12/15 11:54, Ming Lei wrote:
>>>>>> On Tue, Dec 15, 2015 at 7:05 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've been experiencing issues with at least 4.4-rc3 (including current
>>>>>>
>>>>>> I'd suggest you to test the latest linus tree first, and at least two
>>>>>> fix patches
>>>>>> have been merged for blk-merge issue.  If there is still the issue
>>>>>> with linus tree,
>>>>>> I am happy to take a look.
>>>>>
>>>>> Mmh, as said ("including current HEAD") this happens still with the
>>>>> latest HEAD from Linus (which is "9f9499ae8e64: Linux 4.4-rc5" for me).
>>>>> Just tested yesterday.
>>>>> Is there another branch/tree with block fixes I should test? Is it worth
>>>>> to try any of the upcoming branches in linux-block.git (for-4.5/core,
>>>>> maybe?)
>>>>
>>>> Both the fixes have been in linus tree already, and reverting the commit
>>>> basically makes merge not possible, so there must be issues somewhere.
>>>>
>>>> And can you see the issue on other 32bit ARM platform?  I don't see the
>>>> issue on x86 and arm64, and the commit itself is correct, IMO.
>>>
>>> Quick tests on a Cubietruck didn't show the issue, but this board is
>>> nowhere near the Midway (2 in-order cores with 2GB RAM vs. 4
>>> out-of-order cores with 8 GB RAM), so the load isn't the same.
>>> I could rule out .config issues by using multi_v7_defconfig - with LPAE
>>> enabled on top, that is.
>>> Using the plain multi_v7_defconfig (which doesn't have LPAE and makes me
>>> lose half of the RAM on that box) didn't show the bug so far.
>>> One of the effects of turning on LPAE is that dma_addr_t and phys_addr_t
>>> turn to 64-bit, with long, int and void* still being 32-bit. Can you
>>> think of any issues that could be related to that?
>>>
>>> Also can you briefly sketch what that patch (578270bfbd) eventually
>>> changes? I see that the fix looks right, I am just wondering what the
>>> impact is: Do we get more blocks or less or bigger ones or smaller?
>>
>> Without the change, 'bvprvp' always points to 'bv', then each bio vector
>> can't be merged to other bio vector, so each bvec becomes one single
>> physical segment(convert to one single sg element in driver), finally the
>> transfer size for each bio becomes much smaller, and size of each
>> segment becomes much smaller, but segment number may become
>> bigger.
>>
>>>
>>> I will try to do more experiments and to find the real culprit.
>>
>> It may be helpful to enable 'block:*' trace events, and get/analyze the
>> traces close to the kernel warning.
>
> Good hint.
> I just enabled all block events, so it's a lot of data and I guess I
> didn't catch the actual "bug moment" before the buffer was overwritten.
> Do you know of any specific event that would be useful?

That is easy:

1) you can figure out the faulting sector number from the SATA warning log
(see ata_eh_link_report); all related trace entries can then be extracted
by that sector number

OR

2) simply add a one-line trace_printk() to the function that prints the
warning, which then acts as a timestamp in the trace buffer.
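
For option 1, the sector number can be recovered from the "cmd" line of
the warning. The following is a minimal Python sketch under stated
assumptions: the field order is taken to be
command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device,
and for FPDMA QUEUED commands the sector count is assumed to live in the
feature fields. The helper name decode_ncq_cmd is made up for
illustration. The decoded byte count agrees with the "ncq 1048576" value
in the log, which supports that reading.

```python
import re

# Assumed layout of the taskfile dump: twelve hex bytes separated by '/' or ':'
CMD_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_ncq_cmd(line: str):
    """Return (lba, sector_count) for an NCQ read/write 'cmd' log line."""
    match = CMD_RE.search(line)
    if match is None:
        raise ValueError("no taskfile found in line")
    fields = [int(x, 16) for x in re.split(r"[/:]", match.group(1))]
    (_command, feature, _nsect, lbal, lbam, lbah,
     hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah, _device) = fields
    # 48-bit LBA: high-order bytes first, then the low-order bytes
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24 |
           lbah << 16 | lbam << 8 | lbal)
    # Sector count from the feature fields (assumption for FPDMA QUEUED)
    count = hob_feature << 8 | feature
    return lba, count


# First failed command from the report, joined onto one line
line = "ata1.00: cmd 61/00:20:48:6b:41/08:00:0a:00:00/40 tag 4 ncq 1048576 out"
lba, count = decode_ncq_cmd(line)
print(lba, count, count * 512)  # 172059464 2048 1048576
```

The resulting sector number can then be used to search the trace output
for events touching the same request.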

>
> Anyway I see a _lot_ of these in there, even before the bug triggers:
>
> block_dirty_buffer: 8,7 sector=18446744073709486080 size=4096
> block_dirty_buffer: 8,8 sector=18446744073709486080 size=4096
>
> So that long number is 0xffffffffffff0000. Is that some special value
> for struct buffer_head.b_blocknr?

I guess the above buffer_head isn't mapped yet, and the sector isn't valid.

> I see this in all versions, though, so with and without LPAE and on both
> 4.4-rc5 and with the patch in question reverted.
>
> The type of this variable is sector_t, which is u64 with LBDAF defined
> (which is enabled for me), but "unsigned long" without it.
>
> Does that ring a bell?


Thread overview: 13+ messages
2015-12-15 11:05 block layer bug with 4.4-rc3+ Andre Przywara
2015-12-15 11:54 ` Ming Lei
2015-12-15 12:23   ` Andre Przywara
2015-12-15 13:39     ` Ming Lei
2015-12-16 14:55       ` Andre Przywara
2015-12-16 15:43         ` Arnd Bergmann
2015-12-17  3:55           ` Ming Lei
2015-12-17  9:28           ` Andre Przywara
2015-12-17  9:46             ` Arnd Bergmann
2015-12-17  3:52         ` Ming Lei
2015-12-17 12:32           ` Andre Przywara
2015-12-17 12:33           ` Andre Przywara
2015-12-18  8:36             ` Ming Lei
