linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Breuer <mbreuer@majjas.com>
To: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Jarek Poplawski <jarkao2@gmail.com>,
	David Miller <davem@davemloft.net>,
	akpm@linux-foundation.org, flyboy@gmail.com,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Michael Chan <mchan@broadcom.com>, Don Fry <pcnet32@verizon.net>,
	Francois Romieu <romieu@fr.zoreil.com>,
	Matt Carlson <mcarlson@broadcom.com>
Subject: Re: Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync)
Date: Thu, 28 Jan 2010 11:43:16 -0500	[thread overview]
Message-ID: <4B61BEA4.1030905@majjas.com> (raw)
In-Reply-To: <4B61ADF1.7060705@majjas.com>

On 01/28/2010 10:32 AM, Michael Breuer wrote:
> On 01/27/2010 01:45 PM, Michael Breuer wrote:
>> On 1/27/2010 1:08 PM, Michael Breuer wrote:
>>>
>>>
>>> I've got (both in 2.6.32.4 and 2.6.33-rc5: pci_unmap_len(re, 
>>> data_size) vs., "length." I assume that I can just replace the 
>>> pci_unmap_len with dma_size... but perhaps the intermediate change 
>>> may have affected this as well?
>>>
>> Never mind - that was from one of the earlier patches I had been 
>> trying out. will try the above patch after reestablishing that the 
>> system still crashes without copybreak=1.
> Just FYI - still crashes with default copybreak.  Didn't get the 
> netdev watchdog this time - just DMAR and then HW watchdog reboot (see 
> below).
>
> So what's known to be required to cause this crash:
>
> 1) sky2 @ 1Gb
> 2) High sustained RX load (> 40MBps)
> 3) Uptime (I can't cause this to happen just after boot).
> 4) DMAR enabled (doesn't crash w/o DMAR).
> 5) copybreak != 1
>
> What might be required but is unproven:
> 1) cifs traffic (I've only seen this when the high traffic was due to 
> a Win7 box doing backup). I've tried but have been unable to recreate 
> by just copying large files. Backups done from a Mac OS laptop don't 
> trigger the issue even though that machine is also connecting with 
> CIFS (TimeMachine works better that way).
> 2) DHCP traffic. There has always been some sort of DHCP exchange in 
> the log before the first indication of a problem (DMAR).
> 3) Total throughput since boot. DK about this - however the uptime 
> component before the latest crash was the shortest yet. In preparation 
> I moved a bunch of large files around on the Windows box to ensure a 
> larger than normal backup run. I also ran manually before going to bed 
> (then moved the files around again). Didn't crash when I was watching 
> - but did overnight. Total uptime before this crash was only about 6 
> hours. Previously (with less backup data) the system didn't crash 
> until 24-36 hours.
>
> Observations:
>
> Copybreak: I did play for an hour or so yesterday with copybreak=1000. 
> Ran traffic, etc. No crash, but throughput was lower and the system 
> was clearly working way harder than normal. Given the whine of the 
> fans I'm not keen on leaving the system in that state for any extended 
> period of time.
>
> MTU: Increasing the MTU to 9000 yesterday after the system had been up 
> for some time (copybreak=1) crashed the system immediately. 
> Subsequently I have been able to change the mtu without crashes 
> (although the driver does end up in some sort of state that requires a 
> restart after lowering the mtu). I suspect that over time something is 
> being corrupted resulting in the crash when changing mtu. Whatever it 
> becoming corrupted is probably related to the other crash as well. 
> That suggests to me that copybreak=1 is preventing or delaying the 
> manifestation of the underlying issue but is unrelated to the source 
> of corruption.
>
> [no messages in the prior three minutes - there was a dhcp exchange 
> (request/ack) at 06:02:27]
> Jan 28 06:05:58 mail kernel: DRHD: handling fault status reg 2
> Jan 28 06:05:58 mail kernel: DMAR:[DMA Read] Request device [06:00.0] 
> fault addr ffdd06bfe000
> Jan 28 06:05:58 mail kernel: DMAR:[fault reason 06] PTE Read access is 
> not set
> Jan 28 06:05:58 mail kernel: sky2 0000:06:00.0: error interrupt 
> status=0x80000000
> Jan 28 06:05:58 mail kernel: sky2 0000:06:00.0: PCI hardware error 
> (0x2010)
> [No further messages until restart at 06:09:46.]
>
Update: I played with dma-debug. Was being disabled due to lack of 
memory. I forced it back on while pumping traffic through and got this:
Jan 28 11:39:30 mail kernel: ------------[ cut here ]------------
Jan 28 11:39:30 mail kernel: WARNING: at lib/dma-debug.c:902 
check_sync+0xc1/0x43f()
Jan 28 11:39:30 mail kernel: Hardware name: System Product Name
Jan 28 11:39:30 mail kernel: sky2 0000:06:00.0: DMA-API: device driver 
tries to sync DMA memory it has not allocated [device 
address=0x0000ffff4fe37022] [size=1520 bytes]
Jan 28 11:39:30 mail kernel: Modules linked in: microcode(+) 
ip6table_filter ip6table_mangle ip6_tables iptable_raw iptable_mangle 
ipt_MASQUERADE iptable_nat nf_nat bridge stp appletalk psnap llc nfsd 
lockd nfs_acl auth_rpcgss exportfs hwmon_vid coretemp sunrpc 
acpi_cpufreq sit tunnel4 ipt_LOG nf_conntrack_netbios_ns 
nf_conntrack_ftp xt_DSCP xt_dscp xt_MARK nf_conntrack_ipv6 xt_multiport 
ipv6 dm_multipath kvm_intel kvm snd_hda_codec_analog snd_ens1371 
gameport snd_rawmidi gspca_spca505 snd_hda_intel snd_ac97_codec 
gspca_main snd_hda_codec videodev snd_hwdep snd_seq v4l1_compat i2c_i801 
pcspkr ac97_bus v4l2_compat_ioctl32 snd_seq_device asus_atk0110 hwmon 
snd_pcm firewire_ohci firewire_core crc_itu_t sky2 snd_timer snd 
iTCO_wdt iTCO_vendor_support wmi soundcore snd_page_alloc fbcon tileblit 
font bitblit softcursor raid456 async_raid6_recov async_pq raid6_pq 
async_xor xor async_memcpy async_tx raid1 ata_generic pata_acpi 
pata_marvell nouveau ttm drm_kms_helper drm agpgart fb i2c_algo_bit 
cfbcopyarea i2c_core cfb
Jan 28 11:39:30 mail kernel: imgblt cfbfillrect [last unloaded: ip6_tables]
Jan 28 11:39:30 mail kernel: Pid: 5327, comm: bash Tainted: G        W  
2.6.32.4MMAPDMARAF3SKY2PSKBMAYPULL-00912-g914160d-dirty #6
Jan 28 11:39:30 mail kernel: Call Trace:
Jan 28 11:39:30 mail kernel: <IRQ>  [<ffffffff810536ee>] 
warn_slowpath_common+0x7c/0x94
Jan 28 11:39:30 mail kernel: [<ffffffff8105375d>] 
warn_slowpath_fmt+0x41/0x43
Jan 28 11:39:30 mail kernel: [<ffffffff8127b891>] check_sync+0xc1/0x43f
Jan 28 11:39:30 mail kernel: [<ffffffff8146c51a>] ? 
_spin_unlock_irqrestore+0x29/0x41
Jan 28 11:39:30 mail kernel: [<ffffffff813cac10>] ? 
__netdev_alloc_skb+0x34/0x50
Jan 28 11:39:30 mail kernel: [<ffffffff8127bf62>] 
debug_dma_sync_single_for_cpu+0x42/0x44
Jan 28 11:39:30 mail kernel: [<ffffffff813cac10>] ? 
__netdev_alloc_skb+0x34/0x50
Jan 28 11:39:30 mail kernel: [<ffffffffa019aee8>] sky2_poll+0x4d5/0xb06 
[sky2]
Jan 28 11:39:30 mail kernel: [<ffffffff81044840>] ? 
enqueue_entity+0x26c/0x279
Jan 28 11:39:30 mail kernel: [<ffffffff8107decf>] ? 
clockevents_program_event+0x7a/0x83
Jan 28 11:39:30 mail kernel: [<ffffffff813d18ae>] net_rx_action+0xb5/0x1f3
Jan 28 11:39:30 mail kernel: [<ffffffff8105af0f>] __do_softirq+0xf8/0x1cd
Jan 28 11:39:30 mail kernel: [<ffffffff810a3006>] ? 
handle_IRQ_event+0x119/0x12b
Jan 28 11:39:30 mail kernel: [<ffffffff81012e1c>] call_softirq+0x1c/0x30
Jan 28 11:39:30 mail kernel: [<ffffffff810143a3>] do_softirq+0x4b/0xa6
Jan 28 11:39:30 mail kernel: [<ffffffff8105aaef>] irq_exit+0x4a/0x8c
Jan 28 11:39:30 mail kernel: [<ffffffff81470575>] do_IRQ+0xa5/0xbc
Jan 28 11:39:30 mail kernel: [<ffffffff81012613>] ret_from_intr+0x0/0x16
Jan 28 11:39:30 mail kernel: <EOI>
Jan 28 11:39:30 mail kernel: ---[ end trace 57f7151f6a5def07 ]---
Jan 28 11:39:30 mail kernel: DMA-API: debugging out of memory - disabling




  reply	other threads:[~2010-01-28 16:43 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-20  9:41 [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync Jarek Poplawski
2010-01-20 18:03 ` Stephen Hemminger
2010-01-20 20:11   ` Michael Chan
2010-01-20 20:30     ` Stephen Hemminger
2010-01-20 20:58       ` Jarek Poplawski
2010-01-20 22:50         ` David Miller
2010-01-20 22:45       ` David Miller
2010-01-20 18:09 ` Stephen Hemminger
2010-01-20 22:24 ` Alan Cox
2010-01-20 22:53   ` David Miller
2010-01-20 22:53   ` Jarek Poplawski
2010-01-21 15:22     ` FUJITA Tomonori
2010-01-21 18:41       ` Jarek Poplawski
2010-01-22  5:11         ` FUJITA Tomonori
2010-01-22  6:38           ` David Miller
2010-02-03  1:18             ` FUJITA Tomonori
2010-02-03  1:27               ` David Miller
2010-01-21 19:59 ` Michael Breuer
2010-01-21 20:41   ` Jarek Poplawski
2010-01-21 20:46     ` Michael Breuer
2010-01-21 21:02       ` Jarek Poplawski
2010-01-22 18:01     ` Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync) Michael Breuer
2010-01-22 21:53       ` Jarek Poplawski
2010-01-22 22:14         ` Michael Breuer
2010-01-22 23:06           ` Jarek Poplawski
2010-01-22 23:25             ` Michael Breuer
2010-01-22 23:46               ` Jarek Poplawski
2010-01-22 23:50                 ` Michael Breuer
2010-01-23 23:21                   ` Jarek Poplawski
2010-01-24  1:53                     ` Michael Breuer
2010-01-27 15:34                     ` Michael Breuer
2010-01-27 16:50                       ` Stephen Hemminger
2010-01-27 16:57                         ` Michael Breuer
2010-01-27 17:45                           ` Stephen Hemminger
2010-01-27 17:57                             ` Michael Breuer
2010-01-27 18:33                               ` Michael Breuer
2010-01-27 23:54                             ` Hang: 2.6.32.4 sky2/DMAR David Miller
2010-01-27 17:56                           ` Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync) Stephen Hemminger
2010-01-27 17:58                             ` Michael Breuer
2010-01-27 18:08                             ` Michael Breuer
2010-01-27 18:45                               ` Michael Breuer
2010-01-27 19:23                                 ` Jarek Poplawski
2010-01-27 19:32                                   ` Jarek Poplawski
2010-01-28 15:32                                 ` Michael Breuer
2010-01-28 16:43                                   ` Michael Breuer [this message]
2010-01-28 17:08                                     ` Stephen Hemminger
2010-01-28 18:46                                       ` Michael Breuer
2010-01-28 22:34                                         ` Jarek Poplawski
2010-01-28 22:43                                           ` Michael Breuer
2010-01-28 22:56                                             ` Jarek Poplawski
2010-01-28 22:59                                               ` Michael Breuer
2010-01-28 23:36                                                 ` [PATCH] sky2: receive dma mapping error handling Stephen Hemminger
2010-01-29  0:05                                                   ` Michael Breuer
2010-01-30 16:30                                                   ` Michael Breuer
2010-01-30 16:31                                                   ` Michael Breuer
2010-01-31  0:34                                                     ` Jarek Poplawski
2010-01-31  4:17                                                       ` Michael Breuer
2010-01-31 22:25                                                         ` Jarek Poplawski
2010-01-31 23:58                                                           ` Michael Breuer
2010-01-31  4:55                                                       ` Michael Breuer
2010-01-31 18:50                                                         ` Michael Breuer
2010-01-31 21:58                                                           ` Michael Breuer
2010-01-31 22:18                                                             ` Jarek Poplawski
2010-02-01  0:19                                                               ` Michael Breuer
2010-02-01  4:26                                                                 ` Michael Breuer
2010-02-01 10:47                                                                   ` Jarek Poplawski
2010-02-01  9:17                                                                 ` [PATCH v2] sky2: Fix transmit dma mapping handling Jarek Poplawski
2010-02-01 17:52                                                                   ` Michael Breuer
2010-02-01 18:08                                                               ` [PATCH] sky2: receive dma mapping error handling Stephen Hemminger
2010-02-01 18:20                                                               ` Stephen Hemminger
2010-02-01 18:44                                                                 ` Michael Breuer
2010-02-01 20:13                                                                 ` Jarek Poplawski
2010-02-01 20:41                                                                   ` Jarek Poplawski
2010-02-01 21:27                                                                 ` [PATCH v3] " Jarek Poplawski
2010-02-01 22:29                                                                   ` Stephen Hemminger
2010-02-01 22:46                                                                     ` Jarek Poplawski
2010-02-01 22:51                                                                       ` Stephen Hemminger
2010-02-01 21:42                                                                 ` [PATCH v3b resent] sky2: Fix transmit dma mapping handling Jarek Poplawski
2010-02-03  4:07                                                                 ` [PATCH] sky2: receive dma mapping error handling Michael Breuer
2010-02-03 16:47                                                                   ` Michael Breuer
2010-02-03 16:56                                                                     ` Stephen Hemminger
2010-02-03 17:07                                                                       ` Michael Breuer
2010-02-03 18:23                                                                         ` Justin P. Mattock
2010-02-03 18:25                                                                           ` Stephen Hemminger
2010-02-03 18:48                                                                             ` Justin P. Mattock
2010-02-03 17:16                                                                       ` Justin P. Mattock
2010-02-02 22:44                                                   ` Andi Kleen
2012-01-16 16:39       ` Regression: sky2 kernel between 3.1 and 3.2.1 (last known good 3.0.9) Michael Breuer
2012-01-20 14:24         ` Michael Breuer
2012-01-20 16:10           ` Stephen Hemminger
2012-01-20 16:17             ` Michael Breuer
2012-01-20 16:26         ` Stephen Hemminger
2012-01-20 16:44           ` Michael Breuer
2012-01-21 15:29             ` Michael Breuer
2012-01-22 18:03               ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B61BEA4.1030905@majjas.com \
    --to=mbreuer@majjas.com \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=flyboy@gmail.com \
    --cc=jarkao2@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcarlson@broadcom.com \
    --cc=mchan@broadcom.com \
    --cc=netdev@vger.kernel.org \
    --cc=pcnet32@verizon.net \
    --cc=romieu@fr.zoreil.com \
    --cc=shemminger@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).