From: Michael Breuer <mbreuer@majjas.com>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: Jarek Poplawski <jarkao2@gmail.com>,
David Miller <davem@davemloft.net>,
Stephen Hemminger <shemminger@linux-foundation.org>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: Regression: sky2 kernel between 3.1 and 3.2.1 (last known good 3.0.9)
Date: Fri, 20 Jan 2012 11:17:51 -0500
Message-ID: <4F1993AF.1020303@majjas.com>
In-Reply-To: <20120120081059.1deb4468@s6510.linuxnetplumber.net>
On 1/20/2012 11:10 AM, Stephen Hemminger wrote:
> On Fri, 20 Jan 2012 09:24:38 -0500
> Michael Breuer<mbreuer@majjas.com> wrote:
>
>> On 1/16/2012 11:39 AM, Michael Breuer wrote:
>>> Synopsis:
>>>
>>> Receiving DMAR and other errors after approximately three days of
>>> uptime. The symptoms exactly match errors seen and then fixed around
>>> 2.6.32.4.
>>>
>>> While the system remains unaffected for too long to do a bisect, I was
>>> able to confirm that the problem exists in the 3.1 stable branch (I
>>> jumped from 3.0 to 3.2 when 3.2 was released).
>>>
>>> For now I reverted to the sky2.c from 3.0.9 and am running the rest of
>>> the kernel from 3.1.2, but won't be certain that this works until
>>> later in the week.
>>>
>>> Note that 20 seconds prior to the log extract below were DHCP renewal
>>> attempts on eth1, the issue below was on eth0. Not sure it's relevant,
>>> however back in 2010 a preceding DHCP event did turn out to be
>>> relevant to the manifestation of the bug.
>>>
>>> The 3.2.1-dirty I'm running is from git with a single local patch -
>>> for sidewinder force-feedback support (shouldn't be relevant to the
>>> sky2 issue).
>>>
>>> Log extract:
>>>
>>> Jan 16 05:49:46 mail kernel: [198230.628919] DRHD: handling fault
>>> status reg 2
>>> Jan 16 05:49:46 mail kernel: [198230.628925] sky2 0000:06:00.0: error
>>> interrupt status=0x80000000
>>> Jan 16 05:49:46 mail kernel: [198230.628929] DMAR:[DMA Read] Request
>>> device [06:00.0] fault addr fff78000
>>> Jan 16 05:49:46 mail kernel: [198230.628931] DMAR:[fault reason 06]
>>> PTE Read access is not set
>>> Jan 16 05:49:46 mail kernel: [198230.628939] sky2 0000:06:00.0: PCI
>>> hardware error (0x2010)
>>> Jan 16 05:49:53 mail dhclient[1616]: DHCPREQUEST on eth1 to
>>> 10.240.184.29 port 67
>>> Jan 16 05:50:01 mail kernel: [198246.288400] ------------[ cut here
>>> ]------------
>>> Jan 16 05:50:01 mail kernel: [198246.288408] WARNING: at
>>> net/sched/sch_generic.c:255 dev_watchdog+0x247/0x250()
>>> Jan 16 05:50:01 mail kernel: [198246.288411] Hardware name: System
>>> Product Name
>>> Jan 16 05:50:01 mail kernel: [198246.288413] NETDEV WATCHDOG: eth0
>>> (sky2): transmit queue 0 timed out
>>> Jan 16 05:50:01 mail kernel: [198246.288415] Modules linked in: tcp_lp
>>> cpufreq_stats ebtable_nat ebtables nf_conntrack_netbios_ns
>>> nf_conntrack_broadcast ip6table_mangle ip6table_filter ip6_tables
>>> iptable_mangle ipt_MASQUERADE iptable_nat nf_nat iptable_raw tun
>>> bridge stp llc lockd sit tunnel4 ipt_LOG nf_conntrack_ftp
>>> nf_conntrack_ipv6 nf_defrag_ipv6 xt_CHECKSUM xt_multiport xt_DSCP
>>> w83627ehf xt_mark xt_dscp hwmon_vid binfmt_misc raid1 btrfs sunrpc
>>> zlib_deflate libcrc32c snd_hda_codec_analog snd_ens1371 gameport
>>> snd_hda_intel snd_rawmidi snd_ac97_codec snd_hda_codec snd_hwdep
>>> ac97_bus snd_seq snd_seq_device snd_pcm gspca_spca505 snd_timer
>>> gspca_main snd videodev media soundcore i2c_i801 iTCO_wdt microcode
>>> v4l2_compat_ioctl32 snd_page_alloc i7core_edac sky2 edac_core pcspkr
>>> iTCO_vendor_support virtio_net virtio virtio_ring kvm_intel kvm uinput
>>> ipv6 raid456 async_raid6_recov async_pq raid6_pq async_xor
>>> firewire_ohci firewire_core pata_acpi ata_generic xor async_memcpy
>>> async_tx crc_itu_t pata_marvell nouveau ttm d
>>> Jan 16 05:50:01 mail kernel: rm_kms_helper drm i2c_algo_bit i2c_core
>>> mxm_wmi video [last unloaded: nf_conntrack_broadcast]
>>> Jan 16 05:50:01 mail kernel: [198246.288487] Pid: 0, comm: swapper/0
>>> Tainted: G W 3.2.1-dirty #1
>>> Jan 16 05:50:01 mail kernel: [198246.288489] Call Trace:
>>> Jan 16 05:50:01 mail kernel: [198246.288491]<IRQ>
>>> [<ffffffff81050a4f>] warn_slowpath_common+0x7f/0xc0
>>> Jan 16 05:50:01 mail kernel: [198246.288501] [<ffffffff8101f0bd>] ?
>>> lapic_next_event+0x1d/0x30
>>> Jan 16 05:50:01 mail kernel: [198246.288504] [<ffffffff81050b46>]
>>> warn_slowpath_fmt+0x46/0x50
>>> Jan 16 05:50:01 mail kernel: [198246.288509] [<ffffffff81009319>] ?
>>> read_tsc+0x9/0x20
>>> Jan 16 05:50:01 mail kernel: [198246.288513] [<ffffffff814a81e7>]
>>> dev_watchdog+0x247/0x250
>>> Jan 16 05:50:01 mail kernel: [198246.288518] [<ffffffff8105fbbb>]
>>> run_timer_softirq+0x12b/0x3b0
>>> Jan 16 05:50:01 mail kernel: [198246.288521] [<ffffffff814a7fa0>] ?
>>> qdisc_reset+0x50/0x50
>>> Jan 16 05:50:01 mail kernel: [198246.288525] [<ffffffff81057d18>]
>>> __do_softirq+0xa8/0x210
>>> Jan 16 05:50:01 mail kernel: [198246.288529] [<ffffffff8157496c>]
>>> call_softirq+0x1c/0x30
>>> Jan 16 05:50:01 mail kernel: [198246.288533] [<ffffffff810041e5>]
>>> do_softirq+0x65/0xa0
>>> Jan 16 05:50:01 mail kernel: [198246.288536] [<ffffffff810580fe>]
>>> irq_exit+0x8e/0xb0
>>> Jan 16 05:50:01 mail kernel: [198246.288539] [<ffffffff815750a3>]
>>> do_IRQ+0x63/0xe0
>>> Jan 16 05:50:01 mail kernel: [198246.288543] [<ffffffff8156ad2e>]
>>> common_interrupt+0x6e/0x6e
>>> Jan 16 05:50:01 mail kernel: [198246.288545]<EOI>
>>> [<ffffffff81307b6d>] ? intel_idle+0xed/0x150
>>> Jan 16 05:50:01 mail kernel: [198246.288551] [<ffffffff81307b4f>] ?
>>> intel_idle+0xcf/0x150
>>> Jan 16 05:50:01 mail kernel: [198246.288555] [<ffffffff8144d331>]
>>> cpuidle_idle_call+0xc1/0x280
>>> Jan 16 05:50:01 mail kernel: [198246.288559] [<ffffffff8100122a>]
>>> cpu_idle+0xca/0x120
>>> Jan 16 05:50:01 mail kernel: [198246.288563] [<ffffffff8154741e>]
>>> rest_init+0x72/0x74
>>> Jan 16 05:50:01 mail kernel: [198246.288568] [<ffffffff81b6abdd>]
>>> start_kernel+0x3b5/0x3c0
>>> Jan 16 05:50:01 mail kernel: [198246.288572] [<ffffffff81b6a32b>]
>>> x86_64_start_reservations+0x132/0x136
>>> Jan 16 05:50:01 mail kernel: [198246.288576] [<ffffffff81b6a140>] ?
>>> early_idt_handlers+0x140/0x140
>>> Jan 16 05:50:01 mail kernel: [198246.288580] [<ffffffff81b6a431>]
>>> x86_64_start_kernel+0x102/0x111
>>> Jan 16 05:50:01 mail kernel: [198246.288583] ---[ end trace
>>> bb26011d21a2b1d7 ]---
>>> Jan 16 05:50:01 mail kernel: [198246.288586] sky2 0000:06:00.0: eth0:
>>> tx timeout
>>> Jan 16 05:50:01 mail kernel: [198246.288593] sky2 0000:06:00.0: eth0:
>>> transmit ring 115 .. 10 report=115 done=115
>>>
>>>
>>>
>> FYI - I've been up for four days now without issues running on 3.2.1 +
>> sky2.c from 3.0.9. Looks like the issue is in fact in one of the
>> modifications made in sky2.c between those two releases.
> Since only you seem to be able to reproduce it, most likely the
> bisect burden will be on you. If you know it is only one file,
> then bisecting that file is fairly quick.
>
As of now, I have no reliable way to reproduce, so this is likely to
take about 3-4 days per bisect run, and more if it doesn't fail.
Any suggestions as to diagnostic code to put in, or a specific bias
towards one version or another, may reduce the time significantly.
I've also got some windows where I have to leave a stable version up.
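
For a rough sense of the cost: bisecting only the commits that touch one file (git bisect accepts a path limit, as in `git bisect start <bad> <good> -- <path>`) is a binary search, so the number of kernels to test grows with the logarithm of the candidate commit count. The sketch below is only an estimate under stated assumptions: the commit count of 16 is a hypothetical placeholder, not the real number of sky2.c changes between 3.0.9 and 3.2.1, and the function names are made up for illustration. The 3-4 days per run is the figure given above.

```python
def bisect_steps(candidate_commits: int) -> int:
    """Worst-case number of kernels to test when binary-searching
    `candidate_commits` commits, one of which is the first bad one."""
    if candidate_commits < 1:
        raise ValueError("need at least one candidate commit")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return (candidate_commits - 1).bit_length()


def bisect_days(candidate_commits: int, days_per_step: float) -> float:
    """Worst-case wall-clock days if every test run takes `days_per_step` days."""
    return bisect_steps(candidate_commits) * days_per_step


if __name__ == "__main__":
    commits = 16  # hypothetical count of commits touching sky2.c; not measured
    for days in (3, 4):  # 3-4 days per bisect run, as estimated in the message
        print(f"{commits} commits, {days} days/run: "
              f"up to {bisect_days(commits, days)} days")
```

Real git history is not always linear, so git bisect may need a step more or fewer than this estimate. The real candidate count can be read from `git log --oneline <good>..<bad> -- <path>`.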
Thread overview: 95+ messages
2010-01-20 9:41 [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync Jarek Poplawski
2010-01-20 18:03 ` Stephen Hemminger
2010-01-20 20:11 ` Michael Chan
2010-01-20 20:30 ` Stephen Hemminger
2010-01-20 20:58 ` Jarek Poplawski
2010-01-20 22:50 ` David Miller
2010-01-20 22:45 ` David Miller
2010-01-20 18:09 ` Stephen Hemminger
2010-01-20 22:24 ` Alan Cox
2010-01-20 22:53 ` David Miller
2010-01-20 22:53 ` Jarek Poplawski
2010-01-21 15:22 ` FUJITA Tomonori
2010-01-21 18:41 ` Jarek Poplawski
2010-01-22 5:11 ` FUJITA Tomonori
2010-01-22 6:38 ` David Miller
2010-02-03 1:18 ` FUJITA Tomonori
2010-02-03 1:27 ` David Miller
2010-01-21 19:59 ` Michael Breuer
2010-01-21 20:41 ` Jarek Poplawski
2010-01-21 20:46 ` Michael Breuer
2010-01-21 21:02 ` Jarek Poplawski
2010-01-22 18:01 ` Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync) Michael Breuer
2010-01-22 21:53 ` Jarek Poplawski
2010-01-22 22:14 ` Michael Breuer
2010-01-22 23:06 ` Jarek Poplawski
2010-01-22 23:25 ` Michael Breuer
2010-01-22 23:46 ` Jarek Poplawski
2010-01-22 23:50 ` Michael Breuer
2010-01-23 23:21 ` Jarek Poplawski
2010-01-24 1:53 ` Michael Breuer
2010-01-27 15:34 ` Michael Breuer
2010-01-27 16:50 ` Stephen Hemminger
2010-01-27 16:57 ` Michael Breuer
2010-01-27 17:45 ` Stephen Hemminger
2010-01-27 17:57 ` Michael Breuer
2010-01-27 18:33 ` Michael Breuer
2010-01-27 23:54 ` Hang: 2.6.32.4 sky2/DMAR David Miller
2010-01-27 17:56 ` Hang: 2.6.32.4 sky2/DMAR (was [PATCH] sky2: Fix WARNING: at lib/dma-debug.c:902 check_sync) Stephen Hemminger
2010-01-27 17:58 ` Michael Breuer
2010-01-27 18:08 ` Michael Breuer
2010-01-27 18:45 ` Michael Breuer
2010-01-27 19:23 ` Jarek Poplawski
2010-01-27 19:32 ` Jarek Poplawski
2010-01-28 15:32 ` Michael Breuer
2010-01-28 16:43 ` Michael Breuer
2010-01-28 17:08 ` Stephen Hemminger
2010-01-28 18:46 ` Michael Breuer
2010-01-28 22:34 ` Jarek Poplawski
2010-01-28 22:43 ` Michael Breuer
2010-01-28 22:56 ` Jarek Poplawski
2010-01-28 22:59 ` Michael Breuer
2010-01-28 23:36 ` [PATCH] sky2: receive dma mapping error handling Stephen Hemminger
2010-01-29 0:05 ` Michael Breuer
2010-01-30 16:30 ` Michael Breuer
2010-01-30 16:31 ` Michael Breuer
2010-01-31 0:34 ` Jarek Poplawski
2010-01-31 4:17 ` Michael Breuer
2010-01-31 22:25 ` Jarek Poplawski
2010-01-31 23:58 ` Michael Breuer
2010-01-31 4:55 ` Michael Breuer
2010-01-31 18:50 ` Michael Breuer
2010-01-31 21:58 ` Michael Breuer
2010-01-31 22:18 ` Jarek Poplawski
2010-02-01 0:19 ` Michael Breuer
2010-02-01 4:26 ` Michael Breuer
2010-02-01 10:47 ` Jarek Poplawski
2010-02-01 9:17 ` [PATCH v2] sky2: Fix transmit dma mapping handling Jarek Poplawski
2010-02-01 17:52 ` Michael Breuer
2010-02-01 18:08 ` [PATCH] sky2: receive dma mapping error handling Stephen Hemminger
2010-02-01 18:20 ` Stephen Hemminger
2010-02-01 18:44 ` Michael Breuer
2010-02-01 20:13 ` Jarek Poplawski
2010-02-01 20:41 ` Jarek Poplawski
2010-02-01 21:27 ` [PATCH v3] " Jarek Poplawski
2010-02-01 22:29 ` Stephen Hemminger
2010-02-01 22:46 ` Jarek Poplawski
2010-02-01 22:51 ` Stephen Hemminger
2010-02-01 21:42 ` [PATCH v3b resent] sky2: Fix transmit dma mapping handling Jarek Poplawski
2010-02-03 4:07 ` [PATCH] sky2: receive dma mapping error handling Michael Breuer
2010-02-03 16:47 ` Michael Breuer
2010-02-03 16:56 ` Stephen Hemminger
2010-02-03 17:07 ` Michael Breuer
2010-02-03 18:23 ` Justin P. Mattock
2010-02-03 18:25 ` Stephen Hemminger
2010-02-03 18:48 ` Justin P. Mattock
2010-02-03 17:16 ` Justin P. Mattock
2010-02-02 22:44 ` Andi Kleen
2012-01-16 16:39 ` Regression: sky2 kernel between 3.1 and 3.2.1 (last known good 3.0.9) Michael Breuer
2012-01-20 14:24 ` Michael Breuer
2012-01-20 16:10 ` Stephen Hemminger
2012-01-20 16:17 ` Michael Breuer [this message]
2012-01-20 16:26 ` Stephen Hemminger
2012-01-20 16:44 ` Michael Breuer
2012-01-21 15:29 ` Michael Breuer
2012-01-22 18:03 ` Stephen Hemminger