linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Zheng, Lv" <lv.zheng@intel.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	"Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Wysocki, Rafael J" <rafael.j.wysocki@intel.com>,
	"Brown, Len" <len.brown@intel.com>, Lv Zheng <zetalog@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>
Subject: RE: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
Date: Wed, 19 Nov 2014 08:55:25 +0000	[thread overview]
Message-ID: <1AE640813FDE7649BE1B193DEA596E88026A36FB@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <4486101.LWS7CexbAj@vostro.rjw.lan>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 8859 bytes --]

Hi, Rafael

I think you know this issue.

 [PATCH 1] can trigger this dead lock because it is actually based on another GPE dead lock fixing series.
I have fixed the dead lock in acpi_ev_gpe_detect() or acpi_ev_gpe_dispatch().
The problem is they haven't been upstreamed to ACPICA, so I couldn't post them here.

I was thinking we can work this around by applying the acpi_os_wait_events_complete() enhancement support prior than applying this  because it can only happen in suspend.
But it seems this can also be triggered during boot.

So we can have 3 choices here in order to merge this series:
1. Merging the GPE dead lock fix before it is merged in the ACPICA upstream.
2. Changing [PATCH 1] and do not hold EC lock currently (though it is racy, it is currently racy).
3. Reverting [PATCH 1-4] and wait until GPE dead lock fixed in ACPICA upstream.
Which one do you prefer?

IMO, we have several issues, their fixes form a dependency circle:
1. GPE dead lock: it may depends on DISPATCH_METHOD flushing (we shouldn't bump enabling status up in acpi_ev_asynch_enable_gpe())
2. EC transaction flushing: it depends on the GPE dead lock
3. EC event polling: it depends on the EC transaction flushing, this is required to support EC event draining as mentioned in bugzilla 44161
4. DISPATCH_METHOD flushing: it depends on EC event polling, if we don't move EC query from the _Lxx/_Exx work queue, then it may block DISPATCH_METHOD flushing.
So it seems we need to determine which one should be merged first.
IMO, the GPE dead lock fix is the most basic one.

Thanks and best regards
-Lv


> From: Rafael J. Wysocki [mailto:rjw@rjwysocki.net]
> Sent: Wednesday, November 19, 2014 5:20 AM
> To: Kirill A. Shutemov
> Cc: Zheng, Lv; Wysocki, Rafael J; Brown, Len; Lv Zheng; linux-kernel@vger.kernel.org; linux-acpi@vger.kernel.org
> Subject: Re: [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag.
> 
> On Tuesday, November 18, 2014 03:23:28 PM Kirill A. Shutemov wrote:
> > On Wed, Nov 05, 2014 at 02:52:36AM +0000, Zheng, Lv wrote:
> 
> [cut]
> 
> >
> > Here's lockdep warning I see on -next:
> 
> Is patch [1/6] sufficient to trigger this or do you need all [1-4/6]?
> 
> 
> > [    0.510159] ======================================================
> > [    0.510171] [ INFO: possible circular locking dependency detected ]
> > [    0.510185] 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66 Not tainted
> > [    0.510197] -------------------------------------------------------
> > [    0.510209] swapper/3/0 is trying to acquire lock:
> > [    0.510219]  (&(&ec->lock)->rlock){-.....}, at: [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [    0.510254]
> > [    0.510254] but task is already holding lock:
> > [    0.510266]  (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [    0.510296]
> > [    0.510296] which lock already depends on the new lock.
> > [    0.510296]
> > [    0.510312]
> > [    0.510312] the existing dependency chain (in reverse order) is:
> > [    0.510327]
> > [    0.510327] -> #1 (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}:
> > [    0.510344]        [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [    0.510364]        [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [    0.510381]        [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [    0.510398]        [<ffffffff814e31e8>] acpi_enable_gpe+0x22/0x68
> > [    0.510416]        [<ffffffff814d5b24>] acpi_ec_start+0x66/0x87
> > [    0.510432]        [<ffffffff81afc771>] ec_install_handlers+0x41/0xa4
> > [    0.510449]        [<ffffffff823e72b9>] acpi_ec_ecdt_probe+0x1a9/0x1ea
> > [    0.510466]        [<ffffffff823e6ae3>] acpi_init+0x8b/0x26e
> > [    0.510480]        [<ffffffff81002148>] do_one_initcall+0xd8/0x210
> > [    0.510496]        [<ffffffff8239f1dc>] kernel_init_freeable+0x1f5/0x282
> > [    0.510513]        [<ffffffff81af1a1e>] kernel_init+0xe/0xf0
> > [    0.510527]        [<ffffffff81b08cfc>] ret_from_fork+0x7c/0xb0
> > [    0.510542]
> > [    0.510542] -> #0 (&(&ec->lock)->rlock){-.....}:
> > [    0.510558]        [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> > [    0.510574]        [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [    0.510589]        [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [    0.510604]        [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [    0.510620]        [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> > [    0.510636]        [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> > [    0.510652]        [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [    0.510669]        [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> > [    0.510684]        [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> > [    0.510702]        [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> > [    0.510718]        [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> > [    0.510733]        [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [    0.510748]        [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> > [    0.510762]        [<ffffffff81b09bb2>] ret_from_intr+0x0/0x1a
> > [    0.510777]        [<ffffffff8107e783>] default_idle+0x23/0x260
> > [    0.510792]        [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> > [    0.510806]        [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> > [    0.510821]        [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> > [    0.510840]
> > [    0.510840] other info that might help us debug this:
> > [    0.510840]
> > [    0.510856]  Possible unsafe locking scenario:
> > [    0.510856]
> > [    0.510868]        CPU0                    CPU1
> > [    0.510877]        ----                    ----
> > [    0.510886]   lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [    0.510898]                                lock(&(&ec->lock)->rlock);
> > [    0.510912]                                lock(&(*(&acpi_gbl_gpe_lock))->rlock);
> > [    0.510927]   lock(&(&ec->lock)->rlock);
> > [    0.510938]
> > [    0.510938]  *** DEADLOCK ***
> > [    0.510938]
> > [    0.510953] 1 lock held by swapper/3/0:
> > [    0.510961]  #0:  (&(*(&acpi_gbl_gpe_lock))->rlock){-.....}, at: [<ffffffff814cd67e>] acpi_os_acquire_lock+0xe/0x10
> > [    0.510990]
> > [    0.510990] stack backtrace:
> > [    0.511004] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 3.18.0-rc4-next-20141117-07404-g9dad2ab6df8b #66
> > [    0.511021] Hardware name: LENOVO 3460CC6/3460CC6, BIOS G6ET93WW (2.53 ) 02/04/2013
> > [    0.511035]  ffffffff82cb2f70 ffff88011e2c3bb8 ffffffff81afc316 0000000000000011
> > [    0.511055]  ffffffff82cb2f70 ffff88011e2c3c08 ffffffff81afae11 0000000000000001
> > [    0.511074]  ffff88011e2c3c68 ffff88011e2c3c08 ffff8801193f92d0 ffff8801193f9b20
> > [    0.511094] Call Trace:
> > [    0.511101]  <IRQ>  [<ffffffff81afc316>] dump_stack+0x4c/0x6e
> > [    0.511125]  [<ffffffff81afae11>] print_circular_bug+0x2b2/0x2c3
> > [    0.511142]  [<ffffffff811585ef>] __lock_acquire+0x210f/0x2220
> > [    0.511159]  [<ffffffff81158f4f>] lock_acquire+0xdf/0x2d0
> > [    0.511176]  [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [    0.511192]  [<ffffffff81b08010>] _raw_spin_lock_irqsave+0x50/0x70
> > [    0.511209]  [<ffffffff814d533e>] ? acpi_ec_gpe_handler+0x21/0xfc
> > [    0.511225]  [<ffffffff814ea192>] ? acpi_hw_write+0x4b/0x52
> > [    0.511241]  [<ffffffff814d533e>] acpi_ec_gpe_handler+0x21/0xfc
> > [    0.511258]  [<ffffffff814e02c2>] acpi_ev_gpe_dispatch+0xd2/0x143
> > [    0.511274]  [<ffffffff814e03fb>] acpi_ev_gpe_detect+0xc8/0x10f
> > [    0.511292]  [<ffffffff814e23b6>] acpi_ev_sci_xrupt_handler+0x22/0x38
> > [    0.511309]  [<ffffffff814cc8ee>] acpi_irq+0x16/0x31
> > [    0.511325]  [<ffffffff8116eccf>] handle_irq_event_percpu+0x6f/0x540
> > [    0.511342]  [<ffffffff8116f1e1>] handle_irq_event+0x41/0x70
> > [    0.511357]  [<ffffffff81171e98>] ? handle_fasteoi_irq+0x28/0x140
> > [    0.511372]  [<ffffffff81171ef6>] handle_fasteoi_irq+0x86/0x140
> > [    0.511388]  [<ffffffff81075a22>] handle_irq+0x22/0x40
> > [    0.511402]  [<ffffffff81b0beaf>] do_IRQ+0x4f/0xf0
> > [    0.511417]  [<ffffffff81b09bb2>] common_interrupt+0x72/0x72
> > [    0.511428]  <EOI>  [<ffffffff810b8986>] ? native_safe_halt+0x6/0x10
> > [    0.511454]  [<ffffffff81154f3d>] ? trace_hardirqs_on+0xd/0x10
> > [    0.511468]  [<ffffffff8107e783>] default_idle+0x23/0x260
> > [    0.511482]  [<ffffffff8107f35f>] arch_cpu_idle+0xf/0x20
> > [    0.511496]  [<ffffffff8114a99b>] cpu_startup_entry+0x36b/0x5b0
> > [    0.511512]  [<ffffffff810a8d04>] start_secondary+0x1a4/0x1d0
> >
> >
> >
> 
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

  reply	other threads:[~2014-11-19  9:49 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-03  5:16 [PATCH 0/6] ACPI/EC: Cleanups of command flushing and event polling Lv Zheng
2014-11-03  5:16 ` [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag Lv Zheng
2014-11-05  2:52   ` Zheng, Lv
2014-11-18 13:23     ` Kirill A. Shutemov
2014-11-18 21:20       ` Rafael J. Wysocki
2014-11-19  8:55         ` Zheng, Lv [this message]
2014-11-19 12:16         ` Kirill A. Shutemov
2014-11-20  2:19           ` [RFC PATCH v3 1/2] ACPICA: Events: Remove duplicated sanity check in acpi_ev_enable_gpe() Lv Zheng
2014-11-20  2:19           ` [RFC PATCH v3 2/2] ACPICA: Events: Introduce ACPI_GPE_HANDLER_RAW to fix 2 issues for the current GPE APIs Lv Zheng
2014-11-20  2:20           ` [PATCH 1/6] ACPI/EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag Zheng, Lv
2014-11-20 21:33             ` Kirill A. Shutemov
2014-11-21  0:42               ` Zheng, Lv
2014-11-21  0:55               ` Zheng, Lv
2014-11-20  2:34           ` Zheng, Lv
2014-11-20 21:34             ` Kirill A. Shutemov
2014-11-20  6:44           ` [RFC PATCH v4] ACPICA/Events: Add support to ensure GPE is disabled by default for handlers Lv Zheng
2014-11-20  6:47             ` Zheng, Lv
2014-11-20 22:15               ` Kirill A. Shutemov
2014-11-21  1:36                 ` Zheng, Lv
2015-02-06  0:57           ` [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support Lv Zheng
2015-02-06  0:57             ` [PATCH v2 1/5] ACPI / EC: Introduce STARTED/STOPPED flags to replace BLOCKED flag Lv Zheng
2015-02-06  0:57             ` [PATCH v2 2/5] ACPI / EC: Add command flushing support Lv Zheng
2015-02-06  0:58             ` [PATCH v2 3/5] ACPI / EC: Refine command storm prevention support Lv Zheng
2015-02-06  0:58             ` [PATCH v2 4/5] ACPI / EC: Add query flushing support Lv Zheng
2015-02-06  0:58             ` [PATCH v2 5/5] ACPI / EC: Add GPE reference counting debugging messages Lv Zheng
2015-02-06  1:26             ` [PATCH v2 0/5] ACPI / EC: Add reference counting for requests and cleans up the grace periods support Rafael J. Wysocki
2015-02-09  2:23               ` Zheng, Lv
2015-02-12  1:17                 ` Rafael J. Wysocki
2015-02-16  6:41                   ` Zheng, Lv
2014-11-03  5:16 ` [PATCH 2/6] ACPI/EC: Enhance the checks to apply to QR_EC transactions Lv Zheng
2014-11-03  5:16 ` [PATCH 3/6] ACPI/EC: Add reference counting for query handlers Lv Zheng
2014-11-03  5:16 ` [PATCH 4/6] ACPI/EC: Add command flushing support Lv Zheng
2014-11-03  5:16 ` [PATCH 5/6] ACPI/EC: Cleanup QR_SC command processing by adding a kernel thread to poll EC events Lv Zheng
2014-11-12  1:16   ` Rafael J. Wysocki
2014-11-13  2:31     ` Zheng, Lv
2014-11-13  2:58       ` Rafael J. Wysocki
2014-11-13  2:52         ` Zheng, Lv
2014-11-13 22:37           ` Rafael J. Wysocki
2014-11-14  1:21             ` Zheng, Lv
2014-11-14 23:38               ` Rafael J. Wysocki
2014-11-03  5:16 ` [PATCH 6/6] ACPI/EC: Add GPE reference counting debugging messages Lv Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1AE640813FDE7649BE1B193DEA596E88026A36FB@SHSMSX101.ccr.corp.intel.com \
    --to=lv.zheng@intel.com \
    --cc=kirill@shutemov.name \
    --cc=len.brown@intel.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rjw@rjwysocki.net \
    --cc=zetalog@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).