From: Jens Axboe <axboe@suse.de>
To: gmu 2k6 <gmu2006@gmail.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Re: i686 hang on boot in userspace
Date: Tue, 25 Jul 2006 11:24:57 +0200
Message-ID: <20060725092457.GL4044@suse.de>
In-Reply-To: <f96157c40607250235t4cdd76ffxfd6f95389d2ddbdc@mail.gmail.com>
On Tue, Jul 25 2006, gmu 2k6 wrote:
> On 7/25/06, gmu 2k6 <gmu2006@gmail.com> wrote:
> >On 7/25/06, gmu 2k6 <gmu2006@gmail.com> wrote:
> >> On 7/25/06, Jens Axboe <axboe@suse.de> wrote:
> >> > On Tue, Jul 25 2006, gmu 2k6 wrote:
> >> > > On 7/25/06, Jens Axboe <axboe@suse.de> wrote:
> >> > > >On Tue, Jul 25 2006, gmu 2k6 wrote:
> >> > > >> On 7/25/06, Jens Axboe <axboe@suse.de> wrote:
> >> > > >> >On Tue, Jul 25 2006, gmu 2k6 wrote:
> >> > > >> >> On 7/25/06, Jens Axboe <axboe@suse.de> wrote:
> >> > > >> >> >On Mon, Jul 24 2006, gmu 2k6 wrote:
> >> > > >> >> >> the problem I have with hangs is related to changes in CFQ and that
> >> > > >> >> >> CFQ is now the default. 2.6.17-git12 had the problem but booting
> >> > > >> >> >> it with elevator=deadline fixes the hang.
> >> > > >> >> >>
> >> > > >> >> >> symptoms encountered during git-bisecting between v2.6.17 and
> >> > > >> >> >> v2.6.18-rc1:
> >> > > >> >> >> A hang while starting network services
> >> > > >> >> >> B hang while trying to login
> >> > > >> >> >> 1 on remote console [not SSH] it hung after typing <uid><CR>
> >> > > >> >> >> 2 via OpenSSH it hung after typing <pwd><CR> when doing slogin
> >> > > >> >> >> root@<IP>
> >> > > >> >> >>
> >> > > >> >> >> A is the problem I got in the first place and this seems to be the
> >> > > >> >> >> case since 2.6.17-git11 definitely although git-bisect pointed me at
> >> > > >> >> >> the following changeset which is included since 2.6.17-git12:
> >> > > >> >> >>
> >> > > >> >> >> caaa5f9f0a75d1dc5e812e69afdbb8720e077fd3
> >> > > >> >> >> by Jens Axboe
> >> > > >> >> >> titled "[PATCH] cfq-iosched: many performance fixes"
> >> > > >> >> >>
> >> > > >> >> >> strangely enough it also hangs with 2.6.17-git11 which did not include
> >> > > >> >> >> that one changeset yet.
> >> > > >> >> >
> >> > > >> >> >So perhaps your bisect isn't 100% trustworthy? Can you do a manual
> >> > > >> >> >-gitX bisect to see which 2.6.17-gitX introduced the problem?
> >> > > >> >> >
> >> > > >> >> >Also please put a serial console or similar on the machine, so you can
> >> > > >> >> >log + store the sysrq+t output.
> >> > > >> >>
> >> > > >> >> well I didn't say that caa....fd3 is the exact change which broke it,
> >> > > >> >> just that it's related to 1) CFQ changes and 2) CFQ being the default
> >> > > >> >> now.
> >> > > >> >> I have a Remote Serial Console via HP's integrated Lights-Out Java
> >> > > >> >> Applet but am not sure how to enable serial console via kernel boot
> >> > > >> >> params (will try to find out).
> >> > > >> >> I will first try to find the 2.6.17-git* revision working before
> >> > > >> >> bisecting it against -git11 or git12.
> >> > > >> >
> >> > > >> >Thanks, would be much appreciated to try and narrow it down to a
> >> > > >> >specific fix.
> >> > > >> >
> >> > > >> >Are you seeing the hang on cciss?
> >> > > >>
> >> > > >> I'm not sure it is in the cciss driver, but the SmartArray is driven by
> >> > > >> cciss.
> >> > > >> starting git<11 boot tests in a minute now.
> >> > > >
> >> > > >Ok, thanks for confirming it's cciss. The bug is likely an interaction
> >> > > >between cciss and cfq I think, so it would be very useful if you can
> >> > > >pinpoint which of the cfq patches makes it stall.
> >> > >
> >> > > is there anything special about cciss or did you just deduce that it
> >> > > must be cciss in that particular box and are suspecting interaction
> >> > > problems with that driver and your CFQ changes?
> >> >
> >> > Nothing really special about cciss, but a few months ago I had a similar
> >> > discussion about cciss and a strange hang.
> >> >
> >> > If possible, please also try a known bad kernel and apply the below
> >> > patch and see if it still reproduces:
> >> >
> >> > diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> >> > index 1c4df22..2b36e7a 100644
> >> > --- a/drivers/block/cciss.c
> >> > +++ b/drivers/block/cciss.c
> >> > @@ -2362,7 +2362,11 @@ static inline void complete_command(ctlr
> >> >  	cmd->rq->completion_data = cmd;
> >> >  	cmd->rq->errors = status;
> >> >  	blk_add_trace_rq(cmd->rq->q, cmd->rq, BLK_TA_COMPLETE);
> >> > +#if 1
> >> > +	cciss_softirq_done(cmd->rq);
> >> > +#else
> >> >  	blk_complete_request(cmd->rq);
> >> > +#endif
> >> >  }
> >> >
> >> > /*
> >>
> >> manually nailed it down to 2.6.17-git7 being the first broken revision.
> >> going to try whether Linus' git tree knows the -git revisions and do a bisect
> >> otherwise interdiff and looking for CFQ or cciss changes as best I can.
> >
> >oops, doing git-status while running 2.6.17-git6 seems to have locked the box
> >again :D, ping works though. *sigh*. Jens I will try your cciss.c change now.
>
> ok, let's nail it to 2.6.17-git5 instead as it survived git status
> compared to -git6, which seems to have booted correctly by accident the
> last time. timing issues I guess.
Then please also try the cciss patch, as suggested.
--
Jens Axboe
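
The quoted exchange narrows the regression by booting successive 2.6.17-gitX
snapshots by hand, and the -git6 result changed between runs ("timing issues
I guess"). Below is a minimal sketch of that kind of manual bisect, not
anything posted in the thread. It assumes a hypothetical boots_ok(n) callback
that reports whether snapshot -gitN booted and survived a test run, and it
repeats each probe so that one lucky boot is not taken as good. The function
names and the trial count are illustrative assumptions.

```python
from typing import Callable


def is_good(n: int, boots_ok: Callable[[int], bool], trials: int = 3) -> bool:
    """Treat snapshot n as good only if every trial succeeds.

    A hang that depends on timing can let a bad kernel boot once,
    so a single successful boot is not trusted.
    """
    return all(boots_ok(n) for _ in range(trials))


def first_bad_snapshot(good: int, bad: int,
                       boots_ok: Callable[[int], bool],
                       trials: int = 3) -> int:
    """Binary-search the first bad -gitN between a known-good and a known-bad N.

    Assumes every snapshot after the first bad one is also bad.
    """
    if good >= bad:
        raise ValueError("good must be lower than bad")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_good(mid, boots_ok, trials):
            good = mid   # regression was introduced after mid
        else:
            bad = mid    # regression is at or before mid
    return bad


if __name__ == "__main__":
    # Hypothetical stand-in for real boot tests: pretend -git6 and later hang.
    print(first_bad_snapshot(1, 12, lambda n: n < 6))
```

Requiring several clean runs per snapshot costs extra reboots, but it avoids
marking a snapshot good on the strength of one boot that happened to survive.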