linux-kernel.vger.kernel.org archive mirror
From: Jens Axboe <axboe@suse.de>
To: gmu 2k6 <gmu2006@gmail.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Re: i686 hang on boot in userspace
Date: Tue, 25 Jul 2006 17:18:48 +0200
Message-ID: <20060725151848.GX4044@suse.de>
In-Reply-To: <f96157c40607250727o685b8195i67da8c68123728f@mail.gmail.com>

On Tue, Jul 25 2006, gmu 2k6 wrote:
> On 7/25/06, Jens Axboe <axboe@suse.de> wrote:
> >On Tue, Jul 25 2006, Jens Axboe wrote:
> >> On Tue, Jul 25 2006, gmu 2k6 wrote:
> >> > On 7/25/06, Jens Axboe <axboe@suse.de> wrote:
> >> > >On Tue, Jul 25 2006, gmu 2k6 wrote:
> >> > >> ok, let's nail it to 2.6.17-git5 instead as it survived git status
> >> > >> compared to -git6
> >> > >> which seems to have correctly booted by accident the last time. timing
> >> > >> issues I guess.
> >> > >
> >> > >I will try and reproduce it here now. It seems to be in between commit
> >> > >271f18f102c789f59644bb6c53a69da1df72b2f4 and commit
> >> > >dd67d051529387f6e44d22d1d5540ef281965fdd where the first one could
> >> > >also be bad.
> >> > >
> >> > >I'm assuming that acf421755593f7d7bd9352d57eda796c6eb4fa43 should be
> >> > >good, so you can try and verify that
> >> > >dd67d051529387f6e44d22d1d5540ef281965fdd is bad and bisect between the
> >> > >two. It's only about 6 commits, so should be quick enough to do.
> >> >
> >> > 1) no luck with remote serial console
> >> > 2) netconsole does not work although connecting to the listener with
> >> > netcat and sending strings works
> >> > I'm gonna try via physical rs232 9pins and see how that works.
> >> > afterwards I will try to bisect the revisions you mentioned.
> >> >
> >> > btw, the issue seems to come and go, as I managed to boot and log into a
> >> > .17-git6 kernel, or it is timing-dependent.
> >>
> >> I can reproduce it, you don't have to spend more time on bisecting or
> >> testing. This should fix it:
> >>
> >> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> >> index 1c4df22..1eac041 100644
> >> --- a/drivers/block/cciss.c
> >> +++ b/drivers/block/cciss.c
> >> @@ -1238,6 +1238,7 @@ static void cciss_softirq_done(struct re
> >>       CommandList_struct *cmd = rq->completion_data;
> >>       ctlr_info_t *h = hba[cmd->ctlr];
> >>       unsigned long flags;
> >> +     request_queue_t *q;
> >>       u64bit temp64;
> >>       int i, ddir;
> >>
> >> @@ -1260,10 +1261,13 @@ #ifdef CCISS_DEBUG
> >>       printk("Done with %p\n", rq);
> >>  #endif                               /* CCISS_DEBUG */
> >>
> >> +     q = rq->q;
> >> +
> >>       add_disk_randomness(rq->rq_disk);
> >>       spin_lock_irqsave(&h->lock, flags);
> >>       end_that_request_last(rq, rq->errors);
> >>       cmd_free(h, cmd, 1);
> >> +     blk_start_queue(q);
> >>       spin_unlock_irqrestore(&h->lock, flags);
> >>  }
> >>
> >>
> >> A better fix would rework the start_queue logic entirely in the driver,
> >> but the above should get you running for now. I'll take a further look.
> >
> >Something like this matches the current logic better. It's not very good
> >from a cpu efficiency point of view, but it's better than what is there
> >now since at least it's not in hard irq context.
> >
> >Not tested yet, will do so right now.
> >
> >diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> >index 1c4df22..a9e0510 100644
> >--- a/drivers/block/cciss.c
> >+++ b/drivers/block/cciss.c
> >@@ -1233,6 +1233,50 @@ static inline void complete_buffers(stru
> >        }
> > }
> >
> >+static void cciss_check_queues(ctlr_info_t *h)
> >+{
> >+       int start_queue = h->next_to_run;
> >+       int i;
> >+
> >+       /* check to see if we have maxed out the number of commands that can
> >+        * be placed on the queue.  If so then exit.  We do this check here
> >+        * in case the interrupt we serviced was from an ioctl and did not
> >+        * free any new commands.
> >+        */
> >+       if ((find_first_zero_bit(h->cmd_pool_bits, NR_CMDS)) == NR_CMDS)
> >+               return;
> >+
> >+       /* We have room on the queue for more commands.  Now we need to queue
> >+        * them up.  We will also keep track of the next queue to run so
> >+        * that every queue gets a chance to be started first.
> >+        */
> >+       for (i = 0; i < h->highest_lun + 1; i++) {
> >+               int curr_queue = (start_queue + i) % (h->highest_lun + 1);
> >+               /* make sure the disk has been added and the drive is real
> >+                * because this can be called from the middle of init_one.
> >+                */
> >+               if (!(h->drv[curr_queue].queue) || !(h->drv[curr_queue].heads))
> >+                       continue;
> >+               blk_start_queue(h->gendisk[curr_queue]->queue);
> >+
> >+               /* check to see if we have maxed out the number of commands
> >+                * that can be placed on the queue.
> >+                */
> >+               if ((find_first_zero_bit(h->cmd_pool_bits, NR_CMDS)) == NR_CMDS) {
> >+                       if (curr_queue == start_queue) {
> >+                               h->next_to_run =
> >+                                   (start_queue + 1) % (h->highest_lun + 1);
> >+                               break;
> >+                       } else {
> >+                               h->next_to_run = curr_queue;
> >+                               break;
> >+                       }
> >+               } else {
> >+                       curr_queue = (curr_queue + 1) % (h->highest_lun + 1);
> >+               }
> >+       }
> >+}
> >+
> > static void cciss_softirq_done(struct request *rq)
> > {
> >        CommandList_struct *cmd = rq->completion_data;
> >@@ -1264,6 +1308,7 @@ #endif                            /* CCISS_DEBUG */
> >        spin_lock_irqsave(&h->lock, flags);
> >        end_that_request_last(rq, rq->errors);
> >        cmd_free(h, cmd, 1);
> >+       cciss_check_queues(h);
> >        spin_unlock_irqrestore(&h->lock, flags);
> > }
> >
> >@@ -2528,8 +2573,6 @@ static irqreturn_t do_cciss_intr(int irq
> >        CommandList_struct *c;
> >        unsigned long flags;
> >        __u32 a, a1, a2;
> >-       int j;
> >-       int start_queue = h->next_to_run;
> >
> >        if (interrupt_not_for_us(h))
> >                return IRQ_NONE;
> >@@ -2588,45 +2631,6 @@ #                                endif
> >                }
> >        }
> >
> >-       /* check to see if we have maxed out the number of commands that can
> >-        * be placed on the queue.  If so then exit.  We do this check here
> >-        * in case the interrupt we serviced was from an ioctl and did not
> >-        * free any new commands.
> >-        */
> >-       if ((find_first_zero_bit(h->cmd_pool_bits, NR_CMDS)) == NR_CMDS)
> >-               goto cleanup;
> >-
> >-       /* We have room on the queue for more commands.  Now we need to queue
> >-        * them up.  We will also keep track of the next queue to run so
> >-        * that every queue gets a chance to be started first.
> >-        */
> >-       for (j = 0; j < h->highest_lun + 1; j++) {
> >-               int curr_queue = (start_queue + j) % (h->highest_lun + 1);
> >-               /* make sure the disk has been added and the drive is real
> >-                * because this can be called from the middle of init_one.
> >-                */
> >-               if (!(h->drv[curr_queue].queue) || !(h->drv[curr_queue].heads))
> >-                       continue;
> >-               blk_start_queue(h->gendisk[curr_queue]->queue);
> >-
> >-               /* check to see if we have maxed out the number of commands
> >-                * that can be placed on the queue.
> >-                */
> >-               if ((find_first_zero_bit(h->cmd_pool_bits, NR_CMDS)) == NR_CMDS) {
> >-                       if (curr_queue == start_queue) {
> >-                               h->next_to_run =
> >-                                   (start_queue + 1) % (h->highest_lun + 1);
> >-                               goto cleanup;
> >-                       } else {
> >-                               h->next_to_run = curr_queue;
> >-                               goto cleanup;
> >-                       }
> >-               } else {
> >-                       curr_queue = (curr_queue + 1) % (h->highest_lun + 1);
> >-               }
> >-       }
> >-
> >-      cleanup:
> >        spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags);
> >        return IRQ_HANDLED;
> > }
> 
> this makes the cciss init hang.

Hmm, strange, it works for me. Can you send a sysrq-t trace for the hang,
please? Just note the top few functions; they should be easy enough to write
down manually.
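
For anyone following along, the round-robin logic that the quoted patch moves
into cciss_check_queues() can be modelled in userspace roughly as below. This
is only a sketch under stated assumptions, not the driver code: struct
ctlr_model, start_queue_model(), check_queues_model(), NR_QUEUES and the
free_cmds/present/demand fields are all hypothetical stand-ins for
ctlr_info_t, blk_start_queue(), h->highest_lun + 1, the cmd_pool_bits bitmap
and the drv[] checks. The else branch that reassigns curr_queue is left out,
since curr_queue is re-declared on every loop iteration and the assignment has
no effect.

```c
#include <assert.h>

#define NR_QUEUES 4	/* hypothetical: stands in for h->highest_lun + 1 */

/* Hypothetical userspace model of the controller state used by the patch. */
struct ctlr_model {
	int next_to_run;	/* queue that gets started first on the next pass */
	int free_cmds;		/* stands in for free bits in h->cmd_pool_bits */
	int present[NR_QUEUES];	/* stands in for drv[i].queue && drv[i].heads */
	int demand[NR_QUEUES];	/* commands each queue would issue if started */
};

/* Stand-in for blk_start_queue(): the queue takes as many commands as it can. */
static void start_queue_model(struct ctlr_model *h, int q)
{
	int take = h->demand[q] < h->free_cmds ? h->demand[q] : h->free_cmds;

	h->free_cmds -= take;
	h->demand[q] -= take;
}

/* Follows the control flow of cciss_check_queues() from the quoted patch. */
static void check_queues_model(struct ctlr_model *h)
{
	int start_queue = h->next_to_run;
	int i;

	assert(start_queue >= 0 && start_queue < NR_QUEUES);

	/* No free command slots: nothing can be started, so bail out. */
	if (h->free_cmds == 0)
		return;

	for (i = 0; i < NR_QUEUES; i++) {
		int curr_queue = (start_queue + i) % NR_QUEUES;

		/* Skip drives that are not set up yet. */
		if (!h->present[curr_queue])
			continue;
		start_queue_model(h, curr_queue);

		/* Pool exhausted: remember who should go first next time. */
		if (h->free_cmds == 0) {
			if (curr_queue == start_queue)
				h->next_to_run = (start_queue + 1) % NR_QUEUES;
			else
				h->next_to_run = curr_queue;
			break;
		}
	}
}
```

In this model, a queue that ran the pool dry without being the first one
started gets to go first on the next pass, and if the first queue itself
exhausted the pool, the turn moves on to its neighbour.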

-- 
Jens Axboe


