* hvc_console change results in corrupt oops output @ 2011-07-04 10:57 Anton Blanchard 2011-07-04 13:56 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: Anton Blanchard @ 2011-07-04 10:57 UTC (permalink / raw) To: benh, brueckner, borntraeger, linuxppc-dev Hi, We've been struggling to debug a hang on a large ppc64 box. Every time we collect oops output there are pieces of the oops output missing and in some cases entire CPUs are missing. Eventually I realised the hvc_console driver is dropping characters. The commit that caused this is: commit 3feebbb5492e9e463467cefb633e23a3dfcec132 Author: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Date: Mon Oct 13 23:12:50 2008 +0000 hvc_console: Fix loop if put_char() returns 0 If put_char() routine of a hvc console backend returns 0, then the hvc console starts looping in the following scenarios: 1. hvc_console_print() If put_char() returns 0 then the while loop may loop forever. I have added the missing check for 0 to throw away console messages. The hypervisor gives us a busy return, so we could retry a number of times instead of dropping it on the floor. We'd need to do it in the hvc_console driver - the tty drivers share the same backend functions so we can't hide it in the pseries put_chars function. Anton ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-04 10:57 hvc_console change results in corrupt oops output Anton Blanchard @ 2011-07-04 13:56 ` Benjamin Herrenschmidt 2011-07-04 14:24 ` Hendrik Brueckner 0 siblings, 1 reply; 10+ messages in thread From: Benjamin Herrenschmidt @ 2011-07-04 13:56 UTC (permalink / raw) To: Anton Blanchard; +Cc: borntraeger, brueckner, linuxppc-dev On Mon, 2011-07-04 at 20:57 +1000, Anton Blanchard wrote: .../... > The hypervisor gives us a busy return, so we could retry a number of > times instead of dropping it on the floor. We'd need to do it in the > hvc_console driver - the tty drivers share the same backend > functions so we can't hide it in the pseries put_chars function. For kernel console, I don't see why not wait forever ... If the underlying backend really lost the connection it can always return a negative error no ? Or we could have "well defined" return codes for "wait forever" vs. "drop this" ... maybe return -EAGAIN. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-04 13:56 ` Benjamin Herrenschmidt @ 2011-07-04 14:24 ` Hendrik Brueckner 2011-07-05 4:17 ` Tabi Timur-B04825 0 siblings, 1 reply; 10+ messages in thread From: Hendrik Brueckner @ 2011-07-04 14:24 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: borntraeger, brueckner, linuxppc-dev, Anton Blanchard On Mon, Jul 04, 2011 at 11:56:27PM +1000, Benjamin Herrenschmidt wrote: > On Mon, 2011-07-04 at 20:57 +1000, Anton Blanchard wrote: > > .../... > > > The hypervisor gives us a busy return, so we could retry a number of > > times instead of dropping it on the floor. We'd need to do it in the > > hvc_console driver - the tty drivers share the same backend > > functions so we can't hide it in the pseries put_chars function. > > For kernel console, I don't see why not wait forever ... If the > underlying backend really lost the connection it can always return a > negative error no ? > > Or we could have "well defined" return codes for "wait forever" vs. > "drop this" ... maybe return -EAGAIN. I will check this again for my hvc_iucv back-end. Meanwhile a found an old thread discussing the same issue. It covers some differences between console and ttys which actually does not matter for hvc-backend because of the shared put_chars() routine. You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149 Kind regards, Hendrik ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-04 14:24 ` Hendrik Brueckner @ 2011-07-05 4:17 ` Tabi Timur-B04825 2011-07-05 6:22 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 10+ messages in thread From: Tabi Timur-B04825 @ 2011-07-05 4:17 UTC (permalink / raw) To: Hendrik Brueckner; +Cc: linuxppc-dev, Anton Blanchard, borntraeger On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner <brueckner@linux.vnet.ibm.com> wrote: > I will check this again for my hvc_iucv back-end. =A0Meanwhile a found > an old thread discussing the same issue. =A0It covers some differences > between console and ttys which actually does not matter for hvc-backend > because of the shared put_chars() routine. > > You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149 I started that thread. After much soul searching, we came to the conclusion that HVC is not compatible with hypervisors that return BUSY on writes. So I threw out my HVC driver and rewrote it as a standard console and TTY driver. That driver is waiting to be included in the 3.1 kernel. --=20 Timur Tabi Linux kernel developer at Freescale= ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-05 4:17 ` Tabi Timur-B04825 @ 2011-07-05 6:22 ` Benjamin Herrenschmidt 2011-07-05 13:51 ` Tabi Timur-B04825 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 0 siblings, 2 replies; 10+ messages in thread From: Benjamin Herrenschmidt @ 2011-07-05 6:22 UTC (permalink / raw) To: Tabi Timur-B04825 Cc: borntraeger, Hendrik Brueckner, linuxppc-dev, Anton Blanchard On Tue, 2011-07-05 at 04:17 +0000, Tabi Timur-B04825 wrote: > On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner > <brueckner@linux.vnet.ibm.com> wrote: > > > I will check this again for my hvc_iucv back-end. Meanwhile a found > > an old thread discussing the same issue. It covers some differences > > between console and ttys which actually does not matter for hvc-backend > > because of the shared put_chars() routine. > > > > You can read the thread on lkml.org: http://lkml.org/lkml/2009/10/15/149 > > I started that thread. After much soul searching, we came to the > conclusion that HVC is not compatible with hypervisors that return > BUSY on writes. That is a fun conclusion considering that hvc has been written for the pseries hypervisor which ... can return BUSY on writes :-) > So I threw out my HVC driver and rewrote it as a > standard console and TTY driver. That driver is waiting to be > included in the 3.1 kernel. Sucktastic. We just need to fix HVC properly. Cheers, Ben. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: hvc_console change results in corrupt oops output 2011-07-05 6:22 ` Benjamin Herrenschmidt @ 2011-07-05 13:51 ` Tabi Timur-B04825 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 1 sibling, 0 replies; 10+ messages in thread From: Tabi Timur-B04825 @ 2011-07-05 13:51 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: borntraeger, Hendrik Brueckner, linuxppc-dev, Anton Blanchard Benjamin Herrenschmidt wrote: > That is a fun conclusion considering that hvc has been written for the > pseries hypervisor which ... can return BUSY on writes :-) Go read the original thread. The problem is that tty writes and console=20 writes are treated the same by the hvc client driver. If a client driver=20 detects that the hypervisor is busy, it has the choice of either spinning=20 or returning right away. Spinning is not acceptable for tty output, so=20 all drivers return right away. hvc then drops the unwritten characters. According to Hendrik, this is still happening. > We just need to fix HVC properly. Where were you two years ago? I complained about the problem, and even=20 posted a hackish "fix". The response I got was tepid -- some=20 acknowledgement that the problem exists, but no real desire to fix it by=20 anyone. So I had no choice but to abandon hvc. And frankly, I still don't=20 understand why it exists. Since then, I wrote a very nice console/tty=20 driver, and I have no plans to return to hvc even if the problem is fixed. --=20 Timur Tabi Linux kernel developer at Freescale= ^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC] [PATCH] hvc_console: improve tty/console put_chars handling 2011-07-05 6:22 ` Benjamin Herrenschmidt 2011-07-05 13:51 ` Tabi Timur-B04825 @ 2011-07-05 14:28 ` Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard ` (2 more replies) 1 sibling, 3 replies; 10+ messages in thread From: Hendrik Brueckner @ 2011-07-05 14:28 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: borntraeger, Hendrik Brueckner, Tabi Timur-B04825, linuxppc-dev, Anton Blanchard Hi folks, On Tue, Jul 05, 2011 at 04:22:43PM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2011-07-05 at 04:17 +0000, Tabi Timur-B04825 wrote: > > On Mon, Jul 4, 2011 at 9:24 AM, Hendrik Brueckner > > <brueckner@linux.vnet.ibm.com> wrote: > > > > I started that thread. After much soul searching, we came to the > > conclusion that HVC is not compatible with hypervisors that return > > BUSY on writes. > > That is a fun conclusion considering that hvc has been written for the > pseries hypervisor which ... can return BUSY on writes :-) > > We just need to fix HVC properly. So I took initiative and looked again into this issue. Below you can find a patch that is based on Ben's -EAGAIN idea. The hvc console layer takes care of retrying depending on the backend's return code. However, the main issue is that from a backend perspective, there is no difference between tty and console output. Because consoles, especially the preferred console, behave different than a simple tty. Blocked write to a preferred console can stop the system. So with the patch below, the backend can now indirectly control the way console output is handled for it. I still have to think if this solution is ok or if it is better to introduce a new callback to console output only (and might provide a default implemenatation similar to the patch below). NOTE: I did not yet test this patch but will do.. I just want to share it early to get feedback from you. -->8--------------------------------------------------------------------- Currently, the hvc_console_print() function drops console output if the hvc backend's put_chars() returns 0. This patch changes this behavior to allow a retry through returning -EAGAIN. This change also affects the hvc_push() function. Both functions are changed to handle -EAGAIN and to retry the put_chars() operation. If a hvc backend returns -EAGAIN, the retry handling differs: - hvc_console_print() spins to write the complete console output. - hvc_push() behaves the same way as for returning 0. Now hvc backends can indirectly control the way how console output is handled through the hvc console layer. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> --- drivers/tty/hvc/hvc_console.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -163,8 +163,10 @@ static void hvc_console_print(struct con } else { r = cons_ops[index]->put_chars(vtermnos[index], c, i); if (r <= 0) { - /* throw away chars on error */ - i = 0; + /* throw away characters on error + * but spin in case of -EAGAIN */ + if (r != -EAGAIN) + i = 0; } else if (r > 0) { i -= r; if (i > 0) @@ -448,7 +450,7 @@ static int hvc_push(struct hvc_struct *h n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf); if (n <= 0) { - if (n == 0) { + if (n == 0 || n == -EAGAIN) { hp->do_wakeup = 1; return 0; } ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] [PATCH] hvc_console: improve tty/console put_chars handling 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner @ 2011-07-06 7:49 ` Anton Blanchard 2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard 2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard 2 siblings, 0 replies; 10+ messages in thread From: Anton Blanchard @ 2011-07-06 7:49 UTC (permalink / raw) To: Hendrik Brueckner; +Cc: Tabi Timur-B04825, linuxppc-dev, borntraeger Hi Hendrik, > So with the patch below, the backend can now indirectly control the > way console output is handled for it. I still have to think if this > solution is ok or if it is better to introduce a new callback to > console output only (and might provide a default implemenatation > similar to the patch below). > > NOTE: I did not yet test this patch but will do.. I just want to > share it early to get feedback from you. Thanks a lot for this. I added the pseries bits and tested it, patches to follow. Anton ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2]: hvc_console: improve tty/console put_chars handling 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard @ 2011-07-06 7:50 ` Anton Blanchard 2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard 2 siblings, 0 replies; 10+ messages in thread From: Anton Blanchard @ 2011-07-06 7:50 UTC (permalink / raw) To: Hendrik Brueckner; +Cc: Tabi Timur-B04825, linuxppc-dev, borntraeger From: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Currently, the hvc_console_print() function drops console output if the hvc backend's put_chars() returns 0. This patch changes this behavior to allow a retry through returning -EAGAIN. This change also affects the hvc_push() function. Both functions are changed to handle -EAGAIN and to retry the put_chars() operation. If a hvc backend returns -EAGAIN, the retry handling differs: - hvc_console_print() spins to write the complete console output. - hvc_push() behaves the same way as for returning 0. Now hvc backends can indirectly control the way how console output is handled through the hvc console layer. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> --- Index: linux-powerpc/drivers/tty/hvc/hvc_console.c =================================================================== --- linux-powerpc.orig/drivers/tty/hvc/hvc_console.c 2011-07-06 17:10:51.288360541 +1000 +++ linux-powerpc/drivers/tty/hvc/hvc_console.c 2011-07-06 17:11:16.398794913 +1000 @@ -163,8 +163,10 @@ static void hvc_console_print(struct con } else { r = cons_ops[index]->put_chars(vtermnos[index], c, i); if (r <= 0) { - /* throw away chars on error */ - i = 0; + /* throw away characters on error + * but spin in case of -EAGAIN */ + if (r != -EAGAIN) + i = 0; } else if (r > 0) { i -= r; if (i > 0) @@ -448,7 +450,7 @@ static int hvc_push(struct hvc_struct *h n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf); if (n <= 0) { - if (n == 0) { + if (n == 0 || n == -EAGAIN) { hp->do_wakeup = 1; return 0; } ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard 2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard @ 2011-07-06 7:51 ` Anton Blanchard 2 siblings, 0 replies; 10+ messages in thread From: Anton Blanchard @ 2011-07-06 7:51 UTC (permalink / raw) To: Hendrik Brueckner; +Cc: Tabi Timur-B04825, linuxppc-dev, borntraeger Return -EAGAIN when we get H_BUSY back from the hypervisor. This makes the hvc console driver retry, avoiding dropped printks. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> --- Index: linux-powerpc/arch/powerpc/platforms/pseries/hvconsole.c =================================================================== --- linux-powerpc.orig/arch/powerpc/platforms/pseries/hvconsole.c 2011-07-06 17:11:33.799095935 +1000 +++ linux-powerpc/arch/powerpc/platforms/pseries/hvconsole.c 2011-07-06 17:11:47.499332962 +1000 @@ -73,7 +73,7 @@ int hvc_put_chars(uint32_t vtermno, cons if (ret == H_SUCCESS) return count; if (ret == H_BUSY) - return 0; + return -EAGAIN; return -EIO; } ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-07-06 7:51 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-07-04 10:57 hvc_console change results in corrupt oops output Anton Blanchard 2011-07-04 13:56 ` Benjamin Herrenschmidt 2011-07-04 14:24 ` Hendrik Brueckner 2011-07-05 4:17 ` Tabi Timur-B04825 2011-07-05 6:22 ` Benjamin Herrenschmidt 2011-07-05 13:51 ` Tabi Timur-B04825 2011-07-05 14:28 ` [RFC] [PATCH] hvc_console: improve tty/console put_chars handling Hendrik Brueckner 2011-07-06 7:49 ` Anton Blanchard 2011-07-06 7:50 ` [PATCH 1/2]: " Anton Blanchard 2011-07-06 7:51 ` [PATCH 2/2]: powerpc/pseries/hvconsole: Fix dropped console output Anton Blanchard
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.