All of lore.kernel.org
 help / color / mirror / Atom feed
* Fwd: Maybe a bug in libata-core
@ 2010-08-23  0:52 Stefan /*St0fF*/ Hübner
  2010-08-23  9:16 ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan /*St0fF*/ Hübner @ 2010-08-23  0:52 UTC (permalink / raw)
  To: jgarzik, linux-ide

Hi Jeff and list,

maybe this mail was wrong on the linux-scsi list.  So here we go again:

-------- Original-Nachricht --------
Betreff: Maybe a bug in libata-core
Datum: Fri, 20 Aug 2010 23:48:09 +0200
Von: Stefan Hübner  <stefan.huebner@stud.tu-ilmenau.de>
Antwort an: stefan.huebner@stud.tu-ilmenau.de
Organisation: TU-Ilmenau
An: linux-scsi@vger.kernel.org

 Hi List!

After sending a WRITE_DMA_FUA_EXT ATA-command via SG_IO Passthru to a
harddisk (here: /dev/sdd), my kernel panics.  The is what I find in syslog:

Aug 20 23:29:49 dobbelherz kernel: ------------[ cut here ]------------
Aug 20 23:29:49 dobbelherz kernel: kernel BUG at
drivers/ata/libata-core.c:5136!
Aug 20 23:29:49 dobbelherz kernel: invalid opcode: 0000 [#1] SMP
Aug 20 23:29:49 dobbelherz kernel: last sysfs file:
/sys/devices/pci0000:00/0000:00:1c.5/0000:02:00.0/host7/target7:0:0/7:0:0:0/block/sdd/dev
Aug 20 23:29:49 dobbelherz kernel: CPU 0
Aug 20 23:29:49 dobbelherz kernel: Modules linked in: snd_seq
snd_seq_device cifs nls_iso8859_15 nls_cp1250 vfat fat vboxdrv nvidia(P)
arc4 ecb snd_h
da_intel rtl8187 snd_hda_codec snd_ctxfi snd_hwdep mac80211 snd_pcm
cfg80211 snd_timer 8250_pnp ahci snd 8250 rtc_cmos eeprom_93cx6 libahci
rtc_core
serial_core soundcore snd_page_alloc rtc_lib
Aug 20 23:29:49 dobbelherz kernel:
Aug 20 23:29:49 dobbelherz kernel: Pid: 2764, comm: scsi Tainted:
P            2.6.35-gentoo-r1-dobbelherz #2 P5W DH Deluxe/P5W DH Deluxe
Aug 20 23:29:49 dobbelherz kernel: RIP: 0010:[<ffffffff81250a56>]
[<ffffffff81250a56>] ata_qc_issue+0x286/0x330
Aug 20 23:29:49 dobbelherz kernel: RSP: 0018:ffff88004a0299b8  EFLAGS:
00010046
Aug 20 23:29:49 dobbelherz kernel: RAX: 0000000000000002 RBX:
ffff88007b588128 RCX: 0000000000000000
Aug 20 23:29:49 dobbelherz kernel: RDX: ffff88007b589d38 RSI:
00000000ffffffe1 RDI: ffff88007b588128
Aug 20 23:29:49 dobbelherz kernel: RBP: ffff88007b588000 R08:
0000000000000000 R09: 0000000000000000
Aug 20 23:29:49 dobbelherz kernel: R10: 0000000000000000 R11:
0000000000000000 R12: ffff88007b589d38
Aug 20 23:29:49 dobbelherz kernel: R13: 0000000000000002 R14:
0000000000000002 R15: ffffffff81235b90
Aug 20 23:29:49 dobbelherz kernel: FS:  00007fffed71b710(0000)
GS:ffff880001600000(0000) knlGS:0000000000000000
Aug 20 23:29:49 dobbelherz kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 20 23:29:49 dobbelherz kernel: CR2: 00007f753cdc12b8 CR3:
000000007d6af000 CR4: 00000000000006f0
Aug 20 23:29:49 dobbelherz kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Aug 20 23:29:49 dobbelherz kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Aug 20 23:29:49 dobbelherz kernel: Process scsi (pid: 2764, threadinfo
ffff88004a028000, task ffff88005500b480)
Aug 20 23:29:49 dobbelherz kernel: Stack:
Aug 20 23:29:49 dobbelherz kernel: ffff880001611e00 ffff88007ade32f0
0000000000000000 ffff88007b588128
Aug 20 23:29:49 dobbelherz kernel: <0> 0000000000000000 ffff88007b589ea8
ffffffff812585c0 ffff88007b588000
Aug 20 23:29:49 dobbelherz kernel: <0> ffffffff81235b90 ffffffff81255a80
ffffffffffffffff ffff88007f2088c0
Aug 20 23:29:49 dobbelherz kernel: Call Trace:
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff812585c0>] ?
ata_scsi_pass_thru+0x0/0x300
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81235b90>] ? scsi_done+0x0/0x20
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81255a80>] ?
ata_scsi_translate+0xa0/0x190
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81235b90>] ? scsi_done+0x0/0x20
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81258e6b>] ?
ata_scsi_queuecmd+0xab/0x2b0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81235c99>] ?
scsi_dispatch_cmd+0xe9/0x240
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff8123bb21>] ?
scsi_request_fn+0x3e1/0x530
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff811784f0>] ?
__blk_run_queue+0x50/0x140
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81175e0b>] ?
elv_insert+0x10b/0x1a0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff8117c569>] ?
blk_execute_rq_nowait+0x59/0xb0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff8117c621>] ?
blk_execute_rq+0x61/0xb0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81178ea3>] ?
get_request_wait+0x23/0x140
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81180472>] ? sg_io+0x212/0x3d0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81180ce7>] ?
scsi_cmd_ioctl+0x2c7/0x490
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81365e7f>] ?
schedule+0x23f/0x6a0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff81243e32>] ? sd_ioctl+0x92/0xf0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff8117dfe1>] ?
__blkdev_driver_ioctl+0xa1/0xe0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff8117e49b>] ?
blkdev_ioctl+0x20b/0x7e0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810af700>] ?
cp_new_stat+0xe0/0x100
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810d6ed7>] ?
block_ioctl+0x37/0x40
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810ba165>] ?
vfs_ioctl+0x35/0xd0
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810ba328>] ?
do_vfs_ioctl+0x88/0x530
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810af7f1>] ?
sys_newstat+0x31/0x50
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810ba819>] ?
sys_ioctl+0x49/0x80
Aug 20 23:29:49 dobbelherz kernel: [<ffffffff810023eb>] ?
system_call_fastpath+0x16/0x1b
Aug 20 23:29:49 dobbelherz kernel: Code: ff ff 0f 1f 00 4c 8b 35 89 13
25 00 e9 81 fe ff ff 48 83 bb a0 00 00 00 00 74 10 83 7b 54 00 74 0a 83
7b 64 00 0f 85 25 fe ff ff <0f> 0b eb fe 49 8d 7c 24 28 41 83 4c 24 38
06 48 c7 c6 0b 81 43
Aug 20 23:29:49 dobbelherz kernel: RIP  [<ffffffff81250a56>]
ata_qc_issue+0x286/0x330
Aug 20 23:29:49 dobbelherz kernel: RSP <ffff88004a0299b8>
Aug 20 23:29:49 dobbelherz kernel: ---[ end trace bb586d9f3dc3cf97 ]---

Please tell me if you need any more docs for a fix...

Cheers,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Fwd: Maybe a bug in libata-core
  2010-08-23  0:52 Fwd: Maybe a bug in libata-core Stefan /*St0fF*/ Hübner
@ 2010-08-23  9:16 ` Tejun Heo
  2010-08-23  9:27   ` [PATCH #upstream-fixes] libata: be less of a drama queen on empty data commands Tejun Heo
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Tejun Heo @ 2010-08-23  9:16 UTC (permalink / raw)
  To: stefan.huebner; +Cc: jgarzik, linux-ide

On 08/23/2010 02:52 AM, Stefan /*St0fF*/ Hübner wrote:
> Hi Jeff and list,
> 
> maybe this mail was wrong on the linux-scsi list.  So here we go again:
> 
> -------- Original-Nachricht --------
> Betreff: Maybe a bug in libata-core
> Datum: Fri, 20 Aug 2010 23:48:09 +0200
> Von: Stefan Hübner  <stefan.huebner@stud.tu-ilmenau.de>
> Antwort an: stefan.huebner@stud.tu-ilmenau.de
> Organisation: TU-Ilmenau
> An: linux-scsi@vger.kernel.org
> 
>  Hi List!
> 
> After sending a WRITE_DMA_FUA_EXT ATA-command via SG_IO Passthru to a
> harddisk (here: /dev/sdd), my kernel panics.  The is what I find in syslog:

You're sending down data command w/o data.  Panicking probably isn't
the best response here.  Maybe the BUG_ON() should be changed to
WARN_ON_ONCE() + goto sg_err.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Maybe a bug in libata-core
  2010-08-23  9:44   ` Maybe a bug in libata-core Alan Cox
@ 2010-08-23  9:27     ` Tejun Heo
  0 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2010-08-23  9:27 UTC (permalink / raw)
  To: Alan Cox; +Cc: stefan.huebner, jgarzik, linux-ide

Hello,

On 08/23/2010 11:44 AM, Alan Cox wrote:
> It's user triggerable (as root I grant) so really shouldn't be
> generating WARNs in that case either - just an -EIO.

I still want to keep WARN_ON_ONCE() tho.  It will help detecting
possible breakage in the many code paths prepping data vectors through
block, SCSI, DMA API and whatnot.  Silent AC_ERR_SYSTEM failure can be
pretty difficult to track down and it's not like WARN_ON_ONCE() can be
used as an attack vector.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH #upstream-fixes] libata: be less of a drama queen on empty data commands
  2010-08-23  9:16 ` Tejun Heo
@ 2010-08-23  9:27   ` Tejun Heo
  2010-08-23  9:44   ` Maybe a bug in libata-core Alan Cox
  2010-08-23 22:41   ` Stefan /*St0fF*/ Hübner
  2 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2010-08-23  9:27 UTC (permalink / raw)
  To: jgarzik; +Cc: stefan.huebner, linux-ide

ata_qc_issue() BUG_ON()s on data commands w/o data, which may be
submitted via SG_IO.  Be less of a drama queen and just trigger
WARN_ON_ONCE() and fail the command with AC_ERR_SYSTEM.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stefan Hübner <stefan.huebner@stud.tu-ilmenau.de>
---
 drivers/ata/libata-core.c |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7ef7c4f..c035b3d 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5111,15 +5111,18 @@ void ata_qc_issue(struct ata_queued_cmd *qc)
 	qc->flags |= ATA_QCFLAG_ACTIVE;
 	ap->qc_active |= 1 << qc->tag;

-	/* We guarantee to LLDs that they will have at least one
+	/*
+	 * We guarantee to LLDs that they will have at least one
 	 * non-zero sg if the command is a data command.
 	 */
-	BUG_ON(ata_is_data(prot) && (!qc->sg || !qc->n_elem || !qc->nbytes));
+	if (WARN_ON_ONCE(ata_is_data(prot) &&
+			 (!qc->sg || !qc->n_elem || !qc->nbytes)))
+		goto sys_err;

 	if (ata_is_dma(prot) || (ata_is_pio(prot) &&
 				 (ap->flags & ATA_FLAG_PIO_DMA)))
 		if (ata_sg_setup(qc))
-			goto sg_err;
+			goto sys_err;

 	/* if device is sleeping, schedule reset and abort the link */
 	if (unlikely(qc->dev->flags & ATA_DFLAG_SLEEPING)) {
@@ -5136,7 +5139,7 @@ void ata_qc_issue(struct ata_queued_cmd *qc)
 		goto err;
 	return;

-sg_err:
+sys_err:
 	qc->err_mask |= AC_ERR_SYSTEM;
 err:
 	ata_qc_complete(qc);

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Maybe a bug in libata-core
  2010-08-23  9:16 ` Tejun Heo
  2010-08-23  9:27   ` [PATCH #upstream-fixes] libata: be less of a drama queen on empty data commands Tejun Heo
@ 2010-08-23  9:44   ` Alan Cox
  2010-08-23  9:27     ` Tejun Heo
  2010-08-23 22:41   ` Stefan /*St0fF*/ Hübner
  2 siblings, 1 reply; 8+ messages in thread
From: Alan Cox @ 2010-08-23  9:44 UTC (permalink / raw)
  To: Tejun Heo; +Cc: stefan.huebner, jgarzik, linux-ide

On Mon, 23 Aug 2010 11:16:21 +0200
Tejun Heo <tj@kernel.org> wrote:

> On 08/23/2010 02:52 AM, Stefan /*St0fF*/ Hübner wrote:
> > Hi Jeff and list,
> > 
> > maybe this mail was wrong on the linux-scsi list.  So here we go again:
> > 
> > -------- Original-Nachricht --------
> > Betreff: Maybe a bug in libata-core
> > Datum: Fri, 20 Aug 2010 23:48:09 +0200
> > Von: Stefan Hübner  <stefan.huebner@stud.tu-ilmenau.de>
> > Antwort an: stefan.huebner@stud.tu-ilmenau.de
> > Organisation: TU-Ilmenau
> > An: linux-scsi@vger.kernel.org
> > 
> >  Hi List!
> > 
> > After sending a WRITE_DMA_FUA_EXT ATA-command via SG_IO Passthru to a
> > harddisk (here: /dev/sdd), my kernel panics.  The is what I find in syslog:
> 
> You're sending down data command w/o data.  Panicking probably isn't
> the best response here.  Maybe the BUG_ON() should be changed to
> WARN_ON_ONCE() + goto sg_err.

It's user triggerable (as root I grant) so really shouldn't be
generating WARNs in that case either - just an -EIO.

Alan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Maybe a bug in libata-core
  2010-08-23  9:16 ` Tejun Heo
  2010-08-23  9:27   ` [PATCH #upstream-fixes] libata: be less of a drama queen on empty data commands Tejun Heo
  2010-08-23  9:44   ` Maybe a bug in libata-core Alan Cox
@ 2010-08-23 22:41   ` Stefan /*St0fF*/ Hübner
  2010-08-24  7:27     ` Tejun Heo
  2 siblings, 1 reply; 8+ messages in thread
From: Stefan /*St0fF*/ Hübner @ 2010-08-23 22:41 UTC (permalink / raw)
  To: Tejun Heo, linux-ide, jgarzik

Am 23.08.2010 11:16, schrieb Tejun Heo:
> On 08/23/2010 02:52 AM, Stefan /*St0fF*/ Hübner wrote:
>> Hi Jeff and list,
>>
>> maybe this mail was wrong on the linux-scsi list.  So here we go again:
>>
>> -------- Original-Nachricht --------
>> Betreff: Maybe a bug in libata-core
>> Datum: Fri, 20 Aug 2010 23:48:09 +0200
>> Von: Stefan Hübner  <stefan.huebner@stud.tu-ilmenau.de>
>> Antwort an: stefan.huebner@stud.tu-ilmenau.de
>> Organisation: TU-Ilmenau
>> An: linux-scsi@vger.kernel.org
>>
>>  Hi List!
>>
>> After sending a WRITE_DMA_FUA_EXT ATA-command via SG_IO Passthru to a
>> harddisk (here: /dev/sdd), my kernel panics.  The is what I find in syslog:
> 
> You're sending down data command w/o data.  Panicking probably isn't
> the best response here.  Maybe the BUG_ON() should be changed to
> WARN_ON_ONCE() + goto sg_err.
> 
> Thanks.
> 

As far as I know my code I thought I did send enough data.  Maybe I'm
misunderstanding something: I valloc'ated block_count*logical_block_size
bytes, filled them with data and presented this buffer to the
sg_io_hdr_t structure by setting dxfer_len to the length of the buffer,
setting dxferp to a pointer to the buffer and setting the
dxfer_direction to SG_DXFER_TO_DEV.

The only thing coming to my mind would be overflow of dxfer_len, but as
this is a unsigned int - wouldn't it be 32 bits wide? (and by that
accepting f.e. 64k*512 bytes = 33 554 432 bytes (32M, needing 25 bits))

Any other suggestions, or do I have to present the code?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Maybe a bug in libata-core
  2010-08-23 22:41   ` Stefan /*St0fF*/ Hübner
@ 2010-08-24  7:27     ` Tejun Heo
  2010-09-09  5:42       ` Stefan /*St0fF*/ Hübner
  0 siblings, 1 reply; 8+ messages in thread
From: Tejun Heo @ 2010-08-24  7:27 UTC (permalink / raw)
  To: stefan.huebner; +Cc: linux-ide, jgarzik

Hello,

On 08/24/2010 12:41 AM, Stefan /*St0fF*/ Hübner wrote:
> As far as I know my code I thought I did send enough data.  Maybe I'm
> misunderstanding something: I valloc'ated block_count*logical_block_size
> bytes, filled them with data and presented this buffer to the
> sg_io_hdr_t structure by setting dxfer_len to the length of the buffer,
> setting dxferp to a pointer to the buffer and setting the
> dxfer_direction to SG_DXFER_TO_DEV.
> 
> The only thing coming to my mind would be overflow of dxfer_len, but as
> this is a unsigned int - wouldn't it be 32 bits wide? (and by that
> accepting f.e. 64k*512 bytes = 33 554 432 bytes (32M, needing 25 bits))
> 
> Any other suggestions, or do I have to present the code?

Yeah, posting a simple test case would be the best way to proceed
here.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Maybe a bug in libata-core
  2010-08-24  7:27     ` Tejun Heo
@ 2010-09-09  5:42       ` Stefan /*St0fF*/ Hübner
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan /*St0fF*/ Hübner @ 2010-09-09  5:42 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, jgarzik

Dear list,

I'm very sorry I didn't find the time earlier to pursue this matter.
Indeed it was a coding error on my side.  I sat up sg_io_hdr_t correctly
except for one thing: dxfer_len was set to the amount of blocks instead
of block*logical_blocksize - so it was all my bad.

Thank you for the suggestions, it helped finding the error quickly.

Greetings,
Stefan


Am 24.08.2010 09:27, schrieb Tejun Heo:
> Hello,
> 
> On 08/24/2010 12:41 AM, Stefan /*St0fF*/ Hübner wrote:
>> As far as I know my code I thought I did send enough data.  Maybe I'm
>> misunderstanding something: I valloc'ated block_count*logical_block_size
>> bytes, filled them with data and presented this buffer to the
>> sg_io_hdr_t structure by setting dxfer_len to the length of the buffer,
>> setting dxferp to a pointer to the buffer and setting the
>> dxfer_direction to SG_DXFER_TO_DEV.
>>
>> The only thing coming to my mind would be overflow of dxfer_len, but as
>> this is a unsigned int - wouldn't it be 32 bits wide? (and by that
>> accepting f.e. 64k*512 bytes = 33 554 432 bytes (32M, needing 25 bits))
>>
>> Any other suggestions, or do I have to present the code?
> 
> Yeah, posting a simple test case would be the best way to proceed
> here.
> 
> Thanks.
> 


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-09-09  5:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-23  0:52 Fwd: Maybe a bug in libata-core Stefan /*St0fF*/ Hübner
2010-08-23  9:16 ` Tejun Heo
2010-08-23  9:27   ` [PATCH #upstream-fixes] libata: be less of a drama queen on empty data commands Tejun Heo
2010-08-23  9:44   ` Maybe a bug in libata-core Alan Cox
2010-08-23  9:27     ` Tejun Heo
2010-08-23 22:41   ` Stefan /*St0fF*/ Hübner
2010-08-24  7:27     ` Tejun Heo
2010-09-09  5:42       ` Stefan /*St0fF*/ Hübner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.