* RE: I/O error on dd commands
@ 2016-12-08 0:45 Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil
From: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil @ 2016-12-08 0:45 UTC (permalink / raw)
To: maxg-VPRAkNaXOzVWk0Htik3J/w, monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
    Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil

Hi Max,
> Any interesting logs in dmesg?
Dynamic debug was used, and the following log appeared in dmesg on the target.
[ 584.274459] rdma_rxe: qp#17 state = GET_REQ
[ 584.274461] rdma_rxe: qp#17 state = CHK_PSN
[ 584.274462] rdma_rxe: qp#17 state = CHK_OP_SEQ
[ 584.274464] rdma_rxe: qp#17 state = CHK_OP_VALID
[ 584.274465] rdma_rxe: qp#17 state = CHK_RESOURCE
[ 584.274467] rdma_rxe: qp#17 state = CHK_LENGTH
[ 584.274468] rdma_rxe: qp#17 state = CHK_RKEY
[ 584.274470] rdma_rxe: qp#17 state = EXECUTE
[ 584.274473] rdma_rxe: qp#17 state = COMPLETE
[ 584.274475] rdma_rxe: qp#17 state = ACKNOWLEDGE
[ 584.274496] rdma_rxe: qp#17 state = CLEANUP
[ 584.274498] rdma_rxe: qp#17 state = DONE
[ 584.274499] rdma_rxe: qp#17 state = GET_REQ
[ 584.274500] rdma_rxe: qp#17 state = EXIT
[ 584.275561] nvmet_rdma: IB send queue full (needed 1): queue 0 cntlid 1
thanks,
Haruo.
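(Reference note: the per-QP "state = ..." trace above is emitted by the responder state machine in drivers/infiniband/sw/rxe/rxe_resp.c. Below is a heavily abbreviated sketch of the 4.9-era rxe_responder() loop; only the trace point and a few representative states are kept, everything else is elided, and details should be checked against the real source.)

    int rxe_responder(void *arg)
    {
            struct rxe_qp *qp = (struct rxe_qp *)arg;
            struct rxe_pkt_info *pkt = NULL;
            enum resp_states state = RESPST_GET_REQ;

            while (1) {
                    /* this pr_debug produces the "qp#N state = ..." lines quoted above */
                    pr_debug("qp#%d state = %s\n", qp_num(qp),
                             resp_state_name[state]);

                    switch (state) {
                    case RESPST_GET_REQ:            /* pull the next request packet */
                            state = get_req(qp, &pkt);
                            break;
                    case RESPST_CHK_PSN:            /* validate the packet sequence number */
                            state = check_psn(qp, pkt);
                            break;
                    case RESPST_CHK_RKEY:           /* validate rkey / MR bounds */
                            state = check_rkey(qp, pkt);
                            break;
                    case RESPST_EXECUTE:            /* perform the send/RDMA operation */
                            state = execute(qp, pkt);
                            break;
                    /* ... the remaining CHK_*, COMPLETE, ACKNOWLEDGE and CLEANUP states
                     * follow the same pattern; DONE and EXIT break out of the loop ...
                     */
                    }
            }
    }

These pr_debug calls are compiled out of the normal path and only appear once dynamic debug is enabled for the rdma_rxe module, which matches the "dynamic debug was used" note above.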
* Re: I/O error on dd commands
@ 2016-12-08  9:25 Max Gurtovoy
From: Max Gurtovoy @ 2016-12-08 9:25 UTC (permalink / raw)
To: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil, monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 12/8/2016 2:45 AM, Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil@public.gmane.org wrote:
> Hi Max,
>
>> Any interesting logs in dmesg?
>
> Dynamic debug was used, and the following log appeared in dmesg on the target.
>
> [ 584.274459] rdma_rxe: qp#17 state = GET_REQ
> [ 584.274461] rdma_rxe: qp#17 state = CHK_PSN
> [ 584.274462] rdma_rxe: qp#17 state = CHK_OP_SEQ
> [ 584.274464] rdma_rxe: qp#17 state = CHK_OP_VALID
> [ 584.274465] rdma_rxe: qp#17 state = CHK_RESOURCE
> [ 584.274467] rdma_rxe: qp#17 state = CHK_LENGTH
> [ 584.274468] rdma_rxe: qp#17 state = CHK_RKEY
> [ 584.274470] rdma_rxe: qp#17 state = EXECUTE
> [ 584.274473] rdma_rxe: qp#17 state = COMPLETE
> [ 584.274475] rdma_rxe: qp#17 state = ACKNOWLEDGE
> [ 584.274496] rdma_rxe: qp#17 state = CLEANUP
> [ 584.274498] rdma_rxe: qp#17 state = DONE
> [ 584.274499] rdma_rxe: qp#17 state = GET_REQ
> [ 584.274500] rdma_rxe: qp#17 state = EXIT
> [ 584.275561] nvmet_rdma: IB send queue full (needed 1): queue 0 cntlid 1
>
> thanks,
> Haruo.

Very weird. Can you also print "queue->host_qid" in nvmet_rdma's
nvmet_rdma_execute_command(), i.e. "IB send queue full (needed 1):
queue_idx 0 cntlid 1 queue_qid ????"?

It seems like it's the admin queue, but I'm not sure.

Moni,
can you advise regarding the rdma_rxe logs?

thanks,
Max.
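(Reference note: a sketch of the kind of debug change being requested here, against the 4.9-era nvmet_rdma_execute_command() in drivers/nvme/target/rdma.c. The surrounding lines are paraphrased from that source, so the exact format string and wrapping may differ; the queue->host_qid argument is the addition being asked for.)

    static bool nvmet_rdma_execute_command(struct nvmet_rdma_rsp *rsp)
    {
            struct nvmet_rdma_queue *queue = rsp->queue;

            /* Each command needs one send WR plus n_rdma WRs for its RDMA
             * READ/WRITE work; if the per-queue budget would go negative,
             * back off and retry later -- this is what logs the
             * "IB send queue full" message seen above.
             */
            if (unlikely(atomic_sub_return(1 + rsp->n_rdma,
                            &queue->sq_wr_avail) < 0)) {
                    pr_debug("IB send queue full (needed %d): queue_idx %u cntlid %u queue_qid %d\n",
                             1 + rsp->n_rdma, queue->idx,
                             queue->nvme_sq.ctrl->cntlid,
                             queue->host_qid);      /* <-- the added field */
                    atomic_add(1 + rsp->n_rdma, &queue->sq_wr_avail);
                    return false;
            }

            /* ... the rest of the function (posting the RDMA READ or executing
             * the request) is unchanged ...
             */
            return true;
    }

With host_qid in the log line, the trace would show directly whether the queue running out of send WRs is the admin queue (qid 0), which is what is suspected here.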
* RE: I/O error on dd commands
@ 2016-12-16  4:25 Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil
From: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil @ 2016-12-16 4:25 UTC (permalink / raw)
To: maxg-VPRAkNaXOzVWk0Htik3J/w, monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
    Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil

Hi Max and Moni,

The "IB send queue full" issue is easily reproduced on the vanilla 4.9 kernel.

> Very weird. Can you also print "queue->host_qid" in nvmet_rdma's
> nvmet_rdma_execute_command(), i.e. "IB send queue full (needed 1):
> queue_idx 0 cntlid 1 queue_qid ????"?
>
> It seems like it's the admin queue, but I'm not sure.
>
> Moni,
> can you advise regarding the rdma_rxe logs?

"queue->host_qid" is now printed. If there is any other information I can
gather to debug this issue, please tell me.

thanks,
Haruo.
* RE: I/O error on dd commands
@ 2016-12-06  5:55 Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil
From: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil @ 2016-12-06 5:55 UTC (permalink / raw)
To: maxg-VPRAkNaXOzVWk0Htik3J/w, monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
    Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil

Hi Max,

Thank you for your reply.
This issue also reproduces on vanilla 4.9-rc8.

> Can you try to repro it with iSER?

I'm sorry, I can't try iSER.

> What is your backing store device?

I am using a TOSHIBA Enterprise SSD PX04PMB160.
https://toshiba.semicon-storage.com/ap-en/product/storage-products/enterprise-ssd.html

> Does this happen with 1k bs only, or with different block sizes as well?

The issue also reproduces with 512k/1024k/2048k/4096k/8192k block sizes.
It occurs on more than one machine, so it isn't a NIC failure.
The LAN connection is direct (so it isn't a hub failure either).

> rxe_req.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> --- linux-4.9-rc7/drivers/infiniband/sw/rxe/rxe_req.c.orig	2016-12-05 10:11:38.000000000 +0900
> +++ linux-4.9-rc7/drivers/infiniband/sw/rxe/rxe_req.c	2016-12-05 10:15:43.000000000 +0900
> @@ -705,12 +705,12 @@ next_wqe:
>         skb = init_req_packet(qp, wqe, opcode, payload, &pkt);
>         if (unlikely(!skb)) {
>                 pr_err("qp#%d Failed allocating skb\n", qp_num(qp));
> -               goto err;
> +               goto err1;
>         }
>
>         if (fill_packet(qp, wqe, &pkt, skb, payload)) {
>                 pr_debug("qp#%d Error during fill packet\n", qp_num(qp));
> -               goto err;
> +               goto err2;
>         }
>
>         /*
> @@ -734,15 +734,16 @@ next_wqe:
>                 goto exit;
>         }
>
> -       goto err;
> +       goto err1;
>  }
>
>         update_state(qp, wqe, &pkt, payload);
>
>         goto next_wqe;
>
> -err:
> +err2:
>         kfree_skb(skb);
> +err1:
>         wqe->status = IB_WC_LOC_PROT_ERR;
>         wqe->state = wqe_state_error;

The patch is unrelated to this issue, but please apply it anyway.

thanks,
Haruo.
* Re: I/O error on dd commands
@ 2016-12-07 10:47 Max Gurtovoy
From: Max Gurtovoy @ 2016-12-07 10:47 UTC (permalink / raw)
To: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil, monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Any interesting logs in dmesg?

On 12/6/2016 7:55 AM, Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil@public.gmane.org wrote:
> Hi Max,
>
> Thank you for your reply.
> This issue also reproduces on vanilla 4.9-rc8.
>
>> Can you try to repro it with iSER?
>
> I'm sorry, I can't try iSER.
>
>> What is your backing store device?
>
> I am using a TOSHIBA Enterprise SSD PX04PMB160.
> https://toshiba.semicon-storage.com/ap-en/product/storage-products/enterprise-ssd.html
>
>> Does this happen with 1k bs only, or with different block sizes as well?
>
> The issue also reproduces with 512k/1024k/2048k/4096k/8192k block sizes.
> It occurs on more than one machine, so it isn't a NIC failure.
> The LAN connection is direct (so it isn't a hub failure either).
>
>> [rxe_req.c patch snipped; it is quoted in full in the previous message]
>
> The patch is unrelated to this issue, but please apply it anyway.
>
> thanks,
> Haruo.
* I/O error on dd commands
@ 2016-12-05  5:41 Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil
From: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil @ 2016-12-05 5:41 UTC (permalink / raw)
To: monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
    Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil

Hi Moni,

Does the rxe driver in vanilla 4.9-rc6 work correctly?
When a read or a write is tested with the dd command, it fails with the
following errors.

(read)
# dd if=/dev/nvme0n1 of=<readfile>.bin bs=1024 count=10000 iflag=direct

blk_update_request: I/O error, dev nvme0n1, sector 1860
nvme nvme0: reconnecting in 10 seconds
nvme nvme0: Successfully reconnected

or

nvme nvme0: failed nvme_keep_alive_end_io error=16391
nvme nvme0: reconnecting in 10 seconds
nvme nvme0: Successfully reconnected

(write)
# dd if=<writefile>.bin of=/dev/nvme0n1 bs=1024 count=10000 oflag=direct

blk_update_request: I/O error, dev nvme0n1, sector 1860
nvme nvme0: reconnecting in 10 seconds
nvme nvme0: Successfully reconnected

or

nvme nvme0: failed nvme_keep_alive_end_io error=16391
nvme nvme0: reconnecting in 10 seconds
nvme nvme0: Successfully reconnected

I'd like to investigate the root cause of this error; are there any ideas?

(PS)
While checking the rxe driver, I found what looks like a typo in the skb
release path of rxe_requester(). Is my patch right?

rxe_req.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

--- linux-4.9-rc7/drivers/infiniband/sw/rxe/rxe_req.c.orig	2016-12-05 10:11:38.000000000 +0900
+++ linux-4.9-rc7/drivers/infiniband/sw/rxe/rxe_req.c	2016-12-05 10:15:43.000000000 +0900
@@ -705,12 +705,12 @@ next_wqe:
        skb = init_req_packet(qp, wqe, opcode, payload, &pkt);
        if (unlikely(!skb)) {
                pr_err("qp#%d Failed allocating skb\n", qp_num(qp));
-               goto err;
+               goto err1;
        }

        if (fill_packet(qp, wqe, &pkt, skb, payload)) {
                pr_debug("qp#%d Error during fill packet\n", qp_num(qp));
-               goto err;
+               goto err2;
        }

        /*
@@ -734,15 +734,16 @@ next_wqe:
                goto exit;
        }

-       goto err;
+       goto err1;
 }

        update_state(qp, wqe, &pkt, payload);

        goto next_wqe;

-err:
+err2:
        kfree_skb(skb);
+err1:
        wqe->status = IB_WC_LOC_PROT_ERR;
        wqe->state = wqe_state_error;

--
Haruo
* Re: I/O error on dd commands
@ 2016-12-05 10:12 Max Gurtovoy
From: Max Gurtovoy @ 2016-12-05 10:12 UTC (permalink / raw)
To: Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil, monis-VPRAkNaXOzVWk0Htik3J/w
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 12/5/2016 7:41 AM, Tomita.Haruo-IGagC74glE2s6Rmoc/2Z03gSJqDPrsil@public.gmane.org wrote:
> Hi Moni,
>
> Does the rxe driver in vanilla 4.9-rc6 work correctly?
> When a read or a write is tested with the dd command, it fails with the
> following errors.
>
> (read)
> # dd if=/dev/nvme0n1 of=<readfile>.bin bs=1024 count=10000 iflag=direct
>
> blk_update_request: I/O error, dev nvme0n1, sector 1860
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> or
>
> nvme nvme0: failed nvme_keep_alive_end_io error=16391
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> (write)
> # dd if=<writefile>.bin of=/dev/nvme0n1 bs=1024 count=10000 oflag=direct
>
> blk_update_request: I/O error, dev nvme0n1, sector 1860
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> or
>
> nvme nvme0: failed nvme_keep_alive_end_io error=16391
> nvme nvme0: reconnecting in 10 seconds
> nvme nvme0: Successfully reconnected
>
> I'd like to investigate the root cause of this error; are there any ideas?

Hi Haruo,

Can you try to repro it with iSER?

What is your backing store device?

Does this happen with 1k bs only, or with different block sizes as well?

thanks,
Max.