From: Ursula Braun <ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: "Matan Barak (External)"
	<matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: "Saeed Mahameed
	(saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org)"
	<saeedm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Leon Romanovsky <leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Fwd: mlx5_ib_post_send panic on s390x
Date: Mon, 6 Mar 2017 14:03:00 +0100
Message-ID: <491cf3e1-b2f8-3695-ecd4-3d34b0ae9e25@linux.vnet.ibm.com>
In-Reply-To: <20e4f31e-b2a7-89fb-d4c0-583c0dc1efb6-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>



On 02/26/2017 10:45 AM, Matan Barak (External) wrote:
> On 24/02/2017 12:27, Ursula Braun wrote:
>> sorry, typo in the mail address.
>>
>> -------- Forwarded Message --------
>> Subject: mlx5_ib_post_send panic on s390x
>> Date: Fri, 24 Feb 2017 10:51:32 +0100
>> From: Ursula Braun <ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>> To: matamb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org, leonro-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
>> CC: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>
>> Hi Saeed and Matan,
>>
>> Up to now I have run SMC-R traffic on ConnectX-3, which works.
>> But when switching to ConnectX-4, the first mlx5_ib_post_send() fails:
>>
>> [  247.787660] Unable to handle kernel pointer dereference in virtual kernel address space
>> [  247.787662] Failing address: 000000010484a000 TEID: 000000010484a803
>> [  247.787664] Fault in home space mode while using kernel ASCE.
>> [  247.787667] AS:00000000011ec007 R3:0000000000000024
>> [  247.787701] Oops: 003b ilc:2 [#1] PREEMPT SMP
>> [  247.787704] Modules linked in: smc_diag smc xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter rpcrdma rdma_ucm ib_ucm ib_uverbs rdma_cm configfs ib_cm iw_cm mlx5_ib ib_core mlx5_core xts gf128mul cbc ecb aes_s390 des_s390 des_generic ptp sha512_s390 pps_core sha256_s390 sha1_s390 sha_common eadm_sch nfsd auth_rpcgss oid_registry nfs_acl lockd vhost_net tun grace vhost sunrpc macvtap sch_fq_codel macvlan dm_multipath kvm dm_mod ip_tables x_tables autofs4
>> [  247.787738] CPU: 0 PID: 10498 Comm: kworker/0:3 Tainted: G        W       4.10.0uschi+ #4
>> [  247.787739] Hardware name: IBM              2964 N96              704              (LPAR)
>> [  247.787743] Workqueue: events smc_listen_work [smc]
>> [  247.787745] task: 00000000b4148008 task.stack: 0000000099c2c000
>> [  247.787746] Krnl PSW : 0404c00180000000 0000000000762412 (memcpy+0x22/0x48)
>> [  247.787751]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
>> [  247.787753] Krnl GPRS: 0000000000a7a100 0000000099c96414 0000000099c96414 000000010484afc8
>> [  247.787755]            000000000000002b 000000000076242e 000000000000002c 0000000099c96440
>> [  247.787757]            000000010484afc8 000000000000002c 0000000099c96414 0000000000000001
>> [  247.787758]            00000000ae8a75d0 000003ff8108aa50 000003ff8107cde6 0000000099c2fa38
>> [  247.787764] Krnl Code: 0000000000762404: b9040012        lgr    %r1,%r2
>>                           0000000000762408: a7740008        brc    7,762418
>>                          #000000000076240c: c05000000011    larl    %r5,76242e
>>                          >0000000000762412: 44405000        ex    %r4,0(%r5)
>>                           0000000000762416: 07fe        bcr    15,%r14
>>                           0000000000762418: d2ff10003000    mvc    0(256,%r1),0(%r3)
>>                           000000000076241e: 41101100        la    %r1,256(%r1)
>>                           0000000000762422: 41303100        la    %r3,256(%r3)
>> [  247.787780] Call Trace:
>> [  247.787785] ([<000003ff8107cdd4>] mlx5_ib_post_send+0x139c/0x1810 [mlx5_ib])
>> [  247.787789]  [<000003ff8047999a>] smc_wr_tx_send+0xd2/0x100 [smc]
>> [  247.787792]  [<000003ff8047a97a>] smc_llc_send_confirm_link+0x9a/0xd0 [smc]
>> [  247.787794]  [<000003ff804751ee>] smc_listen_work+0x24e/0x4e0 [smc]
>> [  247.787797]  [<00000000001659e8>] process_one_work+0x3d8/0x780
>> [  247.787799]  [<0000000000166044>] worker_thread+0x2b4/0x478
>> [  247.787801]  [<000000000016e62c>] kthread+0x15c/0x170
>> [  247.787803]  [<0000000000a115f2>] kernel_thread_starter+0x6/0xc
>> [  247.787804]  [<0000000000a115ec>] kernel_thread_starter+0x0/0xc
>> [  247.787806] INFO: lockdep is turned off.
>> [  247.787807] Last Breaking-Event-Address:
>> [  247.787811]  [<000003ff8106edc0>] 0x3ff8106edc0
>> [  247.787813]
>> [  247.787814] Kernel panic - not syncing: Fatal exception: panic_on_oops
>>
>> The problem seems to be caused by the use of plain memcpy() in set_data_inl_seg().
>> The address that the SMC code provides in struct ib_send_wr *wr belongs to
>> an area mapped with the ib_dma_map_single() call. On s390x, addresses of this kind
>> require extra access functions (see arch/s390/include/asm/io.h).
>>
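The register dump in the oops can be cross-checked against this explanation. The following is only a sketch in plain user-space C, under two assumptions that are not stated in the report itself: that pages are 4 KiB, and that %r4 holds "length minus one" at the EX instruction (which is how a single MVC of up to 256 bytes is normally executed). The helper names page_base(), copy_len() and source_in_faulting_page() are made up for illustration. With those assumptions, the copy length is 44 bytes and the memcpy source in %r3 lies entirely inside the page reported as the failing address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_4K 0x1000ULL

/* Values copied from the oops quoted above. */
static const uint64_t failing_addr = 0x000000010484a000ULL; /* "Failing address" */
static const uint64_t gpr3_src     = 0x000000010484afc8ULL; /* Krnl GPRS, %r3 */
static const uint64_t gpr4         = 0x000000000000002bULL; /* Krnl GPRS, %r4 */

/* Round an address down to the start of its page. */
static uint64_t page_base(uint64_t addr)
{
    return addr & ~(PAGE_SIZE_4K - 1);
}

/*
 * Assumption: at the EX instruction %r4 holds (length - 1).
 * A single MVC moves at most 256 bytes, so the value must fit in 8 bits.
 */
static uint64_t copy_len(uint64_t r4_at_ex)
{
    assert(r4_at_ex <= 0xff);
    return r4_at_ex + 1;
}

/* Returns 1 if the whole source range lies in the faulting page, else 0. */
static int source_in_faulting_page(void)
{
    uint64_t last = gpr3_src + copy_len(gpr4) - 1;

    return page_base(gpr3_src) == failing_addr &&
           page_base(last) == failing_addr;
}
```

If the assumptions hold, copy_len(gpr4) evaluates to 44 and source_in_faulting_page() returns 1, which is consistent with the fault being raised on the read of the source buffer rather than on the write to the send queue.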
> 
> So I guess memcpy_toio() is required here, right?
> Since we don't have an s390-based system, could you please test this?
memcpy_toio() did not help. I then replaced the memcpy() calls in set_data_inl_seg()
with this preliminary test code (just to give an idea, not a real patch proposal):

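/* Test helper: copy via the raw I/O accessors instead of plain memcpy(). */
static void *memcpy_usc(void *dest, const void *src, size_t count)
{
        char *tmp_dest = (char *)dest;
        const char *tmp_src = (const char *)src;
        size_t copied = 0;      /* same type as count, no signed/unsigned compare */
        u32 tmp_u32;

        /* Copy whole 32-bit words; stop before a partial trailing word. */
        while (copied + sizeof(tmp_u32) <= count) {
                tmp_u32 = __raw_readl(tmp_src);
                __raw_writel(tmp_u32, tmp_dest);
                copied += sizeof(tmp_u32);
                tmp_dest += sizeof(tmp_u32);
                tmp_src += sizeof(tmp_u32);
        }
        /* Copy the remaining 1-3 bytes, if any, without overrunning. */
        while (copied < count) {
                __raw_writeb(__raw_readb(tmp_src++), tmp_dest++);
                copied++;
        }
        return dest;
}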

This helped; the first mlx5_ib_post_send() call initiated from the SMC code (type IB_WR_SEND,
flagged with IB_SEND_INLINE, length 44 bytes) ran successfully.
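
To illustrate the word-wise copy outside the kernel, here is a minimal user-space sketch. It is not the driver code: load_u32() and store_u32() are hypothetical stand-ins for __raw_readl() and __raw_writel() that just access ordinary memory, and copy_words() is a made-up name. The byte-wise tail for lengths that are not a multiple of four is an addition for the sketch; the 44-byte inline send mentioned above needs only the word loop (11 iterations).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __raw_readl(): load 32 bits from plain memory. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical stand-in for __raw_writel(): store 32 bits to plain memory. */
static void store_u32(uint32_t v, void *p)
{
    memcpy(p, &v, sizeof(v));
}

/* Copy count bytes in 32-bit units, then finish any tail byte by byte. */
static void *copy_words(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t copied = 0;

    assert(dest != NULL && src != NULL);

    /* Whole words first; never touch bytes beyond count. */
    while (copied + sizeof(uint32_t) <= count) {
        store_u32(load_u32(s + copied), d + copied);
        copied += sizeof(uint32_t);
    }
    /* Remaining 0-3 bytes. */
    while (copied < count) {
        d[copied] = s[copied];
        copied++;
    }
    return dest;
}
```

Calling copy_words(dst, src, 44) copies exactly 44 bytes and leaves the bytes after dst[43] untouched.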

A subsequent mlx5_ib_post_send() call of type RDMA_WRITE seems to stall later on, but
this is something I have to analyze in more detail.

> 
>> Kind regards, Ursula Braun (IBM Germany)
>>
> 
> Thanks for notifying.
> 
> Matan
> 

