linux-rdma.vger.kernel.org archive mirror
From: Michal Kalderon <mkalderon@marvell.com>
To: Gal Pressman <galpress@amazon.com>, "jgg@ziepe.ca" <jgg@ziepe.ca>,
	"dledford@redhat.com" <dledford@redhat.com>
Cc: Ariel Elior <aelior@marvell.com>,
	"bmt@zurich.ibm.com" <bmt@zurich.ibm.com>,
	"sleybo@amazon.com" <sleybo@amazon.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: RE: [PATCH v7 rdma-next 0/7] RDMA/qedr: Use the doorbell overflow recovery mechanism for RDMA
Date: Wed, 21 Aug 2019 08:03:24 +0000
Message-ID: <MN2PR18MB31827364A5B64323681D1B4BA1AA0@MN2PR18MB3182.namprd18.prod.outlook.com>
In-Reply-To: <aca72068-1155-6755-4494-0436a5d5a31f@amazon.com>

> From: Gal Pressman <galpress@amazon.com>
> Sent: Tuesday, August 20, 2019 9:31 PM
> 
> On 20/08/2019 15:18, Michal Kalderon wrote:
> > This patch series uses the doorbell overflow recovery mechanism
> > introduced in commit 36907cd5cd72 ("qed: Add doorbell overflow
> > recovery mechanism") for RDMA (RoCE and iWARP).
> >
> > The first five patches modify the core code to add helper
> > functions for inserting, getting and freeing mmap_xa entries.
> > The code is based on the efa driver's code.
> > There is still an open discussion on whether we should take this even
> > further and make the entire mmap generic. Until a decision is made, I
> > have only created the database API and modified the efa, qedr and siw
> > drivers to use it. The functions are integrated with the umap mechanism.
> >
> > The doorbell recovery code is based on the common code.
> >
> > The efa driver was compile tested and checked only with modprobe/rmmod.
> > SIW was compile tested only.
> 
> Hey Michal,
> 
> I haven't had the time to review the patches yet, but I did run it through our
> regression and got some dmesg call traces [1].
> There are also some kmemleak warnings for suspected memory leaks, don't
> have the full information ATM but I can try and extract it if needed.
> 
> Thanks!
> 
Hi Gal, 

Thanks for the quick testing and feedback!

Can you share some more information on the scenario you're running?
Does this happen every time or intermittently?
Can you send me your .config? Are you running against the rdma-next tree?
Can you reproduce it with ib_core module dynamic debug enabled?

Thanks,
Michal

> [1] (this is the first trace of many)
> BUG: Bad page state in process ib_send_bw  pfn:1411f76
> page:ffffea005047dd80 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
> flags: 0x2fffe000000000()
> raw: 002fffe000000000 dead000000000100 dead000000000122 0000000000000000
> raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
> page dumped because: nonzero _refcount
> Modules linked in: sunrpc dm_mirror dm_region_hash dm_log dm_mod efa ib_uverbs ib_core
> crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd
> glue_helper button pcspkr evdev ip_tables x_tables xfs libcrc32c nvme
> crc32c_intel nvme_core ena ipv6 crc_ccitt nf_defrag_ipv6 autofs4
> CPU: 29 PID: 62474 Comm: ib_send_bw Not tainted 5.3.0-rc1-dirty #1
> Hardware name: Amazon EC2 c5n.18xlarge/, BIOS 1.0 10/16/2017
> Call Trace:
>  dump_stack+0x9a/0xeb
>  bad_page+0x104/0x180
>  free_pcppages_bulk+0x31b/0xdd0
>  ? uncharge_batch+0x1d2/0x2b0
>  ? free_compound_page+0x40/0x40
>  ? free_unref_page_commit+0x152/0x1b0
>  free_unref_page_list+0x1b8/0x3e0
>  release_pages+0x4c6/0x620
>  ? put_pages_list+0xf0/0xf0
>  ? free_pages_and_swap_cache+0x97/0x140
>  tlb_flush_mmu+0x7a/0x280
>  tlb_finish_mmu+0x44/0x170
>  exit_mmap+0x147/0x2b0
>  ? do_munmap+0x10/0x10
>  mmput+0xb4/0x1d0
>  do_exit+0x4c2/0x14d0
>  ? mm_update_next_owner+0x360/0x360
>  ? ktime_get_coarse_real_ts64+0xc0/0x120
>  ? syscall_trace_enter+0x22d/0x5f0
>  ? __audit_syscall_exit+0x31e/0x460
>  ? syscall_slow_exit_work+0x2c0/0x2c0
>  ? kfree+0x221/0x290
>  ? mark_held_locks+0x1c/0xa0
>  do_group_exit+0x6f/0x140
>  __x64_sys_exit_group+0x28/0x30
>  do_syscall_64+0x68/0x290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x7f6072a3b928
> Code: Bad RIP value.
> RSP: 002b:00007ffe4e09ae68 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6072a3b928
> RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> RBP: 00007f6072d24898 R08: 00000000000000e7 R09: ffffffffffffff70
> R10: 00007f60724ead68 R11: 0000000000000246 R12: 00007f6072d24898
> R13: 00007f6072d29d80 R14: 0000000000000000 R15: 0000000000000000
> Disabling lock debugging due to kernel taint

