* [PATCH] Handle access of a target that has been removed
@ 2015-10-13 18:10 Lee Duncan
2015-10-17 12:57 ` FUJITA Tomonori
0 siblings, 1 reply; 7+ messages in thread
From: Lee Duncan @ 2015-10-13 18:10 UTC (permalink / raw)
To: stgt
Hello:
I recently got a report of a tgtd core dump from our opencloud
group. The stack trace showed that a strcmp against a NULL was causing
the failure:
----------------------------------------------------------------
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0 0x00007fa701817576 in __strcmp_sse42 () from /lib64/libc.so.6
#1 0x0000000000408012 in target_find_by_name (
name=0x6ac16f "iqn.2010-10.org.openstack:volume-e812c705-80bc-4064-a84c-5559cda8b1ca") at iscsi/target.c:216
#2 0x0000000000406042 in login_start (conn=0x6abea8) at iscsi/iscsid.c:478
#3 0x0000000000406e77 in cmnd_exec_login (conn=<optimized out>)
at iscsi/iscsid.c:654
#4 cmnd_execute (conn=<optimized out>) at iscsi/iscsid.c:914
#5 iscsi_rx_handler (conn=0x6abea8) at iscsi/iscsid.c:2064
#6 0x0000000000409e98 in iscsi_tcp_event_handler (fd=<optimized out>,
events=1, data=0x63a480 <target_list>) at iscsi/iscsi_tcp.c:158
#7 0x0000000000418f1e in event_loop () at tgtd.c:272
#8 0x0000000000419405 in main (argc=1, argv=<optimized out>) at tgtd.c:438
----------------------------------------------------------------
It looks like target_find_by_name() uses tgt_targetname(), but doesn't
account for the fact that it can return a NULL when the target being
looked up does not (now) exist:
----------------------------------------------------------------
char *tgt_targetname(int tid)
{
struct target *target;
target = target_lookup(tid);
if (!target)
return NULL;
return target->name;
}
----------------------------------------------------------------
I propose the following patch to address this:
---
diff --git a/usr/iscsi/target.c b/usr/iscsi/target.c
index 7afd7fd0ba87..b0a36e7de7b4 100644
--- a/usr/iscsi/target.c
+++ b/usr/iscsi/target.c
@@ -369,9 +369,11 @@ void target_list_build(struct iscsi_connection *conn, char *addr, char *name)
struct iscsi_target *target_find_by_name(const char *name)
{
struct iscsi_target *target;
+ char *tname;
list_for_each_entry(target, &iscsi_targets_list, tlist) {
- if (!strcmp(tgt_targetname(target->tid), name))
+ tname = tgt_targetname(target->tid);
+ if (tname && !strcmp(tname, name))
return target;
}
--
Lee Duncan
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] Handle access of a target that has been removed
2015-10-13 18:10 [PATCH] Handle access of a target that has been removed Lee Duncan
@ 2015-10-17 12:57 ` FUJITA Tomonori
2015-10-19 15:56 ` Lee Duncan
0 siblings, 1 reply; 7+ messages in thread
From: FUJITA Tomonori @ 2015-10-17 12:57 UTC (permalink / raw)
To: lduncan; +Cc: stgt
On Tue, 13 Oct 2015 11:10:17 -0700
Lee Duncan <lduncan@suse.com> wrote:
> Hello:
>
> I recently got a report of a tgtd core dump from our opencloud
> group. The stack trace showed that a strcmp against a NULL was causing
> the failure:
>
> ----------------------------------------------------------------
> Program terminated with signal 11, Segmentation fault.
> (gdb) bt
> #0 0x00007fa701817576 in __strcmp_sse42 () from /lib64/libc.so.6
> #1 0x0000000000408012 in target_find_by_name (
> name=0x6ac16f "iqn.2010-10.org.openstack:volume-e812c705-80bc-4064-a84c-5559cda8b1ca") at iscsi/target.c:216
> #2 0x0000000000406042 in login_start (conn=0x6abea8) at iscsi/iscsid.c:478
> #3 0x0000000000406e77 in cmnd_exec_login (conn=<optimized out>)
> at iscsi/iscsid.c:654
> #4 cmnd_execute (conn=<optimized out>) at iscsi/iscsid.c:914
> #5 iscsi_rx_handler (conn=0x6abea8) at iscsi/iscsid.c:2064
> #6 0x0000000000409e98 in iscsi_tcp_event_handler (fd=<optimized out>,
> events=1, data=0x63a480 <target_list>) at iscsi/iscsi_tcp.c:158
> #7 0x0000000000418f1e in event_loop () at tgtd.c:272
> #8 0x0000000000419405 in main (argc=1, argv=<optimized out>) at tgtd.c:438
> ----------------------------------------------------------------
>
> It looks like target_find_by_name() uses tgt_targetname(), but doesn't
> account for the fact that it can return a NULL when the target being
> looked up does not (now) exist:
Thanks for the patch. But I'm confused why this happens. target with
the same tid as iscsi_target must exist?
> ----------------------------------------------------------------
> char *tgt_targetname(int tid)
> {
> struct target *target;
>
> target = target_lookup(tid);
> if (!target)
> return NULL;
>
> return target->name;
> }
> ----------------------------------------------------------------
>
> I propose the following patch to address this:
>
> ---
> diff --git a/usr/iscsi/target.c b/usr/iscsi/target.c
> index 7afd7fd0ba87..b0a36e7de7b4 100644
> --- a/usr/iscsi/target.c
> +++ b/usr/iscsi/target.c
> @@ -369,9 +369,11 @@ void target_list_build(struct iscsi_connection *conn, char *addr, char *name)
> struct iscsi_target *target_find_by_name(const char *name)
> {
> struct iscsi_target *target;
> + char *tname;
>
> list_for_each_entry(target, &iscsi_targets_list, tlist) {
> - if (!strcmp(tgt_targetname(target->tid), name))
> + tname = tgt_targetname(target->tid);
> + if (tname && !strcmp(tname, name))
> return target;
> }
>
>
> --
> Lee Duncan
> --
> To unsubscribe from this list: send the line "unsubscribe stgt" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Handle access of a target that has been removed
2015-10-17 12:57 ` FUJITA Tomonori
@ 2015-10-19 15:56 ` Lee Duncan
2015-10-28 5:28 ` FUJITA Tomonori
0 siblings, 1 reply; 7+ messages in thread
From: Lee Duncan @ 2015-10-19 15:56 UTC (permalink / raw)
To: FUJITA Tomonori; +Cc: stgt
On 10/17/2015 05:57 AM, FUJITA Tomonori wrote:
> On Tue, 13 Oct 2015 11:10:17 -0700
> Lee Duncan <lduncan@suse.com> wrote:
>
>> Hello:
>>
>> I recently got a report of a tgtd core dump from our opencloud
>> group. The stack trace showed that a strcmp against a NULL was causing
>> the failure:
>>
>> ----------------------------------------------------------------
>> Program terminated with signal 11, Segmentation fault.
>> (gdb) bt
>> #0 0x00007fa701817576 in __strcmp_sse42 () from /lib64/libc.so.6
>> #1 0x0000000000408012 in target_find_by_name (
>> name=0x6ac16f "iqn.2010-10.org.openstack:volume-e812c705-80bc-4064-a84c-5559cda8b1ca") at iscsi/target.c:216
>> #2 0x0000000000406042 in login_start (conn=0x6abea8) at iscsi/iscsid.c:478
>> #3 0x0000000000406e77 in cmnd_exec_login (conn=<optimized out>)
>> at iscsi/iscsid.c:654
>> #4 cmnd_execute (conn=<optimized out>) at iscsi/iscsid.c:914
>> #5 iscsi_rx_handler (conn=0x6abea8) at iscsi/iscsid.c:2064
>> #6 0x0000000000409e98 in iscsi_tcp_event_handler (fd=<optimized out>,
>> events=1, data=0x63a480 <target_list>) at iscsi/iscsi_tcp.c:158
>> #7 0x0000000000418f1e in event_loop () at tgtd.c:272
>> #8 0x0000000000419405 in main (argc=1, argv=<optimized out>) at tgtd.c:438
>> ----------------------------------------------------------------
>>
>> It looks like target_find_by_name() uses tgt_targetname(), but doesn't
>> account for the fact that it can return a NULL when the target being
>> looked up does not (now) exist:
>
> Thanks for the patch. But I'm confused why this happens. target with
> the same tid as iscsi_target must exist?
>
No, I believe this is happening because the targets are getting
dynamically removed. But I will verify that. Because if tgt_targetname()
can return a NULL (as apparently it did this time), there are probably
other places in the code that need to check for that.
--
Lee Duncan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Handle access of a target that has been removed
2015-10-19 15:56 ` Lee Duncan
@ 2015-10-28 5:28 ` FUJITA Tomonori
2015-11-13 19:15 ` Lee Duncan
0 siblings, 1 reply; 7+ messages in thread
From: FUJITA Tomonori @ 2015-10-28 5:28 UTC (permalink / raw)
To: lduncan; +Cc: stgt
On Mon, 19 Oct 2015 08:56:06 -0700
Lee Duncan <lduncan@suse.com> wrote:
> On 10/17/2015 05:57 AM, FUJITA Tomonori wrote:
>> On Tue, 13 Oct 2015 11:10:17 -0700
>> Lee Duncan <lduncan@suse.com> wrote:
>>
>>> Hello:
>>>
>>> I recently got a report of a tgtd core dump from our opencloud
>>> group. The stack trace showed that a strcmp against a NULL was causing
>>> the failure:
>>>
>>> ----------------------------------------------------------------
>>> Program terminated with signal 11, Segmentation fault.
>>> (gdb) bt
>>> #0 0x00007fa701817576 in __strcmp_sse42 () from /lib64/libc.so.6
>>> #1 0x0000000000408012 in target_find_by_name (
>>> name=0x6ac16f "iqn.2010-10.org.openstack:volume-e812c705-80bc-4064-a84c-5559cda8b1ca") at iscsi/target.c:216
>>> #2 0x0000000000406042 in login_start (conn=0x6abea8) at iscsi/iscsid.c:478
>>> #3 0x0000000000406e77 in cmnd_exec_login (conn=<optimized out>)
>>> at iscsi/iscsid.c:654
>>> #4 cmnd_execute (conn=<optimized out>) at iscsi/iscsid.c:914
>>> #5 iscsi_rx_handler (conn=0x6abea8) at iscsi/iscsid.c:2064
>>> #6 0x0000000000409e98 in iscsi_tcp_event_handler (fd=<optimized out>,
>>> events=1, data=0x63a480 <target_list>) at iscsi/iscsi_tcp.c:158
>>> #7 0x0000000000418f1e in event_loop () at tgtd.c:272
>>> #8 0x0000000000419405 in main (argc=1, argv=<optimized out>) at tgtd.c:438
>>> ----------------------------------------------------------------
>>>
>>> It looks like target_find_by_name() uses tgt_targetname(), but doesn't
>>> account for the fact that it can return a NULL when the target being
>>> looked up does not (now) exist:
>>
>> Thanks for the patch. But I'm confused why this happens. target with
>> the same tid as iscsi_target must exist?
>>
>
> No, I believe this is happening because the targets are getting
> dynamically removed. But I will verify that. Because if tgt_targetname()
> can return a NULL (as apparently it did this time), there are probably
> other places in the code that need to check for that.
Ok, I'm still confused but applied the patch. Let's see if it would
help.
Thanks,
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Handle access of a target that has been removed
2015-10-28 5:28 ` FUJITA Tomonori
@ 2015-11-13 19:15 ` Lee Duncan
2015-11-26 5:06 ` Lee Duncan
0 siblings, 1 reply; 7+ messages in thread
From: Lee Duncan @ 2015-11-13 19:15 UTC (permalink / raw)
To: FUJITA Tomonori; +Cc: stgt
On 10/28/2015 06:28 AM, FUJITA Tomonori wrote:
> On Mon, 19 Oct 2015 08:56:06 -0700
> Lee Duncan <lduncan@suse.com> wrote:
>
>> On 10/17/2015 05:57 AM, FUJITA Tomonori wrote:
>>> On Tue, 13 Oct 2015 11:10:17 -0700
>>> Lee Duncan <lduncan@suse.com> wrote:
>>>
>>>> Hello:
>>>>
>>>> I recently got a report of a tgtd core dump from our opencloud
>>>> group. The stack trace showed that a strcmp against a NULL was causing
>>>> the failure:
>>>>
>>>> ----------------------------------------------------------------
>>>> Program terminated with signal 11, Segmentation fault.
>>>> (gdb) bt
>>>> #0 0x00007fa701817576 in __strcmp_sse42 () from /lib64/libc.so.6
>>>> #1 0x0000000000408012 in target_find_by_name (
>>>> name=0x6ac16f "iqn.2010-10.org.openstack:volume-e812c705-80bc-4064-a84c-5559cda8b1ca") at iscsi/target.c:216
>>>> #2 0x0000000000406042 in login_start (conn=0x6abea8) at iscsi/iscsid.c:478
>>>> #3 0x0000000000406e77 in cmnd_exec_login (conn=<optimized out>)
>>>> at iscsi/iscsid.c:654
>>>> #4 cmnd_execute (conn=<optimized out>) at iscsi/iscsid.c:914
>>>> #5 iscsi_rx_handler (conn=0x6abea8) at iscsi/iscsid.c:2064
>>>> #6 0x0000000000409e98 in iscsi_tcp_event_handler (fd=<optimized out>,
>>>> events=1, data=0x63a480 <target_list>) at iscsi/iscsi_tcp.c:158
>>>> #7 0x0000000000418f1e in event_loop () at tgtd.c:272
>>>> #8 0x0000000000419405 in main (argc=1, argv=<optimized out>) at tgtd.c:438
>>>> ----------------------------------------------------------------
>>>>
>>>> It looks like target_find_by_name() uses tgt_targetname(), but doesn't
>>>> account for the fact that it can return a NULL when the target being
>>>> looked up does not (now) exist:
>>>
>>> Thanks for the patch. But I'm confused why this happens. target with
>>> the same tid as iscsi_target must exist?
>>>
>>
>> No, I believe this is happening because the targets are getting
>> dynamically removed. But I will verify that. Because if tgt_targetname()
>> can return a NULL (as apparently it did this time), there are probably
>> other places in the code that need to check for that.
>
> Ok, I'm still confused but applied the patch. Let's see if it would
> help.
>
> Thanks,
I am still trying to look more deeply into how this could happen.
This bug was triggered when using cloud storage as the back-end for for
their target, and that cloud storage "goes away". (I am still trying to
determine what that actually means.) If so, I should be able to simulate
by allowing my back-end storage to go away.
As you say, let's see if we see any other instances of tgt_targetname()
returning null, in other spots in the code.
Thank you.
--
Lee Duncan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Handle access of a target that has been removed
2015-11-13 19:15 ` Lee Duncan
@ 2015-11-26 5:06 ` Lee Duncan
2015-11-27 5:56 ` FUJITA Tomonori
0 siblings, 1 reply; 7+ messages in thread
From: Lee Duncan @ 2015-11-26 5:06 UTC (permalink / raw)
To: FUJITA Tomonori; +Cc: stgt
Hi:
Response below ...
On 11/13/2015 11:15 AM, Lee Duncan wrote:
> On 10/28/2015 06:28 AM, FUJITA Tomonori wrote:
>> On Mon, 19 Oct 2015 08:56:06 -0700
>> Lee Duncan <lduncan@suse.com> wrote:
>>
>>> On 10/17/2015 05:57 AM, FUJITA Tomonori wrote:
>>>> On Tue, 13 Oct 2015 11:10:17 -0700
>>>> Lee Duncan <lduncan@suse.com> wrote:
>>>>
>>>>> Hello:
>>>>>
>>>>> I recently got a report of a tgtd core dump from our opencloud
>>>>> group. The stack trace showed that a strcmp against a NULL was causing
>>>>> the failure:
>>>>>
>>>>> ...
>>>>>
>>>>> It looks like target_find_by_name() uses tgt_targetname(), but doesn't
>>>>> account for the fact that it can return a NULL when the target being
>>>>> looked up does not (now) exist:
>>>>
>>>> Thanks for theI supplied a patch you appliedI supplied a patch you applied patch. But I'm confused why this happens. target with
>>>> the same tid as iscsi_target must exist?
>>>>
>>>
>>> No, I believe this is happening because the targets are getting
>>> dynamically removed. But I will verify that. Because if tgt_targetname()
>>> can return a NULL (as apparently it did this time), there are probably
>>> other places in the code that need to check for that.
>>
>> Ok, I'm still confused but applied the patch. Let's see if it would
>> help.
>>
>> Thanks,
>
> I am still trying to look more deeply into how this could happen.
>
> This bug was triggered when using cloud storage as the back-end for for
> their target, and that cloud storage "goes away". (I am still trying to
> determine what that actually means.) If so, I should be able to simulate
> by allowing my back-end storage to go away.
>
> As you say, let's see if we see any other instances of tgt_targetname()
> returning null, in other spots in the code.
>
I thought that you might like to know that I finally tested this. I
created a test bed where I randomly created and deleted 4 target nodes
by changing the configuration files then running "tgt-admin --update
ALL -f".
At the same time, I randomly attempted to log out of then back into
all 4 targets.
It only takes a minute to hit the case the patch I sent fixes, where
target_find_by_name() dies when tgt_targetname() returns an unexpected
NULL. It was nice to validate that the patch was actually needed.
You might remember, though, that I was worried that other places where
tgt_targetname() where used might need to check for NULL as well. So
I tested for that.
Surprisingly, none of the other places that uses tgt_targetname() ever
got NULL back. I tested for several hours. So either I just didn't hit
the corner case needed to tickle such a bug, or that just doesn't
happen. I would love to investigate further, but I may not have the
time to get to it soon. In the mean time, it appears we are in good
shape.
--
Lee Duncan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Handle access of a target that has been removed
2015-11-26 5:06 ` Lee Duncan
@ 2015-11-27 5:56 ` FUJITA Tomonori
0 siblings, 0 replies; 7+ messages in thread
From: FUJITA Tomonori @ 2015-11-27 5:56 UTC (permalink / raw)
To: lduncan; +Cc: stgt
Hi,
On Wed, 25 Nov 2015 21:06:14 -0800
Lee Duncan <lduncan@suse.com> wrote:
> I thought that you might like to know that I finally tested this. I
Yeah, thanks a lot for the input!
> created a test bed where I randomly created and deleted 4 target nodes
> by changing the configuration files then running "tgt-admin --update
> ALL -f".
>
> At the same time, I randomly attempted to log out of then back into
> all 4 targets.
>
> It only takes a minute to hit the case the patch I sent fixes, where
> target_find_by_name() dies when tgt_targetname() returns an unexpected
> NULL. It was nice to validate that the patch was actually needed.
>
> You might remember, though, that I was worried that other places where
> tgt_targetname() where used might need to check for NULL as well. So
> I tested for that.
>
> Surprisingly, none of the other places that uses tgt_targetname() ever
> got NULL back. I tested for several hours. So either I just didn't hit
> the corner case needed to tickle such a bug, or that just doesn't
> happen. I would love to investigate further, but I may not have the
> time to get to it soon. In the mean time, it appears we are in good
> shape.
I see. I try to think more.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-11-27 5:56 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-13 18:10 [PATCH] Handle access of a target that has been removed Lee Duncan
2015-10-17 12:57 ` FUJITA Tomonori
2015-10-19 15:56 ` Lee Duncan
2015-10-28 5:28 ` FUJITA Tomonori
2015-11-13 19:15 ` Lee Duncan
2015-11-26 5:06 ` Lee Duncan
2015-11-27 5:56 ` FUJITA Tomonori
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).