* dmeventd generates segfault when creating snapshots
From: Chiyohiko Morita @ 2014-02-10 11:34 UTC (permalink / raw)
  To: lvm-devel


(I'm not on this list; please Cc me)

Under high load, the dmeventd daemon sometimes hits a
segmentation violation when creating an LVM snapshot.

In _register_for_event(), 10 seconds or more may pass between the
call to _register_for_timeout() and the call to _create_thread().
(Please refer to the _register_for_event() excerpt below.)

If the _timeout_thread thread calls pthread_kill() before
_create_thread() has run, dmeventd segfaults, because
thread->thread is still 0 (NULL).
(Please refer to the _timeout_thread() excerpt below.)
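
(For illustration only -- this is not dmeventd code -- the following
minimal standalone program reproduces the same failure mode: calling
pthread_kill() with a pthread_t that is still 0, which is the state
of thread->thread before _create_thread() runs, dereferences a NULL
thread descriptor inside libpthread, matching the backtraces below
where threadid=0 and pd=0x0.)

	/* Standalone demonstrator -- NOT part of dmeventd.
	 * On this glibc/NPTL build pthread_t is an integer type, so
	 * initializing it to 0 mirrors the still-zeroed thread->thread. */
	#include <pthread.h>
	#include <signal.h>

	int main(void)
	{
		pthread_t not_created_yet = 0;

		/* libpthread treats the argument as a pointer to the thread
		 * descriptor and reads its tid field, so this dereferences
		 * NULL and the process dies with SIGSEGV. */
		pthread_kill(not_created_yet, SIGALRM);

		return 0;
	}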

The relevant source code and core dumps are shown below.

====================================================================
LVM2.2.02.XXX/daemons/dmeventd/dmeventd.c
====================================================================
/*
 * Register for an event.
 *
 * Only one caller at a time here, because we use
 * a FIFO and lock it against multiple accesses.
 */
static int _register_for_event(struct message_data *message_data)
{
	int ret = 0;

	:

	/* If creation of timeout thread fails (as it may), we fail
	   here completely. The client is responsible for either
	   retrying later or trying to register without timeout
	   events. However, if timeout thread cannot be started, it
	   usually means we are so starved on resources that we are
	   almost as good as dead already... */
	if (thread_new->events & DM_EVENT_TIMEOUT) {
		ret = -_register_for_timeout(thread_new);   ----------------+ 
		if (ret) {					            |
		    _unlock_mutex();                                        |
		    goto out;                                               |
		}                                                           |
	}                                         (This may take more than 10 seconds.)
                                                                            |
	if (!(thread = _lookup_thread_status(message_data))) {              |
		_unlock_mutex();                                            |
                                                                            |
		if (!(ret = _do_register_device(thread_new)))   <=(This function may make the current process sleep.)
			goto out;                                           |
                                                                            |
		thread = thread_new;                                        |
		thread_new = NULL;                                          |
                                                                            |
		/* Try to create the monitoring thread for this device. */  |
		_lock_mutex();                                              |
		if ((ret = -_create_thread(thread))) {		<-----------+ 
			_unlock_mutex();
			_do_unregister_device(thread);		
			_free_thread_status(thread);		
			goto out;
		} else
			LINK_THREAD(thread);
	}

	:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
/* Wake up monitor threads every so often. */
static void *_timeout_thread(void *unused __attribute__((unused)))
{
	struct timespec timeout;
	time_t curr_time;

	timeout.tv_nsec = 0;
	pthread_cleanup_push(_exit_timeout, NULL);
	pthread_mutex_lock(&_timeout_mutex);

	while (!dm_list_empty(&_timeout_registry)) {
		struct thread_status *thread;

		timeout.tv_sec = 0;
		curr_time = time(NULL);

		dm_list_iterate_items_gen(thread, &_timeout_registry, timeout_list) {
			if (thread->next_time <= curr_time) {
				thread->next_time = curr_time + thread->timeout;
				pthread_kill(thread->thread, SIGALRM);		<= thread->thread is NULL
			}                                             (_monitor_thread has not been created yet.)

			if (thread->next_time < timeout.tv_sec || !timeout.tv_sec)
				timeout.tv_sec = thread->next_time;
		}

		pthread_cond_timedwait(&_timeout_cond, &_timeout_mutex,
				       &timeout);
	}

	pthread_cleanup_pop(1);

	return NULL;
}
====================================================================
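
(For illustration only, not a tested or proposed patch: since
thread->thread stays 0 until _create_thread() has run, one way to
close the window in principle would be to have the timeout loop skip
such entries, e.g.:

	dm_list_iterate_items_gen(thread, &_timeout_registry, timeout_list) {
		/* Skip entries whose monitoring thread does not exist yet;
		 * thread->thread is still 0 until _create_thread() runs. */
		if (thread->thread && thread->next_time <= curr_time) {
			thread->next_time = curr_time + thread->timeout;
			pthread_kill(thread->thread, SIGALRM);
		}

		if (thread->next_time < timeout.tv_sec || !timeout.tv_sec)
			timeout.tv_sec = thread->next_time;
	}

Whether that, or deferring _register_for_timeout() until after
_create_thread() has succeeded, is the right fix is of course up to
the maintainers.)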



Additional info:

====================================================================
coredump 1
====================================================================
- - -<command>- - - - - - - - - - - - - - - - - - - - - - - - - - -
# lvcreate -s -L 1M -n data01_snap  /dev/test-vg/data01

- - -<var/log/messages>- - - - - - - - - - - - - - - - - - - - - - -
	:
Feb  6 15:35:34 COS62-SKR-K038 dmeventd[3662]: dmeventd ready for processing.
Feb  6 15:35:42 COS62-SKR-K038 lvm[3662]: Monitoring snapshot test--vg-data01_snap
Feb  6 15:36:48 COS62-SKR-K038 lvm[3662]: No longer monitoring snapshot test--vg-data01_snap
Feb  6 15:37:24 COS62-SKR-K038 lvm[3662]: Monitoring snapshot test--vg-data01_snap
Feb  6 15:38:29 COS62-SKR-K038 lvm[3662]: No longer monitoring snapshot test--vg-data01_snap
Feb  6 15:39:03 COS62-SKR-K038 lvm[3662]: Monitoring snapshot test--vg-data01_snap
Feb  6 15:40:17 COS62-SKR-K038 lvm[3662]: No longer monitoring snapshot test--vg-data01_snap
Feb  6 15:40:59 COS62-SKR-K038 kernel: dmeventd[3942]: segfault at 2d0 ip 0000003aa720c610 sp 00007fe0a578bdd8 error 4 in libpthread-2.12.so[3aa7200000+17000]
Feb  6 15:41:06 COS62-SKR-K038 abrt[3951]: saved core dump of pid 3662 (/sbin/dmeventd) to /var/spool/abrt/ccpp-2014-02-06-15:40:59-3662.new/coredump (11165696 bytes)
	:

- - -<coredump>- - - - - - - - - - - - - - - - - - - - - - - - - - -

Core was generated by `/sbin/dmeventd'.
Program terminated with signal 11, Segmentation fault.
#0  __pthread_kill (threadid=0, signo=14) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43
/usr/src/debug/glibc-2.12-2-gc4ccff1/nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43:1539:beg:0x3aa720c610
Missing separate debuginfos, use: debuginfo-install libudev-147-2.51.el6.x86_64
(gdb) info threads
  2 Thread 0x7fe0a5f0f7a0 (LWP 3662)  0x0000003aa66e6d77 in socket () at ../sysdeps/unix/syscall-template.S:82
* 1 Thread 0x7fe0a578c700 (LWP 3942)  __pthread_kill (threadid=0, signo=14) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43
(gdb) bt full
#0  __pthread_kill (threadid=0, signo=14) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43
        __x = <value optimized out>
        pd = 0x0
        tid = <value optimized out>
        val = <value optimized out>
#1  0x0000000000403218 in _timeout_thread (unused=<value optimized out>) at dmeventd.c:534
        thread = <value optimized out>
        __clframe = {__cancel_routine = 0x402ab0 <_exit_timeout>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = 0}
        timeout = {tv_sec = 0, tv_nsec = 0}
        curr_time = 1391668859
#2  0x0000003aa72077f1 in start_thread (arg=0x7fe0a578c700) at pthread_create.c:301
        __res = <value optimized out>
        pd = 0x7fe0a578c700
        now = <value optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140602825557760, 3891965741624072973, 140735593897936, 140602825558464, 0, 3,
                -3908490291094611187, 3924399409031158541}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {
              prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <value optimized out>
        pagesize_m1 = <value optimized out>
        sp = <value optimized out>
        freesize = <value optimized out>
#3  0x0000003aa66e5ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
No locals.
(gdb) frame 1
#1  0x0000000000403218 in _timeout_thread (unused=<value optimized out>) at dmeventd.c:534
/usr/src/debug/LVM2.2.02.87/daemons/dmeventd/dmeventd.c:534:13457:beg:0x403218
(gdb) print thread
$1 = <value optimized out>
(gdb) print &((struct thread_status *)0)->thread
$2 = (pthread_t *) 0x10
(gdb) disassemble _timeout_thread
Dump of assembler code for function _timeout_thread:
   0x0000000000403160 <+0>:     push   %r13
	:
   0x00000000004031fa <+154>:   mov    0x60(%r12),%eax
   0x00000000004031ff <+159>:   mov    0x10(%r12),%rdi
   0x0000000000403204 <+164>:   mov    $0xe,%esi
   0x0000000000403209 <+169>:   lea    0x0(%r13,%rax,1),%rax
   0x000000000040320e <+174>:   mov    %rax,0x58(%r12)
   0x0000000000403213 <+179>:   callq  0x402740 <pthread_kill@plt>
=> 0x0000000000403218 <+184>:   mov    0x58(%r12),%rax
   0x000000000040321d <+189>:   jmp    0x4031d0 <_timeout_thread+112>
   0x000000000040321f <+191>:   nop
	:
   0x000000000040325c <+252>:   callq  0x402710 <_Unwind_Resume@plt>
End of assembler dump.
(gdb) info reg r12
r12            0x242d520        37934368
(gdb) print (struct thread_status *)0x242d520
$3 = (struct thread_status *) 0x242d520
(gdb) print *$3
$4 = {list = {n = 0x0, p = 0x0}, thread = 0, dso_data = 0x242d4d0, device = {
    uuid = 0x242d5b0 "LVM-XpiJnj24rpQoNi9vDihWFCfsCzh6JD4t0Pq57IIfrKnxacgFLryKxGLMimT3If3u",
    name = 0x1bc1510 "test--vg-data01_snap", major = 253, minor = 3}, event_nr = 0, processing = 0, status = 0, events = 196352,
  current_events = 0, current_task = 0x0, next_time = 1391668869, timeout = 10, timeout_list = {n = 0x606e90, p = 0x606e90},
  dso_private = 0x32}
(gdb) print _timeout_registry
$5 = {n = 0x242d588, p = 0x242d588}
(gdb)  print &_timeout_registry
$6 = (struct dm_list *) 0x606e90

(gdb) thread 2
[Switching to thread 2 (Thread 0x7fe0a5f0f7a0 (LWP 3662))]#0  0x0000003aa66e6d77 in socket ()
    at ../sysdeps/unix/syscall-template.S:82
/usr/src/debug/glibc-2.12-2-gc4ccff1/sysdeps/unix/syscall-template.S:82:3441:beg:0x3aa66e6d77
(gdb) bt full
#0  0x0000003aa66e6d77 in socket () at ../sysdeps/unix/syscall-template.S:82
No locals.
#1  0x0000003aa66e1b94 in openlog_internal (ident=<value optimized out>, logstat=<value optimized out>,
    logfac=<value optimized out>) at ../misc/syslog.c:355
        retry = 0
#2  0x0000003aa66e2347 in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fe0a5cac770 "Monitoring snapshot %s\n",
    ap=<value optimized out>) at ../misc/syslog.c:274
        __clframe = {__cancel_routine = 0x3aa66e1920 <cancel_handler>, __cancel_arg = 0x7fff8f14ded0, __do_it = 1,
          __cancel_type = 0}
        now_tm = {tm_sec = 58, tm_min = 40, tm_hour = 15, tm_mday = 6, tm_mon = 1, tm_year = 114, tm_wday = 4, tm_yday = 36,
          tm_isdst = 0, tm_gmtoff = 32400, tm_zone = 0x242cf10 "JST"}
        now = 1391668858
        fd = <value optimized out>
        f = <value optimized out>
        buf = 0x2340ee0 "<14>Feb  6 15:40:58 lvm[3662]: Monitoring snapshot test--vg-data01_snap\n"
        bufsize = 72
        msgoff = 20
        saved_errno = 0
        failbuf = "\000\000\000\000\000\000\000\000P\376B\002\000\000\000\000\200\376B\002\000\000\000\000`\337\024\217\377"
        clarg = {buf = 0x2340ee0, oldaction = 0x0}
#3  0x0000003aa66e2533 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>)
    at ../misc/syslog.c:131
        ap = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x7fff8f14e020, reg_save_area = 0x7fff8f14df60}}
#4  0x00007fe0a5cabe93 in syslog (device=0x1bc1510 "test--vg-data01_snap", uuid=<value optimized out>,
    major=<value optimized out>, minor=<value optimized out>, private=0x242d598) at /usr/include/bits/syslog.h:32
No locals.
#5  register_device (device=0x1bc1510 "test--vg-data01_snap", uuid=<value optimized out>, major=<value optimized out>,
    minor=<value optimized out>, private=0x242d598) at dmeventd_snapshot.c:247
        percent_check = 0x242d598
        r = 1
#6  0x0000000000403bc0 in _do_register_device (message_data=0x606f00) at dmeventd.c:689
No locals.
#7  _register_for_event (message_data=0x606f00) at dmeventd.c:1010
        ret = 0
        thread = <value optimized out>
        thread_new = 0x242d520
        dso_data = <value optimized out>
#8  0x00000000004045f1 in _handle_request (msg=0x7fff8f14e2e0) at dmeventd.c:1426
        requests = {{cmd = 2, f = 0x403720 <_register_for_event>}, {cmd = 3, f = 0x402fd0 <_unregister_for_event>}, {cmd = 4,
            f = 0x402f80 <_get_registered_device>}, {cmd = 5, f = 0x402f70 <_get_next_registered_device>}, {cmd = 6,
            f = 0x402dc0 <_set_timeout>}, {cmd = 7, f = 0x402d40 <_get_timeout>}, {cmd = 1, f = 0x402870 <_active>}, {cmd = 10,
            f = 0x402ad0 <_get_status>}}
        req = 0x606e00
#9  _do_process_request (msg=0x7fff8f14e2e0) at dmeventd.c:1457
        ret = <value optimized out>
        answer = <value optimized out>
        message_data = {id = 0x1bc1580 "3913:3", dso_name = 0x242d850 "libdevmapper-event-lvm2snapshot.so",
          device_uuid = 0x1bc1ed0 "LVM-XpiJnj24rpQoNi9vDihWFCfsCzh6JD4t0Pq57IIfrKnxacgFLryKxGLMimT3If3u", events = {
            str = 0x2ff00 <Address 0x2ff00 out of bounds>, field = 196352}, timeout = {str = 0xa <Address 0xa out of bounds>,
            secs = 10}, msg = 0x7fff8f14e2e0}
#10 0x0000000000404aec in _process_request (argc=<value optimized out>, argv=<value optimized out>) at dmeventd.c:1488
        die = 0
        msg = {cmd = 2, size = 0, data = 0x0}
#11 main (argc=<value optimized out>, argv=<value optimized out>) at dmeventd.c:1971
        opt = <value optimized out>
        fifos = {client = 5, server = 4, client_path = 0x405ac5 "/var/run/dmeventd-client",
          server_path = 0x405aac "/var/run/dmeventd-server"}
(gdb) frame 7
#7  _register_for_event (message_data=0x606f00) at dmeventd.c:1010
/usr/src/debug/LVM2.2.02.87/daemons/dmeventd/dmeventd.c:1010:24323:beg:0x403bc0
(gdb) print *thread_new
$7 = {list = {n = 0x0, p = 0x0}, thread = 0, dso_data = 0x242d4d0, device = {
    uuid = 0x242d5b0 "LVM-XpiJnj24rpQoNi9vDihWFCfsCzh6JD4t0Pq57IIfrKnxacgFLryKxGLMimT3If3u",
    name = 0x1bc1510 "test--vg-data01_snap", major = 253, minor = 3}, event_nr = 0, processing = 0, status = 0, events = 196352,
  current_events = 0, current_task = 0x0, next_time = 1391668869, timeout = 10, timeout_list = {n = 0x606e90, p = 0x606e90},
  dso_private = 0x32}

(gdb) print _thread_registry_unused
$8 = {n = 0x606df0, p = 0x606df0}
(gdb) print &_thread_registry_unused
$9 = (struct dm_list *) 0x606df0
(gdb) print _thread_registry
$10 = {n = 0x606de0, p = 0x606de0}
(gdb) print &_thread_registry
$11 = (struct dm_list *) 0x606de0


====================================================================
coredump 2
====================================================================
- - -<command>- - - - - - - - - - - - - - - - - - - - - - - - - - -
# lvcreate -s -L 1M -n data01_snap  /dev/test-vg/data01

- - -<var/log/messages>- - - - - - - - - - - - - - - - - - - - - - -
	:
Feb  7 00:37:39 COS62-SKR-K038 lvm[5281]: Monitoring snapshot test--vg-data01_snap
Feb  7 00:40:01 COS62-SKR-K038 lvm[5281]: No longer monitoring snapshot test--vg-data01_snap
Feb  7 00:40:23 COS62-SKR-K038 lvm[5281]: Monitoring snapshot test--vg-data01_snap
Feb  7 00:42:32 COS62-SKR-K038 lvm[5281]: No longer monitoring snapshot test--vg-data01_snap
Feb  7 00:42:53 COS62-SKR-K038 lvm[5281]: Monitoring snapshot test--vg-data01_snap
Feb  7 00:45:18 COS62-SKR-K038 lvm[5281]: No longer monitoring snapshot test--vg-data01_snap
Feb  7 00:45:49 COS62-SKR-K038 kernel: dmeventd[6048]: segfault at 2d0 ip 0000003aa720c610 sp 00007f2f908e7dd8 error 4 in libpthread-2.12.so[3aa7200000+17000]
Feb  7 00:46:22 COS62-SKR-K038 abrt[6049]: saved core dump of pid 5281 (/sbin/dmeventd) to /var/spool/abrt/ccpp-2014-02-07-00:45:53-5281.new/coredump (21397504 bytes)
Feb  7 00:46:23 COS62-SKR-K038 abrtd: Directory 'ccpp-2014-02-07-00:45:53-5281' creation detected
Feb  7 00:48:48 COS62-SKR-K038 dmeventd[6071]: dmeventd ready for processing.
Feb  7 00:49:08 COS62-SKR-K038 lvm[6071]: Monitoring snapshot test--vg-data01_snap
Feb  7 00:50:08 COS62-SKR-K038 lvm[6071]: No longer monitoring snapshot test--vg-data01_snap
Feb  7 00:50:34 COS62-SKR-K038 lvm[6071]: Monitoring snapshot test--vg-data01_snap
Feb  7 00:51:34 COS62-SKR-K038 lvm[6071]: No longer monitoring snapshot test--vg-data01_snap
Feb  7 00:51:57 COS62-SKR-K038 lvm[6071]: Monitoring snapshot test--vg-data01_snap
	:

- - -<coredump>- - - - - - - - - - - - - - - - - - - - - - - - - - -
Core was generated by `/sbin/dmeventd'.
Program terminated with signal 11, Segmentation fault.
#0  __pthread_kill (threadid=0, signo=14) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43
/usr/src/debug/glibc-2.12-2-gc4ccff1/nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43:1539:beg:0x3aa720c610
Missing separate debuginfos, use: debuginfo-install libudev-147-2.51.el6.x86_64
(gdb) info thread
  2 Thread 0x7f2f9127c7a0 (LWP 5281)  _touch_memory () at mm/memlock.c:119
* 1 Thread 0x7f2f908e8700 (LWP 6048)  __pthread_kill (threadid=0, signo=14) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43

(gdb) bt full
#0  __pthread_kill (threadid=0, signo=14) at ../nptl/sysdeps/unix/sysv/linux/pthread_kill.c:43
        __x = <value optimized out>
        pd = 0x0
        tid = <value optimized out>
        val = <value optimized out>
#1  0x0000000000403218 in _timeout_thread (unused=<value optimized out>) at dmeventd.c:534
        thread = <value optimized out>
        __clframe = {__cancel_routine = 0x402ab0 <_exit_timeout>, __cancel_arg = 0x0, __do_it = 1, __cancel_type = 0}
        timeout = {tv_sec = 0, tv_nsec = 0}
        curr_time = 1391701548
#2  0x0000003aa72077f1 in start_thread (arg=0x7f2f908e8700) at pthread_create.c:301
        __res = <value optimized out>
        pd = 0x7f2f908e8700
        now = <value optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139842265450240, -2678941760172108955, 140734499851136, 139842265450944, 0, 3,
                2633864390507189093, -2691126274855054491}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {
              prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <value optimized out>
        pagesize_m1 = <value optimized out>
        sp = <value optimized out>
        freesize = <value optimized out>
#3  0x0000003aa66e5ccd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
No locals.
(gdb) frame 1
#1  0x0000000000403218 in _timeout_thread (unused=<value optimized out>) at dmeventd.c:534
/usr/src/debug/LVM2.2.02.87/daemons/dmeventd/dmeventd.c:534:13457:beg:0x403218
(gdb) print thread
$1 = <value optimized out>
(gdb) print &((struct thread_status *)0)->thread
$2 = (pthread_t *) 0x10
(gdb) disassemble _timeout_thread
Dump of assembler code for function _timeout_thread:
   0x0000000000403160 <+0>:     push   %r13
	:
   0x00000000004031fa <+154>:   mov    0x60(%r12),%eax
   0x00000000004031ff <+159>:   mov    0x10(%r12),%rdi
   0x0000000000403204 <+164>:   mov    $0xe,%esi
   0x0000000000403209 <+169>:   lea    0x0(%r13,%rax,1),%rax
   0x000000000040320e <+174>:   mov    %rax,0x58(%r12)
   0x0000000000403213 <+179>:   callq  0x402740 <pthread_kill@plt>
=> 0x0000000000403218 <+184>:   mov    0x58(%r12),%rax
   0x000000000040321d <+189>:   jmp    0x4031d0 <_timeout_thread+112>
   0x000000000040321f <+191>:   nop
	:
   0x000000000040325c <+252>:   callq  0x402710 <_Unwind_Resume@plt>
End of assembler dump.
(gdb) info reg r12
r12            0x1a37f20        27492128
(gdb)  print (struct thread_status *)0x1a37f20
$3 = (struct thread_status *) 0x1a37f20
(gdb)  print _timeout_registry
$4 = {n = 0x1a37f88, p = 0x1a37f88}
(gdb) print &_timeout_registry
$5 = (struct dm_list *) 0x606e90

(gdb) thread 2
[Switching to thread 2 (Thread 0x7f2f9127c7a0 (LWP 5281))]#0  _touch_memory () at mm/memlock.c:119
/usr/src/debug/LVM2.2.02.87/lib/mm/memlock.c:119:2720:beg:0x7f2f90bb1eb3
(gdb) bt full
#0  _touch_memory () at mm/memlock.c:119
        pagesize = 4096
        pos = 0x2262c70 ""
        end = 0x2274c68 ""
#1  _allocate_memory () at mm/memlock.c:132
        stack_mem = <value optimized out>
        temp_malloc_mem = 0x1a74c70
#2  0x00007f2f90bb1fc7 in _lock_mem (cmd=0x1a3ae50) at mm/memlock.c:305
No locals.
#3  _lock_mem_if_needed (cmd=0x1a3ae50) at mm/memlock.c:382
No locals.
#4  0x00007f2f90be8946 in lvm2_run (handle=0x1a3ae50, cmdline=<value optimized out>) at lvmcmdlib.c:85
        argc = 1
        ret = 1
        oneoff = 0
        args = {0x1a72290 "_memlock_inc", 0x1a6f170 "libdevmapper-event-lvm2mirror.so", 0x1000 <Address 0x1000 out of bounds>,
          0x7f2f90bc737e "H\213\234$\220", 0x2000 <Address 0x2000 out of bounds>, 0x7f2f90bf2fc3 "mirror",
          0x2000 <Address 0x2000 out of bounds>, 0x3aa698be80 "", 0x6d766c2f6374652f <Address 0x6d766c2f6374652f out of bounds>,
          0x0, 0x0, 0x3aa66797d1 "H\205\300H\211\303t7\203=\210^1", 0x0, 0x1a390a0 "\020\213\002\251:",
          0x2000 <Address 0x2000 out of bounds>,
          0x3aa8e17c54 "H\205\300H\211\303t4H\215@\030N\215$#H\211\003L\211c\bH\213E\020H\211C\020H\211]\020H\211\330H\213l$\bH\213\034$L\213d$\020H\203\304\030\303\017\037", 0x1000 <Address 0x1000 out of bounds>, 0x8 <Address 0x8 out of bounds>,
          0x1a44e10 "libdevmapper-event-lvm2mirror.so",
          0x3aa8e17df9 "H\211\301\061\300\213T$\bH\205\311t\313H\213\061\215B\377H!\360H)\305H\215\004.H\211\001\353\257\017\037@", 0x0, 0x7f2f00000008 <Address 0x7f2f00000008 out of bounds>, 0x1a6f170 "libdevmapper-event-lvm2mirror.so",
          0x1a3a4d0 "\200\256\243\001", 0x1a3ae50 "", 0x7fff4ddf0b10 "P\256\243\001",
          0x6d766c2f6374652f <Address 0x6d766c2f6374652f out of bounds>, 0x0, 0x0, 0x1a3ae50 "", 0x1a3ae80 "\320J\247\001",
          0x7fff4ddf0b10 "P\256\243\001", 0x6d766c2f6374652f <Address 0x6d766c2f6374652f out of bounds>,
          0x7f2f90b7072f "H\205\300t\325L\213` M\205\344t\314E\213\034$L\215\264$\200", 0x7f2f90bb4710 "UH\215\065t_\004",
          0x7f2f90bb5350 "UH\215\065\351U\004", 0x7f2f90b7c8b0 "UH\215\065\340\006\a", 0x7f2f90b8c850 "UH\215\065 ;\006",
          0x7f2f90bc0de0 "H\215\065\005\273\003", 0x7f2f90bc0dd0 "H\215\065\017\273\003", 0x7f2f90bc0dc0 "H\215\065\031\273\003",
          0x7f2f90bc0db0 "H\215\065 \273\003", 0x7f2f90bc0da0 "H\215\065'\273\003", 0x7f2f90bc0d90 "H\215\065.\273\003",
          0x7f2f90bc0d80 "H\215\065\065\273\003", 0x7f2f90bc0d70 "H\215\065$\273\003", 0x7f2f90bc0d60 "H\215\065L\273\003",
          0x7f2f90bc0d50 "H\215\065S\273\003", 0x7f2f90bc0d40 "H\215\065Z\273\003", 0x0, 0x1a3ae50 "",
          0x3aa720e559 "H\211\320H\203\304\bH=\001\360\377\377s\001\303H\213\r\020\212 ", 0x0, 0x7f2f90bb0ed7 "\211?\001",
          0x1a3ae50 "", 0x0, 0x1a3ae50 "", 0x0, 0x1a3ae50 "", 0x7f2f90b71754 "H\213\203", <incomplete sequence \320>,
          0x11cc510 "test--vg-data01_snap", 0x1a37f98 "", 0x1a37f98 "", 0x0}
        argv = 0x7fff4ddf0980
        cmdcopy = 0x1a72290 "_memlock_inc"
        cmd = 0x1a3ae50
#5  0x00007f2f90e2db34 in dmeventd_lvm2_init () at dmeventd_lvm.c:117
        r = 0
#6  0x00007f2f9102fe6e in register_device (device=0x11cc510 "test--vg-data01_snap", uuid=<value optimized out>,
    major=<value optimized out>, minor=<value optimized out>, private=0x1a37f98) at dmeventd_snapshot.c:243
        percent_check = 0x1a37f98
        r = <value optimized out>
#7  0x0000000000403bc0 in _do_register_device (message_data=0x606f00) at dmeventd.c:689
No locals.
#8  _register_for_event (message_data=0x606f00) at dmeventd.c:1010
        ret = 0
        thread = <value optimized out>
        thread_new = 0x1a37f20
        dso_data = <value optimized out>
#9  0x00000000004045f1 in _handle_request (msg=0x7fff4ddf0e90) at dmeventd.c:1426
        requests = {{cmd = 2, f = 0x403720 <_register_for_event>}, {cmd = 3, f = 0x402fd0 <_unregister_for_event>}, {cmd = 4,
            f = 0x402f80 <_get_registered_device>}, {cmd = 5, f = 0x402f70 <_get_next_registered_device>}, {cmd = 6,
            f = 0x402dc0 <_set_timeout>}, {cmd = 7, f = 0x402d40 <_get_timeout>}, {cmd = 1, f = 0x402870 <_active>}, {cmd = 10,
            f = 0x402ad0 <_get_status>}}
        req = 0x606e00
#10 _do_process_request (msg=0x7fff4ddf0e90) at dmeventd.c:1457
        ret = <value optimized out>
        answer = <value optimized out>
        message_data = {id = 0x11cc580 "6008:3", dso_name = 0x11cc770 "libdevmapper-event-lvm2snapshot.so",
          device_uuid = 0x1a73c00 "LVM-XpiJnj24rpQoNi9vDihWFCfsCzh6JD4tM4HMZzzG4gill3AxggwJUROaoE2tPqLg", events = {
            str = 0x2ff00 <Address 0x2ff00 out of bounds>, field = 196352}, timeout = {str = 0xa <Address 0xa out of bounds>,
            secs = 10}, msg = 0x7fff4ddf0e90}
#11 0x0000000000404aec in _process_request (argc=<value optimized out>, argv=<value optimized out>) at dmeventd.c:1488
        die = 0
        msg = {cmd = 2, size = 0, data = 0x0}
#12 main (argc=<value optimized out>, argv=<value optimized out>) at dmeventd.c:1971
        opt = <value optimized out>
        fifos = {client = 5, server = 4, client_path = 0x405ac5 "/var/run/dmeventd-client",
          server_path = 0x405aac "/var/run/dmeventd-server"}
(gdb) fram 8
#8  _register_for_event (message_data=0x606f00) at dmeventd.c:1010
/usr/src/debug/LVM2.2.02.87/daemons/dmeventd/dmeventd.c:1010:24323:beg:0x403bc0
(gdb) print *thread_new
$6 = {list = {n = 0x0, p = 0x0}, thread = 0, dso_data = 0x1a73e20, device = {
    uuid = 0x1a37fb0 "LVM-XpiJnj24rpQoNi9vDihWFCfsCzh6JD4tM4HMZzzG4gill3AxggwJUROaoE2tPqLg",
    name = 0x11cc510 "test--vg-data01_snap", major = 253, minor = 0}, event_nr = 0, processing = 0, status = 0, events = 196352,
  current_events = 0, current_task = 0x0, next_time = 1391701558, timeout = 10, timeout_list = {n = 0x606e90, p = 0x606e90},
  dso_private = 0x0}

(gdb) print _thread_registry
$7 = {n = 0x606de0, p = 0x606de0}
(gdb) print &_thread_registry
$8 = (struct dm_list *) 0x606de0
(gdb) print _thread_registry_unused
$9 = {n = 0x606df0, p = 0x606df0}
(gdb) print &_thread_registry_unused
$10 = (struct dm_list *) 0x606df0

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Thanks,

--
Chiyohiko Morita <morita valinux co jp>





* dmeventd generates segfault when creating snapshots
From: Zdenek Kabelac @ 2014-02-11 10:33 UTC (permalink / raw)
  To: lvm-devel

On 10.2.2014 12:34, Chiyohiko Morita wrote:
> 
> (I'm not on this list; please Cc me)
> 
> Under high load, the dmeventd daemon sometimes hits a
> segmentation violation when creating an LVM snapshot.
>
> In _register_for_event(), 10 seconds or more may pass between the
> call to _register_for_timeout() and the call to _create_thread().
> (Please refer to the _register_for_event() excerpt below.)
>
> If the _timeout_thread thread calls pthread_kill() before
> _create_thread() has run, dmeventd segfaults, because
> thread->thread is still 0 (NULL).
> (Please refer to the _timeout_thread() excerpt below.)
>
> The relevant source code and core dumps are shown below.
> 

Could you please open a regular lvm2 bugzilla
(at http://bugzilla.redhat.com/ - use Fedora Rawhide in case there is
no match for your distro) and attach all the information provided
here in your email.

Also add the output of 'vgs', 'lvs -a' and 'pvs', your in-use
lvm.conf, the version of your lvm2 package, and how many volumes
are active.

Thank you

Zdenek




* dmeventd generates segfault when creating snapshots
From: Chiyohiko Morita @ 2014-02-12  8:18 UTC (permalink / raw)
  To: lvm-devel

> On 10.2.2014 12:34, Chiyohiko Morita wrote:
> > 
> > (I'm not on this list; please Cc me)
> > 
> > Under high load, the dmeventd daemon sometimes hits a
> > segmentation violation when creating an LVM snapshot.
> >
> > In _register_for_event(), 10 seconds or more may pass between the
> > call to _register_for_timeout() and the call to _create_thread().
> > (Please refer to the _register_for_event() excerpt below.)
> >
> > If the _timeout_thread thread calls pthread_kill() before
> > _create_thread() has run, dmeventd segfaults, because
> > thread->thread is still 0 (NULL).
> > (Please refer to the _timeout_thread() excerpt below.)
> >
> > The relevant source code and core dumps are shown below.
> > 
> 
> Could you please open a regular lvm2 bugzilla
> (at http://bugzilla.redhat.com/ - use Fedora Rawhide in case there is
> no match for your distro) and attach all the information provided
> here in your email.
>
> Also add the output of 'vgs', 'lvs -a' and 'pvs', your in-use
> lvm.conf, the version of your lvm2 package, and how many volumes
> are active.

I am using LVM2 version lvm2-2.02.87-6.el6.x86_64.
Please see below for the package information and the output of
'vgs', 'lvs -a' and 'pvs'.

# rpm -qa | grep kernel-2
kernel-2.6.32-220.23.1.el6.x86_64
kernel-2.6.32-220.el6.x86_64
(RHEL/CentOS 6.2)

# rpm -qa | grep lvm2
lvm2-libs-2.02.87-6.el6.x86_64
lvm2-2.02.87-6.el6.x86_64
lvm2-debuginfo-2.02.87-6.el6.x86_64

# vgs
  VG      #PV #LV #SN Attr   VSize VFree
  test-vg   2   5   2 wz--n- 7.99g 4.98g

# lvs -a
  LV          VG      Attr   LSize Origin Snap%  Move Log Copy%  Convert
  data01      test-vg owi-a- 1.00g
  data01_snap test-vg swi-a- 4.00m data01   0.00
  data02      test-vg owi-a- 1.00g
  data02_snap test-vg swi-a- 4.00m data02   0.00
  data03      test-vg -wi-a- 1.00g

# pvs
  PV         VG      Fmt  Attr PSize PFree
  /dev/vdb1  test-vg lvm2 a--  4.00g 1012.00m
  /dev/vdb2  test-vg lvm2 a--  4.00g    4.00g

# diff -upN /etc/lvm/lvm.conf.org /etc/lvm/lvm.conf
--- lvm.conf.org        2011-12-08 07:48:23.000000000 +0900
+++ /etc/lvm/lvm.conf   2014-02-12 07:35:34.000000000 +0900
@@ -63,7 +63,7 @@ devices {


     # By default we accept every block device:
-    filter = [ "a/.*/" ]
+    filter = [ "a|/dev/vdb*|", "r|.*|" ]

     # Exclude the cdrom drive
     # filter = [ "r|/dev/cdrom|" ]
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Additional info:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
# ls /etc/lvm/archive | wc -l
121510

#  ls -al /etc/lvm/archive | head
total 493076
drwxr-xr-x. 2 root root 6377472 Feb 12  2014 .
drwxr-xr-x. 9 root root    4096 Feb 12 06:28 ..
-rw-------. 1 root root    1109 Jan 29 11:59 test-vg_00000-2082297108.vg
-rw-------. 1 root root    1442 Jan 29 12:00 test-vg_00001-190015709.vg
-rw-------. 1 root root    1771 Jan 29 12:01 test-vg_00002-1842787566.vg
-rw-------. 1 root root    2372 Jan 29 12:01 test-vg_00003-1994250619.vg
-rw-------. 1 root root    1773 Jan 29 12:01 test-vg_00004-2042082134.vg
-rw-------. 1 root root    2390 Jan 29 12:01 test-vg_00005-1068639300.vg
-rw-------. 1 root root    2993 Jan 29 12:01 test-vg_00006-364657598.vg

#  ls -al /etc/lvm/archive | tail -n3
-rw-------. 1 root root    3322 Jan 31 02:14 test-vg_99997-1786673524.vg
-rw-------. 1 root root    3925 Jan 31 02:14 test-vg_99998-1284671590.vg
-rw-------. 1 root root    3306 Jan 31 02:14 test-vg_99999-1949909947.vg

# ls -alrt /etc/lvm/archive | tail -n5
-rw-------. 1 root root    2402 Feb  9 13:54 test-vg_121507-196968836.vg
-rw-------. 1 root root    3007 Feb  9 13:55 test-vg_121508-1062447752.vg
-rw-------. 1 root root    2402 Feb  9 13:55 test-vg_121509-1916506465.vg
drwxr-xr-x. 9 root root    4096 Feb 12 06:28 ..
drwxr-xr-x. 2 root root 6377472 Feb 12  2014 .


<lvcreate/lvremove command execution with no load>

# ls /etc/lvm/archive | wc -l
121510

# ls /etc/lvm/archive | head -n3
test-vg_00000-2082297108.vg
test-vg_00001-190015709.vg
test-vg_00002-1842787566.vg

# ls /etc/lvm/archive | tail -n3
test-vg_99997-1786673524.vg
test-vg_99998-1284671590.vg
test-vg_99999-1949909947.vg

# ls -rt /etc/lvm/archive | tail -n3
test-vg_121507-196968836.vg
test-vg_121508-1062447752.vg
test-vg_121509-1916506465.vg

# time lvcreate -s -L 1M -n data01_snap /dev/test-vg/data01
  Rounding up size to full physical extent 4.00 MiB
  Logical volume "data01_snap" created

real    0m11.666s
user    0m10.377s
sys     0m0.184s

--<top>---
top - 07:07:48 up 5 days,  5:02, 11 users,  load average: 0.38, 0.74, 0.74
Tasks: 155 total,   2 running, 153 sleeping,   0 stopped,   0 zombie
Cpu(s): 29.9%us,  5.3%sy,  0.0%ni, 61.3%id,  3.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3923192k total,  3795744k used,   127448k free,   475752k buffers
Swap:  4194296k total,    49272k used,  4145024k free,  2213304k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20102 root      20   0 46356  24m 1472 R 100.0  0.6   0:03.70 lvcreate
    1 root      20   0 19328  956  736 S  0.0  0.0   0:01.16 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:02.32 kthreadd
    3 root      RT   0     0    0    0 S  0.0  0.0   0:14.32 migration/0
    4 root      20   0     0    0    0 S  0.0  0.0   4:21.86 ksoftirqd/0
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
	:
---

# time lvremove -f /dev/test-vg/data01_snap
  Logical volume "data01_snap" successfully removed

real    0m11.255s
user    0m10.361s
sys     0m0.174s

--<top>---
top - 07:09:32 up 5 days,  5:03, 11 users,  load average: 0.22, 0.59, 0.69
Tasks: 161 total,   2 running, 159 sleeping,   0 stopped,   0 zombie
Cpu(s): 50.0%us,  0.2%sy,  0.0%ni, 49.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3923192k total,  3797244k used,   125948k free,   475748k buffers
Swap:  4194296k total,    49272k used,  4145024k free,  2211404k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20147 root      20   0 48124  26m 1452 R 100.0  0.7   0:03.78 lvremove
    9 root      20   0     0    0    0 S  3.3  0.0   5:50.75 ksoftirqd/1
    4 root      20   0     0    0    0 S  3.0  0.0   4:22.21 ksoftirqd/0
20146 root      20   0 15020 1208  888 R  0.3  0.0   0:00.02 top
    1 root      20   0 19328  956  736 S  0.0  0.0   0:01.16 init
	:
----------

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Thank you

--
Chiyohiko Morita <morita valinux co jp>




* dmeventd generates segfault when creating snapshots
From: Marian Csontos @ 2014-02-12  9:06 UTC (permalink / raw)
  To: lvm-devel

On 02/11/2014 11:33 AM, Zdenek Kabelac wrote:
> On 10.2.2014 12:34, Chiyohiko Morita wrote:
>>
>> (I'm not on this list; please Cc me)
>>
>> Under high load, the dmeventd daemon sometimes hits a
>> segmentation violation when creating an LVM snapshot.
>>
>> In _register_for_event(), 10 seconds or more may pass between the
>> call to _register_for_timeout() and the call to _create_thread().
>> (Please refer to the _register_for_event() excerpt below.)
>>
>> If the _timeout_thread thread calls pthread_kill() before
>> _create_thread() has run, dmeventd segfaults, because
>> thread->thread is still 0 (NULL).
>> (Please refer to the _timeout_thread() excerpt below.)
>>
>> The relevant source code and core dumps are shown below.
>>
>
> Could you please open a regular lvm2 bugzilla
> (at http://bugzilla.redhat.com/ - use Fedora Rawhide in case there is
> no match for your distro) and attach all the information provided
> here in your email.
>
> Also add the output of 'vgs', 'lvs -a' and 'pvs', your in-use
> lvm.conf, the version of your lvm2 package, and how many volumes
> are active.

IMO this is closely related to Bug 889651 - segfault in dmeventd in low 
memory situations.

https://bugzilla.redhat.com/show_bug.cgi?id=889651

-- Marian

>
> Thank you
>
> Zdenek
>
> --
> lvm-devel mailing list
> lvm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/lvm-devel
>



