[Swan-dev] fiddle_bare_shunt crasher

Fri Jun 10 14:16:53 UTC 2016

---------- Forwarded message ----------
Date: Thu, 9 Jun 2016 23:15:18
From: gulassi <notifications at github.com>
Cc: "Paul Wouters (libreswan)" <paul at cypherpunks.ca>,
     Comment <comment at noreply.github.com>
To: libreswan/libreswan <libreswan at noreply.github.com>
Subject: Re: [libreswan/libreswan] Libreswan 3.16 crash (#50)

It seems I may have hit this same issue. First there are segfaults like:

[Jun10 04:12] pluto[19727]: segfault at 28 ip 00007f5476f93a6f sp 00007ffc54ff7280 error 4 in pluto[7f5476f32000+105000]

and sometimes when these segfaults occur pluto gets stuck. While stuck, pluto uses one CPU's worth of resources. I managed to get a
core dump when a segfault happened, and gdb showed this backtrace:

(gdb) bt full
#0  0x00007f5476f93a6f in fiddle_bare_shunt (src=src@entry=0x7f5477fc5d78, dst=dst@entry=0x7f5477fc5d98,
     policy_prio=policy_prio@entry=0, cur_shunt_spi=259, new_shunt_spi=new_shunt_spi@entry=256, repl=repl@entry=0,
     transport_proto=1, why=why@entry=0x7f5477002a43 "expire_bare_shunt")
     at /usr/src/debug/libreswan-3.15/programs/pluto/kernel.c:1103
         this_client = {addr = {u = {v4 = {sin_family = 2, sin_port = 2048, sin_addr = {s_addr = 3969379675},
                 sin_zero = "\000\000\000\000\000\000\000"}, v6 = {sin6_family = 2, sin6_port = 2048,
                 sin6_flowinfo = 3969379675, sin6_addr = {__in6_u = {
                     __u6_addr8 = '\000' <repeats 12 times>, "\001\000\000", __u6_addr16 = {0, 0, 0, 0, 0, 0, 1, 0},
                     __u6_addr32 = {0, 0, 0, 1}}}, sin6_scope_id = 2013027696}}}, maskbits = 32596}
         that_client = {addr = {u = {v4 = {sin_family = 18664, sin_port = 30698, sin_addr = {s_addr = 32596},
                 sin_zero = "\350\366$wT\177\000"}, v6 = {sin6_family = 18664, sin6_port = 30698,
                 sin6_flowinfo = 32596, sin6_addr = {__in6_u = {
                     __u6_addr8 = "\350\366$wT\177\000\000\002\000\000\000\000\000\000", __u6_addr16 = {63208, 30500,
                       32596, 0, 2, 0, 0, 0}, __u6_addr32 = {1998911208, 32596, 2, 0}}}, sin6_scope_id = 0}}},
           maskbits = 0}
         null_host = <optimized out>
#1  0x00007f5476f98816 in delete_bare_shunt (why=0x7f5477002a43 "expire_bare_shunt", cur_shunt_spi=<optimized out>,
     transport_proto=<optimized out>, dst=0x7f5477fc5d98, src=0x7f5477fc5d78)
     at /usr/src/debug/libreswan-3.15/programs/pluto/kernel.c:1211
No locals.
#2  expire_bare_shunts () at /usr/src/debug/libreswan-3.15/programs/pluto/kernel.c:3333
         bsp = 0x7f5477fc5d70
         age = <optimized out>
         msg = 0x7f5477002a43 "expire_bare_shunt"
         bspp = 0x7f5477ea48e8
#3  0x00007f5476f69d85 in timer_event_cb (fd=<optimized out>, event=<optimized out>, arg=0x7f5477eec0e0)
     at /usr/src/debug/libreswan-3.15/programs/pluto/timer.c:585
         ev = 0x7f5477eec0e0
         type = EVENT_SHUNT_SCAN
         st = 0x0
         statenum = "\360>\333w\000\000\000\000\005\000\000\000T\177\000\000\024\017\323wT\177\000\000\221\214\232tT\177\000"
         last_used_age = {delta_secs = 0}
#4  0x00007f54749aaa14 in event_process_active_single_queue (activeq=0x7f5477d2f390, base=0x7f5477db3ef0)
     at event.c:1350
         ev = 0x7f5477f86870
         count = 1
#5  event_process_active (base=<optimized out>) at event.c:1420
         activeq = 0x7f5477d2f390
         i = 0
         c = 0
#6  event_base_loop (base=0x7f5477db3ef0, flags=flags@entry=0) at event.c:1621
         evsel = 0x7f5474bdfbe0 <epollops>
         tv = {tv_sec = 0, tv_usec = 263449}
         tv_p = <optimized out>
         res = <optimized out>
         done = 0
         retval = 0
         __func__ = "event_base_loop"
#7  0x00007f5476f6769b in main_loop () at /usr/src/debug/libreswan-3.15/programs/pluto/server.c:602
         r = <optimized out>
         ev_ctl = <optimized out>
         ev_sig_hup = <optimized out>
         ev_sig_term = <optimized out>
#8  call_server () at /usr/src/debug/libreswan-3.15/programs/pluto/server.c:703
No locals.
#9  0x00007f5476f4f9ea in main (argc=<optimized out>, argv=<optimized out>)
     at /usr/src/debug/libreswan-3.15/programs/pluto/plutomain.c:1596
         log_to_stderr_desired = 0
         log_to_file_desired = 0
         keep_alive = 0
         virtual_private = 0x0

Running on CentOS 7 with libreswan-3.15-5.el7_1.x86_64.
This may have been caused by a tunnel configuration with two left subnets defined, where one of them would not connect
and appeared in the bare shunts list. There has been no segfault for over an hour since I removed that subnet from the
configuration.

If there is any way I can provide more information please let me know.
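
The kernel log line quoted at the top of the report can be decoded a little further. What follows is a minimal Python sketch, not part of the original report, that parses that line and computes the offset of the faulting instruction inside the pluto mapping (instruction pointer minus mapping base). The name parse_segfault and the regular expression are illustrative assumptions; the error-code bits are interpreted as in the x86 page-fault error code (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Kernel log line copied from the report above.
LINE = ("pluto[19727]: segfault at 28 ip 00007f5476f93a6f sp 00007ffc54ff7280 "
        "error 4 in pluto[7f5476f32000+105000]")

# Hypothetical pattern for the "segfault at ..." message format shown in the report.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Decode one kernel segfault line into its numeric fields."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    err = int(m["err"])
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Offset of the faulting instruction relative to the start of the mapping.
        "offset": ip - base,
        "inside_mapping": base <= ip < base + size,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    info = parse_segfault(LINE)
    print(info["object"], hex(info["offset"]), info)
```

For the line in this report the offset comes out as 0x61a6f, and error 4 decodes to a user-mode read of a page that is not present. Combined with the low fault address 0x28, that is consistent with reading a structure field through a NULL pointer, although the log line alone does not prove it.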
