[Swan-dev] crash after pluto: Fix addresspool reference count

Antony Antony antony at phenome.org
Fri Oct 6 17:24:39 UTC 2017


On Thu, Oct 05, 2017 at 09:57:06PM +0200, Wolfgang Nothdurft wrote:
> Am 05.10.2017 um 20:57 schrieb Antony Antony:
> > On Thu, Oct 05, 2017 at 08:36:52PM +0200, Wolfgang Nothdurft wrote:
> > > Am 05.10.2017 um 20:18 schrieb Antony Antony:
> > > > Wow, this patch looks like a heavy-hammer solution. Reference counting the
> > > > pool for each lease? There is something else going on. I imagine reproducing
> > > > #299 will give more info. I also wonder why there is no unreference when the
> > > > lease goes away. Did you check for memory leaks after this patch?
> > > > 
> > > > Thanks for the proposed patch, it gave a bit more insight into the issue.
> > > > 
> > > 
> > > A memory leak is not the problem, because at the moment
> > > unreference_addresspool is called too often.
> > > 
> > > My final solution at the moment is to call unreference_addresspool from the
> > > release-leases function and when the non-instance connection is deleted.
> > > 
> > > The question is what the refcount stands for. If it only tracks installing
> > > an addresspool, it is not necessary in my opinion. But I'm not as deep in
> > > the code as whoever wrote it initially.
> > 
> > An addresspool is shared between connections. Each connection adds one
> > reference count. I think a connection instance may also add a reference
> > count; I am not sure any more.
> > 
> > A lease should not add a reference count to the pool. At least that is the idea.
> > 
> > I will look into it soon, probably tomorrow.
> > 
> 
> Ah OK, then it is easy. Then the unreference call is wrong and should only
> be called when the non-instance connection is deleted.
> 
> I have updated lsw#299 with the final patch.

Thanks for the new patch. I reviewed it and realized this would break when 
deleting an established connection.

Here is the core dump after applying the patch,
https://bugs.libreswan.org/attachment.cgi?id=112

To reproduce, run test xauth-pluto-16. After the connection from road is
established on east (the responder), delete it:
ipsec auto --delete modecfg-east-21

and pluto crashes. If I remember correctly, the reason is that when deleting a
connection pluto deletes the CK_TEMPLATE first. So both the instance and the
template should hold a refcount.
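
To make the refcount argument concrete, here is a minimal sketch of the idea. It is not the pluto code: struct pool, struct conn, conn_attach_pool, conn_detach_pool and conn_release_lease are names made up for illustration, and the real addresspool.c may well be organised differently. It only shows why the pool survives the template being deleted first when the instance holds its own reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types; NOT the real pluto structures. */
struct pool {
    unsigned refcnt;  /* number of connections holding this pool */
    unsigned leases;  /* active leases; these do not touch refcnt */
};

struct conn {
    struct pool *pool;  /* NULL when the connection holds no pool */
};

static unsigned pools_freed;  /* demonstration counter only */

static struct pool *pool_new(void)
{
    return calloc(1, sizeof(struct pool));
}

/* Take one reference on behalf of a connection (template or instance). */
static void conn_attach_pool(struct conn *c, struct pool *p)
{
    p->refcnt++;
    c->pool = p;
}

/* Drop this connection's reference; free the pool with the last holder. */
static void conn_detach_pool(struct conn *c)
{
    struct pool *p = c->pool;

    if (p == NULL)
        return;
    assert(p->refcnt > 0);
    c->pool = NULL;
    if (--p->refcnt == 0) {
        free(p);
        pools_freed++;
    }
}

/* Releasing a lease dereferences the pool, so the caller must still hold it. */
static void conn_release_lease(struct conn *c)
{
    assert(c->pool != NULL && c->pool->refcnt > 0);
    if (c->pool->leases > 0)
        c->pool->leases--;
}
```

With both a template and an instance attached, detaching the template first leaves refcnt at 1, so the instance can still release its lease; the pool is freed only when the instance detaches as well.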

(gdb) bt
#0  0x000055b6925de68c in rel_lease_addr (c=0x7fe2b3bdeb08)
    at /home/build/libreswan/programs/pluto/addresspool.c:183
#1  0x000055b6925f0c08 in delete_connection (c=0x7fe2b3bdeb08, relations=false)
    at /home/build/libreswan/programs/pluto/connections.c:282
#2  0x000055b6925f1471 in delete_connections_by_name (
    name=0x7fff367a1960 "modecfg-east-21", strict=true)
    at /home/build/libreswan/programs/pluto/connections.c:415
#3  0x000055b69266dbda in whack_process (whackfd=27, m=0x7fff367a1440)
    at /home/build/libreswan/programs/pluto/rcv_whack.c:392
#4  0x000055b69266eaae in whack_handle (whackctlfd=4)
    at /home/build/libreswan/programs/pluto/rcv_whack.c:779
#5  0x000055b69266e73b in whack_handle_cb (fd=4, event=2, arg=0x0)
    at /home/build/libreswan/programs/pluto/rcv_whack.c:679
#6  0x00007fe2ba8693f9 in event_persist_closure (ev=0x7fe2b3f92f70,
    base=0x7fe2b3b1fd80) at event.c:1319
#7  event_process_active_single_queue (activeq=0x7fe2b3b25ff0, base=0x7fe2b3b1fd80)
    at event.c:1363
#8  event_process_active (base=<optimized out>) at event.c:1438
#9  event_base_loop (base=0x7fe2b3b1fd80, flags=0) at event.c:1639
#10 0x000055b692615d17 in main_loop ()
    at /home/build/libreswan/programs/pluto/server.c:813
#11 0x000055b692616270 in call_server ()
    at /home/build/libreswan/programs/pluto/server.c:946
#12 0x000055b692612b3d in main (argc=5, argv=0x7fff367a53b8)
    at /home/build/libreswan/programs/pluto/plutomain.c:1814

Maybe you also need the address pool to be shared between connections; I am not sure.

-antony

