[Swan-dev] crash after pluto: Fix addresspool reference count
Antony Antony
antony at phenome.org
Fri Oct 6 17:24:39 UTC 2017
On Thu, Oct 05, 2017 at 09:57:06PM +0200, Wolfgang Nothdurft wrote:
> Am 05.10.2017 um 20:57 schrieb Antony Antony:
> > On Thu, Oct 05, 2017 at 08:36:52PM +0200, Wolfgang Nothdurft wrote:
> > > Am 05.10.2017 um 20:18 schrieb Antony Antony:
> > > > Wow, this patch looks like a heavy hammer solution. To reference count the
> > > > pool for each lease? There is something else going on. I imagine reproducing
> > > > #299 will give more info. I also wonder why there is no unreference when
> > > > the lease goes away. Did you check for memory leaks after this patch?
> > > >
> > > > Thanks for the proposed patch, it gave a bit more insight into the issue.
> > > >
> > >
> > > A memory leak is not the problem, because at the moment
> > > unreference_addresspool is called too often.
> > >
> > > My final solution at the moment is to move unreference_addresspool to the
> > > release leases function and when the non-instance connection is deleted.
> > >
> > > The question is what the refcount stands for. If it is only for installing
> > > an addresspool, it is not necessary in my opinion. But I'm not as deep in
> > > the code as the one who wrote it initially.
> >
> > An addresspool is shared between connections. Each connection adds one
> > reference count. I think a connection instance may also add a reference
> > count, I am not sure any more.
> >
> > A lease should not add a reference count to the pool. At least that is the idea.
> >
> > I will look into it soon, probably tomorrow.
> >
>
> Ah OK, then it is easy. Then the unreference call is wrong and should only
> be called when the non-instance connection is deleted.
>
> I have updated lsw#299 with the final patch.
Thanks for the new patch. I reviewed it and realized it would break when
deleting an established connection.
Here is the core dump after applying the patch:
https://bugs.libreswan.org/attachment.cgi?id=112
To reproduce, run test xauth-pluto-16. After the connection from road is
established on east (the responder), delete it:
    ipsec auto --delete modecfg-east-21
and pluto crashes. If I remember correctly, the reason is that when deleting a
connection, pluto deletes the CK_TEMPLATE first. So both the instance and the
template should hold a reference count on the pool.
(gdb) bt
#0 0x000055b6925de68c in rel_lease_addr (c=0x7fe2b3bdeb08)
    at /home/build/libreswan/programs/pluto/addresspool.c:183
#1 0x000055b6925f0c08 in delete_connection (c=0x7fe2b3bdeb08,
    relations=false)
    at /home/build/libreswan/programs/pluto/connections.c:282
#2 0x000055b6925f1471 in delete_connections_by_name (
    name=0x7fff367a1960 "modecfg-east-21", strict=true)
    at /home/build/libreswan/programs/pluto/connections.c:415
#3 0x000055b69266dbda in whack_process (whackfd=27, m=0x7fff367a1440)
    at /home/build/libreswan/programs/pluto/rcv_whack.c:392
#4 0x000055b69266eaae in whack_handle (whackctlfd=4)
    at /home/build/libreswan/programs/pluto/rcv_whack.c:779
#5 0x000055b69266e73b in whack_handle_cb (fd=4, event=2, arg=0x0)
    at /home/build/libreswan/programs/pluto/rcv_whack.c:679
#6 0x00007fe2ba8693f9 in event_persist_closure (ev=0x7fe2b3f92f70,
    base=0x7fe2b3b1fd80) at event.c:1319
#7 event_process_active_single_queue (activeq=0x7fe2b3b25ff0,
    base=0x7fe2b3b1fd80)
    at event.c:1363
#8 event_process_active (base=<optimized out>) at event.c:1438
#9 event_base_loop (base=0x7fe2b3b1fd80, flags=0) at event.c:1639
#10 0x000055b692615d17 in main_loop ()
    at /home/build/libreswan/programs/pluto/server.c:813
#11 0x000055b692616270 in call_server ()
    at /home/build/libreswan/programs/pluto/server.c:946
#12 0x000055b692612b3d in main (argc=5, argv=0x7fff367a53b8)
    at /home/build/libreswan/programs/pluto/plutomain.c:1814
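To illustrate why both the template and the instance should hold a reference,
here is a minimal sketch in simplified C. It is not pluto's code: every name in
it (struct pool, struct conn, pool_ref, pool_unref, conn_new, conn_take_lease,
conn_delete) is hypothetical, and I am assuming the crash in rel_lease_addr
comes from touching a pool that was already freed. In the sketch each
connection takes one reference, a lease takes none, and the pool is only freed
when the last connection goes away, so deleting the template first is safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified types; NOT pluto's real structures. */
struct pool {
    unsigned refcnt;  /* number of connections holding this pool */
    unsigned leases;  /* addresses currently leased out */
};

enum conn_kind { KIND_TEMPLATE, KIND_INSTANCE };

struct conn {
    enum conn_kind kind;
    struct pool *pool;
    bool has_lease;
};

struct pool *pool_new(void)
{
    struct pool *p = calloc(1, sizeof(*p));
    assert(p != NULL);
    return p;  /* refcnt starts at 0; the first connection makes it 1 */
}

/* Take one reference on the pool. */
struct pool *pool_ref(struct pool *p)
{
    p->refcnt++;
    return p;
}

/* Drop one reference; free the pool on the last one.
 * Returns true if the pool was freed. */
bool pool_unref(struct pool *p)
{
    assert(p->refcnt > 0);
    if (--p->refcnt > 0)
        return false;
    free(p);
    return true;
}

/* Every connection, template or instance, takes its own reference. */
struct conn *conn_new(enum conn_kind kind, struct pool *p)
{
    struct conn *c = calloc(1, sizeof(*c));
    assert(c != NULL);
    c->kind = kind;
    c->pool = pool_ref(p);
    return c;
}

/* A lease uses the pool but does not change refcnt. */
void conn_take_lease(struct conn *c)
{
    c->pool->leases++;
    c->has_lease = true;
}

/* Release the lease first (this touches the pool), then drop the
 * connection's reference. Returns true if the pool was freed. */
bool conn_delete(struct conn *c)
{
    if (c->has_lease)
        c->pool->leases--;
    bool freed = pool_unref(c->pool);
    free(c);
    return freed;
}

/* Delete the template before the instance, as described above.
 * Returns 1 if the pool survived the template and was freed with
 * the instance. */
int demo_template_deleted_first(void)
{
    struct pool *p = pool_new();
    struct conn *tmpl = conn_new(KIND_TEMPLATE, p);
    struct conn *inst = conn_new(KIND_INSTANCE, p);
    conn_take_lease(inst);

    bool freed_early = conn_delete(tmpl);  /* instance still holds a ref */
    bool freed_late = conn_delete(inst);   /* last ref: pool is freed */
    return !freed_early && freed_late;
}
```

If only the template held a reference, conn_delete(tmpl) would free the pool
and the later lease release in conn_delete(inst) would touch freed memory.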
Maybe you need to consider shared address pools too; I am not sure.
-antony