[Swan-dev] crash after pluto: Fix addresspool reference count
wolfgang at linogate.de
Fri Oct 6 19:29:33 UTC 2017
On Fri, 6 Oct 2017 19:24:39 +0200, Antony Antony wrote
> On Thu, Oct 05, 2017 at 09:57:06PM +0200, Wolfgang Nothdurft wrote:
> > Am 05.10.2017 um 20:57 schrieb Antony Antony:
> > > On Thu, Oct 05, 2017 at 08:36:52PM +0200, Wolfgang Nothdurft wrote:
> > > > Am 05.10.2017 um 20:18 schrieb Antony Antony:
> > > > > Wow, this patch looks like a heavy hammer solution. To reference count the
> > > > > pool for each lease? There is something else going on. I imagine reproducing
> > > > > #299 will give more info. I also wonder why there is no unreference when the
> > > > > lease goes away. Did you check for memory leaks after this patch?
> > > > > >
> > > > > Thanks for the proposed patch, it gave a bit more insight into the issue.
> > > > >
> > > >
> > > > A memory leak is not the problem, because at the moment
> > > > unreference_addresspool is called too often.
> > > >
> > > > My final solution at the moment is to call unreference_addresspool from the
> > > > release leases function and when the non-instance connection is deleted.
> > > >
> > > > The question is what the refcount stands for. If it is only for installing
> > > > an addresspool, it is not necessary in my opinion. But I'm not as deep in the
> > > > code as the one who wrote it initially.
> > >
> > > An addresspool is shared between connections. Each connection adds one
> > > reference count. I think a connection instance may also add a reference
> > > count, I am not sure any more.
> > >
> > > A lease should not add a reference count to the pool. At least that is the idea.
> > >
> > > I will look into it soon, probably tomorrow.
> > >
> >
> > ah ok, then it is easy. Then the unreference call is wrong and should only
> > be called when the non-instance connection is deleted.
> >
> > I have updated lsw#299 with the final patch.
>
> Thanks for the new patch. I reviewed it and realized this would
> break when deleting an established connection.
>
> Here is the core dump after applying the patch,
> https://bugs.libreswan.org/attachment.cgi?id=112
>
> To reproduce, run test xauth-pluto-16. After the connection from road is
> established on east, the responder, delete it:
> ipsec auto --delete modecfg-east-21
>
> and pluto crashes. If I remember correctly, the reason is that when
> deleting a connection pluto deletes the CK_TEMPLATE first. So both
> instance and template should refcount.
>
> (gdb) bt
> #0 0x000055b6925de68c in rel_lease_addr (c=0x7fe2b3bdeb08)
> at /home/build/libreswan/programs/pluto/addresspool.c:183
> #1 0x000055b6925f0c08 in delete_connection (c=0x7fe2b3bdeb08,
> relations=false)
> at /home/build/libreswan/programs/pluto/connections.c:282
> #2 0x000055b6925f1471 in delete_connections_by_name (
> name=0x7fff367a1960 "modecfg-east-21", strict=true)
> at /home/build/libreswan/programs/pluto/connections.c:415
> #3 0x000055b69266dbda in whack_process (whackfd=27, m=0x7fff367a1440)
> at /home/build/libreswan/programs/pluto/rcv_whack.c:392
> #4 0x000055b69266eaae in whack_handle (whackctlfd=4)
> at /home/build/libreswan/programs/pluto/rcv_whack.c:779
> #5 0x000055b69266e73b in whack_handle_cb (fd=4, event=2, arg=0x0)
> at /home/build/libreswan/programs/pluto/rcv_whack.c:679
> #6 0x00007fe2ba8693f9 in event_persist_closure (ev=0x7fe2b3f92f70,
> base=0x7fe2b3b1fd80) at event.c:1319
> #7 event_process_active_single_queue (activeq=0x7fe2b3b25ff0,
> base=0x7fe2b3b1fd80)
> at event.c:1363
> #8 event_process_active (base=<optimized out>) at event.c:1438
> #9 event_base_loop (base=0x7fe2b3b1fd80, flags=0) at event.c:1639
> #10 0x000055b692615d17 in main_loop ()
> at /home/build/libreswan/programs/pluto/server.c:813
> #11 0x000055b692616270 in call_server ()
> at /home/build/libreswan/programs/pluto/server.c:946
> #12 0x000055b692612b3d in main (argc=5, argv=0x7fff367a53b8)
> at /home/build/libreswan/programs/pluto/plutomain.c:1814
>
> Maybe you need sharing address pools too, I am not sure.
Sorry, I missed that the initial problem was triggered by a static IP
configured in /etc/ipsec.d/passwd.
I have added a patch for the xauth-pluto-22 test so you can reproduce lsw#299
with v3.21; it also triggers the rel_lease_addr crash with my current patch.
The actual problem seems to occur when installing a new addresspool from
ikev1_xauth.c.
I wrote this code originally, and I think when I implemented it I overlooked
that the pool is shared and not copied for the instance.
I can look into reworking it next week.
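
For illustration only, here is a minimal sketch of the ownership model
discussed above: the pool is shared by pointer, the template and each
instance hold one reference, and leases do not touch the reference count.
All names (struct pool, struct conn, conn_attach_pool, conn_detach_pool,
lease_take, lease_release, demo_template_deleted_first) are hypothetical
and are not the real pluto structures or functions; this is an assumption
about the intended design, not the actual fix.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the real pluto data structures. */
struct pool {
	unsigned refcnt;	/* one per connection (template or instance) */
	unsigned leases;	/* leases in use; these never change refcnt */
};

struct conn {
	struct pool *pool;	/* shared pointer, not a private copy */
};

static int pools_freed;		/* counts frees so the demo can check them */

static struct pool *pool_new(void)
{
	struct pool *p = calloc(1, sizeof(*p));

	if (p == NULL)
		abort();
	return p;
}

/* A connection starts using a pool: take exactly one reference. */
static void conn_attach_pool(struct conn *c, struct pool *p)
{
	p->refcnt++;
	c->pool = p;
}

/* A connection is deleted: drop its reference, free on the last one. */
static void conn_detach_pool(struct conn *c)
{
	struct pool *p = c->pool;

	if (p == NULL)
		return;
	c->pool = NULL;
	assert(p->refcnt > 0);
	if (--p->refcnt == 0) {
		free(p);
		pools_freed++;
	}
}

/* Leases only touch the lease counter, never the reference count. */
static void lease_take(struct conn *c)
{
	c->pool->leases++;
}

static void lease_release(struct conn *c)
{
	if (c->pool != NULL && c->pool->leases > 0)
		c->pool->leases--;
}

/* Delete the template before the instance; returns 1 if the pool
 * survived until the instance released its lease and was freed once. */
static int demo_template_deleted_first(void)
{
	struct conn tmpl = { NULL }, inst = { NULL };
	struct pool *p = pool_new();
	int freed_before = pools_freed;
	int alive;

	conn_attach_pool(&tmpl, p);
	conn_attach_pool(&inst, p);
	lease_take(&inst);

	conn_detach_pool(&tmpl);	/* template goes away first */
	alive = pools_freed == freed_before && inst.pool->refcnt == 1;

	lease_release(&inst);		/* safe: pool is still allocated */
	conn_detach_pool(&inst);	/* last reference, pool is freed */
	return alive && pools_freed == freed_before + 1;
}
```

In this sketch, calling demo_template_deleted_first() from a small main()
exercises the delete order Antony described (template first, then
instance). If only the template held a reference, the lease release on the
instance would touch freed memory.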
-------------- next part --------------
A non-text attachment was scrubbed...
Name: xauth-pluto-22-lsw299-crash
Type: application/octet-stream
Size: 2004 bytes
Desc: not available
URL: <https://lists.libreswan.org/pipermail/swan-dev/attachments/20171006/d2338e89/attachment.obj>