[Swan-dev] Pluto memory consumption

Erik Andersson erik at ingate.com
Fri Mar 17 12:52:50 UTC 2017


Hi again,

Andrew fixed the "biggest" memory consumption issue for a few weeks ago.

Now I've tested with the same configuration but enabled x509 certificate 
authentication.

I've run pluto via valgrind for a few days. Pluto was build from the 
master branch (last commit 72f2f4fac52d4fe661c696e965cca756d3ce78a9).

In this mail I've highlighted the "definitely lost" entries from the 
valgrind log.

I've tried to fix a couple of them, but with mixed result. Not sure if I 
can fix the more intricate ones based on my current knowledge of pluto.

The full valgrind log can be found via the following link:

https://s3-eu-west-1.amazonaws.com/ingate-dump/pluto_valgrind2.txt

1)

==2991== 17 bytes in 1 blocks are definitely lost in loss record 59 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x1BB9C4: lsw_nss_setup (lswnss.c:50)
==2991==    by 0x126169: pluto_init_nss (plutomain.c:412)
==2991==    by 0x126169: main (plutomain.c:1488)

lswnss.c:50

nssdir = alloc_bytes(strlen(configdir) + strlen(sql) + 1, "(ignore) 
nssdir");

nssdir is only free'd in the error path.

2)

==2991== 32 bytes in 1 blocks are definitely lost in loss record 133 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x192E2E: chunk_from_symkey (crypt_symkey.c:388)
==2991==    by 0x19540E: calc_skeyseed_v2 (ikev2_prf.c:187)
==2991==    by 0x19540E: calc_dh_v2 (ikev2_prf.c:260)
==2991==    by 0x1A1B47: pluto_do_crypto_op (pluto_crypt.c:270)
==2991==    by 0x1A2644: pluto_crypto_helper (pluto_crypt.c:365)
==2991==    by 0x1A28C0: pluto_helper_thread (pluto_crypt.c:1011)
==2991==    by 0x5A4BDF4: start_thread (in /usr/lib64/libpthread-2.17.so)
==2991==    by 0x685A1AC: clone (in /usr/lib64/libc-2.17.so)

crypt_symkey.c:388

void *bytes = alloc_bytes(wrapped_key.len, name);

The allocated bytes that is returned via the chunk_t is not free'd.

3)

==2991== 32 bytes in 1 blocks are definitely lost in loss record 134 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x192E2E: chunk_from_symkey (crypt_symkey.c:388)
==2991==    by 0x195443: calc_skeyseed_v2 (ikev2_prf.c:192)
==2991==    by 0x195443: calc_dh_v2 (ikev2_prf.c:260)
==2991==    by 0x1A1B47: pluto_do_crypto_op (pluto_crypt.c:270)
==2991==    by 0x1A2644: pluto_crypto_helper (pluto_crypt.c:365)
==2991==    by 0x1A28C0: pluto_helper_thread (pluto_crypt.c:1011)
==2991==    by 0x5A4BDF4: start_thread (in /usr/lib64/libpthread-2.17.so)
==2991==    by 0x685A1AC: clone (in /usr/lib64/libc-2.17.so)

crypt_symkey.c:388

void *bytes = alloc_bytes(wrapped_key.len, name);

The allocated bytes that is returned via the chunk_t is not free'd.

4)

==2991== 256 bytes in 2 blocks are definitely lost in loss record 566 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x8E7827F: PORT_Alloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x552DEC2: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x55303C8: PK11_DeriveWithTemplate (in /usr/lib64/libnss3.so)
==2991==    by 0x55307B1: PK11_DeriveWithFlags (in /usr/lib64/libnss3.so)
==2991==    by 0x192C2E: symkey_from_symkey (crypt_symkey.c:302)
==2991==    by 0x19542C: calc_skeyseed_v2 (ikev2_prf.c:190)
==2991==    by 0x19542C: calc_dh_v2 (ikev2_prf.c:260)
==2991==    by 0x1A1B47: pluto_do_crypto_op (pluto_crypt.c:270)
==2991==    by 0x1A2644: pluto_crypto_helper (pluto_crypt.c:365)
==2991==    by 0x1A28C0: pluto_helper_thread (pluto_crypt.c:1011)
==2991==    by 0x5A4BDF4: start_thread (in /usr/lib64/libpthread-2.17.so)
==2991==    by 0x685A1AC: clone (in /usr/lib64/libc-2.17.so)

crypt_symkey.c:302

PK11SymKey *result = PK11_DeriveWithFlags(base_key, derive, &param,
                                                   target, operation,
                                                   key_size, flags);
The returned result isn't free'd

5)

==2991== 288 (256 direct, 32 indirect) bytes in 2 blocks are definitely 
lost in loss record 572 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x8E7827F: PORT_Alloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x552DEC2: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x55303C8: PK11_DeriveWithTemplate (in /usr/lib64/libnss3.so)
==2991==    by 0x5530720: PK11_Derive (in /usr/lib64/libnss3.so)
==2991==    by 0x192A6D: merge_symkey_bytes (crypt_symkey.c:220)
==2991==    by 0x1938FE: symkey_from_bytes (crypt_symkey.c:420)
==2991==    by 0x13D204: final_symkey (ike_alg_nss_prf_ops.c:181)
==2991==    by 0x193E9E: crypt_prf_final_symkey (crypt_prf.c:128)
==2991==    by 0x194C3B: ikev2_prfplus (ikev2_prf.c:292)
==2991==    by 0x195965: ikev2_child_sa_keymat (ikev2_prf.c:390)
==2991==    by 0x192462: ikev2_derive_child_keys (ikev2_crypto.c:147)

crypt_symkey.c:220

PK11SymKey *result = PK11_Derive(base_key, derive, &data_param, target,
                                          operation, key_size);

It seems that the result is free'd in ikev2_crypto.c:160

release_symkey(__func__, "keymat", &keymat);

I guess there are still references to the PK11SymKey keymat. Not sure.

6)

==2991== 800 (512 direct, 288 indirect) bytes in 4 blocks are definitely 
lost in loss record 638 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x8E7827F: PORT_Alloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x552DEC2: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x55303C8: PK11_DeriveWithTemplate (in /usr/lib64/libnss3.so)
==2991==    by 0x5530720: PK11_Derive (in /usr/lib64/libnss3.so)
==2991==    by 0x192A6D: merge_symkey_bytes (crypt_symkey.c:220)
==2991==    by 0x1938FE: symkey_from_bytes (crypt_symkey.c:420)
==2991==    by 0x13D204: final_symkey (ike_alg_nss_prf_ops.c:181)
==2991==    by 0x193E9E: crypt_prf_final_symkey (crypt_prf.c:128)
==2991==    by 0x194CE3: ikev2_prfplus (ikev2_prf.c:304)
==2991==    by 0x194F8B: ikev2_ike_sa_keymat (ikev2_prf.c:366)
==2991==    by 0x195272: calc_skeyseed_v2 (ikev2_prf.c:142)
==2991==    by 0x195272: calc_dh_v2 (ikev2_prf.c:260)

Similar to 2, 3 and 4?

7)

==2991== 1,636 (56 direct, 1,580 indirect) bytes in 1 blocks are 
definitely lost in loss record 692 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x17CEC4: ikev2_collect_fragment (ikev2.c:761)
==2991==    by 0x17F8F0: process_v2_packet (ikev2.c:1252)
==2991==    by 0x19FD77: comm_handle (demux.c:234)
==2991==    by 0x19FD77: comm_handle_cb (demux.c:193)
==2991==    by 0x6329A13: event_base_loop (in 
/usr/lib64/libevent-2.0.so.5.1.9)
==2991==    by 0x165568: main_loop (server.c:663)
==2991==    by 0x165568: call_server (server.c:798)
==2991==    by 0x126499: main (plutomain.c:1714)

ikev2.c:761

frag = alloc_thing(struct ikev2_frag, "ikev2_frag");

Should frag be free'd in ikev2_collect_fragment?

8)

==2991== 2,288 (2,048 direct, 240 indirect) bytes in 1 blocks are 
definitely lost in loss record 726 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x908F3AE: PL_ArenaAllocate (in /usr/lib64/libplds4.so)
==2991==    by 0x8E7851F: PORT_ArenaAlloc_Util (in 
/usr/lib64/libnssutil3.so)
==2991==    by 0x8E76BC7: SECITEM_CopyItem_Util (in 
/usr/lib64/libnssutil3.so)
==2991==    by 0x55AC4B3: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x5591BD7: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x555373F: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x557E716: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x557EE3A: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x5581666: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x5506DA1: CERT_PKIXVerifyCert (in /usr/lib64/libnss3.so)
==2991==    by 0x1AC665: vfy_chain_pkix (nss_cert_vfy.c:383)
==2991==    by 0x1AC665: verify_and_cache_chain (nss_cert_vfy.c:490)

Not sure what's going on here.

9)

==2991== 4,152 bytes in 1 blocks are definitely lost in loss record 751 
of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x19FEB8: alloc_md (msgdigest.c:86)
==2991==    by 0x1802D4: ikev2_crypto_start (ikev2_parent.c:318)
==2991==    by 0x18343F: ikev2parent_outI1 (ikev2_parent.c:619)
==2991==    by 0x15548B: initiate_a_connection (initiate.c:331)
==2991==    by 0x1566D2: initiate_connection (initiate.c:357)
==2991==    by 0x19DA5E: whack_process (rcv_whack.c:521)
==2991==    by 0x19E634: whack_handle (rcv_whack.c:738)
==2991==    by 0x6329A13: event_base_loop (in 
/usr/lib64/libevent-2.0.so.5.1.9)
==2991==    by 0x165568: main_loop (server.c:663)
==2991==    by 0x165568: call_server (server.c:798)

ikev2_parent.c:318

fake_md = alloc_md();

The fake_md isn't free'd.

Not sure if this is related to commit:

123f7a32699dd3838d296083610418d84b9064ef

10)

==2991== 9,248 (8,304 direct, 944 indirect) bytes in 2 blocks are 
definitely lost in loss record 771 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x19FEB8: alloc_md (msgdigest.c:86)
==2991==    by 0x1802D4: ikev2_crypto_start (ikev2_parent.c:318)
==2991==    by 0x18343F: ikev2parent_outI1 (ikev2_parent.c:619)
==2991==    by 0x168179: timer_event_cb (timer.c:750)
==2991==    by 0x6329A13: event_base_loop (in 
/usr/lib64/libevent-2.0.so.5.1.9)
==2991==    by 0x165568: main_loop (server.c:663)
==2991==    by 0x165568: call_server (server.c:798)
==2991==    by 0x126499: main (plutomain.c:1714)

Same as 9?

11)

==2991== 11,440 (360 direct, 11,080 indirect) bytes in 5 blocks are 
definitely lost in loss record 778 of 805
==2991==    at 0x4C2B975: calloc (vg_replace_malloc.c:711)
==2991==    by 0x8E78351: PORT_ZAlloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x8E783AB: PORT_NewArena_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x550B7B9: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x550E4C5: PK11_GenerateKeyPairWithOpFlags (in 
/usr/lib64/libnss3.so)
==2991==    by 0x550E8C8: PK11_GenerateKeyPairWithFlags (in 
/usr/lib64/libnss3.so)
==2991==    by 0x55083FD: SECKEY_CreateDHPrivateKey (in 
/usr/lib64/libnss3.so)
==2991==    by 0x13B73C: nss_modp_calc_ke (ike_alg_nss_modp.c:71)
==2991==    by 0x1A3894: calc_ke (crypt_ke.c:68)
==2991==    by 0x1A1BCF: pluto_do_crypto_op (pluto_crypt.c:255)
==2991==    by 0x1A2644: pluto_crypto_helper (pluto_crypt.c:365)
==2991==    by 0x1A28C0: pluto_helper_thread (pluto_crypt.c:1011)

ike_alg_nss_modp.c:71

*privk = SECKEY_CreateDHPrivateKey(&dh_params,pubk, 
lsw_return_nss_password_file_info());

privk doesn't seem to get free'd.

12)

==2991== 13,728 (432 direct, 13,296 indirect) bytes in 6 blocks are 
definitely lost in loss record 784 of 805
==2991==    at 0x4C2B975: calloc (vg_replace_malloc.c:711)
==2991==    by 0x8E78351: PORT_ZAlloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x8E783AB: PORT_NewArena_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x550C07C: ??? (in /usr/lib64/libnss3.so)
==2991==    by 0x550E595: PK11_GenerateKeyPairWithOpFlags (in 
/usr/lib64/libnss3.so)
==2991==    by 0x550E8C8: PK11_GenerateKeyPairWithFlags (in 
/usr/lib64/libnss3.so)
==2991==    by 0x55083FD: SECKEY_CreateDHPrivateKey (in 
/usr/lib64/libnss3.so)
==2991==    by 0x13B73C: nss_modp_calc_ke (ike_alg_nss_modp.c:71)
==2991==    by 0x1A3894: calc_ke (crypt_ke.c:68)
==2991==    by 0x1A1BCF: pluto_do_crypto_op (pluto_crypt.c:255)
==2991==    by 0x1A2644: pluto_crypto_helper (pluto_crypt.c:365)
==2991==    by 0x1A28C0: pluto_helper_thread (pluto_crypt.c:1011)

Same as 11?

13)

==2991== 18,868 (16,608 direct, 2,260 indirect) bytes in 4 blocks are 
definitely lost in loss record 791 of 805
==2991==    at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
==2991==    by 0x1B1697: alloc_bytes_raw (alloc.c:95)
==2991==    by 0x1B1B0D: alloc_bytes (alloc.c:201)
==2991==    by 0x19FEB8: alloc_md (msgdigest.c:86)
==2991==    by 0x19FCB4: comm_handle (demux.c:230)
==2991==    by 0x19FCB4: comm_handle_cb (demux.c:193)
==2991==    by 0x6329A13: event_base_loop (in 
/usr/lib64/libevent-2.0.so.5.1.9)
==2991==    by 0x165568: main_loop (server.c:663)
==2991==    by 0x165568: call_server (server.c:798)
==2991==    by 0x126499: main (plutomain.c:1714)

demux.c:230

         md = alloc_md();
         md->iface = ifp;

         if (read_packet(md))
                 process_packet(&md);

         release_any_md(&md);

md seems to get free'd. Not sure what's going on here.

14)

==2991== 42,656 bytes in 5,332 blocks are definitely lost in loss record 
795 of 805
==2991==    at 0x4C2B975: calloc (vg_replace_malloc.c:711)
==2991==    by 0x8E78351: PORT_ZAlloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x553C911: CERT_ImportCerts (in /usr/lib64/libnss3.so)
==2991==    by 0x1AC090: crt_tmp_import (nss_cert_vfy.c:194)
==2991==    by 0x1AC090: verify_and_cache_chain (nss_cert_vfy.c:474)
==2991==    by 0x130526: pluto_process_certs.isra.3 (x509.c:766)
==2991==    by 0x130BD4: ikev2_decode_cert (x509.c:877)
==2991==    by 0x17D662: ikev2_decode_peer_id_and_certs (ikev2.c:1300)
==2991==    by 0x184E6F: ikev2_parent_inI2outR2_tail (ikev2_parent.c:3438)
==2991==    by 0x184E6F: ikev2_parent_inI2outR2_continue 
(ikev2_parent.c:3381)
==2991==    by 0x1A239A: handle_helper_answer (pluto_crypt.c:910)
==2991==    by 0x6329A13: event_base_loop (in 
/usr/lib64/libevent-2.0.so.5.1.9)
==2991==    by 0x165568: main_loop (server.c:663)
==2991==    by 0x165568: call_server (server.c:798)
==2991==    by 0x126499: main (plutomain.c:1714)

See 8.

15)

==2991== 65,088 bytes in 8,136 blocks are definitely lost in loss record 
802 of 805
==2991==    at 0x4C2B975: calloc (vg_replace_malloc.c:711)
==2991==    by 0x8E78351: PORT_ZAlloc_Util (in /usr/lib64/libnssutil3.so)
==2991==    by 0x553C911: CERT_ImportCerts (in /usr/lib64/libnss3.so)
==2991==    by 0x1AC090: crt_tmp_import (nss_cert_vfy.c:194)
==2991==    by 0x1AC090: verify_and_cache_chain (nss_cert_vfy.c:474)
==2991==    by 0x130526: pluto_process_certs.isra.3 (x509.c:766)
==2991==    by 0x130BD4: ikev2_decode_cert (x509.c:877)
==2991==    by 0x17D662: ikev2_decode_peer_id_and_certs (ikev2.c:1300)
==2991==    by 0x185800: ikev2parent_inR2 (ikev2_parent.c:4081)
==2991==    by 0x17F93B: process_v2_packet (ikev2.c:1268)
==2991==    by 0x19FD77: comm_handle (demux.c:234)
==2991==    by 0x19FD77: comm_handle_cb (demux.c:193)
==2991==    by 0x6329A13: event_base_loop (in 
/usr/lib64/libevent-2.0.so.5.1.9)
==2991==    by 0x165568: main_loop (server.c:663)
==2991==    by 0x165568: call_server (server.c:798)

Same as 14?

Thanks and regards,

Erik

On 2017-03-03 03:12, Erik Andersson wrote:
> Thanks Andrew!
>
> I've re-built with your recent commits and it seems that your efforts
> have solved the issue.
>
> I've run for a few hours and the memory consumption doesn't rise :)
>
> Regards,
>
> Erik
>
> On 2017-03-02 22:30, Andrew Cagney wrote:
>>> I didn't do any archaeology
>>
>> I think two things were at play.
>>
>> The first change:
>>
>> commit 02768e7cd76a465443f13f3cbf6864db268f0b52
>> Date:   Mon Nov 12 21:18:54 2007 -0500
>>
>>     #845 - beginning of DH calculation and SKEYSEED derivation.
>>
>> would copy skeyseed into a wirechunk:
>>
>>     setwirechunk_fromchunk(skr->skeyseed, skeyseed, skr);
>>
>> but never copied the field back out.  This set the trap.
>> (I suspect this was ok as deleting the wirechunk SKR would reclaim the
>> memory).
>>
>> The second change:
>>
>> commit 944c9a31c1e4dff1ab92cdf9c85629b7270a6157
>> Date:   Tue Feb 18 04:15:35 2014 -0500
>>
>>     smoother nss:
>>     Many NSS pointers were passed through chunks making to code confusing
>>     and brittle.  No longer.
>>     A side-effect is that this change makes it harder to ditch NSS.
>>
>> replaced the wirechunk with individual pointer moves.  Since the old
>> code didn't extract the field skeyseed, the new code didn't either;
>> falling into the trap :-)
>>
>> I've pushed a fix.  For 3.19, something like:
>>
>> @@ -143,6 +142,7 @@ static void calc_skeyseed_v2(struct pcr_skeyid_q
>> *skq,
>>         PK11SymKey *finalkey = ikev2_ike_sa_keymat(skq->prf, skeyseed_k,
>>                                                    ni, nr, spii, spir,
>>                                                    total_keysize);
>> +       free_any_symkey("skeyseed_", &skeyseed_k);
>>
>>         size_t next_byte = 0;
>>
>> might be sufficient (but I've not tested it).
>>
>> Andrew
>>
> _______________________________________________
> Swan-dev mailing list
> Swan-dev at lists.libreswan.org
> https://lists.libreswan.org/mailman/listinfo/swan-dev


More information about the Swan-dev mailing list