[Swan] Valid packets dropping in the kernel

Dharma Indurthy dharma at redoxengine.com
Thu Nov 8 02:11:07 UTC 2018


Here we go:

tcpdump -n "host 12.131.93.13 or host 10.50.32.166 or host 10.253.1.53" -XXX
01:18:24.636686 IP 12.131.93.13.4500 > 172.20.109.76.4500: UDP-encap: ESP(spi=0x67526edb,seq=0x6ba46), length 100
0x0000:  0eef 4216 5634 0e82 073f 73ab 0800 4500  ..B.V4...?s...E.
0x0010:  0080 2acb 0000 e811 24b1 0c83 5d0d ac14  ..*.....$...]...
0x0020:  6d4c 1194 1194 006c b334 6752 6edb 0006  mL.....l.4gRn...
0x0030:  ba46 1209 8dbf dc18 04a9 3acd 75d9 ad46  .F........:.u..F
0x0040:  a7a0 add5 3b98 0240 4e94 8f91 9206 5943  ....;..@N.....YC
0x0050:  74a1 42f7 8714 9596 6d86 0208 c253 a5de  t.B.....m....S..
0x0060:  dc75 4fa4 c61b e75c 6c93 6c79 4442 0701  .uO....\l.lyDB..
0x0070:  7f66 b151 7536 e70c 5113 7ff7 708e 5df8  .f.Qu6..Q...p.].
0x0080:  21c4 f3ec 47e4 1ac4 2b94 3e76 213f       !...G...+.>v!?
01:18:24.636686 IP 10.50.32.166 > 10.253.1.53: ICMP echo request, id 5, seq 44181, length 40
0x0000:  0eef 4216 5634 0e82 073f 73ab 0800 4500  ..B.V4...?s...E.
0x0010:  003c 1959 0000 7e01 ec5e 0a32 20a6 0afd  .<.Y..~..^.2....
0x0020:  0135 0800 a0c1 0005 ac95 6162 6364 6566  .5........abcdef
0x0030:  6768 696a 6b6c 6d6e 6f70 7172 7374 7576  ghijklmnopqrstuv
0x0040:  7761 6263 6465 6667 6869                 wabcdefghi
<Disappear>
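
A minimal sketch, assuming a plain Ethernet/IPv4 frame with no VLAN tag, of
pulling the SPI and sequence number out of the first frame captured above so
they can be compared with the kernel's SA list. The function name
parse_espinudp is made up for illustration; the bytes are the first 50 bytes
of the hex dump.

```python
import socket
import struct


def parse_espinudp(frame: bytes):
    """Return (src, dst, sport, dport, spi, seq) for an Ethernet/IPv4/UDP/ESP frame."""
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 frame")
    ihl = (frame[14] & 0x0F) * 4          # IPv4 header length in bytes
    if frame[14 + 9] != 17:               # IP protocol field, 17 = UDP
        raise ValueError("not a UDP packet")
    src = socket.inet_ntoa(frame[26:30])
    dst = socket.inet_ntoa(frame[30:34])
    udp = 14 + ihl                        # offset of the UDP header
    sport, dport = struct.unpack_from("!HH", frame, udp)
    # With UDP encapsulation the ESP header (SPI, sequence) follows the
    # 8-byte UDP header directly.
    spi, seq = struct.unpack_from("!II", frame, udp + 8)
    return src, dst, sport, dport, spi, seq


# Offsets 0x0000-0x0031 of the first frame in the capture above.
FRAME = bytes.fromhex(
    "0eef421656340e82073f73ab08004500"
    "00802acb0000e81124b10c835d0dac14"
    "6d4c11941194006cb33467526edb0006"
    "ba46"
)

src, dst, sport, dport, spi, seq = parse_espinudp(FRAME)
print(src, dst, sport, dport, hex(spi), hex(seq))
# prints: 12.131.93.13 172.20.109.76 4500 4500 0x67526edb 0x6ba46
```

The values agree with what tcpdump decoded on the summary line.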

ip xfrm policy show:

src 10.253.1.53/32 dst 10.50.32.166/32
dir out priority 1040351
tmpl src 172.20.109.76 dst 12.131.93.13
proto esp reqid 20117 mode tunnel

src 10.50.32.166/32 dst 10.253.1.53/32
dir fwd priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20117 mode tunnel
src 10.50.32.166/32 dst 10.253.1.53/32
dir in priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20117 mode tunnel

Happy to add the whole content, but these are the only policies for
10.50.32.166 and 10.253.1.53.

Similarly, ip xfrm state
src 12.131.93.13 dst 172.20.109.76
proto esp spi 0x67526edb reqid 20137 mode tunnel
replay-window 32 flag af-unspec
auth-trunc hmac(sha1) 0xe4af4a28e114241910040b7ce684f00949a28917 96
enc cbc(aes) 0x33b4bb6f365364e046808a5c701e24868830a81a2074cd2604421b627f7bcf4c
encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
anti-replay context: seq 0x6ba30, oseq 0x0, bitmap 0xffffffff

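The SPI matches the packets, which makes sense since we decrypt the payload.
XfrmInTmplMismatch increments every time one of these packets arrives.  It
looks like the reqid doesn't match: the state above has reqid 20137, but the
policies for 10.50.32.166 and 10.253.1.53 use reqid 20117.  Could that be it?
The reqids seem to match for the other tunnels.  The mismatch does not seem
to affect pings initiated on our side, though: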

01:32:27.776107 IP 172.20.75.204 > 10.153.32.166: ICMP echo request, id 13229, seq 16, length 64
0x0000:  0eef 4216 5634 0e82 073f 73ab 0800 4500  ..B.V4...?s...E.
0x0010:  0054 38fe 4000 4001 de8b ac14 4bcc 0a99  .T8.@.@.....K...
0x0020:  20a6 0800 66ad 33ad 0010 2b92 e35b 0000  ....f.3...+..[..
0x0030:  0000 84d4 0b00 0000 0000 1011 1213 1415  ................
0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
0x0060:  3637                                     67
01:32:27.776146 IP 172.20.109.76.4500 > 12.131.93.13.4500: UDP-encap: ESP(spi=0xbd768477,seq=0x13), length 132
0x0000:  0e82 073f 73ab 0eef 4216 5634 0800 4500  ...?s...B.V4..E.
0x0010:  00a0 0000 4000 4011 b75c ac14 6d4c 0c83  ....@.@..\..mL..
0x0020:  5d0d 1194 1194 008c 0000 bd76 8477 0000  ]..........v.w..
0x0030:  0013 3c03 c06e 9a3c 2ab2 4dc1 ac3f 5216  ..<..n.<*.M..?R.
0x0040:  86a3 c15a 73a8 eb54 3121 8347 6241 1d61  ...Zs..T1!.GbA.a
0x0050:  b817 48d6 9977 0dd0 0856 3815 47e9 bb13  ..H..w...V8.G...
0x0060:  bbca cc3e 2b71 cd16 a85f 54fe 6864 386f  ...>+q..._T.hd8o
0x0070:  95b5 8d3f 35eb ca05 8eaa ae65 8da0 d22c  ...?5......e...,
0x0080:  1aa6 a6b6 c4b3 1085 cb71 3cb3 5088 d464  .........q<.P..d
0x0090:  f069 8ca5 b9dd a4e6 b1f0 f287 da3a 8349  .i...........:.I
0x00a0:  91a0 b9fc a7c9 b6fb 2c84 2bc5 f5f3       ........,.+...
01:32:27.812407 IP 12.131.93.13.4500 > 172.20.109.76.4500: UDP-encap: ESP(spi=0x91a1288c,seq=0x13), length 132
0x0000:  0eef 4216 5634 0e82 073f 73ab 0800 4500  ..B.V4...?s...E.
0x0010:  00a0 6df5 0000 e811 e166 0c83 5d0d ac14  ..m......f..]...
0x0020:  6d4c 1194 1194 008c 0000 91a1 288c 0000  mL..........(...
0x0030:  0013 a37e e661 9916 867a 05cc c2c0 f31d  ...~.a...z......
0x0040:  0538 b85b 1350 8440 4962 f183 315b a103  .8.[.P.@Ib..1[..
0x0050:  d5ec 555e e974 227e 9e2a c454 34f7 86bd  ..U^.t"~.*.T4...
0x0060:  9940 f1e5 d76e 3719 14d0 cd69 a508 c1f7  .@...n7....i....
0x0070:  a8a0 0dc4 92b6 5207 15e3 9659 843e b8f4  ......R....Y.>..
0x0080:  c56e 7af0 f53e 60bc e8b9 5e5d 7e06 10f5  .nz..>`...^]~...
0x0090:  a5f1 dfd9 294d 749e 9f67 3f4a ee00 9878  ....)Mt..g?J...x
0x00a0:  ae3d 1f89 984c 363f 553c 5a16 20c8       .=...L6?U<Z...
01:32:27.812407 IP 10.50.32.166 > 10.253.0.1: ICMP echo reply, id 13229, seq 16, length 64
0x0000:  0eef 4216 5634 0e82 073f 73ab 0800 4500  ..B.V4...?s...E.
0x0010:  0054 05a8 0000 7e01 012c 0a32 20a6 0afd  .T....~..,.2....
0x0020:  0001 0000 6ead 33ad 0010 2b92 e35b 0000  ....n.3...+..[..
0x0030:  0000 84d4 0b00 0000 0000 1011 1213 1415  ................
0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
0x0060:  3637                                     67
01:32:27.812447 IP 10.153.32.166 > 172.20.75.204: ICMP echo reply, id 13229, seq 16, length 64
0x0000:  0e82 073f 73ab 0eef 4216 5634 0800 4500  ...?s...B.V4..E.
0x0010:  0054 05a8 0000 7d01 14e2 0a99 20a6 ac14  .T....}.........
0x0020:  4bcc 0000 6ead 33ad 0010 2b92 e35b 0000  K...n.3...+..[..
0x0030:  0000 84d4 0b00 0000 0000 1011 1213 1415  ................
0x0040:  1617 1819 1a1b 1c1d 1e1f 2021 2223 2425  ...........!"#$%
0x0050:  2627 2829 2a2b 2c2d 2e2f 3031 3233 3435  &'()*+,-./012345
0x0060:  3637                                     67
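
The XfrmInTmplMismatch counter mentioned earlier is one of the counters the
kernel exposes in /proc/net/xfrm_stat (available when the kernel is built
with CONFIG_XFRM_STATISTICS).  A rough sketch of diffing two snapshots of
that file to see which counters move while a test ping is running.  The
helper names and the sample numbers below are invented for illustration; on
a live gateway the two snapshots would come from read_live().

```python
from pathlib import Path


def parse_xfrm_stat(text: str) -> dict:
    """Parse 'Name <whitespace> value' lines into a {name: int} dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def changed_counters(before: dict, after: dict) -> dict:
    """Return {name: delta} for every counter whose value changed."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }


def read_live() -> dict:
    """Read the counters from the running kernel (Linux only)."""
    return parse_xfrm_stat(Path("/proc/net/xfrm_stat").read_text())


# Made-up snapshots, standing in for two reads taken a few seconds apart.
BEFORE = "XfrmInError 0\nXfrmInNoStates 3\nXfrmInTmplMismatch 120\n"
AFTER = "XfrmInError 0\nXfrmInNoStates 3\nXfrmInTmplMismatch 125\n"

delta = changed_counters(parse_xfrm_stat(BEFORE), parse_xfrm_stat(AFTER))
print(delta)
```

If only XfrmInTmplMismatch moves while the customer pings, that points at
the policy template check rather than at decryption or replay protection.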

Definitely seems interesting, but I have no idea what causes the reqid to get
out of sync.  Actually, the state's reqid (20137) matches the policies for
the 1x3 connection:
src 10.50.36.4/32 dst 10.253.0.1/32
dir fwd priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.50.36.4/32 dst 10.253.0.1/32
dir in priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.253.0.1/32 dst 10.50.36.4/32
dir out priority 1040351
tmpl src 172.20.109.76 dst 12.131.93.13
proto esp reqid 20137 mode tunnel

But not the others.
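
A sketch of automating that comparison.  It is a simplification: the real
kernel template check looks at more than this, and the sketch only compares
the tunnel endpoints and the reqid of each in/fwd policy template against the
installed states.  The parsing helpers are hypothetical and handle only the
fields shown in this message; the sample text is the in/fwd policies and the
single state pasted above.

```python
import re

POLICY_TEXT = """\
src 10.50.32.166/32 dst 10.253.1.53/32
dir fwd priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20117 mode tunnel
src 10.50.32.166/32 dst 10.253.1.53/32
dir in priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20117 mode tunnel
src 10.50.36.4/32 dst 10.253.0.1/32
dir fwd priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
src 10.50.36.4/32 dst 10.253.0.1/32
dir in priority 1040351
tmpl src 12.131.93.13 dst 172.20.109.76
proto esp reqid 20137 mode tunnel
"""

STATE_TEXT = """\
src 12.131.93.13 dst 172.20.109.76
proto esp spi 0x67526edb reqid 20137 mode tunnel
"""


def parse_policies(text):
    """Parse `ip xfrm policy` output into a list of dicts."""
    policies, cur = [], None
    for line in (raw.strip() for raw in text.splitlines()):
        m = re.match(r"src (\S+/\d+) dst (\S+/\d+)", line)
        if m:  # selector line starts a new policy
            cur = {"sel_src": m.group(1), "sel_dst": m.group(2)}
            policies.append(cur)
        elif cur is None:
            continue
        elif line.startswith("dir "):
            cur["dir"] = line.split()[1]
        elif line.startswith("tmpl src "):
            fields = line.split()
            cur["tmpl_src"], cur["tmpl_dst"] = fields[2], fields[4]
        elif "reqid" in line:
            cur["reqid"] = int(re.search(r"reqid (\d+)", line).group(1))
    return policies


def parse_states(text):
    """Parse `ip xfrm state` output into a list of dicts."""
    states, cur = [], None
    for line in (raw.strip() for raw in text.splitlines()):
        m = re.match(r"src (\S+) dst (\S+)$", line)
        if m:
            cur = {"src": m.group(1), "dst": m.group(2)}
            states.append(cur)
            continue
        m = re.match(r"proto esp spi (0x[0-9a-f]+) reqid (\d+)", line)
        if m and cur is not None:
            cur["spi"], cur["reqid"] = m.group(1), int(m.group(2))
    return states


def unmatched_inbound(policies, states):
    """Return in/fwd policies with no state matching their template."""
    bad = []
    for p in policies:
        if p.get("dir") not in ("in", "fwd"):
            continue
        if not any(
            s["src"] == p["tmpl_src"]
            and s["dst"] == p["tmpl_dst"]
            and s.get("reqid") == p["reqid"]
            for s in states
        ):
            bad.append(p)
    return bad


bad = unmatched_inbound(parse_policies(POLICY_TEXT), parse_states(STATE_TEXT))
for p in bad:
    print(f"{p['dir']} {p['sel_src']} -> {p['sel_dst']}: no state with reqid {p['reqid']}")
```

With only the one state shown here, the two 10.50.32.166 policies are
reported and the 1x3 policies are not, which is the same mismatch described
above.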

On Tue, Nov 6, 2018 at 10:59 AM Dharma Indurthy <dharma at redoxengine.com>
wrote:

> Hey, Paul.  I appreciate your response.
>
>> Do not use leftsourceip= if you specify more than one leftsubnet. Also,
>> leftsourceip= must be an IP address within the (single) leftsubnet=
>>
>> >     right=12.131.93.13
>> >     rightsubnets=" 10.50.32.166/32 10.50.32.239/32 10.50.36.4/32 "
>> >     rightsourceip=12.131.93.13
>>
>> The same applies here.
>>
>
> Good to know, but I don't think it's getting used.  We'll clean up the
> config.
>
>
>> > SAs come up, and we can ping their side.
>>
>> > 000 #3166924: "orthooklahoma3937/1x1":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 918s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3166924: "orthooklahoma3937/1x1" esp.815a3ae9@12.131.93.13
>> esp.618dd3ad@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3167825: "orthooklahoma3937/1x2":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 1148s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3167825: "orthooklahoma3937/1x2" esp.73c12328@12.131.93.13
>> esp.b76a1e64@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3165167: "orthooklahoma3937/1x3":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 82s; newest IPSEC; eroute owner;
>> isakmp#3136241; idle; import:admin initiate
>> > 000 #3165167: "orthooklahoma3937/1x3" esp.33a967a1@12.131.93.13
>> esp.72596d49@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3166787: "orthooklahoma3937/2x1":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 891s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3166787: "orthooklahoma3937/2x1" esp.970dcc23@12.131.93.13
>> esp.207c2a70@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3166964: "orthooklahoma3937/2x2":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_REPLACE in 602s; newest IPSEC; eroute
>> owner; isakmp#3166786; idle; import:admin initiate
>> > 000 #3166964: "orthooklahoma3937/2x2" esp.61180b3@12.131.93.13
>> esp.50ff9d05@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=1KB ESPout=1KB!
>> ESPmax=4194303B
>> > 000 #3162278: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2,
>> IPsec SA established); EVENT_SA_EXPIRE in 437s; isakmp#3136241; idle;
>> import:admin initiate
>> > 000 #3162278: "orthooklahoma3937/2x3" esp.e4c24f90@12.131.93.13
>> esp.cadf8591@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3162955: "orthooklahoma3937/2x3":4500 STATE_QUICK_R2 (IPsec SA
>> established); EVENT_SA_REPLACE in 399s; newest IPSEC; eroute owner;
>> isakmp#3136241; idle; import:admin initiate
>> > 000 #3162955: "orthooklahoma3937/2x3" esp.d783e492@12.131.93.13
>> esp.1d0a885d@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=42KB ESPout=0B!
>> ESPmax=4194303B
>> > 000 #3166786: "orthooklahoma3937/2x3":4500 STATE_MAIN_R3 (sent MR3,
>> ISAKMP SA established); EVENT_SA_REPLACE in 26486s; newest ISAKMP; nodpd;
>> idle; import:admin initiate
>> >
>> > We have duplicate SAs for some reason -- you can see that for 2x3, not
>> > sure if that matters.
>>
>> It should not matter. What seems to have happened is that when you
>> established the IKE SA, and you were in the process of establishing all
>> the IPsec SA's, the other end also started doing the same IPsec SA's.
>> So you ended up with one connection which was initiated by you and
>> responded to by you. One of them should vanish after a little while.
>>
> Yeah, that's what I thought.  They do come and go, but we consistently
> have two:
> 000 #439432: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2, IPsec
> SA established); EVENT_SA_EXPIRE in 47s; isakmp#430186; idle; import:admin
> initiate
> 000 #439432: "orthooklahoma3937/2x3" esp.16ea20ad@12.131.93.13
> esp.6916d827@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
> 000 #449005: "orthooklahoma3937/2x3":4500 STATE_QUICK_I2 (sent QI2, IPsec
> SA established); EVENT_SA_REPLACE in 1873s; newest IPSEC; eroute owner;
> isakmp#430186; idle; import:admin initiate
> 000 #449005: "orthooklahoma3937/2x3" esp.523917e3@12.131.93.13
> esp.51b2fd1a@172.20.109.76 ref=0 refhim=0 Traffic: ESPin=0B ESPout=0B!
> ESPmax=4194303B
>
> ^At the moment, we have two that our side has initiated.  Still, as far as
> I can see, no big deal.  Seems to be valid on both sides.
>
>> > It's the 1x1 SA that's pertinent.  We NAT the source and target IPs via
>> > PREROUTING and POSTROUTING rules, and I can see traffic initiated by the
>> > customer hitting PREROUTING but never hitting POSTROUTING and never
>> > leaving the box.
>>
>> Are you using the policy matching for ipsec? See:
>>
>
> We don't use policy matching, but we've never had to before.  For inbound
> customer traffic, we PREROUTE to match the config, and then we POSTROUTE to
> NAT the traffic past our gateway.  You can see the pings match the config
> and disappear.  We do this for all our tunnels, so I'm pretty sure it's not
> that, but correct me if I'm wrong.  If it were an iptables error, I'd expect
> the behavior to be consistent, but the connection works for a while and
> then breaks.  It's working now, for example:
>
> working:
> 18:46:43.944652 IP 12.131.93.13.4500 > 172.20.109.76.4500: UDP-encap:
> ESP(spi=0x760cc9e1,seq=0x6c047), length 100 << Into our gateway
> 18:46:43.944652 IP 10.50.32.166 > 10.253.1.53: ICMP echo request, id 4,
> seq 36054, length 40 << Through PREROUTING
> 18:46:43.944700 IP 10.153.32.166 > 172.20.75.204: ICMP echo request, id
> 4, seq 36054, length 40 << Through POSTROUTING
>
> not working from before:
>
> 18:52:14.753803 IP 12.131.93.13.4500 > 172.20.109.76.4500: UDP-encap:
> ESP(spi=0x57369ff6,seq=0x14254d), length 100 << Into our gateway
> 18:52:14.753803 IP 10.50.32.166 > 10.253.1.53: ICMP echo request, id 2,
> seq 16669, length 40 << Through PREROUTING
> << Vanish! >>
>
> It's working now, so I don't have any useful xfrm state info to show, but
> I can produce that when it breaks again.  Any more info I can provide?
>
> -Dharma
>