8 March 2012, 16:15

This post covers:

 

-          OAM implementation on Junos

-          Default values of the OAM LFM parameters (in relation to Part 1)

-          Configuration / understanding: Neighbor discovery phase

-          Configuration / understanding: Remote loopback operation

-          Configuration / understanding: Link event

-          Configuration / understanding: Action profile

 

1/ OAM implementation:

 

Junos has supported OAM LFM since release 8.4. It does not support OAM PDU codes 0x02 and 0x03 (Variable Request and Variable Response).

 

LFM has its own dedicated process, lfmd. The OAM PDU send/receive state machine is managed by the PPMd process, which handles adjacencies and PDU transmission for several protocols: ISIS, OSPF, BFD, LACP, VRRP. In other words, PPMd uses IPC to talk to other daemons such as RPD, LACPD, LFMd or VRRPd. To see which adjacency/transmission “tasks” are managed by PPMd, you can use this hidden command:

 


sponge@bob> show ppm connections brief

 

Protocol      Logical system ID  Adjacencies      Transmissions

BFD           All                0                0

LACP          None               9                9

LFM           None               1                1

OAMD          None               0                0

STP           None               0                0

OSPF2         None               1                1

LDP           None               7                7

ISIS          None               9                9

ESIS          None               3                9

OSPF2         1                  1                1

LDP           1                  2                2

ISIS          1                  3                3

ESIS          1                  1                3

CFM           None               0                0

PFE (fpc0)    540                3                3

PFE (fpc3)    543                3                3

PFE (fpc1)    541                3                3

PFE (fpc9)    549                0                0

PFE (fpc7)    547                1                1

 

Connections: 19, Remote connections: 5



PPMd has delegated some tasks to the PFE since the 9.x releases (I don’t remember the exact release, and it depends on hardware and protocols). This is called PPM distribution at PFE level, or remote connections (see above: each FPC has some remote connections). Only some protocols are distributed to the PFE, notably BFD, LACP, LFM, CFM… However, this depends on the Junos version as well as on the FPC/DPC/MPC hardware.

 

For example, TRIO-based cards only support PPMd-distributed LFM since 11.1. Before that release, OAM LFM PDUs are handled by the Routing Engine. To see which protocols are distributed at PFE level, and for which FPC/interface, use this hidden command (available at least since 10.2):

 


sponge@bob> show ppm interfaces remote detail | match "Protocol|fpc"

 

IFL-index: 84, Protocol: LACP

Distributed, Distribution handle: 44, Distribution address: fpc0

IFL-index: 87, Protocol: LACP

Distributed, Distribution handle: 55, Distribution address: fpc0

IFL-index: 96, Protocol: LACP

Distributed, Distribution handle: 88, Distribution address: fpc1

IFL-index: 99, Protocol: LACP

Distributed, Distribution handle: 108, Distribution address: fpc1

IFL-index: 84, Protocol: LFM

Distributed, Distribution handle: 40, Distribution address: fpc0

IFL-index: 85, Protocol: LFM

Distributed, Distribution handle: 172, Distribution address: fpc3

[…]



This output can be read as follows: FPC0 supports distributed PPM for the LFM and LACP protocols. If I configure these protocols on, for example, the interface with IFL index 84, LACP and LFM (adjacency and some PDU messages) will be handled by PPM at PFE level (no RE CPU consumption).
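To make that reading concrete, here is a minimal Python sketch that groups the lines of this output by IFL index. It is only an illustration: the function name `distributed_protocols` and the `SAMPLE` text are mine (the sample is copied from the output above), and the regular expressions assume the two-line layout shown here, which may differ between Junos releases.

```python
import re
from collections import defaultdict

# Sample copied from the "show ppm interfaces remote detail" output above.
SAMPLE = """\
IFL-index: 84, Protocol: LACP
Distributed, Distribution handle: 44, Distribution address: fpc0
IFL-index: 87, Protocol: LACP
Distributed, Distribution handle: 55, Distribution address: fpc0
IFL-index: 84, Protocol: LFM
Distributed, Distribution handle: 40, Distribution address: fpc0
IFL-index: 85, Protocol: LFM
Distributed, Distribution handle: 172, Distribution address: fpc3
"""

def distributed_protocols(text):
    """Return {ifl_index: {(protocol, fpc), ...}} for distributed entries."""
    result = defaultdict(set)
    current = None  # (ifl_index, protocol) of the entry being read
    for line in text.splitlines():
        header = re.search(r"IFL-index:\s*(\d+),\s*Protocol:\s*(\S+)", line)
        if header:
            current = (int(header.group(1)), header.group(2))
            continue
        dist = re.search(r"Distributed,.*Distribution address:\s*(\S+)", line)
        if dist and current:
            ifl, protocol = current
            result[ifl].add((protocol, dist.group(1)))
            current = None
    return dict(result)

for ifl, entries in sorted(distributed_protocols(SAMPLE).items()):
    print(ifl, sorted(entries))
```

For IFL index 84 this lists both LACP and LFM on fpc0, which is exactly the case described above.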

 

For an LFM configuration with short timers (minimum 100 ms), it is strongly recommended to run LFM on hardware/software that supports LFM distribution at PFE level. Otherwise, you may experience unexpected LFM adjacency flaps.

 

The last hidden command shows the configured LFM adjacencies that are already up (meaning discovery is complete):

 


sponge@bob> show ppm adjacencies remote detail | find LFM

 

Protocol: LFM, Hold time: 300, IFL-index: 84

Distributed

Distribution handle: 82, Distribution address: fpc0

 

Adjacencies: 10, Remote adjacencies: 10



If you want to be sure that, once an LFM adjacency is up, no OAM PDUs are sent or received by the RE, you can use this tip (here we have a problem):

 


sponge@bob> monitor traffic interface ge-x/y/z  no-resolve matching "ether host 1:80:c2:0:0:2"

verbose output suppressed, use <detail> or <extensive> for full protocol decode

Address resolution is OFF.

Listening on ge-1/0/0, capture size 96 bytes

 

11:09:37.102836  In OAM, length 46

11:09:37.173960 Out OAM, length 38

11:09:37.210130  In OAM, length 46

11:09:37.273936 Out OAM, length 38



If you see these packets, OAM LFM is being handled by the RE, which is strongly discouraged.

 

2/ Default timers:

 

The following table provides the default, minimum and maximum values of the OAM LFM timers.

 

802.3ah                                Junos                  Default value        Min value   Max value
pdu_timer                              pdu-interval           1000 ms              100 ms      1000 ms
local_lost_link_timer                  pdu-threshold          x3 (pdu-interval)    x1          x10
Errored Symbol Window                  symbol-period          1 s                  1 s         1 s
Errored Frame Threshold                frame-error            100 ms               100 ms      100 ms
Errored Frame Threshold                frame-period           1 s                  1 s         1 s
Errored Frame Seconds Summary Window   frame-period-summary   60 s                 60 s        60 s

 

3/ Configuring/troubleshooting LFM:

 

We refer to this diagram:

 

[Figure: oam-config]

 

 

Discovery phase:

 

OAM LFM is configured under “protocols oam ethernet link-fault-management”. For the discovery phase, you have to set the interface-specific parameters, either per interface or via an apply-group.

 


 [edit protocols oam ethernet link-fault-management]

 

sponge@bob# show

interface ge-1/0/0 {

    pdu-interval 100;

    link-discovery active;

    pdu-threshold 3;

    negotiation-options {

        no-allow-link-events;

        allow-remote-loopback;

    }

}


 

-          pdu-interval refers to pdu_timer (the interval in ms between two OAM PDUs sent)

-          pdu-threshold refers to the number of consecutive lost OAM PDU “keepalives” before the LFM adjacency is declared down (neighbor failure). Here, the hold time is 300 ms (3 x 100 ms).

-          link-discovery refers to the Active or Passive mode of the OAM client

-          negotiation-options refers to the OAM client capabilities. no-allow-link-events disables the sending of Link Event OAM PDUs (code 0x01). allow-remote-loopback informs the peer that “you” support remote loopback control requests.
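The relation between pdu-interval, pdu-threshold and the hold time is a simple product. A tiny Python sketch, assuming only that relation (the function name `lfm_hold_time_ms` is mine, not a Junos term):

```python
def lfm_hold_time_ms(pdu_interval_ms: int, pdu_threshold: int) -> int:
    """Time without any received OAM PDU before the adjacency is declared down."""
    return pdu_interval_ms * pdu_threshold

# Values from the configuration above: pdu-interval 100, pdu-threshold 3
print(lfm_hold_time_ms(100, 3))   # 300

# Junos defaults from the timer table: 1000 ms x 3
print(lfm_hold_time_ms(1000, 3))  # 3000
```

The first result matches the “Hold time: 300” reported for the LFM adjacency in the “show ppm adjacencies remote detail” output shown earlier.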

 

If you configure OAM LFM like this on each side of an Ethernet segment, the discovery phase explained in Part 1 will begin and, if no unexpected error occurs, the LFM adjacency should reach the discovery-complete state (the SEND ANY step of the state machine).

 

To check that, use the following command:

 


sponge@bob> show oam ethernet link-fault-management ge-1/0/0

  Interface: ge-1/0/0

    Status: Running, Discovery state: Send Any

    Peer address: 02:21:59:a2:e9:c2

    Flags:Remote-Stable Remote-State-Valid Local-Stable 0x50

    Remote entity information:

      Remote MUX action: forwarding, Remote parser action: forwarding

      Discovery mode: active, Unidirectional mode: unsupported

      Remote loopback mode: supported, Link events: supported

      Variable requests: unsupported


 

A lot of interesting information is provided by this simple command. You can check whether the OAM LFM adjacency is up:

-          Discovery state is SEND_ANY

-          and Remote-Stable and Local-Stable are both set to 1 (flags field == 0x50)
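Here is a small Python sketch of that check. The bit positions are the ones I understand from the OAM PDU Flags field of IEEE 802.3ah (Clause 57); treat them as an assumption and check them against the standard. The names `decode_flags` and `adjacency_is_up` are mine, and Junos prints its own labels (for example Remote-State-Valid) next to the raw value.

```python
# Assumed bit layout of the OAM PDU Flags field (IEEE 802.3ah, Clause 57).
FLAG_BITS = {
    0x01: "Link-Fault",
    0x02: "Dying-Gasp",
    0x04: "Critical-Event",
    0x08: "Local-Evaluating",
    0x10: "Local-Stable",
    0x20: "Remote-Evaluating",
    0x40: "Remote-Stable",
}

STABLE = 0x10 | 0x40  # Local-Stable + Remote-Stable = 0x50

def decode_flags(value):
    """Return the names of the bits set in the flags value."""
    return [name for bit, name in FLAG_BITS.items() if value & bit]

def adjacency_is_up(discovery_state, flags):
    """Up means: discovery state is Send Any and both Stable bits are set."""
    state_ok = discovery_state.strip().lower().replace("_", " ") == "send any"
    return state_ok and (flags & STABLE) == STABLE

print(decode_flags(0x50), adjacency_is_up("Send Any", 0x50))
print(decode_flags(0x08), adjacency_is_up("Active Send Local", 0x08))
```

The second call uses the values that show up later in this post when the adjacency is lost (Flags:0x8, discovery state Active Send Local).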

 

You have direct access to the remote mode, state and capabilities. In this example Patrick is configured in Active mode, supports remote loopback (i.e. allow-remote-loopback) and, unlike Bob, supports link events. Moreover, we have information about the Mux state: forwarding (Patrick can send non-OAM PDU frames), and about the Parser state: forwarding as well (Patrick can receive non-OAM PDU frames).

 

Important point: if the discovery phase fails (for example, Patrick has no LFM configuration yet), there is no action on the link layer and consequently no impact on the other protocols already configured on the link. In our case, LACP keeps its state up (collecting/distributing). Discovery must have completed successfully at least once, and a specific action must be configured, before LFM can have an impact on other protocols attached to the link.

 


Remote Loopback mode:

 

Personally, I don’t use remote loopback mode on internal backbone links, but it can be useful for testing end-to-end layer 2 connectivity with a customer without configuring anything at layer 3 (e.g. a PE-CE link).

 

As I wrote in Part 1, loopback mode cuts the normal forwarding of traffic over the link in LB. All the other protocols will go down.

 

Note: Some Junos hardware doesn’t support remote loopback, even if you configure allow-remote-loopback.

 

The remote loopback operation forces the remote router to switch to LB mode. In other words, all Ethernet frames sent to the remote router will be looped back unchanged. To accept a remote loopback request from a remote peer, a router has to support this capability (see above: allow-remote-loopback) and to be configured in Active mode.

 

In our example, I’m going to configure remote loopback mode on Bob. Once committed, this triggers the sending of a Remote Loopback request from Bob to Patrick. Once Patrick has accepted the request, it switches its link to LB mode. To acknowledge the request, Patrick sends the new state of its Mux and Parser via its periodic OAM Information PDU.

 

On Bob: 

 


[edit protocols oam ethernet link-fault-management]

sponge@bob# set interface ge-1/0/0 remote-loopback

sponge@bob# commit comment Set_RemoteLB and-quit


 

Once committed, go to Patrick to check its state:

 


star@patrick> show oam ethernet link-fault-management ge-0/0/0

  Interface: ge-0/0/0

    Status: Running, Discovery state: Send Any

    Peer address: 02:1b:c0:d5:95:67

    Flags:Remote-Stable Remote-State-Valid Local-Stable 0x50

    Remote loopback status: Enabled on local port, Disabled on peer port

    Remote entity information:

      Remote MUX action: forwarding, Remote parser action: discarding

      Discovery mode: active, Unidirectional mode: unsupported

      Remote loopback mode: unsupported, Link events: unsupported

      Variable requests: unsupported


 

As you can see, OAM is still the only protocol up (flags = 0x50 and state machine in the Send_Any state). You can check the remote LB status: here Patrick’s ge-0/0/0 link is in LB. Moreover, the remote router Bob (remote entity information) has changed its parser state to discarding. Remember, we configured “remote-loopback” on Bob, but it is Patrick that loops back the frames coming from Bob. So Bob can forward frames (Mux state) onto the link toward Patrick, which loops them back, and Bob then has to discard the looped-back frames (Parser state).

 

On Patrick, you can check the status flags of the link ge-0/0/0. The “Looped” flag must be set.

 


star@patrick> show interfaces ge-0/0/0

Physical interface: ge-0/0/0, Enabled, Physical link is Up

  Interface index: 261, SNMP ifIndex: 503

  Description: To Sponge Bob ge-0/0/1

  Link-level type: Ethernet, MTU: 4484, Speed: 1000mbps, BPDU Error: None,

  MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,

  Flow control: Disabled, Auto-negotiation: Enabled, Remote fault: Online

  Device flags   : Present Running

  Interface flags: SNMP-Traps Looped Internal: 0x4004000



On Bob:

 

Just monitor traffic on the ge-1/0/0 interface, matching on a source MAC address equal to the MAC address of Bob’s ge-1/0/0 interface.

 


sponge@bob> monitor traffic interface ge-1/0/0 layer2-headers no-resolve matching "ether src 02:1b:c0:d5:95:67" 

18:39:21.293912 Out 2:1b:c0:d5:95:67 > 1:80:c2:0:0:2, ethertype Slow Protocols (0x8809), length 74: LACPv1, length 60

18:39:21.294356  In 2:1b:c0:d5:95:67 > 1:80:c2:0:0:2, ethertype Slow Protocols (0x8809), length 74: LACPv1, length 60

18:39:22.294433 Out 2:1b:c0:d5:95:67 > 1:80:c2:0:0:2, ethertype Slow Protocols (0x8809), length 74: LACPv1, length 60

18:39:22.294900  In 2:1b:c0:d5:95:67 > 1:80:c2:0:0:2, ethertype Slow Protocols (0x8809), length 74: LACPv1, length 60


 

Hey! We see Bob’s LACP frames coming back. It works. Nice!

 

To deactivate the remote loopback operation, just delete the previous command on Bob.

 


[edit protocols oam ethernet link-fault-management] 

sponge@bob# delete interface ge-1/0/0 remote-loopback

sponge@bob# commit comment Delete_RemoteLB and-quit


   

 

Link Events:

 

By default, link events are supported by Junos. You can disable this capability with the no-allow-link-events statement shown earlier.

 

Configuring link events on Junos is not really intuitive (even if I love Junos :-), the Cisco IOX implementation of this part is less complex to understand). You can configure the threshold for the 4 link events allowed by 802.3ah (see Part 1). The analysis window is hard-coded and depends on the type of link event.

 

The threshold configuration for each event does not allow much granularity. Each event threshold can be set between 1 and 100.

 

-          Event Frame-Error: 1 to 100 errored frames detected during a window of 100 ms.

-          Event Frame-Period: 1 to 100 errored frames detected, compared with the number of 64-byte frames received at line rate during 1 s.

-          Event Frame-Period-summary: 1 to 100 errored frames per second during a window of 60 s.

-          Event Symbol-Period: 1 to 100 errored symbols, compared with the number of symbols that can be sent during 1 s.

 

Link Event PDUs are sent at a periodic rate that is hard-coded as well: 1 event every 5 seconds.

 

For Junos, a framing error means a frame with an invalid frame checksum. The counter is available at the interface CLI level:

 


 sponge@bob> show interfaces ge-1/0/0 extensive | match Framing 

    Errors: 0, Drops: 0, Framing errors: 0, Runts: 0, Policed discards: 0,


 

For Junos, a symbol error means any symbol code error at the physical interface level. The counter is available at the interface CLI level:

 


sponge@bob> show interfaces ge-1/0/0 extensive | match code

    Code violations                          0


 

Configure the link event thresholds at the “link-fault-management interface” level:

 


[edit protocols oam ethernet link-fault-management] 

sponge@bob# show

       interface ge-0/0/0 {

            pdu-interval 100;

            link-discovery active;

            pdu-threshold 3;

            negotiation-options {

                allow-remote-loopback;

            }

            event-thresholds {

                symbol-period 1;

                frame-error 1;

                frame-period 1;

                frame-period-summary 1;

            }

        } 



To troubleshoot link events you can use the following command:

 


sponge@bob> show oam ethernet link-fault-management ge-0/0/0 detail

  Interface: ge-0/0/0

    Status: Running, Discovery state: Send Any

    Peer address: 02:1b:c0:d5:95:67

    Flags:Remote-Stable Remote-State-Valid Local-Stable 0x50

    OAM receive statistics:

      Information: 110933, Event: 1, Variable request: 0, Variable response: 0

      Loopback control: 4, Organization specific: 0

    OAM flags receive statistics:

      Critical event: 0, Dying gasp: 0, Link fault: 0

    OAM transmit statistics:

      Information: 141481, Event: 3, Variable request: 0, Variable response: 0

      Loopback control: 0, Organization specific: 0

    OAM received symbol error event information:

      Events: 0, Window: 0, Threshold: 0

      Errors in period: 0, Total errors: 0

    OAM received frame error event information:

      Events: 0, Window: 0, Threshold: 0

      Errors in period: 0, Total errors: 0

    OAM received frame period error event information:

      Events: 0, Window: 0, Threshold: 0

      Errors in period: 0, Total errors: 0

    OAM received frame seconds error event information:

      Events: 0, Window: 0, Threshold: 0

      Errors in period: 0, Total errors: 0

    OAM transmitted symbol error event information:

      Events: 0, Window: 0, Threshold: 1

      Errors in period: 0, Total errors: 0

    OAM current symbol error event information:

      Events: 0, Window: 96248, Threshold: 1

      Errors in period: 0, Total errors: 0

    OAM transmitted frame error event information:

      Events: 0, Window: 50, Threshold: 1

      Errors in period: 0, Total errors: 0

    OAM current frame error event information:

      Events: 0, Window: 50, Threshold: 1

      Errors in period: 0, Total errors: 0

    Remote entity information:

      Remote MUX action: forwarding, Remote parser action: forwarding

      Discovery mode: active, Unidirectional mode: unsupported

      Remote loopback mode: unsupported, Link events: supported

      Variable requests: unsupported


 

You can check the total number of Link Event PDUs sent and received, as well as specific information on symbol or framing errors.

 

To clear the OAM statistics information use this command:



 sponge@bob> clear oam ethernet link-fault-management statistics 


 

When the remote peer receives the Link Event PDU and has an Action Profile referring to link events, it:

 

-          Extracts the errored frame/symbol count and the window value from the received Link Event PDU.

-          Compares the errored frame/symbol count divided by the window with the threshold configured in the Action Profile.

-          If the result is greater than the configured threshold, triggers the action.
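These three steps can be sketched in a few lines of Python. This only restates the comparison described above; it is not the lfmd implementation. The function name `link_event_triggers_action` and the sample numbers are hypothetical, and the unit of the window depends on the event type.

```python
def link_event_triggers_action(errored_count, window, threshold):
    """Apply the rule above: errors / window must exceed the configured threshold.

    errored_count and window come from the received Link Event PDU,
    threshold comes from the Action Profile (link-event-rate).
    """
    if window <= 0:
        # Nothing to compare against; do not trigger on a malformed event.
        return False
    return errored_count / window > threshold

# Hypothetical values: 150 errored frames over a window of 50, threshold 1.
print(link_event_triggers_action(150, 50, 1))  # True  (3.0 > 1)
print(link_event_triggers_action(50, 50, 1))   # False (1.0 is not > 1)
```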

 

 

Action Profile:

 

An Action Profile allows you to trigger one or more actions in response to specific events. The events may be:

 

-          link-adjacency down

-          link-event received

-          protocol-down, only used for CCC interfaces (removing family CCC notifies the 802.3ah protocol with the protocol-down event)

 

The action may be a syslog message and/or link down and/or sending the critical event bit (in the OAM Information PDU).

 

You can specify the action to be taken by the system when the configured link-fault event occurs. Multiple action profiles can be applied to a single interface. For each action-profile, at least one event and one action must be specified. The actions are taken only when all of the events in the action profile are true. If more than one action is specified, all the actions are executed.

 

Prefer one Action Profile per event and apply multiple Action Profiles to an interface (logical “or” between the Action Profiles).

 

I recommend at least one Action Profile that automatically disables the link layer of an interface when the OAM client loses the neighbor adjacency.

 


[edit protocols oam ethernet link-fault-management] 

sponge@bob# show

action-profile AP-ADJDOWN {

    event {

        link-adjacency-loss;

    }

    action {

        syslog;

        link-down;

    }

}

interface ge-0/0/0 {

    apply-action-profile AP-ADJDOWN;

    pdu-interval 100;

    link-discovery active;

    pdu-threshold 3;

    negotiation-options {

        allow-remote-loopback;

    }

    event-thresholds {

        symbol-period 1;

        frame-error 1;

        frame-period 1;

        frame-period-summary 1;

    }

}


 

In our example we declare an Action Profile that automatically shuts down the link layer of interface ge-0/0/0 (all protocols will go down except OAM LFM) and generates a syslog message when the OAM LFM adjacency is lost.

 

Simple test in our example: I’m blocking the Ethernet frames coming from Bob on Patrick to simulate an adjacency loss.

 

On Patrick I can see that the adjacency is down:

 


 star@patrick> show oam ethernet link-fault-management 

  Interface: ge-0/0/0

    Status: Running, Discovery state: Active Send Local

    Peer address: 00:00:00:00:00:00

    Flags:0x8

Application profile statistics:

  Profile Name                   Invoked     Executed

  AP-ADJDOWN                           1            1



The Action Profile has been invoked once and executed once: link-down + syslog actions. For the syslog action, you can see this in the messages file:

 


Mar  8 14:49:09  patrick lfmd[49580]: Action Syslog on ge-0/0/0 [AP-ADJDOWN]: :Adjacency Lost


 

So now, we can check the status of the interface.

 


star@patrick> show interfaces ge-0/0/0

Physical interface: ge-0/0/0, Enabled, Physical link is Up

  Interface index: 133, SNMP ifIndex: 258

  Description: To Sponge Bob ge-1/0/0

  Link-level type: Ethernet, MTU: 1514, Speed: 1000mbps,

  MAC-REWRITE Error: None, Loopback: Disabled, Source filtering: Disabled,

  Flow control: Disabled, Auto-negotiation: Enabled, Remote fault: Online

  Device flags   : Present Running

  Interface flags: Link-Layer-Down SNMP-Traps Internal: 0x4004000


 

The physical link is still up, but the link layer is down (see Interface flags). So only OAM PDUs can be forwarded over the link.

 

Now, we want to add another action profile for when framing errors are detected. Just create another one with syslog as its only action.

 


[edit protocols oam ethernet link-fault-management] 

sponge@bob# show

action-profile AP-ADJDOWN {

    event {

        link-adjacency-loss;

    }

    action {

        syslog;

       link-down;

    }

}

action-profile AP-FRAME {

    event {

        link-event-rate {

            frame-error 1;

        }

    }

    action {

        syslog;

    }

}

interface ge-0/0/0 {

    apply-action-profile [ AP-ADJDOWN AP-FRAME ];

    pdu-interval 100;

    link-discovery active;

    pdu-threshold 3;

    negotiation-options {

        allow-remote-loopback;

    }

    event-thresholds {

        symbol-period 1;

        frame-error 1;

        frame-period 1;

        frame-period-summary 1;

    }

}


You can see that action profiles are combined in a logical “or” manner.
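To summarise the semantics (logical “and” between the events inside one profile, logical “or” between the profiles applied to an interface), here is a minimal Python sketch. It models only the rules described in this post, not the lfmd code. The class, the function name `triggered_profiles` and the event label "link-event-rate frame-error" are my own notation; the profile names are the ones from the configuration above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionProfile:
    name: str
    events: frozenset   # all of these must be true (logical "and")
    actions: tuple      # all of these are executed when the profile fires

def triggered_profiles(profiles, active_events):
    """Return (profile name, actions) for every profile whose events are all true.

    Each profile is evaluated independently: logical "or" between profiles.
    """
    active = set(active_events)
    fired = []
    for profile in profiles:
        # A profile needs at least one event; an empty one never fires here.
        if profile.events and profile.events <= active:
            fired.append((profile.name, profile.actions))
    return fired

PROFILES = [
    ActionProfile("AP-ADJDOWN", frozenset({"link-adjacency-loss"}),
                  ("syslog", "link-down")),
    ActionProfile("AP-FRAME", frozenset({"link-event-rate frame-error"}),
                  ("syslog",)),
]

# Adjacency lost: only AP-ADJDOWN fires (syslog + link-down).
print(triggered_profiles(PROFILES, {"link-adjacency-loss"}))
# Frame-error link event over threshold: only AP-FRAME fires (syslog).
print(triggered_profiles(PROFILES, {"link-event-rate frame-error"}))
```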

 

Regarding tests of the link events, I will update this part when I receive my OAM LFM licence for IXIA. So far I’ve only successfully generated the Symbol-period Link Event, by stressing the fiber.

 

David.

Published by junosandme in Posts

Comments

Sohee 08/05/2015 21:20

How to block ethernet frames (as you mentioned above) in Junos to simulate LFM adjacency loss? Thanks.