Several preparation steps are needed before you can add an RMA replacement device to your chassis cluster.
Step 1: Upgrade JunOS Remotely
The RMA device is usually delivered straight to the production environment for the replacement, so you will have to upgrade JunOS remotely first.
login: root
root> --- JUNOS 10.0R1.8 built 2009-11-03 10:06:39 UTC
root>
root> show version
Model: srx240-hm
JUNOS Software Release [10.0R1.8]
root> configure
Entering configuration mode
[edit]
root# delete
This will delete the entire configuration
Delete everything under this level? [yes,no] (no) yes
[edit]
root# set system root-authentication plain-text-password
New password:
Retype new password:
[edit]
root# commit and-quit
commit complete
Exiting configuration mode
root> set chassis cluster cluster-id 4 node 0 reboot
Successfully enabled chassis cluster. Going to reboot now
Next, add some basic configuration: an address on the fxp0.0 interface and a default static route. The SSH service also needs to be enabled.
root> show configuration
## Last commit: 2016-11-29 03:37:32 UTC by root
version 10.0R1.8;
system {
    root-authentication {
        encrypted-password "$1$2eav5HPL$01SUB9SOzDJl007hXhNVj0"; ## SECRET-DATA
    }
    services {
        ssh;
    }
}
interfaces {
    fxp0 {
        unit 0 {
            family inet {
                address 10.9.1.11/24;
            }
        }
    }
}
routing-options {
    static {
        route 0.0.0.0/0 next-hop 10.9.1.1;
    }
}
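The configuration shown above can also be expressed as set commands. The following is a minimal Python sketch that generates them; the function name `base_mgmt_config` is hypothetical, and the sanity check that the gateway sits inside the fxp0 subnet is an added assumption, not something Junos requires.

```python
from __future__ import annotations

import ipaddress


def base_mgmt_config(fxp0_address: str, gateway: str) -> list[str]:
    """Return Junos set commands for SSH, the fxp0.0 address, and a default route."""
    # ip_interface accepts "address/prefix" notation, e.g. "10.9.1.11/24".
    iface = ipaddress.ip_interface(fxp0_address)
    gw = ipaddress.ip_address(gateway)
    # Assumption: the next hop should be reachable on the management subnet.
    if gw not in iface.network:
        raise ValueError(f"{gw} is not inside {iface.network}")
    return [
        "set system services ssh",
        f"set interfaces fxp0 unit 0 family inet address {iface.with_prefixlen}",
        f"set routing-options static route 0.0.0.0/0 next-hop {gw}",
    ]


if __name__ == "__main__":
    for command in base_mgmt_config("10.9.1.11/24", "10.9.1.1"):
        print(command)
```

The printed lines are meant to be pasted in configuration mode and followed by a commit.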
{primary:node0}
root> request system software add /var/tmp/junos-srxsme-12.1X46-D55.3-domestic.tgz reboot
NOTICE: Validating configuration against junos-srxsme-12.1X46-D55.3-domestic.tgz.
NOTICE: Use the 'no-validate' option to skip this if desired.
Formatting alternate root (/dev/da0s2a)...
/dev/da0s2a: 298.0MB (610284 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 74.50MB, 4768 blks, 9600 inodes.
super-block backups (for fsck -b #) at:
32, 152608, 305184, 457760
** /dev/altroot
FILE SYSTEM CLEAN; SKIPPING CHECKS
clean, 150096 free (24 frags, 18759 blocks, 0.0% fragmentation)
Checking compatibility with configuration
Initializing...
Verified manifest signed by PackageProduction_10_0_0
Verified junos-10.0R1.8-domestic signed by PackageProduction_10_0_0
Using junos-12.1X46-D55.3-domestic from /altroot/cf/packages/install-tmp/junos-12.1X46-D55.3-domestic
Copying package ...
veriexec: cannot validate /cf/var/validate/chroot/junos/pkg/manifest.certs: unhandled critical extension: /C=US/ST=CA/L=Sunnyvale/O=Juniper Networks/OU=Juniper CA/CN=PackageProductionRSA_2016/emailAddress=ca@juniper.net
chroot: /usr/bin/hwdb_xml_parser: Authentication error
Unable to regenerate Hardware Database, skipping hardware database checks at install time
chroot: tar: Authentication error
Validating against /config/juniper.conf.gz
cp: /cf/var/validate/chroot/var/etc/resolv.conf and /etc/resolv.conf are identical (not copied).
cp: /cf/var/validate/chroot/var/etc/hosts and /etc/hosts are identical (not copied).
chroot: /usr/sbin/mgd: Authentication error
Validation failed
WARNING: Current configuration not compatible with /altroot/cf/packages/install-tmp/junos-12.1X46-D55.3-domestic
{primary:node0}
root> request system software add /var/tmp/junos-srxsme-12.1X46-D55.3-domestic.tgz reboot no-validate
Formatting alternate root (/dev/da0s2a)...
/dev/da0s2a: 298.0MB (610284 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 74.50MB, 4768 blks, 9600 inodes.
super-block backups (for fsck -b #) at:
32, 152608, 305184, 457760
** /dev/altroot
FILE SYSTEM CLEAN; SKIPPING CHECKS
clean, 150096 free (24 frags, 18759 blocks, 0.0% fragmentation)
Installing package '/altroot/cf/packages/install-tmp/junos-12.1X46-D55.3-domestic' ...
verify-sig: cannot validate ./certs.pem
unhandled critical extension: /C=US/ST=CA/L=Sunnyvale/O=Juniper Networks/OU=Juniper CA/CN=PackageProductionRSA_2016/emailAddress=ca@juniper.net
Installation failed for package '/altroot/cf/packages/install-tmp/junos-12.1X46-D55.3-domestic'
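When the upgrade is driven remotely, it helps to pick the failure lines out of the long installer output. The sketch below is a rough illustration only: the marker strings are taken from the two transcripts above, and the list is surely not complete. The function name `find_install_failures` is hypothetical.

```python
from __future__ import annotations

# Failure markers seen in the transcripts above; other releases may print others.
FAILURE_MARKERS = (
    "Validation failed",
    "Installation failed for package",
    "unhandled critical extension",
    "Authentication error",
)


def find_install_failures(output: str) -> list[str]:
    """Return the installer output lines that contain a known failure marker."""
    return [
        line.strip()
        for line in output.splitlines()
        if any(marker in line for marker in FAILURE_MARKERS)
    ]
```

An empty result does not prove that the installation succeeded; it only means none of the listed markers appeared.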
One reason the installation fails is that the device date is set earlier than the date on which the jloader was built, so the certificate for the file is not yet valid. Correct the date first:
root> set date 201611281600.00
node0:
--------------------------------------------------------------------------
Mon Nov 28 16:00:00 UTC 2016
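The argument to set date above follows the YYYYMMDDhhmm.ss layout. As a small convenience, this Python sketch builds the command from a datetime value; the helper name `junos_set_date_command` is hypothetical, and the timestamp is passed through as given, so the caller decides which time zone it represents.

```python
from datetime import datetime


def junos_set_date_command(when: datetime) -> str:
    """Format a datetime as a Junos operational-mode 'set date' command."""
    # Layout: YYYYMMDDhhmm.ss, matching the command used above.
    return when.strftime("set date %Y%m%d%H%M.%S")


if __name__ == "__main__":
    # Prints: set date 201611281600.00
    print(junos_set_date_command(datetime(2016, 11, 28, 16, 0, 0)))
```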
Another reason is that you may have to upgrade to an intermediate version before you can upgrade to one of the latest releases. For example, upgrade from JunOS 10 to 12.1X44 first, and then upgrade to 12.1X46.
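The multi-hop upgrade can be written down as an ordered list of (from, to) pairs before you start. The sketch below only arranges the versions you give it; it does not know which releases need an intermediate hop, so that list has to come from the Juniper release notes. The function name `plan_upgrade_hops` is hypothetical.

```python
from __future__ import annotations


def plan_upgrade_hops(
    current: str, target: str, intermediates: tuple[str, ...] | list[str] = ()
) -> list[tuple[str, str]]:
    """Return the upgrade as ordered (from_version, to_version) hops."""
    versions = [current, *intermediates, target]
    # Pair each version with the one that follows it.
    return list(zip(versions, versions[1:]))


if __name__ == "__main__":
    for src, dst in plan_upgrade_hops("10.0R1.8", "12.1X46-D55.3", ["12.1X44"]):
        print(f"upgrade {src} -> {dst}")
```

Each hop then corresponds to one request system software add run with the matching package, followed by a reboot.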
Step 2: Follow the Instructions in the Juniper KB
Note: The KB does not include the IDP signature database step for systems that have the IDP feature enabled. You will have to deactivate security idp before going to step 6.
[KB21134]
Step 3: Troubleshooting Issues
3.1 Nodes of a cluster go into Primary/Lost or Primary/Primary state
The control link and the fabric link send packets but do not receive anything.
Changing the fabric ports on the SRX made no difference, and neither did swapping the cable.
According to KB23929, the cause is as follows:
"With codes prior to 10.4, by default, the control port tagging was enabled and it used the 4094 VLAN. For 10.4 and later codes, by default, it is disabled.
So, the upgrade/downgrade makes one node of the control port as tagged and the other node as untagged; so this causes control packets to drop, which in turn causes the Split Brain condition."
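The rule quoted from the KB can be turned into a quick check of whether two nodes are likely to disagree on control port tagging. This Python sketch encodes only the defaults described in the quote (enabled before 10.4, disabled from 10.4 on). It cannot see whether control-link-vlan was changed explicitly on a node, and the function names are hypothetical.

```python
from __future__ import annotations

import re


def junos_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '10.0R1.8' or '12.1X46-D55.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised Junos version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def control_port_tagging_default(version: str) -> bool:
    """True if control port tagging is on by default, per the KB quote above."""
    return junos_major_minor(version) < (10, 4)


def tagging_defaults_differ(version_a: str, version_b: str) -> bool:
    """True if the two releases have different control port tagging defaults."""
    return control_port_tagging_default(version_a) != control_port_tagging_default(version_b)


if __name__ == "__main__":
    print(tagging_defaults_differ("10.0R1.8", "12.1X46-D55.3"))
```

A result of True means the pair falls into the situation the KB describes, where one side tags control traffic and the other does not.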
SOLUTION:
root> set chassis cluster control-link-vlan enable/disable
warning: A reboot is required for control-link-vlan to be disabled
{primary:node1}
test@fw1-2> request system reboot
Reboot the system ? [yes,no] (no) yes
{primary:node1}
test@fw1-2> show chassis cluster information detail
node0:
--------------------------------------------------------------------------
Redundancy mode:
Configured mode: active-active
Operational mode: active-active
Cluster configuration:
Heartbeat interval: 1000 ms
Heartbeat threshold: 3
Control link recovery: Enabled
Fabric link down timeout: 66 sec
Node health information:
Local node health: Healthy
Remote node health: Healthy
Redundancy group: 0, Threshold: 255, Monitoring failures: none
Events:
Dec 7 13:57:43.435 : hold->secondary, reason: Hold timer expired
Dec 7 15:48:17.158 : secondary->primary, reason: Control & Fabric links down
Dec 7 15:48:34.749 : primary->secondary-hold, reason: Preempt/yield(10/100)
Dec 7 15:53:34.754 : secondary-hold->secondary, reason: Ready to become secondary
Dec 7 17:53:56.761 : secondary->primary, reason: Control & Fabric links down
Dec 7 17:53:59.428 : primary->secondary-hold, reason: Preempt/yield(10/100)
Dec 7 17:58:59.433 : secondary-hold->secondary, reason: Ready to become secondary
Redundancy group: 1, Threshold: 255, Monitoring failures: none
Events:
Dec 7 13:57:43.512 : hold->secondary, reason: Hold timer expired
Dec 7 15:48:17.134 : secondary->ineligible, reason: Fabric link down
Dec 7 15:48:17.863 : ineligible->primary, reason: Control & Fabric links down
Dec 7 15:48:34.753 : primary->secondary-hold, reason: Monitor failed: IF
Dec 7 15:48:35.762 : secondary-hold->secondary, reason: Ready to become secondary
Dec 7 15:51:00.571 : secondary->ineligible, reason: Fabric link down
Dec 7 17:53:41.929 : ineligible->secondary, reason: fabric link UP
Dec 7 17:53:56.830 : secondary->primary, reason: Control & Fabric links down
Dec 7 17:53:59.431 : primary->secondary-hold, reason: Monitor failed: CS
Dec 7 17:54:00.434 : secondary-hold->secondary, reason: Ready to become secondary
Control link statistics:
Control link 0:
Heartbeat packets sent: 19997
Heartbeat packets received: 19949
Heartbeat packet errors: 0
Duplicate heartbeat packets received: 0
Control recovery packet count: 0
Sequence number of last heartbeat packet sent: 20024
Sequence number of last heartbeat packet received: 20501
Fabric link statistics:
Child link 0
Probes sent: 11579
Probes received: 11575
Child link 1
Probes sent: 0
Probes received: 0
Switch fabric link statistics:
Probe state : DOWN
Probes sent: 0
Probes received: 0
Probe recv errors: 0
Probe send errors: 0
Probe recv dropped: 0
Sequence number of last probe sent: 0
Sequence number of last probe received: 0
Chassis cluster LED information:
Current LED color: Green
Last LED change reason: No failures
Control port tagging:
Disabled
............omitted......
node1:
--------------------------------------------------------------------------
Redundancy mode:
Configured mode: active-active
Operational mode: active-active
Cluster configuration:
Heartbeat interval: 1000 ms
Heartbeat threshold: 3
Control link recovery: Enabled
Fabric link down timeout: 66 sec
Node health information:
Local node health: Healthy
Remote node health: Healthy
Redundancy group: 0, Threshold: 255, Monitoring failures: none
Events:
Dec 7 13:49:59.220 : hold->secondary, reason: Hold timer expired
Dec 7 13:53:47.517 : secondary->primary, reason: Remote node reboot
Redundancy group: 1, Threshold: 255, Monitoring failures: none
Events:
Dec 7 13:49:59.267 : hold->secondary, reason: Hold timer expired
Dec 7 13:51:05.382 : secondary->primary, reason: Remote yield (100/0)
Control link statistics:
Control link 0:
Heartbeat packets sent: 20475
Heartbeat packets received: 20172
Heartbeat packet errors: 0
Duplicate heartbeat packets received: 0
Control recovery packet count: 0
Sequence number of last heartbeat packet sent: 20502
Sequence number of last heartbeat packet received: 20025
Fabric link statistics:
Child link 0
Probes sent: 11740
Probes received: 11585
Child link 1
Probes sent: 0
Probes received: 0
Switch fabric link statistics:
Probe state : DOWN
Probes sent: 0
Probes received: 0
Probe recv errors: 0
Probe send errors: 0
Probe recv dropped: 0
Sequence number of last probe sent: 0
Sequence number of last probe received: 0
Chassis cluster LED information:
Current LED color: Green
Last LED change reason: No failures
Control port tagging:
Disabled
............omitted......
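To confirm that both nodes ended up with the same setting, the Control port tagging value can be pulled out of the output above for each node. The following Python sketch assumes the layout shown in this post, with a `nodeN:` header line and the value on the line after `Control port tagging:`. Other releases may format the output differently, and the function name is hypothetical.

```python
from __future__ import annotations

import re


def control_port_tagging_by_node(output: str) -> dict[str, str]:
    """Map each node name to its 'Control port tagging' value."""
    states: dict[str, str] = {}
    node = None
    lines = iter(output.splitlines())
    for line in lines:
        stripped = line.strip()
        header = re.fullmatch(r"(node\d+):", stripped)
        if header:
            node = header.group(1)
        elif stripped == "Control port tagging:" and node is not None:
            # The value is printed on the following line.
            states[node] = next(lines, "").strip()
    return states


def tagging_consistent(output: str) -> bool:
    """True if at least two nodes report a value and all values match."""
    values = list(control_port_tagging_by_node(output).values())
    return len(values) >= 2 and len(set(values)) == 1
```

For the output in this post, both nodes report Disabled, which matches the cluster having recovered.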