Sunday, March 23, 2014

When upgrades go bad: Installing JunOS from USB on an SRX router

So, I screwed up pretty badly. I decided to upgrade the JunOS release on this Juniper SRX210 router to the one Juniper recommends (at the time I type this), 11.4R10.3. When it booted up after the install, it crashed during the boot process. Well, I could have spent the time kicking myself, but I was doing this upgrade off-hours and I had accounted for things going badly in my downtime estimate. Also, this router is part of a redundant router setup using the Virtual Router Redundancy Protocol (VRRP); its being down will not affect production. In other words, this is more of an annoyance than a real issue. Since I have to deal with it anyway, how about we learn how to restore the OS on this Juniper router?

I tried a few ways and thought the easiest one was to use a USB drive. Of course, that will not work well if you are not physically close to said router (other things will also not work well in those circumstances, but that is another topic), but since I am, I am doing the USB upgrade.

Procedure

  1. Get a USB drive. I know, this is a pretty obvious step, but it is step 1. Ideally use a 1GB/2GB USB drive, formatted as FAT16/FAT32. Honestly, I do not know how critical that is, but my experience with Cisco gear, which seems not to like the higher-capacity ones, made me leery. On the plus side, you should be able to find those rather easily as people replace their old drives with newer, larger ones. If not, there are always the usual sources such as eBay or Amazon.
  2. Download the OS image you are going to use, say junos-srxsme-11.4R10.3-domestic.tgz, and copy it to the top level of the USB drive (the install command in step 8 refers to it as file:///junos-srxsme-11.4R10.3-domestic.tgz). If you are smarter than me, you would have gone to the Juniper downloads site, got all the OS images you need, and placed them on your file server. I wasn't, so I had to go to the SRX210 download page and fetch it.
  3. Grab your trusty serial cable and connect it to the router's console port. The default setup is the time-honored 9600 8N1. If you changed it, make sure you wrote that down somewhere. I am lazy and I kinda like that setting.
  4. Connect the USB drive to the router.
  5. Reboot the router after you attach the USB drive to it. It needs to know the drive exists as it boots up. Otherwise, it will bark like this when you try to install:
    loader> install file:///junos-srxsme-11.4R10.3-domestic.tgz
    cannot open package (error 22)
    loader>

  6. Now, if you boot with the USB drive already connected to the router, it will first say something like this:

    Running U-Boot CRC Test... OK.
    Flash:  4 MB
    USB:   scanning bus for devices... 4 USB Device(s) found
           scanning bus for storage devices... 2 Storage Device(s) found
    Clearing DRAM........ done
    BIST check passed.

    You may have noticed the "2 Storage Device(s) found" message. It is talking about the internal one (probably where the OS should be) and the external drive.

  7. Now, when you see

    POST Passed
    Press SPACE to abort autoboot in 1 seconds

    Please keep your fingers in your pockets. If you press space here, you will end up at the => prompt (U-Boot). If you wait, you will then see

    Protected 1 sectors
    Loading /boot/defaults/loader.conf
    /kernel data=0xb0f9c0+0x134788 DA(some hot action happening here)

    Have your space-bar finger on standby, for the next message will be

    Hit [Enter] to boot immediately, or space bar for command prompt.
  8. Now press the space bar and you will get the loader> prompt. Type the install command and it will start doing the install thingie:

    loader> install file:///junos-srxsme-11.4R10.3-domestic.tgz
    /kernel data=0xae82f0+0x12d2b8 syms=[0x4+0x88ce0+0x4+0xc6af6]
    Kernel entry at 0x801000d8 ...
    init regular console
    GDB: debug ports: uart
    GDB: current port: uart
    KDB: debugger backends: ddb gdb
    KDB: current backend: ddb
    Copyright (c) 1996-2013, Juniper Networks, Inc.
    All rights reserved.
    Copyright (c) 1992-2006 The FreeBSD Project.
    Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
            The Regents of the University of California. All rights reserved.
    JUNOS 11.4R10.3 #0: 2013-11-15 06:56:20 UTC
    [...]
  9. After a while (I got bored and went to make myself some tea), you will see it recreate the SSH key pairs and then finally be ready for business (apologies for the bad cut-n-pasting, but my terminal console was being cute):

    |
    |                 |
    |  .o  ..         |
    |.+o .o.o.
    |X . .. .. E      |
    |oo ..            |
    |  .+             |
    |.-+
    root@uranus% omplete
    Setting initial options: .
    Starting optface configuration:
    additional daemons: eventd.
    Additional rout;/boot/modules -> /bo;
    kld netpfe drv: ifpfed_dialer default_adtwork setup:.
    Starting final network daemons:.
    setting ldconfig.
    Initial rc.mips initialization:.
    Local package initializationup access
    .
    kern.securelevel: -1 -> 1
    Creating JAIL MFS partitirade.uboot="0xBFC00000"
    boot.upgrade.loader="0xBFE00000"
    Boot mILE SYSTEM CLEAN; SKIPPING CHECKS
    clean, 78249 free (17 frags, ar 20 16:46:25 CDT 2014
    
    uranus (ttyu0)

    Note that it remembered the hostname for the router. I still went through the configs before letting it join the router cluster. But that is pretty much it! Router is back in business.
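
Steps 5 and 6 both come down to one question: did the router see the USB drive while booting? If you log your console session to a file, a rough sketch like the one below can check for the two tell-tale lines quoted above. The function names are made up for illustration, and the assumption that the router has exactly one internal storage device comes only from the SRX210 output in this post; other models may report different counts.

```python
import re

# Patterns taken from the console output quoted in this post.
STORAGE_RE = re.compile(r"(\d+) Storage Device\(s\) found")
ERROR_RE = re.compile(r"cannot open package \(error (\d+)\)")


def storage_device_count(console_log):
    """Return the storage device count U-Boot reported, or None if absent."""
    match = STORAGE_RE.search(console_log)
    return int(match.group(1)) if match else None


def package_errors(console_log):
    """Return the error numbers from any 'cannot open package' lines."""
    return [int(num) for num in ERROR_RE.findall(console_log)]


def usb_probably_seen(console_log, internal_devices=1):
    """Guess whether an external drive was detected at boot.

    internal_devices=1 is an assumption based on the SRX210 in this post.
    """
    count = storage_device_count(console_log)
    return count is not None and count > internal_devices


if __name__ == "__main__":
    sample = (
        "USB:   scanning bus for devices... 4 USB Device(s) found\n"
        "       scanning bus for storage devices... 2 Storage Device(s) found\n"
    )
    print(storage_device_count(sample), usb_probably_seen(sample))
```

If usb_probably_seen() says no, or package_errors() comes back non-empty, the first thing to try is rebooting with the drive already plugged in, as in step 5.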

Closing Thoughts

  1. The universe is Murphian; things will go wrong. Try not to stress about that.
  2. When you schedule downtime for upgrades, account for things going badly in your time estimates.
  3. The hardest thing to do is figuring out what can go wrong. But you could ask yourself, "If this upgrade halts the server or just this service, what would be my backup plan?" and then see if you can answer that question.
  4. Next time I need to upgrade the OS on this or another router, I will have the firmware/OS on standby on a USB drive. I do not know about you, but I have found that when I am prepared, everything works out perfectly.
  5. If you can afford it, redundancy is a wonderful thing.
  6. Always save your configs somewhere, well, safe. Having to recreate them from scratch is a bit of a drag.
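
For thought 4, here is a small sketch of how I might stage an image on a standby USB drive ahead of time. It copies the file to the top level of the drive, re-reads the copy to confirm it matches, and prints the loader line from step 8. The helper names and the /mnt/usb mount point are hypothetical; adjust them to wherever your drive shows up. Note that this only compares the copy with the file you already have; it does not replace checking your download against whatever checksum the vendor publishes.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large image does not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_image(image, usb_root):
    """Copy image to the top level of usb_root and confirm the copy matches."""
    image = Path(image)
    target = Path(usb_root) / image.name
    shutil.copy2(image, target)
    if sha256_of(image) != sha256_of(target):
        raise OSError(f"copy of {image.name} does not match the original")
    return target


def loader_command(image_name):
    """Build the loader line from step 8 for an image at the drive's top level."""
    return f"install file:///{image_name}"


if __name__ == "__main__":
    # Hypothetical paths: the image name is the one used in this post,
    # /mnt/usb is wherever the USB drive happens to be mounted.
    name = "junos-srxsme-11.4R10.3-domestic.tgz"
    if Path(name).exists() and Path("/mnt/usb").is_dir():
        stage_image(name, "/mnt/usb")
    print(loader_command(name))
```

The same copy-and-verify habit works for thought 6: point stage_image() at a saved config file and a safe directory instead of an OS image and a USB drive.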

1 comment:

Ale Suarez said...

Hi, I am getting several errors, and when I insert the USB drive and then reboot I get the error n20

this is the initial error

mountroot>
panic: Root mount failed, startup aborted.
Uptime: 6m16s
Cannot dump. No dump device defined.
Automatic reboot in 15 seconds - press a key on the console to abort
--> Press a key on the console to reboot,
--> or switch off the system now.
Rebooting...


U-Boot 2010.03 (Oct 21 2012 - 03:06:55)

------
following the steps in your post

loader> install file:///jinstall-ex-4500-12.3R11.2-domestic.tgz
cannot open package (error 22)
loader> install file:///jinstall-ex-4500-12.3R11.2-domestic-signed.tgz
cannot open package (error 22)
loader>

-----

da1: Removable Direct Access SCSI-6 device
da1: 40.000MB/s transfers
da1: 15356MB (31449088 512 byte sectors: 255H 63S/T 1957C)
(da0:umass-sim0:0:0:0): READ(10). CDB: 28 0 0 2c 5e 75 0 0 1 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
(da0:umass-sim0:0:0:0): Invalidating pack
(da0:umass-sim0:0:0:0): READ(10). CDB: 28 0 0 2c 5e 76 0 0 1 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
(da0:umass-sim0:0:0:0): Invalidating pack
Trying to mount root from ufs:/dev/da0s2a
(da0:umass-sim0:0:0:0): READ(10). CDB: 28 0 0 b 8 ac 0 0 10 0
(da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0): SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): NOT READY asc:3a,0
(da0:umass-sim0:0:0:0): Medium not present
(da0:umass-sim0:0:0:0): Unretryable error
(da0:umass-sim0:0:0:0): Invalidating pack
g_vfs_done():da0s2a[READ(offset=65536, length=8192)]error = 6

Manual root filesystem specification:
: Mount using filesystem
eg. ufs:/dev/da0a
? List valid disk boot devices
Abort manual input

mountroot>


could you help me?
Ale