Troubleshooting Data System
 

Some hasty notes on checking the status of the turbulence tower data system at MFO.

Since this contains network address information which we'd like to keep away from net hackers, don't publish this page to the world.

To check sensor status, refer to the plots of the Manitou turbulence tower.

A crontab entry for user maclean on porter.eol.ucar.edu runs $ISFF/projects/BEACHON_SRM/ISFF/scripts/wget_plot_data.sh at 4:00, 6:00, 8:00, 10:00 and 12:00 local time every day.
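A quick way to confirm the entry is still installed is to filter the crontab listing for the script name. This is only a sketch: the exact form of the entry is not recorded here, so the line shown in the comment (minute field of 0, output appended to /tmp/beachon.log) is an assumption, and find_plot_job is a made-up helper name.

```shell
# Assumed form of the entry (minute hour day month weekday command);
# the real one on porter may differ:
#   0 4,6,8,10,12 * * * $ISFF/projects/BEACHON_SRM/ISFF/scripts/wget_plot_data.sh >> /tmp/beachon.log 2>&1

# Hypothetical helper: read crontab text on stdin and print the
# non-comment lines that run the plotting script.
find_plot_job() {
    grep -v '^[[:space:]]*#' | grep 'wget_plot_data\.sh'
}
```

On porter, as user maclean, run it as: crontab -l | find_plot_job. No output means the entry is missing or commented out.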

This script copies the high-rate data files from a server at RAL, then computes 5-minute statistics in NetCDF form and creates the plots with Splus. The output of the wget script is logged to porter:/tmp/beachon.log. Check that log for errors. If things are working right, it will likely just have a line like "logging in again...". The output of the Splus plotting functions is in /net/isff/projects/BEACHON_SRM/ISFF/slogs.
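Since a healthy log is expected to hold little more than the "logging in again..." line, one way to check it is to print everything else. The wording of real error messages from the script is not known here, so this sketch does not look for particular errors; unexpected_log_lines is a hypothetical helper name.

```shell
# Hypothetical helper: print the lines of a log file that are neither
# blank nor the routine "logging in again..." message.
# Exit status follows grep: 0 if anything was printed, 1 if the log is clean.
unexpected_log_lines() {
    grep -v -e 'logging in again' -e '^[[:space:]]*$' "$1"
}
```

On porter: unexpected_log_lines /tmp/beachon.log. Anything it prints is worth reading by eye.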

To check the most recent files that have been copied to EOL systems, do:

ls -l /scr/isfs/projects/BEACHON_SRM/raw_data
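To see only what has arrived lately rather than the whole listing, something like the following may help. It is a sketch that goes by file modification time, which assumes the copy does not preserve the original timestamps; if it does, the times reflect when the data was written rather than when it was copied. The three-day default matches the threshold used further down in these notes, and recent_raw_files is a made-up name.

```shell
# Hypothetical helper: long-list files under directory $1 that were
# modified within the last N days (second argument, default 3).
# Prints nothing if no file is that recent.
recent_raw_files() {
    find "$1" -type f -mtime "-${2:-3}" -exec ls -lt {} +
}
```

For example: recent_raw_files /scr/isfs/projects/BEACHON_SRM/raw_data. An empty result suggests the copies are more than three days behind.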

These files are first copied from our data system to the RAL server in the Manitou seatainer at 05:00 local time. Then the files are copied from the seatainer to a RAL server at Foothills Lab. Due to this two-step process the copies may get a bit behind. The data up to May 2 00:00 UTC (May 1 18:00 MDT), for example, will be copied to the seatainer starting at 05:00 on May 2nd. The second copy to Foothills may not see the new files, so they may not be copied to the RAL server at FLAB until the morning of May 3rd. Then the wget is run on porter.eol.ucar.edu, which copies the files to /scr/isfs and makes the plots. These plots will then cover up to 18:00 MDT of May 1. As a result we may not see the full data plots covering up to midnight MDT of May 1 until mid-morning of May 4th.
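The UTC to MDT conversion in the example above can be checked from a shell. This sketch assumes GNU date (for the -d option). It spells out the US Mountain time rules as a POSIX TZ string so it does not depend on the zoneinfo database being installed. The year 2010 in the usage line is an assumption, since the example gives only month and day, and utc_to_mountain is a made-up name.

```shell
# Hypothetical helper: convert a UTC timestamp ("YYYY-MM-DD HH:MM") to
# US Mountain time. Assumes GNU date.
# TZ string: MST is UTC-7; MDT runs from the second Sunday of March
# to the first Sunday of November.
utc_to_mountain() {
    TZ='MST7MDT,M3.2.0,M11.1.0' date -d "$1 UTC" '+%Y-%m-%d %H:%M %Z'
}
```

With GNU date, utc_to_mountain '2010-05-02 00:00' should print 2010-05-01 18:00 MDT, which is the file boundary used in the example.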

To list the files available on the RAL servers when you're within the UCAR firewall, point a browser to ftp://beachon.rap.ucar.edu/pub/data/turbulence_tower/raw_data

The same list of files is also available on the RAL BEACHON web portal, from either inside or outside the firewall, but the portal requires a password. The portal is also reachable from the BEACHON Web Portal link on the MFO wiki page. Click on Turbulence Tower, then Login. The login name is fluxtower, and the password is l0tzoFD@. Click File Download, then raw_data. At the bottom of the listing you should see the most recent data files.

If data files are more than three days behind, check whether the data system is up. First see if you can ping RAL's manitou server in the seatainer:

ping 72.19.158.45

If you can't ping the RAL server, contact Andy Gaydos, x2721, gaydos@ucar.edu.

If you can ping the seatainer server, then try to ssh past it to the turbulence tower data system:

ssh -p 2222 root@72.19.158.45

The data system only allows logins from systems on the UCAR 128.117 networks. If the link is up and you try to ssh in from another network, it will prompt for a user and password, but will always report that the password failed. If you do get a user and password prompt, then the link is up and the data system is alive.

If ssh reports "WARNING HOST IDENTIFICATION HAS CHANGED" then the link is up and the data system is alive (or we've been hacked and you're actually not talking to the data system!). Most likely you need to update your $HOME/.ssh/known_hosts file as described in the warning message, by deleting the line with the old host key, and then try again.

If it reports "No route to host", ask Andy if he can ping our system from his server.  On the Manitou network, our data system is 192.168.100.202. If our system can't be pinged then either the data system or the network to the turbulence tower is down.
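The ssh outcomes described above can be summarized as a small lookup. This is a sketch, not a tested monitoring script: classify_ssh_output is a made-up name, it only looks for the message fragments quoted in these notes plus the "Permission denied" text OpenSSH prints on a failed password, and anything else is left for a person to read. The ssh-keygen -R form in the hint is the standard way to drop a stale key for a host on a non-default port.

```shell
# Hypothetical helper: pass the text that ssh printed as one argument
# and get back the interpretation given in the notes above.
classify_ssh_output() {
    case "$1" in
        *"HOST IDENTIFICATION HAS CHANGED"*)
            echo "link up, stale host key: ssh-keygen -R '[72.19.158.45]:2222' and retry" ;;
        *"No route to host"*)
            echo "no route: ask whether 192.168.100.202 answers ping from the seatainer" ;;
        *[Pp]assword*|*"Permission denied"*)
            echo "link up: data system is answering" ;;
        *)
            echo "unrecognized: read the ssh output by hand" ;;
    esac
}
```

For example: classify_ssh_output "ssh: connect to host 72.19.158.45 port 2222: No route to host".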

Troubleshooting at the Manitou Site.

When someone goes to the site, have them check the following:

  • In the first seatainer where the fiber to the turbulence tower is terminated, check the LEDs on the black "Transition Networks" fiber/copper media converter on the shelf above the computers, below the box labeled "Fiber to Turbulence Tower". The power, SDC (signal detect copper) and SDF (signal detect fiber) LEDs should be on. If not, give it a power cycle by disconnecting and reconnecting the power cable.
  • Go to the tower, and check that there is AC power at the weatherproof power distribution enclosure. Open up the power enclosure. The power LEDs of the fiber media converter and the 5-port network switch should be on.
  • Check the SDC and SDF LEDs on the media converter.  If they are off, cycle power to the media converter. If the SDC LED is off, also power cycle the 5-port network switch in that same enclosure by disconnecting/reconnecting its power cable.
  • Check the link LED on the 5-port network switch for the ethernet connection to the data system. As of May 2010, the data system was connected to port 4 on the switch. That LED should be on. The ethernet cable from the data system has yellow heat-shrink near the connector. If the LED for the tower data system connection is not on, then the data system may be dead, the ethernet cable connection may be loose, or the cable may be damaged.
  • Check the ethernet cable from the power enclosure to the tower data system, that the connections at each end are secure and the cable has not been damaged.
  • If the cable looks OK, but the network switch LED for the data system is not on, then power cycle the data system: switch the rocker switch on the white data system box to off, wait a few seconds, and then switch it back on. If the data system boots successfully you should see the link LED come on after a minute or so.
  • Check the status of the data system in the white enclosure at the base of the tower, via one or more of the following:
    • With a voltmeter or serial port test box connected to an available serial port, check that the DC voltage at the enclosure is between 11.5 and 14 volts.
    • Connect a serial cable from the console port of the data system to a portable device with a terminal emulator, like hyperterm or minicom. Log into the root account on the system, and do "lsu" to see if the data file is growing. Also try pinging the server in the Manitou seatainer:

      ping 192.168.100.1

    • Open the white data system box. The LED light on the top of the small black pocketec disk drive should flash every few seconds. The pocketec is about the size of a deck of cards. If it does not flash then the data system program is not writing to the disk. The switch on the top of the disk should be on "USB". Try switching to OFF and back to USB. It should flicker once and then start flashing every few seconds within 30 seconds or less.
    • Call someone at FLAB to see if they can log into and check the data system:

      ssh -p 2222 root@72.19.158.45
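
For anyone logging voltmeter readings from the site, the 11.5 to 14 volt check in the list above can be written down as a one-liner. This is only an illustration of the stated range; check_voltage is a made-up name, and a non-numeric reading is treated as out of range.

```shell
# Hypothetical helper: report whether a DC voltage reading falls in
# the 11.5 to 14 volt range given in the notes above.
check_voltage() {
    awk -v v="$1" 'BEGIN { if (v + 0 >= 11.5 && v + 0 <= 14) { print "ok" } else { print "out of range" } }'
}
```

For example, check_voltage 12.6 prints ok, and check_voltage 11.0 prints out of range.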