Solaris Command to Monitor Network for Docbroker and Repository Services
The network often gets blamed when things are performing poorly, and perhaps this is correct - your network interfaces may be running at 100% utilisation.
What command will tell you how busy the network interface is? Many sysadmins suggest using netstat -i to find out,
    $ netstat -i 1
        input   hme0      output           input  (Total)    output
    packets errs  packets errs  colls  packets errs  packets errs  colls
    70820498 6     73415337 0     0      113173825 6     115768664 0     0   
    
    
This shows packet counts per second; the first line is the summary since boot. How many packets would mean the interface is busy? 100/sec, 1000/sec?
What we do know is the speed of the network interface; for this one it is 100 Mb/sec. However, we have no way of telling the size of the packets - they may be 56 byte packets or 1500 byte packets. This means the packet count cannot tell us how utilised the interface is; at best it is a rough yardstick of activity. What we really need is Kb/sec...
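To put numbers on that, here is a small Python sketch (not part of any Solaris tool) that works out the range of utilisation a given packet rate could mean on a 100 Mb/sec link, using the 56 byte and 1500 byte packet sizes mentioned above. The function name is made up for illustration, and framing overhead is ignored.

```python
# Bounds on link utilisation implied by a packet rate alone.
# Assumption: payload sizes only, no Ethernet framing overhead.
LINK_SPEED_BITS = 100_000_000  # 100 Mb/sec interface, as in the text


def utilisation_pct(packets_per_sec, bytes_per_packet,
                    link_bits_per_sec=LINK_SPEED_BITS):
    """Percent of link capacity used at a given packet rate and size."""
    bits_per_sec = packets_per_sec * bytes_per_packet * 8
    return 100.0 * bits_per_sec / link_bits_per_sec


if __name__ == "__main__":
    for rate in (100, 1000):
        low = utilisation_pct(rate, 56)     # all small packets
        high = utilisation_pct(rate, 1500)  # all full-size packets
        print(f"{rate:>5} packets/sec: {low:.3f}% to {high:.3f}% utilised")
```

The same packet rate spans more than an order of magnitude of utilisation, which is why packets/sec on its own is not enough.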

By System
How to monitor network usage for the entire system, usually by network interface.

    netstat
    The Solaris netstat command is where a number of different network status programs have been dropped; it's the kitchen sink of network tools.
    netstat -i, as mentioned earlier, only prints packet counts. We don't know if they are big packets or small packets, and we can't use them to accurately determine how utilised the network interface is. There are other performance monitoring tools that plot this as a "be all and end all" value - this is wrong.
    netstat -s dumps various network related counters from Kstat, the Kernel Statistics framework. This shows that Kstat does track at least some details in terms of bytes,
      $ netstat -s | grep Bytes
              tcpOutDataSegs      =37367847   tcpOutDataBytes     =166744792
          
      
    However the byte values above are for TCP in total, including loopback traffic that never travelled via the network interfaces.
    netstat -k on Solaris 9 and earlier dumped all Kstat counters,
      $ netstat -k | awk '/^hme0/,/^$/'
      
      
    Great - so bytes by network interface are indeed tracked. However netstat -k was an undocumented switch that has now been dropped in Solaris 10. That's ok, as there are better ways to get to Kstat, including the C library that tools such as vmstat use - libkstat.


    kstat
    The Solaris Kernel Statistics framework does track network usage, and as of Solaris 8 there has been a /usr/bin/kstat command to fetch Kstat details,
      $ kstat -p 'hme:0:hme0:*bytes64' 1
      
      
    Now we just need a tool to present this in a more meaningful way.
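    As a rough idea of what such a tool has to do, the following Python sketch takes two samples in the kstat -p format (module:instance:name:statistic, then whitespace, then the value) and turns the counter deltas into KB/sec. The sample values are made up for illustration, and the function names are hypothetical; it is a sketch of the arithmetic, not a replacement for the tools below.

```python
def parse_kstat(text):
    """Map 'module:instance:name:statistic' to an integer value."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines between samples
        key, value = parts
        counters[key] = int(value)
    return counters


def kb_per_sec(before, after, interval_secs):
    """Per-counter rate in KB/sec between two samples."""
    return {
        key: (after[key] - before[key]) / 1024.0 / interval_secs
        for key in after
        if key in before
    }


# Made-up sample values, for illustration only.
SAMPLE_1 = "hme:0:hme0:obytes64\t1000000\nhme:0:hme0:rbytes64\t2000000\n"
SAMPLE_2 = "hme:0:hme0:obytes64\t1102400\nhme:0:hme0:rbytes64\t2051200\n"

if __name__ == "__main__":
    rates = kb_per_sec(parse_kstat(SAMPLE_1), parse_kstat(SAMPLE_2), 1)
    for key, rate in sorted(rates.items()):
        print(f"{key}: {rate:.1f} KB/sec")
```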


    nx.se
    The SE Toolkit provides a language, SymbEL, that lets us write our own performance monitoring tools. It also contained a collection of example tools, including nx.se which lets us identify network utilisation,
      $ se nx.se 1
      Current tcp RtoMin is 400, interval 1, start Sun Oct  9 10:36:42 2005
       
      10:36:43 Iseg/s Oseg/s InKB/s OuKB/s Rst/s  Atf/s  Ret%  Icn/s  Ocn/s
        
      
      
    Having KB/s values lets us determine exactly how busy our network interfaces are. There is other useful information printed above, including Coll% - collisions, NoCP/s - no can puts, and Defr/s - defers, which may be evidence of network saturation.


    nicstat
    nicstat is a freeware tool written in C to print out network utilisation and saturation by interface,
      $ nicstat 1
         
      
    Fantastic. There is also an older Perl version of nicstat available.
    The following are the switches available from version 0.90 of the C version,
      $ nicstat -h
      USAGE: nicstat [-hsz] [-i int[,int...]] | [interval [count]]
       
      
    The utilisation measurement is the current throughput divided by the maximum speed of the interface (if available via Kstat). The saturation measurement is a value that reflects errors due to saturation (no can puts, etc).
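    The utilisation calculation can be sketched in a few lines of Python. This is not nicstat's source; it assumes read and write bytes are summed against the interface speed, which is only one possible convention (a full duplex link could reasonably be measured per direction instead). The function name is made up for illustration.

```python
def nic_utilisation_pct(read_bytes_per_sec, write_bytes_per_sec,
                        if_speed_bits_per_sec):
    """Throughput as a percentage of interface speed.

    Assumption: both directions are summed. Returns None when the
    interface speed is unknown (zero or missing).
    """
    if not if_speed_bits_per_sec:
        return None
    bits_per_sec = (read_bytes_per_sec + write_bytes_per_sec) * 8
    # Cap at 100 so counter jitter cannot report more than the link can carry.
    return min(100.0, 100.0 * bits_per_sec / if_speed_bits_per_sec)


if __name__ == "__main__":
    # 5 MB/sec in plus 1.25 MB/sec out on a 100 Mb/sec interface.
    print(nic_utilisation_pct(5_000_000, 1_250_000, 100_000_000))
```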


    SNMP
    It's worth mentioning that there is also useful data available in SNMP, which is used by software such as MRTG. Here we use Net-SNMP's snmpget to fetch some interface values,
      $ snmpget -v1 -c public localhost ifOutOctets.2 ifInOctets.2   
      
      
    These values are the outbound and inbound bytes for our main interface. In Solaris 10 a full description of the IF-MIB values can be found at /etc/sma/snmp/mibs/IF-MIB.txt.
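    These are cumulative counters, so a rate needs two polls. ifInOctets and ifOutOctets are 32-bit counters in IF-MIB, which means they wrap around on a busy interface. A minimal Python sketch of the delta calculation follows; it assumes the counter wrapped at most once between polls, and the function name is hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # ifInOctets/ifOutOctets are Counter32


def octets_per_sec(previous, current, interval_secs):
    """Byte rate between two polls, allowing for one counter wrap."""
    delta = (current - previous) % COUNTER32_MODULUS
    return delta / interval_secs


if __name__ == "__main__":
    # Made-up poll values either side of a wrap, 10 seconds apart.
    print(octets_per_sec(4294967000, 704, 10))
```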



Across Network
Analysing the performance of the external network.

    ping
    ping is the classic network probe tool,
      $ ping -s mars
        
      
    So I discover that mars is up, and it responds within 1 ms. Solaris 10 enhanced ping to print 3 decimal places for the times.
    ping is handy to see if a host is up, but that's about all. Some people use it to test whether their application server is ok. Hmm. ICMP is handled in the kernel without needing to call a user-level process, so it's possible that a server will ping ok while the application either responds slowly or not at all.
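    If the question is really "is the application answering?", the probe has to make the application do some work. A minimal Python sketch, assuming a TCP service that replies to a request: connect, send the request, and time how long the first byte of the reply takes. The function name and the example request are hypothetical.

```python
import socket
import time


def app_response_ms(host, port, request, timeout_secs=3.0):
    """Milliseconds until the first reply byte arrives, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port),
                                      timeout=timeout_secs) as conn:
            conn.sendall(request)
            if not conn.recv(1):
                return None  # peer closed without answering
    except OSError:
        return None  # refused, unreachable, or timed out
    return (time.monotonic() - start) * 1000.0
```

    For a web server on mars this might be called as app_response_ms("mars", 80, b"HEAD / HTTP/1.0\r\n\r\n"); the port and request here are assumptions, not something from the ping example.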


    traceroute
    traceroute sends a series of UDP packets with an increasing TTL, and by watching the ICMP time exceeded replies can discover the hops to a host (assuming the hops actually decrement the TTL),
      $ traceroute www.sun.com
      
      
    The times may give me some idea where a network bottleneck is. We must also remember that networks are dynamic, and this may not be the permanent path to that host.


    TTCP
    Test TCP is a freeware tool to test the throughput between two hosts. It needs to be run on both the source and destination, and there is a Java version of TTCP which will run on many different operating systems. Beware, it will flood the network with traffic to perform its test.
    The following is run on one host as a receiver. The options used make the test run for a reasonable duration - around 60 seconds,
      $ java ttcp -r -n 65536
      Receive: buflen= 8192  nbuf= 65536 port= 5001
      
    Then the following was run on the second host as the transmitter,
      $ java ttcp -t jupiter -n 65536
      
      
    This shows the speed between these hosts for this test is around 11.6 Megabytes per second.
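    The arithmetic behind a figure like this is simple enough to sketch in Python. The buffer length and count come from the receiver's banner above; treating a Megabyte as 1,000,000 bytes is an assumption, as TTCP implementations differ, and the function names are made up for illustration.

```python
BUFLEN = 8192   # bytes per buffer, from the receiver's banner
NBUF = 65536    # number of buffers, from -n 65536


def total_bytes(buflen=BUFLEN, nbuf=NBUF):
    """Bytes moved by one TTCP run."""
    return buflen * nbuf


def throughput_mb_per_sec(num_bytes, elapsed_secs):
    """Throughput in MB/sec, assuming 1 MB = 1,000,000 bytes."""
    return num_bytes / elapsed_secs / 1_000_000


if __name__ == "__main__":
    print(f"bytes transferred per run: {total_bytes()}")
```

    Dividing the bytes transferred by the elapsed time that the transmitter reports gives the throughput figure.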


    pathchar
    After writing traceroute, Van Jacobson then went on to write pathchar - an amazing tool that identifies network bottlenecks. It operates like traceroute, but rather than printing response time to each hop it prints bandwidth between each pair of hops.
      # pathchar 192.168.1.10
        
      
    This tool works by sending "shaped" traffic over a long interval and carefully measuring the response times. It doesn't flood the network like TTCP does.


    ntop
    ntop is a tool that sniffs network traffic and provides comprehensive reports via a web interface. It is also available on sunfreeware.com.
      # ntop
      
      
      
    Now you connect via a web browser to localhost:3000.



By Process
How to monitor network usage by process. Recently the addition of DTrace to Solaris 10 has allowed the creation of the first network by process tools.

    tcptop
    This is a DTrace based tool from the freeware DTraceToolkit which gives a summary of TCP traffic by system and by process,
      # tcptop 10
      Sampling... Please wait.
      2005 Jul  5 04:55:25,  load: 1.11,  TCPin:      2 Kb,  TCPout:    110 Kb
       
       UID    PID LADDR           LPORT FADDR           FPORT      SIZE NAME
       100  20876 192.168.1.5     36396 192.168.1.1        79      1160 finger
           
      
    This version of tcptop will examine newly connected sessions (while tcptop has been running). In the above we can see PID and SIZE columns; these track TCP traffic that has travelled on external interfaces. The TCPin and TCPout summaries also track localhost TCP traffic.
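    Since each row is one connection, a process with several connections appears several times. If a per-process total is wanted, a small Python sketch like the following could sum the SIZE column by PID from captured tcptop output. It assumes the eight-column layout shown above and is not part of the DTraceToolkit; the function name is hypothetical.

```python
from collections import defaultdict


def bytes_by_process(report):
    """Sum the SIZE column per (PID, NAME) from tcptop-style rows."""
    totals = defaultdict(int)
    for line in report.splitlines():
        fields = line.split()
        # Data rows have 8 fields and start with a numeric UID;
        # the summary and header lines do not.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        pid, size, name = int(fields[1]), int(fields[6]), fields[7]
        totals[(pid, name)] += size
    return dict(totals)


if __name__ == "__main__":
    sample = (
        " UID    PID LADDR           LPORT FADDR           FPORT      SIZE NAME\n"
        " 100  20876 192.168.1.5     36396 192.168.1.1        79      1160 finger\n"
    )
    print(bytes_by_process(sample))
```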


    tcpsnoop
    This is a DTrace based tool from the DTraceToolkit which prints TCP packets live by process,
      # tcpsnoop
          
      
    This version of tcpsnoop will examine newly connected sessions (while tcpsnoop has been running). In the above we can see a PID column and packet details; these track TCP traffic that has travelled on external interfaces.


