# README (this file) FlowViewer V3.0  Date: 7/04/2006
#
# Quick Install
#
#   1. Untar into cgi-bin subdirectory
#
#   For FlowViewer
#
#   2. Configure FlowViewer_Configuration.pm variables as necessary
#   3. Point browser to FlowViewer.cgi
#
#   For FlowGrapher
#
#   4. Install GD, GD::Graph
#   5. Configure FlowViewer_Configuration.pm variables as necessary
#   6. Point browser to FlowGrapher.cgi
#
#   For FlowTracker
#
#   7. Install RRDtool (at least version 1.2.12)
#   8. Create FlowTracker_Filter and FlowTracker_RRDtool directories
#   9. Start FlowTracker_Collector, FlowTracker_Grapher in background
#   10. Configure FlowViewer_Configuration.pm variables as necessary
#   11. Point browser to FlowTracker.cgi
#
#   For all FlowViewer tools
#  
#   12. Review all FlowViewer directories and files for proper permissions
#
# Dependencies
#
# - FlowGrapher requires the Perl GD and GD::Graph packages
#   GD package: http://search.cpan.org/~lds/GD-2.30/
#   GD::Graph package: http://search.cpan.org/~mverb/GDGraph-1.43/
# - FlowViewer.cgi requires the GDBM capability in Perl (or)
# - FlowViewer_NDBM.cgi requires the NDBM capability in Perl
#   (NDBM is in most Perl distributions)
# - FlowTracker requires RRDtool (at least version 1.2.12)
#   RRDtool: http://oss.oetiker.ch/rrdtool
#
# Contents
#
# FlowViewer_Configuration.pm
#
# This file contains parameters that configure and control the
# FlowViewer, FlowGrapher, and FlowTracker environments. This package
# should remain in the same directory as the CGI scripts.
#
# FlowViewer_Utilities.pm
#
# This file contains processing used by multiple programs (e.g., to
# create the Report Parameters output for each tool) and other utilities
# (e.g., 'epoch_to_date', which converts between typical date formats
# and 'seconds since 1970') that are invoked by other scripts. This
# package should be placed in the same directory as the CGI scripts.
#
# FlowViewer.cgi
#
# This script produces the web page that presents the form for entering
# analysis selection criteria for FlowViewer. Version 3.0 reorganizes the
# processing: FlowViewer.cgi now takes over the role of the old
# create_FlowViewer_webpage. This change permits the input date and time
# to be updated with each invocation.
#
# FlowViewer_Main.cgi
#
# This script responds when the user completes the selection criteria
# form and submits the 'Generate Report' command. The script creates a
# flow-tools filter file based on the selection criteria. Based on the
# input time period, the script concatenates the relevant flow-tools
# data files for the selected device. The location of the flow-tools
# raw data files is specified via the 'flow_data_directory' parameter.
# The script then invokes the selected statistics/print report flow-tools
# program and reformats the output into HTML. An option is available in
# FlowViewer_Configuration to have this script use the NDBM capability
# (for caching resolved host names) instead of the default GDBM
# capability for users whose Perl distribution does not have GDBM.
#
# FlowViewer.png
#
# The FlowViewer logo. Leave this file in the 'cgi_bin_directory'; the
# FlowViewer.cgi script will place a copy of the image in
# 'reports_directory'.
#
# FlowGrapher.cgi
#
# This script produces the web page that presents the form for entering
# analysis selection criteria for FlowGrapher. Version 3.0 reorganizes the
# processing: FlowGrapher.cgi now takes over the role of the old
# create_FlowGrapher. This change permits the input date and time
# to be updated with each invocation.
#
# FlowGrapher_Main.cgi
#
# This script responds when the user completes the FlowGrapher selection
# criteria form and submits the 'Generate Graph' command. The script
# creates intermediate processing files exactly like FlowViewer above.
# The script then parses intermediate output, fills time buckets, and
# generates a graphic image. Textual output accompanies the graph. An
# option is available in FlowViewer_Configuration to have this script use
# the NDBM capability (for caching resolved host names) instead of the
# default GDBM capability for users whose Perl distribution does not have
# GDBM.
#
# FlowGrapher.png
#
# The FlowGrapher logo. Leave this file in the 'cgi_bin_directory'; the
# FlowGrapher.cgi script will place a copy of the image in
# 'graphs_directory'.
#
# FlowGrapher_Colors
#
# This file contains a translation between textual color names and their
# RGB value counterparts.
#
# FlowTracker.cgi
#
# This script produces the web page which provides the user the form
# for entering analysis selection criteria for FlowTracker. The script
# also provides the user with the ability to review, revise, or remove
# existing trackings. FlowTracker is new for version 3.0. 
#
# FlowTracker_Main.cgi
#
# This script responds when the user completes the FlowTracker selection
# criteria form and submits the 'Establish Tracking' command. The script
# handles the user's request to create, remove, or revise a tracking.
#
# FlowTracker_Collector
#
# This script is started once by the user and placed in the background.
# It executes and then sleeps for the remainder of a five-minute period,
# so it effectively runs every five minutes. For each existing tracking,
# the script applies the associated filter to the flow data and extracts
# the amount of traffic that occurred during a 5-minute window
# approximately 30 minutes earlier. The delay gives long-running flows
# time to be exported and become available to the collector. The script
# then divides the total bits by 300 seconds to get an average
# bits-per-second rate for the period. The data point is then provided to
# RRDtool for storage.
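#
# The window and rate arithmetic described above can be sketched as
# follows. This is an illustration in Python, not the FlowTracker_Collector
# code itself. The function names are hypothetical, and the 1800-second
# offset is an assumed value standing in for $collection_offset.

```python
from datetime import datetime, timedelta

PERIOD_SECONDS = 300       # five-minute collection period (from the text)
COLLECTION_OFFSET = 1800   # assumed: begin about 30 minutes into the past


def collection_window(now, offset=COLLECTION_OFFSET, period=PERIOD_SECONDS):
    """Return (start, end) of the window that begins `offset` seconds ago."""
    start = now - timedelta(seconds=offset)
    end = start + timedelta(seconds=period)
    return start, end


def average_bps(total_bits, period=PERIOD_SECONDS):
    """Average bits-per-second rate over one collection period."""
    return total_bits / period


if __name__ == "__main__":
    start, end = collection_window(datetime.now())
    print("window:", start, "to", end)
    print("rate:", average_bps(3_000_000), "bps")
```

# For example, 3,000,000 bits seen in one window gives an average of
# 10,000 bits per second.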
#
# FlowTracker_Grapher
# 
# The script is started once by the user and placed in the 'background'.
# The script will execute and then sleep for the duration of a five minute
# period, essentially running every five minutes. The script runs the
# RRDtool graph function for each existing tracking. Daily, weekly,
# monthly, and yearly graphs are updated with the latest information. The
# script creates an HTML page for each tracking that includes the filter
# parameters and the four graphs. The script also creates an overall web
# page ($tracker_webpage) that provides links to all active tracking pages.
#
# FlowTracker.png
#
# The FlowTracker logo. Leave this file in the 'cgi_bin_directory'; the
# FlowTracker.cgi script will place a copy of the image in
# 'tracker_directory'.
#
# FlowTracker_Links.png
#
# The FlowTracker logo with links. Leave this file in the
# 'cgi_bin_directory'; the FlowTracker.cgi script will place a copy of the
# image in 'tracker_directory'. This image contains mapped links to
# FlowViewer and FlowGrapher such that those input pages are pre-loaded
# with the filter criteria from the Tracking.
#
# FlowViewer_Save.cgi
#
# This script moves temporary save files into a permanent residence
# as defined by either the 'reports_directory' or 'graphs_directory'
# configuration parameter.
#
# Configuration parameters
#
# The FlowViewer, FlowGrapher, and FlowTracker scripts all use parameters
# in the FlowViewer_Configuration.pm file to control the environment that
# they run in. Here is a brief explanation of some of the relevant 
# parameters:
#
# $ENV{PATH} - modify as appropriate for your installation
# $FlowViewer_server - IP address of server hosting this software
# $FlowViewer_service - Either HTTP (port 80) or HTTPS (port 443)
# $reports_directory - Directory to hold saved FlowViewer reports
# $reports_short - Reports directory beginning from web server default
# $graphs_directory - Directory to hold saved FlowGrapher reports
# $graphs_short - Graphs directory beginning from web server default
# $tracker_directory - Directory to hold FlowTracker trackings
# $tracker_short - Tracker directory beginning from web server default
# $filter_directory - Directory in which to keep FlowTracker filter files
# $rrdtool_directory - Directory in which to keep FlowTracker RRDtool files
# $cgi_bin_directory - Directory which holds cgi scripts
# $cgi_bin_short - cgi-bin directory beginning from web server default
# $flow_data_directory - Directory that holds all flow-tools data files
# $flow_bin_directory - Directory where all flow-tools reside
# $rrdtool_bin_directory - Location of RRDtool programs
# $work_directory - Directory to store intermediate files 
# $names_directory - Directory to save permanent 'names' file
# $flow_capture_interval - Interval beyond end point to capture all flows
# $flow_file_length - Length (in seconds) of each of your flow files 
# @devices - List of device names exporting netflow (see #4 below)
# $N - Used to control directory organization (see #5 below)
# $dig - Location of DNS utility 'dig' (set to nslookup if required)
# $collection_offset - Seconds into past to begin collection period
#
# The rest of this file contains basic parameters such as colors, etc.
#
# Additional Considerations
#
# 1. The subdirectories created under the 'htdocs', 'work', 'names', and
# 'cgi-bin' directories (e.g., FlowTracker_Filter, FlowTracker_RRDtool)
# must have permissions that allow the owner of the web server process
# (e.g., apache) to write into them. Check this especially if the
# directories were created by a different user.
#
# 2. FlowViewer and FlowGrapher offer the ability to save interesting
# reports. To do this, the scripts save a temporary copy of every report
# before the user elects to save it permanently. These intermediate
# files accumulate in the 'work' directory specified in the
# FlowViewer_Configuration file. They can be removed daily via a cron
# script to prevent unnecessary use of disk space. When the user elects
# to save a report, it is copied into either the 'reports_directory' or
# the 'graphs_directory', depending on which tool is in use.
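#
# A daily cleanup of the 'work' directory could look like the sketch
# below. It is written in Python for illustration and is not shipped with
# FlowViewer. The function name and the one-day age threshold are
# assumptions; adjust both to your installation before running it from
# cron.

```python
import os
import time


def remove_stale_files(work_directory, max_age_seconds=86400, now=None):
    """Delete regular files older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(work_directory):
        # Only plain files are touched; subdirectories are left alone.
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

# The age test uses each file's modification time, so a report that was
# just generated is never removed while the user is deciding whether to
# save it.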
#
# 3. FlowViewer and FlowGrapher offer the ability to resolve NetFlow IP
# addresses into their host names on the fly. This process is sped up
# by caching names in a 'names' file, which resides in the directory
# specified by the 'names_directory' parameter. This parameter defaults
# to /tmp, but that may not be the best directory for you since its
# contents may disappear with a reboot. As you build up your 'names'
# file with early runs, you will notice the speed increase dramatically
# as the file is used more. Resolving names is the primary reason for
# slower overall FlowViewer performance. Use the GDBM database if you
# can, since it is fastest. Not all Perl distributions support GDBM,
# but most do support NDBM. Setting the '$use_NDBM' flag in
# FlowViewer_Configuration.pm causes the FlowViewer_Main and
# FlowGrapher_Main scripts to use NDBM.
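#
# The caching idea can be sketched as follows. This Python illustration
# is not the FlowViewer implementation, which uses a GDBM or NDBM file
# from Perl. The function name and the plain dictionary used as the
# cache are assumptions made for the example.

```python
import socket


def resolve_cached(ip, cache, resolver=None):
    """Return the host name for ip, consulting the cache before DNS."""
    if resolver is None:
        # Default to a reverse DNS lookup through the system resolver.
        resolver = lambda addr: socket.gethostbyaddr(addr)[0]
    if ip in cache:
        return cache[ip]
    try:
        name = resolver(ip)
    except OSError:
        # Lookup failed; fall back to the address itself.
        name = ip
    cache[ip] = name
    return name
```

# Only the first request for an address pays for a lookup; later
# requests are answered from the cache, which is why early runs are the
# slowest.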
#
# 4. The FlowViewer and FlowGrapher reporting features use a flow-tools
# data directory layout that has a particular device at the top. A
# typical flow-tools directory looks like:
#
# /flows/router_1/2005/2005-07/2005-07-04
#
# The device name (router_1) is obtained from an array called 'devices'
# in the FlowViewer_Configuration.pm file. Populate this array with your
# device names. If your flow-data file structure does not include a
# device name (for example, you are collecting from only one device), set
# the @devices array to a single empty string (i.e., @devices = ("");).
# You can then ignore the Devices pulldown selection on the web page.
#
# 5. Different organizations store captured netflow data differently,
# according to the 'N' setting on the flow-capture statement. However,
# there is an error in the flow-tools documentation: the default value
# is actually '3' and not '0' as indicated. I have set $N = 3 to
# reflect the more common setting. The directory structure associated
# with $N = 3 is shown below:
#
# /flows/router_1/2005/2005-07/2005-07-04
#
# If you are not seeing output, please check this setting.
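#
# To check the layout against your own data, the $N = 3 path for a given
# day can be built as in the sketch below. It is a Python illustration
# that covers only the $N = 3 layout shown in this README, and the
# function name is hypothetical. An empty device name is skipped, which
# matches the @devices = (""); case from item 4.

```python
import posixpath
from datetime import date


def flow_directory(flow_data_directory, device, day):
    """Build the $N = 3 directory path: base/device/YYYY/YYYY-MM/YYYY-MM-DD."""
    parts = [flow_data_directory]
    if device:
        # An empty device name means the layout has no device level.
        parts.append(device)
    parts.append(day.strftime("%Y"))
    parts.append(day.strftime("%Y-%m"))
    parts.append(day.strftime("%Y-%m-%d"))
    return posixpath.join(*parts)


if __name__ == "__main__":
    print(flow_directory("/flows", "router_1", date(2005, 7, 4)))
```

# If the printed path does not exist on your collector, your
# flow-capture nesting level probably differs from 3.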
#
#
# Change Log
# 
# Version 3.0
# 
# 1. Major new addition of FlowTracker
# 2. Reorganized scripts so that the date and time fields are updated
#    with each invocation
# 3. Moved common code (e.g., filter creation) to FlowViewer_Utilities.pm
# 4. Improved Report Parameters output formatting
# 5. Provided host names capability for FlowGrapher (thanks Mark Foster)
# 6. Introduced debug and logging capabilities
# 7. Merged GDBM/NDBM into a single script (thanks Ed Ravin)
# 
# Version 2.3
#
# 1. Modified FlowGrapher record processing to not call 'timelocal' for
#    epoch times. Other speed improvements. Result: up to 10 times faster.
# 2. FlowGrapher error leaving spikes is fixed (thanks Mark Foster)
# 3. Bug with concatenation when $N=0 fixed (thanks Dave Faught)
#
# Version 2.2
#
# 1. Added flow_select parameter to control which flows are considered
#    with respect to the specified time period
# 2. Removed Eastern Time (ET) notation. All times are system local
#
# Version 2.1
#
# 1. Fixed concatenation. Needs to start one flow file length before start time
# 2. Fixed end-of-year problem in FlowGrapher
# 3. Fixed small problem with time requests that end just before midnight
#
# Version 2.0
#
# 1. Used pipe (|) instead of re-reading intermediate files (thanks Woj Kozicki!)
# 2. Introduced configurable variable $N to specify flow-directory nesting levels
# 3. Reduced default value of configurable variable $flow_capture_interval to 1800
# 4. Created FlowViewer_NDBM.cgi for users whose Perl does not have GDBM
# 5. Created configurable 'work_directory' separate from cgi_bin_directory
# 6. Sped up concatenation for requests that cross day boundaries
# 7. Added filter fields: Protocol, TOS Field, TCP Flags
# 8. Added some more syntax checking
# 9. Added FlowGrapher capability (requires GD for Perl)
#
# Version 1.0 (Original)
#
#
# Vital Assistance
#
# Special thanks to those FlowViewer users who provided feedback and valuable
# suggestions, including Sejin Ahn, Mark Boolootian, Bogdan Ghita, Woj Kozicki,
# Ed Ravin, Alex Shepherd, Mike Smith, Scott Wingfield, Vali Magdalinoiu,
# Eric Lautenschlaeger, and Dave Faught. Big thanks to fellow toiler in the NASA
# vineyard Mark Foster for some detailed testing, excellent suggestions, and code
# to go along with it :-)
#
#
# Bugs, recommendations
#
# If you discover a bug, or have a recommendation, please send an email to:
#
# Joe Loiacono
# jloiacon@csc.com
