SoloManager and File:T1267-f6.jpg: Difference between pages

From Eigenvector Research Documentation Wiki
imported>Donal
 
This page describes the SoloManager program and its usage.
 
== Introduction ==
The purpose of SoloManager is to start a target program locally and then continuously monitor
the target program's availability. The target program responds to TCP/IP queries on a specified port if it is
operating normally.
If the target program becomes unresponsive for a specified
period of time, SoloManager can terminate and restart it, and/or reboot the host computer entirely.
Many aspects of the SoloManager program can be configured by specifying values in the SoloManager.ini text file.
 
 
''Note: For historical reasons, the configuration files are contained in a folder called 'solomonitor'. We apologize for any confusion this may cause.''
 
== Description of components ==
 
=== SoloManager .jar file ===
 
This contains the SoloManager program, and a sample SoloManager.ini file. It also contains all necessary Java
library files.
 
=== SoloManager configuration file ===
 
Contains configuration details specifying how the SoloManager operates. See the example configuration file listed below.
 
=== Target program ===
This is the program we wish to monitor and keep available at all times. It must expose a TCP port and respond to socket queries on that port.
 
=== Wrapper service (optional) ===
 
This is an optional component which will start SoloManager whenever the host computer is booted up. It is described in the sections below about starting SoloManager as a Service or Daemon.
 
== Relationships and processing sequence ==
 
These components are related as shown in the SoloManager flowchart.
 
[[File:Flowchart.png]]
 
== Typical process flow ==
The SoloManager is typically started automatically when the host computer is booted up, usually via the [[#Starting SoloManager Automatically|Service and Daemon Wrapper]].
 
Once started, SoloManager begins by reading values for all configurable parameters from the SoloManager.ini file. The user can edit this file to specify preferred settings, but it must be located in the same directory as the SoloManager jar file. This file is where, for example, the user specifies the name of the target executable which SoloManager will
start and monitor.
 
SoloManager then begins its unending loop where it checks the status of the target program. SoloManager creates a socket connection to the target program and sends a query. If the target program is alive it sends a response which must match what SoloManager is expecting.
 
SoloManager checks that the target program is alive by:
# Opening a socket on the target program's port.
# Sending the parameter "msgToSocket" to the socket and verifying that the first line returned from the socket equals the parameter "expectedResponse".
: If the response is not valid, SoloManager repeats this check up to "fastFailCountLimit" times with a pause of "fastQueryIntervalSeconds" seconds.
: If the response is valid, the check is complete with a successful result.
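
The target check described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not SoloManager's actual Java code: the function names <code>query_once</code> and <code>check_target</code> are hypothetical, the message is sent as UTF-8 exactly as given, and the pause is applied after every failed attempt.

```python
import socket
import time


def query_once(host, port, msg, timeout=5.0):
    """Send msg to host:port and return the first reply line, or None on any error.

    Hypothetical helper: the encoding and timeout are assumptions.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(msg.encode("utf-8"))
            with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
                return reader.readline().rstrip("\r\n")
    except (OSError, UnicodeDecodeError):
        return None


def check_target(query, expected_response, fast_fail_count_limit,
                 fast_query_interval_seconds, sleep=time.sleep):
    """Return True as soon as one reply matches, False once the limit is used up."""
    for _ in range(fast_fail_count_limit):
        if query() == expected_response:
            return True
        # Assumption: pause after each failed attempt before trying again.
        sleep(fast_query_interval_seconds)
    return False
```

A caller would pass something like <code>lambda: query_once("127.0.0.1", 2211, msg)</code> as the <code>query</code> argument, where <code>msg</code> holds the "msgToSocket" value. Keeping the socket call separate from the retry logic makes the retry behavior easy to test without a running target.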
 
If the target check was successful then the failure counter is reset to zero and the loop repeats after a specified pause period of
"slowQueryIntervalSeconds" seconds.
If the target check was not successful then the failure counter is incremented. The loop continues until this counter reaches the
specified "nResponseRestart" counter value, whereupon SoloManager issues a command to restart the target program and continues with
the loop. If the target program restarts then the next check will be successful, so the loop continues normally.
 
If the restart command does not succeed in restarting the target program then the target checks will continue failing and
the failure counter incrementing until it eventually attains the specified "nResponseReboot" counter value. At this point SoloManager
issues a command to reboot the host computer and the entire process begins again.
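
The escalation logic above can be modeled as a small counter-driven function. This Python sketch is a simplified model, not the real implementation: <code>escalation_actions</code> and its action labels are hypothetical names, the counter is assumed to start again from zero after a reboot, and a non-positive threshold suppresses the corresponding action, as the example configuration file notes.

```python
def escalation_actions(check_results, n_response_restart, n_response_reboot):
    """Map a sequence of target-check results (True = success) to actions.

    Returns one label per check: "wait", "none", "restart", or "reboot".
    """
    failures = 0
    actions = []
    for ok in check_results:
        if ok:
            failures = 0  # a successful check resets the failure counter
            actions.append("wait")
            continue
        failures += 1
        if n_response_reboot > 0 and failures == n_response_reboot:
            actions.append("reboot")
            failures = 0  # assumption: everything starts over after a reboot
        elif n_response_restart > 0 and failures == n_response_restart:
            actions.append("restart")
        else:
            actions.append("none")
    return actions


# With nResponseRestart = 1 and nResponseReboot = 3:
history = escalation_actions([True, False, False, False], 1, 3)
# history is ["wait", "restart", "none", "reboot"]
```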
 
During these operations SoloManager writes status information to a log file and optionally can send e-mail to report events.
The log file will be located in the directory specified by "outdir". Its size is limited to the last "logFileMsgCapacity" log messages.
E-mailed alerts are optional and are enabled by setting "enableEmailing" = true. In this case e-mail messages will be sent to the specified user whenever:
# The SoloManager program starts.
# SoloManager is about to issue a restart command for the target program.
# SoloManager is about to issue a reboot command to the host computer's operating system.
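
The log-size limit mentioned above (only the last "logFileMsgCapacity" messages are kept) behaves like a bounded queue. The following Python sketch illustrates that idea only; how SoloManager actually trims its log file is not described here, and the variable names are hypothetical.

```python
from collections import deque

log_file_msg_capacity = 3  # small stand-in for logFileMsgCapacity
log_messages = deque(maxlen=log_file_msg_capacity)

for n in range(1, 6):
    # Once the deque is full, appending drops the oldest message.
    log_messages.append(f"message {n}")

kept = list(log_messages)
# kept is ["message 3", "message 4", "message 5"]
```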
 
== Dependencies ==
SoloManager requires the following:
# Java version 1.5 or later is available on the host computer.
# It must be able to write to a log file on the filesystem.
# It must be able to issue a system reboot command (command can be defined within the configuration file).
# Operating system may be any of: Linux, Windows (2000, XP, 2003, 2008, Vista, 7), or Mac.
 
 
== Configuration File ==
 
The configuration file will almost always need to be modified for the individual application and installation settings. An example file is included below, but a few key settings to modify include:
 
* '''executableName''' Name of the program to run (usually either Solo.exe or Solo_Predictor.exe.)
* '''startExecutableCommandPre''' Full path to the program listed as executableName (unless the program's folder has been added to the system path by the installer.)
* '''outdir''' Specifies the folder which should contain the log files. By default these will be written to the same folder as the configuration file, but another folder may be preferable if the user does not have read/write permissions to that folder.
* '''maxTargetRunDurationHours''' The target will be stopped and restarted every maxTargetRunDurationHours if this is a positive number. It has no effect if it is not a positive number.
* '''nResponseRestart''' and '''nResponseReboot''' indicate how many target check failures must occur before the application is restarted and/or the system is rebooted (respectively). If the Target Application fails after starting successfully, the failure will be detected by the next normal check, which occurs every slowQueryIntervalSeconds seconds. When a target check fails, a restart is invoked after (fastQueryIntervalSeconds+1)*fastFailCountLimit*nResponseRestart seconds. If the restart attempts fail, a system reboot is invoked after (fastQueryIntervalSeconds+1)*fastFailCountLimit*nResponseReboot seconds. Thus the worst-case total elapsed time, in seconds, from the target failing until an action occurs can be roughly calculated by:
 
ResponseTime = slowQueryIntervalSeconds + (fastQueryIntervalSeconds+1)*fastFailCountLimit*nResponse____
 
 
The settings in the configuration file represent likely minimum settings. If longer delays are acceptable before a response, increase the fastQueryIntervalSeconds and/or the nResponse___ settings.
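
As a worked example of the ResponseTime formula, the following Python sketch plugs in the values from the example configuration file (slowQueryIntervalSeconds = 6, fastQueryIntervalSeconds = 2, fastFailCountLimit = 2, nResponseRestart = 1, nResponseReboot = 3). The function name is hypothetical, and the result is a rough estimate, as the formula itself is.

```python
def worst_case_response_seconds(slow_query_interval_seconds,
                                fast_query_interval_seconds,
                                fast_fail_count_limit,
                                n_response):
    """Rough worst-case seconds from target failure until the action fires."""
    return (slow_query_interval_seconds
            + (fast_query_interval_seconds + 1) * fast_fail_count_limit * n_response)


restart_s = worst_case_response_seconds(6, 2, 2, 1)  # 6 + 3*2*1 = 12
reboot_s = worst_case_response_seconds(6, 2, 2, 3)   # 6 + 3*2*3 = 24
print(f"restart after about {restart_s} s, reboot after about {reboot_s} s")
```

With these settings a restart is expected roughly 12 seconds after the target fails, and a reboot roughly 24 seconds after.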
 
<pre>--------------------------------------------------------------------------
------------ start: Example SoloManager.ini configuration file -----------
# default values for the SoloManager
#
# Period to pause when fast and slow polling the executable
fastQueryIntervalSeconds = 2
slowQueryIntervalSeconds = 6
#
# How many times to poll when getting fail result before escalating the response level
fastFailCountLimit = 2
# The initial fastFailCountLimit is usually larger, to allow time for target system startup
startFastFailCountLimit = 15
#
# How many fast cycles should occur with fails before applying response for level 1, 2, etc.
# Note: set to zero or a negative integer to suppress the response action from occurring
#nResponse1
nResponseRestart = 1
# nResponse2
nResponseReboot = 3
#
# maxTargetRunDurationHours. Non-positive value disables this feature.
# Positive value must be greater than 0.05 (hours)
maxTargetRunDurationHours = 0
#
# executable details
executableName = solo_predictor.exe
startExecutableCommandPre = c:\\Program Files\\EVRI\\Solo_Predictor\\application\\runtime\\win64\\
startExecutableCommandPost =
stopExecutableCommandPre = taskkill /F /IM \"
stopExecutableCommandPost = \"
#
# reboot
rebootCommandPre =
rebootCommandPost =
rebootCommand = shutdown /?
#
# executable socket details
serverIP = 127.0.0.1
serverPort = 2211
#
# log file capacity
logFileMsgCapacity = 6000
#
# Output directory. DO NOT add surrounding quotes
outdir = .
#
# must be true or false, case insensitive:
enableEmailing        = false
#
# mailserver
 
mailServer            = mail.eigenvector.com
 
mailServerPort        = 587
mailUsername          = USERNAME@eigenvector.com
mailPassword          = PASSWORD
# Note: mail Addresses cannot include spaces and must be well-formed addresses
mailRecepientAddress  = SOMEONE@gmail.com
# Use something which will be a valid e-mail address:
mailSenderAddress    = monitor@solopredictor.com
#
//---------- end: Example SoloManager.ini configuration file -----------
</pre>
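
To make the file layout concrete, here is a minimal Python sketch that reads "key = value" lines of the kind shown in the example, skipping blank lines and "#" comments. It is an assumption-laden illustration: <code>parse_settings</code> is a hypothetical helper, it does not interpret backslash escapes such as the doubled backslashes in the path settings, and SoloManager's own parser may behave differently.

```python
def parse_settings(text):
    """Return a dict of settings from 'key = value' lines; comments are ignored."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        settings[key.strip()] = value.strip()
    return settings


sample = """\
# Period to pause when fast and slow polling the executable
fastQueryIntervalSeconds = 2
slowQueryIntervalSeconds = 6
startExecutableCommandPost =
enableEmailing        = false
"""

settings = parse_settings(sample)
fast_interval = int(settings["fastQueryIntervalSeconds"])
emailing = settings["enableEmailing"].lower() == "true"  # case-insensitive
```

Note that a key with nothing after the equals sign, such as startExecutableCommandPost, is stored as an empty string in this sketch.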
 
==Starting SoloManager Automatically==
 
SoloManager is most useful when run automatically by an operating system. This will start the Target Application in the background. The following describes how to install SoloManager as a service (Windows) or daemon (Linux).
 
===Running SoloManager as a Windows Service===
 
The ''service'' folder in the SoloManager main folder contains the tools necessary to run SoloManager as a Windows service. This will automatically start the application without a user logging in. Follow these instructions to install SoloManager as a Windows service:
 
# Copy the application files onto the computer on which the application is to be run.
# Configure solomanager.ini as needed for the intended behavior.
# Copy solomanager.ini into the "service" folder. This copy of solomanager.ini will be used by the service.
# Run the Install_Service.bat file in the ''service'' folder to install the service (this batch file must be run by a user with administrative privileges).
 
Note: Different versions of Windows have differing levels of user access control. For example, under Windows XP it is typically sufficient to be logged in as a user with administrative privileges to successfully install the service described in step 4 above. With Windows Vista and higher, even if you are logged in as an administrator, by default you do not have administrative privileges when launching an application. To run an application in administrative mode, right-click the application icon and select "Run as administrator".
 
To work around the issue, open a command window as an Administrator. To do so, click on Start and search for "cmd". Right-click on "cmd.exe" and select the option "Run as Administrator". From this command window, you will be able to run all of the necessary batch files at administrative level.
 
====Troubleshooting Windows Service Problems====
* Errors and status messages will be reported to the log files stored in the ''C:/temp'' folder (if this doesn't exist, the log will be created in the same folder as the wrapper.exe). To move logs to a different location, edit the ''service/conf/service.conf'' file. You can also modify the logging behavior in this file (maximum length, number of log backups, etc.)
 
* Several of the configuration files expect the folder ''C:/temp'' to exist. If it does not, you may receive "Null Pointer Exception" errors from the SoloManager application. Either modify the ''service.conf'' and ''solomanager.ini'' files '''or''' create the folder as needed.
 
* An error in the log saying that Java could not be found usually means that the service was unable to locate java in the standard Solo_Predictor folder. If encountered, edit the ''service/conf/service.conf'' file and locate the "wrapper.java.command" property. The usual value for this property is:
  C:/Program Files/EVRI/Solo_Predictor/application/sys/java/jre/win64/jre/bin/java
::which is the default Solo_Predictor sub-folder in which the 64-bit version of Java is located. If Solo_Predictor is installed in a location other than the default folder, or you are using the 32-bit version of Solo_Predictor, change this value to reflect the correct location. For 32-bit Solo_Predictor, replace "win64" with "win32".
 
::An alternative solution to the above issue is to execute the service with the credentials of a specific user that has a full copy of Java installed. To do so, go to the Windows "Services" control panel, locate the EVRI SoloManager service, double-click the service and change the "Log On" properties to a specified user.
 
* If you have problems, try running the test script:
  Test_Service
::to see if the server will start when run manually. Errors from this script can be used to adjust the service.conf file.
 
* To uninstall the service, run the Uninstall_Service.bat file (as an administrator.)
 
===Running SoloManager as a Unix/Linux Daemon===
 
The ''daemon_linux'' folder in the SoloManager main folder contains the tools necessary to run SoloManager as a Linux daemon. This will automatically start the application without a user logging in. Follow these instructions to install SoloManager as a Linux Daemon:
 
# Copy the application files onto the computer on which the application is to be run.
# Configure solomanager.ini as needed for the intended behavior.
# Copy solomanager.ini into the ''daemon_linux'' folder. This copy of solomanager.ini will be used by the daemon.
# Run the ''Install_Daemon'' script to install the daemon (this script must be run by a user with root privileges).
./Install_Daemon
 
'''NOTE:''' In order to execute this script and have the daemon operate correctly, you may have to manually set the "execute" bit on all files in the top-level ''daemon_linux'' folder to "on" using the chmod command inside the ''daemon_linux'' folder:
chmod 755 *
 
Errors and status messages will be reported to the log files stored in the ''daemon_linux/logs'' folder. To move logs to a different location, edit the ''daemon_linux/conf/wrapper.conf'' file. You can also modify the logging behavior in this file (maximum length, number of log backups, etc.)
 
To uninstall the daemon, run the ''Uninstall_Daemon'' script (as root.)
./Uninstall_Daemon
 
If you have problems, try running the test script:
./Test_Daemon
to see if the server will start when run manually.

Revision as of 14:21, 12 May 2017
