Lights-Out Solution for Sun Company
Sun Company, one of the
country's leading corporations specializing in refining
and marketing oil, consists of five refineries and a
central headquarters located in Philadelphia, each of
which operates its own computer center. The six data
centers, connected via a wide-area network, are critical
to Sun's 5,000-plus users. In order to increase
reliability and efficiency, Sun decided to migrate to a
total lights-out solution.
According to Dennis
Puida, Senior Systems Analyst, Sun needed a proven
software solution that was reliable and exhibited
technical excellence. "RoboMon was the only product
we evaluated that fit the bill," Dennis states.
"Our primary objective was to implement a software
solution that would monitor all of our refinery
applications."
Sun initiated a three-phase rollout for RoboMon. The
first phase involved system monitoring and error
notification. The second and current phase consists of
system monitoring and automatic problem resolution. The
third phase will cover dynamic tuning.
Among its many functions, RoboMon monitors Sun's
refinery applications, including: data acquisition,
which is used to obtain real-time information about
refinery processing and operations; LIMS, which provides
quality control information for Sun's testing labs;
OMNI, a materials management system; and VRU, a voice
recognition system. For each of these applications, RoboMon
watches for the presence of the application's processes. If it
detects a problem, RoboMon automatically restarts the
process and sends a mail message detailing its action.
If the error persists, RoboMon pages the appropriate
person.
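The restart, mail, and page sequence described above can be sketched in a few lines of Python. This is only an illustration of the escalation logic, not RoboMon's rule language: the `ProcessWatch` class, the `check` function, the `max_restarts` threshold, and the `send_mail` and `page` callbacks are all hypothetical names assumed for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProcessWatch:
    """One monitored application process (all names here are hypothetical)."""
    name: str
    is_running: Callable[[], bool]   # probe that reports whether the process exists
    restart: Callable[[], None]      # hook that restarts the process
    max_restarts: int = 1            # assumed limit on restarts before paging
    failures: int = 0                # consecutive failed checks so far


def check(watch: ProcessWatch,
          send_mail: Callable[[str], None],
          page: Callable[[str], None]) -> str:
    """Run one monitoring pass and return the action taken."""
    if watch.is_running():
        watch.failures = 0
        return "ok"

    watch.failures += 1
    if watch.failures > watch.max_restarts:
        # The error persisted after the allowed restarts: escalate to a person.
        page(f"{watch.name} still down after {watch.max_restarts} restart(s)")
        return "paged"

    # First line of defense: restart the process and report the action by mail.
    watch.restart()
    send_mail(f"{watch.name} was not running; restarted it")
    return "restarted"
```

Calling `check` once per monitoring interval gives the behavior described: a missing process is restarted with a mail notice, and a process that is still missing on the next pass triggers a page instead.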
According to Dennis, Sun has implemented most of the
built-in rules supplied with the product. "In
addition, we have taken full advantage of the product's
flexible and extensive rule writing capabilities. We
developed a system where RoboMon has been placed into an
A-B-C model."
The model is graphically represented as a three-ring
circle. Inner ring A refers to a subset of RoboMon's
supplied rules that are used as is. Here, all of the
Performance solution rules have been turned on, while
the Automation rules are employed on a case-by-case
basis. Otherwise known as the "Sun Core
Rules," ring B contains supplied rules that have
been customized to Sun's corporate standards, and have
been rolled out to all of its sites. These include disk
space, disk state change, page files, looping processes,
batch jobs, tapes mounted, and many others. Also
included in this category are formal reports used by all
refineries for performance management and trend
analysis. Finally, outer ring C refers to site-specific
rules. Each refinery has the ability to write its own
rules based on site-specific needs.
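One way to picture the three rings is as layered rule sets that are combined for each site. The sketch below is an assumption-laden illustration, not a description of how RoboMon stores rules: the `effective_rules` function, the rule names, and the settings are hypothetical, and the choice to let an outer ring override an inner ring's rule of the same name is an assumption the source does not state.

```python
from typing import Dict

RuleSet = Dict[str, Dict[str, object]]


def effective_rules(ring_a: RuleSet, ring_b: RuleSet, ring_c_site: RuleSet) -> RuleSet:
    """Combine the three rings into the rule set one site would run.

    Assumption: when two rings define a rule with the same name, the
    outer ring (B over A, C over B) replaces the inner ring's version.
    """
    merged: RuleSet = {}
    for ring_name, ring in (("A", ring_a), ("B", ring_b), ("C", ring_c_site)):
        for rule_name, settings in ring.items():
            # Record which ring supplied the rule alongside its settings.
            merged[rule_name] = {"ring": ring_name, **settings}
    return merged


# Hypothetical example data for a single refinery.
ring_a = {"cpu_busy": {"threshold_pct": 90}}           # supplied rule, used as is
ring_b = {"disk_space": {"min_free_pct": 10}}          # customized corporate rule
ring_c = {"lab_queue_stalled": {"max_idle_min": 15}}   # site-specific rule

site_rules = effective_rules(ring_a, ring_b, ring_c)
```

Here `site_rules` holds all three rules, each tagged with the ring it came from, which mirrors the idea that every site runs rings A and B plus its own ring C.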
As a 24x7 organization, Sun relies heavily on its
weekend operations. "The ability of software to
take automatic corrective action is probably the most
important aspect of a lights-out solution," Dennis
says. "Some of RoboMon's many actions include
restarting processes, restarting print queues, mounting
disk drives, and purging and deleting files to free disk
space. These actions can be one, two or three levels
deep and are based upon thresholds. For example, if disk
space is low, RoboMon will purge files. If the problem
is not corrected, RoboMon will then delete certain files
based on pre-defined criteria."
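The multi-level, threshold-based actions in the disk space example can be sketched as an ordered list of corrective steps that stops as soon as the condition clears. This is a minimal illustration under stated assumptions: the `free_disk_space` function, the `free_pct` probe, and the 10 percent threshold are hypothetical and are not taken from the product.

```python
from typing import Callable, List, Sequence, Tuple

Action = Tuple[str, Callable[[], None]]


def free_disk_space(free_pct: Callable[[], float],
                    threshold: float,
                    actions: Sequence[Action]) -> List[str]:
    """Apply corrective actions in order until free space reaches the threshold.

    Returns the labels of the actions that were actually run.
    """
    taken: List[str] = []
    for label, action in actions:
        if free_pct() >= threshold:
            break  # problem corrected; deeper levels are not needed
        action()
        taken.append(label)
    return taken


# Hypothetical simulation: purging frees 2 points, deleting frees 10.
disk = {"free": 3.0}
levels = [
    ("purge", lambda: disk.update(free=disk["free"] + 2.0)),
    ("delete", lambda: disk.update(free=disk["free"] + 10.0)),
]
actions_run = free_disk_space(lambda: disk["free"], threshold=10.0, actions=levels)
```

In this simulation purging alone is not enough, so the second level runs as well. If free space remained below the threshold after the last level, the caller would still need to notify someone.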
Over the past two years, RoboMon has become an
integral part of Sun's operations. Says Dennis Puida,
"RoboMon is a flexible, comprehensive lights-out
solution that has enabled our personnel to concentrate
on more pressing issues. Our staff doesn't need to react
to every situation anymore. RoboMon handles it
automatically."