Service Level Agreement Monitoring & Reporting
Service Level Agreements
Longitude® Service Level Agreements (SLAs) allow you to monitor and assess the performance of anything from a single entity to a multi-tiered application or business service. If the IT department has contractual obligations for service levels, Longitude’s SLAs can help ensure and document compliance. SLAs are particularly useful for correlating user experience metrics (as measured via Longitude’s synthetic web transaction facility) with the underlying infrastructure components that support the associated business service. Service Level Agreements are defined in terms of service availability and performance, answering the questions:
- What percentage of the time was the service available?
- How was the service performing?
- What is the root cause of outages and degradations in performance?
SLA Dashboards and Reports
Longitude’s SLA Dashboard provides up-to-the-minute displays of SLA performance, with pinpoint drilldown to investigate shortfalls in compliance. Historical reports are also available. When an SLA indicates an issue, you can make reporting more informative for stakeholders by commenting on extenuating circumstances or documenting the actions taken to resolve the problem. Likewise, you can document changes in IT infrastructure (such as upgrading the CPU or doubling the memory in a web server) and then measure the resulting effects on SLA performance.
Defining SLAs
SLAs are defined in terms of service conditions, where each condition measures one aspect of the overall application or service being measured by the SLA. For example, an SLA for a multi-tiered application might include service conditions to measure user experience (via a synthetic web transaction), memory, database availability, CPU utilization, and network bandwidth. An outage or degradation in any area would affect the service and thus the state of the SLA, and the SLA dashboard (shown above) will make it easy to determine the underlying cause. Alternatively, an SLA may consist of only one metric; for example, a “Network Response” SLA might simply measure the response time of a ping request to another server on the network.
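As an illustration of a single-metric service condition such as the "Network Response" example, the sketch below maps a measured ping response time to a state. The function name, the threshold values, and the two-threshold scheme are assumptions made for this example; they are not Longitude's actual configuration or API.

```python
from enum import Enum


class State(Enum):
    GOOD = "Good"
    DEGRADED = "Degraded"
    UNACCEPTABLE = "Unacceptable"


def classify_response_time(
    ms: float,
    good_max_ms: float = 100.0,      # hypothetical threshold
    degraded_max_ms: float = 500.0,  # hypothetical threshold
) -> State:
    """Map a ping response time (in milliseconds) to a service condition state."""
    if ms <= good_max_ms:
        return State.GOOD
    if ms <= degraded_max_ms:
        return State.DEGRADED
    return State.UNACCEPTABLE
```

A multi-metric SLA would apply one such rule per service condition (CPU, memory, database availability, and so on), each with its own thresholds.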
Clustered Service Conditions Account for Redundancy
A clustered service condition is a service condition that operates across a group of computers or devices (e.g. disks). For example, a “Server Performance” condition could be measured over 3 computers, where performance is considered acceptable if at least 2 of the computers are operating, even if the third computer is down. This allows SLAs to emulate true business service conditions by taking redundancy into account.
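The "at least 2 of 3" rule can be sketched as a simple quorum check. This is a minimal illustration, assuming each cluster member reports only whether it is operating; the function and parameter names are hypothetical and not part of Longitude.

```python
def cluster_condition_met(member_up: list[bool], min_required: int) -> bool:
    """Return True when at least `min_required` cluster members are operating."""
    # True counts as 1 and False as 0, so sum() gives the number of members up.
    return sum(member_up) >= min_required


# Three servers, one of them down, with a requirement of two operating servers.
servers = [True, True, False]
print(cluster_condition_met(servers, min_required=2))  # prints True
```

With a second server down, the same call returns False, which is when the clustered condition would stop being acceptable.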
Overall SLA Compliance and State Changes
A Service Level Agreement is the logical aggregation of individual service conditions, and its state reflects the minimum (worst) state of its individual service conditions. If an SLA has five service conditions where three of them are good in a given interval, one is degraded, and one is unacceptable, the SLA is unacceptable in that period of time. Whenever an SLA changes state, Longitude generates an event that can be viewed in real-time dashboards and historical reports, and can be used to notify responsible IT staff or even take corrective action.
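The worst-state rule can be sketched as taking the minimum over ranked states. The sketch below covers only Good, Degraded, and Unacceptable, whose relative order follows from the text; the numeric ranks and names are assumptions for illustration, and the ranking of Maintenance, Down, and No Data is not stated here, so those states are left out.

```python
from enum import IntEnum
from typing import Iterable


class State(IntEnum):
    # Higher value means better service; the numeric ranks are illustrative.
    UNACCEPTABLE = 1
    DEGRADED = 2
    GOOD = 3


def sla_state(condition_states: Iterable[State]) -> State:
    """Return the worst (minimum) state among an SLA's service conditions."""
    return min(condition_states)


# Five conditions: three good, one degraded, one unacceptable.
conditions = [State.GOOD, State.GOOD, State.GOOD, State.DEGRADED, State.UNACCEPTABLE]
print(sla_state(conditions).name)  # prints UNACCEPTABLE
```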
Service conditions can be in the following states:
| State | Description |
| --- | --- |
| Good | Service available and operating as expected. Services are deemed Good only when each metric of the service is deemed Good in the measured time period. |
| Degraded | Service is available, but is operating at a less than “Good” level of performance. |
| Unacceptable | Service is available, but is operating at an unacceptable level of performance (i.e., less than “Degraded”). |
| Maintenance | Service is unavailable due to scheduled or requested maintenance. |
| Down | Service is unavailable when it should be available (i.e., it is not Good, Degraded, or Unacceptable, and there is no scheduled or requested maintenance in effect). |
| No Data | "No data" means that for some reason the management station is not attempting to collect data for the service. |
About Longitude
With Longitude you get the best of both worlds: agentless software that’s not only affordable and easy to install and use, but gives you the kind of comprehensive application and server performance monitoring that you’d expect from high-end agent-based software. Longitude monitors hundreds of vital performance metrics, alerts you to problems, takes any corrective actions you specify, and generates reports and graphs that demonstrate just how well your operating system, databases, web servers, messaging, and J2EE applications are performing. Best of all, everything is presented in dashboard views that let you drill down to investigate problems and find answers fast.
In keeping with its robust functionality, Heroix Longitude scales with ease. It’s equally adept at monitoring a few servers or many. And unlike some other agentless solutions, Longitude is totally self-contained with no prerequisites for layered software. You won’t have to install extra software, nor will you get stung by unexpected costs such as additional database, report writer, and runtime license fees. In most cases, Longitude can be deployed in about 15 minutes!