Events and Alerts

Longitude checks the data it collects for problem conditions using rules built in to Longitude applications. Each rule applies to a specific component in an application, and checks that component against a condition. For example, the Windows application has a CPU component, and rules check the CPU usage against a threshold. If that threshold is exceeded, an Application Event is created.
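As a rough illustration of the idea, a rule can be thought of as a threshold attached to one component of one application. The sketch below is not Longitude's implementation; the class names, the event name `CpuBusyExample`, and the threshold of 90 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    application: str   # e.g. "Windows"
    component: str     # e.g. "CPU"
    event_name: str    # hypothetical event name
    threshold: float   # hypothetical threshold value


@dataclass
class ApplicationEvent:
    computer: str
    event_name: str
    value: float


def check(rule: Rule, computer: str, value: float) -> Optional[ApplicationEvent]:
    """Return an event only when the collected value exceeds the threshold."""
    if value > rule.threshold:
        return ApplicationEvent(computer, rule.event_name, value)
    return None


cpu_rule = Rule("Windows", "CPU", "CpuBusyExample", 90.0)
print(check(cpu_rule, "server01", 97.5))  # an ApplicationEvent
print(check(cpu_rule, "server02", 42.0))  # None
```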

The Dashboard >> Events tab displays a sortable and filterable list of all Application Events. The Dashboard >> Status tab displays an overview of Application Events, indicating the severity and percentage of affected computers through user-configurable pie charts.

 

Status Dashboard

The Status Dashboard is displayed under the Dashboard >> Status tab. An application overview dashboard is created by default for Windows, Unix, Network Devices, Hyper-V, or VMware when these applications are monitored. The default dashboards display the status for all monitored computers for the application’s key metrics – for example, the Unix dashboard has a pie chart widget for each of CPU, FileSystem, Page Scan Rate and Swap Space. The Dashboard home page displays an overview for all dashboards with each dashboard’s icon displaying the most severe current event status.

Dashboard home page

Clicking on an icon in the dashboard list or on a status icon in the left pane will display the details for the dashboard. In the example below, no problems have been detected for CPU, Page Scan Rate or Swap Space, but 63% of the monitored FileSystems have Critical severity storage space problems and 19% have major severity storage space problems.

Hover over dashboard

Hovering over a widget displays the details of the problems for that section of the widget. In the example above, the detail display was produced by hovering over the Red (i.e. Critical) section of the FileSystem widget.
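The percentages shown in a widget are simply each severity's share of the monitored items. A minimal sketch of that arithmetic, assuming a hypothetical set of 100 monitored FileSystems chosen to match the figures quoted in the example:

```python
from collections import Counter


def severity_shares(statuses):
    """Return each status's share of the monitored items as a whole percentage."""
    total = len(statuses)
    if total == 0:
        return {}
    counts = Counter(statuses)
    return {status: round(100 * n / total) for status, n in counts.items()}


# Hypothetical data: 100 monitored FileSystems.
filesystems = ["Critical"] * 63 + ["Major"] * 19 + ["OK"] * 18
print(severity_shares(filesystems))
```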

Right click on event widget

Right-clicking on a widget displays options for displaying detailed metric information:

Problem items only   Metrics for Critical and Major problems
All                  Metrics for all monitored computers
First 25             Metrics for the 25 worst performing computers
First 10             Metrics for the 10 worst performing computers

Summary event display

The Problem items only report above displays the metrics for computers for which Critical or Major severity problems have been detected. Hovering over the data columns lists the value of the metric and any available additional information.

Click on the Back button to return to the Widget display.

Existing dashboards can be modified by right-clicking on the dashboard name in the left pane and selecting Modify this dashboard. New dashboards can be created by right-clicking on Enterprise and selecting Create a new dashboard. The Modify Dashboard window allows you to modify existing widgets or add new ones.

Modify widgets

Adding or editing a widget is done through a widget definition form. Each drop-down menu in the form is populated based on previous selections. For example, selecting Windows as the Application populates the Component list with only the components found in the Windows application. Selecting the CPU component restricts the Event list to only Events that are produced for Windows CPU.

DashboardWidget

The Computer Group can be set to either All or a previously defined Computer Group. If a user-defined Computer Group is selected and members are subsequently added to or removed from it, the computers displayed in the Dashboard are updated to reflect the change in the Group.

 

Copy Dashboard

Status Dashboards are user specific. Admin users can copy their own dashboards to other users by right-clicking on Enterprise and selecting Copy dashboards to other users…. Please note the following:

  • Copying a dashboard to another user will overwrite existing dashboard definitions.
  • If a dashboard has been configured based on a Computer Group, the Computer Groups will need to be copied as well.
  • Read Only users will only see the default Application overview dashboards until custom dashboards are copied to their account.

 

View Events

Events are detailed text descriptions of problems detected by Longitude applications. Events can be viewed by selecting Events in the Dashboard tab. The tree display in the left pane can be configured to view applications by Computers, Applications or Groups.

The color of the icon in these trees represents the most severe event for that computer, application or group:

Icon    Highest Event Severity
Red     Critical
Yellow  Major
Yellow  Minor
Green   No Events Found
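The roll-up to a single icon can be pictured as taking the maximum over an ordered list of severities. This is only a sketch of that logic, not Longitude's code; the function name and the list representation are assumptions.

```python
# Severities ordered from least to most severe.
SEVERITY_ORDER = ["Minor", "Major", "Critical"]


def highest_severity(event_severities):
    """Return the most severe entry, or None when there are no events."""
    if not event_severities:
        return None
    return max(event_severities, key=SEVERITY_ORDER.index)


print(highest_severity(["Minor", "Critical", "Major"]))  # Critical
print(highest_severity([]))                              # None (no events found)
```

The same function could be applied to a group by passing in the severities of every event for every computer in that group.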

Check or uncheck the box next to the icon to include or exclude events for the computer, application or group. For information on configuring groups, refer to the section on configuring Groups.

 

Events are listed in the following sections:

Section             Events
Application Events  Generated by Longitude rules scanning collected data
Windows Event Log   Data collected from the Longitude Windows Event Log application
Syslog              Data received by the Longitude Syslog listener
SNMP Traps          Data received by the Longitude SNMP Trap listener
SLA Events          Events generated by a change in status for a user-defined Service Level Agreement

If you are not monitoring these applications, the section will not be displayed. There is no tree display associated with the Syslog, SNMP Trap and SLA Events tabs.

 

Event Details

Event Monitor

  • The left pane of the Events display contains icons displaying the highest severity event found for the computer.
  • The right pane displays the events found for the servers checked in the left pane – by default the display shows events for the last hour.
  • The Count column is the number of times the event has occurred in the interval being displayed – clicking on the + icon to the left of the Count displays all the events for the interval.
  • Clicking on an Event in the right pane displays the text of the event in the bottom pane.
  • The Windows Event Log, Syslog, and SNMP Traps tabs will only be displayed if these applications are being monitored.
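The Count column can be thought of as a group-and-count over the events that fall inside the displayed interval. A minimal sketch, assuming events are available as hypothetical `(timestamp, computer, event_name)` tuples and using the default one-hour window:

```python
from collections import Counter
from datetime import datetime, timedelta


def count_events(events, now, window=timedelta(hours=1)):
    """Count occurrences of each (computer, event) pair within the window."""
    start = now - window
    return Counter(
        (computer, name) for ts, computer, name in events if start <= ts <= now
    )


now = datetime(2024, 1, 1, 12, 0)
events = [  # hypothetical sample data
    (now - timedelta(minutes=50), "server01", "CpuExample"),
    (now - timedelta(minutes=20), "server01", "CpuExample"),
    (now - timedelta(minutes=5), "server02", "DiskExample"),
    (now - timedelta(hours=3), "server01", "CpuExample"),  # outside the window
]
counts = count_events(events, now)
print(counts[("server01", "CpuExample")])  # 2
```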

 

Sorting and Filtering Events

Event Monitor filters

  • Most of the columns in the Event Monitor can be filtered and sorted.
  • Click on the column title to sort by that column.
  • Click on the arrow to the right of a column menu to display additional filtering options.
  • The Columns filter is the same for all columns and allows you to remove columns from the display.
  • Filtering options vary depending on the context of the column, e.g.:
    Event Monitor context filter

 

Suppressing Events

The Longitude rule engine can be configured to suppress Application Events for specific entities. For example, the Windows application will create events for services which are set to autostart but which are not running. To prevent events from being created for services which are not running for a known reason, the event can be suppressed for up to 48 hours. Please note that suppressing events will only apply to the specific item selected: suppressing an event for one service on a computer will not affect events created for other services on that computer.

For recurring problems, events may be suppressed through configuring a suppression action.

 

Disabling Events

If a specific event is not needed on a server, and will not be needed in the future, that event can be disabled. Disabled events are configured in the same way that suppressed events are configured, but they do not expire. An event would be disabled for known problems that will not be addressed. For example, low free memory alerts on an MS SQL database server configured to use maximum available memory.

Suppress and Delete Events Menu

Right-clicking on an event will display the Suppress and Disable options for the event. Right-clicking anywhere in the Application Events window will display the options to view all suppressed or disabled events. The available menu options are:

  • Suppress this event
    Suppressing an event will prompt you for how long the event should be suppressed. After this period has expired, the event will be created if the problem condition still exists. During this period, data will still be collected, but it will not be checked against application rules. To suppress events for a known computer maintenance window, suspend monitoring of applications to prevent collection timeouts while the server is unavailable.
     
    Scheduled event suppression may be configured using Configure Suppression in the Configure tab.
     
  • View all suppressed events
    This option will display a pop-up window listing all the events that are currently suppressed, and when each suppression expires.
  • Disable this event
    Disabling an event will disable it until it is manually resumed. When you are prompted to disable an event, you will be given the option to disable it for all computers or only for the selected computer. For example, the ServiceDown alert can be disabled for the service you’re viewing either for all computers, or for only the computer in the current event.

  • View all disabled events
    This option will display a pop up window listing all the events that are currently disabled. Right-clicking on an event in this window will allow you to Enable the event again.
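The difference between suppressing and disabling comes down to expiry: a suppression lapses on its own, while a disabled event stays off until it is enabled again. The sketch below models that distinction under stated assumptions; the class, its method names, and the `(computer, event, entity)` key are hypothetical, and the option to disable an event for all computers is not modeled.

```python
from datetime import datetime, timedelta

MAX_SUPPRESSION = timedelta(hours=48)  # upper limit stated in this section


class EventFilter:
    def __init__(self):
        self.suppressed = {}   # key -> expiry time
        self.disabled = set()  # keys with no expiry

    def suppress(self, key, now, duration):
        if duration > MAX_SUPPRESSION:
            raise ValueError("events can be suppressed for up to 48 hours")
        self.suppressed[key] = now + duration

    def disable(self, key):
        self.disabled.add(key)

    def enable(self, key):
        self.disabled.discard(key)

    def should_create_event(self, key, now):
        """False while the key is disabled or its suppression has not expired."""
        if key in self.disabled:
            return False
        expiry = self.suppressed.get(key)
        return expiry is None or now >= expiry


f = EventFilter()
t0 = datetime(2024, 1, 1, 12, 0)
spooler = ("server01", "ServiceDown", "Spooler")
f.suppress(spooler, t0, timedelta(hours=2))
print(f.should_create_event(spooler, t0 + timedelta(hours=1)))  # False
print(f.should_create_event(spooler, t0 + timedelta(hours=3)))  # True
```

Because the key includes the entity, suppressing the event for one service leaves other services on the same computer unaffected.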

 

Configure Events

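With the Configure button selected, you can adjust event thresholds, configure alert actions, and schedule event suppression, as described in the sections that follow.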

The Status and Events tabs are linked to the Events pane in the Configure tab, so that an event selected in Events, or a widget selected in Status will be loaded into the Events tree in the Configure tab. Alternately, you can browse through the Events tree to adjust thresholds and actions directly.

 

Adjusting Thresholds

Configure Thresholds

  • Thresholds can either be a Global Value or an Override Value.
  • The Thresholds tab will display content appropriate to the Event being tailored. For devices with multiple entities (e.g. Disks, Network Interfaces), you will have the option to set Override Values for each individual entity. Some events will have more than one trigger – for example, Unix >> FileSystem >> FileSystemSpaceCritical will alert if the free space is below a specified number of Megabytes, or below a percentage of free space. Status events, such as ping failure or service stopped, will not have thresholds.
  • To change an event’s Global Value threshold, select the event in the left pane, enter a new Global Value on the right, and select Apply.
  • To change an Override Value threshold, enter the value you would like to use in the Override Value field, select the objects that will use this threshold instead of the Global Value, and select Apply.
  • If an Override Value has been set for an object, it will be listed to the right of the object in the Computers tree.
  • Only computers monitored by an application will be displayed, and they are displayed by Computer Group.
  • Override Values are assigned to computers and objects belonging to a computer.
    If a Computer Group is selected, the Override Values will be applied to every computer in the Group. Moving computers to a different Computer Group will not change Override Values.
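One way to picture the relationship between Global and Override Values is a lookup that falls back to the Global Value when no override is stored. The sketch below stores overrides per computer (and optionally per entity), which is why moving a computer between groups leaves them unchanged. The function names, the sample values, and the precedence of an entity override over a computer override are assumptions for illustration only.

```python
def apply_override(overrides, computers, value, entity=None):
    """Store an Override Value for each computer, optionally for one entity."""
    for computer in computers:
        overrides[(computer, entity)] = value


def effective_threshold(global_value, overrides, computer, entity=None):
    """Assumed precedence: entity override, then computer override, then global."""
    if (computer, entity) in overrides:
        return overrides[(computer, entity)]
    return overrides.get((computer, None), global_value)


GLOBAL_VALUE = 10                          # hypothetical Global Value
groups = {"Databases": ["db01", "db02"]}   # hypothetical Computer Group
overrides = {}

# Selecting a group applies the override to every computer in it.
apply_override(overrides, groups["Databases"], 20)
# An override for a single entity on one computer.
apply_override(overrides, ["web01"], 5, entity="/var")

# Moving db02 out of the group does not change its stored override.
groups["Databases"].remove("db02")
print(effective_threshold(GLOBAL_VALUE, overrides, "db02"))           # 20
print(effective_threshold(GLOBAL_VALUE, overrides, "web01", "/var"))  # 5
print(effective_threshold(GLOBAL_VALUE, overrides, "web01", "/home")) # 10
```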

 

Configuring Actions

Application Events can trigger alert Actions such as Email or SMS text messages. Each Action can include multiple types of alerts (Email, Text to pager, SMS text to phone, SNMP trap, Execute OS Command) and up to 5 Actions can be created for each Event. After the first Action is configured, a + icon will appear to the right of the Action 1 tab, and allow you to create an additional action.

  1. Click on the Dashboard tab.
  2. Click on the Events button, and select the Event you want to generate an alert.
  3. Click on the Configure button – the Event selected in the previous screen will be selected in this screen. Alternately, if you know which Event you want to alert on, you can browse to the Event in the left pane.
  4. In the right pane, select the Actions button.
  5. Actions can be configured by Computer or by Computer Group. In the right pane, in the Action Filters field, click on the radio button for either Groups or Computers. The default value for both Groups and Computers is *, selecting all Computer Groups or all Computers.
    Computer or Computer Group alert selection

    • Clicking on the expand icon for Computer Groups will present a list of all user defined groups. Computer Groups are user specific, and actions can include groups from multiple users. Selected groups will be displayed as a comma separated list in the format User:Group.

      Expanded list of Groups in alert selection

    • To select specific Computers, click on the expand icon and check the Computers for this Action. Computers may also be entered as a comma separated list.
      Expanded list of Computers in alert selection
    • Either Computer Groups or Computers may be used for an action, not both. To create actions for both Computer Groups and Computers, configure an additional action.
    • If a computer is added to or removed from a Computer Group, the actions used for that computer will be updated accordingly. Note that since Computer Groups are user specific, in order to remove an action for a Computer it must be removed from all of the selected user groups.
  6. Some alerts will contain additional Action Filters – for example, filtering for Disk IDs or Processes. These may be left as * to select all.
  7. Select the number of consecutive events to alert on. For example, if you are alerting on a PingCritical Event, which has a 1 minute interval, selecting a count of 3 would delay the alert for 3 minutes, but would eliminate false alerts for intermittent network problems. However, Disk Space rules have a 1 hour interval, so a count of 3 would delay alerts for 3 hours.
  8. Select one or more alert actions, and provide the following additional properties for the selected actions:
    Email          Comma separated list of recipients
    Text to pager  Pager Provider and Number
    Text to phone  Cell Phone Provider and Number
    SNMP Trap      Specific Trap Number
    Execute        Command to Execute (Windows Computers only)

    Alert displaying all available actions

  9. The Enter Times of Day and Days of the Week field sets the time when the alert is active – alerts will only be executed during the selected days and times.
  10. The Cool Off Period is the number of minutes between alert updates. Once an alert is sent, if the problem has not been resolved, additional alerts will only be sent after the Cool Off Period has expired.
  11. Click on the Apply button to create the alert action. This button will be available after one or more fields in the Select Actions section have been completed.
  12. You will receive a Notice that the Action has been successfully applied.
  13. After the action has been assigned, the Configure window will display a different icon for events with actions.
    Display with action notation
  14. Actions may be removed by using the Remove button next to the Apply button.
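The interaction between the consecutive event count (step 7) and the Cool Off Period (step 10) can be sketched as a small gate that decides, at each rule check, whether an alert goes out. This is an illustrative model rather than Longitude's implementation: the class and function names are hypothetical, and resetting the cool-off when the problem clears is an assumption.

```python
from datetime import datetime, timedelta


def alert_delay(rule_interval, consecutive_count):
    """Delay before the first alert: the rule interval times the count."""
    return rule_interval * consecutive_count


class AlertGate:
    def __init__(self, consecutive_count, cool_off):
        self.consecutive_count = consecutive_count
        self.cool_off = cool_off
        self.streak = 0
        self.last_sent = None

    def observe(self, problem, now):
        """Return True when an alert should be sent for this rule check."""
        if not problem:
            self.streak = 0
            self.last_sent = None  # assumption: resolution resets the cool-off
            return False
        self.streak += 1
        if self.streak < self.consecutive_count:
            return False
        if self.last_sent is not None and now - self.last_sent < self.cool_off:
            return False
        self.last_sent = now
        return True


print(alert_delay(timedelta(minutes=1), 3))  # 0:03:00 (1 minute ping interval)
print(alert_delay(timedelta(hours=1), 3))    # 3:00:00 (1 hour disk space interval)

gate = AlertGate(consecutive_count=3, cool_off=timedelta(minutes=10))
start = datetime(2024, 1, 1, 12, 0)
sent = [gate.observe(True, start + timedelta(minutes=m)) for m in range(13)]
print([m for m, s in enumerate(sent) if s])  # [2, 12]
```

In the usage example the problem persists for 13 one-minute checks: the first alert is sent on the third consecutive event, and the next one only after the 10 minute Cool Off Period has passed.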

 

Configure Suppression

Application events may also be suppressed for a specified schedule through the Suppression tab under Dashboard >> Configure. This can be used to suppress recurring alerts that are known problems, e.g. high disk queue alerts during a nightly backup. To set up a scheduled event suppression:

  1. Click on Dashboard and then select Configure
  2. In the Events tree in the left pane, select the event to be suppressed
  3. Select the Suppression tab at the top of the right pane
  4. Configure the Suppression Filters to cover the Computers or Computer Groups to be suppressed, and any entities that should be suppressed, e.g. Disk ID or Process Name
  5. Adjust the Times of Day and Days of the Week for the period the event should be suppressed
  6. Click Apply
  7. With a scheduled event suppression configured, the icon for the event will be colored yellow in the left Events pane
  8. To remove a scheduled suppression, make sure no days are selected for the schedule, and click Apply
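A scheduled suppression amounts to testing whether the current time falls on a selected day and inside the selected time window. The sketch below assumes a window that starts and ends on the same day; the function name and the sample backup window of 01:00 to 03:00 on weekdays are hypothetical.

```python
from datetime import datetime, time


def in_schedule(now, days, start, end):
    """True when `now` is on a selected weekday (Monday = 0) and within [start, end)."""
    if now.weekday() not in days:
        return False
    return start <= now.time() < end


weekdays = {0, 1, 2, 3, 4}
backup_start, backup_end = time(1, 0), time(3, 0)

# 2024-01-01 is a Monday, 2024-01-06 is a Saturday.
print(in_schedule(datetime(2024, 1, 1, 2, 0), weekdays, backup_start, backup_end))  # True
print(in_schedule(datetime(2024, 1, 1, 4, 0), weekdays, backup_start, backup_end))  # False
print(in_schedule(datetime(2024, 1, 6, 2, 0), weekdays, backup_start, backup_end))  # False
```

With no days selected the check can never succeed, which mirrors step 8: clearing every day from the schedule removes the suppression.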