Monitoring

The CloudWatch monitoring service lets the owner monitor the operation of virtual machines.

Definitions

CloudWatch defines several types of entities: Metrics, Namespaces, and Alarms.

Metrics

Metrics are time-ordered data sets. Each Metric belongs to a Namespace, which indicates the cloud system whose parameters the Metric describes. For example, Metrics in the “AWS/EC2” Namespace store parameters of Instances (CPU load, incoming Internet traffic, etc.). Each Metric is also associated with a Dimension, a key/value pair that identifies the Metric. For example, the Dimension of a Metric from “AWS/EC2” is the pair {"InstanceID": [<Instance ID>]}, where the value is a one-element list containing the ID of the Instance whose data the Metric stores.
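As a rough illustration of how a Namespace, a Metric name, and a Dimension together identify one data series, here is a minimal Python sketch. The MetricKey class and its field names are hypothetical and not part of any CloudWatch API, and the Instance ID is a made-up placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricKey:
    """Hypothetical identifier for one Metric (not a CloudWatch API type)."""
    namespace: str         # cloud system the Metric describes, e.g. "AWS/EC2"
    name: str              # metric name, e.g. "CPUUtilization"
    dimension_key: str     # e.g. "InstanceID"
    dimension_value: str   # ID of the Instance whose data the Metric stores

    def dimension(self) -> dict:
        # The Dimension as described above: a key mapped to a one-element list.
        return {self.dimension_key: [self.dimension_value]}


# "i-12345678" is a placeholder Instance ID used only for illustration.
cpu = MetricKey("AWS/EC2", "CPUUtilization", "InstanceID", "i-12345678")
print(cpu.dimension())
```

Because the class is frozen, two keys with the same Namespace, name, and Dimension compare equal and can serve as dictionary keys for stored data.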

Alarms

An Alarm lets the user monitor statistics of Metric values and receive notifications when they meet a certain condition. For example, the user can be notified when the CPU load of a certain Instance exceeds 80% during the last hour.

Alarm has the following parameters:

  • Name – Alarm name, unique for each user
  • Description – short Alarm description (optional)
  • Threshold – value against which metric values are compared
  • Comparison Operator – operation used to compare metric values with the threshold value
  • Evaluation Periods – number of configured time periods over which metric values are collected
  • Period Duration – duration of each period (the overall duration of all periods shall not exceed 24 hours)
  • Statistics – arithmetic operation performed on the metric values in each period
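The parameters above, including the 24-hour cap on the overall duration of periods, can be sketched as a small Python structure. The AlarmSpec class, its field names, the ">" operator spelling, and the use of seconds for the period duration are assumptions made for illustration; they are not the service's actual API.

```python
from dataclasses import dataclass

MAX_TOTAL_SECONDS = 24 * 60 * 60  # overall duration of periods: at most 24 hours


@dataclass
class AlarmSpec:
    """Hypothetical container for the Alarm parameters listed above."""
    name: str
    threshold: float
    comparison_operator: str   # assumed spelling, e.g. ">" or "<="
    evaluation_periods: int
    period_seconds: int        # assumed unit: seconds
    statistic: str             # e.g. "Average", "Maximum"
    description: str = ""      # optional

    def total_seconds(self) -> int:
        return self.evaluation_periods * self.period_seconds

    def validate(self) -> None:
        if self.evaluation_periods < 1 or self.period_seconds < 1:
            raise ValueError("periods and their duration must be positive")
        if self.total_seconds() > MAX_TOTAL_SECONDS:
            raise ValueError("overall duration of periods exceeds 24 hours")


spec = AlarmSpec("high-cpu", 80.0, ">", 12, 300, "Maximum")
spec.validate()  # 12 periods of 300 s cover one hour, within the limit
```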

Statistics list:

  • Average – average value
  • Sum – sum of values
  • Maximum – maximum value
  • Minimum – minimum value
  • Data Samples – number of data points recorded for the metric
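To make the five statistics concrete, the following Python sketch applies each of them to the values collected in a single period. The function name compute_statistic and the sample values are made up for illustration.

```python
def compute_statistic(statistic: str, values: list[float]) -> float:
    """Apply one of the listed statistics to the values collected in a period."""
    if not values:
        raise ValueError("no data in this period")
    operations = {
        "Average": lambda v: sum(v) / len(v),
        "Sum": sum,
        "Maximum": max,
        "Minimum": min,
        "Data Samples": len,
    }
    return operations[statistic](values)


cpu_load = [35.0, 90.0, 55.0]  # made-up CPU load samples for one period
for stat in ("Average", "Sum", "Maximum", "Minimum", "Data Samples"):
    print(stat, compute_statistic(stat, cpu_load))
```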

Tip

If the number of consecutive periods is set to 1 when the Alarm is created and the period duration is set to 3600 seconds, the Alarm checks, for example, the maximum metric value over the last hour. If the number of consecutive periods is set to 12 and the period duration is set to 300 seconds, the Alarm checks the maximum metric value in each five-minute period over the last hour.

Changing the number of Evaluation Periods and their duration helps determine how significant, for example, CPU load peaks are.
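The difference between the two configurations in the tip can be sketched in Python. The function maxima_per_period and the sample data are hypothetical; the sketch only shows how the same hour of data splits into one period or into twelve.

```python
def maxima_per_period(samples, now, evaluation_periods, period_seconds):
    """Return the maximum value in each of the most recent periods.

    samples: list of (timestamp_seconds, value) pairs.
    Periods without data yield None. The oldest period comes first.
    """
    start = now - evaluation_periods * period_seconds
    buckets = [[] for _ in range(evaluation_periods)]
    for timestamp, value in samples:
        if start <= timestamp < now:
            buckets[(timestamp - start) // period_seconds].append(value)
    return [max(b) if b else None for b in buckets]


# Made-up CPU load: one sample per minute for an hour, with a short peak at minute 30.
now = 3600
samples = [(minute * 60, 95.0 if minute == 30 else 40.0) for minute in range(60)]

print(maxima_per_period(samples, now, 1, 3600))   # one 1-hour period
print(maxima_per_period(samples, now, 12, 300))   # twelve 5-minute periods
```

With a single one-hour period, the brief peak determines the maximum for the whole hour. With twelve five-minute periods, the peak appears in only one of the twelve values, so a short spike can be told apart from sustained load.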

The Alarm can be in one of three Statuses:

  • INSUFFICIENT DATA – not enough data has been gathered to evaluate the Metric
  • OK – the metric value does not match the threshold condition
  • ALARM – the metric value matches the threshold condition
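One way to picture how a status could be derived is the Python sketch below. It rests on assumptions the text does not confirm: that ALARM requires every evaluated period to match the threshold condition, that any period without data yields INSUFFICIENT DATA, and that operators are spelled as shown. The function alarm_status is hypothetical.

```python
import operator

# Assumed operator spellings; the supported operators are not listed here.
COMPARISONS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def alarm_status(period_values, comparison, threshold):
    """Derive a status from one statistic value per evaluation period.

    None marks a period without data. Assumption: ALARM requires every
    evaluated period to match the threshold condition.
    """
    if not period_values or any(v is None for v in period_values):
        return "INSUFFICIENT DATA"
    compare = COMPARISONS[comparison]
    if all(compare(v, threshold) for v in period_values):
        return "ALARM"
    return "OK"


print(alarm_status([85.0, 91.0, 88.0], ">", 80.0))
print(alarm_status([85.0, 40.0, 88.0], ">", 80.0))
print(alarm_status([85.0, None, 88.0], ">", 80.0))
```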

The Alarm can perform Actions when it switches to a certain status. An Action sends an email to user-specified addresses; the email contains the time of the status change, the reason for the change, and certain additional information.
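A notification of this kind might be composed as in the Python sketch below, which uses only the standard library. The subject line, body layout, alarm name, and address are invented for illustration and do not reproduce the format of the emails the service actually sends; the message is built but not sent.

```python
from datetime import datetime, timezone
from email.message import EmailMessage


def build_notification(alarm_name, new_status, reason, changed_at, recipients):
    """Compose (but do not send) a notification about a status change."""
    message = EmailMessage()
    message["Subject"] = f"Alarm {alarm_name} is now {new_status}"
    message["To"] = ", ".join(recipients)
    message.set_content(
        f"Alarm: {alarm_name}\n"
        f"New status: {new_status}\n"
        f"Changed at: {changed_at.isoformat()}\n"
        f"Reason: {reason}\n"
    )
    return message


message = build_notification(
    "high-cpu", "ALARM", "Maximum CPUUtilization exceeded 80",
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    ["ops@example.com"],  # placeholder address
)
print(message)
```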

Metrics list

AWS/EC2 namespace

CPUUtilization      Processor load.
DiskReadBytes       Number of bytes read from all Volumes attached to the Instance.
DiskWriteBytes      Number of bytes written to all Volumes attached to the Instance.
DiskReadOps         Read operations from all Volumes attached to the Instance.
DiskWriteOps        Write operations to all Volumes attached to the Instance.
NetworkOut          Number of bytes sent via all network interfaces of the Instance.
NetworkIn           Number of bytes received from all network interfaces of the Instance.
NetworkPacketsOut   Number of packets sent via all network interfaces of the Instance.
NetworkPacketsIn    Number of packets received from all network interfaces of the Instance.

AWS/EBS namespace

VolumeReadBytes    Number of bytes read from this volume.
VolumeWriteBytes   Number of bytes written to this volume.
VolumeReadOps      Completed read operations from this volume.
VolumeWriteOps     Completed write operations to this volume.

The Monitoring tab

Open the Monitoring tab to set up VM performance monitoring. The main menu of the console consists of two sections: Alarms and Metrics.

Alarms section

A list of all user-created Alarms is displayed in the Alarms section.

The following parameters are shown for each Alarm:

  • Status
  • Alarm name
  • Condition for sending a notification when the Alarm triggers
  • ID of the Instance or Volume for which the Alarm is created
../_images/alarm_list.png

Click the Alarm name to open the Alarm page.

Metrics section

A list of all Metrics available to the user is displayed in the Metrics section.

The following parameters are specified for each Metric:

  • Measurement
  • Measured value
../_images/metric_list.png

Click the measured Metric value to open the Metric page.

Alarm Page

The Alarm page contains two tabs: Information and History.

The Information tab

This tab shows:

  • Alarm status and its reason
  • Alarm parameters
  • Interactive diagram of the Metric monitored by the Alarm
  • Controls for modifying or deleting the Alarm
../_images/alarm_info.png

The History tab

This tab shows a list of Alarm-related events.

../_images/alarm_history.png

Three types of events are shown:

  • Alarm modification (creating, deleting and changing)
  • Alarm status change
  • Action execution (sending email)

Creating a new alarm

Click the button_setup button on the Alarms tab to create a new Alarm.

A dialog for creating the new Alarm opens. The dialog consists of three windows (the third one is optional).

In the first window, select Measurement and Metric to be monitored by the Alarm.

../_images/alarm_redact_1.png

In the second window, set the other Alarm parameters. This window also displays a Metric diagram and the selected threshold value, which helps you choose the threshold more accurately based on the latest Metric values.

Alarm creation can be completed in this window.

../_images/alarm_redact_2.png

In the third (optional) window, you can configure Actions: select the Alarm status that triggers an Action and enter the email addresses for notifications.

../_images/alarm_redact_3.png

Changing an alarm

Click the button_alarm_modify button on the Alarm page to modify an Alarm.

A dialog similar to the Alarm creation dialog opens, starting at the second window.

Metrics page

The Metrics page shows an interactive Metric diagram. You can change the period length and the statistic applied to each period, which helps you evaluate trends in the data and set Alarm parameters more accurately. To view Metric data for a specific period of time, set the timeframe in the corresponding fields under the diagram.

Controls on the page allow you to:

  • Create an Alarm for the Metric (button_alarm_create_from_metric)
  • Display a list of all Alarms for the Measurement (Instance, volume) corresponding to the Metric (button_list_alarms_for_id)

CloudWatch limitations

The following limitations apply to CloudWatch operation:

Parameter                                   Limit
Number of Alarms                            100
Number of notification (email) recipients   5
Metrics storage period, days                7
Alarms History retention period, days       7
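As a rough sketch, the first two limits could be checked on the client side before an Alarm is created. The function check_limits is hypothetical, and the scope of each limit is an assumption: the table does not say whether the Alarm limit is counted per user or whether the recipient limit is counted per Alarm.

```python
# Limits copied from the table above.
MAX_ALARMS = 100
MAX_RECIPIENTS = 5  # assumed here to apply to each Alarm


def check_limits(existing_alarm_count, recipients):
    """Return a list of limit violations for a would-be new Alarm."""
    problems = []
    if existing_alarm_count + 1 > MAX_ALARMS:
        problems.append(f"number of Alarms would exceed {MAX_ALARMS}")
    if len(recipients) > MAX_RECIPIENTS:
        problems.append(f"more than {MAX_RECIPIENTS} email recipients")
    return problems


print(check_limits(99, ["ops@example.com"]))         # within both limits
print(check_limits(100, ["user@example.com"] * 6))   # violates both limits
```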