Edge Alarm Template - QoS

The following sections describe the alarm template settings:
• Playlist/Manifest File Alarms
• HTTP 4xx/5xx Alarms
• Derived Variant Startup Alarm
• TCP Alarms
• Buffer/Gear Change Risk Alarms
• QoE Alarms

Recommended values are provided for these alarms. They are generic starting points, and different values may be appropriate for your content.

Playlist/Manifest File Alarms

The following sections describe the Playlist/Manifest File alarms:

Playlist/Manifest Unavailable
Stalled Playlist/Manifest File

The Playlist/Manifest Unavailable alarm is triggered when the number of playlist/manifest unavailable errors exceeds the configured value. The alarm clears when the number of playlist/manifest unavailable errors is lower than the configured value.

Recommended value: 2

The Stalled Playlist/Manifest File alarm is triggered when a playlist/manifest file is not updated within the configured threshold. The alarm clears when the session has ended or the number of seconds without a change in the file is lower than the configured threshold.

Stalled Playlist recommended value: 60 seconds

The Media File Unavailable alarm is triggered when multiple retries for the same media file have failed. The logic tolerates individual 404 responses and logs a missed media file only after multiple unsuccessful attempts. The probe then moves on to keep pace with the live stream and logs the missing media file event.

Alarm Window Size recommended value: 1 Minute
Error Maximum recommended value: 0 (alerts every time a media file is missed)
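
The retry behavior described above can be sketched roughly as follows. This is only an illustration, not the probe's actual logic: the function name probe_media_file, the fetch callable, and the default of three attempts are all assumptions, since the text says only that "multiple" retries are made.

```python
def probe_media_file(fetch, url, max_attempts=3):
    """Try fetch(url) up to max_attempts times (hypothetical helper).

    fetch is any callable that returns an HTTP status code.
    Returns True as soon as one attempt succeeds, otherwise False.
    """
    for _ in range(max_attempts):
        if fetch(url) == 200:
            return True
    # All attempts failed: the caller would log a missed media file
    # and move on to the next segment to keep pace with the live stream.
    return False
```

With an Error Maximum of 0, every False result inside the alarm window would raise the alarm.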

HTTP 4xx/5xx Alarms

The HTTP 4xx/5xx alarms are triggered when the maximum number of HTTP 4xx or 5xx status codes has occurred during the sliding window. The alarms clear when the number of HTTP 4xx or 5xx status codes in the sliding alarm window is lower than the configured threshold.

Sliding Alarm Window Size (minutes) - recommended value: 10 min

Window Errors Maximum - recommended value: 10 errors

Status Code Exclusions - allows certain HTTP error codes to be excluded from the alarm.
With the recommended values, the alarm triggers if more than 10 HTTP errors occur in a 10-minute sliding window.
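
The sliding-window behavior can be sketched as follows. This is a minimal illustration, not the product's implementation: the class name SlidingWindowAlarm and its parameters are hypothetical, and it assumes the alarm is raised when the count exceeds the maximum and cleared when the count drops below it.

```python
from collections import deque


class SlidingWindowAlarm:
    """Counts HTTP 4xx/5xx responses inside a sliding time window."""

    def __init__(self, window_seconds=600, errors_max=10, excluded_codes=()):
        self.window_seconds = window_seconds
        self.errors_max = errors_max
        self.excluded_codes = set(excluded_codes)
        self.events = deque()  # timestamps (seconds) of counted errors
        self.active = False

    def record(self, timestamp, status_code):
        """Record one HTTP response and return the alarm state afterwards."""
        is_error = 400 <= status_code <= 599
        if is_error and status_code not in self.excluded_codes:
            self.events.append(timestamp)
        return self.evaluate(timestamp)

    def evaluate(self, now):
        """Drop errors that left the window, then update the alarm state."""
        while self.events and self.events[0] <= now - self.window_seconds:
            self.events.popleft()
        count = len(self.events)
        if count > self.errors_max:
            self.active = True   # assumed trigger: more than the maximum
        elif count < self.errors_max:
            self.active = False  # assumed clear: fewer than the maximum
        return self.active
```

For example, with the recommended values and 404 excluded, ten 503 responses leave the alarm inactive, the eleventh raises it, and it clears once those errors age out of the 10-minute window.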

Derived Video Startup Time (DVST) Alarm

The Derived Variant Startup Alarm is triggered when the time to download the configured number of segments or the configured media duration exceeds the threshold. It measures the time to download the first 60 seconds of video. The measured time includes DNS resolution, TCP connection, playlist/manifest download, and download of the configured number of media segments. The alarm clears when the session has ended.

Maximum (seconds) - recommended value: 30 seconds
The alarm triggers if the probe cannot download the first 60 seconds of video within 30 seconds.
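
The arithmetic behind this measurement can be sketched as below. The function names, the tuple layout for segments, and the choice to treat an unreachable target duration as an alarm are assumptions made for illustration only.

```python
def derived_startup_time(dns_s, tcp_s, playlist_s, segments, target_media_s=60.0):
    """Return the seconds needed to fetch the first target_media_s of media.

    segments is a list of (download_seconds, media_duration_seconds) tuples.
    Returns None if the segments never cover the target duration.
    """
    elapsed = dns_s + tcp_s + playlist_s
    covered = 0.0
    for download_s, duration_s in segments:
        elapsed += download_s
        covered += duration_s
        if covered >= target_media_s:
            return elapsed
    return None


def dvst_alarm(startup_s, maximum_s=30.0):
    """Alarm when startup exceeded the maximum or could not be measured."""
    return startup_s is None or startup_s > maximum_s
```

For instance, ten 6-second segments that each take 1 second to download, plus 1 second of DNS, TCP, and playlist overhead, give a startup time of 11 seconds, which is well under the recommended 30-second maximum.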

Connection Time Errors Alarm

The Connection Time Errors alarm is triggered when the TCP connection time exceeds the configured threshold. The alarm clears when the session has ended or the TCP connection has completed within the configured time threshold.

Threshold (milliseconds) - recommended value: 1000 ms

Connection Errors

The Connection Errors alarm is triggered when the maximum number of failed TCP connection attempts has occurred during the sliding window. The alarm clears when the number of failed TCP connection attempts in the sliding alarm window is lower than the configured threshold.

Sliding Alarm Window Size (minutes) - recommended value: 10 min

Window Errors Maximum - recommended value: 1 error
The alarm triggers if more than 1 TCP connection error occurs in a 10-minute sliding window.

Buffer/Gear Change Risk Alarms

The Buffer/Gear Change Risk alarm is triggered when a buffer/gear change risk (BGCR) event lasts longer than the duration threshold or the maximum number of BGCR threshold violations has occurred during the sliding alarm window. The alarm clears when the event duration is below the configured threshold or the number of BGCR violations in the sliding alarm window is lower than the configured threshold.

Duration Threshold (seconds) - recommended value: 60 seconds
The alarm triggers if a bad condition (slow delivery) persists for 60 seconds. It indicates a risk of low buffer and a gear change to a lower rendition.

Sliding Alarm Window Size (minutes) - recommended value: 10 min

Window Errors Maximum - recommended value: 3
The alarm triggers if more than 3 slow delivery events occur in 10 minutes. This setting is useful for intermittent events that do not last longer than 60 seconds.
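
The two trigger conditions can be combined as in the sketch below. It is a simplified illustration under stated assumptions: the function name bgcr_alarm and the (start, duration) event representation are hypothetical, and both conditions are evaluated only over events that started inside the sliding window.

```python
def bgcr_alarm(events, now, duration_threshold_s=60, window_s=600, errors_max=3):
    """Evaluate the BGCR alarm for a list of slow-delivery events.

    events is a list of (start_seconds, duration_seconds) tuples.
    """
    # Keep only events that started inside the sliding window.
    recent = [(start, dur) for start, dur in events if now - window_s < start <= now]
    # Condition 1: a single event lasted longer than the duration threshold.
    too_long = any(dur > duration_threshold_s for _, dur in recent)
    # Condition 2: more short events than the window maximum allows.
    too_many = len(recent) > errors_max
    return too_long or too_many
```

With the recommended values, one 90-second event raises the alarm on its own, while short events raise it only once a fourth one falls inside the same 10-minute window.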

Outage

An outage is any condition that leads to an interruption in video playback. The outage alarm triggers whenever the monitoring location detects an outage for a given asset or stream variant. Specify the minimum time that an asset or stream variant must remain in an outage state before the alarm is triggered.

Alarm Threshold (seconds) - recommended value: 60 seconds
The alarm triggers if no segments can be retrieved for 60 seconds.
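
A minimal sketch of the outage threshold follows, assuming an outage is measured as the time since the last successfully downloaded segment. The class name OutageTracker and its methods are hypothetical and do not reflect the product's API.

```python
class OutageTracker:
    """Tracks the time since the last successfully downloaded segment."""

    def __init__(self, start_time, threshold_s=60):
        self.threshold_s = threshold_s
        self.last_success = start_time  # monitoring start counts as the baseline

    def segment_downloaded(self, timestamp):
        """Record a successful segment download."""
        self.last_success = timestamp

    def in_alarm(self, now):
        """True once no segment has arrived for threshold_s seconds."""
        return now - self.last_success >= self.threshold_s
```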