Edge Alarm Template - QOS
The following sections describe the alarm template settings:
• Playlist/Manifest File Alarms
• HTTP 4xx/5xx Alarms
• Derived Variant Startup Alarm
• TCP Alarms
• Buffer/Gear Change Risk Alarms
• QoE Alarms
Recommended values are given for each alarm. These values are generic, and different values may be more appropriate for your content.
Playlist/Manifest File Alarms
The following sections describe the Playlist/Manifest File alarms:
• Playlist/Manifest Unavailable
• Stalled Playlist/Manifest File
The Playlist/Manifest Unavailable alarm is triggered when the number of playlist/manifest unavailable errors exceeds the configured value. The alarm clears when the number of playlist/manifest unavailable errors is lower than the configured value.
Recommended value: 2
The Stalled Playlist/Manifest File alarm is triggered when a playlist/manifest file is not updated within the configured threshold. The alarm clears when the session has ended or the number of seconds without a change in the playlist/manifest file is lower than the configured threshold.
Stalled Playlist recommended value: 60 seconds
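The stalled-file check above can be sketched roughly as follows. This is an illustration only, not the product's implementation: the class name `StalledFileAlarm`, its fields, and the use of a plain string comparison to detect a change are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StalledFileAlarm:
    """Hypothetical model of a stalled-file check (names are assumptions)."""
    threshold_seconds: float = 60.0       # recommended value from this page
    last_body: Optional[str] = None       # content seen on the previous fetch
    last_change: Optional[float] = None   # time the content last changed
    active: bool = False

    def observe(self, body: str, now: float) -> bool:
        """Record one fetch at time `now` (seconds) and return the alarm state."""
        if self.last_body is None or body != self.last_body:
            # First fetch, or the file changed: restart the stall timer.
            self.last_body = body
            self.last_change = now
        seconds_without_change = now - self.last_change
        # Active once the file has gone unchanged for the full threshold;
        # clears again as soon as the unchanged time drops below it.
        self.active = seconds_without_change >= self.threshold_seconds
        return self.active

    def end_session(self) -> None:
        """The alarm also clears when the session ends."""
        self.active = False
```

Calling `observe` with the same content for 60 seconds or more activates the alarm in this sketch; a fetch that returns changed content clears it.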
The Media File Unavailable alarm is triggered when multiple retries for the same media file have failed. The logic tolerates individual 404 responses and logs a missed media file only after multiple unsuccessful attempts. The probe then moves on to keep pace with the live stream and logs the missing media file event.
Alarm Window Size recommended value: 1 Minute
Error Maximum recommended value: 0 (alerts every time a media file is missed)
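A rough sketch of the retry behaviour described above, under stated assumptions: the number of attempts (3 here) is not given on this page, and the function names and the `fetch` callable are hypothetical stand-ins for whatever the probe actually does.

```python
from typing import Callable, Iterable, List


def fetch_with_retries(fetch: Callable[[str], int], url: str,
                       max_attempts: int = 3) -> bool:
    """Return True if any attempt yields HTTP 200, False once all attempts fail.

    max_attempts=3 is an assumed value; the page only says "multiple".
    """
    for _ in range(max_attempts):
        if fetch(url) == 200:
            return True
    return False


def find_missed_media_files(fetch: Callable[[str], int], urls: Iterable[str],
                            max_attempts: int = 3) -> List[str]:
    """Try each media file in order and collect the ones that stay unavailable."""
    missed = []
    for url in urls:
        if not fetch_with_retries(fetch, url, max_attempts):
            # Log the miss and move on to keep pace with the live stream.
            missed.append(url)
    return missed


def media_alarm_triggered(missed_in_window: int, error_maximum: int = 0) -> bool:
    """With Error Maximum 0, a single missed media file in the window alarms."""
    return missed_in_window > error_maximum
```

In this sketch a file that returns 404 twice and then 200 is not counted as missed, which matches the tolerance for occasional 404 responses.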
HTTP 4xx/5xx Alarms
The HTTP 4xx/5xx alarms are triggered when the maximum number of HTTP 4xx or 5xx status codes has occurred during the sliding window. The alarms clear when the number of HTTP 4xx or 5xx status codes in the sliding alarm window is lower than the configured threshold.
Sliding Alarm Window Size (minutes) - recommended value: 10 min
Window Errors Maximum - recommended value: 10 errors
Status Code Exclusions - allows certain HTTP error codes to be excluded from the alarm
With the recommended values, the alarm triggers if there are more than 10 HTTP errors within a 10-minute sliding window.
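The sliding-window logic can be illustrated with a small sketch. The class and parameter names are hypothetical, and treating the alarm as active only when the count is strictly greater than the maximum is an assumption based on the "more than 10" wording above.

```python
from collections import deque
from typing import Deque, FrozenSet


class HttpErrorWindowAlarm:
    """Hypothetical sliding-window counter for HTTP 4xx/5xx responses."""

    def __init__(self, window_minutes: float = 10, errors_maximum: int = 10,
                 excluded_codes: FrozenSet[int] = frozenset()):
        self.window_seconds = window_minutes * 60
        self.errors_maximum = errors_maximum
        self.excluded_codes = excluded_codes
        self._error_times: Deque[float] = deque()

    def record(self, status_code: int, now: float) -> bool:
        """Record one response at time `now` (seconds); return the alarm state."""
        is_error = 400 <= status_code <= 599
        if is_error and status_code not in self.excluded_codes:
            self._error_times.append(now)
        return self.is_active(now)

    def is_active(self, now: float) -> bool:
        """Drop errors that slid out of the window, then compare to the maximum."""
        cutoff = now - self.window_seconds
        while self._error_times and self._error_times[0] <= cutoff:
            self._error_times.popleft()
        return len(self._error_times) > self.errors_maximum
```

For example, constructing the alarm with `excluded_codes=frozenset({404})` would keep 404 responses from counting toward the maximum, which is one way to model the Status Code Exclusions setting.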
Derived Video Startup Time (DVST) Alarm
The Derived Variant Startup Alarm is triggered when the time to download the configured number of segments or configured media duration exceeds the threshold. This measures the time to download the first 60 seconds of video. The time includes DNS, TCP connection, playlist/manifest download, and download of the configured number of media segments. The alarm clears when the session has ended.
Maximum (seconds) - recommended value: 30 seconds
With this value, the alarm triggers if the probe cannot download the first 60 seconds of video within 30 seconds.
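As a worked illustration of how the startup time is composed, the sketch below simply adds the phases listed above. It assumes the phases run one after another, and all names (`StartupTimings`, `derived_startup_time`, `startup_alarm`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StartupTimings:
    """Per-phase timings in seconds (hypothetical structure)."""
    dns_s: float
    tcp_connect_s: float
    manifest_download_s: float
    segment_download_s: List[float]  # one entry per configured media segment


def derived_startup_time(t: StartupTimings) -> float:
    """Sum of all phases, assuming they run sequentially."""
    return (t.dns_s + t.tcp_connect_s + t.manifest_download_s
            + sum(t.segment_download_s))


def startup_alarm(t: StartupTimings, maximum_s: float = 30.0) -> bool:
    """True when the derived startup time exceeds the Maximum (seconds) setting."""
    return derived_startup_time(t) > maximum_s


# Ten 6-second segments (60 seconds of video), each downloaded in 2.5 seconds.
healthy = StartupTimings(0.05, 0.10, 0.35, [2.5] * 10)
# The same stream on a slower path: 3.5 seconds per segment.
slow = StartupTimings(0.05, 0.10, 0.35, [3.5] * 10)
```

Here `healthy` totals about 25.5 seconds and stays under the 30-second maximum, while `slow` totals about 35.5 seconds and would alarm.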
Connection Time Errors Alarm
The Connection Time Errors alarm is triggered when the TCP connection time exceeds the configured threshold. The alarm clears when the session has ended or the TCP connection has completed within the configured time threshold.
Threshold (milliseconds) - recommended value: 1000 ms
Connection Errors
The Connection Errors alarm is triggered when the maximum number of failed TCP connection attempts has occurred during the sliding window. The alarm clears when the number of failed TCP connection attempts in the sliding alarm window is lower than the configured threshold.
Sliding Alarm Window Size (minutes) - recommended value: 10 min
Window Errors Maximum - recommended value: 1 error
With the recommended values, the alarm triggers if there is more than 1 TCP connection error within a 10-minute sliding window.
Buffer/Gear Change Risk Alarms
The Buffer/Gear Change Risk alarm is triggered when a buffer/gear change risk (BGCR) event lasts longer than the duration threshold or the maximum number of BGCR threshold violations has occurred during the sliding alarm window. The alarm clears when the event duration is below the configured threshold or the number of BGCR violations in the sliding alarm window is lower than the configured threshold.
Duration Threshold (seconds) - recommended value: 60 seconds
The alarm triggers if a bad condition (slow delivery) persists for 60 seconds. It indicates a risk of low buffer and a gear change to a lower rendition.
Sliding Alarm Window Size (minutes) - recommended value: 10 min
Window Errors Maximum - recommended value: 3
The alarm triggers if more than 3 slow-delivery events occur within 10 minutes. This catches intermittent events that do not last longer than 60 seconds.
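The two BGCR conditions (one long event, or too many short ones) can be combined as in the sketch below. Representing each slow-delivery event as a `(start_time, duration)` pair and counting events by their start time inside the window are assumptions made for illustration; the function name is hypothetical.

```python
from typing import List, Tuple


def bgcr_alarm(events: List[Tuple[float, float]], now: float,
               duration_threshold_s: float = 60.0,
               window_minutes: float = 10.0,
               errors_maximum: int = 3) -> bool:
    """Evaluate both BGCR conditions at time `now` (all times in seconds).

    events: (start_time, duration) pairs for slow-delivery events.
    """
    window_start = now - window_minutes * 60
    # Only events that started inside the sliding window are considered.
    recent = [(start, dur) for start, dur in events if start > window_start]
    # Condition 1: a single event lasted longer than the duration threshold.
    long_event = any(dur > duration_threshold_s for _, dur in recent)
    # Condition 2: more short events than the window maximum allows.
    too_many = len(recent) > errors_maximum
    return long_event or too_many
```

With the recommended values, one 75-second event is enough to alarm, and so are four 10-second events inside the same 10-minute window; three short events are not.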
Outage
An outage is any condition that leads to an interruption in video playback. The outage alarm triggers whenever the monitoring location detects an outage for a given asset or stream variant. Specify the minimum time that an asset or stream variant must be in an outage state before the alarm is triggered.
Alarm Threshold (seconds) - recommended value: 60 seconds
With this value, the alarm triggers if the probe cannot retrieve any segments for 60 seconds.