Kubo Ryosuke
Posted on November 3, 2021
Introduction
An MPD, the manifest file of MPEG-DASH, lists segment URLs and related attributes, just like an HLS playlist.
An HLS playlist is a simple list of URLs and attributes and is very easy to read. An MPD, however, has a complex structure and is more difficult to read.
In this post, I describe how segments are represented in an MPD.
MPD Examples
You can find the Reference Client and many sample DASH streams at the DASH Industry Forum.
BaseURL
If the MPD has a BaseURL tag, segment URLs are resolved as relative paths from the value of that tag.
<BaseURL>http://localhost/mystream/hd/</BaseURL>
BaseURL itself can be a URL relative to the MPD.
<BaseURL>./hd/</BaseURL>
If all segments are concatenated into one file, BaseURL can hold the full URL of that file.
<BaseURL>http://localhost/mystream/video.mp4</BaseURL>
In this case, the player downloads the segments with HTTP range requests. Using SegmentBase or SegmentList/SegmentURL, both described below, the MPD can tell the player the byte range of each segment.
SegmentTemplate
In the live profile, the SegmentTemplate tag is usually used.
It works for both live streams (dynamic-type MPDs) and VOD streams (static-type MPDs).
ABEMA, our video streaming service, uses the SegmentTemplate tag for all streams.
media attribute and initialization attribute
The SegmentTemplate@media attribute and the SegmentTemplate@initialization attribute indicate the media segments (MP4 files containing moof and mdat boxes) and the initialization segment (an MP4 file containing ftyp and moov boxes), respectively.
Before use, however, you must resolve the template identifiers enclosed in $ signs that are embedded in those values.
<SegmentTemplate
  media="$RepresentationID$/$Time$.mp4"
  initialization="$RepresentationID$/init.mp4" />
There are five template identifiers. In most cases, $RepresentationID$ together with either $Time$ or $Number$ is used.
The $RepresentationID$ identifier is replaced with the value of Representation@id.
In the following sample, there are two Representation tags at the same level as the SegmentTemplate tag.
<AdaptationSet contentType="video" mimeType="video/mp4" segmentAlignment="true">
<SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4" initialization="$RepresentationID$/init.mp4" />
<Representation id="video-hd" bandwidth="2000000" frameRate="30000/1001" height="720" width="1280" scanType="progressive" />
<Representation id="video-sd" bandwidth="1000000" frameRate="30000/1001" height="480" width="854" scanType="progressive" />
</AdaptationSet>
Substituting video-hd and video-sd for $RepresentationID$ in the @initialization attribute yields the paths video-hd/init.mp4 and video-sd/init.mp4. These are the initialization segments of the HD and SD renditions, respectively.
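Template resolution itself is just string substitution. Here is a minimal sketch in Python; the function name is my own, and it ignores the width-formatting syntax (such as $Number%05d$) that the spec also defines:

```python
# Resolve DASH template identifiers in a SegmentTemplate attribute value.
# Sketch only: real players also handle width formatting ($Number%05d$)
# and the $$ escape defined by ISO/IEC 23009-1.
def resolve_template(template, representation_id, time=None, number=None):
    url = template.replace("$RepresentationID$", representation_id)
    if time is not None:
        url = url.replace("$Time$", str(time))
    if number is not None:
        url = url.replace("$Number$", str(number))
    return url

print(resolve_template("$RepresentationID$/init.mp4", "video-hd"))
# video-hd/init.mp4
```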
So what, then, does the $Time$ identifier represent?
With SegmentTimeline
The SegmentTimeline tag enumerates the relative timestamp and duration of each segment, as follows:
<?xml version="1.0" encoding="utf-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" availabilityStartTime="1970-01-01T00:00:00Z" profiles="urn:mpeg:dash:profile:isoff-live:2011" type="dynamic" minBufferTime="PT5.000000S" publishTime="2021-10-28T13:07:58Z" minimumUpdatePeriod="PT5.000000S" timeShiftBufferDepth="PT60.000000S" suggestedPresentationDelay="PT15.000000S">
<BaseURL>http://localhost/mystream/</BaseURL>
<Period id="1" start="PT1609426800S">
<AdaptationSet mimeType="video/mp4" segmentAlignment="true">
<SegmentTemplate timescale="90000" presentationTimeOffset="10786776" media="$RepresentationID$/$Time$.mp4" initialization="$RepresentationID$/init.mp4">
<SegmentTimeline>
<S d="357357" t="11771760" />
<S d="360360" r="3"/>
<S d="357357" />
</SegmentTimeline>
</SegmentTemplate>
<Representation id="video-hd" bandwidth="2000000" frameRate="30000/1001" height="720" width="1280" scanType="progressive" />
<Representation id="video-sd" bandwidth="1000000" frameRate="30000/1001" height="480" width="854" scanType="progressive" />
</AdaptationSet>
<AdaptationSet mimeType="audio/mp4" segmentAlignment="true">
<SegmentTemplate timescale="48000" presentationTimeOffset="5752947" media="$RepresentationID$/$Time$.mp4" initialization="$RepresentationID$/init.mp4">
<SegmentTimeline>
<S d="191488" t="6278272"/>
<S d="192512" r="4"/>
</SegmentTimeline>
</SegmentTemplate>
<Representation id="audio-high" bandwidth="190000">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2" />
</Representation>
<Representation id="audio-low" bandwidth="64000">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="1" />
</Representation>
</AdaptationSet>
</Period>
</MPD>
The SegmentTimeline tag contains S tags, each of which describes one or more segments.
If an S tag has a non-zero @r attribute, @r is the repeat count of that S tag. (A negative value means the repetition is open-ended.) For example, r="3" means three additional segments; in other words, four consecutive segments with the same duration.
The S@d attribute is the duration of the segment, and the S@t attribute is its earliest timestamp.
The second and subsequent segments can omit the @t attribute; its value is then the previous segment's @t plus @d.
Inserting those @t values into the $Time$ identifier gives you the media segment URLs.
For example, the earliest segments of the sample above are http://localhost/mystream/video-hd/11771760.mp4 and http://localhost/mystream/video-sd/11771760.mp4.
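The expansion of S tags into concrete timestamps can be sketched in Python. The dict shapes and function name below are my own, but the @t/@d/@r arithmetic follows the rules above:

```python
# Expand the video SegmentTimeline of the sample above into (t, d) pairs,
# honoring the @r repeat count. An omitted @t defaults to the previous
# segment's @t plus @d.
def expand_timeline(s_tags):
    segments = []
    t = 0
    for s in s_tags:
        t = s.get("t", t)              # explicit @t resets the running total
        for _ in range(s.get("r", 0) + 1):
            segments.append((t, s["d"]))
            t += s["d"]
    return segments

s_tags = [{"d": 357357, "t": 11771760},
          {"d": 360360, "r": 3},
          {"d": 357357}]
for t, d in expand_timeline(s_tags):
    print(f"video-hd/{t}.mp4  (duration {d / 90000:.2f} s)")
# The first line printed is video-hd/11771760.mp4, matching the URL above.
```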
S@t and S@d are always integer values. Dividing them by the SegmentTemplate@timescale attribute gives the timestamp and duration in seconds, respectively.
So in the sample above, the duration of the earliest video segment is 357357 / 90000 = 3.97 seconds, and its timestamp is 11771760 / 90000 = 130.797 seconds. These values match the values in the MP4 boxes.
Subtracting SegmentTemplate@presentationTimeOffset from S@t gives the elapsed time since the period start.
For the sample above, the elapsed time of the latest video segment from the period start is calculated by the following expression:
(11771760 + 357357 + 360360 * 4 - 10786776) / 90000 = 30.93 seconds
Adding Period@start to MPD@availabilityStartTime gives the absolute time of the period start.
So in the sample above, the period start is 2020-12-31T15:00:00Z and the composition time of the latest video segment is 2020-12-31T15:00:30.930Z.
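That arithmetic can be checked in Python with the values from the sample MPD (standard library only; variable names are my own):

```python
from datetime import datetime, timedelta, timezone

availability_start = datetime(1970, 1, 1, tzinfo=timezone.utc)  # MPD@availabilityStartTime
period_start = timedelta(seconds=1609426800)                    # Period@start (PT1609426800S)
timescale = 90000                                               # SegmentTemplate@timescale
pto = 10786776                                                  # @presentationTimeOffset

# @t of the latest video segment: the first @t plus all preceding durations.
latest_t = 11771760 + 357357 + 360360 * 4

elapsed = (latest_t - pto) / timescale                 # about 30.93 seconds
print(availability_start + period_start)               # 2020-12-31 15:00:00+00:00
print(availability_start + period_start + timedelta(seconds=elapsed))
```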
Only SegmentTemplate
As described above, SegmentTimeline lets you write the timestamp (composition time) and duration of each segment.
On the other hand, if all segments have the same duration, SegmentTimeline is not mandatory.
<AdaptationSet contentType="video" mimeType="video/mp4" segmentAlignment="true">
<SegmentTemplate duration="2" startNumber="1000" initialization="$RepresentationID$/init.mp4" media="$RepresentationID$/$Number$.mp4" />
<Representation id="video-300k" bandwidth="300000" codecs="avc1.64001e" frameRate="30" height="360" width="640" />
</AdaptationSet>
In the sample above, SegmentTemplate@duration is the segment duration: all segments are 2 seconds long.
The @media attribute contains the $Number$ identifier, which is the index of the segment, starting from @startNumber.
So the segment URLs of this MPD are the following:
video-300k/1000.mp4 --- earliest 2 seconds
video-300k/1001.mp4 --- 2nd 2 seconds
video-300k/1002.mp4 --- 3rd 2 seconds
:
video-300k/1099.mp4 --- 100th 2 seconds
:
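With fixed-duration segments, a player can map any playback position straight to a segment number, with no timeline to consult. A sketch (the function name is my own):

```python
# Map a playback position in seconds to a segment URL under the
# @duration/@startNumber scheme above: each segment covers a fixed
# 2-second window, so index = startNumber + floor(position / duration).
def segment_url_at(position, duration=2, start_number=1000,
                   media="video-300k/$Number$.mp4"):
    number = start_number + int(position // duration)
    return media.replace("$Number$", str(number))

print(segment_url_at(0))    # video-300k/1000.mp4
print(segment_url_at(199))  # video-300k/1099.mp4 (the 100th segment)
```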
The examples in this post use the $Time$ identifier with SegmentTimeline, or the $Number$ identifier without it. These are not the only combinations, however: you can also use the $Number$ identifier together with SegmentTimeline.
Relation with Representation
In the samples above, the SegmentTemplate tag and the Representation tags are placed at the same level. However, a SegmentTemplate tag can also be placed inside each Representation tag.
<Representation id="video-hd" bandwidth="2000000" frameRate="30000/1001" height="720" width="1280" scanType="progressive">
<SegmentTemplate timescale="90000" presentationTimeOffset="10786776" media="hd/$Time$.mp4" initialization="hd/init.mp4">
...
</SegmentTemplate>
</Representation>
<Representation id="video-sd" bandwidth="1000000" frameRate="30000/1001" height="480" width="854" scanType="progressive">
<SegmentTemplate timescale="90000" presentationTimeOffset="10786776" media="sd/$Time$.mp4" initialization="sd/init.mp4">
...
</SegmentTemplate>
</Representation>
SegmentList, Initialization and SegmentURL
SegmentURL tags hold the URL or byte range of each segment.
For example, you can place an Initialization tag and SegmentURL tags inside a SegmentList tag as follows:
<SegmentList>
<Initialization sourceURL="init.mp4" />
<SegmentURL media="0.mp4" />
<SegmentURL media="1.mp4" />
<SegmentURL media="2.mp4" />
</SegmentList>
This is similar to an HLS playlist and easy to understand.
SegmentBase
If BaseURL holds the URL of a single MP4 file and the MPD has neither SegmentList nor SegmentTemplate, the player has no way of knowing the byte range of each segment.
The SegmentBase@indexRange attribute indicates the byte range of the index information (e.g. the sidx box) that in turn describes the byte range of each segment.
<BaseURL>http://localhost/sample.mp4</BaseURL>
<SegmentBase indexRange="896-1730"/>
With this example, the player first sends a range request for bytes 896-1730. The response contains the timestamp and byte offset of every segment, and the player then sends a range request for each segment it needs.
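The request flow can be sketched with the Python standard library; fetch_range is my own helper, and http://localhost/sample.mp4 is the placeholder URL from the example above:

```python
import urllib.request

# Fetch an arbitrary byte range of a remote file with an HTTP Range request.
def fetch_range(url, first, last):
    req = urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})
    with urllib.request.urlopen(req) as res:   # expects a 206 Partial Content
        return res.read()

# 1) Fetch the index (e.g. the sidx box) named by SegmentBase@indexRange:
#    index_data = fetch_range("http://localhost/sample.mp4", 896, 1730)
# 2) Parse index_data to learn each segment's byte range, then call
#    fetch_range() once per segment the player needs.
```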
Conclusion
An MPD can represent segments in many different ways, which makes it difficult to understand. This post described the major ways to represent segments in an MPD.
References
- ISO/IEC 23009-1 Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats
- Frame rate, timecode, and timestamps (in Japanese)
- AbemaTV's MPEG-DASH support (in Japanese)
- MPEG-DASH (Dynamic Adaptive Streaming over HTTP, ISO/IEC 23009-1) - BITMOVIN Blog