How to use KML

Requirements

This page uses the KML file for the duet version of the song Lady Marmalade as an example.

The KML file can be found here.

Working With the KML File

The KML file is returned when the withKml parameter is set to true in the getSong request.

While there are many elements in the KML file, the ones to take note of are duet and pg. The following sections detail each element in the file.

Duet

The duet section identifies the number of singers in the song and who sings which part. The number of singers is set by the number of singer elements in the duet section.

As shown in Figure 1, the song has four female singers.

Figure 1: Duet example

    <duet>
      <singer gender="female">
        <part>pt.1</part>
        <part>pt.2</part>
        <part>pt.5</part>
      </singer>
      <singer gender="female">
        <part>pt.2</part>
        <part>pt.3</part>
        <part>pt.5</part>
      </singer>
      <singer gender="female">
        <part>pt.2</part>
        <part>pt.4</part>
        <part>pt.6</part>
      </singer>
      <singer gender="female">
        <part>pt.2</part>
        <part>pt.6</part>
        <part>pt.7</part>
      </singer>
    </duet>

Table 1: Singers and associated parts

    Singer  Part 1  Part 2  Part 3  Part 4  Part 5  Part 6  Part 7
    1st     x       x       -       -       x       -       -
    2nd     -       x       x       -       x       -       -
    3rd     -       x       -       x       -       x       -
    4th     -       x       -       -       -       x       x
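
The mapping in Table 1 can be derived directly from the duet section. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name singers_by_part and the 1-based singer numbering are illustrative choices, not part of the KML format.

```python
import xml.etree.ElementTree as ET

# Duet section copied from Figure 1.
DUET_XML = """
<duet>
  <singer gender="female"><part>pt.1</part><part>pt.2</part><part>pt.5</part></singer>
  <singer gender="female"><part>pt.2</part><part>pt.3</part><part>pt.5</part></singer>
  <singer gender="female"><part>pt.2</part><part>pt.4</part><part>pt.6</part></singer>
  <singer gender="female"><part>pt.2</part><part>pt.6</part><part>pt.7</part></singer>
</duet>
"""


def singers_by_part(duet):
    """Map each part name to the 1-based indexes of the singers who sing it.

    Hypothetical helper: singers are numbered in document order.
    """
    mapping = {}
    for index, singer in enumerate(duet.findall("singer"), start=1):
        for part in singer.findall("part"):
            mapping.setdefault(part.text.strip(), []).append(index)
    return mapping


PART_TO_SINGERS = singers_by_part(ET.fromstring(DUET_XML))
print(PART_TO_SINGERS["pt.2"])  # [1, 2, 3, 4]
```

The number of singers is simply the number of singer elements, so len(duet.findall("singer")) gives 4 for this song.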

Pg

The pg (page) section mostly gives information about the time and duration of the lyrics. There are four types of pg element.

  • info_page: Page containing information about the duration, scale and tempo. Can be ignored.
  • lyric: Page containing information about the lyrics to display. In the sample file, its id has the form lyrics.N, such as lyrics.1.
  • credits_page: Page containing information about the distributor. Can be ignored.
  • closing_page: Can be ignored.

As shown in Figure 2, the type of each page is given by its id attribute.

Figure 2: pg example

    <pg d="0" id="info_page" t="0">
      ...
    </pg>
    <pg d="8.24" id="lyrics.1" t="0">
      ...
    </pg>
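
Since only the lyric pages matter for rendering, a reader can filter pages by id. The sketch below assumes that lyric page ids always start with "lyrics." (as in the sample file) and wraps the pages from Figure 2 in a hypothetical root element named kml, because the real root element is not shown on this page.

```python
import xml.etree.ElementTree as ET

# Pages from Figure 2, with their elided content left empty.
# The <kml> wrapper is a hypothetical root used only to make the sample parse.
SAMPLE_XML = """
<kml>
  <pg d="0" id="info_page" t="0"></pg>
  <pg d="8.24" id="lyrics.1" t="0"></pg>
</kml>
"""


def lyric_pages(root):
    """Return the pg elements whose id marks them as lyric pages.

    Assumption: lyric page ids have the form "lyrics.N".
    """
    return [pg for pg in root.iter("pg") if pg.get("id", "").startswith("lyrics.")]


PAGES = lyric_pages(ET.fromstring(SAMPLE_XML))
for pg in PAGES:
    # t is the page start time and d its duration, both read as floats.
    print(pg.get("id"), float(pg.get("t")), float(pg.get("d")))
```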

Lyrics

The lyrics to display are split into pages. Each page contains lines (ln), which in turn contain lyrics (lyr). A lyric specifies a word and the moment at which to sing it.

A lyric has four attributes:

  • c: The singer part associated with the lyric. This attribute can also be found on the ln element.
  • d: Duration of the word, in seconds.
  • s: The word to sing.
  • t: The time, in seconds, at which the word has to be sung.

For example, as shown in Figure 3, the word WHERE'S should be sung at 4.35 seconds for a duration of 0.29 seconds.

Figure 3: Lyric example

    <ln c="pt.1">
      <lyr c="pt.1" d="0.29" s="WHERE'S " t="4.35"/>
      <lyr c="pt.1" d="0.27" s="ALL " t="4.64"/>
      <lyr c="pt.1" d="0.31" s="MY " t="4.91"/>
    </ln>
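
As an illustration, the line in Figure 3 can be read into simple records. This is a sketch only: the Lyric class and parse_line function are hypothetical names, and falling back to the line's c attribute when a lyr has none is an assumption rather than documented behaviour.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Line copied from Figure 3.
LINE_XML = """
<ln c="pt.1">
  <lyr c="pt.1" d="0.29" s="WHERE'S " t="4.35"/>
  <lyr c="pt.1" d="0.27" s="ALL " t="4.64"/>
  <lyr c="pt.1" d="0.31" s="MY " t="4.91"/>
</ln>
"""


@dataclass
class Lyric:
    part: str        # c: singer part
    text: str        # s: word to sing
    start: float     # t: start time in seconds
    duration: float  # d: duration in seconds


def parse_line(ln):
    """Convert the lyr children of one ln element into Lyric records."""
    line_part = ln.get("c")
    return [
        Lyric(
            part=lyr.get("c", line_part),  # assumption: inherit the line's part
            text=lyr.get("s", ""),
            start=float(lyr.get("t")),
            duration=float(lyr.get("d")),
        )
        for lyr in ln.findall("lyr")
    ]


LYRICS = parse_line(ET.fromstring(LINE_XML))
print("".join(lyric.text for lyric in LYRICS))
```

Note that the trailing space in each s value is part of the attribute, so the words can be concatenated directly to form the displayed line.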

Important:

A page may have lines associated with different parts. If this is the case, each line should be sung only by the singers assigned to that line's part.

For example, in our sample file the first page, id=lyrics.1, only contains lines with the attribute c=pt.1. As shown in Table 1, part 1 is sung only by the first singer. However, the fourth page, id=lyrics.4, has two parts, c=pt.2 and c=pt.3: part 2 is sung by everyone, while part 3 is sung only by the second singer. The renderer should be able to associate the lines with the right singers and display the lyrics accordingly.
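
A renderer could resolve the singers for each line along these lines. The sketch hardcodes the part assignments from Table 1 and uses a skeleton of page lyrics.4 that keeps only the c attributes described above; the lyr children and the page's d and t attributes are omitted because that page's content is not reproduced here. The function name singers_for_lines is hypothetical.

```python
import xml.etree.ElementTree as ET

# Part-to-singer mapping taken from Table 1 (1-based singer indexes).
PART_TO_SINGERS = {
    "pt.1": [1],
    "pt.2": [1, 2, 3, 4],
    "pt.3": [2],
    "pt.4": [3],
    "pt.5": [1, 2],
    "pt.6": [3, 4],
    "pt.7": [4],
}

# Skeleton of page lyrics.4: structure only, lyric content omitted.
PAGE_XML = """
<pg id="lyrics.4">
  <ln c="pt.2"/>
  <ln c="pt.3"/>
</pg>
"""


def singers_for_lines(pg, part_to_singers):
    """Return, for each line of the page, the singers who should sing it."""
    return [part_to_singers.get(ln.get("c"), []) for ln in pg.findall("ln")]


ASSIGNMENTS = singers_for_lines(ET.fromstring(PAGE_XML), PART_TO_SINGERS)
print(ASSIGNMENTS)  # [[1, 2, 3, 4], [2]]
```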

Changing lead vocal toggle

For certain songs, the lead vocal can be toggled on or off. An optional coldstartoffset element may appear at the start of the KML file; its C attribute is used to adjust the playback time so that the lyrics stay in sync with the music.

Figure 4: Cold start offset example

    <coldstartoffset B="0" C="4597" D="0">

For the example in Figure 4, there is a cold start offset of 4597 milliseconds. When we toggle the lead vocal on, we subtract those 4597 milliseconds from the current time of the song. When we toggle it back off, we readjust the current time by adding the cold start offset.

If the cold start offset is not defined, then no offset is required.

There might also be songs for which the coldstartoffset value is greater than the time of the first lyric. In that case, use the smaller of the cold start offset and the first lyric timestamp.

Figure 5: Cold start offset greater than lyrics example

    <coldstartoffset B="0" C="4597" D="0">
    <ln c="pt.1">
      <lyr c="pt.1" d="0.29" s="WHERE'S " t="4.35"/>
      <lyr c="pt.1" d="0.27" s="ALL " t="4.64"/>
      <lyr c="pt.1" d="0.31" s="MY " t="4.91"/>
    </ln>

In this particular case, the first lyric is sung at 4.35 s and the coldstartoffset is 4597 ms. Since the start of a song cannot be negative, we only subtract 4350 ms when playing the file with lead vocal.
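
The rule above amounts to taking a minimum after converting both values to the same unit. A minimal sketch, assuming the offset is passed in milliseconds (as in the C attribute) and the first lyric time in seconds (as in the t attribute); the function name effective_offset_ms is hypothetical.

```python
def effective_offset_ms(cold_start_offset_ms, first_lyric_time_s):
    """Return the offset, in milliseconds, to apply when lead vocal is on.

    cold_start_offset_ms: value of the C attribute, or None if the
        coldstartoffset element is absent (no offset is required then).
    first_lyric_time_s: t attribute of the first lyric, in seconds.
    """
    if cold_start_offset_ms is None:
        return 0
    first_lyric_ms = round(first_lyric_time_s * 1000)
    return min(cold_start_offset_ms, first_lyric_ms)


print(effective_offset_ms(4597, 4.35))  # 4350
print(effective_offset_ms(1500, 4.00))  # 1500
```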

Implementation

Concretely, the coldstartoffset means that the audio file with the lead vocal starts that much earlier than the file without lead vocal.

If a lyric is sung at t=4.00 and the coldstartoffset is 1500, then the same lyric is actually sung at t=2.5 with lead vocal. This also means that the KML always matches the file without lead vocal.

Naturally, a lyric cannot be sung at a negative time, which is why the smaller of the first lyric time and the coldstartoffset must be used. The coldstartoffset can exceed the time of the first lyric for reasons unrelated to rendering the lyrics.

To switch between lead vocal on and off seamlessly, the seek time of the audio file must be adjusted to keep the files in sync. Continuing the example, if the song is at 10 s with lead vocal on and the user toggles the lead vocal off, the same lyric is sung at 10 + 1.5 = 11.5 s without lead vocal, so the seek time must be 11.5 s when playing that file. In the same way, when toggling the lead vocal back on, the seek time must be adjusted to 11.5 - 1.5 = 10 s to keep the files in sync.
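
The seek adjustment can be sketched as a small helper. The function name seek_after_toggle and its parameters are hypothetical, offset_ms is assumed to be the effective offset (already limited to the first lyric time), and clamping the result at zero is an added safeguard, not something stated by the KML format.

```python
def seek_after_toggle(current_time_s, offset_ms, lead_vocal_on):
    """Return the seek time, in seconds, to use in the newly selected file.

    current_time_s: playback position in the file playing before the toggle.
    offset_ms: effective cold start offset in milliseconds.
    lead_vocal_on: state of the lead vocal after the toggle.
    """
    offset_s = offset_ms / 1000
    if lead_vocal_on:
        # The lead vocal file starts earlier, so move the position back.
        return max(0.0, current_time_s - offset_s)
    # The file without lead vocal matches the KML, so move the position forward.
    return current_time_s + offset_s


print(seek_after_toggle(10.0, 1500, lead_vocal_on=False))  # 11.5
print(seek_after_toggle(11.5, 1500, lead_vocal_on=True))   # 10.0
```

Because the KML timestamps match the file without lead vocal, a renderer playing the lead vocal file would compare lyric times against the playback position plus the same offset.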