<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head><body><pre>
Informal standard                                             M. Nilsson
Document: id3v2-00.txt                                   26th March 1998


                           ID3 tag version 2

Status of this document

   This document is an Informal standard and is released so that
   implementors could have a set standard before the formal standard is
   set. The formal standard will use another version number if not
   identical to what is described in this document. The contents in this
   document may change for clarifications but never for added or altered
   functionality.

   Distribution of this document is unlimited.


Abstract

   The recent gain of popularity for MPEG layer III audio files on the
   internet forced a standardised way of storing information about an
   audio file within itself to determine its origin and contents.

   Today the most accepted way to do this is with the so called ID3 tag,
   which is simple but very limited and in some cases very unsuitable.
   The ID3 tag has very limited space in every field, a very limited
   number of fields, is not expandable or upgradeable and is placed at
   the end of the file, which is unsuitable for streaming audio. This
   draft is an attempt to answer these issues with a new version of the
   ID3 tag.


1.   Table of contents

   2.   Conventions in this document
   3.   ID3v2 overview
     3.1.   ID3v2 header
     3.2.   ID3v2 frames overview
   4.   Declared ID3v2 frames
     4.1.   Unique file identifier
     4.2.   Text information frames
       4.2.1.   Text information frames - details
       4.2.2.   User defined text information frame
     4.3.   URL link frames
       4.3.1.   URL link frames - details
       4.3.2.   User defined URL link frame
     4.4.   Involved people list
     4.5.   Music CD Identifier
     4.6.   Event timing codes
     4.7.   MPEG location lookup table
     4.8.   Synced tempo codes
     4.9.   Unsynchronised lyrics/text transcription
     4.10.  Synchronised lyrics/text
     4.11.  Comments
     4.12.  Relative volume adjustment
     4.13.  Equalisation
     4.14.  Reverb
     4.15.  Attached picture
     4.16.  General encapsulated object
     4.17.  Play counter
     4.18.  Popularimeter
     4.19.  Recommended buffer size
     4.20.  Encrypted meta frame
     4.21.  Audio encryption
     4.22.  Linked information
   5.   The 'unsynchronisation scheme'
   6.   Copyright
   7.   References
   8.   Appendix
     A.   Appendix A - ID3-Tag Specification V1.1
       A.1.   Overview
       A.2.   ID3v1 Implementation
       A.3.   Genre List
       A.4.   Track addition - ID3v1.1
   9.   Author's Address


2.   Conventions in this document

   In the examples, text within "" is a text string exactly as it appears
   in a file. Numbers preceded with $ are hexadecimal and numbers
   preceded with % are binary. $xx is used to indicate a byte with
   unknown content. %x is used to indicate a bit with unknown content.
   The most significant bit (MSB) of a byte is called 'bit 7' and the
   least significant bit (LSB) is called 'bit 0'.

   A tag is the whole tag described in this document. A frame is a block
   of information in the tag. The tag consists of a header, frames and
   optional padding. A field is a piece of information; one value, a
   string etc. A numeric string is a string that consists of the
   characters 0-9 only.


3.   ID3v2 overview

   The two biggest design goals were to be able to implement ID3v2
   without disturbing old software too much and that ID3v2 should be
   expandable.

   The first criterion is met by the simple fact that the MPEG [MPEG]
   decoding software uses a syncsignal, embedded in the audiostream, to
   'lock on to' the audio. Since the ID3v2 tag doesn't contain a valid
   syncsignal, no software will attempt to play the tag. If, for any
   reason, coincidence makes a syncsignal appear within the tag it will
   be taken care of by the 'unsynchronisation scheme' described in
   section 5.

   The second criterion has made a more noticeable impact on the design
   of the ID3v2 tag. It is constructed as a container for several
   information blocks, called frames, whose format need not be known to
   the software that encounters them. At the start of every frame there
   is an identifier that explains the frame's format and content, and a
   size descriptor that allows software to skip unknown frames.

   If a total revision of the ID3v2 tag should be needed, there is a
   version number and a size descriptor in the ID3v2 header.

   The ID3 tag described in this document is mainly targeted to files
   encoded with MPEG-2 layer I, MPEG-2 layer II, MPEG-2 layer III and
   MPEG-2.5, but may work with other types of encoded audio.

   The bitorder in ID3v2 is most significant bit first (MSB). The
   byteorder in multibyte numbers is most significant byte first (e.g.
   $12345678 would be encoded $12 34 56 78).

   It is permitted to include padding after the final frame (at the
   end of the ID3 tag), making the size of all the frames together
   smaller than the size given in the head of the tag. A possible purpose
   of this padding is to allow for adding a few additional frames or
   enlarging existing frames within the tag without having to rewrite
   the entire file. The value of the padding bytes must be $00.


3.1.   ID3v2 header

   The ID3v2 tag header, which should be the first information in the
   file, is 10 bytes as follows:

     ID3/file identifier      "ID3"
     ID3 version              $02 00
     ID3 flags                %xx000000
     ID3 size             4 * %0xxxxxxx

   The first three bytes of the tag are always "ID3" to indicate that
   this is an ID3 tag, directly followed by the two version bytes. The
   first byte of ID3 version is its major version, while the second byte
   is its revision number. All revisions are backwards compatible while
   major versions are not. If software with ID3v2 and below support
   should encounter version three or higher it should simply ignore the
   whole tag. Version and revision will never be $FF.

   The first bit (bit 7) in the 'ID3 flags' indicates whether or not
   unsynchronisation is used (see section 5 for details); a set bit
   indicates usage.

   The second bit (bit 6) indicates whether or not compression is
   used; a set bit indicates usage. Since no compression scheme has been
   decided yet, the ID3 decoder (for now) should just ignore the entire
   tag if the compression bit is set.

   The ID3 tag size is encoded with four bytes where the first bit (bit
   7) is set to zero in every byte, making a total of 28 bits. The zeroed
   bits are ignored, so a 257 bytes long tag is represented as $00 00 02
   01.

   The ID3 tag size is the size of the complete tag after
   unsynchronisation, including padding, excluding the header (total tag
   size - 10). The reason to use 28 bits (representing up to 256MB) for
   size description is that we don't want to run out of space here.
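
   The header layout and size encoding above can be sketched as
   follows. This is a minimal illustration, not part of the standard;
   the function names decode_size and parse_header and the dictionary
   keys are hypothetical.

```python
def decode_size(data: bytes) -> int:
    """Decode the four size bytes; bit 7 of each byte must be zero."""
    if len(data) != 4 or any(b & 0x80 for b in data):
        raise ValueError("size must be four bytes, each below $80")
    value = 0
    for b in data:
        value = (value << 7) | b  # only 7 bits per byte are significant
    return value


def parse_header(header: bytes) -> dict:
    """Parse the 10-byte tag header (hypothetical helper)."""
    if len(header) != 10 or header[:3] != b"ID3":
        raise ValueError("not an ID3v2 tag header")
    major, revision, flags = header[3], header[4], header[5]
    if major == 0xFF or revision == 0xFF:
        raise ValueError("version and revision are never $FF")
    return {
        "major": major,
        "revision": revision,
        "unsynchronisation": bool(flags & 0x80),  # bit 7
        "compression": bool(flags & 0x40),        # bit 6
        "size": decode_size(header[6:10]),        # excludes the header
    }


# Version $02 00, unsynchronisation flag set, size $00 00 02 01.
header = b"ID3" + bytes([0x02, 0x00, 0x80, 0x00, 0x00, 0x02, 0x01])
info = parse_header(header)
print(info["size"])  # 257, the example given in the text
```

   A decoder following the text would skip the whole tag when
   info["compression"] is set or when the major version is three or
   higher.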

   An ID3v2 tag can be detected with the following pattern:
     $49 44 33 yy yy xx zz zz zz zz
   Where yy is less than $FF, xx is the 'flags' byte and zz is less than
   $80.


3.2.   ID3v2 frames overview

   The headers of the frames are similar in their construction. They
   consist of one three character identifier (capital A-Z and 0-9) and
   one three byte size field, making a total of six bytes. The header is
   excluded from the size. Identifiers beginning with "X", "Y" and "Z"
   are for experimental use and free for everyone to use. Keep in mind
   that someone else might have used the same identifier as you. All
   other identifiers are either used or reserved for future use.

   The three character frame identifier is followed by a three byte size
   descriptor, making a total header size of six bytes in every frame.
   The size is calculated as framesize excluding frame identifier and
   size descriptor (frame size - 6).

   There is no fixed order of the frames' appearance in the tag, although
   it is desired that the frames are arranged in order of significance
   concerning the recognition of the file. An example of such order:
   UFI, MCI, TT2 ...

   A tag must contain at least one frame. A frame must be at least 1 byte
   big, excluding the 6-byte header.

   If nothing else is said a string is represented as ISO-8859-1
   [ISO-8859-1] characters in the range $20 - $FF. All unicode strings
   [UNICODE] use 16-bit unicode 2.0 (ISO/IEC 10646-1:1993, UCS-2). All
   numeric strings are always encoded as ISO-8859-1. Terminated strings
   are terminated with $00 if encoded with ISO-8859-1 and $00 00 if
   encoded as unicode. If nothing else is said a newline character is
   forbidden. In ISO-8859-1 a new line is represented, when allowed, with
   $0A only. Frames that allow different types of text encoding have a
   text encoding description byte directly after the frame size. If
   ISO-8859-1 is used this byte should be $00, if unicode is used it
   should be $01.

   The three byte language field is used to describe the language of the
   frame's content, according to ISO-639-2 [ISO-639-2].

   All URLs [URL] may be relative, e.g. "picture.png", "../doc.txt".

   If a frame is longer than it should be, e.g. having more fields than
   specified in this document, that indicates that additions to the
   frame have been made in a later version of the ID3 standard. This
   is reflected by the revision number in the header of the tag.


4.   Declared ID3v2 frames

   The following frames are declared in this draft.

   4.19  BUF Recommended buffer size

   4.17  CNT Play counter
   4.11  COM Comments
   4.21  CRA Audio encryption
   4.20  CRM Encrypted meta frame

   4.6   ETC Event timing codes
   4.13  EQU Equalization

   4.16  GEO General encapsulated object

   4.4   IPL Involved people list

   4.22  LNK Linked information

   4.5   MCI Music CD Identifier
   4.7   MLL MPEG location lookup table

   4.15  PIC Attached picture
   4.18  POP Popularimeter

   4.14  REV Reverb
   4.12  RVA Relative volume adjustment

   4.10  SLT Synchronized lyric/text
   4.8   STC Synced tempo codes

   4.2.1 TAL Album/Movie/Show title
   4.2.1 TBP BPM (Beats Per Minute)
   4.2.1 TCM Composer
   4.2.1 TCO Content type
   4.2.1 TCR Copyright message
   4.2.1 TDA Date
   4.2.1 TDY Playlist delay
   4.2.1 TEN Encoded by
   4.2.1 TFT File type
   4.2.1 TIM Time
   4.2.1 TKE Initial key
   4.2.1 TLA Language(s)
   4.2.1 TLE Length
   4.2.1 TMT Media type
   4.2.1 TOA Original artist(s)/performer(s)
   4.2.1 TOF Original filename
   4.2.1 TOL Original Lyricist(s)/text writer(s)
   4.2.1 TOR Original release year
   4.2.1 TOT Original album/Movie/Show title
   4.2.1 TP1 Lead artist(s)/Lead performer(s)/Soloist(s)/Performing group
   4.2.1 TP2 Band/Orchestra/Accompaniment
   4.2.1 TP3 Conductor/Performer refinement
   4.2.1 TP4 Interpreted, remixed, or otherwise modified by
   4.2.1 TPA Part of a set
   4.2.1 TPB Publisher
   4.2.1 TRC ISRC (International Standard Recording Code)
   4.2.1 TRD Recording dates
   4.2.1 TRK Track number/Position in set
   4.2.1 TSI Size
   4.2.1 TSS Software/hardware and settings used for encoding
   4.2.1 TT1 Content group description
   4.2.1 TT2 Title/Songname/Content description
   4.2.1 TT3 Subtitle/Description refinement
   4.2.1 TXT Lyricist/text writer
   4.2.2 TXX User defined text information frame
   4.2.1 TYE Year

   4.1   UFI Unique file identifier
   4.9   ULT Unsynchronized lyric/text transcription

   4.3.1 WAF Official audio file webpage
   4.3.1 WAR Official artist/performer webpage
   4.3.1 WAS Official audio source webpage
   4.3.1 WCM Commercial information
   4.3.1 WCP Copyright/Legal information
   4.3.1 WPB Publishers official webpage
   4.3.2 WXX User defined URL link frame


4.1.   Unique file identifier

   This frame's purpose is to be able to identify the audio file in a
   database that may contain more information relevant to the content.
   Since standardisation of such a database is beyond this document, all
   frames begin with a null-terminated string with a URL [URL] containing
   an email address, or a link to a location where an email address can
   be found, that belongs to the organisation responsible for this
   specific database implementation. Questions regarding the database
   should be sent to the indicated email address. The URL should not be
   used for the actual database queries. If a $00 is found directly after
   the 'Frame size' the whole frame should be ignored, and preferably be
   removed. The 'Owner identifier' is then followed by the actual
   identifier, which may be up to 64 bytes. There may be more than one
   "UFI" frame in a tag, but only one with the same 'Owner identifier'.

     Unique file identifier  "UFI"
     Frame size              $xx xx xx
     Owner identifier        <textstring> $00
     Identifier              <up to 64 bytes binary data>


4.2.   Text information frames

   The text information frames are the most important frames, containing
   information like artist, album and more. There may only be one text
   information frame of its kind in a tag. If the textstring is followed
   by a termination ($00 (00)) all the following information should be
   ignored and not be displayed. All the text information frames have the
   following format:

     Text information identifier  "T00" - "TZZ" , excluding "TXX",
                                   described in 4.2.2.
     Frame size                   $xx xx xx
     Text encoding                $xx
     Information                  <textstring>


4.2.1.   Text information frames - details

  TT1
   The 'Content group description' frame is used if the sound belongs to
   a larger category of sounds/music. For example, classical music is
   often sorted in different musical sections (e.g. "Piano Concerto",
   "Weather - Hurricane").

  TT2
   The 'Title/Songname/Content description' frame is the actual name of
   the piece (e.g. "Adagio", "Hurricane Donna").

  TT3
   The 'Subtitle/Description refinement' frame is used for information
   directly related to the contents title (e.g. "Op. 16" or "Performed
   live at Wembley").
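
   The text information frame layout given in section 4.2 can be
   sketched as follows. This is a minimal illustration under stated
   assumptions: the function name parse_text_frame is hypothetical, and
   only the ISO-8859-1 encoding ($00) is handled; unicode ($01) is left
   out of the sketch.

```python
def parse_text_frame(frame: bytes):
    """Return (identifier, text) for one ISO-8859-1 text information frame."""
    if len(frame) < 7:
        raise ValueError("a frame is at least 6 header bytes plus 1 byte")
    frame_id = frame[:3].decode("ascii")
    size = int.from_bytes(frame[3:6], "big")  # excludes the 6-byte header
    body = frame[6:6 + size]
    if size < 1 or len(body) != size:
        raise ValueError("truncated or empty frame")
    encoding, raw = body[0], body[1:]
    if encoding != 0x00:
        raise NotImplementedError("this sketch handles ISO-8859-1 only")
    # Everything after a $00 termination is ignored and not displayed.
    text = raw.split(b"\x00", 1)[0].decode("iso-8859-1")
    return frame_id, text


# "TT2" frame: size 7 = one encoding byte plus the six bytes of "Adagio".
frame = b"TT2" + (7).to_bytes(3, "big") + b"\x00" + b"Adagio"
print(parse_text_frame(frame))  # ('TT2', 'Adagio')
```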

  TP1
   The 'Lead artist(s)/Lead performer(s)/Soloist(s)/Performing group' is
   used for the main artist(s). They are separated with the "/"
   character.

  TP2
   The 'Band/Orchestra/Accompaniment' frame is used for additional
   information about the performers in the recording.

  TP3
   The 'Conductor' frame is used for the name of the conductor.

  TP4
   The 'Interpreted, remixed, or otherwise modified by' frame contains
   more information about the people behind a remix and similar
   interpretations of another existing piece.

  TCM
   The 'Composer(s)' frame is intended for the name of the composer(s).
   They are separated with the "/" character.

  TXT
   The 'Lyricist(s)/text writer(s)' frame is intended for the writer(s)
   of the text or lyrics in the recording. They are separated with the
   "/" character.

  TLA
   The 'Language(s)' frame should contain the languages of the text or
   lyrics in the audio file. The language is represented with three
   characters according to ISO-639-2. If more than one language is used
   in the text their language codes should follow according to their
   usage.

  TCO
   The content type, which previously (in ID3v1.1, see appendix A) was
   stored as a one byte numeric value only, is now a numeric string. You
   may use one or several of the types as ID3v1.1 did or, since the
   category list would be impossible to maintain with accurate and up to
   date categories, define your own.
   References to the ID3v1 genres can be made by entering, as the first
   byte, "(" followed by a number from the genres list (section A.3.)
   and ending with a ")" character. This is optionally followed by a
   refinement, e.g. "(21)" or "(4)Eurodisco". Several references can be
   made in the same frame, e.g. "(51)(39)". If the refinement should
   begin with a "(" character it should be replaced with "((", e.g. "((I
   can figure out any genre)" or "(55)((I think...)". The following new
   content types are defined in ID3v2 and are implemented in the same
   way as the numeric content types, e.g. "(RX)".

     RX  Remix
     CR  Cover

  TAL
   The 'Album/Movie/Show title' frame is intended for the title of the
   recording(/source of sound) which the audio in the file is taken from.

  TPA
   The 'Part of a set' frame is a numeric string that describes which
   part of a set the audio came from. This frame is used if the source
   described in the "TAL" frame is divided into several mediums, e.g. a
   double CD. The value may be extended with a "/" character and a
   numeric string containing the total number of parts in the set. E.g.
   "1/2".

  TRK
   The 'Track number/Position in set' frame is a numeric string
   containing the order number of the audio-file on its original
   recording. This may be extended with a "/" character and a numeric
   string containing the total number of tracks/elements on the original
   recording. E.g. "4/9".

  TRC
   The 'ISRC' frame should contain the International Standard Recording
   Code [ISRC].

  TYE
   The 'Year' frame is a numeric string with a year of the recording.
   This frame is always four characters long (until the year 10000).

  TDA
   The 'Date' frame is a numeric string in the DDMM format containing
   the date for the recording. This field is always four characters
   long.

  TIM
   The 'Time' frame is a numeric string in the HHMM format containing
   the time for the recording. This field is always four characters
   long.

  TRD
   The 'Recording dates' frame is intended to be used as a complement to
   the "TYE", "TDA" and "TIM" frames. E.g. "4th-7th June, 12th June" in
   combination with the "TYE" frame.

  TMT
   The 'Media type' frame describes from which media the sound
   originated. This may be a textstring or a reference to the predefined
   media types found in the list below. References are made within "("
   and ")" and are optionally followed by a text refinement, e.g. "(MC)
   with four channels". If a text refinement should begin with a "("
   character it should be replaced with "((" in the same way as in the
   "TCO" frame. Predefined refinements are appended after the media type,
   e.g. "(CD/S)" or "(VID/PAL/VHS)".

    DIG    Other digital media
      /A    Analog transfer from media

    ANA    Other analog media
      /WAC  Wax cylinder
      /8CA  8-track tape cassette

    CD     CD
      /A    Analog transfer from media
      /DD   DDD
      /AD   ADD
      /AA   AAD

    LD     Laserdisc
      /A    Analog transfer from media

    TT     Turntable records
      /33   33.33 rpm
      /45   45 rpm
      /71   71.29 rpm
      /76   76.59 rpm
      /78   78.26 rpm
      /80   80 rpm

    MD     MiniDisc
      /A    Analog transfer from media

    DAT    DAT
      /A    Analog transfer from media
      /1    standard, 48 kHz/16 bits, linear
      /2    mode 2, 32 kHz/16 bits, linear
      /3    mode 3, 32 kHz/12 bits, nonlinear, low speed
      /4    mode 4, 32 kHz/12 bits, 4 channels
      /5    mode 5, 44.1 kHz/16 bits, linear
      /6    mode 6, 44.1 kHz/16 bits, 'wide track' play

    DCC    DCC
      /A    Analog transfer from media

    DVD    DVD
      /A    Analog transfer from media

    TV     Television
      /PAL    PAL
      /NTSC   NTSC
      /SECAM  SECAM

    VID    Video
      /PAL    PAL
      /NTSC   NTSC
      /SECAM  SECAM
      /VHS    VHS
      /SVHS   S-VHS
      /BETA   BETAMAX

    RAD    Radio
      /FM   FM
      /AM   AM
      /LW   LW
      /MW   MW

    TEL    Telephone
      /I    ISDN

    MC     MC (normal cassette)
      /4    4.75 cm/s (normal speed for a two sided cassette)
      /9    9.5 cm/s
      /I    Type I cassette (ferric/normal)
      /II   Type II cassette (chrome)
      /III  Type III cassette (ferric chrome)
      /IV   Type IV cassette (metal)

    REE    Reel
      /9    9.5 cm/s
      /19   19 cm/s
      /38   38 cm/s
      /76   76 cm/s
      /I    Type I cassette (ferric/normal)
      /II   Type II cassette (chrome)
      /III  Type III cassette (ferric chrome)
      /IV   Type IV cassette (metal)

  TFT
   The 'File type' frame indicates which type of audio this tag defines.
   The following type and refinements are defined:

     MPG    MPEG Audio
       /1     MPEG 2 layer I
       /2     MPEG 2 layer II
       /3     MPEG 2 layer III
       /2.5   MPEG 2.5
       /AAC   Advanced audio compression

   but other types may be used, not for these types though. This is used
   in a similar way to the predefined types in the "TMT" frame, but
   without parentheses. If this frame is not present audio type is
   assumed to be "MPG".

  TBP
   BPM is short for beats per minute, and is easily computed by
   dividing the number of beats in a musical piece by its length. To
   get a more accurate result, do the BPM calculation on the main-part
   only. To acquire the best result measure the time between each beat,
   calculate an individual BPM for each beat and use the median value as
   the result. BPM is an integer and represented as a numerical string.

  TCR
   The 'Copyright message' frame, which must begin with a year and a
   space character (making five characters), is intended for the
   copyright holder of the original sound, not the audio file itself. The
   absence of this frame means only that the copyright information is
   unavailable or has been removed, and must not be interpreted to mean
   that the sound is public domain. Every time this field is displayed
   the field must be preceded with "Copyright " (C) " ", where (C) is one
   character showing a C in a circle.

  TPB
   The 'Publisher' frame simply contains the name of the label or
   publisher.

  TEN
   The 'Encoded by' frame contains the name of the person or
   organisation that encoded the audio file. This field may contain a
   copyright message, if the audio file also is copyrighted by the
   encoder.

  TSS
   The 'Software/hardware and settings used for encoding' frame
   includes the used audio encoder and its settings when the file was
   encoded. Hardware refers to hardware encoders, not the computer on
   which a program was run.

  TOF
   The 'Original filename' frame contains the preferred filename for the
   file, since some media don't allow the desired length of the
   filename. The filename is case sensitive and includes its suffix.

  TLE
   The 'Length' frame contains the length of the audiofile in
   milliseconds, represented as a numeric string.

  TSI
   The 'Size' frame contains the size of the audiofile in bytes
   excluding the tag, represented as a numeric string.

  TDY
   The 'Playlist delay' defines the number of milliseconds of silence
   between every song in a playlist. The player should use the "ETC"
   frame, if present, to skip initial silence and silence at the end of
   the audio to match the 'Playlist delay' time. The time is represented
   as a numeric string.

  TKE
   The 'Initial key' frame contains the musical key in which the sound
   starts. It is represented as a string with a maximum length of three
   characters. The ground keys are represented with "A","B","C","D","E",
   "F" and "G" and halfkeys represented with "b" and "#". Minor is
   represented as "m". Example "Cbm". Off key is represented with an "o"
   only.

  TOT
   The 'Original album/Movie/Show title' frame is intended for the title
   of the original recording(/source of sound), if for example the music
   in the file should be a cover of a previously released song.

  TOA
   The 'Original artist(s)/performer(s)' frame is intended for the
   performer(s) of the original recording, if for example the music in
   the file should be a cover of a previously released song. The
   performers are separated with the "/" character.

  TOL
   The 'Original Lyricist(s)/text writer(s)' frame is intended for the
   text writer(s) of the original recording, if for example the music in
   the file should be a cover of a previously released song. The text
   writers are separated with the "/" character.

  TOR
   The 'Original release year' frame is intended for the year when the
   original recording, if for example the music in the file should be a
   cover of a previously released song, was released. The field is
   formatted as in the "TYE" frame.


4.2.2.   User defined text information frame

   This frame is intended for one-string text information concerning the
   audiofile in a similar way to the other "T"xx frames. The frame body
   consists of a description of the string, represented as a terminated
   string, followed by the actual string. There may be more than one
   "TXX" frame in each tag, but only one with the same description.

     User defined...   "TXX"
     Frame size        $xx xx xx
     Text encoding     $xx
     Description       <textstring> $00 (00)
     Value             <textstring>


4.3.   URL link frames

   With these frames dynamic data such as webpages with touring
   information, price information or plain ordinary news can be added to
   the tag. There may only be one URL [URL] link frame of its kind in a
   tag, except when stated otherwise in the frame description. If the
   textstring is followed by a termination ($00 (00)) all the following
   information should be ignored and not be displayed. All URL link
   frames have the following format:

     URL link frame   "W00" - "WZZ" , excluding "WXX"
                                      (described in 4.3.2.)
686 Frame size $xx xx xx 687 URL <textstring> 688 689 690 4.3.1. URL link frames - details 691 692 WAF 693 The 'Official audio file webpage' frame is a URL pointing at a file 694 specific webpage. 695 696 WAR 697 The 'Official artist/performer webpage' frame is a URL pointing at 698 the artists official webpage. There may be more than one "WAR" frame 699 in a tag if the audio contains more than one performer. 700 701 WAS 702 The 'Official audio source webpage' frame is a URL pointing at the 703 official webpage for the source of the audio file, e.g. a movie. 704 705 WCM 706 The 'Commercial information' frame is a URL pointing at a webpage 707 with information such as where the album can be bought. There may be 708 more than one "WCM" frame in a tag. 709 710 WCP 711 The 'Copyright/Legal information' frame is a URL pointing at a 712 webpage where the terms of use and ownership of the file is described. 713 714 WPB 715 The 'Publishers official webpage' frame is a URL pointing at the 716 official wepage for the publisher. 717 718 719 4.3.2. User defined URL link frame 720 721 This frame is intended for URL [URL] links concerning the audiofile in 722 a similar way to the other "W"xx frames. The frame body consists of a 723 <B style="color:black;background-color:#A0FFFF">description</B> of the string, represented as a terminated string, 724 followed by the actual URL. The URL is always encoded with ISO-8859-1 725 [ISO-8859-1]. There may be more than one "WXX" frame in each tag, but 726 only one with the same <B style="color:black;background-color:#A0FFFF">description</B>. 727 728 User defined... "WXX" 729 Frame size $xx xx xx 730 Text encoding $xx 731 <B style="color:black;background-color:#A0FFFF">Description</B> <textstring> $00 (00) 732 URL <textstring> 733 734 735 4.4. 
Involved people list 736 737 Since there might be a lot of people contributing to an audio file in 738 various ways, such as musicians and technicians, the 'Text 739 information frames' are often insufficient to list everyone involved 740 in a project. The 'Involved people list' is a frame containing the 741 names of those involved, and how they were involved. The body simply 742 contains a terminated string with the involvement directly followed by 743 a terminated string with the involvee followed by a new involvement 744 and so on. There may only be one "IPL" frame in each tag. 745 746 Involved people list "IPL" 747 Frame size $xx xx xx 748 Text encoding $xx 749 People list strings <textstrings> 750 751 752 4.5. Music CD Identifier 753 754 This frame is intended for music that comes from a CD, so that the CD 755 can be identified in databases such as the CDDB [CDDB]. The frame 756 consists of a binary dump of the Table Of Contents, TOC, from the CD, 757 which is a header of 4 bytes and then 8 bytes/track on the CD making a 758 maximum of 804 bytes. This frame requires a present and valid "TRK" 759 frame. There may only be one "MCI" frame in each tag. 760 761 Music CD identifier "MCI" 762 Frame size $xx xx xx 763 CD TOC <binary data> 764 765 766 4.6. Event timing codes 767 768 This frame allows synchronisation with key events in a song or sound. 769 The head is: 770 771 Event timing codes "ETC" 772 Frame size $xx xx xx 773 Time stamp format $xx 774 775 Where time stamp format is: 776 777 $01 Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit 778 $02 Absolute time, 32 bit sized, using milliseconds as unit 779 780 Abolute time means that every stamp contains the time from the 781 beginning of the file. 782 783 Followed by a list of key events in the following format: 784 785 Type of event $xx 786 Time stamp $xx (xx ...) 787 788 The 'Time stamp' is set to zero if directly at the beginning of the 789 sound or after the previous event. 
   All events should be sorted in chronological order. The type of
   event is as follows:

     $00     padding (has no meaning)
     $01     end of initial silence
     $02     intro start
     $03     mainpart start
     $04     outro start
     $05     outro end
     $06     verse begins
     $07     refrain begins
     $08     interlude
     $09     theme start
     $0A     variation
     $0B     key change
     $0C     time change
     $0D     unwanted noise (Snap, Crackle & Pop)

     $0E-$DF reserved for future use

     $E0-$EF not predefined sync 0-F

     $F0-$FC reserved for future use

     $FD     audio end (start of silence)
     $FE     audio file ends
     $FF     one more byte of events follows (all the following bytes
             with the value $FF have the same function)

   The 'Not predefined sync's ($E0-EF) are for user events. You might
   want to synchronise your music to something, like setting off an
   explosion on-stage, turning on your screensaver etc.

   There may only be one "ETC" frame in each tag.


4.7.   MPEG location lookup table

   To increase performance and accuracy of jumps within an MPEG [MPEG]
   audio file, frames with timecodes in different locations in the file
   might be useful. The ID3 frame includes references that the software
   can use to calculate positions in the file. After the frame header is
   a descriptor of how much the 'frame counter' should increase for every
   reference. If this value is two then the first reference points to
   the second frame, the 2nd reference to the 4th frame, the 3rd
   reference to the 6th frame etc. In a similar way the 'bytes between
   reference' and 'milliseconds between reference' specify bytes and
   milliseconds respectively.
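
   The counting rule above can be sketched as follows. This is only an
   illustration, not part of the standard: the function name
   nominal_reference and its parameters are hypothetical, and it yields
   the nominal position of a reference only, before any deviation stored
   in the frame is taken into account.

```python
def nominal_reference(n, frames_between, bytes_between, ms_between):
    """Return the nominal (frame, byte, millisecond) position of the
    n-th reference, counting references from 1.

    Assumption: each reference advances all three counters by the
    'between reference' values taken from the frame header.
    """
    if n < 1:
        raise ValueError("references are counted from 1")
    return (n * frames_between, n * bytes_between, n * ms_between)
```

   With 'MPEG frames between reference' set to two, the third reference
   nominally points to the 6th frame, matching the example in the text.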

   Each reference consists of two parts; a certain number of bits, as
   defined in 'bits for bytes deviation', that describes the difference
   between what is said in 'bytes between reference' and the reality and
   a certain number of bits, as defined in 'bits for milliseconds
   deviation', that describes the difference between what is said in
   'milliseconds between reference' and the reality. The number of bits
   in every reference, i.e. 'bits for bytes deviation'+'bits for
   milliseconds deviation', must be a multiple of four. There may only be
   one "MLL" frame in each tag.

     Location lookup table            "MLL"
     ID3 frame size                   $xx xx xx
     MPEG frames between reference    $xx xx
     Bytes between reference          $xx xx xx
     Milliseconds between reference   $xx xx xx
     Bits for bytes deviation         $xx
     Bits for milliseconds dev.       $xx

   Then for every reference the following data is included:

     Deviation in bytes          %xxx....
     Deviation in milliseconds   %xxx....


4.8.   Synced tempo codes

   For a more accurate description of the tempo of a musical piece this
   frame might be used. After the header follows one byte describing
   which time stamp format should be used. Then follows one or more tempo
   codes. Each tempo code consists of one tempo part and one time part.
   The tempo is in BPM described with one or two bytes. If the first byte
   has the value $FF, one more byte follows, which is added to the first
   giving a range from 2 - 510 BPM, since $00 and $01 are reserved. $00
   is used to describe a beat-free time period, which is not the same as
   a music-free time period. $01 is used to indicate one single
   beat-stroke followed by a beat-free period.

   The tempo descriptor is followed by a time stamp. Every time the tempo
   in the music changes, a tempo descriptor may indicate this for the
   player.
   All tempo descriptors should be sorted in chronological order.
   The first beat-stroke in a time-period is at the same time as the beat
   description occurs. There may only be one "STC" frame in each tag.

     Synced tempo codes   "STC"
     Frame size           $xx xx xx
     Time stamp format    $xx
     Tempo data           <binary data>

   Where time stamp format is:

     $01  Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit
     $02  Absolute time, 32 bit sized, using milliseconds as unit

   Absolute time means that every stamp contains the time from the
   beginning of the file.


4.9.   Unsynchronised lyrics/text transcription

   This frame contains the lyrics of the song or a text transcription of
   other vocal activities. The header includes an encoding descriptor and
   a content descriptor. The body consists of the actual text. The
   'Content descriptor' is a terminated string. If no descriptor is
   entered, 'Content descriptor' is $00 (00) only. Newline characters
   are allowed in the text. Maximum length for the descriptor is 64
   bytes. There may be more than one lyrics/text frame in each tag, but
   only one with the same language and content descriptor.

     Unsynced lyrics/text   "ULT"
     Frame size             $xx xx xx
     Text encoding          $xx
     Language               $xx xx xx
     Content descriptor     <textstring> $00 (00)
     Lyrics/text            <textstring>


4.10.   Synchronised lyrics/text

   This is another way of incorporating the words, said or sung lyrics,
   in the audio file as text, this time, however, in sync with the audio.
   It might also be used to describe events, e.g. occurring on a stage
   or on the screen, in sync with the audio. The header includes a
   content descriptor, represented as a terminated textstring. If no
   descriptor is entered, 'Content descriptor' is $00 (00) only.

     Synced lyrics/text   "SLT"
     Frame size           $xx xx xx
     Text encoding        $xx
     Language             $xx xx xx
     Time stamp format    $xx
     Content type         $xx
     Content descriptor   <textstring> $00 (00)


   Encoding:   $00  ISO-8859-1 [ISO-8859-1] character set is used => $00
                    is sync identifier.
               $01  Unicode [UNICODE] character set is used => $00 00 is
                    sync identifier.

   Content type:   $00 is other
                   $01 is lyrics
                   $02 is text transcription
                   $03 is movement/part name (e.g. "Adagio")
                   $04 is events (e.g. "Don Quijote enters the stage")
                   $05 is chord (e.g. "Bb F Fsus")

   Time stamp format is:

     $01  Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit
     $02  Absolute time, 32 bit sized, using milliseconds as unit

   Absolute time means that every stamp contains the time from the
   beginning of the file.

   The text that follows the frame header differs from that of the
   unsynchronised lyrics/text transcription in one major way. Each
   syllable (or whatever size of text is considered to be convenient by
   the encoder) is a null terminated string followed by a time stamp
   denoting where in the sound file it belongs. Each sync thus has the
   following structure:

     Terminated text to be synced (typically a syllable)
     Sync identifier (terminator to above string)   $00 (00)
     Time stamp                                     $xx (xx ...)

   The 'time stamp' is set to zero or the whole sync is omitted if
   located directly at the beginning of the sound. All time stamps should
   be sorted in chronological order. The sync can be considered as a
   validator of the subsequent string.

   Newline characters are allowed in all "SLT" frames and should be used
   after every entry (name, event etc.) in a frame with the content type
   $03 - $04.
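
   The sync structure described above could be read roughly as follows.
   This is a minimal sketch under stated assumptions, not a reference
   decoder: it handles only text encoding $00 (ISO-8859-1, one-byte sync
   identifier), assumes 32 bit time stamps stored most significant byte
   first, and expects the bytes that follow the content descriptor. The
   function name parse_slt_syncs is hypothetical.

```python
import struct


def parse_slt_syncs(body: bytes) -> list[tuple[str, int]]:
    """Split the sync list of an "SLT" frame into (text, time stamp) pairs.

    Assumptions: ISO-8859-1 text, a single $00 sync identifier, and a
    4-byte big-endian time stamp after every terminated string.
    """
    syncs = []
    pos = 0
    while pos < len(body):
        end = body.index(b"\x00", pos)      # ValueError if no terminator
        text = body[pos:end].decode("latin-1")
        # struct.error is raised if fewer than four bytes remain
        (stamp,) = struct.unpack_from(">I", body, end + 1)
        syncs.append((text, stamp))
        pos = end + 1 + 4                   # skip terminator and stamp
    return syncs
```

   The time stamps are returned as plain integers; whether they count
   MPEG frames or milliseconds depends on the 'Time stamp format' byte
   in the frame header.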

   A few considerations regarding whitespace characters: Whitespace
   separating words should mark the beginning of a new word, thus
   occurring in front of the first syllable of a new word. This is also
   valid for new line characters. A syllable followed by a comma should
   not be broken apart with a sync (both the syllable and the comma
   should be before the sync).

   An example: The "ULT" passage

     "Strangers in the night" $0A "Exchanging glances"

   would be "SLT" encoded as:

     "Strang" $00 xx xx "ers" $00 xx xx " in" $00 xx xx " the" $00 xx xx
     " night" $00 xx xx $0A "Ex" $00 xx xx "chang" $00 xx xx "ing" $00 xx
     xx "glan" $00 xx xx "ces" $00 xx xx

   There may be more than one "SLT" frame in each tag, but only one with
   the same language and content descriptor.


4.11.   Comments

   This frame replaces the old 30-character comment field in ID3v1. It
   consists of a frame header followed by encoding, language and content
   descriptors and is ended with the actual comment as a text string.
   Newline characters are allowed in the comment text string. There may
   be more than one comment frame in each tag, but only one with the same
   language and content descriptor.

     Comment                     "COM"
     Frame size                  $xx xx xx
     Text encoding               $xx
     Language                    $xx xx xx
     Short content description   <textstring> $00 (00)
     The actual text             <textstring>


4.12.   Relative volume adjustment

   This is a more subjective function than the previous ones. It allows
   the user to say how much to increase/decrease the volume on each
   channel while the file is played. The purpose is to be able to
   align all files to a reference volume, so that you don't have to
   change the volume constantly. This frame may also be used to adjust
   the balance of the audio.
   If the volume peak levels are known then this could be described
   with the 'Peak volume right' and 'Peak volume left' fields. If the
   peak volume is not known these fields could be left zeroed or
   completely omitted. There may only be one "RVA" frame in each tag.

     Relative volume adjustment      "RVA"
     Frame size                      $xx xx xx
     Increment/decrement             %000000xx
     Bits used for volume descr.     $xx
     Relative volume change, right   $xx xx (xx ...)
     Relative volume change, left    $xx xx (xx ...)
     Peak volume right               $xx xx (xx ...)
     Peak volume left                $xx xx (xx ...)

   In the increment/decrement field bit 0 is used to indicate the right
   channel and bit 1 is used to indicate the left channel. 1 is
   increment and 0 is decrement.

   The 'bits used for volume description' field is normally $10 (16 bits)
   for MPEG 2 layer I, II and III [MPEG] and MPEG 2.5. This value may not
   be $00. The volume is always represented with whole bytes, padded in
   the beginning (highest bits) when 'bits used for volume description'
   is not a multiple of eight.


4.13.   Equalisation

   This is another subjective, alignment frame. It allows the user to
   predefine an equalisation curve within the audio file. There may only
   be one "EQU" frame in each tag.

     Equalisation      "EQU"
     Frame size        $xx xx xx
     Adjustment bits   $xx

   The 'adjustment bits' field defines the number of bits used for
   representation of the adjustment. This is normally $10 (16 bits) for
   MPEG 2 layer I, II and III [MPEG] and MPEG 2.5. This value may not be
   $00.
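
   Both 'bits used for volume description' in "RVA" and 'adjustment
   bits' here give a bit count for values that are stored in whole
   bytes. A minimal sketch of that conversion is given below. It assumes
   values are stored most significant byte first and that any padding
   sits in the highest bits, as stated for "RVA"; the helper names
   bytes_for_bits and read_padded_value are hypothetical.

```python
def bytes_for_bits(bits: int) -> int:
    """Number of whole bytes needed to hold a value of `bits` bits."""
    if not 1 <= bits <= 0xFF:
        # the bit count is one byte and may not be $00
        raise ValueError("bit count must be in the range 1..255")
    return (bits + 7) // 8


def read_padded_value(raw: bytes, bits: int) -> int:
    """Decode a big-endian value, ignoring padding in the highest bits."""
    if len(raw) != bytes_for_bits(bits):
        raise ValueError("unexpected field width")
    return int.from_bytes(raw, "big") & ((1 << bits) - 1)
```

   For the usual value $10 (16 bits) every field is two bytes wide; a
   12 bit field would also occupy two bytes, with the top four bits
   treated as padding.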

   This is followed by 2 bytes + ('adjustment bits' rounded up to the
   nearest byte) for every equalisation band in the following format,
   giving a frequency range of 0 - 32767Hz:

     Increment/decrement   %x (MSB of the Frequency)
     Frequency             (lower 15 bits)
     Adjustment            $xx (xx ...)

   The increment/decrement bit is 1 for increment and 0 for decrement.
   The equalisation bands should be ordered increasingly with reference
   to frequency. Not all frequencies have to be declared. Adjustments
   with the value $00 should be omitted. A frequency should only be
   described once in the frame.


4.14.   Reverb

   Yet another subjective one. You may here adjust echoes of different
   kinds. Reverb left/right is the delay between every bounce in ms.
   Reverb bounces left/right is the number of bounces that should be
   made. $FF equals an infinite number of bounces. Feedback is the amount
   of volume that should be returned to the next echo bounce. $00 is 0%,
   $FF is 100%. If this value were $7F, there would be 50% volume
   reduction on the first bounce, another 50% on the second and so on.
   Left to left means the sound from the left bounce to be played in the
   left speaker, while left to right means sound from the left bounce to
   be played in the right speaker.

   'Premix left to right' is the amount of left sound to be mixed in the
   right before any reverb is applied, where $00 is 0% and $FF is 100%.
   'Premix right to left' does the same thing, but right to left. Setting
   both premix to $FF would result in a mono output (if the reverb is
   applied symmetrically). There may only be one "REV" frame in each tag.

     Reverb settings                   "REV"
     Frame size                        $00 00 0C
     Reverb left (ms)                  $xx xx
     Reverb right (ms)                 $xx xx
     Reverb bounces, left              $xx
     Reverb bounces, right             $xx
     Reverb feedback, left to left     $xx
     Reverb feedback, left to right    $xx
     Reverb feedback, right to right   $xx
     Reverb feedback, right to left    $xx
     Premix left to right              $xx
     Premix right to left              $xx


4.15.   Attached picture

   This frame contains a picture directly related to the audio file.
   Image format is preferably "PNG" [PNG] or "JPG" [JFIF]. Description
   is a short description of the picture, represented as a terminated
   textstring. The description has a maximum length of 64 characters,
   but may be empty. There may be several pictures attached to one file,
   each in their individual "PIC" frame, but only one with the same
   content descriptor. There may only be one picture with the picture
   type declared as picture type $01 and $02 respectively. There is a
   possibility to put only a link to the image file by using the 'image
   format' "-->" and having a complete URL [URL] instead of picture data.
   Linked files should however be used restrictively since there is the
   risk of separation of files.

     Attached picture   "PIC"
     Frame size         $xx xx xx
     Text encoding      $xx
     Image format       $xx xx xx
     Picture type       $xx
     Description        <textstring> $00 (00)
     Picture data       <binary data>


   Picture type:  $00  Other
                  $01  32x32 pixels 'file icon' (PNG only)
                  $02  Other file icon
                  $03  Cover (front)
                  $04  Cover (back)
                  $05  Leaflet page
                  $06  Media (e.g.
label side of CD)
                  $07  Lead artist/lead performer/soloist
                  $08  Artist/performer
                  $09  Conductor
                  $0A  Band/Orchestra
                  $0B  Composer
                  $0C  Lyricist/text writer
                  $0D  Recording Location
                  $0E  During recording
                  $0F  During performance
                  $10  Movie/video screen capture
                  $11  A bright coloured fish
                  $12  Illustration
                  $13  Band/artist logotype
                  $14  Publisher/Studio logotype


4.16.   General encapsulated object

   In this frame any type of file can be encapsulated. After the header,
   'Frame size' and 'Encoding' follows 'MIME type' [MIME] and 'Filename'
   for the encapsulated object, both represented as terminated strings
   encoded with ISO-8859-1 [ISO-8859-1]. The filename is case sensitive.
   Then follows a content description as terminated string, encoded as
   'Encoding'. The last thing in the frame is the actual object. The
   first two strings may be omitted, leaving only their terminations.
   MIME type is always an ISO-8859-1 text string. There may be more than
   one "GEO" frame in each tag, but only one with the same content
   descriptor.

     General encapsulated object   "GEO"
     Frame size                    $xx xx xx
     Text encoding                 $xx
     MIME type                     <textstring> $00
     Filename                      <textstring> $00 (00)
     Content description           <textstring> $00 (00)
     Encapsulated object           <binary data>


4.17.   Play counter

   This is simply a counter of the number of times a file has been
   played. The value is increased by one every time the file begins to
   play. There may only be one "CNT" frame in each tag. When the counter
   reaches all ones, one byte is inserted in front of the counter thus
   making the counter eight bits bigger. The counter must be at least
   32 bits long to begin with.
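
   The growth rule above might be implemented along these lines. This is
   a sketch under assumptions rather than prescribed behaviour: the
   counter is taken to be stored most significant byte first, and the
   extra byte is added at the moment an all-ones counter would otherwise
   overflow. The function name increment_counter is hypothetical.

```python
def increment_counter(counter: bytes) -> bytes:
    """Return the play counter increased by one.

    Assumption: big-endian storage. If every bit is already set, a zero
    byte is inserted in front first so the increment cannot overflow.
    """
    if len(counter) < 4:
        raise ValueError("the counter must be at least 32 bits long")
    if all(byte == 0xFF for byte in counter):
        counter = b"\x00" + counter
    value = int.from_bytes(counter, "big") + 1
    return value.to_bytes(len(counter), "big")
```

   A four byte counter holding $FF FF FF FF thus becomes the five byte
   counter $01 00 00 00 00 on the next play.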

     Play counter   "CNT"
     Frame size     $xx xx xx
     Counter        $xx xx xx xx (xx ...)


4.18.   Popularimeter

   The purpose of this frame is to specify how good an audio file is.
   Many interesting applications could be found for this frame, such as
   a playlist that features better audio files more often than others,
   or it could be used to profile a person's taste and find other 'good'
   files by comparing people's profiles. The frame is very simple. It
   contains the email address of the user, one rating byte and a four
   byte play counter, intended to be increased by one every time the
   file is played. The email is a terminated string. The rating is 1-255
   where 1 is worst and 255 is best. 0 is unknown. If no personal counter
   is wanted it may be omitted. When the counter reaches all ones, one
   byte is inserted in front of the counter thus making the counter
   eight bits bigger in the same way as the play counter ("CNT").
   There may be more than one "POP" frame in each tag, but only one with
   the same email address.

     Popularimeter   "POP"
     Frame size      $xx xx xx
     Email to user   <textstring> $00
     Rating          $xx
     Counter         $xx xx xx xx (xx ...)


4.19.   Recommended buffer size

   Sometimes the server from which an audio file is streamed is aware of
   transmission or coding problems resulting in interruptions in the
   audio stream. In these cases, the size of the buffer can be
   recommended by the server using this frame. If the 'embedded info
   flag' is true (1) then this indicates that an ID3 tag with the
   maximum size described in 'Buffer size' may occur in the audio stream.
   In such case the tag should reside between two MPEG [MPEG] frames, if
   the audio is MPEG encoded. If the position of the next tag is known,
   'offset to next tag' may be used.
   The offset is calculated from the end of the tag in which this frame
   resides to the first byte of the header in the next. This field may
   be omitted. Embedded tags are currently not recommended since they
   could cause unpredictable behaviour in present software/hardware.
   The 'Buffer size' should be kept to a minimum. There may only be one
   "BUF" frame in each tag.

     Recommended buffer size   "BUF"
     Frame size                $xx xx xx
     Buffer size               $xx xx xx
     Embedded info flag        %0000000x
     Offset to next tag        $xx xx xx xx


4.20.   Encrypted meta frame

   This frame contains one or more encrypted frames. This enables
   protection of copyrighted information such as pictures and text, that
   people might want to pay extra for. Since standardisation of such an
   encryption scheme is beyond this document, all "CRM" frames begin with
   a terminated string with a URL [URL] containing an email address, or a
   link to a location where an email address can be found, that belongs
   to the organisation responsible for this specific encrypted meta
   frame.

   Questions regarding the encrypted frame should be sent to the
   indicated email address. If a $00 is found directly after the 'Frame
   size', the whole frame should be ignored, and preferably be removed.
   The 'Owner identifier' is then followed by a short content description
   and explanation as to why it's encrypted. After the
   'content/explanation' description, the actual encrypted block follows.

   When an ID3v2 decoder encounters a "CRM" frame, it should send the
   datablock to the 'plugin' with the corresponding 'owner identifier'
   and expect to receive either a datablock with one or several ID3v2
   frames after each other or an error. There may be more than one "CRM"
   frame in a tag, but only one with the same 'owner identifier'.

     Encrypted meta frame   "CRM"
     Frame size             $xx xx xx
     Owner identifier       <textstring> $00 (00)
     Content/explanation    <textstring> $00 (00)
     Encrypted datablock    <binary data>


4.21.   Audio encryption

   This frame indicates if the actual audio stream is encrypted, and by
   whom. Since standardisation of such an encryption scheme is beyond
   this document, all "CRA" frames begin with a terminated string with a
   URL containing an email address, or a link to a location where an
   email address can be found, that belongs to the organisation
   responsible for this specific encrypted audio file. Questions
   regarding the encrypted audio should be sent to the email address
   specified. If a $00 is found directly after the 'Frame size' and the
   audio file indeed is encrypted, the whole file may be considered
   useless.

   After the 'Owner identifier', a pointer to an unencrypted part of the
   audio can be specified. The 'Preview start' and 'Preview length' are
   described in frames. If no part is unencrypted, these fields should be
   left zeroed. After the 'preview length' field follows optionally a
   datablock required for decryption of the audio. There may be more than
   one "CRA" frame in a tag, but only one with the same 'Owner
   identifier'.
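
   Splitting a "CRA" frame body into the fields just described could
   look as follows. This is a hedged sketch, not a reference parser: it
   assumes a single-byte ($00) terminator after the owner identifier,
   two-byte 'Preview start' and 'Preview length' fields as in the frame
   layout given for "CRA", and most significant byte first. The function
   name parse_cra is hypothetical.

```python
import struct


def parse_cra(body: bytes) -> tuple[str, int, int, bytes]:
    """Return (owner identifier, preview start, preview length, info).

    Assumptions: ISO-8859-1 owner identifier with a one-byte terminator,
    followed by two big-endian 16 bit values counted in frames.
    """
    end = body.index(b"\x00")               # ValueError if unterminated
    owner = body[:end].decode("latin-1")
    start, length = struct.unpack_from(">HH", body, end + 1)
    info = body[end + 5:]                   # optional decryption data
    return owner, start, length, info
```

   An empty owner identifier (a $00 directly after the frame size) comes
   back as an empty string, which a caller can treat as the 'useless
   file' case mentioned above.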

     Audio encryption   "CRA"
     Frame size         $xx xx xx
     Owner identifier   <textstring> $00 (00)
     Preview start      $xx xx
     Preview length     $xx xx
     Encryption info    <binary data>


4.22.   Linked information

   To keep space waste as low as possible this frame may be used to link
   information from another ID3v2 tag that might reside in another audio
   file or alone in a binary file. It is recommended that this method is
   only used when the files are stored on a CD-ROM or in other
   circumstances where the risk of file separation is low. The frame
   contains a frame identifier, which is the frame that should be linked
   into this tag, a URL [URL] field, which gives a reference to the file
   where the frame resides, and additional ID data, if needed. Data
   should be retrieved from the first tag found in the file to which
   this link points. There may be more than one "LNK" frame in a tag,
   but only one with the same contents. A linked frame is to be
   considered as part of the tag and has the same restrictions as if it
   were a physical part of the tag (i.e. only one "REV" frame allowed,
   whether it's linked or not).

     Linked information   "LNK"
     Frame size           $xx xx xx
     Frame identifier     $xx xx xx
     URL                  <textstring> $00 (00)
     Additional ID data   <textstring(s)>

   Frames that may be linked and need no additional data are "IPL",
   "MCI", "ETC", "MLL", "STC", "RVA", "EQU", "REV", "BUF", the text
   information frames and the URL link frames.

   The "TXX", "PIC", "GEO", "CRM" and "CRA" frames may be linked with the
   content descriptor as additional ID data.

   The "COM", "SLT" and "ULT" frames may be linked with three bytes of
   language descriptor directly followed by a content descriptor as
   additional ID data.


5.
The 'unsynchronisation scheme'

   The only purpose of the 'unsynchronisation scheme' is to make the
   ID3v2 tag as compatible as possible with existing software. There is
   no use in 'unsynchronising' tags if the file is only to be processed
   by new software. Unsynchronisation may only be made with MPEG 2 layer
   I, II and III and MPEG 2.5 files.

   Whenever a false synchronisation is found within the tag, one zeroed
   byte is inserted after the first false synchronisation byte. The
   format of a correct sync that should be altered by ID3 encoders is as
   follows:

     %11111111 111xxxxx

   And should be replaced with:

     %11111111 00000000 111xxxxx

   This has the side effect that all $FF 00 combinations have to be
   altered, so they won't be affected by the decoding process. Therefore
   all the $FF 00 combinations have to be replaced with the $FF 00 00
   combination during the unsynchronisation.

   To indicate usage of the unsynchronisation, the first bit in 'ID3
   flags' should be set. This bit should only be set if the tag
   contained a, now corrected, false synchronisation. The bit should
   only be clear if the tag does not contain any false synchronisations.

   Do bear in mind that if a compression scheme is used by the encoder,
   the unsynchronisation scheme should be applied *afterwards*. When
   decoding a compressed, 'unsynchronised' file, the 'unsynchronisation
   scheme' should be reversed first and the decompression applied
   afterwards.


6.   Copyright

   Copyright (C) Martin Nilsson 1998. All Rights Reserved.
1368 1369 This document and translations of it may be copied and furnished to 1370 others, and derivative works that comment on or otherwise explain it 1371 or assist in its implementation may be prepared, copied, published 1372 and distributed, in whole or in part, without restriction of any 1373 kind, provided that a reference to this document is included on all 1374 such copies and derivative works. However, this document itself may 1375 not be modified in any way and reissued as the original document. 1376 1377 The limited permissions granted above are perpetual and will not be 1378 revoked. 1379 1380 This document and the information contained herein is provided on an 1381 "AS IS" basis and THE AUTHORS DISCLAIMS ALL WARRANTIES, EXPRESS OR 1382 IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE 1383 INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 1384 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 1385 1386 1387 7. References 1388 1389 [CDDB] Compact Disc Data Base 1390 1391 <url:<a href="http://www.cddb.com">http://www.cddb.com</a>> 1392 1393 [ISO-639-2] ISO/FDIS 639-2. 1394 Codes for the representation of names of languages, Part 2: Alpha-3 1395 code. Technical committee / subcommittee: TC 37 / SC 2 1396 1397 [ISO-8859-1] ISO/IEC DIS 8859-1. 1398 8-bit single-byte coded graphic character sets, Part 1: Latin 1399 alphabet No. 1. Technical committee / subcommittee: JTC 1 / SC 2 1400 1401 [ISRC] ISO 3901:1986 1402 International Standard Recording Code (ISRC). 1403 Technical committee / subcommittee: TC 46 / SC 9 1404 1405 [JFIF] JPEG File Interchange Format, version 1.02 1406 1407 <url:<a href="http://www.w3.org/Graphics/JPEG/jfif.txt">http://www.w3.org/Graphics/JPEG/jfif.txt</a>> 1408 1409 [MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail 1410 Extensions (MIME) Part One: Format of Internet Message Bodies", 1411 RFC 2045, November 1996. 
1412 1413 <url:<a href="ftp://ftp.isi.edu/in-notes/rfc2045.txt">ftp://ftp.isi.edu/in-notes/rfc2045.txt</a>> 1414 1415 [MPEG] ISO/IEC 11172-3:1993. 1416 Coding of moving pictures and associated audio for digital storage 1417 media at up to about 1,5 Mbit/s, Part 3: Audio. 1418 Technical committee / subcommittee: JTC 1 / SC 29 1419 and 1420 ISO/IEC 13818-3:1995 1421 Generic coding of moving pictures and associated audio information, 1422 Part 3: Audio. 1423 Technical committee / subcommittee: JTC 1 / SC 29 1424 and 1425 ISO/IEC DIS 13818-3 1426 Generic coding of moving pictures and associated audio information, 1427 Part 3: Audio (Revision of ISO/IEC 13818-3:1995) 1428 1429 1430 [PNG] Portable Network Graphics, version 1.0 1431 1432 <url:<a href="http://www.w3.org/TR/REC-png-multi.html">http://www.w3.org/TR/REC-png-multi.html</a>> 1433 1434 [UNICODE] ISO/IEC 10646-1:1993. 1435 Universal Multiple-Octet Coded Character Set (UCS), Part 1: 1436 Architecture and Basic Multilingual Plane. Technical committee 1437 / subcommittee: JTC 1 / SC 2 1438 1439 <url:<a href="http://www.unicode.org">http://www.unicode.org</a>> 1440 1441 [URL] T. Berners-Lee, L. Masinter & M. McCahill, "Uniform Resource 1442 Locators (URL).", RFC 1738, December 1994. 1443 1444 <url:<a href="ftp://ftp.isi.edu/in-notes/rfc1738.txt">ftp://ftp.isi.edu/in-notes/rfc1738.txt</a>> 1445 1446 1447 8. Appendix 1448 1449 1450 A. Appendix A - ID3-Tag Specification V1.1 1451 1452 ID3-Tag Specification V1.1 (12 dec 1997) by Michael Mutschler 1453 <amiga2@info2.rus.uni-stuttgart.de>, edited for space and clarity 1454 reasons. 1455 1456 1457 A.1. Overview 1458 1459 The ID3-Tag is an information field for MPEG Layer 3 audio files. 1460 Since a standalone MP3 doesn't provide a method of storing other 1461 information than those directly needed for replay reasons, the 1462 ID3-tag was invented by Eric Kemp in 1996. 

   A revision from ID3v1 to ID3v1.1 was made by Michael Mutschler to
   support track number information; it is described in A.4.


A.2.   ID3v1 Implementation

   The information is stored in the last 128 bytes of an MP3. The tag
   has the following fields; the offsets given here range from 0 to
   127.

     Field      Length    Offsets
     Tag        3           0-2
     Songname   30          3-32
     Artist     30         33-62
     Album      30         63-92
     Year       4          93-96
     Comment    30         97-126
     Genre      1           127


   The string fields contain ASCII data, coded in the ISO-Latin 1
   codepage. Strings which are smaller than the field length are padded
   with zero bytes.

     Tag: The tag is valid if this field contains the string "TAG". This
        has to be uppercase!

     Songname: This field contains the title of the MP3 (string as
        above).

     Artist: This field contains the artist of the MP3 (string as above).

     Album: This field contains the album where the MP3 comes from
        (string as above).

     Year: This field contains the year when this song was originally
        released (string as above).

     Comment: This field contains a comment for the MP3 (string as
        above). Revision to this field has been made in ID3v1.1. See
        A.4.

     Genre: This byte contains the offset of a genre in a predefined
        list; the byte is treated as an unsigned byte. The offset
        starts from 0. See A.3.


A.3.
Genre List

   The following genres are defined in ID3v1

     0.Blues
     1.Classic Rock
     2.Country
     3.Dance
     4.Disco
     5.Funk
     6.Grunge
     7.Hip-Hop
     8.Jazz
     9.Metal
     10.New Age
     11.Oldies
     12.Other
     13.Pop
     14.R&B
     15.Rap
     16.Reggae
     17.Rock
     18.Techno
     19.Industrial
     20.Alternative
     21.Ska
     22.Death Metal
     23.Pranks
     24.Soundtrack
     25.Euro-Techno
     26.Ambient
     27.Trip-Hop
     28.Vocal
     29.Jazz+Funk
     30.Fusion
     31.Trance
     32.Classical
     33.Instrumental
     34.Acid
     35.House
     36.Game
     37.Sound Clip
     38.Gospel
     39.Noise
     40.AlternRock
     41.Bass
     42.Soul
     43.Punk
     44.Space
     45.Meditative
     46.Instrumental Pop
     47.Instrumental Rock
     48.Ethnic
     49.Gothic
     50.Darkwave
     51.Techno-Industrial
     52.Electronic
     53.Pop-Folk
     54.Eurodance
     55.Dream
     56.Southern Rock
     57.Comedy
     58.Cult
     59.Gangsta
     60.Top 40
     61.Christian Rap
     62.Pop/Funk
     63.Jungle
     64.Native American
     65.Cabaret
     66.New Wave
     67.Psychadelic
     68.Rave
     69.Showtunes
     70.Trailer
     71.Lo-Fi
     72.Tribal
     73.Acid Punk
     74.Acid Jazz
     75.Polka
     76.Retro
     77.Musical
     78.Rock & Roll
     79.Hard Rock

   The following genres are Winamp extensions

     80.Folk
     81.Folk-Rock
     82.National Folk
     83.Swing
     84.Fast Fusion
     85.Bebob
     86.Latin
     87.Revival
     88.Celtic
     89.Bluegrass
     90.Avantgarde
     91.Gothic Rock
     92.Progressive Rock
     93.Psychedelic Rock
     94.Symphonic Rock
     95.Slow Rock
     96.Big Band
     97.Chorus
     98.Easy Listening
     99.Acoustic
     100.Humour
     101.Speech
     102.Chanson
     103.Opera
     104.Chamber Music
     105.Sonata
     106.Symphony
     107.Booty Bass
     108.Primus
     109.Porn Groove
     110.Satire
     111.Slow
Jam 1630 112.Club 1631 113.Tango 1632 114.Samba 1633 115.Folklore 1634 116.Ballad 1635 117.Power Ballad 1636 118.Rhythmic Soul 1637 119.Freestyle 1638 120.Duet 1639 121.Punk Rock 1640 122.Drum Solo 1641 123.A capella 1642 124.Euro-House 1643 125.Dance Hall 1644 1645 1646 A.4. Track addition - ID3v1.1 1647 1648 In ID3v1.1, Michael Mutschler revised the specification of the 1649 comment field in order to implement the track number. The new format 1650 of the comment field is a 28 character string followed by a mandatory 1651 null ($00) character and the original album tracknumber stored as an 1652 unsigned byte-size integer. In such cases where the 29th byte is not 1653 the null character or when the 30th is a null character, the 1654 tracknumber is to be considered undefined. 1655 1656 1657 9. Author's Address 1658 1659 Martin Nilsson 1660 Rydsvägen 246 C. 30 1661 S-584 34 Linköping 1662 Sweden 1663 1664 Email: nilsson@id3.org 1665 1666 Co-authors: 1667 1668 Johan Sundström Email: johan@id3.org 1669 1670 1671 </pre></body></html>