$Id: id3v2.4.0-frames.txt,v 1.1 2003/07/27 18:28:34 id3 Exp $

Informal standard                                          M. Nilsson
Document: id3v2.4.0-frames.txt                      1st November 2000


                 ID3 tag version 2.4.0 - Native Frames

Status of this document

   This document is an informal standard and replaces the ID3v2.3.0
   standard [ID3v2]. A formal standard will use another revision number
   even if the content is identical to this document. The contents in
   this document may change for clarifications but never for added or
   altered functionality.

   Distribution of this document is unlimited.


Abstract

   This document describes the frames natively supported by ID3v2.4.0,
   which is a revised version of the ID3v2 informal standard [ID3v2.3.0]
   version 2.3.0. The ID3v2 offers a flexible way of storing audio meta
   information within the audio file itself. The information may be
   technical information, such as equalisation curves, as well as title,
   performer, copyright etc.

   ID3v2.4.0 is meant to be as close as possible to ID3v2.3.0 in order
   to allow for implementations to be revised as easily as possible.


1.   Table of contents

   2.   Conventions in this document
   3.   Default flags
   4.   Declared ID3v2 frames
     4.1.   Unique file identifier
     4.2.   Text information frames
       4.2.1.   Identification frames
       4.2.2.   Involved persons frames
       4.2.3.   Derived and subjective properties frames
       4.2.4.   Rights and license frames
       4.2.5.   Other text frames
       4.2.6.   User defined text information frame
     4.3.   URL link frames
       4.3.1.   URL link frames - details
       4.3.2.   User defined URL link frame
     4.4.   Music CD identifier
     4.5.   Event timing codes
     4.6.   MPEG location lookup table
     4.7.   Synchronised tempo codes
     4.8.   Unsynchronised lyrics/text transcription
     4.9.   Synchronised lyrics/text
     4.10.  Comments
     4.11.  Relative volume adjustment (2)
     4.12.  Equalisation (2)
     4.13.  Reverb
     4.14.  Attached picture
     4.15.  General encapsulated object
     4.16.  Play counter
     4.17.  Popularimeter
     4.18.  Recommended buffer size
     4.19.  Audio encryption
     4.20.  Linked information
     4.21.  Position synchronisation frame
     4.22.  Terms of use
     4.23.  Ownership frame
     4.24.  Commercial frame
     4.25.  Encryption method registration
     4.26.  Group identification registration
     4.27.  Private frame
     4.28.  Signature frame
     4.29.  Seek frame
     4.30.  Audio seek point index
   5.   Copyright
   6.   References
   7.   Appendix
     A.   Appendix A - Genre List from ID3v1
   8.   Author's Address


2.   Conventions in this document

   Text within "" is a text string exactly as it appears in a tag.
   Numbers preceded with $ are hexadecimal and numbers preceded with %
   are binary. $xx is used to indicate a byte with unknown content. %x
   is used to indicate a bit with unknown content. The most significant
   bit (MSB) of a byte is called 'bit 7' and the least significant bit
   (LSB) is called 'bit 0'.

   A tag is the whole tag described in the ID3v2 main structure document
   [ID3v2-strct]. A frame is a block of information in the tag. The tag
   consists of a header, frames and optional padding. A field is a piece
   of information; one value, a string etc. A numeric string is a string
   that consists of the characters "0123456789" only.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].


3.   Default flags

   The default settings for the frames described in this document can be
   divided into the following classes. The flags may be set differently
   if found more suitable by the software.

    1. Discarded if tag is altered, discarded if file is altered.

       None.

    2. Discarded if tag is altered, preserved if file is altered.

       None.

    3. Preserved if tag is altered, discarded if file is altered.

       ASPI, AENC, ETCO, EQU2, MLLT, POSS, SEEK, SYLT, SYTC, RVA2, TENC,
       TLEN

    4. Preserved if tag is altered, preserved if file is altered.

       The rest of the frames.


4.   Declared ID3v2 frames

   The following frames are declared in this draft.

  4.19  AENC Audio encryption
  4.14  APIC Attached picture
  4.30  ASPI Audio seek point index

  4.10  COMM Comments
  4.24  COMR Commercial frame

  4.25  ENCR Encryption method registration
  4.12  EQU2 Equalisation (2)
  4.5   ETCO Event timing codes

  4.15  GEOB General encapsulated object
  4.26  GRID Group identification registration

  4.20  LINK Linked information

  4.4   MCDI Music CD identifier
  4.6   MLLT MPEG location lookup table

  4.23  OWNE Ownership frame

  4.27  PRIV Private frame
  4.16  PCNT Play counter
  4.17  POPM Popularimeter
  4.21  POSS Position synchronisation frame

  4.18  RBUF Recommended buffer size
  4.11  RVA2 Relative volume adjustment (2)
  4.13  RVRB Reverb

  4.29  SEEK Seek frame
  4.28  SIGN Signature frame
  4.9   SYLT Synchronised lyric/text
  4.7   SYTC Synchronised tempo codes

  4.2.1 TALB Album/Movie/Show title
  4.2.3 TBPM BPM (beats per minute)
  4.2.2 TCOM Composer
  4.2.3 TCON Content type
  4.2.4 TCOP Copyright message
  4.2.5 TDEN Encoding time
  4.2.5 TDLY Playlist delay
  4.2.5 TDOR Original release time
  4.2.5 TDRC Recording time
  4.2.5 TDRL Release time
  4.2.5 TDTG Tagging time
  4.2.2 TENC Encoded by
  4.2.2 TEXT Lyricist/Text writer
  4.2.3 TFLT File type
  4.2.2 TIPL Involved people list
  4.2.1 TIT1 Content group description
  4.2.1 TIT2 Title/songname/content description
  4.2.1 TIT3 Subtitle/Description refinement
  4.2.3 TKEY Initial key
  4.2.3 TLAN Language(s)
  4.2.3 TLEN Length
  4.2.2 TMCL Musician credits list
  4.2.3 TMED Media type
  4.2.3 TMOO Mood
  4.2.1 TOAL Original album/movie/show title
  4.2.5 TOFN Original filename
  4.2.2 TOLY Original lyricist(s)/text writer(s)
  4.2.2 TOPE Original artist(s)/performer(s)
  4.2.4 TOWN File owner/licensee
  4.2.2 TPE1 Lead performer(s)/Soloist(s)
  4.2.2 TPE2 Band/orchestra/accompaniment
  4.2.2 TPE3 Conductor/performer refinement
  4.2.2 TPE4 Interpreted, remixed, or otherwise modified by
  4.2.1 TPOS Part of a set
  4.2.4 TPRO Produced notice
  4.2.4 TPUB Publisher
  4.2.1 TRCK Track number/Position in set
  4.2.4 TRSN Internet radio station name
  4.2.4 TRSO Internet radio station owner
  4.2.5 TSOA Album sort order
  4.2.5 TSOP Performer sort order
  4.2.5 TSOT Title sort order
  4.2.1 TSRC ISRC (international standard recording code)
  4.2.5 TSSE Software/Hardware and settings used for encoding
  4.2.1 TSST Set subtitle
  4.2.6 TXXX User defined text information frame

  4.1   UFID Unique file identifier
  4.22  USER Terms of use
  4.8   USLT Unsynchronised lyric/text transcription

  4.3.1 WCOM Commercial information
  4.3.1 WCOP Copyright/Legal information
  4.3.1 WOAF Official audio file webpage
  4.3.1 WOAR Official artist/performer webpage
  4.3.1 WOAS Official audio source webpage
  4.3.1 WORS Official Internet radio station homepage
  4.3.1 WPAY Payment
  4.3.1 WPUB Publishers official webpage
  4.3.2 WXXX User defined URL link frame


4.1.   Unique file identifier

   This frame's purpose is to be able to identify the audio file in a
   database, that may provide more information relevant to the content.
   Since standardisation of such a database is beyond this document, all
   UFID frames begin with an 'owner identifier' field. It is a null-
   terminated string with a URL [URL] containing an email address, or a
   link to a location where an email address can be found, that belongs
   to the organisation responsible for this specific database
   implementation.
   Questions regarding the database should be sent to
   the indicated email address. The URL should not be used for the
   actual database queries. The string
   "http://www.id3.org/dummy/ufid.html" should be used for tests. The
   'Owner identifier' must be non-empty (more than just a termination).
   The 'Owner identifier' is then followed by the actual identifier,
   which may be up to 64 bytes. There may be more than one "UFID" frame
   in a tag, but only one with the same 'Owner identifier'.

     <Header for 'Unique file identifier', ID: "UFID">
     Owner identifier        <text string> $00
     Identifier              <up to 64 bytes binary data>


4.2.   Text information frames

   The text information frames are often the most important frames,
   containing information like artist, album and more. There may only be
   one text information frame of its kind in a tag. All text
   information frames support multiple strings, stored as a null
   separated list, where null is represented by the termination code
   for the character encoding. All text frame identifiers begin with "T".
   Only text frame identifiers begin with "T", with the exception of the
   "TXXX" frame. All the text information frames have the following
   format:

     <Header for 'Text information frame', ID: "T000" - "TZZZ",
     excluding "TXXX" described in 4.2.6.>
     Text encoding                $xx
     Information                  <text string(s) according to encoding>


4.2.1.   Identification frames

  TIT1
   The 'Content group description' frame is used if the sound belongs to
   a larger category of sounds/music. For example, classical music is
   often sorted in different musical sections (e.g. "Piano Concerto",
   "Weather - Hurricane").

  TIT2
   The 'Title/Songname/Content description' frame is the actual name of
   the piece (e.g. "Adagio", "Hurricane Donna").

  TIT3
   The 'Subtitle/Description refinement' frame is used for information
   directly related to the contents title (e.g. "Op. 16" or "Performed
   live at Wembley").

  TALB
   The 'Album/Movie/Show title' frame is intended for the title of the
   recording (or source of sound) from which the audio in the file is
   taken.

  TOAL
   The 'Original album/movie/show title' frame is intended for the title
   of the original recording (or source of sound), if for example the
   music in the file should be a cover of a previously released song.

  TRCK
   The 'Track number/Position in set' frame is a numeric string
   containing the order number of the audio-file on its original
   recording. This MAY be extended with a "/" character and a numeric
   string containing the total number of tracks/elements on the original
   recording. E.g. "4/9".

  TPOS
   The 'Part of a set' frame is a numeric string that describes which
   part of a set the audio came from. This frame is used if the source
   described in the "TALB" frame is divided into several mediums, e.g. a
   double CD. The value MAY be extended with a "/" character and a
   numeric string containing the total number of parts in the set. E.g.
   "1/2".

  TSST
   The 'Set subtitle' frame is intended for the subtitle of the part of
   a set this track belongs to.

  TSRC
   The 'ISRC' frame should contain the International Standard Recording
   Code [ISRC] (12 characters).


4.2.2.   Involved persons frames

  TPE1
   The 'Lead artist/Lead performer/Soloist/Performing group' is
   used for the main artist.

  TPE2
   The 'Band/Orchestra/Accompaniment' frame is used for additional
   information about the performers in the recording.

  TPE3
   The 'Conductor' frame is used for the name of the conductor.

  TPE4
   The 'Interpreted, remixed, or otherwise modified by' frame contains
   more information about the people behind a remix and similar
   interpretations of another existing piece.

  TOPE
   The 'Original artist/performer' frame is intended for the performer
   of the original recording, if for example the music in the file
   should be a cover of a previously released song.

  TEXT
   The 'Lyricist/Text writer' frame is intended for the writer of the
   text or lyrics in the recording.

  TOLY
   The 'Original lyricist/text writer' frame is intended for the
   text writer of the original recording, if for example the music in
   the file should be a cover of a previously released song.

  TCOM
   The 'Composer' frame is intended for the name of the composer.

  TMCL
   The 'Musician credits list' is intended as a mapping between
   instruments and the musician that played it. Every odd field is an
   instrument and every even is an artist or a comma delimited list of
   artists.

  TIPL
   The 'Involved people list' is very similar to the musician credits
   list, but maps between functions, like producer, and names.

  TENC
   The 'Encoded by' frame contains the name of the person or
   organisation that encoded the audio file. This field may contain a
   copyright message, if the audio file also is copyrighted by the
   encoder.


4.2.3.   Derived and subjective properties frames

  TBPM
   The 'BPM' frame contains the number of beats per minute in the
   main part of the audio. The BPM is an integer and represented as a
   numerical string.

  TLEN
   The 'Length' frame contains the length of the audio file in
   milliseconds, represented as a numeric string.

  TKEY
   The 'Initial key' frame contains the musical key in which the sound
   starts. It is represented as a string with a maximum length of three
   characters.
   The ground keys are represented with "A","B","C","D","E",
   "F" and "G" and halfkeys represented with "b" and "#". Minor is
   represented as "m", e.g. "Dbm" $00. Off key is represented with an
   "o" only.

  TLAN
   The 'Language' frame should contain the languages of the text or
   lyrics spoken or sung in the audio. The language is represented with
   three characters according to ISO-639-2 [ISO-639-2]. If more than one
   language is used in the text their language codes should follow
   according to the amount of their usage, e.g. "eng" $00 "sve" $00.

  TCON
   The 'Content type', which in ID3v1 was stored as a one byte numeric
   value only, is now a string. You may use one or several of the ID3v1
   types as numerical strings, or, since the category list would be
   impossible to maintain with accurate and up to date categories,
   define your own. Example: "21" $00 "Eurodisco" $00

   You may also use any of the following keywords:

     RX  Remix
     CR  Cover

  TFLT
   The 'File type' frame indicates which type of audio this tag defines.
   The following types and refinements are defined:

     MIME   MIME type follows
     MPG    MPEG Audio
       /1     MPEG 1/2 layer I
       /2     MPEG 1/2 layer II
       /3     MPEG 1/2 layer III
       /2.5   MPEG 2.5
       /AAC   Advanced audio compression
     VQF    Transform-domain Weighted Interleave Vector Quantisation
     PCM    Pulse Code Modulated audio

   Other types may be used, but not as alternatives for the types
   listed above. This is used in a similar way to the predefined types
   in the "TMED" frame, but without parentheses. If this frame is not
   present audio type is assumed to be "MPG".

  TMED
   The 'Media type' frame describes from which media the sound
   originated. This may be a text string or a reference to the
   predefined media types found in the list below. Example:
   "VID/PAL/VHS" $00.

    DIG    Other digital media
      /A    Analogue transfer from media

    ANA    Other analogue media
      /WAC  Wax cylinder
      /8CA  8-track tape cassette

    CD     CD
      /A    Analogue transfer from media
      /DD   DDD
      /AD   ADD
      /AA   AAD

    LD     Laserdisc

    TT     Turntable records
      /33    33.33 rpm
      /45    45 rpm
      /71    71.29 rpm
      /76    76.59 rpm
      /78    78.26 rpm
      /80    80 rpm

    MD     MiniDisc
      /A    Analogue transfer from media

    DAT    DAT
      /A    Analogue transfer from media
      /1    standard, 48 kHz/16 bits, linear
      /2    mode 2, 32 kHz/16 bits, linear
      /3    mode 3, 32 kHz/12 bits, non-linear, low speed
      /4    mode 4, 32 kHz/12 bits, 4 channels
      /5    mode 5, 44.1 kHz/16 bits, linear
      /6    mode 6, 44.1 kHz/16 bits, 'wide track' play

    DCC    DCC
      /A    Analogue transfer from media

    DVD    DVD
      /A    Analogue transfer from media

    TV     Television
      /PAL    PAL
      /NTSC   NTSC
      /SECAM  SECAM

    VID    Video
      /PAL    PAL
      /NTSC   NTSC
      /SECAM  SECAM
      /VHS    VHS
      /SVHS   S-VHS
      /BETA   BETAMAX

    RAD    Radio
      /FM   FM
      /AM   AM
      /LW   LW
      /MW   MW

    TEL    Telephone
      /I    ISDN

    MC     MC (normal cassette)
      /4    4.75 cm/s (normal speed for a two sided cassette)
      /9    9.5 cm/s
      /I    Type I cassette (ferric/normal)
      /II   Type II cassette (chrome)
      /III  Type III cassette (ferric chrome)
      /IV   Type IV cassette (metal)

    REE    Reel
      /9    9.5 cm/s
      /19   19 cm/s
      /38   38 cm/s
      /76   76 cm/s
      /I    Type I cassette (ferric/normal)
      /II   Type II cassette (chrome)
      /III  Type III cassette (ferric chrome)
      /IV   Type IV cassette (metal)

  TMOO
   The 'Mood' frame is intended to reflect the mood of the audio with a
   few keywords, e.g. "Romantic" or "Sad".


4.2.4.   Rights and license frames

  TCOP
   The 'Copyright message' frame, in which the string must begin with a
   year and a space character (making five characters), is intended for
   the copyright holder of the original sound, not the audio file
   itself. The absence of this frame means only that the copyright
   information is unavailable or has been removed, and must not be
   interpreted to mean that the audio is public domain. Every time this
   field is displayed the field must be preceded with "Copyright " (C) "
   ", where (C) is one character showing a C in a circle.

  TPRO
   The 'Produced notice' frame, in which the string must begin with a
   year and a space character (making five characters), is intended for
   the production copyright holder of the original sound, not the audio
   file itself. The absence of this frame means only that the production
   copyright information is unavailable or has been removed, and must
   not be interpreted to mean that the audio is public domain. Every
   time this field is displayed the field must be preceded with
   "Produced " (P) " ", where (P) is one character showing a P in a
   circle.

  TPUB
   The 'Publisher' frame simply contains the name of the label or
   publisher.

  TOWN
   The 'File owner/licensee' frame contains the name of the owner or
   licensee of the file and its contents.

  TRSN
   The 'Internet radio station name' frame contains the name of the
   internet radio station from which the audio is streamed.

  TRSO
   The 'Internet radio station owner' frame contains the name of the
   owner of the internet radio station from which the audio is
   streamed.

4.2.5.   Other text frames

  TOFN
   The 'Original filename' frame contains the preferred filename for the
   file, since some media don't allow the desired length of the
   filename. The filename is case sensitive and includes its suffix.

  TDLY
   The 'Playlist delay' defines the number of milliseconds of silence
   that should be inserted before this audio. The value zero indicates
   that this is a part of a multifile audio track that should be played
   continuously.

  TDEN
   The 'Encoding time' frame contains a timestamp describing when the
   audio was encoded. Timestamp format is described in the ID3v2
   structure document [ID3v2-strct].

  TDOR
   The 'Original release time' frame contains a timestamp describing
   when the original recording of the audio was released. Timestamp
   format is described in the ID3v2 structure document [ID3v2-strct].

  TDRC
   The 'Recording time' frame contains a timestamp describing when the
   audio was recorded. Timestamp format is described in the ID3v2
   structure document [ID3v2-strct].

  TDRL
   The 'Release time' frame contains a timestamp describing when the
   audio was first released. Timestamp format is described in the ID3v2
   structure document [ID3v2-strct].

  TDTG
   The 'Tagging time' frame contains a timestamp describing when the
   audio was tagged. Timestamp format is described in the ID3v2
   structure document [ID3v2-strct].

  TSSE
   The 'Software/Hardware and settings used for encoding' frame
   includes the used audio encoder and its settings when the file was
   encoded. Hardware refers to hardware encoders, not the computer on
   which a program was run.

  TSOA
   The 'Album sort order' frame defines a string which should be used
   instead of the album name (TALB) for sorting purposes. E.g. an album
   named "A Soundtrack" might preferably be sorted as "Soundtrack".

  TSOP
   The 'Performer sort order' frame defines a string which should be
   used instead of the performer (TPE2) for sorting purposes.

  TSOT
   The 'Title sort order' frame defines a string which should be used
   instead of the title (TIT2) for sorting purposes.
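
   The frames in 4.2.1 - 4.2.5 all share the body layout given in 4.2:
   one text encoding byte followed by a null separated list of strings.
   The following Python sketch shows one way such a body might be
   split. The function name decode_text_frame and the ENCODINGS table
   are illustrative assumptions and not part of this document; the
   encoding byte values are taken from the ID3v2 structure document
   [ID3v2-strct]. It assumes the frame header has already been removed.

```python
# Hypothetical helper, not defined by this specification.
# Encoding byte -> (Python codec, termination code), as listed in the
# ID3v2.4.0 structure document [ID3v2-strct].
ENCODINGS = {
    0x00: ("latin-1", b"\x00"),        # ISO-8859-1
    0x01: ("utf-16", b"\x00\x00"),     # UTF-16 with BOM
    0x02: ("utf-16-be", b"\x00\x00"),  # UTF-16BE without BOM
    0x03: ("utf-8", b"\x00"),          # UTF-8
}


def decode_text_frame(body):
    """Split a text information frame body into its list of strings."""
    codec, terminator = ENCODINGS[body[0]]
    step = len(terminator)
    data = body[1:]
    chunks, start = [], 0
    # Scan in units of the terminator size so that a two-byte terminator
    # is only matched on a character boundary.
    for pos in range(0, len(data) - step + 1, step):
        if data[pos:pos + step] == terminator:
            chunks.append(data[start:pos])
            start = pos + step
    # The last string may or may not be followed by a terminator.
    if start < len(data):
        chunks.append(data[start:])
    return [chunk.decode(codec) for chunk in chunks]


# The TLAN example from 4.2.3, stored as UTF-8.
print(decode_text_frame(b"\x03eng\x00sve\x00"))
```

   The sketch does no error handling: an empty body or an unknown
   encoding byte raises an exception, which a real reader would
   presumably want to treat as a malformed frame.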


4.2.6.   User defined text information frame

   This frame is intended for one-string text information concerning the
   audio file in a similar way to the other "T"-frames. The frame body
   consists of a description of the string, represented as a terminated
   string, followed by the actual string. There may be more than one
   "TXXX" frame in each tag, but only one with the same description.

     <Header for 'User defined text information frame', ID: "TXXX">
     Text encoding     $xx
     Description       <text string according to encoding> $00 (00)
     Value             <text string according to encoding>


4.3.   URL link frames

   With these frames dynamic data such as webpages with touring
   information, price information or plain ordinary news can be added to
   the tag. There may only be one URL [URL] link frame of its kind in a
   tag, except when stated otherwise in the frame description. If the
   text string is followed by a string termination, all the following
   information should be ignored and not be displayed. All URL link
   frame identifiers begin with "W". Only URL link frame identifiers
   begin with "W", except for "WXXX". All URL link frames have the
   following format:

     <Header for 'URL link frame', ID: "W000" - "WZZZ", excluding "WXXX"
     described in 4.3.2.>
     URL              <text string>


4.3.1.   URL link frames - details

  WCOM
   The 'Commercial information' frame is a URL pointing at a webpage
   with information such as where the album can be bought. There may be
   more than one "WCOM" frame in a tag, but not with the same content.

  WCOP
   The 'Copyright/Legal information' frame is a URL pointing at a
   webpage where the terms of use and ownership of the file are
   described.

  WOAF
   The 'Official audio file webpage' frame is a URL pointing at a file
   specific webpage.

  WOAR
   The 'Official artist/performer webpage' frame is a URL pointing at
   the artist's official webpage. There may be more than one "WOAR" frame
   in a tag if the audio contains more than one performer, but not with
   the same content.

  WOAS
   The 'Official audio source webpage' frame is a URL pointing at the
   official webpage for the source of the audio file, e.g. a movie.

  WORS
   The 'Official Internet radio station homepage' contains a URL
   pointing at the homepage of the internet radio station.

  WPAY
   The 'Payment' frame is a URL pointing at a webpage that will handle
   the process of paying for this file.

  WPUB
   The 'Publishers official webpage' frame is a URL pointing at the
   official webpage for the publisher.


4.3.2.   User defined URL link frame

   This frame is intended for URL [URL] links concerning the audio file
   in a similar way to the other "W"-frames. The frame body consists
   of a description of the string, represented as a terminated string,
   followed by the actual URL. The URL is always encoded with ISO-8859-1
   [ISO-8859-1]. There may be more than one "WXXX" frame in each tag,
   but only one with the same description.

     <Header for 'User defined URL link frame', ID: "WXXX">
     Text encoding     $xx
     Description       <text string according to encoding> $00 (00)
     URL               <text string>


4.4.   Music CD identifier

   This frame is intended for music that comes from a CD, so that the CD
   can be identified in databases such as the CDDB [CDDB]. The frame
   consists of a binary dump of the Table Of Contents, TOC, from the CD,
   which is a header of 4 bytes and then 8 bytes/track on the CD plus 8
   bytes for the 'lead out', making a maximum of 804 bytes. The offset
   to the beginning of every track on the CD should be described with a
   four bytes absolute CD-frame address per track, and not with absolute
   time.
   When this frame is used the presence of a valid "TRCK" frame is
   REQUIRED, even if the CD has only one track. It is recommended that
   this frame is always added to tags originating from CDs. There may
   only be one "MCDI" frame in each tag.

     <Header for 'Music CD identifier', ID: "MCDI">
     CD TOC                <binary data>


4.5.   Event timing codes

   This frame allows synchronisation with key events in the audio. The
   header is:

     <Header for 'Event timing codes', ID: "ETCO">
     Time stamp format    $xx

   Where time stamp format is:

     $01  Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit
     $02  Absolute time, 32 bit sized, using milliseconds as unit

   Absolute time means that every stamp contains the time from the
   beginning of the file.

   Followed by a list of key events in the following format:

     Type of event   $xx
     Time stamp      $xx (xx ...)

   The 'Time stamp' is set to zero if directly at the beginning of the
   sound or after the previous event. All events MUST be sorted in
   chronological order.
   The type of event is as follows:

     $00  padding (has no meaning)
     $01  end of initial silence
     $02  intro start
     $03  main part start
     $04  outro start
     $05  outro end
     $06  verse start
     $07  refrain start
     $08  interlude start
     $09  theme start
     $0A  variation start
     $0B  key change
     $0C  time change
     $0D  momentary unwanted noise (Snap, Crackle & Pop)
     $0E  sustained noise
     $0F  sustained noise end
     $10  intro end
     $11  main part end
     $12  verse end
     $13  refrain end
     $14  theme end
     $15  profanity
     $16  profanity end

     $17-$DF  reserved for future use

     $E0-$EF  not predefined synch 0-F

     $F0-$FC  reserved for future use

     $FD  audio end (start of silence)
     $FE  audio file ends
     $FF  one more byte of events follows (all the following bytes with
          the value $FF have the same function)

   Terminating the start events such as "intro start" is OPTIONAL. The
   'not predefined synch' events ($E0-$EF) are for user events. You
   might want to synchronise your music to something, like setting off
   an explosion on-stage, activating a screensaver etc.

   There may only be one "ETCO" frame in each tag.


4.6.   MPEG location lookup table

   To increase performance and accuracy of jumps within an MPEG [MPEG]
   audio file, frames with time codes in different locations in the file
   might be useful. This ID3v2 frame includes references that the
   software can use to calculate positions in the file. After the frame
   header follows a descriptor of how much the 'frame counter' should be
   increased for every reference. If this value is two then the first
   reference points out the second frame, the 2nd reference the 4th
   frame, the 3rd reference the 6th frame etc. In a similar way the
   'bytes between reference' and 'milliseconds between reference' points
   out bytes and milliseconds respectively.
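
   The counting rule above can be written out as simple arithmetic. The
   Python sketch below lists the nominal position of each reference,
   before any per-reference deviation is applied. The function name
   nominal_reference_positions and the sample values 836 and 52 are
   assumptions made for illustration only; they do not come from this
   document.

```python
def nominal_reference_positions(frames_between, bytes_between,
                                ms_between, count):
    """Return (frame, bytes, milliseconds) for references 1..count.

    These are the nominal values only: each step advances by the three
    'between reference' fields, and deviations are not applied.
    """
    return [
        (n * frames_between, n * bytes_between, n * ms_between)
        for n in range(1, count + 1)
    ]


# With 'MPEG frames between reference' set to two, the references point
# out the 2nd, 4th and 6th frame. 836 and 52 are made-up sample values.
for frame, nbytes, ms in nominal_reference_positions(2, 836, 52, 3):
    print(frame, nbytes, ms)
```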

   Each reference consists of two parts; a certain number of bits, as
   defined in 'bits for bytes deviation', that describes the difference
   between what is said in 'bytes between reference' and the reality and
   a certain number of bits, as defined in 'bits for milliseconds
   deviation', that describes the difference between what is said in
   'milliseconds between reference' and the reality. The number of bits
   in every reference, i.e. 'bits for bytes deviation'+'bits for
   milliseconds deviation', must be a multiple of four. There may only
   be one "MLLT" frame in each tag.

     <Header for 'Location lookup table', ID: "MLLT">
     MPEG frames between reference  $xx xx
     Bytes between reference        $xx xx xx
     Milliseconds between reference $xx xx xx
     Bits for bytes deviation       $xx
     Bits for milliseconds dev.     $xx

   Then for every reference the following data is included;

     Deviation in bytes         %xxx....
     Deviation in milliseconds  %xxx....


4.7.   Synchronised tempo codes

   For a more accurate description of the tempo of a musical piece, this
   frame might be used. After the header follows one byte describing
   which time stamp format should be used. Then follows one or more
   tempo codes. Each tempo code consists of one tempo part and one time
   part. The tempo is in BPM described with one or two bytes. If the
   first byte has the value $FF, one more byte follows, which is added
   to the first giving a range from 2 - 510 BPM, since $00 and $01 are
   reserved. $00 is used to describe a beat-free time period, which is
   not the same as a music-free time period. $01 is used to indicate one
   single beat-stroke followed by a beat-free period.

   The tempo descriptor is followed by a time stamp. Every time the
   tempo in the music changes, a tempo descriptor may indicate this for
   the player. All tempo descriptors MUST be sorted in chronological
   order.
   The first beat-stroke in a time-period is at the same time as
   the beat description occurs. There may only be one "SYTC" frame in
   each tag.

     <Header for 'Synchronised tempo codes', ID: "SYTC">
     Time stamp format   $xx
     Tempo data          <binary data>

   Where time stamp format is:

     $01  Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit
     $02  Absolute time, 32 bit sized, using milliseconds as unit

   Absolute time means that every stamp contains the time from the
   beginning of the file.


4.8.   Unsynchronised lyrics/text transcription

   This frame contains the lyrics of the song or a text transcription of
   other vocal activities. The header includes an encoding descriptor and
   a content descriptor. The body consists of the actual text. The
   'Content descriptor' is a terminated string. If no descriptor is
   entered, 'Content descriptor' is $00 (00) only. Newline characters
   are allowed in the text. There may be more than one 'Unsynchronised
   lyrics/text transcription' frame in each tag, but only one with the
   same language and content descriptor.

     <Header for 'Unsynchronised lyrics/text transcription', ID: "USLT">
     Text encoding        $xx
     Language             $xx xx xx
     Content descriptor   <text string according to encoding> $00 (00)
     Lyrics/text          <full text string according to encoding>


4.9.   Synchronised lyrics/text

   This is another way of incorporating the words, said or sung lyrics,
   in the audio file as text, this time, however, in sync with the
   audio. It might also be used to describe events e.g. occurring on a
   stage or on the screen in sync with the audio. The header includes a
   content descriptor, represented as a terminated text string. If no
   descriptor is entered, 'Content descriptor' is $00 (00) only.

     <Header for 'Synchronised lyrics/text', ID: "SYLT">
     Text encoding        $xx
     Language             $xx xx xx
     Time stamp format    $xx
     Content type         $xx
     Content descriptor   <text string according to encoding> $00 (00)

   Content type:   $00 is other
                   $01 is lyrics
                   $02 is text transcription
                   $03 is movement/part name (e.g. "Adagio")
                   $04 is events (e.g. "Don Quijote enters the stage")
                   $05 is chord (e.g. "Bb F Fsus")
                   $06 is trivia/'pop up' information
                   $07 is URLs to webpages
                   $08 is URLs to images

   Time stamp format:

     $01  Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit
     $02  Absolute time, 32 bit sized, using milliseconds as unit

   Absolute time means that every stamp contains the time from the
   beginning of the file.

   The text that follows the frame header differs from that of the
   unsynchronised lyrics/text transcription in one major way. Each
   syllable (or whatever size of text is considered to be convenient by
   the encoder) is a null terminated string followed by a time stamp
   denoting where in the sound file it belongs. Each sync thus has the
   following structure:

     Terminated text to be synced (typically a syllable)
     Sync identifier (terminator to above string)   $00 (00)
     Time stamp                                     $xx (xx ...)

   The 'time stamp' is set to zero or the whole sync is omitted if
   located directly at the beginning of the sound. All time stamps
   should be sorted in chronological order. The sync can be considered
   as a validator of the subsequent string.

   Newline characters are allowed in all "SYLT" frames and MUST be used
   after every entry (name, event etc.) in a frame with the content type
   $03 - $04.

   A few considerations regarding whitespace characters: Whitespace
   separating words should mark the beginning of a new word, thus
   occurring in front of the first syllable of a new word. This is also
   valid for new line characters.
   A syllable followed by a comma should not be broken apart with a sync
   (both the syllable and the comma should be before the sync).

   An example: The "USLT" passage

     "Strangers in the night" $0A "Exchanging glances"

   would be "SYLT" encoded as:

     "Strang" $00 xx xx "ers" $00 xx xx " in" $00 xx xx " the" $00 xx xx
     " night" $00 xx xx $0A "Ex" $00 xx xx "chang" $00 xx xx "ing" $00 xx
     xx " glan" $00 xx xx "ces" $00 xx xx

   There may be more than one "SYLT" frame in each tag, but only one
   with the same language and content descriptor.


4.10.   Comments

   This frame is intended for any kind of full text information that
   does not fit in any other frame. It consists of a frame header
   followed by encoding, language and content descriptors and is ended
   with the actual comment as a text string. Newline characters are
   allowed in the comment text string. There may be more than one
   comment frame in each tag, but only one with the same language and
   content descriptor.

     <Header for 'Comment', ID: "COMM">
     Text encoding          $xx
     Language               $xx xx xx
     Short content descrip. <text string according to encoding> $00 (00)
     The actual text        <full text string according to encoding>


4.11.   Relative volume adjustment (2)

   This is a more subjective frame than the previous ones. It allows the
   user to say how much the volume should be increased or decreased on
   each channel when the file is played. The purpose is to be able to
   align all files to a reference volume, so that the volume does not
   have to be changed constantly. This frame may also be used to adjust
   the balance of the audio. The volume adjustment is encoded as a fixed
   point decibel value, 16 bit signed integer representing
   (adjustment*512), giving +/- 64 dB with a precision of 0.001953125
   dB. E.g. +2 dB is stored as $04 00 and -2 dB is $FC 00.
   There may be more than one "RVA2" frame in each tag, but only one
   with the same identification string.

     <Header for 'Relative volume adjustment (2)', ID: "RVA2">
     Identification          <text string> $00

   The 'identification' string is used to identify the situation and/or
   device where this adjustment should apply. The following is then
   repeated for every channel

     Type of channel         $xx
     Volume adjustment       $xx xx
     Bits representing peak  $xx
     Peak volume             $xx (xx ...)


     Type of channel:  $00  Other
                       $01  Master volume
                       $02  Front right
                       $03  Front left
                       $04  Back right
                       $05  Back left
                       $06  Front centre
                       $07  Back centre
                       $08  Subwoofer

   Bits representing peak can be any number between 0 and 255. 0 means
   that there is no peak volume field. The peak volume field is always
   padded to whole bytes, setting the most significant bits to zero.


4.12.   Equalisation (2)

   This is another subjective, alignment frame. It allows the user to
   predefine an equalisation curve within the audio file. There may be
   more than one "EQU2" frame in each tag, but only one with the same
   identification string.

     <Header for 'Equalisation (2)', ID: "EQU2">
     Interpolation method  $xx
     Identification        <text string> $00

   The 'interpolation method' describes which method is preferred when
   interpolating between the adjustment points that follow. The
   following methods are currently defined:

     $00  Band
          No interpolation is made. A jump from one adjustment level to
          another occurs in the middle between two adjustment points.
     $01  Linear
          Interpolation between adjustment points is linear.

   The 'identification' string is used to identify the situation and/or
   device where this adjustment should apply.
   The following is then repeated for every adjustment point

     Frequency          $xx xx
     Volume adjustment  $xx xx

   The frequency is stored in units of 1/2 Hz, giving it a range from 0
   to 32767 Hz.

   The volume adjustment is encoded as a fixed point decibel value, 16
   bit signed integer representing (adjustment*512), giving +/- 64 dB
   with a precision of 0.001953125 dB. E.g. +2 dB is stored as $04 00
   and -2 dB is $FC 00.

   Adjustment points should be ordered by frequency and one frequency
   should only be described once in the frame.


4.13.   Reverb

   Yet another subjective frame, with which you can adjust echoes of
   different kinds. Reverb left/right is the delay between every bounce
   in ms. Reverb bounces left/right is the number of bounces that should
   be made. $FF equals an infinite number of bounces. Feedback is the
   amount of volume that should be returned to the next echo bounce. $00
   is 0%, $FF is 100%. If this value were $7F, there would be 50% volume
   reduction on the first bounce, 50% of that on the second and so on.
   Left to left means the sound from the left bounce to be played in the
   left speaker, while left to right means sound from the left bounce to
   be played in the right speaker.

   'Premix left to right' is the amount of left sound to be mixed in the
   right before any reverb is applied, where $00 is 0% and $FF is 100%.
   'Premix right to left' does the same thing, but right to left.
   Setting both premix values to $FF would result in a mono output (if
   the reverb is applied symmetrically). There may only be one "RVRB"
   frame in each tag.
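
   The feedback arithmetic described above can be illustrated with a
   short sketch. It assumes, as the $7F example suggests, that a
   feedback byte maps linearly onto 0% - 100% and that every bounce
   scales the volume of the previous one by that fraction. The function
   names are hypothetical and not part of this document, and the
   infinite bounce count ($FF) is not handled.

```python
def feedback_fraction(value: int) -> float:
    """Map a feedback byte ($00..$FF) linearly onto 0.0..1.0 (assumed mapping)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("feedback must fit in one byte")
    return value / 0xFF


def bounce_volumes(feedback: int, bounces: int) -> list[float]:
    """Relative volume of each successive bounce, with the source sound at 1.0."""
    fraction = feedback_fraction(feedback)
    # Bounce n carries fraction**n of the original volume.
    return [fraction ** n for n in range(1, bounces + 1)]


if __name__ == "__main__":
    # $7F is roughly 50%, so each bounce is about half as loud as the last.
    for number, volume in enumerate(bounce_volumes(0x7F, 3), start=1):
        print(f"bounce {number}: {volume:.3f}")
```

   A player would apply these factors separately to the four feedback
   paths (left to left, left to right, right to right, right to left).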

     <Header for 'Reverb', ID: "RVRB">
     Reverb left (ms)                 $xx xx
     Reverb right (ms)                $xx xx
     Reverb bounces, left             $xx
     Reverb bounces, right            $xx
     Reverb feedback, left to left    $xx
     Reverb feedback, left to right   $xx
     Reverb feedback, right to right  $xx
     Reverb feedback, right to left   $xx
     Premix left to right             $xx
     Premix right to left             $xx


4.14.   Attached picture

   This frame contains a picture directly related to the audio file.
   Image format is the MIME type and subtype [MIME] for the image. In
   the event that the MIME media type name is omitted, "image/" will be
   implied. The "image/png" [PNG] or "image/jpeg" [JFIF] picture format
   should be used when interoperability is wanted. Description is a
   short description of the picture, represented as a terminated
   text string. There may be several pictures attached to one file, each
   in their individual "APIC" frame, but only one with the same content
   descriptor. There may only be one picture with the picture type
   declared as picture type $01 and $02 respectively. There is the
   possibility to put only a link to the image file by using the 'MIME
   type' "-->" and having a complete URL [URL] instead of picture data.
   Linked files should however be used sparingly since there is the
   risk of separation of files.

     <Header for 'Attached picture', ID: "APIC">
     Text encoding      $xx
     MIME type          <text string> $00
     Picture type       $xx
     Description        <text string according to encoding> $00 (00)
     Picture data       <binary data>


   Picture type:  $00  Other
                  $01  32x32 pixels 'file icon' (PNG only)
                  $02  Other file icon
                  $03  Cover (front)
                  $04  Cover (back)
                  $05  Leaflet page
                  $06  Media (e.g. label side of CD)
                  $07  Lead artist/lead performer/soloist
                  $08  Artist/performer
                  $09  Conductor
                  $0A  Band/Orchestra
                  $0B  Composer
                  $0C  Lyricist/text writer
                  $0D  Recording Location
                  $0E  During recording
                  $0F  During performance
                  $10  Movie/video screen capture
                  $11  A bright coloured fish
                  $12  Illustration
                  $13  Band/artist logotype
                  $14  Publisher/Studio logotype


4.15.   General encapsulated object

   In this frame any type of file can be encapsulated. After the header,
   'Frame size' and 'Encoding' follows 'MIME type' [MIME] represented as
   a terminated string encoded with ISO 8859-1 [ISO-8859-1]. The
   filename is case sensitive and is encoded as 'Encoding'. Then follows
   a content description as terminated string, encoded as 'Encoding'.
   The last thing in the frame is the actual object. The first two
   strings may be omitted, leaving only their terminations. MIME type is
   always an ISO-8859-1 text string. There may be more than one "GEOB"
   frame in each tag, but only one with the same content descriptor.

     <Header for 'General encapsulated object', ID: "GEOB">
     Text encoding          $xx
     MIME type              <text string> $00
     Filename               <text string according to encoding> $00 (00)
     Content description    <text string according to encoding> $00 (00)
     Encapsulated object    <binary data>


4.16.   Play counter

   This is simply a counter of the number of times a file has been
   played. The value is increased by one every time the file begins to
   play. There may only be one "PCNT" frame in each tag. When the
   counter reaches all ones, one byte is inserted in front of the
   counter thus making the counter eight bits bigger. The counter must
   be at least 32 bits long to begin with.

     <Header for 'Play counter', ID: "PCNT">
     Counter        $xx xx xx xx (xx ...)


4.17.   Popularimeter

   The purpose of this frame is to specify how good an audio file is.
   Many interesting applications could be found for this frame such as a
   playlist that features better audio files more often than others or
   it could be used to profile a person's taste and find other 'good'
   files by comparing people's profiles. The frame contains the email
   address of the user, one rating byte and a four byte play counter,
   intended to be increased by one for every time the file is played.
   The email is a terminated string. The rating is 1-255 where 1 is
   worst and 255 is best. 0 is unknown. If no personal counter is wanted
   it may be omitted. When the counter reaches all ones, one byte is
   inserted in front of the counter thus making the counter eight bits
   bigger in the same way as the play counter ("PCNT"). There may be
   more than one "POPM" frame in each tag, but only one with the same
   email address.

     <Header for 'Popularimeter', ID: "POPM">
     Email to user   <text string> $00
     Rating          $xx
     Counter         $xx xx xx xx (xx ...)


4.18.   Recommended buffer size

   Sometimes the server from which an audio file is streamed is aware of
   transmission or coding problems resulting in interruptions in the
   audio stream. In these cases, the size of the buffer can be
   recommended by the server using this frame. If the 'embedded info
   flag' is true (1) then this indicates that an ID3 tag with the
   maximum size described in 'Buffer size' may occur in the audio
   stream. In such a case the tag should reside between two MPEG [MPEG]
   frames, if the audio is MPEG encoded. If the position of the next tag
   is known, 'offset to next tag' may be used. The offset is calculated
   from the end of the tag in which this frame resides to the first byte
   of the header in the next. This field may be omitted.
   Embedded tags are generally not recommended since this could cause
   unpredictable behaviour in present software/hardware.

   For applications like streaming audio it might be an idea to embed
   tags into the audio stream though. If the client connects over
   individual connections like HTTP and there is a possibility to begin
   every transmission with a tag, then this tag should include a
   'recommended buffer size' frame. If the client is connected to an
   arbitrary point in the stream, such as radio or multicast, then the
   'recommended buffer size' frame SHOULD be included in every tag.

   The 'Buffer size' should be kept to a minimum. There may only be one
   "RBUF" frame in each tag.

     <Header for 'Recommended buffer size', ID: "RBUF">
     Buffer size               $xx xx xx
     Embedded info flag        %0000000x
     Offset to next tag        $xx xx xx xx


4.19.   Audio encryption

   This frame indicates if the actual audio stream is encrypted, and by
   whom. Since standardisation of such encryption schemes is beyond this
   document, all "AENC" frames begin with a terminated string with a
   URL containing an email address, or a link to a location where an
   email address can be found, that belongs to the organisation
   responsible for this specific encrypted audio file. Questions
   regarding the encrypted audio should be sent to the email address
   specified. If a $00 is found directly after the 'Frame size' and the
   audio file indeed is encrypted, the whole file may be considered
   useless.

   After the 'Owner identifier', a pointer to an unencrypted part of the
   audio can be specified. The 'Preview start' and 'Preview length' are
   described in frames. If no part is unencrypted, these fields should
   be left zeroed. After the 'preview length' field follows optionally a
   data block required for decryption of the audio.
   There may be more than one "AENC" frame in a tag, but only one with
   the same 'Owner identifier'.

     <Header for 'Audio encryption', ID: "AENC">
     Owner identifier   <text string> $00
     Preview start      $xx xx
     Preview length     $xx xx
     Encryption info    <binary data>


4.20.   Linked information

   To keep information duplication as low as possible this frame may be
   used to link information from another ID3v2 tag that might reside in
   another audio file or alone in a binary file. It is RECOMMENDED that
   this method is only used when the files are stored on a CD-ROM or
   other circumstances when the risk of file separation is low. The
   frame contains a frame identifier, which is the frame that should be
   linked into this tag, a URL [URL] field, which gives a reference to
   the file where the frame is located, and additional ID data, if
   needed. Data should be retrieved from the first tag found in the file
   to which this link points. There may be more than one "LINK" frame in
   a tag, but only one with the same contents. A linked frame is to be
   considered as part of the tag and has the same restrictions as if it
   was a physical part of the tag (i.e. only one "RVRB" frame allowed,
   whether it's linked or not).

     <Header for 'Linked information', ID: "LINK">
     Frame identifier        $xx xx xx xx
     URL                     <text string> $00
     ID and additional data  <text string(s)>

   Frames that may be linked and need no additional data are "ASPI",
   "ETCO", "EQU2", "MCID", "MLLT", "OWNE", "RVA2", "RVRB", "SYTC", the
   text information frames and the URL link frames.

   The "AENC", "APIC", "GEOB" and "TXXX" frames may be linked with
   the content descriptor as additional ID data.

   The "USER" frame may be linked with the language field as additional
   ID data.

   The "PRIV" frame may be linked with the owner identifier as
   additional ID data.

   The "COMM", "SYLT" and "USLT" frames may be linked with three bytes
   of language descriptor directly followed by a content descriptor as
   additional ID data.


4.21.   Position synchronisation frame

   This frame delivers information to the listener of how far into the
   audio stream the listener picked up; in effect, it states the time
   offset from the first frame in the stream. The frame layout is:

     <Header for 'Position synchronisation', ID: "POSS">
     Time stamp format         $xx
     Position                  $xx (xx ...)

   Where time stamp format is:

     $01  Absolute time, 32 bit sized, using MPEG frames as unit
     $02  Absolute time, 32 bit sized, using milliseconds as unit

   and position is where in the audio the listener starts to receive,
   i.e. the beginning of the next frame. If this frame is used in the
   beginning of a file the value is always 0. There may only be one
   "POSS" frame in each tag.


4.22.   Terms of use frame

   This frame contains a brief description of the terms of use and
   ownership of the file. More detailed information concerning the legal
   terms might be available through the "WCOP" frame. Newlines are
   allowed in the text. There may be more than one 'Terms of use' frame
   in a tag, but only one with the same 'Language'.

     <Header for 'Terms of use frame', ID: "USER">
     Text encoding        $xx
     Language             $xx xx xx
     The actual text      <text string according to encoding>


4.23.   Ownership frame

   The ownership frame might be used as a reminder of a completed
   transaction or, if signed, as proof. Note that the "USER" and "TOWN"
   frames are good to use in conjunction with this one. The frame
   begins, after the frame ID, size and encoding fields, with a 'price
   paid' field.
   The first three characters of this field contain the currency used
   for the transaction, encoded according to ISO 4217 [ISO-4217]
   alphabetic currency code. Concatenated to this is the actual price
   paid, as a numerical string using "." as the decimal separator. Next
   is an 8 character date string (YYYYMMDD) followed by a string with
   the name of the seller as the last field in the frame. There may only
   be one "OWNE" frame in a tag.

     <Header for 'Ownership frame', ID: "OWNE">
     Text encoding     $xx
     Price paid        <text string> $00
     Date of purch.    <text string>
     Seller            <text string according to encoding>


4.24.   Commercial frame

   This frame enables several competing offers in the same tag by
   bundling all needed information. That makes this frame rather complex
   but it's an easier solution than if one tries to achieve the same
   result with several frames. The frame begins, after the frame ID,
   size and encoding fields, with a price string field. A price is
   constructed by one three character currency code, encoded according
   to ISO 4217 [ISO-4217] alphabetic currency code, followed by a
   numerical value where "." is used as decimal separator. In the price
   string several prices may be concatenated, separated by a "/"
   character, but there may only be one currency of each type.

   The price string is followed by an 8 character date string in the
   format YYYYMMDD, describing for how long the price is valid. After
   that is a contact URL, with which the user can contact the seller,
   followed by a one byte 'received as' field.
   It describes how the audio is delivered when bought according to the
   following list:

        $00  Other
        $01  Standard CD album with other songs
        $02  Compressed audio on CD
        $03  File over the Internet
        $04  Stream over the Internet
        $05  As note sheets
        $06  As note sheets in a book with other sheets
        $07  Music on other media
        $08  Non-musical merchandise

   Next follows a terminated string with the name of the seller followed
   by a terminated string with a short description of the product. The
   last thing is the ability to include a company logotype, which takes
   two fields. The first of them is the 'Picture MIME type' field
   containing information about which picture format is used. In the
   event that the MIME media type name is omitted, "image/" will be
   implied. Currently only "image/png" and "image/jpeg" are allowed.
   This format string is followed by the binary picture data. These last
   two fields may be omitted if no picture is attached. There may be
   more than one 'commercial frame' in a tag, but no two may be
   identical.

     <Header for 'Commercial frame', ID: "COMR">
     Text encoding      $xx
     Price string       <text string> $00
     Valid until        <text string>
     Contact URL        <text string> $00
     Received as        $xx
     Name of seller     <text string according to encoding> $00 (00)
     Description        <text string according to encoding> $00 (00)
     Picture MIME type  <string> $00
     Seller logo        <binary data>


4.25.   Encryption method registration

   To identify with which method a frame has been encrypted the
   encryption method must be registered in the tag with this frame. The
   'Owner identifier' is a null-terminated string with a URL [URL]
   containing an email address, or a link to a location where an email
   address can be found, that belongs to the organisation responsible
   for this specific encryption method.
   Questions regarding the encryption method should be sent to the
   indicated email address. The 'Method symbol' contains a value that is
   associated with this method throughout the whole tag, in the range
   $80-F0. All other values are reserved. The 'Method symbol' may
   optionally be followed by encryption specific data. There may be
   several "ENCR" frames in a tag but only one containing the same
   symbol and only one containing the same owner identifier. The method
   must be used somewhere in the tag. See the description of the frame
   encryption flag in the ID3v2 structure document [ID3v2-strct] for
   more information.

     <Header for 'Encryption method registration', ID: "ENCR">
     Owner identifier    <text string> $00
     Method symbol       $xx
     Encryption data     <binary data>


4.26.   Group identification registration

   This frame enables grouping of otherwise unrelated frames. This can
   be used when some frames are to be signed. To identify which frames
   belong to a set of frames a group identifier must be registered in
   the tag with this frame. The 'Owner identifier' is a null-terminated
   string with a URL [URL] containing an email address, or a link to a
   location where an email address can be found, that belongs to the
   organisation responsible for this grouping. Questions regarding the
   grouping should be sent to the indicated email address. The 'Group
   symbol' contains a value that associates the frame with this group
   throughout the whole tag, in the range $80-F0. All other values are
   reserved. The 'Group symbol' may optionally be followed by some group
   specific data, e.g. a digital signature. There may be several "GRID"
   frames in a tag but only one containing the same symbol and only one
   containing the same owner identifier. The group symbol must be used
   somewhere in the tag.
   See the description of the frame grouping flag in the ID3v2 structure
   document [ID3v2-strct] for more information.

     <Header for 'Group ID registration', ID: "GRID">
     Owner identifier      <text string> $00
     Group symbol          $xx
     Group dependent data  <binary data>


4.27.   Private frame

   This frame is used to contain information from a software producer
   that its program uses and does not fit into the other frames. The
   frame consists of an 'Owner identifier' string and the binary data.
   The 'Owner identifier' is a null-terminated string with a URL [URL]
   containing an email address, or a link to a location where an email
   address can be found, that belongs to the organisation responsible
   for the frame. Questions regarding the frame should be sent to the
   indicated email address. The tag may contain more than one "PRIV"
   frame but only with different contents.

     <Header for 'Private frame', ID: "PRIV">
     Owner identifier      <text string> $00
     The private data      <binary data>


4.28.   Signature frame

   This frame enables a group of frames, grouped with the 'Group
   identification registration', to be signed. Although signatures can
   reside inside the registration frame, it might be desired to store
   the signature elsewhere, e.g. in watermarks. There may be more than
   one 'signature frame' in a tag, but no two may be identical.

     <Header for 'Signature frame', ID: "SIGN">
     Group symbol      $xx
     Signature         <binary data>


4.29.   Seek frame

   This frame indicates where other tags in a file/stream can be found.
   The 'minimum offset to next tag' is calculated from the end of this
   tag to the beginning of the next. There may only be one 'seek frame'
   in a tag.

     <Header for 'Seek frame', ID: "SEEK">
     Minimum offset to next tag       $xx xx xx xx


4.30.   Audio seek point index

   Audio files with variable bit rates are intrinsically difficult to
   deal with in the case of seeking within the file. The ASPI frame
   makes seeking easier by providing a list of seek points within the
   audio file. The seek points are a fractional offset within the audio
   data, providing a starting point from which to find an appropriate
   point to start decoding. The presence of an ASPI frame requires the
   existence of a TLEN frame, indicating the duration of the file in
   milliseconds. There may only be one 'audio seek point index' frame in
   a tag.

     <Header for 'Seek Point Index', ID: "ASPI">
     Indexed data start (S)         $xx xx xx xx
     Indexed data length (L)        $xx xx xx xx
     Number of index points (N)     $xx xx
     Bits per index point (b)       $xx

   Then for every index point the following data is included:

     Fraction at index (Fi)          $xx (xx)

   'Indexed data start' is a byte offset from the beginning of the file.
   'Indexed data length' is the byte length of the audio data being
   indexed. 'Number of index points' is the number of index points, as
   the name implies. The recommended number is 100. 'Bits per index
   point' is 8 or 16, depending on the chosen precision. 8 bits works
   well for short files (less than 5 minutes of audio), while 16 bits is
   advantageous for long files. 'Fraction at index' is the numerator of
   the fraction representing a relative position in the data. The
   denominator is 2 to the power of b.

   Here are the algorithms to be used in the calculation. The known data
   must be the offset of the start of the indexed data (S), the offset
   of the end of the indexed data (E), the number of index points (N),
   the offset at index i (Oi). We calculate the fraction at index i
   (Fi).

   Oi is the offset of the frame whose start is soonest after the point
   for which the time offset is (i/N * duration).
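
   As a rough illustration of how a player might use these fields when
   seeking, the sketch below picks an index point for a requested time
   and turns its stored fraction (numerator over 2^b) into a byte
   position. It is only a sketch under stated assumptions: index points
   are numbered from 0, offsets are taken relative to 'Indexed data
   start', the result is rounded up to a whole byte, and the function
   names are hypothetical.

```python
def index_for_time(time_ms: int, duration_ms: int, num_points: int) -> int:
    """Pick the index point i whose time offset (i/N * duration) is at or before time_ms."""
    if duration_ms <= 0 or num_points <= 0:
        raise ValueError("duration and number of index points must be positive")
    i = (time_ms * num_points) // duration_ms
    # Clamp so that times outside the file still map to a valid index point.
    return max(0, min(i, num_points - 1))


def byte_position(fraction: int, bits: int, start: int, length: int) -> int:
    """Turn a stored fraction (numerator over 2**bits) into an absolute byte offset."""
    if bits not in (8, 16):
        raise ValueError("bits per index point is 8 or 16")
    if not 0 <= fraction < (1 << bits):
        raise ValueError("fraction does not fit in the given number of bits")
    # Integer ceiling of fraction / 2**bits * length, avoiding float rounding.
    relative = -(-(fraction * length) // (1 << bits))
    return start + relative


if __name__ == "__main__":
    # 30 s into a 120 s file with 100 index points, 8 bits per point.
    i = index_for_time(30_000, 120_000, 100)
    print(i, byte_position(128, 8, 1000, 4_000_000))
```

   The duration would come from the TLEN frame; decoding then starts at
   the first audio frame found at or after the returned byte position.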

   The frame data should be calculated as follows:

     Fi = Oi/L * 2^b    (rounded down to the nearest integer)

   The offset should be calculated as follows from data in the frame:

     Oi = (Fi/2^b)*L    (rounded up to the nearest integer)


5.   Copyright

   Copyright (C) Martin Nilsson 2000. All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that a reference to this document is included on all
   such copies and derivative works. However, this document itself may
   not be modified in any way and reissued as the original document.

   The limited permissions granted above are perpetual and will not be
   revoked.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE AUTHORS DISCLAIMS ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


6.   References

   [CDDB] Compact Disc Data Base

      <url:http://www.cddb.com>

   [ID3v2.3.0] Martin Nilsson, "ID3v2 informal standard".

      <url:http://www.id3.org/id3v2.3.0.txt>

   [ID3v2-strct] Martin Nilsson,
   "ID3 tag version 2.4.0 - Main Structure"

      <url:http://www.id3.org/id3v2.4.0-structure.txt>

   [ISO-639-2] ISO/FDIS 639-2.
   Codes for the representation of names of languages, Part 2: Alpha-3
   code. Technical committee / subcommittee: TC 37 / SC 2

   [ISO-4217] ISO 4217:1995.
   Codes for the representation of currencies and funds.
   Technical committee / subcommittee: TC 68

   [ISO-8859-1] ISO/IEC DIS 8859-1.
   8-bit single-byte coded graphic character sets, Part 1: Latin
   alphabet No. 1. Technical committee / subcommittee: JTC 1 / SC 2

   [ISRC] ISO 3901:1986
   International Standard Recording Code (ISRC).
   Technical committee / subcommittee: TC 46 / SC 9

   [JFIF] JPEG File Interchange Format, version 1.02

      <url:http://www.w3.org/Graphics/JPEG/jfif.txt>

   [KEYWORDS] S. Bradner, 'Key words for use in RFCs to Indicate
   Requirement Levels', RFC 2119, March 1997.

      <url:ftp://ftp.isi.edu/in-notes/rfc2119.txt>

   [MIME] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
   Extensions (MIME) Part One: Format of Internet Message Bodies",
   RFC 2045, November 1996.

      <url:ftp://ftp.isi.edu/in-notes/rfc2045.txt>

   [MPEG] ISO/IEC 11172-3:1993.
   Coding of moving pictures and associated audio for digital storage
   media at up to about 1,5 Mbit/s, Part 3: Audio.
   Technical committee / subcommittee: JTC 1 / SC 29
    and
   ISO/IEC 13818-3:1995
   Generic coding of moving pictures and associated audio information,
   Part 3: Audio.
   Technical committee / subcommittee: JTC 1 / SC 29
    and
   ISO/IEC DIS 13818-3
   Generic coding of moving pictures and associated audio information,
   Part 3: Audio (Revision of ISO/IEC 13818-3:1995)


   [PNG] Portable Network Graphics, version 1.0

      <url:http://www.w3.org/TR/REC-png-multi.html>

   [URL] T. Berners-Lee, L. Masinter & M. McCahill, "Uniform Resource
   Locators (URL).", RFC 1738, December 1994.

      <url:ftp://ftp.isi.edu/in-notes/rfc1738.txt>

   [ZLIB] P. Deutsch, Aladdin Enterprises & J-L. Gailly, "ZLIB
   Compressed Data Format Specification version 3.3", RFC 1950,
   May 1996.

      <url:ftp://ftp.isi.edu/in-notes/rfc1950.txt>


7.   Appendix


A.   Appendix A - Genre List from ID3v1

   The following genres are defined in ID3v1

      0.Blues
      1.Classic Rock
      2.Country
      3.Dance
      4.Disco
      5.Funk
      6.Grunge
      7.Hip-Hop
      8.Jazz
      9.Metal
     10.New Age
     11.Oldies
     12.Other
     13.Pop
     14.R&B
     15.Rap
     16.Reggae
     17.Rock
     18.Techno
     19.Industrial
     20.Alternative
     21.Ska
     22.Death Metal
     23.Pranks
     24.Soundtrack
     25.Euro-Techno
     26.Ambient
     27.Trip-Hop
     28.Vocal
     29.Jazz+Funk
     30.Fusion
     31.Trance
     32.Classical
     33.Instrumental
     34.Acid
     35.House
     36.Game
     37.Sound Clip
     38.Gospel
     39.Noise
     40.AlternRock
     41.Bass
     42.Soul
     43.Punk
     44.Space
     45.Meditative
     46.Instrumental Pop
     47.Instrumental Rock
     48.Ethnic
     49.Gothic
     50.Darkwave
     51.Techno-Industrial
     52.Electronic
     53.Pop-Folk
     54.Eurodance
     55.Dream
     56.Southern Rock
     57.Comedy
     58.Cult
     59.Gangsta
     60.Top 40
     61.Christian Rap
     62.Pop/Funk
     63.Jungle
     64.Native American
     65.Cabaret
     66.New Wave
     67.Psychadelic
     68.Rave
     69.Showtunes
     70.Trailer
     71.Lo-Fi
     72.Tribal
     73.Acid Punk
     74.Acid Jazz
     75.Polka
     76.Retro
     77.Musical
     78.Rock & Roll
     79.Hard Rock


8.   Author's Address

   Written by

     Martin Nilsson
     Rydsvägen 246 C. 30
     SE-584 34 Linköping
     Sweden

     Email: nilsson at id3.org