ID3 and almp3 Nightmare
ZoriaRPG

I'm trying to figure out how far to read ahead in a binary stream for MP3 files that contain not only basic ID3 tags, but also other rubbish, such as ID3v2 frames (text information).

This nonsense pads as much as 1MB onto an MP3 stream, and of course, almp3 doesn't know what to do with it at all.

Here are two examples:

With ID3 Metadata
http://timelord.insomnia247.nl/zc-dev/mp3-metadata/House.mp3

Same file, stripped of ID3 metadata, but otherwise unmodified:
http://timelord.insomnia247.nl/zc-dev/mp3-metadata/House320.mp3

The problem is determining precisely how large the extraneous data is, and skipping over exactly that part of the file.

Spec reference:
http://id3.org/id3v2.3.0

Has anyone here any experience with this sort of thing?

Edgar Reynaldo

I can't help you with mp3, but Allegro 4 supports .ogg through Logg.

ZoriaRPG

Edgar Reynaldo said:

I can't help you with mp3, but Allegro 4 supports .ogg through Logg.

Aye, we support that too, and OGG works just fine. Unfortunately, ZC added MP3 support via almp3 at one point, and almp3 has no support for ID3v2. In fact, I doubt that it supports ID3 at all.

The weird bit is that an older version of ZC, with the same libs and the same code for MP3 reading/handling worked, and yet our current builds do not work with MP3s that have extensive ID3v2 metatags! In fact, loading these corrupts all allegro sound, making it completely silent, including for DIGI/WAV/MIDI. I'm not sure why, yet.

I may need to just rip out almp3 and add in some other, external MP3 library. I spent half the day reading up on this, and while I see tools to strip out metadata, I can't find a practical demonstration of how to do that in C/C++.

I may need to bundle in a tool to manually strip it, for now.

Ultimately, if I can load the full file with fopen(), I can then strip out the metadata and copy over the MP3 audio frames to a new buffer, then play that--all internally and invisibly.

The problem is that ID3v2 metadata is of variable length, and I'm not sure how to determine precisely where it ends, and where the actual sync pattern begins!

While I could in theory search for the sync token (0xFFE), there's no guarantee in the spec that metadata won't also have that signature, and because the metatags do not have a fixed length, there's no single byte at which I can know that the sync signal begins.
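One common way to reduce false positives when scanning for the sync pattern is to sanity-check the rest of the four-byte frame header as well. The sketch below is not taken from almp3 or from any poster in this thread; the function names are made up for illustration. It rejects candidates whose version, layer, bitrate, or sample-rate fields hold reserved values in the MPEG audio header layout.

```cpp
#include <cstddef>
#include <cstdint>

// Returns true if the four bytes at p look like an MPEG audio frame header.
// (Hypothetical helper; checks only the fields that have reserved values.)
bool plausible_frame_header(const uint8_t* p) {
    if (p[0] != 0xFF || (p[1] & 0xE0) != 0xE0) return false; // 11 sync bits set
    if (((p[1] >> 3) & 0x03) == 0x01) return false;          // reserved MPEG version
    if (((p[1] >> 1) & 0x03) == 0x00) return false;          // reserved layer
    if (((p[2] >> 4) & 0x0F) == 0x0F) return false;          // invalid bitrate index
    if (((p[2] >> 2) & 0x03) == 0x03) return false;          // reserved sample rate
    return true;
}

// Scans buf[start, len) and returns the index of the first plausible
// frame header, or len if none is found.
size_t find_first_frame(const uint8_t* buf, size_t len, size_t start) {
    for (size_t i = start; i + 4 <= len; ++i) {
        if (plausible_frame_header(buf + i)) return i;
    }
    return len;
}
```

This still cannot rule out every coincidence inside arbitrary metadata. A stricter check would compute the frame length from the header and confirm that another valid header follows it.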

Obviously, the people who have made the tools to strip out the metadata have figured out how to do it, so there must be a way, but I have not had the time to properly review all of the several hundred source reference files that I need to examine to find out how they do it--or if they bothered to annotate.

I do have refs for how to read all of the tags, but I'm not sure how much padding there is between the last-possible tag, and the sync signal; and that'd become a guessing game.

The asinine part is that ID3v2 is stored at the lead of the file, instead of as a footer. This is compounded--and confounded--by other varieties of metatags and meta formats. :(

I'm thinking that adding a more modern MP3 lib would be the most rational option, as long as it cooperates.

Edgar Reynaldo

Why use mp3 at all? FFMPEG can convert mp3 to .ogg with no loss, and you'll get compression, and free use. Hard to beat. Supporting proprietary formats is a waste of time.

Chris Katko

Edgar Reynaldo said:

Supporting proprietary formats is a waste of time.

1. MP3 is a free format now.

2. You cannot discount that the majority of all compressed audio is in MP3, including hardware codec support in many portable sound capture devices (including mine).

3. Converting MP3 to OGG is a lossy process. Of which, you cannot hand-wave away the potential for corruption. Sometimes it's terrible.

4. OGG has its own failure modes and sounds differently "bad" to ears. It's not the end-of-the-world but again, it has to be addressed.

A better equivalent argument would be "Why use ZIP ever again when you can use 7ZIP" which supports (IIRC) every aspect of the original ZIP format, plus plenty of upgrades. (Though it still doesn't preserve file system bits so you still need a tar on linux systems, IIRC).

But MP3? That's a dangerous thing. You wouldn't recommend everyone convert their JPEG collection to GIF. One lossy format to another.

Any proper audio library these days should support WAV, MP3 and OGG. And, ideally, FLAC. They're the workhorse of 99% of all audio these days. There is MPEG-4 audio, AAC, but nobody cares unless they're ripping their iTunes library.

Quote:

FFMPEG can convert mp3 to .ogg with no loss?

??? Citation?

https://trac.ffmpeg.org/wiki/Encode/HighQualityAudio

Quote:

Generation loss:
Transcoding from a lossy format like MP3, AAC, Vorbis, Opus, WMA, etc. to the same or different lossy format might degrade the audio quality even if the bitrate stays the same (or higher). This quality degradation might not be audible to you but it might be audible to others.

As per here, you can even have better (or worse if you don't!) results by converting to a specific KHz format to prevent the encoder from resampling to its internal format.

http://bernholdtech.blogspot.com/2013/03/Nine-different-audio-encoders-100-pass-recompression-test.html

ZoriaRPG

Edgar Reynaldo said:

Why use mp3 at all? FFMPEG can convert mp3 to .ogg with no loss, and you'll get compression, and free use. Hard to beat. Supporting proprietary formats is a waste of time.

Two driving reasons:

1. It's been there for 12 years, and we need to support all legacy quests (games made in the editor) that call MP3s. Back when this was first implemented (2005-7), there was no ID3v2 spec. It just worked.

Now, many MP3 files that our users download have all of this rubbish metadata, introduced to the spec by Apple and other companies. That doesn't mean that we can just stop supporting the format, and cut off support for the biggest games with the highest production values made for our editor!

The small / common games don't use MP3s, or any of our enhanced audio options: We also support a variety of chiptunes formats. It takes effort to add these files to a quest, and to distribute them; and some of the quests use completely original music in this format.

Many of these remain the top-tier user favourites, and we need to ensure that they still play properly in the foreseeable future.

2. While I personally prefer OGG, the average user rarely knows that the format exists--remember, these are the type of people who think that .avi is a format/codec!!

A fair amount of our userbase is made up of younger kids, or not-too-technically-savvy people. While OGG is quite popular on Linux, and every Linux user knows it, users on Windows either don't know about it, or think of it as some sort of unicycle.

In the last poll that I conducted, there were all of three Linux users running ZC, of which I was one. OSX had a few times that number, and the rest were all on Windows of some sort.

For those people, MP3 is all that they know--or AAC, which we will never support. They wouldn't even comprehend what ID3 v2 metadata is, hence why telling them to strip it isn't the best approach, either. Besides, that's a cop-out.

Peter Hull

Possibly I haven't understood it, but it doesn't look too bad. The size of the header is encoded in the header so you just need to decode that and skip forward. The only complication is if 'unsynchronization' is used - if I understand correctly this means you have to scan through the header byte-by-byte to count all the unsynchronization points.
I made this code

#include <cstdio>
#include <cstdint>
#include <cstring>

// Decode the 28-bit tag size: four bytes, 7 significant bits each,
// most significant byte first.
uint32_t size28(const uint8_t* p) {
    uint32_t a0 = p[0] & 0x7f;
    uint32_t a1 = p[1] & 0x7f;
    uint32_t a2 = p[2] & 0x7f;
    uint32_t a3 = p[3] & 0x7f;
    return a0 << 21 | a1 << 14 | a2 << 7 | a3;
}

int main() {
    FILE* f = fopen("House.mp3", "rb");
    if (!f) {
        perror("House.mp3");
        return 1;
    }
    // The header is 10 bytes: "ID3", version, revision, flags, size[4].
    // Reading raw bytes avoids relying on struct packing and bit-field order.
    uint8_t h[10];
    if (fread(h, 1, sizeof(h), f) != sizeof(h) || memcmp(h, "ID3", 3) != 0) {
        fprintf(stderr, "No ID3v2 tag at start of file\n");
        fclose(f);
        return 1;
    }
    char tag[4] = {0};
    memcpy(tag, h, 3);
    uint32_t size = size28(h + 6);
    // Flags live in the top bits of byte 5 (%abc00000).
    int unsync = (h[5] >> 7) & 1;
    int ext = (h[5] >> 6) & 1;
    int experimental = (h[5] >> 5) & 1;
    printf("ID: %s ver: %d.%d unsynchronized: %d extended header: %d experimental: %d size: %u\n",
           tag, h[3], h[4], unsync, ext, experimental, size);
    // Skip the tag body; the 10-byte header has already been consumed.
    fseek(f, size, SEEK_CUR);
    FILE* g = fopen("House-no-id.mp3", "wb");
    if (!g) {
        perror("House-no-id.mp3");
        fclose(f);
        return 1;
    }
    // Test fgetc's result rather than feof(), so EOF itself is never written.
    int c;
    while ((c = fgetc(f)) != EOF) {
        fputc(c, g);
    }
    fclose(g);
    fclose(f);
    return 0;
}

If the 'unsynchronized' flag is true, then the fseek won't work but I don't have any files to test.

Where is the almp3 source, maybe it's something that can be fixed fairly easily.

ZoriaRPG

Peter Hull said:

Where is the almp3 source, maybe it's something that can be fixed fairly easily.

The original links are all dead, so I opened a new repo for it on GitHub, here:

https://github.com/ArmageddonGames/almp3

Peter Hull

I had a look at this now. If I've understood correctly, the ID3v2 data can be up to 256MB at the start of the file and it has a size field in the header so that ID3v2-aware code can skip over it. It also has a mechanism so that none of it looks like an MP3 frame, so non-ID3v2-aware code will pass through it without trying to play it.
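The mechanism referred to here is what the ID3v2.3 spec calls unsynchronisation. As far as I read the spec, it is optional and signalled by the first flag bit: when it is used, the encoder inserts a 0x00 byte after any 0xFF that would otherwise be followed by a byte with its top three bits set (or by 0x00). A decoder reverses this by dropping the 0x00 that follows each 0xFF. A minimal sketch, with a made-up function name:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Undo ID3v2 unsynchronisation on a tag body: every 0xFF 0x00 pair
// collapses back to a single 0xFF. (Hypothetical helper name.)
std::vector<uint8_t> remove_unsync(const std::vector<uint8_t>& in) {
    std::vector<uint8_t> out;
    out.reserve(in.size());
    for (size_t i = 0; i < in.size(); ++i) {
        out.push_back(in[i]);
        if (in[i] == 0xFF && i + 1 < in.size() && in[i + 1] == 0x00) {
            ++i; // skip the zero byte the encoder inserted
        }
    }
    return out;
}
```

This only matters when the frame contents are actually read. To skip the tag, the size field alone should be enough, since the 2.3 spec defines it as the tag size after unsynchronisation.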

almp3 has two modes; either a block data mode where you load the whole MP3 and use almp3_create_mp3 on it, or a streaming mode where you can load a file piecewise and use almp3_create_mp3stream on it.

The former does look for (and I think deals with correctly) the ID3v2 tag. The latter does not, but will work as long as the first MP3 frame is within approx 28KB (MAXFRAMESIZE * 16) of the start. If it hasn't found a valid frame in the first 28K (or however much you gave it for the first buffer) then it gives up.

I assume you're using streaming mode and that's why you have a problem.

This can be fixed by making almp3_create_mp3stream ID3v2-aware. This would complicate things because you may have to give it quite a few buffers full of data before it's got past the end of a long ID3v2 tag, and until then it can't set up the actual MP3 structures. Another problem might be looping tracks; if you rewind a file with a long ID3v2 tag and stream it in again there might be an audible gap with no sound (almp3 doesn't 'know' you've started the file again).

An alternative would be to add a function to return the position of the first valid MP3 frame, then when you rewind the file you would seek to this position rather than the absolute beginning of the file.

Does that make sense?
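A rough sketch of that alternative, independent of almp3 itself: a helper that reports where the audio data begins, so the caller can seek there both on first open and on every rewind. The function name is hypothetical, and the footer check assumes the ID3v2.4 layout, where a fourth flag bit signals an extra 10-byte footer.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Returns the offset just past a leading ID3v2 tag, or 0 if the file has
// no such tag. Leaves the file positioned at that offset.
long audio_start_offset(FILE* f) {
    uint8_t h[10];
    long start = 0;
    if (fseek(f, 0, SEEK_SET) == 0 &&
        fread(h, 1, sizeof h, f) == sizeof h &&
        memcmp(h, "ID3", 3) == 0) {
        // Size field: four bytes, 7 significant bits each, big-endian.
        long size = ((h[6] & 0x7F) << 21) | ((h[7] & 0x7F) << 14) |
                    ((h[8] & 0x7F) << 7) | (h[9] & 0x7F);
        start = 10 + size;                           // header + tag body
        if (h[3] == 4 && (h[5] & 0x10)) start += 10; // v2.4 footer, if flagged
    }
    fseek(f, start, SEEK_SET);
    return start;
}
```

On loop, the caller would then do `fseek(f, start, SEEK_SET)` with the stored value instead of rewinding to byte 0, so the stream never sees the tag again.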

ZoriaRPG

Peter Hull said:

almp3 has two modes; either a block data mode where you load the whole MP3 and use almp3_create_mp3 on it, or a streaming mode where you can load a file piecewise and use almp3_create_mp3stream on it.

The former does look for (and I think deals with correctly) the ID3v2 tag. The latter does not, but will work as long as the first MP3 frame is within approx 28KB (MAXFRAMESIZE * 16) of the start. If it hasn't found a valid frame in the first 28K (or however much you gave it for the first buffer) then it gives up.

I assume you're using streaming mode and that's why you have a problem.

This can be fixed by making almp3_create_mp3stream ID3v2-aware. This would complicate things because you may have to give it quite a few buffers full of data before it's got past the end of a long ID3v2 tag, and until then it can't set up the actual MP3 structures. Another problem might be looping tracks; if you rewind a file with a long ID3v2 tag and stream it in again there might be an audible gap with no sound (almp3 doesn't 'know' you've started the file again).

An alternative would be to add a function to return the position of the first valid MP3 frame, then when you rewind the file you would seek to this position rather than the absolute beginning of the file.

Does that make sense?

Aye, the variable length is a bitch.

I'll look into using almp3_create_mp3 instead of almp3_create_mp3stream, as that seems to be the better option. I'm not sure why the prior ZC devs used the latter of the two, nor if that'd create entirely different problems.

I wanted to eventually add custom seek/loop to MP3 and OGG, but it's a very low priority. I think that there already exists an 'audible gap' during loops, which may be caused by ID3 frames, based on what you are writing here.

Clearly, we don't want to amplify that issue, so I'll need to re-read the almp3 stuff again to see if shifting to almp3_create_mp3 is suitable for our needs.

We load the files externally from the filesystem, rather than embedding them into the quest packfile, and thus we have no need to read in a partial stream out of a series of other data.

Peter Hull
ZoriaRPG said:

I'll look into using almp3_create_mp3 instead of almp3_create_mp3stream, as that seems to be the better option.

That should work unless you have a problem with holding all of a very large mp3 in memory at one time (could the user supply you with an enormously long file?)
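One way to guard against an enormous file in block mode is to cap how much gets read. The sketch below only covers the loading step and deliberately makes no almp3 calls; the function name and the idea of a fixed cap are assumptions for illustration. The resulting buffer and its length are what would then be handed to almp3_create_mp3.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Reads the whole file at path into memory. Returns an empty vector if the
// file cannot be opened or is larger than max_bytes. (Hypothetical helper.)
std::vector<uint8_t> load_file_capped(const char* path, size_t max_bytes) {
    std::vector<uint8_t> data;
    FILE* f = fopen(path, "rb");
    if (!f) return data;
    uint8_t chunk[4096];
    size_t n;
    while ((n = fread(chunk, 1, sizeof chunk, f)) > 0) {
        if (data.size() + n > max_bytes) { // over the cap: give up cleanly
            data.clear();
            break;
        }
        data.insert(data.end(), chunk, chunk + n);
    }
    fclose(f);
    return data;
}
```

An empty result can then be treated as "fall back to streaming mode" or "refuse to play", whichever suits the engine.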

Let me know how you get on!

Audric

As I understand it, the most likely cause of a large ID3v2 tag is album (cover) art. Reasonable files would include only a single 200x200 or 300x300 JPG, causing an overhead of around 60KB, but people who use the first result of a Google Image search can easily bloat their files. No wonder an MP3 streamer will hiccup when it reaches a part of the file where 500KB of data result in zero seconds of music.
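To check how much of a tag is cover art, the frames inside the tag body can be walked one by one. In ID3v2.3 each frame starts with a 10-byte header (4-character ID, 4-byte big-endian size, 2 flag bytes), and attached pictures use the ID "APIC". The sketch below assumes a v2.3 tag with no extended header and no unsynchronisation; the function name is made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sums the payload sizes of all frames named id in an ID3v2.3 tag body.
// body points just past the 10-byte tag header; len is the tag size field.
uint32_t frame_bytes(const uint8_t* body, size_t len, const char* id) {
    uint32_t total = 0;
    size_t pos = 0;
    // A zero byte where a frame ID should start means padding has begun.
    while (pos + 10 <= len && body[pos] != 0) {
        uint32_t size = (uint32_t(body[pos + 4]) << 24) |
                        (uint32_t(body[pos + 5]) << 16) |
                        (uint32_t(body[pos + 6]) << 8) |
                        uint32_t(body[pos + 7]);
        if (size > len - pos - 10) break; // frame overruns the tag: malformed
        if (memcmp(body + pos, id, 4) == 0) total += size;
        pos += 10 + size;
    }
    return total;
}
```

Calling `frame_bytes(body, len, "APIC")` would give the number of bytes spent on pictures. Note that ID3v2.4 encodes frame sizes differently, so this would need adjusting for v2.4 tags.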

Chris Katko
ZoriaRPG said:

Aye, the variable length is a bitch.

The alternative would be quite unsightly though. You'd either waste tons of space on EVERY MP3 in your tens-of-thousands library (*), or, be screwed like DOS's 8+3 maximum filename length when you encounter edge cases because the fixed length has to be small enough to only work in the average case (otherwise it's "bloat").

The original ID3 tag only supported a fixed number of hardcoded genres! Because, you know, it's not like new musical genres ever come out!

(*Man, I miss those days of having your own >30GB library and sharing / comparing with other friends. Now it's just "YouTube it" for almost any song your heart desires.)

Audric said:

No wonder an MP3 streamer will hiccup when it reaches a part of the file where 500KB of data result in zero seconds of music.

That should only affect beginnings of playback with ID3v2 because the tag is at the beginning of the file. ID3v1 was at the end so old players wouldn't be affected at all. ID3v2 also supports international data.
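For comparison, detecting the older ID3v1 tag is much simpler, because it is a fixed 128-byte block at the very end of the file that begins with the characters "TAG". A minimal sketch (the function name is made up):

```cpp
#include <cstdio>
#include <cstring>

// Returns true if the file ends with a 128-byte ID3v1 tag.
bool has_id3v1(FILE* f) {
    char tag[3];
    if (fseek(f, -128, SEEK_END) != 0) return false; // file shorter than a tag
    return fread(tag, 1, 3, f) == 3 && memcmp(tag, "TAG", 3) == 0;
}
```

A player that wants to stop exactly at the end of the audio could subtract those 128 bytes from the file length when this returns true.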

Something I've long wondered but never found any concrete data on, is how to properly support forward compatibility in file formats and APIs. The only way I've seen so far, is the IFF format (precursor to AIFF) and MP3 (and I'm sure others), have "frames" and simply ignore frames they don't understand. However, that doesn't guarantee success because the frame may be needed. For example, an old (or hardware) h.264 player that cannot decode B-frames (in "Baseline" profile). Still, the format can still be "processed" by a parser without exploding.

Likewise, an API like OpenGL can support both polling for "all supported flags", as well as passing ENUM=value, instead of explicit arguments, so you can add new enum values later, or even only support a subset (ala extensions).

But other than that... I have seen zero articles, zero videos, and the wiki and c2 have no info.

Audric

Chris Katko said:

However, that doesn't guarantee success because the frame may be needed.

From my experience with GIF and PNG, there are two ways for someone to extend a specified format.

The first one is to use values that are explicitly invalid (for example if the spec says this byte can be 0 to 5, and you use 6). The decoder will be able to know that something is wrong, but this is risky because this may crash decoders that are not robust enough, and the ones that are "too robust" will try to load what they can anyway, making assumptions that you can't predict.

The second method is to use the standard-defined options that let you add "user-defined" data. Every decoder will be able to omit it, although the rest of the file will have to contain all the mandatory data anyway. The usefulness of this is only if there is a reason to distribute "new files" to older decoders, and if the extra data is not completely mandatory for a good interpretation of the content (such as a new flag "this should play backward")
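PNG is a concrete case of this second method. Each chunk has a four-letter type, and the case of the letters carries flags: a lowercase first letter marks the chunk as ancillary (safe for a decoder to skip), and a lowercase second letter marks it as private rather than publicly registered. A small sketch of how a decoder might test those bits, with illustrative function names:

```cpp
// Bit 5 (0x20) of an ASCII letter is set when the letter is lowercase.

// Ancillary chunks may be skipped by decoders that do not recognise them.
bool png_chunk_is_ancillary(const char* type) {
    return (type[0] & 0x20) != 0;
}

// Private chunks are application-defined rather than publicly registered.
bool png_chunk_is_private(const char* type) {
    return (type[1] & 0x20) != 0;
}
```

A decoder that meets an unknown chunk can skip it if it is ancillary, and should report an error if it is critical, which covers the "frame may be needed" case.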

Doctor Cop

ZoriaRPG : can you post the link of your website, I want to know about(play) your game and thanks for the almp3 link. I was searching for it.

ZoriaRPG

Doctor Cop said:

ZoriaRPG : can you post the link of your website, I want to know about(play) your game and thanks for the almp3 link. I was searching for it.

Edit:

I forgot to include a discord Link to ZC Dev Discord Server:
https://discord.gg/YKm5SaQ

It's the Zelda Classic Game Engine:

https://www.allegro.cc/depot/ZeldaClassic1

(old depot information: https://www.allegro.cc/depot/ZeldaClassic/)

Main website: https://zeldaclassic.com
The current development builds are not available on this server!

Main Forum: https://armageddongames.net

PureZC Forums: https://purezc.net

Current alpha build, as of the time of this post:
http://timelord.insomnia247.nl/zc-dev/2.55/2.55_Win_Alpha_1.zip

This has the new module format mostly implemented, and it includes the Classic module.

Current 2.53 Beta: http://timelord.insomnia247.nl/zc-dev/2.53/2.53_Win_Beta_25.zip

Source files: https://github.com/ArmageddonGames/ZeldaClassic
(Deps, not included: bison, flex)

Chris Katko said:

The alternative would be quite unsightly though. You'd either waste tons of space on EVERY MP3 in your tens-of-thousands library (*), or, be screwed like DOS's 8+3 maximum filename length when you encounter edge cases because the fixed length has to be small enough to only work in the average case (otherwise it's "bloat").

If the variable data was at the end of the file as a footer, you could skip over it until EOF as soon as you hit one token. The problem isn't the variable size, it's in placing it at the head of the file. At the least, the spec should have a lead-in value for the entire size in bytes before the first valid music frame.

Peter Hull
ZoriaRPG said:

the spec should have a lead-in value for the entire size in bytes before the first valid music frame.

It does, it's the 'size' field.

The ID3v2 tag header, which should be the first information in the file, is 10 bytes as follows:

ID3v2/file identifier   "ID3"
ID3v2 version           $03 00
ID3v2 flags             %abc00000
ID3v2 size              4 * %0xxxxxxx
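Read as code, that layout could be decoded along these lines. The struct and function names are hypothetical; the validity checks follow the detection pattern given in the ID3v2.3 spec, where the version bytes are never $FF and every size byte has its top bit clear.

```cpp
#include <cstdint>
#include <cstring>

struct Id3v2Header {    // hypothetical name, for illustration only
    uint8_t major;      // e.g. 3 for ID3v2.3
    uint8_t revision;
    bool unsync;        // flag a
    bool extended;      // flag b
    bool experimental;  // flag c
    uint32_t size;      // tag size in bytes, excluding this 10-byte header
};

// Parses the first 10 bytes of a file. Returns false if they do not
// form a valid ID3v2 header.
bool parse_id3v2_header(const uint8_t* h, Id3v2Header& out) {
    if (memcmp(h, "ID3", 3) != 0) return false;
    if (h[3] == 0xFF || h[4] == 0xFF) return false;        // version bytes < $FF
    if ((h[6] | h[7] | h[8] | h[9]) & 0x80) return false;  // size bytes < $80
    out.major = h[3];
    out.revision = h[4];
    out.unsync = (h[5] & 0x80) != 0;
    out.extended = (h[5] & 0x40) != 0;
    out.experimental = (h[5] & 0x20) != 0;
    // Each size byte contributes 7 bits, most significant byte first.
    out.size = (uint32_t(h[6]) << 21) | (uint32_t(h[7]) << 14) |
               (uint32_t(h[8]) << 7) | uint32_t(h[9]);
    return true;
}
```

For example, size bytes 00 00 02 01 decode to 2 * 128 + 1 = 257, so the first byte after the tag would sit at offset 10 + 257 = 267.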

Thread #617614. Printed from Allegro.cc