FFmpeg error trying to merge video and iamf file #20

Open
Kurville opened this issue Jan 6, 2025 · 13 comments
Labels
documentation Improvements or additions to documentation

Comments

@Kurville

Kurville commented Jan 6, 2025

Using the script to merge video and audio, available here: https://github.com/AOMediaCodec/iamf-tools/blob/main/docs/external/encoding_with_external_tools.md#encode-wav-files-to-iamf-with-ffmpeg

gives this:

Input #0, iamf, from 'iamf.iamf':
Duration: N/A, bitrate: N/A
Stream group #0:0[0x1]: IAMF Audio Element:
Layer 0: ambisonic 3
Stream group #0:1[0x3]: IAMF Mix Presentation:
Annotations:
en-us : default_mix_presentation
Submix 0:
IAMF Audio Element #0:0[0x1]
Annotations:
en-us : 3OA
Layout #0: stereo
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'video.mp4':
Metadata:
major_brand : mp42
minor_version : 0
compatible_brands: mp42mp41isomavc1
creation_time : 2020-01-16T20:02:47.000000Z
Duration: 00:00:53.24, start: 0.000000, bitrate: 5148 kb/s
Stream #1:0[0x1]: Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080, 5145 kb/s, 25 fps, 25 tbr, 25 tbn (default)
Metadata:
creation_time : 2020-01-16T20:02:47.000000Z
handler_name : L-SMASH Video Handler
vendor_id : [0][0][0][0]
encoder : AVC Coding
[out#0/mp4 @ 0x600003bc4000] Invalid stream index 2
Error opening output file video_iamf.mp4.
Error opening output files: Invalid argument

@jwcullen jwcullen added the documentation Improvements or additions to documentation label Jan 6, 2025
@rafalfaro18

Same issue here. By the way, the script currently never uses the variable stream_groups_count after defining it.

@rafalfaro18

rafalfaro18 commented Jan 7, 2025

Could this be related to the fact that the example that creates the 5.1 IAMF file from a WAV with FFmpeg defines stream groups with IDs 1 and 3, while the sample code that merges it with video references stream groups 0 and 1?
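
One way to look into this question is to compare the positional index and the ID that FFmpeg prints for each stream group. In a log line such as "Stream group #0:1[0x3]", the number after the colon is the group's index within the input and the bracketed hex value is its ID, so groups with IDs 1 and 3 can still sit at indexes 0 and 1. A minimal Python sketch that extracts both from log text like the one in the first post (the function name `parse_stream_groups` is made up for illustration and is not part of any tool):

```python
import re

# Matches lines such as "Stream group #0:1[0x3]: IAMF Mix Presentation:".
GROUP_RE = re.compile(
    r"Stream group #(\d+):(\d+)\[0x([0-9a-fA-F]+)\]:\s*(.+?):?\s*$"
)


def parse_stream_groups(log_text):
    """Return input number, positional index, ID and kind of each stream group."""
    groups = []
    for line in log_text.splitlines():
        match = GROUP_RE.search(line)
        if match:
            groups.append({
                "input": int(match.group(1)),
                "index": int(match.group(2)),   # position within the input
                "id": int(match.group(3), 16),  # ID shown in brackets
                "kind": match.group(4),
            })
    return groups


# Lines copied from the log in the first post.
SAMPLE = """\
Stream group #0:0[0x1]: IAMF Audio Element:
Layer 0: ambisonic 3
Stream group #0:1[0x3]: IAMF Mix Presentation:
"""

for group in parse_stream_groups(SAMPLE):
    print(group)
```

If this reading is right, an option that refers to a group by index would use 0 and 1 even though the IDs are 1 and 3, which may mean the ID mismatch is not the cause of the error.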

@rafalfaro18

I think the issue might be that it says -map 0:a:0 instead of -map 0:a. Changing that fixes it for me, but I'm not sure if the resulting file is OK.
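
The difference between the two specifiers can be illustrated with a toy model. In FFmpeg, 0:a selects every audio stream of input 0, while 0:a:0 selects only the first one; if a stream group then refers to output streams that were never mapped, an "Invalid stream index" error like the one above is a plausible result. The Python sketch below is not FFmpeg's implementation: `select_streams` and the four-stream input are assumptions for illustration only, and the specifier is written without the input-file prefix.

```python
def select_streams(streams, specifier):
    """Toy model of the 'a' / 'v' and 'a:N' / 'v:N' stream specifiers."""
    parts = specifier.split(":")
    wanted_type = {"a": "audio", "v": "video"}[parts[0]]
    matching = [s for s in streams if s["codec_type"] == wanted_type]
    if len(parts) == 1:
        # No index given: every stream of that type is selected.
        return matching
    n = int(parts[1])
    # Index given: only the N-th stream of that type (if it exists).
    return matching[n:n + 1]


# Hypothetical input with four audio streams.
streams = [{"index": i, "codec_type": "audio"} for i in range(4)]

print(len(select_streams(streams, "a")))    # 4
print(len(select_streams(streams, "a:0")))  # 1
```

With only one audio stream mapped, a stream group that lists several streams would point past the end of the output's stream list.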

@rafalfaro18

> I think the issue might be that it says -map 0:a:0 instead of -map 0:a. Changing that fixes it for me, but I'm not sure if the resulting file is OK.

I just checked with the decoder and it seems to work. On YouTube, though, I get the "Processing abandoned" error when uploading.

jwcullen added a commit that referenced this issue Jan 8, 2025
  - Applies suggestion from a comment by rafalfaro18 (#20 (comment)).
  - Part of #20. Leave issue open, because it still assumes a single audio element.

PiperOrigin-RevId: 713359902
@jwcullen
Collaborator

jwcullen commented Jan 8, 2025

> I think the issue might be that it says -map 0:a:0 instead of -map 0:a. Changing that fixes it for me, but I'm not sure if the resulting file is OK.

Thanks, this seems to work better.

The script still needs further updates to handle more than one audio element (e.g. the 3OA + stereo use case).

> I just checked with the decoder and it seems to work. On YouTube, though, I get the "Processing abandoned" error when uploading.

Can you share a dump of ffprobe your_video_and_iamf.mp4? Feel free to redact anything that is private.

@Kurville
Author

Kurville commented Jan 8, 2025

Glad to see it's getting worked on. The fact that FFmpeg reports

Input #0, iamf, from 'iamf.iamf':
Duration: N/A, bitrate: N/A

is not critical?

@rafalfaro18

> > I think the issue might be that it says -map 0:a:0 instead of -map 0:a. Changing that fixes it for me, but I'm not sure if the resulting file is OK.
>
> Thanks, this seems to work better.
>
> The script still needs further updates to handle more than one audio element (e.g. the 3OA + stereo use case).
>
> > I just checked with the decoder and it seems to work. On YouTube, though, I get the "Processing abandoned" error when uploading.
>
> Can you share a dump of ffprobe your_video_and_iamf.mp4? Feel free to redact anything that is private.

Is this a compliant MP4 with IAMF audio?

ffprobe -show_streams -of json .\Final.mp4
ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg developers
  built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.100 / 61. 19.100
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
{
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '.\Final.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiamfiso2avc1mp41
    encoder         : Lavf61.7.100
  Duration: 00:00:13.02, start: 0.000000, bitrate: 236 kb/s
  Stream group #0:0[0x1]: IAMF Audio Element:
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
    Layer 0: 5.1
  Stream group #0:1[0x3]: IAMF Mix Presentation:
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
    Annotations:
      en-us           : default_mix_presentation
    Submix 0:
      IAMF Audio Element #0:0[0x1]
        Annotations:
          en-us           : 5.1
      Layout #0: stereo
  Stream #0:4[0x5](und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 117 kb/s, 30 fps, 24.60 tbr, 24596 tbn (default)
      Metadata:
        handler_name    : VideoHandler
        vendor_id       : [0][0][0][0]
        encoder         : AVC Coding
    "streams": [
        {
            "index": 0,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 2,
            "channel_layout": "stereo",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x0",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624960,
            "duration": "13.020000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 0,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 1,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 2,
            "channel_layout": "stereo",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x1",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624960,
            "duration": "13.020000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 1,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 2,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 1,
            "channel_layout": "mono",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x2",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624960,
            "duration": "13.020000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 1,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 3,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 1,
            "channel_layout": "mono",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x3",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624960,
            "duration": "13.020000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 1,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 4,
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "Main",
            "codec_type": "video",
            "codec_tag_string": "avc1",
            "codec_tag": "0x31637661",
            "width": 1920,
            "height": 1080,
            "coded_width": 1920,
            "coded_height": 1080,
            "closed_captions": 0,
            "film_grain": 0,
            "has_b_frames": 0,
            "sample_aspect_ratio": "1:1",
            "display_aspect_ratio": "16:9",
            "pix_fmt": "yuv420p",
            "level": 40,
            "chroma_location": "left",
            "field_order": "progressive",
            "refs": 1,
            "is_avc": "true",
            "nal_length_size": "4",
            "id": "0x5",
            "r_frame_rate": "6149/250",
            "avg_frame_rate": "98384/3279",
            "time_base": "1/24596",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 3279,
            "duration": "0.133314",
            "bit_rate": "117136",
            "bits_per_raw_sample": "8",
            "nb_frames": "4",
            "extradata_size": 40,
            "disposition": {
                "default": 1,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 0,
                "still_image": 0,
                "multilayer": 0
            },
            "tags": {
                "language": "und",
                "handler_name": "VideoHandler",
                "vendor_id": "[0][0][0][0]",
                "encoder": "AVC Coding"
            }
        }
    ]
}

@jwcullen
Collaborator

jwcullen commented Jan 15, 2025

@rafalfaro18,

I see the duration of the audio is 13.020000 seconds, but the duration of the video is 0.133314 seconds.

Can you retry encoding, but using audio that matches the duration of the video?
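
A mismatch like this can be spotted automatically from the JSON that ffprobe -show_streams -of json prints. The sketch below is one possible check in Python, not something from the iamf-tools scripts: the helper name `duration_mismatch` and the 0.5-second tolerance are arbitrary choices, and the sample data keeps only the fields needed from the dumps in this thread.

```python
import json


def duration_mismatch(probe, tolerance=0.5):
    """Return the audio/video duration gap in seconds if it exceeds tolerance.

    Returns None when the gap is within tolerance or when either stream
    type (or its duration field) is missing.
    """
    longest = {}
    for stream in probe["streams"]:
        if "duration" not in stream:
            continue
        kind = stream["codec_type"]
        longest[kind] = max(longest.get(kind, 0.0), float(stream["duration"]))
    audio = longest.get("audio")
    video = longest.get("video")
    if audio is None or video is None:
        return None
    gap = abs(audio - video)
    return gap if gap > tolerance else None


# Reduced version of the first ffprobe dump in this thread.
probe = json.loads("""
{"streams": [
  {"index": 0, "codec_type": "audio", "duration": "13.020000"},
  {"index": 4, "codec_type": "video", "duration": "0.133314"}
]}
""")

print(duration_mismatch(probe))
```

Any non-None result means the longest audio stream and the longest video stream differ by more than the chosen tolerance.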

@jwcullen
Collaborator

jwcullen commented Jan 15, 2025

@Kurville,

The commands/scripts have been simplified. IAMF encoding and muxing with video are now done in a single command.

See the updated example, which shows how to encode IAMF with ambisonics and non-diegetic stereo.

@Kurville
Author

That's great, thanks! I submitted a file to YouTube again: still "abandoning the processing", though... It really looks like they changed something on their side since last week.

@rafalfaro18

rafalfaro18 commented Jan 15, 2025

> That's great, thanks! I submitted a file to YouTube again: still "abandoning the processing", though... It really looks like they changed something on their side since last week.

I wasn't going to bring that up because, according to what I read, the feature wasn't going to be enabled until later this year. That said, the YouTube side of things never worked for me, not even last week or during CES. I have no idea how people managed to upload supposedly compliant IAMF videos.

I believe it's a feature that has not been enabled for the general public.

As long as it works with the reference decoder, it should be compliant once it gets enabled on YouTube.

I even tried uploading samples provided by the IAMF GitHub repos, in case I had done something wrong and my files weren't compliant MP4s.

@rafalfaro18

> @rafalfaro18,
>
> I see the duration of the audio is 13.020000 seconds, but the duration of the video is 0.133314 seconds.
>
> Can you retry encoding, but using audio that matches the duration of the video?

My bad, I clipped the video previously but forgot to test with the shortened version.

> @Kurville,
>
> The commands/scripts have been simplified. IAMF encoding and muxing with video are now done in a single command.
>
> See the updated example, which shows how to encode IAMF with ambisonics and non-diegetic stereo.

I tested the new example commands, starting with a 5.1 WAV file and a video of the same length without an audio track:

ffmpeg -i "C:\Users\xxx\Documents\REAPER Media\ReaSurroundTest5.1-Noise.wav"     -i "C:\Users\xxx\Downloads\Test.mp4" -c:v copy     -filter_complex "[0:a]channelmap=0|1:stereo[FRONT];[0:a]channelmap=4|5:stereo[BACK];[0:a]channelmap=2:mono[CENTER];[0:a]channelmap=3:mono[LFE]"     -map "[FRONT]" -map "[BACK]" -map "[CENTER]" -map "[LFE]" -map 1:0     -stream_group "type=iamf_audio_element:id=1:st=0:st=1:st=2:st=3:audio_element_type=channel,layer=ch_layout=5.1"     -stream_group "type=iamf_mix_presentation:id=3:stg=0:annotations=en-us=default_mix_presentation,submix=parameter_id=100:parameter_rate=48000:default_mix_gain=0.0|element=stg=0:headphones_rendering_mode=binaural:annotations=en-us=5.1:parameter_id=101:parameter_rate=48000:default_mix_gain=0.0|layout=sound_system=stereo:integrated_loudness=0.0:digital_peak=0.0"     -streamid 0:0 -streamid 1:1 -streamid 2:2 -streamid 3:3 -streamid 4:4     -c:a libopus -b:a 64000 "C:\Users\xxx\Downloads\Final.mp4"
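
The command above is easier to audit when its repetitive pieces are generated rather than typed by hand, since the st= entries in the audio element, the -map options and the -streamid options all have to agree on the number of audio streams. The following Python sketch only builds strings and does not run FFmpeg. The helper names are hypothetical, and the option syntax is copied from the command in this comment rather than taken from FFmpeg's documentation.

```python
def audio_element_group(group_id, stream_count, layout):
    """Build the iamf_audio_element -stream_group value used in this comment."""
    streams = ":".join(f"st={i}" for i in range(stream_count))
    return (
        f"type=iamf_audio_element:id={group_id}:{streams}"
        f":audio_element_type=channel,layer=ch_layout={layout}"
    )


def streamid_args(total_streams):
    """Build '-streamid i:i' pairs for every output stream."""
    args = []
    for i in range(total_streams):
        args += ["-streamid", f"{i}:{i}"]
    return args


# Filter labels from the command above; the video is mapped last as 1:0.
labels = ["FRONT", "BACK", "CENTER", "LFE"]
map_args = []
for label in labels:
    map_args += ["-map", f"[{label}]"]
map_args += ["-map", "1:0"]

print(audio_element_group(1, len(labels), "5.1"))
print(" ".join(map_args))
print(" ".join(streamid_args(len(labels) + 1)))
```

Deriving all three pieces from the same list of labels keeps the stream count consistent if the layout changes.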

Result

ffprobe -of json -show_streams "C:\Users\xxx\Downloads\Final.mp4"
ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg developers
  built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.100 / 61. 19.100
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
{
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000021c3c711000] DTS discontinuity in stream 4: packet 3 with DTS 12051, packet 4 with DTS 21088
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000021c3c711000] DTS discontinuity in stream 4: packet 5 with DTS 21089, packet 6 with DTS 30126
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000021c3c711000] DTS discontinuity in stream 4: packet 7 with DTS 30127, packet 8 with DTS 34646
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\Users\xxx\Downloads\Final.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiamfiso2avc1mp41
    date            : 2024-11-16
    encoder         : Lavf61.7.100
  Duration: 00:00:13.00, start: 0.000000, bitrate: 181 kb/s
  Stream group #0:0[0x1]: IAMF Audio Element:
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
    Layer 0: 5.1
  Stream group #0:1[0x3]: IAMF Mix Presentation:
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
    Annotations:
      en-us           : default_mix_presentation
    Submix 0:
      IAMF Audio Element #0:0[0x1]
        Annotations:
          en-us           : 5.1
      Layout #0: stereo
  Stream #0:4[0x5](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 18 kb/s, 24.39 fps, 32 tbr, 144606 tbn (default)
      Metadata:
        handler_name    : VideoHandler
        vendor_id       : [0][0][0][0]
    "streams": [
        {
            "index": 0,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 2,
            "channel_layout": "stereo",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x0",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624000,
            "duration": "13.000000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 0,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 1,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 2,
            "channel_layout": "stereo",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x1",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624000,
            "duration": "13.000000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 1,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 2,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 1,
            "channel_layout": "mono",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x2",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624000,
            "duration": "13.000000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 1,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 3,
            "codec_name": "opus",
            "codec_long_name": "Opus (Opus Interactive Audio Codec)",
            "codec_type": "audio",
            "codec_tag_string": "[0][0][0][0]",
            "codec_tag": "0x0000",
            "sample_fmt": "fltp",
            "sample_rate": "48000",
            "channels": 1,
            "channel_layout": "mono",
            "bits_per_sample": 0,
            "initial_padding": 0,
            "id": "0x3",
            "r_frame_rate": "0/0",
            "avg_frame_rate": "0/0",
            "time_base": "1/48000",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 624000,
            "duration": "13.000000",
            "extradata_size": 19,
            "disposition": {
                "default": 0,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 1,
                "still_image": 0,
                "multilayer": 0
            }
        },
        {
            "index": 4,
            "codec_name": "h264",
            "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
            "profile": "High",
            "codec_type": "video",
            "codec_tag_string": "avc1",
            "codec_tag": "0x31637661",
            "width": 1920,
            "height": 1080,
            "coded_width": 1920,
            "coded_height": 1080,
            "closed_captions": 0,
            "film_grain": 0,
            "has_b_frames": 3,
            "sample_aspect_ratio": "1:1",
            "display_aspect_ratio": "16:9",
            "pix_fmt": "yuv420p",
            "level": 40,
            "color_range": "tv",
            "color_space": "bt709",
            "color_transfer": "bt709",
            "color_primaries": "bt709",
            "chroma_location": "left",
            "field_order": "progressive",
            "refs": 1,
            "is_avc": "true",
            "nal_length_size": "4",
            "id": "0x5",
            "r_frame_rate": "32/1",
            "avg_frame_rate": "22486233/921863",
            "time_base": "1/144606",
            "start_pts": 0,
            "start_time": "0.000000",
            "duration_ts": 1843726,
            "duration": "12.749997",
            "bit_rate": "18133",
            "bits_per_raw_sample": "8",
            "nb_frames": "311",
            "extradata_size": 51,
            "disposition": {
                "default": 1,
                "dub": 0,
                "original": 0,
                "comment": 0,
                "lyrics": 0,
                "karaoke": 0,
                "forced": 0,
                "hearing_impaired": 0,
                "visual_impaired": 0,
                "clean_effects": 0,
                "attached_pic": 0,
                "timed_thumbnails": 0,
                "non_diegetic": 0,
                "captions": 0,
                "descriptions": 0,
                "metadata": 0,
                "dependent": 0,
                "still_image": 0,
                "multilayer": 0
            },
            "tags": {
                "language": "eng",
                "handler_name": "VideoHandler",
                "vendor_id": "[0][0][0][0]"
            }
        }
    ]
}

@rafalfaro18

rafalfaro18 commented Jan 16, 2025

> > That's great, thanks! I submitted a file to YouTube again: still "abandoning the processing", though... It really looks like they changed something on their side since last week.
>
> I wasn't going to bring that up because, according to what I read, the feature wasn't going to be enabled until later this year. That said, the YouTube side of things never worked for me, not even last week or during CES. I have no idea how people managed to upload supposedly compliant IAMF videos.
>
> I believe it's a feature that has not been enabled for the general public.
>
> As long as it works with the reference decoder, it should be compliant once it gets enabled on YouTube.
>
> I even tried uploading samples provided by the IAMF GitHub repos, in case I had done something wrong and my files weren't compliant MP4s.

CORRECTION: It just worked. I was able to upload the file that gave me the ffprobe log mentioned earlier. Nothing fancy, but a successful test nonetheless. I have no idea whether it's related to my recently getting YouTube Premium on the account used to upload.

3 participants