FFmpeg error trying to merge video and iamf file #20
Comments
Same issue here. By the way, the script currently never uses the variable stream_groups_count after defining it.
Could this be because the example that creates the 5.1 IAMF file from a WAV with ffmpeg defines stream groups with ids 1 and 3, while the sample code that merges with video references stream groups 0 and 1?
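To make the index-versus-id question concrete, here is a minimal Python sketch that pulls both numbers out of ffprobe's human-readable stream group lines. The sample lines are copied from the ffprobe dumps posted further down this thread; the helper name `stream_groups` and the regular expression are illustrative assumptions, not part of the iamf-tools script. In those dumps the groups sit at positions 0 and 1 while carrying ids 0x1 and 0x3, which would be consistent with the two numberings being independent.

```python
import re

# Matches lines such as: "Stream group #0:1[0x3]: IAMF Mix Presentation:"
STREAM_GROUP_RE = re.compile(
    r"Stream group #(?P<file>\d+):(?P<index>\d+)"
    r"\[(?P<id>0x[0-9a-fA-F]+)\]:\s*(?P<kind>[^:]+):"
)


def stream_groups(ffprobe_text):
    """Return (index, id, kind) for every stream group line in ffprobe text output."""
    groups = []
    for line in ffprobe_text.splitlines():
        match = STREAM_GROUP_RE.search(line)
        if match:
            groups.append(
                (int(match["index"]), int(match["id"], 16), match["kind"].strip())
            )
    return groups


# Two lines taken from the ffprobe output quoted in this thread.
sample = """\
Stream group #0:0[0x1]: IAMF Audio Element:
Stream group #0:1[0x3]: IAMF Mix Presentation:
"""

for index, group_id, kind in stream_groups(sample):
    print(f"index={index} id={group_id} kind={kind}")
```

If a command addresses groups by position, 0 and 1 would be the values to use here even though the ids are 1 and 3.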
I think the issue might be that the script says map 0:a:0 instead of 0:a. Changing that fixes it for me, but I'm not sure whether the resulting file is OK.
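As a rough illustration of why that change matters: in ffmpeg stream specifiers, 0:a:0 selects only the first audio stream of input 0, while 0:a selects every audio stream of input 0. The toy Python model below is a sketch only; `select_streams` is a hypothetical helper that handles just these two forms and is not ffmpeg's real parser. The four audio streams correspond to the four Opus substreams that ffprobe lists for the 5.1 file later in this thread.

```python
def select_streams(stream_types, specifier):
    """Toy model of a '-map' specifier of the form 'file:type' or 'file:type:n'.

    stream_types is a list of one-letter types ('a' audio, 'v' video) for one input.
    Returns the indices of the streams the specifier would select.
    """
    parts = specifier.split(":")
    type_letter = parts[1]
    matching = [i for i, t in enumerate(stream_types) if t == type_letter]
    if len(parts) == 3:
        n = int(parts[2])
        # Only the n-th stream of that type, or nothing if it does not exist.
        return matching[n:n + 1]
    return matching


# A 5.1 IAMF input as seen by ffmpeg in this thread: four audio substreams.
iamf_input = ["a", "a", "a", "a"]

print(select_streams(iamf_input, "0:a:0"))
print(select_streams(iamf_input, "0:a"))
```

With 0:a:0 only one of the four substreams would be carried over, which may be why the merge failed before the change.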
I just checked with the decoder and it seems to be working. On YouTube, though, I get the "Processing abandoned" error when uploading.
- Applies suggestion from a comment by rafalfaro18 (#20 (comment)).
- Part of #20. Leave issue open, because it still assumes a single audio element.
PiperOrigin-RevId: 713359902
Thanks, this seems to work better. Script still needs further updates to handle more than 1 audio element (e.g. 3OA + stereo use case).
Can you share a dump of |
Glad to see it's getting worked on. The fact that FFmpeg reports
is not critical? |
Is this a compliant MP4 with IAMF audio?
ffprobe -show_streams -of json .\Final.mp4
ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg developers
built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 59. 39.100 / 59. 39.100
libavcodec 61. 19.100 / 61. 19.100
libavformat 61. 7.100 / 61. 7.100
libavdevice 61. 3.100 / 61. 3.100
libavfilter 10. 4.100 / 10. 4.100
libswscale 8. 3.100 / 8. 3.100
libswresample 5. 3.100 / 5. 3.100
libpostproc 58. 3.100 / 58. 3.100
{
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '.\Final.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiamfiso2avc1mp41
encoder : Lavf61.7.100
Duration: 00:00:13.02, start: 0.000000, bitrate: 236 kb/s
Stream group #0:0[0x1]: IAMF Audio Element:
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Layer 0: 5.1
Stream group #0:1[0x3]: IAMF Mix Presentation:
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Annotations:
en-us : default_mix_presentation
Submix 0:
IAMF Audio Element #0:0[0x1]
Annotations:
en-us : 5.1
Layout #0: stereo
Stream #0:4[0x5](und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 117 kb/s, 30 fps, 24.60 tbr, 24596 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : AVC Coding
"streams": [
{
"index": 0,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 2,
"channel_layout": "stereo",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x0",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624960,
"duration": "13.020000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 1,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 2,
"channel_layout": "stereo",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x1",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624960,
"duration": "13.020000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 1,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 2,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 1,
"channel_layout": "mono",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x2",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624960,
"duration": "13.020000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 1,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 3,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 1,
"channel_layout": "mono",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x3",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624960,
"duration": "13.020000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 1,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 4,
"codec_name": "h264",
"codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
"profile": "Main",
"codec_type": "video",
"codec_tag_string": "avc1",
"codec_tag": "0x31637661",
"width": 1920,
"height": 1080,
"coded_width": 1920,
"coded_height": 1080,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 0,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuv420p",
"level": 40,
"chroma_location": "left",
"field_order": "progressive",
"refs": 1,
"is_avc": "true",
"nal_length_size": "4",
"id": "0x5",
"r_frame_rate": "6149/250",
"avg_frame_rate": "98384/3279",
"time_base": "1/24596",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 3279,
"duration": "0.133314",
"bit_rate": "117136",
"bits_per_raw_sample": "8",
"nb_frames": "4",
"extradata_size": 40,
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0,
"multilayer": 0
},
"tags": {
"language": "und",
"handler_name": "VideoHandler",
"vendor_id": "[0][0][0][0]",
"encoder": "AVC Coding"
}
}
]
} |
I see the duration of the audio is 13.020000 s, but the duration of the video is 0.133314 s. Can you retry encoding, using audio that matches the duration of the video?
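For anyone who wants to check this automatically, here is a minimal Python sketch that reads the JSON produced by ffprobe -show_streams -of json and reports the largest gap between stream durations. The function names are illustrative assumptions, and the sample values are the two durations quoted above. Note that the dumps pasted in this thread mix ffprobe's log lines with the JSON; only the JSON written to standard output should be fed to the parser.

```python
import json


def stream_durations(ffprobe_json):
    """Map each stream index to (codec_type, duration in seconds)."""
    report = json.loads(ffprobe_json)
    return {
        s["index"]: (s["codec_type"], float(s["duration"]))
        for s in report["streams"]
        if "duration" in s
    }


def max_duration_gap(ffprobe_json):
    """Largest difference in seconds between any two stream durations."""
    durations = [d for _, d in stream_durations(ffprobe_json).values()]
    return max(durations) - min(durations) if durations else 0.0


# Reduced sample using the audio and video durations from the dump above.
sample = json.dumps({"streams": [
    {"index": 0, "codec_type": "audio", "duration": "13.020000"},
    {"index": 4, "codec_type": "video", "duration": "0.133314"},
]})

gap = max_duration_gap(sample)
print(f"largest duration gap: {gap:.3f} s")
```

What counts as an acceptable gap is a judgment call; a difference of several seconds, as here, clearly points at mismatched inputs.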
That's great, thanks! I submitted a file to YouTube again: still "abandoning the processing", though... It really looks like they changed something on their side since last week.
I wasn't going to bring that up because, according to what I read, the feature wasn't going to be enabled until later this year. That said, the YouTube side of things never worked for me, not even last week or during CES. I have no idea how people managed to upload supposedly compliant IAMF videos; I believe it's a feature that has not been enabled for the general public. As long as a file works with the reference decoder, it should be compliant for when the feature gets enabled on YouTube. I even tried uploading samples provided by the IAMF GitHub repos, in case I had done something wrong and my files weren't compliant MP4s.
My bad, I clipped the video previously but forgot to test with the shortened version.
Tested with the new example commands, starting with a 5.1 WAV file and a video of the same length without an audio track:
ffmpeg -i "C:\Users\xxx\Documents\REAPER Media\ReaSurroundTest5.1-Noise.wav" -i "C:\Users\xxx\Downloads\Test.mp4" -c:v copy -filter_complex "[0:a]channelmap=0|1:stereo[FRONT];[0:a]channelmap=4|5:stereo[BACK];[0:a]channelmap=2:mono[CENTER];[0:a]channelmap=3:mono[LFE]" -map "[FRONT]" -map "[BACK]" -map "[CENTER]" -map "[LFE]" -map 1:0 -stream_group "type=iamf_audio_element:id=1:st=0:st=1:st=2:st=3:audio_element_type=channel,layer=ch_layout=5.1" -stream_group "type=iamf_mix_presentation:id=3:stg=0:annotations=en-us=default_mix_presentation,submix=parameter_id=100:parameter_rate=48000:default_mix_gain=0.0|element=stg=0:headphones_rendering_mode=binaural:annotations=en-us=5.1:parameter_id=101:parameter_rate=48000:default_mix_gain=0.0|layout=sound_system=stereo:integrated_loudness=0.0:digital_peak=0.0" -streamid 0:0 -streamid 1:1 -streamid 2:2 -streamid 3:3 -streamid 4:4 -c:a libopus -b:a 64000 "C:\Users\xxx\Downloads\Final.mp4"
Result:
ffprobe -of json -show_streams "C:\Users\xxx\Downloads\Final.mp4"
ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg developers
built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 59. 39.100 / 59. 39.100
libavcodec 61. 19.100 / 61. 19.100
libavformat 61. 7.100 / 61. 7.100
libavdevice 61. 3.100 / 61. 3.100
libavfilter 10. 4.100 / 10. 4.100
libswscale 8. 3.100 / 8. 3.100
libswresample 5. 3.100 / 5. 3.100
libpostproc 58. 3.100 / 58. 3.100
{
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000021c3c711000] DTS discontinuity in stream 4: packet 3 with DTS 12051, packet 4 with DTS 21088
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000021c3c711000] DTS discontinuity in stream 4: packet 5 with DTS 21089, packet 6 with DTS 30126
[mov,mp4,m4a,3gp,3g2,mj2 @ 0000021c3c711000] DTS discontinuity in stream 4: packet 7 with DTS 30127, packet 8 with DTS 34646
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:\Users\xxx\Downloads\Final.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiamfiso2avc1mp41
date : 2024-11-16
encoder : Lavf61.7.100
Duration: 00:00:13.00, start: 0.000000, bitrate: 181 kb/s
Stream group #0:0[0x1]: IAMF Audio Element:
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Layer 0: 5.1
Stream group #0:1[0x3]: IAMF Mix Presentation:
Metadata:
handler_name : SoundHandler
vendor_id : [0][0][0][0]
Annotations:
en-us : default_mix_presentation
Submix 0:
IAMF Audio Element #0:0[0x1]
Annotations:
en-us : 5.1
Layout #0: stereo
Stream #0:4[0x5](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 18 kb/s, 24.39 fps, 32 tbr, 144606 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
"streams": [
{
"index": 0,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 2,
"channel_layout": "stereo",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x0",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624000,
"duration": "13.000000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 1,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 2,
"channel_layout": "stereo",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x1",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624000,
"duration": "13.000000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 1,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 2,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 1,
"channel_layout": "mono",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x2",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624000,
"duration": "13.000000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 1,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 3,
"codec_name": "opus",
"codec_long_name": "Opus (Opus Interactive Audio Codec)",
"codec_type": "audio",
"codec_tag_string": "[0][0][0][0]",
"codec_tag": "0x0000",
"sample_fmt": "fltp",
"sample_rate": "48000",
"channels": 1,
"channel_layout": "mono",
"bits_per_sample": 0,
"initial_padding": 0,
"id": "0x3",
"r_frame_rate": "0/0",
"avg_frame_rate": "0/0",
"time_base": "1/48000",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 624000,
"duration": "13.000000",
"extradata_size": 19,
"disposition": {
"default": 0,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 1,
"still_image": 0,
"multilayer": 0
}
},
{
"index": 4,
"codec_name": "h264",
"codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
"profile": "High",
"codec_type": "video",
"codec_tag_string": "avc1",
"codec_tag": "0x31637661",
"width": 1920,
"height": 1080,
"coded_width": 1920,
"coded_height": 1080,
"closed_captions": 0,
"film_grain": 0,
"has_b_frames": 3,
"sample_aspect_ratio": "1:1",
"display_aspect_ratio": "16:9",
"pix_fmt": "yuv420p",
"level": 40,
"color_range": "tv",
"color_space": "bt709",
"color_transfer": "bt709",
"color_primaries": "bt709",
"chroma_location": "left",
"field_order": "progressive",
"refs": 1,
"is_avc": "true",
"nal_length_size": "4",
"id": "0x5",
"r_frame_rate": "32/1",
"avg_frame_rate": "22486233/921863",
"time_base": "1/144606",
"start_pts": 0,
"start_time": "0.000000",
"duration_ts": 1843726,
"duration": "12.749997",
"bit_rate": "18133",
"bits_per_raw_sample": "8",
"nb_frames": "311",
"extradata_size": 51,
"disposition": {
"default": 1,
"dub": 0,
"original": 0,
"comment": 0,
"lyrics": 0,
"karaoke": 0,
"forced": 0,
"hearing_impaired": 0,
"visual_impaired": 0,
"clean_effects": 0,
"attached_pic": 0,
"timed_thumbnails": 0,
"non_diegetic": 0,
"captions": 0,
"descriptions": 0,
"metadata": 0,
"dependent": 0,
"still_image": 0,
"multilayer": 0
},
"tags": {
"language": "eng",
"handler_name": "VideoHandler",
"vendor_id": "[0][0][0][0]"
}
}
]
} |
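The -filter_complex string in the ffmpeg command above splits the 5.1 input into four substreams (two stereo pairs and two mono channels), which matches the four Opus streams in this ffprobe output. As a sketch of how that string could be generated rather than typed by hand, the Python below rebuilds it from a small table. The table and function names are assumptions made for illustration; the channel indices, layouts and labels are copied from the command.

```python
# (label, source channel indices, output layout), in the order used by the command.
SUBSTREAMS_5_1 = [
    ("FRONT", (0, 1), "stereo"),
    ("BACK", (4, 5), "stereo"),
    ("CENTER", (2,), "mono"),
    ("LFE", (3,), "mono"),
]


def channelmap_graph(substreams, source="0:a"):
    """Build the -filter_complex value: one channelmap chain per substream."""
    chains = []
    for label, channels, layout in substreams:
        mapping = "|".join(str(c) for c in channels)
        chains.append(f"[{source}]channelmap={mapping}:{layout}[{label}]")
    return ";".join(chains)


def map_args(substreams):
    """Build the matching '-map [LABEL]' arguments in the same order."""
    args = []
    for label, _, _ in substreams:
        args += ["-map", f"[{label}]"]
    return args


print(channelmap_graph(SUBSTREAMS_5_1))
print(map_args(SUBSTREAMS_5_1))
```

Keeping the labels and the -map arguments in one table makes it harder for the substream order to drift away from the st=0 to st=3 references in the -stream_group option.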
CORRECTION: It just worked. I was able to upload the file that gave me the ffprobe log mentioned earlier. Nothing fancy, but a successful test nonetheless. I have no idea whether it's related to my recently getting YouTube Premium on the account used to upload.
Using the script to merge video and audio available here https://github.com/AOMediaCodec/iamf-tools/blob/main/docs/external/encoding_with_external_tools.md#encode-wav-files-to-iamf-with-ffmpeg
gives this