FFmpeg: add background audio to a video without muting the original audio

audio, ffmpeg, video

I am merging an audio file into a video. The merge works, but it replaces the video's original audio, which I want to keep. I want the new audio overlaid in the background instead.

Is there a way to achieve this? This is the command I am using:

$ ffmpeg -i adele.mp3 -i Ruby\ on\ Rails\ 101-\ What\ is\ Ruby\ on\ Rails_.mp4 -c:v libx264 -c copy -aspect 16:9 -r 30 test.mp4
ffmpeg version N-60332-ga0d5204 Copyright (c) 2000-2014 the FFmpeg developers
  built on Feb  4 2014 21:31:51 with Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
  configuration: --prefix=/usr/local --enable-gpl --enable-nonfree --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libxvid
  libavutil      52. 63.100 / 52. 63.100
  libavcodec     55. 49.101 / 55. 49.101
  libavformat    55. 29.101 / 55. 29.101
  libavdevice    55.  7.100 / 55.  7.100
  libavfilter     4.  1.102 /  4.  1.102
  libswscale      2.  5.101 /  2.  5.101
  libswresample   0. 17.104 /  0. 17.104
  libpostproc    52.  3.100 / 52.  3.100
Input #0, mp3, from 'adele.mp3':
  Metadata:
    title           : Chasing Pavements
    artist          : Adele
    track           : 3
    album           : 19 (Deluxe Edition)
    date            : 2008
  Duration: 00:03:30.55, start: 0.025056, bitrate: 160 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 159 kb/s
    Stream #0:1: Video: mjpeg, yuvj420p(pc), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc
    Metadata:
      title           : 
      comment         : Other
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'Ruby on Rails 101- What is Ruby on Rails_.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2013-12-21 07:56:20
  Duration: 00:05:42.40, start: 0.000000, bitrate: 1214 kb/s
    Stream #1:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1020 kb/s, 30 fps, 30 tbr, 60 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #1:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 191 kb/s (default)
    Metadata:
      creation_time   : 2013-12-21 07:56:23
      handler_name    : IsoMedia File Produced by Google, 5-11-2011
Overriding aspect ratio with stream copy may produce invalid files
Output #0, mp4, to 'test.mp4':
  Metadata:
    title           : Chasing Pavements
    artist          : Adele
    track           : 3
    album           : 19 (Deluxe Edition)
    date            : 2008
    encoder         : Lavf55.29.101
    Stream #0:0(und): Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], q=2-31, 1020 kb/s, 30 fps, 15360 tbn, 30 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1: Audio: mp3 (i[0][0][0] / 0x0069), 44100 Hz, stereo, 159 kb/s
Stream mapping:
  Stream #1:0 -> #0:0 (copy)
  Stream #0:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame=10272 fps=2793 q=-1.0 Lsize=   46908kB time=00:05:42.40 bitrate=1122.3kbits/s    
video:42642kB audio:4102kB subtitle:0 data:0 global headers:0kB muxing overhead 0.351255

Best Answer

  • You can use the amerge and pan audio filters:

    ffmpeg -i audio.mp3 -i video.mp4 -filter_complex \
    "[0:a][1:a]amerge,pan=stereo|c0<c0+c2|c1<c1+c3[a]" \
    -map 1:v -map "[a]" -c:v copy -c:a aac -shortest output.mp4
    

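    This takes the audio stream from the first input (audio.mp3) and the audio stream from the second input (video.mp4). amerge combines them into a single 4-channel stream, and pan then mixes those 4 channels down to stereo: with two stereo inputs, the left output channel (c0) is built from c0 and c2, and the right output channel (c1) from c1 and c3. Using < instead of = in the pan expression renormalizes the gains, which avoids clipping. The output link label is [a] (you can use almost any arbitrary name).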

    Then -map 1:v selects the video from the second input as a video source for the output file, and -map "[a]" selects the audio from the filtergraph as an audio source for the output file.

    The video is stream copied with -c:v copy, so it is not re-encoded. Filtering requires re-encoding, so the audio cannot be stream copied; -c:a aac is used in this example to re-encode it to AAC.

    The -shortest option ends the output file when the shortest input ends, which is useful if one input is shorter than the other.
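
Since the goal is for the music to sit in the background, one possible variation is to lower its volume with the volume filter before merging. The sketch below is not part of the original answer: the function names build_filter and mix_background, the intermediate label [bg], and the factor 0.3 are all assumptions chosen for illustration, and the pan mapping still assumes that both inputs are stereo.

```shell
#!/bin/sh
# Hypothetical helper: build a filtergraph that attenuates the music
# (input 0) before merging it with the video's own audio (input 1).
build_filter() {
    # $1 = volume factor for the background track, e.g. 0.3 (assumed value)
    printf '[0:a]volume=%s[bg];[bg][1:a]amerge=inputs=2,pan=stereo|c0<c0+c2|c1<c1+c3[a]' "$1"
}

# Hypothetical wrapper around the command from the answer above.
mix_background() {
    # $1 = music file, $2 = video file, $3 = output file, $4 = volume factor
    ffmpeg -i "$1" -i "$2" -filter_complex "$(build_filter "$4")" \
        -map 1:v -map "[a]" -c:v copy -c:a aac -shortest "$3"
}
```

With these definitions loaded, calling mix_background audio.mp3 video.mp4 output.mp4 0.3 should run the same merge as above with the music at roughly 30% of its original level; adjust the factor by ear.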