Eyematching by syncing audio

I recently edited a project that featured interviews with people in Zimbabwe speaking their native language. I don’t know which language it was, but to facilitate our editing we were provided ProRes exports with English translations burned into the picture.


We also received camera original media, and after the cut was mostly locked I worked to replace the burned-in translation clips with clean camera clips so we could apply new translations with more stylistic control.

The challenge: The translation-burned clips were edited compilations of all the interviews from that day. There was no timecode reference back to the camera originals, so I was left to eyematch the shots.

There were about a dozen clips that needed replacing, and some were pretty easy. I would find a frame where a hand came up, or the subject looked off to the side, in a way I could visually locate in the camera originals. But eventually I hit some challenging clips where I couldn’t find a visual match, and since I didn’t speak the language, listening to the camera files didn’t help.

Then I had a brainstorm: The computer could listen for me! What if I could get PluralEyes to use its audio waveform-syncing capabilities to link the camera originals to the placeholder media I had been editing with?

I fired up PluralEyes, added the interview compilation clip and the camera originals, then I clicked Synchronize. It didn’t work!

Of course there’s no way it could work. The compilation clip was a long edited stringout of many interview answers; considered as a whole, it doesn’t match any single piece of camera media. So no sync. But also no problem: I just needed to export the small section of the interview I wanted to locate in the camera clips. I parked on the interview clip in my timeline and did a Match Frame to load the source clip (the interview compilation) into the source monitor, marked with an In and an Out. I exported that In-to-Out range to an H.264 file.

Now, armed with the small export from the compilation and the camera original media, I tried PluralEyes again. This time it worked great!
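The principle behind waveform syncing can be sketched as cross-correlation: slide the short excerpt along the longer camera recording and pick the offset where the audio lines up best. This is only a hypothetical illustration with synthetic signals, not PluralEyes’ actual algorithm, and the sample rate and offset here are made up for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

SAMPLE_RATE = 8000                                      # assumed rate for the demo
camera_audio = rng.standard_normal(SAMPLE_RATE * 10)    # 10 s of "camera" audio

# Pretend the compilation excerpt is a 2-second slice of the camera audio,
# degraded slightly (compression, level differences, etc.).
true_offset = 4321
excerpt = camera_audio[true_offset:true_offset + SAMPLE_RATE * 2].copy()
excerpt += 0.05 * rng.standard_normal(excerpt.size)

# Cross-correlate: score every candidate alignment, keep the best-scoring one.
scores = np.correlate(camera_audio, excerpt, mode="valid")
found_offset = int(np.argmax(scores))

print(found_offset, round(found_offset / SAMPLE_RATE, 3))  # offset in samples and seconds
```

With real interview audio the match is less clean than with synthetic noise, but the idea is the same: the excerpt correlates strongly only at the one spot in the camera media it came from.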


I exported this synced sequence to XML for Premiere Pro. After importing and relinking the media, I was able to get the timecode I needed in the camera source, then do a replace edit in my timeline. BTW, I was doing the replace edit on a copy of the compilation clip, added on a track above. With the two clips stacked vertically, I set the top clip to Difference Mode, which helped me make sure I was dead-on with the camera media. If it was off, I slipped the camera original until it was perfectly aligned and the composite of the two shots went black. Difference Mode is very helpful when matching shots.
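Why the composite goes black is easy to see in code: Difference blending is a per-pixel absolute difference, so identical frames cancel to zero. A minimal sketch, using made-up frame data rather than anything from the project:

```python
import numpy as np

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)  # stand-in video frame

def difference_mode(top, bottom):
    """Per-channel absolute difference, like the Difference blend mode."""
    return np.abs(top.astype(np.int16) - bottom.astype(np.int16)).astype(np.uint8)

aligned = difference_mode(frame, frame)                       # same frame on both tracks
slipped = difference_mode(frame, np.roll(frame, 1, axis=1))   # one pixel out of alignment

print(aligned.max())   # 0 -> composite is pure black, the clips match exactly
print(slipped.max())   # nonzero -> still misaligned, keep slipping
```

The moment every pixel difference hits zero, you know the replacement clip sits frame-accurately over the placeholder.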

After posting my success to Twitter, @adkimery asked why I used “PE instead of PPro’s own synch by waveform function?” It never occurred to me!

So I tried it, and it worked too!


For this task it ended up being easier to use Premiere Pro’s built-in syncing. I created a Multicamera Source Sequence from the short compilation excerpt and the camera original media, then opened the multicam clip in the timeline and found the section of the camera clip I needed.

Even though this isn’t the intended purpose of the technology, using audio syncing to match up placeholder media with camera originals was a big help to me in this project.
