I don't know of any way to do that automatically.
I'd suggest using a frame, arrow, or other object to highlight the speaking person's token and manually placing it on the bottom track. You'll need to make a copy for each individual player, then place the appropriate one as the speaker changes.
You might find it easiest to place all the highlight objects for Player 1 wherever they're talking, then move on to the next player, so you'll always have the correct highlight object in the clipboard.