I don't think there is an easy way of doing this.
If I were to do this, I would probably go about it like this:
a) Put all four tracks in the timeline and sync them by audio.
b) Make sure that the starting point of all four tracks is equal, by cutting off the start bits of each and dragging them to the start point in the track. I would do the same at the end, just to make sure that I have 4 pieces of video, one in each track, that are audio-synced and of equal length.
c) For each track, I would either cut the pieces manually or use pre-cut, whichever is easier. I would pick the cut points by hand, at moments in the video that are artistically nice, and I would make sure to cut all four videos at precisely the same moments. This is why the complete sync is important. You can use the timecode in each track for this. I would also make sure that each piece can be recognized: which track/video it came from and which number it has in the sequence.
d) Make a collage from the pre-cut pieces, using the sequence numbers to select the right ones. Because the pieces with the same sequence number all have the same length, you have to make sure that the videos in a collage all start at once. Also make sure that the collage sequence is still recognizable.
e) The collages can be produced separately (even in separate projects), but I would certainly try to look at the result of the whole sequence of collages first.
f) Once completed, I would produce the result as a high-quality video.
g) I would also make a multicam video with the four videos, do some proper director's/editor's work with them, and produce the result of this in high quality as well.
h) I would then take both the collage (f) and multicam (g) videos and put them in the multicam again, so that I can edit out the weak collage results, if any.
Of course, precision in steps c) and d) is vital to this approach.
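If it helps with the bookkeeping for steps b) to d), here is a minimal Python sketch of how the shared cut points and piece labels could be worked out on paper before touching the editor. It is not tied to any editing software. The track names, sync offsets, durations, and cut points are made-up example values, and `Track`, `common_range`, and `plan_pieces` are hypothetical helpers I invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Track:
    name: str        # hypothetical label, e.g. "cam1"
    offset: float    # seconds: where the clip starts on the synced timeline
    duration: float  # seconds: length of the clip


def common_range(tracks):
    """Return (start, end) of the timeline span that every synced track covers."""
    start = max(t.offset for t in tracks)
    end = min(t.offset + t.duration for t in tracks)
    if end <= start:
        raise ValueError("tracks do not overlap after syncing")
    return start, end


def plan_pieces(tracks, cut_points):
    """Cut every track at the same timeline moments and label each piece."""
    start, end = common_range(tracks)
    # Keep only cuts strictly inside the common span, sorted, without duplicates.
    inner = sorted({c for c in cut_points if start < c < end})
    edges = [start, *inner, end]
    pieces = []
    for t in tracks:
        for i, (a, b) in enumerate(zip(edges, edges[1:]), start=1):
            pieces.append({
                "label": f"{t.name}_{i:02d}",   # track name + sequence number
                "timeline_in": a,
                "timeline_out": b,
                "source_in": a - t.offset,      # position inside the clip itself
                "source_out": b - t.offset,
            })
    return pieces


# Made-up example values: four cameras with slightly different start times.
tracks = [
    Track("cam1", 0.0, 120.0),
    Track("cam2", 1.5, 121.0),
    Track("cam3", 0.5, 118.0),
    Track("cam4", 2.0, 119.0),
]
for piece in plan_pieces(tracks, [30.0, 62.5, 95.0]):
    print(piece)
```

Pieces with the same sequence number (for example `cam1_02` and `cam3_02`) cover the same span of the timeline, so they have the same length and can start together in one collage.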
I imagine it will take some effort to get this done.
Hope this helps, or prompts you (or others) to come up with further and brighter ideas.