How did you record them separately?
Matrix
An open network for secure, decentralized communication
Last time I was looking at solutions for this, Discord wasn't separating out each participant into their own stream.
The OBS part that sounds familiar is maybe separating the voice chat stuff from the game streaming that's also happening.
VDO.ninja is what OP is looking for though. The meets there have each participant as distinct streams (and a host/control for each). OBS can take any of those as a browser source and do whatever you want from there.
Idk about separating out the streams, but I've used video and audio for years, and at least for me, it seems to work just fine. You can do direct calls or set up a video and audio chat room if you want. Element is the go-to client; I don't know the state of calls on other clients.
For this use case it is important to be able to record separate voices/origins to individual audio tracks.
Maybe Element is not the right choice for this. I did some further searching and was unable to find a recording tool for Element that can achieve this.