Before engine-compatible assets were good enough to render in-engine, cinematics that involved character animation were authored along an entirely parallel path, using the same reference art, in a dedicated character animation package. Maya and 3D Studio Max were widespread at the time, and most reference art and lookdev was done in them anyway, so they were a natural choice; Softimage (RIP) was an artist favorite for offline animation too.
Sometimes making the cinematics wasn't a core competency of the studio working on the game, so VFX or animation studios were contracted to do it. Often that meant the contracted studio was already set up to work for TV or film, staffed such that a 3D render would reach a compositor relatively quickly, where many manual corrections would happen in 2D space. Compositing software has largely consolidated nowadays to a few survivors (mostly Nuke and Flame), but at the time there were many options, like Combustion and Quantel's Henry, and even After Effects saw a fair amount of use in places I worked.
To this day, not all engines handle cinematics gracefully: even if an external/offline renderer can do the full render (as in Unreal Engine), some engines don't support the animation systems required for cinematics, or don't support them at a velocity artists like. As a result, the same offline software workflows are still in use today.
Off the top of my head, we did intros & cutscenes for these titles/platforms (a small subset):
Spycraft | Activision | PC
Gex | Crystal Dynamics | 3DO
The Horde | Crystal Dynamics | 3DO
Street Fighter II | Sega
Final Fantasy 7 | Square | (PC and/or PS2??)
There were at least 20 others.
The software itself is a factor in this (and there was hardware, too: a lot of Silicon Graphics workstations were used in the mid-'90s), but it was the device constraints at play that dictated the idea of pre-rendering 3D assets for games like the Diablos and Resident Evils, which in turn made it easier to consider reusing those assets for FMV. That in turn produced the "parallel pipelines" mentioned in omershapira's comment whenever the engine was actually capable of 3D: games were often pitched to publishers with a cutscene trailer, and then the development team figured out what the engine tech could actually do as they went along. Because the in-game assets were still very basic and could be produced relatively cheaply from a design spec, this served development and marketing goals at once. Lara Croft got on all the magazine covers because of the high-poly CGI, not her in-game model.
(Why would publishers focus on assets? In this period, acquisition was extremely common as the industry got bigger and financing new projects grew riskier, so publishers gravitated toward shopping around for IP development and staffing at a low price. What they were banking on was not just one hit or a powerful engine, but a franchise that would sell well for years and experienced developers they could put on their own projects. Likewise, studios were hungry for publisher support, and their heads often settled for an acquisition rather than shutting down. Focusing on asset production was a way of meeting in the middle, since the tech was so often an unknown: if you acquired a team that could make good assets and plugged them in with an in-house tech team, a product could be made.)
Cutscene creation was usually outsourced to dedicated studios because it was completely disconnected from the actual game development process.
Blur, for one, heavily uses 3ds Max.
If they were made externally by a third-party production studio that also worked for TV and cinema, they were probably made on SGI workstations running Maya, Softimage 3D, or LightWave.