Say you have a machine on which a massive number of live streams are transcoded concurrently.
Whenever a new transcoding request arrives, the obvious first step before running the task is to estimate whether enough GPU resources are currently available. If they are, the transcoding session is launched; otherwise, the task is put on hold.
The 'trial-and-error' paradigm (i.e. 'try launching the session, and if it fails, we will handle it somehow') is unsuitable because the streams are live and there is no tolerance for run-time errors of that kind. Instead, some kind of reliable GPU resource estimation mechanism (regardless of how precise) is needed.
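To make the intended flow concrete, here is a minimal sketch of the admit-or-hold logic described above. It only does reservation bookkeeping: the `capacity` and per-session `cost` values are hypothetical estimates in arbitrary units (obtaining them reliably is exactly the open question), and the class and method names are made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GpuAdmissionController:
    """Reservation-based bookkeeping. All numbers are hypothetical estimates."""

    capacity: int                 # assumed total GPU budget, in arbitrary units
    reserved: int = 0             # budget currently held by running sessions
    pending: deque = field(default_factory=deque)  # sessions put on hold

    def request(self, session_id: str, cost: int) -> bool:
        """Admit the session if its estimated cost fits, otherwise hold it."""
        if self.reserved + cost <= self.capacity:
            self.reserved += cost
            return True
        self.pending.append((session_id, cost))
        return False

    def release(self, cost: int) -> list:
        """Return a finished session's budget and admit held sessions (FIFO)."""
        self.reserved -= cost
        admitted = []
        # Admit from the head of the queue for as long as the next one fits.
        while self.pending and self.reserved + self.pending[0][1] <= self.capacity:
            session_id, held_cost = self.pending.popleft()
            self.reserved += held_cost
            admitted.append(session_id)
        return admitted
```

A session is launched only when `request` returns `True`; when a running session ends, `release` hands back its budget and returns the IDs of held sessions that can now start. The scheme is only as good as the cost estimates fed into it, so a safety margin below the real hardware limit would probably be needed.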
Any suggestions on how to approach this kind of problem would be greatly appreciated.