Consider a scenario in which several H.264 streams, produced by different encoders, are stitched/concatenated. The imsdk documentation calls this "Multiple-Segment Encoding".
The resulting stitched stream should look as if it had been encoded continuously by a single encoder from one uncompressed source. That is, all segments must have the same frame size, frame rate, etc. Also, the joints must not have holes or overlaps in time.
Now, about the timestamps. A shift between PTS and DTS exists in every stream that contains B-frames: to use forward prediction, the decoder must first decode one or more future frames — see the figure below. That PTS-DTS shift equals the maximum number of (reference) frames the encoder can use for forward prediction.
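A minimal sketch of how that shift arises (plain timestamp arithmetic, not imsdk API; the GOP layout and reorder model below are illustrative assumptions for a 25p stream with IBBP and no B-pyramid):

```python
# Display order for one GOP fragment: I0 B1 B2 P3 B4 B5 P6 (25p, 40 ms frames).
# Decode order moves each P ahead of the B-frames that reference it.

FRAME_DUR_MS = 40  # 25p
REORDER_DELAY = 1  # frames the decoder must buffer before output (IBBP, no B-pyramid)

display_order = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
decode_order  = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]

pts = {name: i * FRAME_DUR_MS for i, name in enumerate(display_order)}
dts = {name: (k - REORDER_DELAY) * FRAME_DUR_MS for k, name in enumerate(decode_order)}

# Sanity: DTS must increase in decode order and never exceed PTS.
assert all(dts[decode_order[k]] < dts[decode_order[k + 1]]
           for k in range(len(decode_order) - 1))
assert all(dts[n] <= pts[n] for n in display_order)

shift = dts["I0"] - pts["I0"]
print(shift)  # -40: the DTS timeline starts one frame ahead of PTS
```

With one more level of B-pyramid the reorder delay grows by one frame and the shift becomes -80 ms, and so on.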
Thus, all concatenated segments must also have the same PTS-DTS shift; otherwise we get time holes or overlaps at the joints.
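To make the joint problem concrete, here is a small sketch (my own model, not imsdk API): if segment B's PTS timeline starts right after segment A's, the DTS distance across the joint is one frame duration only when both segments have the same shift.

```python
def dts_step_at_joint(shift_a_ms, shift_b_ms, frame_dur_ms=40):
    """DTS distance from segment A's last frame to segment B's first frame,
    assuming B's PTS timeline starts exactly where A's ends.
    A continuous stream steps by exactly one frame duration (40 ms at 25p)."""
    return frame_dur_ms + shift_b_ms - shift_a_ms

print(dts_step_at_joint(-40, -40))  # 40 -> seamless joint
print(dts_step_at_joint(-40, -80))  # 0  -> DTS overlap at the joint
print(dts_step_at_joint(-80, -40))  # 80 -> 40 ms DTS hole at the joint
```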
In the current H.264 encoder implementation, the PTS-DTS shift depends on the GopRefDist and NumRefFrame initialization parameters (although, in my opinion, the dependence on GopRefDist is superfluous).
With imsdk 1.7sw, only a few shifts could be obtained: 0, -40, and -80 ms (for 25p/50i streams).
imsdk 1.8sw extends this to 0, -40, -80, -120, and -160 ms. This is good news: most real-world streams fit within these values.
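For reference, at 25p every frame lasts 40 ms, so the shifts available in imsdk 1.8sw correspond to reorder depths of 0 to 4 frames (my interpretation, not from the documentation):

```python
FRAME_DUR_MS = 40  # 25p / 50i
shifts = [-depth * FRAME_DUR_MS for depth in range(5)]
print(shifts)  # [0, -40, -80, -120, -160]
```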
Yeah, the dependence formula is quite sophisticated:)
Unfortunately, I have no reason to expect that this dependence formula will stay unchanged in future imsdk versions (or between the sw and hw implementations).
But I want to be sure that Multiple-Segment Encoding will produce correct results in the future. I am not (yet) expecting full HRD/VBV conformance at the joints — only the basic norms.
So, may I ask you to consider implementing a NumForwardRefFrame parameter (described here: http://software.intel.com/en-us/forums/topic/475621#comment-1758748) in a future imsdk release?