I am trying to encode an H.264 MVC stream from two 1920x1080 YV12 files using the 'sample_encode.exe' provided in Media SDK 2012 p_3.0.014_GOLD (which I believe is the latest Media SDK release as of this writing). The command line I am using is:
.\\samples\\_bin\\win32\\sample_encode.exe mvc -i view_0.yuv -i view_1.yuv -o test_mvc.264 -w 1920 -h 1080 -u quality
In the encoded bitstream, the first view component (base view) of the first IDR Access Unit has 4 slices. The NAL unit containing the first slice is immediately preceded by a Prefix NAL unit (nal_unit_type equal to 14) in which the 'inter_view_flag' bit in the nal_unit_header_mvc_extension() is coded to 0. But for the NAL units containing the other 3 slices, none of them is preceded by a Prefix NAL unit.
However, according to the 'inter_view_flag' semantics specified in sec. H.126.96.36.199 of the H.264 specification:
When nal_unit_type is equal to 1 or 5 and the NAL unit is not immediately preceded by a NAL unit with nal_unit_type equal to 14, inter_view_flag shall be inferred to be equal to 1.
The value of inter_view_flag shall be the same for all VCL NAL units of a view component.
the Media SDK encoded MVC stream should be deemed as illegal because its first VCL NAL unit has 'inter_view_flag' coded to 0, while its other VCL NAL units in the same view component (base-view) all have an inferred 'inter_view_flag' of 1, which violates the clause that "The value of inter_view_flag shall be the same for all VCL NAL units of a view component".
Due to such inconsistency in 'inter_view_flag', the encoded stream failed to pass our bitstream validation tool and cannot be correctly decoded by our MVC decoder.
I know this can be worked around by letting the encoder encode only one slice per view component (this can be achieved by specifying '-u balanced' instead of '-u quality' in the sample_encode.exe cmmand line, or by assigning 1 to mfxVideoParam.mfx.NumSlice before encoder init). However, after such workaround the encoder seems unable to make use of all cores of the CPU. In my experiments on a quad-core PC, it appears only one CPU core is being used which makes the encoding process way slower (I am using the SW implementation of the SDK).
I hope Intel will be able to fix this bug in the next SDK release. Meanwhile, is there any better workaround you may suggest? Thank you.