The compiler may be able to perform additional optimizations if it is able to optimize across source line boundaries. These may include, but are not limited to, function inlining. This is enabled with the -ipo option.
Recompile the program using the -ipo option to enable interprocedural optimization.
ifort -real-size 64 -qopt-report=2 -qopt-report-phase=vec -D ALIGNED -ipo matvec.f90 driver.f90 -o MatVector