COCO Validation Dataset Preprocessing for TensorFlow*

Pull Command

docker pull intel/object-detection:tf-1.15.2-imz-2.1.1-preprocess-coco-val


The COCO dataset validation images are used for inference with object detection models.

The preprocessing script calls a conversion script from the TensorFlow Model Garden to convert the raw images and annotations to the TF records format. The version of the conversion script to use depends on the model being run. The table below lists the TensorFlow Model Garden git commit ids that have been tested with each model.

Model          Git Commit ID
Faster R-CNN   7a9934df2afdf95be9405b4e9f1f2480d748dc40
RFCN           1efe98bb8e8d98bbffc703a90d88df15fc2ce906
SSD-MobileNet  7a9934df2afdf95be9405b4e9f1f2480d748dc40
SSD-ResNet34   1efe98bb8e8d98bbffc703a90d88df15fc2ce906
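The mapping in the table above can be captured in a small helper so that TF_MODELS_BRANCH is derived from the model name. This is only a sketch: the function name commit_for_model and the exact model-name spellings it accepts are illustrative assumptions and are not part of the container.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the container): print the tested
# TensorFlow Model Garden commit id for a given model name.
# The commit ids are copied from the table above.
commit_for_model() {
  case "$1" in
    "Faster R-CNN"|"SSD-MobileNet")
      echo "7a9934df2afdf95be9405b4e9f1f2480d748dc40" ;;
    "RFCN"|"SSD-ResNet34")
      echo "1efe98bb8e8d98bbffc703a90d88df15fc2ce906" ;;
    *)
      # Unknown model: report on stderr and signal failure to the caller.
      echo "unknown model: $1" >&2
      return 1 ;;
  esac
}
```

For example, export TF_MODELS_BRANCH=$(commit_for_model "RFCN") would set the variable to the commit id listed for RFCN.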


Prior to running the script, you must download and extract the COCO validation images and annotations from the COCO website.

export DATASET_DIR=<directory where raw images/annotations will be downloaded>
mkdir -p $DATASET_DIR
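The download and extraction step might look like the sketch below. The two URLs are the val2017 image archive and the 2017 train/val annotation archive as published on the COCO website; check them against the COCO download page before relying on them. The function name download_coco_val is illustrative, and the sketch assumes wget and unzip are installed.

```shell
#!/usr/bin/env bash
# Archive locations (assumed current; verify on the COCO download page).
COCO_VAL_IMAGES_URL="http://images.cocodataset.org/zips/val2017.zip"
COCO_ANNOTATIONS_URL="http://images.cocodataset.org/annotations/annotations_trainval2017.zip"

# Hypothetical helper: download and extract both archives into the directory
# given as $1. The body runs in a subshell so the caller's working directory
# is left unchanged.
download_coco_val() (
  dataset_dir="$1"
  mkdir -p "$dataset_dir" && cd "$dataset_dir" || exit 1
  # Validation images: extracts to a val2017/ subdirectory.
  wget "$COCO_VAL_IMAGES_URL" && unzip -q val2017.zip || exit 1
  # Annotations: extracts to an annotations/ subdirectory.
  wget "$COCO_ANNOTATIONS_URL" && unzip -q annotations_trainval2017.zip || exit 1
)
```

Calling download_coco_val "$DATASET_DIR" should leave val2017 and annotations subdirectories under DATASET_DIR, which is the layout the docker run command expects.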



The script expects the following environment variables to be set:

  • DATASET_DIR: Parent directory of the val2017 raw images and annotations files
  • OUTPUT_DIR: Directory where the TF records file will be written
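Before starting the container, it can help to confirm that both locations look as expected. The check below is a minimal sketch: the function name check_coco_inputs is an assumption, and the val2017 and annotations subdirectory names are taken from the paths passed to the container in the docker run command.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the container).
# $1: DATASET_DIR, the parent of the val2017 images and the annotations
# $2: OUTPUT_DIR, created if it does not exist yet
check_coco_inputs() {
  if [ ! -d "$1/val2017" ]; then
    echo "missing image directory: $1/val2017" >&2
    return 1
  fi
  if [ ! -d "$1/annotations" ]; then
    echo "missing annotations directory: $1/annotations" >&2
    return 1
  fi
  # Make sure the output location exists so it can be mounted as a volume.
  mkdir -p "$2" || return 1
}
```

A typical call would be check_coco_inputs "$DATASET_DIR" "$OUTPUT_DIR", run before docker run.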

If the model that you are running requires the dataset in the TF records format, follow the instructions below to run the script in the Docker container.

How to Use the Docker* Container

  1. This container includes the prerequisites needed to run the dataset preprocessing script. You will need to mount volumes for the dataset (raw images and annotations) and the output directory (the location where the TF records file will be written), and set the TF_MODELS_BRANCH environment variable to the git commit id for the TensorFlow Model Garden.

    export DATASET_DIR=<Parent directory of the val2017 raw images and annotations files>
    export OUTPUT_DIR=<directory where TF records will be written>
    export TF_MODELS_BRANCH=<git commit id>
    docker run \
    --env VAL_IMAGE_DIR=${DATASET_DIR}/val2017 \
    --env ANNOTATIONS_DIR=${DATASET_DIR}/annotations \
    --env TF_MODELS_BRANCH=${TF_MODELS_BRANCH} \
    --env OUTPUT_DIR=${OUTPUT_DIR} \
    -v ${DATASET_DIR}:${DATASET_DIR} \
    -v ${OUTPUT_DIR}:${OUTPUT_DIR} \
    -t intel/object-detection:tf-1.15.2-imz-2.1.0-preprocess-coco-val

    After the script completes, OUTPUT_DIR will contain a TF records file for the COCO validation dataset:

    $ ls $OUTPUT_DIR
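Beyond listing the directory, a scripted run can verify that something was actually written. The sketch below only checks for at least one non-empty regular file and makes no assumption about the file's name; the function name has_nonempty_file is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical post-run check: succeeds if directory $1 directly contains
# at least one regular file larger than zero bytes.
has_nonempty_file() {
  [ -n "$(find "$1" -maxdepth 1 -type f -size +0c 2>/dev/null | head -n 1)" ]
}
```

For example, has_nonempty_file "$OUTPUT_DIR" || echo "no output written" >&2 would flag an empty output directory.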

Documentation and Sources

Get Started
Docker Repo
Main GitHub
Release Notes
Get Started Guide

Code Sources
Report Issue

License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.

Product and Performance Information


Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804