
Image aspect ratio - how does it affect accuracy?

Hi all, I've been playing with the examples in the zoo and have a question about image aspect ratio: how do I ensure maximum accuracy? For example, the ssd mobilenet app resizes images to 300x300, which is a square (1:1) ratio:

# Neural network assumes input images are these dimensions.
SSDMN_NETWORK_IMAGE_WIDTH = 300
SSDMN_NETWORK_IMAGE_HEIGHT = 300
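
For reference, the preprocessing that feeds the network boils down to something like this. This is just a minimal sketch of the squash-resize path: only the two constants above come from the app, and the helper name and video-capture code are my own illustration.

import cv2

SSDMN_NETWORK_IMAGE_WIDTH = 300
SSDMN_NETWORK_IMAGE_HEIGHT = 300

def squash_resize(frame):
    # Resize straight to 300x300, ignoring the source aspect ratio.
    # A 960x540 (16:9) frame gets squashed horizontally into 1:1.
    return cv2.resize(frame, (SSDMN_NETWORK_IMAGE_WIDTH, SSDMN_NETWORK_IMAGE_HEIGHT))

cap = cv2.VideoCapture('bus_station_6094_960x540.mp4')
ok, frame = cap.read()
if ok:
    net_input = squash_resize(frame)  # shape: (300, 300, 3)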

But the videos it downloads and uses as examples are all 960x540, which is a 16:9 ratio, for example:

https://raw.githubusercontent.com/nealvis/media/master/traffic_vid/bus_station_6094_960x540.mp4

The code uses OpenCV's resize function. Now, let's compare the results of plain resizing vs. cropping and then resizing. First, this is what resizing that video from 16:9 to 1:1 looks like:

This is what it looks like when you crop first and then resize:

These are clearly quite different, and my question is: how does this affect accuracy? Should I crop my video to 1:1 instead, or do the example models all expect 16:9 video squashed into a 1:1 ratio? To the eye, it seems more logical that the model would perform better on images that are not distorted by resizing to a different aspect ratio.
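
For anyone who wants to reproduce the comparison, the crop-then-resize path is roughly this (a sketch; the helper name is mine, not from the zoo code):

import cv2

def center_crop_then_resize(frame, size=300):
    # Crop the largest centered square out of the frame, then resize it
    # to the network input size. For a 960x540 frame this keeps the middle
    # 540x540 region: nothing is distorted, but the left and right edges
    # are discarded.
    h, w = frame.shape[:2]
    side = min(h, w)
    y0 = (h - side) // 2
    x0 = (w - side) // 2
    square = frame[y0:y0 + side, x0:x0 + side]
    return cv2.resize(square, (size, size))

The trade-off is that the crop keeps the geometry undistorted but throws away anything near the left and right edges of a 16:9 frame.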

Thanks in advance.

Comments

  • I was thinking about this too recently. I think the aspect ratio doesn't matter as long as it remains consistent. If the images you input to the network are always 16:9, then training on 16:9 images is fine. This is because the pre-processing before input to the network will always apply the same 'squashedness', so the objects you are trying to detect will always be distorted by the same amount.

  • Yes, that makes sense. I wonder, then: how do you tell what aspect ratio a model was trained on? Specifically, the ones in the appzoo?

  • Doing a center crop and scaling down to the network input size will give the best results. We have run lots of experiments to validate this.

  • Thanks Victor, yes, agreed. I ran some experiments myself and came to the same conclusion: warping or distorting the image is a bad idea, even if it can still work.

    I think the examples in the ncappzoo should be changed to work on cropped video; otherwise it's misleading for newcomers, who might assume that 16:9 is the desirable ratio.
