Performance issue about the stick 2.

Hi,
I ran some tests on the Stick 2. Running demo_squeezenet_download_convert_run.sh on it gave this result:
total inference time: 9.0164393
This is similar to the result in this post: https://ncsforum.movidius.com/discussion/1329/lattepanda-alpha-openvino-cpu-core-m3-vs-ncs1-vs-ncs2-performance-comparison
According to the test, I can achieve 110 FPS. The total workload of squeezenet1.1 is 360 M MACs (720 MOPs, counting each MAC as two operations), so the throughput is about 79.2 GOPS.
The Stick 2 is rated at 1 TOPS for neural networks and 4 TOPS of total compute, yet this test used less than 10% of that capability.
Could anyone tell me whether this is normal, or whether I missed something?
Thanks
Honglei
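
The arithmetic in the post above can be checked with a short Python sketch. It is only an illustration under stated assumptions: each MAC is counted as two operations, the reported inference time is taken to be in milliseconds (the post gives no unit), and the helper names are made up for this example.

```python
# Sketch of the throughput arithmetic from the post above.
# Assumptions (not confirmed in the thread): 1 MAC = 2 operations,
# and the reported "total inference time" is in milliseconds.

def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second for one inference taking latency_ms."""
    return 1000.0 / latency_ms

def throughput_gops(macs_per_inference: float, fps: float, ops_per_mac: int = 2) -> float:
    """Sustained throughput in giga-operations per second."""
    return macs_per_inference * ops_per_mac * fps / 1e9

def utilization(gops: float, peak_tops: float) -> float:
    """Fraction of the rated peak (given in TOPS) that was actually used."""
    return gops / (peak_tops * 1000.0)

squeezenet_macs = 360e6   # workload quoted in the post
measured_fps = 110.0      # frame rate quoted in the post

estimated_fps = fps_from_latency_ms(9.0164393)
gops = throughput_gops(squeezenet_macs, measured_fps)
share = utilization(gops, peak_tops=1.0)

print(f"FPS estimated from latency: {estimated_fps:.1f}")
print(f"Throughput: {gops:.1f} GOPS")
print(f"Share of the 1 TOPS rating: {share:.1%}")
```

With the numbers from the post, this reproduces the 79.2 GOPS figure and a utilization just under 8% of the 1 TOPS rating.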

Comments

  • Hi @Honglei

    Are you using OpenVINO 2019 R1? Updating your SDK to the latest version should improve performance a little. Were you able to get results from testing other examples?

    Best Regards,
    Sahira

  • Hi, @Sahira_at_Intel ,
    Thanks for your reply.
    We are using OpenVINO 2019 R1.
    Could you explain the test result, or recommend another test example that achieves higher throughput?
    Thanks
    Honglei

  • Hi @Honglei
    I just wanted to know how the performance of your NCS2 compares to others, to make sure there are no other issues. You can try the security_barrier_camera_demo, interactive_face_detection_demo, and segmentation_demo, and I can compare your performance to mine. The system you're using can also impose some limitations. What system are you using?

    Best Regards,
    Sahira

  • Hi Sahira,
    Below are the command I tried and the corresponding result.

    ./interactive_face_detection_demo -i cam -m open_model_zoo/model_downloader/downloaded_models/Transportation/object_detection/face/pruned_mobilenet_reduced_ssd_shared_weights/dldt/face-detection-adas-0001-fp16.xml -d MYRIAD
    Result: 11 fps
    For this test, the model requires 2.83 GFLOPs per inference, so the total throughput is about 31 GOPS. (Please point out if I am wrong.)

    My host:
    HP ZBook
    CPU: 8-cores, Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
    Memory: 15.5G
    OS: Ubuntu 16.04 64bit

    I also ran the GoogLeNet v1 test.
    ./classification_sample -d MYRIAD -i ~/Downloads/car_1.jpg -m openvino/open_model_zoo/model_downloader/downloaded_models/classification/googlenet/v1/caffe/googlenet-v1.xml
    Result: 43 fps
    Intel's website reports about 80 fps for this test (https://software.intel.com/en-us/neural-compute-stick). Even at 80 fps, the throughput is about 250 GOPS, or 25% of the rated capability. Does the 1 TOPS figure for the Stick 2 refer to INT8 performance, or something else?
    Thanks
    Honglei
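
To put the three measurements in this thread side by side, here is a rough Python sketch. The per-inference workloads are taken or derived from the posts and are assumptions, not vendor figures: SqueezeNet 1.1 as 0.72 GOP (360 M MACs counted as two operations each), the face detection model as 2.83 GOP (the poster's GFLOPs figure used as-is), and GoogLeNet v1 as 250 / 80 GOP (implied by the "80 fps is about 250 GOPS" estimate). The `summarize` helper is hypothetical.

```python
# Side-by-side view of the benchmarks reported in this thread.
# All workloads below are the poster's own estimates or values derived
# from them; none are official figures for these models.

PEAK_GOPS = 1000.0  # the 1 TOPS rating quoted in the thread

# (model, giga-operations per inference, measured frames per second)
runs = [
    ("SqueezeNet 1.1", 0.72, 110.0),            # 360 M MACs x 2 ops
    ("face-detection-adas-0001", 2.83, 11.0),   # poster's GFLOPs figure
    ("GoogLeNet v1", 250.0 / 80.0, 43.0),       # implied by 250 GOPS at 80 fps
]

def summarize(runs, peak_gops=PEAK_GOPS):
    """Return (model, achieved GOPS, fraction of peak) for each run."""
    rows = []
    for name, gop_per_inference, fps in runs:
        gops = gop_per_inference * fps
        rows.append((name, gops, gops / peak_gops))
    return rows

rows = summarize(runs)
for name, gops, share in rows:
    print(f"{name:26s} {gops:7.1f} GOPS  {share:6.1%} of peak")
```

Under these assumptions, every run lands well below the 1 TOPS rating, and the measured 43 fps on GoogLeNet v1 works out to roughly 134 GOPS.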

This discussion has been closed.