RaspberryPi3 + OpenVINO + NCS2 + Single Thread + MobileNet-SSD, Implemented.

about 10 FPS

https://github.com/PINTO0309/MobileNet-SSD-RealSense/blob/master/SingleStickSSDwithUSBCamera_OpenVINO_NCS2.py

On a Core i7 it was 15 FPS.
Next, I will add a MultiProcess + MultiStick implementation.

Comments

  • You're anything but stupid. The results you have shown have made me move in a different direction. I don't see the hardware Movidius provides as useful when you want optimal performance. The sticks are low power, which may be useful, and if you are not concerned with FPS, as in a doorbell video sensor, I think you're OK. But if you want to use it on a drone or anything that moves quickly, I am not sure it's useful.

    Thinking about this, do you think Movidius can speed up their chip to make it a useful coprocessor? I guess I just don't understand what the bottleneck is. I like the idea of a low-cost coprocessor that can handle the inference.

  • Core i7 + NCS2 [21 FPS]

  • Hello PINTO,

    I have tried to run vehicle-detection-adas-0002 on a Raspberry Pi with NCS2 and its performance was pretty poor (~4 FPS). Then I tried to run the Python script on an i7 machine and the performance was also poor (6.46 FPS). Then I tried to use NCS with OpenVINO and the speed was 7.44 FPS, and I don't understand how that is possible.

    I tested this on a video captured with a GoPro. I haven't tested with a webcam yet, but I still don't understand how NCS can perform better than NCS2.

    Did you encounter any specific issues that you had to solve in order to achieve such performance? I would appreciate any suggestions for achieving a better FPS rate.

  • edited December 2018

    @nikogamulin

    "Then I tried to run the Python script on an i7 machine and the performance was also poor (6.46 FPS)."

    Our benchmarks show that NCS2 is slower than Atom / Core i7 / Core i5 / Core m3 CPUs.
    Using OpenVINO on the CPU seems to greatly optimize internal processing through the MKLDNN plugin.
    NCS2 only outperforms Celeron CPUs.

    "Then I tried to use NCS with OpenVINO and the speed was 7.44 FPS, and I don't understand how that is possible."

    Here are the configurations, ordered from highest to lowest performance.

    1. OpenVINO + GPU (FP16)
    2. OpenVINO + Intel's CPU (Core i7 or Core i5 or Atom) (FP32)
    3. OpenVINO + Intel's CPU (Core i7 or Core i5 or Atom) + NCS2 (FP16)
    4. OpenVINO + armv7l + NCS2 (FP16)
    5. OpenVINO + armv7l + NCS (FP16)
    6. OpenVINO + armv7l

    The key points are:

    1. NCS2 is slower than Intel's CPUs.
    2. NCS / NCS2 shows its strength when combined with a low-performance CPU such as the one in a RaspberryPi.
    3. When combined with a high-performance CPU, the performance of NCS / NCS2 is very poor by comparison.
    4. When an Intel CPU is used, inference seems to be parallelized within the CPU across the number of cores times the number of threads.

    NCS and NCS2 are pointless unless you carefully choose the environment they are used in.

  • None of this makes any sense. How is the stick useful then?

  • PINTO, have you tried 3 Movidius NCS2 sticks?

  • @chicagobob123

    No, I have not tried it yet.
    I only started implementing MultiStick yesterday.
    However, it is not a difficult task, so I intend to commit it to GitHub within a few days.

  • I was confused after the vehicle example using the NCS2 worked so poorly compared to an old i5. I am going to try to set up an Atom CPU this week.

  • @chicagobob123

    "PINTO, have you tried 3 Movidius NCS2 sticks?"

    RaspberryPi3 + NCS2.

    NCS2 x2 ---> 15 FPS
    NCS2 x3 ---> 20 FPS
    NCS2 x4 ---> 24 FPS

    The OpenVINO API is inconvenient.
    MultiProcess cannot be used efficiently with it.
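
    To make the scaling in these numbers easier to read, here is a small sketch that computes FPS per stick and efficiency relative to linear scaling. It assumes the "about 10 FPS" single-stick figure from the opening post is a fair baseline, which is only approximate because that run was single-threaded. The function name scaling_table is made up for this illustration.

```python
# FPS figures reported in this thread for RaspberryPi3 + NCS2.
# ASSUMPTION: the 1-stick entry is the "about 10 FPS" from the opening post.
reported_fps = {1: 10.0, 2: 15.0, 3: 20.0, 4: 24.0}


def scaling_table(fps_by_sticks):
    """Return (sticks, fps, fps_per_stick, efficiency) rows.

    Efficiency is measured FPS divided by (baseline FPS * stick count),
    i.e. the fraction of perfect linear scaling that was achieved.
    """
    baseline = fps_by_sticks[1]
    rows = []
    for sticks in sorted(fps_by_sticks):
        fps = fps_by_sticks[sticks]
        rows.append((sticks, fps, fps / sticks, fps / (baseline * sticks)))
    return rows


for sticks, fps, per_stick, efficiency in scaling_table(reported_fps):
    print(f"NCS2 x{sticks}: {fps:.0f} FPS total, "
          f"{per_stick:.1f} FPS per stick, {efficiency:.0%} of linear")
```

    Under that baseline assumption, four sticks reach about 60% of linear scaling, which fits the remark that the host side, not the sticks, limits throughput.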

  • With about $300 of NCS2 sticks, there is something very wrong. The ROI is gone and the results are subpar. I don't think the sticks are useful, do you? You can buy a LattePanda at $299 and probably do better.

  • @chicagobob123

    I deliberately ran a meaningless verification.
    Actually, I knew that the ARM processor could not maximize the performance of NCS.
    And I realized that the ROI was terrible shortly after purchasing the NCS2.
    As you say, I understand that it is better to use a LattePanda Delta / Alpha.
    I just dared to show the worst benchmark so that engineers around the world will not make the wrong choice.

    I borrowed 3 of the 4 NCS2 sticks I used for this check, so the loss is small.
    I am not making products; I am just a stupid hobby programmer.

  • @chicagobob123

    "as in a doorbell video sensor, I think you're OK. But if you want to use it on a drone or anything that moves quickly, I am not sure it's useful."

    I think the same thing.

    "do you think Movidius can speed up their chip to make it a useful coprocessor? I guess I just don't understand what the bottleneck is. I like the idea of a low-cost coprocessor that can handle the inference."

    I believe that proper performance will not be obtained unless MyriadX is integrated into an SoC.
    By the way, I am interested in the following devices now.
    https://aiyprojects.withgoogle.com/edge-tpu
    https://www.arrow.com/en/products/eic-ms-vision-500/einfochips-limited
    https://www.intrinsyc.com/open-q-605-single-board-computer/

  • The first one, the TPU, at least has specs. It works with:
    MobileNet V1/V2: 224x224 max input size; 1.0 max depth multiplier
    MobileNet SSD V1/V2: 320x320 max input size; 1.0 max depth multiplier
    Inception V1/V2: 224x224 fixed input size
    Inception V3/V4: 299x299 fixed input size
    You can get all of those input sizes from any 640x480 camera. HD is not needed.

    The third item is kind of pricey; $429 seems like a lot.

  • edited January 2

    @chicagobob123

    Thank you, bob.
    I think I will try the following. The price is affordable and the performance is high.

    About $78

    • 16.8 TOPs @ 700mW
    • 24 TOPs/Watt
    • 16.8 TOPs @ 300MHz
    • A USB-type development kit is available
    https://www.gyrfalcontech.ai/solutions/2801s/
    https://ja.aliexpress.com/store/product/Orange-Pi-AI-Stick-2801-Neural-Network-Computing-Stick-Artificial-Intelligence/1553371_32954041998.html
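
    As a quick sanity check, the quoted figures are consistent with each other: 16.8 TOPs at 700 mW works out to 24 TOPs/Watt. The sketch below redoes that arithmetic and also derives the operations per clock cycle implied by 16.8 TOPs at 300 MHz. The helper names are made up for this illustration, and the numbers are the vendor figures quoted above, not measurements.

```python
def tops_per_watt(tops, milliwatts):
    """Efficiency in TOPs/Watt from a throughput in TOPs and a power draw in mW."""
    return tops / (milliwatts / 1000.0)


def ops_per_cycle(tops, mhz):
    """Operations per clock cycle implied by a throughput in TOPs and a clock in MHz."""
    return (tops * 1e12) / (mhz * 1e6)


# Vendor figures quoted in the post above (not independently measured).
print(f"{tops_per_watt(16.8, 700):.1f} TOPs/Watt")
print(f"{ops_per_cycle(16.8, 300):.0f} operations per cycle")
```

    These are peak figures only. As the NCS2 results in this thread show, delivered FPS depends heavily on the host CPU and the USB link.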

  • edited January 18

    Hi @PINTO,
    A few days ago, I got an NCS2, and when I ran a sample image-classification demo on RaspberryPi + NCS2, I got unexpectedly bad performance. Then I found your GitHub and forum discussions.
    Do you know how it is possible that the official NCS2 page reports a much larger, different number? Can you run any project to confirm the Movidius benchmark results?

  • @hamzeah

    "Do you know how it is possible that the official NCS2 page reports a much larger, different number?"

    Yes, I know.
    Intel's benchmark results were obviously not measured on an ARM processor + USB 2.0.
    As long as a RaspberryPi3 is used, you will absolutely not get 8 times the performance,
    because the load of preprocessing and post-processing is high.
    To maximize performance, it is better to use an SBC with an Intel processor than an SBC with an ARM processor.
    OpenVINO + NCS2 is optimized for Intel processors.

    "Can you run any project to confirm the Movidius benchmark results?"

    I have never seen such a benchmark.
    However, with carefully designed logic, 24 FPS can be obtained with a single NCS2 even with MobileNet-SSD + RaspberryPi3.
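
    The post does not say which logic is meant. One common way to hide preprocessing cost is to overlap it with inference, so the sketch below shows that idea only, and it is not necessarily what was used here. Everything in it is a stand-in: preprocess and infer just sleep for made-up durations, and no OpenVINO calls are involved.

```python
import queue
import threading
import time

PRE_S = 0.02    # hypothetical per-frame preprocessing cost on the host CPU
INFER_S = 0.02  # hypothetical per-frame inference cost on the accelerator


def preprocess(frame_id):
    time.sleep(PRE_S)      # stand-in for resize / color conversion
    return frame_id


def infer(blob):
    time.sleep(INFER_S)    # stand-in for a blocking inference call
    return blob * 2        # fake "detection result"


def run_sequential(n_frames):
    """Preprocess and infer one frame at a time; the costs add up."""
    return [infer(preprocess(i)) for i in range(n_frames)]


def run_pipelined(n_frames, depth=2):
    """Preprocess in a background thread while the main thread infers."""
    blobs = queue.Queue(maxsize=depth)   # bounded so the producer cannot run far ahead
    results = []

    def producer():
        for i in range(n_frames):
            blobs.put(preprocess(i))
        blobs.put(None)                  # sentinel: no more frames

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        blob = blobs.get()
        if blob is None:
            break
        results.append(infer(blob))
    worker.join()
    return results


def timed(fn, n_frames):
    start = time.perf_counter()
    out = fn(n_frames)
    return out, time.perf_counter() - start


seq_out, seq_s = timed(run_sequential, 20)
pipe_out, pipe_s = timed(run_pipelined, 20)
print(f"sequential: {20 / seq_s:.1f} FPS, pipelined: {20 / pipe_s:.1f} FPS")
```

    With equal stage costs, the pipelined loop approaches twice the sequential frame rate because the two stages run concurrently. In a real workload, CPU-bound Python preprocessing may need a separate process rather than a thread to get the same overlap.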
