After ~19 hours Movidius stick stops working

I am using the Compute Stick in a docker container (using --privileged) and it works great -- for a while.

~19 hours later, my code just hangs, and I notice in dmesg that the stick has disconnected and been re-detected as a different device. Is this a known issue, or is there any suggestion for how I can detect this in Python? If I restart the docker container, it works again.

OS:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.2 LTS
Release: 18.04
Codename: bionic

NCSDK: 2.10.01.01

dmesg:

[792057.755155] usb 2-1.4: USB disconnect, device number 14
[792057.976817] usb 2-1.4: new high-speed USB device number 15 using ehci-pci
[792058.086670] usb 2-1.4: New USB device found, idVendor=03e7, idProduct=f63b
[792058.086675] usb 2-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[792058.086677] usb 2-1.4: Product: VSC Loopback Device
[792058.086680] usb 2-1.4: Manufacturer: Intel Corporation
[792058.086682] usb 2-1.4: SerialNumber: A1F61DD5645C08

** here is where I think it stops working **

[863425.212527] usb 2-1.4: USB disconnect, device number 15
[863425.434671] usb 2-1.4: new high-speed USB device number 16 using ehci-pci
[863425.543518] usb 2-1.4: New USB device found, idVendor=03e7, idProduct=2150
[863425.543520] usb 2-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[863425.543522] usb 2-1.4: Product: Movidius MA2X5X
[863425.543523] usb 2-1.4: Manufacturer: Movidius Ltd.
[863425.543524] usb 2-1.4: SerialNumber: 03e72150

Comments

  • Hi @ksaye

    Thank you for reaching out. The Neural Compute Stick (NCS) is expected to be detected as two different devices, as seen above. After the model is loaded to the device, it reboots and is detected as a loopback device. Could you share additional information about your setup? Are you connecting the NCS directly to a port on the computer or through a USB hub? I have seen issues when running the NCS for long hours without a powered USB hub.

    Also, which network and model are you using? I can try to reproduce your issue with one of our sample apps.

    Regards,
    Jesus

  • Jesus, thanks for the reply. The reproduction is SOOOOO easy for you. I blogged on it here: https://kevinsaye.wordpress.com/2019/02/23/up-and-running-with-a-movidius-container-in-just-minutes-on-linux/

    Basically on an Ubuntu 18.04 host, plug the NCS in and run the following command: docker run --net=host --privileged -v /dev:/dev --name movidiusflask ksaye/movidiusflask:0.0.1-amd64 /bin/bash -c "cd /opt/inception_v3 && python3 ./main.py"

    Then run the following command, replacing the IP address and 2.jpg with a real address and file: curl -H "Content-type: application/octet-stream" -X POST --data-binary @2.jpg "http://192.168.15.200:88/image"

    Once you get a response, wait a day or so and rerun the curl command; you will see that it hangs.

  • Hi @ksaye

    I started the docker container as you instructed and can curl successfully. I will let you know how it goes in a couple of days.

    Regards,
    Jesus

  • Hi @ksaye

    I saw the curl command hang as you mentioned with your docker container. If I kill the curl command (ctrl + c) and run it again I will get "Error processing image". Running the curl command after that will continue to return the results without any issues. However, after 19ish hours it will hang again. Are you seeing this too?

    I am currently running a new docker container with the Neural Compute SDK v2.08 to see if the behavior is the same on that release. So far it has not hung.

    Regards,
    Jesus

  • Yes, after many hours (I have only measured it at more than 19) it does hang. I know flask is still working because if you perform a GET on http://ioaddress:port/ you get a help message. Also, if you look at your dmesg, you should see the detection difference discussed above.

    I am interested to see what 2.08 shows. I am running 2.10.01.01, but it never hurts to try a different version.

  • Hi @ksaye

    I saw the same results with v2.08. It looks like the device is resetting, but I am still looking into it. How often do you execute the curl command to request an inference? Opening and closing the device at each request may be an option if inference is not requested too often; otherwise it would not be ideal.

    I'm running another test now ...

    Regards,
    Jesus

  • @Jesus_at_Intel , thanks for verifying what I was seeing. I am not sure if it is resetting because the graph has been loaded for such a long time or because of a lack of inferencing. I have the docker container always running with the model always loaded. I only run inference every so often, but when I do, I want it to be fast.

    I think the issue may be around line 97: https://github.com/ksaye/IoTDemonstrations/blob/a7adf1f2cbb876915b36ca59806c98fe84b0e20a/movidiusflask/main.py#L97 where, based on your sample code, I run the inference and then read the result.

    Is there some sample code that can set a timeout and capture the error so I could reload the device?
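
One way to get the kind of timeout asked about here is to run the blocking call in a daemon thread and stop waiting after a deadline. This is only a minimal sketch, not NCSDK sample code: `infer` and `reopen` are hypothetical callables standing in for the inference-and-read step in main.py and for whatever closes the device, reopens it, and reloads the graph. Python cannot kill a blocked thread, so a hung call is abandoned rather than cancelled.

```python
import threading


class InferenceTimeout(Exception):
    """Raised when the wrapped call does not finish before the deadline."""


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread and wait at most timeout_s."""
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception as exc:  # hand the error back to the caller's thread
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The call is still blocked; leave the daemon thread behind.
        raise InferenceTimeout("no result after %.1f s" % timeout_s)
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def infer_with_recovery(infer, reopen, image, timeout_s=10.0):
    """Try infer(image); on a timeout or error, call reopen() and retry once.

    infer and reopen are hypothetical callables supplied by the application.
    """
    try:
        return call_with_timeout(infer, timeout_s, image)
    except Exception:
        reopen()
        return call_with_timeout(infer, timeout_s, image)
```

Whether the abandoned call ever returns once the device has been reopened is not something this thread establishes, so the sketch assumes nothing about it and simply starts a fresh call after `reopen()`.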

  • @Jesus_at_Intel an update. If I run the following bash script, which keeps the NCS active, I do not see the disconnect at ~19 hours. This tells me that a lack of activity seems to be causing the problem. I have run the script for 2 days and the NCS is performing just fine.

    while [ 1 -eq 1 ]
    do
    curl -H "Content-type: application/octet-stream" -X POST --data-binary @2.jpg "http://192.168.15.200:88/image"
    sleep 15
    done
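
The same keep-alive could also be done from Python, for example in a background thread next to the Flask app. The sketch below is an assumption-laden equivalent of the bash loop, not code from the container: `post_image` and `start_keepalive` are hypothetical helper names, and the URL and file name in the usage comment are the ones from the curl command. The request carries its own timeout so that a hung device does not also hang the keep-alive thread.

```python
import threading
import urllib.request


def post_image(url, image_path, timeout_s=30):
    """POST the raw bytes of image_path to url, like the curl command does."""
    with open(image_path, "rb") as f:
        data = f.read()
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read()


def start_keepalive(action, interval_s, stop_event=None):
    """Call action() every interval_s seconds in a daemon thread.

    Returns (stop_event, thread); call stop_event.set() to end the loop.
    """
    if stop_event is None:
        stop_event = threading.Event()

    def loop():
        while not stop_event.is_set():
            try:
                action()
            except Exception as exc:  # keep looping even if one request fails
                print("keep-alive request failed:", exc)
            stop_event.wait(interval_s)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop_event, thread


# Usage, mirroring the bash loop (15 second interval):
# stop, _ = start_keepalive(
#     lambda: post_image("http://192.168.15.200:88/image", "2.jpg"),
#     interval_s=15,
# )
```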

  • Hi @ksaye

    I also ran a similar test which sent an image every 30 minutes and ran for 2 days without issues. There must be a service monitoring the status of the device. If the device is inactive (so that it appears to be hung), the service will reset it. Hope this was helpful! Let me know if you have additional questions.

    Regards,
    Jesus

  • @Jesus_at_Intel , thanks for confirming. Any sample Python code that would catch that issue?
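
Going by the dmesg output in the original post, the stick shows up as 03e7:f63b (VSC Loopback Device) while it is working and drops back to 03e7:2150 (Movidius MA2X5X) after the reset. A possible way to notice that from Python is to read the USB IDs from sysfs before each inference. This is a sketch under assumptions: it presumes the container can read /sys/bus/usb/devices, `movidius_states` is a hypothetical helper name, and the two product IDs are taken only from the log above.

```python
import os

MOVIDIUS_VENDOR = "03e7"
UNBOOTED_PRODUCT = "2150"  # "Movidius MA2X5X" in the dmesg output
BOOTED_PRODUCT = "f63b"    # "VSC Loopback Device" in the dmesg output


def _read(path):
    """Return the stripped, lower-cased file contents, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip().lower()
    except OSError:
        return None


def movidius_states(sysfs_root="/sys/bus/usb/devices"):
    """Map each Movidius USB device name (e.g. "2-1.4") to its apparent state.

    The state is "booted", "unbooted", or "unknown" for any other product ID.
    """
    states = {}
    try:
        entries = sorted(os.listdir(sysfs_root))
    except OSError:
        return states
    for name in entries:
        base = os.path.join(sysfs_root, name)
        # Interface entries such as "2-1.4:1.0" have no idVendor file.
        if _read(os.path.join(base, "idVendor")) != MOVIDIUS_VENDOR:
            continue
        product = _read(os.path.join(base, "idProduct"))
        if product == BOOTED_PRODUCT:
            states[name] = "booted"
        elif product == UNBOOTED_PRODUCT:
            states[name] = "unbooted"
        else:
            states[name] = "unknown"
    return states
```

If a graph is supposed to be loaded and the result contains no "booted" entry, the application could treat that as the reset described in this thread and reopen the device before queuing the next inference.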
