


mvNCCheck error

I tried to run mvNCCheck on my model (mvNCCheck tf_sigtype.pb -on=activation_4/Softmax), but I got this error:

...
USB: Myriad Execution Finished
Output is in Channel Minor format
USB: Myriad Connection Closing.
E: [ 0] dispatcherEventReceive:247 dispatcherEventReceive() Read failed -2

E: [ 0] eventReader:268 Failed to receive event, the device may have reset

E: [ 0] checkGraphMonitorResponse:922 XLink error
W: [ 0] ncFifoDestroy:2545 myriad NACK

Traceback (most recent call last):
  File "/usr/local/bin/mvNCCheck", line 239, in
    quit_code = check_net(args.network, args.image, args.inputnode, args.outputnode, args.nshaves, args.inputsize, args.weights, args)
  File "/usr/local/bin/mvNCCheck", line 213, in check_net
    timings, myriad_output = run_myriad(graph_file, args)
  File "/usr/local/bin/ncsdk/Controllers/MiscIO.py", line 334, in run_myriad
    fifoOut.destroy()
  File "/usr/local/lib/python3.5/dist-packages/mvnc/mvncapi.py", line 298, in destroy
    raise Exception(Status(status))
Exception: Status.ERROR

I'm new to this, and I tried to find a solution or explanation on the internet and on this forum, but with no result.
Can someone tell me what I'm doing wrong, or what I can do to understand what the problem is?

In case it's useful, this is the list of my layers. Everything seems fine up to max_pooling2d_2/MaxPool, but then at flatten_1 (Shape, strided_slice, etc.) something goes wrong (IndexError: list index out of range):

[u'conv2d_1_input',
u'conv2d_1/kernel',
u'conv2d_1/kernel/read',
u'conv2d_1/bias',
u'conv2d_1/bias/read',
u'conv2d_1/convolution',
u'conv2d_1/BiasAdd',
u'activation_1/Relu',
u'max_pooling2d_1/MaxPool',
u'conv2d_2/kernel',
u'conv2d_2/kernel/read',
u'conv2d_2/bias',
u'conv2d_2/bias/read',
u'conv2d_2/convolution',
u'conv2d_2/BiasAdd',
u'activation_2/Relu',
u'max_pooling2d_2/MaxPool',
u'flatten_1/Shape',
u'flatten_1/strided_slice/stack',
u'flatten_1/strided_slice/stack_1',
u'flatten_1/strided_slice/stack_2',
u'flatten_1/strided_slice',
u'flatten_1/Const',
u'flatten_1/Prod',
u'flatten_1/stack/0',
u'flatten_1/stack',
u'flatten_1/Reshape',
u'dense_1/kernel',
u'dense_1/kernel/read',
u'dense_1/bias',
u'dense_1/bias/read',
u'dense_1/MatMul',
u'dense_1/BiasAdd',
u'activation_3/Relu',
u'dense_2/kernel',
u'dense_2/kernel/read',
u'dense_2/bias',
u'dense_2/bias/read',
u'dense_2/MatMul',
u'dense_2/BiasAdd',
u'activation_4/Softmax']

I am also seeing this warning:

/usr/lib/python3.5/importlib/_bootstrap.py:222: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)

I really hope someone can help me, thanks in advance.

Comments

  • Hi @s.fratus005,

    Can you provide more details about your setup and system, please? Are you using the NCS1 or NCS2? Are you compiling a custom model? Which version of the NCSDK are you using?

    Best Regards,
    Sahira

  • I'm using Ubuntu 16.04, an NCS1, and NCSDK version 2. The model is a custom model built with Keras and later converted to TensorFlow.

  • Hi @s.fratus005

    There can be issues when using a custom model trained in Keras and then converted to TensorFlow. Can you provide your Keras model? I can try to convert it to TensorFlow and then compile it to see if I get the same error.

    Best Regards,
    Sahira

  • Hi,

    We ran some more tests and discovered the following:
    the model seems to work correctly when compiled up to the last layer before the flatten. At least, if we run mvNCCheck up to max_pooling2d_2/MaxPool, the model passes all the tests.
    From the beginning of what is the Keras flatten layer (which is translated into several ops in TensorFlow), things start to go wrong for some reason. The command does not report any unsupported layers, so we assume they are all supported.
    We also tried to substitute the flatten layer with a pure TensorFlow reshape layer (by modifying how Keras creates that layer), but we got exactly the same behaviour.
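    (For context: a Keras Flatten can in principle be replaced by a reshape with a statically precomputed size, which avoids the dynamic Shape/strided_slice/Prod ops entirely. The flattened length is just the product of the feature-map dimensions. The dimensions below are hypothetical, not read from this model.)

```python
from functools import reduce
from operator import mul

def flat_size(feature_map_shape):
    """Product of the feature-map dimensions (batch dimension excluded)."""
    return reduce(mul, feature_map_shape, 1)

# Hypothetical shape after max_pooling2d_2/MaxPool: 7x7 spatial, 64 channels.
size = flat_size((7, 7, 64))
print(size)  # 3136
# A static reshape could then target [-1, 3136] directly, instead of
# computing the size at runtime via Shape/strided_slice/Prod.
```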

    In the link below we provide the original Keras model, the TensorFlow counterpart in .pb format, and the .graph file that we compiled with mvNCCompile. The .pb file was frozen using the following Python script:


    import tensorflow as tf

    def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
        """
        Freezes the state of a session into a pruned computation graph.

        Creates a new computation graph where variable nodes are replaced by
        constants taking their current value in the session. The new graph will
        be pruned so subgraphs that are not necessary to compute the requested
        outputs are removed.
        @param session The TensorFlow session to be frozen.
        @param keep_var_names A list of variable names that should not be frozen,
            or None to freeze all the variables in the graph.
        @param output_names Names of the relevant graph outputs.
        @param clear_devices Remove the device directives from the graph for better portability.
        @return The frozen graph definition.
        """
        graph = session.graph
        with graph.as_default():
            freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
            output_names = output_names or []
            output_names += [v.op.name for v in tf.global_variables()]
            input_graph_def = graph.as_graph_def()
            if clear_devices:
                for node in input_graph_def.node:
                    node.device = ""
            frozen_graph = tf.graph_util.convert_variables_to_constants(
                session, input_graph_def, output_names, freeze_var_names)
            return frozen_graph


    We found this script online.

    We also removed all dropouts since they are not supported (and also not used in prediction).

    The keras model can be opened with:
    keras.models.load_model(path)

    We used the extension .model only for convenience.

    We also provide two screenshots about the problems we faced.

    Other details that may be important:
    - We are using the first version of the Neural Compute Stick (Movidius NCS1).
    - We are running NCSDK2 inside a Docker container, because we have a different version of Ubuntu (18.10 instead of 16.04) and the SDK did not compile on it.

    Thanks in advance for your support.

    Link: https://drive.google.com/drive/folders/19U4TjgWNTBVsdYrhIujjcL8r7mUxGJ7l?usp=sharing

  • Hi @Sahira_at_Intel ,
    sorry to bother you, but did you try to convert our model from Keras to TensorFlow? :)
    Best regards.

  • Hi @s.fratus005
    Not a bother at all - I'm working on this and will let you know what I find! Thanks so much for your patience.

    Best Regards,
    Sahira

  • Thank you so much for your help and kindness :smiley:

  • Hi @s.fratus005

    This might even be a USB issue. Are you using a USB hub, or are you plugging the stick directly into your system? The XLink error is usually due to communication issues between the NCS and the host, since it does look like you converted to an NCS graph file successfully. If you're using a hub, try plugging directly into your computer instead and try again.

    Best Regards,
    Sahira

  • Hi @Sahira_at_Intel ,
    actually we are plugging it directly into the computer :/
    Best regards.

  • Hi @s.fratus005

    I was able to reproduce this issue. Can you provide the Keras model and the steps you took to convert the model to TensorFlow and then to an NCS graph?
    Thanks so much for your patience!

    Best Regards,
    Sahira

  • Hi @Sahira_at_Intel
    opening this link: https://drive.google.com/drive/folders/19U4TjgWNTBVsdYrhIujjcL8r7mUxGJ7l you can find the model (sigtype_ep102_vl0.919548961424_va0.735483873275.model). I have now also added the script "keras_to_TF.py" used to convert from Keras to TensorFlow. We then used the mvNCCompile command to generate the graph from the TensorFlow file.
    Best regards.

  • Hi @s.fratus005
    Thanks so much for your patience. I still have not been able to find a fix for this. I converted your model on my end and am still getting the same errors. It looks like there must be something happening on the firmware side during the deallocate buffer command. But I have escalated this issue to the NCS Engineering team and will let you know as soon as I hear back.

    Best Regards,
    Sahira

  • Thank you so much,
    best regards.

  • I am using a Keras model and then converting it to a TensorFlow graph, which I of course use later. I get the following:

    E: [         0] dispatcherEventReceive:247  dispatcherEventReceive() Read failed -2
    
    E: [         0] eventReader:268 Failed to receive event, the device may have reset
    
    

    However, the predictions are not affected by this.

  • Hi @Sayak_Paul,
    we didn't understand which model you used in your previous post. However, we are providing a sample image, already rescaled, for which the output of the network should be:
    [3.5754219e-03 9.0594798e-01 2.7515322e-03 1.1193451e-04 4.8057209e-03 4.2361184e-03 3.6002832e-04 7.8211337e-02]
    Image (pickle): https://drive.google.com/file/d/1kx5kIutDXnNdWSkMaD8kQVKcNSVSXwLJ/view?usp=sharing
    Model (keras): https://drive.google.com/file/d/1oXXgCtmJexB8d55ZMVBf3Eg-nU05mylC/view?usp=sharing
    Model (pb): https://drive.google.com/file/d/1UrwK97a2YFmY8SZX0QThDKLNnLrsC6fU/view?usp=sharing
    Do you obtain the same results?

    Thank you.
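    (For anyone re-running this sample: a minimal pure-Python sketch for checking an NCS result against the reference vector above. The tolerance value is an assumption; the NCS1 computes in FP16, so exact equality with FP32 reference values is not expected.)

```python
def outputs_match(actual, expected, tol=1e-2):
    """Element-wise comparison with an absolute tolerance (FP16 results
    from the stick will not match FP32 reference values exactly)."""
    return len(actual) == len(expected) and all(
        abs(a - e) <= tol for a, e in zip(actual, expected))

# Reference softmax output quoted above.
expected = [3.5754219e-03, 9.0594798e-01, 2.7515322e-03, 1.1193451e-04,
            4.8057209e-03, 4.2361184e-03, 3.6002832e-04, 7.8211337e-02]

# A softmax output should also sum to ~1, which is a quick sanity check:
print(abs(sum(expected) - 1.0) < 1e-3)  # True
```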

  • Hi @s.fratus005
    I apologize for the delayed response - I had escalated this issue to Engineering for further review.
    We checked the firmware and API and found no issues there, but when the firmware makes a call to the CNN code to run inference, it never returns a result, and the watchdog kills/disconnects the device after some time, hence the XLink error. This may well be a bug in the CNN code, and I don't think there is a workaround for it.
    Please let me know if you have any further questions.

    Best Regards,
    Sahira

  • Ok, thank you very much for the answers and your kindness,

    best regards.

  • @Sahira_at_Intel can I ask if the same problem would occur even with the NCS2?
