To test running a custom TensorFlow network on the NCS, I've created a small "hello world" type example (see below). Unfortunately, the NCS gives incorrect output when running this example.
This example network aims to take an input value, multiply it by itself and add 10.
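For reference, the intended arithmetic can be sketched in plain NumPy, independent of TensorFlow and the NCS. The function name `expected_output` is hypothetical and used only for illustration; it assumes the same float16 precision and 1x1 input shape as the graph.

```python
import numpy as np

def expected_output(value):
    """Return value * value + 10, computed in float16 on a 1x1 array."""
    x = np.array([[value]], dtype=np.float16)
    # For a 1x1 matrix, multiplying the matrix by itself is the same as
    # squaring its single element, so an element-wise product is enough here.
    return x * x + np.float16(10.0)

# Print the reference values for inputs 0 to 10
for i in range(0, 11):
    print('%i, %s' % (i, expected_output(i)))
```

These values are what both the TensorFlow session and the NCS should report if the graph is compiled and executed as intended.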
The following script generates a TensorFlow graph (tf_maths.meta file) and prints the output values for inputs 0 to 10.
    import tensorflow as tf

    tf.reset_default_graph()

    # Name of the graph
    name = 'tf_maths'

    # Tensorflow graph that multiplies an input number by itself and adds 10
    x = tf.get_variable("input", shape=[1, 1], initializer=tf.zeros_initializer, dtype=tf.float16)
    x1 = tf.matmul(x, x, name='multiply')
    v1 = tf.Variable(10.0, name='variable1', dtype=tf.float16)
    v2 = tf.Variable(0.0, name='variable2', dtype=tf.float16)
    y = tf.add(x1, v1, name='add_variable')
    y2 = tf.add(y, v2, name='output')

    with tf.Session() as sess:
        # Initialise variables
        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        # Test with inputs 0 to 10
        for i in range(0, 11):
            assign_op = x.assign([[i]])
            sess.run(assign_op)
            print('%i, %s' % (i, sess.run(y2)))

        # Save the graph
        saver = tf.train.Saver(tf.global_variables())
        saver.save(sess, './' + name)
I then run the following command to compile the graph, ready for loading on to the NCS.
    mvNCCompile tf_maths.meta -in input -on output -s 12 -is 1 1
Finally, I run the following script to load the graph on to the NCS and test the output values with input values 0 to 10.
    from mvnc import mvncapi as mvnc
    import numpy as np

    # Path to the compiled graph file
    graph_path = './graph'

    # Find and open the first attached device
    devices = mvnc.EnumerateDevices()
    if len(devices) == 0:
        print('No devices found')
        quit()
    device = mvnc.Device(devices[0])
    device.OpenDevice()

    # Load graph
    with open(graph_path, mode='rb') as f:
        graphfile = f.read()

    # Allocate the graph on the device
    graph = device.AllocateGraph(graphfile)

    # Testing numbers 0 to 10
    for i in range(0, 11):
        a = np.array([[i]])
        if graph.LoadTensor(a.astype(np.float16), 'user object'):
            output, userobj = graph.GetResult()
            print('%i, %s' % (i, output))
        else:
            print("LoadTensor fail")

    print('deallocate graph')
    graph.DeallocateGraph()
    device.CloseDevice()
    print('Finished')
Unfortunately, the output values aren't correct, as shown below:
| input | expected output | NCS output |
As you can see, the NCS appears to be multiplying the input by 12.0 rather than performing the intended operation. If I create a similar TensorFlow graph that only performs matrix multiplications (no add operations), I do get the correct output.
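One way to make the "multiplying by 12" observation precise is to compare a set of outputs against both candidate formulas. The sketch below is illustrative only: `matching_hypotheses` and the candidate names are hypothetical, and the sample call uses values generated from the formula itself as a stand-in, not measured NCS output.

```python
import numpy as np

# Candidate explanations for the device output (names are illustrative)
HYPOTHESES = {
    'x * x + 10': lambda x: x * x + 10.0,
    '12 * x': lambda x: 12.0 * x,
}

def matching_hypotheses(inputs, outputs, tol=1e-2):
    """Return the names of the formulas whose predictions match outputs within tol."""
    inputs = np.asarray(inputs, dtype=np.float32)
    outputs = np.asarray(outputs, dtype=np.float32).reshape(inputs.shape)
    return [name for name, fn in HYPOTHESES.items()
            if np.allclose(fn(inputs), outputs, atol=tol)]

inputs = np.arange(0, 11)
# Stand-in data generated from the formula; replace with the values
# collected from graph.GetResult() to check real device output.
print(matching_hypotheses(inputs, 12.0 * inputs))
```

Collecting the eleven device outputs into a list and passing them as `outputs` would show whether the device output really follows 12 * x across the whole range or only near a few inputs.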
Is there possibly a bug in the compiler when interpreting a tf.add layer? I'd be very grateful if you could confirm whether my approach to running a custom TensorFlow network on the NCS is correct and, if so, what causes the problem when performing add operations.