Tutorial. Network that uses several types of input data

Wednesday, May 23, 2018

Posted by Yoshiyuki Kobayashi

This tutorial describes how to handle neural networks that use several types of data as inputs. This method, for example, can be used to perform classification based on multiple images or based on image and vector inputs.

 

1. Performing classification based on multiple images

First, this section explains how to estimate y from three images x, x2, and x3.

1.1 Preparing a dataset for handling multiple input image data

The Neural Network Console’s dataset CSV supports the handling of multiple types of data. To handle multiple types of data, simply prepare a dataset CSV file containing a column for each data type.

To estimate y based on three images x, x2, and x3, create a dataset CSV file as follows:

x,x2,x3,y
./x_1.png,./x2_1.png,./x3_1.png,0
./x_2.png,./x2_2.png,./x3_2.png,1
./x_3.png,./x2_3.png,./x3_3.png,2

Dataset CSV file for estimating y based on three images x, x2, and x3
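A dataset CSV file like the one above can also be generated with a short script instead of being typed by hand. The following is a minimal sketch using only the Python standard library. The function name build_image_dataset_csv is hypothetical, and the file names simply follow the ./x_1.png pattern used in this example.

```python
import csv
import io


def build_image_dataset_csv(labels):
    """Return dataset CSV text with image columns x, x2, x3 and label column y.

    Hypothetical helper: file names follow the ./x_<n>.png pattern of the
    example above and are not checked against the file system.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header row: one column per input variable, plus the label.
    writer.writerow(["x", "x2", "x3", "y"])
    # One row per sample, numbered from 1.
    for i, label in enumerate(labels, start=1):
        writer.writerow([f"./x_{i}.png", f"./x2_{i}.png", f"./x3_{i}.png", label])
    return buffer.getvalue()


csv_text = build_image_dataset_csv([0, 1, 2])
print(csv_text)
```

To save the result, write csv_text to a file with a .csv extension.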

1.2 Configuring a network for handling multiple input image data

To input multiple images, insert an input layer for each image variable. For each input layer, use the Size property to specify the image size and the Dataset property to specify the name of the variable to input (x, x2, or x3 in this example).



Inserting input layers and setting the Size and Dataset properties

Once the input layers are in place, you can configure the rest of the neural network as you like. To combine multiple inputs, you can use layers such as Concatenate (tensor concatenation), Add2 (element-wise addition of two tensors), and Mul2 (element-wise multiplication of two tensors).



Example in which intermediate outputs are combined using Concatenate and Mul2 layers

Combining inputs with the Concatenate layer requires all inputs to have the same size on every axis other than the one specified by the Axis property. Combining inputs with the Add2 or Mul2 layer requires both inputs to have the same size on every axis, except on axes where one of the inputs has size 1.
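The two size rules can be checked on paper before building a network. The sketch below is one reading of those rules written in plain Python. It is not the Neural Network Console's own implementation, and the function names concatenate_shape and elementwise_shape are hypothetical. Shapes are given as tuples such as (16, 7, 7).

```python
def concatenate_shape(shapes, axis):
    """Output shape when tensors are concatenated along `axis`.

    Every axis other than `axis` must have the same size in all inputs.
    """
    first = shapes[0]
    total = 0
    for shape in shapes:
        if len(shape) != len(first):
            raise ValueError("inputs must have the same number of axes")
        for i, (a, b) in enumerate(zip(first, shape)):
            if i != axis and a != b:
                raise ValueError(f"axis {i} differs: {a} vs {b}")
        total += shape[axis]
    result = list(first)
    result[axis] = total  # sizes add up along the concatenation axis
    return tuple(result)


def elementwise_shape(shape_a, shape_b):
    """Output shape for an Add2/Mul2-style combination of two tensors.

    Assumption: on each axis the sizes must match unless one of them is 1.
    """
    if len(shape_a) != len(shape_b):
        raise ValueError("inputs must have the same number of axes")
    result = []
    for i, (a, b) in enumerate(zip(shape_a, shape_b)):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"axis {i} differs: {a} vs {b}")
    return tuple(result)


print(concatenate_shape([(16, 7, 7), (8, 7, 7)], axis=0))
print(elementwise_shape((16, 7, 7), (16, 1, 1)))
```

In the first call the inputs differ only on axis 0, so they can be concatenated along that axis. In the second call the second input has size 1 on the last two axes, so it can be combined with the first.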

 

2. Performing classification based on image and vector inputs

Next, this section explains how to estimate y from image x and four-dimensional vector x2.

2.1 Preparing a dataset for performing classification based on image and vector inputs

To estimate y based on image x and four-dimensional vector x2, create a dataset CSV file as follows:

x,x2__0,x2__1,x2__2,x2__3,y
./x_1.png,0.0,0.1,0.2,0.3,0
./x_2.png,0.1,0.2,0.3,0.4,1
./x_3.png,0.2,0.3,0.4,0.5,2

Dataset CSV file for estimating y based on image x and four-dimensional vector x2 as inputs

To handle vectors in a dataset CSV file, use a column for each element. In this example, because x2 is a four-dimensional vector, four columns, x2__0 to x2__3, are used.
The header (x2__0 to x2__3) follows the format: variable name (x2) + double underscore + vector dimension index (0 to 3). For example, x2__1 indicates the column for dimension index 1 of variable x2.
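The header naming rule can be sketched in a few lines of Python. This is an illustration of the format described above, not the Neural Network Console's own parser, and the helper names vector_columns and count_columns_per_variable are hypothetical.

```python
def vector_columns(name, size):
    """Header names for a vector variable: name + double underscore + index."""
    return [f"{name}__{i}" for i in range(size)]


def count_columns_per_variable(header):
    """Group header columns by variable name and count the columns of each.

    A column without a double underscore counts as a single-column variable.
    """
    counts = {}
    for column in header:
        name, _, _ = column.partition("__")
        counts[name] = counts.get(name, 0) + 1
    return counts


# Header for image x, four-dimensional vector x2, and label y.
header = ["x"] + vector_columns("x2", 4) + ["y"]
print(header)
print(count_columns_per_variable(header))
```

The count for x2 matches the vector size of 4. For x, the single column holds an image file path, so the image size has to come from the image itself rather than from the column count.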

2.2 Configuring a network for performing classification based on image and vector inputs

In this example, two input layers are arranged to receive two inputs x and x2. For the first input layer, set the Dataset property to the input variable name x and the Size property to the image size. For the second input layer, set the Dataset property to the vector variable name x2 and the Size property to 4, which is the vector size.



Inserting input layers and setting the Size and Dataset properties

After arranging the input layers, configure the remaining stages of the neural network. In the following network, a convolutional neural network is configured for input x, and its output is concatenated with the output of the first Affine layer applied to x2.



Example of combining outputs using a Concatenate layer