NetMaker - neural network designer and simulator

Networks

Network structure setup (Setup button)
   - general info
   - structure setup
   - committee of networks
   - reading and writing network to the file
Training parameters setup, run options (Go button)
   - training algorithms and parameters
   - dynamic structure and training controls
   - output format and event filter
   - pre- and post-run commands
List of the training and testing sets (I/O button)
Output list (>> button)

Optimization algorithms
Activation functions
Error functions

The Network block performs all neural calculations and optimizations. It reads events from the DataSets connected through I/O, calculates the network answer (output vector), and writes it back into the source events. If output DataSets are connected through >>, new events are created according to the output mapping rules and sent to the destination DataSets (filtering may also be applied).
To add the Network block to the project, choose menu Edit - Add Component - Network.

Network structure setup (Setup button)

Options are arranged in tabs: general info, manual structure setup, and file operations.

general info

network info

Block and project names may be changed here (names should not contain "\" symbols).
Summary shows the current network setup.

structure setup

Network structure may be fixed by user or it can be adjusted automatically during the training, but even in this case initial setup of some parameters is required. Number of the network inputs and outputs is determined by the particular network application. Hidden neuron numbers and activation (transfer) functions (f_act) should be specified by the user. All neurons in the layer use the same activation function, but f_act may differ between layers. In some cases f_act of the output layer is determined by the application. Read more on available activation functions here.

network structure window

Network type - Neural network type:

BP Training is a multi-layer perceptron network with a supervised, back-propagation training pattern. It is currently implemented in the feed-forward (MLP) and recurrent (RMLP, trained with back-propagation through time) versions, both heavily optimized for speed (SSE is used if detected). A wide range of training algorithms is also available for these network types.

Cascade-Correlation is the architecture and the supervised training pattern proposed by S. Fahlman. It is an efficient model and it learns quickly, but it also has some disadvantages (see Back-Propagation vs Cascade-Correlation example).

Network index: currently selected index in the committee of networks.

inputs: Number of the network inputs (input vector length) and input layer type. Type is set to Input for feed-forward MLP networks and to Recurrent for RMLP networks.

hidden 0/1: Number of the hidden neurons and the hidden layer's activation function type. Note that the network may consist of different types of layers (and this is the recommended solution in some cases); however, some combinations may be difficult to train. Setting the size of hidden1 to 0 results in a network with a single hidden layer.

outputs: Number of the network outputs (output vector length) and output layer activation function type (may differ from the activation function of the hidden layers).

Randomize: Randomizes neuron interconnection coefficients.

Apply Changes: Creates new network with randomized coefficients if parameters have been changed.

memory: Number of events preceding the processed event that are shown to the network together with the currently processed event.
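The memory option can be pictured as a sliding window over the event sequence. The sketch below is only an illustration: the function name, the flat-list event layout, and the choice to skip events without enough history are assumptions, not NetMaker's documented behavior.

```python
def with_memory(events, memory):
    """Build network inputs that include `memory` preceding events.

    Events that do not have enough history are skipped here; how NetMaker
    treats the first events of a DataSet is not stated (assumption).
    """
    windows = []
    for t in range(memory, len(events)):
        window = []
        for past in events[t - memory:t + 1]:  # oldest first, current event last
            window.extend(past)
        windows.append(window)
    return windows


events = [[1.0], [2.0], [3.0], [4.0]]
windows = with_memory(events, memory=2)
print(windows)  # [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
```

With memory = 0 each event is shown on its own, so the input vectors are unchanged.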

Following options are enabled only if RMLP network type is selected:

rank - number of the previous network outputs, from the layer specified by loop option, that are shown to the network with the current event;

loop - marks the layer to be used as a source of the feedback loop;

clear state - if checked, feedback loop is reset at the end of processing each DataSet.

committee of networks

Committee is an ensemble of networks that are prepared for the same task, but trained independently (a different setup is allowed for each: architecture, training algorithm, etc). The output of these networks may be averaged taking into account some weight (network error, Bayesian evidence, etc). Averaged output is less biased than the output of each individual network in the ensemble. The implementation of averaging is underway; for now this tab allows setting the committee size and navigating through the networks. Error details and Bayesian Framework parameters of each network are also shown here.
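Since averaging is not yet implemented in NetMaker, the following is only a sketch of what a weighted committee average could look like. The inverse-error weighting and all names are assumptions chosen for illustration; the text mentions network error and Bayesian evidence as possible weights without fixing a formula.

```python
def committee_average(outputs, errors):
    """Average the output vectors of several networks.

    outputs: one output vector per network
    errors:  one positive error value per network
    Weights are 1/error (assumption), so networks with a lower error
    contribute more to the result.
    """
    weights = [1.0 / e for e in errors]
    total = sum(weights)
    n_out = len(outputs[0])
    return [
        sum(w * out[k] for w, out in zip(weights, outputs)) / total
        for k in range(n_out)
    ]


outputs = [[0.9], [0.7], [0.8]]  # answers of three networks for one event
errors = [0.1, 0.2, 0.2]         # their errors
avg = committee_average(outputs, errors)
print(avg)
```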

network committee window

Selected network index: Index of the network in the committee. This network is used in all other activities - structure setup, training, etc.

Committee size: Number of networks in the committee. Set button applies changes.

Randomize Selected: Randomizes interconnection coefficients of the selected network.

Randomize All: Initializes all the networks in the committee using the model of the currently selected network. All networks are randomized.

reading and writing network to the file

network file window

Put the name of the file in the text box or select it through the common Open / Save dialog window (... button). Then push Load / Save button and it's done. :)
Save Committee saves all networks from the committee to separate files; file names are based on the name typed in the text box.

Training parameters setup, run options (Go button)

training algorithms and parameters

training setup window

Training algorithm: List of available optimization algorithms.

JustRun: This special entry is present for every network type. It stands for running the network over the input DataSets without any training operations, just to calculate the network outputs.

Parameters:

List of parameters (with short description) available for the selected training method. Only common parameters are explained below. To read more on algorithms and parameters, see here.

AutoStop: If switched off, training stops at the first reasonable iteration after MaxIter iterations (that is, not while the error is still decreasing rapidly and not while the algorithm is performing an auxiliary step that only supports the main stream of calculations). If switched on, a set of conditions (stable error, stable neuron interconnections...) is checked every MaxIter iterations and the training is stopped based on the result of this check. The built-in stop conditions are quite sophisticated, with varying priorities for different stages of the training, and they cannot be modified by the user. The sensitivity of the auto-stop algorithm can be modified indirectly through the MaxIter parameter: a larger MaxIter makes the training longer and the estimate of the error function minimum more precise.

MaxIter: Number of the training iterations / stop conditions checking interval.

MinError: Satisfying error level. If reached, training stops. Rarely used and usually set to its default value (10^-6).

MaxError: Maximum acceptable error level. If it is not reached when the training stops, the network is randomized and the whole training is repeated (currently a maximum of 3 training attempts is hardcoded). Value 0 turns off this option.
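The MaxError behavior amounts to a bounded retry loop. The sketch below uses hypothetical train and randomize callables as stand-ins for the real training run and weight randomization; whether NetMaker compares with "greater than" or "greater than or equal" is an assumption.

```python
def train_with_retries(train, randomize, max_error, max_attempts=3):
    """Repeat training until the error is acceptable or attempts run out.

    train():     hypothetical callable, runs one full training, returns the error
    randomize(): hypothetical callable, re-initializes the network weights
    max_error:   0 disables the check, as described for MaxError
    """
    error = train()
    attempts = 1
    while max_error > 0 and error > max_error and attempts < max_attempts:
        randomize()
        error = train()
        attempts += 1
    return error, attempts


# Toy stand-ins: three trainings ending at errors 0.30, 0.12 and 0.04.
results = iter([0.30, 0.12, 0.04])
log = []
error, attempts = train_with_retries(
    lambda: next(results), lambda: log.append("randomize"), max_error=0.05
)
print(error, attempts, log)  # 0.04 3 ['randomize', 'randomize']
```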

WeightDecay: The regularization factor α for the weight decay term of the error function. A higher value of this factor makes the network output smoother and more resistant to the statistical fluctuations of the training data. Regularization is explained in more detail in the example of function approximation. It is possible to include (or exclude) neuron biases in regularization with the EnableBiasDecay option (not all algorithms support this option).
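As an illustration of how a weight decay term enters the error function, the sketch below uses the common form E = E_data + (α/2) Σ w². This exact form, the factor 1/2, and the function and parameter names are assumptions; the text above only states that α scales a weight decay term and that biases can optionally be included.

```python
def regularized_error(data_error, weights, biases, alpha, enable_bias_decay=False):
    """Add an (alpha / 2) * sum(w^2) penalty to the data error.

    Biases are penalized only when enable_bias_decay is True, mirroring
    the EnableBiasDecay option (the formula itself is an assumption).
    """
    penalized = list(weights)
    if enable_bias_decay:
        penalized += list(biases)
    return data_error + 0.5 * alpha * sum(w * w for w in penalized)


weights = [1.0, -2.0]
biases = [0.5]
e_plain = regularized_error(0.10, weights, biases, alpha=0.01)
e_bias = regularized_error(0.10, weights, biases, alpha=0.01, enable_bias_decay=True)
print(e_plain, e_bias)
```

Large weights increase the penalty, so minimizing this error favors smaller weights and therefore a smoother network response.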

Sample weight: A weight assigned to each event in the DataSet connected to the network. This weight is used to modify error and gradient values calculated for each event. Sample weight expression may use variables available with the event (so any input, target, error-bar, and not-used vector elements can be accessed with i1, t1, b1, n1, ... codes). If event uncertainties are available, then the χ² error function may be realized with Sample weight set to 1/(b1*b1) and the MSE error function, where b1 is assumed to be the event uncertainty value.
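The equivalence between χ² and a squared error weighted by 1/(b1*b1) can be checked numerically. The sketch below is a plain Python illustration with made-up numbers; it ignores any constant normalization (such as dividing by the number of events) that the MSE implementation may apply.

```python
def weighted_squared_error(outputs, targets, weights):
    """Sum of squared errors, each multiplied by its sample weight."""
    return sum(w * (o - t) ** 2 for o, t, w in zip(outputs, targets, weights))


def chi_square(outputs, targets, sigmas):
    """Chi-square: squared residuals measured in units of the uncertainty."""
    return sum(((o - t) / s) ** 2 for o, t, s in zip(outputs, targets, sigmas))


outputs = [1.0, 2.5, 2.0]   # network answers
targets = [1.5, 2.0, 2.0]   # t1 of each event
b1 = [0.5, 0.25, 1.0]       # event uncertainties (error bars)

weights = [1.0 / (b * b) for b in b1]  # Sample weight = 1/(b1*b1)
wse = weighted_squared_error(outputs, targets, weights)
chi2 = chi_square(outputs, targets, b1)
print(wse, chi2)  # 5.0 5.0
```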

RMLP networks:

TeacherForcing: Mixing factor (f) for the recurrent input calculation inp_t = out_(t-n) ⋅ (1 - f) + tgt_(t-n) ⋅ f, n = 1...rank, where out_(t-n) is the network output at iteration t-n and tgt_(t-n) is the target value for this output. If f = 1, the recurrent input is replaced completely by the desired target value; if f = 0, teacher forcing is disabled.
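The mixing formula above can be written directly as a small function. The function and variable names are illustrative only; the lists hold the previous outputs and targets for n = 1...rank.

```python
def recurrent_input(prev_outputs, prev_targets, f):
    """Mix previous network outputs with their targets.

    Implements inp = out * (1 - f) + tgt * f element by element,
    for the outputs and targets at iterations t-1 ... t-rank.
    """
    return [out * (1.0 - f) + tgt * f for out, tgt in zip(prev_outputs, prev_targets)]


prev_outputs = [0.2, 0.4]  # network outputs at t-1 and t-2 (rank = 2)
prev_targets = [1.0, 0.0]  # corresponding target values

mixed = recurrent_input(prev_outputs, prev_targets, f=0.5)
forced = recurrent_input(prev_outputs, prev_targets, f=1.0)   # targets only
free = recurrent_input(prev_outputs, prev_targets, f=0.0)     # outputs only
print(mixed, forced, free)
```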

ForcingDecay: Decay factor δ for the teacher forcing factor; if the network error decreases, f is recalculated at each iteration as f_t = δ ⋅ f_(t-1); if δ = 1, the teacher forcing factor is constant; δ = 0 is not allowed.
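The decay rule can be traced over a sequence of training errors. This is a sketch under the assumption that "the network error decreases" means the error is lower than in the previous iteration; the function name and the rejection of non-positive δ are illustrative choices.

```python
def forcing_schedule(f0, delta, errors):
    """Return the teacher forcing factor after each iteration.

    f is multiplied by delta whenever the error dropped compared with the
    previous iteration (assumption about what "decreases" means).
    """
    if delta <= 0:
        raise ValueError("delta = 0 is not allowed")
    f = f0
    history = [f]
    for prev, cur in zip(errors, errors[1:]):
        if cur < prev:
            f *= delta
        history.append(f)
    return history


errors = [1.0, 0.8, 0.9, 0.5]
schedule = forcing_schedule(1.0, 0.5, errors)
print(schedule)  # [1.0, 0.5, 0.5, 0.25]
```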

dynamic structure and training controls

Dynamic Structure:

Add new neurons: enables growth of the network structure.
pool: Number of the random candidates - each time the network size is increased the pool of new randomized neurons is trained and the best one is selected to extend the network (but only if it seems to be useful to the whole network);
neuron split: Hidden neuron with the highest variation of the output error is divided into two slightly differentiated neurons; the pair of new neurons is then pre-trained; these neurons are built into the network if they give better results than random candidates; this feature is now always turned on.
max neurons, max weights: Limits on the maximum number of neurons and network interconnection coefficients. Training stops when any of these limits is reached. Value 0 turns off the limit check.

Remove neurons: enables pruning of the unused or redundant neurons.
twins: measure of the difference between activations of two neurons that will be replaced with one equivalent neuron;
dead: measure of significance of the neurons that may be safely removed;
const: measure of neuron activation σ/μ that will be used to consider a neuron as constant (the bias in the following layer is modified when this type of neuron is removed).
Increasing these values results in more neurons being removed.
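The const criterion can be illustrated with a short check of σ/μ over a neuron's activations. This is a sketch only: the comparison direction, the handling of μ = 0, and folding weight ⋅ μ into the next layer's bias are assumptions consistent with the description above, not NetMaker's actual code.

```python
from statistics import mean, pstdev


def is_constant(activations, const_threshold):
    """Treat a neuron as constant when sigma/|mu| is below the threshold."""
    mu = mean(activations)
    if mu == 0:
        return False  # ratio undefined; handled conservatively (assumption)
    return pstdev(activations) / abs(mu) < const_threshold


def fold_into_bias(bias, weight, activations):
    """Replace a constant neuron by adding weight * mean activation to the
    bias of a neuron in the following layer (assumed form of the bias fix)."""
    return bias + weight * mean(activations)


steady = [0.98, 0.99, 1.00, 0.99]   # activations over the training events
varying = [0.1, 0.9, 0.2, 0.8]

print(is_constant(steady, 0.05))    # True
print(is_constant(varying, 0.05))   # False
new_bias = fold_into_bias(0.1, 2.0, steady)
print(new_bias)
```

A larger threshold classifies more neurons as constant, which matches the note that increasing the values removes more neurons.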

OBS (Optimal Brain Surgeon): technique of eliminating single connections from the network; if OBS option is enabled, algorithm is applied at the end of the training process; if iterative option is also checked, algorithm is applied before each attempt of neurons pruning (be aware, this is time and memory consuming technique - O(N³ ∙ M), where N is the number of the network coefficients and M is the training set size); Lq is the measure of the allowed error increase caused by removing the connection; you can read more about the algorithm here (this is not a simple cutting of connections).

Control Thread:

Run committee: runs all the networks in the committee.

Save iteration info: enables storing information about the network training/testing in each iteration. Required for Network Error plots.

Progress window: enables control window with training statistics.

Hold: if checked, training will pause before and after each structure modification.

Save phases: if checked, network model will be stored in files before and after each structure modification.

Shuffle data: enables randomized reordering of the training vectors during the training.

Trigger Source: If Manual mode is selected, the Go! button is enabled and it starts the calculations. If Internal mode is chosen, the Source button is enabled (and the Go! button is disabled); it allows selecting the source of the trigger through the common Connection Add dialog window. Processing starts when one of the selected sources finishes its own task.

output format and event filter

output format window

When the network finishes processing, the contents of the input DataSets are sent to the output DataSets according to the output mapping specifications. Destination events in each output DataSet may be composed in a different way. Select the DataSet in the Output Data Set list and put the desired expressions in the Mapping table for destination event vectors.
The simple example in the image above performs the following mapping:

destination event input vector will be composed of the first and second element of the source event input vector;

destination event output vector will be created with the length = 1 and it will contain first element of the network answer vector (output vector);

destination event target and not-used vector are not allocated.

Events stored in the output DataSets can be filtered (in the example above only events with the t1==0.95F will be sent to the "signal" DataSet).
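The mapping and filter from the example can be mimicked in a few lines of Python. The dictionary layout of an event and the function names are hypothetical; they only restate the three mapping rules and the t1==0.95 filter described above.

```python
def map_event(src):
    """Compose a destination event from a source event."""
    return {
        "input": [src["input"][0], src["input"][1]],  # first and second input element
        "output": [src["output"][0]],                 # first element of the network answer
        "target": None,                               # not allocated
        "not_used": None,                             # not allocated
    }


def passes_filter(src):
    """Keep only events whose first target element equals 0.95."""
    return src["target"][0] == 0.95


source_events = [
    {"input": [0.1, 0.2, 0.3], "output": [0.91], "target": [0.95]},
    {"input": [0.4, 0.5, 0.6], "output": [0.07], "target": [0.05]},
]

signal = [map_event(e) for e in source_events if passes_filter(e)]
print(signal)
```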

pre- and post-run commands

commands window

Initialization commands are executed before starting the network calculations. Postprocessing commands are executed when the network finishes calculations and sends data to the output sets.

Available commands:

Randomize() - randomizes neuron interconnection weights.

ReadCoeffs(filename) - reads the network structure from the file.

SaveCoeffs([filename[,y|n]]) - writes the network structure to the file; if no filename is specified, the name from the Setup window is used; if no filename is specified there either, the Network block name is used; the y option forces overwriting the existing file, default is n.

SaveData(dataset,[filename[,ascii|bin]]) - saves the contents of the DataSet connected to the network (input or output); dataset is the name of the DataSet block; filename is the name of the destination file, if not specified the name from Setup - Save Data tab is used, if filename is not specified there the block name is used; ascii / bin option selects the file type (default value is also taken from Setup - Save tab); format of data written to the file is the same as specified in Setup - Save Data tab.

RemoveSampleUpTo(dataset,nlast) - removes all events from the dataset except nlast events.

ClearData(dataset1[,dataset2,...]) - removes all events from specified list of DataSets.

TriggChainBreak(counter) - stops triggering connected Transform / Network blocks after counter triggers. Useful for defining loops in time-sequence prediction.

ClosePrj() - stops all activities and closes current project.

CloseApp() - stops all activities and closes NetMaker window.

Input list (I/O button)

I/O button opens the list of the training and testing DataSets (a slightly extended version of the common Connection Add dialog window). Double-clicking an item in the Available list adds a new training set. Use the lower << button to add a new item to the Testing sets list. Double-clicking an item in the Training Sets or Testing Sets list removes the connection.

i/o window

When both training and testing sets are present during the training process, training sets are used to calculate neuron interconnection weights changes, then the error on the testing sets is calculated. If the error on the testing sets seems to increase constantly, training is stopped. This is a good way to avoid overtraining.
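The stopping rule based on the testing-set error can be sketched as follows. The patience parameter (how many consecutive increases count as "increasing constantly") is an assumption; the actual criterion used by NetMaker is not specified here.

```python
def stop_iteration(test_errors, patience=3):
    """Return the iteration at which training would stop, or None.

    Training stops once the testing-set error has risen `patience`
    times in a row (assumed meaning of "increases constantly").
    """
    rising = 0
    for i in range(1, len(test_errors)):
        if test_errors[i] > test_errors[i - 1]:
            rising += 1
            if rising >= patience:
                return i
        else:
            rising = 0
    return None


test_errors = [0.50, 0.40, 0.35, 0.36, 0.34, 0.35, 0.37, 0.40, 0.45]
stop_at = stop_iteration(test_errors)
print(stop_at)  # 7
```

A single small rise (iteration 3 in the example) does not stop the training; only the sustained rise from iteration 5 onward does.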

Output list (>> button)

Connects the output of the Network block to the destination DataSets. Opens common Connection Add dialog window.