What Is a Neural Network?

The term neural network applies to a loosely related family of models, characterized by a large parameter space and flexible structure, descending from studies of brain functioning. As the family grew, most of the new models were designed for nonbiological applications, though much of the associated terminology reflects its origin.

Specific definitions of neural networks are as varied as the fields in which they are used. While no single definition properly covers the entire family of models, for now, consider the following description ¹:

A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

Knowledge is acquired by the network through a learning process.
Interneuron connection strengths known as synaptic weights are used to store the knowledge.

For a discussion of why this definition is perhaps too restrictive, see ².

In order to differentiate neural networks from traditional statistical methods using this definition, what is not said is just as significant as the actual text of the definition. For example, the traditional linear regression model can acquire knowledge through the least-squares method and store that knowledge in the regression coefficients. In this sense, it is a neural network. In fact, you can argue that linear regression is a special case of certain neural networks. However, linear regression has a rigid model structure and set of assumptions that are imposed before learning from the data.

By contrast, the definition above makes minimal demands on model structure and assumptions. Thus, a neural network can approximate a wide range of statistical models without requiring that you hypothesize in advance certain relationships between the dependent and independent variables. Instead, the form of the relationships is determined during the learning process. If a linear relationship between the dependent and independent variables is appropriate, the results of the neural network should closely approximate those of the linear regression model. If a nonlinear relationship is more appropriate, the neural network will automatically approximate the "correct" model structure.

The trade-off for this flexibility is that the synaptic weights of a neural network are not easily interpretable. Thus, if you are trying to explain an underlying process that produces the relationships between the dependent and independent variables, it would be better to use a more traditional statistical model. However, if model interpretability is not important, you can often obtain good model results more quickly using a neural network.

¹ Haykin, S. 1998. Neural Networks: A Comprehensive Foundation, 2nd ed. New York: Macmillan College Publishing.

² Ripley, B. D. 1996. Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.