Neural networks and fuzzy systems

Neural networks and fuzzy systems estimate functions from sample data. Statistical and artificial intelligence (AI) approaches also estimate functions. For each problem, statistical approaches require that we guess how output functionally depends on input. Neural and fuzzy systems do not require that we articulate such a mathematical model. They are model-free estimators.

We can also view AI expert systems as model-free estimators. They map conditions to actions. Experts do not articulate a mathematical transfer function from the condition space to the action space. But the AI framework is symbolic. Symbolic processing favors a propositional and predicate-calculus approach to machine intelligence. It does not favor numerical mathematical analysis or hardware implementation. In particular, symbols do not have derivatives. Symbolic systems may change with time, but they are not properly dynamical systems, that is, systems of first-order difference or differential equations.

Neural pre-attentive and attentive processing

The human visual system behaves as an adaptive system. Consider how it responds to this stimulus pattern.

What do we see when we look at the Kanizsa [1976] square? We see a square with a bright interior. We see illusory boundaries. Or do we? We recognize a bright square. Technically we do not see it, because it is not there.

The Kanizsa square exists in our brain, not “out there” in physical reality on the page. Out there only four symmetric ink patterns stain the page.

In the terminology of eighteenth-century philosopher Immanuel Kant [1783, 1787], the four ink stains are noumena, “things in themselves”. Light photons bounce off the noumena and stimulate our surface receptors, retinal neurons in this case. The noumena-induced sensation produces the Kanizsa-square phenomenon or perception in our brain. There would be no Kanizsa squares in the space-time continuum without brains or brainlike systems to perceive them.

Fuzziness as multivalence

Fuzzy theory holds that things are matters of degree. It mechanizes much of our “folk psychology”. Fuzzy theory also reduces black-white logic and mathematics to special limiting cases of gray relationships. Along the way it violates black-white “laws of logic”, in particular the law of noncontradiction, not-(A and not-A), and the law of excluded middle, either A or not-A, and yet resolves the paradoxes or antinomies [Kline, 1980] that these laws generate. Does the speaker tell the truth when he says he lies? Is set A a member of itself if A equals the set of all sets that are not members of themselves? Fuzziness also provides a fresh, and deterministic, interpretation of probability and randomness.

Mathematically fuzziness means multivaluedness or multivalence and stems from the Heisenberg position-momentum uncertainty principle in quantum mechanics. Three-valued fuzziness corresponds to truth, falsehood, and indeterminacy, or to presence, absence, and ambiguity. Multivalued fuzziness corresponds to degrees of indeterminacy or ambiguity, that is, to the partial occurrence of events or relations.

Bivalent paradoxes as fuzzy midpoints

Consider the bivalent paradoxes again. A California bumpersticker reads TRUST ME. Suppose instead a bumpersticker reads DON’T TRUST ME. Should we trust the driver? If we do, then, as the bumpersticker instructs, we do not. But if we don’t trust the driver, then, again in accord with the bumpersticker, we do trust the driver. The classical liar paradox has the same form. Does the liar from Crete lie when he says that all Cretans are liars? If he lies, he tells the truth. If he tells the truth, he lies. Russell’s barber is a man in a town who advertises his services with the logo “I shave all, and only, those men who don’t shave themselves.” Who shaves the barber? If he shaves himself, then according to his logo he does not. If he does not, then according to his logo he does. Consider the card that says on one side “The sentence on the other side is true,” and says on the other side “The sentence on the other side is false.”

The “paradoxes” have the same form. A statement S and its negation not-S have the same truth-value:

$t(S) = t(\text{not-}S)$ (1-1)

The two statements are both TRUE (1) or both FALSE (0). This violates the laws of noncontradiction and excluded middle. For bivalent truth tables, recall that negation reverses truth-value:

$t(\text{not-}S) = 1 - t(S)$ (1-2)

So (1-1) reduces to

$t(S) = 1 - t(S)$ (1-3)

If S is true, so that $t(S) = 1$, then (1-3) gives 1 = 0. If S is false, so that $t(S) = 0$, then (1-3) again gives the contradiction 0 = 1.

The fuzzy or multivalued interpretation accepts the logical relation (1-3) and, instead of insisting that $t(S) = 0$ or $t(S) = 1$, simply solves for $t(S)$ in (1-3):

$2t(S) = 1$ (1-4)

or

$t(S) = 1/2$ (1-5)

So the “paradoxes” reduce to literal half-truths. They represent in the extreme the uncertainty inherent in every empirical statement and in many mathematical statements. Geometrically, the fuzzy approach places the paradoxes at the midpoint of the one-dimensional unit hypercube [0,1]. More general paradoxes reside at the midpoint of n-dimensional hypercubes, the unique point equidistant to all 2^n vertices.
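As a small numerical check of this argument, the Python sketch below tests which truth-values satisfy relation (1-3) and confirms that the cube midpoint lies equally far from every vertex. The function names and the choice n = 3 are illustrative assumptions, not part of the original discussion.

```python
from fractions import Fraction
from itertools import product


def satisfies_paradox(t):
    """True when a truth-value t obeys t(S) = 1 - t(S), relation (1-3)."""
    return t == 1 - t


def squared_distances_to_vertices(point):
    """Squared Euclidean distance from `point` to each vertex of the unit cube."""
    n = len(point)
    return [
        sum((p - v) ** 2 for p, v in zip(point, vertex))
        for vertex in product((0, 1), repeat=n)
    ]


# The bivalent truth-values 0 and 1 both fail; only the half-truth works.
bivalent_ok = [satisfies_paradox(Fraction(t)) for t in (0, 1)]
half_ok = satisfies_paradox(Fraction(1, 2))

# Midpoint of the 3-dimensional unit cube (n = 3 is an arbitrary choice).
midpoint = (Fraction(1, 2),) * 3
distances = squared_distances_to_vertices(midpoint)
```

Exact fractions are used instead of floats so that the equality test in `satisfies_paradox` is not affected by rounding.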

Multivaluedness also resolves the classical sorites paradoxes. Consider a heap of sand. Is it still a heap if we remove one grain of sand? How about two grains? Three? If we argue bivalently by induction, we eventually remove all grains and still conclude that a heap remains, or that it has suddenly vanished. No single grain takes us from a heap to no heap. The same holds if we pluck out hairs from a nonbald scalp or remove 5%, 10%, or more of the molecules from a table or brain. We transition gradually, not abruptly, from a thing to its opposite. Physically we experience degrees of occurrence. In terms of statements about the physical processes, we arrive again at degrees of truth.

Suppose there are n grains of sand in the heap. Removing one grain leaves n-1 grains and a truth-value $t(S_{n-1})$ of the statement $S_{n-1}$ that the n-1 sand grains are a heap. In general the truth-value $t(S_{n-1})$ obeys $t(S_{n-1}) < 1$. The value $t(S_{n-1})$ may be close to unity, but we have some nonzero doubt $d_{n-1}$ about the truth of the matter. (The argument still holds if there exist no doubting creatures in the universe.) For instance,

$t(S_n) = 1 - d_n$ (1-6)

where $0 \le d_n \le d_{n-1} \le \cdots \le d_{n-m} \le \cdots \le 1$. So $t(S_{n-m})$ approaches zero as m increases to n. If we argue inductively, we can interpret the overall inference as the forward chain “(If $S_n$, then $S_{n-1}$) and (If $S_{n-1}$, then $S_{n-2}$) and ... and (If $S_1$, then $S_0$)”.
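The chain of truth-values can be traced numerically. The sketch below assumes a linear doubt schedule, with no doubt for the full heap and complete doubt for zero grains. That schedule, the heap size n = 10, and the function names are hypothetical choices made only to illustrate equation (1-6). Any nondecreasing doubt sequence would serve.

```python
from fractions import Fraction


def doubt(k, n):
    """Hypothetical doubt d_k that k grains (of an original n) form a heap.

    Linear schedule chosen only for illustration: d_n = 0 and d_0 = 1.
    """
    return 1 - Fraction(k, n)


def heap_truth(k, n):
    """Truth-value t(S_k) = 1 - d_k, following equation (1-6)."""
    return 1 - doubt(k, n)


n = 10  # illustrative heap size
# truths[m] is the truth-value after removing m grains.
truths = [heap_truth(n - m, n) for m in range(n + 1)]

# Largest drop in truth caused by removing a single grain.
largest_step = max(a - b for a, b in zip(truths, truths[1:]))
```

Under this assumed schedule no single removal lowers the truth-value by more than 1/n, yet the chain as a whole moves from 1 to 0.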

Fuzziness in the twentieth century

Logical paradoxes and the Heisenberg uncertainty principle led to the development of multivalued or “fuzzy” logic in the 1920s and 1930s. Quantum theorists allowed for indeterminacy by including a third or middle truth-value in the bivalent logical framework. The next step allowed degrees of indeterminacy, viewing TRUE and FALSE as the two limiting cases of the spectrum of indeterminacy.

Polish logician Jan Lukasiewicz first formally developed a three-valued logical system in the early 1930s. He extended the range of truth-values from {0, 1/2, 1} to all rational numbers in [0,1], and finally to all numbers in [0,1] itself. Logics that use the general truth function $t: \{\text{Statements}\} \to [0,1]$ define continuous or “fuzzy” logics.

In the 1930s quantum philosopher Max Black applied continuous logic componentwise to sets or lists of elements or symbols. Historically, Black drew the first fuzzy-set membership functions. Black called the uncertainty of these structures vagueness. Anticipating Zadeh’s fuzzy-set theory, each element in Black’s multivalued sets and lists behaved as a statement in a continuous logic.

Zadeh extended the bivalent indicator function $I_A$ of the nonfuzzy subset $A$ of $X$,

$I_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases}$

to a multivalued indicator or membership function $m_A: X \to [0,1]$. This allows us to combine such multivalued or fuzzy sets with the pointwise operators of indicator functions:

$I_{A \cap B}(x) = \min(I_A(x), I_B(x))$

$I_{A \cup B}(x) = \max(I_A(x), I_B(x))$

$I_{A^c}(x) = 1 - I_A(x)$

$A \subset B$ iff $I_A(x) \le I_B(x)$ for all $x$ in $X$

The membership value $m_A(x)$ measures the elementhood, or degree to which element $x$ belongs to set $A$:

$m_A(x) = \text{Degree}(x \in A)$

Just as the individual indicator values $I_A(x)$ behave as statements in bivalent propositional calculus, membership values $m_A(x)$ correspond to statements in continuous logic.
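The pointwise operators carry over directly to membership values. The following Python sketch stores a fuzzy set as a dictionary from element to membership value. The dictionary representation, the function names, and the sample sets A and B are assumptions made for illustration.

```python
from fractions import Fraction as F


def f_and(a, b):
    """Intersection: pointwise minimum of membership values."""
    return {x: min(a[x], b[x]) for x in a}


def f_or(a, b):
    """Union: pointwise maximum of membership values."""
    return {x: max(a[x], b[x]) for x in a}


def f_not(a):
    """Complement: one minus each membership value."""
    return {x: 1 - a[x] for x in a}


def is_subset(a, b):
    """A is a subset of B iff m_A(x) <= m_B(x) for every x."""
    return all(a[x] <= b[x] for x in a)


# Hypothetical fuzzy sets over X = {"x1", "x2", "x3"}.
A = {"x1": F(1, 3), "x2": F(3, 4), "x3": F(1)}
B = {"x1": F(1, 2), "x2": F(1, 2), "x3": F(0)}

A_and_B = f_and(A, B)
A_or_B = f_or(A, B)
A_c = f_not(A)
```

When every membership value is 0 or 1, these functions reduce to ordinary set intersection, union, and complement.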

Sets as points in cubes

Fuzziness prevents logical certainty at the level of black-white axioms. This seems unsettling to some and liberating to others.

Neural networks and fuzzy systems process inexact information and process it inexactly. Neural networks recognize ill-defined patterns without an explicit set of rules.

Fuzzy systems estimate functions and control systems with partial descriptions of system behavior.

The neuronal state space, the set of all possible neural outputs, equals the set of all n-dimensional fit vectors, the fuzzy power set. Both equal the unit hypercube $I^n = [0,1]^n = [0,1] \times \cdots \times [0,1]$, the set of all vectors of length n with coordinates in the unit interval [0,1].

The $2^n$ vertices of $I^n$ represent extremized neuronal-output combinations. The midpoint of the cube, where a fuzzy set $A$ equals its own opposite $A^c$, has maximum fuzzy entropy. The black-white vertices have minimal fuzzy entropy.

Proper fuzzy sets $A$, the nonvertex points, violate the laws of noncontradiction and excluded middle: $A \cap A^c \neq \emptyset$ and $A \cup A^c \neq X$.

Fig. 1.1: The fuzzy power set $F(2^X)$ of $X$ corresponds to the unit square when $X = \{x_1, x_2\}$. The four nonfuzzy subsets in the nonfuzzy power set $2^X$ correspond to the four corners of the 2-cube. The fuzzy subset $A$ corresponds to the fit vector (1/3, 3/4), a point inside the 2-cube, if $m_A(x_1) = 1/3$ and $m_A(x_2) = 3/4$. The midpoint M of the unit square corresponds to the maximally fuzzy set. Long diagonals connect nonfuzzy set complements.
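The fit vector (1/3, 3/4) from the figure makes a convenient worked case. The sketch below computes A ∩ A^c and A ∪ A^c for that point, for the midpoint, and for one vertex. This section does not define fuzzy entropy, so the measure used here is an assumption: the sum of the fit values of A ∩ A^c divided by the sum for A ∪ A^c. The function names are likewise illustrative.

```python
from fractions import Fraction as F


def complement(a):
    """A^c: one minus each fit value."""
    return tuple(1 - v for v in a)


def overlap(a):
    """A ∩ A^c, computed with the pointwise minimum."""
    return tuple(min(v, w) for v, w in zip(a, complement(a)))


def underlap(a):
    """A ∪ A^c, computed with the pointwise maximum."""
    return tuple(max(v, w) for v, w in zip(a, complement(a)))


def fuzzy_entropy(a):
    """Assumed measure: sum of fits in A ∩ A^c over sum of fits in A ∪ A^c."""
    return F(sum(overlap(a)), sum(underlap(a)))


A = (F(1, 3), F(3, 4))   # fit vector from the figure
M = (F(1, 2), F(1, 2))   # midpoint of the unit square
V = (1, 0)               # one nonfuzzy vertex

A_overlap = overlap(A)     # nonzero, so noncontradiction fails
A_underlap = underlap(A)   # not (1, 1), so excluded middle fails
```

With this assumed measure the midpoint scores 1, each vertex scores 0, and the interior point A falls between them, which matches the ordering described in the text.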

Fuzziness in a probabilistic world

Is uncertainty the same as randomness? If we are not sure about something, is it only up to chance? Do the notions of likelihood and probability exhaust our notions of uncertainty? Many people trained in probability and statistics believe so. Some even say so, and say so loudly.

Randomness and fuzziness differ conceptually and theoretically. We can illustrate some differences with examples.

Randomness and fuzziness also share many similarities. Both systems describe uncertainty with numbers in the unit interval [0,1]. This ultimately means that both systems describe uncertainty numerically. Both systems combine sets and propositions associatively, commutatively, and distributively. The key distinction concerns how the systems jointly treat a set $A$ and its opposite $A^c$. Classical set theory demands $A \cap A^c = \emptyset$, and probability theory conforms: $p(A \cap A^c) = p(\emptyset) = 0$. So $A \cap A^c$ represents a probabilistically impossible event. But fuzziness begins when $A \cap A^c \neq \emptyset$.

Randomness vs. ambiguity: whether vs. how much

Fuzziness describes event ambiguity. It measures the degree to which an event occurs, not whether it occurs. Randomness describes the uncertainty of event occurrence. An event occurs or not, and you can bet on it. The issue concerns the occurring event: Is it uncertain in any way? Can we unambiguously distinguish the event from its opposite?

Whether an event occurs is “random”. To what degree it occurs is fuzzy. Whether an ambiguous event occurs, as when we say there is a 20% chance of light rain tomorrow, involves compound uncertainties: the probability of a fuzzy event.

We regularly apply probabilities to fuzzy events: small errors, satisfied customers, A students, galactic clusters, and so on. We understand that, at least around the edges, some satisfied customers can be somewhat unsatisfied, some A students might equally be B+ students, and some stars are as much in a galactic cluster as out of it. Events can transition more or less smoothly to their opposites, making classification hard near the midpoint of the transition. But in theory, in formal descriptions and in textbooks, the events and their opposites are black and white. A hill is a mountain if it is at least x meters tall, and not a mountain if it is one micron less than x in height.

Consider some further examples. The probability that this chapter gets published is one thing. The degree to which it gets published is another. Suppose there is a 50% chance that there is an apple in the refrigerator (electron in a cell). That is one state of affairs, perhaps arrived at through frequency calculations. Now suppose there is half an apple in the refrigerator. That is another state of affairs. Both states of affairs are superficially equivalent in terms of their numerical uncertainty. Yet physically, ontologically, they differ. One is “random”, the other fuzzy.

Consider parking your car in a parking lot with painted parking spaces. You can park in any space with some probability. Your car will totally occupy one space and totally unoccupy all other spaces. The probability number reflects a frequency history or Bayesian brain state that summarizes which parking space your car will totally occupy. Alternatively, you can park in every space to some degree. Your car will partially, and deterministically, occupy every space. In practice your car will occupy most spaces to zero degree. Finally, we can use numbers in [0,1] to describe, for each parking space, the occurrence probability of each degree of partial occupancy: probabilities of fuzzy events.
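The three parking descriptions can be set side by side in a short Python sketch. All numbers are made up for illustration. The probability of a fuzzy event is computed as the expected membership value, the sum of p(degree) times degree. That rule is one standard definition and is an assumption here, since the text does not fix one.

```python
from fractions import Fraction as F

# 1. Randomness: the car totally occupies exactly one of three spaces.
#    Values are probabilities of whether each space is the one occupied.
space_probability = {"s1": F(1, 2), "s2": F(1, 4), "s3": F(1, 4)}

# 2. Fuzziness: the car deterministically occupies each space to a degree.
#    Values say how much of an occupancy there is, not whether it happens.
occupancy_degree = {"s1": F(3, 4), "s2": F(1, 4), "s3": F(0)}

# 3. Probability of a fuzzy event: for space s1, the probability of each
#    degree of partial occupancy (keys are degrees, values probabilities).
s1_degree_probability = {F(0): F(1, 2), F(3, 4): F(1, 4), F(1): F(1, 4)}


def fuzzy_event_probability(degree_probability):
    """Expected membership: sum of p(degree) * degree (assumed definition)."""
    return sum(p * degree for degree, p in degree_probability.items())


p_s1_occupied = fuzzy_event_probability(s1_degree_probability)
```

The first and third dictionaries hold probabilities that sum to one. The second holds degrees, which carry no such constraint in general.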

If we assume events are unambiguous, as in balls-in-urn experiments, there is no set fuzziness. Only randomness remains. But when we discuss the physical universe, every assertion of event ambiguity or unambiguity is an empirical hypothesis. We habitually overlook this when we apply probability theory. Years of such oversight have entrenched the sentiment that uncertainty is randomness, and randomness alone. We systematically assume away event ambiguity. We call the partially empty glass empty and call the small number zero. This silent assumption of universal nonambiguity resembles the pre-relativistic assumption of an uncurved universe.

Consider the inexact oval in the figure below. Does it make more sense to say that the oval is probably an ellipse, or that it is a fuzzy ellipse? There seems nothing random about the matter. The situation is deterministic: all the facts are in. Yet uncertainty remains. The uncertainty arises from the simultaneous occurrence of two properties: to some extent the inexact oval is an ellipse, and to some extent it is not an ellipse.

Fuzzy systems and applications

Fuzzy systems store banks of fuzzy associations or common-sense “rules”. A fuzzy traffic controller might contain the fuzzy association “If traffic is heavy in this direction, then keep the light green longer.” Fuzzy phenomena admit degrees. Some traffic configurations are heavier than others. Some green-light durations are longer than others. The single fuzzy association (HEAVY, LONGER) encodes all these combinations.
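A deliberately simplified Python sketch of the single association (HEAVY, LONGER) follows. The ramp-shaped membership function, its thresholds, the base green time, and the proportional extension are all hypothetical. A practical fuzzy controller would combine many such rules and defuzzify their joint output, which this sketch does not attempt.

```python
def heavy(cars_per_minute, low=10.0, high=40.0):
    """Degree in [0, 1] to which traffic is HEAVY (hypothetical ramp)."""
    if cars_per_minute <= low:
        return 0.0
    if cars_per_minute >= high:
        return 1.0
    return (cars_per_minute - low) / (high - low)


def green_duration(cars_per_minute, base=20.0, max_extension=30.0):
    """Apply (HEAVY, LONGER): extend green time in proportion to heaviness.

    `base` and `max_extension` are illustrative values in seconds.
    """
    return base + max_extension * heavy(cars_per_minute)
```

For example, `green_duration(25)` lies halfway between the shortest and longest green times, because traffic at 25 cars per minute is HEAVY to degree 0.5 under these assumed thresholds.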

Fuzzy systems are even newer than neural systems. Fuzzy systems “intelligently” automate subways; focus cameras and camcorders; tune color televisions and computer disc heads; control automobile transmissions, cruise controllers, and emergency braking systems; defrost refrigerators; control air conditioners; automate washing machines and vacuum sweepers; guide robot-arm manipulators; invest in securities; control traffic lights, elevators, and cement mixers; recognize kanji characters; select golf clubs; even arrange flowers.