- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!paladin.american.edu!auvm!DRETOR.DCIEM.DND.CA!JEFF
- X-Mailer: ELM [version 2.2 PL9]
- Message-ID: <9208182202.AA14821@client2.dciem.dnd.ca>
- Newsgroups: bit.listserv.csg-l
- Date: Tue, 18 Aug 1992 18:02:30 EDT
- Sender: "Control Systems Group Network (CSGnet)" <CSG-L@UIUCVMD.BITNET>
- From: jeff@DRETOR.DCIEM.DND.CA
- Subject: top-level
- Lines: 173
-
- Here's a new topic, related to re-organisation.
-
-
- I started wondering recently how a top-level Elementary
- Control System (ECS) can remain connected to reality.
-
- To explain: let us take a high-level ECS in a Control Net.
- (A high-level node is many levels from the raw input and output.)
- Assume the net starts out untrained (or only partially trained)
- for its environment. Finally, assume that random re-organisation
- is a major part of its training.
- Now the high-level ECS doesn't know what its inputs or
- its reference mean. All it must do is keep them matched.
- It may be initially set up with input of (target-position -
- finger-position) and reference of (0). However, after a few random
- re-organisations the input weight for "target-position" may have been
- set to zero, and the input weight from "elbow-angle" to a positive value.
- This leaves the ECS being trained to control "elbow-angle -
- finger-position" = 0. There is no way for the ECS (or for the random
- re-organisation) to know that this new function is nonsense.
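-
- To make the drift above concrete, here is a minimal Python sketch. It
- assumes the ECS input function is a plain weighted sum of named signals,
- that the finger-position weight keeps its original value of -1, and that
- the signal values are arbitrary numbers picked for illustration. The
- names perception, error, original and mutated are mine, not taken from
- the Little Baby code.

```python
def perception(weights, signals):
    """Weighted sum of the input signals (assumed form of the input function)."""
    return sum(w * signals[name] for name, w in weights.items())


def error(weights, signals, reference=0.0):
    """Error signal of the ECS: reference minus perception."""
    return reference - perception(weights, signals)


# Input weights as initially set up: perceives target-position - finger-position.
original = {"target": 1.0, "finger": -1.0, "elbow": 0.0}
# After a few random re-organisations: target weight zeroed, elbow weight positive.
mutated = {"target": 0.0, "finger": -1.0, "elbow": 1.0}

# Two hypothetical world states (arbitrary illustration values).
on_target = {"target": 5.0, "finger": 5.0, "elbow": 2.0}
off_target = {"target": 5.0, "finger": 2.0, "elbow": 2.0}

err_original_on = error(original, on_target)   # finger on target, original ECS
err_mutated_on = error(mutated, on_target)     # finger on target, mutated ECS
err_mutated_off = error(mutated, off_target)   # finger off target, mutated ECS

print(err_original_on, err_mutated_on, err_mutated_off)
```

- The mutated ECS reports zero error with the finger 3 units off target,
- and a non-zero error with the finger exactly on target. Nothing local to
- the ECS distinguishes the two input functions.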
-
-
- In general it seems impossible to keep the input "relevant"
- to the reference without forcing it in some fashion (and thus adding
- another set of properties to the ECS).
-
-
- One approach to "forcing it" is found in our Little Baby (a
- learning version of the Little Man). As in the Little Man, the high-level
- references involve the distance of the finger from the target
- (as perceived in the right and left retinas).
- The Baby has one (or more) layers of ECSs attached to the outputs
- of its high-level ECSs. However, the inputs of the high-level ECSs are
- connected directly to the Baby's inputs.
-
-               R*                  <- top-level reference
-               |
-            --------
-            | ECSs |               <- top-level ECS
-            --------
-             /    \
-            |    --------
-  direct -> |    | ECSs |          <- untrained ECSs
-  input     |    --------
-  to top    |     /    \
-            |    /      \
-  ...................................
-             \  /        \
-            Input        Output    <- environment
-
-
- The Little Baby is forced to learn to follow the target by
- being provided with a fixed input function.
- The Complex Environmental Variable (CEV) that the Baby is controlling
- cannot be unlearned; however, it can likewise never be learned.
- This is a reasonable hack while we experiment
- with re-organisation, but in the long run we can't always
- hand-code/hard-code the inputs.
-
-
-
- Bill seems to have also seen the problem since he has suggested
- that the learning mechanism should not be completely blind. He wants
- it to contain some simple CEVs (an oxymoron :?) which guide the
- re-organisation. This may be necessary, but it also feels like a hack
- to have a separate control hierarchy for learning.
-
-
-
- I have a partial solution that does not add new variables or
- structure to the existing hierarchy. Unfortunately, the CEV in the
- example is different from "finger on target".
-
-
- Suppose we have Little Baby (Mark MCXLI) that can successfully
- learn to control (i.e. we have solved some of the re-organisation
- problem).
- Now we wish to teach it to avoid a spot in its environment (say
- the exact center of its cube).
-
- We add an extra input (called Pain). We change the environment
- so that Pain becomes large if the finger is close to the center of
- the cube, but is very small elsewhere. (We now have a hot-spot.)
-
- We also add a simple ECS that has Pain as input, zero as
- reference, a large gain, and outputs to the arm muscles. This does
- nothing while the finger is outside the hot-spot. If the Baby moves
- the finger into the hot-spot, this ECS will quickly yank it out and will
- then resume doing nothing. We have given Little Baby a pain reflex.
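-
- A minimal sketch of such a reflex, under heavy simplifying assumptions
- of my own: the finger lives on a line (position x) with the hot-spot
- centred at 0, Pain falls off linearly to exactly zero at a fixed radius,
- the ECS output sets the finger velocity directly, and the output is
- wired so that it always withdraws toward negative x. The function names
- and the numbers (radius 0.1, gain 50) are illustrative only.

```python
def pain(x, radius=0.1):
    """Assumed Pain profile: 1 at the centre, falling linearly to 0 at the radius."""
    return max(0.0, 1.0 - abs(x) / radius)


def reflex_output(x, gain=50.0, reference=0.0):
    """Pain-reflex ECS: Pain as input, zero as reference, a large gain."""
    err = reference - pain(x)
    return gain * err  # always <= 0 here, i.e. a pull toward negative x


def simulate(x, steps=50, dt=0.001):
    """Let the reflex alone drive the finger for a number of small time steps."""
    for _ in range(steps):
        x += dt * reflex_output(x)
    return x


outside = simulate(0.5)    # starts well outside the hot-spot
yanked = simulate(0.05)    # starts inside the hot-spot

print(outside, yanked, pain(yanked))
```

- Started outside the hot-spot the finger does not move at all; started
- inside, it is pushed through to the edge within a few steps and then
- left alone again.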
-
- The Baby now avoids the hot-spot very effectively; however,
- it will have trouble moving the finger to the target in some cases (assume
- for the moment that we don't move the target into the hot-spot).
- If a trajectory goes through the hot-spot, the arm will jump. Some
- target locations will even leave the Baby caught in a cycle.
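-
- The conflict between reaching and the reflex can be sketched in one
- dimension. Everything here is an assumption for illustration: the linear
- Pain profile, the velocity-driven finger, a single proportional "reach"
- ECS standing in for the rest of the Baby, a reflex that always pushes
- toward negative x, and all the gains and positions.

```python
def pain(x, radius=0.1):
    """Assumed Pain profile: 1 at the centre (x = 0), 0 beyond the radius."""
    return max(0.0, 1.0 - abs(x) / radius)


def run(target, x=-0.5, reflex_gain=50.0, reach_gain=5.0, steps=5000, dt=0.001):
    """Finger driven by the sum of a reach ECS and a pain-reflex ECS."""
    for _ in range(steps):
        reach = reach_gain * (target - x)        # reduce target - finger
        reflex = reflex_gain * (0.0 - pain(x))   # reference 0, input Pain
        x += dt * (reach + reflex)
    return x


clear_path = run(target=-0.3)                  # target on the finger's side
blocked = run(target=0.5)                      # target beyond the hot-spot
no_reflex = run(target=0.5, reflex_gain=0.0)   # same target, reflex disabled

print(clear_path, blocked, no_reflex)
```

- With a clear path the finger reaches the target. With the hot-spot in
- the way it stalls just inside the hot-spot edge, where reach and reflex
- balance, and never arrives. This one-dimensional sketch does not
- reproduce the jumps or cycles; those would presumably need inertia or
- more degrees of freedom.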
-
- The rest of the Baby will presumably eventually re-organise
- to avoid approaching the hot-spot. There are several strategies that
- will succeed, and the one chosen depends on the learning mechanism.
-
-
-             R*                             <- top-level reference
-             |
-          --------
-          | ECSs |                          <- multi-level CS (Control System)
-          --------
-           /    \
-          /      \            R*            <- another top-level reference
-         /        \           |
-   --------     --------   -------
-   | ECSs |     | ECSs |   | ECS |          <- pain reflex ECS
-   --------     --------   -------
-    /   \        /   \      /   \
-  .......................................
-    |    \       |    \   Pain   /
-    |     \      |     \        /
-    +-----------+       \      /            Environment
-    |             \      \    /
-  Inputs           ------------Outputs
-
-
-
- So why don't I consider this a cheat too? After all, we have
- hand-coded an ECS to perform a function. Well, we haven't had to
- add a separate learning hierarchy (as per Bill), or had to wire across
- levels (as in the current Little Baby).
-
- Below are the reasons I think we don't have to add any
- new features to "force" the Baby to learn the task.
-
-
- Simplicity:
- The pain-reflex is easy to learn by simple means (such as
- genetic algorithms or random search). We shouldn't need to hand-code
- such control functions.
-
-
- Effectiveness:
- The pain-reflex is very effective at avoiding the hot-spot.
- This is accomplished solely by setting the gain high on a simple task.
-
-
- Stability:
- The pain reflex is stable against random re-organisation.
- Since it is "effective", it very seldom has a non-zero error.
- (Persistent high local error should probably trigger re-organisation.)
- Since it is "simple", it has very few weights. This makes it a small
- target for a random mutation (compared to the rest of the net).
- Lastly, it is high gain. If there is a random change to an input
- or output, the Baby will thrash wildly. The strong accumulation of
- local error should quickly cause a benign mutation.
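-
- A rough back-of-envelope version of the "small target" point, assuming
- each re-organisation step mutates exactly one weight chosen uniformly at
- random from the whole net (the scheme is my assumption, not something
- fixed above). The weight counts, 2 for the reflex and 1000 in total, are
- made-up illustration values.

```python
def hit_probability(reflex_weights, total_weights):
    """Chance that one uniformly chosen weight belongs to the reflex ECS."""
    return reflex_weights / total_weights


def survival_probability(reflex_weights, total_weights, mutations):
    """Chance that none of a series of independent mutations touches the reflex."""
    return (1.0 - hit_probability(reflex_weights, total_weights)) ** mutations


p_hit = hit_probability(2, 1000)                 # hypothetical sizes
p_survive = survival_probability(2, 1000, 100)   # after 100 mutations

print(p_hit, p_survive)
```

- Under these assumptions a single mutation hits the reflex 0.2% of the
- time, and the reflex survives 100 mutations untouched roughly 82% of
- the time. This covers only the "few weights" part of the argument, not
- the recovery driven by local error.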
-
-
- Since the new top-level goal is quite stable, the rest of the
- Little Baby's brain is forced to re-learn.
-
-
-
- Now for the proverb. I have realized that one of my original
- assumptions was wrong. When I first learned PCT I assumed that
- all the top-level ECSs (ones with fixed references) were also
- high-level ECSs (far from the environment).
- I now suspect that *most* of the top-level goals of an
- organism are fairly close to the I/O level, and that most of the
- high-level ECSs are just used to add efficiency to the satisfaction of
- these low-level goals.
-
- Top-level goals need not be high-level goals.
-
-
- ... Jeff
- --
- De apibus semper dubitandum est - Winni Ille Pu
-