NetNews Usenet Archive 1992 #18

Text File | 1992-08-18 | 7.1 KB | 185 lines

Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
Path: sparky!uunet!paladin.american.edu!auvm!DRETOR.DCIEM.DND.CA!JEFF
X-Mailer: ELM [version 2.2 PL9]
Message-ID: <9208182202.AA14821@client2.dciem.dnd.ca>
Newsgroups: bit.listserv.csg-l
Date: Tue, 18 Aug 1992 18:02:30 EDT
Sender: "Control Systems Group Network (CSGnet)" <CSG-L@UIUCVMD.BITNET>
From: jeff@DRETOR.DCIEM.DND.CA
Subject: top-level
Lines: 173

Here's a new topic, related to re-organisation.

I started wondering recently how a top-level Elementary Control System
(ECS) can remain connected to reality. To explain: let us take a
high-level ECS in a Control Net. (A high-level node is many levels from
the raw input and output.) Assume the net starts out untrained (or only
partially trained) for its environment. Finally, we assume that random
re-organisation is a major part of its training.

Now the high-level ECS doesn't know what its inputs or its reference
mean. All it must do is control that they match. It may be initially
set up with input of (target-position - finger-position) and reference
of (0). However, after a few random re-organisations the input weight
for "target-position" may have been set to zero, and the input weight
from "elbow-angle" to a positive value. This leaves the ECS training to
control "elbow-angle + finger-position" = 0. There is no way for the ECS
(or for the random re-organisation) to know that this new function is
nonsense.

In general it seems impossible to keep the input "relevant" to the
reference without forcing it in some fashion (and thus adding another
set of properties to the ECS).

One approach to "forcing it" is found in our Little Baby (a learning
version of the Little Man). As in the Little Man, the high-level
references involve the distance of the finger from the target (as
perceived in the right and left retinas). The Baby has one (or more)
layers of ECSs attached to the outputs of its high-level ECSs. However,
the inputs of the high-level ECSs are connected directly to the Baby's
inputs.

                R*                 <- top-level reference
                |
             --------
             | ECSs |              <- top-level ECS
             --------
              /    \
             |    --------
  direct   ->|    | ECSs |         <- untrained ECSs
  input      |    --------
  to top     |     /    \
             |    /      \
  ...................................
              \  /        \
             Input       Output    <- environment

The Little Baby is forced to learn to follow the target by being
provided with a fixed input function. The Complex Environmental Variable
(CEV) that the Baby is controlling cannot be unlearned; however, it can
likewise never be learned. This is a reasonable hack while we experiment
with re-organisation, but in the long run we can't always
hand-code/hard-code the inputs.

Bill seems to have also seen the problem, since he has suggested that
the learning mechanism should not be completely blind. He wants it to
contain some simple CEVs (an oxymoron :?) which guide the
re-organisation. This may be necessary, but it also feels like a hack to
have a separate control hierarchy for learning.

I have a partial solution that does not add new variables or structure
to the existing hierarchy. Unfortunately the CEV in the example is
different from "finger on target".

Suppose we have Little Baby (Mark MCXLI) that can successfully learn to
control (i.e. we have solved some of the re-organisation problem). Now
we wish to teach it to avoid a spot in its environment (say the exact
center of its cube).

We add an extra input (called Pain). We change the environment so that
Pain becomes large if the finger is close to the center of the cube, but
is very small elsewhere. (We now have a hot-spot.) We also add a simple
ECS that has Pain as input, zero as reference, a large gain, and outputs
to the arm muscles. This does nothing while the finger is outside the
hot-spot. If the Baby moves the finger into the hot-spot, this ECS will
quickly yank it out, and will then resume doing nothing. We have given
Little Baby a pain reflex.
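
The pain reflex just described can be sketched in a few lines. This is
only an illustration under stated assumptions, not the Little Baby code:
the finger lives on a 1-D line rather than in a cube, Pain is given a
Gaussian profile, the reflex output drives the finger in one fixed
withdrawal direction, and every name and constant below (pain,
reflex_output, simulate, the gain of 50, the step size) is hypothetical.

```python
import math

HOT_SPOT_CENTER = 0.0   # assumed 1-D stand-in for the center of the cube
HOT_SPOT_WIDTH = 0.1    # assumed width of the painful region


def pain(finger):
    """Environment: Pain is large near the hot-spot, very small elsewhere."""
    d = (finger - HOT_SPOT_CENTER) / HOT_SPOT_WIDTH
    return math.exp(-d * d)  # assumed Gaussian profile


def reflex_output(perceived_pain, reference=0.0, gain=50.0):
    """One simple ECS: output = gain * (reference - perception)."""
    return gain * (reference - perceived_pain)


def simulate(finger, steps=200, dt=0.001, output_weight=-1.0):
    """Move the finger under the reflex alone (no other controllers).

    output_weight stands in for the wiring to the arm muscles; its sign
    fixes the (assumed) withdrawal direction.
    """
    for _ in range(steps):
        finger += output_weight * reflex_output(pain(finger)) * dt
    return finger
```

Starting the finger at the center, simulate(0.0) ends well outside the
hot-spot, while a finger that starts far away, as in simulate(0.5),
barely moves: the reflex does nothing until Pain is perceived.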

The Baby now avoids the hot-spot very effectively; however, it will have
trouble moving the finger to the target in some cases (assume for the
moment that we don't move the target into the hot-spot). If a trajectory
goes through the hot-spot the arm will jump. Some target locations will
even have the Baby caught in a cycle. The rest of the Baby will
presumably eventually re-organise to avoid approaching the hot-spot.
There are several strategies that will succeed, and the one chosen
depends on the learning mechanism.

                R*                        <- top-level reference
                |
             --------
             | ECSs |                     <- multi-level CS (Control System)
             --------
              /    \
             /      \           R*        <- another top-level reference
            /        \          |
      --------    --------   -------
      | ECSs |    | ECSs |   | ECS |      <- pain reflex ECS
      --------    --------   -------
        /  \        /  \       /  \
   .......................................
       |    \      |    \   Pain   /
       |     \     |     \        /
       +-----------+      \      /        Environment
       |             \     \    /
     Inputs ------------Outputs

So why don't I consider this a cheat too? After all, we have hand-coded
an ECS to perform a function. Well, we haven't had to add a separate
learning hierarchy (as per Bill), or had to wire across levels (as in
the current Little Baby). Below are the reasons I think we don't have to
add any new features to "force" the Baby to learn the task.

Simplicity: The pain reflex is easy to learn by simple means (such as
genetic algorithms or random search). We shouldn't need to hand-code
such control functions.

Effectiveness: The pain reflex is very effective at avoiding the
hot-spot. This is accomplished solely by setting the gain high on a
simple task.

Stability: The pain reflex is stable against random re-organisation.
Since it is "effective" it very seldom has a non-zero error.
(Persistent high local error should probably trigger re-organisation.)
Since it is "simple" it has very few weights. This makes it a small
target for a random mutation (compared to the rest of the net). Lastly,
it is high gain. If there is a random change to an input or output the
Baby will thrash wildly.
The strong accumulation of local error should quickly cause a benign
mutation.

Since the new top-level goal is quite stable, the rest of the Little
Baby's brain is forced to re-learn.

Now for the proverb. I have realized that one of my original assumptions
was wrong. When I first learned PCT I assumed that all the top-level
ECSs (ones with fixed references) were also high-level ECSs (far from
the environment). I now suspect that *most* of the top-level goals of an
organism are fairly close to the I/O level, and that most of the
high-level ECSs are just used to add efficiency to the satisfaction of
these low-level goals.

        Top-level goals need not be high-level goals.

... Jeff

--
De apibus semper dubitandum est - Winni Ille Pu