Saturday, June 8, 2013

Shoestring AI, Part 2, Nodal FORTH

Background: Sometime in the early 80's (either 1981 or 82) I got hold of GraFORTH, not because it's a nifty-neat programming language, but because it could do real 3D graphics on the Apple II... amazing, because for a super-slow 8 bit machine doing any kind of real 3D appears 'impossible'. So motivated to do cool graphics I learned FORTH, and then after a while moved on with the march of technology. But it would remain a benchmark for me of simplicity and elegance.

Before I implement any code here, I noticed that some of the implementations of AI (or AGI... whatever...) are ... messy. By that I mean that to some degree they have code or algorithms 'bolted on' to do specific things with knowledge, logic, and so on. Yeeech.

I also noticed that a graph-type KB (Knowledge Base) needs traversal and maintenance routines... which leads down the path of these 'bolt ons'. Double Yeeech.

I then thought about a self-referential system, which might be built on primitives; the fewer the primitives, the fewer the bolt-ons. Hmmm... Thinking more about the qualities of the information in the KB I noticed that from any node, two bits of information might be interesting to know:

1. 'Upward' relationships - is the information 'well placed' in the KB?

2. 'Downward' evaluations - do the subcomponents of this node 'evaluate'?

I was initially thinking along the lines that downward evaluation into atomic actions / expressions / values would be neat; caching-in-context (virtual pruning), downward expansion if not, etc. I was going for the identification of atomic actions, starting first with the graph tree itself, then it would expand into logical constructs between objects, and so on.

What became fairly apparent very quickly is that for atomic actions what I was trying to do was a tree-traversal-evaluation, whereby the branch of the tree could be 'evaluated' and the actions and logic were expressed either as primitives/atomics, or within the nodes themselves as more complex constructs. That starts to look a whole lot like FORTH.

FORTH, in a nutshell.

Consider this statement:

2 2 + .

It should put a 2 on the stack, then another 2, then pop two values off the stack, sum them, put the result back on the stack, and then print the result. It illustrates something common to FORTHs: 'Reverse Polish Notation', or RPN, which is super slick for machine-language expression of FORTH primitives, since things like AVR chips are much like any other modern processor.
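To make the stack mechanics concrete, here is a minimal sketch in Python (not FORTH, and not any real FORTH implementation) that handles only integers, '+', and '.'. The function name run and the idea of collecting printed output in a list are my own assumptions for illustration.

```python
def run(source):
    """Evaluate a tiny subset of FORTH: integers, '+', and '.'."""
    stack = []    # the data stack
    output = []   # what '.' would have printed
    for token in source.split():
        if token == "+":
            b = stack.pop()          # pop two values...
            a = stack.pop()
            stack.append(a + b)      # ...and push their sum
        elif token == ".":
            output.append(str(stack.pop()))  # pop and "print"
        else:
            stack.append(int(token))         # anything else is a number
    return stack, output


stack, output = run("2 2 + .")
print(" ".join(output))  # prints 4
```

Note that there is no parser to speak of: tokens are consumed left to right and the stack does all the bookkeeping.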

You could also write a useful function that adds 2 to whatever is on the stack:

: PLUSTWO 2 + ;

Now there is a word-definition for 'PLUSTWO'. By expanding the dictionary of FORTH this way you not only write the language, but you write the program; the two are the same.
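Here is a rough Python sketch of how growing the dictionary works, assuming the usual FORTH colon-definition syntax (: NAME ... ;). The class name Forth and its methods are hypothetical, and only '+' and '.' exist as primitives. User-defined words are stored as plain token lists and executed by recursion.

```python
class Forth:
    def __init__(self):
        self.stack = []
        self.output = []
        # Primitives are Python callables; user words are lists of tokens.
        self.words = {"+": self._add, ".": self._dot}

    def _add(self):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(a + b)

    def _dot(self):
        self.output.append(str(self.stack.pop()))

    def run(self, source):
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":
                # Colon definition: collect tokens up to ';' under a new name.
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                self.words[name] = body
            else:
                self._execute(token)

    def _execute(self, token):
        word = self.words.get(token)
        if word is None:
            self.stack.append(int(token))   # not in the dictionary: a number
        elif callable(word):
            word()                          # primitive
        else:
            for t in word:                  # user word: run its definition
                self._execute(t)


forth = Forth()
forth.run(": PLUSTWO 2 + ;")
forth.run("5 PLUSTWO .")
print(forth.output)  # ['7']
```

Once PLUSTWO is in the dictionary it can be used inside further definitions exactly like a built-in, which is the 'language and program are the same thing' point.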

This self-coding is why FORTH is so interesting; the definitions of words, and the relationships between words within those definitions, start to look somewhat like a KB graph-tree.

And now with a little extra stack-ness we can store not just stack-friendly integers, but really anything, like references to objects in the KB, confidence weights, probabilities, whatever.

We have (glossing over the obviously missing implementation) a downward-evaluating KB model now. It is self-expressive, so any node in the KB tree can be 'executed' to some degree with definitions that only exist in the tree itself. This means that code concepts become portable; and possibly opens the door to behaviour and pattern modelling.
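A very rough Python sketch of what downward evaluation could look like, purely as an assumption about a design that isn't implemented yet: each node is either a leaf holding a value, a primitive action, or a list of child nodes, and evaluating a node walks its children depth-first against a shared stack. The names Node and evaluate are hypothetical.

```python
class Node:
    """One KB node: a leaf value, a primitive action, or a list of child nodes."""

    def __init__(self, name, value=None, primitive=None, children=()):
        self.name = name
        self.value = value
        self.primitive = primitive
        self.children = list(children)


def evaluate(node, stack):
    """Downward evaluation: run children in order, depth first."""
    if node.primitive is not None:
        node.primitive(stack)          # atomic action
    elif node.children:
        for child in node.children:    # a 'word' defined by other nodes
            evaluate(child, stack)
    else:
        stack.append(node.value)       # leaf: push whatever it holds
    return stack


def add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


two = Node("two", value=2)
plus = Node("+", primitive=add)
plus_two = Node("PLUSTWO", children=[two, plus])

# The stack is not limited to integers: here a weight goes through PLUSTWO.
confidence = Node("confidence", value=0.5)
boosted = Node("boosted", children=[confidence, plus_two])
print(evaluate(boosted, []))  # [2.5]
```

The only code that lives outside the tree here is the add primitive; everything else is expressed as nodes pointing at other nodes.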

Upward-evaluations are the logical ones, asking questions about the consistency and placement of concepts within the KB. I think they are executed in atomics during object creation and modification, but I'm not sure yet. Maybe tomorrow I'll be sure.

For now let's settle on this as 'Nodal FORTH', or nFORTH, since it operates on nodes (and there are some more cool stack-like things we can do there, also).

Thursday, June 6, 2013

Shoestring AI?

Background: I've been thinking about AI on and off since maybe 1976. I fully believe that it's possible to engage in a topic, leave it to percolate in the subconscious, and then re-visit it after some period of time. For me and AI, it's every 3 years or so. Sometimes for just a week, sometimes more. This week has been particularly productive. I can't work on the rover; the rain has been constant, and I have a few other obligations first, so I can't drive it outside to make any progress.

About a week-and-a-bit ago I had the idea that several Atmel 84's or 328's could be physically arranged on an inexpensive PCB (or stacks of PCBs) to split the tasks of AI up into really little bits, but perhaps locally related (and hence stacked) groups to operate on concepts. A concept here is a fairly well-contained cluster of knowledge and behaviours that can be well represented for the purposes of small robotics.

Examples of concepts would be things like time, position, sensor inputs, driver outputs, etc, but also a few other things, like goal states, external object representations, and things generally useful to keeping a small robot from roaming into known dangers and more focused on goals at hand.

The basic implementation steals from the idea of Kalman filters on input sensors to do a better-than-your-average-bear (pun intended) estimation of outcome to apply it to AI and not just data smoothing. More on that later...

At a hardware level, the basic PCB group would have three chips; an I/O chip, a Concept Knowledgebase chip, and a crossbar chip (so concepts can talk amongst themselves without buggering up the I/O lines and making servos jittery).

A bonus-points implementation would also do a somewhat direct mapping of a concept-group of chips to text, so even in a simple system 'almost natural language' text I/O would be possible. You could tell your iBot to not bother cleaning that part of the house, and it would 'understand' (at a location-action level).

I decided that it would be a Good Thing to not rush the PCB design, as a software-only proof would be a little less work. Heh. Sure.

I also found through thought-experiment that not only would the Kalman approach work for outcome estimation, you could use it for outcome selection, by picking from a pool of possible actions and then choosing the one that was likely to produce a good result.
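As a minimal sketch of outcome selection, assume each candidate action comes with a function that predicts the next state, and that 'good' simply means 'closest to the goal'. Both assumptions are mine, as are the names choose_action and actions; a real system would use the filter's estimate rather than these toy lambdas.

```python
def choose_action(state, goal, actions):
    """Return the name of the action whose predicted outcome lands closest to the goal."""
    return min(actions, key=lambda name: abs(actions[name](state) - goal))


# Hypothetical action pool: each entry predicts the next position from the current one.
actions = {
    "forward": lambda x: x + 1.0,
    "reverse": lambda x: x - 1.0,
    "stay": lambda x: x,
}

print(choose_action(0.0, 3.0, actions))  # forward
```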

For goal setting the Kalman process actually has a goal state built in, if you consider the correct terms and take the estimated state as it stands before it's combined with the current state in the final smoothing / weighting step.
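For reference, here is one step of a textbook scalar (one-dimensional) Kalman filter in Python. It is a sketch of the standard equations, not of this project's code. The value x_pred is the estimate before it is blended with the measurement, which is my reading of the 'estimated state' mentioned above; the parameter names are my own.

```python
def kalman_step(x, p, u, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    u    : expected change in state (control input)
    z    : new measurement
    q, r : process and measurement noise variances
    """
    # Predict: where we expect to be before looking at the sensor.
    x_pred = x + u
    p_pred = p + q
    # Update: K decides how much to trust the measurement over the prediction.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_pred, x_new, p_new, k


print(kalman_step(x=0.0, p=1.0, u=1.0, z=2.0, q=0.0, r=1.0))
# (1.0, 1.5, 0.5, 0.5)
```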

The system can also self-tune, by fiddling with its own K-weight terms. I wouldn't call this self-learning, but it's a reasonable approximation of propagation training in neural networks. Although we aren't dealing with neural nets, the need for training-via-feedback is (I hope) self-apparent.
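One very crude way to picture 'fiddling with its own K' is a hill-climb: try a slightly smaller and a slightly larger fixed blending weight, and keep whichever gives the lowest prediction error on recent readings. This is a stand-in technique I'm assuming for illustration, not backpropagation and not a claim about how the final system will tune itself. The names prediction_error and tune_k are hypothetical.

```python
def prediction_error(k, readings):
    """Sum of squared one-step-ahead errors for a fixed blending weight k."""
    estimate = readings[0]
    total = 0.0
    for z in readings[1:]:
        total += (z - estimate) ** 2
        estimate = estimate + k * (z - estimate)   # blend in the new reading
    return total


def tune_k(k, readings, step=0.05, rounds=50):
    """Hill-climb k within [0, 1], keeping the candidate with the lowest error."""
    for _ in range(rounds):
        candidates = [max(0.0, k - step), k, min(1.0, k + step)]
        k = min(candidates, key=lambda c: prediction_error(c, readings))
    return k


readings = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # a steadily moving target
tuned = tune_k(0.1, readings)
print(prediction_error(0.1, readings), prediction_error(tuned, readings))
```

Because the current k is always one of the candidates, the error can only stay the same or go down from round to round.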

Lucky for me, Processing just advanced to 2.0 the day before. It's just the right thing for this, as it's lightweight enough to get results within 30 seconds of installation, instead of endless B.S. typical of (cough) modern (cough) dev environment configurations. Plus it does VERY nice graphics at zero programmer cost, handy for visualizations of knowledge trees, internal machine states, and whatever else was lurking in my brain but not fully expressed yet.

I also installed MySQL, so in literally minutes I had a test db being queried by a Processing sketch. Good enough for a day's work.

Why SQL? I needed a way to store knowledge, and doing it in SQL makes it flexible enough so that I can mess around until I get it thin enough to do it in a lightweight, probably flat file, 512-byte-chunk format for fast SD card access within a concept-group of chips. These little chips run at the blazing speed of 16 MHz and have perhaps 2 KB of SRAM, so these performance considerations count!
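To get a feel for the 512-byte-chunk idea, here is a Python sketch that packs fixed-size records into one block. The record layout (three 16-bit object ids plus a 32-bit float weight) and the 0xFF padding are entirely made up for illustration; the real format is still undecided.

```python
import struct

BLOCK_SIZE = 512
# Hypothetical record: subject id, relation id, object id, weight (little-endian).
RECORD = struct.Struct("<HHHf")                 # 2 + 2 + 2 + 4 = 10 bytes
RECORDS_PER_BLOCK = BLOCK_SIZE // RECORD.size   # 51 records fit in one block
EMPTY = b"\xff" * RECORD.size                   # assumed marker for an unused slot


def pack_block(records):
    """Pack up to RECORDS_PER_BLOCK records into exactly one 512-byte block."""
    if len(records) > RECORDS_PER_BLOCK:
        raise ValueError("too many records for one block")
    data = b"".join(RECORD.pack(*r) for r in records)
    return data.ljust(BLOCK_SIZE, b"\xff")      # pad the rest of the block


def unpack_block(block):
    """Read records back until the first unused slot."""
    records = []
    for offset in range(0, RECORDS_PER_BLOCK * RECORD.size, RECORD.size):
        chunk = block[offset:offset + RECORD.size]
        if chunk == EMPTY:
            break
        records.append(RECORD.unpack(chunk))
    return records


block = pack_block([(1, 2, 3, 0.5), (4, 5, 6, 0.25)])
print(len(block), unpack_block(block))
# 512 [(1, 2, 3, 0.5), (4, 5, 6, 0.25)]
```

Fixed-size records mean a chip with 2 KB of SRAM can read one block, walk it with simple pointer arithmetic, and never need a parser.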

The following day I found a talk by Doug Lenat who I recognized as having started some interesting work in one of my previous visits to AI-land, so it was neat to hear where they had gone with it. 'Neat' as in 'close, but no cigar'. But there are some cool elements to what they have made available... I'm totally going to borrow the OpenCyc KB, but I'm going to do something a bit different with the storage organization.

The next day was less fun-filled, so it was a thought-experiment-only day, and the sum-total was that as long as I kept super-classing the world-knowledge representational model, I could hold onto the idea that there is one - and only one - object representation of All Things.

This is important from an elegance perspective. Others have slightly messier implementations (which I'll get to shortly), but one key aspect I think is correct is that some of the describable properties of an AI system are not built, they are emergent. For emergence to flourish I suspect it does 'better' in an elegant system. I don't know much about emergence, but if chaos is its evil twin... well, I have some experience with chaotic systems... but enough about my employment history.

I decided that there are a few key terms to keep my thoughts organized. Like rabbits, these terms can multiply, run willy-nilly all over the place, and generally get away from the person trying to organize them. So I'll try to keep the definitions simple, I don't want them to multiply, and I want to keep the implementations penned-in.

Objects: everything in the system is an object. All knowledge, things, concepts, states, potentials, histories, and outcomes. All colors, music and emotion. The breadth and depth of the universe and mankind's hopes and dreams. So not lofty at all. Just everything.

Relationships: Sometimes there are two objects that are related; daughter to mother, light and dark, etc. Kinda obvious, right? So it's very common to start drawing tree graphs of circles connected with lines, or logic statements like ($A) IS_LIKE ($B), or ($A) HAS ($R) WITH ($B). I did this. And IT IS WRONG. Remember: All Things Are Objects. That includes relationships; and stick-figure graph diagrams can trick the brain into wanting to implement the circles in one bit of code, and the lines in another. Wrongity-wrong-wrong. (with a caveat that I might get to by the end of this post...)

Properties: Things, especially in the physical world, have properties. Trees are green, cows make a sound like 'moo', and so on. Borrowing a bit from object-oriented programming, it's sort of natural to assume that a notion like 'object' would also have 'properties'. BTW, properties are objects too! And how are properties attached to objects? Relationships, remember, those are objects as well. It's getting crowded in here...

Classes / Instances: To complete the object-oriented convention, it's also handy to have the idea of classes and then instances of those classes. Classes can have all kinds of properties and relationships, and instances can inherit those from one or more parents, or override them with some uniqueness. No. Big. Deal. Oh yeah, those are objects too.
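Here is a small Python sketch of the 'one object representation' idea under my own assumptions: a single Obj class is used for things, properties, relationship kinds, and the relationships themselves, and instances look up what they don't override in their parents. The names Obj, relate, and lookup are hypothetical, not a settled design.

```python
class Obj:
    """The only representation: things, properties, classes and links are all Obj."""

    def __init__(self, label, parents=(), subject=None, kind=None, target=None):
        self.label = label
        self.parents = list(parents)   # classes this object inherits from
        # Only filled in when this object *is* a relationship:
        self.subject, self.kind, self.target = subject, kind, target
        self.links = []                # relationship objects leaving this object


def relate(subject, kind, target):
    """Create a relationship, which is itself just another Obj."""
    label = f"{subject.label}-{kind.label}-{target.label}"
    link = Obj(label, subject=subject, kind=kind, target=target)
    subject.links.append(link)
    return link


def lookup(obj, kind):
    """Find the target of the first `kind` link on the object, else on its parents."""
    for link in obj.links:
        if link.kind is kind:
            return link.target
    for parent in obj.parents:
        found = lookup(parent, kind)
        if found is not None:
            return found
    return None


has_sound = Obj("HAS_SOUND")           # the relationship kind is an object
moo = Obj("moo")                       # so is the property value
cow = Obj("Cow")                       # a class
relate(cow, has_sound, moo)
daisy = Obj("Daisy", parents=[cow])    # an instance that inherits from Cow
print(lookup(daisy, has_sound).label)  # moo
```

An instance overrides an inherited property simply by getting its own link of the same kind, since lookup checks the object before its parents.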

Logic: There is (I think) a formal set of all possible logic states. It's probably (heh) handy to have some probabilistic features bolted on / baked in, because I'd always assumed that the Kalman weights would factor into the inter-object links. 'Logic' is the inter-object relationship consistency statements that can be evaluated; truths can be found, and these in turn can be stored in the KB. Is logic an object? Representationally, yes. To say A = B is to say something like objA.objEQ.objB. Implementationally, and because logic is discrete, I assumed that it would be done in code (as opposed to within the system's own meta-language).

Actions: Actions are a kind of rabbit offspring; they are really two kinds; actions against the KB itself, and actions that an object could perform (which are just a subclass of a property). For expedience I'm going to code the objectTree pruning/adding/changing stuff by hand, until I get my head wrapped around how (or if) to do this within the system. I think it's eventually important to do it in-system, so that the system can learn itself how/when/why to change how it understands the world, with some sort of patterns of relationships.

Ok, so that was a good start. Time to relax. Maybe watch a YouTube video or two. And who should come up? Ben Goertzel. Google. Now. I watched a particularly interesting Google talk that had spectacularly bad audio, but it was worth it.

The upshot was that his approach is to allow for several algorithmic expressions that keep the individual algorithms from overwhelming the system with large storage, memory, and processing requirements. A kinda handy notion for squishing all these ideas into a few 8-bit processors.

The other thing that I've long agreed with in Ben's talk is the 'artificial baby' approach, although it appears that the name 'Strong AI' has morphed at some point into AGI - Artificial General Intelligence. He also mentions that the atoms in the knowledge graph are not actually named, which I was going for, but not strict on, and there is a thought/knowledge pattern recognition portion to the talk regarding self reflection and awareness. So pretty close. But his sounds more like a hand-coded approach of several concepts into a single system, rather than using the system itself... I'll have to read more on it though.

Well, that's a busy week. Maybe by next week I'll have some code to show for it all.