Sunday, September 29, 2013

From Hardware to Software and Back Again

I haven't had much free time to advance the actual build; the next critical-path tasks on the to-do list are the PCBs that control the functions and sensors of the four steer wheels, plus two simpler ones for the two middle wheels. Knowing that this task will take several consecutive days, I've put it off until I can schedule the time for it. But I did have some time for some software...

One thing that struck me as I was toying with the language Go was that I was no longer limited to simply returning raw sensor data for higher-level operations. Most of the sensors live at the end of an I2C bus, and the model I've been using for the last year is to create registers of 32-byte buffers that represent commands and status bytes. This is terrific for small microcontrollers like the ATtiny84 and 85. And it's pretty efficient for transmitting information, since I tend to squeeze as much as I can out of every bit sent, with fixed-point numbers, bit-packed structures, and the like.

But that generally doesn't work so hot for compiled languages, where it's more sensible (and safer) to use variable types that fit the data. Pretty much all the sensors return data in some native unit, be it degrees of angle or temperature, or distances like inches and centimetres. To fit some values into a single 8-bit byte I sometimes scale the data to fit, or clip it and shift the result so it lands in a 0..255 range. The penalty is then having to remember to shift and scale it back later, or even remember what units it was in in the first place. That would be a HUGE distraction to manage if it weren't taken care of at a lower level of code. I'd rather focus on higher levels of problem solving in higher levels of code.
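As a sketch of that packing round trip, here's roughly what the scale-and-clip idea looks like in Go; the function names and the ±45° steering range are just illustrative, not from the rover code:

```go
package main

import "fmt"

// scaleToByte packs a reading from [min,max] into 0..255, clipping anything
// out of range. Quantization error is the price of the single byte.
func scaleToByte(v, min, max float64) uint8 {
	if v < min {
		v = min
	}
	if v > max {
		v = max
	}
	return uint8((v - min) / (max - min) * 255)
}

// byteToScale reverses the packing; precision lost to quantization stays lost.
func byteToScale(b uint8, min, max float64) float64 {
	return min + float64(b)/255*(max-min)
}

func main() {
	// e.g. a steering angle of -12.5 degrees in a +/-45 degree range
	b := scaleToByte(-12.5, -45, 45)
	fmt.Println(b, byteToScale(b, -45, 45))
}
```

Doing this once, down in the bus-level code, is exactly the "taken care of at a lower level" part; everything above it just sees degrees.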

That train of thought led me to wonder about what kind of overall architecture the software should have. With the shift from the smaller, slower microcontrollers to the bigger/better/stronger/faster Beaglebone Black, and the use of Go instead of 'arduino-ish' code, I was starting over from scratch anyway.

The first thing I did was write a few toy problem solvers to get used to working with Go, goroutines, and channels, and to see how it behaved in a single-threaded/single-core environment. It works. With Go 1.2 about to be released with some scheduler tweaks, it should work really well. Remember that the primary reason for using Go was the notion that it handles concurrency in goroutines with very little overhead, which makes calling blocking system-level calls like bus reads/writes a net-zero cpu-cost option. That isn't to say it makes those calls 'free' or 'instant', but the cost of calling blocking functions isn't felt by the whole system as a complete vapour-lock of cpu resources.
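A minimal sketch of that pattern, with a made-up readSensor() standing in for a blocking bus read:

```go
package main

import (
	"fmt"
	"time"
)

// readSensor stands in for a blocking I2C/bus read; the name and the fixed
// return value are invented for this sketch.
func readSensor() int {
	time.Sleep(50 * time.Millisecond) // pretend bus latency
	return 42
}

func main() {
	result := make(chan int, 1)
	go func() { result <- readSensor() }() // kick off the blocking call

	// ...the main loop keeps chugging here...

	// collect the result later, with a timeout so a dead sensor
	// can't vapour-lock the whole system
	select {
	case v := <-result:
		fmt.Println("sensor:", v)
	case <-time.After(time.Second):
		fmt.Println("sensor timed out")
	}
}
```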

As I looked at different patterns of Go code for dispatching and collecting bus-level functions, and where it would make sense (or not) to use goroutines, I decided that at the lowest level possible all the status data should be held in appropriately typed variables, even synthesizing status data from two or more sources. Similarly, I was thinking of splitting the code in half, so that all this low-level mucking about could be contained in a hardware abstraction layer (a HAL), letting me explore higher-level code with less chance of misadventure creeping into the lower levels.

That describes two basic modules, the Executive (or EXEC) and the HAL. In theory it should be possible to swap the EXEC from 'bot to 'bot and it should still work, so long as the HAL for each hardware platform exposes a similar interface for status and commanding.
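A rough sketch of what that split could look like in Go; the interface, methods, and status fields here are invented for illustration, not the real rover's API:

```go
package main

import "fmt"

// Status is a grab-bag of synthesized, unit-correct readings; the fields are
// examples only.
type Status struct {
	HeadingDeg  float64
	BatteryVolt float64
}

// HAL is the contract the EXEC programs against; any 'bot that implements it
// can host the same EXEC.
type HAL interface {
	Status() Status
	Drive(speed, headingDeg float64) error
	Stop() error
}

// simHAL is a hardware-free implementation, which doubles as the
// sandbox/simulation idea: run the EXEC with no physical robot attached.
type simHAL struct{ s Status }

func (h *simHAL) Status() Status { return h.s }
func (h *simHAL) Drive(speed, headingDeg float64) error {
	h.s.HeadingDeg = headingDeg
	return nil
}
func (h *simHAL) Stop() error { return nil }

func main() {
	var hal HAL = &simHAL{s: Status{BatteryVolt: 12.6}}
	hal.Drive(0.5, 90)
	fmt.Println(hal.Status().HeadingDeg)
}
```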

But once I write code I really hate to revisit it. And the EXEC would be complicated enough that I wanted (again) to limit the damage I could do to it over time with re-writes and debugging by making it a write-once type of adventure, while giving the user a 'playground' in which to express the highest levels of code safely. But what should that look like?

I thought initially it should be just YAML files; or maybe a less strict text document. I thought about the simplicity of the language Logo, since the basic driving functions are exactly like turtle graphics, but throw in unpredictable terrain, and it gets messy. I thought about event driven programming, perhaps like class-based or object-based systems that hook events to handlers... hmmm... ya... no. Not exactly.

I started nosing around the internets for some building blocks, and found that what I was describing is ground pretty much covered by the ideas described here (it's a kinda longish read, if you really digest it). But again ... simpler.

What's really missing here is the ability to be expressive about tasks within a 'sandbox'. In fact, it should be possible to run the code in a sandbox, without a physical robot, as a simulation.

One bit that I thought might fit something like YAML was rules, specifically a way to define polygons that represent kinds of terrain and boundary areas. In practice I'd probably write the map editor in Processing, since I've done a lot of work in that area to pre-process satellite photos, and it has a decent XML library. So maybe not YAML, but XML.

As I worked through the logic of location-based rules / geofencing, I noticed that I was defining a hierarchy of rules like "try and not drive here" or "don't ever drive here!", even "this would be a really good place to drive" (well, expressed as an array of costs at 1m resolution... anyway...). But what these really represent are examples of Asimov's "Three Laws of Robotics".

It goes like this (hit up Wikipedia if you must):

1. Don't hurt anyone
2. Don't hurt yourself
3. Follow the user's instructions

I've made it super-simple for discussion here. The idea is that it's really a drop-through sanity check on proposed actions. A proposed action (or drive segment, or whatever) must conform to these rules, which can be elaborated on.

For example, hazard avoidance is likely a good #1... so it shouldn't drive into users' ankles. This thing is metal, and it would really smart if it did.

Not continuing a tough uphill drive that's causing the wheel motors to overheat, or the battery to drain too low would be good examples of #2. Motors and batteries are expensive!

And staying in bounds and attempting to reach a goal cell would be peachy rules for #3. Actually it might be a good #2 as well, since not staying in bounds might let it roam onto a busy street...

You get the idea. There are several rules that could be built into at least #1 and #2 that we shouldn't ever violate, and that don't need to be re-written every time. Some of it could be baked into the way the lower-level routines are hooked together, either within the HAL or the sandbox wrapper, so we should never be able to cause disaster. It's probably going to end up in the library-level code.
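One way the drop-through sanity check might look in Go; the rule names and thresholds are invented for this sketch:

```go
package main

import "fmt"

// Action is a proposed drive segment, boiled down to a few example numbers.
type Action struct{ Dist, Slope, BattVolt float64 }

// Rule vetoes or passes a proposed action. Ordering in the slice is the
// priority: law #1 rules first, then #2, then #3.
type Rule struct {
	Name string
	OK   func(Action) bool
}

// check runs the rules in order and reports the first violation, so a
// proposed action drops through the whole hierarchy before it's allowed.
func check(rules []Rule, a Action) (string, bool) {
	for _, r := range rules {
		if !r.OK(a) {
			return r.Name, false
		}
	}
	return "", true
}

func main() {
	rules := []Rule{
		{"don't hurt anyone: obstacle clearance", func(a Action) bool { return a.Dist > 0.5 }},
		{"don't hurt yourself: battery floor", func(a Action) bool { return a.BattVolt > 11.0 }},
		{"follow instructions: slope limit", func(a Action) bool { return a.Slope < 20 }},
	}
	name, ok := check(rules, Action{Dist: 0.3, Slope: 5, BattVolt: 12.5})
	fmt.Println(ok, name)
}
```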

That leaves #3 - "follow instructions".

With all the major safety constraints out of the way, we should be able to simplify things to the point where the instruction set is (almost):

STOP - stop driving in any direction
MOVE - move along a path to a destination in a certain time
PIVOT - stop and turn in place to a new heading
ACTION - do other non-driving things

And that about covers it for the highest level of output commands.
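As a sketch, that instruction set might map onto a Go type like this; the payload fields are guesses at what each command would need:

```go
package main

import "fmt"

// Op is one of the four top-level output commands.
type Op int

const (
	STOP   Op = iota // stop driving in any direction
	MOVE             // move along a path to a destination in a certain time
	PIVOT            // stop and turn in place to a new heading
	ACTION           // do other non-driving things
)

func (o Op) String() string {
	return [...]string{"STOP", "MOVE", "PIVOT", "ACTION"}[o]
}

// Command carries the payload for whichever Op it names; unused fields
// are simply ignored.
type Command struct {
	Op         Op
	Path       [][2]float64 // MOVE: waypoints
	Deadline   float64      // MOVE: seconds allowed
	HeadingDeg float64      // PIVOT: new heading
	Name       string       // ACTION: which non-driving task
}

func main() {
	c := Command{Op: PIVOT, HeadingDeg: 270}
	fmt.Println(c.Op, c.HeadingDeg)
}
```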

Input isn't so easy... there is a LOT of data that could be exposed. It probably needs a namespace to keep it all sorted out. Maybe an XML-like document describing which variables to expose from which data source, and how. Hmmm...

As for language constructs, I'm all about the simple:

JUMP / RETURN - easy as pie. Except that introduces the notion of an instruction pointer.

IF / THEN / ELSE - piece of cake. Except that introduces expression evaluation.

Forth would be SO MUCH easier. A 'single stack' (not really, but that's what the user sees), RPN notation for stack values... and jump/return is simplified at the same time. Maybe Forth-like sections that could be dynamically invoked.
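For a taste of why the Forth model appeals, here's a toy single-stack RPN evaluator in Go; it only handles integer literals and three operators, nothing like a real Forth, but it shows how little machinery the user-facing model needs:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// eval runs a tiny Forth-ish RPN expression over a single int stack.
func eval(src string) int {
	var stack []int
	pop := func() int {
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v
	}
	for _, tok := range strings.Fields(src) {
		switch tok {
		case "+":
			b, a := pop(), pop()
			stack = append(stack, a+b)
		case "-":
			b, a := pop(), pop()
			stack = append(stack, a-b)
		case "*":
			b, a := pop(), pop()
			stack = append(stack, a*b)
		default:
			n, _ := strconv.Atoi(tok) // bad tokens become 0 in this toy
			stack = append(stack, n)
		}
	}
	return pop()
}

func main() {
	fmt.Println(eval("2 3 + 4 *")) // (2+3)*4
}
```

No instruction pointer, no expression grammar; the stack is the whole story, which is exactly the simplification hinted at above.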

I'll have to sleep on it; I don't actually know if this will all fit in the hardware...

I sense a multi-core cpu in the rover's future...

Sunday, September 8, 2013

A quickie on machine vision

Ok, so I have some fairly unfounded but strong feelings about how to approach machine vision, and how not to. Some of this might even be unpopular... but here goes anyway:

1. OpenCV? Nah. Everybody is using it, and everybody is getting the same mediocre results.

2. CMU-whatever? Nah. Last time I looked it was $400 for the hardware.

3. Step 1: Convert to HSV! Really? That's not very creative...

4. It takes two (or three!) cameras to do stereo vision. Wrong! (although it is handy...)

5. Machine vision is really hard because the bar is set so high by the example of the human vision system. Ummm... No. If the human vision system was so awesome we'd never mis-read anything or walk into things or make mistakes as witnesses.

I actually think the only reason the human vision system is kinda nifty is the massive volume of data we discard.

The idea being: the more data you throw away, the less you have to push through the processing pipeline, so the more time you can spend dwelling on the reduced data, and the better the end result.

Seems ok, right?

I'm also pretty sure the 'problem' of machine vision is often stated very poorly - probably because there is more than one 'problem' at hand. The underlying thing I think is missed in defining the problem is that our vision is an answer-seeking system.

Imagine if you could hear the questions your brain had to ask to get your eyes (and head and body) to look where they do in a typical 5 minute walk down the street:

-Is that a "Don't Walk" light?
-Is that car going too fast to stop?
-What is barking, is it a friendly dog?
-Don't trip on that crack in the sidewalk!
-Is the other path clear? I don't want to walk under that tree full of birds...

As a participatory member of our thought processes, vision is pretty unique. Hearing, by contrast, is more temporally transient: once the sound is gone, it's gone. With vision we can choose which things to focus on.

I set out today to code a Processing sketch to reduce a moving visual scene to simpler parameters, so I could then feed it into an optical-flow truth table (which turns out to be just a bunch of two- and three-bit ANDs), but that code isn't here yet.

But to get started I needed a toolkit so I could modify an input stream and make some pretty quick and dirty adjustments to the image to see what else I might need to do to it.

The easiest way to start was simply using Processing, since that has a video library built in, and dead-simple access to pixel-level data. I also threw in some on-screen controls of the threshold values using the controlP5 library (also dead-simple). Oh, and for source material I recorded some 640x480 scenes in a local park, with the camera height fairly low to simulate the mast height of the rover.

Here is the result after an hour of coding, you might want to flip to full-screen mode for it:

And here is the code:

// vidproc demo 0.1 - 2013 - Noel Dodd
// License terms: Creative Commons and GPL 2

import*;
import controlP5.*;

Movie vid;
ControlP5 cp5;
PImage source;       // Source temp image
PImage destination;  // Destination image
int cwidth = 640;
int cheight = 480;
int numPix;
int b_threshold = 50;   // below this brightness -> black
int w_threshold = 220;  // above this brightness -> white
int g_threshold = 50;   // channel spread below this -> grey
boolean falsecolor = false;

void setup() {
  println("Start... ... ...");
  size(640, 480);
  numPix = cwidth * cheight;
  cp5 = new ControlP5(this);
  // source = loadImage("source.jpg");
  vid = new Movie(this, "vid.MOV");;
  // The destination image is created as a blank image the same size as the source.
  source = createImage(cwidth, cheight, RGB);
  destination = createImage(cwidth, cheight, RGB);
}

void movieEvent(Movie m) {;
  // copy the frame so draw() always has an unthresholded view to overwrite
  for (int x = 0; x < cwidth; x++) {
    for (int y = 0; y < cheight; y++) {
      source.set(x, y, m.get(x, y));
    }
  }
}

void draw() {
  source.loadPixels();
  destination.loadPixels();
  for (int x = 0; x < source.width; x++) {
    for (int y = 0; y < source.height; y++) {
      int loc = x + y * source.width;
      destination.pixels[loc] = source.pixels[loc];
      // Should it be grey? (i.e. are the three channels close together?)
      float red, green, blue;
      float rg, rb, bg;
      red = red(destination.pixels[loc]);
      green = green(destination.pixels[loc]);
      blue = blue(destination.pixels[loc]);
      rg = abs(red - green);
      rb = abs(red - blue);
      bg = abs(blue - green);
      if ((rg + rb + bg) < g_threshold) {
        destination.pixels[loc] = color(127); // grey
      }
      // Test the brightness against the thresholds
      if (brightness(source.pixels[loc]) < b_threshold) {
        destination.pixels[loc] = color(0);   // black
      }
      if (brightness(source.pixels[loc]) > w_threshold) {
        destination.pixels[loc] = color(255); // white
      }
      // we can also test on colors
      if (falsecolor) {
        red = red(destination.pixels[loc]);
        green = green(destination.pixels[loc]);
        blue = blue(destination.pixels[loc]);
        // check blue
        if ((blue > red) && (blue > green)) {
          destination.pixels[loc] = color(0, 0, 255);
        }
        // check red
        if ((red > blue) && (red > green)) {
          destination.pixels[loc] = color(255, 0, 0);
        }
        // check green
        if ((green > red) && (green > blue)) {
          destination.pixels[loc] = color(0, 255, 0);
        }
      }
    }
  }
  // We changed the pixels in destination
  destination.updatePixels();
  // Display the destination
  image(destination, 0, 0);
}

Monday, September 2, 2013

Go Concurrency and Python, Part 1 - the GPS

I wanted to understand more about where / how concurrency is used in Go, so I'd know when it's appropriate to use in the rover to prevent blocking calls from stalling out important processing tasks.

First, I found this video that pretty much lays out some good ideas ("patterns") for some handy uses. It moves along at a good pace, so watching the whole thing is probably recommended:

So it does exactly what I want; goroutines can encapsulate potentially blocking calls, can have timeouts, and let the caller proceed until a place / time where the results of the call might be more useful, for example, if several things need to be initiated in one chunk, then the results gathered a short time later.

The next bit I wanted to test was access to the GPIO functions on the Beaglebone Black. Since I'm not at the point where I totally get how device tree overlays would be exposed in Go, I can use the Adafruit python library instead. The fact that the python calls are blocking is of little concern, since I can call them from a goroutine and keep chugging away in the main processing loop while they execute.
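A sketch of that idea: wrap the blocking external call in a goroutine and collect its result over a channel. On the rover this would invoke the Adafruit python helper; here plain `echo` stands in for it so the sketch runs anywhere:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// callBlocking runs an external command and returns its trimmed output.
// Substitute the real python invocation on the rover.
func callBlocking(msg string) (string, error) {
	out, err := exec.Command("echo", msg).Output()
	return strings.TrimSpace(string(out)), err
}

func main() {
	done := make(chan string, 1)
	go func() { // the blocking call runs off the main loop
		out, err := callBlocking("gpio ready")
		if err != nil {
			out = "error: " + err.Error()
		}
		done <- out
	}()
	// the main loop keeps chugging here...
	fmt.Println(<-done)
}
```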

opkg update && opkg install python-pip python-setuptools python-smbus

After a few hundred lines of console output I get only one error, which I ignore...

pkg_run_script: package "bonescript" postinst script returned status 1.

Hmmm. Big deal. Let's move on... and run the recommended test:

python -c "import Adafruit_BBIO.GPIO as GPIO; print GPIO"

<module 'Adafruit_BBIO.GPIO' from '/usr/lib/python2.7/site-packages/Adafruit_BBIO/'>

Yay! Looks ok. I'll worry about the bonescript error ... later.

One of the things I'm interested in doing is listening to one of the UARTs for the serial input from the GPS module. Originally this was supposed to be a shield (it's the Adafruit Ultimate GPS Logger), but going for the 'simpler is better' mantra I decided to just create a fake set of sockets on a PCB as a shield carrier. Should work.

The UART part of the tutorial requires the python serial module pyserial... so let's install that:

 pip install pyserial

And we get:

Downloading/unpacking pyserial
  Downloading pyserial-2.6.tar.gz (116kB): 116kB downloaded
  Running egg_info for package pyserial
    running egg_info
    warning: no files found matching 'examples/'
    warning: no files found matching 'test/'
Installing collected packages: pyserial
  Running install for pyserial
    running install
    changing mode of build/scripts-2.7/ from 644 to 755
    /usr/bin/python /tmp/
    removing /tmp/
    warning: no files found matching 'examples/'
    warning: no files found matching 'test/'
    changing mode of /usr/bin/ to 755
Successfully installed pyserial
Cleaning up...

Hmmm... more minor warnings, but it reports success. Huh. Well, let's keep going... set up the two loopback wires between two UARTs, and launch minicom in two different windows. Success! One point that isn't obvious if you're typing the commands instead of copy/pasting them: the device is ttyO1, not tty01 (it's a capital O, not a zero). Weird.

Anyway... the GPS shield can be powered from +5V, and it has a 3.3V regulator on board, so the serial lines are at 3.3V levels - safe for the Beaglebone. After some futzing I had it displaying the serial GPS fix as seen on /dev/ttyO1:

minicom -b 9600 -D /dev/ttyO1

However the terminal emulation part of minicom really botches the output. Let's just dump the port, a hint from this post helps...

root@beaglebone:/sys/kernel/debug# stty -F /dev/ttyO1 raw
root@beaglebone:/sys/kernel/debug# stty -F /dev/ttyO1 9600
root@beaglebone:/sys/kernel/debug# cat /dev/ttyO1

Ok, so we know at a device level we are getting good serial input to the port from the GPS. Nice. Let's see if we can consume that in some code.

root@beaglebone:~/pytests# cat
import Adafruit_BBIO.UART as UART
import serial

# set up the UART pins (this produces the 'lookup_uart_by_name' line in the output below)
UART.setup("UART1")

ser = serial.Serial(port="/dev/ttyO1", baudrate=9600, timeout=5)

c = 0
print "Starting read loop..."

while (c < 10) and (ser.isOpen()):
    # read up to 64 characters at a time and echo them
    print ser.read(64)
    c += 1

ser.close()

And running it we get what you'd expect, a nice easy to read display of GPS strings:

root@beaglebone:~/pytests# python
return 1 lookup_uart_by_nameStarting read loop...
$GPRMC,173321.087,V,,,,,0.00,0.0 0,020913,,,N*4E
$GPRMC,1 73322.087,V,,,,,0.00,0.00,020913,,,N*4D
$GPGGA,173323.087,,,,,0 ,0,,,M,,M,,*40
$GPRMC,173324.087,V,,,,, 0.00,0.00,020913,,,N*4B
$GPGGA,173326.0 87,,,,,0,0,,,M,,M,,*45
$GPRMC,173326.087,V,,,,,0.00,0.00,020913 ,,,N*49
$GPRMC,173327.08 7,V,,,,,0.00,0.00,020913,,,N*48

There are some obvious next steps here, like handling GPS data that starts in the middle of a string, or lines truncated for no reason other than hitting the 64-character read limit. There are plenty of open-source GPS string parsers out there, so it's really up to you how to implement it.

Since I know how I'm going to use the data (and what data to throw away) I know I'll want a few things from the GPS, like coordinates, but I want them in a certain format for later use.

For the purposes of this test I think I'll just work on some basic parsing, since what I really want to test is if I can return valid data strings from python back to a Go calling routine.
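Before handing strings back to a Go caller, it's worth validating them: NMEA 0183 sentences carry an XOR checksum of everything between the '$' and the '*', which catches exactly the torn lines shown above. A sketch in Go (the validate helper is my own naming, not from any particular parser library):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// nmeaChecksum XORs every byte of the sentence body, the standard NMEA check.
func nmeaChecksum(body string) string {
	var sum byte
	for i := 0; i < len(body); i++ {
		sum ^= body[i]
	}
	return fmt.Sprintf("%02X", sum)
}

// validate splits a raw sentence into fields after verifying its checksum,
// rejecting the torn/partial lines a raw read loop produces.
func validate(line string) ([]string, error) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, "$") {
		return nil, errors.New("no leading $")
	}
	star := strings.LastIndex(line, "*")
	if star < 0 || star+3 > len(line) {
		return nil, errors.New("no checksum")
	}
	body := line[1:star]
	if nmeaChecksum(body) != line[star+1:star+3] {
		return nil, errors.New("bad checksum")
	}
	return strings.Split(body, ","), nil
}

func main() {
	// one of the sentences from the capture above, un-torn
	fields, err := validate("$GPRMC,173321.087,V,,,,,0.00,0.00,020913,,,N*4E")
	fmt.Println(len(fields), fields[0], err)
}
```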

Sunday, September 1, 2013

Ooops. A quick update

A few weeks back I had an 'Ooops!' moment. Actually I used more colourful language, but you get the idea. While prototyping an L293D + microcontroller + DC motor as a high-torque steering servo, I found out that the thermal limit properties of the insulation on my breadboard wires were... lacking.

So the power and signal lines decided to ignite their insulation and short together, which totally fried the controller, in this case the Max32 (like a Mega, but 32-bit/80 MHz). Crap. That should teach me for using such an expensive item when really I should have been using a $3 AVR chip.

Now, without a module with enough ram and processor oomph, I had to go shopping. The Due is interesting, but it's expensive for what you get. The Raspberry Pi is likewise interesting, but it's limited in its I/O.

What about the new Beaglebone Black (BBB)? It has a LOT more I/O (in terms of I2C busses, SPI, and UARTs). And by some measures it can be 2x - 4x faster than a Raspberry Pi, since it's one generation newer with a higher clock speed. Actually, just the clock speed is impressive: moving from a 20 MHz AVR chip to 1 GHz (that's 1000 MHz! 50x faster!) opens up a lot more possibilities.

After messing around in Visio to lay out the possible connections from the original plan, but using the BBB, I found that I could eliminate several slave AVR chips. The only thing I have to pay special attention to are the voltage limits, now 3.3v for items that connect directly. The ADC limit of the BBB is even lower (grrrr) at 1.8v (or something like that), but a good chunk of the ADC has to be at 5v, so I'm going to slave the ADC measurement off to an ATtiny84 + 16-channel MUX breakout as an I2C-connected device.

Off to the store... In this case I can get a BBB locally in Calgary, at Active Electronics. The price is slightly higher there, but it's supporting a local business. When I got there I was really impressed to find a lot more of their retail space was given over to breakouts, sensors, and other modules from companies like Sparkfun and Adafruit, in addition to the kits from Solarbotics and Canakit. This is a Good Thing for us locals.

It's early days for the BBB on my desk; I started by getting it on the network and doing a full OS upgrade, then an opkg update and opkg upgrade. Also I gave a little ntp love to get the clock sync'd for now, as I have a Dallas RTC from Adafruit for the rover when it's running off-net, but I have to do the PCB layout to get the I2C busses laid out first.

Of course I already ran the blinkled test from the Cloud9 IDE, though I doubt I'll use a lot of bonescript in the race version. I could use the Adafruit python library (and I might, as a final I/O step), but one thing that's fairly exciting is a language called Go. The interesting part is that it's multithreaded, but in a way that is non-intrusive to the code (mostly). That means that instead of calling blocking routines in python, that 1 GHz processor can be busy doing other things. So it didn't take long to install Go and compile a 'hello, world' program.

On the hardware front, I plugged in an old NTSC-to-USB video capture dongle, and it was detected automatically as video0 (I'm using Angstrom, btw):

root@beaglebone:~# lsusb
Bus 001 Device 002: ID 0573:0003 Zoran Co. Personal Media Division (Nogatech) USBGear USBG-V1
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

Video4Linux is also present, and it works as well. I have yet to solder the connections up, but the idea is to take NTSC video and both broadcast it back to a base station (when wifi isn't running, as it uses the same frequency band) and suck in video frames for machine vision. Since some of the links I've found on getting video input to work use the OpenCV library, this youtube video from Kyle Hounslow is particularly interesting: it's exactly what I expected to have to do for coloured obstacle and goal detection in terms of image manipulation.

I still have to poke around a bit to get an idea of how it would all look, but seeing what's mapped in /sys/devices/platform/omapfb/subsystem/devices is an interesting start, since it looks like a whole whack of things are set up by default.

I have no idea if this means I can do GPIO manipulation from go by tweaking files mapped to devices, or if there is a better way.