The Unorthodox Engineers: March 2014

Saturday, March 22, 2014

"LeoDroid" Design Decisions

I should explain a few of the design decisions I made with LeoDroid. First, DC Power!

Power Systems

Notice the USB "Power Bank" on LeoDroid's back? That's an 8500mAh NiMH battery capable of delivering 1.5 Amps. They're designed to recharge mobile phones, and come cheap from eBay. You can't build one cheaper, trust me.

If you have built power systems before and didn't know about these brilliant new devices, all you need are these magical words; "5.2V @ 1.5A DC step-up regulator, built-in USB charger, all matched to the battery, less than twenty bucks."

They can be plugged into any standard USB charger. They can charge each other. 5.2V drops to exactly 5V after a Schottky protection diode or LDO regulator. 1.5 amps is enough to spin robot motors.

Unreliable power is one of those things that can ruin your whole day. Buggy self-made power supplies can screw up your whole month. These power banks can be bought for less, in many cases, than the naked battery.

They come in a variety of shapes, colours, sizes, and optional features like solar cells. If one breaks or dies, you just plug in another. You don't have to care about battery chemistry, charge curves, or special connectors. If you use USB-powered Arduinos, there's no soldering required; just unplug from the PC, attach the USB Power Bank, and away you go.

Power Switch

Note well the power switch! The use of a DPDT switch is not accidental - the idea is that when turned on, the battery pack power is connected to the Arduino and Motor Driver - but when turned off, those systems are isolated from each other.

See it? Over to the right, above the wheel well.

To understand why, notice that the Arduino's USB port isn't connected to the battery, (which would be easy with standard cables) instead we route the power to the RAW pin. This arrangement is absolutely necessary because we can't drive enough power through the Arduino USB port to turn high-current motors. It's not a matter of wire gauge - it's the protection polyfuse. If you try to pull more than 500mA through that fuse, it will shut down.

So we have to run power to the motor driver directly, which means soldering wires. And that same input power has to get to the Arduino, because we're using its on-board regulator to supply the digital circuits with 5V. Power just can't go through the Arduino's USB connector on the way to the motors.

Unfortunately, a problem arises because we occasionally want to plug the Arduino into a desktop PC to upload new program code. We really don't want the battery turned on when doing this, because it's possible that the battery pack (if the PC is underpowered or defective) could feed power INTO the computer and burn out ports, even with the polyfuse. Or the PC could try to charge the battery bank backwards, via the wrong port. Don't cross the streams.

If you plug your PC into one side and the battery into the other, power can flow ALL the ways.

If you unplug the battery to prevent this but leave the microcontroller and motors driver connected, the host PC will need to supply all your Droids' needs - including the motors if they happen to suddenly power on because you're debugging, and that will trip the polyfuse.

If you've got good equipment, this will merely reboot the board. If.

With the double-pole switch, "on" means the battery is connected to both controller and motors, and "off" means the controller is safely isolated for programming.

Throw ze switch, Igor.

If you use a single-pole switch (or none at all) then you need to be damn sure your equipment (and everything you plug it into) can survive the conditions. Be aware that I have seen USB ports that burst into flame because someone plugged a double-headed cable in twice to the same PC.

Another simple option is to not use the RAW pin but hack up a USB cable so it splits to supply the motor driver as well. Therefore, you'd be forced to unplug the battery USB connector before plugging in the programming USB cable to the PC.

The most foolproof solution, for the usual class of fool.
It's still a switch, it's just harder to flip.

If you think about it, you would be 'manually' performing the action of the DPDT switch by having a mutually exclusive 'power cable' and 'programming cable'. If you intend to do software development, plugging and unplugging over and over can be annoying, and physically break things. Hence the actual switch. Of course, you could leave the switch in the wrong position...

If you don't intend to connect the Arduino to a PC ever again after emplacing it, then a single pole switch is probably fine. But why not do it properly?

(Did I hear someone say "diodes"? Sure, you could use diodes. They'd better be expensive schottky ones, or the voltage drop will take the circuit out of spec. Even good ones are wasting 5% of your power.)

Power Loom

This is such a simple and obvious concept - have a PCB or connector block that is the central power loom. With dupont connectors, I use old 2-row IDC connectors (such as the IDE connectors off dead hard drives) and solder all the pins in each row together to create two 'rails'. Then all the connectors plug into that. It's basically a 10-socket mini power board.

Like this, only tiny.

The alternative, which I see all the time, is the 'daisy chain' that develops when you run power wires to the nearest other powered device.

The 'loom' is not only better electrically (to prevent ground loops) but it makes visually checking hook-up polarity straightforward - if all the black wires are on one side, and all the red on the other, then you're obviously good at the loom end. (This is how we battle Murphy's law.)

It also means you can quickly isolate individual devices without worrying about chained devices. And you can untangle wires easier.

"LeoDroid" photos

Hooray, pics uploaded at last. Here is my latest robot, based around an Arduino Leonardo: I'm just about to pull it all apart again to do the final trim and shaping of plastic bits, and add The LASER.

Hi there.

Looking pensively into the distance.

The NiMh battery pack has a solar panel built-in, and runs for hours.

Also, can recharge your phone in a pinch.

The thumbstick controls a pointer on the screen

...which controls the Integrated Programming Environment

Under the hood, kinda looks like a prop from the technicolour 60's.

The actual computer is nearly invisible. It's all really just different lengths of coloured wire.

Release: Fritzing design files for "LeoDroid"

Here are the "Fritzing" design files for the Arduino Leonardo-based robot that is sitting on my desk. I'll be posting some photos shortly, and explaining some of the design decisions.

Common Features:

Atmega32u4 based design
128x160 pixel TFT display
Thumbstick
Ultrasonic Ranger
IR Remote Control Decoder
Magnetometer Compass
Sound Effects
PWM Motor H-Bridge control
PWM LED/Laser output

I've added all related files (ritzing designs, exported images, and library parts) to the GitHub repository: https://github.com/unorthodox-engineers/unorthodox-arduino/

Pro Micro Variant

This is the target hardware I developed for, the 16Mhz 5V variant of SparkFun's Pro Micro.

It might look as though there is one pin free: D14. Nope. Once SPI is switched on, it gets locked as MISO and can't be used for general I/O, even though our SPI Apprentice device doesn't have a data output line. Every single pin is used.

Arduino Leonardo Variant

The genuine Arduino "Leonardo" is completely compatible with the Pro Micro, and even has a few extra pins to play with. (though not as many as you think)

Remember that SDA and SCL are 'aliases' of pins D2 and D3, so we really only have D11, D12, A4 and A5 unused. (Though that is enough for many things!) And on the Leo, the SPI bus is on the ICSP header (only) so make sure you have the right cables.

I'll try to post errata here, but the files on GitHub will always be the 'canonical' ones. Make sure to check those before you start building. :-)

Parts List

"Rover 5" robotic base or similar
Atmega32u4 based "Arduino Leonardo", "SparkFun Pro Micro" or compatible clone.
ST7735 based 128x160 1.8" TFT LCD Screen

ILI9325C drivers also exist, but the screen layout would need work

Thumbstick breakout board (X+Y+Switch)
L298N Motor Driver
HMC5883L 3-axis magnetometer breakout board
TSOP312 or compatible IR Receiver/Decoder
HC-SR04 Ultrasonic Ranger
Piezo Speaker
LED/LASER module
100nF capacitor
2.2k resistor
Various lengths of coloured wire.

Thursday, March 20, 2014

Review: Hack (The Facebook Language)

I design new computer languages, when I can't avoid it. It's something I've been doing since before my Honors Thesis, in which I created a distributed language that allowed programs to dart and jump from machine to machine, architecture independent, at will.

So when someone announces a "Major New Language", I tend to have a look-in. When Facebook announces a new language, well...

Here's the thing. I don't want to be mean, but it's going to come out, because some of the criticisms I have border on cruel and unusual. In fact the more I read the spec, the more I wonder what they were thinking...

Actually, I do know what they were thinking. They wanted to improve PHP. That is a laudable goal, but frankly, and this is going to sound mean, the fastest way to do that is to stop using it. Hack is designed to solve problems which are central to Facebook's business, but irrelevant to the majority of platforms and users that run PHP. For those platforms, PHP's shortcoming are not a bug, they are a feature.

Case in point: Async. Hack has some lovely Awaitable metas (sort of similar to generics) which you stick on a function and when you call it, you instantly get back a kind of 'handle' to the result which should be available eventually.

That's nice. It makes parallelizing parts of the code really simple.

But it's not that you couldn't already do something similar in PHP (granted, clunky as hell) it's just that most Hosting providers had the PCE (Process Control Extensions) turned off, or would suspend your account if you called too many long-running shell scripts, besides killing your script after 30 seconds of runtime, no matter what you set the timeout to.

Hack Async solves the problem internally for Facebook where they trust all their code, but doesn't do a thing for Dreamhost or Gator clients. In fact it guarantees they can't run it, because even the simplest script can now balloon out into infinite parallel threads, and there's no resource management to limit it.

Also, easy creation of parallel algorithms means easy creation of deadlock problems.. where are the tools to solve those?

Other issues that are important to a company with a huge monolithic codebase are static typing and annotations (check) immutable database tuples (check) generics (check) collections (check) and UML modelling. (um, no!)

Then there are missed opportunities. Traits are broken for the one thing they should be useful for - macros which prevent copypasta and tortuous subclassing in cases where even generics can't cope. It could have been a "literate programming" primitive. Now I don't know what it's for. "Implementing Static Interfaces" apparently.

What really amazes me is that "Lambda Expressions" are more than halfway down the list. "Nullables" (the ability to make static types include 'null') is given more prominence. That's like casually mentioning "oh, and we've got warp drive." after explaining the many benefits of the cup-holder.

Probably because, like warp drive, it is very hard to retrofit after-the-fact and make it completely functional. Javascript had it built into its core, and still had difficulties.

There are some genuinely nice things; Iterators and Continuations are great (I'm still waiting for them in Javascript, ahem!) although frankly very few people know how to use Continuations, and Iterators are just a standard class interface, and I feel another missed opportunity to combine two things which are really the same thing.

But none of that addresses my original point; which is that Hosting providers chose PHP because it was limited and constrained, and could be used to sandbox user accounts on a shared machine. Everything in Hack was already available in C++ and UNIX, but we were not allowed that full power, because abuses (or bad code) brought down the system for everyone, partly because hosting providers are quite terrible sysadmins.

But also because our languages are really bad at allocating finite resources like memory, bandwidth, and CPU power among sub-processes. Or informing us how much is available, or being consumed by our own processes. But that's another discussion...

Because PHP was largely cut-off from the OS, (deliberately!) Hack has incorporated large chunks of the OS inside itself to make that interface frictionless and performant. Switching on all that power will go wonderfully inside the walled garden of the Facebook server farm and other such corporate installations, but it doesn't address anyone else's needs.

This is an heroic effort to get Facebook out of the language dead-end they had painted themselves into, a way to transition from PHP to C++ incrementally, without re-writing the whole damn codebase. Because they're right: PHP starts breaking down when you get too large and complex. I'm amazed they've pushed it this far. It's designed to spam out a web page in 100ms, not multithread a distributed computation. Fortunately, Turing Complete means what it says on the box.

But if you're looking for a cool server-side language for your next project, I'd still go with node.js, unless you too have a million lines of legacy PHP hung around your neck like a dead albatross.

That's not the most resounding accolade, granted, but at least Hack has a definite use-case. (More than can be said for many languages) For PHP programmers with a formal background, it offers a partial escape from our sad and sorry lives. Well, if the sysadmin would just install it...

Monday, March 17, 2014

Overview: Arduino Graphical User Interface classes

Part of last week's release was a very hacked-together GUI for my programmable droid interface. Here's a couple of screenshots that I've posted before:

Not shown is the thumb-stick that controls the square cursor that you can see near the middle of the screen. The basic features are:

point-and-click style interface using thumbstick.
line-oriented layout that favors runs of text symbols
clicks initiate immediate action, or enter menu mode where moving the joystick will modify the item, after which clicking again will commit the choice.
redraw manager maintains a list of updated regions, and can be called incrementally by the main loop.
screens can be 'paged', so that the cursor is either limited to the edge of the screen, or navigates over to a new page, or a different screen.

If you take a look in <unorthodox_droid_raster.h>, you'll find a set of 'Pager' classes that are 'bound' to the Raster classes. (it asks the raster classes to draw things) These are refactors of the older <unorthodox_droid_gfx.h> classes that were bound to the Adafruit GFX library.

I have no intention of 'abstracting' them both - the raster classes are the only ones in use, but the older classes are left as examples and prototypes for people who have already committed to the GFX library, or who perhaps want to port the classes to a third hardware/driver library.

Frankly, the classes aren't really important except as examples. That's going to be the hardest part of this explanation: my 'library' isn't what you expect. There's really not a lot of code there - which is the point. What I have is a way of breaking down the problem so that custom GUIs can be created within the very hard limitations of the Arduino.

Here's a couple of things to consider, issues which make this particular version of the problem different from 'classic' cases:

Low screen bandwidth - because the screen is usually on the end of a serial link, instead of dual-port SRAM on a video card, simply clearing the screen can take over a second.
Write-only - the screens I work with have no way to read pixels back from the display.
Memory mismatch - Ardino total RAM is 2.5K. The bytes needed to display a 24-bit 320x256 image is 245K. The Arduino would need 100 times more RAM to keep a local 'copy' of what is on the screen.

If we assume our UI is made up from some 'fixed' elements and some 'variable' bits, what we're really asking is to create a structure that fits within the tiny 2.5K Arduino RAM, that we can modify, which will somehow be able to repaint that entire 245K screen surface.

There's a name for trying to keep something big in a small space: Compression.

Therefore the real secret to building an Arduino GUI is not about widget toolkits and pen styles, it's going to be about efficiently accessing compressed data.

The moment we have a cursor pointer moving around, we're going to have to redraw little chunks of the screen where it is, and when it leaves. We have to be able to efficiently reach into the structure to redraw line spans when the cursor moves, or text content changes.

Compression is all about context. The briefest recorded telegram conversation consisted of one symbol sent by each participant. The first sent "?", and the second replied "!". In their context it made perfect sense, and couldn't have been clearer. If you haven't heard the story, then it is less so.

So our first layer of compression is to cut down the problem space in a way that 'everybody knows about', but that doesn't limit us too much. Since most of the 'interfaces' that I needed to build were essentially various kinds of configuration menus, and menus are lists of text items, it was pretty clear I could build such an interface out of "lines of text" and not much else. (Curses, you say.)

Font glyphs (especially european ones) can be compactly represented in 6x8 pixel blocks, or even 5x7 if you always assume a 'spacer' pixel between them. 8x8 might sound neater, but the aspect ratio of 6x8 glyphs is a lot nicer. 8x8 fonts look very 'fat', and fit less per line on small screens.

If we assume our screen is made of a grid of 6x8 pixel blocks, each one of which might contain a character, then we vastly simplify our data load from say 320x256x3 bytes (RGB pixels) to 53x32x2 bytes (assuming one character glyph and 'style' byte per cell)

Thus, keeping a coloured fixed-font grid would require 3.3K of memory. Still too much.

Wow. We can't even keep a full-screen character grid. Even the cheapest microcomputer of the 80's could do that.

We make another assumption: on most of our 'menu based' screen, there are going to be lists of things. "lists" by definition are a bunch of lines that have a similar format. If they have the same general format, but different 'drop-in' values, then that's essentially good old printf().

In the screenshots shown at the top, you can squint and notice that most screens only have two 'line formats' - the header line, and the bulk list lines. (empty lines not counted) and in fact the classes that back these screens have a very simple 'redraw' handler which tests if the row is <1 (in which case it draw the header) or >1 (in which case it subtracts two, adds the page offset, and draws that item number)

These classes are not filled with data structures that are parsed by an outside framework - they are full of if/then and case statements that hard-code the screen layout into compiled instructions - the fastest possible way to render.

The RasterPager class is the base class that each screen is built from. (What might be called a 'screen' is internally referred to as a Pager, because it 'knows how to draw a page'.)

Pagers are asked by the render manager to draw RasterSpan objects which are essentially defined as a sequential span of characters on a single text line, plus some raster context. "line 10, chars 8-17" is a typical span.

The draw manager has a list of two bytes for each line to record the 'invalid' span. (how much we need to redraw next time - which may be nothing) If the manager is asked to 'invalidate' more spans on a line which already has one, the existing span is extended to include those characters as well.

When the draw manager is given a 'time slice' to spend on rendering pixels, it finds the next line which has a span that needs to be drawn, and asks the Pager to redraw it.

The draw manager has no idea how the span is going to be painted - it might have character glyphs drawn into it, or just pretty patterns - it just knows which parts of which lines are waiting to be done. When it comes time to do the job, the draw manager passes the span request on to the Pager class.

So when the Pager gets the request to draw parts of 'line 0' (for example) it can resolve this in any way it likes. Case statement, array lookup, random number generator, whatever. In many ways, the Pager classes are like HTML 'stylesheet' classes; they mostly contain formatting information.

Because we are interested in rendering lines of coloured text, the RasterPager class has a method called format_span() which works very similarly to printf() in that it has a fixed "format string" parameter (compactly stored in program flash memory) and a set of variable parameters in an array.

format_span() is written to efficiently render sub-spans of the entire line - it can efficiently skip over parts that aren't involved, and doesn't buffer. It can extract single digits from the middle of decimal-formatted and aligned numeric values, and use single parameters multiple times in multiple ways, as decimal values, style indexes, or string selectors.

Here's the top two line definitions used by the SignalPager screen shown. (there's a debug footer I've omitted) It looks bulky in source, but the first string is only 14 bytes long (compiled) and the second is 26 bytes.

// title line

PROGMEM prog_uint8_t SignalPager_title_line[] = {

// fragment metadata

RasterPager::Span | 6,

RasterPager::Span | 9 | RasterPager::Last,

// fragment instructions

RasterPager::Lit | 1, ' ',

RasterPager::Lit | 9, 'X','-','D','R','O','I','D',' ','1'

};

// list line

PROGMEM prog_uint8_t SignalPager_signal_line[] = {

// fragment metadata

RasterPager::Style | RasterPager::Vector | 5, RasterPager::Span | 2,

RasterPager::Style | 6, RasterPager::Span | 4,

RasterPager::Style | 3, RasterPager::Span | 2,

RasterPager::Style | 0, RasterPager::Span | 7,

RasterPager::Style | 3, RasterPager::Span | 5,

RasterPager::Style | RasterPager::Vector | 3, RasterPager::Span | 1 | RasterPager::Last,

// fragment instructions

RasterPager::Lit | 2, '<',' ',

RasterPager::Dec | 0,

RasterPager::Lit | 2, ' ','(',

RasterPager::Dec | 1,

RasterPager::Lit | 3, ' ',')',' ',

RasterPager::Lit | 1, '>'

};

Here's the overloaded draw_span() method on that class, which selects which of those formats to use (based on the line) and then stuffs the vector with the relevant parameters, and then calls format_span().

void draw_span(RasterSpan * s) {

int row = s->row;

prog_uint8_t * line_format = 0;

int line_vector[8];

if(row==0) {

line_format = SignalPager_title_line;

} else if(row==1) {

} else if(row<18) {

int line = row + scroll_y*16 - header;

line_format = SignalPager_signal_line;

line_vector[0] = line;

line_vector[1] = droid->signal_value[line];

line_vector[2] = droid->fs.token_size(line&0x7F);

line_vector[3] = line_vector[2] ? 2 : 3;

line_vector[4] = droid->fs.token_size((line&0x3F)|0x80);

line_vector[5] = (line<64) ? (line_vector[4] ? 2 : 3) : 7;

}

format_span(s, line_format, line_vector);

}

Note that apart from the function calls to obtain the parameter values, (which we have to do anyway) the code essentially consists of stuffing a local fixed array with values (very cheap) and then calling one method. Remember that function calls have overhead, and can incur anywhere from a dozen to hundreds of bytes of compiled program code. Calling the raster equivalent of drawText() and setColor() for each parameter directly would clearly take more code bytes - the less calls we need to make, the better. One is pretty minimal. One shared is even better.

The chain now goes: the RasterDroid draw manager knows which bits of which lines to redraw, so it incrementally calls the pager draw_span() for each one, and the pager passes that span request (along with the appropriate line definition and looked-up parameters) over to format_span(), which then draws individual font symbols to the raster device using draw_text() as needed. Whew!

This is how we decompress the instruction 'redraw span 8-10 on line 3' into pixels which hit the screen. Not with data structures - because they don't fit in RAM - but with pure code. Our program flash budget is much bigger, so we depend on that instead. Also, by compiling the code we create the fastest possible execution path, optimized for each 'screen' while sharing the common 'drawing primitive' they all use.

This is all totally backwards, when compared with how a UI toolkit like Windows or Aqua or GTK does things - with abstract widgets. But abstraction is a luxury that, on our memory budget, we can't afford. Knowing our context means abandoning polymorphism, to some degree, by "hardcoding magic numbers", or in this case; format strings and click positions.

Although... since the code is so structured, there's no reason why a 'widget layout' system couldn't be created which code-generates these classes. That system could be full of interesting widget code that aligns columns or draws ASCII art, but that can all be done pre-compile.

Sunday, March 9, 2014

Two At Twice The Price

There's a line in "Contact", spoken by Hadden, that I have taken as a personal mantra:

"First rule in government spending; why build one when you can have two, at twice the price."

That's just before his other immortal line. (And just after some really bad zero-G special effects)

A good friend of mine who studies governments once told me that while it sounds stupidly obvious, governments often alter the price of a service merely by trying to use it, and it doesn't always get cheaper in bulk. If you're in government, "two at twice the price" is a bargain.

The same thing happens in computer programming, for much the same reason - overheads. It's not just the transactions, the framework containing them has to be accounted for.

Here's a little hint that has made vast wodges of my programming more efficient: don't pass one, when you can pass an array.

A case in point would be my new Raster classes: The 'primitive' method for modifying the screen contents is a function called fragment() which takes a linear array of pixels, and paints the strip to the surface. My equivalent of drawPixel() stuffs the colour value into a one-pixel array, and then calls fragment().

Therefore, drawing a single pixel is actually a special case of drawing n pixels. Not the other way around.

So what? The difference is between calling a single function that internally loops around, (keeping all its state intact) or having a loop that calls the function n times. Those function calls aren't free, you know. And you'll rarely update only one pixel total. That's why my font routines are twice as fast.

Write the array case first. Optimize it. Then build the easy-use single-case version on top of that, because if the user is yanking single entries they don't care about performance. If they care, they'll be calling the array version to fast-fill buffers - and if that is merely a loop that calls the single case over and over, you've fooled them good and lost an opportunity for real performance.

Don't do one when you can do n at n times the price. Because the price gets worse when you lose scope or context, so build a 'vector machine' which amortizes the scope costs over as much data as possible. That's how Seymour Cray rolled.

Docs: Raster classes

Another part of the code I just released is a set of "Raster" classes and low-level device drivers for TFT/LCD screens connected to the Arduino. I use them to display my miniature IDE/GUI. They are, in many ways, a re-write of the excellent Adafruit GFX library. (Though with half as many devices, and missing some of the primitives)

Raster
Abstract base driver class for drawing pixel data

Raster8 (Raster)
8-bit colour interface
Raster16 (Raster)
16-bit colour interface

ILI9325C (Raster16)
16-bit colour driver for ILI9325. (subclassed by 'local' drivers)
ILI9325C_Pins (ILI9325C)
example driver variant using digitalWrite to set pins (slow, but general)
ILI9325C_Leo (ILI9325C)
example driver variant using direct port access (fast, but Leonardo-specific)

ST7735
abstract base class for ST7735 24-bit LCD driver

ST7735_SPI (Raster16, SPIDevice, ST7735)
driver variant using 16-bit colour data and SPI (subclassed by 'local' drivers)

RasterDraw
Vector drawing primitives

RasterDraw16 (RasterDraw)
16-bit optimized versions of drawing primitives

RasterFont
Font drawing primitives

RasterFont16 (RasterFont)
16-bit optimized versions of font primitives

Why on earth would I rewrite the GFX Classes? That's a major question. And one that needs an answer, because I've essentially forked the effort. There needs to be a good reason.

One reason is "244 bytes", which is the difference in compiled size between Raster and GFX code paths in my "raster_test" demo. Raster is smaller, (of course) and 240 bytes is not a trivial amount.
Another reason is "1,280 bytes", which is the size of the GFX font data block. You can't change, or cut down this font without messing with the library files and affecting all you projects. Very few projects use the entire font. Raster lets you use custom fonts.
Another is "200%", which is about how much faster the font-rendering code in Raster is compared with GFX. (Although GFX beats mine for rectangle/screen clearing)
But the primary reason is this: to make driver writing easier, while optimizing performance and code size.

One big thing to remember about Arduino development: less methods == less space. Functions, especially virtual methods on objects, are not cheap. Let's look at the core drawing methods required to implement a new GFX 'hardware driver':

fillScreen(uint16_t color)
drawPixel(int16_t x, int16_t y, uint16_t color)
drawFastVLine(int16_t x, int16_t y, int16_t h, uint16_t color)
drawFastHLine(int16_t x, int16_t y, int16_t w, uint16_t color)
fillRect(int16_t x, int16_t y, int16_t w, int16_t h, uint16_t color)

The way this is done is by extending the GFX base class into a Device Driver class and overriding these methods. Several screen drivers are subclassed in this way:

       Adafruit_GFX
       |
      --------------------------------------
   | | | | |
ST7735 TFTLCD PCD8544 SD1331 HX8340B

Once those methods are overloaded, they are used by all the other GFX 'drawing primitives' like drawTriangle() and fillCircle(). The font rendering code in particular makes a lot of drawPixel() calls.

What's wrong with that? Let's say you want to add a new method like 'draw_bezier'. Where do you add it? If you subclass GFX and add it there, none of the 'driver' classes will inherit the behaviour. You would have to modify the original Adafruit source, or extend every driver class.

If we want to be able to extend GFX, then it can't be an ancestor class of the drivers. Also, it would be really nice to cut down the number of virtual methods to the absolute minimum. One would be ideal. Less code means less bugs.

This is where we separate the concepts of a "Raster Device", and a "Rasterizer". The device must be persistent, since it relates to physical hardware. But the chunks of code that draw pretty pictures to the device (whether lines or curves or fonts) should be able to come and go independently - even be applied to multiple devices.

It's also probably a good idea to acknowledge that some raster devices are 8 bit, and others are 16 bit. (And leave room for other depths) Sure it would be nice to abstract that, but the hit to performance is pretty intense. Abstraction is the enemy of performance. It's always faster when you know what you're doing.

While we're at it, we might as well split the 'drawing' and 'font' code so they're more granular, similar to the HTML5 Canvas concept of 'contexts'. Maybe there will be a 3D rasterizer some day...

Raster RasterDraw RasterFont
| | |
----------- ===== RasterDraw16 ==== RasterFont16
| | // | |
Raster8 Raster16 <=== --------- ---------
| | | | | | |
--------- (new primitives) (user fonts)
| |
ST7735 ILI9325C
| |
-------------
| | |
(local drivers)

The trade-off is that we now have to instantiate extra objects and connect them together - first the Raster device, then that device gets wrapped in the 'Draw' and 'Font' objects. One advantage that gives; if memory gets really, really tight, we can tear-down the 'draw' objects (and re-instantiate them later) and leave the device driver loaded.

Raster Drivers only need to implement one virtual method to draw pixels:

fragment(word x, word y, byte dir, Page * pixels, word index, word count)

This method draws a 'strip' of pixels to the surface, starting at a position, and extending in one of the four cardinal directions (up, right, down, left) for a specific count of pixels. The raw pixel data is read from a virtual memory page, and assumed to be in the correct format for the device.

Most LCD driver chips implement the equivalent of this function in hardware, because they're good about rotations, and avoid having a preferred orientation. So we're mapping directly to hardware capabilities, not human abstractions.

It should be pretty clear that this one method can draw single pixels, short line segments, and therefore everything else. This interacts with my virtual memory classes - the page of data might be a WordPage which repeats the same value for the entire block. (creating flat colour) Or it may be a MemoryPage array of pixel values corresponding to part of a font character or bitmap image.

The 'direction' parameter means it is equivalent to the drawFastVLine() and drawFastHLine() methods as well as drawPixel(), so we've got three methods for the price of one.

The only ones left are fillScreen() and fillRect(), which are trivially implemented by calling fragment() once per row. And that code is logically shifted over to the RasterDraw class, to be extended with all kinds of new ways of drawing rectangles with frilly edges.

So, people writing hardware drivers only need to implement one abstract method. (ok, the colour lookup makes two...) And once that method is tested and optimal, any separately-written custom rendering code can wrap around that new driver. Driver and draw classes don't have to be descended from each other, or even know about each others' existence. (Important if you are creating local subclasses)

I'm not sure I can make it any more optimal than that.

It's not totally done, of course. RasterDraw doesn't do circles or arbitrary lines yet, simply because I haven't had a need, or the time. (It would be trivial to port Bresenhams algorithms, but I wonder if a bezier curve algorithm might not be more generalized...) But the interface 'contract' is in place, so that doesn't have to be my job anymore.

Share and Enjoy.

... Afterward On Colour Depth

Arduino code doesn't have a concept of 'Variant', and we're not going to create a generalized 'Color' class because performance will drop through the floor if we have to create and destroy an object per pixel.

The bit depth is largely an issue of how the rest of the code stores and presents colour data. Even if a 16-bit colour screen was suddenly attached, the program can't invent more depth for 8-bit images in storage, it can only convert the pixel format. That could be done by creating a 'virtual' Raster device that has a fragment() method that converts the data before passing it on to a 'real' driver. Such 'adaptor drivers' (actually done mostly using an 'adaptor Page' for the raw data conversion) are pretty easy to make, again because only one method needs to be proxied.

In fact, there can be advantages to internally 'pretending' everything is only 8-bit (or mono) right until it gets to the driver (which might be 24 or 32 bits) - reduced storage and buffer costs mostly, even if we don't get a speed boost because of conversion costs.

Docs: JournalFS and TokenFS classes.

I released a chunk of Arduino code this week, and one of the classes is a "journalling" (or "log-structured") filesystem for extremely small Flash/EEPROM storage, using absolutely minimal resources, in about three hundred lines of code.

Compiled size is about 660 bytes for an instance of TokenFS. The virtual memory classes add about 930 bytes more, if you're starting from naught, but I think that includes EEPROM.h and some other system dependencies too.

When I say 'extremely small', I mean the 1024 bytes of EEPROM in the Atmega32U4 inside the Arduino. One kilobyte of storage, total. That's smaller than the minimum block size of many other filesystems.

The primary job of such a filesystem is to reduce degradation of the storage, by smoothing out the write access pattern so damage is spread evenly over the whole chip, and not concentrated in a few 'hot blocks'. Flash storage is generally guaranteed 10,000 'write cycles' before media errors start to occur. That's per byte location, so the perfect ideal is to have every byte on the storage written 9,999 times before you have to tick over the 10K odometer on any of them.

It's like having a blackboard, but where the eraser is made from sandpaper. It works - but not forever.

One nice thing: the Atmel AVR's EEPROM (and program flash on some variants) is 'hardened' to endure 100K write cycles, which partly makes up for the tiny size.

JournalFS

Levelled write distribution over entire storage.
Resistant to corruption, power fails, and zero space issues.
Guaranteed success if volume free space equals or exceeds block size.
Minimal RAM use - head and tail pointer state only. (a dozen bytes)
Performs automatic incremental and 'emergency' defragmenting.
Basic journal - only understands blocks
256 bytes per block maximum (currently)
5 bytes per block overhead, 100% storage utilization.
64K storage size limit.
Storage is scanned/checked on mount, twice.
Tested on AVR Arduinos, but should work on SAM machines as well.
Virtual Memory classes mean filesystems can be stored in RAM, EEPROM, Program Flash (read only currently), or user-defined storage like SD cards.

Built on top of the base class is what most people think of as a 'filesystem', which is an indexed catalog of things in the storage. In this case, a single byte 'token' key is used instead of a filename string.

TokenFS

Adds 'byte token' index on top of JournalFS.
Index size passed as instance parameter - maximum 256 'entries'.
Moderate RAM use: the index requires one word (two bytes) per entry. (max 512 bytes)
Writing data block for a key replaces existing data.
Writing empty blocks for a key is equivalent to 'file deletion'
Guaranteed success if replacement block is same size or smaller than previous version.
No practical difference between 'missing entries' and 'zero length files' - all keys return a result.
Test cases wrote 200,000,000 random 'file updates' to a 1K RAMDrive without unexpected data loss - exceeding Flash hardware reliability by 1,000 times.

What's the difference? JournalFS takes care of the utterly 'low' level details, but tries not to worry too much about what it's storing. By itself it would be perfect for diary-like 'logfiles' where you just append another record every once in a while, and throw away the oldest to make room.

TokenFS assumes the first byte stored in each block is a 'token key', and maintains an in-RAM index of all the keys it finds. The index is a catalog of pointers to the storage location where the 'latest' version of that key is kept. This is kind of like the logfile, but instead of throwing away the global oldest, it's smarter about throwing away obsolete blocks for individual keys.

TokenFS is the exemplar for how to extend JournalFS by overriding block_state(), which is how JournalFS notifies descendants about blocks found, and asks if they are still valid.

I need to make it very clear that this "Filesystem" will not scale. It is not intended for large storage devices. The primary issue is the need to scan the filesystem on mount to 'catch up' with the journal - which means reading the headers of all the blocks in storage. When you have a ridiculously small amount of storage, that's fine. If you have megabytes, that time becomes significant.

But if you need to store a bunch of 'configuration options' in an Arduino that are independent of the program flash, this is the way.

Usage

Loading the library and defining some storage is pretty easy. This is how the RAMDrive is initialized in the filesystem test.

#include <unorthodox.h> // include the library, of course

byte storage[1024]; // create a 1024 byte buffer

MemoryPage store(storage); // wrap it in a virtual page

TokenFS fs(&store,1024,192); // 192-entry tokenfs

In most cases you'll be using EEPROM, which is even simpler.

  #include <EEPROM.h> // do this first
  #include <unorthodox.h> // to switch on more stuff in here

EEPROMPage eeprom(0); // wrap EEPROM in a virtual page

TokenFS fs(&eeprom, 1024, entries); // start the token index

If you wanted to use only a portion (say, leaving the first 128 bytes free) you can do so:

EEPROMPage eeprom(128); // start at byte 128

TokenFS fs(&eeprom, 1024-128, entries); // use until the end

Writing TokenFS entries is currently a little clumsy, for various reasons. To avoid re-packing the block later, we prepend the token byte ourselves. (It has been a challenge making it 'seem right' but also have good performance.)

byte block[size+1];

// manually prepend the token byte

block[0] = token;

// fill the rest of the block with data:

// ....

// wrap the array in virtual memory

MemoryPage page(block);

// write it to the filesystem

if( fs.token_write(token, &page, size+1) ) {

// it fit. yay!

} else {

// it didn't. boo!

Serial.print("\n Write Failed ");

}

// page is automatically freed when we leave scope

Reading entries back from TokenFS is the most straightforward, token_page() will return a virtual memory page containing the payload content, (or null) and token_size() will tell you how much of it you should read. The token byte is not included.

int size = fs.token_size(token);

if(size) {

Page * page = fs.token_page(token);

for(int i=0; i<size; i++) {

byte b = page->read_byte(i);

// process each byte...

}

// free page object

delete page;

}

Generally, it's a good idea to call token_size() first, because if zero, there's no point calling token_page(). Note the Page object is allocated by the call, so delete it when you are done to avoid memory leaks. (Treat it similarly to a 'file handle'.)

If you really need speed, you can avoid instancing new virtual memory Page objects by interrogating the token index directly to get the storage offsets.

There is generally no need to 'format' a volume before use - 0x00 or 0xFF filled blocks (the most common occurrence for EEPROM) are read as 'corrupt' blocks, which indicates an empty journal. If no existing journal can be found, a new one is started from scratch. The likelihood of random data being recognized as valid blocks is very small.

The exception would be if there had previously been a filesystem installed that was similar in structure, but actually different. In that case, clearing out the EEPROM may be necessary. (Or at least the first dozen bytes)

Applications

To understand why you want all this, imagine you have ten 'options' that you keep in the EEPROM for motor speeds and current limits and other important things. (If you keep it in program flash, it gets blown away with every software update. It's not good for a bugfix to reset all the controller defaults, sometimes.)

If you put them in ten well-known locations in the EEPROM (which is the simplest method - you just document the locations and manage them with newsgroups, not code) then your microcontroller will become corrupt when the first of those options is overwritten 10,000 times. We assume it's a hard wall, to be conservative.

But if we're using a diary-like journal, we need to fill the whole storage with updates before we loop and start to overwrite the old locations again. If the same few records are being changed over and over, this is 'journalled' across the whole storage, and you will get ten or a hundred times as many updates before the storage begins to corrupt.

10,000 updates sounds like a lot, but that's once an hour for 400 days. Lots of systems recalibrate once an hour. Or restart twenty times a day. Journalling can extend 'expected lifetime' of an embedded system from a year, to decades. That is not a trivial outcome.

The trick with these kinds of filesystems is to never store the index inside the volume. Because any central directory catalog becomes the 'hot-spot' that logically must be updated more often than the files it contains. That's why TokenFS must rebuild its RAM index each time the volume is mounted. This is the trade-off... without a "pre-compiled" index available, it takes much longer to start up. It has to read the whole diary to the end.

Limitations

JournalFS block size is currently encoded in one byte, so up to 256 bytes per block.
TokenFS uses the first byte as a key token, so maximum payload is 255 bytes.
16-bit words are used for storage pointers, so 64K is the maximum volume size.
As mentioned, the need to do a full 'journal scan' may exceed acceptable time limits on large volumes.
Defragmenting 'thrash' increases as the free space approaches zero. In the worst case of a completely full volume, it must defragment the entire storage (burning up one entire pass) in order to update a single entry. Do not run the filesystem at 100% capacity. Try to keep 10% free, enough for a couple of blocks at least, so the defragmenter can find space without too much trouble.
Writing will invalidate any Page objects already obtained from token_page(), since defrag will move storage blocks around. Don't cache pages, look them up when you need them.

Non-Limitations

Blocks can be larger than the amount of available RAM.

Wednesday, March 5, 2014

Release: unorthodox-arduino library on GitHub

https://github.com/unorthodox-engineers/unorthodox-arduino

General purpose library with many useful things for the Arduino Leonardo: Flash file system, Raster/GUI, Real-time Robotics kernel, Hardware drivers, Virtual memory, Binary Trees, Tries, Associative maps, and Streaming Micro-parser.

PREVIEW RELEASE V0.1

I'm releasing this "as is", even though many parts are still a mess and need work, because there's enough useful stuff in here to be worth it. But there are no examples, very little documentation, and not a lot of explanations for the ass-backwards way I seem to do some things in the name of optimization.

~ Jeremy

USING

Just copy the "Unorthodox" folder to your Arduino "libraries" directory, and #include <Unorthodox.h> in the usual way.
BUT: Please include all the standard libraries you intend to use (like SPI, EEPROM and Wire) _first_ because the Unorthodox lib will only declare dependant classes (such as EEPROMPage or SPIDevice) if the requirements are _already_ loaded.

The library is currently "AVR only" simply because of the directory structure, and I don't have SAM hardware to test on, but most of the classes should port without issue. And yes, I know it takes a long time to compile - everything is in header files rather than .cpp, for various stupid reasons.

LICENCE

In plain English: If you intend to use this code for projects that are personal, non-profit or educational in nature, then you are considered an "academic peer" and have fair use rights so long as attribution is given. If you intend to resell or commercially profit from this code as part of some product or service, then I will require you to enter into a commercial licence for its use.

CONTRIBUTIONS

I am not actively seeking contributors or patches at this time, though of course bug reports and feedback are always welcome. The code is not technically "open source", not yet anyway. I don't even know if other people will find it useful.

VIRTUAL MEMORY CLASSES

Page
ReadPage
MemoryPage
NearProgramPage
FarProgramPage
EEPROMPage
ZeroPage
BytePage
WordPage
Cardinal

DATA STRUCTURE CLASSES

PrefixTree
PrefixNodeResult
RedBlackTree
Map
MapNode

STREAMING PARSER CLASSES

Cursor
BufferCursor
PageCursor
NumberCursor
PrefixCursor

FILE SYSTEM CLASSES

JournalFS
TokenFS

HARDWARE DEVICE CLASSES

Device
DHT11
I2CDevice
MPU6050
SPIDevice
MAX6957
ENC28J60

RASTER DEVICE CLASSES

Raster
Raster8
Raster16
ILI9325C
ILI9325C_Pins
ILI9325C_Leo
ST7735
ST7735_SPI
RasterDraw16
RasterFont16

ROBOTICS CLASSES

Debounce
DroidBeeps
Motivator
BaseDroid
RasterDroid

GUI CLASSES

RasterSpan
RasterPager
SignalsPager
SourcePager
CodesPager

Tuesday, March 4, 2014

The New Subsistence

I've been watching a lot of talks by Jaron Lanier and Cory Doctrow in the last few days, both excellent thinkers who spend a lot of time worrying about the same effects of technology on society that I do.

Many of their comments have helped sharpen my own thinking, and I'm slowly coming to a new idea that might be a helpful way of looking at the problems of rising inequality and power disparity in our societies.

Actually it's mostly Queen Victoria's ideas, brought up to date. And that's because, frankly, the Victorian age has ours beaten hands down for the rate of technological progress, social progress, economic mobility, health, education, and in fact every measure that's in such decline today. Why?

My fundamental thesis is going to be that the Victorians had better access to raw materials and the means of processing them, for individual gain. Sure, we may have more personal capability (knowledge, tools) than your average Victorian, but we have less opportunity to use them.

Let's take one simple example - food, and the Commons.

Despite the "tragedy of the Commons" that came later, it's interesting to know that large chunks of the English countryside were "Common land", where anyone could graze their sheep. Pastoral citizens were literally surrounded by the raw material they could, with care, turn into food. They could extract a livelihood and value from the landscape around them.

This is a very human endeavor. Put a naked human on a desert island, and they will start building. Grass hats, grass huts, wooden spears, axes, wooden shelters, fires and forges. Play Minecraft if you've forgotten how that process works.

So here's the thing... we've lost access to all those resources. We now live in a pre-owned, pre-fabricated world where the opportunities for extracting value from our (physical) environment are gone. You can't dig up your backyard for the mineral content, you can't cut down trees in the local park for wood, and you can't own enough pigs and chickens to feed yourself without the council telling you to stop.

And, if you somehow do manage to luck onto a cache of resources, you're not allowed to make 'copies' of other people's products, which by now includes everything.

The explosion of prosperity in Victorian times happened because the "middle class" had access to resources, and were given new technological ways to apply them for direct personal benefit. Remember the first railroads were built on private property, from local steel, by blacksmiths. A manor wasn't complete without its workshops. And the ethos of that time we'd probably call "Open Source" today... it was fine to copy/reinvent someone else's machine, especially to make it better. In fact it was encouraged by the Queen herself.

They had copyright. They invented it. There are big books of textile designs, tartans and weaves, that once registered were the 'property' of the artist. "Argyle" being a case in point. But that was Art, created by pure human imagination. With engineering, it was all about attribution because you were 'discovering' something that was thought to already be there. It was a race for glory.

Yes, the world has moved on. No more sheep and commons, now we have media and internet. They are the new source of 'raw material' in our daily lives. But your average citizen can't avail themselves of those resources. Try, and criminal proceedings occur.

The new resources have different rules - sheep are hard to copy, but software is trivial. So 'manufacturers' get to 'produce' an infinite number of units for nearly zero cost, but the moment it reaches us, that valuable feature is removed. Its primary utility value is legally withheld from us.

(If you're wondering what I'm talking about, meditate on the koan: "How can software disappear?" And then ask the users of Final Cut Pro.)

That's the case with everything. There are no commons anymore. No 'free' resources that a capable human can use to bootstrap themselves to self-sufficiency. We are born into a controlled world where opportunities are restricted.

For example, in many places in the world, it's illegal to collect rainwater. In Oregon, a man was jailed for it. Water can fall out of the sky and onto your head and house, and if you collect it in a bucket or a dam you can go to jail.

I want you to think about that for a long, long moment. Really think about it.

Rain is no longer a "common resource". Someone else owns it, even as it falls on you.

And if you say, "Well, yes, but I'm sure it's for a good reason to do with water management" then that may be the case, but what about the small town where the local (private) water processing plant got a rainwater ban enacted? When large private companies have profit motive to restrict your access to 'competing' resources (or artificially deplete them) and collude with governments to do so, what options do you have?

If I were to pick up a fallen branch in the street, carve it into the likeness of a popular kids cartoon, and then sell the results of my work on eBay; I would be breaking about twenty laws. Thirty if I posted it overseas.

If I were to use my scientific knowledge to produce inexpensive chemotherapy drugs from Asprin and Cough Medicine, (drugs which are needed to save lives, so you'd think that would be cool) I would be breaking so many laws your head would spin.

You are not allowed to extract value from your environment. Physically, or intellectually. That right is reserved for people you have to swear fealty and service to, if you want to live comfortably. And the toll they charge is punitive, because they know we have no other option. They got there first, they own the rights, and it doesn't matter they did it before you were even born, before you had any opportunity.

This is why the Middle Class has been decimated. We're surrounded by valuable resources that we can't use as raw materials for our own creations. An iPhone isn't just a product, it could be a small component of a larger device or system, just like the bolts and chips are components within the iPhone - that Apple didn't need special permission to resell, it was just assumed. But if you try that with their product, you get a visit from Apple's legal department. They visit if you make something that looks like their product, let alone has one inside it.

This is going to turn into a new kind of "Tragedy of the Commons" - the last one was about the fictional over-use of the shared resource that resulted in the actual resources being shut down, but this time the tragedy is the new Commons are stuffed with value we can't touch. And if the middle can't generate resources to pay for their lifestyles, then the entire economic edifice of consumerism collapses.

Just imagine if iPhones could be legally embedded in toy robots or remote-controlled planes, and retain the warranty. Apple would no longer be market limited by the number of physical people. I'd carry the one I know about, and not realize my "Internet Fridge" had another inside. And your car, and baby monitor, and telescope. Their business would expand, uncontrollably.

But no, Apple can rain a product down upon you, and you're only allowed to do what they want. That means you can't add value, extract a living, and afford the next iPhone. Henry Ford was a racist bastard, but even he understood the need to give your workers economic opportunities so they could afford to buy your next product.

Remember, in the game of Monopoly when one person holds all the property, the game is over, and all that colourful money becomes meaningless again. We are closer to that point than most think.