zoom.nu

pid 1 rage

2017-09-13T19:35:00.000+02:00

Ok so most of the Linux world has gone bonkers and converted to systemd (except Slackware - the original rude boy).
I tried to like systemd (well, not really) but it is impossible. It is a gigantic tire fire of badness and reminds me of the clusterf*** that is PulseAudio (hint, hint, know what I mean).

So I thought to myself: if I must have a gigantic monolithic opaque pid 1 process (flying in the face of all that is unix) then by jove it must be something else. Enter RancherOS, one of the distros that do not use systemd. It has a glorious solution: it runs docker as pid 1 and most of the OS services are actually docker containers. It is so fantastically insane that it is really brilliant.

So I'm in the process of converting "the cluster", i.e. my 3 Raspberry Pi machines, to RancherOS. It is not straightforward, but this is the process I use. Caveat #1: my workstation is a Windows machine, for the reason that I must play StarCraft 2 and that does not run on Linux. So I got a cheap USB-connected SD card reader to write my SD cards. I use Rufus as the utility of choice to write images to the SD card. It has worked really well.

However, RancherOS on the Pi does not auto-expand the root partition. There is also the little snag of having to write the initial cloud-config.yml file with the SSH keys so I can get at the Pi after it has booted. In dire need of a Linux machine that could hack the SD card, I tried virtualization. Luckily VirtualBox can expose USB devices to a guest. So I booted up my virtual Linux, attached the SD card reader through USB, and by magic it appeared. So I can run gparted on the SD card from the virtual Linux machine and resize the rootfs. After that it is just a matter of creating a /var/lib/rancher/conf/cloud-config.yml with all the SSH keys on the SD card. Plug it into the Pi and boot.

And it freaking works. After a while the Pi snagged an IP from the DHCP server and I could SSH into it. Now to do this a couple more times and then all my servers are on RancherOS and systemd is just a bad memory on the bare metal servers. It will still haunt many of the docker images but maybe I can force supervisord to be pid 1 :-)

gradle

2013-11-12T18:47:00.002+01:00

I've just quickly tried out gradle as a build system. I have three observations so far.

1) NetBeans has pretty good integration.

2) Repository configuration is in the build script. That is in my book a pretty big f*ckup; Maven got rid of that with Maven 2. I know I can write my own build script, but a new build system where there is no convention for configuring things that are specific to the build machine and not the source code... The irony of using Ivy and obscuring everything that is good with Ivy.

3) The run task... It is currently not possible (said the forum 1 year ago, and that's all I could find on this subject) to pass command line arguments to the application. You offer a "run" task but it is unusable.

If I have to write those things myself I might as well stick to bare Ant + Ivy.

Designing OO APIs

2013-10-30T21:05:00.001+01:00

I'm trying to find a good pattern for designing an API/framework that lets clients subclass to inherit convenient behavior.
These are my requirements:

The framework owns the collection of "Things". Like this:
class Framework {
void addAThing(Thing t)...
}

A "Thing" is probably a composition of other stuff, so it must have a Stuff getStuff() method. The framework sometimes uses a Thing's Stuff.

interface Thing {
Stuff getStuff() ;
}

So one implementation of Thing is the simple version that gets stuff from the outside.
class Thing1 implements Thing {
private final Stuff myStuff ;
public Thing1(final Stuff myStuff) { this.myStuff = checkNotNull(myStuff) ; }
public Stuff getStuff() { return this.myStuff ; }
}

The problem I have is that there can be many different Things, and the actual creation of Stuff is something that I want the subclasses to be able to implement. So I'll help them along with this:

abstract class ParentThing implements Thing {
private final Stuff myStuff ;
public ParentThing() {
this.myStuff = checkNotNull(createStuff());
}
public Stuff getStuff() { return this.myStuff ; }
protected abstract Stuff createStuff() ;
}

The problem with that is that I'm relying on subclasses to be very nice about the implementation of createStuff() since the method is called in the constructor. It shouldn't for example register "this" as a callback to some other thread in createStuff since "this"-instance might not have left the constructor when the callback occurs.

So maybe this then:

interface Thing {

void initialize() ;

Stuff getStuff() ;

}

and then I'll fix it so that framework calls initialize before it uses the Thing, like so

class Framework {
void addAThing(Thing t){ t.initialize(); ... }
}

But then I can't use final in the ParentThing class anymore, leaving me with another state of Things, "created but not initialized yet".

So maybe I actually should have

interface ThingBuilder {
Thing createAThing() ;
}

and then

class Framework {
void addAThing(ThingBuilder t)...
}

Might not be so bad with closures, but without closures/lambda it gets a bit messy on the client side.
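
To show how little the builder variant asks of the client once closures are available, here is a minimal sketch in Python rather than Java. Framework, Thing and Stuff mirror the names above; add_a_thing, things() and the label field are made up purely for illustration.

```python
class Stuff:
    def __init__(self, label):
        self.label = label


class Thing:
    """Fully initialised on construction, so the field never changes afterwards."""

    def __init__(self, stuff):
        if stuff is None:
            raise ValueError("stuff must not be None")
        self._stuff = stuff

    def get_stuff(self):
        return self._stuff


class Framework:
    def __init__(self):
        self._things = []

    def add_a_thing(self, thing_builder):
        # The framework decides when the Thing is created; the client only
        # supplies the recipe as a zero-argument callable.
        self._things.append(thing_builder())

    def things(self):
        return list(self._things)


framework = Framework()
framework.add_a_thing(lambda: Thing(Stuff("from a closure")))
```

The Thing never exists in a "created but not initialized" state, and the framework never sees a half-constructed instance.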

Ideas are welcome. How to design an OO API?

Debian Squeeze, chrome and firefox woes

2013-04-04T09:04:00.000+02:00

Ok so I am guilty of running stable software, I use Debian Squeeze. It is rather conservative in updating versions of things, doing a lot of testing, and for that reason everything "just works", even if I can't run the latest games and so on.

Google Chrome has another strategy: they update their version of the browser all the time. Recently I got updated to Version 26.0.1410.43.
Now Chrome says that I am running on an unsupported version of my operating system (Debian 6). It has a link to a page that states what versions are supported. That page says that Debian 6 is supported.
Yeah
Testing

So given the recent public relations stunts that Google has pulled (like ditching Google Reader - anyone remember Orkut, still running....) I wasn't too keen on cutting them any slack. I'll just go back to Firefox despite the abysmal WebGL performance.

Firefox isn't included in Debian because of ... principles. Firefox will not license their logo in a Debian-friendly manner because it's like trademark and their property and stuffs. Debian will not include software that has restricted licenses and trademarks and stuffs.
Principles, I can actually dig that. I'll just download it myself.

Happy like a little child finding a fresh puddle of mud on a spring day I headed over to mozilla.org to get me some firefox goodness.
Downloaded the version 19, unpack, click binary, wait for things to break (they usually do). But it ran. That's pretty nice of them.
Only it looked really crappy but I didn't have time to look into what that was about.
Until now. I'm home with a cold, and between sneezing and other unpleasantness I tried to get Java applets running. Only to find out that the reason Firefox looks like, pardon the language, shite, and the reason the Java plugin doesn't even get recognized, is that it is a 32-bit build.
This is the year of our grace 2013.
I went 64-bit before my niece was born and she is in school now. I mean, come on!
The stupid, stupid, stupid, download page automatically detects I'm on linux and starts a download without asking and it picks the wrong arch. Ok, accidents happen, so where is the download link then? There freaking isn't any!
The solution was to google my way through countless questions until finding ftp://ftp.mozilla.org/pub/firefox/releases/
Clicking my way to the 64-bit version 19 beta build. Unpack that, then let Firefox auto-update to version 20 and THEN the plugins work.
It looks good now also because it doesn't try to load GTK from the wrong 32-bit libraries that it obviously can't find since this is, as said, a 64-bit clean machine.
Yeah
Testing

Minimum viable web framework

2013-01-30T22:38:00.000+01:00

In some projects I've been working on, there's a small part of common code and thinking that one could call a minimum viable web framework (well, technically, it's a library and not a framework, but I digress). There's absolutely nothing new in it, and if you've ever done even one Servlet app, you know it already. I see that as an advantage.

My minimum viable web framework consists of 8 lines of code and one principle. The code (quoted below) is a static method (on some utility class) that takes a path to which to forward a request, the request and response objects, and then a vararg list of alternating names and values to put as request attributes. The principle is: if something produces any HTML, then that's all that thing does (inverted: if you don't produce HTML, you don't produce HTML). Anything that produces HTML (in my cases, JSPs) only reads values prepared for it from the request attributes (and if you have something that produces some output that isn't HTML, then it's by definition a special case and is exempt from this rule). It's OK to have lists and other simple data structures as request attributes, and it's OK to use simple looping constructs (e.g. a JSTL core forEach) in the rendering, but you can't do things like calling methods from there.

public static void forward(String path, HttpServletRequest request, HttpServletResponse response, Object... attributes)
throws ServletException, IOException 
{
        for(int i=0; i<attributes.length; i+=2) {
                request.setAttribute((String)attributes[i], attributes[i+1]);
        }
        request.getRequestDispatcher(path).forward(request, response);
}

Together, the code and the principle give a simple way of keeping application code and page formatting separate, and I've been quite happy with it in my projects. There's a lot of things missing that you might expect a framework to provide (e.g. input validation or data access). I see those as nice to have and that the only essential part of a web framework is facilitating the separation of logic and rendering.

So, what do you think - is this web framework both minimal and viable? If it's not minimal - how can we minimise it? If it's not viable, then what's missing? And last but not least: does it zoom?

OpenTSDB

2013-01-22T21:39:00.001+01:00

At my daytime gig I have come into contact with two applications that really zoom.

One is Splunk and I'm sure the developers of that application drink awesome-sauce for breakfast. It is so awesome I won't even write more about it here because I just don't know what words to use. Suffice it to say that it is a log analysis tool that actually beats 'find, xargs, awk, grep' and all of those.

The other application I've come into contact with is OpenTSDB. OpenTSDB is a time series database, which means you put metrics into it that you want to plot over time. OpenTSDB uses HBase, the Apache Hadoop database, for storage.
OpenTSDB has a web front end for plotting the data points that uses gnuplot. It is rather simplistic as a web front end but it is mostly bug free and, despite a few little quirks, it "just works" exactly like you want it to. It does one thing (send a query to OpenTSDB and plot a PNG image) and it does it well.

We use OpenTSDB to monitor our servers and applications. We put numbers into it like how much heap the server has, the number of busy threads, how many messages were put on the message queue, and what the response time was for each web service call.
It is extremely helpful in monitoring and post-mortem analysis of application behaviour. I mean like, really useful. I can correlate exactly the number of packets the load balancer sends to a certain host at 5s intervals with the number of busy threads, the heap size, the cpu load and a load of application specific metrics extracted from the JVMs using JMX.
The JMX-collector, developed in house, is written by some pretty clever guys to be fast but you can also write one, it isn't that hard. Remember to do everything async. and use caching and you're good to go.
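
For the curious, the core of a collector is formatting data points and handing them to OpenTSDB. Below is a minimal sketch, assuming the line-based put command that OpenTSDB accepts on its telnet-style interface. The metric name, tag and value are invented, and a real collector would send the lines over a socket (asynchronously) instead of just building strings.

```python
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one 'put' line: put <metric> <timestamp> <value> <tag=value ...>."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    if timestamp is None:
        timestamp = int(time.time())
    # Sort the tags so the same data point always produces the same line.
    tag_text = " ".join("%s=%s" % (k, tags[k]) for k in sorted(tags))
    return "put %s %d %s %s" % (metric, timestamp, value, tag_text)


# Hypothetical heap metric for a hypothetical host.
line = format_put("jvm.heap.used", 123456789, {"host": "web01"}, timestamp=1356998400)
```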

I just cannot stress enough how incredibly useful OpenTSDB is for not only monitoring what happens now but also what happened that Sunday when a couple of servers went haywire and didn't respond. Given the precision of the correlation it is very easy to find the relevant log entries from the time stamps.

To put into perspective how good OpenTSDB and HBase are: we dump, I would say, more than 1 metric/second (usually we poll a specific metric every 5 seconds or so, depending on what it is).
From each server, and we have > 35 servers.
To a single opentsdb + hbase server.

And it just freaking works. I can actually see exactly what the heap size was for host X on christmas eve.

Just tonight I installed OpenTSDB on my local machine/server, I don't need it at home but just to pay tribute to it.

Tracking down memory corruption by mprotecting your ADTs

2012-11-21T22:47:00.000+01:00

In C, it's customary to design your code around Abstract Data Types, that is, modules that consist of a header file that declares the external interface of the module (consisting of an opaque struct and a set of functions operating on that struct), and an implementation file (which has the full declaration of the structure, the definitions of the functions in the header and any helper functions). The header would be something like this:

#ifndef INCLUDE_GUARD_STACK_H
#define INCLUDE_GUARD_STACK_H

#include <stddef.h>
#include <stdbool.h>

struct stack;

struct stack* stack_create(size_t size);
bool stack_push(struct stack* stack, int i);
bool stack_pop(struct stack* stack, int* i);
void stack_destroy(struct stack* stack);

#endif

The implementation, then, is:

#include "stack.h"

#include <stdlib.h>

struct stack
{
  int* elements;
  size_t used;
  size_t allocated;
};

struct stack* stack_create(size_t size)
{
  struct stack* stack = malloc(sizeof(struct stack));
  stack->elements = malloc(size * sizeof(int));
  stack->used = 0;
  stack->allocated = size;
  return stack;
}

bool stack_push(struct stack* stack, int i)
{
  if (stack->used == stack->allocated) {
    goto error_full;
  }
  stack->elements[stack->used++] = i;
  return true;
error_full:
  return false;
}

I'll leave the implementation of the rest of the functions to your imagination. Since only the forward declaration of the struct is in the header, no code outside the implementation can access the members of the struct.

Now, assume we have a memory corruption fault somewhere in the rest of the program which when triggered corrupts the elements pointer but doesn't have any other effects. Our program then seems to be working fine until some later time when it suddenly crashes due to an invalid memory access in stack_push. We'd really like to get the program to abort at the point of the original corruption of the elements pointer, but how can we do that?

One way of solving that is to use the fact that since the structure is opaque, there is no way that any code outside our implementation file has any legitimate use of touching any of the memory to which struct stack* points. Since no other code has any business accessing that memory, then maybe we can have the OS help us prevent it from doing that? Enter mprotect.

The mprotect function lets us control what types of access should be permitted to a region of memory. If we access the memory in any other way, the OS is free (and in some cases even required) to abort our program at the spot. If we keep the memory inaccessible at all times except for when we use it inside our implementation functions, then chances are we can catch the memory corruption as it happens. The mprotect man page does say that the memory it protects has to be page aligned, though. How do we do that? Via posix_memalign and getpagesize, like so:

#include "stack.h"
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

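/* mprotect works on whole pages, so round each allocation up to a page
   multiple; otherwise unrelated heap data could share a protected page. */
static size_t page_round(size_t n)
{
  size_t page = (size_t)getpagesize();
  return (n + page - 1) / page * page;
}

static void protect(struct stack* stack); /* defined below */

struct stack* stack_create(size_t size)
{
  struct stack* stack;
  posix_memalign((void**)&stack, getpagesize(), page_round(sizeof(struct stack)));
  posix_memalign((void**)&stack->elements, getpagesize(), page_round(size * sizeof(int)));
  stack->used = 0;
  stack->allocated = size;
  protect(stack);
  return stack;
}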

Now we just have to implement and use the protect function mentioned above and its inverse, unprotect:

static void protect(struct stack* stack)
{
  mprotect(stack->elements, stack->allocated * sizeof(int), PROT_NONE);
  mprotect(stack, sizeof(*stack), PROT_NONE);
}
static void unprotect(struct stack* stack)
{
  /* Unprotect in the reverse order, or we crash and burn trying to read stack->elements */
  mprotect(stack, sizeof(*stack), PROT_READ|PROT_WRITE);
  mprotect(stack->elements, stack->allocated * sizeof(int), PROT_READ|PROT_WRITE);
}
bool stack_push(struct stack* stack, int i)
{
  unprotect(stack);
  if (stack->used == stack->allocated) {
    goto error_full;
  }
  stack->elements[stack->used++] = i;
  protect(stack);
  return true;
error_full:
  protect(stack);
  return false;
}

Since this ends up modifying bits in the MMU, it may not be suitable to have enabled on performance critical code, so an #ifdef NDEBUG switch that selects an empty implementation of protect and unprotect for non-debug builds could be advisable.

So does protecting your ADTs via mprotect zoom? Well, it does come in handy at times, and there's not much disadvantage to using it, so my verdict is: Zooms!

Compiling with LLVM

2012-10-28T13:07:00.000+01:00

LLVM is, among many other things, a reusable compiler backend implemented as a library. In this post, I'll show how the llvmpy Python bindings can be used to write an optimising cross-compiler in a few lines of Python. As a starting point, let's re-use the old parser and interpreter from the Python Parsing Combinators series. Since the grammar only has operators and numbers, we'll need to extend it a bit - there's not much point in spending time compiling something that isn't going to take any input. Let's add named values like so:

identifier = RegEx("[a-zA-Z_][a-zA-Z0-9_]*").apply(Identifier)
simple = number | paren | identifier

Next, we'll need an implementation for this new node in our Abstract Syntax Tree:

class Identifier(object):
    def __init__(self, s):
        self.name=s
    def eval(self, env):
        return env[self.name]
    def __repr__(self):
        return repr(self.name)

As you can see if you compare this eval to the old ones, eval now takes a dictionary as its argument, which maps names to numbers. The old eval implementations should be extended to pass along this environment, but Identifier is the only one that has a use for it.

Interpreting is now done thus:

expr.parse("a+1", 0).next()[0].eval({"a":4})
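
For the older node types, passing the environment along might look like the sketch below. The class and field names (Number with n, BinOp with a and b, Mul) are the ones used by the compile methods further down in this post; Add and the constructors are guesses at what the original series used, and the parser is left out, so the tree is built by hand.

```python
class Number(object):
    def __init__(self, n):
        self.n = int(n)

    def eval(self, env):
        # Numbers ignore the environment.
        return self.n


class Identifier(object):
    def __init__(self, s):
        self.name = s

    def eval(self, env):
        return env[self.name]


class BinOp(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b


class Add(BinOp):
    def eval(self, env):
        # Operators just hand the same environment down to both operands.
        return self.a.eval(env) + self.b.eval(env)


class Mul(BinOp):
    def eval(self, env):
        return self.a.eval(env) * self.b.eval(env)


# Hand-built AST for "a+1", evaluated with a bound to 4.
result = Add(Identifier("a"), Number(1)).eval({"a": 4})
```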

Now, the topic of this post is compilation, not interpretation. Let's start by importing the llvmpy bindings:

from llvm import *
from llvm.core import *
from llvm.ee import *

In LLVM, you have modules (roughly object files) that contain functions, which take arguments and contain one or more basic blocks. Basic blocks contain instructions, which can in turn refer to other basic blocks (as branch targets) or call functions. Creating a module is simple:

m = Module.new("m")

The function we want to add to the module should correspond to the expression we're compiling. It should return an int, and its arguments should be the identifiers that are used in the expression. We can find the identifiers by adding a method to the AST nodes that returns the union of all identifiers in the sub-tree (it's trivial, so I'll spare you the implementation). Let's sort them ASCIIbetically and use that as our arguments. We're going to need a mapping from identifier name to function arguments, so let's build that up as well.

identifiers = sorted(list(ast.identifiers()))
ty_int = Type.int()
ty_func = Type.function(ty_int, [ty_int] * len(identifiers))
f = m.add_function(ty_func, "f")
args = {}
for i in range(len(identifiers)):
    f.args[i].name = identifiers[i]
    args[identifiers[i]] = f.args[i]
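
The identifiers method that was skipped above could look roughly like this. It is a guess at the omitted implementation, not the original, and the node classes are cut down to just what the method needs.

```python
class Number(object):
    def __init__(self, n):
        self.n = n

    def identifiers(self):
        # A literal uses no names.
        return set()


class Identifier(object):
    def __init__(self, s):
        self.name = s

    def identifiers(self):
        return {self.name}


class BinOp(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def identifiers(self):
        # Union of whatever the two sub-trees use.
        return self.a.identifiers() | self.b.identifiers()


# A tree that mentions b once and a twice; each name is reported once.
ast = BinOp(BinOp(BinOp(Identifier("b"), Identifier("a")), Identifier("a")), Number(2))
identifiers = sorted(list(ast.identifiers()))
```

Returning a set means a name that occurs several times still becomes a single function argument.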

Now, function bodies consist of basic blocks (each a straight-line series of instructions); our function needs only one. You can add instructions to a basic block using a builder.

bb = f.append_basic_block("entry")
builder = Builder.new(bb)

Now we're ready for the interesting parts. For each AST node type, we add a method for compiling to LLVM opcodes. Since LLVM uses Static Single Assignment, each opcode results in the assignment of a variable, and this variable is then never modified. Every method on the builder that appends an instruction to the basic block returns the resulting value, which makes it natural to do the same for our compile method: we return the value that will hold the results of the computation. As arguments, let's take the builder that we use for appending the instructions, the type to use for our numbers, and the argument mapping we built up. Identifiers are simple: we just return the argument that corresponds to the identifier:

class Identifier(object):
    #...
    def compile(self, builder, ty_int, args):
        return args[self.name]

Numbers are also easy - we just return a constant value.

class Number(object):
    #...
    def compile(self, builder, ty_int, args):
        return Constant.int(ty_int, self.n)

Now, how do we do our two remaining AST node types, addition and multiplication? We can do this by first emitting the code for the left hand side of the operation, remembering the value that will hold the results for that. Then we do the same for the right-hand side, and finally, we emit a multiplication instruction that takes the resulting values for the left hand side and the right hand side as its operands.

class Mul(BinOp):
    #...
    def compile(self, builder, ty_int, args):
        a = self.a.compile(builder, ty_int, args)
        b = self.b.compile(builder, ty_int, args)
        return builder.mul(a, b)

Addition is of course the same, except for using builder.add. We can now let the basic block of our function return the value that will hold the result of evaluating our entire AST by doing

builder.ret(ast.compile(builder, ty_int, args))

We can now generate machine code from this, and run it directly:

ee = ExecutionEngine.new(m)
env={
    "a": GenericValue.int(ty_int, 100),
    "b": GenericValue.int(ty_int, 42)
}
args=[]
for param in identifiers:
    args.append(env[param])
retval = ee.run_function(m.get_function_named("f"), args)

This will build up a function in memory, which we can pass some arguments and call. This is really quite amazing. You call some functions to describe how to do something, and then you get a function back that you can execute directly. This can be used for all sorts of interesting things, like partial specialisation of a function at runtime, to take one example. Let's say you have to do a fully unindexed join on two strings in a database. For each value in the first table, you will need to do a string comparison with every single value in the other table. Since none of the strings we're comparing are known at compile time, there's little the compiler can do except hand us its very best version of a general strcmp. With LLVM, we can do better: for each string in one of the tables, generate a specialised string comparison function that compares a string in the other table to that particular one. Such functions can be optimised quite a bit.
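
The specialisation idea can be tried out without LLVM at all. The sketch below generates a comparison function for one fixed string by building Python source and compiling it at runtime. It is only an analogy: in plain Python this is unlikely to be any faster, whereas with LLVM the same unrolled comparison would be emitted as instructions and optimised. The names specialise_equals and equals_fixed are made up.

```python
def specialise_equals(fixed):
    """Generate a function that compares its argument to one fixed string."""
    lines = ["def equals_fixed(s):"]
    # The length and every character of the fixed string become constants.
    lines.append("    if len(s) != %d:" % len(fixed))
    lines.append("        return False")
    for i, ch in enumerate(fixed):
        lines.append("    if s[%d] != %r:" % (i, ch))
        lines.append("        return False")
    lines.append("    return True")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["equals_fixed"]


equals_zoom = specialise_equals("zoom")
```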

The fun doesn't stop here either. In addition to generating functions in memory, we can serialise them to bitcode files using the to_bitcode method of LLVM modules. That generates a .bc file for which we can generate assembler code for some particular architecture, and then use the platform assembler and linker to generate an executable. To do that, however, we first need a main function. Since the function we generated for our expression above takes ordinary integers, we'll have to generate a main function that will take inputs from somewhere (e.g. the command line), convert that to integers, pass them to the function, and then make the result known to the user. To make things easy, let's just call the libc function atoi on argv[1], argv[2],... and use the result of the function as the exit code of our program.

ty_str = Type.pointer(Type.int(8))
ty_main = Type.function(ty_int, [ty_int, Type.pointer(ty_str)])
main = m.add_function(ty_main, "main")
main_bb = main.append_basic_block("entry")
bmain = Builder.new(main_bb)
argv=main.args[1]
int_args=[]
for i in range(1, len(identifiers)+1):
    # atoi(argv[i])
    s = bmain.load(bmain.gep(argv, [Constant.int(ty_int, i)]))
    int_args.append(bmain.call(atoi, [s]))

# Return only after all arguments have been converted (outside the loop)
bmain.ret(bmain.call(f, int_args))

Here, ty_str is a character pointer, and main is a function from int and pointer to character pointer to int. In its basic block, we add one call to atoi for each identifier that our expression uses. The argument to atoi would in C be written as argv[i], but here, we do it by first getting an element pointer ("gep" is short-hand for Get Element Pointer) to the string we want, and then dereference that. Think of the "load" as the * and the gep as the + in *(argv+i)

We then add a call to our function, and return its result as our exit code. But what does atoi refer to here? We must first tell LLVM that we want to call something from the outside of our program. It will give us back a reference to the function, so that we can build up calls to that something.

ty_atoi = Type.function(ty_int, [ty_str])
atoi = m.add_function(ty_atoi, "atoi")
atoi.linkage = LINKAGE_EXTERNAL

All that is left before we can generate our very own native executables is file handling and calling the right tools. On Mac OS X, it goes a little something like this:

import os

if filename.endswith(".zoom"):
    basename = filename[:-5]
else:
    basename = "a"

bitcode = basename + ".bc"
asm = basename + ".s"
obj = basename + ".o"
executable = basename

f=open(bitcode, "wb")
m.to_bitcode(f)
f.close()

if target=="x86":
    os.system("llc -filetype=obj %s -o %s" % (bitcode, obj))
    os.system("ld -arch x86_64 %s -lc /Developer/SDKs/MacOSX10.6.sdk/usr/lib/crt1.o -o %s" % (obj, executable))
else:
    os.system("llc -mtriple=arm-apple-darwin -filetype=obj %s -o %s" % (bitcode, obj))
    os.system("ld -arch_multiple -arch arm %s -o %s -L/Developer/Platforms/iPhoneOS.platform/DeviceSupport/4.2/Symbols/usr/lib /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS4.3.sdk/usr/lib/crt1.o -lSystem " % (obj, executable))

In order to build for other targets, you can specify -mtriple=xxx on the llc command line. You will of course have to run a matching version of ld in order to generate your executable.

So, we can build native, and even cross-compiled, executables, but what about optimisation? Turns out we get that too. The LLVM bitcode we generate for the expression 1+1+1+1+a+a+a+a is

define i32 @f(i32 %a) {
entry:
  %0 = add i32 %a, %a
  %1 = add i32 %a, %0
  %2 = add i32 %a, %1
  %3 = add i32 1, %2
  %4 = add i32 1, %3
  %5 = add i32 1, %4
  %6 = add i32 1, %5
  ret i32 %6
}

whereas the resulting x86 assembler code is

_f:                                     ## @f
Ltmp0:
## BB#0:                                ## %entry
        leal    (%rdi,%rdi), %eax
        addl    %edi, %eax
        leal    4(%rdi,%rax), %eax
        ret

Clearly, LLVM has figured out what's going on: it has folded the four constant additions into a single 4, and it computes the four additions of a with two lea instructions and one add, ending up with 4*a + 4.
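
To convince yourself that the short assembler listing computes the same thing as the eight-term expression, the arithmetic can be replayed by hand. Here is a sketch that mimics the three arithmetic instructions in Python, ignoring 32-bit wraparound; the function names are made up.

```python
def f_unoptimised(a):
    # The expression as written: 1+1+1+1+a+a+a+a
    return 1 + 1 + 1 + 1 + a + a + a + a


def f_optimised(a):
    eax = a + a        # leal (%rdi,%rdi), %eax   -> 2a
    eax = eax + a      # addl %edi, %eax          -> 3a
    eax = 4 + a + eax  # leal 4(%rdi,%rax), %eax  -> 4a + 4
    return eax


# Compare the two versions over a range of inputs.
checked = all(f_unoptimised(a) == f_optimised(a) for a in range(-100, 101))
```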

So, what's the conclusion from this? That building optimizing cross-compilers is a weekend project for a small DSL, and you can do it all in Python - and if that doesn't zoom, nothing does.

Goto Checker

2012-09-22T22:31:00.000+02:00

One of the cool new features of the upcoming Clang release is the tooling infrastructure. This Clang tooling infrastructure provides an easy way of writing style checkers, source-to-source rewriters, analyzers and all sorts of other tools that need to understand C and related languages.

Especially nice is the AST matching support that landed recently. With the AST matchers, you can describe patterns of code that you're interested in in a declarative DSL, and get a callback called for all instances of that pattern in the code you're parsing. It's a bit like XPath for C.

As a demonstration of this, I've written a little code checker that checks that code follows a particular error handling style, and warns of any transgressions. The error handling style I chose is the one I prefer in C, which I call the reverse label style. The reverse label style has been made popular through its use in the Linux kernel, and looks like the below example:

int read_file_normal(const char* filename)
{
 FILE* f = fopen(filename, "rt");
 char buf[100];

 if (f == NULL) {
  fprintf(stderr, "Failed to open %s\n", filename);
  goto error_open;
 }

 if (fread(buf, sizeof(buf), 1, f) == 0) {
  fprintf(stderr, "Failed to read data from %s\n", filename);
  goto error_read;
 }

 if (fclose(f) == EOF) {
  fprintf(stderr, "Failed to close %s\n", filename);
 }

 return 0;

error_read:
 fclose(f);
error_open:
 return -1;
}

For each error case, we have a goto to a unique label. Starting at that label is all cleanup that should be done in the current case. The labels are therefore listed in reverse order to the gotos. A bit more discussion can be found at the staila blog. As rightly pointed out there, it results in readable code that unfortunately easily can rot during maintenance.

Automatic code checking to the rescue. We want to find all function definitions, and in each function definition we want to see all uses of goto and all labels, and verify that we first have one section of gotos but no labels, which is then followed by a section of labels, listed in reverse order of their corresponding gotos in the first section, without any jumps in it. We thus set up a MatchFinder and add our three patterns and bind their match results to names we can use in our callback:

        MatchFinder mf;
        GotoChecker handler;

        mf.addMatcher(functionDecl(isDefinition()).bind("func"), &handler);
        mf.addMatcher(gotoStmt().bind("goto"), &handler);
        mf.addMatcher(labelStmt().bind("label"), &handler);
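
Stripped of all the Clang machinery, the rule being enforced is small. Below is a sketch in Python, assuming the gotos and labels of one function arrive as a flat list of (kind, name) pairs in source order. The event format and the check_function name are inventions for this illustration and not part of the tool, and several gotos to the same label count once.

```python
def check_function(events):
    """Return a list of complaints for one function's gotos and labels."""
    problems = []
    targets = []           # goto targets, in order of first use
    in_error_section = False
    for kind, name in events:
        if kind == "goto":
            if in_error_section:
                problems.append("goto %s inside error handling section" % name)
            if name not in targets:
                targets.append(name)
        elif kind == "label":
            # The first label starts the error handling section.
            in_error_section = True
            # Labels must appear in reverse order of their gotos.
            if targets and targets[-1] == name:
                targets.pop()
            else:
                problems.append("unexpected label %s" % name)
    return problems


good = [("goto", "error_open"), ("goto", "error_read"),
        ("label", "error_read"), ("label", "error_open")]
bad = [("goto", "error_open"), ("goto", "error_read"),
       ("label", "error_open"), ("label", "error_read")]
```

Here check_function(good) finds nothing to complain about, while check_function(bad) flags error_open for appearing before error_read.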

When we run our tool on some code, our GotoChecker::run method will be called with its MatchResult set to one of these three things. So, we check which one it is, and if it's the start of a new function, we note that we're now in the normal section of the function (as opposed to the error handling section), and we clear our stack of labels:

        const clang::FunctionDecl* func =
                Result.Nodes.getNodeAs<clang::FunctionDecl>("func");
        const clang::GotoStmt* g =
                Result.Nodes.getNodeAs<clang::GotoStmt>("goto");
        const clang::LabelStmt* label =
                Result.Nodes.getNodeAs<clang::LabelStmt>("label");
        if (func) {
                in_error_section = false;
                gotos.erase(gotos.begin(), gotos.end());
        }

If we get a goto, we first check that we're not in the error handling section, and then append its label to our stack of labels.

        if (g) {
                clang::LabelDecl* label = g->getLabel();
                if (in_error_section) {
                        clang::DiagnosticsEngine& d =
                                Result.Context->getDiagnostics();
                        unsigned int id = d.getCustomDiagID(
                                clang::DiagnosticsEngine::Warning,
                                "Found goto to label %0 inside "
                                "error handling section"
                        );
                        d.Report(g->getLocStart(), id) << label;
                }
                gotos.push_back(label);
        }

Here we can also see an example of how to produce diagnostics from a Clang tool. The DiagnosticBuilder returned by DiagnosticsEngine::Report accepts a number of different Clang types, and will format them appropriately. In this case, we're giving it a NamedDecl, so it knows to put it in quotes. The source location we give to Report also means that Clang knows to quote from the source and highlight exactly what we're objecting to.

Now, the final part of the puzzle is to handle labels. It would be really nice if we could verify that there's no way for the code to fall through from the normal section to the error handling section, but that would involve scanning backwards through the statements and checking that the last is either a return, a call to a function known not to return, or a conditional where all branches end with a return or a function known not to return or a conditional where... Let's skip that for now, and just get the name of the label.

        if (label) {
                if (!in_error_section) {
                        // TODO: Check somehow that all paths have returned
                }
                clang::LabelDecl* found = label->getDecl();
                std::string name = found->getNameAsString();

Since we're enforcing that goto is only used for error handling, let's check that the name of the label reflects this:

                if (
                        name.substr(0, 6) != "error_"
                        && name.substr(0, 8) != "cleanup_"
                        && name.substr(0, 4) != "err_"
                        && name != "exit"
                ) {
                        clang::DiagnosticsEngine& d =
                                Result.Context->getDiagnostics();
                        unsigned int id = d.getCustomDiagID(
                                clang::DiagnosticsEngine::Warning,
                                "Illegal label name: %0"
                        );
                        d.Report(label->getLocStart(), id) << found;
                }

This feature could be debated. Maybe it should be optional, and maybe the permitted prefixes should be configurable, but that's for version 2.0.

Now all that is left is to check that the label we found matches the top of the label stack we built up from the gotos in the normal section.

                if (gotos.size() == 0) {
                        return;
                }
                clang::LabelDecl* expected = gotos.back();
                in_error_section = true;

                if (found != expected) {
                        clang::DiagnosticsEngine& d =
                                Result.Context->getDiagnostics();
                        unsigned int id = d.getCustomDiagID(
                                clang::DiagnosticsEngine::Warning,
                                "Error handling sequence mismatch. "
                                "Expected %0, found %1"
                        );
                        d.Report(label->getLocStart(), id) << expected << found;
                }

                gotos.pop_back();

Apart from a handful of boilerplate lines (~4 lines plus some includes and namespace uses), that's it. What I like is that all the parsing and scanning and other things that would normally get in our way just disappear and we can go straight to implementing our checking logic.
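To see the stack discipline on its own, without Clang in the way, here is a rough Python model of the same bookkeeping. It is only a sketch: the event list stands in for the matcher callbacks, `check_function` is a made-up name, and the label-name check is left out.

```python
def check_function(events):
    """Check one function body given as ("goto", name) / ("label", name) events."""
    gotos = []                # labels targeted by gotos in the normal section
    in_error_section = False
    warnings = []
    for kind, name in events:
        if kind == "goto":
            if in_error_section:
                warnings.append("goto to %s inside error handling section" % name)
            gotos.append(name)
        elif kind == "label":
            if not gotos:
                continue      # no pending goto, nothing to compare against
            expected = gotos[-1]
            in_error_section = True
            if name != expected:
                warnings.append("expected %s, found %s" % (expected, name))
            gotos.pop()
    return warnings
```

Feeding it gotos to `err_a` then `err_b`, followed by the labels `err_b` then `err_a`, gives no warnings. Swapping the two labels gives one mismatch warning per label.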

The full source can be had at the project page. Note that it requires LLVM and Clang from trunk, but there are good instructions for getting that.

In summary, the Clang tooling infrastructure definitely zooms. I expect to be using this quite a bit from now on, and if you're working with C or its derivatives, I think you should too.

The Netbeans build system

2012-09-09T10:59:00.001+02:00

This is more of a "venting frustration" post than an informative one. I have been using netbeans for a while now, at least the re-branded version that is the jMonkeyEngine SDK. However, this is about the netbeans part of it and has nothing to do with the jME SDK.
I have been thinking for years that I should port Svansprogram to a Rich Client Platform instead of hacking my own. Hacking my own RCP was fun at first but now it is mostly tedious.
So to learn module development on Netbeans I have started out with some small hacks to get a feel for it. Ran the Hello World things and so on.
Eventually came the time to make my very own module, first I let Netbeans create the project, then I put it into my version control and then on to set up my continuous build system.
The build system in netbeans is Ant. Which is all very fine by me, Ant+Ivy is my preference if I can choose whatever I like. It is always good if the IDE understands the real build system, i.e. the headless builds run on the build server. You know, the real build, that produces the installable binaries.
Eclipse has a totally separate build system so you need to keep the IDE and the real build in sync using plugins and other voodoo. So if Netbeans has some build system that I can use in both the IDE and on the CI machine I'm a happy camper. So with a smile on my face I set out to run the netbeans project on the CI machine.
That is when the horror starts.
The way netbeans sets up the builds is to have some arcane and weirdly mutated Ant scripts, without documentation, importing things from secret directories. Then have some plugins in the IDE that know the secret locations to insert values here and there. I have never in my life as a professional coder seen a more opaque and strange build system.
For example, it uses an Ant plugin called CopyLibs to, you know, copy JAR-files it needs for the build. It makes me want to grab the engineer by the throat and, foaming at the mouth, scream at them that Ant can already copy files, or why don't you just use Ivy.
So in order for my CI machine to even be able to get all the required dependencies I have to pollute my Ant installation making it harder to have reproducible builds since now my entire build system is dependent on a specific version of Netbeans.
It does not stop there, to make a plugin for Netbeans I must of course build and link and all that against a specific version of Netbeans. This is achieved by including build scripts from a "Harness" directory that is specific to a version of Netbeans. This is all understandable, but the location of this harness-directory is set up in a "private" property file, through one or two ant properties with closely guarded complex names. The private property file is actually hidden in $HOME.
Yeah you heard me, to make the CI machine build it will have to have access to my $HOME to be able to find the harness directory that is from my _installation_ of netbeans, not from some central repository of dependencies.
There is some other ant-voodoo that Netbeans hides in a FAQ to instruct people how to download the harness from the web instead of pointing the CI machine to an actual installation of netbeans. I say it again, an installation, that requires you to run an installer, and click through stuff.
There is more to this story but now I need to go cry in a dark corner. It has taken me several weeks of hobby-coding time to even get this far, and then to realize that netbeans generates images when building so you have to remember to set the JVM as headless or the build fails. And oh yeah, when I did it generated blank images so my menu entries when the module is installed are empty lines.

Glassfish and TeamCity

2012-06-27T21:48:00.003+02:00

At my day job we use TeamCity from JetBrains as the Continuous Integration solution. It works really well, looks good and even has a plug-in for Eclipse. So I thought I'd install it at home too.

Now, I have glassfish as an app server. I like JEE application servers, I think they are neat, I want my glassfish to deploy all my JEE applications. So I found the WAR-download of TeamCity. I read the installation instructions which required me to set some environment flag to change the default location for the data directory (I keep my variable files in /var thank you very much, not in some weird directory in $HOME).

I deploy the application and I am presented with a blank page.

Looking in the log file it says that I have no database definition in my data directory - yeah, like, how the hell could I since I have no clue what you are talking about.

So back to the instructions here http://confluence.jetbrains.net/display/TCD7/TeamCity+Data+Directory. I suspect that I must have _all_ the configuration in place before I deploy the WAR file, at least http://youtrack.jetbrains.com/issue/TW-20362 seems to say that. So in order for me to deploy the WAR-file I must first download and run the Netty-integrated installer to create the directories?!?!

Hello Jenkins, goodbye TeamCity

Svansprogram v3.0 out now

2012-05-18T20:47:00.000+02:00

I just released v3.0 of Svansprogram. Some bug fixes, lots of refactoring and sweet new graphics (icons and banners) by Mattias Persson. Get Svansprogram from http://code.google.com/p/svansprogram/

Oh, sorry, no packaging for Mac yet but I'll do it "real soon now" :-)

Installer woes

2012-04-03T14:40:00.004+02:00

I'm preparing to release v3.0 of my Multi-tail application Svansprogram. One popular feature request is native launchers. Yeah, OK, so this one guy asked for it but since there are about 3 people in the world using svansprogram that is a substantial part of the user base.

Being a Linux guy I have to jump through a lot of hoops to even begin to create a windows EXE-file or an OS X launcher. I can't even test them properly! Anyway, Launch4j seems nice and might even have some type of integration with maven (and I'm switching back to ant + ivy when I find the strength). There is a GUI and some XML files that you can use to work up an exe-file. The GIMP did let me create an ICO-file also. As soon as I can test this somewhere I might put in the hour or so to try this out.

A pal has promised to build an OS X launcher by using some sort of maven magic and another launcher-creator-app-thingy.

So I tackled the Linux side. I decided to distribute an installer with the TAR-ball,

disclaimer - Blogger ate the formatting and yes I KNOW there are symlinks and stuff this does not handle, write a bug report or something :-)

And this, people, is why I LOVE linux: a few lines of BASH and here is your installer.

#! /bin/bash

SVANSPROGRAM_HOME="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
LOG_FILE="$SVANSPROGRAM_HOME/install.log"

# Append a message to the install log
log() {
    echo "$1" >> "$LOG_FILE"
}

log "$(date)"
log "Svansprogram directory: ${SVANSPROGRAM_HOME}"

#######################################
# Create executable file #
#######################################
EXECUTABLE="$SVANSPROGRAM_HOME"/svansprogram.sh
log "Creating shell script: $EXECUTABLE"
cat <<MESSAGEDELIMITER > "$EXECUTABLE"
#!/bin/bash
java -jar "$SVANSPROGRAM_HOME/svansprogram.jar"
MESSAGEDELIMITER

log "Setting execute permission"
chmod +x "$EXECUTABLE"

#######################################
# Create desktop file #
#######################################
INSTALL_DIR=~/.local/share/applications
log "Install directory: ${INSTALL_DIR}"

if [ ! -d "$INSTALL_DIR" ]; then
    log "Install directory does not exist"
    exit 1
fi

log "Creating desktop file"
cat <<MESSAGEDELIMITER > "$SVANSPROGRAM_HOME"/svansprogram.desktop
#!/usr/bin/env xdg-open

[Desktop Entry]
# Version is the Desktop Entry Specification version, not the application version
Version=1.0
Type=Application
Terminal=false
Exec=$EXECUTABLE
Name=Svansprogram
Comment=A Multi tail application
Icon=$SVANSPROGRAM_HOME/icon_128x128.png
Categories=Development
MESSAGEDELIMITER

log "Moving desktop file"
mv "$SVANSPROGRAM_HOME"/svansprogram.desktop "$INSTALL_DIR"

Some laws of enterprise architecture

2012-02-08T23:08:00.002+01:00

Through the years, I've acquired a few observations on enterprise development, which I will pompously refer to as my Laws of Enterprise Architecture. They are as follows:

1: In the end, it's SQL

Sure, the system you're looking at is not just a simple SQL front-end. While it does terminate some functionality in an SQL database, other things are passed on to web services and other support systems. But what do those do? More likely than not, they terminate functionality in an SQL database or pass the request on to other systems that do the same. Eventually, everything ends up as SQL queries. There aren't web services all the way down.

As a corollary, you can determine the efficiency of the architecture by comparing the queries that are actually run for a set of functions with those you would have if you just looked at those functions and translated them into queries to an imaginary database built for the purpose. If you only need a query or two that fetch a grand total of a handful of rows, but the queries that end up hitting the actual databases are a lot more complex (and especially return a lot more data), then you're losing track of the objective somewhere along the way from the user to the database.

2: If the answer to your enterprise integration needs is a bus, then you have bigger problems

If you need a bus to integrate your systems, then you either

a) pretend that you don't know what systems you have, or
b) actually have no control over what's deployed.

If it's a), then you should open your eyes. Pretending that you don't know things that you, in reality, fully control is a sure way to ruin any design. Doing it for the entire architecture is madness. If you're working on the architecture for an enterprise system and you're in case b), then you should work a bit more on the processes. From where do these unknown systems come?

That said, there are cases where a bus makes sense. If you're building a desktop operating system that will be used by millions of users running any number of strange things on their systems, then a bus does make sense. If your enterprise architecture is comparable to millions of installations of a desktop system, however, you have your plate full.

3: Premature generalization is the root of all evil

Premature optimization is the root of all evil, and it applies even when you're not optimizing for runtime performance. If you prematurely optimize for generality, then you will invariably make designs that are more complex than is called for.

For instance, if the cardinality of a relation between two entities is one-to-one, then don't make it one-to-many just because you think it might be useful later on. Yes, it will be painful one day if you end up having to change it, but it will be painful every single day if you needlessly generalize it now.

Another example is that if you're designing a system whose purpose it is to generate HTML that will be sent out over HTTP, then it's OK to make components of the system aware of that now and then. I'm not saying you should generate HTML from the stored procedures in your database, but only that there's no need to make every (or any, actually) component capable of doing everything. If you can simplify the design by assuming, say, that the purpose of the system is what you know it is, then do so.

Conclusion

So, there you go, my Laws of Enterprise Architecture. What are your laws?

System-level component testing

2011-12-05T20:45:00.001+01:00

One of my favourite parts of development is writing automated blackbox functional tests in the component scope. That is: writing automated tests that use a component as it is meant to be used in a system. The definition of "component" that I tend to use in this context is "as much of the code that my team is working on as makes sense and as little else as possible".

If your code just uses other libraries etc., then it's usually very simple to just mock those parts, especially if you're using a dynamic language. I won't waste your time by talking about that. Instead, I'll talk about when it gets interesting - when you're writing system-level tools in a static language.

Since the tool support (and by "tool support" I mean Valgrind) is best on Linux, I first try to make the code build as a self-contained executable in Linux, no matter what the target OS is. For small embedded OSes, you can probably just reimplement the OS functions and you're set. For Windows, suit yourself. That leaves POSIX-like OSes, in which case the code should be fairly easily buildable in Linux. Here's where the fun starts.

So you have some code that calls open(2) on devices that don't exist on your workstation, connects sockets using address families unknown to civilization, or does all kinds of strange things that require root privileges and can't be easily chrooted. How can you possibly write a harness for that? LD_PRELOAD, that's how.

LD_PRELOAD is a little-used feature of the GNU dynamic linker (and others, e.g. the one in Mac OS X) that lets you specify a dynamic library that is injected into an executable before other libraries are loaded. That, in combination with the rule that whoever first defines a symbol wins, means that you can reimplement any function you like in a library that is part of your harness and have those functions be used instead of the versions defined in, say, libc. If the component you're testing opens /dev/thingamajig, then just add a function that looks like open(2), but instead of actually opening the device node just tells your test scripts about it. One useful way of doing that is to have the test running as a separate process and have a Unix Domain Socket (or pair of pipes if you prefer) where you can send messages about what the component tried to do and receive instructions about what to do with the call.

Since reimplementing everything can be both tedious and slow, you may want to forward uninteresting calls to the normal versions of the functions you've overridden. This can be done using dlsym(RTLD_NEXT, "some_function"). That will make the dynamic linker look up the next library that has a symbol called "some_function" and give you a pointer to that. Assigning that to a function pointer variable gives you a way to call, say, the plain old libc open(2) from your magic open(2) for any file that the test scripts deem uninteresting. Something like this:


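#define _GNU_SOURCE   /* for RTLD_NEXT */
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>

/* inform_scripts_about_open(), ask_scripts_what_to_do() and
   handle_open_in_scripts() are provided by the rest of the harness. */

static int (*real_open)(const char *path, int oflag, ... );

/* Look up the libc version of open before the component gets to run */
__attribute__((constructor))
void init_hackery(void)
{
  real_open = dlsym(RTLD_NEXT, "open");
}


int open(const char *path, int oflag, ... )
{
  /* The mode argument is only present when O_CREAT is set */
  int mode = 0;
  va_list va;
  va_start(va, oflag);
  if(oflag & O_CREAT)
  {
    mode = va_arg(va, int);
  }
  va_end(va);

  inform_scripts_about_open(path, oflag, mode);
  switch(ask_scripts_what_to_do())
  {
    case HandleInScripts:
      return handle_open_in_scripts(path, oflag, mode);
    case PassOnToLibc:
    default:
      if(oflag & O_CREAT) return real_open(path, oflag, mode);
      else return real_open(path, oflag);
  }
}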

With this in place, the test scripts will be informed about every single call to open, and when they decide to, they can take over and do something else (but you probably do want it to end up opening some kind of file descriptor in the process you're testing, as you may need to follow the usual rules for file descriptor numbering and reuse between both the files you fake and the ones you pass on to libc). Override the read, write, ioctl and close in the same manner (mapping the file descriptor to some mock object in your scripts), and the component can get its devices and whatnot and you get your tests. Have fun!
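For the other end of that conversation, here is a minimal sketch of what the test-script side could look like in Python. The wire format is an assumption made up for this example (one tab-separated line per call, answered with `handle` or `pass`), and `decide` and `serve` are hypothetical names, not part of any real harness.

```python
import socket

def decide(path):
    """Hypothetical policy: fake anything under /dev/, pass the rest on to libc."""
    return "handle" if path.startswith("/dev/") else "pass"

def serve(conn, policy):
    """Answer call notifications from the preloaded library until it hangs up.

    Assumed message format, one per line: verb TAB path TAB oflag TAB mode
    """
    seen = []
    with conn.makefile("r", encoding="utf-8") as reader:
        for line in reader:
            verb, path, oflag, mode = line.rstrip("\n").split("\t")
            seen.append((verb, path, int(oflag), int(mode)))
            conn.sendall((policy(path) + "\n").encode("utf-8"))
    return seen
```

In a real run the test would create the Unix Domain Socket, start the component with LD_PRELOAD set, and call `serve` on the accepted connection. For trying it out without a component, `socket.socketpair()` gives two connected ends in one process.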

Ada Server Pages

2011-05-13T23:23:00.002+02:00

It seems to me that sometimes, shoe-horning web applications into an Object Oriented design just for the sake of it just doesn't make sense. A class that doesn't have both state and operations is not a class. A framework built on classes that aren't really classes isn't really using the language as it should.

When building a small site, don't you sometimes secretly think that just plain JSP and some static helper methods wasn't all that bad? You're only half wrong. As long as you have a way out to a real language when you grow out of the shell you started in, it's not wrong to use simple tools for a simple end. It's just that for a purely procedural program, maybe Java isn't the right language.

So what's the solution? I'll tell you: Ada Server Pages. Some might call it The Thing That Should Not Be. I call it my first Ada program in almost 15 years. It compiles AdaSP pages into Ada source, builds that into shared libraries, loads them into the server and calls them to serve the pages. And it works! At least on my particular OS X box.

Ada Server Pages. It's here. Tell your friends.

Defence-in-depth for data-base backed websites: Connections

2011-03-24T08:36:00.002+01:00

As we've seen, connections can on some database/os combinations be scarce resources. If we're keeping one connection for each session, then won't that limit the site to something like 1000 sessions? Well, there is a way around that.

In a typical site, there will be a large number of users logged in, who all have sessions in the application (especially with long session timeouts), but not that many are actually active at the same time. If we could disconnect the connections for users who are logged in but idle, and then reconnect when the users become active, then we'd get by using only as many connections as we expect to have simultaneously active users. That number is likely to be much lower, and a site that has 1000 users actively clicking on things in a, say, 5-minute period should probably run Oracle on Solaris anyway.

Basically we get a most-recently-used cache the size of the number of connections that our database and OS can provide for us, and the way we use it looks pretty much the same as a connection pool: get a connection, use it, and release it. The difference is that instead of blocking in the get until there is a connection for us, we may be reconnecting to the database (possibly after waiting, in case the active set is full and we're thrashing).
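As a rough illustration of that cache, here is a small Python sketch. It is not the prototype's code: `ConnectionCache` and the `connect` factory are names invented for the example, and it assumes a single thread where every connection is idle again before the next `get`, so the waiting and thrashing case is not handled.

```python
from collections import OrderedDict

class ConnectionCache:
    """Keep at most `capacity` live connections, closing the least recently used."""

    def __init__(self, capacity, connect):
        self.capacity = capacity
        self.connect = connect       # hypothetical factory: user -> connection
        self.live = OrderedDict()    # user -> connection, least recently used first

    def get(self, user):
        conn = self.live.pop(user, None)
        if conn is None:
            if len(self.live) >= self.capacity:
                # Active set is full: disconnect whoever has been idle longest
                _, idle = self.live.popitem(last=False)
                idle.close()
            conn = self.connect(user)   # reconnect on behalf of this user
        self.live[user] = conn          # mark as most recently used
        return conn
```

A real version would also need a release step and some way to block when every cached connection is in use.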

In a traditional setup, the web app of course knows how to log in to the database, since it's always using the same username etc. In the setup I'm proposing, only the users themselves know how to log on to the database. The web app could technically store the passwords, but that's madness from a security perspective. Cleartext passwords are to be discarded as soon as possible. Thus, we have to try some other way to log on to the database.

One solution that's pretty much made for this scenario is Kerberos. When a user logs in to the site, authenticate to the Authentication Service, get a Ticket-Granting Ticket, and store that one. Whenever you need to connect to the database on behalf of the user, use the user's TGT to get a ticket for the database. Should work in theory, but in practice it can be a nightmare both to set up and to get working in Java. It's possible that this would be smoother in Windows, where I imagine you could put the users in Active Directory and be done with the first part, but whether it will work still depends on the Kerberos APIs and how your database drivers use them.

So, if we're passing on Kerberos, we can go for PAM or just roll our own solution. With PAM we could build a module that will let us use one-time passwords, so that we authenticate with the password via PAM on login, get a cookie, and then use that cookie when reconnecting. On logout, the cookie gets invalidated.

For my prototype, I've skipped even that and gone for pre-salted passwords. What I do is that before I send passwords to the database (including on enrollment), I hash the passwords together with a random per-user salt. That salted password is what the database sees when authenticating the users, and never the cleartext one. The salted passwords are then stored in the sessions of the users, and used when reconnecting. Thus, cleartext passwords are never stored, so an attack that would show the contents of session variables of other users would not immediately give away passwords that the users potentially could be using for other sites.
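The pre-salting step could look something like the sketch below. The post does not say which hash the prototype uses, so PBKDF2 with SHA-256 is purely an assumption here, as are the function names, the salt size and the iteration count. Where the salt is kept between logins is not covered.

```python
import hashlib
import os

def new_salt():
    """Random per-user salt, generated once at enrollment."""
    return os.urandom(16)

def presalt(password, salt):
    """Derive the string the database gets to see as the user's password."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100000)
    return digest.hex()
```

The web app would call `presalt` at enrollment and at login, hand the result to the database as the password, and keep only that result in the session for later reconnects.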

Now that the prototype implementation can have very large numbers of simultaneous users, the real load testing and performance comparison of using a view filter versus working directly on the tables can commence.

1000 connections in PostgreSQL on OS X

2011-02-19T13:53:00.016+01:00

PostgreSQL is not really geared towards more than a few dozen simultaneous connections on desktop operating systems. In this post, I'll show you how to push PostgreSQL 9 on Mac OS X 10.6 to handle 1000 connections on a single system running both the webserver and the database.

The first step to getting to 1000 connections is to make PostgreSQL actually try to do it. By default, it only allows 100 simultaneous connections. Change /Library/PostgreSQL/9.0/data/postgresql.conf from max_connections = 100 to max_connections = 1000.

PostgreSQL creates one process per connection, and the web server will need one TCP connection to each one. Thus, to have 1000 connections, we need to allow at least 1000 processes per user (for the PostgreSQL processes) and at least 1000 open file descriptors per process (for the app server). Each PostgreSQL process seems to use 32 fds, and with some margin in case it grows, we should permit 35000 fds in the system, plus some for other uses (in case you want to do other things on your machine, like logging in to it). The kernel seems to be picky about power-of-two increments, so I'll round the values up to things that it will accept.

Change /etc/sysctl.conf from

kern.maxprocperuid=512
kern.maxproc=2048

to

kern.maxprocperuid=2048
kern.maxproc=4096
kern.maxfiles=40960
kern.maxfilesperproc=2048

This will make sure the kernel reserves enough space etc. In addition, we need to allow processes to create enough sub-processes (as reported by ulimit -u). This limit is set by OS X's equivalent of init, called launchd. Change /etc/launchd.conf (or create it if you don't have it already) to say limit maxproc 2000 2000.

The changes to /etc/sysctl.conf and /etc/launchd.conf both take effect on bootup, so reboot the system and have fun with your 1000 connections!

Three-tiered testing

2011-02-18T06:52:00.003+01:00

The three-tiered program structure has been used to great benefit in many types of programs. I'm using it for my test script. It has been of great help, and I'll show you how I've done it.

The top tier in my tests is the test cases. These read like a high-level description of the steps of the test case, like this:


def test_create_thread():
        session=login("p1", "p1")
        other=get_forum_id(session, "Other")
        before=get_num_threads(session, other)
        create_thread(session, other, "New Thread", "Newly added thread post")
        after=get_num_threads(session, other)
        logout(session)
        return before+1==after

Log in, pick a forum and check how many threads there are. Post a thread and verify that the thread count has increased by one. Easy stuff. All the details about how a thread is created in the application are abstracted away, and what's left is just the code related to the test case. So what do the get_num_threads etc. functions in the middle layer (application adaptation might be a name for it) look like?


def get_num_threads(session, forum_id):
        return int(fetch_data(session, "forum?id=" + forum_id, "count(//html:tr[@class='thread'])"))

These functions tell the bottom layer which URL to fetch, and what parts of the results they are interested in. As you can see, there's an XPath query there. In some, like get_thread_id, where the text of a DOM node is too much, a regex can also be used to pick out parts of the text:


def get_thread_id(session, forum_id, title):
        return fetch_data(session, "forum?id=" + forum_id, "//html:th[@class='subject']/html:a[text()='%s']/@href" % title, "id=(\\d+)")[0].group(1)

Since I'm testing a web application, the functions in the adaptation layer are of course implemented by fetching and parsing web pages. For other applications, I expect this layer to be implemented by composing and decomposing structured messages sent on some link, direct function calls etc., but the role of the layer is the same: provide functions that correspond to functionality in the application, so that the upper layer can talk about things like reading a post instead of details about where the post is read from etc.
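To play with the adaptation layer without a running application, the idea can be sketched using only the standard library. This is not the code from the test suite: `fetch_page` is a stand-in that returns a canned page, the markup in `PAGE` is made up, and ElementTree replaces html5lib plus a full XPath engine, so it only copes with well-formed markup and a small XPath subset.

```python
import re
import xml.etree.ElementTree as ET

NS = {"html": "http://www.w3.org/1999/xhtml"}

# Made-up forum page, standing in for what the application would send
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"><body><table>
<tr class="thread"><th class="subject"><a href="thread?id=17">Hello</a></th></tr>
<tr class="thread"><th class="subject"><a href="thread?id=42">New Thread</a></th></tr>
</table></body></html>"""

def fetch_page(session, url):
    """Bottom layer stand-in: parse a canned page instead of doing HTTP."""
    return ET.fromstring(PAGE)

def get_num_threads(session, forum_id):
    doc = fetch_page(session, "forum?id=" + forum_id)
    return len(doc.findall(".//html:tr[@class='thread']", NS))

def get_thread_id(session, forum_id, title):
    doc = fetch_page(session, "forum?id=" + forum_id)
    for link in doc.findall(".//html:th[@class='subject']/html:a", NS):
        if link.text == title:
            # Pick the numeric id out of the link target
            return re.search(r"id=(\d+)", link.get("href")).group(1)
    return None
```

The test cases in the top tier would not notice the difference, which is rather the point of having the middle layer.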

The bottom layer then is the workhorse functions like fetch_data and post. Here, I'll show fetch_data, which has proven itself to be very useful:


def fetch_data(session, url, xpath_query, regex=None, params=None):
        conn=httplib.HTTPConnection("localhost", 8080)
        encoded_params = urllib.urlencode(params or {})
        headers = {"cookie":session}
        conn.request("GET", "/myapp/" + url, encoded_params, headers)
        response = conn.getresponse()
        if response.status != 200:
                print response.read()
                raise Exception("Failed to fetch data. status=%d, reason=%s" % (response.status, response.reason))
        html=response.read()
        doc=html5lib.parse(html, treebuilder="dom")
        context = xpath.XPathContext()
        context.namespaces['html'] = 'http://www.w3.org/1999/xhtml'
        results=context.find(xpath_query, doc)
        conn.close()
        if regex:
                r=re.compile(regex)
                results=map(lambda node:r.search(node.value), results)
        return results

Build the request using the URL from the caller. Send it, verify that it was OK, and parse the response. Pick out the part that the caller is interested in, and return it. In the old days, parsing HTML was practically impossible. Only a handful of companies had the resources necessary to write an HTML parser that could parse HTML as it is, not as it should be. Even though I'm attempting to have my application only send out valid HTML, it may fail (and we should assume it does here: this is the test suite after all!), so having a parser that can handle anything would of course be nice.

Enter the new HTML spec, where Hixie has done an astounding job in specifying how tagsoup can be parsed in a way that is compatible with the major browsers. Since we now have a spec for parsing HTML, little parsing libs based on this keep popping up everywhere. I'm using the Python html5lib, which can produce DOM trees, which in turn support XPath queries.

That's it for the how of three-tiered tests. Now the why: In addition to having easily readable test cases for the functionality test, it also helped with the load test. Having the middle layer in place meant that the load test I'm writing has simply been a joy to write. Had the functionality testing been done using copy-pasted HTTP/HTML-related code (as it was before I started restructuring it), I'd have to start over. Now, I had almost every function I needed, with names that make sense. Just look at it!


sessions=[]
for i in range(num_users):
        username="load_user_%s_%d" % (instance, i)
        password=username
        signup(username, password, "%s@example.org" % username)
        session=login(username, password)
        sessions.append(session)

for i in range(num_actions):
        session=random.choice(sessions)
        forum_ids=get_fora(session)
        forum_id=random.choice(forum_ids)

        if random.randint(1,10)<10:
                post_to_existing_thread(session, forum_id)
        else:
                create_new_thread(session, forum_id)

for session in sessions:
        logout(session)

Here, post_to_existing_thread and create_new_thread are functions similar to the test cases in the functionality test. All in all, I had to add two new functions to the adaptation layer. The rest was reused, and the load test is (at least to me) plainly readable.

So: the three-layered approach to writing tests definitely zooms. Not only should you use it for your next project: you should apply it to the tests in your current one as soon as possible!

Android texting the wrong person

2011-01-01T19:25:00.002+01:00

This hasn't happened to me (yet) but Engadget is writing about it.

http://www.engadget.com/2010/12/31/android-still-has-horrible-text-messaging-bugs-thatll-get-you-f/

and it is reported

https://code.google.com/p/android/issues/detail?id=9392

It is a pretty serious issue, hope it doesn't happen to me.

Defence-in-depth for data-base backed websites: Roles

2010-11-22T10:16:00.000+01:00

Role-based access control has been hugely influential in how we do authorization. So much so that it's almost difficult to find an application that makes authorization decisions for users but does not have a concept of roles. Thus, a complete solution for doing application-level authorization decisions in the database must support application-level roles.

There are two major takes on how roles should be used. On one hand, we have what I'll call big roles, like User, Administrator, Support, and so on. Users usually have a single role. On the other hand, we have small roles, like the ones in Solaris' RBAC: Printer Management or File System Security, and so on, each letting users run a handful of privileged commands. Users usually have a whole bunch of roles, and get the permissions of the union of the permissions of the roles.
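As a rough illustration of the small-role model, the sketch below computes a user's effective permissions as the union of the permissions of each role. The role and permission names are made up for the example and are not taken from Solaris or from the forum application.

```python
# Hypothetical small roles and permissions, invented for this example.
ROLE_PERMISSIONS = {
    "printer_management": {"add_printer", "clear_print_queue"},
    "file_system_security": {"change_owner", "change_mode"},
}


def effective_permissions(roles):
    """Return the union of the permissions of all the given roles."""
    permissions = set()
    for role in roles:
        permissions |= ROLE_PERMISSIONS[role]
    return permissions


print(sorted(effective_permissions(["printer_management", "file_system_security"])))
```

In the big-role model the list passed in would normally contain a single role, so the union is trivial.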

In Postgresql, there is support for role-based access control. Users have roles, and roles have permissions. We've used it when creating users in the procedure, which among other things does something like CREATE USER db_username WITH PASSWORD password IN ROLE kroll_role_user. Permissions are then given to the role kroll_role_user, and not to individual users. We also use the search path functionality to ensure that when users ask for a named table, they get a similarly named view in the schema we set up for them.

These two things, database roles and search paths, can be used to implement application-level big roles. By giving different classes of users different database roles, and creating view schemas that match what that role is supposed to be permitted, we can do things like allowing a moderator to set the sticky field of a row in the threads table, while still restricting the view used by regular users to reading threads and inserting new threads with sticky set to false.

If we modify the application to play along, we can give users access to several roles, and by prefixing table names with the schema for a role, the application can specify what authority it is asserting for each operation. That way, the user can operate under the rules of one role by default, but escalate when doing specific operations. A bit like sudo.

So what about small roles? I haven't studied it closely yet, but I believe that they could for the most part be implemented reasonably cleanly using WHERE clauses on the views. Any authorization rule that depends on information that is stored in the database should be possible to formulate as a WHERE clause (e.g. if we have some authorization matrix that the application already uses, then we can select the row for the current user, the column for the current operation, and check that the operation is permitted).
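A minimal sketch of the authorization-matrix idea, using Python's built-in sqlite3 module rather than Postgresql. Everything here is an assumption made for illustration: the permissions table, the 'read_posts' operation name, and the session table that stands in for current_user (SQLite has no logins). The point is only that the check fits in a WHERE clause on the view.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT);

    -- Hypothetical authorization matrix: one row per permitted (user, operation).
    CREATE TABLE permissions (appuser INTEGER, operation TEXT);

    -- Stand-in for current_user, since SQLite has no notion of logins.
    CREATE TABLE session (appuser INTEGER);

    -- The view only yields rows if the matrix permits the current user to read.
    CREATE VIEW readable_posts AS
    SELECT id, body FROM posts
    WHERE EXISTS (
        SELECT 1 FROM permissions
        WHERE appuser = (SELECT appuser FROM session)
          AND operation = 'read_posts'
    );

    INSERT INTO posts (body) VALUES ('hello'), ('world');
    INSERT INTO permissions VALUES (1, 'read_posts');
""")


def posts_visible_to(appuser):
    """Switch the simulated current user and read through the view."""
    db.execute("DELETE FROM session")
    db.execute("INSERT INTO session VALUES (?)", (appuser,))
    query = "SELECT body FROM readable_posts ORDER BY id"
    return [row[0] for row in db.execute(query)]


print(posts_visible_to(1))  # user 1 has the permission
print(posts_visible_to(2))  # user 2 does not
```

SQLite does no access control of its own, so nothing stops a client from reading posts directly; in the Postgresql setup the base table would live in a schema the user cannot reach.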

What we get from the database can be seen as data-level permissions, and schema-level permissions. By data-level, I mean that it operates on a row-by-row basis, where we can say that even if a user can sometimes be permitted to update a row in a table, a particular row may not be permitted (e.g. users can edit rows in the posts table if and only if they are specified in the author column). The schema-level permissions are things like allowing users to read from users, but not write. I believe that schema-level permissions match big roles in the application, and data-level matches small roles, but saying anything definite on this will require studies of real-world role usage in web applications.

There may be situations where you want to combine multiple schema-level roles (i.e. when they do not match the big role idea directly), and you cannot have users select one at login, nor have the application escalate. One example could be if you have one role for letting customer service representatives see details about the specific issues that they handle and one role for giving access to summaries of entire categories of issues for statistics generation, and then have a page for displaying summary information about a specific issue. In that case you probably can't decide which schema to use beforehand. In these cases, combination roles can be created, which give the permissions of both roles. In the general case, this is not feasible: the number of roles combining two roles grows as the square of the number of roles, and there could in principle be a need for any combination of roles, leading to O(2ⁿ) combination roles. In practice, it's more likely that it's one or two roles (remember, these are schema-level roles we're talking about here) that need to be combined with ordinary roles. In that case, combination roles should be possible to use.
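To put rough numbers on that growth, the sketch below counts how many combination roles would be needed for n ordinary roles, first if only pairs are ever combined and then if any subset of two or more roles might be needed. The function names are made up for the example.

```python
def pairwise_combination_roles(n):
    """Number of ways to pick two distinct roles out of n."""
    return n * (n - 1) // 2


def arbitrary_combination_roles(n):
    """Number of subsets of n roles that contain at least two roles."""
    # 2**n subsets in total, minus the empty set and the n single-role subsets.
    return 2 ** n - n - 1


for n in (3, 5, 10):
    print(n, pairwise_combination_roles(n), arbitrary_combination_roles(n))
```

Already at ten ordinary roles the arbitrary case needs over a thousand combination roles, which is why it only works when a handful of roles ever need combining.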

Combination roles can easily be implemented by having the search path for users with combination roles specify first the combination role, and then one of the ordinary ones. That way, if the ordinary role has sufficient rules for a view, then the combination role schema can just omit the view. For views like the one in the example above, a simple union between the views of the ordinary roles and a rule for inserting/updating etc via the corresponding view in the schema of one of the ordinary roles should suffice. It's only when both roles can modify the view and it's not clear from the context of the application which authority the user is asserting when doing the update that the combination role gets complicated. I don't see this as a real-world problem, but more experimentation will tell.
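The lookup order can be modelled in a few lines of Python. This is a toy model of name resolution through a search path, not how Postgresql implements it, and the schema and view names are invented to match the customer service example.

```python
# Hypothetical schemas and the views each one defines.
SCHEMAS = {
    # Combination role: only defines the view that needs merged rules.
    "support_and_stats": {"issue_summary"},
    # Ordinary role: has sufficient rules for everything else.
    "support": {"issues", "issue_summary"},
}


def resolve(name, search_path):
    """Return the first schema-qualified match for name along the search path."""
    for schema in search_path:
        if name in SCHEMAS[schema]:
            return "%s.%s" % (schema, name)
    raise LookupError("no view named %r on the search path" % name)


# Combination role first, then one of the ordinary roles.
search_path = ["support_and_stats", "support"]
print(resolve("issue_summary", search_path))
print(resolve("issues", search_path))
```

Views that the combination schema omits fall through to the ordinary role's schema, so only the views that genuinely differ have to be written twice.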

This concludes the implementation part of the series. Next up will be a post on unsolved problems, and performance test results.

Release of zoom-desktop

2010-10-31T16:24:00.003+01:00

I've just made a new release of zoom-desktop and the applications there. Notably the tailing-application 'Svansprogram' has been updated.
Check it out, comments are more than welcome. Please write bug reports or feature requests on the issue tracker.
code.google.com/p/zoom-desktop

JavaOne 2010

2010-09-23T01:23:00.000+02:00

How to make JavaOne suck?
Let Oracle host it.

Defence-in-depth for data-base backed websites: Writing

2010-08-19T20:57:00.005+02:00

Writing through views seems to be quite different in different database servers. In Postgresql, it's done using something they call rules. Rules rewrite the queries early on, before they are sent off to the planner. Views are implemented as rules, so we actually already used them in the previous post, even though it didn't show.

If we have a table for forum threads like this:

CREATE TABLE impl.threads
(
  id serial not null,
  forum integer not null,
  locked integer not null default 0,
  sticky integer not null default 0,
  CONSTRAINT pk_threads PRIMARY KEY (id),
  CONSTRAINT fk_threads_forum FOREIGN KEY ("forum")
      REFERENCES impl.fora (id) ON DELETE CASCADE
);

and we want users to be able to start new threads, but not to be able to specify the locked or sticky values, we can create a rule on the view like this:

CREATE OR REPLACE RULE threads_insert AS
ON INSERT TO kroll_user.threads
DO INSTEAD
INSERT INTO impl.threads (forum, locked, sticky) VALUES (NEW.forum, 0, 0);

For the private messages view, we can set the sender to the current user, regardless of what the user tried:

create or replace rule messages_insert as
on insert to kroll_user.messages do instead
insert into impl.messages ("to", "from", posted, subject, text)
values (
  NEW."to", kroll_user.get_current_app_user(), NEW.posted,
  NEW.subject, NEW.text
);

Conditionals get a bit trickier. Posts should only be permitted to threads that are not locked. An insert to a locked thread should be ignored. Due to a limitation in Postgresql, it's impossible for it to figure out that if it shouldn't do anything then it should do nothing, so we have to explicitly tell it:

CREATE VIEW kroll_user.posts as select * from impl.posts;

grant select, insert, update on kroll_user.posts to kroll_role_user;

create or replace rule posts_insert as
on insert to kroll_user.posts
WHERE (select locked from impl.threads where threads.id=NEW.thread)<>1
DO ALSO
insert into impl.posts (thread, author, subject, body, created)
values (
  NEW.thread, kroll_user.get_current_app_user(), NEW.subject,
  NEW.body, now()
);

create or replace rule posts_insert_default as
on insert to kroll_user.posts
DO INSTEAD NOTHING;

Postgresql runs all rules that have matching WHERE clauses (and if there is no WHERE, then the rule always matches), so for a post to a locked thread, it (like the goggles) does NOTHING, while for a post to an unlocked thread, it both inserts the row and does NOTHING.
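For readers without a Postgresql server at hand, the observable behaviour (inserts to locked threads are silently dropped) can be tried out with Python's built-in sqlite3 module. SQLite has no rule system, so this sketch swaps in a different mechanism: an INSTEAD OF trigger with a WHEN clause. The table and view names are simplified stand-ins for the ones above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, locked INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, thread INTEGER NOT NULL, body TEXT);
    CREATE VIEW user_posts AS SELECT * FROM posts;

    -- Fires only when the target thread is not locked; otherwise the
    -- insert on the view does nothing at all.
    CREATE TRIGGER user_posts_insert
    INSTEAD OF INSERT ON user_posts
    WHEN (SELECT locked FROM threads WHERE id = NEW.thread) <> 1
    BEGIN
        INSERT INTO posts (thread, body) VALUES (NEW.thread, NEW.body);
    END;

    INSERT INTO threads (id, locked) VALUES (1, 0), (2, 1);
""")

db.execute("INSERT INTO user_posts (thread, body) VALUES (1, 'allowed')")
db.execute("INSERT INTO user_posts (thread, body) VALUES (2, 'ignored')")

rows = db.execute("SELECT thread, body FROM posts").fetchall()
print(rows)  # only the post to the unlocked thread is stored
```

Unlike the Postgresql version, no separate do-nothing default is needed here: when the WHEN clause is false the trigger simply does not fire.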

Updates are handled almost identically:

create or replace rule posts_update as
ON UPDATE TO kroll_user.posts
WHERE (
  select locked from impl.threads
  where threads.id=(select thread from impl.posts where id = OLD.id)
)<>1
DO ALSO
UPDATE impl.posts
SET subject=NEW.subject, body=NEW.body, edited=now()
WHERE id=NEW.id and author=kroll_user.get_current_app_user();

Writing to sequences can be done in a manner similar to how we did reading: by virtualizing the accessor functions:

create function kroll_user.nextval_threads_id_seq() returns bigint as
'select pg_catalog.nextval(''impl.threads_id_seq'');'
language sql security definer;
  
/* ... */
  
create function kroll_user.nextval(unknown) returns bigint as
'select case
    when CAST($1 as text)=''messages_id_seq''
    then kroll_user.nextval_messages_id_seq()
    when CAST($1 as text)=''posts_id_seq''
    then kroll_user.nextval_posts_id_seq()
    when CAST($1 as text)=''threads_id_seq''
    then kroll_user.nextval_threads_id_seq() 
    else pg_catalog.nextval(CAST($1 as text))
  end;'
language sql;

As stated in the beginning, this is how things work in Postgresql, and other database servers will require things to be done in different ways. For Oracle, MS-SQL, and DB2, the views are updatable automatically, without any extra rules. In some cases (like with the locked and sticky columns), you may have to use triggers to make certain parts non-writeable.

Next post will move back into vendor-neutral theory again. I'll discuss role-based access control and how the database can get involved in enforcing application roles.

Defence-in-depth for data-base backed websites: Reading

2010-08-19T09:02:00.004+02:00

Now that we can create users and log them in, let's try to let them read some data. The actual tables are locked away in the impl schema to which regular users have no access. Selected parts of the tables can be made available through views. As we saw in the introduction, a table containing private messages between users can be exposed as a view that selects only the messages to or from the current user like so:


CREATE VIEW kroll_user.messages AS
SELECT id, "from", "to", posted, subject, text
FROM impl.messages
WHERE "to"=(SELECT id FROM kroll_user.current_app_user)
  OR "from"=(SELECT id FROM kroll_user.current_app_user);

So what's this SELECT id FROM kroll_user.current_app_user? current_app_user is simply a view containing the appuser/dbuser mapping for the current user:


CREATE VIEW kroll_user.current_app_user AS
(SELECT appuser as id FROM mapping.users WHERE dbuser=current_user);

current_user is a special SQL-standardized function that is called without the usual parentheses and returns the login name of the current database user as a string. Or, that's what Postgresql returns. There seems to be some disagreement on whether it should be the login name or the current security context, with Postgresql, DB2, and Firebird in the first camp and MS-SQL, MySQL and Oracle in the second. As long as we don't call it from a privileged stored procedure, that should not matter, however.

If we would like to filter out columns instead of rows, then this can be done by selecting null instead of the data for the column. Row-based and column-based filtering can also be combined. For the users table, I want to hide the email address of all users, except the logged-in user's own address. UNION to the rescue:


CREATE VIEW kroll_user.users AS (
  select id, "name", null as email, moderator from impl.users
  where id <> kroll_user.get_current_app_user()
) UNION (
  select id, "name", email, moderator from impl.users
  where id = kroll_user.get_current_app_user()
);
grant select on kroll_user.users to kroll_role_user;

Setting up read-only views of tables is pretty much straightforward: decide what cells users should be allowed to see in each table, come up with SELECT statements that pick those, and make them into views.
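The same recipe can be tried out with Python's built-in sqlite3 module. This is only a sketch under some assumptions: SQLite has no users or schemas, so a one-row session table (made up for the example) plays the part of the current user, and the base table is not actually locked away as it would be in Postgresql.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT);

    -- Stand-in for the current application user.
    CREATE TABLE session (appuser INTEGER);

    -- Everyone else's row with the email blanked out, plus our own row in full.
    CREATE VIEW visible_users AS
        SELECT id, name, NULL AS email FROM users
        WHERE id <> (SELECT appuser FROM session)
        UNION
        SELECT id, name, email FROM users
        WHERE id = (SELECT appuser FROM session);

    INSERT INTO users VALUES
        (1, 'alice', 'alice@example.org'),
        (2, 'bob', 'bob@example.org');
    INSERT INTO session VALUES (1);
""")

rows = db.execute("SELECT id, name, email FROM visible_users ORDER BY id").fetchall()
for row in rows:
    print(row)
```

With user 1 as the current user, bob's email comes back as None while alice sees her own address.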

Sequences are a bit trickier. Postgresql defines a sequence to be a "special single-row table" that can be used only via the functions nextval, currval, and setval. Unfortunately, Postgresql only allows currval to be run on actual sequences, and not on views of sequences. Since the sequences are data and thus stored in the impl schema to which users don't have access, we have to make a little workaround. Instead of virtualizing the sequence, we can virtualize the accessor functions:


create function kroll_user.currval_threads_id_seq() returns bigint as
'select pg_catalog.currval(''impl.threads_id_seq'');'
language sql security definer;

create function kroll_user.currval(unknown) returns bigint as
'select case
  when CAST($1 as text)=''threads_id_seq''
    then kroll_user.currval_threads_id_seq()
  else pg_catalog.currval(CAST($1 as text))
end;'
language sql;

This creates a new function called currval in the user schema (which is the first schema in the users' search path) that checks if the requested sequence is the one we want to virtualize (threads_id_seq). If it is, then we use a privileged access function for that specific sequence (which is a database object that we can explicitly grant access to). If it isn't, then we delegate to the built-in currval (which in Postgresql is stored in the pg_catalog schema). The delegation isn't strictly necessary if users don't have access to non-virtualized sequences, but it ensures we get correct error reporting at least.
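Stripped of the SQL, the dispatch logic looks roughly like this in Python. It is a toy model only: the dictionaries stand in for sequences, builtin_currval stands in for pg_catalog.currval, and all the values are made up.

```python
# Toy stand-ins for sequences. The "impl" ones are what users cannot touch directly.
_impl_sequences = {"threads_id_seq": 41}
_public_sequences = {"scratch_seq": 7}


def currval_threads_id_seq():
    """Privileged accessor for one specific sequence (the security definer part)."""
    return _impl_sequences["threads_id_seq"]


def builtin_currval(name):
    """Stand-in for the built-in function, including its error reporting."""
    if name not in _public_sequences:
        raise LookupError("relation %r does not exist" % name)
    return _public_sequences[name]


# Sequences we virtualize, mapped to their privileged accessors.
VIRTUALIZED = {"threads_id_seq": currval_threads_id_seq}


def currval(name):
    """User-facing currval: dispatch to an accessor, or delegate to the built-in."""
    accessor = VIRTUALIZED.get(name)
    if accessor is not None:
        return accessor()
    return builtin_currval(name)


print(currval("threads_id_seq"))  # goes through the privileged accessor
print(currval("scratch_seq"))     # delegated to the built-in stand-in
```

Asking for a name that is neither virtualized nor known to the built-in raises the built-in's own error, which is the error-reporting benefit of delegating.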

That's about all there is to reading and filtering data, which is the majority of what most web apps do, and most of the code is plain old standard SQL. Next up will be writing data, which will lead us into heavy vendor-specific terrain.