2012-11-21

Tracking down memory corruption by mprotecting your ADTs

In C, it's customary to design your code around Abstract Data Types, that is, modules that consist of a header file that declares the external interface of the module (consisting of an opaque struct and a set of functions operating on that struct), and an implementation file (which has the full declaration of the structure, the definitions of the functions in the header and any helper functions). The header would be something like this:
#ifndef INCLUDE_GUARD_STACK_H
#define INCLUDE_GUARD_STACK_H

#include <stddef.h>
#include <stdbool.h>

struct stack;

struct stack* stack_create(size_t size);
bool stack_push(struct stack* stack, int i);
bool stack_pop(struct stack* stack, int* i);
void stack_destroy(struct stack* stack);

#endif

The implementation, then, is:

#include "stack.h"

#include <stdlib.h>

struct stack
{
  int* elements;
  size_t used;
  size_t allocated;
};

struct stack* stack_create(size_t size)
{
  struct stack* stack = malloc(sizeof(struct stack));
  stack->elements = malloc(size * sizeof(int));
  stack->used = 0;
  stack->allocated = size;
  return stack;
}

bool stack_push(struct stack* stack, int i)
{
  if (stack->used == stack->allocated) {
    goto error_full;
  }
  stack->elements[stack->used++] = i;
  return true;
error_full:
  return false;
}

I'll leave the implementation of the rest of the functions to your imagination. Since only the forward declaration of the struct is in the header, no code outside the implementation can access the members of the struct.

Now, assume we have a memory corruption fault somewhere in the rest of the program which when triggered corrupts the elements pointer but doesn't have any other effects. Our program them seems to be working fine until some later time when it suddenly crashes due to an invalid memory access in stack_push. We'd really like to get the program to abort at the point of the original corruption of the elements pointer, but how can we do that?

One way of solving that is to use the fact that since the structure is opaque, there is no way that any code outside our implementation file has any legitimate use of touching any of the memory to which struct stack* points. Since no other code has any business accessing that memory, then maybe we can have the OS help us preventing it from doing that? Enter mprotect.

The mprotect function lets us control what types of access should be permitted to a region of memory. If we access the memory in any other way, the OS is free (and in some cases even required) to abort our program at the spot. If we keep the memory inaccessible at all times except for when we use it inside our implementation functions, then chances are we can catch the memory corruption as it happens. The mprotect man page does say that the memory it protects has to be page aligned, though. How do we do that? Via posix_memalign and getpagesize, like so:

#include "stack.h"
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

struct stack* stack_create(size_t size)
{
  struct stack* stack;
  posix_memalign((void**)&stack, getpagesize(), sizeof(struct stack));
  posix_memalign((void**)&stack->elements, getpagesize(), size * sizeof(int));
  stack->used = 0;
  stack->allocated = size;
  protect(stack);
  return stack;
}

Now we just have to implement and use the protect function mentioned above and its inverse, unprotect:

static void protect(struct stack* stack)
{
  mprotect(stack->elements, stack->allocated * sizeof(int), PROT_NONE);
  mprotect(stack, sizeof(*stack), PROT_NONE);
}
static void unprotect(struct stack* stack)
{
  /* Unprotect in the reverse order, or we crash and burn trying to read stack->elements */
  mprotect(stack, sizeof(*stack), PROT_READ|PROT_WRITE);
  mprotect(stack->elements, stack->allocated * sizeof(int), PROT_READ|PROT_WRITE);
}
bool stack_push(struct stack* stack, int i)
{
  unprotect(stack);
  if (stack->used == stack->allocated) {
    goto error_full;
  }
  stack->elements[stack->used++] = i;
  protect(stack);
  return true;
error_full:
  protect(stack);
  return false;
}

Since this ends up modifying bits in the MMU, it may not be suitable to have enabled on performance critical code, so an #ifdef NDEBUG switch that selects an empty implementation of protect and unprotect for non-debug builds could be advisable.

So does protecting your ADTs via mprotect zoom? Well, it does come in handy at times, and there's not much disadvantage to using it, so my verdict is: Zooms!

2012-10-28

Compiling with LLVM

LLVM is, among many other things, a reusable compiler backend implemented as a library. In this post, I'll show how the llvmpy Python bindings can be used to write an optimising cross-compiler in a few lines of Python. As a starting point, let's re-use the old parser and interpreter from the Python Parsing Combinators series. Since the grammar only has operators and numbers, we'll need to extend it a bit - there's not much point in spending time compiling something that isn't going to take any input. Let's add named values like so:
identifier = RegEx("[a-zA-Z_][a-zA-Z0-9_]*").apply(Identifier)
simple = number | paren | identifier
Next, we'll need an implementation for this new node in our Abstract Syntax Tree:
class Identifier(object):
    def __init__(self, s):
        self.name=s
    def eval(self, env):
        return env[self.name]
    def __repr__(self):
        return repr(self.name)
As you can see if you compare this eval to the old ones, we now take a dictionary as argument to eval, and that it maps names to numbers. The old eval implementations should be extended to pass along this environment, but Identifier is the only one that has a use for it.

Interpreting is now done thus:

expr.parse("a+1", 0).next()[0].eval({"a":4})
Now, the topic of this post is compilation, not interpretation. Let's start by importing the llvmpy bindings:
from llvm import *
from llvm.core import *
from llvm.ee import *
In LLVM, you have modules (roughly object files) that contain functions, which take arguments and contain a basic block. Basic blocks contain instructions, which can in turn contain things like basic blocks or call functions. Creating a module is simple:
m = Module.new("m")
The function we want to add to the module should correspond to the expression we're compiling. It should return an int, and its arguments should be the identifiers that are used in the expression. We can find the identifiers by adding a method to the AST nodes that return the union of all identifiers in the sub-tree (it's trivial, so I'll spare you the implementation). Let's sort them ASCIIbetically and use that as our arguments. We're going to need a mapping from identifier name to function arguments, so let's build that up as well.
identifiers = sorted(list(ast.identifiers()))
ty_int = Type.int()
ty_func = Type.function(ty_int, [ty_int] * len(identifiers))
f = m.add_function(ty_func, "f")
args = {}
for i in range(len(identifiers)):
    f.args[i].name = identifiers[i]
    args[identifiers[i]] = f.args[i]
Now, function bodies consist of a basic block (i.e. a series of statements). You can add statements to a basic block using a builder.
bb = f.append_basic_block("entry")
builder = Builder.new(bb)
Now we're ready for the interesting parts. For each AST node type, we add a method for compiling to LLVM opcodes. Since LLVM uses Static Single Assignment, each opcode results in the assignment of a variable, and this variable is then never modified. Every method on the builder that appends an instruction to the basic block returns the resulting value, which makes it natural to do the same for our compile method: we return the value that will hold the results of the computation. As arguments, let's take the builder that we use for appending the instructions, the type to use for our numbers, and the argument mapping we built up. Identifiers are simple: we just return the argument that corresponds to the identifier:
class Identifier(object):
    #...
    def compile(self, builder, ty_int, args):
        return args[self.name]
Numbers are also easy - we just return a constant value.
class Number(object):
    #...
    def compile(self, builder, ty_int, args):
        return Constant.int(ty_int, self.n)
Now, how do we do our two remaining AST node types, addition and multiplication? We can do this by first emitting the code for the left hand side of the operation, remembering the value that will hold the results for that. Then we do the same for the right-hand side, and finally, we emit a multiplication instruction that takes the resulting values for the left hand side and the right hand side as its operands.
class Mul(BinOp):
    #...
    def compile(self, builder, ty_int, args):
        a = self.a.compile(builder, ty_int, args)
        b = self.b.compile(builder, ty_int, args)
        return builder.mul(a, b)
Addition is of course the same, except for using builder.add. We can now let the basic block of our function return the value that will hold the result of evaluating our entire AST by doing
builder.ret(ast.compile(builder, ty_int, args))
We can now generate machine code from this, and run it directly:
ee = ExecutionEngine.new(m)
env={
    "a": GenericValue.int(ty_int, 100),
    "b": GenericValue.int(ty_int, 42)
}
args=[]
for param in identifiers:
    args.append(env[param])
retval = ee.run_function(m.get_function_named("f"), args)
This will build up a function in memory, which we can pass some arguments and call. This is really quite amazing. You call some functions to describe how to do something, and then you get a function back that you can execute directly. This can be used for all sorts of interesting things, like partial specialisation of a function at runtime, to take one example. Let's say you have to do a fully unindexed join on two strings in a database. For each value in the first table, you will need to do a string comparison with every single value in the other table. Since none of the strings we're comparing are known at compile time, there's little the compiler can do except hand us its very best version of a general strcmp. With LLVM, we can do better: for each string in one of the tables, generate a specialised string comparison function that compares a string in the other table to that particular one. Such functions can be optimised quite a bit.

The fun doesn't stop here either. In addition to generating functions in memory, we can serialise them to bitcode files using the to_bitcode method of LLVM modules. That generates a .bc file for which we can generate assembler code for some particular architecture, and then use the platform assembler and linker to generate an executable. To do that, however, we first need a main function. Since the function we generated for our expression above takes ordinary integers, we'll have to generate a main function that will take inputs form somewhere (e.g. the command line), convert that to integers, pass them to the function, and then make the result known to the user. To make things easy, let's just call the libc function atoi on argv[1], argv[2],... and use the result of the function as the exit code of our program.

ty_str = Type.pointer(Type.int(8))
ty_main = Type.function(ty_int, [ty_int, Type.pointer(ty_str)])
main = m.add_function(ty_main, "main")
main_bb = main.append_basic_block("entry")
bmain = Builder.new(main_bb)
argv=main.args[1]
int_args=[]
for i in range(1, len(identifiers)+1):
    # atoi(argv[i])
    s = bmain.load(bmain.gep(argv, [Constant.int(ty_int, i)]))
    int_args.append(bmain.call(atoi, [s]))

    bmain.ret(bmain.call(f, int_args))
Here, ty_str is a character pointer, and main is a function from int and pointer to character pointer to int. In its basic block, we add one call to atoi for each identifier that our expression uses. The argument to atoi would in C be written as argv[i], but here, we do it by first getting an element pointer ("gep" is short-hand for Get Element Pointer) to the string we want, and then dereference that. Think of the "load" as the * and the gep as the + in *(argv+i)

We then add a call to our function, and return its result as our exit code. But what does atoi refer to here? We must first tell LLVM that we want to call something from the outside of our program. It will give us back a reference to the function, so that we can build up calls to that something.

ty_atoi = Type.function(ty_int, [ty_str])
atoi = m.add_function(ty_atoi, "atoi")
atoi.linkage = LINKAGE_EXTERNAL
All that is left before we can generate our very own native executables is file handling and calling the right tools. On Mac OS X, it goes a little something like this:
import os

if filename.endswith(".zoom"):
    basename = filename[:-5]
else:
    basename = "a"

bitcode = basename + ".bc"
asm = basename + ".s"
obj = basename + ".o"
executable = basename

f=open(bitcode, "wb")
m.to_bitcode(f)
f.close()

if target=="x86":
    os.system("llc -filetype=obj %s -o %s" % (bitcode, obj))
    os.system("ld -arch x86_64 %s -lc /Developer/SDKs/MacOSX10.6.sdk/usr/lib/crt1.o -o %s" % (obj, executable))
else:
    os.system("llc -mtriple=arm-apple-darwin -filetype=obj %s -o %s" % (bitcode, obj))
    os.system("ld -arch_multiple -arch arm %s -o %s -L/Developer/Platforms/iPhoneOS.platform/DeviceSupport/4.2/Symbols/usr/lib /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS4.3.sdk/usr/lib/crt1.o -lSystem " % (obj, executable))
In order to build for other targets, you can specify -mtriple=xxx on the llc command line. You will of course have to run a matching version ld in order to generate your executable.

So, we can build native, and even cross-compiled, executables, but what about optimisation? Turns out we get that too. The LLVM bitcode we generate for the expression 1+1+1+1+a+a+a+a is

define i32 @f(i32 %a) {
entry:
  %0 = add i32 %a, %a
  %1 = add i32 %a, %0
  %2 = add i32 %a, %1
  %3 = add i32 1, %2
  %4 = add i32 1, %3
  %5 = add i32 1, %4
  %6 = add i32 1, %5
  ret i32 %6
}
whereas the resulting x86 assembler code is
_f:                                     ## @f
Ltmp0:
## BB#0:                                ## %entry
        leal    (%rdi,%rdi), %eax
        addl    %edi, %eax
        leal    4(%rdi,%rax), %eax
        ret
Clearly, llvm has figured out what's wrong and folded the constant additions and lowered the multiplication to a left shift.

So, what's the conclusion from this? That building optimizing cross-compilers is a weekend project for a small DSL, and you can do it all in Python - and if that doesn't zoom, nothing does.

2012-09-22

Goto Checker

One of the cool new features of the upcoming Clang release is the tooling infrastructure. This Clang tooling infrastructure provides an easy way of writing style checkers, source-to-source rewriters, analyzers and all sorts of other tools that need to understand C and related languages.

Especially nice is the AST matching support that landed recently. With the AST matchers, you can describe patterns of code that you're interested in in a declarative DSL, and get a callback called for all instances of that pattern in the code you're parsing. It's a bit like XPath for C.

As a demonstration of this, I've written a little code checker that checks that code follows a particular error handling style, and warns of any transgressions. The error handling style I chose is the one I prefer in C, which I call the reverse label style. The reverse label style has been made popular though its use in the Linux kernel, and looks like the below example:

int read_file_normal(const char* filename)
{
 FILE* f = fopen(filename, "rt");
 char buf[100];

 if (f == NULL) {
  fprintf(stderr, "Failed to open %s", filename);
  goto error_open;
 }

 if (fread(buf, sizeof(buf), 1, f) == 0) {
  fprintf(stderr, "Failed to read data from %s\n", filename);
  goto error_read;
 }

 if (fclose(f) == EOF) {
  fprintf(stderr, "Failed to close %s\n", filename);
 }

 return 0;

error_read:
 fclose(f);
error_open:
 return -1;
}
For each error case, we have a goto to a unique label. Starting at that label is all cleanup that should be done in the current case. The labels are therefore listed in reverse order to the gotos. A bit more discussion can be found at the staila blog. As rightly pointed out there, it results in readable code that unfortunately easily can rot during maintenance.

Automatic code checking to the rescue. We want to find all function definitions, and in each function definition we want to see all uses of goto and all labels, and verify that we first have one section of gotos but no labels, which is then followed by a section of labels, listed in reverse order of their corresponding gotos in the first section, without any jumps in it. We thus set up a MatchFinder and add our three patterns and bind their match results to names we can use in our callback:

        MatchFinder mf;
        GotoChecker handler;

        mf.addMatcher(functionDecl(isDefinition()).bind("func"), &handler);
        mf.addMatcher(gotoStmt().bind("goto"), &handler);
        mf.addMatcher(labelStmt().bind("label"), &handler);
When we run our tool on some code, our GotoChecker::run method will be called with its MatchResult set to one of these three things. So, we check which one it is, and if it's the start of a new function, we note that we're now in the normal section of the function (as opposed to the error handling section), and we clear our stack of labels:
       const clang::FunctionDecl* func =
                Result.Nodes.getNodeAs("func");
        const clang::GotoStmt* g =
                Result.Nodes.getNodeAs("goto");
        const clang::LabelStmt* label =
                Result.Nodes.getNodeAs("label");
        if (func) {
                in_error_section = false;
                gotos.erase(gotos.begin(), gotos.end());
        }
If we get a goto, we first check that we're not in the error handling section, and then append its label to our stack of labels.
        if (g) {
                clang::LabelDecl* label = g->getLabel();
                if (in_error_section) {
                        clang::DiagnosticsEngine& d =
                                Result.Context->getDiagnostics();
                        unsigned int id = d.getCustomDiagID(
                                clang::DiagnosticsEngine::Warning,
                                "Found goto to label %0 inside "
                                "error hanling section"
                        );
                        d.Report(g->getLocStart(), id) << label;
                }
                gotos.push_back(label);
        }
Here we can also see an example of how to produce diagnostics from a Clang tool. The DiagnosticBuilder returned by DiagnosticsEngine::Report accepts a number of different Clang types, and will format them appropriately. In this case, we're giving it a NamedDecl, so it knows to put it in quotes. The source location we give to Report also means that Clang knows to quote from the source and highlight exactly what we're objecting to.

Now, the final part of the puzzle is to handle labels. It would be really nice if we could verify that there's no way for the code to fall though from the normal section to the error handling section, but that would involve scanning backwards though the statements and checking that the last is either a return, a call to a function known not to return, or a conditional where all branches end with a return or a function knon not to return or a conditional where... Let's skip that for now, and just get the name of the label.

       if (label) {
                if (!in_error_section) {
                        // TODO: Check somehow that all paths have returned
                }
                clang::LabelDecl* found = label->getDecl();
                std::string name = found->getNameAsString();
Since we're enforcing that goto is only used for error handling, let's check that the name of the label reflects this:
                if (
                        name.substr(0, 6) != "error_"
                        && name.substr(0, 8) != "cleanup_"
                        && name.substr(0, 4) != "err_"
                        && name != "exit"
                ) {
                        clang::DiagnosticsEngine& d =
                                Result.Context->getDiagnostics();
                        unsigned int id = d.getCustomDiagID(
                                clang::DiagnosticsEngine::Warning,
                                "Illegal label name: %0"
                        );
                        d.Report(label->getLocStart(), id) << found;
                }
This feature could be debated. Maybe it should be optional, and maybe the permitted prefixes should be configurable, but that's for version 2.0.

Now all that is left is to check that the label we found matches the top of the label stack we built up from the gotos in the normal section.

                if (gotos.size() == 0) {
                        return;
                }
                clang::LabelDecl* expected = gotos.back();
                in_error_section = true;

                if (found != expected) {
                        clang::DiagnosticsEngine& d =
                                Result.Context->getDiagnostics();
                        unsigned int id = d.getCustomDiagID(
                                clang::DiagnosticsEngine::Warning,
                                "Error handling sequence mismatch. "
                                "Expected %0, found %1"
                        );
                        d.Report(label->getLocStart(), id) << expected << found;
                }

                gotos.pop_back();
Apart from a handfull of boilderplate lines (~4 lines plus some includes and namespace uses), that's it. What I like is that all the parsing and scanning and other things that would normally get in our way just disappears and we can go straight to implementing our checking logic.

The full source can be had at the project page. Note that it requires LLVM and Clang from trunk, but there are good instructions for getting that.

In summary, the Clang tooling infrastructure definitly zooms. I expect to be using this quite a bit from now on, and if you're working with C or its derivates, I think you should too.

2012-09-09

The Netbeans build system

This is more of a "venting frustration" than informative. I have been using netbeans for a while now, at least the re-branded version that is jMonkeEngine SDK. However, this is about the netbeans part of it and nothing to do with the jME SDK.
I have been thinking for years that I should port Svansprogram to a Rich Client Platform instead of hacking my own. Hacking my own RCP was fun at first but now it is mostly tedious.
So to learn module development on Netbeans I have started out with some small hacks to get a feel for it. Ran the Hello World things and so on.
Eventually came the time to make my very own module, first I let Netbeans create the project, then I put it into my version control and then on to set up my continuous build system.
The build system in netbeans is Ant. Which is all very fine by me, Ant+Ivy is my preference if I can choose whatever I like. It is always good if the IDE understands the real build system, i.e. the headless builds run on the build server. You know, the real build, that produces the installable binaries.
Eclipse has a totally separate build system so you need to keep the IDE and the real build in synch using plugins and other voodoo. So if Netbeans have some build system that I can use in both the IDE and on the CI machine I'm a happy camper. So with a smile on my face I set out to run the netbeans project on the CI machine.
That is when the horror starts.
The way netbeans set up the builds is to have some archane and weirdly mutated Ant scripts, without documentation, importing things from secret directories. Then have some plugins in the IDE that knows the secret locations to insert values here and there. I have never in my life as a professional coder seen a more opaque and strange build system.
For example, it uses an Ant plugin called CopyLibs to, you know, copy JAR-files it needs for the build. It makes me want to grab the engineer by the throat and foaming at the mouth scream at them that ANT already can copy files or why don't you just use Ivy.
So in order for my CI machine to even be able to get all the required dependencies I have to pollute my Ant installation making it harder to have reproducible builds since now my entire build system is dependent on a specific version of Netbeans.
It does not stop there, to make a plugin for Netbeans I must of course build and link and all that against a specific version of Netbeans. This is acheived by including build scripts from a "Harness" directory that is specific to a version of Netbeans. This is all understandable, but the location of this harness-directory is set up in a "private" property file, through one or two ant properties with closely guarded complex names. The private property file is actually hidden in $HOME.
Yeah you heard me, to make the CI machine build it will have to have access to my $HOME to be able to find the harness directory that is from my _installation_ of netbeans, not from some central repository of dependecies.
There are some other ant-voodoo that Netbeans hides in a FAQ to instruct people how to download the harness from the web instead of pointing the CI machine to an actual installation of netbeans. I say it again, an installation, that requires you to run an installer, and click through stuff.
There is more to this story but now I need to go cry in a dark corner. It has taken me several weeks of hobby-coding time to even get this far, and then to realize that netbeans generates images when building so you have to remember to set the JVM as headless or the build fails. And oh yeah, when I did it generated blank images so my menu entries when the module is installed are empty lines.

2012-06-27

Glassfish and TeamCity

On my day job we use TeamCity from JetBrains as the Continous Integration solution. It works really well, looks good and has a plug-in for eclipse even. So I thought I'd install it at home too.

Now, I have glassfish as an app server. I like JEE application servers, I think they are neat, I want my glassfish to deploy all my JEE applications. So i found the WAR-download of TeamCity. I read the installation instructions which required me to set some environment flag to change the default location for data directory (I keep my variable files in /var thank you very much, not in some wierd directory in $HOME).

I deploy the application and I am presented with a blank page.

Looking in the log file it says that I have no database definition in my data directory - yeah, like, how the hell could I since I have no clue what you are talking about.

So back to the instructions here http://confluence.jetbrains.net/display/TCD7/TeamCity+Data+Directory. I suspect that I must have _all_ the configuration in place before I deploy the WAR file, at least http://youtrack.jetbrains.com/issue/TW-20362 seems to say that. So in order for me to deploy the WAR-file I must first download and run the Netty-integrated installer to create the directories?!?!

Hello Jenkins, goodbye TeamCity

2012-05-18

Svansprogram v3.0 out now

I just released v3.0 of Svansprogram. Some bug fixes, lots of refactoring and sweet new graphics (icons and banners) by Mattias Persson. Get Svansprogram from http://code.google.com/p/svansprogram/

Oh, sorry, no packaging for Mac yet but I'll do it "real soon now" :-)

2012-04-03

Installer woes

I'm preparing to release v3.0 of my Multi-tail application Svansprogram. One popular feature request is native launchers. Yeah, OK, so this one guy asked for it but since there are about 3 people in the world using svansprogram that is a substantial part of the user base.
Being a Linux guy I have to jump through a lot of hoops to even begin to create a windows EXE-file or an OS X launcher. I can't even test them properly! Anyway, Launch4j seems nice and might even have some type of integration with maven (and I'm switching back to ant + ivy when I find the strength). There is a GUI and some XML files that you can use to work up an exe-file. The GIMP did let me create an ICO-file also. As soon as I can test this somewhere I might put in the hour or so to try this out.
A pal has promised to build an OS X launcher by using some sort of maven magic and an another launcher-creator-app-thingy.

So I tackled the Linux side. I decided to distribute an installer with the TAR-ball,
disclaimer - Blogger ate the formatting and yes I KNOW there are symlinks and stuff this does not handle, write a bug report or something :-)

And this people is why I LOVE linux: a few lines of BASH and here is your installer.


#! /bin/bash

SVANSPROGRAM_HOME="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
LOG_FILE=$SVANSPROGRAM_HOME/install.log

log() {
echo "$1" >> $SVANSPROGRAM_HOME/install.log
}

log "`date`"
log "Svansprogram directory: ${SVANSPROGRAM_HOME}"

#######################################
# Create executable file #
#######################################
EXECUTABLE="$SVANSPROGRAM_HOME"/svansprogram.sh
log "Creating shell script: $EXECUTABLE"
cat <<MESSAGEDELIMITER > $EXECUTABLE
java -jar $SVANSPROGRAM_HOME/svansprogram.jar
MESSAGEDELIMITER

log "Setting execute permission"
chmod +x "$EXECUTABLE"

#######################################
# Create desktop file #
#######################################
INSTALL_DIR=~/.local/share/applications
log "Install directory: ${INSTALL_DIR}"


if [ ! -d "$INSTALL_DIR" ]; then
log "Install directory does not exist"
exit 1
fi

log "Creating desktop file"
cat <<MESSAGEDELIMITER > "$SVANSPROGRAM_HOME"/svansprogram.desktop
#!/usr/bin/env xdg-open

[Desktop Entry]
Version=3.0
Type=Application
Terminal=false
Exec=$EXECUTABLE
Name=Svansprogram
Comment=A Multi tail application
Icon=$SVANSPROGRAM_HOME/icon_128x128.png
Categories=Development
MESSAGEDELIMITER

log "Moving desktop file"
mv "$SVANSPROGRAM_HOME"/svansprogram.desktop "$INSTALL_DIR"

2012-02-08

Some laws of enterprise architecture

Though the years, I've acquired a few observations on enterprise development, which I will pompously refer to as my Laws of Enterprise Architecture. They are as follows:

1: In the end, it's SQL

Sure, the system you're looking at is not just a simple SQL front-end. While it does terminate some functionality in an SQL database, other things are passed on to web services and other support systems. But what do those do? More likely than not, they terminate functionality in an SQL database or pass the request on to other systems that do the same. Eventually, everything ends up as SQL queries. There aren't web services all the way down.

As a corollary, you can determine the efficiency of the architecture by comparing the queries that are actually run for a set of functions with that you would have if you just looked at those functions and translated that into queries to an imaginary database built for the purpose. If you only need a query or two that fetch a grand total of a handful of rows, but the queries that end up hitting the actual databases are a lot more complex (and especially return a lot more data), then you're losing track of the objective somewhere along the way from the user to the database.

2: If the answer to your enterprise integration needs is a bus, then you have bigger problems

If you need a bus to integrate your systems, then you either

  1. pretend that you don't know what systems you have, or
  2. actually have no control over what's deployed.
If it's a), then you should open your eyes. Pretending that you don't know things that you, in reality, fully control is a sure way to ruin any design. Doing it for the entire architecture is madness. If you're working on the architecture for an enterprise system and you're in case b), then you should work a bit more on the processes. From where do these unknown systems come?

That said, there are cases where a bus makes sense. If you're building a desktop operating system that will be used by millions of users running any number of strange things on their systems, then a bus does make sense. If your enterprise architecture is comparable to millions of installations of a desktop system, however, you have your plate full.

3: Premature generalization is the root of all evil

Premature optimization is the root of all evil, and it applies even when you're not optimizing for runtime performance. If you prematurely optimize for generality, then you will invariably make designs that are more complex than is called for.

For instance, if the cardinality of a relation between two entities is one-to-one, then don't make it one-to-many just because you think it might be useful later on. Yes, it will be painful one day if you end up having to change it, but it will be painful every single day if you needlessly generalize it now.

Another example is that if you're designing a system whose purpose it is to generate HTML that will be sent out over HTTP, then it's OK to make components of the system aware of that now and then. I'm not saying you should generate HTML from the stored procedures in your database, but only that there's no need to make every (or any, actually) component capable of doing everything. If you can simplify the design by assuming, say, that the purpose of the system is what you know it is, then do so.

Conclusion

So, there you go, my Laws of Enterprise Architecture. What are your laws?