Ruurd

Mar 9, 2015

Creating an automatic self-updating process

Recently a client asked me to replace a single monolithic custom workflow engine with a more scalable, loosely coupled modern alternative. We decided on a centralized queue that contains and persists the work items, with a manager (scheduler) on top that accepts connections from a dynamically scalable number of processors, which request and then do the actual work. It’s an interesting setup in itself, relying heavily on dependency injection, Command-Query Separation, Entity Framework Code First with Migrations for the database, and code-first WCF for strongly typed communication between the scheduler and its processors.

Since there would be many processors, with no central record of where they were installed, one of the wishes was to have them update themselves at runtime whenever a new version of the code became available.

Detecting a new version

A key component of the design is that the processors register themselves with the scheduler when they start. In the same spirit, they can periodically call an update manager service to check for updates. I implemented this by placing a version inside the processor’s primary assembly (in the form of an embedded resource). The update manager returns the latest available version and its download location. If this version is more recent than the built-in version, the processor can decide to update.
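The comparison itself can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the names UpdateInfo, UpdateCheck and IsUpdateAvailable are hypothetical, and the built-in version is assumed to have already been read from the embedded resource as a string such as "1.0.0.0".

```csharp
using System;

// Hypothetical shape of the update manager's answer (assumption, not the real contract).
public sealed class UpdateInfo
{
    public Version LatestVersion { get; set; }
    public Uri DownloadLocation { get; set; }
}

public static class UpdateCheck
{
    // Returns true only when the offered version is strictly newer than the built-in one.
    public static bool IsUpdateAvailable(string builtInVersion, UpdateInfo latest)
    {
        // System.Version compares major, minor, build and revision in that order.
        Version current = Version.Parse(builtInVersion.Trim());
        return latest.LatestVersion > current;
    }
}
```

Using System.Version rather than comparing strings avoids mistakes such as "1.10.0.0" sorting before "1.9.0.0".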

This completes the easy part.

Updating

The problem with updating a process in place at runtime is that the operating system locks executable images (exe/dll) while they are mapped into a running process. So when you try to overwrite them, you get ‘file is in use by another process’ errors. The natural approach would therefore be to unload every non-OS library except the executable itself, then overwrite the files and reload them.

In fact this works for native code/processes; managed assemblies, however, cannot be unloaded once they have been loaded. It therefore appears we are out of luck and can’t use this method. However, we have a (brute force) escape: while we can’t unload managed assemblies, we can unload the whole AppDomain they have been loaded into.

Updating: managed approach

The idea therefore becomes to spin up the process with almost nothing in the default AppDomain (which can never be unloaded), and from there spawn a new AppDomain with the actual Processor code. If an update is detected, we can unload, update, and respawn it again.

And still it didn’t work… The problem I ran into now was that the default domain somehow kept loading one of the user-defined assemblies. I created my new AppDomain with the following code:

using System;
using System.Reflection;

public class Processor : MarshalByRefObject
{
    // Static so the static factory method below can assign it.
    static AppDomain _processorDomain;

    public void Start()
    {
       // startup code here...
    }

    public static Processor HostInNewDomain()
    {
        // The parent's setup already carries its app/web config path
        // (ConfigurationFile), so the new domain reads the same config.
        var procSetup = AppDomain.CurrentDomain.SetupInformation;

        // Start the processor in a new AppDomain.
        _processorDomain = AppDomain.CreateDomain("Processor", AppDomain.CurrentDomain.Evidence, procSetup);

        return (Processor)_processorDomain.CreateInstanceAndUnwrap(Assembly.GetExecutingAssembly().FullName, typeof(Processor).FullName);
    }
}

and in a separate assembly:

public class ProcessorHost
{
    Processor _proc;

    public void StartProcessor()
    {
        _proc = Processor.HostInNewDomain();
        _proc.Start();
    }
}

There are several problems in this code:

  • the Processor type is used inside the default AppDomain in order to identify the assembly and type to spawn in there - this causes the assembly which contains the type to get loaded in the default domain as well.
  • after spawning the new AppDomain, we call into the Processor.Start() to get it going. For the remoting to work, the runtime generates a proxy inside the default domain to get to the Processor (MarshalByRefObject) in the Processor domain. It does so by loading the type from the assembly containing the Processor type and reflecting on that. I tried different approaches (reflection, casting to dynamic), but it seems the underlying mechanism to generate the proxy is always the same.

So what is the solution? For one, we can make the Processor start automatically by kicking off all the action in its constructor. That way we don’t need to call anything to start the Processor, so the runtime doesn’t generate a proxy. Moreover, we can take a stringly typed dependency on the assembly and type, naming both by string instead of referencing the type itself. This changes the code above to:

public class Processor : MarshalByRefObject
{
    public Processor()
    {
        Start();
    }

    public void Start()
    {
        // startup code here....
    }
}

and in a separate assembly:

using System;
using System.Runtime.Remoting; // ObjectHandle

public class ProcessorHost
{
    private const string ProcessorAssembly = "Processor, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null";
    private const string ProcessorType = "Processor.Processor";

    AppDomain _processorDomain;
    ObjectHandle _hProcessor;

    public void Start()
    {
        // The parent's setup already carries its app/web config path
        // (ConfigurationFile), so the new domain reads the same config.
        var procSetup = AppDomain.CurrentDomain.SetupInformation;

        // Start the processor in a new AppDomain.
        _processorDomain = AppDomain.CreateDomain("Processor", AppDomain.CurrentDomain.Evidence, procSetup);

        // Just keep an ObjectHandle, no need to unwrap this.
        _hProcessor = _processorDomain.CreateInstance(ProcessorAssembly, ProcessorType);
    }
}

Communicating with the new AppDomain

Above I circumvented the proxy generation (and thereby the loading of the type’s assembly in the default AppDomain) by kicking off the startup code automatically in the Processor constructor. However, this restriction introduces a new problem: we can never call into or out of the new domain through user-defined types, as that would cause user-defined assemblies to be locked in place. How, then, do we tell the parent/default domain that an update is ready?

For the moment I do this by writing AppDomain data in the Processor domain - AppDomain.SetData(someKey, someData) - and reading it periodically from the parent domain - AppDomain.GetData(someKey). It’s not ideal as it requires polling, but it works: only standard framework methods and types cross the domain boundary, so no user-defined assembly gets locked in the default domain and the update can proceed.
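A rough sketch of the polling side, as it might look in the default domain, is shown below. The class name UpdatePoller, the key "Processor.UpdateReady" and the use of a bool flag are all assumptions made for illustration; the only requirement is that both domains agree on the key string and exchange nothing but framework types. It targets the .NET Framework, where multiple AppDomains are supported.

```csharp
using System;
using System.Threading;

public static class UpdatePoller
{
    // Hypothetical key; the Processor domain must use the same string literal.
    public const string UpdateReadyKey = "Processor.UpdateReady";

    // Runs in the default domain: blocks until the Processor domain flags
    // an update, then unloads that domain so its assemblies are released.
    public static void WaitForUpdateAndUnload(AppDomain processorDomain, TimeSpan pollInterval)
    {
        // GetData returns null until the key has been set; only a boxed bool
        // (a framework type) ever crosses the domain boundary.
        while (!object.Equals(processorDomain.GetData(UpdateReadyKey), true))
        {
            Thread.Sleep(pollInterval);
        }

        AppDomain.Unload(processorDomain);
    }
}
```

Inside the Processor domain the flag would be raised with AppDomain.CurrentDomain.SetData("Processor.UpdateReady", true) once the new version has been downloaded. After WaitForUpdateAndUnload returns, the host can overwrite the assemblies and create a fresh domain in the same way as in ProcessorHost.Start().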