Workflow Foundation 4.0 Activity Data Model (I)
In previous posts, we went through how to define an Activity’s execution logic. But this is not enough. To build any meaningful Activity, developers also need to model the activity’s state and data flow. In this post, I’m going to talk about WF4’s Activity data model.
Argument and Variable 101
WF is just another higher-level programming language that shares a lot of similarities with popular imperative procedural programming languages. So let’s first take a look at the data model of a typical procedural language. Here is a C# code snippet.
static void Main(string[] args)
{
    string answer = Prompt("1+1=?");
    Console.WriteLine("Answer is {0}", answer);
    Console.ReadLine();
}

static string Prompt(string question)
{
    Console.WriteLine("Please enter your name:");
    string name = Console.ReadLine();
    Console.WriteLine("{0}, {1}", name, question);
    return Console.ReadLine();
}
Code 1
As we learned in our first programming class, functions/methods are the basic execution units of procedural languages, and the basic mechanisms for modeling data are arguments and variables:
· Arguments model data flowing into and out of a function, like “question” for method Prompt.
· Variables store state that can be shared among multiple statements, like “name” in method Prompt and “answer” in method Main.
· Both arguments and variables are defined by a name and a type.
· Arguments are always private to a method.
· Variables are visible within the scope where they’re declared (the closest pair of braces that encloses the declaration).
Not surprisingly, WF4 also uses Arguments and Variables to model data flow and internal state for Activities, and they share many basic characteristics with their C# counterparts. So far the concept of arguments and variables seems very simple, but I need to touch on something more subtle before talking about WF4’s implementation.
Separation of data and program
Code 1 only describes how to manipulate data like “question” and “name”; it doesn’t say where the data is stored. On my 32-bit machine with .NET 4.0, the C# code for Prompt is compiled into this piece of x86 assembly code at runtime:
00000000 push ebp
00000001 mov ebp,esp
…
//copy argument “question” from register ecx to stack location [ebp-4]
00000006 mov dword ptr [ebp-4],ecx
…
//call Console.ReadLine()
00000027 call 5053D5A0
//copy return value of ReadLine from register eax to stack location [ebp-0Ch] as variable “name”
0000002c mov dword ptr [ebp-0Ch],eax
//load argument “question” from stack location [ebp-4] and push it onto the stack
//as the 3rd argument to WriteLine method call
00000035 push dword ptr [ebp-4]
//load literal string “{0}, {1}” from constant memory location to ecx
//as the 1st argument to WriteLine
00000038 mov ecx,dword ptr ds:[03292094h]
//load variable “name” from stack and store in edx as the 2nd argument to WriteLine
0000003e mov edx,dword ptr [ebp-8]
//call Console.WriteLine
00000041 call 5053DBE0
…
Code 2
As annotated, variables and arguments are kept on the stack or in registers of a running process. The program does not know the location of those variables at compile time; it just references them with a relative address like “ebp-4” (an address calculated by subtracting 4 from the value of register “ebp”). At runtime, register “ebp” is populated with the frame address for this method call, so variables are automatically saved in the right place. Abstracting the stack and registers away, we can say that arguments and variables are just names for locations in a runtime environment; the program does not need to be concerned with the real storage, it just references those names as if they were the real runtime data.
As Andrew Au pointed out, separating program from data is a basic yet critical design in today’s programming systems. However, WF 3.X has an architecture where program and data are mixed together. For example, this is a simple activity defined in the WF 3.X model:
public class WriteLineActivity : SequenceActivity
{
    public static DependencyProperty TextProperty = DependencyProperty.Register("Text", typeof(string), typeof(WriteLineActivity));

    public string Text
    {
        get { return (string)base.GetValue(TextProperty); }
        set { base.SetValue(TextProperty, value); }
    }

    protected override ActivityExecutionStatus Execute(ActivityExecutionContext executionContext)
    {
        Console.WriteLine(this.Text);
        return ActivityExecutionStatus.Closed;
    }
}
Code 3
The activity defines a property “Text” for its input. There are several problems with this design, including:
· Data is stored in the program itself. A WriteLineActivity instance is not only the execution logic for the activity, but also the storage for the activity’s state. That means multiple running instances of the activity cannot share the same WriteLineActivity object. In fact, in WF3 a clone of the WriteLineActivity object is created for every WhileActivity iteration and every ParallelActivity branch.
· There is no way to express the intent and scope of the data. It’s hard for users of WriteLineActivity to tell whether “Text” is meant as input, output, or shared state for the activity’s own children. This of course leads to confusion and spaghetti code. Also, the runtime has a hard time figuring out data dependencies between Activities, and sometimes cannot clean up activities even when it otherwise could.
In WF4, the Activity data model is redesigned, and the number one goal is to separate data from program.
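The first problem can be reproduced in a few lines of plain C#. This is only a sketch: StatefulWriteLine and SharedStateDemo are hypothetical names made up for illustration and use no WF3 types. The class simply keeps its input on the object itself, the way the “Text” property does above.

```csharp
using System;

// Hypothetical activity in the WF3 style: the input lives on the object itself.
public sealed class StatefulWriteLine
{
    public string Text { get; set; }

    public string Execute()
    {
        return this.Text;
    }
}

public static class SharedStateDemo
{
    public static void Main()
    {
        var shared = new StatefulWriteLine();

        // Two logical runs (say, two loop iterations) configure the same object.
        shared.Text = "iteration 1";
        shared.Text = "iteration 2"; // silently replaces iteration 1's data

        // Whichever run executes now sees the last value written.
        Console.WriteLine(shared.Execute());
    }
}
```

Because the object is both the logic and the storage, the only way to keep the two runs apart is to give each run its own copy of the object.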
Workflow program, workflow instance, and ActivityContext
In the previous section, we analyzed how C# program code is detached from its running state. The key is to separate the concept of variables/arguments as names from their real locations at runtime. WF4 does exactly that. Here is a WF4 version of WriteLine:
public class MyWriteLine : CodeActivity
{
    public InArgument<string> Text { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        string text = this.Text.Get(context);
        Console.WriteLine(text);
    }
}
Code 4
Arguments and variables are modeled by the Argument and Variable classes. But an Argument object does not store the argument value itself, and neither does a Variable. When the “Text” argument is defined on MyWriteLine, it only tells users that this Activity has a string input named “Text”, so that users can bind data to this name and the Activity itself can use this name in its internal expressions. The real data is saved in an ActivityContext. Every running workflow instance has its own ActivityContext, just like every running method has its own frame on the call stack. To access its runtime data, the activity calls Argument.Get(ActivityContext) to fetch the data out of the context. That way, different running instances can share the same MyWriteLine object yet have their own copies of runtime state.
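To see the shape of this design without the WF runtime, here is a minimal model in plain C#. It is a sketch under stated assumptions, not how WF4 is implemented: SketchContext, SketchInArgument<T>, and SketchWriteLine are hypothetical stand-ins for ActivityContext, InArgument<T>, and MyWriteLine. The argument object acts purely as a name; the values live in the context.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-instance storage, playing the role of ActivityContext.
public sealed class SketchContext
{
    private readonly Dictionary<object, object> values = new Dictionary<object, object>();

    public void SetValue(object argument, object value) { values[argument] = value; }
    public object GetValue(object argument) { return values[argument]; }
}

// Hypothetical argument: a typed name that holds no value of its own.
public sealed class SketchInArgument<T>
{
    public T Get(SketchContext context) { return (T)context.GetValue(this); }
}

// The "program": a single object that every run can share.
public sealed class SketchWriteLine
{
    private readonly SketchInArgument<string> text = new SketchInArgument<string>();

    public SketchInArgument<string> Text { get { return text; } }

    public string Execute(SketchContext context)
    {
        // The value is looked up in the context, never stored on the activity.
        return this.Text.Get(context);
    }
}

public static class ContextDemo
{
    public static void Main()
    {
        var program = new SketchWriteLine();

        // Each run binds its own data to the same argument name.
        var contextA = new SketchContext();
        contextA.SetValue(program.Text, "Hello from A");

        var contextB = new SketchContext();
        contextB.SetValue(program.Text, "Hello from B");

        Console.WriteLine(program.Execute(contextA));
        Console.WriteLine(program.Execute(contextB));
    }
}
```

The same SketchWriteLine object serves both runs, and neither run can see the other’s value, because each lookup goes through its own context.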
With the separation of program and running instance, we often call an object instance of an Activity class a "workflow program", similar to a copy of a C# program in code. It’s a static description of logic detached from the runtime environment. On the other hand, when the Activity object is scheduled by the WF runtime, the runtime creates a "workflow instance", which is more of an abstract concept, like the process for a C# program. It encapsulates the program’s runtime state. One WF program can have multiple running WF instances, similar to the relationship between a C# program and its processes.
The workflow runtime uses the ActivityContext class to keep all state of a running WF instance. Through ActivityContext, an Activity can get its arguments and variables. As we introduced previously, there are different flavors of Activity modeling styles with different Activity base classes. Accordingly, WF defines different flavors of ActivityContext classes, which expose different levels of runtime functionality.
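The relationship between one workflow program and several workflow instances can be sketched with WorkflowInvoker. The Greet activity and the “Name” argument are illustrative names that do not come from this post, and the sketch assumes a .NET Framework 4 project that references System.Activities.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;

// Illustrative activity: returns a greeting instead of printing it.
public sealed class Greet : CodeActivity<string>
{
    public InArgument<string> Name { get; set; }

    protected override string Execute(CodeActivityContext context)
    {
        // The value comes from the context of the current workflow instance.
        return "Hello, " + this.Name.Get(context);
    }
}

public static class Program
{
    public static void Main()
    {
        // One workflow program...
        var program = new Greet();

        // ...run as two separate workflow instances, each with its own inputs.
        string first = WorkflowInvoker.Invoke<string>(
            program, new Dictionary<string, object> { { "Name", "Alice" } });
        string second = WorkflowInvoker.Invoke<string>(
            program, new Dictionary<string, object> { { "Name", "Bob" } });

        Console.WriteLine(first);
        Console.WriteLine(second);
    }
}
```

Each call to Invoke runs the same Greet object with its own inputs, so the two results differ even though the program is shared.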
Yun Jin
Development Lead
https://blogs.msdn.com/yunjin