Process Steps for Designing the System Dataflow
Applies To: Windows Server 2003 with SP1
Designing the system dataflow consists of identifying the data sources, objects, and attributes that are used in the metadirectory deployment and describing the policies that define the actions taken when these objects are added, modified, or deleted. Dataflow design is the discovery and recording of data source information, of relevant objects in each data source for synchronization, of real-world object policies and actions, and of which attributes from each object should push out from the metaverse. This information is recorded on the following worksheets, which are provided for you:
Real-World Identity Objects
Connected Data Sources
Object-Level Policies
Included Attributes
Outbound Attribute Flow
Metaverse Object Design
Begin to fill out these worksheets as you collect the information described below. Some of the information will be easily apparent; however, other information may require persistent investigation or decisions not yet made. Therefore, you might have to return to these worksheets when the information is available, particularly during the dataflow design process. When you prepare these new worksheets, begin by adding the initial data. Then return to the worksheets as needed, gradually refining your logical design.
The following steps outline this process; the subsections that follow provide details of each step:
Step 1: Identify real-world identity types. Identify each real-world identity type and begin filling out a Real-World Identity Objects worksheet for each one.
Step 2: Identify your data sources. For each management agent listed on the Real-World Identity Objects worksheets, begin filling out a Connected Data Sources worksheet.
Step 3: Identify the system objects. For each object, begin filling out an Object-Level Policies worksheet. List all business policies that would apply to each type of object activity, such as adding, changing, or removing an object.
Step 4: Identify system attributes. For each connected data source object, fill out an Included Attributes worksheet. List all attributes that need to provide data to the metadirectory.
Step 5: Diagram your design. Draw a diagram of your proposed design.
Step 6: Establish a metaverse object type. Represent each real-world identity type by completing a Metaverse Object Design worksheet for each object type.
Step 7: Obtain stakeholder approval for your dataflow model.
Before you begin, review the sample worksheets to see detailed examples of how to use them. Then review the worksheet samples together with the detailed step descriptions that follow. The step subsections cover the process of creating the metadirectory-related worksheets. They include suggestions and helpful questions to consider as you complete the worksheets. Additionally, consider diagramming the processes.
Note
Remember that at this stage you are documenting a logical design for the metadirectory: you are not concerned about physical constraints, or optimization of storage, processing, and code complexity. You pass on the deliverables from this topic to the metaverse and rule planners, who modify and add to that data, producing a design document with worksheets and a physical design.
Before proceeding, be sure you understand the information provided in “Overview of Designing Your Dataflow Model” and “Dataflow Design Concepts” earlier in this document.
Step 1: Identify Real-World Identity Types
For each object type that you plan to synchronize, determine what actions MIIS 2003 must take to meet your synchronization objectives. For example, the Person object is represented by three different objects: one in Active Directory, one in human resources, and one in telephone data sources. You need to determine which object is authoritative; the attribute flow direction; which objects, if any, should be joined; and what should happen if the real-world object is removed from the system.
Record this information for each real-world object for each data source in your design. The management agent that accesses each data source also represents it. To begin this investigation, complete the Real-World Identity Objects worksheet, which records the information in the following table.
Real-World Identity Objects Worksheet
Real-World Object Data | Explanation for Completing the Worksheet |
---|---|
Management agent | Names the management agent. It is advisable to name your management agents so that they refer to the data source. |
Object type | Names the type of data source object that is associated with the real-world object, such as user, person, staff member, or computer. |
Provisioned Y/N | Indicates whether MIIS 2003 can use this management agent to provision (that is, create) new objects of this type in the connected data source. Indicate either yes or no. |
Join Y/N | Specifies whether new objects of this type should attempt to join with an existing object in the dataflow. Indicate either yes or no. If you enter “Yes,” document the join policy in the Object-Level Policies worksheet. Joining objects links a connector space object with an existing metaverse object, thereby creating the path for attribute flow. For example, the telephone-related PhonePerson object contains the telephone number for that person, which is required for the design. Therefore, joining this object with a metaverse object meets that requirement. |
Project Y/N | Indicates whether the object should be projected into MIIS 2003. Projection creates a new object in the metaverse. Specify either yes or no. |
Discovery notes | Lists any policies that might behave differently during the discovery phase of operation than during ongoing (normal) operation. For example, you may have many user accounts in the human resources database that already exist in Active Directory. During the discovery phase, you want to join these user accounts within your solution; however, during normal operations, any new Active Directory accounts are only created from MIIS 2003. Hence, a join is no longer necessary. |
Part of the join evaluation is to determine what you should design if a join failure occurs. (Most joins are automatically performed when join rules are implemented; some joins must be manually performed once.) For example, the Active Directory User object is a significant object in the design. However, if the User object fails to join with a metaverse object, it might be because it is a new User or a disabled user. You need to decide your strategy based on the state of the object.
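The join decision described above can be sketched in a few lines of Python. This is only an illustration of the design reasoning, not MIIS 2003 code: the `evaluate_join` function, the `MetaverseObject` class, the `employeeID` join attribute, the `disabled` flag, and the outcome labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MetaverseObject:
    # Hypothetical metaverse object keyed by an employee identifier.
    employee_id: str
    attributes: dict = field(default_factory=dict)


def evaluate_join(cs_object: dict, metaverse: list, join_attribute: str = "employeeID"):
    """Return (outcome, metaverse_object) for one connector space object.

    Outcomes: "joined", "join-failed-disabled", "join-failed-new".
    """
    key = cs_object.get(join_attribute)
    for mv in metaverse:
        if key is not None and mv.employee_id == key:
            return "joined", mv
    # Join failure: the strategy depends on the state of the object.
    if cs_object.get("disabled", False):
        return "join-failed-disabled", None
    return "join-failed-new", None
```

Recording each join-failure outcome and its intended handling in the Discovery notes or the Object-Level Policies worksheet keeps the strategy visible to the rule planners.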
Step 2: Identify Your Data Sources
After you prioritize your synchronization objectives, you might have already determined which of your existing data sources seem most likely to be included in the first deployment. It is important that your organization’s business rules support the precise level of synchronization that you require of participating data sources. To determine this, evaluate all aspects of each data source, including the following details:
The department that owns the data source.
The employee who is considered the data source owner.
Your current backup and restore policies for the data source.
Your system’s heavy or low saturation points and when they occur.
Any security constraints to apply to the data that you want to synchronize.
Connection information, such as account name and domain.
All objects in this data source, examined to determine unique identifiers that can be used for joins.
When you determine that a data source is likely to be included in the synchronization, fill out the Connected Data Sources worksheet for that data source, collecting information that you will use in the design of the dataflow. Determining your preliminary synchronization objectives means that you must investigate all the identity objects in the connected data source. You will find that some objects are relevant to your design and others are not. The purpose of this step is to exclude all unnecessary objects and investigate further those objects that can help you meet your synchronization objectives. Consult data source stewards for information, to obtain buy-in, and to identify issues.
The following table describes the information that you record on the Connected Data Sources worksheet.
Connected Data Sources Worksheet
Data Source Information | Explanation for Completing the Worksheet |
---|---|
Management agent | Identifies the name and type of the management agent that is used to connect to the data source. The name is not important at this stage in the design but the type is. For example, you should be able to list whether the data source is a SQL Server database or a delimited text file. For a complete list of available types, see MIIS 2003 Help. |
Connected data source | Identifies the name of the connected data source. |
Owner | Specifies the owner, either person or department, who oversees the data source. |
Contact | Names the person who has the ability to access and change the data. |
Backup and restore policy | Identifies the backup plan and schedule. |
Security issues | Identifies those issues that might affect your design, such as strict access control. |
All connection and container details for this MA type | Identifies the connection and container details that are necessary for MIIS 2003 to connect to the data source. For example, lists forest, account, domain, and container information. |
Name | Names the relevant object that resides in this data source. |
Unique ID | Identifies what makes each object unique. Every object has at least one attribute that you can use to determine uniqueness. Uniqueness is very important during the entire synchronization process, so identify one attribute that is immutable throughout the life of the object. For example, the EmployeeID associated with an Employee object makes it unique. |
Notes and policies | Records any additional information about this object in this data source. |
As you fill out the worksheet, think about naming your management agents. Management agents are used to transfer data between MIIS 2003 and a single data source. To facilitate configuring the system later, name your management agent according to the data source with which it is associated.
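When you evaluate candidate attributes for the Unique ID entry on the worksheet, a quick check against a sample export of the data source can help. The following Python sketch assumes the sample is available as a list of dictionaries; the `is_unique_identifier` function and the sample records are hypothetical and are not part of MIIS 2003.

```python
from collections import Counter


def is_unique_identifier(records: list, attribute: str) -> bool:
    """True if every record has a non-empty value for attribute and no value repeats."""
    values = [r.get(attribute) for r in records]
    if any(v is None or v == "" for v in values):
        return False
    return all(count == 1 for count in Counter(values).values())


# Hypothetical sample of Employee objects exported from a data source.
sample = [
    {"EmployeeID": "1001", "Name": "Kim"},
    {"EmployeeID": "1002", "Name": "Kim"},
]
```

A check like this only shows that values are unique in one snapshot. Whether the attribute stays immutable throughout the life of the object is a question for the data source owner.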
Determining the Authoritative Data Source
MIIS 2003 allows you to determine which data sources can be used to create, modify, or delete an object that participates in synchronization. At least one data source should be authoritative for each attribute; that is, if more than one data source offers a value for the same attribute, the system needs to have rules to decide which value to choose in order to establish precedence for the data sources.
If an object is modified within a data source that is not authoritative, the authoritative data source overwrites the change. Therefore, define which of the participating data sources is authoritative for each synchronized object and possibly for one or more attributes within an object.
Later in your design, you do an analysis to determine which data source is authoritative for each object, and at times, for each attribute. Be sure to include data source owners when you meet to analyze each attribute because they are more familiar with the needs of their data source. After you establish this information, make sure to publish it to all administrators who regularly modify objects that reside in any participating data source. Employees knowledgeable about security are also valuable reviewers for this analysis. Document any issues discussed, as well as the decisions on which objects and attributes are authoritative.
Modify objects only in an authoritative data source. If the authoritative data source supplies a value and you modify data sources that are not authoritative, the changes do not propagate. Such changes also disappear — even at the data source where they were changed — the next time synchronization occurs with the authoritative data source.
If an object is accidentally deleted from or modified in an authoritative data source and the administrator of a non-authoritative data source makes a modification to restore it, the modification is only temporary. Administrators of non-authoritative data sources need to communicate instead with the owner of the authoritative data source.
You might have more than one authoritative data source for a specific object or attribute. When this occurs, identify which authoritative data source takes priority in each instance. You designate priority by determining the attribute precedence when planning your synchronization. For example, in the Fabrikam scenario, the human resources database is authoritative for the Person object. However, a goal of the deployment is to reduce administrative costs; therefore, the scenario allows employees to update their own home phone numbers. Because the human resources database is updated less frequently than the Telephone file, the Home phone number from the Telephone file takes precedence during any synchronization process.
The following figure illustrates a possible Home phone number attribute conflict between the human resources database and the Telephone file; the conflict is resolved by determining precedence:
Figure 3: Uncovering Attribute Conflicts and Determining Precedence
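The precedence decision for the Home phone number can be expressed as a small Python sketch. It is a conceptual illustration only and does not represent how MIIS 2003 configures attribute precedence; the `resolve_attribute` function, the source names `"Telephone"` and `"HR"`, and the phone numbers are assumptions made for this example.

```python
def resolve_attribute(candidates: dict, precedence: list):
    """Pick the value from the highest-precedence source that offers one.

    candidates maps data source name -> offered value (None means no value).
    precedence lists data source names, highest precedence first.
    """
    for source in precedence:
        value = candidates.get(source)
        if value is not None:
            return source, value
    return None, None


# Fabrikam-style example: the Telephone file outranks human resources
# for the Home phone number, even though HR is authoritative for Person.
home_phone_precedence = ["Telephone", "HR"]
offered = {"HR": "555-0100", "Telephone": "555-0199"}
winner = resolve_attribute(offered, home_phone_precedence)
```

If the Telephone file offers no value, the same ordering falls back to the human resources value, which is the kind of behavior to record in the precedence notes.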
Step 3: Identify System Objects
For each real-world object in each data source that you want to use in your MIIS 2003 deployment, you must discover the appropriate action and then state the business policy that corresponds to that object. Doing so for each object in your design allows the dataflow designers to understand how to treat that object when it is imported into MIIS 2003.
Typically, you plan for these three object actions:
When a resource is added to your organization, such as a new staff member, identify the business policies and business processes that occur for the creation of that system object.
When an existing object changes, such as a name or contact change, identify the policies and processes that occur. This might be different for different object attributes.
When the resource is removed from the organization (object deletion), identify what should happen to the object in the system. The policy might be to remove it immediately everywhere, or just remove it from selected data sources. On an attribute level, the business policy might state that all personal contact information be hidden from all personnel except human resources.
This step begins the process of translating business policies into synchronization rules. To begin your research, complete the Object-Level Policies worksheet, which records the information that is shown in the following table.
Object-Level Policies Worksheet
Object-Level Policy Information | Explanation for Completing the Worksheet |
---|---|
Action | Specifies the three possible actions that you can apply to a policy: object creation, object modification, and object deletion. Enter New Object, Change Object, or Delete Object. |
Object-level policy | Details the policy to enforce when the action occurs. For example, for a Delete Object action, you might want to perform either of these actions: If EmployeeStatus was “terminated,” delete the linked objects from all connected data sources, including any delayed action events that were queued. If EmployeeStatus was “active,” disable the Active Directory and Notes accounts and set the Telephone comment to “left.” Create a delayed action for a delete of all associated objects in 30 days. |
Reason/Notes | Records the business justification for the policy or details about the policy. For example, your business rules might dictate that all terminated users are removed from every system immediately. |
The following figure uses the Fabrikam scenario to illustrate how you can assign object-level policies when an add, delete, or change occurs.
Figure 4: Object-Level Policies When Changes Occur
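To show how an object-level policy might later translate into a synchronization rule, the following Python sketch encodes the Delete Object example from the worksheet table. It is not MIIS 2003 rules-extension code; the `plan_delete_actions` function, the action labels, and the fallback to manual review for other status values are assumptions made for illustration.

```python
from datetime import date, timedelta


def plan_delete_actions(employee_status: str, today: date) -> list:
    """Translate the example Delete Object policy into a list of planned actions."""
    if employee_status == "terminated":
        return [
            ("delete-linked-objects", "all connected data sources"),
            ("cancel-delayed-actions", "queued events"),
        ]
    if employee_status == "active":
        return [
            ("disable-account", "Active Directory"),
            ("disable-account", "Notes"),
            ("set-comment", "Telephone: left"),
            # Delayed delete of all associated objects in 30 days.
            ("schedule-delete", (today + timedelta(days=30)).isoformat()),
        ]
    # No policy is recorded for other statuses; surface them for review.
    return [("manual-review", employee_status)]
```

Writing a policy out this way tends to expose unstated cases, such as status values the policy does not mention, which you can then record in the Reason/Notes column.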
Step 4: Identify System Attributes
System attributes are the attributes that flow into MIIS 2003 from the connected data sources and flow out of MIIS 2003 to the connected data sources. The values of these attributes are what MIIS 2003 uses to perform synchronization.
Identifying Included Attributes for Each Object
Importing attributes into the metaverse is straightforward: identify which attribute values are needed for synchronization, identify the data source in which they reside, import and stage the objects from the connected directory, and complete the synchronization. However, the design is not driven by the inbound attributes but rather by the attributes you need to push out from the metaverse. These outbound attributes can be pushed back into the same data source they came from (if business policies allow), or they can be pushed into another data source.
For each object in the connected data source, identify and list only those attributes that are beneficial to your scenario, either as coming from the connected data source or flowing outbound from the metadirectory to the connected data source.
At this point in the design process, consider which attributes you must have and which ones would be beneficial for the synchronization process. In the Fabrikam scenario, it is mandatory to publish — both externally and internally — mobile phone numbers for each member of the sales staff. However, it would be optional to internally publish mobile phone numbers for the entire Fabrikam staff. The information you identify at this stage is used later as you design your dataflows and synchronization rules.
The following table lists the information you need to collect and record to design the attribute flow for each object. Record your data on the Included Attributes worksheet.
Included Attributes Worksheet: Connected Data Source Attributes
Attribute Information | Explanation for Completing the Worksheet |
---|---|
Name | Names the attribute in the connected data source. For example, attributes of the Active Directory User object are samAccountName, cn, company, displayName, employeeID, and others. |
Data type | Identifies the data type of the attribute. The possible choices are string (indexable), string (non-indexable), binary (indexable), binary (non-indexable), number, Boolean, and reference (DN). For example, the samAccountName attribute of the Active Directory User object is an indexable string. |
Multi-value Y/N | Identifies whether the attribute can hold multiple values. Enter “Yes” to indicate an attribute supports multiple values. |
Content structure | Specifies how a value or content is structured. Explain the structure as accurately as possible because you will use this information to determine if attributes in other connected data sources hold the same value as this attribute. For example, the content structure of the displayName attribute value is last name, comma, first name. |
Outbound Y/N | Specifies whether the attribute flows outward from the dataflow to the connected data source. If you enter “Yes,” complete the rest of the outbound attribute columns. |
Requires validation Y/N | Indicates whether the attribute value must pass a validation test before it is exported to the connected data source. If an attribute must be validated, you can include the validation rule in the Notes column or in a separate section of the document with a reference provided in the Notes column. Indicate either yes or no. |
May be overwritten with null Y/N | Specifies whether a null value is allowed to replace the attribute value for the connected data source. Enter “Yes” if a null value is allowed. Otherwise, leave blank or enter “No.” |
Business justification | Outlines the business justification for including this attribute in the list. For example, due to frequent changes, business policy may not require that human resources staff collect mobile phone numbers. Therefore, that information must be updated in and collected from other sources. |
Quality and precedence notes | Lists any notes about the quality of the attribute and the likelihood that it will take precedence over associated attributes from other connected data sources. For example, identify if the data value of the attribute is likely to be consistently high quality from this particular data source. |
Notes | Provides space to make additional notes about the attribute. Explain any validation requirements or other details that will help with matching this attribute to the connector space attribute. If validation is required and the details are found elsewhere, include a reference to the explanation. |
The following table lists the information you need to collect and record in order to design the outbound attribute flow for each object.
Note
A few columns on the Included Attributes worksheet, which is detailed in the previous table, duplicate information in the Outbound Attribute Flow worksheet, which is detailed in Table 6.
Designing the Outbound Attribute Flow for Each Object
Attribute Information | Explanation for Completing the Worksheet |
---|---|
Name | Specifies the attribute name of the connected data source. |
Validation | Details the outbound validation policy that is applied to the named attribute. If validation is required, specifies what requirements apply before the attribute can be exported to the connected data source. |
Validation failure action | Specifies the policy to apply if this attribute fails the validation test. |
Transformation | States the transformation policy that converts the content of the metaverse attribute to the connected data source attribute. For a constant value flow, write the constant here. |
May be overwritten with null Y/N | Specifies whether a null value is allowed to replace the attribute value for the connected data source. Enter “Yes” if a null value is allowed. Otherwise, leave blank or enter “No.” For example, if the human resources database is considered authoritative but your synchronization goals include providing a mobile phone number for each Person object from a telephone directory, you must consider if it is appropriate to update the human resources database with a non-null mobilePhone attribute value. |
Considerations or policies needed | Records any notes or comments, including policies, for this attribute flow. |
Investigating which data sources to use in your deployment, which objects in those data sources are of interest, and which attributes of those objects must flow out of MIIS 2003 sets the stage for finalizing your deployment goals. After you collect and record this information in the solution proposal and in the worksheets mentioned in this topic, you can begin the dataflow design.
Figure 5 uses the Fabrikam scenario to illustrate where and when a validity check on the attribute HomeTel would most likely occur. It is assumed that, even though the human resources database may not be authoritative in this scenario, human resources data is valid. However, because users can enter data into the Telephone file, a validity check is always required for that data.
Figure 5: Determining Attribute Value Validity Checks
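The outbound worksheet columns for validation, validation failure action, and null overwrite can be combined into one decision, sketched here in Python for an attribute such as HomeTel. This is a conceptual sketch rather than MIIS 2003 behavior: the `plan_export` function, the `PHONE_PATTERN` rule, and the choice to keep the existing value when validation fails are all assumptions made for this example.

```python
import re

# Hypothetical validation rule: digits, spaces, and common phone punctuation.
PHONE_PATTERN = re.compile(r"^\+?[0-9][0-9 ()\-]{6,19}$")


def plan_export(current_value, new_value, may_overwrite_with_null: bool):
    """Decide what to export for one outbound attribute.

    Returns (action, value), where action is "export", "keep", or "reject".
    """
    if new_value is None or new_value == "":
        # A null may replace the existing value only if the worksheet allows it.
        if may_overwrite_with_null:
            return "export", None
        return "keep", current_value
    if PHONE_PATTERN.match(new_value):
        return "export", new_value
    # Validation failure action (assumed here): keep the existing value.
    return "reject", current_value
```

The actual validation rule and failure action come from the policies you record on the Outbound Attribute Flow worksheet; the sketch only shows how the three columns interact.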
Step 5: Diagram the Design
It is recommended that you consolidate all the information you have collected for your dataflow design and create diagrams that demonstrate the dataflow model you have created. All the relationships and policy descriptions that you defined in the worksheets during the design process are really the definition of your dataflow model. This model is the basis for all future steps in the planning and deployment of your metadirectory project.
All the planning and deployment teams need to be familiar with at least part of the model; so make sure that it is clear and easy to understand. Diagramming the dataflow model makes it much easier to present the conceptual ideas behind the dataflow design. Diagrams also help you spot potential flaws in the design, such as conflicting data sources for a specific object.
Create diagrams similar to the example provided in the following figure.
Figure 6: Model Diagram
Diagrams should show the data sources involved and use arrows to indicate the flow of information into and out of the metadirectory. Create multiple diagrams. For example, create a diagram for each data source that focuses on the data flowing into and out of the particular data source. You might also consider diagrams based on individual objects that describe the flow of attribute information.
The goal is to make the model easy to understand by the various teams and individuals who are involved with the deployment project. All these people, whether they are members of actual planning and deployment teams, or whether they are data or service owners who need to sign off on the design, need to have a clear picture of the model so that they can make informed and timely decisions. If the information presented is not clear, it can introduce misunderstandings that result in delays in the decision-making and sign-off process, or it can potentially cause data owners or sponsors to back out of the project. Clearly diagramming the model can help avoid these problems.
Step 6: Establish Metaverse Object Types
Now that you have created a list of the objects and attributes that are needed to support the connected data sources in the metadirectory deployment, make your first attempt at identifying the objects that you need to create in the metaverse. Using the worksheets that you created for Real-World Identity Objects, prepare a Metaverse Object Design worksheet for each one, probably using the same name for it as the real-world identity type. You will add to this in the later steps as information becomes available.
The following table describes the information to complete for the Metaverse Object Design worksheet. Start by filling in the boxes just under the worksheet title. Next, write the metaverse attributes in the “Name” column and become familiar with the remaining column headings so that you can begin to fill in the cells as you uncover information about the Metaverse object types and their associated inbound attributes.
Metaverse Object Design Worksheet
Metaverse Object Data | Explanation for Completing the Worksheet |
---|---|
Name | Lists the names of the Metaverse attributes for the object specified in the heading. You can use the built-in attributes or add your own to each of the Metaverse objects. For example, givenName is a metaverse attribute. |
Content structure | Describes the content (a non-technical explanation is best) and also explains the structure of the attribute’s value. You use this explanation to determine if attributes in other connected data sources hold the same value as this attribute. An example of this explanation is “First letter of first name plus last name plus number.” |
Joined Y/N | Specifies whether the attribute is used as part of a join policy. Enter “Yes” if the attribute is used as part of a join policy. Otherwise, leave blank or enter “No.” You first recorded join policies on the Object-Level Policies worksheet. Joining objects links a connector space object with an existing metaverse object, thereby creating the path for attribute flow. For example, the Telephone connected data source object PhonePerson contains the telephone number for that person, which is required for the design. Therefore, joining this object with a metaverse object meets that requirement. |
Management agent (MA) | Names the management agent that is supplying the attribute and initial value. |
Object | Identifies which management agent object flows data to this metaverse attribute. If the management agent has only one object that flows a value to this metaverse attribute, you can leave this column blank because the object can be uniquely identified from the Inbound Attribute Flow worksheet of the management agent. Note: A management agent’s object may contain multiple inbound rules for a single metaverse attribute. In such cases, enter two rows in the table and distinguish the flow by using a comment in the Notes column. |
Precedence | Specifies the rule that governs when this management agent cannot overwrite the metaverse attribute value. A blank entry means that this management agent can always overwrite the metaverse attribute. |
Considerations or policies needed | Records any other considerations, including policies. For example, you might write “Flows from the AlternativeNumber attribute in the management agent” for the attribute of a specified object. |
Step 7: Obtain Approval for Your Dataflow Model
As the closing step of the dataflow design phase, be sure to obtain stakeholder approval for your design. However, remember that your design document is not complete until the metaverse design and the synchronization rule design are complete. At this stage in the design, obtain informal approval for the logical dataflow design because it is integral to the remaining deployment steps.
Summary
When you complete the process steps in this document, you will have started, and in some cases completed, the following worksheets, which detail the logical metadirectory design for your solution:
Real-World Identity Objects
Connected Data Sources
Object-Level Policies
Included Attributes
Outbound Attribute Flow
Metaverse Object Design
When the design of your dataflow model is complete, begin planning the design of your metaverse. Be sure to provide the metaverse planners and the rule planners with a copy of the system dataflow design documentation so that they can plan the specific details of your deployment.
See Also
Concepts
Overview of designing a system dataflow model
Dataflow Design Concepts
Filtering Metadirectory Dataflow
Design Constraints
Authority and Precedence