6 Architecture
- Within this section:
- 6.1 Options
- 6.2 The Distributed Option
- 6.3 The Centralised Option
- 6.4 Recommended Architecture
6.1 Options
6.1.1 This section discusses the architectural options that have been considered for the S.E.E. Directory, their respective advantages and drawbacks, and the reasons for the recommended approach.
6.1.2 The key architectural question is whether the S.E.E. Directory should be centralised or distributed; that is, a single instance versus multiple instances of the Directory. (It should be noted that Enterprise Directories are designed to be physically distributed throughout a network for availability and performance reasons, so that at the implementation level the Directory would most probably be physically distributed in either case.)
6.2 The Distributed Option
6.2.1 This is depicted in the following diagram.

6.2.2 In this case there is no central Directory. Rather, each subscribing agency has a top level Directory that interfaces with the top level Directories of all other subscribing agencies ('top level' implies the existence of more than one Directory at any given agency, and this is expected to be the general case). How the agency manages its other Directories is completely under its own control. However, in this context the top level Directory of each agency forms a part of the distributed virtual Directory and must structurally conform with the agreed S.E.E. Directory standard.
6.2.3 There are two variants of the distributed approach, distinguished by the degree to which each agency's master Directory replicates the information held by the master Directories at the other participating agencies:
6.2.3.1 At one extreme, the master Directories at all agencies are full copies of, and substantially identical to, each other.
6.2.3.2 At the other extreme, each agency maintains its own data only. Any query involving other agencies is resolved by chaining to one or more other agencies at the time that the query is submitted.
6.2.3.3 The most likely actual implementation would sit somewhere in between these extremes, with each agency maintaining its own data while keeping a local copy of the most frequently referenced data from other agencies - in other words, caching their data.
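The intermediate variant described in 6.2.3.3 can be illustrated with a minimal Python sketch. It is not drawn from any Directory product: the class `AgencyDirectory`, its methods and the dictionary-based entry format are all hypothetical names introduced here for illustration. Each agency answers from its own data first, then from its local cache, and only then chains the query to other agencies.

```python
from typing import Dict, Optional

# Hypothetical entry format: attribute name -> value.
Entry = Dict[str, str]


class AgencyDirectory:
    """Illustrative top level Directory for one agency (not a real product API)."""

    def __init__(self, name: str, own_entries: Dict[str, Entry]) -> None:
        self.name = name
        self.own: Dict[str, Entry] = dict(own_entries)  # data this agency masters
        self.cache: Dict[str, Entry] = {}               # copies of other agencies' data
        self.peers: Dict[str, "AgencyDirectory"] = {}
        self.chained_queries = 0                        # count of remote lookups made

    def connect(self, peer: "AgencyDirectory") -> None:
        """Register another agency's top level Directory as a chaining target."""
        self.peers[peer.name] = peer

    def lookup_own(self, key: str) -> Optional[Entry]:
        """Answer only from data this agency masters."""
        return self.own.get(key)

    def query(self, key: str) -> Optional[Entry]:
        # 1. The agency's own data.
        if key in self.own:
            return self.own[key]
        # 2. Locally cached copies of other agencies' data.
        if key in self.cache:
            return self.cache[key]
        # 3. Chain to the other agencies at query time, caching any hit.
        for peer in self.peers.values():
            self.chained_queries += 1
            entry = peer.lookup_own(key)
            if entry is not None:
                self.cache[key] = dict(entry)
                return entry
        return None
```

The sketch deliberately omits cache expiry and invalidation. Keeping cached copies consistent with their source is exactly the propagation problem that the distributed option leaves to the interface between Directory instances.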
6.2.4 Benefits of the Distributed Approach.
6.2.4.1 This approach clearly supports agency autonomy.
6.2.4.2 It is also resilient to systemic failure, because no single software or implementation defect need be common to every agency's Directory.
6.2.5 Risks of the Distributed Approach.
6.2.5.1 It depends on the completeness of the interface between Directory instances to propagate data changes and events.
6.2.5.2 It depends on the quality of the interface between Directory instances for the transfer and enforcement of access control information.
6.2.5.3 It requires a gateway or interface from each participating agency to every other participating agency. This problem becomes more severe as agencies are added to the system, because the number of interfaces grows with the square of the number of agencies. For example, with three agencies participating only three interfaces are required, but (as shown in the diagram under 6.2.1) fifteen interfaces are required for six agencies.
6.2.5.4 It requires that the master Directory at each agency conform to the specified pan-agency schema, which imposes a significantly greater constraint on agencies' freedom of choice than the centralised option (see below).
6.2.5.5 There would be no single point of management of the virtual Directory, as a result of which each agency master would be managed in isolation.
6.2.5.6 Responsibility for the integrity of the composite or virtual Directory would be spread across multiple agencies and Directory products.
6.2.5.7 Availability of the whole is a function of the least available participant.
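The growth in interfaces noted in 6.2.5.3 follows from simple arithmetic: if every agency must interface with every other agency, n agencies need n(n-1)/2 interfaces. The short Python sketch below works this through; the function name `mesh_interfaces` is an illustrative choice and each agency-to-agency link is counted once.

```python
def mesh_interfaces(agencies: int) -> int:
    """Interfaces needed when every agency links directly to every other agency."""
    if agencies < 0:
        raise ValueError("number of agencies cannot be negative")
    # Each of the n agencies pairs with the other n - 1; halve to count each pair once.
    return agencies * (agencies - 1) // 2


for n in range(2, 7):
    print(n, "agencies:", mesh_interfaces(n), "interfaces")
# 2 agencies: 1 interfaces
# 3 agencies: 3 interfaces
# 4 agencies: 6 interfaces
# 5 agencies: 10 interfaces
# 6 agencies: 15 interfaces
```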
6.3 The Centralised Option
6.3.1 This is depicted in the following diagram.

6.3.2 In this case there is a single reference Directory (it should be noted that this probably also implies the use of a single vendor's Directory product as the master).
6.3.3 There are three variations of this option, distinguished by where the boundary is established between a given agency and the central Directory:
6.3.3.1 In the first case, a copy of the central Directory exists within a given agency, and the agency's gateway or interface to the Directory is to that copy. This is the most likely scenario for larger agencies. If the appropriate connectivity is established, it also provides for a considerable degree of resilience of the Directory across agencies.
6.3.3.2 In the second case, the agency uses one or more of a set of Directory query and update protocols to access an instance of the master Directory that is outside the boundary of its internal network. This scenario applies to agencies that use one or more Directory services internally but do not have the transaction volumes to justify a local master copy.
6.3.3.3 In the third case, the agency uses a web page or other application level interface to a remote Directory. This applies to small agencies that do not use Directory services internally but require occasional access to the contents of the master Directory. This must be regarded as an interim solution only, and would not be expected to be used by any agency in the longer term.
6.3.3.4 All three variations can coexist across agencies: which one any given agency selects will be largely determined by considerations of availability and performance.
6.3.4 Benefits of the Centralised Approach.
6.3.4.1 The propagation of data changes and events is managed by consistent internal protocols between copies of the single Directory.
6.3.4.2 Access control is propagated between instances and enforced uniformly.
6.3.4.3 Each agency only needs to establish a gateway or interface to one master Directory, as opposed to the multiple interfaces required under the distributed option.
6.3.4.4 The schema is consistent and enforced at all instances of the master Directory.
6.3.4.5 The master Directory will have a single management interface. All other copies will therefore automatically reflect the centralised updates.
6.3.4.6 Responsibility for the integrity of the master Directory lies with one vendor.
6.3.4.7 Availability of the master Directory is a function of the replication and fault-tolerance properties of the selected Directory product.
6.3.5 Risks of the Centralised Approach.
6.3.5.1 Agencies will be required to have a gateway to the specified master Directory. This could be seen as eroding agency autonomy.
6.3.5.2 The most likely implementation of this approach would be based on a single Directory product, giving rise to a risk of 'vendor capture'.
6.3.5.3 A master Directory implemented through a single product would also be vulnerable to systemic product software defects.
6.4 Recommended Architecture
6.4.1 This paper recommends the use of a single centralised Directory. The primary reason for this is simplicity, both of implementation and of operation. This relates in large degree to the problem of efficient handling of updates and modifications to data in a directory, as explained in the following paragraphs.
6.4.2 With certain specialised exceptions, directories are required to take on the difficult task of keeping multiple instances of physically distributed information in a consistent or synchronised state. The value they bring to an organisation or group of organisations is to be the definitive point of reference for this information.
6.4.3 While being highly optimised for read access, directories must also naturally allow for information to be modified (updated, inserted, and deleted). When a directory is modified, the modification is first applied to the closest data store or node, and then must be applied to the other data stores that comprise the directory service.
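The update path described in 6.4.3 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the mechanism of any particular product: the names `Replica` and `DirectoryService` are hypothetical, and propagation here is synchronous and unoptimised, whereas real directory products use their own replication techniques.

```python
from typing import Dict, Iterable, Optional

Entry = Dict[str, str]  # hypothetical entry format: attribute name -> value


class Replica:
    """One data store (node) holding a copy of the directory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: Dict[str, Entry] = {}

    def apply(self, op: str, key: str, value: Optional[Entry] = None) -> None:
        if op in ("insert", "update"):
            if value is None:
                raise ValueError(f"{op} requires a value")
            self.entries[key] = dict(value)  # store a copy, not a shared reference
        elif op == "delete":
            self.entries.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")


class DirectoryService:
    """A single logical directory made up of several replicas."""

    def __init__(self, replicas: Iterable[Replica]) -> None:
        self.replicas = list(replicas)

    def modify(self, closest: Replica, op: str, key: str,
               value: Optional[Entry] = None) -> None:
        # The modification is applied first to the closest data store ...
        closest.apply(op, key, value)
        # ... and must then be applied to every other data store.
        for replica in self.replicas:
            if replica is not closest:
                replica.apply(op, key, value)
```

Even in this toy form, every modification costs one operation per replica, which is why the efficiency of the propagation step matters so much to directory designers.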
6.4.4 Within any given directory product, the designers will have used proprietary techniques to optimise the speed and efficiency with which modifications to data are handled (technically, these include data compression, transmission of changes only, and network multicast). However, these replication mechanisms remain the intellectual property of their vendors and are seen as important product differentiators. This situation is not anticipated to change markedly in future.
6.4.5 If directory-to-directory interface standards were sufficiently mature, then a single logical directory could be constructed out of many directories (i.e. the distributed approach described in section 6.2). However, this project has concluded that this is not the case and that there would be considerable technical risk attached to adopting a distributed approach.
6.4.6 In consequence this paper recommends that the S.E.E. Directory be based on a single directory service, as depicted in the following diagram.

6.4.7 Under this architecture, a single master Directory is used, but with physical implementation in several copies (numbered 1, 2 and 3 in the diagram). These copies would be held in separate locations to optimise performance, availability and resilience.
6.4.8 As depicted in the diagram, a range of interface options can be used for communications between agencies and the master Directory. These include XML (and an emerging XML-based variant, DSML, the Directory Services Markup Language) and use of a meta Directory approach (see Glossary in Appendix 5). In each case the reference schema provides a common 'language' for interfacing with the master Directory.
6.4.9 Each agency is provided with its 'view' of the whole Directory under the control of the rules laid down by the governing body.
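One way to picture the agency 'view' in 6.4.9 is as a filter applied to the whole Directory. The Python sketch below assumes, purely for illustration, that the governing body's rules can be expressed as a set of attribute names each agency is permitted to see; the function `agency_view`, the rule format and the sample entry are all hypothetical and do not represent the actual S.E.E. rules.

```python
from typing import Dict, Set

Entry = Dict[str, str]  # hypothetical entry format: attribute name -> value


def agency_view(directory: Dict[str, Entry],
                visible_attributes: Set[str]) -> Dict[str, Entry]:
    """Return a copy of the directory restricted to the attributes an agency may see."""
    return {
        name: {attr: value for attr, value in entry.items()
               if attr in visible_attributes}
        for name, entry in directory.items()
    }


# Hypothetical data and rules, for illustration only.
directory = {
    "cn=Alice": {"mail": "alice@a.example", "telephoneNumber": "555-0100",
                 "homeAddress": "1 Example Street"},
}
rules = {"AgencyB": {"mail", "telephoneNumber"}}

view_for_b = agency_view(directory, rules["AgencyB"])
# view_for_b contains mail and telephoneNumber for Alice, but not homeAddress.
```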