Learning Objectives
- Understand the three basic types of models
- Understand the concept of separating structure from format
Power system engineers have utilized digital computing in a wide variety of applications, whether for performing complex analysis calculations in power system studies or for controlling real-time network operations. All power system applications require digital storage and exchange of data about the electric grid.
Proprietary Power System Data Formats Present an Integration Challenge
Large enterprise systems such as an Energy Management System (EMS), Enterprise Asset Management (EAM) system or Meter Data Management System (MDMS) typically use proprietary database schemas defining the structure of data storage within their software products. These schemas are customized to support the application’s specific technical and functional requirements. Likewise, file formats for “offline” power system-related applications used in planning and operations (typically referred to as “OT systems” used in control centers) have proprietary formats. Those formats have been extended over many years as the products have matured.
These formats work well when used only for exchanging the data required by that particular application; however, they are fixed and do not lend themselves well to extensibility. The addition of extra items can break any software that encounters unknown entries while parsing the data. These formats typically lack metadata, which keeps files compact, but wider usability is compromised since the formats are not self-describing. Further, if the documentation is proprietary and not publicly available, it is often difficult to determine the attributes and characteristics necessary to build an interface that leverages the file format.
Enterprise-Level Systems Integration Requires Extensible Data Exchange Formats
In modern utilities’ IT infrastructures, large-scale applications such as the EMS and asset-management system communicate with each other, generally using a vendor’s custom format based on the internal database schema. In the past this often required the user to purchase each piece of enterprise-level software from the same vendor to ensure compatibility when integrating them, or to write an interface to translate between the different vendors’ formats.
As the power industry moved toward deregulation and more utilities moved away from writing their own custom software solutions, it became common for a utility to run software from a number of different vendors and to exchange large data sets on a regular basis. The use of proprietary, custom formats complicates this exchange, requiring complex translation between each of the custom formats.
Similarly, offline applications use a rigid, proprietary format containing only the data required by that particular version of the application. When subsequent versions of the program require additional details the file format is changed, resulting in multiple formats for a single application. This practice is common in the software industry, not just with power system applications. However, the impacts of these changes are minor since vendors typically provide import tools to convert prior file formats into the newest version format.
Larger challenges arise when data must be exchanged at an enterprise level. When a company needs to exchange data between software applications from different vendors, or runs multiple versions of the same software product in production, it must consider the best strategy. Such a scenario typically presents the following options:
- Maintain multiple copies of the same data in multiple formats.
- Store the data in a format compatible with every piece of software, requiring the removal of application-specific data and a subsequent loss in precision.
- Store the data in a single, highly-detailed format and create software to translate from this highly-detailed format to the desired application file formats.
- Use a highly detailed format that is compatible with every application, whose standard content covers the basic data required to represent the power system while allowing additional, detailed, application-specific data to be included without invalidating the format.
The third option requires additional software engineering on the part of the company to create translation tools, but allows it to maintain only a single format containing all the required data.
The fourth option represents the ideal solution, allowing a company to maintain a single, highly detailed format that is compatible with any of their software. This option does, however, depend on three factors:
- A highly detailed model to describe the power system
- A file format that supports extensibility without affecting the core data
- Acceptance of this data model and format by power system software vendors and utility buyers (whether motivated by economic or regulatory reasons)
The Common Information Model (CIM) for Power Systems addresses the first factor by supplying a highly detailed power system model that covers design and operational differences across the globe. The model is expressed in Unified Modeling Language (UML), which is widely accepted as a software development standard. The second factor can be addressed with eXtensible Markup Language (XML), combined with the Resource Description Framework (RDF). The third factor is more of a commercial and regulatory challenge than a technical one: universal acceptance requires both utilities and vendors to acknowledge the benefits of adopting the standard. At present, many major power system application vendors are active participants in the IEC working groups which extend these standards, and many have participated in CIM Interoperability tests for power system model exchange and enterprise-level message exchange.
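To illustrate how XML combined with RDF yields a self-describing, extensible payload, here is a minimal Python sketch using the rdflib library. The class and attribute names (Breaker, IdentifiedObject.name, Switch.normalOpen) follow CIM naming conventions, but the namespace URI and the identifier are placeholders rather than a specific published CIM profile.

```python
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import RDF

# Placeholder namespace for illustration; a real exchange would use the
# namespace of a published CIM profile.
CIM = Namespace("http://example.org/CIM-schema-cim#")

g = Graph()
g.bind("cim", CIM)

# A single breaker, identified by an mRID-style URI (illustrative value).
breaker = URIRef("urn:uuid:9f2d7b6e-0000-0000-0000-000000000001")
g.add((breaker, RDF.type, CIM.Breaker))
g.add((breaker, CIM["IdentifiedObject.name"], Literal("BRK-001")))
g.add((breaker, CIM["Switch.normalOpen"], Literal(False)))

# Serialize the same structured data as RDF/XML; properties added later
# simply become extra statements and do not break existing parsers.
print(g.serialize(format="xml"))
```

A receiving application that does not recognize one of the properties can simply ignore the corresponding statement, which is exactly the extensibility that fixed-column formats lack.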
Information Modeling Supports Data Abstraction
“An information model is a representation of concepts, relationships, constraints, rules, and operations to specify data semantics for a chosen domain of discourse”
-- Y. Tina Lee, Information Modeling: From Design to Implementation, NIST, 1999
Information modeling is an approach to dealing with and managing data that uses an abstract, implementation-agnostic model to describe the structure of the entities within a particular software domain. This approach is characteristic of Model-Driven Engineering (MDE), which promotes the use of conceptual models and abstract object representations. One instance of MDE is the Object Management Group’s Model Driven Architecture (MDA)1, within which file formats, database schemas, and internal application data structures can all be derived from a model.
Footnotes
1 For more information see the Object Management Group (OMG) website for MDA: https://www.omg.org/mda/.
Three Models for Data Management Must Be Considered
From the perspective of a software architect there are really three models that need to be taken into consideration when writing software or designing a system architecture:
- The External Model
- The Internal Model
- The Conceptual Model
The External Model
describes the data that is exposed to the user or will be shared outside of the application. For example, this external model would define the data exposed as part of a user-interface or written to a file.
The Internal Model
describes how the application or system stores the data internally, whether within in-memory data structures or a database schema. This data is available to the internal processes and algorithms.
The Conceptual Model
is the abstract definition that integrates the internal and external model. Ideally the internal and external models are both derived from the conceptual model.
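As a minimal sketch of the three models, assuming an invented Breaker entity rather than anything from a specific standard, the Python fragment below defines one conceptual entity and then derives an internal representation (a relational table) and an external representation (a reduced export record) from it.

```python
from dataclasses import dataclass, asdict

# Conceptual model: the abstract definition, independent of storage or exchange format.
@dataclass
class Breaker:
    mrid: str          # unique identifier
    name: str          # human-readable name
    normal_open: bool  # normal switch position
    solver_flags: int  # internal detail used only by algorithms

# Internal model: how the application stores the data (here, a relational table).
INTERNAL_DDL = """
CREATE TABLE breaker (
    mrid TEXT PRIMARY KEY,
    name TEXT,
    normal_open INTEGER,
    solver_flags INTEGER
);
"""

# External model: the subset exposed to users or other applications.
def to_external(b: Breaker) -> dict:
    record = asdict(b)
    record.pop("solver_flags")  # internal-only detail is not exported
    return record

print(to_external(Breaker("b1", "BRK-001", False, 0b101)))
```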
In some cases, the three models will be the same, but often it is impossible to realize an internal or external model directly from the conceptual model in situations where the conceptual model uses concepts that cannot be directly mapped into the implementation technology. For example, as will be described in more detail later, UML supports inheritance, where a class inherits the properties of its parent class. A normal database schema does not have a concept of inheritance for its table definitions, but there are multiple ways to map a class structure into a database and thus generate an internal model2. The internal model will in essence be the database schema, which differs from the conceptual model but can be automatically derived from it.
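Because SQL has no notion of inheritance, a model-to-schema generator must pick a mapping strategy. The sketch below shows two common ones, single-table and joined-table inheritance, for a hypothetical ConductingEquipment → Switch → Breaker hierarchy; the table and column names are illustrative only.

```python
import sqlite3

SINGLE_TABLE = """
-- Strategy 1: one table for the whole hierarchy, with a discriminator column.
CREATE TABLE equipment (
    mrid TEXT PRIMARY KEY,
    kind TEXT,            -- 'Switch', 'Breaker', ...
    name TEXT,
    normal_open INTEGER,  -- only meaningful for switches
    rated_current REAL    -- only meaningful for breakers
);
"""

JOINED_TABLES = """
-- Strategy 2: one table per class, joined through the shared primary key.
CREATE TABLE conducting_equipment (mrid TEXT PRIMARY KEY, name TEXT);
CREATE TABLE switch  (mrid TEXT PRIMARY KEY REFERENCES conducting_equipment, normal_open INTEGER);
CREATE TABLE breaker (mrid TEXT PRIMARY KEY REFERENCES switch, rated_current REAL);
"""

# Either script yields a valid internal model; both could be generated
# automatically from the same conceptual (UML) model.
for ddl in (SINGLE_TABLE, JOINED_TABLES):
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    conn.close()
```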
Similarly, the external data that is exposed to the user or exported to other applications may be only a subset of the overall conceptual model, which covers all data used by the application even if it is only used by the internal processes. The external model can thus be auto-generated from the conceptual model as a subset, still closely aligned but not identical.
Footnotes
2 For the most recent UML specifications, see https://www.omg.org/spec/UML/
Information Models Can Be Defined for Existing Applications
For situations where an application already exists and the conceptual model is not being created from scratch, there is often a need to extract an information model from the pre-existing data structures or file formats. This will make it easier to understand and interpret the application’s data and also allow Model Driven Architecture (MDA) based technologies such as Model Driven Transformation to be used with the data.
In the simplest case an information model already exists for the data because the original architects of the software, system, or standard used a model-driven methodology.
On other occasions a simple conceptual model can be automatically derived from existing data structures such as a database schema or XML Schema Definition (XSD). Such a model will mirror an internal or external model, so additional manual refinement may be required to harmonize the internal and external models into a single conceptual model.
The most time-consuming situation is when there is no means to automatically derive a model and it must be manually created. This can be accomplished by interpreting documentation or reverse-engineering code to identify the data structures.
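As a sketch of the "derive a model automatically" case mentioned above, the fragment below introspects a SQLite database and emits one class skeleton per table. A real tool would also recover relationships, keys, and richer data types; the breaker table here is created purely for demonstration.

```python
import sqlite3

def derive_model(conn: sqlite3.Connection) -> str:
    """Emit a rough conceptual-model skeleton (one class per table) from an existing schema."""
    lines = []
    tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    for (table,) in tables:
        lines.append(f"class {table.title()}:")
        # PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
        for _cid, column, col_type, *_ in conn.execute(f"PRAGMA table_info({table})"):
            lines.append(f"    {column}: {col_type or 'TEXT'}")
        lines.append("")
    return "\n".join(lines)

# Example: build a tiny schema in memory, then reverse-engineer a class skeleton from it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE breaker (mrid TEXT PRIMARY KEY, name TEXT, normal_open INTEGER)")
print(derive_model(conn))
```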
Information Models Help Separate Structure from Format
By defining an information model for an application or system, the structure of the data is decoupled from its serialization format. This has the benefit of making it easier to understand the data within the system as abstract entities rather than as a particular line or column of text within a file, or as one or more columns and tables in a database. This also allows the same data to be serialized in multiple different formats without impacting its structure and definition.
An information model can be used to derive database schemas, file formats, user interfaces and, often, documentation and interfaces automatically. As such it decouples the data from any one particular format, enabling it to be saved to or loaded from multiple locations and formats without any loss of data. From a software engineering perspective, the model should be defined before any software is written; from a systems integrator’s perspective, the model should be created before the interfaces are defined in any level of detail.
By having a well-defined data structure both users and developers will find it easier to interpret, understand and use the data without having to understand the added complexity of a particular serialization format or technology.
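To make the separation of structure from format concrete, the following sketch serializes the same record, defined once as an illustrative Substation structure, to two different formats. Neither output follows a published standard; the point is that the structure is declared a single time and the formats are derived from it.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict

# The structure is defined once, independent of any serialization format.
@dataclass
class Substation:
    mrid: str
    name: str
    voltage_kv: float

def to_json(s: Substation) -> str:
    return json.dumps(asdict(s), indent=2)

def to_xml(s: Substation) -> str:
    root = ET.Element("Substation")
    for field, value in asdict(s).items():
        ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

sub = Substation("sub-42", "Riverside", 132.0)
print(to_json(sub))   # one format...
print(to_xml(sub))    # ...and another, derived from the same structure
```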
CIM Standards Provide a Semantic Model
The conceptual model discussed above is most often referred to in the CIM world as a semantic model. The CIM acts as an ontology of a given business domain, or a vocabulary by which utility business objects and processes can be precisely defined. The model contains business objects and the relationships between them, with a clearly defined set of rules. UML is then used as the language for expressing the model and its vocabulary.
Because the CIM serves as a reference to which any interface can be mapped, it is sometimes referred to as a Canonical Data Model (CDM). Compared with one-to-one mapping methods for systems integration, mapping data sources to a common semantic model offers a much more scalable and maintainable way to manage and integrate enterprise data.
In the context of utility integration projects, a semantic model is applied by organizing a discrete layer for use by adapters (that is, data converters and mapped interfaces). As the CIM is expanded and extended over time, it provides a trusted information exchange approach that is independent of individual system technologies, information architectures, and applications.
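As a rough sketch of such an adapter layer, the fragment below maps two hypothetical vendor record layouts onto one canonical record. The vendor field names and the canonical attribute names are invented for illustration; in a real project the canonical side would be the relevant CIM classes and attributes.

```python
# Each adapter maps one vendor layout onto the canonical (semantic) model,
# so N systems need N adapters instead of point-to-point translators between every pair.
VENDOR_A_MAP = {"BRKR_ID": "mrid", "BRKR_NAME": "name", "NORM_STATE": "normal_open"}
VENDOR_B_MAP = {"id": "mrid", "label": "name", "isOpen": "normal_open"}

def to_canonical(record: dict, field_map: dict) -> dict:
    """Rename vendor-specific fields to their canonical names, dropping unmapped fields."""
    return {canonical: record[vendor] for vendor, canonical in field_map.items() if vendor in record}

a_record = {"BRKR_ID": "b1", "BRKR_NAME": "BRK-001", "NORM_STATE": 0}
b_record = {"id": "b1", "label": "BRK-001", "isOpen": False}

print(to_canonical(a_record, VENDOR_A_MAP))
print(to_canonical(b_record, VENDOR_B_MAP))
```

With a canonical model in the middle, adding a new system means writing one new adapter rather than a translator to every existing format.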
Case Study
As an integration architect, Jeff Kimble is already familiar with the benefits of using XML to create self-describing file formats. But he is also familiar with having to deal with the proprietary, legacy interfaces that are typical in the utility IT landscape. In the traditional interface development lifecycle, changes tend to be “one-offs,” that is, each time a change is made it starts with the old interface specification. A copy is made of this interface, and then changes to that interface are adapted for the new specification. Finally, changes to the new interface are tested (if they are not missed in the test plan) against all of the applications the old interface was integrated with.
Jeff wants to learn more about the Model Driven Architecture approach and what is meant by a semantic model. Starting with a model of the data, before designing the interface, appears to add efficiency to code design by front-loading the design work. Error reduction seems more likely when using a “model is king” approach, since the model can be used to generate database schemas as well as message payloads. This approach re-imagines the applications from the internal data outward, rather than from the external interface inward to the application.
When using a Model Driven Architecture, three models need to be considered:
A. Internal, external, conceptual
B. External, conceptual, super
C. Conceptual, perspective, integration
D. Internal, conceptual, power system
Answer: A. Internal, external, conceptual

The internal model describes:
A. How the application or system stores the data
B. The data that is exposed to the user
C. The abstract definition of the data
D. How the data is stored in a table
Answer: A. How the application or system stores the data

The external model describes:
A. How the application or system stores the data
B. The data that is exposed to the user
C. The abstract definition of the data
D. How the data is stored in a table
Answer: B. The data that is exposed to the user

The conceptual model describes:
A. How the application or system stores the data
B. The data that is exposed to the user
C. The abstract definition of the data
D. How the data is stored in a table
Answer: C. The abstract definition of the data

Transforming data between different vendor solutions causes which challenges:
A. Having multiple copies of the same data in different formats
B. Using a highly detailed format that is compatible with every application
C. Storing the data in a format compatible with every piece of software, requiring the removal of application-specific data and losing precision
D. All of the above
Answer: D. All of the above