Architecture Principles; Data Principles


Management of data is a complex topic and these principles provide guidelines for how data should be dealt with to maximise its value to the University and to provide a framework within which the University will manage its data.

Level One Data Principles

Name

Statement

All data is valuable

For us the value of data lies not just in the way we could monetise it, but in how it contributes to the delivery and assurance of safe, compassionate and effective education and research, and in how the data we use and create supports us in delivering our activities and enabling effective decision making.

All data has an owner

For all the data we hold or access, it is important that we understand what role we are fulfilling, so that we understand the responsibilities we have and can work with the data appropriately.

Data must be understood

Data is a representation of some aspect of reality and can only be used effectively, appropriately and reliably to contribute to valuable outcomes if it is understood.

Data must have a known purpose

If we cannot identify a purpose for the data we gather, we should not be gathering it. It is desirable for data to have multiple purposes (and essential that all these purposes are recorded) but there will usually be one primary reason the data is valuable to us.

Data must have context

To be able to understand data and reuse it effectively it is important that we understand some basics about the data itself.

Data is defined consistently throughout the enterprise, and the definitions are understandable and available to all users.

Data must have known quality

To ensure that data can be used safely to drive design and decision making it is important that the quality of the data is recorded.

Data should be open

Wherever possible we will work to open data and cross-government standards. This will ensure that data quality is not eroded by avoidable or unnecessary transformation or translation.

Data use is traceable, legal and ethical

It should go without saying that the use we put data to should be bound within the legal and ethical framework we work within.

Use data to prompt appropriate action

Data should be used to initiate traceable action when appropriate; over time we should be able to identify patterns in the data that we have discovered often accompany problems.

The nature and severity of problems associated with data patterns should prompt appropriate action within our systems whether these be technical or procedural.

Data should be digital by default

Although data in this strategy does not just refer to digital data and our vision of data is for data in all its forms, where possible and practical we will digitise physical data to maximise its utility.

Reuse data, don’t recreate it

Reuse of data must be preferred over recreating or recapturing it, but to reuse it effectively it must be in media and formats that assist its reuse with minimum effort.

Data is protected 

We expect that data will be secured from accidental or malicious access (or alteration) by unauthorised users. 

Sensitive data are exchanged securely.

The confidentiality and integrity of sensitive data need to be ensured.

Access to a data repository should be in accordance with the data classification specified for the sourced data.

Level Two Data Principles

Name

Statement

Data items have a master

The University will have a conceptual data model, which will consist of groups of logical entities. Specific applications will be assigned data mastery for those key entities.

Applications can take copies of data mastered elsewhere, but those copies must be treated as read only. Any change to data mastered in another application must be made in the master system using an API service or equivalent. All copies should be locked or frozen.

All changes must ultimately be applied to the master and from there replicated to all copies. 

It must be possible to rebuild all copies from the master. 

The master should be at least as available as the copies.

The master must be recoverable from backups.

Multi-mastery situations should be avoided.
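As an illustration only, the mastering rules above could be sketched as follows. The class names MasterStore and ReadOnlyCopy, and the in-memory dictionaries standing in for applications, are assumptions made for this example and do not describe any University system.

```python
from copy import deepcopy


class MasterStore:
    """Hypothetical master application: the only place records are changed."""

    def __init__(self):
        self._records = {}
        self._copies = []

    def register_copy(self, copy):
        # A newly registered copy is rebuilt from the master straight away.
        self._copies.append(copy)
        copy._rebuild(self._records)

    def update(self, key, value):
        # Apply the change to the master first, then replicate it to every copy.
        self._records[key] = value
        for copy in self._copies:
            copy._rebuild(self._records)


class ReadOnlyCopy:
    """Hypothetical downstream copy: readable, never edited directly."""

    def __init__(self):
        self._records = {}

    def _rebuild(self, master_records):
        # Called by the master only; replaces the whole copy.
        self._records = deepcopy(master_records)

    def get(self, key):
        return self._records[key]

    def update(self, key, value):
        # Direct edits are refused; changes must go through the master.
        raise PermissionError("copy is read only; change the record in the master")


master = MasterStore()
reporting_copy = ReadOnlyCopy()
master.register_copy(reporting_copy)
master.update("student-1", {"name": "A. Example"})
```

Because every copy is rebuilt wholesale from the master, a lost or corrupted copy can be recreated at any time by registering it again.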

Data changes are audited

All data changes are audited so that the provenance of each change is traceable.

Store who, what, when, why and (when reasonable to do so) the before and after values of each change.
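A minimal sketch of such an audit record is shown below. The names AuditEntry and audited_update, and the use of an in-memory list as the audit log, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class AuditEntry:
    """One immutable audit record (hypothetical structure)."""
    who: str
    what: str
    when: datetime
    why: str
    before: Optional[Any] = None
    after: Optional[Any] = None


audit_log = []


def audited_update(record, field, new_value, who, why):
    """Change one field of a dict record and append an audit entry."""
    entry = AuditEntry(
        who=who,
        what=field,
        when=datetime.now(timezone.utc),
        why=why,
        before=record.get(field),  # value before the change, if any
        after=new_value,
    )
    record[field] = new_value
    audit_log.append(entry)
    return entry


student = {"email": "old@example.org"}
audited_update(student, "email", "new@example.org",
               who="registry-clerk", why="student request")
```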

Data integrity

Capture business data once at the point of creation and manage it actively throughout its life cycle up to and including eventual disposal.

Data is maintained in the source application.

Adequate data controls and independent checks and balances must exist within and between University applications.

Data that is exchanged adheres to a canonical data model

The University canonical data model provides a high-level insight into all data that is used in processes and applications and is managed centrally.

A canonical data model standardises the definitions of data that are exchanged within the organisation.

Use common data definitions to prevent unnecessary translations and semantic differences.
All messages exchanged between applications use the schemas that codify the canonical data model.
Applications that are unable to adhere to the canonical data model rely on integration to translate their application-specific data model to the canonical data model.
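The translation step described above might look like the following sketch. The canonical entity CanonicalPerson, the source field names STU_NO and FULLNAME, and the function from_student_system are all invented for this example; the real canonical model and application schemas will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalPerson:
    """Hypothetical canonical entity shared by all applications."""
    person_id: str
    family_name: str
    given_name: str


def from_student_system(row):
    """Translate a hypothetical application-specific row into the canonical form."""
    # The assumed source format stores the name as "Family, Given".
    family, given = [part.strip() for part in row["FULLNAME"].split(",", 1)]
    return CanonicalPerson(
        person_id=str(row["STU_NO"]),
        family_name=family,
        given_name=given,
    )


message = from_student_system({"STU_NO": 1042, "FULLNAME": "Example, Alex"})
```

Keeping the translation in the integration layer means the consuming applications only ever see the canonical form.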

Data classification

UoL will follow HMG guidance for data classification. Typically, all data should be classified as ‘Official’.

Within the University’s broad classification of ‘Official’, there are two sub-types, ‘Sensitive’ and ‘Confidential’, where ‘Sensitive’ data is classified as SGII.

Some information will be published and treated as public but restricted by exception.

Data classification should be applied to all data as specified by the University data classification. Whenever data is shared between applications, its classification must be specified so that appropriate controls can be applied to secure it.
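One possible way to enforce that shared data always carries a classification is sketched below. The label strings and the build_share_message function are assumptions for illustration; the authoritative labels are those defined by the University data classification.

```python
# Hypothetical label set, based on the classifications named above.
ALLOWED_CLASSIFICATIONS = {"Official", "Sensitive", "Confidential"}


def build_share_message(payload, classification):
    """Wrap a payload for sharing, refusing missing or unknown labels."""
    if classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"unknown or missing classification: {classification!r}")
    return {"classification": classification, "payload": payload}


shared = build_share_message({"module": "COMP101"}, "Official")
```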

Data storage

Data is stored in locations which are regularly backed up rather than in temporary storage.  This means that centralised managed data stores are preferable to multiple local data copies.

Azure cloud is the default storage mechanism for the University’s data.

Data access rights must be granted at the lowest level

Access rights must be granted at the lowest level necessary for performing the required data operation.

Providing users or systems with more access rights to the data or for a longer period than strictly necessary introduces unnecessary risk of abuse.

Users do not log in using administrator accounts.
Access rights are based on the role of the user.
Access should be granted only for the amount of time necessary.
Access rights that are no longer needed are revoked.
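The rules above (role-based, time-limited, revocable access) can be sketched as follows. The role names, the ROLE_PERMISSIONS table and the AccessGrant class are assumptions made for this example, not a description of an existing access control system.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from role to the operations that role may perform.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "editor": {"read", "write"},
}


class AccessGrant:
    """A role-based grant that expires and can be revoked."""

    def __init__(self, user, role, granted_at, duration):
        self.user = user
        self.role = role
        self.expires_at = granted_at + duration
        self.revoked = False

    def allows(self, operation, now):
        # Revoked or expired grants allow nothing.
        if self.revoked or now >= self.expires_at:
            return False
        # Otherwise only the operations attached to the role are allowed.
        return operation in ROLE_PERMISSIONS[self.role]


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
grant = AccessGrant("a.example", "reader", start, timedelta(hours=8))
```

Passing the current time in explicitly, rather than reading the clock inside allows, keeps the check easy to test.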

Data archiving

Data for obsolete or inactive data entities should be archived for audit purposes.

Short-, medium- and long-term archiving tiers, and the best technology for archiving University data, are to be agreed.

Long-term storage solutions will be used for data which has been identified as inactive. 

Data to be archived for the long term will utilise low cost storage technology.

For archived data that requires faster recoverability, such as legal and compliance data, we should consider a solution based on short-term storage technology. This would require business justification because of the cost.

Data retention

Data retention policies will support persistent data and records management to meet legal and business data archival requirements.

University data retention policy will be applied to data that is required to be archived for:

  • Compliance and legal reasons, along with the associated archival timeline.
  • Business specific requirements.
  • Operational purposes.

Once the agreed retention period has expired, data will be disposed of in accordance with the University disposal policy.
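As a sketch only, a disposal check against agreed retention periods could look like this. The retention periods in RETENTION_DAYS are invented placeholders, not University policy values, and is_due_for_disposal is a hypothetical helper.

```python
from datetime import date, timedelta

# Placeholder retention periods in days (assumed values for illustration only).
RETENTION_DAYS = {
    "compliance": 7 * 365,
    "business": 3 * 365,
    "operational": 365,
}


def is_due_for_disposal(archived_on, reason, today):
    """Return True once the retention period for this reason has expired."""
    return today > archived_on + timedelta(days=RETENTION_DAYS[reason])
```

For example, with these placeholder periods a record archived on 1 January 2020 for operational purposes would be due for disposal by 1 June 2021, whereas the same record held for compliance reasons would not.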

Data quality

Data quality will be maintained at a level that effectively supports business needs.

Data are provided by the source application.

Data are captured once.

Data are maintained in the source application.

Data quality rules should not be hard-coded in the application; instead they should be managed by a data quality management tool.
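To illustrate the idea of keeping quality rules outside application code, the sketch below holds the rules as data and evaluates them with a small generic engine. The rule format, the field names and the failed_rules function are assumptions; a real data quality management tool would supply its own rule language.

```python
import re

# Rules held as data (hypothetical format); in practice these would be
# maintained in a data quality tool rather than in the application.
QUALITY_RULES = [
    {"field": "email", "check": "matches", "arg": r"[^@\s]+@[^@\s]+"},
    {"field": "student_no", "check": "not_empty", "arg": None},
]

# Generic checks that the rules can refer to by name.
CHECKS = {
    "matches": lambda value, arg: isinstance(value, str)
    and re.fullmatch(arg, value) is not None,
    "not_empty": lambda value, arg: value not in (None, ""),
}


def failed_rules(record, rules=QUALITY_RULES):
    """Return the fields of the record that fail their quality rule."""
    return [
        rule["field"]
        for rule in rules
        if not CHECKS[rule["check"]](record.get(rule["field"]), rule["arg"])
    ]
```

Changing or adding a rule then means editing the rule data, not redeploying the application.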

Data compliance

The University will always strive to comply with international best practice in data protection and privacy.

Data ownership

Every data item must have an owner.

Data ownership must be articulated, with the associated responsibilities clearly defined for the data owner.

Ensure data owners are involved in projects that ‘touch’ their data.

Data governance

Data governance is everyone’s responsibility. All data stakeholders contribute to data governance policies and their implementation and adoption.

Data lineage provides important metadata for data consumers and should be recorded and available where needed.

Rules for data usage, sharing, ownership and management responsibility need to be defined and standardised.