Change Management: Technologies in Heightened Awareness


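There are occasions when a technology is placed into Heightened Awareness.  This article outlines which technologies are currently in Heightened Awareness, the reasons a technology may be placed there, and what is expected while it remains there.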

Technologies Currently in Heightened Awareness

Technology: LURCIS
Reason: Concerns around the stability and support for LURCIS.
Date placed in: Fri 16/06/2017
When to remove: Will not be removed until retired.

Technology: Documentum
Reason: Concerns around the stability and support; the product and hardware are now outside of maintenance.
Date placed in: Fri 02/11/2018
When to remove: Will not be removed until retired.

Technology: PGT CRM
Reason: After a number of incidents within a short timeframe.
Date placed in: Fri 20/05/2022
When to remove: There have been a number of issues with the technology leading to incidents and MIs in recent months.  A number of elements of the system are leaving or have left support.  It was felt that PGT CRM should remain in heightened awareness until replaced.

Technology: IMC Management System (Screeton)
Reason: The system is being placed into heightened awareness because:
  • It is based on outdated, unsupported software.
  • Developers have left the organisation and documentation is minimal.
  • It runs on unsupported hardware with an outdated and unsupported operating system.
Date placed in: Fri 15/07/2022
When to remove: Will not be removed until retired.  May need to run until July 2023.

Technology: Syllabus+
Reason: INC0615093 - Lack of data to support timetabling.  Syllabus+ experiences issues dealing with the amount of data change that takes place at the start of each academic year.  This year, these issues caused an MI.  The software is very old and needs to be updated or replaced.  The business are writing a Business Case for this work to happen.
Date placed in: Fri 14/10/2022
When to remove: Until upgraded or replaced.  If the option to upgrade is chosen, the software needs to demonstrate it can deal with the amount of data change that takes place at the start of an academic year.

Technology: Radioactive Substances Interactive Database (RSID) and Lasers Inventory Database (VIRGIL)
Reason: There is no vendor support for these applications.  Both are essential for legislative compliance and security.  Although currently stable, there is no contingency if there is a failure.  They also contain personal information of staff who have undertaken training to use lasers and radioactive material.  Both are hosted on RSID01HV.
Date placed in: Mon 03/07/2023
When to remove: Until replaced by a system with vendor support, or a new vendor takes on support of the application.

Technology: 6840 Distribution Layer Switches
Reason: There is an EMEA-wide shortage of Cisco 6840 switches.  These have been prone to failure during power events.  Distribution layer switches must work as a pair of identical hardware switches, so any failure would require both switches to be replaced with alternative hardware, causing disruption.  Cisco 6840 switches operating as a Distribution layer pair are placed into Heightened Awareness for any change relating to power work impacting the building hosting the switch.
Date placed in: Fri 25/08/2023
When to remove: Until supply issues are resolved, or the 6840 Distribution Layer Switches are all replaced with hardware in general supply.

 

Reasons for Heightened Awareness

A technology may be placed into heightened awareness for a number of reasons.  The table below gives several examples.

Reason: Major Incident (MI)
Description: After an MI, it is usual for a technology to be placed into heightened awareness for a minimum of 6 weeks.  This period can be longer if it is felt that certain actions need to take place before faith in the technology is restored.

The period of heightened awareness is implemented for several reasons.

  1. We have just recovered from a service outage.  IT need to ensure that we aren't responsible for causing more outages which impact the business's ability to function and deliver.
  2. There needs to be a period of stability on the configuration.  We need to ensure that any changes being made are to reduce the risk of a repeat MI.  If several changes are made and the system crashes, it will be more difficult to understand whether this is an MI with the same root cause, or one caused by newly introduced configuration.

Reason: Support Concerns
Description: Where a technology has not experienced an MI, but there are concerns about the support or stability of the technology, it may be placed in heightened awareness as a precaution.  For example:

  • Where a key system was developed in house and the people who understand how to manage and maintain it have left the University.
  • Where the hardware solution has left warranty and replacement parts are known to be difficult to source.
  • Where the software solution has left vendor support, meaning issues the University can't solve cannot be referred to the vendor.

Reason: Supporting Technologies
Description: A technology might be placed in heightened awareness due to risks introduced by other technologies that support it.  For example, if the backup solution that backed up a number of servers failed, those servers would be in heightened awareness.  While the servers themselves continue to work as expected, they could not be restored until the backup solution is up and running again.

Reason: Known Bug
Description: Where there is a known bug in a technology which means it is at increased risk of failing and causing a service outage, the technology may be placed into heightened awareness.  It would stay in heightened awareness until the risk from the known bug was removed or reduced sufficiently.  This might be by patching software, changing configuration to reduce the likelihood of the bug being triggered, or putting measures in place to recover automatically should the bug occur.
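
The six-week minimum that follows an MI can be worked out with simple date arithmetic.  The sketch below is illustrative only: the function name, the constant and the example date are assumptions made for this example and are not part of any University tooling.  It gives the earliest date on which removal could be considered; as noted above, the period can be longer.

```python
from datetime import date, timedelta

# Minimum period in heightened awareness after an MI, as stated in this article.
MIN_PERIOD_AFTER_MI = timedelta(weeks=6)


def earliest_review_date(service_restored: date) -> date:
    """Return the earliest date on which removal could be considered.

    Hypothetical helper: assumes the six weeks are counted from the day
    service was restored after the MI.
    """
    return service_restored + MIN_PERIOD_AFTER_MI


# Example with a made-up restoration date of Mon 08/01/2024.
print(earliest_review_date(date(2024, 1, 8)).strftime("%a %d/%m/%Y"))
```

The result is a lower bound only.  Any outstanding actions needed to restore faith in the technology would push the actual date back.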

Expectations when a Technology is in Heightened Awareness

When a technology is in heightened awareness, extra rigour needs to be in place around making changes.  Changes will need to be approved by the Change Advisory Board (CAB), so staff need to plan in advance and submit their change requests in good time for them to be reviewed and approved at CAB.  This is true for changes across the board, so Standard and Normal changes will both need CAB approval by default.

Emergency changes should only be logged where:-

In both cases, the requester should seek approval from an ITELT member.

Emergency changes should not be used as a way of avoiding the CAB process because a change is being logged too late to go to CAB.

CAB may decide to pre-approve certain change activities in advance; these still need to be logged.  One example of this happening previously was when a technology was in heightened awareness because a memory leak was causing the system to crash regularly; CAB pre-approved the weekly reboot of the server as a temporary measure to keep the service running whilst a longer-term fix was put in place.
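
The approval rules in this section can be summarised as a small decision function.  This is a minimal sketch under stated assumptions: the names `ChangeType` and `required_approval` are hypothetical, the returned strings are illustrative labels, and the behaviour outside heightened awareness is not covered by this article, so it is returned as a placeholder.

```python
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


def required_approval(change_type: ChangeType,
                      heightened_awareness: bool,
                      cab_preapproved: bool = False) -> str:
    """Return who must approve a change (hypothetical helper).

    Every change still has to be logged, including pre-approved ones.
    """
    if not heightened_awareness:
        # Assumption: the normal change process applies; not described here.
        return "usual process"
    if change_type is ChangeType.EMERGENCY:
        # Emergency changes need approval from an ITELT member.
        return "ITELT member"
    if cab_preapproved:
        # CAB has approved this activity in advance; it is still logged.
        return "CAB (pre-approved)"
    # Standard and Normal changes both go to CAB by default.
    return "CAB"


print(required_approval(ChangeType.STANDARD, heightened_awareness=True))
```

The function only reports who approves a change.  It does not decide whether an emergency change is justified, and logging a change late is not a reason to raise it as an emergency.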

 

Technologies previously in Heightened Awareness

Technology: NetScaler Pair AC01
Date left Heightened Awareness: Fri 14/01/2022
Reason left Heightened Awareness: The AC01 pair have now been decommissioned.

Technology: LCMM \ MediaSite
Date left Heightened Awareness: Fri 01/07/2022
Reason left Heightened Awareness: The vendor has provided details of how testing has been changed so that any future issues with edited content would be detected prior to release.  The system has now been stable for 6 weeks following the high priority incident below.
INC0601430 - Users experiencing problems with playing back edited lecture capture recordings on MediaSite

Technology: NetScaler Pair AC03
Date left Heightened Awareness: 11/08/2022
Reason left Heightened Awareness: The AC03 pair have been upgraded to a version of software that has a fix for the bug that caused the MI (INC0575080).  Since the upgrade there has been one month of stability.

Technology: Legacy Core Mailers
Date left Heightened Awareness: 16/06/2023
Reason left Heightened Awareness: The Legacy Core Mailers have been removed from our mail flow and have been decommissioned.