6 Fault management functions

3GPP TS 12.11: Fault management of the Base Station System (BSS)

According to TMN principles (see M.3200 [3] and M.3400 [4]), the TMN management service components described in the previous clauses are achieved by means of the TMN management functions described in this clause.

Since the PLMN NEs shall include the "Open Systems Interconnection" capabilities, in general, the telecommunications management functions are based on the "OSI System Management Functions" which are specified by ITU-T with the X.7xx series of recommendations. Each of these recommendations specifies a set of management functions for a specific aspect of the system management. The OSI system management functions are based on the Q.3 protocol services CMISE, ACSE and ROSE, which are specified in general by ITU-T, and in particular for the GSM System by the GSM 12.01 [19] specification.

6.1 Alarm surveillance functions

6.1.1 Threshold Management functions

The Threshold Management Functions identify the management and generation of defined alarm notifications from counter thresholds and gauge thresholds.

The management is covered by the threshold management functions identified below and is based on the recommendations in ITU-T X.721 [6]. The permitted functions are Create Thresholding, Get Thresholding, Set Thresholding, Remove Thresholding and Activate/Deactivate Thresholding.

Create Thresholding: The OS requests the NE to create the thresholding mechanism i.e. an instance of the threshold manager MOC. The threshold manager may be created either in a deactivated or activated state.

The characteristics which are defined for each member of a counter threshold are:

– Comparison Level

The counter level at which the defined notification may be generated if thresholding is activated (see Notify On or Off below).

– Initial Comparison Level

The initial value of the Comparison Level at which the defined notification may be generated. The Comparison Level is set to this level when creating thresholding, and when the counter associated with the counter threshold is reset.

– Offset Level

Whenever the counter threshold is triggered by a counter crossing the Comparison Level, the Comparison Level itself is incremented by the offset level, if the offset level is not zero.

– Severity

Severity level to be included in the defined notification.

– Notify On or Off

The thresholding mechanism and the generation of defined notifications for the counter threshold or gauge threshold when crossing the Comparison Level may be switched on or off.
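The interaction of the counter threshold characteristics listed above can be illustrated with a minimal Python sketch. It is not the GDMO/ASN.1 definition: the class name, the method names and the dictionary used as a notification are hypothetical, and the sketch assumes that a member whose notification is switched off is not triggered at all (so its Comparison Level is not advanced).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CounterThresholdMember:
    """One member of a counter threshold (illustrative names only)."""
    initial_comparison_level: int
    offset_level: int
    severity: str
    notify_on: bool = True
    comparison_level: int = field(init=False)
    _last_value: int = field(init=False, default=0)

    def __post_init__(self) -> None:
        # On creation the Comparison Level is set to the Initial Comparison Level.
        self.comparison_level = self.initial_comparison_level

    def observe(self, value: int) -> Optional[dict]:
        """Feed a new counter value; return a notification if one is due."""
        crossed = self._last_value < self.comparison_level <= value
        self._last_value = value
        if not (self.notify_on and crossed):
            return None
        notification = {"severity": self.severity, "level": self.comparison_level}
        if self.offset_level != 0:
            # Re-arm the threshold at the next level.
            self.comparison_level += self.offset_level
        return notification

    def counter_reset(self) -> None:
        """The associated counter was reset: restore the initial level."""
        self.comparison_level = self.initial_comparison_level
        self._last_value = 0
```

With an Initial Comparison Level of 10 and an Offset Level of 5, the first notification is produced when the counter reaches 10, after which the Comparison Level becomes 15.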

The characteristics which are defined for each member of a gauge threshold are Notify High and Notify Low, each of which in turn consists of:

– Comparison Level

The gauge level at which the defined notification may be generated if thresholding is activated (see Notify On or Off below).

– Severity

Severity level to be included in the defined notification.

– Notify On or Off

The thresholding mechanism and the generation of defined notifications for the gauge threshold when crossing the Comparison level may be switched on or off.
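As a rough illustration of the Notify High and Notify Low characteristics, the following Python sketch checks a gauge reading against both levels. All names are hypothetical, and the crossing rule (rising to or above the high level, falling to or below the low level) is an assumption made for the example rather than a statement of the X.721 behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NotifyLevel:
    """Comparison Level, Severity and Notify On/Off for one direction."""
    comparison_level: float
    severity: str
    notify_on: bool = True


@dataclass
class GaugeThresholdMember:
    """One member of a gauge threshold (illustrative names only)."""
    notify_high: NotifyLevel
    notify_low: NotifyLevel

    def observe(self, previous: float, current: float) -> Optional[dict]:
        """Return a notification if the gauge crossed an enabled level."""
        high, low = self.notify_high, self.notify_low
        if high.notify_on and previous < high.comparison_level <= current:
            return {"direction": "high", "severity": high.severity}
        if low.notify_on and previous > low.comparison_level >= current:
            return {"direction": "low", "severity": low.severity}
        return None
```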

NOTE: This function creates the thresholding mechanism for the counter or gauge, but the thresholding is only performed if it is activated (see Thresholding Status below).

For a description of the other attributes of the threshold manager, please refer to subclause 5.1.2.4 and Annex A.1.

Get Thresholding: The OS requests the NE to report selected attributes of a threshold manager instance.

Set Thresholding: The OS requests the NE to modify the thresholding characteristics for a threshold manager instance.

It is only possible to modify the threshold characteristics when thresholding has been deactivated by setting the administrative state to "locked".

Remove Thresholding: The OS requests the NE to remove the thresholding mechanism for the identified counter or gauge threshold, and to delete the threshold manager instance.

Activate/Deactivate Thresholding: The OS requests the NE to either activate or deactivate the thresholding mechanism for the identified threshold manager. This is accomplished by setting the administrative state to "unlocked" or "locked".
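The relationship between the administrative state and the Set Thresholding function can be sketched as follows. This is a minimal Python illustration, not the managed object definition: the class, its methods and the choice of exception are assumptions introduced for the example.

```python
class ThresholdManager:
    """Illustrative stand-in for a threshold manager instance."""

    def __init__(self, characteristics: dict, unlocked: bool = False) -> None:
        # Create Thresholding: the instance may start activated or deactivated.
        self.characteristics = dict(characteristics)
        self.administrative_state = "unlocked" if unlocked else "locked"

    def activate(self) -> None:
        self.administrative_state = "unlocked"

    def deactivate(self) -> None:
        self.administrative_state = "locked"

    def get_thresholding(self, *names: str) -> dict:
        """Get Thresholding: report the selected attributes."""
        return {name: self.characteristics[name] for name in names}

    def set_thresholding(self, **changes) -> None:
        """Set Thresholding: only permitted while thresholding is deactivated."""
        if self.administrative_state != "locked":
            raise PermissionError("lock the threshold manager before modifying it")
        self.characteristics.update(changes)
```

A typical modification therefore consists of three steps: deactivate, set, activate.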

6.1.2 Alarm Reporting functions

The alarm reporting functions identify standard mechanisms for the generation of alarm notifications. The functions of alarm reporting fall into four general areas:

– The generation of an alarm notification by a managed object,

– The forwarding of that alarm notification to the OS,

– The storage of the alarm record in the log (optional), and

– The retrieval/deletion of the alarm record from the log (optional).

Report alarm: The NE notifies OS of alarm information upon the occurrence of an event. This is defined in M.3400 [4] and Q.821 [15].

Route alarm report: OS specifies to the NE the destination address(es) for a specified set of alarm reports. This is defined in M.3400 [4] and Q.821 [15].

Request alarm report route: OS requests the NE to send the current assignment of the destination address(es) for a specified set of alarm reports; NE responds with the current assignment of destination address(es). This is defined in M.3400 [4] and Q.821 [15].

Condition alarm reporting: OS instructs the NE to assign Event Forwarding Discriminator attributes as specified by the OS or the OS instructs the NE to create/delete instances of Event Forwarding Discriminators. This is based on M.3400 [4] and Q.821 [15].

Request alarm report control condition: OS requests the NE to send the current assignment of specified Event Forwarding Discriminator attributes; NE responds with the current assignment of specified attributes. This is defined in M.3400 [4] and Q.821 [15].

Allow/Inhibit alarm reporting: OS instructs the NE to allow/inhibit alarm reports to the OS. This is defined in M.3400 [4] and Q.821 [15].

Request alarm report history: OS requests the NE to send specified historical alarm information from a log; NE responds with the specified information. Alternatively, using the mechanism of GSM 12.00 [18] annex B, which is based on the managed object class simpleFileTransferControl, the OS may request the NE to generate a datafile from the specified historical alarm information; the datafile is subsequently transferred to the OS. This is based on definitions in Q.821 [15] and GSM 12.00 [18].

This method is referred to as the "Bulk Transfer of Alarm records" procedure and contains the following steps (for a detailed description, see GSM 12.00 [18] annex B):

– Start file creation basic service: The OS initiates the data file creation by issuing the requestTransferUp action. The resultType field in this action may contain either of the options defined in GSM 12.00 [18] annex B:

– objectSelection: in this case, this parameter identifies a log (or a superior object) to be used as the base object for a scoped and filtered retrieval of alarm records (in the same way as they would be selected with CMISE primitives);

– typeOfFile: the fileType parameter shall be set to the alarmRecords value. Additionally, the optional parameter FileSubType may be used to select alarm records according to predefined criteria such as the severity or the time of occurrence of the alarm (the coding of FileSubType parameter is described in Annex B of the present document).

– The NE creates one or more datafiles containing the requested Alarm Record data. The format of the transferred data shall be the file type "ObjectDataFile";

– Notify file creation basic service: The NE notifies the OS about the datafile(s) with the transferUpReady notification. If the typeOfFile option was used, the fileType field (and optionally fileSubType) shall be set to the same value(s) in this notification as in the requestTransferUp action. If the objectSelection option was used, the alarmRecords value (13) may be used for the fileType field;

– The OS initiates the transfer of the datafile(s) using FTAM services (see GSM 12.01 [19]);

– Complete file transfer basic service: The OS informs the NE about the completion of the file transfer by issuing the transferUpReceived action.
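The order of the steps above can be sketched as a small state machine. This Python example is only an illustration of the sequencing: requestTransferUp, transferUpReady and transferUpReceived are the names used in the procedure, whereas the "ftamTransfer" event label, the Step enumeration and the class itself are hypothetical.

```python
from enum import Enum, auto


class Step(Enum):
    IDLE = auto()
    FILE_REQUESTED = auto()    # OS issued requestTransferUp
    FILE_READY = auto()        # NE sent transferUpReady
    FILE_TRANSFERRED = auto()  # OS fetched the datafile(s) using FTAM
    DONE = auto()              # OS issued transferUpReceived


# event -> (step in which it is allowed, step it leads to)
_TRANSITIONS = {
    "requestTransferUp": (Step.IDLE, Step.FILE_REQUESTED),
    "transferUpReady": (Step.FILE_REQUESTED, Step.FILE_READY),
    "ftamTransfer": (Step.FILE_READY, Step.FILE_TRANSFERRED),
    "transferUpReceived": (Step.FILE_TRANSFERRED, Step.DONE),
}


class BulkAlarmTransfer:
    """Tracks one bulk transfer of alarm records (illustrative only)."""

    def __init__(self) -> None:
        self.step = Step.IDLE

    def handle(self, event: str) -> Step:
        expected, following = _TRANSITIONS[event]
        if self.step is not expected:
            raise RuntimeError(f"{event} is not allowed in step {self.step.name}")
        self.step = following
        return self.step
```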

Delete alarm report history (added in Q.821 [15] and GSM 12.11 compared to M.3400 [4]): OS requests the NE to delete specified historical alarm information from the Log. This is defined in Q.821 [15].

6.1.3 Log control functions

The log control functions are used by the OS to control the operation of the alarm log(s) in the NE.

The retrieval and deletion of information stored in the log(s) is covered by the functions "Request alarm report history" and "Delete alarm report history" (part of the Alarm Reporting functions).

As mentioned in subclause 5.1, the logging of alarm notifications in the NE shall be performed by means of two managed object classes: "Log" and "Alarm Record", defined in ITU-T X.735 [11], X.733 [9] and X.721 [6]. The functionalities of the Log which belong to the Log control functions are:

– Initiation of logging.

– Termination of logging.

– Suspension of logging.

– Resumption of logging.

– Scheduling of logging.

– Modification of logging conditions.

– Retrieval of logging conditions.

The management functions defined for these functionalities are: (M.3400 [4], Q.821 [15]):

Allow/Inhibit logging: OS instructs the NE to allow/inhibit logging of Alarm records.

Condition logging: OS instructs the NE to assign Log attributes as specified by the OS, or to create/delete instances of the Log object class.

Request log condition: OS requests the NE to send the current assignment of specified Log attributes; NE responds with the current assignment of specified attributes.

6.1.4 Alarm summary functions

The alarm summary functions are based on the ones defined in Q.821 [15].

Request current alarm summary: The OS requests the NE to send a current alarm summary; the NE responds with the summary.

Condition current alarm summary: The OS instructs the NE to assign Current Alarm Summary Control attributes as specified by the OS, or to create/delete instances of the Current Alarm Summary Control object class. Note that this function is not present in Q.821 (1993) [15].

Request current alarm summary condition: The OS requests the NE to send the current assignment of specified Current Alarm Summary Control attributes; NE responds with the current assignment of specified attributes. Note that this function is not present in Q.821 (1993) [15].

Report current alarm summary: The NE provides the OS (based on a pre-defined schedule) with a current alarm summary report. This function is optional.

Route current alarm summary: The OS specifies to the NE the destination address(es) for a specified set of scheduled alarm summary reports. This function is optional.

Request current alarm summary route: The OS requests the NE to send the current assignment of the destination address(es) for a specified set of scheduled alarm summaries; the NE responds with the current assignment of destination address(es). This function is optional.

Schedule current alarm summary: The OS specifies a schedule for the NE to establish for the reporting of current alarm summaries. The schedule information specifies what should be reported as well as when it should be reported. This function is optional.

Request current alarm summary schedule: The OS requests the NE to send the current schedule information for current alarm summary reporting. The NE responds with the schedule information. This function is optional.

Allow/inhibit current alarm summary: The OS instructs the NE to allow/inhibit the reporting of the scheduled current alarm summaries. This function is optional.

6.1.5 Alarm surveillance related basic services

This subclause describes the mapping between the management functions, managed objects and the management services (called ‘basic services’ in the present document, in contrast to Q.821 [15]) needed to support the management functions defined in subclauses 6.1.2 – 6.1.4. The mapping of the basic services to the supporting CMIS services is also presented.

The mapping of the following basic services to the confirmed or non-confirmed mode of the supporting CMIS services, except where specified, is a local implementation issue and is not specified in the present document.

The term ‘basic service’ is used in the present document because, in contrast to Q.821 [15], this service component is based on both the management services as specified in Q.821 [15], as well as the management services defined in the present document, e.g. for SFTC (see subclause 6.1.2).

For the service definitions (except for the SFTC basic services, Condition Current Alarm Summary, Request Current Alarm Summary Condition and Threshold Management services), see Q.821 [15].

Table 1: Alarm surveillance functions, managed objects and basic services

| Function(s) | Managed Object(s) | Basic Service(s) | CMIS service(s) |
|---|---|---|---|
| Report alarm | EFD (see NOTE 1) | Alarm reporting (X.733) | M-EVENT-REPORT |
| Route alarm report | EFD | Set EFD (Q.821) | M-SET |
| Request alarm report route | EFD | Get EFD (Q.821) | M-GET |
| Condition alarm reporting | EFD | Set EFD (Q.821) | M-SET |
| | | Initiate alarm reporting (Q.821) | M-CREATE |
| | | Terminate alarm reporting (Q.821) | M-DELETE |
| Request alarm report control condition | EFD | Get EFD (Q.821) | M-GET |
| Allow/inhibit alarm reporting | EFD | Resume/suspend alarm reporting (Q.821) | M-SET |
| Request alarm report history | (Log, see NOTE 2), Alarm Record | Alarm report retrieving (Q.821) | M-GET |
| | SFTC, (Log, Alarm Record) (see NOTE 3) | Start file creation (GSM 12.00, 12.11) | M-ACTION |
| | | Notify file creation (GSM 12.00, 12.11) | M-EVENT-REPORT |
| | | Complete file transfer (GSM 12.00, 12.11) | M-ACTION |
| Delete alarm report history | (Log, see NOTE 2), Alarm Record | Alarm report deleting (Q.821) | M-DELETE |
| Allow/inhibit logging | Log | Resume/suspend logging (Q.821) | M-SET |
| Condition logging | Log | Set Log (Q.821) | M-SET |
| | | Initiate log (Q.821) | M-CREATE |
| | | Terminate log (Q.821) | M-DELETE |
| Request log condition | Log | Get Log (Q.821) | M-GET |
| Request Current Alarm Summary | CASC | Retrieve current alarm summary (Q.821) | M-ACTION |
| Condition Current Alarm Summary (see NOTE 4) | CASC | Set current alarm summary control (Q.821) | M-SET |
| | | Initiate current alarm summary control (Q.821) | M-CREATE |
| | | Terminate current alarm summary control (Q.821) | M-DELETE |
| Request Current Alarm Summary Condition (see NOTE 4) | CASC | Get current alarm summary control (Q.821) | M-GET |
| Report Current Alarm Summary | CASC, (MOS) (see NOTE 5) | Current alarm summary reporting (Q.821) | M-EVENT-REPORT |
| Route Current Alarm Summary | MOS (see NOTE 6) | Terminate MOS (Q.821) | M-DELETE |
| | | Initiate MOS (Q.821) | M-CREATE |
| Request Current Alarm Summary Route | MOS (see NOTE 6) | Get MOS (Q.821) | M-GET |
| Schedule Current Alarm Summary | MOS | Set MOS (Q.821) | M-SET |
| | | Initiate MOS (Q.821) | M-CREATE |
| | | Terminate MOS (Q.821) | M-DELETE |
| Request Current Alarm Summary Schedule | MOS | Get MOS (Q.821) | M-GET |
| Allow/inhibit Current Alarm Summary | MOS | Resume/suspend MOS (Q.821) | M-SET |
| Create Thresholding | thresholdManager | Create thresholdManager (GSM 12.11) | M-CREATE |
| Get Thresholding | thresholdManager | Get thresholdManager (GSM 12.11) | M-GET |
| Set Thresholding | thresholdManager | Set thresholdManager (GSM 12.11) | M-SET |
| Remove Thresholding | thresholdManager | Delete thresholdManager (GSM 12.11) | M-DELETE |
| Activate/Deactivate Thresholding | thresholdManager | Set thresholdManager adm.state (GSM 12.11) | M-SET |

NOTE 1: Here EFD is the MIS-User of the Alarm Reporting basic service (M-EVENT-REPORT service), not the target of the operation as for the other services in this table.

NOTE 2: Log may be used as the base object class and instance for scoped and filtered operations. The scope of these operations shall not include the base object.

NOTE 3: Log and Alarm record objects are addressed indirectly through the SFTC when objectSelection option is used.

NOTE 4: This function is added compared to Q.821 (1993). However, ITU-T SG4 has decided to introduce this function in the next version of Q.821.

NOTE 5: CASC is here the source of the emitted report. MOS is indirectly related to reporting by triggering the report generation according to the schedule.

NOTE 6: According to the Q.821 model the destination of the scheduled Current Alarm Summary reports is specified by the MOS read-only attribute destinationAddress. Thus, these reports are not routed via the EFD, and in order to change the destination address, an instance of MOS has to be re-created.
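For readers who prefer to work with Table 1 programmatically, a few of its rows can be held in a simple lookup structure. The Python sketch below copies only an excerpt of the table; the variable and function names are hypothetical and nothing here is part of the specification.

```python
from typing import Dict, List, Tuple

# function -> list of (basic service, CMIS service), copied from Table 1
TABLE_1_EXCERPT: Dict[str, List[Tuple[str, str]]] = {
    "Report alarm": [("Alarm reporting", "M-EVENT-REPORT")],
    "Route alarm report": [("Set EFD", "M-SET")],
    "Condition alarm reporting": [
        ("Set EFD", "M-SET"),
        ("Initiate alarm reporting", "M-CREATE"),
        ("Terminate alarm reporting", "M-DELETE"),
    ],
    "Allow/inhibit logging": [("Resume/suspend logging", "M-SET")],
    "Request Current Alarm Summary": [("Retrieve current alarm summary", "M-ACTION")],
    "Create Thresholding": [("Create thresholdManager", "M-CREATE")],
}


def cmis_services(function: str) -> List[str]:
    """CMIS services supporting one management function, in table order."""
    return [cmis for _, cmis in TABLE_1_EXCERPT[function]]


def functions_using(cmis: str) -> List[str]:
    """Management functions (from the excerpt) that rely on a CMIS service."""
    return sorted(
        function
        for function, rows in TABLE_1_EXCERPT.items()
        if any(service == cmis for _, service in rows)
    )
```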

6.2 Fault localisation functions

Fault localisation is accomplished through the analysis of the information contained in alarm notifications and/or test results. Therefore functions which are part of alarm surveillance and test service components may be used for fault localisation.

6.2.1 Alarm report function

This function supports the requirements for the agent to report and for the manager to receive information that may be used to localise a fault to a LRU. See subclause 6.1.2 for complete information.

For alarm notifications, the first piece of localisation information is provided by the identification of the object instance reporting the alarm. Following this, the "Probable Cause" and, optionally, "Specific Problem" values will provide ITU-T standard or GSM or manufacturer specific information that will help to localise the fault to a specific replaceable/repairable unit.

6.2.2 Test management functions

Tests from all the categories presented (see subclause 5.4.3) may be used for fault localisation purposes depending on the unit suspected to be faulty. The following functions support testing for fault localisation:

– Controlled Test Request function (see subclause 6.4.1.1);

– Uncontrolled Test Request function (see subclause 6.4.1.2);

– Resume/suspend Test function (see subclause 6.4.1.3);

– Terminate Test function (see subclause 6.4.1.4);

– Test Result function (see subclause 6.4.1.5).

6.3 Fault correction functions

The fault correction functions identify the following mechanisms for the management of fault correction:

Add Redundancy Relationship: The NE is requested to create the redundancy relationship. The NE responds with an acknowledgement of the request.

Remove Redundancy Relationship: The NE is requested to remove the redundancy relationship. The NE responds with an acknowledgement of the request.

Change Over: The NE is requested to initiate the action that results in the secondary resource taking over the primary role as defined by the redundancy relationship. The NE responds with an acknowledgement of the request.

Change Back: The NE is requested to initiate the action that results in the restoration of the resources into their original roles as defined by the redundancy relationship. The NE responds with an acknowledgement of the request.

Request Redundancy Relationship Condition: The OS requests the NE to send the current condition of the redundancy relationship. The NE responds with the current condition of the redundancy relationship.

Condition Redundancy Relationship: The NE is requested to assign the characteristics of the redundancy relationship as specified by the OS, or to initiate or terminate the redundancy relationship.

Report Redundancy Relationship Condition: The NE informs the OS of the current characteristics of the redundancy relationship; the report may be a result of an autonomous NE modification of the redundancy relationship.

6.3.1 OS controlled fault correction

The OS may control redundancy to execute a Change Over or a Change Back. This "on demand" control by the OS may be performed in either of the following ways:

– using state management services to lock or unlock one or more objects in the redundancy.

ITU-T Recommendation X.731 specifies the lock and unlock state management functions referred to in this section.

The use of the lock and unlock state management functions to trigger a Change Over or a Change Back between two managed objects in a defined redundancy relationship is performed in the following manner. Assume that managed object "A" is defined as a primary object in the redundancy relationship, and managed object "B" is defined as the secondary object in the redundancy relationship. In order to use state management functions to effect a change over, the OS first locks the primary resource (managed object "A"), preventing it from performing its functions. The NE then proceeds to use the secondary resource (managed object "B") to take over the functions of the primary resource (managed object "A"). The two resources remain in these roles until their original roles are re‑instated as described below.

State management functions may subsequently be used to revert the primary and secondary resources to their original roles. In order to use state management functions to effect a change back, the OS first unlocks the primary resource (managed object "A"), enabling it to perform its functions. Secondly, the OS locks the secondary resource (managed object "B"), preventing it from performing its functions. The NE then proceeds to use the primary resource (managed object "A") to take over the functions of the secondary resource (managed object "B"), so that the primary resource again performs the functions it originally performed. The two resources are then back in their original roles.

– explicit use of the specific actions for the protectionGroup MOC defined in G.774.03 (see also table 2).

The Change Over and Change Back fault correction functions identified in sub-clause 6.3 may be requested by the OS.
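The lock/unlock sequence described above for managed objects "A" and "B" can be sketched as follows. This Python example is an illustration under stated assumptions: the class and its attributes are hypothetical, and the NE is assumed to switch roles only when the resource currently performing the functions is locked, not as soon as the other resource is unlocked.

```python
from typing import Dict, Optional


class RedundantPair:
    """Primary "A" and secondary "B" in a redundancy relationship."""

    def __init__(self) -> None:
        self.locked: Dict[str, bool] = {"A": False, "B": False}
        self.active: Optional[str] = "A"  # the primary performs the functions

    def lock(self, name: str) -> None:
        """OS locks a resource; the NE moves the functions if needed."""
        self.locked[name] = True
        if self.active == name:
            other = "B" if name == "A" else "A"
            self.active = None if self.locked[other] else other

    def unlock(self, name: str) -> None:
        """OS unlocks a resource; roles change only if nothing is active."""
        self.locked[name] = False
        if self.active is None:
            self.active = name


pair = RedundantPair()
pair.lock("A")    # change over: "B" takes over the functions of "A"
pair.unlock("A")  # first step of change back: "A" may perform functions again
pair.lock("B")    # second step of change back: "A" resumes its original role
```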

6.3.2 Autonomous fault correction

The Change Over, Change Back and Condition Redundancy Relationship fault correction functions identified in sub-clause 6.3 may be performed autonomously by the NE without OS intervention.

The means by which the NE internally gains access to these fault correction functions is outside the scope of the present document, but the effects of using these correction functions by the agent shall be notified to the OS (refer to Table 2).

6.3.3 Fault correction related basic services

This subclause describes the mapping between the management functions, managed objects and the ‘basic services’ needed to support the management functions defined in subclauses 6.3, 6.3.1 and 6.3.2. The mapping of the basic services to the supporting CMIS services is also presented.

The mapping of the following basic services to the confirmed or non-confirmed mode of the supporting CMIS services, except where specified, is a local implementation issue and is not specified in the present document.

Table 2: Fault correction functions, managed objects and basic services

| Function(s) | Managed Object(s) | Basic Service(s) | CMIS service(s) |
|---|---|---|---|
| Add Redundancy Relationship | protectionGroup | Create protectionGroup (G.774.03) | M-CREATE |
| | protectionUnit | Create protectionUnit (G.774.03) | M-CREATE |
| Remove Redundancy Relationship | protectionGroup | Delete protectionGroup (G.774.03) | M-DELETE |
| | protectionUnit | Delete protectionUnit (G.774.03) | M-DELETE |
| Change Over | protectionGroup | Invoke Protection (G.774.03) | M-ACTION |
| | MO* | Lock/Unlock MO* (X.731) | M-SET |
| | protectionGroup | Protection Switch Reporting (G.774.03) | M-EVENT-REPORT |
| | protectionUnit | Attribute Value Change protectionUnit (G.774.03) | M-EVENT-REPORT |
| Change Back | protectionGroup | Release Protection (G.774.03) | M-ACTION |
| | MO* | Lock/Unlock MO* (X.731) | M-SET |
| | protectionGroup | Protection Switch Reporting (G.774.03) | M-EVENT-REPORT |
| | protectionUnit | Attribute Value Change protectionUnit (G.774.03) | M-EVENT-REPORT |
| Request Redundancy Relationship Condition | protectionGroup | Get protectionGroup (G.774.03) | M-GET |
| | protectionUnit | Get protectionUnit (G.774.03) | M-GET |
| Condition Redundancy Relationship | protectionGroup | Set protectionGroup (G.774.03) | M-SET |
| | protectionUnit | Create protectionUnit (G.774.03) | M-CREATE |
| | protectionUnit | Delete protectionUnit (G.774.03) | M-DELETE |
| Report Redundancy Relationship Condition | protectionGroup | State Change protectionGroup (G.774.03) | M-EVENT-REPORT |
| | protectionGroup | Attribute Value Change protectionGroup (G.774.03) | M-EVENT-REPORT |
| | protectionGroup | Protection Switch Reporting (G.774.03) | M-EVENT-REPORT |
| | protectionUnit | Attribute Value Change protectionUnit (G.774.03) | M-EVENT-REPORT |

NOTE: * MO means a managed object pointed to by a protectionUnit managed object instance

6.4 Test management functions

The management functions for the testing of the NE are closely based on ITU-T Recommendations X.745 [13] and X.737 [12].

ITU-T X.745 includes:

– the model on which these management functions are based (the same model described in the previous clause);

– the definitions used in this context;

– a list of management functions (also named service definitions);

– definitions of two functional units;

– the protocol services that are necessary for these functions;

– a definition of a first nucleus of Information Model (GDMO and ASN.1 syntax) associated with this model.

The testing management functions consist of actions and notifications provided to manage the tests (how to start, stop, suspend, etc. the supported tests). In order to fully define these management functions, it is necessary to specify both the characteristics of the tests supported by the NE, and how they can be managed from the OS.

6.4.1 Functions

This subclause defines the management functions required for test management. Each of the management functions can be applied to the tests in one or more of the test categories defined in subclause 5.4.3 (Test Categories).

6.4.1.1 Controlled Test Request function

This function allows the test conductor (on the managing system) to send a request to the test performer (in the managed NE), to execute a controlled test.

The test response does not include the test results. These may be returned later using another function, or may be put in the TO’s test result attribute so that they can be retrieved by the managing system.

6.4.1.2 Uncontrolled Test Request function

Like the previous function, this one allows the test conductor to send a request to the test performer to execute an uncontrolled test.

The test response shall include the test results.

6.4.1.3 Resume/suspend Test function

This function allows the test conductor to send a request to the test performer to suspend or resume a single test or all the tests of a session. It is applicable only to controlled tests.

The test response shall include the state of the affected TOs, as it is immediately before the suspension or immediately after the resumption.

6.4.1.4 Terminate Test function

This function allows the test conductor to send a request to the test performer to terminate a single test or all the tests of a session. It is applicable only to controlled tests.

6.4.1.5 Test Result function

This function allows the managed NE to return the test results of controlled tests to the managing OS. The test results are produced as unsolicited events; however they shall include the test invocation identifier which relates them to the original test request.
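Because the results of controlled tests arrive as unsolicited events, the managing system has to match each one to its original request by means of the test invocation identifier. The Python sketch below shows one way to keep such a record. The class and method names are hypothetical, identifiers are modelled as plain strings, and the sketch assumes for simplicity that a single result closes a test.

```python
from typing import Dict, Optional


class ConductorLedger:
    """Bookkeeping on the managing system for controlled tests."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}  # invocation id -> description
        self.results: Dict[str, str] = {}   # invocation id -> outcome

    def controlled_test_accepted(self, invocation_id: str, description: str) -> None:
        """Remember a controlled test whose results are still outstanding."""
        self._pending[invocation_id] = description

    def test_result_received(self, invocation_id: str, outcome: str) -> Optional[str]:
        """Match an unsolicited result to its request; None if unknown."""
        description = self._pending.pop(invocation_id, None)
        if description is None:
            return None
        self.results[invocation_id] = outcome
        return description
```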

6.4.1.6 Scheduling Conflict Report function

This function allows the test performer to report a test schedule conflict to the test conductor. This function is provided only when the NE provides the scheduling capability (which is optional).

6.4.2 Test management related basic services

This subclause summarises in a table the mapping between the management functions defined in subclause 6.4.1, the managed object classes and the OSI management services (called ‘basic services’ in the present document), needed to support the management functions. The mapping of the basic services to the supporting CMIS services is presented as well. Note that for the testing, the functions to achieve the service are equivalent to the basic services. For the testing service component the OSI management functions are sufficient and there is no need to define new specific telecommunications management functions.

Table 3: Test management functions, managed objects and basic services

| Function(s) | Managed Object(s) | Basic Service(s) | CMIS service(s) |
|---|---|---|---|
| Controlled Test Request | MO(TARR)*, MORT, AO**, TO | Controlled Test Request | M-ACTION |
| Uncontrolled Test Request | MO(TARR)*, MORT, AO** | Uncontrolled Test Request | M-ACTION |
| Suspend/Resume Test | MO(TARR)*, MORT, AO**, TO | Suspend/Resume Test | M-ACTION |
| Terminate Test | MO(TARR)*, MORT, AO**, TO | Terminate Test | M-ACTION |
| Test Result | MO(TARR)*, MORT, AO**, TO | Test Result | M-EVENT-REPORT |
| Scheduling Conflict Report | MO(TARR)*, MORT, AO**, TO | Scheduling Conflict Report | M-EVENT-REPORT |

* MO(TARR) means an MO of the NE having the TARR functionality.

** AOs are always optional.

Test management may also use services such as PT-GET and PT-SET to retrieve and modify the attributes of the MOs involved in the tests, and PT-DELETE to abort controlled tests. Refer to X.745 [13] for further details.