HP Service Manager SLA – Defining SLA’s (Part Two)
September 28, 2010
Defining Service Level Agreements in HP Service Manager – Part Two
This is a two-part post. The first part can be found here.
The four screenshots above show the “plumbing” that must be set up for the SLA module to work with the Incident module (or with any other module you want SLA enabled for). Let’s now look at how this works in practice on an actual incident ticket once that plumbing is in place:
Let’s assume that we open a Priority 1 incident ticket:
Because we opened a Priority 1 incident ticket in the screenshot above, and because we coded the SLA module to work with the Incident module, the incident has been opened with the base SLA that we set up (named “Base Monitoring SLA for IT services”), as the screenshot below shows. Since we coded the “P1 Incident” SLO to become active whenever a Priority 1 incident is opened, that SLO (taken from the base SLA found on every ticket) is listed and is actively running, as indicated by its Status of “Running” below. Now note the “Expiration” time below. This is the calculated expiration time of this SLO on this incident ticket. It is based on the 02:00:00 time interval we defined on the SLO in the second screenshot above and on the calendar we attached to that SLO, which controls when the clock is active. This “Next Expiration” time is also the time by which the ticket must go from “Open” to “Closed”, because that is how we defined this SLO on the backend:
FURTHER NOTES ABOUT THE SCREENSHOT ABOVE:
- To wrap this up, we can code similar SLOs that are based on fields other than priority. We could key off the severity field, the category field (or both), the business service field, or other fields, and the SLOs whose criteria match would then be the active ones.
- Note the “Upcoming Alerts” section in the screenshot above as well. These are the same alerts that we coded into the SLO record to be sent out at various points in the lifecycle of the SLO. When those alerts fire, we can also code other things to happen, such as setting values in certain fields on the incident ticket.
- Please also note that if the defined time limit for this SLO is not met (meaning the ticket has not gone to “Closed” by the expiration time calculated for the SLO), the Status of that SLO (currently “Running”) changes to “Breached”, and a checkbox on the incident ticket becomes visible and is checked. This is key for reporting, because reports can be written to show which incidents are in a breached status.
- Lastly, we can have more than one SLO running on an incident ticket (or on tickets from any module you want SLA running with). We would just have to code them to do this. For instance, one SLO could time the ticket from a status of “Open” to “Work In Progress”. Once that status is reached, another SLO could go into effect to time the ticket from “Work In Progress” to “Closed”. Many other combinations like this can be arranged as well.
Finally, if we were to close the incident from our example:
We would then be able to look at the SLA section on that same incident ticket and see that the SLO has been “Achieved” (again, this would say “Breached” if the ticket had not been closed within the time interval allotted by this SLO):
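The status behaviour described in the notes above can be summarised in a few lines of Python. This is only a sketch of the logic and not Service Manager code: the function name slo_status and its parameters are hypothetical, and treating a ticket that reaches its final state exactly at the expiration time as “Achieved” is an assumption.

```python
from datetime import datetime
from typing import Optional


def slo_status(expiration: datetime,
               now: datetime,
               reached_final_at: Optional[datetime] = None) -> str:
    """Return the status an SLO would show on a ticket (illustrative only).

    expiration       -- calculated expiration time of the SLO
    now              -- the moment at which the status is evaluated
    reached_final_at -- when the ticket reached the SLO's final state
                        (for example "Closed"), or None if it has not yet
    """
    if reached_final_at is not None:
        # The ticket reached the final state: compare that moment to the deadline.
        return "Achieved" if reached_final_at <= expiration else "Breached"
    # Still being worked: the SLO runs until the deadline passes.
    return "Running" if now <= expiration else "Breached"


deadline = datetime(2010, 9, 28, 11, 0)
print(slo_status(deadline, now=datetime(2010, 9, 28, 10, 0)))
print(slo_status(deadline, now=datetime(2010, 9, 28, 12, 0)))
print(slo_status(deadline, now=datetime(2010, 9, 28, 12, 0),
                 reached_final_at=datetime(2010, 9, 28, 10, 30)))
```

A report that looks for breached incidents is effectively filtering on the value this function returns.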
REASONS FOR USING THE SLA MODULE
Here are a few reasons for using the SLA module:
- As the example above shows, the base SLA on the ticket is just the holder from which the correct SLOs are pulled, based on the criteria those SLOs must meet to be active on the ticket. This means that the SLOs, not the SLA, do the timing and measuring in SM 9.20. The SLOs are the mechanisms that time and measure the limit defined for each ticket to reach a status of “Closed” (or whatever status we define at the SLO level). As a result, users have a clear-cut area on the ticket where they can see how much time they have to work the ticket and exactly when it is expected to be completed.
- SLOs on these tickets can be cleanly set up to send alerts to key users, managers, etc. at any point in the lifecycle of the SLO time limit allotted on a given ticket. This allows key people in an organization to keep tabs on all open tickets and on whether they are being completed in a timely manner.
- SLOs give key users in the organization a clean view of which tickets are in a “Breached” status, so they can easily tell which tickets are being worked in a timely manner and which are not.
- Finally (this merges into the next topic), there is a central area inside SM 9.20 where all SLO data is held. This gives administrators and developers the ability to report accurately and easily on the SLA statistics gathered for every ticket in the system that is set up with the SLA module.
SLA Reporting
There is a table in Service Manager 9.20 named “sloresponse”. This table holds a record of every active and inactive SLO attached to a ticket in the system. The screenshot below shows the SLO Response record from our SLO example in Section 1. Note first the “Related Table Name” and “Related ID” fields. Their values in the screenshot below are “probsummary” and “IM10149” respectively. These fields identify the ticket that the SLO is tied to. In this case, the SLO is tied to an incident ticket (incident tickets are stored on the backend in the “probsummary” table in SM 9.20) with the unique number “IM10149”. This allows a report writer to join the “sloresponse” table to the “probsummary” table. Also pay attention to the “Active” checkbox circled in the screenshot below, which tells whether the SLO is active on the ticket. Finally, note the “Current Status” field. This tells what is happening or has happened with the SLO: whether it was “Achieved”, is still “Running”, or was “Breached”. Other fields below, such as “Expiration”, “Start Time” and “End Time”, can also be used in an effective SLA report. In this way, users have a central area where they can retrieve very useful information about the SLOs in their system.
NOTES ABOUT THE ABOVE:
- Please note that a report writer would most likely connect to these tables on the backend in SQL via an ODBC driver. The “probsummary” table in SM 9.20 is mapped to the “PROBSUMMARYMX” table(s) on the backend in SQL, where “X” is a number. Likewise, the “sloresponse” table is mapped to the “SLORESPONSEMX” table(s). These tables would be joined on the unique ticket number, as explained above, to pull back information on the tickets and the SLOs tied to them.
- Please also note that, due to performance concerns, we at Stratacom have often seen an external reporting database set up to house mirrored information from the Prod database. This gives a broad set of reporting users (if that is the case for your implementation) the ability to run multiple reports against a non-live copy of Production. Direct reporting connections to the Prod database do not present performance issues unless multiple users run multiple reports throughout the work day.
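To give a feel for the join described in these notes, here is a small Python sketch that builds two stand-in tables in an in-memory SQLite database and lists the breached incidents. Everything beyond the two table names is an assumption made for illustration: the column names (NUMBER, PRIORITY, RELATED_TABLE, RELATED_ID, SLO_NAME, CURRENT_STATUS), the “M1” suffix and the second ticket IM10150 are hypothetical, and a real report would run the query over ODBC against the actual backend columns, which should be checked in your own database.

```python
import sqlite3

# In-memory stand-in for the reporting database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROBSUMMARYM1 (NUMBER TEXT PRIMARY KEY, PRIORITY INTEGER);
CREATE TABLE SLORESPONSEM1 (
    RELATED_TABLE  TEXT,   -- e.g. 'probsummary' for incident tickets
    RELATED_ID     TEXT,   -- ticket number the SLO is tied to
    SLO_NAME       TEXT,
    CURRENT_STATUS TEXT    -- 'Running', 'Achieved' or 'Breached'
);
INSERT INTO PROBSUMMARYM1 VALUES ('IM10149', 1), ('IM10150', 1);
INSERT INTO SLORESPONSEM1 VALUES
    ('probsummary', 'IM10149', 'P1 Incident', 'Achieved'),
    ('probsummary', 'IM10150', 'P1 Incident', 'Breached');
""")

# Join the SLO records to their incidents on the ticket number
# and keep only the SLOs that were breached.
BREACHED_INCIDENTS = """
SELECT p.NUMBER, p.PRIORITY, s.SLO_NAME, s.CURRENT_STATUS
FROM PROBSUMMARYM1 AS p
JOIN SLORESPONSEM1 AS s
  ON s.RELATED_ID = p.NUMBER
 AND s.RELATED_TABLE = 'probsummary'
WHERE s.CURRENT_STATUS = 'Breached'
ORDER BY p.NUMBER
"""

breached = conn.execute(BREACHED_INCIDENTS).fetchall()
print(breached)
```

Filtering on the related table name as well as the ticket number matters once SLOs are attached to tickets from more than one module, because the same response table then holds rows for all of them.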
If you would like a free PDF version of this document for distribution within your organization please contact us.
HP Service Manager SLA – Defining SLA’s (Part One)
September 28, 2010
Defining Service Level Agreements in HP Service Manager
This document is intended to help guide you toward defining your Service Level Agreement Module Setup in Service Manager.
1. DEFINING SLA (IN SM 9.20); DEFINING SLO (IN SM 9.20).
In Service Manager, an SLA (Service Level Agreement) is the “holder” of various SLOs (Service Level Objectives) that we can define based on many different sets of criteria. The way we at Stratacom have always set up the SLA module is to have one default SLA on every ticket of every module that SLA is being implemented with. The ticket then picks out the correct SLO(s) to use from that base SLA based on the information found on the ticket. For example, if your organization wanted to implement the SLA module inside the Incident module, we would code the SLA module so that every ticket opened in the Incident module gets the same base SLA. This gives each ticket access to the full listing of that base SLA’s SLOs.
So, defining SLO (Service Level Objective): an SLO is the mechanism we use on an incident ticket to measure the time limit allotted to “work” that ticket based on items in the ticket such as severity, category, business service, etc. Basically, this is the time allotted to move the ticket from “Open” to “Closed”, or from “Open” to any other incident status, based on the ticket criteria. We can code any number of SLOs based on any criteria on any ticket in the system (not just incident tickets, but tickets from any module that we set up SLA for). See the rest of this document for a further outline of this principle.
Example of the Base SLA setup in Service Manager:
This is an example of a base SLA (named “Base Monitoring SLA for IT services” below) that will be found on every incident ticket (or on every ticket from any module that SLA is set up for). Notice that this SLA has all of the SLOs we would want for any incident ticket defined (although we could also define SLOs here for tickets from modules other than incidents). Pay particular attention to the SLOs circled in red at the bottom of the screenshot:
Now, if we double click the “P1 Incident” SLO record on the SLA in the screenshot above, we see the SLO record shown in the screenshot below. Things to note from this screenshot are:
- The “Condition” field: this is where we define the criteria an incident ticket must meet for this SLO to become active on it. We can code this field to key off a priority, a severity, a category, a business service, or any other criteria on an incident ticket (or on a ticket from any module you set up SLA for). The condition in the example below simply says: make this SLO active on an incident when that incident is opened with a priority of 1.
- The “Service Area” field: its value below is “Incidents”. This is where we define what type of ticket this SLO will be active for. Based on the contents of the “Condition” and “Service Area” fields, this SLO goes into effect when a Priority 1 incident ticket is opened.
- The “Initial State”, “Final State” and “Duration” fields: these hold the values “Open”, “Closed” and “02:00:00” respectively. This means that when a Priority 1 incident ticket is opened, this SLO goes into effect on that ticket and a 2 hour (02:00:00) time limit is given for the ticket to go from a status of Open to Closed. In this way, we can code many different SLOs to be active in different situations on incident tickets (or on tickets from any module, for that matter).
- The “Alerts” array: this is where we can code alerts to go out to certain users, managers, etc. at different percentages of the total time limit given for that SLO.
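One way to picture how the “Condition” and “Service Area” fields select SLOs from the base SLA is the Python sketch below. It models the idea only and is not how Service Manager stores or evaluates these records: the SLO class, the active_slos function and the use of a Python function in place of the Condition expression are all hypothetical, and the “P2 Incident” entry with its 8 hour duration is invented sample data.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Callable, Dict, List


@dataclass
class SLO:
    name: str
    service_area: str                    # type of ticket the SLO applies to
    condition: Callable[[Dict], bool]    # stands in for the Condition field
    initial_state: str
    final_state: str
    duration: timedelta


def active_slos(ticket: Dict, service_area: str, slos: List[SLO]) -> List[SLO]:
    """Pick the SLOs from the base SLA whose criteria the ticket meets."""
    return [s for s in slos
            if s.service_area == service_area and s.condition(ticket)]


# The base SLA is just the holder of all SLOs defined for the module.
base_sla = [
    SLO("P1 Incident", "Incidents", lambda t: t.get("priority") == 1,
        "Open", "Closed", timedelta(hours=2)),
    # Hypothetical second SLO, added only to show that non-matching SLOs are skipped.
    SLO("P2 Incident", "Incidents", lambda t: t.get("priority") == 2,
        "Open", "Closed", timedelta(hours=8)),
]

ticket = {"number": "IM10149", "priority": 1, "status": "Open"}
matched = active_slos(ticket, "Incidents", base_sla)
print([s.name for s in matched])
```

Keying off severity, category or business service instead would only change the condition attached to each SLO, not the selection logic.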
Below, we have coded alerts to be sent out to various users when 50% of the time limit has elapsed on the incident and when the SLO has been breached (that is, when the time limit has run out):
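To make the percentages concrete, the short Python sketch below works out when alerts at 50% and 100% of a 2 hour limit would fall due. It is an illustration rather than Service Manager behaviour: the function name alert_times is hypothetical, the start time is sample data, and the calculation assumes the clock runs continuously, ignoring any calendar attached to the SLO.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable


def alert_times(start: datetime,
                duration: timedelta,
                percentages: Iterable[int]) -> Dict[int, datetime]:
    """Map each alert percentage to the moment that share of the limit has elapsed."""
    return {p: start + duration * (p / 100) for p in percentages}


opened = datetime(2010, 9, 28, 9, 0)   # sample open time for the incident
schedule = alert_times(opened, timedelta(hours=2), [50, 100])
for pct, when in schedule.items():
    print(f"{pct}% alert at {when:%H:%M}")
```

With these sample values the 50% alert falls one hour after the ticket is opened and the 100% alert coincides with the breach.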
Now, look at the “Schedule” field on the SLO. Here we can tie what is called a “Calendar” record to the SLO. The Calendar record indicates when the clock should be timing our time interval and when it should be inactive. In the screenshot below, the value in the “Schedule” field is “day”:
Clicking the “Lookglass” button to the right of the Schedule field (in the screenshot above) takes us to the “day” calendar record. This calendar record is what tells the clock on the incident when to run and when to pause for the time interval we are measuring for that SLO:
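As a rough illustration of how a calendar changes the expiration time, the Python sketch below counts a duration only during working hours. It is not Service Manager code: the working hours (Monday to Friday, 08:00 to 17:00) and the function name expiration are assumptions made for this example, since the actual contents of the “day” calendar are only visible in the screenshot.

```python
from datetime import datetime, time, timedelta

# Assumed working hours for the example; the real "day" calendar may differ.
WORK_START = time(8, 0)
WORK_END = time(17, 0)
WORK_DAYS = {0, 1, 2, 3, 4}   # Monday=0 ... Friday=4


def expiration(start: datetime, duration: timedelta) -> datetime:
    """Return when `duration` of working time has elapsed after `start`."""
    remaining = duration
    current = start
    while True:
        day_start = datetime.combine(current.date(), WORK_START)
        day_end = datetime.combine(current.date(), WORK_END)
        if current.weekday() in WORK_DAYS and current < day_end:
            # The clock only runs inside the working window.
            current = max(current, day_start)
            available = day_end - current
            if remaining <= available:
                return current + remaining
            remaining -= available
        # Clock is paused: jump to the start of the next calendar day.
        current = datetime.combine(current.date() + timedelta(days=1), WORK_START)


# A 2 hour SLO opened at 16:00 on Tuesday 28 September 2010.
print(expiration(datetime(2010, 9, 28, 16, 0), timedelta(hours=2)))
```

With these assumed hours, one hour is counted on Tuesday afternoon and the remaining hour on Wednesday morning, so the SLO expires at 09:00 on Wednesday rather than 18:00 on Tuesday.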
This article is continued in Service Level Agreements Overview Part Two.








