Transforming Remote Infrastructure Management
By Sanjay S. Savla
Oct 12, 2007
Remote infrastructure management takes many forms. It can be a pure Network Operations Center (NOC), a catch/dispatch monitoring capability, or a global management center. Each carries different connotations, with the last offering the most comprehensive set of services. It is this form that ultimately turns classic remote infrastructure management into a capability that drives business value.
In a global management center (or global visibility center) environment, remote monitoring is combined with the actual management of the objects, with correlation built into the capability. An object can be defined as a server, router, database, application, etc.
Each object is 'monitored' at various levels of detail. For instance, at a base level a server would have its operating system, CPU, disk, and memory monitored. In addition, separate agents can be installed to monitor specific 'applications', such as databases and end-user applications like SAP, PeopleSoft, etc.
The agents provide a deeper monitoring capability. As a result, issues are captured at a more granular level of detail, which leads to quicker resolution and increased availability of the application.
The ability to capture more events and more data, and to learn more about the server and applications, begins the journey toward business service management rather than a purely device- or component-level service.
However, monitoring an object at any level is just the first step in ensuring maximum availability. Managing the object is a natural progression from monitoring. Three paths exist in managing an object: automated correction, systemic recognition and escalation, and human intervention.
With each new event, an effort should be made to understand the event to the point where an automated recovery can be executed, which reduces downtime and the need for human intervention.
While automated management doesn't require correlation, the more correlation that can be applied to an event, the more complex the tasks that can be executed automatically. It's this correlation that further advances the journey to managing the enterprise as a set of business processes and not a set of devices.
The second avenue to managing an object is recognizing systemic events and escalating them properly. For example, if a process has to be restarted three times within a specific time span (15 minutes, perhaps), the monitoring platform would recognize this as a systemic issue. Rather than handling it through automated recovery, the platform follows a procedural path to escalate the issue and ensure human intervention. Escalation can be accomplished via a ticket or a call-out or, in most cases, both.
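The restart rule described above can be sketched as a sliding-window counter. This is only an illustration, not any particular monitoring platform's behavior: the class name, method name, and return values are hypothetical, and it assumes the third restart inside the window is the one that triggers escalation.

```python
from collections import deque


class RestartTracker:
    """Counts automated restarts of one process in a sliding time window.

    Hypothetical helper for illustration; the defaults mirror the
    article's example of 3 restarts within 15 minutes.
    """

    def __init__(self, max_restarts=3, window_seconds=15 * 60):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._restarts = deque()  # timestamps (in seconds) of recent restarts

    def record_restart(self, timestamp):
        """Record a restart and return the suggested next action."""
        self._restarts.append(timestamp)
        # Drop restarts that have aged out of the window.
        while timestamp - self._restarts[0] >= self.window_seconds:
            self._restarts.popleft()
        # Assumption: reaching the limit means the issue is systemic.
        if len(self._restarts) >= self.max_restarts:
            return "escalate"
        return "auto_recover"
```

Under these assumptions, restarts five minutes apart lead to escalation on the third one, while restarts spread further apart than the window keep returning the automated-recovery action.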
The last managing activity is immediate human intervention. Some events are complex, involving multiple objects, and automated correlation isn't as sophisticated as the human mind. At that point, managing the object becomes a human activity rather than a technology-driven process.
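One way to picture the three management paths is as a simple routing decision per event. The sketch below is a loose illustration under stated assumptions: the `Event` fields, the `choose_path` function, and the ordering of the checks are all hypothetical and are not drawn from any specific product.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical event record used only for this illustration."""
    object_ids: list        # objects involved (servers, routers, databases, ...)
    has_known_fix: bool     # an automated recovery exists for this event
    is_systemic: bool       # e.g. the same recovery keeps being repeated


def choose_path(event):
    """Pick one of the three management paths for an event."""
    # Complex, multi-object events with no known fix go straight to a person.
    if len(event.object_ids) > 1 and not event.has_known_fix:
        return "human_intervention"
    # Recurring problems are escalated rather than auto-recovered again.
    if event.is_systemic:
        return "escalate"
    # Well-understood events are corrected automatically.
    if event.has_known_fix:
        return "automated_correction"
    # Anything not yet understood also needs a person.
    return "human_intervention"
```

For example, `choose_path(Event(["db01"], True, False))` returns `"automated_correction"`, while an event spanning two objects with no known fix returns `"human_intervention"`. The object names here are made up.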
The business value of infrastructure management processes can be measured by the improvements on key levers such as application availability, IT productivity, and the ability to change and scale. These levers include:
* Application availability: A process-driven approach that follows well-defined methodologies, such as ITIL, can enable organizations to bring about a major improvement in service levels and, in turn, application availability. It can also enhance their ability to manage incidents and to plan better for outages and capacity.
* IT Productivity: IT productivity is measured by how effectively the IT staff is able to execute various operational functions. A process-driven approach supported by automated systems management, robust knowledge management systems, and documentation can decrease the time IT management and staff spend on these functions.
* Improved capability to change and scale: Backed by industry tools and processes, organizations can be more proactive in handling upgrades, patch management, and even virus attacks. An organization can roll out new services much faster and free up its resources to undertake new initiatives.
Many think that remote monitoring and management is a commoditized service consisting of mundane activities. This is true when it comes to a generic catch/dispatch NOC. However, there is significant business value-add when the management piece incorporates correlation, ongoing improvements, and a multi-tiered approach to managing the object.
If one can shift from monitoring and managing objects to monitoring and managing applications and end-to-end processes, the value-add in service is not just about 'IT'; rather, it is about 'Business'.
By Sanjay S. Savla, senior vice president & head (Infrastructure Management Services) of Patni