Umbrella Smarts

The wind is south by southwest, temperature 92, barometric pressure 29.05 and the Doppler radar appears somewhere between forest and mint green. But would someone please just say when to break out the umbrellas?

Multiply that challenge by thousands to compute the frustration level for service providers struggling to keep up with an unprecedented proliferation of new devices, protocols and services, each of which brings with it new monitoring probes and 'event' collectors.

Last January, for example, Visual Networks Inc. added IP class-of-service, site-to-site and per-site IP, frame relay and ATM service level monitoring to its Visual UpTime v7.0 management software. Only five months later it ushered v7.1 out its doors with additional probes and performance meters for specific IP applications including voice, media streaming, file sharing, client/server and database protocols.

The stream of advances in end-to-end, multi-layer event collection continues to flow from veteran competitors like Concord Communications Inc. and Trendium Inc., and new entrants such as Packet Design. Packet Design's Route Explorer is designed to map, record and 'play back' a view of IP routing protocol events.

"Some customers can face 40 million events a day," says Randy Custeau, OSS marketing manager for Agilent Technologies Inc. "You need to do correlation to provide the most useful information to the administrator."

Consequently, in May, Agilent announced first-phase integration of its NetExpert, Firehunter and Access7 network, service level and signaling assurance products. "We did a lot of analysis of customer needs over the last few months, and the first need was for uniform reporting aggregated in a store, enabling you to correlate an alarm to a service component and make smarter decisions about costly actions," Custeau says. "If there's a problem device but no actual impact on a customer, don't spend money dispatching a technician."
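The decision Custeau describes, correlating an alarm to a service component before paying for a truck roll, can be sketched in a few lines of Python. This is an illustration only, not Agilent's implementation: the Alarm record, the SERVICE_MAP store and the device and service names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """A single alarm raised by a network element (hypothetical record)."""
    device: str
    severity: str


# Hypothetical aggregated store: device -> customer-facing services that depend on it.
SERVICE_MAP = {
    "router-7": ["acme-vpn"],
    "mux-3": [],  # faulty, but no customer service currently rides on it
}


def impacted_services(alarm, service_map):
    """Correlate an alarm to the service components that depend on the device."""
    return service_map.get(alarm.device, [])


def should_dispatch(alarm, service_map):
    """Dispatch a technician only when at least one customer service is affected."""
    return bool(impacted_services(alarm, service_map))
```

Under these assumptions, an alarm on router-7 would justify a dispatch, while an alarm on mux-3 would be logged without sending anyone out.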

To meet similar multilayer challenges in the wireless realm, Hewlett-Packard Co. and Nokia joined forces in May to integrate their tools for collecting, aggregating, filtering and analyzing events across the full range of voice, radio, IP and applications host and delivery infrastructure underlying emerging wireless data services.

Yet the real end of the service assurance rainbow may lie one step beyond the event correlation engine solutions that have emerged over the past several years from Agilent, Aprisma Management Technologies, HP, IBM Corp.'s Tivoli, Micromuse Inc. and others.

"A lot of vendors are doing event aggregation, but that's a lot easier than developing a common data model for mapping relations between components and mapping relationships between components and services," says Glenn O'Donnell, program director, service management strategies for Meta Group Inc.

That modeling ability is the particular boast of business process assurance software vendors, led by Managed Objects and System Management ARTS (SMARTS) Inc. Both emphasize customer demand for a top-down, application-down-to-element view of performance and fault alarms, backed by analysis tools smart enough to know whether a bum optical multiplexer, router port or application server processor is, or is not, impacting a customer.

O'Donnell says that Meta Group counts Managed Objects as the current market leader and particularly lauds its user interface for integrating with multiple sources of events. However, as SMARTS gains customers including AT&T Corp. and British Telecom plc, he describes its underlying data models as equally impressive in accomplishing what he believes HP, Micromuse, Tivoli and other vendors are moving toward: "abstracting data from lower levels to build this higher business process assurance level."

In May, SMARTS unveiled InCharge Applications Services Manager, a system designed "to automate the analysis of what is really wrong, based on a common information model that spans network, computing and business objects," says Shaula Alexander Yemini, founder and president of SMARTS.

SMARTS' ability to provide "an end-to-end view of transactions traversing the infrastructure, rather than viewing a series of disconnected technology domains" contributed greatly to AT&T's decision to integrate InCharge as the core topology, polling and fault management component of AT&T's Integrated Global Enterprise Management System, says Roman Pacewicz, vice president of business development for the managed services unit of AT&T Business Services. "You need to manage it all as a system of systems," says Pacewicz. The managed services and complex hosting arm of AT&T has seen a 42 percent reduction in correlated events that actually disguise true root causes for service failures since deploying InCharge to map thousands of routers and servers last year, he adds.

Yemini says SMARTS' data models enable the system to isolate "authentic problems" along with their "signature" of symptoms that typically can spread across multiple element domains, such as when a congested router can lead to congested server transactions, thereby going "beyond information gathering to modeling and analysis."

In practice, this might mean SMARTS Application Services Manager would identify "perhaps seven authentic problems a server could have, along with the signature of symptoms for each problem across the whole environment," Yemini says. Then it would use server instrumentation from partner BMC Inc. to 'watch' for those symptom patterns to emerge, thereby signaling an "authentic" problem.
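The idea of watching for a problem's signature of symptoms can be illustrated with a small Python sketch. It is a simplified stand-in, not SMARTS' actual analysis: the problem names, symptom names and the SIGNATURES table are hypothetical, and a problem is reported only when every symptom in its signature has been observed.

```python
# Hypothetical model: each "authentic problem" maps to the set of symptoms
# it is expected to produce across the whole environment.
SIGNATURES = {
    "router congestion": frozenset(
        {"router queue drops", "slow server transactions"}
    ),
    "server cpu exhaustion": frozenset(
        {"high server cpu", "slow server transactions"}
    ),
}


def diagnose(observed, signatures):
    """Return the problems whose full signature appears in the observed symptoms.

    Problems with larger signatures are listed first, since matching more
    symptoms is treated here as stronger evidence.
    """
    observed = set(observed)
    matches = [name for name, sig in signatures.items() if sig <= observed]
    return sorted(matches, key=lambda name: len(signatures[name]), reverse=True)
```

With this table, seeing both "router queue drops" and "slow server transactions" would point to router congestion, while "slow server transactions" alone matches no complete signature and yields no diagnosis.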


[Chart: InCharge Business Impact Manager]
