Overview:
This article defines a network management strategy. The first step is to define how the equipment will be monitored and to determine whether the current management strategy is adequate or whether new applications, equipment, protocols and processes must be identified. Management components are then integrated with the infrastructure and security.
These primary elements make up any well-defined management strategy and should be considered when developing your own.
Network Management Strategy
· Network Management Groups
· SNMP Applications
· Monitored Devices and Events
Network Management Groups
· Fault
· Performance
· Device
· Security
· Change
· Configuration
· Implementation
Fault Management
This describes the proactive monitoring of devices, circuits and servers for errors. It specifies which events are monitored and the thresholds for generating alarms. Once an alarm is generated, an escalation process addresses the error, which could be a circuit problem, a router interface fault or a server link failure.
Service level agreements with local loop providers and long-distance interexchange carriers (IXCs) for circuit repair are important, as are vendor equipment repair contracts. Out-of-band router management allows troubleshooting and configuration of routers through an attached modem, so the support technician doesn't rely on the primary circuit to reach the router.
Instead, the technician uses a separate analog dial line with a modem connected to the auxiliary port on the router. Escalation support processes are defined for network operations center (NOC) employees to follow for effective problem resolution. These are some typical support activities:
· Established Tier support levels with job responsibilities well defined for each Tier group
· Defined severity levels and what Tier group is responsible
· Defined response times for severity levels
· Applications for trouble tickets
· Established troubleshooting procedures for employees
· Root Cause Analysis
· Survey support groups for skill levels, identify deficiencies and plan training programs to address them
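As an illustration of how severity levels, Tier groups and response times fit together, here is a minimal Python sketch. The severity numbers, Tier assignments and response times are hypothetical placeholders, not recommendations; a real NOC would take them from its own support agreements.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class SeverityPolicy:
    tier: int                  # Tier group responsible for the first response
    response_time: timedelta   # maximum time allowed before the first response


# Hypothetical policy table: every value here is illustrative only.
ESCALATION_POLICY = {
    1: SeverityPolicy(tier=3, response_time=timedelta(minutes=15)),
    2: SeverityPolicy(tier=2, response_time=timedelta(hours=1)),
    3: SeverityPolicy(tier=1, response_time=timedelta(hours=4)),
}


def route_alarm(severity: int) -> SeverityPolicy:
    """Return the responsible Tier group and response time for a severity level."""
    try:
        return ESCALATION_POLICY[severity]
    except KeyError:
        raise ValueError(f"unknown severity level: {severity}") from None


def is_overdue(severity: int, elapsed: timedelta) -> bool:
    """True when a ticket has waited longer than its defined response time."""
    return elapsed > route_alarm(severity).response_time
```

Keeping the policy in one table means the trouble ticket application and the escalation procedure read from the same definitions.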
Performance Management
This describes the proactive monitoring of device, circuit and server performance levels. That translates to monitoring and reporting on trends in device CPU, memory and link utilization, circuit bandwidth utilization, and server CPU, memory and disk input/output rate. In addition, campus segments and device interfaces should be monitored for collisions, CRC errors and packet drops.
Bandwidth capacity planning is an ongoing process of monitoring bandwidth utilization trends across the enterprise network and weighing them against business growth estimates. That information is used to develop a provisioning strategy that addresses company bandwidth capacity needs. The dynamic nature of an enterprise network is such that new locations, employees and application deployments will increase network traffic and consume available bandwidth.
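Link utilization is usually derived from two readings of an interface octet counter (for example `ifInOctets` from the standard interfaces MIB) taken a known number of seconds apart. The sketch below shows the arithmetic only; the function name and parameters are assumptions made for illustration, and it handles at most one counter wrap between readings.

```python
def link_utilization_percent(octets_start, octets_end, seconds, link_bps,
                             counter_bits=32):
    """Percent utilization of a link between two octet-counter readings.

    octets_start / octets_end: counter values at the two polls
    seconds: time between the polls
    link_bps: link speed in bits per second
    counter_bits: counter width, used to correct a single wrap-around
    """
    if seconds <= 0 or link_bps <= 0:
        raise ValueError("seconds and link_bps must be positive")
    modulus = 2 ** counter_bits
    # The modulo keeps the delta correct if the counter wrapped once.
    delta_octets = (octets_end - octets_start) % modulus
    return delta_octets * 8 * 100 / (seconds * link_bps)
```

For example, 37,500,000 octets received in 300 seconds on a 10 Mbps link is 300,000,000 bits against a capacity of 3,000,000,000 bits, or 10% utilization.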
Trend monitoring tools are typically run from the network operations center and focus on enterprise traffic patterns and performance of circuits, routers and switches.
RMON is a popular standard used for monitoring router, switch and campus segment performance with probes at various offices across the enterprise. Information can be collected across the layers of the OSI model (the original RMON covers the MAC layer, and RMON2 extends coverage up to the application layer) for statistics on utilization, packet size and errors.
In addition, there are specific SNMP applications designed for bandwidth capacity planning. The bandwidth provisioning strategy could involve faster campus and WAN equipment, increased bandwidth for circuits, quality of service protocols or a combination of any of those elements.
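The trend-based capacity planning described in this section can be sketched as a simple projection. The example below fits a straight line (ordinary least squares) to monthly utilization figures and estimates how many months remain before a planning threshold is reached. The 80% threshold and the function name are assumptions, and a straight line is only a rough model; business growth estimates may call for something less naive.

```python
def months_until_threshold(monthly_util, threshold=80.0):
    """Estimate months until utilization reaches the threshold.

    monthly_util: utilization percentages, oldest first, one per month.
    Returns None when the trend is flat or falling.
    """
    n = len(monthly_util)
    if n < 2:
        raise ValueError("need at least two monthly samples")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(monthly_util) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_util))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    # Month index where the fitted line crosses the threshold,
    # measured from the most recent sample (index n - 1).
    months = (threshold - intercept) / slope - (n - 1)
    return max(months, 0.0)
```

With samples of 40, 45, 50 and 55 percent, the fitted growth is 5 points per month, so an 80% threshold is about 5 months away.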
Security Management
This describes the management of device and server security consistent with corporate policy. Typical devices are firewalls, routers, switches, TACACS servers and RADIUS servers. Security covers SNMP community strings, password assignment and change policy, dial-in security and Internet security.
Device Management
This describes the maintenance of a database inventory that lists all campus and WAN devices, modules, serial numbers, IOS versions, server documentation and design. It is important that companies keep information on these assets for support and warranty issues.
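A minimal sketch of such an inventory, assuming an in-memory store keyed by serial number. The field names and the `Inventory` class are hypothetical; in practice this would sit on top of a real database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DeviceRecord:
    hostname: str
    site: str
    model: str
    serial_number: str
    ios_version: str
    modules: List[str] = field(default_factory=list)


class Inventory:
    """In-memory device inventory keyed by serial number."""

    def __init__(self) -> None:
        self._by_serial: Dict[str, DeviceRecord] = {}

    def add(self, record: DeviceRecord) -> None:
        # Serial numbers identify assets for warranty claims, so reject duplicates.
        if record.serial_number in self._by_serial:
            raise ValueError(f"duplicate serial number: {record.serial_number}")
        self._by_serial[record.serial_number] = record

    def find_by_serial(self, serial_number: str) -> Optional[DeviceRecord]:
        return self._by_serial.get(serial_number)

    def hostnames_running(self, ios_version: str) -> List[str]:
        """Hostnames of all devices on a given software version, sorted."""
        return sorted(r.hostname for r in self._by_serial.values()
                      if r.ios_version == ios_version)
```

A query such as `hostnames_running` is useful when a vendor advisory applies to one software version and support staff need the affected devices quickly.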
Configuration Management
This describes the process of configuring and documenting devices, circuits and servers on the enterprise network. A process for configuring new equipment, modifying current equipment and maintaining TFTP servers should be established. Configuration scripts should be saved to TFTP servers and documented for later use with subsequent configurations. Build a directory structure with a folder for each equipment type and subdirectories for model types.
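One way to keep that directory layout consistent is to generate every path from a single helper. This is a sketch only: the root directory, the `.cfg` extension and the function name are assumptions, not a convention required by any TFTP server.

```python
from pathlib import PurePosixPath


def config_path(root, equipment_type, model, hostname):
    """Build <root>/<equipment type>/<model>/<hostname>.cfg."""
    for part in (equipment_type, model, hostname):
        # Reject empty names and anything that could escape the layout.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return (PurePosixPath(root) / equipment_type.lower() / model.lower()
            / f"{hostname}.cfg")
```

For example, `config_path("/tftpboot", "Router", "Model-A", "edge1")` yields `/tftpboot/router/model-a/edge1.cfg`, so every script for a given model ends up in the same subdirectory.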
Change Management
This describes a process for approving and coordinating device configuration changes and is essential for network availability. Staff members who make unapproved changes without alerting affected departments can cause problems, particularly if the changes don't work and are made during busier times of the day. Any change to the production network should involve at least the network operations center and someone from the engineering group. It can also be important to let the application developers know of network changes. Any change management process should have these components:
Review Process
· Affected departments consider impact of changes and discuss concerns
· Proof of concept and quality assurance testing
· Develop a timeline for changes approved by all departments
· Departments plan contingencies should there be network issues
· Approval process: software manages and records approvals from groups
· Proactive monitoring of unauthorized changes
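Monitoring for unauthorized changes can be approximated by fingerprinting each approved configuration and comparing it with what is currently running on the device. The sketch below assumes the configurations have already been retrieved as text; the function names are hypothetical, and the SHA-256 comparison is one possible technique rather than a prescribed one.

```python
import hashlib


def config_fingerprint(config_text):
    """SHA-256 of a configuration, ignoring trailing whitespace differences."""
    normalized = "\n".join(line.rstrip()
                           for line in config_text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_unauthorized_changes(approved, current):
    """Return hostnames whose running configuration is not the approved one.

    approved: hostname -> fingerprint recorded when the change was approved
    current:  hostname -> configuration text currently on the device
    A device with no approved fingerprint on record is also reported.
    """
    return sorted(host for host, text in current.items()
                  if approved.get(host) != config_fingerprint(text))
```

Recording the fingerprint at approval time ties the check to the approval process itself: a change that was never approved has no matching fingerprint.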
Implementation Management
This describes the process for managing new implementations so that the production network is not disrupted and the implementation is efficient and effective. Consider vendor support contracts for help with configuration scripts, testing and design, since that will promote an effective implementation. The network operations center (NOC) activities listed below should be part of any typical implementation management strategy.
Standard Network Operations Center Activities:
1) Turn on circuits and ping all new devices to verify connectivity
2) Modify SNMP applications at the network operations center for proactive fault and performance monitoring of new devices
3) Verify devices are SNMP enabled and security is applied
4) Update the inventory database and save configuration scripts to a TFTP server
SNMP Applications
There are many SNMP applications on the market that focus on managing servers, devices and circuits. An enterprise customer will sometimes employ several applications, including in-house software, to cover each management group. The SNMP version that is implemented should be noted for each device and server. This is a list of popular commercial applications and how they could be utilized.
Monitored Devices and Events
Typical devices such as routers, switches and circuits are configured and monitored with SNMP applications. A threshold is defined for each event, and an alarm is triggered when that threshold is exceeded. A polling interval is also configured for each event; it is the time between successive status requests from the network management station to the device. An example would be a router CPU utilization threshold of 60% and a polling interval of 10 minutes.
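The relationship between an event, its threshold and its polling interval can be sketched as follows, using the router CPU example above (60% threshold, 10-minute interval). The class and function names are hypothetical, and the samples are passed in directly rather than fetched over SNMP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredEvent:
    name: str
    threshold: float          # an alarm fires when a sample exceeds this value
    polling_interval_s: int   # seconds between polls


def poll_times(event, start_s, end_s):
    """Times (in seconds) at which the event is polled between start and end."""
    return list(range(start_s, end_s + 1, event.polling_interval_s))


def alarms(event, samples):
    """Return the (time_s, value) samples that exceed the event threshold."""
    return [(t, v) for t, v in samples if v > event.threshold]


# Router CPU utilization: 60% threshold, polled every 10 minutes.
router_cpu = MonitoredEvent("router-cpu", threshold=60.0, polling_interval_s=600)
```

A sample of exactly 60% does not raise an alarm in this sketch, because the threshold must be exceeded. A shorter polling interval detects problems sooner but adds management traffic and load on the device.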