- data collector processes
- storage engine
- analysis console
- background processes
In this scenario, the data collectors run on the monitored
systems. These should be lightweight processes because they shouldn't
put too much burden on the host system. This is especially important
if high-performance systems like web servers are to be monitored. Each
data collector picks up "interesting" events and forwards them
to the central storage engine.
The storage engine writes the received event notifications to persistent storage. This way, the data is safe from manipulation or technical problems on the monitored systems. The storage engine typically runs on a limited number of machines. Often, there is only a single storage engine inside a whole network. That's really not a bad idea, as the whole concept of monitoring is to have all information in one central place. Multiple storage engines, on the other hand, are typically used in complex scenarios, mostly with WAN links in between. There, a local storage engine serves as a central hub for one location and forwards the information to the central system.
Finally, the analysis console is used by the system administrators. It is the interface that allows admins to look at consolidated reports and drill down into more specific topics. Ideally, the console supports multiple concurrent users and provides some hints for fixing detected problems. Integrated links to vendor knowledge bases or public search & discussion services are a valuable help here.
Of course, data collectors and the storage engine are background processes. But there can also be background processes that consolidate and monitor the storage engine's data on a schedule - e.g. daily. So administrators either receive an activity overview report or an exception report (for important and urgent matters).
What about Windows?
Windows NT/2000/XP/2003 does not come with a built-in monitoring solution, so you need some tools to get going.
Windows logs the most important state data into the event log. Third party vendors are also encouraged to log their events to the event log. For example, most anti-virus products log caught viruses here. The event log is definitely the place to look if you'd like to monitor a Windows system's health. The only built-in tool available is the Windows event viewer (part of the computer management MMC under Windows 2000 and XP). That tool allows interactive display of current events but was never meant to be part of an automated monitoring solution.
What we need is a data collector that can run in the background. For this, we use EventReporter. That product monitors the event log in near real time and forwards all new messages to the storage engine via the syslog protocol. Why did I say "near real time"? Well, EventReporter by design does not operate on Windows event notifications, which have proven not to be fully reliable under extreme scenarios. Instead, it polls the event logs on a pre-set schedule. Resource usage is very moderate, so the schedule can be set to run every 30 seconds - even more often in very security-sensitive environments. EventReporter does not only forward the logs but also checks if someone truncates them (via Windows Event Viewer or an API call). If that happens, a notification is sent to the storage engine. This functionality is important, as such log truncations can be a good indication of an intruder. EventReporter is installed on each system that is to be monitored. It runs on all flavors of NT (even ALPHA), so really all systems can be monitored.
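The polling idea can be illustrated with a minimal Python sketch. This is not EventReporter's actual implementation. The function name `poll_once`, the `(record number, message)` record shape, and the assumption that record numbers start over after the log is cleared are all hypothetical simplifications chosen for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical record shape: (record number, message text)
Record = Tuple[int, str]


def poll_once(read_log: Callable[[], List[Record]], last_seen: int):
    """Run one polling pass over an event log.

    Returns (new_records, truncated, new_last_seen).
    """
    records = sorted(read_log())
    highest = records[-1][0] if records else 0

    # Assumption: record numbers restart after a clear, so a newest
    # record number below the last one we saw means the log was truncated.
    truncated = highest < last_seen

    if truncated:
        # Everything still in the log was written after the clear.
        new_records = records
    else:
        new_records = [r for r in records if r[0] > last_seen]

    return new_records, truncated, highest
```

A collector built this way would call `poll_once` on its schedule (for example every 30 seconds), forward `new_records`, and send an extra notification whenever `truncated` is true. Note that this simple check misses a clear that is followed by enough new records to pass the old record number before the next poll.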
Why syslog?
I mentioned that we forward the messages via the syslog protocol. This is in fact a big plus. Syslog is a standard protocol stemming from Unix. Nowadays, it is supported by nearly all major devices. For example, most routers and network printers are able to provide diagnostic information via syslog. So a syslog-based monitoring solution is able to gather data from a variety of sources. While this is not really within the scope of this article, it is nice to know that syslog can help us out when we need to monitor the whole network. This gives us additional flexibility as our needs grow.
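To show how little is needed to speak syslog, here is a minimal Python sketch that builds a traditional BSD-style syslog message and sends it over UDP. The priority value is the facility multiplied by 8 plus the severity, and UDP port 514 is the traditional syslog port. The function names, host name, and tag are hypothetical, and the exact message layout a given product emits may differ.

```python
import socket
from datetime import datetime

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def build_syslog_message(facility, severity, host, tag, text, when=None):
    """Build a BSD-style syslog line: <PRI>timestamp host tag: text."""
    when = when or datetime.now()
    pri = facility * 8 + severity
    # Traditional timestamp "Mmm dd hh:mm:ss"; the day is space-padded.
    stamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {host} {tag}: {text}"


def send_syslog(message, server, port=514):
    """Send one message to a syslog server via UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("ascii", "replace"), (server, port))
```

With facility 16 (local0) and severity 3 (error), the message starts with `<131>`. Because UDP is connectionless, `send_syslog` gets no confirmation that the server received the message.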
Storing the Events...
Now we need something to store the events collected by EventReporter. We use WinSyslog for this. This enhanced syslog daemon works much like its Unix counterpart. But besides writing to flat files, it can also log to a database and carry out flexible actions.
In our monitoring system, we use it for two functions. First of all, it stores all events. In our case, events are written to both a flat file and the database. We use this approach because bulk analysis is fastest with flat files, while viewing event details works best with a database. So we simply write to both and get the best of both worlds. A large hard disk is of course helpful here...
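The "write to both" idea can be sketched in a few lines of Python. This is only an illustration of the approach, not how WinSyslog stores data. The table layout, the tab-separated line format, and the function names are assumptions, and SQLite stands in for whatever database is actually used.

```python
import io
import sqlite3


def open_database(path=":memory:"):
    """Open the event database and create the (hypothetical) events table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "received TEXT, host TEXT, severity INTEGER, message TEXT)"
    )
    return db


def store_event(flat_file, db, received, host, severity, text):
    """Write one event to the flat file and to the database."""
    # Flat file: one tab-separated line per event, convenient for bulk scans.
    flat_file.write(f"{received}\t{host}\t{severity}\t{text}\n")
    # Database: one row per event, convenient for detail lookups.
    db.execute(
        "INSERT INTO events (received, host, severity, message) "
        "VALUES (?, ?, ?, ?)",
        (received, host, severity, text),
    )
    db.commit()


# Small demonstration with an in-memory file and an in-memory database.
flat = io.StringIO()
db = open_database()
store_event(flat, db, "2003-08-05 14:03:09", "web01", 3, "disk full")
```

In a real setup `flat_file` would be a file opened in append mode and `path` would point to a database file on the storage engine's disk.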
Besides storing events, WinSyslog also acts as an alerting engine. It can be configured to detect important message fragments or high priority messages and to forward these to an email account. If your cell/wireless provider supports an email-to-call interface, you can just as easily page your phone/pager in case of an emergency.
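An alerting rule of this kind can be approximated with the following Python sketch. It is not WinSyslog's rule engine. The threshold, the example fragments, the addresses, and the function names are hypothetical. Severity uses syslog numbering, where 0 is the most urgent and 7 the least.

```python
import smtplib
from email.message import EmailMessage


def should_alert(severity, text, max_severity=2,
                 fragments=("virus", "log cleared")):
    """Alert on high priority messages or on important message fragments."""
    if severity <= max_severity:
        return True
    lowered = text.lower()
    return any(fragment.lower() in lowered for fragment in fragments)


def build_alert_mail(host, severity, text, recipient,
                     sender="monitor@example.com"):
    """Build the alert email; the addresses are placeholders."""
    mail = EmailMessage()
    mail["From"] = sender
    mail["To"] = recipient
    mail["Subject"] = f"ALERT from {host} (severity {severity})"
    mail.set_content(text)
    return mail


def send_alert(mail, smtp_host):
    """Hand the alert to an SMTP server for delivery."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(mail)
```

If the recipient address belongs to an email-to-pager gateway, the same code pages the administrator instead of filling a mailbox.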
Typically, only a single instance of WinSyslog is needed. However, it has support for syslog cascading. Cascading is used if a reporting hierarchy is built. This is most often done in corporate networks involving WAN links, where only higher importance messages should be sent to a central data store while less important messages are stored locally at the individual sites. That way, complete data is available for drill-down, but it is not necessarily transmitted over the WAN. WinSyslog fully supports cascading. It is also able to forward only selected messages based on rules.
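The cascading decision at a site-level storage engine boils down to "store everything locally, forward only what matters". A minimal Python sketch, assuming events are dictionaries with a `severity` key in syslog numbering (0 is most urgent) and a hypothetical forwarding threshold of 3:

```python
def split_for_cascade(events, forward_max_severity=3):
    """Return (store_locally, forward_to_central) for a batch of events."""
    store_locally = list(events)  # the site keeps the complete data
    forward_to_central = [
        event for event in store_locally
        if event["severity"] <= forward_max_severity
    ]
    return store_locally, forward_to_central
```

Real rules can of course be richer than a single severity threshold, for example matching on the sending host or on message content.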
Analyzing the Events
Now we come to the analysis part. In most cases, administrators don't like to be bothered with routine information. They just want to be notified when things go terribly wrong (hopefully a bit before it really hurts), or to receive a regular report showing that all is well.
In our system, we have MoniLog running to provide daily reports. MoniLog takes the wealth of information available and condenses it into a summary of the events that happened. So a typical report is just a short HTML page, even for a system with a large number of servers. The color coded reports are stored on the Intranet web server and are accessible to every administrator. They allow a quick look at the system state. Even better, the reports include links to EventID.NET, an online resource for information on Windows events. With eventid.net, solutions can often be found very quickly.
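The core of such a summary is simple counting. The following Python sketch is not MoniLog's logic; it merely shows how a large number of events collapses into a few report rows. The event dictionary keys `host` and `event_id` and the function name are assumptions.

```python
from collections import Counter


def summarize(events):
    """Count events per (host, event ID), most frequent first."""
    counts = Counter((event["host"], event["event_id"]) for event in events)
    # Sort by descending count, then by key, so the report order is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Each resulting row could then be rendered as one line of an HTML table, with the event ID turned into a link to further information.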
Of course, the MoniLog reports might be too condensed or limited in scope to fully analyze an issue indicated in the daily report. In that case, the administrator can use the MoniLog client to dig into the stored event base. WinSyslog also comes with a web interface to the raw event data, which can be used for a detailed view of each individual event.
The integrated Solution
As you see, the system is made up of three main components. Each of these has specific duties to perform. The modular approach provides the flexibility needed in today's environments. For example, if Cisco information is to be integrated into the system, you simply need to point the Cisco boxes to the WinSyslog server. Now, the storage engine saves the new events. Even though MoniLog does not (yet) pick up and analyze the Cisco events, they can be viewed with the WinSyslog web interface, which might be very helpful during analysis.
Also, an administrator has the option to add his or her own custom scripts to be executed on the stored event data. The open system architecture provides unlimited flexibility to do so.
It is also easy to integrate Unix and Linux machines into the scenario.
They support syslog natively and as such can both send and handle
syslog messages. In fact, the EventReporter product alone is often
used as a tool to integrate Windows events into Unix based management
systems.
Conclusion
An effective monitoring solution can save the administrator a lot (and I mean a lot) of work. It can also help prevent major system breakdowns, as critical situations can be detected early and - hopefully - solved before any damage occurs. This is especially true if you're thinking about security monitoring.
As I have outlined, a monitoring system need not be very complex or hard to set up. Just use some ready-to-run tools, integrate them and enjoy the benefits of the system.
Tools used
The following tools were used to build the monitoring system: EventReporter, WinSyslog, and MoniLog. If you'd like to build your own system, you can download free evaluation copies from the respective web sites. Detailed installation instructions are available in the additional article "How To setup Windows NT centralized Monitoring".
I hope this article is helpful. If you have any questions or remarks, please do not hesitate to contact me at rgerhards@adiscon.com.
About the Author: Rainer Gerhards works for Adiscon, which offers software for server monitoring. Visit http://www.monitorware.com for more information and free downloads.