Hello Readers,
The Slammer worm was one of the worst attacks the Internet has ever seen, and
everyone using the Internet was affected at least a little. But do you know what
happened - really? Exactly how did a worm manage to disrupt network communications
worldwide in so little time? In today's article, Rainer Gerhards surveys the damage
and looks at how the worm progressed - the ways the infection was passed along,
which devices failed, how security systems failed, and, most importantly, how
it could have been worse. How do we prevent such a thing from happening again?
The answers aren't easy, and may involve taking a whole new look at network security.
Read on to find out more!
SQL Slammer
Lessons
By Rainer Gerhards
In this article, I analyze why the SQL Slammer worm was so successful
in bringing down the Internet - and what can be learned from the attack.
I will not analyze how the worm worked in detail. There are already a number of
good analyses out; please see the links section, especially
[1] and [3], if you are interested in
that.
Most importantly, this paper tries to figure out why the attack
was so successful. This is not a complete or thorough analysis. I have
put together the information I had at hand from following the BugTraq mailing
list, personal contacts and talks. Anyone with additional information is invited
to email feedback to rgerhards@adiscon.com. Comments are very
welcome.
The Incident
Chronology
On Saturday, January 25th, 2003, a new worm propagated through the Internet
and generated massive amounts of traffic, causing denial of service conditions
at many important infrastructures. The worm spread through a several-month-old
vulnerability in Microsoft SQL Server 2000 and its little brother, MSDE 2000 (MSDE
is a stripped-down, free version of Microsoft SQL Server for use by
application developers). The worm spread extremely fast. Within hours, key Internet
infrastructures and backbones experienced severe problems. There were reports
that large areas - Korea, for example - became disconnected from the Internet
for around 6 hours.
Network administrators worldwide noticed the impact
of the worm rather quickly and began exchanging alerts. Fortunately, key systems
became usable again relatively quickly, though it is expected that corporate environments
will see service degradation for some more days. Also, as of January 27th, there
is still a lot of traffic originating from this worm on the Internet (but no longer
in critical amounts).
Problems Seen
The worm used a well-known security weakness in Microsoft SQL Server and spread
via UDP packets of roughly 400 bytes sent to the MS SQL monitor port at 1434/udp. The
UDP packet contained the complete worm code. Once a victim was hit, it immediately
began trying to infect other servers by sending the very same UDP packet to pseudo-randomly
generated IP addresses as fast as the infected machine and network allowed. This
was done in a tight loop, which could only be ended by shutting down SQL Server
or the host operating system.
For a detailed analysis, see [1] below.
Extremely High Network Utilization
Due to the tight loop executed by the worm, each affected server generated
extremely high network utilization. To make things worse, not only were relatively
large UDP packets transmitted, but also ICMP unreachable (port and host) replies
from those targets that were not hit. Postings on the BugTraq mailing list indicate
that this ICMP traffic was immense.
The large amount of traffic caused trouble simply by using up available bandwidth.
That alone was enough to prevent access to some sites.
Frozen Network Devices
Due to the high traffic volume, many network devices (routers, switches and
firewalls) were unable to carry on normal processing. Some of them even "froze"
in the middle of operation when their CPU resources were exhausted and the high-priority
task of forwarding traffic prevented all other activity. I suspect that almost
all devices ran low on buffer space or even overflowed their buffers. I have no specific
reports on that yet, but it is relatively safe to assume so.
Packet Loss
When we consider the buffer overflow conditions that existed in the network
devices, it's important to realize that they were forced to discard traffic. Unfortunately,
they discarded not only the malicious packets, but also perfectly legitimate traffic.
I expect UDP traffic was most severely hit, as UDP offers "best effort delivery"
by design. As such, an intermediate device may simply discard UDP
packets when it experiences congestion and insufficient buffer space, and nothing
in the protocol recovers them. TCP, in contrast, provides reliable delivery: lost
segments are retransmitted, so TCP-based services can often recover from drops.
Of course, if the device is nearly frozen, TCP connections might break as well.
But I still assume the majority of discarded packets were UDP packets.
As such, UDP-based services had the least chance to operate normally. Supporting
this point, there were reports of disrupted voice-over-IP and streaming media,
both of which are typically UDP-based.
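The effect of a full buffer on legitimate traffic can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: a single drop-tail queue (arriving packets are discarded whenever the buffer is full), a fixed forwarding rate per time step, and made-up traffic volumes. None of the numbers come from measurements of the actual attack.

```python
import random
from collections import deque


def simulate_drop_tail(ticks, buffer_size, forward_per_tick,
                       legit_per_tick, worm_per_tick, seed=1):
    """Return the fraction of dropped packets per traffic class.

    All parameters are hypothetical; the queue treats every packet alike,
    which is exactly why legitimate traffic suffers under a flood.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    queue = deque()
    arrived = {"legit": 0, "worm": 0}
    dropped = {"legit": 0, "worm": 0}
    for _ in range(ticks):
        # Packets of both classes arrive in random order within a tick.
        arrivals = ["legit"] * legit_per_tick + ["worm"] * worm_per_tick
        rng.shuffle(arrivals)
        for kind in arrivals:
            arrived[kind] += 1
            if len(queue) < buffer_size:
                queue.append(kind)
            else:
                dropped[kind] += 1  # buffer full: packet is discarded
        # The device forwards a limited number of packets per tick.
        for _ in range(min(forward_per_tick, len(queue))):
            queue.popleft()
    return {kind: (dropped[kind] / arrived[kind] if arrived[kind] else 0.0)
            for kind in arrived}


if __name__ == "__main__":
    print("quiet network:", simulate_drop_tail(100, 50, 20, 10, 0))
    print("under flood:  ", simulate_drop_tail(100, 50, 20, 10, 190))
```

In the quiet case the queue never fills and nothing is lost. Under the flood, legitimate packets are dropped at roughly the same rate as worm packets, because the queue cannot tell them apart.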

Failing DNS Resolution
Please keep in mind that DNS is also UDP-based for the most part. This explains
why the Internet Root DNS Servers were completely inaccessible during parts of
the attack - the UDP DNS queries simply did not come through.
This DNS weakness illustrates an important vulnerability of the
Internet as a whole: if an attacker could generate even more UDP traffic than the
SQL Slammer worm, and do this for an extended period of time, name resolution,
and thus the stability of the Internet, would fail!
And the bad news is that the worm could have generated even more traffic than
it actually did...
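To get a feeling for how quickly UDP-based name resolution degrades under packet loss, consider a simple model. It is a rough sketch, assuming that every packet is lost independently with the same probability and that a lookup succeeds only if both the query and the reply get through. The loss rates and retry counts below are illustrative assumptions, not figures from the attack.

```python
def lookup_success_probability(loss_rate, attempts):
    """Probability that at least one of `attempts` UDP lookups succeeds.

    One attempt succeeds only if the query AND the reply survive, i.e.
    with probability (1 - loss_rate) ** 2 under the independence assumption.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    single_attempt = (1.0 - loss_rate) ** 2
    return 1.0 - (1.0 - single_attempt) ** attempts


if __name__ == "__main__":
    for loss in (0.1, 0.5, 0.9):
        p = lookup_success_probability(loss, attempts=3)
        print(f"packet loss {loss:.0%}: lookup succeeds with p = {p:.3f}")
```

Under this model, retries help a lot at moderate loss rates but very little once most packets are discarded, which fits the reports of name servers being unreachable during the attack.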
Failing ATM Machines
Our society relies more and more upon the Internet. It is interesting to note
that during the Slammer attack not only systems directly related to the Internet
failed, but also seemingly unrelated systems, e.g. ATM machines. Can you envision
shopping in a supermarket and being unable to pay for your food because of an Internet worm?
Sounds like science fiction? Well, it happened on January 25th, 2003. See [4]
below (towards the end) to see that this is true.
What made the Worm so successful?
Some sources have called this worm more destructive than "CodeRed" or "Nimda."
I tend to agree. There are a number of factors that made it so successful. Unlike
Nimda and CodeRed, it is not simply a case of the "lazy admin," at least not in
my opinion. In fact, this was a major motivation in writing this paper. We need
to be aware of all the things that worked together to allow this worm to cause
its damage.
Usage of UDP
A major "advantage" of the worm was its ability to spread via UDP. Unlike TCP,
UDP needs no session setup, so there are no handshakes or timeouts to wait for and
packets can be sent extremely fast. A UDP packet also carries the complete payload
toward its target, at least up to a top-level router, before it is discarded, which
caused many of the congestion problems. With TCP, only a small connection request is
sent until a session is established, so a probe to an invalid host or to a host not
listening on the targeted port costs far less bandwidth.
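The difference can be put into rough numbers. The following back-of-the-envelope sketch assumes the roughly 400-byte packet size mentioned above and ignores lower-layer framing overhead. The link speed, the number of parallel TCP connection attempts, and the timeout are hypothetical values chosen purely for illustration.

```python
def udp_probes_per_second(link_bits_per_second, packet_bytes=400):
    """Upper bound for a sender limited only by its link bandwidth."""
    return link_bits_per_second / (packet_bytes * 8)


def tcp_probes_per_second(parallel_connections, seconds_per_attempt):
    """Upper bound for a sender that must wait for each connection
    attempt to succeed or time out before reusing that slot."""
    return parallel_connections / seconds_per_attempt


if __name__ == "__main__":
    # Hypothetical host on a 100 Mbit/s link.
    udp_rate = udp_probes_per_second(100_000_000)
    # Hypothetical TCP scanner: 100 parallel attempts, 3 s per attempt.
    tcp_rate = tcp_probes_per_second(100, 3.0)
    print(f"UDP, bandwidth-limited: {udp_rate:,.0f} probes/s")
    print(f"TCP, latency-limited:   {tcp_rate:,.1f} probes/s")
```

Under these assumptions a single well-connected UDP sender is limited only by its bandwidth, while a TCP-based scanner spends most of its time waiting.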
Unpatched Servers in Internet Data Centers
Internet Data Centers housing customer machines were among the most severely
hit. Typically, those facilities provide network management, power, and climate
for customer machines. The customer, however, is responsible for administration
of the machine. The housing provider typically does not even have administrative
access to customer systems. Internet data centers are typically very well connected
to the Internet.
For this reason, a few unpatched SQL servers inside an Internet data center
can lead to massive amounts of traffic, both inside the providers' facilities
as well as on the backbones the data center connects to. While the network operators
are able to detect this condition, they typically cannot patch or even shut down
the customers' machines, as they do not have admin access to them.
To make things even worse, there are many Internet data centers that lease
machines (often called root-servers) to their customers. In this business model,
the customer is again solely responsible for server administration. The worst
part is that those leased machines are typically single-machine setups that do
not make use of any kind of firewalling. Some data centers do not support multiple
machine configurations with a firewall in front (or offer firewall functionality
for lease). Those who do offer firewalling do not sell those packages very often
- customers trying to save money skimp on security, and choose the cheaper technique
of just a single machine.
MSDE Integration
Pivotal to the worm's success was the widespread use of MS SQL Server. One
might wonder how it happens that so many servers sitting directly on the Internet
are not even protected by a minimal firewall rule set.
In my opinion, the fact that MSDE was vulnerable, too, is highly important.
In contrast to the "real" SQL Server, which typically (hopefully) is set up and
administered by a skilled administrator, MSDE is often used on desktops.
Also, people often do not realize they are running SQL Server, because they get it
unknowingly when a third-party application installs MSDE. So even the (somewhat) caring admin
is not really aware that he needs to monitor SQL Server patches.
In fact, MSDE is installed as part of Microsoft Visual Studio.NET as well as
a number of other Microsoft products, including Sharepoint Services plus Project
2002 Server. Also, a growing number of third-party applications install MSDE,
some of them silently.
From our developer and admin point of view, it appears that Microsoft is making
deployment and integration of MSDE much harder than the full SQL engine (be warned:
I might be wrong here, it is my own personal impression). In any case, it is definitely
harder to patch MSDE; some of the current patches require an SP2 version that comes
only on CD and cannot be downloaded. The patch for the vulnerability exploited by the
worm was one of them. I guess the unpatched state of some MSDE installations can be
attributed to this fact.
There have been security issues with the MSDE setup since it appeared in the
industry. For example, many products install MSDE with the default admin account
of "sa" without a password. Some of these problems were cleaned up with MSDE 2000,
but again, there are still many glitches around that make the setup vulnerable.
Another example is that it is relatively hard to change the default setup directory.
The fact that many apps bring MSDE with them is a key problem, along with the
fact that the end user does not necessarily know he is running a database server
on his machine. Some of those applications are also in wide use at desktops -
just think about the number of Visual Studio.NET installations that have potentially
installed an (unpatched) MSDE.
Unpatched Home User Systems
Unpatched home systems are the never-ending story of the insecurity of the
Internet. I am sure (but have no definite evidence right now) that home desktops
running MSDE have contributed to the worm traffic. It goes without
saying that many home workstations are still unprotected (or insufficiently protected,
to say the least). And broadband connections give them more and more power.
Outgoing UDP Firewalling
Traditionally, many organizations do not take equal care to set up firewall
rules for traffic flowing from inside the network to the external side. While
attack traffic coming in from the Internet is closely scrutinized, admins tend
to be more lax with traffic that originates from their networks. Just keep in
mind how many setups allow spoofed traffic, with source addresses that do not belong
to the internal network, to be transmitted to the Internet.
In my experience, firewalling UDP ports is an even worse story. For example,
even otherwise (partly) caring admins tend to open up all UDP ports above 1024 to make
their DNS responses work. Of course, this is only necessary for DNS servers, but
it is often applied to all servers as a general policy. In general, traffic originating
internally is more likely to pass through most existing firewall setups than traffic
originating from the external side.
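The check that keeps spoofed traffic from leaving a network is conceptually simple. The sketch below is not a firewall configuration, just an illustration of the rule in Python. The internal prefix is a hypothetical example taken from the address ranges reserved for documentation.

```python
import ipaddress

# Hypothetical internal address range (a documentation prefix).
INTERNAL_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def may_leave_network(source_ip, internal_networks=INTERNAL_NETWORKS):
    """Return True only if the packet's source address is one of ours.

    Packets claiming any other source address are spoofed (or misrouted)
    and should be dropped at the border instead of being forwarded.
    """
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in internal_networks)


if __name__ == "__main__":
    for source in ("192.0.2.17", "203.0.113.5"):
        verdict = "forward" if may_leave_network(source) else "drop"
        print(f"outgoing packet from {source}: {verdict}")
```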
Some organizations experienced the worm even when firewalls prevented it from
entering via the "normal" Internet gateway. When vulnerable home users, or laptops
on the road (developer machines, desktop engine for replication on mobile machines)
got infected, they were likely to infect the organizational network when they
dialed in. One such machine could infect an internal SQL server when the worm
tried to infect the rest of the Internet (and the Intranet as well) from the internal
side of the firewall. Because outgoing traffic is not carefully monitored or closely
restricted, the SQL server would in turn have been able to congest the Internet.
If the firewall was a low-powered one, chances are good that at some point the
firewall would be monopolized by the malicious traffic, effectively denying service
to legitimate traffic. Bear in mind that I do not have any report of things going
that way, but I bet it happened at least once.
In-Band Administrative Data
In the past, the discussion of in-band vs. out-of-band network management was
much more active than in recent times, now that everyone does everything via the
Internet and VPN.
I do not yet have any authoritative sources saying that they received alerts
either too late or not at all because of network congestion. However, many protocols
used for such systems are UDP-based. An example is syslog. For this reason, I
would expect that network administration and alerting were at least not as efficient
as they should have been during the attack. While we ourselves were fortunate enough
not to receive an amount of traffic that led us into real trouble, this also
means I cannot confirm the effectiveness of network management over here.
I would appreciate any feedback on this issue.
At the very least, I doubt that the current in-band management approach can cope with
malware like SQL Slammer.
Lessons Learned
Remember, this is a first and quick effort to analyze the effects of the SQL
Slammer worm. In fact, I expect that we will learn more lessons than I describe
here and I also suspect that I will need to change some of my conclusions after
they have undergone peer review. Anyhow, I hope they are helpful - if nothing
else, they hopefully start discussions.
Reconsider Firewalling
One of the main lessons learned is that firewall configuration must pay
more attention to outgoing traffic. This is not really news. But I think it
is worth reiterating.
Deny all ICMP
If you don't do this already, it is a very good idea to drop all ICMP packets
other than those generated by the firewall. This might have some implications
for day-to-day operations, but it definitely helps under such attacks. It also
makes it much harder for an attacker to detect which services are running at a
given IP address.
Block all outgoing UDP
The need for outgoing UDP should be considered very seriously. Only those machines
with a definite need should be allowed to send UDP traffic to the Internet. For
the same reason, services requiring UDP should - if possible - be placed on a dedicated
machine and not mixed together with other services like SQL Server. This might
not be easy for some small shops. It should be quite affordable for medium
and large organizations. If you need a good argument to justify the cost to your
management: count the traffic generated by the worm and calculate the expense
of it. Then, calculate the cost for a dedicated UDP services machine...
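A default-deny policy for outgoing UDP boils down to a short allow list. The following is a minimal sketch of that logic, not the syntax of any real firewall product. The host address and the permitted ports are hypothetical and stand for a dedicated UDP services machine.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class UdpAllowRule:
    source: str     # internal host that may send UDP to the Internet
    dest_port: int  # UDP destination port it may reach


# Hypothetical policy: one dedicated machine may do DNS and NTP.
ALLOW_RULES = [
    UdpAllowRule(source="192.0.2.53", dest_port=53),
    UdpAllowRule(source="192.0.2.53", dest_port=123),
]


def outbound_udp_allowed(source_ip, dest_port, rules=ALLOW_RULES):
    """Default deny: only explicitly listed (host, port) pairs pass."""
    source = ipaddress.ip_address(source_ip)
    return any(
        source == ipaddress.ip_address(rule.source)
        and dest_port == rule.dest_port
        for rule in rules
    )


if __name__ == "__main__":
    print(outbound_udp_allowed("192.0.2.53", 53))    # dedicated DNS host
    print(outbound_udp_allowed("192.0.2.10", 1434))  # SQL server sending to 1434/udp
```

With such a policy, an infected SQL Server behind the firewall could still flood the internal network, but its packets to 1434/udp would not reach the Internet.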
Be Careful with MSDE
Microsoft
I don't want to bash Microsoft here. One thing that I think requires immediate
action is providing downloadable patches for some of the MSDE versions. Offering
them only on CD is unacceptable and is surely responsible for at least some
of the unpatched versions out there. This should never happen again.
Other Vendors
Vendors shipping MSDE as part of their product should very prominently state
the fact that they install a full-blown database engine on the customer's machine.
Vendors shipping MSDE as an integral part should also take responsibility for
it and notify their customers when Microsoft releases an important security bulletin.
End Users
Take care of what you install. Read the spec for your applications and make
sure that you know when you install a database engine.
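One quick, admittedly crude way to see whether something on your own machine has claimed the port used by the worm (1434/udp) is to try to bind it yourself. This is only a sketch: a port that is in use does not prove that SQL Server or MSDE is the program holding it, and a free port does not prove that no database engine is installed (the service may simply be stopped). How the operating system reports the conflict may also differ between platforms.

```python
import errno
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if binding the UDP port fails because it is taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # any other error is unexpected; do not guess
    finally:
        sock.close()
    return False


if __name__ == "__main__":
    if udp_port_in_use(1434):
        print("Something is listening on 1434/udp - check for SQL Server/MSDE.")
    else:
        print("Nothing is bound to 1434/udp right now.")
```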
Internet Data Centers
Internet data centers should provide an easy way for their customers to have
some router filters assigned - without additional cost. Even some
very generic filter classes would be helpful.
The community as a whole should consider out-of-band administration for critical
resources. Of course, this cannot be accomplished for all resources. Protocol
designers and implementers should consider careful use of Quality of Service (QoS)
to ensure that alerts have a better chance of surviving congested routers and
networks. In doing so, they should take great care not to overdo this effort:
if every single probe or warning message is flagged as high priority, the prioritized
packets will probably flood the network, too.
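As an illustration of "careful use", the sketch below sends a simplified syslog-style message over UDP and requests priority treatment only for the most severe messages. Several things here are assumptions: the message omits the timestamp and host name a full syslog message would carry, the DSCP value is an arbitrary example, the threshold is a hypothetical policy choice, and whether routers honor the marking at all depends entirely on the network. Setting the marking is attempted only where the platform exposes `IP_TOS`.

```python
import socket

FACILITY_LOCAL0 = 16  # syslog facility "local0"
SEVERITY_ALERT = 1    # lower numbers are more severe in syslog
SEVERITY_INFO = 6
URGENT_DSCP = 34      # assumption: example code point for urgent alerts


def build_syslog_message(facility, severity, text):
    """Prefix the text with the syslog priority: facility * 8 + severity."""
    priority = facility * 8 + severity
    return f"<{priority}>{text}".encode("ascii", "replace")


def tos_for_severity(severity, urgent_threshold=SEVERITY_ALERT):
    """Only the most severe messages get marked; the rest stay best effort.

    The DSCP value occupies the upper six bits of the TOS byte.
    """
    return (URGENT_DSCP << 2) if severity <= urgent_threshold else 0


def send_syslog(host, facility, severity, text, port=514):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        ip_tos = getattr(socket, "IP_TOS", None)
        if ip_tos is not None:
            try:
                sock.setsockopt(socket.IPPROTO_IP, ip_tos,
                                tos_for_severity(severity))
            except OSError:
                pass  # marking refused: send unmarked rather than not at all
        sock.sendto(build_syslog_message(facility, severity, text),
                    (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    print(build_syslog_message(FACILITY_LOCAL0, SEVERITY_ALERT, "uplink saturated"))
    print(tos_for_severity(SEVERITY_ALERT), tos_for_severity(SEVERITY_INFO))
```

Keeping the threshold strict is the point: if routine informational messages were marked as well, the marking would lose its value in exactly the situation it is meant for.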
Links
Disclaimer
The information within this paper may change without notice. Use of this information
constitutes acceptance for use in an AS IS condition. There are NO warranties
with regard to this information. In no event shall the author be liable for any
damages whatsoever arising out of or in connection with the use or spread of this
information. Any use of this information is at the user's own risk.
Rainer Gerhards works for Adiscon, which
offers software for server monitoring. Visit www.monitorware.com
for more information and free downloads.