Crazy Firewall Log Activity — What Does It Mean? 344

arkowitz writes "I happened to have access to five days worth of firewall logs from a US state government agency. I wrote a parser to grab unique IPs out, and sent several million of them to a company called Quova, who gave me back full location info on every 40th one. I then used Green Phosphor's Glasshouse visualization tool to have a look at the count of inbound packets, grouped by country of origin and hour. And it's freaking crazy looking. So I made the video of it and I'm asking the Slashdot community: What the heck is going on?"
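
The pipeline described in the submission (pull unique source IPs, sample every 40th, count packets per country and hour) could be sketched roughly as below. The `SRC=` log format, the function names, and the `(country, hour)` record shape are assumptions for illustration; the submitter's actual parser and log format are not shown anywhere in the post.

```python
import re
from collections import Counter

# Assumed log format: any text line containing "SRC=<ipv4>".
# Real firewall logs differ; adjust the pattern to match them.
SRC_RE = re.compile(r"SRC=(\d{1,3}(?:\.\d{1,3}){3})")


def unique_sources(lines):
    """Return unique source IPs in first-seen order."""
    seen = {}
    for line in lines:
        match = SRC_RE.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


def every_nth(items, n=40):
    """Keep the n-th, 2n-th, ... item, like the 1-in-40 geolocation sample."""
    return items[n - 1::n]


def count_by_country_hour(records):
    """records: iterable of (country, hour) pairs, one per inbound packet."""
    return Counter(records)
```

Sampling after de-duplication keeps the geolocation bill down, but the packet counts themselves should still come from the full log, joined to the sampled IPs.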
This discussion has been archived. No new comments can be posted.

  • botnet. (Score:1, Interesting)

    by Anonymous Coward on Saturday January 23, 2010 @10:34PM (#30874846)

    The striping across all countries is a check of whether your site is reachable from that part of the botnet. The purpose of the traffic is unclear: either a large data grab, a (very unsuccessful) bandwidth attack, or something else. You should first adjust the counts for the number of internet-connected users [internetworldstats.com] per country, then revisualize.
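
The per-country adjustment suggested here amounts to dividing each packet count by that country's internet population. A minimal sketch, assuming two plain dictionaries keyed by country; the function name and the sample figures are made up purely to show the effect and are not taken from the logs or from internetworldstats.com:

```python
def per_million_users(packets, users):
    """Scale raw packet counts to packets per million internet users."""
    rates = {}
    for country, count in packets.items():
        n = users.get(country)
        if n:  # skip countries with no (or zero) user estimate
            rates[country] = count * 1_000_000 / n
    return rates


# Made-up illustrative numbers: equal raw counts, very different populations.
packets = {"A": 5000, "B": 5000}
users = {"A": 200_000_000, "B": 2_000_000}
rates = per_million_users(packets, users)
```

With equal raw counts, the smaller country stands out a hundredfold after normalization, which is the kind of distortion the raw graph would hide.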

  • My guess (Score:3, Interesting)

    by JoshuaZ ( 1134087 ) on Saturday January 23, 2010 @10:36PM (#30874872) Homepage
    It looks to me like the lines of major activity likely correspond to major news events or other events that caused people to look at the relevant government agency. Without more data it is difficult to speculate. It might be possible to look at the approximate date (early September 2009) and find a specific event that would cause this. Indeed, it might then be possible to actually guess which government agency the firewall belonged to.
  • by GNUALMAFUERTE ( 697061 ) <almafuerte@@@gmail...com> on Saturday January 23, 2010 @10:53PM (#30874994)

    First, we would need to know what kind of traffic we are seeing. TCP/UDP? Web? DNS?

    On the other hand, I think you have only partial logs, which would explain many of the blanks in your data. Some blanks are too geometric to be real; you are probably missing a shitload of data.
    You have to take that into account, along with timezones. Timezones are the key to this. This is probably some public service that gets hit at regular intervals (root DNS server, webserver holding news/stock/climate or similar information, etc). Timezones would explain the pattern. We would need to check times for each country against a timezone table to see if they correlate.
    I'm also pretty sure that if someone took the time to look at the most active countries, the least active countries, and some groups in between, we could probably determine what kind of traffic this was.

    Some people mentioned botnets, and there's a good chance they have a huge influence on these graphs. Again, matching timezones against the graph would help us understand.
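
One way to run the timezone check described above is to find each country's busiest hour in UTC and convert it to local time. This is only a sketch: the `UTC_OFFSET` table holds hypothetical fixed offsets (no daylight saving, one zone per country), and the input shape is an assumption about how the hourly counts might be stored.

```python
# Hypothetical fixed UTC offsets in hours; real work should use a tz database.
UTC_OFFSET = {"US-East": -5, "Brazil": -3, "Turkey": 2, "China": 8}


def local_hour(utc_hour, country):
    """Convert an hour of day in UTC to the local hour of day for a country."""
    return (utc_hour + UTC_OFFSET[country]) % 24


def peak_hours(counts):
    """counts: {country: [24 hourly packet counts, indexed by UTC hour]}.

    Returns {country: (peak hour in UTC, same peak in local time)}.
    """
    result = {}
    for country, hourly in counts.items():
        peak_utc = max(range(24), key=lambda h: hourly[h])
        result[country] = (peak_utc, local_hour(peak_utc, country))
    return result
```

If the peaks line up in UTC across countries, the traffic is synchronized and probably scripted; if they line up in local time instead, it looks more like people following a daily routine.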

    I don't know what kind of information the submitter has on the logs, or how he got them, but if he could post at least a small sample, that would help a lot. /methinks that submitter has a lot to do with the tool he's using, and this is just another slashvertisement.

  • It just means (Score:5, Interesting)

    by OeLeWaPpErKe ( 412765 ) on Saturday January 23, 2010 @11:01PM (#30875042) Homepage

    (this is a guess, obviously. Full netflow data would tell me more, but only way to be really sure would be a full packet trace)

    This just shows that you're being scanned with random source IP addresses (that's why the vertical stripe lights up). It is essentially a check to see if part of the botnet has more firewall access than other parts, or if a load balancer directs stuff to different firewalls, or if you have additional BGP uplinks, some of which might not be quite as secure.

    Then the real scan starts, which uses the information gained in the first phase to make sure it tests all the firewalls the target network has. This matters especially with backup BGP links, where traffic comes in on physically and administratively different lines (say one Verizon and one AT&T, if you've got money to burn, and most govt. idiots feel the need to burn money). If, in addition to the multiple uplinks, the organization outsources firewalls to those ISPs (or "security", not knowing what they're buying and getting nothing more than a smug false sense of security), which too many govt. agencies do, you are bound to find holes this way. This uses actual bandwidth, and cannot be done on some networks. So what you're seeing is a disproportionate amount of scanning traffic coming from countries with fast networks and few watchful netadmins (or netadmins that just don't care, in Turkey's case), and many unsecured computers. (Dear God, Turks and Russians really do not see any need for virus scanners, but generally you'd see a few other countries in there too. Heh, the Russians are probably worried that running a virus scanner will interfere with their development of new viruses.)

    The regular repeats of vertical lines are probably to rescan reachability information, in case something changed. BGP can be twitchy, especially with incompetent local admins (on the botnet side of the network I mean)

    From the (low) speed of the attack you can further deduce that it was an advanced attack, meant to stay below rate limiters, and presumably meant to stay below the radar. And from the resources required to pull this off you can deduce that this was not a lone hacker. Perhaps an organization (these days, tracing source IPs for security attacks almost invariably yields an IP address in far inland China, which is not because the Russians have stopped attacking networks; the Chinese just seem to be putting quantity above quality these days).

    And frankly, if someone has this kind of patience, generally they will find at least something, even in a well-maintained network. Best hope it was only some files left out in the "public" folder or ~username folders. It's a good bet they probed the network security in other ways too (esp. googling), with IPs that will tell you much more about where the attack is coming from (using many hops is possible, but results in very slow page loads. And we're all human).

    Btw: looking up a net's country can be done quickly via DNS; no need for an external company, no need for any tax dollars:

    [kimmy@t61 ~]$ host -t TXT 104.79.125.74.cc.iploc.org
    104.79.125.74.cc.iploc.org descriptive text "US"

    (don't forget to reverse the IP address: looking up 1.2.3.4 is done by host -t TXT 4.3.2.1.cc.iploc.org)
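
The octet reversal is easy to get wrong by hand across millions of addresses, so here is a small sketch that builds the query name for the `host -t TXT` lookup shown above. The function name is made up for illustration, and the code only builds the name; it does not perform the DNS query or assume the zone is still served.

```python
import ipaddress


def iploc_query_name(ip, zone="cc.iploc.org"):
    """Build the TXT query name: IPv4 octets reversed, then the lookup zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"
```

For example, `iploc_query_name("1.2.3.4")` gives `4.3.2.1.cc.iploc.org`, matching the rule in the comment.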

  • Re:Skylab Shreds (Score:3, Interesting)

    by Nikker ( 749551 ) on Sunday January 24, 2010 @12:22AM (#30875548)
    It does seem like some kind of coordinated interest in the site, possibly a botnet, but it could also be due to press releases or other media coverage, since it is a gov site. You would have to look over many days, not just hours, to come up with something conclusive. It is nonetheless interesting that every country, even those in different time zones, accessed the site at the same time, and it is odd that the Chinese are that interested in a US gov site at the same moment, but I digress. Overall, more information over a longer time frame is needed to draw any real conclusions.
  • Re:Skylab Shreds (Score:5, Interesting)

    by MichaelSmith ( 789609 ) on Sunday January 24, 2010 @12:34AM (#30875628) Homepage Journal

    Yes, he knows the firewall and the traffic. The question is - why is there traffic suddenly appearing from every country in the world at the same time? And again a number of hours later? And again 5 or 6 times?

    I get a lot of distributed dictionary attacks like that. It's pretty normal.

  • by dweller_below ( 136040 ) on Sunday January 24, 2010 @01:01AM (#30875832)

    Nice visualization. Wonder if there is some way to do it in real time.

    I've done networking and security for a university for the last 10 years. I can guess what this kind of activity would be if it were at my institution. Basically, there are several reasons why every country in the world will suddenly talk to us. They include P2P/Gnutella, P2P/Swarmcasting, BitTorrent, Skype, P2P-poisoning, P2P-misdirection, and hacker/bot activity.

    When we have pulses like you are observing, it is usually BitTorrent.

    The Gnutella P2P variants don't usually have that many peers. And, they tend to last for several hours or days.

    The various Swarmcasting P2P variants look very similar to BitTorrent, but again, the users tend to leave them running for hours or days.

    A popular Torrent makes connections to hundreds of locations at once, and usually the local user shuts down in minutes (or an hour) when they get their file.

    Skype won't be narrow bands. It will be every country in the world talking to you all the time. We have had computers promote themselves up the Skype infrastructure until they are constantly talking to over 600K peers. Of course, it is more normal to see a Skype node talking to 10K to 20K peers, but still Skype won't be bands. Skype raises the floor for the entire graph.

    P2P-poisoning would closely match your bands. For several years we observed pulses where every member of a large P2P cloud would attempt to talk to a non-existing IP at our institution. Eventually, we realized that somebody was attempting to render the P2P cloud non-functional by poisoning the P2P community with info on non-existing peers. Of course, since this is a Denial of Service (DoS) attack, this is technically illegal, but we saw it happening for years. But, it appeared to stop a couple years ago (about the time Obama replaced Bush) and we haven't seen any evidence of it lately.

    P2P-misdirection is where a cloud will attempt to confuse traffic analysis by throwing out random connections/packets to random IPs. Typically, this misdirection happens all the time, and not in bursts/bands.

    Bot attack activity doesn't match your patterns either. We observe several types. None would look like your bands:
    - The spoofed attacks will look like every one of your IPs getting acks from a few remote IPs.
    - The mapping activity will look like a representative sample of your IPs getting traffic from a few dozen IPs.
    - An incoming DoS would have a few of your IPs get (spoofed) traffic from everywhere, but it would be sustained.
    - Portscans will only involve a handful of remote IPs.
    - The Tag-team SSH password guessing is close. During the last week, we observed about 3000 sources located all over. But, it happens all the time (in the aggregate), not in bursts. And the sources this week are concentrated in Italy, Poland, Eastern Europe, Colombia, and Brazil. They aren't really all over the world.

    So, I'm guessing it is BitTorrent. But, your situation may be way different from mine.

    Miles

  • Translation (Score:4, Interesting)

    by Alex Belits ( 437 ) * on Sunday January 24, 2010 @01:03AM (#30875854) Homepage

    Vertical stripes may be from spoofed addresses -- nothing from real sources, even botnets, can be that uniform across the whole address space. It would make sense to check how much of the traffic comes from unallocated address space, as packets from there are guaranteed to be spoofed. Why would anyone do such a thing? As a direct portscan it would be useless (the attacker can't see the responses); however, it might be used as a smokescreen to hide a real portscan or attack from some of those addresses. It may even be an attack that floods the DNS servers with fake responses in an attempt to poison the DNS cache, thus redirecting some of the traffic to the attackers' addresses.
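
The spoofing check suggested above could be approximated as follows. This sketch swaps in a short, hard-coded list of reserved and private IPv4 ranges, which is an assumption for illustration: it is not the unallocated-space list itself, which changes over time and would have to be loaded from a current bogon list. The function names are hypothetical.

```python
import ipaddress

# Illustrative, incomplete list of IPv4 ranges that should never show up as a
# source on a public interface. A real check would load a current bogon list.
NON_ROUTABLE = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8", "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16",
    "172.16.0.0/12", "192.168.0.0/16", "240.0.0.0/4",
)]


def is_bogon(ip):
    """True if the address falls in one of the listed non-routable ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NON_ROUTABLE)


def bogon_fraction(ips):
    """Fraction of the given source addresses that are in non-routable space."""
    ips = list(ips)
    if not ips:
        return 0.0
    return sum(is_bogon(ip) for ip in ips) / len(ips)
```

A fraction well above zero inside a stripe would support the spoofing theory; a fraction near zero would not rule it out, since a spoofer can stick to allocated space.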

    Then, after whatever kind of discovery was completed, you have seen some targeted host scans, [D]DoS attempts or actual exploits causing large amounts of traffic (horizontal stripes).

    Another possibility is that those packets are responses caused by something on your network being coerced into sending packets uniformly to the whole address space. It may be something as stupid as a web page with random redirects; more likely, however, it is a worm on some of your computers looking for other members of its botnet. After such discovery some hosts joined the botnet[s], producing horizontal stripes composed of traffic from other botnet members.

  • by PCM2 ( 4486 ) on Sunday January 24, 2010 @01:24AM (#30875984) Homepage

    Everyone always wants me to have labels on the graphs. I don't put them there unless you roll over the data, because I want you to see the patterns in the data without bias first.

    Why? The only reason for that would be so you could go, "Whoaahh, it's crazy looking." You've proven that. Anonymous data with no points of reference has no meaning. If you honestly think your graph has more value to the viewer than this graph from 1880 [yorku.ca] showing the population of Sweden over time, I think you're kidding yourself.

    It is actually pretty simple and makes it quite clear what is going on

    That's debatable. I've argued that it could be much, much clearer.

    Finally, I am not interested in producing graphs which show you everything "at a glance". Use a pie chart for that. I am making graphs which facilitate a deeper understanding of larger amounts of data than Tufte dreamed of showing using his 2D paradigms.

    Careful. If you're trying to get into the data visualization business, it's a bad idea to make it known that you're completely ignorant of Edward Tufte.

    For starters, anyone who knows the slightest thing about Edward Tufte knows that he hates pie charts. So he would never say "use a pie chart for that."

    Second, contrary to your assertion, Tufte advocates for extremely data-rich graphics wherever possible. He does not advocate abridging large data sets out of laziness. He does, however, advocate data compression when it will reveal data, and he does not like "wasted ink." Your graphs appear to have miles and miles and miles of plotted data -- none of which is identifiable without mouse interaction -- but relatively few points of interest. As you scroll through the data set, half your movie seems to feature the text "empty" hovering in midair above the graph. In other words, your dataset may indeed be large, but your visualization of it is not particularly informationally dense.

    Finally, until such a time as your product can reach out of my flat-screen monitor and tweak me in the nose, you're every bit as tied to a "2D paradigm" as Tufte is. All you're doing is making it possible to adjust what is plotted in real time. Tufte would probably argue that it's better to get the plot right the first time. Allowing viewers to take their time to absorb a lot of data points is fine, but they shouldn't have to waste their time fiddling around with the plot to reveal those data points.

  • Re:vertical stripes (Score:2, Interesting)

    by osu-neko ( 2604 ) on Sunday January 24, 2010 @01:29AM (#30876008)

    If we assume the video conference included people from all of those countries, who all endeavored to join at the same time GMT regardless of local time, and they keep conferencing for several days without sleeping, then yes, that would account for those horizontal lines that suddenly get thick at the first vertical stripe and continue until the end of the five-day period. That definitely makes sense... ~

  • by arkowitz ( 1185265 ) on Sunday January 24, 2010 @01:44AM (#30876086)
    Only change of perspective makes something 3D; this is the point of using a virtual world, so that the user can fly around building a spatial awareness.

    I do not want to produce a one-time "plot". I want to show data for what it is. If it doesn't look as nice as Tufte would have made it look, I don't care. The point is not to look nice... it's to provide the ability for people to see what is in databases, without bias. And I still don't think Tufte's paradigms work with as much data as these 3d ones do.
  • Re:Skylab Shreds (Score:4, Interesting)

    by ultranova ( 717540 ) on Sunday January 24, 2010 @08:36AM (#30877568)

    Any other theories?

    A botnet attack? But then the activity shouldn't be concentrated by country, but spread around the world about evenly.

    Or it could be that someone's seeding a torrent from behind the firewall. That would explain the suddenly starting continuous activity. It might also explain the concentration by country (language or timezone). It would help if the graph could be organized by such factors.

  • Re:Skylab Shreds (Score:2, Interesting)

    by BrokenHalo ( 565198 ) on Sunday January 24, 2010 @11:54AM (#30878700)
    What I find a bit odd is that nobody has even thought to question what business the submitter has with 5 days' worth of server logs from a US state government agency.
