Sunday, November 27, 2005

REVIEW: Silence on the wire

This is a belated review of Silence on the Wire by Michal Zalewski. I read the book a while ago, so this is from memory, but I enjoyed it a lot and wanted to review it. This is a book about passive security attacks: instead of actively accessing a system and its information, passive attacks capitalize on information leakage. As such, there aren't really any cookbook examples that you can extrapolate from - every case is wildly different. I found the book most useful for getting you to think about systems in a different way so that you can secure them (or attack them, I suppose) better. The book breaks the attacks into 4 classes based on the level of access to the system: The Source (local machine attacks), Safe Harbor (intranet attacks), Out in the Wild (Internet attacks) and The Big Picture (attacks against the whole system). The book does a fairly good job of back-filling the information you need to understand the attacks it describes, although I found myself needing to refresh my memory about some of the deep details of TCP.

Local machine attacks: This section describes passive attacks when the attacker has access (network or physical) to your machine. The book starts with a description of different ways of generating random numbers and how this can be exploited, if you have access to the system, by exhausting the random number supply - the new random numbers have to come from somewhere and (at least in the systems that Zalewski studied) they are partially generated from keyboard activity. The author then describes statistical methods for guessing passwords from the keyboard timing data.

A recurring theme in the book is describing how things work at a very low level (for example, a fairly low-level description of how microprocessors work, starting from boolean algebra) - a necessary prerequisite for attacking any system is having a good understanding of how it works. The book then describes TEMPEST leakage (i.e. electromagnetic radiation emitted by systems) and how it can be exploited.

Intranet attacks: This section deals with attacks that can occur remotely but require some kind of proximity to the victim. It starts with an analysis of information leakage from the blinking LEDs on networking switches - again, Zalewski determined that the attack worked on the hardware he tested, but it's less clear whether it holds for networking hardware in general. The takeaway is to think about unconventional avenues of information leakage and how to analyze the threat model (some researchers recently found that you could swipe passwords from the sounds of the keys being typed, using a similar model).

The author then describes some historical attacks on Ethernet (chilling!), how networks and modems work at the wire level, and some interesting attacks on them.

Internet attacks: The bulk of the book describes various weaknesses in the IP protocol suite - there is a fairly extensive description of how TCP and UDP work, although you will most likely want some supplemental information as I noticed some gaps in the description. Zalewski is the author of a passive fingerprinting tool called p0f, which identifies systems based on the "fingerprint" of their TCP implementation. TCP is sufficiently complicated that various implementations (and different versions of those implementations) can be identified by how they respond to specific requests. Fingerprinting a system is a prerequisite for knowing how to exploit bugs and weaknesses in the TCP implementation. There is a fair amount of coverage of how to determine TCP sequence numbers (to inject packets into a connection) using some interesting time series graphing techniques. The author describes some ways of determining how many hops away a machine is, and gives a detailed discussion of the differences between stateful and stateless firewalls. He then goes into various ways that firewalls and the systems behind them can be probed and identified - all backed with clever real-world examples. For example, there is an interesting example of how HTTP ETags can be used to track users even when they have disabled browser cookies. Neat!

The Big Picture: This section is a little more ill-defined - attacks against the Internet in general rather than a specific user. It covers topics like parasitic storage: using the Internet at large as a data store by exploiting the temporary storage of packets in network devices - a kind of juggling of data. There is also some discussion of tracking a user's physical location using network topology.

I really enjoyed this book, although reading some sections required a bit of effort - the book is not as polished as most O'Reilly books - but the content makes it worth it. The book contains interesting real-world examples for most of the issues it raises - without them the reader might be tempted to think some of these problems were theoretical. The takeaway from the book is that the delusions under which a lot of people labor - firewalls will protect me! no browser cookies means I can't be tracked! - are not as true as you hope.

Friday, November 25, 2005

GMail SPAM Filter needs a little work...

I got spam in my Gmail account Inbox (i.e. the spam filter didn't catch it) recently with the following title:
[Spam] www.fotosende.com - Digital Fotoğraf Baskı Dünyasına HOŞ GELDİNİZ
I don't know what surprises me more, the fact that the spammer put "[SPAM]" in the title or that Gmail didn't catch it...

UPDATE: It gets better. I looked in the Spam Folder and Gmail displayed an ad for Spam Fajitas!

Tuesday, November 22, 2005

How to argue on the Internet

From the blog of Scott Adams (aka the creator of Dilbert), a list of 7 ways to argue with people on the Internet. I'll never lose an argument again with these great tips!
  1. Turn someone’s generality into an absolute. For example, if someone makes a general statement that Americans celebrate Christmas, point out that some people are Jewish and so anyone who thinks that ALL Americans celebrate Christmas is stupid. (Bonus points for accusing the person of being anti-Semitic.)
  2. Turn someone’s factual statements into implied preferences. For example, if someone mentions that not all Catholic priests are pedophiles, accuse the person who said it of siding with pedophiles.
  3. Turn factual statements into implied equivalents. For example, if someone says that Gandhi didn’t eat cows, accuse the person of stupidly implying that cows deserve equal billing with Gandhi.
  4. Omit key words. For example, if someone says that people can’t eat rocks, accuse the person of being stupid for suggesting that people can’t eat. Bonus points for arguing that some people CAN eat pebbles if they try hard enough.
  5. Assume the dumbest interpretation. For example, if someone says that he can run a mile in 12 minutes, assume he means it happens underwater and argue that no one can hold his breath that long.
  6. Hallucinate entirely different points. For example, if someone says apples grow on trees, accuse him of saying snakes have arms and then point out how stupid that is.
  7. Use the intellectual laziness card. For example, if someone says that ice is cold, recommend that he take graduate courses in chemistry and meteorology before jumping to stupid conclusions that display a complete ignorance of the complexity of ice.
The comments also have some tips which are of varying usefulness like this one:
Saying something annoying about spelling is a good way to miss the point.

Sunday, November 20, 2005

Syndicating Ping State with OPML

This is a followup on my previous post about RSS Reading Lists. The recent increase in activity around OPML has a lot of people thinking of useful extensions for it (Syndication of attention data, OPML Extensions, Identity systems) - much of this is related to subscribing to (rather than importing) OPML files. I'll suggest some OPML extensions of my own and explain why I think they are useful.

Add Time To Live (TTL) Sub-element of <head> (optional)

If clients are going to subscribe to OPML files then there needs to be some way to indicate how long the file should be cached for. This would work basically the same way as it does in RSS 2.0 - the value of the TTL element indicates the lifetime of the document in minutes, and the client should not poll the server any more frequently than that. Like RSS, there needs to be a reasonably accepted default (60 mins is considered a reasonable default for RSS).
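As a rough illustration of the caching rule above, here is a minimal Python sketch. It assumes the proposed element is spelled `<ttl>` and sits directly under `<head>` (this is the extension suggested in this post, not part of the OPML spec), and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed fallback, mirroring the 60 minute convention mentioned for RSS.
DEFAULT_TTL_MINUTES = 60


def ttl_minutes(opml_text: str) -> int:
    """Return the proposed <head><ttl> value in minutes, or the default."""
    root = ET.fromstring(opml_text)
    raw = root.findtext("head/ttl")  # None when the element is absent
    if raw is None:
        return DEFAULT_TTL_MINUTES
    try:
        value = int(raw)
    except ValueError:
        # Unparseable value: fall back rather than fail the whole refresh.
        return DEFAULT_TTL_MINUTES
    return value if value > 0 else DEFAULT_TTL_MINUTES


def is_fresh(fetched_at: float, now: float, ttl: int) -> bool:
    """True while the cached OPML file is younger than ttl minutes.

    fetched_at and now are timestamps in seconds (e.g. from time.time()).
    """
    return (now - fetched_at) < ttl * 60
```

An aggregator would call `is_fresh` before each scheduled check and skip the request to the OPML publisher entirely while the cached copy is still within its lifetime.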

Add etag attribute to <outline> element (optional)

RSS aggregators keep track of the current state of a subscription using either the HTTP ETag header or the HTTP Last-Modified header. This information is used to conditionally GET the subscription from the server only if the contents have changed relative to the version the client has. Many services are capable of publishing OPML files dynamically, and some of them, such as the server-based RSS reader Bloglines, also syndicate the feed contents. In this case, the OPML publisher already knows the state of the feed (i.e. the ETag) and could publish that information in the OPML file using the etag attribute.

How does this help the OPML subscriber? The aggregator now checks for updates on the OPML subscription file. For each RSS outline in the OPML file, if there is an etag attribute, it's compared to the etag the aggregator currently holds for that subscription. If the etags match then the subscription is considered current and doesn't need to be downloaded. If the etags do not match then the subscription needs to be downloaded.

An RSS reading list might potentially contain hundreds of subscriptions. Rather than have N subscribers plus the publisher each check the status of the M subscriptions in the reading list, for (N + 1) x M total pings, the N subscribers could check the status of the publisher's OPML file while the publisher checks the status of the M subscriptions, for a total of N + M pings.
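To make the comparison step concrete, here is a small Python sketch under some assumptions: feeds are identified by the usual `xmlUrl` attribute, the proposed attribute is named `etag`, and the aggregator keeps its own stored etags in a plain dictionary keyed by feed URL. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def feeds_to_fetch(opml_text, known_etags):
    """Return the xmlUrl of every feed that should be downloaded.

    known_etags maps feed URL -> the etag the aggregator stored on its
    last successful fetch. A feed is fetched when the publisher's etag is
    missing (no shared ping state, so poll the feed directly) or differs
    from the stored one.
    """
    root = ET.fromstring(opml_text)
    stale = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url is None:
            continue  # grouping outline (folder), not a subscription
        published = outline.get("etag")  # proposed attribute, may be absent
        if published is None or known_etags.get(url) != published:
            stale.append(url)
    return stale


def total_pings(n_subscribers, m_feeds, shared):
    """Polls per refresh cycle: N + M with shared state, else (N + 1) * M."""
    if shared:
        return n_subscribers + m_feeds
    return (n_subscribers + 1) * m_feeds
```

For example, with 100 subscribers and a reading list of 200 feeds, `total_pings(100, 200, shared=False)` gives 20200 polls per cycle, while `total_pings(100, 200, shared=True)` gives 300.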

Issues

This system is a centralized cache for RSS ping state, which has two well-known problems: 1) when to invalidate the cache and 2) the cache is a single point of failure. The first problem seems less severe and could be handled with some sort of user override. In the second case, if the server ceases to publish etag information in the OPML or ceases to publish the OPML file at all, the aggregator could revert to polling the subscriptions directly. Another possibility is that an OPML publisher could publish ping state only for subscriptions that the publisher is responsible for - for example, a server-based RSS aggregator like Bloglines could republish its cached copies of subscriptions, acting as an RSS proxy server.

Notes

The WebDAV PROPFIND verb uses basically the same mechanism as a way of performing directory enumeration. OPML can also be used to implement virtual directory structures so the same scheme could be used in that use case as well.

Warning Label Generator

The next time you need a warning label for something, Warning Label Generator has you covered.

Review of Google Analytics

I set up Google Analytics to measure traffic to this blog. The problems this service had in its first few days have been well documented but have largely been solved by now (either by people giving up on it or by better infrastructure). The price is right (free) and it doesn't have the "only in pay version" aspects of SiteMeter - clearly Analytics is designed to help you maximise your Google AdSense ads. You sign up and add some JavaScript to your blog template, and it takes a few days for any data to be collected. It's clearly geared towards large web site owners (i.e. AdSense customers) rather than bloggers. For example, I don't really have any executives - so I don't need an executive view - I just want the details. The whole thing seems to be lacking a lot of the usual polish of Google web apps. For example, on the dashboard page I see the following:
To track another website with Analytics, click the 'Add Website Profile' link.
Well, I checked and there is no "Add Website Profile" link, and apparently there is no way to add another website. It would be nice if it was better integrated into Google's services - for example, why doesn't Blogger integrate with it? I could imagine a lot of improvements they could make (the standard complaint about any Google service is "How about an RSS feed for this?"). You can't really argue with free, although I'm anxiously waiting to get a Measure Map invite because it's apparently geared to bloggers.

RSS Reading Lists

The latest topic in the RSS aggregator community is the use of OPML for Reading Lists, Identity and syndication of attention. News aggregators have long supported the import / export of the user's subscriptions using OPML, but this is a static process. If you instead subscribe to an OPML file, then you will get updated when something is added to / removed from the reading list. This makes it possible to publish your reading interests in an area for others to follow - somewhat like the blogroll that you see along the sides of some blogs. You could also use this to share your subscriptions between different aggregators - e.g. between Bloglines and NetNewsWire. Dare Obasanjo has a post discussing some of the issues of how desktop aggregators should deal with these reading lists - hopefully server-based readers like Bloglines and Google Reader will make it easier to publish a reading list too. In an upcoming post, I'll describe a scheme for syndicating RSS feed ping state as part of an OPML description of an RSS reading list or directory.

Friday, November 18, 2005

Ross Mayfield vs Software Process

Ross Mayfield has an interesting post on The End of Process discussing how organizations accrete rules over time without considering whether the overall process is actually effective. A lot of his argument is drawn from Clay Shirky's article "Process is an embedded overreaction to prior stupidity" in which he rails against overreaction:
But not all stupidity is amenable to deflection by process, and even when it is, the overhead created by process is often not worth the savings in deflected stupidity. Stupidity is frequently a one-off, and a process designed to deflect it within an organization actually ends up embedding it as a negative shape. Like the outline of Wile E. Coyote just after he is catapulted through a wall, making everyone fill out The Form Designed to Keep You From Doing The Stupid Thing That One Guy Did Three Years Ago actually emphasizes the sense memory of that stupid thing within the group. CAUTION: The beverage you are about to enjoy is extremely hot.
There's a lot of truth in what Shirky says but it's a question of degree. Process is good when it covers 80% of the cases, but trying to cover everything tends to add more friction costs to your organization than it is worth. People also don't really consider whether the process as a whole makes sense - look at the laws of any country or the US Tax code: patches on top of patches with no architecture. So what do you do if you're working in an environment where process is everywhere? As a former manager of mine used to say, "To ask permission is to seek denial", and I've pretty much taken this to heart. It's amazing how far you can get by ignoring things, and every good rule is worth breaking once in a while.