Saturday, April 22, 2006

OPML Reading Lists and Scalability

Dave Winer and Amyloo are talking about the scalability of OPML Reading Lists using polling - apparently there is a move afoot to set up ping servers for OPML:

If you have a pinging protocol, unless you're going to send a ping to every subscriber (which isn't practical because of firewalls and NATs), you're going to have to ask some central authority whether something has changed, and nothing is more efficient at that than eTags, nor as widely implemented, nor as utterly optimized.

Polling OPML is no worse than polling RSS, and if you were going to set up a ping service, you'd probably want to make it a general ping service for any type of HTTP resource that is likely to be polled. Reading lists are interesting because they are a collection of things, so the situation is actually better, not worse: the OPML file can be a proxy for all the content it represents. If the OPML file published the eTags of the resources that it is a collection of, then a smart aggregator would only have to poll the OPML file rather than all of the things in the list (see Syndicating Ping State with OPML and OPML Reading Lists for Optimized Feed Sync for more details).
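As a rough illustration of that idea, here is a minimal Python sketch of what a "smart aggregator" might do with such a reading list. It assumes each feed outline carries an `etag` attribute, which is purely hypothetical here and not part of the OPML specification; the feed URLs, the sample document and the `feeds_to_refetch` helper are likewise made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical reading list: the "etag" attribute is an assumption made for
# this sketch and is not defined by the OPML specification.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>Example reading list</title></head>
  <body>
    <outline type="rss" text="Feed A" xmlUrl="http://example.com/a.xml" etag="a-2"/>
    <outline type="rss" text="Feed B" xmlUrl="http://example.com/b.xml" etag="b-1"/>
    <outline type="rss" text="Feed C" xmlUrl="http://example.com/c.xml"/>
  </body>
</opml>"""


def feeds_to_refetch(opml_text, cached_etags):
    """Return feed URLs whose published ETag differs from the cached one."""
    root = ET.fromstring(opml_text)
    stale = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url is None:
            continue  # a grouping outline, not a feed
        published = outline.get("etag")
        # Without a published ETag the feed cannot be skipped, so poll it as usual.
        if published is None or cached_etags.get(url) != published:
            stale.append(url)
    return stale


# ETags remembered from the aggregator's previous fetch of each feed.
cached = {
    "http://example.com/a.xml": "a-1",
    "http://example.com/b.xml": "b-1",
}
print(feeds_to_refetch(SAMPLE_OPML, cached))
```

In this sketch only Feed A (changed ETag) and Feed C (no ETag published) would be fetched; Feed B is skipped without any request. The OPML file itself would still be polled with an ordinary conditional GET.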

Tuesday, April 18, 2006


Dharmesh from On Startups throws some water on the trend of companies being built using mashups of services from Google / Yahoo:
From Wikipedia: A mashup is a website or web application that seamlessly combines content from more than one source into an integrated experience.
From Dharmesh Shah: A gnashup is a mashup that eventually causes gnashing of teeth because the developer thought she was building a viable business, when in fact, she was really conducting a controlled experiment for the benefit of Google and others.
He previously talked about why building a business where the business model is Google AdWords is probably not realistic.

Sunday, April 09, 2006

Gaming Google Video

The online video wars appear to be heating up - someone has posted a video to Google Video that is just an advertisement for a video they are hosting on YouTube. I'm not really sure why you wouldn't just post it in both places, although there must be a reason if someone went to this much trouble - it might be a terms-of-service thing. Having people verify which videos are OK and which are not doesn't seem that scalable.

Saturday, April 08, 2006

Statistics and Sports

Malcolm Gladwell writes on using statistics to judge the validity of sports records. He argues that trying to determine if someone is using performance-enhancing drugs is sufficiently difficult at the chemical / biological level that we should just look at the statistics. For example, sprinters who have mediocre careers and then suddenly become world record holders are probably on the juice. There is a long history of using this type of technique to judge scientific theories (for example the way that the top quark was detected - it's less of an "aha moment" than you might think). I'm less sure that this would work in athletics, however - as people put more trust in the statistics, it's more likely everyone would just start using steroids and the average performance would just increase. People aren't subatomic particles and will consciously adapt to how they are being measured to avoid detection. The other problem is that statistics work well over a bunch of samples and aren't good at making judgements about any single event, because a certain number of background rare events are expected.
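To put a rough number on that last point, here is a small Python sketch. The figures are invented for illustration only: it assumes a "suspicious" career jump happens to a clean athlete with probability 1 in 1,000, and that 2,000 athletes are being watched independently.

```python
def expected_false_alarms(n_athletes, p_rare):
    """Expected number of clean athletes who still show the rare pattern."""
    return n_athletes * p_rare


def prob_at_least_one(n_athletes, p_rare):
    """Chance that at least one clean athlete shows the rare pattern."""
    return 1.0 - (1.0 - p_rare) ** n_athletes


# Hypothetical inputs, chosen only to make the arithmetic concrete.
N_ATHLETES = 2000
P_RARE = 0.001

print(expected_false_alarms(N_ATHLETES, P_RARE))
print(prob_at_least_one(N_ATHLETES, P_RARE))
```

Under those made-up assumptions you would expect about two innocent athletes to look suspicious, and it would be quite likely that at least one does, which is why a single outlier on its own says little.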