ImperialViolet

Bloom filters (10 Jul 2004)

In reply to The Register: Archive.org suffers Fahrenheit 911 memory loss:

> But just hours after putting up the movie, Archive.org pulled it down

Although Moore is the creator of the film, that doesn't mean that he holds the
copyright. The copyright law is very broken. Archive.org knows this and is
doing its best to fix it[1]. However, organisations are still bound by the
rule of law.

[1] http://www.archive.org/about/dmca.php

> "Then, it called Archive.org to remove any trace of the interview at all".

Given that there's a six month delay till content hits the Wayback Machine, I
very much doubt that.

> "and how a "library" can obey this request defies comprehension"

Welcome, once again, to the law. I'm sorry that archive.org doesn't do the
Right Thing - irrespective of the law. We would all like several aspects of the
law to be changed, but the way to do that is quite well known. Small
organisations which break the law don't change it - they cease to exist.

You know, if you want to host all the copyrighted content in the US for free
and take on the RIAA, MPAA, etc., go ahead and fund it. I'm guessing that you're
not willing to take that personal risk. You'll just keep attacking others for
not doing it for you.

Archive.org isn't perfect - it's struggling to archive all the content that it
legally can without the funds or the lawyers to do so. But it's trying.

Next time it's a slow news day - take a walk.

AGL

There doesn't (for some strange reason) seem to be any good Python source for Bloom filters. There's a SourceForge project, but that uses mpz for hashing, which is deprecated. So I've written PyBloom, which implements counting and standard Bloom filters.