A Proposal For Spam-Free Writeable APIs

I've been having an interest in Bitcoin recently, but it would appear I'm too late to the party to make any money on mining. So, what's the next best thing? Taking their idea and using it elsewhere.

The idea behind Bitcoin is to make a particular thing a rare commodity. Now let's pretend we have a website like say http://stackoverflow.com We want to make a public API for it that is writable. Current options appear to be

  1. API keys which require a human to register
  2. ????

I'll throw a second option into the mix. "API Coins" which require a fair bit of computing power to create and are only good in a certain context.

Let's say you wanted to make an account at stackoverflow with a machine that didn't require any human interaction, or rather, didn't require a captcha, valid email, personal info, etc. In theory, a program could register it completely in an automated fashion.

My proposal to prevent masses of spam bots: make it expensive. Use a bitcoin like scheme. Instead of SHA256, I'd go for scrypt because it's so mostly better on CPUs rather than GPUs, and thus capable of executing from Javascript.

So, when you visit the register page I provide something like

  1. Conditions a hash must match (difficultly)
  2. The value hashed must contain a certain provided phrase (to prevent pre-mining of API coins)
  3. That's it!

You calculate a hash which matches and poof! You've got an API key. Ideally, this would be a process that would take no more than 5 minutes on the slowest of hardware. Now, when you need to perform an operation, there will be another hash request, but it won't be as intense as the creation of your API key... but if you're a bad boy, your API key will get banned and you'll have to generate a new one.

Now, how does our site know that API keys are "valid" without pre-mining risk? The key is to make the nonce phrase be random and unique, but slightly persistent. So, when the request is made to get the nonce, it is stored for say an hour. If the API key isn't "found" within an hour or two, it's considered invalid. This would prevent batching of API key creation.

To help to enforce these "hard" checkpoints, if a user, say wanted to post a comment, they'd be given a request like the API key request. A certain difficulty and a phrase to be contained within the pre-hash value. Ideally, this would be significantly easier than generation of an API key.. You could also enforce throttling at this phase by increasing the difficulty for their account as they post more and more things.

The other awesome part about this scheme? It's anonymous other than the IP address in the logs. You can be reasonably sure that it's a human posting while getting absolutely no personal information and storing absolutely no personal information. No passwords needed. You effectively have a sort of private key instead, stored in a cookie or some such.

This also enables awesomely easy registration for users of your API users. "What's an API key?" crops up plenty. Eliminate the need for it!

Some unsolved problems with this approach however:

  1. How to link accounts with it? Assuming you'd want multiple API keys to each API user?
  2. Password to facilitate linking accounts?
  3. What if you lose your key?
  4. What about those mystical FPGA scrypt machines I've heard rumors about?

I might throw together an extremely simple "micro-blog" thing(twitter clone) that uses this concept just to see how it turns out. The hardest thing would probably be implementing scrypt in Javascript

Note One last thing. This isn't to "stop" spam. It's rather to make your site so expensive to spam that it's not profitable. Sure, you can always rent out a few hundred EC2 VMs or some such and compute a few hundred API tokens, but how much is that going to cost? How much do you expect to make from spamming that site?

Posted: 3/31/2013 4:11:10 AM