Friday, June 26, 2009
Beating CAPTCHA can be a good thing.
One example of this is reCAPTCHA. They take the best OCR they can get their hands on, find text it can’t read, make the text even harder to read and then hand it out for bots to do their best with. For someone to beat this, they would have to make a better OCR program. This has two interesting effects. First, if the bot gets into general circulation (and it will sooner or later), reCAPTCHA can just start using it in their OCR system and be back on top. Second, it furthers the state of the art in OCR, and that is valuable in its own right.
This thought has some interesting implications. More generally, what a CAPTCHA does is use problems that are hard to solve, but whose solutions are easy to check, to make automatic access to a resource too expensive. How about doing this more directly? How about finding commercially valuable computational problems that can easily be broken up into chunks that have this attribute, where each chunk can be solved in a few seconds and checked in microseconds? Then the bots would need to expend a few CPU seconds per page load to access a site.
One implementation of this could be a browser plug-in that allows you to bypass the CAPTCHA on a site. The publisher of the plug-in would push out code packets that the plug-in would be required to run. Sites that run the CAPTCHA could even get paid to use it (some fraction of what the central server gets after expenses) and, to make things fun, the whole thing is open so that anyone who wants can try to write better solutions to the problem. One neat trick would be to try to set the price paid by the central guy so that if you can improve the solver enough, you can make money by getting your own account and just solving problem after problem. Or if that doesn’t make enough money fast enough for you, just sell your solution to the central guy.
Friday, June 19, 2009
Stackoverflow Flair widget
Note to the stackoverflow team: you have my permission to use that page wholesale if you want.
Tuesday, June 16, 2009
Open source; not just for software anymore.
I think this is a really cool idea and I'm interested in how it will pan out. I have thought it would be interesting to do something like this, but I had always thought along the lines of a DARPA Grand Challenge-like project or something fictional like the spaceship Michael.
Thursday, June 11, 2009
Serialization for D part 6 of n
The array support is transparent, but the 3rd party type support adds some new stuff to the API. I went with the function pointer approach and have a simple interface for attaching a pair of functions to a given type.
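To make the "pair of functions attached to a type" idea concrete, here is a language-neutral sketch in Python. It is not the library's D API: attach, serialize, and deserialize are hypothetical names, and fractions.Fraction stands in for a 3rd party type that can't be modified.

```python
from fractions import Fraction
from typing import Any, Callable, Dict, Tuple

# Maps a type to its (to_data, from_data) function pair.
_handlers: Dict[type, Tuple[Callable[[Any], Any], Callable[[Any], Any]]] = {}


def attach(cls: type, to_data: Callable[[Any], Any],
           from_data: Callable[[Any], Any]) -> None:
    """Register a serialize/deserialize pair for a type we don't own."""
    _handlers[cls] = (to_data, from_data)


def serialize(obj: Any) -> Any:
    """Look up the handler pair by the object's type and apply the first half."""
    to_data, _ = _handlers[type(obj)]
    return to_data(obj)


def deserialize(cls: type, data: Any) -> Any:
    """Rebuild an object of the given type using the second half of the pair."""
    _, from_data = _handlers[cls]
    return from_data(data)


# Fraction plays the role of the 3rd party type here.
attach(Fraction,
       lambda f: [f.numerator, f.denominator],
       lambda d: Fraction(d[0], d[1]))

if __name__ == "__main__":
    data = serialize(Fraction(3, 4))
    print(data, deserialize(Fraction, data))
```

The point of the pair is that neither function has to live in the type's own module, which is what makes the approach work for types you can't edit.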
I'm thinking that a number of use cases will be common enough that I should create some boilerplate implementations for them. The first that comes to mind is serializing the same way that the rest of the types are; that is, serialize all the members. Other cases would be to pull only selected members or to use a constructor to build the object rather than member assignment.
I'm looking for ideas for other cases, so comments are welcome.
Monday, June 8, 2009
Looks like the bad guys are winning....
If it weren't for the fear that I'd see it done, I'd write a satire about a terrorist cell that attacks systems they want shut down by using those same systems to attack other targets they don't even care about.
Tuesday, June 2, 2009
Thank goodness Linux != Probable cause
Monday, June 1, 2009
Static Initialization Check Feature
While thinking about how to do 3rd party types for my serialization library, I ran into an interesting problem: how to verify that boot-time initialization is done correctly.
Some cases are easy, for instance, where the initialization is done by the same code author that defines the validity check. In that case the checks can just be appended to the initialization code.
The problem case is where the two bits are separated. This is a sort of punting model where the first author "punts" by throwing out some state variables and expecting someone (a second author) to set them up correctly. In this case you have the problem of where to put the checks. If you put them in a static constructor in the module with the state variables, then they end up running before any static constructors in the modules that could have set the values. Another option is having a test function that the second author needs to call, but they could forget. A third option would be to have a function that gets called at the top of main, err, yuck. The option I think I'll go with is to check that things are correct on the teardown and then force an immediate teardown at some point as part of the test rig.
What I'd really like is some sort of delayed assert that runs after all the static this() constructors but before main. Of course, then I'll want something between that and main... (Yet another example of why to never have more than two levels of operation if you can avoid it.) To avoid that issue, it could be restricted to provably side-effect-free expressions. Given that in my case all I want to do is check that a global is non-null, this would be just fine for me.
Of course, in my case what I'd really like is static whole program optimization and analysis to, where possible, rip out static constructors in favor of literal data segments and replace the checks I'm talking about with compile time checks. But now I'm just dreaming.