I’m currently working at a client’s site and they’re filtering web traffic. I can’t work like this (and don’t like the idea of being filtered), so I decided to set up a VPN to bypass their proxy.
It’s actually surprisingly easy to do: using instructions such as these, I was able to set up a VPN between my machine and an EC2 instance on Amazon Web Services in no time. (BTW it’s not only easy and fast, but cheap: $0.007 per hour for a reserved micro instance; if it’s your first EC2 micro instance, it’s also free for the first year.)
At first it didn’t work because the default VPN port of 1197 was blocked, but I redirected the traffic to port 110 and all was fine… until I tried to access Stack Overflow and was greeted by this message (with the title of “Too many requests”):
We’re sorry… This IP is only allowed to access our API. To protect our users, we can’t process requests from this IP address. If you believe you have reached this page in error, contact us.
I fired off an email to team-at-stackoverflow explaining my problem. I also asked why they even cared about bots, since they brag that they can handle so many hits from Google, which was probably a little passive-aggressive and not too wise.
I got a response almost immediately from “Jeff” (who may or may not be Mr. Atwood himself) saying that:
Yes, indeed — we have such a policy. EC2 instances, due to widespread scraping abuse, are only allowed to access our API. Apologies for any inconvenience. Jeff
It’s really nice of them to answer (and so fast), but I find this policy hard to accept: shouldn’t they try to discriminate between a legitimate user and a bot, either of which can come from AWS EC2 or any other platform? Especially given their audience of, you know, hackers?
“We’ll just block all of EC2” seems not only excessively broad but, well, lazy.
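To be fair, a blanket block like this is about the cheapest check a site can run, which is probably part of its appeal. Here is a minimal sketch of the idea in Python, not Stack Overflow’s actual implementation (which I know nothing about). The `BLOCKED_RANGES` list and the `is_blocked` helper are hypothetical names of mine, and the ranges are placeholder documentation addresses, not real EC2 ranges.

```python
import ipaddress

# Placeholder ranges taken from the blocks reserved for documentation
# (RFC 5737). These are NOT real EC2 ranges; a real blocklist would have
# to be loaded from whatever source the site trusts.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any of the blocked ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Note that the check only looks at where a request comes from, never at what the client is doing, which is exactly why a person behind a VPN and a scraper on the same provider get the same answer.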