Archive for the 'Site Features' Category

The Science of Luck

Saturday, June 2nd, 2007

I’ve heard a lot of people claim not to believe in luck. Allow me to demonstrate that this belief, while psychologically satisfying to some, is tantamount to disbelief in Pi or milkshakes.

Luck is not some sort of ethereal faith-based system of wish-granting priority. Luck is the result of a simple calculation and luckiness is the sum of a series of luck calculations.

The Formula


Luck = Benefit / sqrt(Probability)

The basic unit of luck measurement is under some debate, but for our discussion, we will refer to the unit as Л or “El.”

You can see the formula in action with my online luck calculator.

Benefit:

A numeric value with min and max centered about 0 and range set to an arbitrary scale (ex.: winning a free bagel = 0.8 and getting a lethal papercut on the giant novelty check handed to you by Ed McMahon = -99). For our purposes, the benefit scale ranges from -100 to 100, inclusive.

Note: The benefit value is a subjectively determined value on a scale where -100 is the worst possible outcome, 0 is neutral, and 100 is the best possible outcome. Refer to Table 1a for some examples.

Probability:

A ratio of the number of times an event will happen over a number of attempts. This value will always range from zero to one, inclusive.

Formula In Action

Assuming your assessed benefit value of finding a $20 on the street is 5 and the odds of doing so are 1:400 (1/400 = 0.0025 = 0.25%), the formula would work out like this:
Л = 5/sqrt(0.0025)
Л = 5/0.05
Л = 100
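
For anyone who would rather let a computer do the division, here is a minimal Python sketch of the formula. The function names luck and luckiness are my own for illustration, and this is not the code behind the online calculator. One assumption to flag: the sketch rejects a probability of exactly zero, because the formula would divide by zero there.

```python
import math


def luck(benefit: float, probability: float) -> float:
    """Return the luck, in Л, of one event: benefit / sqrt(probability)."""
    if not -100 <= benefit <= 100:
        raise ValueError("benefit must be between -100 and 100")
    if not 0 < probability <= 1:
        # Assumption: zero is rejected because it would divide by zero.
        raise ValueError("probability must be greater than 0 and at most 1")
    return benefit / math.sqrt(probability)


def luckiness(events) -> float:
    """Sum the luck of a series of (benefit, probability) pairs."""
    return sum(luck(benefit, probability) for benefit, probability in events)


# The $20 example: benefit 5, odds 1:400, so 5 / sqrt(1/400) = 5 / 0.05.
print(luck(5, 1 / 400))
```

The luckiness helper follows the definition above: it is just the sum of a series of luck calculations.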

Homework

Everyone loves story problems, so here’s one for you:
Lucy values her life more than anything in the world except for that of Mr. Turtle, her cat. She places the benefit of losing her life at -99. After a friend of a friend perished in a tragic futon accident, Lucy found out that the odds of such a thing happening to her are 1:4,473. Aside from the toilet, Lucy owns no furniture with a seat or table top higher than 18 inches because she feels this will ensure her safety. However, a fateful visit to Sears nullifies all of her protective efforts as she trips over a footstool and suffers a fatal concussion.

Calculate the Л for Lucy’s untimely death. Show your work.
Hint: Surprisingly, the value is negative.

Luck Calculator

Use this simple tool to quantify the luck for a particular event.


Inputs: Benefit (range: -100 to 100) and Probability (range: 0 to 1)
Output: Л

Table 1a: Example relative benefits

Event                            Benefit
Senseless death                  -100
Identity stolen                  -50
Broken tailbone                  -25
Fender bender                    -10
Goldfish dies                    -5
Paper cut                        -1
It’s Thursday the 12th           0
Two toys in your happy meal      1
Flowers from an admirer          5
No cavities                      10
No red lights for a whole day    25
Bowl a 300                       50
Save Oprah’s life                100

Superdouche Your CSS

Thursday, January 18th, 2007

After just one or two revisions, your site’s CSS can get pretty cluttered with redundant content and inconsistent formatting. I’ve written a simple tool called the CSS Superdouche that programmatically rewrites your CSS, removing all superfluous elements and reformatting it in an attractive manner.

The CSS Superdouche is capable of streamlining already highly optimized CSS. It attempts to detect whitespace-stripped code and, when that is necessary to shrink file size, strips the whitespace from its own output as well.

Check out the CSS Superdouche

A More Novel Approach to Avoiding Spam

Monday, September 18th, 2006

Don’t want spammers emailing you through your website’s contact link? The answer is simple: Forget email. Give the internet your phone number.

There are a number of services available which provide a cheap or free voicemail box with a number based in just about any major city. I’ve used k7.net to create my own voicemail/fax line that feeds straight into my email box.

If you’d like to reach me, feel free to give me a call any time, 24×7 at 206.666.3187. If you ever lose the number, don’t worry. You can find it on the front page of my website.

Protect Your Bandwidth From Leeches

Sunday, March 26th, 2006

If you have limited bandwidth and host images that other sites frequently link to, you will benefit from my anti-leech script generator.

This easy-to-use page will quickly generate a mod_rewrite script that you store in an .htaccess file in the root of your website or the directory you wish to protect. I’ve added images.google.com to the allowed domains by default so you don’t block innocent searchers.
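
To give a feel for what such a generator produces, here is a rough Python sketch that builds a set of mod_rewrite rules from a list of allowed domains. It is not the code behind my generator page. The function name build_antileech_rules, the default extension list, and the example.com domain are placeholders made up for illustration. The directives it emits (RewriteEngine, RewriteCond on HTTP_REFERER, and RewriteRule with the F flag) are standard mod_rewrite.

```python
import re


def build_antileech_rules(allowed_domains, extensions=("gif", "jpg", "jpeg", "png")):
    """Build mod_rewrite rules that refuse image requests from unlisted referers."""
    lines = [
        "RewriteEngine On",
        # An empty Referer (direct visits, some proxies) is always let through.
        "RewriteCond %{HTTP_REFERER} !^$",
    ]
    for domain in allowed_domains:
        pattern = re.escape(domain)
        # One condition per allowed domain; subdomains of it are allowed too.
        lines.append(
            f"RewriteCond %{{HTTP_REFERER}} !^https?://([^/]+\\.)?{pattern}(/|$) [NC]"
        )
    ext_group = "|".join(extensions)
    # [F] answers 403 Forbidden; [NC] makes the extension match case-insensitive.
    lines.append(f"RewriteRule \\.({ext_group})$ - [F,NC]")
    return "\n".join(lines) + "\n"


# example.com stands in for your own site; images.google.com is allowed as well.
print(build_antileech_rules(["example.com", "images.google.com"]))
```

The conditions are ANDed together, so an image request is refused only when it carries a referer and that referer matches none of the allowed domains.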

Gmail Invite Spooler Post-Mortem

Thursday, March 23rd, 2006

Nine months after closing down the Gmail Invite Spooler, the page remains one of the most popular landing pages on my site. Over the past several months, this page has averaged around 2,500 unique visitors a day. I’ll explain the arc of this wonderful service, but first I’d like to make one thing very clear:

Sorry, I do not have any Gmail invites. Please don’t ask me for Gmail invites. I am truly sorry that I cannot provide you with any. Please go to Google Mail for more information on how you may get your own account.

You may obtain an account without an invite these days. All you need is a cell phone.

Background
In 2004, Gmail was a very hot commodity. From April 1st of that year, people had been clamoring to get in on the exclusive beta of pre-IPO Google’s hottest new offering. By late summer, when I decided to write the spooler, Gmail invites were no longer selling for $100 or more on eBay, but the internet was still cluttered with people asking for or offering invites.

The forums and blogs that I visited were littered with chatter about trading invites, but the givers and seekers didn’t seem to be coming together efficiently. It wasn’t uncommon to see multiple posts back to back asking for and offering invites.

My wife may not always like it, but when I see a problem, my mind immediately gets to work on a solution. This was one problem that I knew I could make a simple fix for in a matter of hours.

The First Incarnation
Throughout its lifetime, the basic workings of the page remained the same: People with Gmail invites would send them to a specific email address. The spooler would then read those emails and store the invites in a database. Site visitors could come and claim available invites on a first-come, first-served basis. There was no backordering of invite requests. When demand exceeded supply, one had to wait until someone else donated some.

Originally the spooler was made solely for the people on the forums where I had noticed the most trouble with inefficient offers and requests. It was a very simple system that was only workable on a small scale, but I assumed it would only ever see a few hundred hits.

On the first day that I had the spooler open, I received 2,592 Gmail invites. The second day saw 4,574 more coming in. By the end of the second day, I had over 3,000 unclaimed invites.

It didn’t take long for word of the “magical free Gmail site” to leak out to the general internet. Within a few days, demand exceeded supply and I had to implement controls on the page to prevent people from refreshing constantly while waiting for a new invite to come in. I also got the first of many lessons in writing code with scaling in mind as I divorced the mailbox checking from page loading.

Ups and Downs
After a month of running the service, the average inbound invites per day dropped below 1,000 for the first time. It seemed that most of the people who had extra invites on hand had heard about the service and donated all they were willing; Google was not giving out new invites on a regular basis at that time. The inbound invites continued to decline through most of December 2004 until they hit a low around 50.

All this time, demand for invites remained strong. I recorded as many as 100,000 visitors and over a million hits per day. On December 20th, the drought was over as Google started to give Gmail users about five fresh invites each day. The average day saw around 2,500 new invites, but they were still being snapped up as soon as they came in. I implemented more restraints to prevent abuse and further streamlined my code in order to keep my server load at a reasonable level. During this time, my web statistics began to break down because Webalizer couldn’t process all of the data without choking.

Way, Way Up
On February 2nd, 2005, Google decided to open the flood gates. They began giving out around 100 new invites per day to Gmail users. My service experienced demand increases like I’d never seen before. For the first time, I was forced to benchmark my code and decide which methods to use based on how many milliseconds they took.

For only the second time, supply was greater than demand. Anyone wanting a Gmail invite could get one through my service without any delay. Unique visitors increased, but hits dropped way down since users had no need to refresh frequently to see if new invites had arrived.

Way, Way Down
Monday, June 6th, 2005 was the day I received an email from Stephanie Hannon, Gmail’s Product Manager. Later that day, I had a conference call with Stephanie and her superior regarding my service. They felt that services like mine had become a threat to the quality of Gmail. They had several reasons for making the service invite-only:

• Limit new subscribers
• Heighten demand and curiosity
• Limit accessibility of accounts to potential abusers

The last reason was the one that made them care about my site. Without the spooler, spammers and abusers face a higher barrier to entry. Although I think Google should do more on their part to prevent automated account creation and duplication, it is true that more random people gain access to invites through a service like mine.

In short, Google felt that too many spammers and abusers were obtaining invites through me and saw this as a threat.

Why I Pulled the Plug
I’ve received a few thousand emails asking for invites, complaining about how “unfair” this is, or asking for source code. In the early days after pulling the plug, I would respond to every request with an individually written response explaining the situation. This generated many replies suggesting I just re-open the system in defiance.

Aside from the fact that I really don’t wish to burn any bridges with Google (heck, maybe they’d forget all this and hire me if I ever applied), I have good technical reasons for not re-opening the spooler: My service relied on people with Gmail accounts constantly inviting the now blocked email address gmail@isnoop.net.

Google is no dummy. They know full well that they must track the email addresses that the invites are sent to. They can (and did) automatically invalidate every invite sent to my site. All 1,240,162 invites I had left over the day I shut the service down instantly became duds. To continue the service, I would have to change the method of catching new invites to one substantially more inconvenient for the donor.

In the end, insisting on keeping the spooler open would certainly have summoned the massive lawyering machine deep within the “don’t be evil” company, and I don’t think any reasonable person wants that fight.

Fast Forward to Today
The former Gmail Invite Spooler page is now a brief testament to what was once the most popular Gmail invite spooler on the internet.

The bulk of the current 2,500 visitors per day come from non-English-speaking blog sites that haven’t yet gotten the message that the page is closed. While the rest of isnoop.net has a 66% US visitor rate, the spooler gets only 17% of its traffic from the US, which still holds the #1 slot by less than one percent.

Almost all of the dozens of emails and stray blog posts requesting invites ask the same thing (in broken English). I saw the need for folks who didn’t speak my native language to get the full story, so I wrote a simple script to help them out. This has helped reduce the confused request flow, but it has also crimped the last of my dwindling AdSense revenue. Oh well. I ran this site before it ever brought me a penny and I’ll continue to do so for as long as I have the energy.

I turn down all requests for the source code for the spooler. If Google doesn’t want me starting fires in their back yard, I’m certainly not going to give away my matches to all of the other neighborhood kids.

I have considered revamping the spooler for use with other invite-only services, but I’ve yet to see one of great enough popularity and of proper nature to justify the effort. I refuse to open up such a thing for a community-based website on the principle that it breaks the “six-degrees” network they’re trying to build up by bringing in random people with no association to the inviter.

Media Coverage
Aside from a number of blogs and forums that mentioned the service, these are the print media references I am aware of:

Book: Google Search & Rescue for Dummies – 2005
Text

Book: Google Hacks – 2005
Text

Popular Science Magazine – June 2005
Close-up
Full page

The Mercury News (San Jose) – May 23, 2005
Online version

PC World – April 13, 2005
Online version

Sydney Morning Herald – April 9, 2005
Close-up
Full page
Online version

Are you seriously still reading this?
This post covers almost all of the points I regularly discuss with folks who have questions about the service. I hope this overly long post has satisfied your curiosity.

Whatever you do, don’t click my ads!

Wednesday, March 22nd, 2006


If you are an AdSense user, you may have seen this email:

Google AdSense Policy Enforcement
Hello,

While reviewing your account, we noticed that you are currently displaying Google ads in a manner that is not compliant with our policies. For instance, we found violations of AdSense policies on pages such as http://isnoop.net/gmail/

Publishers are not permitted to encourage users to click on Google ads or bring excessive attention to ad units. For example, your site cannot contain phrases such as “click the ads,” “support our sponsors,” “visit these recommended links,” or other similar language that could apply to the Google ads on your site. Publishers may not use arrows or other symbols to direct attention to the ads on their sites, and publishers may not label the Google ads with text other than “sponsored links” or “advertisements.”

Please make any necessary changes to your web pages in the next 72 hours. (truncated…)

It’s nice of them not to bring the hammer down on me for having text that said “Please patronize our fine sponsors,” but it’s even more interesting to see where different ad companies draw the line.

The previous text was officially approved for use on my site when I was serving up AdBrite ads. In fact, AdBrite called me on the phone one morning to ask me to change it from the original text which read something like “Please support this service by visiting the sites below.” I assumed that sort of direct phrasing was frowned upon, but I wasn’t sure and ignorance is bliss.

I assume that Google would disapprove of me posting “Whatever you do, don’t click my ads!” above my AdSense, so that’s why I’m not going to do it. Instead I’ll just publish this blurb about making nice for the kind folks who might just pay me a few dollars towards the costs of running this dedicated server.

Whatever you do, don’t click my ads.

On Geocoding

Thursday, March 9th, 2006

I learned some valuable lessons in high traffic geocoding this week. All this because Google doesn’t offer geocoding services for Google Maps, so you must send them latitude/longitude numbers for any point you want to plot.

This raises the question: How do I quickly come up with the lat/lon coordinates for Shanghai, Anchorage, Indianapolis, Portland, and Seattle? Google does provide a handy link to a Google search for “free geocoder” in their maps API documentation, but none of the ones I’ve found has a decent API, works for free, or can sustain the amount of traffic I might request. I’d greatly prefer owning a database and performing the lookups under my own processing power.

The answer I came up with for the low-demand Seattle Emergency Events Map was to screen scrape Google’s own mapping service to see what coordinates they come up with for a given location. It wasn’t pretty, and it wasn’t mine, but I was already using Google so what the heck.

That solution worked beautifully until I got 20,000 visitors to my Maps + RSS package tracking page on Monday of this week. Apparently, Google doesn’t appreciate being hit that much. They temporarily shut down access from my server’s IP to the page I was scraping with a message indicating they’d detected excessive automated behavior. They said something about my tools maybe being a virus. They also kicked my mom in the shin.

When I was notified that the Google-scraping geocoding wasn’t working anymore (never screen scrape without setting up a failure mechanism), I pulled the code and provided a nice message for my site’s visitors. Google dropped the block shortly thereafter, and I hear they gave my mom flowers and apologized for that regrettable shin thing.

I checked out various solutions, trying to find a geocoding database that suited my needs. The US Census TIGER database was far too in-depth and only dealt with US locations. I ended up deploying a commercial IP-to-location database that contains the coordinates for any city that has an IP range associated with it.

Google employees, please skip the following paragraph.

My current geocoding solution involves a lookup in the ip2location tables. If I cannot find a position from there, I check a database cache of locations obtained from Google. If that fails, I scrape the location from Google Maps and cache it for future reference. If that fails for any reason, I go back to the ip2location database and make a darned good guess as to where to point. This typically means centering on a state or even entire country, but it’s better than nothing. This method results in very low traffic to Google, but my goal is zero external reliance.
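
The fallback chain described above can be sketched in a few lines of Python. This is only an illustration of the ordering, not my production code: the function geocode and its four parameters (ip2location, cache, scrape, region_guess) are hypothetical stand-ins for the real database lookups and the scraper.

```python
from typing import Callable, Dict, Optional, Tuple

Coords = Tuple[float, float]
Lookup = Callable[[str], Optional[Coords]]


def geocode(
    place: str,
    ip2location: Lookup,        # hypothetical: exact lookup in the local tables
    cache: Dict[str, Coords],   # hypothetical: locations already obtained by scraping
    scrape: Lookup,             # hypothetical: the external, rate-limited scraper
    region_guess: Lookup,       # hypothetical: coarse state or country center
) -> Optional[Coords]:
    """Try each source in order, touching the external site as rarely as possible."""
    coords = ip2location(place)
    if coords is not None:
        return coords
    if place in cache:
        return cache[place]
    try:
        coords = scrape(place)
    except Exception:
        # A blocked or broken scrape must never take the page down.
        coords = None
    if coords is not None:
        cache[place] = coords   # remember it so the next visitor skips the scrape
        return coords
    return region_guess(place)
```

Because a scraped result is cached the moment it arrives, each distinct location costs at most one external request, which is what keeps the outbound traffic so low.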

This geocoding method shouldn’t be long-lived. I plan on converting a copy of the TIGER database for US addresses and purchasing a listing of a few million world locations. I’m always in favor of saving money, so if anyone knows of a free world cities geocoding database, or already has the TIGER database converted to a query-able format, please let me know.

Once I’ve got a satisfactory geocoding system built up, I’d like to open the access and make a public API. That’s down the road a little way, but keep your eyes open for that.

Package Tracking With Google Maps

Sunday, March 5th, 2006

I’ve just published an update to my universal package tracking tool that now enables you to view a map of your package’s progress as it travels across the country.

This new mapping addition builds on the original features of being able to track UPS, FedEx, USPS, and Airborne/DHL packages all in one place and having that tracking information published into a personalized RSS feed. The system automatically detects which company your tracking number belongs to and loads the package data for you.

A nice side benefit of this new addition is that I’m developing a pretty robust Google mapping class, helping my other map projects to evolve.

Make Your Site Easily Translatable With a Little JavaScript

Friday, February 24th, 2006

Despite the advances of the Internet, apparently some news still travels slowly. I closed down my Gmail Invite Spooler page months ago, but I’m still getting hundreds of unique visitors to that page along with about a dozen email requests for Gmail each day. Almost all of the emails and traffic are from foreign countries, so I devised a simple JavaScript snippet that will allow folks to more easily translate the page into their own language.

You can use this very same JavaScript. It should work on any site just by including the following:

Here’s what it will look like:

Compare Average Flight Costs by City

Friday, February 24th, 2006

In order to help a convention get off the ground, I built a tool that scraped all of the prices for round trip flights from every American and Canadian airport to eleven major North American cities.

According to my Costimator, Washington DC is the cheapest major destination, and NYC’s LaGuardia airport is the cheapest origin airport in North America.

You can enter airport codes and view the average cost to major destinations along with the percent difference from the average cost.