billso.com

Bill Sodeman writes about management, mobile computing and information systems


Entries tagged as 'captcha'

RIP CAPTCHA

all

Posted Friday, 18 July 2008

Long-time readers of billso.com may remember that I used reCAPTCHA to validate comments about my articles. reCAPTCHA is a web service that shows users pictures of two words. The service already knows one of the words. The second word comes from an electronic book scanning project that needs help with its quality control. reCAPTCHA sends the results back to the scanning project, to help them fix their documents.
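Here is a minimal Python sketch of that two-word idea. It is not reCAPTCHA's actual code: the function names, the in-memory vote store and the three-vote threshold are all my own assumptions, chosen only to illustrate how a known word can vouch for a reading of an unknown one.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: for each unknown scanned word,
# count how many users proposed each reading.
votes = defaultdict(Counter)


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


def check_answer(known_word, known_reply, unknown_id, unknown_reply):
    """Return True if the user passes; record their reading of the unknown word."""
    if normalize(known_reply) != normalize(known_word):
        # Failing the known word means the second reading is not trusted.
        return False
    votes[unknown_id][normalize(unknown_reply)] += 1
    return True


def best_reading(unknown_id, min_votes=3):
    """Return the most common reading once enough users agree, else None.

    The threshold of 3 is an assumption made for this sketch.
    """
    if not votes[unknown_id]:
        return None
    reading, count = votes[unknown_id].most_common(1)[0]
    return reading if count >= min_votes else None
```

A user who types the known word correctly is treated as human, and the answer for the second word becomes one vote toward correcting the scanned book.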

This is not a working CAPTCHA. It's a Flickr image courtesy of Mess of Pottage.

A CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) system is a simple test that determines whether a computer user is a machine or a human. CAPTCHAs are small puzzles that people can solve quickly, while being too expensive for a computer system to solve.

I dropped the reCAPTCHA feature in May 2008, because the system was not stopping comment spam from appearing on my blog. “Comment spam” is just messages that have little or no relevance to an article or page.

In the past, people who wanted to crack a CAPTCHA system might pay users to stay at home and decipher dozens of captchas, in return for free content or Internet access. But people are slower and less reliable than computers. Processing power continues to improve, while CPU costs get lower.

Paying the price

Stephan Chenette, the manager of security research at Websense Security Labs, notes that CAPTCHA technology has made only incremental improvements since 2000, while CAPTCHA crackers bought faster hardware and invested more in their efforts:

“CAPTCHA has been broken for the last year and a half. The technology has really not progressed. They’ve got a little bit harder but the hackers have made programs that can easily break them. This works both with print and audio CAPTCHA. All of these have been broken in one way or the other.”

In the last few months, the CAPTCHA systems of several major web sites have been cracked by automated systems:

  • January 2008: Yahoo Mail
  • April 2008: Gmail and Hotmail
  • May 2008: Craigslist

This has resulted in a flood of spam, scams, and fake postings on services around the world. It’s become quite easy to create a fake Web site that can fool many users. Social networks like MySpace and Facebook offer many more opportunities to trick users into revealing their credentials and personal information.

In the last few years, financial service companies and banks have adopted multifactor authentication systems that ask users for more than a password or a CAPTCHA solution. Now organizations in other industries are looking at similar solutions, because it has become much less expensive for scammers and crackers to break these companies’ systems. Several OpenID providers have added multifactor features to their authentication systems, too.

This article called How CAPTCHA got trashed has more details.

Image courtesy of Mess of Pottage through a Creative Commons license.


Tags: captcha, crime, email, Google, government, hardware, innovation, Microsoft, privacy, spam, university, usability, Yahoo

Digital TV is coming

ism tech

Posted Tuesday, 25 March 2008

Read 1 comment

Yesterday, the Honolulu Advertiser published an article about digital TV conversion. On 17 February 2009, US television stations will stop broadcasting analog television signals. On that date, anyone in the US who uses an antenna to receive their television signal on their analog television will need a digital converter box to receive broadcast signals. Cable and satellite subscribers have or will get converter boxes as part of their service agreement. All televisions manufactured for sale in the US after 1 March 2007 are required to have a digital tuner, so these models don’t need a converter box. The AP has an article with additional details.

I’ve discussed the FCC’s 700 MHz auction on 18 March 2008 and 30 January 2008. When the analog television channels are abandoned, AT&T, Verizon and other companies will use those frequencies for mobile phone and data services.

The US Department of Commerce has a web site with information on the DTV conversion, as does the FCC. Government regulators and consumer activists fear that cable and satellite companies will use digital television to scare up new subscribers. Another AP article states that Hispanics are the ethnic group most likely to lose television service after the conversion, even as the Federal government gives away several million coupons for digital converter boxes. Hawaii has a diverse population, and getting the message out in multiple languages will be challenging. I expect to see more articles in the local papers, especially in early 2009, even though the Advertiser claims that only 5.5% of the state’s television viewers rely on broadcast signals.

Digital TV converter boxes won’t turn an old analog set into a higher-definition TV, of course. These boxes have a digital TV tuner that passes its output to an analog TV on channel 3 or 4, like a video game console would do.

Yahoo reports that broadcasters will be required to run public service advertising, in an effort to notify viewers well before the cutover. The coupon request page uses reCAPTCHA – the same system I use to screen out spam comments on this blog.

Tags: cable, captcha, comments, dc, FCC, hardware, Hawaii, ISP, spam, system, television, time

Avoiding the splogs

all

Posted Thursday, 13 March 2008

A post by WordPress founding developer Matt Mullenweg claims that 80 percent of the world’s blogs are actually spam blogs. Kevin Burton claims this number is as high as 90 percent, and that most of the spam blogs are hosted on Google’s free Blogger service.

I use a publicly-accessible blog to run my courses because it is easier for my students to access this site. I’ve tried hiding blog articles behind a password-protected walled garden like Moodle or WebCT in the past, and that was more trouble than the effort was worth.

I devote a couple of hours each day to this site, because it’s a great place to post the examples, articles and links I discuss in my teaching and consulting engagements. Over the last year, I’ve learned a lot about how the splogosphere works.

The splog business model

Most of these splogs use a similar model: automated scripts search the Internet for keywords in legitimate blog posts and RSS feeds, such as my web site. There are between 8 and 14 million active blogs on the Internet, according to Matt’s estimates. The rest of the 100 million blogs are splogs that use software to scrape the first few lines of another blog’s articles, and then post an excerpt on the spam blog’s web site. Many spam blogs also try to leave trackbacks or spings on legitimate blogs, in an effort to draw visitors away from the real blogs.

Splog operators have thin profit margins, so they usually operate dozens or hundreds of sites. Sites earn revenue from keyword-based advertising links on their splog pages, as well as links to advertising-heavy web sites.

Not for human consumption

Most splog operators also try to get high rankings in search engines, so that Google users will see the splog articles before they find the original posts. Splogs are written for search engines to read, not real people. Plagiarism Today ran one of the earliest articles about the splog business model, back in 2005. This Wired article came out a month earlier, and has some additional information.

Fighting the sploggers is easy

I’ve taken a few simple steps to keep sploggers from scraping my articles and leaving linkbacks to their sites, without wrecking my own web site in the process. It takes me about 10 minutes each week to manage these tools.

My comment forms require users to complete a reCAPTCHA verification form. This step eliminates almost all of the spam blogs that I’ve caught in my server logs. The only drawback with reCAPTCHA is that many mobile web users cannot leave comments.

I also run anti-splog software that uses an Internet-based list of known splogs to identify and quarantine spurious linkbacks.
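The blocklist check can be sketched in a few lines of Python. This is only an illustration, not the software I run: the function names and the sample domains are hypothetical, and I assume the list is simply a set of domain names.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lower-case host name of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host


def triage_linkbacks(linkback_urls, known_splog_domains):
    """Split linkback URLs into (approved, quarantined) lists.

    known_splog_domains stands in for the Internet-based list; its
    format here (a set of domain names) is an assumption.
    """
    approved, quarantined = [], []
    for url in linkback_urls:
        host = domain_of(url)
        # Match a listed domain itself or any of its subdomains.
        listed = any(
            host == domain or host.endswith("." + domain)
            for domain in known_splog_domains
        )
        (quarantined if listed else approved).append(url)
    return approved, quarantined
```

Quarantining instead of deleting lets me rescue a real blogger's linkback if the list gets one wrong.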

Another piece of software searches for splog entries based on my articles. Occasionally, I leave comments on splog posts that are based on my articles, just to let them know I caught them.
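One way such a search could decide that a page was scraped from an article is to compare short runs of words. The Python sketch below is my own illustration under that assumption, not the tool I actually use; the five-word run length and the function names are hypothetical.

```python
import re


def words(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, size=5):
    """Return the set of every run of `size` consecutive words."""
    tokens = words(text)
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def copied_fraction(article_opening, candidate_page, size=5):
    """Fraction of the opening's word runs that also appear on the page."""
    original = shingles(article_opening, size)
    if not original:
        # The opening is shorter than one run, so there is nothing to compare.
        return 0.0
    return len(original & shingles(candidate_page, size)) / len(original)
```

Because splogs usually copy only the first few lines of a post, comparing against the opening of each article is enough; a result near 1.0 suggests a scraped excerpt.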

My posts are also copyrighted under a Creative Commons license, as I discussed on 21 February 2008. I love it when real bloggers link back to my articles, as long as they give me credit for my writing.

Tags: advertising, blogging, business_model, captcha, copyright, e-commerce, mobile, software, spam, teaching

The mobile web and billso.com

all

Posted Tuesday, 5 February 2008

Read 4 comments

This site is now available in a mobile web format at http://m.billso.com/ – please give it a try with your mobile phone or PDA.

Apple iPhone users can view this site in its regular desktop mode at billso.com, or try the mobile version.

As I mentioned on 27 November 2007, the mobile web is not quite ready for the masses yet. There is no standard URL for mobile web sites, for example. Some sites like Facebook use “m.” as a subdomain that serves up a mobile site. Other mobile sites are using the .mobi top level domain. I have a short list of mobile web sites at http://billso.com/mobile/

I own http://billso.mobi and I’ve set that name to redirect to http://m.billso.com
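The redirect itself is simple to describe in code. This Python sketch only shows the mapping from one host name to the other; it is not how my registrar or web server is configured, and keeping the path and query string during the redirect is an assumption I made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names that should be sent to the mobile site.
REDIRECT_HOSTS = {"billso.mobi": "m.billso.com"}


def redirect_target(url):
    """Return the URL to redirect to, or None if no redirect applies."""
    parts = urlsplit(url)
    new_host = REDIRECT_HOSTS.get((parts.hostname or "").lower())
    if new_host is None:
        return None
    # Assumption: keep the scheme, path and query; only the host changes.
    return urlunsplit(
        (parts.scheme, new_host, parts.path or "/", parts.query, parts.fragment)
    )
```

A web server would send the returned URL in the Location header of an HTTP 301 response.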

It’s difficult to design web sites that resolve well on small screens, especially given the number of different devices, platforms and carriers that exist in the mobile Internet market.

Difficult does not mean impossible

I’ve tweaked my web site with some WordPress plug-ins. Plug-ins are prepackaged files of PHP programming code that third-parties have written to extend the WordPress blog software. I’ve made m.billso.com work on several hundred pages of content with 3 hours of effort.

The mobile version does load quickly on PDAs and phones, while preserving most of the site content. Those were my primary goals. I’m pleased with what I’ve accomplished using free software and web services.

Feel free to log on with a real computer and leave a comment about the mobile site. I’d like to know if the mobile version of this site is usable and useful for my readers.

A few of the site’s features do not work well on the mobile version. I’m looking for workarounds to address some of these problems.

  1. The menu on the top of each page becomes a long set of entries.
  2. The event calendar in the right sidebar turns into a single column of text. This happens with the standard WordPress calendar widget, too.
  3. Tables do not resolve well in mobile browsers, either. That’s one reason that the calendars on the Spring 2008 course pages are written in a boring text format.
  4. The scenic image at the top of each page shrinks a bit.
  5. Mobile users cannot enter comments. The reCAPTCHA plugin that I use to stop comment spam does not support mobile web browsers. The comment fields will appear on the mobile site, but comments will not be posted. I’ve seen very few mobile blogs that support comment entry, so I am not very worried about fixing this issue.
Tags: API, captcha, cloud, DNS, free, iPhone, mobile, spam, usability, WordPress

Email and print links

all

Posted Monday, 4 June 2007

Thanks to Lester Chan, I’ve added links that will email and print posts and pages.

These links are available at the bottom of each post and page. The following screenshot provides an example from my APA formatting page:

email-print example

The email link allows users to email one post or page every ten minutes. It also uses a captcha system to foil villains.
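The ten-minute limit is a basic form of rate limiting. I have not read the plugin's source, so the Python sketch below is only my guess at the idea: the class name is hypothetical, and tracking senders by a key such as an IP address is an assumption.

```python
import time


class EmailThrottle:
    """Allow each sender one email per time window."""

    def __init__(self, window_seconds=600, clock=time.monotonic):
        self.window_seconds = window_seconds  # 600 seconds = ten minutes
        self.clock = clock                    # replaceable for testing
        self.last_sent = {}                   # sender key -> time of last allowed send

    def allow(self, sender_key):
        """Return True and record the send if the sender is outside the window."""
        now = self.clock()
        last = self.last_sent.get(sender_key)
        if last is not None and now - last < self.window_seconds:
            return False
        self.last_sent[sender_key] = now
        return True
```

A refused attempt does not restart the window, so a patient reader can send another post exactly ten minutes after the last successful one.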

The print link removes the sidebars and menus from the page, giving a clean look to the printed version.

This print link generates text that is easy to view on a phone or PDA. I’ve been using a mobilizing option on this blog, but that option can’t remove the menus or sidebars.

The print link also prints the actual URL for every link on the page or post. Some Web users like that particular feature.
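To show what that feature does, here is a small Python sketch that turns HTML into plain text and writes each link's URL after its link text. It uses Python's standard html.parser module and is not the plugin's own code; the class name and the square-bracket format are my assumptions.

```python
from html.parser import HTMLParser


class PrintableText(HTMLParser):
    """Collect page text, appending the URL after each link's text."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href_stack = []  # href values of the <a> tags currently open

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href_stack.append(dict(attrs).get("href"))

    def handle_data(self, data):
        self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            href = self.href_stack.pop()
            if href:
                # Assumed output format: link text followed by [URL].
                self.parts.append(f" [{href}]")


def printable(html):
    """Return the text of an HTML fragment with link URLs written out."""
    parser = PrintableText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

On paper a reader cannot click a link, so printing the address next to the link text keeps the reference usable.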

Tags: administrivia, captcha