Tech Lab

Software puts captions on the real world

By Hiawatha Bray
Globe Staff / September 24, 2009

The world around us is getting something it’s needed for a long time: captions.

Yes, captions - the explanatory text that accompanies images in newspapers and magazines, and on Web pages.

Engineers are devising ways to attach captions to pretty much everything in the world around us, while providing cheap and easy ways to see them. It’s a technology called “augmented reality,” or AR, and early versions of it now run on smartphones that use Google Inc.’s Android operating system and the latest iPhone 3GS from Apple Inc.

Don’t expect too much; these first-generation apps are fairly crude. But they offer a fascinating first taste of life in an annotated world.

Augmented reality is all about superimposing digital information onto the real world. You start with some kind of viewer, like a TV or the video screen of a cellphone, and AR software pastes extra data on top of the on-screen image. If you watch football, you’ve seen the yellow first-down marker, or the down and distance readout that looks painted onto the grass. That’s AR in action.

One application of AR comes from an unexpected source: the Postal Service. Go online to try its clever little system for choosing the right box for Priority Mail shipping. The site displays virtual 3-D Priority Mail shipping boxes in three sizes. Point a webcam at an item you want to ship, and the program will superimpose the boxes on the image, matching them up so you can choose the ideal size. Since smaller boxes cost less, this little AR application can save you some money.

But AR gets really interesting when it goes mobile. Imagine strolling around Boston with a cellphone that recognizes its surroundings. Turn on the phone’s video camera and you see a real-world image, overlaid with background data about nearby landmarks, restaurants or hotels.

That’s the idea behind a couple of smartphone apps I’ve recently tested, including Layar, software created by programmers in the Netherlands that runs on Google Android phones. If you’ve got an Apple iPhone 3GS, you can check out an app from the local business search service Yelp. Both are free, both are fun. But both also show mobile AR still needs a lot of work.

Cellphones don’t have nearly enough computing power to identify objects on sight. Instead, an app like Layar uses the phone’s GPS unit and compass to figure out where you are and what the phone’s camera is aimed at. Then it compares that information to a database of interesting landmarks, such as the online encyclopedia Wikipedia. If you’re standing in Boston’s City Hall Plaza, Layar’s Wikipedia database knows Faneuil Hall is nearby. Point your cellphone in its general direction, and you’ll see an on-screen marker. Touch the screen, and you get a flood of Faneuil facts.
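For the curious, here is a rough Python sketch of that idea: take a GPS position and a compass heading, then work out which landmarks in a list fall inside the camera's view. It is not Layar's actual code. The function names, the 60-degree field of view and the coordinates are all assumptions made up for illustration.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in degrees (0 = north, 90 = east)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def in_view(heading, target_bearing, fov=60.0):
    """True if the target bearing lies within the camera's horizontal field of view.

    fov is an assumed value; real phone cameras vary.
    """
    # Signed angle between heading and target, wrapped into [-180, 180).
    diff = (target_bearing - heading + 180.0) % 360.0 - 180.0
    return abs(diff) <= fov / 2.0

def visible_landmarks(lat, lon, heading, landmarks, fov=60.0):
    """Return the names of landmarks the camera is roughly pointed at."""
    return [name for name, llat, llon in landmarks
            if in_view(heading, bearing_deg(lat, lon, llat, llon), fov)]

# Hypothetical landmark database with rough, illustrative coordinates.
LANDMARKS = [
    ("Faneuil Hall", 42.3600, -71.0568),
    ("Boston Common", 42.3550, -71.0656),
]

if __name__ == "__main__":
    # Assumed position near City Hall Plaza, with the phone pointed roughly east.
    print(visible_landmarks(42.3604, -71.0580, 100.0, LANDMARKS))
```

Notice that nothing here looks at the camera image. The phone only needs to know where it is and which way it is pointed, which is why the approach works on hardware too weak to recognize a building on sight.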

The iPhone Yelp app is more limited; indeed, you’re not supposed to use it at all. It’s a hidden feature, known in the software trade as an Easter egg, and it works only on the iPhone 3GS, because it’s the only iPhone with a compass. To activate the hidden feature, called Monocle, install the Yelp app, then shake the phone vigorously. Now view the world through the iPhone’s video camera, and you’ll see pointers to a variety of local businesses that have been reviewed by Yelp users.

The AR software often highlights landmarks that are miles away, far out of camera range. And the video overlay points you directly toward the location, as if you could walk in a straight line through Boston. Layar has a feature that will draw a Google map of the destination, but it lacks turn-by-turn directions on how to get there. Also, remember that GPS and compass data are only accurate to within a few dozen feet. So when you get close to a destination, the data overlay often appears on the other side of the street or down the block.
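A little trigonometry shows why the markers drift when you get close. If the phone's position could be off by some distance in any direction, the worst-case error in the direction to a target grows sharply as the target gets nearer. The Python sketch below is only an illustration. The 10-meter position error is an assumed stand-in for "a few dozen feet," and compass error, which would add to the problem, is ignored.

```python
import math

def worst_case_bearing_error_deg(position_error_m, distance_m):
    """Largest possible error in the direction to a target, in degrees,
    when the phone's true position lies somewhere within a circle of
    radius position_error_m around the reported position."""
    if distance_m <= position_error_m:
        # The target could be inside the circle of uncertainty,
        # so it could lie in any direction at all.
        return 180.0
    return math.degrees(math.asin(position_error_m / distance_m))

if __name__ == "__main__":
    # Assumed 10-meter position error at several distances to the target.
    for distance in (1000.0, 100.0, 20.0, 5.0):
        error = worst_case_bearing_error_deg(10.0, distance)
        print(f"{distance:7.0f} m away: up to {error:.1f} degrees off")
```

From far away the error is a fraction of a degree, too small to notice on screen. From across the street it can be tens of degrees, enough to put the caption on the wrong building.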

Even so, Layar or Yelp will help you quickly find nearby points of interest. And Layar lets developers build new databases and add captions to just about anything. One database, called HotPads, highlights housing for sale or rental; another maps nearby hotels; yet another is a Yellow Pages guide, where you can look up any kind of business and see its general location highlighted on the video screen. With this one, it takes about 10 seconds to find every hardware store or pizza joint in a 5-mile radius.

Besides, AR systems are bound to get better. Engineers are working on ways to add phone-readable bar codes to buildings and landmarks. If the concept catches on, travelers can start leaving the guidebooks at home. Their smartphones will read any city like a book.

Hiawatha Bray can be reached at