InvestSMART

The Wikipedia hero and his bot

Sverker Johansson could be the most prolific author you've never heard of. And with 2.7 million articles to his name and counting, not everyone is happy with his methods.


Sverker Johansson can take credit for 2.7 million Wikipedia articles. Most were created using a computer program, or 'bot,' that he made. Ellen Emmerentze Jervell/The Wall Street Journal

Sverker Johansson could be the most prolific author you've never heard of.

Volunteering his time over the past seven years publishing to Wikipedia, the 53-year-old Swede can take credit for 2.7 million articles, or 8.5 per cent of the entire collection, according to Wikimedia analytics, which measures the site's traffic. His stats far outpace any other user, the group says.

He has been particularly prolific cataloguing obscure animal species, including butterflies and beetles, and is proud of his work highlighting towns in the Philippines. About one-third of his entries are uploaded to the Swedish language version of Wikipedia, and the rest are composed in two versions of Filipino, one of which is his wife's native tongue.

An administrator holding degrees in linguistics, civil engineering, economics and particle physics, he says he has long been interested in "the origin of things, oh, everything."

It isn't uncommon, however, for Wikipedia purists to complain about his method. That is because the bulk of his entries have been created by a computer software program—known as a bot. Critics say bots crowd out the creativity only humans can generate.

Mr. Johansson's program scrubs databases and other digital sources for information, and then packages it into an article. On a good day, he says his "Lsjbot" creates up to 10,000 new entries.

On Wikipedia, any registered user can create an entry. Mr. Johansson has to find a reliable database, create a template for a given subject and then launch his bot from his computer. The software program searches for information, then publishes it to Wikipedia.
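The database-plus-template workflow described above can be sketched roughly as follows. This is a minimal illustration, not Lsjbot's actual code: the template wording, the field names and the sample records are all hypothetical, and the publishing step is left out entirely.

```python
from string import Template

# Hypothetical template for a species stub; the wording is an assumption.
STUB_TEMPLATE = Template(
    "$name is a species of $group in the family $family. "
    "It was described by $author in $year."
)

# Hypothetical records standing in for rows from a reliable database.
RECORDS = [
    {"name": "Examplea prima", "group": "beetle", "family": "Exampleidae",
     "author": "A. Author", "year": 1900},
    {"name": "Examplea secunda", "group": "butterfly", "family": "Exampleidae",
     "author": "B. Author"},  # no "year" field, so this record is skipped
]

REQUIRED_FIELDS = ("name", "group", "family", "author", "year")


def build_stub(record):
    """Return stub text for one record, or None if a required field is missing."""
    if any(field not in record for field in REQUIRED_FIELDS):
        return None
    return STUB_TEMPLATE.substitute(record)


def build_stubs(records):
    """Map article title to stub text for every complete record."""
    stubs = {}
    for record in records:
        text = build_stub(record)
        if text is not None:
            stubs[record["name"]] = text
    return stubs


if __name__ == "__main__":
    for title, text in build_stubs(RECORDS).items():
        print(title, "->", text)
```

Skipping incomplete records rather than publishing a half-filled sentence is one design choice among several; a real bot would also need to check whether an entry already exists before posting.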

Bots have long been used to author and edit entries on Wikipedia, and, more recently, an increasingly large amount of the site's new content is written by bots. Their use is regulated by Wikipedia users called the "Bot Approvals Group."

While Mr. Johansson works to achieve consensus approval for his project, he and his bot-loving peers expect to continue facing resistance. "There is a vocal minority who don't like it," he said during a recent speech on his work. Still, he soldiers on.

"I'm doing this to create absolute democracy online," Mr. Johansson said recently while sitting in front of a computer at his office at Sweden's Dalarna University.

Wikipedia, he reckons, should someday be able to tell people everything about everything. His bot, which took him months' worth of programming to create, is a step toward achieving that goal sooner rather than later—even if the entries it creates are bare-boned "stubs" containing basic information.

Battling the bots

Achim Raschka is one of the people who would like Mr. Johansson to change course. The 41-year-old German Wikipedia enthusiast can spend days writing an in-depth article about a single type of plant.

"I am against production of bot-generated stubs in general," he said. He is particularly irked by Mr. Johansson's Lsjbot, which prizes quantity over quality and is "not helping the readers and users of Wikipedia."

Mr. Raschka says these items "only contain more or less correct taxonomic information, not what the animal looks like and other important things."

Others have echoed his concerns on public chat forums, comparing Mr. Johansson to "rambot," a bot used to add US county and city articles that contain only the most rudimentary information.

Wikipedia, a collaboratively edited Internet encyclopedia, is supported by the nonprofit Wikimedia Foundation. Volunteers write Wikipedia's 30 million articles in 287 languages. A representative didn't respond to requests for comment.

Boring but important

Mr. Johansson admits Lsjbot's entries can be boring, but argues there is value.

Take, for instance, an item the Lsjbot created for Swedish Wikipedia in 2012 on Basey, a city of about 44,000 in the Philippines. It contains coordinates, population and other details. At the top of the page, a disclosure notes the entry was created by Lsjbot.

Last year, when Typhoon Yolanda hit Southeast Asia, newspapers ran headlines saying people died in Basey. Mr. Johansson had already created his entry on Basey, allowing people to access a map, an image and information on where it is.

Mr. Johansson says the short stubs allow other writers—who know more about each particular subject—to fill in additional information later.

While bots are speedy and largely accurate, he says errors can happen.

For instance, his software could once read only information recorded in the Latin alphabet. When a project needed bird photos, the bot turned to the Russian version of Wikimedia Commons, which provides millions of free-to-use images.

But at the time, Lsjbot couldn't read Cyrillic and made mistakes. Mr. Johansson later altered the software to address the glitch.
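The article does not say how the software was changed. One simple guard, sketched here purely as an assumption, is to detect text containing non-Latin letters and set it aside for separate handling instead of parsing it as if it were Latin. The function names are hypothetical.

```python
import unicodedata


def is_latin_only(text):
    """Return True if every alphabetic character in text is a Latin letter."""
    for ch in text:
        if ch.isalpha():
            # Unicode names of Latin letters start with "LATIN"; characters
            # without a name fall back to "" and are treated as non-Latin.
            if not unicodedata.name(ch, "").startswith("LATIN"):
                return False
    return True


def split_by_script(titles):
    """Split titles into (latin_only, needs_other_handling)."""
    latin, other = [], []
    for title in titles:
        (latin if is_latin_only(title) else other).append(title)
    return latin, other
```

Accented letters such as those in Swedish names still count as Latin under this check, while Cyrillic captions land in the second list.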

Currently, the most likely subjects for bots to tackle are technical, ranging from chemical elements to astronomical objects. They can be helpful in broadening a subject, such as the animal kingdom. While tigers and elephants have plenty of entries, worms and insects are underrepresented, Mr. Johansson says.

Ellen Emmerentze Jervell - Wall Street Journal