31 January 2019

GOOGLE GIVES WIKIMEDIA MILLIONS—PLUS MACHINE LEARNING TOOLS


GOOGLE IS POURING an additional $3.1 million into Wikipedia, bringing its total contribution to the free encyclopedia over the past decade to more than $7.5 million, the company announced at the World Economic Forum Tuesday. A little over a third of those funds will go toward sustaining current efforts at the Wikimedia Foundation, the nonprofit that runs Wikipedia, and the remaining $2 million will focus on long-term viability through the organization’s endowment.

Google will also begin allowing Wikipedia editors to use several of its machine learning tools for free, the tech giant said. What's more, Wikimedia and Google will soon broaden Project Tiger, a joint initiative they launched in 2017 to increase the number of Wikipedia articles written in underrepresented languages in India, to include 10 new languages in a handful of countries and regions. It will now be called GLOW, Growing Local Language Content on Wikipedia.

It’s certainly positive that Google is investing more in Wikipedia, one of the most popular and generally trustworthy online resources in the world. But the decision isn’t altruistic: Supporting Wikipedia is also a shrewd business decision that will likely benefit Google for years to come. Like other tech companies, including Amazon, Apple, and Facebook, Google already uses Wikipedia content in a number of its own products. When you search Google for “Paris,” a “knowledge panel” of information about the city will appear, some of which is sourced from Wikipedia. The company also has used Wikipedia articles to train machine learning algorithms, as well as fight misinformation on YouTube.

Even efforts like GLOW—which will now expand to Indonesia, Mexico, and Nigeria, as well as the Middle East and North Africa—can help Google’s own bottom line. When the initiative first launched in India, Google provided Chromebooks and internet access to editors, while the Centre for Internet and Society and the Wikimedia India Chapter organized a three-month article writing competition that resulted in nearly 4,500 new Wikipedia articles in 12 different Indic languages. Smartphone penetration in India is only around 27 percent; as more people in the country start using Android smartphones and Google Search, those articles will make the tech giant’s products more useful. Wikipedia’s blog post announcing Google’s new investment makes this strategy fairly clear, noting that the company also provided Project Tiger with “insights into popular search topics on Google for which no or limited local language content exists on Wikipedia.”

Google is also providing Wikipedia free access to its Custom Search API and its Cloud Vision API, which will help the encyclopedia’s volunteer editors more easily cite the facts they use. Each time a Wikipedia editor adds a new piece of information to an article, they need to cite the source where they learned it. The Search API will allow them to quickly look up sources on the web without having to leave Wikipedia, while the vision tool will let editors automatically digitize books so they can be used to support Wikipedia articles too. Earlier this month, Wikimedia also announced Google Translate was coming to Wikipedia, allowing editors to convert content into 15 additional languages, bringing the total available to 121.

These machine learning tools will absolutely make it easier for Wikipedia to reach people who speak languages currently underrepresented on the web. But the encyclopedia is also the reason many AI programs exist in the first place. For example, Google-owned Jigsaw has used Wikipedia, in part, to train its open source troll-fighting AI. The encyclopedia is also used by hundreds of other AI platforms, particularly because every Wikipedia article is under Creative Commons, meaning it can be reproduced for free without copyright restrictions. Apple’s Siri and Amazon’s Alexa smart assistants use information from Wikipedia to answer questions, for instance. (Both companies have also donated to the Wikimedia Foundation.)

Google’s new investments in Wikipedia, specifically in GLOW, will address a genuine problem. The majority of Wikipedia’s tens of millions of articles are in English or European languages like French, German, and Russian. (There are also lots of articles in Swedish and two versions of Filipino, but most of these pages were created by a prolific bot.) As the estimated half of Earth’s population that still lacks an internet connection comes online, it will be important that reliable information is available in the native languages people speak. That doesn’t mean, though, that in helping solve these issues companies like Google or Facebook don’t also have something to gain.
