Web Surfing in every language and the Ethnologue

MPLewis | April 1, 2015

We've mentioned before how some scholars are making use of the Ethnologue, but a major report on the state of global internet connectivity from Internet.org, a consortium led by Facebook, is one of the first to make extensive use of the Ethnologue Global Dataset. It's exciting for us to see how making the core data from our database accessible in this way can help in many areas of research.

The report, called State of Connectivity: A Report on Global Internet Access, examines the degree to which people around the world are becoming connected and, more importantly, looks at what hinders them from doing so. The report divides the issues into three major categories: infrastructure, affordability, and relevance. The first two are fairly obvious. If there isn't adequate infrastructure (hardware, wires, service providers), it's going to be difficult to get access to the internet. Similarly, if the cost of getting connected is too high, people just won't be able to connect. When the cost of internet access is as much as one earns in a day, it isn't very likely that many will take advantage of the "opportunity."

It's the third category, relevance, where the Ethnologue data is, um, er, well, relevant. The report notes that for some people there just isn't any content online that is relevant to them. Part of what enhances relevance (and in our view may even be crucial for it) is the availability of content in the language of the user. An interesting "sound bite" from the report is: "To provide relevant content to 80% of the world would require sufficient content in at least 92 languages." They got to that estimate by using Ethnologue's Global Dataset to calculate speaker populations.
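
For readers curious how an estimate like that might be worked out, here is a minimal Python sketch. It is an assumption about the general approach, not the report's actual method: rank languages by speaker population, then count how many are needed before the running total reaches the target share. The function name and the speaker counts below are hypothetical, made up purely for illustration, and the sketch ignores complications such as people who speak more than one language.

```python
def languages_needed_for_coverage(speakers_by_language, target_share):
    """Count the largest languages needed to reach target_share of all speakers.

    speakers_by_language: dict mapping a language name to its speaker count.
    target_share: fraction of the total speaker population to cover (0 to 1).
    """
    if not 0 < target_share <= 1:
        raise ValueError("target_share must be greater than 0 and at most 1")

    total = sum(speakers_by_language.values())
    threshold = target_share * total

    covered = 0
    count = 0
    # Walk through the languages from most to fewest speakers.
    for population in sorted(speakers_by_language.values(), reverse=True):
        covered += population
        count += 1
        if covered >= threshold:
            return count
    return count


# Hypothetical toy figures, NOT real Ethnologue data.
toy_speakers = {
    "lang_a": 500,
    "lang_b": 250,
    "lang_c": 150,
    "lang_d": 60,
    "lang_e": 40,
}

print(languages_needed_for_coverage(toy_speakers, 0.8))  # prints 3
```

With the toy figures, the three largest languages cover 900 of 1,000 speakers, which is the first point at which the 80% mark is passed. Run over a real table of speaker populations, the same loop would yield a count comparable in kind to the report's figure, though the exact number depends on how speakers are counted.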

The Ethnologue Global Dataset makes it possible for researchers to replicate the statistical tables that show up under the Statistics tab on the Ethnologue home page. It also enables them to create their own tables, link our data to their own, make comparisons, look for correlations, and be as creative as they can be in ways that previously weren't possible without a lot of manual labor. The license for the Global Dataset comes in two flavors. One of them will certainly meet your needs if you want to do research about language in relation to almost anything else.

Oh, and just to let you know, progress is being made to increase relevance online: our colleagues at UNTI (Unión Nacional de Traductores Indígenas) in Mexico launched websites in 11 indigenous languages of Mexico on the 20th of March. I don't know how much that lowers the percentage of the unconnected reported by Internet.org, or whether any of those 11 languages are even in the list of 92 that Internet.org considers to be top priority, but it is a positive development not only for the state of connectivity but most of all for the members of those language communities. If you are a Facebook user you can see the list of websites here.