Language Resources

MPLewis | July 1, 2013

In my second post to the Ethnoblog I noted that we had received considerable feedback about the apparent loss of the lists of language resources and publications associated with each language. You’ll recall that in previous online editions of the Ethnologue we included a listing of SIL publications for each language if there were any. In this edition, SIL publications in or about individual languages have been moved to www.sil.org and can still be referenced there.

This month we have added a new feature to the language entries by including an additional section on each language page called “Language Resources.” The content of that section consists solely of a link to a search of the OLAC database where a large and growing number of documentary resources on the languages of the world are identified. OLAC was created to identify, collect, and report those resources and goes well beyond the SIL Language and Culture Archives to include many other archives and repositories of language data. We see no point in our trying to duplicate that effort and are happy to support the OLAC project in this way.

So what is OLAC?

OLAC is the acronym for the Open Language Archives Community. Their website describes OLAC as “an international partnership of institutions and individuals who are creating a worldwide virtual library of language resources…” Established in 2000, OLAC now contains more than 190,000 records on more than half of the world’s languages. I should point out that there is at least one record in the OLAC database for every identified language in the world. That is, the Ethnologue’s brief summary description of each language is included as one of the language resources that OLAC tracks. We’re happy to participate in this cooperative effort and we encourage others to participate as well.

One of the key insights of OLAC is the notion of interoperability. That’s a bit of technical jargon that basically means that those of us who maintain language-related databases for various purposes ought to be able to make our databases work together to support and amplify each other. By so doing, we can avoid duplication of effort and share what we know more seamlessly with a broader audience. OLAC has developed a very useful metadata schema (another bit of jargon, I’m afraid) that helps us all describe the resources we have in our databases. By using a common set of descriptors for what we have, we can link our databases together and multiply their usefulness. The ISO 639-3 codes that we use to identify languages in the Ethnologue are one part of that metadata that enhances our ability to interoperate. We can search the OLAC database using the ISO code for any specific language, and the OLAC database will show us all of the resources it holds that have been tagged with that ISO code.
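For the technically inclined, here is a minimal sketch of how an ISO 639-3 code can serve as the shared key between two sites. It only builds a link; it does not query anything. The URL pattern and the function name `olac_language_url` are my assumptions for illustration, not a documented OLAC interface or the way the Ethnologue pages are actually built, so check the OLAC website before relying on them.

```python
import re

# Assumed URL pattern for OLAC's per-language pages (an illustration only;
# confirm the current pattern on the OLAC website).
OLAC_LANGUAGE_URL = "http://www.language-archives.org/language/{code}"


def olac_language_url(iso_code: str) -> str:
    """Build a link to OLAC's listing for one language.

    ISO 639-3 codes are exactly three letters, so anything else is rejected
    rather than turned into a broken link.
    """
    code = iso_code.strip().lower()
    if not re.fullmatch(r"[a-z]{3}", code):
        raise ValueError(f"not a three-letter ISO 639-3 code: {iso_code!r}")
    return OLAC_LANGUAGE_URL.format(code=code)


if __name__ == "__main__":
    # "swh" is the ISO 639-3 code for Swahili.
    print(olac_language_url("swh"))
```

Because both databases agree on the code, neither side needs to know anything else about how the other stores its records.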

Of course, you don’t need to go through the Ethnologue to get to the OLAC website or to use their very helpful search engine. We are happy to be able to provide quicker and easier access from the Ethnologue website for those who want to know more about what documentation exists (or at least has been archived in a participating archive) for the languages that we report on.

By the way, if you go to the OLAC website, you’ll see that OLAC began as an NSF-funded initiative of Steven Bird (then at the University of Pennsylvania) and our own Executive Editor, Gary F. Simons.

So what does this mean for you?

For one thing, if you are a student or a researcher, you have quicker and easier access to a rich catalog of resources that is growing and improving as more and more archives and individuals join OLAC.

If you are an author, linguist, or researcher who has produced documentary materials about a language and would like to make them known to others, OLAC provides the ideal place to reach a large audience and adds value to your data and research results by making them accessible, searchable, and interoperable. We often receive e-mail from field researchers asking us to list their publications in the Ethnologue. We don’t have the resources to compile and maintain a comprehensive list for all of the languages of the world. However, you can share your data and make your work known to an even larger audience by participating in OLAC using their mechanisms for submitting the metadata about your resources. Better yet, you can deposit those resources with a participating archive (there are 44 of them now) where they will not only be safe but also shareable.

And if you are an archivist or an administrator of an archive and your collection is not now included in the OLAC catalog, you should explore becoming part of this global community as a participating archive. As I’ve described above, using a global language documentation metadata standard increases interoperability and opens your collection to a broader audience. The OLAC Implementers’ FAQ page can help you.

Those lists of links we had to SIL language resources and publications are not lost after all. SIL’s Language and Culture Archives are participants in OLAC, so the lists of resources that we formerly published separately (and that are still accessible at www.sil.org) are included in the results when you click on the Language Resources link in each language entry.

Give it a try. I think you’ll like what you see.