Language tags (RFC 5646) for Java

Language tags (RFC 5646) provide a neat method for indicating the locale of data in JSON object members, such as in the UserInfo messages of OpenID Connect. They also allow the same data to be represented in multiple languages and scripts.

Here is an example UserInfo object with en-US language tags:

{
 "user_id": "248289761001",
 "name#en-US": "Jane Doe",
 "given_name#en-US": "Jane",
 "family_name#en-US": "Doe",
 "email": "janedoe@example.com",
 "picture": "http://example.com/janedoe/me.jpg"
}
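As a rough illustration of how a consumer might read such member names, the sketch below splits a name like "name#en-US" at the "#" character into the plain member name and the language tag. It uses only the Java standard library; the class and method names are hypothetical and are not part of any Connect2id library.

```java
class TaggedMemberName {

    /**
     * Splits a member name such as "name#en-US" into { "name", "en-US" }.
     * The second element is null when the name carries no language tag.
     */
    static String[] split(String memberName) {
        int hash = memberName.indexOf('#');
        if (hash < 0) {
            // No "#" delimiter, so the member is not language-tagged
            return new String[] { memberName, null };
        }
        return new String[] {
            memberName.substring(0, hash),
            memberName.substring(hash + 1)
        };
    }

    public static void main(String[] args) {
        for (String name : new String[] { "name#en-US", "email" }) {
            String[] parts = split(name);
            System.out.println(parts[0] + " -> " + parts[1]);
        }
    }
}
```

The tag part returned here is still a raw string; turning it into a validated language tag is a separate step.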

Today Connect2id released an open source (Apache 2.0) library for constructing, representing and parsing language tags in Java. It is called Nimbus Language Tags and handles the full set of subtags found in normal language tags:

  • primary/extended language
  • script
  • region
  • variants
  • extensions
  • private use

You can browse the Language Tags library code and download a compiled JAR from its Bitbucket repository.

To construct a new language tag in Java from scratch:

// English as used in the United States
LangTag tag = new LangTag("en");
tag.setRegion("US");

// Returns "en-US"
tag.toString();

The toString() method returns the canonical string representation of the language tag, with dashes as subtag delimiters and properly capitalised.
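To make "properly capitalised" concrete, here is a minimal standalone sketch of the casing convention recommended in RFC 5646, section 2.1.1: every subtag is lowercase, except two-character subtags (upper case) and four-character subtags (title case) that are neither at the start of the tag nor after a single-character subtag. This is not the library's code; the class and method names are hypothetical, and the sketch assumes the input is already a dash-delimited, well-formed tag.

```java
import java.util.Locale;

class LangTagCasing {

    /** Applies the RFC 5646 casing convention to a dash-delimited tag. */
    static String canonicalCase(String tag) {
        String[] subtags = tag.split("-");
        StringBuilder out = new StringBuilder();
        boolean afterSingleton = false;

        for (int i = 0; i < subtags.length; i++) {
            String s = subtags[i].toLowerCase(Locale.ROOT);

            if (i > 0 && !afterSingleton) {
                if (s.length() == 2) {
                    // Two-character subtag, e.g. a region: "us" becomes "US"
                    s = s.toUpperCase(Locale.ROOT);
                } else if (s.length() == 4) {
                    // Four-character subtag, e.g. a script: "hans" becomes "Hans"
                    s = Character.toUpperCase(s.charAt(0)) + s.substring(1);
                }
            }

            // Everything after a singleton such as "x" or "u" stays lowercase
            if (s.length() == 1) {
                afterSingleton = true;
            }

            if (i > 0) {
                out.append('-');
            }
            out.append(s);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(canonicalCase("EN-us"));          // en-US
        System.out.println(canonicalCase("ZH-CMN-hans-cn")); // zh-cmn-Hans-CN
    }
}
```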

To parse a language tag:

// Chinese, Mandarin, Simplified script, as used in China
LangTag tag = LangTag.parse("zh-cmn-Hans-CN");

// Returns "zh"
tag.getPrimaryLanguage();

// Returns "cmn"
tag.getExtendedLanguageSubtags()[0];

// Returns "zh-cmn"
tag.getLanguage();

// Returns "Hans"
tag.getScript();

// Returns "CN"
tag.getRegion();
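For readers curious about how such a tag breaks down, the following sketch splits a simple tag into primary language, extended language subtags, script and region with a regular expression. It is an illustration only, not how the library is implemented: it deliberately ignores variants, extensions, private use and grandfathered tags, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LangTagSplit {

    // Covers only: language, up to three extended language subtags,
    // an optional script and an optional region
    private static final Pattern SIMPLE_TAG = Pattern.compile(
        "([A-Za-z]{2,3})"               // primary language
        + "((?:-[A-Za-z]{3}){0,3})"     // extended language subtags
        + "(?:-([A-Za-z]{4}))?"         // script
        + "(?:-([A-Za-z]{2}|[0-9]{3}))?"); // region

    /** Returns { primary, extlangs, script, region }; absent parts are null. */
    static String[] split(String tag) {
        Matcher m = SIMPLE_TAG.matcher(tag);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported tag: " + tag);
        }
        // Group 2 includes its leading dash, e.g. "-cmn"; strip it when present
        String ext = m.group(2).isEmpty() ? null : m.group(2).substring(1);
        return new String[] { m.group(1), ext, m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] parts = split("zh-cmn-Hans-CN");
        System.out.println(String.join(" | ", parts)); // zh | cmn | Hans | CN
    }
}
```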

The Nimbus Language Tags library comes with complete JavaDocs which you can use as a programming reference. Should you have any questions, comments or bug reports, please use the issue tracker.