Language in DIGGSML

DIGGSML is an international standard, so it must accommodate users who work in many different languages. Although the element names themselves are in "International English", their content may be written in one or more other languages.

Here we explain the best practices for internationalising a DIGGSML file, including how to produce a bilingual file.

Specifying the "default" language

The W3C (2006, 2006-2) specifies the xml:lang attribute for declaring the language of a given XML fragment. Applying xml:lang to the root diggs element therefore sets the default language for the whole file.

<diggs xml:lang="en-GB">
  <!-- British-English DIGGSML Data Here -->
</diggs>

or

<diggs xml:lang="en-US">
  <!-- US-English DIGGSML Data Here -->
</diggs>

Values entered into the xml:lang attribute should conform to RFC 3066 (TIS, 2001). Information on how to decode these language codes can be found in Ishida (2006-3), and the registry of codes is published by IANA.

If no xml:lang is specified anywhere in the file, the DIGGSML default of "International English" should be assumed.

Overriding the default language

The W3C (2006, 2006-2) specifies that xml:lang applies to the element on which it appears (including its attributes) and to all elements contained within it, unless one of those elements carries its own xml:lang. Applying xml:lang to a diggs:Remarks object therefore implies that all of the diggs:Remark objects within it are written in the specified language, whatever the default language of the file.
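
As a minimal sketch of this scoping rule, the fragment below sets British English as the file default and overrides it on a single Remarks element. The Remarks and Remark names follow the objects mentioned above, but the exact nesting and the remark text are assumptions made for illustration and are not taken from the DIGGSML schema.

<diggs xml:lang="en-GB">
  <!-- Content at this level inherits the file default, en-GB -->
  <Remarks xml:lang="fr">
    <!-- Hypothetical structure: each Remark inherits "fr" from Remarks -->
    <Remark>Présence d'eau à 2 m de profondeur</Remark>
    <Remark>Échantillon remanié</Remark>
  </Remarks>
</diggs>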

Multi-lingual files

Wherever a property is of datatype "string", DIGGSML allows multiple instances of that property, so that the same value can be given in more than one language, as shown below.

<Sample>
  <description xml:lang="en-US">Soft Sandy CLAY, Brown in Color</description>
  <description xml:lang="en-GB">Soft Sandy CLAY, Brown in Colour</description>
</Sample>
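
The two mechanisms can plausibly be combined: a default language is set on the root element, and xml:lang is added only to the instances that differ from it. The fragment below is a sketch under that assumption. The French wording is illustrative only, and the layout is not taken from the DIGGSML schema.

<diggs xml:lang="en-GB">
  <Sample>
    <!-- No xml:lang here, so this description inherits en-GB from the root -->
    <description>Soft Sandy CLAY, Brown in Colour</description>
    <!-- An explicit xml:lang overrides the inherited default for this instance only -->
    <description xml:lang="fr">ARGILE sableuse molle, de couleur brune</description>
  </Sample>
</diggs>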

Conclusion

DIGGSML follows the W3C's best practices for internationalising strings, and so allows users to supply the data itself in any language.

References

Internet Assigned Numbers Authority (IANA), "Language Subtag Registry" available online at http://www.iana.org/assignments/language-subtag-registry

Ishida, R., 2006-3, "Language tags in HTML and XML", W3C, available online at http://www.w3.org/International/articles/language-tags/Overview.en.php

The Internet Society (TIS), 2001, "Tags for the Identification of Languages" available online at http://www.ietf.org/rfc/rfc3066.txt

W3C, 2006, "Best Practices for XML Internationalization - Working Draft" available online at http://www.w3.org/TR/xml-i18n-bp/

W3C, 2006-2, "Extensible Markup Language (XML) 1.0 (Fourth Edition)" available online at http://www.w3.org/TR/REC-xml/