pub struct Locale {
    pub id: LanguageIdentifier,
    pub extensions: Extensions,
}
A core struct representing a Unicode Locale Identifier.
A locale is made of two parts:
- Unicode Language Identifier
- A set of Unicode Extensions
Locale exposes all of the same fields and methods as LanguageIdentifier, and on top of that is able to parse, manipulate, and serialize Unicode extension fields.
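The two-part structure can be illustrated on the string form of a locale. The following is a minimal sketch using only the standard library; `split_locale` is a hypothetical helper, not an ICU4X function, and it assumes the input is already a well-formed BCP-47 tag in which the first single-character subtag after the language marks the start of the extensions.

```rust
/// Splits a well-formed BCP-47 locale string into its language identifier
/// part and its extensions part.
/// Hypothetical helper for illustration only; not part of ICU4X.
fn split_locale(tag: &str) -> (&str, Option<&str>) {
    let mut offset = 0; // byte offset of the current subtag within `tag`
    for subtag in tag.split('-') {
        // A single-character subtag (a "singleton") that is not the first
        // subtag starts the extensions.
        if subtag.len() == 1 && offset > 0 {
            // `offset - 1` is the position of the '-' before the singleton.
            return (&tag[..offset - 1], Some(&tag[offset..]));
        }
        offset += subtag.len() + 1;
    }
    (tag, None)
}
```

For "en-US-u-ca-buddhist" this yields the language identifier "en-US" and the extensions "u-ca-buddhist", which correspond to the `id` and `extensions` fields of the parsed Locale.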
§Ordering
This type deliberately does not implement Ord or PartialOrd because there are multiple possible orderings. Depending on your use case, two orderings are available:
- A string ordering, suitable for stable serialization: Locale::strict_cmp
- A struct ordering, suitable for use with a BTreeSet: Locale::total_cmp
See issue: https://github.com/unicode-org/icu4x/issues/1215
§Parsing
Unicode recognizes three levels of standard conformance for a locale:
- well-formed - syntactically correct
- valid - well-formed and only uses registered language subtags, extensions, keywords, types…
- canonical - valid and no deprecated codes or structure.
Any syntactically invalid subtag causes parsing to fail with an error.
Parsing normalizes the syntax to be well-formed. No legacy subtag replacements are performed.
For validation and canonicalization, see LocaleCanonicalizer.
ICU4X's Locale parsing does not accept the non-BCP-47-compatible locales that UTS 35 allows for backwards compatibility. Furthermore, it currently does not allow language subtags to have more than three characters.
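To make the "normalizes syntax" step concrete, here is a simplified sketch of the case normalization that a well-formed tag undergoes. It uses only the standard library; `normalize_case` is a hypothetical helper, not ICU4X's parser, and it assumes the input is already syntactically correct (it performs no validation and would not reject bad subtags).

```rust
/// Applies the usual BCP-47 casing conventions to an already well-formed tag.
/// Hypothetical helper for illustration; it does not validate subtags the
/// way ICU4X's parser does.
fn normalize_case(tag: &str) -> String {
    let mut in_extensions = false;
    let mut out: Vec<String> = Vec::new();
    for (i, subtag) in tag.split('-').enumerate() {
        // A single-character subtag after the language starts the
        // extensions, where every subtag is lowercase.
        if i > 0 && subtag.len() == 1 {
            in_extensions = true;
        }
        let lower = subtag.to_ascii_lowercase();
        let is_alpha = subtag.chars().all(|c| c.is_ascii_alphabetic());
        let normalized = if i == 0 || in_extensions {
            lower // language, extension keys and values: lowercase
        } else if subtag.len() == 4 && is_alpha {
            // Script: title case, e.g. "latn" -> "Latn".
            format!("{}{}", lower[..1].to_ascii_uppercase(), &lower[1..])
        } else if subtag.len() == 2 && is_alpha {
            subtag.to_ascii_uppercase() // region: uppercase
        } else {
            lower // variants and numeric regions
        };
        out.push(normalized);
    }
    out.join("-")
}
```

With this sketch, "eN-latn-Us-Valencia-u-hC-H12" becomes "en-Latn-US-valencia-u-hc-h12". Only the casing changes; no subtag is replaced or removed, which mirrors the statement that no legacy subtag replacements are performed.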
§Examples
Simple example:
use icu::locale::{
    extensions::unicode::{key, value},
    locale,
    subtags::{language, region},
};
let loc = locale!("en-US-u-ca-buddhist");
assert_eq!(loc.id.language, language!("en"));
assert_eq!(loc.id.script, None);
assert_eq!(loc.id.region, Some(region!("US")));
assert_eq!(loc.id.variants.len(), 0);
assert_eq!(
    loc.extensions.unicode.keywords.get(&key!("ca")),
    Some(&value!("buddhist"))
);
More complex example:
use icu::locale::{subtags::*, Locale};
let loc: Locale = "eN-latn-Us-Valencia-u-hC-H12"
    .parse()
    .expect("Failed to parse.");
assert_eq!(loc.id.language, "en".parse::<Language>().unwrap());
assert_eq!(loc.id.script, "Latn".parse::<Script>().ok());
assert_eq!(loc.id.region, "US".parse::<Region>().ok());
assert_eq!(
    loc.id.variants.get(0),
    "valencia".parse::<Variant>().ok().as_ref()
);
Fields§
§id: LanguageIdentifier
The basic language/script/region components in the locale identifier along with any variants.
§extensions: Extensions
Any extensions present in the locale identifier.
Implementations§
impl Locale
pub fn strict_cmp(&self, other: &[u8]) -> Ordering
Compare this Locale with BCP-47 bytes.
The return value is equivalent to what would happen if you first converted this Locale to a BCP-47 string and then performed a byte comparison.
This function is case-sensitive and results in a total order, so it is appropriate for binary search. The only argument producing Ordering::Equal is self.to_string().
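The binary-search use case can be sketched as follows. This is a standard-library-only illustration, not ICU4X code: `strict_cmp_sketch` is a hypothetical stand-in that takes an already serialized locale, mimicking the documented "convert to a BCP-47 string, then compare bytes" behavior, and `find_in_sorted` is likewise a hypothetical helper. With the real API, the comparator would call `Locale::strict_cmp` on the probe bytes instead.

```rust
use std::cmp::Ordering;

/// Stand-in for `Locale::strict_cmp`: compares an already serialized locale
/// against raw BCP-47 bytes. Hypothetical helper for illustration.
fn strict_cmp_sketch(serialized: &str, other: &[u8]) -> Ordering {
    serialized.as_bytes().cmp(other)
}

/// Looks up `needle` in a table of BCP-47 strings sorted by byte order.
/// Hypothetical helper; the table must already be sorted byte-wise.
fn find_in_sorted(table: &[&str], needle: &str) -> Option<usize> {
    table
        .binary_search_by(|probe| {
            // `binary_search_by` expects the ordering of the probe relative
            // to the target, so the needle-vs-probe result is reversed.
            strict_cmp_sketch(needle, probe.as_bytes()).reverse()
        })
        .ok()
}
```

Because the comparison is case-sensitive, a table entry such as "zh-Hant" is only found when the needle uses the same normalized casing; "zh-hant" compares as a different byte string.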
§Examples
Sorting a list of locales with this method requires converting one of them to a string:
use icu::locale::Locale;
use std::cmp::Ordering;
use writeable::Writeable;
// Random input order:
let bcp47_strings: &[&str] = &[
    "und-u-ca-hebrew",
    "ar-Latn",
    "zh-Hant-TW",
    "zh-TW",
    "und-fonipa",
    "zh-Hant",
    "ar-SA",
];
let mut locales = bcp47_strings
    .iter()
    .map(|s| s.parse().unwrap())
    .collect::<Vec<Locale>>();
locales.sort_by(|a, b| {
    let b = b.write_to_string();
    a.strict_cmp(b.as_bytes())
});
let strict_cmp_strings = locales
    .iter()
    .map(|l| l.to_string())
    .collect::<Vec<String>>();
// Output ordering, sorted alphabetically
let expected_ordering: &[&str] = &[
    "ar-Latn",
    "ar-SA",
    "und-fonipa",
    "und-u-ca-hebrew",
    "zh-Hant",
    "zh-Hant-TW",
    "zh-TW",
];
assert_eq!(expected_ordering, strict_cmp_strings);
pub fn total_cmp(&self, other: &Self) -> Ordering
Returns an ordering suitable for use in BTreeSet.
Unlike Locale::strict_cmp, the ordering may or may not be equivalent to string ordering, and it may or may not be stable across ICU4X releases.
§Examples
This method returns a nonsensical ordering derived from the fields of the struct:
use icu::locale::Locale;
use std::cmp::Ordering;
// Input strings, sorted alphabetically
let bcp47_strings: &[&str] = &[
    "ar-Latn",
    "ar-SA",
    "und-fonipa",
    "und-u-ca-hebrew",
    "zh-Hant",
    "zh-Hant-TW",
    "zh-TW",
];
assert!(bcp47_strings.windows(2).all(|w| w[0] < w[1]));
let mut locales = bcp47_strings
    .iter()
    .map(|s| s.parse().unwrap())
    .collect::<Vec<Locale>>();
locales.sort_by(Locale::total_cmp);
let total_cmp_strings = locales
    .iter()
    .map(|l| l.to_string())
    .collect::<Vec<String>>();
// Output ordering, sorted arbitrarily
let expected_ordering: &[&str] = &[
    "ar-SA",
    "ar-Latn",
    "und-u-ca-hebrew",
    "und-fonipa",
    "zh-TW",
    "zh-Hant",
    "zh-Hant-TW",
];
assert_eq!(expected_ordering, total_cmp_strings);
Use a wrapper to add a Locale to a BTreeSet:
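use icu::locale::Locale;
use std::cmp::Ordering;
use std::collections::BTreeSet;
// Newtype that orders locales by Locale::total_cmp.
#[derive(PartialEq, Eq)]
struct LocaleTotalOrd(Locale);
impl Ord for LocaleTotalOrd {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}
impl PartialOrd for LocaleTotalOrd {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
// The wrapper can now be stored in a BTreeSet.
let mut set: BTreeSet<LocaleTotalOrd> = BTreeSet::new();
set.insert(LocaleTotalOrd("en-US".parse().unwrap()));
set.insert(LocaleTotalOrd("ar-SA".parse().unwrap()));
assert_eq!(set.len(), 2);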
Trait Implementations§
impl Display for Locale
This trait is implemented for compatibility with fmt!.
To create a string, Writeable::write_to_string is usually more efficient.
impl From<&Locale> for DataLocale
impl From<&Locale> for LocalePreferences
impl From<(Language, Option<Script>, Option<Region>)> for Locale
§Examples
use icu::locale::Locale;
use icu::locale::{
    locale,
    subtags::{language, region, script},
};
assert_eq!(
    Locale::from((
        language!("en"),
        Some(script!("Latn")),
        Some(region!("US"))
    )),
    locale!("en-Latn-US")
);
impl From<Language> for Locale
§Examples
use icu::locale::Locale;
use icu::locale::{locale, subtags::language};
assert_eq!(Locale::from(language!("en")), locale!("en"));
impl From<LanguageIdentifier> for Locale
fn from(id: LanguageIdentifier) -> Self
impl From<Locale> for DataLocale
impl From<Locale> for LanguageIdentifier
impl From<Option<Region>> for Locale
§Examples
use icu::locale::Locale;
use icu::locale::{locale, subtags::region};
assert_eq!(Locale::from(Some(region!("US"))), locale!("und-US"));
impl From<Option<Script>> for Locale
§Examples
use icu::locale::Locale;
use icu::locale::{locale, subtags::script};
assert_eq!(Locale::from(Some(script!("latn"))), locale!("und-Latn"));
impl Writeable for Locale
fn write_to<W: Write + ?Sized>(&self, sink: &mut W) -> Result
Writes a string to the given sink. Errors from the sink are bubbled up. The default implementation delegates to write_to_parts, and discards any Part annotations.
fn writeable_length_hint(&self) -> LengthHint
fn write_to_parts<S>(&self, sink: &mut S) -> Result<(), Error>
where
    S: PartsWrite + ?Sized,
Write bytes and Part annotations to the given sink. Errors from the sink are bubbled up. The default implementation delegates to write_to, and doesn't produce any Part annotations.