rt.util.utf
Encode and decode UTF-8, UTF-16 and UTF-32 strings.
For Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D
wchar type.
For Posix systems, the C wchar_t type is UTF-32 and corresponds to
the D dchar type.
UTF character support is restricted to (\u0000 <= character <= \U0010FFFF).
Authors: Walter Bright, Sean Kelly
Source: src/rt/util/utf.d
- pure nothrow @nogc @safe bool isValidDchar(dchar c);
  Test if c is a valid UTF-32 character.
  \uFFFE and \uFFFF are considered valid by this function, as they are permitted for internal use by an application, but they are not allowed for interchange by the Unicode standard.
  Returns: true if it is, false if not.
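The rule above can be illustrated with a short sketch. It uses `std.utf.isValidDchar` from Phobos rather than this runtime module, on the assumption that the two apply the same test; `rt.util.utf` is internal to the runtime and may not be importable from user code.

```d
import std.utf : isValidDchar;

unittest
{
    assert(isValidDchar('a'));
    assert(isValidDchar(cast(dchar) 0xFFFE));    // noncharacter, still accepted
    assert(!isValidDchar(cast(dchar) 0xD800));   // surrogate code point
    assert(!isValidDchar(cast(dchar) 0x110000)); // above \U0010FFFF
}
```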
- pure nothrow @nogc @safe uint stride(in char[] s, size_t i);
  stride() returns the length of a UTF-8 sequence starting at index i in string s.
  Returns: The number of bytes in the UTF-8 sequence, or 0xFF if s[i] is not the start of a UTF-8 sequence.
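A minimal sketch of the documented contract, not the runtime's actual implementation: the helper name `utf8Stride` is hypothetical, and it classifies the lead byte with bit masks. It treats every lead byte above 0xF7 as invalid, which may differ from how the runtime handles those bytes.

```d
/// Hypothetical helper mirroring the documented UTF-8 stride() contract.
uint utf8Stride(const(char)[] s, size_t i) pure nothrow @nogc @safe
{
    immutable uint c = s[i];
    if (c < 0x80) return 1;            // ASCII, single byte
    if ((c & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((c & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((c & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0xFF;                       // continuation byte or invalid lead byte
}

unittest
{
    // 'a' at index 0, U+00E9 at 1..2, U+20AC at 3..5, U+1F600 at 6..9
    string s = "a\u00E9\u20AC\U0001F600";
    assert(utf8Stride(s, 0) == 1);
    assert(utf8Stride(s, 1) == 2);
    assert(utf8Stride(s, 3) == 3);
    assert(utf8Stride(s, 6) == 4);
    assert(utf8Stride(s, 2) == 0xFF); // index 2 is a continuation byte
}
```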
- pure nothrow @nogc @safe uint stride(in wchar[] s, size_t i);
  stride() returns the length of a UTF-16 sequence starting at index i in string s.
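For UTF-16 the length depends only on whether the code unit at index i is a high surrogate. The sketch below shows that idea with a hypothetical helper, `utf16Stride`; it is an illustration of the technique, not the runtime's source.

```d
/// Hypothetical helper: 2 if s[i] starts a surrogate pair, otherwise 1.
uint utf16Stride(const(wchar)[] s, size_t i) pure nothrow @nogc @safe
{
    immutable uint u = s[i];
    return (u >= 0xD800 && u <= 0xDBFF) ? 2 : 1;
}

unittest
{
    const(wchar)[] s = "a\U0001F600"w; // 'a' plus one surrogate pair
    assert(s.length == 3);
    assert(utf16Stride(s, 0) == 1);
    assert(utf16Stride(s, 1) == 2);
}
```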
- pure nothrow @nogc @safe uint stride(in dchar[] s, size_t i);
  stride() returns the length of a UTF-32 sequence starting at index i in string s.
  Returns: Always 1.
- pure @safe size_t toUCSindex(in char[] s, size_t i);
  pure @safe size_t toUCSindex(in wchar[] s, size_t i);
  pure nothrow @nogc @safe size_t toUCSindex(in dchar[] s, size_t i);
  Given an index i into an array of characters s[], and assuming that index i is at the start of a UTF character, determine the number of UCS characters up to that index i.
- pure @safe size_t toUTFindex(in char[] s, size_t n);
  pure nothrow @nogc @safe size_t toUTFindex(in wchar[] s, size_t n);
  pure nothrow @nogc @safe size_t toUTFindex(in dchar[] s, size_t n);
  Given a UCS index n into an array of characters s[], return the UTF index.
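The two index conversions are inverses of each other at character boundaries. The sketch below illustrates this with the similarly named functions in Phobos's `std.utf`, assuming they follow the same contract as the runtime-internal versions documented here.

```d
import std.utf : toUCSindex, toUTFindex;

unittest
{
    // Code units per character: 'a' = 1, U+00E9 = 2, U+20AC = 3, 'z' = 1
    string s = "a\u00E9\u20ACz";
    assert(toUTFindex(s, 2) == 3); // third character starts at byte 3
    assert(toUCSindex(s, 3) == 2); // two characters precede byte 3
    assert(toUTFindex(s, 3) == 6);
    assert(toUCSindex(s, 6) == 3);
}
```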
- pure @safe dchar decode(in char[] s, ref size_t idx);
  pure @safe dchar decode(in wchar[] s, ref size_t idx);
  pure @safe dchar decode(in dchar[] s, ref size_t idx);
  Decodes and returns the character starting at s[idx]. idx is advanced past the decoded character. If the character is not well formed, a UtfException is thrown and idx remains unchanged.
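Because decode() advances idx, it can drive a loop over a whole string. The sketch below uses `std.utf.decode` from Phobos as a stand-in, which is an assumption; note that it throws `UTFException`, not the `UtfException` named above. The helper `codePoints` is hypothetical.

```d
import std.utf : decode;

/// Hypothetical helper: decodes every code point in `s`.
dchar[] codePoints(const(char)[] s)
{
    dchar[] result;
    size_t idx = 0;
    while (idx < s.length)
        result ~= decode(s, idx); // idx moves past each decoded sequence
    return result;
}

unittest
{
    assert(codePoints("a\u00E9\u20AC") == "a\u00E9\u20AC"d);
    assert(codePoints("") == ""d);
}
```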
- pure nothrow @safe void encode(ref char[] s, dchar c);
  pure nothrow @safe void encode(ref wchar[] s, dchar c);
  pure nothrow @safe void encode(ref dchar[] s, dchar c);
  Encodes character c and appends it to array s[].
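Appending one character at a time is enough to build an encoded string. This sketch uses `std.utf.encode` from Phobos, assuming it appends in the same way; unlike the nothrow functions documented here, the Phobos version may throw on an invalid code point. The helper `encodeAll` is hypothetical.

```d
import std.utf : encode;

/// Hypothetical helper: encodes each code point in `cps` as UTF-8.
char[] encodeAll(const(dchar)[] cps)
{
    char[] utf8;
    foreach (dchar c; cps)
        encode(utf8, c); // appends 1 to 4 code units per character
    return utf8;
}

unittest
{
    assert(encodeAll("a\u00E9\u20AC"d) == "a\u00E9\u20AC");
    assert(encodeAll("\U0001F600"d).length == 4);
}
```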
- pure nothrow @nogc @safe ubyte codeLength(C)(dchar c);
  Returns the number of code units needed to encode c when C is the code unit type. The length is counted in code units (elements of type C), not in bytes.
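The difference between code units and bytes shows up most clearly for wchar. The sketch below uses `std.utf.codeLength` from Phobos, assuming it matches the function documented here.

```d
import std.utf : codeLength;

unittest
{
    assert(codeLength!char('\u20AC') == 3);       // three UTF-8 code units
    assert(codeLength!wchar('\u20AC') == 1);      // one UTF-16 code unit
    assert(codeLength!wchar('\U0001F600') == 2);  // surrogate pair: 2 units, 4 bytes
    assert(codeLength!dchar('\U0001F600') == 1);  // one UTF-32 code unit
}
```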
- pure @safe void validate(S)(in S s);
  Checks whether the string is well formed. S can be an array of char, wchar, or dchar. Throws a UtfException if it is not. Use to check all untrusted input for correctness.
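One way to use validate() on untrusted input is to wrap it in a predicate. The sketch below relies on `std.utf.validate` from Phobos as a stand-in, which is an assumption; it throws `UTFException`. The helper `isWellFormed` is hypothetical.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: true if `s` is well-formed UTF-8.
bool isWellFormed(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    assert(isWellFormed("a\u00E9"));
    const(char)[] truncated = [cast(char) 0xC3]; // lead byte with no continuation
    assert(!isWellFormed(truncated));
}
```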
- pure nothrow @safe string toUTF8(string s);
  pure @trusted string toUTF8(in wchar[] s);
  pure @trusted string toUTF8(in dchar[] s);
  Encodes string s into UTF-8 and returns the encoded string.
- pure @trusted wstring toUTF16(in char[] s);
  pure @safe wptr toUTF16z(in char[] s);
  pure nothrow @safe wstring toUTF16(wstring s);
  pure nothrow @trusted wstring toUTF16(in dchar[] s);
  Encodes string s into UTF-16 and returns the encoded string. toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take an LPWSTR or LPCWSTR argument.
- pure @trusted dstring toUTF32(in char[] s);
  pure @trusted dstring toUTF32(in wchar[] s);
  pure nothrow @safe dstring toUTF32(dstring s);
  Encodes string s into UTF-32 and returns the encoded string.
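The three conversion families can be checked against each other with a round trip. This sketch uses `toUTF8`, `toUTF16` and `toUTF32` from Phobos's `std.utf`, assuming they behave like the runtime-internal functions documented on this page.

```d
import std.utf : toUTF8, toUTF16, toUTF32;

unittest
{
    string s8 = "a\U0001F600";     // 1 + 4 UTF-8 code units
    wstring s16 = toUTF16(s8);     // 1 + 2 UTF-16 code units
    dstring s32 = toUTF32(s8);     // 2 UTF-32 code units
    assert(s8.length == 5);
    assert(s16.length == 3);
    assert(s32.length == 2);
    assert(toUTF8(s16) == s8);     // converting back restores the input
    assert(toUTF8(s32) == s8);
}
```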
Copyright © 1999-2018 by the D Language Foundation | Page generated by
Ddoc on Wed May 2 05:57:51 2018