Module rt.util.utf
Encode and decode UTF-8, UTF-16 and UTF-32 strings.
On Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D wchar type. On Posix systems, the C wchar_t type is UTF-32 and corresponds to the D dchar type.
UTF character support is restricted to (\u0000 <= character <= \U0010FFFF).
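The platform mapping and the supported range can be checked with a small sketch. It assumes that `core.stdc.stddef.wchar_t` is the alias the paragraph above refers to, and it uses the Phobos `std.utf.isValidDchar` as a stand-in for the function of the same name in this module, since `rt.util.utf` is internal to the runtime and is not normally importable from user code.

```d
import core.stdc.stddef : wchar_t;
import std.utf : isValidDchar;

// The C wchar_t alias follows the platform convention described above.
version (Windows)
    static assert(is(wchar_t == wchar));   // UTF-16 code unit
else version (Posix)
    static assert(is(wchar_t == dchar));   // UTF-32 code unit

unittest
{
    assert(isValidDchar('\u0000'));               // lower bound of the range
    assert(isValidDchar('\U0010FFFF'));           // upper bound of the range
    assert(!isValidDchar(cast(dchar) 0x110000));  // one past the upper bound
    assert(!isValidDchar(cast(dchar) 0xD800));    // std.utf also rejects surrogates
}
```

One way to run the checks is `dmd -unittest -main -run example.d`, where `example.d` is a hypothetical file name.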
See Also
Wikipedia
      http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
      http://anubis.dkuug.dk/JTC1/SC2/WG2/docs/n1335
Functions
| Name | Description |
|---|---|
| codeLength(c) | Returns the code length of c in the encoding using C as a code point. The code is returned in character count, not in bytes. |
| decode(s, idx) | Decodes and returns the character starting at s[idx]. idx is advanced past the decoded character. If the character is not well formed, a UtfException is thrown and idx remains unchanged. |
| encode(s, c) | Encodes character c and appends it to array s[]. |
| isValidDchar(c) | Tests if c is a valid UTF-32 character. |
| stride(s, i) | stride() returns the length of a UTF-8 sequence starting at index i in string s. |
| stride(s, i) | stride() returns the length of a UTF-16 sequence starting at index i in string s. |
| stride(s, i) | stride() returns the length of a UTF-32 sequence starting at index i in string s. |
| toUCSindex(s, i) | Given an index i into an array of characters s[], and assuming that index i is at the start of a UTF character, determine the number of UCS characters up to that index i. |
| toUTF16(s) | Encodes string s into UTF-16 and returns the encoded string. toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take an LPWSTR or LPCWSTR argument. |
| toUTF16z(s) | Encodes string s into UTF-16 and returns the encoded string. toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take an LPWSTR or LPCWSTR argument. |
| toUTF32(s) | Encodes string s into UTF-32 and returns the encoded string. |
| toUTF8(s) | Encodes string s into UTF-8 and returns the encoded string. |
| toUTFindex(s, n) | Given a UCS index n into an array of characters s[], return the UTF index. |
| validate(s) | Checks to see if string is well formed or not. S can be an array of char, wchar, or dchar. Throws a UtfException if it is not. Use to check all untrusted input for correctness. |
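The functions listed above can be exercised with a short sketch. Because `rt.util.utf` is internal to the runtime, the example assumes the same-named functions from the Phobos module `std.utf` behave equivalently; note that Phobos spells the exception `UTFException`. The helpers `codePoints` and `isWellFormed` are hypothetical names introduced only for illustration.

```d
import std.utf : codeLength, decode, encode, stride, toUCSindex, toUTF16,
                 toUTF32, toUTF8, toUTFindex, validate, UTFException;

/// Decodes every character of `s` in turn (hypothetical helper).
dchar[] codePoints(const(char)[] s)
{
    dchar[] result;
    size_t idx = 0;
    while (idx < s.length)
        result ~= decode(s, idx); // decode advances idx past the character
    return result;
}

/// Returns true if `s` is well-formed UTF-8 (hypothetical helper).
bool isWellFormed(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException) // std.utf's name for the exception
    {
        return false;
    }
}

unittest
{
    string s = "a\u20ACb"; // 'a', EURO SIGN, 'b': 5 UTF-8 code units
    assert(s.length == 5);

    // stride reports how many code units the sequence at an index occupies.
    assert(stride(s, 0) == 1);
    assert(stride(s, 1) == 3);

    // decode returns the character and moves the index past it.
    size_t idx = 1;
    assert(decode(s, idx) == '\u20AC');
    assert(idx == 4);

    // Converting between code-unit indices and character indices.
    assert(toUCSindex(s, 4) == 2);
    assert(toUTFindex(s, 2) == 4);

    // Transcoding keeps the three characters and round-trips.
    assert(toUTF16(s).length == 3);
    assert(toUTF32(s).length == 3);
    assert(toUTF8(toUTF32(s)) == s);

    // codeLength counts code units in the target encoding, not bytes.
    assert(codeLength!char('\u20AC') == 3);
    assert(codeLength!wchar('\U0001F600') == 2);

    // encode appends the encoded character to the array.
    char[] buf;
    encode(buf, '\u20AC');
    assert(buf == "\u20AC");

    assert(codePoints(s) == "a\u20ACb"d);

    // A lead byte followed by a non-continuation byte is ill formed.
    char[] bad = [cast(char) 0xC3, 'a'];
    assert(isWellFormed(s));
    assert(!isWellFormed(bad));
}
```

Wrapping `validate` in a boolean helper is only one design choice; code that must report where the input is malformed would more likely let the exception propagate.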
Authors
Walter Bright, Sean Kelly
License
					Copyright © 1999-2018 by the D Language Foundation | Page generated by ddox.