By Evan Schmidt
This mini-blog describes how to analyze every character in a unicode text string in order to find hidden characters, variable-byte characters, and unexpected unicode characters. This can be a useful because certain unicode characters can cause some applications to fail unexpectedly. This code will help you analyze the data and find the problem character.
DECLARE @nstring NVARCHAR(100)
SET @nstring = N’TeSt’
DECLARE @position int
SET @position = 1
DECLARE @CharList TABLE (
Position INT,
UnicodeChar NVARCHAR(1),
UnicodeValue INT
)
WHILE @position <= DATALENGTH(@nstring)
BEGIN
INSERT @CharList
SELECT @position as Position
, CONVERT(nchar(1), SUBSTRING(@nstring, @position, 1)) as UnicodeChar
, UNICODE(SUBSTRING(@nstring, @position, 1)) as UnicodeValue
SET @position = @position + 1
END
SELECT * FROM @CharList
The results of this query will display the unicode value for every character in the string:
Position |
UnicodeChar |
UnicodeValue |
1 |
T |
84 |
2 |
e |
101 |
3 |
Ș |
536 |
4 |
t |
116 |
Note the particularly large unicode value which shows this character is clearly not a standard ASCII value.