The issue here is that your bytes aren't a valid UTF-8 sequence.
Any sequence of bytes can be interpreted as valid ISO Latin-1, for example. (There may be issues with bytes having values 0–31, but those generally don't stop the characters being stored and processed.) Much the same applies to most other 8-bit character sets.
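To make that concrete, here is a minimal sketch (the function name `latin1RoundTrip` is just for illustration) that pushes all 256 possible byte values through ISO Latin-1 and back. It assumes Kotlin/JVM, where `Charsets.ISO_8859_1` maps each byte to exactly one character.

```kotlin
// Decode the bytes as ISO Latin-1, then encode the resulting string again.
// Every byte value maps to exactly one character, so nothing should be lost.
fun latin1RoundTrip(bytes: ByteArray): ByteArray =
    bytes.toString(Charsets.ISO_8859_1).toByteArray(Charsets.ISO_8859_1)

fun main() {
    // All 256 possible byte values, in order.
    val allBytes = ByteArray(256) { it.toByte() }
    println(latin1RoundTrip(allBytes).contentEquals(allBytes)) // prints "true"
}
```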
But the same isn't true of UTF-8. While all sequences of bytes in the range 1–127 are valid UTF-8 (and interpreted the same as they are in ASCII and most 8-bit encodings), bytes in the range 128–255 can only appear in certain well-defined combinations. (This has several very useful properties: it lets you identify UTF-8 with a very high probability; it also avoids issues with synchronisation, searching, sorting, &c.)
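If you want to check whether some bytes are well-formed UTF-8 before converting them, one possible approach is a strict decoder that reports errors instead of papering over them. This is only a sketch: `isValidUtf8` is a made-up helper name, and it relies on the standard `java.nio.charset` API available on Kotlin/JVM.

```kotlin
import java.nio.ByteBuffer
import java.nio.charset.CharacterCodingException
import java.nio.charset.CodingErrorAction

// Hypothetical helper: returns true only if the bytes decode as UTF-8
// without any malformed sequences.
fun isValidUtf8(bytes: ByteArray): Boolean =
    try {
        Charsets.UTF_8.newDecoder()
            // REPORT makes the decoder throw rather than substitute U+FFFD.
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(bytes))
        true
    } catch (e: CharacterCodingException) {
        false
    }

fun main() {
    // Plain ASCII bytes are always valid UTF-8.
    println(isValidUtf8(byteArrayOf(0x4E, 0x17, 0x29, 0x33)))   // prints "true"
    // A lone continuation byte (0x80) is not.
    println(isValidUtf8(byteArrayOf(0x4E, 0x80.toByte())))      // prints "false"
}
```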
In this case, the sequence in the question (which is `4E 17 29 33 E0 2A` in unsigned hex) isn't valid UTF-8: `E0` is the lead byte of a three-byte sequence, but `2A` isn't a continuation byte.
So when you try to convert it to a string using the default encoding (UTF-8), the JVM substitutes the replacement character (U+FFFD, which looks like this: �) in place of each invalid sequence.
Then, when you convert that back to UTF-8, you get the UTF-8 encoding of the replacement character, which is `EF BF BD`. And if you interpret that as signed bytes, you get `-17 -65 -67`, as in the question.
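You can reproduce the whole round trip with something like the following. It's a sketch rather than the code from the question: `utf8RoundTrip` is an illustrative name, and the input is the byte sequence quoted above.

```kotlin
// Decode as UTF-8 (malformed input becomes U+FFFD), then encode back to UTF-8.
fun utf8RoundTrip(bytes: ByteArray): ByteArray =
    String(bytes, Charsets.UTF_8).toByteArray(Charsets.UTF_8)

fun main() {
    // 4E 17 29 33 E0 2A; 0xE0 doesn't fit in a signed Byte literal, hence toByte().
    val input = byteArrayOf(0x4E, 0x17, 0x29, 0x33, 0xE0.toByte(), 0x2A)

    // The decoded string contains the replacement character where E0 was.
    println(String(input, Charsets.UTF_8).contains('\uFFFD')) // prints "true"

    // Bytes print as signed values: E0 has turned into EF BF BD.
    println(utf8RoundTrip(input).joinToString(" "))
    // prints "78 23 41 51 -17 -65 -67 42"
}
```

The valid bytes on either side of `E0` come through unchanged; only the malformed byte is swapped for the three-byte encoding of U+FFFD.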
So Kotlin/JVM is handling the invalid input as best it can.