How to convert a UTF-8 byte[] to a string

bestprogram 2023. 4. 17. 22:33

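I have a byte[] array that is loaded from a file that I happen to know contains UTF-8.

In some debugging code, I need to convert it to a string. Is there a one-liner that will do this?

Under the covers I would expect this to be little more than an allocation and a memory copy, so even if there is no built-in method, it should be possible.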

string result = System.Text.Encoding.UTF8.GetString(byteArray);

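There are at least four different ways of doing this conversion.

1. Encoding's GetString. You won't be able to get the original bytes back if they are not valid text in that encoding (for example, arbitrary binary data containing non-ASCII bytes).
2. BitConverter.ToString. The output is a "-" delimited string, but there is no .NET built-in method to convert the string back to a byte array.
3. Convert.ToBase64String. You can easily convert the output string back to a byte array by using Convert.FromBase64String. Note: the output string could contain '+', '/' and '='. If you want to use the string in a URL, you need to encode it explicitly.
4. HttpServerUtility.UrlTokenEncode. You can easily convert the output string back to a byte array by using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is that it needs the System.Web assembly if your project is not a web project.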

A full example:

byte[] bytes = { 130, 200, 234, 23 }; // A byte array contains non-ASCII (or non-readable) characters

string s1 = Encoding.UTF8.GetString(bytes); // ���
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1);  // decBytes1.Length == 10 !!
// decBytes1 not same as bytes
// Using UTF-8 or other Encoding object will get similar results

string s2 = BitConverter.ToString(bytes);   // 82-C8-EA-17
String[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
    decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 same as bytes

string s3 = Convert.ToBase64String(bytes);  // gsjqFw==
byte[] decByte3 = Convert.FromBase64String(s3);
// decByte3 same as bytes

string s4 = HttpServerUtility.UrlTokenEncode(bytes);    // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 same as bytes
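
The lossy behaviour of GetString in the example above comes from invalid sequences being replaced with U+FFFD. If it is more useful to find out that the bytes are not valid UTF-8, one option is a UTF8Encoding created with throwOnInvalidBytes set to true. This is only a sketch; the StrictUtf8 class and TryDecode method are names made up for illustration.

```csharp
using System;
using System.Text;

public static class StrictUtf8
{
    // throwOnInvalidBytes: true makes the decoder throw instead of substituting U+FFFD.
    private static readonly UTF8Encoding Strict =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Returns true and the decoded text when the bytes are well-formed UTF-8,
    // false (and a null text) when they are not.
    public static bool TryDecode(byte[] bytes, out string text)
    {
        try
        {
            text = Strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = null;
            return false;
        }
    }
}
```

With the byte array from the example, StrictUtf8.TryDecode(bytes, out var s) should report failure rather than silently returning replacement characters.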

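A general solution for converting a byte array to a string when you don't know the encoding (StreamReader picks the encoding from a byte order mark if one is present and otherwise assumes UTF-8):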

static string BytesToStringConverted(byte[] bytes)
{
    using (var stream = new MemoryStream(bytes))
    {
        using (var streamReader = new StreamReader(stream))
        {
            return streamReader.ReadToEnd();
        }
    }
}
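
The StreamReader approach above only recognizes encodings that announce themselves with a byte order mark. When the data has no BOM and is not UTF-8, a fallback encoding has to be supplied by the caller. A minimal sketch of that variation, where BomAwareReader, Decode and the fallback parameter are hypothetical names:

```csharp
using System;
using System.IO;
using System.Text;

public static class BomAwareReader
{
    // Uses the BOM when there is one; otherwise decodes with the caller's fallback encoding.
    public static string Decode(byte[] bytes, Encoding fallback)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, fallback, detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Usage would look like BomAwareReader.Decode(bytes, Encoding.UTF8). Unlike Encoding.GetString, StreamReader does not include the BOM in the returned text.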

Definition:

public static string ConvertByteToString(this byte[] source)
{
    return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
}

Usage:

string result = input.ConvertByteToString();

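I saw some answers in this post, and together they can be considered complete base knowledge, because there are several approaches in C# to solve the same problem. The only thing left to consider is the difference between pure UTF-8 and UTF-8 with a BOM.

Last week, at my job, I needed to develop a feature that outputs some CSV files with a BOM and other CSV files as pure UTF-8 (without a BOM). Each CSV encoding type is consumed by a different non-standardized API. One API reads UTF-8 with a BOM and the other reads it without a BOM. I needed to research this concept, reading the Stack Overflow question "What's the difference between UTF-8 and UTF-8 without BOM?" and the Wikipedia article "Byte order mark", to build my approach.

Finally, my C# code for both UTF-8 encoding types (with a BOM and pure) needed to look similar to the example below: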

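// For UTF-8 with BOM, equals shared by Zanoni (at top)
string resultWithBom = System.Text.Encoding.UTF8.GetString(byteArray);

// For pure UTF-8 (without a BOM)
string resultPure = (new UTF8Encoding(false)).GetString(byteArray);

// Note: the constructor flag only controls the preamble emitted when writing.
// GetString decodes the same way in both cases and does not strip a leading BOM (U+FEFF).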
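
One detail worth checking when reading such files back: GetString keeps a leading BOM as the character U+FEFF at the start of the string, whichever UTF8Encoding instance is used. If the BOM should be dropped, the preamble can be skipped by hand. A rough sketch, with Utf8Bom and DecodeSkippingBom as illustrative names only:

```csharp
using System;
using System.Text;

public static class Utf8Bom
{
    // Decodes UTF-8 bytes, skipping a leading EF BB BF preamble when present.
    public static string DecodeSkippingBom(byte[] bytes)
    {
        byte[] bom = Encoding.UTF8.GetPreamble(); // EF BB BF
        bool hasBom = bytes.Length >= bom.Length
            && bytes[0] == bom[0] && bytes[1] == bom[1] && bytes[2] == bom[2];
        int offset = hasBom ? bom.Length : 0;
        return Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
    }
}
```

The same method can be used for both kinds of CSV file, since input without a BOM is decoded from offset 0.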

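Converting a byte[] to a string seems simple, but any kind of encoding is likely to mess up the output string. This little function just works without any unexpected results. Note that it maps each byte to the char with the same numeric value, so it does not decode multi-byte UTF-8 sequences: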

private string ToString(byte[] bytes)
{
    string response = string.Empty;

    foreach (byte b in bytes)
        response += (Char)b;

    return response;
}

Using b.ToString("x2") on each byte outputs a hex string such as b4b5dfe475e58b67:

public static class Ext {

    public static string ToHexString(this byte[] hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return string.Empty;

        var s = new StringBuilder();
        foreach (byte b in hex) {
            s.Append(b.ToString("x2"));
        }
        return s.ToString();
    }

    public static byte[] ToHexBytes(this string hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return new byte[0];

        int l = hex.Length / 2;
        var b = new byte[l];
        for (int i = 0; i < l; ++i) {
            b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return b;
    }

    public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
    {
        if (bytes == null && bytesToCompare == null) return true; // ?
        if (bytes == null || bytesToCompare == null) return false;
        if (object.ReferenceEquals(bytes, bytesToCompare)) return true;

        if (bytes.Length != bytesToCompare.Length) return false;

        for (int i = 0; i < bytes.Length; ++i) {
            if (bytes[i] != bytesToCompare[i]) return false;
        }
        return true;
    }

}

There is also the UnicodeEncoding class (UTF-16), which is quite simple to use:

UnicodeEncoding ByteConverter = new UnicodeEncoding();
string stringDataForEncoding = "My Secret Data!";
byte[] dataEncoded = ByteConverter.GetBytes(stringDataForEncoding);

Console.WriteLine("Data after decoding: {0}", ByteConverter.GetString(dataEncoded));

The BitConverter class can be used to convert a byte[] to a string.

var convertedString = BitConverter.ToString(byteArray);

Documentation for the BitConverter class can be found on MSDN.

In addition to the selected answer, if you're using .NET 3.5 or .NET 3.5 CE, you have to specify the index of the first byte to decode and the number of bytes to decode:

string result = System.Text.Encoding.UTF8.GetString(byteArray, 0, byteArray.Length);
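
The index and count overload is also handy on newer frameworks when only part of a buffer holds data, for example after Stream.Read. A small sketch under that assumption; BufferDecoding, ReadChunk and the 4096 default are made up for illustration:

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferDecoding
{
    // Reads one chunk and decodes only the bytes that Stream.Read actually filled in.
    // Caveat: a chunk boundary can fall inside a multi-byte UTF-8 character;
    // for streaming input a Decoder or StreamReader is the safer choice.
    public static string ReadChunk(Stream source, int bufferSize = 4096)
    {
        var buffer = new byte[bufferSize];
        int count = source.Read(buffer, 0, buffer.Length);
        return Encoding.UTF8.GetString(buffer, 0, count);
    }
}
```

Decoding the whole buffer instead would append a run of '\0' characters for the unused tail.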

Alternatively:

var byteStr = Convert.ToBase64String(bytes);

As far as I know, none of the given answers guarantee correct behavior with null termination. Until someone shows me differently, I wrote my own static class for handling this with the following methods:

// Mimics the functionality of strlen() in C/C++
// Needed because neither StringBuilder nor Encoding.*.GetString() handles \0 well
static int StringLength(byte[] buffer, int startIndex = 0)
{
    int strlen = 0;
    while
    (
        (startIndex + strlen) < buffer.Length // Stay inside the buffer when no terminator is present
        && buffer[startIndex + strlen] != 0   // The typical null termination check
    )
    {
        ++strlen;
    }
    return strlen;
}

// This is messy, but I haven't found a built-in way in C# that guarantees null termination
public static string ParseBytes(byte[] buffer, out int strlen, int startIndex = 0)
{
    strlen = StringLength(buffer, startIndex);
    byte[] c_str = new byte[strlen];
    Array.Copy(buffer, startIndex, c_str, 0, strlen);
    return Encoding.UTF8.GetString(c_str);
}

The reason for the startIndex parameter is that in the example I was working on, I specifically needed to parse a byte[] as an array of null-terminated strings. It can be safely ignored in the simple case.
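
For the array-of-null-terminated-strings case, the same idea can be sketched with Array.IndexOf and the offset overload of GetString, which avoids the intermediate copy. This is not the class from the answer; NullTerminated and SplitStrings are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class NullTerminated
{
    // Splits a buffer of consecutive '\0'-terminated UTF-8 strings.
    public static List<string> SplitStrings(byte[] buffer)
    {
        var result = new List<string>();
        int start = 0;
        while (start < buffer.Length)
        {
            int end = Array.IndexOf(buffer, (byte)0, start);
            if (end < 0) end = buffer.Length; // the last string has no terminator
            result.Add(Encoding.UTF8.GetString(buffer, start, end - start));
            start = end + 1; // continue after the terminator
        }
        return result;
    }
}
```

Two terminators in a row produce an empty string in the result, which may or may not be what a given file format intends.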

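A LINQ one-liner for converting a byte array byteArrFilename read from a file to a pure ASCII C-style zero-terminated string. It is handy for reading things like file index tables in old archive formats.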

String filename = new String(byteArrFilename.TakeWhile(x => x != 0)
                              .Select(x => x < 128 ? (Char)x : '?').ToArray());

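I use '?' as the default character for anything that is not pure ASCII here, but that can be changed, of course. If you want to be sure you can detect it, just use '\0' instead, since the TakeWhile at the start ensures that a string built this way cannot possibly contain '\0' values from the input source.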

Try this console application:

static void Main(string[] args)
{
    //Encoding _UTF8 = Encoding.UTF8;
    string[] _mainString = { "Hello, World!" };
    Console.WriteLine("Main String: " + _mainString[0]); // print the element, not the array's type name

    // Convert a string to UTF-8 bytes.
    byte[] _utf8Bytes = Encoding.UTF8.GetBytes(_mainString[0]);

    // Convert UTF-8 bytes to a string.
    string _stringuUnicode = Encoding.UTF8.GetString(_utf8Bytes);
    Console.WriteLine("String Unicode: " + _stringuUnicode);
}

This is an approach where you don't have to bother with encodings at all. I use it in my network class to send binary objects as strings.

public static byte[] String2ByteArray(string str)
{
    char[] chars = str.ToArray();
    byte[] bytes = new byte[chars.Length * 2];

    for (int i = 0; i < chars.Length; i++)
        Array.Copy(BitConverter.GetBytes(chars[i]), 0, bytes, i * 2, 2);

    return bytes;
}

public static string ByteArray2String(byte[] bytes)
{
    char[] chars = new char[bytes.Length / 2];

    for (int i = 0; i < chars.Length; i++)
        chars[i] = BitConverter.ToChar(bytes, i * 2);

    return new string(chars);
}

string result = ASCIIEncoding.UTF8.GetString(byteArray);

Source URL: https://stackoverflow.com/questions/1003275/how-to-convert-utf-8-byte-to-string
