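I am building a string library that supports both ASCII and UTF-8. I created two typedefs, t_ascii and t_utf8. ASCII can safely be read as UTF-8, but UTF-8 cannot safely be read as ASCII.
Is there a way to get a warning on implicit conversion from t_utf8 to t_ascii, but not on implicit conversion from t_ascii to t_utf8?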
Ideally, I would like these warnings (and only these warnings) to be issued:
#include <stdint.h>
typedef char t_ascii;
typedef uint_least8_t t_utf8;
int main()
{
    t_ascii const* asciistr = "Hello world"; // Ok
    t_utf8 const* utf8str = "你好世界"; // Ok
    asciistr = utf8str; // Warning: utf8 to ascii is not safe
    utf8str = asciistr; // Ok: ascii to utf8 is safe
    t_ascii asciichar = 'A';
    t_utf8 utf8char = 'B';
    asciichar = utf8char; // Warning: utf8 to ascii is not safe
    utf8char = asciichar; // Ok: ascii to utf8 is safe
}
Currently, when building with -Wall (and even with -funsigned-char), I get the following warnings:
gcc main.c -Wall -Wextra
main.c: In function ‘main’:
main.c:10:35: warning: pointer targets in initialization of ‘const t_utf8 *’ {aka ‘const unsigned char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
10 | t_utf8 const* utf8str = "你好世界"; // Ok
| ^~~~~~~~~~
main.c:12:18: warning: pointer targets in assignment from ‘const t_utf8 *’ {aka ‘const unsigned char *’} to ‘const t_ascii *’ {aka ‘const char *’} differ in signedness [-Wpointer-sign]
12 | asciistr = utf8str; // Warning: utf8 to ascii is not safe
| ^
main.c:16:17: warning: pointer targets in assignment from ‘const t_ascii *’ {aka ‘const char *’} to ‘const t_utf8 *’ {aka ‘const unsigned char *’} differ in signedness [-Wpointer-sign]
16 | utf8str = asciistr; // Ok: ascii to utf8 is safe
| ^
Reply from a uj5u.com user:
Compile with -Wall. Always compile with -Wall.
<user>@squall:~/src/p1$ gcc -Wall -c test2.c
test2.c: In function ‘main’:
test2.c:9:31: warning: pointer targets in initialization of ‘const t_utf8 *’ {aka ‘const signed char *’} from ‘char *’ differ in signedness [-Wpointer-sign]
9 | t_utf8 const* utf8str = "你好世界";
| ^~~~~~~~~~~~~~
test2.c:11:13: warning: pointer targets in assignment from ‘const t_ascii *’ {aka ‘const char *’} to ‘const t_utf8 *’ {aka ‘const signed char *’} differ in signedness [-Wpointer-sign]
11 | utf8str = asciistr; // Ok: ascii to utf8 is safe
| ^
test2.c:12:14: warning: pointer targets in assignment from ‘const t_utf8 *’ {aka ‘const signed char *’} to ‘const t_ascii *’ {aka ‘const char *’} differ in signedness [-Wpointer-sign]
12 | asciistr = utf8str; // Should issue warning: utf8 to ascii is not safe
| ^
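You expect the conversion from t_ascii to t_utf8 to be safe, but it is not: the signedness differs.
The warning has nothing to do with the fact that valid UTF-8 is sometimes not valid ASCII; the compiler knows nothing about that. The warning is about signedness.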
If you want char to be unsigned, compile with -funsigned-char. But then no warning is issued at all.
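(By the way, if you think the type int_least8_t can hold a multibyte character, that is, the full encoding of a UTF-8 code point, it cannot. Every int_least8_t, and therefore every t_utf8, has exactly the same size within a single compilation unit.)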
Reply from a uj5u.com user:
Just compile it with a standard C compiler. See the question "What compiler options are recommended for beginners learning C?"
Result:
<source>: In function 'main':
<source>:9:31: error: pointer targets in initialization of 'const t_utf8 *' {aka 'const unsigned char *'} from 'char *' differ in signedness [-Wpointer-sign]
9 | t_utf8 const* utf8str = "你好世界"; // Ok
| ^~~~~~~~~~
<source>:11:14: error: pointer targets in assignment from 'const t_utf8 *' {aka 'const unsigned char *'} to 'const t_ascii *' {aka 'const char *'} differ in signedness [-Wpointer-sign]
11 | asciistr = utf8str; // Warning: utf8 to ascii is not safe
| ^
<source>:12:13: error: pointer targets in assignment from 'const t_ascii *' {aka 'const char *'} to 'const t_utf8 *' {aka 'const unsigned char *'} differ in signedness [-Wpointer-sign]
12 | utf8str = asciistr; // Ok: ascii to utf8 is safe
| ^
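"but not on implicit conversion from t_ascii to t_utf8?"
No, you cannot have that in standard C, because it is an invalid pointer conversion. You can silence the compiler with an explicit cast, but if you do, you invoke undefined behavior.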
Apart from that, you can use C11 _Generic to find out which type uint_least8_t boils down to:
#include <stdint.h>
#include <stdio.h>

/* Print which character type the object's type is compatible with. */
#define what_type(obj) printf("%s is same as %s\n", #obj, \
    _Generic((obj),                                       \
        char:          "char",                            \
        unsigned char: "unsigned char",                   \
        signed char:   "signed char"))

int main (void)
{
    typedef char t_ascii;
    typedef uint_least8_t t_utf8;
    t_ascii ascii;
    t_utf8 utf8;
    what_type(ascii);
    what_type(utf8);
}
Output with gcc on x86 Linux:
ascii is same as char
utf8 is same as unsigned char