Problem description
I often use Char.IsDigit to check whether a char is a digit, which is especially handy in LINQ queries as a pre-check for int.Parse, as here: "123".All(Char.IsDigit).
But there are chars which are digits yet cannot be parsed to an int, such as ۵ (U+06F5).
// true
bool isDigit = Char.IsDigit('۵');
var cultures = CultureInfo.GetCultures(CultureTypes.SpecificCultures);
int num;
// false
bool isIntForAnyCulture = cultures
    .Any(c => int.TryParse('۵'.ToString(), NumberStyles.Any, c, out num));
Why is that? Is my int.Parse pre-check via Char.IsDigit therefore incorrect?
There are 310 chars which are digits:
List<char> digitList = Enumerable.Range(0, UInt16.MaxValue)
    .Select(i => Convert.ToChar(i))
    .Where(c => Char.IsDigit(c))
    .ToList();
Here's the implementation of Char.IsDigit in .NET 4 (ILSpy):
public static bool IsDigit(char c)
{
    if (char.IsLatin1(c))
    {
        return c >= '0' && c <= '9';
    }
    return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.DecimalDigitNumber;
}
So why are there chars that belong to the DecimalDigitNumber category ("Decimal digit character, that is, a character in the range 0 through 9...") which can't be parsed to an int in any culture?
It's because Char.IsDigit checks for all characters in the Unicode "Number, Decimal Digit" (Nd) category, as listed here:
http://www.fileformat.info/info/unicode/category/Nd/list.htm
That does not mean the character is a valid numeric character in the current locale. In fact, int.Parse() can parse only the ASCII digits 0 through 9, regardless of the locale setting.
For example, this doesn't work:
int test = int.Parse("٣", CultureInfo.GetCultureInfo("ar"));
It fails even though ٣ (U+0663) is a valid Arabic digit character and "ar" is the Arabic locale identifier.
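Given that behaviour, a pre-check meant to guard int.Parse can test for the ASCII range directly instead of relying on Char.IsDigit. The following is only a minimal sketch; the class and method names (DigitPreCheck, IsAsciiDigits) are hypothetical and not part of the framework.

```csharp
using System;
using System.Linq;

public static class DigitPreCheck
{
    // Hypothetical helper: true only if the string is non-empty and every
    // char is an ASCII digit (U+0030 through U+0039).
    public static bool IsAsciiDigits(string s)
    {
        return !string.IsNullOrEmpty(s) && s.All(c => c >= '0' && c <= '9');
    }
}
```

With this check "123" passes, while a string containing U+0663 is rejected even though Char.IsDigit returns true for that character. A long run of ASCII digits can still overflow an int, so int.TryParse is probably still the safer final step.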
The Microsoft article "How to: Parse Unicode Digits" states that:
The only Unicode digits that the .NET Framework parses as decimals are the ASCII digits 0 through 9, specified by the code values U+0030 through U+0039. The .NET Framework parses all other Unicode digits as characters.
However, note that you can use char.GetNumericValue() to convert a Unicode numeric character to its numeric equivalent as a double.
The return value is a double rather than an int because of characters like this:
Console.WriteLine(char.GetNumericValue('¼')); // Prints 0.25
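To make the difference between the classification methods concrete, here is a small sketch that reports Char.IsDigit, Char.IsNumber and Char.GetNumericValue for a single character. The NumericValueDemo class and its Describe method are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class NumericValueDemo
{
    // Hypothetical helper: summarizes how the framework classifies one char.
    public static string Describe(char c)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "U+{0:X4} IsDigit={1} IsNumber={2} Value={3}",
            (int)c, char.IsDigit(c), char.IsNumber(c), char.GetNumericValue(c));
    }
}
```

For U+00BC (the one-quarter fraction) this reports IsDigit=False, IsNumber=True and a value of 0.25. For U+0663 both checks are true and the value is 3. For a non-numeric character such as 'A', GetNumericValue returns -1.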
You could use something like this to convert all numeric characters in a string into their ASCII equivalent:
// Requires: using System.Text;
public string ConvertNumericChars(string input)
{
    StringBuilder output = new StringBuilder();
    foreach (char ch in input)
    {
        if (char.IsDigit(ch))
        {
            double value = char.GetNumericValue(ch);
            // Only whole values from 0 to 9 map onto an ASCII digit.
            if ((value >= 0) && (value <= 9) && (value == (int)value))
            {
                output.Append((char)('0' + (int)value));
                continue;
            }
        }
        // Anything else is copied through unchanged.
        output.Append(ch);
    }
    return output.ToString();
}
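As a usage sketch, the conversion can be combined with int.TryParse so that strings written with non-ASCII decimal digits parse as well. This is an illustration under assumptions, not a definitive implementation: the DigitNormalizer class and both method names are hypothetical, and the conversion logic is repeated in compact form so the block compiles on its own.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DigitNormalizer
{
    // Same idea as ConvertNumericChars: map each decimal digit char to '0'..'9'.
    public static string ToAsciiDigits(string input)
    {
        var output = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            // Char.IsDigit is true only for DecimalDigitNumber chars,
            // whose numeric value is a whole number from 0 to 9.
            output.Append(char.IsDigit(ch)
                ? (char)('0' + (int)char.GetNumericValue(ch))
                : ch);
        }
        return output.ToString();
    }

    // Hypothetical wrapper: normalize first, then accept digits only.
    public static bool TryParseUnicodeInt(string input, out int result)
    {
        return int.TryParse(ToAsciiDigits(input), NumberStyles.None,
                            CultureInfo.InvariantCulture, out result);
    }
}
```

Calling TryParseUnicodeInt with the three Arabic-Indic digits U+0661, U+0662, U+0663 yields 123. Digits outside the Basic Multilingual Plane are stored as surrogate pairs, so this char-by-char loop leaves them unchanged.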