Question
I need to determine, with about 80% accuracy, whether a file is binary or text. Is there any way to do this in C#, even a quick and dirty/ugly one?
Recommended answer
I would probably look for an abundance of control characters, which are typically present in a binary file but rare in a text file. Binary files tend to contain enough 0 bytes that just testing for many of them would probably be sufficient to catch most files. If you care about localization, you'd need to test for multi-byte patterns as well.
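A rough sketch of that heuristic might look like the following. The class name `BinaryFileSniffer`, the 8 KB sample size, and the 10% control-character threshold are all assumptions picked for illustration, not values from the answer. The sketch also treats a single 0 byte as enough to call the file binary, which is stricter than "many 0 bytes", and it only handles the multi-byte case by checking for a UTF-16 byte order mark.

```csharp
using System;
using System.IO;

public static class BinaryFileSniffer
{
    // Hypothetical defaults: sample the first 8 KB and flag the file when
    // more than 10% of the sampled bytes are unusual control characters.
    private const int SampleSize = 8192;
    private const double ControlCharThreshold = 0.10;

    // Guesses whether the first `length` bytes of `buffer` look binary.
    public static bool LooksBinary(byte[] buffer, int length)
    {
        if (length == 0) return false; // assumption: an empty file counts as text

        // UTF-16 text is full of 0 bytes, so a UTF-16 byte order mark is
        // treated as "text" before the 0-byte test runs.
        if (length >= 2 &&
            ((buffer[0] == 0xFF && buffer[1] == 0xFE) ||
             (buffer[0] == 0xFE && buffer[1] == 0xFF)))
        {
            return false;
        }

        int suspicious = 0;
        for (int i = 0; i < length; i++)
        {
            byte b = buffer[i];
            if (b == 0) return true; // a 0 byte is rare in 8-bit text

            // Count control characters other than tab, LF, form feed and CR.
            if (b < 0x20 && b != 9 && b != 10 && b != 12 && b != 13)
            {
                suspicious++;
            }
        }

        return (double)suspicious / length > ControlCharThreshold;
    }

    // Reads up to SampleSize bytes from the start of the file and applies
    // the heuristic above.
    public static bool IsProbablyBinary(string path)
    {
        byte[] buffer = new byte[SampleSize];
        using (FileStream stream = File.OpenRead(path))
        {
            int total = 0;
            int read;
            while (total < buffer.Length &&
                   (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
            {
                total += read;
            }
            return LooksBinary(buffer, total);
        }
    }
}
```

Calling `BinaryFileSniffer.IsProbablyBinary(path)` then gives a quick guess for a file on disk. UTF-16 text without a byte order mark would still be reported as binary by this sketch, so the BOM check would need extending if such files matter.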
As stated, though, you can always be unlucky and get a binary file that looks like text, or vice versa.