Problem description
I want to remove duplicate words from a text file.
I have some text files whose contents look like the following:
None_None
ConfigHandler_56663624
ConfigHandler_56663624
ConfigHandler_56663624
ConfigHandler_56663624
None_None
ColumnConverter_56963312
ColumnConverter_56963312
PredicatesFactory_56963424
PredicatesFactory_56963424
PredicateConverter_56963648
PredicateConverter_56963648
ConfigHandler_80134888
ConfigHandler_80134888
ConfigHandler_80134888
ConfigHandler_80134888
The resulting output needs to be:
None_None
ConfigHandler_56663624
ColumnConverter_56963312
PredicatesFactory_56963424
PredicateConverter_56963648
ConfigHandler_80134888
So far I have only tried this command, but it does not work: en=set(open('file.txt')
Could anyone help me extract only the unique entries from the file?
Thanks.
Recommended answer
Here's an option that preserves order (unlike a set) but otherwise has the same behaviour. Note that the EOL character is deliberately stripped and blank lines are ignored:
from collections import OrderedDict

with open('/home/jon/testdata.txt') as fin:
    # Strip trailing whitespace (including the EOL character) from each line.
    lines = (line.rstrip() for line in fin)
    # Skip blank lines; OrderedDict keeps the first occurrence of each key, in order.
    unique_lines = OrderedDict.fromkeys(line for line in lines if line)

print(list(unique_lines.keys()))
# ['None_None', 'ConfigHandler_56663624', 'ColumnConverter_56963312', 'PredicatesFactory_56963424', 'PredicateConverter_56963648', 'ConfigHandler_80134888']
Then you just need to write the resulting keys to your output file.
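The answer leaves the writing step to the reader. As a minimal sketch of one way to do it, the following wraps the same OrderedDict approach in two helpers and writes one entry per line. The function names unique_lines and write_unique and the path arguments are hypothetical, chosen only for illustration; they are not part of the original answer.

```python
from collections import OrderedDict


def unique_lines(path):
    """Return the non-blank lines of `path`, deduplicated, in first-seen order."""
    with open(path) as fin:
        # Strip trailing whitespace, including the EOL character.
        stripped = (line.rstrip() for line in fin)
        # Consume the generator while the file is still open.
        return list(OrderedDict.fromkeys(line for line in stripped if line))


def write_unique(src_path, dst_path):
    """Write the deduplicated lines of `src_path` to `dst_path`, one per line."""
    lines = unique_lines(src_path)
    with open(dst_path, "w") as fout:
        fout.writelines(line + "\n" for line in lines)
    return lines
```

Calling write_unique('file.txt', 'unique.txt') (both file names are placeholders) would read the input and write the deduplicated entries to the second file. On Python 3.7 and later, a plain dict also preserves insertion order, so dict.fromkeys could be used in place of OrderedDict.fromkeys.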