Problem description
As a self-taught Python hobbyist, how would I go about learning to import and export binary files using standard formats?
I'd like to implement a script that takes an ePub ebook (XHTML + CSS in a zip) and converts it to the Mobipocket (PalmDOC) format so that the Amazon Kindle can read it (as part of a larger project that I'm working on).
There is already an awesome open-source project for managing ebook libraries: Calibre. I wanted to try implementing this on my own as a learning/self-teaching exercise. I started looking at their Python source code and realized that I have no idea what is going on. Of course, the big danger in being self-taught at anything is not knowing what you don't know.
In this case, I know that I don't know much about these binary files or how to work with them in Python code (struct?). But I think I'm probably missing a lot of knowledge about binary files in general, and I'd like some help understanding how to work with them. Here is a detailed overview of the mobi/palmdoc headers. Thanks!
No question, good point! Do you have any tips on how to gain a basic knowledge of working with binary files? Python-specific tips would be helpful, but other approaches could also be useful.
TOM: Edited as question, added intro / better title
Recommended answer
You should probably start with the struct module, as you pointed to in your question, and of course, open the file in binary mode.
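As a rough illustration of that starting point, here is a minimal sketch that writes and reads a fixed-size header with struct on a file opened in binary mode. The 8-byte layout (a 4-byte magic string, a 2-byte version, a 2-byte record count, all big-endian) and the function names are made up for the example; they are not the mobi/palmdoc layout.

```python
import struct

# Hypothetical header layout, invented for this example:
# ">"  = big-endian, "4s" = 4 raw bytes, "H" = unsigned 16-bit integer.
HEADER = struct.Struct(">4sHH")


def write_header(path, magic, version, count):
    """Pack the three fields into 8 bytes and write them to a binary file."""
    with open(path, "wb") as f:  # "wb": write bytes, no text encoding
        f.write(HEADER.pack(magic, version, count))


def read_header(path):
    """Read the first 8 bytes and unpack them into (magic, version, count)."""
    with open(path, "rb") as f:  # "rb": read bytes, no newline translation
        raw = f.read(HEADER.size)
    return HEADER.unpack(raw)
```

Calling write_header("demo.bin", b"DEMO", 1, 3) and then read_header("demo.bin") should hand back the same three values. The ">" prefix matters: without it, struct uses the machine's native byte order and alignment, which is usually not what a file format specifies.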
Basically, you just start at the beginning of the file and pick it apart piece by piece. It's a hassle, but not a huge problem. If the files are compressed or encrypted, things can get more difficult. It helps to start with a file whose contents you already know, so you're not guessing all the time.
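To make "piece by piece" a bit more concrete, here is a small sketch that builds a file image with known contents and then walks through it from the start while tracking an offset. The format (a "DEMO" magic, a record count, then length-prefixed records) and the helper names build_sample and parse are invented for the illustration; a real format such as mobi/palmdoc has its own fields and would need its own format strings.

```python
import struct

HEADER_FMT = ">4sH"  # hypothetical: 4-byte magic + 16-bit record count
LENGTH_FMT = ">H"    # hypothetical: each record is prefixed by a 16-bit length


def build_sample():
    """Build a small file image whose contents we already know."""
    records = [b"hello", b"binary", b"world"]
    header = struct.pack(HEADER_FMT, b"DEMO", len(records))
    body = b"".join(struct.pack(LENGTH_FMT, len(r)) + r for r in records)
    return header + body


def parse(data):
    """Walk the bytes from the start, advancing an offset past each piece."""
    offset = 0
    magic, count = struct.unpack_from(HEADER_FMT, data, offset)
    offset += struct.calcsize(HEADER_FMT)
    if magic != b"DEMO":
        raise ValueError("unexpected magic: %r" % magic)

    records = []
    for _ in range(count):
        (length,) = struct.unpack_from(LENGTH_FMT, data, offset)
        offset += struct.calcsize(LENGTH_FMT)
        records.append(data[offset:offset + length])
        offset += length
    return records


if __name__ == "__main__":
    print(parse(build_sample()))
```

Because the sample was built by hand, any mismatch between what parse returns and what went in points at a mistake in the parsing code rather than at an unknown quirk of the file.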
Try it a bit, and maybe you'll come up with more specific questions.