Problem description
In my Java code, I am using Guava's Multimap (com.google.common.collect.Multimap) like this:
Multimap<Integer, Integer> Index = HashMultimap.create();
Here, the Multimap key is one portion of a URL and the value is another portion of the URL (converted into an integer). I give my JVM 2560 MB (2.5 GB) of heap space (using -Xmx and -Xms). However, it can only store roughly 9 to 10 million such (key, value) pairs of integers. In theory (going by the memory occupied by an int) it should be able to store more.
Can anybody help me with the following:
- Why is Multimap using lots of memory? I checked my code, and without inserting pairs into the Multimap it only uses 1/2 MB of memory.
- Is there another way, or a home-baked solution, to solve this memory issue? That is, is there any way to reduce the object overhead, given that I only want to store int-int pairs? In any other language? Or any other solution (home-baked preferred) to the issue I faced, such as a DB-based one?
Recommended answer
There's a huge amount of overhead associated with Multimap. At a minimum:
- Each key and value is an Integer object, which (at a minimum) doubles the storage requirements of each int value.
- Each unique key value in the HashMultimap is associated with a Collection of values (according to the source, the Collection is a HashSet).
- Each HashSet is created with default space for 8 values.
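The layering listed above can be imitated with plain JDK collections to get a feel for the per-pair cost. The following is a rough sketch, not Guava's actual implementation: a HashMap of HashSet stands in for HashMultimap, the class and method names are made up for illustration, and the heap reading depends on System.gc(), which is only a hint to the JVM, so the number it prints is approximate.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical probe: a JDK map-of-sets used as a stand-in for HashMultimap.
final class BoxedOverheadProbe {

    // Heap currently in use, after asking (not forcing) the JVM to collect garbage.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Builds a map with one value per key and returns it so it stays reachable.
    static Map<Integer, Set<Integer>> build(int pairs) {
        Map<Integer, Set<Integer>> index = new HashMap<>();
        for (int i = 0; i < pairs; i++) {
            // The offsets keep keys and values outside the small-Integer cache,
            // so every key and value is a separately allocated Integer object.
            // The initial capacity of 8 mirrors the default quoted in the answer.
            index.computeIfAbsent(1_000_000 + i, k -> new HashSet<>(8))
                 .add(2_000_000 + i);
        }
        return index;
    }

    // Approximate heap growth per stored pair, in bytes.
    static long measureBytesPerPair(int pairs) {
        long before = usedHeap();
        Map<Integer, Set<Integer>> index = build(pairs);
        long after = usedHeap();
        if (index.size() != pairs) {
            throw new IllegalStateException("unexpected map size");
        }
        return (after - before) / pairs;
    }

    public static void main(String[] args) {
        System.out.println("approx. bytes per pair: " + measureBytesPerPair(200_000));
    }
}
```

The worst case (every key unique) is measured here on purpose; sharing a key among several values spreads the per-key HashSet cost over more pairs.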
So each key/value pair requires (at a minimum) perhaps an order of magnitude more space than you might expect for two int values. (Somewhat less when multiple values are stored under a single key.) I would expect 10 million key/value pairs to take perhaps 400MB.
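The figures in this thread can be put side by side with a little arithmetic. This sketch only uses numbers quoted in the question and answer; the class name is made up, "400MB" is read as 400 x 10^6 bytes, and the 2560 MB heap is read as 2560 x 2^20 bytes, both of which are assumptions.

```java
// Hypothetical helper that restates the thread's numbers as bytes per pair.
final class PairFootprint {

    // Bytes needed if each pair were stored as two raw 4-byte ints.
    static long rawBytes(long pairs) {
        return pairs * 2L * Integer.BYTES;
    }

    // Average bytes per pair for a given total (integer division, rounds down).
    static long bytesPerPair(long totalBytes, long pairs) {
        return totalBytes / pairs;
    }

    public static void main(String[] args) {
        long pairs = 10_000_000L;

        // Two raw ints per pair: 80,000,000 bytes for 10 million pairs.
        System.out.println(rawBytes(pairs));                                 // 80000000

        // The answer's 400MB estimate works out to 40 bytes per pair.
        System.out.println(bytesPerPair(400_000_000L, pairs));               // 40

        // The question's observation: a 2560 MB heap filled up at about
        // 9 million pairs, i.e. at most about 298 bytes of heap per pair.
        System.out.println(bytesPerPair(2560L * 1024 * 1024, 9_000_000L));   // 298
    }
}
```

The last figure is an upper bound on what each pair cost in the asker's run, since the heap also held everything else the program allocated, including any temporary copies made while the table was growing.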
Although you have 2.5GB of heap space, I wouldn't be all that surprised if that's not enough. The above estimate is, I think, on the low side. Plus, it only accounts for how much is needed to store the map once it is built. As the map grows, the table needs to be reallocated and rehashed, which temporarily at least doubles the amount of space used. Finally, all this assumes that int values and object references require 4 bytes. If the JVM is using 64-bit addressing, the byte count for references probably doubles.
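For the "home-baked" part of the question, one way to sidestep the boxing and per-key collection overhead described above is to keep the pairs in a primitive array. The sketch below is not from the original answer and is only one possible design: the class name and methods are hypothetical, each pair is packed into a single long (8 bytes), lookups sort the array lazily and then binary-search it, and, unlike HashMultimap, duplicate pairs are not removed.

```java
import java.util.Arrays;

// Hypothetical int-to-int multimap backed by one long[]; 8 bytes per pair.
final class PackedIntMultimap {
    private long[] pairs = new long[16];
    private int size = 0;
    private boolean sorted = true;

    // Key goes in the high 32 bits, value in the low 32 bits.
    private static long pack(int key, int value) {
        return ((long) key << 32) | (value & 0xFFFFFFFFL);
    }

    void put(int key, int value) {
        if (size == pairs.length) {
            // Growing copies the array, so memory use briefly doubles here too.
            pairs = Arrays.copyOf(pairs, size * 2);
        }
        pairs[size++] = pack(key, value);
        sorted = false;
    }

    // Returns all values stored under key (empty array if none).
    // Values come back ordered by their unsigned 32-bit representation.
    int[] get(int key) {
        if (!sorted) {
            Arrays.sort(pairs, 0, size);
            sorted = true;
        }
        int lo = lowerBound((long) key << 32);
        int hi = lo;
        while (hi < size && (int) (pairs[hi] >> 32) == key) {
            hi++;
        }
        int[] out = new int[hi - lo];
        for (int i = lo; i < hi; i++) {
            out[i - lo] = (int) pairs[i]; // low 32 bits hold the value
        }
        return out;
    }

    // First index whose packed entry is >= target, or size if there is none.
    private int lowerBound(long target) {
        int lo = 0;
        int hi = size;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (pairs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        PackedIntMultimap index = new PackedIntMultimap();
        index.put(7, 100);
        index.put(3, 5);
        index.put(7, 42);
        System.out.println(Arrays.toString(index.get(7))); // [42, 100]
    }
}
```

This trades fast inserts interleaved with lookups for compactness: it suits a build-once, query-many workload, and 10 million pairs would occupy about 80 MB in the array once it is built.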