Problem description
I have a classic Java EE system: a web tier with JSF, EJB 3 for the business logic, and Hibernate 3 handling data access to a DB2 database. I am struggling with the following scenario. A user initiates a process that retrieves a large data set from the database. The retrieval takes some time, so the user does not receive an immediate response, gets impatient, opens a new browser and initiates the retrieval again, sometimes several times. The EJB container is unaware that the earlier retrievals are no longer relevant, and when the database returns a result set, Hibernate starts populating a set of POJOs that take up vast amounts of memory, eventually causing an OutOfMemoryError.
A potential solution that I thought of was to use the Hibernate Session's cancelQuery method. However, cancelQuery only works before the database returns a result set. Once the database has returned a result set and Hibernate begins populating the POJOs, cancelQuery no longer has any effect. In this case, the database queries themselves return rather quickly, and the bulk of the performance overhead seems to lie in populating the POJOs, at which point calling cancelQuery is no longer useful.
Recommended answer
The solution implemented ended up looking like this:
The general idea was to maintain a map that associates every Hibernate session currently running a query with the HttpSession of the user who initiated it, so that when the user closes the browser we are able to kill the running queries.
There were two main challenges to overcome here. One was propagating the HTTP session-id from the web tier to the EJB tier without interfering with all the method calls along the way - i.e. not tampering with existing code in the system. The second challenge was to figure out how to cancel the queries once the database had already started returning results and Hibernate was populating objects with the results.
The first problem was overcome once we realized that all the methods called along the stack were being handled by the same thread. This makes sense, as our application lives entirely within one container and makes no remote calls. Given that, we created a Servlet Filter that intercepts every call to the application and stores the current HTTP session-id in a ThreadLocal variable. This way the HTTP session-id is available to every method call further down the line.
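The ThreadLocal part of this can be sketched as follows. This is a minimal illustration, not the original code: the class name SessionIdHolder and its methods are hypothetical, and the servlet filter itself is only simulated by callWith so that the sketch runs without a servlet container. In a real filter, the set call would presumably receive request.getSession().getId() before chain.doFilter(...), with the clear call in a finally block.

```java
import java.util.function.Supplier;

/** Hypothetical holder for the current request's HTTP session ID. */
final class SessionIdHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private SessionIdHolder() {}

    static void set(String sessionId) { CURRENT.set(sessionId); }

    /** Returns the ID bound to the calling thread, or null if none is bound. */
    static String get() { return CURRENT.get(); }

    static void clear() { CURRENT.remove(); }

    /**
     * Runs the given work with the session ID bound to the current thread.
     * This mirrors what a servlet filter would do around chain.doFilter(...).
     */
    static <T> T callWith(String sessionId, Supplier<T> work) {
        set(sessionId);
        try {
            return work.get();
        } finally {
            // Container threads are pooled and reused, so always clean up;
            // otherwise the next request on this thread sees a stale ID.
            clear();
        }
    }
}
```

Any code running on the same thread, however deep in the call stack, can then call SessionIdHolder.get() without the ID being passed through method signatures.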
The second challenge was a little stickier. We discovered that the Hibernate method responsible for running the queries and subsequently populating the POJOs is called doQuery and is located in the org.hibernate.loader.Loader class. (We happen to be using Hibernate 3.5.3, but the same holds true for newer versions of Hibernate.):
```java
private List doQuery(
        final SessionImplementor session,
        final QueryParameters queryParameters,
        final boolean returnProxies) throws SQLException, HibernateException {

    final RowSelection selection = queryParameters.getRowSelection();
    final int maxRows = hasMaxRows( selection ) ?
            selection.getMaxRows().intValue() :
            Integer.MAX_VALUE;

    final int entitySpan = getEntityPersisters().length;

    final ArrayList hydratedObjects = entitySpan == 0 ? null : new ArrayList( entitySpan * 10 );
    final PreparedStatement st = prepareQueryStatement( queryParameters, false, session );
    final ResultSet rs = getResultSet( st, queryParameters.hasAutoDiscoverScalarTypes(), queryParameters.isCallable(), selection, session );

    final EntityKey optionalObjectKey = getOptionalObjectKey( queryParameters, session );
    final LockMode[] lockModesArray = getLockModes( queryParameters.getLockOptions() );
    final boolean createSubselects = isSubselectLoadingEnabled();
    final List subselectResultKeys = createSubselects ? new ArrayList() : null;
    final List results = new ArrayList();

    try {
        handleEmptyCollections( queryParameters.getCollectionKeys(), rs, session );

        EntityKey[] keys = new EntityKey[entitySpan]; //we can reuse it for each row

        if ( log.isTraceEnabled() ) log.trace( "processing result set" );

        int count;
        for ( count = 0; count < maxRows && rs.next(); count++ ) {

            if ( log.isTraceEnabled() ) log.debug("result set row: " + count);

            Object result = getRowFromResultSet(
                    rs,
                    session,
                    queryParameters,
                    lockModesArray,
                    optionalObjectKey,
                    hydratedObjects,
                    keys,
                    returnProxies
            );
            results.add( result );

            if ( createSubselects ) {
                subselectResultKeys.add(keys);
                keys = new EntityKey[entitySpan]; //can't reuse in this case
            }
        }

        if ( log.isTraceEnabled() ) {
            log.trace( "done processing result set (" + count + " rows)" );
        }
    }
    finally {
        session.getBatcher().closeQueryStatement( st, rs );
    }

    initializeEntitiesAndCollections( hydratedObjects, rs, session, queryParameters.isReadOnly( session ) );

    if ( createSubselects ) createSubselects( subselectResultKeys, queryParameters, session );

    return results; //getResultList(results);
}
```
In this method you can see that the results are first brought from the database in the form of a good old-fashioned java.sql.ResultSet, after which the method loops over each row and creates an object from it. Some additional initialization is performed in the initializeEntitiesAndCollections() method, which is called after the loop. After a little debugging, we discovered that the bulk of the performance overhead was in these two sections of the method, not in the part that gets the java.sql.ResultSet from the database, yet the cancelQuery method was only effective on that first part. The solution, therefore, was to add a condition to the for loop that checks whether the thread has been interrupted, like this:
```java
// isInterrupted() only reads the interrupt flag; it does not clear it
for ( count = 0; count < maxRows && rs.next() && !Thread.currentThread().isInterrupted(); count++ ) {
    // ...
}
```
The same check is performed before calling the initializeEntitiesAndCollections() method:
```java
if (!Thread.interrupted()) {
    initializeEntitiesAndCollections(hydratedObjects, rs, session,
            queryParameters.isReadOnly(session));
    if (createSubselects) {
        createSubselects(subselectResultKeys, queryParameters, session);
    }
}
```
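The interplay of the two checks can be seen in isolation in the sketch below. It is not Hibernate code: the class InterruptFlagDemo and its methods are hypothetical, and the "rows" are just loop iterations. The loop uses the non-clearing isInterrupted() check, and the post-processing step uses Thread.interrupted(), which reports the flag and resets it.

```java
/** Hypothetical demo of the two interrupt checks; not Hibernate code. */
final class InterruptFlagDemo {

    private InterruptFlagDemo() {}

    /**
     * Processes up to maxRows simulated rows and returns how many were
     * processed. To simulate a cancel request, the thread interrupts itself
     * while handling row interruptAtRow (pass -1 to never interrupt).
     */
    static int processRows(int maxRows, int interruptAtRow) {
        int count;
        // isInterrupted() reads the flag without clearing it.
        for (count = 0; count < maxRows && !Thread.currentThread().isInterrupted(); count++) {
            if (count == interruptAtRow) {
                Thread.currentThread().interrupt(); // simulated cancel request
            }
        }
        return count;
    }

    /**
     * Returns true if post-processing would run. Thread.interrupted()
     * returns the flag AND clears it, so later code is unaffected.
     */
    static boolean postProcessUnlessInterrupted() {
        return !Thread.interrupted();
    }
}
```

After an interrupted run, the post-processing step is skipped once and the thread's interrupt flag is back to false, which matters because container threads are reused for later requests.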
Additionally, because the second check calls Thread.interrupted(), which clears the interrupt flag (unlike isInterrupted(), which only reads it), the flag is reset and does not affect the further functioning of the program. Now, when a query is to be canceled, the canceling method looks up the Hibernate session and thread stored in a map keyed by the HTTP session-id, calls the cancelQuery method on the session, and calls the interrupt method on the thread.
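A rough sketch of such a map follows. All names here are hypothetical, and the CancellableQuery interface is a stand-in for the Hibernate Session so that the sketch compiles without Hibernate on the classpath; in the real system the entry would hold the Session itself. For brevity it tracks one running query per HTTP session, whereas the original may track several.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Stand-in for the single Hibernate call used here: Session.cancelQuery(). */
interface CancellableQuery {
    void cancelQuery();
}

/** Hypothetical registry of running queries, keyed by HTTP session ID. */
final class RunningQueryRegistry {

    /** Pairs the query's session with the thread that is executing it. */
    private static final class Entry {
        final CancellableQuery session;
        final Thread worker;

        Entry(CancellableQuery session, Thread worker) {
            this.session = session;
            this.worker = worker;
        }
    }

    private final Map<String, Entry> running = new ConcurrentHashMap<>();

    /** Called on the worker thread just before the query starts. */
    void register(String httpSessionId, CancellableQuery session) {
        running.put(httpSessionId, new Entry(session, Thread.currentThread()));
    }

    /** Called on the worker thread (in a finally block) when the query ends. */
    void unregister(String httpSessionId) {
        running.remove(httpSessionId);
    }

    /** Cancels the query for this HTTP session; returns false if none is running. */
    boolean cancel(String httpSessionId) {
        Entry entry = running.remove(httpSessionId);
        if (entry == null) {
            return false;
        }
        entry.session.cancelQuery(); // effective while the database is still working
        entry.worker.interrupt();    // effective once rows are being turned into objects
        return true;
    }
}
```

Whatever component detects that the user has gone away (for example, an HttpSessionListener reacting to session destruction) would call cancel with that user's session ID.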