Hibernate batch size confusion

This program performs tens of thousands of consecutive inserts, one after another. I have never used Hibernate before, and I am getting very poor performance (it is 10-12x faster if I simply connect and execute the SQL by hand), even though my batch_size is set to 50, just as many Hibernate tutorials recommend.

Here is the log of a single insert – maybe you can help me understand exactly what is happening:

START INSERT
11:02:56.121 [main] DEBUG org.hibernate.impl.SessionImpl - opened session at timestamp: 13106053761
11:02:56.121 [main] DEBUG o.h.transaction.JDBCTransaction - begin
11:02:56.121 [main] DEBUG org.hibernate.jdbc.ConnectionManager - opening JDBC connection
11:02:56.121 [main] TRACE o.h.c.DriverManagerConnectionProvider - total checked-out connections: 0
11:02:56.121 [main] TRACE o.h.c.DriverManagerConnectionProvider - using pooled JDBC connection, pool size: 0
11:02:56.121 [main] DEBUG o.h.transaction.JDBCTransaction - current autocommit status: false
11:02:56.121 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction begin
11:02:56.121 [main] TRACE org.hibernate.impl.SessionImpl - setting flush mode to: MANUAL
11:02:56.121 [main] TRACE o.h.e.def.DefaultLoadEventListener - loading entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]
11:02:56.121 [main] TRACE o.h.e.def.DefaultLoadEventListener - creating new proxy for entity
11:02:56.122 [main] TRACE o.h.e.d.DefaultSaveOrUpdateEventListener - saving transient instance
11:02:56.122 [main] DEBUG o.h.e.def.AbstractSaveEventListener - generated identifier: component[keyW000]{keyW000=F000 ADSUFC}, using strategy: org.hibernate.id.CompositeNestedGeneratedValueGenerator
11:02:56.122 [main] TRACE o.h.e.def.AbstractSaveEventListener - saving [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]
11:02:56.123 [main] TRACE o.h.e.d.AbstractFlushingEventListener - flushing session
11:02:56.123 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - processing flush-time cascades
11:02:56.123 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - dirty checking collections
11:02:56.123 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Flushing entities and processing referenced collections
11:02:56.125 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Processing unreferenced collections
11:02:56.125 [main] TRACE o.h.e.d.AbstractFlushingEventListener - Scheduling collection removes/(re)creates/updates
11:02:56.126 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - Flushed: 1 insertions, 0 updates, 0 deletions to 62 objects
11:02:56.126 [main] DEBUG o.h.e.d.AbstractFlushingEventListener - Flushed: 0 (re)creations, 0 updates, 0 removals to 0 collections
11:02:56.132 [main] TRACE o.h.e.d.AbstractFlushingEventListener - executing flush
11:02:56.132 [main] TRACE org.hibernate.jdbc.ConnectionManager - registering flush begin
11:02:56.132 [main] TRACE o.h.p.entity.AbstractEntityPersister - Inserting entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]
11:02:56.132 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - about to open PreparedStatement (open PreparedStatements: 0, globally: 0)
11:02:56.132 [main] DEBUG org.hibernate.SQL - insert into MSW000 (W000_DATA_REC, W000_FILE_FLAGS, KEY_W000) values (?, ?, ?)
11:02:56.132 [main] TRACE org.hibernate.jdbc.AbstractBatcher - preparing statement
11:02:56.132 [main] TRACE o.h.p.entity.AbstractEntityPersister - Dehydrating entity: [com.xyzcompany.foo.edoi.ejb.msw000.MSW000Rec#component[keyW000]{keyW000=F000 ADSUFC}]
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding ' ADSUFCA ' to parameter: 1
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding ' ' to parameter: 2
11:02:56.132 [main] TRACE org.hibernate.type.StringType - binding 'F000 ADSUFC' to parameter: 3
11:02:56.132 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - Executing batch size: 1
11:02:56.133 [main] DEBUG org.hibernate.jdbc.AbstractBatcher - about to close PreparedStatement (open PreparedStatements: 1, globally: 1)
11:02:56.133 [main] TRACE org.hibernate.jdbc.AbstractBatcher - closing statement
11:02:56.133 [main] TRACE org.hibernate.jdbc.ConnectionManager - registering flush end
11:02:56.133 [main] TRACE o.h.e.d.AbstractFlushingEventListener - post flush
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - commit
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - automatically flushing session
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - before transaction completion
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - before transaction completion
11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - committed JDBC Connection
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction completion
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - transaction completed on session with on_close connection release mode; be sure to close the session to release JDBC resources!
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - after transaction completion
11:02:56.133 [main] TRACE org.hibernate.impl.SessionImpl - closing session
11:02:56.133 [main] TRACE org.hibernate.jdbc.ConnectionManager - performing cleanup
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - releasing JDBC connection [ (open PreparedStatements: 0, globally: 0) (open ResultSets: 0, globally: 0)]
11:02:56.133 [main] TRACE o.h.c.DriverManagerConnectionProvider - returning connection to pool, pool size: 1
11:02:56.133 [main] TRACE org.hibernate.jdbc.JDBCContext - after transaction completion
11:02:56.133 [main] DEBUG org.hibernate.jdbc.ConnectionManager - transaction completed on session with on_close connection release mode; be sure to close the session to release JDBC resources!
11:02:56.134 [main] TRACE org.hibernate.impl.SessionImpl - after transaction completion
FINISH INSERT

When you call session.save(), Hibernate generates an INSERT statement for that entity. The INSERT is not sent to the database right away; it is queued up and executed during the flush (i.e. session.flush()).

During the flush, if hibernate.jdbc.batch_size is set to a non-zero value, Hibernate uses the batching feature introduced in the JDBC 2 API to issue the batched insert SQL to the database.
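For reference, here is a minimal sketch of how that property is usually set with a programmatic Hibernate 3.x Configuration (HibernateBootstrap is just a hypothetical class name; the same hibernate.jdbc.batch_size key can equally go into hibernate.cfg.xml or hibernate.properties, and 50 simply mirrors the batch size used in this answer):

    import org.hibernate.SessionFactory;
    import org.hibernate.cfg.Configuration;

    public class HibernateBootstrap {
        public static SessionFactory buildSessionFactory() {
            Configuration cfg = new Configuration().configure(); // reads hibernate.cfg.xml
            // A non-zero batch size turns on JDBC batching during flush:
            cfg.setProperty("hibernate.jdbc.batch_size", "50");
            // Optional: order queued inserts by entity so batches are not broken up:
            cfg.setProperty("hibernate.order_inserts", "true");
            return cfg.buildSessionFactory();
        }
    }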

For example, suppose you save() 100 records and your hibernate.jdbc.batch_size is set to 50. During the flush, instead of issuing the following SQL 100 times:

    insert into TableA (id , fields) values (1, 'val1');
    insert into TableA (id , fields) values (2, 'val2');
    insert into TableA (id , fields) values (3, 'val3');
    .........................
    insert into TableA (id , fields) values (100, 'val100');

Hibernate groups them into batches of 50 and sends them to the database in only 2 batched round trips, conceptually like this:

    insert into TableA (id , fields) values (1, 'val1'), (2, 'val2'), (3, 'val3'), (4, 'val4'), ......, (50, 'val50')
    insert into TableA (id , fields) values (51, 'val51'), (52, 'val52'), (53, 'val53'), (54, 'val54'), ......, (100, 'val100')

Note that if the primary key of the table being inserted uses GenerationType.IDENTITY, Hibernate transparently disables insert batching at the JDBC level.
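As an illustration only (IdentityKeyedEntity is a hypothetical class, not something from the question), this is the kind of mapping that silently loses insert batching:

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;

    // Hypothetical mapping, only to show the generator that disables batching.
    @Entity
    public class IdentityKeyedEntity {
        // IDENTITY means the key is only known after the row is inserted, so Hibernate
        // must execute every INSERT immediately and silently skips JDBC batching here.
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        // A sequence-based or assigned key (like the composite key in the question's
        // log) is known before the INSERT runs, so batching stays enabled:
        // @GeneratedValue(strategy = GenerationType.SEQUENCE)

        private String data;
    }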

From your log: you save() only a single record and then flush() immediately, so each flush has just one pending INSERT to process. That is why Hibernate cannot batch your inserts – there is only ever one INSERT SQL to execute. You should save() a number of records before calling flush(), instead of calling flush() after every single save().

The best practice for batch inserts looks like this:

    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();

    for (int i = 0; i < 888888; i++) {
        TableA record = new TableA();
        record.setXXXX();
        session.save(record);
        if (i % 50 == 0) { // 50, same as the JDBC batch size
            // flush a batch of inserts and release memory:
            session.flush();
            session.clear();
        }
    }
    tx.commit();
    session.close();

You save and flush the records batch by batch. At the end of each batch you should clear the persistence context to release memory; otherwise every persistent object accumulates in the first-level cache (your JVM's heap) and you will eventually run out of memory. You can also disable the second-level cache to cut unnecessary overhead.
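If you want to try that, these are the standard Hibernate properties involved (a sketch, added to the same hypothetical Configuration as above; leave them at their defaults if other parts of the application rely on caching):

    // Turn off the second-level cache and query cache for the bulk-insert job:
    cfg.setProperty("hibernate.cache.use_second_level_cache", "false");
    cfg.setProperty("hibernate.cache.use_query_cache", "false");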


References:

  • Official Hibernate documentation: Chapter 14, Batch Processing
  • Hibernate batch processing – why you may not be using it (even if you think you are)

If you have to use Hibernate for massive batch jobs, StatelessSession is the best choice. It strips things down to the most basic object-to-SQL-statement mapping and removes all of the ORM overhead that you are not using anyway when shovelling bulk data into the database.
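A minimal sketch of what that looks like (assuming the same hypothetical TableA entity and sessionFactory as above); note that a StatelessSession bypasses the first-level cache, dirty checking and cascades, so there is nothing to flush() or clear():

    import org.hibernate.StatelessSession;
    import org.hibernate.Transaction;

    StatelessSession session = sessionFactory.openStatelessSession();
    Transaction tx = session.beginTransaction();
    for (int i = 0; i < 888888; i++) {
        TableA record = new TableA();
        record.setXXXX();
        session.insert(record); // executed straight away, no persistence context involved
    }
    tx.commit();
    session.close();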

It is much easier to make suggestions about the actual code than about a log :)

    11:02:56.133 [main] DEBUG o.h.transaction.JDBCTransaction - commit

This says the database is committing after every single insert. Make sure you are not committing the transaction or closing the session inside your insert loop. Do it once at the end instead.
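In other words, the shape of the loop should be roughly this (a sketch; SourceRow, loadRows() and mapToEntity() are hypothetical stand-ins for however the program reads its input):

    import org.hibernate.Session;
    import org.hibernate.Transaction;

    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();      // begin ONCE, before the loop
    for (SourceRow row : loadRows()) {
        session.save(mapToEntity(row));
        // flush()/clear() every batch-size records as shown earlier,
        // but do NOT commit or close the session here
    }
    tx.commit();                                      // commit ONCE, after the loop
    session.close();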