一、错误现象
运行一段时间,或者访问较长时间之后,比如推文短时间触发并发访问,会导致后台服务获取连接池错误,导致用户端无法方法。
二、错误日志
com.alibaba.druid.pool.GetConnectionTimeout Exception : wait millis 60000, active 64, maxActive 64, creating 0
2020-03-06 16:35:23.557 [Atomikos:696] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366312209015 has timed out and will rollback.
2020-03-06 16:35:23.600 [Atomikos:634] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366318409036 has timed out and will rollback.
2020-03-06 16:35:23.600 [Atomikos:665] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366315409027 has timed out and will rollback.
2020-03-06 16:35:23.600 [Atomikos:825] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366291309002 has timed out and will rollback.
2020-03-06 16:35:23.623 [Atomikos:633] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366291609004 has timed out and will rollback.
2020-03-06 16:35:23.624 [Atomikos:615] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366314809024 has timed out and will rollback.
2020-03-06 16:35:23.624 [Atomikos:836] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366306909013 has timed out and will rollback.
2020-03-06 16:35:23.625 [Atomikos:684] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366318009034 has timed out and will rollback.
2020-03-06 16:35:23.625 [qtp1604839423-3321] WARN c.atomikos. jdbc .AtomikosSQLException [] -Failed to grow the connection pool
com.atomikos.datasource.pool.CreateConnectionException: XAConnectionFactory: failed to create pooled connection - DBMS down or unreachable?
at com.atomikos.jdbc.AtomikosXAConnectionFactory.createPooledConnection(AtomikosXAConnectionFactory.java:48) ~[transactions-jdbc-4.0.6.jar:na]
2020-03-06 16:35:23.690 [Atomikos:805] WARN c.a.icatch.imp.ActiveStateHandler [] -Transaction com.atomikos.spring.jdbc.tm158348366315209025 has timed out and will
三、目前的连接池配置
<!--自动扫描组件 -->
<bean id="dataSource" class="com.alibaba.druid.pool.xa.DruidXADataSource" init-method="init" destroy-method="close">
<property name="url" value="${hostdb.url}"/>
<property name="username" value="${hostdb.user: root }"/>
<property name="password" value="${hostdb.password:root}"/>
<property name="filters" value="stat"/>
<property name="maxActive" value="${hostdb.druid.maxActive:64}"/>
<property name="initialSize" value="${hostdb.druid.initialSize:20}"/>
<property name="maxWait" value="${hostdb.druid.maxWait:60000}"/>
<property name="minIdle" value="${hostdb.druid.minIdle:20}"/>
<!-- 配置间隔多久才进行一次检测,检测需要关闭的空闲连接,单位是毫秒 -->
<property name="timeBetweenEvictionRunsMillis" value="${hostdb.druid.timeBetweenEvictionRunsMillis:2000}"/>
<!-- 配置一个连接在池中最小生存的时间,单位是毫秒 -->
<property name="minEvictableIdleTimeMillis" value="${hostdb.druid.minEvictableIdleTimeMillis:600000}"/>
<property name="maxEvictableIdleTimeMillis" value="${hostdb.druid.maxEvictableIdleTimeMillis:900000}"/>
<property name="validationQuery" value="select 1"/>
<property name="testWhileIdle" value="${hostdb.druid.testWhileIdle:true}"/>
<property name="testOnBorrow" value="${hostdb.druid.testOnBorrow:false}"/>
<property name="testOnReturn" value="${hostdb.druid.testOnReturn:false}"/>
<property name="poolPreparedStatements" value="${hostdb.druid.poolPreparedStatements:true}"/>
<property name="maxOpenPreparedStatements" value="${hostdb.druid.maxOpenPreparedStatements:20}"/>
<property name="asyncInit" value="${hostdb.druid.asyncInit:true}"/>
<!-- 打开removeAbandoned功能, 在出现链接泄露的情况下可以打开,查看日志 -->
<property name="removeAbandoned" value="${hostdb.druid.removeAbandoned:false}" />
<!-- 1800秒,也就是30分钟 -->
<property name="removeAbandonedTimeout" value="${hostdb.druid.removeAbandonedTimeout:1800}" />
<!-- 关闭abanded连接时输出错误日志 -->
<property name="logAbandoned" value="${hostdb.druid.logAbandoned:true}" />
</bean>
<bean id="hostDataSource" class="com.atomikos.jdbc.AtomikosDataSourceBean" init-method="init"
destroy-method="close">
<property name="uniqueResourceName" value=" mysql /dataSource"/>
<property name="xaDataSource" ref="dataSource"/>
<!-- set properties for datasource connection pool -->
<property name="minPoolSize" value="20"/>
<property name="maxPoolSize" value="${hostdb.maxPoolSize:64}"/>
<property name="borrowConnectionTimeout" value="${hostdb.borrowConnectionTimeout:30}"/>
<property name="maintenanceInterval" value="${hostdb.maintenanceInterval:60}"/>
<property name="maxIdleTime" value="${hostdb.maxIdleTime:60}"/>
<!-- 管理 Connection 被占用的时间 -->
<!-- 如果不设置这个值,Atomikos使用默认的300秒(即5分钟),那么在处理大批量数据读取的时候,一旦超过5分钟,就会抛出类似 Resultset
is close 的错误 -->
<property name="reapTimeout" value="${hostdb.reapTimeout:20000}"/>
<!-- 一个连接的生命时长(秒),超时而且没被使用则被释放(retired),缺省:无限制,建议设置比数据库超时时长少,
参考MySQL wait_timeout参数(show variables like '%timeout%';) -->
<property name="maxLifetime" value="${hostdb.maxLifetime:600}"/>
</bean>
四、分析原因
影响参数:
<property name="maxActive" value="${hostdb.druid.maxActive:64}"/>
<property name="maxWait" value="${hostdb.druid.maxWait:60000}"/>
短时间内数据库会话被占用,没正常结束回收释放,导致新的请求无法在最大等待时间内获取新的连接会话,报错。
<!-- 配置一个连接在池中最小生存的时间,单位是毫秒 -->
<property name="minEvictableIdleTimeMillis" value="${hostdb.druid.minEvictableIdleTimeMillis:600000}"/>
<property name="maxEvictableIdleTimeMillis" value="${hostdb.druid.maxEvictableIdleTimeMillis:900000}"/>
此参数可能影响不大,理解上看两参数是控制连接在pool的生存时间
<property name="maxPoolSize" value="${hostdb.maxPoolSize:64}"/>
Atomikos事务管理池的最大个数,可根据 druid的 maxActive 来同步增加。
五、尝试修改
新增:
<!-- 打开removeAbandoned功能, 在出现链接泄露的情况下可以打开,查看日志 -->
<property name="removeAbandoned" value="${hostdb.druid.removeAbandoned:true}" />
<!-- 1800秒,也就是30分钟 -->
<property name="removeAbandonedTimeout" value="${hostdb.druid.removeAbandonedTimeout:1800}" />
<!-- 关闭abanded连接时输出错误日志 -->
<property name="logAbandoned" value="${hostdb.druid.logAbandoned:true}" />
参考:
打开 连接池监测和强制抛弃配置
调整:
<property name="maxActive" value="${hostdb.druid.maxActive:1024}"/>
配置增加到 1024
<property name="maxWait" value="${hostdb.druid.maxWait:60000}"/>
不动
<property name="maxPoolSize" value="${hostdb.maxPoolSize:1024}"/>
增加到 1024
注意 jpa .properties内的
com.atomikos.icatch.max_actives = -1