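A brief analysis of useGeneratedKeys errors with batch insertOrUpdate or REPLACE INTO
In our projects we often use useGeneratedKeys to get the auto-increment primary key back and avoid an extra query. Whether inserting a single row or a batch, it has always worked for retrieving the generated ID. We also use ON DUPLICATE KEY UPDATE to implement insertOrUpdate, so that we do not have to query first to decide between an insert and an update. Both tricks are convenient, but they make it easy to fall into a trap without understanding why. I recently combined useGeneratedKeys with a batch update and the project started throwing errors, which pushed me to look into how useGeneratedKeys works internally and how it should be used.
Symptoms
Here are the pieces of code involved in the error:
- Batch insert-or-update of item objects: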
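public List<Long> batchUpdateItems(List<ItemPO> itemPOS) {
    if (CollectionUtils.isNotEmpty(itemPOS)) {
        // Runs the batch upsert with useGeneratedKeys enabled in the mapper.
        int count = itemNewDAO.batchUpdateItem(itemPOS);
        // Flawed check: for an upsert, count is not the number of rows in the batch.
        if (count != itemPOS.size()) {
            log.info("Batch update of item info failed");
        }
        return itemPOS.stream().map(ItemPO::getId).collect(Collectors.toList());
    }
    return Lists.newArrayList();
}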
- Using REPLACE INTO:
REPLACE INTO item (id, item_id, sku_id,
`name`, tags,unit_price, `type`,quantity, creator, modifier,gmt_created, gmt_modified, is_del)
values
(
#{item.id,jdbcType=BIGINT},
#{item.itemId,jdbcType=BIGINT},
#{item.skuId,jdbcType=BIGINT},
#{item.name,jdbcType=VARCHAR},
#{item.tags,jdbcType=VARCHAR},
#{item.unitPrice,jdbcType=BIGINT},
#{item.type,jdbcType=CHAR},
#{item.quantity,jdbcType=BIGINT},
#{item.creator,jdbcType=BIGINT},
#{item.modifier,jdbcType=BIGINT},
#{item.gmtCreated,jdbcType=TIMESTAMP},
#{item.gmtModified,jdbcType=TIMESTAMP},
#{item.isDel,jdbcType=BIT}
)
- Using ON DUPLICATE KEY UPDATE:
insert into item (id, item_id, sku_id,
`name`, tags,unit_price, `type`,quantity, creator, modifier,gmt_created, gmt_modified, is_del)
values
(
#{item.id,jdbcType=BIGINT},
#{item.itemId,jdbcType=BIGINT},
#{item.skuId,jdbcType=BIGINT},
#{item.name,jdbcType=VARCHAR},
#{item.tags,jdbcType=VARCHAR},
#{item.unitPrice,jdbcType=BIGINT},
#{item.type,jdbcType=CHAR},
#{item.quantity,jdbcType=BIGINT},
#{item.creator,jdbcType=BIGINT},
#{item.modifier,jdbcType=BIGINT},
#{item.gmtCreated,jdbcType=TIMESTAMP},
#{item.gmtModified,jdbcType=TIMESTAMP},
#{item.isDel,jdbcType=BIT}
)
ON DUPLICATE KEY UPDATE
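I took it for granted that the row count returned by a batch insertOrUpdate would equal the number of records in the batch. In practice there is a good chance of hitting this error instead: "Too many keys are generated. There are only 1 target objects. You either specified a wrong 'keyProperty' or encountered a driver bug like #1523."
- First, the official MySQL documentation: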
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to the mysql_real_connect() C API function when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values.
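- 0: an existing row was matched but nothing changed
- 1: the row was inserted
- 2: an existing row was updated
So the count check in the first code block clearly cannot give the expected result.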
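To make the arithmetic concrete, here is a minimal sketch of how the affected-rows total adds up under the rules quoted above (without the CLIENT_FOUND_ROWS flag). The class, enum, and method names are hypothetical and exist only for illustration; nothing here talks to a real database.

```java
import java.util.List;

/** Illustrative only: models the per-row affected-rows values quoted from the MySQL docs. */
final class AffectedRowsDemo {
    /** What happened to each row of an INSERT ... ON DUPLICATE KEY UPDATE batch. */
    enum Outcome { INSERTED, UPDATED, UNCHANGED }

    /** Sums the per-row values: 1 for an insert, 2 for an update, 0 for no change. */
    static int affectedRows(List<Outcome> outcomes) {
        int total = 0;
        for (Outcome outcome : outcomes) {
            switch (outcome) {
                case INSERTED: total += 1; break;
                case UPDATED: total += 2; break;
                case UNCHANGED: break; // contributes 0
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Outcome> batch = List.of(Outcome.UPDATED, Outcome.UPDATED, Outcome.INSERTED);
        int count = affectedRows(batch);
        // The batch holds 3 rows, yet the reported count is 5.
        System.out.println("rows in batch = " + batch.size() + ", affected rows = " + count);
    }
}
```

A check such as `count != itemPOS.size()` therefore fires even when every row was written successfully.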
- Next, how MyBatis describes useGeneratedKeys:
This tells MyBatis to use the JDBC getGeneratedKeys method to retrieve keys generated internally by the database (e.g. auto increment fields in RDBMS like MySQL or SQL Server). Default: false.
- Then the JDBC side:
Before version 3.0 of the JDBC API, there was no standard way of retrieving key values from databases that supported auto increment or identity columns. With older JDBC drivers for MySQL, you could always use a MySQL-specific method on the Statement interface, or issue the query SELECT LAST_INSERT_ID() after issuing an INSERT to a table that had an AUTO_INCREMENT key. Using the MySQL-specific method call isn't portable, and issuing a SELECT to get the AUTO_INCREMENT key's value requires another round-trip to the database, which isn't as efficient as possible. The following code snippets demonstrate the three different ways to retrieve AUTO_INCREMENT values. First, we demonstrate the use of the new JDBC 3.0 method getGeneratedKeys() which is now the preferred method to use if you need to retrieve AUTO_INCREMENT keys and have access to JDBC 3.0. The second example shows how you can retrieve the same value using a standard SELECT LAST_INSERT_ID() query. The final example shows how updatable result sets can retrieve the AUTO_INCREMENT value when using the insertRow() method.
- Finally, how LAST_INSERT_ID() is defined for AUTO_INCREMENT columns. Pay particular attention to multiple-row inserts:
For a multiple-row insert, LAST_INSERT_ID() and mysql_insert_id() actually return the AUTO_INCREMENT key from the first of the inserted rows. This enables multiple-row inserts to be reproduced correctly on other servers in a replication setup.
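So if the MySQL server returns only one id, how does the client obtain every id of a batch insert? It first reads the number of rows affected by the batch, then derives the ids itself: starting from the returned first id, it advances by the auto-increment step once per affected row and collects one id per iteration.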
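The derivation described above can be sketched as follows. This is a simplified model, not the driver's actual code; `deriveKeys` and its parameters are hypothetical names, and the step is assumed to correspond to the server's auto-increment increment.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: derives a batch's ids from the first id, the step, and the row count. */
final class GeneratedKeySketch {
    /**
     * @param firstId      id of the first inserted row, as reported by the server
     * @param step         assumed auto-increment step between consecutive rows
     * @param affectedRows number of rows the client believes were inserted
     */
    static List<Long> deriveKeys(long firstId, long step, int affectedRows) {
        List<Long> keys = new ArrayList<>();
        for (int i = 0; i < affectedRows; i++) {
            keys.add(firstId + i * step);
        }
        return keys;
    }

    public static void main(String[] args) {
        // A plain batch insert of 3 rows: affected rows == rows inserted, so the ids line up.
        System.out.println(deriveKeys(101L, 1L, 3)); // prints [101, 102, 103]
    }
}
```

The model only holds when the affected-rows count equals the number of rows actually inserted.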
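That is why a plain batch insert returns the correct ids.
A batch insertOrUpdate is a different story. Its affected-rows count is not the number of rows inserted, because each row contributes 0, 1, or 2, so the derived auto-increment ids go wrong.
For example, take a batch of 3 records where 2 end up as updates and 1 as an insert. The update count is 2 + 2 + 1 = 5, so 5 ids are generated. MyBatis then takes the first 3 and sets them on the objects, which is clearly wrong; and when more keys come back than there are target objects, it may instead fail with the "Too many keys are generated" error quoted earlier.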
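The mismatch in this example can be shown with a small sketch. It is not MyBatis code; the class and method names are hypothetical, and the five key values are made-up numbers standing in for whatever the client derives from an update count of 5.

```java
import java.util.List;

/** Illustrative only: 5 derived keys meet 3 target objects. */
final class UpsertKeyMismatchDemo {
    /** Naive assignment as described in the text: hand the first keys to the targets in order. */
    static List<Long> assignFirstKeys(List<Long> derivedKeys, int targetCount) {
        return derivedKeys.subList(0, Math.min(targetCount, derivedKeys.size()));
    }

    /** True when more keys were derived than there are objects to receive them. */
    static boolean hasSurplusKeys(List<Long> derivedKeys, int targetCount) {
        return derivedKeys.size() > targetCount;
    }

    public static void main(String[] args) {
        // Assumed values: update count 5 (2 updates + 1 insert) yields 5 derived keys.
        List<Long> derived = List.of(21L, 22L, 23L, 24L, 25L);
        int targets = 3;
        // Only one row was really inserted, yet all three objects would receive "new" ids.
        System.out.println(assignFirstKeys(derived, targets)); // prints [21, 22, 23]
        System.out.println(hasSurplusKeys(derived, targets));  // prints true
    }
}
```

Either outcome is unusable: the two updated records would be given ids that do not belong to them, or the surplus is detected and the call fails.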
Conclusion
- For a batch insertOrUpdate, do not rely on useGeneratedKeys to return the primary keys.