https://gitee.com/quancong/redis
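Problem: if an application instance or a Redis node crashes between the RedisTemplate setnx call and the RedisTemplate setex call, the key never receives an expiry, so the lock is never released and the job deadlocks (the original post illustrates this with a figure). The fix is to write the value and the expiry in a single atomic SET command: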
SET KEY VALUE [EX seconds] [PX milliseconds] [NX|XX]
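EX seconds: set the specified expiry time, in seconds.
PX milliseconds: set the specified expiry time, in milliseconds.
NX: set the key only if it does not already exist.
XX: set the key only if it already exists.

Example:

    127.0.0.1:6379> SET a redis EX 6 NX
    OK
    127.0.0.1:6379> ttl a
    (integer) 3

However, Spring Boot 2.0.5.RELEASE does not offer a RedisTemplate API that combines setex and setnx. After some exploration, it turns out that RedisConnection does the job.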
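    public boolean setLock(String key, long expire) {
        try {
            Boolean result = redisTemplate.execute(new RedisCallback<Boolean>() {
                @Override
                public Boolean doInRedis(RedisConnection connection) throws DataAccessException {
                    // SET key value EX expire NX: value and expiry are written in one atomic command
                    return connection.set(key.getBytes(), getHostIp().getBytes(),
                            Expiration.seconds(expire), RedisStringCommands.SetOption.ifAbsent());
                }
            });
            // result can be null (for example inside a pipeline), so do not unbox it directly
            return Boolean.TRUE.equals(result);
        } catch (Exception e) {
            logger.error("set redis occurred an exception", e);
        }
        return false;
    }

Because Redis executes a Lua script atomically, a Lua script is another effective way to implement the lock.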
Steps:
3.1. Add a file with the .lua extension under the resources directory.
3.2. Write the Lua script:
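    local lockKey = KEYS[1]
    -- the Java caller in step 3.3 passes the value as the second key, hence KEYS[2]
    local lockValue = KEYS[2]

    -- SETNX returns the integer 1 if the key was set and 0 otherwise (never true/false)
    local result_1 = redis.call('SETNX', lockKey, lockValue)
    if result_1 == 1 then
        -- attach the expiry (3600 seconds) inside the same atomic script
        redis.call('SETEX', lockKey, 3600, lockValue)
    end
    return result_1

3.3. Pass the keys and args to the Lua script and execute it: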
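    import java.net.Inet4Address;
    import java.net.InetAddress;
    import java.net.NetworkInterface;
    import java.util.ArrayList;
    import java.util.Enumeration;
    import java.util.List;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.data.redis.core.script.DefaultRedisScript;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.scripting.support.ResourceScriptSource;
    import org.springframework.stereotype.Service;

    // RedisService is the project's own helper class; import it from wherever it lives in your code base.
    @Service
    public class LuaDistributeLock {

        private static final Logger logger = LoggerFactory.getLogger(LuaDistributeLock.class);

        private static final String LOCK_PREFIX = "lua_";

        @Autowired
        private RedisService redisService;

        @Autowired
        private RedisTemplate redisTemplate;

        @Scheduled(cron = "0/10 * * * * *")
        public void lockJob() {
            String lock = LOCK_PREFIX + "LockNxExJob";
            boolean luaRet = false;
            try {
                // Try to take the lock through the Lua script
                luaRet = Boolean.TRUE.equals(luaExpress(lock, getHostIp()));
                if (!luaRet) {
                    // Failed to get the lock: look up the IP of the server that holds it
                    String value = (String) redisService.genValue(lock);
                    //logger.info("lua get lock fail,lock belong to:{}", value);
                    return;
                } else {
                    // Got the lock: simulate the job
                    //logger.info("lua start lock lockNxExJob success");
                    Thread.sleep(5000);
                }
            } catch (Exception e) {
                logger.error("lock error", e);
            } finally {
                if (luaRet) {
                    //logger.info("release lock success");
                    redisService.remove(lock);
                }
            }
        }

        /**
         * Runs add.lua and returns its result.
         *
         * @param key   lock key
         * @param value lock value (the IP of this server)
         * @return true if the lock was acquired
         */
        public Boolean luaExpress(String key, String value) {
            DefaultRedisScript<Boolean> lockScript = new DefaultRedisScript<Boolean>();
            lockScript.setScriptSource(new ResourceScriptSource(new ClassPathResource("add.lua")));
            lockScript.setResultType(Boolean.class);
            // Both entries are passed as keys, so the script reads them as KEYS[1] and KEYS[2]
            List<Object> keyList = new ArrayList<Object>();
            keyList.add(key);
            keyList.add(value);
            // Execute the script through redisTemplate.execute
            return (Boolean) redisTemplate.execute(lockScript, keyList);
        }

        /**
         * Returns the first non-loopback IPv4 address of this machine, or null if none is found.
         */
        private static String getHostIp() {
            try {
                Enumeration<NetworkInterface> allNetInterfaces = NetworkInterface.getNetworkInterfaces();
                while (allNetInterfaces.hasMoreElements()) {
                    NetworkInterface netInterface = allNetInterfaces.nextElement();
                    Enumeration<InetAddress> addresses = netInterface.getInetAddresses();
                    while (addresses.hasMoreElements()) {
                        InetAddress ip = addresses.nextElement();
                        // The IPv4 loopback range is 127.0.0.0 ~ 127.255.255.255
                        if (ip instanceof Inet4Address
                                && !ip.isLoopbackAddress()
                                && ip.getHostAddress().indexOf(":") == -1) {
                            return ip.getHostAddress();
                        }
                    }
                }
            } catch (Exception e) {
                logger.error("failed to resolve host ip", e);
            }
            return null;
        }
    }

Problem: Server A's task finishes only when Server B is already halfway through its own run. At that point Server A releases the lock that Server B holds, and the job is left without any lock.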
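Solution: when setting the lock, store the current server's IP in Redis; when releasing the lock, check whether the current server's IP matches the IP stored in Redis.
Another problem can still appear here. Suppose Redis holds 127.0.0.1. If reading the value and deleting the key are separate commands, then under concurrent execution another thread's IP (say 127.0.0.2) can end up being compared with 127.0.0.1, and the lock is not released correctly.
Solution: read the IP from Redis and compare it with the current IP in one step, so that both commands execute together. Once again, this is done with a Lua script.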
    import java.net.Inet4Address;
    import java.net.InetAddress;
    import java.net.NetworkInterface;
    import java.nio.charset.StandardCharsets;
    import java.util.Enumeration;

    import javax.annotation.Resource;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.dao.DataAccessException;
    import org.springframework.data.redis.connection.RedisConnection;
    import org.springframework.data.redis.connection.RedisStringCommands;
    import org.springframework.data.redis.connection.ReturnType;
    import org.springframework.data.redis.core.RedisCallback;
    import org.springframework.data.redis.core.RedisTemplate;
    import org.springframework.data.redis.core.types.Expiration;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    import redis.clients.jedis.JedisCommands;

    // RedisService is the project's own helper class; import it from wherever it lives in your code base.
    @Component
    public class JedisDistributedLock {

        private static final Logger logger = LoggerFactory.getLogger(JedisDistributedLock.class);

        private static final String LOCK_PREFIX = "JedisDistributedLock_";

        // Deletes the key only if it still holds the expected value (compare and delete in one atomic script)
        public static final String UNLOCK_LUA =
                "if redis.call(\"get\", KEYS[1]) == ARGV[1] then "
                + "return redis.call(\"del\", KEYS[1]) "
                + "else return 0 end";

        @Resource
        private RedisTemplate<Object, Object> redisTemplate;

        @Autowired
        private RedisService redisService;

        @Scheduled(cron = "0/10 * * * * *")
        public void lockJob() {
            String lock = LOCK_PREFIX + "JedisNxExJob";
            boolean lockRet = false;
            try {
                lockRet = this.setLock(lock, 600);
                if (!lockRet) {
                    // Failed to get the lock: log the IP of the server that holds it
                    String value = (String) redisService.genValue(lock);
                    logger.info("jedisLockJob get lock fail,lock belong to:{}", value);
                    return;
                } else {
                    // Got the lock: simulate the job
                    logger.info("jedisLockJob start lock lockNxExJob success");
                    Thread.sleep(5000);
                }
            } catch (Exception e) {
                logger.error("jedisLockJob lock error", e);
            } finally {
                if (lockRet) {
                    boolean released = releaseLock(lock, getHostIp());
                    logger.info("jedisLockJob release lock result:{}", released);
                }
            }
        }

        public boolean setLock(String key, long expire) {
            try {
                Boolean result = redisTemplate.execute(new RedisCallback<Boolean>() {
                    @Override
                    public Boolean doInRedis(RedisConnection connection) throws DataAccessException {
                        // SET key value EX expire NX, with this server's IP as the value
                        return connection.set(key.getBytes(), getHostIp().getBytes(),
                                Expiration.seconds(expire), RedisStringCommands.SetOption.ifAbsent());
                    }
                });
                return Boolean.TRUE.equals(result);
            } catch (Exception e) {
                logger.error("set redis occurred an exception", e);
            }
            return false;
        }

        public String get(String key) {
            try {
                // Note: this cast only works when Jedis is the underlying client
                RedisCallback<String> callback = (connection) -> {
                    JedisCommands commands = (JedisCommands) connection.getNativeConnection();
                    return commands.get(key);
                };
                return redisTemplate.execute(callback);
            } catch (Exception e) {
                logger.error("get redis occurred an exception", e);
            }
            return "";
        }

        /**
         * Releases the lock, but only if it is still owned by the given value.
         *
         * @param key   lock key
         * @param value expected owner (the IP of this server)
         * @return true if the key was deleted
         */
        private boolean releaseLock(String key, String value) {
            // Run UNLOCK_LUA with the key as KEYS[1] and the owner as ARGV[1].
            // Raw bytes are used so that the comparison matches what setLock wrote.
            Long deleted = redisTemplate.execute((RedisCallback<Long>) connection ->
                    connection.eval(UNLOCK_LUA.getBytes(StandardCharsets.UTF_8), ReturnType.INTEGER, 1,
                            key.getBytes(), value.getBytes()));
            return deleted != null && deleted > 0;
        }

        /**
         * Returns the first non-loopback IPv4 address of this machine, or null if none is found.
         */
        private static String getHostIp() {
            try {
                Enumeration<NetworkInterface> allNetInterfaces = NetworkInterface.getNetworkInterfaces();
                while (allNetInterfaces.hasMoreElements()) {
                    NetworkInterface netInterface = allNetInterfaces.nextElement();
                    Enumeration<InetAddress> addresses = netInterface.getInetAddresses();
                    while (addresses.hasMoreElements()) {
                        InetAddress ip = addresses.nextElement();
                        // The IPv4 loopback range is 127.0.0.0 ~ 127.255.255.255
                        if (ip instanceof Inet4Address
                                && !ip.isLoopbackAddress()
                                && ip.getHostAddress().indexOf(":") == -1) {
                            return ip.getHostAddress();
                        }
                    }
                }
            } catch (Exception e) {
                logger.error("failed to resolve host ip", e);
            }
            return null;
        }
    }

Problem: the task runs for a long time. The Redis lock expires while the task is still executing, so the job is effectively unlocked, other threads start running the task concurrently, and the thread-safety problem is back.
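Solution: add a watchdog. Put simply, something has to watch whether the lock is about to expire while the task is still running past the lock's timeout. You can add a scheduled task (the watchdog) that checks the lock in Redis: if the task has not finished, it extends the lock's timeout a little; if the task has finished, the timer does nothing, because the finally block of the task code has already deleted the lock. In effect the watchdog keeps renewing the lock's lease for as long as the task needs it.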
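The watchdog idea can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not the original project's code: `InMemoryLease` is a hypothetical in-memory stand-in for the Redis key and its TTL, `Watchdog` and `WatchdogSketch` are made-up names, and renewing at one third of the TTL is an arbitrary choice. Against a real Redis, the renewal would more likely be a small Lua script that checks the owner and then extends the expiry.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class WatchdogSketch {

    /** Hypothetical in-memory stand-in for one Redis lock key with a TTL. */
    static final class InMemoryLease {
        private String owner;
        private long expiresAtNanos;

        /** Roughly what SET key owner NX PX ttlMillis does. */
        synchronized boolean tryAcquire(String candidate, long ttlMillis) {
            if (owner != null && System.nanoTime() - expiresAtNanos < 0) {
                return false; // still held and not yet expired
            }
            owner = candidate;
            expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(ttlMillis);
            return true;
        }

        /** Extends the TTL only if the caller still owns an unexpired lock. */
        synchronized boolean renew(String candidate, long ttlMillis) {
            if (!isHeldBy(candidate)) {
                return false;
            }
            expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(ttlMillis);
            return true;
        }

        synchronized boolean isHeldBy(String candidate) {
            return candidate.equals(owner) && System.nanoTime() - expiresAtNanos < 0;
        }

        /** Owner-checked release, like the compare-and-delete unlock script. */
        synchronized void release(String candidate) {
            if (candidate.equals(owner)) {
                owner = null;
            }
        }
    }

    /** Renews the lease periodically until it is closed or the lock is lost. */
    static final class Watchdog implements AutoCloseable {
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "lock-watchdog");
                    t.setDaemon(true); // never keep the JVM alive just for renewals
                    return t;
                });
        private final ScheduledFuture<?> task;

        Watchdog(InMemoryLease lease, String owner, long ttlMillis) {
            long period = Math.max(1, ttlMillis / 3); // renew well before the TTL runs out
            task = scheduler.scheduleAtFixedRate(() -> {
                if (!lease.renew(owner, ttlMillis)) {
                    // An exception thrown here suppresses all further scheduled runs
                    throw new IllegalStateException("lock lost, stop renewing");
                }
            }, period, period, TimeUnit.MILLISECONDS);
        }

        @Override
        public void close() {
            task.cancel(false);
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        InMemoryLease lease = new InMemoryLease();
        String me = "10.0.0.1"; // example owner id, standing in for getHostIp()
        if (!lease.tryAcquire(me, 300)) {
            return;
        }
        try (Watchdog ignored = new Watchdog(lease, me, 300)) {
            Thread.sleep(1000); // the "job" deliberately outlives the 300 ms TTL
            System.out.println("still held during job: " + lease.isHeldBy(me));
        } finally {
            lease.release(me); // the watchdog is already closed at this point
        }
    }
}
```

The watchdog is closed before the lock is released, so a late renewal cannot resurrect a lock that the job has already given up. If the process dies, renewals stop with it and the lock simply expires after one TTL.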
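Extreme case: during Redis master-replica replication, the lock has not yet been synchronized to the replica when the master that holds it crashes, so Redis no longer contains the lock. In other words, the code has just taken the lock, the master fails mid-replication, other threads go ahead and run the task, and the concurrency problem returns. There is no solution for this yet.
Problem: Redis goes down while the task is running, before the finally block has deleted the lock. The lock does have an expiry, but if that expiry is long, other threads can only wait for the lock to expire. They are blocked for that whole period, which makes the service unavailable.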
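Solution: use Redisson. Since the lock lives in Redis, a Redis outage directly affects whether the lock can be released. Redisson's distributed lock performs its lock operations atomically, and it does not rely on one long fixed expiry: by default the lock gets a short lease (30 seconds) that a watchdog keeps renewing only while the client holding it is alive. If the task never finishes and the lock is never explicitly released, the lock that was originally taken stops being effective once that lease runs out.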