我正在尝试编写一个存储过程,以“重复删除”数据库中时间重叠的记录。
表中数据的示例:
首先,我按CODE,TYPE和VALID对记录进行分组,这意味着我仅在重叠时间段内搜索那些值匹配的记录。
在上图中,将有3个重叠的记录,其ID为100、101和102。
我想做的是在CPNEW列中添加值以“去重复”那些记录,我想在“ I&(x)”中添加类似“ I”的内容,其中x是重叠记录的数量,所以对于ID为100到102的I,I1,I2,基于DATEVALIDTO的降序排列,如下所示:
现在,我写了一个简单的选择查询,它完全符合我的要求,它看起来像这样:
UPDATE clist2 as con2fin
JOIN(
SELECT con.*,(CASE WHEN con.VALID = "NOT_VALID" THEN
IF(ROW_NUMber() OVER (PARTITION BY con.CODE,con.TYPE,con.VALID ORDER BY con.DATEVALIDTO DESC) > 6,concat("I","NOK"),IF(ROW_NUMber() OVER (PARTITION BY con.CODE,con.VALID ORDER BY con.DATEVALIDTO DESC) = 1,"I",ROW_NUMber() OVER (PARTITION BY con.CODE,con.VALID ORDER BY con.DATEVALIDTO DESC)-1)))
ELSE
IF(ROW_NUMber() OVER (PARTITION BY con.CODE,con.VALID ORDER BY con.DATEVALIDTO DESC) > 10,concat("A","A",con.VALID ORDER BY con.DATEVALIDTO DESC)-1))) END) as cp_new
FROM clist2 as con
RIGHT JOIN
(SELECT * FROM database_1.clist2 WHERE ID = 101) as fin
on fin.CODE = con.CODE AND fin.VALID = con.VALID AND fin.TYPE = con.TYPE
AND
(
(fin.DATEVALIDFROM <= con.DATEVALIDFROM AND fin.DATEVALIDTO >= con.DATEVALIDFROM AND fin.DATEVALIDTO <= con.DATEVALIDTO)
OR (fin.DATEVALIDFROM >= con.DATEVALIDFROM AND fin.DATEVALIDTO >= con.DATEVALIDFROM AND fin.DATEVALIDTO <= con.DATEVALIDTO)
OR (fin.DATEVALIDFROM >= con.DATEVALIDFROM AND fin.DATEVALIDFROM <= con.DATEVALIDTO AND fin.DATEVALIDTO >= con.DATEVALIDTO)
OR (fin.DATEVALIDFROM <= con.DATEVALIDFROM AND fin.DATEVALIDTO >= con.DATEVALIDTO)
OR (fin.DATEVALIDTO = con.DATEVALIDTO)
OR (fin.DATEVALIDTO = con.DATEVALIDFROM)
OR (fin.DATEVALIDFROM = con.DATEVALIDFROM)
OR (fin.DATEVALIDFROM = con.DATEVALIDTO)
)) as con2beforefin
on con2beforefin.ID = con2fin.ID
SET con2fin.CPNEW = con2beforefin.cp_new ;
设置为对与ID为101的记录重叠的所有记录进行重复数据删除。
不幸的是,我在此表中有很多记录,我需要创建查询,这将对表中的每个记录执行此操作,这意味着:
- 查找与第一条记录重叠的所有记录
- 删除重复的记录
- 跳转到下一条记录
- 如果该记录已被“重复数据删除”,请跳过此记录,否则请执行步骤2。
- 跳转到下一条记录
- 如果该记录已被“重复数据删除”,请跳过此记录,否则请执行步骤2。 等
我基本上需要循环这种逻辑。
所以它看起来像这样:
问题
所以我想我可以为此使用带有游标和循环的存储过程,所以我这样写:
CREATE DEFINER=`root`@`localhost` PROCEDURE `new_procedure`()
BEGIN
-- Variables
DeclARE finished INTEGER DEFAULT 0;
DeclARE v_ID varchar(10000) DEFAULT "";
-- Cursor to get ID
DeclARE curID
CURSOR FOR
SELECT ID FROM clist2;
-- NOT FOUND handler
DeclARE CONTINUE HANDLER
FOR NOT FOUND SET finished = 1;
-- OPEN cursor
OPEN curID;
-- START LOOP
getID: LOOP
-- FETCH Cursor value into variable
FETCH curID INTO v_ID;
-- CHECK IF NOT FOUND handler = 1
IF finished = 1 THEN
LEAVE getID;
ELSE
-- LOGIC PART OF SP
-- PART 1,IF CPNEW IS BLANK DO THIS
if ((SELECT CPNEW FROM clist2 WHERE ID = @v_ID) is null) then
UPDATE clist2 as con2fin
JOIN(
SELECT con.*,(CASE
WHEN con.VALID = "NOT_VALID"
THEN
IF(ROW_NUMber() OVER (PARTITION BY con.CODE,con.VALID ORDER BY con.DATEVALIDTO DESC)-1)))
ELSE
IF(ROW_NUMber() OVER (PARTITION BY con.CODE,con.VALID ORDER BY con.DATEVALIDTO DESC)-1)))
END) as cp_new FROM clist2 as con
RIGHT JOIN
(SELECT * FROM database_1.clist2
WHERE ID = @v_ID) as fin
on fin.CODE = con.CODE AND fin.VALID = con.VALID AND fin.TYPE = con.TYPE
AND (
(fin.DATEVALIDFROM <= con.DATEVALIDFROM AND fin.DATEVALIDTO >= con.DATEVALIDFROM AND fin.DATEVALIDTO <= con.DATEVALIDTO)
OR
(fin.DATEVALIDFROM >= con.DATEVALIDFROM AND fin.DATEVALIDTO >= con.DATEVALIDFROM AND fin.DATEVALIDTO <= con.DATEVALIDTO)
OR
(fin.DATEVALIDFROM >= con.DATEVALIDFROM AND fin.DATEVALIDFROM <= con.DATEVALIDTO AND fin.DATEVALIDTO >= con.DATEVALIDTO)
OR
(fin.DATEVALIDFROM <= con.DATEVALIDFROM AND fin.DATEVALIDTO >= con.DATEVALIDTO)
OR
(fin.DATEVALIDTO = con.DATEVALIDTO)
OR
(fin.DATEVALIDTO = con.DATEVALIDFROM)
OR
(fin.DATEVALIDFROM = con.DATEVALIDFROM)
OR
(fin.DATEVALIDFROM = con.DATEVALIDTO)
)) as con2beforefin on con2beforefin.ID = con2fin.ID
SET con2fin.CPNEW = con2beforefin.cp_new
;
-- PART 2,if CPNEW IS NOT BLANK DO THIS
else
UPDATE clist2 SET CPNEW = CPNEW
WHERE ID = @v_ID;
-- END OF THE SP
end if;
END IF;
END LOOP getID;
CLOSE curID;
SET @finished = 0;
end
不幸的是,此存储过程没有结果。
我有什么问题,要进行哪些更改才能使其正常工作?
非常感谢您