effdate termdate uid
---- ---- ----
1 2 1
3 4 2
5 8 3
7 9 4
11 12 5
12 13 6
3 6 7
5 9 8
i need to replace all the overlapping records with one record
such that resultant table shud look like
effdate termdate uid
1 2 1
11 13 2
3 9 3
ThanksGiven the following DDL and sample data:
CREATE TABLE Intervals (
ID int IDENTITY PRIMARY KEY,
Start int,
Stop int,
UNIQUE (Start, Stop)
)
INSERT INTO Intervals VALUES (1,2)
INSERT INTO Intervals VALUES (3,4)
INSERT INTO Intervals VALUES (5,8)
INSERT INTO Intervals VALUES (7,9)
INSERT INTO Intervals VALUES (11,12)
INSERT INTO Intervals VALUES (12,13)
INSERT INTO Intervals VALUES (3,6)
INSERT INTO Intervals VALUES (5,9)
The following query returns the expected result:
SELECT (
SELECT MIN(d.Start) FROM Intervals d
WHERE d.Start<x.Stop AND NOT EXISTS (
SELECT * FROM (
SELECT DISTINCT Stop FROM Intervals a
WHERE NOT EXISTS (
SELECT * FROM Intervals b
WHERE a.Stop BETWEEN b.Start AND b.Stop
AND b.Start BETWEEN a.Start AND a.Stop
AND a.ID<>b.ID
)
) y WHERE y.Stop<x.Stop
AND d.Start<y.Stop
)
) as Start, x.Stop
FROM (
SELECT DISTINCT Stop FROM Intervals a
WHERE NOT EXISTS (
SELECT * FROM Intervals b
WHERE a.Stop BETWEEN b.Start AND b.Stop
AND b.Start BETWEEN a.Start AND a.Stop
AND a.ID<>b.ID
)
) x
The above query was inspired by reading (a few years ago) the following
article:
http://msdn.microsoft.com/library/e...meintervals.asp
Razvan
Rsapru@.gmail.com wrote:
Quote:
Originally Posted by
i have a table containing following data
effdate termdate uid
---- ---- ----
1 2 1
3 4 2
5 8 3
7 9 4
11 12 5
12 13 6
3 6 7
5 9 8
>
i need to replace all the overlapping records with one record
such that resultant table shud look like
>
>
effdate termdate uid
1 2 1
11 13 2
3 9 3
>
>
Thanks
Thanks for replying.. it a lil bit.. but if we add some more data
... query goes bonkers... thanks anyways
Rohit|||Thanks for replying..
This query helped a lil bit, but if add some more data .. query goes
bonkers...
Thanks
Rohit|||Rsapru@.gmail.com wrote:
Quote:
Originally Posted by
Thanks for replying.. it a lil bit.. but if we add some more data
... query goes bonkers... thanks anyways
Can you give a specific example of a data set for which the query
misbehaves?
Also, here is your initial example:
effdate termdate uid
---- ---- ----
1 2 1
3 4 2
5 8 3
7 9 4
11 12 5
12 13 6
3 6 7
5 9 8
and a visual layout of which uid's fall within which time slots:
11
21
327
427
5378
6378
7348
8348
948
10
115
1256
136
and finally your initial example of the desired output:
effdate termdate uid
1 2 1
11 13 2
3 9 3
The ranges (1-2, 3-9, 11-13) make sense, but why do you have 11-13
associated with uid 2, and 3-9 with 3? I would expect 11-13 to be
associated with one of uid's (5,6), and 3-9 with one of (2,3,4,7,8).|||Try this
SELECT s1.Start,
MIN(t1.Stop) AS Stop
FROM Intervals s1
INNER JOIN Intervals t1 ON s1.Start <= t1.Stop
AND NOT EXISTS(SELECT * FROM Intervals t2
WHERE t1.Stop BETWEEN t2.Start AND t2.Stop
AND t2.Stop t1.Stop)
WHERE NOT EXISTS(SELECT * FROM Intervals s2
WHERE s1.Start BETWEEN s2.Start AND s2.Stop
AND s2.Start < s1.Start)
GROUP BY s1.Start|||markc600@.hotmail.com wrote:
Quote:
Originally Posted by
SELECT s1.Start,
MIN(t1.Stop) AS Stop
FROM Intervals s1
INNER JOIN Intervals t1 ON s1.Start <= t1.Stop
AND NOT EXISTS(SELECT * FROM Intervals t2
^^^
Quote:
Originally Posted by
WHERE t1.Stop BETWEEN t2.Start AND t2.Stop
AND t2.Stop t1.Stop)
WHERE NOT EXISTS(SELECT * FROM Intervals s2
^^^^^
Quote:
Originally Posted by
WHERE s1.Start BETWEEN s2.Start AND s2.Stop
AND s2.Start < s1.Start)
GROUP BY s1.Start
The indicated AND and WHERE need to be swapped, but otherwise this
appears to work. Translating to English:
1) Match each starting point with each ending point that exceeds it.
2) Eliminate all cases where the ending point could be pushed higher
by merging with another interval.
3) Eliminate all cases where the starting point could be pushed lower
by merging with another interval.
4) For each starting point, cut the list down to the lowest matched
ending point that hasn't been eliminated yet. (All other matched
ending points belong to a different merged-interval.)
Personally, I would replace
t1.Stop BETWEEN t2.Start AND t2.Stop AND t2.Stop t1.Stop
with
t2.Start <= t1.Stop AND t2.Stop t1.Stop
and similarly for the other WHERE ... BETWEEN.|||I want to suggest the next 2 queries
SELECT MIN(seq) AS start, MAX(seq) + 1 AS stop
FROM(SELECT T1.seq, T1.seq - COUNT(*) AS g
FROM (SELECT DISTINCT S1.seq
FROM Sequence AS S1,
Intervals AS I1
WHERE I1.start <= S1.seq AND I1.stop >= S1.seq + 1) AS
T1,
(SELECT DISTINCT S2.seq
FROM Sequence AS S2,
Intervals AS I2
WHERE I2.start <= S2.seq AND I2.stop >= S2.seq + 1) AS
T2
WHERE T2.seq <= T1.seq
GROUP BY T1.seq) AS T3
GROUP BY g;
SELECT MIN(seq) AS start, MAX(seq) + 1 AS stop
FROM (SELECT DISTINCT S1.seq, S1.seq - COUNT(DISTINCT S2.seq) AS g
FROM Sequence AS S1
, Sequence AS S2
, Intervals AS I1
, Intervals AS I2
WHERE I1.start <= S1.seq AND I1.stop >= S1.seq + 1
AND I2.start <= S2.seq AND I2.stop >= S2.seq + 1
AND S2.seq <= S1.seq
GROUP BY S1.seq) AS R
GROUP BY g;
The first query is cheaper in terms of I/O (i hope :)
The latter query is more compact :)
Both use a Sequence table
CREATE TABLE Sequence(seq INTEGER NOT NULL PRIMARY KEY);
INSERT INTO Sequence(seq)
SELECT (thousand * 1000 + hundred * 100 + ten * 10 + unit) AS seq
FROM (SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Thousands(thousand)
CROSS JOIN
(SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Hundreds(hundred)
CROSS JOIN
(SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Tens(ten)
CROSS JOIN
(SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Units(unit)
UNION ALL
SELECT 10000;
--
Andrey Odegov
avodeGOV@.yandex.ru
(remove GOV to respond)|||This might be one of those cases when a cursor runs faster.
--------
Alex Kuznetsov
http://sqlserver-tips.blogspot.com/
http://sqlserver-puzzles.blogspot.com/sql
No comments:
Post a Comment