Date/Time Stamp: Date overlapping

i have a table containing following data
effdate termdate uid
---- ---- ----
1 2 1
3 4 2
5 8 3
7 9 4
11 12 5
12 13 6
3 6 7
5 9 8

i need to replace all the overlapping records with one record
such that resultant table shud look like

effdate termdate uid
1 2 1
11 13 2
3 9 3

ThanksGiven the following DDL and sample data:

CREATE TABLE Intervals (
ID int IDENTITY PRIMARY KEY,
Start int,
Stop int,
UNIQUE (Start, Stop)
)

INSERT INTO Intervals VALUES (1,2)
INSERT INTO Intervals VALUES (3,4)
INSERT INTO Intervals VALUES (5,8)
INSERT INTO Intervals VALUES (7,9)
INSERT INTO Intervals VALUES (11,12)
INSERT INTO Intervals VALUES (12,13)
INSERT INTO Intervals VALUES (3,6)
INSERT INTO Intervals VALUES (5,9)

The following query returns the expected result:

SELECT (
SELECT MIN(d.Start) FROM Intervals d
WHERE d.Start<x.Stop AND NOT EXISTS (
SELECT * FROM (
SELECT DISTINCT Stop FROM Intervals a
WHERE NOT EXISTS (
SELECT * FROM Intervals b
WHERE a.Stop BETWEEN b.Start AND b.Stop
AND b.Start BETWEEN a.Start AND a.Stop
AND a.ID<>b.ID
)
) y WHERE y.Stop<x.Stop
AND d.Start<y.Stop
)
) as Start, x.Stop
FROM (
SELECT DISTINCT Stop FROM Intervals a
WHERE NOT EXISTS (
SELECT * FROM Intervals b
WHERE a.Stop BETWEEN b.Start AND b.Stop
AND b.Start BETWEEN a.Start AND a.Stop
AND a.ID<>b.ID
)
) x

The above query was inspired by reading (a few years ago) the following
article:
http://msdn.microsoft.com/library/e...meintervals.asp
Razvan

Rsapru@.gmail.com wrote:

Quote:

Originally Posted by

i have a table containing following data
effdate termdate uid
---- ---- ----
1 2 1
3 4 2
5 8 3
7 9 4
11 12 5
12 13 6
3 6 7
5 9 8
>
i need to replace all the overlapping records with one record
such that resultant table shud look like
>
>
effdate termdate uid
1 2 1
11 13 2
3 9 3
>
>
Thanks

|||Hey,
Thanks for replying.. it a lil bit.. but if we add some more data
... query goes bonkers... thanks anyways

Rohit|||Thanks for replying..
This query helped a lil bit, but if add some more data .. query goes
bonkers...

Thanks
Rohit|||Rsapru@.gmail.com wrote:

Quote:

Originally Posted by

Thanks for replying.. it a lil bit.. but if we add some more data
... query goes bonkers... thanks anyways

Can you give a specific example of a data set for which the query
misbehaves?

Also, here is your initial example:

effdate termdate uid
---- ---- ----
1 2 1
3 4 2
5 8 3
7 9 4
11 12 5
12 13 6
3 6 7
5 9 8

and a visual layout of which uid's fall within which time slots:

11
21
327
427
5378
6378
7348
8348
948
10
115
1256
136

and finally your initial example of the desired output:

effdate termdate uid
1 2 1
11 13 2
3 9 3

The ranges (1-2, 3-9, 11-13) make sense, but why do you have 11-13
associated with uid 2, and 3-9 with 3? I would expect 11-13 to be
associated with one of uid's (5,6), and 3-9 with one of (2,3,4,7,8).|||Try this

SELECT s1.Start,
MIN(t1.Stop) AS Stop
FROM Intervals s1
INNER JOIN Intervals t1 ON s1.Start <= t1.Stop
AND NOT EXISTS(SELECT * FROM Intervals t2
WHERE t1.Stop BETWEEN t2.Start AND t2.Stop
AND t2.Stop t1.Stop)
WHERE NOT EXISTS(SELECT * FROM Intervals s2
WHERE s1.Start BETWEEN s2.Start AND s2.Stop
AND s2.Start < s1.Start)
GROUP BY s1.Start|||markc600@.hotmail.com wrote:

Quote:

Originally Posted by

SELECT s1.Start,
MIN(t1.Stop) AS Stop
FROM Intervals s1
INNER JOIN Intervals t1 ON s1.Start <= t1.Stop
AND NOT EXISTS(SELECT * FROM Intervals t2

^^^

Quote:

Originally Posted by

WHERE t1.Stop BETWEEN t2.Start AND t2.Stop
AND t2.Stop t1.Stop)
WHERE NOT EXISTS(SELECT * FROM Intervals s2

^^^^^

Quote:

Originally Posted by

WHERE s1.Start BETWEEN s2.Start AND s2.Stop
AND s2.Start < s1.Start)
GROUP BY s1.Start

The indicated AND and WHERE need to be swapped, but otherwise this
appears to work. Translating to English:

1) Match each starting point with each ending point that exceeds it.

2) Eliminate all cases where the ending point could be pushed higher
by merging with another interval.

3) Eliminate all cases where the starting point could be pushed lower
by merging with another interval.

4) For each starting point, cut the list down to the lowest matched
ending point that hasn't been eliminated yet. (All other matched
ending points belong to a different merged-interval.)

Personally, I would replace
t1.Stop BETWEEN t2.Start AND t2.Stop AND t2.Stop t1.Stop
with
t2.Start <= t1.Stop AND t2.Stop t1.Stop
and similarly for the other WHERE ... BETWEEN.|||I want to suggest the next 2 queries

SELECT MIN(seq) AS start, MAX(seq) + 1 AS stop
FROM(SELECT T1.seq, T1.seq - COUNT(*) AS g
FROM (SELECT DISTINCT S1.seq
FROM Sequence AS S1,
Intervals AS I1
WHERE I1.start <= S1.seq AND I1.stop >= S1.seq + 1) AS
T1,
(SELECT DISTINCT S2.seq
FROM Sequence AS S2,
Intervals AS I2
WHERE I2.start <= S2.seq AND I2.stop >= S2.seq + 1) AS
T2
WHERE T2.seq <= T1.seq
GROUP BY T1.seq) AS T3
GROUP BY g;

SELECT MIN(seq) AS start, MAX(seq) + 1 AS stop
FROM (SELECT DISTINCT S1.seq, S1.seq - COUNT(DISTINCT S2.seq) AS g
FROM Sequence AS S1
, Sequence AS S2
, Intervals AS I1
, Intervals AS I2
WHERE I1.start <= S1.seq AND I1.stop >= S1.seq + 1
AND I2.start <= S2.seq AND I2.stop >= S2.seq + 1
AND S2.seq <= S1.seq
GROUP BY S1.seq) AS R
GROUP BY g;

The first query is cheaper in terms of I/O (i hope :)
The latter query is more compact :)

Both use a Sequence table

CREATE TABLE Sequence(seq INTEGER NOT NULL PRIMARY KEY);

INSERT INTO Sequence(seq)
SELECT (thousand * 1000 + hundred * 100 + ten * 10 + unit) AS seq
FROM (SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Thousands(thousand)
CROSS JOIN
(SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Hundreds(hundred)
CROSS JOIN
(SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Tens(ten)
CROSS JOIN
(SELECT 0 UNION ALL SELECT 1 UNION ALL
SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL
SELECT 6 UNION ALL SELECT 7 UNION ALL
SELECT 8 UNION ALL SELECT 9) AS Units(unit)
UNION ALL
SELECT 10000;

--
Andrey Odegov
avodeGOV@.yandex.ru
(remove GOV to respond)|||This might be one of those cases when a cursor runs faster.

--------
Alex Kuznetsov
http://sqlserver-tips.blogspot.com/
http://sqlserver-puzzles.blogspot.com/sql

Wednesday, March 21, 2012

Date overlapping

No comments:

Post a Comment

Date/Time Stamp

Blog Archive

About Me