Friday, March 9, 2012

JOIN vs Agregations

Hi,
Maybe the topic have already been touched
in that case don't hesitate to give me a link on
the corresponding thread.

I don't know the history concerning the managment
of joined tables through JOIN conditions or WHERE
statements.
But I was wondering if ANSI plans to manage aggregations
with JOIN syntaxes (like subqueries when we deal with
effectives dates for example) or if the future is given to
the subqueries

So do we have to keep using the JOIN syntax for his
lisibility or finally use only subqueries or both ?i'm not sure i understand your question, but i think the answer is "both"

it would be best if you could give an actual example of the sql|||There is a problem with your question, which is you are comparing two completely different concepts. It is therefore not possible to give you an answer specifically addressing your question, however, I can comment on the use of aggregate functions and their relationship with the Cartesian product of two sets.

The Cartesian product of two sets is not required in order to use an aggregate function. Conversely, one is not required to use aggregate functions in producing the Cartesian product of two sets.|||Sorry I have badly presented my question
Let me give you an example

Let's imagine those two tables
+----+
| employee |
+----+
| empl_id |
| first_name |
| last_name |
| company_id |
+----+

+----+
| empl_job |
+----+
| empl_id |
| eff_date |
| job_title |
+----+

Let's imagine we'd like to get the current job title
for each employees of the company xxx
Is it better to do

SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
JOIN empl_job AS b ON (
a.empl_id = b.empl_id
b.eff_date = (
SELECT MAX(eff_date)
FROM empl_job AS c
WHERE b.job_id = c.job_id
AND a.empl_id = c.empl_id
AND eff_date <= SYSDATE
)
)
WHERE
a.company_id = xxx
or

SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
JOIN empl_job AS b ON (a.empl_id = b.empl_id)
WHERE a.company_id = xxx
AND b.eff_date = (
SELECT MAX(eff_date)
FROM empl_job AS c
WHERE b.job_id = c.job_id
AND a.empl_id = c.empl_id
AND eff_date <= SYSDATE
)
)

or

SELECT
a.first_name AS "FirstName"
, a.last_name AS "LastName"
, b.job_title AS "JobTitle"
FROM
employee AS a
, empl_job AS b
WHERE a.company_id = xxx
AND a.empl_id = b.empl_id
AND b.eff_date = (
SELECT MAX(eff_date)
FROM empl_job AS c
WHERE b.job_id = c.job_id
AND a.empl_id = c.empl_id
AND eff_date <= SYSDATE
)
)

?|||so it appears that your question is "which approach is better?"

the answer is: test them and see, each dbms is different

by the way i can think of a few more variations of SQL for your problem|||With your experience,
What syntax do you considere the most maintainable ?
Which is the most portable ? and/or the most performant ?
And if I have to specify a DB, let's choose Oracle|||they are all equally maintainable

the most portable one is the one which does not use proprietary SQL like SYSDATE

performant? you will have to test them on your particular dbms|||The one that is easiest to maintain is the one that is easiest to understand by someone else. So maybe a few comments may help to explain what a subquery is doing or why certain constructs are used in specific places.

No comments:

Post a Comment