Wednesday, March 7, 2012

JOIN the only relationship method?

In the old days..with DbaseIII, if two tables were indexed on an ID field, a
t
the top of the program, I would "open" this relationship by using the right
syntax and "boom" ..it's done.. the record pointers are at the top of each
table waiting for a command.
In SQL, it appears the that JOIN statement is what sets up the
"relationship" only when I execute a join with a million records in each
table, it takes 15 seconds for the relationship to be "established". It
appears that during the JOIN, a scan takes place and all the records are in
both tables are "positioned" during a sequential scan of the main table.
Which wouldn't be so bad if it were a one time deal, but it's in a SP that
gets repeatedly called by a client application.
In Dbase III, because of the indexed fields, it knows to just positions the
record pointer at the top of each table..it doesn't have to scan through all
the tables "relating" all the records one by one each time the relationship
is established.
There has to be a quicker way to relate tables in these more modern
languages, but i'm not sure where to start looking.
Thanks,
Don
declare @.msym varchar(200)
declare @.mcat varchar(200)
declare @.msid varchar(200)
select @.msym = 'c'
select @.mcat = 'FINC'
select @.msid = 'Fly'
select distinct m.* from main m
left join sym2 s on s.id = m.id
left join cat2 c on c.id = m.id
left join servid2 i on i.id = m.id
The ID field in all 4 tables are INT and indexed.
The ID field in Main is unique. The ID fields in the other 3 fields are not
unique.The fact that data retrieval isn't constrained by pointers or
hierarchies in the database is a major advantage of relational
databases. SQL isn't truly relational but it does follow the model to
that extent.
Your real question should be "How do I optimize my queries?". But the
query you posted is a woeful example with which to tackle that
question. What's the point of retrieving the entire contents of the
tables in a single operation? Why have you used SELECT * instead of
specifying only the required columns? Why have you used DISTINCT? Very
likely this query will result in undesirable and slow scans and sorts -
but if this is your idea of how to write queries then you should
probably take a course before you do anything more. What are you
*really* trying to achieve?
David Portas
SQL Server MVP
--|||Here's the rest of the Select statement...you may be right, it may be a
design problem, i don't know.
where s.sym in (Select str from iter_charlist_to_table(@.msym, DEFAULT) )
or c.cat in (select str from iter_charlist_to_table(@.mcat, DEFAULT) )
or i.servid in (select str from ter_charlist_to_table(@.msid, DEFAULT))
sym, cat and servid are tables with 2 fields...ID and a varchar field. The
ID field's are unique. The varchar field can have repeating values. for
example,
1001 DELL
1004 DELL
1007 DELL
What needs to be returned is all fields from the Main table if the request
is for DELL.
What I'm trying to do is have 1 main table "Main" with unique ID field..be
related to 3 other tables on ID where the ID in these tables is not unique.
There is a varchar field in Main that has values like this:
"DELL,IBM,WHATEVER"...
A client wants all fields returned from Main that have the word "DELL" in th
e
"David Portas" wrote:

> The fact that data retrieval isn't constrained by pointers or
> hierarchies in the database is a major advantage of relational
> databases. SQL isn't truly relational but it does follow the model to
> that extent.
> Your real question should be "How do I optimize my queries?". But the
> query you posted is a woeful example with which to tackle that
> question. What's the point of retrieving the entire contents of the
> tables in a single operation? Why have you used SELECT * instead of
> specifying only the required columns? Why have you used DISTINCT? Very
> likely this query will result in undesirable and slow scans and sorts -
> but if this is your idea of how to write queries then you should
> probably take a course before you do anything more. What are you
> *really* trying to achieve?
> --
> David Portas
> SQL Server MVP
> --
>|||> where s.sym in (Select str from iter_charlist_to_table(@.msym, DEFAULT) )
> or c.cat in (select str from iter_charlist_to_table(@.mcat, DEFAULT) )
> or i.servid in (select str from ter_charlist_to_table(@.msid, DEFAULT))
Populate a table with those values before you execute the statement. That
way you can ensure the function only gets called once.

> ere is a varchar field in Main that has values like this:
> "DELL,IBM,WHATEVER"...
Deliminted lists of values in columns are a fundamental weakness. If these
are a discrete known set of values then put them in a related table.
David Portas
SQL Server MVP
--

No comments:

Post a Comment