Monday, March 12, 2012

Joining 2 fields: Redundancy results..need help

I am working on this access DB that has been created by some consulting firm couple of years ago, which I wasnt here at that time.

The DB is being by college faculty members to store about students who study abroad..
now,
there are students who has more than one major.. and when I run query to find out how many students are studying abroad..
it shows me more than the actual number of students who are studying abroad, because it shows same student twice because of double major..
In other words, it inserts two records for a student who has 2 majors into the DB.

Now my question is, is it possible to combine 2 records into one record on query results?
I know you can do the following: SELECT StudentName = 'major + major'

but the problem is, the name for field major is the same.. so I cannot say in my query 'Major + Major' to combine 2 records. it doesnt work..

let me know if anyone has solution to this, that will be greatful...
thanks,

moradCan you post your table structure? That would help me give you more specific answers.

The short answer is to pick one of the majors as the "most important", and select only that row using a condition in the WHERE clause. If you want access to both rows, use a LEFT JOIN to get access to the second row. If you post a DDL declaration, I can give you more specific help.

-PatP|||I just attached the structure..
well, there is most important major on this database..
and all records are stored in that table...

for example.. here is my query

First Name Last Name Host Country ProgramName Sponsor
Rebecca AINSWO Griffith University Australia N/A Direct
Ronda ALEXAN Universitat at Bonn Germany N/A Western Michigan University
Matt ANDER Rikkyo University Japan N/A Western Michigan University
Matt ANDER Rikkyo University Japan N/A Western Michigan University
Nicholas Applin University of Wollongong Australia N/A Western Michigan University

I get two records of Ander! because Matt Ander has double major, thus he has double records...two records are the same except the major.

let me know what you think...
thanks|||Now I've got the stuff that I needed to get specific! Try using:SELECT *
FROM tblMajor AS a
LEFT JOIN tblMajor AS b
ON (b.SID = a.SID)
WHERE a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID)
AND (a.MajId < b.MajId OR b.MajId IS NULL);I'm pretty sure that this will give you what you want.

-PatP|||it worked :) but here is the thing though..
it only selected the duplicates, what about the other records that are not duplicate..

the query doesnt show them...
what do I have to add to show the rest... let me know :)
because I tried to use the union and I couldnt t use it union because I need to have same amount of columns for two tables.

thanks.. I appreciate your help|||Crud! The syntax I posted works with real SQL, but not with Jet (the default engine supplied with MS-Access). You could use something like:SELECT a.*
, (SELECT Max(b.MajId)
FROM tblMajor AS b
WHERE b.SID = a.SID
AND a.MajId < b.MajId) AS second_major
FROM tblMajor AS a
WHERE a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID);This works around an ugly limitation of the Jet database engine.

-PatP|||I tried that today at work, and it worked, but the only thing is that I wanted to change is, instead of showing what record number of the 2nd major for the student, I wanted to show the actual 2nd Major Name.

in other words I want second_major to show the actual name of the major and not the field number of 2nd major..

let me know if you have an idea..

thanks|||Picky, picky, picky... ;)SELECT a.*
, (SELECT Max(b.Major)
FROM tblMajor AS b
WHERE b.SID = a.SID
AND a.MajId < b.MajId) AS second_major
FROM tblMajor AS a
WHERE a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID);...should fix you right up!

-PatP|||hey pat,
thanks for great help...

Now, what I asked you about was for tblMajor.

what I am trying to do now, is run a query to list students and their majors as well as their minors..

When I tried to run your code for Majors.. it worked, and it added a new field called second major.

but now, when I try to include Minor in my query, it would show the same problem, because student can have more than one minor.

So basically, the objective is to get the following results
SID, Major, 2nd Major, Minor, 2nd Minor

And the minor table is same as major table design as shown above in my previous post.

Now what I want to know, is how to combine these two queries into one

this:
SELECT a.*, (SELECT Max(b.Major)
FROM tblMajor AS b
WHERE b.SID = a.SID
AND a.MajId < b.MajId) AS [Second Major]
FROM tblMajor AS a
WHERE a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID);

AND

SELECT e.*, (SELECT Max(f.Minor)
FROM tblMinor AS f
WHERE f.SID = e.SID
AND e.MinId < b.MinId) AS [Second Minor]
FROM tblMinor AS e
WHERE e.MinId = (SELECT Min(g.MinId)
FROM tblMinor AS g
WHERE g.SID = e.SID);

Now I was thinking of having using SID as relationship between them, but then I couldnt figure it out.. let me know if you have an idea of how to..

thanks|||The second minor throws an interesting wrinkle into the query, because it now makes a three set intersection instead of just two. You'll need to test this carefully with your data, but I think that you can use:SELECT a.*
, (SELECT Max(b.Major)
FROM tblMajor AS b
WHERE b.SID = a.SID
AND a.MajId < b.MajId) AS second_major
, (SELECT Max(d.Minor)
FROM tblMajor AS d
WHERE d.SID = a.SID
AND d.Minor <> a.Minor) AS second_minor
FROM tblMajor AS a
WHERE a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID);The gist of this query is that the first row you'd find if the table was sorted by SID then by MajId is assumed to contain the student's "primary" major and minor. The b and d subqueries find the largest Major and Minor that aren't the "primary" values. While this makes perfect sense to me as an outsider, it may or may not make sense in terms of your data, YMMV (your milage may vary). Test this carefully, but logically it should work.

-PatP|||I've tried the code you posted with few tweeks and I was able to get it through

SELECT a.*
, (SELECT Max(b.Major)
FROM tblMajor AS b
WHERE b.SID = a.SID
AND a.MajId < b.MajId) AS second_major
, (SELECT Max(d.Minor)
FROM tblMinor AS d, tblMinor AS e
WHERE d.SID = a.SID
AND d.Minor <> e.Minor) AS second_minor
FROM tblMajor AS a, tblProcessInfo, tblMinor
WHERE a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID) AND tblProcessInfo.SID = a.SID AND tblProcessInfo.Term = '041';

Now, this query only shows the 2nd minor and not the first minor..
I tried to select tblMinor.Minor in the select statement, and that didnt help much..
let me know what you think..

thanks..|||This is a pure crap-shoot, based on the assumption that the tblMinor structure is exactly like the tblMajor structure. You'll need to test this very carefully before you "bless" this into production!!!SELECT a.*
, (SELECT Max(b.Major)
FROM tblMajor AS e
WHERE e.SID = a.SID
AND a.MajId < e.MajId) AS second_major
, (SELECT Max(d.Minor)
FROM tblMinor AS e
WHERE e.SID = b.SID
AND b.MinId < e.MinId) AS second_minor
FROM tblProcessInfo AS p
JOIN tblMajor AS a
ON (a.SID = p.SID
AND a.MajId = (SELECT Min(c.MajId)
FROM tblMajor AS c
WHERE c.SID = a.SID))
LEFT JOIN tblMinor AS b
on (b.SID = p.SID
AND b.MinId = (SELECT Min(d.MinId)
FROM tblMinor AS d
WHERE d.SID = a.SID))
WHERE tblProcessInfo.Term = '041';If this doesn't work, I'd suggest that you create a "play" copy of your MDB file. Butcher the names and the universities to avoid giving out any usable personal information and post the MDB so I can work with your structures instead of having to guess about everything.

Better yet, see if you can find some enterprising grad student scrabbling for some way to get a few co-op dollars or even just some resum worthy experience! I'm sure that some of them would eat this kind of problem alive, and grovel for the opportunity!

-PatP|||hey pat, thanks for your great help..
I was gone for finals and projects that were due..but everything is back to normal now :)

Now, I need a logical explanation for this problem..

When I run a query of how many students were in a certain country from year 2000 to 2004 I get 490 Students (No duplication records)

And when I run a query of how many students with majors (tblMajor Does have duplication records because of having more than one major) that went to that country from 2000 to 2004.. I get 435

I am missing 50 records when I link tblMajor.SID with tblPermInfo.SID and run a query.

I just dont get it why?!
I thought for myself, that tblPermInfo maybe is giving 490 because there is duplication of records for having more than one major, but then it is not linked to tblMajor..so there is no duplication in what so ever.
but when i run a query where SID of tblMajor and tblPermInfo is matched...it only gives me 435...
so there are SIDs that are left over because there is no match btw tables..right?

What other logical reasons could there be..
let me know what you think..
thanks

kicker

No comments:

Post a Comment