SQL GROUP BY
SQL GROUP BY
In this article, we use PostgreSQL to explian group by
operation. Let’s firstly create two tables course
and student
:
1 | CREATE TABLE COURSE |
So, now we have two tables in PostgreSQL:
Firstly let’s do the following query:
SELECT COURSENAME, NAME FROM STUDENT, COURSE WHERE STUDENT.COURSEID = COURSE.ID;
As shown, we have 6 students, who take 2 courses MSc
and BSc
. Can we separate these 6 students according to their coursename? We can use group by
:
SELECT COURSENAME, NAME FROM STUDENT, COURSE WHERE STUDENT.COURSEID = COURSE.ID GROUP BY COURSENAME;
However, this would confuse the DBMS, because it cannot do like this:
1 | DBMS can only generate tuples, however, the following data are not tuples: |
Therefore we need to modify our query like this:
1 | -- add a aggregate function on the attribute "NAME" |
Then the DBMS can do the following job for us:
1 | coursename | count |
Let’s try another query:
1 | SELECT COURSEID, COURSENAME, NAME FROM STUDENT, COURSE WHERE STUDENT.COURSEID = COURSE.ID GROUP BY COURSEID; |
It would give us the following result:
1 | courseid | coursename | name |
Now we want to use group by
like this:
1 | SELECT COURSEID, COURSENAME, COUNT(NAME) FROM STUDENT, COURSE WHERE STUDENT.COURSEID = COURSE.ID GROUP BY COURSEID; |
Again, this would confuse the DBMS, because it cannot do this:
1 | These are not tuples: |
To solve the problem, we have two methods:
- We can add a aggregate function to
COURSENAME
:
1 | SELECT COURSEID, COUNT(COURSENAME), COUNT(NAME) FROM STUDENT, COURSE WHERE STUDENT.COURSEID = COURSE.ID GROUP BY COURSEID; |
which gives us the following results:
1 | courseid | count(coursename) | count(name) |
- We can group by
COURSEID
andCOURSENAME
:
1 | SELECT COURSEID, COURSENAME, COUNT(NAME) FROM STUDENT, COURSE WHERE STUDENT.COURSEID = COURSE.ID GROUP BY COURSEID, COURSENAME; |
which gives us the following results:
1 | courseid | coursename | count |
Both methods can give us tuples.