Skip to content

The following articles were found with tag mysql:

Deriving values for a non-existent table

Tue Oct 16, 2007 at 16.06

This situation might arise in rare cases but if it does there is a simple solution that should get you on your way. Here is the problem: You have two tables that are related by a column in each of them. However, there may or may not be a record in either of the tables for any given key. It is a bit tricky to explain in words, so perhaps an example is in order. For whatever reason you are working with the following tables

create table league_games(
game_id int,
venue_id int,
game_date datetime,
total_points int,
team_id int,
primary key(game_id, venue_id, game_date),
unique ven_date_uniq(venue_id, game_date)
);

insert into league_games values
(1, 1, '2007-01-01 12:00:00', 89, 1),
(2, 1, '2007-01-01 16:30:00', 72, 2),
(3, 1, '2007-01-02 11:00:00', 103, 7),
(4, 1, '2007-01-02 19:30:00', 97, 10),
(5, 2, '2007-01-08 16:00:00', 65, 4),
(6, 2, '2007-01-08 19:00:00', 78, 2)
(17, 2, '2007-01-09 17:30:00', 102, 12),
(18, 2, '2007-01-09 19:00:00', 58, 13),
(19, 3, '2007-01-19 14:30:00', 78, 17),
(20, 3, '2007-01-19 16:00:00', 87, 22),
(21, 3, '2007-01-19 18:00:00', 109, 27);

create table tournament_games(
tournament_id int,
game_id int,
game_date datetime,
total_points int,
team_id int,
primary key(tournament_id, game_id),
unique team_date_uniq(team_id, game_date)
);

insert into tournament_games values
(1, 7, '2007-03-07 09:30:00', 88, 1),
(1, 8, '2007-03-07 12:00:00', 76, 3),
(1, 9, '2007-03-07 14:30:00', 92, 5),
(1, 10, '2007-03-07 17:00:00', 80, 7),
(1, 11, '2007-03-07 19:30:00', 109, 9),
(1, 12, '2007-03-08 09:30:00', 97, 2),
(1, 13, '2007-03-08 12:00:00', 56, 4),
(1, 14, '2007-03-08 14:30:00', 85, 6),
(1, 15, '2007-03-08 17:00:00', 105, 8),
(1, 16, '2007-03-08 19:30:00', 93, 10);

From these tables we are trying to calculate the total number of points accumulated by each team in both tournament and league play for the Spring 2007 (January - May 2007) season. It is important to note that we do not have a teams table to join against, which is the crux of the problem. However before we jump into a solution we need to discuss another fact. While it may seem a little counter-intuitive that a team would play in the tournament but not in the leauge, our data supports this situation. Also a team may play in the league but not in the tournament. This adds a dimension to our problem because you must account for the missing values (which in an OUTER JOIN will produce NULLs) in your query. This causes problems because the sum of an integer and NULL is not defined, so it is NULL. But NULL isn't a number so the total points cannot be NULL. To get around this problem we will use the COALESCE function which takes the first non-NULL value in its argument list as the return value. Let's have a look at our first query attempt

select l.team_id
, sum(coalesce(l.total_points, 0) + coalesce(t.total_points,0)) total_points
from league_games l
left outer join tournament_games t
on l.team_id = t.team_id
and t.game_date > '2006-12-31 23:59:59'
and t.game_date < '2007-05-01 00:00:00'
where l.game_date > '2006-12-31 23:59:59'
and l.game_date < '2007-05-01 00:00:00'
group by l.team_id

Now this works fine provided there are no teams in the tournament that are not also in the league play. Given the connotation of league and tournament play, it doesn't make sense for this situation to arise, but humor me. We are, given the data above, missing point totals for teams 3, 5, 9, 6, and 8. To get these values we can reverse the order of the tables in query but then we will be missing totals for teams 4, 12, 13, 17, 22, and 27. The only way is to add a third table to the query that gives the valid teams. In an ideal situation we would have a table, teams or even teams_seasons, that would provide the necessary information, but we are not so lucky in our example. To get this data we will derive our teams table in the query using a union. Here is the query to derive the teams table

select l.team_id
from league_games l
where l.game_date > '2006-12-31 23:59:59'
and l.game_date < '2007-05-01 00:00:00'

union

select l.team_id
from tournament_games l
where l.game_date > '2006-12-31 23:59:59'
and l.game_date < '2007-05-01 00:00:00'

order by team_id

This produces a list of team_ids that appeared in the Spring 2007 season which we can join against both games tables to get our final result. You might be tempted to just join against this table and forget about the date range for the games tables, but that would be a mistake. While it is possible that a given team only played during one season, it is more likely that they played in multiple seasons and not including the conditions on the games tables would skew your results to total points over all time and not only in the given season. Putting it all together we have our final solution

select teams.team_id
, sum(coalesce(l.total_points, 0) + coalesce(t.total_points,0)) total_points
from (
select l.team_id
from league_games l
where l.game_date > '2006-12-31 23:59:59'
and l.game_date < '2007-05-01 00:00:00'

union

select l.team_id
from tournament_games l
where l.game_date > '2006-12-31 23:59:59'
and l.game_date < '2007-05-01 00:00:00'

order by team_id
) teams
left outer join league_games l
on teams.team_id = l.team_id
and l.game_date > '2006-12-31 23:59:59'
and l.game_date < '2007-05-01 00:00:00'
left outer join tournament_games t
on teams.team_id = t.team_id
and t.game_date > '2006-12-31 23:59:59'
and t.game_date < '2007-05-01 00:00:00'
group by l.team_id
order by teams.team_id

Which correctly answers our query and is an elegant solution to the presented problem.

Bad Practice: DB Queries inside loops

Sat Sep 29, 2007 at 03.37

This is something I see day in and day out, and unfortunately is a deep gash into the good reputation of otherwise awesome PHP coders. If you ever find yourself putting a database query inside a loop, stop and study the code for a long moment. Chances are this is unnecessary.

Generally speaking the only acceptable time for a query to be inside a loop is when it is contained in a script that will only be used once and that is meant to do some sort of data transfer/transformation (which can probably be better accomplished using a pure SQL solution, but that is a battle for another day.)

So let's examine one of the more common occurrences of this problem:

$sql_1 = "SELECT user_id
, SUM(points) total_points
FROM user_points
GROUP BY user_id
ORDER BY total_points DESC";

$query_1 = mysql_query($sql_1) or die(mysql_error() . "<br />" . $sql_1);

while($row_1 = mysql_fetch_assoc( $query_1 )) {
$sql_2 = "SELECT first_name
, last_name
FROM users
WHERE user_id = " . $row_1['user_id'];

$query_2 = mysql_query( $sql_2 ) or die( mysql_error() . "<br />" . $sql_2);
//more code follows
}

If you recognize this code and are thinking, "Hey man, what's wrong with that? That's how I always do it!" then I am extremely happy that you are reading this post.

The problem is that we really don't know how many users there are - and even if we did this is still a bad solution - in the user_points table and thus we are repeating the query to get the user's name an unknown number of times (N times.) This is a resource hog of a script to do something that is simple to accomplish using the proper query.

In this case, had we known about and employed an INNER JOIN, we will eliminate the subsequent calls to the database and retrieved all the information we need in one trip. This terminates the script's connection to the database quicker and allows another connection to be established by a different script. This is a good thing, trust me. If you have ever had to figure out what is causing your server to crawl you will know that I speak the truth.

So how do we solve the problem? As stated above we take that query and get the information using an INNER JOIN. Here is our new query

SELECT u.user_id
, u.first_name
, u.last_name
, SUM( up.points ) total_points
FROM user_points up
INNER JOIN users u
ON up.user_id = u.user_id
GROUP BY user_id
ORDER BY total_points

If any of that looks foreign to you, it's time you brushed up on your SQL. I suggest reading a few tutorials on JOINs.

Debugging MySQL Queries in PHP

Fri Sep 28, 2007 at 23.21

Over at what is becoming my second home a frequently asked question ultimately results in the original poster needed some guidance in debugging interaction between their PHP scripts and their database, namely MySQL. I hope to uncover some best practices for making it easier to figure out what is going on with your script.

This first snippet is a commonly occurring one and will be known as "the hard way" in terms of finding the cause of your incorrect results quickly.

$result = @mysql_query("SELECT * FROM table");
while($row = @mysql_fetch_assoc($result)) {
echo $row['user_id'] . " " . $row['username'];
}

This is the hard way because we have suppressed any errors using the @ operator and assumed that there will always be results. If something goes wrong, we might be surprised, or even worse just assume that there aren't any results that match our query. The query from above is rather simple and will probably not produce an error, but once you get into more involved queries you might have a mistake or two. Which brings me to my first point: you should create your queries outside of PHP (e.g. on the command line or in a product such as MySQL Query Browser or PhpMyAdmin) and ensure that they are working before adding them to the equation in your script.

Once you have your query working correctly directly against the database add it to your script with a few additions to the above code. First we will utilize the native function mysql_error to return the error from the MySQL engine and use it to show us exactly what went wrong. This coupled with the display of the query that was actually sent to the engine will prove extremely helpful in the future. This is how a query snippet might look with some more debugging features

$sql = "SELECT user_id, username FROM users";
$result = mysql_query($sql) or die("MySQL error: " . mysql_error() . " generated by " . $sql);

In the event of a failed query, we can now see the error generated by the db engine and the actual query that caused the error. This however is not something you should find in a production environment. The reason is outside the scope of this post, but involves security by not hinting at database structure.

When handling the utilization of result sets the result set should first be verified. We did not do this in our first example we just assumed that there were rows and that the query was successful. That can also be handled better using the following snippet

if($result === false) {
echo "Database query returned an error.";
}
elseif( mysql_num_rows( $result ) == 0 ) {
echo "There were no results returned.";
}
else {
while($row = mysql_fetch_assoc( $result) ) {
echo $row['user_id'] . " " . $row['user_name'];
}
}

Naturally you will probably want to handle the cases a little differently (i.e. not just echo out those simple statements) but you get the general idea.

The Archives

Search the Archives

Subscribe

RSS - All

Recently Popular Tags (past 2 weeks)

All-time Popular Tags

Random Tag Delivery