SQL query help

Associate
Joined
9 Oct 2006
Posts
1,945
Location
Location Location!
As you can see my signature updates, does this using the GD library in PHP. It currently stores a ton of log information in SQL which is nearly 24,000 entries all containing IP, browser, time and referrer.

Im making a stats file, i've got many things such as how many times your IP viewed the signature, total number of unique viewers etc. ( got some very interesting numbers )

I am trying to find the most popular referrer. To do this i currently use a SELECT DISTINCT then loop through every result which takes absolutly ages seeing as its over 1800 queries.

I'm sure this is possible but i can't seem to find it. Need to find which link is most common in the logs table.

Help please :)
 
Last edited:
Do you have individual fields for your log entries? I.e [Time], [date], [c-ip] etc.. Or is it literally one long text entry per field of 'visitors'?
 
Conrad11 said:
Could you not have one record for each IP Address?

...adding one to the record every time they view?

The IP wouldn't view from the same referrer everytime.

!bluetonic! said:
Do you have individual fields for your log entries? I.e [Time], [date], [c-ip] etc.. Or is it literally one long text entry per field of 'visitors'?

Its a table with fields ID, IP, referrer, browser, time and it adds a new entry everytime signature is viewed.

Edit: Don't worry just got it working by looking within PHP instead of sending billions of queries.

It normally takes me 3 revisions to get my code effective. Im terrible heh.
 
jdickerson said:
Your counter just went down on a refresh :P

It what??? :confused: :confused:

Anyway, I finally found out the issue. The most referrers was no referrer, it that makes any sense ( dam you people with referrer turned off ).

Just changed the query to not include ones with a blank referrers and it works now.

Stats below :) :)

http://jaf.richieward.com/sigcounter/jaffa_cake/stats.php
 
Last edited:
Nice to see FF trouncing IE in the browser wars!

Mac beating Linux though? :o

Come on people! Kubuntu + Beryl = newbies love of linux :D
 
I feel its time for me to add more detailed stats to the image. Watch this space

index.php
 
one little criticism of your stats page if i may.... :p

you should be basing your browser/os stats off the unique hits, not the total views. :)
 
Jaffa_Cake said:
It what??? :confused: :confused:
It could well be my eyes but it went from blue xxxx6 to red xxxx5 the other day when I refreshed to see if it was measuring a unique IP referral or not. I could be going mad, no wait, madder, though.
 
marc2003 said:
one little criticism of your stats page if i may....

you should be basing your browser/os stats off the unique hits, not the total views.

It looks more impressive with total hits :( I know what you are saying and I will consider it.

Its about 9000 SQL queries to change the method of logging :D to make it effective.

Only about 20 to change the method of logging and to make it ineffective

jdickerson said:
It could well be my eyes but it went from blue xxxx6 to red xxxx5 the other day when I refreshed to see if it was measuring a unique IP referral or not. I could be going mad, no wait, madder, though.

Possibly went up 9 :) ?

Soon I will also be adding more options and custom text on the bottom left/right. :)

I am currently creating a full signup/control panel for signatures making them fully customizable instead of revising!
 
what do you mean 9000 sql queries? :p

i've recently started logging hits from my sig too using php/mysql (yes i'm a copycat :o).

this is the code in my sig....

Code:
$ip = $_SERVER["REMOTE_ADDR"];
$ref = $_SERVER["HTTP_REFERER"];
$time = time();
$agent = $_SERVER["HTTP_USER_AGENT"];
mysql_query("INSERT INTO logs(ip, ref, time, agent) VALUES('$ip', '$ref', '$time', '$agent')");

my results page is running 3 queries and i'm using php inside the while loops to get the totals. here's how i'm doing the browser/os bit....

Code:
//returns only records with unique ip
$result = mysql_query("SELECT agent, time FROM logs GROUP BY ip");
while($row = mysql_fetch_array($result))
        if(strstr(strtolower($row2['agent']), "intel mac os x")) {
		$os['intel mac os x']++;
	}
	if(strstr(strtolower($row2['agent']), "ppc mac os x")) {
		$os['ppc mac os x']++;
	}
        etc etc.....
}

and this is my results page. clicky
 
Thats the method I am currently using. To convert it to an effective method I figured using a SELECT DISTINCT then looping through every IP and creating a new single log for that IP in a different table.

Effectivly having one entry per IP to make things easier, maybe I should have thought this through when i originally posted it :rolleyes: Im seeing problems with that now.

Its still being improved and I would love to steal some of your code if you want to MSN me pleeease ;)

Im taking signature offline for maintenance and optimisation. I do like your stats page very much and my method of finding the most popular referrer.. works. BADLY, im sure there must be a better way.

My logging query looks as follows
Code:
$query = "INSERT INTO `jaffa_cake_logs` (`id`, `ip`, `referrer`, `browser`, `time`)
VALUES (NULL, " . $sql->safe($_SERVER['REMOTE_ADDR']) . ", " . $sql->safe(@$_SERVER['HTTP_REFERER']) . ", " . $sql->safe(@$_SERVER['HTTP_USER_AGENT']) . ", " . $sql->safe(time()) . " )";

In good tradition my code ALWAYS has to be terribly optimised until about 3 revisions later.
 
Last edited:
well you've already seen my insert statement above. is there any need to validate the server variables? i honesty didn't know they needed checking? i have just the one table containing just the 4 columns. ip, refererer, time and useragent.

as mentioned my results page has 3 queries - one for total hits, one for unique hits and one to get the top 10 referring pages. i don't comment my code unfortunately and it's probably a mess as i'm still a noob - i've only just figured out how to use functions. can't get my head around classes at all... :o

full source....

Code:
<?php
$time_start = microtime(true);
include '../config.php';

function display($title, $data, $width) {
	echo '<div style="margin-bottom: 10px; float: left; width: '.$width.'px;">';
	echo '<table style="width: '.$width.'px; table-layout: fixed; empty-cells: hide;">';
	echo '<tr><td style="width: '.($width-100).'px;">'.$title.'</td></tr>';
	foreach($data as $type => $value) {
		echo '<tr><td>'.$type.'</td><td>'.$value.'</td></tr>';
		echo "\n";
	}
	echo '</table></div>';
}

$total = $unique = array(
'overall' => 0,
'last 24 hours' => 0,
'last hour' => 0,
);

$os = array(
'intel mac os x' => 0,
'ppc mac os x' => 0,
'other linux' => 0,
'ubuntu' => 0,
'debian' => 0,
'fedora' => 0,
'other windows' => 0,
'windows 2000' => 0,
'windows xp' => 0,
'windows 2003' => 0,
'windows vista' => 0
);

$browser = array(
'firefox' => 0,
'opera' => 0,
'internet explorer' => 0,
'safari' => 0
);

$result1 = mysql_query("SELECT time FROM logs");
$total['overall'] = mysql_num_rows($result1);
while($row1 = mysql_fetch_array($result1)) {
	if(time()-$row1['time'] < 86400) {
		$total['last 24 hours']++;
	}
	if(time()-$row1['time'] < 3600) {
		$total['last hour']++;
	}
}

$result2 = mysql_query("SELECT agent, time FROM logs GROUP BY ip");
$unique['overall'] = mysql_num_rows($result2);
while($row2 = mysql_fetch_array($result2)) {
	if(time()-$row2['time'] < 86400) {
		$unique['last 24 hours']++;
	}
	if(time()-$row2['time'] < 3600) {
		$unique['last hour']++;
	}
	if(strstr(strtolower($row2['agent']), "linux")) {
		if(strstr(strtolower($row2['agent']), "debian")) {
			$os['debian']++;
		} elseif(strstr(strtolower($row2['agent']), "fedora")) {
			$os['fedora']++;
		} elseif(strstr(strtolower($row2['agent']), "ubuntu")) {
			$os['ubuntu']++;
		} else {
			$os['other linux']++;
		}
	}
	if(strstr(strtolower($row2['agent']), "intel mac os x")) {
		$os['intel mac os x']++;
	}
	if(strstr(strtolower($row2['agent']), "ppc mac os x")) {
		$os['ppc mac os x']++;
	}
	if(strstr(strtolower($row2['agent']), "windows")) {
		if(strstr(strtolower($row2['agent']), "windows nt 5.0")) {
			$os['windows 2000']++;
		} elseif(strstr(strtolower($row2['agent']), "windows nt 5.1")) {
			$os['windows xp']++;
		} elseif(strstr(strtolower($row2['agent']), "windows nt 5.2")) {
			$os['windows 2003']++;
		} elseif(strstr(strtolower($row2['agent']), "windows nt 6.0")) {
			$os['windows vista']++;
		} else {
			$os['other windows']++;
		}
	}
	if(strstr(strtolower($row2['agent']), "firefox")) {
		$browser['firefox']++;
	}
	if(strstr(strtolower($row2['agent']), "opera")) {
		$browser['opera']++;
	}
	if(strstr(strtolower($row2['agent']), "msie")) {
		$browser['internet explorer']++;
	}
	if(strstr(strtolower($row2['agent']), "safari")) {
		$browser['safari']++;
	}
}
array_multisort($os, SORT_DESC);
array_multisort($browser, SORT_DESC);
$result3 = mysql_query("SELECT COUNT(ref), ref FROM logs WHERE ref <> '' GROUP BY ref ORDER BY COUNT(ref) DESC LIMIT 10");
while($row3 = mysql_fetch_array($result3)) {
	$row3['ref'] = htmlentities($row3['ref']);
	$referrers['<a href="'.$row3['ref'].'">'.$row3['ref'].'</a>'] = $row3['COUNT(ref)'];	
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title><?php echo $_SERVER["SCRIPT_URI"]; ?></title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link rel="stylesheet" type="text/css" href="../css/style.css" />
</head>
<body>
<div style="height: 100%;" class="main">
<div style="padding: 4px;">
<p>my now playing signature viewing statistics summary (<a href="raw.php">view raw database data</a>)</p><br />
<?php
display('total views', $total, 350);
display('unique hits', $unique, 350);
display('top operating systems', $os, 350);
display('top browsers', $browser, 350);
display('top 10 referring pages', $referrers, 700);
$time_end = microtime(true);
$time = $time_end - $time_start;
echo '<div style="clear: left;"><p>script execution time - '.$time.' seconds</p></div>';
?>
</div>
</div>
</body>
</html>
 
marc2003 said:
well you've already seen my insert statement above. is there any need to validate the server variables? i honesty didn't know they needed checking? i have just the one table containing just the 4 columns. ip, refererer, time and useragent.

as mentioned my results page has 3 queries - one for total hits, one for unique hits and one to get the top 10 referring pages. i don't comment my code unfortunately and it's probably a mess as i'm still a noob - i've only just figured out how to use functions. can't get my head around classes at all... :o

Classes took me a while too :p

I always filter ALL of my SQL input just from good practice. The HTTP_USER_AGENT and HTTP_REFERRER variables are taken from the HTTP headers sent by the users browser like mine is:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3

This can easily by modified sending bad input, same with the referrer.

Ty for code :)
 
Btw im not sure your generated time is correct, i used the method you are using originally. Now using this

Top
Code:
$mtime = microtime();
$mtime = explode(" ",$mtime);
$mtime = $mtime[1] + $mtime[0];
$starttime = $mtime;

Bottom
Code:
$mtime = microtime();
$mtime = explode(" ",$mtime);
$mtime = $mtime[1] + $mtime[0];
$endtime = $mtime;
$totaltime = ($endtime - $starttime);
 
well i've changed it but i'm not too sure if it makes any difference. for example if i dump the whole database on one page, it takes about 8 seconds using both timing methods. :p
 
Back
Top Bottom