Call now: (800) 766-1884  



 Home


 SQL Server Tips
 SQL Server Training

 SQL Server Consulting
 SQL Server Support
 SQL Server Remote DBA



 Articles
 Services
 SQL Server Scripts
 Scripts Menu



 

 

 

 
  SQL Server Tips by Gama and Naughter

 

Calculating Modal with a SP
 

The modal algorithm is a three step process: first it checks for all unique values from the list, second it counts the number of occurrences of each value and third it chooses the values with the highest number of occurrences. With SQL, one SELECT statement can do it:

 

CREATE PROCEDURE myModeRoyalty

--Calculate the modal of the royalty field from table roysched

AS

SELECT royalty AS Mode FROM roysched GROUP BY royalty HAVING COUNT(*) =

(SELECT MAX(tcount) as Tmax FROM (SELECT COUNT(*) as tcount  FROM roysched  GROUP BY royalty) tmpCount)

 

Let us examine this SELECT statement in detail. The first query checks for all unique values:

 

SELECT royalty AS Mode FROM roysched GROUP BY royalty

 

The GROUP statement groups all the same values together but it needs to filter the values with a number of occurrences lower than the maximum. HAVING will filter the values but it needs to know what is the maximum number of occurrences. Before identifying the highest number of occurrences, it is necessary to count the number of occurrences for all values. The subquery inside the derived table will count all unique elements:

 

SELECT royalty, count(royalty) AS total FROM roysched GROUP BY royalty

 

As it is not possible to perform an aggregate function on an expression containing an aggregate or a subquery, a derived table will extract the maximum number of occurrences:

 

SELECT MAX(total) FROM (SELECT count(royalty) AS total FROM roysched GROUP BY royalty) tmpCount

 

Putting it all together, we get the initial query.

 

One might be tempted to simplify the calculations by pre-calculating the highest number of occurrences by rewriting the SP like this:

 

CREATE PROCEDURE myModeRoyalty2

--Calculate the modal of the royalty field from table roysched, with the maximum number of occurrences pre-calculated

AS

DECLARE @tcount int

SELECT @tcount=COUNT(*)  FROM roysched  GROUP BY royalty ORDER BY COUNT(*)

SELECT royalty AS Mode FROM roysched GROUP BY royalty HAVING COUNT(*) =@tcount

 

This is actually worst than the previous solution because there are two table scans instead of one. If the first SELECT were identical to the derived table from the previous example there would be no such problem. Still there would be no performance gain as well.
 


The above book excerpt is from:

Super SQL Server Systems
Turbocharge Database Performance with C++ External Procedures

ISBN: 0-9761573-2-2
Joseph Gama, P. J. Naughter

 http://www.rampant-books.com/book_2005_2_sql_server_external_procedures.htm
 

 

Burleson Consulting Remote DB Administration


 

 


 

 

 

 

 
Burleson is the America's Team

Note: The pages on this site were created as a support and training reference for use by our staff of DBA consultants.  If you find it confusing, please exit this page.

Errata?  SQL Server technology is changing and we strive to update our SQL Server support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:and include the URL for the page.
 


Burleson Consulting
SQL Server database support

 

Copyright 1996 -  2013 by Vaaltech Web Services. All rights reserved.

Hit Counter