Rob Farley

Rob Farley has been consulting in IT since completing a Computer Science degree with first class honours in 1997. Before moving to Adelaide, he worked in consultancies in Melbourne and London. He runs the development department in one of Australia's leading IT firms, as well as doing database application consultancy and training. He heads up the Adelaide SQL Server User Group, and holds several Microsoft certifications.

Rob has been involved with Microsoft technologies for most of his career, but has also done significant work with Oracle and Unix systems. His preferred database is SQL Server and his preferred language is C#. Recently he has been involved with Microsoft Learning in the US, creating and reviewing new content for the next generation of Microsoft exams.

Over the years, Rob's clients have included BP Oil, OneLink Transit, Accenture, Avanade, the Australian Electoral Commission, the Chartered Institute of Personnel and Development, the Royal Borough of Kingston, Help The Aged, Unisys, the Department of Treasury and Finance (Vic), National Mutual, the Bible Society and others.

22 October 2005

Median aggregate working

Well, I feel like I cheated, but I do have a working user-defined aggregate function that calculates the median.

I wrote earlier about it not really being possible. Well, I managed to implement one that works, even if it's nastily done, really slow, and has other issues too.

I persist the data in a file, and then in the Terminate() function, I read it back into an ArrayList, sort it, and grab the middle record. It's not very elegant, but it works.
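The Terminate() step I'm describing boils down to something like the following standalone sketch (names and the console demo are mine, not the actual aggregate code, which also needs the SqlUserDefinedAggregate plumbing around it):

```csharp
using System;
using System.Collections;
using System.IO;

class MedianDemo
{
    // Mirrors the Terminate() step: read the persisted values back into
    // an ArrayList, sort it, and grab the middle record.
    public static double MedianFromFile(string path)
    {
        ArrayList values = new ArrayList();
        foreach (string line in File.ReadAllLines(path))
            values.Add(double.Parse(line));
        File.Delete(path);              // tidy up the temp file when done
        values.Sort();
        return (double)values[values.Count / 2];
    }

    static void Main()
    {
        // Simulate the Accumulate() step: one value per line in a temp file.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new string[] { "5", "1", "9", "3", "7" });
        Console.WriteLine(MedianFromFile(path));    // sorted: 1,3,5,7,9 → prints 5
    }
}
```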

The issues are:

1/ I had to mark it as unsafe because it accesses the file system. So then I had to make the database trustworthy and use an sa account to import the assembly.
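The deployment I'm talking about looks roughly like this (database, assembly and path names here are illustrative, not my actual ones):

```sql
-- Allow UNSAFE assemblies in this database
ALTER DATABASE AdventureWorks SET TRUSTWORTHY ON;

-- Import the assembly with the UNSAFE permission set
CREATE ASSEMBLY MedianAggregate
FROM 'C:\clr\MedianAggregate.dll'
WITH PERMISSION_SET = UNSAFE;

-- Expose the CLR type as a T-SQL aggregate
CREATE AGGREGATE dbo.Median (@value float)
RETURNS float
EXTERNAL NAME MedianAggregate.Median;
```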

2/ If the query is interrupted, the file isn't deleted properly. I can't implement IDisposable, because the CLR object is disposed after every record, so that idea doesn't work at all. So instead, I would use a separate process to look for old files and remove them.

3/ It's really slow. On AdventureWorks on my laptop, it took less than a second to work through the 290 records in Employee, but 13 seconds to get through SalesOrderHeader. It does give the correct results, though. I figure it's no good for a website, but it could be okay for a scheduled report. And to be honest, if it's something that a SQL developer is putting together, they can use row_number() to get the median out. This is more useful as an available function that could be used in a report built by a business user in Report Manager.
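The row_number() approach I mean is along these lines (using SalesOrderHeader's TotalDue as an example column; averaging the two middle rows when the count is even):

```sql
SELECT AVG(TotalDue) AS Median
FROM (
    SELECT TotalDue,
           ROW_NUMBER() OVER (ORDER BY TotalDue) AS rn,
           COUNT(*) OVER () AS cnt
    FROM Sales.SalesOrderHeader
) t
-- For odd cnt these two expressions pick the same (middle) row;
-- for even cnt they pick the two middle rows, which AVG then combines.
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2);
```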

I guess 'really slow' is subjective. I hope so anyway. :)

I won't post code here... I'm sure you all know how to write data to a file and then read it back again. You can always post a comment if you really want to see the source code.