The longer I’m a SQL guy, the more convinced I am that many of the world’s problems can be solved with a numbers table. Well, SQL problems, anyway. A numbers table is really nothing more than an ordered table of sequential integers, running from 0 or 1 up to some useful (though not huge) number: maybe 8,000, maybe 1 million, maybe more depending on your uses. The table serves as a counter, just like looping through a WHILE loop or incrementing a counter variable or identity column. The difference between those counters and a numbers table is that the numbers table is pre-generated, and because SQL was born to work with set-based, table-based data really efficiently (versus generating counters, looping, and doing string manipulation the way true CLR/programming languages do), we can realize incredible speed gains when we leverage that fact as much as possible.
So, you may say to me “OK, John… now I’ve got a big-ass list of numbers… um… so now what?”
Well, the thinking goes like this:
Do I need to parse a string? Do I need to remove dupes from a string? Do I need to extract certain items from a string? Do I need to generate a range of dates? Do I need to generate a range of times? Do I need to generate a list of IP addresses? Do I need to find gaps in identity columns? Do I need to do anything that steps or loops or adds or otherwise moves sequentially forward or backward through anything?
If you’ve said “yes” to any of those, then you should probably create a numbers table on any and all of your SQL Servers. Now, of course I understand that it’s fairly easy to generate a sequential list of numbers at runtime in a CTE or otherwise, but in many (most?) cases a static numbers table will perform better: the table is likely already in memory and ordered/indexed, and there’s no need to add the processing and memory cost of number generation, light as it may be, to runtime.
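To make that concrete, here’s a sketch of what this looks like in practice on SQL Server. The table and column names (dbo.Numbers, N) and the row count are my own choices, not a standard, and the string-split query is just one common use of the technique:

```sql
-- Build the numbers table once; 100,000 rows covers most day-to-day uses.
CREATE TABLE dbo.Numbers (N INT NOT NULL PRIMARY KEY);

INSERT INTO dbo.Numbers (N)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects a CROSS JOIN sys.all_objects b;

-- Example use: split a comma-delimited string with no loop at all.
-- Each N where a delimiter sits just before position N marks the
-- start of an item; CHARINDEX finds where that item ends.
DECLARE @s VARCHAR(8000) = 'red,green,blue';

SELECT SUBSTRING(@s, N, CHARINDEX(',', @s + ',', N) - N) AS Item
FROM dbo.Numbers
WHERE N <= LEN(@s)
  AND SUBSTRING(',' + @s, N, 1) = ',';
```

The split returns one row per item (red, green, blue) as a single set-based scan of the numbers table, which is exactly the kind of loop-free work the table is for.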
Recently, I had the great fortune of playing with a Kindle Fire HD 8.9" tablet. At first I was really excited and stunned by how it performed. As time went on, however, I started to resent little things about it. Enough so that I finally turned it back in and am getting a Google Nexus 10" instead. Here are my experiences that led me to make this decision:
When Normalize version 2.x pulls data from SCCM, the service uses a .CONFIG file to define all the queries it will run. In general that file is either SCCM.LIGHT.EXTRACTOR.CONFIG or SCCM.FULL.EXTRACTOR.CONFIG. The LIGHT file has the minimum queries needed to normalize SCCM and make it useful. The FULL file includes a bunch of extra SCCM data for people who want to pass that extra data on through and report off Normalize and SCCM data together, instead of having to pull some data from Normalize and some from SCCM.
So, that's what pulls data from SCCM. What it actually pulls is Add/Remove Programs data (v_GS_Add_Remove_Programs_Data and v_GS_Add_Remove_Programs_64_DATA), software inventory data (v_GS_SoftwareFile and v_ProductFileInfo), and software metering data (v_MonthlyUsageSummary). I'm not privy to the actual algorithm Normalize uses to match SCCM data to the catalog, but I know it tries to match the ARP, software inventory, and metering data to products in the catalog. Well, I don't know about you, but to keep our SCCM database reasonably sized, we're only inventorying a few EXEs with Software Inventory, we've only got metering rules on a handful of EXEs, and many of our packagers in the past have changed the Add/Remove Programs names of software they repackage, so it no longer matches anything in anyone's catalog. What this all means is there are fewer data points than there should be to match to the Normalize catalog.
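If you want to gauge how sparse those inputs are in your own environment before blaming the catalog, a quick sanity check against the SCCM views named above might look like the following. This is just a sketch, the view names come from SCCM itself, but the query (and its run against the site database) is my own, not anything Normalize ships:

```sql
-- Rough counts of the raw data points Normalize has available to match.
-- Low numbers here mean thin inventory/metering coverage, not a bad catalog.
SELECT 'ARP (32-bit)'       AS Source, COUNT(*) AS Cnt FROM v_GS_Add_Remove_Programs_Data
UNION ALL
SELECT 'ARP (64-bit)',      COUNT(*) FROM v_GS_Add_Remove_Programs_64_DATA
UNION ALL
SELECT 'Software Inventory', COUNT(*) FROM v_GS_SoftwareFile
UNION ALL
SELECT 'Metering (monthly)', COUNT(*) FROM v_MonthlyUsageSummary;
```

If Software Inventory and metering come back with only a handful of rows, as they do for us, that's the ceiling on what Normalize can match, regardless of how good its algorithm is.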
For those of you using BDNA Normalize version 2.x: you may have noticed that when you're all done normalizing and want to run Normalize again, clicking the RESET button can take a long time to finish, especially if you have a larger SCCM environment.