We installed the App Catalog today in production and got the dreaded "Cannot connect to the application server" error when opening the catalog. There are a number of reasons this can happen. Microsoft has listed many of them here. We came up with a new reason on our own: TLS 1.0. Knowing that SSL 3.0 and TLS 1.0 are no longer considered secure protocols, we disabled them long ago. There are a number of places in CM where disabling those can break things. Well, you can add the app catalog to the list. It needs TLS 1.0 enabled (both client and server). Enabling it, without restarting any services, cleared the error up immediately.
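For reference, this is roughly what we flipped back on. A minimal sketch using the standard Schannel registry locations; run it elevated, and verify it against your own hardening baseline before using it anywhere that matters:

```powershell
# Hedged sketch: re-enable TLS 1.0 (client and server) via the standard
# Schannel registry keys. Verify against your hardening baseline first.
$base = 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0'
foreach ($side in 'Client', 'Server') {
    $key = Join-Path $base $side
    New-Item -Path $key -Force | Out-Null
    Set-ItemProperty -Path $key -Name 'Enabled' -Value 1 -Type DWord
    Set-ItemProperty -Path $key -Name 'DisabledByDefault' -Value 0 -Type DWord
}
```

In our case no service restart was needed; the app catalog started working as soon as the values were in place.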
I found no blog posts or forum questions mentioning this setting, but I assume as more people move towards removing old protocols, it might pop up more often. And I do expect Microsoft to address this at some point so I can go back and disable TLS 1.0 again (and hopefully 1.1 as well).
Direct from the developer at Microsoft regarding the CM network access account: it does NOT need full permissions to the CM cache folder for Peer Cache. I thought this was the case myself, and evidently documentation will be coming soon regarding this. Granting full permissions seemed so insane to me that I didn't bother testing it and simply assumed Peer Cache to be a beta product not worthy of looking at. I happily stand corrected! Now all they need is some throttling. :-)
If you have a CAS (you shouldn't), and if you have enabled distributed views, you might want to hold off on upgrading to SP1 for CM12 R2.
It sounds like we'll need a hotfix so that the upgrade checks for certain tables and views. When you enable distributed views, views are created to expose the data that stays on the primary sites rather than being replicated to the CAS. The upgrade isn't expecting those, so when it goes to recreate tables, the names are already in use (as are some key and index names) and the upgrade fails. The ugly part for me is that it failed far enough into the process that running recovery from my SQL backup wouldn't work.
So will this fail for you too? I'm not sure. It might just be my particular layout that wasn't tested. Let me describe it.
PR1 has just the hardware inventory node enabled for distributed views.
PR2 has all three links enabled.
How much you extend your inventory affects the total number of distributed views created. In my case, I had 321 of them. But it's just the PR2 tables and views that the upgrade got upset over; the ones where one site had all links enabled for DV. What if PR1 had all three links enabled too? Would I have had the problem? What if PR2 had only the hardware inventory node enabled? Would I have had the problem? I don't know. Will you have the problem? I wouldn't take the chance.
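If you want a rough sense of your exposure before upgrading, a sketch like this counts the objects with the _RCM suffix on the site database (the suffix the tables that tripped our upgrade carry; whether every DV-related object uses it is an assumption). Invoke-Sqlcmd comes with the SQL Server PowerShell tools, and the server and database names are placeholders for your CAS:

```powershell
# Hedged sketch: count tables and views carrying the _RCM suffix.
# Server/database names are placeholders; adjust for your site.
$query = @"
SELECT o.type_desc, COUNT(*) AS ObjectCount
FROM sys.objects o
WHERE o.name LIKE '%[_]RCM' AND o.type IN ('U', 'V')
GROUP BY o.type_desc;
"@
Invoke-Sqlcmd -ServerInstance 'CASSQL01' -Database 'CM_CAS' -Query $query
```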
To get past this issue, I nuked some tables and views:
DROP TABLE [dbo].[CollectedFiles_RCM]
DROP TABLE [dbo].[FileUsageSummary_RCM]
DROP TABLE [dbo].[FileUsageSummaryIntervals_RCM]
DROP TABLE [dbo].[MonthlyUsageSummary_RCM]
DROP TABLE [dbo].[SoftwareFile_RCM]
DROP TABLE [dbo].[SoftwareFilePath_RCM]
DROP TABLE [dbo].[SoftwareInventory_RCM]
DROP TABLE [dbo].[SoftwareInventoryStatus_RCM]
DROP TABLE [dbo].[SoftwareProduct_RCM]
DROP TABLE [dbo].[SoftwareProductMap_RCM]
DROP TABLE [dbo].[SummarizationInterval_RCM]
DROP VIEW [dbo].[CollectedFiles]
DROP VIEW [dbo].[FileUsageSummary]
DROP VIEW [dbo].[FileUsageSummaryIntervals]
DROP VIEW [dbo].[MonthlyUsageSummary]
DROP VIEW [dbo].[SoftwareFile]
DROP VIEW [dbo].[SoftwareFilePath]
DROP VIEW [dbo].[SoftwareInventory]
DROP VIEW [dbo].[SoftwareInventoryStatus]
DROP VIEW [dbo].[SoftwareProduct]
DROP VIEW [dbo].[SoftwareProductMap]
DROP VIEW [dbo].[SummarizationInterval]
DROP VIEW [_sde].[v_GeneralInfo]
DROP VIEW [_sde].[v_GeneralInfoEx]
DROP VIEW [_sde].[v_GS_AppInstalls]
DROP VIEW [_sde].[v_HR_NSV]
DROP VIEW [_sde].[v_MachineUsage]
So after restoring the CM database from backup, dropping the views and tables above, and then running the upgrade, it finally took. The CAS is now at SP1 and replication is looking good. The only reason I'm posting the views and tables above is in case someone else has already gotten themselves into trouble; I wouldn't do this unless it's already too late. Those last views (the _sde ones) we created in our own schema, but the upgrade still doesn't like them, so if you have any of your own, you might want to make copies, blast them, and put them back after the upgrade.
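For those custom-schema views, one way to make the copies is to pull the definitions out before the upgrade. A minimal sketch, assuming the SQL Server PowerShell tools; the server, database, schema, and backup path are placeholders from our environment:

```powershell
# Hedged sketch: save the definition of every view in a custom schema
# (here our "_sde") so they can be recreated after the upgrade.
$query = @"
SELECT s.name AS SchemaName, v.name AS ViewName, m.definition
FROM sys.views v
JOIN sys.schemas s ON s.schema_id = v.schema_id
JOIN sys.sql_modules m ON m.object_id = v.object_id
WHERE s.name = '_sde';
"@
Invoke-Sqlcmd -ServerInstance 'CASSQL01' -Database 'CM_CAS' -Query $query |
    ForEach-Object { $_.definition | Out-File "C:\Backup\$($_.ViewName).sql" }
```

Drop the views, run the upgrade, then replay the saved .sql files to put them back.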
Long story short, if you're using distributed views, I'd recommend you wait on SP1 until we hear from Microsoft.
Update: Notice that the views above are all related to the software inventory and software metering link. As I mentioned, in the lab one of my sites had all three links set for DV and one primary was marked for DV for just hardware inventory. Well, in production we have only the hardware inventory link enabled for DV, so we decided to move forward with SP1 there and it worked fine. So if there is an issue, it would only be with the software inventory and software metering link. Is there actually an issue? I sent our database off to Microsoft but never heard back.
When will Microsoft ever get Role Based Access (RBA) working for Automatic Deployment Rules (ADRs)? I need to know that a server admin can make use of an ADR to set up his patches and that a workstation admin can't go in and edit the server ADRs. And vice versa.
Well, RBA is there. Already. Right now. At least in CM12 R2 it is. Was it always there? I could swear that when RTM came out, this wasn't possible. But I verified this works yesterday. What isn't there is the option to right-click an ADR and assign the scope, but that's really not important.
The server admin can see the workstation admin's ADRs, but all the properties are grayed out and no changes can be made. The guts of this (as with all RBA) revolve around the collections each admin has access to. When a server admin creates an ADR targeting a collection that a workstation admin doesn't have access to, RBA kicks in and protects the rule.
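If you'd rather script a scope assignment than wait for a right-click option, something like this may work from the CM12 R2 PowerShell module. Add-CMObjectSecurityScope and Get-CMSoftwareUpdateAutoDeploymentRule are real cmdlets, but whether they can be chained on an ADR this way is an assumption I haven't verified, and the rule name, scope name, and site code are placeholders:

```powershell
# Hedged sketch: tag a server-team ADR with a security scope from
# PowerShell. Names and site code are placeholders; verify that the
# cmdlets accept ADR objects in your console version before relying on it.
Import-Module "$env:SMS_ADMIN_UI_PATH\..\ConfigurationManager.psd1"
Set-Location 'PR1:'   # placeholder site code drive
$adr = Get-CMSoftwareUpdateAutoDeploymentRule -Name 'Server Patching ADR'
Add-CMObjectSecurityScope -InputObject $adr -Name 'Server Admins'
```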
So what's not to like about ADRs now?
Well, other than wishing they'd use saved searches instead of filters (which is another DCR submitted long ago) not much. I have just one thing driving me nuts before I let the admins know that they can start using ADRs now. Packages.
You can't make an ADR without filling out the package prompts in the wizard. I'd have to let these admins also make patch packages on their own. I can even grant that specific feature in our SUM role. So why could this be bad, especially if our single instance store in the Content Library is saving us space?
Well, for one, it isn't saving us space on the source files (and for that I really need to move that share to a dedupe volume). But the bigger concern is that one admin could download a patch everyone is using and later just go delete it and break a lot of deployments. Sure, I could fix that by quickly downloading the patch myself, but that could leave clients sitting around for a day before they retry. Maybe I'm overthinking this?
We have 3 primary sites under a CAS (bad, but we have no choice with so many clients). Because we also have Nomad, we don't care where clients get assigned. We care only that each site has roughly the same client count as the others. But we drifted about 30K clients too many on one site and simply made use of CM12 R2's function to move clients. So we moved them to level set the count.
The downside, and we knew this, was that each client would have to do a full inventory and SUP scan. That's a lot of traffic but we've done this before without issue. But this time we melted the SUPs with many full scans. And the wonderful Rapid Fail detection built into IIS decided to protect us by stopping our WSUS App pool. Late at night.
Now in CM12 post SP1 (we're on R2), clients make use of the SUPList which is a list of all possible SUPs available. Clients find one SUP off that list and stick to it. They never change unless they can't reach their SUP after 4 attempts (30 minutes between each - the 5th attempt is to a new SUP). Well with the app pool off, all clients trying to scan would fail and start looking for new SUPs. A new SUP means a full scan. A full scan from 110K clients is far worse than from just 10K when we're moving things. Needless to say our SUPs were working very hard the next morning to serve clients. On a normal day the NIC on one of our SUPs shows about 1Mbps of traffic, but after starting the WSUS App pool we were at over 850Mbps going out per SUP.
Disabling Rapid Fail is one nice fix to help keep that app pool from stopping, but we also increased the app pool's Private Memory Limit from the default 5GB to 20GB (the SUPs have 24GB of RAM, so we were clearly wasting most of that). I know of another company with 85K clients on 2 SUPs who boosted their RAM from 24 GB to 48 GB to help IIS serve clients. Another option is to add more SUPs, but RAM is probably cheaper than another VM. The default Private Memory Limit is 5GB, so for those of us weirdos with lots of RAM, it makes sense to crank this up if you can. We actually did this long ago, but we're thinking the in-place upgrade from Server 2012 to Server 2012 R2 wiped our settings out.
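Both settings can be pushed from the command line. A sketch using appcmd; "WsusPool" is the usual WSUS app pool name but yours may differ, and privateMemory is expressed in KB (20 GB = 20971520):

```powershell
# Hedged sketch: disable Rapid Fail Protection and raise the Private
# Memory Limit on the WSUS app pool. Pool name may differ per install;
# privateMemory is in KB, so 20 GB = 20 * 1024 * 1024 = 20971520.
$appcmd = "$env:windir\System32\inetsrv\appcmd.exe"
& $appcmd set apppool "WsusPool" /failure.rapidFailProtection:False
& $appcmd set apppool "WsusPool" /recycling.periodicRestart.privateMemory:20971520
```

Setting it this way (or via a script) also makes it trivial to reapply after an OS upgrade quietly resets the pool.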
By the way, the obvious 'treatment' during such a meltdown is to throttle IIS. We set our servers down to 50 Mbps and the network team was happy; your setting will vary based on client count and bandwidth. Our long term insurance here will be QoS. UPDATE: Jeff Carreon just posted a tidbit on how to throttle quickly in case of an emergency using PowerShell.
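If you'd rather not be working out syntax mid-meltdown, a minimal throttling sketch using the WebAdministration module; the site name is a placeholder, and maxBandwidth is in bytes per second (50 Mbps is roughly 6,553,600):

```powershell
# Hedged sketch: emergency IIS bandwidth throttle to roughly 50 Mbps.
# Site name is a placeholder; maxBandwidth is in bytes per second.
Import-Module WebAdministration
Set-WebConfigurationProperty `
    -Filter 'system.applicationHost/sites/site[@name="Default Web Site"]/limits' `
    -Name 'maxBandwidth' -Value 6553600
```

Set the value back to its default (effectively unlimited) once the storm passes.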
So how do we keep our settings? We ask Sherry who knows DCM! Read more on her CIs to enforce our settings here.
Here is something I've wanted to try forever - heck, ever since they first called it Server Core.
For my role servers like the MP or DP servers, would CM still work if I remove the GUI from the OS? Because Server 2012 R2 lets you take the Windows shell off and put it back on, it's easy to test. So I just did.
I mix my MP and DP servers on the same VM. So my test here is to see if those roles will still work after I take the UI away (and manage the servers strictly with PowerShell).
Using Server Manager, I ask to remove the User Interfaces and Infrastructure feature. Well, that's a bit too extreme because we'd evidently lose the IIS BITS Server Extensions and Remote Differential Compression, and I know I need those for CM. So I back off and select only the Server Graphical Shell (essentially Explorer and IE) for removal. That works!
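The same removal works from PowerShell, which fits the spirit of the exercise. A sketch, assuming Server 2012 R2 feature names:

```powershell
# Hedged sketch: remove only the graphical shell (Explorer/IE), leaving
# the Minimal Server Interface and the IIS bits CM needs in place.
# Reverse it later with: Install-WindowsFeature Server-Gui-Shell
Uninstall-WindowsFeature -Name Server-Gui-Shell -Restart
```

Note this is Server-Gui-Shell only; removing Server-Gui-Mgmt-Infra as well would take you all the way to Core.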
So why am I even playing with it? Theoretically, the loss of the UI means a smaller attack surface so my server should be safer. And it could mean fewer patches might be needed in the future which could lead to fewer reboots and more uptime.
In reality, I doubt I'm gaining much here. The actual best benefit would be that my team is forced to manage more using PowerShell and quit playing with things one at a time in a UI. When you RDP to this server, you just get a cmd box and no explorer. This isn't supported by Microsoft yet as far as I know, but because my MP and DP logs (and CM client logs) look good, I'm sure it's simply a matter of Microsoft not testing this setup yet to support it.
I'll let this server in the lab sit for a couple months like this and decide then if I'd like to do the rest in the lab (role servers only; I highly doubt a primary site could work like this). Also, I have other internal apps to consider beyond CM. Like, is Symantec Endpoint Protection still fine? Other server-based apps I'm required to run also need to be checked.
Many apps might fail if you start with no UI, but it seems they mostly work if you remove it after the install. And if I change my mind about this or run into an issue, it's easy to put the Server Graphical Shell back on. Oh, and Kaido has a tip regarding this as your source files for the GUI can become stale.
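On Kaido's point about stale source files: when you put the shell back, you can feed the install a known-good source instead of relying on the local payload. The wim path and image index here are placeholders for your install media:

```powershell
# Hedged sketch: restore the graphical shell from install media in case
# the local side-by-side payload has gone stale after patching.
# The wim path and image index are placeholders for your media.
Install-WindowsFeature -Name Server-Gui-Shell -Restart `
    -Source 'wim:D:\sources\install.wim:2'
```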