Securing SCOM in a Privilege Tiered Access Model–Part 3

You can find Part 1 here.

You can find Part 2 here.

In summary, we went over various security concerns when deploying SCOM. Although there are a number listed, there are two that I believe could take down an organization in a hurry: poor run as account distribution, or a SCOM admin’s account being compromised. Poor run as account distribution could allow for quick lateral movement through Tier 1, and could possibly expose all of your data to the attacker (depending on the permissions of said run as account). A compromised SCOM admin could allow someone to use SCOM as a deployment mechanism to compromise your environment. These are both big deals in my opinion, and when it comes to designing for security, these are the risks you need to design around.

As such, I think there’s some wisdom in treating this application as though it’s a Tier 0 asset. It should be protected carefully. If you wanted to strictly follow Microsoft’s model, you may end up putting separate management groups in Tier 0, Tier 1 (possibly multiple management groups here), the Red Forest, etc. While I suspect that some in the cyber community will disagree with me (and if so, I would appreciate the feedback), I personally am not sure that this is worth the effort. For one, it massively over-complicates administration, as you’re managing multiple SCOM environments. You have to have multiple lines of accountability for SCOM alerts. You now have alerts coming from multiple management groups, and will have to tune across multiple management groups as well. You will also deal with multiple sets of reports and multiple sets of notifications. And you haven’t really mitigated any risk to your Tier 1 environment in particular, which is where all your data is stored. In the cyber community, we stress Tier 0 protection, and that’s good, as compromising Tier 0 is by far the easiest way for an attacker to own your environment; but Tier 1 protection is just as critical, as that is where your assets are contained. Remember that in SCOM’s architecture, the agents should be running as Local System, so on their own they don’t pose a threat no matter where SCOM is located. What matters is how you distribute your run as accounts and how you allow Operations Manager to be managed.

With that in mind, my ideal architecture would be to place my management group in the Red Forest. To backtrack a bit, a Red Forest is an untrusted, hardened administrative domain. If you implement a RF, you will use IPSEC to prevent management of your resources through any other path. That means no more RDP to a server from your standard desktop. You will use a privileged access workstation (PAW) that is joined to the Red Forest, as IPSEC will prevent you from accessing your server environment any other way. The RF has no internet access or email, so it’s not prone to being infected by malware, and since it’s not trusted, it’s much harder to laterally move into it (though as a side note, if you’re using the same passwords in the RF as you are in production, then you effectively have the same hash, which could in theory be compromised).

This does present some challenges. First, to manage the production environment, you’ll need to set up gateways in production and configure certificate authentication between the management group and the gateways. This isn’t terribly difficult to do, but it can introduce a few points of failure, namely agent and gateway failover, as these do not occur automatically. Jimmy Harper wrote a nice piece on how to do this, and since (as I understand it) the commands are the same in 2016 as they are in 2012, I’ll simply link it here. Since agent deployment is not a static thing, you’ll likely need to run the agent failover PowerShell as a scheduled task on the management server on a periodic basis, and add the gateway failover scripts as a part of any new gateway deployment. At least in this scenario, you’re only managing one management group, and you’ve mitigated the risks associated with SCOM administration. Second, you are still exposed to run as account distribution issues. I recommended in part 2 of this series that these be audited. This isn’t hard to do from the admin console. The bottom line is that this is something that needs to be performed with some frequency, as a poorly distributed run as account can lead to a very quick compromise. Third, this may present issues depending on how you created your RF. I’m definitely approaching it as more of a management network for the entire domain. If you have an RF that is for DA credentials only, then this really isn’t the best option, and you’re going to need to put SCOM in your Tier 1 management network. That type of network will be joined to the production domain, so care will need to be taken to protect it against credential theft.
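
Circling back to the failover point above, here is a minimal sketch of what those scheduled failover scripts can look like using the standard OperationsManager cmdlets. The server names are placeholders for your own environment; Jimmy Harper’s article covers the full procedure.

```powershell
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName "ms01.redforest.local"   # placeholder management server

# Gateway failover: give the gateway a primary and a failover management server
$primaryMS  = Get-SCOMManagementServer | Where-Object { $_.Name -eq "ms01.redforest.local" }
$failoverMS = Get-SCOMManagementServer | Where-Object { $_.Name -eq "ms02.redforest.local" }
$gateway    = Get-SCOMGatewayManagementServer | Where-Object { $_.Name -eq "gw01.prod.local" }
Set-SCOMParentManagementServer -GatewayServer $gateway -PrimaryServer $primaryMS -FailoverServer $failoverMS

# Agent failover: every agent reporting to gw01 gets gw02 as its failover
$failoverGW = Get-SCOMGatewayManagementServer | Where-Object { $_.Name -eq "gw02.prod.local" }
Get-SCOMAgent | Where-Object { $_.PrimaryManagementServerName -eq "gw01.prod.local" } |
    ForEach-Object { Set-SCOMParentManagementServer -Agent $_ -FailoverServer $failoverGW }
```

Scheduling the agent portion to run periodically picks up any agents deployed since the last run.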

Legacy Protocols and Operating Systems

This is the last thing that comes to mind when architecting a secure SCOM environment. It goes without saying that leaving legacy protocols enabled potentially exposes you to all sorts of attacks. I think at this point we are all familiar with the consequences of leaving SMB1 enabled. Though the exact cost to IT organizations remains unknown, estimates range from hundreds of millions to around $4B for WannaCry alone, much of which could have been prevented if organizations had moved away from Windows XP well before the 2017 attack was launched.

That, sadly, is not the only legacy protocol out there. Others include NTLM V1, LANMAN, Digest Authentication, and older versions of TLS. All of these are, at this point, on official deprecation lists from Microsoft. Turning them off certainly presents risks to older applications, so there’s some value in eliminating older apps, or at the very least restricting where these protocols can be used.

SCOM, for the record, does not require any of these protocols, so I highly recommend removing them, as well as deprecating older operating systems such as Windows Server 2008/2012.
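
To make this concrete, here is a rough sketch of what turning these off looks like on an individual Windows Server. In practice you would push the equivalent settings through Group Policy, and you should test against your legacy applications first.

```powershell
# Remove SMB1 (Windows Server; on client SKUs use Disable-WindowsOptionalFeature instead)
Set-SmbServerConfiguration -EnableSMB1Protocol $false -Force
Uninstall-WindowsFeature -Name FS-SMB1

# Refuse LM and NTLMv1: send/accept NTLMv2 only (LmCompatibilityLevel 5)
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Lsa" `
    -Name LmCompatibilityLevel -Value 5 -Type DWord

# Disable TLS 1.0 on the server side (repeat for TLS 1.1 and the Client subkey as needed)
$tls10 = "HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.0\Server"
New-Item -Path $tls10 -Force | Out-Null
Set-ItemProperty -Path $tls10 -Name Enabled -Value 0 -Type DWord
Set-ItemProperty -Path $tls10 -Name DisabledByDefault -Value 1 -Type DWord
```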

It is worth noting that there are some known issues with the SCOM installer and certain legacy protocols. I’m hoping that Microsoft fixes this at some point, as I know it has been reported to the product team. Until then, you may need to turn them on only for the purpose of deployment:

  • RC4, documented on my blog.
  • TLS – I haven’t observed this, but have been told that the installer can have issues if older versions of TLS are not turned on. This does make sense, as TLS 1.2 support was only added in a recent SCOM 2016 UR.
  • NTLM V1 – Again, I haven’t observed this, but I do know there is an active investigation regarding the SCOM reporting piece requiring NTLM V1 for install.

Securing SCOM in a Privilege Tiered Access Model–Part 2

Previously, I discussed basic security posture and what is needed to secure a SCOM installation. The post can be found here. In summary, we discussed risks associated with malicious management packs and the use of a service account for agent action instead of the local system. This discussion will focus a bit deeper on account management.

Carefully plan Run-As account distribution

In my opinion, poor run as account distribution practices pose the greatest risk to your environment, as a poorly distributed account could potentially give an attacker the keys to your environment. The first thing worth noting about run as accounts is that they need to be able to log on locally. This effectively means that the account’s credentials are sitting in memory on any server it was distributed to. I demonstrated this particular risk in this piece, and I recommend reading it before planning a SCOM installation. Server 2016 has mitigated many of the risks associated with pass the hash, but older operating systems do not have the same mitigations in place, and as such, they are exposed. Keep in mind that it only takes one compromised server to compromise a tier. If you have a super account running on Server 2008, I can collect that hash and still use it to access a more secure 2016 system. The OS mitigations in place will prevent me from collecting additional hashes off that system, but once I’m on it, I can still do whatever I want with it.

In the tiered structure, you don’t want Tier 0 accounts being used on Tier 1. In short, this means no Domain Admins logging on to anything that is not a domain controller. That’s simple enough. The AD MP doesn’t need a DA run as account anyway, so the only issue at hand is finding a method to patch/upgrade the agent on domain controllers.

Tier 1, however, is a bit more complicated. This is your server tier. Many organizations (and I’ve been guilty of this in the past as well) have a handful of super accounts that are local admins on every server in the environment. If any of those accounts is used as a run as account and distributed anywhere, it could potentially be harvested. All an attacker needs is local admin rights on one server where this type of account is in use, and your entire Tier 1 environment is compromised. This is, as far as I’m concerned, just as bad as compromising Tier 0. The attacker effectively has all of your data and access to any server in your environment. Even without domain admin rights, they will be able to go about their business. In the tiered model, there should be very few of these types of accounts, and their use should be restricted to the management network (aka Red Forest). Other accounts in Tier 1 need to be restricted to only the machines they need to run on.

As such, my general opinion is to stay as far away from using run as accounts as possible. For most of our management packs, this is not an issue. However, some MPs (SQL and SharePoint, for instance) need them, and SharePoint does not even have an option for least privilege. The first thing I’d recommend is using NT Service SIDs in their place. I know this works for SQL, as Kevin Holman has a great article on how to do this (though I highly recommend using the least privilege configuration and not SA rights). The Health Service SID effectively gives the local system’s health service the minimum permissions needed to monitor a SQL environment. The health service, given that it is not a user account, cannot be mined by an attacker. I’m of the opinion that all management pack authoring needs to move in this direction, and if I were calling the shots, solutions such as Kevin’s for SQL would be integrated into every one of our MPs. Unfortunately, as of this writing, this is not the case.
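
To illustrate the pattern (this is not the full procedure – follow Kevin Holman’s article for the complete least privilege grants), giving the Health Service SID access to a SQL instance looks roughly like the sketch below. It assumes the SqlServer PowerShell module is available; the instance name and the specific grants shown are examples only.

```powershell
# Creates a login for the agent's service SID and grants a few low-privilege,
# server-level monitoring permissions. Kevin Holman's article has the full list.
$query = @"
CREATE LOGIN [NT SERVICE\HealthService] FROM WINDOWS;
GRANT VIEW SERVER STATE   TO [NT SERVICE\HealthService];
GRANT VIEW ANY DATABASE   TO [NT SERVICE\HealthService];
GRANT VIEW ANY DEFINITION TO [NT SERVICE\HealthService];
"@
Invoke-Sqlcmd -ServerInstance "SQL01\PROD" -Query $query   # placeholder instance name
```

Because the login maps to the local HealthService service SID rather than a domain account, there is no password or hash for an attacker to harvest.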

Where run as accounts are required, an organization needs to put some intelligent controls in place.

  • Ensure this account can only log on to the machines that SCOM distributes it to.
  • NEVER use the less secure distribution option. I personally would argue that this feature should be removed from the product, as it makes it way too easy to expose yourself to massive amounts of risk.
  • Ensure the run-as account is not a high value account.
  • Strictly control the administration of SCOM as SCOM admins are the ones who can create and distribute them.
  • Train SCOM admins so that they understand this vulnerability.
  • Regularly audit run as account configuration and distribution.
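
As a starting point for that audit, here is a sketch that dumps each run as account’s distribution setting and approved targets with PowerShell. The property names on the distribution object may differ slightly between SCOM versions, so verify against your build; the management server name is a placeholder.

```powershell
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName "ms01.contoso.com"   # placeholder management server

foreach ($account in Get-SCOMRunAsAccount) {
    $dist = Get-SCOMRunAsDistribution -RunAsAccount $account
    [pscustomobject]@{
        Account      = $account.Name
        Distribution = $dist.Security        # expect 'MoreSecure'; investigate anything 'LessSecure'
        ApprovedTo   = ($dist.SecureDistribution | ForEach-Object { $_.DisplayName }) -join '; '
    }
}
```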

Least privilege service accounts

This one speaks for itself, though I’ve seen plenty of organizations that assign way too many rights to a SCOM service account because it’s easy. You can find official requirements here, but as you can see, several of these accounts need local admin rights (note that’s admin rights on the management servers themselves, not everywhere… and most definitely NOT domain admins). I would further add that because of this, these accounts run in resident memory on the management servers. It would be wise to ensure they have no privileges elsewhere.

Some organizations will make the management server action account a server admin to facilitate agent deployment and upgrade. I would argue that this too is a bad practice. The account won’t sit in resident memory on agents (except when in use), but it does sit in resident memory on the management servers, so by compromising a management server, you could potentially compromise this account as well, giving an attacker admin rights across the org. Restricting the Management Server Action Account does have a small pain point in that you need to manually enter account credentials for agent deployment and updates if you’re using the SCOM console, but to me, that’s a worthwhile trade. To be fair, managing software deployment accounts is a challenge for all organizations, though again this is where a Red Forest and Privileged Access Workstations come into play, as these accounts can be restricted via IPSEC to only run from specific locations. Personally, I prefer to outsource agent deployments and updates to SCCM anyway. It’s not hard to change the IsManuallyInstalled flag in the SQL DB, and it allows for an automated solution for deploying and patching agents.
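
For reference, that flag lives in the MT_HealthService table of the OperationsManager database. A sketch of flipping it so agents installed by SCCM show as remotely manageable follows; this is a direct, unsupported table edit, so back up the database and test first.

```powershell
# Marks all manually/SCCM-installed agents as remotely manageable in the console.
# Scope the WHERE clause down if you only want specific agents changed.
Invoke-Sqlcmd -ServerInstance "SQLOM01" -Database "OperationsManager" -Query @"
UPDATE MT_HealthService
SET IsManuallyInstalled = 0
WHERE IsManuallyInstalled = 1
"@
```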

SCOM port considerations

Microsoft publishes SCOM’s port requirements here (see the “supported firewall scenarios” section). Note that this document is applicable for both SCOM 2016 and SCOM 2012 R2. I think most of what I have to say is common sense, so I won’t elaborate, but it’s definitely worth opening ports only as described in this document.
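
For the two ports you’ll deal with most – TCP 5723 (agent/gateway to management server) and TCP 5724 (console/SDK to the Data Access service) – here is a sketch of scoping them with Windows Firewall on a management server. The subnets are placeholders for your approved agent and PAW networks.

```powershell
# Allow agent/gateway communication (TCP 5723) only from the subnets you expect it from.
New-NetFirewallRule -DisplayName "SCOM Health Service (TCP 5723)" `
    -Direction Inbound -Protocol TCP -LocalPort 5723 `
    -RemoteAddress "10.10.20.0/24", "10.10.30.0/24" -Action Allow   # placeholder agent subnets

# Restrict console/SDK access (TCP 5724) to the Red Forest PAW subnet.
New-NetFirewallRule -DisplayName "SCOM SDK Service (TCP 5724)" `
    -Direction Inbound -Protocol TCP -LocalPort 5724 `
    -RemoteAddress "10.50.1.0/24" -Action Allow                     # placeholder PAW subnet
```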

This concludes the potential security risks to consider when deploying SCOM. The next piece will cover how to architect an Operations Manager environment using Microsoft’s Tiered account structure.

Summary

  • Securing Privileged Access (AD Security) paper.
  • Carefully Manage RunAs Accounts
    • Avoid less secure distribution
    • Avoid using powerful accounts
    • Use IPSec to restrict RunAs accounts to only systems that need them.
  • Restrict privileges of SCOM accounts.
  • Turn on Agent Proxy only as needed

Part 3 can be found here.

Securing SCOM in a Privilege Tiered Access Model–Part 1

I’ve had a few discussions with some people internally on this subject. One thing that has been consistent in these conversations is that we (Microsoft) don’t have much in the way of good guidance on securing SCOM, and this really needs to be addressed. Since I’ve written quite a bit on Cyber Security and SCOM, have released a security monitoring solution for SCOM, and am now officially a Cyber Security Consultant at Microsoft, I figured I’d take a stab at this. It’s worth noting that this has been tossed around internally, though I wouldn’t be surprised if I have to update it at some point in the not so distant future as this is unofficial guidance.

Let’s start by giving a quick explanation of the tiered access model. For more detail, I’d highly recommend reading the Securing Privileged Access Reference Material that Microsoft has published. In summary, Microsoft recommends isolating identities into various tiers. Identities include user accounts, computer accounts, applications, etc. Tier 0 represents those identities that can give you full access to the environment. These credentials should NEVER be used on Tier 1 or Tier 2 systems; they should only be used on Tier 0 systems (i.e. domain controllers). Tier 1 represents the server tier, where your business and application data resides. Even in this scenario, it’s recommended to move away from the global server admin account, which if compromised is almost as bad as an attacker getting a DA account. Compromising a Tier 0 account is certainly easier for an attacker, but if they get enough of Tier 1, they still have your data. Servers, and the accounts managing servers, need to be isolated with various restrictions in place to prevent lateral movement and collection of these credentials. Microsoft provides an engagement to help with this called SLAM, Securing (against) Lateral Account Movement, and I highly recommend it as a way to start locking down your organization. Tier 1 credentials should never be used in Tier 0 or Tier 2. Tier 2 is the desktop tier, with connectivity to the internet for browsing, email, and general application use. This is the assumed breach area: no matter how hard you try, someone will click on something they shouldn’t and eventually compromise a desktop. Tier 1 and Tier 0 creds should never be used on a Tier 2 device. This includes common things such as RDP to a Tier 1 server. RDP Restricted Admin settings can help in some ways, namely keeping a Tier 1 cred off of the Tier 2 system, but the recommendation for managing your environment is to use separate Privileged Access Workstations (PAW) in some sort of Red Forest environment, which we call ESAE.

System Center services have high privilege to many systems in the environment, including Tier 0, which makes them a prime target for attackers looking to do bad things in your environment. As John Lambert mentioned in his “How InfoSec Security Controls Create Vulnerability” article, when information security controls are implemented without visualizing the security dependency graph, individual risk management decisions fail to create a defensible system. As such, I’d highly recommend isolating the System Center stack. This is an application that could potentially hold the credentials to powerful accounts, making it a high value target for attackers.

Let’s start with the architecture. SCOM uses an agent to run workflows and return data to the management server for alerting, collection, etc. In and of itself, this is a fairly innocuous task. Communication between the management server and the agents is fairly benign: the management servers send configuration information to the agents (i.e. which management packs to download), and the agents send the results of those MPs back to the management server. There are a few risks to this, with the biggest being run as accounts. We’ll talk more about them in the next part, but I’ll simply note here that poor distribution of run as accounts can expose your organization to credential theft and reuse (aka pass the hash). For now though, I want to highlight two other areas of concern.

Agent Action Account should always be the local system account

This should not be confused with the Management Server Action Account, which is the default account for things like agent updates, agent deployment (and I would argue that it’s probably best not to use that account for those purposes, since it runs in resident memory on the management servers), and running various workflows on the management servers. The agent action account is the account that an agent uses to execute its workflows. By default, this is the Local System account, as that is what the Microsoft Monitoring Agent runs under. That said, it is configurable, and customers can have the monitoring agent run under service account credentials. This is a BAD IDEA. As mentioned in the Administrative Tools and Logon Types section of the Securing Privileged Access Reference Material, service accounts leave credentials behind on every system that the service runs on. Compromising one system where a service using that account runs gives an attacker the ability to reuse those credentials to access every other system where the account is allowed to be used. If you must use a service account, then that account needs to have its access restricted to only the machines that need it. If the account has rights across the domain, you’ve opened your environment up to being compromised quickly. I’ve written about this as well, and you can find that piece here.
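
A quick way to spot-check this across your servers is to look at what the Microsoft Monitoring Agent service (HealthService) is actually running as. A sketch, assuming PowerShell remoting is available and the server list file is your own:

```powershell
# 'LocalSystem' is what you want to see; anything else means a service account is in use.
$servers = Get-Content "C:\temp\servers.txt"    # placeholder list of agent-managed servers
Invoke-Command -ComputerName $servers -ScriptBlock {
    Get-CimInstance -ClassName Win32_Service -Filter "Name='HealthService'"
} | Select-Object PSComputerName, StartName
```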

Secure access to who can import/change MPs and from where they can import them.

While it’s not obvious from the SCOM console, SCOM has extensive libraries to run PowerShell, command line, and VBS scripts. To be fair, much of this relies on the author of those management packs following best practices, and an attacker has no such obligation. This means that someone could write a management pack that deploys malicious software, creates a back door, or even uses SCOM as a vehicle to collect key information about an environment. I could, for instance, write a management pack that uses a PowerShell probe or task to connect to a remote share and install malware on a system. I could potentially use it to lower the security posture of a system. SCOM doesn’t have much in the way of auditing either, meaning that we cannot trace back who did something like this. Your only clues as to whether or not this is going on come from regularly auditing the installed management packs as well as their content (and I find that this is not done often). In this scenario you would likely see a lot of the yellow SCOM alerts (Workflow failed to run, Workflow failed to initialize, OpsManager failed to start a process, etc.), but in my experience, very few organizations spend much time looking at these alerts.
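
Even a lightweight audit helps here. A sketch that exports every installed management pack with its version and last modified time, so you can diff it against the previous export (server and path are placeholders):

```powershell
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName "ms01.contoso.com"   # placeholder management server

# Unsealed MPs and recently modified MPs are the first place to look for tampering.
Get-SCOMManagementPack |
    Select-Object Name, Version, Sealed, TimeCreated, LastModified |
    Sort-Object LastModified -Descending |
    Export-Csv -Path "C:\Audits\SCOM-MPs-$(Get-Date -Format yyyyMMdd).csv" -NoTypeInformation
```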

Out of the box, I’d add, SCOM is very vulnerable, as the BUILTIN\Administrators group is a SCOM administrator by default. This should be removed and replaced with an Active Directory group that is limited to your SCOM engineers and the appropriate SCOM service accounts (more on that in the next post). You also need to control where this type of access can be performed. This fits into Microsoft’s PAW and Red Forest concepts, as administration of SCOM should not be allowed from your Tier 2 environment. Tier 2 is an assumed breach environment, as it can be compromised easily. If your SCOM admin, for instance, has the SCOM console installed on his/her desktop and does a “run as” to use it, their SCOM administrative credential is now sitting in the LSA on their local desktop, which means an attacker can steal those credentials. If those credentials have more access, the attacker just got your Tier 1 environment. If they are just SCOM admins, the attacker could upload a malicious management pack to SCOM. This also means your SCOM admin could feasibly be the victim of a targeted phishing attack, as this could be a very quick way to compromise an environment.
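
Auditing who actually holds SCOM roles is simple enough to schedule; a sketch:

```powershell
# Lists every SCOM user role and its members. The Administrators role should contain
# only your dedicated SCOM engineering group, never BUILTIN\Administrators.
Get-SCOMUserRole |
    Select-Object Name, DisplayName, @{ Name = 'Members'; Expression = { $_.Users -join '; ' } }
```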

Because of this, SCOM administration really needs to occur through a Red Forest. A Red Forest, for the record, is a non-trusted domain. It’s hardened, and it does not have internet access, email, etc. You would use IPSEC and firewalls to restrict administration of your environment to only your Red Forest. Your SCOM admin should never be administering SCOM from an internet facing machine joined to your domain. They should be doing this from a Red Forest. If they do their administration on the management server directly, they should only be allowed to RDP to the management server from the Red Forest. This makes it very difficult for an attacker to steal your credentials.

That said, setting up a Red Forest will certainly take a lot of time. In the short term, consider enabling RDP Restricted Admin mode (instructions are here). This lowers the attack surface for lateral movement, as your credentials are not sent to or cached on the RDP target, so they cannot be harvested from it. This isn’t as secure as a Red Forest, but it is an easy short term fix that can reduce your attack surface.
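
Restricted Admin mode has to be allowed on the RDP target before clients can request it; a sketch of both sides (the target name is a placeholder):

```powershell
# On the RDP target: allow Restricted Admin connections (0 = allowed)
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Lsa" `
    -Name DisableRestrictedAdmin -Value 0 -Type DWord

# From the client: request Restricted Admin mode for the session
mstsc.exe /RestrictedAdmin /v:ms01.contoso.com    # placeholder target
```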

This covers the first piece in this series. In the next piece, I will cover more about least privileges, run as accounts, and other things that can be done to protect your Operations Manager environment.

Summary

  • Securing Privileged Access (AD Security) paper.
  • Agent Action Account should be the Local System Account
  • SCOM administrators should be restricted. The location of where SCOM administrators can administer SCOM should also be restricted.

Part 2 is here.

Part 3 is here.

Configuring SCOM to Monitor Dell Storage Solutions

I was asked by a customer recently to configure SCOM to monitor Dell EMC SANs. The request seemed easy enough, until I got to doing it and realized that the documentation is, well, less than stellar. As such, this will be a quick post as to how we managed to get this working. I’m not 100% sure that every step listed in here is needed, but this is what we did to accomplish that task.

First, the instructions. The best documentation we were able to find was not off of Dell’s site, sadly. It was here on systemcenter.wiki. Even that, however, appears to be a bit out of date. Dell’s instructions said the following:

  1. Install the ESI Service
  2. Configure the ESI service connection.
  3. Publish the connection.
  4. Import the Management Packs.

The “how” was missing from their documentation, and the online video that was supposed to show how to do it only showed what things look like once done. I’m sure there’s better documentation out there, or at least I’d like to hope there is, but I was unable to find it, and as such I’m publishing this article.

Installing the ESI Service

First, we needed to find the ESI service itself. This wasn’t easy either; there was no standalone ESI service download on Dell’s site. We did eventually find it: it’s included in the ESI PowerShell Setup files that they make available for download, which was mentioned in the description of the download. There were several versions available, but for what it’s worth, we could not get the latest version to work and had to settle on 5.1.0.3. This may have been due to a few reasons. In retrospect, I suspect the ESI service may be dependent on the Unisphere CLI component; however, the installer did not call this out as a dependency and let us proceed without it. I will also note that one of the big problems we had is that we installed the PowerShell patch that they provided. This did not uninstall, making rollbacks impossible. I would advise against installing it without a snapshot taken in advance and an understanding of what this patch does and whether you need it.

On to the steps.

  • Download the files. As mentioned before, we only got 5.1.0.3 working. These are the files as they are named on the Dell portal. We needed the ESI.5.1.0.3PowerShell.Setup, ESI.5.1.0.3.GUI.Setup, UnisphereCLI-Win32-x86-en_US-3.0.0.1.16-1, and the ESI.SCOM.ManagementPacks.5.1.0.3.Setup. 
    image

    I would note that we never got 5.2.0.9 working. There’s no GUI for it, which we needed, but there could have been other factors there, so I don’t want to say it won’t work. I will state that the GUI MMC kept crashing every time we launched it in this configuration.

  • We installed all of these from an administrative command prompt to get around UAC issues. The first piece we installed was the UnisphereCLI file. This is supposedly only needed for certain adaptors, but since we were using a Unity adaptor, it was a requirement for us. It may not be necessary depending on your SAN’s adaptor, though I’m not convinced that it isn’t a component for ESI as well.
  • Next, we installed the 5.1.0.3 PowerShell setup and then the GUI setup. I’ll note one other issue we had (though it was with 5.2.0.9): the AD publishing piece kept crashing on a Server 2016 build during install. As such, we decided to use the option to publish locally for this piece. Everything else was default.
  • Last, we ran the management pack setup. This is a typical MP extraction; we will import the extracted MPs later.

Configuring the ESI Service Connection

Once everything is installed and working, you should be able to successfully launch the EMC Storage Integrator MMC icon created during the install. What we need to do here is tell the ESI service to talk to the SAN devices that you want to monitor. This part is fairly straightforward, though you will need some sort of storage credentials to connect, so someone from your storage team may need to be involved.

  • Launch the EMC Storage Integrator.
  • Right click on “Storage Systems” and choose “Add Storage System”.
    image
  • Choose your adaptor type, and fill out the appropriate information (note that each type may have a different set of info, this screenshot is for Unity only). Definitely test the connection before clicking add.
    image
  • Repeat until all connections are added.

Publish the Connections to ESI

At this point, we need to publish this information. We chose the local host during setup, so effectively that means this info is stored locally. Active directory is an option, though as I mentioned earlier, this kept crashing during install. This could have been our lab, a bug in their software, or who knows. We didn’t spend a lot of time figuring that out as a local connection was acceptable. This too is done from the EMC Storage Integrator.

  • Right click on the EMC Storage Integrator MMC icon (from within the MMC) and choose the option to Publish Connection.
    image
  • This box appears:
    image
  • You’ll need to change the Publish to Target option to the ESI Service. Strangely enough, this would crash if we entered any value in the “Target Host” field, so we left it blank. We kept the defaults during the various setup routines, so if you configured custom ports, no SSL, etc., you’ll need to change that accordingly.
  • The next step is to click “Refresh”. That will display your targets in the left pane (they did not appear automatically for us). Select the appropriate targets and click “Add”.
  • After that’s done, the Publish button will no longer be gray and you can click “Publish”.

Import the Management Packs

At this point, we were in the home stretch… or so we thought. We went ahead and imported the 5.1.0.3 management packs that we extracted earlier. The ESI service discovered as expected. Nothing else did. So, more digging. The problem is that the next discovery is disabled by default. That isn’t bad practice per se, but it would be nice if there were any documentation on this, as you’ll need to enable it in order to actually get good data into SCOM.

  • From the Authoring workspace, expand “Management Pack Objects,” and select “Object Discoveries”.
  • Do a find for “EMC SI Service Discovery”.
    image
  • You can see that this is disabled by default. I’m not exactly sure why this is targeted at Windows Computer and not the ESI Service that was already discovered, but I didn’t have access to the people that wrote this. Go ahead and perform an override. You can likely override just for the object that hosts the service, though we did all objects of the class. It’s worth noting that you do need to specify the machine name in the override settings (or at least we did).
  • This appears:
    image
  • It’s worth noting that there are lots of options here that can be overridden. We did the Enabled and ESI Service Host options, though if you have customizations, proxy, ports, accounts, etc., that should be included here. I would note that I don’t think this is being configured with SCOM run as accounts, so I’m not certain how secure it would be to put a username and password in this field.
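
If you’d rather script the enable step, something along these lines should work, though note that Enable-SCOMDiscovery only flips the Enabled flag, so the ESI Service Host value still has to be set as an override in the console. The override MP name here is hypothetical; treat this as a sketch.

```powershell
Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName "ms01.contoso.com"       # placeholder management server

# Find the disabled discovery and enable it via an override stored in your own unsealed MP
$discovery  = Get-SCOMDiscovery -DisplayName "EMC SI Service Discovery"
$overrideMp = Get-SCOMManagementPack -DisplayName "EMC SI Overrides"     # hypothetical override MP
Enable-SCOMDiscovery -Discovery $discovery -ManagementPack $overrideMp
```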

That did it. A short time later, all of our storage pools were showing up in SCOM and monitoring was working.