As I have mentioned in the initial posts, using the security monitoring management pack is going to require that certain practices and procedures be in place. Simply importing the management pack will not give you a picture of everything that it is designed to monitor. Ultimately, this management pack serves two purposes. The first is to detect certain anomalous events, allowing a security team to respond to them. It certainly will not capture everything, but it does capture a lot of events that should be investigated immediately. The second is that it provides an organization a means to augment security best practices. This can provide a system of checks and balances, so to speak, that security and change management teams may care about. If someone changes a GPO, we will tell you. If someone adds a user to the domain admins group, we will tell you. These are not necessarily bad things, but they are things that should happen rarely and can then be confirmed by an independent team. This allows an organization to get a better picture of the real business process in their organization versus what they think it is.
In some cases, I’ve already posted articles on this, and as such, I’ll link to them, but for the purpose of this article, I want to discuss what businesses should be doing if they expect to have some of the functionality that is designed into this MP.
Review Auditing Settings:
These are covered in this article. It’s worth reviewing to ensure you have all auditing settings properly set.
Consider deploying accesschk.exe to be used to enumerate writeable OS locations on critical systems:
I have a write-up on how this works here. Consider reviewing all parts. This does create a decent number of objects per OS instance, but there are no monitors attached to these objects.
Special Group Logon Configuration:
Jessica Payne wrote a nice article on configuring Active Directory to audit special group logon. The how and why are covered in detail there, and I strongly recommend that this be implemented for the types of accounts she lists in the article (key service accounts, domain admins, etc.). What this does is generate a 4964 event when a designated account is used. If special group logon is not configured, 4964 events will never be recorded, which effectively makes this a useless check. By configuring special group monitoring, SCOM will generate an alert each time one of these accounts is used. There are two alert-generating rules in place for this. One targets the Forwarded Events log; an alert from this rule essentially means that these accounts are being used to sign on in a tier 2 environment, which is a terrible practice that leaves you ripe for easy credential theft. The second watches for this event in the tier 1 and tier 0 environments. To some extent, there could be noise here, as a domain admin will occasionally need to sign on to a domain controller, but this should be relatively rare if good security practices are in place. That said, it allows a security team to verify with the particular admin that they were in fact using their account, as opposed to someone having stolen that account’s credentials. With this particular check there is also an event collection rule in place, as these events are not common. In a later revision, we hope to add a report off of this event.
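To make the split between the two alert rules concrete, here is a minimal Python sketch of the decision they encode. It is an illustration only, not how the management pack is implemented: the function name and the returned labels are hypothetical, and it assumes the forwarded log is identified by the channel name "ForwardedEvents".

```python
from typing import Optional

SPECIAL_GROUP_LOGON_EVENT_ID = 4964  # special group logon event described above


def classify_special_logon(event_id: int, log_name: str) -> Optional[str]:
    """Return which alert scenario a logged event falls under, or None.

    The labels are hypothetical names for the two rules described in the text.
    """
    if event_id != SPECIAL_GROUP_LOGON_EVENT_ID:
        return None
    if log_name == "ForwardedEvents":
        # Forwarded from a desktop: a designated account was used in tier 2.
        return "tier2-forwarded"
    # Logged locally on a monitored server or domain controller.
    return "tier0-or-tier1"
```

An event that maps to "tier2-forwarded" is the one that points at a bad practice; "tier0-or-tier1" is the one a security team would simply confirm with the admin.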
KrbTgt Password Reset:
One of the features of the new Active Directory Domain Services MP is that if you configure client monitoring, it will generate an alert telling you when the last KrbTgt password reset occurred, which for most organizations was when they upgraded their AD domain level to Server 2008. This is essential to detecting and preventing an attacker from using golden tickets. I’ve already written on this particular subject, and it is probably worth reading if you are not familiar with golden tickets. In general, resetting this password should not be a problem. Issues arise from resetting this password when Active Directory replication is not functioning properly, but if you are resetting this on a 90-day cycle consistent with service account passwords, this risk is mitigated. The double reset that was described in my original article is a more daunting task. If you do not follow this practice, this rule will never generate an alert. If you are following this practice, you will get alerts due to issues with Kerberos tickets. This should be a very rare event, and as such, any time this alert is generated, it should be investigated, as it means there is a high probability that an attacker attempted to use a golden ticket to return to your environment. There are obviously other implications here, namely that they’ve been there already.
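The 90-day cycle mentioned above comes down to simple date arithmetic. The following Python sketch shows the check under stated assumptions: the last reset time is passed in by the caller (how you retrieve it from Active Directory is out of scope here), and the function name is hypothetical.

```python
from datetime import datetime, timedelta

RESET_CYCLE = timedelta(days=90)  # cycle suggested in the text


def krbtgt_reset_overdue(pwd_last_set: datetime, now: datetime) -> bool:
    """Return True when the last KrbTgt reset is older than the 90-day cycle."""
    return now - pwd_last_set > RESET_CYCLE
```

For example, a password last set on 1 January is not overdue on 1 March of the same year (60 days), but is overdue on 1 June.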
Pass the Hash Configuration:
There are rules associated with pass the hash/ticket/etc. detection built into this management pack. Several are disabled by default due to noise in the environment. I wrote about this in a multi-part article in 2016. You can find it here.
The credential swap rule is enabled by default. In my lab environment, this never generated an alert unless I was using mimikatz to swap credentials. Beta testers also did not report any alerts associated with this rule. As such, I’m fairly confident (at this writing at least) that this is an excellent means of catching an attacker during an attack, and I strongly recommend investigating if you see this alert. Rules are in place to target both the server environment and Forwarded Events, if you are using them. There are also several disabled rules for pass the hash monitoring. This is in large part because normal events can trigger these other rules, and as such, additional configuration is required. That configuration is going to be based on the organization’s IP addressing scheme. In particular, we are interested in tier 2 (desktop) IP addresses accessing these servers in this capacity. As such, parameter 19 will need to be configured in these rules to match the organization’s desktop VLANs. There are three disabled rules that can be used:
- PtH against Tier 1 (server environment)
- PtH against DCs (domain controllers)
- PtH against Tier 2 (this watches the forwarded events and monitors movement within the desktop environment). I had no way to test this one, so, for the record, I’m not sure how it will behave in a production environment.
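Before enabling these rules, it can help to write down which source addresses count as "desktop" in your environment. The Python sketch below illustrates the idea with subnet containment; it is not the matching logic the management pack uses for parameter 19, and the subnets and function name are made-up placeholders that you would replace with your own desktop VLAN ranges.

```python
import ipaddress

# Hypothetical desktop (tier 2) ranges; substitute your organization's VLANs.
DESKTOP_SUBNETS = [
    ipaddress.ip_network("10.20.0.0/16"),
    ipaddress.ip_network("192.168.50.0/24"),
]


def is_desktop_source(source_ip: str) -> bool:
    """Return True if the source address falls inside a desktop subnet."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        # Events may carry a placeholder such as "-" instead of an address.
        return False
    return any(addr in net for net in DESKTOP_SUBNETS)
```

A logon against a server or domain controller whose source satisfies `is_desktop_source` is the kind of tier 2 to tier 1/0 access these rules are meant to surface.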
Failed logon checks:
There is a rule in place that will generate an alert for 5 failed logon events in a 2-minute period. This will be very quiet in most environments, with the obvious exception of systems where RDP is exposed to the internet. As an example, this is what this rule looks like in my Azure lab, which contains 4 VMs.
As of this writing, it has been in place for not quite 9 days and has an absurd repeat count. This is a good example as to why exposing RDP to the internet is a bad idea, as plenty of automated tools will attempt brute force attacks against your environment. No configuration is needed in dealing with this rule, and I’d add that there are already reports in this management pack showing you failed logons by IP address. This will allow a security team to send reports to the firewall team so that they can block these IP addresses if for whatever reason the organization will not follow good practices regarding RDP.
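The threshold behind the failed logon rule (5 failures within 2 minutes) can be sketched as a sliding window. This is a minimal Python illustration, assuming events arrive in chronological order; the class name is hypothetical, and SCOM's own consolidation logic may count the window differently.

```python
from collections import deque
from datetime import datetime, timedelta


class FailedLogonWindow:
    """Track failed logons and flag when a threshold is hit inside a window."""

    def __init__(self, threshold: int = 5, window: timedelta = timedelta(minutes=2)):
        self.threshold = threshold
        self.window = window
        self._times = deque()

    def record(self, when: datetime) -> bool:
        """Record one failed logon; return True if the threshold is reached."""
        self._times.append(when)
        cutoff = when - self.window
        # Drop failures that are older than the window.
        while self._times and self._times[0] < cutoff:
            self._times.popleft()
        return len(self._times) >= self.threshold
```

Five failures ten seconds apart trip the threshold on the fifth event, while five failures spaced forty seconds apart never do, because only four of them ever fall inside the same 2-minute window.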
There is, however, one additional monitor in place. Kevin Holman wrote one back in 2010 and built in a recovery that modified the Windows Firewall to block IP addresses. This management pack has the same monitor setup, though there are a few minor changes. It has been updated with a new logon type to account for Network Level Authentication, which did not exist back then. Alert generation is turned off (since we already have a rule in place) and the recovery is disabled. I would recommend that this be enabled for DMZ environments where there is no hardware firewall protecting these systems. One possible (though highly unlikely) consequence would be a scenario where the 5th failure came from an internal IP, thus prompting the recovery to block a legitimate IP address. It would be one of those situations that is highly coincidental due to bad timing, but it is theoretically possible.
Miscellaneous Disabled Rules:
Several rules are in place that will without a doubt generate noise, and as such, they are disabled. For the most part, there isn’t a ton of value in turning these on, but there are some scenarios where an organization may be interested in generating alerts off of these events. I’ve noted them here, and the main recommendation is to think this through carefully before enabling them. Good maintenance practices should be in place, as normal things like updates will trigger these events:
- System shut down unexpectedly
- System was rebooted
- Software was installed on a server
- Software was removed from a server
- System was powered off
Setting up Processes with Security Teams:
Last, but most importantly, we need to discuss business process. This MP is meant to aid security teams in keeping an eye on operational best practices in the org as well as giving them an early warning of potential threats. This is not meant to be a “just send me an email” type monitoring solution. This should be scoped out so that the security team can view into SCOM, see these alerts, and close the alerts after they have been verified. A good tuning process to address and respond to noise is also smart. This means that the security team and the monitoring team need to have a good line of communication. Alerting for this management pack can essentially be broken down into five categories:
- Forwarded Events – anything coming out of the desktop environment. Alerts coming from these servers are a good indication that a desktop may have been compromised. Security professionals operate under an “assumed breach” model, as no matter how much you train users, they will still click on things they shouldn’t. This allows the organization a quick response to investigate and/or re-image a desktop that has been compromised.
- Operational Events – These are likely normal, but the types of things that need verification. It also helps determine where operational security gaps exist. Examples include domain admin logons, creation of scheduled tasks, etc.
- Credible Threats – These should be investigated immediately. Examples include service creation on DCs, credential swap alerts, any 4688 detection rule in this MP, etc.
- Exterior Threats – Presently this is only the failed logon check specified above.
- Threat Hunting – These are monitors/rules that alert against known vulnerabilities that an org should address. Examples include the WDigest registry keys.